Description
I have a script that extracts attributes such as the author and product name from a DLL, and I have identified a case where those attributes are percent-encoded twice within the PackageURL. I use these attributes to create a PURL whose qualifier values should decode back to the original UTF-8 strings.
Here are some values that can be used to reproduce the issue, using file attributes from DotNetNuke.dll as an example.
The 'dll' contains the following methods:
import urllib.parse

def get_product_name():
    product_name = "https://dnncommunity.org"  # This is the unencoded value
    # This is the encoded value "https%3A%2F%2Fdnncommunity.org"
    return urllib.parse.quote(product_name, safe='')

def get_author():
    author = ".NET Foundation"  # This is the unencoded value
    # This is the encoded value ".NET%20Foundation"
    return urllib.parse.quote(author, safe='')
I need to combine the values in a forward-slash ('/') separated format so that Nexus can understand them,
e.g. <author>/<product_name>
purlattrs = f'{dll.get_author()}%2F{dll.get_product_name()}'
print(purlattrs) # output = '.NET%20Foundation%2Fhttps%3A%2F%2Fdnncommunity.org'
# This is the correctly encoded URL safe string
from packageurl import PackageURL
_qualifiers = {'Attr1': purlattrs, 'Attr2': 'Foo'}
purl = PackageURL(type='generic', name="DotNetNuke.dll", version="9.11.0.46", qualifiers=_qualifiers)
print(purl)
The purl that is printed is:
"pkg:generic/DotNetNuke.dll@9.11.0.46?Attr1=.NET%2520Foundation%252Fhttps%253A%252F%252Fdnncommunity.org&Attr2=Foo"
- As you can see, the space characters are now encoded as %2520 instead of %20.
- The forward slash is now %252F instead of %2F.
- The colon is now %253A instead of %3A.
- In each case, the % character of the already-encoded sequence is itself being encoded as %25.
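The double encoding can be reproduced with the standard library alone. This is a minimal sketch that only uses urllib.parse.quote to stand in for the second encoding pass; it does not call PackageURL, and the variable names are illustrative.

```python
from urllib.parse import quote, unquote

raw = ".NET Foundation/https://dnncommunity.org"

# First pass: what the caller does before building the purl.
once = quote(raw, safe="")

# Second pass: any further encoding turns every '%' into '%25'.
twice = quote(once, safe="")

print(once)   # .NET%20Foundation%2Fhttps%3A%2F%2Fdnncommunity.org
print(twice)  # .NET%2520Foundation%252Fhttps%253A%252F%252Fdnncommunity.org

# A consumer that decodes only once gets the encoded form back, not the raw value.
print(unquote(twice) == once)          # True
print(unquote(unquote(twice)) == raw)  # True
```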
If I pass in the raw string value to the PackageURL like below:
purlattrs = ".NET Foundation/https://dnncommunity.org"
print(purlattrs) # output = ".NET Foundation/https://dnncommunity.org"
I get the following output from print(purl):
"pkg:generic/DotNetNuke.dll@9.11.0.46?Attr1=.NET%20Foundation/https://dnncommunity.org&Attr2=Foo"
- In this scenario the encoding works for the space, but does not work for the slashes or the colon.
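For comparison, the output above matches what urllib.parse.quote produces when '/' and ':' are treated as safe characters. The sketch below is an assumption about why the slashes and colon survive, not a statement of how PackageURL is implemented.

```python
from urllib.parse import quote

raw = ".NET Foundation/https://dnncommunity.org"

# Treating '/' and ':' as safe leaves them untouched; only the space is encoded.
# (Assumption: this mimics the observed qualifier output, not the library's code.)
lenient = quote(raw, safe="/:")

# With no safe characters, every reserved character is percent-encoded.
strict = quote(raw, safe="")

print(lenient)  # .NET%20Foundation/https://dnncommunity.org
print(strict)   # .NET%20Foundation%2Fhttps%3A%2F%2Fdnncommunity.org
```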
- Recommend changing the behavior of the PURL encoding to use urllib.parse.quote and urllib.parse.unquote, or eliminating the encoding step and having the PackageURL user perform the encoding/decoding.
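As a rough illustration of the first suggestion, the sketch below encodes each qualifier value exactly once with urllib.parse.quote and decodes it with urllib.parse.unquote. The helpers encode_qualifiers and decode_qualifiers are hypothetical names for this example and are not part of the packageurl API.

```python
from urllib.parse import quote, unquote

def encode_qualifiers(qualifiers):
    """Hypothetical helper: percent-encode each raw value exactly once."""
    return "&".join(
        f"{key}={quote(value, safe='')}"
        for key, value in sorted(qualifiers.items())
    )

def decode_qualifiers(encoded):
    """Hypothetical helper: reverse encode_qualifiers with a single unquote."""
    pairs = (item.split("=", 1) for item in encoded.split("&"))
    return {key: unquote(value) for key, value in pairs}

raw = {"Attr1": ".NET Foundation/https://dnncommunity.org", "Attr2": "Foo"}
encoded = encode_qualifiers(raw)

print(encoded)
# Attr1=.NET%20Foundation%2Fhttps%3A%2F%2Fdnncommunity.org&Attr2=Foo
print(decode_qualifiers(encoded) == raw)  # True
```

The point of the sketch is that the caller always passes raw values and the encoding happens in one place, so a single decode recovers the original strings.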