add URLSearchParams::to_raw_string() method #1023
tests/ada_c.cpp
Outdated
// to_raw_string preserves %20 encoding for spaces
ada_owned_string raw_str = ada_search_params_to_raw_string(out);
ASSERT_EQ(convert_string(raw_str), "a=b%20c&d=e%20f");
@anonrig Thanks for your effort. It is wonderful that you are supporting this feature request from us (Kong). I was wondering whether it is possible to remove only b and keep the other parts the same as before.
Before: a=b c&b=remove&d=e+f
Removed b: a=b c&d=e+f (What we expected.)
I'm not following. Can you recommend a test case?
I wrote LuaJIT bindings for this, and it is almost what we are after. Here is some example code:
local search = require("resty.ada.search").parse("a=%20&b=,&c=remove&e=+&f=a b")
local normalized = search:remove("c"):tostring()
local raw = search:to_raw_string()
print("NORMALIZED: ", normalized)
print("RAW: ", raw)
This outputs:
NORMALIZED: a=+&b=%2C&e=+&f=a+b
RAW: a=%20&b=,&e=%20&f=a%20b
So RAW still seems to normalize spaces: both + and a literal space are turned into %20. I was actually expecting it to turn them into +, as in the NORMALIZED version. But it is probably best if no normalization happens at all in raw mode, so that the output would look like this:
RAW: a=%20&b=,&e=+&f=a b
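To illustrate why a decoded representation cannot reproduce the original spelling, here is a minimal standalone sketch. It does not use Ada; `form_decode` and `form_encode` are hypothetical helpers that only approximate application/x-www-form-urlencoded decoding and serialization. It shows that %20, + and a literal space all decode to the same value, so a serializer working from the decoded value has to pick one spelling.

```cpp
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical helper (not part of Ada): value of one hex digit, or -1.
static int hex_value(char c) {
  if (c >= '0' && c <= '9') return c - '0';
  if (c >= 'a' && c <= 'f') return c - 'a' + 10;
  if (c >= 'A' && c <= 'F') return c - 'A' + 10;
  return -1;
}

// Hypothetical helper: '+' becomes a space, %XX becomes one byte,
// everything else (including a literal space) is copied as-is.
std::string form_decode(std::string_view in) {
  std::string out;
  for (std::size_t i = 0; i < in.size(); ++i) {
    if (in[i] == '+') {
      out.push_back(' ');
    } else if (in[i] == '%' && i + 2 < in.size() &&
               hex_value(in[i + 1]) >= 0 && hex_value(in[i + 2]) >= 0) {
      out.push_back(static_cast<char>(hex_value(in[i + 1]) * 16 +
                                      hex_value(in[i + 2])));
      i += 2;
    } else {
      out.push_back(in[i]);
    }
  }
  return out;
}

// Hypothetical helper: serialize a decoded value. Alphanumerics and
// "*-._" are kept, a space becomes '+', every other byte becomes %XX.
std::string form_encode(std::string_view in) {
  static const char hex[] = "0123456789ABCDEF";
  std::string out;
  for (unsigned char c : in) {
    bool keep = (c >= '0' && c <= '9') || (c >= 'a' && c <= 'z') ||
                (c >= 'A' && c <= 'Z') || c == '*' || c == '-' ||
                c == '.' || c == '_';
    if (keep) {
      out.push_back(static_cast<char>(c));
    } else if (c == ' ') {
      out.push_back('+');
    } else {
      out.push_back('%');
      out.push_back(hex[c >> 4]);
      out.push_back(hex[c & 15]);
    }
  }
  return out;
}
```

With these helpers, `form_encode(form_decode("%20"))`, `form_encode(form_decode("+"))` and `form_encode(form_decode(" "))` all yield `+`, and `form_encode(form_decode(","))` yields `%2C`, which matches the NORMALIZED line above.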
Sounds good. If we remove the percent-encode calls above, we can make it more raw. Does that work for you? (Just to double-check.)
@anonrig that would work for us! Thank you. I was thinking of exactly the same thing (removing the percent-encoding in raw).
@bungle I've updated the implementation. Please take a look. Once we're OK with the result, I'll land it and make a new release.
@anonrig, actually, it now seems to output a space as a literal space (and perhaps other characters too), i.e. in decoded form rather than in the form given.
input:
a=%20&b=,&c=remove&e=+&f=a b
raw/unsafe output:
a= &b=,&e= &f=a b
So it seems Ada percent-decodes (and +-decodes) internally and then outputs the decoded form. Several different inputs decode to the same value, so the original form is lost (my bad, I didn't notice it earlier). The problem is that we are looking for a way to take a query such as a=%20&b=,&c=remove&e=+&f=a b and, when we remove e.g. c, leave the rest of it untouched. That is, it should remove just &c=remove and nothing else. Everything else should stay as given, a=%20&b=,&e=+&f=a b, in to_raw/unsafe_string (in other words, do no processing on the parts we are not touching; removal is our current use case, and it is fine if only that is handled for now).
Thus the previous implementation was closer to our goal. I am not sure whether that is easy to do in the Ada code base. I hope my answer has not caused a lot of pain.
Or let's take a more complete example of removing c:
Input:
a=%20&b=,&c=remove&ä=ö&%C3%A4=%C3%B6&e=+&f=a b
NORMALIZED (to_string): a=+&b=%2C&%C3%A4=%C3%B6&%C3%A4=%C3%B6&e=+&f=a+b
RAW (to_raw_string): a= &b=,&ä=ö&ä=ö&e= &f=a b
WANTED: a=%20&b=,&ä=ö&%C3%A4=%C3%B6&e=+&f=a b
We are looking to retain as much of the original as possible, i.e. a non-destructive modification.
I am afraid this has turned out to be more difficult than originally anticipated.
I think what we see is that when Ada takes the input, it decodes it, and in that process we lose information about the original: e.g. was there a + or a %20 (or even an invalid literal space), or was there a , or a %2C? The decoding is needed, though, for the APIs to work naturally; e.g. remove should work as if the keys are in decoded form. Perhaps what we need, then, is ada_parse_unsafe_search_params in addition to ada_search_params_to_unsafe_string? (Though how would ada_search_params_remove be implemented then, since the key still needs to be normalized for it to work?)
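One way to picture the behaviour requested above is a removal that operates on the raw query text: each pair's key is decoded only for the comparison, and every pair that stays is copied byte for byte. The sketch below is standalone and is not how Ada implements anything; `form_decode` and `remove_preserving_raw` are hypothetical names, and empty segments (as in `a&&b`) are dropped for simplicity.

```cpp
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical helper (not part of Ada): value of one hex digit, or -1.
static int hex_value(char c) {
  if (c >= '0' && c <= '9') return c - '0';
  if (c >= 'a' && c <= 'f') return c - 'a' + 10;
  if (c >= 'A' && c <= 'F') return c - 'A' + 10;
  return -1;
}

// Hypothetical helper: '+' becomes a space, %XX becomes one byte.
std::string form_decode(std::string_view in) {
  std::string out;
  for (std::size_t i = 0; i < in.size(); ++i) {
    if (in[i] == '+') {
      out.push_back(' ');
    } else if (in[i] == '%' && i + 2 < in.size() &&
               hex_value(in[i + 1]) >= 0 && hex_value(in[i + 2]) >= 0) {
      out.push_back(static_cast<char>(hex_value(in[i + 1]) * 16 +
                                      hex_value(in[i + 2])));
      i += 2;
    } else {
      out.push_back(in[i]);
    }
  }
  return out;
}

// Hypothetical: drop every pair whose *decoded* key equals `name`;
// all other pairs are appended exactly as they appeared in `query`.
std::string remove_preserving_raw(std::string_view query,
                                  std::string_view name) {
  std::string out;
  std::size_t pos = 0;
  while (pos <= query.size()) {
    std::size_t amp = query.find('&', pos);
    if (amp == std::string_view::npos) amp = query.size();
    std::string_view pair = query.substr(pos, amp - pos);
    // Key is everything before the first '=', or the whole pair.
    std::string_view raw_key = pair.substr(0, pair.find('='));
    if (!pair.empty() && form_decode(raw_key) != name) {
      if (!out.empty()) out.push_back('&');
      out.append(pair);  // untouched: no decoding, no re-encoding
    }
    pos = amp + 1;
  }
  return out;
}
```

Under these assumptions, `remove_preserving_raw("a=%20&b=,&c=remove&e=+&f=a b", "c")` returns `a=%20&b=,&e=+&f=a b`, and a key written as `%63` is also matched by `"c"` because the comparison uses the decoded key. The output can contain characters that a conforming serializer would have encoded, which is the trade-off the discussion is about.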
@lemire would you mind taking a look?
Looking
@anonrig I pushed a new test. It seems that we want to allow the production of a URL such as... My recommendation:
Where the responsibility lies must be clear. It is fine to deliberately break the standard, but we must be absolutely explicit so that it does not come back to haunt us.
lemire
left a comment
I recommend that @anonrig consider my comment before merging this (specifically, renaming the function so that it has the word unsafe in it).
Yes, I agree. Let's rename the function to to_unsafe_string.
@copilot can you rename to_raw_string to to_unsafe_string?
Fixes #1022