Add special characters to Uri, Kotlin

So, I have my base URL, which is this:
val GITHUB_BASE_URL: String = "https://api.github.com/search/repositories"
And then I have this code that appends the param q (REPO_NAME_PARAM == query) to the Uri and builds it:
val builtUri: Uri = Uri.parse(GITHUB_BASE_URL).buildUpon()
.appendQueryParameter(REPO_NAME_PARAM, repoName)
.build()
Up to here, everything works fine. But when I try to filter the repository search by the language the repositories are written in (for example, the URL should be https://api.github.com/search/repositories?q=hello+language:Kotlin), the + and : characters get replaced by %2B and %3A. As a result, the app does not retrieve the expected results, because the characters are changed in the final URL.
This is the code that I currently have:
val WRITTEN_IN_PARAM: String = "+language:"
val builtUri: Uri = Uri.parse(GITHUB_BASE_URL).buildUpon()
.appendQueryParameter(REPO_NAME_PARAM, repoName + WRITTEN_IN_PARAM + "Kotlin")
.build()

2B or not 2B, that is the question. :)
The problem is that the parameter ends up URL-encoded twice. Certain characters must be encoded when they are sent in HTTP queries. One encoding of a space, a shortcut that comes from HTML form encoding, is the + symbol; the general percent-encoding of a space is %20.
However, when the code above receives that already-encoded String, it doesn't know that the + already stands for a space, so it encodes it again (as %2B, the encoding for a literal +).
If you hit the URL you've provided with %20 in place of + and %3A in place of :, it should work fine. Therefore, the fix is not to send a + unless you really want a literal +, in which case it will be properly encoded as %2B.
The Fix: The library being used appears to encode strings correctly. Just pass a space where you currently have the + (that is, " language:" instead of "+language:") and it should give you what you need.
Here is a good list of characters and their encoding, if you are interested.
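To make the double-encoding point concrete, here is a minimal sketch in Python. It uses urllib.parse as a stand-in for the Android Uri builder, so it illustrates the encoding rule described above rather than Uri.Builder itself. The helper name build_search_url is hypothetical.

```python
from urllib.parse import quote, unquote

BASE_URL = "https://api.github.com/search/repositories"


def build_search_url(raw_query: str) -> str:
    """Percent-encode the raw query value exactly once and append it as q."""
    # safe="" means every reserved character (space, +, :) gets encoded.
    return f"{BASE_URL}?q={quote(raw_query, safe='')}"


# Passing an already-encoded value: the + is treated as a literal plus sign.
pre_encoded = build_search_url("hello+language:Kotlin")

# Passing the raw value: the space is encoded once, as %20.
raw = build_search_url("hello language:Kotlin")

print(pre_encoded)  # ...?q=hello%2Blanguage%3AKotlin
print(raw)          # ...?q=hello%20language%3AKotlin

# Decoding the second form gives back the intended search expression.
print(unquote("hello%20language%3AKotlin"))  # hello language:Kotlin
```

The second URL is the one the answer recommends: the builder receives a plain space and does the encoding itself.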

Related

Is it possible to use apache's URIBuilder to set a parameter with percentage sign?

I want to build this complete URL:
locahost/some/path?param1=%06
using org.apache.http.client.utils.URIBuilder method setParameter(final String param, final String value). Its Javadoc contains this line:
The parameter name and value are expected to be unescaped and may contain non ASCII characters
But when I use setParameter("param1","%06") I always get ...param1=%2506 instead of ...param1=%06. Looking here I noticed percent sign is 25 in hex.
Should I parse this manually, or is there a way to keep using URIBuilder and keep the parameters as they are?
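The Javadoc line quoted above says the value is expected to be unescaped, which would explain the %25. The sketch below shows that rule in Python with urllib.parse; it is not URIBuilder, and it assumes that %06 in the desired URL is meant as the percent-encoding of the single byte 0x06. Whether URIBuilder emits exactly %06 for that control character is not verified here.

```python
from urllib.parse import quote, unquote

wanted = "%06"  # the text that should appear in the final URL

# Treating "%06" as an unescaped value encodes the percent sign itself.
double_encoded = quote(wanted, safe="")

# Decoding first yields the raw character (assumed here to be byte 0x06) ...
raw_value = unquote(wanted)

# ... which an encoder that expects unescaped input then encodes once.
single_encoded = quote(raw_value, safe="")

print(double_encoded)  # %2506
print(single_encoded)  # %06
```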

How to replace "%2B" with "+" when calling RedirectToAction()

I'm using the RedirectToAction() method in an ASP.NET Core 2.1 controller called CatalogController:
return RedirectToAction("search", new { search_string = "example+string" });
This redirects to a URL:
catalog/search/?search_string=example%2Bstring
How do I replace the %2B encoding with a + instead?
The URL should instead look like:
catalog/search/?search_string=example+string
The RedirectToAction() method assumes that any values passed via the RouteValues parameter are not encoded; the RedirectToAction() method will take care of the URL encoding on your behalf. As such, when you enter a +, it's treating it as a literal + symbol, not an encoded space.
%2B is the correct encoding for a literal + symbol. If you want a space to be encoded in the URL, then you should enter a space in your RouteValues dictionary (e.g., search_string = "example string"). This will encode the space as a %20.
Note: In a query string, %20 and + both decode to a space, so I'm assuming that will satisfy your requirements.
If your search_string value is coming from a URL encoded source, you will need to first decode it using e.g. WebUtility.UrlDecode(). That said, if you're retrieving your search_string value from an action parameter or binding model, this decoding should already be done for you.
If, for some reason, you want to treat literal + symbols as spaces, you'll need to explicitly perform that replace on your source value (e.g., search_string.Replace("+", " ")).
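The decode-then-encode flow described in this answer can be sketched as follows. This is Python with urllib.parse rather than ASP.NET Core, so it only illustrates the encoding rules; the variable names are made up for the example.

```python
from urllib.parse import quote, quote_plus, unquote_plus

incoming = "example+string"  # value taken from a form-encoded source

# Decode first, so the + becomes the space it stands for.
decoded = unquote_plus(incoming)

# Encoding the decoded value gives %20 (percent style) or + (form style).
as_percent = quote(decoded, safe="")
as_plus = quote_plus(decoded)

# Encoding the undecoded value treats + as a literal plus sign.
literal_plus = quote(incoming, safe="")

print(decoded)       # example string
print(as_percent)    # example%20string
print(as_plus)       # example+string
print(literal_plus)  # example%2Bstring
```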

Inconsistencies in URL encoding methods across Objective-C and Swift

I have the following Objective-C code:
[@"http://www.google.com" stringByAddingPercentEncodingWithAllowedCharacters:[NSCharacterSet URLPathAllowedCharacterSet]];
// http%3A//www.google.com
And yet, in Swift:
"http://www.google.com".addingPercentEncoding(withAllowedCharacters: .urlPathAllowed)
// http://www.google.com
To what can I attribute this discrepancy?
..and for extra credit, can I rely on this code to encode for url path reserved characters while passing a full url like this?
The issue actually rests in the difference between NSString method stringByAddingPercentEncodingWithAllowedCharacters and String method addingPercentEncoding(withAllowedCharacters:). And this behavior has been changing from version to version. (It looks like the latest beta of iOS 11 now restores this behavior we used to see.)
I believe the root of the issue rests in the particulars of how paths are percent encoded. Section 3.3 of RFC 3986 says that colons are permitted in paths except in the first segment of a relative path.
The NSString method captures this notion, e.g. imagine a path whose first directory was foo: (with a colon) and a subdirectory of bar: (also with a colon):
NSString *string = @"foo:/bar:";
NSCharacterSet *cs = [NSCharacterSet URLPathAllowedCharacterSet];
NSLog(@"%@", [string stringByAddingPercentEncodingWithAllowedCharacters:cs]);
That results in:
foo%3A/bar:
The : in the first segment of the path is percent encoded, but the : in subsequent segments is not. This captures the logic of how to handle colons in relative paths per RFC 3986.
The String method addingPercentEncoding(withAllowedCharacters:), however, does not do this:
let string = "foo:/bar:"
os_log("%@", string.addingPercentEncoding(withAllowedCharacters: .urlPathAllowed)!)
Yields:
foo:/bar:
Clearly, the String method does not attempt that position-sensitive logic. This implementation is more in keeping with the name of the method (it considers solely what characters are "allowed" with no special logic that tries to guess, based upon where the allowed character appears, whether it's truly allowed or not.)
I gather that you are saddled with the code supplied in the question, but we should note that this behavior of percent escaping colons in relative paths, while interesting to explain what you experienced, is not really relevant to your immediate problem. The code you have been provided is simply incorrect. It is attempting to percent encode a URL as if it was just a path. But, it’s not a path; it’s a URL, which is a different thing with its own rules.
The deeper insight in percent encoding URLs is to acknowledge that different components of a URL allow different sets of characters, i.e. they require different percent encoding. That’s why NSCharacterSet has so many different URL-related character sets.
You really should percent encode the individual components, percent encoding each with the character set allowed for that type of component. Only when the individual components are percent encoded should they then be concatenated together to form the whole URL.
Alternatively, NSURLComponents is designed precisely for this purpose, getting you out of the weeds of percent-encoding the individual components yourself. For example:
var components = URLComponents(string: "http://httpbin.org/post")!
let foo = URLQueryItem(name: "foo", value: "bar & baz")
let qux = URLQueryItem(name: "qux", value: "42")
components.queryItems = [foo, qux]
let url = components.url!
That yields the following, with the & and the two spaces properly percent escaped within the foo value, but it correctly left the & in-between foo and qux:
http://httpbin.org/post?foo=bar%20%26%20baz&qux=42
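For comparison, the same per-component idea can be sketched in Python with urllib.parse. This is an illustration of the principle, not of NSURLComponents; the helper name with_query and the sample path are made up for the example.

```python
from urllib.parse import quote, urlencode, urlsplit, urlunsplit


def with_query(base: str, params: dict) -> str:
    """Attach params to base, percent-encoding each name and value on its own."""
    parts = urlsplit(base)
    # quote_via=quote encodes spaces as %20 instead of the form-style +.
    query = urlencode(params, quote_via=quote)
    return urlunsplit((parts.scheme, parts.netloc, parts.path, query, parts.fragment))


url = with_query("http://httpbin.org/post", {"foo": "bar & baz", "qux": "42"})
print(url)  # http://httpbin.org/post?foo=bar%20%26%20baz&qux=42

# A path component has a different allowed set: / stays, ? must be escaped.
path = quote("/my docs/a?b", safe="/")
print(path)  # /my%20docs/a%3Fb
```

The & inside the foo value is escaped, while the & that separates the two pairs is added by the encoder and left alone.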
It’s worth noting, though, that NSURLComponents has a small, yet fairly fundamental flaw: if you have query values (NSURLQueryItem) that could contain + characters, most web services need the + percent escaped, but NSURLComponents won’t escape it. If your URL has query components and those query values might include + characters, I’d advise against NSURLComponents and would instead advise percent encoding the individual components of a URL yourself.
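Why an unescaped + matters can be seen from the receiving side. The sketch below uses Python's parse_qs as a stand-in for a web service that applies form-style decoding; the parameter name expr is made up for the example.

```python
from urllib.parse import parse_qs, quote

value = "1+1"

# Query with the + left as-is, and with the + percent escaped.
unescaped_query = f"expr={value}"
escaped_query = f"expr={quote(value, safe='')}"

# A form-style decoder turns a bare + into a space.
seen_unescaped = parse_qs(unescaped_query)["expr"][0]
seen_escaped = parse_qs(escaped_query)["expr"][0]

print(escaped_query)   # expr=1%2B1
print(seen_unescaped)  # 1 1
print(seen_escaped)    # 1+1
```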

Amazon s3 URL + being encoded to %2?

I've got Amazon s3 integrated with my hosting account at WP Engine. Everything works great except when it comes to files with + characters in them.
For example in the following case when a file is named: test+2.pdf
http://support.mcsolutions.com/wp-content/uploads/2011/11/test+2.pdf = does not work.
The following URL is the Amazon URL. Notice the + character is encoded. Is there a way to prevent/change this?
http://mcsolutionswpe.s3.amazonaws.com/mcsupport/wp-content/uploads/2011/11/test%2b2.pdf
Other URLs work fine:
Amazon -> http://mcsolutionswpe.s3.amazonaws.com/mcsupport/wp-content/uploads/2011/11/test2.pdf
Website -> http://support.mcsolutions.com/wp-content/uploads/2011/11/test2.pdf
If I understand your question correctly, then no, there is no way to really change this.
The cause appears to be an unfortunate design decision made on S3 many years ago -- which, of course, cannot be fixed, now, because it would break too many other things -- which involves S3 using an incorrect variant of URL-escaping (which includes but is not quite limited to "percent-encoding") in the path part of the URL, where the object's key is sent.
In the query string (the optional part of a URL after ? but before the fragment, if present, which begins with #), the + character is considered equivalent to [SPACE], (ASCII Dec 32, Hex 0x20).
...but in the path of a URL, this is not supposed to be the case.
...but in S3's implementation, it is.
So + doesn't actually mean +, it means [SPACE]... and therefore, + can't also mean +... which means that a different expression is required to convey + -- and that value is %2B, the url-escaped value of + (ASCII Dec 43, Hex 0x2B).
When you upload your files, the + is converted by the code you're using (assuming it understands this quirk, as apparently it does) into the format S3 expects (%2B)... and so it must also be requested using %2B when you download the files.
Strangely, but not surprisingly, if you store the file in S3 with a space in the path, you can actually request it with a + or a space or even %20 and all three of these should actually fetch the file... so if seeing the + in the path is what you want, you can sort of work around the issue by saving it with a space instead, though this workaround deserves to be described as a "hack" if ever a workaround did. This tactic will not work with libraries that generate pre-signed GET URLs, unless they specifically are designed to ignore the standard behavior of S3 and do what you want, instead... but for public links, it should be essentially equivalent.
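The difference between the two decoding rules described above can be sketched in Python. This is only a model of the behavior this answer attributes to S3, using urllib.parse; it does not call S3.

```python
from urllib.parse import unquote, unquote_plus

key_in_url = "test+2.pdf"

# Standard path decoding: a + is just a +.
standard_path = unquote(key_in_url)

# Query-style decoding (what the answer says S3 applies to paths): + means space.
query_style = unquote_plus(key_in_url)

# With query-style decoding, a literal + has to be sent as %2B.
escaped_plus = unquote_plus("test%2b2.pdf")

print(standard_path)  # test+2.pdf
print(query_style)    # test 2.pdf
print(escaped_plus)   # test+2.pdf
```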

Usage of url_encode

I tried using Ruby's url_encode (doc here.)
It encodes http://www.google.com as http%3A%2F%2Fwww.google.com. But it turns out that I cannot open the latter via a browser. If so, what's the use of this function? What is it useful for, when the URL that it encodes can't even be opened?
A typical use is the HTTP GET method, where you need a query string.
Query string 1:
valueA=john&valueB=john2
Values the server actually receives:
valueA : "john"
valueB : "john2"
url_encode lets a key-value pair carry a string that contains characters that cannot appear unescaped in a URL, such as spaces and other special characters.
Suppose valueB stores my name, code 4 j. You need to encode it because it contains spaces.
url_encode("code 4 j")
code%204%20j
Query string 2:
valueA=john&valueB=code%204%20j
Values the server actually receives:
valueA: "john"
valueB: "code 4 j"
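The same round trip can be sketched in Python, with urllib.parse standing in for Ruby's url_encode and for the server-side parsing. The server is simulated with parse_qs, which is an assumption made for the example.

```python
from urllib.parse import parse_qs, quote

# Encode only the value, not the whole query string.
encoded = quote("code 4 j", safe="")
query = f"valueA=john&valueB={encoded}"

# Simulate the server decoding the query string.
received = parse_qs(query)

print(encoded)   # code%204%20j
print(query)     # valueA=john&valueB=code%204%20j
print(received)  # {'valueA': ['john'], 'valueB': ['code 4 j']}
```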
You can use url_encode to encode for example the keys/values of a GET request.
Here is an example of what a SO search query URL looks after encoding:
https://stackoverflow.com/questions/tagged/c%23+or+.net+or+asp.net
As you can see, url encoding appears to be applied only on the last part of the URL, after the last slash.
In general you cannot use url_encode on your entire URL or you will also encode the special characters in a normal URL like the :// in your example.
You can check a tutorial that explains how it works here: http://www.permadi.com/tutorial/urlEncoding/
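The difference between encoding a whole URL and encoding one component can be sketched like this. It is Python with urllib.parse, not Ruby's url_encode, so treat it as an illustration of the rule rather than of that function.

```python
from urllib.parse import quote, quote_plus

# Encoding the entire URL also escapes the :// that gives it its structure.
whole = quote("http://www.google.com", safe="")

# Encoding only the variable part keeps the rest of the URL intact.
tag_expr = quote_plus("c# or .net or asp.net")
url = "https://stackoverflow.com/questions/tagged/" + tag_expr

print(whole)  # http%3A%2F%2Fwww.google.com
print(url)    # https://stackoverflow.com/questions/tagged/c%23+or+.net+or+asp.net
```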