Whitespace encoding using stringByAddingPercentEscapesUsingEncoding - objective-c

I am encoding white spaces in a string using
[#"iPhone Content.doc" stringByAddingPercentEscapesUsingEncoding:NSUTF8StringEncoding]
in SKPSMTP message sending. But while receiving mail at attachments place I am getting the name iPhone%20Content.doc - instead of a space it shows %20. How can this be avoided / correctly encoded?

If you're doing stringByAddingPercentEscapesUsingEncoding then you're going to get percent signs in your result string... You can either use something different, or go back through and remove the percent signs later.
From the doc:
stringByAddingPercentEscapesUsingEncoding: Returns a representation of
the receiver using a given encoding to determine the percent escapes
necessary to convert the receiver into a legal URL string.
aka, "this method adds percent signs". If you want to reverse this process, use stringByReplacingPercentEscapesUsingEncoding
Just a side note, %20 is there because the hex representation of the space character is 20 and the % sign is an escape. You only need to do this for URLs, as they disallow the use of whitespace characters.

I got solution for my question. Actually am missed to set the "" to a string.

Of course the remote receiver can not accept the url with whitespace, so we must convert the URL address using the stringByAddingPercentEscapesUsingEncoding function.
This function replaces spaces in the URL expression with %20. It is especially useful when the URL contains non-ascii characters - you have use the function to percent-escape the URL so that the remote server can accept your request.

Related

AspNet core Url decoding

I am using AspNetCore 2.1.
I encountered an issue to deserialize a portion of URL:
http://localhost:55381/api/Umbrellas/cc1892b0-b790-4698-ae3e-07bee39fd29b/ModeOperationnelWithAppliedEvents?dateDeValeur=2018-09-01T02:00:00.000+02:00
the part "2018-09-01T02:00:00.000+02:00" is expected to be deserialized as DateTimeOffset. But it failed to do it. A default(DateTimeOffset) is returned.
If I encode to this format "2018-09-01T02%3A00%3A00.000%2B02%3A00" => correctly deserialized.
When it is enclosed in URL, that does not work.
In the contrarily, when the same format is enclosed in the body of message, it is correctly deserialized.
{"lastKnownAggregateVersion":4,"validFrom":"2017-09-03T00:00:00.000+02:00","commandId":"0cfa7da0-7895-4917-89ac-24ffa3abb87c","newDateDeValeur":"2017-09-03T00:00:00.000+02:00","eventUniqueIdentifier":{"streamName":"umbrella-54576b92-0234-4ec1-8eee-142375c53325","eventVersion":0},"aggregateId":"54576b92-0234-4ec1-8eee-142375c53325"}
According to RFC3986 both colon ':' and '+' is legal char in a URL. Does anyone have an idea on this?
Ok it turns out URL and URI have different standard
the URL standard is here RFC1738: Uniform Resource Locators (URL). So according to the doc, ':' is reserved for scheme.
Many URL schemes reserve certain characters for a special meaning:
their appearance in the scheme-specific part of the URL has a
designated semantics. If the character corresponding to an octet is
reserved in a scheme, the octet must be encoded. The characters ";",
"/", "?", ":", "#", "=" and "&" are the characters which may be
reserved for special meaning within a scheme. No other characters may
be reserved within a scheme.
and when it goes to +:
Thus, only alphanumerics, the special characters "$-_.+!*'(),", and
reserved characters used for their reserved purposes may be used
unencoded within a URL.

Defining the contents of a URL

I have a web site which contains forms for my customers to download. They are constantly telling me that the %20 listed in the url when they are looking at a form means there is a 20% discount on the items listed on the form. The following url is what is displayed in one example. Can you explain to me what the %20 means in this url? http://www.schumachersuniforms.com/form/Atonement%20PreK.pdf
Percent Encoding
A URL cannot contain certain characters. The SPACE character is one of the those forbidden chapters.
Your PDF document is apparently named with a SPACE in the middle, Atonement PreK.pdf.
Percent Encoding, also known as URL Encoding, is a way to replace the offending characters with a sequence of other characters. That sequence begins with a PERCENT SIGN character. A hexadecimal number of the character’s code point follows.
The decimal code point for SPACE is 32, the hex is 20. So the string %20 substitutes for the SPACE.
No way around this:
If you really don't want the %20, then avoid naming your PDF document with space characters. Example: AtonementPreK.pdf.
Or use a more sophisticated web scheme for handling the URL triggering a download other than directly referencing the file name.
Do not confuse URL encoding with HTML (and XML) character entity references.

Sum symbol in POST

I send to my sever this POST petition:
name=pepito&friends=pep+mar
And I notice what arrives is
name=pepito&friends=pep mar
Why sum symbol disappears?
Than you
A + sign is one way to represent spaces in URLs. The other would be %20. There has been a question describing this: When to encode space to plus (+) or %20?
If you want to send a plus, you have to properly encode it with %2B.
Read up on allowed characters in URLs and URLEncode functions.

unicode escapes in objective-c

I have a string "Artîsté". I use json_encode from PHP on it and I get "Art\u00eest\u00e9".
How do I convert that to an NSString? I have tried many things and none of them work I always end up getting Artîsté
For Example:
NSString stringWithUTF8String:"Art\u00c3\u00aest\u00c3\u00a9"];//Artîsté
#"Art\u00c3\u00aest\u00c3\u00a9"; //Artîsté
You can use CFStringCreateFromExternalRepresentation with the kCFStringEncodingNonLossyASCII encoding to parse the \uXXXX escape sequences. Check out my answer here:
Converting escaped UTF8 characters back to their original form
The problem is your input string:
"Art\u00c3\u00aest\u00c3\u00a9"
does in fact literally mean "Artîsté". \u00c3 is 'Ã', \u00ae is '®', and \u00a9 is '©'.
Whatever is producing your input string is receiving UTF-8 input but expecting something else (e.g., cp1252, ISO-8859-1, or ISO-8859-15)

When should space be encoded to plus (+) or %20? [duplicate]

This question already has answers here:
URL encoding the space character: + or %20?
(5 answers)
Closed 1 year ago.
Sometimes the spaces get URL encoded to the + sign, and some other times to %20. What is the difference and why should this happen?
+ means a space only in application/x-www-form-urlencoded content, such as the query part of a URL:
http://www.example.com/path/foo+bar/path?query+name=query+value
In this URL, the parameter name is query name with a space and the value is query value with a space, but the folder name in the path is literally foo+bar, not foo bar.
%20 is a valid way to encode a space in either of these contexts. So if you need to URL-encode a string for inclusion in part of a URL, it is always safe to replace spaces with %20 and pluses with %2B. This is what, e.g., encodeURIComponent() does in JavaScript. Unfortunately it's not what urlencode does in PHP (rawurlencode is safer).
See Also
HTML 4.01 Specification application/x-www-form-urlencoded
So, the answers here are all a bit incomplete. The use of a '%20' to encode a space in URLs is explicitly defined in RFC 3986, which defines how a URI is built. There is no mention in this specification of using a '+' for encoding spaces - if you go solely by this specification, a space must be encoded as '%20'.
The mention of using '+' for encoding spaces comes from the various incarnations of the HTML specification - specifically in the section describing content type 'application/x-www-form-urlencoded'. This is used for posting form data.
Now, the HTML 2.0 specification (RFC 1866) explicitly said, in section 8.2.2, that the query part of a GET request's URL string should be encoded as 'application/x-www-form-urlencoded'. This, in theory, suggests that it's legal to use a '+' in the URL in the query string (after the '?').
But... does it really? Remember, HTML is itself a content specification, and URLs with query strings can be used with content other than HTML. Further, while the later versions of the HTML spec continue to define '+' as legal in 'application/x-www-form-urlencoded' content, they completely omit the part saying that GET request query strings are defined as that type. There is, in fact, no mention whatsoever about the query string encoding in anything after the HTML 2.0 specification.
Which leaves us with the question - is it valid? Certainly there's a lot of legacy code which supports '+' in query strings, and a lot of code which generates it as well. So odds are good you won't break if you use '+'. (And, in fact, I did all the research on this recently because I discovered a major site which failed to accept '%20' in a GET query as a space. They actually failed to decode any percent encoded character. So the service you're using may be relevant as well.)
But from a pure reading of the specifications, without the language from the HTML 2.0 specification carried over into later versions, URLs are covered entirely by RFC 3986, which means spaces ought to be converted to '%20'. And definitely that should be the case if you are requesting anything other than an HTML document.
http://www.example.com/some/path/to/resource?param1=value1
The part before the question mark must use % encoding (so %20 for space), after the question mark you can use either %20 or + for a space. If you need an actual + after the question mark use %2B.
For compatibility reasons, it's better to always encode spaces as "%20", not as "+".
It was RFC 1866 (HTML 2.0 specification), which specified that space characters should be encoded as "+" in "application/x-www-form-urlencoded" content-type key-value pairs. (see paragraph 8.2.1. subparagraph 1.). This way of encoding form data is also given in later HTML specifications, look for relevant paragraphs about application/x-www-form-urlencoded.
Here is an example of a URL string where RFC 1866 allows encoding spaces as pluses: "http://example.com/over/there?name=foo+bar". So, only after "?", spaces can be replaced by pluses, according to RFC 1866. In other cases, spaces should be encoded to %20. But since it's hard to determine the context, it's the best practice to never encode spaces as "+".
I would recommend to percent-encode all characters except "unreserved" defined in RFC 3986, p.2.3.
unreserved = ALPHA / DIGIT / "-" / "." / "_" / "~"
The only situation when you may want to encode spaces as "+" (one byte) rather than "%20" (three bytes) is when you know for sure how to interpret the context, and when the size of the query string is of the essence.
What's the difference? See the other answers.
When should we use + instead of %20? Use + if, for some reason, you want to make the URL query string (?.....) or hash fragment (#....) more readable. Example: You can actually read this:
https://www.google.se/#q=google+doesn%27t+encode+:+and+uses+%2B+instead+of+spaces
(%2B = +)
But the following is a lot harder to read (at least to me):
https://www.google.se/#q=google%20doesn%27t%20oops%20:%20%20this%20text%20%2B%20is%20different%20spaces
I would think + is unlikely to break anything, since Google uses + (see the 1st link above) and they've probably thought about this. I'm going to use + myself just because readable + Google thinks it's OK.