Is it right to encode "+" and "-" in that way? - objective-c

I am trying to percent-encode a UTF-8 string, and all the required characters are encoded except "+" and "-".
I investigated NSCharacterSet and found that the standard URLHostAllowedCharacterSet does not fully cover my case.
I decided to create my own NSCharacterSet with the symbols that should be replaced:
NSCharacterSet *customCharacterSet = [[NSCharacterSet characterSetWithCharactersInString:@" \"#%/:<>?@[\\]^`{|}+-"] invertedSet];
This works, but I would like to know whether it is right to do this replacement on my own, or whether there is a standard method for it that I have not found.

If you explicitly need to also encode the valid characters + and -, then your custom character set is the right solution.
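As an alternative way of building such a set, here is a minimal sketch that starts from a standard set and removes the extra characters, rather than listing every character by hand. The choice of URLQueryAllowedCharacterSet as the base and the sample input are assumptions for illustration, not part of the question.

```objc
#import <Foundation/Foundation.h>

int main(void) {
    @autoreleasepool {
        // Start from a standard allowed set (assumed here: the query set)
        // and take "+" and "-" out of it so that they get percent-encoded too.
        NSMutableCharacterSet *allowed =
            [[NSCharacterSet URLQueryAllowedCharacterSet] mutableCopy];
        [allowed removeCharactersInString:@"+-"];

        NSString *input = @"a+b-c d";  // hypothetical sample value
        NSString *encoded =
            [input stringByAddingPercentEncodingWithAllowedCharacters:allowed];
        NSLog(@"%@", encoded);
    }
    return 0;
}
```

Any character that is not in the allowed set is replaced by its percent-encoded UTF-8 bytes, so removing a character from the set is what forces it to be escaped.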

Related

NSURL returning nil for certain cases

I'm creating an NSURL to send as a request to a PHP rest API that I've got setup. Here's my code below:
NSMutableString *url = [NSMutableString stringWithFormat:@"http://www.private.com/recievedata.php?item=%@&contact=%@&discovery=%@&summary=%@",__item,__contactDetails,__lostFound,__explain];
//The " ' " in PHP is a special character, so we have to escape it in the URL
//The two slashes are because the "\" itself is a special character in Objective C
NSString *formattedURL = [url stringByReplacingOccurrencesOfString:@"'" withString:@"\\'"];
NSURLSession *session = [NSURLSession sharedSession];
NSURL *address = [NSURL URLWithString:formattedURL];
For some reason though, the address variable holds nil whenever I try to make the following url happen:
http://www.private.com/recievedata.php?item=hell\'&contact=hell&discovery=lost&summary=hell
I had to add "\" in front of the apostrophes as is evident from my code because PHP needs to have the " ' " escaped in its URLs. But by doing so, it seems like I've violated some requirement set out for NSURL. What do you guys think?
You said:
I had to add "\" in front of the apostrophes as is evident from my code because PHP needs to have the " ' " escaped in its URLs. But by doing so, it seems like I've violated some requirement set out for NSURL. What do you guys think?
Yes, in your PHP code, if you are using single quotation marks around your string values, then you have to escape any ' characters that appear in your string literals.
But it is not the case that you should be using \ character to escape ' characters that appear in the values you are trying to pass in the URL of your request. (Nor should you do so in the body of a POST request.)
And if you are tempted to escape these ' characters because you're inserting these values into SQL statements, you should not be manually escaping them. Instead you should call mysqli_real_escape_string or explicitly bind values to ? placeholders in your SQL. Never rely upon the client code to escape these values (because you'd still be susceptible to SQL injection attacks).
Getting back to the encoding of reserved characters in a URL, this is governed by RFC 3986. The prudent strategy is to percent escape any character other than those in the unreserved character set. Notably, if your values might include & or + characters, the percent escaping is critical.
Thus, the correct encoding of the ' character in a URL query value is not \' but rather %27. Unfortunately, encoding with stringByAddingPercentEscapesUsingEncoding is not sufficient, though (as it will let certain critical characters go unescaped in the value strings).
Historically, we'd percent escape values using CFURLCreateStringByAddingPercentEscapes (see Urlencode in Objective-C). The stringByAddingPercentEncodingWithAllowedCharacters method, introduced in Mac OS X 10.9 and iOS 7, can work, too (see https://stackoverflow.com/a/24888789/1271826 for an illustration of the possible character sets).
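To make the "escape everything but the unreserved characters" strategy concrete, here is a minimal sketch using stringByAddingPercentEncodingWithAllowedCharacters. The helper name PercentEscapeQueryValue is hypothetical, and the allowed set is spelled out from the RFC 3986 unreserved characters (letters, digits, "-", ".", "_", "~").

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: percent-escape every character that is not in the
// RFC 3986 unreserved set, so "&", "+", "'" and so on are all escaped.
static NSString *PercentEscapeQueryValue(NSString *value) {
    NSCharacterSet *unreserved = [NSCharacterSet characterSetWithCharactersInString:
        @"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-._~"];
    return [value stringByAddingPercentEncodingWithAllowedCharacters:unreserved];
}

int main(void) {
    @autoreleasepool {
        // Escape each value individually, then build the URL string.
        NSString *item = PercentEscapeQueryValue(@"hell'");
        NSString *urlString = [NSString stringWithFormat:
            @"http://www.private.com/recievedata.php?item=%@", item];

        // The apostrophe is now sent as %27, so no backslash is needed.
        NSURL *address = [NSURL URLWithString:urlString];
        NSLog(@"%@", address);
    }
    return 0;
}
```

The important design point is that each value is escaped on its own before being inserted into the URL, so the "&" and "=" separators of the query itself are left alone.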

How do I remove hidden characters from a NSString?

After copying and pasting some text from the web into an NSTextArea in my Mac app, I see
EE
If I copy these 2 letters in a browser I see:
E?E
If I copy them in google translator I get
E 'E
I cannot identify this character in between the two E. But the question is: how do I remove these hidden characters from my NSString?
In your uploaded file the specific hex code for the hidden character is 0x18
(found via Hex Fiend)
This character, along with others, is part of the 'control character set'. The set also contains characters such as the tab (0x09) and newline (0x0A), which we obviously don't want to remove.
In Objective-C, we can use the NSCharacterSet controlCharacterSet in conjunction with whitespaceAndNewlineCharacterSet to get just the blank characters that have no rendered width.
NSMutableCharacterSet* zeroWidthCharacterSet = [[NSCharacterSet controlCharacterSet] mutableCopy];
[zeroWidthCharacterSet formIntersectionWithCharacterSet:[[NSCharacterSet whitespaceAndNewlineCharacterSet] invertedSet]];
Then we can simply use the good old split by character set method
string = [[string componentsSeparatedByCharactersInSet:zeroWidthCharacterSet] componentsJoinedByString:@""];
Note that if a special character that needs more than one UTF-8 code unit to represent itself (like an emoji) uses 0x18, then stripping it will break the character sequence.
Because the control characters are special, I don't believe you'd ever find them in an Emoji sequence.

Regex (searching for function(@"string content") to get "string content"

I have a little regex problem (don't we all sometimes).
The few pieces of code are from Objective C but regex expressions are still the same I believe.
I have two functions called
NSString * CRLocalizedString(NSString *key)
NSString * CRLocalizedArgString(NSString *key, ...)
These are scattered around my project for localisation.
Now I want to find them all.
Well, go to the directory, parse all the files, etc.
All fine there.
The regexes I use on the files are
[NSRegularExpression regularExpressionWithPattern:@"CRLocalizedString\\(@\\\"[^)]+\\\"\\)" options:0 error:&error];
[NSRegularExpression regularExpressionWithPattern:@"CRLocalizedArgString\\([^)]+\\)" options:0 error:&error];
And this works perfectly, except that my terminating character is a ).
The problem occurs with function calls like this
CRLocalizedString(@"Happy =), o so happy =D");
CRLocalizedArgString(@"Filter (%i)", 0.75f);
The regex ends the string at "Filter (%i" and at "Happy =)".
And this is where my regex knowledge ends and I do not know what to do anymore.
I thought of using ");" as an end, but this isn't always the case.
So I was hoping someone here knows a solution (completely different approaches than regex are also welcome, of course).
Kind regards
Saren
Let's write your first regex without the extra level of C escapes:
CRLocalizedString\(@\"[^)]+\"\)
You don't have to escape a " for a regex, so let's get rid of those extra backslashes:
CRLocalizedString\(@"[^)]+"\)
So, you want to match a quoted string using "[^)]+". But that doesn't match every quoted string.
What is a quoted string? It's a ", followed by any number of string atoms, followed by another ". What is a string atom? It's any character except " or \, or a \ followed by any character. So here's a regex for a quoted string:
"([^"\\]|\\.)*"
Sticking that back into your first regex, we get this:
CRLocalizedString\(@"([^"\\]|\\.)*"\)
Quoting it in an Objective-C string literal gives us this:
@"CRLocalizedString\\(@\"([^\"\\\\]|\\\\.)*\"\\)"
It is impossible to write a regex to match calls to CRLocalizedArgString in the general case, because such calls can take arbitrary expressions as arguments, and regexes cannot match arbitrary expressions (because they can contain arbitrary levels of nested parentheses, which regexes cannot match).
You could just hope that there are no parentheses in the argument list, and use this regex:
CRLocalizedArgString\(@"([^"\\]|\\.)*"[^)]*\)
Quoting it in an Objective-C string literal gives us this:
@"CRLocalizedArgString\\(@\"([^\"\\\\]|\\\\.)*\"[^)]*\\)"
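For completeness, here is a rough sketch of how the quoted-string regex could be used with NSRegularExpression to pull out the key itself. The extra capturing group around the string contents and the sample source line are additions for illustration; the rest of the pattern is the CRLocalizedString regex from above.

```objc
#import <Foundation/Foundation.h>

int main(void) {
    @autoreleasepool {
        // Hypothetical line of source code to scan.
        NSString *source = @"x = CRLocalizedString(@\"Happy =), o so happy =D\");";

        // Same regex as above, with group 1 wrapped around the string contents.
        NSString *pattern = @"CRLocalizedString\\(@\"((?:[^\"\\\\]|\\\\.)*)\"\\)";

        NSError *error = nil;
        NSRegularExpression *regex =
            [NSRegularExpression regularExpressionWithPattern:pattern
                                                      options:0
                                                        error:&error];
        if (regex == nil) {
            NSLog(@"Invalid pattern: %@", error);
            return 1;
        }

        [regex enumerateMatchesInString:source
                                options:0
                                  range:NSMakeRange(0, source.length)
                             usingBlock:^(NSTextCheckingResult *match,
                                          NSMatchingFlags flags,
                                          BOOL *stop) {
            // Group 1 holds the text between the quotes, ")" included.
            NSString *key = [source substringWithRange:[match rangeAtIndex:1]];
            NSLog(@"%@", key);
        }];
    }
    return 0;
}
```

Because the match only ends at an unescaped closing quote followed by ), a ) inside the string no longer cuts the key short.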

Format specifies type 'unsigned short' but the argument has type 'int'

I have a method that scans a string, converting any new lines to <br> for HTML. The line in question is:
NSCharacterSet *newLineCharacters = [NSCharacterSet characterSetWithCharactersInString:
[NSString stringWithFormat:@"\n\r%C%C%C%C", 0x0085, 0x000C, 0x2028, 0x2029]];
Xcode gives me a warning here:
Format specifies type 'unsigned short' but the argument has type 'int'
It's recommending that I change all %C to %d, however that turns out to break the function. What is the correct way to do this, and why is Xcode recommending the wrong thing?
One option is to cast your arguments to unsigned short: i.e. (unsigned short)0x0085 etc
But if you're looking for newlines, you should just use the newline character set. This is "Unicode compliant": [NSCharacterSet newlineCharacterSet]
edit
Revisiting this question: If you are trying to separate an NSString/CFString by line breaks, you should probably use -[NSString getLineStart:end:contentsEnd:forRange:].
Using the canned character sets as @nielsbot suggests is definitely the way to go when there's one that matches your app's needs.
But as far as writing a string literal with Unicode codepoints in it, you don't need -stringWithFormat:, you can just use Unicode escapes:
NSCharacterSet *aCharSet = [NSCharacterSet
characterSetWithCharactersInString:@"\u2704\u2710\u2764"];
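Tying this back to the original goal of converting newlines to <br>, here is a minimal sketch that uses the canned newlineCharacterSet instead of a hand-built set. The sample text is made up, and splitting then rejoining is just one possible approach.

```objc
#import <Foundation/Foundation.h>

int main(void) {
    @autoreleasepool {
        // Hypothetical input mixing "\n" with the Unicode line separator U+2028.
        NSString *text = @"first\nsecond\u2028third";

        // Split on every character in the newline set, then rejoin with <br>.
        NSArray<NSString *> *lines =
            [text componentsSeparatedByCharactersInSet:
                [NSCharacterSet newlineCharacterSet]];
        NSString *html = [lines componentsJoinedByString:@"<br>"];
        NSLog(@"%@", html);
    }
    return 0;
}
```

One caveat with this approach: a "\r\n" pair is two characters from the set, so it produces an empty component and therefore two <br> tags. If that matters, the line-range method mentioned above treats "\r\n" as a single line ending.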

Objective C, How to delete string character set?

I download strings from a web server and they contain special characters such as /n, /p and so on. What is the best way to get rid of these?
Do you have a list of characters that need stripping?
Or you could use
[string stringByTrimmingCharactersInSet:[NSCharacterSet newlineCharacterSet]];
I'd have thought your best bet would be to use one of the NSString methods such as stringByReplacingCharactersInRange:withString: (replacing the characters in question with an empty string) or stringByTrimmingCharactersInSet:.
You can use
Str=[Str stringByReplacingOccurrencesOfString:@"/n" withString:@""];
If you have several special characters to remove, put them in an array and apply the same call to each one; that will also work.
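The array idea could look roughly like this. The list of sequences and the sample string are assumptions; the question only names /n and /p.

```objc
#import <Foundation/Foundation.h>

int main(void) {
    @autoreleasepool {
        NSString *str = @"line one/nline two/p";  // hypothetical downloaded text

        // Hypothetical list of literal sequences to strip out.
        NSArray<NSString *> *unwanted = @[@"/n", @"/p"];
        for (NSString *token in unwanted) {
            str = [str stringByReplacingOccurrencesOfString:token withString:@""];
        }
        NSLog(@"%@", str);
    }
    return 0;
}
```

Unlike stringByTrimmingCharactersInSet:, which only touches the two ends of the string, this removes the sequences wherever they occur.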