I want to know the full set of characters that have to be escaped in an Objective-C NSString literal in order to be recognized properly. For example, " has to be escaped as \", as in:
NSString *temporaryString = @"That book is dubbed as \"the little book\".";
Is this set the same as the one for a C char * string?
Thanks for your help :D
The only characters that have to be escaped are the " (double-quote) and \ (backslash) characters.
There are other special character literals such as \n that have special meaning but those are really a separate issue.
Objective-C NSString values use the same set of special character literals as C.
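As a rough illustration of the answer above, here is a minimal sketch in plain C (which also compiles as Objective-C) of a helper that prepares text for pasting between the quotes of a string literal. The function name `escape_for_literal` and its buffer-based interface are assumptions made for this example. It deliberately handles only the two characters the answer names, and leaves things like raw newlines alone.

```c
#include <stddef.h>

/* Hypothetical helper: copies `in` to `out`, putting a backslash in front of
   every double quote and every backslash. `cap` is the size of `out`,
   including room for the terminating NUL. Returns the number of characters
   written (excluding the NUL), or (size_t)-1 if `out` is too small. */
size_t escape_for_literal(const char *in, char *out, size_t cap) {
    size_t used = 0;
    for (; *in != '\0'; ++in) {
        int needs_escape = (*in == '"' || *in == '\\');
        size_t need = needs_escape ? 2 : 1;
        if (used + need + 1 > cap) {
            return (size_t)-1; /* no room for this character plus the NUL */
        }
        if (needs_escape) {
            out[used++] = '\\';
        }
        out[used++] = *in;
    }
    if (used + 1 > cap) {
        return (size_t)-1; /* only reachable for empty input with cap == 0 */
    }
    out[used] = '\0';
    return used;
}
```

Called on the 17-character text the "little" book would be written as in a source file, the helper produces 19 characters, because each of the two quotes gains one backslash.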
Is there a way to have double-quotation marks in strings in Objective C without escaping them?
In PHP you can wrap a string in single quotation marks, in which case you do not have to escape anything in the string.
The only option is to compile your source as an Objective-C++ file (file suffix ".mm"). C++ raw string literals are then also accepted when defining an NSString, for example
NSString *str = @R"(Hello"World\n)";
has the 13 characters
H e l l o " W o r l d \ n
But that feature is only available in (Objective-)C++ source files,
not in (Objective-)C.
Unfortunately, there is no way (that I know of) to have an unescaped quotation mark inside a string in Objective-C. You can get a quotation mark using unicode or some other trick, but I believe that you want a less ugly way to write a quotation inside a string, not an even uglier one :)
P.S. Just for fun I've just tried to use a Unicode escape sequence (@"\u0022"), and it turned out it is forbidden.
Curly quotes don't require escaping, and generally look better for messages presented to the end user:
NSString *str = @"Hello, “World”!";
I'm sorry for being annoying and asking other people to do this for me, but I have been trying for a while now and can't seem to get a working one. This is what it needs to allow:
Lower case letters
Upper case letters
Apostrophes (')
Dashes (-)
The order of these characters doesn't matter; a string should be rejected only if it contains anything other than the characters above. This is for Objective-C, if that affects the regex syntax.
NSString *nameRegEx = @"^[A-Z][a-zA-Z]+$";
NSPredicate *firstTest = [NSPredicate predicateWithFormat:@"SELF MATCHES %@", nameRegEx];
For the "upper and lower-case letters, dash and apostrophe" part of the regex, try:
[a-zA-Z'\\-]
You need to escape the - dash, if you're not going to depend on it being in certain syntactic positions in the [] character-class.
In Java, we'd need to use a double backslash (\\): the compiler consumes a single backslash as the start of a string escape, so it takes two backslashes in the source to get one \ past the compiler and into the regex, where it acts as the escape. It may well be similar for you.
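To make the character class concrete, here is a minimal sketch in plain C using the POSIX `regcomp`/`regexec` API rather than NSPredicate, so the engine differs from the one in the question. The function name `is_valid_name` is an assumption for this example. One difference worth noting: inside a POSIX bracket expression a backslash is an ordinary character, so this sketch relies on the positional rule mentioned above and puts the dash last in the class instead of escaping it.

```c
#include <regex.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: returns true when `name` is non-empty and contains
   only ASCII letters, apostrophes, and dashes, in any order. */
bool is_valid_name(const char *name) {
    regex_t re;
    /* The dash is the last character in the class, so it is taken literally
       and needs no escape. */
    if (regcomp(&re, "^[A-Za-z'-]+$", REG_EXTENDED | REG_NOSUB) != 0) {
        return false; /* treat a pattern that fails to compile as "no match" */
    }
    bool ok = (regexec(&re, name, 0, NULL, 0) == 0);
    regfree(&re);
    return ok;
}
```

With this sketch a name such as O'Neil-Smith is accepted, while Anne3, an empty string, or a name containing a space is rejected.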
Hope this helps.
I am new to programming and Objective-C. I have the string NSString *string = @"\U0420\U043e\U0437\U044b"; and after each backslash ('\') I need to put another backslash, to get a string like this: @"\\U0420\\U043e\\U0437\\U044b"
Please help.
My original answer was:
Use -[NSString stringByReplacingOccurrencesOfString:withString:].
NSString *string = @"\U0420\U043e\U0437\U044b";
NSString *converted = [string stringByReplacingOccurrencesOfString:@"\\"
                                                        withString:@"\\\\"];
However, I now don't think that's right, given that the \ characters won't actually exist in string; instead the compiler will convert each of those sequences into a Unicode character. To use the above code, you would need to write the string like this:
NSString *string = @"\\U0420\\U043e\\U0437\\U044b";
I cannot see any alternative to this.
Further Update: Often when I've come across questions like this there is a confusion between string literals and string data. In your question those \ characters won't appear as the compiler will have converted them into unicode characters (\Uxxx is a unicode escape sequence for a single character). However if you provided a string like that at runtime (say read from a text file) then those \ characters will exist and you can use the code above.
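The difference between a string literal and string data can be checked directly. The sketch below is plain C (also valid Objective-C); the names `count_backslashes`, `kCompiled`, and `kLiteralText` are made up for this example. It uses the four-digit lowercase \u spelling that C accepts, and assumes the compiler's default UTF-8 execution character set.

```c
#include <stddef.h>

/* Hypothetical helper: counts the backslash characters actually stored in `s`. */
size_t count_backslashes(const char *s) {
    size_t n = 0;
    for (; *s != '\0'; ++s) {
        if (*s == '\\') {
            ++n;
        }
    }
    return n;
}

/* The compiler converts each \uXXXX escape into one Cyrillic character,
   so no backslash ends up in the stored string. */
static const char *const kCompiled = "\u0420\u043e\u0437\u044b";

/* With doubled backslashes the stored string contains real backslashes
   followed by the text u0420, u043e, and so on. */
static const char *const kLiteralText = "\\u0420\\u043e\\u0437\\u044b";
```

Here `count_backslashes(kCompiled)` is 0 and `count_backslashes(kLiteralText)` is 4, which is why a search-and-replace on backslashes only finds something in the second form, or in data that arrived at runtime with the backslashes already in it.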
I have an NSString instance (let's call it myString) containing the following UTF-8 Unicode character: \xc2\x96 (that is, the long dash seen in, e.g., MS Word).
When printing the NSString to the console using NSLog and the %@ format specifier, the character is replaced by an upside-down question mark, indicating that something is wrong. When it is used as text in a table cell, the character simply appears as a blank space (not the empty string, a blank space).
To solve this, I would like to replace the \xc2\x96 unicode character with a "normal" dash - at first I thought this should be a 10 sec. task but after some research I have not yet found the "right way" to do this and this is where I would like your help.
What I have tried:
When I print myString in hex like this, NSLog(@"%x", myString), I get the hex value 96 for the Unicode character \xc2\x96.
Using this information I have made the following implementation to replace it with its "normal" dash equivalent:
for(int index = 0; index < [myString length]; index++)
{
    NSLog(@"Hex:'%x' Char:'%c'", [myString characterAtIndex:index], [myString characterAtIndex:index]);
    if([[NSString stringWithFormat:@"%x", [myString characterAtIndex:index]] isEqualToString:@"96"])
        myString = [myString stringByReplacingCharactersInRange:NSMakeRange(index, 1) withString:@"-"];
}
... it works, but my eyes don't like it, and I would like to know if this can be done in a much cleaner and more "right" way, e.g. like C#'s String.Replace(char,char), which supports Unicode characters.
So to wrap up:
I'm looking for the "right way" to replace Unicode characters in a string. I have done some research, but apparently there are only methods available that replace occurrences of a given NSString with another NSString.
I have read the following:
https://stackoverflow.com/a/5223737/700926
https://stackoverflow.com/a/5217703/700926
https://stackoverflow.com/a/714009/700926
https://stackoverflow.com/a/668254/700926
https://stackoverflow.com/a/2039396/700926
... but all of them explain how to replace a given NSString with another NSString and do not cover how specific Unicode characters (in particular double-byte ones) can be replaced.
You can make your string mutable (i.e. use an NSMutableString instead of an NSString). Also, the call to [[NSString stringWithFormat:@"%x", character] isEqualToString:@"96"] is as inefficient as possible; why not simply if (character == 0x96)? All in all, try
NSMutableString *string = [myString mutableCopy];
NSString *longDash = @"\xc2\x96";
[string replaceOccurrencesOfString:longDash withString:@"-" options:0 range:NSMakeRange(0, [string length])];
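For comparison, the same replacement can be sketched at the byte level in plain C, working on a UTF-8 buffer instead of an NSString. This is a different technique from the NSMutableString call above, and the function name `replace_c2_96_with_dash` is an assumption for this example.

```c
#include <stddef.h>

/* Hypothetical helper: rewrites the NUL-terminated UTF-8 buffer `s` in place,
   turning every byte pair 0xC2 0x96 into a single ASCII '-'. The result is
   never longer than the input. Returns the number of replacements made. */
size_t replace_c2_96_with_dash(char *s) {
    const unsigned char *src = (const unsigned char *)s;
    char *dst = s;
    size_t count = 0;
    while (*src != '\0') {
        /* src[1] is safe to read: src[0] is not the terminator here. */
        if (src[0] == 0xC2 && src[1] == 0x96) {
            *dst++ = '-';
            src += 2;
            ++count;
        } else {
            *dst++ = (char)*src++;
        }
    }
    *dst = '\0';
    return count;
}
```

One thing to watch when writing test data: a hex escape swallows every hex digit that follows it, so a literal such as "10\xc2\x96" "20" has to be split into two adjacent pieces to keep the 2 out of the escape.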
Alright, I'm trying to write some code that removes words that contain an apostrophe from an NSString. To do this, I've decided to use regular expressions, and I wrote one, that I tested using this website: http://rubular.com/r/YTV90BcgoQ
Here, the expression is: \S*'+\S
As shown on the website, the words containing an apostrophe are matched. But for some reason, in the application I'm writing, using this code:
sourceString = [sourceString stringByReplacingOccurrencesOfRegex:@"\S*'+\S" withString:@""];
Doesn't return any positive result. By NSLogging the 'sourceString', I notice that words like 'Don't' and 'Doesn't' are still present in the output.
It doesn't seem like my expression is the problem, but maybe RegexKitLite doesn't accept certain types of expressions? If someone knows what's going on here, please enlighten me !
Literal NSStrings use \ as an escape character so that you can put things like newlines \n into them. Regexes also use backslashes as an escape character for character classes like \S. When your literal string gets run through the compiler, the backslashes are treated as escape characters, and don't make it to the regex pattern.
Therefore, you need to escape the backslashes themselves in your literal NSString, in order to end up with backslashes in the string that is used as the pattern: @"\\S*'+\\S".
You should have seen a compiler warning about "Unknown escape sequence" -- don't ignore those warnings!