NSRegularExpression Xcode warning: Unknown escape sequence '\]' - objective-c

I've received a Warning in Xcode: Unknown escape sequence '\]'
Code in Question: _regexForFindingTags = [[NSRegularExpression alloc] initWithPattern:@"\[.*?\]" options:ops error:&error];
The Problematic Search Pattern: \[.*?\]
Why is there a Warning for this Specific Search Pattern?
How can this Warning be Overcome?
My Search Pattern works in Regex Tester (granted that's in Javascript). According to Ray Wenderlich's NSRegularExpression Tutorial the ] character should be escapable using the \ character, So I'm missing something...

The warning comes from the compiler, which is parsing the string literal, not from the regex engine. String literals have their own escape sequences, and neither \[ nor \] is one of them, so @"\[" is already invalid at the string-literal level, before the regex engine ever sees it (it is just a string, after all). So, if the intended regex is \[.*?\], it must be written as:
[… initWithPattern:@"\\[.*?\\]" …];
That is, you escape the brackets at the regex level and then also escape the backslashes at the string-literal level, so @"\\[.*?\\]" becomes \[.*?\] in memory.
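To make the two layers concrete, here is a minimal sketch (GRFindTags is a hypothetical helper name, not something from the question) that builds the pattern with doubled backslashes, logs what the regex engine actually receives, and collects the matches:

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: returns every [tag] found in `text`.
static NSArray<NSString *> *GRFindTags(NSString *text) {
    // Source literal "\\[.*?\\]" -> string in memory: \[.*?\] (7 characters).
    NSString *pattern = @"\\[.*?\\]";
    NSLog(@"Pattern as the regex engine sees it: %@ (%lu characters)",
          pattern, (unsigned long)pattern.length);

    NSError *error = nil;
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:pattern
                                                  options:0
                                                    error:&error];
    if (regex == nil) {
        NSLog(@"Invalid pattern: %@", error);
        return @[];
    }

    NSMutableArray<NSString *> *tags = [NSMutableArray array];
    NSArray<NSTextCheckingResult *> *results =
        [regex matchesInString:text options:0 range:NSMakeRange(0, text.length)];
    for (NSTextCheckingResult *result in results) {
        [tags addObject:[text substringWithRange:result.range]];
    }
    return tags;
}
```

Calling GRFindTags(@"a [b] c [d]") should give the two bracketed tags, since the lazy .*? stops at the first closing bracket.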

Unfortunately, you need to escape the \ characters themselves.
So each one needs to be written as \\ in NSString literals.

Related

Cocoa Error 2048 on NSRegularExpression

I'm attempting to use NSRegularExpression to search for a string inside a pbxproj (Inside the .xcodeproj folder).
I'm searching for the compiler flags in the "Begin PBXBuildFile section" area
NSString* findFlagsRegex = @"([A-Z0-9]{24}\\s\\/\\*\\s[A-Za-z\\.\\s0-9]+\\*\\/\\s=\\s{isa\\s\\=\\s[A-Za-z]*;\\s?fileRef\\s\\=\\s[A-Z0-9]*\\s\\/\\*\\s[A-Za-z0-9\\s\\.]*\\*\\/;\\ssettings\\s\\=\\s{[A-Za-z0-9_\\s\\=\"-]*;\\s\\};\\s};)";
NSRegularExpression* expression3 = [NSRegularExpression regularExpressionWithPattern:findFlagsRegex options:kNilOptions error:&err];
NSLog(@"Error: %@",[err description]);
Error Domain=NSCocoaErrorDomain Code=2048 "The value “([A-Z0-9]{24}\s\/\*\s[A-Za-z\.\s0-9]+\*\/\s=\s{isa\s\=\s[A-Za-z]*;\s?fileRef\s\=\s[A-Z0-9]*\s\/\*\s[A-Za-z0-9\s\.]*\*\/;\ssettings\s\=\s{[A-Za-z0-9_\s\="-]*;\s\};\s};)” is invalid." UserInfo=0x61800026a7c0 {NSInvalidValue=([A-Z0-9]{24}\s\/\*\s[A-Za-z\.\s0-9]+\*\/\s=\s{isa\s\=\s[A-Za-z]*;\s?fileRef\s\=\s[A-Z0-9]*\s\/\*\s[A-Za-z0-9\s\.]*\*\/;\ssettings\s\=\s{[A-Za-z0-9_\s\="-]*;\s\};\s};)}
I copy:
([A-Z0-9]{24}\s\/\*\s[A-Za-z\.\s0-9]+\*\/\s=\s{isa\s\=\s[A-Za-z]*;\s?fileRef\s\=\s[A-Z0-9]*\s\/\*\s[A-Za-z0-9\s\.]*\*\/;\ssettings\s\=\s{[A-Za-z0-9_\s\="-]*;\s\};\s};)
The regular expression above works in RegexPal, directly copying it from the invalid value from the error message on the same test data... so I'm not sure what's wrong :/
Not sure if this will add anything, but this is a Mac App and not an iOS app.
Your pattern contains a lone literal }. I believe you meant to have two literal {s and two literal }s - this is a slightly modified version of the pattern you had in your question, with three more \\s inserted to escape the curly braces that are currently not escaped in your code.
NSString* findFlagsRegex = @"([A-Z0-9]{24}\\s\\/\\*\\s[A-Za-z\\.\\s0-9]+\\*\\/\\s=\\s\\{isa\\s\\=\\s[A-Za-z]*;\\s?fileRef\\s\\=\\s[A-Z0-9]*\\s\\/\\*\\s[A-Za-z0-9\\s\\.]*\\*\\/;\\ssettings\\s\\=\\s\\{[A-Za-z0-9_\\s\\=\"-]*;\\s\\};\\s\\};)";
I'm not sure whether the bug is with RegexPal, or if RegexPal depends on the copy of JS that your browser uses, or if the bug is with NSRegularExpressions, but either way, escaping a character which doesn't need to be escaped shouldn't cause any issues (or at least it's not my understanding of regular expressions that it should.)
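One way to narrow this kind of failure down is to compile small pieces of the pattern separately and see which one NSRegularExpression rejects. The sketch below assumes a hypothetical helper, GRPatternCompiles, and uses shortened fragments rather than the full pattern:

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: YES if NSRegularExpression accepts `pattern`,
// otherwise logs the error and returns NO.
static BOOL GRPatternCompiles(NSString *pattern) {
    NSError *error = nil;
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:pattern
                                                  options:0
                                                    error:&error];
    if (regex == nil) {
        NSLog(@"Rejected \"%@\": %@", pattern, error.localizedDescription);
        return NO;
    }
    return YES;
}
```

For instance, comparing GRPatternCompiles(@"=\\s{isa") with GRPatternCompiles(@"=\\s\\{isa") isolates the brace. Going by the error in the question, the unescaped form is the one expected to be rejected, while the escaped form compiles.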

Regex (searching for function(@"string content") to get "string content"

I have a little regex problem (don't we all sometimes).
The few pieces of code are from Objective C but regex expressions are still the same I believe.
I have two functions called
NSString * CRLocalizedString(NSString *key)
NSString * CRLocalizedArgString(NSString *key, ...)
These are scattered around my project for localisation.
Now I want to find them all.
Well go to directory, parse all files, etc
All fine there.
The regexes I use on the files are
[NSRegularExpression regularExpressionWithPattern:@"CRLocalizedString\\(@\\\"[^)]+\\\"\\)" options:0 error:&error];
[NSRegularExpression regularExpressionWithPattern:@"CRLocalizedArgString\\([^)]+\\)" options:0 error:&error];
And this works perfectly, except that my terminating character is a ).
The problem occurs with function calls like this
CRLocalizedString(@"Happy =), o so happy =D");
CRLocalizedArgString(@"Filter (%i)", 0.75f);
The regex ends the string at "Filter (%i" and at "Happy =)".
And this is where my regex knowledge ends and I do not know what to do anymore.
I thought using ");" as an end but this isn't always the case.
So I was hoping someone here knew a solution (completely different approaches than regex are also welcome, of course)
Kind regards
Saren
Let's write your first regex without the extra level of C escapes:
CRLocalizedString\(@\"[^)]+\"\)
You don't have to escape a " for a regex, so let's get rid of those extra backslashes:
CRLocalizedString\(@"[^)]+"\)
So, you want to match a quoted string using "[^)]+". But that doesn't match every quoted string.
What is a quoted string? It's a ", followed by any number of string atoms, followed by another ". What is a string atom? It's any character except " or \, or a \ followed by any character. So here's a regex for a quoted string:
"([^"\\]|\\.)*"
Sticking that back into your first regex, we get this:
CRLocalizedString\(@"([^"\\]|\\.)*"\)
Quoting it in an Objective-C string literal gives us this:
@"CRLocalizedString\\(@\"([^\"\\\\]|\\\\.)*\"\\)"
It is impossible to write a regex to match calls to CRLocalizedArgString in the general case, because such calls can take arbitrary expressions as arguments, and regexes cannot match arbitrary expressions (because they can contain arbitrary levels of nested parentheses, which regexes cannot match).
You could just hope that there are no parentheses in the argument list, and use this regex:
CRLocalizedArgString\(@"([^"\\]|\\.)*"[^)]*\)
Quoting it in an Objective-C string literal gives us this:
@"CRLocalizedArgString\\(@\"([^\"\\\\]|\\\\.)*\"[^)]*\\)"
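As a rough check of the quoted-string approach, here is a small sketch that runs the first pattern against a line of source text. GRFirstLocalizedCall is a hypothetical helper name, and the sample inputs are made up for illustration:

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: returns the first CRLocalizedString(@"...") call
// found in `source`, or nil if there is none.
static NSString *GRFirstLocalizedCall(NSString *source) {
    // Regex without the C escapes: CRLocalizedString\(@"([^"\\]|\\.)*"\)
    NSString *pattern = @"CRLocalizedString\\(@\"([^\"\\\\]|\\\\.)*\"\\)";

    NSError *error = nil;
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:pattern
                                                  options:0
                                                    error:&error];
    if (regex == nil) {
        NSLog(@"Invalid pattern: %@", error);
        return nil;
    }

    NSRange range = [regex rangeOfFirstMatchInString:source
                                             options:0
                                               range:NSMakeRange(0, source.length)];
    if (range.location == NSNotFound) {
        return nil;
    }
    return [source substringWithRange:range];
}
```

Because the string body is matched as a sequence of string atoms rather than as "anything but )", a ) inside the quotes no longer ends the match, and an escaped quote such as \" is skipped over as a single atom.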

OS X Using literal asterisk in regular expression

I'm writing a program to make text that begins with /* and ends with */ a different color (syntax highlighting for a C comment). When I try this
@"/\*.*\*/";
I get unknown escape sequence. So I figured that to get a literal asterisk I had to use this
@"/[*].*[*]/";
and I get no errors, but when I use this code
commentPattern = @"/[*].*[*]/";
reg = [NSRegularExpression regularExpressionWithPattern:commentPattern options:kNilOptions error:nil];
results = [reg matchesInString:self.string options:kNilOptions range:NSMakeRange(0, [self.string length])];
for (NSTextCheckingResult *result in results)
{
    [self setTextColor:[NSColor colorWithCalibratedRed:0.0 green:0.7 blue:0.0 alpha:1.0] range:result.range];
}
the text color of the comments doesn't change, but I don't see anything wrong with my regular expression. Can someone tell me why this won't work? I don't think it's a problem with the way I get the results or change their color, because I use the same method for other regular expressions.
You want to use this: "\\*".
\* is the escape sequence for * in regular expressions, but in C strings, \ also begins an escaped character token, so you have to escape that as well.
@"/\*.*\*/";
I get unknown escape sequence.
The compiler first converts the escape sequences in a string literal, and only the result is handed over to the regex engine. For instance, \t is an escape sequence that represents a tab, and \n one that represents a newline. The warning is saying that \* is not a legal escape sequence in an NSString literal.
The regex engine needs to see a literal backslash followed by a *. To get a literal backslash in a string you need to write \\. However, for readability I prefer using a character class like you did in your second attempt.
You should NSLog what the results array contains to see what matches you are getting. If the matches are what you expect, then the problem is not with the regex.
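A minimal sketch of that kind of logging, assuming a hypothetical helper called GRLoggedMatches, might look like this. It also makes it easy to confirm that the backslash spelling and the character-class spelling match the same text:

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: logs and returns the substrings of `text`
// matched by `pattern`.
static NSArray<NSString *> *GRLoggedMatches(NSString *pattern, NSString *text) {
    NSError *error = nil;
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:pattern
                                                  options:0
                                                    error:&error];
    if (regex == nil) {
        NSLog(@"Invalid pattern: %@", error);
        return @[];
    }

    NSMutableArray<NSString *> *matched = [NSMutableArray array];
    NSArray<NSTextCheckingResult *> *results =
        [regex matchesInString:text options:0 range:NSMakeRange(0, text.length)];
    for (NSTextCheckingResult *result in results) {
        NSString *piece = [text substringWithRange:result.range];
        NSLog(@"Match at %@: %@", NSStringFromRange(result.range), piece);
        [matched addObject:piece];
    }
    return matched;
}
```

Both GRLoggedMatches(@"/\\*.*\\*/", text) and GRLoggedMatches(@"/[*].*[*]/", text) should report the same ranges for a single-line comment. If nothing is logged for comments that span several lines, one thing worth checking is that . does not match line separators unless NSRegularExpressionDotMatchesLineSeparators is passed as an option.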

NSRegularExpression to add escape characters

I'm trying to replace [word] with \[word\] using NSRegularExpression:
NSRegularExpression *metaRegex = [NSRegularExpression regularExpressionWithPattern:@"([\\[\\]])"
                                                                           options:0
                                                                             error:&metaRegexError];
NSString *escapedTarget = [metaRegex stringByReplacingMatchesInString:string
                                                              options:0
                                                                range:NSMakeRange(0, string.length)
                                                         withTemplate:@"\\$1"];
But the output of this is $1word$1. You would think the first \ would escape the second \ character but instead it looks like it's escaping the $ character... How do I tell it to escape \ and not $?
Try:
@"\\\\$1"
for the replacement template. Basically: \\ escapes the \ at the string-literal level, so @"\\$1" is just \$1 by the time it reaches the regex engine. That remaining \ then escapes the $ in the template, which causes your issue.
You actually need four backslashes, like this:
@"\\\\$1"
Why is this unwieldy system required? Well, think of it this way. The \ character is both the C escape character and the regex escape character. So, if you write the template with only one backslash, you get a compiler warning, because the compiler thinks you're using an escape sequence \$, which doesn't exist. To get a backslash into the string, you need to write two backslashes in the literal, which evaluate to only one in the final NSString data.
However, you really need two backslashes in the NSString itself to be sent to the regex parser, so you need to escape both backslashes in the string literal. So, \\\\ resolves to \\ in the actual data, which the regex parser then collapses to a single literal \ character.
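Putting the four-backslash template together with the pattern from the question gives something like the following sketch (GREscapeBrackets is a hypothetical helper name):

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: puts a backslash in front of every [ and ] in `input`.
static NSString *GREscapeBrackets(NSString *input) {
    NSError *error = nil;
    // Literal "([\\[\\]])" -> pattern ([\[\]]): capture a single [ or ].
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:@"([\\[\\]])"
                                                  options:0
                                                    error:&error];
    if (regex == nil) {
        NSLog(@"Invalid pattern: %@", error);
        return input;
    }
    // Literal "\\\\$1" -> template \\$1: a literal backslash, then group 1.
    return [regex stringByReplacingMatchesInString:input
                                           options:0
                                             range:NSMakeRange(0, input.length)
                                      withTemplate:@"\\\\$1"];
}
```

With this, GREscapeBrackets(@"[word]") returns the string \[word\], which would be written as @"\\[word\\]" in source.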

RegexKitLite Not Matching NSString Correctly

Alright, I'm trying to write some code that removes words that contain an apostrophe from an NSString. To do this, I've decided to use regular expressions, and I wrote one, that I tested using this website: http://rubular.com/r/YTV90BcgoQ
Here, the expression is: \S*'+\S
As shown on the website, the words containing an apostrophe are matched. But for some reason, in the application I'm writing, using this code:
sourceString = [sourceString stringByReplacingOccurrencesOfRegex:@"\S*'+\S" withString:@""];
Doesn't return any positive result. By NSLogging the 'sourceString', I notice that words like 'Don't' and 'Doesn't' are still present in the output.
It doesn't seem like my expression is the problem, but maybe RegexKitLite doesn't accept certain types of expressions? If someone knows what's going on here, please enlighten me !
Literal NSStrings use \ as an escape character so that you can put things like newlines \n into them. Regexes also use backslashes as an escape character for character classes like \S. When your literal string gets run through the compiler, the backslashes are treated as escape characters, and don't make it to the regex pattern.
Therefore, you need to escape the backslashes themselves in your literal NSString, in order to end up with backslashes in the string that is used as the pattern: @"\\S*'+\\S".
You should have seen a compiler warning about "Unknown escape sequence" -- don't ignore those warnings!
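For comparison, here is a sketch of the same replacement with the doubled backslashes. Note that it swaps in Foundation's NSRegularExpression rather than RegexKitLite, so that it only depends on Foundation, and GRRemoveApostropheMatches is a hypothetical helper name:

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: deletes whatever the question's pattern matches.
// Uses NSRegularExpression in place of RegexKitLite (an assumption made
// here so the sketch has no third-party dependency).
static NSString *GRRemoveApostropheMatches(NSString *input) {
    NSError *error = nil;
    // Literal "\\S*'+\\S" -> pattern \S*'+\S as the regex engine sees it.
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:@"\\S*'+\\S"
                                                  options:0
                                                    error:&error];
    if (regex == nil) {
        NSLog(@"Invalid pattern: %@", error);
        return input;
    }
    return [regex stringByReplacingMatchesInString:input
                                           options:0
                                             range:NSMakeRange(0, input.length)
                                      withTemplate:@""];
}
```

Passing in @"I Don't know" removes "Don't" and leaves the surrounding spaces in place.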