Why is my NSRegularExpression pattern not working? - objective-c

I have the following string:
NSString *string = @"she seemed \x3cem\x3ereluctant\x3c/em\x3e to discuss the matter";
I want the final string to be: "she seemed reluctant to discuss the matter"
I have the following pattern:
NSRegularExpression *regex = [NSRegularExpression regularExpressionWithPattern:@"/\\x[0-9a-f]{2}/"
    options:NSRegularExpressionCaseInsensitive
    error:&error];
NSArray *matches = [regex matchesInString:string options:0 range:NSMakeRange(0, [string length])];
for (NSTextCheckingResult *match in matches) {
    NSRange matchRange = [match range];
    NSLog(@"%@", NSStringFromRange(matchRange));
}
However, I get an error saying the pattern is invalid. What am I doing wrong?

The pattern you need is:
@"\\\\x[0-9a-f]{2}"
The backslash is special to both the Objective-C compiler and the regex parser. The string literal therefore needs four backslashes: the compiler turns them into two, and the regex parser reads those two as one literal backslash.
Also, NSRegularExpression patterns take no opening/closing / delimiters. You're thinking of another programming language there!
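As a minimal sketch of the escaping levels described above, the helper below removes whole escaped tags (the text from \x3c through \x3e) rather than single escapes. It assumes the input really contains literal backslash sequences; the function name StripEscapedTags and the choice to delete whole tags are illustrative assumptions, not part of the original answer.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: removes spans such as \x3cem\x3e (an escaped "<em>")
// from text that contains literal backslash sequences.
static NSString *StripEscapedTags(NSString *input) {
    NSError *error = nil;
    // Literal @"\\\\x3c.*?\\\\x3e" reaches the regex parser as \\x3c.*?\\x3e,
    // which matches the text \x3c, then as little as possible, then \x3e.
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:@"\\\\x3c.*?\\\\x3e"
                                                  options:NSRegularExpressionCaseInsensitive
                                                    error:&error];
    if (regex == nil) {
        NSLog(@"Invalid pattern: %@", error);
        return input;
    }
    return [regex stringByReplacingMatchesInString:input
                                           options:0
                                             range:NSMakeRange(0, input.length)
                                      withTemplate:@""];
}
```

Called with @"she seemed \\x3cem\\x3ereluctant\\x3c/em\\x3e to discuss the matter", this should return the sentence without the escaped tags.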

You can save yourself some regex troubles by using the NSString method
stringByReplacingOccurrencesOfString:withString:
Or
stringByReplacingOccurrencesOfString:withString:options:range:

Related

Regular expression substitution problem in Objective-C

Trying to capitalize all tags and running into trouble with substitution. Any idea why "upperCaseString" method isn't working?
NSError *error = nil;
NSMutableString *stringToCap = [NSMutableString stringWithString:@"<kaboom>stuff</kaboom>"];
NSRegularExpression *regex = [NSRegularExpression regularExpressionWithPattern:@"(</?[a-zA-Z].*?>)" options:NSRegularExpressionCaseInsensitive error:&error];
NSMutableString *modifiedString = [NSMutableString stringWithString:[regex stringByReplacingMatchesInString:stringToCap options:0 range:NSMakeRange(0, [stringToCap length]) withTemplate:@"$1".uppercaseString]];
NSLog(@"%@", modifiedString);
Produces: <kaboom>stuff</kaboom> when I expect <KABOOM>stuff</KABOOM>
stringByReplacingMatchesInString:options:range:withTemplate: doesn't work like that. The last argument is a plain NSString, and the string you are passing is the result of the expression @"$1".uppercaseString, which is just @"$1" because uppercasing "$1" leaves it unchanged. The template is only expanded afterwards, so every match is replaced by itself.
A possible algorithm (pseudo code):
for NSTextCheckingResult *match in [regex matchesInString:... options:... range:...] do
    extract the substring at match.range from the modified string
    uppercase it
    replace the substring at match.range with the uppercased result
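The pseudo code above could be sketched in Objective-C roughly as follows. The function name UppercaseTags is an assumption for illustration, and the matches are walked in reverse (a choice made here, not part of the original answer) so that earlier ranges stay valid even if uppercasing ever changes a substring's length.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: uppercases every tag matched by the question's pattern.
static NSString *UppercaseTags(NSString *input) {
    NSError *error = nil;
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:@"</?[a-zA-Z].*?>"
                                                  options:0
                                                    error:&error];
    if (regex == nil) {
        NSLog(@"Invalid pattern: %@", error);
        return input;
    }
    NSMutableString *result = [input mutableCopy];
    NSArray<NSTextCheckingResult *> *matches =
        [regex matchesInString:input options:0 range:NSMakeRange(0, input.length)];
    // Replace from the last match to the first so untouched ranges keep their offsets.
    for (NSTextCheckingResult *match in [matches reverseObjectEnumerator]) {
        NSString *tag = [input substringWithRange:match.range];
        [result replaceCharactersInRange:match.range withString:tag.uppercaseString];
    }
    return result;
}
```

Note that this uppercases everything between the angle brackets, so attributes inside a tag would be uppercased as well.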

How to Get Percentage From a NSString - Objective C

I would like to get a substring for a NSString that contains a percentage value.
For example:
1. Get 10% off with this item.
2. 55% off when you purchase this.
The function should return 10% and 55% respectively.
In Java I use the regex \\d+%
I don't know how to do the same in Objective-C.
I have searched for it, but I am a bit lost.
You should be able to use NSRegularExpression to run the same regex that you use in Java. There is a good tutorial for NSRegularExpression here:
https://www.raywenderlich.com/30288/nsregularexpression-tutorial-and-cheat-sheet
I was able to accomplish it with this code:
NSString *string = @"10% off with this item";
NSError *error = nil;
NSRegularExpression *regex = [NSRegularExpression regularExpressionWithPattern:@"\\d+%" options:0 error:&error];
NSTextCheckingResult *result = [regex firstMatchInString:string options:0 range:NSMakeRange(0, [string length])];
NSString *substring = [string substringWithRange:result.range];
NSLog(@"%@", substring); // 10%
The key is the NSTextCheckingResult. It contains the NSRange of the match in the original string, so you can grab the matching substring.
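If a string may contain more than one percentage, the same idea extends to matchesInString:options:range:. The following is a minimal sketch; the function name PercentagesInString is an assumption for illustration.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: returns every "digits followed by %" substring, in order.
static NSArray<NSString *> *PercentagesInString(NSString *text) {
    NSError *error = nil;
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:@"\\d+%"
                                                  options:0
                                                    error:&error];
    NSMutableArray<NSString *> *found = [NSMutableArray array];
    if (regex == nil) {
        NSLog(@"Invalid pattern: %@", error);
        return found;
    }
    NSArray<NSTextCheckingResult *> *matches =
        [regex matchesInString:text options:0 range:NSMakeRange(0, text.length)];
    for (NSTextCheckingResult *match in matches) {
        // Each result carries the range of one match in the original string.
        [found addObject:[text substringWithRange:match.range]];
    }
    return found;
}
```

An empty array means no percentage was found, which avoids having to check a single result for nil.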

NSRegularExpression get only the regex

I have a problem and I don't understand how to solve it (after 6 hours of googling).
I have a string named "filename" that contains this text: "Aachen-Merzbrück EDKA\r\r\nVerkehr"
I want to use a regex to get only this part, "Aachen-Merzbrück EDKA", but I can't.
Here is my code:
NSString *expression = @"\\w+\\s[A-Z]{4}";
NSError *error = NULL;
NSRegularExpression *regex = [NSRegularExpression regularExpressionWithPattern:expression options:NSRegularExpressionCaseInsensitive error:&error];
NSString *noAirportString = [regex stringByReplacingMatchesInString:filename options:0 range:NSMakeRange(0, [filename length]) withTemplate:@""];
EDIT:
This one works well:
\S+\s+[A-Z]{4}
But now, how do I get only "Aachen-Merzbrück EDKA" from "Aachen-Merzbrück EDKA\r\r\nVerkehr"?
My regex with NSRegularExpression gives me back the same string.
A couple of issues in your question:
No need to match city name characters: there are always weird ones around (hyphens, apostrophes, etc.). You can just match the first "line" in your text, with a test for the ICAO code as an extra safeguard.
By using stringByReplacingMatchesInString: you actually remove the airport name (and ICAO code) that you want to keep.
stringByReplacingMatchesInString: is a hacky shortcut (it deletes things, so you need to make your regexes "negative") that sometimes works (I use it myself) but risks confusing things, and future readers.
Having said that, a few changes will fix it:
NSString *filename = @"Aachen-Merzbrück EDKA\r\r\nVerkehr";
// Match anything from the beginning of the line up to a space and 4 upper case letters.
NSString *expression = @"^.+\\s[A-Z]{4}$";
NSError *error = NULL;
// Make sure ^ and $ match at line boundaries,
// and keep the match case sensitive (the default) to explicitly
// match the 4 upper case characters of the ICAO code.
NSRegularExpression *regex = [NSRegularExpression regularExpressionWithPattern:expression options:NSRegularExpressionAnchorsMatchLines error:&error];
NSArray *matches = [regex matchesInString:filename
    options:0
    range:NSMakeRange(0, [filename length])];
// Check that there _is_ a match before you continue
if (matches.count == 0) {
    // Error: handle the no-match case and stop here
}
NSRange airportNameRange = [[matches objectAtIndex:0] range];
NSString *airportString = [filename substringWithRange:airportNameRange];
Thanks, it works well, but I use this one, which works better in my case:
NSString *expression = @"\\S+\\s+[A-Z]{4}";

NSRegularExpression to match and replace all occurencies (porting from Ruby lang)

I am having trouble porting this Ruby code to Objective-C.
Ruby:
clean_url = original_url.gsub(/\\u0026[^&]*/, "")
Execution:
original_url = http://r6---sn-hvaquxaxjvh-3p8l.c.youtube.com/videoplayback?upn=StTvWU7n7N0&ms=au&expire=1368735912&id=e934f5f5c0743533&fexp=919374,909926,916713,916611,901474,924605,901208,929123,929915,929906,925714,929119,931202,928017,912518,911416,906906,904476,930807,919373,906836,933701,900345,926403,912711,929606,910075&sparams=cp,id,ip,ipbits,itag,ratebypass,source,upn,expire&sver=3&cp=U0hVTVdOU19GTENONV9PSFdKOnZFc0Uyc21YTVQw&ratebypass=yes&mv=m&source=youtube&itag=43&newshard=yes&mt=1368711866&ipbits=8&ip=92.114.198.83&key=yt1\u0026quality=medium\u0026type=video/webm&signature=AB8A6D618BDC38AF9D2E81916B863B724D2F12B6.8876CF4E106820B6443B4B06055BF90FD74B5794\u0026fallback_host=tc.v19.cache7.c.youtube.com,url=http://r6---sn-hvaquxaxjvh-3p8l.c.youtube.com/videoplayback?upn=StTvWU7n7N0
clean_url = http://r6---sn-hvaquxaxjvh-3p8l.c.youtube.com/videoplayback?upn=StTvWU7n7N0&ms=au&expire=1368735912&id=e934f5f5c0743533&fexp=919374,909926,916713,916611,901474,924605,901208,929123,929915,929906,925714,929119,931202,928017,912518,911416,906906,904476,930807,919373,906836,933701,900345,926403,912711,929606,910075&sparams=cp,id,ip,ipbits,itag,ratebypass,source,upn,expire&sver=3&cp=U0hVTVdOU19GTENONV9PSFdKOnZFc0Uyc21YTVQw&ratebypass=yes&mv=m&source=youtube&itag=43&newshard=yes&mt=1368711866&ipbits=8&ip=92.114.198.83&key=yt1&signature=AB8A6D618BDC38AF9D2E81916B863B724D2F12B6.8876CF4E106820B6443B4B06055BF90FD74B5794
Ruby code works as expected.
ObjC code:
NSError *error;
NSRegularExpression *regex = [NSRegularExpression regularExpressionWithPattern:@"\\u0026[^&]*" options:NSRegularExpressionCaseInsensitive error:&error];
NSString *originalUrl = @"http://r6---sn-hvaquxaxjvh-3p8l.c.youtube.com/videoplayback?upn=StTvWU7n7N0&ms=au&expire=1368735912&id=e934f5f5c0743533&fexp=919374,909926,916713,916611,901474,924605,901208,929123,929915,929906,925714,929119,931202,928017,912518,911416,906906,904476,930807,919373,906836,933701,900345,926403,912711,929606,910075&sparams=cp,id,ip,ipbits,itag,ratebypass,source,upn,expire&sver=3&cp=U0hVTVdOU19GTENONV9PSFdKOnZFc0Uyc21YTVQw&ratebypass=yes&mv=m&source=youtube&itag=43&newshard=yes&mt=1368711866&ipbits=8&ip=92.114.198.83&key=yt1\\u0026quality=medium\\u0026type=video/webm&signature=AB8A6D618BDC38AF9D2E81916B863B724D2F12B6.8876CF4E106820B6443B4B06055BF90FD74B5794\\u0026fallback_host=tc.v19.cache7.c.youtube.com,url=http://r6---sn-hvaquxaxjvh-3p8l.c.youtube.com/videoplayback?upn=StTvWU7n7N0";
NSString *cleanUrl = [regex stringByReplacingMatchesInString:originalUrl options:0 range:NSMakeRange(0, [originalUrl length]) withTemplate:@"bla"];
Note the withTemplate:@"bla": without it we cannot see where the problem is.
Execution:
clean_url = http://r6---sn-hvaquxaxjvh-3p8l.c.youtube.com/videoplayback?upn=StTvWU7n7N0blablablablablablablablablablablablablablablablabla
Thanks in advance!
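The primary problem is your regular expression. It needs to be:
@"\\\\u0026[^&]*"
You want two backslashes in the regular expression, so that it matches a literal backslash followed by u0026. With only one, the regex parser reads \u0026 as the & character itself, which is why every &name=value pair was replaced in your output. In C and Objective-C, each backslash needs to be escaped with another backslash, so the string literal needs 4 backslashes.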
Here's a simpler approach if you only need to process one string:
NSString *originalUrl = @"http://r6---sn-hvaquxaxjvh-3p8l.c.youtube.com/videoplayback?upn=StTvWU7n7N0&ms=au&expire=1368735912&id=e934f5f5c0743533&fexp=919374,909926,916713,916611,901474,924605,901208,929123,929915,929906,925714,929119,931202,928017,912518,911416,906906,904476,930807,919373,906836,933701,900345,926403,912711,929606,910075&sparams=cp,id,ip,ipbits,itag,ratebypass,source,upn,expire&sver=3&cp=U0hVTVdOU19GTENONV9PSFdKOnZFc0Uyc21YTVQw&ratebypass=yes&mv=m&source=youtube&itag=43&newshard=yes&mt=1368711866&ipbits=8&ip=92.114.198.83&key=yt1\\u0026quality=medium\\u0026type=video/webm&signature=AB8A6D618BDC38AF9D2E81916B863B724D2F12B6.8876CF4E106820B6443B4B06055BF90FD74B5794\\u0026fallback_host=tc.v19.cache7.c.youtube.com,url=http://r6---sn-hvaquxaxjvh-3p8l.c.youtube.com/videoplayback?upn=StTvWU7n7N0";
NSString *cleanURL = [originalUrl stringByReplacingOccurrencesOfString:@"\\\\u0026[^&]*" withString:@"" options:NSRegularExpressionSearch range:NSMakeRange(0, originalUrl.length)];
If you need to process multiple strings with the regular expression then using NSRegularExpression is more efficient.
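For the multiple-string case, a minimal sketch might compile the pattern once and reuse it for every URL. The function name CleanURLs is an assumption for illustration, and the example.com URL in the usage note is a shortened stand-in, not data from the question.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: strips every literal \u0026name=value run from each URL,
// reusing one compiled NSRegularExpression for the whole batch.
static NSArray<NSString *> *CleanURLs(NSArray<NSString *> *urls) {
    NSError *error = nil;
    // Four backslashes in the literal become two in the pattern,
    // which match one literal backslash in the input.
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:@"\\\\u0026[^&]*"
                                                  options:0
                                                    error:&error];
    NSMutableArray<NSString *> *cleaned = [NSMutableArray arrayWithCapacity:urls.count];
    for (NSString *url in urls) {
        if (regex == nil) {
            // Pattern failed to compile: pass the URL through unchanged.
            [cleaned addObject:url];
            continue;
        }
        [cleaned addObject:[regex stringByReplacingMatchesInString:url
                                                           options:0
                                                             range:NSMakeRange(0, url.length)
                                                      withTemplate:@""]];
    }
    return cleaned;
}
```

For instance, passing @[@"http://example.com/v?key=yt1\\u0026quality=medium&signature=AB"] should give back a single URL ending in key=yt1&signature=AB.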

String Trimming with Certain keyword

I have a string like below.
<br><br><br><br><br> SomeHtmlString <br><br><br><br><br>
I want to remove the leading and trailing br tags, like a trim function, while preserving the br tags in the middle of SomeHtmlString.
Is there a function that does this concisely?
e.g.
<br><br><br>test1<br><br>test2<br><br><br><br>
to
test1<br><br>test2
Here is a method using regular expressions. Each pass matches a single <br> and removes it, first at the beginning and then at the end of the string.
NSMutableString *replaceMe = [[NSMutableString alloc]
    initWithString:@"<br><br > <br > test<br>test2<br><br>"];
NSError *error = NULL;
NSRegularExpression *regex = [NSRegularExpression
    regularExpressionWithPattern:@"^ *<br *> *"
    options:NSRegularExpressionCaseInsensitive
    error:&error];
// Remove one leading <br> per pass until nothing is replaced.
do {
    ;
} while ([regex replaceMatchesInString:replaceMe options:0 range:NSMakeRange(0, replaceMe.length) withTemplate:@""] != 0);
regex = [NSRegularExpression
    regularExpressionWithPattern:@" *<br *> *$"
    options:NSRegularExpressionCaseInsensitive
    error:&error];
// Same again for the trailing <br> tags.
do {
    ;
} while ([regex replaceMatchesInString:replaceMe options:0 range:NSMakeRange(0, replaceMe.length) withTemplate:@""] != 0);
NSLog(@"string=%@", replaceMe);
and that does strip "<br><br > <br > test<br>test2<br><br>" down to test<br>test2.
It's probably not the neatest solution but it is very easy to modify to match different expressions, with different whitespace, for example.
It's also possible to use the regular expressions to match several <br>s in one go:
NSRegularExpression *regex = [NSRegularExpression
    regularExpressionWithPattern:@"^ *(<br *> *)+"
    options:NSRegularExpressionCaseInsensitive
    error:&error];
[regex replaceMatchesInString:replaceMe options:0 range:NSMakeRange(0, replaceMe.length) withTemplate:@""];
// Reuse the same variable for the trailing pattern.
regex = [NSRegularExpression
    regularExpressionWithPattern:@" *(<br *> *)+$"
    options:NSRegularExpressionCaseInsensitive
    error:&error];
[regex replaceMatchesInString:replaceMe options:0 range:NSMakeRange(0, replaceMe.length) withTemplate:@""];
which avoids the looping but is a little harder to modify.
You can do this:
NSString *htmlString = @"<br><br><br><br><br> SomeHtmlString <br><br><br><br><br>";
NSString *pureString = [htmlString stringByReplacingOccurrencesOfString:@"<br>" withString:@""];
So you'll have @" SomeHtmlString " in pureString. Note that this removes every <br>, including any in the middle of the string.
You could use this to strip out the unwanted bits:
[yourString stringByReplacingOccurrencesOfString:@"<br>" withString:@""];
Then you would use something like this to remake your string the way you want it:
NSString *newString = [NSString stringWithFormat:@"<br>%@<br>", yourString];
You might also want to look at stringByTrimmingCharactersInSet:
There are so many things you can do with NSString. Check out the Class Reference: https://developer.apple.com/library/mac/#documentation/Cocoa/Reference/Foundation/Classes/NSString_Class/Reference/NSString.html
EDIT:
substringToIndex: could be your friend here. You can do this to find out if the first 4 characters of your string are the ones you want to remove:
NSString *subString = [yourString substringToIndex:4];
if ([subString isEqualToString:@"<br>"]) {
    yourString = [yourString substringFromIndex:4];
}
This creates a new string without those 4 characters. You keep doing this until the first 4 characters are not equal to the ones you want to remove.
You can do something similar at the end of your string using substringFromIndex:. You will need the length of the current string to compute the index and to make sure none of your substrings go out of bounds.
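The repeated prefix and suffix check described above could be sketched as follows. This version swaps in hasPrefix: and hasSuffix: (standard NSString methods) instead of comparing a 4-character substring, so short strings cannot go out of bounds; the function name TrimTag is an assumption for illustration.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: removes every leading and trailing copy of `tag`,
// leaving copies in the middle of the string untouched.
static NSString *TrimTag(NSString *input, NSString *tag) {
    if (tag.length == 0) {
        return input; // Nothing to trim; also avoids an endless loop.
    }
    NSString *result = input;
    // Drop leading copies one at a time.
    while ([result hasPrefix:tag]) {
        result = [result substringFromIndex:tag.length];
    }
    // Drop trailing copies one at a time.
    while ([result hasSuffix:tag]) {
        result = [result substringToIndex:result.length - tag.length];
    }
    return result;
}
```

This only handles an exact tag such as @"<br>"; variants with inner spaces or different case would need the regular expression approaches shown elsewhere in this thread.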
Alternative regular expression rendition:
NSString *input = @"<br><br><br><br><br><br>test<br>test2<br><br><br><br><br><br><br><br><br><br>";
__block NSString *output;
NSError *error;
NSRegularExpression *regex = [NSRegularExpression regularExpressionWithPattern:@"^(<br>)*(.*?)(<br>)*$"
    options:NSRegularExpressionCaseInsensitive
    error:&error];
[regex enumerateMatchesInString:input
    options:0
    range:NSMakeRange(0, [input length])
    usingBlock:^(NSTextCheckingResult *result, NSMatchingFlags flags, BOOL *stop) {
        // Capture group 2 is the text between the leading and trailing <br> runs.
        NSRange matchRange = [result rangeAtIndex:2];
        output = [input substringWithRange:matchRange];
    }];
if (output) {
    NSLog(@"Found: %@", output);
}
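Another possible variation, offered as a sketch rather than a definitive answer, handles both ends in a single replacement pass by putting the leading and trailing cases into one alternation. The function name TrimBreaks and the use of \s (any whitespace) inside and around the tags are assumptions for illustration.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: strips runs of <br> (with optional whitespace) from
// the start and end of the string in one pass, keeping those in the middle.
static NSString *TrimBreaks(NSString *input) {
    NSError *error = nil;
    // First alternative: a run of <br> tags anchored at the start.
    // Second alternative: a run of <br> tags anchored at the end.
    NSString *pattern = @"^\\s*(?:<br\\s*>\\s*)+|\\s*(?:<br\\s*>\\s*)+$";
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:pattern
                                                  options:NSRegularExpressionCaseInsensitive
                                                    error:&error];
    if (regex == nil) {
        NSLog(@"Invalid pattern: %@", error);
        return input;
    }
    return [regex stringByReplacingMatchesInString:input
                                           options:0
                                             range:NSMakeRange(0, input.length)
                                      withTemplate:@""];
}
```

Because the anchors are not in multi-line mode, only the very start and very end of the whole string are trimmed.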