NSRegularExpression to match and replace all occurrences (porting from Ruby) - objective-c

I'm having trouble porting this Ruby code to Objective-C.
Ruby:
clean_url = original_url.gsub(/\\u0026[^&]*/, "")
Execution:
original_url = http://r6---sn-hvaquxaxjvh-3p8l.c.youtube.com/videoplayback?upn=StTvWU7n7N0&ms=au&expire=1368735912&id=e934f5f5c0743533&fexp=919374,909926,916713,916611,901474,924605,901208,929123,929915,929906,925714,929119,931202,928017,912518,911416,906906,904476,930807,919373,906836,933701,900345,926403,912711,929606,910075&sparams=cp,id,ip,ipbits,itag,ratebypass,source,upn,expire&sver=3&cp=U0hVTVdOU19GTENONV9PSFdKOnZFc0Uyc21YTVQw&ratebypass=yes&mv=m&source=youtube&itag=43&newshard=yes&mt=1368711866&ipbits=8&ip=92.114.198.83&key=yt1\u0026quality=medium\u0026type=video/webm&signature=AB8A6D618BDC38AF9D2E81916B863B724D2F12B6.8876CF4E106820B6443B4B06055BF90FD74B5794\u0026fallback_host=tc.v19.cache7.c.youtube.com,url=http://r6---sn-hvaquxaxjvh-3p8l.c.youtube.com/videoplayback?upn=StTvWU7n7N0
clean_url = http://r6---sn-hvaquxaxjvh-3p8l.c.youtube.com/videoplayback?upn=StTvWU7n7N0&ms=au&expire=1368735912&id=e934f5f5c0743533&fexp=919374,909926,916713,916611,901474,924605,901208,929123,929915,929906,925714,929119,931202,928017,912518,911416,906906,904476,930807,919373,906836,933701,900345,926403,912711,929606,910075&sparams=cp,id,ip,ipbits,itag,ratebypass,source,upn,expire&sver=3&cp=U0hVTVdOU19GTENONV9PSFdKOnZFc0Uyc21YTVQw&ratebypass=yes&mv=m&source=youtube&itag=43&newshard=yes&mt=1368711866&ipbits=8&ip=92.114.198.83&key=yt1&signature=AB8A6D618BDC38AF9D2E81916B863B724D2F12B6.8876CF4E106820B6443B4B06055BF90FD74B5794
Ruby code works as expected.
ObjC code:
NSError *error;
NSRegularExpression *regex = [NSRegularExpression regularExpressionWithPattern:@"\\u0026[^&]*" options:NSRegularExpressionCaseInsensitive error:&error];
NSString *originalUrl = @"http://r6---sn-hvaquxaxjvh-3p8l.c.youtube.com/videoplayback?upn=StTvWU7n7N0&ms=au&expire=1368735912&id=e934f5f5c0743533&fexp=919374,909926,916713,916611,901474,924605,901208,929123,929915,929906,925714,929119,931202,928017,912518,911416,906906,904476,930807,919373,906836,933701,900345,926403,912711,929606,910075&sparams=cp,id,ip,ipbits,itag,ratebypass,source,upn,expire&sver=3&cp=U0hVTVdOU19GTENONV9PSFdKOnZFc0Uyc21YTVQw&ratebypass=yes&mv=m&source=youtube&itag=43&newshard=yes&mt=1368711866&ipbits=8&ip=92.114.198.83&key=yt1\\u0026quality=medium\\u0026type=video/webm&signature=AB8A6D618BDC38AF9D2E81916B863B724D2F12B6.8876CF4E106820B6443B4B06055BF90FD74B5794\\u0026fallback_host=tc.v19.cache7.c.youtube.com,url=http://r6---sn-hvaquxaxjvh-3p8l.c.youtube.com/videoplayback?upn=StTvWU7n7N0";
NSString *cleanUrl = [regex stringByReplacingMatchesInString:originalUrl options:0 range:NSMakeRange(0, [originalUrl length]) withTemplate:@"bla"];
Note the withTemplate:@"bla": I use it here so that the replacements are visible in the output and the problem can be seen.
Execution:
clean_url = http://r6---sn-hvaquxaxjvh-3p8l.c.youtube.com/videoplayback?upn=StTvWU7n7N0blablablablablablablablablablablablablablablablabla
Thanks in advance!

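The primary problem is your regular expression. It needs to be:
@"\\\\u0026[^&]*"
The regular expression itself needs two backslashes (an escaped backslash) so that it matches one literal backslash. In a C or Objective-C string literal, each backslash must in turn be escaped with another backslash, so the literal needs four. With only two in the literal, the regex engine receives \u0026, which it reads as the Unicode escape for &; that is why every &-delimited parameter was replaced.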
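To make the two escaping layers concrete, here is a minimal sketch. The helper name StripEscapedAmpersandParams is hypothetical (it is not from the question); it lets +escapedPatternForString: add the regex-level escaping, so only the Objective-C level has to be written by hand.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (not from the original post): removes every
// literal "\u0026..." chunk up to the next '&'.
static NSString *StripEscapedAmpersandParams(NSString *input) {
    // The Objective-C literal @"\\u0026" is the 6-character text \u0026.
    // escapedPatternForString: then escapes the backslash for the regex engine.
    NSString *literal = [NSRegularExpression escapedPatternForString:@"\\u0026"];
    NSString *pattern = [literal stringByAppendingString:@"[^&]*"];

    NSError *error = nil;
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:pattern
                                                  options:0
                                                    error:&error];
    if (regex == nil) {
        NSLog(@"Invalid pattern: %@", error);
        return input;
    }
    return [regex stringByReplacingMatchesInString:input
                                           options:0
                                             range:NSMakeRange(0, input.length)
                                      withTemplate:@""];
}
```

Writing the pattern by hand as @"\\\\u0026[^&]*" should be equivalent; the helper only makes it harder to miscount backslashes.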
Here's a simpler approach if you only need to process one string:
NSString *originalUrl = @"http://r6---sn-hvaquxaxjvh-3p8l.c.youtube.com/videoplayback?upn=StTvWU7n7N0&ms=au&expire=1368735912&id=e934f5f5c0743533&fexp=919374,909926,916713,916611,901474,924605,901208,929123,929915,929906,925714,929119,931202,928017,912518,911416,906906,904476,930807,919373,906836,933701,900345,926403,912711,929606,910075&sparams=cp,id,ip,ipbits,itag,ratebypass,source,upn,expire&sver=3&cp=U0hVTVdOU19GTENONV9PSFdKOnZFc0Uyc21YTVQw&ratebypass=yes&mv=m&source=youtube&itag=43&newshard=yes&mt=1368711866&ipbits=8&ip=92.114.198.83&key=yt1\\u0026quality=medium\\u0026type=video/webm&signature=AB8A6D618BDC38AF9D2E81916B863B724D2F12B6.8876CF4E106820B6443B4B06055BF90FD74B5794\\u0026fallback_host=tc.v19.cache7.c.youtube.com,url=http://r6---sn-hvaquxaxjvh-3p8l.c.youtube.com/videoplayback?upn=StTvWU7n7N0";
NSString *cleanURL = [originalUrl stringByReplacingOccurrencesOfString:@"\\\\u0026[^&]*" withString:@"" options:NSRegularExpressionSearch range:NSMakeRange(0, originalUrl.length)];
If you need to process multiple strings with the regular expression then using NSRegularExpression is more efficient.
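For the multiple-strings case, a minimal sketch might look like this. CleanURLs is a hypothetical helper name; the point is only that the NSRegularExpression object is created once and reused for every input.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: cleans a batch of URLs with a single regex object.
static NSArray<NSString *> *CleanURLs(NSArray<NSString *> *urls) {
    NSError *error = nil;
    // Four backslashes in source -> two in the pattern -> one literal
    // backslash matched in the input.
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:@"\\\\u0026[^&]*"
                                                  options:0
                                                    error:&error];
    if (regex == nil) {
        NSLog(@"Invalid pattern: %@", error);
        return urls;
    }

    NSMutableArray<NSString *> *cleaned =
        [NSMutableArray arrayWithCapacity:urls.count];
    for (NSString *url in urls) {
        NSString *result =
            [regex stringByReplacingMatchesInString:url
                                            options:0
                                              range:NSMakeRange(0, url.length)
                                       withTemplate:@""];
        [cleaned addObject:result];
    }
    return cleaned;
}
```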

Related

Regular expression substitution problem in Objective-C

I'm trying to uppercase all tags and running into trouble with the substitution. Any idea why uppercaseString isn't working?
NSError *error = nil;
NSMutableString *stringToCap = [NSMutableString stringWithString:@"<kaboom>stuff</kaboom>"];
NSRegularExpression *regex = [NSRegularExpression regularExpressionWithPattern:@"(</?[a-zA-Z].*?>)" options:NSRegularExpressionCaseInsensitive error:&error];
NSMutableString *modifiedString = [NSMutableString stringWithString:[regex stringByReplacingMatchesInString:stringToCap options:0 range:NSMakeRange(0, [stringToCap length]) withTemplate:@"$1".uppercaseString]];
NSLog(@"%@", modifiedString);
Produces: <kaboom>stuff</kaboom> when I expect <KABOOM>stuff</KABOOM>
stringByReplacingMatchesInString:options:range:withTemplate: doesn't work like that. The last argument is just an NSString, and the string you are passing is the result of the expression @"$1".uppercaseString, which is evaluated before the method is called and is still just @"$1" (uppercasing "$1" changes nothing).
A possible algorithm (pseudocode):
for each NSTextCheckingResult *match in [regex matchesInString:... options:... range:...] do
    extract the substring at match.range from the mutable string
    uppercase it
    replace the substring at match.range with the uppercased result
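The pseudocode could be fleshed out roughly as follows. This is a sketch, not the answerer's code: UppercaseTags is a hypothetical name, and walking the matches from last to first is my own precaution, since uppercasing can change a string's length (for example "ß" becomes "SS").

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: returns a copy of `input` with every tag uppercased.
static NSString *UppercaseTags(NSString *input) {
    NSError *error = nil;
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:@"</?[a-zA-Z].*?>"
                                                  options:0
                                                    error:&error];
    if (regex == nil) {
        NSLog(@"Invalid pattern: %@", error);
        return input;
    }

    NSMutableString *result = [input mutableCopy];
    NSArray<NSTextCheckingResult *> *matches =
        [regex matchesInString:input
                       options:0
                         range:NSMakeRange(0, input.length)];

    // Go backwards so that a replacement of a different length never
    // shifts the ranges that are still waiting to be processed.
    for (NSTextCheckingResult *match in [matches reverseObjectEnumerator]) {
        NSString *tag = [input substringWithRange:match.range];
        [result replaceCharactersInRange:match.range
                              withString:tag.uppercaseString];
    }
    return result;
}
```

Note that this uppercases everything between < and >, attributes included, because that is what the question's pattern matches.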

Objective C - Split string into array

How would I do this? I'm new to Objective-C but I can't find anything that would help me do this.
NSArray *splitLine = [currentLine componentsSeparatedByString:@":%@", notNumber];
Where notNumber is a string that represents anything that isn't a number. So I want to separate a string where there are colons separated by strings that aren't numbers. (I want to avoid splitting at times i.e. 3:00pm, but split at iCal parameters like DESCRIPTION: and LOCATION:.)
You can do this in several steps, like this. I have not compiled this code, but it should at least give you an idea of what to do.
1) Create a regex object to match your separators:
NSString *regexString = @"DESCRIPTION:\\s|LOCATION:\\s"; // or whatever makes sense for your scenario
NSRegularExpression *regex =
    [NSRegularExpression regularExpressionWithPattern:regexString
                                              options:NSRegularExpressionCaseInsensitive
                                                error:nil];
2) Replace all the different separators matching your regex with just one separator:
NSRange range = NSMakeRange(0, string.length);
NSString *string2 = [regex stringByReplacingMatchesInString:string
                                                    options:0
                                                      range:range
                                               withTemplate:@"SEPARATOR"];
3) Split the string!
NSArray *elements = [string2 componentsSeparatedByString:@"SEPARATOR"];
The shortest way to split a string on a fixed separator:
NSString *str = @"Please split me to form array of words";
NSArray *wordsArray = [str componentsSeparatedByString:@" "];
You can use regular expressions!
Use this pattern (I believe this is the core of your question):
NSString *pattern = @"(?<=[^0-9]):(?=[^0-9])";
This pattern only matches ':' characters that have a non-digit character on both sides.
Then replace each match with a dummy value that won't appear in your data:
NSString *dummy = @"NEVERSEETHIS";
NSRegularExpression *regex = [NSRegularExpression regularExpressionWithPattern:pattern options:0 error:nil];
NSRange range = NSMakeRange(0, [string length]);
NSString *modified = [regex stringByReplacingMatchesInString:string options:0 range:range withTemplate:dummy];
And finally, split:
return [modified componentsSeparatedByString:dummy];
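If picking a dummy separator that can never occur in the data feels fragile, a variation is to cut the string at the match ranges directly. The sketch below uses the same lookaround pattern; SplitOnNonTimeColons is a hypothetical helper name, and slicing by range is my substitution for the replace-then-split step.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: splits `input` at every ':' that has a non-digit
// on both sides, so times such as 3:00pm stay intact.
static NSArray<NSString *> *SplitOnNonTimeColons(NSString *input) {
    NSError *error = nil;
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:@"(?<=[^0-9]):(?=[^0-9])"
                                                  options:0
                                                    error:&error];
    if (regex == nil) {
        NSLog(@"Invalid pattern: %@", error);
        return @[input];
    }

    NSMutableArray<NSString *> *parts = [NSMutableArray array];
    NSUInteger start = 0;
    NSArray<NSTextCheckingResult *> *matches =
        [regex matchesInString:input
                       options:0
                         range:NSMakeRange(0, input.length)];
    for (NSTextCheckingResult *match in matches) {
        // Text between the previous separator and this one.
        NSRange piece = NSMakeRange(start, match.range.location - start);
        [parts addObject:[input substringWithRange:piece]];
        start = NSMaxRange(match.range);
    }
    // Whatever follows the last separator (or the whole string if none).
    [parts addObject:[input substringFromIndex:start]];
    return parts;
}
```

Because the lookarounds require an actual character on each side, a colon at the very start or end of the string is not treated as a separator.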

Why is my NSRegularExpression pattern not working?

I have the following string:
NSString *string = @"she seemed \x3cem\x3ereluctant\x3c/em\x3e to discuss the matter";
I want the final string to be: "she seemed reluctant to discuss the matter"
I have the following pattern:
NSRegularExpression *regex = [NSRegularExpression regularExpressionWithPattern:@"/\\x[0-9a-f]{2}/"
                                                                       options:NSRegularExpressionCaseInsensitive
                                                                         error:&error];
NSArray *matches = [regex matchesInString:string options:0 range:NSMakeRange(0, [string length])];
for (NSTextCheckingResult *match in matches) {
    NSRange matchRange = [match range];
    NSLog(@"%@", NSStringFromRange(matchRange));
}
However, I get an error saying the pattern is invalid. What am I doing wrong?
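The pattern you need is:
@"\\\\x[0-9a-f]{2}"
The backslash is special to both the Objective-C compiler and the regex parser. The string literal therefore needs four backslashes: the compiler reduces them to two, and the regex parser reads those two as one literal backslash. Your original literal reached the regex parser as \x[0-9a-f]{2}, where \x starts a hex escape that is not followed by hex digits, which is most likely why the pattern was rejected.
Also, NSRegularExpression patterns have no opening/closing / delimiters. You're thinking of another programming language there!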
You can save yourself some regex troubles by using the NSString method
stringByReplacingOccurrencesOfString:withString:
Or
stringByReplacingOccurrencesOfString:withString:options:range:
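As a sketch of the NSString route: the longer method accepts NSRegularExpressionSearch in its options, which makes it treat the search string as a regex. StripEscapedTags is a hypothetical helper, and it assumes the input really contains the literal text \x3c and \x3e (a backslash followed by x3c or x3e), which is what the accepted pattern implies.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: removes escaped tags such as \x3cem\x3e, assuming
// the backslash sequences are present as literal text in `input`.
static NSString *StripEscapedTags(NSString *input) {
    // Regex as seen by the engine: \\x3c.*?\\x3e
    // i.e. literal "\x3c", the shortest possible run, then literal "\x3e".
    return [input stringByReplacingOccurrencesOfString:@"\\\\x3c.*?\\\\x3e"
                                            withString:@""
                                               options:NSRegularExpressionSearch
                                                 range:NSMakeRange(0, input.length)];
}
```

Removing only the \xNN escapes themselves would leave "emreluctant/em" behind, so the sketch removes each escaped tag as a whole to reach the sentence the question asks for.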

NSRegularExpression get only the regex

I have a problem and I don't understand how to solve it (after 6 hours of googling).
I have a string named "filename" that contains this text: "Aachen-Merzbrück EDKA\r\r\nVerkehr"
I want to use a regex to get only this part, "Aachen-Merzbrück EDKA", but I can't.
Here is my code:
NSString *expression = @"\\w+\\s[A-Z]{4}";
NSError *error = NULL;
NSRegularExpression *regex = [NSRegularExpression regularExpressionWithPattern:expression options:NSRegularExpressionCaseInsensitive error:&error];
NSString *noAirportString = [regex stringByReplacingMatchesInString:filename options:0 range:NSMakeRange(0, [filename length]) withTemplate:@""];
EDIT:
This one works well:
\S+\s+[A-Z]{4}
But now, how do I get only "Aachen-Merzbrück EDKA" from "Aachen-Merzbrück EDKA\r\r\nVerkehr"?
My regex with NSRegularExpression returns the same string.
A couple of issues in your question:
There's no need to match the characters of the city name; there are always odd ones around (hyphens, apostrophes, etc.). You can just match the first "line" of your text, with a test for the ICAO code as an extra safeguard.
By using stringByReplacingMatchesInString: you actually remove the airport name (and ICAO code) that you want to keep.
stringByReplacingMatchesInString: is a hacky shortcut here: because it deletes things, your regex has to describe what you don't want. It sometimes works (I use it myself), but it risks confusing things, and future readers.
Having said that, a few changes will fix it:
NSString *filename = @"Aachen-Merzbrück EDKA\r\r\nVerkehr";
// Match anything from the beginning of the line up to a space and 4 upper case letters.
NSString *expression = @"^.+\\s[A-Z]{4}$";
NSError *error = NULL;
//Make sure ^ and $ match line endings,
//and make it case sensitive (the default) to explicitly
//match the 4 upper case characters of the ICAO code
NSRegularExpression *regex = [NSRegularExpression regularExpressionWithPattern:expression options:NSRegularExpressionAnchorsMatchLines error:&error];
NSArray *matches = [regex matchesInString:filename
                                  options:0
                                    range:NSMakeRange(0, [filename length])];
// Check that there _is_ a match before you continue;
// objectAtIndex:0 below throws on an empty array
if (matches.count == 0) {
    // Error: handle it and stop here
}
NSRange airportNameRange = [[matches objectAtIndex: 0] range];
NSString *airportString = [filename substringWithRange: airportNameRange];
Thanks, it works well, but I'm using this one, which works better in my case:
NSString *expression = @"\\S+\\s+[A-Z]{4}";

stringByReplacingOccurrencesOfString Regular Expression

I'm developing a Mac app and I'm trying to use NSString's stringByReplacingOccurrencesOfString. I'm doing something like:
NSString *new = [s stringByReplacingOccurrencesOfString:@"(special-tag)*.*</body" withString:html];
on an NSString. But whenever I try to use this method with a regular expression it seems to break. Is there something I'm missing? I found a few external regex libraries, but I'd rather use something built in that has similar functionality.
Any advice? Thanks in advance! EDIT - I know why it's breaking; I need help figuring out how to do an NSString replace with regular expressions.
As the name suggests, stringByReplacingOccurrencesOfString:withString: replaces occurrences of a string, not of a regex. It searches for your pattern as literal text rather than interpreting it as a regular expression.
Edit:
I haven't used regex before, but I hope this will give you the idea:
NSString *string = @"this is your string";
NSError *error = NULL;
NSRegularExpression *regex = [NSRegularExpression regularExpressionWithPattern:@"\\b(a|b)(c|d)\\b" options:NSRegularExpressionCaseInsensitive error:&error];
NSString *modifiedString = [regex stringByReplacingMatchesInString:string options:0 range:NSMakeRange(0, [string length]) withTemplate:@"$2$1"];
Here is the NSRegularExpression Class Reference
Take a look at the NSRegularExpression class. It sounds like the -stringByReplacingMatchesInString:options:range:withTemplate: method will fit your needs. You might also like -replaceMatchesInString:options:range:withTemplate:, which modifies an NSMutableString in place and returns the number of replacements made.
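For the in-place variant, a minimal sketch could look like this. SwapPairsInPlace is a hypothetical helper name, and the pattern and template are simply the ones from the earlier answer, reused for illustration.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: swaps the two captured letters of every matching
// two-letter word directly inside `text` and returns how many were changed.
static NSUInteger SwapPairsInPlace(NSMutableString *text) {
    NSError *error = nil;
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:@"\\b(a|b)(c|d)\\b"
                                                  options:NSRegularExpressionCaseInsensitive
                                                    error:&error];
    if (regex == nil) {
        NSLog(@"Invalid pattern: %@", error);
        return 0;
    }
    // Mutates `text`; $1 and $2 refer to the two capture groups.
    return [regex replaceMatchesInString:text
                                 options:0
                                   range:NSMakeRange(0, text.length)
                            withTemplate:@"$2$1"];
}
```

The returned count is a cheap way to tell whether anything matched at all, which the string-returning variant does not report.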