Regex stringByReplacingMatchesInString - objective-c

I'm trying to remove any non-alphanumeric character within a string. I tried the following code snippet, but it is not replacing the appropriate character.
NSString *theString = @"\"day's\"";
NSError *error = NULL;
NSRegularExpression *regex = [NSRegularExpression regularExpressionWithPattern:@"^\\B\\W^\\B" options:NSRegularExpressionCaseInsensitive error:&error];
NSString *newString = [regex stringByReplacingMatchesInString:theString options:0 range:NSMakeRange(0, [theString length]) withTemplate:@""];
NSLog(@"the resulting string is %@", newString);

Since there's a need to preserve the enclosing quotation marks in the string, the regex necessarily becomes a bit complex.
Here is one which does it:
(?:(?<=^")(\W+))|(?:(?!^")(\W+)(?=.))|(?:(\W+)(?="$))
It uses lookbehind and lookahead to test for the quotation marks without including them in the match, so they are not deleted when each match is replaced with the empty string.
The three parts handle the initial quotation mark, all characters in the middle and the last quotation mark, respectively.
It is a bit pedestrian and there has to be a simpler way to do it, but I haven't been able to find it. Others are welcome to chime in!
NSString *theString = @"\"day's\"";
NSString *pattern = @"(?:(?<=^\")(\\W+))|(?:(?!^\")(\\W+)(?=.))|(?:(\\W+)(?=\"$))";
NSError *error = NULL;
NSRegularExpression *regex = [NSRegularExpression regularExpressionWithPattern: pattern
    options: 0 // No need to specify case insensitive, \W makes it irrelevant
    error: &error];
NSString *newString = [regex stringByReplacingMatchesInString: theString
    options: 0
    range: NSMakeRange(0, [theString length])
    withTemplate: @""];
The (?:) construct creates a non-capturing group, meaning that each alternative can keep its lookbehind (or lookahead) together with the "real" capture group without creating an extra capture group around the whole alternative. The lookarounds themselves are zero-width: the quotation marks they test for are never part of the match, which is why substituting an empty string for each match leaves them in place.
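On the "simpler way" point, one possible sketch is a single pattern, (?<=.)\W+(?=.), which only matches non-word characters that have at least one character on each side, so the first and last characters of the string are never touched. The helper name StripInnerNonWordCharacters is hypothetical, and the sketch assumes a single-line string that always starts and ends with the quotation marks to keep.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: removes non-word characters everywhere except at the
// very first and very last position of the string.
static NSString *StripInnerNonWordCharacters(NSString *input) {
    NSError *error = nil;
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:@"(?<=.)\\W+(?=.)"
                                                  options:0
                                                    error:&error];
    if (regex == nil) {
        NSLog(@"Invalid pattern: %@", error);
        return input;
    }
    // Every match is replaced as a whole; the lookarounds consume nothing.
    return [regex stringByReplacingMatchesInString:input
                                           options:0
                                             range:NSMakeRange(0, input.length)
                                      withTemplate:@""];
}
```

Because the first and last characters are kept regardless of what they are, this only does the right thing when the input really is wrapped in quotation marks.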

Related

Regex number of matches is always zero

I want to check a UITextField text with a format like "G12-123456".
Rules are simple;
First character must be upper case letter.
The 2nd and 3rd must be number.
Fourth must be "-" character.
The last six must be only numbers.
The code below does not work; the number of matches is always zero.
I also tried the regex "[A-Z0-9]{3}-[0-9]{6}".
NSString *myRegex = @"[A-Z][0-9][0-9]-[0-9][0-9][0-9][0-9][0-9][0-9]";
NSError *error = NULL;
NSRegularExpression *regex = [NSRegularExpression regularExpressionWithPattern:myRegex
    options:NSRegularExpressionCaseInsensitive
    error:&error];
NSUInteger numberOfMatches = [regex numberOfMatchesInString:string
    options:NSMatchingReportProgress
    range:NSMakeRange(0, [string length])];
The same code works with the pattern [^a-zA-Z0-9] (from "Check whether an NSString contains a special character and a digit").
Any help would be appreciated.
First of all, your code is basically supposed to work.
However, both options are wrong for this task. If you want to check for an uppercase letter you must not pass NSRegularExpressionCaseInsensitive, and NSMatchingReportProgress only affects the block-based API.
In both cases pass 0.
The pattern can be written more concisely:
NSString *myRegex = @"[A-Z]\\d{2}-\\d{6}";
NSError *error = nil;
NSRegularExpression *regex = [NSRegularExpression regularExpressionWithPattern:myRegex
    options:0
    error:&error];
if (regex == nil) {
    // The pattern failed to compile; error describes why
    NSLog(@"%@", error);
} else {
    NSUInteger numberOfMatches = [regex numberOfMatchesInString:string
        options:0
        range:NSMakeRange(0, [string length])];
    NSLog(@"%lu", (unsigned long)numberOfMatches);
}
If the regex must match the entire string, add the start and end anchors.
NSString *myRegex = @"^[A-Z]\\d{2}-\\d{6}$";
If numberOfMatches is zero please check if the hyphen character is the standard one (ASCII 45, hex 0x2D).
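Put together, the anchored check could be wrapped in a small helper like the sketch below. The function name IsValidCode is hypothetical; the pattern and options are the ones suggested above.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: YES only if the whole string has the form "G12-123456".
static BOOL IsValidCode(NSString *candidate) {
    NSError *error = nil;
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:@"^[A-Z]\\d{2}-\\d{6}$"
                                                  options:0   // case-sensitive on purpose
                                                    error:&error];
    if (regex == nil) {
        NSLog(@"Invalid pattern: %@", error);
        return NO;
    }
    NSUInteger count = [regex numberOfMatchesInString:candidate
                                              options:0
                                                range:NSMakeRange(0, candidate.length)];
    return count == 1;
}
```

A lowercase first letter or a typographic dash in place of the ASCII hyphen makes the check fail, which matches the two pitfalls mentioned above.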

How to escape arbitrary special characters such as "(" in a regular expression?

NSString *yourString = @"/Users/user/Downloads/data(1).txt.download/data(1).txt";
NSError *error = NULL;
NSRegularExpression *regex = [NSRegularExpression
    regularExpressionWithPattern:@"/Users/user/Downloads/data(1).txt-*\\d*.download"
    options:NSRegularExpressionCaseInsensitive
    error:&error];
[regex enumerateMatchesInString:yourString options:0 range:NSMakeRange(0, [yourString length]) usingBlock:^(NSTextCheckingResult *match, NSMatchingFlags flags, BOOL *stop){
    // your code to handle matches here
    NSLog(@"%@ %lu", match, (unsigned long)flags);
}];
In the code above, the file name contains the special character "(", so the pattern does not match; in a regular expression the parenthesis has to be written as "\\(". Of course I can use @"/Users/user/Downloads/data\\(1\\).txt-*\\d*.download" to match, but what if the file name contains other special characters? Is there any way to handle this scenario in a general way?
The pattern passed to regularExpressionWithPattern: has to be a variable built with [NSString stringWithFormat:xxx].
Is there any way to handle this scenario in a general way?
Why not solve the problem with a regular expression (RE)? That is, before using a string that should be matched as-is as part of a pattern, apply an RE to it that matches any RE metacharacters and adds the required escape.
E.g. the pattern [()] will match an opening or a closing parenthesis, and you can replace any match with \\ followed by the matched character itself.
HTH
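As an alternative to writing the escaping RE by hand, Foundation provides +[NSRegularExpression escapedPatternForString:], which backslash-escapes the metacharacters in a string so that it is matched literally. A minimal sketch, assuming the literal file path arrives in a variable and only the suffix is a real pattern; the function names PatternForDownload and FirstDownloadMatch are hypothetical:

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: builds a pattern that matches `literalPath` verbatim,
// followed by the "-<digits>.download" suffix pattern from the question.
static NSString *PatternForDownload(NSString *literalPath) {
    NSString *escaped = [NSRegularExpression escapedPatternForString:literalPath];
    return [escaped stringByAppendingString:@"-*\\d*\\.download"];
}

// Hypothetical helper: returns the first matching substring, or nil.
static NSString *FirstDownloadMatch(NSString *literalPath, NSString *haystack) {
    NSError *error = nil;
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:PatternForDownload(literalPath)
                                                  options:0
                                                    error:&error];
    if (regex == nil) {
        NSLog(@"Invalid pattern: %@", error);
        return nil;
    }
    NSTextCheckingResult *match =
        [regex firstMatchInString:haystack
                          options:0
                            range:NSMakeRange(0, haystack.length)];
    return match ? [haystack substringWithRange:match.range] : nil;
}
```

Only the part that comes from user data is escaped; the hand-written suffix stays a normal pattern, so its own metacharacters keep their meaning.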

Objective C - Split string into array

How would I do this? I'm new to Objective-C and I can't find anything that would help me do this.
NSArray *splitLine = [currentLine componentsSeparatedByString:@":%@",notNumber];
Where notNumber is a string that represents anything that isn't a number. So I want to separate a string where there are colons separated by strings that aren't numbers. (I want to avoid splitting at times i.e. 3:00pm, but split at iCal parameters like DESCRIPTION: and LOCATION:.)
You can do this in several steps, like this. I have not compiled this code, but it should at least give you an idea of what to do.
1) Create a regex object to match your separators:
NSString *regexString = @"DESCRIPTION:\\s|LOCATION:\\s"; // or whatever makes sense for your scenario
NSRegularExpression *regex =
    [NSRegularExpression regularExpressionWithPattern:regexString
        options:NSRegularExpressionCaseInsensitive
        error:nil];
2) Replace all the different separators matching your regex with just one separator:
NSRange range = NSMakeRange(0, string.length);
NSString *string2 = [regex stringByReplacingMatchesInString:string
    options:0
    range:range
    withTemplate:@"SEPARATOR"];
3) Split the string!
NSArray *elements = [string2 componentsSeparatedByString:@"SEPARATOR"];
The shortest solution for splitting a string:
NSString *str = @"Please split me to form array of words";
NSArray *wordsArray = [str componentsSeparatedByString:@" "];
You can use regular expressions!
Using the pattern (I believe this is the core of your question):
NSString *pattern = @"(?<=[^0-9]):(?=[^0-9])";
This pattern only matches ':' characters that have a non-digit on both sides.
Then replace each match with a dummy value that won't show up in your data:
NSString *dummy = @"NEVERSEETHIS";
NSRegularExpression *regex = [NSRegularExpression regularExpressionWithPattern:pattern options:0 error:nil];
NSRange range = NSMakeRange(0, [yourString length]);
NSString *modified = [regex stringByReplacingMatchesInString:yourString options:0 range:range withTemplate:dummy];
and finally, split:
return [modified componentsSeparatedByString:dummy];
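The dummy separator can also be avoided by cutting the string at the match ranges directly. The following is a sketch using the same pattern; the function name SplitAtNonTimeColons is hypothetical.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: splits at every ':' that has a non-digit on both
// sides, so times such as "3:00pm" stay intact.
static NSArray<NSString *> *SplitAtNonTimeColons(NSString *input) {
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:@"(?<=[^0-9]):(?=[^0-9])"
                                                  options:0
                                                    error:NULL];
    NSArray<NSTextCheckingResult *> *matches =
        [regex matchesInString:input options:0 range:NSMakeRange(0, input.length)];

    NSMutableArray<NSString *> *parts = [NSMutableArray array];
    NSUInteger start = 0;
    for (NSTextCheckingResult *match in matches) {
        // Text between the previous separator and this one
        NSRange piece = NSMakeRange(start, match.range.location - start);
        [parts addObject:[input substringWithRange:piece]];
        start = NSMaxRange(match.range);
    }
    // Whatever follows the last separator
    [parts addObject:[input substringFromIndex:start]];
    return parts;
}
```

This removes the need to pick a dummy token that is guaranteed not to occur in the data.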

NSRegularExpression get only the regex

I have a problem and I don't understand how to solve it (after 6 hours of googling).
I have a string named "filename" that contains this text: "Aachen-Merzbrück EDKA\r\r\nVerkehr"
I want to use a regex to get only the part "Aachen-Merzbrück EDKA", but I can't...
Here is my code:
NSString *expression = @"\\w+\\s[A-Z]{4}";
NSError *error = NULL;
NSRegularExpression *regex = [NSRegularExpression regularExpressionWithPattern:expression options:NSRegularExpressionCaseInsensitive error:&error];
NSString *noAirportString = [regex stringByReplacingMatchesInString:filename options:0 range:NSMakeRange(0, [filename length]) withTemplate:@""];
EDIT:
This one works well:
\S+\s+[A-Z]{4}
But now, how do I get only "Aachen-Merzbrück EDKA" from "Aachen-Merzbrück EDKA\r\r\nVerkehr"?
My regex with NSRegularExpression returns the same string...
A couple of issues in your question:
There is no need to match the city name characters; there are always weird ones around (hyphens, apostrophes, etc.). You can just match the first "line" in your text, with a test for the ICAO code as an extra safeguard.
Using stringByReplacingMatchesInString: you actually remove the airport name (and ICAO code) that you want to keep.
stringByReplacingMatchesInString: is a hacky shortcut (it deletes things, so you need to make your regexes "negative") that sometimes works (I use it myself) but risks confusing things, and future readers.
Having said that, a few changes will fix it:
NSString *filename = @"Aachen-Merzbrück EDKA\r\r\nVerkehr";
// Match anything from the beginning of the line up to a space and 4 upper case letters.
NSString *expression = @"^.+\\s[A-Z]{4}$";
NSError *error = NULL;
// Make sure ^ and $ match line endings,
// and make it case sensitive (the default) to explicitly
// match the 4 upper case characters of the ICAO code
NSRegularExpression *regex = [NSRegularExpression regularExpressionWithPattern:expression options:NSRegularExpressionAnchorsMatchLines error:&error];
NSArray *matches = [regex matchesInString:filename
    options:0
    range:NSMakeRange(0, [filename length])];
// Check that there _is_ a match before you continue
if (matches.count == 0) {
    // Error: nothing matched, so there is no range to extract
} else {
    NSRange airportNameRange = [[matches objectAtIndex: 0] range];
    NSString *airportString = [filename substringWithRange: airportNameRange];
}
Thanks, that works well, but I use this one, which works better in my case:
NSString *expression = @"\\S+\\s+[A-Z]{4}";
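For a single extraction like this, NSString's rangeOfString:options: with NSRegularExpressionSearch is another way to get the matched text rather than delete it. A short sketch using the pattern from the comment above; the function name AirportNameAndCode is hypothetical.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: returns the first "<word> <4 upper case letters>"
// substring, or nil if there is none. The search is case-sensitive.
static NSString *AirportNameAndCode(NSString *text) {
    NSRange range = [text rangeOfString:@"\\S+\\s+[A-Z]{4}"
                                options:NSRegularExpressionSearch];
    if (range.location == NSNotFound) {
        return nil;
    }
    return [text substringWithRange:range];
}
```

Note that \S+ stops at whitespace, so for a name made of several words only the last word before the code would be returned.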

Dealing with separation characters within quotes when using componentsSeparatedByCharactersInSet

I'm trying to separate a string by the use of a comma. However I do not want to include commas that are within quoted areas. What is the best way of going about this in Objective-C?
An example of what I am dealing with is:
["someRandomNumber","Some Other Info","This quotes area, has a comma",...]
Any help would be greatly appreciated.
Regular expressions might work well for this, depending on your requirements. For example, if you're always trying to match items that are enclosed in double quotes, then it might be easier to look for the quotes rather than worrying about the commas.
For example, you could do something like this:
NSString *pattern = @"\"[^\"]*\"";
NSError *error = NULL;
NSRegularExpression *regex = [NSRegularExpression regularExpressionWithPattern:pattern
    options:NSRegularExpressionCaseInsensitive error:&error];
NSArray *matches = [regex matchesInString:string options:0 range:NSMakeRange(0, [string length])];
for (NSTextCheckingResult *match in matches) {
    NSRange matchRange = [match range];
    NSString *substring = [string substringWithRange:matchRange];
    // do whatever you need to do with the substring
}
This code looks for a sequence of characters enclosed in quotes (the regex pattern "[^"]*"). Then for each match it extracts the matched range as a substring.
If that doesn't exactly match your requirements, it shouldn't be too difficult to adapt it to use a different regex pattern.
I'm not in a position to test this code at the moment, so my apologies if there are any errors. Hopefully the basic concept should be clear.
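If the surrounding quotes are not wanted in the result, a capture group around the field contents lets you pull out just the inner text. This is a sketch under the assumption that fields never contain escaped quotation marks; the function name QuotedFields is hypothetical.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: returns the contents of every double-quoted field,
// without the quotes. Commas inside a field are left untouched.
static NSArray<NSString *> *QuotedFields(NSString *line) {
    NSError *error = nil;
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:@"\"([^\"]*)\""
                                                  options:0
                                                    error:&error];
    if (regex == nil) {
        NSLog(@"Invalid pattern: %@", error);
        return @[];
    }
    NSMutableArray<NSString *> *fields = [NSMutableArray array];
    NSArray<NSTextCheckingResult *> *matches =
        [regex matchesInString:line options:0 range:NSMakeRange(0, line.length)];
    for (NSTextCheckingResult *match in matches) {
        // Group 1 is the text between the quotes
        [fields addObject:[line substringWithRange:[match rangeAtIndex:1]]];
    }
    return fields;
}
```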