RegEx (replaceMatchesInString) does not work - objective-c

Why does this RegEx with replaceMatchesInString return only "+" instead of "+123"?
NSMutableString *phoneNumberCleaned = [NSMutableString stringWithString:@"++00123"];
NSString *strRegEx = @"^([+0]*)\\d*$";
NSRegularExpression *regEx = [NSRegularExpression regularExpressionWithPattern:strRegEx options:0 error:nil];
[regEx replaceMatchesInString:phoneNumberCleaned options:0 range:NSMakeRange(0, [phoneNumberCleaned length]) withTemplate:@"+"];
return phoneNumberCleaned;
Thanks

NSString *string = @"++00123";
NSError *error = nil;
NSRegularExpression *regex = [NSRegularExpression
regularExpressionWithPattern:@"^[+0]+(?=\\d*)"
options:NSRegularExpressionCaseInsensitive
error:&error];
NSString *modifiedString = [regex
stringByReplacingMatchesInString:string
options:0
range:NSMakeRange(0, [string length])
withTemplate:@"+"];
return modifiedString;
The problem with your regex is that ^([+0]*)\\d*$ also matches the \\d* part, so the digits get replaced too. The replacement always covers the entire match, not just the capture group; a capture group only makes its text available to the template (as $1). So you were essentially replacing the whole string that matched the pattern, trailing digits included, which in your case was the entire number.
What I used in my answer is called a positive lookahead.
^[+0]+(?=\\d*)
The lookahead means that you're looking for one or more + or 0 characters that are followed by zero or more digits, EXCLUDING the digits from the match. So you only replace the zeroes and pluses, not the digits following them. (Because \\d* can also match nothing, this particular lookahead always succeeds, so ^[+0]+ on its own behaves the same way.)
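As an alternative to the lookahead, the digits can be captured and put back by the template. The following is only a sketch of that idea; the helper name NormalizePhonePrefix is hypothetical and not part of the original answer.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (an assumption for illustration): collapse any run of
// leading '+' and '0' characters into a single '+', keeping the digits.
static NSString *NormalizePhonePrefix(NSString *input) {
    NSError *error = nil;
    // Group 1 captures the digits so the template can re-insert them as $1.
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:@"^[+0]*(\\d*)$"
                                                  options:0
                                                    error:&error];
    if (regex == nil) {
        NSLog(@"Invalid pattern: %@", error);
        return input;
    }
    // The whole match is replaced, but "$1" restores the captured digits.
    return [regex stringByReplacingMatchesInString:input
                                           options:0
                                             range:NSMakeRange(0, input.length)
                                      withTemplate:@"+$1"];
}
```

Calling NormalizePhonePrefix(@"++00123") should give +123; a string that does not match the pattern at all is returned unchanged.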

Related

Regular expression substitution problem in Objective-C

Trying to capitalize all tags and running into trouble with substitution. Any idea why "upperCaseString" method isn't working?
NSError *error = nil;
NSMutableString *stringToCap = [NSMutableString stringWithString:@"<kaboom>stuff</kaboom>"];
NSRegularExpression *regex = [NSRegularExpression regularExpressionWithPattern:@"(</?[a-zA-Z].*?>)" options:NSRegularExpressionCaseInsensitive error:&error];
NSMutableString *modifiedString = [NSMutableString stringWithString:[regex stringByReplacingMatchesInString:stringToCap options:0 range:NSMakeRange(0, [stringToCap length]) withTemplate:@"$1".uppercaseString]];
NSLog(@"%@", modifiedString);
Produces: <kaboom>stuff</kaboom> when I expect <KABOOM>stuff</KABOOM>
stringByReplacingMatchesInString:options:range:withTemplate: doesn't work like that. The last argument is just an NSString, and the string you are passing is the result of the expression @"$1".uppercaseString, which is just @"$1" (uppercasing is applied to the template text before any matching happens).
A possible algorithm (pseudo code):
for NSTextCheckingResult *match in [regex matchesInString:... options:... range:...] do
extract the substring at match.range from modified string
uppercase it
replace the substring at match.range with uppercased result
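The pseudo code above could be written out roughly as follows. This is a sketch, not the answerer's code: the function name UppercaseTags is made up for illustration, and the matches are processed from last to first so that a replacement that changes length cannot shift the ranges still to be handled.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (an assumption for illustration): uppercase every
// tag-like substring such as <kaboom> or </kaboom>.
static NSString *UppercaseTags(NSString *input) {
    NSError *error = nil;
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:@"</?[a-zA-Z].*?>"
                                                  options:0
                                                    error:&error];
    if (regex == nil) {
        NSLog(@"Invalid pattern: %@", error);
        return input;
    }
    NSMutableString *result = [input mutableCopy];
    NSArray<NSTextCheckingResult *> *matches =
        [regex matchesInString:input
                       options:0
                         range:NSMakeRange(0, input.length)];
    // Walk backwards so earlier ranges stay valid after each replacement.
    for (NSTextCheckingResult *match in [matches reverseObjectEnumerator]) {
        NSString *tag = [result substringWithRange:match.range];
        [result replaceCharactersInRange:match.range
                              withString:tag.uppercaseString];
    }
    return result;
}
```

With the string from the question, UppercaseTags(@"<kaboom>stuff</kaboom>") should return <KABOOM>stuff</KABOOM>.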

Regex number of matches is always zero

I want to check a UITextField text with a format like "G12-123456".
The rules are simple:
The first character must be an uppercase letter.
The 2nd and 3rd must be digits.
The fourth must be the "-" character.
The last six must be digits only.
The code below does not work; the number of matches is always zero.
I also tried the regex "[A-Z0-9]{3}-[0-9]{6}".
NSString * myRegex = @"[A-Z][0-9][0-9]-[0-9][0-9][0-9][0-9][0-9][0-9]";
NSError *error = NULL;
NSRegularExpression *regex = [NSRegularExpression regularExpressionWithPattern:myRegex
options:NSRegularExpressionCaseInsensitive
error:&error];
NSUInteger numberOfMatches = [regex numberOfMatchesInString:string
options:NSMatchingReportProgress
range:NSMakeRange(0, [string length])];
The same code works with the pattern [^a-zA-Z0-9] (from: Check whether an NSString contains a special character and a digit).
Any help would be appreciated.
First of all, your code is basically supposed to work.
However, both options are inappropriate here. If you want to check for an uppercase letter you must not pass NSRegularExpressionCaseInsensitive, and NSMatchingReportProgress affects only the block-based API.
In both cases pass 0.
The pattern can be written more concisely:
NSString *myRegex = @"[A-Z]\\d{2}-\\d{6}";
NSError *error = nil;
NSRegularExpression *regex = [NSRegularExpression regularExpressionWithPattern:myRegex
options:0
error:&error];
if (regex == nil) {
// The pattern could not be compiled; error describes why.
NSLog(@"%@", error);
} else {
NSUInteger numberOfMatches = [regex numberOfMatchesInString:string
options:0
range:NSMakeRange(0, [string length])];
NSLog(@"%lu", (unsigned long)numberOfMatches);
}
If the regex must match the entire string, add the start and end anchors.
NSString *myRegex = @"^[A-Z]\\d{2}-\\d{6}$";
If numberOfMatches is still zero, please check that the hyphen character is the standard one (ASCII 45, hex 0x2D).
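Put together, the anchored pattern can be wrapped in a small validation function. This is a minimal sketch under the rules stated in the question; the name IsValidCode is hypothetical.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (an assumption for illustration): YES when the whole
// string looks like "G12-123456" (uppercase letter, two digits, an ASCII
// hyphen, six digits).
static BOOL IsValidCode(NSString *candidate) {
    NSError *error = nil;
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:@"^[A-Z]\\d{2}-\\d{6}$"
                                                  options:0
                                                    error:&error];
    if (regex == nil) {
        NSLog(@"Invalid pattern: %@", error);
        return NO;
    }
    NSRange whole = NSMakeRange(0, candidate.length);
    // One match is enough because the anchors force it to span the string.
    return [regex firstMatchInString:candidate options:0 range:whole] != nil;
}
```

For example, IsValidCode(@"G12-123456") should be YES, while a lowercase first letter or an en dash (U+2013) in place of the hyphen should give NO.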

How to delete all symbols from a string except letters and numbers?

I am trying to use the following code:
NSRegularExpression *regex = [NSRegularExpression regularExpressionWithPattern:@"[\\p{L}[0-9]]+|-" options:NSRegularExpressionCaseInsensitive error:&error];
NSString *modifiedString = [regex stringByReplacingMatchesInString:string options:0 range:NSMakeRange(0, [string length]) withTemplate:@""];
but it doesn't work. I have tried different variations of this, also without success.
Example text:
Это тестовый.!!?! ;$%###### (вопрос) номер 1256 - верно.
Example output:
Это тестовый вопрос номер 1256 - верно
Your regex is actually matching characters you want to remove, but it is corrupt and does not even do that (due to a "wild" ]).
If you need to delete all chars except letters, digits, hyphens and whitespace, use @"[^\\p{L}\\p{M}0-9\\s-]+".
Details:
[^\\p{L}\\p{M}0-9\\s-]+ - one or more characters that are NOT:
\\p{L} - Unicode letters
\\p{M} - diacritics
0-9 - ASCII digits
\\s - whitespace
- - a literal hyphen.
Objective-C demo:
NSString *text = @"Это тестовый.!!?! ;$%###### (вопрос) номер 1256 - верно";
NSError *error = nil;
NSRegularExpression *regexp = [NSRegularExpression regularExpressionWithPattern:@"[^\\p{L}\\p{M}0-9\\s-]+" options:NSRegularExpressionCaseInsensitive error:&error];
NSString *result = [regexp stringByReplacingMatchesInString:text options:0 range:NSMakeRange(0, [text length]) withTemplate:@""];
NSLog(@"%@", result);
Result: Это тестовый вопрос номер 1256 - верно
1st, get the alphanumeric character set:
NSCharacterSet *alphaSet = [NSCharacterSet alphanumericCharacterSet];
2nd, invert it to get the set of characters to use as separators:
NSCharacterSet *separatorSet = [alphaSet invertedSet];
3rd, split the old string on the separators and join the pieces back together with @"":
NSString *newString = [[oldString componentsSeparatedByCharactersInSet:separatorSet] componentsJoinedByString:@""];
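Note that the inverted alphanumeric set also treats spaces and hyphens as separators, so they are removed as well. If they should survive, as in the expected output of the question, the allowed set can be widened first. A rough sketch, with a made-up function name:

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (an assumption for illustration): keep letters,
// digits, whitespace and '-' and drop every other character.
static NSString *KeepLettersDigitsSpacesHyphens(NSString *input) {
    NSMutableCharacterSet *allowed =
        [[NSCharacterSet alphanumericCharacterSet] mutableCopy];
    [allowed formUnionWithCharacterSet:[NSCharacterSet whitespaceCharacterSet]];
    [allowed addCharactersInString:@"-"];

    // Everything outside the allowed set acts as a separator...
    NSCharacterSet *separators = [allowed invertedSet];
    // ...and joining with an empty string drops the separators.
    return [[input componentsSeparatedByCharactersInSet:separators]
        componentsJoinedByString:@""];
}
```

With this, KeepLettersDigitsSpacesHyphens(@"a.b, c-1!") should return "ab c-1". Runs of spaces left behind by removed punctuation are not collapsed.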

How can I replace one pair of character with multiple occurrence in a string?

Original String is: This is a sentence with (noun) (verb) (adverb).
The original sentence has three occurrences of (). I need the last one left intact and the rest replaced with @"".
Required String: This is a sentence with (adverb).
I can do it with NSRange, but I am looking for an NSRegularExpression pattern.
Also, which is more efficient: NSRange or NSRegularExpression?
CODE
NSRegularExpression *regex = [NSRegularExpression regularExpressionWithPattern:@"\\(.*?\\)" options:NSRegularExpressionCaseInsensitive error:NULL];
NSString *newString = [regex stringByReplacingMatchesInString:modify options:0 range:NSMakeRange(0, [modify length]) withTemplate:@""];
Output:: This is a sentence with
You can obtain the match ranges themselves and do the replacement manually, ignoring the last one.
NSMutableString* newString = [modify mutableCopy];
NSArray<NSTextCheckingResult*>* matches = [regex matchesInString:newString options:0 range:NSMakeRange(0, newString.length)];
if (matches.count >= 2)
{
// Enumerate backwards so that each replacement doesn't invalidate the other ranges
for (NSInteger i = matches.count - 2; i >= 0; i--)
{
NSTextCheckingResult* result = matches[i];
[newString replaceCharactersInRange:result.range withString:@""];
}
}
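Since the question asks for a single pattern, one possibility is a lookahead that only matches a parenthesized group when another "(" follows later in the string. This is a sketch of that idea rather than part of the answer above; the function name is hypothetical, and it assumes the groups are not nested and sit on one line.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (an assumption for illustration): remove every
// "(...)" group except the last one, together with the whitespace after it.
static NSString *RemoveAllButLastParenthesized(NSString *input) {
    NSError *error = nil;
    // \([^)]*\)   a parenthesized group without nesting
    // \s*         whitespace following it
    // (?=.*\()    only if another "(" appears later on the same line
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:@"\\([^)]*\\)\\s*(?=.*\\()"
                                                  options:0
                                                    error:&error];
    if (regex == nil) {
        NSLog(@"Invalid pattern: %@", error);
        return input;
    }
    return [regex stringByReplacingMatchesInString:input
                                           options:0
                                             range:NSMakeRange(0, input.length)
                                      withTemplate:@""];
}
```

For the sentence in the question this should yield "This is a sentence with (adverb).". No claim is made here about which approach is faster; that would need measuring.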

String Trimming with Certain keyword

I have a string like below.
<br><br><br><br><br> SomeHtmlString <br><br><br><br><br>
I want to remove the leading and trailing br tags, like a trim function, while preserving the br tags in the middle of SomeHtmlString.
Is there a function that does this concisely?
e.g.
<br><br><br>test1<br><br>test2<br><br><br><br>
to
test1<br><br>test2
Here is a method using regular expressions. It matches only one <br> at a time and removes it at either the beginning or the end of the string.
NSMutableString *replaceMe = [[NSMutableString alloc]
initWithString:@"<br><br > <br > test<br>test2<br><br>"];
NSError *error = nil;
NSRegularExpression *regex = [NSRegularExpression
regularExpressionWithPattern:@"^ *<br *> *"
options:NSRegularExpressionCaseInsensitive
error:&error];
// replaceMatchesInString: returns the number of replacements; loop until none are made.
do {
;
} while ([regex replaceMatchesInString:replaceMe options:0 range:NSMakeRange(0, replaceMe.length) withTemplate:@""] != 0);
regex = [NSRegularExpression
regularExpressionWithPattern:@" *<br *> *$"
options:NSRegularExpressionCaseInsensitive
error:&error];
do {
;
} while ([regex replaceMatchesInString:replaceMe options:0 range:NSMakeRange(0, replaceMe.length) withTemplate:@""] != 0);
NSLog(@"string=%@", replaceMe);
and that does strip "<br><br > <br > test<br>test2<br><br>" down to test<br>test2.
It's probably not the neatest solution but it is very easy to modify to match different expressions, with different whitespace, for example.
It's also possible to use the regular expressions to match several <br>s in one go:
NSRegularExpression *regex = [NSRegularExpression
regularExpressionWithPattern:@"^ *(<br *> *)+"
options:NSRegularExpressionCaseInsensitive
error:&error];
[regex replaceMatchesInString:replaceMe options:0 range:NSMakeRange(0, replaceMe.length) withTemplate:@""];
regex = [NSRegularExpression
regularExpressionWithPattern:@" *(<br *> *)+$"
options:NSRegularExpressionCaseInsensitive
error:&error];
[regex replaceMatchesInString:replaceMe options:0 range:NSMakeRange(0, replaceMe.length) withTemplate:@""];
which avoids the looping but is a little harder to modify.
You can do this:
NSString* htmlString= @"<br><br><br><br><br> SomeHtmlString <br><br><br><br><br>";
NSString* pureString= [htmlString stringByReplacingOccurrencesOfString: @"<br>" withString: @""];
So you'll have @" SomeHtmlString " in pureString.
You could use this to strip out the unwanted bits:
[yourString stringByReplacingOccurrencesOfString:@"<br>" withString:@""];
Then you would use something like this to remake your string the way you want it:
NSString *newString = [NSString stringWithFormat:@"<br>%@<br>", yourString];
You might also want to look at stringByTrimmingCharactersInSet:
There are so many things you can do with NSString. Check out the Class Reference: https://developer.apple.com/library/mac/#documentation/Cocoa/Reference/Foundation/Classes/NSString_Class/Reference/NSString.html
EDIT:
substringToIndex: could be your friend here. You can do this to find out if the first 4 characters of your string consist of the characters you want to remove:
NSString *subString = [yourString substringToIndex:4];
if ([subString isEqualToString:@"<br>"]) {
yourString = [yourString substringFromIndex:4];
}
This creates a new string without those 4 characters. You keep doing this until the first 4 characters are no longer equal to the ones you want to remove.
You can do something similar at the end of your string using substringFromIndex. You will need to know the length of your original string to make sure none of your substrings go out of bounds.
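The repeated check at both ends can be sketched with hasPrefix: and hasSuffix:, which avoid the out-of-bounds concern because they simply return NO when the string is shorter than the tag. This swaps the substringToIndex: comparison for those two methods; the function name TrimTag is made up for illustration.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (an assumption for illustration): strip every leading
// and trailing occurrence of `tag`, leaving occurrences in the middle alone.
static NSString *TrimTag(NSString *input, NSString *tag) {
    if (tag.length == 0) {
        return input; // nothing to strip; also avoids an endless loop
    }
    NSString *result = input;
    // Drop the tag from the front for as long as the string starts with it.
    while ([result hasPrefix:tag]) {
        result = [result substringFromIndex:tag.length];
    }
    // Drop the tag from the back for as long as the string ends with it.
    while ([result hasSuffix:tag]) {
        result = [result substringToIndex:result.length - tag.length];
    }
    return result;
}
```

For the example in the question, TrimTag(@"<br><br><br>test1<br><br>test2<br><br><br><br>", @"<br>") should return test1<br><br>test2. Unlike the regular expression versions, this does not tolerate spaces inside or between the tags.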
Alternative regular expression rendition:
NSString *input = @"<br><br><br><br><br><br>test<br>test2<br><br><br><br><br><br><br><br><br><br>";
__block NSString *output;
NSError *error;
NSRegularExpression *regex = [NSRegularExpression regularExpressionWithPattern:@"^(<br>)*(.*?)(<br>)*$"
options:NSRegularExpressionCaseInsensitive
error:&error];
[regex enumerateMatchesInString:input
options:0
range:NSMakeRange(0, [input length])
usingBlock:^(NSTextCheckingResult *result, NSMatchingFlags flags, BOOL *stop) {
NSRange matchRange = [result rangeAtIndex:2];
output = [input substringWithRange:matchRange];
}];
if (output)
NSLog(@"Found: %@", output);