How can I replace multiple occurrences of a pair of characters in a string? - objective-c

The original string is: This is a sentence with (noun) (verb) (adverb).
The original sentence has three occurrences of (...). I need to keep the last one intact and replace the rest with @"".
Required string: This is a sentence with (adverb).
I can do it with NSRange, but I am looking for an NSRegularExpression pattern.
Also, which is more efficient: the NSRange approach or NSRegularExpression?
CODE
NSRegularExpression *regex = [NSRegularExpression regularExpressionWithPattern:@"\\(.*?\\)" options:NSRegularExpressionCaseInsensitive error:NULL];
NSString *newString = [regex stringByReplacingMatchesInString:modify options:0 range:NSMakeRange(0, [modify length]) withTemplate:@""];
Output: This is a sentence with

You can obtain the match ranges themselves and do the replacement manually, ignoring the last one.
NSMutableString* newString = [modify mutableCopy];
NSArray<NSTextCheckingResult*>* matches = [regex matchesInString:newString options:0 range:NSMakeRange(0, newString.length)];
if (matches.count >= 2)
{
    // Enumerate backwards so that each replacement doesn't invalidate the other ranges
    for (NSInteger i = (NSInteger)matches.count - 2; i >= 0; i--)
    {
        NSTextCheckingResult* result = matches[i];
        [newString replaceCharactersInRange:result.range withString:@""];
    }
}
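As a rough, self-contained sketch of the same idea: the helper name `RemoveAllButLastMatch` is made up for illustration, and the pattern suggested in the usage note is an assumption, not part of the answer above. It also swallows the whitespace after each parenthesised group so that no double spaces are left behind.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: deletes every match of `pattern` in `input`
// except the last one.
static NSString *RemoveAllButLastMatch(NSString *input, NSString *pattern) {
    NSError *error = nil;
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:pattern
                                                  options:0
                                                    error:&error];
    if (regex == nil) {
        return input; // invalid pattern: leave the input untouched
    }
    NSMutableString *result = [input mutableCopy];
    NSArray<NSTextCheckingResult *> *matches =
        [regex matchesInString:input options:0 range:NSMakeRange(0, input.length)];
    // Start at the second-to-last match and walk backwards so that the
    // ranges of the matches still to be processed stay valid.
    for (NSInteger i = (NSInteger)matches.count - 2; i >= 0; i--) {
        [result replaceCharactersInRange:matches[i].range withString:@""];
    }
    return [result copy];
}
```

Called with the question's sentence and the assumed pattern @"\\([^)]*\\)\\s*", this should give "This is a sentence with (adverb).".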

Related

Objective C - Split string into array

How would I do this? I'm new to Objective-C but I can't find anything that would help me do this.
NSArray *splitLine = [currentLine componentsSeparatedByString:@":%@", notNumber];
Where notNumber is a string that represents anything that isn't a number. So I want to split a string at colons that are surrounded by characters that aren't numbers. (I want to avoid splitting at times, e.g. 3:00pm, but split at iCal parameters like DESCRIPTION: and LOCATION:.)
You can do this in several steps, like this. I have not compiled this code, but it should at least give you an idea of what to do.
1) Create a regex object to match your separators:
NSString *regexString = @"DESCRIPTION:\\s|LOCATION:\\s"; // or whatever makes sense for your scenario
NSRegularExpression *regex =
    [NSRegularExpression regularExpressionWithPattern:regexString
                                              options:NSRegularExpressionCaseInsensitive
                                                error:nil];
2) Replace all the different separators matching your regex with just one separator:
NSRange range = NSMakeRange(0, string.length);
NSString *string2 = [regex stringByReplacingMatchesInString:string
                                                    options:0
                                                      range:range
                                               withTemplate:@"SEPARATOR"];
3) Split the string!
NSArray *elements = [string2 componentsSeparatedByString:@"SEPARATOR"];
Shortest solution for splitting string.
NSString *str = @"Please split me to form array of words";
NSArray *wordsArray = [str componentsSeparatedByString:@" "];
You can use regular expressions!
Using the pattern (I believe this is the core of your question):
NSString *pattern = @"(?<=[^0-9]):(?=[^0-9])";
This pattern matches only ':' characters that have a non-digit character on both sides.
Then replace each match with a dummy value that won't show up in your data:
NSString *dummy = @"NEVERSEETHIS";
NSRegularExpression *regex = [NSRegularExpression regularExpressionWithPattern:pattern options:0 error:nil];
NSRange range = NSMakeRange(0, [yourString length]);
NSString *modified = [regex stringByReplacingMatchesInString:yourString options:0 range:range withTemplate:dummy];
and finally, split:
return [modified componentsSeparatedByString:dummy];
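Put together, the replace-then-split approach might look like the sketch below. The function name `SplitOnNonNumericColons` is hypothetical, and the placeholder string is assumed never to occur in the real data.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: splits `input` at every ':' that has a non-digit
// on both sides, by first swapping those colons for a placeholder.
static NSArray<NSString *> *SplitOnNonNumericColons(NSString *input) {
    NSString *pattern = @"(?<=[^0-9]):(?=[^0-9])";
    NSString *dummy = @"NEVERSEETHIS"; // assumed never to appear in the data
    NSError *error = nil;
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:pattern
                                                  options:0
                                                    error:&error];
    if (regex == nil) {
        return @[input]; // invalid pattern: return the input unsplit
    }
    NSString *modified =
        [regex stringByReplacingMatchesInString:input
                                        options:0
                                          range:NSMakeRange(0, input.length)
                                   withTemplate:dummy];
    return [modified componentsSeparatedByString:dummy];
}
```

For an input such as @"LOCATION: Cafe at 3:00pm" this should split only after LOCATION and leave the time intact, because the colon in 3:00pm has a digit in front of it.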

Objective-C, regular expression match repetition

I found a problem in regular expression to match all group repetition.
This is a simple example:
NSString *string = @"A1BA2BA3BC";
NSString *pattern = @"(A[^AB]+B)+C";
NSError *error = nil;
NSRegularExpression *regex = [NSRegularExpression regularExpressionWithPattern:pattern options:0 error:&error];
NSArray *array = [regex matchesInString:string options:0 range:NSMakeRange(0, [string length])];
The returned array has one element, which contains two ranges: the whole input string and the last captured group, "A3B". The first two groups, "A1B" and "A2B", are not captured as I expected.
I've tried everything from greedy to lazy matching.
A Quantifier Does not Spawn New Capture Groups
Except in .NET, which has CaptureCollections, adding a quantifier to a capture group does not create more captures. The group number stays the same (in your case, Group 1), and the content returned is the last capture of the group.
Reference
Everything about Regex Capture Groups (see Generating New Capture Groups Automatically)
Iterating the Groups
If you wanted to match all the substrings while still validating that they are in a valid string (composed of such groups and ending in C), you could use:
A[^AB]+B(?=(?:A[^AB]+B)*C)
The whole string, of course, would be
^(?:A[^AB]+B)+C$
To iterate the substrings: something like
NSError *error = NULL;
NSRegularExpression *regex = [NSRegularExpression regularExpressionWithPattern:@"A[^AB]+B(?=(?:A[^AB]+B)*C)" options:0 error:&error];
NSArray *matches = [regex matchesInString:subject options:0 range:NSMakeRange(0, [subject length])];
NSUInteger matchCount = [matches count];
if (matchCount) {
    for (NSUInteger matchIdx = 0; matchIdx < matchCount; matchIdx++) {
        NSTextCheckingResult *match = [matches objectAtIndex:matchIdx];
        NSRange matchRange = [match range];
        NSString *result = [subject substringWithRange:matchRange];
    }
}
else { // Nah... No matches.
}
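A compact way to collect those substrings into an array, as a sketch only: `RepeatedGroups` is a hypothetical name, and the pattern is the lookahead pattern from the answer above.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: returns each "A...B" chunk that is followed
// (possibly after further chunks) by the closing "C".
static NSArray<NSString *> *RepeatedGroups(NSString *subject) {
    NSMutableArray<NSString *> *groups = [NSMutableArray array];
    NSError *error = nil;
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:@"A[^AB]+B(?=(?:A[^AB]+B)*C)"
                                                  options:0
                                                    error:&error];
    if (regex == nil) {
        return groups; // invalid pattern: no results
    }
    NSArray<NSTextCheckingResult *> *matches =
        [regex matchesInString:subject options:0 range:NSMakeRange(0, subject.length)];
    // Every match is one repetition; its whole range is the chunk we want.
    for (NSTextCheckingResult *match in matches) {
        [groups addObject:[subject substringWithRange:match.range]];
    }
    return groups;
}
```

For the question's input @"A1BA2BA3BC" this should yield the three chunks A1B, A2B and A3B, one per match rather than one per capture group.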

RegEx (replaceMatchesInString) does not work

Why does this RegEx with replaceMatchesInString return only "+" instead of "+123"?
NSMutableString *phoneNumberCleaned = [NSMutableString stringWithString:@"++00123"];
NSString *strRegExPhoneNumberPrefixWrong = @"^([+0]*)\\d*$";
NSRegularExpression *regEx = [NSRegularExpression regularExpressionWithPattern:strRegExPhoneNumberPrefixWrong options:0 error:nil];
[regEx replaceMatchesInString:phoneNumberCleaned options:0 range:NSMakeRange(0, [phoneNumberCleaned length]) withTemplate:@"+"];
return phoneNumberCleaned;
Thanks
NSString *string = @"++00123";
NSError *error = nil;
NSRegularExpression *regex = [NSRegularExpression
    regularExpressionWithPattern:@"^[+0]+(?=\\d*)"
    options:NSRegularExpressionCaseInsensitive
    error:&error];
NSString *modifiedString = [regex
    stringByReplacingMatchesInString:string
    options:0
    range:NSMakeRange(0, [string length])
    withTemplate:@"+"];
return modifiedString;
The problem with your regex is that ^([+0]*)\\d*$ also matches the trailing \d*, so those digits get replaced too. The replacement applies to the entire match, not just to the capture group. You were therefore replacing everything that matched the pattern, trailing digits included, which in your case was the entire number.
What I used in my answer is called a positive lookahead.
^[+0]+(?=\\d*)
The lookahead means that you match one or more + or 0 characters that are followed by zero or more digits, EXCLUDING the digits from the match. So you only replace the zeroes and pluses, not the digits following them.
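Since the template replaces the whole match, another option is to capture the digits and put them back with $1. This is a sketch of that alternative, not the code from the answer; `NormalizePhonePrefix` is a hypothetical name.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: collapses any run of leading '+' and '0'
// characters into a single '+', keeping the digits that follow.
static NSString *NormalizePhonePrefix(NSString *number) {
    NSError *error = nil;
    // Group 1 holds the digits after the prefix; the prefix is not captured.
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:@"^[+0]*(\\d*)$"
                                                  options:0
                                                    error:&error];
    if (regex == nil) {
        return number; // invalid pattern: leave the input untouched
    }
    // "$1" in the template re-inserts the captured digits after the '+'.
    return [regex stringByReplacingMatchesInString:number
                                           options:0
                                             range:NSMakeRange(0, number.length)
                                      withTemplate:@"+$1"];
}
```

With @"++00123" as input this should return "+123". Strings that do not match the pattern at all are returned unchanged.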

String Trimming with Certain keyword

I have a string like below.
<br><br><br><br><br> SomeHtmlString <br><br><br><br><br>
I want to remove br tags like trim function preserving middle br tags in SomeHtmlString.
Is there any function to do this shortly?
e.g.
<br><br><br>test1<br><br>test2<br><br><br><br>
to
test1<br><br>test2
Here is a method using regular expressions. It matches only one <br> at a time and replaces it at either the beginning or the end of the string.
NSMutableString *replaceMe = [[NSMutableString alloc]
    initWithString:@"<br><br > <br > test<br>test2<br><br>"];
NSError *error = NULL;
NSRegularExpression *regex = [NSRegularExpression
    regularExpressionWithPattern:@"^ *<br *> *"
    options:NSRegularExpressionCaseInsensitive
    error:&error];
// Remove one leading <br> per pass until nothing is replaced any more
do {
    ;
} while ([regex replaceMatchesInString:replaceMe options:0 range:NSMakeRange(0, replaceMe.length) withTemplate:@""] != 0);
regex = [NSRegularExpression
    regularExpressionWithPattern:@" *<br *> *$"
    options:NSRegularExpressionCaseInsensitive
    error:&error];
// Same again for trailing <br> tags
do {
    ;
} while ([regex replaceMatchesInString:replaceMe options:0 range:NSMakeRange(0, replaceMe.length) withTemplate:@""] != 0);
NSLog(@"string=%@", replaceMe);
and that does strip "<br><br > <br > test<br>test2<br><br>" down to test<br>test2.
It's probably not the neatest solution but it is very easy to modify to match different expressions, with different whitespace, for example.
It's also possible to use the regular expressions to match several <br>s in one go:
NSRegularExpression *regex = [NSRegularExpression
    regularExpressionWithPattern:@"^ *(<br *> *)+"
    options:NSRegularExpressionCaseInsensitive
    error:&error];
[regex replaceMatchesInString:replaceMe options:0 range:NSMakeRange(0, replaceMe.length) withTemplate:@""];
regex = [NSRegularExpression
    regularExpressionWithPattern:@" *(<br *> *)+$"
    options:NSRegularExpressionCaseInsensitive
    error:&error];
[regex replaceMatchesInString:replaceMe options:0 range:NSMakeRange(0, replaceMe.length) withTemplate:@""];
which avoids the looping but is a little harder to modify.
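Both ends can also be handled in one pass with an alternation. The following is a sketch under the assumption that any whitespace around or inside the tags should go as well; `TrimBreakTags` is a hypothetical name.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: strips leading and trailing <br> tags (and the
// whitespace around them) while leaving <br> tags in the middle alone.
static NSString *TrimBreakTags(NSString *html) {
    NSError *error = nil;
    // First alternative: a run of <br> tags anchored at the start.
    // Second alternative: a run of <br> tags anchored at the end.
    NSString *pattern = @"^(?:\\s*<br\\s*>)+\\s*|\\s*(?:<br\\s*>\\s*)+$";
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:pattern
                                                  options:NSRegularExpressionCaseInsensitive
                                                    error:&error];
    if (regex == nil) {
        return html; // invalid pattern: leave the input untouched
    }
    return [regex stringByReplacingMatchesInString:html
                                           options:0
                                             range:NSMakeRange(0, html.length)
                                      withTemplate:@""];
}
```

For the question's example, @"<br><br><br>test1<br><br>test2<br><br><br><br>", this should return test1<br><br>test2.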
You can do this:
NSString* htmlString = @"<br><br><br><br><br> SomeHtmlString <br><br><br><br><br>";
NSString* pureString = [htmlString stringByReplacingOccurrencesOfString:@"<br>" withString:@""];
So you'll have @" SomeHtmlString " in pureString. Note that this removes every <br>, including any inside SomeHtmlString.
You could use this to strip out the unwanted bits:
yourString = [yourString stringByReplacingOccurrencesOfString:@"<br>" withString:@""];
Then you would use something like this to remake your string the way you want it:
NSString *newString = [NSString stringWithFormat:@"<br>%@<br>", yourString];
You might also want to look at stringByTrimmingCharactersInSet:
There are so many things you can do with NSString. Check out the Class Reference: https://developer.apple.com/library/mac/#documentation/Cocoa/Reference/Foundation/Classes/NSString_Class/Reference/NSString.html
EDIT:
substringToIndex: could be your friend here. You can do this to find out if the first 4 characters of your string consist of the characters you want to remove:
NSString *subString = [yourString substringToIndex:4];
if ([subString isEqualToString:@"<br>"]) {
    yourString = [yourString substringFromIndex:4];
}
Then you are creating a new string without those 4 characters. You keep doing this until the first 4 characters are not equal to the ones you want to remove.
You can do something similar at the end of your string using substringFromIndex. You will need to know the length of your original string to make sure none of your substrings go out of bounds.
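The repeated check can be written as two loops. This sketch swaps substringToIndex: for hasPrefix: and hasSuffix:, which sidesteps the bounds problem because they simply return NO for strings that are too short. `StripToken` is a hypothetical name.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: removes every copy of `token` from the start and
// the end of `input`, leaving occurrences in the middle untouched.
static NSString *StripToken(NSString *input, NSString *token) {
    if (token.length == 0) {
        return input; // nothing to strip
    }
    NSString *result = input;
    // Drop leading copies of the token one at a time.
    while ([result hasPrefix:token]) {
        result = [result substringFromIndex:token.length];
    }
    // Drop trailing copies of the token one at a time.
    while ([result hasSuffix:token]) {
        result = [result substringToIndex:result.length - token.length];
    }
    return result;
}
```

StripToken(@"<br><br>test1<br><br>test2<br>", @"<br>") should return test1<br><br>test2. Unlike the regular expression versions, it only recognises the exact token, so variants such as <br > are not removed.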
Alternative regular expression rendition:
NSString *input = @"<br><br><br><br><br><br>test<br>test2<br><br><br><br><br><br><br><br><br><br>";
__block NSString *output;
NSError *error;
NSRegularExpression *regex = [NSRegularExpression regularExpressionWithPattern:@"^(<br>)*(.*?)(<br>)*$"
                                                                       options:NSRegularExpressionCaseInsensitive
                                                                         error:&error];
[regex enumerateMatchesInString:input
                        options:0
                          range:NSMakeRange(0, [input length])
                     usingBlock:^(NSTextCheckingResult *result, NSMatchingFlags flags, BOOL *stop) {
    NSRange matchRange = [result rangeAtIndex:2];
    output = [input substringWithRange:matchRange];
}];
if (output)
    NSLog(@"Found: %@", output);

Objective-C NSRegularExpressions, finding first occurrence of numbers in a string

I'm pretty green at regex with Objective-C. I'm having some difficulty with it.
NSRegularExpression *regex = [NSRegularExpression regularExpressionWithPattern:@"\\b([1-9]+)\\b" options:NSRegularExpressionCaseInsensitive error:&regError];
if (regError) {
    NSLog(@"%@",regError.localizedDescription);
}
__block NSString *foundModel = nil;
[regex enumerateMatchesInString:self.model options:kNilOptions range:NSMakeRange(0, [self.model length]) usingBlock:^(NSTextCheckingResult *match, NSMatchingFlags flags, BOOL *stop) {
    foundModel = [self.model substringWithRange:[match rangeAtIndex:0]];
    *stop = YES;
}];
All I'm looking to do is take a string like
150A
And get
150
First the problems with the regex:
You are using word boundaries (\b) on both sides, which means you are only looking for a number that stands by itself (e.g. 15 but not 150A).
Your character class does not include 0, so it would not capture 150. It needs to be [0-9]+ or, better yet, \d+.
So to fix this, if you want to capture any number, all you need is \d+. If you want to capture anything that starts with a number, put the word boundary only at the beginning: \b\d+.
Now to get the first occurrence you can use -[regex rangeOfFirstMatchInString:options:range:]
NSRegularExpression *regex = [NSRegularExpression regularExpressionWithPattern:@"\\b\\d+" options:NSRegularExpressionCaseInsensitive error:&regError];
if (regError) {
    NSLog(@"%@",regError.localizedDescription);
}
NSString *model = @"150A";
NSString *foundModel = nil;
NSRange range = [regex rangeOfFirstMatchInString:model options:kNilOptions range:NSMakeRange(0, [model length])];
if(range.location != NSNotFound)
{
    foundModel = [model substringWithRange:range];
}
NSLog(@"Model: %@", foundModel);
What about .*?(\d+).*? ?
That would capture the number in group 1, and you would be able to use it wherever you want.
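Reading that capture group back out could look roughly like this. It is a sketch only: `FirstNumber` is a hypothetical name, and it uses firstMatchInString:options:range: together with rangeAtIndex: to get at group 1.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: returns the first run of digits in `input`,
// or nil if the string contains no digits.
static NSString *FirstNumber(NSString *input) {
    NSError *error = nil;
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:@".*?(\\d+).*?"
                                                  options:0
                                                    error:&error];
    NSTextCheckingResult *match =
        [regex firstMatchInString:input options:0 range:NSMakeRange(0, input.length)];
    if (match == nil) {
        return nil; // no digits found (or the pattern failed to compile)
    }
    // Index 0 is the whole match; index 1 is the (\d+) capture group.
    return [input substringWithRange:[match rangeAtIndex:1]];
}
```

FirstNumber(@"150A") should return "150". The leading and trailing .*? add nothing over a plain (\d+) here, but they are kept to mirror the suggested pattern.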