replace matches in NSString with template using NSRegularExpression - objective-c

I'm trying to detect <br>, <Br>, < br> and similar variants in an NSString and replace each of them with \n.
I'm using NSRegularExpression and wrote this code:
NSString *string = @"123 < br><br>1245; Ross <Br>Test 12<br>";
NSError *error = NULL;
NSRegularExpression *regex = [NSRegularExpression regularExpressionWithPattern:@"<[* ](br|BR|bR|Br|br)>" options:NSRegularExpressionCaseInsensitive error:&error];
NSString *modifiedString = [regex stringByReplacingMatchesInString:string options:0 range:NSMakeRange(0, [string length]) withTemplate:@"\n"];
NSLog(@"%@", modifiedString);
It works, but it replaces only the first match instead of all of them. Please help me detect all matches and replace them.
Thanks

You currently don't handle an arbitrary amount of whitespace. For good measure you should also handle whitespace after br, and the closing slash, since <br /> is also a valid way of writing the line break in HTML (and the required form in XHTML).
You would end up with a pattern that looks like this
<\s*(br|BR|bR|Br|br)\s*\/*>
or written as an NSRegularExpression
NSError *error = NULL;
NSRegularExpression *regex =
    [NSRegularExpression regularExpressionWithPattern:@"<\\s*(br|BR|bR|Br|br)\\s*\\/*>"
                                              options:0
                                                error:&error];
Edit
You could also make the pattern more compact by separating the two letters
<\s*([bB][rR])\s*\/*>

You're close. You need to handle any number of spaces after the initial <, including none at all.
Using your example, the regex <\s*(br|BR|bR|Br|br)> accepts 0 to N spaces before the br. You can simplify it further by making it case-insensitive with the (?i) flag, which gives a cleaner regex that handles all the case variations of br. To do that, use (?i)<\s*br>.
For completeness, you can also allow an arbitrary amount of space after the br, to handle anything that could be thrown at it. I agree with adding some sort of catch for a /> ending, since <br/> is valid HTML as well. It makes the regex look a little crazier, but it boils down to adding three more pieces.
(?i)<\s*br\s*\/?\s*>
It looks scary, but it breaks down into a few simple parts:
(?i) turns on case-insensitive matching to handle the case variations of br.
<\s* is the start of the tag, directly followed by an arbitrary number of spaces.
br\s* is the br characters followed by an arbitrary number of spaces.
\/? handles 0 or 1 instances of the closing slash (to accept both <br/> and <br>).
\s*> handles an arbitrary number of spaces and then the closing >.
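Putting the pieces together, here is a minimal sketch of the replacement, assuming the (?i)<\s*br\s*\/?\s*> pattern above. The helper name ReplaceBreakTags is made up for illustration. stringByReplacingMatchesInString: replaces every match in the given range, so no loop is needed.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (name is an assumption): replaces every <br> variant
// in `input` with a newline character.
static NSString *ReplaceBreakTags(NSString *input) {
    NSError *error = nil;
    // (?i) inside the pattern has the same effect as passing
    // NSRegularExpressionCaseInsensitive in the options.
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:@"(?i)<\\s*br\\s*\\/?\\s*>"
                                                  options:0
                                                    error:&error];
    if (regex == nil) {
        NSLog(@"Could not compile pattern: %@", error);
        return input;
    }
    // Replaces all matches in the range, not just the first one.
    return [regex stringByReplacingMatchesInString:input
                                           options:0
                                             range:NSMakeRange(0, input.length)
                                      withTemplate:@"\n"];
}
```

Calling ReplaceBreakTags(@"123 < br><br>1245") should turn both tags into newlines.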

Related

NSRegularExpression matching and replacing with exclude

I'm working on a small iOS app and got stuck creating a pattern for the NSRegularExpression class. I need a pattern that finds a particular word so I can replace it, but that skips the word when it has already been replaced by an earlier pass. That way, if the user processes the same text several times, the replacement happens only once.
Example:
I need to find and replace all "yes" in any given text with "probably yes". But I need to exclude the "yes" in "probably yes" in case the user processes the text one more time, so it won't end up as "probably probably yes".
NSRegularExpression *regexYesReplace = [NSRegularExpression regularExpressionWithPattern:@"some pattern" options:0 error:&error];
NSString *replacementStringYesReplace = @"probably yes";
replacedText = [regexYesReplace stringByReplacingMatchesInString:afterText options:options range:range withTemplate:replacementStringYesReplace];
I tried to implement the pattern from this question and fixed the syntax for NSRegularExpression, but it didn't work out.
Regex replace text but exclude when text is between specific tag
Maybe someone has had the same problem. Thanks in advance.
You can use a negative lookbehind:
(?<!probably )yes
Regex Demo
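As a rough sketch of how the lookbehind could be wired into NSRegularExpression (the helper name ReplaceYesOnce is an assumption, not from the question):

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: replaces "yes" with "probably yes" unless the match
// is already preceded by "probably ".
static NSString *ReplaceYesOnce(NSString *text) {
    NSError *error = nil;
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:@"(?<!probably )yes"
                                                  options:0
                                                    error:&error];
    if (regex == nil) {
        NSLog(@"Could not compile pattern: %@", error);
        return text;
    }
    return [regex stringByReplacingMatchesInString:text
                                           options:0
                                             range:NSMakeRange(0, text.length)
                                      withTemplate:@"probably yes"];
}
```

Running the helper on its own output should leave the text unchanged, which is the behaviour asked for. Note that the pattern as written also matches "yes" inside longer words such as "eyes"; wrapping it in \b word boundaries would restrict it to whole words.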

User input validation using regex in Cocoa for osx

I've just run into the problem of not being able to write a regex that satisfies my needs. I've never used regex before, so that wasn't a big surprise.
Here are the good input examples that I'm trying to validate:
01
00:01
01:02:03
01:02:03:04 - max 3 colons possible
123,456,789 - max two commas possible
40000035
1:2:06
here are the bad ones:
,3234
134,2343,333
000:01:00
:01:03:00
:01
01:
First I tried to write one that would cover at least all the colon cases and here is what I have:
NSUInteger numberOfMatches = -1;
NSError *error = NULL;
NSRegularExpression *regex = [NSRegularExpression regularExpressionWithPattern:@"([([0-9]{1,2}):]{1,3})([0-9]{1,2})"
                                                                       options:NSRegularExpressionCaseInsensitive
                                                                         error:&error];
if (error) {
    NSLog(@"error %@", error);
}
numberOfMatches = [regex numberOfMatchesInString:stringToValidate
                                         options:0
                                           range:NSMakeRange(0, [stringToValidate length])];
I'm also curious whether NSRegularExpression has any methods to extract the validated substrings into some sort of array.
For example:
12:23:63 --- would be validated and the array would contain [12,23,63]
234 --- would be validated and the array would contain [234]
123,235,653 --- would be validated and the array would contain [123235653]
Basically, colon-separated input would be returned as separate values and everything else as one value.
To achieve what you want, I suggest you use one regex per case for validation.
Validation
Colons:
^\d{1,2}(?::\d{1,2}){0,2}(?::\d{2})?$
Commas:
^\d{3}(?:,\d{3}){0,2}$
Just the number:
^\d+$
Then assuming you pass validation of each case, use something else for data extraction.
Data Extraction
Colons
Simply do a string split on the colons.
Commas:
Remove all occurrences of , in the string.
Just the number:
Well, you have what you need here, so no need to make any changes.
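The validate-then-extract approach above can be sketched as follows. The function names MatchesPattern and ExtractValues are made up for illustration, and returning nil for invalid input is an assumption about how a caller would want failures reported.

```objc
#import <Foundation/Foundation.h>

// Returns YES when `s` matches the (anchored) pattern.
static BOOL MatchesPattern(NSString *s, NSString *pattern) {
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:pattern
                                                  options:0
                                                    error:NULL];
    return regex != nil &&
        [regex numberOfMatchesInString:s
                               options:0
                                 range:NSMakeRange(0, s.length)] > 0;
}

// Hypothetical helper: validates the input against the three patterns and
// returns the extracted values, or nil when the input is invalid.
static NSArray<NSString *> *ExtractValues(NSString *input) {
    if (MatchesPattern(input, @"^\\d+$")) {
        // Plain number: nothing to change.
        return @[input];
    }
    if (MatchesPattern(input, @"^\\d{1,2}(?::\\d{1,2}){0,2}(?::\\d{2})?$")) {
        // Colons: split into separate values.
        return [input componentsSeparatedByString:@":"];
    }
    if (MatchesPattern(input, @"^\\d{3}(?:,\\d{3}){0,2}$")) {
        // Commas: strip them and return a single value.
        return @[[input stringByReplacingOccurrencesOfString:@","
                                                   withString:@""]];
    }
    return nil;
}
```

A caller would check the result for nil to detect invalid input before using the array.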

iOS Determine if two unicode characters are actually one letter in another language and put in tableview index

In a UITableView's index scroller (the scroller on the right side containing the characters for each section), how do I display a mix of English characters and, say, Japanese characters? Is there a way to grab the first character of an NSString and then check whether it's actually part of an é or something similar (since é can be two Unicode characters: e plus a combining accent)? Any code snippets would be very helpful. When I just take the first character, it ends up displaying random characters like "=" or "~" instead of the Japanese character.
Thanks!
NOTE: I'm not using UILocalizedIndexedCollation because I am using Core Data's NSFetchedResultsController. In many places online I've read that you can't really use both.
EDIT: I can get the character now, however the tableview index doesn't seem to render them properly. Does anyone have something like Japanese characters displaying in the tableview index?
The most solid way is to use the NSString methods that are sensitive to these characters. You would probably be interested in the WWDC2011 - Session 128 - Advanced Text Processing video, which talks extensively about just this subject. Pay attention to the part about "Composed Character Sequences".
Based on the information presented there you could probably do something like this:
#warning I haven't tested this thoroughly
NSString *string = @"Hello";
__block NSString *firstCharacterSequence = nil;
[string enumerateSubstringsInRange:NSMakeRange(0, string.length)
                           options:NSStringEnumerationByComposedCharacterSequences
                        usingBlock:^(NSString *substring, NSRange substringRange, NSRange enclosingRange, BOOL *stop){
    firstCharacterSequence = substring;
    *stop = YES;
}];
NSLog(@"%@", firstCharacterSequence);
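A possible alternative, not taken from the answer above, is NSString's rangeOfComposedCharacterSequenceAtIndex:, which returns the range of the whole composed sequence at a given index. The sketch below uses a made-up helper name, FirstComposedCharacter, and guards against the empty string because index 0 would be out of bounds there.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: returns the first composed character sequence of
// `string`, or nil when the string is empty.
static NSString *FirstComposedCharacter(NSString *string) {
    if (string.length == 0) {
        return nil;
    }
    // The range covers the base character plus any combining marks
    // (or both halves of a surrogate pair).
    NSRange range = [string rangeOfComposedCharacterSequenceAtIndex:0];
    return [string substringWithRange:range];
}
```

For a string that starts with e followed by a combining acute accent (U+0301), the returned substring has length 2 rather than 1, so the accent stays attached to its base letter.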

Use regex to evaluate string for repeated section

I haven't used regular expressions yet in Objective-C. What I'm trying to do right now is evaluate a string to see if it contains a 4 or 5 character repeating pattern. Any pattern will do, it doesn't matter which. For instance, a string like @"testA54RqA54Rq" would return a true value from the regex, while a string like @"testA54Rq" would not. Right now I'm just generating all possible 4 and 5 character substrings and matching them against each other, but obviously this is extremely inefficient. Where can I find some resources about how to start using regular expressions in Objective-C? If anyone's been in this situation before, a small example would be nice.
-EDIT-
I would also like to have something like @"testQWEr30BKRe40" return true (a pattern of 4 letters followed by 2 numbers). I'm not sure if this is possible.
You probably want to look at:
https://developer.apple.com/library/ios/#documentation/Foundation/Reference/NSRegularExpression_Class/Reference/Reference.html
The actual regex, I believe, would just be: (\\w{4,5})\\1
NSString *regexStr = @"(\\w{4,5})\\1";
NSError *error = nil;
NSRegularExpression *regex = [NSRegularExpression regularExpressionWithPattern:regexStr options:0 error:&error];
if (regex == nil) {
    // Compilation failed; error describes why.
    NSLog(@"Regex failed for: %@, error was: %@", regexStr, error);
} else {
    // The regex compiled; use it to test the input string.
}
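To show how the compiled regex might be used for the actual check, here is a small sketch. The helper name HasRepeatedSection is an assumption; it relies on rangeOfFirstMatchInString:options:range: returning a location of NSNotFound when nothing matches.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: YES when `input` contains a 4 or 5 character
// sequence immediately followed by an exact copy of itself.
static BOOL HasRepeatedSection(NSString *input) {
    NSError *error = nil;
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:@"(\\w{4,5})\\1"
                                                  options:0
                                                    error:&error];
    if (regex == nil) {
        NSLog(@"Could not compile pattern: %@", error);
        return NO;
    }
    NSRange first = [regex rangeOfFirstMatchInString:input
                                             options:0
                                               range:NSMakeRange(0, input.length)];
    return first.location != NSNotFound;
}
```

Because \\1 must follow the captured group directly, this only detects repeats that are adjacent, as in the @"testA54RqA54Rq" example.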
For exact patterns you can do this validation with the regex (.{4,5})\\1
If you want to match a category pattern, such as 4 letters followed by 2 numbers, then you have to:
replace all letters with one constant letter (for example, replace [a-zA-Z] with X)
replace all numbers with one constant digit (for example, replace \\d with 0)
validate the modified input with the same regex as shown above
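The three steps above can be sketched like this, using NSString's regex-based search option for the replacements. The helper names NormalizeCategories and HasRepeatedCategorySection are made up for illustration.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: maps every ASCII letter to X and every digit to 0.
static NSString *NormalizeCategories(NSString *input) {
    NSString *result =
        [input stringByReplacingOccurrencesOfString:@"[a-zA-Z]"
                                         withString:@"X"
                                            options:NSRegularExpressionSearch
                                              range:NSMakeRange(0, input.length)];
    return [result stringByReplacingOccurrencesOfString:@"\\d"
                                             withString:@"0"
                                                options:NSRegularExpressionSearch
                                                  range:NSMakeRange(0, result.length)];
}

// Hypothetical helper: applies the exact-repeat regex to the normalized text.
static BOOL HasRepeatedCategorySection(NSString *input) {
    NSString *normalized = NormalizeCategories(input);
    NSRange match = [normalized rangeOfString:@"(.{4,5})\\1"
                                      options:NSRegularExpressionSearch];
    return match.location != NSNotFound;
}
```

One caveat with this approach: after normalization, any run of eight or more letters (or digits) also counts as a repeat, so the check is looser than "4 letters followed by 2 numbers, twice". A stricter check would need a pattern written for that specific shape.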

Regex - excluding a word from a search

I've tried the following to exclude the words 'and' and 'the' from a regex search but it doesn't seem to be working. Any idea what I'm doing wrong?
NSString *pattern = [NSString stringWithFormat:@"\\b%@\\b\\b(?!.*\\b(?:and|the)\\b)", word];
NSRegularExpression *regex = [[NSRegularExpression alloc] initWithPattern:pattern options:NSRegularExpressionCaseInsensitive error:nil];
It seems Objective-C does support negative lookbehinds, so here are two excellent posts about them:
http://www.codinghorror.com/blog/2005/10/excluding-matches-with-regular-expressions.html (by Jeff Atwood)
http://www.regular-expressions.info/lookaround.html (second link on the above blog post)
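As one possible reading of the question, the sketch below matches whole words while skipping 'and' and 'the'. It uses a negative lookahead placed at the start of the word, which is a swapped-in technique from the same lookaround family rather than the lookbehind mentioned above. The helper name WordsExcludingStopWords is an assumption.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: returns every word in `text` except "and" and "the"
// (compared case-insensitively).
static NSArray<NSString *> *WordsExcludingStopWords(NSString *text) {
    // (?!(?:and|the)\b) rejects a match that starts with exactly one of the
    // excluded words; longer words such as "then" still match.
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:@"\\b(?!(?:and|the)\\b)\\w+\\b"
                                                  options:NSRegularExpressionCaseInsensitive
                                                    error:NULL];
    NSMutableArray<NSString *> *words = [NSMutableArray array];
    NSArray<NSTextCheckingResult *> *matches =
        [regex matchesInString:text options:0 range:NSMakeRange(0, text.length)];
    for (NSTextCheckingResult *match in matches) {
        [words addObject:[text substringWithRange:match.range]];
    }
    return words;
}
```

The lookahead sits before \w+ so that it is tested at the position where a word begins. Putting it after the word, as in the pattern from the question, instead checks the text that follows the match.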