Objective-C Find first matching regular expression in string - objective-c

I have a task involving regular expressions. I have a list of NSRegularExpression objects with different patterns, and an NSString object that serves as the source. I need to find which regular expression (from the given list) matches at the BEGINNING of the source.
Is there a way to do it with Objective-C?
For example:
Expressions patterns
[a-z]
[A-Z]
[1-9]
source
Hello32
Result
Expression no 2 fits for the beginning of source, because of letter H.

Why don't you just try them out?
NSString *testString = @"Hello";
NSArray *patterns = @[
    @"[a-z]",
    @"[A-Z]",
    @"[1-9]",
];
for (NSString *pattern in patterns) {
    NSRegularExpression *regex = [NSRegularExpression regularExpressionWithPattern:pattern
                                                                           options:0
                                                                             error:NULL];
    BOOL matchAtStart = [regex rangeOfFirstMatchInString:testString
                                                 options:0
                                                   range:(NSRange){0, testString.length}].location == 0;
    NSLog(@"'%@': %@", pattern, @(matchAtStart));
}

You can prepend \A(?: and append ) to each pattern to force it to match at the beginning of the string. The patterns given in the example would become:
\A(?:[a-z])
\A(?:[A-Z])
\A(?:[1-9])
\A is an anchor to the beginning of the string (it behaves exactly like ^ when the Multiline flag is not set).
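If the regular expressions already exist as NSRegularExpression objects, rewriting their patterns may be inconvenient. A minimal sketch of an alternative, assuming the NSMatchingAnchored matching option fits your case: it restricts a match to the start of the search range, so the patterns stay untouched. The function name IndexOfFirstRegexMatchingStart is hypothetical.

```objc
#import <Foundation/Foundation.h>

// Returns the index of the first regex that matches at the very start of
// `source`, or NSNotFound if none does. (Hypothetical helper name.)
static NSUInteger IndexOfFirstRegexMatchingStart(NSArray<NSRegularExpression *> *regexes,
                                                 NSString *source) {
    NSRange whole = NSMakeRange(0, source.length);
    for (NSUInteger i = 0; i < regexes.count; i++) {
        // NSMatchingAnchored only accepts a match that begins at whole.location.
        NSRange found = [regexes[i] rangeOfFirstMatchInString:source
                                                      options:NSMatchingAnchored
                                                        range:whole];
        if (found.location != NSNotFound) {
            return i;
        }
    }
    return NSNotFound;
}
```

With the three example patterns and the source Hello32, this should return index 1 (the second expression), because only [A-Z] matches the leading H.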

Related

Apostrophes (') is not recognised in regular expression

I want a regular expression for first name that can contain
1)Alphabets
2)Spaces
3)Apostrophes
Exp: Raja, Raja reddy, Raja's,
I used this ^([a-z]+[,.]?[ ]?|[a-z]+[']?)+$ but it is failing to recognise Apostrophes (').
- (BOOL)validateFirstNameOrLastNameOrCity:(NSString *) inputCanditate {
    NSString *firstNameRegex = @"^([a-z]+[,.]?[ ]?|[a-z]+[']?)+$";
    NSPredicate *firstNamePredicate = [NSPredicate predicateWithFormat:@"SELF MATCHES[c] %@",firstNameRegex];
    return [firstNamePredicate evaluateWithObject:inputCanditate];
}
May I recommend ^[A-Z][a-zA-Z ']* ?
// NSRegularExpression is available in the Foundation framework from iOS 4 onward
NSError *error = NULL;
NSRegularExpression *regex = [NSRegularExpression regularExpressionWithPattern:@"^[A-Z][a-zA-Z ']*" options:NSRegularExpressionAnchorsMatchLines error:&error];
// Search the same string whose length defines the range
NSUInteger numberOfMatches = [regex numberOfMatchesInString:searchText options:0 range:NSMakeRange(0, [searchText length])];
// One match is enough; note that without a trailing $ a valid prefix also counts
return numberOfMatches > 0;
^[A-Z] : forces the string to start with a capital letter from A to Z
[a-zA-Z ']* : followed by any number of characters that can be 'a' to 'z', 'A' to 'Z', a space or an apostrophe
I think you are looking for a pattern like this: ^[a-zA-Z ']+$
However, this is pretty bad. What about umlauts, accents, and a whole lot other letters that are not part of the ASCII alphabet?
A better solution would be to allow any kind of letter from any language.
To do so you can use the Unicode "letter" category \p{L}, e.g. ^[\p{L}]+$.
... or you could just drop that rule altogether, as reasonably suggested.
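As a rough sketch of the Unicode-aware approach, the check below combines \p{L} with the space and apostrophe from the question. The function name IsPlausibleName and the exact rule (a leading letter followed by letters, spaces or apostrophes) are assumptions for illustration, not a complete name validator.

```objc
#import <Foundation/Foundation.h>

// Returns YES if `candidate` starts with a letter (any script) and otherwise
// contains only letters, spaces and apostrophes. (Hypothetical helper.)
static BOOL IsPlausibleName(NSString *candidate) {
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:@"^\\p{L}[\\p{L} ']*$"
                                                  options:0
                                                    error:NULL];
    NSRange whole = NSMakeRange(0, candidate.length);
    return [regex firstMatchInString:candidate options:0 range:whole] != nil;
}
```

Under these assumptions Raja, Raja reddy and Raja's are accepted, as are names with accented letters, while a string containing digits is rejected.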

Matching a regular expression with a string (file name)

I'm trying to differentiate between 2 files (in NSString format). As far as I know, this can be done by comparing and matching a regular expression. The format of the 2 jpg files which I have are:
butter.jpg
butter-1.jpg
My question is: what regular expression can I write to match the 2 strings above? I've searched and found an example expression, but I'm not sure how to read it and I think it's wrong.
Here is my code:
NSString *exampleFileName = [NSString stringWithFormat:@"butter-1.jpg"];
NSString *regEx = @".*l{2,}.*";
NSPredicate *regExTest = [NSPredicate predicateWithFormat:@"SELF MATCHES %@", regEx];
if ([regExTest evaluateWithObject:exampleFileName] == YES) {
    NSLog(@"Match!");
} else {
    NSLog(@"No match!");
}
EDIT:
I tried using the following:
NSString *regEx = @"[a-z]+-[0-9]+.+jpg";
to try to match:
NSString *exampleFileName = [NSString stringWithFormat:@"abcdefg-112323.jpg"];
Tested with:
abc-11.jpg (Match)
abcsdas-.jpg (No Match)
abcdefg11. (No Match)
abcdefg-3123.jpg (Match)
As of now it works, but I want to eliminate any chances that it might not, any inputs?
NSString *regEx = @"[a-z]+-[0-9]+.+jpg";
will fail for butter.jpg, as it requires one - and at least one digit. Make the suffix optional and escape the dot:
NSString *regEx = @"[a-z]+(-[0-9]+){0,1}\\.jpg";
and if you do
NSString *regEx = @"([a-z]+)(?:-([0-9]+)){0,1}\\.jpg";
you can later access the parts you are probably interested in (the name and the number) as capture groups.
(...) |Capturing parentheses. Range of input that matched the parenthesized subexpression is available after the match.
and if you don't need capture groups
NSString *regEx = @"(?:[a-z])+(?:-[0-9]+){0,1}\\.jpg";
(?:...)| Non-capturing parentheses. Groups the included pattern, but does not provide capturing of matching text. Somewhat more efficient than capturing parentheses.
You can match an alphabetic character (in any language) using \p{L}. You can match a digit using \d. You need to escape the . because in a regular expression, . means “any character”.
Parsing a regular expression is expensive, so you should only do it once.
BOOL stringMatchesMyPattern(NSString *string) {
    static dispatch_once_t once;
    static NSRegularExpression *re;
    // Compile the pattern only once, on first use
    dispatch_once(&once, ^{
        re = [NSRegularExpression regularExpressionWithPattern:
            @"^\\p{L}+-\\d+\\.jpg$" options:0 error:NULL];
    });
    return nil != [re firstMatchInString:string options:0
        range:NSMakeRange(0, string.length)];
}
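To cover both butter.jpg and butter-1.jpg and still tell them apart, the two ideas above can be combined: an optional, captured numeric suffix plus an escaped dot. This is only a sketch; the function name NumericSuffixOfImageName and its return convention (nil for no match, an empty string for a match without a suffix) are assumptions made for the example.

```objc
#import <Foundation/Foundation.h>

// Returns the digits after the hyphen ("1" for "butter-1.jpg"), an empty
// string when the name has no suffix ("butter.jpg"), or nil when the name
// does not match the expected format at all. (Hypothetical helper.)
static NSString *NumericSuffixOfImageName(NSString *name) {
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:@"^(\\p{L}+)(?:-(\\d+))?\\.jpg$"
                                                  options:0
                                                    error:NULL];
    NSTextCheckingResult *match = [regex firstMatchInString:name
                                                    options:0
                                                      range:NSMakeRange(0, name.length)];
    if (match == nil) {
        return nil;
    }
    // A group that did not take part in the match reports NSNotFound.
    NSRange digits = [match rangeAtIndex:2];
    if (digits.location == NSNotFound) {
        return @"";
    }
    return [name substringWithRange:digits];
}
```

If this helper is called often, the regular expression could be cached with dispatch_once as shown above.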

Objective-C: Parsing String into an Array under Special Circumstances

I have a string:
[{"id":1,"gameName":"arizona","cost":"0.5E1","email":"hi@gmail.com","requests":0},{"id":2,"gameName":"arizona","cost":"0.5E1","email":"hi@gmail.com","requests":0},{"id":3,"gameName":"arizona","cost":"0.5E1","email":"hi@gmail.com","requests":0}]
However, I would like to parse this string into an array such as:
[{"id":1,"gameName":"arizona","cost":"0.5E1","email":"hi@gmail.com","requests":0},
{"id":2,"gameName":"arizona","cost":"0.5E1","email":"hi@gmail.com","requests":0},
{"id":3,"gameName":"arizona","cost":"0.5E1","email":"hi@gmail.com","requests":0}]
This array is delimited by the comma in between the curly braces: },{
I tried using the command
NSArray *responseArray = [response componentsSeparatedByString:@","];
but this separates the string into values at EVERY comma, which is not desirable.
Then I tried using regex:
NSRegularExpression *regex = [NSRegularExpression regularExpressionWithPattern:@"\\{.*\\}" options:NSRegularExpressionCaseInsensitive error:&error];
NSArray *matches = [regex matchesInString:response options:0 range:NSMakeRange(0, [response length])];
which found one match: starting at the first curly brace to the last curly brace.
I was wondering if anyone knows how to solve this problem efficiently?
This string seems to be valid JSON. Try a JSON parser: NSJSONSerialization
I agree with H2CO3's suggestion to use a parser where possible.
But looking at your attempted regex, it looks like you just need to make it non-greedy, i.e.
@"\\{.*?\\}"
       ^
       |
       Add this question mark for non-greedy matching.
Of course, this will fail if you have deeper levels of (what I assume to be) nested arrays. Go with the JSON parser!
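A minimal sketch of the parser route, assuming the response is a UTF-8 string whose top-level value is a JSON array. The function name ParseJSONArray is hypothetical; NSJSONSerialization itself is part of Foundation.

```objc
#import <Foundation/Foundation.h>

// Parses `response` and returns its top-level array, or nil if the text is
// not valid JSON or the top-level value is not an array. (Hypothetical helper.)
static NSArray *ParseJSONArray(NSString *response) {
    NSData *data = [response dataUsingEncoding:NSUTF8StringEncoding];
    if (data == nil) {
        return nil;
    }
    NSError *error = nil;
    id object = [NSJSONSerialization JSONObjectWithData:data options:0 error:&error];
    if (![object isKindOfClass:[NSArray class]]) {
        return nil;
    }
    return object;
}
```

Each element of the returned array is then an NSDictionary, so a value can be read with, for example, items[0][@"gameName"], and nested structures are handled without any regular expression.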

Objective C. Regular expression to eliminate anything after 3 dots

I wrote the following code to eliminate anything after 3 dots
currentItem.summary = @"I am just testing. I am ... the second part should be eliminated";
NSError * error = NULL;
NSRegularExpression *regex = [NSRegularExpression regularExpressionWithPattern:@"(.)*(/././.)(.)*" options:0 error:&error];
if(nil != regex){
    currentItem.summary = [regex stringByReplacingMatchesInString:currentItem.summary
        options:0 range:NSMakeRange(0, [currentItem.summary length])
        withTemplate:@"$1"];
}
However, my input and output are the same. The correct output should be "I am just testing. I am".
I was trying to do this using regular expression because I have a database of other regular expressions that I run on the string. I know the performance might not be as good as a plain text find or replace but the strings involved are short. I also tried using "\" to escape the dots in the regex, but I was getting a warning.
There is another question with a similar topic but the match strings are not for objective c.
This is much easier and will accomplish what you want:
NSRange range = [currentItem.summary rangeOfString:@"..."];
// NSRange is a struct, so compare its location with NSNotFound
if (range.location != NSNotFound) {
    currentItem.summary = [currentItem.summary substringToIndex:range.location];
}
You have forward slashes, /, instead of backward slashes, \, in your pattern. Also if you wish to match everything before the three dots you should use (.*) - tag everything matched by the enclosed .*. (The other parentheses in the pattern are redundant.)
Nice alternative:
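NSScanner *scanner = [NSScanner scannerWithString:currentItem.summary];
// Scan into a local variable; the address of a property cannot be taken
NSString *summary = nil;
if ([scanner scanUpToString:@"..." intoString:&summary]) {
    currentItem.summary = summary;
}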
My recommended regex for your problem:
regularExpressionWithPattern:@"^(.*?)\\s*\\.{3}.*$"
Main differences between this one and yours:
uses backslashes to escape special chars
uses ^ and $ to anchor at the beginning and end of the string
only captures the interesting section with ()
strips whitespace before the ... by ignoring any number of whitespace chars (\s*).
After correcting the slashes and other improvements, my final expression is:
NSRegularExpression *regex = [NSRegularExpression regularExpressionWithPattern:@"^(.*)\\.{3}.*$"
                                                                       options:0
                                                                         error:&error];
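For reference, here is a self-contained sketch of the replacement using a lazy variant of the pattern, (.*?) instead of (.*). With a greedy capture, the whitespace in front of the dots ends up inside the group, so the lazy form is what lets \s* strip it; it also cuts at the first run of three dots rather than the last. The function name TextBeforeEllipsis and the use of NSRegularExpressionDotMatchesLineSeparators (so multi-line text is handled) are assumptions for this example.

```objc
#import <Foundation/Foundation.h>

// Returns the text before the first "..." with trailing whitespace removed.
// Strings without "..." are returned unchanged. (Hypothetical helper.)
static NSString *TextBeforeEllipsis(NSString *text) {
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:@"^(.*?)\\s*\\.{3}.*$"
                                                  options:NSRegularExpressionDotMatchesLineSeparators
                                                    error:NULL];
    // $1 refers to the lazily captured text in front of the dots.
    return [regex stringByReplacingMatchesInString:text
                                           options:0
                                             range:NSMakeRange(0, text.length)
                                      withTemplate:@"$1"];
}
```

Applied to the summary from the question, this should yield "I am just testing. I am".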

Why NSRegularExpression says that there are two matches of ".*" in the "a" string?

I'm very happy that Lion introduced NSRegularExpression, but I can't understand why the pattern .* matches two occurrences in a string like "a" (text can be longer).
I was using following code:
NSError *anError = NULL;
NSRegularExpression *regex = [NSRegularExpression regularExpressionWithPattern:@".*"
                                                                       options:0
                                                                         error:&anError];
NSString *text = @"a";
NSUInteger counter = [regex numberOfMatchesInString:text
                                            options:0
                                              range:NSMakeRange(0, [text length])];
// Pass a literal format string to NSLog and cast NSUInteger for %lu
NSLog(@"counter = %lu", (unsigned long)counter);
Output from the console is:
2011-07-27 22:03:27.689 Regex[1930:707] counter = 2
Can anyone explain why that is?
The regular expression .* matches zero or more characters. Thus, it will match the empty string as well as a and as such there are two matches.
Mildly surprised that it didn't match 3 times. One for the "" before the "a", one for the "a" and one for the "" after the "a".
As has been noted, use a more precise pattern; including anchors (^ and/or $) might also change the behaviour.
No-one has asked, but why would you want to do this anyway?
The documentation for NSRegularExpression says the following:
Some regular expressions [...] can
successfully match a zero-length range, so the comparison of the
resulting range with {NSNotFound, 0} is the most reliable way to
determine whether there was a match or not.
A more reliable way to get just one match would be to change the expression to .+
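To see where the two matches come from, it can help to list the range of every match instead of only counting them. The sketch below does that; the function name MatchRanges is hypothetical. For the string "a", the pattern .* should report the range {0, 1} (the letter itself) followed by the zero-length range {1, 0} at the end of the string, whereas .+ cannot match an empty range and so reports only the first.

```objc
#import <Foundation/Foundation.h>

// Collects the range of every match of `pattern` in `text`, wrapped in
// NSValue objects. (Hypothetical helper.)
static NSArray<NSValue *> *MatchRanges(NSString *pattern, NSString *text) {
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:pattern
                                                  options:0
                                                    error:NULL];
    NSMutableArray<NSValue *> *ranges = [NSMutableArray array];
    NSArray<NSTextCheckingResult *> *matches =
        [regex matchesInString:text options:0 range:NSMakeRange(0, text.length)];
    for (NSTextCheckingResult *match in matches) {
        [ranges addObject:[NSValue valueWithRange:match.range]];
    }
    return ranges;
}
```

Logging NSStringFromRange(value.rangeValue) for each element makes the zero-length match visible.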