Apostrophes (') are not recognised in regular expression - objective-c

I want a regular expression for a first name that can contain
1) Letters
2) Spaces
3) Apostrophes
Examples: Raja, Raja reddy, Raja's
I used this ^([a-z]+[,.]?[ ]?|[a-z]+[']?)+$ but it is failing to recognise Apostrophes (').
- (BOOL)validateFirstNameOrLastNameOrCity:(NSString *)inputCanditate {
    NSString *firstNameRegex = @"^([a-z]+[,.]?[ ]?|[a-z]+[']?)+$";
    NSPredicate *firstNamePredicate = [NSPredicate predicateWithFormat:@"SELF MATCHES[c] %@", firstNameRegex];
    return [firstNamePredicate evaluateWithObject:inputCanditate];
}

May I recommend ^[A-Z][a-zA-Z ']* ?
// NSRegularExpression is available in Foundation from iOS 4.0 onwards
NSError *error = NULL;
NSRegularExpression *regex = [NSRegularExpression regularExpressionWithPattern:@"^[A-Z][a-zA-Z ']*" options:NSRegularExpressionAnchorsMatchLines error:&error];
NSUInteger numberOfMatches = [regex numberOfMatchesInString:searchText options:0 range:NSMakeRange(0, [searchText length])];
// Any match at all means the text starts with an accepted name
return numberOfMatches > 0;
^[A-Z] : forces the name to start with a capital letter from A to Z
[a-zA-Z ']* : followed by any number of characters that can be 'a' to 'z', 'A' to 'Z', a space or a single quote
Append $ if the whole input, not just its beginning, must consist of these characters.

I think you are looking for a pattern like this: ^[a-zA-Z ']+$
However, this is pretty bad. What about umlauts, accents, and a whole lot of other letters that are not part of the ASCII alphabet?
A better solution would be to allow any kind of letter from any language.
To do so you can use the Unicode "letter" category \p{L}, e.g. ^[\p{L}]+$.
...or you could just drop that rule altogether, as reasonably suggested.
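A minimal sketch of the Unicode-aware check, assuming the rule is "starts with a letter, then any mix of letters, spaces and apostrophes". The function name IsPlausibleName is hypothetical, and the pattern is one possible reading of the requirement rather than a definitive rule for names.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: YES when `name` starts with a Unicode letter and
// otherwise contains only Unicode letters, spaces and apostrophes.
static BOOL IsPlausibleName(NSString *name) {
    // \p{L} is any Unicode letter; each backslash is doubled because this
    // is an Objective-C string literal.
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:@"^\\p{L}[\\p{L} ']*$"
                                                  options:0
                                                    error:NULL];
    NSRange whole = NSMakeRange(0, name.length);
    return [regex numberOfMatchesInString:name options:0 range:whole] > 0;
}
```

With this pattern, inputs such as Raja, Raja reddy and Raja's are accepted, while a string containing a digit is rejected.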

Related

Objective-C Find first matching regular expression in string

I have a task involving regular expressions. I have a list of NSRegularExpression objects with different patterns, and an NSString object that serves as the source. I need to find which regular expression from the list matches at the BEGINNING of the source.
Is there a way to do it with Objective-C?
For example:
Expression patterns
[a-z]
[A-Z]
[1-9]
source
Hello32
Result
Expression no. 2 matches the beginning of the source, because of the letter H.
Why don't you just try them out?
NSString *testString = @"Hello";
NSArray *patterns = @[
    @"[a-z]",
    @"[A-Z]",
    @"[1-9]",
];
for (NSString *pattern in patterns) {
    NSRegularExpression *regex = [NSRegularExpression regularExpressionWithPattern:pattern
                                                                           options:0
                                                                             error:NULL];
    BOOL matchAtStart = [regex rangeOfFirstMatchInString:testString
                                                 options:0
                                                   range:(NSRange){0, testString.length}].location == 0;
    NSLog(@"'%@': %@", pattern, @(matchAtStart));
}
You can prepend \A(?: and append ) to each pattern to force it to match at the beginning of the string. The patterns provided as an example would become:
\A(?:[a-z])
\A(?:[A-Z])
\A(?:[1-9])
\A is an anchor to the beginning of the string (behaves exactly like ^ when the Multiline flag is not set).
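The wrapping can also be done in code. The following is a sketch only: the function name IndexOfFirstPatternMatchingStart is hypothetical, and it assumes the patterns are available as strings (with NSRegularExpression objects, the pattern property would supply them).

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: index of the first pattern that matches at the very
// start of `source`, or NSNotFound when none does.
static NSUInteger IndexOfFirstPatternMatchingStart(NSArray *patterns, NSString *source) {
    NSRange whole = NSMakeRange(0, source.length);
    for (NSUInteger i = 0; i < patterns.count; i++) {
        // The non-capturing group keeps alternations such as a|b anchored
        // as a whole instead of anchoring only the first alternative.
        NSString *anchored = [NSString stringWithFormat:@"\\A(?:%@)",
                                                        [patterns objectAtIndex:i]];
        NSRegularExpression *regex =
            [NSRegularExpression regularExpressionWithPattern:anchored
                                                      options:0
                                                        error:NULL];
        // A nil regex means the pattern did not compile; skip it.
        if (regex != nil &&
            [regex firstMatchInString:source options:0 range:whole] != nil) {
            return i;
        }
    }
    return NSNotFound;
}
```

For the patterns [a-z], [A-Z], [1-9] and the source Hello32 this returns index 1, i.e. the second expression.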

Objective C Regex?

I'm trying to parse a 7-digit number from a page's source code and the pattern that I look for is
/nnnnnnn"
where "n" is a digit. I'm trying with the following regex and in a regex test site it works, but not in obj-c. Is it possible that I'm passing the wrong option or something?
NSRegularExpression *regex = [NSRegularExpression regularExpressionWithPattern:@"/\d\d\d\d\d\d\d\">" options:NSRegularExpressionSearch error:nil];
NSUInteger numberOfMatches = [regex numberOfMatchesInString:contents
                                                    options:0
                                                      range:NSMakeRange(0, [contents length])];
You should double the backslashes in front of your ds, like this:
@"/\\d\\d\\d\\d\\d\\d\\d\">"
Backslash is a special character inside a string literal: the character after it is interpreted differently. In order for the regex engine to see a backslash, you need two backslashes in the literal.
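To actually pull the number out, a capture group can be combined with the doubled backslashes. This is a sketch under the assumption that the target is the /nnnnnnn" shape from the question; the function name FirstSevenDigitID is hypothetical.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: the first run of exactly seven digits that sits
// between a slash and a double quote, or nil when there is none.
static NSString *FirstSevenDigitID(NSString *contents) {
    // \\d reaches the regex engine as \d, while \" is only the way to put
    // a double quote inside the string literal.
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:@"/(\\d{7})\""
                                                  options:0
                                                    error:NULL];
    NSTextCheckingResult *match =
        [regex firstMatchInString:contents
                          options:0
                            range:NSMakeRange(0, contents.length)];
    if (match == nil) {
        return nil;
    }
    // Group 1 holds just the digits, without the slash and the quote.
    return [contents substringWithRange:[match rangeAtIndex:1]];
}
```

Passing 0 for options is the neutral choice here; NSRegularExpressionSearch belongs to the NSString comparison options, not to NSRegularExpressionOptions.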

Matching a regular expression with a string (file name)

I'm trying to differentiate between 2 files (in NSString format). As far as I know, this can be done by comparing and matching a regular expression. The format of the 2 jpg files which I have are:
butter.jpg
butter-1.jpg
My question is: what regular expression can I write to match the 2 strings above? I've searched and found an example expression, but I'm not sure how it is read and I think it's wrong.
Here is my code:
NSString *exampleFileName = [NSString stringWithFormat:@"butter-1.jpg"];
NSString *regEx = @".*l{2,}.*";
NSPredicate *regExTest = [NSPredicate predicateWithFormat:@"SELF MATCHES %@", regEx];
if ([regExTest evaluateWithObject:exampleFileName] == YES) {
    NSLog(@"Match!");
} else {
    NSLog(@"No match!");
}
EDIT:
I tried using the following:
NSString *regEx = @"[a-z]+-[0-9]+.+jpg";
to try to match:
NSString *exampleFileName = [NSString stringWithFormat:@"abcdefg-112323.jpg"];
Tested with:
abc-11.jpg (Match)
abcsdas-.jpg (No Match)
abcdefg11. (No Match)
abcdefg-3123.jpg (Match)
As of now it works, but I want to eliminate any chances that it might not, any inputs?
NSString *regEx = @"[a-z]+-[0-9]+.+jpg";
will fail for butter.jpg, as it requires one - and at least one digit.
NSString *regEx = @"[a-z]+(-[0-9]+){0,1}\\.jpg";
makes the numeric suffix optional, and if you do
NSString *regEx = @"([a-z]+)(?:-([0-9]+)){0,1}\\.jpg";
you can access the information you will probably want later as capture groups.
(...) | Capturing parentheses. Range of input that matched the parenthesized subexpression is available after the match.
And if you don't need capture groups:
NSString *regEx = @"(?:[a-z])+(?:-[0-9]+){0,1}\\.jpg";
(?:...) | Non-capturing parentheses. Groups the included pattern, but does not provide capturing of matching text. Somewhat more efficient than capturing parentheses.
You can match an alphabetic character (in any language) using \p{L}. You can match a digit using \d. You need to escape the . because in a regular expression, . means “any character”.
Parsing a regular expression is expensive, so you should only do it once.
BOOL stringMatchesMyPattern(NSString *string) {
    static dispatch_once_t once;
    static NSRegularExpression *re;
    // Compile the pattern only once, on first use
    dispatch_once(&once, ^{
        re = [NSRegularExpression regularExpressionWithPattern:
            @"^\\p{L}+-\\d+\\.jpg$" options:0 error:NULL];
    });
    return nil != [re firstMatchInString:string options:0
        range:NSMakeRange(0, string.length)];
}
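If both butter.jpg and butter-1.jpg should be accepted, the optional suffix and the escaped dot can be combined. A sketch, assuming names made of letters with an optional -number part; the function name IsNumberedJPEGName is hypothetical, and the regex is recompiled on each call only to keep the example short.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: YES for names such as butter.jpg or butter-1.jpg.
static BOOL IsNumberedJPEGName(NSString *name) {
    // \p{L}+      one or more letters
    // (?:-\d+)?   optional hyphen followed by one or more digits
    // \.jpg       a literal dot and the extension
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:@"^\\p{L}+(?:-\\d+)?\\.jpg$"
                                                  options:0
                                                    error:NULL];
    return nil != [regex firstMatchInString:name
                                    options:0
                                      range:NSMakeRange(0, name.length)];
}
```

Because the dot is escaped, a name like butterXjpg is rejected, which an unescaped . would let through.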

Objective C. Regular expression to eliminate anything after 3 dots

I wrote the following code to eliminate anything after 3 dots
currentItem.summary = @"I am just testing. I am ... the second part should be eliminated";
NSError *error = NULL;
NSRegularExpression *regex = [NSRegularExpression regularExpressionWithPattern:@"(.)*(/././.)(.)*" options:0 error:&error];
if (nil != regex) {
    currentItem.summary = [regex stringByReplacingMatchesInString:currentItem.summary
                                                          options:0 range:NSMakeRange(0, [currentItem.summary length])
                                                     withTemplate:@"$1"];
}
However, my input and output are the same. The correct output should be "I am just testing. I am".
I was trying to do this using regular expression because I have a database of other regular expressions that I run on the string. I know the performance might not be as good as a plain text find or replace but the strings involved are short. I also tried using "\" to escape the dots in the regex, but I was getting a warning.
There is another question with a similar topic but the match strings are not for objective c.
This is much easier and will accomplish what you want:
NSRange range = [currentItem.summary rangeOfString:@"..."];
// NSRange is a struct, so test its location field against NSNotFound
if (range.location != NSNotFound) {
    currentItem.summary = [currentItem.summary substringToIndex:range.location];
}
You have forward slashes, /, instead of backward slashes, \, in your pattern. Also if you wish to match everything before the three dots you should use (.*) - tag everything matched by the enclosed .*. (The other parentheses in the pattern are redundant.)
Nice alternative:
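NSScanner *scanner = [NSScanner scannerWithString:currentItem.summary];
NSString *prefix = nil; // the address of a property cannot be taken, so scan into a local
if ([scanner scanUpToString:@"..." intoString:&prefix]) currentItem.summary = prefix;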
My recommended regex for your problem:
regularExpressionWithPattern:@"^(.*?)\\s*\\.{3}.*$"
Main differences between this one and yours:
uses backslashes to escape special chars
uses ^ and $ to anchor at the beginning and end of the string
only captures the interesting section with (), and does so lazily (.*?) so that the capture stops at the first ... and does not swallow the whitespace before it
strips whitespace before the ... by ignoring any number of whitespace chars (\s*).
After correcting the slashes and other improvements, my final expression is:
NSRegularExpression *regex = [NSRegularExpression regularExpressionWithPattern:@"^(.*)\\.{3}.*$"
                                                                       options:0
                                                                         error:&error];
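Putting the pieces together, here is a sketch of the whole replacement as a function. It is one possible variant, not the poster's exact code: it uses a lazy group plus \s* so that the result has no trailing space, and adds NSRegularExpressionDotMatchesLineSeparators so multi-line text is handled. The function name TextBeforeEllipsis is hypothetical.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: everything before the first "...", with whitespace
// in front of the dots removed. Text without "..." is returned unchanged.
static NSString *TextBeforeEllipsis(NSString *text) {
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:@"^(.*?)\\s*\\.{3}.*$"
                                                  options:NSRegularExpressionDotMatchesLineSeparators
                                                    error:NULL];
    // $1 refers to the first capture group, i.e. the text to keep.
    return [regex stringByReplacingMatchesInString:text
                                           options:0
                                             range:NSMakeRange(0, text.length)
                                      withTemplate:@"$1"];
}
```

For the summary in the question this yields "I am just testing. I am".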

NSRegularExpression string delimited by

I have a string, for example @"You've earned Commentator and 4 ##other$$ badges". I want to retrieve the substring @"other", which is delimited by ## and $$. I made an NSRegularExpression like this:
NSRegularExpression *regex = [NSRegularExpression regularExpressionWithPattern:@"##(.*)$$" options:NSRegularExpressionCaseInsensitive error:nil];
This completely ignores $$ and returns stuff starting with ##. What am I doing wrong? thanks.
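That's because '$' is a special character that matches the end of the line. Escape it as \$\$ to tell the parser you want the literal characters; inside an Objective-C string literal each backslash must itself be doubled, so the pattern is written @"##(.*)\\$\\$".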
I wouldn't use a regex in this situation, since the string bashing is so simple. No need for the overhead of compiling the expression.
NSString *source = @"You've earned Commentator and 4 ##other$$ badges";
NSRange firstDelimiterRange = [source rangeOfString:@"##"];
NSRange secondDelimiterRange = [source rangeOfString:@"$$"];
// Assumes both delimiters are present; check .location against NSNotFound otherwise.
// The substring starts right after "##" and ends right before "$$".
NSUInteger start = NSMaxRange(firstDelimiterRange);
NSString *result = [source substringWithRange:
    NSMakeRange(start, secondDelimiterRange.location - start)];
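For comparison, the regex route with the dollar signs escaped could look like the sketch below. The function name TextBetweenDelimiters is hypothetical, and the lazy (.*?) is an assumption on my part so that the capture stops at the first $$.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: the text between the first "##" and the next "$$",
// or nil when the delimiters are not found in that order.
static NSString *TextBetweenDelimiters(NSString *source) {
    // \\$ reaches the regex engine as \$, a literal dollar sign instead of
    // the end-of-line anchor.
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:@"##(.*?)\\$\\$"
                                                  options:0
                                                    error:NULL];
    NSTextCheckingResult *match =
        [regex firstMatchInString:source
                          options:0
                            range:NSMakeRange(0, source.length)];
    if (match == nil) {
        return nil;
    }
    return [source substringWithRange:[match rangeAtIndex:1]];
}
```

Unlike the range arithmetic, this version returns nil instead of raising an exception when a delimiter is missing.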