Since regular expressions are not supported in Cocoa, I find RegexKitLite very useful.
But all examples extract matching strings.
I just want to test whether a string matches a regular expression and get a YES or NO.
How can I do that?
I've used NSPredicate for that purpose:
NSString *someRegexp = ...;
NSPredicate *myTest = [NSPredicate predicateWithFormat:@"SELF MATCHES %@", someRegexp];
if ([myTest evaluateWithObject:testString]) {
    // Matches
}
Another way to do this, which is a bit simpler than using NSPredicate, is an almost undocumented option to NSString's -rangeOfString:options: method:
NSRange range = [string rangeOfString:@"^\\w+$" options:NSRegularExpressionSearch];
BOOL matches = range.location != NSNotFound;
I say "almost undocumented", because the method itself doesn't list the option as available, but if you happen upon the documentation for the Search and Comparison operators and find NSRegularExpressionSearch you'll see that it's a valid option for the -rangeOfString... methods since OS X 10.7 and iOS 3.2.
NSRegularExpression is another option:
http://developer.apple.com/library/ios/#documentation/Foundation/Reference/NSRegularExpression_Class/Reference/Reference.html
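For a plain yes/no answer with NSRegularExpression, something along these lines should work. It is only a sketch: StringMatchesPattern is a made-up helper name, not a Foundation function, and it treats an invalid pattern as "no match". As far as I know, SELF MATCHES in a predicate has to match the entire string, whereas this (like -rangeOfString:options:) finds a match anywhere, so anchor the pattern with ^ and $ if you want whole-string behaviour.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (not part of Foundation): returns YES when `pattern`
// matches anywhere in `string`, and NO otherwise or when the pattern is invalid.
static BOOL StringMatchesPattern(NSString *string, NSString *pattern) {
    NSError *error = nil;
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:pattern
                                                  options:0
                                                    error:&error];
    if (regex == nil) {
        NSLog(@"Invalid pattern: %@", error);
        return NO;
    }
    NSRange wholeString = NSMakeRange(0, string.length);
    // firstMatchInString: returns nil when nothing matches.
    return [regex firstMatchInString:string options:0 range:wholeString] != nil;
}
```

Usage would then be a one-liner such as: if (StringMatchesPattern(someString, @"^[0-9a-fA-F]+:")) { ... }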
Use RegexKitLite's -isMatchedByRegex: method.
if ([someString isMatchedByRegex:@"^[0-9a-fA-F]+:"]) { NSLog(@"Matched!\n"); }
Related
I would like to loop through an NSString and call a custom function on every word that has certain criterion (For example, "has 2 'L's"). I was wondering what the best way of approaching that was. Should I use Find/Replace patterns? Blocks?
-(NSString *)convert:(NSString *)wordToConvert{
/// This I have already written
return finalWord;
}
-(NSString *) method:(NSString *) sentenceContainingWords{
// match every word that meets the criteria (for example the 2Ls) and replace it with what convert: does.
}
To enumerate the words in a string, you should use -[NSString enumerateSubstringsInRange:options:usingBlock:] with NSStringEnumerationByWords and NSStringEnumerationLocalized. All of the other methods listed use a means of identifying words which may not be locale-appropriate or correspond to the system definition. For example, two words separated by a comma but not whitespace (e.g. "foo,bar") would not be treated as separate words by any of the other answers, but they are in Cocoa text views.
[aString enumerateSubstringsInRange:NSMakeRange(0, [aString length])
                            options:NSStringEnumerationByWords | NSStringEnumerationLocalized
                         usingBlock:^(NSString *substring, NSRange substringRange, NSRange enclosingRange, BOOL *stop){
    if ([substring rangeOfString:@"ll" options:NSCaseInsensitiveSearch].location != NSNotFound)
        /* do whatever */;
}];
As documented for -enumerateSubstringsInRange:options:usingBlock:, if you call it on a mutable string, you can safely mutate the string being enumerated within the enclosingRange. So, if you want to replace the matching words, you can with something like [aString replaceCharactersInRange:substringRange withString:replacementString].
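One conservative way to put this together for the question's convert: case is to record the ranges of the matching words first and then replace them from the end of the string backwards, so earlier ranges stay valid even when a replacement changes the length. This is a minimal sketch rather than the only approach: ConvertWord is a placeholder standing in for convert:, and HasTwoLs and ConvertMatchingWords are illustrative names I made up.

```objc
#import <Foundation/Foundation.h>

// Placeholder for the asker's convert: method; upper-casing is only a stand-in.
static NSString *ConvertWord(NSString *word) {
    return [word uppercaseString];
}

// Example criterion: the word contains exactly two 'l' characters, in any case.
static BOOL HasTwoLs(NSString *word) {
    NSUInteger count = 0;
    for (NSUInteger i = 0; i < word.length; i++) {
        unichar c = [word characterAtIndex:i];
        if (c == 'l' || c == 'L') count++;
    }
    return count == 2;
}

// Returns a copy of `sentence` in which every matching word has been converted.
static NSString *ConvertMatchingWords(NSString *sentence) {
    // Pass 1: collect the ranges of the words that meet the criterion.
    NSMutableArray<NSValue *> *ranges = [NSMutableArray array];
    [sentence enumerateSubstringsInRange:NSMakeRange(0, sentence.length)
                                 options:NSStringEnumerationByWords | NSStringEnumerationLocalized
                              usingBlock:^(NSString *word, NSRange wordRange,
                                           NSRange enclosingRange, BOOL *stop) {
        if (HasTwoLs(word)) {
            [ranges addObject:[NSValue valueWithRange:wordRange]];
        }
    }];

    // Pass 2: replace back to front so the remaining ranges are not shifted.
    NSMutableString *result = [sentence mutableCopy];
    for (NSValue *value in [ranges reverseObjectEnumerator]) {
        NSRange range = value.rangeValue;
        NSString *converted = ConvertWord([result substringWithRange:range]);
        [result replaceCharactersInRange:range withString:converted];
    }
    return result;
}
```

Punctuation and spacing between words are left untouched, because only the word ranges are replaced.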
The two ways I know of looping an array that will work for you are as follows:
NSArray *words = [sentence componentsSeparatedByCharactersInSet:[NSCharacterSet whitespaceAndNewlineCharacterSet]];
for (NSString *word in words)
{
NSString *transformedWord = [obj method:word];
}
and
NSArray *words = [sentence componentsSeparatedByCharactersInSet:[NSCharacterSet whitespaceAndNewlineCharacterSet]];
[words enumerateObjectsWithOptions:NSEnumerationConcurrent usingBlock:^(id word, NSUInteger idx, BOOL *stop){
NSString *transformedWord = [obj method:word];
}];
The other method, –makeObjectsPerformSelector:withObject:, won't work for you. It expects to be able to call [word method:obj] which is backwards from what you expect.
If you could write your criteria with regular expressions, then you could probably do a regular expression matching to fetch these words and then pass them to your convert: method.
You could also do a split of string into an array of words using componentsSeparatedByString: or componentsSeparatedByCharactersInSet:, then go over the words in the array and detect if they fit your criteria somehow. If they fit, then pass them to convert:.
Hope this helps.
As of iOS 12/macOS 10.14 the recommended way to do this is with the Natural Language framework.
For example:
import NaturalLanguage
let myString = "..."
let tokeniser = NLTokenizer(unit: .word)
tokeniser.string = myString
tokeniser.enumerateTokens(in: myString.startIndex..<myString.endIndex) { wordRange, attributes in
performActionOnWord(myString[wordRange])
return true // or return false to stop enumeration
}
Using NLTokenizer also has the benefit of allowing you to optionally specify the language of the string beforehand:
tokeniser.setLanguage(.hebrew)
I would recommend using a while loop to go through the string like this.
NSUInteger length = [sentenceContainingWords length];
NSUInteger wordStart = 0;
while (wordStart < length) {
    // Look for the next space, starting at the beginning of the current word
    NSRange searchRange = NSMakeRange(wordStart, length - wordStart);
    NSRange spaceRange = [sentenceContainingWords rangeOfString:@" " options:0 range:searchRange];
    NSUInteger wordEnd = (spaceRange.location == NSNotFound) ? length : spaceRange.location;
    if (wordEnd > wordStart) { // skip the empty "words" between consecutive spaces
        NSString *wordString = [sentenceContainingWords substringWithRange:NSMakeRange(wordStart, wordEnd - wordStart)];
        [self convert:wordString];
    }
    wordStart = wordEnd + 1; // step past the space
}
This code would probably need to be rewritten because it's pretty rough, but you should get the idea.
Edit: Just saw Jacob Gorban's post, you should definitely do it like that.
I have implemented a UISearchDisplayController that allows users to search a table. Currently the predicate I am using to search is as follows,
NSPredicate *resultPredicate = [NSPredicate predicateWithFormat:@"Name contains[cd] %@", searchText];
Now let's say a user searches for "beans, cooked": the corresponding matches are found in the table. But if the user enters the search text as "beans cooked", without the comma, no matches are found.
How can I rewrite my predicate to "ignore" the commas when searching? In other words, how can I rewrite it so that it treats "beans, cooked" as equal to "beans cooked" (no comma)?
First a disclaimer:
I think that what you are trying to do is to add some "fuzziness" to your search algorithm, given that you want to make your match insensitive to certain differences in user input.
Predicates (which are logic constructs) are by their very nature not fuzzy, so there is an underlying impedance mismatch between the problem and the tool chosen.
Anyway, one way to go about it could be to add a method to your model object class.
In this method, you can clean your name string so it only contains the most basic characters, say digits, ASCII letters and a space.
Being totally deterministic, such a method is effectively a read-only string property on your object, and as such it can be used to match in predicates.
Here is an implementation that removes punctuation, accents and diacritics:
- (NSString *)simplifiedName
{
    // First convert the name string to a pure ASCII string
    NSData *asciiData = [self.name dataUsingEncoding:NSASCIIStringEncoding allowLossyConversion:YES];
    NSString *asciiString = [[[NSString alloc] initWithData:asciiData encoding:NSASCIIStringEncoding] lowercaseString];
    // Define the characters that we will allow in our simplified name
    NSString *searchCharacters = @"0123456789 abcdefghijklmnopqrstuvwxyz";
    // Remove anything else
    NSString *regExPattern = [NSString stringWithFormat:@"[^%@]", searchCharacters];
    NSString *simplifiedName = [asciiString stringByReplacingOccurrencesOfString:regExPattern withString:@"" options:NSRegularExpressionSearch range:NSMakeRange(0, asciiString.length)];
    return simplifiedName;
}
Now, a predicate could be made to search in the simplified name:
NSPredicate *pred = [NSPredicate predicateWithFormat:@"self.simplifiedName = %@", searchString];
You would of course want to clean the search string using the same algorithm used to clean the name, so it would probably be a good idea to factor it out into a general method to be used in both places.
Last, the simplifiedName method can also be added by implementing a category to the model object class so you don't have to modify its code, which is handy in case your object class is defined in an auto-generated file by Core Data.
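Factored out, the cleaning step could look roughly like the sketch below. SimplifiedSearchString is a hypothetical name of my own, and the body simply reuses the lossy-ASCII plus regular-expression approach from the method above, so both the stored name and the typed search text go through identical rules.

```objc
#import <Foundation/Foundation.h>

// Hypothetical shared helper: lower-cases the input and strips everything
// except digits, ASCII letters and spaces.
static NSString *SimplifiedSearchString(NSString *input) {
    // Lossy conversion drops or approximates characters that are not ASCII.
    NSData *asciiData = [input dataUsingEncoding:NSASCIIStringEncoding
                            allowLossyConversion:YES];
    NSString *ascii = [[[NSString alloc] initWithData:asciiData
                                             encoding:NSASCIIStringEncoding] lowercaseString];
    // Remove every character outside the allowed set.
    return [ascii stringByReplacingOccurrencesOfString:@"[^0-9a-z ]"
                                            withString:@""
                                               options:NSRegularExpressionSearch
                                                 range:NSMakeRange(0, ascii.length)];
}
```

The simplifiedName method would then just return SimplifiedSearchString(self.name), and the search text would be passed through SimplifiedSearchString before it is handed to the predicate. With both sides cleaned, "beans, cooked" and "beans cooked" reduce to the same string.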
This may be a bit hacky, but you could just remove the comma from the search term.
Example:
searchText = [searchText stringByReplacingOccurrencesOfString:@"," withString:@""];
NSPredicate *resultPredicate = [NSPredicate predicateWithFormat:@"Name contains[cd] %@", searchText];
The best solution I found for this type of problem is to add an entry to each item's dictionary that holds the same name but with all punctuation (commas, dashes, etc.) removed, as in this answer.
All,
I am trying to use predicates to return search results, giving precedence to strings that start with the search string over strings that merely contain it.
For example if the search string was "Objective-C", I want to get the filtered results back like this:
Objective-C a Primer
Objective-C Patterns
Objective-C Programming
All About Objective-C
How to program in Objective-C
Here is what I tried but since it's an OR, it clearly does not give precedence to the first condition. Is there a way to do a type of "chaining" with predicates? Thanks
NSPredicate *filter = [NSPredicate predicateWithFormat:@"subject BEGINSWITH[cd] %@ OR subject CONTAINS[cd] %@", searchText, searchText];
NSArray *filtered = [myArray filteredArrayUsingPredicate: filter];
I don't think there's a way to do this with NSPredicate by itself. What you seem to be asking for is sorted results. You do that by sorting the results after you get them back from the predicate. In this case, since the sort order isn't simply alphabetical, you should use NSArray's -sortedArrayUsingComparator: method. Something like this (not tested, typed off the top of my head):
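NSArray *sortedStrings = [filtered sortedArrayUsingComparator:^NSComparisonResult(id obj1, id obj2) {
    NSString *string1 = (NSString *)obj1;
    NSString *string2 = (NSString *)obj2;
    NSUInteger searchStringLocation1 = [string1 rangeOfString:searchText options:NSCaseInsensitiveSearch | NSDiacriticInsensitiveSearch].location;
    NSUInteger searchStringLocation2 = [string2 rangeOfString:searchText options:NSCaseInsensitiveSearch | NSDiacriticInsensitiveSearch].location;
    // An earlier match sorts first, so strings that begin with searchText lead the list
    if (searchStringLocation1 < searchStringLocation2) return NSOrderedAscending;
    if (searchStringLocation1 > searchStringLocation2) return NSOrderedDescending;
    // Same match position: fall back to alphabetical order
    return [string1 localizedCaseInsensitiveCompare:string2];
}];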
It looks like you're trying to sort your array using a predicate, or else trying to filter AND sort using just a predicate. Predicates return boolean values: either a string begins with or contains the search text, or it doesn't. Predicates don't tell you anything about relative order. Filtering an array removes those objects for which the predicate you supply returns NO, so only objects beginning with or containing the search text will be present in the filtered array. Indeed, since any string that begins with the search text also contains it, you could simplify your predicate to just the 'contains' part.
If you want to change the order of your filtered results, sort the filtered array with an appropriate comparator.
I am splitting an NSString like this (filterString is an NSString):
NSMutableCharacterSet *separatorSet = [NSMutableCharacterSet whitespaceAndNewlineCharacterSet];
[separatorSet formUnionWithCharacterSet:[NSCharacterSet punctuationCharacterSet]];
NSMutableArray *words = [[filterString componentsSeparatedByCharactersInSet:separatorSet] mutableCopy];
I want to put words back into the form of filter string with the original punctuation and spacing. The reason I want to do this is I want to change some words and put it back together as it was originally.
A more robust way to split by words is to use string enumeration. A space is not always the delimiter, and not all languages delimit words with spaces anyway (e.g. Japanese).
NSString *string = @" \n word1! word2,%$?'/word3.word4 ";
[string enumerateSubstringsInRange:NSMakeRange(0, string.length)
                           options:NSStringEnumerationByWords
                        usingBlock:
 ^(NSString *substring, NSRange substringRange, NSRange enclosingRange, BOOL *stop) {
     NSLog(@"Substring: '%@'", substring);
 }];
// Logs:
// Substring: 'word1'
// Substring: 'word2'
// Substring: 'word3'
// Substring: 'word4'
NSString *myString = @"Foo Bar Blah B..";
NSArray *myWords = [myString componentsSeparatedByCharactersInSet:
                    [NSCharacterSet characterSetWithCharactersInString:@" "]
                   ];
NSString *string = [myWords componentsJoinedByString:@" "];
NSLog(@"%@", string);
Since you eliminate the original punctuation, there's no way to turn it back automatically.
The only way is not to use componentsSeparatedByCharactersInSet.
An alternative solution may be to iterate through the string and, for each char, check if it belongs to your character set.
If yes, add the char to a list and the substring to another list (you may use NSMutableArray class).
This way, for example, you know that the punctuation char between the first and the second substring is the first character in your list of separators.
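A rough sketch of that idea, slightly simplified: instead of two parallel lists it keeps a single ordered list of "runs" that alternate between separator characters and word characters, so joining the runs with an empty string gives back the original string exactly. RunsKeepingSeparators is an illustrative name of my own, and the code works on UTF-16 units, which I am assuming is acceptable for the separator sets involved.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: splits `input` into consecutive runs of separator and
// non-separator characters, in order. Joining the runs with @"" reproduces
// `input` unchanged.
static NSArray<NSString *> *RunsKeepingSeparators(NSString *input,
                                                  NSCharacterSet *separators) {
    NSMutableArray<NSString *> *runs = [NSMutableArray array];
    NSUInteger length = input.length;
    NSUInteger runStart = 0;
    while (runStart < length) {
        // The first character decides whether this run is separators or a word.
        BOOL isSeparatorRun =
            [separators characterIsMember:[input characterAtIndex:runStart]];
        NSUInteger runEnd = runStart + 1;
        // Extend the run while the following characters are of the same kind.
        while (runEnd < length &&
               [separators characterIsMember:[input characterAtIndex:runEnd]] == isSeparatorRun) {
            runEnd++;
        }
        [runs addObject:[input substringWithRange:NSMakeRange(runStart, runEnd - runStart)]];
        runStart = runEnd;
    }
    return runs;
}
```

To change some words, you would copy the runs into a mutable array, replace the word runs you care about, and rebuild the string with [runs componentsJoinedByString:@""]; the separator runs carry the original punctuation and spacing along.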
You can use NSArray's -componentsJoinedByString: method to rejoin the words:
NSString *orig = [words componentsJoinedByString:@" "];
How are you determining which words need to be replaced? Instead of breaking it apart in the first place, perhaps using -stringByReplacingOccurrencesOfString:withString:options:range: would be more suitable.
My guess is you may not be using the best API. If you're really worried about words, you should be using a word-based API. I'm a bit hazy on whether that would be NSDataDetector or something else. (I believe NSRegularExpression can deal with word boundaries in a smarter way.)
If you are using Mac OS X 10.7+ or iOS 4+ you can use NSRegularExpression. The pattern to match a whole word is "\bword\b", where \b matches a word boundary. Look at the methods replaceMatchesInString:options:range:withTemplate: and stringByReplacingMatchesInString:options:range:withTemplate:.
Under 10.6 or earlier, if you wish to use regular expressions you can wrap the C-based regcomp/regexec functions; they support word boundaries as well. However, you may prefer one of the other Cocoa options mentioned in the other answers for this simple case.
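For the NSRegularExpression route, a minimal sketch might look like this. ReplaceWholeWord is a hypothetical helper name; the word and the replacement are escaped so that any regex or template metacharacters in them are taken literally, and an invalid pattern simply returns the text unchanged.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: replaces every whole-word occurrence of `word` in
// `text` with `replacement`, leaving everything else (punctuation, spacing)
// as it was.
static NSString *ReplaceWholeWord(NSString *text, NSString *word, NSString *replacement) {
    // \b on both sides restricts the match to whole words.
    NSString *pattern =
        [NSString stringWithFormat:@"\\b%@\\b",
                                   [NSRegularExpression escapedPatternForString:word]];
    NSError *error = nil;
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:pattern options:0 error:&error];
    if (regex == nil) {
        return text; // leave the input untouched if the pattern cannot be built
    }
    return [regex stringByReplacingMatchesInString:text
                                           options:0
                                             range:NSMakeRange(0, text.length)
                                      withTemplate:[NSRegularExpression escapedTemplateForString:replacement]];
}
```

Because the replacement happens inside the original string, there is nothing to put back together afterwards.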
I am looking for a way to compare two strings and see whether the second string contains only characters (letters, digits or others) listed in the first. Let me explain:
For example, imagine a password in which only digits and "*" are allowed:
Reference string (1): "*0123456789" (an NSString, not an NSArray)
Work string (2): "156/15615=211" (an NSString)
How do I find out that string 2 contains 2 characters (/ and =) which are not in string 1?
To keep the allowed characters easy to manage, I do not want to use an NSArray; I would like a single function call, for example:
BOOL unauthorized_letter_found = check(work_chain, reference_chain);
Do I have to go through a "for" loop, NSPredicate, etc.?
PS: I'm on Mac OS, not iOS, so I cannot use NSRegularExpression.
You could go with character sets, e.g. using -rangeOfCharacterFromSet: to check for the presence of forbidden characters:
NSCharacterSet *notAllowed = [[NSCharacterSet
    characterSetWithCharactersInString:@"*0123456789"] invertedSet];
NSRange range = [inputString rangeOfCharacterFromSet:notAllowed];
BOOL unauthorized = (range.location != NSNotFound);
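If you also want to know which characters are the offenders (the "/" and "=" in the question), the same character-set idea extends naturally. The following is only a sketch; UnauthorizedCharacters is a made-up helper name, and it compares UTF-16 units, which is fine for a digits-and-symbols case like this one.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: returns the distinct characters of `work` that do not
// appear in `reference`, in order of first appearance. An empty result means
// every character is allowed.
static NSString *UnauthorizedCharacters(NSString *work, NSString *reference) {
    NSCharacterSet *allowed =
        [NSCharacterSet characterSetWithCharactersInString:reference];
    NSMutableString *found = [NSMutableString string];
    for (NSUInteger i = 0; i < work.length; i++) {
        unichar c = [work characterAtIndex:i];
        if ([allowed characterIsMember:c]) {
            continue; // character is permitted
        }
        NSString *single = [NSString stringWithCharacters:&c length:1];
        // Record each offending character only once.
        if ([found rangeOfString:single options:NSLiteralSearch].location == NSNotFound) {
            [found appendString:single];
        }
    }
    return found;
}
```

The BOOL from the question then becomes a length check on the result: unauthorized_letter_found = (UnauthorizedCharacters(work_chain, reference_chain).length > 0).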
If you want to use an NSPredicate, you can do:
NSPredicate *predicate = [NSPredicate predicateWithFormat:@"SELF MATCHES '[0-9*]+'"];
if ([predicate evaluateWithObject:@"0*2481347*"]) {
    NSLog(@"passes!");
} else {
    NSLog(@"fails!");
}
This is using NSPredicate's built-in regular expression matching stuff. :)