NSRegularExpression and NSString comparison - objective-c

I want to use NSRegularExpression to perform some operations in the right order. I have an NSString:
my_simple_string
I want to call a method (my own; which one doesn't matter here) in a CSS-like style, so my NSDictionary contains these NSStrings:
*
my*
my_simple*
my_simple_string
I want my method to be called for every one of the values above. Currently I use
isEqualToString:
and compare every substring. That is not a perfect solution, and while researching on the web I found a post suggesting NSRegularExpression for this, but I have no idea how it could help.
One more important thing: "_" is not a separator. I don't have a separator.
EDIT:
NSString *str = @"my_simple_string";
NSArray *arr = [NSArray arrayWithObjects:@"*",@"my*",@"my_simple*",@"my_simple_string*", nil];
for(int i = 0 ; i < [str length] ; i++) {
    NSString *cut = [str substringToIndex:i];
    cut = [cut stringByAppendingString:@"*"];
    for(int j = 0; j < [arr count] ; j++)
        if([cut isEqualToString:[arr objectAtIndex:j]])
            NSLog(@"find it!");
}
NSLog(@"-----");

This looks like a typical regex application. You can find examples of how to use NSRegularExpression in the Apple documentation.
If this is the first time you are dealing with regexes, it is worth working through a tutorial first (there are plenty online).
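As a rough sketch of how NSRegularExpression could apply to the patterns in the question: each stored key is turned into an anchored regex in which `*` becomes `.*`, and the string is tested against it. The helper name `WildcardPatternMatches` is made up for illustration, and reading `*` as "any run of characters" is an assumption about what the asker intends.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: YES when `candidate` matches `pattern`, where `*` in
// `pattern` is assumed to mean "any run of characters" (including none).
static BOOL WildcardPatternMatches(NSString *pattern, NSString *candidate) {
    // Escape each literal piece so characters such as "." or "(" are not
    // interpreted as regex syntax, then join the pieces with ".*".
    NSMutableArray<NSString *> *escapedPieces = [NSMutableArray array];
    for (NSString *piece in [pattern componentsSeparatedByString:@"*"]) {
        [escapedPieces addObject:[NSRegularExpression escapedPatternForString:piece]];
    }
    NSString *body = [escapedPieces componentsJoinedByString:@".*"];
    NSString *anchored = [NSString stringWithFormat:@"^%@$", body];

    NSError *error = nil;
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:anchored
                                                  options:NSRegularExpressionDotMatchesLineSeparators
                                                    error:&error];
    if (regex == nil) {
        return NO; // the pattern could not be compiled
    }
    NSRange whole = NSMakeRange(0, candidate.length);
    return [regex firstMatchInString:candidate options:0 range:whole] != nil;
}
```

With this, you could loop over the dictionary's keys and call your method for every key where `WildcardPatternMatches(key, @"my_simple_string")` returns YES, instead of building and comparing every prefix by hand.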

Related

Call a method on every word in NSString

I would like to loop through an NSString and call a custom function on every word that meets a certain criterion (for example, "has 2 'L's"). I was wondering what the best way of approaching that is. Should I use Find/Replace patterns? Blocks?
-(NSString *)convert:(NSString *)wordToConvert{
    /// This I have already written
    return finalWord;
}
-(NSString *) method:(NSString *) sentenceContainingWords{
    // match every word that meets the criteria (for example the 2Ls) and replace it with what convert: does.
}
To enumerate the words in a string, you should use -[NSString enumerateSubstringsInRange:options:usingBlock:] with NSStringEnumerationByWords and NSStringEnumerationLocalized. All of the other methods listed use a means of identifying words which may not be locale-appropriate or correspond to the system definition. For example, two words separated by a comma but not whitespace (e.g. "foo,bar") would not be treated as separate words by any of the other answers, but they are in Cocoa text views.
[aString enumerateSubstringsInRange:NSMakeRange(0, [aString length])
                            options:NSStringEnumerationByWords | NSStringEnumerationLocalized
                         usingBlock:^(NSString *substring, NSRange substringRange, NSRange enclosingRange, BOOL *stop){
    if ([substring rangeOfString:@"ll" options:NSCaseInsensitiveSearch].location != NSNotFound)
        /* do whatever */;
}];
As documented for -enumerateSubstringsInRange:options:usingBlock:, if you call it on a mutable string, you can safely mutate the string being enumerated within the enclosingRange. So, if you want to replace the matching words, you can with something like [aString replaceCharactersInRange:substringRange withString:replacementString].
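A minimal sketch of that in-place replacement, assuming ARC. The function name `UppercaseWordsContaining` is hypothetical, and uppercasing stands in for whatever your own convert: method does:

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: returns a copy of `sentence` in which every word that
// contains `needle` (case-insensitively) has been replaced by its uppercase form.
static NSString *UppercaseWordsContaining(NSString *sentence, NSString *needle) {
    NSMutableString *result = [sentence mutableCopy];
    [result enumerateSubstringsInRange:NSMakeRange(0, result.length)
                               options:NSStringEnumerationByWords | NSStringEnumerationLocalized
                            usingBlock:^(NSString *substring, NSRange substringRange,
                                         NSRange enclosingRange, BOOL *stop) {
        if ([substring rangeOfString:needle options:NSCaseInsensitiveSearch].location != NSNotFound) {
            // The mutation stays inside substringRange, which lies within enclosingRange.
            [result replaceCharactersInRange:substringRange
                                  withString:[substring uppercaseString]];
        }
    }];
    return [result copy];
}
```

Calling `UppercaseWordsContaining(@"hello small world", @"ll")` leaves "world" alone and rewrites the other two words.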
The two ways I know of looping an array that will work for you are as follows:
NSArray *words = [sentence componentsSeparatedByCharactersInSet:[NSCharacterSet whitespaceAndNewlineCharacterSet]];
for (NSString *word in words)
{
NSString *transformedWord = [obj method:word];
}
and
NSArray *words = [sentence componentsSeparatedByCharactersInSet:[NSCharacterSet whitespaceAndNewlineCharacterSet]];
[words enumerateObjectsWithOptions:NSEnumerationConcurrent usingBlock:^(id word, NSUInteger idx, BOOL *stop){
NSString *transformedWord = [obj method:word];
}];
The other method, –makeObjectsPerformSelector:withObject:, won't work for you. It expects to be able to call [word method:obj] which is backwards from what you expect.
If you could write your criteria with regular expressions, then you could probably do a regular expression matching to fetch these words and then pass them to your convert: method.
You could also do a split of string into an array of words using componentsSeparatedByString: or componentsSeparatedByCharactersInSet:, then go over the words in the array and detect if they fit your criteria somehow. If they fit, then pass them to convert:.
Hope this helps.
As of iOS 12/macOS 10.14 the recommended way to do this is with the Natural Language framework.
For example:
import NaturalLanguage
let myString = "..."
let tokeniser = NLTokenizer(unit: .word)
tokeniser.string = myString
tokeniser.enumerateTokens(in: myString.startIndex..<myString.endIndex) { wordRange, attributes in
performActionOnWord(myString[wordRange])
return true // or return false to stop enumeration
}
Using NLTokenizer also has the benefit of allowing you to optionally specify the language of the string beforehand:
tokeniser.setLanguage(.hebrew)
I would recommend using a while loop to go through the string like this.
NSUInteger length = [sentenceContainingWords length];
NSUInteger wordStart = 0;
while (wordStart < length) {
    // find the next space at or after the start of the current word
    NSRange spaceRange = [sentenceContainingWords rangeOfString:@" " options:0 range:NSMakeRange(wordStart, length - wordStart)];
    NSUInteger wordEnd = (spaceRange.location == NSNotFound) ? length : spaceRange.location;
    if (wordEnd > wordStart) { // skip the empty "words" between consecutive spaces
        NSString *wordString = [sentenceContainingWords substringWithRange:NSMakeRange(wordStart, wordEnd - wordStart)];
        [self convert:wordString];
    }
    wordStart = wordEnd + 1; // step past the space
}
This code would probably need to be rewritten because it's pretty rough, but you should get the idea.
Edit: Just saw Jacob Gorban's post, you should definitely do it like that.

Finding 2 Capitalized Words in a Row NSString

I'm writing a Mac app that goes through an NSString and adds all its words to an NSArray (by separating them based on whitespace). I've got the whole system working, but I still have one small problem: names (first + last) are added as two different words, and that bothers me.
I thought of a couple solutions to fix this. My best idea was to, before actually adding the words to the array, join two words in a row that are capitalized. Then, through an if statement, determine if a word has two capitals in it, and then split the word and add it as one word. However, I can't find a way to find 2 words in a row with capitals.
Should I be using RegexKitLite (which I'm not familiar with), for example, to find two capitalized words in a row? I've seen this question: Regexp to pull capitalized words not at the beginning of sentence and two adjacent words
which seems somehow related, but due to my lack of understanding of regular expressions, I don't really know if this is exactly what I need.
I've also seen this: Separating NSString into NSArray, but allowing quotes to group words
which is also similar, yet not exactly adapted to my needs.
So, to conclude, does anyone know how to either join capitalized words in an NSString, or even better, how to find two capitalized words in a row in an NSString ?
If you're targeting iOS 4.0 or later, or OS X 10.7 or later, you can use NSRegularExpression:
NSError *error = nil;
NSRegularExpression *regex = [NSRegularExpression
    regularExpressionWithPattern:@"[A-Z]\\w*\\s[A-Z]\\w*"
    options:0
    error:&error];
NSString *inputString = @"One two Three Four five six Seven Eight";
NSArray *stringsWithTwoCapitalizedWordsInARow = [regex
    matchesInString:inputString
    options:0
    range:NSMakeRange(0, [inputString length])];
The array holds NSTextCheckingResult objects; taking the substring of inputString at each result's range gives you something like this:
["Three Four", "Seven Eight"]
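For completeness, here is a small sketch of turning the match results into actual strings. The function name `CapitalizedWordPairs` is my own invention; the pattern is the one from this answer.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: returns every "Capitalized Capitalized" pair found in `input`.
static NSArray<NSString *> *CapitalizedWordPairs(NSString *input) {
    NSError *error = nil;
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:@"[A-Z]\\w*\\s[A-Z]\\w*"
                                                  options:0
                                                    error:&error];
    if (regex == nil) {
        return @[]; // the pattern failed to compile
    }
    NSMutableArray<NSString *> *pairs = [NSMutableArray array];
    NSArray<NSTextCheckingResult *> *matches =
        [regex matchesInString:input options:0 range:NSMakeRange(0, input.length)];
    for (NSTextCheckingResult *match in matches) {
        // Each result only carries a range; the text comes from the original string.
        [pairs addObject:[input substringWithRange:match.range]];
    }
    return [pairs copy];
}
```

Note that matches do not overlap, so three capitalized words in a row would yield only the first two as a pair.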
You could just do a second pass on the resulting array after it has been loaded to append entries together that need to be joined.
Names are notoriously difficult to match with regular expressions alone, as it is not unheard of for names (first or last) to contain spaces themselves.
NSMutableArray* words = ...;
NSMutableArray* joinedWords = [NSMutableArray array];
for (int i=0; i < [words count]; i++)
{
    NSString* currentLine = [words objectAtIndex:i];
    bool capitalized = false;
    bool capitalizedNext = false;
    capitalized = isCap(currentLine); // Up to your discretion here
    NSString* nextLine = nil;
    // the last entry has no next word
    if (i+1 < [words count])
    {
        nextLine = [words objectAtIndex:i+1];
        capitalizedNext = isCap(nextLine);
    }
    // Check if first letter is uppercase
    if (capitalized == true && capitalizedNext == true)
    {
        [words replaceObjectAtIndex:i withObject:[NSString stringWithFormat:@"%@ %@", currentLine, nextLine]];
        [words removeObjectAtIndex:i+1];
        // Run test again on new version of the line
        i--;
    }
    else
    {
        [joinedWords addObject:currentLine];
    }
}
[A-Z][A-Za-z]* [A-Z][A-Za-z]*|[\S]*
http://rubular.com/r/DrOabOAfBr
I've written a regular expression for you. This regex will try to match a name first, then fall back to a word, so your job is as simple as feeding this into NSRegularExpression, and take all the matches as your words, or names joined.
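Roughly, feeding it into NSRegularExpression could look like the sketch below. The function name `NameAwareTokens` is hypothetical. I wrote `\S*` instead of `[\S]*` (same meaning), and because that second alternative can also match an empty string, zero-length matches are skipped.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: splits `input` into tokens, keeping two adjacent
// capitalized words together as a single token.
static NSArray<NSString *> *NameAwareTokens(NSString *input) {
    NSString *pattern = @"[A-Z][A-Za-z]* [A-Z][A-Za-z]*|\\S*";
    NSError *error = nil;
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:pattern options:0 error:&error];
    if (regex == nil) {
        return @[]; // the pattern failed to compile
    }
    NSMutableArray<NSString *> *tokens = [NSMutableArray array];
    for (NSTextCheckingResult *match in
         [regex matchesInString:input options:0 range:NSMakeRange(0, input.length)]) {
        if (match.range.length == 0) {
            continue; // the fallback alternative matched nothing (e.g. at a space)
        }
        [tokens addObject:[input substringWithRange:match.range]];
    }
    return [tokens copy];
}
```

As noted in the other answer, this still treats any two capitalized words in a row as a name, so the first word of a sentence followed by a proper noun would be joined as well.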

Objective-C: -[NSString wordCount]

What's a simple implementation of the following NSString category method that returns the number of words in self, where words are separated by any number of consecutive spaces or newline characters? Also, the string will be less than 140 characters, so in this case, I prefer simplicity & readability at the sacrifice of a bit of performance.
@interface NSString (Additions)
- (NSUInteger)wordCount;
@end
I found the following solutions:
implementation of -[NSString wordCount]
implementation of -[NSString wordCount] - seems a bit simpler
But, isn't there a simpler way?
Why not just do the following?
- (NSUInteger)wordCount {
    NSCharacterSet *separators = [NSCharacterSet whitespaceAndNewlineCharacterSet];
    NSArray *words = [self componentsSeparatedByCharactersInSet:separators];
    NSIndexSet *separatorIndexes = [words indexesOfObjectsPassingTest:^BOOL(id obj, NSUInteger idx, BOOL *stop) {
        return [obj isEqualToString:@""];
    }];
    return [words count] - [separatorIndexes count];
}
I believe you have identified the 'simplest'. Nevertheless, to answer to your original question - "a simple implementation of the following NSString category...", and have it posted directly here for posterity:
@implementation NSString (GSBString)
- (NSUInteger)wordCount
{
    __block NSUInteger words = 0;
    [self enumerateSubstringsInRange:NSMakeRange(0,self.length)
                             options:NSStringEnumerationByWords
                          usingBlock:^(NSString *substring, NSRange substringRange, NSRange enclosingRange, BOOL *stop) {words++;}];
    return words;
}
@end
There are a number of simpler implementations, but they all have tradeoffs. For example, Cocoa (but not Cocoa Touch) has word-counting baked in:
- (NSUInteger)wordCount {
return [[NSSpellChecker sharedSpellChecker] countWordsInString:self language:nil];
}
It's also trivial to count words as accurately as the scanner simply using [[self componentsSeparatedByCharactersInSet:[NSCharacterSet whitespaceAndNewlineCharacterSet]] count]. But I've found the performance of that method degrades a lot for longer strings.
So it depends on the tradeoffs you want to make. I've found the absolute fastest is just to go straight-up ICU. If you want simplest, using existing code is probably simpler than writing any code at all.
- (NSUInteger) wordCount
{
    NSArray *words = [self componentsSeparatedByString:@" "];
    return [words count];
}
Looks like the second link I gave in my question still reigns as not only the fastest but also, in hindsight, a relatively simple implementation of -[NSString wordCount].
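An Objective-C one-liner version (split on whitespace, then drop the empty pieces left by consecutive separators):
NSUInteger wordCount = [[[word componentsSeparatedByCharactersInSet:NSCharacterSet.whitespaceAndNewlineCharacterSet] filteredArrayUsingPredicate:[NSPredicate predicateWithFormat:@"length > 0"]] count];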
Swift 3:
let words: [Any] = (string.components(separatedBy: " "))
let count = words.count

Sorting NSStrings of Numbers

So I have an NSDictionary where the keys are years as NSString's and the value for each key is also an NSString which is sort of a description for the year. So for example, one key is "943 B.C.", another "1886". The problem I am encountering is that I want to sort them, naturally, in ascending order.
The thing is that the data source of these years is already in order, it's just that when I go ahead and call setValue:forKey the order is lost, naturally. I imagine figuring out a way to sort these NSString's might be a pain and instead I should look for a method of preserving the order at the insertion phase. What should I do? Should I instead make this an NSMutableArray in which every object is actually an NSDictionary consisting of the key being the year and the value being the description?
I guess I just answered my own question, but to avoid having wasted this time I'll leave this up in case anyone can recommend a better way of doing this.
Thanks!
EDIT: I went ahead with my own idea of NSMutableArray with NSDictionary entries to hold the key/value pairs. This is how I am accessing the information later on, hopefully I'm doing this correctly:
// parsedData is the NSMutableArray which holds the NSDictionary entries
for (id entry in parsedData) {
    NSString *year = [[entry allKeys] objectAtIndex:0];
    NSString *text = [entry objectForKey:year];
    NSLog(@"Year: %@, Text: %@", year, text);
}
Maintain a NSMutableArray to store the keys in order, in addition to the NSDictionary which holds all key-value pairs.
Here is a similar question.
You could either do it as an array of dictionaries, as you suggest, or as an array of strings where the strings are the keys to your original dictionary. The latter is probably a simpler way of going about it. NSDictionary does not, as I understand it, maintain any particular ordering of its keys, so attempting to sort the values there may be unwise.
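A minimal sketch of the "array of keys plus dictionary" variant, assuming ARC. The helper names `AddYear` and `LinesInInsertionOrder` are made up for illustration:

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: stores `text` under `year` and remembers the order in
// which years were first inserted.
static void AddYear(NSMutableArray<NSString *> *orderedYears,
                    NSMutableDictionary<NSString *, NSString *> *textByYear,
                    NSString *year, NSString *text) {
    if (textByYear[year] == nil) {
        [orderedYears addObject:year]; // first time we see this key
    }
    textByYear[year] = text;
}

// Hypothetical helper: walks the keys in insertion order and looks each one up.
static NSArray<NSString *> *LinesInInsertionOrder(NSArray<NSString *> *orderedYears,
                                                  NSDictionary<NSString *, NSString *> *textByYear) {
    NSMutableArray<NSString *> *lines = [NSMutableArray array];
    for (NSString *year in orderedYears) {
        [lines addObject:[NSString stringWithFormat:@"Year: %@, Text: %@", year, textByYear[year]]];
    }
    return [lines copy];
}
```

The dictionary keeps lookups by year cheap, while the array is the only thing that defines the order, so no sorting of strings like "943 B.C." is needed.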
I needed to solve a similar problem to sort strings of operating system names, such as "Ubuntu 10.04 (lucid)".
In my case, the string could have any value, so I sort by tokenizing and testing to see if a token is a number. I'm also accounting for a string like "8.04.2" being considered a number, so I have a nested level of tokenizing. Luckily, the nested loop is typically only one iteration.
This is from the upcoming OpenStack iPhone app.
- (NSComparisonResult)compare:(ComputeModel *)aComputeModel {
    NSComparisonResult result = NSOrderedSame;
    NSNumberFormatter *formatter = [[NSNumberFormatter alloc] init];
    NSArray *tokensA = [self.name componentsSeparatedByString:@" "];
    NSArray *tokensB = [aComputeModel.name componentsSeparatedByString:@" "];
    for (int i = 0; (i < [tokensA count] || i < [tokensB count]) && result == NSOrderedSame; i++) {
        // a missing token counts as an empty string, so objectAtIndex: never goes out of bounds
        NSString *tokenA = i < [tokensA count] ? [tokensA objectAtIndex:i] : @"";
        NSString *tokenB = i < [tokensB count] ? [tokensB objectAtIndex:i] : @"";
        // problem: 8.04.2 is not a number, so we need to tokenize again on .
        NSArray *versionTokensA = [tokenA componentsSeparatedByString:@"."];
        NSArray *versionTokensB = [tokenB componentsSeparatedByString:@"."];
        for (int j = 0; (j < [versionTokensA count] || j < [versionTokensB count]) && result == NSOrderedSame; j++) {
            NSString *versionTokenA = j < [versionTokensA count] ? [versionTokensA objectAtIndex:j] : @"";
            NSString *versionTokenB = j < [versionTokensB count] ? [versionTokensB objectAtIndex:j] : @"";
            NSNumber *numberA = [formatter numberFromString:versionTokenA];
            NSNumber *numberB = [formatter numberFromString:versionTokenB];
            if (numberA && numberB) {
                result = [numberA compare:numberB];
            } else {
                result = [versionTokenA compare:versionTokenB];
            }
        }
    }
    [formatter release];
    return result;
}

NSString - Convert to pure alphabet only (i.e. remove accents+punctuation)

I'm trying to compare names without any punctuation, spaces, accents etc.
At the moment I am doing the following:
-(NSString*) prepareString:(NSString*)a {
    //remove any accents and punctuation;
    a=[[[NSString alloc] initWithData:[a dataUsingEncoding:NSASCIIStringEncoding allowLossyConversion:YES] encoding:NSASCIIStringEncoding] autorelease];
    a=[a stringByReplacingOccurrencesOfString:@" " withString:@""];
    a=[a stringByReplacingOccurrencesOfString:@"'" withString:@""];
    a=[a stringByReplacingOccurrencesOfString:@"`" withString:@""];
    a=[a stringByReplacingOccurrencesOfString:@"-" withString:@""];
    a=[a stringByReplacingOccurrencesOfString:@"_" withString:@""];
    a=[a lowercaseString];
    return a;
}
However, I need to do this for hundreds of strings and I need to make this more efficient. Any ideas?
NSString* finish = [[start componentsSeparatedByCharactersInSet:[[NSCharacterSet letterCharacterSet] invertedSet]] componentsJoinedByString:@""];
Before using any of these solutions, don't forget to use decomposedStringWithCanonicalMapping to decompose any accented letters. This will turn, for example, é (U+00E9) into e ‌́ (U+0065 U+0301). Then, when you strip out the non-alphanumeric characters, the unaccented letters will remain.
The reason why this is important is that you probably don't want, say, “dän” and “dün”* to be treated as the same. If you stripped out all accented letters, as some of these solutions may do, you'll end up with “dn”, so those strings will compare as equal.
So, you should decompose them first, so that you can strip the accents and leave the letters.
*Example from German. Thanks to Joris Weimar for providing it.
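Here is a sketch of "decompose first, then strip", assuming ARC. One caveat worth knowing: letterCharacterSet covers Unicode categories L* and M*, so the combining marks produced by decomposition also count as "letters". The sketch therefore removes nonBaseCharacterSet (the M* marks) from the accepted set explicitly. The function name `LettersOnlyWithoutAccents` is hypothetical.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: decomposes accented letters, then keeps only base letters.
static NSString *LettersOnlyWithoutAccents(NSString *input) {
    // "é" (U+00E9) becomes "e" followed by the combining acute accent (U+0301).
    NSString *decomposed = [input decomposedStringWithCanonicalMapping];

    // Accept letters, but not combining marks (which letterCharacterSet includes).
    NSMutableCharacterSet *accepted = [[NSCharacterSet letterCharacterSet] mutableCopy];
    [accepted formIntersectionWithCharacterSet:[[NSCharacterSet nonBaseCharacterSet] invertedSet]];

    // Split on everything that is not accepted and glue the pieces back together.
    NSArray<NSString *> *pieces =
        [decomposed componentsSeparatedByCharactersInSet:[accepted invertedSet]];
    return [pieces componentsJoinedByString:@""];
}
```

With this, "dän" and "dün" come out as "dan" and "dun" rather than both collapsing to "dn".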
On a similar question, Ole Begemann suggests using stringByFoldingWithOptions: and I believe this is the best solution here:
NSString *accentedString = @"ÁlgeBra";
NSString *unaccentedString = [accentedString stringByFoldingWithOptions:NSDiacriticInsensitiveSearch locale:[NSLocale currentLocale]];
Depending on the nature of the strings you want to convert, you might want to set a fixed locale (e.g. English) instead of using the user's current locale. That way, you can be sure to get the same results on every machine.
One important clarification regarding BillyTheKid18756's answer (Luiz corrected it, but that was not obvious from the explanation of the code):
DO NOT USE stringWithCString as a second step to remove accents. It can add unwanted characters at the end of your string, because the NSData is not NULL-terminated (as stringWithCString expects).
Or use it and add an additional NULL byte to your NSData, like Luiz did in his code.
I think a simpler answer is to replace:
NSString *sanitizedText = [NSString stringWithCString:[sanitizedData bytes] encoding:NSASCIIStringEncoding];
By:
NSString *sanitizedText = [[[NSString alloc] initWithData:sanitizedData encoding:NSASCIIStringEncoding] autorelease];
If I take back the code of BillyTheKid18756, here is the complete correct code:
// The input text
NSString *text = @"BûvérÈ!#$&%^&(*^(_()-*/48";
// Defining what characters to accept
NSMutableCharacterSet *acceptedCharacters = [[NSMutableCharacterSet alloc] init];
[acceptedCharacters formUnionWithCharacterSet:[NSCharacterSet letterCharacterSet]];
[acceptedCharacters formUnionWithCharacterSet:[NSCharacterSet decimalDigitCharacterSet]];
[acceptedCharacters addCharactersInString:@" _-.!"];
// Turn accented letters into normal letters (optional)
NSData *sanitizedData = [text dataUsingEncoding:NSASCIIStringEncoding allowLossyConversion:YES];
// Corrected back-conversion from NSData to NSString
NSString *sanitizedText = [[[NSString alloc] initWithData:sanitizedData encoding:NSASCIIStringEncoding] autorelease];
// Removing unaccepted characters
NSString* output = [[sanitizedText componentsSeparatedByCharactersInSet:[acceptedCharacters invertedSet]] componentsJoinedByString:@""];
If you are trying to compare strings, use one of these methods. Don't try to change data.
- (NSComparisonResult)localizedCompare:(NSString *)aString
- (NSComparisonResult)localizedCaseInsensitiveCompare:(NSString *)aString
- (NSComparisonResult)compare:(NSString *)aString options:(NSStringCompareOptions)mask range:(NSRange)range locale:(id)locale
You NEED to consider the user's locale to do things right with strings, particularly things like names.
In most languages, characters like ä and å are not the same other than they look similar. They are inherently distinct characters with meaning distinct from others, but the actual rules and semantics are distinct to each locale.
The correct way to compare and sort strings is by considering the user's locale. Anything else is naive, wrong and very 1990's. Stop doing it.
If you are trying to pass data to a system that cannot support non-ASCII, well, this is just a wrong thing to do. Pass it as data blobs.
https://developer.apple.com/library/ios/documentation/cocoa/Conceptual/Strings/Articles/SearchingStrings.html
Also normalize your strings first (see Peter Hosey's post), either precomposing or decomposing; basically, pick one normalized form and stick to it.
- (NSString *)decomposedStringWithCanonicalMapping
- (NSString *)decomposedStringWithCompatibilityMapping
- (NSString *)precomposedStringWithCanonicalMapping
- (NSString *)precomposedStringWithCompatibilityMapping
No, it's not nearly as simple and easy as we tend to think.
Yes, it requires informed and careful decision making. (and a bit of non-English language experience helps)
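As an illustration only, a comparison that leaves the data untouched might look like the sketch below. The function name `NamesMatch` is hypothetical, and whether case and diacritics should be ignored at all is an assumption that depends on your data and locale.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: compares two names without modifying either string,
// ignoring case and diacritics according to the rules of `locale`.
static BOOL NamesMatch(NSString *a, NSString *b, NSLocale *locale) {
    NSStringCompareOptions options = NSCaseInsensitiveSearch | NSDiacriticInsensitiveSearch;
    NSComparisonResult result = [a compare:b
                                   options:options
                                     range:NSMakeRange(0, a.length) // whole receiver
                                    locale:locale];
    return result == NSOrderedSame;
}
```

In an app you would normally pass `[NSLocale currentLocale]`, so the comparison follows the user's own conventions.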
Consider using the RegexKit framework. You could do something like:
NSString *searchString = @"This is neat.";
NSString *regexString = @"[\\W]"; // the backslash must be doubled inside an Objective-C string literal
NSString *replaceWithString = @"";
NSString *replacedString = [searchString stringByReplacingOccurrencesOfRegex:regexString withString:replaceWithString];
NSLog (@"%@", replacedString);
//... Thisisneat
Consider using NSScanner, and specifically the methods -setCharactersToBeSkipped: (which accepts an NSCharacterSet) and -scanString:intoString: (which accepts a string and returns the scanned string by reference).
You may also want to couple this with -[NSString localizedCompare:], or perhaps -[NSString compare:options:] with the NSDiacriticInsensitiveSearch option. That could simplify having to remove/replace accents, so you can focus on removing punctuation, whitespace, etc.
If you must use an approach like you presented in your question, at least use an NSMutableString and replaceOccurrencesOfString:withString:options:range: — that will be much more efficient than creating tons of nearly-identical autoreleased strings. It could be that just reducing the number of allocations will boost performance "enough" for the time being.
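A sketch of that last suggestion, assuming ARC. The function name `StringByRemovingSubstrings` is made up; the list of substrings to strip would be the ones from the question.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: removes every occurrence of each string in `unwanted`,
// reusing a single mutable buffer instead of allocating a new string per step.
static NSString *StringByRemovingSubstrings(NSString *input, NSArray<NSString *> *unwanted) {
    NSMutableString *buffer = [input mutableCopy];
    for (NSString *piece in unwanted) {
        // The range is recomputed each time because the buffer shrinks as we go.
        [buffer replaceOccurrencesOfString:piece
                                withString:@""
                                   options:NSLiteralSearch
                                     range:NSMakeRange(0, buffer.length)];
    }
    return [buffer copy];
}
```

For the question's case the call would be something like `StringByRemovingSubstrings(name, @[@" ", @"'", @"`", @"-", @"_"])`, followed by lowercaseString.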
To give a complete example by combining the answers from Luiz and Peter, adding a few lines, you get the code below.
The code does the following:
Creates a set of accepted characters
Turn accented letters into normal letters
Remove characters not in the set
Objective-C
// The input text
NSString *text = @"BûvérÈ!#$&%^&(*^(_()-*/48";
// Create set of accepted characters
NSMutableCharacterSet *acceptedCharacters = [[NSMutableCharacterSet alloc] init];
[acceptedCharacters formUnionWithCharacterSet:[NSCharacterSet letterCharacterSet]];
[acceptedCharacters formUnionWithCharacterSet:[NSCharacterSet decimalDigitCharacterSet]];
[acceptedCharacters addCharactersInString:@" _-.!"];
// Turn accented letters into normal letters (optional)
NSData *sanitizedData = [text dataUsingEncoding:NSASCIIStringEncoding allowLossyConversion:YES];
NSString *sanitizedText = [NSString stringWithCString:[sanitizedData bytes] encoding:NSASCIIStringEncoding];
// Remove characters not in the set
NSString* output = [[sanitizedText componentsSeparatedByCharactersInSet:[acceptedCharacters invertedSet]] componentsJoinedByString:@""];
Swift (2.2) example
let text = "BûvérÈ!#$&%^&(*^(_()-*/48"
// Create set of accepted characters
let acceptedCharacters = NSMutableCharacterSet()
acceptedCharacters.formUnionWithCharacterSet(NSCharacterSet.letterCharacterSet())
acceptedCharacters.formUnionWithCharacterSet(NSCharacterSet.decimalDigitCharacterSet())
acceptedCharacters.addCharactersInString(" _-.!")
// Turn accented letters into normal letters (optional)
let sanitizedData = text.dataUsingEncoding(NSASCIIStringEncoding, allowLossyConversion: true)
let sanitizedText = String(data: sanitizedData!, encoding: NSASCIIStringEncoding)
// Remove characters not in the set
let components = sanitizedText!.componentsSeparatedByCharactersInSet(acceptedCharacters.invertedSet)
let output = components.joinWithSeparator("")
Output
The output for both examples would be: BuverE!_-48
Just bumped into this, maybe it's too late, but here is what worked for me:
// text is the input string, and this just removes accents from the letters
// lossy encoding turns accented letters into normal letters
// (mutableCopy is needed because dataUsingEncoding: returns an immutable NSData)
NSMutableData *sanitizedData = [[text dataUsingEncoding:NSASCIIStringEncoding
                                  allowLossyConversion:YES] mutableCopy];
// increase length by 1 adds a 0 byte (increaseLengthBy
// guarantees to fill the new space with 0s), effectively turning
// sanitizedData into a c-string
[sanitizedData increaseLengthBy:1];
// now we just create a string with the c-string in sanitizedData
NSString *final = [NSString stringWithCString:[sanitizedData bytes] encoding:NSASCIIStringEncoding];
@interface NSString (Filtering)
- (NSString*)stringByFilteringCharacters:(NSCharacterSet*)charSet;
@end
@implementation NSString (Filtering)
- (NSString*)stringByFilteringCharacters:(NSCharacterSet*)charSet {
    NSMutableString * mutString = [NSMutableString stringWithCapacity:[self length]];
    for (NSUInteger i = 0; i < [self length]; i++){
        // unichar (not char), so characters outside ASCII are not truncated
        unichar c = [self characterAtIndex:i];
        if(![charSet characterIsMember:c]) [mutString appendFormat:@"%C", c];
    }
    return [NSString stringWithString:mutString];
}
@end
These answers didn't work as expected for me. Specifically, decomposedStringWithCanonicalMapping didn't strip accents/umlauts as I'd expected.
Here's a variation on what I used that answers the brief:
// replace accents, umlauts etc with equivalent letter i.e 'é' becomes 'e'.
// Always use en_GB (or a locale without the characters you wish to strip) as locale, no matter which language we're taking as input
NSString *processedString = [string stringByFoldingWithOptions: NSDiacriticInsensitiveSearch locale: [NSLocale localeWithLocaleIdentifier: @"en_GB"]];
// remove non-letters
processedString = [[processedString componentsSeparatedByCharactersInSet:[[NSCharacterSet letterCharacterSet] invertedSet]] componentsJoinedByString:@""];
// trim whitespace
processedString = [processedString stringByTrimmingCharactersInSet: [NSCharacterSet whitespaceCharacterSet]];
return processedString;
Peter's Solution in Swift:
let newString = oldString.componentsSeparatedByCharactersInSet(NSCharacterSet.letterCharacterSet().invertedSet).joinWithSeparator("")
Example:
let oldString = "Jo_ - h !. nn y"
// "Jo_ - h !. nn y"
oldString.componentsSeparatedByCharactersInSet(NSCharacterSet.letterCharacterSet().invertedSet)
// ["Jo", "h", "nn", "y"]
oldString.componentsSeparatedByCharactersInSet(NSCharacterSet.letterCharacterSet().invertedSet).joinWithSeparator("")
// "Johnny"
I wanted to filter out everything except letters and numbers, so I adapted Lorean's implementation of a Category on NSString to work a little different. In this example, you specify a string with only the characters you want to keep, and everything else is filtered out:
@interface NSString (PraxCategories)
+ (NSString *)lettersAndNumbers;
- (NSString*)stringByKeepingOnlyLettersAndNumbers;
- (NSString*)stringByKeepingOnlyCharactersInString:(NSString *)string;
@end
@implementation NSString (PraxCategories)
+ (NSString *)lettersAndNumbers { return @"abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789"; }
- (NSString*)stringByKeepingOnlyLettersAndNumbers {
    return [self stringByKeepingOnlyCharactersInString:[NSString lettersAndNumbers]];
}
- (NSString*)stringByKeepingOnlyCharactersInString:(NSString *)string {
    NSCharacterSet *characterSet = [NSCharacterSet characterSetWithCharactersInString:string];
    NSMutableString * mutableString = @"".mutableCopy;
    for (NSUInteger i = 0; i < [self length]; i++){
        // unichar (not char), so characters outside ASCII are not truncated
        unichar character = [self characterAtIndex:i];
        if([characterSet characterIsMember:character]) [mutableString appendFormat:@"%C", character];
    }
    return mutableString.copy;
}
@end
Once you've made your Categories, using them is trivial, and you can use them on any NSString:
NSString *string = someStringValueThatYouWantToFilter;
string = [string stringByKeepingOnlyLettersAndNumbers];
Or, for example, if you wanted to get rid of everything except vowels:
string = [string stringByKeepingOnlyCharactersInString:@"aeiouAEIOU"];
If you're still learning Objective-C and aren't using Categories, I encourage you to try them out. They're the best place to put things like this because it gives more functionality to all objects of the class you Categorize.
Categories simplify and encapsulate the code you're adding, making it easy to reuse on all of your projects. It's a great feature of Objective-C!