I am converting a paragraph into words. It contains many special characters like
" , " . `
How can I remove these characters from an NSString and keep only the alphabetic characters?
For example:
"new" to new // the special characters are removed
NSString *unfilteredString = @"!@#$%^&*()_+|abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ1234567890";
NSCharacterSet *notAllowedChars = [[NSCharacterSet characterSetWithCharactersInString:@"abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"] invertedSet];
NSString *resultString = [[unfilteredString componentsSeparatedByCharactersInSet:notAllowedChars] componentsJoinedByString:@""];
NSLog(@"Result: %@", resultString);
Try this; it may help you.
There are numerous ways of dealing with this. As an example, here's a solution using regular expressions. This is just an example. We don't know the entire range of special characters that you want to remove.
#import <Foundation/Foundation.h>
int main(int argc, const char * argv[])
{
@autoreleasepool {
NSRegularExpression *expression = [NSRegularExpression regularExpressionWithPattern:@"[,\\.`\"]"
options:0
error:NULL];
NSString *sampleString = @"The \"new\" quick brown fox, who jumped over the lazy dog.";
NSString *cleanedString = [expression stringByReplacingMatchesInString:sampleString
options:0
range:NSMakeRange(0, sampleString.length)
withTemplate:@""];
printf("cleaned = %s",[cleanedString UTF8String] );
}
return 0;
}
Per the comment of @Rostyslav Druzhchenko on the selected answer of @MadhuP:
NSString *unfilteredString = @"!@#$%^&*()_+|abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ1234567890";
NSCharacterSet *notAllowedChars = [[NSCharacterSet alphanumericCharacterSet] invertedSet];
NSString *escapedString = [[unfilteredString componentsSeparatedByCharactersInSet:notAllowedChars] componentsJoinedByString:@""];
NSLog(@"Result: %@", escapedString);
This answer uses alphanumericCharacterSet, which also covers letters and digits from other languages' scripts.
Accepted answer in Swift (though not in a Swifty way):
let notAllowedCharactersSet = NSCharacterSet(charactersInString: "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ").invertedSet
let filtered = (stringToFilter.componentsSeparatedByCharactersInSet(notAllowedCharactersSet) as NSArray).componentsJoinedByString("")
I'm sure there's a more elegant solution, but for anyone trying to do this in Swift, here's what I did to make sure there were no special characters in my users' phone numbers.
var phone = "+1 (555) 555 - 5555"
var removeChars: NSCharacterSet = NSCharacterSet(charactersInString: "1234567890").invertedSet
var charArray = phone.componentsSeparatedByCharactersInSet(removeChars)
var placeholderString = ""
var formattedPhoneNumber: String = placeholderString.join(charArray).stringByTrimmingCharactersInSet(NSCharacterSet.whitespaceCharacterSet())
The stringByTrimmingCharactersInSet might not be necessary.
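In current Swift the same filtering can be written without the split-and-join round trip. This is a minimal sketch, assuming Swift 5 with Foundation; the helper name `keepingOnly` is hypothetical and does not come from any of the answers above.

```swift
import Foundation

// Hypothetical helper: keep only the Unicode scalars that belong to `allowed`.
func keepingOnly(_ allowed: CharacterSet, in text: String) -> String {
    var result = String.UnicodeScalarView()
    for scalar in text.unicodeScalars where allowed.contains(scalar) {
        result.append(scalar)
    }
    return String(result)
}

// Letters only: quotes, commas, digits and spaces are all dropped.
let letters = keepingOnly(.letters, in: "The \"new\" fox, 42 times.")
print(letters) // Thenewfoxtimes
```

Passing `.decimalDigits` instead of `.letters` gives the phone-number variant.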
Related
I am trying to parse a set of words that contain first Greek letters, then English letters. This would be easy if there were a delimiter between the sets. This is what I've built so far:
- (void)loadWordFileToArray:(NSBundle *)bundle {
NSLog(@"loadWordFileToArray");
if (bundle != nil) {
NSString *path = [bundle pathForResource:@"alfa" ofType:@"txt"];
//pull the content from the file into memory
NSData* data = [NSData dataWithContentsOfFile:path];
//convert the bytes from the file into a string
NSString* string = [[NSString alloc] initWithBytes:[data bytes]
length:[data length]
encoding:NSUTF8StringEncoding];
//split the string around newline characters to create an array
NSString* delimiter = @"\n";
incomingWords = [string componentsSeparatedByString:delimiter];
NSLog(@"incomingWords count: %lu", (unsigned long)incomingWords.count);
}
}
-(void)parseWordArray{
NSLog(@"parseWordArray");
NSString *seperator = @" = ";
int i = 0;
for (i=0; i < incomingWords.count; i++) {
NSString *incomingString = [incomingWords objectAtIndex:i];
NSScanner *scanner = [NSScanner localizedScannerWithString: incomingString];
NSString *firstString;
NSString *secondString;
NSInteger scanPosition;
[scanner scanUpToString:seperator intoString:&firstString];
scanPosition = [scanner scanLocation];
secondString = [[scanner string] substringFromIndex:scanPosition+[seperator length]];
// NSLog(@"greek: %@", firstString);
// NSLog(@"english: %@", secondString);
[outgoingWords insertObject:[NSMutableArray arrayWithObjects:@"greek", firstString, @"english",secondString,@"category", @"", nil] atIndex:0];
[englishWords insertObject:[NSMutableArray arrayWithObjects:secondString,nil] atIndex:0];
}
}
But I cannot count on there being delimiters.
I have looked at this question. I want something similar: grab the characters in the string until an English letter is found, then put the first group into one new string and all the characters after it into a second new string.
I only have to run this a few times, so optimization is not my highest priority. Any help would be appreciated.
EDIT:
I've changed my code as shown below to make use of NSLinguisticTagger. This works, but is this the best way? Note that the language reported for English characters is, for some reason, "und".
The incoming string is: άγαλμα, το statue. Only the last 6 characters are in English.
int j = 0;
for (j=0; j<incomingString.length; j++) {
NSString *language = [tagger tagAtIndex:j scheme:NSLinguisticTagSchemeLanguage tokenRange:NULL sentenceRange:NULL];
if ([language isEqual: @"und"]) {
NSLog(@"j is: %i", j);
int k = 0;
for (k=0; k<j; k++) {
NSRange range = NSMakeRange (0, k);
NSString *tempString = [incomingString substringWithRange:range ];
NSLog (@"tempString: %@", tempString);
}
return;
}
NSLog (@"Language: %@", language);
}
Alright so what you could do is use NSLinguisticTagger to find out the language of the word (or letter) and if the language has changed then you know where to split the string. You can use NSLinguisticTagger like this:
NSArray *tagschemes = @[NSLinguisticTagSchemeLanguage];
// The options mask takes NSLinguisticTaggerOmit... values; NSLinguisticTagPunctuation is a tag, not an option.
NSLinguisticTagger *tagger = [[NSLinguisticTagger alloc] initWithTagSchemes:tagschemes options:NSLinguisticTaggerOmitPunctuation | NSLinguisticTaggerOmitWhitespace];
[tagger setString:@"This is my string in English."];
NSString *language = [tagger tagAtIndex:0 scheme:NSLinguisticTagSchemeLanguage tokenRange:NULL sentenceRange:NULL];
//Loop through each index of the string's characters and check the language as above.
//If it has changed then you can assume the language has changed.
Alternatively, you can use NSSpellChecker's requestCheckingOfString to get the dominant language in a range of characters:
NSSpellChecker *spellChecker = [NSSpellChecker sharedSpellChecker];
[spellChecker setAutomaticallyIdentifiesLanguages:YES];
NSString *spellCheckText = @"Guten Herr Mustermann. Dies ist ein deutscher Text. Bitte löschen Sie diesen nicht.";
[spellChecker requestCheckingOfString:spellCheckText
range:(NSRange){0, [spellCheckText length]}
types:NSTextCheckingTypeOrthography
options:nil
inSpellDocumentWithTag:0
completionHandler:^(NSInteger sequenceNumber, NSArray *results, NSOrthography *orthography, NSInteger wordCount) {
NSLog(@"dominant language = %@", orthography.dominantLanguage);
}];
This answer has information on how to detect the language of an NSString.
Allow me to introduce two good friends of mine.
NSCharacterSet and NSRegularExpression.
Along with them, normalization. (In Unicode terms)
First, you should normalize strings before analyzing them against a character set.
You will need to look at the choices, but normalizing to all composed forms is the way I would go.
This means an accented character is one instead of two or more.
It simplifies the number of things to compare.
Next, you can easily build your own NSCharacterSet objects from strings (loaded from files even) to use to test set membership.
Lastly, regular expressions can achieve the same thing with Unicode Property Names as classes or categories of characters. Regular expressions could be more terse but more expressive.
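The normalize-then-test-membership idea can be sketched as follows. It assumes Swift 5 with Foundation, treats "English" as the basic Latin letters A-Z and a-z, and uses a hypothetical helper name, `splitAtFirstLatinLetter`; none of these choices come from the question itself.

```swift
import Foundation

// Hypothetical helper: split `line` at the first basic Latin letter (A-Z, a-z).
func splitAtFirstLatinLetter(_ line: String) -> (greek: String, english: String) {
    // Normalize to the precomposed form (NFC) so an accented letter is one scalar.
    let normalized = line.precomposedStringWithCanonicalMapping
    // Assumption: "English" means the 52 basic Latin letters.
    let latin = CharacterSet(charactersIn: "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ")
    guard let range = normalized.rangeOfCharacter(from: latin) else {
        return (normalized, "") // no Latin letter found
    }
    let greek = normalized[..<range.lowerBound].trimmingCharacters(in: .whitespaces)
    let english = String(normalized[range.lowerBound...])
    return (greek, english)
}

let parts = splitAtFirstLatinLetter("άγαλμα, το statue")
print(parts.greek)   // άγαλμα, το
print(parts.english) // statue
```

A character set loaded from a file, or a regular expression built on a Unicode script property, could replace the hard-coded `latin` set.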
I have a string..
NSString* string = @"%B999999^PDVS123456789012^PADILLA L. ^0X0000399 ?*;999999554749123456789012=00X990300000?*";
What I want is to get the name PADILLA L. and 999999554749123456789012=00X990300000?*
Use NSString componentsSeparatedByString: to split the string up. First use @"^". The name will be at index 2. Then split the substring at index 3 using @";". The string at index 1 will give you the 2nd piece you want.
NSArray *substrings = [string componentsSeparatedByString:@"^"];
NSString *name = substrings[2];
name = [name stringByTrimmingCharactersInSet:[NSCharacterSet whitespaceCharacterSet]];
NSString *lastpart = substrings[3];
NSArray *moresubstrings = [lastpart componentsSeparatedByString:@";"];
NSString *secondPiece = moresubstrings[1];
Without more specifics here is a brute force way:
NSString* string = @"%B999999^PDVS123456789012^PADILLA L. ^0X0000399 ?*;999999554749123456789012=00X990300000?*";
NSRange nameRange = {26, 10};
NSString *name = [string substringWithRange:nameRange];
NSRange numRange = {80, 39};
NSString *num = [string substringWithRange:numRange];
The documentation is your friend: NSString Class Reference
Without knowing what the exact input pattern is (we have your n-of-1 example only), it's going to be hard to say exactly how you might parse this properly; but NSRegularExpression offers what you need (in addition to other suggested approaches):
#import <Foundation/Foundation.h>
int main(int argc, char *argv[]) {
@autoreleasepool {
NSString *sampleText = @"%B999999^PDVS123456789012^PADILLA L. ^0X0000399 ?*;999999554749123456789012=00X990300000?*";
NSError *regexError = nil;
NSRegularExpressionOptions options = 0;
NSString *pattern = @"^%\\w+\\^\\w+\\^([A-Za-z\\s]+\\.).+\\?\\*\\;(.+)\\?\\*$";
NSRegularExpression *expression = [NSRegularExpression regularExpressionWithPattern:pattern options:options error:&regexError];
// Search the whole string.
NSRange range = NSMakeRange(0, sampleText.length);
NSTextCheckingResult *match = [expression firstMatchInString:sampleText options:0 range:range];
if( match ) {
NSRange nameRange = [match rangeAtIndex:1];
NSRange numberRange = [match rangeAtIndex:2];
printf("name = %s ",[[sampleText substringWithRange:nameRange] UTF8String]);
printf("number = %s\n",[[sampleText substringWithRange:numberRange] UTF8String]);
}
}
}
This little Foundation application prints the following to the console:
name = PADILLA L. number = 999999554749123456789012=00X990300000
The regex used to analyze the input string may need to be tweaked depending on how the input string varies. Right now it is (unescaped):
^%\w+\^\w+\^([A-Za-z\s]+\.).+\?\*\;(.+)\?\*$
I am trying to replace all characters except last 4 in a String with *'s.
In Objective-C, the NSString class has a method stringByReplacingCharactersInRange:withString: to which I would give the range (0, [string length]-4) and the string @"*". This is what it does: 123456789ABCD is modified to *ABCD, while I am looking to make *********ABCD.
I understand that it replaced the range I specified with the string object. How can I accomplish this?
NSError *error;
NSRegularExpression *regex = [NSRegularExpression regularExpressionWithPattern:@"\\d" options:NSRegularExpressionCaseInsensitive error:&error];
NSString *newString = [regex stringByReplacingMatchesInString:string options:0 range:NSMakeRange(0, [string length]) withTemplate:@"*"];
This looks like a simple problem... get the first part string and return it with the last four characters appended to it.
Here is a function that returns the needed string :
-(NSString *)neededStringWithString:(NSString *)aString {
// if the string has 4 or fewer characters, return nil
if([aString length] <= 4) {
return nil;
}
NSUInteger countOfCharToReplace = [aString length] - 4;
NSString *firstPart = @"*";
while(--countOfCharToReplace) {
firstPart = [firstPart stringByAppendingString:@"*"];
}
// range for the last four
NSRange lastFourRange = NSMakeRange([aString length] - 4, 4);
// return the combined string
return [firstPart stringByAppendingString:
[aString substringWithRange:lastFourRange]];
}
The most unintuitive part in Cocoa is creating the repeating stars without some kind of awkward looping. stringByPaddingToLength:withString:startingAtIndex: allows you to create a repeating string of any length you like, so once you have that, here's a simple solution:
NSInteger starUpTo = [string length] - 4;
if (starUpTo > 0) {
NSString *stars = [@"" stringByPaddingToLength:starUpTo withString:@"*" startingAtIndex:0];
return [string stringByReplacingCharactersInRange:NSMakeRange(0, starUpTo) withString:stars];
} else {
return string;
}
I'm not sure why the accepted answer was accepted, since it only works if everything but last 4 is a digit. Here's a simple way:
NSMutableString * str1 = [[NSMutableString alloc]initWithString:@"1234567890ABCD"];
NSRange r = NSMakeRange(0, [str1 length] - 4);
[str1 replaceCharactersInRange:r withString:[[NSString string] stringByPaddingToLength:r.length withString:@"*" startingAtIndex:0]];
NSLog(@"%@",str1);
You could use [theString substringToIndex:[theString length]-4] to get the first part of the string and then combine [theString length]-4 *'s with the second part. Perhaps there is an easier way to do this.
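For comparison, building the run of stars and appending the unmasked tail is short in Swift. This is a minimal sketch in plain Swift 5; the function name `masked` and its `keepingLast` parameter are hypothetical.

```swift
// Hypothetical helper: replace every character except the last `visible` ones with "*".
func masked(_ text: String, keepingLast visible: Int = 4) -> String {
    let hiddenCount = text.count - visible
    // Strings that are too short to mask are returned unchanged.
    guard hiddenCount > 0 else { return text }
    return String(repeating: "*", count: hiddenCount) + String(text.suffix(visible))
}

print(masked("123456789ABCD")) // *********ABCD
print(masked("ABCD"))          // ABCD
```

Returning short inputs unchanged is a design choice here; returning nil, as one of the answers does, would work equally well.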
NSMutableString * str1 = [[NSMutableString alloc]initWithString:@"1234567890ABCD"];
[str1 replaceCharactersInRange:NSMakeRange(0, [str1 length] - 4) withString:@"*"];
NSLog(@"%@",str1);
it works
The regexp didn't work on iOS7, but perhaps this helps:
- (NSString *)encryptString:(NSString *)pass {
NSMutableString *secret = [NSMutableString new];
for (int i=0; i<[pass length]; i++) {
[secret appendString:@"*"];
}
return secret;
}
In your case, you should stop before the last 4 characters instead of replacing them. It's a bit crude, but it gets the job done.
What's the simplest way, given a string:
NSString *str = @"Some really really long string is here and I just want the first 10 words, for example";
to result in an NSString with the first N (e.g., 10) words?
EDIT: I'd also like to make sure it doesn't fail if the str is shorter than N.
If the words are space-separated:
NSInteger nWords = 10;
NSRange wordRange = NSMakeRange(0, nWords);
NSArray *firstWords = [[str componentsSeparatedByString:@" "] subarrayWithRange:wordRange];
If you want to break on all whitespace:
NSCharacterSet *delimiterCharacterSet = [NSCharacterSet whitespaceAndNewlineCharacterSet];
NSArray *firstWords = [[str componentsSeparatedByCharactersInSet:delimiterCharacterSet] subarrayWithRange:wordRange];
Then,
NSString *result = [firstWords componentsJoinedByString:@" "];
While Barry Wark's code works well for English, it is not the preferred way to detect word breaks. Many languages, such as Chinese and Japanese, do not separate words using spaces. And German, for example, has many compounds that are difficult to separate correctly.
What you want to use is CFStringTokenizer:
CFStringRef string; // Get string from somewhere
CFLocaleRef locale = CFLocaleCopyCurrent();
CFStringTokenizerRef tokenizer = CFStringTokenizerCreate(kCFAllocatorDefault, string, CFRangeMake(0, CFStringGetLength(string)), kCFStringTokenizerUnitWord, locale);
CFStringTokenizerTokenType tokenType = kCFStringTokenizerTokenNone;
unsigned tokensFound = 0, desiredTokens = 10; // or the desired number of tokens
while(kCFStringTokenizerTokenNone != (tokenType = CFStringTokenizerAdvanceToNextToken(tokenizer)) && tokensFound < desiredTokens) {
CFRange tokenRange = CFStringTokenizerGetCurrentTokenRange(tokenizer);
CFStringRef tokenValue = CFStringCreateWithSubstring(kCFAllocatorDefault, string, tokenRange);
// Do something with the token
CFShow(tokenValue);
CFRelease(tokenValue);
++tokensFound;
}
// Clean up
CFRelease(tokenizer);
CFRelease(locale);
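The same word-boundary approach is available from Swift through Foundation's substring enumeration. This is a minimal sketch, assuming Swift 5 with Foundation; the function name `firstWords` and its `maxWords` parameter are hypothetical, and it also handles input with fewer than N words.

```swift
import Foundation

// Hypothetical helper: return the text up to the end of the first `maxWords` words,
// letting Foundation decide where the word boundaries are.
func firstWords(_ text: String, maxWords: Int) -> String {
    guard maxWords > 0 else { return "" }
    var wordsSeen = 0
    var end = text.startIndex
    text.enumerateSubstrings(in: text.startIndex..<text.endIndex,
                             options: [.byWords, .substringNotRequired]) { _, wordRange, _, stop in
        wordsSeen += 1
        end = wordRange.upperBound
        if wordsSeen == maxWords { stop = true }
    }
    // If the text has fewer words, `end` is simply the end of the last word found.
    return String(text[..<end])
}

print(firstWords("one two three", maxWords: 2)) // one two
```

Because the result is a prefix of the original string, the punctuation and spacing between the kept words are preserved rather than rebuilt with single spaces.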
Based on Barry's answer, I wrote a function for the sake of this page (still giving him credit on SO)
+ (NSString*)firstWords:(NSString*)theStr howMany:(NSInteger)maxWords {
NSArray *theWords = [theStr componentsSeparatedByString:@" "];
if ([theWords count] < maxWords) {
maxWords = [theWords count];
}
// The range length is the number of words to keep.
NSRange wordRange = NSMakeRange(0, maxWords);
NSArray *firstWords = [theWords subarrayWithRange:wordRange];
return [firstWords componentsJoinedByString:@" "];
}
Here's my solution, derived from the answers given here, for my own problem of removing the first word from a string...
NSMutableArray *words = [NSMutableArray arrayWithArray:[lowerString componentsSeparatedByString:@" "]];
[words removeObjectAtIndex:0];
return [words componentsJoinedByString:@" "];
I have an NSString (phone number) with some parentheses and hyphens, as some phone numbers are formatted. How would I remove all characters except numbers from the string?
Old question, but how about:
NSString *newString = [[origString componentsSeparatedByCharactersInSet:
[[NSCharacterSet decimalDigitCharacterSet] invertedSet]]
componentsJoinedByString:@""];
It explodes the source string on the set of non-digits, then reassembles them using an empty string separator. Not as efficient as picking through characters, but much more compact in code.
There's no need to use a regular expressions library as the other answers suggest -- the class you're after is called NSScanner. It's used as follows:
NSString *originalString = @"(123) 123123 abc";
NSMutableString *strippedString = [NSMutableString
stringWithCapacity:originalString.length];
NSScanner *scanner = [NSScanner scannerWithString:originalString];
NSCharacterSet *numbers = [NSCharacterSet
characterSetWithCharactersInString:@"0123456789"];
while ([scanner isAtEnd] == NO) {
NSString *buffer;
if ([scanner scanCharactersFromSet:numbers intoString:&buffer]) {
[strippedString appendString:buffer];
} else {
[scanner setScanLocation:([scanner scanLocation] + 1)];
}
}
NSLog(@"%@", strippedString); // "123123123"
EDIT: I've updated the code because the original was written off the top of my head and I figured it would be enough to point the people in the right direction. It seems that people are after code they can just copy-paste straight into their application.
I also agree that Michael Pelz-Sherman's solution is more appropriate than using NSScanner, so you might want to take a look at that.
The accepted answer is overkill for what is being asked. This is much simpler:
NSString *pureNumbers = [[phoneNumberString componentsSeparatedByCharactersInSet:[[NSCharacterSet decimalDigitCharacterSet] invertedSet]] componentsJoinedByString:@""];
This is great, but the code does not work for me on the iPhone 3.0 SDK.
If I define strippedString as you show here, I get a BAD ACCESS error when trying to print it after the scanCharactersFromSet:intoString call.
If I do it like so:
NSMutableString *strippedString = [NSMutableString stringWithCapacity:10];
I end up with an empty string, but the code doesn't crash.
I had to resort to good old C instead:
for (int i=0; i<[phoneNumber length]; i++) {
if (isdigit([phoneNumber characterAtIndex:i])) {
[strippedString appendFormat:@"%c",[phoneNumber characterAtIndex:i]];
}
}
Though this is an old question with working answers, I missed international format support. Based on the solution of simonobo, the altered character set includes a plus sign "+". International phone numbers are supported by this amendment as well.
NSString *condensedPhoneNumber = [[phoneNumber componentsSeparatedByCharactersInSet:
[[NSCharacterSet characterSetWithCharactersInString:@"+0123456789"]
invertedSet]]
componentsJoinedByString:@""];
The Swift expressions are
var phoneNumber = " +1 (234) 567-1000 "
var allowedCharactersSet = NSMutableCharacterSet.decimalDigitCharacterSet()
allowedCharactersSet.addCharactersInString("+")
var condensedPhoneNumber = phoneNumber.componentsSeparatedByCharactersInSet(allowedCharactersSet.invertedSet).joinWithSeparator("")
Which yields +12345671000 as a common international phone number format.
Here is the Swift version of this.
import UIKit
import Foundation
var phoneNumber = " 1 (888) 555-5551 "
var strippedPhoneNumber = "".join(phoneNumber.componentsSeparatedByCharactersInSet(NSCharacterSet.decimalDigitCharacterSet().invertedSet))
Swift version of the most popular answer:
var newString = join("", oldString.componentsSeparatedByCharactersInSet(NSCharacterSet.decimalDigitCharacterSet().invertedSet))
Edit: Syntax for Swift 2
let newString = oldString.componentsSeparatedByCharactersInSet(NSCharacterSet.decimalDigitCharacterSet().invertedSet).joinWithSeparator("")
Edit: Syntax for Swift 3
let newString = oldString.components(separatedBy: CharacterSet.decimalDigits.inverted).joined(separator: "")
Thanks for the example. It has only one thing missing the increment of the scanLocation in case one of the characters in originalString is not found inside the numbers CharacterSet object. I have added an else {} statement to fix this.
NSString *originalString = @"(123) 123123 abc";
NSMutableString *strippedString = [NSMutableString
stringWithCapacity:originalString.length];
NSScanner *scanner = [NSScanner scannerWithString:originalString];
NSCharacterSet *numbers = [NSCharacterSet
characterSetWithCharactersInString:@"0123456789"];
while ([scanner isAtEnd] == NO) {
NSString *buffer;
if ([scanner scanCharactersFromSet:numbers intoString:&buffer]) {
[strippedString appendString:buffer];
}
// --------- Add the following to get out of endless loop
else {
[scanner setScanLocation:([scanner scanLocation] + 1)];
}
// --------- End of addition
}
NSLog(@"%@", strippedString); // "123123123"
This keeps only the digits of the mobile number:
NSString * strippedNumber = [mobileNumber stringByReplacingOccurrencesOfString:@"[^0-9]" withString:@"" options:NSRegularExpressionSearch range:NSMakeRange(0, [mobileNumber length])];
It might be worth noting that the accepted componentsSeparatedByCharactersInSet: and componentsJoinedByString:-based answer is not a memory-efficient solution. It allocates memory for the character set, for an array and for a new string. Even if these are only temporary allocations, processing lots of strings this way can quickly fill the memory.
A memory friendlier approach would be to operate on a mutable copy of the string in place. In a category over NSString:
-(NSString *)stringWithNonDigitsRemoved {
static NSCharacterSet *decimalDigits;
if (!decimalDigits) {
decimalDigits = [NSCharacterSet decimalDigitCharacterSet];
}
NSMutableString *stringWithNonDigitsRemoved = [self mutableCopy];
for (CFIndex index = 0; index < stringWithNonDigitsRemoved.length; ++index) {
unichar c = [stringWithNonDigitsRemoved characterAtIndex: index];
if (![decimalDigits characterIsMember: c]) {
[stringWithNonDigitsRemoved deleteCharactersInRange: NSMakeRange(index, 1)];
index -= 1;
}
}
return [stringWithNonDigitsRemoved copy];
}
Profiling the two approaches showed this one using about 2/3 less memory.
You can use regular expression on mutable string:
NSRegularExpression *regex = [NSRegularExpression regularExpressionWithPattern:
@"[^\\d]"
options:0
error:nil];
[regex replaceMatchesInString:str
options:0
range:NSMakeRange(0, str.length)
withTemplate:@""];
Built the top solution as a category to help with broader problems:
Interface:
@interface NSString (easyReplace)
- (NSString *)stringByReplacingCharactersNotInSet:(NSCharacterSet *)set
with:(NSString *)string;
@end
Implementation:
@implementation NSString (easyReplace)
- (NSString *)stringByReplacingCharactersNotInSet:(NSCharacterSet *)set
with:(NSString *)string
{
NSMutableString *strippedString = [NSMutableString
stringWithCapacity:self.length];
NSScanner *scanner = [NSScanner scannerWithString:self];
while ([scanner isAtEnd] == NO) {
NSString *buffer;
if ([scanner scanCharactersFromSet:set intoString:&buffer]) {
[strippedString appendString:buffer];
} else {
[scanner setScanLocation:([scanner scanLocation] + 1)];
[strippedString appendString:string];
}
}
return [NSString stringWithString:strippedString];
}
@end
Usage:
NSString *strippedString =
[originalString stringByReplacingCharactersNotInSet:
[NSCharacterSet characterSetWithCharactersInString:@"0123456789"]
with:@""];
Swift 3
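let notNumberCharacters = CharacterSet.decimalDigits.inverted
// trimmingCharacters(in:) only strips the two ends, so split and rejoin to drop inner characters too
let intString = yourString.components(separatedBy: notNumberCharacters).joined()
Swift 4.1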
var str = "75003 Paris, France"
var stringWithoutDigit = (str.components(separatedBy:CharacterSet.decimalDigits)).joined(separator: "")
print(stringWithoutDigit)
Um. The first answer seems totally wrong to me. NSScanner is really meant for parsing. Unlike regex, it has you parsing the string one tiny chunk at a time. You initialize it with a string, and it maintains an index of how far along the string it's gotten; That index is always its reference point, and any commands you give it are relative to that point. You tell it, "ok, give me the next chunk of characters in this set" or "give me the integer you find in the string", and those start at the current index, and move forward until they find something that doesn't match. If the very first character already doesn't match, then the method returns NO, and the index doesn't increment.
The code in the first example is scanning "(123)456-7890" for decimal characters, which already fails from the very first character, so the call to scanCharactersFromSet:intoString: leaves the passed-in strippedString alone, and returns NO; The code totally ignores checking the return value, leaving the strippedString unassigned. Even if the first character were a digit, that code would fail, since it would only return the digits it finds up until the first dash or paren or whatever.
If you really wanted to use NSScanner, you could put something like that in a loop, and keep checking for a NO return value, and if you get that you can increment the scanLocation and scan again; and you also have to check isAtEnd, and yada yada yada. In short, wrong tool for the job. Michael's solution is better.
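For reference, the loop described above looks roughly like this with Swift's Scanner. This is a sketch only, assuming the newer `scanCharacters(from:)` and `scanCharacter()` methods (macOS 10.15 / iOS 13 or later); the function name `digits(in:)` is hypothetical.

```swift
import Foundation

// Hypothetical helper: collect the digit runs of `text` with a Scanner.
func digits(in text: String) -> String {
    let scanner = Scanner(string: text)
    // By default the scanner skips whitespace; turn that off so every
    // character is either scanned as a digit or stepped over explicitly.
    scanner.charactersToBeSkipped = nil
    var result = ""
    while !scanner.isAtEnd {
        if let chunk = scanner.scanCharacters(from: .decimalDigits) {
            result += chunk              // a run of digits starting at the current index
        } else {
            _ = scanner.scanCharacter()  // no digit here: step over one character
        }
    }
    return result
}

print(digits(in: "(123) 123123 abc")) // 123123123
```

The explicit else branch is the part the broken first version lacked: without it the scanner never advances past a non-digit.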
For those searching for phone extraction, you can extract the phone numbers from a text using NSDataDetector, for example:
NSString *userBody = @"This is a text with 30612312232 my phone";
if (userBody != nil) {
NSError *error = NULL;
NSDataDetector *detector = [NSDataDetector dataDetectorWithTypes:NSTextCheckingTypePhoneNumber error:&error];
NSArray *matches = [detector matchesInString:userBody options:0 range:NSMakeRange(0, [userBody length])];
if (matches != nil) {
for (NSTextCheckingResult *match in matches) {
if ([match resultType] == NSTextCheckingTypePhoneNumber) {
NSLog(@"Found phone number %@", [match phoneNumber]);
}
}
}
}
I created a category on NSString to simplify this common operation.
NSString+AllowCharactersInSet.h
@interface NSString (AllowCharactersInSet)
- (NSString *)stringByAllowingOnlyCharactersInSet:(NSCharacterSet *)characterSet;
@end
NSString+AllowCharactersInSet.m
@implementation NSString (AllowCharactersInSet)
- (NSString *)stringByAllowingOnlyCharactersInSet:(NSCharacterSet *)characterSet {
NSMutableString *strippedString = [NSMutableString
stringWithCapacity:self.length];
NSScanner *scanner = [NSScanner scannerWithString:self];
while (!scanner.isAtEnd) {
NSString *buffer = nil;
if ([scanner scanCharactersFromSet:characterSet intoString:&buffer]) {
[strippedString appendString:buffer];
} else {
scanner.scanLocation = scanner.scanLocation + 1;
}
}
return strippedString;
}
@end
I think currently best way is:
phoneNumber.replacingOccurrences(of: "\\D",
with: "",
options: String.CompareOptions.regularExpression)
If you're just looking to grab the numbers from the string, you could certainly use regular expressions to parse them out. For doing regex in Objective-C, check out RegexKit. Edit: As @Nathan points out, using NSScanner is a much simpler way to parse all numbers from a string. I totally wasn't aware of that option, so props to him for suggesting it. (I don't even like using regex myself, so I prefer approaches that don't require them.)
If you want to format phone numbers for display, it's worth taking a look at NSNumberFormatter. I suggest you read through this related SO question for tips on doing so. Remember that phone numbers are formatted differently depending on location and/or locale.
Swift 5
let newString = origString.components(separatedBy: CharacterSet.decimalDigits.inverted).joined(separator: "")
Based on Jon Vogel's answer here it is as a Swift String extension along with some basic tests.
import Foundation
extension String {
func stringByRemovingNonNumericCharacters() -> String {
return self.componentsSeparatedByCharactersInSet(NSCharacterSet.decimalDigitCharacterSet().invertedSet).joinWithSeparator("")
}
}
And some tests proving at least basic functionality:
import XCTest
class StringExtensionTests: XCTestCase {
func testStringByRemovingNonNumericCharacters() {
let baseString = "123"
var testString = baseString
var newString = testString.stringByRemovingNonNumericCharacters()
XCTAssertTrue(newString == testString)
testString = "a123b"
newString = testString.stringByRemovingNonNumericCharacters()
XCTAssertTrue(newString == baseString)
testString = "a=1-2_3#b"
newString = testString.stringByRemovingNonNumericCharacters()
XCTAssertTrue(newString == baseString)
testString = "(999) 999-9999"
newString = testString.stringByRemovingNonNumericCharacters()
XCTAssertTrue(newString.characters.count == 10)
XCTAssertTrue(newString == "9999999999")
testString = "abc"
newString = testString.stringByRemovingNonNumericCharacters()
XCTAssertTrue(newString == "")
}
}
This answers the OP's question but it could be easily modified to leave in phone number related characters like ",;*#+"
NSString *originalPhoneNumber = @"(123) 123-456 abc";
NSCharacterSet *numbers = [[NSCharacterSet characterSetWithCharactersInString:@"0123456789"] invertedSet];
// stringByTrimmingCharactersInSet: only strips the two ends, so split and rejoin instead
NSString *trimmedPhoneNumber = [[originalPhoneNumber componentsSeparatedByCharactersInSet:numbers] componentsJoinedByString:@""];
Keep it simple!