Prevent small negative numbers printing as "-0" - objective-c

If I do the following in Objective-C:
NSString *result = [NSString stringWithFormat:@"%1.1f", -0.01];
It will give the result @"-0.0".
Does anybody know how I can force a result of @"0.0" (without the "-") in this case?
EDIT:
I tried using NSNumberFormatter, but it has the same issue. The following also produces @"-0.0":
double value = -0.01;
NSNumberFormatter *numberFormatter = [[NSNumberFormatter alloc] init];
[numberFormatter setNumberStyle:NSNumberFormatterDecimalStyle];
[numberFormatter setMaximumFractionDigits:1];
[numberFormatter setMinimumFractionDigits:1];
NSString *result = [numberFormatter stringFromNumber:[NSNumber numberWithDouble:value]];

I wanted a general solution, independent of the configuration of the number formatter.
I've used a category to add the functionality to NSNumberFormatter:
@interface NSNumberFormatter (PreventNegativeZero)
- (NSString *)stringFromNumberWithoutNegativeZero:(NSNumber *)number;
@end
With the implementation:
@implementation NSNumberFormatter (PreventNegativeZero)
- (NSString *)stringFromNumberWithoutNegativeZero:(NSNumber *)number
{
    NSString *const string = [self stringFromNumber: number];
    NSString *const negZeroString = [self stringFromNumber: [NSNumber numberWithFloat: -0.0f]];
    if([string isEqualToString: negZeroString])
    {
        NSString *const posZeroString = [self stringFromNumber: [NSNumber numberWithFloat: 0.0]];
        return posZeroString;
    }
    return string;
}
@end
How it works
The key feature is to ask the number formatter how it will format -0.0f (i.e., floating point minus zero) as an NSString so that we can detect this and take remedial action.
Why do this? Depending on the formatter configuration, -0.0f could be formatted as: @"-0", @"-0.0", @"-000", @"-0ºC", @"£-0.00", @"----0.0", @"(0.0)", @"😡𝟘.⓪零", really, pretty much anything. So, we ask the formatter how it would format -0.0f using the line: NSString *const negZeroString = [self stringFromNumber: [NSNumber numberWithFloat: -0.0f]];
Armed with the undesired -0.0f string, when an arbitrary input number is formatted, it can be tested to see if it matches the undesirable -0.0f string.
The second important feature is that the number formatter is also asked to supply the replacement positive zero string. This is necessary so that as before, its formatting is respected. This is done with the line: [self stringFromNumber: [NSNumber numberWithFloat: 0.0]]
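As a rough usage sketch (not part of the original answer), the category can be exercised with a formatter configured like the one in the question. The category is repeated so the snippet compiles on its own; the fixed `en_US_POSIX` locale and the sample values are assumptions chosen for illustration, and ARC is assumed.

```objective-c
#import <Foundation/Foundation.h>

// Category from the answer, repeated so this sketch is self-contained.
@interface NSNumberFormatter (PreventNegativeZero)
- (NSString *)stringFromNumberWithoutNegativeZero:(NSNumber *)number;
@end

@implementation NSNumberFormatter (PreventNegativeZero)
- (NSString *)stringFromNumberWithoutNegativeZero:(NSNumber *)number
{
    NSString *const string = [self stringFromNumber:number];
    // Ask the formatter itself what "negative zero" looks like.
    NSString *const negZeroString = [self stringFromNumber:@(-0.0)];
    if ([string isEqualToString:negZeroString]) {
        // Format a real positive zero so the formatter's style is respected.
        return [self stringFromNumber:@(0.0)];
    }
    return string;
}
@end

int main(void)
{
    @autoreleasepool {
        NSNumberFormatter *formatter = [[NSNumberFormatter alloc] init];
        // Assumption: a fixed locale, so separators are predictable.
        formatter.locale = [NSLocale localeWithLocaleIdentifier:@"en_US_POSIX"];
        formatter.numberStyle = NSNumberFormatterDecimalStyle;
        formatter.minimumFractionDigits = 1;
        formatter.maximumFractionDigits = 1;

        // Compare the plain and the guarded output for a few sample values.
        for (NSNumber *n in @[@(-0.01), @(-0.04), @(-1.5), @(0.26)]) {
            NSLog(@"%@ -> plain: %@, guarded: %@", n,
                  [formatter stringFromNumber:n],
                  [formatter stringFromNumberWithoutNegativeZero:n]);
        }
    }
    return 0;
}
```

Values that genuinely round to something non-zero, such as -1.5, pass through unchanged, because their formatted string never matches the formatter's own negative-zero string.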
An optimisation that doesn't work
It's tempting to perform a numerical test yourself for whether the input number will be formatted as the -0.0f string, but this is extremely non-trivial (i.e., basically impossible in general). This is because the set of numbers that will format to the -0.0f string depends on the configuration of the formatter. If it happens to be rounding to the nearest million, then -5,000f as an input would be formatted as the -0.0f string.
An implementation error to avoid
When input that formats to the -0.0f string is detected, a positive zero equivalent output string is generated using [self stringFromNumber: [NSNumber numberWithFloat: 0.0]]. Note that, specifically:
The code formats the float literal 0.0f and returns it.
The code does not use the negation of the input.
Negating an input of -0.1f would result in formatting 0.1f. Depending on the formatter behaviour, this could be rounded up and result in @"1,000", which you don't want.
Final Note
For what it's worth, the approach / pattern / algorithm used here will translate to other languages and different string formatting APIs.

Use an NSNumberFormatter. In general, NSString formatting should not be used to present data to the user.
EDIT:
As stated in the question, this is not the correct answer. There are a number of solutions. It's easy to check for negative zero because it is defined to be equal to any zero (0.0f == -0.0f), but the actual problem is that a number of other values can be rounded to negative zero. Instead of catching such values, I suggest postprocessing: a function that checks whether the result contains only zero digits (skipping other characters) and, if so, removes the leading minus sign.
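The postprocessing idea could be sketched roughly as follows. `StringByRemovingNegativeZeroSign` is a hypothetical helper name, and the sketch assumes ASCII digits and a leading hyphen-minus, which holds for `stringWithFormat:` without a locale but not for every NSNumberFormatter configuration.

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical helper: drops a leading "-" when the formatted string
// contains zero digits only, i.e. the value was rounded to negative zero.
static NSString *StringByRemovingNegativeZeroSign(NSString *formatted)
{
    if (![formatted hasPrefix:@"-"]) {
        return formatted;
    }
    NSCharacterSet *nonZeroDigits =
        [NSCharacterSet characterSetWithCharactersInString:@"123456789"];
    BOOL hasNonZeroDigit =
        [formatted rangeOfCharacterFromSet:nonZeroDigits].location != NSNotFound;
    BOOL hasZeroDigit = [formatted rangeOfString:@"0"].location != NSNotFound;
    if (hasNonZeroDigit || !hasZeroDigit) {
        // A genuine negative value, or no digits at all (e.g. "-inf").
        return formatted;
    }
    return [formatted substringFromIndex:1];
}

int main(void)
{
    @autoreleasepool {
        NSString *a = StringByRemovingNegativeZeroSign(
            [NSString stringWithFormat:@"%1.1f", -0.01]);
        NSString *b = StringByRemovingNegativeZeroSign(
            [NSString stringWithFormat:@"%1.1f", -0.26]);
        NSLog(@"%@ %@", a, b); // prints "0.0 -0.3"
    }
    return 0;
}
```

Because the check runs on the already formatted string, it does not need to know how the value was rounded.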

NSString *result = [NSString stringWithFormat:@"%1.1f", -0.01*-1];
If you pass a variable instead of a literal value, you can check:
float myFloat = -0.01;
NSString *result = [NSString stringWithFormat:@"%1.1f", (myFloat<0? myFloat*-1:myFloat)];
Edit:
If you just want 0.0 as positive value:
NSString *result = [NSString stringWithFormat:@"%1.1f", (lround(myFloat*10)==0 ? 0.0 : myFloat)];

Convert the number to NSString by taking the float or double value.
Convert the string back to NSNumber.
NSDecimalNumber *num = [NSDecimalNumber decimalNumberWithString:@"-0.00000000008"];
NSString *st2 = [NSString stringWithFormat:@"%0.2f", [num floatValue]];
NSDecimalNumber *result = [NSDecimalNumber decimalNumberWithString:st2]; //returns 0

NSNumberFormatter has two conversion methods: from number to string, and from string to number. What if we apply the (Number) -> String? conversion twice, parsing the intermediate string back into a number in between?
public extension NumberFormatter {
    func stringWithoutNegativeZero(from number: NSNumber) -> String? {
        string(from: number)
            .flatMap { [weak self] string in self?.number(from: string) }
            .flatMap { [weak self] number in self?.string(from: number) }
    }
}

Related

Large (but representable) integers get parsed as doubles by NSNumberFormatter

I've been using the following method to parse NSString's into NSNumber's:
// (a category method on NSString)
-(NSNumber*) tryParseAsNumber {
NSNumberFormatter* formatter = [NSNumberFormatter new];
[formatter setNumberStyle:NSNumberFormatterDecimalStyle];
return [formatter numberFromString:self];
}
And I had tests verifying that this was working correctly:
test(@"".tryParseAsNumber == nil);
...
test([@(NSUIntegerMax).description.tryParseAsNumber isEqual:@(NSUIntegerMax)]);
...
The max-value test started failing when I switched to testing on an iPhone 6, probably because NSUInteger is now 64 bits instead of 32 bits. The value returned by the formatter is now the double 1.844674407370955e+19 instead of the uint64_t 18446744073709551615.
Is there a built-in method that succeeds exactly for all int64s and unsigned int64s, or do I have to implement one myself?
+ [NSNumber numberWithLongLong:]
+ [NSNumber numberWithUnsignedLongLong:]
Have you tried these?
EDIT
I'm not at all certain what it is you'd ultimately do with your instances of NSNumber, but consider that NSDecimalNumber seems to do exactly what you want:
NSDecimalNumber *decNum = [NSDecimalNumber decimalNumberWithString:@"18446744073709551615"];
NSLog(@"%@", decNum);
which yields:
2014-09-21 15:11:25.472 Test[1138:812724] 18446744073709551615
Here's another thing to consider: NSDecimalNumber "is a" NSNumber, as it's a subclass of the latter. So it would appear that, whatever you can do with NSNumber, you can do with NSDecimalNumber.
trudyscousin's answer allowed me to figure it out.
NSDecimalNumber decimalNumberWithString: is capable of parsing with full precision, but it lets some bad inputs through (e.g. "88ffhih" gets parsed as 88). On the other hand, NSNumberFormatter numberFromString: always detects bad inputs but loses precision. They have opposite weaknesses.
So... just do both. For example, here's a method that should parse representable NSUIntegers but nothing else:
+(NSNumber*) parseAsNSUIntegerElseNil:(NSString*)decimalText {
    // NSNumberFormatter.numberFromString is good at noticing bad inputs, but loses precision for large values
    // NSDecimalNumber.decimalNumberWithString has perfect precision, but lets bad inputs through sometimes (e.g. "88ffhih" -> 88)
    // We use both to get both accuracy and detection of bad inputs
    NSNumberFormatter* formatter = [NSNumberFormatter new];
    [formatter setNumberStyle:NSNumberFormatterDecimalStyle];
    if ([formatter numberFromString:decimalText] == nil) {
        return nil;
    }
    NSNumber* value = [NSDecimalNumber decimalNumberWithString:decimalText];
    // Discard values not representable by NSUInteger
    if (![value isEqual:@(value.unsignedIntegerValue)]) {
        return nil;
    }
    return value;
}

NSNumber stringValue different from NSNumber value

I'm having problems with converting NSNumber to string and string to NSNumber.
Here's a sample problem:
NSString *stringValue = @"9.2";
NSNumberFormatter *formatter = [[NSNumberFormatter alloc] init];
NSLog(@"stringvalue:%@",[[formatter numberFromString: stringValue] stringValue]);
Output will be:
stringvalue:9.199999999999999
I need to retrieve the original value, which in this example should be 9.2.
By contrast, when the original string is 9.4 the output is still 9.4.
Do you have any idea how to retrieve the original string value without NSNumber altering it?
You are discovering that floating point numbers can't always be represented exactly. There are numerous posts about such issues.
If you need to get back to the original string, then keep the original string as your data and only convert to a number when you need to perform a calculation.
You may want to look into NSDecimalNumber. This may better fit your needs.
NSString *numStr = @"9.2";
NSDecimalNumber *decNum = [NSDecimalNumber decimalNumberWithString:numStr];
NSString *newStr = [decNum stringValue];
NSLog(@"decNum = %@, newStr = %@", decNum, newStr);
This gives 9.2 for both values.

Unexpected behaviour formatting very large decimal numbers in ObjC

I am running into unexpected behaviour formatting very large numbers in ObjC using the NSNumberFormatter.
It seems that the number formatter rounds decimals (NSDecimalNumber) after the fifteenth digit regardless of fraction digits.
The test below fails on values 1, 3 and 5.
Two requests:
Any suggestions on alternative code would be greatly appreciated.
Am I right to assume the issue is caused by a hard-coded digit limit in NSNumberFormatter?
The post here lists a workaround without a sufficient description of the problem. Also, our application (banking sector) runs across multiple countries and we link the formatting to the user's locale as configured in the backend. This workaround would imply that we write our own number formatter to handle the requirement, which is something I do not want to do.
- (void)testFormatterUsingOnlySDK {
    NSDecimalNumber *value1 = [NSDecimalNumber decimalNumberWithMantissa: 9423372036854775808u exponent:-3 isNegative:YES];
    NSDecimalNumber *value2 = [NSDecimalNumber decimalNumberWithMantissa: 9999999999999990u exponent:-3 isNegative:YES];
    NSDecimalNumber *value3 = [NSDecimalNumber decimalNumberWithMantissa: 9999999999999991u exponent:-3 isNegative:YES];
    NSDecimalNumber *value4 = [NSDecimalNumber decimalNumberWithMantissa: 99999999999999900u exponent:-4 isNegative:YES];
    NSDecimalNumber *value5 = [NSDecimalNumber decimalNumberWithMantissa: 11111111111111110u exponent:-4 isNegative:YES];
    NSNumberFormatter *formatter = [[[NSNumberFormatter alloc] init] autorelease];
    formatter.allowsFloats = YES;
    formatter.maximumFractionDigits = 3;
    [self assertStringAreEqualWithActual:[formatter stringFromNumber:value1] andExpeted: @"-9423372036854775.808"];
    [self assertStringAreEqualWithActual:[formatter stringFromNumber:value2] andExpeted: @"-9999999999999.99"];
    [self assertStringAreEqualWithActual:[formatter stringFromNumber:value3] andExpeted: @"-9999999999999.991"];
    [self assertStringAreEqualWithActual:[formatter stringFromNumber:value4] andExpeted: @"-9999999999999.99"];
    [self assertStringAreEqualWithActual:[formatter stringFromNumber:value5] andExpeted: @"-1111111111111.111"];
}
- (void)assertStringAreEqualWithActual:(NSString *)actual andExpeted:(NSString *)expected {
    STAssertTrue([expected isEqualToString:actual], @"Expected %@ but got %@", expected, actual);
}
Unfortunately, NSNumberFormatter doesn't work correctly with NSDecimalNumber.
The problem (very probably) is that the first thing it does is calling doubleValue on the number it wants to format.
See also NSDecimalNumber round long numbers
After many tries with NSNumberFormatter, I created my own formatter. It's actually very easy:
Handle NaN.
Round using roundToScale:
Get stringValue
Check if negative, remove leading -
Find decimal point (.)
Localize decimal point ([locale objectForKey:NSLocaleDecimalSeparator])
Add grouping separators ([locale objectForKey:NSLocaleGroupingSeparator])
If negative, add leading - or put the number into parenthesis if you are formatting currency.
Done.
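The steps above might look roughly like this. It is only a sketch: `FormatDecimal` is a hypothetical helper name, grouping is hard-coded to three digits (which does not hold for every locale), the currency parenthesis variant is left out, and it assumes `stringValue` returns plain digits with "." as the decimal point. ARC is assumed.

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical helper following the listed steps for an NSDecimalNumber.
static NSString *FormatDecimal(NSDecimalNumber *value, short scale, NSLocale *locale)
{
    // 1. Handle NaN.
    NSDecimal raw = value.decimalValue;
    if (NSDecimalIsNotANumber(&raw)) {
        return @"NaN";
    }
    // 2. Round to the requested scale without going through double.
    NSDecimalNumberHandler *rounding =
        [NSDecimalNumberHandler decimalNumberHandlerWithRoundingMode:NSRoundPlain
                                                               scale:scale
                                                    raiseOnExactness:NO
                                                     raiseOnOverflow:NO
                                                    raiseOnUnderflow:NO
                                                 raiseOnDivideByZero:NO];
    NSDecimalNumber *rounded =
        [value decimalNumberByRoundingAccordingToBehavior:rounding];

    // 3-4. Get the string value and strip a leading minus sign.
    NSString *digits = [rounded stringValue];
    BOOL negative = [digits hasPrefix:@"-"];
    if (negative) {
        digits = [digits substringFromIndex:1];
    }
    // 5. Split at the decimal point.
    NSArray<NSString *> *parts = [digits componentsSeparatedByString:@"."];
    NSString *integerPart = parts[0];
    NSString *fractionPart = parts.count > 1 ? parts[1] : nil;

    // 6-7. Localize the decimal point and add grouping separators.
    NSString *decimalSep = [locale objectForKey:NSLocaleDecimalSeparator] ?: @".";
    NSString *groupSep = [locale objectForKey:NSLocaleGroupingSeparator] ?: @",";
    NSMutableString *result = [integerPart mutableCopy];
    for (NSInteger i = (NSInteger)integerPart.length - 3; i > 0; i -= 3) {
        [result insertString:groupSep atIndex:(NSUInteger)i];
    }
    if (fractionPart != nil) {
        [result appendFormat:@"%@%@", decimalSep, fractionPart];
    }
    // 8. Put the sign back.
    if (negative) {
        [result insertString:@"-" atIndex:0];
    }
    return result;
}

int main(void)
{
    @autoreleasepool {
        NSDecimalNumber *value =
            [NSDecimalNumber decimalNumberWithMantissa:9423372036854775808ULL
                                              exponent:-3
                                            isNegative:YES];
        NSLocale *locale = [NSLocale localeWithLocaleIdentifier:@"en_US"];
        NSLog(@"%@", FormatDecimal(value, 3, locale));
    }
    return 0;
}
```

Since the value never passes through `doubleValue`, digits beyond the fifteenth are preserved.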
You should compile your own NSNumberFormatter from this open source code, changing the prefix. This should allow you to debug into the formatting and to understand why this is happening. Worst case you can submit a patch to Apple.
http://code.google.com/p/cocotron/source/browse/Foundation/NSNumberFormatter.m?r=7542c3a7ef0ef75479e6154a75d304113f5a9738
You've set maximumFractionDigits to two. All of the failing tests have three fraction digits in the expected value. Either the expectation or the code needs to change to match. If I make this change:
formatter.maximumFractionDigits = 3;
then all of your test cases are met.

parsing string into different kind of number string

I have a string called realEstateWorth with a value of $12,000,000.
I need this same string to remain a string but for any number (such as the one above) to be displayed as $12 MILLION or $6 MILLION. The point is it needs the words "MILLION" to come after the number.
I know there is NSNumberFormatter, which can convert strings into numbers and vice versa, but can it do what I need?
If anyone has any ideas or suggestions, it would be much appreciated.
Thank you!
So as I see it, you have two problems:
You have a string representation of something that's actually a number
You (potentially) have a number that you want formatted as a string
So, problem #1:
To convert a string into a number, you use an NSNumberFormatter. You've got a pretty simple case:
NSNumberFormatter *f = [[NSNumberFormatter alloc] init];
[f setNumberStyle:NSNumberFormatterCurrencyStyle];
NSNumber *n = [f numberFromString:@"$12,000,000"];
// n is 12000000
That was easy! Now problem #2:
This is trickier, because you want a mixed spell-out style. You could consider using an NSNumberFormatter again, but it's not quite right:
[f setNumberStyle:NSNumberFormatterSpellOutStyle];
NSString *s = [f stringFromNumber:n];
// s is "twelve million"
So, we're closer. At this point, you could perhaps maybe do something like:
NSInteger numberOfMillions = [n integerValue] / 1000000;
if (numberOfMillions > 0) {
NSNumber *millions = [NSNumber numberWithInteger:numberOfMillions];
NSString *numberOfMillionsString = [f stringFromNumber:millions]; // "twelve"
[f setNumberStyle:NSNumberFormatterCurrencyStyle];
NSString *formattedMillions = [f stringFromNumber:millions]; // "$12.00"
if ([s hasPrefix:numberOfMillionsString]) {
// replace "twelve" with "$12.00"
s = [s stringByReplacingCharactersInRange:NSMakeRange(0, [numberOfMillionsString length]) withString:formattedMillions];
// if this all works, s should be "$12.00 million"
// you can use the -setMaximumFractionDigits: method on NSNumberFormatter to fiddle with the ".00" bit
}
}
However
I don't know how well this would work in anything other than English. CAVEAT IMPLEMENTOR
Worst case scenario, you could implement a category on NSString to implement the behaviour you want.
In that category method, you could use an NSNumberFormatter to turn the string into a number, then use division or a modulo operation to decide whether you need the word Million, Billion, etc., and return a new string with the scaled number followed by that word.
That way you could just call that method on your NSString like this :
NSString *humanReadable = [realEstateWorth myCustomMethodFromMyCategory];
Also, NSString objects are immutable, so you can't change one in place; you can only assign a new string to your variable.
I'd recommend storing this value as an NSNumber or a float. Then you could have a method to generate an NSString to display it like:
- (NSString*)numberToCurrencyString:(float)num
{
    NSString *postfix = @"";
    // >= so that exactly one billion or one million is abbreviated too
    if (num >= 1000000000)
    {
        num = num / 1000000000;
        postfix = @" Billion";
    }
    else if (num >= 1000000)
    {
        num = num / 1000000;
        postfix = @" Million";
    }
    NSString *currencyString = [NSString stringWithFormat:@"%.0f%@", num, postfix];
    return currencyString;
}
Note: Your question states that your input needs to remain a string. That's fine. So you'd need to 1.) first parse the number out of the string and 2.) then reconvert it to a string from a number. I've shown how to do step 2 of this process.
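For completeness, both steps could be combined along these lines. This is a sketch under assumptions: `AbbreviatedCurrencyString` is a hypothetical helper name, the input is taken to be US-dollar text parsed with the `en_US` locale, fractions of a million are rounded away, and ARC is assumed.

```objective-c
#import <Foundation/Foundation.h>
#include <math.h>

// Hypothetical helper: parses a currency string and re-renders it with a
// MILLION / BILLION suffix. Returns the input unchanged if parsing fails.
static NSString *AbbreviatedCurrencyString(NSString *input)
{
    NSNumberFormatter *f = [[NSNumberFormatter alloc] init];
    // Assumption: the text uses US currency conventions.
    f.locale = [NSLocale localeWithLocaleIdentifier:@"en_US"];
    f.numberStyle = NSNumberFormatterCurrencyStyle;

    // Step 1: string -> number.
    NSNumber *parsed = [f numberFromString:input];
    if (parsed == nil) {
        return input;
    }

    // Scale the value and pick the matching word.
    double value = parsed.doubleValue;
    NSString *suffix = @"";
    if (fabs(value) >= 1e9) {
        value /= 1e9;
        suffix = @" BILLION";
    } else if (fabs(value) >= 1e6) {
        value /= 1e6;
        suffix = @" MILLION";
    }

    // Step 2: number -> string, without the ".00" part.
    f.minimumFractionDigits = 0;
    f.maximumFractionDigits = 0;
    return [[f stringFromNumber:@(value)] stringByAppendingString:suffix];
}

int main(void)
{
    @autoreleasepool {
        NSLog(@"%@", AbbreviatedCurrencyString(@"$12,000,000"));
    }
    return 0;
}
```

Reusing one formatter for parsing and output keeps the currency symbol and its placement consistent between the two steps.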

NSString - Convert to pure alphabet only (i.e. remove accents+punctuation)

I'm trying to compare names without any punctuation, spaces, accents etc.
At the moment I am doing the following:
-(NSString*) prepareString:(NSString*)a {
    //remove any accents and punctuation;
    a=[[[NSString alloc] initWithData:[a dataUsingEncoding:NSASCIIStringEncoding allowLossyConversion:YES] encoding:NSASCIIStringEncoding] autorelease];
    a=[a stringByReplacingOccurrencesOfString:@" " withString:@""];
    a=[a stringByReplacingOccurrencesOfString:@"'" withString:@""];
    a=[a stringByReplacingOccurrencesOfString:@"`" withString:@""];
    a=[a stringByReplacingOccurrencesOfString:@"-" withString:@""];
    a=[a stringByReplacingOccurrencesOfString:@"_" withString:@""];
    a=[a lowercaseString];
    return a;
}
However, I need to do this for hundreds of strings and I need to make this more efficient. Any ideas?
NSString* finish = [[start componentsSeparatedByCharactersInSet:[[NSCharacterSet letterCharacterSet] invertedSet]] componentsJoinedByString:@""];
Before using any of these solutions, don't forget to use decomposedStringWithCanonicalMapping to decompose any accented letters. This will turn, for example, é (U+00E9) into e ‌́ (U+0065 U+0301). Then, when you strip out the non-alphanumeric characters, the unaccented letters will remain.
The reason why this is important is that you probably don't want, say, “dän” and “dün”* to be treated as the same. If you stripped out all accented letters, as some of these solutions may do, you'll end up with “dn”, so those strings will compare as equal.
So, you should decompose them first, so that you can strip the accents and leave the letters.
*Example from German. Thanks to Joris Weimar for providing it.
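A minimal sketch of decompose-then-strip is shown below. One detail to watch: `letterCharacterSet` covers Unicode categories L* and M*, so the combining marks produced by decomposition count as "letters" and have to be excluded explicitly, here via `nonBaseCharacterSet`. `BaseLettersOnly` is a hypothetical helper name, and ARC is assumed.

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical helper: decomposes accented letters, then keeps only
// base letters (letters minus combining marks).
static NSString *BaseLettersOnly(NSString *input)
{
    // "ä" (U+00E4) becomes "a" followed by U+0308.
    NSString *decomposed = [input decomposedStringWithCanonicalMapping];

    // letterCharacterSet includes combining marks (category M*),
    // so intersect with everything that is not a non-base character.
    NSMutableCharacterSet *baseLetters =
        [[NSCharacterSet letterCharacterSet] mutableCopy];
    [baseLetters formIntersectionWithCharacterSet:
        [[NSCharacterSet nonBaseCharacterSet] invertedSet]];

    // Split on everything that is not a base letter and glue the rest.
    return [[decomposed componentsSeparatedByCharactersInSet:[baseLetters invertedSet]]
            componentsJoinedByString:@""];
}

int main(void)
{
    @autoreleasepool {
        NSLog(@"%@ %@", BaseLettersOnly(@"dän"), BaseLettersOnly(@"dün"));
        // prints "dan dun"
    }
    return 0;
}
```

With this ordering the two German words stay distinguishable, instead of both collapsing to "dn".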
On a similar question, Ole Begemann suggests using stringByFoldingWithOptions: and I believe this is the best solution here:
NSString *accentedString = @"ÁlgeBra";
NSString *unaccentedString = [accentedString stringByFoldingWithOptions:NSDiacriticInsensitiveSearch locale:[NSLocale currentLocale]];
Depending on the nature of the strings you want to convert, you might want to set a fixed locale (e.g. English) instead of using the user's current locale. That way, you can be sure to get the same results on every machine.
One important clarification on the answer of BillyTheKid18756 (it was corrected by Luiz, but that was not obvious from the explanation of the code):
DO NOT USE stringWithCString as a second step to remove accents, it can add unwanted characters at the end of your string as the NSData is not NULL-terminated (as stringWithCString expects it).
Or use it and add an additional NULL byte to your NSData, like Luiz did in his code.
I think a simpler answer is to replace:
NSString *sanitizedText = [NSString stringWithCString:[sanitizedData bytes] encoding:NSASCIIStringEncoding];
with:
NSString *sanitizedText = [[[NSString alloc] initWithData:sanitizedData encoding:NSASCIIStringEncoding] autorelease];
If I take back the code of BillyTheKid18756, here is the complete correct code:
// The input text
NSString *text = @"BûvérÈ!#$&%^&(*^(_()-*/48";
// Defining what characters to accept
NSMutableCharacterSet *acceptedCharacters = [[NSMutableCharacterSet alloc] init];
[acceptedCharacters formUnionWithCharacterSet:[NSCharacterSet letterCharacterSet]];
[acceptedCharacters formUnionWithCharacterSet:[NSCharacterSet decimalDigitCharacterSet]];
[acceptedCharacters addCharactersInString:@" _-.!"];
// Turn accented letters into normal letters (optional)
NSData *sanitizedData = [text dataUsingEncoding:NSASCIIStringEncoding allowLossyConversion:YES];
// Corrected back-conversion from NSData to NSString
NSString *sanitizedText = [[[NSString alloc] initWithData:sanitizedData encoding:NSASCIIStringEncoding] autorelease];
// Removing unaccepted characters
NSString* output = [[sanitizedText componentsSeparatedByCharactersInSet:[acceptedCharacters invertedSet]] componentsJoinedByString:@""];
If you are trying to compare strings, use one of these methods. Don't try to change data.
- (NSComparisonResult)localizedCompare:(NSString *)aString
- (NSComparisonResult)localizedCaseInsensitiveCompare:(NSString *)aString
- (NSComparisonResult)compare:(NSString *)aString options:(NSStringCompareOptions)mask range:(NSRange)range locale:(id)locale
You NEED to consider user locale to do things right with strings, particularly things like names.
In most languages, characters like ä and å are not the same other than they look similar. They are inherently distinct characters with meaning distinct from others, but the actual rules and semantics are distinct to each locale.
The correct way to compare and sort strings is by considering the user's locale. Anything else is naive, wrong and very 1990's. Stop doing it.
If you are trying to pass data to a system that cannot support non-ASCII, well, this is just a wrong thing to do. Pass it as data blobs.
https://developer.apple.com/library/ios/documentation/cocoa/Conceptual/Strings/Articles/SearchingStrings.html
Also normalize your strings first (see Peter Hosey's post), either precomposing or decomposing; basically, pick a normalized form.
- (NSString *)decomposedStringWithCanonicalMapping
- (NSString *)decomposedStringWithCompatibilityMapping
- (NSString *)precomposedStringWithCanonicalMapping
- (NSString *)precomposedStringWithCompatibilityMapping
No, it's not nearly as simple and easy as we tend to think.
Yes, it requires informed and careful decision making. (and a bit of non-English language experience helps)
Consider using the RegexKit framework. You could do something like:
NSString *searchString = @"This is neat.";
NSString *regexString = @"[\\W]";
NSString *replaceWithString = @"";
NSString *replacedString = [searchString stringByReplacingOccurrencesOfRegex:regexString withString:replaceWithString];
NSLog (@"%@", replacedString);
//... Thisisneat
Consider using NSScanner, and specifically the methods -setCharactersToBeSkipped: (which accepts an NSCharacterSet) and -scanString:intoString: (which accepts a string and returns the scanned string by reference).
You may also want to couple this with -[NSString localizedCompare:], or perhaps -[NSString compare:options:] with the NSDiacriticInsensitiveSearch option. That could simplify having to remove/replace accents, so you can focus on removing punctuation, whitespace, etc.
If you must use an approach like you presented in your question, at least use an NSMutableString and replaceOccurrencesOfString:withString:options:range: — that will be much more efficient than creating tons of nearly-identical autoreleased strings. It could be that just reducing the number of allocations will boost performance "enough" for the time being.
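A sketch of that NSMutableString variant is below. It mirrors the question's lossy ASCII conversion and then edits a single mutable string in place; `PrepareString` is a hypothetical free-function version of the question's method, and ARC is assumed.

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical rewrite of the question's prepareString: one mutable
// string is edited in place instead of creating a new string per step.
static NSString *PrepareString(NSString *input)
{
    // Lossy ASCII conversion, as in the question, to drop accents.
    NSData *ascii = [input dataUsingEncoding:NSASCIIStringEncoding
                        allowLossyConversion:YES];
    NSMutableString *s = [[NSMutableString alloc] initWithData:ascii
                                                      encoding:NSASCIIStringEncoding];

    // Remove each unwanted character in place.
    for (NSString *unwanted in @[@" ", @"'", @"`", @"-", @"_"]) {
        [s replaceOccurrencesOfString:unwanted
                           withString:@""
                              options:NSLiteralSearch
                                range:NSMakeRange(0, s.length)];
    }
    return [s lowercaseString];
}

int main(void)
{
    @autoreleasepool {
        NSLog(@"%@", PrepareString(@"Mary-Ann O'Neil_Smith"));
        // prints "maryannoneilsmith"
    }
    return 0;
}
```

The range is recomputed on every pass because the string gets shorter after each replacement.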
To give a complete example by combining the answers from Luiz and Peter, adding a few lines, you get the code below.
The code does the following:
Creates a set of accepted characters
Turn accented letters into normal letters
Remove characters not in the set
Objective-C
// The input text
NSString *text = @"BûvérÈ!#$&%^&(*^(_()-*/48";
// Create set of accepted characters
NSMutableCharacterSet *acceptedCharacters = [[NSMutableCharacterSet alloc] init];
[acceptedCharacters formUnionWithCharacterSet:[NSCharacterSet letterCharacterSet]];
[acceptedCharacters formUnionWithCharacterSet:[NSCharacterSet decimalDigitCharacterSet]];
[acceptedCharacters addCharactersInString:@" _-.!"];
// Turn accented letters into normal letters (optional)
NSData *sanitizedData = [text dataUsingEncoding:NSASCIIStringEncoding allowLossyConversion:YES];
NSString *sanitizedText = [NSString stringWithCString:[sanitizedData bytes] encoding:NSASCIIStringEncoding];
// Remove characters not in the set
NSString* output = [[sanitizedText componentsSeparatedByCharactersInSet:[acceptedCharacters invertedSet]] componentsJoinedByString:@""];
Swift (2.2) example
let text = "BûvérÈ!#$&%^&(*^(_()-*/48"
// Create set of accepted characters
let acceptedCharacters = NSMutableCharacterSet()
acceptedCharacters.formUnionWithCharacterSet(NSCharacterSet.letterCharacterSet())
acceptedCharacters.formUnionWithCharacterSet(NSCharacterSet.decimalDigitCharacterSet())
acceptedCharacters.addCharactersInString(" _-.!")
// Turn accented letters into normal letters (optional)
let sanitizedData = text.dataUsingEncoding(NSASCIIStringEncoding, allowLossyConversion: true)
let sanitizedText = String(data: sanitizedData!, encoding: NSASCIIStringEncoding)
// Remove characters not in the set
let components = sanitizedText!.componentsSeparatedByCharactersInSet(acceptedCharacters.invertedSet)
let output = components.joinWithSeparator("")
Output
The output for both examples would be: BuverE!_-48
Just bumped into this, maybe it's too late, but here is what worked for me:
// text is the input string, and this just removes accents from the letters
// lossy encoding turns accented letters into normal letters
// (mutableCopy is needed because dataUsingEncoding: returns an immutable NSData)
NSMutableData *sanitizedData = [[text dataUsingEncoding:NSASCIIStringEncoding
                                   allowLossyConversion:YES] mutableCopy];
// increase length by 1 adds a 0 byte (increaseLengthBy
// guarantees to fill the new space with 0s), effectively turning
// sanitizedData into a c-string
[sanitizedData increaseLengthBy:1];
// now we just create a string with the c-string in sanitizedData
NSString *final = [NSString stringWithCString:[sanitizedData bytes] encoding:NSASCIIStringEncoding];
@interface NSString (Filtering)
- (NSString*)stringByFilteringCharacters:(NSCharacterSet*)charSet;
@end
@implementation NSString (Filtering)
- (NSString*)stringByFilteringCharacters:(NSCharacterSet*)charSet {
    NSMutableString * mutString = [NSMutableString stringWithCapacity:[self length]];
    for (NSUInteger i = 0; i < [self length]; i++){
        // unichar, not char, so non-ASCII characters are not truncated
        unichar c = [self characterAtIndex:i];
        if(![charSet characterIsMember:c]) [mutString appendFormat:@"%C", c];
    }
    return [NSString stringWithString:mutString];
}
@end
These answers didn't work as expected for me. Specifically, decomposedStringWithCanonicalMapping didn't strip accents/umlauts as I'd expected.
Here's a variation on what I used that answers the brief:
// replace accents, umlauts etc with equivalent letter i.e 'é' becomes 'e'.
// Always use en_GB (or a locale without the characters you wish to strip) as locale, no matter which language we're taking as input
NSString *processedString = [string stringByFoldingWithOptions: NSDiacriticInsensitiveSearch locale: [NSLocale localeWithLocaleIdentifier: @"en_GB"]];
// remove non-letters
processedString = [[processedString componentsSeparatedByCharactersInSet:[[NSCharacterSet letterCharacterSet] invertedSet]] componentsJoinedByString:@""];
// trim whitespace
processedString = [processedString stringByTrimmingCharactersInSet: [NSCharacterSet whitespaceCharacterSet]];
return processedString;
Peter's Solution in Swift:
let newString = oldString.componentsSeparatedByCharactersInSet(NSCharacterSet.letterCharacterSet().invertedSet).joinWithSeparator("")
Example:
let oldString = "Jo_ - h !. nn y"
// "Jo_ - h !. nn y"
oldString.componentsSeparatedByCharactersInSet(NSCharacterSet.letterCharacterSet().invertedSet)
// ["Jo", "h", "nn", "y"]
oldString.componentsSeparatedByCharactersInSet(NSCharacterSet.letterCharacterSet().invertedSet).joinWithSeparator("")
// "Johnny"
I wanted to filter out everything except letters and numbers, so I adapted Lorean's implementation of a Category on NSString to work a little differently. In this example, you specify a string with only the characters you want to keep, and everything else is filtered out:
@interface NSString (PraxCategories)
+ (NSString *)lettersAndNumbers;
- (NSString*)stringByKeepingOnlyLettersAndNumbers;
- (NSString*)stringByKeepingOnlyCharactersInString:(NSString *)string;
@end
@implementation NSString (PraxCategories)
+ (NSString *)lettersAndNumbers { return @"abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789"; }
- (NSString*)stringByKeepingOnlyLettersAndNumbers {
    return [self stringByKeepingOnlyCharactersInString:[NSString lettersAndNumbers]];
}
- (NSString*)stringByKeepingOnlyCharactersInString:(NSString *)string {
    NSCharacterSet *characterSet = [NSCharacterSet characterSetWithCharactersInString:string];
    NSMutableString * mutableString = @"".mutableCopy;
    for (NSUInteger i = 0; i < [self length]; i++){
        // unichar, not char, so non-ASCII characters are not truncated
        unichar character = [self characterAtIndex:i];
        if([characterSet characterIsMember:character]) [mutableString appendFormat:@"%C", character];
    }
    return mutableString.copy;
}
@end
Once you've made your Categories, using them is trivial, and you can use them on any NSString:
NSString *string = someStringValueThatYouWantToFilter;
string = [string stringByKeepingOnlyLettersAndNumbers];
Or, for example, if you wanted to get rid of everything except vowels:
string = [string stringByKeepingOnlyCharactersInString:@"aeiouAEIOU"];
If you're still learning Objective-C and aren't using Categories, I encourage you to try them out. They're the best place to put things like this because it gives more functionality to all objects of the class you Categorize.
Categories simplify and encapsulate the code you're adding, making it easy to reuse on all of your projects. It's a great feature of Objective-C!