Detect type from string in Objective-C

What's the best way of detecting a data type from a string in Objective-C?
I'm importing CSV files but each value is just a string.
E.g. how do I tell that "2.0" is a number, that "London" should be treated as a category, and that "Monday 2nd June" or "2/6/2012" is a date?
I need to test the data type somehow and be confident about which type I use before passing the data downstream.

Regex is the only thing I can think of, but if you are on Mac or iPhone, then you might try, e.g., RegexKitLite.

----------UPDATE----------
Instead of my previous suggestion, try this:
NSString *csvString = @"333";
NSString *charSet = @"abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ.,";
NSScanner *typeScanner = [NSScanner scannerWithString:csvString];
// Skip nothing, so whitespace counts as part of the scanned text
[typeScanner setCharactersToBeSkipped:nil];
NSString *checkString = @"";
// Scan up to the first character that appears in charSet
[typeScanner scanUpToCharactersFromSet:[NSCharacterSet characterSetWithCharactersInString:charSet] intoString:&checkString];
if ([csvString length] > 0 && [csvString length] == [checkString length]) {
    // no character from charSet was found, so treat "csvString" as an integer
}
This only rules out the characters listed in charSet, so it is a rough test. To check for other types (float, string, etc.), change the charSet line. The line above, NSString *charSet = @"abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ.,";, checks for an int. Use NSString *charSet = @"abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"; to check for a float (periods and commas are allowed), or NSString *charSet = @"1234567890"; to check for a string that contains no digits.
-------Initial Post-------
You could do this:
NSString *stringToTest = @"123";
NSCharacterSet *intValueSet = [NSCharacterSet decimalDigitCharacterSet];
NSArray *test = [stringToTest componentsSeparatedByCharactersInSet:intValueSet];
if ([test count] == [stringToTest length] + 1) {
    NSLog(@"It's an int!");
}
else {
    NSLog(@"It's not an int");
}
This works for numbers that don't have a decimal point or commas as thousands separators, like "8493" and "883292837". I've tested it and it works.
Hope this provides a start for you! I'll try to figure out how to test for numbers with decimal points and strings.
Like Andrew said, regular expressions are probably good for this, but they're a bit complicated.
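As a rough sketch of how the number, date, and category cases from the question could be told apart, the code below tries NSScanner for numbers and NSDateFormatter for dates, and falls back to a category. The FieldType enum, the helper names, and the two date formats are assumptions made for illustration; a date such as "Monday 2nd June" would need extra formats or custom handling and ends up as a category here.

```objc
#import <Foundation/Foundation.h>

// Hypothetical type tags for a CSV field; the names are illustrative only.
typedef NS_ENUM(NSInteger, FieldType) {
    FieldTypeInteger,
    FieldTypeNumber,
    FieldTypeDate,
    FieldTypeCategory
};

// Returns YES when the given scan step consumes the entire string.
static BOOL ScansWholeString(NSString *s, BOOL (^scan)(NSScanner *)) {
    NSScanner *scanner = [NSScanner scannerWithString:s];
    scanner.charactersToBeSkipped = nil;   // do not silently skip whitespace
    return scan(scanner) && scanner.isAtEnd;
}

static FieldType GuessFieldType(NSString *field) {
    NSString *s = [field stringByTrimmingCharactersInSet:
                   [NSCharacterSet whitespaceAndNewlineCharacterSet]];
    if (ScansWholeString(s, ^BOOL(NSScanner *sc) { return [sc scanInteger:NULL]; })) {
        return FieldTypeInteger;
    }
    if (ScansWholeString(s, ^BOOL(NSScanner *sc) { return [sc scanDouble:NULL]; })) {
        return FieldTypeNumber;
    }
    // Assumed date layouts; extend this list to match the files being imported.
    NSDateFormatter *formatter = [[NSDateFormatter alloc] init];
    formatter.locale = [NSLocale localeWithLocaleIdentifier:@"en_US_POSIX"];
    for (NSString *format in @[@"d/M/yyyy", @"yyyy-MM-dd"]) {
        formatter.dateFormat = format;
        if ([formatter dateFromString:s] != nil) {
            return FieldTypeDate;
        }
    }
    return FieldTypeCategory;
}
```

Checking that the scanner is at the end after a successful scan is what rejects values like "2/6/2012" or "3a" as numbers, since only a prefix of them parses.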

Related

Get Unicode point of NSString and put that into another NSString

What's the easiest way to get the Unicode value from an NSString? For example,
NSString *str = @"A";
NSString *hex;
Now, I want to set the value of hex to the Unicode value of str (i.e. 0041)... How would I go about doing that?
The unichar type is defined to be a 16-bit unicode value (eg, as indirectly documented in the description of the %C specifier), and you can get a unichar from a given position in an NSString using characterAtIndex:, or use getCharacters:range: if you want to fill a C array of unichars from the NSString more quickly than by querying them one by one.
NSUTF32StringEncoding is also a valid string encoding, as are a couple of endian-specific variants, in case you want to be absolutely future proof. You'd get a C array of those using the much more longwinded getBytes:maxLength:usedLength:encoding:options:range:remainingRange:.
EDIT: so, e.g.
NSString *str = @"A";
NSLog(@"16-bit unicode values are:");
for(int index = 0; index < [str length]; index++)
    NSLog(@"%04x", [str characterAtIndex:index]);
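The question asks for the value in another NSString rather than in the log. A minimal sketch of that, assuming one 4-digit group per UTF-16 code unit is wanted (HexCodeUnits is a hypothetical helper name, not a Foundation API):

```objc
#import <Foundation/Foundation.h>

// Builds a space-separated string of 4-digit hex values, one per UTF-16 code unit.
static NSString *HexCodeUnits(NSString *str) {
    NSMutableArray<NSString *> *parts = [NSMutableArray arrayWithCapacity:str.length];
    for (NSUInteger i = 0; i < str.length; i++) {
        unichar unit = [str characterAtIndex:i];
        [parts addObject:[NSString stringWithFormat:@"%04X", (unsigned int)unit]];
    }
    return [parts componentsJoinedByString:@" "];
}
```

Characters outside the Basic Multilingual Plane occupy two code units, so they would appear as a surrogate pair in this output.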
You can use
NSData * u = [str dataUsingEncoding:NSUnicodeStringEncoding];
NSString *hex = [u description];
You may replace NSUnicodeStringEncoding by NSUTF8StringEncoding, NSUTF16StringEncoding (the same as NSUnicodeStringEncoding) or NSUTF32StringEncoding, or many other values.

Finding a substring in a NSString object

I have an NSString object and I want to make a substring from it, by locating a word.
For example, my string is: "The dog ate the cat", I want the program to locate the word "ate" and make a substring that will be "the cat".
Can someone help me out or give me an example?
Thanks,
Sagiftw
NSRange range = [string rangeOfString:@"ate"];
NSString *substring = [[string substringFromIndex:NSMaxRange(range)] stringByTrimmingCharactersInSet:[NSCharacterSet whitespaceCharacterSet]];
NSString *str = @"The dog ate the cat";
NSString *search = @"ate";
NSString *sub = [str substringFromIndex:NSMaxRange([str rangeOfString:search])];
If you want to trim whitespace you can do that separately.
What about this way?
It's nearly the same, but the meaning of NSRange may be easier for beginners to understand when it's written like this.
In the end, it's the same solution as jtbandes's.
NSString *szHaystack= @"The dog ate the cat";
NSString *szNeedle= @"ate";
NSRange range = [szHaystack rangeOfString:szNeedle];
NSInteger idx = range.location + range.length;
NSString *szResult = [szHaystack substringFromIndex:idx];
Try this one:
BOOL isValid = [yourString containsString:@"X"];
This method returns YES or NO. If your string contains the given substring it returns YES; otherwise it returns NO.
NSString *theNewString = [receivedString substringFromIndex:[receivedString rangeOfString:@"Ur String"].location];
You can search for a string and then get the searched string into another string...
-(BOOL)Contains:(NSString *)StrSearchTerm on:(NSString *)StrText
{
return [StrText rangeOfString:StrSearchTerm options:NSCaseInsensitiveSearch].location==NSNotFound?FALSE:TRUE;
}
You can use NSString's substringToIndex: and substringFromIndex: methods, which take an index, or substringWithRange:, which takes an NSRange giving the location and length, and you will have the desired output.
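The snippets above assume the search word is present. When it is not, rangeOfString: returns a range whose location is NSNotFound, and passing that on to substringFromIndex: raises an exception. A small sketch that guards against this (SubstringAfter is a hypothetical helper name):

```objc
#import <Foundation/Foundation.h>

// Returns the text after the first occurrence of `word`, trimmed of spaces,
// or nil when `word` does not occur in `text`.
static NSString *SubstringAfter(NSString *text, NSString *word) {
    NSRange range = [text rangeOfString:word];
    if (range.location == NSNotFound) {
        return nil;   // substringFromIndex: would raise NSRangeException here
    }
    NSString *rest = [text substringFromIndex:NSMaxRange(range)];
    return [rest stringByTrimmingCharactersInSet:[NSCharacterSet whitespaceCharacterSet]];
}
```

With the string from the question, SubstringAfter(@"The dog ate the cat", @"ate") gives "the cat".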

Using scanf with NSStrings

I want the user to input a string and then assign the input to an NSString. Right now my code looks like this:
NSString *word;
scanf("%s", &word);
The scanf function reads into a C string (actually an array of char), like this:
char word[40];
int nItems = scanf("%39s", word); // read up to 39 chars (leave room for NUL); returns the number of items read, not the number of chars
You can convert a char array into NSString like this:
NSString* word2 = [[NSString alloc] initWithBytes:word
length:strlen(word)
encoding:NSUTF8StringEncoding];
However scanf only works with console (command line) programs. If you're trying to get input on a Mac or iOS device then scanf is not what you want to use to get user input.
scanf does not work with any object types. If you have a C string and want to create an NSString from it, use -[NSString initWithBytes:length:encoding:].
scanf does not work with NSString as scanf doesn’t work on objects. It works only on primitive datatypes such as:
int
float
BOOL
char
What to do?
Technically a string is made up of a sequence of individual characters. So to accept string input, you can read in the sequence of characters and convert it to a string.
use:
[NSString stringWithCString:cstring encoding:NSASCIIStringEncoding];
Here is a working example:
NSLog(@"What is the first name?");
char cstring[40];
scanf("%s", cstring);
firstName = [NSString stringWithCString:cstring encoding:NSASCIIStringEncoding];
Here’s an explanation of the above code, line by line:
You declare a variable called cstring to hold 40 characters.
You then tell scanf to expect a list of characters by using the %s format specifier.
Finally, you create an NSString object from the list of characters that were read in.
Run your project; if you enter a word and hit Enter, the program should print out the same word you typed. Just make sure the word is less than 40 characters; if you enter more, you might cause the program to crash — you are welcome to test that out yourself! :]
Taken from: RW.
This is how I'd do it:
char word[40];
scanf("%39s", word); // limit the input so it cannot overflow the buffer
NSString * userInput = [[NSString alloc] initWithCString: word encoding: NSUTF8StringEncoding];
yes, but sscanf does, and may be a good solution for complex NSString parsing.
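Maybe this will work for you because it accepts strings with spaces as well.
NSLog(@"Enter The Name Of State");
char name[20];
fgets(name, sizeof(name), stdin); // unlike gets, fgets cannot overflow the buffer; it keeps the trailing newline
NSLog(@"%s",name);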
Simple Solution is
char word[40];
scanf("%39s", word);
NSString* word2 = [NSString stringWithUTF8String:word];
The NSFileHandle class is an object-oriented wrapper for a file descriptor. For files, you can read, write, and seek within the file.
NSFileHandle *inputFile = [NSFileHandle fileHandleWithStandardInput];
NSData *inputData = [inputFile availableData];
NSString *word = [[NSString alloc]initWithData:inputData encoding:NSUTF8StringEncoding];
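Pulling the answers above together, here is one possible sketch for reading a whole line (spaces included) from a console program into an NSString. ReadLine is a hypothetical helper name and the 256-byte buffer is an arbitrary assumption; longer lines would be returned in pieces.

```objc
#import <Foundation/Foundation.h>
#include <stdio.h>
#include <string.h>

// Reads one line from `stream` (up to 255 bytes) and returns it without the
// trailing newline, or nil at end of input or if the bytes are not valid UTF-8.
static NSString *ReadLine(FILE *stream) {
    char buffer[256];
    if (fgets(buffer, sizeof buffer, stream) == NULL) {
        return nil;
    }
    buffer[strcspn(buffer, "\n")] = '\0';   // drop the newline that fgets keeps
    return [NSString stringWithUTF8String:buffer];
}
```

In a command-line tool this would typically be called as ReadLine(stdin).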

stringByAppendingFormat not working

I have an NSString and fail to apply the following statement:
NSString *myString = @"some text";
[myString stringByAppendingFormat:@"some text = %d", 3];
no log or error, the string just doesn't get changed. I already tried with NSString (as documented) and NSMutableString.
any clues most welcome.
I would suggest correcting it to:
NSString *myString = @"some text";
myString = [myString stringByAppendingFormat:@" = %d", 3];
From the docs:
Returns a string made by appending to the receiver a string constructed from a given format string and the following arguments.
It's working, you're just ignoring the return value, which is the string with the appended format. (See the docs.) You can't modify an NSString — to modify an NSMutableString, use -appendFormat: instead.
Of course, in your toy example, you could shorten it to this:
NSString *myString = [NSString stringWithFormat:@"some text = %d", 3];
However, it's likely that you need to append a format string to an existing string created elsewhere. In that case, and particularly if you're appending multiple parts, it's good to think about and balance the pros and cons of using a mutable string or several immutable, autoreleased strings.
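To make the trade-off concrete, here is a small sketch of both routes when appending several parts in a loop. The function names and the "values:" prefix are made up for illustration.

```objc
#import <Foundation/Foundation.h>

// Immutable route: each call returns a new string that must be reassigned.
static NSString *JoinImmutable(NSArray<NSNumber *> *values) {
    NSString *result = @"values:";
    for (NSNumber *value in values) {
        result = [result stringByAppendingFormat:@" %d", value.intValue];
    }
    return result;
}

// Mutable route: appendFormat: changes the receiver in place.
static NSString *JoinMutable(NSArray<NSNumber *> *values) {
    NSMutableString *result = [NSMutableString stringWithString:@"values:"];
    for (NSNumber *value in values) {
        [result appendFormat:@" %d", value.intValue];
    }
    return [result copy];   // hand back an immutable snapshot
}
```

Both produce the same text; the immutable version creates one intermediate string per iteration, which is the cost to weigh when many parts are appended.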
Creating strings with @"" always results in immutable strings. If you want to create a new NSMutableString, do it as follows.
NSMutableString *myString = [NSMutableString stringWithString:@"some text"];
[myString appendFormat:@"some text = %d", 3];
I had a similar warning message while appending a localized string. This is how I resolved it
msgBody = [msgBody stringByAppendingFormat:@"%@",NSLocalizedString(@"LOCALSTRINGMSG",@"Message Body")]; // msgBody must already be declared and initialized

NSString - Convert to pure alphabet only (i.e. remove accents+punctuation)

I'm trying to compare names without any punctuation, spaces, accents etc.
At the moment I am doing the following:
-(NSString*) prepareString:(NSString*)a {
//remove any accents and punctuation;
a=[[[NSString alloc] initWithData:[a dataUsingEncoding:NSASCIIStringEncoding allowLossyConversion:YES] encoding:NSASCIIStringEncoding] autorelease];
a=[a stringByReplacingOccurrencesOfString:@" " withString:@""];
a=[a stringByReplacingOccurrencesOfString:@"'" withString:@""];
a=[a stringByReplacingOccurrencesOfString:@"`" withString:@""];
a=[a stringByReplacingOccurrencesOfString:@"-" withString:@""];
a=[a stringByReplacingOccurrencesOfString:@"_" withString:@""];
a=[a lowercaseString];
return a;
}
However, I need to do this for hundreds of strings and I need to make this more efficient. Any ideas?
NSString* finish = [[start componentsSeparatedByCharactersInSet:[[NSCharacterSet letterCharacterSet] invertedSet]] componentsJoinedByString:@""];
Before using any of these solutions, don't forget to use decomposedStringWithCanonicalMapping to decompose any accented letters. This will turn, for example, é (U+00E9) into e ‌́ (U+0065 U+0301). Then, when you strip out the non-alphanumeric characters, the unaccented letters will remain.
The reason why this is important is that you probably don't want, say, “dän” and “dün”* to be treated as the same. If you stripped out all accented letters, as some of these solutions may do, you'll end up with “dn”, so those strings will compare as equal.
So, you should decompose them first, so that you can strip the accents and leave the letters.
*Example from German. Thanks to Joris Weimar for providing it.
On a similar question, Ole Begemann suggests using stringByFoldingWithOptions: and I believe this is the best solution here:
NSString *accentedString = @"ÁlgeBra";
NSString *unaccentedString = [accentedString stringByFoldingWithOptions:NSDiacriticInsensitiveSearch locale:[NSLocale currentLocale]];
Depending on the nature of the strings you want to convert, you might want to set a fixed locale (e.g. English) instead of using the user's current locale. That way, you can be sure to get the same results on every machine.
One important clarification regarding the answer of BillyTheKid18756 (which was corrected by Luiz, although that was not obvious from the explanation of the code):
DO NOT USE stringWithCString as a second step to remove accents, it can add unwanted characters at the end of your string as the NSData is not NULL-terminated (as stringWithCString expects it).
Or use it and add an additional NULL byte to your NSData, like Luiz did in his code.
I think a simpler answer is to replace:
NSString *sanitizedText = [NSString stringWithCString:[sanitizedData bytes] encoding:NSASCIIStringEncoding];
By:
NSString *sanitizedText = [[[NSString alloc] initWithData:sanitizedData encoding:NSASCIIStringEncoding] autorelease];
Building on the code of BillyTheKid18756, here is the complete corrected code:
// The input text
NSString *text = @"BûvérÈ!#$&%^&(*^(_()-*/48";
// Defining what characters to accept
NSMutableCharacterSet *acceptedCharacters = [[NSMutableCharacterSet alloc] init];
[acceptedCharacters formUnionWithCharacterSet:[NSCharacterSet letterCharacterSet]];
[acceptedCharacters formUnionWithCharacterSet:[NSCharacterSet decimalDigitCharacterSet]];
[acceptedCharacters addCharactersInString:@" _-.!"];
// Turn accented letters into normal letters (optional)
NSData *sanitizedData = [text dataUsingEncoding:NSASCIIStringEncoding allowLossyConversion:YES];
// Corrected back-conversion from NSData to NSString
NSString *sanitizedText = [[[NSString alloc] initWithData:sanitizedData encoding:NSASCIIStringEncoding] autorelease];
// Removing unaccepted characters
NSString* output = [[sanitizedText componentsSeparatedByCharactersInSet:[acceptedCharacters invertedSet]] componentsJoinedByString:@""];
If you are trying to compare strings, use one of these methods. Don't try to change data.
- (NSComparisonResult)localizedCompare:(NSString *)aString
- (NSComparisonResult)localizedCaseInsensitiveCompare:(NSString *)aString
- (NSComparisonResult)compare:(NSString *)aString options:(NSStringCompareOptions)mask range:(NSRange)range locale:(id)locale
You NEED to consider the user's locale to do things right with strings, particularly things like names.
In most languages, characters like ä and å are not the same, even though they look similar. They are inherently distinct characters with distinct meanings, but the actual rules and semantics are specific to each locale.
The correct way to compare and sort strings is by considering the user's locale. Anything else is naive, wrong and very 1990s. Stop doing it.
If you are trying to pass data to a system that cannot support non-ASCII, well, this is just a wrong thing to do. Pass it as data blobs.
https://developer.apple.com/library/ios/documentation/cocoa/Conceptual/Strings/Articles/SearchingStrings.html
Also normalize your strings first (see Peter Hosey's post), by either precomposing or decomposing; basically, pick one normalized form and apply it consistently.
- (NSString *)decomposedStringWithCanonicalMapping
- (NSString *)decomposedStringWithCompatibilityMapping
- (NSString *)precomposedStringWithCanonicalMapping
- (NSString *)precomposedStringWithCompatibilityMapping
No, it's not nearly as simple and easy as we tend to think.
Yes, it requires informed and careful decision making. (and a bit of non-English language experience helps)
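As an illustration of comparing without changing the data, the sketch below uses compare:options:range:locale: with the case- and diacritic-insensitive options. NamesMatch is a hypothetical helper, and whether ignoring diacritics is appropriate at all depends on the locale and the data, as argued above, so treat the option set as an assumption to revisit.

```objc
#import <Foundation/Foundation.h>

// Returns YES when two names compare as equal while ignoring case and diacritics.
// Pass [NSLocale currentLocale] to follow the user's locale.
static BOOL NamesMatch(NSString *a, NSString *b, NSLocale *locale) {
    NSStringCompareOptions options = NSCaseInsensitiveSearch | NSDiacriticInsensitiveSearch;
    NSComparisonResult result = [a compare:b
                                   options:options
                                     range:NSMakeRange(0, a.length)
                                    locale:locale];
    return result == NSOrderedSame;
}
```

The original strings are left untouched, so they can still be displayed or stored exactly as entered.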
Consider using the RegexKit framework. You could do something like:
NSString *searchString = @"This is neat.";
NSString *regexString = @"[\\W]";
NSString *replaceWithString = @"";
NSString *replacedString = [searchString stringByReplacingOccurrencesOfRegex:regexString withString:replaceWithString];
NSLog (@"%@", replacedString);
//... Thisisneat
Consider using NSScanner, and specifically the methods -setCharactersToBeSkipped: (which accepts an NSCharacterSet) and -scanString:intoString: (which accepts a string and returns the scanned string by reference).
You may also want to couple this with -[NSString localizedCompare:], or perhaps -[NSString compare:options:] with the NSDiacriticInsensitiveSearch option. That could simplify having to remove/replace accents, so you can focus on removing punctuation, whitespace, etc.
If you must use an approach like you presented in your question, at least use an NSMutableString and replaceOccurrencesOfString:withString:options:range: — that will be much more efficient than creating tons of nearly-identical autoreleased strings. It could be that just reducing the number of allocations will boost performance "enough" for the time being.
To give a complete example by combining the answers from Luiz and Peter, adding a few lines, you get the code below.
The code does the following:
Creates a set of accepted characters
Turns accented letters into normal letters
Removes characters not in the set
Objective-C
// The input text
NSString *text = @"BûvérÈ!#$&%^&(*^(_()-*/48";
// Create set of accepted characters
NSMutableCharacterSet *acceptedCharacters = [[NSMutableCharacterSet alloc] init];
[acceptedCharacters formUnionWithCharacterSet:[NSCharacterSet letterCharacterSet]];
[acceptedCharacters formUnionWithCharacterSet:[NSCharacterSet decimalDigitCharacterSet]];
[acceptedCharacters addCharactersInString:@" _-.!"];
// Turn accented letters into normal letters (optional)
NSData *sanitizedData = [text dataUsingEncoding:NSASCIIStringEncoding allowLossyConversion:YES];
// Note: stringWithCString: expects a NUL-terminated buffer, which NSData does not guarantee
NSString *sanitizedText = [NSString stringWithCString:[sanitizedData bytes] encoding:NSASCIIStringEncoding];
// Remove characters not in the set
NSString* output = [[sanitizedText componentsSeparatedByCharactersInSet:[acceptedCharacters invertedSet]] componentsJoinedByString:@""];
Swift (2.2) example
let text = "BûvérÈ!#$&%^&(*^(_()-*/48"
// Create set of accepted characters
let acceptedCharacters = NSMutableCharacterSet()
acceptedCharacters.formUnionWithCharacterSet(NSCharacterSet.letterCharacterSet())
acceptedCharacters.formUnionWithCharacterSet(NSCharacterSet.decimalDigitCharacterSet())
acceptedCharacters.addCharactersInString(" _-.!")
// Turn accented letters into normal letters (optional)
let sanitizedData = text.dataUsingEncoding(NSASCIIStringEncoding, allowLossyConversion: true)
let sanitizedText = String(data: sanitizedData!, encoding: NSASCIIStringEncoding)
// Remove characters not in the set
let components = sanitizedText!.componentsSeparatedByCharactersInSet(acceptedCharacters.invertedSet)
let output = components.joinWithSeparator("")
Output
The output for both examples would be: BuverE!_-48
Just bumped into this, maybe it's too late, but here is what worked for me:
// text is the input string, and this just removes accents from the letters
// lossy encoding turns accented letters into normal letters
// (dataUsingEncoding: returns an immutable NSData, so take a mutable copy)
NSMutableData *sanitizedData = [[text dataUsingEncoding:NSASCIIStringEncoding
allowLossyConversion:YES] mutableCopy];
// increase length by 1 adds a 0 byte (increaseLengthBy
// guarantees to fill the new space with 0s), effectively turning
// sanitizedData into a c-string
[sanitizedData increaseLengthBy:1];
// now we just create a string with the c-string in sanitizedData
NSString *final = [NSString stringWithCString:[sanitizedData bytes] encoding:NSASCIIStringEncoding];
@interface NSString (Filtering)
- (NSString*)stringByFilteringCharacters:(NSCharacterSet*)charSet;
@end
@implementation NSString (Filtering)
- (NSString*)stringByFilteringCharacters:(NSCharacterSet*)charSet {
    NSMutableString * mutString = [NSMutableString stringWithCapacity:[self length]];
    for (NSUInteger i = 0; i < [self length]; i++){
        // use unichar, not char, so non-ASCII characters are not truncated
        unichar c = [self characterAtIndex:i];
        if(![charSet characterIsMember:c]) [mutString appendFormat:@"%C", c];
    }
    return [NSString stringWithString:mutString];
}
@end
These answers didn't work as expected for me. Specifically, decomposedStringWithCanonicalMapping didn't strip accents/umlauts as I'd expected.
Here's a variation on what I used that answers the brief:
// replace accents, umlauts etc with equivalent letter i.e 'é' becomes 'e'.
// Always use en_GB (or a locale without the characters you wish to strip) as locale, no matter which language we're taking as input
NSString *processedString = [string stringByFoldingWithOptions: NSDiacriticInsensitiveSearch locale: [NSLocale localeWithLocaleIdentifier: @"en_GB"]];
// remove non-letters
processedString = [[processedString componentsSeparatedByCharactersInSet:[[NSCharacterSet letterCharacterSet] invertedSet]] componentsJoinedByString:@""];
// trim whitespace
processedString = [processedString stringByTrimmingCharactersInSet: [NSCharacterSet whitespaceCharacterSet]];
return processedString;
Peter's Solution in Swift:
let newString = oldString.componentsSeparatedByCharactersInSet(NSCharacterSet.letterCharacterSet().invertedSet).joinWithSeparator("")
Example:
let oldString = "Jo_ - h !. nn y"
// "Jo_ - h !. nn y"
oldString.componentsSeparatedByCharactersInSet(NSCharacterSet.letterCharacterSet().invertedSet)
// ["Jo", "h", "nn", "y"]
oldString.componentsSeparatedByCharactersInSet(NSCharacterSet.letterCharacterSet().invertedSet).joinWithSeparator("")
// "Johnny"
I wanted to filter out everything except letters and numbers, so I adapted Lorean's implementation of a Category on NSString to work a little differently. In this example, you specify a string with only the characters you want to keep, and everything else is filtered out:
@interface NSString (PraxCategories)
+ (NSString *)lettersAndNumbers;
- (NSString*)stringByKeepingOnlyLettersAndNumbers;
- (NSString*)stringByKeepingOnlyCharactersInString:(NSString *)string;
@end
@implementation NSString (PraxCategories)
+ (NSString *)lettersAndNumbers { return @"abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789"; }
- (NSString*)stringByKeepingOnlyLettersAndNumbers {
    return [self stringByKeepingOnlyCharactersInString:[NSString lettersAndNumbers]];
}
- (NSString*)stringByKeepingOnlyCharactersInString:(NSString *)string {
    NSCharacterSet *characterSet = [NSCharacterSet characterSetWithCharactersInString:string];
    NSMutableString * mutableString = @"".mutableCopy;
    for (NSUInteger i = 0; i < [self length]; i++){
        // use unichar, not char, so non-ASCII characters are not truncated
        unichar character = [self characterAtIndex:i];
        if([characterSet characterIsMember:character]) [mutableString appendFormat:@"%C", character];
    }
    return mutableString.copy;
}
@end
Once you've made your Categories, using them is trivial, and you can use them on any NSString:
NSString *string = someStringValueThatYouWantToFilter;
string = [string stringByKeepingOnlyLettersAndNumbers];
Or, for example, if you wanted to get rid of everything except vowels:
string = [string stringByKeepingOnlyCharactersInString:@"aeiouAEIOU"];
If you're still learning Objective-C and aren't using Categories, I encourage you to try them out. They're the best place to put things like this because it gives more functionality to all objects of the class you Categorize.
Categories simplify and encapsulate the code you're adding, making it easy to reuse on all of your projects. It's a great feature of Objective-C!