I am new to Objective-C, and I am trying to split a string into an array in this format:
NSString *str = @":49:DE:Bahnhofsstr:12:39:11";
NSArray *arr = [str componentsSeparatedByString:@":"];
I receive the following objects in arr:
[@"", @"49", @"DE", @"Bahnhofsstr", @"12", @"39", @"11"]
But I need it in this format:
[@"", @"49", @"DE", @"Bahnhofsstr:12:39:11"]
Does anyone have any ideas?
You can use a regular expression. The one you want is this:
^([^:]*):([^:]*):([^:]*):(.*)$
The above matches three colon-free sequences of characters, each followed by a colon, and then a fourth group consisting of any kind of character. The ^ at the front and the $ at the end anchor the match to the beginning and the end of the string respectively, so the groups always account for the whole string rather than for a match embedded somewhere inside it.
The parentheses delimit capture groups, which are returned to you once the regular expression matching has been done. The first capture group is all the characters up to the first colon. The second capture group is all the characters between the first and second colons. The third capture group is all the characters between the second and third colons, and the fourth capture group is all the characters after the third colon.
There is also a zeroth capture group which is the entire matching sequence.
Here's how to use this in Objective-C:
NSString* pattern = @"^([^:]*):([^:]*):([^:]*):(.*)$";
NSString* line = @":49:DE:Bahnhofsstr:12:39:11";
NSError* error = nil;
NSRegularExpression* regex = [NSRegularExpression regularExpressionWithPattern: pattern
                                                                       options: 0
                                                                         error: &error];
if (regex == nil)
{
    NSLog(@"Invalid regular expression %@, %@", pattern, error);
}
else
{
    NSArray* matches = [regex matchesInString: line
                                      options: 0
                                        range: NSMakeRange(0, [line length])];
    if ([matches count] == 1)
    {
        // Should only be one match
        NSTextCheckingResult* result = [matches objectAtIndex: 0];
        NSMutableArray* captureGroups = [[NSMutableArray alloc] init];
        // Omit capture group 0 because it will be the whole string
        for (NSUInteger i = 1 ; i < [result numberOfRanges] ; i++)
        {
            NSRange groupRange = [result rangeAtIndex: i];
            NSString* captureGroup = [line substringWithRange: groupRange];
            [captureGroups addObject: captureGroup];
        }
        NSLog(@"The fields are %@", captureGroups);
    }
    else
    {
        // match error
    }
}
Regular expressions, as proposed by JeremyP, are an obvious solution to this sort of problem.
Some people don't like regexes, though, so another solution is to use NSScanner, which is also made to scan strings and read the results into variables. Given that the delimiter is the same for all fields, it even lends itself to a loop, which cuts down on the tedious scanning code.
Here is an example:
NSString *str = @":49:DE:Bahnhofsstr:12:39:11";
const NSUInteger nFields = 4;
NSScanner *myScanner = [NSScanner scannerWithString: str];
NSMutableArray *arr = [NSMutableArray array];
for (NSUInteger i = 0; i < nFields - 1; i++) {
    // Stays nil when the field is empty, i.e. the scanner is already at a delimiter
    NSString *field = nil;
    // The BOOLs here really ought to be checked
    BOOL found = [myScanner scanUpToString: @":" intoString: &field];
    BOOL passedDelimiter = [myScanner scanString: @":" intoString: NULL];
    [arr addObject: field ?: @"" ];
}
NSString *lastField = [[myScanner string] substringFromIndex:[myScanner scanLocation]];
[arr addObject: lastField];
That last line to read the remainder of the string is taken straight from the docs for NSScanner.
Related
I am new to learning Objective-C (my first programming language!) and trying to write a little program that will add 1 to a number contained within a string. E.g. AA1BB becomes AA2BB.
So far I have tried to extract the number and add 1. Then extract the letters and add everything back together in a new string. I have had some success but can't manage to get back to the original arrangement of the initial string.
The code I have so far gives a result of 2BB and disregards the characters before the number which is not what I am after (the result I am trying for with this example would be AA2BB). I can't figure out why!
NSString* aString = @"AA1BB";
NSCharacterSet *numberCharset = [NSCharacterSet characterSetWithCharactersInString:@"0123456789-"]; //Creating a set of Characters object//
NSScanner *theScanner = [NSScanner scannerWithString:aString];
int someNumbers = 0;
while (![theScanner isAtEnd]) {
    // Remove Letters
    [theScanner scanUpToCharactersFromSet:numberCharset
                               intoString:NULL];
    if ([theScanner scanInt:&someNumbers]) {}
}
NSCharacterSet *letterCharset = [NSCharacterSet characterSetWithCharactersInString:@"ABCDEFGHIJKLMNOPQRSTUVWXYZ"];
NSScanner *letterScanner = [NSScanner scannerWithString:aString];
NSString* someLetters;
while (![letterScanner isAtEnd]) {
    // Remove numbers
    [letterScanner scanUpToCharactersFromSet:letterCharset
                                  intoString:NULL];
    if ([letterScanner scanCharactersFromSet:letterCharset intoString:&someLetters]) {}
}
++someNumbers; //adds +1 to the Number//
NSString *newString = [[NSString alloc]initWithFormat:@"%i%@", someNumbers, someLetters];
NSLog (@"String is now %@", newString);
This is an alternative solution using a regular expression.
It finds the range of the integer (\\d+ is one or more digits), extracts it, increments it and replaces the value at the given range.
NSString* aString = @"AA1BB";
NSRange range = [aString rangeOfString:@"\\d+" options:NSRegularExpressionSearch];
if (range.location != NSNotFound) {
    NSInteger numericValue = [aString substringWithRange:range].integerValue;
    numericValue++;
    // Cast to long so %ld is correct on both 32-bit and 64-bit builds
    aString = [aString stringByReplacingCharactersInRange:range withString:[NSString stringWithFormat:@"%ld", (long)numericValue]];
}
NSLog(@"%@", aString);
I have an NSString formatted like this:
"Hello world 12 looking for some 56"
I want to find all instances of numbers separated by whitespace and place them in an NSArray. I don't want to remove the numbers from the string, though.
What's the best way of achieving this?
This is a solution using a regular expression, as suggested in the comment.
NSString *string = @"Hello world 12 looking for some 56";
NSRegularExpression *expression = [NSRegularExpression regularExpressionWithPattern:@"\\b\\d+" options:0 error:nil];
NSArray *matches = [expression matchesInString:string options:0 range:(NSMakeRange(0, string.length))];
NSMutableArray *result = [[NSMutableArray alloc] init];
for (NSTextCheckingResult *match in matches) {
    [result addObject:[string substringWithRange:match.range]];
}
NSLog(@"%@", result);
First build an array with NSString's componentsSeparatedByString: method (see this SO question). Then iterate over the array and check whether each element is a number (see this SO question: Checking if NSString is Integer).
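A minimal sketch of that split-then-filter idea, assuming a "number" means a token made up only of the ASCII digits 0-9. The helper name NumberTokensInString is hypothetical and not taken from either linked question; it splits on whitespace rather than on a single space so that runs of spaces do not matter.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: returns the whitespace-separated tokens of `text`
// that consist only of ASCII digits, in their original order.
static NSArray *NumberTokensInString(NSString *text) {
    NSCharacterSet *nonDigits =
        [[NSCharacterSet characterSetWithCharactersInString:@"0123456789"] invertedSet];
    NSArray *tokens = [text componentsSeparatedByCharactersInSet:
                          [NSCharacterSet whitespaceAndNewlineCharacterSet]];
    NSMutableArray *numbers = [NSMutableArray array];
    for (NSString *token in tokens) {
        // Skip empty tokens (produced by consecutive spaces) and any token
        // that contains at least one non-digit character.
        if ([token length] > 0 &&
            [token rangeOfCharacterFromSet:nonDigits].location == NSNotFound) {
            [numbers addObject:token];
        }
    }
    return [numbers copy];
}
```

The original string is left untouched; the returned array holds copies of the numeric tokens as strings, which can be converted with integerValue if needed.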
I don't know where you are planning to perform this action. Depending on the size of the string it may not be fast, so if it is called from a table cell, for example, scrolling may become choppy.
Code:
+ (NSArray *)getNumbersFromString:(NSString *)str {
    NSMutableArray *retVal = [NSMutableArray array];
    NSCharacterSet *numericSet = [NSCharacterSet decimalDigitCharacterSet];
    NSString *placeholder = @"";
    unichar currentChar;
    for (NSUInteger i = 0; i < [str length]; i++) {
        currentChar = [str characterAtIndex:i];
        if ([numericSet characterIsMember:currentChar]) {
            // Extend the current run of digits by one character
            placeholder = [placeholder stringByAppendingString:
                           [NSString stringWithCharacters:&currentChar
                                                   length:1]];
        } else if ([placeholder length] > 0) {
            // A run of digits just ended; box the value so it can go in the array
            [retVal addObject:@([placeholder intValue])];
            placeholder = @"";
        }
    }
    // Flush a run of digits that reaches the end of the string
    if ([placeholder length] > 0) [retVal addObject:@([placeholder intValue])];
    return [retVal copy];
}
To explain what is happening above, essentially I am:
going through every character until I find a digit
appending that digit, and any digits that follow it, to a string
adding the collected number to the array once the run of digits ends
Hope this helps. Please ask for clarification if needed.
I want to know a simple and fast way to determine if all characters in an NSString are the same.
For example:
NSString *string = @"aaaaaaaaa";
=> return YES
NSString *string = @"aaaaaaabb";
=> return NO
I know that I can achieve it by using a loop but my NSString is long so I prefer a shorter and simpler way.
You can use this: replace every occurrence of the first character with an empty string and check the length of what remains:
- (BOOL)sameCharsInString:(NSString *)str {
    if ([str length] == 0) return NO;
    return [[str stringByReplacingOccurrencesOfString:[str substringToIndex:1] withString:@""] length] == 0;
}
Here are two possibilities that fail as quickly as possible and don't (explicitly) create copies of the original string, which should be advantageous since you said the string was large.
First, use NSScanner to repeatedly try to read the first character in the string. If the loop ends before the scanner has reached the end of the string, there are other characters present.
NSScanner * scanner = [NSScanner scannerWithString:s];
NSString * firstChar = [s substringWithRange:[s rangeOfComposedCharacterSequenceAtIndex:0]];
while( [scanner scanString:firstChar intoString:NULL] ) continue;
BOOL stringContainsOnlyOneCharacter = [scanner isAtEnd];
Regex is also a good tool for this problem, since "a character followed by any number of repetitions of that character" is very simply expressed with a single back reference:
// Match one of any character at the start of the string,
// followed by any number of repetitions of that same character
// until the end of the string.
NSString * patt = @"^(.)\\1*$";
NSRegularExpression * regEx =
[NSRegularExpression regularExpressionWithPattern:patt
options:0
error:NULL];
NSArray * matches = [regEx matchesInString:s
options:0
range:(NSRange){0, [s length]}];
BOOL stringContainsOnlyOneCharacter = ([matches count] == 1);
Both these options correctly deal with multi-byte and composed characters; the regex version also does not require an explicit check for the empty string.
Use this loop:
NSString *firstChar = [str substringWithRange:NSMakeRange(0, 1)];
for (int i = 1; i < [str length]; i++) {
NSString *ch = [str substringWithRange:NSMakeRange(i, 1)];
if(![ch isEqualToString:firstChar])
{
return NO;
}
}
return YES;
I am trying to parse a set of words that contain first Greek letters, then English letters. This would be easy if there were a delimiter between the two sets. This is what I've built so far:
- (void)loadWordFileToArray:(NSBundle *)bundle {
    NSLog(@"loadWordFileToArray");
    if (bundle != nil) {
        NSString *path = [bundle pathForResource:@"alfa" ofType:@"txt"];
        //pull the content from the file into memory
        NSData* data = [NSData dataWithContentsOfFile:path];
        //convert the bytes from the file into a string
        NSString* string = [[NSString alloc] initWithBytes:[data bytes]
                                                    length:[data length]
                                                  encoding:NSUTF8StringEncoding];
        //split the string around newline characters to create an array
        NSString* delimiter = @"\n";
        incomingWords = [string componentsSeparatedByString:delimiter];
        NSLog(@"incomingWords count: %lu", (unsigned long)incomingWords.count);
    }
}
-(void)parseWordArray{
    NSLog(@"parseWordArray");
    NSString *seperator = @" = ";
    int i = 0;
    for (i=0; i < incomingWords.count; i++) {
        NSString *incomingString = [incomingWords objectAtIndex:i];
        NSScanner *scanner = [NSScanner localizedScannerWithString: incomingString];
        NSString *firstString;
        NSString *secondString;
        NSInteger scanPosition;
        [scanner scanUpToString:seperator intoString:&firstString];
        scanPosition = [scanner scanLocation];
        secondString = [[scanner string] substringFromIndex:scanPosition+[seperator length]];
        // NSLog(@"greek: %@", firstString);
        // NSLog(@"english: %@", secondString);
        [outgoingWords insertObject:[NSMutableArray arrayWithObjects:@"greek", firstString, @"english",secondString,@"category", @"", nil] atIndex:0];
        [englishWords insertObject:[NSMutableArray arrayWithObjects:secondString,nil] atIndex:0];
    }
}
But I cannot count on there being delimiters.
I have looked at this question. I want something similar: grab the characters in the string until an English letter is found, then put that first group into one new string and all the characters after it into a second new string.
I only have to run this a few times, so optimization is not my highest priority. Any help would be appreciated.
EDIT:
I've changed my code as shown below to make use of NSLinguisticTagger. This works, but is this the best way? Note that, for some reason, the language reported for English characters is "und".
The incoming string is: άγαλμα, το statue. Only the last 6 characters are in English.
int j = 0;
for (j=0; j<incomingString.length; j++) {
    NSString *language = [tagger tagAtIndex:j scheme:NSLinguisticTagSchemeLanguage tokenRange:NULL sentenceRange:NULL];
    if ([language isEqual: @"und"]) {
        NSLog(@"j is: %i", j);
        int k = 0;
        for (k=0; k<j; k++) {
            NSRange range = NSMakeRange (0, k);
            NSString *tempString = [incomingString substringWithRange:range ];
            NSLog (@"tempString: %@", tempString);
        }
        return;
    }
    NSLog (@"Language: %@", language);
}
Alright so what you could do is use NSLinguisticTagger to find out the language of the word (or letter) and if the language has changed then you know where to split the string. You can use NSLinguisticTagger like this:
NSArray *tagschemes = @[NSLinguisticTagSchemeLanguage];
// The options are NSLinguisticTaggerOptions flags; NSLinguisticTagPunctuation is a tag, not an option
NSLinguisticTagger *tagger = [[NSLinguisticTagger alloc] initWithTagSchemes:tagschemes options: NSLinguisticTaggerOmitPunctuation | NSLinguisticTaggerOmitWhitespace];
[tagger setString:@"This is my string in English."];
NSString *language = [tagger tagAtIndex:0 scheme:NSLinguisticTagSchemeLanguage tokenRange:NULL sentenceRange:NULL];
//Loop through each index of the string's characters and check the language as above.
//If it has changed then you can assume the language has changed.
Alternatively, you can use NSSpellChecker's requestCheckingOfString to get the dominant language in a range of characters:
NSSpellChecker *spellChecker = [NSSpellChecker sharedSpellChecker];
[spellChecker setAutomaticallyIdentifiesLanguages:YES];
NSString *spellCheckText = @"Guten Herr Mustermann. Dies ist ein deutscher Text. Bitte löschen Sie diesen nicht.";
[spellChecker requestCheckingOfString:spellCheckText
                                range:(NSRange){0, [spellCheckText length]}
                                types:NSTextCheckingTypeOrthography
                              options:nil
               inSpellDocumentWithTag:0
                    completionHandler:^(NSInteger sequenceNumber, NSArray *results, NSOrthography *orthography, NSInteger wordCount) {
    NSLog(@"dominant language = %@", orthography.dominantLanguage);
}];
This answer has information on how to detect the language of an NSString.
Allow me to introduce two good friends of mine.
NSCharacterSet and NSRegularExpression.
Along with them, normalization. (In Unicode terms)
First, you should normalize strings before analyzing them against a character set.
You will need to look at the choices, but normalizing to all composed forms is the way I would go.
This means an accented character is one instead of two or more.
It simplifies the number of things to compare.
Next, you can easily build your own NSCharacterSet objects from strings (loaded from files even) to use to test set membership.
Lastly, regular expressions can achieve the same thing with Unicode Property Names as classes or categories of characters. Regular expressions could be more terse but more expressive.
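As a rough sketch of the normalize-then-test-membership approach, the function below composes the string first and then looks for the first member of a hand-built character set. It assumes "English letter" means the ASCII letters a-z and A-Z; the helper name SplitAtFirstEnglishLetter is hypothetical.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: splits `line` just before its first ASCII letter.
// Returns a two-element array: the leading part (e.g. the Greek words,
// trimmed of surrounding spaces) and the remainder starting at that letter.
static NSArray *SplitAtFirstEnglishLetter(NSString *line) {
    // Normalize to composed form so an accented letter is one character.
    NSString *normalized = [line precomposedStringWithCanonicalMapping];
    // Assumption: "English" here means plain ASCII letters only.
    NSCharacterSet *englishLetters = [NSCharacterSet characterSetWithCharactersInString:
        @"abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"];
    NSRange firstEnglish = [normalized rangeOfCharacterFromSet:englishLetters];
    if (firstEnglish.location == NSNotFound) {
        // No English letter at all: everything belongs to the first part.
        return [NSArray arrayWithObjects:normalized, @"", nil];
    }
    NSString *first = [[normalized substringToIndex:firstEnglish.location]
        stringByTrimmingCharactersInSet:[NSCharacterSet whitespaceCharacterSet]];
    NSString *second = [normalized substringFromIndex:firstEnglish.location];
    return [NSArray arrayWithObjects:first, second, nil];
}
```

A hand-built set keeps the rule explicit and easy to adjust, for instance by loading the accepted characters from a file as suggested above.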
What's the simplest way, given a string:
NSString *str = @"Some really really long string is here and I just want the first 10 words, for example";
to result in an NSString with the first N (e.g., 10) words?
EDIT: I'd also like to make sure it doesn't fail if the str is shorter than N.
If the words are space-separated:
NSInteger nWords = 10;
NSRange wordRange = NSMakeRange(0, nWords);
NSArray *firstWords = [[str componentsSeparatedByString:@" "] subarrayWithRange:wordRange];
If you want to break on all whitespace:
NSCharacterSet *delimiterCharacterSet = [NSCharacterSet whitespaceAndNewlineCharacterSet];
NSArray *firstWords = [[str componentsSeparatedByCharactersInSet:delimiterCharacterSet] subarrayWithRange:wordRange];
Then,
NSString *result = [firstWords componentsJoinedByString:@" "];
While Barry Wark's code works well for English, it is not the preferred way to detect word breaks. Many languages, such as Chinese and Japanese, do not separate words using spaces. And German, for example, has many compounds that are difficult to separate correctly.
What you want to use is CFStringTokenizer:
CFStringRef string; // Get string from somewhere
CFLocaleRef locale = CFLocaleCopyCurrent();
CFStringTokenizerRef tokenizer = CFStringTokenizerCreate(kCFAllocatorDefault, string, CFRangeMake(0, CFStringGetLength(string)), kCFStringTokenizerUnitWord, locale);
CFStringTokenizerTokenType tokenType = kCFStringTokenizerTokenNone;
unsigned tokensFound = 0, desiredTokens = 10; // or the desired number of tokens
while(kCFStringTokenizerTokenNone != (tokenType = CFStringTokenizerAdvanceToNextToken(tokenizer)) && tokensFound < desiredTokens) {
CFRange tokenRange = CFStringTokenizerGetCurrentTokenRange(tokenizer);
CFStringRef tokenValue = CFStringCreateWithSubstring(kCFAllocatorDefault, string, tokenRange);
// Do something with the token
CFShow(tokenValue);
CFRelease(tokenValue);
++tokensFound;
}
// Clean up
CFRelease(tokenizer);
CFRelease(locale);
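To get from the token loop above to an actual NSString holding the first N words, one option is to remember where the last accepted token ends and cut the original string there. This is only a sketch: the function name FirstWordsUsingTokenizer is hypothetical, and it assumes the word-unit tokenizer reports words only, skipping whitespace and punctuation.

```objc
#import <Foundation/Foundation.h>
#import <CoreFoundation/CoreFoundation.h>

// Hypothetical helper: returns the prefix of `text` that ends with its
// maxWords-th word, or with its last word if there are fewer.
static NSString *FirstWordsUsingTokenizer(NSString *text, NSUInteger maxWords) {
    CFStringRef string = (__bridge CFStringRef)text;
    CFLocaleRef locale = CFLocaleCopyCurrent();
    CFStringTokenizerRef tokenizer =
        CFStringTokenizerCreate(kCFAllocatorDefault, string,
                                CFRangeMake(0, CFStringGetLength(string)),
                                kCFStringTokenizerUnitWord, locale);
    CFIndex endOfLastWord = 0;
    NSUInteger tokensFound = 0;
    // Check the count first so the tokenizer is not advanced needlessly.
    while (tokensFound < maxWords &&
           CFStringTokenizerAdvanceToNextToken(tokenizer) != kCFStringTokenizerTokenNone) {
        CFRange tokenRange = CFStringTokenizerGetCurrentTokenRange(tokenizer);
        endOfLastWord = tokenRange.location + tokenRange.length;
        ++tokensFound;
    }
    // Clean up the Core Foundation objects created above.
    CFRelease(tokenizer);
    CFRelease(locale);
    return [text substringToIndex:(NSUInteger)endOfLastWord];
}
```

Cutting the original string, rather than re-joining tokens with spaces, keeps whatever separators the text actually used, which matters for languages that do not put spaces between words.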
Based on Barry's answer, I wrote a function for the sake of this page (still giving him credit on SO)
+ (NSString*)firstWords:(NSString*)theStr howMany:(NSInteger)maxWords {
    NSArray *theWords = [theStr componentsSeparatedByString:@" "];
    // Clamp so a string shorter than maxWords words does not raise a range exception
    if (maxWords > (NSInteger)[theWords count]) {
        maxWords = [theWords count];
    }
    // The range length is the number of words to keep, so no "- 1" here
    NSRange wordRange = NSMakeRange(0, maxWords);
    NSArray *firstWords = [theWords subarrayWithRange:wordRange];
    return [firstWords componentsJoinedByString:@" "];
}
Here's my solution, derived from the answers given here, for my own problem of removing the first word from a string...
NSMutableArray *words = [NSMutableArray arrayWithArray:[lowerString componentsSeparatedByString:@" "]];
[words removeObjectAtIndex:0];
return [words componentsJoinedByString:@" "];