I'm quite new to NSString manipulation and don't have much experience manipulating strings in any language.
My problem is that I have a string containing a lot of data, and within this data is a name that I need to extract into a new NSString. For example:
NSString *dataString = @"randomdata12359123888585/name_john_randomdatawadapoawdk";
"/name_" always precedes the data I need and "_" always follows it.
I have looked into things such as NSScanner, but I'm not sure what the correct approach is or how to use NSScanner.
Your string format is very well-defined (as you say, the name you are after is always preceded by "/name_" and always followed by "_"), and I suppose that the name ("john") hence cannot contain an underscore.
I'd therefore consider a simple regular expression, which is perfectly suited for this sort of problem:
NSString *regexPattern = @"^.*/name_(.*?)_.*$";
NSString *name = [dataString stringByReplacingOccurrencesOfString:regexPattern
                                                       withString:@"$1"
                                                          options:NSRegularExpressionSearch
                                                            range:NSMakeRange(0, dataString.length)];
In case you are not familiar with regular expressions, what is going on here is:
Begin at the beginning of the string (the "^")
Allow anything (".*") followed by "/name_"
Capture what follows (the parentheses mean "capture this")
Inside the parentheses, allow anything (".*"), but make it as short as possible (the "?" after the "*")
It must be followed by an underscore and then allow anything that happens to be there up to the end of the string (the "$")
This will match the whole string, and when substituting the match (i.e., all of the string) with "$1", it will replace the match with the substring captured by the first (and only) pair of parentheses.
Result: It will produce a string that contains only the name. If the string does not have the correct format (i.e., no name between two underscores), then it will not change anything and return the full, original string.
It is a matter of coding style whether you prefer one approach over the other, but if you like regular expressions, then this approach is clean, easy to understand, and simple to maintain.
As I see it, any fragility in this is due to the data format, which looks suspiciously like something that depends on other "random" pieces of data, so whichever method you choose to parse that string, make sure you add some defensive tests to check the data format and alert you if unexpected strings begin to enter your data. This could be years from now, when you have forgotten everything about underscores, regexes and NSScanner.
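As a rough sketch of such a defensive test, the variant below uses NSRegularExpression and returns nil when the input does not have the expected shape, instead of handing back the original string. The helper name ExtractName and the stricter pattern (a name made of one or more non-underscore characters) are assumptions for illustration, not part of the answer above.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: returns the text between "/name_" and the next "_",
// or nil when the input does not contain that pattern.
static NSString *ExtractName(NSString *dataString) {
    if (dataString == nil) {
        return nil;
    }
    NSError *error = nil;
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:@"/name_([^_]+)_"
                                                  options:0
                                                    error:&error];
    if (regex == nil) {
        return nil; // the pattern failed to compile
    }
    NSTextCheckingResult *match =
        [regex firstMatchInString:dataString
                          options:0
                            range:NSMakeRange(0, dataString.length)];
    if (match == nil) {
        return nil; // unexpected format: the caller decides how to react
    }
    // Range 0 is the whole match; range 1 is the first capture group.
    return [dataString substringWithRange:[match rangeAtIndex:1]];
}
```

A caller can then log or alert on a nil result rather than silently treating the whole input as a name.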
-(void)separateString{
    NSString *dataString = @"randomdata12359123888585/name_john_randomdatawadapoawdk";
    NSArray *arr1 = [dataString componentsSeparatedByString:@"/"];
    NSArray *arr2 = [[arr1 objectAtIndex:1] componentsSeparatedByString:@"_"];
    NSLog(@"%@ %@", arr1, arr2);
}
The output you get is
arr1= (
randomdata12359123888585,
"name_john_randomdatawadapoawdk"
)
arr2 = (
name,
john,
randomdatawadapoawdk
)
Now you can access the name, or any other component, by its array index.
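Note that objectAtIndex: raises an exception when the index is out of bounds, which happens here if the input contains no "/". A minimal sketch of the same splitting approach with count checks follows; the helper name NameFromComponents and the choice to return nil on malformed input are assumptions for illustration.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: same splitting idea as above, but checks the array
// sizes before indexing so malformed input yields nil instead of a crash.
static NSString *NameFromComponents(NSString *dataString) {
    NSArray *slashParts = [dataString componentsSeparatedByString:@"/"];
    if (slashParts.count < 2) {
        return nil; // no "/" in the input
    }
    NSArray *underscoreParts =
        [[slashParts objectAtIndex:1] componentsSeparatedByString:@"_"];
    // Expect at least "name", the value, and whatever follows it.
    if (underscoreParts.count < 3 ||
        ![[underscoreParts objectAtIndex:0] isEqualToString:@"name"]) {
        return nil;
    }
    return [underscoreParts objectAtIndex:1];
}
```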
I managed to do this with NSScanner, however the array answer would work too so I've upvoted it.
The NSScanner code I used for anyone else facing a similar problem is:
-(void)formatName{
    NSString *stringToSearch = _URLString; // the long string we wish to search
    NSScanner *scanner = [NSScanner scannerWithString:stringToSearch];
    [scanner scanUpToString:@"name_" intoString:nil]; // skip all characters before name_
    NSString *substring = nil;
    // Consume the name_ marker, then capture everything up to the next underscore.
    // Using if rather than while avoids looping forever when the name is empty.
    if ([scanner scanString:@"name_" intoString:nil] &&
        [scanner scanUpToString:@"_" intoString:&substring]) {
        _nameIwant = substring; // nameIwant is a property to store the name I scanned for
    }
}
I would like to programmatically receive a JIRA ticket number, like @"ART-235", and obtain the bare digits / number, @"235".
A question I asked about using regular expressions turned up Regular expressions in an Objective-C Cocoa application with a link to https://developer.apple.com/library/ios/documentation/Foundation/Reference/NSRegularExpression_Class/Reference/Reference.html, and it looks indeed like I can have a regular expression such as \D*?(\d+) and retrieve the value via a regular expression.
However, I wanted to check in and ask if there is a less bletcherous way to do this, or is this an example of why Objective-C is called a bit archaic? The second link gives what looks like everything I need, but it smells a little funny. For the objective stated above, do I want to use regular expressions, or is there a more nicely idiomatic way to perform this sort of string manipulation?
Sounds like -componentsSeparatedByString: would do what you need.
Getting pieces of a fixed, known, format that doesn't use paired delimiters or nesting is exactly the kind of thing that regexes are made to do. I don't see a thing wrong with using one here.
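For the ticket example, a regex search can be done directly on NSString without building an NSRegularExpression object. The sketch below is one way to do it; the helper name TicketDigits is hypothetical, and it assumes the first run of digits in the identifier is the ticket number.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: returns the first run of decimal digits in the
// ticket identifier, or nil when there are none.
static NSString *TicketDigits(NSString *ticket) {
    if (ticket == nil) {
        return nil;
    }
    // NSRegularExpressionSearch makes rangeOfString: treat the argument as a pattern.
    NSRange digits = [ticket rangeOfString:@"\\d+"
                                   options:NSRegularExpressionSearch];
    if (digits.location == NSNotFound) {
        return nil;
    }
    return [ticket substringWithRange:digits];
}
```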
To address your question as written (about "iteration"), however, you might want to look at NSScanner, which does move through the characters of a string by "character class", allowing you to evaluate them as you go.
NSString *ticket = @"ART-235";
NSScanner *scanner = [NSScanner scannerWithString:ticket];
// Skip everything before the first digit
[scanner scanUpToCharactersFromSet:[NSCharacterSet decimalDigitCharacterSet]
                        intoString:nil];
// Either read the number as an integer...
NSInteger ticketNumber;
[scanner scanInteger:&ticketNumber];
// ...or, instead of the scanInteger: call, read the digits as a string
NSString *ticketDigits;
[scanner scanCharactersFromSet:[NSCharacterSet decimalDigitCharacterSet]
                    intoString:&ticketDigits];
As other answers have already said, that simple case can be solved using componentsSeparatedByString:@"-".
That said, your original question is how to enumerate individual characters.
Not all characters are the same size: some languages combine more than one character into a new character. When enumerating such a string you most likely want the result of that composition, not the individual pieces. In Objective-C you can enumerate these composed characters like this:
NSString *myString = @"Hello Strings!";
[myString enumerateSubstringsInRange:NSMakeRange(0, myString.length)
                             options:NSStringEnumerationByComposedCharacterSequences
                          usingBlock:^(NSString *substring, NSRange substringRange, NSRange enclosingRange, BOOL *stop) {
    // Do something with the composed character
    NSLog(@"%@", substring);
}];
The example above will log each character one by one.
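To make the difference between composed characters and NSString's length concrete, here is a small sketch that counts composed character sequences. The helper name ComposedCharacterCount is hypothetical. For a string such as "e" followed by a combining acute accent (U+0301), length reports two UTF-16 code units while the enumeration visits a single composed character.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: counts user-visible (composed) characters rather
// than UTF-16 code units.
static NSUInteger ComposedCharacterCount(NSString *string) {
    __block NSUInteger count = 0;
    [string enumerateSubstringsInRange:NSMakeRange(0, string.length)
                               options:NSStringEnumerationByComposedCharacterSequences |
                                       NSStringEnumerationSubstringNotRequired
                            usingBlock:^(NSString *substring, NSRange substringRange,
                                         NSRange enclosingRange, BOOL *stop) {
        // Only the count is needed, so the substring itself is not requested.
        count++;
    }];
    return count;
}
```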
I made a simple method for you that does the trick, provided that the ticket identifiers are always in a "string-number" format!
-(int) numberFromJiraTicket:(NSString*)ticketId
{
    //Get number as string
    NSString *number = [[ticketId componentsSeparatedByString:@"-"] lastObject];
    //Return the INT representation of the number
    return [number intValue];
}
I'm trying to write a method that will search an NSString, determine if an individual word within the string is over 6 characters long and replace that word with some other word (something arbitrary like 'hello').
I am starting with a long paragraph and I need to end up with a single NSString object whose format and spacing has not been affected by the find and replace.
Why another answer?
There are a couple of subtle problems with the simple solutions using componentsSeparatedByString::
Punctuation is not handled as word delimiters.
Whitespace other than the space character (newline, tab) is simply dropped.
On long strings a lot of memory is wasted.
It's slow.
Example
Assuming a substitution word of "–" a string like ...
“Essentially,” the D.H.C. concluded,
”bokanovskification consists of a series of arrests of development.”
... would result in ...
– the D.H.C. – – of a series of – of –
... while the correct output would be:
“–,” the D.H.C. –,”– – of a series of – of –.”
Solution
Fortunately there's a much better, yet simple solution in Cocoa: -[NSString enumerateSubstringsInRange:options:usingBlock:]
It provides fast iteration over substrings defined by the options argument. One possibility is NSStringEnumerationByWords, which enumerates all substrings that are actually real words (in the current locale). It even detects individual words in languages that don't use delimiters (spaces) to separate words, like Japanese.
Comparing Solutions
Here's a simple demo project that works on the jargon file (1.6 MB, 237,239 words). It compares three different solutions:
componentsSeparatedByString: 270 ms
enumerateSubstringsInRange: 125 ms
stringByReplacingOccurrencesOfString, as described by @Monolo: 200 ms
Implementation
The core of it is the replacement loop:
NSMutableString *result = [NSMutableString stringWithCapacity:[originalString length]];
__block NSUInteger location = 0;
[originalString enumerateSubstringsInRange:(NSRange){0, [originalString length]}
options:NSStringEnumerationByWords | NSStringEnumerationLocalized | NSStringEnumerationSubstringNotRequired
usingBlock:^(NSString *substring, NSRange substringRange, NSRange enclosingRange, BOOL *stop) {
if (substringRange.length > maxChar) {
NSString *charactersBetweenLongWords = [originalString substringWithRange:(NSRange){ location, substringRange.location - location }];
[result appendString:charactersBetweenLongWords];
[result appendString:replaceWord];
location = substringRange.location + substringRange.length;
}
}];
[result appendString:[originalString substringFromIndex:location]];
Caveat
As pointed out by Monolo, the proposed code uses NSString's length to determine the number of characters in a word. That's a questionable approach, to say the least. In fact a string's length specifies the number of UTF-16 code units used to encode the string, a value that often differs from what a human would take to be the number of characters.
As the term "character" has different meanings in various contexts and the OP didn't specify which kind of character count to use I just leave the code as it was. If you want a different count please refer to the documentation that discusses the topic:
Apple's String Programming Guide, Characters and Grapheme Clusters
Unicode FAQ: How are characters counted when measuring the length or position of a character in a string?
As you can see from the answers, there are several ways to accomplish what you are after, but personally I prefer to use the NSString class's stringByReplacingOccurrencesOfString:withString:options:range: method, which is made exactly to replace substrings with another string.
In your case we need to use the NSRegularExpressionSearch option, which lets us identify words with 7 or more letters (i.e., more than 6 letters, as you state it).
If you use the \w* character expression you will automatically get Unicode support, so it works on as many languages as Apple (actually, ICU) supports.
It goes like this:
NSString *stringWithLongWords = @"There are some words of extended length in this text. One of them is Escher's. They will be identified with a regular expression and changed for some arbitrary word.";
NSString *overSixCharsPattern = @"(?w)\\b[\\w]{7,}\\b";
NSString *replacementString = @"hello";
NSString *result = [stringWithLongWords stringByReplacingOccurrencesOfString:overSixCharsPattern
                                                                  withString:replacementString
                                                                     options:NSRegularExpressionSearch
                                                                       range:NSMakeRange(0, stringWithLongWords.length)];
The \b expressions denote a word boundary, which ensures that the whole word is matched and substituted. The w modifier makes \b use a more natural definition of word boundaries. Specifically, it handles the string "Escher's", the example mentioned by @NikolaiRuhe. Docs here, with a specific discussion of boundary detection here.
Also notice that a literal NSString (i.e., one you type directly in your Objective-C source file) needs two backslashes in the source code to produce one in the generated string.
There is more information in the NSString documentation
* Technically \w matches word characters, which also includes numbers in the definition used by regexes.
Hi, I have a problem with strings. I want to add:
NSString *termo = [NSString stringWithFormat:@"%@%@: %@ ", @"~00000000:", nazwa, @".*"];
This .* is anything. How can I use it?
Your question is very unclear, however your comment "I have some string which I get from server. I want to parse this string with this" seems to suggest:
that you have a string obtained from somewhere;
this string should contain the text you have stored in the variable nazwa; and
you wish to find the text that follows whatever nazwa contains.
If this guess is correct then the following code fragment might help. It does not contain the checks you need to verify that the input actually contains what you are looking for and that something follows it; check the documentation for the methods used to see what they return if they don't locate the text.
// a string representing the input
NSString *theInput = @"The quick brown fox";
// nazwa - the text we are looking for
NSString *nazwa = @"quick";
// locate the text in the input
NSRange nazwaPosition = [theInput rangeOfString:nazwa];
// a range contains a location (offset) and a length, so
// adding these finds the offset of what follows
NSUInteger endofNazwa = nazwaPosition.location + nazwaPosition.length;
// extract what follows
NSString *afterNazwa = [theInput substringFromIndex:endofNazwa];
// display
NSLog(@"theInput '%@'\nnazwa '%@'\nafterNazwa '%@'", theInput, nazwa, afterNazwa);
This outputs:
theInput 'The quick brown fox'
nazwa 'quick'
afterNazwa ' brown fox'
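A minimal sketch of the missing checks might look like the following. rangeOfString: reports a location of NSNotFound when the text is absent, so that case has to be handled before computing the offset. The helper name TextAfter and the decision to return nil for a missing or empty marker are assumptions for illustration.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: returns whatever follows the first occurrence of
// marker in input, or nil when the marker is empty or not present.
static NSString *TextAfter(NSString *marker, NSString *input) {
    if (marker.length == 0 || input == nil) {
        return nil;
    }
    NSRange markerRange = [input rangeOfString:marker];
    if (markerRange.location == NSNotFound) {
        return nil; // the marker does not occur in the input
    }
    // NSMaxRange gives location + length, i.e. the offset just past the marker.
    return [input substringFromIndex:NSMaxRange(markerRange)];
}
```

When the marker sits at the very end of the input, the result is an empty string rather than nil, so callers can tell "found but nothing follows" apart from "not found".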
HTH
.* is a regular expression that matches anything, but if you just want to check that an NSString isn't empty, you're better off doing something like this:
![string isEqualToString:@""]
I am trying to generate a numerical string by padding the number with zeroes to the left.
0 would become 00000
1 would become 00001
10 would become 00010
I want to create five character NSString by padding the number with zeroes.
I read this Create NSString by repeating another string a given number of times but the output is an NSMutableString.
How can I implement this algorithm with the output as an NSString?
Best regards.
You can accomplish this by calling
[NSString stringWithFormat:@"%05d", [theNumber intValue]];
where theNumber is the NSString containing the number you want to format.
For further reading, you may want to look at Apple's string formatting guide or the Wikipedia entry for printf.
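As a small sketch of the same idea wrapped in a function, assuming the input is already an unsigned integer rather than a string (the helper name FiveDigitString is hypothetical):

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: formats a number with at least five digits,
// padding with zeroes on the left.
static NSString *FiveDigitString(NSUInteger number) {
    // The cast keeps the %lu specifier correct on both 32- and 64-bit targets.
    return [NSString stringWithFormat:@"%05lu", (unsigned long)number];
}
```

The width in the format is a minimum, so a number with more than five digits is not truncated. stringWithFormat: already returns an immutable NSString, so no conversion step is needed.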
One quick & simple way to do it:
unsigned int num = 10; // example value
NSString *immutable = [NSString stringWithFormat:@"%.5u", num];
If you actually really want to use the long-winded approach from the example you read, you can send a “copy” message to a mutable string to get an immutable copy. This holds for all mutable types.