How to split NSString where two or more whitespace characters are found? - objective-c

Given these input strings:
@"bonus pay savings   2.69 F";
@"brick and mortar     0.15-B";
the desired output is:
[@"bonus pay savings", @"2.69 F"];
[@"brick and mortar", @"0.15-B"];
I tried this approach:
NSString *str = @"bonus pay savings   2.69 F";
NSArray *arr = [str componentsSeparatedByString:@"   "];
NSLog(@"Array values are : %@", arr);
But the drawback of my approach is I'm using 3 spaces as a delimiter whereas the number of spaces can vary. How can this be accomplished? Thank you.

You can use NSRegularExpression to split your string. Let's make a category on NSString:
NSString+asdiu.h
@interface NSString (asdiu)
- (NSArray<NSString *> *)componentsSeparatedByRegularExpressionPattern:(NSString *)pattern error:(NSError **)errorOut;
@end
NSString+asdiu.m
@implementation NSString (asdiu)
- (NSArray<NSString *> *)componentsSeparatedByRegularExpressionPattern:(NSString *)pattern error:(NSError **)errorOut {
    NSRegularExpression *rex = [NSRegularExpression regularExpressionWithPattern:pattern options:0 error:errorOut];
    if (rex == nil) { return nil; }
    NSMutableArray<NSString *> *components = [NSMutableArray new];
    __block NSUInteger start = 0;
    [rex enumerateMatchesInString:self options:0 range:NSMakeRange(0, self.length) usingBlock:^(NSTextCheckingResult * _Nullable result, NSMatchingFlags flags, BOOL * _Nonnull stop) {
        // Everything between the previous separator and this one is a component.
        NSRange separatorRange = result.range;
        NSRange componentRange = NSMakeRange(start, separatorRange.location - start);
        [components addObject:[self substringWithRange:componentRange]];
        start = NSMaxRange(separatorRange);
    }];
    // The text after the last separator is the final component.
    [components addObject:[self substringFromIndex:start]];
    return components;
}
@end
You can use it like this:
NSArray<NSString *> *inputs = @[@"bonus pay savings   2.69 F", @"brick and mortar     0.15-B"];
for (NSString *input in inputs) {
    NSArray<NSString *> *fields = [input componentsSeparatedByRegularExpressionPattern:@"\\s\\s+" error:nil];
    NSLog(@"fields: %@", fields);
}
Output:
2018-06-15 13:38:13.152725-0500 test[23423:1386429] fields: (
    "bonus pay savings",
    "2.69 F"
)
2018-06-15 13:38:13.153140-0500 test[23423:1386429] fields: (
    "brick and mortar",
    "0.15-B"
)

A simple solution with Regular Expression.
It replaces all occurrences of 2 or more ({2,}) whitespace characters (\\s) with a random UUID string. Then it splits the string by that UUID string.
NSString *separator = [NSUUID UUID].UUIDString;
NSString *string = @"bonus pay savings   2.69 F";
NSString *collapsedString = [string stringByReplacingOccurrencesOfString:@"\\s{2,}"
                                                              withString:separator
                                                                 options:NSRegularExpressionSearch
                                                                   range:NSMakeRange(0, [string length])];
NSArray *output = [collapsedString componentsSeparatedByString:separator];
NSLog(@"%@", output);

If you can assume that you only have 2 fields in the input string, I would use a limited split method like this one that always returns an array of 2 items, and then "trim" spaces off the second item using stringByTrimmingCharactersInSet.
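A minimal sketch of that idea, assuming the first run of two or more whitespace characters separates the two fields. The helper name SplitIntoTwoFields is hypothetical, and the linked split method is not reproduced here; this version locates the gap with rangeOfString:options: and NSRegularExpressionSearch instead, then trims the second field.

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical helper (not from the answer above): splits `input` at the first
// run of two or more whitespace characters and always returns two items.
static NSArray<NSString *> *SplitIntoTwoFields(NSString *input) {
    NSRange gap = [input rangeOfString:@"\\s{2,}" options:NSRegularExpressionSearch];
    if (gap.location == NSNotFound) {
        // No gap found: treat the whole string as the first field.
        return @[input, @""];
    }
    NSString *first = [input substringToIndex:gap.location];
    // Trim any stray whitespace around the second field.
    NSString *second = [[input substringFromIndex:NSMaxRange(gap)]
        stringByTrimmingCharactersInSet:[NSCharacterSet whitespaceCharacterSet]];
    return @[first, second];
}
```

Because only the first gap is used, any later runs of whitespace stay inside the second field.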

@vadian and @robmayoff have both provided good solutions based on regular expressions (REs). In both cases the REs match the gaps to find where to break your string. For comparison, you can also approach the problem the other way and use an RE to match the parts you are interested in. The RE:
\S+(\h\S+)*
will match the text you are interested in. It is made up as follows:
\S          - match any non-space character; \S excludes both horizontal
              (e.g. spaces, tabs) and vertical space (e.g. newlines)
\S+         - one or more non-space characters, i.e. a "word" of sorts
\h          - a single horizontal space character (if you wish matches to
              span lines use \s - any horizontal *or* vertical space)
\h\S+       - a space followed by a word
(\h\S+)*    - zero or more space separated words
\S+(\h\S+)* - a word followed by zero or more space separated words
With this simple regular expression you can use matchesInString:options:range: to obtain an array of NSTextCheckingResult objects, one for each match in your input; or you can use enumerateMatchesInString:options:range:usingBlock: to have a block called with each match.
As an example, here is a solution following @robmayoff's approach:
@interface NSString (componentsMatchingRegularExpression)
- (NSArray<NSString *> *)componentsMatchingRegularExpression:(NSString *)pattern;
@end
@implementation NSString (componentsMatchingRegularExpression)
- (NSArray<NSString *> *)componentsMatchingRegularExpression:(NSString *)pattern
{
    NSError *errorReturn;
    NSRegularExpression *regularExpression = [NSRegularExpression regularExpressionWithPattern:pattern options:0 error:&errorReturn];
    if (!regularExpression)
        return nil;
    NSMutableArray *matches = NSMutableArray.new;
    [regularExpression enumerateMatchesInString:self
                                        options:0
                                          range:NSMakeRange(0, self.length)
                                     usingBlock:^(NSTextCheckingResult * _Nullable result, NSMatchingFlags flags, BOOL * _Nonnull stop)
        {
            // Each match is one component to keep.
            [matches addObject:[self substringWithRange:result.range]];
        }
    ];
    return matches.copy; // non-mutable copy
}
@end
Whether it is better to match what you wish to keep or what you wish to remove is subjective. Take your pick.
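For comparison, the matchesInString:options:range: variant mentioned earlier in this answer could be sketched as a plain function. The name FieldsMatching is hypothetical and not part of the answer's category; the pattern is the one described above, with each backslash doubled for the string literal.

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical helper: returns every substring of `input` that matches `pattern`,
// or nil if the pattern does not compile.
static NSArray<NSString *> *FieldsMatching(NSString *input, NSString *pattern) {
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:pattern options:0 error:NULL];
    if (regex == nil) { return nil; }
    NSMutableArray<NSString *> *fields = [NSMutableArray array];
    // matchesInString:options:range: returns all matches at once as an array.
    NSArray<NSTextCheckingResult *> *matches =
        [regex matchesInString:input options:0 range:NSMakeRange(0, input.length)];
    for (NSTextCheckingResult *match in matches) {
        [fields addObject:[input substringWithRange:match.range]];
    }
    return [fields copy];
}
```

Called as FieldsMatching(input, @"\\S+(\\h\\S+)*"), it collects each run of single-space-separated words as one field.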

Regular expressions work well for this, and the solutions given using them are perfectly fine. For completeness, you can also do this with NSScanner, which often performs better than a regex and is handy to get used to if you need to do more complicated text parsing.
NSString *str = @"bonus pay savings   2.69 F";
NSScanner *scanner = [NSScanner scannerWithString:str];
scanner.charactersToBeSkipped = nil; // default is to ignore whitespace
while (!scanner.isAtEnd) {
    NSString *name;
    NSString *value;
    // scan up to two spaces, this would be the name
    [scanner scanUpToString:@"  " intoString:&name];
    // scan the two spaces and any extra whitespace
    [scanner scanCharactersFromSet:[NSCharacterSet whitespaceCharacterSet] intoString:nil];
    // scan to the end of the line, this is the value
    [scanner scanUpToString:@"\n" intoString:&value];
    // consume the line break so the loop advances on multi-line input
    [scanner scanCharactersFromSet:[NSCharacterSet newlineCharacterSet] intoString:nil];
}

Related

Looping through the value in Textfield for Particular Text

I have a text field which has a value as shown below.
@"Testing<Car>Testing<Car2>Working<Car3 /Car 4> on the code"
I have to loop through the text and find the text within angle brackets (< >).
There can be spaces or any special characters within the angle brackets.
I tried using NSPredicate and also componentsSeparatedByString, but I was not able to get the exact text within.
Is there any way to get the exact text along with the angle brackets? In the example above, I want only
@"<Car>,<Car2> , <Car3 /Car 4>"
Thanks in advance for the help.
A possible solution is a regular expression. The pattern checks for < followed by one or more non-> characters and one >.
enumerateMatchesInString extracts the substrings and appends them to an array. Finally the array is flattened to a single string.
NSString *string = @"Testing<Car>Testing<Car2>Working<Car3 /Car 4> on the code";
NSRegularExpression *regex = [[NSRegularExpression alloc] initWithPattern:@"<[^>]+>" options:0 error:nil];
__block NSMutableArray<NSString *> *matches = [NSMutableArray array];
[regex enumerateMatchesInString:string options:0 range:NSMakeRange(0, string.length) usingBlock:^(NSTextCheckingResult * _Nullable result, NSMatchingFlags flags, BOOL * _Nonnull stop) {
    if (result) [matches addObject:[string substringWithRange:result.range]];
}];
NSLog(@"%@", [matches componentsJoinedByString:@", "]);
We can solve it in different ways; here is one of them. You can use textField.text in place of str.
NSString *str = @"This is just Added < For testing %@ ___ & >";
// Find the last "<" and the last ">" in the string
NSRange r1 = [str rangeOfString:@"<" options:NSBackwardsSearch];
NSRange r2 = [str rangeOfString:@">" options:NSBackwardsSearch];
// The text between them, excluding the brackets themselves
NSRange rSub = NSMakeRange(r1.location + r1.length, r2.location - r1.location - r1.length);
NSString *sub = [str substringWithRange:rSub];

How to find the number of gaps between words in an NSString?

Given an NSString containing a sentence I would like to determine the number of gaps between the words.
I could use something like [theString componentsSeparatedByString:@" "],
but that would only work if each gap is a single space character, and there could be multiple.
You can use NSRegularExpression, like:
NSString *test = @"The quick brown fox jumped over the lazy dog";
NSRegularExpression *regex = [NSRegularExpression regularExpressionWithPattern:@"\\s+" options:0 error:NULL];
NSUInteger num = [regex numberOfMatchesInString:test options:0 range:NSMakeRange(0, test.length)];
NSLog(@"num: %lu", (unsigned long)num);
The regular expression "\s+" matches one or more whitespace characters (it's written here with an extra "\" because we need a literal backslash in the NSString). numberOfMatchesInString:options:range: counts each run of one or more whitespace characters as a match, which is exactly what you want.
You can do it via componentsSeparatedByString - if you filter afterwards to ignore the empty strings:
NSString *theString = @"HI this is a test";
NSArray *arr = [theString componentsSeparatedByString:@" "];
arr = [arr filteredArrayUsingPredicate:[NSPredicate predicateWithBlock:^BOOL(id evaluatedObject, NSDictionary *b) {
    return [(NSString *)evaluatedObject length] > 0;
}]];
NSLog(@"number of words: %lu", (unsigned long)arr.count);
NSLog(@"number of gaps: %lu", (unsigned long)(arr.count - 1));
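Regex is the 'coolest' way, but this might be the fastest and cleanest:
NSArray *components = [theString componentsSeparatedByCharactersInSet:[NSCharacterSet whitespaceCharacterSet]];
// Note: consecutive whitespace characters produce empty components, so this
// counts every whitespace character rather than every run of whitespace.
NSLog(@"gaps: %lu", (unsigned long)(components.count - 1));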

Count the amount of times '$' shows up in a string (Objective C)

I was wondering if there was an easy method to find the amount of times a character such as '$' shows up in a string in the language objective-c.
The real world example I am using is a string that would look like:
542$764$231$DataEntry
What I need to do is first:
1) count the amount of times the '$' shows up to know what tier the DataEntry is in my database (my database structure is one I made up)
2) then I need to get all of the numbers, as they are index numbers. The numbers need to be stored in a NSArray. And I will loop through them all getting the different indexes. I'm not going to explain how my database structure works as that isn't relevant.
Basically from that NSString, I need, the amount of times '$' shows up. And all of the numbers in between the dollar signs. This would be a breeze to do in PHP, but I was curious to see how I could go about this in Objective-C.
Thanks,
Michael
[[@"542$764$231$DataEntry" componentsSeparatedByString:@"$"] count] - 1
You can use componentsSeparatedByString, as suggested by @Parag Bafna and @J Shapiro, or NSRegularExpression, e.g.:
#import <Foundation/Foundation.h>
int main(int argc, char *argv[]) {
    @autoreleasepool {
        NSError *error = NULL;
        NSString *searchText = @"542$764$231$DataEntry";
        // Three digits followed by a literal '$'; the digits are capture group 1
        NSRegularExpression *regex = [NSRegularExpression regularExpressionWithPattern:@"(\\d{3})\\$" options:NSRegularExpressionCaseInsensitive error:&error];
        NSUInteger numberOfMatches = [regex numberOfMatchesInString:searchText options:0 range:NSMakeRange(0, [searchText length])];
        printf("match count = %lu\n", (unsigned long)numberOfMatches);
        [regex enumerateMatchesInString:searchText
                                options:0
                                  range:NSMakeRange(0, [searchText length])
                             usingBlock:^(NSTextCheckingResult *match, NSMatchingFlags flags, BOOL *stop) {
            NSRange range = [match rangeAtIndex:1];
            printf("match = %s\n", [[searchText substringWithRange:range] UTF8String]);
        }];
    }
    return 0;
}
componentsSeparatedByString is probably the preferred approach, and much more performant, where the pattern has simple repeating delimiters, but I included this approach for completeness' sake.
Try this code:
NSMutableArray *substrings = [NSMutableArray new];
// This will contain all the substrings
NSMutableArray *numbers = [NSMutableArray new];
// This will contain all the numbers
NSNumberFormatter *formatter = [NSNumberFormatter new];
// The formatter will scan all the strings and establish whether they're
// valid numbers; if so it will produce an NSNumber object
[formatter setNumberStyle:NSNumberFormatterDecimalStyle];
NSString *entry = @"542$764$231$DataEntry";
NSUInteger count = 0, last = 0;
// count will contain the number of '$' characters found
NSRange range = NSMakeRange(0, entry.length);
// range is the range where to check
do
{
    range = [entry rangeOfString:@"$" options:NSLiteralSearch range:range];
    // Check for a substring
    if (range.location != NSNotFound)
    {
        // If there's not a further substring range.location will be NSNotFound
        NSRange substringRange = NSMakeRange(last, range.location - last);
        // Get the range of the substring
        NSString *substring = [entry substringWithRange:substringRange];
        [substrings addObject:substring];
        // Get the substring and add it to the substrings mutable array
        last = range.location + 1;
        range = NSMakeRange(range.location + range.length, entry.length - range.length - range.location);
        // Calculate the new range where to check for the next substring
        count++;
        // Increase the count
    }
} while (range.location != NSNotFound);
// Now count contains the number of '$' characters found, and substrings
// contains all the substrings separated by '$'
for (NSString *substring in substrings)
{
    // Check all the substrings found
    NSNumber *number;
    if ([formatter getObjectValue:&number forString:substring range:NULL error:nil])
    {
        // If the substring is a valid number, the method returns YES and we go
        // inside this scope, so we can add the number to the numbers array
        [numbers addObject:number];
    }
}
// Now numbers contains all the numbers found

NSString by removing the initial zeros?

How can I remove leading zeros from an NSString?
e.g. I have:
NSString *myString;
with values such as @"0002060", @"00236" and @"21456".
I want to remove any leading zeros if they occur:
e.g. Convert the previous to @"2060", @"236" and @"21456".
Thanks.
For smaller numbers:
NSString *str = @"000123";
NSString *clean = [NSString stringWithFormat:@"%d", [str intValue]];
For numbers exceeding int32 range:
NSString *str = @"100004378121454";
NSString *clean = [NSString stringWithFormat:@"%lld", [str longLongValue]];
This is actually a case that is perfectly suited for regular expressions:
NSString *str = @"00000123";
NSString *cleaned = [str stringByReplacingOccurrencesOfString:@"^0+"
                                                   withString:@""
                                                      options:NSRegularExpressionSearch
                                                        range:NSMakeRange(0, str.length)];
Only one line of code (in a logical sense, line breaks added for clarity) and there are no limits on the number of characters it handles.
A brief explanation of the regular expression pattern:
The ^ means that the pattern should be anchored to the beginning of the string. We need that to ensure it doesn't match legitimate zeroes inside the sequence of digits.
The 0+ part means that it should match one or more zeroes.
Put together, it matches a sequence of one or more zeroes at the beginning of the string, then replaces that with an empty string - i.e., it deletes the leading zeroes.
The following method also gives the output.
NSString *test = @"0005603235644056";
// Skip leading zeros
NSScanner *scanner = [NSScanner scannerWithString:test];
NSCharacterSet *zeros = [NSCharacterSet
                         characterSetWithCharactersInString:@"0"];
[scanner scanCharactersFromSet:zeros intoString:NULL];
// Get the rest of the string and log it
NSString *result = [test substringFromIndex:[scanner scanLocation]];
NSLog(@"%@ reduced to %@", test, result);
- (NSString *)removeLeadingZeros:(NSString *)Instring
{
    NSString *str2 = Instring;
    // Drop one leading zero per pass until the string no longer starts with "0"
    for (int index = 0; index < [str2 length]; index++)
    {
        if ([str2 hasPrefix:@"0"])
            str2 = [str2 substringFromIndex:1];
        else
            break;
    }
    return str2;
}
In addition to adali's answer, you can do the following if you're worried about the string being too long (i.e. greater than 9 characters):
NSString *str = @"000200001111111";
NSString *strippedStr = [NSString stringWithFormat:@"%lld", [str longLongValue]];
This will give you the result: 200001111111
Otherwise, [NSString stringWithFormat:@"%d", [str intValue]] will probably return 2147483647 because of overflow.

Separating NSString into NSArray, but allowing quotes to group words

I have a search string, where people can use quotes to group phrases together, and mix this with individual keywords. For example, a string like this:
"Something amazing" rooster
I'd like to separate that into an NSArray, so that it would have Something amazing (without quotes) as one element, and rooster as the other.
Neither componentsSeparatedByString nor componentsSeparatedByCharactersInSet seem to fit the bill. Is there an easy way to do this, or should I just code it up myself?
You will probably have to code some of this up yourself, but NSScanner should be a good basis on which to build. If you use the scanUpToCharactersFromSet:intoString: method to scan everything up to the next whitespace or quote character, you can pick off words. Once you encounter a quote character, you can continue scanning with just the quote in the character set, so that spaces within the quotes don't end the token.
I made a simple way to do this using NSScanner:
+ (NSArray *)arrayFromTagString:(NSString *)string {
    NSScanner *scanner = [NSScanner scannerWithString:string];
    NSString *substring;
    NSMutableArray *array = [[NSMutableArray alloc] init];
    while (scanner.scanLocation < string.length) {
        // test if the first character is a quote
        unichar character = [string characterAtIndex:scanner.scanLocation];
        if (character == '"') {
            // skip the first quote and scan everything up to the next quote into a substring
            [scanner setScanLocation:(scanner.scanLocation + 1)];
            [scanner scanUpToString:@"\"" intoString:&substring];
            [scanner setScanLocation:(scanner.scanLocation + 1)]; // skip the second quote too
        }
        else {
            // scan everything up to the next space into the substring
            [scanner scanUpToString:@" " intoString:&substring];
        }
        // add the substring to the array
        [array addObject:substring];
        // if not at the end, skip the space character before continuing the loop
        if (scanner.scanLocation < string.length) [scanner setScanLocation:(scanner.scanLocation + 1)];
    }
    return array.copy;
}
This method will convert the array back to a tag string, re-quoting the multi-word tags:
+ (NSString *)tagStringFromArray:(NSArray *)array {
    NSMutableString *string = [[NSMutableString alloc] init];
    NSRange range;
    for (NSString *substring in array) {
        if (string.length > 0) {
            [string appendString:@" "];
        }
        // multi-word tags contain a space and need to be re-quoted
        range = [substring rangeOfString:@" "];
        if (range.location != NSNotFound) {
            [string appendFormat:@"\"%@\"", substring];
        }
        else [string appendString:substring];
    }
    return string.description;
}
I ended up going with a regular expression as I was already using RegexKitLite, and creating this NSString+SearchExtensions category.
.h:
// NSString+SearchExtensions.h
#import <Foundation/Foundation.h>
@interface NSString (SearchExtensions)
-(NSArray *)searchParts;
@end
.m:
// NSString+SearchExtensions.m
#import "NSString+SearchExtensions.h"
#import "RegexKitLite.h"
@implementation NSString (SearchExtensions)
-(NSArray *)searchParts {
    __block NSMutableArray *items = [[NSMutableArray alloc] initWithCapacity:5];
    [self enumerateStringsMatchedByRegex:@"\\w+|\"[\\w\\s]*\"" usingBlock: ^(NSInteger captureCount,
                                                                             NSString * const capturedStrings[captureCount],
                                                                             const NSRange capturedRanges[captureCount],
                                                                             volatile BOOL * const stop) {
        NSString *result = [capturedStrings[0] stringByReplacingOccurrencesOfRegex:@"\"" withString:@""];
        NSLog(@"Match: '%@'", result);
        [items addObject:result];
    }];
    return [items autorelease];
}
@end
This returns an NSArray of strings with the search strings, removing the double quotes that surround the phrases.
If you'll allow a slightly different approach, you could try Dave DeLong's CHCSVParser. It is intended to parse CSV strings, but if you set the space character as the delimiter, I am pretty sure you will get the intended behavior.
Alternatively, you can peek into the code and see how it handles quoted fields - it is published under the MIT license.
I would run -componentsSeparatedByString:@"\"" first, then create a BOOL isPartOfQuote, initialized to NO: the first component always precedes any quote, and it is simply an empty string when the input starts with a ".
Then create a mutable array to return:
NSMutableArray* masterArray = [[NSMutableArray alloc] init];
Then, create a loop over the array returned from the separation:
for (NSString *substring in firstSplitArray) {
    NSArray *secondSplit;
    if (isPartOfQuote == NO) {
        // outside quotes: every space-separated word is its own term
        secondSplit = [substring componentsSeparatedByString:@" "];
    }
    else {
        // inside quotes: keep the phrase together
        secondSplit = [NSArray arrayWithObject:substring];
    }
    [masterArray addObjectsFromArray:secondSplit];
    isPartOfQuote = !isPartOfQuote;
}
Then return masterArray from the function.
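Putting those steps together, a self-contained sketch might look like the following. The function name SearchTerms is hypothetical, and two details are assumptions beyond the steps above: the quote flag starts as NO because the first component always precedes any quote, and empty strings are dropped because splitting on a separator yields them wherever the text starts or ends with that separator.

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical helper: splits a search string into terms, keeping quoted
// phrases together and dropping the quotes themselves.
static NSArray<NSString *> *SearchTerms(NSString *input) {
    NSMutableArray<NSString *> *terms = [NSMutableArray array];
    // Components alternate: outside quotes, inside quotes, outside, ...
    BOOL insideQuotes = NO;
    for (NSString *part in [input componentsSeparatedByString:@"\""]) {
        if (insideQuotes) {
            // A quoted phrase is kept as a single term.
            if (part.length > 0) { [terms addObject:part]; }
        } else {
            // Unquoted text is split into individual words.
            NSArray<NSString *> *words = [part componentsSeparatedByCharactersInSet:
                                          [NSCharacterSet whitespaceCharacterSet]];
            for (NSString *word in words) {
                if (word.length > 0) { [terms addObject:word]; }
            }
        }
        insideQuotes = !insideQuotes;
    }
    return [terms copy];
}
```

An unterminated quote is treated as running to the end of the string, which may or may not be what a search field wants.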