Remove Special Characters, Numbers and Double Spaces from String - objective-c

I want to remove Special Characters, Numbers and Double Spaces from a string, but it's removing all spaces. How can I fix it?
Code:
_tfName.text = [[_tfName.text componentsSeparatedByCharactersInSet:[[NSCharacterSet letterCharacterSet] invertedSet]] componentsJoinedByString:@""];
Strings:
Real String          Formatted      What I want
___________          _________      ___________
Lincon Man           LiconMan       Licon Man
Name Surname         NameSurname    Name Surname
09123721*)(%!#ˆ*#    *blank*        *blank*

You have a lot of options available. A good one is to use NSRegularExpression for the two needs. One to find unwanted characters and remove them and another to find double spaces and remove them.
I recommend this option because you can reuse most of the logic for doing both.
You can also use NSScanner.
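You could handle the unwanted characters by using NSCharacterSet to create a set of the characters you don't want. Note that the NSString method
– stringByTrimmingCharactersInSet:
takes an NSCharacterSet argument but only removes characters from the two ends of the string. To remove them everywhere, split with componentsSeparatedByCharactersInSet: and rejoin with componentsJoinedByString:.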
NSString has
– stringByReplacingOccurrencesOfString:withString:
You can pass it a double space string.
Other options exist, but NSRegularExpression is the way to go for the double spaces.
The NSCharacterSet approach is pretty easy for the unwanted characters.
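As a rough sketch of the two-step regular expression approach described above, assuming "unwanted" means anything that is not a letter or a space. The function name CleanName is made up for illustration, and the regular expressions are applied through NSString's NSRegularExpressionSearch option rather than through NSRegularExpression objects directly:

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: keeps letters and single spaces, drops everything else.
static NSString *CleanName(NSString *input) {
    // 1. Remove every character that is neither a Unicode letter nor a space.
    NSString *lettersOnly =
        [input stringByReplacingOccurrencesOfString:@"[^\\p{L} ]"
                                         withString:@""
                                            options:NSRegularExpressionSearch
                                              range:NSMakeRange(0, input.length)];
    // 2. Collapse runs of two or more spaces into a single space.
    NSString *collapsed =
        [lettersOnly stringByReplacingOccurrencesOfString:@" {2,}"
                                               withString:@" "
                                                  options:NSRegularExpressionSearch
                                                    range:NSMakeRange(0, lettersOnly.length)];
    // 3. Trim spaces left over at either end.
    return [collapsed stringByTrimmingCharactersInSet:[NSCharacterSet whitespaceCharacterSet]];
}
```

With this, CleanName(@"Name  Surname 123") should give "Name Surname", and a string made only of digits and ASCII symbols should come back empty.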

Did you think of passing a white space as the parameter to componentsJoinedByString:?
string = [[string componentsSeparatedByCharactersInSet:[[NSCharacterSet letterCharacterSet] invertedSet]] componentsJoinedByString:@" "];

This may be what you are looking for:
NSRegularExpression *re = [NSRegularExpression regularExpressionWithPattern:@"[@#$.,!*\\d]" options:0 error:nil];
NSString *input = @"1234 Nöel* *Smith $!#";
NSString *output = [re stringByReplacingMatchesInString:input
options:0
range:NSMakeRange(0, [input length])
withTemplate:@""];
/* -> " Nöel Smith " (the spaces around the removed characters remain) */
Depending on whether you know which characters to remove or which ones to keep, you can write the regexp like this or the opposite way with [^characters-to-keep]. Just use the metacharacters needed for your case.

Related

Objective-C: Parsing String into an Array under Special Circumstances

I have a string:
[{"id":1,"gameName":"arizona","cost":"0.5E1","email":"hi@gmail.com","requests":0},{"id":2,"gameName":"arizona","cost":"0.5E1","email":"hi@gmail.com","requests":0},{"id":3,"gameName":"arizona","cost":"0.5E1","email":"hi@gmail.com","requests":0}]
However, I would like to parse this string into an array such as:
[{"id":1,"gameName":"arizona","cost":"0.5E1","email":"hi@gmail.com","requests":0},
{"id":2,"gameName":"arizona","cost":"0.5E1","email":"hi@gmail.com","requests":0},
{"id":3,"gameName":"arizona","cost":"0.5E1","email":"hi@gmail.com","requests":0}]
This array is delimited by the comma in between the curly braces: },{
I tried using the command
NSArray *responseArray = [response componentsSeparatedByString:@","];
but this separates the string into values at EVERY comma, which is not desirable.
Then I tried using regex:
NSRegularExpression *regex = [NSRegularExpression regularExpressionWithPattern:@"\\{.*\\}" options:NSRegularExpressionCaseInsensitive error:&error];
NSArray *matches = [regex matchesInString:response options:0 range:NSMakeRange(0, [response length])];
which found one match: starting at the first curly brace to the last curly brace.
I was wondering if anyone knew how to solve this problem efficiently?
This string seems to be valid JSON. Try a JSON parser: NSJSONSerialization
I agree with H2CO3's suggestion to use a parser where possible.
But looking at your attempted regex, it looks like you just need to make it non-greedy, i.e.
@"\\{.*?\\}"
       ^
       |
       Add this question mark for non-greedy matching.
Of course, this will fail if you have deeper levels of (what I assume to be) nested arrays. Go with the JSON parser!
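A minimal sketch of the JSON parser route, assuming the response is a UTF-8 string holding a JSON array. The function name ParseGames is made up for illustration:

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: parses a JSON array of objects; returns nil on failure.
static NSArray *ParseGames(NSString *response) {
    NSData *data = [response dataUsingEncoding:NSUTF8StringEncoding];
    if (data == nil) return nil;

    NSError *error = nil;
    id parsed = [NSJSONSerialization JSONObjectWithData:data options:0 error:&error];
    // The top-level object must be an array for this use case.
    if (![parsed isKindOfClass:[NSArray class]]) {
        NSLog(@"Not a JSON array: %@", error);
        return nil;
    }
    return parsed;
}
```

Each element of the returned array is an NSDictionary, so something like games[0][@"gameName"] gives the value directly and no manual splitting on commas or braces is needed.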

Objective-C – Replace newline sequences with one space

How can I replace newline (\n) sequences with one space?
For example, if the user has entered a double newline ("\n\n"), I want that replaced with one space (" "). If the user has entered triple newlines ("\n\n\n"), I want that replaced with one space (" ") as well.
Try this:
NSArray *split = [orig componentsSeparatedByCharactersInSet:[NSCharacterSet newlineCharacterSet]];
split = [split filteredArrayUsingPredicate:[NSPredicate predicateWithFormat:@"length > 0"]];
NSString *res = [split componentsJoinedByString:@" "];
This is how it works:
First line splits by newline characters
Second line removes empty items inserted for multiple separators in a row
Third line joins the strings back using a single space as the new separator
About 3 times faster than using componentsSeparatedByCharactersInSet:
NSString *fixed = [original stringByReplacingOccurrencesOfString:@"\\n+"
withString:@" "
options:NSRegularExpressionSearch
range:NSMakeRange(0, original.length)];
Possible alternative regex patterns:
Replace only space: [ ]+
Replace space and tabs: [ \\t]+
Replace space, tabs and newlines: \\s+
Replace newlines: \\n+
As wattson says, you can do this with NSRegularExpression, but the code is quite verbose. If you want to do this in several places, I suggest writing a helper method or even an NSString category with a method like -[NSString stringByReplacingMatchingPattern:withString:] or something similar.
NSString *string = @"a\n\na";
NSLog(@"%@", [[NSRegularExpression regularExpressionWithPattern:@"\\n+"
options:0
error:NULL]
stringByReplacingMatchesInString:string
options:0
range:NSMakeRange(0, [string length])
withTemplate:@" "]);
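The category idea could look roughly like this. The category name and the xyz_ selector prefix are placeholders, not an existing API; the prefix is only there to avoid clashing with other methods on NSString:

```objc
#import <Foundation/Foundation.h>

// Hypothetical category wrapping the verbose NSRegularExpression call.
@interface NSString (PatternReplacing)
- (NSString *)xyz_stringByReplacingMatchingPattern:(NSString *)pattern
                                        withString:(NSString *)replacement;
@end

@implementation NSString (PatternReplacing)
- (NSString *)xyz_stringByReplacingMatchingPattern:(NSString *)pattern
                                        withString:(NSString *)replacement {
    NSError *error = nil;
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:pattern options:0 error:&error];
    if (regex == nil) {
        return [self copy]; // invalid pattern: return the receiver unchanged
    }
    // The replacement is used as a template, so "$1" etc. refer to capture groups.
    return [regex stringByReplacingMatchesInString:self
                                           options:0
                                             range:NSMakeRange(0, self.length)
                                      withTemplate:replacement];
}
@end
```

A call such as [text xyz_stringByReplacingMatchingPattern:@"\\n+" withString:@" "] then replaces each run of newlines with one space. Because the replacement is a template, a literal $ or backslash in it would need escaping.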
Use a regular expression, something like "s/\n+/ /" (a replacement that matches one or more newline characters and replaces them with a single space).
this question has a link to a regex library, but there is NSRegularExpression available too

Objective C. Regular expression to eliminate anything after 3 dots

I wrote the following code to eliminate anything after 3 dots
currentItem.summary = @"I am just testing. I am ... the second part should be eliminated";
NSError * error = NULL;
NSRegularExpression *regex = [NSRegularExpression regularExpressionWithPattern:@"(.)*(/././.)(.)*" options:0 error:&error];
if(nil != regex){
currentItem.summary = [regex stringByReplacingMatchesInString:currentItem.summary
options:0 range:NSMakeRange(0, [currentItem.summary length])
withTemplate:@"$1"];
}
However, my input and output are the same. The correct output should be "I am just testing. I am".
I was trying to do this using regular expression because I have a database of other regular expressions that I run on the string. I know the performance might not be as good as a plain text find or replace but the strings involved are short. I also tried using "\" to escape the dots in the regex, but I was getting a warning.
There is another question with a similar topic but the match strings are not for objective c.
This is much easier and will accomplish what you want:
NSRange range = [currentItem.summary rangeOfString:@"..."];
if (range.location != NSNotFound) { // NSRange is a struct, so compare its location
currentItem.summary = [currentItem.summary substringToIndex:range.location];
}
You have forward slashes, /, instead of backward slashes, \, in your pattern. Also if you wish to match everything before the three dots you should use (.*) - tag everything matched by the enclosed .*. (The other parentheses in the pattern are redundant.)
Nice alternative:
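NSScanner *scanner = [NSScanner scannerWithString:currentItem.summary];
NSString *prefix = nil; // a property cannot be passed by reference, so scan into a local
if ([scanner scanUpToString:@"..." intoString:&prefix]) currentItem.summary = prefix;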
My recommended regex for your problem:
regularExpressionWithPattern:@"^(.*?)\\s*\\.{3}.*$"
Main differences between this one and yours:
uses backslashes to escape special chars
uses ^ and $ to anchor at the beginning and end of the string
only captures the interesting section with ()
strips whitespace before the ... by ignoring any number of whitespace chars (\s*).
After correcting the slashes and other improvements, my final expression is:
NSRegularExpression *regex = [NSRegularExpression regularExpressionWithPattern:@"^(.*)\\.{3}.*$"
options:0
error:&error];
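Putting the pieces together, a self-contained sketch might look like the following. The function name TextBeforeEllipsis is made up, and the pattern is a variant of the ones in this thread: it uses a lazy (.*?) so the capture stops at the first "..." and the optional whitespace before the dots stays out of the capture.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: returns the text before the first "...",
// or the input unchanged when there is no "...".
static NSString *TextBeforeEllipsis(NSString *text) {
    NSError *error = nil;
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:@"^(.*?)\\s*\\.{3}.*$"
                                                  options:NSRegularExpressionDotMatchesLineSeparators
                                                    error:&error];
    if (regex == nil) return text;
    // Replace the whole match with capture group 1 (the part before the dots).
    return [regex stringByReplacingMatchesInString:text
                                           options:0
                                             range:NSMakeRange(0, text.length)
                                      withTemplate:@"$1"];
}
```

For the string from the question this should return "I am just testing. I am". The NSRegularExpressionDotMatchesLineSeparators option is an addition so that text spanning several lines is still matched as a whole.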

Split NSString into words, then rejoin it into original form

I am splitting an NSString like this: (filter string is an nsstring)
seperatorSet = [NSMutableCharacterSet whitespaceAndNewlineCharacterSet];
[seperatorSet formUnionWithCharacterSet:[NSCharacterSet punctuationCharacterSet]];
NSMutableArray *words = [[filterString componentsSeparatedByCharactersInSet:seperatorSet] mutableCopy];
I want to put words back into the form of filter string with the original punctuation and spacing. The reason I want to do this is I want to change some words and put it back together as it was originally.
A more robust way to split by words is to use string enumeration. A space is not always the delimiter, and not all languages separate words with spaces anyway (e.g. Japanese).
NSString * string = @" \n word1! word2,%$?'/word3.word4 ";
[string enumerateSubstringsInRange:NSMakeRange(0, string.length)
options:NSStringEnumerationByWords
usingBlock:
^(NSString *substring, NSRange substringRange, NSRange enclosingRange, BOOL *stop) {
NSLog(@"Substring: '%@'", substring);
}];
// Logs:
// Substring: 'word1'
// Substring: 'word2'
// Substring: 'word3'
// Substring: 'word4'
NSString *myString = @"Foo Bar Blah B..";
NSArray *myWords = [myString componentsSeparatedByCharactersInSet:
[NSCharacterSet characterSetWithCharactersInString:@" "]
];
NSString* string = [myWords componentsJoinedByString: @" "];
NSLog(@"%@",string);
Since you eliminate the original punctuation, there's no way to turn it back automatically.
The only way is not to use componentsSeparatedByCharactersInSet.
An alternative solution may be to iterate through the string and, for each char, check if it belongs to your character set.
If yes, add the char to a list and the substring to another list (you may use NSMutableArray class).
This way, for example, you know that the punctuation char between the first and the second substring is the first character in your list of separators.
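Here is one way that idea could be sketched, using NSScanner instead of a character-by-character loop and a single token list instead of two parallel lists. The function name is made up, and the separator set is the one from the question:

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: splits text into alternating runs of separator and
// non-separator characters, so joining the tokens with @"" restores the input.
static NSMutableArray *TokensKeepingSeparators(NSString *text) {
    NSMutableCharacterSet *separators = [NSMutableCharacterSet whitespaceAndNewlineCharacterSet];
    [separators formUnionWithCharacterSet:[NSCharacterSet punctuationCharacterSet]];

    NSScanner *scanner = [NSScanner scannerWithString:text];
    scanner.charactersToBeSkipped = nil; // keep whitespace instead of skipping it

    NSMutableArray *tokens = [NSMutableArray array];
    while (!scanner.isAtEnd) {
        NSString *run = nil;
        if ([scanner scanCharactersFromSet:separators intoString:&run]) {
            [tokens addObject:run]; // a run of spacing and punctuation
        }
        if ([scanner scanUpToCharactersFromSet:separators intoString:&run]) {
            [tokens addObject:run]; // a word
        }
    }
    return tokens;
}
```

You can then replace individual word tokens in the returned array and call [tokens componentsJoinedByString:@""] to get the string back with its original punctuation and spacing.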
You can use the componentsJoinedByString: method of NSArray to rejoin the words:
NSString *orig = [words componentsJoinedByString:@" "];
How are you determining which words need to be replaced? Instead of breaking it apart in the first place, perhaps using -stringByReplacingOccurrencesOfString:withString:options:range: would be more suitable.
My guess is you may not be using the best API. If you're really worried about words, you should be using a word-based API. I'm a bit hazy on whether that would be NSDataDetector or something else. (I believe NSRegularExpression can deal with word boundaries in a smarter way.)
If you are using Mac OS X 10.7+ or iOS 4+ you can use NSRegularExpression. The pattern to replace a word is \bword\b, where \b matches a word boundary. Look at the methods replaceMatchesInString:options:range:withTemplate: and stringByReplacingMatchesInString:options:range:withTemplate:.
Under 10.6 or earlier, if you wish to use regular expressions, you can wrap the C-based regcomp/regexec functions; they support word boundaries as well. However, you may prefer to use one of the other Cocoa options mentioned in other answers for this simple case.
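A short sketch of the word-boundary approach with NSRegularExpression. The function name ReplaceWholeWord is made up, and both the word and the replacement are escaped so that they are treated as literal text:

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: replaces whole-word occurrences of `word` only.
static NSString *ReplaceWholeWord(NSString *text, NSString *word, NSString *replacement) {
    // Escape the word so regex metacharacters in it are matched literally.
    NSString *pattern = [NSString stringWithFormat:@"\\b%@\\b",
                         [NSRegularExpression escapedPatternForString:word]];
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:pattern options:0 error:NULL];
    if (regex == nil) return text;
    // Escape the replacement so "$" and "\" are not read as template syntax.
    return [regex stringByReplacingMatchesInString:text
                                           options:0
                                             range:NSMakeRange(0, text.length)
                                      withTemplate:[NSRegularExpression escapedTemplateForString:replacement]];
}
```

Since the rest of the string is left untouched, the original punctuation and spacing survive without any splitting and rejoining.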

NSRegularExpression string delimited by

I have a string, for example @"You've earned Commentator and 4 ##other$$ badges". I want to retrieve the substring @"other", which is delimited by ## and $$. I made an NSRegularExpression like this:
NSRegularExpression *regex = [NSRegularExpression regularExpressionWithPattern:@"##(.*)$$" options:NSRegularExpressionCaseInsensitive error:nil];
This completely ignores $$ and returns stuff starting with ##. What am I doing wrong? thanks.
That's because '$' is a special character that matches the end of the line. Escape it as \$\$ to tell the parser you want the literal characters; inside an Objective-C string literal that is written \\$\\$.
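For illustration, a sketch of the escaped pattern in use. The function name is made up, and the pattern also switches to a lazy (.*?) so the capture stops at the first $$:

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: returns the text between "##" and "$$", or nil if absent.
static NSString *TextBetweenDelimiters(NSString *source) {
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:@"##(.*?)\\$\\$"
                                                  options:0
                                                    error:NULL];
    NSTextCheckingResult *match =
        [regex firstMatchInString:source options:0 range:NSMakeRange(0, source.length)];
    if (match == nil) return nil;
    // Capture group 1 holds the text between the delimiters.
    return [source substringWithRange:[match rangeAtIndex:1]];
}
```

For the string in the question this should return "other".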
I wouldn't use a regex in this situation, since the string bashing is so simple. No need for the overhead of compiling the expression.
NSString *source = @"You've earned Commentator and 4 ##other$$ badges";
NSRange firstDelimiterRange = [source rangeOfString:@"##"];
NSRange secondDelimiterRange = [source rangeOfString:@"$$"];
// Assumes both delimiters are present; check location against NSNotFound otherwise.
NSUInteger start = NSMaxRange(firstDelimiterRange);
NSString *result = [source substringWithRange:
NSMakeRange(start, secondDelimiterRange.location - start)];