How to delete all symbols from a string except letters and numbers? - objective-c

I am trying to use the following code:
NSRegularExpression *regex = [NSRegularExpression regularExpressionWithPattern:@"[\\p{L}[0-9]]+|-" options:NSRegularExpressionCaseInsensitive error:&error];
NSString *modifiedString = [regex stringByReplacingMatchesInString:string options:0 range:NSMakeRange(0, [string length]) withTemplate:@""];
but it doesn't work. I have tried different variations of it, also without success.
Example text:
Это тестовый.!!?! ;$%###### (вопрос) номер 1256 - верно.
Example output:
Это тестовый вопрос номер 1256 - верно

Your regex is actually matching characters you want to remove, but it is corrupt and does not even do that (due to a "wild" ]).
If you need to delete all characters except letters, digits, hyphens and whitespace, use @"[^\\p{L}\\p{M}0-9\\s-]+".
Details:
[^\\p{L}\\p{M}0-9\\s-]+ - one or more characters that are NOT:
\\p{L} - Unicode letters
\\p{M} - diacritics
0-9 - ASCII digits
\\s - whitespace
- - a literal hyphen.
See the online Objective-C demo:
NSString *text = @"Это тестовый.!!?! ;$%###### (вопрос) номер 1256 - верно";
NSError *error = nil;
NSRegularExpression *regexp = [NSRegularExpression regularExpressionWithPattern:@"[^\\p{L}\\p{M}0-9\\s-]+" options:NSRegularExpressionCaseInsensitive error:&error];
NSString *result = [regexp stringByReplacingMatchesInString:text options:0 range:NSMakeRange(0, [text length]) withTemplate:@""];
NSLog(@"%@", result);
Result: Это тестовый вопрос номер 1256 - верно

First, get the alphanumeric character set:
NSCharacterSet *alphaSet = [NSCharacterSet alphanumericCharacterSet];
Second, invert it; the inverted set will serve as the separators:
NSCharacterSet *separatorSet = [alphaSet invertedSet];
Third, split the old string on the separators and join the pieces back together with @"":
NSString *newString = [[oldString componentsSeparatedByCharactersInSet:separatorSet] componentsJoinedByString:@""];
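Note that the inverted alphanumeric set also treats spaces and hyphens as separators, so the three steps above strip them as well. If the goal is the example output from the question, a minimal sketch could widen the allowed set first. The helper name StripSymbols is hypothetical, and the choice to keep whitespace and "-" is an assumption based on the example output:

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (not from the answer above): keeps letters, digits,
// whitespace and hyphens, and drops every other character.
static NSString *StripSymbols(NSString *input) {
    // Start from the alphanumeric set and widen it with the extra allowed characters.
    NSMutableCharacterSet *allowed = [NSMutableCharacterSet alphanumericCharacterSet];
    [allowed formUnionWithCharacterSet:[NSCharacterSet whitespaceCharacterSet]];
    [allowed addCharactersInString:@"-"];

    // Everything outside the allowed set acts as a separator and is discarded.
    NSCharacterSet *separators = [allowed invertedSet];
    return [[input componentsSeparatedByCharactersInSet:separators]
            componentsJoinedByString:@""];
}
```

For example, StripSymbols(@"a.b (c) 1-2") should give @"ab c 1-2". Runs of spaces left behind by removed symbols are not collapsed.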

Related

Regex number of matches is always zero

I want to check a UITextField text with a format like "G12-123456".
Rules are simple;
First character must be upper case letter.
The 2nd and 3rd must be number.
Fourth must be "-" character.
The last six must be only numbers.
The code below does not work; the number of matches is always zero.
I also tried the regex "[A-Z0-9]{3}-[0-9]{6}".
NSString * myRegex = @"[A-Z][0-9][0-9]-[0-9][0-9][0-9][0-9][0-9][0-9]";
NSError *error = NULL;
NSRegularExpression *regex = [NSRegularExpression regularExpressionWithPattern:myRegex
options:NSRegularExpressionCaseInsensitive
error:&error];
NSUInteger numberOfMatches = [regex numberOfMatchesInString:string
options:NSMatchingReportProgress
range:NSMakeRange(0, [string length])];
This pattern works with the same code: [^a-zA-Z0-9] (from "Check whether an NSString contains a special character and a digit").
Any help would be appreciated.
First of all, your code is basically supposed to work.
However, both options are nonsensical. If you want to check for an uppercase letter, you must not pass NSRegularExpressionCaseInsensitive, and NSMatchingReportProgress affects only the block-based API.
In both cases pass 0.
The pattern can be written more concisely:
NSString *myRegex = @"[A-Z]\\d{2}-\\d{6}";
NSError *error;
NSRegularExpression *regex = [NSRegularExpression regularExpressionWithPattern:myRegex
options:0
error:&error];
if (!regex) {
NSLog(@"%@", error);
} else {
NSUInteger numberOfMatches = [regex numberOfMatchesInString:string
options:0
range:NSMakeRange(0, [string length])];
NSLog(@"%lu", (unsigned long)numberOfMatches);
}
If the regex must match the entire string add the start - end anchors.
NSString *myRegex = @"^[A-Z]\\d{2}-\\d{6}$";
If numberOfMatches is zero please check if the hyphen character is the standard one (ASCII 45, hex 0x2D).
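Putting the advice above together, a text-field check could be sketched as a small predicate. The function name IsValidCode is hypothetical; the pattern is the anchored one from this answer, and both option arguments are 0 as recommended:

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: returns YES only when the whole string looks like "G12-123456".
static BOOL IsValidCode(NSString *candidate) {
    NSError *error = nil;
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:@"^[A-Z]\\d{2}-\\d{6}$"
                                                  options:0   // case-sensitive on purpose
                                                    error:&error];
    if (!regex) {
        NSLog(@"%@", error);
        return NO;
    }
    NSRange whole = NSMakeRange(0, candidate.length);
    // The anchors allow at most one match, so any match means the format is valid.
    return [regex numberOfMatchesInString:candidate options:0 range:whole] > 0;
}
```

With this sketch, IsValidCode(@"G12-123456") is expected to return YES, while a lowercase first letter, a missing digit, or an en dash instead of the ASCII hyphen should all return NO.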

How to remove text within parentheses using NSRegularExpression?

I try to remove part of the string which is inside parentheses.
As an example, for the string "(This should be removed) and only this part should remain", after using NSRegularExpression it should become "and only this part should remain".
I have this code, but nothing happens. I have tested my regex code with RegExr.com and it works correctly. I would appreciate any help.
NSString *phraseLabelWithBrackets = #"(test text) 1 2 3 text test";
NSError *error = NULL;
NSRegularExpression *regexp = [NSRegularExpression regularExpressionWithPattern:@"/\\(([^\\)]+)\\)/g" options:NSRegularExpressionCaseInsensitive error:&error];
NSString *phraseLabelWithoutBrackets = [regexp stringByReplacingMatchesInString:phraseLabelWithBrackets options:0 range:NSMakeRange(0, [phraseLabelWithBrackets length]) withTemplate:@""];
NSLog(@"%@", phraseLabelWithoutBrackets);
Remove the regex delimiters and make sure you also exclude ( in the character class:
NSString *phraseLabelWithBrackets = @"(test text) 1 2 3 text test";
NSError *error = nil;
NSRegularExpression *regexp = [NSRegularExpression regularExpressionWithPattern:@"\\([^()]+\\)" options:NSRegularExpressionCaseInsensitive error:&error];
NSString *phraseLabelWithoutBrackets = [regexp stringByReplacingMatchesInString:phraseLabelWithBrackets options:0 range:NSMakeRange(0, [phraseLabelWithBrackets length]) withTemplate:@""];
NSLog(@"%@", phraseLabelWithoutBrackets);
See this IDEONE demo and a regex demo.
The \([^()]+\) pattern will match
\( - an open parenthesis
[^()]+ - 1 or more characters other than ( and ) (change + to * to also match and remove empty parentheses ())
\) - a closing parenthesis

How can I replace one pair of character with multiple occurrence in a string?

Original String is: This is a sentence with (noun) (verb) (adverb).
The original sentence has three occurrences of (). I need the last one left intact and the rest replaced with @"".
Required String: This is a sentence with (adverb).
I can do it with NSRange but I am looking for NSRegularExpression pattern.
Also which is more efficient, one with NSRange or the NSRegularExpression.
CODE
NSRegularExpression *regex = [NSRegularExpression regularExpressionWithPattern:@"\\(.*?\\)" options:NSRegularExpressionCaseInsensitive error:NULL];
NSString *newString = [regex stringByReplacingMatchesInString:modify options:0 range:NSMakeRange(0, [modify length]) withTemplate:@""];
Output:: This is a sentence with
You can obtain the match ranges themselves and do the replacement manually, ignoring the last one.
NSMutableString* newString = [modify mutableCopy];
NSArray<NSTextCheckingResult*>* matches = [regex matchesInString:newString options:0 range:NSMakeRange(0, newString.length)];
if (matches.count >= 2)
{
// Enumerate backwards so that each replacement doesn't invalidate the other ranges
for (NSInteger i = matches.count - 2; i >= 0; i--)
{
NSTextCheckingResult* result = matches[i];
[newString replaceCharactersInRange:result.range withString:@""];
}
}
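Since the question asked for a pattern, a regex-only alternative may also work: a lookahead can require that another opening parenthesis follows, so the last group never matches. This is a sketch and not the approach of the answer above. The helper name KeepLastParenthesized is hypothetical, and it assumes the parentheses are balanced and not nested:

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: removes every "(...)" group except the last one.
// Assumes balanced, non-nested parentheses.
static NSString *KeepLastParenthesized(NSString *input) {
    // \([^()]*\)      one parenthesized group
    // \s*             plus any whitespace that follows it
    // (?=[^()]*\()    only when the next parenthesis in the string is another "("
    NSString *pattern = @"\\([^()]*\\)\\s*(?=[^()]*\\()";
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:pattern options:0 error:NULL];
    return [regex stringByReplacingMatchesInString:input
                                           options:0
                                             range:NSMakeRange(0, input.length)
                                      withTemplate:@""];
}
```

For the sentence in the question this should return "This is a sentence with (adverb).", because the trailing whitespace of each removed group is consumed by the match.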

Objective C - Split string into array

How would I do this? I'm new to Objective-C but I can't find anything that would help me do this.
NSArray *splitLine = [currentLine componentsSeparatedByString:@":%@",notNumber];
Where notNumber is a string that represents anything that isn't a number. So I want to separate a string where there are colons separated by strings that aren't numbers. (I want to avoid splitting at times i.e. 3:00pm, but split at iCal parameters like DESCRIPTION: and LOCATION:.)
You can do this in several steps, like this. I have not compiled this code, but it should at least give you an idea of what to do.
1) Create a regex object to match your separators:
NSString *regexString = @"DESCRIPTION:\\s|LOCATION:\\s"; // or whatever makes sense for your scenario
NSRegularExpression *regex =
[NSRegularExpression regularExpressionWithPattern:regexString
options:NSRegularExpressionCaseInsensitive
error:nil];
2) Replace all the different separators matching your regex with just one separator:
NSRange range = NSMakeRange(0, string.length);
NSString *string2 = [regex stringByReplacingMatchesInString:string
options:0
range:range
withTemplate:@"SEPARATOR"];
3) Split the string!
NSArray *elements = [string2 componentsSeparatedByString:@"SEPARATOR"];
The shortest solution for splitting a string on spaces:
NSString *str = @"Please split me to form array of words";
NSArray *wordsArray = [str componentsSeparatedByString:@" "];
You can use regular expressions!
Using this pattern (I believe this is the core of your question):
NSString *pattern = @"(?<=[^0-9]):(?=[^0-9])";
This pattern only matches ':' characters that have a non-digit on both sides.
Then replace each match with a dummy value that won't show up in your data:
NSString *dummy = @"NEVERSEETHIS";
NSRegularExpression *regex = [NSRegularExpression regularExpressionWithPattern:pattern options:0 error:nil];
NSRange range = NSMakeRange(0, [string length]);
NSString *modified = [regex stringByReplacingMatchesInString:string options:0 range:range withTemplate:dummy];
and finally, split:
return [modified componentsSeparatedByString:dummy];

RegEx (replaceMatchesInString) does not work

Why does this RegEx with replaceMatchesInString return only "+" instead of "+123"?
NSMutableString *phoneNumberCleaned = [NSMutableString stringWithString:@"++00123"];
NSString *strRegExPhoneNumberPrefixWrong = @"^([+0]*)\\d*$";
NSRegularExpression *regEx = [NSRegularExpression regularExpressionWithPattern:strRegExPhoneNumberPrefixWrong options:0 error:nil];
[regEx replaceMatchesInString:phoneNumberCleaned options:0 range:NSMakeRange(0, [phoneNumberCleaned length]) withTemplate:@"+"];
return phoneNumberCleaned;
Thanks
NSString *string = @"++00123";
NSError *error = nil;
NSRegularExpression *regex = [NSRegularExpression
regularExpressionWithPattern:@"^[+0]+(?=\\d*)"
options:NSRegularExpressionCaseInsensitive
error:&error];
NSString *modifiedString = [regex
stringByReplacingMatchesInString:string
options:0
range:NSMakeRange(0, [string length])
withTemplate:@"+"];
return modifiedString;
The problem with your regex was that ^([+0]*)\\d*$ also matches the \\d* part, which means the digits get replaced too (you'd think it would only replace your capture group, but the replacement always covers the entire match). So you were essentially replacing everything that matched the pattern, trailing digits included, which in your case was the entire number.
What I used in my answer is called a positive lookahead.
^[+0]+(?=\\d*)
The lookahead basically means that you're looking for one or more + or 0 characters that are followed by zero or more digits, EXCLUDING the digits from the match. So you only replace the zeroes and pluses, not the digits following them.
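Because the replacement covers the whole match, another option is to keep the original capture-group idea and put the captured digits back through the template. This is a minimal sketch of that alternative. The helper name NormalizePrefix is hypothetical, and the capture group is moved onto the digits so that $1 refers to them:

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: collapses any leading run of '+' and '0' into a single '+'.
static NSString *NormalizePrefix(NSString *number) {
    // Group 1 captures the digits that follow the prefix; the whole match is
    // replaced, so the template re-inserts group 1 after the '+'.
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:@"^[+0]*(\\d*)$"
                                                  options:0
                                                    error:NULL];
    return [regex stringByReplacingMatchesInString:number
                                           options:0
                                             range:NSMakeRange(0, number.length)
                                      withTemplate:@"+$1"];
}
```

NormalizePrefix(@"++00123") should return @"+123". A string that does not consist solely of the prefix followed by digits does not match and is returned unchanged.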