Is there an Objective-c regex replace with callback/C# MatchEvaluator equivalent? - objective-c

I have a C# project I'm intending to port to Objective-C. From what I understand about Obj-C, it looks like there's a confusing variety of Regex options but I can't see anything about a way of doing a replace with callback.
I'm looking for something that is the equivalent of the C# MatchEvaluator delegate or PHP's preg_replace_callback. An example of what I want to do in C# is -
// change input so each word is followed by a number showing how many letters it has
string inputString = "Hello, how are you today ?";
Regex theRegex = new Regex(@"\w+");
string outputString = theRegex.Replace(inputString, delegate (Match thisMatch){
return thisMatch.Value + thisMatch.Value.Length;
});
// outputString is now 'Hello5, how3 are3 you3 today5 ?'
How could I do this in Objective-C ? In my actual situation the Regex has both lookahead and lookbehind assertions in it though, so any alternative involving finding the strings in advance and then doing a series of straight string replaces won't work unfortunately.

Foundation has an NSRegularExpression class (iOS 4 and later), which may be useful to you. From the docs:
The fundamental matching method for NSRegularExpression is a Block iterator method that allows clients to supply a Block object which will be invoked each time the regular expression matches a portion of the target string. There are additional convenience methods for returning all the matches as an array, the total number of matches, the first match, and the range of the first match.
For example:
NSString *input = @"Hello, how are you today?";
// make a copy of the input string. we are going to edit this one as we iterate
NSMutableString *output = [NSMutableString stringWithString:input];
NSError *error = NULL;
NSRegularExpression *regex = [NSRegularExpression
regularExpressionWithPattern:@"\\w+"
options:NSRegularExpressionCaseInsensitive
error:&error];
// keep track of how many additional characters we've added so far
__block NSUInteger count = 0;
[regex enumerateMatchesInString:input
options:0
range:NSMakeRange(0, [input length])
usingBlock:^(NSTextCheckingResult *match, NSMatchingFlags flags, BOOL *stop){
// Note that Blocks in Objective-C are basically closures,
// so they will keep a constant copy of variables that were in scope
// when the block was declared
// unless you prefix the variable with the __block qualifier
// match.range is a C struct
// match.range.location is the character offset of the match
// match.range.length is the length of the match
NSString *matchedword = [input substringWithRange:match.range];
// the matched word with the length appended
NSString *new = [matchedword stringByAppendingFormat:@"%lu", (unsigned long)[matchedword length]];
// every iteration, the output string is getting longer
// so we need to adjust the range that we are editing
NSRange newrange = NSMakeRange(match.range.location+count, match.range.length);
[output replaceCharactersInRange:newrange withString:new];
// the replacement may add more than one character (e.g. for words of 10+ letters)
count += [new length] - match.range.length;
}];
NSLog(@"%@", output); //output: Hello5, how3 are3 you3 today5?

I modified atshum's code to make it a bit more flexible:
__block NSUInteger prevEndPosition = 0;
[regex enumerateMatchesInString:text
options:0
range:NSMakeRange(0, [text length])
usingBlock:^(NSTextCheckingResult *match, NSMatchingFlags flags, BOOL *stop)
{
NSRange r = {.location = prevEndPosition, .length = match.range.location - prevEndPosition};
// Copy everything without modification between previous replacement and new one
[output appendString:[text substringWithRange:r]];
// Append string to be replaced
[output appendString:@"REPLACED"];
prevEndPosition = match.range.location + match.range.length;
}];
// Finalize string end
NSRange r = {.location = prevEndPosition, .length = [text length] - prevEndPosition};
[output appendString:[text substringWithRange:r]];
Seems to work for now (probably needs a bit more testing)
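To get something closer to a C# MatchEvaluator, the append-as-you-go idea above can be wrapped in a function that takes the replacement logic as a block. This is only a sketch: ReplaceMatchesUsingBlock and DemoAppendWordLengths are made-up names (not Foundation API), and ARC is assumed.

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical helper (not part of Foundation): builds a new string in which
// every regex match is replaced by whatever the evaluator block returns.
static NSString *ReplaceMatchesUsingBlock(NSRegularExpression *regex,
                                          NSString *input,
                                          NSString *(^evaluator)(NSTextCheckingResult *match)) {
    NSMutableString *output = [NSMutableString string];
    __block NSUInteger previousEnd = 0;

    [regex enumerateMatchesInString:input
                            options:0
                              range:NSMakeRange(0, input.length)
                         usingBlock:^(NSTextCheckingResult *match, NSMatchingFlags flags, BOOL *stop) {
        if (match == nil) { return; } // only possible with progress-reporting options
        NSRange matchRange = match.range;
        // Copy the untouched text between the previous match and this one.
        [output appendString:[input substringWithRange:
            NSMakeRange(previousEnd, matchRange.location - previousEnd)]];
        // Append the replacement computed by the caller.
        [output appendString:evaluator(match)];
        previousEnd = NSMaxRange(matchRange);
    }];

    // Copy whatever follows the last match.
    [output appendString:[input substringFromIndex:previousEnd]];
    return output;
}

// Example: append each word's length, as in the C# snippet from the question.
static NSString *DemoAppendWordLengths(void) {
    NSString *input = @"Hello, how are you today ?";
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:@"\\w+" options:0 error:NULL];
    return ReplaceMatchesUsingBlock(regex, input, ^NSString *(NSTextCheckingResult *match) {
        NSString *word = [input substringWithRange:match.range];
        return [word stringByAppendingFormat:@"%lu", (unsigned long)word.length];
    });
}
```

Because the output is built by appending rather than editing in place, there is no offset bookkeeping, and the replacement can be shorter or longer than the match. The regex itself still runs against the original input, so lookahead and lookbehind assertions behave as usual.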

Related

How to display persian script through unicode

Could someone please help me display this string in Persian script: "\u0622\u062f\u0631\u0633 \u0627\u06cc\u0645\u06cc\u0644"
I have tried using
NSData *data = [yourtext dataUsingEncoding:NSUTF8StringEncoding];
NSString *decodevalue = [[NSString alloc] initWithData:data encoding:NSNonLossyASCIIStringEncoding];
and this gets returned: u0622u062fu0631u0633 u0627u06ccu0645u06ccu0644
I want the same solution for objective C: https://www.codeproject.com/Questions/714169/Conversion-from-Unicode-to-Original-format-csharp
I assume that your input string contains backslash-escaped codes (as if it had been copied verbatim from a source file), and that you want to parse the escape sequences into a Unicode string while preserving the unescaped characters as they are.
This is what I came up with:
NSError *badRegexError;
NSRegularExpression *regex = [NSRegularExpression regularExpressionWithPattern:@"(\\\\u([a-f0-9]{4})|.)" options:0 error:&badRegexError];
if (badRegexError) {
NSLog(@"bad regex: %@", badRegexError);
return;
}
NSString *input = @"\\u0622\\u062f\\u0631\\u0633 123 test -_- \\u0627\\u06cc\\u0645\\u06cc\\u0644";
NSMutableString *output = [NSMutableString new];
[regex enumerateMatchesInString:input options:0 range:NSMakeRange(0, input.length)
usingBlock:^(NSTextCheckingResult *result, NSMatchingFlags flags, BOOL *stop)
{
NSRange codeRange = [result rangeAtIndex:2];
if (codeRange.location != NSNotFound) {
NSString *codeStr = [input substringWithRange:codeRange];
NSScanner *scanner = [NSScanner scannerWithString:codeStr];
unsigned int code;
if ([scanner scanHexInt:&code]) {
unichar c = (unichar)code;
[output appendString:[NSString stringWithCharacters:&c length:1]];
}
} else {
[output appendString:[input substringWithRange:result.range]];
}
}];
NSLog(@" actual: %@", output);
NSLog(@"expected: %@", @"\u0622\u062f\u0631\u0633 123 test -_- \u0627\u06cc\u0645\u06cc\u0644");
Explanation
This is using a regex that finds blocks of 6 characters like \uXXXX, for example \u062f. It extracts the code as a string like 062f, and then uses NSScanner.scanHexInt to convert it to a number. It assumes that this number is a valid unichar, and builds a string from it.
Note the \\\\ in the regex: first the Objective-C compiler removes one layer of backslashes, leaving \\, and then the regex compiler removes the second layer, leaving \, which is matched literally. If you have just "u0622u062f..." (without backslashes), try removing \\\\ from the regex.
The second part of the regex (|.) treats non-escaped characters as is.
Caveats
You also might want to make the matching case insensitive by setting proper regex options.
This doesn't handle invalid character codes.
This is not the most performant solution, and you'd better use a proper parsing library to do this at scale.
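If you would rather not hand-roll the parsing, one alternative (a different technique from the regex above) is to let Core Foundation apply the ICU "Any-Hex/Java" transform in reverse. A minimal sketch, assuming ARC; UnescapeUnicode is a made-up helper name:

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical helper: turns literal \uXXXX sequences into the characters they
// name and leaves all other text untouched.
static NSString *UnescapeUnicode(NSString *escaped) {
    NSMutableString *result = [escaped mutableCopy];
    // "Any-Hex/Java" normally converts characters to \uXXXX escapes;
    // passing reverse = YES runs it the other way (escapes -> characters).
    CFStringTransform((__bridge CFMutableStringRef)result, NULL,
                      CFSTR("Any-Hex/Java"), YES);
    return result;
}
```

Call it with the backslashes actually present in the string, e.g. UnescapeUnicode(@"\\u0622\\u062f\\u0631\\u0633 123").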
Related docs and links
https://developer.apple.com/documentation/foundation/nsregularexpression?language=objc
https://developer.apple.com/documentation/foundation/nsregularexpression/1409687-enumeratematchesinstring?language=objc
How do you use NSRegularExpression's replacementStringForResult:inString:offset:template:
https://developer.apple.com/documentation/foundation/nstextcheckingresult?language=objc
xcode UTF-8 literals
Objective-C parse hex string to integer
Just copy and paste this phrase into a Python shell and press Enter; you will see it in Farsi (Persian). The result is: ایمیل آدرس

Replace specific words in NSString

what is the best way to get and replace specific words in string ?
for example I have
NSString * currentString = @"one {two}, thing {thing} good";
now I need find each {currentWord}
and apply function for it
[self replaceWord:currentWord]
then replace currentWord with result from function
-(NSString*)replaceWord:(NSString*)currentWord;
The following example shows how you can use NSRegularExpression and enumerateMatchesInString to accomplish the task. I have just used uppercaseString as function that replaces a word, but you can use your replaceWord method as well:
EDIT: The first version of my answer did not work correctly if the replaced words are shorter or longer than the original words (thanks to Fabian Kreiser for noting that!). Now it should work correctly in all cases.
NSString *currentString = @"one {two}, thing {thing} good";
// Regular expression to find "word characters" enclosed by {...}:
NSRegularExpression *regex;
regex = [NSRegularExpression regularExpressionWithPattern:@"\\{(\\w+)\\}"
options:0
error:NULL];
NSMutableString *modifiedString = [currentString mutableCopy];
__block NSInteger offset = 0;
[regex enumerateMatchesInString:currentString
options:0
range:NSMakeRange(0, [currentString length])
usingBlock:^(NSTextCheckingResult *result, NSMatchingFlags flags, BOOL *stop) {
// range = location of the regex capture group "(\\w+)" in currentString:
NSRange range = [result rangeAtIndex:1];
// Adjust location for modifiedString:
range.location += offset;
// Get old word:
NSString *oldWord = [modifiedString substringWithRange:range];
// Compute new word:
// In your case, that would be
// NSString *newWord = [self replaceWord:oldWord];
NSString *newWord = [NSString stringWithFormat:@"--- %@ ---", [oldWord uppercaseString] ];
// Replace new word in modifiedString:
[modifiedString replaceCharactersInRange:range withString:newWord];
// Update offset:
offset += [newWord length] - [oldWord length];
}
];
NSLog(@"%@", modifiedString);
Output:
one {--- TWO ---}, thing {--- THING ---} good

Count the amount of times '$' shows up in a string (Objective C)

I was wondering if there was an easy method to find the amount of times a character such as '$' shows up in a string in the language objective-c.
The real world example I am using is a string that would look like:
542$764$231$DataEntry
What I need to do is first:
1) count the amount of times the '$' shows up to know what tier the DataEntry is in my database (my database structure is one I made up)
2) then I need to get all of the numbers, as they are index numbers. The numbers need to be stored in a NSArray. And I will loop through them all getting the different indexes. I'm not going to explain how my database structure works as that isn't relevant.
Basically from that NSString, I need, the amount of times '$' shows up. And all of the numbers in between the dollar signs. This would be a breeze to do in PHP, but I was curious to see how I could go about this in Objective-C.
Thanks,
Michael
[[@"542$764$231$DataEntry" componentsSeparatedByString:@"$"] count]-1
You can use componentsSeparatedByString, as suggested by @Parag Bafna and @J Shapiro, or NSRegularExpression, e.g.:
#import <Foundation/Foundation.h>
int main(int argc, char *argv[]) {
@autoreleasepool {
NSError *error = NULL;
NSString *searchText = @"542$764$231$DataEntry";
NSRegularExpression *regex = [NSRegularExpression regularExpressionWithPattern:@"(\\d{3})\\$" options:NSRegularExpressionCaseInsensitive error:&error];
NSUInteger numberOfMatches = [regex numberOfMatchesInString:searchText options:0 range:NSMakeRange(0, [searchText length]) ];
printf("match count = %lu\n", (unsigned long)numberOfMatches);
[regex enumerateMatchesInString:searchText
options:0
range:NSMakeRange(0,[searchText length])
usingBlock:^(NSTextCheckingResult *match, NSMatchingFlags flags, BOOL *stop){
NSRange range = [match rangeAtIndex:1];
printf("match = %s\n",[[searchText substringWithRange:range] UTF8String]);
}];
}
}
componentsSeparatedByString is probably the preferred approach, and much more performant, where the pattern has simple repeating delimiters; I included this approach for completeness' sake.
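For reference, the componentsSeparatedByString route could look roughly like this. It is a sketch under the assumption that every component except the last one is an index number; IndexNumbersFromEntry is a made-up helper name.

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical helper: returns the numbers that precede the final component and
// reports how many '$' separators were found through dollarCount (may be NULL).
static NSArray<NSNumber *> *IndexNumbersFromEntry(NSString *entry, NSUInteger *dollarCount) {
    NSArray<NSString *> *parts = [entry componentsSeparatedByString:@"$"];
    // N separators always produce N + 1 components.
    if (dollarCount != NULL) { *dollarCount = parts.count - 1; }

    NSMutableArray<NSNumber *> *numbers = [NSMutableArray array];
    // Skip the last component, which is the data entry itself.
    for (NSUInteger i = 0; i + 1 < parts.count; i++) {
        // integerValue yields 0 for non-numeric text; validate first if that matters.
        [numbers addObject:@([parts[i] integerValue])];
    }
    return numbers;
}
```

For @"542$764$231$DataEntry" this reports three separators and returns the numbers 542, 764 and 231.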
Try this code:
NSMutableArray* substrings=[NSMutableArray new];
// This will contain all the substrings
NSMutableArray* numbers=[NSMutableArray new];
// This will contain all the numbers
NSNumberFormatter* formatter=[NSNumberFormatter new];
// The formatter will scan all the strings and establish whether they're
// valid numbers; if so, it will produce an NSNumber object
[formatter setNumberStyle: NSNumberFormatterDecimalStyle];
NSString* entry= @"542$764$231$DataEntry";
NSUInteger count=0,last=0;
// count will contain the number of '$' characters found
NSRange range=NSMakeRange(0, entry.length);
// range is the range where to check
do
{
range= [entry rangeOfString: @"$" options: NSLiteralSearch range: range];
// Check for a substring
if(range.location!=NSNotFound)
{
// If there's not a further substring range.location will be NSNotFound
NSRange substringRange= NSMakeRange(last, range.location-last);
// Get the range of the substring
NSString* substring=[entry substringWithRange: substringRange];
[substrings addObject: substring];
// Get the substring and add it to the substrings mutable array
last=range.location+1;
range= NSMakeRange(range.location+range.length, entry.length-range.length-range.location);
// Calculate the new range where to check for the next substring
count++;
// Increase the count
}
}while( range.location!=NSNotFound);
// Now count contains the number of '$' characters found, and substrings
// contains all the substrings separated by '$'
for(NSString* substring in substrings)
{
// Check all the substrings found
NSNumber* number;
if([formatter getObjectValue: &number forString: substring range: nil error: nil])
{
// If the substring is a valid number, the method returns YES and we go
// inside this scope, so we can add the number to the numbers array
[numbers addObject: number];
}
}
// Now numbers contains all the numbers found

Objective-C NSString character substitution

I have a NSString category I am working on to perform character substitution similar to PHP's strtr. This method takes a string and replaces every occurrence of each character in fromString and replaces it with the character in toString with the same index. I have a working method but it is not very performant and would like to make it quicker and able to handle megabytes of data.
Edit (for clarity):
stringByReplacingOccurrencesOfString:withString:options:range: will not work. I have to take a string like "ABC" and after replacing "A" with "B" and "B" with "A" end up with "BAC". Successive invocations of stringByReplacingOccurrencesOfString:withString:options:range: would make a string like "AAC" which would be incorrect.
Suggestions would be great, sample code would be even better!
Code:
- (NSString *)stringBySubstitutingCharactersFromString:(NSString *)fromString
toString:(NSString *)toString;
{
NSMutableString *substitutedString = [self mutableCopy];
NSString *aCharacterString;
NSUInteger characterIndex, stringLength = substitutedString.length;
for (NSUInteger i = 0; i < stringLength; ++i) {
aCharacterString = [NSString stringWithFormat: @"%C", [substitutedString characterAtIndex:i]];
characterIndex = [fromString rangeOfString:aCharacterString].location;
if (characterIndex == NSNotFound) continue;
[substitutedString replaceCharactersInRange:NSMakeRange(i, 1)
withString:[NSString stringWithFormat:@"%C", [toString characterAtIndex:characterIndex]]];
}
return substitutedString;
}
Also this code is executed after every change to text in a text view. It is passed the entire string every time. I know that there is a better way to do it, but I do not know how. Any suggestions for this would be most certainly appreciated!
You can make that kind of string substitution with NSRegularExpression, either by modifying a mutable string or by creating a new immutable string. It will work with any two strings to substitute (even if they are longer than one character), but you will need to escape any character that has a special meaning in a regular expression (such as \ [ ( . * ? +).
The pattern finds either of the two substrings, then optional text, then either substring again; the template swaps the two outer matches while preserving the text between them.
// These string can be of any length
NSString *firstString = @"Axa";
NSString *secondString = @"By";
// Escaping of characters used in regular expressions has NOT been done here
NSString *pattern = [NSString stringWithFormat:@"(%@|%@)(.*?)(%@|%@)", firstString, secondString, firstString, secondString];
NSString *string = @"AxaByCAxaCBy";
NSError *error = NULL;
NSRegularExpression *regex = [NSRegularExpression regularExpressionWithPattern:pattern
options:NSRegularExpressionCaseInsensitive
error:&error];
if (error) {
// Insert error handling here...
}
NSString *modifiedString = [regex stringByReplacingMatchesInString:string
options:0
range:NSMakeRange(0, [string length])
withTemplate:@"$3$2$1"];
NSLog(@"Before:\t%@", string); // AxaByCAxaCBy
NSLog(@"After: \t%@", modifiedString); // ByAxaCByCAxa
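For the single-character, strtr-style case in the question, a different technique that may scale better to megabytes of text is to copy the UTF-16 units into a buffer once and map them through a lookup table. This is a sketch, not the approach from the answer above: SubstituteCharacters is a made-up name, ARC is assumed, malloc failures are not checked, and it works on UTF-16 code units, so characters outside the Basic Multilingual Plane (surrogate pairs) are not handled.

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical helper: replaces each UTF-16 unit found in fromString with the
// unit at the same index in toString, in a single pass over the source.
static NSString *SubstituteCharacters(NSString *source, NSString *fromString, NSString *toString) {
    NSUInteger pairCount = MIN(fromString.length, toString.length);
    NSUInteger length = source.length;
    if (length == 0 || pairCount == 0) { return [source copy]; }

    // Identity table over all UTF-16 code units, then overwrite the mapped ones.
    unichar *map = malloc((UINT16_MAX + 1) * sizeof(unichar));
    for (NSUInteger u = 0; u <= UINT16_MAX; u++) { map[u] = (unichar)u; }
    for (NSUInteger i = 0; i < pairCount; i++) {
        map[[fromString characterAtIndex:i]] = [toString characterAtIndex:i];
    }

    // Each unit is looked up exactly once, so "A"->"B" and "B"->"A" cannot
    // interfere with each other.
    unichar *buffer = malloc(length * sizeof(unichar));
    [source getCharacters:buffer range:NSMakeRange(0, length)];
    for (NSUInteger i = 0; i < length; i++) { buffer[i] = map[buffer[i]]; }
    free(map);

    // The new string takes ownership of buffer and frees it when deallocated.
    return [[NSString alloc] initWithCharactersNoCopy:buffer length:length freeWhenDone:YES];
}
```

With the example from the question, SubstituteCharacters(@"ABC", @"AB", @"BA") gives @"BAC". If the same from/to pair is reused on every text change, building the table once and caching it would avoid the setup cost per call.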

How to capitalize the first word of the sentence in Objective-C?

I've already found how to capitalize all words of the sentence, but not the first word only.
NSString *txt = @"hi my friends!";
[txt capitalizedString];
I don't want to change to lower case and capitalize the first char. I'd like to capitalize the first word only without change the others.
Here is another go at it:
NSString *txt = @"hi my friends!";
txt = [txt stringByReplacingCharactersInRange:NSMakeRange(0,1) withString:[[txt substringToIndex:1] uppercaseString]];
For Swift language:
txt.replaceRange(txt.startIndex...txt.startIndex, with: String(txt[txt.startIndex]).capitalizedString)
The accepted answer is wrong. First, it is not correct to treat the units of NSString as "characters" in the sense that a user expects. There are surrogate pairs. There are combining sequences. Splitting those will produce incorrect results. Second, it is not necessarily the case that uppercasing the first character produces the same result as capitalizing a word containing that character. Languages can be context-sensitive.
The correct way to do this is to get the frameworks to identify words (and possibly sentences) in the locale-appropriate manner. And also to capitalize in the locale-appropriate manner.
[aMutableString enumerateSubstringsInRange:NSMakeRange(0, [aMutableString length])
options:NSStringEnumerationByWords | NSStringEnumerationLocalized
usingBlock:^(NSString *substring, NSRange substringRange, NSRange enclosingRange, BOOL *stop) {
[aMutableString replaceCharactersInRange:substringRange
withString:[substring capitalizedStringWithLocale:[NSLocale currentLocale]]];
*stop = YES;
}];
It's possible that the first word of a string is not the same as the first word of the first sentence of a string. To identify the first (or each) sentence of the string and then capitalize the first word of that (or those), surround the above in an outer invocation of -enumerateSubstringsInRange:options:usingBlock: using NSStringEnumerationBySentences | NSStringEnumerationLocalized. In the inner invocation, pass the substringRange provided by the outer invocation as the range argument.
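Put together, the nested enumeration described above might look like the following. It is a sketch, assuming ARC; CapitalizeFirstWordOfEachSentence is a made-up name.

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical helper: capitalizes the first word of every sentence in text.
static NSString *CapitalizeFirstWordOfEachSentence(NSString *text) {
    NSMutableString *result = [text mutableCopy];
    NSLocale *locale = [NSLocale currentLocale];

    // Outer pass: let the framework find sentence boundaries.
    [result enumerateSubstringsInRange:NSMakeRange(0, result.length)
                               options:NSStringEnumerationBySentences | NSStringEnumerationLocalized
                            usingBlock:^(NSString *sentence, NSRange sentenceRange,
                                         NSRange sentenceEnclosingRange, BOOL *stopSentences) {
        // Inner pass: find the first word inside this sentence only.
        [result enumerateSubstringsInRange:sentenceRange
                                   options:NSStringEnumerationByWords | NSStringEnumerationLocalized
                                usingBlock:^(NSString *word, NSRange wordRange,
                                             NSRange wordEnclosingRange, BOOL *stopWords) {
            // Note: capitalizedStringWithLocale: also lowercases the rest of this word.
            [result replaceCharactersInRange:wordRange
                                  withString:[word capitalizedStringWithLocale:locale]];
            *stopWords = YES; // only the first word of the sentence
        }];
    }];
    return result;
}
```

Mutating the string during enumeration is allowed here because each replacement stays inside the range currently being enumerated.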
Use
- (NSArray *)componentsSeparatedByCharactersInSet:(NSCharacterSet *)separator
and capitalize the first object in the array and then use
- (NSString *)componentsJoinedByString:(NSString *)separator
to join them back
pString = [pString
stringByReplacingCharactersInRange:NSMakeRange(0,1)
withString:[[pString substringToIndex:1] capitalizedString]];
You can also do this with a regular expression. This works for me; you can simply paste the code below:
+(NSString*)CaptializeFirstCharacterOfSentence:(NSString*)sentence{
NSMutableString *firstCharacter = [sentence mutableCopy];
NSString *pattern = @"(^|\\.|\\?|\\!)\\s*(\\p{Letter})";
NSRegularExpression *regex = [NSRegularExpression regularExpressionWithPattern:pattern options:0 error:NULL];
[regex enumerateMatchesInString:sentence options:0 range:NSMakeRange(0, [sentence length]) usingBlock:^(NSTextCheckingResult *result, NSMatchingFlags flags, BOOL *stop) {
//NSLog(@"%@", result);
NSRange r = [result rangeAtIndex:2];
[firstCharacter replaceCharactersInRange:r withString:[[sentence substringWithRange:r] uppercaseString]];
}];
NSLog(@"%@", firstCharacter);
return firstCharacter;
}
//Call this method
NSString *resultSentence = [UserClass CaptializeFirstCharacterOfSentence:yourTexthere];
An alternative solution in Swift:
var str = "hello"
if count(str) > 0 {
str.splice(String(str.removeAtIndex(str.startIndex)).uppercaseString, atIndex: str.startIndex)
}
For the sake of having options, I'd suggest:
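NSString *myString = [NSString stringWithFormat:@"this is a string..."];
/* UTF-8 can need several bytes per character, so ask for the worst-case size */
NSUInteger bufSize = [myString maximumLengthOfBytesUsingEncoding:NSUTF8StringEncoding] + 1;
char *tmpStr = calloc(bufSize, sizeof(char));
[myString getCString:tmpStr maxLength:bufSize encoding:NSUTF8StringEncoding];
int sIndex = 0;
/* skip non-alpha characters at beginning of string, stopping at the terminator */
while (tmpStr[sIndex] != '\0' && !isalpha((unsigned char)tmpStr[sIndex])) {
sIndex++;
}
/* toupper returns the converted character; it does not modify its argument */
tmpStr[sIndex] = toupper((unsigned char)tmpStr[sIndex]);
myString = [NSString stringWithCString:tmpStr encoding:NSUTF8StringEncoding];
free(tmpStr);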
I'm at work and don't have my Mac to test this on, but if I remember correctly, you couldn't use [myString cStringUsingEncoding:NSUTF8StringEncoding] because it returns a const char *.
In Swift you can do it as follows, using this extension:
extension String {
func ucfirst() -> String {
return (self as NSString).stringByReplacingCharactersInRange(NSMakeRange(0, 1), withString: (self as NSString).substringToIndex(1).uppercaseString)
}
}
calling your string like this:
var ucfirstString:String = "test".ucfirst()
I know the question asks specifically for an Objective C answer, however here is a solution for Swift 2.0:
let txt = "hi my friends!"
var sentencecaseString = ""
for (index, character) in txt.characters.enumerate() {
if 0 == index {
sentencecaseString += String(character).uppercaseString
} else {
sentencecaseString.append(character)
}
}
Or as an extension:
extension String {
func sentencecaseString() -> String {
var sentencecaseString = ""
for (index, character) in self.characters.enumerate() {
if 0 == index {
sentencecaseString += String(character).uppercaseString
} else {
sentencecaseString.append(character)
}
}
return sentencecaseString
}
}