How to display Persian script through Unicode - objective-c

Could someone please help me display this string in Persian script: "\u0622\u062f\u0631\u0633 \u0627\u06cc\u0645\u06cc\u0644"
I have tried using
NSData *data = [yourtext dataUsingEncoding:NSUTF8StringEncoding];
NSString *decodevalue = [[NSString alloc] initWithData:data encoding:NSNonLossyASCIIStringEncoding];
and this gets returned: u0622u062fu0631u0633 u0627u06ccu0645u06ccu0644
I want the same solution for Objective-C: https://www.codeproject.com/Questions/714169/Conversion-from-Unicode-to-Original-format-csharp

I assume that your input string contains backslash-escaped codes (as if it had been copied verbatim from a source file), and that you want to parse the escape sequences into a Unicode string while preserving the unescaped characters as they are.
This is what I've come up with:
NSError *badRegexError;
NSRegularExpression *regex = [NSRegularExpression regularExpressionWithPattern:@"(\\\\u([a-f0-9]{4})|.)" options:0 error:&badRegexError];
if (badRegexError) {
    NSLog(@"bad regex: %@", badRegexError);
    return;
}
NSString *input = @"\\u0622\\u062f\\u0631\\u0633 123 test -_- \\u0627\\u06cc\\u0645\\u06cc\\u0644";
NSMutableString *output = [NSMutableString new];
[regex enumerateMatchesInString:input options:0 range:NSMakeRange(0, input.length)
    usingBlock:^(NSTextCheckingResult *result, NSMatchingFlags flags, BOOL *stop)
{
    // Group 2 is only set when the match is a \uXXXX escape.
    NSRange codeRange = [result rangeAtIndex:2];
    if (codeRange.location != NSNotFound) {
        NSString *codeStr = [input substringWithRange:codeRange];
        NSScanner *scanner = [NSScanner scannerWithString:codeStr];
        unsigned int code;
        if ([scanner scanHexInt:&code]) {
            unichar c = (unichar)code;
            [output appendString:[NSString stringWithCharacters:&c length:1]];
        }
    } else {
        // Any other character is copied through unchanged.
        [output appendString:[input substringWithRange:result.range]];
    }
}];
NSLog(@"  actual: %@", output);
NSLog(@"expected: %@", @"\u0622\u062f\u0631\u0633 123 test -_- \u0627\u06cc\u0645\u06cc\u0644");
Explanation
This is using a regex that finds blocks of 6 characters like \uXXXX, for example \u062f. It extracts the code as a string like 062f, and then uses NSScanner.scanHexInt to convert it to a number. It assumes that this number is a valid unichar, and builds a string from it.
Note the \\\\ in the regex: the Objective-C compiler first removes one layer of backslashes, leaving \\, and then the regex compiler removes the second layer, leaving \, which matches a literal backslash. If you have just "u0622u062f..." (without backslashes), try removing \\\\ from the regex.
The second part of the regex (|.) treats non-escaped characters as is.
Caveats
You might also want to make the matching case-insensitive by passing NSRegularExpressionCaseInsensitive in the regex options.
This doesn't handle invalid character codes.
This is not the most performant solution, and you'd better use a proper parsing library to do this at scale.
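The first caveat can be addressed with a small variation of the code above. The following is only a sketch: the function name DecodeUnicodeEscapes is hypothetical, it assumes ARC, and it matches only the escapes (case-insensitively) and copies the text between them in bulk instead of matching every character. Like the original, it does not validate the resulting code units.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (not from the answer): decodes literal \uXXXX sequences
// and copies all other text through unchanged.
static NSString *DecodeUnicodeEscapes(NSString *input) {
    NSError *error = nil;
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:@"\\\\u([0-9a-f]{4})"
                                                  options:NSRegularExpressionCaseInsensitive
                                                    error:&error];
    if (regex == nil) {
        return input; // pattern failed to compile; leave the text untouched
    }

    NSMutableString *output = [NSMutableString string];
    __block NSUInteger cursor = 0; // end of the previously handled escape
    [regex enumerateMatchesInString:input
                            options:0
                              range:NSMakeRange(0, input.length)
                         usingBlock:^(NSTextCheckingResult *match, NSMatchingFlags flags, BOOL *stop) {
        // Copy the literal text between the previous escape and this one.
        NSRange gap = NSMakeRange(cursor, match.range.location - cursor);
        [output appendString:[input substringWithRange:gap]];

        // Convert the four hex digits in capture group 1 to one UTF-16 unit.
        unsigned int code = 0;
        NSString *hex = [input substringWithRange:[match rangeAtIndex:1]];
        if ([[NSScanner scannerWithString:hex] scanHexInt:&code]) {
            unichar c = (unichar)code;
            [output appendString:[NSString stringWithCharacters:&c length:1]];
        }
        cursor = NSMaxRange(match.range);
    }];
    // Copy whatever follows the last escape.
    [output appendString:[input substringFromIndex:cursor]];
    return output;
}
```

Calling DecodeUnicodeEscapes on a string that contains literal backslashes, such as the input variable in the answer, should give the same text as the regex above while visiting far fewer matches.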
Related docs and links
https://developer.apple.com/documentation/foundation/nsregularexpression?language=objc
https://developer.apple.com/documentation/foundation/nsregularexpression/1409687-enumeratematchesinstring?language=objc
How do you use NSRegularExpression's replacementStringForResult:inString:offset:template:
https://developer.apple.com/documentation/foundation/nstextcheckingresult?language=objc
xcode UTF-8 literals
Objective-C parse hex string to integer

Just copy and paste this phrase into a Python shell and press Enter; you will see it in Farsi (Persian). The result is: آدرس ایمیل

Related

String search in objective-c

Given a string like this:
http://files.domain.com/8aa55fc4-3015-400e-80f5-390997b43cf9/c07cb0d2-b7d7-4bfd-b0c3-6f43571e3c29-MyFile.jpg
I need to just locate the string "MyFile", and also tell what kind of image it is (.jpg or .png). How can I accomplish this?
The only thing I can think of is to search backward for the first four characters to get the file extension, then keep searching backward until I find the first hyphen, and assume the file name itself doesn't have any hyphens. But I don't know how to do that. Is there a better way?
Use NSRegularExpression to search for the file name. The search pattern really depends on what you know about the file name. If the "random" numbers and characters before MyFile have a known format, you could take that into account. My proposal below assumes that the file name doesn't contain any minus signs.
NSRegularExpression *regex = [NSRegularExpression
    regularExpressionWithPattern:@"-([:alnum:]*)\\.(jpg|png)$"
    options:0 // NSRegularExpressionSearch is an NSString compare option, not a regex option
    error:nil];
// Get the match for the first capture group.
NSTextCheckingResult *match = [regex firstMatchInString:string options:0
    range:NSMakeRange(0, [string length])];
NSRange matchRange = [match rangeAtIndex:1];
NSString *fileName = [string substringWithRange:matchRange];
NSLog(@"Filename: %@", fileName);
// Get the extension with a simple NSString method.
NSString *extension = [string pathExtension];
NSLog(@"Extension: %@", extension);
[myString lastPathComponent] will get the filename.
[myString pathExtension] will get the extension.
To get the suffix of the filename, I think you'll have to roll your own parse. Is it always the string after the last dash and before the extension?
If so, here's an idea:
- (NSString *)lastLittleBitOfTheFilenameFrom:(NSString *)filename {
    NSInteger fnStart = [filename rangeOfString:@"-" options:NSBackwardsSearch].location + 1;
    NSInteger fnEnd = [filename rangeOfString:@"." options:NSBackwardsSearch].location;
    // might need some error checks here depending on what you expect in the original url
    NSInteger length = fnEnd - fnStart;
    return [filename substringWithRange:NSMakeRange(fnStart, length)];
}
Or, thanks to @Chuck ...
// even more sensitive to unexpected input, but nice and tiny ...
- (NSString *)lastLittleBitOfTheFilenameFrom:(NSString *)filename {
    NSString *nameExt = [[filename componentsSeparatedByString:@"-"] lastObject];
    return [[nameExt componentsSeparatedByString:@"."] objectAtIndex:0];
}
If you have the string in an NSString object, or create one from that string, you can use the rangeOfString: method to accomplish both.
See https://developer.apple.com/library/mac/#documentation/Cocoa/Reference/Foundation/Classes/NSString_Class/Reference/NSString.html for more details.
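The NSString path methods and a backwards rangeOfString: search mentioned in the answers can be combined into one small function with the missing error check added. This is only a sketch: the name NameAfterLastHyphen is hypothetical, and it assumes the wanted part is whatever follows the last hyphen in the file name.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: returns the text between the last "-" and the extension
// of the last path component, or nil when the file name contains no "-".
static NSString *NameAfterLastHyphen(NSString *urlString) {
    NSString *fileName = [urlString lastPathComponent];            // e.g. "...-MyFile.jpg"
    NSString *baseName = [fileName stringByDeletingPathExtension]; // e.g. "...-MyFile"
    NSRange hyphen = [baseName rangeOfString:@"-" options:NSBackwardsSearch];
    if (hyphen.location == NSNotFound) {
        return nil; // nothing to split on
    }
    return [baseName substringFromIndex:NSMaxRange(hyphen)];
}
```

The image type is still available from [urlString pathExtension], so no regular expression is needed for this particular URL shape.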

Objective-C NSString character substitution

I have an NSString category I am working on to perform character substitution similar to PHP's strtr. This method takes a string and replaces every occurrence of each character in fromString with the character at the same index in toString. I have a working method, but it is not very performant; I would like to make it quicker and able to handle megabytes of data.
Edit (for clarity):
stringByReplacingOccurrencesOfString:withString:options:range: will not work. I have to take a string like "ABC" and after replacing "A" with "B" and "B" with "A" end up with "BAC". Successive invocations of stringByReplacingOccurrencesOfString:withString:options:range: would make a string like "AAC" which would be incorrect.
Suggestions would be great, sample code would be even better!
Code:
- (NSString *)stringBySubstitutingCharactersFromString:(NSString *)fromString
                                              toString:(NSString *)toString
{
    NSMutableString *substitutedString = [self mutableCopy];
    NSString *aCharacterString;
    NSUInteger characterIndex
             , stringLength = substitutedString.length;
    for (NSUInteger i = 0; i < stringLength; ++i) {
        aCharacterString = [NSString stringWithFormat:@"%C", [substitutedString characterAtIndex:i]];
        characterIndex = [fromString rangeOfString:aCharacterString].location;
        if (characterIndex == NSNotFound) continue;
        [substitutedString replaceCharactersInRange:NSMakeRange(i, 1)
                                         withString:[NSString stringWithFormat:@"%C", [toString characterAtIndex:characterIndex]]];
    }
    return substitutedString;
}
Also this code is executed after every change to text in a text view. It is passed the entire string every time. I know that there is a better way to do it, but I do not know how. Any suggestions for this would be most certainly appreciated!
You can do that kind of string substitution with NSRegularExpression, either modifying a mutable string or creating a new immutable string. It works with any two strings to swap (even if they are longer than one character), but you will need to escape any character that has a special meaning in a regular expression (such as \ [ ( . * ? +).
The pattern finds either of the two substrings, with optional text in between, and then swaps the two substrings while preserving the text between them.
// These strings can be of any length
NSString *firstString = @"Axa";
NSString *secondString = @"By";
// Escaping of characters used in regular expressions has NOT been done here
NSString *pattern = [NSString stringWithFormat:@"(%@|%@)(.*?)(%@|%@)", firstString, secondString, firstString, secondString];
NSString *string = @"AxaByCAxaCBy";
NSError *error = NULL;
NSRegularExpression *regex = [NSRegularExpression regularExpressionWithPattern:pattern
                                                                       options:NSRegularExpressionCaseInsensitive
                                                                         error:&error];
if (error) {
    // Insert error handling here...
}
NSString *modifiedString = [regex stringByReplacingMatchesInString:string
                                                           options:0
                                                             range:NSMakeRange(0, [string length])
                                                      withTemplate:@"$3$2$1"];
NSLog(@"Before:\t%@", string); // AxaByCAxaCBy
NSLog(@"After: \t%@", modifiedString); // ByAxaCByCAxa
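For the per-character case in the question (strtr-style mapping over megabytes of text), a lookup table applied in one pass avoids both the regex and the per-character NSString allocations. The following is a sketch under stated assumptions rather than a drop-in replacement: the function name SubstituteCharacters is hypothetical, it works on UTF-16 code units (so characters outside the Basic Multilingual Plane are not handled as units), and when a character appears twice in fromString the first mapping wins, as in the question's code.

```objc
#import <Foundation/Foundation.h>
#include <stdlib.h>

// Hypothetical helper: maps each UTF-16 unit found in fromString to the unit
// at the same index in toString, in a single pass over the source.
static NSString *SubstituteCharacters(NSString *source, NSString *fromString, NSString *toString) {
    NSUInteger length = source.length;
    NSUInteger pairs = MIN(fromString.length, toString.length);
    if (length == 0 || pairs == 0) {
        return [source copy];
    }

    unichar *table = malloc(65536 * sizeof(unichar));   // one entry per UTF-16 unit
    unichar *buffer = malloc(length * sizeof(unichar)); // working copy of the text
    if (table == NULL || buffer == NULL) {
        free(table);
        free(buffer);
        return [source copy];
    }

    // Start with the identity mapping, then apply the requested substitutions.
    for (NSUInteger u = 0; u < 65536; u++) {
        table[u] = (unichar)u;
    }
    // Walk backwards so that the first occurrence in fromString takes priority.
    for (NSUInteger i = pairs; i > 0; i--) {
        table[[fromString characterAtIndex:i - 1]] = [toString characterAtIndex:i - 1];
    }

    [source getCharacters:buffer range:NSMakeRange(0, length)];
    for (NSUInteger i = 0; i < length; i++) {
        buffer[i] = table[buffer[i]];
    }
    free(table);
    // The new string takes ownership of buffer and frees it when deallocated.
    return [[NSString alloc] initWithCharactersNoCopy:buffer length:length freeWhenDone:YES];
}
```

Because every character is looked up against the original text, "A" to "B" and "B" to "A" are applied simultaneously, so "ABC" becomes "BAC" rather than "AAC".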

capturing a string before and after some data using regular expressions in ObjectiveC

I am relatively new to regex expressions and needed some advice.
The goal is to get the data in the following format into an array:
value=777
value=888
From this data: "value=!@#777!@#value=@#$**888***"
Here is my code (Objective C):
NSString *aTestString = @"value=!@#777!@#value=@#$**888***";
NSRegularExpression *regex = [NSRegularExpression regularExpressionWithPattern:@"value=(?=[^\d])(\d)" options:0 error:&anError];
So my questions are:
1) Can the regex engine capture data that is split like that? Retrieving the "value=" removing the garbage data in the middle, and then grouping it with its number "777" etc?
2) If this can be done, then is my regex expression valid? value=(?=[^\d])(\d)
The lookahead (?=) is wrong here. You also haven't correctly escaped the \d (in an Objective-C string literal it becomes \\d), and, last but not least, you left out the quantifiers * (0 or more times) and + (1 or more times):
NSString *aTestString = @"value=!@#777!@#value=@#$**888***";
NSRegularExpression *regex = [NSRegularExpression
    regularExpressionWithPattern:@"value=[^\\d]*(\\d+)"
    options:0
    error:NULL
];
[regex
    enumerateMatchesInString:aTestString
    options:0
    range:NSMakeRange(0, [aTestString length])
    usingBlock:^(NSTextCheckingResult *result, NSMatchingFlags flags, BOOL *stop) {
        NSLog(@"Value: %@", [aTestString substringWithRange:[result rangeAtIndex:1]]);
    }
];
Edit: Here's a more refined pattern. It catches a word before =, then discards non-digits and catches digits afterwards.
NSString *aTestString = @"foo=!@#777!@#bar=@#$**888***";
NSRegularExpression *regex = [NSRegularExpression regularExpressionWithPattern:@"(\\w+)=[^\\d]*(\\d+)" options:0 error:NULL];
[regex
    enumerateMatchesInString:aTestString
    options:0
    range:NSMakeRange(0, [aTestString length])
    usingBlock:^(NSTextCheckingResult *result, NSMatchingFlags flags, BOOL *stop) {
        NSLog(
            @"Found: %@=%@",
            [aTestString substringWithRange:[result rangeAtIndex:1]],
            [aTestString substringWithRange:[result rangeAtIndex:2]]
        );
    }
];
// Output:
// Found: foo=777
// Found: bar=888
Regular expressions match text against a pattern. An expression like "value=[@#!%^&]*[0-9]*" would match a string like "value=!@#777": the literal "value=", then any run of the characters @, #, !, %, ^, and &, and finally any run of digits. But a match is always one contiguous range of the input, so a single match cannot itself be just the parts that you want, i.e. "value=777"; you still have to pick out the pieces (for example with capture groups) and join them.
So, one solution would be to create an expression that recognizes strings such as "value=!@#777", and then do some further processing on that string to remove the offending characters.
I think you'll be better off using NSScanner to scan the data and extract the parts you want. For example, you can use -scanString:intoString: to get the "value=" part, followed by -scanCharactersFromSet:intoString: to remove the part you don't want, and then call that method again to get the collection of digits.
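The NSScanner approach might look roughly like the sketch below. The function name ExtractValuePairs is hypothetical, the key is hard-coded to "value=", and the junk is skipped with -scanUpToCharactersFromSet:intoString: (scan until a digit) rather than by listing the unwanted characters, which is an assumption about what counts as junk.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: collects "value=<digits>" for every "value=" key in the
// input, ignoring any non-digit characters between the key and its number.
static NSArray<NSString *> *ExtractValuePairs(NSString *input) {
    NSMutableArray<NSString *> *pairs = [NSMutableArray array];
    NSCharacterSet *digits = [NSCharacterSet characterSetWithCharactersInString:@"0123456789"];
    NSScanner *scanner = [NSScanner scannerWithString:input];
    scanner.charactersToBeSkipped = nil; // do not silently skip whitespace

    while (!scanner.isAtEnd) {
        // Move to the next "value=" key; stop if there is none left.
        [scanner scanUpToString:@"value=" intoString:NULL];
        if (![scanner scanString:@"value=" intoString:NULL]) {
            break;
        }
        // Discard everything before the number, then read the digits.
        [scanner scanUpToCharactersFromSet:digits intoString:NULL];
        NSString *number = nil;
        if ([scanner scanCharactersFromSet:digits intoString:&number]) {
            [pairs addObject:[@"value=" stringByAppendingString:number]];
        }
    }
    return pairs;
}
```

For the test string in the question this yields the two entries asked for, value=777 and value=888, in an array.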

How to get regex to grab all letters from an objective c string?

I'm trying to get the following regular expression to grab only the letters from an alpha-numeric character input box, however it's always returning the full string, and not any of the A-Z letters.
What am I doing wrong?
It needs to grab all the letters only. No weird characters and no numbers, just A-Z and put it into a string for me to use later on.
// A default follows
NSString *TAXCODE = txtTaxCode.text;
// Setup default for taxcode
if ([TAXCODE length] == 0)
{
    TAXCODE = @"647L";
}
NSError *error = NULL;
NSRegularExpression *regex;
regex = [NSRegularExpression regularExpressionWithPattern:@"/[^A-Z]/gi"
                                                  options:NSRegularExpressionCaseInsensitive
                                                    error:&error];
NSLog(@"TAXCODE = %@", TAXCODE);
NSLog(@"TAXCODE.length = %lu", (unsigned long)[TAXCODE length]);
NSLog(@"STC (before regex) = %@", STC);
STC = [regex stringByReplacingMatchesInString:TAXCODE
                                      options:0
                                        range:NSMakeRange(0, [TAXCODE length])
                                 withTemplate:@""];
NSLog(@"STC (after regex) = %@", STC);
My debug output is as follows:
TAXCODE = 647L
TAXCODE.length = 4
STC (before regex) =
STC (after regex) = 647L
If you are only ever going to have letters on one end, you could trim the digits:
NSString *TAXCODE = @"647L";
NSString *newcode = [TAXCODE stringByTrimmingCharactersInSet:[NSCharacterSet decimalDigitCharacterSet]];
If letters and digits are intermixed, you can get an array of the pieces between the digits to work with:
NSString *TAXCODE = @"L6J47L";
NSArray *newcodeArray = [TAXCODE componentsSeparatedByCharactersInSet:[NSCharacterSet decimalDigitCharacterSet]];
I think you need to drop the Perl syntax from the regexp. Use @"[^A-Z]" as the pattern.
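Put together, the fix is to pass only the bare pattern, since NSRegularExpression has no /.../gi delimiters or flags; with the slashes left in, the pattern looks for a literal "/" and never matches. A minimal sketch follows; the function name LettersOnly is hypothetical, and it spells out both cases in the character class instead of relying on the case-insensitive option.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: removes every character that is not an ASCII letter.
static NSString *LettersOnly(NSString *input) {
    NSError *error = nil;
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:@"[^A-Za-z]"
                                                  options:0
                                                    error:&error];
    if (regex == nil) {
        return nil; // pattern failed to compile
    }
    // Replace each non-letter with the empty string, keeping only the letters.
    return [regex stringByReplacingMatchesInString:input
                                           options:0
                                             range:NSMakeRange(0, input.length)
                                      withTemplate:@""];
}
```

With the default tax code from the question, LettersOnly(@"647L") leaves just the letter L.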

Is there an Objective-c regex replace with callback/C# MatchEvaluator equivalent?

I have a C# project I'm intending to port to Objective-C. From what I understand about Obj-C, it looks like there's a confusing variety of Regex options but I can't see anything about a way of doing a replace with callback.
I'm looking for something that is the equivalent of the C# MatchEvaluator delegate or PHP's preg_replace_callback. An example of what I want to do in C# is -
// change input so each word is followed a number showing how many letters it has
string inputString = "Hello, how are you today ?";
Regex theRegex = new Regex(@"\w+");
string outputString = theRegex.Replace(inputString, delegate (Match thisMatch){
return thisMatch.Value + thisMatch.Value.Length;
});
// outputString is now 'Hello5, how3 are3 you3 today5 ?'
How could I do this in Objective-C ? In my actual situation the Regex has both lookahead and lookbehind assertions in it though, so any alternative involving finding the strings in advance and then doing a series of straight string replaces won't work unfortunately.
Foundation has an NSRegularExpression class (iOS 4 and later), which may be useful to you. From the docs:
The fundamental matching method for NSRegularExpression is a Block iterator method that allows clients to supply a Block object which will be invoked each time the regular expression matches a portion of the target string. There are additional convenience methods for returning all the matches as an array, the total number of matches, the first match, and the range of the first match.
For example:
NSString *input = @"Hello, how are you today?";
// make a copy of the input string. we are going to edit this one as we iterate
NSMutableString *output = [NSMutableString stringWithString:input];
NSError *error = NULL;
NSRegularExpression *regex = [NSRegularExpression
    regularExpressionWithPattern:@"\\w+"
    options:NSRegularExpressionCaseInsensitive
    error:&error];
// keep track of how many additional characters we've added so far
__block NSUInteger count = 0;
[regex enumerateMatchesInString:input
                        options:0
                          range:NSMakeRange(0, [input length])
                     usingBlock:^(NSTextCheckingResult *match, NSMatchingFlags flags, BOOL *stop){
    // Note that Blocks in Objective C are basically closures
    // so they will keep a constant copy of variables that were in scope
    // when the block was declared
    // unless you prefix the variable with the __block qualifier
    // match.range is a C struct
    // match.range.location is the character offset of the match
    // match.range.length is the length of the match
    NSString *matchedword = [input substringWithRange:match.range];
    // the matched word with the length appended
    NSString *new = [matchedword stringByAppendingFormat:@"%lu", (unsigned long)[matchedword length]];
    // every iteration, the output string is getting longer
    // so we need to adjust the range that we are editing
    NSRange newrange = NSMakeRange(match.range.location+count, match.range.length);
    [output replaceCharactersInRange:newrange withString:new];
    // the appended number may be more than one digit long
    count += [new length] - match.range.length;
}];
NSLog(@"%@", output); //output: Hello5, how3 are3 you3 today5?
I modified atshum's code to make it a bit more flexible:
__block NSUInteger prevEndPosition = 0;
[regex enumerateMatchesInString:text
                        options:0
                          range:NSMakeRange(0, [text length])
                     usingBlock:^(NSTextCheckingResult *match, NSMatchingFlags flags, BOOL *stop)
{
    NSRange r = {.location = prevEndPosition, .length = match.range.location - prevEndPosition};
    // Copy everything without modification between previous replacement and new one
    [output appendString:[text substringWithRange:r]];
    // Append the replacement string
    [output appendString:@"REPLACED"];
    prevEndPosition = match.range.location + match.range.length;
}];
// Finalize string end
NSRange r = {.location = prevEndPosition, .length = [text length] - prevEndPosition};
[output appendString:[text substringWithRange:r]];
Seems to work for now (probably needs a bit more testing)
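The append-based loop above can be wrapped into a reusable function that takes the replacement logic as a block, which is roughly what MatchEvaluator and preg_replace_callback provide. This is a sketch, not an existing Foundation API: the names ReplaceMatches and MatchEvaluator are hypothetical (the latter borrowed from C#), and the block is assumed never to return nil.

```objc
#import <Foundation/Foundation.h>

// Hypothetical block type: receives each match and its text, returns the replacement.
typedef NSString * (^MatchEvaluator)(NSTextCheckingResult *match, NSString *matchedText);

// Hypothetical helper: builds a new string in which every match of regex in
// input is replaced by whatever the evaluator block returns for it.
static NSString *ReplaceMatches(NSRegularExpression *regex, NSString *input, MatchEvaluator evaluator) {
    NSMutableString *output = [NSMutableString string];
    __block NSUInteger cursor = 0; // end of the previous match
    [regex enumerateMatchesInString:input
                            options:0
                              range:NSMakeRange(0, input.length)
                         usingBlock:^(NSTextCheckingResult *match, NSMatchingFlags flags, BOOL *stop) {
        NSRange range = match.range;
        // Copy the unmatched text between the previous match and this one.
        [output appendString:[input substringWithRange:NSMakeRange(cursor, range.location - cursor)]];
        // Ask the caller for the replacement of this match.
        [output appendString:evaluator(match, [input substringWithRange:range])];
        cursor = NSMaxRange(range);
    }];
    // Copy the tail after the last match.
    [output appendString:[input substringFromIndex:cursor]];
    return output;
}
```

Because the regex runs against the untouched input, lookahead and lookbehind assertions see the original text, and replacements of any length need no offset bookkeeping. The C# example from the question would pass a block that returns the matched text with its length appended.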