Matching the entire input with Regex, Objective-C

I have an application that communicates with a serial port. I am looking to create a packet descriptor with regex that can recognize the expression.
The string is !$S0, 0, 48, 3 and I want the regex to recognize any digit.
- (IBAction)getStatus:(id)sender
{
NSRegularExpression* regex = [NSRegularExpression regularExpressionWithPattern:@"[(!$S\\d,\\s\\d,\\s\\d,\\s\\d)]" options:0 error:nil];
self.getStatus = [[ORSSerialPacketDescriptor alloc] initWithRegularExpression:regex maximumPacketLength:20 userInfo:nil];
[self.serialPort startListeningForPacketsMatchingDescriptor:self.getStatus];
NSString *command = @"$S";
command = [command stringByAppendingString:[self lineEndingString]];
NSData *dataToSend = [command dataUsingEncoding:NSUTF8StringEncoding];
[self.serialPort sendData:dataToSend];
}
I expect it to pull the whole response so that I can process the string here:
- (void)serialPort:(ORSSerialPort *)serialPort didReceivePacket:(NSData *)packetData matchingDescriptor:(ORSSerialPacketDescriptor *)descriptor {
NSString *asciString = [[NSString alloc] initWithData:packetData encoding:NSASCIIStringEncoding];
NSLog(@"package[asci]: %@", asciString);
if (descriptor == self.getStatus) {
}
}
Any help would be appreciated.

You may use
@"!\\$S\\d+(?:,\\s+\\d+){3}"
Enclose with ^ (start of string) and $ (end of string) if you plan to match the string exactly:
@"^!\\$S\\d+(?:,\\s+\\d+){3}$"
See the regex demo
Details
^ - start of string
! - a !
\\$ - a $ symbol (must be escaped)
S - an S letter
\\d+ - 1 or more digits
(?:,\\s+\\d+){3} - 3 consecutive sequences of:
, - a comma
\\s+ - 1 or more whitespaces
\\d+ - 1 or more digits
$ - end of string.
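As a quick check of the anchored pattern above, here is a minimal sketch that runs it against the sample response from the question. The helper name IsStatusPacket is hypothetical and only serves the illustration; it is not part of ORSSerialPort.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: returns YES when `response` is exactly one status
// packet such as "!$S0, 0, 48, 3".
static BOOL IsStatusPacket(NSString *response) {
    NSError *error = nil;
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:@"^!\\$S\\d+(?:,\\s+\\d+){3}$"
                                                  options:0
                                                    error:&error];
    if (regex == nil) {
        NSLog(@"Invalid pattern: %@", error);
        return NO;
    }
    // Search the whole string; the anchors force a full-string match.
    NSRange whole = NSMakeRange(0, response.length);
    return [regex numberOfMatchesInString:response options:0 range:whole] == 1;
}
```

Calling IsStatusPacket(@"!$S0, 0, 48, 3") should return YES, while a response missing the leading ! or one of the four numbers should return NO.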

Related

In my macOS application, I am working with UserDefaults dictionaryRepresentation. Sometimes I get strings with unknown encoding. Any suggestions?

I am working with an Objective-C application. Specifically, I am gathering the dictionary representation of NSUserDefaults with this code:
NSUserDefaults *defaults = [NSUserDefaults standardUserDefaults];
NSDictionary *userDefaultsDict = [defaults dictionaryRepresentation];
While enumerating keys and objects of the resulting dict, sometimes I find a kind of opaque string that you can see in the following picture:
So it seems like an encoding problem.
If I try to print description of the string, the debugger correctly prints:
Printing description of obj:
tsuqsx
However, if I try to write obj to a file, or use it in any other way, I get an unreadable output like this:
What I would like to achieve is the following:
Detect in some way that the string has the encoding problem.
Convert the string to UTF8 encoding to use it in the rest of the program.
Any help is greatly appreciated. Thanks
EDIT: Very Hacky possible Solution that helps explaining what I am trying to do.
After trying all possible solutions based on dataUsingEncoding and back, I ended up with the following solution, absolutely weird, but I post it here, in the hope that it can help somebody to guess the encoding and what to do with unprintable characters:
- (BOOL)isProblematicString:(NSString *)candidateString {
BOOL returnValue = YES;
if ([candidateString length] <= 2) {
return NO;
}
const char *temp = [candidateString UTF8String];
long length = temp[0];
char *dest = malloc(length + 1);
long ctr = 1;
long usefulCounter = 0;
for (ctr = 1;ctr <= length;ctr++) {
if ((ctr - 1) % 3 == 0) {
memcpy(&dest[ctr - usefulCounter - 1],&temp[ctr],1);
} else {
if (ctr != 1 && ctr < [candidateString length]) {
if (temp[ctr] < 0x10 || temp[ctr] > 0x1F) {
returnValue = NO;
}
}
usefulCounter += 1;
}
}
memset(&dest[length],0,1);
free(dest);
return returnValue;
}
- (NSString *)utf8StringFromUnknownEncodedString:(NSString*)originalUnknownString {
const char *temp = [originalUnknownString UTF8String];
long length = temp[0];
char *dest = malloc(length + 1);
long ctr = 1;
long usefulCounter = 0;
for (ctr = 1;ctr <= length;ctr++) {
if ((ctr - 1) % 3 == 0) {
memcpy(&dest[ctr - usefulCounter - 1],&temp[ctr],1);
} else {
usefulCounter += 1;
}
}
memset(&dest[length],0,1);
NSString *returnValue = [[NSString alloc] initWithUTF8String:dest];
free(dest);
return returnValue;
}
This returns me a string that I can use to build a full UTF8 string. I am looking for a clean solution. Any help is greatly appreciated. Thanks
We're talking about a string which comes from the /Library/Preferences/.GlobalPreferences.plist
(key com.apple.preferences.timezone.new.selected_city).
NSString *city = [[NSUserDefaults standardUserDefaults]
stringForKey:@"com.apple.preferences.timezone.new.selected_city"];
NSLog(@"%@", city); // \^Zt\^\\^]s\^]\^\u\^V\^_q\^]\^[s\^W\^Zx\^P
(lldb) p [city description]
(__NSCFString *) $1 = 0x0000600003f6c240 @"\x1at\x1c\x1ds\x1d\x1cu\x16\x1fq\x1d\x1bs\x17\x1ax\x10"
What I would like to achieve is the following:
Detect in some way that the string has the encoding problem.
Convert the string to UTF8 encoding to use it in the rest of the program.
&
After trying all possible solutions based on dataUsingEncoding and back.
This string has no encoding problem and characters like \x1a, \x1c, ... are valid characters.
You can call dataUsingEncoding: with ASCII, UTF-8, ... but all these characters will still be
present. They're called control characters (or non-printing characters). The linked Wikipedia page explains what these characters are and how they're defined in ASCII, extended ASCII and Unicode.
What you're looking for is a way to remove control characters from a string.
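For the detection half of the question, a minimal sketch: a string can be checked for control characters with rangeOfCharacterFromSet:. The function name ContainsControlCharacters is hypothetical. Note that this only reports whether such characters are present; it says nothing about the string's encoding.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: YES if `string` contains at least one character from
// NSCharacterSet.controlCharacterSet (Unicode categories Cc and Cf).
static BOOL ContainsControlCharacters(NSString *string) {
    NSRange found = [string rangeOfCharacterFromSet:NSCharacterSet.controlCharacterSet];
    return found.location != NSNotFound;
}
```

With this, ContainsControlCharacters(@"\x1at\x1cs") should be YES and ContainsControlCharacters(@"tsuqsx") should be NO.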
Remove control characters
We can create a category for our new method:
@interface NSString (ControlCharacters)
- (NSString *)stringByRemovingControlCharacters;
@end
@implementation NSString (ControlCharacters)
- (NSString *)stringByRemovingControlCharacters {
// TODO Remove control characters
return self;
}
@end
In all examples below, the city variable is created in this way ...
NSString *city = [[NSUserDefaults standardUserDefaults]
stringForKey:@"com.apple.preferences.timezone.new.selected_city"];
... and contains @"\x1at\x1c\x1ds\x1d\x1cu\x16\x1fq\x1d\x1bs\x17\x1ax\x10". Also all
examples below were tested with the following code:
NSString *cityWithoutCC = [city stringByRemovingControlCharacters];
// tsuqsx
NSLog(@"%@", cityWithoutCC);
// {length = 6, bytes = 0x747375717378}
NSLog(@"%@", [cityWithoutCC dataUsingEncoding:NSUTF8StringEncoding]);
Split & join
One way is to utilize the NSCharacterSet.controlCharacterSet.
There's a stringByTrimmingCharactersInSet:
method (NSString), but it removes these characters from the beginning/end only,
which is not what you're looking for. There's a trick you can use:
- (NSString *)stringByRemovingControlCharacters {
NSArray<NSString *> *components = [self componentsSeparatedByCharactersInSet:NSCharacterSet.controlCharacterSet];
return [components componentsJoinedByString:@""];
}
It splits the string by control characters and then joins these components back. Not a very efficient way, but it works.
ICU transform
Another way is to use ICU transform (see ICU User Guide).
There's a stringByApplyingTransform:reverse:
method (NSString), but it only accepts predefined constants. Documentation says:
The constants defined by the NSStringTransform type offer a subset of the functionality provided by the underlying ICU transform functionality. To apply an ICU transform defined in the ICU User Guide that doesn't have a corresponding NSStringTransform constant, create an instance of NSMutableString and call the applyTransform:reverse:range:updatedRange: method instead.
Let's update our implementation:
- (NSString *)stringByRemovingControlCharacters {
NSMutableString *result = [self mutableCopy];
[result applyTransform:@"[[:Cc:] [:Cf:]] Remove"
reverse:NO
range:NSMakeRange(0, self.length)
updatedRange:nil];
return result;
}
[:Cc:] represents control characters, [:Cf:] represents format characters. Together they cover the same characters as the already mentioned NSCharacterSet.controlCharacterSet. Documentation:
A character set containing the characters in Unicode General Category Cc and Cf.
Iterate over characters
NSCharacterSet also offers the characterIsMember: method. Here we need to iterate over characters (unichar) and check if it's a control character or not.
Let's update our implementation:
- (NSString *)stringByRemovingControlCharacters {
if (self.length == 0) {
return self;
}
NSUInteger length = self.length;
unichar characters[length];
[self getCharacters:characters];
NSUInteger resultLength = 0;
unichar result[length];
NSCharacterSet *controlCharacterSet = NSCharacterSet.controlCharacterSet;
for (NSUInteger i = 0 ; i < length ; i++) {
if ([controlCharacterSet characterIsMember:characters[i]] == NO) {
result[resultLength++] = characters[i];
}
}
return [NSString stringWithCharacters:result length:resultLength];
}
Here we filter out all characters (unichar) which belong to the controlCharacterSet.
Other ways
There are other ways to iterate over characters, for example: Most efficient way to iterate over all the chars in an NSString.
BBEdit & others
Let's write this string to a file:
NSString *city = [[NSUserDefaults standardUserDefaults]
stringForKey:#"com.apple.preferences.timezone.new.selected_city"];
[city writeToFile:@"/Users/zrzka/city.txt"
atomically:YES
encoding:NSUTF8StringEncoding
error:nil];
It's up to the editor how all these control characters are handled/displayed. Here's an example - Visual Studio Code.
View - Render Control Characters off:
View - Render Control Characters on:
BBEdit displays question marks (upside down), but I'm sure there's a way to
toggle control character rendering. I don't have BBEdit installed to verify it.

Regex to reject sequence of Digits

I need to validate phone number. Below is the code snippet
-(BOOL) validatePhone:(NSString*) phoneString
{
NSString *regExPattern = @"^[6-9]\\d{9}$"; // ORIGINAL
// NSString *regExPattern = @"^[6-9](\\d)(?!\1+$)\\d*$";
NSRegularExpression *regEx = [[NSRegularExpression alloc] initWithPattern:regExPattern options:NSRegularExpressionCaseInsensitive error:nil];
NSUInteger regExMatches = [regEx numberOfMatchesInString:phoneString options:0 range:NSMakeRange(0, [phoneString length])];
NSLog(@"%lu", (unsigned long)regExMatches);
if (regExMatches == 0) {
return NO;
}
else
return YES;
}
I want to reject phone numbers that contain a run of repeated digits, for example
9999999999, 6666677777
It seems you want to disallow 5 and more identical consecutive digits.
Use
@"^[6-9](?!\\d*(\\d)\\1{4})\\d{9}$"
See the regex demo
Details
^ - start of string
[6-9] - a digit from 6 to 9
(?!\d*(\d)\1{4}) - a negative lookahead that fails the match if, immediately to the right of the current location, there is
\d* - 0+ digits
(\d) - a digit captured into Group 1
\1{4} - the same digit as captured in Group 1 repeated four times
\d{9} - any 9 digits
$ - end of string (replace with \z to match the very end of string to disallow the match before the final LF symbol in the string).
Note that \d is Unicode aware in the ICU regex library, thus it might be safer to use [0-9] instead of \d.
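The pattern can be exercised with a short sketch like the one below. The function name IsAcceptablePhone is hypothetical; the pattern is the one given above, unchanged.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: YES when `phone` matches the pattern from the answer
// (first digit 6-9, nine more digits, no run of 5 identical digits found
// after the first digit).
static BOOL IsAcceptablePhone(NSString *phone) {
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:@"^[6-9](?!\\d*(\\d)\\1{4})\\d{9}$"
                                                  options:0
                                                    error:NULL];
    if (regex == nil) {
        return NO;
    }
    NSRange whole = NSMakeRange(0, phone.length);
    return [regex firstMatchInString:phone options:0 range:whole] != nil;
}
```

With this pattern 9999999999 and 6666677777 are rejected and 9876543210 is accepted. One thing to be aware of: the lookahead sits after [6-9], so the leading digit is not counted as part of a run, and 6666612345 is still accepted. If such numbers should be rejected as well, one option is to move the lookahead to the front: ^(?!\d*(\d)\1{4})[6-9]\d{9}$.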

how to replace in string using regex

I have below as string
name : abc,
position : 2
I want to make replace so that the string becomes as below
name : "abc",
position : 2
What change I want to do is abc will have double quotes so abc becomes "abc".
Note: abc is dynamic, it can be anything as below.
name : Test,
position : 2
name : Great,
position : 2
name : developers,
position : 2
Any idea how this can be done?
I suggest using \\b(name\\s*:\\s*)(.+), pattern and replace with $1"$2",:
NSError *error = nil;
NSString *myText = @"name : abc,\nposition : 2";
NSRegularExpression *regex = [NSRegularExpression regularExpressionWithPattern:@"\\b(name\\s*:\\s*)(.+)," options:0 error:&error];
NSString *modifiedString = [regex stringByReplacingMatchesInString:myText options:0 range:NSMakeRange(0, [myText length]) withTemplate:@"$1\"$2\","];
NSLog(@"%@", modifiedString);
See the Objective-C demo
Details:
\\b - a leading word boundary
(name\\s*:\\s*) - Group 1 matching name, 0+ whitespaces, : and 0+ whitespaces again
(.+) - Group 2 matching any 1+ chars other than line break chars, as many as possible
, - comma
The replacement pattern - $1"$2", - inserts Group 1 contents, ", Group 2 contents and ",.
See the regex demo.

How to use RegEx to support single line mode in textview?

I set my custom textview to support regExPatternValidation = @"^[0-9]{0,10}$";
and use the following method to accomplish my validation:
+ (BOOL)validateString:(NSString *)string withRegExPattern:(NSString *)regexPattern
{
BOOL doesValidate = NO;
NSError *error = nil;
NSRegularExpression *regex = [NSRegularExpression regularExpressionWithPattern:regexPattern
options:NSRegularExpressionCaseInsensitive
error:&error];
if (error)
{
DDLogError(@"%@:%@ : regular expression error: [%@]", THIS_FILE, THIS_METHOD, error.description);
return doesValidate;
}
NSRange textRange = NSMakeRange(0, string.length);
NSUInteger regExMatches = [regex numberOfMatchesInString:string options:NSMatchingReportCompletion range:textRange];
if (regExMatches > 0 && regExMatches <= textRange.length+1)
{
doesValidate = YES;
}
else
{
doesValidate = NO;
}
return doesValidate;
}
One of its purposes is to control single or multi line modes. For some strange reason, when I hit the Return key (\n), the numberOfMatchesInString: still returns 1 match. Even though my regex pattern has no inclusion to support \n characters.
Is it possible to accomplish this feature using regex in Objective-C?
The issue you have has its roots in how anchors ^ and $ work.
^ matches at the beginning of the string (right before the first character, which is \n in this case), and $ matches at the end of the string or just before a line break at the very end of it. When you press Return, your string is just \n, so the pattern matches the empty position in front of it.
In your case [0-9]* (or [0-9]{0,10} in your pattern) can match an empty string because the quantifier allows 0 occurrences of the preceding pattern.
So, you can avoid matching a lone newline with a negative look-ahead:
@"^(?!\n$)[0-9]*$"
It will not match a string that consists of just a newline. See this demo.
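An alternative sketch, under the assumption that single-line mode should reject any string containing a line break (not only a lone \n): end the pattern with \z, which matches only at the very end of the input, instead of $. The helper name IsSingleLineDigits is hypothetical.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: YES when `text` is up to 10 ASCII digits and nothing
// else. \z matches only at the absolute end of the input, so a trailing
// newline makes the match fail.
static BOOL IsSingleLineDigits(NSString *text) {
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:@"^[0-9]{0,10}\\z"
                                                  options:0
                                                    error:NULL];
    if (regex == nil) {
        return NO;
    }
    NSRange whole = NSMakeRange(0, text.length);
    return [regex firstMatchInString:text options:0 range:whole] != nil;
}
```

Here @"12345" passes, while @"\n" and @"123\n" are both rejected.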

why is code falling on substringwithrange

Hi, here is my function. Every time I try to alloc and init foo it crashes. Can you tell me why?
-(NSString*)modifyTheCode:(NSString*) theCode{
if ([[theCode substringToIndex:1]isEqualToString:@"0"]) {
if ([theCode length] == 1 ) {
return @"0000000000";
}
NSString* foo = [[NSString alloc]initWithString:[theCode substringWithRange:NSMakeRange(2, [theCode length]-1)]];
return [self modifyTheCode:foo];
} else {
return theCode;
}
}
the error message:
warning: Unable to read symbols for /Developer/Platforms/iPhoneOS.platform/DeviceSupport/4.3.2 (8H7)/Symbols/Developer/usr/lib/libXcodeDebuggerSupport.dylib (file not found).
replace this line
NSString* foo = [[NSString alloc]initWithString:[theCode substringWithRange:NSMakeRange(2, [theCode length]-1)]];
with this line
NSString* foo = [[NSString alloc]initWithString:[theCode substringWithRange:NSMakeRange(1, [theCode length]-1)]];
and try..
What is the error message?
If you are working with NSRange maybe you should check the length of theCode first.
Because the range is invalid. NSRange has two members, location and length. The range you give starts at the third character of the string and has the length of the string minus one. So your length is one character longer than the amount of characters left in the string.
Suppose theCode is @"0123". The range you create is { .location = 2, .length = 3 }. This represents:
0123
  ^ start of range is here
     ^ start of range + 3 is off the end of the string.
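The same arithmetic can be checked in code before calling substringWithRange:. This is only a sketch; the helper name RangeFitsString is hypothetical.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: YES when `range` lies entirely inside `string`.
// NSMaxRange(range) is range.location + range.length.
static BOOL RangeFitsString(NSRange range, NSString *string) {
    return NSMaxRange(range) <= string.length;
}
```

For @"0123", NSMakeRange(2, 3) ends at index 5 and does not fit in a string of length 4, while NSMakeRange(1, 3) ends at index 4 and does.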
By the way, you'll be pleased to know that there are convenience methods so you don't have to mess with ranges. You could do:
if ([theCode hasPrefix: @"0"])
{
NSString* foo = [theCode substringFromIndex: 1]; // assumes you just want to strip off the leading @"0"
return [self modifyTheCode:foo];
} else {
return theCode;
}
By the way, your original code leaked foo because you never released it.