Quick way to jumble the order of an NSString? - objective-c

Does anyone know of an existing way to change the order of an existing NSString or NSMutableString's characters? I have a workaround in mind anyway but it would be great if there was an existing method for it.
For example, given the string @"HORSE", is there a method that would return @"ORSEH", @"SORHE", @"ROHES", etc.?

Consider this code:
.h File:
@interface NSString (Scrambling)
+ (NSString *)scrambleString:(NSString *)toScramble;
@end
.m File:
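@implementation NSString (Scrambling)
+ (NSString *)scrambleString:(NSString *)toScramble {
    NSUInteger length = [toScramble length];
    // Nothing to shuffle; this also avoids a modulo by zero below.
    if (length < 2) {
        return toScramble;
    }
    for (NSUInteger i = 0; i < length * 15; i++) {
        // Remove the character at pos, then reinsert it at pos2.
        NSUInteger pos = arc4random_uniform((uint32_t)length);
        // The shortened string has `length` insertion points, including the end.
        NSUInteger pos2 = arc4random_uniform((uint32_t)length);
        unichar ch = [toScramble characterAtIndex:pos];
        NSString *before = [toScramble substringToIndex:pos];
        NSString *after = [toScramble substringFromIndex:pos + 1];
        NSString *temp = [before stringByAppendingString:after];
        before = [temp substringToIndex:pos2];
        after = [temp substringFromIndex:pos2];
        // %C formats a unichar; %c would truncate it to a single byte.
        toScramble = [before stringByAppendingFormat:@"%C%@", ch, after];
    }
    return toScramble;
}
@end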
Not the most beautiful code or execution, but it gets the job done. There's probably a (const char *) way to do this, but this works fine for me. A quick test shows an execution time of 0.001021 seconds on my Mac.
Usage:
NSString *scrambled = [NSString scrambleString:otherString];
Code adapted from another language / pseudocode

You can use Durstenfeld's variation of the Fisher-Yates Shuffle.
For a very long string, you could save a lot of CPU time and allocations by copying the unichars to a unichar buffer, then performing the transform using a C or C++ approach to swap characters. Note that the UTF8String is not the buffer you want to take, nor should you mutate it. Then create (or set) a new NSString from the shuffled buffer.
More info on the Fisher-Yates algorithm and C and C++ implementations can be found here.
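The buffer approach described above could look roughly like the following plain C sketch, which also compiles inside an Objective-C file. The helper name shuffle_code_units is made up for illustration, and rand() is used only to keep the sketch portable; on Apple platforms arc4random_uniform() would be the better source of randomness. Note that it shuffles UTF-16 code units, so a string containing surrogate pairs (emoji, for example) could end up with the pairs split.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: shuffles `count` UTF-16 code units in place using
   Durstenfeld's variation of the Fisher-Yates shuffle. */
void shuffle_code_units(uint16_t *units, size_t count)
{
    if (count < 2) {
        return; /* nothing to shuffle */
    }
    for (size_t i = count - 1; i > 0; i--) {
        /* Pick j in [0, i]. rand() % n is slightly biased, which is
           acceptable for a sketch but not for anything security related. */
        size_t j = (size_t)rand() % (i + 1);
        uint16_t tmp = units[i];
        units[i] = units[j];
        units[j] = tmp;
    }
}
```

From Objective-C, the buffer would be filled with getCharacters:range: and turned back into a string with stringWithCharacters:length:, since unichar is a 16-bit unsigned type.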

Related

Why does my NSMutableString edit sometimes not work?

I'm trying to repair some mis-numbered movie subtitle files (each sub is separated by a blank line). The following code scans up to the faulty subtitle index number in a test file. If I just 'printf' the faulty old indices and replacement new indices, everything appears just as expected.
//######################################################################
-(IBAction)scanToSubIndex:(id)sender
{
NSMutableString* tempString = [[NSMutableString alloc] initWithString:[theTextView string]];
int textLen = (int)[tempString length];
NSScanner *theScanner = [NSScanner scannerWithString:tempString];
while ([theScanner isAtEnd] == NO)
{
[theScanner scanUpToString:@"\r\n\r\n" intoString:NULL];
[theScanner scanString:@"\r\n\r\n" intoString:NULL];
if([theScanner scanLocation] >= textLen)
break;
else
{ // remove OLD subtitle index...
NSString *oldNumStr;
[theScanner scanUpToString:@"\r\n" intoString:&oldNumStr];
printf("old number:%s\n", [oldNumStr UTF8String]);
NSRange range = [tempString rangeOfString:oldNumStr];
[tempString deleteCharactersInRange:range];
// ...and insert SEQUENTIAL index
NSString *newNumStr = [self changeSubIndex];
printf("new number:%s\n\n", [newNumStr UTF8String]);
[tempString insertString:newNumStr atIndex:range.location];
}
}
printf("\ntempString\n\n:%s\n", [tempString UTF8String]);
}
//######################################################################
-(NSString*)changeSubIndex
{
static int newIndex = 1;
// convert int to string and return...
NSString *numString = [NSString stringWithFormat:@"%d", newIndex];
++newIndex;
return numString;
}
When I attempt to write the new indices to the mutable string, however, I end up with disordered results like this:
sub 1
sub 2
sub 3
sub 1
sub 5
sub 6
sub 7
sub 5
sub 9
sub 7
sub 8
An interesting observation (and possible clue?) is that when I reach subtitle number 1000, every number gets written to the mutable string in sequential order as required. I've been struggling with this for a couple of weeks now, and I can't find any other similar questions on SO. Any help much appreciated :-)
NSScanner & NSMutableString
NSMutableString is a subclass of NSString. In other words, you can pass an NSMutableString wherever an NSString is expected. But that doesn't mean you're allowed to modify it afterwards.
scannerWithString: expects an NSString. Translated into human language: I expect a string, and I also expect that the string is read-only (it won't be modified).
In other words, your code is a programmer error: you hand the NSScanner a string it expects to be immutable, and then you modify it.
We don't know what the NSScanner class is doing under the hood. There can be buffering or any other kind of optimization.
Even if you get lucky with the scanLocation fix mentioned in the comments, you shouldn't rely on it, because the underlying implementation can change with any new release.
Don't do this. Not just here, but anywhere you see an immutable data type.
(There are situations where you can do it, but then you need to know exactly what the underlying implementation is doing, be certain that it won't be modified, etc. Generally speaking, it's not a good idea unless you know what you're doing.)
Sample
This sample code is based on the following assumptions:
we're talking about SubRip Text (SRT)
file is small (can easily fit memory)
rest of the SRT file is correct
especially the delimiter (@"\r\n")
@import Foundation;
NS_ASSUME_NONNULL_BEGIN
@interface SubRipText : NSObject
+ (NSString *)fixSubtitleIndexes:(NSString *)string;
@end
NS_ASSUME_NONNULL_END
@implementation SubRipText
+ (NSString *)fixSubtitleIndexes:(NSString *)string {
    NSMutableString *result = [@"" mutableCopy];
    __block BOOL nextLineIsIndex = YES;
    __block NSUInteger index = 1;
    [string enumerateLinesUsingBlock:^(NSString * _Nonnull line, BOOL * _Nonnull stop) {
        if (nextLineIsIndex) {
            // Replace the old index line with the next sequential number.
            [result appendFormat:@"%lu\r\n", (unsigned long)index];
            index++;
            nextLineIsIndex = NO;
            return;
        }
        [result appendFormat:@"%@\r\n", line];
        // A blank line ends a subtitle, so the next line is an index.
        nextLineIsIndex = line.length == 0;
    }];
    return result;
}
@end
Usage:
NSString *test = @"29\r\n"
"00:00:00,498 --> 00:00:02,827\r\n"
"Hallo\r\n"
"\r\n"
"4023\r\n"
"00:00:02,827 --> 00:00:06,383\r\n"
"This is two lines,\r\n"
"subtitles rocks!\r\n"
"\r\n"
"1234\r\n"
"00:00:06,383 --> 00:00:09,427\r\n"
"Maybe not,\r\n"
"just learn English :)\r\n";
NSString *result = [SubRipText fixSubtitleIndexes:test];
NSLog(@"%@", result);
Output:
1
00:00:00,498 --> 00:00:02,827
Hallo
2
00:00:02,827 --> 00:00:06,383
This is two lines,
subtitles rocks!
3
00:00:06,383 --> 00:00:09,427
Maybe not,
just learn English :)
There are other ways to achieve this, but you should think about readability, speed of writing, speed of running, and so on. It depends on your usage: how many files you are going to fix, etc.

In my macOS application, I am working with UserDefaults dictionaryRepresentation. Sometimes I get strings with unknown encoding. Any suggestions?

I am working with an Objective-C application. Specifically, I am gathering the dictionary representation of NSUserDefaults with this code:
NSUserDefaults *defaults = [NSUserDefaults standardUserDefaults];
NSDictionary *userDefaultsDict = [defaults dictionaryRepresentation];
While enumerating keys and objects of the resulting dict, sometimes I find a kind of opaque string that you can see in the following picture:
So it seems like an encoding problem.
If I try to print description of the string, the debugger correctly prints:
Printing description of obj:
tsuqsx
However, if I try to write obj to a file, or use it in any other way, I get an unreadable output like this:
What I would like to achieve is the following:
Detect in some way that the string has the encoding problem.
Convert the string to UTF8 encoding to use it in the rest of the program.
Any help is greatly appreciated. Thanks
EDIT: Very Hacky possible Solution that helps explaining what I am trying to do.
After trying all possible solutions based on dataUsingEncoding and back, I ended up with the following solution, absolutely weird, but I post it here, in the hope that it can help somebody to guess the encoding and what to do with unprintable characters:
- (BOOL)isProblematicString:(NSString *)candidateString {
BOOL returnValue = YES;
if ([candidateString length] <= 2) {
return NO;
}
const char *temp = [candidateString UTF8String];
long length = temp[0];
char *dest = malloc(length + 1);
long ctr = 1;
long usefulCounter = 0;
for (ctr = 1;ctr <= length;ctr++) {
if ((ctr - 1) % 3 == 0) {
memcpy(&dest[ctr - usefulCounter - 1],&temp[ctr],1);
} else {
if (ctr != 1 && ctr < [candidateString length]) {
if (temp[ctr] < 0x10 || temp[ctr] > 0x1F) {
returnValue = NO;
}
}
usefulCounter += 1;
}
}
memset(&dest[length],0,1);
free(dest);
return returnValue;
}
- (NSString *)utf8StringFromUnknownEncodedString:(NSString*)originalUnknownString {
const char *temp = [originalUnknownString UTF8String];
long length = temp[0];
char *dest = malloc(length + 1);
long ctr = 1;
long usefulCounter = 0;
for (ctr = 1;ctr <= length;ctr++) {
if ((ctr - 1) % 3 == 0) {
memcpy(&dest[ctr - usefulCounter - 1],&temp[ctr],1);
} else {
usefulCounter += 1;
}
}
memset(&dest[length],0,1);
NSString *returnValue = [[NSString alloc] initWithUTF8String:dest];
free(dest);
return returnValue;
}
This returns me a string that I can use to build a full UTF8 string. I am looking for a clean solution. Any help is greatly appreciated. Thanks
We're talking about a string which comes from the /Library/Preferences/.GlobalPreferences.plist
(key com.apple.preferences.timezone.new.selected_city).
NSString *city = [[NSUserDefaults standardUserDefaults]
    stringForKey:@"com.apple.preferences.timezone.new.selected_city"];
NSLog(@"%@", city); // \^Zt\^\\^]s\^]\^\u\^V\^_q\^]\^[s\^W\^Zx\^P
(lldb) p [city description]
(__NSCFString *) $1 = 0x0000600003f6c240 @"\x1at\x1c\x1ds\x1d\x1cu\x16\x1fq\x1d\x1bs\x17\x1ax\x10"
What I would like to achieve is the following:
Detect in some way that the string has the encoding problem.
Convert the string to UTF8 encoding to use it in the rest of the program.
&
After trying all possible solutions based on dataUsingEncoding and back.
This string has no encoding problem, and characters like \x1a, \x1c, ... are valid characters. You can call dataUsingEncoding: with ASCII, UTF-8, ... but all these characters will still be present. They're called control characters (or non-printing characters). The linked Wikipedia page explains what these characters are and how they're defined in ASCII, extended ASCII and Unicode.
What you're looking for is a way to remove control characters from a string.
Remove control characters
We can create a category for our new method:
@interface NSString (ControlCharacters)
- (NSString *)stringByRemovingControlCharacters;
@end
@implementation NSString (ControlCharacters)
- (NSString *)stringByRemovingControlCharacters {
    // TODO Remove control characters
    return self;
}
@end
In all examples below, the city variable is created in this way ...
NSString *city = [[NSUserDefaults standardUserDefaults]
    stringForKey:@"com.apple.preferences.timezone.new.selected_city"];
... and contains @"\x1at\x1c\x1ds\x1d\x1cu\x16\x1fq\x1d\x1bs\x17\x1ax\x10". Also, all examples below were tested with the following code:
NSString *cityWithoutCC = [city stringByRemovingControlCharacters];
// tsuqsx
NSLog(@"%@", cityWithoutCC);
// {length = 6, bytes = 0x747375717378}
NSLog(@"%@", [cityWithoutCC dataUsingEncoding:NSUTF8StringEncoding]);
Split & join
One way is to utilize NSCharacterSet.controlCharacterSet. There's a stringByTrimmingCharactersInSet: method (NSString), but it removes these characters from the beginning and end only, which is not what you're looking for. There's a trick you can use:
- (NSString *)stringByRemovingControlCharacters {
    NSArray<NSString *> *components = [self componentsSeparatedByCharactersInSet:NSCharacterSet.controlCharacterSet];
    return [components componentsJoinedByString:@""];
}
It splits the string by control characters and then joins these components back. Not a very efficient way, but it works.
ICU transform
Another way is to use an ICU transform (see the ICU User Guide). There's a stringByApplyingTransform:reverse: method (NSString), but it only accepts predefined constants. The documentation says:
The constants defined by the NSStringTransform type offer a subset of the functionality provided by the underlying ICU transform functionality. To apply an ICU transform defined in the ICU User Guide that doesn't have a corresponding NSStringTransform constant, create an instance of NSMutableString and call the applyTransform:reverse:range:updatedRange: method instead.
Let's update our implementation:
- (NSString *)stringByRemovingControlCharacters {
    NSMutableString *result = [self mutableCopy];
    [result applyTransform:@"[[:Cc:] [:Cf:]] Remove"
                   reverse:NO
                     range:NSMakeRange(0, self.length)
              updatedRange:nil];
    return result;
}
[:Cc:] represents control characters, [:Cf:] represents format characters. Together they represent the same character set as the already mentioned NSCharacterSet.controlCharacterSet. Documentation:
A character set containing the characters in Unicode General Category Cc and Cf.
Iterate over characters
NSCharacterSet also offers the characterIsMember: method. Here we need to iterate over characters (unichar) and check if it's a control character or not.
Let's update our implementation:
- (NSString *)stringByRemovingControlCharacters {
if (self.length == 0) {
return self;
}
NSUInteger length = self.length;
unichar characters[length];
[self getCharacters:characters];
NSUInteger resultLength = 0;
unichar result[length];
NSCharacterSet *controlCharacterSet = NSCharacterSet.controlCharacterSet;
for (NSUInteger i = 0 ; i < length ; i++) {
if ([controlCharacterSet characterIsMember:characters[i]] == NO) {
result[resultLength++] = characters[i];
}
}
return [NSString stringWithCharacters:result length:resultLength];
}
Here we filter out all characters (unichar) which belong to the controlCharacterSet.
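For comparison, the same filtering idea can be sketched in plain C on a byte string. This is only an illustration under narrower assumptions: the helper name strip_control_chars is hypothetical, and iscntrl() in the default "C" locale only recognizes the ASCII control characters (0x00 to 0x1F and 0x7F), not the full Unicode Cc and Cf categories that controlCharacterSet covers.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: copies `src` into `dst`, skipping ASCII control
   characters. `dst` must have room for strlen(src) + 1 bytes.
   Returns the number of bytes written, not counting the terminator. */
size_t strip_control_chars(const char *src, char *dst)
{
    size_t out = 0;
    for (size_t i = 0; src[i] != '\0'; i++) {
        /* iscntrl() is only well defined for unsigned char values. */
        if (!iscntrl((unsigned char)src[i])) {
            dst[out++] = src[i];
        }
    }
    dst[out] = '\0';
    return out;
}
```

Applied to the bytes of the city string shown earlier, this leaves the six printable bytes tsuqsx.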
Other ways
There are other ways to iterate over characters, for example: Most efficient way to iterate over all the chars in an NSString.
BBEdit & others
Let's write this string to a file:
NSString *city = [[NSUserDefaults standardUserDefaults]
    stringForKey:@"com.apple.preferences.timezone.new.selected_city"];
[city writeToFile:@"/Users/zrzka/city.txt"
       atomically:YES
         encoding:NSUTF8StringEncoding
            error:nil];
It's up to the editor how all these control characters are handled and displayed. Here's an example: Visual Studio Code.
View - Render Control Characters off:
View - Render Control Characters on:
BBEdit displays upside-down question marks, but I'm sure there's a way to toggle control character rendering. I don't have BBEdit installed to verify it.

How to read input in Objective-C?

I am trying to write some simple code that searches two dictionaries for a string and prints to the console if the string appears in both dictionaries. I want the user to be able to input the string via the console, and then pass the string as a variable into a message. I was wondering how I could go about getting a string from the console and using it as the argument in the following method call.
[x rangeOfString:@"the string goes here" options:NSCaseInsensitiveSearch];
I am unsure how to get the string from the user. Do I use scanf() or fgets() into a char buffer and then convert it into an NSString, or simply scan into an NSString itself? I am also wondering how to pass that string as an argument. Please help:
Here is the code I have so far. I know it is not succinct, but I just want to get the job done:
#import <Foundation/Foundation.h>
#include <stdio.h>
#include "stdlib.h"
int main(int argc, const char* argv[]){
@autoreleasepool {
char *name[100];
printf("Please enter the name you wish to search for");
scanf("%s", *name);
NSString *name2 = [NSString stringWithFormat:@"%s" , *name];
NSString *nameString = [NSString stringWithContentsOfFile:@"/usr/share/dict/propernames" encoding:NSUTF8StringEncoding error:NULL];
NSString *dictionary = [NSString stringWithContentsOfFile:@"/usr/share/dict/words" encoding:NSUTF8StringEncoding error:NULL];
NSArray *nameString2 = [nameString componentsSeparatedByString:@"\n"];
NSArray *dictionary2 = [dictionary componentsSeparatedByString:@"\n"];
int nsYES = 0;
int dictYES = 0;
for (NSString *n in nameString2) {
NSRange r = [n rangeOfString:name2 options:NSCaseInsensitiveSearch];
if (r.location != NSNotFound){
nsYES = 1;
}
}
for (NSString *x in dictionary2) {
NSRange l = [x rangeOfString:name2 options:NSCaseInsensitiveSearch];
if (l.location != NSNotFound){
dictYES = 1;
}
}
if (dictYES && nsYES){
NSLog(@"glen appears in both dictionaries");
}
}
}
Thanks.
Safely reading from standard input in an interactive manner in C is kind of involved. The standard functions require a fixed-size buffer, which means either some input will be too long (and corrupt your memory!) or you'll have to read in a loop. And unfortunately, Cocoa doesn't offer us a whole lot of help.
For reading standard input entirely (as in, if you're expecting an input file over standard input), there is NSFileHandle, which makes it pretty succinct. But for interactively reading and writing like you want to do here, you pretty much have to go with the linked answer for reading.
Once you have read some input into a C string, you can easily turn it into an NSString with, for example, +[NSString stringWithUTF8String:].
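As a rough illustration of the fixed-size buffer approach, here is a minimal C sketch built on fgets(). The helper name read_line is an assumption made for this example, not an existing API. It truncates input that does not fit and discards the rest of the line, so an overlong entry cannot corrupt memory or spill into the next read.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: reads one line from `in` into `buf` (at most
   size - 1 characters) and strips the trailing newline.
   Returns false on end of file or on a read error. */
bool read_line(FILE *in, char *buf, size_t size)
{
    if (size == 0 || fgets(buf, (int)size, in) == NULL) {
        return false;
    }
    size_t len = strcspn(buf, "\r\n");
    if (buf[len] == '\0') {
        /* No newline was stored: the line was longer than the buffer
           (or the input ended), so skip whatever is left of it. */
        int c;
        while ((c = getc(in)) != '\n' && c != EOF) {
        }
    }
    buf[len] = '\0';
    return true;
}
```

In the question's program this would be called as read_line(stdin, name, sizeof name) with name declared as char name[100], and the result passed to +[NSString stringWithUTF8String:].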

Enumerate NSString characters via pointer

How can I enumerate NSString by pulling each unichar out of it? I can use characterAtIndex but that is slower than doing it by an incrementing unichar*. I didn't see anything in Apple's documentation that didn't require copying the string into a second buffer.
Something like this would be ideal:
for (unichar c in string) { ... }
or
unichar* ptr = (unichar*)string;
You can speed up -characterAtIndex: by converting it to its IMP form first:
NSString *str = @"This is a test";
NSUInteger len = [str length]; // only calling [str length] once speeds up the process as well
SEL sel = @selector(characterAtIndex:);
// using typeof to save my fingers from typing more
unichar (*charAtIdx)(id, SEL, NSUInteger) = (typeof(charAtIdx)) [str methodForSelector:sel];
for (NSUInteger i = 0; i < len; i++) {
    unichar c = charAtIdx(str, sel, i);
    // do something with c
    NSLog(@"%C", c);
}
EDIT: It appears that the CFString Reference contains the following method:
const UniChar *CFStringGetCharactersPtr(CFStringRef theString);
This means you can do the following:
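const UniChar *chars = CFStringGetCharactersPtr((__bridge CFStringRef) theString);
if (chars != NULL) // NULL means no internal UTF-16 buffer is directly available
{
    // The buffer is not guaranteed to be zero-terminated, so iterate by length.
    CFIndex length = CFStringGetLength((__bridge CFStringRef) theString);
    for (CFIndex i = 0; i < length; i++)
    {
        // do something with chars[i]
    }
}
If you don't want to allocate memory for copying the buffer, this is the way to go, as long as you fall back to another approach when it returns NULL.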
Your only option is to copy the characters into a new buffer. This is because the NSString class does not guarantee that there is an internal buffer you can use. The best way to do this is to use the getCharacters:range: method.
NSUInteger i, length = [string length];
unichar *buffer = malloc(sizeof(unichar) * length);
NSRange range = {0,length};
[string getCharacters:buffer range:range];
for(i = 0; i < length; ++i) {
    unichar c = buffer[i];
    // do something with c
}
free(buffer); // the buffer was malloc'ed, so release it when done
If you are using potentially very long strings, it would be better to allocate a fixed size buffer and enumerate the string in chunks (this is actually how fast enumeration works).
I created a block-style enumeration method that uses getCharacters:range: with a fixed-size buffer, as per ughoavgfhw's suggestion in his answer. It avoids the situation where CFStringGetCharactersPtr returns null and it doesn't have to malloc a large buffer. You can drop it into an NSString category, or modify it to take a string as a parameter if you like.
-(void)enumerateCharactersWithBlock:(void (^)(unichar, NSUInteger, BOOL *))block
{
const NSInteger bufferSize = 16;
const NSInteger length = [self length];
unichar buffer[bufferSize];
NSInteger bufferLoops = (length - 1) / bufferSize + 1;
BOOL stop = NO;
for (int i = 0; i < bufferLoops; i++) {
NSInteger bufferOffset = i * bufferSize;
NSInteger charsInBuffer = MIN(length - bufferOffset, bufferSize);
[self getCharacters:buffer range:NSMakeRange(bufferOffset, charsInBuffer)];
for (int j = 0; j < charsInBuffer; j++) {
block(buffer[j], j + bufferOffset, &stop);
if (stop) {
return;
}
}
}
}
The fastest reliable way to enumerate characters in an NSString I know of is to use this relatively little-known Core Foundation gem hidden in plain sight (CFString.h).
NSString *string = <#initialize your string#>
NSUInteger stringLength = string.length;
CFStringInlineBuffer buf;
CFStringInitInlineBuffer((__bridge CFStringRef) string, &buf, (CFRange) { 0, stringLength });
for (NSUInteger charIndex = 0; charIndex < stringLength; charIndex++) {
unichar c = CFStringGetCharacterFromInlineBuffer(&buf, charIndex);
}
If you look at the source code of these inline functions, CFStringInitInlineBuffer() and CFStringGetCharacterFromInlineBuffer(), you'll see that they handle all the nasty details like CFStringGetCharactersPtr() returning NULL, CFStringGetCStringPtr() returning NULL, defaulting to slower CFStringGetCharacters() and caching the characters in a C array for fastest access possible. This API really deserves more publicity.
The caveat is that if you initialize the CFStringInlineBuffer at a non-zero offset, you should pass a relative character index to CFStringGetCharacterFromInlineBuffer(), as stated in the header comments:
The next two functions allow fast access to the contents of a string, assuming you are doing sequential or localized accesses. To use, call CFStringInitInlineBuffer() with a CFStringInlineBuffer (on the stack, say), and a range in the string to look at. Then call CFStringGetCharacterFromInlineBuffer() as many times as you want, with a index into that range (relative to the start of that range). These are INLINE functions and will end up calling CFString only once in a while, to fill a buffer. CFStringGetCharacterFromInlineBuffer() returns 0 if a location outside the original range is specified.
I don't think you can do this. NSString is an abstract interface to a multitude of classes that make no guarantees about the internal storage of the character data, so it's entirely possible there is no character array to get a pointer to.
If neither of the options mentioned in your question are suitable for your app, I'd recommend either creating your own string class for this purpose, or using raw malloc'ed unichar arrays instead of string objects.
This will work:
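const char *s = [string UTF8String]; // UTF8String returns a const pointer; don't mutate it
for (const char *t = s; *t; t++)
    /* use as */ *t; // note: each *t is one UTF-8 byte, not one unichar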
[Edit] And if you really need unicode characters then you have no option but to use length and characterAtIndex. From the documentation:
The NSString class has two primitive methods—length and characterAtIndex:—that provide the basis for all other methods in its interface. The length method returns the total number of Unicode characters in the string. characterAtIndex: gives access to each character in the string by index, with index values starting at 0.
So your code would be:
for (int index = 0; index < string.length; index++)
{
unichar c = [string characterAtIndex: index];
/* ... */
}
[edit 2]
Also, don't forget that NSString is 'toll-free bridged' to CFString and thus all the non-Objective-C, straight C-code interface functions are usable. The relevant one would be CFStringGetCharacterAtIndex

Objective-C Convert String like '00120' into array of Integers

I need to convert a string like '00120' into an NSArray of NSIntegers.
can you please help?
Thanks
Try this code out:
NSString *input = @"00120";
NSMutableArray *integers = [NSMutableArray array];
for (NSUInteger i = 0; i < input.length; i++) {
    unichar c = [input characterAtIndex:i];
    if (!isnumber(c))
        [integers addObject:[NSNumber numberWithInt:-1]]; // -1 marks a non-digit character
    else
        [integers addObject:[NSNumber numberWithInt:c - '0']]; // convert the ASCII value to its integer counterpart
}
This is, of course, assuming all of your characters are numbers in the string.
EDIT: If you want an NSInteger, you need to make a C array:
NSString *input = @"00120";
NSInteger *integers = calloc(input.length, sizeof(NSInteger)); // the caller must free() this
NSInteger integersLen = input.length;
for (NSUInteger i = 0; i < input.length; i++)
{
    unichar c = [input characterAtIndex:i];
    if (!isnumber(c))
        integers[i] = -1;
    else
        integers[i] = c - '0'; // convert the ASCII value to its integer counterpart
}
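The per-character conversion above can also be written as a small plain C function, sketched here with a hypothetical name, digits_from_string. It checks the ASCII range '0' to '9' explicitly instead of calling isnumber(), on the assumption that only ASCII digits should count: isnumber() may accept other digit characters depending on the locale, and subtracting '0' from those would not give a meaningful value.

```c
#include <stddef.h>

/* Hypothetical helper: writes the value of each character of `text` into
   `out` (0-9 for ASCII digits, -1 for anything else) and returns the
   number of entries written. At most `capacity` entries are written. */
size_t digits_from_string(const char *text, int *out, size_t capacity)
{
    size_t n = 0;
    while (n < capacity && text[n] != '\0') {
        char c = text[n];
        out[n] = (c >= '0' && c <= '9') ? c - '0' : -1;
        n++;
    }
    return n;
}
```

The caller supplies the output array, so nothing has to be freed afterwards.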
Everything you need to know can be found in the class reference for NSString and NSMutableArray. Look up a tutorial on for loops if you're not familiar with them already.
Notable methods that you're likely to want to use are -length and -characterAtIndex: on NSString, and -addObject: / -insertObject:atIndex: on NSMutableArray.
I don't mean to come across as patronising, but I'm not going to write out the code for you here as you'll learn much more if you work it out yourself with some help. Please do feel free to update the question with your code if you get stuck and ask for more specific help.