How to Get the First Different Character Between 2 Strings in Objective-C (for iOS)?

I know I can loop through each character of two NSString objects using characterAtIndex: and compare them, but this approach would be very expensive if I use this function frequently.
Is there anything built in for this, or a more efficient way to do it?

The quickest way I can think of is to get a C string from each NSString and then iterate through both.
Just a quick example (fix it to your liking):
const char *myCString = [myNSStringInstance UTF8String];
const char *string2 = [nsstring2 UTF8String];
// Walk both strings until they differ or the first one ends.
// Note: i is a UTF-8 byte offset, which matches the NSString index only for ASCII text.
size_t i = 0;
while (myCString[i] != '\0' && myCString[i] == string2[i]) {
    i++;
}
if (myCString[i] != string2[i]) {
    // Do something here: i is the offset of the first difference.
}

It's a little hackish, but you could get the C string for each and then use pointer indexing. It's the same basic algorithm as the idea you mentioned, but theoretically as efficient as you could reasonably expect a solution to be (it just looks at two memory addresses and compares their contents).
Pseudo code:
const char *cStringA = [stringA cStringUsingEncoding:NSUTF8StringEncoding];
const char *cStringB = [stringB cStringUsingEncoding:NSUTF8StringEncoding];
size_t shorterStringLength = MIN(strlen(cStringA), strlen(cStringB));
int mismatchIndex = -1; // stays -1 if one string is a prefix of the other
for (size_t i = 0; i < shorterStringLength; i++) {
    if (cStringA[i] != cStringB[i]) {
        mismatchIndex = (int)i;
        break;
    }
}
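For reference, the loop-based idea from the answers above can be written as a small plain C helper. This is only a sketch under some assumptions: the name first_mismatch is hypothetical, both inputs are NUL-terminated, and the returned value is a byte offset into the UTF-8 data, which equals the NSString character index only for ASCII text.

```c
#include <stddef.h>

/* Returns the byte offset of the first difference between two
   NUL-terminated strings, or -1 if the strings are identical.
   If one string is a prefix of the other, the offset returned is
   the length of the shorter string. */
long first_mismatch(const char *a, const char *b)
{
    size_t i = 0;
    /* Advance while both strings agree and the first has not ended. */
    while (a[i] != '\0' && a[i] == b[i]) {
        i++;
    }
    return (a[i] == b[i]) ? -1 : (long)i;
}
```

In an Objective-C caller you would pass the pointers returned by UTF8String, for example first_mismatch([stringA UTF8String], [stringB UTF8String]).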

Related

Converting NSString to array of chars inside For Loop

I'm trying to use an existing piece of code in an iOS project to alphabetize a list of words in an array (for instance, to make tomato into amoott, or stack into ackst). The code seems to work if I run it on its own, but I'm trying to integrate it into my existing app.
Each word I want it to alphabetize is stored as an NSString inside an array. The issue seems to be that the code takes the word as an array of chars, and I can't get my NSStrings into that format.
If I use string = [currentWord UTF8String], I get the error "Array type char[128] is not assignable", and if I try to create the char array inside the loop (const char *string = [currentWord UTF8String]) I get warnings along the lines of "Initializing char with type const char discards qualifiers". I'm not quite sure how to get around it. Any tips? The method is below; I'll take care of storing the alphabetized versions later.
- (void) alphabetizeWord {
char string[128], temp;
int n, i, j;
for (NSString* currentWord in wordsList) {
n = [currentWord length];
for (i = 0; i < n-1; i++) {
for (j = i+1; j < n; j++) {
if (string[i] > string[j]) {
temp = string[i];
string[i] = string[j];
string[j] = temp;
}
}
}
NSLog(@"The word %@ in alphabetical order is %s", currentWord, string);
}
}
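This should work:
- (void)alphabetizeWord {
    char str[128];
    for (NSString *currentWord in wordsList)
    {
        NSUInteger wordLength = [currentWord length];
        // Skip words that would overflow the buffer (one byte is needed for the terminator)
        if (wordLength >= sizeof(str)) continue;
        for (NSUInteger i = 0; i < wordLength; i++)
        {
            // Assumes the word is plain ASCII, so each unichar fits in a char
            str[i] = (char)[currentWord characterAtIndex:i];
        }
        // Adding the termination char
        str[wordLength] = 0;
        // Sort str here with your existing code
    }
}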
EDIT : Sorry, didn't fully understand at first. Gonna check this out.
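Once the word is in a writable char buffer, as in the answer above, the letters can be sorted with the C library instead of a hand-written nested loop. The following is a minimal sketch, assuming ASCII words; the names compare_chars and sort_letters are hypothetical and not part of the original code.

```c
#include <stdlib.h>
#include <string.h>

/* qsort comparator: orders two chars by their unsigned byte value. */
int compare_chars(const void *a, const void *b)
{
    return (int)*(const unsigned char *)a - (int)*(const unsigned char *)b;
}

/* Sorts the letters of a NUL-terminated word in place. */
void sort_letters(char *word)
{
    qsort(word, strlen(word), sizeof(char), compare_chars);
}
```

The buffer must be writable, which is why the pointer returned by UTF8String cannot be passed directly; copy it into a local array first.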

How do I convert a Hexa-Tri-Decimal number into an int in objective c?

The Hexa-Tri-Decimal number uses the digits 0-9 and A-Z. I know I can convert from hex with an NSScanner, but I'm not sure how to go about converting Hexa-Tri-Decimal.
For example I have a NSString with "0XPM" the int value should be 43690, "1BLC" would be 61680.
Objective-C is built on top of C, so you can use the C library functions to do the conversion. What you're looking for is strtol or one of its sibling functions. strtol accepts any base from 2 to 36, which covers the hexa-tri-decimal (base 36) numbers you refer to.
http://www.cplusplus.com/reference/clibrary/cstdlib/strtol/
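The strtol suggestion can be sketched in plain C as follows. The wrapper name parse_base36 and its 0 / -1 return convention are assumptions made for illustration; only strtol itself comes from the C library. Note that with base 36 the leading "0X" in "0XPM" is read as two ordinary digits, not as a hex prefix.

```c
#include <errno.h>
#include <stdlib.h>

/* Parses a base-36 number (digits 0-9, letters A-Z in either case).
   Returns 0 and stores the value in *out on success, -1 on failure. */
int parse_base36(const char *text, long *out)
{
    char *end = NULL;
    errno = 0;
    long value = strtol(text, &end, 36);
    /* Fail if nothing was parsed, if trailing junk remains,
       or if the value did not fit in a long. */
    if (end == text || *end != '\0' || errno == ERANGE) {
        return -1;
    }
    *out = value;
    return 0;
}
```

From Objective-C, the input would be the C string obtained from the NSString, for example parse_base36([input UTF8String], &value).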
I can only think to do this using C strings, as they offer easier access to individual characters.
This seemed like an interesting problem to solve, so I had a go at writing it:
#include <ctype.h>
#include <string.h>

// Returns the parsed value, or -1 if the input contains an invalid character.
int parseBase36Number(NSString *input)
{
    const char *inputCString = [[input lowercaseString] UTF8String];
    // Use the byte length of the C string, which is what the loop indexes.
    size_t inputLength = strlen(inputCString);
    int orderOfMagnitudeMultiplier = 1;
    int result = 0;
    // iterate backward through the number
    for (int i = (int)inputLength - 1; i >= 0; i--)
    {
        char inputChar = inputCString[i];
        int charNumericValue;
        if (isdigit((unsigned char)inputChar))
        {
            charNumericValue = inputChar - '0';
        }
        else if (islower((unsigned char)inputChar))
        {
            charNumericValue = inputChar - 'a' + 10;
        }
        else
        {
            // unhandled character: report an error
            return -1;
        }
        result += charNumericValue * orderOfMagnitudeMultiplier;
        orderOfMagnitudeMultiplier *= 36;
    }
    return result;
}
NOTE: I've not tested this at all, so take care and let me know how it goes!

Enumerate NSString characters via pointer

How can I enumerate NSString by pulling each unichar out of it? I can use characterAtIndex but that is slower than doing it by an incrementing unichar*. I didn't see anything in Apple's documentation that didn't require copying the string into a second buffer.
Something like this would be ideal:
for (unichar c in string) { ... }
or
unichar* ptr = (unichar*)string;
You can speed up -characterAtIndex: by converting it to its IMP form first:
NSString *str = @"This is a test";
NSUInteger len = [str length]; // only calling [str length] once speeds up the process as well
SEL sel = @selector(characterAtIndex:);
// using typeof to save my fingers from typing more
unichar (*charAtIdx)(id, SEL, NSUInteger) = (typeof(charAtIdx)) [str methodForSelector:sel];
for (NSUInteger i = 0; i < len; i++) {
    unichar c = charAtIdx(str, sel, i);
    // do something with c
    NSLog(@"%C", c);
}
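EDIT: It appears that the CFString Reference contains the following function:
const UniChar *CFStringGetCharactersPtr(CFStringRef theString);
This means you can do the following:
const UniChar *chars = CFStringGetCharactersPtr((__bridge CFStringRef) theString);
if (chars != NULL)
{
    // The buffer is not guaranteed to be NUL-terminated, so iterate by length.
    CFIndex length = CFStringGetLength((__bridge CFStringRef) theString);
    for (CFIndex i = 0; i < length; i++)
    {
        // do something with chars[i]
    }
}
If you don't want to allocate memory for copying the buffer, this is the way to go. Be aware that the function may return NULL when the string has no internal unichar buffer, so you still need a fallback.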
Your only option is to copy the characters into a new buffer. This is because the NSString class does not guarantee that there is an internal buffer you can use. The best way to do this is to use the getCharacters:range: method.
NSUInteger i, length = [string length];
unichar *buffer = malloc(sizeof(unichar) * length);
NSRange range = {0, length};
[string getCharacters:buffer range:range];
for (i = 0; i < length; ++i) {
    unichar c = buffer[i];
    // do something with c
}
free(buffer); // release the copy once you are done with it
If you are using potentially very long strings, it would be better to allocate a fixed size buffer and enumerate the string in chunks (this is actually how fast enumeration works).
I created a block-style enumeration method that uses getCharacters:range: with a fixed-size buffer, as per ughoavgfhw's suggestion in his answer. It avoids the situation where CFStringGetCharactersPtr returns null and it doesn't have to malloc a large buffer. You can drop it into an NSString category, or modify it to take a string as a parameter if you like.
-(void)enumerateCharactersWithBlock:(void (^)(unichar, NSUInteger, BOOL *))block
{
const NSInteger bufferSize = 16;
const NSInteger length = [self length];
unichar buffer[bufferSize];
NSInteger bufferLoops = (length - 1) / bufferSize + 1;
BOOL stop = NO;
for (int i = 0; i < bufferLoops; i++) {
NSInteger bufferOffset = i * bufferSize;
NSInteger charsInBuffer = MIN(length - bufferOffset, bufferSize);
[self getCharacters:buffer range:NSMakeRange(bufferOffset, charsInBuffer)];
for (int j = 0; j < charsInBuffer; j++) {
block(buffer[j], j + bufferOffset, &stop);
if (stop) {
return;
}
}
}
}
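The chunking pattern used by the category method above is not specific to NSString. The plain C sketch below shows the same idea over a raw array of 16-bit code units; memcpy stands in for getCharacters:range:, and every name in it (enumerate_in_chunks, unit_callback, count_until_space, CHUNK_SIZE) is hypothetical and chosen only for illustration.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define CHUNK_SIZE 16

/* Callback: receives a code unit, its absolute index, and a stop flag. */
typedef void (*unit_callback)(uint16_t unit, size_t index, int *stop, void *context);

/* Visits every code unit of source through a small fixed-size buffer. */
void enumerate_in_chunks(const uint16_t *source, size_t length,
                         unit_callback callback, void *context)
{
    uint16_t buffer[CHUNK_SIZE];
    int stop = 0;
    for (size_t offset = 0; offset < length; offset += CHUNK_SIZE) {
        size_t remaining = length - offset;
        size_t count = remaining < CHUNK_SIZE ? remaining : CHUNK_SIZE;
        /* Stand-in for -getCharacters:range: copying one chunk. */
        memcpy(buffer, source + offset, count * sizeof(uint16_t));
        for (size_t j = 0; j < count; j++) {
            callback(buffer[j], offset + j, &stop, context);
            if (stop) {
                return;
            }
        }
    }
}

/* Example callback: counts units and stops at the first space. */
void count_until_space(uint16_t unit, size_t index, int *stop, void *context)
{
    size_t *count = context;
    (void)index; /* unused in this example */
    if (unit == ' ') {
        *stop = 1;
        return;
    }
    (*count)++;
}
```

Advancing by offset rather than precomputing a loop count also makes the empty-string case fall out naturally, since the outer loop simply never runs.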
The fastest reliable way to enumerate characters in an NSString I know of is to use this relatively little-known Core Foundation gem hidden in plain sight (CFString.h).
NSString *string = <#initialize your string#>
NSUInteger stringLength = string.length;
CFStringInlineBuffer buf;
CFStringInitInlineBuffer((__bridge CFStringRef) string, &buf, (CFRange) { 0, stringLength });
for (NSUInteger charIndex = 0; charIndex < stringLength; charIndex++) {
unichar c = CFStringGetCharacterFromInlineBuffer(&buf, charIndex);
}
If you look at the source code of these inline functions, CFStringInitInlineBuffer() and CFStringGetCharacterFromInlineBuffer(), you'll see that they handle all the nasty details like CFStringGetCharactersPtr() returning NULL, CFStringGetCStringPtr() returning NULL, defaulting to slower CFStringGetCharacters() and caching the characters in a C array for fastest access possible. This API really deserves more publicity.
The caveat is that if you initialize the CFStringInlineBuffer at a non-zero offset, you should pass a relative character index to CFStringGetCharacterFromInlineBuffer(), as stated in the header comments:
The next two functions allow fast access to the contents of a string, assuming you are doing sequential or localized accesses. To use, call CFStringInitInlineBuffer() with a CFStringInlineBuffer (on the stack, say), and a range in the string to look at. Then call CFStringGetCharacterFromInlineBuffer() as many times as you want, with a index into that range (relative to the start of that range). These are INLINE functions and will end up calling CFString only once in a while, to fill a buffer. CFStringGetCharacterFromInlineBuffer() returns 0 if a location outside the original range is specified.
I don't think you can do this. NSString is an abstract interface to a multitude of classes that make no guarantees about the internal storage of the character data, so it's entirely possible there is no character array to get a pointer to.
If neither of the options mentioned in your question are suitable for your app, I'd recommend either creating your own string class for this purpose, or using raw malloc'ed unichar arrays instead of string objects.
This will work:
const char *s = [string UTF8String];
for (const char *t = s; *t; t++) {
    /* use *t here; note these are UTF-8 bytes, not Unicode characters */
}
[Edit] And if you really need unicode characters then you have no option but to use length and characterAtIndex. From the documentation:
The NSString class has two primitive methods—length and characterAtIndex:—that provide the basis for all other methods in its interface. The length method returns the total number of Unicode characters in the string. characterAtIndex: gives access to each character in the string by index, with index values starting at 0.
So your code would be:
for (int index = 0; index < string.length; index++)
{
unichar c = [string characterAtIndex: index];
/* ... */
}
[edit 2]
Also, don't forget that NSString is 'toll-free bridged' to CFString and thus all the non-Objective-C, straight C-code interface functions are usable. The relevant one would be CFStringGetCharacterAtIndex

Quickest way to be sure region of memory is blank (all NULL)?

If I have an unsigned char *data pointer and I want to check whether size_t length of the data at that pointer is NULL, what would be the fastest way to do that? In other words, what's the fastest way to make sure a region of memory is blank?
I am implementing in iOS, so you can assume iOS frameworks are available, if that helps. On the other hand, simple C approaches (memcmp and the like) are also OK.
Note, I am not trying to clear the memory, but rather trying to confirm that it is already clear (I am trying to find out whether there is anything at all in some bitmap data, if that helps). For example, I think the following would work, though I have not tried it yet:
- (BOOL)data:(unsigned char *)data isNullToLength:(size_t)length {
    unsigned char tester[length]; // a variable-length array cannot take an initializer
    memset(tester, 0, length);
    if (memcmp(tester, data, length) != 0) {
        return NO;
    }
    return YES;
}
I would rather not create a tester array, though, because the source data may be quite large and I'd rather avoid allocating memory for the test, even temporarily. But I may just be being too conservative there.
UPDATE: Some Tests
Thanks to everyone for the great responses below. I decided to create a test app to see how these performed, and the results surprised me, so I thought I'd share them. First I'll show you the versions of the algorithms I used (in some cases they differ slightly from those proposed), and then I'll share some results from the field.
The Tests
First I created some sample data:
size_t length = 1024 * 768;
unsigned char *data = (unsigned char *)calloc(length, sizeof(unsigned char));
int i;
int count;
long check;
int loop = 5000;
Each test ran the loop below `loop` times. On each pass, a random byte in the data was set and then cleared again. Note that half the time no byte was actually set, so half the time the test should not find any non-zero data. The testZeros call is a placeholder for calls to the test routines below. A timer was started before the loop and stopped after it.
count = 0;
for (i=0; i<loop; i++) {
int r = random() % length;
if (random() % 2) { data[r] = 1; }
if (! testZeros(data, length)) {
count++;
}
data[r] = 0;
}
Test A: nullToLength. This was more or less my original formulation above, debugged and simplified a bit.
- (BOOL)data:(void *)data isNullToLength:(size_t)length {
    // Allocate a zero-filled reference buffer of the same size
    void *tester = calloc(length, 1);
    int test = memcmp(tester, data, length);
    free(tester);
    return (! test);
}
Test B: allZero. Proposal by Carrotman.
BOOL allZero (unsigned char *data, size_t length) {
bool allZero = true;
for (int i = 0; i < length; i++){
if (*data++){
allZero = false;
break;
}
}
return allZero;
}
Test C: is_all_zero. Proposed by Lundin.
BOOL is_all_zero (unsigned char *data, size_t length)
{
BOOL result = TRUE;
unsigned char* end = data + length;
unsigned char* i;
for(i=data; i<end; i++) {
if(*i > 0) {
result = FALSE;
break;
}
}
return result;
}
Test D: sumArray. This is the top answer from the nearly duplicate question, proposed by vladr.
BOOL sumArray (unsigned char *data, size_t length) {
int sum = 0;
for (int i = 0; i < length; ++i) {
sum |= data[i];
}
return (sum == 0);
}
Test E: lulz. Proposed by Steve Jessop.
BOOL lulz (unsigned char *data, size_t length) {
if (length == 0) return 1;
if (*data) return 0;
return memcmp(data, data+1, length-1) == 0;
}
Test F: NSData. This is a test using NSData object I discovered in the iOS SDK while working on all of these. It turns out Apple does have an idea of how to compare byte streams that is designed to be hardware independent.
- (BOOL)nsdTestData: (NSData *)nsdData length: (NSUInteger)length {
    // Allocate a zero-filled reference buffer of the same size
    void *tester = calloc(length, 1);
    NSData *nsdTester = [NSData dataWithBytesNoCopy:tester length:length freeWhenDone:NO];
    BOOL test = [nsdData isEqualToData:nsdTester];
    free(tester);
    return (test);
}
Results
So how did these approaches compare? Here are two sets of data, each representing 5000 loops through the check. First I tried this on the iPhone Simulator running on a relatively old iMac, then I tried this running on a first generation iPad.
On the iPhone 4.3 Simulator running on an iMac:
// Test A, nullToLength: 0.727 seconds
// Test F, NSData: 0.727
// Test E, lulz: 0.735
// Test C, is_all_zero: 7.340
// Test B, allZero: 8.736
// Test D, sumArray: 13.995
On a first generation iPad:
// Test A, nullToLength: 21.770 seconds
// Test F, NSData: 22.184
// Test E, lulz: 26.036
// Test C, is_all_zero: 54.747
// Test B, allZero: 63.185
// Test D, sumArray: 84.014
These are just two samples; I ran the test many times with only slightly varying results. The order of performance was always the same: A and F very close, E just behind, then C, B, and D. I'd say that A, F, and E are virtual ties. On iOS I'd prefer F because it takes advantage of Apple's protection from processor change issues, but A and E are very close. The memcmp approach clearly wins over the simple loop approach: it is close to ten times faster in the simulator and more than twice as fast on the device itself. Oddly enough, D, the winning answer from the other thread, performed very poorly in this test, probably because it does not break out of the loop when it hits the first non-zero byte.
I think you should do it with an explicit loop, but just for lulz:
if (length == 0) return 1;
if (*pdata) return 0;
return memcmp(pdata, pdata+1, length-1) == 0;
Unlike memcpy, memcmp does not require that the two data sections don't overlap.
It may well be slower than the loop, though, because the un-alignedness of the input pointers means there probably isn't much the implementation of memcmp can do to optimize, plus it's comparing memory with memory rather than memory with a constant. Easy enough to profile it and find out.
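To spell out why the overlapping memcmp trick above is correct, here it is as a complete plain C function with the reasoning in comments. The function name is_all_zero_memcmp is hypothetical; the logic is the three-line snippet from this answer.

```c
#include <stddef.h>
#include <string.h>

/* Returns 1 if all length bytes at data are zero, otherwise 0. */
int is_all_zero_memcmp(const unsigned char *data, size_t length)
{
    if (length == 0) {
        return 1; /* an empty region is trivially all zero */
    }
    if (data[0] != 0) {
        return 0;
    }
    /* data[0] is zero. Comparing the buffer with itself shifted by one
       byte checks data[i] == data[i + 1] for every i, so equality
       implies that every byte equals data[0], which is zero. */
    return memcmp(data, data + 1, length - 1) == 0;
}
```

memcmp only reads its arguments, so the two overlapping ranges are not a problem here.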
Not sure if it's the best, but I probably would do something like this:
bool allZero = true;
for (size_t i = 0; i < length; i++){
    if (*data++){
        //Roll back so data points to the non-zero char
        data--;
        //Do whatever is needed if it isn't zero.
        allZero = false;
        break;
    }
}
If you've just allocated this memory, you can always call calloc rather than malloc (calloc guarantees that the allocated memory is zeroed out). (Edit: reading your comment on the first post, you don't really need this. I'll leave it here just in case.)
If you're allocating the memory yourself, I'd suggest using the calloc() function. It's just like malloc(), except it zeros out the buffer first. It's what's used to allocate memory for Objective-C objects and is the reason that all ivars default to 0.
On the other hand, if this is a statically declared buffer, or a buffer you're not allocating yourself, memset() is the easy way to do this.
Logic to get a value, check it, and set it will be at least as expensive as just setting it. You want it to be null, so just set it to null using memset().
This would be the preferred way to do it in C:
BOOL is_all_zero (const unsigned char* data, size_t length)
{
BOOL result = TRUE;
const unsigned char* end = data + length;
const unsigned char* i;
for(i=data; i<end; i++)
{
if(*i > 0)
{
result = FALSE;
break;
}
}
return result;
}
(Note that, strictly and formally speaking, the in-memory representation of a null pointer need not be all zero bits; the language only requires that a constant zero converted to a pointer yields a null pointer, and that a null pointer compares equal to zero. In practice, this shouldn't matter, as all known compilers use 0 or (void*) 0 for NULL.)
Note the edit to the initial question above. I did some tests and it is clear that the memcmp approach or using Apple's NSData object and its isEqualToData: method are the best approaches for speed. The simple loops are clearer to me, but slower on the device.

Trying to Understand NSString::initWithBytes

I'm attempting to convert a legacy C++ program to Objective-C. The program needs an array of the 256 possible ASCII characters (8 bits per character). I'm attempting to use the NSString method initWithBytes:length:encoding: to do so. Unfortunately, when coded as shown below, it crashes (although it compiles).
NSString* charasstring[256];
unsigned char char00;
int temp00;
for (temp00 = 0; temp00 <= 255; ++temp00)
{
char00 = (unsigned char)temp00;
[charasstring[temp00] initWithBytes:&char00 length:1 encoding:NSASCIIStringEncoding];
}
What am I missing?
First, the method is simply initWithBytes:length:encoding: and not the NSString::initWithBytes you used in the title. I point this out only because forgetting everything you know from C++ is your first step towards success with Objective-C. ;)
Secondly, your code demonstrates that you don't understand Objective-C or use of the Foundation APIs.
you aren't allocating instances of NSString anywhere
you declared an array of 256 NSString instance pointers, probably not what you want
a properly encoded ASCII string cannot contain all 256 byte values, because ASCII is a 7-bit encoding
I would suggest you start here.
To solve that specific problem, the following code should do the trick:
NSMutableArray* ASCIIChars = [NSMutableArray arrayWithCapacity:256];
int i;
for (i = 0; i <= 255; ++i)
{
[ASCIIChars addObject:[NSString stringWithFormat:@"%c", (unsigned char)i]];
}
To be used, later on, as follows:
NSString* oneChar = [ASCIIChars objectAtIndex:32]; // for example
However, if all you need is an array of characters, you can just use a simple C array of characters:
unsigned char ASCIIChars [256];
int i;
for (i = 0; i <= 255; ++i)
{
ASCIIChars[i] = (unsigned char)i;
}
To be used, later on, as follows:
unsigned char c = ASCIIChars[32];
The choice will depend on how you want to use that array of characters.