Why is NSMaximumStringLength not INT_MAX - objective-c

While perusing the NSString header file, I saw the following:
#define NSMaximumStringLength (INT_MAX-1)
Why is the maximum string length one short of INT_MAX? Is this to accommodate a null terminator (\0)? A related article can be found here.

Hypothesis:
It's to accommodate the NUL character: \0.
Documentation:
In Apple documentation found here for NSMaximumStringLength
NSMaximumStringLength
DECLARED IN foundation/NSString.h
SYNOPSIS NSMaximumStringLength
DESCRIPTION NSMaximumStringLength is the greatest possible length for an NSString.
And an NSString is but an "array of Unicode characters" - Source
NSString is concretized into either __NSCFString during runtime or __NSCFConstantString during compile time - Source
__NSCFString : Probably akin to __NSCFConstantString (See memory investigation below).
__NSCFConstantString: uses a char array allocation ( const char *cStr ) - Source.
Memory Investigation of NSString:
Code
NSString *s1 = @"test";
Breaking during runtime in LLDB:
Type:
expr [s1 fileSystemRepresentation]
Output:
$0 = 0x0b92bf70 "test" // Essential memory location and content.
To view memory type in LLDB:
memory read 0x0b92bf70
Output:
0x0b92bf70: 74 65 73 74 00 00 00 00 00 00 00 00 00 00 00 00 test............
0x0b92bf80: 7c 38 d4 02 72 a2 1b 03 f2 e6 1b 03 71 c5 4a 00 |8..r.......q.J.
Notice the zero bytes (00) that follow the last character, t.
Testing Hypothesis of NULL termination:
Added a char* to previous code:
NSString *s1 = @"test";
char *p = (char*)[s1 cString];
Break into code with LLDB and type:
expr p[4] = '\1' // Removing NULL char.
Now if we print NSString with command:
expr s1
Output:
(NSString *) $0 = 0x002f1534 @"test
Avg Draw Time: %g"
Notice the garbage after the 't', "Avg Draw Time: %g" (a buffer over-read).
Conclusion
Through inference we can observe that there is 1 byte in the NSMaximumStringLength definition that is left for the NULL char to determine the end of a string in memory.
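To make the arithmetic behind this hypothesis concrete, here is a minimal C sketch. It assumes, as the answer infers rather than proves, that the unit held back is meant for a terminator. The names MAX_LEN and bytes_needed are illustrative and are not part of Foundation.

```c
#include <assert.h>
#include <limits.h>

/* Mirrors the header's definition; the name MAX_LEN is illustrative. */
#define MAX_LEN (INT_MAX - 1)

/* Bytes needed to hold a C string of `len` characters plus its '\0'.
   Computed in long long so that the +1 cannot overflow an int. */
static long long bytes_needed(int len) {
    assert(len >= 0);
    return (long long)len + 1;
}
```

Under that assumption, bytes_needed(MAX_LEN) is exactly INT_MAX, so the size of a maximum-length string plus its terminator still fits in an int.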

Related

Pointers in memory changed when exiting if statement

#import <Foundation/Foundation.h>
int main(int argc, const char * argv[]) {
    const char* dog = "german shepard";
    NSFileManager* fileManager = [NSFileManager defaultManager];
    NSString *doglistPath = @"/Users/doglover/doglist.plist";
    const char* doglistPath_cString = [doglistPath UTF8String];
    const char** doglist;
    if ([fileManager fileExistsAtPath:doglistPath]) {
        NSArray *doglistPlist = [NSArray arrayWithContentsOfFile:doglistPath];
        NSUInteger doglistPlistCount = [doglistPlist count];
        const char* doglist2[blacklistPlistCount];
        memset(doglist2, 0, sizeof doglist2);
        for (int index = 0; index < doglistPlistCount; index++) {
            const char *doglistName = [doglistPlist[index][@"type"] UTF8String];
            doglist2[index] = doglistName;
        }
        doglist = doglist2;
    }
    while(true) {
        NSLog(@"%s", *doglist);
        doglist++;
    }
    return 0;
}
There are 20 items in the plist.
Whenever I run this code, it prints only 4 items and then fails with a Thread 1: EXC_BAD_ACCESS error.
When I inspected the memory, I saw that when execution leaves the if statement, the pointers in the doglist memory change from:
B1 78 10 03 01 00 00 00
51 7B 10 03 01 00 00 00
01 7D 10 03 01 00 00 00
A1 7E 10 03 01 00 00 00
91 80 10 03 01 00 00 00
F8 74 10 03 01 00 00 00
11 83 10 03 01 00 00 00
C1 84 10 03 01 00 00 00
to
B1 78 10 03 01 00 00 00
51 7B 10 03 01 00 00 00
01 7D 10 03 01 00 00 00
A1 7E 10 03 01 00 00 00
40 00 00 00 00 00 00 00
90 ED BF EF FE 7F 00 00
0E 00 B6 D1 69 DA B9 2C
00 00 00 00 00 00 00 00
In the changed memory, the first four pointers still point to the items, but an error occurs at the fifth entry, 0x40.
Why did the memory change after exiting the if statement?
A few notes:
1. +arrayWithContentsOfFile is deprecated. You shouldn't be using it. Do you have control over the doglist.plist file? If so, I would suggest storing it using some other encoding mechanism, such as JSON.
2. Where is blacklistPlistCount declared? It's not in the code you posted, which prevents the code from compiling.
3. You are using -[NSString UTF8String], which returns a pointer to a buffer. However, this buffer is not guaranteed to exist past the lifetime of the string. Per the documentation:
This C string is a pointer to a structure inside the string object, which may have a lifetime shorter than the string object and will certainly not have a longer lifetime. Therefore, you should copy the C string if it needs to be stored outside of the memory context in which you use this property.
4. Why are you using low-level C pointers at all to convert back and forth between C data structures and Objective-C types such as NSArray and NSString? I would use an NSMutableArray for doglist2 instead of a C array. Unless you're doing certain specialized tasks, raw C pointers in Objective-C code are probably a code smell.
Anyway, point 3 is the most likely cause of your crash, because the buffer is long gone by the time you reach the while loop in your code.
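If the C strings really must outlive the objects they came from, the documentation's advice to copy them could look roughly like the plain C sketch below. The helper names copy_c_strings and free_c_strings are hypothetical. The sketch also heap-allocates the pointer array itself, because in the posted code doglist2 is local to the if block, so doglist is left pointing at storage whose lifetime has ended once that block exits. That is an observation about the posted code, not something the answer above states. The array is NULL-terminated so that a loop has a defined place to stop, unlike the posted while(true).

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Frees a NULL-terminated array produced by copy_c_strings. */
static void free_c_strings(char **list) {
    if (list == NULL) return;
    for (char **p = list; *p != NULL; p++) free(*p);
    free(list);
}

/* Deep-copies `count` C strings into a heap array that stays valid until
   free_c_strings is called. Returns NULL if an allocation fails. */
static char **copy_c_strings(const char **src, size_t count) {
    assert(src != NULL);
    /* calloc zero-fills, so the slot after the last copy is already NULL. */
    char **out = calloc(count + 1, sizeof *out);
    if (out == NULL) return NULL;
    for (size_t i = 0; i < count; i++) {
        size_t n = strlen(src[i]) + 1;      /* include the terminator */
        out[i] = malloc(n);
        if (out[i] == NULL) {
            free_c_strings(out);            /* frees the copies made so far */
            return NULL;
        }
        memcpy(out[i], src[i], n);
    }
    return out;
}
```

In the question's loop, each UTF8String result would be collected first and then passed through a helper like this before the owning NSString objects go away; iteration would then run until the NULL entry.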

Converting NSData that contains UTF-8 and null bytes to string

I have an __NSCFData object. I know what's inside it.
61 70 70 6c 65 2c 74 79 70 68 6f 6f 6e 00 41 52 4d 2c 76 38 00
I tried converting it to a string with initWithData: and stringWithUTF8String:, and both give me "apple,typhoon". The conversion stops at the first 00 byte.
The data actually is
61 a
70 p
70 p
6c l
65 e
2c ,
74 t
79 y
70 p
68 h
6f o
6f o
6e n
00 (null)
41 A
52 R
4d M
2c ,
76 v
38 8
00 (null)
How can I properly convert this without loss of information?
The documentation for stringWithUTF8String describes its first parameter as:
A NULL-terminated C array of bytes in UTF8 encoding.
Which is why your conversion stops at the first null byte.
What you appear to have is a collection of C strings packed into a single NSData. You can convert each one individually. Use the NSData methods bytes and length to obtain a pointer to the bytes/first C string and the total number of bytes respectively. The standard C function strlen() will give you the length in bytes of an individual string. Combine these and some simple pointer arithmetic and you can write a loop which converts each string and, for example, stores them all into an array or concatenates them.
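The loop described above might be sketched in plain C as follows. The function name next_packed_string is hypothetical. One deliberate change from the description: the sketch uses memchr instead of strlen, so the search stays inside the buffer even if the last string is missing its terminator.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Returns the next NUL-terminated string packed in bytes[0..length) and
   advances *offset past its terminator. Returns NULL when no complete
   string remains. */
static const char *next_packed_string(const char *bytes, size_t length,
                                      size_t *offset) {
    assert(bytes != NULL && offset != NULL);
    if (*offset >= length) return NULL;
    const char *start = bytes + *offset;
    /* Bounded search for the terminator within the remaining bytes. */
    const char *end = memchr(start, '\0', length - *offset);
    if (end == NULL) {          /* trailing bytes without a terminator */
        *offset = length;
        return NULL;
    }
    *offset += (size_t)(end - start) + 1;
    return start;
}
```

From Objective-C, bytes and length would come from [data bytes] and [data length], and each returned pointer could be wrapped with +[NSString stringWithUTF8String:] and added to an array.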
If you get stuck implementing the solution ask a new question, show your code, and explain the issue. Someone will undoubtedly help you with the next step.
HTH
Contrary to what some answers suggest, the strings stored in NSString instances are not 0-terminated. Although writing them out can cause problems (the underlying C output functions expect a 0-terminated string), an instance itself can contain a \0:
NSString *zeroIncluded = @"A\0B";
NSLog(@"%ld", [zeroIncluded length]);
// prints 3
To create such an instance, you can use methods that take a bytes and a length parameter, e.g. -initWithBytes:length:encoding:. Therefore something like this should work:
NSData *data = …
[[NSString alloc] initWithBytes:[data bytes] length:[data length] encoding:NSUTF8StringEncoding];
However, as CRD suggests, you should check whether you actually want such a string.
0, or NUL, is the sentinel value that terminates C strings, so you're going to have to deal with it somehow if you want to automatically dump the bytes into a string. If you don't, the string, or anything that tries to print it, will assume the end of the string has been reached at the first NUL.
Just replace the bytes as they occur with something printable, like a space. Use whatever value works for you.
Example:
// original data you have from somewhere
char something[] = "apple,typhoon\0ARM,v8\0";
NSData *data = [NSData dataWithBytes:something length:sizeof(something)];
// Find each null terminated string in the data
NSMutableArray *strings = [NSMutableArray new];
NSMutableString *temp = [NSMutableString string];
const char *bytes = [data bytes];
for (int i = 0; i < [data length]; i++) {
    unsigned char byte = (unsigned char)bytes[i];
    if (byte == 0) {
        if ([temp length] > 0) {
            [strings addObject:temp];
            temp = [NSMutableString string];
        }
    } else {
        [temp appendFormat:@"%c", byte];
    }
}
// Results
NSLog(@"strings count: %lu", [strings count]);
[strings enumerateObjectsUsingBlock:^(NSString *string, NSUInteger idx, BOOL * _Nonnull stop) {
    NSLog(@"%ld: %@", idx, string);
}];
// strings count: 2
// 0: apple,typhoon
// 1: ARM,v8

Objective-C Raw MD5-hash

In Objective-C, I generate a simple MD5-hash of 'HelloKey', which returns 0FD16658AEE3C52060A39F4EDFB11437. Unfortunately, I could not get a raw return, so I have to work with this string to get a raw MD5-hash (or do you know how I can get a raw result from the start?)
Anyway, in order to convert it to raw, I split it into chunks of 2 chars each, calculate the hex value, and append a char with that value to a string.
Here's the function:
- (NSString *)hex2bin:(NSString *)input{
    NSString *output = @"";
    for (int i = 0; i < input.length; i+=2){
        NSString *component = [input substringWithRange:NSMakeRange(i, 2)];
        unsigned int outVal;
        NSScanner* scanner = [NSScanner scannerWithString:component];
        [scanner scanHexInt:&outVal];
        /* if(outVal > 127){
            outVal -= 256;
        } */
        // unsigned char appendage = (char)outVal;
        output = [NSString stringWithFormat:@"%@%c", output, outVal];
        NSLog(@"component: %@ = %d", component, outVal);
    }
    return output;
}
When I print each outVal, I get:
0F = 15
D1 = 209
66 = 102
58 = 88
AE = 174
E3 = 227
C5 = 197
20 = 32
60 = 96
A3 = 163
9F = 159
4E = 78
DF = 223
B1 = 177
14 = 20
37 = 55
However, when I print the string that I get with a special function that tells me the integer values of each character (a function which is shown here):
- (NSString *)str2bin:(NSString *)input{
    NSString *output = @"";
    for (NSInteger charIdx=0; charIdx < input.length; charIdx++){
        char currentChar = [input characterAtIndex:charIdx];
        int charNum = [NSNumber numberWithChar:currentChar].intValue;
        output = [NSString stringWithFormat:@"%@ %d", output, charNum];
    }
    return output;
}
I get: 15 20 102 88 -58 30 72 32 96 -93 -4 78 2 -79 20 55. You will notice that there are significant differences, like 209 -> 20, 174 -> -58, 227 -> 30. In some cases, the difference is 256, so no harm done. But in other cases, it's not, and I would really like to know what's going wrong. Any tips?
You are doing it wrong: you are trying to store binary data in an NSString, which holds Unicode text rather than arbitrary bytes, so byte values above 127 do not survive the round trip.
You should use NSData (or a C byte array) to store the binary hash representation.
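As a rough illustration of keeping the digest as bytes, the following C sketch decodes the hex text into an unsigned char buffer instead of building a string. The names hex_digit_value and hex_to_bytes are hypothetical helpers, not part of any Apple API.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Value of one hex digit, or -1 if `c` is not a hex digit. */
static int hex_digit_value(char c) {
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

/* Decodes `hex` into `out`. Returns the number of bytes written, or 0 if
   the input has odd length, contains a non-hex character, or needs more
   than out_size bytes. */
static size_t hex_to_bytes(const char *hex, unsigned char *out,
                           size_t out_size) {
    assert(hex != NULL && out != NULL);
    size_t hex_len = strlen(hex);
    if (hex_len % 2 != 0 || hex_len / 2 > out_size) return 0;
    for (size_t i = 0; i < hex_len / 2; i++) {
        int hi = hex_digit_value(hex[2 * i]);
        int lo = hex_digit_value(hex[2 * i + 1]);
        if (hi < 0 || lo < 0) return 0;
        out[i] = (unsigned char)((hi << 4) | lo);   /* always 0..255 */
    }
    return hex_len / 2;
}
```

The resulting buffer can be handed to +[NSData dataWithBytes:length:]. If the hash is computed with CommonCrypto's CC_MD5, that function already fills a 16-byte buffer with the raw digest, so the hex round trip may not be needed at all.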

iOS NSInvocation setArgument: atIndex: does not work with struct on ARM builds

I have a strange problem with setting the argument of an NSInvocation with a struct that contains a double or any 64 bit type which is not aligned (I offset it with a char at the beginning of the struct). The problem is that some bytes are cleared after the argument is set. This problem occurs on ARM7 but not in the iOS simulator.
I'm using LLVM 3.0 and Xcode 4.2
Here is my code and test results:
NSInvocation+Extension.h
@interface NSInvocation (Extension)
+ (NSInvocation*) invocationWithTarget: (id)aTarget
                              selector: (SEL)aSelector
                       retainArguments: (BOOL)aRetainArguments, ...;
- (void) setArguments: (va_list)aArgList;
- (void) setArguments: (va_list)aArgList atIndex: (NSInteger)aIndex;
@end // NSInvocation (Extension)
NSInvocation+Extension.m
#import <objc/runtime.h>
#import "NSInvocation+Extension.h"
@implementation NSInvocation (Extension)
+ (NSInvocation*) invocationWithTarget: (id)aTarget
                              selector: (SEL)aSelector
                       retainArguments: (BOOL)aRetainArguments, ...
{
    NSMethodSignature* signature = [aTarget methodSignatureForSelector: aSelector];
    NSInvocation* invocation = [NSInvocation invocationWithMethodSignature: signature];
    if (aRetainArguments)
    {
        [invocation retainArguments];
    }
    [invocation setTarget: aTarget];
    [invocation setSelector: aSelector];
    va_list argList;
    va_start(argList, aRetainArguments);
    [invocation setArguments: argList];
    va_end(argList);
    return invocation;
}
- (void) setArguments: (va_list)aArgList
{
    [self setArguments: aArgList atIndex: 0];
}
- (void) setArguments: (va_list)aArgList atIndex: (NSInteger)aIndex
{
    // Arguments are aligned on machine word boundaries
    const NSUInteger KOffset = sizeof(size_t) - 1;
    UInt8* argPtr = (UInt8*)aArgList;
    NSMethodSignature* signature = [self methodSignature];
    // Indices 0 and 1 indicate the hidden arguments self and _cmd respectively.
    for (int index = aIndex + 2; index < [signature numberOfArguments]; ++index)
    {
        const char* type = [signature getArgumentTypeAtIndex: index];
        NSUInteger size = 0;
        NSGetSizeAndAlignment(type, &size, NULL);
        [self setArgument: argPtr atIndex: index];
        argPtr += (size + KOffset) & ~KOffset;
    }
}
@end // NSInvocation (Extension)
Declare method to invoke and a data struct
- (void) arg1: (char)aArg1 arg2: (char)aArg2 arg3: (TEST)aArg3 arg4: (char)aArg4;
typedef struct test {
char c;
double s;
char t;
void* b;
char tu;
} TEST;
Calling code
TEST df = { 'A', 12345678.0, 'B', (void*)2, 'C' };
char buf[100] = {0};
NSInvocation* ik = [NSInvocation invocationWithTarget: self selector: @selector(arg1:arg2:arg3:arg4:) retainArguments: NO, '1', '2', df, '3'];
[ik getArgument: &buf atIndex: 4];
Contents of buf on ARM7 (bytes 8, 9, 10 and 11 set to zero which messed up the double value)
41 00 00 00 00 00 00 00 29 8C 67 41 42 00 00 00 02 00 00 00 43 00 00 00
Contents of buf on i386 simulator (as expected)
41 00 00 00 00 00 00 C0 29 8C 67 41 42 00 00 00 02 00 00 00 43 00 00 00
First thought is that you really must use va_arg to access successive arguments in a variadic argument list. There is no way you can just assume that the arguments are arranged in a nice contiguous piece of memory as you do. For one thing, the ARM ABI says the first four arguments are passed in registers.
A va_list need not be just a pointer; it's an opaque type. Your cast to UInt8* is almost certainly invalid.

appending data using NSMutableData

Right now I'm appending data using NSMutableData's -appendBytes:length: like this:
int length = [self.trackData length]+3;
[contents appendBytes:&length length:4];
Suppose length is 22. In hex, the bytes appended are 16 00 00 00, padded to 4 bytes.
How can I add the additional zeros to the left like in 00 00 00 16?
You probably want to swap the bytes to big-endian:
int length = NSSwapHostIntToBig([self.trackData length]+3);
[contents appendBytes:&length length:4];