Unique Instances of NSString for empty or 1-char strings - objective-c

I would like to understand more about the way Xcode/Objective-C handles constant strings. I found a related question, but I would like more information. Consider the following code:
NSString *a = [[NSString alloc] initWithUTF8String:[[_textFieldA stringValue] UTF8String]];
NSString *b = [[NSString alloc] initWithUTF8String:[[_textFieldB stringValue] UTF8String]];
NSString *c = [a copy];
NSString *d = [a mutableCopy];
Note that the textFields are just a way to set the strings at runtime ensuring that the compiler doesn't get too smart on me and build in a single instance.
If my text fields are empty, or contain a single character such as "x" or "$", then a == b == c == the same constant NSString instance. If I instead provide "xy", then a == c != b. d is always unique, as one might expect since it is mutable.
Normally this wouldn't be an issue, since I'm not trying to modify the contents of these strings. However, I am working on a system where I frequently use objc_setAssociatedObject. I might come across an empty string and set associated object data on it, then encounter another empty string and collide with the first.
I have, for the moment, solved my issue by creating mutable strings instead.
So my questions:
Is this an Objective-C specification, or an Xcode eccentricity?
Does anyone know how the instance is determined? Why does "x" get one instance, but not "xy"? I would think some internal dictionary is involved and there's no good reason to stop at 1 character.
Is there a way to turn this off, so all empty strings are unique instances, or other suggestions?
I am using Xcode 5.1.1, OS X 10.9.4, SDK 10.9.
Thank you!

Is this an Objective-C specification, or an Xcode eccentricity?
It is just an implementation detail, not documented anywhere. This kind of behaviour may change in the future without notice.
Does anyone know how the instance is determined? Why does "x" get one instance, but not "xy"? I would think some internal dictionary is involved and there's no good reason to stop at 1 character.
Not unless someone with access to the source code is willing to share the details with us.
Is there a way to turn this off, so all empty strings are unique instances, or other suggestions?
There is no way to turn it off. Don't use objc_setAssociatedObject with NSString.
As @Ken Thomases said in a comment:
In general, it probably doesn't make sense to use objc_setAssociatedObject() with any value class.
Some other examples are NSNumber, NSData and NSValue. They are often cached and reused.

Related

Objective-c pointers magic. Type protection

I have a dictionary.
I extract one of its values as follows:
NSString *magicValue= [filterDict valueForKey:[filterDict allKeys][0]];
[SomeClass foo: magicValue];
And foo is:
- (void)foo:(NSString*)magicValue
{
    NSLog(@"magicValue is string:%@", [magicValue isKindOfClass:[NSString class]] ? @"YES" : @"NO");
    NSLog(@"magicValue is number:%@", [magicValue isKindOfClass:[NSNumber class]] ? @"YES" : @"NO");
}
If the dictionary value is a number, magicValue will be an NSNumber, so the pointer declared as a string will be pointing to an NSNumber. The log will print YES for the number check.
I never added protection to such methods to check what class "magicValue" is. I assumed that when I define a method with a string parameter, it will be a string.
Should I start accounting for such behavior and always add checks, or is it the fault of the person who assigned that dictionary value to magicValue in such a way and used my method? I need some best-practices advice on how to handle this.
This question could have already been answered but I didn't know how to search for it.
Short answer: No, do not check that unless there is a special reason.
Long answer:
You have to differentiate between two cases:
A. Using id
You have a variable or a return value of the type id. In your example this is -valueForKey:. The reason for that typing is to keep the method generic. Even though a type mismatch is theoretically possible, in practice it is very rare in such situations and is detected quickly during development. With a different motivation, I asked the audience (>200) in a public talk how many times they had had such a typing error in production. For all listeners, all of their apps, in all of the apps' versions, there was 1 (in words: one!) case. Simply forget about that risk. It is the anxiety of developers using statically typed languages (Java, C++, Swift).
B. Wrong assignment
If you do not have an id type, it is still possible to do such tricks. (And sometimes you want to do that. That is a strength of dynamic typing.) There are two subcategories:
You can do it implicitly:
NSString *string = [NSNumber numberWithInt:1];
The compiler will warn you about that. So everything is fine, because the developer will see his mistake. You do not have to protect him or your code.
One can do it explicitly:
NSString *string = (NSString*)[NSNumber numberWithInt:1];
In such a case, code can break. But he did it explicitly. If it is wrong, the developer had criminal energy to do so and you do not have to protect him from himself.
Most of the time you should know what class you're referencing, or at least what you intend it to be. On the occasions where you have an unexpected class, which can cause a crash depending on what messages you send to it, you can then debug your code and get the correct reference.
There are times, usually when dealing with inheritance, when you need to determine the class at runtime rather than at compile time. This is when isKindOfClass: can be useful. If you know that a value could be one of many classes, I would extract it as an id and then cast it at the last moment, e.g.:
id value = [[NSUserDefaults standardUserDefaults] valueForKey:aKey];
if ([value isKindOfClass:[MyClass class]]) {
// Do one thing
}
else {
// Do another
}

Lifespan of NSString cStringUsingEncoding return value

I have an NSTextField and get its contents like so
NSString *s = [textField stringValue];
Now I want to convert this NSString to a string that my platform-independent C code can handle. Thus I'm doing:
const char *cstr = [s cStringUsingEncoding:NSISOLatin1StringEncoding];
What I don't understand now is how long this "cstr" pointer stays valid. Apple docs for cStringUsingEncoding say:
The returned C string is guaranteed to be valid only until either the
receiver is freed, or until the current memory is emptied, whichever
occurs first. You should copy the C string or use
getCString:maxLength:encoding: if it needs to store the C string
beyond this time.
Two questions about this:
I suppose the aforementioned "receiver" is the NSString returned by the [textField stringValue]. Since I don't own this NSString how can I tell when this will be freed? Is it safe to assume that this NSString won't be freed before the NSTextField widget will be freed?
What does "until the current memory is emptied" mean precisely? I don't understand this at all.
Of course, I could just go ahead and make a copy but I'd like to understand how long the string pointer returned by cStringUsingEncoding is valid.
I know there are several similar questions here but none could really answer my question since in my case, the owner of the NSString is the NSTextField widget and I don't know when this widget will release the NSString or if it stays valid for the complete lifespan of the widget itself.
I suppose the aforementioned "receiver" is the NSString returned by the [textField stringValue]
Yes, in this case the receiver is s.
Since I don't own this NSString how can I tell when this will be freed?
You can't. You should retain s by storing it in an instance variable for as long as you need it.
Is it safe to assume that this NSString won't be freed before the NSTextField widget will be freed?
No, because you don't know what the text field returned to you as s, or how it produced it.
What does "until the current memory is emptied" mean precisely? I don't understand this at all.
Good question. It is also hard to tell, because you don't own the string or know about its underlying implementation. Say it was a mutable string that was mutated and had to reallocate memory...
You can be pretty sure of your safety if you copy s, store the copy in an instance variable, and then use the copy to get the C string (or just copy the C string).
Receiver for sure means the string s, and the danger to cstr is clear when s is freed. I think the phrase "or until current memory is emptied" is a documentation bug introduced by ARC. It can be read as "or until an ARC-implied release is executed".
See the doc quoted here in 2010 as evidence. I think the author, probably searching for 'autorelease pool' for places to update the docs, was grasping for a harmless, ARC-compatible synonym for "or until the current autorelease pool is emptied". I think it would have been better to just drop the disjunction.
Anyway, either take control of the NSString, or copy the C string.
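The "copy the C string" option can be sketched in plain C, which also compiles unchanged inside an Objective-C file. The helper name copy_c_string is hypothetical, not part of any Apple API; the caller owns the returned buffer and must free() it.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a heap-allocated copy of src that stays
 * valid until the caller free()s it, independent of the NSString the
 * pointer originally came from.
 * Returns NULL if src is NULL or if allocation fails. */
char *copy_c_string(const char *src)
{
    if (src == NULL) {
        /* cStringUsingEncoding: returns NULL when the string cannot be
         * losslessly converted, so pass that through. */
        return NULL;
    }
    size_t size = strlen(src) + 1;   /* include the terminating NUL */
    char *copy = malloc(size);
    if (copy != NULL) {
        memcpy(copy, src, size);
    }
    return copy;
}
```

With the question's code, this might be used as char *owned = copy_c_string(cstr); immediately after obtaining cstr, so that owned stays valid regardless of what later happens to s or the text field.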

How to zeroize an objective-c object under ARC [duplicate]

On iOS, I was wondering, say if I read user provided password value as such:
NSString* strPwd = UITextField.text;
//Check 'strPwd'
...
//How to clear out 'strPwd' from RAM?
I just don't like leaving sensitive data "dangling" in the RAM. Any idea how to zero it out?
Basically you really can't. There are bugs filed with Apple over this exact issue. Additionally there are problems with UITextField and NSString at a minimum.
To reiterate the comment in a now deleted answer by @Leo Natan:
Releasing the enclosing NSString object does not guarantee the string
bytes are zeroed in memory. Also, if a device is jailbroken, all the
sandboxing Apple promises will be of no use. However, in this case,
there is little one can do, as it is possible to swap the entire
runtime in the middle of the application running, thus getting the
password from the memory.
Please file another bug with Apple requesting this, the more the better.
Apple Bug Reporter
While NSString doesn't have this capability (for reasons of encapsulation mentioned elsewhere), it shouldn't be too hard to have your app use regular old C-strings, which are just pointers to memory. Once you have that pointer, it's fairly easy to clear things out when you're done.
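As a rough illustration of clearing such C-string memory, here is a minimal sketch in plain C. The function name zero_sensitive is hypothetical. It writes through a volatile pointer because an ordinary memset() on a buffer that is never read again may be removed by the optimizer; this is a common workaround, not a guarantee, and it only affects the buffer you pass in, not any other copies of the data.

```c
#include <stddef.h>

/* Hypothetical helper: overwrite len bytes starting at buf with zeros.
 * The volatile-qualified pointer discourages the compiler from dropping
 * the stores as dead code when the buffer is about to be freed. */
void zero_sensitive(void *buf, size_t len)
{
    volatile unsigned char *p = buf;
    while (len--) {
        *p++ = 0;
    }
}
```

A typical call would be zero_sensitive(password, passwordCapacity); just before free(password), where both names are placeholders for your own buffer and its size.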
This won't help with user-entered text-fields (which use NSString-s and we can't change them), but you can certainly keep all of your app's sensitive data in pointer-based memory.
I haven't experimented with it (I don't have a current jailbroken device), but it also might be interesting to experiment with NSMutableString -- something like:
// Code typed in browser; may need adjusting
// keep "password" in an NSMutableString
NSUInteger passLength = password.length;
NSString *dummy = @"-";
while (dummy.length < passLength)
{
    dummy = [dummy stringByAppendingString:@"-"];
}
NSRange fullPass = NSMakeRange(0, passLength);
[password replaceOccurrencesOfString:password
                          withString:dummy
                             options:0
                               range:fullPass];
NOTE: I have no idea if this does what you want; it's just something I thought of while typing my earlier answer. Even if it does work now, I guess it depends on the implementation, which is fragile (meaning: subject to breaking in the future), so shouldn't be used.
Still, might be an interesting exercise! :)

Should I use an intermediate temp variable when appending to an NSString?

This works -- it does compile -- but I just wanted to check if it would be considered good practice or something to be avoided?
NSString *fileName = @"image";
fileName = [fileName stringByAppendingString:@".png"];
NSLog(@"TEST : %@", fileName);
OUTPUT: TEST : image.png
Might be better written with a temporary variable:
NSString *fileName = @"image";
NSString *tempName;
tempName = [fileName stringByAppendingString:@".png"];
NSLog(@"TEST : %@", tempName);
just curious.
Internally, compilers will normally break your code up into a representation called "static single assignment" (SSA), where a given variable is only ever assigned one value and all statements are as simple as possible (compound elements are separated out into different lines). Your second example follows this approach.
Programmers do sometimes write like this. It is considered the clearest way of writing code since you can write all statements as basic tuples: A = B operator C. But it is normally considered too verbose for code that is "obvious", so it is an uncommon style (outside of situations where you're trying to make very cryptic code comprehensible).
Generally speaking, programmers will not be confused by your first example and it is considered acceptable where you don't need the original fileName again. However, many Obj-C programmers encourage the following style:
NSString *fileName = [@"image" stringByAppendingString:@".png"];
NSLog(@"TEST : %@", fileName);
or even (depending on horizontal space on the line):
NSLog(@"TEST : %@", [@"image" stringByAppendingString:@".png"]);
i.e. if you only use a variable once, don't name it (just use it in place).
On a stylistic note though, if you were following the static single assignment approach, you shouldn't use tempName as your variable name since it doesn't explain the role of the variable -- you'd instead use something like fileNameWithExtension. In a broader sense, I normally avoid using "temp" as a prefix since it is too easy to start naming everything "temp" (all local variables are temporary so it has little meaning).
The first line is declaring an NSString literal. It has storage that lasts the lifetime of the process, so doesn't need to be released.
The call to stringByAppendingString returns an autoreleased NSString. That should not be released either, but will last until it gets to the next autorelease pool drain.
So assigning the result of the stringByAppendingString call back to the fileName pointer is perfectly fine in this case. In general, however, you should check what your object lifetimes are, and handle them accordingly (e.g. if fileName had been declared as a string whose memory you own, you would need to release it, so using a temp would be necessary).
The other thing to check is whether you're doing anything with fileName after this snippet, e.g. holding on to it in an instance variable, in which case you will need to retain it.
The difference is merely whether you still need the reference to the literal string or not. From the memory management POV and the object creational POV it really shouldn't matter. One thing to keep in mind though is that the second example makes it slightly easier when debugging. My preferred version would look like this:
NSString *fileName = @"image";
NSString *tempName = [fileName stringByAppendingString:@".png"];
NSLog(@"TEST : %@", tempName);
But in the end this is just a matter of preference.
I think you're right this is really down to preferred style.
Personally I like your first example: the code's not complicated, and the first version is concise and easier on the eyes. There's too much of the 'language' hiding what it's doing in the second example.
As noted memory management doesn't seem to be an issue in the examples.

NSString setter using isEqualToString

In the Pragmatic Core Data book, I came across this code snippet for an NSString setter:
- (void)setMyString:(NSString*)string;
{
    @synchronized(self) {
        if ([string isEqualToString:myString]) return;
        [myString release];
        myString = [string retain];
    }
}
Is there any reason to use [string isEqualToString:myString] instead of string == myString here? Does it not mean that if the two strings have the same content, the result will be different than if they are actually the same object? Does this matter?
Thanks.
Notice that the variables you're comparing are pointers to NSStrings. Pointer comparison just checks if the pointers are referring to the same address. It doesn't know anything about the content at the end. Two string objects in two different places can have the same content. Thus you need isEqualToString:. In this case, I'm not sure either that it's a terribly important distinction to make though. It would make more sense to me if it were special-casing sending out change notifications based on whether the new string would actually be a change.
Incidentally, in an NSString setter, you almost always want copy rather than retain. I don't know the exact use case in this book, but if you just retain the string and it happens to be mutable, it can change behind your back and cause weird results. And if the string isn't mutable, copy typically amounts to nothing more than a retain.