I'm just beginning, and I'm a little hung up on this. I may have a fundamental misunderstanding with which you can kindly help me out.
Why is it that you can assign a string value to an NSString* (and, I'm sure, many other object types) directly? E.g.,
NSString* s = @"Hello, world!";
whereas the following code, I believe, would assign to s2 s1's pointer value (and therefore only incidentally provide s2 with a string value)?
NSString* s1 = @"Hello, world!";
NSString* s2 = s1;
For many objects, don't you have to indicate a property, a.k.a. instance variable, to which you want to assign a value (i.e., use a setter method)? Shouldn't the object itself accept assignments only of pointer values? Or do classes such as NSString automatically reinterpret code such as the first example above to assign the indicated string to an implied instance variable using an implied setter?
Why is it that you can assign a string value to an NSString* (and, I'm
sure, many other object types) directly?
Though it may look like it, you are not assigning the value of the string 'directly' to the variable. You are actually assigning the address of a string object to your pointer variable. Now, the real question is what is going on behind the scenes when you have an expression of the type:
NSString * str = @"Hello World";
This expression represents the creation of a string literal. In C (and Objective-C which is a strict superset of C), string literals get special handling. Specifically, the following happens:
When your code is compiled, the string "Hello World" is created in the data section of the program.
When the program is executing, the pointer variable 'str' is given storage (on the stack, if it is a local variable).
The 'str' variable is pointed at the static memory location where the actual string "Hello World" is stored.
Your second example does not create a new string either: s2 simply receives the same address that s1 holds, so both variables point to the same statically allocated string. Note that in both cases the variable is just a pointer.
More or less the latter. String literals like @"Hello World!" are treated as a special case in Objective-C: strings declared with that syntax are statically allocated, instantiated and cached at compile time to improve performance. From the programmer's perspective, it's no different from calling [NSString stringWithString:@"Hello World!"] or a constructor that takes a C-string; you should just think of it as syntactic sugar.
FWIW, Objective-C has recently begun extending the @ prefix to allow declaring dictionary and array literals as well, e.g.: @{ @"key" : @"value" } or @[ obj1, obj2, obj3 ].
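As a rough illustration of the collection literals mentioned above, here is a small sketch. It assumes a Clang toolchain with Foundation; the function name LiteralCollectionsDemo is made up for this example.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: builds one dictionary and one array from literals
// and returns the total number of entries they contain.
static NSUInteger LiteralCollectionsDemo(void) {
    NSDictionary *dict = @{ @"key" : @"value" };   // one key/value pair
    NSArray *array = @[ @"a", @"b", @"c" ];        // three elements
    return [dict count] + [array count];
}
```

Unlike string literals, these collection literals are generally built at runtime, so they are best thought of as shorthand for the usual factory methods.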
This is a function of the compiler and not a language construct. The compiler in this case recognizes a string literal and inserts some code to produce the intended result.
@"" is essentially shorthand for NSString's +stringWithUTF8String: method.
Taken from here:
What does the @ symbol represent in Objective-C?
NSString *s1 = @"Hello, world!";
is essentially equivalent to
NSString *s1 = [NSString stringWithUTF8String:"Hello, world!"];
The former allocates a new NSString object statically (instead of on the heap at runtime, as the latter would do).
It's important to note that these are just pointers. When you do NSString *s2 = s1, both s1 and s2 refer to the same object.
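To make the point about pointers concrete, here is a minimal sketch, assuming Foundation is available. The function name SameObjectAfterAssignment is invented for illustration.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: returns YES when assigning one NSString pointer
// to another leaves both variables referring to the same object.
static BOOL SameObjectAfterAssignment(void) {
    NSString *s1 = @"Hello, world!";  // s1 holds the address of the literal's object
    NSString *s2 = s1;                // copies the address, not the characters
    return s1 == s2;                  // pointer comparison: one object, two names
}
```

No setter or implied instance variable is involved: the assignment only copies an address from one variable to the other.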
I'm a little confused while coding an Objective-C project. ARC is on. Here is some sample code:
NSString *foo = [[NSString alloc] initWithUTF8String:"This is a C string."];
// Use foo here...
foo = @"This is an ObjC string.";
Here are my questions:
Do I need to explicitly terminate the C string with '\0' in the initWithUTF8String: method, or is it okay to omit the NUL terminator?
Is there any memory leak when I reuse foo as a pointer and assign a new Objective-C string to it? Why?
If I change NSString to another class, like NSObject or my own class, is there any difference for question 2? (Initialize an object and then reassign another value directly to it.)
Thank you!
An explicit \0 is not required because in C (and hence Objective C), quoted string literals are null-terminated implicitly by the compiler. Here's a similar question.
Do string literals that end with a null-terminator contain an extra null-terminator?
No memory leakage. The ARC-configured compiler will generate code to release the first string that was being referenced before assigning the new string.
No change. You may get a compile-time warning if the types aren't compatible.
You must have the null terminator. From the documentation: "bytes - A NULL-terminated C array of bytes in UTF-8 encoding. This value must not be NULL."
No. The compiler will insert implicit release of the previous value and retain the new one since you declared foo with (implicit) strong semantics. From the documentation: "__strong is the default. An object remains “alive” as long as there is a strong pointer to it."
In general, no.
For the first question, what Apple's official documentation tells me is:
Returns a string created by copying the data from a given C array of UTF8-encoded bytes.
+ (id)stringWithUTF8String:(const char *)bytes
Parameters
bytes
A NULL-terminated C array of bytes in UTF8 encoding.
But since a string literal is NUL-terminated by default (as @TomSwift points out), it's okay to omit it.
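A small sketch of why no explicit '\0' is needed, assuming Foundation; LengthFromCLiteral is a made-up name used only here.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: builds an NSString from a C string literal and
// returns its length.
static NSUInteger LengthFromCLiteral(void) {
    // The compiler appends the terminating '\0' to "abc" automatically,
    // so the bytes passed here are 'a', 'b', 'c', '\0'.
    NSString *s = [NSString stringWithUTF8String:"abc"];
    return [s length];  // the terminator is not counted as a character
}
```

The requirement in the documentation still matters for buffers you fill yourself: a char array without a terminator must not be passed to this method.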
Is there any difference between
NSString * str = @"123";
and
NSString * str = [[NSString alloc] initWithString:@"123"];
from the compiler's perspective?
Theoretically yes; in implementation detail, probably not.
In the first case, the compiler creates a constant string and assigns a pointer to it to the variable str. You do not own the string.
In the second case, the compiler creates a constant string (as before) but this time it is used by the run time as a parameter in initialising another string that you do own (because the second string was created using alloc).
That's the end of the stuff you need to know.
However, there is a lot of optimisation that goes on. Because NSStrings are immutable, you'll find that initWithString: actually just returns the parameter. Normally, it would retain the parameter before returning it to you (because you are expecting to own it) but literal strings have a special retainCount (INT_MAX I think) to stop the run time from ever trying to deallocate them. So in practice, your second line of code produces identical results to the first.
This, incidentally, is why it is incorrect to say the string is autoreleased in the first case, because it isn't. It's just a constant string with a special retain count.
But you can and should safely ignore the implementation detail and just remember, you don't own the string in the first case, but you do own it in the second case.
Lots of differences. The most important is that you own the second string so you're responsible for releasing it (as is the case whenever you get an object from the init family of methods).
Another is that the former creates a string literal, and if you make a new string with the same literal, they will be pointers to the same object. If you do this:
NSString * str1 = @"123";
NSString * str2 = [[NSString alloc] initWithString:@"123"];
NSString * str3 = @"123";
Then str1 == str2 is false, but str1 == str3 is true. (Of course, the string content is the same, so isEqual: will return true. Also, while this does make for faster comparison, you probably shouldn't use it because it's an implementation detail and could in theory change in the future).
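Because the result of == on literals depends on implementation details, the sketch below uses an NSMutableString, which is certain to be a separate object, to show the difference between pointer equality and content equality. It assumes Foundation; the function name is hypothetical.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: returns YES when two strings have equal contents
// but are stored in two different objects.
static BOOL ContentEqualButDistinct(void) {
    NSString *literal = @"123";
    // A mutable string is created at runtime, so it cannot be the
    // statically allocated literal object.
    NSString *built = [NSMutableString stringWithString:@"123"];
    BOOL sameContent = [literal isEqualToString:built];  // compares characters
    BOOL sameObject = (literal == built);                // compares addresses
    return sameContent && !sameObject;
}
```

In ordinary code, isEqualToString: is the comparison to rely on; == only tells you whether two variables hold the same address.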
Yes, in the first case you do not own the string and you are not responsible to release it.
In the second case, instead, you are calling alloc, so you become the owner of the object and you must call release on it when you are done; otherwise it will become a memory leak.
In general, if the method you use to get your object contains "new", "alloc", "copy" or "mutableCopy", then you are the owner of the object and you are responsible for releasing it.
Check the memory management rules
Yes. The first is assignment of an NSString, and in the second the alloc (which means you need to release it in some way later) and initWithString: method are getting called.
Yes, the first statement creates an autoreleased object.
Second one creates an object occupying some memory and you have to release it after using it.
The most important difference regarding memory (your question title) is:
when you do:
NSString* myString = @"my text";
you are allocating an object of NSConstantString type.
The difference from NSString is:
NSConstantString is statically allocated, while NSString is dynamically allocated.
I'm new to Objective-C and need help with the concept of pointers. I've written this code:
//myArray is of type NSMutableArray
NSString *objectFromArray = [myArray objectAtIndex:2];
[objectFromArray uppercaseString];
I assumed that this would change the string at myArray[2] since I got the actual pointer to it. Shouldn't any changes to the dereferenced pointer mean that the object in that location changes? Or does this have something to do with 'string immutability'? Either way, when I use NSLog and iterate through myArray, all the strings are still lowercase.
Shouldn't any changes to the dereferenced pointer mean that the object in that location changes?
Yes, they would. But if you read the documentation for uppercaseString, you see that it does not modify the string in place. Rather, it returns a new uppercase version of the original string. All methods on NSString work like that.
You would need an instance of NSMutableString to be able to modify its contents in place. But NSMutableString does not have a corresponding uppercase method, so you would have to write it yourself (as a category on NSMutableString).
Of course no string in the array is converted to uppercase: the statement [objectFromArray uppercaseString]; returns the uppercase string, but that return value is never stored anywhere. uppercaseString does not modify the string object on which it is called.
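One way to get the effect the question was after is to store the returned string back into the array. This is only a sketch, assuming ARC and Foundation; UppercaseElementAtIndex is a name invented for this example.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: returns a copy of the array in which the element
// at the given index has been replaced by its uppercase version.
static NSArray *UppercaseElementAtIndex(NSArray *strings, NSUInteger index) {
    NSMutableArray *result = [strings mutableCopy];
    NSString *original = [result objectAtIndex:index];
    // uppercaseString leaves 'original' untouched and returns a new string.
    NSString *upper = [original uppercaseString];
    // The array slot must be pointed at the new object explicitly.
    [result replaceObjectAtIndex:index withObject:upper];
    return result;
}
```

With the NSMutableArray from the question, the last two steps can be applied to myArray directly instead of to a copy.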
What does the following line actually do?
string = @"Some text";
Assuming that "string" is declared thusly in the header:
NSString *string;
What does the "=" actually do here? What does it do to "string"'s reference count? In particular, assuming that for some reason "string" is not otherwise assigned to, does it need to be released?
Thanks!
The assignment is just that. The string pointer is basically a label that points to a specific address in memory. A reassignment statement points that label to another address in memory!
It doesn't change reference counting or do anything beyond that in Objective-C. You need to maintain the reference count yourself, if you are running in a non-garbage-collection environment:
[string release];
string = [@"Some text" retain];
However, string literals don't need to be managed, as they get allocated statically and never get deallocated! So the release and retain methods are just NOOPs (i.e. no operations). You can safely omit them.
What does the following line actually do?
string = @"Some text";
Assuming that "string" is declared thusly in the header:
NSString *string;
What does the "=" actually do here? What does it do to "string"'s reference count?
string is not a string.
string is, in fact, not any other kind of Cocoa object, either.
string is a variable, which you've created to hold an instance of NSString. The assignment operator puts something into a variable*. In your example above, you create a literal string, and put that into the variable.
Since string is a variable, not a Cocoa object, it has no reference count.
Assigning an object somewhere can extend the object's lifetime in garbage-collected code (only on the Mac). See the Memory Management Programming Guide for Cocoa for more details.
*Or a C array. Don't confuse these with Cocoa arrays; they're not interchangeable, and you can't use the assignment operator to put things into a Cocoa collection (not in Objective-C, anyway).
When you use a literal like in this case, it is just syntactic sugar to quickly create an NSString object. Once created, the object behaves just like any other. The difference here is that your string is compiled into the program instead of created dynamically.
I'm learning objective-C and Cocoa and have come across this statement:
The Cocoa frameworks expect that global string constants rather than string literals are used for dictionary keys, notification and exception names, and some method parameters that take strings.
I've only worked in higher level languages so have never had to consider the details of strings that much. What's the difference between a string constant and string literal?
In Objective-C, the syntax @"foo" is an immutable, literal instance of NSString. It does not make a constant string from a string literal as Mike assumes.
Objective-C compilers typically do intern literal strings within compilation units — that is, they coalesce multiple uses of the same literal string — and it's possible for the linker to do additional interning across the compilation units that are directly linked into a single binary. (Since Cocoa distinguishes between mutable and immutable strings, and literal strings are always also immutable, this can be straightforward and safe.)
Constant strings on the other hand are typically declared and defined using syntax like this:
// MyExample.h - declaration, other code references this
extern NSString * const MyExampleNotification;
// MyExample.m - definition, compiled for other code to reference
NSString * const MyExampleNotification = @"MyExampleNotification";
The point of the syntactic exercise here is that you can make uses of the string efficient by ensuring that there's only one instance of that string in use even across multiple frameworks (shared libraries) in the same address space. (The placement of the const keyword matters; it guarantees that the pointer itself is constant.)
While burning memory isn't as big a deal as it may have been in the days of 25MHz 68030 workstations with 8MB of RAM, comparing strings for equality can take time. Ensuring that most of the time strings that are equal will also be pointer-equal helps.
Say, for example, you want to subscribe to notifications from an object by name. If you use non-constant strings for the names, the NSNotificationCenter posting the notification could wind up doing a lot of byte-by-byte string comparisons when determining who is interested in it. If most of these comparisons are short-circuited because the strings being compared have the same pointer, that can be a big win.
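The short-circuit described above can be sketched as follows. This is not a description of how NSNotificationCenter is actually implemented; NamesMatch is a hypothetical helper, and the constant is declared static here only so the example fits in one file.

```objc
#import <Foundation/Foundation.h>

// Single shared instance of the name, as in the constant-string pattern.
static NSString * const MyExampleNotification = @"MyExampleNotification";

// Hypothetical helper: compares two names, trying the cheap pointer
// check before falling back to a character-by-character comparison.
static BOOL NamesMatch(NSString *a, NSString *b) {
    if (a == b) {
        return YES;                // same object: no need to look at the characters
    }
    return [a isEqualToString:b];  // different objects may still hold equal text
}
```

When every caller passes the same constant, the first branch is taken and the character comparison never runs.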
Some definitions
A literal is a value, which is immutable by definition, e.g., 10
A constant is a read-only variable or pointer, e.g., const int age = 10;
A string literal is an expression like @"". The compiler will replace this with an instance of NSString.
A string constant is a read-only pointer to NSString, e.g., NSString *const name = @"John";
Some comments on the last line:
That's a constant pointer, not a constant object [1]. objc_msgSend [2] doesn't care if you qualify the object with const. If you want an immutable object, you have to code that immutability inside the object [3].
All @"" expressions are indeed immutable. They are replaced [4] at compile time with instances of NSConstantString, which is a specialized subclass of NSString with a fixed memory layout [5]. This also explains why NSString is the only object that can be initialized at compile time [6].
A constant string would be const NSString* name = @"John"; which is equivalent to NSString const* name = @"John";. Here, both syntax and programmer intention are wrong: const <object> is ignored, and the NSString instance (NSConstantString) was already immutable.
[1] The keyword const applies to whatever is immediately to its left. If there is nothing to its left, it applies to whatever is immediately to its right.
[2] This is the function that the runtime uses to send all messages in Objective-C, and therefore what you can use to change the state of an object.
[3] Example: in const NSMutableArray *array = [NSMutableArray new]; [array removeAllObjects]; const doesn't prevent the last statement.
[4] The LLVM code that rewrites the expression is RewriteModernObjC::RewriteObjCStringLiteral in RewriteModernObjC.cpp.
[5] To see the NSConstantString definition, cmd+click it in Xcode.
[6] Creating compile-time constants for other classes would be easy, but it would require the compiler to use a specialized subclass. This would break compatibility with older Objective-C versions.
Back to your quote
The Cocoa frameworks expect that global string constants rather than
string literals are used for dictionary keys, notification and
exception names, and some method parameters that take strings. You
should always prefer string constants over string literals when you
have a choice. By using string constants, you enlist the help of the
compiler to check your spelling and thus avoid runtime errors.
It says that literals are error prone. But it doesn't say that they are also slower. Compare:
// string literal
[dic objectForKey:@"a"];
// string constant
NSString *const a = @"a";
[dic objectForKey:a];
In the second case I'm using keys with const pointers, so instead of [a isEqualToString:b], I can do (a==b). The implementation of isEqualToString: compares the hash and then runs the C function strcmp, so it is slower than comparing the pointers directly. That is why constant strings are better: they are faster to compare and less prone to errors.
If you also want your constant string to be global, do it like this:
// header
extern NSString *const name;
// implementation
NSString *const name = @"john";
Let's use C++, since my Objective C is totally non-existent.
If you stash a string into a constant variable:
const std::string mystring = "my string";
Now when you call methods with mystring, you're using a string constant:
someMethod(mystring);
Or, you can call those methods with the string literal directly:
someMethod("my string");
The reason, presumably, that they encourage you to use string constants is because Objective C doesn't do "interning"; that is, when you use the same string literal in several places, it's actually a different pointer pointing to a separate copy of the string.
For dictionary keys, this makes a huge difference, because if I can see the two pointers are pointing to the same thing, that's much cheaper than having to do a whole string comparison to make sure the strings have equal value.
Edit: Mike, in C# strings are immutable, and literal strings with identical values all end up pointing at the same string value. I imagine that's true for other languages with immutable strings as well. In Ruby, which has mutable strings, they offer a separate data type: symbols ("foo" vs. :foo, where the former is a mutable string, and the latter is an immutable identifier often used for Hash keys).