Modify strings with reflection - objective-c

I was reading this question/answers, which basically showed an interesting behaviour in Java and strings, and two questions came up in my mind:
Do Objective-C/Swift Strings behave the same? I mean, if I have, for example, two variables which store the same literal "someString", will they refer to one "someString" object internally? I didn't find anything about it in the documentation.
If the answer to my previous question is yes, then is it possible to change identical string literals the way you can in Java?

Not all NSString literals (@"string literal") share the same storage, because each compilation unit is compiled separately.
NSString literals cannot be changed in program code; they are compiled into read-only segments.
NSString instances created at runtime are shared only by assignment.
NSString instances are immutable and cannot be changed after creation.
NSMutableString instances can be modified, and all variables pointing to such an instance see the same change.
Swift is slightly different. As @Grimxn points out, a Swift String is not a class, and immutability is determined by the declaration syntax: let or var.

Related

Why are instances created using a 'literal syntax' known as 'literals'?

Something that is bothering me is why the term 'literal' is used to refer to instances of classes like NSString and NSArray. I had only seen the term used in reference to NSString and being naive I thought it had something to do with it 'literally' being a string, that is between quotation markers. Sorry if that sounds pathetic, but that was how I had been thinking about it.
Then today I learned that certain instances of NSArray can also be referred to as literal instances, i.e. an instance of the class created using a 'literal syntax'.
As @Linuxios notes, literal syntaxes are built into the language. They're broader than you think, though. A literal just means that an actual value is encoded in the source. So there are quite a few literal syntaxes in ObjC. For example:
1 - int
1.0 - double
1.0f - float
"a" - C-string
@"a" - NSString
@[] - NSArray
^{} - function
Yeah, blocks are just function literals. They are an anonymous value that is assignable to a symbol name (such as a variable or constant).
Generally speaking, literals can be stored in the text segment and be computed at compile time (rather than at run time). If I remember correctly, array literals are currently expanded into the equivalent code and evaluated at runtime, but @"..." string literals are encoded into the binary as static data (at least now they are; non-Apple versions of gcc used to encode an actual function call to construct static strings as I remember).
A literal syntax or a literal is just an object that was created using a dedicated syntax built into the language instead of using the normal syntax for object creation (whatever that is).
Here I create a literal array:
NSArray* a = @[@"Hello", @"World"];
Which is, for all intents and purposes, equivalent to this:
NSArray* a = [NSArray arrayWithObjects:@"Hello", @"World", nil];
The first is called a literal because the @[] syntax is built into the language for creating arrays, in the same way that the @"..." syntax is built in for creating NSStrings.
the term 'literal' is used to refer to instances of classes
It's not referring to the instance really; after the object is created, the way it was created doesn't matter:
NSArray * thisWasCreatedWithALiteral = @[@1, @2];
NSArray * butWhoCares = thisWasCreatedWithALiteral;
The "literal" part is just the special syntax @[@1, @2], and
it ha[s] something to do with it 'literally' being a string, that is between quotation markers.
is exactly right: this is a written-out representation of the array, as opposed to one created with a constructor method like arrayWithObjects:

What is the difference between NSCFString and NSConstantString?

I declare the object variable as an NSString.
But when I use Xcode to inspect my object, I see that there are two types of string, and it seems that the system automatically converts one to the other.
What is the difference between them? Are they interchangeable? Also, under what conditions does one change to the other?
Thanks.
They're both concrete subclasses of NSString. __NSCFString is one created during runtime via Foundation or Core Foundation, while __NSCFConstantString is either a CFSTR("...") constant or an @"..." constant, created at compile-time.
Their interfaces are private. Treat them both as NSString and you should have no trouble.
As far as I know, NSCFConstantString is an implementation of NSString that keeps the string data in code memory. The compiler creates instances of it when you use @"string" constants. You can use NSCFConstantString anywhere an NSString could be used due to the subclass/superclass relationship, but obviously not the other way around.
It appears to be an optimization done by the compiler. I'm guessing that the string that is getting converted to an NSCFConstantString is equal to one of the constants that is cached for performance reasons. Your NSCFString is just a toll-free bridged string that can be an NSString or a CFString. See this article for more information.
One benefit of NSString literals being NSCFConstantString instances is shown in the following example.
In the tableView method cellForRowAtIndexPath, if you write
NSString *ident = @"identificator";
NSLog(@"%p", ident);
then it will print the same address for every cell. But with
NSLog(@"%p", &ident) it can print a different address for every cell.
NSString *ident = @"identificator" is a special case: it is created as a __NSCFConstantString instance, so all equal string literals share the same memory address to optimize memory usage. &ident is the address of a local variable pointing to an NSString and has type NSString **.
Reference to source (comments).

Objective-C String Differences

What's the difference between NSString *myString = @"Johnny Appleseed" versus NSString *myString = [NSString stringWithString: @"Johnny Appleseed"]?
Where's a good case to use either one?
The other answers here are correct. A case where you would use +stringWithString: is to obtain an immutable copy of a string which might be mutable.
In the first case, you are getting a pointer to a constant NSString. As long as your program runs, myString will be a valid pointer. In the second, you are creating an autoreleased NSString object with a constant string as a template. In that case, myString won't point to a real object anymore after the current run loop ends.
Edit: As many people have noted, the normal implementation of stringWithString: just returns a pointer to the constant string, so under normal circumstances, your two examples are exactly the same. There is a bit of a subtle difference in that Objective-C allows methods of a class to be replaced using categories and allows whole classes to be replaced with class_poseAs. In those cases, you might run into a non-default implementation of stringWithString:, which may have different semantics than you expect it to. Just because it happens to be that the default implementation does the same thing as a simple assignment doesn't mean that you should rely on subtle implementation-specific behaviour in your program - use the right case for the particular job you're trying to do.
Other than syntax and a very very minor difference in performance, nothing. They both produce the exact same pointer to the exact same object.
Use the first example. It's easier to read.
In practice, nothing. You wouldn't ever use the second form, really, unless you had some special reason to. And I can't think of any right now.
(See Carl's answer for the technical difference.)

static NSStrings in Objective-C

I frequently see a code snippet like this in class instance methods:
static NSString *myString = @"This is a string.";
I can't seem to figure out why this works. Is this simply the objc equivalent of a #define that's limited to the method's scope? I (think) I understand the static nature of the variable, but more specifically about NSStrings, why isn't it being alloc'd, init'd?
Thanks~
I think the question has two unrelated parts.
One is why isn't it being alloc'ed and init'ed. The answer is that when you write an Objective-C string literal of the @"foo" form, the Objective-C compiler will create an NSString instance for you.
The other question is what the static modifier does. It does the same that it does in a C function, ensuring that the myString variable is the same each time the method is used (even between different object instances).
A #define macro is something quite different: It's "programmatic cut and paste" of source code, executed before the code arrives at the compiler.
Just stumbled upon the very same static NSString declaration. I wondered how exactly this static magic works, so I read up a bit. I'm only gonna address the static part of your question.
According to K&R every variable in C has two basic attributes: type (e.g. float) and storage class (auto, register, static, extern, typedef).
The static storage class has two different effects depending on whether it's used:
inside of a block of code (e.g. inside of a function),
outside of all blocks (at the same level as a function).
A variable inside a block that doesn't have its storage class declared is by default considered to be auto (i.e. it's local). It will get deleted as soon as the block exits. When you declare an automatic variable to be static it will keep its value upon exit. That value will still be there when the block of code gets invoked again.
Global variables (declared at the same level as a function) are always static. Explicitly declaring a global variable (or a function) to be static limits its scope to just the single source code file. It won't be accessible from and it won't conflict with other source files. This is called internal linkage.
If you'd like to find out more then read up on internal and external linkage in C.
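The two effects of static described above can be sketched in plain C, which compiles unchanged as Objective-C. The names next_ticket, total_so_far and file_private_total are made up for illustration.

```c
/* Hypothetical names; plain C, which is also valid Objective-C. */

/* static at file scope: internal linkage, so this variable is visible
   only inside this source file. */
static int file_private_total = 0;

/* static inside a function: the variable is initialized once and keeps
   its value between calls. */
int next_ticket(void) {
    static int ticket = 0;
    ticket += 1;
    file_private_total += ticket;
    return ticket;
}

/* Accessor so that other files can read the total without being able
   to name the variable itself. */
int total_so_far(void) {
    return file_private_total;
}
```

Calling next_ticket() twice returns 1 and then 2, because ticket is not re-initialized on the second call.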
You don't see a call to alloc/init because the @"..." construct creates a constant string in memory (via the compiler).
In this context, static means that the variable cannot be accessed out of the file in which it is defined.
For the part of NSString alloc, init:
First, I think it can be thought of as a convenience, but it is not exactly the same as [[NSString alloc] init].
I found a useful link here. You can take a look at that
NSString and shortcuts
For the part of static and #define:
A static variable in a class can be accessed from any instance of the class, and you can change its value. Inside a function, static means the variable's value is preserved between function calls.
#define lets you define macro constants, which avoid magic numbers and strings, as well as function-like macros. For example, after #define MAX_NUMBER 100 you can write int a[MAX_NUMBER]. When the code is compiled, the preprocessor replaces this with int a[100].
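As a rough illustration of that textual replacement, here is a small plain C sketch (also valid Objective-C). The function name capacity is an assumption added for the example.

```c
#include <stddef.h>

/* The preprocessor replaces every MAX_NUMBER with 100 before the
   compiler proper ever sees the code. */
#define MAX_NUMBER 100

/* After preprocessing, this line reads: static int a[100]; */
static int a[MAX_NUMBER];

/* Hypothetical helper: the number of elements in a, computed from the
   array's size rather than from the macro. */
size_t capacity(void) {
    return sizeof a / sizeof a[0];
}
```

MAX_NUMBER itself is not a variable: it has no address and no type, which is the main difference from a static or const declaration.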
It's a special init case for NSString which simply points the NSString pointer to an instance allocated and inited at startup (or maybe lazily, I'm not sure.) There is one of these NSString instances created in this fashion for each unique @"" you use in your program.
Also I think this is true even if you don't use the static keyword. Furthermore I think all other NSStrings initialized with this string will point to the same instance (not a problem because they are immutable.)
It's not the same as a #define, because you actually have an NSString variable by creating the string with the = @"whatever" initialization. It seems more equivalent to C's const char* somestr = "blah blah blah".
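The C analogy from the last answer could look like the sketch below. It is plain C rather than Objective-C, and current_label is a hypothetical name. It only shows that a static pointer initialized from a literal is a real variable, and that the literal it points to stays valid for the whole program.

```c
/* Hypothetical function; plain C, which is also valid Objective-C. */
const char *current_label(void) {
    /* The literal lives in static storage. label is a real pointer
       variable that is initialized once and points at it, so no
       allocation happens at run time. */
    static const char *label = "This is a string.";
    return label;
}
```

Every call returns the same pointer, because label is initialized once and never reassigned.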

What's the difference between a string constant and a string literal?

I'm learning objective-C and Cocoa and have come across this statement:
The Cocoa frameworks expect that global string constants rather than string literals are used for dictionary keys, notification and exception names, and some method parameters that take strings.
I've only worked in higher level languages so have never had to consider the details of strings that much. What's the difference between a string constant and string literal?
In Objective-C, the syntax @"foo" is an immutable, literal instance of NSString. It does not make a constant string from a string literal as Mike assumes.
Objective-C compilers typically do intern literal strings within compilation units — that is, they coalesce multiple uses of the same literal string — and it's possible for the linker to do additional interning across the compilation units that are directly linked into a single binary. (Since Cocoa distinguishes between mutable and immutable strings, and literal strings are always also immutable, this can be straightforward and safe.)
Constant strings on the other hand are typically declared and defined using syntax like this:
// MyExample.h - declaration, other code references this
extern NSString * const MyExampleNotification;
// MyExample.m - definition, compiled for other code to reference
NSString * const MyExampleNotification = @"MyExampleNotification";
The point of the syntactic exercise here is that you can make uses of the string efficient by ensuring that there's only one instance of that string in use even across multiple frameworks (shared libraries) in the same address space. (The placement of the const keyword matters; it guarantees that the pointer itself is constant.)
While burning memory isn't as big a deal as it may have been in the days of 25MHz 68030 workstations with 8MB of RAM, comparing strings for equality can take time. Ensuring that most of the time strings that are equal will also be pointer-equal helps.
Say, for example, you want to subscribe to notifications from an object by name. If you use non-constant strings for the names, the NSNotificationCenter posting the notification could wind up doing a lot of byte-by-byte string comparisons when determining who is interested in it. If most of these comparisons are short-circuited because the strings being compared have the same pointer, that can be a big win.
Some definitions
A literal is a value, which is immutable by definition. eg: 10
A constant is a read-only variable or pointer. eg: const int age = 10;
A string literal is an expression like @"". The compiler will replace this with an instance of NSString.
A string constant is a read-only pointer to NSString. eg: NSString *const name = @"John";
Some comments on the last line:
That's a constant pointer, not a constant object [1]. objc_msgSend [2] doesn't care if you qualify the object with const. If you want an immutable object, you have to code that immutability inside the object [3].
All @"" expressions are indeed immutable. They are replaced [4] at compile time with instances of NSConstantString, which is a specialized subclass of NSString with a fixed memory layout [5]. This also explains why NSString is the only object that can be initialized at compile time [6].
A constant string would be const NSString* name = @"John"; which is equivalent to NSString const* name = @"John";. Here, both syntax and programmer intention are wrong: const <object> is ignored, and the NSString instance (NSConstantString) was already immutable.
[1] The keyword const applies to whatever is immediately to its left. If there is nothing to its left, it applies to whatever is immediately to its right.
[2] This is the function that the runtime uses to send all messages in Objective-C, and therefore what you can use to change the state of an object.
[3] Example: in const NSMutableArray *array = [NSMutableArray new]; [array removeAllObjects]; const doesn't prevent the last statement.
[4] The LLVM code that rewrites the expression is RewriteModernObjC::RewriteObjCStringLiteral in RewriteModernObjC.cpp.
[5] To see the NSConstantString definition, cmd+click it in Xcode.
[6] Creating compile time constants for other classes would be easy but it would require the compiler to use a specialized subclass. This would break compatibility with older Objective-C versions.
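The rule about where const binds can be tried out in plain C (which is also valid Objective-C). This is only a sketch: the Counter struct and the function name are made up, and a C struct stands in for an object, so it does not model message sending.

```c
/* Hypothetical type standing in for a mutable object. */
typedef struct { int count; } Counter;

int bump_through_const_pointer(void) {
    Counter c = {0};

    /* const to the right of *: the pointer is constant, the object is not. */
    Counter *const fixed = &c;
    fixed->count += 1;            /* allowed: the pointee is mutable */
    /* fixed = 0; */              /* would not compile: pointer is const */

    /* const to the left of *: the pointee is read-only through this
       pointer, but the pointer itself may be reassigned. */
    const Counter *read_only = &c;
    /* read_only->count = 5; */   /* would not compile: pointee is const */

    return read_only->count;
}
```

NSString *const name corresponds to the first form: the variable cannot be pointed at another string, and nothing is said about the object itself.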
Back to your quote
The Cocoa frameworks expect that global string constants rather than
string literals are used for dictionary keys, notification and
exception names, and some method parameters that take strings. You
should always prefer string constants over string literals when you
have a choice. By using string constants, you enlist the help of the
compiler to check your spelling and thus avoid runtime errors.
It says that literals are error prone. But it doesn't say that they are also slower. Compare:
// string literal
[dic objectForKey:@"a"];
// string constant
NSString *const a = @"a";
[dic objectForKey:a];
In the second case I'm using keys with const pointers, so instead of [a isEqualToString:b], I can do (a==b). The implementation of isEqualToString: compares the hash and then runs the C function strcmp, so it is slower than comparing the pointers directly. Which is why constant strings are better: they are faster to compare and less prone to errors.
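The pointer shortcut can be sketched in plain C (also valid Objective-C): check the pointers first and fall back to a byte-by-byte comparison only when they differ. The names kNameKey and keys_equal are hypothetical, and this is not a claim about how isEqualToString: is implemented.

```c
#include <stdbool.h>
#include <string.h>

/* A single shared key constant, declared once and reused everywhere. */
const char *const kNameKey = "name";

/* Hypothetical comparison helper. */
bool keys_equal(const char *a, const char *b) {
    if (a == b) {
        return true;              /* fast path: same pointer, same string */
    }
    return strcmp(a, b) == 0;     /* slow path: compare the bytes */
}
```

When every caller passes kNameKey, the comparison ends at the pointer check. Strings that are equal but stored elsewhere, such as a copy built at run time, still compare equal through the slow path.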
If you also want your constant string to be global, do it like this:
// header
extern NSString *const name;
// implementation
NSString *const name = @"john";
Let's use C++, since my Objective C is totally non-existent.
If you stash a string into a constant variable:
const std::string mystring = "my string";
Now when you call methods and pass mystring, you're using a string constant:
someMethod(mystring);
Or, you can call those methods with the string literal directly:
someMethod("my string");
The reason, presumably, that they encourage you to use string constants is because Objective C doesn't do "interning"; that is, when you use the same string literal in several places, it's actually a different pointer pointing to a separate copy of the string.
For dictionary keys, this makes a huge difference, because if I can see the two pointers are pointing to the same thing, that's much cheaper than having to do a whole string comparison to make sure the strings have equal value.
Edit: Mike, in C# strings are immutable, and literal strings with identical values all end pointing at the same string value. I imagine that's true for other languages as well that have immutable strings. In Ruby, which has mutable strings, they offer a new data-type: symbols ("foo" vs. :foo, where the former is a mutable string, and the latter is an immutable identifier often used for Hash keys).