Why shouldn't you use objc_msgSend() in Objective C? - objective-c

The Objective-C Runtime Guide from Apple states that you should never use objc_msgSend() in your own code and recommends using methodForSelector: instead. However, it doesn't provide any reason for this.
What are the dangers of calling objc_msgSend() in your code?

Reason #1: Bad style - it's redundant and unreadable.
The compiler automatically generates calls to objc_msgSend() (or some variant thereof) when it encounters Objective-C messaging expressions. If you know the class and the selector to be sent at compile-time, there's no reason to write
id obj = objc_msgSend(objc_msgSend([NSObject class], @selector(alloc)), @selector(init));
instead of
id obj = [[NSObject alloc] init];
Even if you don't know the class or the selector (or even both), it's still safer (at least the compiler has a chance to warn you if you are doing something potentially nasty/wrong) to obtain a correctly typed function pointer to the implementation itself and use that function pointer instead:
const char *(*fptr)(NSString *, SEL) = (const char *(*)(NSString *, SEL))[NSString instanceMethodForSelector:@selector(UTF8String)];
const char *cstr = fptr(@"Foo", @selector(UTF8String));
This is especially true when the types of the arguments of a method are sensitive to default promotions - if they are, then you don't want to pass them through the variadic arguments objc_msgSend() takes, because your program will quickly invoke undefined behavior.
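To see what "sensitive to default promotions" means without involving the runtime at all, here is a minimal plain-C sketch. The names sum_promoted and sum_typed are made up for illustration. A float passed through "..." is silently promoted to double by the caller, so the callee has to read it back as double, whereas a fully typed prototype passes the float as-is.

```c
#include <stdarg.h>

/* Hypothetical helper: sums `count` variadic arguments.
   Arguments passed through "..." undergo the default argument promotions,
   so a float arrives as a double and must be read back as a double. */
double sum_promoted(int count, ...)
{
    va_list ap;
    double total = 0.0;

    va_start(ap, count);
    for (int i = 0; i < count; i++) {
        total += va_arg(ap, double);   /* va_arg(ap, float) would be undefined behavior */
    }
    va_end(ap);
    return total;
}

/* The same addition behind a fully typed prototype: the compiler knows the
   parameter types, so the floats are passed without promotion. */
float sum_typed(float a, float b)
{
    return a + b;
}
```

A method implementation that expects a real float but is reached through a variadic call site is in the position of a callee reading the wrong type, which is why the typed function pointer is the safer route.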
Reason #2: dangerous and error-prone.
Notice the "or some variant thereof" part in #1. Not all message sends use the objc_msgSend() function itself. Due to complications and requirements in the ABI (in the calling convention of functions, in particular), there are separate functions for returning, for example, floating-point values or structures. For example, in the case of a method that performs some sort of searching (substrings, etc.), and it returns an NSRange structure, depending on the platform, it may be necessary to use the structure-returning version of the messenger function:
NSRange retval;
objc_msgSend_stret(&retval, @"FooBar", @selector(rangeOfString:), @"Bar");
And if you get this wrong (e.g. you use the wrong messenger function, or you mix up the pointers to the return value and to self), your program will likely behave incorrectly and/or crash. (And you will most probably get it wrong, because it's not even that simple: not all methods returning a struct use this variant, since small structures fit into one or two processor registers, which eliminates the need to use the stack for the return value. That's why, unless you are a hardcore ABI hacker, you would rather let the compiler do its job; otherwise, here be dragons.)

You ask "what are the dangers?" and @H2CO3 has listed some ending with "unless you are a hardcore ABI hacker"...
As with many rules there are exceptions (and possibly a few more under ARC). So your reasoning for using msgSend should go something along the lines of:
[1] I think I should use msgSend - don't.
[2] But I've a case here... - you probably haven't, keep looking for another solution.
...
[10] I really think I should use it here - think again.
...
[100] Really, this looks like a case for msgSend, I can't see any other solution! OK, go read Document.m in the TextEdit code sample from Apple. Do you know why they used msgSend? Are you sure... think again...
...
[1000] I understand why Apple used it, and my case is similar... You've found and understood the exception that proves the rule and your case matches, use it!
HTH

I can make a case. We used msgSend in one of our C++ files (before we switched to ARC) in a cross-platform project (Windows, Mac and Linux). We use it to ref count a reference in the backend (the shared code) that's used later to go from frontend to backend and vice versa. Very special case, admittedly.

Related

Is a variable silently declared for you by the compiler/runtime when you don't declare one?

When I have a method with just a return statement and a value:
-(id)doSomethingCool
{
return [someArray objectAtIndex:2];
}
... is the compiler (or runtime) actually adding an intermediate variable behind the scenes:
-(id)doSomethingCool
{
id someObject = [someArray objectAtIndex:2];
return someObject;
}
I'm guessing at the assembly level it might be doing something like this?
I realize this is an obscure and probably performance-insignificant issue for 99% of applications, but I'm still curious what actually happens behind the curtains in Objective-C if anyone knows.
As an aside, is the only reason people do the first technique just for shorthand convenience, even if over tens of millions of iterations it would be no different had they done it the second way?
Conceptually, that's basically what happens. The value returned from the function is a temporary value. It's actually a copy of whatever value you're returning, which exists until the expression that the method call is used in finishes.
In practice, when you compile with optimizations turned on (in Release mode), the two examples you give will generate identical object code. The difference between the two is largely just down to style, though explicitly storing values in local variables can be useful in debugging.

What is the biggest advantage of using pointers in ObjectiveC

I realize 99% of you think "what the h***…" But please help me to get my head around this concept of using pointers. I'm sure my specific question would help lots of newbies.
I understand what pointers ARE and that they are a reference to an address in memory and that by using the (*) operator you can get the value in that address.
Let's say:
int counter = 10;
int *somePointer = &counter;
Now I have the address in memory of counter, and I can indirectly point to its value by doing this:
int x = *somePointer;
Which makes x = 10, right?
But this is the most basic example, and for this case I could use int x = counter; and get that value, so please explain why pointers really are such an important thing in Objective-C and some other languages... in what case would only a pointer make sense?
Appreciate it.
Objective-C has pointers because it is an evolution of C, which used pointers extensively. The advantage of a pointer in an object-oriented language like Objective-C is that after you create an object, you can pass around a pointer to the object instead of passing around the object itself. In other words, if you have some object that takes up a large amount of storage space, passing around a pointer is a lot more memory-efficient than passing around a copy of the object itself. This may not be noticeable in simple cases when you’re only dealing with primitive types like ints, but when you start dealing with more complex objects the memory and time savings are enormous.
More importantly, pointers make it much easier for different parts of your code to talk to each other. If variables could only be passed to functions “by value” instead of “by reference” (which is what happens when you use pointers), then functions could never alter their inputs. They could only change the state of your program by either returning a value or by changing a global variable—the overuse of which generally leads to sloppy, unorganized code.
Here’s a concrete example. Suppose you have an Objective-C method that will parse a JSON string and return an NSDictionary:
+ (NSDictionary *)parseJsonString:(NSString *)json
error:(NSError **)error;
The method will do the parsing and return an NSDictionary if everything goes okay. But what if there’s some problem with the input string? We want a way to indicate to the user (or at least to the programmer) what happened, so we have a pointer to a pointer to an NSError, which will contain that information. If our method fails (probably returning nil), we can dereference the error parameter to see what went wrong. What we’ve effectively done is to give our method two different kinds of return values: usually, it will return an NSDictionary, but it could also return an NSError.
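The same "second return value" trick works in plain C. Below is a rough sketch, not the JSON method above: parse_long is a hypothetical function that returns a success flag, writes the result through one pointer, and reports failures through a pointer to a pointer, much like the NSError ** parameter.

```c
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical C analogue of an NSError ** parameter.
   Returns 1 on success and stores the number in *value.
   Returns 0 on failure; if the caller supplied `error`, *error is pointed
   at a static description of what went wrong and *value is left untouched. */
int parse_long(const char *text, long *value, const char **error)
{
    char *end = NULL;
    long parsed;

    errno = 0;
    parsed = strtol(text, &end, 10);

    if (end == text || *end != '\0') {
        if (error != NULL) *error = "input is not a whole number";
        return 0;
    }
    if (errno == ERANGE) {
        if (error != NULL) *error = "number is out of range";
        return 0;
    }

    *value = parsed;
    return 1;
}
```

A caller that doesn't care about the reason can pass NULL for error, which mirrors the usual Cocoa convention of passing NULL for an NSError ** you don't need.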
If you want to read more about this, you may have better luck searching for “pointers in C” rather than “pointers in Objective-C”; pointers are of course used extensively in Objective-C, but all of the underlying machinery is identical to that of C itself.
What is the biggest advantage of using pointers in ObjectiveC
I'd say the biggest advantage is that you can use Objective-C at all - all Objective-C objects are accessed using pointers (the compiler and the runtime won't let you create objects statically), so you wouldn't get any further without them...
Item:
What if I told you to write me a program that would maintain a set of counters, but the number of counters would be entered by the user when he started the program. We code this with an array of integers allocated on the heap.
int *counters = malloc(numOfCounters * sizeof(int));
Malloc works with memory directly, so it by nature returns a pointer. All Objective-C objects are heap-allocated with malloc, so these are always pointers.
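A slightly fuller sketch of the counters idea, under a couple of assumptions: make_counters and bump_counter are hypothetical names, and calloc is swapped in for malloc so that every counter starts at zero.

```c
#include <stddef.h>
#include <stdlib.h>

/* Allocates `count` counters on the heap, all starting at zero.
   Returns NULL if count is 0 or the allocation fails.
   The caller releases the array with free(). */
int *make_counters(size_t count)
{
    if (count == 0) return NULL;
    return calloc(count, sizeof(int));
}

/* Increments one counter and returns its new value,
   or -1 if the array is missing or the index is out of range. */
int bump_counter(int *counters, size_t count, size_t index)
{
    if (counters == NULL || index >= count) return -1;
    return ++counters[index];
}
```

The number of counters is only known at run time, so the only handle the program ever has on them is the pointer that the allocator returned.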
Item:
What if I told you to write me a function that read a file, and then ran another function when it was done. However, this other function was unknown and would be added by other people, people I didn't even know.
For this we have the "callback". You'd write a function that looked like this:
int ReadAndCallBack(FILE *fileToRead, int numBytes, int whence, void(*callback)(char *));
That last argument is a pointer to a function. When someone calls the function you've written, they do something like this:
void MyDataFunction(char *dataToProcess);
ReadAndCallBack(myFile, 1024, 0, MyDataFunction);
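For the curious, here is one way the body of such a function might look. This is a simplified sketch rather than the function declared above: read_and_call_back drops the whence parameter, and remember_length is a made-up callback that only records how much data it was given.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Reads up to max_bytes from `file` into a NUL-terminated buffer, then hands
   the buffer to `callback`. Returns 0 on success and -1 on failure.
   The buffer is only valid for the duration of the callback. */
int read_and_call_back(FILE *file, size_t max_bytes, void (*callback)(char *))
{
    char *buffer;
    size_t got;

    if (file == NULL || callback == NULL) return -1;

    buffer = malloc(max_bytes + 1);
    if (buffer == NULL) return -1;

    got = fread(buffer, 1, max_bytes, file);
    if (ferror(file)) {
        free(buffer);
        return -1;
    }

    buffer[got] = '\0';
    callback(buffer);   /* the code behind this pointer is unknown to us */
    free(buffer);
    return 0;
}

/* Example callback written by "someone else": remembers the data length. */
static size_t last_length = 0;

void remember_length(char *data)
{
    last_length = strlen(data);
}
```

A caller would write read_and_call_back(myFile, 1024, remember_length); the reading code never needs to know what the callback does with the data.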
Item:
Passing a pointer as a function argument is the most common way of returning multiple values from a function. In the Carbon libraries on OSX, almost all of the library functions return an error status, which poses a problem if a library function has to return something useful to the programmer. So you pass the address where you'd like the function to hand information back to you...
int size = 0;
int error = GetFileSize(afilePath,&size);
If the function call returns an error, the code is stored in error; if there was no error, error will probably be zero and size will contain what we need.
The biggest advantage of pointers in Objective-C, or in any language with dynamic allocation, is that your program can handle more items than the names that you invent in your source code.

Objective-C's "obj performSelector" vs objc_msgSend( )?

Going through Apache Cordova's source code, I ran into two lines of code that I'm puzzled about:
//[obj performSelector:normalSelector withObject:command];
objc_msgSend(obj,normalSelector,command);
From Apple's documentation, there doesn't seem to be a lot of difference between these two methods.
id objc_msgSend(id theReceiver, SEL theSelector, ...)
Sends a message with a simple return value to an instance of a class.
- (id)performSelector:(SEL)aSelector withObject:(id)anObject
Sends a message to the receiver with an object as the argument. (required)
What exactly is the difference between these two methods? In the case above, both are sending messages with an object as an argument to a receiving object.
You're asking the difference between two "methods" but only one of them is actually a method. The objc_msgSend function is, well, a function. Not a method.
The objc_msgSend function is the function that you actually call when you invoke any method on any object in Objective C. For example, the following two are basically equivalent:
// This is what the compiler generates
objc_msgSend(obj, @selector(sel:), param);
// This is what you write
[obj sel:param];
// You can check the assembly output, they are *almost* identical!
The major difference here is that objc_msgSend does not get type checked by the compiler -- or at least, its arguments don't get type checked against the selector's parameter types. So the following are roughly equivalent:
[obj performSelector:normalSelector withObject:command];
objc_msgSend(obj, @selector(performSelector:withObject:),
normalSelector, command);
But, that's a bit of a waste, since all performSelector:withObject: does is call objc_msgSend.
HOWEVER: You should stay away from objc_msgSend because it is not type-safe, as mentioned above. All the Apache devs are doing is removing a single method call, which will give you only a very slight performance benefit in most cases.
The commented out line is correct, the objc_msgSend() line is incorrect in that it needs to be explicitly typed (varargs are not compatible with non-varargs function calls on some platforms sometimes).
Effectively they do the same thing. Really, the method call version is just a wrapper around objc_msgSend().

Objective-C String Differences

What's the difference between NSString *myString = @"Johnny Appleseed" versus NSString *myString = [NSString stringWithString: @"Johnny Appleseed"]?
Where's a good case to use either one?
The other answers here are correct. A case where you would use +stringWithString: is to obtain an immutable copy of a string which might be mutable.
In the first case, you are getting a pointer to a constant NSString. As long as your program runs, myString will be a valid pointer. In the second, you are creating an autoreleased NSString object with a constant string as a template. In that case, myString won't point to a real object anymore after the current run loop ends.
Edit: As many people have noted, the normal implementation of stringWithString: just returns a pointer to the constant string, so under normal circumstances, your two examples are exactly the same. There is a bit of a subtle difference in that Objective-C allows methods of a class to be replaced using categories and allows whole classes to be replaced with class_poseAs. In those cases, you might run into a non-default implementation of stringWithString:, which may have different semantics than you expect it to. Just because it happens to be that the default implementation does the same thing as a simple assignment doesn't mean that you should rely on subtle implementation-specific behaviour in your program - use the right case for the particular job you're trying to do.
Other than syntax and a very, very minor difference in performance, nothing. They both produce the exact same pointer to the exact same object.
Use the first example. It's easier to read.
In practice, nothing. You wouldn't ever use the second form, really, unless you had some special reason to. And I can't think of any right now.
(See Carl's answer for the technical difference.)

using objc_msgSend to call a Objective C function with named arguments

I want to add scripting support for an Objective-C project using the objc runtime. Now I face the problem, that I don't have a clue, how I should call an Objective-C method which takes several named arguments.
So for example the following objective-c call
[object foo:bar];
could be called from C with:
objc_msgSend(object, sel_getUid("foo:"), bar);
But how would I do something similar for the method call:
[object foo:var bar:var2 err:errVar];
??
Best Markus
The accepted answer is close, but it won't work properly for certain types. For example, if the method is declared to take a float as its second argument, this won't work.
To properly use objc_msgSend, you have to cast it to the appropriate type. For example, if your method is declared as
- (void)foo:(id)foo bar:(float)bar err:(NSError **)err
then you would need to do something like this:
void (*objc_msgSendTyped)(id self, SEL _cmd, id foo, float bar, NSError **error) = (void *)objc_msgSend;
objc_msgSendTyped(self, @selector(foo:bar:err:), foo, bar, error);
Try the above case with just objc_msgSend, and log out the received arguments. You won't see the correct values in the called function. This unusual casting situation arises because objc_msgSend is not intended to be called like a normal C function. It is (and must be) implemented in assembly, and just jumps to a target C function after fiddling with a few registers. In particular, there is no consistent way to refer to any argument past the first two from within objc_msgSend.
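The "cast to the exact signature before calling" rule isn't specific to the runtime. Here is a small plain-C analogy, with made-up names (generic_fn, scale, call_scale): a function pointer may be stored under a generic type, but it has to be converted back to its real type before it is called.

```c
/* Generic function-pointer type used for storage only, never for calling. */
typedef void (*generic_fn)(void);

/* The real target: takes an int and a float, returns a float. */
float scale(int count, float factor)
{
    return (float)count * factor;
}

/* Converts the stored pointer back to the target's true signature and calls
   it, so the arguments are passed exactly the way the callee expects them. */
float call_scale(generic_fn stored, int count, float factor)
{
    float (*typed)(int, float) = (float (*)(int, float))stored;
    return typed(count, factor);
}
```

Calling through generic_fn directly would be undefined behavior, in the same way that calling an uncast objc_msgSend with a float argument hands the method something other than what it was compiled to receive.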
Another case where just calling objc_msgSend straight wouldn't work is a method that returns an NSRect, say, because objc_msgSend is not used in that case, objc_msgSend_stret is. In the underlying C function for a method that returns an NSRect, the first argument is actually a pointer to an out value NSRect, and the function itself actually returns void. You must match this convention when calling because it's what the called method will assume. Further, the circumstances in which objc_msgSend_stret is used differ between architectures. There is also an objc_msgSend_fpret, which should be used for methods that return certain floating point types on certain architectures.
Now, since you're trying to do a scripting bridge thing, you probably cannot explicitly cast every case you run across, you want a general solution. All in all, this is not completely trivial, and unfortunately your code has to be specialized to each architecture you wish to target (e.g. i386, x86_64, ppc). Your best bet is probably to see how PyObjC does it. You'll also want to take a look at libffi. It's probably a good idea to understand a little bit more about how parameters are passed in C, which you can read about in the Mac OS X ABI Guide. Last, Greg Parker, who works on the objc runtime, has written a bunch of very nice posts on objc internals.
objc_msgSend(object, sel_getUid("foo:bar:err:"), var, var2, errVar);
If one of the variables is a float, you need to use @Ken's method, or cheat by a reinterpret-cast:
objc_msgSend(..., *(int*)&var, ...)
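A side note on that cheat: reading a float through an int pointer is something the C aliasing rules don't permit, so an optimizer is free to break it. If you really need the raw bits, a memcpy-based sketch like the following is the well-defined way to get them. It assumes a 32-bit float, and float_bits is a hypothetical helper name.

```c
#include <stdint.h>
#include <string.h>

/* Assumption: float is 32 bits wide, as on the platforms discussed here. */
_Static_assert(sizeof(float) == sizeof(uint32_t), "sketch assumes a 32-bit float");

/* Returns the bit pattern of `value` without reading it through a pointer of
   another type; compilers typically turn the memcpy into a register move. */
uint32_t float_bits(float value)
{
    uint32_t bits;
    memcpy(&bits, &value, sizeof bits);
    return bits;
}
```

This only makes the bit extraction well-defined; whether passing those bits in place of a float works at all still depends on the calling convention of the target architecture.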
Also, if the selector returns a float, you may need to use objc_msgSend_fpret, and if it returns a struct you must use objc_msgSend_stret. If that is a call to superclass you need to use objc_msgSendSuper2.
objc_msgSend(obj, @selector(foo:bar:err:), var, var2, &errVar);