Why does method_getNumberOfArguments return two more results than the selector would imply? - objective-c

In the Objective-C runtime, why does method_getNumberOfArguments return two more than the selector would imply?
For example, why does a method with the selector @selector(initWithPrice:color:) report 4 arguments?

TL;DR
To set the record straight: yes, the first two arguments to any Objective-C method are self and _cmd, always in that order.
A brief history of Objective-C
However, the more interesting subject is the why. To get there, we must first look at the history of Objective-C.
Way back in 1983, Brad Cox, the 'God' of Objective-C, wanted to create an object-oriented, runtime-based language on top of C, for good performance and flexibility across platforms. As a result, the very first Objective-C 'compilers' were just simple preprocessors that converted Objective-C source to its C-runtime equivalent, which was then compiled with the platform-specific C compiler.
However, C was not designed for objects, and that was the most fundamental thing Objective-C had to surmount. While C is a robust and flexible language, runtime support is one of its critical shortcomings.
During the very early design phase of Objective-C, it was decided that objects would be a purely heap-based pointer design, so that they could be passed between any functions without weird copy semantics (this changed a bit with Obj-C++ and ARC, but that's too wide a scope for this post). It was also decided that every method should be self-aware (actually, as bbum points out, it was an optimization for using the same stack frame as the original function call), so that you could, in theory, have multiple selectors mapped to the same implementation, as follows:
// this is a completely valid objc 1.0 method declaration
void *nameOrAge(id self, SEL _cmd) {
    if (_cmd == @selector(name)) {
        return "Richard";
    }
    if (_cmd == @selector(age)) {
        return (void *) (intptr_t) 16;
    }
    return NULL;
}
This function could then, in theory, be mapped to two selectors, name and age, and run different code depending on which one was invoked. In everyday Objective-C code this rarely comes up, as it is quite difficult under ARC to map functions to selectors, due to casting and such, but the language has evolved quite a bit since then.
Hopefully, that helps you to understand the why behind the two 'invisible' arguments to an Objective-C method, with the first one being the object that was invoked, and the second one being the method that was invoked on that object.

The first two arguments are the hidden arguments self and _cmd.
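The arithmetic behind the question can be reproduced by hand: each colon in a selector name marks one explicit argument, and self and _cmd are counted on top of that. Below is a minimal sketch in plain C, assuming an ordinarily compiled method. The helper name expected_argument_count is hypothetical and is not part of the Objective-C runtime.

```c
/* Hypothetical helper, not a runtime function: predicts the count that
   method_getNumberOfArguments would report for an ordinary method with
   the given selector name. */
unsigned expected_argument_count(const char *selector_name) {
    unsigned count = 2;  /* the hidden self and _cmd arguments */
    for (const char *p = selector_name; *p != '\0'; ++p) {
        if (*p == ':') {
            ++count;     /* each colon introduces one explicit argument */
        }
    }
    return count;
}
```

With this sketch, "initWithPrice:color:" has two colons, so the expected count is 2 + 2 = 4, while a selector with no colons such as "name" gives 2.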

Related

Return type differs in implementation for objective C protocols

I have a protocol that has a method returning NSArray*.
In the implementation I had made the return type of that method to be NSView*
I see this is happening only in case of Objective C class pointers and not in other cases like returning void vs returning int.
I would expect a complier warning at the minimum but the compilation happens just fine.
@protocol prot <NSObject>
-(NSArray*)array;
@end
@interface impl : NSObject<prot>
@end
@implementation impl
//Should return NSArray. Returns NSView instead
- (NSView *)array
{
    return nil;
}
@end
First things first:
impl should be Implementation since class names are written in upper camel case and abbreviations are bad(TM). Moreover, Class is a class pointer, NSView* and NSArray* are instance pointers.
To your question, even though I'm a bit tired of this discussion (dynamic vs. static typing, early vs. late binding):
A: Why should the compiler warn? Both are instance pointers and maybe the messages sent to the object are supported by both. The compiler does not care about binding, it is done at runtime.
B: But this is very unsafe!
A: Did you ever ship code with such an error?
B: No. But it is unsafe by theory.
A: Yes, that's true for all theories that ship code without running it at least one time.
B: But you have to admit, that this is more unsafe than type checking at compile time.
A: Yes, theoretically that's true.
B: So why do you support it?
A: Because there are many situations in which dynamic typing has advantages. For example, it is very easy to write generic code without having templates. (Even if they are sometimes called generics, they are still silly templates.) It is very easy to hand off responsibility, which needs contra-conceptual extensions in other languages (signals & slots in C++, delegates in C#, …). It is very easy to create stand-in objects for lowering memory pressure. It is very easy to write an ORIM. Shall I continue?
B: Yes
A: It is so flexible that you can write a whole AOP framework within the language. It is so flexible that you can write a prototype-based framework within the language.
However, sometimes it is easy to detect for the compiler that something makes no sense at all. And sometimes the compiler warns about that. But in many cases the compiler is not more intelligent than the developer.
Agreed that it should generate a warning, but it doesn't. Part of the issue is that all ObjC objects are id at runtime, which is why you're seeing different behavior for int (which isn't id). But that's not really an excuse. It's a limitation of the compiler. There are numerous places where it doesn't do a good job of distinguishing between ObjC object types. ObjC objects are duck-typed, so as long as they respond to the right messages "they work."
Sometimes this is a benefit; for example, NSArray is actually a class cluster, and there are several (private) types that pretend to be NSArray by just implementing the same interface. That's something that is easy in ObjC, but hard in Swift. Still no excuse, since it would be easy to get that benefit without this frustrating lack of a compiler warning, but it gets back to how ObjC thinks about class types.
This limitation is fixed in Swift, which is another benefit of moving over, but that doesn't really help you, I know.

Can Foundation tell me whether an Objective-C method requires a special structure return?

Background as I understand it: Objective-C method invocations are basically a C function call with two hidden parameters (the receiver and the selector). The Objective-C runtime contains a function named objc_msgSend() that lets you invoke methods that way. Unfortunately, when a function returns a struct, some special treatment may be needed. There are arcane (some might say insane) rules that govern whether the structure is returned like other values or whether it's actually returned by reference in a hidden first argument. For Objective-C there's another function called objc_msgSend_stret() that must be used in these cases.
The question: Given a method, can NSMethodSignature or something else tell me whether I have to use objc_msgSend() or objc_msgSend_stret()? So far we have found out that NSMethodSignature knows this, it prints it in its debug output, but there doesn't seem to be a public API.
In case you want to respond with "why on earth would you want to do that?!", please read the following before you do: https://github.com/erikdoe/ocmock/pull/41
Objective-C uses the same underlying ABI for C on a given architecture, because methods are just C functions with implicit self and _cmd arguments.
In other words, if you have a method:
- (SomeStructType)myMeth:(SomeArgType)arg;
then really this is a plain C function:
SomeStructType myMeth(id self, SEL _cmd, SomeArgType arg);
I'm pretty sure you already know that, but I'm merely mentioning it for other readers.
In other words, you want to ask libffi or any kind of similar library how SomeStructType would be returned for that architecture.
NSMethodSignature has a -methodReturnType that you can inspect to see if the return type is a struct. Is this what you're trying to do?
From http://www.sealiesoftware.com/blog/archive/2008/10/30/objc_explain_objc_msgSend_stret.html:
The rules for which struct types return in registers are always
arcane, sometimes insane. ppc32 is trivial: structs never return in
registers. i386 is straightforward: structs with sizeof exactly equal
to 1, 2, 4, or 8 return in registers. x86_64 is more complicated,
including rules for returning floating-point struct fields in FPU
registers, and ppc64's rules and exceptions will make your head spin.
The gory details are documented in the Mac OS X ABI Guide, though as
usual if the documentation and the compiler disagree then the
documentation is wrong.
If you're calling objc_msgSend directly and need to know whether to
use objc_msgSend_stret for a particular struct type, I recommend the
empirical approach: write a line of code that calls your method,
compile it on each architecture you care about, and look at the
assembly code to see which dispatch function the compiler uses.
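As an illustration of how narrow these rules are, here is a minimal C sketch of only the i386 rule quoted above (structs whose sizeof is exactly 1, 2, 4, or 8 return in registers). The helper names are hypothetical, nothing here queries the runtime or NSMethodSignature, and the other architectures mentioned in the quote are deliberately not covered.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: encodes only the i386 rule from the quoted post.
   It does not model x86_64, ppc32, or ppc64. */
bool i386_returns_struct_in_registers(size_t struct_size) {
    return struct_size == 1 || struct_size == 2 ||
           struct_size == 4 || struct_size == 8;
}

/* Under that rule, any other struct size is returned through a hidden
   pointer, which is the case where objc_msgSend_stret would be needed. */
bool i386_needs_stret(size_t struct_size) {
    return !i386_returns_struct_in_registers(struct_size);
}
```

Even for this simple case, the quoted advice still applies: checking what the compiler actually emits is more trustworthy than a hand-written rule.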

Objective-C: Checking class type, better to use isKindOfClass, or respondsToSelector?

Is it more appropriate to check a class's type by calling isKindOfClass:, or take the "duck typing" approach by just checking whether it supports the method you're looking for via respondsToSelector: ?
Here's the code I'm thinking of, written both ways:
for (id widget in self.widgets)
{
    [self tryToRefresh:widget];
    // Does this widget have sources? Refresh them, too.
    if ([widget isKindOfClass:[WidgetWithSources class]])
    {
        for (Source* source in [widget sources])
        {
            [self tryToRefresh:source];
        }
    }
}
Alternatively:
for (id widget in self.widgets)
{
    [self tryToRefresh:widget];
    // Does this widget have sources? Refresh them, too.
    if ([widget respondsToSelector:(@selector(sources))])
    {
        for (Source* source in [widget sources])
        {
            [self tryToRefresh:source];
        }
    }
}
It depends on the situation!
My rule of thumb would be, is this just for me, or am I passing it along to someone else?
In your example, respondsToSelector: is fine, since all you need to know is whether you can send the object that message, so you can do something with the result. The class isn't really that important.
On the other hand, if you were going to pass that object to some other piece of code, you don't necessarily know what messages it will be intending to send. In those cases, you would probably be casting the object in order to pass it along, which is probably a clue that you should check to see if it really isKindOfClass: before you cast it.
Another thing to consider is ambiguity; respondsToSelector: tells you an object will respond to a message, but it could generate a false positive if the object returns a different type than you expect. For example, an object that declares a method:
- (int)sources;
would pass the respondsToSelector: test but then fail at runtime (most likely with a crash) when you try to use its return value in a for-in loop.
How likely is that to happen? It depends on your code, how large your project is, how many people are writing code against your API, etc.
It's slightly more idiomatic Objective-C to use respondsToSelector:. Objective-C is highly dynamic, so your design-time assumptions about class structure may not necessarily hold water at run time. respondsToSelector: gets round that by giving you a shortcut to the most common reason for querying the type of an object: whether it performs some operation.
In general where there's ambiguity around a couple of equally appealing choices, go for readability. In this case that means thinking about intent. Do you care if it's specifically a WidgetWithSources, or do you really just care that it has a sources selector? If it's the latter, then use respondsToSelector:. If the former, and it may well be in some cases, then use isKindOfClass. Readability, in this case, means that you're not asking the reader to make the connection between type equivalence of WidgetWithSources and the need to call sources. respondsToSelector: makes that connection for the reader, letting them know what you actually intended. It's a small act of kindness towards your fellow programmer.
Edit: @benzado's answer is nicely congruent.
Good answers from @Tim & @benzado; here is a variation on the theme, with the two previously covered cases first:
If at some point you may have a reference to distinct classes and need to treat them differently, then this is probably a case for isKindOfClass:. For example, a color might be stored in preferences as either an NSData serialization of an NSColor, or as an NSString value with one of the standard names; to obtain the NSColor value in this case, isKindOfClass: on the returned object is probably appropriate.
If you have a reference to a single class but different versions of it over time have supported different methods then consider respondsToSelector: For example, many framework classes add new methods in later versions of the OS and Apple's standard recommendation is to check for these methods using respondsToSelector: (and not an OS version check).
If you have a reference to distinct classes and you are testing if they adhere to some informal protocol then:
If this is code you control you can switch to a formal protocol and then use conformsToProtocol: as your test. This has the advantage of testing for type and not just name; otherwise
If this is code you do not control then use respondsToSelector:, but be aware that this is only testing that a method with the same name exists, not that it takes the same types of arguments.
Checking either might be a warning that you are about to write a hackish solution. The widget already knows its class and its selectors.
So a third option might be to consider refactoring. Moving this logic to a [widget tryToRefresh] may be cleaner and allow future widgets to implement additional behind the scenes logic.

C function vs. Objective-C method?

What is the difference between the two? If I'm writing a program, when would I need this:
void aFunction() {
//do something
}
and when would I need this:
-(void)aMethod {
//do something else
}
Actually, an Objective-C method is just a C function with two arguments always present at the beginning.
This:
-(void)aMethod;
Is exactly equivalent to this:
void function(id self, SEL _cmd);
Objective-C's messaging is such that this:
[someObject aMethod];
Is exactly equivalent to this (almost -- there is a variadic argument ABI issue beyond the scope of this answer):
objc_msgSend(someObject, @selector(aMethod));
objc_msgSend() finds the appropriate implementation of the method (by looking it up on someObject) and then, through the magic of a tail call optimization, jumps to the implementation of the method which, for all intents and purposes, works exactly like a C function call that looks like this:
function(someObject, @selector(aMethod));
Quite literally, Objective-C was originally implemented as nothing but a C preprocessor. Anything you can do in Objective-C could be rewritten as straight C.
Doing so, however, would be a complete pain in the ass and not worth your time beyond the incredibly educational experience of doing so.
In general, you use Objective-C methods when talking to objects and functions when working with straight C goop. Given that pretty much all of Mac OS X and iOS provide Objective-C APIs -- certainly entirely so for the UI level programming entry points -- you use Obj-C most of the time.
Even when writing your own model level code that is relatively standalone, you'll typically use Objective-C simply because it provides a very natural glue between state/data & functionality, a fundamental tenet of object oriented programming.
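The "method is a C function plus a lookup" idea can be sketched in straight C. This is a toy model under loud assumptions: every name below (ToySEL, ToyObject, toy_msgSend, and so on) is hypothetical, selectors are modeled as C strings compared with strcmp, and the lookup is a linear scan. The real objc_msgSend compares unique selector pointers, uses caches, and jumps to the implementation rather than calling it.

```c
#include <stddef.h>
#include <string.h>

/* Toy stand-ins for the runtime's types; all names are hypothetical. */
typedef const char *ToySEL;
typedef struct ToyObject ToyObject;
typedef int (*ToyIMP)(ToyObject *self, ToySEL _cmd);

typedef struct {
    ToySEL name;  /* selector name */
    ToyIMP imp;   /* C function implementing it */
} ToyMethod;

struct ToyObject {
    const ToyMethod *methods;  /* method table of the object's "class" */
    size_t method_count;
    int value;
};

/* "Instance methods": plain C functions that take self and _cmd first. */
static int toy_value(ToyObject *self, ToySEL _cmd) {
    (void)_cmd;
    return self->value;
}

static int toy_doubled_value(ToyObject *self, ToySEL _cmd) {
    (void)_cmd;
    return self->value * 2;
}

static const ToyMethod toy_methods[] = {
    {"value", toy_value},
    {"doubledValue", toy_doubled_value},
};

/* Builds an object that uses the method table above. */
ToyObject toy_object_with_value(int value) {
    ToyObject object = {toy_methods,
                        sizeof toy_methods / sizeof toy_methods[0], value};
    return object;
}

/* Toy message send: find the implementation for _cmd, then call it with
   (self, _cmd). Returns -1 when the selector is not found. */
int toy_msgSend(ToyObject *self, ToySEL _cmd) {
    for (size_t i = 0; i < self->method_count; ++i) {
        if (strcmp(self->methods[i].name, _cmd) == 0) {
            return self->methods[i].imp(self, _cmd);
        }
    }
    return -1;
}
```

In this model, sending "doubledValue" to an object created with toy_object_with_value(21) plays the role of [someObject doubledValue]: the lookup finds toy_doubled_value and calls it with the object and the selector as its first two arguments.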
In Objective-C, each method operates on an object, like
[myObject myFunction]
A C function has the form:
return-type function-name(argument1, argument2, etc) {}
An Objective-C instance method has the form:
-(return-type)methodName:(type)argument1 {}
or for a multi-argument method
-(return-type)methodName:(type)argument1 secondPart:(type)argument2 {}
I always use Objective-C-style methods in Obj-C programming, even though you can still use C-type functions as well.
I suppose the equivalent in C to [myObject myMethod:arg] might be myObject.myMethod(arg)
The first is a freestanding function. The second is an instance method for an Objective-C class. So I guess you would need the second version if you're actually writing a class.

Why must the last part of an Objective-C method name take an argument (when there is more than one part)?

In Objective-C, you can't declare method names where the last component doesn't take an argument. For example, the following is illegal.
-(void)take:(id)theMoney andRun;
-(void)take:(id)yourMedicine andDontComplain;
Why was Objective-C designed this way? Was it just an artifact of Smalltalk that no one saw a need to be rid of?
This limitation makes sense in Smalltalk, since Smalltalk doesn't have delimiters around message invocation, so the final component would be interpreted as a unary message to the last argument. For example, BillyAndBobby take:'$100' andRun would be parsed as BillyAndBobby take:('$100' andRun). This doesn't matter in Objective-C where square brackets are required.
Supporting parameterless selector components wouldn't gain us much in all the usual ways a language is measured, as the method name a programmer picks (e.g. runWith: rather than take:andRun) doesn't affect the functional semantics of a program, nor the expressiveness of the language. Indeed, a program with parameterless components is alpha equivalent to one without. I'm thus not interested in answers that state such a feature isn't necessary (unless that was the stated reasons of the Objective-C designers; does anyone happen to know Brad Cox or Tom Love? Are they here?) or that say how to write method names so the feature isn't needed. The primary benefit is readability and writability (which is like readability, only... you know), as it would mean you could write method names that even more closely resemble natural language sentences. The likes of -(BOOL)applicationShouldTerminateAfterLastWindowClosed:(NSApplication*)theApplication (which Matt Gallagher points out on "Cocoa With Love" is a little bit confusing when you drop the formal parameter) could be named -(BOOL)application:(NSApplication*)theApplication shouldTerminateAfterLastWindowClosed, thus placing the parameter immediately next to the appropriate noun.
Apple's Objective-C runtime (for example) is perfectly capable of handling these kinds of selectors, so why not the compiler? Why not support them in method names as well?
#import <Foundation/Foundation.h>
#import <objc/runtime.h>
@interface Potrzebie : NSObject
-(void)take:(id)thing;
@end
@implementation Potrzebie
+(void)initialize {
    SEL take_andRun = NSSelectorFromString(@"take:andRun");
    IMP take_ = class_getMethodImplementation(self, @selector(take:));
    if (take_) {
        // "v@:@" encodes: void return, id self, SEL _cmd, one id argument
        if (NO == class_addMethod(self, take_andRun, take_, "v@:@")) {
            NSLog(@"Couldn't add selector '%@' to class %s.",
                  NSStringFromSelector(take_andRun),
                  class_getName(self));
        }
    } else {
        NSLog(@"Couldn't find method 'take:'.");
    }
}
-(void)take:(id)thing {
    NSLog(@"-take: (actually %@) %@",NSStringFromSelector(_cmd), thing);
}
@end
int main() {
    NSAutoreleasePool *pool = [[NSAutoreleasePool alloc] init];
    Potrzebie *axolotl=[[Potrzebie alloc] init];
    [axolotl take:@"paradichloroaminobenzaldehyde"];
    [axolotl performSelector:NSSelectorFromString(@"take:andRun")
                  withObject:@"$100"];
    [axolotl release];
    [pool release];
    return 0;
}
This is Brad Cox. My original answer misunderstood the question. I assumed reallyFast was a hardcoded extension to trigger faster messaging, not a kind of syntactic sugar. The real answer is that Smalltalk didn't support it, perhaps because its parser couldn't deal with the (assumed) ambiguity. Although OC's square brackets would remove any ambiguity, I simply didn't think of departing from Smalltalk's keyword structure.
21 years of programming Objective-C and this question has never crossed my mind. Given the language design, the compiler is right and the runtime functions are wrong.
The notion of interleaved arguments with method names has always meant that, if there is at least one argument, the last argument is always the last part of the method invocation syntax.
Without thinking it through terribly much, I'd bet there are some syntactic bugaboos with not enforcing the current pattern. At the least, it would make the compiler harder to write in that any syntax which has optional elements interleaved with expressions is always harder to parse. There might even be an edge case that flat out prevents it. Certainly, Obj-C++ would make it more challenging, but that wasn't integrated with the language until years after the base syntax was already set in stone.
As far as why Objective-C was designed this way, I'd suspect the answer is that the original designers of the language just didn't consider allowing the interleaved syntax to go beyond that last argument.
That is a best guess. I'll ask one of 'em and update my answer when I find out more.
I asked Brad Cox about this and he was very generous in responding in detail (Thanks, Brad!!):
I was focused at that time on
duplicating as much of Smalltalk as
possible in C and doing that as
efficiently as possible. Any spare
cycles went into making ordinary
messaging fast. There was no thought
of a specialized messaging option
("reallyFast?" [bbum: I asked using 'doSomething:withSomething:reallyFast'
as the example]) since ordinary
messages were already as fast as they
could be. This involved hand-tuning
the assembler output of the C
proto-messager, which was such a
portability nightmare that some if not
all of that was later taken out. I do
recall the hand-hacked messager was
very fast; about the cost of two
function calls; one to get into the
messager logic and the rest for doing
method lookups once there.
Static typing enhancements were later
added on top of Smalltalk's pure
dynamic typing by Steve Naroff and
others. I had only limited involvement
in that.
Go read Brad's answer!
Just for your information, the runtime doesn't actually care what a selector looks like: any C string is valid. You could just as well make a selector like "==+===+---__--¨¨¨¨¨^::::::" with no argument and the runtime will accept it; the compiler just can't, or else it would be impossible to parse. There is absolutely no sanity check when it comes to selectors.
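To pin down which selector names are at issue: in method declaration syntax, a name with no colon is fine (a unary selector), but once a name contains a colon it has to end in one. A minimal C sketch of that check follows; the helper name has_parameterless_tail is hypothetical, and it only inspects the string, which is more than the runtime itself does.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper: true when a selector name contains at least one
   colon but its last component has none, e.g. "take:andRun". This is the
   shape that cannot be written in a method declaration, even though the
   runtime will register it without complaint. */
bool has_parameterless_tail(const char *selector_name) {
    size_t length = strlen(selector_name);
    if (length == 0) {
        return false;
    }
    bool has_colon = strchr(selector_name, ':') != NULL;
    return has_colon && selector_name[length - 1] != ':';
}
```

Under this sketch, "take:andRun" is flagged, while "take:" and the unary "name" are not.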
I assume they are not supported in Objective-C because they weren't available in Smalltalk, either. But that has a different reason than you think: they are not needed. What is needed is support for methods with 0, 1, 2, 3, ... arguments. For every number of arguments, there is already a working syntax to call them. Adding any other syntax would just cause unnecessary confusion.
If you wanted multi-word parameterless selectors, why stop with a single extra word? One might then ask that
[axolotl perform selector: Y with object: Y]
also becomes supported (i.e. that a selector is a sequence of words, some with colon and a parameter, and others not). While this would have been possible, I assume that nobody considered it worthwhile.