Can Foundation tell me whether an Objective-C method requires a special structure return? - objective-c

Background as I understand it: an Objective-C method invocation is basically a C function call with two hidden parameters (the receiver and the selector). The Objective-C runtime contains a function named objc_msgSend() that lets you invoke methods that way. Unfortunately, when a function returns a struct, some special treatment may be needed. There are arcane (some might say insane) rules that govern whether the structure is returned like other values or whether it's actually returned by reference through a hidden first argument. For Objective-C there's another function, objc_msgSend_stret(), that must be used in these cases.
The question: Given a method, can NSMethodSignature or something else tell me whether I have to use objc_msgSend() or objc_msgSend_stret()? So far we have found that NSMethodSignature knows this (it prints it in its debug output), but there doesn't seem to be a public API.
In case you want to respond with "why on earth would you want to do that?!", please read the following before you do: https://github.com/erikdoe/ocmock/pull/41

Objective-C uses the same underlying ABI as C on a given architecture, because methods are just C functions with implicit self and _cmd arguments.
In other words, if you have a method:
- (SomeStructType)myMeth:(SomeArgType)arg;
then really this is a plain C function:
SomeStructType myMeth(id self, SEL _cmd, SomeArgType arg);
I'm pretty sure you already know that; I mention it only for other readers.
So what you really want is to ask libffi, or a similar library, how SomeStructType would be returned on that architecture.

NSMethodSignature has a -methodReturnType that you can inspect to see if the return type is a struct. Is this what you're trying to do?

From http://www.sealiesoftware.com/blog/archive/2008/10/30/objc_explain_objc_msgSend_stret.html:
The rules for which struct types return in registers are always
arcane, sometimes insane. ppc32 is trivial: structs never return in
registers. i386 is straightforward: structs with sizeof exactly equal
to 1, 2, 4, or 8 return in registers. x86_64 is more complicated,
including rules for returning floating-point struct fields in FPU
registers, and ppc64's rules and exceptions will make your head spin.
The gory details are documented in the Mac OS X ABI Guide, though as
usual if the documentation and the compiler disagree then the
documentation is wrong.
If you're calling objc_msgSend directly and need to know whether to
use objc_msgSend_stret for a particular struct type, I recommend the
empirical approach: write a line of code that calls your method,
compile it on each architecture you care about, and look at the
assembly code to see which dispatch function the compiler uses.
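As a rough illustration, the two cases the quote states outright (ppc32 and i386) can be written down as a small C predicate. This is only a sketch of the quoted text: the names arch_t and struct_returns_in_registers are made up for this example, x86_64 and ppc64 are deliberately left out because their rules are not spelled out here, and the compiler's actual output remains the authority.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical architecture tags: only the two cases the quote spells out. */
typedef enum { ARCH_PPC32, ARCH_I386 } arch_t;

/*
 * Returns true when, according to the quoted rules, a struct of `size` bytes
 * comes back in registers (so plain objc_msgSend would be used), and false
 * when it is returned through memory (so objc_msgSend_stret would be used).
 */
bool struct_returns_in_registers(arch_t arch, size_t size)
{
    assert(size > 0); /* standard C has no zero-sized structs */

    switch (arch) {
    case ARCH_PPC32:
        return false; /* "structs never return in registers" */
    case ARCH_I386:
        /* "sizeof exactly equal to 1, 2, 4, or 8" */
        return size == 1 || size == 2 || size == 4 || size == 8;
    }
    return false;
}
```

For instance, struct_returns_in_registers(ARCH_I386, sizeof(some_struct)) would be the question to ask for a given struct type on i386; treat the answer as a hint to verify against the generated assembly, as the quote recommends.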

Related

Objective C - What is the difference between IMP and function pointer?

I recently started a project where I need to do method swizzling.
After going through many tutorials I have a question: what is the difference between an IMP (implementation) and a function pointer?
From memory, an IMP is a memory address just like a function pointer, and can be invoked just like an ordinary C function. However, it is guaranteed to use the Objective-C messaging convention, where:
The first argument is the object to operate on (self).
The second argument is _cmd, the selector being invoked. I believe this is there to support dynamic features such as Objective-C message forwarding, where we could wrap the original implementation in a proxy, say to start a transaction or perform a security check, or, for a Cocoa-specific example, add some property-observation cruft, by magic, at run time. While we already have the function signature, it could be helpful in some cases to know "how did I get here?" from the message signature.
Any following arguments are those declared by the method.
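The convention can be modeled in plain C without the runtime. In this sketch, obj_t, sel_t and imp_t are stand-ins invented for the example (they are not the runtime's id, SEL and IMP), and price_with_tax plays the role of a method body. The point it shows is that an untyped implementation pointer has to be cast back to the method's real signature, with self and _cmd in front, before it is called.

```c
#include <assert.h>
#include <stddef.h>

typedef struct { int price; } obj_t; /* stand-in for an object */
typedef const char *sel_t;           /* stand-in for SEL */
typedef void (*imp_t)(void);         /* stand-in for an untyped IMP */

/* Plays the role of -(int)priceWithTax:(int)percent.
   self and _cmd come first, then the declared argument. */
int price_with_tax(obj_t *self, sel_t _cmd, int percent)
{
    (void)_cmd; /* available to the body, unused here */
    return self->price + self->price * percent / 100;
}

/* Cast the untyped pointer to the real signature, then call it
   like any other C function. */
int call_int_method(imp_t imp, obj_t *self, sel_t cmd, int arg)
{
    assert(imp != NULL);
    int (*typed)(obj_t *, sel_t, int) = (int (*)(obj_t *, sel_t, int))imp;
    return typed(self, cmd, arg);
}
```

A call would look like call_int_method((imp_t)price_with_tax, &item, "priceWithTax:", 10). Calling through a pointer type that does not match the function's real signature is undefined behavior in C, which is why the cast goes back to the exact type.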

Why does method_getNumberOfArguments return two more results than the selector would imply?

In the Objective-C runtime, why does method_getNumberOfArguments return two more arguments than the selector would imply?
For example, why does it return 4 for a method whose selector is @selector(initWithPrice:color:)?
TL;DR
Alright. Just to set the record straight: yes, the first two arguments to any Objective-C method are self and _cmd, always in that order.
A brief history of Objective-C
However, the more interesting subject is the why behind this. To get there, we must first look into the history of Objective-C. Without further ado, let's get started.
Way back in 1983, Brad Cox, the 'God' of Objective-C, wanted to create an object-oriented, runtime-based language on top of C, for good performance and flexibility across platforms. As a result, the very first Objective-C 'compilers' were simple preprocessors that converted Objective-C source into its C-runtime equivalent, which was then compiled with the platform-specific C compiler.
However, C was not designed for objects, and that was the most fundamental thing Objective-C had to surmount. While C is a robust and flexible language, runtime support is one of its critical shortcomings.
During the very early design phase of Objective-C, it was decided that objects would be a purely heap-based pointer design, so that they could be passed between any function without weird copy semantics and such (this changed a bit with Obj-C++ and ARC, but that's too wide a scope for this post), and that every method should be self-aware (actually, as bbum points out, it was an optimization for using the same stack frame as the original function call), so that you could have, in theory, multiple selectors mapped to the same implementation, as follows:
// this is a completely valid objc 1.0 method declaration
void *nameOrAge(id self, SEL _cmd) {
    if (_cmd == @selector(name)) {
        return "Richard";
    }
    if (_cmd == @selector(age)) {
        return (void *) (intptr_t) 16;
    }
    return NULL;
}
This function could then, in theory, be mapped to two selectors, name and age, and run different code depending on which one was invoked. In everyday Objective-C code this is not a big deal, as it's quite difficult under ARC to map functions to selectors, due to casting and such, but the language has evolved quite a bit since then.
Hopefully, that helps you to understand the why behind the two 'invisible' arguments to an Objective-C method, with the first one being the object that was invoked, and the second one being the method that was invoked on that object.
The first two arguments are the hidden arguments self and _cmd.
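The arithmetic behind the "two more" can be sketched in C: a selector name carries one colon per explicit argument, and self and _cmd are added on top. The helper names below are invented for this example; this mirrors the counting only and is not a claim about how the runtime itself derives the number.

```c
#include <assert.h>
#include <stddef.h>

/* Counts the explicit arguments implied by a selector name:
   one per ':' character. */
unsigned explicit_argument_count(const char *selector_name)
{
    assert(selector_name != NULL);

    unsigned count = 0;
    for (const char *p = selector_name; *p != '\0'; ++p) {
        if (*p == ':') {
            ++count;
        }
    }
    return count;
}

/* Adds the two hidden arguments, self and _cmd. */
unsigned total_argument_count(const char *selector_name)
{
    return explicit_argument_count(selector_name) + 2;
}
```

With this, total_argument_count("initWithPrice:color:") gives 4, matching the number in the question, and a selector with no colons such as "name" gives 2.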

Is Objective-C converted to C code before compilation?

I know Objective-C is a strict superset of C, so we are writing the same thing.
But when I write
@interface Myintf {}
@end
does it get converted to a C struct, or is the memory layout that the Objective-C compiler prepares for Myintf the same as that of a C struct defined in runtime.h?
And the same question about objc_msgSend.
The Apple documentation says:
In Objective-C, messages aren’t bound to method implementations until runtime. The compiler converts a message expression, into a call on a messaging function, objc_msgSend. This function takes the receiver and the name of the method mentioned in the message—that is, the method selector—as its two principal parameters:
Does it get converted to a C struct
In the old days it used to, but with the modern runtime (64-bit OS X 10.5+ and iOS), I think it's a bit more complicated. It can't be using a traditional struct, because the fragile instance variable problem is solved.
and same question about objc_msgSend
Yes. All method invocations, or more correctly all message sends, are converted into calls to objc_msgSend() (except when super is the receiver, in which case a different C function is used).
Note that early implementations of Objective-C were implemented as a preprocessor and produced C source code as an intermediate step. The modern compiler does not bother with this and goes straight from Objective-C source code to an object code format.
No and no. Both cases ultimately rely on the runtime. In a way, it is converted to use C interfaces, but there is a level of abstraction introduced -- it's not entirely static.
It will help to look at the assembly generated by the compiler to see how this works in more detail.
Given the declaration:
id objc_msgSend(id theReceiver, SEL theSelector, ...);
The compiler inserts a call to objc_msgSend when your implementation calls a method. It is not reduced to a static C function call, but dynamic dispatch -- another level of indirection, if you like to think of it that way.
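To make the "another level of indirection" concrete, here is a toy model of dynamic dispatch in C. Everything in it (toy_object, toy_method, toy_msg_send, selectors as plain strings) is invented for illustration and says nothing about how the real objc_msgSend is implemented; it only shows that the call site names a selector, and the function to run is looked up when the call happens.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct toy_object toy_object;
typedef const char *toy_sel; /* toy selector: just a name */
typedef long (*toy_imp)(toy_object *self, toy_sel _cmd);

typedef struct {
    toy_sel name; /* selector */
    toy_imp imp;  /* implementation it currently maps to */
} toy_method;

struct toy_object {
    const toy_method *methods; /* per-object method table (toy simplification) */
    size_t method_count;
    long age;
};

/* A "method body": receives self and _cmd like any implementation. */
long toy_age(toy_object *self, toy_sel _cmd)
{
    (void)_cmd;
    return self->age;
}

/* Toy message send: find the implementation at call time, then call it. */
long toy_msg_send(toy_object *self, toy_sel cmd)
{
    assert(self != NULL && cmd != NULL);

    for (size_t i = 0; i < self->method_count; ++i) {
        if (strcmp(self->methods[i].name, cmd) == 0) {
            return self->methods[i].imp(self, cmd);
        }
    }
    return 0; /* toy fallback only; the real runtime does not behave this way */
}
```

Because the table is consulted on every send, replacing an entry changes what later sends do without recompiling any call site, which a direct C function call could not offer.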

Objective-C: Is there an -invoke on blocks that takes parameters?

As you may be aware, blocks take -invoke:
void(^foo)() = ^{
    NSLog(@"Do stuff");
};
[foo invoke]; // Logs 'Do stuff'
I would like to do the following:
void(^bar)(int) = ^(int k) {
    NSLog(@"%d", k);
};
[bar invokeWithParameters:7]; // Want it to log '7', but no such instance method
The ordinary argument-less -invoke works on bar, but it prints a nonsense value.
I can't find a direct message of this kind I can send to a block, nor can I find the original documentation that would describe how blocks take -invoke.
Is there a list of messages accepted by blocks?
(Yes, I have tried to use class_copyMethodList to extract a list of methods from the runtime; there appear to be none.)
Edit: Yes, I'm also aware of invoking the block the usual way (bar(7);). What I'm really after is a selector for a method I can feed into library code that doesn't take blocks per se.
You can invoke it like a function:
bar(7);
There's even an example in the documentation that uses exactly the same signature. See Declaring and Using a Block.
The best reference on the behavior of blocks is the Block Language Specification (RTF) document. This mentions certain methods that are supported (copy, retain, etc.) but nothing about an -invoke method.
A block's very definition is the sum total of "messages" that the block can receive, in terms of the calling parameters/ABI.
This is for a couple of reasons:
First, a block is not a function and a block pointer is not a function pointer. They cannot be used interchangeably.
Secondly, the C ABI is such that you have to have a declaration of the function being called when the call site is compiled if the parameters are to be encoded correctly.
The alternative is to use something like NSInvocation, which allows the arguments to be encoded individually, but even that still requires full C ABI knowledge for each individual argument.
Ultimately, if you can compile a call site that has all the parameters, be it an Objective-C method or a function call, to the fidelity necessary to make the compiler happy, you can convert that call site into a call to the block.
I.e. unless you clarify your question a bit, what you are asking for is either already supported or nigh impossible due to the vagaries of the C ABI.

How to implement an IMP function that returns a large struct type determined at run-time?

Background: CamelBones registers Perl classes with the Objective-C runtime.
To do this, every Perl method is registered with the same IMP
function; that function examines its self & _cmd arguments to find
which Perl method to call.
This has worked well enough for several years, for messages that were
dispatched with objc_msgSend. But now I want to add support for
returning floating-point and large struct types from Perl methods.
Floating-point isn't hard; I'll simply write another IMP that returns
double, to handle messages dispatched with objc_msgSend_fpret.
The question is what to do about objc_msgSend_stret. Writing a
separate IMP for every possible struct return type is impractical, for
two reasons: First, because even if I did so only for struct types
that are known at compile-time, that's an absurd number of functions.
And second, because we're talking about a framework that can be linked against any arbitrary Objective-C & Perl code, we don't know all the potential struct types when the framework is being compiled.
What I hope to do is write a single IMP that can handle any return
type that's dispatched via objc_msgSend_stret. Could I write it as
returning void, and taking a pointer argument to a return buffer, like
the old objc_msgSend_stret was declared? Even if that happened to
work for now, could I rely on it continuing to work in the future?
Thanks for any advice - I've been racking my brain on this one. :-)
Update:
Here's the advice I received from one of Apple's runtime engineers, on their objc-language mailing list:
You must write assembly code to handle
this case.
Your suggestion fails on some
architectures, where ABI for "function
returning void with a pointer to a
struct as the first argument" differs
from "function returning a struct".
(On i386, the struct address is popped
from the stack by the caller in one
case and by the callee in the other
case.) That's why the prototype for
objc_msgSend_stret was changed.
The assembly code would capture the
struct return address, smuggle it into
non-struct-return C function call
without disturbing the rest of the
parameters, and then do the right
ABI-specific cleanup on exit (ret $4
on i386). Alternatively, the assembly
code can capture all of the
parameters. The forwarding machinery
does something like this. That code
might be in open-source CoreFoundation
if you want to see what the techniques
look like.
I'll leave this question open, in case someone brainstorms a better idea, but with this coming directly from Apple's own "runtime wrangler," I figure it's probably as authoritative an answer as I'm likely to get. Time to dust off the x86 reference manuals and knock the rust off my assembler-fu, I guess...
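For readers who have not met the distinction the engineer draws, here are the two source-level forms side by side in C. The type big_rect and both function names are hypothetical, chosen only for this illustration. The two functions produce the same data, but per the quoted advice their calling conventions differ on some architectures (the i386 stack cleanup is the example given), so one must not be called through a pointer to the other's type.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical struct, large enough to be a typical memory-return candidate. */
typedef struct { double x, y, width, height; } big_rect;

/* Form 1: "function returning a struct". */
big_rect make_rect_by_value(double side)
{
    big_rect r = { 0.0, 0.0, side, side };
    return r;
}

/* Form 2: "function returning void with a pointer to a struct
   as the first argument". */
void make_rect_into(big_rect *out, double side)
{
    assert(out != NULL);
    out->x = 0.0;
    out->y = 0.0;
    out->width = side;
    out->height = side;
}
```

In C source the difference is invisible at the level of results, which is what makes the single void-returning IMP idea tempting. The mismatch only shows up in the generated code, and that is the part the assembly shim described above has to paper over.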
It seems that the Apple engineer is right: the only way to go is assembly code. Here are some useful pointers for getting started:
From the Objective-C runtime code: the i386 and x86_64 hand-crafted messenger assembly stubs for the various messaging methods.
An SO answer that provides an overview of the dispatching.
An in-depth review of the dispatching mechanism with a line-by-line analysis of the assembly code.
Hope it helps.