How does Objective-C handle method resolution at run-time?

I've read here recently that an Objective-C object is stored on the heap as a struct. The struct contains the object's ivars, inherited ivars, and the isa pointer.
I'm trying to figure out how the run-time finds the code to run when I send a message to this object.
I know there is a class object for each class. Is this also stored on the heap?
I think the way it works is that the run-time gets the isa pointer from the struct and uses it to look up the message on the class object. Is this correct?

In short, every Objective-C instance has a pointer to its class. The class contains an inventory of metadata that includes all the methods that the class implements. When a message is sent to an object -- when a method is called -- the runtime uses the pointer to the class to look up the method by name and call it, if it can be found. If it isn't found, the runtime looks to the superclass (which is a part of each class's metadata) on up the inheritance chain to NSObject. If the method ultimately can't be found, the runtime goes through a series of last-ditch efforts to see if there is an alternative handler and, if not, eventually raises an exception.
If you want more detail than that, I wrote up a multipart tour of exactly how Objective-C method dispatch works. It is slightly out of date -- doesn't deal with ARC, tagged pointers or blocks-as-IMP -- but still fully applicable.
Yes, classes are stored in the heap, but generally not in malloc()d memory. Classes are generally loaded as read-only, shared, memory. That is, there will be only one copy of the NSString class in memory for all applications running on the system. You can dynamically create classes on the fly and these will be in the regular heap, but it is atypical.

Related

Do Objective-C objects get their own copies of instance methods?

I'm new to Objective-C and was wondering if anyone could provide any information to clarify this for me. My (possibly wrong) understanding of object instantiation in other languages is that the object will get its own copies of instance variables as well as instance methods, but I'm noticing that all the literature I've read thus far about Objective-C seems to indicate that the object only gets copies of instance variables, and that even when calling an instance method, program control reverts back to the original method defined inside the class itself. For example, this page from Apple's developer site shows program flow diagrams that suggest this:
https://developer.apple.com/library/mac/documentation/cocoa/conceptual/ProgrammingWithObjectiveC/WorkingwithObjects/WorkingwithObjects.html#//apple_ref/doc/uid/TP40011210-CH4-SW1
Also in Kochan's "Programming in Objective-C", 6th ed., pg. 41, referring to an example fraction class and object, the author states that:
"The first message sends the setNumerator: message to myFraction...control is then sent to the setNumerator: method you defined for your Fraction class...Objective-C...knows that it's the method from this class to use because it knows that myFraction is an object from the Fraction class"
On pg. 42, he continues:
"When you allocate a new object...enough space is reserved in memory to store the object's data, which includes space for its instance variables, plus a little more..."
All of this would seem to indicate to me that there is only ever one copy of any method, the original method defined within the class, and when calling an instance method, Objective-C simply passes control to that original copy and temporarily "wires it" to the called object's instance variables. I know I may not be using the right terminology, but is this correct? It seems logical as creating multiple copies of the same methods would be a waste of memory, but this is causing me to rethink my entire understanding of object instantiation. Any input would be greatly appreciated! Thank you.
Your reasoning is correct. The instance methods are shared by all instances of a class. The reason is, as you suspect, that doing it the other way would be a massive waste of memory.
The temporary wiring you speak of is that each method has an additional hidden parameter passed to it: a pointer to the object that received the message (self). Since that gives the method access to that object, it can easily access all of the necessary instance variables and all is well. Note that any static variable also exists in only a single copy, and if you are not aware of that, unexpected things can happen. However, regular local variables are not shared and are recreated for each call of a method.
Apple's documentation on the topic is very good, so have a look for more info.
Just think of a method as a set of instructions. There is no reason to have a copy of the same method for each object. I think you may be mistaken about other languages as well. Methods are associated with the class, not individual objects.
Yes, your thinking is more or less right (although it's simpler than that: behind the scenes in most such languages methods don't need to be "wired" to anything, they just take an extra parameter for self and insert struct lookups before references to instance variables).
What might be confusing you is that not all languages work this way, either in their implementations or semantically. Object-oriented languages are (very roughly) divided into two camps: class-based, like Objective-C; and prototype-based, like JavaScript. In the second camp of languages, a method or procedure really is an object in its own right and can often be assigned directly to an object's instance variables as well - there are no classes to look up methods from, only objects and other objects, all with the same first-class status (this is an oversimplification; good languages still allow for sharing and efficiency).

Where and how are an Objective-C class's methods stored?

I know that when an object is instantiated on the heap, at the least enough memory is allocated to hold the object's ivars. My question is about how methods are stored by the compiler. Is there only one instance of method code in memory? Or is the code generated an intrinsic part of the object in memory, stored contiguously with the ivars and executed?
It seems like if the latter were the case, even trivial objects such as NSStrings would require a (relatively) large amount of memory (NSString inherits methods from NSObject, also).
Or is the method stored once in memory and passed a pointer to the object which owns it?
In a "standard" Objective-C runtime, every object contains, before any other instance variables, a pointer to the class it is a member of, as if the base Object class had an instance variable called:
Class isa;
All objects of a given class hold the same isa value.
The class contains a number of elements, including a pointer to the parent class, as well as an array of method lists. These methods are the ones implemented on this class specifically.
struct objc_class {
    Class super_class;
    ...
    struct objc_method_list **methodLists;
    ...
};
These method lists each contain an array of methods:
struct objc_method_list {
    int method_count;
    struct objc_method method_list[];
};
struct objc_method {
    SEL method_name;
    char *method_types;
    IMP method_imp;
};
The IMP type here is a function pointer. It points to the (single) location in memory where the implementation of the method is stored, just like any other code.
A note: What I'm describing here is, in effect, the ObjC 1.0 runtime. The current version doesn't store classes and objects quite like this; it does a number of complicated, clever things to make method calls even faster. But what I'm describing is still the spirit of how it works, if not the exact way it does.
I've also left out a few fields in some of these structures which just confused the situation (e.g., backwards compatibility and/or padding). Read the real headers if you want to see all the gory details.
Methods are stored once in memory. Depending on the platform, they are paged into RAM as needed. If you really want more details, read the Mach-O and runtime guides from Apple. It's not usually something programmers concern themselves with any more unless they're doing something pretty low-level.
Objects don't really "own" methods. I suppose you could think of it as classes owning methods, so if you have 400 NSStrings you still only have one copy of each method in RAM.
When a method gets called, the first parameter is the object pointer, self. That's how a method knows where the data is that it needs to operate on.

What happens when alloc or allocWithZone is called?

I wanted to know how exactly an Objective-C object gets created. I have been reading different blog posts and Apple docs, but I could only find incomplete information here and there about ivar and objc_class structures and various other runtime methods and structures.
But I still do not understand what happens when we call alloc on a class, and how superclass data members are added to the structure.
If possible, can anyone explain this to me or point me to the source code of the methods that actually allocate memory?
When alloc is called, it (as any other message send) first gets transformed (by the compiler) into a call to one of the objc_msgSend* functions. This function will get the class structure pointer as its first argument, and @selector(alloc) as its second.
Then, objc_msgSend looks up the corresponding method implementation of +[class alloc], which is, in general, not overridden (custom initialization is conceptually done in -initWith...), so it will generally be +[NSObject alloc]. It is likely that alloc simply calls +[NSObject allocWithZone:]; that function's implementation might do the following steps:
1) It finds the class's instance size (probably via class_getInstanceSize())
2) It allocates memory, most likely using the class_createInstance() function. This function clears the allocated memory to zeroes (that's why, as the specs say, all your ivars are implicitly initialized to 0 when the object is created), then sets the newly created object's isa pointer to the class structure itself.
3) The allocWithZone: method returns the fresh object pointer to alloc
4) alloc returns the object pointer to the sender, which will most likely pass it straight on to [Class initWith...:].
Hope this helps. Also, apart from the Obj-C runtime docs, don't forget to check the GNUstep NSObject implementation. It shows one logical, possible way to do it: how the GNU people implemented it, and how Apple might have implemented it.
Check out http://www.mikeash.com/pyblog/friday-qa-2009-03-13-intro-to-the-objective-c-runtime.html

Objective-C Find all init (constructor methods)

Using the "Method * class_copyMethodList(Class cls, unsigned int *outCount)" function, one can get a list of all methods that exist on an Objective-C class.
I would like to know how to find which of these methods are constructors as I am writing an IOC container. I would like to determine the constructors and their parameter types.
"I would like to know how to find which of these methods are constructors as I am writing an IOC container. I would like to determine the constructors and their parameter types."
In short, you can't. Or, at the least, you'll find that down this path lies madness.
First, Objective-C does not have constructors. It has initializers, sometimes many, and -- for a properly written class -- only one of which is the designated initializer. There is no way to identify the designated initializer at compile time or run time.
"How do I use this with a Method * and no instantiated member of the class?"
You don't. First you allocate an instance of the class, then you initialize the instance.
Overall, this level of abstraction just isn't done in Objective-C outside of academic investigations. It can be done, but it is generally avoided because of the fragility of the resulting solution and the hairball of code-hell that is trying to dynamically support the underlying C ABI (go look at the source to libffi).
If you want to go down this path, then you are far better off defining a custom abstract class that all of your containers will subclass and that can provide the binding logic to the class behind it.
Or use protocols; i.e. a class could implement an IOCBean protocol, and one of its methods, say initIOCGoop, would be the designated initializer.
Doing this generically for all classes is going to be rife with fragility, special cases, and will require a gigantic mess of code that will be difficult to maintain over time.
You can get the method signature by using the following method:
methodSignatureForSelector:
From the documentation:
An NSMethodSignature object records type information for the arguments and return value of a method. It is used to forward messages that the receiving object does not respond to—most notably in the case of distributed objects. You typically create an NSMethodSignature object using NSObject’s methodSignatureForSelector: instance method (on Mac OS X v10.5 and later you can also use signatureWithObjCTypes:). It is then used to create an NSInvocation object, which is passed as the argument to a forwardInvocation: message to send the invocation on to whatever other object can handle the message. In the default case, NSObject invokes doesNotRecognizeSelector:, which raises an exception. For distributed objects, the NSInvocation object is encoded using the information in the NSMethodSignature object and sent to the real object represented by the receiver of the message.

Objective C message dispatch mechanism [closed]

I am just starting to play around with Objective C (writing toy iPhone apps) and I am curious about the underlying mechanism used to dispatch messages. I have a good understanding of how virtual functions in C++ are generally implemented and what the costs are relative to a static or non-virtual method call, but I don't have any background with Obj-C to know how messages are sent. Browsing around I found this loose benchmark and it mentions IMP cached messages being faster than virtual function calls, which are in turn faster than a standard message send.
I am not trying to optimize anything, just get a deeper understanding of how exactly the messages get dispatched.
How are Obj-C messages dispatched?
How do Instance Method Pointers get cached and can you (in general) tell by reading the code if a message will get cached?
Are class methods essentially the same as a C function (or static class method in C++), or is there something more to them?
I know some of these questions may be 'implementation dependent' but there is only one implementation that really counts.
How are Obj-C messages dispatched?
Objective-C messages are dispatched using the runtime's objc_msgSend() function. As shown in the Apple docs, the function takes at least two arguments:
The receiving object
The selector of the message
[A variable list of arguments to the message being sent.]
Instances of a class have an isa pointer, which points to their class object. The methods each class implements are stored, keyed by selector, in a "table" in the class object. The objc_msgSend() function follows the isa pointer to the class object to find this table, and checks whether the method is in the table for the class. If it cannot find it, it looks for the method in the table of the class's superclass. If not found, it continues up the class hierarchy until it either finds the method or runs out of superclasses at the root class (NSObject). At that point, unless one of the runtime's forwarding hooks handles the message, an exception is raised.
How do Instance Method Pointers get cached and can you (in general) tell by reading the code if a message will get cached?
From Apple's Objective-C runtime guide on Messaging:
To speed the messaging process, the runtime system caches the selectors and addresses of methods as they are used. There’s a separate cache for each class, and it can contain selectors for inherited methods as well as for methods defined in the class. Before searching the dispatch tables, the messaging routine first checks the cache of the receiving object’s class (on the theory that a method that was used once may likely be used again). If the method selector is in the cache, messaging is only slightly slower than a function call. Once a program has been running long enough to “warm up” its caches, almost all the messages it sends find a cached method. Caches grow dynamically to accommodate new messages as the program runs.
As stated, caching starts to occur once the program is running, and after the program has been running long enough, most of the method calls will run through the cached method. As it also says, the caching occurs as the methods are used, so a message is only cached when it is used.
Are class methods essentially the same as a C function (or static class method in C++), or is there something more to them?
Class objects handle method dispatch in a similar manner to instances of classes. Each class object's class methods are stored in a separate object called a metaclass. The class object has its own isa pointer to its metaclass object, which in turn has super-metaclass objects from which it can inherit class methods. Method dispatch to class methods works as follows:
The dispatch system follows the class object's isa pointer to the metaclass object
The metaclass object's method table is searched for the class method.
If not found, the search continues in the metaclass object's superclass.
This process repeats until either the method is found, or the search reaches the root metaclass without a match and an exception is thrown.
Dispatch mechanisms
A dispatch mechanism finds the executable code to run when a method is called (a message is sent).
Inline
Static (direct) (C, Java final, C++ default, Swift static and final) - the compiler knows the method implementation at compile time.
Dynamic - resolved at run time through a table (witness table, virtual table, dispatch table); this is what enables polymorphism.
Table, v-table (C++ virtual, Java default, Swift default) - every object has a reference to its class, which has a table with all method addresses (inherited, overridden, and new). SIL contains vtable or witness_table.
Message (Objective-C, Swift dynamic) - every object has a reference (isa) to its class, which contains a reference to the superclass and a dispatch table. The dispatch table contains only the methods the class itself implements (new and overridden), not the ones inherited from the superclass. If the method is not found in the current dispatch table, the search continues in the superclass's dispatch table. This process is optimised by caching. SIL contains volatile.
Objective-C Message Dispatch
For example
class A {
    func foo1() {}
    func foo2() {}
}
class B: A {
    override func foo2() {}
    func foo3() {}
}
Objective-C objc_msgSend
id objc_msgSend(id self, SEL op, ...)
// self - the object which receives the message
// op - the selector of the method
// ... - the arguments
If no method implementation is found for the given selector, you see the following error:
unrecognized selector sent to instance