I read everywhere that Objective-C has true dynamic binding, whereas C++ has only late binding. Unfortunately, none of the books go on to explain it clearly or discuss the underlying implementation. For example, C++ uses a virtual table. How about Objective-C?
http://www.gnu.org/software/gnustep/resources/ObjCFun.html has a pretty good description.
Basically what dynamic binding means is that at the time that the method call is actually made, the decision is made about what method to invoke. And the method can, if you wish, be dynamically chosen at that point.
Edit: Here is a lot more detail to the best of my understanding. I do not promise that it is entirely correct, but it should be mostly right. Every object in Objective C is a struct whose first member, named isa, is a pointer to a class. Each class is itself an object that is traditionally laid out as:
struct objc_class {
    Class isa;
    Class super_class;
    const char *name;
    long version;
    long info;
    long instance_size;
    struct objc_ivar_list *ivars;
    struct objc_method_list **methodLists;
    struct objc_cache *cache;
    struct objc_protocol_list *protocols;
};
At runtime, here is pseudo-code for what happens on a method lookup:
Follow isa to find the class
if implementation = class.lookup_method(method):
    call implementation
else if get_implementation = class.lookup_method(forwardInvocation):
    implementation = get_implementation(method)
    if implementation:
        call implementation
    else:
        raise runtime error
else:
    raise runtime error
And how does that lookup_method work?
def lookup_method (class, method):
    if method in class.objc_cache:
        return implementation from objc_cache
    else if method in class.objc_method_list:
        cache implementation from objc_method_list
        return implementation
    else if implementation = class.super_class.lookup_method(method):
        cache implementation
        return implementation
    else:
        return null
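The lookup above can be sketched in plain C. This is only an illustration under loud assumptions: ClassRec, Method, Imp, CACHE_SLOTS and the demo classes Base and Sub are hypothetical names, selectors are modeled as C strings assumed to outlive the class, and the cache is a tiny linear array rather than whatever the real runtime uses.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-ins for the runtime's types. */
typedef void (*Imp)(void *self, const char *sel);

typedef struct Method {
    const char *name;  /* selector, modeled here as a plain C string */
    Imp imp;           /* pointer to the code that implements it */
} Method;

enum { CACHE_SLOTS = 8 };

typedef struct ClassRec {
    struct ClassRec *super_class;  /* NULL for the root class */
    const Method *methods;         /* methods defined on this class only */
    size_t method_count;
    Method cache[CACHE_SLOTS];     /* filled lazily by lookup_method */
    size_t cache_count;
} ClassRec;

/* Cache first, then the class's own list, then the superclass chain. */
Imp lookup_method(ClassRec *cls, const char *sel)
{
    assert(sel != NULL);
    if (cls == NULL)
        return NULL;  /* walked past the root: nobody implements sel */

    for (size_t i = 0; i < cls->cache_count; i++)
        if (strcmp(cls->cache[i].name, sel) == 0)
            return cls->cache[i].imp;

    Imp imp = NULL;
    for (size_t i = 0; i < cls->method_count && imp == NULL; i++)
        if (strcmp(cls->methods[i].name, sel) == 0)
            imp = cls->methods[i].imp;
    if (imp == NULL)
        imp = lookup_method(cls->super_class, sel);

    /* Remember the result on this class so the next lookup is a cache hit. */
    if (imp != NULL && cls->cache_count < CACHE_SLOTS) {
        cls->cache[cls->cache_count].name = sel;
        cls->cache[cls->cache_count].imp = imp;
        cls->cache_count++;
    }
    return imp;
}

/* Demo hierarchy: Sub inherits "describe" from Base and adds "speak". */
static void base_describe(void *self, const char *sel) { (void)self; (void)sel; }
static void sub_speak(void *self, const char *sel) { (void)self; (void)sel; }

static const Method base_methods[] = { { "describe", base_describe } };
static const Method sub_methods[] = { { "speak", sub_speak } };

static ClassRec Base = { .super_class = NULL, .methods = base_methods, .method_count = 1 };
static ClassRec Sub = { .super_class = &Base, .methods = sub_methods, .method_count = 1 };
```

With this setup, lookup_method(&Sub, "describe") walks up to Base the first time and is answered from Sub's cache afterwards, while a selector that nobody implements yields NULL, which is the point where the pseudo-code falls back to forwarding.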
In response to the obvious question, yes this is much slower than C++'s virtual tables. According to benchmarks, about 1/3 of the speed. Every Objective C text immediately follows that up with the fact that in the real world, method lookup speed is almost never a bottleneck.
This is much more flexible than C++'s method lookups. For instance you can use forwardInvocation to cause unrecognized methods to go to an object that you have in a variable. This kind of delegation can be done without knowing what the type of that object will be at run time, or what methods it will support. You can also add methods to classes - even at runtime if you wish - without having access to the source code. You also have rich runtime introspection on classes and methods.
The obvious flip side, that any C++ programmer will be jumping up and down about, is that you've thrown away any hope of compile time type checking.
Does that explain the differences and give you sufficient detail to understand what is going on?
Dynamic binding and late binding are in fact the same thing. There is also static binding, or early binding, which checks for issues that arise at compile time (errors regarding variables, expressions, etc.), and this information is stored in a v-table (virtual method table). What late binding does is simply bind the methods with those in the v-table.
I have a protocol that has a method returning NSArray*.
In the implementation I had made the return type of that method to be NSView*
I see this is happening only in case of Objective C class pointers and not in other cases like returning void vs returning int.
I would expect a compiler warning at the minimum, but the compilation happens just fine.
@protocol prot <NSObject>
- (NSArray *)array;
@end

@interface impl : NSObject <prot>
@end

@implementation impl
// Should return NSArray. Returns NSView instead
- (NSView *)array
{
    return nil;
}
@end
First things first:
impl should be Implementation since class names are written in upper camel case and abbreviations are bad(TM). Moreover, Class is a class pointer, NSView* and NSArray* are instance pointers.
To your question, even though I'm a bit tired of this discussion (dynamic vs. static typing, early vs. late binding):
A: Why should the compiler warn? Both are instance pointers and maybe the messages sent to the object are supported by both. The compiler does not care about binding, it is done at runtime.
B: But this is very unsafe!
A: Did you ever ship code with such an error?
B: No. But it is unsafe by theory.
A: Yes, that's true for all theories that ship code without running it at least one time.
B: But you have to admit, that this is more unsafe than type checking at compile time.
A: Yes, theoretically that's true.
B: So why do you support it?
A: Because there are many situations in which dynamic typing has advantages. For example, it is very easy to write generic code without having templates. (Even if they are sometimes called generics, they are still silly templates.) It is very easy to pass responsibility around, which needs contra-conceptual extensions in other languages (signals & slots in C++, delegates in C#, …). It is very easy to create stand-in objects for lowering memory pressure. It is very easy to write an ORM. Shall I continue?
B: Yes
A: It is so flexible that you can write a whole AOP framework within the language. It is so flexible that you can write a prototype-based framework within the language.
However, sometimes it is easy to detect for the compiler that something makes no sense at all. And sometimes the compiler warns about that. But in many cases the compiler is not more intelligent than the developer.
Agreed that it should generate a warning, but it doesn't. Part of the issue is that all ObjC objects are id at runtime, which is why you're seeing different behavior for int (which isn't id). But that's not really an excuse. It's a limitation of the compiler. There are numerous places where it doesn't do a good job of distinguishing between ObjC object types. ObjC objects are duck-typed, so as long as they respond to the right messages "they work."
Sometimes this is a benefit; for example, NSArray is actually a class cluster, and there are several (private) types that pretend to be NSArray by just implementing the same interface. That's something that is easy in ObjC, but hard in Swift. Still no excuse, since it would be easy to get that benefit without this frustrating lack of a compiler warning, but it gets back to how ObjC thinks about class types.
This limitation is fixed in Swift, and another benefit of moving over, but that doesn't really help you, I know.
Data encapsulation, or as I like to call it, Who owns it and who needs to know about it, makes up a lot of object-oriented programming. The who needs to know is often satisfied by accessor methods, but these get to be pretty expensive if they all result in an objc_msgSend just to read a variable. C++ answers the problem with inline methods - use the "inline" keyword before the definition, or define the method within the class declaration, and the compiler puts the accessor code within the caller's code, saving the overhead associated with an actual function call.
class IntWrapper {
public:
    int getInt() { return anInt; }
protected:
    int anInt;
};
Similar syntax is rewarded by a compiler error in Objective-C. Having searched the language guides in Xcode ("[Object-Oriented] Programming with Objective-C"), I don't see any relevant reference to "inline" of a method. Is there such a thing as inline in Objective-C? Is it called something else? If anyone could point me to the documentation that references inline, it would be much appreciated.
Using the simple test code:
@interface ClassA : NSObject
{
    int anInt;
}
- (int) anInt;
@end

@implementation ClassA
- (int) anInt { return anInt; }
@end
and looking at the assembly of the code that uses it, it looks like about 25 instructions.
All Objective-C methods are dispatched dynamically. They can be overridden by subclasses. They can even be replaced at runtime ("swizzled") by the Objective-C runtime API.
In some ways, they are similar to virtual methods in C++.
As such they can't be inlined.
By the way, the technique you cite violates the principle you cite ("Who owns it and who needs to know about it?"). Putting the implementation in the class declaration exposes implementation detail to clients who don't need to know it. Furthermore, the compiler inlining the code into clients prevents that implementation from changing without a recompile, which is the fragile base class problem. Modern Objective-C avoids the fragile base class problem, which means a framework class can change what instance variables it has without breaking clients.
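To make the "can be replaced at runtime" point concrete, here is a rough C analogue. It is a sketch under assumptions, not the runtime's actual mechanism: Counter, value_slot, send_value and replace_value_imp are hypothetical names, and a single global function pointer stands in for a method table entry. The real runtime exposes calls such as method_setImplementation for this purpose.

```c
#include <assert.h>
#include <stddef.h>

typedef struct Counter { int value; } Counter;
typedef int (*IntGetter)(const void *self);

/* Two candidate implementations of the same "method". */
static int counter_value(const void *self)
{
    return ((const Counter *)self)->value;
}

static int counter_value_doubled(const void *self)
{
    return 2 * ((const Counter *)self)->value;
}

/* One mutable slot standing in for a class's method table entry. */
static IntGetter value_slot = counter_value;

/* Every call site goes through the slot, so it always sees the current entry. */
int send_value(const Counter *c)
{
    return value_slot(c);
}

/* Swap the implementation at runtime and hand back the previous one. */
IntGetter replace_value_imp(IntGetter new_imp)
{
    assert(new_imp != NULL);
    IntGetter old = value_slot;
    value_slot = new_imp;
    return old;
}
```

If the compiler had copied the body of counter_value into each caller, calling replace_value_imp(counter_value_doubled) later would have no effect on those callers. Keeping the indirection is what lets the replacement take effect everywhere.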
I know that when an object is instantiated on the heap, at the least enough memory is allocated to hold the object's ivars. My question is about how methods are stored by the compiler. Is there only one instance of method code in memory? Or is the code generated an intrinsic part of the object in memory, stored contiguously with the ivars and executed?
It seems like if the latter were the case, even trivial objects such as NSStrings would require a (relatively) large amount of memory (NSString inherits methods from NSObject, also).
Or is the method stored once in memory and passed a pointer to the object which owns it?
In a "standard" Objective-C runtime, every object contains, before any other instance variables, a pointer to the class it is a member of, as if the base Object class had an instance variable called:
Class isa;
Each object of a given class shares the same isa pointer.
The class contains a number of elements, including a pointer to the parent class, as well as an array of method lists. These methods are the ones implemented on this class specifically.
struct objc_class {
    Class super_class;
    ...
    struct objc_method_list **methodLists;
    ...
};
These method lists each contain an array of methods:
struct objc_method_list {
    int method_count;
    struct objc_method method_list[];
};

struct objc_method {
    SEL method_name;
    char *method_types;
    IMP method_imp;
};
The IMP type here is a function pointer. It points to the (single) location in memory where the implementation of the method is stored, just like any other code.
A note: What I'm describing here is, in effect, the ObjC 1.0 runtime. The current version doesn't store classes and objects quite like this; it does a number of complicated, clever things to make method calls even faster. But what I'm describing is still the spirit of how it works, if not the exact way it does.
I've also left out a few fields in some of these structures which just confused the situation (e.g, backwards compatibility and/or padding). Read the real headers if you want to see all the gory details.
Methods are stored once in memory. Depending on the platform, they are paged into RAM as needed. If you really want more details, read the Mach-O and runtime guides from Apple. It's not usually something programmers concern themselves with any more unless they're doing something pretty low-level.
Objects don't really "own" methods. I suppose you could think of it as classes owning methods, so if you have 400 NSStrings you still only have one copy of each method in RAM.
When a method gets called, the first parameter is the object pointer, self. That's how a method knows where the data is that it needs to operate on.
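A small C sketch of the same idea, under assumptions: Point, PointClass, point_make and point_send_move_right are hypothetical names, and the class holds a single function pointer instead of real method lists. Each instance carries only its isa pointer and its ivars; the code exists once and receives the instance as its first argument. (The real runtime also passes the selector as a hidden second argument, _cmd, which is left out here.)

```c
#include <assert.h>
#include <stddef.h>

typedef struct PointClass PointClass;

/* An instance holds only a class pointer and its own data. */
typedef struct Point {
    const PointClass *isa;
    int x;
    int y;
} Point;

typedef int (*PointImp)(Point *self, int amount);

/* The class record owns the pointer to the method's code. */
struct PointClass {
    const char *name;
    PointImp move_right;
};

/* The single copy of the method; self says which instance to act on. */
static int point_move_right(Point *self, int amount)
{
    self->x += amount;
    return self->x;
}

static const PointClass ThePointClass = { "Point", point_move_right };

Point point_make(int x, int y)
{
    Point p = { &ThePointClass, x, y };
    return p;
}

/* Follow isa to the class, then call the shared code with the instance. */
int point_send_move_right(Point *p, int amount)
{
    assert(p != NULL && p->isa != NULL);
    return p->isa->move_right(p, amount);
}
```

Two points created with point_make share the same isa and therefore the same function pointer, so creating 400 of them adds 400 small structs but no additional code.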
Assume that:
New Protocol is declared
Method in this protocol is marked @required
Class conforms to Protocol
Class does not implement the method mentioned in Protocol
At compile time, information about this method is known: i.e. that it is required and that this class and any other classes this class may extend do not implement it.
Why does the compiler issue a warning and not an error in this case?
Errors are only issued when the compiler cannot continue because something went terribly wrong.
When calling a method in Objective-C, the method lookup is done at runtime, not at compile time as in C++. In Objective-C a "message" is simply sent to the object, something like obj.executeCommand("Hey, can you execute function <name> for me?"). In C++ the method is called directly, like obj.<name>(). In the Objective-C case, the executeCommand() method is called, and it exists. In the C++ case, the function is called but it does not exist. These are methods that are linked at the compiler level, which means they both become memory addresses rather than names. executeCommand becomes 0x12345678, but it still uses the same message ("execute function <name>").
This is probably very confusing, but it's related to the way methods are implemented in different languages.
If you feel strongly about it, why not turn on -Werror?
I don't know the real answer but here is a use case that would go against it.
What if you implemented all of the protocol methods in a category???
Main interface declaration adopts the protocol however the protocol method implementation is in a category.
This is valid code, but it would produce a compile error if the compiler were that strict!
Objective-C is a dynamic language. The idea of what an implementation is, is different to a static language.
For the most part, it's in code that most of us implement inside the @implementation ... @end block.
But what if a method is not found? Then an object has a chance to deal with it dynamically.
Imagine you have an interface for a sound effect player:
@protocol FX
- (void)playBeep;
- (void)playSiren;
- (void)playHonk;
@end
An implementation could have the files Beep.mp3, Siren.mp3, Honk.mp3 to play, but instead of implementing each of the methods, it could override -forwardInvocation: and parse the selector string, something like this pseudocode:
NSString *selName = NSStringFromSelector([invocation selector]);
if ([selName hasPrefix:@"play"]) {
    NSString *filename = fileNameFromSelector(selName);
    [self playSoundFileNamed:filename];
}
This may seem contrived, but once you start using the dynamic features of the language, you will start finding more and more places where it makes sense. And by sense I mean, does this effort help in the long run?
In the above case, just add a -play* method name to the interface, and drop in an appropriately named sound file. It just works.
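The selector-parsing step can be sketched as a plain C helper. This is a hedged illustration, not the answer's actual code: file_name_from_selector is a hypothetical C counterpart of the fileNameFromSelector helper mentioned in the pseudocode, and the ".mp3" suffix is taken from the file names in the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Maps a selector name such as "playBeep" to a file name such as "Beep.mp3".
 * Returns 1 on success. Returns 0 if the selector does not start with "play",
 * has nothing after the prefix, or the result does not fit in the buffer.
 */
int file_name_from_selector(const char *sel, char *out, size_t out_size)
{
    static const char prefix[] = "play";
    const size_t prefix_len = sizeof prefix - 1;

    assert(sel != NULL);
    if (strncmp(sel, prefix, prefix_len) != 0 || sel[prefix_len] == '\0')
        return 0;

    /* snprintf reports the length it wanted to write, so truncation is detectable. */
    int written = snprintf(out, out_size, "%s.mp3", sel + prefix_len);
    return written > 0 && (size_t)written < out_size;
}
```

A forwarding handler would call this with the name of the unrecognized selector and play the file only when it returns 1. Any selector that does not match should still end in the usual "does not recognize selector" error.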
Another example from personal experiments: how to deal with Core Data entities in a more natural way. I want to do this:
NSArray *people = [Person findAllWithNameLike:@"B%"];
instead of mucking about with predicates, fetch requests etc.
But I don't want to define every permutation of method in code.
How about if I wanted to build an XML builder? I would look at a dynamic approach. It has served Groovy Builders well (look at Groovy/Grails for examples).
One last example: I have a traits system where I can define behaviours in the form of groups of methods and have my objects assimilate this behaviour. So, while the compiler doesn't see an implementation for the interface my object conforms to, the implementation is injected into it from a trait class, using the Objective-C runtime. Why would I do this? I find many delegate methods are boiler plate, but at the same time, a single base class for each situation is not flexible enough. Instead of cut and paste from code samples, my 'samples' compile and run :) and any changes are reflected across all projects using the trait.
To really understand why all this is available to you, it is worth playing around with a Smalltalk environment (search Pharo or Squeak). This is where Objective-C has its roots.
And finally, to stop these warnings:
#pragma clang diagnostic push
#pragma clang diagnostic ignored "-Wprotocol"
@implementation ... @end
#pragma clang diagnostic pop
Because there are times when there are bogus "required" methods in a poorly designed protocol.
They should have been optional but someone insisted they are "required".
Thusly making this a run time issue rather than a compile bug is very very wise.
Using the "Method * class_copyMethodList(Class cls, unsigned int *outCount)" function, one can get a list of all methods that exist on an Objective-C class.
I would like to know how to find which of these methods are constructors as I am writing an IOC container. I would like to determine the constructors and their parameter types.
I would like to know how to find which of these methods are
constructors as I am writing an IOC container. I would like to
determine the constructors and their parameter types.
In short, you can't. Or, at the least, you'll find that down this path lies madness.
First, Objective-C does not have constructors. It has initializers, sometimes many, and -- for a properly written class -- only one of which is the designated initializer. There is no way to identify the designated initializer at compile time or run time.
How do I use this with a Method * and no instantiated member of the
class?
You don't. First you allocate an instance of the class, then you initialize the instance.
Overall, this level of abstraction just isn't done in Objective-C outside of academic investigations. It can be done, but it is generally avoided because of the fragility of the resulting solution and the hairball of code-hell that is trying to dynamically support the underlying C ABI (go look at the source to libffi).
If you want to go down this path, then you are far better off either defining a custom abstract class that all of your containers will subclass and that can provide the binding logic to the class behind it, or using protocols; i.e. a class could implement an IOCBean protocol, and one method would be initIOCGoop, which is the designated initializer goo.
Doing this generically for all classes is going to be rife with fragility, special cases, and will require a gigantic mess of code that will be difficult to maintain over time.
You can get the method signature by using the following method:
methodSignatureForSelector:
From the documentation:
An NSMethodSignature object records type information for the arguments and return value of a method. It is used to forward messages that the receiving object does not respond to—most notably in the case of distributed objects. You typically create an NSMethodSignature object using NSObject’s methodSignatureForSelector: instance method (on Mac OS X v10.5 and later you can also use signatureWithObjCTypes:). It is then used to create an NSInvocation object, which is passed as the argument to a forwardInvocation: message to send the invocation on to whatever other object can handle the message. In the default case, NSObject invokes doesNotRecognizeSelector:, which raises an exception. For distributed objects, the NSInvocation object is encoded using the information in the NSMethodSignature object and sent to the real object represented by the receiver of the message.