Objective C Class Methods vs C Functions - objective-c

While working on on open source project, I came across the following C function declaration and implementation:
// FSNData.h
NSString *stringForMimeType(MimeType type);
#interface FSNData : NSObject
// All the expected objective-c property and instance method declarations
#end
// FSNData.m
#import "FSNData.h"
// where 'type' is an enum
// this does work as expected
NSString *stringForMimeType(MimeType type) {
switch (type) {
case MimeType_image_jpeg: return #"image/jpeg";
case MimeType_image_png: return #"image/png";
default:
NSLog(#"ERROR: FSNData: unknown MimeType: %d", type);
// do not return "application/octet-stream"; instead, let the recipient guess
// http://en.wikipedia.org/wiki/Internet_media_type
return nil;
}
}
#implementation
// all properties and methods defined in FSData.h implemented as expected
#end
This example could easily be re-written as a class level method with out any problem. As it is, using stringFormMimeType() sill requires importing the FSNData header file anyway.
Looking at the Apple docs, it states only:
Because Objective-C rests on a foundation of ANSI C, you can freely
intermix straight C code with Objective-C code. Moreover, your code
can call functions defined in non-Cocoa programmatic interfaces, such
as the BSD library interfaces in /usr/include.
There is no mention of when C functions should favour Objective-C methods.
The only benefit I can see at this point, is that calling the above function, as opposed to a class method, some Objective-C runtime call(s) would be skipped. In a typical use case of FSNData, this would not give a noticeable boost in performance to the user (probably even to developers)*.
What benefit exists (other than coding style) for favouring a C function over a class method?
*FSNData is used as part of the FSNetworking library, so I doubt there would be thousands upon thousands of network operations being performed during any application's life cycle.

In short, C (or C++) implementations are very useful:
For Abstraction
For Reusability
When making medium and large scale programs
In performance critical paths
For 'Interior' implementations
What benefit exists (other than coding style) for favouring a C function over a class method?
ObjC messaging introduces indirect function calls. These are firewalls for optimizers.
C functions can easily restrict access, whereas 'private' ObjC implementations may be looked up using the ObjC runtime, or accidentally overridden.
C functions may be removed from your executable if not referenced, or they may be made private. If you write reusable code (and you should), this can have a huge impact on your binary sizes and load times -- C functions which are not referenced/used may be removed, but ObjC types and methods will be preserved (including everything they reference). This is why your app's binary size may grow significantly when you use only small part of an ObjC static library -- every objc class in the library is preserved. If that library were C or C++, then you could get by with very small growth because you need only what is referenced. What is or is not referenced is easier to prove with C and C++.
C functions can be inlined, either during compilation or during Link Time Optimization phases.
The compiler and optimizers are able to do much optimization with C functions (e.g. inter-procedural optimizations), but very little with ObjC methods because they are always indirect.
To avoid ObjC message dispatch overhead (as you mentioned)
Potential for additional reference counting operations and autorelease pool activity when interacting with ObjC objects.
Of course you won't always hurt paying for things you don't need or use -- and remember that ObjC class methods have some benefits over C functions, too. So, just look at C or C++ implementations as another tool in your toolbox. I find them very useful as complexity and project sizes increase, and they can be used to make your programs much faster. Just do what you are least likely to regret in 2015 ;)

You already touched on the marginal performance difference of avoiding an objc_msgSend call. Objective-C class methods are also subject to overriding in subclasses, so implementing a method in C will prevent it from being overridden in a subclass. Relatedly, because of that runtime inheritance/polymorphism, an Objective-C method can never be inlined, whereas a C function can potentially be inlined by the compiler for added performance.
When it comes to avoiding objc_msgSend, a wise man once told me, "If the overhead of objc_msgSend is too great for you, Objective-C is probably the wrong tool for the job."

Related

Return type differs in implementation for objective C protocols

I have a protocol that has a method returning NSArray*.
In the implementation I had made the return type of that method to be NSView*
I see this is happening only in case of Objective C class pointers and not in other cases like returning void vs returning int.
I would expect a complier warning at the minimum but the compilation happens just fine.
#protocol prot <NSObject>
-(NSArray*)array;
#end
#interface impl : NSObject<prot>
#end
#implementation impl
//Should return NSArray. Returns NSView instead
- (NSView *)array
{
return nil;
}
#end
First things first:
impl should be Implementation since class names are written in upper camel case and abbreviations are bad(TM). Moreover, Class is a class pointer, NSView* and NSArray* are instance pointers.
To your Q, even I'm a bit tired of this discussion (dynamic vs. static typing, early vs. late binding):
A: Why should the compiler warn? Both are instance pointers and maybe the messages sent to the object are supported by both. The compiler does not care about binding, it is done at runtime.
B: But this is very unsafe!
A: Did you ever ship code with such an error?
B: No. But it is unsafe by theory.
A: Yes, that's true for alle theories that ship code without running it at least one time.
B: But you have to admit, that this is more unsafe than type checking at compile time.
A: Yes, theoretically that's true.
B: So why do you support it?
A: Because there are many situations in which dynamic typing has advantages. I. e. it is very easy to write generic code without having templates. (Even sometimes they are called generics, they are still silly templates.) It is very easy to give around responsibility, what needs contra-conceptual extensions in other languages (signals & slots in C++, delegates in C#, …) It is very easy to create stand-in objects for lowering memory pressure. It is very easy to write an ORIM. Shall I continue?
B: Yes
A: Is is that flexible that you can write a whole AOP framework within that language. It is that flexible that you can write a prototype based framework within that language.
However, sometimes it is easy to detect for the compiler that something makes no sense at all. And sometimes the compiler warns about that. But in many cases the compiler is not more intelligent than the developer.
Agreed that it should generate a warning, but it doesn't. Part of the issue is that all ObjC objects are id at runtime, which is why you're seeing different behavior for int (which isn't id). But that's not really an excuse. It's a limitation of the compiler. There are numerous places where it doesn't do a good job of distinguishing between ObjC object types. ObjC objects are duck-typed, so as long as they respond to the right messages "they work."
Sometimes this is a benefit; for example, NSArray is actually a class cluster, and there are several (private) types that pretend to be NSArray by just implementing the same interface. That's something that is easy in ObjC, but hard in Swift. Still no excuse, since it would be easy to get that benefit without this frustrating lack of a compiler warning, but it gets back to how ObjC thinks about class types.
This limitation is fixed in Swift, and another benefit of moving over, but that doesn't really help you, I know.

Why does method_getNumberOfArguments return two more results than the selector would imply?

In the objective-C runtime, why does method_getNumberOfArguments return two more results than the selector would imply?
For example, why does #selector(initWithPrice:color:) return 4?
TL;DR
Alright. Just to set the record straight, yes, the first two arguments to any objective-c method are self and _cmd, always in that order.
A brief history of Objective-C
However, the more interesting subject is the why to this scenario. To do that, we must first look into the history of objc. Without further ado, let's get started.
Way back in 1983, Brad Cox, the 'God' of objective-c, wanted to create an object-oriented runtime-based language on top of C, for good performance and flexibility across platforms. As a result, the very first Objective-C 'compilers' were just simple preprocessors of Objective-C source converted to their C-runtime equivalents, and then compiled with the platform specific C compiler tool.
However, C was not designed for objects, and that was the most fundamental thing that Objective-C had to surmount. While C is a robust and flexible language, runtime support is one of it's critical downfalls.
During the very early design phase of Objective-C, it was decided that objects would be a purely heap-based pointer design, so that they could be passed between any function without weird copy semantics and such (this changed a bit with Obj-C++ and ARC, but that's too wide of a scope for this post), and that every method should be self aware (acually, as bbum points out, it was an optimization for using the same stack frame as the original function call), so that you could have, in theory, multiple method names mapped to the same selector, as follows:
// this is a completely valid objc 1.0 method declaration
void *nameOrAge(id self, SEL _cmd) {
if (_cmd == #selector(name)) {
return "Richard";
}
if (_cmd == #selector(age)) {
return (void *) (intptr_t) 16;
}
return NULL;
}
This function, then could be theoretically mapped to two selectors, name and age, and perform conditional code based on which one is invoked. In general Objective-C code, this is not too big of a deal, as it's quite difficult with ARC now to map functions to selectors, due to casting and such, but the language has evolved quite a bit from then.
Hopefully, that helps you to understand the why behind the two 'invisible' arguments to an Objective-C method, with the first one being the object that was invoked, and the second one being the method that was invoked on that object.
The first two arguments are the hidden arguments self and _cmd.

Using a custom allocator in an iOS library

I'm creating a library that will be used by multiple types of iOS apps. Part of my API allows a user to specify routines that will be used for the library's allocations. My library is implemented mostly in C++, so this has been straightforward so far.
However, I've recently been adding some user interface functionality to the library: displaying a UIWebView using a custom view controller. I'm not sure how to ensure that my allocators are used for these objects.
How can I ensure that all of the Cocoa UI objects created by my library are allocated with my own functions?
I've tried a few things including overriding -initWithZone and calling CFAllocatorSetDefault before my -init. None of them have worked yet; and honestly I'm still a beginner with Objective C and Cocoa, so I'd like to know what the "correct" way to do this is.
I'm unable to find evidence of it now, but it certainly was the case that CFAllocator, malloc_zone_t and NSZone were all toll-free bridged. So you could just cast your allocator to an NSZone and pass it along.
I think the problem you're going to face is that NSZone was added at NextStep so as to allow a program to maintain multiple heaps, with the feeling being that it would allow programmers to keep related objects close to one another in memory — which is good for caching — and in some cases to throw away entire object graphs without walking the graph, which is obviously fast. However the former was of little benefit in practice and the latter is more likely to create problems than to be of actual benefit. So Apple has back-peddled from NSZones, gradually turning the related runtime calls into no-ops and removing detailed documentation. Apple's feeling is that, at the Objective-C level, you should not only maintain only a single heap (which is a side issue from your point of view) but that they'll always know best how to maintain it.
EDIT: an alternative idea is to replace NSObject's alloc, that being the thing that creates memory. The Objective-C runtime is well-defined enough that we know exactly what behaviour alloc exhibits, so that a vanilla version might be:
+ (id)alloc
{
Class *newInstance;
// we'll use calloc to ensure we get correct initial behaviour of
// everything equal to 0, and use the runtime's class_getInstanceSize
// so that we're allocating the correct amount of memory irrespective
// of whether this call has fallen through from a subclass
newInstance = (Class *)calloc(1, class_getInstanceSize(self));
// the thing that defines a class is that the first thing stored in
// it is the isa pointer, which points to the metaclass. So we'll
// set the isa pointer appropriately
*newInstance = self;
return (id)newInstance;
}
To replace NSObject's normal init, you'd probably want to define your alternative as a category method on NSObject named, say, customInit, then use class_getClassMethod, class_getMethodImplementation and method_setImplementation directly on [NSObject class] to switch it into place. See the Object-C Runtime Reference for documentation on those.

Why are Objective-C instance variables declared in an interface?

I'm just getting into Objective-C (Java is my primary OO language).
Defining an object's instance variables in the interface instead of the class seems strange. I'm used to an interface being a public API definition with nothing besides method signatures (not counting constants here).
Is there some reason that state is defined in an interface (even if it is private) and behaviour is defined in a class. It just seems odd that since objects are state+behavior that the definition would be split into two separate places.
Is it a design benefit is some way? A pain in the rear issue that you are just forced to deal with in Objective-C? A non-issue, just different? Any background on why it's done this way?
Or can you put object state in a class and I just haven't hit that part in my book yet?
UPDATE
The answer below was written before the language feature of declaring instance variables in the implementation was implemented. The premise of the question is now no longer valid. As FireLizzard says, nothing needs to go in the #interface that you don't want to be public.
It's a hangover from the fact that Objective-C originated as a fairly thin layer built on top of C. The C way is to define the interface to a module (do not confuse with a Java interface) in a header file and literally include it in each compilation unit. It's akin to automatically copy-pasting the declarations to the top of every compiled file. If that seems primitive, it is because it is, but C is a 40 year old language.
You have to define instance variables - even private ones - in the interface because Objective-C objects are implemented as C structs which are themselves just blocks of memory and named offsets within that block. The struct that represents an object of each class has to include space for the superclass instance variables so subclasses need to know at least the size of the C struct representing the superclass and also the public and protected instance variable offset. That, unfortunately, means that all the instance variables even private ones have to be exposed as part of the external interface.* C++ the other OO version of C suffers from the same problem for the same reasons.
It's a bit of a pain having to write down all the method signatures twice, but you get used to it.
*With the 64 bit runtime, you no longer need to declare the ivars for synthesized accessors in the #interface but since all methods are public, it still means exposing internal state to the outside World, althoug it does alleviate the fragile base class problem.
In Objective C interface does not refer to the instance at all
Brad Cox who designed Objective C decided that the equivalent of C declarations and definitions should be made explicit so each class has a #interface section telling what it looks like externally and an #implementation saying how it is implemented.
Java came along later and changed the object model so that there is only one definition of an object which pulled the #interface and #implementation together. The compiler (and runtime introspection) in effect construct the interface from the code.
The equivalent of an interface in Java is a Protocol in Objective C.
You just get used to it.

Why doesn't Objective-C support private methods?

I've seen a number of strategies for declaring semi-private methods in Objective-C, but there does not seem to be a way to make a truly private method. I accept that. But, why is this so? Every explanation I've essentially says, "you can't do it, but here's a close approximation."
There are a number of keywords applied to ivars (members) that control their scope, e.g. #private, #public, #protected. Why can't this be done for methods as well? It seems like something the runtime should be able to support. Is there an underlying philosophy I'm missing? Is this deliberate?
The answer is... well... simple. Simplicity and consistency, in fact.
Objective-C is purely dynamic at the moment of method dispatch. In particular, every method dispatch goes through the exact same dynamic method resolution point as every other method dispatch. At runtime, every method implementation has the exact same exposure and all of the APIs provided by the Objective-C runtime that work with methods and selectors work equally the same across all methods.
As many have answered (both here and in other questions), compile-time private methods are supported; if a class doesn't declare a method in its publicly available interface, then that method might as well not exist as far as your code is concerned. In other words, you can achieve all of the various combinations of visibility desired at compilation time by organizing your project appropriately.
There is little benefit to duplicating the same functionality into the runtime. It would add a tremendous amount of complexity and overhead. And even with all of that complexity, it still wouldn't prevent all but the most casual developer from executing your supposedly "private" methods.
EDIT: One of the assumptions I've
noticed is that private messages would
have to go through the runtime
resulting in a potentially large
overhead. Is this absolutely true?
Yes, it is. There's no reason to suppose that the implementor of a class would not want to use all of the Objective-C feature set in the implementation, and that means that dynamic dispatch must happen. However, there is no particular reason why private methods couldn't be dispatched by a special variant of objc_msgSend(), since the compiler would know that they were private; i.e. this could be achieved by adding a private-only method table to the Class structure.
There would be no way for a private
method to short-circuit this check or
skip the runtime?
It couldn't skip the runtime, but the runtime wouldn't necessarily have to do any checking for private methods.
That said, there's no reason that a third-party couldn't deliberately call objc_msgSendPrivate() on an object, outside of the implementation of that object, and some things (KVO, for example) would have to do that. In effect, it would just be a convention and little better in practice than prefixing private methods’ selectors or not mentioning them in the interface header.
To do so, though, would undermine the pure dynamic nature of the language. No longer would every method dispatch go through an identical dispatch mechanism. Instead, you would be left in a situation where most methods behave one way and a small handful are just different.
This extends beyond the runtime as there are many mechanisms in Cocoa built on top of the consistent dynamism of Objective-C. For example, both Key Value Coding and Key Value Observation would either have to be very heavily modified to support private methods — most likely by creating an exploitable loophole — or private methods would be incompatible.
The runtime could support it but the cost would be enormous. Every selector that is sent would need to be checked for whether it is private or public for that class, or each class would need to manage two separate dispatch tables. This isn't the same for instance variables because this level of protection is done at compile time.
Also, the runtime would need to verify that the sender of a private message is of the same class as the receiver. You could also bypass private methods; if the class used instanceMethodForSelector:, it could give the returned IMP to any other class for them to invoke the private method directly.
Private methods could not bypass the message dispatch. Consider the following scenario:
A class AllPublic has a public instance method doSomething
Another class HasPrivate has a private instance method also called doSomething
You create an array containing any number of instances of both AllPublic and HasPrivate
You have the following loop:
for (id anObject in myArray)
[anObject doSomething];
If you ran that loop from within AllPublic, the runtime would have to stop you sending doSomething on the HasPrivate instances, however this loop would be usable if it was inside the HasPrivate class.
The answers posted thus far do a good job of answering the question from a philosophical perspective, so I'm going to posit a more pragmatic reason: what would be gained by changing the semantics of the language? It's simple enough to effectively "hide" private methods. By way of example, imagine you have a class declared in a header file, like so:
#interface MyObject : NSObject {}
- (void) doSomething;
#end
If you have a need for "private" methods, you can also put this in the implementation file:
#interface MyObject (Private)
- (void) doSomeHelperThing;
#end
#implementation MyObject
- (void) doSomething
{
// Do some stuff
[self doSomeHelperThing];
// Do some other stuff;
}
- (void) doSomeHelperThing
{
// Do some helper stuff
}
#end
Sure, it's not quite the same as C++/Java private methods, but it's effectively close enough, so why alter the semantics of the language, as well as the compiler, runtime, etc., to add a feature that's already emulated in an acceptable way? As noted in other answers, the message-passing semantics -- and their reliance on runtime reflection -- would make handling "private" messages non-trivial.
The easiest solution is just to declare some static C functions in your Objective-C classes. These only have file scope as per the C rules for the static keyword and because of that they can only be used by methods in that class.
No fuss at all.
Yes, it can be done without affecting the runtime by utilizing a technique already employed by the compiler(s) for handling C++: name-mangling.
It hasn't been done because it hasn't been established that it would solve some considerable difficulty in the coding problem space that other techniques (e.g., prefixing or underscoring) are able to circumvent sufficiently. IOW, you need more pain to overcome ingrained habits.
You could contribute patches to clang or gcc that add private methods to the syntax and generated mangled names that it alone recognized during compilation (and promptly forgot). Then others in the Objective-C community would be able to determine whether it was actually worthwhile or not. It's likely to be faster that way than trying to convince the developers.
Essentially, it has to do with Objective-C's message-passing form of method calls. Any message can be sent to any object, and the object chooses how to respond to the message. Normally it will respond by executing the method named after the message, but it could respond in a number of other ways too. This doesn't make private methods completely impossible — Ruby does it with a similar message-passing system — but it does make them somewhat awkward.
Even Ruby's implementation of private methods is a bit confusing to people because of the strangeness (you can send the object any message you like, except for the ones on this list!). Essentially, Ruby makes it work by forbidding private methods to be called with an explicit receiver. In Objective-C it would require even more work since Objective-C doesn't have that option.
It's an issue with the runtime environment of Objective-C. While C/C++ compiles down into unreadable machine code, Objective-C still maintains some human-readable attributes like method names as strings. This gives Objective-C the ability to perform reflective features.
EDIT: Being a reflective language without strict private methods makes Objective-C more "pythonic" in that you trust other people that use your code rather than restrict what methods they can call. Using naming conventions like double underscores is meant to hide your code from a casual client coder, but won't stop coders needing to do more serious work.
There are two answers depending on the interpretation of the question.
The first is by hiding the method implementation from the interface. This is used, typically with a category with no name (e.g. #interface Foo()). This permits the object to send those messages but not others - though one might still override accidentally (or otherwise).
The second answer, on the assumption that this is about performance and inlining, is made possible but as a local C function instead. If you wanted a ‘private foo(NSString *arg)‘ method, you would do void MyClass_foo(MyClass *self, NSString *arg) and call it as a C function like MyClass_foo(self,arg). The syntax is different, but it acts with the sane kind of performance characteristics of C++'s private methods.
Although this answers the question, I should point out that the no-name category is by far the more common Objective-C way of doing this.
Objective-C doesn't support private methods because it doesn't need them.
In C++, every method must be visible in the declaration of the class. You can't have methods that someone including the header file cannot see. So if you want methods that code outside your implementation shouldn't use, you have no choice, the compiler must give you some tool so you can tell it that the method must not be used, that is the "private" keyword.
In Objective-C, you can have methods that are not in the header file. So you achieve the same purpose very easily by not adding the method to the header file. There's no need for private methods. Objective-C also has the advantage that you don't need to recompile every user of a class because you changed private methods.
For instance variables, that you used to have to declare in the header file (not anymore), #private, #public and #protected are available.
A missing answer here is: because private methods are a bad idea from an evolvability point of view. It might seem a good idea to make a method private when writing it, but it is a form of early binding. The context might change, and a later user might want to use a different implementation. A bit provocative: "Agile developers don't use private methods"
In a way, just like Smalltalk, Objective-C is for grown-up programmers. We value knowing what the original developer assumed the interface should be, and take the responsibility to deal with the consequences if we need to change implementation. So yes, it is philosophy, not implementation.