GNU Objective-C runtime trickery - objective-c

Can I, in the GNU Objective-C runtime, attach semi-arbitrary pieces of data to instance variables?
Challenge:
I'm currently working on a kind of Cocoa workalike for Linux, as a sort of pet project. (Please, let's not get sidetracked by all the "use GNUStep" stuff. I know about it, but it doesn't suit my needs. Moving on…) For this purpose I'm trying to cobble together a simple ORM system, reminiscent of DBIx::Class for Perl. The general idea is to make the declaration as simple (read: short) as possible, and if at all possible, without the need to provide +(id)constantClassAttribute methods for overriding.
The general idea is to declare my result classes as follows:
@interface SomeTable : ORMResult {
    unsigned long long id;
    ORMResult *toOneRelation;
    ORMResultSet *toManyRelation;
}
@end
So far, so hoopy. I can now access these fields using [ORMResult self]->ivars, and do all manner of nasty stuff, like automagically generating accessors like -[toManyRelation] or -[setToOneRelation]. Piece of cake. Unfortunately, there are two pieces of information I cannot add using this setup; one is simple enough to solve, the other not so much:
What is the actual result class?
This is solved by subclassing ORMResult (like SomeTable), and plugging that in
there, using runtime dynam(ag)ics to figure out its to-ness (toMany, toOne).
(And this is the tricky one!) Is the relationship nullable?
This is less easily solved. My initial ideas were
(ab)using protocols, like so:
@interface SomeTable : ORMResult {
    unsigned long long id;
    ORMResult <ORMNullable> *toOneRelation;
}
@end
This compiles, but unfortunately, when I try to use GDB to inspect the
ivars->ivar_list entries I find that the protocol information isn't actually kept
for the runtime to toy with. This makes, I suppose, some kind of twisted sense,
as protocol declarations are mostly for the compiler.
Abusing the protocol qualifiers (byref, bycopy and friends), using defines:
@interface SomeTable : ORMResult {
    unsigned long long id;
    nullable ORMResult *toOneRelation;
}
@end
This has the rather obvious drawback of not actually working, as these
specifiers apparently only work in protocol method declarations.
The question, then, is how can this attachment of information to the ivars be pulled off in practice?
Note: As mentioned initially, I'm using the GNU Objective-C runtime, as supplied by GCC on Linux; and not the one supplied by Apple!
Edit: Starpox! I forgot a central point: An alternative, of course, is to simply make all relations nullable. This I don't really want, but if no other alternative exists, I guess that's the path I'll end up going down.

Well, how we used to do this in ye olde days on the Mac was to create a global variable holding an NSMutableDictionary into which we put the data we want to attach to an object. Simply use a string representation of the pointer as the key.
The only difficulty becomes figuring out when an object has gone away and making sure that its entry in the dictionary is removed as well. You may have to resort to hackery like method swizzling -dealloc to achieve that.
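The side-table idea above can be sketched in plain C (which also compiles as Objective-C), assuming a small fixed-capacity table keyed directly by the object's address rather than by a string form of it. The names side_table_set, side_table_get and side_table_remove are hypothetical and not part of any runtime; an NSMutableDictionary version would follow the same shape.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical side table: maps an object's address to attached data. */
#define SIDE_TABLE_CAPACITY 64

struct side_entry {
    const void *object;  /* key: the object's address; NULL marks a free slot */
    void *data;          /* value: whatever the caller attached */
};

static struct side_entry side_table[SIDE_TABLE_CAPACITY];

/* Attach data to object, replacing any previous attachment.
   Returns 0 on success, -1 if the table is full. */
int side_table_set(const void *object, void *data)
{
    struct side_entry *free_slot = NULL;

    assert(object != NULL);  /* NULL is reserved as the free-slot marker */

    for (size_t i = 0; i < SIDE_TABLE_CAPACITY; i++) {
        if (side_table[i].object == object) {
            side_table[i].data = data;
            return 0;
        }
        if (side_table[i].object == NULL && free_slot == NULL)
            free_slot = &side_table[i];
    }
    if (free_slot == NULL)
        return -1;
    free_slot->object = object;
    free_slot->data = data;
    return 0;
}

/* Return the data attached to object, or NULL if there is none. */
void *side_table_get(const void *object)
{
    for (size_t i = 0; i < SIDE_TABLE_CAPACITY; i++) {
        if (object != NULL && side_table[i].object == object)
            return side_table[i].data;
    }
    return NULL;
}

/* Forget the entry for object; meant to be called when the object dies. */
void side_table_remove(const void *object)
{
    for (size_t i = 0; i < SIDE_TABLE_CAPACITY; i++) {
        if (object != NULL && side_table[i].object == object) {
            side_table[i].object = NULL;
            side_table[i].data = NULL;
        }
    }
}
```

Calling side_table_remove from the object's -dealloc (or a swizzled replacement) is what keeps a stale address from matching a later allocation that happens to reuse it.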

You might look at objc_setAssociatedObject and friends, which allow you to attach arbitrary data to an object. However, I'm not sure if they're supported in the version of libobjc that you're running.

Related

Return type differs in implementation for objective C protocols

I have a protocol that has a method returning NSArray*.
In the implementation I had made the return type of that method to be NSView*
I see this is happening only in case of Objective C class pointers and not in other cases like returning void vs returning int.
I would expect a compiler warning at the minimum, but the compilation happens just fine.
@protocol prot <NSObject>
- (NSArray *)array;
@end
@interface impl : NSObject <prot>
@end
@implementation impl
// Should return NSArray. Returns NSView instead
- (NSView *)array
{
    return nil;
}
@end
First things first:
impl should be Implementation since class names are written in upper camel case and abbreviations are bad(TM). Moreover, Class is a class pointer, NSView* and NSArray* are instance pointers.
To your Q, even though I'm a bit tired of this discussion (dynamic vs. static typing, early vs. late binding):
A: Why should the compiler warn? Both are instance pointers and maybe the messages sent to the object are supported by both. The compiler does not care about binding, it is done at runtime.
B: But this is very unsafe!
A: Did you ever ship code with such an error?
B: No. But it is unsafe by theory.
A: Yes, that's true for all theories that ship code without running it at least one time.
B: But you have to admit, that this is more unsafe than type checking at compile time.
A: Yes, theoretically that's true.
B: So why do you support it?
A: Because there are many situations in which dynamic typing has advantages. For example, it is very easy to write generic code without having templates. (Even if they are sometimes called generics, they are still silly templates.) It is very easy to hand responsibility around, which needs contra-conceptual extensions in other languages (signals & slots in C++, delegates in C#, …). It is very easy to create stand-in objects for lowering memory pressure. It is very easy to write an ORM. Shall I continue?
B: Yes
A: It is so flexible that you can write a whole AOP framework within that language. It is so flexible that you can write a prototype-based framework within that language.
However, sometimes it is easy to detect for the compiler that something makes no sense at all. And sometimes the compiler warns about that. But in many cases the compiler is not more intelligent than the developer.
Agreed that it should generate a warning, but it doesn't. Part of the issue is that all ObjC objects are id at runtime, which is why you're seeing different behavior for int (which isn't id). But that's not really an excuse. It's a limitation of the compiler. There are numerous places where it doesn't do a good job of distinguishing between ObjC object types. ObjC objects are duck-typed, so as long as they respond to the right messages "they work."
Sometimes this is a benefit; for example, NSArray is actually a class cluster, and there are several (private) types that pretend to be NSArray by just implementing the same interface. That's something that is easy in ObjC, but hard in Swift. Still no excuse, since it would be easy to get that benefit without this frustrating lack of a compiler warning, but it gets back to how ObjC thinks about class types.
This limitation is fixed in Swift, and another benefit of moving over, but that doesn't really help you, I know.

Why did Apple previously typedef reference (pointer) types but not now?

I've been wondering why Apple uses data types in Core Foundation that are typedef'd to a pointer type while in Cocoa they are not.
As an example, you would reference a UIColor object like UIColor * while a reference to a CGColor object would be CGColorRef? Or NSURL * and CFURLRef? Why not just always use CGColor * and CFURL *? Or conversely, why no UIColorRef or NSURLRef types, since you never access a UIColor or NSURL directly anyway?
Or for example, why is it id and not id *, since it is actually a pointer and can in fact be typecast to void *?
Specifically, is there some reason Apple had a habit of doing this in their older frameworks, but stopped doing it in Cocoa? Is it simply a matter of style?
What Matt said, but there is a bit more to it.
The typedefs in the C based APIs also allow the implementation details to be hidden. For example, you can have the following without ever defining the __CFURL structure in a public header.
typedef const struct __CFURL *CFURLRef;
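A minimal sketch of the same opaque-pointer pattern in plain C, with hypothetical names (CounterRef, CounterCreate and so on) standing in for a Core Foundation type. In a real library the first part would live in the public header and the struct definition only in the implementation file; both are shown in one block here so that it compiles on its own.

```c
#include <assert.h>
#include <stdlib.h>

/* --- What the public header would contain --- */

/* struct Counter is named here but not defined, so callers cannot see inside it. */
typedef struct Counter *CounterRef;

CounterRef CounterCreate(void);
void CounterIncrement(CounterRef counter);
int CounterGetValue(CounterRef counter);
void CounterRelease(CounterRef counter);

/* --- What would stay private to the implementation file --- */

struct Counter {
    int value;
};

/* Allocate a counter starting at zero; returns NULL if allocation fails. */
CounterRef CounterCreate(void)
{
    CounterRef counter = malloc(sizeof *counter);
    if (counter != NULL)
        counter->value = 0;
    return counter;
}

void CounterIncrement(CounterRef counter)
{
    assert(counter != NULL);
    counter->value += 1;
}

int CounterGetValue(CounterRef counter)
{
    assert(counter != NULL);
    return counter->value;
}

/* Free the counter; passing NULL is harmless because free(NULL) is a no-op. */
void CounterRelease(CounterRef counter)
{
    free(counter);
}
```

Callers can pass a CounterRef around but cannot touch value directly, so the struct layout can change without breaking them.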
Objective-C has long had these kinds of features in the form of categories and, recently added, the ability to move instance variable declarations out of the header file. Expect that, over time, you will see all instance variables removed from the public header files in the SDK.
Note that the Cocoa frameworks long, long, pre-dated CoreFoundation.
As for why id is used instead of id *, that dates back to when Objective-C was first created in the early 1980s. Specifically, the notion of the language was that you would build "software integrated circuits" that could be "plugged together" like real ICs. The goal was to keep the C bits around as implementation details and, ideally, not exposed in your APIs.
As for why you end up with NSString * instead of NSString, that is largely exactly because of the C underpinnings of the language. I wrote a fairly detailed answer to a slightly different SO question that is relevant.
You'll probably also find this answer relevant, too.
The reason for NSURL* vs CFURLRef is pretty much that it's just coding style. Cocoa is an Objective-C API and the general style in Objective-C is to not have a typedef whereas Core Foundation is a C API and the general style of it is to use a typedef. It's pretty much down to coding style.
id vs id* - I am not entirely sure with that, but my guess is it's historical and they just wanted to have the base "object" to be without the *. I don't know for sure the history of that, though. But again it'll just be a style thing.

If Protocol method is marked @required, when not implemented, why does compiler issue a warning and not an error?

Assume that:
New Protocol is declared
Method in this protocol is marked @required
Class conforms to Protocol
Class does not implement the method mentioned in Protocol
At compile time, information about this method is known: that it is required, and that neither this class nor any class it extends implements it.
Why does the compiler issue a warning in this case and not an error?
Errors are only issued when the compiler cannot continue because something went terribly wrong.
When calling a method in Objective-C, the method lookup is done at runtime and not during compilation, as it is for a direct call in C++. In Objective-C a "message" is simply sent to the object, something like obj.executeCommand("Hey, can you execute function <name> for me?"). In C++ the function is called directly, in a way like obj.<name>(). In the Objective-C case the thing that is actually called is executeCommand(), which always exists, so there is something to bind the call to. In the C++ case the function itself is called, and if it does not exist there is nothing to bind to. Both calls are resolved at build time, which means they become memory addresses rather than names: executeCommand becomes something like 0x12345678, but it still carries the same message ("execute function <name>"), which is looked up only when the program runs.
This is probably very confusing, but it's related to the way methods are implemented in different languages.
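A rough plain-C sketch of that difference, using a hypothetical send_message function as the stand-in for the always-present executeCommand. The method table and names are invented for illustration and are far simpler than what the real Objective-C runtime does.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef int (*method_fn)(int);

struct method_entry {
    const char *name;  /* the "selector" as a string */
    method_fn fn;      /* the function that implements it */
};

static int double_it(int x) { return x * 2; }
static int negate(int x) { return -x; }

/* The methods this toy "object" knows about. */
static const struct method_entry methods[] = {
    { "doubleIt", double_it },
    { "negate",   negate   },
};

/* Look the method up by name at run time and call it.
   Returns 0 and stores the result on success, -1 if the name is unknown. */
int send_message(const char *name, int arg, int *result)
{
    assert(name != NULL && result != NULL);

    for (size_t i = 0; i < sizeof methods / sizeof methods[0]; i++) {
        if (strcmp(methods[i].name, name) == 0) {
            *result = methods[i].fn(arg);
            return 0;
        }
    }
    return -1;  /* unknown message: discovered only when the call runs */
}
```

Because send_message itself always exists, a call that names an unknown method still compiles and links, and the failure shows up only as a return value at run time. A direct call to a C function that was never defined would instead fail when the program is linked.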
If you feel strongly about it, why not turn on -Werror?
I don't know the real answer but here is a use case that would go against it.
What if you implemented all of the protocol methods in a category???
Main interface declaration adopts the protocol however the protocol method implementation is in a category.
This is valid code, but it would produce a compile error if the compiler were that strict!
Objective-C is a dynamic language. The idea of what an implementation is, is different to a static language.
For the most part, it's the code that most of us write inside the @implementation ... @end block.
But what if a method is not found? Then an object has a chance to deal with it dynamically.
Imagine you have an interface for a sound effect player:
@protocol FX
- (void)playBeep;
- (void)playSiren;
- (void)playHonk;
@end
An implementation could have the files Beep.mp3, Siren.mp3, Honk.mp3 to play, but instead of implementing each of the methods, it could override -forwardInvocation: and parse the selector string, something like this pseudocode:
NSString *selName = NSStringFromSelector([invocation selector]);
if ([selName hasPrefix:@"play"]) {
    NSString *filename = fileNameFromSelector(selName);
    [self playSoundFileNamed:filename];
}
This may seem contrived, but once you start using the dynamic features of the language, you will start finding more and more places where it makes sense. And by sense I mean, does this effort help in the long run?
In the above case, just add a -play* method name to the interface and drop in an appropriately named sound file. It just works.
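The fileNameFromSelector helper in the pseudocode is not defined in the answer. One possible shape for it, sketched in plain C with a hypothetical function name and an assumed convention that the sound file is the selector minus its "play" prefix plus ".mp3":

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Derive a sound file name from a selector name such as "playBeep".
   Writes the result into out (at most out_size bytes, NUL-terminated).
   Returns 0 on success, -1 if the selector does not start with "play",
   names no sound after the prefix, or the result does not fit in out. */
int file_name_from_selector(const char *selector, char *out, size_t out_size)
{
    static const char prefix[] = "play";
    const size_t prefix_len = sizeof prefix - 1;
    int written;

    assert(selector != NULL && out != NULL);

    if (strncmp(selector, prefix, prefix_len) != 0 ||
        selector[prefix_len] == '\0')
        return -1;

    /* Assumed convention: "playBeep" maps to "Beep.mp3". */
    written = snprintf(out, out_size, "%s.mp3", selector + prefix_len);
    if (written < 0 || (size_t)written >= out_size)
        return -1;
    return 0;
}
```

Under this convention, supporting a new sound means adding one declaration to the protocol and one file to the bundle, with no new method body.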
Another example from personal experiments: how to deal with Core Data entities in a more natural way. I want to do this:
NSArray *people = [Person findAllWithNameLike:@"B%"];
instead of mucking about with predicates, fetch requests etc.
But I don't want to define every permutation of method in code.
How about if I wanted to build an XML builder? I would look at a dynamic approach. It has served Groovy Builders well (look at Groovy/Grails for examples).
One last example: I have a traits system where I can define behaviours in the form of groups of methods and have my objects assimilate this behaviour. So, while the compiler doesn't see an implementation for the interface my object conforms to, the implementation is injected into it from a trait class, using the Objective-C runtime. Why would I do this? I find many delegate methods are boiler plate, but at the same time, a single base class for each situation is not flexible enough. Instead of cut and paste from code samples, my 'samples' compile and run :) and any changes are reflected across all projects using the trait.
To really understand why all this is available to you, it is worth playing around with a Smalltalk environment (search Pharo or Squeak). This is where Objective-C has its roots.
And finally, to stop these warnings:
#pragma clang diagnostic push
#pragma clang diagnostic ignored "-Wprotocol"
@implementation ... @end
#pragma clang diagnostic pop
Because there are times when there are bogus "required" methods in a poorly designed protocol.
They should have been optional but someone insisted they are "required".
Thusly making this a run time issue rather than a compile bug is very very wise.

No ivars -> What am I missing?

I never use ivars. I only use properties -- sometimes assign properties with primitive types, and sometimes on a "private" class extension. I've seen the advantages of not using ivars in switching to ARC -- I have some borrowed code with lots of ivars that I still can't "ARC", since I don't know what needs to be retained. So I know some advantages of not using ivars, but what are the advantages of using ivars instead of properties?
Note: I depend exclusively on the ivars that are automagically added in (by the compiler?) for the property declaration.
Don't mark to close: I've looked at some of the other questions, like this and this and none hit the spot. The titles look good, but like so many questions on SO, the questions are a mess of strange doubts and other stuff.
Declared properties cannot be treated in the same manner as a @protected ivar. You can declare the property in a class extension to keep it private from any other class, or declare it in the header interface to make it publicly accessible, however there is no way to make it accessible only to subclasses. This would require the ivar declaration.
EDIT
Just another brief thought. I have recently been writing a lot of framework classes, and I think there might be something to be said for using iVars as documentation.
For example, let's say you are calling some code in a tight loop and you want to ensure that it is performant. Inside that tight loop you want to access a property of a class, but need to know whether each time you call it the return value is calculated on-the-fly or stored in an iVar. Seeing the iVar in the header is a quick way to ensure that you'll get that variable back without much overhead.
I don't see any reason to use iVars if you don't have to. If Apple and the compiler want to do work for you, I say let them. You'll have code that is more efficient and easier to maintain. At this point iVars are legacy code.
One good reason for me: an annoying GCC bug, see this other question for a description.
If you're using Clang/LLVM, then you don't have to worry about this bug.

Why doesn't Objective-C support private methods?

I've seen a number of strategies for declaring semi-private methods in Objective-C, but there does not seem to be a way to make a truly private method. I accept that. But, why is this so? Every explanation I've seen essentially says, "you can't do it, but here's a close approximation."
There are a number of keywords applied to ivars (members) that control their scope, e.g. @private, @public, @protected. Why can't this be done for methods as well? It seems like something the runtime should be able to support. Is there an underlying philosophy I'm missing? Is this deliberate?
The answer is... well... simple. Simplicity and consistency, in fact.
Objective-C is purely dynamic at the moment of method dispatch. In particular, every method dispatch goes through the exact same dynamic method resolution point as every other method dispatch. At runtime, every method implementation has the exact same exposure and all of the APIs provided by the Objective-C runtime that work with methods and selectors work equally the same across all methods.
As many have answered (both here and in other questions), compile-time private methods are supported; if a class doesn't declare a method in its publicly available interface, then that method might as well not exist as far as your code is concerned. In other words, you can achieve all of the various combinations of visibility desired at compilation time by organizing your project appropriately.
There is little benefit to duplicating the same functionality into the runtime. It would add a tremendous amount of complexity and overhead. And even with all of that complexity, it still wouldn't prevent all but the most casual developer from executing your supposedly "private" methods.
EDIT: One of the assumptions I've noticed is that private messages would have to go through the runtime resulting in a potentially large overhead. Is this absolutely true?
Yes, it is. There's no reason to suppose that the implementor of a class would not want to use all of the Objective-C feature set in the implementation, and that means that dynamic dispatch must happen. However, there is no particular reason why private methods couldn't be dispatched by a special variant of objc_msgSend(), since the compiler would know that they were private; i.e. this could be achieved by adding a private-only method table to the Class structure.
There would be no way for a private method to short-circuit this check or skip the runtime?
It couldn't skip the runtime, but the runtime wouldn't necessarily have to do any checking for private methods.
That said, there's no reason that a third-party couldn't deliberately call objc_msgSendPrivate() on an object, outside of the implementation of that object, and some things (KVO, for example) would have to do that. In effect, it would just be a convention and little better in practice than prefixing private methods’ selectors or not mentioning them in the interface header.
To do so, though, would undermine the pure dynamic nature of the language. No longer would every method dispatch go through an identical dispatch mechanism. Instead, you would be left in a situation where most methods behave one way and a small handful are just different.
This extends beyond the runtime as there are many mechanisms in Cocoa built on top of the consistent dynamism of Objective-C. For example, both Key Value Coding and Key Value Observation would either have to be very heavily modified to support private methods — most likely by creating an exploitable loophole — or private methods would be incompatible.
The runtime could support it but the cost would be enormous. Every selector that is sent would need to be checked for whether it is private or public for that class, or each class would need to manage two separate dispatch tables. This isn't the same for instance variables because this level of protection is done at compile time.
Also, the runtime would need to verify that the sender of a private message is of the same class as the receiver. You could also bypass private methods; if the class used instanceMethodForSelector:, it could give the returned IMP to any other class for them to invoke the private method directly.
Private methods could not bypass the message dispatch. Consider the following scenario:
A class AllPublic has a public instance method doSomething
Another class HasPrivate has a private instance method also called doSomething
You create an array containing any number of instances of both AllPublic and HasPrivate
You have the following loop:
for (id anObject in myArray)
[anObject doSomething];
If you ran that loop from within AllPublic, the runtime would have to stop you sending doSomething on the HasPrivate instances, however this loop would be usable if it was inside the HasPrivate class.
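To make that cost concrete, here is a toy model in plain C of a dispatcher that enforces privacy. Everything in it (struct klass, send_checked, the status codes) is hypothetical and much simpler than a real runtime; it only shows that every send would have to carry the sender's class and compare it with the receiver's.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct method {
    const char *selector;
    int is_private;       /* nonzero: only the owning class may send it */
    int (*imp)(void);     /* implementation; returns an int so it is observable */
};

struct klass {
    const char *name;
    const struct method *methods;
    size_t method_count;
};

struct object {
    const struct klass *isa;  /* the object's class */
};

enum send_status { SEND_OK = 0, SEND_NOT_FOUND = -1, SEND_FORBIDDEN = -2 };

static int all_public_do_something(void)  { return 1; }
static int has_private_do_something(void) { return 2; }

static const struct method all_public_methods[] = {
    { "doSomething", 0, all_public_do_something },
};
static const struct method has_private_methods[] = {
    { "doSomething", 1, has_private_do_something },
};

const struct klass AllPublic  = { "AllPublic",  all_public_methods,  1 };
const struct klass HasPrivate = { "HasPrivate", has_private_methods, 1 };

/* Dispatch selector to receiver on behalf of code that lives in class sender.
   The privacy test runs on every matching send, which is the extra work
   the answer describes. */
int send_checked(const struct klass *sender, const struct object *receiver,
                 const char *selector, int *result)
{
    const struct klass *cls;

    assert(receiver != NULL && selector != NULL && result != NULL);
    cls = receiver->isa;

    for (size_t i = 0; i < cls->method_count; i++) {
        const struct method *m = &cls->methods[i];
        if (strcmp(m->selector, selector) != 0)
            continue;
        if (m->is_private && sender != cls)
            return SEND_FORBIDDEN;
        *result = m->imp();
        return SEND_OK;
    }
    return SEND_NOT_FOUND;
}
```

In this model the loop from the scenario succeeds for every element when the sender is HasPrivate, but is refused on the HasPrivate instances when the sender is AllPublic, even though the source line is identical in both cases.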
The answers posted thus far do a good job of answering the question from a philosophical perspective, so I'm going to posit a more pragmatic reason: what would be gained by changing the semantics of the language? It's simple enough to effectively "hide" private methods. By way of example, imagine you have a class declared in a header file, like so:
@interface MyObject : NSObject {}
- (void) doSomething;
@end
If you have a need for "private" methods, you can also put this in the implementation file:
@interface MyObject (Private)
- (void) doSomeHelperThing;
@end
@implementation MyObject
- (void) doSomething
{
    // Do some stuff
    [self doSomeHelperThing];
    // Do some other stuff;
}
- (void) doSomeHelperThing
{
    // Do some helper stuff
}
@end
Sure, it's not quite the same as C++/Java private methods, but it's effectively close enough, so why alter the semantics of the language, as well as the compiler, runtime, etc., to add a feature that's already emulated in an acceptable way? As noted in other answers, the message-passing semantics -- and their reliance on runtime reflection -- would make handling "private" messages non-trivial.
The easiest solution is just to declare some static C functions in your Objective-C classes. These only have file scope as per the C rules for the static keyword and because of that they can only be used by methods in that class.
No fuss at all.
Yes, it can be done without affecting the runtime by utilizing a technique already employed by the compiler(s) for handling C++: name-mangling.
It hasn't been done because it hasn't been established that it would solve some considerable difficulty in the coding problem space that other techniques (e.g., prefixing or underscoring) are able to circumvent sufficiently. IOW, you need more pain to overcome ingrained habits.
You could contribute patches to clang or gcc that add private methods to the syntax and generated mangled names that it alone recognized during compilation (and promptly forgot). Then others in the Objective-C community would be able to determine whether it was actually worthwhile or not. It's likely to be faster that way than trying to convince the developers.
Essentially, it has to do with Objective-C's message-passing form of method calls. Any message can be sent to any object, and the object chooses how to respond to the message. Normally it will respond by executing the method named after the message, but it could respond in a number of other ways too. This doesn't make private methods completely impossible — Ruby does it with a similar message-passing system — but it does make them somewhat awkward.
Even Ruby's implementation of private methods is a bit confusing to people because of the strangeness (you can send the object any message you like, except for the ones on this list!). Essentially, Ruby makes it work by forbidding private methods to be called with an explicit receiver. In Objective-C it would require even more work since Objective-C doesn't have that option.
It's an issue with the runtime environment of Objective-C. While C/C++ compiles down into unreadable machine code, Objective-C still maintains some human-readable attributes like method names as strings. This gives Objective-C the ability to perform reflective features.
EDIT: Being a reflective language without strict private methods makes Objective-C more "pythonic" in that you trust other people that use your code rather than restrict what methods they can call. Using naming conventions like double underscores is meant to hide your code from a casual client coder, but won't stop coders needing to do more serious work.
There are two answers depending on the interpretation of the question.
The first is by hiding the method implementation from the interface. This is typically done with a category with no name (e.g. @interface Foo ()). This permits the object to send those messages but not others - though one might still override accidentally (or otherwise).
The second answer, on the assumption that this is about performance and inlining, is that it is possible, but as a local C function instead. If you wanted a 'private foo(NSString *arg)' method, you would write void MyClass_foo(MyClass *self, NSString *arg) and call it as a C function like MyClass_foo(self, arg). The syntax is different, but it has the same kind of performance characteristics as C++'s private methods.
Although this answers the question, I should point out that the no-name category is by far the more common Objective-C way of doing this.
Objective-C doesn't support private methods because it doesn't need them.
In C++, every method must be visible in the declaration of the class. You can't have methods that someone including the header file cannot see. So if you want methods that code outside your implementation shouldn't use, you have no choice, the compiler must give you some tool so you can tell it that the method must not be used, that is the "private" keyword.
In Objective-C, you can have methods that are not in the header file. So you achieve the same purpose very easily by not adding the method to the header file. There's no need for private methods. Objective-C also has the advantage that you don't need to recompile every user of a class because you changed private methods.
For instance variables, which you used to have to declare in the header file (not anymore), @private, @public and @protected are available.
A missing answer here is: because private methods are a bad idea from an evolvability point of view. It might seem a good idea to make a method private when writing it, but it is a form of early binding. The context might change, and a later user might want to use a different implementation. A bit provocative: "Agile developers don't use private methods"
In a way, just like Smalltalk, Objective-C is for grown-up programmers. We value knowing what the original developer assumed the interface should be, and take the responsibility to deal with the consequences if we need to change implementation. So yes, it is philosophy, not implementation.