Why does registering a subclass with its superclass in +initialize present a chicken and egg issue? (objc) - objective-c

Just reading an excerpt from this website.
Because +initialize runs lazily, it's obviously not a good place to
put code to register a class that otherwise wouldn't get used. For
example, NSValueTransformer or NSURLProtocol subclasses can't use
+initialize to register themselves with their superclasses, because you set up a chicken-and-egg situation.
I understand that +initialize is run once per class when the first message is sent to that class. Also, if any of the subclasses do not implement their own +initialize, the +initialize method will be run again in the superclass.
I am just not 100% on why registering a subclass with its superclass in its own +initialize method would present a chicken and egg problem.
Is it because the superclass may have never had its +initialize invoked, and you are trying to register your subclass with its superclass in a method that depends on the superclass calling its +initialize first?
Just a little bit of further clarification would go a long way for me, thank you.

Take the example of NSURLProtocol. The way it's used is that registered subclasses are asked, in turn, if they can handle a request. The first to answer yes gets an instance created and the request is handed off.
The initialize method is only called if a message is sent to the class. Since only registered subclasses are asked to handle a request, you can't register in initialize because it won't ever be invoked.
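As a rough sketch of the alternative, registration can be done explicitly at startup instead of inside +initialize. The subclass name `MyURLProtocol`, the `myscheme` URL scheme, and the empty loading stubs are hypothetical and only there to make the example complete; `+[NSURLProtocol registerClass:]` is the real Foundation call.

```objc
#import <Foundation/Foundation.h>

// Hypothetical NSURLProtocol subclass, for illustration only.
@interface MyURLProtocol : NSURLProtocol
@end

@implementation MyURLProtocol

// Registering here would never happen: nothing messages MyURLProtocol
// until it is registered, and +initialize only runs once it is messaged.
// + (void)initialize { [NSURLProtocol registerClass:self]; }

+ (BOOL)canInitWithRequest:(NSURLRequest *)request {
    // "myscheme" is an assumed scheme for this sketch.
    return [request.URL.scheme isEqualToString:@"myscheme"];
}

+ (NSURLRequest *)canonicalRequestForRequest:(NSURLRequest *)request {
    return request;
}

// Stubs only; a real protocol would load data and notify its client.
- (void)startLoading {}
- (void)stopLoading {}

@end

int main(void) {
    @autoreleasepool {
        // Explicit registration from startup code, outside the class itself.
        BOOL registered = [NSURLProtocol registerClass:[MyURLProtocol class]];
        NSLog(@"MyURLProtocol registered: %d", registered);
    }
    return 0;
}
```

Note that `[MyURLProtocol class]` is itself the first message sent to the class, so +initialize (if implemented) would run at that point, after which registration proceeds normally.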

Two extracts from the documentation on the initialize method:
The runtime sends initialize to each class in a program just before the class, or any class that inherits from it, is sent its first message from within the program. The runtime sends the initialize message to classes in a thread-safe manner. Superclasses receive this message before their subclasses.
...
Because initialize is called in a thread-safe manner and the order of initialize being called on different classes is not guaranteed, it’s important to do the minimum amount of work necessary in initialize methods. Specifically, any code that takes locks that might be required by other classes in their initialize methods is liable to lead to deadlocks. Therefore you should not rely on initialize for complex initialization, and should instead limit it to straightforward, class local initialization.
The initialize message is sent to a class the first time the runtime encounters it, for example the first time you need to allocate that class or the first time you access its sharedInstance method (in case of a singleton), and it acquires some locks in order to guarantee the thread safety. If you make references to subclasses from within this method, you can get into a deadlock situation, as both the base class and the subclass will lock onto the same thing.
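For example, let's consider the scenario of a superclass MyClass and one of its children MySubclass:
@interface MyClass : NSObject
@end
@interface MySubclass : MyClass
+ (void)doSomething; // hypothetical class method
@end
@implementation MyClass
+ (void)initialize {
    // Messaging the subclass from the superclass's +initialize
    [MySubclass doSomething];
}
@end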
When the runtime encounters the first usage of MyClass, it acquires a lock and calls the class method initialize. While executing that method, it realises that this is also the first time it has encountered MySubclass, which must also be initialized before the class can do any actual work. And what does this trigger? Yes, you've guessed it: another call to +[MyClass initialize].
This is how we end up in the chicken-and-egg situation or, to put it more technically, the deadlock or the recursion. MyClass calls on MySubclass, which means that MySubclass needs to be initialized before MyClass is used. However, MySubclass is a child of MyClass, so MyClass should be initialized first. So, which one of the two should be initialized first?

Related

In what order are the +initialize and +load methods called?

Let's imagine that we have two classes:
@interface First : NSObject
@end
@interface Second : NSObject
@end
@implementation First
+ (void)load
{
    NSLog(@"This must be called first");
}
@end
@implementation Second
+ (void)load
{
    NSLog(@"And this must be called second");
}
@end
We have +load methods in each class. If we run this code, "This must be called first" is printed first and "And this must be called second" is printed second.
What determines the order in which the +load methods of these classes are called? In my experiment, if I move the @implementation of the second class before the @implementation of the first class, "And this must be called second" is printed first and "This must be called first" is printed second. Does this mean that the +load order depends only on the order in the source code?
In my real case I have a precompiled framework with a custom +load (some of its code is called before main() and I see logs from it), and I need to execute my code before that code (as I understand it, I can place it in +load, but I don't know how to change the order). Or maybe I can run my code before the framework code with some other technique?
You really can't rely on order, nor can you effectively control order. By design. +load should happen before the +initialize of that class, but the order of the two is seemingly indeterminate across multiple classes (which I find slightly surprising, but well within the rules).
This is a big part of why you shouldn't do any heavy lifting in +load or +initialize. They really should only be used sparingly and only for initializing a small bit of highly localized state. Touching other significant subsystems is dangerous because you'll be changing initialization order and behavior in ways that might break the system. Shouldn't, but it might, it can, and it has in the past.
Instead, you really should try to have a "start here" point in your framework code that the client explicitly calls into.
+load methods are invoked by the objc runtime as part of the image loading process (you can see this by breaking in your load method and printing a stacktrace).
The order in which +load methods are invoked seems to depend on the order of the objc class lists generated by clang.
If you look at the source code of the objc runtime, you'll see that load_images (the function called by dyld), calls prepare_load_methods to get a list of all objc classes in an image. prepare_load_methods calls _getObjc2NonlazyClassList, which fetches the objc classlist from the __objc_nlclslist section in the image.
load_images then calls call_load_methods, which goes through all loaded classes and invokes their +load methods.
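Source order is not the only factor, though. One ordering rule the runtime does apply is that a class's +load is sent after its superclass's +load. A minimal sketch of that rule, using the hypothetical classes `Base` and `Derived`:

```objc
#import <Foundation/Foundation.h>

// Hypothetical classes, for illustration only.
@interface Base : NSObject
@end

@interface Derived : Base
@end

// Derived is implemented first in the source file...
@implementation Derived
+ (void)load {
    NSLog(@"Derived +load");
}
@end

// ...but Base is its superclass, so Base's +load is still sent first.
@implementation Base
+ (void)load {
    NSLog(@"Base +load");
}
@end

int main(void) {
    // Both +load methods have already run by the time main() starts.
    return 0;
}
```

For two unrelated classes no such rule applies, which is why the order seen in the experiment above should be treated as an implementation detail rather than a guarantee.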

Objective-C: How to force a call to `+initialize` at startup rather than later when the class happens to be used for the first time?

Problem
For certain classes, I would like to explicitly call the +initialize method when my program starts, rather than allowing the runtime system to call it implicitly at some nondeterministic point later when the class happens to first be used. Problem is, this isn't recommended.
Most of my classes have little to no work to do in initialization, so I can just let the runtime system do its thing for those, but at least one of my classes requires as much as 1 second to initialize on older devices, and I don't want things to stutter later when the program is up and running. (A good example of this would be sound effects: I don't want a sudden delay the first time I try to play a sound.)
What are some ways to do this initialization at startup-time?
Attempted solutions
What I've done in the past is call the +initialize method manually from main.c, and made sure that every +initialize method has a bool initialized variable wrapped in a @synchronized block to prevent accidental double-initialization. But now Xcode is warning me that +initialize would be called twice. No surprise there, but I don't like ignoring warnings, so I'd rather fix the problem.
My next attempt (earlier today) was to define a +preinitialize function that I call directly instead of +initialize, and to make sure I call +preinitialize implicitly inside of +initialize in case it is not called explicitly at startup. But the problem here is that something inside +preinitialize is causing +initialize to be called implicitly by the runtime system, which leads me to think that this is a very unwise approach.
So let's say I wanted to keep the actual initialization code inside +initialize (where it's really intended to be) and just write a tiny dummy method called +preinitialize that forces +initialize to be called implicitly by the runtime system somehow? Is there a standard approach to this? In a unit test, I wrote...
+ (void) preinitialize
{
    id dummy = [self alloc];
    NSLog(@"Preinitialized: %i", !!dummy);
}
...but in the debugger, I did not observe +initialize being called prior to +alloc, indicating that +initialize was not called implicitly by the runtime system inside of +preinitialize.
Edit
I found a really simple solution, and posted it as an answer.
The first possible place to run class-specific code is +load, which happens when the class is added to the ObjC runtime. It's still not completely deterministic which classes' +load implementations will be called in what order, but there are some rules. From the docs:
The order of initialization is as follows:
1. All initializers in any framework you link to.
2. All +load methods in your image.
3. All C++ static initializers and C/C++ __attribute__(constructor) functions in your image.
4. All initializers in frameworks that link to you.
In addition:
- A class’s +load method is called after all of its superclasses’ +load methods.
- A category +load method is called after the class’s own +load method.
So, two peer classes (say, both direct NSObject subclasses) will both +load in step 2 above, but there's no guarantee which order the two of them will be relative to each other.
Because of that, and because metaclass objects in ObjC are generally not great places to set and maintain state, you might want something else...
A better solution?
For example, your "global" state can be kept in the (single) instance of a singleton class. Clients can call [MySingletonClass sharedSingleton] to get that instance and not care about whether it's getting its initial setup done at that time or earlier. And if a client needs to make sure it happens earlier (and in a deterministic order relative to other things), they can call that method at a time of their choosing — such as in main before kicking off the NSApplication/UIApplication run loop.
Alternatives
If you don't want this costly initialization work to happen at app startup, and you don't want it to happen when the class is being put to use, you have a few other options, too.
Keep the code in +initialize, and contrive to make sure the class gets messaged before its first "real" use. Perhaps you can kick off a background thread to create and initialize a dummy instance of that class from application:didFinishLaunching:, for example.
Put that code someplace else — in the class object or in a singleton, but in a method of your own creation regardless — and call it directly at a time late enough for setup to avoid slowing down app launch but soon enough for it to be done before your class' "real" work is needed.
There are two problems here. First, you should never call +initialize directly. Second, if you have some piece of initialization that can take over a second, you generally shouldn't run it on the main queue because that would hang the whole program.
Put your initialization logic into a separate method so you can call it when you expect to. Optionally, put the logic into a dispatch_once block so that it's safe to call it multiple times. Consider the following example.
@interface Foo : NSObject
+ (void)setup;
@end
@implementation Foo
+ (void)setup {
    NSLog(@"Setup start");
    static dispatch_once_t onceToken;
    dispatch_once(&onceToken, ^{
        NSLog(@"Setup running");
        [NSThread sleepForTimeInterval:1]; // Expensive op
    });
}
@end
Now in your application:didFinishLaunchingWithOptions: call it in the background.
- (BOOL)application:(UIApplication *)application didFinishLaunchingWithOptions:(NSDictionary *)launchOptions {
    NSLog(@"START");
    // Here, you should set up your UI in an "inactive" state, since we can't do things until
    // we're done initializing.
    dispatch_group_t group = dispatch_group_create();
    dispatch_group_async(group, dispatch_get_global_queue(0, 0), ^{
        [Foo setup];
        // And any other things that need to initialize in order.
    });
    dispatch_group_notify(group, dispatch_get_main_queue(), ^{
        NSLog(@"We're all ready to go now! Turn on the UI. Set the variables. Do the thing.");
    });
    return YES;
}
This is how you want to approach things if order matters to you. All the runtime options (+initialize and +load) make no promises on order, so don't rely on them for work that needs that. You'll just make everything much more complicated than it needs to be.
You may want to be able to check for programming errors in which you accidentally call Foo methods before initialization is done. That's best done, IMO, with assertions. For example, create an +isInitialized method that checks whatever +setup does (or create a class variable to track it). Then you can do this:
#if !defined(NS_BLOCK_ASSERTIONS)
#define FooAssertInitialized(condition) NSAssert([Foo isInitialized], @"You must call +setup before using Foo.")
#else
#define FooAssertInitialized(condition)
#endif
- (void)someMethodThatRequiresInitialization {
    FooAssertInitialized();
    // Do stuff
}
This makes it easy to mark methods that really do require initialization before use vs ones that may not.
Cocoa provides a setup point earlier than +initialize in the form of +load, which is called very shortly after the program's start. This is a weird environment: other classes that rely on +load may not be completely initialized yet, and more importantly, your main() has not been called! That means there's no autorelease pool in place.
After load but before initialize, functions marked with __attribute__((constructor)) will be called. This doesn't allow you to do much that you can't do in main() so far as I know.
One option would be to create a dummy instance of your class in either main() or a constructor, guaranteeing that initialize will be called as early as possible.
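A minimal sketch of the constructor variant, assuming a hypothetical class `SoundBank` whose +initialize stands in for the expensive setup. Instead of creating a dummy instance, it sends the class a harmless message, which is enough to trigger +initialize:

```objc
#import <Foundation/Foundation.h>

// Hypothetical class with costly class-level setup.
@interface SoundBank : NSObject
@end

@implementation SoundBank
+ (void)initialize {
    if (self == [SoundBank class]) {
        NSLog(@"SoundBank +initialize");
    }
}
@end

// Runs before main(), after the +load methods of this image.
__attribute__((constructor))
static void WarmUpSoundBank(void) {
    @autoreleasepool {
        // The first message to the class makes the runtime send +initialize.
        (void)[SoundBank class];
    }
}

int main(void) {
    @autoreleasepool {
        NSLog(@"main started");
    }
    return 0;
}
```

With this arrangement the +initialize log line should appear before "main started". Whether that buys anything over doing the same thing at the top of main() is doubtful, as noted above.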
Answering my own question here. It turns out that the solution is embarrassingly simple.
I had been operating under the mistaken belief that +initialize would not be called until the first instance method in a class is invoked. This is not so. It is called before the first instance method or class method is invoked (other than +load, of course).
So the solution is simply to cause +initialize to be invoked implicitly. There are multiple ways to do this. Two are discussed below.
Option 1 (simple and direct, but unclear)
In startup code, simply call some method (e.g., +class) of the class you want to initialize at startup, and discard the return value:
(void)[MyClass class];
This is guaranteed by the Objective-C runtime system to call [MyClass initialize] implicitly if it has not yet been called.
Option 2 (less direct, but clearer)
Create a +preinitialize method with an empty body:
+ (void) preinitialize
{
// Simply by calling this function at startup, an implicit call to
// +initialize is generated.
}
Calling this function at startup implicitly invokes +initialize:
[MyClass preinitialize]; // Implicitly invokes +initialize.
This +preinitialize method serves no purpose other than to document the intention. Thus, it plays well with +initialize and +deinitialize and is fairly self-evident in the calling code. I write a +deinitialize method for every class I write that has an +initialize method. +deinitialize is called from the shutdown code; +initialize is called implicitly via +preinitialize in the startup code. Super simple. Sometimes I also write a +reinitialize method, but the need for this is rare.
I am now using this approach for all my class initializers. Instead of calling [MyClass initialize] in the start up code, I am now calling [MyClass preinitialize]. It's working great, and the call stack shown in the debugger confirms that +initialize is being called exactly at the intended time and fully deterministically.

Implementation of -init vs. +initialize

Can anyone explain why we need to include if (self == [SomeClass class]) inside the +initialize method?
I’ve found similar questions (listed below), but didn’t find any specific clarifications:
Objective-C: init vs initialize
Should +initialize/+load always start with an: if (self == [MyClass class]) guard?
Everyone says that if you don’t implement/override +initialize in Subclass, then it’s going to call the parent class twice.
Can anyone explain that part in particular, specifically why does it call the parent class twice?
Lastly, why doesn’t it happen when we implement +initialize in the class that inherits from NSObject, where we create a custom -init method and call self = [super init];.
Imagine you have a superclass that implements +initialize and a subclass that does not.
@interface SuperClass : NSObject @end
@implementation SuperClass
+ (void)initialize {
    NSLog(@"This is class %@ running SuperClass +initialize", self);
}
@end
@interface SubClass : SuperClass @end
@implementation SubClass
// no +initialize implementation
@end
Use the superclass. This provokes a call to +[SuperClass initialize].
[SuperClass class];
=> This is class SuperClass running SuperClass +initialize
Now use the subclass. The runtime looks for an implementation of +initialize in SubClass and does not find anything. Then it looks for an inherited implementation in SuperClass and finds it. The inherited implementation gets called even though it was already called once on behalf of SuperClass itself:
[SubClass class];
=> This is class SubClass running SuperClass +initialize
The guard allows you to perform work that must be run at most once. Any subsequent calls to +initialize have a different class as self, so the guard can ignore them.
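A sketch of the same pair of classes with the guard added, so that the one-time work runs only when +initialize is sent on behalf of SuperClass itself:

```objc
#import <Foundation/Foundation.h>

@interface SuperClass : NSObject
@end

@implementation SuperClass
+ (void)initialize {
    // self is SuperClass on the first call and SubClass on the
    // inherited call; only the first passes the guard.
    if (self == [SuperClass class]) {
        NSLog(@"One-time setup for %@", self);
    }
}
@end

@interface SubClass : SuperClass
@end

@implementation SubClass
// no +initialize implementation
@end

int main(void) {
    @autoreleasepool {
        (void)[SuperClass class]; // guard passes, setup runs
        (void)[SubClass class];   // inherited +initialize runs, guard skips the body
    }
    return 0;
}
```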
-init and +initialize are completely different things. The first is for initializing instances; the second is for initializing classes.
The first time any given class is messaged, the runtime makes sure to invoke +initialize on it and its superclasses. The superclasses are initialized first because they need to be ready before any subclass can initialize itself.
So, the first time that YourSubclass is messaged, the runtime might do something like:
[NSObject initialize];
[YourClass initialize];
[YourSubclass initialize];
(Although it's very unlikely that this would be the first time that NSObject is messaged, so probably it doesn't need to be initialized at this point. It's just for illustration.)
If YourSubclass doesn't implement +initialize, then the [YourSubclass initialize] invocation shown above will actually call +[YourClass initialize]. That's just the normal inheritance mechanism at work. That will make the second time that +[YourClass initialize] has been called.
Since the work done in a +initialize method is usually the sort of thing that should only be done once, the guard if (self == [TheClassWhoseImplementationThisMethodIsPartOf class]) is necessary. Also, that work often assumes that self refers to the current class being written, so that's also a reason for the guard.
The second answer you cite notes an exception, which is the old-style mechanism for registering KVO dependent keys with the +setKeys:triggerChangeNotificationsForDependentKey: method. That method is specific to the actual class it's invoked on, not any subclasses. You should avoid it and use the more modern +keyPathsForValuesAffectingValueForKey: or +keyPathsForValuesAffecting<Key> methods. If you must use the old way, put that part outside of the guard. Also, subclasses of such a class must call through to super which is not normally done.
Update:
A +initialize method should not normally call through to super because the runtime has already initialized the superclass. If and only if the superclass is known to register dependent keys using the old mechanism, then any subclasses must call through to super.
The same worry does not exist in the case of -init because the runtime is not automatically calling the superclass init method for you before calling yours. Indeed, if your init method does not call through to super, then nothing will have initialized the superclass's "part" of the instance.
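For contrast, a minimal sketch of explicit -init chaining, where each class is responsible for calling through to super. The classes `Animal` and `Dog` and their properties are hypothetical:

```objc
#import <Foundation/Foundation.h>

// Hypothetical classes, for illustration only.
@interface Animal : NSObject
@property (nonatomic, copy) NSString *name;
@end

@implementation Animal
- (instancetype)init {
    self = [super init]; // NSObject's part
    if (self) {
        _name = @"unnamed";
    }
    return self;
}
@end

@interface Dog : Animal
@property (nonatomic) NSInteger tricks;
@end

@implementation Dog
- (instancetype)init {
    // Nothing calls Animal's -init automatically; without this line
    // the name property would stay nil.
    self = [super init];
    if (self) {
        _tricks = 0;
    }
    return self;
}
@end

int main(void) {
    @autoreleasepool {
        Dog *dog = [[Dog alloc] init];
        NSLog(@"%@ knows %ld tricks", dog.name, (long)dog.tricks);
    }
    return 0;
}
```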
The questions you cite have good accepted answers. To summarize, +initialize is called by the runtime on every class, so for a superclass with N subclasses, it will get called N+1 times on the superclass (once directly and once for each subclass that inherits it). Same thing if a subclass overrides it and calls super.
You can defend against this by asking at the superclass level, "is this my direct initialization by the system, and not 'super' being inherited or called by my subclass?"
if (self == [ThisSuperclass self]) {}
-init is used to initialize instances of classes and isn't invoked by the runtime by default like +initialize. Instances inherit their implementations of -init, can override the inherited implementation, and can also enjoy the benefit of the inherited implementation by calling [super init];.

How does an Objective-C subclass's initialize method call the superclass's initialize method?

While reading the 'objective c guide' from Apple's dev site, I came up with some questions. From this question I already know that both the subclass and superclass 'initialize' methods get called. My question is: why does this happen? I also know from that post that initialize is always called, but is that even true when I never use the superclass itself, and only the subclass?
A slightly related question which came to mind on this topic:
Does a subclass 'contain' its superclass, together with some new methods/variables, or
is everything copied from the superclass into the subclass?
In the first case I would understand that the initialize method would be sent to the 'contained' superclasses within the subclass; in the second case, I'd expect the subclass's initialize method to explicitly call [super initialize], which it doesn't.
Thanks!
The +initialize call is special and is explicitly called for every class. This is done outside of the normal inheritance chain you would be used to seeing. +initialize will be called on every class, subclass and category (yes, categories get their own initialize) the first time they're accessed.

How does inheriting from NSObject work?

There are a couple of things about Objective-C that are confusing to me:
Firstly, in the objective-c guide, it is very clear that each class needs to call the init method of its superclass. It's a little bit unclear about whether or not a class that inherits directly from NSObject needs to call NSObject's init method. Is this the case? And if so, why is that?
Secondly, in the section about NSObject, there's this warning:
A class that doesn’t need to inherit any special behavior from another class should nevertheless be made a subclass of the NSObject class. Instances of the class must at least have the ability to behave like Objective-C objects at runtime. Inheriting this ability from the NSObject class is much simpler and much more reliable than reinventing it in a new class definition.
Does this mean that I need to specify that all objects inherit from NSObject explicitly? Or is this like Java/Python/C# where all classes are subtypes of NSObject? If not, is there any reason to make a root class other than NSObject?
1) Any time an object is allocated in Objective-C its memory is zeroed out, and must be initialized by a call to init. Subclasses of NSObject may have their own specialized init routines, and at the beginning of such they should call their superclass' init routine something like so:
self = [super init];
The idea being that all init routines eventually trickle up to NSObject's init.
2) You need to be explicit about the inheritance:
@interface MyClass : NSObject { /*...*/ } @end
There is no reason to have a root class other than NSObject -- a lot of Objective-C relies heavily on this class, so trying to circumvent it will result in you needlessly shooting yourself in the foot.
Since it is possible to inherit from different root base classes, yes you must explicitly declare you inherit from NSObject when you make any new class (unless of course you are subclassing something else already, which itself in turn probably subclasses NSObject).
Almost never is there a need to make your own base class, nor would it be easy to do so.
Objective-C can have multiple root classes, so you need to be explicit about inheritance. IIRC NSProxy is another root class. You'll likely never want or need to create your own root class, but they do exist.
As for calling NSObject's init, it's part custom and part safety. NSObject's init may not do anything now, but that's no guarantee that future behaviour won't change. Call init to be safe.
You need to call [super init] because there is code behind initializing that you don't have to write because it is written for you in NSObject's init, such as probably actual memory allocation etc.