Objective-C two-phase construction of objects

I've been reading up on RAII and single vs. two-phase construction/initialization. For whatever reason, I was in the two-phase camp up until recently, because at some point I must have heard that it's bad to do error-prone operations in your constructor. However, I think I'm now convinced that single-phase is preferable, based on questions I've read on SO and other articles.
My question is: Why does Objective-C use the two-phase approach (alloc/init) almost exclusively for non-convenience constructors? Is there any specific reason in the language, or was it just a design decision by the designers?

I have the enviable situation of working for the guy who wrote +alloc back in 1991, and I happened to ask him a very similar question a few months ago. +alloc was added to provide +allocWithZone:, which in turn existed to support memory pools in NeXTSTEP 2.0, where memory was very tight (4 MB). This let the caller control where objects were allocated in memory. It replaced +new and its kin, which was (and continues to be, though no one uses it) a one-phase constructor, based on Smalltalk's new. When Cocoa came over to Apple, the use of +alloc was already entrenched, and there was no going back to +new, even though actually picking your NSZone is seldom of significant value.
So it isn't a big one-phase/two-phase philosophical question. In practice, Cocoa has single-phase construction, because you always do (and always should) call these back-to-back in a single expression, without testing the result of +alloc. You can think of it as an elaborate way of typing "new".
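For illustration, a minimal sketch of that single expression (MyWidget is a hypothetical class):

// The idiomatic "single-phase" spelling: alloc and init are always chained,
// and the result of init (not of alloc) is what you keep.
MyWidget *widget = [[MyWidget alloc] init];

// Equivalent shorthand using +new, which still works but is rarely used:
MyWidget *other = [MyWidget new];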

My experience is with C++, but one downside of C++'s one-phase initialization is the handling of inheritance and virtual functions. In C++, you can't call virtual functions during construction or destruction (well, you can, it just won't do what you expect). A two-phase init could solve this, at least partially: from what I understand, the call would get routed to the right class, but the init might not have finished yet, so you could still get into trouble that way. (I'm still in favor of one phase.)
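For contrast, a rough Objective-C sketch (class names made up) of what two-phase construction buys you here: because -init is an ordinary message send on an already-allocated object, a method called from the superclass's -init is dispatched to the subclass's override, though, as noted above, the subclass's own initialization may not have run yet.

@interface Base : NSObject
- (void)configure;
@end

@implementation Base
- (instancetype)init {
    if ((self = [super init])) {
        // Dynamic dispatch: if self is really a Subclass instance,
        // Subclass's -configure runs here, even though any further
        // setup in the subclass's own -init hasn't happened yet.
        [self configure];
    }
    return self;
}
- (void)configure {}
@end

@interface Subclass : Base
@end

@implementation Subclass
- (void)configure {
    // Called during [[Subclass alloc] init], from Base's -init.
}
@end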

Related

Why does the Objective-C compiler need to know method signatures?

Why does the Objective-C compiler need to know at compile time the signature of the methods that will be invoked on objects when it could defer that to runtime (i.e., dynamic binding)? For example, if I write [foo someMethod], why is it necessary for the compiler to know the signature of someMethod?
Because of calling conventions at a minimum (with ARC, there are more reasons, but calling conventions have always been a problem).
You may have been told that [foo someMethod] is converted into a function call:
objc_msgSend(foo, @selector(someMethod))
This, however, isn't exactly true. It may be converted to a number of different function calls depending on what it returns (and what's returned matters whether you use the result or not). For instance, if it returns an object or an integer, it'll use objc_msgSend, but if it returns a structure (on both ARM and Intel) it'll use objc_msgSend_stret, and if it returns a floating point on Intel (but not ARM I believe), it'll use objc_msgSend_fpret. This is all because on different processors the calling conventions (how you set up the stack and registers, and where the result is stored) are different depending on the result.
It also matters what the parameters are and how many there are (the number can be inferred from ObjC method names unless they're variadic... right, you have to deal with varargs, too). On some processors, the first several parameters may be put in registers, while later parameters may be put on the stack. If your method takes varargs, the calling convention may be different still. All of that has to be known in order to compile the function call.
ObjC could be implemented as a more pure object model to avoid all of this (as other, more dynamic languages do), but it would be at the cost of performance (both space and time). ObjC can make method calls surprisingly cheap given the level of dynamic dispatch, and can easily work with pure C machine types, but the cost of that is that we have to let the compiler know more specifics about our method signatures.
BTW, this can (and every so often does) lead to really horrible bugs. If you have a couple of methods:
- (MyPointObject *)point;
- (CGPoint)point;
Maybe they're defined in completely different files as methods on different classes. But if the compiler chooses the wrong definition (such as when you're sending a message to id), then the result you get back from -point can be complete garbage. This is a very, very hard bug to figure out when it happens (and I've had it happen to me).
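To see how that can bite, a rough sketch (variable names and the helper are made up): suppose both declarations are visible, the compiler happens to resolve -point against the CGPoint-returning one, but the object that actually receives the message implements the object-returning one.

// Made-up helper: returns an instance whose -point returns an object.
id thing = GetSomeObjectWhosePointReturnsAnObject();

// Because the receiver is typed id, the compiler picks one of the two
// visible signatures (and warns about the ambiguity). If it picks the
// CGPoint one, it emits objc_msgSend_stret on Intel and reads a struct
// back -- but the implementation that actually runs was compiled for
// plain objc_msgSend and returns a pointer, so 'p' ends up as garbage.
CGPoint p = [thing point];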
For a bit more background, you may enjoy Greg Parker's article explaining objc_msgSend_stret and objc_msgSend_fpret. Mike Ash also has an excellent introduction to this topic. And if you want to go deep down this rabbit hole, you can see bbum's instruction-by-instruction investigation of objc_msgSend. It's outdated now, pre-ARC, and only covers x86_64 (since every architecture needs its own implementation), but is still highly educational and recommended.

Objective C's ARC vs C++'s Manual Memory Management [duplicate]

One possible exam question reads as follows:
"Explain the benefits and drawbacks of Objective C's memory management when compared to c++'s"
I do know Objective-C uses ARC, and ARC enables us to avoid destroying an object that is still being referenced by something else (meaning it's still needed). But I can't find any drawbacks at all, anywhere. I can only think of "There are no drawbacks" as an answer, but since the question explicitly asks for drawbacks, I'm guessing there must be at least one.
Reference counting may solve a problem that you don't have. It comes at a price, and you'll end up paying the price no matter whether you wanted the solution in the first place.
Contrary to what the gut feeling may say, most objects actually don't need to be shared at all and have a well-defined, unique ownership throughout their life. All that's needed is the ability to pass those objects around; reference counting provides that, but it provides much more, and has a greater cost.
(This answer compares reference counting in Objective C to C++-style lifetime management. It does not consider whether reference counting in Obj-C is sensible in the first place. ARC is simply an automated form of MRC, and if you were using MRC in the past and it made sense, then the question whether to migrate to ARC is not the point of this post. Rather, this post applies equally to the comparison of "MRC in Obj-C vs C++".)
Reference counting 'frees' you from always thinking about WHEN to delete an object. Anybody using the object just says "I still need it" (retains it) or "I am done with it" (releases it).
That makes memory management much easier and also makes the code more manageable.
BUT
It comes at the price of two additional method calls whenever you pass things around: you have to retain the object, then save the pointer, and later also call release on it.
When you deal with LOTS of objects, that becomes a real problem; the extra calls alone can kill your algorithm's performance.
And especially when you don't need any reference counting at all, because the scope in which the object is used is clear, the overhead is just annoying.
So it is convenience + maintainability vs. speed.
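To make that cost concrete, a minimal sketch of a setter under manual reference counting (MRC, i.e. pre-ARC); the Sprite type and _sprite ivar are made up:

// Every hand-off of the object costs a retain now and a matching
// release later.
- (void)setSprite:(Sprite *)sprite {
    [sprite retain];      // claim ownership before storing the pointer
    [_sprite release];    // give up ownership of the previous value
    _sprite = sprite;
}

- (void)dealloc {
    [_sprite release];    // and one more release when this object dies
    [super dealloc];
}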

Selectors or Blocks for callbacks in an Objective-C library

Question
We're developing a custom EventEmitter-inspired message system in Objective-C. For listeners to provide callbacks, should we require blocks or selectors, and why?
Which would you rather use, as a developer consuming a third party library? Which seems most in line with Apple's trajectory, guidelines and practices?
Background
We're developing a brand new iOS SDK in Objective-C which other third parties will use to embed functionality into their app. A big part of our SDK will require the communication of events to listeners.
There are five patterns I know of for doing callbacks in Objective-C, three of which don't fit:
NSNotificationCenter - can't use because it doesn't guarantee the order observers will be notified and because there's no way for observers to prevent other observers from receiving the event (like stopPropagation() would in JavaScript).
Key-Value Observing - doesn't seem like a good architectural fit since what we really have is message passing, not always "state" bound.
Delegates and Data Sources - in our case, there usually will be many listeners, not a single one which could rightly be called the delegate.
And two of which that are contenders:
Selectors - under this model, callers provide a selector and a target which are collectively invoked to handle an event.
Blocks - introduced in iOS 4, blocks allow functionality to be passed around without being bound to an object like the observer/selector pattern.
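To make the two contenders concrete, here is a rough sketch of what each registration method might look like on the emitter (the class name, method names, and types are invented for illustration, not our actual interface):

@interface MYEventEmitter : NSObject

// Selector-based: the listener supplies a target/action pair.
- (void)on:(NSString *)eventName target:(id)target selector:(SEL)selector;

// Block-based: the listener supplies a callback block.
- (void)on:(NSString *)eventName callBlock:(void (^)(NSDictionary *userInfo))block;

@end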
This may seem like an esoteric opinion question, but I feel there is an objective "right" answer that I am simply too inexperienced in Objective-C to determine. If there's a better StackExchange site for this question, please help me by moving it there.
UPDATE #1 — April 2013
We chose blocks as the means of specifying callbacks for our event handlers. We're largely happy with this choice and don't plan to remove block-based listener support. It did have two notable drawbacks: memory management and design impedance.
Memory Management
Blocks are most easily used on the stack. Creating long-lived blocks by copying them onto the heap introduces interesting memory management issues.
Blocks which make calls to methods on the containing object implicitly boost self's reference count. Suppose you have a setter for the name property of your class: if you call name = @"foo" inside a block, the compiler treats this as [self setName:@"foo"] and retains self so that it won't be deallocated while the block is still around.
Implementing an EventEmitter means having long-lived blocks. To prevent the implicit retain, the user of the emitter needs to create a __block reference to self outside of the block, e.g.:
__block YourClass *this = self;
[emitter on:@"eventName" callBlock:^{
    [this setName:@"foo"];
    ...
}];
The only problem with this approach is that this may be deallocated before the handler is invoked, so users must unregister their listeners when they are deallocated.
Design Impedance
Experienced Objective-C developers expect to interact with libraries using familiar patterns. Delegates are a tremendously familiar pattern, so canonical that developers expect to use them.
Fortunately, the delegate pattern and block-based listeners are not mutually exclusive. Although our emitter must be able to handle listeners from many places (having a single delegate won't work), we could still expose an interface that lets developers interact with the emitter as though their class were the delegate.
We haven't implemented this yet, but we probably will based on requests from users.
UPDATE #2 — October 2013
I'm no longer working on the project that spawned this question, having quite happily returned to my native land of JavaScript.
The smart developers who took over this project decided correctly to retire our custom block-based EventEmitter entirely.
The upcoming release has switched to ReactiveCocoa.
This gives them a higher level signaling pattern than our EventEmitter library previously afforded, and allows them to encapsulate state inside of signal handlers better than our block-based event handlers or class-level methods did.
Personally, I hate using delegates. Because of how Objective-C is structured, it really clutters the code up if I have to create a separate object, adopt a protocol, and implement five or six methods just to be notified of one of your events. For this reason, I prefer blocks.
While blocks do have their disadvantages (e.g. memory management can be tricky), they are easily extendable, simple to implement, and just make sense in most situations.
While Apple's designs may use the sender/delegate pattern, this is only for backwards compatibility. More recent Apple APIs have been using blocks (e.g. in Core Data), because they are the future of Objective-C. While they can clutter code when overused, they also allow for simple 'anonymous delegates', which is not otherwise possible in Objective-C.
In the end though, it really boils down to this:
Are you willing to abandon some older, more dated platforms in exchange for using blocks vs. a delegate? One major advantage of a delegate is that it is guaranteed to work in any version of the objc-runtime, whereas blocks are a more recent addition to the language.
As far as NSNotificationCenter/KVO is concerned, they are both useful and have their purposes, but they are not intended to be used as a delegate. Neither can send a result back to the sender, and in some situations that is vital (-webView:shouldLoadRequest:, for example).
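That "result back to the sender" point is easiest to see in a delegate protocol's return value; a hypothetical sketch:

@class MYEmitter;   // hypothetical emitter class

@protocol MYEmitterDelegate <NSObject>
// The sender can act on the returned BOOL, e.g. to cancel delivery --
// roughly the stopPropagation() behavior mentioned earlier. A posted
// NSNotification has no way to hand a value like this back.
- (BOOL)emitter:(MYEmitter *)emitter shouldDeliverEvent:(NSString *)eventName;
@end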
I think the right thing to do is to implement both, use it as a client, and see what feels most natural. There are advantages to both approaches, and it really depends on the context and how you expect the SDK to be used.
The primary advantage of selectors is simple memory management--as long as the client registers and unregisters correctly, it doesn't need to worry about memory leaks. With blocks, memory management can get complex, depending on what the client does inside the block. It's also easier to unit test the callback method. Blocks can certainly be written to be testable, but it's not common practice from what I've seen.
The primary advantage of blocks is flexibility--the client can easily reference local variables without making them ivars.
So I think it just depends on the use case--there is no "objective right answer" to such a general design question.
Great writeup!
Coming from writing lots of JavaScript, event-driven programming feels way cleaner than having delegates back and forth, in my personal opinion.
Regarding the memory-managing aspect of listeners, my attempt at solving this (drawing heavily from Mike Ash's MAKVONotificationCenter), swizzles both the caller and emitter's dealloc implementation (as seen here) in order to safely remove listeners in both ways.
I'm not entirely sure how safe this approach is, but the idea is to try it 'til it breaks.
A thing about a library is that you can only anticipate, to some extent, how it will be used. So you need to provide a solution that is as simple and open as possible, and familiar to your users.
For me, all of this fits delegation best. Although you are right that there can only be one listener (the delegate), that isn't really a limitation, since the user can write a delegate class that knows about all the desired listeners and informs them. Of course, you can also provide a registering class that calls the delegate methods on all registered objects.
Blocks are just as good.
What you call selectors is known as target/action, and it is simple yet powerful.
KVO seems like a less-than-optimal solution to me as well, as it could weaken encapsulation or lead to a wrong mental model of how to use your library's classes.
NSNotifications are nice for announcing certain events, but users should not be forced to rely on them, as they are quite informal, and your classes won't be able to tell whether anyone is tuned in.
some useful thoughts on API-Design: http://mattgemmell.com/2012/05/24/api-design/

How does the new automatic reference counting mechanism work?

Can someone briefly explain to me how ARC works? I know it's different from Garbage Collection, but I was just wondering exactly how it worked.
Also, if ARC does what GC does without hindering performance, then why does Java use GC? Why doesn't it use ARC as well?
Every new developer who comes to Objective-C has to learn the rigid rules of when to retain, release, and autorelease objects. These rules even specify naming conventions that imply the retain count of objects returned from methods. Memory management in Objective-C becomes second nature once you take these rules to heart and apply them consistently, but even the most experienced Cocoa developers slip up from time to time.
With the Clang Static Analyzer, the LLVM developers realized that these rules were reliable enough that they could build a tool to point out memory leaks and overreleases within the paths that your code takes.
Automatic reference counting (ARC) is the next logical step. If the compiler can recognize where you should be retaining and releasing objects, why not have it insert that code for you? Rigid, repetitive tasks are what compilers and their brethren are great at. Humans forget things and make mistakes, but computers are much more consistent.
However, this doesn't completely free you from worrying about memory management on these platforms. I describe the primary issue to watch out for (retain cycles) in my answer here, which may require a little thought on your part to mark weak pointers. However, that's minor when compared to what you're gaining in ARC.
When compared to manual memory management and garbage collection, ARC gives you the best of both worlds by cutting out the need to write retain / release code, yet not having the halting and sawtooth memory profiles seen in a garbage collected environment. About the only advantages garbage collection has over this are its ability to deal with retain cycles and the fact that atomic property assignments are inexpensive (as discussed here). I know I'm replacing all of my existing Mac GC code with ARC implementations.
As to whether this could be extended to other languages, it seems geared around the reference counting system in Objective-C. It might be difficult to apply this to Java or other languages, but I don't know enough about the low-level compiler details to make a definitive statement there. Given that Apple is the one pushing this effort in LLVM, Objective-C will come first unless another party commits significant resources of their own to this.
The unveiling of this at WWDC shocked developers; people weren't aware that something like this could be done. It may appear on other platforms over time, but for now it's exclusive to LLVM and Objective-C.
ARC is just plain old retain/release (MRC) with the compiler figuring out when to call retain and release. It will tend to have higher performance, lower peak memory use, and more predictable behavior than a GC system.
On the other hand some types of data structure are not possible with ARC (or MRC), while GC can handle them.
As an example, if you have a class named Node, and Node has an NSArray of children and a single reference to its parent, that "just works" with GC. With ARC (and manual reference counting as well) you have a problem: any given node will be referenced from its children and also from its parent.
Like:
A -> [B1, B2, B3]
B1 -> A, B2 -> A, B3 -> A
All is fine while you are using A (say via a local variable).
When you are done with it (and B1/B2/B3), a GC system will eventually decide to look at everything it can find starting from the stack and CPU registers. It will never find A,B1,B2,B3 so it will finalize them and recycle the memory into other objects.
When you use ARC or MRC and finish with A, it has a refcount of 3 (B1, B2, and B3 all reference it), and B1/B2/B3 each have a reference count of 1 (A's NSArray holds one reference to each). So all of those objects remain live even though nothing can ever use them.
The common solution is to decide that one of those references needs to be weak (not contribute to the reference count). That works for some usage patterns, for example if you reference B1/B2/B3 only via A. In other patterns it fails, for example if you sometimes hold onto just B1 and expect to climb back up via the parent pointer to find A. With a weak parent reference, if you only hold onto B1, A can (and normally will) evaporate, and take B2 and B3 with it.
Sometimes this isn't an issue, but some very useful and natural ways of working with complex structures of data are very difficult to use with ARC/MRC.
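A minimal sketch (assuming ARC) of the weak-parent arrangement described above, using a hypothetical Node class:

@interface Node : NSObject
@property (nonatomic, strong) NSArray *children;   // A keeps B1/B2/B3 alive
@property (nonatomic, weak) Node *parent;          // does not keep A alive
@end

// If the only thing you hold onto is B1, nothing keeps A alive:
// A is deallocated, its children array goes with it, and B1.parent
// becomes nil (a zeroing weak reference) -- you can no longer climb
// back up to A, B2, or B3.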
So ARC targets the same sort of problems GC targets. However, ARC works on a more limited set of usage patterns than GC, so if you took a GC language (like Java) and grafted something like ARC onto it, some programs wouldn't work any more (or at least would generate tons of abandoned memory, and might cause serious swapping issues or run out of memory or swap space).
You can also say ARC puts a bigger priority on performance (or maybe predictability) while GC puts a bigger priority on being a generic solution. As a result GC has less predictable CPU/memory demands, and a lower performance (normally) than ARC, but can handle any usage pattern. ARC will work much better for many many common usage patterns, but for a few (valid!) usage patterns it will fall over and die.
Magic
But more specifically, ARC works by doing exactly what you would do with your code (with certain minor differences). ARC is a compile-time technology, unlike GC, which works at runtime and impacts your performance negatively. ARC tracks the references to objects for you and synthesizes the retain/release/autorelease calls according to the normal rules. Because of this, ARC can also release things as soon as they are no longer needed, rather than throwing them into an autorelease pool purely for convention's sake.
Some other improvements include zeroing weak references, automatic copying of blocks to the heap, speedups across the board (6x for autorelease pools!).
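For instance, a zeroing weak reference is automatically set to nil when the object it points to goes away; a small sketch (assuming ARC and a non-optimized build, where the release happens right at the end of the scope):

__weak NSObject *weakRef = nil;
@autoreleasepool {
    NSObject *strongRef = [[NSObject alloc] init];
    weakRef = strongRef;       // assigning to __weak does not retain
    NSLog(@"%@", weakRef);     // object is still alive here
}
// strongRef has gone out of scope and the pool has drained, so the
// object is gone and the weak reference has been zeroed:
NSLog(@"%@", weakRef);         // logs "(null)" instead of a dangling pointer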
More detailed discussion about how all this works is found in the LLVM Docs on ARC.
It differs greatly from garbage collection. Have you seen the warnings that tell you that you may be leaking objects on particular lines? Those messages even tell you on what line you allocated the object. This has been taken a step further: the compiler can now insert retain/release statements at the proper locations, better than most programmers, almost 100% of the time. Occasionally there are some weird instances of retained objects that you need to help it out with.
This is very well explained by the Apple developer documentation; read "How ARC Works":
To make sure that instances don’t disappear while they are still needed, ARC tracks how many properties, constants, and variables are currently referring to each class instance. ARC will not deallocate an instance as long as at least one active reference to that instance still exists.
To understand the difference between garbage collection and ARC, read this:
ARC is a compiler feature that provides automatic memory management of objects.
Instead of you having to remember when to use retain, release, and autorelease, ARC evaluates the lifetime requirements of your objects and automatically inserts appropriate memory management calls for you at compile time. The compiler also generates appropriate dealloc methods for you.
The compiler inserts the necessary retain/release calls at compile time, but those calls are executed at runtime, just like any other code.
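A rough sketch of the idea: this is code you would write under ARC, with comments showing the kind of calls the compiler conceptually inserts (the exact calls and their placement are an implementation detail and depend on optimization; -makeName is a made-up method):

- (void)logName {
    NSString *name = [self makeName];  // ARC may insert a retain of the result here
    NSLog(@"My name is %@", name);
    // ARC inserts a release for 'name' around here, at the end of its
    // lifetime, rather than waiting for an autorelease pool to drain.
}
// Under MRC you would have written those retain/release (or autorelease)
// calls yourself; under ARC they still execute at runtime, the compiler
// just writes them for you.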
If you're new to iOS development and don't have prior experience with Objective-C, please refer to Apple's Advanced Memory Management Programming Guide for a better understanding of memory management.

Calling -retainCount Considered Harmful

Or, Why I Didn't Use retainCount On My Summer Vacation
This post is intended to solicit detailed write-ups about the whys and wherefores of that infamous method, retainCount, in order to consolidate the relevant information floating around SO.*
The basics: What are the official reasons to not use retainCount? Is there ever any situation at all when it might be useful? What should be done instead?** Feel free to editorialize.
Historical/explanatory: Why does Apple provide this method in the NSObject protocol if it's not intended to be used? Does Apple's code rely on retainCount for some purpose? If so, why isn't it hidden away somewhere?
For deeper understanding: What are the reasons that an object may have a different retain count than would be assumed from user code? Can you give any examples*** of standard procedures that framework code might use which cause such a difference? Are there any known cases where the retain count is always different than what a new user might expect?
Anything else you think is worth mentioning about retainCount?
*
Coders who are new to Objective-C and Cocoa often grapple with, or at least misunderstand, the reference-counting scheme. Tutorial explanations may mention retain counts, which (according to these explanations) go up by one when you call retain, alloc, copy, etc., and down by one when you call release (and at some point in the future when you call autorelease).
A budding Cocoa hacker, Kris, could thus quite easily get the idea that checking an object's retain count would be useful in resolving some memory issues, and, lo and behold, there's a method available on every object called retainCount! Kris calls retainCount on a couple of objects, and this one is too high, and that one's too low, and what the heck is going on?! So Kris makes a post on SO, "What's wrong with my memory management?", and then a swarm of bold, large letters descends saying "Don't do that! You can't rely on the results.", which is well and good, but our intrepid coder may want a deeper explanation.
I'm hoping that this will turn into an FAQ, a page of good informational essays/lectures from any of our experts who are inclined to write one, that new Cocoa-heads can be pointed to when they wonder about retainCount.
** I don't want to make this too broad, but specific tips from experience or the docs on verifying/debugging retain and release pairings may be appropriate here.
***In dummy code; obviously the general public don't have access to Apple's actual code.
The basics: What are the official reasons to not use retainCount?
Autorelease management is the most obvious -- you have no way to be sure how many of the references represented by the retainCount are in a local or external (on a secondary thread, or in another thread's local pool) autorelease pool.
Also, some people have trouble with leaks and, at a higher level, with reference counting and how autorelease pools work at a fundamental level. They will write a program without (much) regard to proper reference counting, or without learning ref counting properly. This makes their program very difficult to debug, test, and improve, and it's also very time-consuming to rectify.
The reason for discouraging its use (at the client level) is twofold:
The value may vary for so many reasons. Threading alone is reason enough to never trust it.
You still have to implement correct reference counting. retainCount will never save you from imbalanced reference counting.
Is there ever any situation at all when it might be useful?
You could in fact use it in a meaningful way if you wrote your own allocators or reference counting scheme, or if your object lived on one thread and you had access to any and all autorelease pools it could exist in. This also implies you would not share it with any external APIs. The easy way to simulate this is to create a program with one thread, zero autorelease pools, and do your reference counting the 'normal' way. It's unlikely that you'll ever need to solve this problem/write this program for anything other than "academic" reasons.
As a debugging aid: you could use it to verify that the retain count is not unusually high. If you take this approach, be mindful of the implementation variances (some are cited in this post), and don't rely on it. Don't even commit the tests to your SCM repository.
This may be a useful diagnostic in extremely rare circumstances. It can be used to detect:
Over-retaining: An allocation with a positive imbalance in retain count would not show up as a leak if the allocation is reachable by your program.
An object which is referenced by many other objects: One illustration of this problem is a (mutable) shared resource or collection which operates in a multithreaded context - frequent access or changes to this resource/collection can introduce a significant bottleneck in your program's execution.
Autorelease levels: Autoreleasing, autorelease pools, and retain/autorelease cycles all come with a cost. If you need to minimize or reduce memory use and/or growth, you could use this approach to detect excessive cases.
From commentary with Bavarious (below): a high value may also indicate an invalidated allocation (a dealloc'd instance). This is entirely an implementation detail, and again, not usable in production code. Messaging such an allocation would result in an error when zombies are enabled.
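A minimal sketch of the kind of throwaway, debug-only check described above (MRC only, since retainCount is unavailable under ARC; suspectObject, the DEBUG flag, and the threshold are arbitrary, and as noted, don't commit anything like this):

#if DEBUG
// Purely exploratory: flag an allocation whose retain count looks
// suspiciously high while investigating over-retention. The value is
// noisy (autorelease pools, other threads, framework internals), so
// treat it as a hint, never as a fact.
if ([suspectObject retainCount] > 100) {
    NSLog(@"suspiciously high retainCount: %lu",
          (unsigned long)[suspectObject retainCount]);
}
#endif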
What should be done instead?
If you're not responsible for returning the memory at self (that is, you did not write an allocator), leave it alone - it is useless.
You have to learn proper reference counting.
For a better understanding of release and autorelease usage, set up some breakpoints and understand how they are used, in what cases, etc. You'll still have to learn to use reference counting correctly, but this can aid your understanding of why it's useless.
Even simpler: use Instruments to track allocs and ref counts, then analyze the ref counting and callstacks of several objects in an active program.
Historical/explanatory: Why does Apple provide this method in the NSObject protocol if it's not intended to be used? Does Apple's code rely on retainCount for some purpose? If so, why isn't it hidden away somewhere?
We can assume that it is public for two primary reasons:
Reference counting proper in managed environments. It's fine for the allocators to use retainCount -- really. It's a very simple concept. When -[NSObject release] is called, the reference count (unless the method is overridden) may be decremented, and the object can be deallocated once the count reaches zero (after dealloc is called). This is all fine at the allocator level. Allocators and zones are (largely) abstracted, so this makes the result meaningless for ordinary clients. See the commentary with bbum (below) for details on why retainCount cannot be equal to 0 at the client level, object deallocation, deallocation sequences, and more.
To make it available to subclassers who want a custom behavior, and because the other reference counting methods are public. It may be handy in a few cases, but it's typically used for the wrong reasons (e.g. immortal singletons). If you need your own reference counting scheme, then this family may be worth overriding.
For deeper understanding: What are the reasons that an object may have a different retain count than would be assumed from user code? Can you give any examples*** of standard procedures that framework code might use which cause such a difference? Are there any known cases where the retain count is always different than what a new user might expect?
Again: custom reference counting schemes and immortal objects. NSCFString literals fall into the latter category:
NSLog(@"%qu", [@"MyString" retainCount]);
// Logs: 1152921504606846975
Anything else you think is worth mentioning about retainCount?
It's useless as a debugging aid. Learn to use leak and zombie analyses, and use them often -- even after you have a handle on reference counting.
Update: bbum has posted an article entitled retainCount is useless. The article contains a thorough discussion of why -retainCount isn’t useful in the vast majority of cases.
The general rule of thumb is if you're using this method, you better be damn sure you know what you're doing. If you are using it for debugging a memory leak you're doing it wrong, if you're doing it to see what is going on with an object, you're doing it wrong.
There is one case where I have used it and found it useful: a shared object cache where I wanted to flush an object when nothing had a reference to it anymore. In that situation I waited until the retainCount was equal to 1, and then I could release it knowing that nothing else was holding onto it. This will obviously not work properly in garbage-collected environments, and there are better ways to do it, but this is still the only 'valid' use case I've seen for it, and it isn't something a lot of people will be doing.
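For illustration only, a rough sketch (MRC; the cache is assumed to be an NSMutableDictionary named cache) of the flush pattern being described, with all the caveats above still applying:

// Evict entries that nothing outside the cache still references: the
// dictionary holds exactly one reference, so retainCount == 1 means
// "only we are holding this". Fragile for all the reasons discussed
// in the other answer -- shown only to illustrate the pattern.
for (NSString *key in [cache allKeys]) {
    id object = [cache objectForKey:key];
    if ([object retainCount] == 1) {
        [cache removeObjectForKey:key];   // drops the last reference
    }
}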