Can someone briefly explain to me how ARC works? I know it's different from Garbage Collection, but I was just wondering exactly how it worked.
Also, if ARC does what GC does without hindering performance, then why does Java use GC? Why doesn't it use ARC as well?
Every new developer who comes to Objective-C has to learn the rigid rules of when to retain, release, and autorelease objects. These rules even specify naming conventions that imply the retain count of objects returned from methods. Memory management in Objective-C becomes second nature once you take these rules to heart and apply them consistently, but even the most experienced Cocoa developers slip up from time to time.
With the Clang Static Analyzer, the LLVM developers realized that these rules were reliable enough that they could build a tool to point out memory leaks and overreleases within the paths that your code takes.
Automatic reference counting (ARC) is the next logical step. If the compiler can recognize where you should be retaining and releasing objects, why not have it insert that code for you? Rigid, repetitive tasks are what compilers and their brethren are great at. Humans forget things and make mistakes, but computers are much more consistent.
However, this doesn't completely free you from worrying about memory management on these platforms. I describe the primary issue to watch out for (retain cycles) in my answer here, which may require a little thought on your part to mark weak pointers. However, that's minor when compared to what you're gaining in ARC.
When compared to manual memory management and garbage collection, ARC gives you the best of both worlds by cutting out the need to write retain / release code, yet not having the halting and sawtooth memory profiles seen in a garbage collected environment. About the only advantages garbage collection has over this are its ability to deal with retain cycles and the fact that atomic property assignments are inexpensive (as discussed here). I know I'm replacing all of my existing Mac GC code with ARC implementations.
As to whether this could be extended to other languages, it seems geared around the reference counting system in Objective-C. It might be difficult to apply this to Java or other languages, but I don't know enough about the low-level compiler details to make a definitive statement there. Given that Apple is the one pushing this effort in LLVM, Objective-C will come first unless another party commits significant resources of their own to this.
The unveiling of this shocked developers at WWDC, so people weren't aware that something like this could be done. It may appear on other platforms over time, but for now it's exclusive to LLVM and Objective-C.
ARC is just play old retain/release (MRC) with the compiler figuring out when to call retain/release. It will tend to have higher performance, lower peak memory use, and more predictable performance than a GC system.
On the other hand some types of data structure are not possible with ARC (or MRC), while GC can handle them.
As an example, if you have a class named node, and node has an NSArray of children, and a single reference to its parent that "just works" with GC. With ARC (and manual reference counting as well) you have a problem. Any given node will be referenced from its children and also from its parent.
Like:
A -> [B1, B2, B3]
B1 -> A, B2 -> A, B3 -> A
All is fine while you are using A (say via a local variable).
When you are done with it (and B1/B2/B3), a GC system will eventually decide to look at everything it can find starting from the stack and CPU registers. It will never find A,B1,B2,B3 so it will finalize them and recycle the memory into other objects.
When you use ARC or MRC, and finish with A it have a refcount of 3 (B1, B2, and B3 all reference it), and B1/B2/B3 will all have a reference count of 1 (A's NSArray holds one reference to each). So all of those objects remain live even though nothing can ever use them.
The common solution is to decide one of those references needs to be weak (not contribute to the reference count). That will work for some usage patterns, for example if you reference B1/B2/B3 only via A. However in other patterns it fails. For example if you will sometimes hold onto B1, and expect to climb back up via the parent pointer and find A. With a weak reference if you only hold onto B1, A can (and normally will) evaporate, and take B2, and B3 with it.
Sometimes this isn't an issue, but some very useful and natural ways of working with complex structures of data are very difficult to use with ARC/MRC.
So ARC targets the same sort of problems GC targets. However ARC works on a more limited set of usage patterns then GC, so if you took a GC language (like Java) and grafted something like ARC onto it some programs wouldn't work any more (or at least would generate tons of abandoned memory, and may cause serious swapping issues or run out of memory or swap space).
You can also say ARC puts a bigger priority on performance (or maybe predictability) while GC puts a bigger priority on being a generic solution. As a result GC has less predictable CPU/memory demands, and a lower performance (normally) than ARC, but can handle any usage pattern. ARC will work much better for many many common usage patterns, but for a few (valid!) usage patterns it will fall over and die.
Magic
But more specifically ARC works by doing exactly what you would do with your code (with certain minor differences). ARC is a compile time technology, unlike GC which is runtime and will impact your performance negatively. ARC will track the references to objects for you and synthesize the retain/release/autorelease methods according to the normal rules. Because of this ARC can also release things as soon as they are no longer needed, rather than throwing them into an autorelease pool purely for convention sake.
Some other improvements include zeroing weak references, automatic copying of blocks to the heap, speedups across the board (6x for autorelease pools!).
More detailed discussion about how all this works is found in the LLVM Docs on ARC.
It varies greatly from garbage collection. Have you seen the warnings that tell you that you may be leaking objects on different lines? Those statements even tell you on what line you allocated the object. This has been taken a step further and now can insert retain/release statements at the proper locations, better than most programmers, almost 100% of the time. Occasionally there are some weird instances of retained objects that you need to help it out with.
Very well explained by Apple developer documentation. Read "How ARC Works"
To make sure that instances don’t disappear while they are still needed, ARC tracks how many properties, constants, and variables are currently referring to each class instance. ARC will not deallocate an instance as long as at least one active reference to that instance still exists.
To make sure that instances don’t disappear while they are still needed, ARC tracks how many properties, constants, and variables are currently referring to each class instance. ARC will not deallocate an instance as long as at least one active reference to that instance still exists.
To know Diff. between Garbage collection and ARC: Read this
ARC is a compiler feature that provides automatic memory management of objects.
Instead of you having to remember when to use retain, release, and autorelease, ARC evaluates the lifetime requirements of your objects and automatically inserts appropriate memory management calls for you at compile time. The compiler also generates appropriate dealloc methods for you.
The compiler inserts the necessary retain/release calls at compile time, but those calls are executed at runtime, just like any other code.
The following diagram would give you the better understanding of how ARC works.
Those who're new in iOS development and not having work experience on Objective C.
Please refer the Apple's documentation for Advanced Memory Management Programming Guide for better understanding of memory management.
Most iPhone code examples use the nonatmoc attribute in their properties. Even those that involve [NSThread detachNewThreadSelector:....]. However, is this really an issue if you are not accessing those properties on the separate thread?
If that is the case, how can you be sure nonatomic properties won't be accessed on this different in the future, at which point you may forget those properties are set as nonatomic. This can create difficult bugs.
Besides setting all properties to atomic, which can be impractical in a large app and may introduce new bugs, what is the best approach in this case?
Please note these these questions are specifically for iOS and not Mac in general.
First,know that atomicity by itself does not insure thread safety for your class, it simply generates accessors that will set and get your properties in a thread safe way. This is a subtle distinction. To create thread safe code, you will very likely need to do much more than simply use atomic accessors.
Second, another key point to know is that your accessors can be called from background or foreground threads safely regardless of atomicity. The key here is that they must never be called from two threads simultaneously. Nor can you call the setter from one thread while simultaneously calling the getter from another, etc. How you prevent that simultaneous access depends on what tools you use.
That said, to answer your question, you can't know for sure that your accessors won't be accessed on another thread in the future. This is why thread safety is hard, and a lot of code isn't thread safe. In general, if youre making a framework or library, yeah, you can try to make your code thread safe for the purposes of "defensive programming", or you can leave it non-thread safe. The atomicity of your properties is only a small part of that. Whichever you choose, though, be sure to document it so users of your library don't have to wonder.
I've long considered myself a garbage collection snob – despite a secret love for C++, I find myself sneering at developers who actively choose to use languages without (read: missing) garbage collection when they're given the option.
And then I met Objective-C. Wow! Its system of reference counting seems brilliantly simple – I'd even go so far as to say elegant. When developing for OSX, developers are given the option to use a snazzy GC; when developing for iOS, developers are stuck with reference counting.
My question is:
If I am developing an OSX application that could potentially be ported to iOS, is Objective-C's reference counting system time-consuming enough (development-wise and bug-fixing-wise) to warrant ignoring it for the application's first version?
What problems am I likely to run into if I rely on reference counting*, assuming I'm not clever enough to construct any diabolically complex cyclical data structures? With features like autorelease, it all seems so easy, but I know that Apple wouldn't have invested the effort into creating a garbage collector if this were really the case. What should I be on the lookout for?
* I am aware that I can use the garbage collector even if I am throwing around retains and releases (they'll be ignored). However, considering non-GC applications often use RAII, I don't understand how that would work if a generational GC were to "replace" calls to retain and release. Wouldn't resources potentially be released late?
My experience with developing code to port to iOS is that taking GC only code and back porting it to reference counting is a bit tedious and time consuming and potentially error prone. Having said that, as long as you use properties (make them retain even though it makes no difference in GC) as much as possible and you enable the static analyser build phase, it's not too bad. The static analyser will catch most failures to observe the memory management rules. It won't notice if you fail to release an ivar in dealloc, but you can go through and systematically add all the dealloc methods.
Bear in mind that you can't directly port a Mac application to the iPhone, the VC part of MVC has to be completely rewritten, so you could take the approach of writing the Mac UI solely for garbage collection and only make the model classes compatible with both GC and reference counting.
The documentation for properties in Obj-C 2.0 say that atomic properties use a lock internally, but it doesn't document the specifics of the lock. Does anybody know if this is a per-property lock, a per-object lock separate from the implicit one used by #synchronized(self), or the equivalent of #synchronized(self)?
Looking at the generated code (iOS SDK GCC 4.0/4.2 for ARM),
32-bit assign properties (including struct {int32_t v;}) are accessed directly.
Larger-than-32-bit structs are accessed with objc_copyStruct().
double and int64_t are accessed with objc_copyStruct, except on GCC 4.0 where they're accessed directly with stmia/ldmia (I'm not sure if this is guaranteed to be atomic in case of interrupts).
retain/copy accessors call objc_getProperty and objc_setProperty.
Cocoa with Love: Memory and thread-safe custom property methods gives some details on how they're implemented in runtime version objc4-371.2; obviously the precise implementation can vary between runtimes (for example, on some platforms you can use atomic swap/CAS to spin on the ivar itself instead of using another lock).
The lock used by atomic #properties is an implementation detail--for appropriate types on appropriate platforms, atomic operations without a lock are possible and I'd be surprised if Apple was not taking advantage of them. There is no public access to the lock in any case, so you can't #synchronize on the same lock. Several Apple engineers have pointed out that atomic properties do not guarantee thread safety; atomic properties only guarantee that gets/sets of that value are atomic. For correct thread safety, you will have to make use of higher-level locking or synchronization and you almost certainly would not want to use the same lock as the synthesize getter/setter(s) might be using.
Is there a good rule of thumb for when nonatomic properties should be used in Objective-C (on the desktop or on the iPhone platform), as opposed to the default atomic properties? I understand the difference – atomicity guarantees an entire value at the expense of performance – but most examples I see use nonatomic properties (and aren't unstable), so there are evidently circumstances in which atomicity is required and circumstances under which it is not.
Can anyone provide me with a simple guideline for when I should use atomic properties and when I should favour nonatomic ones?
You should favor nonatomic whenever possible. In general, this means properties that will only be set/accessed from a single thread or properties whose access is protected by higher-level synchronization of some kind. It's important to note that atomic property access does not guarantee thread safety. In other words the algorithms that depend on the values of atomic properties must themselves be thread safe for the entire system to be thread safe. With this in mind, it is often possible to make the properties nonatomic while maintaining the thread safety of the system.