I'm new to Objective-C. When should I use @synchronized, and when should I use lock/unlock? My background is mainly in Java. I know that in Java, obtaining explicit locks allows you to do more complex, extensive, and flexible operations (vis-à-vis release order, etc.), whereas the synchronized keyword forces locks to be used in a block-structured way, and they also have to be released in the reverse order of how they were acquired. Does the same rationale hold in Objective-C?
Many would consider locking/unlocking in arbitrary order to be a bug, not a feature. Namely, it quite easily leads to deadlocks.
In any case, there is little difference between @synchronized(), -lock/-unlock, or any other mutex, save for the details of the scope. They are expensive, fragile, error-prone, and, often, getting them absolutely correct leads to performance akin to a single-threaded solution anyway (but with the complexity of threads).
The new hotness is queues. Queues tend to be a lot lighter weight in that they generally don't require system calls for the "fast path" operations that account for most calls. They also tend to be much more expressive of intent.
Grand Central Dispatch or NSOperationQueue, specifically. The latter is built on top of the former in current OS releases. GCD's APIs tend to be lower level, but they are very powerful and surprisingly simple. NSOperationQueue is higher level and allows you to express dependencies and the like directly.
I would suggest starting with the Cocoa concurrency guide.
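To make that concrete, here is a minimal sketch of protecting a piece of state with a private serial dispatch queue instead of a lock; the Counter class and its queue label are hypothetical:

```objc
// Minimal sketch: a private serial queue stands in for a lock. The Counter
// class and its queue label are hypothetical.
#import <Foundation/Foundation.h>

@interface Counter : NSObject
- (void)increment;
- (NSInteger)value;
@end

@implementation Counter {
    dispatch_queue_t _queue;   // the serial queue plays the role of the mutex
    NSInteger _count;
}

- (instancetype)init {
    if ((self = [super init])) {
        _queue = dispatch_queue_create("com.example.counter", DISPATCH_QUEUE_SERIAL);
    }
    return self;
}

- (void)increment {
    // Writes are submitted asynchronously; the serial queue guarantees they
    // never run concurrently with each other or with reads.
    dispatch_async(_queue, ^{ self->_count += 1; });
}

- (NSInteger)value {
    __block NSInteger result;
    dispatch_sync(_queue, ^{ result = self->_count; });  // reads wait for pending writes
    return result;
}
@end
```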
You are generally correct. @synchronized takes a lock that's "attached" to a given object (the object referred to in the @synchronized directive does not have to be a lock). As in Java, it defines a new block; the lock is taken at the beginning and released at exit, whether by leaving the block normally or through an exception. So, as you guessed, the locks are released in reverse order of acquisition.
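For illustration, here is the same critical section written both ways; the ItemStore class and its ivars are hypothetical:

```objc
// Illustrative sketch only; ItemStore and its ivars are hypothetical.
#import <Foundation/Foundation.h>

@interface ItemStore : NSObject
- (void)addItem:(id)item;
- (void)addItemWithLock:(id)item;
@end

@implementation ItemStore {
    NSMutableArray *_items;
    NSLock *_lock;
}

- (instancetype)init {
    if ((self = [super init])) {
        _items = [NSMutableArray array];
        _lock = [[NSLock alloc] init];
    }
    return self;
}

// Block-structured: the lock is released automatically on every exit path,
// including an exception thrown inside the block.
- (void)addItem:(id)item {
    @synchronized (self) {
        [_items addObject:item];
    }
}

// Explicit lock/unlock: you are responsible for balancing the calls on
// every code path yourself.
- (void)addItemWithLock:(id)item {
    [_lock lock];
    [_items addObject:item];
    [_lock unlock];
}
@end
```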
I am working on a project where I am concerned about the thread safety of an object's properties. I know that when a property is an object such as an NSString, I can run into situations where multiple threads are reading and writing simultaneously. In this case you can get a corrupt read, and the app will either crash or end up with corrupted data.
My question is for primitive value type properties such as BOOLs or NSIntegers. I am wondering if I can get into a similar situation where I read a corrupt value when reading and writing from multiple threads (and the app will crash)? In either case, I am interested in why.
Clarification - 1/13/17
I am mostly interested in whether a primitive value type property is susceptible to crashing from multiple threads accessing it at the same time in a different way than an object such as an NSMutableString, a custom-created object, etc. In addition, whether there is a difference when accessing memory on the stack vs. the heap with respect to multithreading.
Clarification - 12/1/17
Thank you to @Rob for pointing me to the answer here: stackoverflow.com/a/34386935/1271826! This answer has a great example that shows that depending on the type of architecture you are on (32-bit vs 64-bit), you can get an undefined result when using a primitive property.
Although this is a great step towards answering my question, I still wonder two things:
Is there a multithreading difference when accessing a primitive value property on the stack vs. the heap (as noted in my previous clarification)?
If you restrict a program to running on one architecture, can you still find yourself in an undefined state when accessing a primitive value property, and why?
I should note that there has been a lot of conversation around atomic vs. nonatomic in response to this question. Although this is generally an important concept, this question has little to do with preventing undefined multithreading behavior by using the atomic property modifier or any other thread-safety approach such as using GCD.
If your primitive value type property is atomic, then you're assured it cannot be corrupted because you're reading it from one thread while setting it from another (as long as you only use the accessor methods and don't interact with the backing ivar directly). That's the entire purpose of atomic. And, as you suggest, this is only applicable to fundamental data types (or objects that are both immutable and stateless). But in these narrow cases, atomic can be useful.
Having said that, this is a far cry from concluding that the app is thread-safe. It only assures you that the access to that one property is thread-safe. But often thread-safety must be considered within a broader context. (I know you assure us that this is not the case here, but I qualify this for future readers who too quickly jump to the conclusion that atomic is sufficient to achieve thread-safety. It often is not.)
For example, if your NSInteger property is "how many items are in this cache object", then not only must that NSInteger have its access synchronized, but it must also be synchronized in conjunction with all interactions with the cache object (e.g. the "add item to cache" and "remove item from cache" tasks, too). And, in these cases, since you'll synchronize all interaction with this broader object somehow (e.g. with a GCD queue, locks, the @synchronized directive, whatever), making the NSInteger property atomic then becomes redundant and therefore modestly less efficient.
Bottom line, in limited situations, atomic can provide thread-safety for fundamental data types, but frequently it is insufficient when viewed in a broader context.
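As a rough sketch of the cache example above (the class, queue label, and method names are hypothetical), here is one common pattern: a concurrent queue with barrier writes, where the count is read on the same queue that guards the items, so an atomic count property would add nothing:

```objc
// Hypothetical sketch of the cache example above, using a concurrent queue
// with barriers: reads run concurrently, writes are exclusive, and the count
// is always consistent with the items, so an atomic `count` property would
// be redundant.
#import <Foundation/Foundation.h>

@interface ItemCache : NSObject
- (void)addItem:(id)item;
- (NSUInteger)count;
@end

@implementation ItemCache {
    dispatch_queue_t _syncQueue;
    NSMutableArray *_items;
}

- (instancetype)init {
    if ((self = [super init])) {
        _syncQueue = dispatch_queue_create("com.example.cache", DISPATCH_QUEUE_CONCURRENT);
        _items = [NSMutableArray array];
    }
    return self;
}

- (void)addItem:(id)item {
    // Barrier: waits for in-flight reads to finish, then runs exclusively.
    dispatch_barrier_async(_syncQueue, ^{ [self->_items addObject:item]; });
}

- (NSUInteger)count {
    __block NSUInteger result;
    dispatch_sync(_syncQueue, ^{ result = self->_items.count; });
    return result;
}
@end
```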
You later say that you don't care about race conditions. For what it's worth, Apple argues that there is no such thing as a benign race. See WWDC 2016 video Thread Sanitizer and Static Analysis (about 14:40 into it).
Anyway, you suggest you are merely concerned whether the value can be corrupted or whether the app will crash:
I am wondering if I can get into a similar situation where I read a corrupt value when reading and writing from multiple threads (and the app will crash)?
The bottom line is that if you're reading from one thread while mutating on another, the behavior is simply undefined. It could vary. You are simply well advised to avoid this scenario.
In practice, it's a function of the target architecture. For example, with a 64-bit type (e.g. long long) on a 32-bit x86 target, you can easily retrieve a corrupt value in which one half of the 64-bit value has been set and the other half has not. (See https://stackoverflow.com/a/34386935/1271826 for an example.) For primitive types this results in merely nonsensical, invalid numeric values; for pointers to objects, it would obviously have catastrophic implications.
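Here is a hypothetical illustration of that torn-read scenario; the `value` property and the class it lives in are assumptions, and whether you ever observe a torn value depends on the architecture:

```objc
// Hypothetical illustration of the torn-read scenario described above,
// written as a method of some class that declares:
//   @property (nonatomic, assign) uint64_t value;
// On a 32-bit target the reader can observe a half-written value; on other
// architectures it may never reproduce. Either way the behavior is undefined.
- (void)demonstrateTornRead {
    dispatch_queue_t global = dispatch_get_global_queue(QOS_CLASS_DEFAULT, 0);

    // Writer: alternates between two full-width bit patterns.
    dispatch_async(global, ^{
        while (1) {
            self.value = 0;
            self.value = UINT64_MAX;
        }
    });

    // Reader: anything other than 0 or UINT64_MAX is a torn read,
    // e.g. 0x00000000FFFFFFFF or 0xFFFFFFFF00000000.
    dispatch_async(global, ^{
        while (1) {
            uint64_t snapshot = self.value;
            if (snapshot != 0 && snapshot != UINT64_MAX) {
                NSLog(@"torn read: %llx", (unsigned long long)snapshot);
            }
        }
    });
}
```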
But even if you're in an environment where no problems manifest themselves, it's an incredibly fragile approach to eschew synchronization in order to achieve thread safety. It could easily break when run on new, unanticipated hardware architectures or when compiled under a different configuration. I'd encourage you to watch that Thread Sanitizer and Static Analysis video for more information.
So I was reading this article about an attempt to remove the global interpreter lock (GIL) from the Python interpreter to improve multithreading performance and saw something interesting.
It turns out that one of the places where removing the GIL actually made things worse was in memory management:
With free-threading, reference counting operations lose their thread-safety. Thus, the patch introduces a global reference-counting mutex lock along with atomic operations for updating the count. On Unix, locking is implemented using a standard pthread_mutex_t lock (wrapped inside a PyMutex structure) and the following functions...
...On Unix, it must be emphasized that simple reference count manipulation has been replaced by no fewer than three function calls, plus the overhead of the actual locking. It's far more expensive...
...Clearly fine-grained locking of reference counts is the major culprit behind the poor performance, but even if you take away the locking, the reference counting performance is still very sensitive to any kind of extra overhead (e.g., function call, etc.). In this case, the performance is still about twice as slow as Python with the GIL.
and later:
Reference counting is a really lousy memory-management technique for free-threading. This was already widely known, but the performance numbers put a more concrete figure on it. This will definitely be the most challenging issue for anyone attempting a GIL removal patch.
So the question is, if reference counting is so lousy for threading, how does Objective-C do it? I've written multithreaded Objective-C apps and haven't noticed much of an overhead for memory management. Are they doing something else, like some kind of per-object lock instead of a global one? Is Objective-C's reference counting actually technically unsafe with threads? I'm not enough of a concurrency expert to really speculate much, but I'd be interested in knowing.
There is overhead and it can be significant in rare cases (like, for example, micro-benchmarks ;), regardless of the optimizations that are in place (of which, there are many). The normal case, though, is optimized for un-contended manipulation of the reference count for the object.
So the question is, if reference counting is so lousy for threading, how does Objective-C do it?
There are multiple locks in play and, effectively, a retain/release on any given object selects a random lock (but always the same lock) for that object, reducing lock contention while not requiring one lock per object.
(And what Catfish_man said; some classes will implement their own reference counting scheme to use class-specific locking primitives to avoid contention and/or optimize for their specific needs.)
The implementation details are more complex.
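Purely as a conceptual sketch (this is not the actual runtime code, which is considerably more involved), the "many locks, hash the object's address to pick one" idea looks roughly like this:

```objc
// Conceptual sketch only -- NOT the actual runtime implementation. The idea:
// a fixed pool of "side tables", each with its own lock and its own count
// storage; the object's address is hashed to pick a table, so a given object
// always uses the same lock, but different objects rarely contend on one.
#import <Foundation/Foundation.h>
#import <os/lock.h>

#define TABLE_COUNT 64

typedef struct {
    os_unfair_lock lock;            // protects `counts`
    CFMutableDictionaryRef counts;  // object pointer -> extra retain count
} SideTableSketch;

static SideTableSketch tables[TABLE_COUNT];   // zero-init == unlocked

static SideTableSketch *table_for_object(id obj) {
    uintptr_t hash = (uintptr_t)obj >> 4;     // drop alignment bits
    return &tables[hash % TABLE_COUNT];
}

static void sketch_retain(id obj) {
    SideTableSketch *t = table_for_object(obj);
    os_unfair_lock_lock(&t->lock);
    if (t->counts == NULL) {
        t->counts = CFDictionaryCreateMutable(NULL, 0, NULL, NULL);
    }
    uintptr_t count = (uintptr_t)CFDictionaryGetValue(t->counts, (__bridge const void *)obj);
    CFDictionarySetValue(t->counts, (__bridge const void *)obj, (const void *)(count + 1));
    os_unfair_lock_unlock(&t->lock);
}
```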
Is Objective-C's reference counting actually technically unsafe with threads?
Nope -- it is safe with regard to threads.
In reality, typical code will call retain and release quite infrequently, compared to other operations. Thus, even if there were significant overhead on those code paths, it would be amortized across all the other operations in the app (where, say, pushing pixels to the screen is really expensive, by comparison).
If an object is shared across threads (bad idea, in general), then the locking overhead protecting the data access and manipulation will generally be vastly greater than the retain/release overhead because of the infrequency of retaining/releasing.
As far as Python's GIL overhead is concerned, I would bet that it has more to do with how often the reference count is incremented and decremented as a part of normal interpreter operations.
In addition to what bbum said, a lot of the most frequently thrown around objects in Cocoa override the normal reference counting mechanisms and store a refcount inline in the object, which they manipulate with atomic add and subtract instructions rather than locking.
(edit from the future: Objective-C now automatically does this optimization on modern Apple platforms, by mixing the refcount in with the 'isa' pointer)
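A conceptual sketch of that inline-refcount idea, assuming manual reference counting (ARC forbids overriding retain/release) and glossing over everything the real runtime handles, such as overflow and weak references:

```objc
// Conceptual sketch of an inline reference count manipulated with lock-free
// atomic operations. Assumes MRC (compile with -fno-objc-arc); the real
// runtime packs the count into spare isa bits and does far more bookkeeping.
#import <Foundation/Foundation.h>
#include <stdatomic.h>

@interface InlineCountedObject : NSObject {
    atomic_int_fast32_t _extraRetainCount;   // hypothetical inline count
}
@end

@implementation InlineCountedObject

- (instancetype)retain {
    // No lock: a single atomic increment.
    atomic_fetch_add_explicit(&_extraRetainCount, 1, memory_order_relaxed);
    return self;
}

- (oneway void)release {
    // If the count was zero before the decrement, this was the last
    // reference and the object can be deallocated.
    if (atomic_fetch_sub_explicit(&_extraRetainCount, 1, memory_order_acq_rel) == 0) {
        [self dealloc];
    }
}

@end
```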
Apple's Grand Central Dispatch reference says:
"...if your application needs to operate at the Unix level of the
system—for example, if it needs to manipulate file descriptors, Mach
ports, signals, or timers. GCD is not restricted to system-level
applications, but before you use it for higher-level applications, you
should consider whether similar functionality provided in Cocoa (via
NSOperation and block objects) would be easier to use or more
appropriate for your needs.".
http://developer.apple.com/library/ios/#documentation/Performance/Reference/GCD_libdispatch_Ref/Reference/reference.html
I can't actually think of situations, for high-level applications, in which the use of GCD is mandatory and NSOperation could/should not be used.
Any thoughts?
The point being made here is the same one that Chris Hanson states in his article "When to use NSOperation vs. GCD":
The straightforward answer is a general guideline for all application development:
Always use the highest-level abstraction available to you, and drop down to lower-level abstractions when measurement shows that they are needed.
In this particular case, it means that when writing Cocoa applications, you should generally be using NSOperation rather than using GCD directly. Not because of a difference in efficiency, but because NSOperation provides a higher-level abstraction atop the mechanisms of GCD.
In general, I agree with this. NSOperation and NSOperationQueue provide support for dependencies and one or two other things that GCD blocks and queues don't have, and they abstract away the lower-level details of how the concurrent operations are implemented. If you need that functionality, NSOperation is a very good way to go.
However, after working with both, I've found myself replacing all of my NSOperation-based code with GCD blocks and queues. I've done this for two reasons: there is significant overhead when using NSOperation for frequent actions, and I believe my code is cleaner and more descriptive when using GCD blocks.
The first reason comes from profiling in my applications, where I found that the NSOperation object allocation and deallocation process took a significant amount of CPU resources when dealing with small and frequent actions, like rendering an OpenGL ES frame to the screen. GCD blocks completely eliminated that overhead, leading to significant performance improvements.
The second reason is more subjective, but I believe that my code is cleaner when using blocks than NSOperations. The quick capture of scope allowed by a block and their inline nature make for less code, because you don't need to create custom NSOperation subclasses or bundle up parameters to be passed into the operation, and for more descriptive code, in my opinion, because you can place the code to be run in a queue at the point where it is fired off.
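For example, a hypothetical render method along these lines (the queue, delegate, and rendering methods are assumptions, not real API) shows how the block captures local scope and sits right where the work is enqueued:

```objc
// Illustrative only: renderQueue, renderFrame:intoTexture:, and the delegate
// callback are hypothetical. The work is written inline at the point where it
// is fired off, and the block captures `frame` and `textureID` directly, with
// no NSOperation subclass or parameter packaging.
- (void)enqueueRenderOfFrame:(Frame *)frame {
    NSUInteger textureID = self.currentTextureID;   // captured by the block

    dispatch_async(self.renderQueue, ^{
        [self renderFrame:frame intoTexture:textureID];

        dispatch_async(dispatch_get_main_queue(), ^{
            [self.delegate rendererDidFinishFrame:frame];
        });
    });
}
```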
Again, it's a matter of preference, but I've found myself using GCD more, even in otherwise more abstracted Cocoa applications.
Prefer GCD when the task is not very complex and optimal CPU performance is required.
Prefer NSOperationQueue when the task is complex, or when it requires canceling or suspending work and managing dependencies.
GCD is a lightweight way to represent units of work that are going to be executed concurrently. You don't schedule these units of work; the system takes care of scheduling for you. Adding dependencies among blocks can be a headache, and canceling or suspending a block creates extra work for you as a developer.
NSOperation and NSOperationQueue add a little extra overhead compared to GCD, but you can add dependencies among operations, and you can re-use, cancel, or suspend operations. NSOperation is compatible with Key-Value Observing (KVO); for example, you can have an NSOperation start running in response to an NSNotificationCenter notification.
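A small sketch of the dependency and cancellation support just described (the work inside the blocks is hypothetical):

```objc
// Sketch of operation dependencies and cancellation; the work done inside
// the blocks is hypothetical.
NSOperationQueue *queue = [[NSOperationQueue alloc] init];

NSBlockOperation *download = [NSBlockOperation blockOperationWithBlock:^{
    // ... fetch data ...
}];

NSBlockOperation *parse = [NSBlockOperation blockOperationWithBlock:^{
    // ... parse the downloaded data ...
}];

[parse addDependency:download];   // parse won't start until download finishes

[queue addOperations:@[download, parse] waitUntilFinished:NO];

// Later, the whole pipeline can be cancelled -- something plain GCD blocks
// don't offer out of the box.
[queue cancelAllOperations];
```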
For a detailed explanation, refer to this question: NSOperation vs Grand Central Dispatch
Well, NSOperation has no equivalents to dispatch_source_t, dispatch_io, dispatch_data_t, dispatch_semaphore_t, etc... It's also somewhat higher overhead.
On the flip side, libdispatch has no equivalents to operation dependencies, operation priorities (queue priorities are somewhat different), or KVO on operations.
There are two things that NSOperationQueue can do that GCD doesn't: the minor one is dependencies (add an operation to a queue but tell it to only execute when certain other operations are finished), and the big one is that NSOperation gives you an object which can receive messages while the task is executing, unlike GCD, whose blocks cannot receive messages except in a very limited way. You either need these two features or you don't. If you don't, GCD is just an awful lot easier to use.
That's why useful examples of NSOperation are always quite complex. If they were easy, you would use GCD instead. You usually create a subclass of NSOperation, which will be some significant amount of work, or use one that someone else has created.
I've actually just been reading 'round about this, and, I'm sure it will come as no surprise, opinions differ.
I can't think of a case where you'd have to use GCD over NSOperation, but that doesn't mean such a case doesn't exist. I, however, agree with a general sentiment in terms of best-practice coding:
If you have a couple of tools that suit the job (and in this case, you've got NSOperation vs. a GCD block), use the one with the highest level of abstraction (i.e., the highest-level API). Not only is it typically easier to use and less code, you'll also benefit from potential future enhancements introduced to the higher-level APIs.
Most iPhone code examples use the nonatomic attribute in their properties, even those that involve [NSThread detachNewThreadSelector:....]. However, is this really an issue if you are not accessing those properties on the separate thread?
If that is the case, how can you be sure nonatomic properties won't be accessed on a different thread in the future, at which point you may have forgotten that those properties were set as nonatomic? This can create difficult bugs.
Besides setting all properties to atomic, which can be impractical in a large app and may introduce new bugs, what is the best approach in this case?
Please note that these questions are specifically about iOS, not Mac development in general.
First, know that atomicity by itself does not ensure thread safety for your class; it simply generates accessors that will set and get your properties in a thread-safe way. This is a subtle distinction. To create thread-safe code, you will very likely need to do much more than simply use atomic accessors.
Second, another key point to know is that your accessors can be called from background or foreground threads safely regardless of atomicity. The key here is that they must never be called from two threads simultaneously. Nor can you call the setter from one thread while simultaneously calling the getter from another, etc. How you prevent that simultaneous access depends on what tools you use.
That said, to answer your question: you can't know for sure that your accessors won't be accessed from another thread in the future. This is why thread safety is hard, and a lot of code isn't thread safe. In general, if you're making a framework or library, then yes, you can try to make your code thread safe for the purposes of "defensive programming", or you can leave it non-thread-safe. The atomicity of your properties is only a small part of that. Whichever you choose, be sure to document it so users of your library don't have to wonder.
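As one example of the "how you prevent that simultaneous access" point above, here is a sketch of custom accessors that funnel every read and write through the same @synchronized block; the Profile class is hypothetical, and a serial queue or an NSLock would work just as well:

```objc
// Sketch of making a property safe to call from any thread without relying
// on `atomic`: both custom accessors take the same lock. Names are
// illustrative only.
#import <Foundation/Foundation.h>

@interface Profile : NSObject
@property (nonatomic, copy) NSString *username;
@end

@implementation Profile {
    NSString *_username;   // backing ivar; never touched outside the accessors
}

- (NSString *)username {
    @synchronized (self) {
        return _username;
    }
}

- (void)setUsername:(NSString *)username {
    @synchronized (self) {
        _username = [username copy];
    }
}
@end
```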
I've been reading up on RAII and single vs. two-phase construction/initialization. For whatever reason, I was in the two-phase camp up until recently, because at some point I must have heard that it's bad to do error-prone operations in your constructor. However, I think I'm now convinced that single-phase is preferable, based on questions I've read on SO and other articles.
My question is: why does Objective-C use the two-phase approach (alloc/init) almost exclusively for non-convenience constructors? Is there a specific reason in the language, or was it just a design decision by the designers?
I have the enviable situation of working for the guy who wrote +alloc back in 1991, and I happened to ask him a very similar question a few months ago. The addition of +alloc was in order to provide +allocWithZone:, which was in order to add memory pools in NeXTSTEP 2.0 where memory was very tight (4M). This allowed the caller to control where objects were allocated in memory. It was a replacement for +new and its kin, which was (and continues to be, though no one uses it) a 1-phase constructor, based on Smalltalk's new. When Cocoa came over to Apple, the use of +alloc was already entrenched, and there was no going back to +new, even though actually picking your NSZone is seldom of significant value.
So it isn't a big 1-phase/2-phase philosophical question. In practice, Cocoa has single-phase construction, because you always do (and always should) call these back-to-back in a single expression, without testing the result of +alloc. You can think of it as an elaborate way of typing "new".
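For example (using NSMutableArray just for concreteness):

```objc
NSMutableArray *a = [[NSMutableArray alloc] init];             // the idiomatic back-to-back call
NSMutableArray *b = [NSMutableArray new];                      // the older one-phase equivalent
NSMutableArray *c = [[NSMutableArray allocWithZone:nil] init]; // what +alloc was added to enable;
                                                               // picking a zone is seldom of value today
```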
My experience is with C++, but one downside of C++'s one-phase initialization is the handling of inheritance and virtual functions. In C++, you can't call virtual functions during construction or destruction (well, you can, it just won't do what you expect). A two-phase init could partially solve this: from what I understand, the call would get routed to the right class, but the init might not have finished yet, so you could still end up working with a partially initialized object. (I'm still in favor of one-phase construction.)
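For comparison, here is a hypothetical Objective-C sketch of exactly that situation: the message sent from the superclass's -init does reach the subclass override, but the subclass's own initialization hasn't finished yet:

```objc
// Hypothetical Objective-C analogue of the point above: unlike C++, a message
// sent to self inside -init *does* dispatch to the subclass override, but the
// subclass's own ivars have not been set up yet when it runs.
#import <Foundation/Foundation.h>

@interface Base : NSObject
- (void)configure;
@end

@implementation Base
- (instancetype)init {
    if ((self = [super init])) {
        [self configure];   // dispatches to Derived's override
    }
    return self;
}
- (void)configure {}
@end

@interface Derived : Base
@end

@implementation Derived {
    NSString *_name;
}
- (instancetype)init {
    if ((self = [super init])) {   // Base's init runs -configure in here...
        _name = @"configured";     // ...before this line has executed
    }
    return self;
}
- (void)configure {
    NSLog(@"name is %@", _name);   // logs "(null)" -- the ivar isn't set yet
}
@end
```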