What's the proper way to have a Thread-Unsafe object in a singleton? - objective-c

Let's say we have a NSMutableArray for example on a singleton object. The singleton object can obviously be called from multiple different threads.
Let's say we need users of the singleton to be able to addObjects or removeObjects. Or perhaps we have a custom object that we need to both set and read.
What's the proper way to handle these cases? Should every thread-unsafe property on a singleton have it's own serial/concurrent queue, and then overwrite addObject and removeObject functions for the NSMutableArray, wrapping reads in dispatch_sync, and writes in either dispatch_async(to a serial queue) or dispatch_barrier_async(to a concurrent queue)?
1) Does every thread-unsafe property need its own queue? Or should it at least have one in terms of performance. If multiple properties shared the same queue, it would be slower than necessary.
2) In what cases is this thread protection unnecessary. Ever? Or should thread-unsafe properties always have their setters and getter overwritten.

1) Does every thread-unsafe property need its own queue? Or should it
at least have one in terms of performance. If multiple properties
shared the same queue, it would be slower than necessary.
Depends entirely on how frequently you are pounding on the serialized APIs. Start with a simple model and then move to more complex if needed.
And keep in mind that the implementation isn't necessarily limited to just adding queues at whim.
You could also use a multiple-reader, one writer model. That would involve keeping an NSArray* copy of the mutable array's contents that the readers can simultaneously hit. If the mutable array's contents are changed, then you can (thread safety in mind) swap a new immutable copy of the mutable array for the existing NSArray * (could be done with an atomic #property or you could use a concurrent queue for all reads with a barrier for the swap operation).
Something like the above might make sense when either the array is relatively small (cheap copies) or there are many many times more reads than writes.
2) In what cases is this thread protection unnecessary. Ever? Or
should thread-unsafe properties always have their setters and getter
overwritten.
A mutable collection is always thread unsafe, even when operating in a "readonly" mode. Only immutable collections are thread safe.
So, yes, you need to make sure you serialize any operation that would cause a thread unsafe action.

Related

When to use dispatch_once versus reallocation? (Cocoa/CocoaTouch)

I often use simple non compile-time immutable objects: like an array #[#"a", #"b"] or a dictionary #{#"a": #"b"}.
I struggle between reallocating them all the time:
- (void)doSomeStuff {
NSArray<NSString *> *fileTypes = #[#"h", #"m"];
// use fileTypes
}
And allocating them once:
- (void)doSomeStuff {
static NSArray<NSString *> * fileTypes;
static dispatch_once_t onceToken;
dispatch_once(&onceToken, ^{
fileTypes = #[#"h", #"m"];
});
// use fileTypes
}
Are there recommendations on when to use each construct? Like:
depending on the size of the allocated object
depending on the frequency of the allocation
depending on the device (iPhone 4 vs iMac 2016)
...
How do I figure it out?
Your bullet list is a good start. Memory would be another consideration. A static variable will stay in memory from the time it is actually initialized until the termination of the app. Obviously a local variable will be deallocated at the end of the method (assuming no further references).
Readability is something to consider too. The reallocation line is much easier to read than the dispatch_once setup.
For a little array, I'd reallocate as the first choice. The overhead is tiny. Unless you are creating the array in a tight loop, performance will be negligible.
I would use dispatch_once and a static variable for things that take more overhead such as creating a date formatter. But then there is the overhead of reacting to the user changing the device's locale.
In the end, my thought process is to first use reallocation. Then I consider whether there is tangible benefit to using static and dispatch_once. If there isn't a worthwhile reason to use a static, I leave it a local variable.
Use static if the overhead (speed) of reallocation is too much (but not if the permanent memory hit is too large).
Your second approach is more complex. So you should use it, if it is needed, but not as a default.
Typically this is done when the object creating is extreme expensive (almost never) or if you need a single identity of the instance object (shared instance, sometimes called singleton, what is incorrect.) In such a case you will recognize that you need it.
Although this question could probably be closed as "primarily opinion-based", I think this question addresses a choice that programmers frequently make.
The obvious answer is: don't optimize, and if you do, profile first - because most of the time you'll intuitively address the wrong part of your code.
That being said, here's how I address this.
If the needed object is expensive to build (like the formatters per documentation) and lightweight on resources, reuse / cache the object.
If the object is cheap to build, create a new instance when needed.
For crucial things that run on the main thread (UI stuff) I tend to draw the line more strict and cache earlier.
Depending on what you need those objects for, sometimes there are valid alternatives that are cheaper but offer a similar programming comfort. For example, if you wanted to look up some chessboard coordinates, you could either build a dictionary {#"a1" : 0, #"b1" : 1, ...}, use an array alone and the indices, take a plain C array (which now has a much less price tag attached to it), or do a small integer-based calculation.
Another point to consider is where to put the cached object. For example, you could store it statically in the method as in your example. Or you could make it an instance variable. Or a class property / static variable. Sometimes caching is only the half way to the goal - if you consider a table view with cells including a date formatter, you can think about reusing the very same formatter for all your cells. Sometimes you can refactor that reused object into a helper object (be it a singleton or not), and address some issues there.
So there is really no golden bullet here, and each situation needs an individual approach. Just don't fall into the trap of premature optimization and trade clear code for bug-inviting, hardly readable stuff that might not matter to your performance but comes with sure drawbacks like increased memory footprint.
dispatch_once_t
This allows two major benefits: 1) a method is guaranteed to be called only once, during the lifetime of an application run, and 2) it can be used to implement lazy initialization, as reported in the Man Page below.
From the OS X Man Page for dispatch_once():
The dispatch_once() function provides a simple and efficient mechanism to run an initializer exactly once, similar to pthread_once(3). Well designed code hides the use of lazy initialization.
Some use cases for dispatch_once_t
Singleton
Initialization of a file system resource, such as a file handle
Any static variable that would be shared by a group of instances, and takes up a lot of memory
static, without dispatch_once_t
A statically declared variable that is not wrapped in a dispatch_once block still has the benefit of being shared by many instances. For example, if you have a static variable called defaultColor, all instances of the object see the same value. It is therefore class-specific, instead of instance-specific.
However, any time you need a guarantee that a block will be called only once, you will need to use dispatch_once_t.
Immutability
You also mentioned immutability. Immutability is independent of the concern of running something once and only once--so there are cases for both static immutable variables and instance immutable variables. For instance, there may be times when you need to have an immutable object initialized, but it still may be different for each instance (in cases where it's value depends on other instance variables). In that case, an immutable object is not static, and still may be initialized with different values from multiple instances. In this case, a property is derived from other instance variables, and therefore should not be allowed to be changed externally.
A note on immutability vs mutability, from Concepts in Objective-C Programming:
Consider a scenario where all objects are capable of being mutated. In your application you invoke a method and are handed back a reference to an object representing a string. You use this string in your user interface to identify a particular piece of data. Now another subsystem in your application gets its own reference to that same string and decides to mutate it. Suddenly your label has changed out from under you. Things can become even more dire if, for instance, you get a reference to an array that you use to populate a table view. The user selects a row corresponding to an object in the array that has been removed by some code elsewhere in the program, and problems ensue. Immutability is a guarantee that an object won’t unexpectedly change in value while you’re using it.
Objects that are good candidates for immutability are ones that encapsulate collections of discrete values or contain values that are stored in buffers (which are themselves kinds of collections, either of characters or bytes). But not all such value objects necessarily benefit from having mutable versions. Objects that contain a single simple value, such as instances of NSNumber or NSDate, are not good candidates for mutability. When the represented value changes in these cases, it makes more sense to replace the old instance with a new instance.
A note on performance, from the same reference:
Performance is also a reason for immutable versions of objects representing things such as strings and dictionaries. Mutable objects for basic entities such as strings and dictionaries bring some overhead with them. Because they must dynamically manage a changeable backing store—allocating and deallocating chunks of memory as needed—mutable objects can be less efficient than their immutable counterparts.

Can I safely use 'nonatomic' ivars within a dispatch_barrier of a customized concurrent queue?

Is it safe to access non-atomic ivars from within a dispatch_barrier of a customized-concurrent queue?
The following code snippet is an abridged version of method using a dispatch barrier:
- (void)cacheData:(NSData *)data toFile:(NSString *)fileName {
dispatch_barrier_async(concurrentQueue, ^{
[_memoryCache setObject:data forKey:fileName];
// ...
}
});
}
I want to get a basic understanding of the non-atomic thread safety (i.e., without the overhead of using atomic vs non-atomic iVars).
This concerns Swift as will as Objective-C.
Atomic ≠ Thread safe
In order to answer this, let's first split the read/write operations into 3 distinct categories:
Direct Writes: When you directly set a variable/property to a given value. For example:
yourDictionary = [NSMutableDictionary dictionary];
Indirect Writes: When you mutate the object itself by changing a variable on it. For example:
[yourDictionary setObject:#"foo" forKey:#"bar"];
Reads: When you do any form of passive reading from an object (when the reading doesn't lead to any changes to the object itself). For example:
NSString* foo = [yourDictionary objectForKey:#"bar"];
So what does the atomic attribute on properties (you cannot set this on an ivar directly) ensure?
It ensures that direct writes and reads are serialised. However, it does nothing to protect the object from being indirectly written to while being read from another thread, which is unsafe.
Therefore, atomic properties make the most amount of sense with immutable objects (such as NSString, NSDictionary & NSArray). Once these objects are set, you cannot mutate them, and therefore are thread safe when atomic.
See here for full a list of the thread safe immutable objects.
If you want to use the atomic attribute on a mutable object, you will have still to ensure yourself that indirect writes and reads are serialised correctly.
What does the nonatomic attribute ensure?
Nothing.
No serialisation will take place for direct writes, indirect writes and reads. As a consequence, it is faster. However it is completely up to you to ensure that reads and writes are serialised correctly.
So how do I serialise read and writes?
You are indeed correct in approaching this by using a concurrent GCD queue. GCD queues don't impose some of the costs that traditional locks bring with them.
So, if you're looking to do this on a nonatomic ivar, you'll need to serialise the direct writes, indirect writes and reads. If you were doing this on an atomic property, you'd only need to serialise the indirect writes and reads.
As you have in your question, you should use a dispatch_barrier on your concurrent queue in order to make any writes (direct or indirect).
This is because a dispatch_barrier will wait until all tasks on the concurrent queue have been completed before preceding, and will block any further tasks from taking place until it has completed. Therefore the writes will take place without any interruptions from other writes or reads, while multiple reads can take place concurrently
You should therefore also channel any reads of the object through your concurrent queue as well, using a dispatch_sync. You shouldn't have to use a barrier for your reads, as multiple reads from different threads shouldn't cause a problem.
I hope this clarifies the issue for you.
TL;DR
Yes, your approach appears to be correct.

How global variables to be accessed in multi-thread env?

How global variables to be accessed in multi-thread env? Eg: the following messageServerUrl variable how to keep it thread safe? Is it atomic enough to keep safe? If not, any other solution available? Any idea please share, thanks in advance.
#property(atomic, copy) NSString * messageServerUrl;
atomic doesn't magically keep something thread-safe. It just makes a thread-safe accessor method.
Being very very very careful is what makes a shared value safe (and sometimes not even that!). You could still mess up your life seriously if you accessed the same value from two different threads in a way that puts your logic at risk.
If you know, for example, that this value is set before the secondary thread kicks off, and all you ever do after that is use the getter, then yes, that's probably safe.
But the safest way to share data between multiple threads is: don't. If the data doesn't need to be changed, then pass it to the secondary thread at the outset. This is why GCD is so wonderful: it's serial queues by default, plus you keep passing the data down into the block executed on the next thread, so you get effective locking without locks (which are easy to mismanage).

Does atomic actually mean anything for a synthesized primitive?

In Android, I could safely access and modify primitive types from different threads. I used this to share data between my OpenGL draw loop and user settings that were modified in the main thread Android UI. By storing each setting in a primitive type and making each independent of the others' values, it was thread-safe to modify all these variables without using locks or the synchronize keyword.
Is this also true in Objective-C? I read that placing atomic on a variable essentially causes the synthesized getter and setter to use locks, similar to using synchronized methods in Java. And I've read that the reason for this is so an object does not get partially modified while it's being read by another thread.
But are primitive types safe from being partially modified, as they are in Java? If that is the case, it seems like I could use my same old paradigm from Java for sharing data between threads. But then the atomic keyword would be pointless for a primitive, correct?
I also read that a more robust and faster solution than using atomic variables is to copy objects before using them if they are accessed from multiple threads. But I'm not sure how that could be accomplished. Couldn't a nonatomic object get modified while it's in the process of being copied, thereby corrupting the copy?
Primitive types aren't guaranteed to be safe from being partially modified because modifications to primitive types aren't guaranteed to be atomic in C and Objective-C inherits from there. C's only guarantees concern sequence points and there's no requirement that the processing between two sequence points be atomic — a rule of thumb is that each full expression is a sequence point.
In practice, modifying primitives is often a two-step process; the modification is made in a register and then written out to memory. It's very unlikely that the write itself will not be atomic but there's also no guarantee of when it will occur versus the modification. Even with the volatile qualification, the only guarantees provided are in terms of sequence points.
Apple exposes some C functions for atomic actions via OSAtomic.h that map directly to the specialised atomic instructions that CPUs offer for the implementation of concurrency mechanisms. It's possible you could use one of those more directly than via a heavy-handed mutex.
Common patterns in Objective-C are:
immutable objects and functional transformations — there are memory management reasons as well but that's partly why NSString, NSArray, etc, are specifically distinct from NSMutableString, NSMutableArray, etc;
serial dispatch queues, which can be combined with copy-modify-replace by copying on the queue, going off somewhere else to modify, then jumping back onto the queue to replace;
such #synchronizeds, NSConditionLocks or other explicit synchronisation mechanisms as are appropriate.
The main thread itself is a serial dispatch queue, which is why you can ignore the issue of concurrency entirely if you restrict yourself to it.
Atomic synthesized #properties are immune to concurrent partial updates. The accessor methods will use a lock if necessary on that architecture.
In general, primitive types in C are not necessarily safe with respect to concurrent partial updates.
I don't believe you can partially modify a primitive type, that's part of what makes them primitive. You either modify it or you don't. In that sense, I would say that they are thread safe.
You are correct when you say that the atomic keyword would be pointless for a primitive type.
Someone already took a stab at this here:
Objective-c properties for primitive types

Key-value coding for mutable collections & atomic accessors

My question, in brief: Is there any way to make mutable collection KVO accessors thread-safe with the same lock that the #synthesized methods are locked?
Explanation: I have a controller class which contains a collection (NSMutableArray) of Post objects. These objects are downloaded from a website, and thus the collection changes from time to time. I would like to be able to use key-value observing to observe the array, so that I can keep my interface updated.
My controller has a posts property, declared as follows:
#property (retain) NSMutableArray *posts;
If I call #synthesize in my .m file, it will create the -(NSMutableArray *)posts and -(void)setPosts:(NSMutableArray *)obj methods for me. Further, they will be protected by a lock such that two threads cannot stomp on each other while setting (or getting) the value.
However, in order to be key-value coding compliant for a mutable ordered collection, there are a few other methods I need to implement. Specifically, I need to implement at least the following:
-insertObject:inPostsAtIndex:
-removeObjectFromPostsAtIndex:
However, since the posts are downloaded asynchronously, I would like to be able to insert new posts into the array on a background thread as well. This means that access needs to be thread-safe.
So, my question. Is there any way to make those accessors thread-safe with the same lock that the #synthesized methods are locked? Or do I have to resort to specifying the setPosts: and posts methods myself in order to guarantee full atomicity across all accessors?
The Objective-C docs at developer.apple.com[1] don't state that there's a way to use the same lock for your explicitly defined functions as gets used for your #synthesized functions. In that case I'd say that to be completely safe it would be better to fully define your own functions to be sure they all use the same lock.
You may be able to use the debugger to determine the name of the lock that gets used for your #synthesized functions, but that's not something I'd rely on.
You probably don't really want to do this. If you do succeed, KVO-notifications will be received on the same thread that makes the change, and if it's a background thread, will be unsuitable for updating the UI.
Instead, why not have your background thread update the property using the main thread? Then you don't even need the property to be atomic.