Multi-threaded Objective-C accessors: GCD vs locks - objective-c

I'm debating whether or not to move to a GCD-based pattern for multi-threaded accessors. I've been using custom lock-based synchronization in accessors for years, but I've found some info (Intro to GCD), and a GCD-based approach seems to have real advantages. I'm hoping to start a dialog here to help myself and others weigh the decision.
The pattern looks like:
- (id)something
{
    __block id localSomething;
    dispatch_sync(queue, ^{
        localSomething = [something retain];
    });
    return [localSomething autorelease];
}

- (void)setSomething:(id)newSomething
{
    dispatch_async(queue, ^{
        if (newSomething != something)
        {
            [something release];
            something = [newSomething retain];
            [self updateSomethingCaches];
        }
    });
}
On the pro side: writes are potentially non-blocking; the overhead may be lower than locks; and there's no way to forget to unlock before returning from a critical code section. Others?
Cons: Exception handling is non-existent so you have to code this into every block in which you might need it.
Is this pattern potentially the recommended method of writing multithreaded accessors?
Are there standard approaches for creating dispatch queues for this purpose? In other words, what are the best practices for trading off granularity? With locks, for example, locking on each attribute is finer-grained than locking on the entire object. With dispatch queues, I could imagine that a single queue shared by all objects would create performance bottlenecks, so are per-object queues appropriate? Obviously, the answer is highly dependent on the specific application, but are there known performance tradeoffs to help gauge the feasibility of the approach?
Any information / insight would be appreciated.

Is this pattern potentially the recommended method of writing multithreaded accessors?
I guess you wrote that with a serial queue in mind, but there is no reason for it. Consider this:
dispatch_queue_t queue = dispatch_queue_create("com.example", DISPATCH_QUEUE_CONCURRENT);

// same thing as your example
- (NSString *)something {
    __block NSString *localSomething;
    dispatch_sync(queue, ^{
        localSomething = _something;
    });
    return localSomething;
}

- (void)setSomething:(NSString *)something {
    dispatch_barrier_async(queue, ^{
        _something = something;
    });
}
It reads concurrently, but uses a dispatch barrier to disable concurrency while a write is in flight. A big advantage of GCD is that it allows concurrent reads instead of locking the whole object the way @property (atomic) does.
Both asyncs (dispatch_async, dispatch_barrier_async) are faster from the caller's point of view, but slower to execute than a sync, because they have to copy the block; with a task this small, the copy time becomes meaningful. I'd rather have the caller return fast, so I'm OK with it.
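If the block-copy overhead ever matters more than returning quickly, a barrier-sync setter is a sketch of the opposite trade-off (same assumed queue as above):

- (void)setSomething:(NSString *)something {
    // Blocks the caller until the write finishes, but the sync variants
    // can run the block in place without copying it.
    dispatch_barrier_sync(queue, ^{
        _something = something;
    });
}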

Related

Do I need to use locks if the methods are already executed in a serial queue?

I have 2 method calls exposed in my library as follows:
- (void)createFile {
    dispatch_async(serialQueue, ^{
        [fileObj createFile:fileInfo completion:^(void){
            // completion block C1
        }];
    });
}

- (void)readFile:(NSUInteger)timeStamp {
    dispatch_async(serialQueue, ^{
        [fileObj readFile:timeStamp completion:^(void){
            // completion block C2
        }];
    });
}
Now createFile:completion: and readFile:completion: are in turn XPC calls into a process P1. Their implementations inside P1 look like this:
@interface FileObject : NSObject
+ (instancetype)sharedInstance;
@property (nonatomic, strong) NSData *fileContents;
@end

@implementation FileObject

+ (instancetype)sharedInstance
{
    static FileObject *sharedObj = nil;
    static dispatch_once_t onceToken;
    dispatch_once(&onceToken, ^{
        sharedObj = [[self alloc] init];
    });
    return sharedObj;
}

- (void)createFile:(FileInfo *)fileInfo
        completion:(void (^)(void))completion {
    FileObject *f = [FileObject sharedInstance];
    // lock?
    f.fileContents = /* call some method */;
    // unlock
}

- (void)readFile:(NSUInteger)timeStamp
      completion:(void (^)(void))completion {
    FileObject *f = [FileObject sharedInstance];
    // do stuff with f.fileContents
}

@end
The point to note is that a call to createFile is capable of modifying the singleton object, and readFile is always called after createFile (a serial queue is used at the invocation site).
My question is given the serial queue usage for the two methods exposed in my library,
Do I still need to lock and unlock when I modify f.fileContents in createFile:completion:?
How about if multiple different processes call into my library? Will XPC to P1 handle that automatically, or do I have to do something about it?
You asked:
Do I need to use locks if the methods are already executed in a serial queue?
If you can interact with non-thread-safe objects from multiple threads at the same time, then yes, you need locks (or any of a variety of other techniques) to synchronize your access. If you're doing all of your interaction from a serial queue, though, no locks are needed.
You need to be careful, however: although you're using a serial queue, you've got completion handlers that could be running on different threads. If you coordinate the interaction in those completion handlers so that none can ever run at the same time (more on that, below), then no locks (or other synchronization techniques) are needed.
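One hedged sketch of that coordination: hop back onto the serial queue inside each completion handler, so the completion work is serialized with everything else (names taken from the snippets above):

- (void)createFile {
    dispatch_async(serialQueue, ^{
        [fileObj createFile:fileInfo completion:^{
            dispatch_async(serialQueue, ^{
                // completion block C1: now serialized with all other
                // work on serialQueue, so no locks are needed here
            });
        }];
    });
}

Note that this serializes the completion work itself; it does not, by itself, make readFile wait for createFile to finish. The options below address that.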
A bunch of other thoughts based upon your code snippets (and conversations we've had elsewhere):
You're still using the GCD dispatch calls with asynchronous methods. There's a cognitive dissonance there. Either
lose the dispatch queues altogether and use completion handler patterns (this is the better solution IMHO, but you say you can't change your endpoints which apparently precludes this more logical approach); or
make createFile and readFile behave synchronously (with semaphores, for example; see the sketch after this list) and then you can take advantage of the GCD queue behaviors (we generally eschew patterns like locks, semaphores, or anything that will "wait", but given that you're doing this on a background queue, it's less problematic).
If you want these two fundamentally asynchronous tasks to behave synchronously, you can achieve this with locks, too, but I think dispatch semaphores are the more logical pattern if you're going to use dispatch queues.
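A minimal sketch of that semaphore approach (names assumed from the snippets above):

- (void)createFile {
    dispatch_async(serialQueue, ^{
        // Block this background queue until the XPC call reports
        // completion, so the next block on serialQueue cannot start early.
        dispatch_semaphore_t sema = dispatch_semaphore_create(0);
        [fileObj createFile:fileInfo completion:^{
            // completion block C1
            dispatch_semaphore_signal(sema);
        }];
        dispatch_semaphore_wait(sema, DISPATCH_TIME_FOREVER);
    });
}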
The better pattern would be to make createFile and readFile create custom, asynchronous, NSOperation subclasses. This is a much stronger pattern than locks or semaphores in GCD because it minimizes the deadlocking risks.
Completely unrelated to the question here, I would not suggest making FileObject a singleton. Each FileObject is associated with a particular file, and it has no business being a singleton. This is especially important because we discovered offline that you're contemplating this being an XPC service with multiple requests coming in, and the singleton pattern is antithetical to that.
Plus, as an aside, FileObject doesn't pass the basic criteria for when you'd use singletons. For discussion of the considerations (which we don't need to repeat here, especially since it's unrelated to your main question), see What is so bad about singletons? Singletons have their place, but this is a scenario where you likely want separate instances, so the singleton pattern seems especially ill-suited.

Asynchronous Cocoa - Preventing "simple" (obvious) deadlocks in NSOperation?

When subclassing NSOperation to get a little chunk of work done, I've found out it's pretty easy to deadlock. Below I have a toy example that's pretty easy to understand why it never completes.
I can only seem to think through solutions that prevent the deadlock from the caller's perspective, never the callee's. For example, the caller could continue to run the run loop, not wait for finish, etc. If the main thread needs to be messaged synchronously during the operation, I'm wondering if there is a canonical solution that an operation subclasser can implement to prevent this type of deadlocking. I'm only just starting to dip my toe into async programming...
@interface ToyOperation : NSOperation
@end

@implementation ToyOperation

- (void)main
{
    // Lots of work
    NSString *string = @"Important Message";
    [self performSelector:@selector(sendMainThreadSensitiveMessage:)
                 onThread:[NSThread mainThread]
               withObject:string
            waitUntilDone:YES];
    // Lots more work
}

- (void)sendMainThreadSensitiveMessage:(NSString *)string
{
    // Update the UI or something that requires the main thread...
}

@end
- (int)main
{
    ToyOperation *op = [[ToyOperation alloc] init];
    NSOperationQueue *opQ = [[NSOperationQueue alloc] init];
    [opQ addOperations:@[ op ] waitUntilFinished:YES]; // Deadlock
    return 0;
}
If the main thread needs to be messaged synchronously during the operation, I'm wondering if there is a canonical solution that an operation subclasser can implement to prevent this type of deadlocking.
There is. Never make a synchronous call to the main queue. And a follow-on: Never make a synchronous call from the main queue. And, really, it can be summed up as Never make a synchronous call from any queue to any other queue.
By doing that, you guarantee that the main queue is not blocked. Sure, there may be an exceptional case that tempts you to violate this rule and, even, cases where it really, truly, is unavoidable. But that very much should be the exception because even a single dispatch_sync() (or NSOpQueue waitUntilDone) has the potential to deadlock.
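Applied to the toy example above, a minimal sketch of that rule (assuming the UI update doesn't have to complete before the remaining work):

- (void)main
{
    // Lots of work
    NSString *string = @"Important Message";
    // Post to the main queue asynchronously: the operation never blocks
    // on the main thread, so the caller's waitUntilFinished:YES can no
    // longer deadlock against it.
    dispatch_async(dispatch_get_main_queue(), ^{
        [self sendMainThreadSensitiveMessage:string];
    });
    // Lots more work
}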
Of course, data updates from queue to queue can be tricky. There are several options: a concurrency-safe data layer (very hard); passing only immutable objects or copies of the data (typically to the main queue for display purposes -- fairly easy, but potentially expensive); or a UUID-based faulting model like the one Core Data uses. However you solve it, the problem isn't anything new compared to any other concurrency model.
The one exception is when replacing locks with queues: for example, instead of using @synchronized() internally in a class, use a serial GCD queue and dispatch_sync() to that queue anywhere a synchronized operation must take place. It's faster and more straightforward. But this isn't so much an exception as a solution to a completely different problem.
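A minimal sketch of that replacement (class, queue label, and method names are assumed):

@interface Counter : NSObject
- (void)increment;
- (NSUInteger)count;
@end

@implementation Counter {
    dispatch_queue_t _syncQueue;
    NSUInteger _count;
}

- (instancetype)init
{
    if ((self = [super init])) {
        _syncQueue = dispatch_queue_create("com.example.counter", DISPATCH_QUEUE_SERIAL);
    }
    return self;
}

- (void)increment
{
    // Serialized: same effect as @synchronized(self) { _count++; }
    dispatch_sync(_syncQueue, ^{
        self->_count++;
    });
}

- (NSUInteger)count
{
    __block NSUInteger result;
    dispatch_sync(_syncQueue, ^{
        result = self->_count;
    });
    return result;
}

@end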

Locking an object from being accessed by multiple threads - Objective-C

I have a question regarding thread safety in Objective-C. I've read a couple of other answers, some of the Apple documentation, and still have some doubts regarding this, so thought I'd ask my own question.
My question is three fold:
Suppose I have an array, NSMutableArray *myAwesomeArray;
Fold 1:
Now correct me if I'm mistaken, but from what I understand, using @synchronized(myAwesomeArray) {...} will prevent two threads from accessing the same block of code. So, basically, if I have something like:
- (void)doSomething {
    @synchronized(myAwesomeArray) {
        // some read/write operation on myAwesomeArray
    }
}
then, if two threads access the same method at the same time, that block of code will be thread safe. I'm guessing I've understood this part properly.
Fold 2:
What do I do if myAwesomeArray is being accessed by multiple threads from different methods?
If I have something like:
- (void)readFromArrayAccessedByThreadOne {
    // thread 1 reads from myAwesomeArray
}

- (void)writeToArrayAccessedByThreadTwo {
    // thread 2 writes to myAwesomeArray
}
Now, both the methods are accessed by two different threads at the same time. How do I ensure that myAwesomeArray won't have problems? Do I use something like NSLock or NSRecursiveLock?
Fold 3:
Now, in the above two cases, myAwesomeArray was an iVar in memory. What if I have a database file that I don't always keep in memory? I create a databaseManagerInstance whenever I want to perform database operations, and release it once I'm done. Thus, basically, different classes can access the database. Each class creates its own instance of DatabaseManager, but basically they are all using the same, single database file. How do I ensure that data is not corrupted due to race conditions in such a situation?
This will help me clear out some of my fundamentals.
Fold 1
Generally, your understanding of what @synchronized does is correct. Technically, however, it doesn't make any code "thread-safe"; it prevents different threads from acquiring the same lock at the same time. You need to ensure that you always use the same synchronization token when performing related critical sections. If you don't, two threads can still perform their critical sections at the same time. Check the docs.
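For example (a hypothetical sketch, not from the question), if the ivar used as the token is reassigned, the two threads end up synchronizing on different objects and exclude nothing:

// Thread 1 locks the object myAwesomeArray currently points to...
@synchronized (myAwesomeArray) {
    // ...then swaps the ivar to a different object mid-section:
    myAwesomeArray = [NSMutableArray array];
}

// Thread 2, arriving after the swap, locks the NEW object, so it can
// run concurrently with Thread 1's critical section:
@synchronized (myAwesomeArray) {
    [myAwesomeArray addObject:someObject]; // someObject is assumed
}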
Fold 2
Most people would probably advise you to use NSRecursiveLock. If I were you, though, I'd use GCD. Here is a great document showing how to migrate from thread programming to GCD programming; I think this approach to the problem is a lot better than one based on NSLock. In a nutshell, you create a serial queue and dispatch your tasks onto it. This way you ensure that your critical sections are handled serially, so only one is performed at any given time. A sketch follows.
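A minimal sketch of that approach, applied to the two methods from the question (the queue name is assumed):

// Created once, e.g. in init:
arrayQueue = dispatch_queue_create("com.example.arrayQueue", DISPATCH_QUEUE_SERIAL);

- (void)readFromArrayAccessedByThreadOne {
    dispatch_sync(arrayQueue, ^{
        // thread 1 reads from myAwesomeArray; dispatch_sync lets the
        // caller consume the result as soon as this method returns
    });
}

- (void)writeToArrayAccessedByThreadTwo {
    dispatch_async(arrayQueue, ^{
        // thread 2 writes to myAwesomeArray; the writer needn't wait
    });
}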
Fold 3
This is the same as Fold 2, only more specific. A database is a resource; in many respects it's the same as the array or anything else shared. If you want to see the GCD-based approach in a database programming context, take a look at the fmdb implementation. It does exactly what I described in Fold 2.
As a side note to Fold 3, I don't think that instantiating a DatabaseManager each time you want to use the database and then releasing it is the correct approach. You should create one single database connection and retain it throughout your application session; that makes it much easier to manage. Again, fmdb is a great example of how this can be achieved.
Edit
If you don't want to use GCD, then yes, you will need some kind of locking mechanism, and yes, NSRecursiveLock will prevent deadlocks if your methods recurse, so it's a good choice (it is what @synchronized uses). There is one catch, though: if many threads will wait for the same resource and the order in which they get access matters, NSRecursiveLock is not enough. You can still manage this with NSCondition, but trust me, you will save a lot of time using GCD in that case. If the order of the threads is not relevant, you are safe with locks.
As shown in WWDC 2016 Session 720, Concurrent Programming With GCD in Swift 3, in Swift you should use a queue:
class MyObject {
    private var internalState: Int = 0
    private let internalQueue = DispatchQueue(label: "com.example.internalQueue")

    var state: Int {
        get {
            return internalQueue.sync { internalState }
        }
        set {
            internalQueue.sync { internalState = newValue }
        }
    }
}
Subclass NSMutableArray to provide locking for the accessor (read and write) methods. Something like:
@interface MySafeMutableArray : NSMutableArray {
    NSRecursiveLock *lock; // create in init
}
@end

@implementation MySafeMutableArray

- (void)addObject:(id)obj {
    [lock lock];
    [super addObject:obj];
    [lock unlock];
}

// ...

@end
This approach encapsulates the locking as part of the array. Users don't need to change their calls (but may need to be aware that they could block/wait for access if the access is time critical). A significant advantage to this approach is that if you decide that you prefer not to use locks you can re-implement MySafeMutableArray to use dispatch queues - or whatever is best for your specific problem. For example, you could implement addObject as:
- (void)addObject:(id)obj {
    dispatch_sync(self.queue, ^{
        [super addObject:obj];
    });
}
Note: if using locks, you'll surely need NSRecursiveLock, not NSLock, because you don't know whether the Objective-C implementations of addObject:, etc., are themselves recursive.

use NSOperationQueue as a LIFO stack?

I need to make a series of URL calls (fetching WMS tiles), and I want to use a LIFO stack so that the newest URL call is the most important. I want to display the tile that's on the screen now, not a tile that was on the screen 5 seconds ago, before a pan.
I can create my own stack from an NSMutableArray, but I'm wondering if an NSOperationQueue can be used as a LIFO stack?
You can set the priority of operations in an operation queue using -[NSOperation setQueuePriority:]. You'll have to rejigger the priorities of the existing operations each time you add one, but you can achieve something like what you're looking for: essentially, demote all of the old operations and give the newest one the highest priority.
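A hedged sketch of that reprioritization (the method name is assumed):

- (void)addOperationToFront:(NSOperation *)newOp onQueue:(NSOperationQueue *)queue
{
    for (NSOperation *op in queue.operations) {
        // Demote everything that hasn't started yet.
        if (!op.isExecuting && !op.isFinished) {
            op.queuePriority = NSOperationQueuePriorityLow;
        }
    }
    newOp.queuePriority = NSOperationQueuePriorityHigh; // newest request wins
    [queue addOperation:newOp];
}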
Sadly I think NSOperationQueues are, as the name suggests, only usable as queues — not as stacks. To avoid having to do a whole bunch of manual marshalling of tasks, probably the easiest thing is to treat your queues as though they were immutable and mutate by copying. E.g.
- (NSOperationQueue *)addOperation:(NSOperation *)operation toHeadOfQueue:(NSOperationQueue *)queue
{
    // suspending a queue prevents it from issuing new operations; it doesn't
    // pause any already ongoing operations. So we do this to prevent a race
    // condition as we copy operations from the queue
    queue.suspended = YES;

    // create a new queue
    NSOperationQueue *mutatedQueue = [[NSOperationQueue alloc] init];

    // add the new operation at the head
    [mutatedQueue addOperation:operation];

    // copy in all the preexisting operations that haven't yet started
    for (NSOperation *pendingOperation in [queue operations])
    {
        if (!pendingOperation.isExecuting)
            [mutatedQueue addOperation:pendingOperation];
    }

    // the caller should now ensure the original queue is disposed of...
    return mutatedQueue;
}
/* ... elsewhere ... */
NSOperationQueue *newQueue = [self addOperation:newOperation toHeadOfQueue:operationQueue];
[operationQueue release];
operationQueue = newQueue;
It seems at present that releasing a queue that is still working (as will happen to the old operation queue) doesn't cause it to cancel all of its operations, but that's not documented behaviour, so it's probably not trustworthy. If you want to be really safe, key-value observe the operationCount property on the old queue and release it when it drops to zero.
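A hedged sketch of that KVO cleanup (manual retain/release, matching the snippet above):

// after swapping in the new queue:
[oldQueue addObserver:self
           forKeyPath:@"operationCount"
              options:NSKeyValueObservingOptionNew
              context:NULL];

- (void)observeValueForKeyPath:(NSString *)keyPath
                      ofObject:(id)object
                        change:(NSDictionary *)change
                       context:(void *)context
{
    NSOperationQueue *drainedQueue = object;
    if ([keyPath isEqualToString:@"operationCount"] && drainedQueue.operationCount == 0) {
        [drainedQueue removeObserver:self forKeyPath:@"operationCount"];
        [drainedQueue release]; // the drained queue can now be disposed of
    }
}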
I'm not sure if you're still looking for a solution, but the same problem has been bugging me for a while, so I went ahead and implemented an operation stack here: https://github.com/cbrauchli/CBOperationStack. I've used it with a few hundred download operations and it has held up well.
Sadly, you cannot do that without running into some tricky issues, because:
Important You should always configure dependencies before running your operations or adding them to an operation queue. Dependencies added afterward may not prevent a given operation object from running. (From: Concurrency Programming Guide: Configuring Interoperation Dependencies)
Take a look at this related question: AFURLConnectionOperation 'start' method gets called before it is ready and never gets called again afterwards
Found a neat implementation of stack/LIFO features on top of NSOperationQueue. It can be used as a category that extends NSOperationQueue or an NSOperationQueue LIFO subclass.
https://github.com/nicklockwood/NSOperationStack
The easiest way would be to separate your operations from the data you will be processing, so you can add operations to the NSOperationQueue as usual and then take the data from a stack or whatever other data structure you need.
var tasks: [MyTask]
...

func startOperation() {
    myQueue.addOperation {
        guard let task = tasks.last else {
            return
        }
        tasks.removeLast()
        task.perform()
    }
}
Now, obviously, you might need to ensure that the tasks collection can be used concurrently, but that is a much more common problem, with lots of ready-made solutions, than hacking your way around NSOperationQueue's execution order.

Recursive synchronization using GCD

I have some model classes that I need to synchronize. There is a main object of type Library that contains several Album objects (imagine a music library, for example). Both the library and albums support serialization by implementing the NSCoding protocol. I need to synchronize modifications to the library with album modifications and the serialization of both classes, so that I know the updates don’t step on each other’s toes and that I don’t serialize objects in the middle of an update.
I thought I would just pass all the objects a shared dispatch queue, dispatch_async all the setter code and dispatch_sync the getters. This is simple and easy, but it does not work, as the program flow is recursive:
// In the Library class
- (void)encodeWithCoder:(NSCoder *)encoder
{
    dispatch_sync(queue, ^{
        [encoder encodeObject:albums forKey:…];
    });
}

// In the Album class, same queue as above
- (void)encodeWithCoder:(NSCoder *)encoder
{
    dispatch_sync(queue, ^{
        [encoder encodeObject:items forKey:…];
    });
}
Now serializing the library triggers the album serialization, and since both methods use dispatch_sync on the same queue, the code deadlocks. I have seen this pattern somewhere:
- (void)runOnSynchronizationQueue:(dispatch_block_t)block
{
    if (dispatch_get_current_queue() == queue) {
        block();
    } else {
        dispatch_sync(queue, block);
    }
}
Does it make sense, will it work, is it safe? Is there an easier way to do the synchronization?
For an exposition on recursive locks in GCD, see the Recursive Locks section of the dispatch_async man page. To briefly summarize it, it's generally a good idea to rethink your object hierarchies when something like this occurs.
You can also use dispatch_set_target_queue() to control the hierarchy of execution (targeting subordinate queues at higher-level queues) once you've refactored your code so that it's the operations, rather than the objects, that need to be controlled. That assumes you weren't able to simply use completion callbacks to accomplish the same synchronization effect, which, frankly, is the more recommended route, since abstract queue hierarchies can be difficult to conceptualize and debug.
I know that's not really the answer you were looking for, but you're kind of in a "you can't get there from here" situation with GCD and a more fundamental rethink of how to do things "the GCD way" is almost certainly necessary in this case.
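As an aside: the dispatch_get_current_queue() call in the pattern you quoted is deprecated and unreliable. If you do end up keeping a run-inline-if-already-on-the-queue check while you refactor, a common replacement (sketched here; the key name is assumed) is to tag the queue with dispatch_queue_set_specific() and test for the tag with dispatch_get_specific(); fmdb uses this same technique:

static void *kLibraryQueueKey = &kLibraryQueueKey;

// at queue-creation time:
queue = dispatch_queue_create("com.example.library", DISPATCH_QUEUE_SERIAL);
dispatch_queue_set_specific(queue, kLibraryQueueKey, kLibraryQueueKey, NULL);

- (void)runOnSynchronizationQueue:(dispatch_block_t)block
{
    if (dispatch_get_specific(kLibraryQueueKey) == kLibraryQueueKey) {
        block(); // already on the queue: run inline instead of deadlocking
    } else {
        dispatch_sync(queue, block);
    }
}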