Asynchronous Cocoa - Preventing "simple" (obvious) deadlocks in NSOperation? - objective-c

When subclassing NSOperation to get a little chunk of work done, I've found out it's pretty easy to deadlock. Below I have a toy example that's pretty easy to understand why it never completes.
I can only seem to think through solutions that prevent the deadlock from the caller perspective, never the callee. For example, the caller could continue to run the run loop, not wait for finish, etc. If the main thread needs to be message synchronously during the operation, I'm wondering if there is a canonical solution that an operation subclasser can implement to prevent this type of deadlocking. I'm only just starting to dip my toe in async programming...
#interface ToyOperation : NSOperation
#end
#implementation ToyOperation
- (void)main
{
// Lots of work
NSString *string = #"Important Message";
[self performSelector:#selector(sendMainThreadSensitiveMessage:) onThread:[NSThread mainThread] withObject:string waitUntilDone:YES];
// Lots more work
}
- (void)sendMainThreadSensitiveMessage:(NSString *)string
{
// Update the UI or something that requires the main thread...
}
#end
- (int)main
{
ToyOperation *op = [[ToyOperation alloc] init];
NSOperationQueue *opQ = [[NSOperationQueue alloc] init];
[opQ addOperations: #[ op ] waitUntilFinished:YES]; // Deadlock
return;
}

If the main thread needs to be message synchronously during the
operation, I'm wondering if there is a canonical solution that an
operation subclasser can implement to prevent this type of
deadlocking.
There is. Never make a synchronous call to the main queue. And a follow-on: Never make a synchronous call from the main queue. And, really, it can be summed up as Never make a synchronous call from any queue to any other queue.
By doing that, you guarantee that the main queue is not blocked. Sure, there may be an exceptional case that tempts you to violate this rule and, even, cases where it really, truly, is unavoidable. But that very much should be the exception because even a single dispatch_sync() (or NSOpQueue waitUntilDone) has the potential to deadlock.
Of course, data updates from queue to queue can be tricky. There are several options; a concurrency safe data layer (very hard), only passing immutable objects or copies of the data (typically to the main queue for display purposes -- fairly easy, but potentially expensive), or you can go down the UUID based faulting like model that Core Data uses. Regardless of how you solve this, the problem isn't anything new compared to any other concurrency model.
The one exception is when replacing locks with queues (For example, instead of using #synchronized() internally to a class, use a serial GCD queue and use dispatch_sync() to that queue anywhere that a synchronized operation must take place. Faster and straightforward.). But this isn't so much an exception as solving a completely different problem.

Related

Do I need to use locks if the methods are already executed in a serial queue?

I have 2 method calls exposed in my library as follows:
-(void) createFile {
dispatch_async(serialQueue, ^{
[fileObj createFile:fileInfo completion:^(void){
//completion block C1
}];
});
}
-(void) readFile:(NSUInteger)timeStamp {
dispatch_async(serialQueue, ^{
[fileObj readFile:timeStamp completion:^(void){
//completion block C2
}];
});
}
Now the createFile:fileInfo:completion and readFile:timeStamp:completion are in turn XPC calls that call into a process P1. Their implementation inside the process P1 looks like this:
#interface FileObject : NSObject
+ (instancetype) sharedInstance;
#property (nonatomic, strong) NSData *fileContents;
#end
#implementation FileObject
+ (instancetype)sharedInstance
{
static FileObject *sharedObj = nil;
static dispatch_once_t onceToken;
dispatch_once(&onceToken, ^{
sharedObj = [[self alloc] init];
});
return sharedObj;
}
- (void)createFile:(FileInfo *)fileInfo
completion:(void (^))completion {
FileObject *f = [FileObject sharedInstance];
//lock?
f.fileContents = //call some method;
//unlock
}
- (void)readFile:(NSUInteger)timeStamp
completion:(void (^))completion {
FileObject *f = [FileObject sharedInstance];
//do stuff with f.fileContents
}
#end
The point to be noted is that,a call to method createFile is capable of making modifications to the singleton obj and readFile:fileInfo is always called after createFile(serial queue used at invocation).
My question is given the serial queue usage for the two methods exposed in my library,
Do I still need to lock and unlock when I modify f.fileContents in readFileInfo:FileInfo:completion?
How about if multiple different processes call into my library? Will XPC to P1 handle that automatically or should I have to do something about it ?
You asked:
Do I need to use locks if the methods are already executed in a serial queue?
If you can interact with non thread-safe objects at the same time from multiple threads, yes, you can use locks (or a variety of other techniques) to synchronize your access. If you're doing all of your interaction from a serial queue, though, no locks are needed.
But you need to be careful because although you're using a serial queue, you've got completion handlers that could be running on different threads. But if you coordinate the interaction on those completion handlers so that none can ever run at the same time (more on that, below), then no locks (or other synchronization techniques) are needed.
A bunch of other thoughts based upon your code snippets (and conversations we've had elsewhere):
You're still using the GCD dispatch calls with asynchronous methods. There's a cognitive dissonance there. Either
lose the dispatch queues altogether and use completion handler patterns (this is the better solution IMHO, but you say you can't change your endpoints which apparently precludes this more logical approach); or
make createFile and readFile behave synchronously (with semaphores, for example) and then you can take advantage of the GCD queue behaviors (we generally eschew patterns like locks, semaphores, or anything that will "wait", but given that you're doing this on background queue, it's less problematic).
If you want these two fundamentally asynchronous tasks to behave synchronously, you can achieve this with locks, too, but I think dispatch semaphores are more the logical pattern if you're going to use dispatch queues.
The better pattern would be to make createFile and readFile create custom, asynchronous, NSOperation subclasses. This is a much stronger pattern than locks or semaphores in GCD because it minimizes the deadlocking risks.
Completely unrelated to the question here, I would not suggest making FileObject a singleton. Each FileObject is associated with a particular file, and it has no business being a singleton. This is especially important because we discovered offline that you're contemplating this being an XPC service with multiple requests coming in, and the singleton pattern is antithetical to that.
Plus, as an aside, FileObject doesn't pass the basic criteria for when you'd use singletons. For discussion on the considerations of singletons (which we don't need to repeat here, especially since it's unrelated to your main question), see What is so bad about singletons? Singleton's have their place, but this is a scenario where you likely will want separate instances, so the singleton pattern seems especially ill-suited.

convert pthread to objective-c

Im trying to convert the following to objective-c code.
This is the current thread I have in C and works fine
//calling EnrollThread method on a thread in C
pthread_t thread_id;
pthread_create( &thread_id, NULL, EnrollThread, pParams );
//What the EnrollThread method structure looks like in C
void* EnrollThread( void *arg )
What my method structure looks like now that I've changed it to objective-c
-(void)enrollThreadWithParams:(LPBIOPERPARAMS)params;
Now I'm not sure how to call this objective-c method with the pthread_create call.
I've tried something like this:
pthread_create( &thread_id, NULL, [refToSelf enrollThreadWithParams:pParams], pParams );
But I believe I have it wrong. Can anyone enlighten me on why this does not work and what it is I need to do to fix it so that I can create my thread in the background? My UI is getting locked until the method finishes what it's doing.
I was thinking of also using dispatch_sync but I haven't tried that.
In objective C you don't really use pthread_create, although you can still use it, but the thread entry point needs to be a C function, so I'm not sure if this would be the best approach.
There are many options, as you can read in the Threading and Concurrency documents.
performSelectorInBackground method of NSObject (and subclasses)
dispatch_async (not dispatch_sync as you mentioned)
NSOperation and NSOperationQueue
NSThread class
I would suggest giving it a shot to the first one, since it is the easiest, and very straightforward, also the second one is very easy because you don't have to create external objects, you just place inline the code to be executed in parallel.
The go to reference for concurrent programming is the Concurrency Programming Guide which walks you through dispatch queues (known as Grand Central Dispatch, GCD) and operation queues. Both are incredibly easy to use and offer their own respective advantages.
In their simplest forms, both of these are pretty easy to use. As others have pointed out, the process for creating a dispatch queue and then dispatching something to that queue is:
dispatch_queue_t queue = dispatch_queue_create("com.domain.app", DISPATCH_QUEUE_CONCURRENT);
dispatch_async(queue, ^{
// something to do in the background
});
The operation queue equivalent is:
NSOperationQueue *queue = [[NSOperationQueue alloc] init];
[queue addOperationWithBlock:^{
// something to do in the background
}];
Personally, I prefer operation queues where:
I need controlled/limited concurrency (i.e. I'm going to dispatch a bunch of things to that queue and I want them to run concurrent with respect to not only the main queue, but also with respect to each other, but I don't want more than a few of those running simultaneously). A good example would be when doing concurrent network requests, where you want them running concurrently (because you get huge performance benefit) but you generally don't want more than four of them running at any given time). With an operation queue, one can specify maxConcurrentOperationCount whereas this tougher to do with GCD.
I need fine level of control over dependencies. For example, I'm going to start operations A, B, C, D, and E, but B is dependent on A (i.e. B shouldn't start before A finishes), D is dependent upon C, and E is dependent upon both B and D finishing.
I need to enjoy concurrency on tasks that, themselves, run asynchronously. Operations offer a fine degree of control over what determines when the operation is to be declared as isFinished with the use of NSOperation subclass that uses "concurrent operations". A common example is the network operation which, if you use the delegate-based implementation, runs asynchronously, but you still want to use operations to control the flow of one to the next. The very nice networking library, AFNetworking, for example, uses operations extensively, for this reason.
On the other hand, GCD is great for simple one-off asynchronous tasks (because you can avail yourself of built-in "global queues", freeing yourself from making your own queue), serial queues for synchronizing access to some shared resource, dispatch sources like timers, signaling between threads with semaphores, etc. GCD is generally where people get started with concurrent programming in Cocoa and Cocoa Touch.
Bottom line, I personally use operation queues for application-level asynchronous operations (network queues, image processing queues, etc.), where the degree of concurrency becomes and important issue). I tend to use GCD for lower-level stuff or quick and simple stuff. GCD (with dispatch_async) is a great place to start as you dip your toe into the ocean of concurrent programming, so go for it.
There are two things I'd encourage you to be aware of, regardless of which of these two technologies you use:
First, remember that (in iOS at least) you always want to do user interface tasks on the main queue. So the common patterns are:
dispatch_async(queue, ^{
// do something slow here
// when done, update the UI and model objects on the main queue
dispatch_async(dispatch_get_main_queue(), ^{
// UI and model updates can go here
});
});
or
[queue addOperationWithBlock:^{
// do something slow here
// when done, update the UI and model objects on the main queue
[[NSOperationQueue mainQueue] addOperationWithBlock:^{
// do UI and model updates here
}];
}];
The other important issue to consider is synchronization and "thread-safety". (See the Synchronization section of the Threading Programming Guide.) You want to make sure that you don't, for example, have the main thread populating some table view while, at the same time, some background queue is changing the data used by that table view at the same time. You want to make sure that while any given thread is using some model object or other shared resource, that another thread isn't mutating it, leaving it in some inconsistent state.
There's too much to cover in the world of concurrent programming. The WWDC videos (including 2011 and 2012) offer some great background on GCD and asynchronous programming patterns, so make sure you avail yourself of that great resource.
If you already have working code, there is no reason to abandon pthreads. You should be able to use it just fine.
If, however, you want an alternative, but you want to keep your existing pthread entry point, you can do this easily enough...
dispatch_queue_t queue = dispatch_queue_create("EnrollThread", DISPATCH_QUEUE_SERIAL);
dispatch_async(queue, ^{
EnrollThread(parms);
});

Does performBlockAndWait always done on the same thread [duplicate]

I have an NSManagedObjectContext declared like so:
- (NSManagedObjectContext *) backgroundMOC {
if (backgroundMOC != nil) {
return backgroundMOC;
}
backgroundMOC = [[NSManagedObjectContext alloc] initWithConcurrencyType:NSPrivateQueueConcurrencyType];
return backgroundMOC;
}
Notice that it is declared with a private queue concurrency type, so its tasks should be run on a background thread. I have the following code:
-(void)testThreading
{
/* ok */
[self.backgroundMOC performBlock:^{
assert(![NSThread isMainThread]);
}];
/* CRASH */
[self.backgroundMOC performBlockAndWait:^{
assert(![NSThread isMainThread]);
}];
}
Why does calling performBlockAndWait execute the task on the main thread rather than background thread?
Tossing in another answer, to try an explain why performBlockAndWait will always run in the calling thread.
performBlock is completely asynchronous. It will always enqueue the block onto the queue of the receiving MOC, and then return immediately. Thus,
[moc performBlock:^{
// Foo
}];
[moc performBlock:^{
// Bar
}];
will place two blocks on the queue for moc. They will always execute asynchronously. Some unknown thread will pull blocks off of the queue and execute them. In addition, those blocks are wrapped within their own autorelease pool, and also they will represent a complete Core Data user event (processPendingChanges).
performBlockAndWait does NOT use the internal queue. It is a synchronous operation that executes in the context of the calling thread. Of course, it will wait until the current operations on the queue have been executed, and then that block will execute in the calling thread. This is documented (and reasserted in several WWDC presentations).
Furthermore, performBockAndWait is re-entrant, so nested calls all happen right in that calling thread.
The Core Data engineers have been very clear that the actual thread in which a queue-based MOC operation runs is not important. It's the synchronization by using the performBlock* API that's key.
So, consider 'performBlock' as "This block is being placed on a queue, to be executed at some undetermined time, in some undetermined thread. The function will return to the caller as soon as it has been enqueued"
performBlockAndWait is "This block will be executed at some undetermined time, in this exact same thread. The function will return after this code has completely executed (which will occur after the current queue associated with this MOC has drained)."
EDIT
Are you sure of "performBlockAndWait does NOT use the internal queue"?
I think it does. The only difference is that performBlockAndWait will
wait until the block's completion. And what do you mean by calling
thread? In my understanding, [moc performBlockAndWait] and [moc
performBloc] both run on its private queue (background or main). The
important concept here is moc owns the queue, not the other way
around. Please correct me if I am wrong. – Philip007
It is unfortunate that I phrased the answer as I did, because, taken by itself, it is incorrect. However, in the context of the original question it is correct. Specifically, when calling performBlockAndWait on a private queue, the block will execute on the thread that called the function - it will not be put on the queue and executed on the "private thread."
Now, before I even get into the details, I want to stress that depending on internal workings of libraries is very dangerous. All you should really care about is that you can never expect a specific thread to execute a block, except anything tied to the main thread. Thus, expecting a performBlockAndWait to not execute on the main thread is not advised because it will execute on the thread that called it.
performBlockAndWait uses GCD, but it also has its own layer (e.g., to prevent deadlocks). If you look at the GCD code (which is open source), you can see how synchronous calls work - and in general they synchronize with the queue and invoke the block on the thread that called the function - unless the queue is the main queue or a global queue. Also, in the WWDC talks, the Core Data engineers stress the point that performBlockAndWait will run in the calling thread.
So, when I say it does not use the internal queue, that does not mean it does not use the data structures at all. It must synchronize the call with the blocks already on the queue, and those submitted in other threads and other asynchronous calls. However, when calling performBlockAndWait it does not put the block on the queue... instead it synchronizes access and runs the submitted block on the thread that called the function.
Now, SO is not a good forum for this, because it's a bit more complex than that, especially w.r.t the main queue, and GCD global queues - but the latter is not important for Core Data.
The main point is that when you call any performBlock* or GCD function, you should not expect it to run on any particular thread (except something tied to the main thread) because queues are not threads, and only the main queue will run blocks on a specific thread.
When calling the core data performBlockAndWait the block will execute in the calling thread (but will be appropriately synchronized with everything submitted to the queue).
I hope that makes sense, though it probably just caused more confusion.
EDIT
Furthermore, you can see the unspoken implications of this, in that the way in which performBlockAndWait provides re-entrant support breaks the FIFO ordering of blocks. As an example...
[context performBlockAndWait:^{
NSLog(#"One");
[context performBlock:^{
NSLog(#"Two");
}];
[context performBlockAndWait:^{
NSLog(#"Three");
}];
}];
Note that strict adherence to the FIFO guarantee of the queue would mean that the nested performBlockAndWait ("Three") would run after the asynchronous block ("Two") since it was submitted after the async block was submitted. However, that is not what happens, as it would be impossible... for the same reason a deadlock ensues with nested dispatch_sync calls. Just something to be aware of if using the synchronous version.
In general, avoid sync versions whenever possible because dispatch_sync can cause a deadlock, and any re-entrant version, like performBlockAndWait will have to make some "bad" decision to support it... like having sync versions "jump" the queue.
Why not? Grand Central Dispatch's block concurrency paradigm (which I assume MOC uses internally) is designed so that only the runtime and operating system need to worry about threads, not the developer (because the OS can do it better than you can do to having more detailed information). Too many people assume that queues are the same as threads. They are not.
Queued blocks are not required to run on any given thread (the exception being blocks in the main queue must execute on the main thread). So, in fact, sometimes sync (i.e. performBlockAndWait) queued blocks will run on the main thread if the runtime feels it would be more efficient than creating a thread for it. Since you are waiting for the result anyway, it wouldn't change the way your program functioned if the main thread were to hang for the duration of the operation.
This last part I am not sure if I remember correctly, but in the WWDC 2011 videos about GCD, I believe that it was mentioned that the runtime will make an effort to run on the main thread, if possible, for sync operations because it is more efficient. In the end though, I suppose the answer to "why" can only be answered by the people who designed the system.
I don't think that the MOC is obligated to use a background thread; it's just obligated to ensure that your code will not run into concurrency issues with the MOC if you use performBlock: or performBlockAndWait:. Since performBlockAndWait: is supposed to block the current thread, it seems reasonable to run that block on that thread.
The performBlockAndWait: call only makes sure that you execute the code in such a way that you don't introduce concurrency (i.e. on 2 threads performBlockAndWait: will not run at the same time, they will block each other).
The long and the short of it is that you can't depend on which thread a MOC operation runs on, well basically ever. I've learned the hard way that if you use GCD or just straight up threads, you always have to create local MOCs for each operation and then merge them to the master MOC.
There is a great library (MagicalRecord) that makes that process very simple.

Multi-threaded Objective-C accessors: GCD vs locks

I'm debating whether or not to move to a GCD-based pattern for multi-threaded accessors. I've been using custom lock-based synchronization in accessors for years, but I've found some info (Intro to GCD) and there seems to be pros of a GCD-based approach. I'm hoping to start a dialog here to help myself and others weigh the decision.
The pattern looks like:
- (id)something
{
__block id localSomething;
dispatch_sync(queue, ^{
localSomething = [something retain];
});
return [localSomething autorelease];
}
- (void)setSomething:(id)newSomething
{
dispatch_async(queue, ^{
if(newSomething != something)
{
[something release];
something = [newSomething retain];
[self updateSomethingCaches];
}
});
}
On the pro side: you get the benefit of, possibly, non-blocking write access; lower overhead than locks (maybe?); safety from forgetting to unlock before returning from critical code sections; others?
Cons: Exception handling is non-existent so you have to code this into every block in which you might need it.
Is this pattern potentially the recommended method of writing multithreaded accessors?
Are there standard approaches for creating dispatch queues for this purpose? In other words, best practices for trading off granularity? With locks, for example, locking on each attribute is more fine grain than locking on the entire object. With dispatch queues, I could imagine that creation of a single queue for all objects would create performance bottlenecks, so are per-object queues appropriate? Obviously , the answer is highly dependent on the specific application, but are there known performance tradeoffs to help gauge the feasibility of the approach.
Any information / insight would be appreciated.
Is this pattern potentially the recommended method of writing
multithreaded accessors?
I guess you wrote that with a serial queue in mind, but there is no reason for it. Consider this:
dispatch_queue_t queue = dispatch_queue_create("com.example", DISPATCH_QUEUE_CONCURRENT);
// same thing as your example
- (NSString*)something {
__block NSString *localSomething;
dispatch_sync(queue, ^{
localSomething = _something;
});
return localSomething;
}
- (void)setSomething:(NSString*)something {
dispatch_barrier_async(queue, ^{
_something = something;
});
}
It reads concurrently but uses a dispatch barrier to disable concurrency while the write is happening. A big advantage of GCD is that allows concurrent reads instead locking the whole object like #property (atomic) does.
Both asyncs (dispatch_async, dispatch_barrier_async) are faster from the client point of view, but slower to execute than a sync because they have to copy the block, and having the block such a small task, the time it takes to copy becomes meaningful. I rather have the client returning fast, so I'm OK with it.

use NSOperationQueue as a LIFO stack?

i need to do a series of url calls (fetching WMS tiles). i want to use a LIFO stack so the newest url call is the most important. i want to display the tile on the screen now, not a tile that was on the screen 5 seconds ago after a pan.
i can create my own stack from a NSMutableArray, but i'm wondering if a NSOperationQueue can be used as a LIFO stack?
You can set the priority of operations in an operation queue using -[NSOperation setQueuePriority:]. You'll have to rejigger the priorities of existing operations each time you add an operation, but you can achieve something like what you're looking for. You'd essentially demote all of the old ones and give the newest one highest priority.
Sadly I think NSOperationQueues are, as the name suggests, only usable as queues — not as stacks. To avoid having to do a whole bunch of manual marshalling of tasks, probably the easiest thing is to treat your queues as though they were immutable and mutate by copying. E.g.
- (NSOperationQueue *)addOperation:(NSOperation *)operation toHeadOfQueue:(NSOperationQueue *)queue
{
// suspending a queue prevents it from issuing new operations; it doesn't
// pause any already ongoing operations. So we do this to prevent a race
// condition as we copy operations from the queue
queue.suspended = YES;
// create a new queue
NSOperationQueue *mutatedQueue = [[NSOperationQueue alloc] init];
// add the new operation at the head
[mutatedQueue addOperation:operation];
// copy in all the preexisting operations that haven't yet started
for(NSOperation *operation in [queue operations])
{
if(!operation.isExecuting)
[mutatedQueue addOperation:operation];
}
// the caller should now ensure the original queue is disposed of...
}
/* ... elsewhere ... */
NSOperationQueue *newQueue = [self addOperation:newOperation toHeadOfQueue:operationQueue];
[operationQueue release];
operationQueue = newQueue;
It seems at present that releasing a queue that is still working (as will happen to the old operation queue) doesn't cause it to cancel all operations, but that's not documented behaviour so probably not trustworthy. If you want to be really safe, key-value observe the operationCount property on the old queue and release it when it goes to zero.
I'm not sure if you're still looking a solution, but I've the same problem has been bugging me for a while, so I went ahead and implemented an operation stack here: https://github.com/cbrauchli/CBOperationStack. I've used it with a few hundred download operations and it has held up well.
Sadly, you cannot do that without running into some tricky issues, because:
Important You should always configure dependencies before running your operations or adding them to an operation queue. Dependencies added afterward may not prevent a given operation object from running. (From: Concurrency Programming Guide: Configuring Interoperation Dependencies)
Take a look at this related question: AFURLConnectionOperation 'start' method gets called before it is ready and never gets called again afterwards
Found a neat implementation of stack/LIFO features on top of NSOperationQueue. It can be used as a category that extends NSOperationQueue or an NSOperationQueue LIFO subclass.
https://github.com/nicklockwood/NSOperationStack
The easiest way would be to separate your operations and data you will be processing, so you can add operations to NSOperationQueue as usual and then take data from a stack or any other data structure you need.
var tasks: [MyTask]
...
func startOperation() {
myQueue.addOperation {
guard let task = tasks.last else {
return
}
tasks.removeLast()
task.perform()
}
}
Now, obviously, you might need to ensure that tasks collection can be used concurrently, but it's a much more common problem with lots of pre-made solutions than hacking your way around NSOperationQueue execution order.