Use NSOperationQueue as a LIFO stack? - objective-c

I need to do a series of URL calls (fetching WMS tiles). I want to use a LIFO stack so the newest URL call is the most important. I want to display the tile that belongs on the screen now, not a tile that was on the screen 5 seconds ago, before a pan.
I can create my own stack from an NSMutableArray, but I'm wondering if an NSOperationQueue can be used as a LIFO stack?

You can set the priority of operations in an operation queue using -[NSOperation setQueuePriority:]. You'll have to rejigger the priorities of existing operations each time you add an operation, but you can achieve something like what you're looking for. You'd essentially demote all of the old ones and give the newest one highest priority.
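A minimal sketch of that approach (the method -addTileOperation:toQueue: is an invented name, not framework API). One caveat: operations of equal priority still run FIFO among themselves, so repeated demotion flattens the relative ordering of the older requests.
- (void)addTileOperation:(NSOperation *)newOperation toQueue:(NSOperationQueue *)queue
{
    // demote every operation that hasn't started yet
    for (NSOperation *existing in queue.operations) {
        if (!existing.isExecuting && !existing.isFinished) {
            existing.queuePriority = NSOperationQueuePriorityLow;
        }
    }
    // the newest request jumps to the front
    newOperation.queuePriority = NSOperationQueuePriorityVeryHigh;
    [queue addOperation:newOperation];
}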

Sadly I think NSOperationQueues are, as the name suggests, only usable as queues — not as stacks. To avoid having to do a whole bunch of manual marshalling of tasks, probably the easiest thing is to treat your queues as though they were immutable and mutate by copying. E.g.
- (NSOperationQueue *)addOperation:(NSOperation *)operation toHeadOfQueue:(NSOperationQueue *)queue
{
    // suspending a queue prevents it from issuing new operations; it doesn't
    // pause any already ongoing operations. So we do this to prevent a race
    // condition as we copy operations from the queue
    queue.suspended = YES;

    // create a new queue
    NSOperationQueue *mutatedQueue = [[NSOperationQueue alloc] init];

    // add the new operation at the head
    [mutatedQueue addOperation:operation];

    // copy in all the preexisting operations that haven't yet started
    for (NSOperation *existingOperation in [queue operations])
    {
        if (!existingOperation.isExecuting)
            [mutatedQueue addOperation:existingOperation];
    }

    // the caller should now ensure the original queue is disposed of...
    return mutatedQueue;
}

/* ... elsewhere ... */
NSOperationQueue *newQueue = [self addOperation:newOperation toHeadOfQueue:operationQueue];
[operationQueue release];
operationQueue = newQueue;
It seems at present that releasing a queue that is still working (as will happen to the old operation queue) doesn't cause it to cancel all operations, but that's not documented behaviour so probably not trustworthy. If you want to be really safe, key-value observe the operationCount property on the old queue and release it when it goes to zero.
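A sketch of that KVO approach, assuming manual reference counting to match the -release calls above (the observer wiring shown is illustrative):
// start observing before relinquishing your own reference
[oldQueue addObserver:self
           forKeyPath:@"operationCount"
              options:NSKeyValueObservingOptionNew
              context:NULL];

- (void)observeValueForKeyPath:(NSString *)keyPath ofObject:(id)object change:(NSDictionary *)change context:(void *)context
{
    // once the old queue has drained, stop observing and release it
    if ([keyPath isEqualToString:@"operationCount"] && [(NSOperationQueue *)object operationCount] == 0) {
        [object removeObserver:self forKeyPath:@"operationCount"];
        [object release];
    }
}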

I'm not sure if you're still looking for a solution, but the same problem had been bugging me for a while, so I went ahead and implemented an operation stack here: https://github.com/cbrauchli/CBOperationStack. I've used it with a few hundred download operations and it has held up well.

Sadly, you cannot do that without running into some tricky issues, because:
Important You should always configure dependencies before running your operations or adding them to an operation queue. Dependencies added afterward may not prevent a given operation object from running. (From: Concurrency Programming Guide: Configuring Interoperation Dependencies)
Take a look at this related question: AFURLConnectionOperation 'start' method gets called before it is ready and never gets called again afterwards

Found a neat implementation of stack/LIFO behaviour on top of NSOperationQueue. It can be used either as a category that extends NSOperationQueue or as a LIFO subclass of NSOperationQueue.
https://github.com/nicklockwood/NSOperationStack

The easiest way would be to separate your operations from the data you will be processing, so you can add operations to NSOperationQueue as usual and then take the data from a stack or any other data structure you need.
var tasks: [MyTask]
...
func startOperation() {
    myQueue.addOperation {
        guard let task = tasks.last else {
            return
        }
        tasks.removeLast()
        task.perform()
    }
}
Now, obviously, you might need to ensure that the tasks collection can be used concurrently, but that's a much more common problem, with lots of pre-made solutions, than hacking your way around NSOperationQueue's execution order.

Related

Can NSBlockOperation cancel itself while executing, thus canceling dependent NSOperations?

I have a chain of many NSBlockOperations with dependencies. If one operation early in the chain fails, I want the other operations to not run. According to the docs, this should be easy to do from the outside: if I cancel an operation, all dependent operations should automatically be cancelled.
However, if only the execution block of my operation "knows" that it failed while executing, can it cancel its own work?
I tried the following:
NSBlockOperation *op = [[NSBlockOperation alloc] init];
__weak NSBlockOperation *weakOpRef = op;
[op addExecutionBlock:^{
    LOGInfo(@"Say Cheese...");
    if (some_condition == NO) { // for some reason we can't take a photo
        [weakOpRef cancel];
        LOGError(@"Photo failed");
    }
    else {
        // take photo, process it, etc.
        LOGInfo(@"Photo taken");
    }
}];
However, when I run this, other operations dependent on op execute even though op was cancelled. Since they are dependent, surely they don't start before op finishes, and I verified (in the debugger and using logs) that the isCancelled state of op is YES before the block returns. Still the queue executes them as if op finished successfully.
I then further challenged the docs, like thus:
NSOperationQueue *myQueue = [[NSOperationQueue alloc] init];
NSBlockOperation *op = [[NSBlockOperation alloc] init];
__weak NSBlockOperation *weakOpRef = op;
[op addExecutionBlock:^{
    NSLog(@"Say Cheese...");
    if (weakOpRef.isCancelled) { // Fail every once in a while...
        NSLog(@"Photo failed");
    }
    else {
        [NSThread sleepForTimeInterval:0.3f];
        NSLog(@"Photo taken");
    }
}];
NSOperation *processPhoto = [NSBlockOperation blockOperationWithBlock:^{
    NSLog(@"Processing Photo...");
    [NSThread sleepForTimeInterval:0.1f]; // Process
    NSLog(@"Processing Finished.");
}];
// setup dependencies for the operations.
[processPhoto addDependency:op];
[op cancel]; // cancelled even before dispatching!!!
[myQueue addOperation:op];
[myQueue addOperation:processPhoto];
NSLog(@">>> Operations Dispatched, Wait for processing");
[myQueue waitUntilAllOperationsAreFinished];
NSLog(@">>> Work Finished");
But I was horrified to see the following output in the log:
2020-11-05 16:18:03.803341 >>> Operations Dispatched, Wait for processing
2020-11-05 16:18:03.803427 Processing Photo...
2020-11-05 16:18:03.813557 Processing Finished.
2020-11-05 16:18:03.813638+0200 TesterApp[6887:111445] >>> Work Finished
Pay attention: the cancelled op never ran - but the dependent processPhoto executed anyway, despite its dependency on op.
Ideas anyone?
OK. I think I solved the mystery. I just misunderstood the [NSOperation cancel] documentation.
It says:
In macOS 10.6 and later, if an operation is in a queue but waiting on unfinished dependent operations, those operations are subsequently ignored. Because it is already cancelled, this behavior allows the operation queue to call the operation's start method sooner and clear the object out of the queue. If you cancel an operation that is not in a queue, this method immediately marks the object as finished. In each case, marking the object as ready or finished results in the generation of the appropriate KVO notifications.
I thought if operation B depends on operation A - it implies that if A is canceled (hence - A didn't finish its work) then B should be cancelled as well, because semantically it can't start until A completes its work.
Apparently, that was just wishful thinking...
What the documentation says is different. When you cancel operation B (which depends on operation A), then despite being dependent on A, it won't wait for A to finish before it's removed from the queue. If operation A has started but hasn't finished yet, cancelling B will remove B immediately from the queue, because B will now ignore its dependencies (the completion of A).
Soooo... to accomplish my scheme, I will need to introduce my own "dependencies" mechanism. The straightforward way is to introduce a set of boolean properties like isPhotoTaken, isPhotoProcessed, isPhotoColorAnalyzed etc. Then an operation dependent on these pre-processing actions will need to check, in the preamble of its execution block, whether all required previous operations actually finished successfully, and cancel itself otherwise.
However, it may be worth subclassing NSBlockOperation, overriding start to skip straight to finished if any of the dependencies has been cancelled.
Initially I thought this is a long shot and may be hard to implement, but fortunately, I wrote this quick subclass, and it seems to work fine. Of course deeper inspection and stress tests are due:
@interface MYBlockOperation : NSBlockOperation
@end

@implementation MYBlockOperation
- (void)start {
    // @sum adds up the (boolean) cancelled flags of all dependencies;
    // a non-zero total means at least one dependency was cancelled.
    if ([[self valueForKeyPath:@"dependencies.@sum.cancelled"] intValue] > 0)
        [self cancel];
    [super start];
}
@end
When I substitute NSBlockOperation with MYBlockOperation in the original question (and my other tests), the behaviour is the one I described and expected.
If you cancel an operation, you merely hint that it is done; especially with long-running tasks you have to implement the cancellation logic yourself. If you cancel something, its dependents will consider the task finished and run, no problem.
So what you need to do is have some kind of shared synced variable that you set and get in a synced fashion, and that captures your logic. Your running operations should check that variable periodically, and at critical points, and exit themselves. Please don't use an actual global, but some common variable that all the operations can access - I presume you will be comfortable implementing this?
Cancel is not a magic bullet that stops the operation from running; it is merely a hint to the scheduler that allows it to optimise things. The cancelling you must do yourself.
This is the explanation; I could give a sample implementation, but I think you are able to do that on your own by looking at the code?
EDIT
If you have a lot of blocks that are dependent and execute sequentially, you do not even need an operation queue, or you only need a serial (one operation at a time) queue. If the blocks execute sequentially but are very different, then you should rather work on the logic of NOT adding new blocks once the condition fails.
EDIT 2
Just some idea of how I suggest you tackle this. Of course the detail matters, but this is also a nice and direct way of doing it. This is sort of pseudocode, so don't get lost in the syntax.
// Do it all in a class if possible, not a subclass of NSOpQueue
class A
    // Members
    queue
    // job1
    synced state cancel1   // e.g. triggered by UI
    synced state counter1
    state calc1 that job1 calculates (and job2 needs)
    synced state cancel2
    synced state counter2
    state calc2 that job2 calculates (and job3 needs)
    ...

    start
        start on queue
            schedule job1.1 on (any) queue
                periodically check cancel1 and exit
                update calc1
                when done or on exit, increase counter1
            schedule job1.2 on (any) queue
                same
            schedule job1.3
                same
            wait on counter1 to reach the number of job1.x tasks
            check cancel1 and exit early
            // When you get here nothing has been cancelled and
            // all you need for job2 is calculated and ready as
            // calc1 in the class.
            // This is why calc1 need not be synced: it is
            // (potentially) written by job1 and read by job2,
            // so no concurrent access.
            schedule job2.1 on (any) queue
            and so on
This is to me the most direct way of doing it, and the one most ready for future development. Easy to maintain and understand, and so on.
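To make the pseudocode concrete, here is a minimal Objective-C sketch of one stage, using a dispatch group in place of the hand-rolled counter and an atomic property as the synced cancel flag; all names are illustrative.
@interface TaskRunner : NSObject
@property (atomic) BOOL cancelled; // the "synced state cancel1", e.g. set from the UI
@end

@implementation TaskRunner

- (void)start
{
    dispatch_queue_t queue = dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0);
    dispatch_group_t group = dispatch_group_create();

    // schedule job1.1, job1.2, ...; the group plays the role of counter1
    for (NSUInteger i = 0; i < 3; i++) {
        dispatch_group_async(group, queue, ^{
            if (self.cancelled) return; // periodic cancel check, exit early
            // ... compute this job's share of calc1 ...
        });
    }

    // "wait on counter1": block until every job1.x finished or bailed out
    dispatch_group_wait(group, DISPATCH_TIME_FOREVER);
    if (self.cancelled) return; // check cancel1 and exit early

    // calc1 is complete; schedule job2.x the same way
}

@end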
EDIT 3
Reason I like and prefer this is that it keeps all your interdependent logic in one place, and it is easy to later add to it or calibrate it if you need finer control.
Reason I prefer this to e.g. subclassing NSOp is that you then spread this logic out into a number of already complex subclasses, and you also lose some control. Here you only schedule stuff after you've tested some condition and know that the next batch needs to run. In the alternative you schedule everything at once and need additional logic in all the subclasses to monitor the progress of the task or the state of the cancel, so it mushrooms quickly.
Subclassing NSOp I'd do if the specific op that runs in that subclass needs calibration, but subclassing it to manage the interdependencies adds complexity, I reckon.
(Probably final) EDIT 4
If you made it this far I am impressed. Now, looking at my proposed piece of (pseudo) code, you might see that it is overkill and that you can simplify it considerably. That is because, the way it is presented, the different components of the whole task (task 1, task 2 and so on) appear to be disconnected. If that is the case there are indeed a number of different and simpler ways to do this. In the reference I give a nice way of doing this if all the tasks are the same or very similar, or if you have only a single subsubtask (e.g. 1.1) per subtask (e.g. 1), or only a single (sub or subsub) task running at any point in time.
However, for real problems, you will probably end up with a much less clean and linear flow between these. In other words, after task 2, say, you may kick off task 3.1, which is not required by task 4 or 5 but only needed by task 6. Then the cancel-and-exit-early logic already becomes tricky, and the reason I do not break this one up into smaller and simpler bits is really that, as here, the logic can (easily) span those subtasks, and this class A represents a bigger whole, e.g. clean data or take pictures or whatever the big problem is that you are trying to solve.
Also, if you work on something that is really slow and you need to squeeze out performance, you can do that by figuring out the dependencies between the (sub and subsub) tasks and kicking them off as soon as possible. This type of calibration is how (real-life) problems that took way too long for the UI become doable, as you can break them up and (non-linearly) piece them together in such a way that you traverse them in the most efficient way.
I've had a few such problems, and one in particular I am thinking of became extremely fragile, with logic difficult to follow, but this way I was able to bring the solution time down from an unacceptable more-than-a-minute to just a few seconds, which was agreeable to the users.
(This time really almost the final) EDIT 5
Also, the way it is presented here, as you make progress in solving the problem, those junctures between, say, task 1 and 2, or between 2 and 3, are the places where you can update your UI with progress and with parts of the full solution as they trickle in from all the various (sub and subsub) tasks.
(The end is coming) EDIT 6
If you work on a single core then, except for the interdependencies between tasks, the order in which you schedule all those sub and subsub tasks does not matter, since execution is linear. The moment you have multiple cores, you need to break the solution up into the smallest possible subtasks and schedule the longer-running ones as soon as possible. The performance squeeze you get can be significant, but it comes at the cost of increasingly complex flow between all the small subtasks and in the way you handle the cancel logic.

Enforcing one-at-a-time access to a pointer from a primitive wrapper

I've read a fair amount on thread-safety, and have been using GCD to keep the math-heavy code off the main thread for a while now (I learned about it before NSOperation, and it seems to still be the easier option). However, I wonder if I could improve part of my code that currently uses a lock.
I have an Objective-C++ class that is a wrapper for a C++ vector. (Reasons: primitive floats are added constantly without knowing a limit beforehand, the container must be contiguous, and the reason for using a vector vs NSMutableData is "just cause" it's what I settled on; NSMutableData would still suffer from the same "expired" pointer problem when it resizes itself.)
The class has instance methods to add data points, which are processed and added to the vector (vector::push_back). After new data is added I need to analyze it (from a different object). That processing happens on a background thread, and it uses a pointer directly into the vector. Currently the wrapper has a getter method that first locks the instance (it suspends a local serial queue used for the writes) and then returns the pointer. For those that don't know, this is done because when the vector runs out of space, push_back causes the vector to move in memory to make room for the new entries, invalidating the pointer that was handed out. Upon completion, the math-heavy code calls unlock on the wrapper, and the wrapper resumes the queue so the queued writes finish.
I don't see a way to pass the pointer along -for an unknown length of time- without using some type of lock or making a local copy -which would be prohibitively expensive.
Basically: is there a better way to pass a primitive pointer to a vector (or NSMutableData, for those that are getting hung up on the vector) such that, while the pointer is being used, any additions to the vector are queued, and when the consumer of the pointer is done, the vector is automatically "unlocked" and the write queue processed?
Current Implementation
Classes:
DataArray: a wrapper for a C++ vector
DataProcessor: Takes the most raw data and cleans it up before sending it to the 'DataArray'
DataAnalyzer: Takes the 'DataArray' pointer and does analysis on array
Worker: owns and initializes all three; it also coordinates the actions (it does other stuff as well that is beyond the scope here). It is also a delegate to the processor and analyzer
What happens:
Worker is listening for new data from another class that handles external devices
When it receives an NSNotification with the data packet, it passes that on to DataProcessor via -(void)checkNewData:(NSArray*)data
DataProcessor, working in a background thread cleans up the data (and keeps partial data) and then tells DataArray to -(void)addRawData:(float)data (shown below)
DataArray then stores that data
When DataProcessor is done with the current chunk it tells Worker
When Worker is notified processing is done it tells DataAnalyzer to get started on the new data by -(void)analyzeAvailableData
DataAnalyzer does some prep work, including asking DataArray for the pointer by - (float*)dataPointer (shown below)
DataAnalyzer does a dispatch_async to a global thread and starts the heavy-lifting. It needs access to the dataPointer the entire time.
When done, it does a dispatch_async to the main thread to tell DataArray to unlock the array.
DataArray is accessed by other objects for read-only purposes as well, but those other reads are super quick.
Code snips from DataArray
- (void)addRawData:(float)data {
    // quick sanity check
    dispatch_async(addDataQueue, ^{
        rawVector.push_back(data);
    });
}

- (float *)dataPointer {
    [self lock];
    return &rawVector[0];
}

- (void)lock {
    if (!locked) {
        locked = YES;
        dispatch_suspend(addDataQueue);
    }
}

- (void)unlock {
    if (locked) {
        dispatch_resume(addDataQueue);
        locked = NO;
    }
}
Code snip from DataAnalyzer
- (void)analyzeAvailableData {
    // do some prep work
    const float *rawArray = [self.dataArray dataPointer];
    dispatch_async(global_queue, ^{
        // lots of analysis
        // done
        dispatch_async(main_queue, ^{
            // tell `Worker` analysis is done
            [self.dataArray unlock];
        });
    });
}
If you have a shared resource (your vector) which will be concurrently accessed through reads and writes from different tasks, you can associate a dedicated dispatch queue with this resource, where these tasks will run exclusively.
That is, every access to this resource (read or write) will be executed on that dispatch queue exclusively. Let's name this queue "sync_queue".
This "sync_queue" may be a serial queue or a concurrent queue.
If it's a serial queue, it should be immediately obvious that all accesses are thread-safe.
If it's a concurrent queue, you can allow read accesses to happen simultaneously, that is you simply call dispatch_async(sync_queue, block):
dispatch_async(sync_queue, ^{
    if (_shared_value == 0) {
        dispatch_async(otherQueue, block);
    }
});
If that read access "moves" the value to a call-site executing on a different execution context, you should use the synchronous version:
__block int x;
dispatch_sync(sync_queue, ^{
    x = _shared_value;
});
return x;
Any write access requires exclusive access to the resource. With a concurrent queue, you accomplish this by using a barrier:
dispatch_barrier_async(sync_queue, ^{
    _shared_value = 0;
    dispatch_async(mainQueue, ^{
        NSLog(@"value %d", _shared_value);
    });
});
It really depends on what you're doing; most of the time I drop back to the main queue (or a specifically designated queue) using dispatch_async() or dispatch_sync().
Async is obviously better, if you can do it.
It's going to depend on your specific use case but there are times when dispatch_async/dispatch_sync is multiple orders of magnitude faster than creating a lock.
The entire point of Grand Central Dispatch (and NSOperationQueue) is to take away many of the bottlenecks found in traditional threaded programming, including locks.
Regarding your comment about NSOperation being harder to use... that's true; I don't use it very often either. But it does have useful features: for example, if you need to be able to terminate a task halfway through execution, or before it has even started executing, NSOperation is the way to go.
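As a rough illustration of that last point, an operation can poll its own isCancelled flag between chunks of work, so a cancel issued mid-flight takes effect at the next check (the chunk loop here is invented for the example):
NSBlockOperation *op = [[NSBlockOperation alloc] init];
__weak NSBlockOperation *weakOp = op;
[op addExecutionBlock:^{
    NSUInteger chunkCount = 100; // illustrative
    for (NSUInteger i = 0; i < chunkCount; i++) {
        if (weakOp.isCancelled) return; // bail out between chunks
        // ... process chunk i ...
    }
}];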
There is a simple way to get what you need even without locking. The idea is that you have either shared, immutable data or exclusive, mutable data. The reason you don't need a lock for shared, immutable data is that it is simply read-only, so no race conditions can occur during writing.
All you need to do is to switch between both depending on what you currently need:
When you are adding samples to your storage, you need exclusive access to the data. If you already have a "working copy" of the data, you can just extend it as you need. If you only have a reference to the shared data, you create a working copy which you then keep for later exclusive access.
When you want to evaluate your samples, you need read-only access to the shared data. If you already have a shared copy, you just use that. If you only have an exclusive-access working copy, you convert that to a shared one.
Both of these operations are performed on demand. Assuming C++, you could use std::shared_ptr<vector const> for the shared, immutable data and std::unique_ptr<vector> for the exclusive-access, mutable data. For the older C++ standard those would be boost::shared_ptr<..> and std::auto_ptr<..> instead. Note the use of const in the shared version and that you can convert from the exclusive to the shared one easily, but the inverse is not possible, in order to get a mutable from an immutable vector, you have to copy.
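A minimal Objective-C++ sketch of that conversion (this would live in a .mm file; the names are illustrative, and the swap between the two forms still has to happen on a single thread or behind your existing serial queue):
#include <memory>
#include <vector>

typedef std::vector<float> Samples;

static std::shared_ptr<const Samples> sharedSamples; // immutable, safe to hand out
static std::unique_ptr<Samples> workingSamples;      // mutable, exclusive access

static void publishSamples()
{
    // exclusive -> shared is cheap: just transfer ownership
    sharedSamples = std::shared_ptr<const Samples>(std::move(workingSamples));
}

static void beginEditingSamples()
{
    // shared -> exclusive requires a copy, since the shared data is immutable
    workingSamples.reset(new Samples(*sharedSamples));
}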
Note that I'm assuming that copying the sample data is possible and doesn't explode the complexity of your algorithm. If that doesn't work, your approach with the scrap space that is used while the background operations are in progress is probably the best way to go. You can automate a few things using a dedicated structure that works similarly to a smart pointer, though.

Convert pthread to Objective-C

I'm trying to convert the following to Objective-C code.
This is the current thread code I have in C, and it works fine:
//calling EnrollThread method on a thread in C
pthread_t thread_id;
pthread_create( &thread_id, NULL, EnrollThread, pParams );
//What the EnrollThread method structure looks like in C
void* EnrollThread( void *arg )
What my method structure looks like now that I've changed it to Objective-C:
-(void)enrollThreadWithParams:(LPBIOPERPARAMS)params;
Now I'm not sure how to call this Objective-C method with the pthread_create call.
I've tried something like this:
pthread_create( &thread_id, NULL, [refToSelf enrollThreadWithParams:pParams], pParams );
But I believe I have it wrong. Can anyone enlighten me on why this does not work, and what I need to do to fix it so that I can create my thread in the background? My UI is locked up until the method finishes what it's doing.
I was thinking of also using dispatch_sync but I haven't tried that.
In Objective-C you don't really use pthread_create. Although you can still use it, the thread entry point needs to be a C function, so I'm not sure it would be the best approach.
There are many options, as you can read in the Threading and Concurrency documents.
performSelectorInBackground method of NSObject (and subclasses)
dispatch_async (not dispatch_sync as you mentioned)
NSOperation and NSOperationQueue
NSThread class
I would suggest giving the first one a shot, since it is the easiest and very straightforward. The second one is also very easy, because you don't have to create external objects; you just place the code to be executed in parallel inline.
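A sketch of the first option applied to the question's code; since performSelectorInBackground:withObject: only accepts object arguments, the C pointer is boxed in an NSValue (the method name enrollInBackground: is invented):
[self performSelectorInBackground:@selector(enrollInBackground:)
                       withObject:[NSValue valueWithPointer:pParams]];

- (void)enrollInBackground:(NSValue *)boxedParams
{
    LPBIOPERPARAMS params = (LPBIOPERPARAMS)[boxedParams pointerValue];
    // ... the body of the old EnrollThread() goes here ...
}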
The go-to reference for concurrent programming is the Concurrency Programming Guide, which walks you through dispatch queues (known as Grand Central Dispatch, GCD) and operation queues. Both are incredibly easy to use and offer their own respective advantages.
In their simplest forms, both of these are pretty easy to use. As others have pointed out, the process for creating a dispatch queue and then dispatching something to that queue is:
dispatch_queue_t queue = dispatch_queue_create("com.domain.app", DISPATCH_QUEUE_CONCURRENT);
dispatch_async(queue, ^{
    // something to do in the background
});
The operation queue equivalent is:
NSOperationQueue *queue = [[NSOperationQueue alloc] init];
[queue addOperationWithBlock:^{
    // something to do in the background
}];
Personally, I prefer operation queues where:
I need controlled/limited concurrency (i.e. I'm going to dispatch a bunch of things to that queue and I want them to run concurrently with respect to not only the main queue, but also with respect to each other, yet I don't want more than a few of them running simultaneously). A good example would be concurrent network requests, where you want them running concurrently (because you get a huge performance benefit) but you generally don't want more than four of them running at any given time. With an operation queue, one can specify maxConcurrentOperationCount, whereas this is tougher to do with GCD; see the sketch after these paragraphs.
I need a fine level of control over dependencies. For example, I'm going to start operations A, B, C, D, and E, but B is dependent on A (i.e. B shouldn't start before A finishes), D is dependent upon C, and E is dependent upon both B and D finishing.
I need to enjoy concurrency on tasks that, themselves, run asynchronously. Operations offer a fine degree of control over when the operation is to be declared isFinished, through an NSOperation subclass configured as a "concurrent operation". A common example is the network operation which, if you use the delegate-based implementation, runs asynchronously, but where you still want to use operations to control the flow from one to the next. The very nice networking library AFNetworking, for example, uses operations extensively for this reason.
On the other hand, GCD is great for simple one-off asynchronous tasks (because you can avail yourself of built-in "global queues", freeing yourself from making your own queue), serial queues for synchronizing access to some shared resource, dispatch sources like timers, signaling between threads with semaphores, etc. GCD is generally where people get started with concurrent programming in Cocoa and Cocoa Touch.
Bottom line, I personally use operation queues for application-level asynchronous operations (network queues, image processing queues, etc.), where the degree of concurrency becomes an important issue. I tend to use GCD for lower-level stuff or quick and simple stuff. GCD (with dispatch_async) is a great place to start as you dip your toe into the ocean of concurrent programming, so go for it.
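Here's a brief sketch of the first two points from the list above (limited concurrency plus dependencies); the operations themselves are placeholders:
NSOperationQueue *queue = [[NSOperationQueue alloc] init];
queue.maxConcurrentOperationCount = 4; // never more than four at once

NSOperation *a = [NSBlockOperation blockOperationWithBlock:^{ /* request A */ }];
NSOperation *b = [NSBlockOperation blockOperationWithBlock:^{ /* request B */ }];
[b addDependency:a]; // b won't start until a finishes

[queue addOperations:@[a, b] waitUntilFinished:NO];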
There are two things I'd encourage you to be aware of, regardless of which of these two technologies you use:
First, remember that (in iOS at least) you always want to do user interface tasks on the main queue. So the common patterns are:
dispatch_async(queue, ^{
    // do something slow here

    // when done, update the UI and model objects on the main queue
    dispatch_async(dispatch_get_main_queue(), ^{
        // UI and model updates can go here
    });
});
or
[queue addOperationWithBlock:^{
    // do something slow here

    // when done, update the UI and model objects on the main queue
    [[NSOperationQueue mainQueue] addOperationWithBlock:^{
        // do UI and model updates here
    }];
}];
The other important issue to consider is synchronization and "thread-safety". (See the Synchronization section of the Threading Programming Guide.) You want to make sure that you don't, for example, have the main thread populating some table view while some background queue is changing the data used by that table view at the same time. You want to make sure that while any given thread is using some model object or other shared resource, another thread isn't mutating it, leaving it in an inconsistent state.
There's too much to cover in the world of concurrent programming. The WWDC videos (including 2011 and 2012) offer some great background on GCD and asynchronous programming patterns, so make sure you avail yourself of that great resource.
If you already have working code, there is no reason to abandon pthreads. You should be able to use it just fine.
If, however, you want an alternative, but you want to keep your existing pthread entry point, you can do this easily enough...
dispatch_queue_t queue = dispatch_queue_create("EnrollThread", DISPATCH_QUEUE_SERIAL);
dispatch_async(queue, ^{
EnrollThread(parms);
});

Asynchronous Cocoa - Preventing "simple" (obvious) deadlocks in NSOperation?

When subclassing NSOperation to get a little chunk of work done, I've found it's pretty easy to deadlock. Below I have a toy example where it's pretty easy to understand why it never completes.
I can only seem to think of solutions that prevent the deadlock from the caller's perspective, never the callee's. For example, the caller could keep running the run loop, not wait for finish, etc. If the main thread needs to be messaged synchronously during the operation, I'm wondering if there is a canonical solution that an operation subclasser can implement to prevent this type of deadlocking. I'm only just starting to dip my toe into async programming...
@interface ToyOperation : NSOperation
@end

@implementation ToyOperation

- (void)main
{
    // Lots of work
    NSString *string = @"Important Message";
    [self performSelector:@selector(sendMainThreadSensitiveMessage:) onThread:[NSThread mainThread] withObject:string waitUntilDone:YES];
    // Lots more work
}

- (void)sendMainThreadSensitiveMessage:(NSString *)string
{
    // Update the UI or something that requires the main thread...
}

@end
- (int)main
{
    ToyOperation *op = [[ToyOperation alloc] init];
    NSOperationQueue *opQ = [[NSOperationQueue alloc] init];
    [opQ addOperations:@[ op ] waitUntilFinished:YES]; // Deadlock
    return 0;
}
If the main thread needs to be messaged synchronously during the operation, I'm wondering if there is a canonical solution that an operation subclasser can implement to prevent this type of deadlocking.
There is. Never make a synchronous call to the main queue. And a follow-on: never make a synchronous call from the main queue. Really, it can all be summed up as: never make a synchronous call from any queue to any other queue.
By doing that, you guarantee that the main queue is not blocked. Sure, there may be an exceptional case that tempts you to violate this rule and, even, cases where it really, truly, is unavoidable. But that very much should be the exception because even a single dispatch_sync() (or NSOpQueue waitUntilDone) has the potential to deadlock.
Of course, data updates from queue to queue can be tricky. There are several options: a concurrency-safe data layer (very hard); only passing immutable objects or copies of the data (typically to the main queue for display purposes - fairly easy, but potentially expensive); or the UUID-based faulting model that Core Data uses. Regardless of how you solve this, the problem isn't anything new compared to any other concurrency model.
The one exception is when replacing locks with queues (for example, instead of using @synchronized() internally in a class, use a serial GCD queue and dispatch_sync() to that queue anywhere a synchronized operation must take place - faster and more straightforward). But this isn't so much an exception as a solution to a completely different problem.
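For what it's worth, a minimal sketch of that lock-replacement pattern (class and queue label invented for the example):
@interface Counter : NSObject
- (NSInteger)value;
- (void)increment;
@end

@implementation Counter {
    dispatch_queue_t _syncQueue;
    NSInteger _value;
}

- (instancetype)init {
    if ((self = [super init])) {
        _syncQueue = dispatch_queue_create("com.example.counter.sync", DISPATCH_QUEUE_SERIAL);
    }
    return self;
}

- (NSInteger)value {
    __block NSInteger result;
    dispatch_sync(_syncQueue, ^{ result = _value; }); // serialized read
    return result;
}

- (void)increment {
    dispatch_sync(_syncQueue, ^{ _value += 1; }); // serialized write
}

@end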

iPhone use of mutexes with asynchronous URL requests

My iPhone client has a lot of involvement with asynchronous requests, a lot of the time consistently modifying static collections of dictionaries or arrays. As a result, it's common for me to see the following error with larger data structures, which take longer to retrieve from the server:
*** Terminating app due to uncaught exception 'NSGenericException', reason: '*** Collection <NSCFArray: 0x3777c0> was mutated while being enumerated.'
This typically means that two requests to the server came back with data that is trying to modify the same collection. What I'm looking for is a tutorial/example/understanding of how to properly structure my code to avoid this detrimental error. I do believe the correct answer is mutexes, but I've never personally used them yet.
This is the result of making asynchronous HTTP requests with NSURLConnection and then using NSNotificationCenter as a means of delegation once requests are complete. When firing off requests that mutate the same collections, we get these collisions.
There are several ways to do this. The simplest in your case would probably be to use the @synchronized directive. This will allow you to create a mutex on the fly using an arbitrary object as the lock.
@synchronized(sStaticData) {
    // Do something with sStaticData
}
Another way would be to use the NSLock class. Create the lock you want to use, and then you will have a bit more flexibility when it comes to acquiring the mutex (with respect to blocking if the lock is unavailable, etc).
NSLock *lock = [[NSLock alloc] init];
// ... later ...
[lock lock];
// Do something with shared data
[lock unlock];
// Much later
[lock release], lock = nil;
If you decide to take either of these approaches, it will be necessary to acquire the lock for both reads and writes, since you are using NSMutableArray/Set/whatever as a data store. As you've seen, NSFastEnumeration prohibits mutation of the object being enumerated.
But I think another issue here is the choice of data structures in a multi-threaded environment. Is it strictly necessary to access your dictionaries/arrays from multiple threads? Or could the background threads coalesce the data they receive and then pass it to the main thread which would be the only thread allowed to access the data?
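For example, a sketch of that coalescing idea (the method and property names are illustrative): the background work builds a private, immutable snapshot, and only the main thread ever touches the live collection.
dispatch_async(dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0), ^{
    // build a thread-local, immutable result on the background thread
    NSArray *parsed = [self parseResponse:responseData]; // illustrative method
    dispatch_async(dispatch_get_main_queue(), ^{
        // the main thread is the only mutator of the shared collection
        [self.items addObjectsFromArray:parsed];
        [self.tableView reloadData];
    });
});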
If it's possible that any data (including class objects) will be accessed from two threads simultaneously, you must take steps to keep these accesses synchronized.
Fortunately, Objective-C makes it ridiculously easy to do this using the @synchronized directive. It takes as an argument any Objective-C object. Any other thread that enters a @synchronized section with the same object will halt until the first finishes.
- (void)doSomethingWith:(NSArray *)someArray
{
    // @synchronized guarantees that only one thread at a time
    // can hold this lock object and enter the block
    @synchronized(someArray)
    {
        // modify array
    }
}
If you need to protect more than just one variable you should consider using a semaphore that represents access to that set of data.
// Get the semaphore.
id groupSemaphore = [Group semaphore];
@synchronized(groupSemaphore)
{
    // Critical group code.
}
In response to the sStaticData/NSLock answer (comments are limited to 600 chars): don't you need to be very careful about creating the sStaticData and NSLock objects in a thread-safe way (to avoid the very unlikely scenario of multiple locks being created by different threads)?
I think there are two workarounds:
1) You can mandate that those objects get created at the start of day on the single root thread.
2) Define a static object that is automatically created at the start of day to use as the lock, e.g. a static NSString can be created inline:
static NSString *sMyLock1 = @"Lock1";
Then I think you can safely use
@synchronized(sMyLock1)
{
    // Stuff
}
Otherwise I think you'll always end up in a 'chicken and egg' situation when creating your locks in a thread-safe way?
Of course, you are very unlikely to hit any of these problems, as most iPhone apps run on a single thread.
I don't know about the [Group semaphore] suggestion earlier; that might also be a solution.
N.B. If you are using synchronisation don't forget to add -fobjc-exceptions to your GCC flags:
Objective-C provides support for thread synchronization and exception handling, which are explained in this article and “Exception Handling.” To turn on support for these features, use the -fobjc-exceptions switch of the GNU Compiler Collection (GCC) version 3.3 and later.
http://developer.apple.com/library/ios/#documentation/cocoa/Conceptual/ObjectiveC/Articles/ocThreading.html
Use a copy of the object for the enumeration. Since you are trying to modify a collection while someone else might also be modifying it (multiple access), make a copy, enumerate over that copy, and apply your modifications to the original.
NSMutableArray *originalArray = [NSMutableArray arrayWithArray:@[@"A", @"B", @"C"]];
NSArray *arrayToEnumerate = [originalArray copy];
Now enumerate over arrayToEnumerate and modify originalArray as you go. Since the copy is a snapshot of originalArray rather than a reference to it, the mutation-while-enumerating exception won't occur.
There are other ways if you don't want the overhead of locking, as it has its cost. Instead of using a lock to protect the shared resource (in your case a dictionary or array), you can create a queue to serialise the tasks that access your critical-section code.
A queue doesn't incur the same penalty as a lock, as it doesn't require trapping into the kernel to acquire a mutex.
Simply put:
dispatch_async(serial_queue, ^{
    <#critical code#>
});
In case you want the current execution to wait until the task completes, you can use:
dispatch_sync(serial_or_concurrent_queue, ^{
    <#critical code#>
});
Generally, if execution does not need to wait, asynchronous is the preferred way of doing it.
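Applied to the question's mutable collections, a sketch might look like this (queue label and variable names invented): writers enqueue asynchronously, and readers synchronously take an immutable snapshot to enumerate.
dispatch_queue_t collectionQueue = dispatch_queue_create("com.example.collection", DISPATCH_QUEUE_SERIAL);

// writers never block the caller
dispatch_async(collectionQueue, ^{
    [sharedArray addObject:newItem];
});

// readers take a snapshot and enumerate that safely
__block NSArray *snapshot;
dispatch_sync(collectionQueue, ^{
    snapshot = [sharedArray copy];
});
for (id item in snapshot) {
    // any concurrent mutations are queued behind us
}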