I'm trying to enforce a specific order for tasks to complete using Grand Central Dispatch but I'm having a bit of trouble understanding the correct way to do it. I tried using groups in the following way:
Initialization:
startup = dispatch_group_create();
Tasks that need to wait:
//Don't want to wait on the main thread, so dispatch async to a concurrent queue
dispatch_async(dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0L),^{
//Wait until we're finished starting up
dispatch_group_wait(startup,DISPATCH_TIME_FOREVER);
//Now we can do this stuff back on the main queue
dispatch_async(dispatch_get_main_queue(),^{
//Do work
});
});
Work that I need to wait for:
dispatch_group_async(startup, dispatch_get_main_queue(), ^{ /* work */ });
Due to the nature of my app, the tasks that need to wait can be submitted BEFORE the work that I need to wait for. What I really want is the ability to wait on a condition, so that once the condition is met, it stays met and all future threads can do their thing. Does GCD have this?
I'm not sure of all the details of your implementation, so forgive me if I'm repeating what you already know.
Create a dispatch group
Use dispatch_group_async to submit your tasks to a serial queue. With a serial queue, you are assured that your tasks are processed in the order you submit them. Use a concurrent queue if the order doesn't matter, but your question said that they had to complete in a specific order.
After you are done dispatching all of your tasks, use dispatch_group_notify. This will execute a block on the queue you specify once all the tasks assigned to the group have finished processing.
dispatch_group_notify(startup, dispatch_get_main_queue(), ^{
// Don't forget to release the dispatch group!
dispatch_release(startup);
// perform work block;
});
For a particular parsing activity, I need to do some processing of data that can be done while the rest of the parsing activity is ongoing. I assign the processing to a group on a concurrent queue. Then, when my parsing is done, I check whether the group is done. If it is, I clean things up. If not, I use dispatch_group_notify() to run the cleanup once the group finishes. Something along these lines:
double delayInSeconds = 2.0;
dispatch_time_t groupWaitTime = dispatch_time(DISPATCH_TIME_NOW, delayInSeconds * NSEC_PER_SEC);
if (dispatch_group_wait(myDispatchGroup, groupWaitTime)==0){
NSLog(@"dispatch group completed in time");
dispatch_release(myDispatchGroup);
[self parsingCompleteWithActivity:activity];
}else{
NSLog(@"dispatch group did not complete in time");
dispatch_group_notify(myDispatchGroup, dispatch_get_main_queue(), ^{
dispatch_release(myDispatchGroup);
[self parsingCompleteWithActivity:activity];
});
}
Good luck!
Works if I use a semaphore and then signal after each call to wait.
Also works if I call dispatch_group_enter and dispatch_group_leave.
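For readers wondering what the enter/leave variant might look like, here is a minimal sketch of a one-shot "startup finished" gate. The helper names (MakeStartupGate, RunAfterStartup, OpenStartupGate) are made up for illustration, and the code assumes ARC, so there is no dispatch_release. It relies on the documented behaviour that dispatch_group_notify submits its block immediately when the group is already empty, so blocks registered after the gate opens still run.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: creates a gate that starts out closed.
static dispatch_group_t MakeStartupGate(void) {
    dispatch_group_t gate = dispatch_group_create();
    dispatch_group_enter(gate);   // one outstanding "enter" keeps the gate closed
    return gate;
}

// Hypothetical helper: runs `work` on `queue` once the gate is open.
// If the gate is already open, the block is submitted right away.
static void RunAfterStartup(dispatch_group_t gate, dispatch_queue_t queue, dispatch_block_t work) {
    dispatch_group_notify(gate, queue, work);
}

// Hypothetical helper: call exactly once, when startup has finished.
static void OpenStartupGate(dispatch_group_t gate) {
    dispatch_group_leave(gate);   // balances the enter in MakeStartupGate
}
```

Nothing blocks a thread here, so the waiting tasks no longer need the extra hop through a global queue just to call dispatch_group_wait.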
I have a recursive function that schedules new tasks on a concurrent queue. I would like to limit the number of simultaneously scheduled tasks and so I use a semaphore so that each task will wait on it until the older threads end and signal the semaphore.
However I find that the queue gets deadlocked when the maximum number of running threads (64) is reached and they all begin waiting on the semaphore. Then GCD doesn't start new tasks even though it has plenty in its pending queue.
What am I doing wrong? Here is my code:
- (void)applicationDidFinishLaunching:(NSNotification *)aNotification
{
dispatch_semaphore_t sem = dispatch_semaphore_create(10);
dispatch_semaphore_wait(sem, DISPATCH_TIME_FOREVER);
dispatch_async(dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0), ^
{
[self recurWithSemaphore:sem];
});
}
- (void)recurWithSemaphore:(dispatch_semaphore_t)sem
{
// do some lengthy work here...
// at this point we're done all but scheduling new tasks so let new tasks be created
dispatch_semaphore_signal(sem);
for (NSUInteger i = 0; i < 100; ++i)
{
// don't schedule new tasks until we have enough semaphore
dispatch_semaphore_wait(sem, DISPATCH_TIME_FOREVER);
dispatch_async(dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0), ^
{
[self recurWithSemaphore:sem];
});
}
}
The typical pattern when using semaphores to control access to a limited resource is
Create semaphore with non-zero value;
For every task:
When starting, "wait" for semaphore (thereby consuming one of the available signals, or if one not available, wait for one); and
When done, "signal" semaphore (making it available for another task).
So, let's say you wanted to start 1,000,000 tasks, only 4 concurrently at any given time, you could do something like:
dispatch_semaphore_t semaphore = dispatch_semaphore_create(4);
dispatch_queue_t queue = ... // some concurrent queue, either global or your own
dispatch_async(queue, ^{
for (long index = 0; index < 1000000; index++) {
dispatch_semaphore_wait(semaphore, DISPATCH_TIME_FOREVER);
dispatch_async(queue, ^{
[self performSomeActionWithIndex:index completion:^{
dispatch_semaphore_signal(semaphore);
}];
});
}
});
Clearly, if you're dynamically adding more tasks to perform, you could change this from a for loop into a while loop, checking some synchronized source, but the idea is the same.
The key observation, though, is that performSomeActionWithIndex does not itself recursively create tasks (because then you get into the deadlock situation of the original question, where tasks are blocked because they can't start new tasks).
Now, I don't know if your problem can be refactored into this sort of pattern, but if you can, this might be an option.
By the way, for the sake of completeness, I'd point out that the typical solution for controlling the degree of concurrency is to use operation queues rather than dispatch queues, in which case you can specify maxConcurrentOperationCount.
As you correctly pointed out, there are memory implications of that. In my tests, each scheduled operation takes up at least 500 bytes (and likely more in real-world scenarios), so if you really have more than, say, 5,000-10,000 tasks to be scheduled, operation queues may quickly become impractical. As you advised, future readers should refer to the Performance Implications section in Concurrency Programming Guide: Concurrency and Application Design.
I know that this is not a viable approach in your case, but I only mention it for the benefit of future readers. I would generally advise the use of operation queues when one needs to control the degree of concurrency. I would only jump to an approach like the one outlined above if you're dealing with so many tasks that one can't reasonably just schedule them on an operation queue.
How do I prevent a dispatch_group from getting stuck? I have found that the following code can get stuck (with or without the dispatch_group_wait call) if one of the images I attempt to load is not loaded (e.g. due to a bad URL). The block in dispatch_group_notify is never called.
dispatch_queue_t queue = dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_HIGH, 0);
dispatch_group_t group = dispatch_group_create();
for (...) {
if (...) {
dispatch_group_enter(group);
dispatch_async(queue, ^{
[self loadImageWithUrl:url onCompletion:^{
dispatch_group_leave(group);
}];
});
}
}
dispatch_group_wait(group, dispatch_time(DISPATCH_TIME_NOW, (int64_t)(2.0 * NSEC_PER_SEC)));
dispatch_group_notify(group, queue, ^{
NSLog(@"load image complete");
});
dispatch_group_notify queues its block when the group is complete. Your group never completes. So don't use dispatch_group_notify. Just use dispatch_group_wait as you are to wait with a timeout, then dispatch your block:
...
dispatch_group_wait(group, dispatch_time(DISPATCH_TIME_NOW, (int64_t)(2.0 * NSEC_PER_SEC)));
dispatch_async(queue, ^{
NSLog(@"load image complete");
});
If you want to mimic a dispatch_group_notify with a timeout, just do the above in its own async block:
dispatch_async(dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0), ^{
dispatch_group_wait(group, dispatch_time(DISPATCH_TIME_NOW, (int64_t)(2.0 * NSEC_PER_SEC)));
dispatch_sync(queue, ^{
NSLog(@"load image complete");
});
});
Note that you can use the return value of dispatch_group_wait to determine if everything completed or if it timed-out if that is useful information to you.
Keep in mind that the previous blocks will not be cancelled, so they may eventually run their completion blocks. You may need to add cancellation logic to the system if that's a problem.
I do not think the issue is with your group notify process. What leaps out at me is that, rather than trying to handle the scenario where the completion block is not called, you should change loadImageWithUrl to ensure that it always calls the completion block, whether successful or not. You might even want to add an NSError parameter to the block, so the caller is notified if something failed (for example, to warn the user, or to initiate a Reachability process that waits for the connection to be re-established before attempting a retry).
So, it might look like:
- (void)loadImageWithUrl:(NSURL *)url onCompletion:(void (^)(NSError *error))block
{
BOOL success = NO;
NSError *error = nil;
// do your download, setting `success` and `error` appropriately
// then, when done, call the completion block, whether successful or not
if (block) {
if (success) {
block(nil);
} else {
block(error);
}
}
}
Clearly, the details of the above are entirely dependent upon how you're doing these requests, but that's the basic idea. Then, you just make sure that your caller is changed to include this extra parameter:
for (...) {
if (...) {
dispatch_group_enter(group);
dispatch_async(queue, ^{
[self loadImageWithUrl:url onCompletion:^(NSError *error){
if (error) {
// handle the error however you want, if you want
}
dispatch_group_leave(group);
}];
});
}
}
I care less about how you choose to handle the error than I do about encouraging you to ensure your completion block is called regardless of whether the download was successful or not. This ensures that the number of times you enter the group is perfectly balanced with the number of times you leave the group.
Having said that, GCD is ill-suited to downloading many resources. The issue is that it's non-trivial to constrain how many tasks GCD runs at one time. Generally, you want to constrain how many requests can run concurrently. You do this because (a) there's a limit as to how many NSURLSessionTask or NSURLConnection requests can run concurrently anyway; (b) if you run more than that, on slow connections you run a serious risk of requests timing out unnecessarily; (c) you can reduce your app's peak memory usage; but (d) you still enjoy concurrency, striking a balance between memory usage and network bandwidth utilization.
To accomplish this, a common solution is to use operation queues rather than GCD's dispatch queues. You can then wrap your download requests in NSOperation objects and add these network operations to an NSOperationQueue for which you have set some reasonable maxConcurrentOperationCount (e.g. 4 or 5). And instead of a dispatch group notify, you can add a completion operation which is dependent upon the other operations you've added to your queue.
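As a rough sketch of that shape (not a drop-in downloader), something like the following might do. RunTasksWithLimit is a hypothetical helper name, and it assumes each task block is synchronous, i.e. it returns only when its work is finished. An asynchronous download would need to be wrapped in an asynchronous NSOperation subclass instead, otherwise the operation counts as finished as soon as the request is started.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: runs each (synchronous) block in `tasks` with at most
// `limit` running at once, then runs `completion` after all have finished.
// Returns the queue so the caller can keep it alive or cancel pending work.
static NSOperationQueue *RunTasksWithLimit(NSArray<dispatch_block_t> *tasks,
                                           NSInteger limit,
                                           dispatch_block_t completion) {
    NSOperationQueue *queue = [[NSOperationQueue alloc] init];
    queue.maxConcurrentOperationCount = limit;

    // Takes the place of dispatch_group_notify: it depends on every task.
    NSBlockOperation *completionOperation =
        [NSBlockOperation blockOperationWithBlock:completion];

    for (dispatch_block_t task in tasks) {
        NSBlockOperation *operation = [NSBlockOperation blockOperationWithBlock:task];
        [completionOperation addDependency:operation];
        [queue addOperation:operation];
    }

    // Added last, but it would wait for its dependencies in any case.
    [queue addOperation:completionOperation];
    return queue;
}
```

The completion operation runs on the same queue here; if it touches the UI, it could be added to [NSOperationQueue mainQueue] instead, since dependencies work across queues.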
If you don't want to implement this yourself, you can use AFNetworking or SDWebImage, which can facilitate the downloading of images using operation queues to manage the download process.
And one final thought is that many apps adopt a lazy loading process, where images are seamlessly loaded as they're needed. It avoids consuming too much of the user's data plan performing some bulk download (or risking that the image the user needs first is backlogged behind a bunch of other images they don't immediately need). Both AFNetworking and SDWebImage offer UIImageView categories that offer an incredibly simple lazy loading of images.
Would it be possible to do a synchronous load of the image in the inner blocks? That way you could use dispatch_group_async() instead of the manually keeping track of the enter/leave paradigm.
I suspect the error lies in how the blocks complete. It seems odd to me that you enter the group outside of the block/context that you leave the group from.
Finally, are you sure the completion block of the image loading is always called? Is it possible that when the request fails the completion is not called and thus the group counter is never decremented?
Sorry about my initial answer btw, I misread the question totally.
EDIT: Now that I think about what the goal is (synchronising after all images have loaded), it seems that the approach is not really reasonable. Does the code need to block until all the images are loaded? If not, then assuming all completion blocks are fired on a single thread, I would simply keep track of the number of blocks that have been fired and decrement that count in the completion block. When the last one completes, then the contents of the current dispatch_group_notify() could be executed.
Another, perhaps more future-proof, option would be to refactor the image loading code to either offer a synchronous way of fetching an image (meant to be used in cases like this) or offer an async API that is capable of taking a dispatch group/queue. This obviously assumes that the internals of the image loader use GCD.
Finally, you could write a NSOperation subclass, that takes care of a single image loading procedure, then those operations could be used in an NSOperationQueue (offering a bit more abstraction from GCD) that can be easily used to keep track how many operations are ongoing and when they all finish.
The problem is your use of dispatch_group_async(). It should not be used unless you are doing tasks that are synchronous that you want to be done asynchronously. Your loadImageWithUrl() is already asynchronous. This is how you should structure your use of dispatch_group.
dispatch_group_t group = dispatch_group_create();
for (...) {
if (...) {
dispatch_group_enter(group);
[self loadImageWithUrl:url onCompletion:^{
dispatch_group_leave(group);
}];
}
}
dispatch_group_notify(group, queue, ^{
NSLog(@"load image complete");
});
Also dispatch_group_wait is the alternative to using dispatch_group_notify. It should only be used if you want to wait synchronously for the group to finish.
I am writing an application with a plugin system. Plugins must work on the main thread (this is not part of the question, I'm not looking for a bunch of answers that say I should remove this requirement).
Plugins get initialised asynchronously so the UI doesn't hang for a few seconds on launch, but other code immediately starts to interact with the plugins after launch. This obviously needs to be delayed until the plugins have finished loading.
Here is what I've got so far...
// Create operation queue
dispatch_queue_t queue = dispatch_queue_create(...);
dispatch_suspend(queue);
// Load the plugins
dispatch_group_t group = dispatch_group_create();
for each plugin {
dispatch_group_async(group, dispatch_get_main_queue(), ^{
load...
});
}
dispatch_group_notify(group, dispatch_get_main_queue(), ^{
dispatch_resume(queue);
});
// Add operations that interact with the plugins
dispatch_async(queue, ^{
dispatch_async(dispatch_get_main_queue(), ^{
operation...
});
});
This will mean that any operations submitted won't start until the plugins have finished loading, however, any new operations will go through two queues before actually being processed. Is this a large overhead? Would it be worth queueing to begin with, and then swapping out method implementations when ready for one that doesn't bother queueing? This would be more tricky to do and I don't know if it would be worth it.
Finally, is there a better design pattern for this type of problem? Perhaps I should be using NSOperations and NSOperationQueues with dependencies? Or would they have a higher overhead than basic GCD operations?
The "double overhead" is actually really low, but there is a slightly better design pattern you can use for this that is also more intuitive. Create your operation queue and then use dispatch_set_target_queue(queue, dispatch_get_main_queue()) to make it, in essence, a sub-queue of the main queue. This will ensure it executes on the main thread while not requiring you to do the cross-submission - you'll just submit the plug-in operation(s) directly to the operation queue.
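A minimal sketch of that setup, assuming ARC; MakePluginQueue and the queue label are placeholder names of my own, not anything from your code:

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: a serial queue that targets the main queue, so its
// blocks run on the main thread, created suspended so that submitted work
// is held back until the plugins have loaded.
static dispatch_queue_t MakePluginQueue(void) {
    dispatch_queue_t queue =
        dispatch_queue_create("com.example.plugin-operations", DISPATCH_QUEUE_SERIAL);

    // Set the target before anything is submitted to the queue.
    dispatch_set_target_queue(queue, dispatch_get_main_queue());

    // Balanced by a single dispatch_resume once loading has finished.
    dispatch_suspend(queue);
    return queue;
}
```

Operations are then submitted with a plain dispatch_async(queue, ...), with no inner dispatch to the main queue, and the dispatch_group_notify block calls dispatch_resume(queue) exactly as in your code.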
First I create a serial queue like this
static dispatch_queue_t queue = dispatch_queue_create("myQueue", DISPATCH_QUEUE_SERIAL);
then, at some unknown point in time a task gets added to the queue like this
dispatch_async(queue, ^{
// do something, which takes some time
});
If the first task hasn't finished yet, the new task will wait until the first completes (that's of course what a serial queue is for).
But if I add 5 new tasks to the queue, while the original first one is still running, I don't want to execute new task no.1, then no.2, then no.3 and so on, but want to get rid of tasks 1 to 4 and directly start executing task no.5 after the original first task has finished.
In other words, I want to pop any waiting task (not the one that is currently running) off the queue, if I add a new one.
Is there a built-in mechanism for this, or do I have to implement it myself? And for the latter, how would I identify single tasks inside a queue and remove them?
Once a block has been submitted to a GCD dispatch queue, it will run. There is no way to cancel it. You can, as you know, implement your own mechanism to "abort" the block execution early.
An easier way to do this would be to use NSOperationQueue, as it already provides an implementation for canceling pending operations (i.e., those not yet running), and you can easily enqueue a block with the new-ish addOperationWithBlock method.
Though NSOperationQueue is implemented using GCD, I find GCD much easier to use in most cases. However, in this case, I would seriously consider using NSOperationQueue because it already handles canceling pending operations.
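A sketch of how that could look for the "only the newest pending task matters" case. ReplacePendingWork is a hypothetical helper name, and it assumes the queue has maxConcurrentOperationCount set to 1 so that it behaves like your serial queue, and that submissions come from a single thread.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: drops every operation still waiting in `queue`,
// then enqueues `block` as the only pending task.
// Assumes queue.maxConcurrentOperationCount == 1 and a single submitting thread.
static void ReplacePendingWork(NSOperationQueue *queue, void (^block)(void)) {
    // Operations that have not started are cancelled and never run their block.
    // The operation that is already running is only marked as cancelled;
    // its block keeps going unless it checks for cancellation itself.
    [queue cancelAllOperations];
    [queue addOperationWithBlock:block];
}
```

With this, adding five tasks while the first one is still running leaves only the fifth to run after it.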
With David's answer getting me on track, I succeeded in doing this like so:
taskCounter++;
dispatch_async(queue, ^{
if (taskCounter > 1) {
taskCounter--;
NSLog(@"%@", @"skip");
return;
}
NSLog(@"%@", @"start");
// do stuff
sleep(3);
taskCounter--;
NSLog(@"%@", @"done");
});
taskCounter has to be either an ivar or a property (initialize it with 0). In that case it doesn't even need the __block attribute.
The way you handle this is to use an ivar that indicates to the queued blocks they should just return:
^{
if(!canceled) {
... do work
}
}
You don't need to use a simple boolean either - you can make this more complex - but the general idea is to use one or more ivars that the block queries before doing anything.
I use this technique (but did not invent it) with great success.
If you add a DispatchWorkItem to the queue instead of a closure, you can cancel it as long as it hasn't started executing yet.
In the following code, backgroundWorkItem will never run, because it is cancelled before it starts executing.
let backgroundWorkItem = DispatchWorkItem {
print("Background work item executed")
}
DispatchQueue.main.async(execute: backgroundWorkItem)
backgroundWorkItem.cancel()
Say I have a low-priority job, like checking queues to see whether there are URLs to grab, or similar.
How can I do that?
I can use a timer to check every second, but I am sure there are better ways.
You can use GCD to schedule the task on a low-priority global queue.
dispatch_async(dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_LOW, 0), ^{
//Your code here
});
You can put this code within a loop, timer or whatever you want.