Concurrent queue vs private dispatch queue - objective-c

If I need to create a large number of queues (say 10+ queues for image loading), is it faster to use the global concurrent queue or to create the same number of private dispatch queues? On a quad-core CPU, is the global concurrent queue limited to four concurrent tasks, effectively becoming serial for any further queued tasks?

I'd suggest creating your own queue that constrains how many concurrent operations are permitted. For example, you could create a single NSOperationQueue with maxConcurrentOperationCount set to four or five, then add all of your synchronous image retrieval requests to it:
NSOperationQueue *queue = [[NSOperationQueue alloc] init];
queue.maxConcurrentOperationCount = 5;
Then just add all of your image requests with something like:
[queue addOperationWithBlock:^{
    // request image
}];
You can get fancier than that, but this is a basic alternative to your two suggestions, and it ensures that you never have more than five concurrent network requests.
Note, for this to work (as with your GCD suggestions), your operations themselves must be synchronous. If they are not, you have to do some extra work to make sure each operation doesn't complete until the task it performs does.
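For example, if the image request is itself asynchronous, one blunt way to keep the operation alive until the request finishes is to block it on a semaphore. This is just a sketch; downloadImageWithURL:completion: is a hypothetical asynchronous helper standing in for whatever API you actually use:
[queue addOperationWithBlock:^{
    dispatch_semaphore_t done = dispatch_semaphore_create(0);
    // hypothetical asynchronous API; substitute your own
    [self downloadImageWithURL:imageURL completion:^(UIImage *image) {
        // handle the image here
        dispatch_semaphore_signal(done);
    }];
    // keep this operation "running" until the asynchronous work signals completion
    dispatch_semaphore_wait(done, DISPATCH_TIME_FOREVER);
}];
(The cleaner alternative is a custom NSOperation subclass that only reports isFinished once the underlying task completes.)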
If you want to know when they're all done, you can use a completion operation:
NSOperation *completionOperation = [NSBlockOperation blockOperationWithBlock:^{
    // this is what will happen when they're done
}];
Then add your operations:
NSOperation *operation = [NSBlockOperation blockOperationWithBlock:^{
    // do network request here
}];
[completionOperation addDependency:operation];
[queue addOperation:operation];
And when done queuing all of your individual operations, you can then queue that completion operation, which won't fire until the rest are done (because you've declared a dependency between them):
[queue addOperation:completionOperation];

Which is faster depends on the work being done, of course.
The global concurrent queue attempts to match the number of concurrent activities to the available hardware. That's not documented, so it might or might not match the number of cores (or perhaps double the number of cores if they're hyper-threaded and the work permits). If queued actions block (e.g. on network activity or disk I/O), the global queue will start new jobs.
You can create your own queues to force the issue, but that probably won't help. If you have four cores, then queueing up 10 or 20 or however many simultaneous CPU-heavy actions isn't going to improve overall speed. Once you max out resources, you've maxed them out, and adding more private queues doesn't change that.

Related

GCD - does a serial queue require an `NSLock` or a memory barrier to synchronize work?

I read the Apple documentation on GCD queues and started to wonder what happens if I, let's say, modify an instance member of type NSMutableArray (which is not thread safe) in a serial queue. The serial queue guarantees that the operations execute serially, but I still feel that I need either a @synchronized block or some other technique to force a memory barrier, since as far as I understand the tasks on my serial queue can be invoked on different threads. Is that correct? Here is a simple example:
@interface Foo : NSObject
- (void)addNumber:(NSNumber *)number;
- (void)printNumbers;
- (void)clearNumbers;
@end

@implementation Foo
{
    dispatch_queue_t _queue;
    NSMutableArray<NSNumber *> *_numbers;
}

- (instancetype)init
{
    if (self = [super init])
    {
        _queue = dispatch_queue_create(NULL, NULL);
        _numbers = [NSMutableArray array];
    }
    return self;
}

- (void)addNumber:(NSNumber *)number
{
    dispatch_async(_queue, ^{
        [_numbers addObject:number];
    });
}

- (void)printNumbers
{
    dispatch_async(_queue, ^{
        for (NSNumber *number in _numbers)
        {
            NSLog(@"%@", number);
        }
    });
}

- (void)clearNumbers
{
    dispatch_async(_queue, ^{
        _numbers = [NSMutableArray array];
    });
}
@end
As far as I understand, I could run into memory issues here if I call these methods from arbitrary threads? Or does GCD give some guarantees under the hood that mean I do not need to force memory barriers? Looking at examples, I did not find such constructs anywhere, but coming from C++ it would make sense to touch the member variable only under a lock.
If your queue is a serial queue, it will only allow one operation at a time, no matter which thread it's running on. Therefore, if every access to a resource occurs on the queue, there's no need to further protect that resource with a lock or a semaphore. In fact, it's possible to use dispatch queues as a locking mechanism, and for some applications, it can work quite well.
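For example, a synchronous read can be layered on top of the serial queue from the question. This is only a sketch of the pattern; numberCount is a method added here for illustration:
// a synchronous accessor built on the serial _queue;
// dispatch_sync blocks the caller until the queue has executed the block
- (NSUInteger)numberCount
{
    __block NSUInteger count;
    dispatch_sync(_queue, ^{
        count = _numbers.count;
    });
    return count;
}
(Just be careful never to call dispatch_sync targeting a serial queue from a block already running on that queue; that deadlocks.)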
Now if your queue is a concurrent queue, then that's a different story, since multiple operations can run at the same time on a concurrent queue. However, GCD provides the dispatch_barrier_sync and dispatch_barrier_async APIs. Operations that you start via these two function calls will cause the queue to wait until all other operations finish before executing your block, and then disallow any more operations from running until your block is finished. In this way, they can temporarily make the queue behave like a serial queue, allowing even a concurrent queue to be used as a sort of locking mechanism (for example, allowing reads of a resource via a normal dispatch_sync call, but doing writes via dispatch_barrier_async; if reads occur very frequently and writes very infrequently, this can perform quite well).
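Here is a minimal sketch of that reader/writer idea. Note that barriers only have this effect on concurrent queues you create yourself; on the global queues, the barrier variants behave like plain dispatch_async/dispatch_sync:
// assumes: _queue = dispatch_queue_create("com.example.numbers", DISPATCH_QUEUE_CONCURRENT);
- (NSNumber *)firstNumber
{
    __block NSNumber *number;
    dispatch_sync(_queue, ^{             // reads may run concurrently
        number = _numbers.firstObject;
    });
    return number;
}
- (void)addNumber:(NSNumber *)number
{
    dispatch_barrier_async(_queue, ^{    // writes run exclusively
        [_numbers addObject:number];
    });
}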
The serial queue is a data lock, so no further locking / synchronization is needed, at least as far as this code is concerned. The fact that the same queue may be executed using different threads is an implementation detail about which you should not be thinking; queues are the coin of the realm.
There may, of course, be issues in regard to sharing the array between this queue and the main queue, but that's a different matter.

Limiting number of threads

I need to download images on background threads, but limit the number of threads. The maximum number of threads must be 5, and each thread must have just one serial queue. For client-server communication I am using the SocketRocket library. The main trouble is that I don't need NSOperation extras like cancelling operations. I'm looking for a simple solution, but could only find something like this:
self.limitingSema = dispatch_semaphore_create(kOperationLimit);
dispatch_queue_t concurrentQueue = dispatch_queue_create("limiting queue", DISPATCH_QUEUE_CONCURRENT);
dispatch_async(concurrentQueue, ^{
    dispatch_semaphore_wait(self.limitingSema, DISPATCH_TIME_FOREVER);
    /* upload image here */
    dispatch_semaphore_signal(self.limitingSema);
});
But then how do I limit the number of threads and make new operations wait until a slot in the queue frees up?
Is it a good idea to control the number of queues instead?
NSArray *queues = @[dispatch_queue_create("com.YOU.binaryQueue_1", DISPATCH_QUEUE_SERIAL),
                    dispatch_queue_create("com.YOU.binaryQueue_2", DISPATCH_QUEUE_SERIAL),
                    dispatch_queue_create("com.YOU.binaryQueue_3", DISPATCH_QUEUE_SERIAL)];

NSUInteger randQueue = arc4random() % [queues count];
dispatch_async([queues objectAtIndex:randQueue], ^{
    NSLog(@"Do something");
});

randQueue = arc4random() % [queues count];
dispatch_async([queues objectAtIndex:randQueue], ^{
    NSLog(@"Do something else");
});
GCD has no option to limit the number of concurrent blocks running.
The semaphore approach will potentially create one thread that just sits waiting for each operation you enqueue. GCD dynamically adjusts the number of threads it uses; if you enqueue another block and GCD has no more threads available, it will spin up a new one whenever it notices a free CPU core. Since a worker thread sleeping inside your block leaves its CPU free, this can create many threads and use up a lot of memory (each thread gets 512 KB of stack).
Your best option is to use NSOperationQueue for this, as it lets you control directly how many operations run in parallel via the maxConcurrentOperationCount property. This will be easier (less code for you to write, test, and debug) and much more efficient.
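For example, a minimal sketch (imageURLs and downloadImageAtURL: are hypothetical stand-ins for your SocketRocket-based download code):
NSOperationQueue *downloadQueue = [[NSOperationQueue alloc] init];
downloadQueue.maxConcurrentOperationCount = 5; // at most 5 downloads in flight
for (NSURL *url in imageURLs) {
    [downloadQueue addOperationWithBlock:^{
        // hypothetical synchronous download; substitute your own code
        [self downloadImageAtURL:url];
    }];
}
The queue holds any further operations until one of the five running ones finishes, without parking a thread per pending download.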

GCD deadlocks when all running tasks are waiting, does not start pending tasks

I have a recursive function that schedules new tasks on a concurrent queue. I would like to limit the number of simultaneously scheduled tasks, so I use a semaphore: each task waits on it until older tasks finish and signal the semaphore.
However, I find that the queue deadlocks once the maximum number of running threads (64) is reached and they all begin waiting on the semaphore. GCD then doesn't start new tasks even though it has plenty in its pending queue.
What am I doing wrong? Here is my code:
- (void)applicationDidFinishLaunching:(NSNotification *)aNotification
{
    dispatch_semaphore_t sem = dispatch_semaphore_create(10);
    dispatch_semaphore_wait(sem, DISPATCH_TIME_FOREVER);
    dispatch_async(dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0), ^{
        [self recurWithSemaphore:sem];
    });
}

- (void)recurWithSemaphore:(dispatch_semaphore_t)sem
{
    // do some lengthy work here...

    // at this point we're done all but scheduling new tasks, so let new tasks be created
    dispatch_semaphore_signal(sem);

    for (NSUInteger i = 0; i < 100; ++i)
    {
        // don't schedule new tasks until we have enough semaphore
        dispatch_semaphore_wait(sem, DISPATCH_TIME_FOREVER);
        dispatch_async(dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0), ^{
            [self recurWithSemaphore:sem];
        });
    }
}
The typical pattern when using semaphores to control access to a limited resource is:
1. Create the semaphore with a non-zero value.
2. When each task starts, "wait" on the semaphore (thereby consuming one of the available signals, or, if none is available, waiting for one).
3. When each task finishes, "signal" the semaphore (making it available for another task).
So, let's say you wanted to start 1,000,000 tasks, running only 4 concurrently at any given time; you could do something like:
dispatch_semaphore_t semaphore = dispatch_semaphore_create(4);
dispatch_queue_t queue = ... // some concurrent queue, either global or your own

dispatch_async(queue, ^{
    for (long index = 0; index < 1000000; index++) {
        dispatch_semaphore_wait(semaphore, DISPATCH_TIME_FOREVER);
        dispatch_async(queue, ^{
            [self performSomeActionWithIndex:index completion:^{
                dispatch_semaphore_signal(semaphore);
            }];
        });
    }
});
Clearly, if you're dynamically adding more tasks to perform, you could change this from a for loop into a while loop, checking some synchronized source, but the idea is the same.
The key observation, though, is that performSomeActionWithIndex does not itself recursively create new tasks (because then you get into the deadlock situation of the original question, where tasks are stalled because they can't start new tasks).
Now, I don't know if your problem can be refactored into this sort of pattern, but if you can, this might be an option.
By the way, for the sake of completeness, I'd point out that the typical solution for controlling the degree of concurrency is to use operation queues rather than dispatch queues, in which case you can specify maxConcurrentOperationCount.
As you correctly pointed out, there are memory implications to that. In my tests, each scheduled operation takes up at least 500 bytes (and likely more in real-world scenarios), so if you really have more than, say, 5,000-10,000 tasks to schedule, operation queues may quickly become impractical. As you advised, future readers should refer to the Performance Implications section of the Concurrency Programming Guide: Concurrency and Application Design.
I know that this is not a viable approach in your case, but I mention it for the benefit of future readers: I would generally advise using operation queues when you need to control the degree of concurrency, and would only jump to an approach like the one outlined above when dealing with so many tasks that you can't reasonably just schedule them all on an operation queue.

Understanding dispatch_async

I have a question about this code:
dispatch_async(dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0), ^{
    NSData *data = [NSData dataWithContentsOfURL:kLatestKivaLoansURL];
    [self performSelectorOnMainThread:@selector(fetchedData:)
                           withObject:data
                        waitUntilDone:YES];
});
The first parameter of this code is
dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0)
Are we asking this code to perform tasks serially on a global queue whose very definition is that it returns a concurrent queue of a given priority level?
What is the advantage of using dispatch_get_global_queue over the main queue?
I am confused. Could you please help me to understand this better.
The main reason you use the default queue over the main queue is to run tasks in the background.
For instance, if I am downloading a file from the internet and I want to update the user on the progress of the download, I will run the download on the default-priority global queue and update the UI on the main queue asynchronously.
dispatch_async(dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0), ^(void) {
    // Background Thread
    dispatch_async(dispatch_get_main_queue(), ^(void) {
        // Run UI Updates
    });
});
All of the DISPATCH_QUEUE_PRIORITY_X queues are concurrent queues (meaning they can execute multiple tasks at once), and are FIFO in the sense that tasks within a given queue will begin executing using "first in, first out" order. This is in comparison to the main queue (from dispatch_get_main_queue()), which is a serial queue (tasks will begin executing and finish executing in the order in which they are received).
So, if you send 1000 dispatch_async() blocks to DISPATCH_QUEUE_PRIORITY_DEFAULT, those tasks will start executing in the order you sent them into the queue. Likewise for the HIGH, LOW, and BACKGROUND queues. Anything you send into any of these queues is executed in the background on alternate threads, away from your main application thread. Therefore, these queues are suitable for executing tasks such as background downloading, compression, computation, etc.
Note that the order of execution is FIFO on a per-queue basis. So if you send 1000 dispatch_async() tasks to the four different concurrent queues, evenly splitting them and sending them to BACKGROUND, LOW, DEFAULT and HIGH in order (i.e. you schedule the last 250 tasks on the HIGH queue), it's very likely that the first tasks you see starting will be on that HIGH queue, as the system has taken your hint that those tasks need to get to the CPU as quickly as possible.
Note also that I say "will begin executing in order", but keep in mind that, as these are concurrent queues, tasks won't necessarily FINISH executing in order; that depends on the length of each task.
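A toy example makes this visible (the exact interleaving will vary from run to run):
dispatch_queue_t queue = dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0);
dispatch_async(queue, ^{
    [NSThread sleepForTimeInterval:2.0]; // long task, starts first
    NSLog(@"slow task finished");
});
dispatch_async(queue, ^{
    [NSThread sleepForTimeInterval:0.1]; // short task, starts second
    NSLog(@"fast task finished");        // but will almost certainly log first
});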
As per Apple:
https://developer.apple.com/library/content/documentation/General/Conceptual/ConcurrencyProgrammingGuide/OperationQueues/OperationQueues.html
A concurrent dispatch queue is useful when you have multiple tasks that can run in parallel. A concurrent queue is still a queue in that it dequeues tasks in a first-in, first-out order; however, a concurrent queue may dequeue additional tasks before any previous tasks finish. The actual number of tasks executed by a concurrent queue at any given moment is variable and can change dynamically as conditions in your application change. Many factors affect the number of tasks executed by the concurrent queues, including the number of available cores, the amount of work being done by other processes, and the number and priority of tasks in other serial dispatch queues.
Basically, if you send those 1000 dispatch_async() blocks to a DEFAULT, HIGH, LOW, or BACKGROUND queue, they will all start executing in the order you send them, but shorter tasks may finish before longer ones. This happens when CPU cores are available, or when the current queue's tasks are computationally non-intensive (making the system think it can dispatch additional tasks in parallel regardless of core count).
The level of concurrency is handled entirely by the system and is based on system load and other internally determined factors. This is the beauty of Grand Central Dispatch (the dispatch_async() system) - you just make your work units as code blocks, set a priority for them (based on the queue you choose) and let the system handle the rest.
So to answer your above question: you are partially correct. You are "asking that code" to perform concurrent tasks on a global concurrent queue at the specified priority level. The code in the block will execute in the background and any additional (similar) code will execute potentially in parallel depending on the system's assessment of available resources.
The "main" queue on the other hand (from dispatch_get_main_queue()) is a serial queue (not concurrent). Tasks sent to the main queue will always execute in order and will always finish in order. These tasks will also be executed on the UI Thread so it's suitable for updating your UI with progress messages, completion notifications, etc.
Swift version
This is the Swift version of David's Objective-C answer. You use the global queue to run things in the background and the main queue to update the UI.
DispatchQueue.global(qos: .background).async {
    // Background Thread
    DispatchQueue.main.async {
        // Run UI Updates
    }
}

How do queues and threading work?

This is related to the Grand Central Dispatch API used in objective-c, with the following codes:
dispatch_queue_t downloadQueue = dispatch_queue_create("other queue", NULL);
dispatch_async(downloadQueue, ^{
    // ...some function that retrieves data from the server...
    dispatch_async(dispatch_get_main_queue(), ^{
        NSLog(@"got it");
    });
});
dispatch_release(downloadQueue);
My current understanding of how queues work is that the blocks in a queue will run on a thread for that queue. So two queues will become two threads, and with multi-threading those two queues will run simultaneously.
However, the "got it" appears right at when the program received the data. How did that happen?
Please point out if you want to correct or add to my understanding of threading and queue.
So two queues will become two threads.
Not necessarily. One of the advantages of GCD is that the system dynamically decides how many threads it creates, depending on the number of available CPU cores and other factors. It might well be that two custom queues are executed on the same background thread, especially if there are rarely tasks for both queues waiting to be executed.
The only thing you can be certain about is that a serial queue never uses more than one thread at the same time. So the tasks you add to the same (serial) queue will always be executed in order. This is not the case for the concurrent global queues you get with dispatch_get_global_queue().
Additionally, the main queue (the one you access with dispatch_get_main_queue()) is always bound to the main thread. It is the only queue whose tasks are executed on the program's main thread.
In your example, the task for the downloadQueue gets executed on a background thread. As soon as the code reaches the inner dispatch_async(dispatch_get_main_queue(), ...) call, GCD pushes that new task onto the main queue, where it gets executed practically immediately, provided the main thread is not busy with other things.
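If you want to see this for yourself, a small experiment is to log the current thread from two different custom serial queues (what you observe will vary with system load):
dispatch_queue_t queueA = dispatch_queue_create("com.example.queueA", NULL);
dispatch_queue_t queueB = dispatch_queue_create("com.example.queueB", NULL);
dispatch_async(queueA, ^{ NSLog(@"A runs on %@", [NSThread currentThread]); });
dispatch_async(queueB, ^{ NSLog(@"B runs on %@", [NSThread currentThread]); });
The two queues may report the same underlying thread or different ones; GCD decides, and the answer can change from run to run.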