Priority control with semaphore - Objective-C

Suppose I have a semaphore to control access to a dispatch_queue_t.
I wait for the semaphore (dispatch_semaphore_wait) before scheduling a block on the dispatch queue.
dispatch_semaphore_wait(semaphore, DISPATCH_TIME_FOREVER);
dispatch_async(queue, ^{ /* do work */ dispatch_semaphore_signal(semaphore); });
Suppose I have work waiting in several separate locations. Some "work" has higher priority than other "work".
Is there a way to control which of the "work" will be scheduled next?
Additional information: using a serial queue without a semaphore is not an option for me because each piece of "work" consists of its own queue with several blocks. All of a work queue's blocks have to run, or none of them. No work queues can run simultaneously. I have all of this working fine, except for the priority control.
Edit: (in response to Jeremy, moved from comments)
Ok, suppose you have a device/file/whatever like a printer. A print job consists of multiple function calls/blocks (print header, then print figure, then print text, ...) grouped together in a transaction. Put these blocks on a serial queue. One queue per transaction.
However, you can have multiple print jobs/transactions. Blocks from different print jobs/transactions cannot be mixed. So how do you ensure that a transaction queue runs all of its jobs, and that a transaction queue is not started before another queue has finished? (I am not printing; I'm just using this as an example.)
Semaphores are used to regulate the use of finite resources.
https://www.mikeash.com/pyblog/friday-qa-2009-09-25-gcd-practicum.html
Concurrency Programming Guide
The next step I am trying to figure out is how to run one transaction before another.

You are misusing the API here. You should not be using semaphores to control what gets scheduled to dispatch queues.
If you want to serialize execution of blocks on the queue, then use a serial queue rather than a concurrent queue.
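For the transaction scenario described in the question, a minimal sketch of the serial-queue approach might look like the following; the scheduler-queue parameter and the print-step blocks are illustrative, not from the original post:
#include <dispatch/dispatch.h>

// Hypothetical "scheduler" queue: because it is serial, each submitted
// transaction runs all of its blocks to completion before the next
// transaction can begin, with no semaphore involved.
static void submitTransaction(dispatch_queue_t schedulerQ,
                              dispatch_block_t printHeader,
                              dispatch_block_t printFigure,
                              dispatch_block_t printText)
{
    dispatch_async(schedulerQ, ^{
        printHeader();   // the whole job is one indivisible unit
        printFigure();
        printText();
    });
}
Each call enqueues one indivisible transaction; calls from different places are simply ordered by arrival on the scheduler queue.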
If different blocks that you are enqueuing have different priorities, then you should express that using the QOS mechanisms added in OS X 10.10 and iOS 8.0. If you need to run on older systems, then you can use the different-priority global concurrent queues for the appropriate work. Beyond that, there isn't much control on older systems.
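A minimal sketch of those QOS mechanisms, assuming OS X 10.10 / iOS 8.0 or later (the queue label and the work inside the blocks are illustrative):
#include <dispatch/dispatch.h>
#include <Block.h>

static void scheduleWithQOS(void)
{
    // A serial queue whose blocks run at user-initiated QOS.
    dispatch_queue_attr_t attr = dispatch_queue_attr_make_with_qos_class(
        DISPATCH_QUEUE_SERIAL, QOS_CLASS_USER_INITIATED, 0);
    dispatch_queue_t q = dispatch_queue_create("com.example.work", attr);

    // An individual block can carry its own, lower QOS.
    dispatch_block_t low = dispatch_block_create_with_qos_class(
        0, QOS_CLASS_UTILITY, 0, ^{ /* lower-priority work */ });
    dispatch_async(q, low);
    Block_release(low);   // dispatch_async took its own copy

    // Pre-10.10 fallback: the priority-tiered global concurrent queues.
    dispatch_async(dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_HIGH, 0), ^{
        /* higher-priority work */
    });
}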
Furthermore, semaphores inherently work against priority inheritance: since there is no way for the system to determine who will signal the semaphore, you can easily end up in a situation where a higher-priority thread is blocked for a long time waiting for a lower-priority thread to signal the semaphore. This is called priority inversion.

Related

Serial queues and sync operations

I'm studying multithreading and I would like some clarification on the subject matter.
As far as I know, a SERIAL queue executes tasks serially, always executing one task at a time.
Now, a SYNCHRONOUS function is a function that returns only after all tasks complete.
Now, I'm a bit confused. What is the difference between those two?
If I understand correctly, both of them will block the current thread (if they are not "covered" in a global concurrent queue), and both of them execute tasks exactly in FIFO order.
So, what exactly is the difference between them? Yes, I understand that serial is a property of a queue, and sync is a function (or operation). But their functionality seems similar.
You are comparing a queue with a function, so it is difficult to define "difference". Using a serial queue does guarantee sequential behaviour of its operations. Typically, you use a synchronous dispatch if your program has to wait for all queued operations to complete before your program completes. If every dispatch on a given queue is synchronous, then indeed there is no difference between using a queue or calling the operations.
However, here is a very useful case that shows the difference. Suppose operation A is lengthy and you do not want to block. Suppose operation B returns something computed by operation A, but it is called some arbitrary time later (like in response to a user action). You dispatch_async A onto the queue. Your program is not blocked. Sometime later, you need the result. You dispatch_sync operation B on the same serial queue.
Now if A is already complete, the queue is empty when you add B and B executes immediately. But (and here is the good part) if A is still executing (asynchronously), B is not dispatched until A is done, so your program is blocked until the result it needs is ready.
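A minimal sketch of that pattern, where computeA() and useResult() are hypothetical stand-ins for the real operations:
#include <dispatch/dispatch.h>

static long computeA(void) { /* lengthy work */ return 42; }
static void useResult(long r) { (void)r; /* consume the value */ }

static void example(void)
{
    dispatch_queue_t q =
        dispatch_queue_create("com.example.serial", DISPATCH_QUEUE_SERIAL);
    __block long result = 0;

    // A: lengthy, dispatched asynchronously; the caller is not blocked.
    dispatch_async(q, ^{ result = computeA(); });

    // ... some arbitrary time later, when the result is needed ...

    // B: dispatched synchronously on the same serial queue. If A is still
    // running, this call waits exactly until A is done; if A has already
    // finished, it runs immediately.
    dispatch_sync(q, ^{ useResult(result); });
}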
For more explanation of this, see here:
The dangers of deadlock, nicely handled for you by GCD.

RabbitMQ: throttling fast producer against large queues with slow consumer

We're currently using RabbitMQ, where a continuously super-fast producer is paired with a consumer constrained by a limited resource (e.g. slow-ish MySQL inserts).
We don't like declaring a queue with x-max-length, since all messages will be dropped or dead-lettered once the limit is reached, and we don't want to lose messages.
Adding more consumers is easy, but they'll all be limited by the one shared resource, so that won't work. The problem still remains: How to slow down the producer?
Sure, we could put a flow-control flag in Redis, memcached, MySQL or something else that the producer reads, as pointed out in an answer to a similar question, or, perhaps better, the producer could periodically test the queue length and throttle itself, but these seem like hacks to me.
I'm mostly questioning whether I have a fundamental misunderstanding. I had expected this to be a common scenario, and so I'm wondering:
What is best practice for throttling producers? How is this done with RabbitMQ? Or do you do this in a completely different way?
Background
Assume the producer actually knows how to slow itself down given the right input. E.g. a hardware sensor or hardware random number generator that can generate as many events as needed.
In our particular real case, we have an API that users can use to add messages. Instead of devouring and discarding messages, we'd like to apply back-pressure by having our API return an error if the queue is "full", so the caller/user knows to back-off, or have the API block until the consumer catches up. We don't control our user, so regardless of how fast the consumer is, I can create a producer that is faster.
I was hoping for something like the API for a TCP socket, where a write() can block and where a select() can be used to determine if a handle is writable. So either having the RabbitMQ API block or have it return an error if the queue is full.
Regarding the x-max-length property: you said you don't want messages to be dropped or dead-lettered. There has since been an update adding more capabilities for this. As specified in the documentation:
"Use the overflow setting to configure queue overflow behaviour. If overflow is set to reject-publish, the most recently published messages will be discarded. In addition, if publisher confirms are enabled, the publisher will be informed of the reject via a basic.nack message"
So as I understand it, you can use a queue limit to reject new messages from publishers, thus pushing some backpressure upstream.
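As a sketch of declaring such a bounded queue with the rabbitmq-c client (the queue name and length limit are illustrative, and this assumes an already-opened connection with channel 1 open):
#include <amqp.h>

void declare_bounded_queue(amqp_connection_state_t conn)
{
    amqp_table_entry_t entries[2];

    entries[0].key = amqp_cstring_bytes("x-max-length");
    entries[0].value.kind = AMQP_FIELD_KIND_I32;
    entries[0].value.value.i32 = 10000;             /* illustrative limit */

    entries[1].key = amqp_cstring_bytes("x-overflow");
    entries[1].value.kind = AMQP_FIELD_KIND_UTF8;
    entries[1].value.value.bytes = amqp_cstring_bytes("reject-publish");

    amqp_table_t args = { .num_entries = 2, .entries = entries };

    amqp_queue_declare(conn, 1, amqp_cstring_bytes("jobs"),
                       0 /* passive */, 1 /* durable */, 0 /* exclusive */,
                       0 /* auto_delete */, args);

    /* With publisher confirms enabled, a rejected publish comes back as a
       basic.nack instead of being silently dropped. */
    amqp_confirm_select(conn, 1);
}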
I don't think this is in any way RabbitMQ-specific. Basically, you have a scenario where two systems have different processing capabilities, and this mismatch will either pose a risk of overflowing the queue (whatever it may be) or, in the case of a constant mismatch between producer and consumer, simply create more and more time-distance between an event's creation and its handling.
I used to deal with this kind of scenario, and unfortunately there is no magic bullet. You either have to speed up event handling (better hardware, more suitable software?) or throttle event creation (which has nothing to do with the MQ, really).
Now, I would ask what the goal is and how the events are produced. Are the events produced constantly, at either an unlimited or a very high rate (for example, readings from sensors: the more, the better), or are they created in batches/spikes (for example, user requests in specific time periods, batch loads from a CRM system)? I assume the goal is to process everything, because you mention you don't want to lose any queued message.
If the output is constant, then some limiter (either an internal counter, if the producer is the only producer, or external queue-length checks, if the queue can be filled by some other system) is definitely in order:
IF eventsInTimePeriod/timePeriod > estimatedConsumerBandwidth
THEN LowerRate()
ELSE RaiseRate()
In real-world scenarios we used to simply limit the output manually to the estimated values, and we set alerts on queue length, time from queue entry to queue exit, etc. Where such limiters were omitted (mostly by mistake), we would later find tasks that were supposed to be handled within a few hours still waiting their turn after three months.
I'm afraid it's hard to answer "How to slow down the producer?" when we know nothing about it, but some ideas are: the aforementioned rate check, or maybe a blocking AddMessage method:
AddMessage(message)
    WHILE (getQueueLength() > maxAllowedQueueLength)
        spin(1000); // or sleep or whatever
    mqAdapter.AddMessage(message)
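A C rendering of that pseudocode might look like the following, where get_queue_length() and mq_publish() are stand-ins for whatever the client library actually provides:
#include <unistd.h>

extern long get_queue_length(void);       /* hypothetical: e.g. a queue-length poll */
extern void mq_publish(const char *msg);  /* hypothetical: the real publish call */

#define MAX_ALLOWED_QUEUE_LENGTH 10000

void add_message(const char *msg)
{
    /* Back-pressure: block the caller until the queue drains. */
    while (get_queue_length() > MAX_ALLOWED_QUEUE_LENGTH)
        sleep(1);   /* or a smarter back-off; mirrors spin(1000) above */

    mq_publish(msg);
}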
I'd say it all depends on the specifics of the producer application, and on your architecture in general.

Blocking call on two queues?

I have an algorithm (a task in VxWorks) that reads data from multiple queues in order to manage priorities accordingly. Now, the msgQReceive() function can be set to WAIT_FOREVER, which makes it a blocking call until something is available to receive and process. But how can I do this if I have multiple queues? Currently I check in a while(1) loop whether any of the queues have contents, and receive them if so; but if nothing is there, my algorithm just spins and spins and eats CPU resources for nothing. How can I best prevent this?
You should be able to use VxWorks events coupled with a Message Queue.
See msgQEvStart function and Kernel Programmer's Guide, section 7.9.
This is akin to using a select() for I/O operation.
You do a blocking eventReceive(), which returns a bitmask indicating which queue has content, and then do a non-blocking msgQReceive() to retrieve the data.
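A sketch of that pattern, assuming two queues of which one is higher priority (handleHigh() and handleLow() are hypothetical handlers, and the event option flags should be checked against your VxWorks version):
#include <vxWorks.h>
#include <msgQLib.h>
#include <eventLib.h>

extern void handleHigh(const char *msg);
extern void handleLow(const char *msg);

void serviceTask(MSG_Q_ID qHigh, MSG_Q_ID qLow)
    {
    char   buf[128];
    UINT32 events;

    /* Ask each queue to post an event bit when it has content. */
    msgQEvStart(qHigh, VXEV01, EVENTS_SEND_IF_FREE);
    msgQEvStart(qLow,  VXEV02, EVENTS_SEND_IF_FREE);

    for (;;)
        {
        /* Block (no spinning) until at least one queue has content. */
        eventReceive(VXEV01 | VXEV02, EVENTS_WAIT_ANY, WAIT_FOREVER, &events);

        /* Drain the higher-priority queue first, without blocking. */
        if (events & VXEV01)
            while (msgQReceive(qHigh, buf, sizeof(buf), NO_WAIT) != ERROR)
                handleHigh(buf);
        if (events & VXEV02)
            while (msgQReceive(qLow, buf, sizeof(buf), NO_WAIT) != ERROR)
                handleLow(buf);
        }
    }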
Or you can look at How can a task wait on multiple vxworks Queues?, which I wrote a while ago.
As already mentioned, you could use events; alternatively, if you can use a pipe instead of a msgQ, you could potentially use select().
As another alternative, perhaps consider having multiple tasks, each servicing a single msgQ

Understanding dispatch_async

I have question around this code
dispatch_async(dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0), ^{
    NSData *data = [NSData dataWithContentsOfURL:kLatestKivaLoansURL];
    [self performSelectorOnMainThread:@selector(fetchedData:)
                           withObject:data
                        waitUntilDone:YES];
});
The first parameter of this code is
dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0)
Are we asking this code to perform serial tasks on a global queue, whose very definition is that it returns a global concurrent queue of a given priority level?
What is the advantage of using dispatch_get_global_queue over the main queue?
I am confused. Could you please help me to understand this better.
The main reason you use the default queue over the main queue is to run tasks in the background.
For instance, if I am downloading a file from the internet and I want to update the user on the progress of the download, I will run the download in the default-priority global queue and update the UI in the main queue asynchronously.
dispatch_async(dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0), ^(void){
    //Background Thread
    dispatch_async(dispatch_get_main_queue(), ^(void){
        //Run UI Updates
    });
});
All of the DISPATCH_QUEUE_PRIORITY_X queues are concurrent queues (meaning they can execute multiple tasks at once), and are FIFO in the sense that tasks within a given queue will begin executing using "first in, first out" order. This is in comparison to the main queue (from dispatch_get_main_queue()), which is a serial queue (tasks will begin executing and finish executing in the order in which they are received).
So, if you send 1000 dispatch_async() blocks to DISPATCH_QUEUE_PRIORITY_DEFAULT, those tasks will start executing in the order you sent them into the queue. Likewise for the HIGH, LOW, and BACKGROUND queues. Anything you send into any of these queues is executed in the background on alternate threads, away from your main application thread. Therefore, these queues are suitable for executing tasks such as background downloading, compression, computation, etc.
Note that the order of execution is FIFO on a per-queue basis. So if you send 1000 dispatch_async() tasks to the four different concurrent queues, evenly splitting them and sending them to BACKGROUND, LOW, DEFAULT and HIGH in order (i.e. you schedule the last 250 tasks on the HIGH queue), it's very likely that the first tasks you see starting will be on that HIGH queue, as the system has taken your hint that those tasks need to get to the CPU as quickly as possible.
Note also that I say "will begin executing in order", but keep in mind that, as these are concurrent queues, things won't necessarily FINISH executing in order, depending on how long each task takes.
As per Apple:
https://developer.apple.com/library/content/documentation/General/Conceptual/ConcurrencyProgrammingGuide/OperationQueues/OperationQueues.html
A concurrent dispatch queue is useful when you have multiple tasks that can run in parallel. A concurrent queue is still a queue in that it dequeues tasks in a first-in, first-out order; however, a concurrent queue may dequeue additional tasks before any previous tasks finish. The actual number of tasks executed by a concurrent queue at any given moment is variable and can change dynamically as conditions in your application change. Many factors affect the number of tasks executed by the concurrent queues, including the number of available cores, the amount of work being done by other processes, and the number and priority of tasks in other serial dispatch queues.
Basically, if you send those 1000 dispatch_async() blocks to a DEFAULT, HIGH, LOW, or BACKGROUND queue, they will all start executing in the order you send them. However, shorter tasks may finish before longer ones. Reasons for this include the availability of CPU cores, and queue tasks that are performing computationally non-intensive work (thus letting the system dispatch additional tasks in parallel regardless of core count).
The level of concurrency is handled entirely by the system and is based on system load and other internally determined factors. This is the beauty of Grand Central Dispatch (the dispatch_async() system): you just package your work units as code blocks, set a priority for them (based on the queue you choose) and let the system handle the rest.
So to answer your above question: you are partially correct. You are "asking that code" to perform concurrent tasks on a global concurrent queue at the specified priority level. The code in the block will execute in the background and any additional (similar) code will execute potentially in parallel depending on the system's assessment of available resources.
The "main" queue on the other hand (from dispatch_get_main_queue()) is a serial queue (not concurrent). Tasks sent to the main queue will always execute in order and will always finish in order. These tasks will also be executed on the UI Thread so it's suitable for updating your UI with progress messages, completion notifications, etc.
Swift version
This is the Swift version of David's Objective-C answer. You use the global queue to run things in the background and the main queue to update the UI.
DispatchQueue.global(qos: .background).async {
    // Background Thread
    DispatchQueue.main.async {
        // Run UI Updates
    }
}

How does VxWorks prioritize interrupt bottom-halves?

Suppose I have two tasks, 'A' and 'B', of differing priority executing on SMP-supported VxWorks. Both 'A' and 'B' issue a command to an I/O device (such as a disk or NIC) and both block waiting for results. That is, both 'A' and 'B' are blocked at the same time. Some time later, the I/O device raises an interrupt and the ISR is invoked. The ISR then dispatches deferred work (aka "bottom-half") to a worker-task. Question: What is the priority of the worker-task?
The VxWorks Device Driver Developer's Guide is a bit vague. It appears that the priority of the worker-task is set a priori. There are no automatic inheritance mechanisms that will increase the priority of the worker-task based on the priorities of the tasks ('A' and 'B') that are blocked waiting for results. This is similar to how threaded interrupt priorities work in PREEMPT_RT Linux. However, both QNX Neutrino and LynxOS will schedule the worker-task with the maximum priority of the blocked tasks, e.g. priority(worker) = max_priority(A, B).
Can anyone clarify?
It depends on exactly which mechanism the ISR uses to dispatch the deferred work.
If a semaphore/message queue/event is used, then the recipient task (A or B) will run at the priority specified when that task was created. In this scenario, the interrupt is essentially finished, and the task (A and/or B) is ready to run.
Whichever task has the highest priority will get to run and perform its work. Note that the task doesn't have access to any information from the interrupt context. If you use global structures (yuck) or pass data via a message queue, then the task can access those elements.
The network stack task (tNetTask) uses this approach: a semaphore signals tNetTask when a packet has been received. When tNetTask has processed the packet (packet reassembly, etc.), it is forwarded to whichever task is waiting on the corresponding socket.
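A sketch of that semaphore-based deferral; the priority (50) and the names are illustrative, and the point is that the worker's priority is fixed a priori at taskSpawn() time rather than inherited from A or B:
#include <vxWorks.h>
#include <semLib.h>
#include <taskLib.h>

LOCAL SEM_ID workSem;

void myIsr (void)          /* "top half": minimal work in interrupt context */
    {
    /* capture device status into driver state here */
    semGive (workSem);     /* safe to call from an ISR */
    }

void workerTask (void)     /* "bottom half" */
    {
    for (;;)
        {
        semTake (workSem, WAIT_FOREVER);
        /* process the deferred work, then wake A or B (e.g. via msgQ/sem) */
        }
    }

void startWorker (void)
    {
    workSem = semBCreate (SEM_Q_PRIORITY, SEM_EMPTY);
    taskSpawn ("tWorker", 50, 0, 4096, (FUNCPTR) workerTask,
               0,0,0,0,0,0,0,0,0,0);
    }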
It is also possible to defer work from an ISR to tExcTask (via a call to excJobAdd). Note that with this approach, excJobAdd takes a pointer to a function and executes that function in the context of tExcTask (which runs at the highest priority in the system). It does not act as a self-contained task.
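For completeness, a hedged sketch of the excJobAdd() route; the six-integer-argument signature is from the VxWorks 5.x excLib and should be verified against your kernel version, and deferredWork() is hypothetical:
#include <vxWorks.h>
#include <excLib.h>

extern void deferredWork (int devId);   /* hypothetical bottom-half function */

void myIsr2 (void)
    {
    /* Runs deferredWork() in tExcTask's context (highest priority). */
    excJobAdd ((VOIDFUNCPTR) deferredWork, 0 /* devId */, 0, 0, 0, 0, 0);
    }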
Note that some things, like file systems, SCSI drivers, USB, etc., are much more than a simple driver with interrupts. They include a number of different components, which unfortunately also increases complexity.