Checking multiple fences with one fence status check - Vulkan

Checking fence status to see whether a queue submission has completed should be avoided as much as possible. A common pattern is that every frame the fences for earlier queue submissions are checked, and then command buffers and any other resources are released (either destroyed or recycled). Say you want to check whether all submissions made on a certain frame, n or n + 1 frames ago, have completed. You could check all the fences for multiple submissions in one go. Except that you can't: vkGetFenceStatus only takes one fence as an argument (or is there a way to check multiple fences?).
Would it be a good idea to make a dummy submission whose wait semaphores cover the multiple previous submissions, and then check the fence status of that one dummy submission?

It's not the checking of status that is expensive; it's blocking and waiting for completion, which stalls the CPU and drains the pipeline.
You solve this by pipelining your queues deeply enough that you only ever "wait" on semaphores/fences that are already complete. This typically requires N-buffering resources so you have enough resources to avoid the need to wait "too soon". If you do this properly, I'd expect the CPU cost of testing the fences here to be negligible.
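Note that while vkGetFenceStatus takes a single fence, vkWaitForFences accepts an array of fences, and with a timeout of zero it acts as a non-blocking status poll across all of them. A minimal sketch, assuming you track the fences for each frame slot yourself:

#include <vulkan/vulkan.h>

/* Poll all fences belonging to the frame submitted N frames ago, without blocking. */
VkBool32 frame_work_done(VkDevice device, const VkFence *fences, uint32_t fenceCount)
{
    /* waitAll = VK_TRUE, timeout = 0: returns VK_SUCCESS only if every
       fence is already signaled, VK_TIMEOUT otherwise; no stall either way. */
    VkResult r = vkWaitForFences(device, fenceCount, fences, VK_TRUE, 0);
    return (r == VK_SUCCESS) ? VK_TRUE : VK_FALSE;
}

When this returns true you can recycle that frame's command buffers and call vkResetFences on the array before reusing it.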


How to program factory reset switch in a small embedded device

I am building a small embedded device. It has a reset switch, and when the switch is held down for more than 5 seconds, the whole device should reset, clear all its data, and return to the factory state.
I know what to clear when this event happens. What I want to know is how to raise the event: when the switch is pressed, how do I design the system to know that 5 seconds have elapsed and that it must reset now? I am looking for a high-level design involving timers and interrupts. Can someone please help me?
It depends on the device, but here are a few rough ideas:
The device manual may state how many interrupts per second are produced while the switch is held down. If you have this value, you can easily count out the 5 seconds.
If not, you will need a timer as well: start the timer when you get the first "switch down" interrupt and count up to 5 seconds.
Note that you should also monitor for "switch up", i.e. the release of the switch; hopefully there is an interrupt for that too (possibly with a different status value).
You should break out of the above loop (and skip the reset) when you see that interrupt.
Hope this helps.
Interrupt-driven means low level, close to the hardware. An interrupt-driven solution, on for example a bare metal microcontroller, would look like this:
Like when reading any other switch, sample the switch n number of times and filter out the signal bounce (and potential EMI).
Start a hardware timer. Usually the on-chip timers are far too fast to count a whole 5 seconds, even when you set it to run as slow as possible. So you need to set the timer with a pre-scale value, picked so that one whole timer cycle equals a known time unit (like for example 10 milliseconds).
Upon timer overflow, trigger an interrupt. Inside the interrupt, check that the switch is still pressed, then increase a counter. When the counter reaches a given value, execute the reset code. For example, if you get a timer overflow every 10 milliseconds, your counter should count up to 5000ms/10ms = 500.
If the switch is released before the time has elapsed, reset the counter and stop the timer interrupt.
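A minimal sketch of the counter logic from the steps above, assuming the timer was started by the switch-press interrupt and overflows every 10 milliseconds; switch_is_pressed(), factory_reset(), and stop_timer_interrupt() are hypothetical board-specific helpers:

#define TICKS_FOR_RESET 500u             /* 5000 ms / 10 ms per overflow */

extern int  switch_is_pressed(void);     /* hypothetical debounced read */
extern void factory_reset(void);         /* hypothetical reset routine */
extern void stop_timer_interrupt(void);  /* hypothetical timer disable */

static volatile unsigned int hold_ticks;

/* Hypothetical 10 ms timer-overflow ISR; registration is MCU-specific. */
void timer_overflow_isr(void)
{
    if (switch_is_pressed()) {
        if (++hold_ticks >= TICKS_FOR_RESET)
            factory_reset();             /* held for 5 s: do the reset */
    } else {
        hold_ticks = 0;                  /* released early: start over */
        stop_timer_interrupt();
    }
}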
How to reset the system is highly system-specific. You should put the system in a safe state, then overwrite your current settings by overwriting the NVM where the settings are stored with default factory settings stored elsewhere in NVM. Once that is done, force the processor to reset itself and reboot with the new settings in place.
This means that you must have a system with electronically-erasable NVM. Depending on the size of the data, this NVM could either be data flash on-chip in a microcontroller, or some external memory circuit.
Detecting a 5 s or 30 s hold can be done using a GPIO interrupt.
If using an RTOS:
- The interrupt wakes a thread from sleep and disables itself.
- All the thread does is count how long the switch has been pressed (scanning the switch at regular intervals).
- If the switch is pressed for the desired time, set a global variable/setting in EEPROM that triggers the factory reset function.
- Otherwise, enable the interrupt again and put the thread back to sleep.
- Also, use a de-bounce circuit to avoid spurious triggers.
Also, define what you mean by "factory reset". There are two kinds in general; in both cases I will illustrate using EEPROM:
Revert all configuration (low cost, easier)
In this case, you partition the EEPROM into a working configuration and a factory configuration. You copy the factory configuration over the working partition and perform a software reset.
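A sketch of this first case, assuming hypothetical eeprom_read()/eeprom_write() drivers and an illustrative partition layout:

#include <stdint.h>

#define FACTORY_PART_ADDR  0x0000u   /* illustrative EEPROM layout */
#define WORKING_PART_ADDR  0x0400u
#define CONFIG_SIZE        0x0400u

extern void eeprom_read(uint16_t addr, uint8_t *buf, uint16_t len);         /* hypothetical */
extern void eeprom_write(uint16_t addr, const uint8_t *buf, uint16_t len);  /* hypothetical */
extern void system_software_reset(void);                                    /* hypothetical */

void restore_factory_config(void)
{
    uint8_t buf[64];

    /* Copy the factory partition over the working partition in chunks. */
    for (uint16_t off = 0; off < CONFIG_SIZE; off += sizeof buf) {
        eeprom_read(FACTORY_PART_ADDR + off, buf, sizeof buf);
        eeprom_write(WORKING_PART_ADDR + off, buf, sizeof buf);
    }
    system_software_reset();   /* reboot with the restored settings */
}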
Restore the complete firmware (costly, needs more testing)
This is more tricky, but can be done with the help of a bootloader that allows flashing from EEPROM or an SD card.
In this case the binary firmware blob is stored along with the factory configuration in the safe partition, and is used to reflash the controller's flash and configuration.
It all depends on size/memory and cost; this can be designed in many more ways, I am just laying out the simplest examples.
I created some products with a combined switch too. I did it by using a capacitor to initiate a reset pulse on the reset pin of the device (current and voltage levels limited by some resistors and/or diodes). At start-up I monitor the state of the input pin connected to the switch: I simply wait until this pin goes high, with a time-out of 5 seconds. In case of a time-out I reset my configuration to defaults.

Serial queues and sync operations

I'm studying multithreading and I want some clarification on the subject matter.
As far as I know, a SERIAL queue executes tasks serially, always executing one task at a time.
A SYNCHRONOUS function, on the other hand, is a function that returns only after all its tasks complete.
Now, I'm a bit confused. What is the difference between those two?
If I understand correctly, both of them will block the current thread (if they are not "covered" by a global concurrent queue), and both of them execute tasks in exact FIFO order.
So, what exactly is the difference between them? Yes, I understand that serial is a property of a queue, and sync is a function (or operation). But their functionality seems very similar.
You are comparing a queue with a function, so it is difficult to define "difference". Using a serial queue does guarantee sequential behaviour of its operations. Typically, you use a synchronous dispatch if your program has to wait for all queued operations to complete before your program completes. If every dispatch on a given queue is synchronous, then indeed there is no difference between using a queue or calling the operations.
However, here is a very useful case that shows the difference. Suppose operation A is lengthy and you do not want to block. Suppose operation B returns something computed by operation A, but it is called some arbitrary time later (like in response to a user action). You dispatch_async A onto the queue. Your program is not blocked. Sometime later, you need the result. You dispatch_sync operation B on the same serial queue.
Now if A is already complete, the queue is empty when you add B and B executes immediately. But (and here is the good part) if A is still executing (asynchronously), B is not dispatched until A is done, so your program is blocked until the result it needs is ready.
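A minimal sketch of that pattern (the queue name and values are illustrative):

#include <dispatch/dispatch.h>
#include <stdio.h>

int main(void)
{
    dispatch_queue_t q = dispatch_queue_create("com.example.work", DISPATCH_QUEUE_SERIAL);
    __block int result = 0;

    /* Operation A: lengthy, dispatched asynchronously; the caller is not blocked. */
    dispatch_async(q, ^{
        result = 42;   /* stand-in for the long computation */
    });

    /* ... the program keeps doing other things ... */

    /* Operation B: needs A's result. Because the queue is serial, this
       synchronous dispatch cannot start (or return) until A has finished. */
    dispatch_sync(q, ^{
        printf("result = %d\n", result);
    });
    return 0;
}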
For more explanation of this, see the Concurrency Programming Guide: the dangers of deadlock are nicely handled for you by GCD.

Priority control with semaphore

Suppose I have a semaphore to control access to a dispatch_queue_t.
I wait for the semaphore (dispatch_semaphore_wait) before scheduling a block on the dispatch queue.
dispatch_semaphore_wait(semaphore, DISPATCH_TIME_FOREVER);
dispatch_async(queue, ^{ /* do work */ dispatch_semaphore_signal(semaphore); });
Suppose I have work waiting in several separate locations. Some "work" has higher priority than other "work".
Is there a way to control which of the "work" will be scheduled next?
Additional information: using a serial queue without a semaphore is not an option for me, because each piece of "work" consists of its own queue with several blocks. All of a work queue's blocks have to run, or none of them. No work queues can run simultaneously. I have all of this working fine, except for the priority control.
Edit: (in response to Jeremy, moved from comments)
Ok, suppose you have a device/file/whatever like a printer. A print job consists of multiple function calls/blocks (print header, then print figure, then print text,...) grouped together in a transaction. Put these blocks on a serial queue. One queue per transaction.
However, you can have multiple print jobs/transactions. Blocks from different print jobs/transactions cannot be mixed. So how do you ensure that a transaction queue runs all of its jobs, and that one transaction queue is not started before another has finished? (I am not printing, just using this as an example.)
Semaphores are used to regulate the use of finite resources.
https://www.mikeash.com/pyblog/friday-qa-2009-09-25-gcd-practicum.html
Concurrency Programming Guide
The next step I am trying to figure out is how to run one transaction before another.
You are misusing the API here. You should not be using semaphores to control what gets scheduled to dispatch queues.
If you want to serialize execution of blocks on the queue, then use a serial queue rather than a concurrent queue.
If different blocks that you are enqueuing have different priority, then you should express that different priority using the QOS mechanisms added in OS X 10.10 and iOS 8.0. If you need to run on older systems, then you can use the different priority global concurrent queues for appropriate work. Beyond that, there isn't much control on older systems.
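For example, on those systems each transaction's serial queue can be tagged with a QoS class when it is created. A minimal sketch (queue names are illustrative):

#include <dispatch/dispatch.h>

static void make_transaction_queues(void)
{
    /* High priority transactions: something the user is waiting on. */
    dispatch_queue_attr_t hiAttr =
        dispatch_queue_attr_make_with_qos_class(DISPATCH_QUEUE_SERIAL,
                                                QOS_CLASS_USER_INITIATED, 0);
    dispatch_queue_t hiQ = dispatch_queue_create("com.example.txn.hi", hiAttr);

    /* Low priority transactions: background housekeeping. */
    dispatch_queue_attr_t loAttr =
        dispatch_queue_attr_make_with_qos_class(DISPATCH_QUEUE_SERIAL,
                                                QOS_CLASS_UTILITY, 0);
    dispatch_queue_t loQ = dispatch_queue_create("com.example.txn.lo", loAttr);

    dispatch_async(hiQ, ^{ /* urgent transaction blocks run here */ });
    dispatch_async(loQ, ^{ /* background transaction blocks run here */ });
}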
Furthermore, semaphores inherently work against priority inheritance since there is no way for the system to determine who will signal the semaphore and thus you can easily end up in a situation where a higher priority thread will be blocked for a long time waiting for a lower priority thread to signal the semaphore. This is called priority inversion.

VxWorks signals

I have a question regarding a previous question asked in the VxWorks forum.
My goal is that when the high priority function generates a signal, the low priority function handles it immediately (the high priority function must be preempted).
The code is:
sig_hdr () { ... }
task_low_priority() {
...
// Install signal handler for SIGUSR1
signal(SIGUSR1, sig_hdr);
...
}
task_high_priority() {
...
kill(pid, SIGUSR1); //pid is the ID of task_low_priority
...
}
After the line:
signal(SIGUSR1, sig_hdr);
I added
taskDelay(0)
I wanted to block the high priority task so that the low priority task could gain the CPU and execute the signal handler, but that does not happen unless I use taskDelay(1).
Can anyone explain why it does not work with taskDelay(0)?
Indeed, taskDelay(0) will not let lower priority tasks run because of the following:
high priority task is executing
high priority task issues taskDelay(0)
The scheduler is invoked and scans for the next task to run; it will select the highest priority task that is "ready"
The task that issued the taskDelay(0) is ready, because the delay has already expired (i.e. 0 ticks have elapsed)
So the high priority task is rescheduled immediately; in this case taskDelay(0) is effectively a waste of CPU cycles.
Now in the case where you issue taskDelay(1), the same steps are followed, but the difference is that the high priority task is not in the ready state, because one tick has not yet elapsed. A lower priority task that is ready can therefore get one tick of CPU time before being preempted by the high priority task.
Now there are some poorly designed systems out there that do things like:
taskLock();
...
taskDelay(0);
...
taskUnlock();
With the intention of having a low priority task hog the CPU until some point where it then allows a high priority task to take over by issuing a taskDelay(0). However if you play games like this then you should reconsider your design.
Also, in your case I would consider a more robust design: rather than doing a taskDelay() to allow a low priority task to process an event, send a message to the low priority task and have it process a message queue, while your high priority task blocks on a semaphore given by your event handler or something similar. As it stands you are hoping to force a ping-pong between two tasks to get a job done; if you instead add a queue to act as a buffer, then as long as your system is schedulable (i.e. there is enough time to respond to all events, queue them up, and fully process them) it will work.
Update
I assume your system is supposed to be something like this:
Event occurs (interrupt driven?).
High priority task runs to gather data.
Data is processed by low priority task.
If this is the case the pattern you want to follow is actually quite simple, and in fact could be accomplished with just 1 task:
Interrupt handler gathers data and sends a message (msgQSend()) to the task.
The task pends on the message queue with msgQReceive().
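A minimal sketch of that pattern with the native calls (queue sizing and the payload type are illustrative):

#include <vxWorks.h>
#include <msgQLib.h>

typedef struct { int sample; } EVENT_DATA;   /* illustrative payload */

static MSG_Q_ID evtQ;   /* created at init: msgQCreate(16, sizeof(EVENT_DATA), MSG_Q_FIFO) */

/* ISR: gather the data and hand it off without blocking (NO_WAIT is mandatory in an ISR). */
void device_isr(void)
{
    EVENT_DATA d;
    d.sample = 0;   /* placeholder for reading the hardware */
    msgQSend(evtQ, (char *)&d, sizeof d, NO_WAIT, MSG_PRI_NORMAL);
}

/* Processing task: pends until the ISR posts an event. */
void process_task(void)
{
    EVENT_DATA d;
    for (;;) {
        if (msgQReceive(evtQ, (char *)&d, sizeof d, WAIT_FOREVER) != ERROR) {
            /* process d.sample at task level */
        }
    }
}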
But it might help if I knew more about your system (what you are really trying to do) and also why you are using POSIX calls rather than native VxWorks calls.
If you are new to real time systems, you should learn about Rate monotonic analysis, there is a very brief summary on wikipedia:
http://en.wikipedia.org/wiki/Rate-monotonic_scheduling
Also note that in VxWorks a "high priority" is 0 and a "low priority" is 255; the actual numbers are inversely related to their meaning :D
This is exactly the point I don't understand: how will the low priority task get CPU time while the high priority task is running?
The high priority task will continue to run until it gets blocked. Once it is blocked, lower priority tasks that are ready to run will run.
My answer has 2 parts:
1. How to use taskDelay() correctly with VxWorks
2. taskDelay() is not the correct solution for your problem
First part:
taskDelay() in VxWorks can be confusing:
taskDelay(0) – performs no delay at all!
It is a command to the scheduler to remove the current task from the CPU. If it is still the highest priority ready task in the system, it returns to the head of the queue with no delay at all. You would use this call when the scheduler is configured for FIFO scheduling among tasks of the same priority and your task runs a CPU-hungry real-time function: it can then try to release the CPU to other tasks of the same priority (being nice).
BTW, it is the same as taskDelay(NO_WAIT).
taskDelay(1) – delays the calling task for somewhere between zero (!!!) and 1 system tick, because delays in VxWorks end on a system tick boundary.
taskDelay(2) – somewhere between 1 and 2 system ticks.
3 …… (and so on)
taskDelay(-1) (a.k.a. taskDelay(WAIT_FOREVER)) – delays the task forever (not recommended).
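Since the argument is in ticks, convert from wall-clock time with sysClkRateGet(), for example:

#include <vxWorks.h>
#include <taskLib.h>
#include <sysLib.h>

/* Delay approximately 'ms' milliseconds, rounded to whole ticks.
   As noted above, the actual delay may be up to one tick shorter. */
void delay_ms(int ms)
{
    int ticks = (ms * sysClkRateGet()) / 1000;
    taskDelay(ticks > 0 ? ticks : 1);
}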
Second part:
Using taskDelay to let a low priority task run may well be the wrong idea. You didn't provide all of the problem information, but please note that delaying the high priority task will not ensure that your low priority task runs (regardless of the sleep time you write): other tasks with priorities between your high and low priority tasks might run for the whole 'sleep time'.
There are several synchronization mechanisms in VxWorks, like binary semaphores, changing task priority, signals, …
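For example, a binary semaphore gives a clean handoff: the low priority task pends on it, and the high priority task gives it and then blocks on its own next event instead of guessing with taskDelay(). A minimal sketch:

#include <vxWorks.h>
#include <semLib.h>

static SEM_ID wakeLow;   /* created at init: semBCreate(SEM_Q_PRIORITY, SEM_EMPTY) */

void task_low_priority(void)
{
    for (;;) {
        semTake(wakeLow, WAIT_FOREVER);   /* pend until signaled */
        /* handle the event at low priority */
    }
}

void task_high_priority(void)
{
    for (;;) {
        /* ... urgent work ... */
        semGive(wakeLow);   /* make the low priority task ready */
        /* then pend on this task's own event source; while it is
           blocked here, the low priority task gets the CPU */
    }
}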

How does VxWorks prioritize interrupt bottom-halves?

Suppose I have two tasks, 'A' and 'B', of differing priority executing on SMP-supported VxWorks. Both 'A' and 'B' issue a command to an I/O device (such as a disk or NIC) and both block waiting for results. That is, both 'A' and 'B' are blocked at the same time. Some time later, the I/O device raises an interrupt and the ISR is invoked. The ISR then dispatches deferred work (aka "bottom-half") to a worker-task. Question: What is the priority of the worker-task?
VxWorks Device Driver Developer's Guide is a bit vague. It appears that the priority of the worker-task is set a priori. There are no automatic inheritance mechanisms that will increase the priority of the worker-task based upon the priorities of tasks ('A' and 'B') that are blocked waiting for results. This is similar to how threaded interrupt priorities work in PREEMPT_RT Linux. However, both QNX Neutrino and LynxOS will schedule the worker-task with the maximum priority of the blocked tasks -- e.g. priority(worker) = max_priority(A, B).
Can anyone clarify?
It depends on exactly which mechanism the ISR uses to dispatch the deferred work.
If a semaphore/messageQueue/Event is used, then the recipient task (A or B) will run at the priority specified when the task was created. In this scenario, the interrupt is essentially finished, and the task (A and/or B) are ready to run.
Whichever task has the highest priority will get to run and perform its work. Note that the task doesn't have access to any information from the interrupt context. If you use global structures (yuk) or pass data via a message queue, then the task can access those elements.
The network stack task (tNetTask) uses this approach, and a semaphore signals tNetTask when a packet has been received. When tNetTask has processed the packet (packet reassembly, etc...), it is then forwarded to whichever task is waiting on the corresponding socket.
It is possible to defer work from an ISR to tExcTask (via a call to excJobAdd()). Note that with this approach excJobAdd() takes a pointer to a function and executes that function in the context of tExcTask (which runs at the highest priority in the system); it does not act as a self-contained task.
Note that some things like file systems, SCSI drivers, USB, etc... are much more than a simple driver with interrupts. They include a number of different components that unfortunately also increases complexity.