It is said that using sleep() is poor design, but what about waiting for hardware to settle? - embedded

I am using sleep() in two ways in my current embedded (real time) software design:
To throttle a processing loop, but this is discussed here, and as pointed out thread priority will most likely answer very well for that.
Waiting for hardware to "settle". Lets say I am writing an interface to some hardware. Communications with the hardware is all good, but I want to change its mode and I know it only takes a small number of instruction cycles to do it. I am using a sleep(1); to pause briefly to allow for this. I could setup a loop that keeps pinging it until I receive a valid response, but this would arguably be harder to read (much more code) and, in fact, slower because of data transfer times. In fact I could probably do a usleep(100) or less in my case.
So my question is, is this a good practice? And if not, is there a better/efficient alternative?

Callback
The most ideal solution to this would be to have the hardware notify you when a particular operation is complete through some form of callback/signal.
When writing production code, I would almost always favor this solution above all others. Of course this is provided that the api you are using exposes such methods.
Poll
If there is no way for you to receive such events then the only other option would be for you to check if the operation has completed. The most naive solution would be one that constantly checks (spin-lock).
However, if you know roughly how long an operation should take you could always sleep for that duration, wake-up, check operation status then sleep again or continue.
If you are 100% sure about the timings and can guarantee that your thread is not woken up early then you can rely solely on sleep.
Poor Design?
I wouldn't necessarily say that using sleep for this task is poor design. Sometimes you have no other choice. I would say that to rely solely on sleep is poor design when you cannot guarantee timing because you can not be 100% sure that the operation you are waiting for has in fact completed.

In Linux I use sigsuspend it suspends the software until it receives a signal.
Example
My main thread needs a data, but this data isn't ready, so main thread is suspended.
Other thread reads the data and when it finishes, it fires a signal.
Now then main thread continues and it has the data ready.
If you use sleep the data can be ready or not.

Related

Operating System Basics

I am reading process management,and I have a few doubts-
What is meant by an I/o request,for E.g.-A process is executing and
hence it is in running state,it is in waiting state if it is waiting
for the completion of an I/O request.I am not getting by what is meant by an I/O request,Can you
please give an example to elaborate.
Another doubt is -Lets say that a process is executing and suddenly
an interrupt occurs,then the process stops its execution and will be
put in the ready state,is it possible that some other process began
its execution while the interrupt is also being processed?
Regarding the first question:
A simple way to think about it...
Your computer has lots of components. CPU, Hard Drive, network card, sound card, gpu, etc. All those work in parallel and independent of each other. They are also generally slower than the CPU.
This means that whenever a process makes a call that down the line (on the OS side) ends up communicating with an external device, there is no point for the OS to be stuck waiting for the result since the time it takes for that operation to complete is probably an eternity (in the CPU view point of things).
So, the OS fires up whatever communication the process requested (call it IO request), flags the process as waiting for IO, and switches execution to another process so the CPU can do something useful instead of sitting around blocked waiting for the IO request to complete.
When the external device finishes whatever operation was requested, it generates an interrupt, so the OS is informed the work is done, and it can then flag the blocked process as ready again.
This is all a very simplified view of course, but that's the main idea. It allows the CPU to do useful work instead of waiting for IO requests to complete.
Regarding the second question:
It's tricky, even for single CPU machines, and depends on how the OS handles interrupts.
For code simplicity, a simple OS might for example, whenever an interrupt happens process the interrupt in one go, then resume whatever process it decides it's appropriate whenever the interrupt handling is done. So in this case, no other process would run until the interrupt handling is complete.
In practice, things get a bit more complicated for performance and latency reasons.
If you think about an interrupt lifetime as just another task for the CPU (From when the interrupt starts to the point the OS considers that handling complete), you can effectively code the interrupt handling to run in parallel with other things.
Just think of the interrupt as notification for the OS to start another task (that interrupt handling). It grabs whatever context it needs at the point the interrupt started, then keeps processing that task in parallel with other processes.
I/O request generally just means request to do either Input , Output or both. The exact meaning varies depending on your context like HTTP, Networks, Console Ops, or may be some process in the CPU.
A process is waiting for IO: Say for example you were writing a program in C to accept user's name on command line, and then would like to print 'Hello User' back. Your code will go into waiting state until user enters their name and hits Enter. This is a higher level example, but even on a very low level process executing in your computer's processor works on same basic principle
Can Processor work on other processes when current is interrupted and waiting on something? Yes! You better hope it does. Thats what scheduling algorithms and stacks are for. However the real answer depending on what Architecture you are on, does it support parallel or serial processing etc.

operating system - context switches

I have been confused about the issue of context switches between processes, given round robin scheduler of certain time slice (which is what unix/windows both use in a basic sense).
So, suppose we have 200 processes running on a single core machine. If the scheduler is using even 1ms time slice, each process would get its share every 200ms, which is probably not the case (imagine a Java high-frequency app, I would not assume it gets scheduled every 200ms to serve requests). Having said that, what am I missing in the picture?
Furthermore, java and other languages allows to put the running thread to sleep for e.g. 100ms. Am I correct in saying that this does not cause context switch, and if so, how is this achieved?
So, suppose we have 200 processes running on a single core machine. If
the scheduler is using even 1ms time slice, each process would get its
share every 200ms, which is probably not the case (imagine a Java
high-frequency app, I would not assume it gets scheduled every 200ms
to serve requests). Having said that, what am I missing in the
picture?
No, you aren't missing anything. It's the same case in the case of non-pre-emptive systems. Those having pre-emptive rights(meaning high priority as compared to other processes) can easily swap the less useful process, up to an extent that a high-priority process would run 10 times(say/assume --- actual results are totally depending on the situation and implementation) than the lowest priority process till the former doesn't produce the condition of starvation of the least priority process.
Talking about the processes of similar priority, it totally depends on the Round-Robin Algorithm which you've mentioned, though which process would be picked first is again based on the implementation. And, Windows and Unix have same process scheduling algorithms. Windows and Unix does utilise Round-Robin, but, Linux task scheduler is called Completely Fair Scheduler (CFS).
Furthermore, java and other languages allows to put the running thread
to sleep for e.g. 100ms. Am I correct in saying that this does not
cause context switch, and if so, how is this achieved?
Programming languages and libraries implement "sleep" functionality with the aid of the kernel. Without kernel-level support, they'd have to busy-wait, spinning in a tight loop, until the requested sleep duration elapsed. This would wastefully consume the processor.
Talking about the threads which are caused to sleep(Thread.sleep(long millis)) generally the following is done in most of the systems :
Suspend execution of the process and mark it as not runnable.
Set a timer for the given wait time. Systems provide hardware timers that let the kernel register to receive an interrupt at a given point in the future.
When the timer hits, mark the process as runnable.
I hope you might be aware of threading models like one to one, many to one, and many to many. So, I am not getting into much detail, jut a reference for yourself.
It might appear to you as if it increases the overhead/complexity. But, that's how threads(user-threads created in JVM) are operated upon. And, then the selection is based upon those memory models which I mentioned above. Check this Quora question and answers to that one, and please go through the best answer given by Robert-Love.
For further reading, I'd suggest you to read from Scheduling Algorithms explanation on OSDev.org and Operating System Concepts book by Galvin, Gagne, Silberschatz.

How to determine what's blocking the main thread

So I restructured a central part in my Cocoa application (I really had to!) and I am running into issues since then.
Quick outline: my application controls the playback of QuickTime movies so that they are in sync with external timecode.
Thus, external timecode arrives on a CoreMIDI callback thread and gets posted to the application about 25 times per sec. The sync is then checked and adjusted if it needs to be.
All this checking and adjusting is done on the main thread.
Even if I put all the processing on a background thread it would be a ton of work as I'm currently using a lot of GCD blocks and I would need to rewrite a lot of functions so that they can be called from NSThread. So I would like to make sure first if it will solve my problem.
The problem
My Core MIDI callback is always called in time, but the GCD block that is dispatched to the main queue is sometimes blocked for up to 500 ms. Understandable that adjusting the sync does not quite work if that happens. I couldn't find a reason for it, so I'm guessing that I'm doing something that blocks the main thread.
I'm familiar with Instruments, but I couldn't find the right mode to see what keeps my messages from being processed in time.
I would appreciate if anyone could help.
Don't know what I can do about it.
Thanks in advance!
Watchdog
You can use watch dog that stop when the main thread stopped for time
https://github.com/wojteklu/Watchdog
you can install it using cocoapod
pod 'Watchdog'
You may be blocking the main thread or you might be flooding it with events.
I would suggest three things:
Grab a timestamp for when the timecode arrives in the CoreMIDI callback thread (see mach_absolute_time(). Then grab the current time when your main thread work is being done. You can then adjust accordingly based on how much time has elapsed between posting to the main thread and it actually being processed.
create some kind of coalescing mechanism such that when your main thread is blocked, interim timecode events (that are now out of date) are tossed. This can be as simple as a global NSUInteger that is incremented every time an event is received. The block dispatched to the main queue could capture the current value on creation, then check it when it is processed. If it differs by more than N (N for you to determine), then toss the event because more are in flight.
consider not sending an event to the main thread for every timecode notification. 25 adjustments per second is a lot of work. If processing only 5 per second yields a "good enough" perceptual experience, then that is an awful lot of work saved.
In general, instrumenting the main event loop is a bit tricky. The CPU profiler in Instruments can be quite helpful. It may come as a surprise, but so can the Allocations instrument. In particular, you can use the Allocations instrument to measure memory throughput. If there are tons of transient (short lived) allocations, it'll chew up a ton of CPU time doing all those allocations/deallocations.

Is it safe to access the hard drive via many different GCD queues?

Is it safe? For instance, if I create a bunch of different GCD queues that each compress (tar cvzf) some files, am I doing something wrong? Will the hard drive be destroyed?
Or does the system properly take care of such things?
Dietrich's answer is correct save for one detail (that is completely non-obvious).
If you were to spin off, say, 100 asynchronous tar executions via GCD, you'd quickly find that you have 100 threads running in your application (which would also be dead slow due to gross abuse of the I/O subsystem).
In a fully asynchronous concurrent system with queues, there is no way to know if a particular unit of work is blocked because it is waiting for a system resource or waiting for some other enqueued unit of work. Therefore, anytime anything blocks, you pretty much have to spin up another thread and consume another unit of work or risk locking up the application.
In such a case, the "obvious" solution is to wait a bit when a unit of work blocks before spinning up another thread to de-queue and process another unit of work with the hope that the first unit of work "unblocks" and continues processing.
Doing so, though, would mean that any asynchronous concurrent system with interaction between units of work -- a common case -- would be so slow as to be useless.
Far more effective is to limit the # of units of work that are enqueued in the global asynchronous queues at any one time. A GCD semaphore makes this quite easy; you have a single serial queue into which all units of work are enqueued. Every time you dequeue a unit of work, you increment the semaphore. Every time a unit of work is completed, you decrement the semaphore. As long as the semaphore is below some maximum value (say, 4), then you enqueue a new unit of work.
If you take something that is normally IO limited, such as tar, and run a bunch of copies in GCD,
It will run more slowly because you are throwing more CPU at an IO-bound task, meaning the IO will be more scattered and there will be more of it at the same time,
No more than N tasks will run at a time, which is the point of GCD, so "a billion queue entries" and "ten queue entries" give you the same thing if you have less than 10 threads,
Your hard drive will be fine.
Even though this question was asked back in May, it's still worth noting that GCD has now provided I/O primitives with the release of 10.7 (OS X Lion). See the man pages for dispatch_read and dispatch_io_create for examples on how to do efficient I/O with the new APIs. They are smart enough to properly schedule I/O against a single disk (or multiple disks) with knowledge of how much concurrency is, or is not, possible in the actual I/O requests.

Feedback from threads to main program

My software will simulate a few hundred hardware devices, each of which will send several thousand reports to a database server.
Trying it without threading did not give very good results, so now it's time to thread.
Since I am load testing the d/b server, some of those transactions will succeed and a few may fail. The GUI of the main program needs to reflect this. How should the threads communicate their results back to the main program? Update global variables? Send a message? Or something lese?
Now, if I update only at the end of each thread then the GUI is going to look rather boring (and I can't tell if the program hung). It might be nice to update the GUI periodically. But that might cause contention, with threads waiting for other threads to update (for instance, if I am writing to global variables, I need a mutex, which will block each thread which is waiting to write).
I'm new to threading. How is this normally done? Perhaps the main program could poll the threads, instead of the threads iforming the main program?
One way to organize this is for your threads to add messages to a thread-safe queue (e.g. a ConcurrentQueue) as they get data. To keep things simple you can have a timer thread in your UI that periodically dequeues the queued messages to a private list and then renders them. This design allows your threads to easily queue and forget messages with minimal contention, and for your UI to periodically update itself without blocking your writers too much (i.e. for only the period it takes to dequeue current messages to a private list).
Although you are attempting to simulate the load of hundreds of devices, using thread per device is not the way to model this as you can only run so many threads concurrently anyway.