Can calls to printf() block? - ssh

How does printf() (or writes to stdout in general) work regarding flow control and buffering? Is printf() guaranteed to return in reasonable amount of time, or will it block until it was actually able to write something to stdout?
I'm thinking about a slow SSH connection or something like that. Can that actually slow down some program at a printf()? Will that occur immediately, or only after some buffer has been filled up (some internal buffer of stdout, or possibly even the TCP send buffer)? Does using SSH make some other difference, besides possibly increasing latency and decreasing output speed?
If so, how is that circumvented? Are threads a common solution to that problem, or is there some easier way to set a flag or do a call to switch stdout to "non-blocking mode"? Or do I have some basic facts about the workings of Unix I/O completely wrong and the question doesn't make any sense? :)
I'm asking mainly out of interest, but the issue has recently come up on two occasions during experimenting with C and the shell, and I wonder where and how it should be addressed in general:
When developing an application that potentially does lots of output to the terminal for debugging or information purposes, can slow terminals potentially cause performance issues? Should I (as a programmer) avoid constructs like (for each x in y: print y) for large lists?
Or is that a problem that should best be handled by the application user instead of the developer, by redirecting the output to a file etc.? If so, are there any useful shell idioms to decouple program output from the actual execution, like mytool | cat (I'm not sure if that particular example would actually change anything, but cat could be replaced by something more sophisticated).

Yes, calls to printf() could block. One example is stdout connected to a pipe with a slow reader at the other end.
Whether this is an issue or how best to mitigate it really depends on the nature of the application, and why it's an issue in the first place.

Related

General question: Adding new test code to embedded system

this maybe will be off topic, but I am preparing for an exam in real time. And I have been browsing the book and Internet for an answer for a problem.
Basically I wonder if by adding additional test code if it may change the real time behavior for an embedded system, and or also if it will introduce new errors.
Anyone who might know the answer for this, or refer me to some reading material for it?
Your question is too general.. So I guess the default answer would be it depends.. But considering the possibilities as an exercise of logic and thought, yes it surely can!
There are many schemes available to guarantee the 'real-timeness' of an embedded system. For example, one can have a pre-emptive timer based ISR to service the real-time task.. In such a case, your test code could possibly not affect the 'real-timeness'.. But if the testing takes too long, and the context switches are not pre-emptive, you could get into trouble..
But again it depends on what you're testing and how you're testing. Your test code can possible mess with the timers, interrupts or the memory of system. The possibilities to mess up stuff if you're not careful are endless..
Having an OS underneath will prevent some errors, but again depending on how it works, you may be saved from bad 'test code'..
Yes, when you add code (test, diagnostic, statistic) it may change the real time behavior. It depends on the design, the implementation and the CPU power if it will actually change the behavior. You also have more lines of code and the probability for errors may increase. But I wouldn't say, "it will introduce errors", since it can introduce errors.
Yes it can. See How can adding data to a segment in flash memory screw up a program's timing? for an example of how even adding non-executable code can adjust timing enough to screw up a system.
Yea, changing your code base could totally change its timing. Consider if you dumped some debug output to a serial port, it takes time to call that function, format the data, and if the function is synchronous, then for it to wait for data to go out. This kinda stuff definitely changes system timing behavior.

overhead of exec() call?

I have a web script that is a simple wrapper around a perl program:
#!/bin/sh
source ~/.myinit // pull in some configurations like [PRIVATE_DIR]
exec [PRIVATE_DIR]/myprog.pl
This is really just to keep the code better compartmentalized, since the main program (myprog.pl) runs on different machines with different configuration modes and it's cleaner to not have to update that for every installation. However, I would be willing to sacrifice cleanliness for efficiency.
So the question is, does this extra sh exec() call add any non-negligible overhead, considering it could be hit by the webserver quite frequently (we have 1000s of users on at a time)? I ask, because I know that people have gone to great lengths to embed programs into httpd to avoid having to make an extra fork/exec call. My assumption has been that this was due to the overhead of the program being called (eg mod_perl bypasses the extremely slow perl startup), not the process of calling itself. Would that be accurate?
I've tried to benchmark this but I can't get the resolution I need to see if it makes a difference. Any advice on that would also be appreciated.
Thanks!
Try a test?
Assuming you have a test environmentn separate from production.
(and this is a little vague, and real testers will be able to amplify how to improve on this)
Run your current 2 level design in a tight loop, using the same param for 100,000? 1,000,000? ...
Then 'fix' your code so the perl is called directly, with the same params as above, looping for same count.
Capture performance stats for both runs. The diff between the two should be (roughly) the cost of the extra call.
If this works out, I'd appreciate seeing the results on this:-)
(There are many more posts for tag=testing than tag=benchmark)
I hope this helps.
P.S. as you appear to be a new user, if you get an answer that helps you please remember to mark it as accepted, or give it a + (or -) as a useful answer
It's easy for a perl program to take over some envars from a shell script, see the answer to Have perl execute shellscript & take over env vars.
Using this there is no need for exec at all which should alleviate all your worries about exec overhead. :-)

Embedded systems : last gasp before reboot

When things go badly awry in embedded systems I tend to write an error to a special log file in flash and then reboot (there's not much option if, say, you run out of memory).
I realize even that can go wrong, so I try to minimize it (by not allocating any memory during the final write, and boosting the write processes priority).
But that relies on someone retrieving the log file. Now I was considering sending a message over the intertubes to report the error before rebooting.
On second thoughts, of course, it would be better to send that message after reboot, but it did get me to thinking...
What sort of things ought I be doing if I discover an irrecoverable error, and how can I do them as safely as possible in a system which is in an unstable state?
One strategy is to use a section of RAM that is not initialised by during power-on/reboot. That can be used to store data that survives a reboot, and then when your app restarts, early on in the code it can check that memory and see if it contains any useful data. If it does, then write it to a log, or send it over a comms channel.
How to reserve a section of RAM that is non-initialised is platform-dependent, and depends if you're running a full-blown OS (Linux) that manages RAM initialisation or not. If you're on a small system where RAM initialisation is done by the C start-up code, then your compiler probably has a way to put data (a file-scope variable) in a different section (besides the usual e.g. .bss) which is not initialised by the C start-up code.
If the data is not initialised, then it will probably contain random data at power-up. To determine whether it contains random data or valid data, use a hash, e.g. CRC-32, to determine its validity. If your processor has a way to tell you if you're in a reboot vs a power-up reset, then you should also use that to decide that the data is invalid after a power-up.
There is no single answer to this. I would start with a Watchdog timer. This reboots the system if things go terribly awry.
Something else to consider - what is not in a log file is also important. If you have routine updates from various tasks/actions logged then you can learn from what is missing.
Finally, in the case that things go bad and you are still running: enter a critical section, turn off as much of the OS a possible, shut down peripherals, log as much state info as possible, then reboot!
The one thing you want to make sure you do is to not corrupt data that might legitimately be in flash, so if you try to write information in a crash situation you need to do so carefully and with the knowledge that the system might be an a very bad state so anything you do needs to be done in a way that doesn't make things worse.
Generally, when I detect a crash state I try to spit information out a serial port. A UART driver that's accessible from a crashed state is usually pretty simple - it just needs to be a simple polling driver that writes characters to the transmit data register when the busy bit is clear - a crash handler generally doesn't need to play nice with multitasking, so polling is fine. And it generally doesn't need to worry about incoming data; or at least not needing to worry about incoming data in a fashion that can't be handled by polling. In fact, a crash handler generally cannot expect that multitasking and interrupt handling will be working since the system is screwed up.
I try to have it write the register file, a portion of the stack and any important OS data structures (the current task control block or something) that might be available and interesting. A watchdog timer usually is responsible for resetting the system in this state, so the crash handler might not have the opportunity to write everything, so dump the most important stuff first (do not have the crash handler kick the watchdog - you don't want to have some bug mistakenly prevent the watchdog from resetting the system).
Of course this is most useful in a development setup, since when the device is released it might not have anything attached to the serial port. If you want to be able to capture these kinds of crash dumps after release, then they need to get written somewhere appropriate (like maybe a reserved section of flash - just make sure it's not part of the normal data/file system area unless you're sure it can't corrupt that data). Of course you'd need to have something examine that area at boot so it can be detected and sent somewhere useful or there's no point, unless you might get units back post-mortem and can hook them up to a debugging setup that can look at the data.
I think the most well known example of proper exception handling is a missile self-destruction. The exception was caused by arithmetic overflow in software. There obviously was a lot of tracing/recording media involved because the root cause is known. It was discovered debugged.
So, every embedded design must include 2 features: recording media like your log file and graceful halt, like disabling all timers/interrupts, shutting all ports and sitting in infinite loop or in case of a missile - self-destruction.
Writing messages to flash before reboot in embedded systems is often a bad idea. As you point out, no one is going to read the message, and if the problem is not transient you wear out the flash.
When the system is in an inconsistent state, there is almost nothing you can do reliably and the best thing to do is to restart the system as quickly as possible so that you can recover from transient failures (timing, special external events, etc.). In some systems I have written a trap handler that uses some reserved memory so that it can, set up the serial port and then emit a stack dump and register contents without requiring extra stack space or clobbering registers.
A simple restart with a dump like that is reasonable because if the problem is transient the restart will resolve the problem and you want to keep it simple and let the device continue. If the problem is not transient you are not going to make forward progress anyway and someone can come along and connect a diagnostic device.
Very interesting paper on failures and recovery: WHY DO COMPUTERS STOP AND WHAT CAN BE DONE ABOUT IT?
For a very simple system, do you have a pin you can wiggle? For example, when you start up configure it to have high output, if things go way south (i.e. watchdog reset pending) then set it to low.
Have you ever considered using a garbage collector ?
And I'm not joking.
If you do dynamic allocation at runtime in embedded systems,
why not reserve a mark buffer and mark and sweep when the excrement hits the rotating air blower.
You've probably got the malloc (or whatever) implementation's source, right ?
If you don't have library sources for your embedded system forget I ever suggested it, but tell the rest of us what equipment it is in so we can avoid ever using it. Yikes (how do you debug without library sources?).
If you're system is already dead.... who cares how long it takes. It obviously isn't critical that it be running this instant;
if it was you couldn't risk "dieing" like this anyway ?

How to save a program's progress, and resume later?

You may know a lot of programs, e.g some password cracking programs, we can stop them while they're running, and when we run the program again (with or without entering a same input), they will be able to continue from where they have left. I wonder what kind of technique those programs are using?
[Edit] I am writing a program mainly based on recursion functions. Within my knowledge, I think it is incredibly difficult to save such states in my program. Is there any technique, somehow, saves the stack contents, function calls, and data involved in my program, and then when it is restarted, it can run as if it hasn't been stopped? This is just some concepts I got in my mind, so please forgive me if it doesn't make sense...
It's going to be different for every program. For something as simple as, say, a brute force password cracker all that would really need to be saved was the last password tried. For other apps you may need to store several data points, but that's really all there is too it: saving and loading the minimum amount of information needed to reconstruct where you were.
Another common technique is to save an image of the entire program state. If you've ever played with a game console emulator with the ability to save state, this is how they do it. A similar technique exists in Python with pickling. If the environment is stable enough (ie: no varying pointers) you simply copy the entire apps memory state into a binary file. When you want to resume, you copy it back into memory and begin running again. This gives you near perfect state recovery, but whether or not it's at all possible is highly environment/language dependent. (For example: most C++ apps couldn't do this without help from the OS or if they were built VERY carefully with this in mind.)
Use Persistence.
Persistence is a mechanism through which the life of an object is beyond programs execution lifetime.
Store the state of the objects involved in the process on the local hard drive using serialization.
Implement Persistent Objects with Java Serialization
To achieve this, you need to continually save state (i.e. where you are in your calculation). This way, if you interrupt the probram, when it restarts, it will know it is in the middle of calculation, and where it was in that calculation.
You also probably want to have your main calculation in a separate thread from your user interface - this way you can respond to "close / interrupt" requests from your user interface and handle them appropriately by stopping / pausing the thread.
For linux, there is a project named CRIU, which supports process-level save and resume. It is quite like hibernation and resuming of the OS, but the granularity is broken down to processes. It also supports container technologies, specifically Docker. Refer to http://criu.org/ for more information.

How you test your applications for reliability under badly behaving i/o

Almost every application out there performs i/o operations, either with disk or over network.
As my applications work fine under the development-time environment, I want to be sure they will still do when the Internet connection is slow or unstable, or when the user attempts to read data from badly-written CD.
What tools would you recommend to simulate:
slow i/o (opening files, closing files, reading and writing, enumeration of directory items)
occasional i/o errors
occasional 'access denied' responses
packet loss in tcp/ip
etc...
EDIT:
Windows:
The closest solution to do the job as described seems to be holodeck, commercial software (>$900).
Linux:
Open solution wasn't found by now, but the same effect
can be achived as specified by smcameron and krosenvold.
Decorator pattern is a good idea.
It would require to wrap my i/o classes, but resulting in a testing framework.
The only remaining untested code would be in 3rd party libraries.
Yet I decided not to go this way, but leave my code as it is and simulate i/o errors from outside.
I now know that what I need is called 'fault injection'.
I thought it was a common production-line part with plenty of solutions I just didn't know.
(By the way, another similar good idea is 'fuzz testing', thanks to Lennart)
On my mind, the problem is still not worth $900.
I'm going to implement my own open-source tool based on hooks (targeting win32).
I'll update this post when I'm done with it. Come back in 3 or 4 weeks or so...
What you need is a fault injecting testing system. James Whittaker's 'How to break software' is a good read on this subject and includes a CD with many of the tools needed.
If you're on linux you can do tons of magic with iptables;
iptables -I OUTPUT -p tcp --dport 7991 -j DROP
Can simulate connections up/down as well. There's lots of tutorials out there.
Check out "Fuzz testing": http://en.wikipedia.org/wiki/Fuzzing
At a programming level many frameworks will let you wrap the IO stream classes and delegate calls to the wrapped instance. I'd do this and add in a couple of wait calls in the key methods (writing bytes, closing the stream, throwing IO exceptions, etc). You could write a few of these with different failure or issue type and use the decorator pattern to combine as needed.
This should give you quite a lot of flexibility with tweaking which operations would be slowed down, inserting "random" errors every so often etc.
The other advantage is that you could develop it in the same code as your software so maintenance wouldn't require any new skills.
You don't say what OS, but if it's linux or unix-ish, you can wrap open(), read(), write(), or any library or system call etc, with an LD_PRELOAD-able library to inject faults.
Along these lines:
http://scaryreasoner.wordpress.com/2007/11/17/using-ld_preload-libraries-and-glibc-backtrace-function-for-debugging/
I didn't go writing my own file system filter, as I initially thought, because there's a simpler solution.
1. Network i/o
I've found at least 2 ways to simulate i/o errors here.
a) Running a virtual machine (such as vmware) allows to configure bandwidth and packet loss rate. Vmware supports on-machine debugging.
b) Running a proxy on the local machine and tunneling all the traffic through it. For the case of upd/tcp communications a proxifier (e.g. widecap) can be used.
2. File i/o
I've managed to deduce this scenario to the previous one by mapping a drive letter to a network share which resides inside the virtual machine. The file i/o will be slow.
A cheaper alternative exists: to set up a local ftp server (e.g. FileZilla), configure speeds and use Novell's NetDrive to access it.
You'll wanna setup a test lab for this. What type of application are you building anyway? Are you really expecting the application be fed corrupt data?
A test technique I know the Microsoft Exchange Server people tried was sending noise to the server. Basically feeding every possible input with seemingly random data. They managed to crash the server quite often this way.
But still, if you can't trust input that hasn't been signed then general rules apply. Track every operation which could potentially be untrusted (result of corrupt data) and you should be able to handle most problems gracefully.
Just test your application behavior on random input, that should catch most problems but you'll never be able to fully protect your self from corrupt data. That's just not possible, as the data could be part of some internal buffer being handed off within the application itself.
Be mindful of when and how you decode data. That is all.
The first thing you'll need to do is define what "correct" means under these circumstances. You can only test against a definition of what behaviour is intended.
The tactics of testing will depend on technology. In the context of automated unit testing, I have found it very useful, in OO languages such as Java, to use various flavors of "mocking" or "stubbing" to pass e.g. misbehaving InputStreams to parts of my code that used file I/O.
Consider holodeck for some of the fault injection, if you have access to spare hardware you can simulate network impairment using Netem or a commercial product based on it the Mini-Maxwell, which is much more expensive than free but possibly easier to use.