Why does post-layout simulation take a long time? - verification

I am wondering why post-layout simulations for digital designs take so long.
Why can't software just figure out a chip's timing and model the behavior with a program that creates delays with sleep() or something? My guess is that sleep() isn't accurate enough to model hardware, but I'm not sure.
So, what is it actually doing that takes so long?
Thanks!

Post-layout simulations (in fact, anything post-synthesis) will be simulating gates rather than RTL, and there are a lot of gates.
I think you've got your understanding of how a simulator works a little confused. I say that because a call like sleep() is related to waiting for time as measured by the clock on the wall, not the simulator time counter. Simulator time advances however quickly the simulator runs.
A simulator is a loop that evaluates the system state. Each iteration of the loop is a 'time slice' e.g. what the state of the system is at time 100ns. It only advances from one time slice to the next when all the signals in it have reached a steady state.
In an RTL or untimed gate simulation, most evaluation of signals happens in 'zero time', which is to say that the effect of evaluating an assignment happens in the same time slice. The one exception tends to be the clock, which is defined to change at a certain time and it causes registers to fire, which causes them to change their output, which causes processes, modules, assignments which have inputs from registers to re-evaluate, which causes other signals to change, which causes other processes to re-evaluate, etc, etc, etc.... until everything has settled down, and we can move to the next clock edge.
In a post-layout simulation, with back-annotated timing, every gate in the system has a time from input to output associated with it. This means nothing happens in 'zero time' any more. The simulator now has to put the effect of every assignment on a list saying 'signal b will change to 1 at time 102.35ns'. Every gate has different timing. Every input on every gate will have different timing to the output. This means that a back-annotated simulation has to evaluate lots and lots of time slices, because signals are changing state at lots of different times, not just when the clock changes. There probably isn't much happening in each slice, but there are lots of them.
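To make that 'list of scheduled changes' idea concrete, here is a minimal, hypothetical sketch in C of the event-list loop a gate-level simulator runs. The struct names, the fixed-size queue and the example delays are invented for illustration and not taken from any real simulator; real tools use far more elaborate event wheels, but the principle is the same.

    #include <stdio.h>

    /* One scheduled signal change: "signal s becomes v at time t". */
    struct event { double time_ns; int signal; int value; };

    /* A toy event queue, kept sorted by time. */
    static struct event queue[256];
    static int nevents = 0;

    static void schedule(double time_ns, int signal, int value) {
        int i = nevents++;
        while (i > 0 && queue[i - 1].time_ns > time_ns) {
            queue[i] = queue[i - 1];
            i--;
        }
        queue[i] = (struct event){ time_ns, signal, value };
    }

    int main(void) {
        int net[4] = { 0 };              /* current value of each signal */

        /* A clock edge at t = 100 ns drives gate 1 (0.35 ns delay), whose
           output drives gate 2 (0.42 ns delay).  With back-annotated timing
           every gate schedules its own change at its own future time. */
        schedule(100.00, 0, 1);          /* clock edge                   */
        schedule(100.35, 1, 1);          /* gate 1 output, 0.35 ns later */
        schedule(100.77, 2, 1);          /* gate 2 output, 0.42 ns later */

        /* Main loop: pop events in time order, apply them, and (in a real
           simulator) re-evaluate every gate whose input just changed, which
           schedules still more events.  Each distinct timestamp is another
           time slice the simulator must visit. */
        for (int i = 0; i < nevents; i++) {
            net[queue[i].signal] = queue[i].value;
            printf("t = %7.2f ns: signal %d -> %d\n",
                   queue[i].time_ns, queue[i].signal, net[queue[i].signal]);
        }
        return 0;
    }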
...and I've only talked about adding gate timing. Add wire timing and things get even more complex.
Basically there's a whole lot more to worry about, and so the sims get slower.

Related

ATMEGA88PA Timer 1 Fast PWM Issue

I'm working on a project for a customer in which we're using the chip mentioned in the title. The chip works as a fan motor driver, setting a duty cycle in accordance with a required tach count. One channel uses timer 0 while another uses timer 1. The fan driver on timer 0, which is set up in fast PWM mode, works perfectly every time in accordance with the required duty cycle. On channel 1 the fan driver looks like it works almost all the time, but sometimes it fails in strange ways; this channel is set up in fast PWM mode as well. When it does fail, the PWM output pin can only be set to zero percent duty cycle or 100 percent (or rather just on and off) and will not accept values in between (utilizing the ICR register to set the frequency, so
ICR=MAIN_CLOCK_CPU/required_frequency
and OCR1A to set the duty cycle
OCR1A=(required_duty*ICR)/100)
). I have yet to see this fail in my bench-top setup, but if I take the PCB out of the product, bring it back to my desk, reprogram it, and then place it back in the product, it fails consistently (yet the PCB will still work fine at my desk?).
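For reference, here is a minimal sketch in C of the kind of timer-1 setup described above (fast PWM with ICR1 as TOP, non-inverting output on OC1A), assuming avr-libc and the 8 MHz clock mentioned below; the frequency and duty values passed in are placeholders, and this is an illustration of the registers involved, not the customer's actual firmware.

    #include <avr/io.h>

    #define MAIN_CLOCK_CPU 8000000UL   /* 8 MHz, as in the question */

    /* Configure timer 1 for fast PWM (WGM mode 14: ICR1 is TOP),
       non-inverting output on OC1A, no prescaler. */
    static void timer1_fast_pwm_init(uint32_t required_frequency,
                                     uint8_t required_duty /* percent */)
    {
        DDRB   = DDRB | (1 << PB1);              /* OC1A pin as output       */
        TCCR1A = (1 << COM1A1) | (1 << WGM11);   /* clear OC1A on compare    */
        TCCR1B = (1 << WGM13) | (1 << WGM12);    /* mode 14, clock stopped   */
        ICR1   = MAIN_CLOCK_CPU / required_frequency;       /* PWM period    */
        OCR1A  = ((uint32_t)required_duty * ICR1) / 100;    /* duty cycle    */
        TCCR1B |= (1 << CS10);                   /* start timer, clk/1       */
    }

    int main(void)
    {
        timer1_fast_pwm_init(25000, 40);   /* e.g. 25 kHz, 40 % duty */
        for (;;)
            ;                              /* PWM runs in hardware   */
    }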
The main clock frequency is set to 8MHz and the MCU is being powered by 3.3V. According to the data sheet this is fine, and we shouldn't expect any strange behavior. I feel like I have exhausted all of my options; sometimes the controller gets into this state right when it boots up, but even that is not consistent. I'm just curious if others have experienced this issue before.
I feel like this can't be a firmware issue, given the inconsistency I am seeing, but I'm also not sure how the hardware could simply fail to output a proper signal.

What are some factors that could affect program runtime?

I'm doing some work on profiling the behavior of programs. One thing I would like to do is get the amount of time that a process has run on the CPU. I am accomplishing this by reading the sum_exec_runtime field in the Linux kernel's sched_entity data structure.
After testing this with some fairly simple programs which simply execute a loop and then exit, I am running into a peculiar issue: the program does not finish with the same runtime each time it is executed. Seeing as sum_exec_runtime is a value represented in nanoseconds, I would expect the value to differ by a few microseconds. However, I am seeing variations of several milliseconds.
My initial reaction was that this could be due to I/O waiting times, however it is my understanding that the process should give up the CPU while waiting for I/O. Furthermore, my test programs are simply executing loops, so there should be very little to no I/O.
I am seeking any advice on the following:
Is sum_exec_runtime not the actual time that a process has had control of the CPU?
Does the process not actually give up the CPU while waiting for I/O?
Are there other factors that could affect the actual runtime of a process (besides I/O)?
Keep in mind, I am only trying to find the actual time that the process spent executing on the CPU. I do not care about the total execution time including sleeping or waiting to run.
Edit: I also want to make clear that there are no branches in my test program aside from the loop, which simply loops for a constant number of iterations.
Thanks.
Your question is really broad, but you can incur context switches for various reasons. Calling most system calls involves at least one context switch. Page faults cause context switches. Exceeding your time slice causes a context switch.
sum_exec_runtime is equal to utime + stime from /proc/$PID/stat, but sum_exec_runtime is measured in nanoseconds. It sounds like you only care about utime which is the time your process has been scheduled in user mode. See proc(5) for more details.
You can look at nr_switches, both voluntary and involuntary, which are also part of sched_entity. That will probably account for most of the variation, but I would not expect successive runs to be identical. The exact time that you get for each run will be affected by all of the other processes running on the system.
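If you just want to see these numbers from user space, rather than poking at sched_entity directly, here is a small sketch using getrusage(), which reports user/system CPU time plus voluntary and involuntary context-switch counts for the calling process. The busy loop is only there to give it something to measure.

    #include <stdio.h>
    #include <sys/resource.h>

    int main(void)
    {
        /* Burn some CPU so there is something to measure. */
        volatile unsigned long x = 0;
        for (unsigned long i = 0; i < 100000000UL; i++)
            x += i;

        struct rusage ru;
        if (getrusage(RUSAGE_SELF, &ru) != 0) {
            perror("getrusage");
            return 1;
        }

        printf("user time:   %ld.%06ld s\n",
               (long)ru.ru_utime.tv_sec, (long)ru.ru_utime.tv_usec);
        printf("system time: %ld.%06ld s\n",
               (long)ru.ru_stime.tv_sec, (long)ru.ru_stime.tv_usec);
        printf("voluntary context switches:   %ld\n", ru.ru_nvcsw);
        printf("involuntary context switches: %ld\n", ru.ru_nivcsw);
        return 0;
    }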
You'll also be affected by the amount of file system cache used on your system and how many file system cache hits you get in successive runs if you are doing any IO at all.
To give a very concrete and obvious example of how other processes can affect the run time of the current process, think about what happens if you are exceeding your physical RAM constraints. If your program asks for more RAM, then the kernel is going to spend more time swapping. That time swapping will be accounted in stime, but it will vary depending on how much RAM you need and how much RAM is available. There are lots of other ways that other processes can affect your process's run time. This is just one example.
To answer your 3 points:
sum_exec_runtime is the actual time the scheduler ran the process, including system time.
If you count switching to the kernel as the process giving up the CPU, then yes; but that does not necessarily mean a different user process runs in the meantime. Your process may get the CPU back once the kernel is done.
I think I've already answered this above: there are lots of factors.

How to synchronize host's and client's games in an online RTS game? (VB.NET)

So my project is an online RTS (real-time strategy) game (VB.NET) where - after a matchmaking server matches two players - one player is assigned as host and the other as client in a socket communication (TCP).
My design is that server and client only record and send information about input, and the game takes care of all the animations/movements according to the information from the inputs.
This works well, except when it comes to having both the server's and client's games run at exactly the same speed. It seems like the operations in the games are handled at different speeds, regardless of whether I have both client and server on the same computer or use different computers as server/client. This is obviously crucial for the game to work.
Since the game at times has 300-500 units, I thought it would be too much to send an entire game state from server/client 10 times a second.
So my questions are:
How to synchronize the games while sending only inputs? (if possible)
What other designs are doable in this case, and how do they work?
(In VB.NET I use timers for operations, such that every 100ms (the timer interval) a character moves and changes animation, and stuff like that.)
Thanks in advance for any help on this issue, my project really depends on it!
Timers are not guaranteed to tick at exactly the set rate. The speed at which they tick can be affected by how busy the system and, in particular, your process are. However, while you cannot get a timer to tick at an accurate pace, you can calculate exactly how long it has been since some particular point in time, such as the point when you started the timer. The simplest way to measure time like that is to use the Stopwatch class.
When you do it this way, you don't have a smooth sequence of events where, for instance, something moves one pixel per timer tick. Instead, you have a path where you know something is moving from one place to another over a specified span of time, and then, given the current exact time according to the stopwatch, you can determine the current position of that object along that path. On a system which is running faster the events will appear smoother, while on a system which is running slower they will look more jumpy, but both systems arrive at the same positions at the same points in time.
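The answer above refers to .NET's Stopwatch; as a language-neutral sketch of the same idea (derive positions from measured elapsed time instead of counting timer ticks), here is a minimal C version using the POSIX monotonic clock. The path, duration and "tick" length are made up for illustration.

    #include <stdio.h>
    #include <time.h>

    /* Milliseconds elapsed on the monotonic clock since 'start'. */
    static double elapsed_ms(const struct timespec *start)
    {
        struct timespec now;
        clock_gettime(CLOCK_MONOTONIC, &now);
        return (now.tv_sec - start->tv_sec) * 1000.0 +
               (now.tv_nsec - start->tv_nsec) / 1e6;
    }

    int main(void)
    {
        /* A unit moves from x = 0 to x = 100 over 2000 ms.  Its position
           depends only on how much time has really passed, not on how
           many timer ticks happened to fire. */
        const double x_start = 0.0, x_end = 100.0, duration_ms = 2000.0;

        struct timespec start;
        clock_gettime(CLOCK_MONOTONIC, &start);

        for (;;) {
            double t = elapsed_ms(&start) / duration_ms;   /* 0.0 .. 1.0 */
            if (t > 1.0)
                t = 1.0;
            double x = x_start + (x_end - x_start) * t;
            printf("x = %6.2f\n", x);
            if (t >= 1.0)
                break;
            /* A "tick" of roughly 100 ms; its exact length no longer
               affects where the unit ends up. */
            struct timespec tick = { 0, 100 * 1000 * 1000 };
            nanosleep(&tick, NULL);
        }
        return 0;
    }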

Calculating number of seconds between two points in time, in Cocoa, even when system clock has changed mid-way

I'm writing a Cocoa OS X (Leopard 10.5+) end-user program that's using timestamps to calculate statistics for how long something is being displayed on the screen. Time is calculated periodically while the program runs using a repeating NSTimer. [NSDate date] is used to capture timestamps, Start and Finish. Calculating the difference between the two dates in seconds is trivial.
A problem occurs if an end-user or ntp changes the system clock. [NSDate date] relies on the system clock, so if it's changed, the Finish variable will be skewed relative to the Start, messing up the time calculation significantly. My question:
1. How can I accurately calculate the time between Start and Finish, in seconds, even when the system clock is changed mid-way?
I'm thinking that I need a non-changing reference point in time so I can calculate how many seconds have passed since then. For example, system uptime. 10.6 has - (NSTimeInterval)systemUptime, part of NSProcessInfo, which provides the system uptime. However, this won't work as my app must work on 10.5.
I've tried creating a time counter using NSTimer, but this isn't accurate. NSTimer has several different run modes and can only run one at a time. NSTimer (by default) is put into the default run mode. If a user starts manipulating the UI for a long enough time, this will enter NSEventTrackingRunLoopMode and skip over the default run mode, which can lead to NSTimer firings being skipped, making it an inaccurate way of counting seconds.
I've also thought about creating a separate thread (NSRunLoop) to run a NSTimer second-counter, keeping it away from UI interactions. But I'm very new to multi-threading and I'd like to stay away from that if possible. Also, I'm not sure if this would work accurately in the event the CPU gets pegged by another application (Photoshop rendering a large image, etc...), causing my NSRunLoop to be put on hold for long enough to mess up its NSTimer.
I appreciate any help. :)
Depending on what's driving this code, you have 2 choices:
For absolute precision, use mach_absolute_time(). It will give the time interval exactly between the points at which you called the function.
But in a GUI app, this is often actually undesirable. Instead, you want the time difference between the events that started and finished your duration. If so, compare [[NSApp currentEvent] timestamp]
Okay so this is a long shot, but you could try implementing something sort of like NSSystemClockDidChangeNotification available in Snow Leopard.
So bear with me here, because this is a strange idea and is definitely non-deterministic. But what if you had a watchdog thread running for the duration of your program? This thread would, every n seconds, read the system time and store it. For the sake of argument, let's make it 5 seconds. So every 5 seconds, it compares the previous reading to the current system time. If there's a "big enough" difference ("big enough" would definitely need to be greater than 5, but not too much greater, to account for the non-determinism of process scheduling and thread prioritization), post a notification that there has been a significant time change. You would need to play around with fuzzing the value that constitutes "big enough" (or small enough, if the clock was reset to an earlier time) for your accuracy needs.
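A rough sketch of that watchdog idea in plain C with pthreads (a Cocoa version would post an NSNotification instead of calling a plain function); the 5-second interval and 2-second tolerance are arbitrary placeholders you would tune for your own accuracy needs.

    #include <math.h>
    #include <pthread.h>
    #include <stdio.h>
    #include <time.h>
    #include <unistd.h>

    #define CHECK_INTERVAL_S 5      /* how often the watchdog samples */
    #define TOLERANCE_S      2.0    /* fuzz for scheduling jitter     */

    /* In a Cocoa app this would post a notification instead. */
    static void clock_changed(double jump_s)
    {
        printf("system clock jumped by about %.1f s\n", jump_s);
    }

    static void *watchdog(void *arg)
    {
        (void)arg;
        time_t previous = time(NULL);
        for (;;) {
            sleep(CHECK_INTERVAL_S);
            time_t now = time(NULL);
            double delta = difftime(now, previous);
            /* We slept ~5 s, so the wall clock should have advanced ~5 s.
               A much bigger (or negative) difference means someone moved it. */
            if (fabs(delta - CHECK_INTERVAL_S) > TOLERANCE_S)
                clock_changed(delta - CHECK_INTERVAL_S);
            previous = now;
        }
        return NULL;
    }

    int main(void)
    {
        pthread_t tid;
        pthread_create(&tid, NULL, watchdog, NULL);
        /* ... the rest of the application runs here ... */
        pause();   /* keep the demo alive */
        return 0;
    }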
I know this is kind of hacky, but barring any other solution, what do you think? Might that, or something like that, solve your issue?
Edit
Okay so you modified your original question to say that you'd rather not use a watchdog thread because you are new to multithreading. I understand the fear of doing something a bit more advanced than you are comfortable with, but this might end up being the only solution. In that case, you might have a bit of reading to do. =)
And yeah, I know that something such as Photoshop pegging the crap out of the processor is a problem. Another (even more complicated) solution would be to, instead of having a watchdog thread, have a separate watchdog process that has top priority so it is a bit more immune to processor pegging. But again, this is getting really complicated.
Final Edit
I'm going to leave all my other ideas above for completeness' sake, but it seems that using the system's uptime will also be a valid way to deal with this. Since [[NSProcessInfo processInfo] systemUptime] only works on 10.6+, you can just call mach_absolute_time(). To get access to that function, just #include <mach/mach_time.h>. It is based on the same underlying clock that NSProcessInfo's uptime reports.
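For what it's worth, converting the raw tick count from mach_absolute_time() into seconds needs the ratio from mach_timebase_info(); a minimal sketch of just that conversion (the sleep is only a stand-in for the interval being measured):

    #include <mach/mach_time.h>
    #include <stdio.h>
    #include <unistd.h>

    int main(void)
    {
        /* The timebase says how many nanoseconds one tick represents. */
        mach_timebase_info_data_t timebase;
        mach_timebase_info(&timebase);

        uint64_t start = mach_absolute_time();
        sleep(2);                               /* the interval being measured */
        uint64_t finish = mach_absolute_time();

        /* Ticks -> nanoseconds -> seconds. */
        uint64_t elapsed_ns = (finish - start) * timebase.numer / timebase.denom;
        printf("elapsed: %.3f s\n", elapsed_ns / 1e9);
        return 0;
    }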
I figured out a way to do this using the UpTime() C function, provided in <CoreServices/CoreServices.h>. This returns Absolute Time (CPU-specific), which can easily be converted into Duration Time (milliseconds, or nanoseconds). Details here: http://www.meandmark.com/timingpart1.html (look under part 3 for UpTime)
I couldn't get mach_absolute_time() to work properly, likely due to my lack of knowledge on it, and not being able to find much documentation on the web about it. It appears to grab the same time as UpTime(), but converting it into a double left me dumbfounded.
[[NSApp currentEvent] timestamp] did work, but only while the application was receiving NSEvents. If the application went into the background, it wouldn't receive events, and [[NSApp currentEvent] timestamp] would simply keep returning the same old timestamp in an NSTimer firing method, until the end-user decided to interact with the app again.
Thanks for all your help Marc and Mike! You both definitely sent me in the right direction leading to the answer. :)

Testing Real Time Operating System for Hardness

I have an embedded device (Technologic TS-7800) that advertises real-time capabilities, but says nothing about 'hard' or 'soft'. While I wait for a response from the manufacturer, I figured it wouldn't hurt to test the system myself.
What are some established procedures to determine the 'hardness' of a particular device with respect to real time/deterministic behavior (latency and jitter)?
Being at college, I have access to some pretty neat hardware (good oscilloscopes and signal generators), so I don't think I'll run into any issues in terms of testing equipment, just expertise.
With that kind of equipment, it ought to be fairly easy to sync the o-scope to a steady clock, produce a spike each time the real-time system produces an output, and see how much that spike varies from center. The less the variation, the greater the hardness.
To clarify Bob's answer maybe:
Use the signal generator to generate a pulse at some varying frequency.
Random distribution across some range would be best.
Use the signal generator (trigger signal) to start the scope.
The RTOS has to respond, do its thing and send an output pulse.
Feed the RTOS output into input 2 of the scope.
Put the scope into persist/collect mode.
Get the scope to start on A, stop on B, if you can.
In an ideal world, get it to measure the distribution for you. A LeCroy would.
Start with a much slower trace than you would expect. You need to be able to see slow outliers.
You'll be able to see the distribution.
Assuming a normal distribution, the standard deviation of the response-time variation is the SOFTNESS.
(This won't really happen in practice, but if you don't get outliers it is reasonably useful. )
If there are outliers of large latency, then the RTOS is NOT very hard: it does not meet deadlines well and is therefore unsuitable for hard real-time work.
Many RTOS-like things have a good left edge to the curve, sloping down like a 1/f curve.
That's indicative of combined jitter. The thing to look out for is spikes of slow response at the right-hand end of the scope. Keep repeating the experiment with faster traces if there are no outliers, to get a good image of the slope. That should be good for some speculative conclusions in your paper.
If, for your application, a delta of say 1 µs is okay and you measure 0.5 µs, it's all cool.
Anyway, you can publish the results (probably even in the formal publishing sense, but certainly on the web).
Link from this Question to the paper when you've written it.
Hard real-time has more to do with how your software works than with the hardware on its own. When asking whether something is hard real-time, the question must be applied to the complete system (hardware, RTOS and application). This means that hard or soft real-time is a system design issue.
Under loading that exceeds the specification, even a hard real-time system will fail (hopefully with a proper failure indication), while a soft real-time system with low loading can give hard real-time results. How much processing must happen in time and how much pre/post processing can be performed is the real key to hard/soft real-time.
In some real-time applications some data loss is not a failure; it just has to stay below a certain level, which is again a system criterion.
You can generate inputs to the board and have a small application count them and check at what level data starts to be lost. But that gives you a rating specific to that system running that application. As soon as you start doing more processing, your computational load increases and you have a different hard real-time limit.
This board, running a bare-bones scheduler, will give great, predictable hard real-time performance for most tasks.
Running a full RTOS with a heavy computational load, you will probably only get soft real-time.
Edit after comment
The most efficient and easiest way I have used to measure my software's performance (assuming you use a scheduler) is to use a free-running hardware timer on the board and timestamp the start and end of my cycle. Or, if you run a full RTOS, timestamp your acquisition and transition. Save your max time and keep a running average of the values over a second. If your average is around 50% of your cycle budget and your max is within 20% of your average, you are OK. If not, it is time to refactor your application. As your application grows the cycle time will grow, and you can monitor the effect of all your software changes on your cycle time.
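A rough sketch in C of that kind of cycle-time monitor; read_hw_timer_us() is a made-up stand-in for reading the board's free-running timer, and the 50%/20% checks are just the rules of thumb from the paragraph above, not a standard.

    #include <stdint.h>
    #include <stdio.h>

    #define CYCLE_BUDGET_US 1000   /* deadline for one cycle of the main loop */

    /* Made-up stand-in for reading the board's free-running hardware timer;
       on real hardware this would read a timer/counter register. */
    static uint32_t read_hw_timer_us(void)
    {
        static uint32_t fake_us;
        return fake_us += 450;     /* pretend successive reads are 450 us apart */
    }

    /* Call once per cycle of the application's main loop. */
    static void monitor_cycle(void)
    {
        static uint32_t max_us, sum_us, samples;

        uint32_t start = read_hw_timer_us();
        /* ... the real work of the cycle goes here ... */
        uint32_t elapsed = read_hw_timer_us() - start;

        if (elapsed > max_us)
            max_us = elapsed;
        sum_us += elapsed;
        samples++;

        if (samples == 1000) {               /* roughly once per second */
            uint32_t avg = sum_us / samples;
            printf("avg %lu us, max %lu us\n",
                   (unsigned long)avg, (unsigned long)max_us);
            /* Rules of thumb from above: average near 50% of the budget,
               max within 20% of the average; otherwise refactor. */
            if (avg > CYCLE_BUDGET_US / 2 || max_us > avg + avg / 5)
                printf("cycle time is drifting - time to refactor\n");
            max_us = sum_us = samples = 0;
        }
    }

    int main(void)
    {
        for (int i = 0; i < 5000; i++)
            monitor_cycle();
        return 0;
    }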
Another way is to use a hardware timer to generate a cyclic interrupt. If you are in time, reset the interrupt; if you miss the deadline, have the interrupt handler signal a failure. This, however, will only give you a warning once your application is already taking too long, but it relies on hardware and interrupts, so you can't miss it.
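On the bare board this would be a real hardware timer interrupt; as a rough desktop analogue of the same idea, here is a sketch using a one-shot POSIX timer that the main loop has to re-arm before each deadline. The 10 ms deadline and the 2 ms of simulated "work" are placeholders.

    #define _POSIX_C_SOURCE 200112L
    #include <signal.h>
    #include <stdio.h>
    #include <string.h>
    #include <time.h>

    #define DEADLINE_NS (10 * 1000 * 1000)   /* 10 ms cycle deadline */

    static volatile sig_atomic_t missed;

    /* Fires only if the main loop failed to re-arm the timer in time. */
    static void deadline_handler(int sig)
    {
        (void)sig;
        missed = 1;
    }

    int main(void)
    {
        struct sigaction sa;
        memset(&sa, 0, sizeof sa);
        sa.sa_handler = deadline_handler;
        sigaction(SIGALRM, &sa, NULL);

        /* With a NULL sigevent, timer expiry is delivered as SIGALRM.
           (Link with -lrt on older glibc.) */
        timer_t watchdog;
        timer_create(CLOCK_MONOTONIC, NULL, &watchdog);

        struct itimerspec deadline = { .it_value = { 0, DEADLINE_NS } };

        for (int cycle = 0; cycle < 1000; cycle++) {
            timer_settime(watchdog, 0, &deadline, NULL);  /* (re)arm, one shot */

            /* ... the cycle's real work goes here ... */
            struct timespec work = { 0, 2 * 1000 * 1000 };  /* pretend: 2 ms */
            nanosleep(&work, NULL);

            if (missed) {
                printf("cycle %d missed its %d ms deadline\n",
                       cycle, DEADLINE_NS / 1000000);
                missed = 0;
            }
        }
        timer_delete(watchdog);
        return 0;
    }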
These solutions also eliminate the need to hook up a scope to monitor the output, since the timing information can be displayed on any kind of terminal by a background task. If it is easy to monitor, you will monitor it regularly, so you end up fixing timing problems as soon as they are introduced rather than at the end.
Hope this helps
I have the same board here at work. It's a slightly-modified 2.6 Kernel, I believe... not the real-time version.
I don't know that I've read anything in the docs yet that indicates that it is meant for strict RTOS work.
I think that this is not a hard real-time device, since it runs no RTOS.
I understand being a geek, but using an oscilloscope to test a computer with Ethernet/USB/other digital ports and a HUGE internal state (RAM) is both ineffective and unreliable.
Instead of watching waveforms, you can connect any PC to the output port and run a proper statistical analysis.
The established procedure (if the input signal is analog by nature) is to test system against several characteristic inputs - traditionally spikes, step functions and sine waves of different frequencies - and measure phase shift and variance for each input type. Worst case is then used in specifications of the system.
Again, if you are using standard ports, you can easily generate those on PC. If the input is truly analog, a separate DAC or simply a good sound card would be needed.
Now, that won't say anything about the OS being real-time - it could be running vanilla Linux or even Win CE and still produce good and stable results in those tests if the hardware is fast enough.
So, you need to simulate heavy and varying loads on the processor, memory and all the ports, let it heat up and eat memory for a few hours, and then repeat the tests. If the latency stays constant, it's hard real-time. If it varies but, under any load and input signal type, never increases above the acceptable limit, it's soft. Otherwise, it's advertisement.
P.S.: The implication is that even for critical systems you don't actually need hard real-time if the hardware is fast enough.