Understanding simple simulation and rendering loop - rendering

This is an example (pseudo code) of how you could simulate and render a video game.
//simulate 20ms into the future per step
const long delta = 20;
long simulationTime = 0;
while (true)
{
    while (simulationTime < GetMilliSeconds()) //GetMilliSeconds = wall clock time
    {
        //the frame we simulated is still in the past
        input = GetUserInput();
        UpdateSimulation(delta, input);
        //we are trying to catch up and eventually pass the wall clock time
        simulationTime += delta;
    }
    //since my current simulation is in the future and
    //my last simulation is in the past,
    //the current look of the world has to be somewhere in between
    RenderGraphics(InterpolateWorldState(GetMilliSeconds() - simulationTime));
}
That's my question:
I have 40ms to go through the outer 'while(true)' loop (which means 25 FPS).
The RenderGraphics method takes 10ms, so that leaves 30ms for the inner loop. The UpdateSimulation method takes 5ms. Everything else can be ignored, since it takes under 0.1ms.
What is the maximum I can set the variable 'delta' to in order to stay in my time schedule of 40ms (outer loop)?
And why?

This largely depends on how often you want and need to update your simulation state and user input, given the constraints mentioned below. For example, if your game contains internal state based on physical behavior, you would need a smaller delta to ensure that movements and collisions, if any, are properly evaluated and reflected in the game state. Likewise, if your user input requires fine-grained evaluation and state updates, you need smaller delta values; a shooting game with analogue input (e.g. mouse, joystick) would benefit from update frequencies above 30Hz. If your game does not need such high-frequency evaluation of input and game state, then you can get away with larger delta values, or even update the game state only when player input is detected.
In your specific pseudo-code, the simulation updates in fixed time slices of length delta, which requires each simulation update to be processed in less wallclock time than the wallclock time it simulates. Otherwise, wallclock time would advance faster than your simulation time can catch up. This ultimately limits your delta, depending on how quickly a simulation update covering delta simulation time can actually be computed.
This relationship also depends on your use case and may not be linear or constant. For example, physics engines often internally subdivide the delta time they are given into an update rate they can reasonably process, because longer delta times may cause numerical instabilities and harder-to-solve linear systems, raising the processing effort non-linearly. In other use cases, simulation updates may take linear or even constant time. Even so, many (possibly external) events can cause your simulation update to be processed too slowly if it is inherently demanding: loading resources during simulation updates, the operating system suspending your execution thread, another process run by the user, anti-virus software kicking in, memory pressure, a slow CPU, and so on.
So far I have mostly seen two strategies to evade this problem or remedy its effects. First, simply ignoring it can work if the simulation update effort is low and the cause of the slowdown is assumed to be temporary. This results in more or less noticeable "slow motion" behavior of your simulation, which could, in the worst case, lead to simulation time lag piling up forever. The second strategy I have often seen is to cap the measured frame time to be simulated at some artificial value, say 1000ms. This leads to smooth behavior as soon as the cause of the slowdown disappears, but has the drawback that the capped simulation time is lost, which may lead to animation hiccups if not handled or accounted for.
To choose a strategy, analyze your use case: measure the wallclock time it takes to process simulation updates of delta and x * delta simulation time, and observe how changing the delta time and the simulation load is reflected in the wallclock time needed to compute it. That will hint at what the maximum value of delta is for your specific hardware and software environment.
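As a rough illustration of that kind of budget analysis, here is a minimal Python sketch using the numbers from the question (40ms outer loop, 10ms for RenderGraphics, 5ms per UpdateSimulation call); the helper name and the swept values are purely illustrative:

import math

def keeps_schedule(delta_ms, frame_ms=40, render_ms=10, update_ms=5):
    # Per 40ms outer iteration the simulation must advance by at least frame_ms,
    # which requires ceil(frame_ms / delta_ms) calls to UpdateSimulation.
    updates_per_frame = math.ceil(frame_ms / delta_ms)
    cost_ms = updates_per_frame * update_ms + render_ms
    return cost_ms <= frame_ms

for delta in (2, 5, 7, 10, 20, 40):
    print(delta, keeps_schedule(delta))   # very small deltas blow the 40ms budget

With these specific numbers, the schedule holds as long as 5ms * ceil(40ms / delta) + 10ms stays within 40ms, i.e. for deltas of roughly 6.7ms and above; smaller deltas mean the inner loop can no longer catch up within one frame.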

How to handle huge amount of changes in real time planning?

If real-time planning and daemon mode are enabled, then whenever a planning entity is updated or added, a problem fact change must be invoked.
So let's say the average rate of change is 1/sec; then a problem fact change must be invoked every second, resulting in the solver restarting every second.
Do we just invoke or schedule a problem fact change every second (restarting the solver every second), or, if we know there will be a huge amount of changes, should we stop the solver first, apply the changes, and then start the solver again?
In the scenario you describe, the solver will likely be restarted every time. It's not a complete restart, as if you had just called Solver.solve() with the updated last known solution, but the ScoreDirector, a component responsible for score calculation, is restarted each time a problem change is applied.
If problem changes come faster, they might be processed in a batch. The solver checks problem changes between the evaluation of individual moves, so if multiple changes come before the solver finishes the evaluation of the current move, they are all applied and the solver restarts just once. In the opposite case, when there are seldom changes coming, the restart doesn't matter much, as there is enough time for the solver to improve the solution.
But the rate of 1 change/sec will likely lead to frequent solver restarts and will affect its ability to produce better solutions.
The solver does not know if there is going to be a bigger amount of changes in the next second. The current behavior may be improved by processing problem changes periodically, at a predefined time interval, rather than between move evaluations.
Of course, the periodic grouping of problem changes can be done outside the solver as well.
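As a rough, language-agnostic illustration of that external grouping (a Python sketch, not OptaPlanner API; the interval and names are made up for the example), you could buffer incoming changes and hand them to the solver once per window:

import queue
import threading
import time

incoming = queue.Queue()           # changes arriving at roughly 1/sec from the outside world
BATCH_WINDOW_SECONDS = 5.0         # hypothetical grouping interval

def submit_batches(submit_to_solver):
    # Drain everything that arrived during the window and hand it over in one go,
    # so the solver restarts once per window instead of once per change.
    while True:
        time.sleep(BATCH_WINDOW_SECONDS)
        batch = []
        while not incoming.empty():
            batch.append(incoming.get_nowait())
        if batch:
            submit_to_solver(batch)    # would wrap the real solver's problem change submission

threading.Thread(target=submit_batches, args=(print,), daemon=True).start()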

Recommended way of measuring execution time in Tensorflow Federated

I would like to know whether there is a recommended way of measuring execution time in TensorFlow Federated. To be more specific: if one would like to extract the execution time for each client in a certain round, e.g., for each client involved in a FedAvg round, by saving a time stamp before the local training starts and a time stamp just before sending back the updates, what is the best (or just correct) strategy to do this? Furthermore, since the clients' code runs in parallel, are such time stamps unreliable (especially considering the hypothesis that different clients may be using differently sized models for local training)?
To be very practical: is using tf.timestamp() at the beginning and at the end of @tf.function client_update(model, dataset, server_message, client_optimizer) -- this is probably a simplified signature -- and then subtracting the two time stamps appropriate?
I have the feeling that this is not the right way to do this given that clients run in parallel on the same machine.
Thanks to anyone who can help me with this.
There are multiple potential places to measure execution time, so a first step might be defining very specifically what the intended measurement is.
Measuring the training time of each client as proposed is a great way to get a sense of the variability among clients. This could help identify whether rounds frequently have stragglers. Using tf.timestamp() at the beginning and end of the client_update function seems reasonable. The question correctly notes that this happens in parallel; summing all of these times would be akin to measuring CPU time.
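For that per-client measurement, here is a minimal sketch of where the timestamps would go. tf.timestamp() is real TensorFlow, but the function body below is a simplified, hypothetical stand-in for client_update(model, dataset, server_message, client_optimizer), not the actual TFF training logic:

import tensorflow as tf

@tf.function
def timed_client_update(dataset):
    start = tf.timestamp()                  # seconds since epoch, as a float64 tensor
    total = tf.constant(0.0)
    for batch in dataset:                   # stands in for the local training loop
        total += tf.reduce_sum(batch)
    end = tf.timestamp()
    return total, end - start               # elapsed wall-clock seconds for this client

example_data = tf.data.Dataset.from_tensor_slices(tf.ones([8, 4])).batch(2)
_, elapsed = timed_client_update(example_data)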
Measuring the time it takes to complete all client training in a round would generally be the maximum of the values above. This might not hold when simulating FL in TFF, as TFF may decide to run some clients sequentially due to system resource constraints; in a real deployment all of these clients would run in parallel.
Measuring the time it takes to complete a full round (the maximum time it takes to run a client, plus the time it takes for the server to update) could be done by moving the time measurements to the outer training loop, i.e. wrapping the call to trainer.next() in the snippet on https://www.tensorflow.org/federated. This would be most similar to elapsed real time (wall-clock time).
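For that round-level measurement, a small sketch of the wrapping (here with Python's time.perf_counter, since the outer loop runs in ordinary Python; the trainer, state, and data objects are assumed to come from the snippet referenced above, and the function name is illustrative):

import time

def timed_round(trainer, state, federated_train_data):
    # `trainer` is assumed to be the iterative process from the tensorflow.org/federated
    # snippet and `state` its current server state.
    t0 = time.perf_counter()
    state, metrics = trainer.next(state, federated_train_data)   # one full round
    return state, metrics, time.perf_counter() - t0              # wall-clock seconds per round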

SUMO sim time and real time difference

I used traci.simulation.getTime to get the current sim time of SUMO.
However, this time runs faster than real time.
For example, while sim time grows from 0 to 100, real time only grows from 0 to 20.
How can I make the SUMO simulation time advance at the same speed as real time?
I tried --step-length = 1, but this didn't work.
The --step-length property is a value in seconds describing the length of one simulation step. If you put a higher number here, vehicles have less time to react, but your simulation probably runs faster.
For the real time issue you might have a look at the sumo-user mailing list. I think this mail gives a pretty good answer to your issue:
the current limit to the real time factor is the speed of your
computer. If you want to slow the GUI down you can change the delay
value (which is measured in milliseconds) so a value of 100 would add
100ms to every simulation step (if your simulation is small and runs
with the default step length of 1s this means factor 10). If you want
to speed it up, run without GUI or buy a faster computer ;-).
To check how close your simulation is to wall clock time, you can check the output generated by SUMO. The figure you're looking for is called the "Real time factor".
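If you are driving SUMO from Python via TraCI anyway, one workaround (a sketch of my own, not from the mail above; the config path is hypothetical) is to pace the stepping loop yourself so that one simulated second takes at least one wall-clock second:

import time
import traci

traci.start(["sumo", "-c", "scenario.sumocfg", "--step-length", "1"])   # hypothetical config
step_length = 1.0                                  # seconds of sim time per step
while traci.simulation.getMinExpectedNumber() > 0:
    wall_start = time.perf_counter()
    traci.simulationStep()                         # advances sim time by step_length
    elapsed = time.perf_counter() - wall_start
    if elapsed < step_length:
        time.sleep(step_length - elapsed)          # sleep away the rest of the step
traci.close()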

How should I handle measurement logging in my Discrete Event Simulation engine?

NOTE: This question has been ported over from Programmers since it appears to be more appropriate here given the limitation of the language I'm using (VBA), the availability of appropriate tags here and the specificity of the problem (on the inference that Programmers addresses more theoretical Computer Science questions).
I'm attempting to build a Discrete Event Simulation library by following this tutorial and fleshing it out. I am limited to using VBA, so "just switch to [insert language here] and it's easy!" is unfortunately not possible. I have specifically chosen to implement this in Access VBA to have a convenient location to store configuration information and metrics.
How should I handle logging metrics in my Discrete Event Simulation engine?
If you don't want/need background, skip to The Design or The Question section below...
Simulation
The goal of a simulation of the type in question is to model a process to perform analysis of it that wouldn't be feasible or cost-effective in reality.
The canonical example of a simulation of this kind is a Bank:
Customers enter the bank and get in line with a statistically distributed frequency
Tellers are available to handle customers from the front of the line one by one taking an amount of time with a modelable distribution
As the line grows longer, the number of tellers available may have to be increased or decreased based on business rules
You can break this down into generic objects:
Entity: These would be the customers
Generator: This object generates Entities according to a distribution
Queue: This object represents the line at the bank. They find much real world use in acting as a buffer between a source of customers and a limited service.
Activity: This is a representation of the work done by a teller. It generally processes Entities from a Queue
Discrete Event Simulation
Instead of a continuous, tick-by-tick simulation such as one might use for physical systems, a "Discrete Event" Simulation is a recognition that in many systems only critical events require processing, and that the rest of the time nothing important to the state of the system is happening.
In the case of the Bank, critical events might be a customer entering the line, a teller becoming available, the manager deciding whether or not to open a new teller window, etc.
In a Discrete Event Simulation, the flow of time is kept by maintaining a Priority Queue of Events instead of an explicit clock. Time is incremented by popping the next event in chronological order (the minimum event time) off the queue and processing as necessary.
The Design
I've got a Priority Queue implemented as a Min Heap for now.
In order for the objects of the simulation to be processed as events, they implement an ISimulationEvent interface that provides an EventTime property and an Execute method. Those together mean the Priority Queue can schedule the events, then Execute them one at a time in the correct order and increment the simulation clock appropriately.
The simulation engine is a basic event loop that pops the next event and Executes it until there are none left. An event can reschedule itself to occur again or allow itself to go idle. For example, when a Generator is Executed it creates an Entity and then reschedules itself for the generation of the next Entity at some point in the future.
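To make the pattern concrete, here is a minimal sketch of such an event loop (in Python for brevity rather than my actual VBA, and with illustrative names):

import heapq
import itertools

class Simulation:
    def __init__(self):
        self.clock = 0.0
        self.events = []                    # min-heap of (time, tie_breaker, action)
        self._counter = itertools.count()   # breaks ties between events at the same time

    def schedule(self, time, action):
        heapq.heappush(self.events, (time, next(self._counter), action))

    def run(self):
        while self.events:
            time, _, action = heapq.heappop(self.events)
            self.clock = time               # the clock jumps straight to the next event
            action(self)                    # an event may schedule further events

def generator(sim):
    # Stands in for a Generator: create an Entity, then reschedule itself.
    print("entity created at t =", sim.clock)
    if sim.clock < 10:
        sim.schedule(sim.clock + 2, generator)

sim = Simulation()
sim.schedule(0, generator)
sim.run()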
The Question
How should I handle logging metrics in my Discrete Event Simulation engine?
In the midst of this simulation, it is necessary to take metrics. How long are Entities waiting in the Queue? How many Activity resources are being utilized at any one point? How many Entities were generated since the last metrics were logged?
It follows logically that the metric logging should be scheduled as an event to take place every few units of time in the simulation.
The difficulty is that this ends up being a cross-cutting concern: metrics may need to be taken of Generators or Queues or Activities or even Entities. Consider also that it might be necessary to take derived, calculated metrics: e.g. measure a, b, c, and ((a-c)/100) + Log(b).
I'm thinking there are a few main ways to go:
Have a single, global Stats object that is aware of all of the simulation objects. Have the Generator/Queue/Activity/Entity objects store their properties in an associative array so that they can be referred to at runtime (VBA doesn't support much in the way of reflection). This way the statistics can be attached as needed Stats.AddStats(Object, Properties). This wouldn't support calculated metrics easily unless they are built into each object class as properties somehow.
Have a single, global Stats object that is aware of all of the simulation objects. Create some sort of ISimStats interface for the Generator/Queue/Activity/Entity classes to implement that returns an associative array of the important stats for that particular object. This would also allow runtime attachment, Stats.AddStats(ISimStats). The calculated metrics would have to be hardcoded in the straightforward implementation of this option.
Have multiple Stats objects, one per Generator/Queue/Activity/Entity as a child object. This might make it easier to implement simulation object-specific calculated metrics, but clogs up the Priority Queue a little bit with extra things to schedule. It might also cause tighter coupling, which is bad :(.
Some combination of the above or completely different solution I haven't thought of?
Let me know if I can provide more (or less) detail to clarify my question!
Any and every performance metric is a function of the model's state. The only time the state changes in a discrete event simulation is when an event occurs, so events are the only times you have to update your metrics. If you have enough storage, you can log every event, its time, and the state variables which got updated, and retrospectively construct any performance metric you want. If storage is an issue, you can calculate some performance measures within the events that affect those measures. For instance, the appropriate time to calculate delay in queue is when a customer begins service (assuming you tagged each customer object with its arrival time); for delay in system, it's when the customer ends service. If you want average delays, you can update the averages in those events. When somebody arrives, the size of the queue gets incremented; when they begin service, it gets decremented. Etc., etc., etc.
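As a minimal sketch of that event-driven bookkeeping (Python for brevity, though the structure maps directly onto a VBA class; the names are illustrative):

class DelayStats:
    def __init__(self):
        self.total_delay = 0.0
        self.num_served = 0

    def on_begin_service(self, arrival_time, now):
        # Called from the begin-service event; each customer carries its arrival time.
        self.total_delay += now - arrival_time
        self.num_served += 1

    def average_delay_in_queue(self):
        return self.total_delay / self.num_served if self.num_served else 0.0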
You'll have to be careful calculating statistics such as average queue length, because you have to weight the queue lengths by the amount of time spent in each state: Avg(queue_length) = (1/T) integral[queue_length(t) dt]. Since the queue_length can only change at events, this boils down to summing each queue length multiplied by the amount of time you were at that length, then dividing by the total elapsed time.
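Concretely, that time-weighted average can be computed like this (again a Python sketch with illustrative names; the arithmetic carries over to VBA unchanged):

def time_weighted_average(samples, end_time):
    # samples: chronological list of (event_time, queue_length_after_event), starting at t=0
    total_area = 0.0
    for (t, length), (t_next, _) in zip(samples, samples[1:] + [(end_time, None)]):
        total_area += length * (t_next - t)    # weight each length by how long it was held
    return total_area / end_time

# Queue is empty until t=2, holds 1 until t=5, 3 until t=6, then 2 until t=10:
samples = [(0, 0), (2, 1), (5, 3), (6, 2)]
print(time_weighted_average(samples, end_time=10.0))   # (0*2 + 1*3 + 3*1 + 2*4) / 10 = 1.4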

Performance overhead for frequent (5Hz) Core Data saves

For an iPhone app that plays audio files, I'm working on a system to track the user's progress in any episode they've listened to (e.g., they listen to the first 4:35 of file1, then start another file, and when they go back to file1 it resumes at 4:35).
I've set up a Core Data model to store the metadata, but I'm wondering how aggressively I could/should cache the current location during playback.
Currently I have just stuck the save: call in a method that was previously being used to update the time labels and UISlider playhead. That method is being called by an NSTimer every 0.2 seconds.
0.2 seconds is much more precision than I need for the progress cache. The values are rounded to the nearest second anyway, so essentially 4 out of every 5 saves are redundant.
Given, though, that this is pretty much all Core Data is doing, and that it's only ever dealing with a single value for a single record at any given time, I'm wondering whether it makes more sense to just do the extra, unnecessary save: calls, or to manage a second timer to do the update less frequently.
As is, Instruments reports the Save Duration of each event as ~800, peaking around 2000. I'm not really sure how to interpret those results. Actual app performance in the simulator doesn't appear to be significantly impacted.
If this kind of save is so cheap that it makes sense to keep code complexity low (only managing a single timer), I would keep it as is, but my gut instinct is that that's a lot of operations, no matter how cheap.
You shouldn't see as much of a difference in performance as you may see in battery consumption.
Writing to flash storage in an iOS device is much faster than writing to a spinning-platter HDD on a computer. Also, a write to an HDD does not cost much electricity compared to just keeping the platters spinning anyway. However, writing to flash storage takes more power than a read, or than leaving the flash alone.
In other words, the power consumption for a write on an iOS device is not negligible. If you can get away with fewer writes per second, that could easily result in a notable improvement in battery consumption for your app.