Why does Jest --runInBand speed up tests? - testing

I read that the --runInBand flag cuts Jest test duration by 50% on CI servers. I can't really find an explanation online of what that flag does, except that it makes the tests run sequentially in the same thread.
Why does running the test in the same thread and sequentially make it faster? Intuitively, shouldn't that make it slower?

According to the page you linked and some related sources (such as this GitHub issue), some users have found that:
...using the --runInBand helps in an environment with limited resources.
and
... --runInBand took our tests from >1.5 hours (actually I don't know how long because Jenkins timed out at 1.5 hours) to around 4 minutes. (Note: we have really poor resources for our build server)
As we can see, those users saw performance improvements on machines with limited resources. The docs describe the --runInBand flag as follows:
Alias: -i. Run all tests serially in the current process, rather than creating a worker pool of child processes that run tests. This can be useful for debugging.
Therefore, taking these comments and the docs into consideration, I believe the improvement in performance comes from the tests now running in a single process. This greatly helps a machine with limited resources, because it no longer has to spend memory and time creating and coordinating a pool of worker processes, a task that can prove too expensive for it.
However, I believe this holds only if the machine you are using has limited resources. On a more "powerful" machine (i.e., several cores, decent RAM, SSD, etc.), running tests in parallel workers will probably be faster than running them serially.

When you run tests in parallel, Jest creates a separate cache for every worker. When you run with --runInBand, Jest uses one cache for all tests.
I found this by running 20 identical test files. With --runInBand, the first test file takes 25 seconds and each subsequent identical file takes 2-3 seconds.
Without --runInBand, each identical test file takes 25 seconds.
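The timings reported above can be turned into a rough back-of-the-envelope estimate. The Python sketch below is only a toy model, not a description of Jest's internals. It assumes a file costs 25 seconds the first time a process runs it and 2.5 seconds afterwards (the midpoint of the reported 2-3 seconds), that each worker warms its own cache, and that CPU work divides evenly across the available cores. The function names and the cores parameter are made up for illustration.

```python
COLD_SECONDS = 25.0  # first run of the test file in a given process (from the post)
WARM_SECONDS = 2.5   # assumed midpoint of the reported 2-3 s for later runs


def total_cpu_seconds(files: int, workers: int) -> float:
    """Total CPU work if every worker pays the cold cost once."""
    used = min(files, workers)  # workers that actually receive a file
    return used * COLD_SECONDS + (files - used) * WARM_SECONDS


def rough_wall_seconds(files: int, workers: int, cores: int) -> float:
    """Idealized wall time: CPU work spread evenly over the usable cores."""
    used = min(files, workers)
    if used == 0:
        return 0.0
    return total_cpu_seconds(files, workers) / min(cores, used)


if __name__ == "__main__":
    # One process (like --runInBand) versus one worker per file.
    print(rough_wall_seconds(files=20, workers=1, cores=2))    # 72.5
    print(rough_wall_seconds(files=20, workers=20, cores=2))   # 250.0
    print(rough_wall_seconds(files=20, workers=20, cores=20))  # 25.0
```

Under these assumptions, the single process wins on a 2-core machine because it pays the cold cost only once, while 20 workers win once there are enough cores to run them all at the same time. Real runs also pay for process startup, memory pressure, and scheduling, which this model ignores.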

Related

What is the maximum concurrency TestCafe can support

We are trying TestCafe with a concurrency of 50, and TestCafe randomly fails with "Cannot read property 'stackFrames' of undefined".
The same code base works fine without any issues when we run it with 20 threads. Is there an upper limit on the number of threads in TestCafe?
Note: we have enough CPU to support 50 threads on AWS (c5a.8xlarge).
TestCafe has no formal limit on the number of threads. However, we have not tested the concurrency mode with so many threads. Therefore, I can only give general recommendations: make sure that you have enough resources to run so many browsers (not only CPU): RAM, disk I/O, GPU. Also, run browsers in headless mode, for example: testcafe -c 50 chrome:headless test.js.

Is there a way to force Bazel to run tests serially

By default, Bazel runs tests in a parallel fashion to speed things up. However, I have a resource (GPU) that can't handle parallel jobs due to the GPU memory limit. Is there a way to force Bazel to run tests in a serial, i.e., non-parallel way?
Thanks.
--jobs 1 will limit the number of parallel jobs Bazel runs to 1.
You can also modify the test targets and add tags = ["exclusive"] to prevent specific tests from running in parallel (see http://bazel.io/docs/test-encyclopedia.html).
Use --local_test_jobs=1 to only run a single test job at a time locally.
The max number of local test jobs to run concurrently. Takes an integer, or a keyword ("auto", "HOST_CPUS", "HOST_RAM"), optionally followed by an operation ([-|*]<float>) eg. "auto", "HOST_CPUS*.5". 0 means local resources will limit the number of local test jobs to run concurrently instead. Setting this greater than the value for --jobs is ineffectual.
tags = ["exclusive"] has other complications to consider with respect to caching.
--jobs 1 will serialize the entire build process, not just testing, so it's less than ideal.
There are 2 resources whose limits Bazel will respect: RAM and CPU. You may hijack one (probably RAM) to represent the GPU(s) available to a run and required by a test. (I've stopped short of doing this for a limited hardware resource because it feels too inelegant, but I can't think of a reason it shouldn't work.)
Future releases of Bazel should support extra resources like GPUs, and releases that contain that change should support extra resource tags like "resources:GPU:1" when --local_extra_resources=gpu=1 is set. This should enable GPU tests to be bound by a limited quantity of GPUs, and to run non-exclusively and without limiting the total number of --jobs or "test_jobs".

What are some factors that could affect program runtime?

I'm doing some work on profiling the behavior of programs. One thing I would like to do is get the amount of time that a process has run on the CPU. I am accomplishing this by reading the sum_exec_runtime field in the Linux kernel's sched_entity data structure.
After testing this with some fairly simple programs which simply execute a loop and then exit, I am running into a peculiar issue, being that the program does not finish with the same runtime each time it is executed. Seeing as sum_exec_runtime is a value represented in nanoseconds, I would expect the value to differ within a few microseconds. However, I am seeing variations of several milliseconds.
My initial reaction was that this could be due to I/O waiting times, however it is my understanding that the process should give up the CPU while waiting for I/O. Furthermore, my test programs are simply executing loops, so there should be very little to no I/O.
I am seeking any advice on the following:
Is sum_exec_runtime not the actual time that a process has had control of the CPU?
Does the process not actually give up the CPU while waiting for I/O?
Are there other factors that could affect the actual runtime of a process (besides I/O)?
Keep in mind, I am only trying to find the actual time that the process spent executing on the CPU. I do not care about the total execution time including sleeping or waiting to run.
Edit: I also want to make clear that there are no branches in my test program aside from the loop, which simply loops for a constant number of iterations.
Thanks.
Your question is really broad, but you can incur context switches for various reasons. Calling most system calls involves at least one context switch. Page faults cause context switches. Exceeding your time slice causes a context switch.
sum_exec_runtime is equal to utime + stime from /proc/$PID/stat, but sum_exec_runtime is measured in nanoseconds. It sounds like you only care about utime which is the time your process has been scheduled in user mode. See proc(5) for more details.
You can look at nr_switches both voluntary and involuntary which are also part of sched_entity. That will probably account for most variation, but I would not expect successive runs to be identical. The exact time that you get for each run will be affected by all of the other processes running on the system.
You'll also be affected by the amount of file system cache used on your system and how many file system cache hits you get in successive runs if you are doing any IO at all.
To give a very concrete and obvious example of how other processes can affect the run time of the current process, think about what happens if you exceed your physical RAM constraints. If your program asks for more RAM, then the kernel is going to spend more time swapping. That swapping time will be accounted in stime, but it will vary depending on how much RAM you need and how much RAM is available. There are lots of other ways that other processes can affect your process's run time. This is just one example.
To answer your 3 points:
sum_exec_runtime is the actual time the scheduler ran the process including system time
If you count switching to the kernel as the process giving up the CPU, then yes, but that does not necessarily mean a different user process gets the CPU: your process may get it back once the kernel is done.
I think I've already answered this question: there are lots of factors.
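To see this variation from user space, here is a minimal Python sketch. It does not read sum_exec_runtime from the kernel. Instead it assumes the standard getrusage(2) counters are a close enough stand-in: user time, system time, and voluntary and involuntary context switches for the current process. The helper names and the loop size are arbitrary, and the resource module is only available on Unix-like systems.

```python
import resource


def cpu_snapshot():
    """Return CPU times and context-switch counts for this process."""
    usage = resource.getrusage(resource.RUSAGE_SELF)
    return {
        "utime": usage.ru_utime,         # seconds spent in user mode
        "stime": usage.ru_stime,         # seconds spent in kernel mode
        "voluntary": usage.ru_nvcsw,     # voluntary context switches
        "involuntary": usage.ru_nivcsw,  # involuntary context switches
    }


def measure(work):
    """Run work() and return how much each counter grew while it ran."""
    before = cpu_snapshot()
    work()
    after = cpu_snapshot()
    return {key: after[key] - before[key] for key in before}


def busy_loop(iterations=200_000):
    """Constant amount of work with no I/O, like the test program described."""
    total = 0
    for i in range(iterations):
        total += i
    return total


if __name__ == "__main__":
    # Repeat the same work; the deltas usually differ slightly between runs.
    for run in range(5):
        print(run, measure(busy_loop))
```

Comparing the utime delta with the context-switch deltas across runs gives a rough idea of how much of the spread coincides with preemption rather than with the loop itself.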

What is the difference between load tests and performance tests?

What is the difference between load tests and performance tests? Are load tests just a special type of performance tests? If so, could you provide an example of performance tests, which are not load tests?
Terminology questions are always difficult because many definitions float around. Yet, most of the time "performance test" is a wide category of test in which we look at how the Software Under Test behaves from a technical point of view: time to do some computation, response time of API or UI, memory used on the machine, disk footprint etc. And "load test" is the special case where you check your SUT under heavy load (lots of connections to your server for example).
An example of a perf test that is not a load test? For example, a "longevity test": test how your SUT behaves when it runs (under normal load) for a long time (several days/weeks). This test might highlight a memory or thread leak, or you could discover that a given log file becomes huge, or that after some time, for some reason, the system becomes unstable.
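The distinction can be sketched in a few lines of Python. This is only an illustration under assumptions: handle_request is a hypothetical stand-in for the system under test, and the sample counts and concurrency level are arbitrary. The first function takes latency samples one call at a time (a plain performance measurement); the second takes the same measurement while many callers hit the system at once (a load test).

```python
import time
from concurrent.futures import ThreadPoolExecutor


def handle_request(delay=0.01):
    """Hypothetical stand-in for the system under test."""
    time.sleep(delay)
    return "ok"


def timed_call():
    """Return the latency in seconds of one call to the system under test."""
    start = time.perf_counter()
    handle_request()
    return time.perf_counter() - start


def performance_test(samples=5):
    """Latency of sequential calls, one at a time."""
    return [timed_call() for _ in range(samples)]


def load_test(requests=50, concurrency=10):
    """Latency of the same call while many callers run concurrently."""
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        futures = [pool.submit(timed_call) for _ in range(requests)]
        return [future.result() for future in futures]


if __name__ == "__main__":
    print("max latency, sequential:", max(performance_test()))
    print("max latency, under load:", max(load_test()))
```

A longevity test would reuse the sequential measurement but keep it running for days while tracking memory, thread counts, and log sizes.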

What types of testing do you include in your build process?

I use TFS 2008. We run unit tests as part of our continuous integration build and integration tests nightly.
What other types of testing do you automate and include in your build process? What technologies do you use to do so?
I'm thinking about smoke tests, performance tests, load tests but don't know how realistic it is to integrate these with Team Build.
First, we have check-in (smoke) tests that must run before code can be checked in. This is done automatically by a job that runs the tests and then makes the check-in to source control upon successful test completion.
Second, CruiseControl kicks off the build and regression tests. The product is built, then several sets of integration tests are run. The number of tests varies by where we are in the release cycle. More testing is added late in the cycle during ramp down. CruiseControl takes all submissions within a certain time window (12 minutes), so your changes may be built and tested with a small number of others.
Third, there's an automated nightly build with tests that are quite extensive. We have load or milestone points every 2 or 3 weeks. At a load point, all automated tests are run and manual testing is done. Performance testing is also done for each milestone. Performance tests can be kicked off on request, but the available hardware is limited, so people have to queue up for performance tests. Usually people rely on the load-point performance tests unless they are making changes specifically to improve performance.
Finally, stress tests are also done for each load. These tests are focussed on making sure the product has no memory leaks or anything else that prevents 24/7 running of the product, as opposed to performance. All of this is done with Ant, CruiseControl, and Python scripts.
Integrating load testing into your build process is a bad idea; just do your normal unit testing to make sure that all your code works as expected. Load and performance testing should be done separately.