Is there a way to force Bazel to run tests serially - gpu

By default, Bazel runs tests in a parallel fashion to speed things up. However, I have a resource (GPU) that can't handle parallel jobs due to the GPU memory limit. Is there a way to force Bazel to run tests in a serial, i.e., non-parallel way?
Thanks.

--jobs 1 will limit the number of parallel jobs Bazel runs to 1.
You can also modify the test targets and add tags = ["exclusive"] to prevent specific test to run in parallel (see http://bazel.io/docs/test-encyclopedia.html).

Use --local_test_jobs=1 to only run a single test job at a time locally.
The max number of local test jobs to run concurrently. Takes an integer, or a keyword ("auto", "HOST_CPUS", "HOST_RAM"), optionally followed by an operation ([-|]) eg. "auto", "HOST_CPUS.5". 0 means local resources will limit the number of local test jobs to run concurrently instead. Setting this greater than the value for --jobs is ineffectual
tags = ["exclusive"] has other complications to consider with respect to caching.
--jobs will serialize the entire build process, not just testing, so it's less than ideal.

There are 2 resources Bazel will respect limitations upon: RAM and CPU. You may hijack one (Probably RAM) to represent GPU(s) as they're available to a run and required by a test. (I've stopped short of doing this for a limited hardware resource because it feels to inelegant, but I can't think of a reason it shouldn't work.)

Future releases of Bazel should support extra resources like GPUs
and releases that contain that change should support extra resource tags like "resources:GPU:1" when --local_extra_resources=gpu=1 is set. This should enable GPU tests to be bound by a limited quantity of GPUs, and for them to run non-exclusively and without limiting the total number of --jobs or "test_jobs"

Related

Usages of cores in Spark SQL Execution

I am new to Spark SQL queries and trying to understand it's working under the hood.
I have come across the term "Core" in the Spark vocabulary but still struggling to get a hold on the same.
I know that - 1 core = 1 task.
My questions -
Can anyone please explain what exactly does a core mean ?
Does Spark UI show the number of cores currently allocated for my job ? If yes,
then where can I see it ?
If I find in the Spark UI that the number of tasks running is less, is
there a way to increase the number of cores allocated for my job, so
that Spark can submit more tasks and make my job run faster ?
Please advise.
Yes, you are right in a way.
In spark task are distributed across executors, on each executor number of task running is equal to the number of cores on that executors. So basically core is something that is going to execute your task. The task here is the most granular work that needs to be carried out.
JOB=>STAGE=>TASK
Yes, spark UI shows you the number of the task currently running on your every executor. You can check them under the Executors tab. This tab shows you a very detailed view of your task allocation against the number of cores available and a lot of other details.
Yes, you can increase the number of cores. You can do that by passing the argument in the spark-submit command.
--executor-cores n
Here n is the number of cores you want. For optimum usage, it should be 5.
It is not necessary that more than the number of cores faster your job will run.
Your task needs to be distributed equally across all the cores available to run faster.
If you provide more cores than required they will remain idle most of the time.

Stopping when the solution is good enough?

I successfully implemented a solver that fits my needs. However, I need to run the solver on 1500+ different "problems" at 0:00 precisely, everyday. Because my web-app is in ruby, I built a quarkus "micro-service" that takes the data, calculate a solution and return it to my main app.
In my application.properties, I set:
quarkus.optaplanner.solver.termination.spent-limit=5s
which means each request take ~5s to solve. But sending 1500 requests at once will saturate the CPU on my machine.
Is there a way to tell OptaPlanner to stop when the solution is good enough ? ( for example if the score is stable ... ). That way I can maybe reduce the time from 5s to 1-2s depending on the problem?
What are your recommandations for my specific scenario?
The SolverManager will automatically queue solver jobs if too many come in, based on its parallelSolverCount configuration:
quarkus.optaplanner.solver-manager.parallel-solver-count=3
In this case, it will run 3 solvers in parallel. So if 7 datasets come in, it will solve 3 of them and the other 4 later, as the earlier solvers terminate. However if you use moveThreadCount=2, then each solver uses at least 2 cpu cores, so you're using at least 6 CPU cores.
By default parallelSolverCount is currently set to half your CPU cores (it currently ignores moveThreadCount). In containers, it's important to use JDK 11+: the CPU count of the container is often different than from the bare metal machine.
You can indeed tell the OptaPlanner Solvers to stop when the solution is good enough, for example when a certain score is attained or the score hasn't improved in an amount of time, or combinations thereof. See these OptaPlanner docs. Quarkus exposes some of these already (the rest currently still need a solverConfig.xml file), some Quarkus examples:
quarkus.optaplanner.solver.termination.spent-limit=5s
quarkus.optaplanner.solver.termination.unimproved-spent-limit=2s
quarkus.optaplanner.solver.termination.best-score-limit=0hard/-1000soft

How to increase performance when building Vue webapp?

When I build my Vue project for production, it usually takes a few minutes of processing, even when using a powerfull workstation ...
I think this may be due to the default hardware limitation of 1 worker and 2GB memory as shown by the log below :
$ vue-cli-service build
Building for production...Starting type checking service...
Using 1 worker with 2048MB memory limit
\ Building for production...
Would the build process be faster if this limit was increased ? If yes, how can I change it ?
The build process taking minutes seems really high. There could be a number of reasons for that.
Specific to your question related to the memory limit, that's just for type checker. Vue uses fork-ts-checker-webpack-plugin to pull out the type checking into a separate process to speed up the build. If that is the main cause of slow build, then indeed playing with the memory limit and workers may help.
Somebody already answered how to do that in this SO post.
That only answers how to change the memory limit.
But you can change the number of workers in the same way. E.g.
forkTsCheckerOptions.workers = 4;

Run two jobs in parallel in one GPU

Running on tensorflow, the moment that I try and run two jobs in parallel with any gpu related jobs, it flags and gives me an error regarding cuDNN module being not available. Is there any method where I can run two jobs in parallel (maybe at the cost of increased time length)?

Why does Jest --runInBand speed up tests?

I read that the --runInBand flag speeds up Jest test duration by 50% on CI servers. I can't really find an explanation online on what that flag does except that it lets tests run in the same thread and sequentially.
Why does running the test in the same thread and sequentially make it faster? Intuitively, shouldn't that make it slower?
Reading your linked page and some other related sources (like this github issue) some users have found that:
...using the --runInBand helps in an environment with limited resources.
and
... --runInBand took our tests from >1.5 hours (actually I don't know how long because Jenkins timed out at 1.5 hours) to around 4 minutes. (Note: we have really poor resources for our build server)
As we can see, those users had improvements in their performances on their machines even though they had limited resources on them. If we read what does the --runInBand flag does from the docs it says:
Alias: -i. Run all tests serially in the current process, rather than creating a worker pool of child processes that run tests. This can be useful for debugging.
Therefore, taking into consideration these comments and the docs, I believe the improvement in performance is due to the fact that now the process runs in a single thread. This greatly helps a limited-resource-computer because it does not have to spend memory and time dealing and handling multiple threads in a thread pool, a task that could prove to be too expensive for its limited resources.
However, I believe this is the case only if the machine you are using also has limited resources. If you used a more "powerful" machine (i.e.: several cores, decent RAM, SSD, etc.) using multiple threads probably will be better than running a single one.
When you run tests in multi-threads, jest creates a cache for every thread. When you run with --runInBand jest uses one cache storage for all tests.
I found it after runs 20 identical tests files, first with key --runInBand, a first test takes 25 seconds and next identical tests take 2-3s each.
When I run tests without --runInBand key, each identical test file executes in 25 seconds.