Calculating the time of a job in HPC

I'm starting to use cloud resources.
In my project I need to run a job and then calculate the time between the start of its execution in the queue and its end.
To put the job in the queue, I used the command:
qsub myjob
How can I do this?
Thanks in advance

The simplest (although not the most accurate) way is to get the report from your queue system. If you use PBS (which is my first guess from your qsub command), you can insert these options in your script:
#PBS -m abe
#PBS -M your_email_address
This sends you a notification at begin (b), end (e), and abort (a). The end notification should include a resources_used.walltime entry with the wall time spent.
If you use another queue system, there should be a similar option in its manual.
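For reference, here's a minimal sketch of a complete PBS script combining those options with a manual timestamp cross-check; the executable name and email address are placeholders you'd replace with your own:
#!/bin/bash
#PBS -N myjob
#PBS -m abe
#PBS -M your_email@example.com
# record the start time inside the script as a cross-check
start=$(date +%s)
./my_program   # placeholder for your actual workload
end=$(date +%s)
echo "Elapsed wall time: $((end - start)) seconds"
Note that timing it this way covers only the execution itself; the time the job spends waiting in the queue is the gap between submission and the begin (b) notification.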

Related

Slurm - job runs, gets data but gives TIMEOUT error

So I'm running some code which takes about 2 hours to run on the cluster. I configured the batch file with
# Set maximum wallclock time limit for this job
#Time Format = days-hours:minutes:seconds
#SBATCH --time=0-02:15:00
Just to give some overhead in case the job slows down for whatever reason. I checked the directory that the generated files are stored in, and the simulation completes successfully every time. Despite this, Slurm keeps the job running until it hits the max time. The .out file keeps saying
slurmstepd: *** JOB CANCELLED AT 2022-03-05T10:38:26 DUE TO TIME LIMIT ***
Any ideas why it doesn't show as complete instead?
In my opinion, this error is not related to Slurm but rather to your application. Your application is somehow not exiting, so Slurm never sees the job complete.
You can use sstat -j <jobid> to see the status of the job, perhaps after the 2 hours have passed, to check how the CPU consumption etc. is going and figure out what happens in your application (where it hangs after completing its work, or similar).
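For example, a rough sketch of what that check could look like (12345 is a placeholder job ID; the format fields are standard sstat columns):
# CPU and memory usage of the running job's steps
$ sstat -j 12345 --format=JobID,AveCPU,AveRSS,MaxRSS
# is the job still listed as RUNNING?
$ squeue -j 12345
If squeue still shows the job as RUNNING long after your output files are complete, that supports the idea that something in the job script is hanging instead of exiting.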

What is the meaning of "-- -j 8" in a cmake command?

This is a similar question: What does --target option mean in CMake?
but the answer there doesn't explain anything about "-- -j 8".
What does it actually do?
The -- option means to pass the following options on to the native tool, probably make. For make, the -j option means the number of simultaneous jobs to run:
-j [jobs], --jobs[=jobs]
Specifies the number of jobs (commands) to run simultaneously. If there is more than one -j option, the last one is effective. If the -j option is given without an argument, make will not limit the number of jobs that can run simultaneously.
This allows make to use multiple processes to run different build steps at the same time, likely reducing build time.
Normally, make will execute only one recipe at a time, waiting for it to finish before executing the next. However, the ‘-j’ or ‘--jobs’ option tells make to execute many recipes simultaneously.
Alternatively, when given to cmake --build itself (CMake 3.12 or later), the -j option tells CMake to run up to N separate jobs at the same time.
From the cmake --build options:
The maximum number of concurrent processes to use when building.
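To illustrate where -j can go, both of the following should behave the same with a Makefile generator (the second form requires CMake 3.12 or later):
# pass -j 8 through to the native tool (make) after the -- separator
$ cmake --build . -- -j 8
# or let CMake itself spawn up to 8 jobs; --parallel 8 is the long form
$ cmake --build . -j 8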

Redis mass insertions on remote server

I have a remote server running Redis where I want to push a lot of data from a Java application. Until now I used Webdis to push one command at a time, which is not efficient, but I did not have any security issues, because I could define which IPs were accepted as connections and which command authorizations applied, while Redis was not accepting requests from outside (protected mode).
I want to try to use Jedis (a Java client) and its pipeline implementation for faster insertion, but that means I have to open my Redis to accept requests from outside.
My question is this: Is it possible to use Webdis in a similar way (pipelined mass insertion)? And if not, what are the security configurations I need to make to use something like Jedis over the internet?
Thanks in advance for any answer.
IMO the security setup should be transparent to the Redis driver. No driver or password protection will be as secure as purpose-built protocols or technologies.
The simplest way I'd handle security is to keep Redis listening on 127.0.0.1:<some port> and use an SSH tunnel to the machine. At least this way you can test the performance against your current scenario.
You can also use IPsec or OpenVPN afterwards to set up a private network that can communicate with the Redis server.
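As a sketch, the tunnel could look like this (user, host, and ports are placeholders):
# forward local port 6379 to the Redis port on the remote machine;
# -N opens the tunnel without running a remote command
$ ssh -N -L 6379:127.0.0.1:6379 user@redis-host.example.com
Your Java application then connects to localhost:6379 as if Redis were local, while the traffic goes over the encrypted SSH connection.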
This question is almost 4 years old so I hope its author has moved on by now, but in case someone else has the same issue I thought I might suggest a way to send data to Webdis more efficiently.
You can indeed make data ingest faster by batching your inserts, meaning you can use MSET to insert multiple keys in a single request (or HMSET for hashes, etc).
As an example, here's ApacheBench (ab) inserting one key 100,000 times using 100 clients:
$ ab -c 100 -n 100000 -k 'http://127.0.0.1:7379/SET/foo/bar'
[...]
Requests per second: 82235.15 [#/sec] (mean)
We're measuring 82,235 single-key inserts per second. Keep in mind that there's a lot more to HTTP benchmarking than just looking at averages (the latency distribution is still important, etc.) but this example is only about showing the difference that batching can make.
You can send commands to Webdis in one of three ways (documented here):
GET /COMMAND/arg0/.../argN
POST / with COMMAND/arg0/.../argN in the HTTP body (demonstrated below)
PUT /COMMAND/arg0/.../argN-1 with argN in the HTTP body
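For instance, the POST form can be exercised with curl (assuming Webdis is on its default port 7379):
# send a single command in the HTTP body
$ curl -d 'SET/hello/world' 'http://127.0.0.1:7379/'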
If instead of inserting one key per request we create a file containing the MSET command to write 100 keys in a single request, we can significantly increase the write rate.
# first showing what the command looks like for 3 keys
$ echo -n 'MSET' ; for i in $(seq 1 3); do echo -n "/key-${i}/value-${i}"; done
MSET/key-1/value-1/key-2/value-2/key-3/value-3
# then saving the command to write 100 keys to a file:
$ (echo -n 'MSET' ; for i in $(seq 1 100); do echo -n "/key-${i}/value-${i}"; done) > batch-contents.txt
With this file in place, we can use ab to send it as a POST request (-p) to Webdis:
$ ab -c 100 -n 10000 -k -p ./batch-contents.txt -T 'application/x-www-form-urlencoded' 'http://127.0.0.1:7379/'
[...]
Requests per second: 18762.82 [#/sec] (mean)
This is showing 18,762 requests per second… with each request performing 100 inserts, for a total of 1,876,282 actual key inserts per second.
If you track the CPU usage of Redis while ab is running, you'll find that the MSET use case pegs it at 100% CPU while sending individual SET does not.
Once again keep in mind that this is a rough benchmark, just enough to show that there is a significant difference when you batch inserts. This is true regardless of whether Webdis is used, by the way: batching inserts from a client connecting directly to Redis should also be much faster than individual inserts.
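As a rough sketch of that direct approach, redis-cli ships a pipe mode meant for mass insertion (key names here are placeholders):
# generate 100 SET commands and stream them to Redis over one connection
$ for i in $(seq 1 100); do echo "SET key-$i value-$i"; done | redis-cli --pipe
Pipe mode forwards stdin to the server, so newline-separated inline commands like these work, though the Redis documentation recommends generating the binary-safe RESP protocol for production-sized loads.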
(Note: I am the author of Webdis.)

What is the difference between a job pid and a process id in *nix?

A job is a pipeline of processes. After I execute a command line, for example sleep 42 &, the terminal gives me some information like this:
[1] 31562
Is this 31562 the "job pid" of this job? Is it the same as the process ID of the sleep command?
And if I have a command with a pipe, more than one process will be created. Is the job pid the same as the process ID of the first process in the pipeline?
A job is a pipeline of processes.
Not necessarily. Though most of the time a job consists of a pipeline of processes, it can be a single command, or it can be a set of commands separated by &&. For example, this would create a job with multiple processes that are not connected by a pipeline:
cat && ps u && ls -l && pwd &
Now, with that out of the way, let's get to the interesting stuff.
Is this 31562 the "job pid" of this job? Is it the same as the process ID of the sleep command?
The job identifier is given inside the square brackets. In this case, it's 1. That's the ID you'll use to bring it to the foreground and perform other administrative tasks. It's what identifies this job in the shell.
The number 31562 is the process group ID of the process group running the job. UNIX/Linux shells make use of process groups: a process group is a set of processes that are somehow related (often by a linear pipeline, but as mentioned before, not necessarily the case). At any moment, you can have 0 or more background process groups, and at most one foreground process group (the shell controls which groups are in the background and which is in the foreground with tcsetpgrp(3)).
A group of processes is identified by a process group ID, which is the ID of the process group leader. The process group leader is the process that first created and joined the group by calling setpgid(2). The exact process that does this depends on how the shell is implemented, but in bash, IIRC, it is the last process in the pipeline.
In any case, what the shell shows is the process group ID of the process group running the job (which, again, is really just the PID of the group leader).
Note that the group leader may have died in the past; the process group ID won't change. This means that the process group ID does not necessarily correspond to a live process.
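To see this in practice, here's a quick sketch you can run in bash (the PIDs will of course differ on your machine):
# start a two-process pipeline in the background
$ sleep 100 | sleep 100 &
# -l makes jobs list the PIDs belonging to each job
$ jobs -l
# both sleep processes share one process group ID (PGID)
$ ps -o pid,pgid,comm | grep sleep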

Schedule a cronjob on ssh with command line

I am using an Amazon AWS server. I want to schedule my cron job from the command line.
I am using this command for scheduling the cron job:
at -f shellscript.sh -v 18:30
but it schedules the job only once. I want to configure it to run repeatedly, for example once a day or every five minutes.
Please help me with the command I have to use.
Thanks,
As @The.Anti.9 noted, this kind of question fits on Server Fault.
To answer your question: crontab is a little more powerful than at and gives you more flexibility, as you can run the job repeatedly, for instance daily, weekly, or monthly.
For instance, for your example, if you need to run the script every day at 18:30 you'd do this:
$ crontab -e
then add the following
30 18 * * * /path/to/your/script.sh
save and you are done.
Note: 30 18 indicates the time 18:30, and the *s indicate that it should run every day of every month. If you need to run it on a particular day of the month, just check the man page of crontab.
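And since you also mentioned every five minutes, that schedule would be (the script path is a placeholder):
*/5 * * * * /path/to/your/script.sh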
Doesn't crontab -e work?
And to generate crontab entries, http://www.openjs.com/scripts/jslibrary/demos/crontab.php should help.
You can use the command crontab -e to edit your scheduled cron jobs. A good explanation of how to set the time can be found on the Ubuntu forum.