Running Jobs in Parallel for same project and different branches - gitlab-ci

What do I need to change in order to run these jobs in parallel?
There is one more runner available on the server, but it's not picking up the "pending" job until the "running" one is finished.
UPDATE
The jobs are picked up by different runners (ci-runner-1 and ci-runner-2), but they still run sequentially, as shown in the screenshots.

The problem was that in config.toml (/etc/gitlab-runner/config.toml in my case) I had:
concurrent = 1
Changing this to a value greater than 1 (the maximum number of jobs allowed to run concurrently), then restarting gitlab-runner, fixed it.
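For example, on the runner host (the path is taken from the question; the value 4 is just an illustration):
# Raise the global job limit in /etc/gitlab-runner/config.toml, then restart the runner
sudo sed -i 's/^concurrent = 1$/concurrent = 4/' /etc/gitlab-runner/config.toml
sudo gitlab-runner restart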
Reference:
https://docs.gitlab.com/runner/configuration/advanced-configuration.html#the-global-section


How to monitor Databricks jobs using CLI or Databricks API to get the information about all jobs

I want to monitor the status of the jobs to see whether they are running over time or have failed. If you have a script or any reference, please help me with this. Thanks.
You can use the databricks runs list command to list all the job runs and their current status (RUNNING/FAILED/SUCCESS/TERMINATED).
If you want to see whether a job is running over, you then have to use the databricks runs get --run-id command to fetch the metadata for that run. This returns JSON from which you can parse out the start_time and end_time.
# Lists job runs.
databricks runs list
# Gets the metadata about a run in json form
databricks runs get --run-id 1234
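If you want to script the overtime check, a minimal sketch (assuming jq is installed; start_time in the run metadata is milliseconds since the epoch):
# Show how long run 1234 has been going, based on the run metadata
databricks runs get --run-id 1234 \
  | jq '{run_id, state: .state.life_cycle_state, running_minutes: ((now*1000 - .start_time) / 60000 | floor)}'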
Hope this helps get you on track!

Run a SQL Server job until it succeeds

I have a SQL Server job that has been running on a schedule for almost 2 years.
It connects to an unreliable Oracle database that keeps disconnecting, so the job regularly fails for that reason. When I run it again after 10 or 15 minutes, it works. I'm tired of checking it every day...
Is there a way to make the job keep retrying the connection to that Oracle source until it succeeds, or to have another job watch this job's status and, if it failed, rerun it until it succeeds?
A solution we are using is something like this:
Wrap your Oracle query in an SSIS package, and after the query, have the package update a SQL table that keeps either a history of executions, or just a single row that tracks the last time the job ran successfully. In short, if the Oracle query was successful, then put something in a table saying the query ran successfully today. If it was not successful, then don't put anything in the table for today.
Then at the beginning of the package, BEFORE the Oracle query, check to see if the query has been run successfully today. If it has already run successfully, then do nothing and exit the package. If it has not run successfully today, then go ahead and try to run it, following the post-query steps described above. If you have any other conditions about when the package should run (like "only after 10 am" or anything like that) you would include that logic here.
Finally, create a job to call the package, and schedule it to run every 15 minutes, or however often you like. It will try every 15 minutes until it runs successfully, and after that it will do nothing more until the next day.
As a bonus, you can use this same package and job to initiate all tasks that you want handled the same way. You just need to keep metadata about all these tasks in your history/metadata table.
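A minimal sketch of that "already ran successfully today?" gate, assuming a tracking table dbo.JobRunLog with JobName and LastSuccess columns (names are illustrative), run here through sqlcmd:
# Returns 0 if the Oracle extract still needs to run today, a positive count otherwise
sqlcmd -S myserver -E -Q "SELECT COUNT(*) AS AlreadyRanToday FROM dbo.JobRunLog WHERE JobName = 'OracleExtract' AND CAST(LastSuccess AS date) = CAST(GETDATE() AS date);"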
An alternative is to create the job step and leave it unscheduled, and create an SSIS job that acts as the master of all your jobs: it runs every minute, checks a config table for job steps that have yet to succeed today, and executes any it finds using sp_start_job.
If they run successfully, it logs the stats to a log table, which prevents them from being launched again until the next day. This saves all your jobs from needing to be scheduled every 15 minutes or so; they launch as soon as possible, and you can add extra logic to handle dependencies, the number running in parallel, importance level, start time, latest start time, the maximum number of retries, etc.
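The launch itself is just a call to sp_start_job for each pending entry the master finds (the job name here is illustrative):
# Kick off an unscheduled Agent job by name
sqlcmd -S myserver -E -Q "EXEC msdb.dbo.sp_start_job @job_name = N'Oracle extract';"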

How do I troubleshoot a hanging SQL Server Agent Job

I have a SQL Server Agent Job with 4 steps. If I run it, it displays as "Executing" indefinitely. If I run the code from the four steps sequentially, directly in SSMS, it takes ~7 seconds to execute. No configuration information (owner, run as, database, etc.) differs from any other job that runs normally. What else can I examine?
As with any problem that comes in a group, break it down into its individual parts. You have run each step separately, so you know each individual step works. Next, run steps 1 and 2 together and see if that works. Then run steps 1, 2, and 3, and see what happens. Eliminate the possible issues step by step. My guess is that one step is not returning success and its error logic does not say to fail or move on to the next step.
Check the properties on each step under advanced and check the On success and On Failure actions.
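You can also ask msdb which step the hanging run is currently sitting on (a sketch; assumes you have permission to read msdb):
# List Agent jobs that are still executing and the last step they reached
sqlcmd -S myserver -E -Q "SELECT j.name, a.last_executed_step_id, a.start_execution_date FROM msdb.dbo.sysjobactivity a JOIN msdb.dbo.sysjobs j ON j.job_id = a.job_id WHERE a.start_execution_date IS NOT NULL AND a.stop_execution_date IS NULL;"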

What is the difference between a job pid and a process id in *nix?

A job is a pipeline of processes. After I execute a command line, for example sleep 42 &, the terminal gives me some information like this:
[1] 31562
Is this 31562 the "job pid" of this job? Is it the same as the process ID of the sleep command?
And if I have a command with a pipe, more than one process is created. Is the job pid the same as the process ID of the first process in the pipeline?
A job is a pipeline of processes.
Not necessarily. Though most of the time a job is a pipeline of processes, it can also be a single command, or a set of commands separated by &&. For example, this creates a job with multiple processes that are not connected by a pipeline:
cat && ps u && ls -l && pwd &
Now, with that out of the way, let's get to the interesting stuff.
Is this 31562 the "job pid" of this job? Is it the same as the process ID of the sleep command?
The job identifier is given inside the square brackets. In this case, it's 1. That's the ID you'll use to bring it to the foreground and perform other administrative tasks. It's what identifies this job in the shell.
The number 31562 is the process group ID of the process group running the job. UNIX/Linux shells make use of process groups: a process group is a set of processes that are somehow related (often by a linear pipeline, but as mentioned before, not necessarily the case). At any moment, you can have 0 or more background process groups, and at most one foreground process group (the shell controls which groups are in the background and which is in the foreground with tcsetpgrp(3)).
A group of processes is identified by a process group ID, which is the ID of the process group leader. The process group leader is the process that first created and joined the group by calling setpgid(2). The exact process that does this depends on how the shell is implemented, but in bash, IIRC, it is the last process in the pipeline.
In any case, what the shell shows is the process group ID of the process group running the job (which, again, is really just the PID of the group leader).
Note that the group leader may have died in the past; the process group ID won't change. This means that the process group ID does not necessarily correspond to a live process.
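You can check this yourself; the PIDs below are whatever your system assigns, and ps output columns vary slightly between implementations:
sleep 42 &
# [1] 31562
jobs -l                        # job number plus the PID(s) of the job's processes
ps -o pid,pgid,comm -p 31562   # the PGID column matches the number the shell printed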

Running same Kettle Job from two different scripts Issue

Is it possible to run the same Kettle job more than once, simultaneously?
What I am Trying
Say we run this command twice at the same time:
sh kitchen.sh -rep="development" -dir="job_directory" -job="job1"
If I run it only once at a time, the data flow is perfectly fine.
But when I run this command twice at the same time, it throws an error like:
ERROR 09-01 13:34:13,295 - job1 - Error in step, asking everyone to stop because of:
ERROR 09-01 13:34:13,295 - job1 - org.pentaho.di.core.exception.KettleException:
java.lang.Exception: Return code 1 received from statement : mkfifo /tmp/fiforeg
Return code 1 received from statement : mkfifo /tmp/fiforeg
at org.pentaho.di.trans.steps.mysqlbulkloader.MySQLBulkLoader.execute(MySQLBulkLoader.java:140)
at org.pentaho.di.trans.steps.mysqlbulkloader.MySQLBulkLoader.processRow(MySQLBulkLoader.java:267)
at org.pentaho.di.trans.step.RunThread.run(RunThread.java:50)
at java.lang.Thread.run(Thread.java:679)
Caused by: java.lang.Exception: Return code 1 received from statement : mkfifo /tmp/fiforeg
at org.pentaho.di.trans.steps.mysqlbulkloader.MySQLBulkLoader.execute(MySQLBulkLoader.java:95)
... 3 more
It's important to be able to run the job twice at the same time. To accomplish this, I could duplicate every job and run the original and the duplicate together, but that is not a good approach in the long run!
Question:
Is Pentaho not maintaining threads?
Am I missing something, or can I enable some option to make Pentaho create different threads for different job instances?
Of course Kettle maintains threads. A great many of them in fact. It looks like the problem is that the MySQL bulk loader uses a FIFO. You have two instances of a FIFO called /tmp/fiforeg. The first instance to run creates the FIFO just fine; the second then tries to create another instance with the same name and that results in an error.
At the start of the job, you need to generate a unique FIFO name for that instance. I think you can do this by adding a transformation at the start of the job that uses a Generate random value step to generate a random string or even a UUID and store it in a variable in the job via the Set variables step.
Then you can use this variable in the 'Fifo file' field of the MySQL bulk loader.
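If your Kettle version supports named parameters on the command line, you can also push the unique path in from the caller instead; FIFO_PATH here is an illustrative parameter that you would define in the job and reference as ${FIFO_PATH} in the 'Fifo file' field:
# $$ is the calling shell's PID, so two simultaneous invocations get different FIFO files
sh kitchen.sh -rep="development" -dir="job_directory" -job="job1" -param:FIFO_PATH="/tmp/fiforeg_$$"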
Hope that works for you. I don't use MySQL, so I have no way to make sure.