Efficient way to write a GitLab CI script - gitlab-ci

I have a GitLab CI pipeline as follows:
Stage A
  job1
  job2
  job3
Stage B
  job1
In all the jobs in both stages, package dependencies are installed first. How can I handle this efficiently so that the total pipeline running time is as short as possible? I can also use multiple runners, so job1, job2, and job3 can run in parallel, and multiple pipelines can run at a time.

You should define a cache: rule to cache your dependencies. Use an appropriate key: on the cache to ensure the cache is restored at appropriate points.
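For example, a minimal sketch assuming a Node.js project (the lockfile, cache paths, and job commands are placeholders for whatever your stack uses):

    # .gitlab-ci.yml (sketch only; adapt lockfile and cache paths to your package manager)
    default:
      before_script:
        # every job still installs dependencies first, but pulls them from the cache
        - npm ci --cache .npm --prefer-offline

    cache:
      key:
        files:
          - package-lock.json   # new cache key (fresh cache) whenever the lockfile changes
      paths:
        - .npm/

    stages:
      - A
      - B

    job1:
      stage: A
      script: ./run-job1.sh   # placeholder for the real work
    job2:
      stage: A
      script: ./run-job2.sh

Jobs in the same stage already run in parallel when enough runners are available, so with a warm cache each job's install step shrinks from minutes to seconds.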
See also: caching dependencies

Related

Skip a long-running job if it's already running in GitLab CI

We have a test that takes several hours to run and that we'd like to run on our codebase as often as possible in GitLab CI. The idea is for it to validate as many commits as possible by merging them from dev into main, but we know it's too slow to run on every commit.
It could run on a schedule, e.g. 18:00 every evening, but then it would run unnecessarily if there have been no changes and it wouldn't run as often as it could, e.g. 2-3 times a day.
Limiting concurrent jobs as suggested here isn't enough because the jobs will pile up, one per commit, and there will never be time to run them all.
We'd like it to complete the test for one commit, and then restart on the latest commit available, skipping over any commits that came in earlier.
I've looked through the rules section of the docs but don't see any magic variables that would let me say "run this job if it's not already running". Perhaps some kind of semaphore as described here (requested but not implemented as far as I can see).
How can we tell GitLab CI to run this particular job only if it's not already running and skip the job otherwise?
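One workaround, in the absence of a built-in semaphore, is to have the job ask the API whether another pipeline is already running and exit early if so. A sketch, assuming a token with read_api scope is exposed as API_TOKEN and that jq is available in the job image:

    long-test:
      rules:
        - if: '$CI_COMMIT_BRANCH == "main"'
      script:
        # Poor man's semaphore: count running pipelines on this branch.
        # The current pipeline is itself "running", hence the > 1 check.
        - |
          running=$(curl --silent --header "PRIVATE-TOKEN: $API_TOKEN" \
            "$CI_API_V4_URL/projects/$CI_PROJECT_ID/pipelines?status=running&ref=$CI_COMMIT_BRANCH" \
            | jq length)
          if [ "$running" -gt 1 ]; then
            echo "Another pipeline is already running the test; skipping."
            exit 0
          fi
        - ./run-multi-hour-test.sh   # placeholder for the long test

This is racy (two pipelines can pass the check at the same moment) and it skips rather than queues, but paired with a nightly scheduled run it approximates the finish-one-then-take-the-latest behaviour described above.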

Can a snakemake job mistakenly run twice at the same time?

When running Snakemake on a cluster, jobs get scheduled fine via Slurm. Sometimes one job fails, which consequently stops the Snakemake instance/run once the still-running jobs have completed. To speed this up I have stopped Snakemake (Ctrl+C) and restarted it. What I did not think of was that in this case some jobs from the previous run might still be running on the cluster. Hence it could potentially happen that the same job is started again if no output has been written by then. In that case it could finally lead to the situation where two jobs write to the same output file. Or is that prevented by some log Snakemake keeps to track successful completion?
I hope you can follow this explanation. Happy for every comment!
In that case it could finally lead to the situation where two jobs write to the same output file.
Snakemake should be aware that the previous execution didn't exit cleanly (because of the Ctrl+C) and that the jobs that were running at that moment are incomplete or absent. However, Snakemake cannot know that those pending jobs are still running as independent processes.
So yes, I think it can happen that jobs step on each other's feet in what you are doing.
In my opinion, before re-running Snakemake it would be safer to kill the pending jobs and start fresh. (Those that completed before Snakemake was killed are fine, of course.)
Note that there is an option in Snakemake that may help you:
--keep-going, -k    Go on with independent jobs if a job fails. (default: False)
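Concretely, assuming Slurm, a safer restart could look like this (a sketch; cancelling everything under your user is an assumption, so narrow the filter if you share the account):

    # inspect what the aborted run left behind on the cluster
    squeue --user "$USER" --format "%i %j %T"

    # kill the leftovers so no two jobs can write to the same output file
    scancel --user "$USER"

    # restart; --rerun-incomplete makes Snakemake redo jobs whose output
    # it recognizes as incomplete from the interrupted run
    snakemake --rerun-incomplete --keep-going   # plus your usual cluster/profile flags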

Can spinnaker prevent out-of-order deployments?

Currently
We use a CI platform to build, test, and release new code when a new PR is merged into master. The "release" step is quite simple/stupid, and essentially runs kubectl patch with the tag of the newly-pushed docker image.
The Problem
When two PRs merge at about the same time (e.g., A, then B; B includes A's commits, but not vice versa), it may happen that B finishes its build/test first and begins its release step first. When this happens, A releases second, even though it has older code. The result is a steady state in which B's code has been effectively rolled back by A's deployment.
We want to keep our CI/CD as continuous as possible, ideally without:
serializing our CI pipeline (so that only one workflow runs at a time)
delaying/batching our deployments
Does Spinnaker have functionality or best-practice that solves for this?
Best practices for your issue are widely described under message ordering for asynchronous systems. The simplest solution would be to implement the FIFO principle for your CI/CD pipeline.
It will save you from implementing checks between CI and CD parts.
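If a full message-ordering setup is more than you want, a lightweight guard inside the release step can enforce the same invariant. A sketch (not a Spinnaker feature; the deployment/container names and the assumption that image tags are git SHAs are placeholders):

    #!/bin/sh
    # Refuse to deploy a commit that the running deployment already contains.
    NEW_SHA="$1"   # the commit this release step was triggered for

    DEPLOYED_SHA=$(kubectl get deployment my-app \
      -o jsonpath='{.spec.template.spec.containers[0].image}' | cut -d: -f2)

    # merge-base --is-ancestor exits 0 if NEW_SHA is already contained in
    # DEPLOYED_SHA (a commit also counts as an ancestor of itself)
    if git merge-base --is-ancestor "$NEW_SHA" "$DEPLOYED_SHA"; then
      echo "Deployed commit already includes $NEW_SHA; skipping to avoid a rollback."
      exit 0
    fi

    kubectl set image deployment/my-app app="registry.example.com/my-app:$NEW_SHA"

In the A-then-B scenario above, A's late release sees that the deployed B already contains A's commit and skips itself, so B is never rolled back.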

Queue job all day and execute it at a specified time

Is there a plugin, or can I configure Jenkins somehow, so that a job (triggered by 3 other jobs) queues until a specified time and only then executes the whole queue?
Our case is this:
1. we have tests run for 3 branches
2. each of the 3 build jobs for those branches triggers the same smoke-test-job, which runs immediately
3. each of the 3 build jobs for those branches triggers the same complete-test-job
Points 1 and 2 work perfectly fine.
The complete-test-job should queue the tests all day long and just execute them in the evening or at night (starting from a defined time like 6 pm), so that the tests are run at night and during the day the job is silent.
Triggering the complete-test-job at a specified time with the newest version is not an option; we absolutely need the trigger from the upstream build job (because of the promotion plugin, and because we do not want to re-run versions that have already been run).
That seems a rather strange request. Why queue a build if you don't want it now... And if you want a build later, then you shouldn't be triggering it now.
You can use the Jenkins Exclusion plugin. Have your test jobs use a certain resource, and make another job whose task is to "hold" the resource during the day. While the resource is in use, the test jobs won't run.
The problem with this: you are going to clog your executors with queued, non-executing jobs, and there won't be free executors left for other jobs.
Haven't tried it myself, but this sounds like a solution to your problem.
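If you prefer something scriptable over the Exclusion plugin's UI configuration, the Lockable Resources plugin (a different plugin, sketched here as one possibility) gives the same hold-a-resource effect in a pipeline job:

    // "Hold" job: grabs the shared resource each weekday morning and sits
    // on it until the evening; while it holds the lock, the test jobs wait.
    // 'nightly-window' is a lockable resource you define in Manage Jenkins.
    pipeline {
        agent any
        triggers { cron('0 6 * * 1-5') }   // start the hold at 06:00, Mon-Fri
        stages {
            stage('Block tests during the day') {
                steps {
                    lock('nightly-window') {
                        sleep time: 12, unit: 'HOURS'   // release around 18:00
                    }
                }
            }
        }
    }

The complete-test-job then wraps its own steps in lock('nightly-window'), so triggered runs queue on the lock during the day and start once the hold job releases it. The executor caveat above still applies: the hold job itself occupies an executor all day.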

Atlassian Bamboo: Continuously run a plan

I have configured a Bamboo plan to build a project. This plan first checks out the latest code from SVN and then executes a command to build the project. Building this project takes 4-5 hours. I want my plan to run continuously, i.e., as soon as the plan finishes one build, Bamboo should immediately start another. I want the event that starts a build to be the completion of the previous build, not a commit to SVN. Is there any way I can achieve this?
You can create a scheduled trigger with a cron expression causing your plan to build every X minutes, where X is less than or equal to the estimated build time. A disadvantage is that it can lead to multiple builds piling up in the build queue after a while.
To do this I would do the following:
In the plan settings, set the maximum number of concurrent builds to 1.
Then configure the queue to hold a maximum of 1 build.
This way only a single build runs at a time, with only a single build waiting in the queue.
Then you can either "set a scheduled trigger with a cron expression causing your plan to build every X minutes",
or make the final step commit to the repo; if you have a change-driven trigger, that commit will immediately start another build, since it is detected as a change.
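For the scheduled-trigger variant: Bamboo's scheduled triggers take Quartz-style cron expressions (with a leading seconds field), so something like the following would re-queue a 4-5 hour build (a sketch; tune the interval to your actual build time):

    # seconds minutes hours day-of-month month day-of-week
    0 0 0/4 ? * *    # fire every 4 hours, at or just under the estimated build time

With the queue capped at 1, any extra triggers that fire while a build is running are absorbed by the single queued build.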