Configure allowed_pull_policies on shared GitLab runner - gitlab-ci

I'm using GitLab.com's managed CI runners, and I'd like to run my CI jobs using the if-not-present pull policy to avoid the extra minutes it takes to pull the image for each job. Trying to set that value in the .gitlab-ci.yml file gives me this error:
pull_policy ([if-not-present]) defined in GitLab pipeline config is not one of the allowed_pull_policies ([always])
This led me to the config.toml settings for restricting Docker pull policies, so I created a config.toml file at the root of my repository and tried that. However, I still get the same error.
Is config.toml only available for manual/self-hosted runners? Is there any other way to get past this?
Context
Image selection in .gitlab-ci.yml:
default:
image:
name: registry.gitlab.com/myorg/myrepo/ci/builder:latest
pull_policy: if-not-present
Contents of config.toml:
[[runners]]
executor = "docker"
[runners.docker]
pull_policy = ["if-not-present"]
allowed_pull_policies = ["always", "if-not-present"]

First of all, the config.toml file is not meant to be in your repo but on the runner machine (or container).
But anyways, the always pull policy should not cause image pulls to last minutes if the layers are already cached locally: it just ensures you have the latest version by checking the metadata. If the pulls take minutes, it means that either the layers are not available locally, or the image was actually updated (or that the connection to your container registry is so incredibly slow that just checking the metadata takes minutes, but that is unlikely).
It is very possible that Gitlab's managed runners do not have a way to locally cache layers, and thus there would be no practical difference between the always and if-not-present policies. For instance if you use Gitlab Saas:
A dedicated temporary runner VM hosts and runs each CI job.
(see https://docs.gitlab.com/ee/ci/runners/index.html)
Thus the downloaded layers are discarded as soon as the job finishes.

Related

Gitlab CI - Trigger daily pipeline only if new chanes have been commited

The company I work for has a self hosted Gitlab CE server v13.2.1.
For a specific project I've setup the CI jobs to build according to the following workflow :
If a commit has been pushed to the main branch
If a merge request has been created
If a tag has been pushed
Every day at midnight to build the main branch (using scheduled pipelines)
Everything works fine. The only thing I would like to improve is that the nightly builds are performed even if the main branch has not been modified (no new commit).
I had a look to the Gitlab documentation to change my workflow: rules in the .gitlab-ci.yml file but I didn't find anything relevant.
The gitlab runner is installed in a VM and is setup as a shell executor. I was thinking of creating in the home directory a file to store the last commit ID. I'm not a big fan of that solution, because :
it's a ugly fix.
The pipeline will be triggered by Gitlab even if it does nothing. This will pollute the pipeline list.
Is there any way to setup the workflow: section to perform this so the pipeline list won't contain unnecessary pipeline ?

GitLab Runner fails to upload artifacts with "invalid argument" error

I'm completely new to trying to implement GitLab's CI/CD pipelines, but it's been going quite well. In fact, for my ASP.NET project, if I specify a Publish Profile in the msbuild command that uses Web Deploy, it actually deploys the code successfully to the web server.
However, I'm now wanting to have the "build" job create artifacts which are uploaded to GitLab that I can then subsequently deploy. We're using a self-hosted instance of GitLab, for which I'm not an admin, but I can speak to the admin if I know what I'm asking for!
So I've configured my gitlab-ci.yml file like this:
variables:
NUGET_PATH: 'C:\Program Files\Nuget\Nuget.exe'
NUGET_SOURCES: 'https://api.nuget.org/v3/index.json'
MSBUILD_PATH: 'C:\Program Files (x86)\Microsoft Visual Studio\2022\BuildTools\MSBuild\Current\Bin\msbuild.exe'
stages:
- build
build-job:
variables:
CI_DEBUG_TRACE: "true"
stage: build
script:
- '& "$env:NUGET_PATH" restore ApplicationTemplate.sln -Source "$env:NUGET_SOURCES"'
- '& "$env:MSBUILD_PATH" ApplicationTemplate\ApplicationTemplate.csproj /p:DeployOnBuild=true /p:Configuration=Release /p:PublishProfile=FolderPublish.pubxml'
artifacts:
paths:
- '.\ApplicationTemplate\bin\Release\Publish\'
The output shows that this builds the code just fine, and it also seems to successfully find the artifacts for upload. However, when it uploads the artifacts, even though the request gets a 200 OK response, the process fails. Here is the log output:
So, it finds the artifacts, it attempts to upload them and even gets a 200 OK response (in contrast to the handful of similar reports of this error I've been able to find online), but it still fails due to an invalid argument.
I've already enabled verbose debugging, as you can see from the output, but I'm none the wiser. Looking at the GitLab Runner entries in the Windows Event Log on the box where the runner is hosted doesn't shed any light on things either. The total size of the artifacts is 61.1MB, so I don't think my issue is related to that.
Can anyone see from this output what's invalid? Can I identify which argument is invalid and/or why it's invalid?
Edit: Things I've tried
Specifying a value for artifacts:expire_in.
Setting artifacts:public to FALSE, since I'm using a self-hosted GitLab environment and the default value for this setting (TRUE) is not valid in such an environment.
Trying every format I can think of for the value of the artifacts:paths setting (this seems to be incredibly robust - regardless of the format I use, the Runner seems to have no problem parsing it and finding the files to upload).
Taking a cue from this question I created a new project with a very simple build job to upload a single file:
stages:
- build
build-job:
variables:
CI_DEBUG_TRACE: "true"
stage: build
script:
- echo "Test" > test.txt
artifacts:
paths:
- test.txt
About 50% of the time this job hangs on the uploading of the artifacts and I have to cancel it. The other half of the time it fails in exactly the same way as the my previous project:
After countless hours working on this, it seems that ultimately the issue was that our internal Web Application Firewall was blocking some part of the transfer of artefacts to the server, or the response back from it. With the WAF reconfigured not to block traffic from the machine running the GitLab Runner, the artefacts are successfully uploaded and the job succeeds.
This would have been significantly easier to diagnose if the logging from GitLab was better. As per my comment on this issue, it should be possible to see the content of the response from the GitLab server after uploading artefacts, even when the response code is 200.
What's strange - and made diagnosing the issue even harder - is that when I worked through the issue with the admin of our GitLab instance, digging through logs and running it in debug mode, the artefact upload process was uploading something successfully. We could see, for example, the GitLab Runner's log had been uploaded to the server. Clearly the WAF's blocking was selective and didn't block everything in both directions.

Why do I care which gitlab runner runs my pipeline

I have a gitlab-ci.yml file which I have inherited. And I have a local gitlab server running on my laptop. I have managed to create several gitlab runners and kickoff this inherited pipeline -- which gets immediately stuck. The error I am getting is:
...because you dont have any active runners online or available with any of these tags assigned to them: sometag
I have pieced together that the gitlab-ci.yml file references several tags and if there is a runner with a given tag, the runner will pickup my pipeline --- but why do I need this control (or hassle, more like it). What does it matter which runner runs my pipeline? Do i need to closely examine the gitlab-ci.yml file and based on that make some special runner for it ?
After I have modified my runners and gave them the missing tags, I am still getting the same error. Looking at the runner API results, the results do show that where it says "online" it shows "null". What does it mean? How do I make this runner "online"
There may be several runner, which will have different executors set up and thus, have different functionalities. So, the best practice is to give tags in gitlab-ci.yml file to run the jobs on particular runner.
In order to bring your runner online, you can go in the server where that particular gitlab runner is installed, and in the restart the gitlab-runner service using gitlab-runner restart with the user you have installed the runner or root user.
Sometimes, it might happen that you have changed the tags or added some tags to the runner using Gitlab UI but the same tags has not been saved in config.toml file. (config.toml file stores the gitlab-runner configurations. More details here https://docs.gitlab.com/runner/configuration/advanced-configuration.html)
So, in this particular case, you have to go the server where the gitlab-runner is installed, and modify the tags in config.toml file and then restart the gitlab-runner service. If everything goes well, you can see the runner is online in Gitlab UI.

How to interrupt triggered gitlab pipelines

I'm using a webhook to trigger my Gitlab pipeline. Sometimes, this trigger is triggered a bunch of times, but my pipelines only has to run the last one (static site generation). Right now, it will run as many pipelines as I have triggered. My pipelines takes 20 minutes so sometimes it's running the rest of the day, which is completely unnecessary.
https://docs.gitlab.com/ee/ci/yaml/#interruptible and https://docs.gitlab.com/ee/user/project/pipelines/settings.html#auto-cancel-pending-pipelines only work on pushed commits, not on triggers
A similar problem is discussed in gitlab-org/gitlab-foss issue 41560
Example of a use-case:
I want to always push the same Docker "image:tag", for example: "myapp:dev-CI". The idea is that "myapp:dev-CI" should always be the latest Docker image of the application that matches the HEAD of the develop branch.
However if 2 commits are pushed, then 2 pipelines are triggered and executed in paralell. Then the latest triggered pipeline often finishes before the oldest one.
As a consequence the pushed Docker image is not the latest one.
Proposition:
As a workaround for *nix you can get running pipelines from API and wait until they finished or cancel them with the same API.
In the example below script checks for running pipelines with lower id's for the same branch and sleeps.
jq package is required for this code to work.
Or:
Create a new runner instance
Configure it to run jobs marked as deploy with concurrency 1
Add the deploy tag to your CD job.
It's now impossible for two deploy jobs to run concurrently.
To guard against a situation where an older pipeline may run after a new one, add a check in your deploy job to exit if the current pipeline ID is less than the current deployment.
Slight modification:
For me, one slight change: I kept the global concurrency setting the same (8 runners on my machine so concurrency: 8).
But, I tagged one of the runners with deploy and added limit: 1 to its config.
I then updated my .gitlab-ci.yml to use the deploy tag in my deploy job.
Works perfectly: my code_tests job can run simultaneously on 7 runners but deploy is "single threaded" and any other deploy jobs go into pending state until that runner is freed up.

gitlab: Is there a way to http access an artifact during a job, rather than after?

I'm trying to run a pipeline with two stages. The first stage creates a zip file and the second stage executes a http curl POST for that file. If the curl succeeds, the pipeline is completed.
The problem is that gitlab only exposes the zip file AFTER the pipeline has completed - which means the zip file from the previous pipeline gets sent instead.
I've tried using artifacts and dependencies, but it seems the http url is only exposed for completed pipelines. I tried using the url of the specific job that executed the build stage, but it didn't work either.
Does anyone know how to access an artifact by URL, before pipeline completion?
i was unable to find a way to access the artifact remotely before the pipeline had completed. sad face.
i have a workaround though - i moved the deploy stage to a separate pipeline. so the first pipeline just executes the build (generating the artifacts) and then triggers the second pipeline. the second pipeline is then able to access the artifacts of the first pipeline and it just executes a deploy stage. happy face.