Why is GitLab docker-windows executor so slow? - gitlab-ci

When I run a job in a completely new git repository, containing only README.md and .gitlab-ci.yml, using the standard shell executor in GitLab, the whole job takes 4 seconds. When I do the same using the docker-windows executor, it takes 33 seconds!
My .gitlab-ci.yml:
no_git_nor_submodules:
  image: base_on_python36:ltsc2019
  stage: build
  tags:
    - docker-windows
  variables:
    GIT_SUBMODULE_STRATEGY: none
    GIT_STRATEGY: none
  script:
    - echo test

no_docker_no_git_nor_submodules:
  stage: build
  tags:
    - normal_runner
  variables:
    GIT_SUBMODULE_STRATEGY: none
    GIT_STRATEGY: none
  script:
    - echo test
One cause I suspected was that Docker images on Windows tend to be huge. The one I've tested with here is 5.8 GB. But when I start a container manually on the server, it only takes a few seconds to start. I have also tested with an even larger image, 36 GB, and a job using that image also takes around 33 seconds.
As these jobs don't do anything, and don't have any git clone or submodules, what is it that takes time?
I know that GitLab uses a mysterious helper image for cloning the git repository and for other things like that. Could it be this image that makes it so slow to run?
Update 2019-11-04
I looked a bit more at this, using docker events. It showed that GitLab starts a total of 7 containers: 6 running its own helper image and one running the image I defined in .gitlab-ci.yml. Each of these containers takes around 5 seconds to create, run, and destroy, so that explains the time. The only question now is whether this is normal behavior for the docker-windows executor, or whether I have something set up wrong that makes this super slow.

Short answer: Docker on Windows has a high overhead when starting new containers, and GitLab uses 7 containers per job.
I opened an issue on GitLab here, but I'll post part of my text from there here as well:
I looked a bit more at this now, and I think I've figured out at least part of what is going on. There's a command you can run, docker events, which prints every operation Docker executes: creating and destroying containers, volumes, and so on. I ran this command and then started a simple job using the docker-windows executor. The output looks like this (cleaned up and filtered a bit):
2019-11-04T16:19:02.179255700+01:00 container create (image=sha256:6aff8da9cd6b656b0ea3bd4e919c899fb4d62e5e8ac95b876eb4bfd340ed8345, name=runner-Q1iF4bKz-project-305-concurrent-0-predefined-0)
2019-11-04T16:19:07.217784200+01:00 container create (image=sha256:6aff8da9cd6b656b0ea3bd4e919c899fb4d62e5e8ac95b876eb4bfd340ed8345, name=runner-Q1iF4bKz-project-305-concurrent-0-predefined-1)
2019-11-04T16:19:13.190800700+01:00 container create (image=sha256:6aff8da9cd6b656b0ea3bd4e919c899fb4d62e5e8ac95b876eb4bfd340ed8345, name=runner-Q1iF4bKz-project-305-concurrent-0-predefined-2)
2019-11-04T16:19:18.183059500+01:00 container create (image=sha256:6aff8da9cd6b656b0ea3bd4e919c899fb4d62e5e8ac95b876eb4bfd340ed8345, name=runner-Q1iF4bKz-project-305-concurrent-0-predefined-3)
2019-11-04T16:19:23.192798200+01:00 container create (image=sha256:b024a0511db77bf777cee287927151584f49a4018798a2bb1aa31332b766cf14, name=runner-Q1iF4bKz-project-305-concurrent-0-build-4)
2019-11-04T16:19:26.221921000+01:00 container create (image=sha256:6aff8da9cd6b656b0ea3bd4e919c899fb4d62e5e8ac95b876eb4bfd340ed8345, name=runner-Q1iF4bKz-project-305-concurrent-0-predefined-5)
2019-11-04T16:19:31.239818900+01:00 container create (image=sha256:6aff8da9cd6b656b0ea3bd4e919c899fb4d62e5e8ac95b876eb4bfd340ed8345, name=runner-Q1iF4bKz-project-305-concurrent-0-predefined-6)
There are 7 containers created in total, 6 of which are the GitLab helper image. Notice that each helper container takes around 5 seconds to create. 6 * 5 seconds = 30 seconds, which is about the extra overhead I've noticed.
I also retested the performance 5 months ago: our shell executor takes 2 seconds just to echo a message, while the docker-windows executor takes 21 seconds for the same job. The overhead is smaller than it was two years ago, but still significant.

Related

Configure allowed_pull_policies on shared GitLab runner

I'm using GitLab.com's managed CI runners, and I'd like to run my CI jobs using the if-not-present pull policy to avoid the extra minutes it takes to pull the image for each job. Trying to set that value in the .gitlab-ci.yml file gives me this error:
pull_policy ([if-not-present]) defined in GitLab pipeline config is not one of the allowed_pull_policies ([always])
This led me to the config.toml settings for restricting Docker pull policies, so I created a config.toml file at the root of my repository and tried that. However, I still get the same error.
Is config.toml only available for manual/self-hosted runners? Is there any other way to get past this?
Context
Image selection in .gitlab-ci.yml:
default:
  image:
    name: registry.gitlab.com/myorg/myrepo/ci/builder:latest
    pull_policy: if-not-present
Contents of config.toml:
[[runners]]
  executor = "docker"
  [runners.docker]
    pull_policy = ["if-not-present"]
    allowed_pull_policies = ["always", "if-not-present"]
First of all, the config.toml file is not meant to be in your repo but on the runner machine (or container).
In any case, the always pull policy should not cause image pulls to take minutes if the layers are already cached locally: it just ensures you have the latest version by checking the metadata. If pulls take minutes, it means that either the layers are not available locally or the image was actually updated (or that the connection to your container registry is so incredibly slow that just checking the metadata takes minutes, but that's unlikely).
It is quite possible that GitLab's managed runners have no way to cache layers locally, in which case there is no practical difference between the always and if-not-present policies. For instance, for GitLab SaaS:
A dedicated temporary runner VM hosts and runs each CI job.
(see https://docs.gitlab.com/ee/ci/runners/index.html)
Thus the downloaded layers are discarded as soon as the job finishes.

How to interrupt triggered gitlab pipelines

I'm using a webhook to trigger my GitLab pipeline. Sometimes this webhook fires a bunch of times, but only the last pipeline needs to run (static site generation). Right now, it runs as many pipelines as were triggered. Each pipeline takes 20 minutes, so sometimes they keep running for the rest of the day, which is completely unnecessary.
Both https://docs.gitlab.com/ee/ci/yaml/#interruptible and https://docs.gitlab.com/ee/user/project/pipelines/settings.html#auto-cancel-pending-pipelines only work on pushed commits, not on triggers.
A similar problem is discussed in gitlab-org/gitlab-foss issue 41560
Example of a use-case:
I want to always push the same Docker "image:tag", for example "myapp:dev-CI". The idea is that "myapp:dev-CI" should always be the latest Docker image of the application that matches the HEAD of the develop branch.
However, if 2 commits are pushed, then 2 pipelines are triggered and executed in parallel, and the latest triggered pipeline often finishes before the oldest one.
As a consequence, the pushed Docker image is not the latest one.
Proposition:
As a workaround on *nix, you can fetch running pipelines from the API and either wait until they finish or cancel them via the same API.
The script checks for running pipelines with lower IDs on the same branch and sleeps until they finish.
The jq package is required for this to work.
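The script itself is not reproduced here; as a hedged sketch (the function name and the commented API call are my own, not from the original answer), its core wait logic might look like this:

```shell
#!/bin/sh
# Sketch of the wait loop's core logic: given the current pipeline id followed
# by the ids of all running pipelines on the branch, print how many are older
# (i.e. have a lower id) than the current one.
older_running() {
  current=$1
  shift
  count=0
  for id in "$@"; do
    [ "$id" -lt "$current" ] && count=$((count + 1))
  done
  echo "$count"
}

# In a real job the ids would come from the pipelines API, e.g. (untested):
#   curl -s --header "PRIVATE-TOKEN: $TOKEN" \
#     "$CI_API_V4_URL/projects/$CI_PROJECT_ID/pipelines?ref=$CI_COMMIT_REF_NAME&status=running" \
#     | jq '.[].id'
# and the job would sleep and re-check while older_running prints a non-zero count.
older_running 57 12 57 99 3   # prints 2 (pipelines 12 and 3 are older)
```

The pure counting logic is separated from the API call so the same sleep-and-retry loop can be reused whether you wait for older pipelines or cancel them.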
Or:
Create a new runner instance
Configure it to run jobs marked as deploy with concurrency 1
Add the deploy tag to your CD job.
It's now impossible for two deploy jobs to run concurrently.
To guard against a situation where an older pipeline may run after a newer one, add a check in your deploy job that exits if the current pipeline ID is lower than that of the current deployment.
Slight modification:
For me, one slight change: I kept the global concurrency setting the same (8 runners on my machine so concurrency: 8).
But, I tagged one of the runners with deploy and added limit: 1 to its config.
I then updated my .gitlab-ci.yml to use the deploy tag in my deploy job.
Works perfectly: my code_tests job can run simultaneously on 7 runners but deploy is "single threaded" and any other deploy jobs go into pending state until that runner is freed up.
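A minimal config.toml sketch of this setup might look like the following (runner names are placeholders, and the deploy tag itself is attached when the runner is registered, not in this file; limit caps the number of concurrent jobs on that one runner):

```toml
concurrent = 8  # global cap: up to 8 jobs across all runners on this machine

[[runners]]
  name = "general"        # placeholder; handles code_tests and other jobs
  executor = "docker"

[[runners]]
  name = "deploy-runner"  # placeholder; registered with the "deploy" tag
  executor = "docker"
  limit = 1               # at most one job at a time on this runner
```

With this, other jobs can run alongside a single deploy job, and any additional deploy jobs queue as pending until the tagged runner is free.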

docker image not working or running properly

This is part of a major issue I've been fighting to resolve for two or even three weeks. First of all, I'm not a Docker expert; in fact, I don't know much about Docker at all. All I know is that I need it to make a connection between an API on localhost and my React Native app. I managed to make it work on two other projects I created to test Docker, but not on the one I actually need. This is a Dockerfile for an API in .NET Core 2.2.
My Dockerfile is a combination of code I found on Stack Overflow and the example in the Docker documentation for containerizing a .NET Core app. This exact file worked for me on two other APIs: one a blank project, the other with a class library.
The code below shows the Dockerfile. When I run the build command, it shows no errors, but I know something is wrong: when I run docker image ls, the image is around 200-300 MB, which seems way too small, and when I run that image with docker run ... and check the list of running containers, nothing shows up.
FROM mcr.microsoft.com/dotnet/core/sdk:2.2 AS build-env
WORKDIR /app
# Copy csproj and restore as distinct layers
WORKDIR /src
COPY ISARRHH.sln ./
COPY ISARRHH.BusinessGraph/*.csproj ./ISARRHH.BusinessGraph/
COPY ISARRHH.APIWeb/*.csproj ./ISARRHH.APIWeb/
RUN dotnet restore
# Copy everything else and build
COPY . ./
WORKDIR /src/ISARRHH.BusinessGraph
RUN dotnet publish -c Release -o /app
WORKDIR /src/ISARRHH.APIWeb
RUN dotnet publish -c Release -o /app
# Build runtime image
FROM mcr.microsoft.com/dotnet/core/aspnet:2.2
WORKDIR /app
COPY --from=build-env /app .
ENTRYPOINT ["dotnet", "isarrhh.dll"]
I just want this bloody Docker build to work. This was plan B for one of the modules I'm working on, and it's giving me a headache. I managed to make it work on another project; I want it to work on this API, which uses Office 365 and SharePoint.
EDIT: this is the project structure
ISARRHH (Solution)
|
|--ISARRHH.APIWeb (API)
| |_Dependencies
| |_Controllers
| |_Models
| |_Properties
| |_appsettings.json
| |_appsettings.Development.json
| |_Authentication.cs
| |_Configuration.cs
| |_Program.cs
| |_ProtectedApiCallHelper.cs
| |_PublicAppUsingUsernamePassword.cs
| |_SiteInformation.cs
| |_Startup.cs
| |_SiteInformation.cs
|
|--ISARRHH.BusinessGraph (Class Library)
| |_Dependencies
| |_UserGraph.cs
|
|--Solution Items
|_Dockerfile
|_.dockerignore
EDIT2: More information
REPOSITORY TAG IMAGE ID CREATED SIZE
isarrhh latest 67fc0628c921 13 minutes ago 268MB
According to this, the image was apparently created successfully, but when I run it with
docker run -d -p 3001:80 ...
and then check with
docker container ls
I see no container running. Also, when I check with the command suggested here,
docker logs -t isachile
I get this:
MacBook: ISARRHH$ docker logs -t isachile
2019-07-31T18:49:22.553317346Z Did you mean to run dotnet SDK commands? Please install dotnet SDK from:
2019-07-31T18:49:22.553390430Z https://go.microsoft.com/fwlink/?LinkID=798306&clcid=0x409
EDIT 3: SOLVED IT -- SORT OF...
I managed to get my Docker image running by manually copying and pasting every file into a different project, one file at a time, creating the Docker image after each change. Yes, a seriously horrible and tedious process, but it worked. We're not considering this solution anymore, though, since the process is too slow for our scrum project. We still need to connect React Native to our localhost API, so I still need an answer.
So there are two things here, and neither necessarily indicates a problem with Docker or your Dockerfile.
Size is only 200-300MB
That's about right. You haven't indicated whether you're using Windows or Linux containers, but in either case, most of the weight comes simply from the .NET Core runtime. The whole point of containers is that the host OS is shared (unlike a VM, where every VM gets its own separate OS installation). The only things coming from the base OS image are user-specific files and directories; the main system components are proxied to the host operating system. Long and short, I don't know what you're expecting here in terms of size, but honestly 200-300 MB is a bit on the large side for an image. In many cases it's possible to package ASP.NET Core app images down to as little as 25-30 MB, though if you include the full runtime, you're generally going to be closer to your 200-300 MB.
The container isn't running.
All that means is that it exited. When the container runs, the entrypoint line is invoked, which starts the ASP.NET Core app in Kestrel. That of course runs Program.Main, since it's just a console app, after all. That in turn builds the web host and calls Run, which listens for TCP socket connections, keeping the app running, which therefore keeps the container running.
If the container isn't running, then the app exited. That can happen for different reasons, but the most likely cause is that a runtime exception was thrown during the web host build phase (i.e. something in Program or Startup is throwing an exception). Try running something like:
docker logs -t {container name}
You'll probably see a stack trace and exception there. Fix the issue accordingly.

How to time the execution in a container of Docker?

I want to time the execution of a process in a container of Docker.
I tried timing it by calculating FinishedAt - StartedAt from docker inspect, but that's not an exact time.
I don't want to execute time in a container.
How can I time it exactly?
EDIT:
The process I want to time is the cmd parameter of docker create.
The following creates a container that, when started, waits for two seconds. We then start the container, timing the overall run. It shows that the process execution overhead is about 0.3 seconds.
Create the container
$ docker create -ti ubuntu:12.04 sleep 2
785e9a63629b10676672656bc8412840faa6f00fc83e521628b0f9ca9ba01e14
Start and time the container
$ time docker start -i 785
real 0m2.329s
user 0m0.064s
sys 0m0.016s
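For comparison, the inspect-based calculation the question mentions can be sketched as follows; the timestamps are hard-coded examples standing in for real docker inspect output, and stripping the fractional part means sub-second overhead is lost, which is part of why it is not exact:

```shell
#!/bin/sh
# Hedged sketch: subtract StartedAt from FinishedAt as docker inspect reports
# them. In a real run the two values would come from something like:
#   docker inspect -f '{{.State.StartedAt}} {{.State.FinishedAt}}' <container>
started="2019-11-04T16:19:02.179255700Z"
finished="2019-11-04T16:19:04.500123400Z"

# GNU date: drop the fractional seconds, convert to epoch seconds, subtract.
start_s=$(date -u -d "${started%%.*}Z" +%s)
finish_s=$(date -u -d "${finished%%.*}Z" +%s)
echo "$((finish_s - start_s))s"   # prints 2s
```

Using `time docker start -i` as shown above measures the full wall-clock run from outside instead, which avoids rounding the endpoints to whole seconds.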

Dockerfile : RUN results in a No op

I have a Dockerfile in which I'm trying to run a daemon that starts a Java process.
I embed the script in the Dockerfile, like so:
RUN myscript.sh
When I run /bin/bash on the resulting container, I see no entries from jps.
However, I can easily embed the script as CMD, in which case, when I issue
docker run asdfg
I see the process start normally.
So my question is: when we start a background async process in a Dockerfile, is it always the case that its side effects will be excluded from the container?
Background processes need to be started at container start, not at image build.
So your script needs to run via CMD or ENTRYPOINT.
CMD or ENTRYPOINT can still be a script containing multiple commands. But I would imagine in your case, if you want several background processes, that using, for example, supervisord would be your best option.
Also, check out some already existing Dockerfiles to get an idea of how it all fits together.
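As a sketch of the difference (the base image and script path here are assumptions, not taken from the question):

```dockerfile
# Hedged sketch: the daemon is launched at container start, not at build time.
FROM eclipse-temurin:17-jre
COPY myscript.sh /usr/local/bin/myscript.sh
RUN chmod +x /usr/local/bin/myscript.sh   # RUN affects the image build only
CMD ["/usr/local/bin/myscript.sh"]        # runs each time a container starts
```

A process started by a RUN instruction lives only for the duration of that build step; only the filesystem changes are committed to the image, which is why jps shows nothing in the final container.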