Tekton sidecar: Docker daemon failing to start

I have a Tekton pipeline that builds and pushes a Docker image to a private repository. The task that handles this uses a Docker-in-Docker (DinD) sidecar. Originally it worked just fine, but it has started failing with the error "Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running?". This was an intermittent error at first, but now it seems to happen every time I run the pipeline. I tried making the task wait until it can connect to the daemon, in case it was a timing issue, but it just ends up waiting forever. What might be preventing the Docker daemon from starting, or preventing the task from connecting to it?

Older Docker DinD images used to create that socket file; newer ones no longer do, so nowadays you have to connect to the daemon over a TCP socket instead.
See the Tekton Catalog docker-build task for how to patch your Task: https://github.com/tektoncd/catalog/blob/main/task/docker-build/0.1/docker-build.yaml
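As a rough sketch of what the patched build step ends up doing once it talks to the sidecar over TCP (the host, port, and certificate path below follow the linked catalog task, and the image reference is a placeholder, so adjust everything to your own Task):

    # Point the Docker CLI at the dind sidecar's TCP socket instead of /var/run/docker.sock.
    export DOCKER_HOST=tcp://localhost:2376
    export DOCKER_TLS_VERIFY=1
    export DOCKER_CERT_PATH=/certs/client   # certificates shared with the sidecar via an emptyDir volume

    docker build -t "$IMAGE_REFERENCE" .
    docker push "$IMAGE_REFERENCE"

The catalog task also gives the sidecar a readiness probe that waits for the TLS client certificates to appear, which replaces a hand-rolled "wait for the daemon" loop like the one you added.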

Related

Dynatrace OneAgent in ECS Fargate container stops but application container keeps running

I am trying to install OneAgent in my ECS Fargate task. Alongside the application container, I added another container definition for OneAgent with the image alpine:latest and used runtime injection.
When running the task, the OneAgent container is initially in the running state, but after about a minute it goes to the stopped state while the application container stays running.
In Dynatrace the same host is available but keeps being recreated every 5-10 minutes.
It turned out the task was in draining status because of an application issue, which is why the host kept being recreated in Dynatrace. Also, since I used runtime injection on ECS Fargate, the OneAgent container stops once the binaries have been downloaded and injected into the shared volume, while the application container keeps running and sending logs to Dynatrace.
I had the same problem. After connecting to the cluster via SSH, I saw that the agent needs to run privileged. The only thing that worked for me was sending traces and metrics through OpenTelemetry:
https://aws-otel.github.io/docs/components/otlp-exporter
Alternative:
use sleep infinity in the command field of your OneAgent container so that it keeps running.

Ansible playbook stops after losing SSH connection (even for a few seconds) to the VM on which it is running

My Ansible playbook consists of several tasks, and I run it on a virtual machine that I log in to over SSH. If my SSH window gets closed during the execution of any task (my internet connection is not stable or reliable), the playbook run stops because the SSH session is gone.
It takes around an hour for my playbook to run, and sometimes losing internet connectivity for even a few seconds drops the SSH session and kills the entire run. Any idea how to make the playbook run more resilient to this problem?
Thanks in advance!
If you need to run a long-running job on a remote system and it matters that the job completes, it is an extremely bad idea to run that job in the foreground.
It does not matter that the job is Ansible or that the connection is SSH. In every case you would "push" the command to the remote host and send it to the background with something like nohup, if available. The problem is the process tree: your connection creates a process on the remote system, and that process starts the job you want to run. If the connection is lost, all of its subprocesses are killed automatically by the OS.
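A minimal sketch of that pattern for this case, assuming the playbook is started directly on the VM and that site.yml and playbook.log are placeholder names:

    # Start the playbook detached from the SSH session; all output goes to a log file.
    nohup ansible-playbook site.yml > playbook.log 2>&1 &
    echo "playbook running as PID $!"

    # After reconnecting, follow progress with:
    tail -f playbook.log

Running the playbook inside tmux or screen achieves the same survival across disconnects and additionally lets you reattach to the live session.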
So, under Windows, maybe use RDP to open a session that stays available even after the connection is lost, or use something like Cygwin and nohup over SSH to detach the process from the SSH session.
Or, when you need to run a playbook on that system, install for example an AWX container and use that. There are many options depending on your requirements, resources, and administrative constraints.

Dynamically created Jenkins slaves using the Jenkins Docker Plugin get removed in the middle of job execution

I am using the Jenkins Docker Plugin (https://wiki.jenkins.io/display/JENKINS/Docker+Plugin) to dynamically create containers and use them as Jenkins slaves. This works fine for some jobs. However, for some longer-running jobs (over 10 minutes) the Docker container gets removed midway, making the job fail.
I have tried increasing various timeout options in the plugin configuration, but with no result. Can anyone please help?
I know I am quite late to post an answer here, but I was able to find the root cause of the issue. The problem was running two Jenkins instances with the same Jenkins home directory. The Jenkins Docker plugin seems to run a daemon that kills Docker containers associated with its Jenkins master. Because we were running two Jenkins instances with the same Jenkins home directory (a copy of it), the Docker containers started for CI work were deleted by each other's cleanup daemon.

Kubernetes Apache2 Killed

I have a Kubernetes cluster and I am getting cgroup out-of-memory kills. I have resources declared in the YAML, but I have no idea which apache2 instance needs more memory. The kill message gives me a process ID, but how do I tell which pod is being killed?
Thank you.
It is what it is. Your Apache process is using more memory than you are allowing in your pod/container definition.
Reasons why it could need more memory:
You have an increase in traffic and sessions being handled.
Apache is forking more processes within the container and running into the memory limit.
Apache is not reaping some lingering sessions because of a config issue.
If you are running Docker as the container runtime (which most people do), you can SSH into the node in your cluster and run:
docker ps -a
You should see the exited container where your Apache process(es) were running. Then you can run:
docker logs <container-id>
That might give you details on what Apache was doing before it was killed. If you only see minimal information, I recommend increasing the verbosity of your Apache logs.
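To narrow things down further, a couple of optional follow-up commands (assuming Docker really is the runtime on that node; the container ID is a placeholder):

    # List only containers that have exited.
    docker ps -a --filter "status=exited"

    # Confirm the kernel OOM-killed the container and see its exit code.
    docker inspect --format '{{.State.OOMKilled}} {{.State.ExitCode}}' <container-id>

    # Look at the last log lines written before the kill.
    docker logs --tail 100 <container-id>

On a Kubernetes node the container labels shown by docker inspect should also include the pod name and namespace, which answers the "which pod is being killed?" part of the question.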
Hope it helps.

Docker Redis container orderly shutdown

I am running redis-server in a Docker container on Ubuntu 14.10 x64. If I access the Redis database via phpRedisAdmin, do a few edits, get them saved to disk, shut down the container and then restart it, everything is fine - the edited Redis keys are present and correct. However, if I edit keys and then shut down the container and restart it, the edits do not stick.
Clearly, the dump.rdb file is not being saved automatically when the container is shut down. I imagine that I could fix this by putting in an /etc/init.d script that is symlinked from /etc/rc6.d. However, I am wondering - why does shutting down a Redis container not perform an orderly shutdown of the running process(es) in the container? After all, when I reboot my server (both the server and the container run Ubuntu 14.10) I do not have to explicitly commit the Redis db changes to disk.
The main process in a Docker container will be sent a SIGTERM signal when you run docker stop -t N CONTAINER. The process should then begin to shut itself down cleanly. If after N seconds (10 by default) this still hasn't happened, Docker will send a SIGKILL signal instead, which kills the process without giving it a chance to clean up. The reason you were having problems is probably that you simply weren't giving Redis long enough to shut down cleanly.
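For example, assuming your container is named redis (the name and timeout are placeholders):

    # Give Redis up to 30 seconds to persist and exit before Docker falls back to SIGKILL.
    docker stop -t 30 redis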
It's important to note that only the main process in the container (PID 1) will be sent signals. This means that the main process must be responsible for shutting down any child processes in the container, or you can end up with zombie processes.
If you still have problems with Redis not doing what you want on shutdown, you could wrap it in a script that acts as PID 1, catches the SIGTERM signal and does whatever tidying up you want (just make sure you do shut down Redis and any other processes you've started).
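A minimal sketch of such a wrapper, assuming a stock Redis image where redis-server and redis-cli are on the PATH (set it as the container's entrypoint and adjust paths and arguments to your image):

    #!/bin/sh
    # Run Redis as a child of this script, which acts as PID 1.
    redis-server "$@" &
    REDIS_PID=$!

    shutdown_redis() {
      # Ask Redis to persist and exit cleanly; fall back to SIGTERM if redis-cli fails.
      redis-cli shutdown save || kill -TERM "$REDIS_PID"
      wait "$REDIS_PID"
      exit 0
    }

    # Forward docker stop's SIGTERM (and Ctrl-C's SIGINT) to the handler above.
    trap shutdown_redis TERM INT

    # Block until Redis exits on its own or a trapped signal arrives.
    wait "$REDIS_PID"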