Docker Redis container orderly shutdown - redis

I am running redis-server in a Docker container on Ubuntu 14.10 x64. If I access the redis database via phpRedisAdmin, do a few edits and then get them to be saved to disk, shutdown the container and then restart it everything is fine - the edited redis keys are present and correct. However, if I edit keys and then shut down the container then restart it the edits do not stick.
Clearly, the dump.rdb file is not being saved automatically when the container is shutdown. I imagine that I could fix this by putting in an /etc/init.d script that is symlinked from /etc/rc6.d. However, I am wondering - why does shutting down a redis container not perform an orderly shutdown of the running process(es) in the container? After all, when I reboot my server (both the server & the container run Ubuntu 14.10) I do not have to explicitly commit the redis db changes to disk.

The main process in a Docker container will be sent a SIGTERM signal when you run docker stop -t N CONTAINER. The process should then begin to shut itself down cleanly. If after N seconds (10 by default) this still hasn't happened, Docker will use a SIGKILL signal, which will kill the process without giving it a chance to clean up. The reason you were having problems was probably because you simply weren't giving Redis long enough to shutdown cleanly.
It's important to note that only the main process in the container (PID 1) will be sent signals. This means that the main process must be responsible for shutting down any child processes in the container, or you can end up with zombie processes.
If you still have problems with redis not doing what you want on shutdown, you could wrap it in a script which acts as PID 1, catches the SIGTERM signal and does whatever tidying up you want (just make sure you do shutdown redis and any other processes you've started).

Related

Tekton sidecar: docker daemon failing to start

I have a Tekton pipeline that builds and pushes a Docker image to a private repository. The task that handles this uses a DinD sidecar. Originally, it worked just fine, but it's started failing with the error Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running?. This was an intermittent error at first, but now it seems to be happening every time I try to run the pipeline. I tried making it wait until it can connect to the daemon, in case it was a timing issue, but it ends up just waiting forever. What might be preventing the Docker daemon from starting, or preventing the task from connecting to it?
Older Docker-DIND images used to create that socket file, a while ago. Nowadays, you would have to use a TCP socket.
See TektonCD samples to patch your Tasks: https://github.com/tektoncd/catalog/blob/main/task/docker-build/0.1/docker-build.yaml

How do I get rid of a zombie Celery worker?

I am running Celery with RabbitMQ backend.
Somehow I have ended up with what appears to be a zombie Celery worker. I see the worker in Flower, and in commands like celery inspect scheduled. But it references a PID that doesn't exist. There is no worker process. It is a big problem because Celery will delegate tasks to this worker, and they never get executed.
I believe what happened is the docker container within which this is running got shut down uncleanly. But now, even if I restart the docker container, this zombie worker always comes back. Always has the same name: celery#0357c65d991b.
The Celery docs say that to kill a worker you must send its process TERM. But I can't do that because there is no process. It's a zombie.
RabbitMQ must have a dangling reference to this worker. They only thing I could find in the RabbitMQ management interface is a queue named celery#0357c65d991b.celery.pidbox. I deleted this queue, but it simply reappeared a few seconds later.
Can anyone give me a pointer on where to look to get rid of this thing?

mod_wsgi DaemonProcess mode gets problem with httpd graceful reload

I'm using httpd -k graceful to dynamically reload my server, and I use time.sleep in python code to make a slow request, and I expected the active requests would't be interrupted after apache reload. But it did.
So I tried a simple python server using CGI, it works well. Then I tried mod_wsgi using apache process (only specifying WSGIScriptAlias), and it works well, too.
So I found that the problem is the WSGIDaemonProcess, which I originally used.
Then in the mod_wsgi doc I found this:
eviction-timeout=sss
When a daemon process is sent the graceful restart signal, usually SIGUSR1, to restart a process, this timeout controls how many seconds the process will wait, while still accepting new requests, before it reaches an idle state with no active requests and shutdown.
If this timeout is not specified, then the value of the graceful-timeout will instead be used. If the graceful-timeout is not specified, then the restart when sent the graceful restart signal will instead happen immediately, with the process being forcibly killed, if necessary, when the shutdown timeout has expired.
when I thought I'm going to find the reason, I found that these arguments(and i tried graceful-timeout too) didn't work at all.The requests were still interrupted by graceful reload. So why?
I'm using apache 2.4.6, with mpm mode prefork. And modwsgi 4.6.5, I compiled it myself and replaced my old-version mod_wsgi.so with it.
answer from GrahamDumpleton#Github: (https://github.com/GrahamDumpleton/mod_wsgi/issues/383)
What you are seeing is exactly as expected. Apache does not pass graceful restart signals onto managed sub processes, it only passes them onto its own child worker processes. For managed processes it will send a SIGTERM and it will brutally kill them after 3 or 5 seconds (can't remember exactly how long) if they haven't shutdown. There is no way around it. It is a limitation of Apache.
The eviction timeout thus only applies as the docs say to when a 'daemon process' is sent a graceful restart signal directly. That is, restarting Apache as a whole gracefully doesn't do anything, but send the graceful restart signal to the pid of the daemon processes themselves will.
So the only solution if this behaviour is important is to ensure you use display-name option to WSGIDaemonProcess directive so daemon processes named uniquely compared to Apache processes, and then send signals to them direct only.
Usually this only becomes an issue because some Linux systems completely ignore the fact that Apache has a perfectly good log file rotation system and instead do external log file rotation by renaming log files once a day and then attempting a graceful restart. People will see issues with interrupted requests they don't expect. In this case you should use Apache's own log file rotation mechanism if it is important and not rely on external log file rotation systems.

Kubernetes Apache2 Killed

I have a kubernetes cluster and I am getting cgroup out of memory. I have resources declared in the YAML but I have no idea which apache2 needs more memory. It gives me a process id but how do I tell which pod is being killed?
Thank you.
It is what it is. Your Apache process is using more memory than you are allowing in your pod/container definition.
Reasons why it could be needing more memory:
You have an increase in traffic and sessions being handled
Apache is forking more processes within the container running into memory limits.
Apache not reaping some lingering sessions because of a config issue.
If you are running Docker for containers (which most people do) you can ssh into the node in your cluster and run a:
docker ps -a
You should see the Exited container where your Apache process(es) was running. Then you can run:
docker logs <container-id>
And you might get details on why Apache was doing before it was killed. If you only see minimal info, I recommend increasing the verbosity of your Apache logs.
Hope it helps.

Is it recommended to run redis using Supervisor

Is it a good practice to run redis in production with Supervisor?
I've googled around, but haven't seen many examples of doing so. If not, what is the proper way of running redis in production?
I personally just use Monit on Redis in production. If Redis crash Monit will restart it but more importantly Monit will be able to monitor (and alert when a threeshold is reach) the amount of RAM that Redis currently takes (which is the biggest issue)
Configuration could be something like this (if maxmemory was set to 1Gb in Redis)
check process redis
with pidfile /var/run/redis.pid
start program = "/etc/init.d/redis-server start"
stop program = "/etc/init.d/redis-server stop"
if 10 restarts within 10 cycles
then timeout
if failed host 127.0.0.1 port 6379 then restart
if memory is greater than 1GB for 2 cycles then alert
Well..it depends. If I were do use redis under daemon control I would use runit. I do use monit but only for monitoring. I like to see the green light.
However, for redis to exploit the true power, you dont run redis as a deamon esp a master. If a master goes down, you will have to switch a slave to a master. Quit simply, I just shoot the node in the head and I have a chef recipe bring up a new node.
But then again....it also depends on how often you snapshot. I do not snapshot thus no need for deamon control.
People use reids for brute force speed. that means not writing to disk and keep all data in ram. If a node goes down...and you dont snapshot...data is lost.