Redis AOF rewrite issue, no new client connections

My Redis instance has apparently stopped rewriting the AOF file (it has grown to many GBs). What is worse, it seems to have stopped serving new client connections: when connecting with redis-cli, the connection goes through, but then it freezes on any command. That means I cannot issue BGREWRITEAOF. At the same time, existing connections are being served normally.
The log file doesn't show anything useful, and redis-check-aof just reports that the AOF file is not corrupted. I really don't want to restart the server, as I don't know how long it would take to start with such a huge AOF file.
Is there a way to trigger an AOF rewrite externally? Is there anything else I can do?

So I've just hit this Redis bug: https://github.com/antirez/redis/issues/2883 :(
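For reference, a minimal sketch of how an AOF rewrite would normally be inspected and triggered with redis-cli, assuming you still have at least one working connection to send the commands over:

    # Check whether a rewrite is already running or scheduled
    redis-cli INFO persistence | grep aof_rewrite

    # Inspect the automatic rewrite thresholds
    redis-cli CONFIG GET auto-aof-rewrite-percentage
    redis-cli CONFIG GET auto-aof-rewrite-min-size

    # Ask the server to rewrite the AOF in the background
    redis-cli BGREWRITEAOF

In the situation described above this is of limited use, since new connections hang before any command can be sent.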

Related

Can I prevent Redis from expiring any keys?

I'm importing an old Redis backup in order to browse its databases.
However, upon launching Redis with the imported dump.rdb, a lot of the data is missing.
My guess is that Redis immediately expires all the old keys.
Is there a configuration setting to prevent Redis from expiring anything?
There is no configuration setting for that, but you can try setting the server's clock to some time in the distant past before loading the RDB.
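One way to approximate that without touching the system clock, assuming libfaketime is installed (the date and the config path below are placeholders), is to launch redis-server under faketime:

    # Start Redis believing it is far in the past, so the old TTLs have not yet elapsed
    faketime '2010-01-01 00:00:00' redis-server /path/to/redis.conf

    # Then browse the imported data as usual
    redis-cli dbsize

This only helps for browsing; keys will still expire once the server sees a time past their TTL.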

mod_wsgi daemon process mode has problems with httpd graceful reload

I'm using httpd -k graceful to reload my server dynamically, and I use time.sleep in Python code to simulate a slow request. I expected that active requests wouldn't be interrupted by the Apache reload, but they were.
So I tried a simple Python server using CGI, and it worked well. Then I tried mod_wsgi running in the Apache processes (only specifying WSGIScriptAlias), and it worked well too.
So the problem appears to be WSGIDaemonProcess, which I was originally using.
Then in the mod_wsgi docs I found this:
eviction-timeout=sss
When a daemon process is sent the graceful restart signal, usually SIGUSR1, to restart a process, this timeout controls how many seconds the process will wait, while still accepting new requests, before it reaches an idle state with no active requests and is shut down.
If this timeout is not specified, then the value of the graceful-timeout will instead be used. If the graceful-timeout is not specified, then the restart when sent the graceful restart signal will instead happen immediately, with the process being forcibly killed, if necessary, when the shutdown timeout has expired.
Just when I thought I was about to find the reason, I discovered that these arguments (and I tried graceful-timeout too) didn't work at all. The requests were still interrupted by the graceful reload. Why?
I'm using Apache 2.4.6 with the prefork MPM, and mod_wsgi 4.6.5, which I compiled myself, replacing my old mod_wsgi.so with it.
Answer from GrahamDumpleton on GitHub (https://github.com/GrahamDumpleton/mod_wsgi/issues/383):
What you are seeing is exactly as expected. Apache does not pass graceful restart signals onto managed sub processes, it only passes them onto its own child worker processes. For managed processes it will send a SIGTERM and it will brutally kill them after 3 or 5 seconds (can't remember exactly how long) if they haven't shutdown. There is no way around it. It is a limitation of Apache.
The eviction timeout thus only applies, as the docs say, when a 'daemon process' is sent a graceful restart signal directly. That is, restarting Apache as a whole gracefully doesn't do anything, but sending the graceful restart signal to the pids of the daemon processes themselves will.
So the only solution, if this behaviour is important to you, is to use the display-name option of the WSGIDaemonProcess directive so that the daemon processes are named differently from the Apache processes, and then send signals to them directly.
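A rough sketch of that setup, assuming a daemon process group called myapp (the group name, process/thread counts, timeouts and paths are all placeholders):

    # Apache configuration: name the daemon processes so they can be signalled directly
    WSGIDaemonProcess myapp processes=2 threads=15 \
        display-name=%{GROUP} graceful-timeout=30 eviction-timeout=30
    WSGIProcessGroup myapp
    WSGIScriptAlias / /path/to/app.wsgi

    # Later, instead of reloading Apache as a whole, find the daemon processes by
    # their display name (they show up as "(wsgi:myapp)") and signal them directly
    ps aux | grep 'wsgi:myapp'
    kill -USR1 <pid>     # replace <pid> with each daemon process id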
Usually this only becomes an issue because some Linux systems completely ignore the fact that Apache has a perfectly good log file rotation system and instead do external log file rotation by renaming log files once a day and then attempting a graceful restart. People will see issues with interrupted requests they don't expect. In this case you should use Apache's own log file rotation mechanism if it is important and not rely on external log file rotation systems.
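For completeness, Apache's own piped log rotation looks roughly like this (the rotatelogs path, log path and the daily 86400-second interval are illustrative), and it avoids the rename-and-graceful-restart dance entirely:

    CustomLog "|/usr/sbin/rotatelogs /var/log/httpd/access_log.%Y-%m-%d 86400" combined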

How to kill a single apache connection without a server restart?

Using the Apache status module, you can see which connections are currently being handled by Apache. Without restarting the Apache service, I would like to kill some of those connections.
A line from the status looks something like:
Srv PID ...
2-0 3326 ...
What is the best way to kill just one of these connections?
Can one, with impunity, simply kill the PID shown in the Apache status from a shell?
Will this harm apache in some way if some of its child processes are manually killed?
Will it still be able to respawn new processes correctly?
Any strange side effects one should be aware of?
After having done this with impunity for a while, it appears that killing the process using the PID shown in the Apache status is indeed an effective and safe (at least as far as I can tell) way to kill an individual connection while keeping the server alive.
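Illustratively, using the example PID from the status output above (this assumes mod_status is enabled and served at /server-status; the URL and the PID are placeholders):

    # Confirm which child process is handling the connection you want to drop
    curl -s http://localhost/server-status | grep 3326

    # Send that child a normal TERM signal; the parent httpd respawns workers as needed
    kill 3326

With the prefork MPM each child handles one connection at a time, so only that client is affected; with threaded MPMs, killing a child drops every connection that child is serving.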

How to use rotate_logs on an 80+ GB RabbitMQ log file on Windows Server

I need to run rabbitmqctl rotate_logs on a RabbitMQ log file that is over 80 GB in size. The first time I tried to run this, it froze RabbitMQ and no messages could be received. The freeze lasted 20 minutes before I had to kill the command and restart the RabbitMQ server.
This is a production server, and completing this quickly without losing messages or killing the broker would be ideal.
Would it be possible to shut down the service, move the current log file to another location, restart the service, and then run the rotate_logs command?
I'm fairly new to RabbitMQ and I am not sure what the best way to handle this would be.
It is installed as a service on a Windows Server 2008 machine for a heavy-traffic production site (however, the message queue has a small load and only affects the administrative side of things).
Any help or insight would be appreciated.
I ran into a similar situation, but with only about 4 GB of log file instead of 80.
The workaround I used was pretty much what you suggested: stop the service, move the log file, and restart the service as quickly as possible.
In my case, instead of moving the file while the service was stopped, I just renamed it. I also wrote a command-line script to do the work for me.
This allowed me to stop the service, rename the file, and restart the service in a matter of seconds.
Once the service was back up and running, I was free to move, rename, or otherwise deal with the large log file as needed.
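A minimal sketch of such a script, assuming the default Windows service name RabbitMQ and an illustrative log path (check your own service name and log location first):

    :: stop the broker, rename the oversized log out of the way, restart immediately
    net stop RabbitMQ
    ren "C:\RabbitMQ\log\rabbit@HOSTNAME.log" "rabbit@HOSTNAME.old.log"
    net start RabbitMQ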

Using redis with logstash

I'm wondering what the pros and cons are of using Redis as a broker in such an infrastructure.
At the moment, all my agents are sending to a central NXLog server which proxies the requests to Logstash --> ES.
What would I gain by putting a Redis server between my NXLog collector and Logstash? To me, it seems pointless, as NXLog already has good memory and disk buffers in case Logstash is down.
What would I gain?
Thank you
Under heavy load, calling ES (over HTTP) directly can be dangerous, and you can run into problems if ES breaks down.
Redis can handle many more write requests and forward them asynchronously to ES (HTTP).
I started using Redis because I felt that it would separate the input part from the filter part,
at least during periods in which I change the configuration a lot.
As you know, if you change the Logstash configuration you have to restart the thing. All clients (in my case via syslog) are then doomed to reconnect to the logstash daemon when it is back in business.
By putting an indexer in front which holds the relatively static input configuration and pushing everything to Redis, I am able to restart Logstash without causing hiccups throughout the datacenter.
I encountered some issues because our developers hadn't (yet) found the time to reduce the amount of useless logs sent to syslog, thus overflowing the server. Before we had Logstash they overflowed the disk space for logs, but that's a more general issue... :)
When used with Logstash, Redis acts as a message queue. You can have multiple writers and multiple readers.
Using Redis (or any other queueing service) allows you to scale Logstash horizontally by adding more servers to the 'cluster'. This will not matter for small setups but can be extremely useful for larger installations.
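As a rough sketch of that layout (hostnames, the key name and the elasticsearch output options are placeholders and depend on your Logstash version), the shippers write to a Redis list and one or more indexers read from it:

    # shipper.conf: runs close to the log sources, pushes events to Redis
    output {
      redis {
        host      => "redis.example.com"
        data_type => "list"
        key       => "logstash"
      }
    }

    # indexer.conf: any number of these can read from the same list
    input {
      redis {
        host      => "redis.example.com"
        data_type => "list"
        key       => "logstash"
      }
    }
    output {
      elasticsearch { hosts => ["es.example.com:9200"] }
    }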
When using Logstash with Redis, you can configure Redis to keep all the log entries in memory only, which makes it act like an in-memory queue (like memcached).
You may reach a point where the volume of logs sent cannot be processed by Logstash, and this can bring your system down on a regular basis (we observed this in our environment).
If you feel Redis is too much overhead for your disk, you can configure it to store all the logs in memory until they are processed by Logstash.
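A rough redis.conf fragment for that kind of memory-only queue, with placeholder sizes (tune the limit to the burst you expect while Logstash is down):

    # keep the queue purely in memory: no RDB snapshots, no AOF
    save ""
    appendonly no

    # cap memory and fail writes rather than silently evicting queued log entries
    maxmemory 2gb
    maxmemory-policy noeviction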
As we built our ELK infrastructure, we originally had a lot of problems with the logstash indexer (reading from redis). Redis would back up and eventually die. I believe this was because, in the hope of not losing log files, redis was configured to persist the cache to disk once in a while. When the queue got "too large" (but still within available disk space), redis would die, taking all of the cached entries with it.
If this is the best redis can do, I wouldn't recommend it.
Fortunately, we were able to resolve the issues with the indexer, which typically kept the redis queue empty. We set our monitoring to alert quickly when the queue did back up, and it was a good sign that the indexer was unhappy again.
Hope that helps.