Redis service automatically stops after a few minutes of running

On my Ubuntu machine, the Redis server was running fine and then suddenly stopped. After I started it again, it stopped automatically after a few minutes. I start it again, and the cycle repeats. Why is this happening?
Here are the logs when I start redis:
21479:C 29 Apr 21:59:10.986 # oO0OoO0OoO0Oo Redis is starting oO0OoO0OoO0Oo
21479:C 29 Apr 21:59:10.987 # Redis version=4.0.9, bits=64, commit=00000000, modified=0, pid=21479, just started
21479:C 29 Apr 21:59:10.987 # Configuration loaded
21480:M 29 Apr 21:59:10.990 * Increased maximum number of open files to 10032 (it was originally set to 1024).
21480:M 29 Apr 21:59:10.991 # WARNING: The TCP backlog setting of 511 cannot be enforced because /proc/sys/net/core/somaxconn is set to the lower value of 128.
21480:M 29 Apr 21:59:10.992 # Server initialized
21480:M 29 Apr 21:59:14.588 * DB loaded from disk: 3.596 seconds
21480:M 29 Apr 21:59:14.591 * Ready to accept connections

Related

Redis service crashes with "Failed opening the RDB file systemdd (in server root dir /etc/cron.d) for saving: Permission denied"

I am running Redis server version 6.0.6 on Ubuntu 20.04. The process is run by "redis" user.
Sometimes the Redis process crashes and restarts on its own, and when this happens a lot of the data cached in Redis becomes unavailable. This happens every few days or weeks. I can see the following messages in the logs; saving was working fine until 02:32:43 and suddenly failed at 02:34:15:
133121:C 23 Jun 2021 02:27:54.383 * RDB: 22 MB of memory used by copy-on-write
105798:M 23 Jun 2021 02:27:54.511 * Background saving terminated with success
105798:M 23 Jun 2021 02:29:46.279 * 10000 changes in 60 seconds. Saving...
105798:M 23 Jun 2021 02:29:46.354 * Background saving started by pid 133125
133125:C 23 Jun 2021 02:30:16.363 * DB saved on disk
133125:C 23 Jun 2021 02:30:16.464 * RDB: 18 MB of memory used by copy-on-write
105798:M 23 Jun 2021 02:30:16.583 * Background saving terminated with success
105798:M 23 Jun 2021 02:32:14.138 * 10000 changes in 60 seconds. Saving...
105798:M 23 Jun 2021 02:32:14.222 * Background saving started by pid 133131
133131:C 23 Jun 2021 02:32:42.924 * DB saved on disk
133131:C 23 Jun 2021 02:32:42.988 * RDB: 22 MB of memory used by copy-on-write
105798:M 23 Jun 2021 02:32:43.123 * Background saving terminated with success
105798:M 23 Jun 2021 02:34:14.958 * DB saved on disk
105798:M 23 Jun 2021 02:34:15.705 # Failed opening the RDB file systemdd (in server root dir /etc/cron.d) for saving: Permission denied
=== REDIS BUG REPORT START: Cut & paste starting from here ===
105798:M 23 Jun 2021 02:34:15.705 # Redis 6.0.6 crashed by signal: 11
105798:M 23 Jun 2021 02:34:15.705 # Crashed running the instruction at: 0x55f2e7e35099
105798:M 23 Jun 2021 02:34:15.705 # Accessing address: 0x149968
105798:M 23 Jun 2021 02:34:15.705 # Failed assertion: <no assertion failed> (<no file>:0)
------ STACK TRACE ------
EIP:
/usr/bin/redis-server 172.16.106.88:6379(je_malloc_usable_size+0x89)[0x55f2e7e35099]
Backtrace:
/usr/bin/redis-server 172.16.106.88:6379(logStackTrace+0x4f)[0x55f2e7db2bcf]
/usr/bin/redis-server 172.16.106.88:6379(sigsegvHandler+0xb5)[0x55f2e7db33d5]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x153c0)[0x7fb934c173c0]
/usr/bin/redis-server 172.16.106.88:6379(je_malloc_usable_size+0x89)[0x55f2e7e35099]
/usr/bin/redis-server 172.16.106.88:6379(+0x50b79)[0x55f2e7d72b79]
/usr/bin/redis-server 172.16.106.88:6379(rdbSave+0x2ba)[0x55f2e7d9345a]
/usr/bin/redis-server 172.16.106.88:6379(saveCommand+0x67)[0x55f2e7d94ab7]
/usr/bin/redis-server 172.16.106.88:6379(call+0xb1)[0x55f2e7d6a8b1]
/usr/bin/redis-server 172.16.106.88:6379(processCommand+0x4a6)[0x55f2e7d6b446]
/usr/bin/redis-server 172.16.106.88:6379(processCommandAndResetClient+0x14)[0x55f2e7d799e4]
/usr/bin/redis-server 172.16.106.88:6379(processInputBuffer+0x18f)[0x55f2e7d7e39f]
/usr/bin/redis-server 172.16.106.88:6379(+0xe10ac)[0x55f2e7e030ac]
/usr/bin/redis-server 172.16.106.88:6379(aeProcessEvents+0x303)[0x55f2e7d63b83]
/usr/bin/redis-server 172.16.106.88:6379(aeMain+0x1d)[0x55f2e7d63ebd]
/usr/bin/redis-server 172.16.106.88:6379(main+0x4e5)[0x55f2e7d603d5]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf3)[0x7fb934a370b3]
/usr/bin/redis-server 172.16.106.88:6379(_start+0x2e)[0x55f2e7d606ae]
The service restarts on its own, and the Redis server then works fine for a few days or weeks before crashing again with the same error.
I have checked several posts on SO, but none of them resolves my issue because:
a) The instance where the Redis server is running is in a private network (public access is disabled).
b) The DB file name and dir have not been corrupted as observed from "config get dbfilename" and "config get dir" commands. They show the default values.
c) The permissions of the directories are correct (/var/lib/redis is owned by redis with 755 permissions and /var/lib/redis/dump.rdb is owned by redis with 660 permissions)
Can anyone help me identify the root cause of this issue please?
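One way to watch for this failure is to pull the file name and directory out of the "Failed opening the RDB file" line and compare them with the expected ones. Below is a minimal Python sketch, assuming the log line format shown above. The function name and the expected values (dump.rdb in /var/lib/redis, taken from the question) are illustrative, not part of Redis:

```python
import re

# Matches the failure line from the log above, e.g.
# "... # Failed opening the RDB file systemdd (in server root dir /etc/cron.d)
#  for saving: Permission denied"
RDB_FAILURE = re.compile(
    r"Failed opening the RDB file (?P<dbfilename>\S+) "
    r"\(in server root dir (?P<dir>[^)]+)\) for saving: (?P<reason>.+)$"
)


def find_unexpected_rdb_targets(log_lines, expected_dir="/var/lib/redis",
                                expected_dbfilename="dump.rdb"):
    """Return (dbfilename, dir, reason) for each failed save whose target
    differs from the expected one. The expected values are assumptions
    taken from the question, not constants built into Redis."""
    hits = []
    for line in log_lines:
        match = RDB_FAILURE.search(line.rstrip())
        if match is None:
            continue
        target = (match["dbfilename"], match["dir"])
        if target != (expected_dbfilename, expected_dir):
            hits.append((match["dbfilename"], match["dir"], match["reason"]))
    return hits
```

For the failure line in the log above this returns the file name systemdd and the directory /etc/cron.d, which suggests that dir and dbfilename held different values at the moment of the crash than the ones reported by "config get" after the restart.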

Redis crashing without any log errors

I'm debugging some weird behavior in my Redis instance: it crashes roughly every 2 days but shows no errors whatsoever, only this in the logs:
1:C 10 Sep 2020 15:44:14.517 # Configuration loaded
1:M 10 Sep 2020 15:44:14.522 * Running mode=standalone, port=6379.
1:M 10 Sep 2020 15:44:14.522 # Server initialized
1:M 10 Sep 2020 15:44:14.524 * Ready to accept connections
1:C 12 Sep 2020 13:20:23.751 # oO0OoO0OoO0Oo Redis is starting oO0OoO0OoO0Oo
1:C 12 Sep 2020 13:20:23.751 # Redis version=6.0.5, bits=64, commit=00000000, modified=0, pid=1, just started
1:C 12 Sep 2020 13:20:23.751 # Configuration loaded
1:M 12 Sep 2020 13:20:23.757 * Running mode=standalone, port=6379.
1:M 12 Sep 2020 13:20:23.757 # Server initialized
1:M 12 Sep 2020 13:20:23.758 * Ready to accept connections
That's all redis says to me.
I have lots of RAM available, but Redis is running as a single instance in a Docker container. Could a lack of processing power cause this? Should I use multiple nodes? I don't want to set up a cluster only to find out the problem was something else. How can I track down the actual cause of the problem?
So, in the end, it was exactly what I thought it was not: a memory leak!
I had 16GB of RAM that was slowly being consumed until Redis crashed, with no warning from Redis, the operating system, or Docker. I fixed the app that caused the leak and the problem was gone.
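A slow leak like this can be spotted earlier by sampling memory usage over time. Here is a rough Python sketch, assuming the growth shows up in the used_memory field of the Redis INFO output (if the leak lives in another process, the host's memory would have to be sampled instead). The function names and the threshold are hypothetical:

```python
def parse_info(info_text):
    """Parse the text returned by the Redis INFO command into a dict.
    Lines starting with '#' are section headers and are skipped."""
    fields = {}
    for line in info_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition(":")
        fields[key] = value
    return fields


def grows_monotonically(samples, min_total_growth):
    """Return True when no sample is smaller than the previous one and the
    total growth is at least min_total_growth bytes (a made-up threshold)."""
    if len(samples) < 2:
        return False
    never_shrinks = all(b >= a for a, b in zip(samples, samples[1:]))
    return never_shrinks and samples[-1] - samples[0] >= min_total_growth
```

The idea would be to call parse_info on the INFO output every few minutes, collect int(fields["used_memory"]) into a list, and alert when grows_monotonically returns True over a long enough window.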

Redis timeout with almost no data in the database, using the .NET client

I received this error:
StackExchange.Redis.RedisTimeoutException: Timeout performing GET (5000ms),
next: GET RetryCount, inst: 3, qu: 0, qs: 1, aw: False, rs: ReadAsync, ws: Idle, in: 7, in-pipe: 0, out-pipe: 0,
serverEndpoint: redis:6379, mc: 1/1/0, mgr: 10 of 10 available, clientName: 18745af38fec,
IOCP: (Busy=0,Free=1000,Min=1,Max=1000),
WORKER: (Busy=6,Free=32761,Min=1,Max=32767), v: 2.1.58.34321
(Please take a look at this article for some common client-side issues that can cause timeouts: https://stackexchange.github.io/StackExchange.Redis/Timeouts)
We can see that only a single message has been sent and is awaiting a reply (qs: 1) and that only 7 bytes are waiting to be read (in: 7). Redis is used by 2 processes; it holds settings for the system and stores logs.
It was a fresh re-install, so no logs had been written yet and the database probably holds 2-3 KB of data :)
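To compare these counters across several occurrences, the exception text can be parsed into a dictionary. This is a small Python sketch, assuming the message layout shown above. As I understand the StackExchange.Redis timeouts article, qu is the number of unsent messages, qs is the number sent and awaiting a response, and in is the number of bytes waiting to be read. The function name is hypothetical:

```python
import re

# Numeric counters of the form "name: value" as they appear in the message above.
FIELD = re.compile(r"\b(qu|qs|in|in-pipe|out-pipe|inst): (\d+)")


def parse_timeout_counters(message):
    """Extract the numeric counters from a RedisTimeoutException message."""
    return {name: int(value) for name, value in FIELD.findall(message)}
```

With the message from this question, the result has qs equal to 1 and in equal to 7, matching the reading above.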
This is the only output from Redis:
1:C 12 Sep 2020 15:20:49.293 # oO0OoO0OoO0Oo Redis is starting oO0OoO0OoO0Oo
1:C 12 Sep 2020 15:20:49.293 # Redis version=6.0.8, bits=64, commit=00000000, modified=0, pid=1, just started
1:C 12 Sep 2020 15:20:49.293 # Configuration loaded
1:M 12 Sep 2020 15:20:49.296 * Running mode=standalone, port=6379.
1:M 12 Sep 2020 15:20:49.296 # WARNING: The TCP backlog setting of 511 cannot be enforced because /proc/sys/net/core/somaxconn is set to the lower value of 128.
1:M 12 Sep 2020 15:20:49.296 # Server initialized
1:M 12 Sep 2020 15:20:49.296 # WARNING overcommit_memory is set to 0! Background save may fail under low memory condition. To fix this issue add 'vm.overcommit_memory = 1' to /etc/sysctl.conf and then reboot or run the command 'sysctl vm.overcommit_memory=1' for this to take effect.
1:M 12 Sep 2020 15:20:49.296 # WARNING you have Transparent Huge Pages (THP) support enabled in your kernel. This will create latency and memory usage issues with Redis. To fix this issue run the command 'echo madvise > /sys/kernel/mm/transparent_hugepage/enabled' as root, and add it to your /etc/rc.local in order to retain the setting after a reboot. Redis must be restarted after THP is disabled (set to 'madvise' or 'never').
1:M 12 Sep 2020 15:20:49.305 * DB loaded from append only file: 0.000 seconds
1:M 12 Sep 2020 15:20:49.305 * Ready to accept connections
So it looks like nothing went wrong on that side.
The 2 processes accessing it are in Docker containers, and so is Redis, all on a single AWS instance with plenty of RAM and disk available.
This is also a one-time event; it has never happened before with the same config.
I'm not very experienced with Redis; is there anything in the error message that looks suspicious?

Cannot restart redis-sentinel unit

I'm trying to configure 3 Redis instances and 6 sentinels (3 of them running on the Redis hosts and the rest on different hosts). But when I install the redis-sentinel package, put my configuration in /etc/redis/sentinel.conf, and restart the service using systemctl restart redis-sentinel, I get this error:
Job for redis-sentinel.service failed because a timeout was exceeded.
See "systemctl status redis-sentinel.service" and "journalctl -xe" for details.
Here is the output of journalctl -u redis-sentinel:
Jan 01 08:07:07 redis1 systemd[1]: Starting Advanced key-value store...
Jan 01 08:07:07 redis1 redis-sentinel[16269]: 16269:X 01 Jan 2020 08:07:07.263 # oO0OoO0OoO0Oo Redis is starting oO0OoO0OoO0Oo
Jan 01 08:07:07 redis1 redis-sentinel[16269]: 16269:X 01 Jan 2020 08:07:07.263 # Redis version=5.0.7, bits=64, commit=00000000, modified=0, pid=16269, just started
Jan 01 08:07:07 redis1 redis-sentinel[16269]: 16269:X 01 Jan 2020 08:07:07.263 # Configuration loaded
Jan 01 08:07:07 redis1 systemd[1]: redis-sentinel.service: Can't open PID file /var/run/sentinel/redis-sentinel.pid (yet?) after start: No such file or directory
Jan 01 08:08:37 redis1 systemd[1]: redis-sentinel.service: Start operation timed out. Terminating.
Jan 01 08:08:37 redis1 systemd[1]: redis-sentinel.service: Failed with result 'timeout'.
Jan 01 08:08:37 redis1 systemd[1]: Failed to start Advanced key-value store.
Jan 01 08:08:37 redis1 systemd[1]: redis-sentinel.service: Service hold-off time over, scheduling restart.
Jan 01 08:08:37 redis1 systemd[1]: redis-sentinel.service: Scheduled restart job, restart counter is at 5.
Jan 01 08:08:37 redis1 systemd[1]: Stopped Advanced key-value store.
Jan 01 08:08:37 redis1 systemd[1]: Starting Advanced key-value store...
Jan 01 08:08:37 redis1 redis-sentinel[16307]: 16307:X 01 Jan 2020 08:08:37.738 # oO0OoO0OoO0Oo Redis is starting oO0OoO0OoO0Oo
Jan 01 08:08:37 redis1 redis-sentinel[16307]: 16307:X 01 Jan 2020 08:08:37.739 # Redis version=5.0.7, bits=64, commit=00000000, modified=0, pid=16307, just started
Jan 01 08:08:37 redis1 redis-sentinel[16307]: 16307:X 01 Jan 2020 08:08:37.739 # Configuration loaded
Jan 01 08:08:37 redis1 systemd[1]: redis-sentinel.service: Can't open PID file /var/run/sentinel/redis-sentinel.pid (yet?) after start: No such file or directory
and my sentinel.conf file:
port 26379
daemonize yes
sentinel myid 851994c7364e2138e03ee1cd346fbdc4f1404e4c
sentinel deny-scripts-reconfig yes
sentinel monitor mymaster 172.28.128.11 6379 2
sentinel down-after-milliseconds mymaster 5000
# Generated by CONFIG REWRITE
dir "/"
protected-mode no
sentinel failover-timeout mymaster 60000
sentinel config-epoch mymaster 0
sentinel leader-epoch mymaster 0
sentinel current-epoch 0
If you are running your Redis servers on a Debian-based distribution, add the following pidfile lines to your Redis configuration files:
In /etc/redis/sentinel.conf: pidfile /var/run/redis/redis-sentinel.pid
In /etc/redis/redis.conf: pidfile /var/run/redis/redis-server.pid
What's the output in the sentinel log file?
I had a similar issue where Sentinel received a lot of SIGTERMs.
In that case, make sure that if you use the daemonize yes setting, the systemd unit file uses Type=forking.
Also make sure that the location of the PID file specified in the sentinel config matches the location specified in the systemd unit file.
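Both checks can be automated with a small script. This is a minimal Python sketch that compares the text of sentinel.conf with the [Service] section of the unit file. The parsers are deliberately simplistic (for repeated directives the last one wins, which is enough for daemonize and pidfile), and all function names are hypothetical:

```python
def parse_redis_conf(text):
    """Return a dict of directives from redis.conf/sentinel.conf text.
    For repeated directives the last occurrence wins."""
    directives = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        name, _, value = line.partition(" ")
        directives[name.lower()] = value.strip().strip('"')
    return directives


def parse_unit_service(text):
    """Return the key=value pairs found in the [Service] section of a unit file."""
    settings, in_service = {}, False
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("["):
            in_service = (line == "[Service]")
        elif in_service and "=" in line and not line.startswith(("#", ";")):
            key, _, value = line.partition("=")
            settings[key.strip()] = value.strip()
    return settings


def check_consistency(conf_text, unit_text):
    """Return a list of problems; the list is empty when the two files agree."""
    conf, unit = parse_redis_conf(conf_text), parse_unit_service(unit_text)
    problems = []
    if conf.get("daemonize", "no") == "yes" and unit.get("Type") != "forking":
        problems.append("daemonize yes requires Type=forking")
    if "PIDFile" in unit and conf.get("pidfile") != unit["PIDFile"]:
        problems.append("pidfile and PIDFile differ")
    return problems
```

Feeding it the contents of /etc/redis/sentinel.conf and the output of systemctl cat redis-sentinel should list any mismatch. A config without a pidfile line, like the one in the question, is reported as differing from the unit's PIDFile.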
If you see the error below in the journalctl or systemctl output,
Jun 26 10:13:02 x systemd[1]: redis-server.service: Failed with result 'exit-code'.
Jun 26 10:13:02 x systemd[1]: redis-server.service: Scheduled restart job, restart counter is at 5.
Jun 26 10:13:02 x systemd[1]: Stopped Advanced key-value store.
Jun 26 10:13:02 x systemd[1]: redis-server.service: Start request repeated too quickly.
Jun 26 10:13:02 x systemd[1]: redis-server.service: Failed with result 'exit-code'.
Jun 26 10:13:02 x systemd[1]: Failed to start Advanced key-value store.
then check /var/log/redis/redis-server.log for more information.
In most cases the issue is mentioned there.
For example, if a dump.rdb file was placed in /var/lib/redis, the issue might be a mismatch in the database count or the Redis version.
In another scenario, disabled IPv6 might be the issue.

Redis master-slave setup failing

I have started a server on port 6001 as master with AOF persistence turned off, and a slave on port 6002 replicating from 6001. However, on startup of the slave I get the error below in an infinite loop, and I am not able to find any error logs about it.
Slave infinite loop logs:
[5556] 20 Aug 21:34:28.499 # Server started, Redis version 3.2.100
[5556] 20 Aug 21:34:28.500 * DB loaded from disk: 0.001 seconds
[5556] 20 Aug 21:34:28.500 * The server is now ready to accept connections on port 6002
[5556] 20 Aug 21:34:28.501 * Connecting to MASTER localhost:6001
[5556] 20 Aug 21:34:28.513 * MASTER <-> SLAVE sync started
[5556] 20 Aug 21:34:29.513 * Non blocking connect for SYNC fired the event.
[5556] 20 Aug 21:34:29.513 # Sending command to master in replication handshake: -Writing to master: Unknown error
[5556] 20 Aug 21:34:29.516 * Connecting to MASTER localhost:6001
[5556] 20 Aug 21:34:29.517 * MASTER <-> SLAVE sync started
Issue resolved: the master's redis.conf contained 127.0.0.1 as the bind value, while the slave's redis.conf had slaveof localhost. Replacing localhost with 127.0.0.1 resolved the issue.
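A possible explanation, stated here as an assumption rather than something confirmed in this thread, is that localhost can resolve to more than one address (for example the IPv6 loopback ::1) while the master only listens on 127.0.0.1. Here is a small Python sketch for checking that on a given machine; the helper names are hypothetical:

```python
import socket


def resolve_addresses(host, port):
    """Return the distinct IP addresses the host name resolves to, in resolver order."""
    seen = []
    for *_, sockaddr in socket.getaddrinfo(host, port, type=socket.SOCK_STREAM):
        if sockaddr[0] not in seen:
            seen.append(sockaddr[0])
    return seen


def addresses_not_bound(resolved, bind_values):
    """Return the resolved addresses that the master is not bound to."""
    return [addr for addr in resolved if addr not in bind_values]
```

Calling addresses_not_bound(resolve_addresses("localhost", 6001), ["127.0.0.1"]) on the slave's machine lists any address that localhost resolves to but that the master's bind line does not cover. An empty list means the name and the bind value agree.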