Redis - why is redis-server's memory usage decreasing?

I'm running Redis on Windows and have noticed that the memory used by redis-server.exe decreases over time. When I start Redis, it reads from a dump file and loads all of the hashed key values into memory, taking up about 1.4 GB. Over time, however, the amount of memory that redis-server.exe takes up decreases. I have seen it go down to less than 100 MB.
The only explanation I could think of is that the keys are expiring and leaving memory; however, I have set Redis up so that they never expire. I have also made sure that I have allocated enough memory.
Some of my settings include:
maxmemory 2gb
maxmemory-policy noeviction
hash-max-zipmap-entries 512
hash-max-zipmap-value 64
activerehashing no
If it's of interest, when I first loaded the keys into Redis, I did it through Python like so:
r.hset(key, field, value)
Any help would be appreciated. I want the keys to be there forever.
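For reference, a minimal redis-py sketch of that loading step plus a check that no TTL is attached (the connection details and key names below are placeholders, not from the original script):
import redis

# Placeholder connection details - adjust to your setup.
r = redis.StrictRedis(host='localhost', port=6379, db=0)

# Load one hash field, as in the original script.
r.hset('myhash', 'somefield', 'somevalue')

# A key only disappears if it expires or is evicted; with no TTL and
# maxmemory-policy noeviction, neither should happen.
print(r.ttl('myhash'))   # -1 (or None on older redis-py versions) means no expiration is set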
This is my output from the INFO command right after I first run it:
redis 127.0.0.1:6379> INFO
redis_version:2.4.6
redis_git_sha1:26cdd13a
redis_git_dirty:0
arch_bits:64
multiplexing_api:winsock2
gcc_version:4.6.1
process_id:9092
uptime_in_seconds:69
uptime_in_days:0
lru_clock:248011
used_cpu_sys:3.34
used_cpu_user:10.06
used_cpu_sys_children:0.00
used_cpu_user_children:0.00
connected_clients:1
connected_slaves:0
client_longest_output_list:0
client_biggest_input_buf:0
blocked_clients:0
used_memory:1129560232
used_memory_human:1.05G
used_memory_rss:1129560232
used_memory_peak:1129560144
used_memory_peak_human:1.05G
mem_fragmentation_ratio:1.00
mem_allocator:libc
loading:0
aof_enabled:0
changes_since_last_save:0
bgsave_in_progress:0
last_save_time:1386600366
bgrewriteaof_in_progress:0
total_connections_received:1
total_commands_processed:0
expired_keys:0
evicted_keys:0
keyspace_hits:0
keyspace_misses:0
pubsub_channels:0
pubsub_patterns:0
latest_fork_usec:0
vm_enabled:0
role:master
db0:keys=4007989,expires=0
After running INFO again, once I noticed the memory had decreased in Windows Task Manager, there are not many differences:
uptime_in_seconds:4412 (from 69)
lru_clock:248445 (from 248011)
used_cpu_sys:4.59 (from 3.34)
used_cpu_user:10.25 (from 10.06)
used_memory:1129561240 (from 1129560232)
used_memory_human:1.05G (same!)
used_memory_rss:1129561240 (from 1129560232)
used_memory_peak:1129568960 (from 1129560144)
used_memory_peak_human:1.05G (same!)
mem_fragmentation_ratio:1.00 (same!)
last_save_time:1386600366 (same!)
total_connections_received:4 (from 1)
total_commands_processed:10 (from 0)
expired_keys:0 (same!)
evicted_keys:0 (same!)
keyspace_hits:0 (same!)
keyspace_misses:2 (from 0)
The lookups are taking longer when the memory size is lower. What is going on here?
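One way to double-check from the Redis side (rather than Task Manager) is to poll the key count and the memory counters Redis reports over time; a rough redis-py sketch, with placeholder connection details:
import time
import redis

r = redis.StrictRedis(host='localhost', port=6379, db=0)

for _ in range(10):
    info = r.info()
    # If keys were really expiring or being evicted, dbsize and used_memory
    # would shrink along with the figure shown in Task Manager.
    print(r.dbsize(), info['used_memory'], info['used_memory_rss'])
    time.sleep(60)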

What version of Redis are you using?
Do you have a cron job of some sort that removes keys? (Do a grep for the DEL command in your codebase just to be sure.)

Redis usually runs a single process to manage the in-memory data. However, when the data is persisted to the RDB file, a second process starts to save all the data. During that process, you can see your memory use increase up to double the size of your data set.
I am familiar with how it is done on Linux, but I don't know the details of the Windows port, so maybe the differences in size you are seeing are caused by this second process that is launched periodically. You can easily check whether this is the case by issuing a BGSAVE command in Redis. This will start the synchronization of data to the RDB file in the background, so you can see whether the memory usage pattern is the one you observed.
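If you want to trigger that check from the same Python client, here is a sketch with redis-py (the field name matches the bgsave_in_progress entry in the INFO output above; connection details are placeholders):
import time
import redis

r = redis.StrictRedis(host='localhost', port=6379, db=0)

r.bgsave()   # ask the server to start a background RDB save
while r.info().get('bgsave_in_progress', 0) == 1:
    time.sleep(1)   # watch redis-server.exe in Task Manager while this runs
print('background save finished')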
If that is the case, then you already know what is going on :)
good luck

Related

"OOM command not allowed when used memory > 'maxmemory'" for an Amazon ElastiCache Redis

I'm getting "OOM command not allowed when used memory > 'maxmemory'" error from time to time when trying to insert into an Elasticache redis node.
I went from a self-managed redis instance (maxmemory = 12Go, maxmemory-policy = allkeys-lru) to an Elasticache redis one (r5.large, i.e. maxmemory = 14 Go, maxmemory-policy = allkeys-lru).
However, after the migration of keys I'm getting "OOM command not allowed when used memory > 'maxmemory'" error from time to time that I don't manage to understand.
I've checked what they recommend here: https://aws.amazon.com/premiumsupport/knowledge-center/oom-command-not-allowed-redis/ to solve the problem but so far:
I have a TTL on all keys
I'm already in allkeys-lru
When I look at the node freeable memory I have about 7Go available
Here is the output of INFO memory:
# Memory
used_memory:10526693040
used_memory_human:9.80G
used_memory_rss:11520012288
used_memory_rss_human:10.73G
used_memory_peak:10560011952
used_memory_peak_human:9.83G
used_memory_peak_perc:99.68%
used_memory_overhead:201133315
used_memory_startup:4203584
used_memory_dataset:10325559725
used_memory_dataset_perc:98.13%
allocator_allocated:10527575720
allocator_active:11510194176
allocator_resident:11667750912
used_memory_lua:37888
used_memory_lua_human:37.00K
used_memory_scripts:0
used_memory_scripts_human:0B
number_of_cached_scripts:0
maxmemory:10527885773
maxmemory_human:9.80G
maxmemory_policy:allkeys-lru
allocator_frag_ratio:1.09
allocator_frag_bytes:982618456
allocator_rss_ratio:1.01
allocator_rss_bytes:157556736
rss_overhead_ratio:0.99
rss_overhead_bytes:-147738624
mem_fragmentation_ratio:1.09
mem_fragmentation_bytes:993361528
mem_not_counted_for_evict:0
mem_replication_backlog:1048576
mem_clients_slaves:0
mem_clients_normal:153411
mem_aof_buffer:0
mem_allocator:jemalloc-5.1.0
active_defrag_running:0
lazyfree_pending_objects:0
If you have any clue about how to solve this, I'd appreciate it.
Thanks!
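For reference, the used_memory and maxmemory figures from that INFO output can be compared directly; a small redis-py sketch (the endpoint is a placeholder):
import redis

# Placeholder endpoint - replace with the ElastiCache node's address.
r = redis.StrictRedis(host='your-elasticache-endpoint', port=6379)

mem = r.info('memory')
ratio = mem['used_memory'] / mem['maxmemory']
# With the numbers shown above: 10526693040 / 10527885773, i.e. about 99.99%,
# so the node is sitting right at its maxmemory limit even though the
# instance-level freeable memory looks fine.
print('used_memory / maxmemory = {:.2%}'.format(ratio))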

Why does a received ZFS dataset use less space than the original?

I have a dataset on server1 that I want to back up to a second server, server2.
Server1 (original):
zfs list -o name,used,avail,refer,creation,usedds,usedsnap,origin,compression,compressratio,refcompressratio,mounted,atime,lused storage/iscsi/webhost-old produces:
NAME USED AVAIL REFER CREATION USEDDS USEDSNAP ORIGIN COMPRESS RATIO REFRATIO MOUNTED ATIME LUSED
storage/iscsi/webhost-old 67,8G 1,87T 67,8G Út kvě 31 6:54 2016 67,8G 16K - lz4 1.00x 1.00x - - 67,4G
Sending volume to the 2nd server:
zfs send storage/iscsi/webhost-old | pv | ssh -c arcfour,aes128-gcm@openssh.com root@10.0.0.2 zfs receive -Fduv pool/bkp-storage
received 69,6GB stream in 378 seconds (189MB/sec)
Server2 zfs list produces:
NAME USED AVAIL REFER CREATION USEDDS USEDSNAP ORIGIN COMPRESS RATIO REFRATIO MOUNTED ATIME LUSED
pool/bkp-storage/iscsi/webhost-old 36,1G 3,01T 36,1G Pá pro 29 10:25 2017 36,1G 0 - lz4 1.15x 1.15x - - 28,4G
Why is there such a difference in sizes? Thanks.
From what you posted, I noticed 3 things that seemed odd:
the compressratio is 1.15x on system 2, but 1.00x on system 1
on system 2, used is 1.27x higher than logicalused
the logicalused value and the number zfs receive reports are ~2.3x higher on system 1 than on system 2
These terms are all defined in the man page, but it can still be confusing to reverse-engineer explanations for them in practice.
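As a quick sanity check, observations (2) and (3) can be reproduced from the numbers already shown (plain arithmetic in Python, units as reported by zfs list):
# Figures taken from the two zfs list outputs above.
used1, lused1 = 67.8, 67.4   # system 1: USED, LUSED
used2, lused2 = 36.1, 28.4   # system 2: USED, LUSED
stream = 69.6                # size reported when receiving the stream

print(used2 / lused2)    # ~1.27 -> observation (2)
print(lused1 / lused2)   # ~2.37 -> observation (3), logicalused
print(stream / lused2)   # ~2.45 -> observation (3), send stream size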
(1) could happen if you enabled compression on the source dataset after you wrote all the data to it, since ZFS doesn't rewrite the data to compress it when you enable that setting. The data sent by zfs send is uncompressed unless you use -c, but system 2 will try to compress it as it runs zfs receive if the setting is enabled on the destination dataset. If both system 1 and system 2 had the same compression settings before the data was written, they would have the same compressratio as well.
(2) can happen due to metadata written along with your data, but in this case it's too high for "normal" metadata, which accounts for 1-2% of most pools. It's probably caused by a pool-wide setting, like configuring RAID-Z, or a weird combination of striping and mirroring (like 4 stripes, but with one of them being a mirror).
For (3), I re-read the man page to try to figure it out:
logicalused
The amount of space that is "logically" consumed by this dataset and
all its descendents. See the used property. The logical space
ignores the effect of the compression and copies properties, giving a
quantity closer to the amount of data that applications see.
If you were sending a dataset (instead of a single iSCSI volume) and the send size matched system 2's logicalused value (instead of system 1's), I would guess you forgot to send some child datasets (i.e., that you didn't use zfs send -R). However, neither of those is true in this case.
I had to do some additional digging -- this blog post from 2005 might contain the explanation. If system 1 didn't have compression enabled when the data was written (like I guessed above for (1)), the function responsible for not writing zeroed-out blocks (zio_compress_data) would not be run, so you probably have a bunch of empty blocks written to disk, and accounted for in the logicalused size. However, since lz4 is configured on system 2, it would run there, and those blocks would not be counted.

Erlang VM killed when creating millions of processes

So, after Joe Armstrong's claims that Erlang processes are cheap and the VM can handle millions of them, I decided to test it on my machine:
process_galore(N) ->
    io:format("process limit: ~p~n", [erlang:system_info(process_limit)]),
    statistics(runtime),
    statistics(wall_clock),
    L = for(0, N, fun() -> spawn(fun() -> wait() end) end),
    {_, Rt} = statistics(runtime),
    {_, Wt} = statistics(wall_clock),
    lists:foreach(fun(Pid) -> Pid ! die end, L),
    io:format("Processes created: ~p~n"
              "Run time ms: ~p~n"
              "Wall time ms: ~p~n"
              "Average run time: ~p microseconds!~n",
              [N, Rt, Wt, (Rt/N)*1000]).

wait() ->
    receive
        die -> done
    end.

for(N, N, _) ->
    [];
for(I, N, Fun) when I < N ->
    [Fun()|for(I+1, N, Fun)].
Results are impressive for a million processes: I get approximately 6.6 microseconds (!) average spawn time. But when starting 3 million processes, the OS shell prints "Killed" and the Erlang runtime is gone.
I run erl with the +P 5000000 flag; the system is Arch Linux with a quad-core i7 and 8 GB RAM.
Erlang processes are cheap, but they're not free. Erlang processes spawned by spawn use 338 words of memory, which is 2704 bytes on a 64-bit system. Spawning 3 million processes will therefore use at least 8112 MB of RAM, not counting the overhead of creating the linked list of pids and the anonymous function created for each process (I'm not sure whether those are shared when they're created the way you're creating them). You'll probably need 10-12 GB of free RAM to spawn and keep alive 3 million (almost) empty processes.
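The arithmetic behind that estimate, as a back-of-the-envelope check in Python:
words_per_process = 338    # default footprint of a spawned process, per the figure above
bytes_per_word = 8         # 64-bit system
n = 3000000

total_mb = n * words_per_process * bytes_per_word / 10**6
print(total_mb)            # ~8112 MB just for the processes, before any other overhead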
As I pointed out in the comments (and you later verified), the "Killed" message was printed by the Linux Kernel when it killed the Erlang VM, most likely for using up too much RAM. More information here.

Reset Redis "used_memory_peak" stat

I'm using Redis (2.4.2) and with the INFO command I can read stats about my Redis server.
There are many stats, including some about memory usage. One of them, "used_memory_peak", seems to hold the maximum amount of memory Redis has ever used.
I've deleted a bunch of keys, and I'd like to reset this stat since it affects the scale of my Munin graphs.
There is a CONFIG RESETSTAT command, but it doesn't seem to affect this particular stat.
Any idea how I could do this, without having to export/delete/import my dataset ?
EDIT:
According to @antirez himself (issue 369 on GitHub), this is intended behavior, but the feature could be improved to be more useful in a future release.
The implementation of CONFIG RESETSTAT is quite simple:
} else if (!strcasecmp(c->argv[1]->ptr,"resetstat")) {
    if (c->argc != 2) goto badarity;
    server.stat_keyspace_hits = 0;
    server.stat_keyspace_misses = 0;
    server.stat_numcommands = 0;
    server.stat_numconnections = 0;
    server.stat_expiredkeys = 0;
    addReply(c,shared.ok);
So it does not initialize the server.stat_peak_memory field used to store the maximum amount of memory ever used by Redis. I don't know if it is a bug or a feature.
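The behaviour is easy to confirm from a client on this version; a redis-py sketch with placeholder connection details:
import redis

r = redis.StrictRedis(host='localhost', port=6379, db=0)

print(r.info()['used_memory_peak'])   # peak before the reset
r.config_resetstat()                  # resets hits/misses/commands/connections
print(r.info()['used_memory_peak'])   # the peak value is still unchanged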
Here is a hack to reset the value without having to stop Redis. The idea is to use gdb in batch mode to just change the value of the variable (which is part of a static structure). Normally Redis is compiled with debugging symbols.
# Here we have plenty of things in this instance
> ./redis-cli info | grep peak
used_memory_peak:1363052184
used_memory_peak_human:1.27G
# Let's do some cleaning: everything is wiped out
# don't do this in production !!!
> ./redis-cli flushdb
OK
# Again the same values, while some memory has been freed
> ./redis-cli info | grep peak
used_memory_peak:1363052184
used_memory_peak_human:1.27G
# Here is the magic command: reset the parameter with gdb (output and warnings to be ignored)
> gdb -batch -n -ex 'set variable server.stat_peak_memory = 0' ./redis-server `pidof redis-server`
Missing separate debuginfo for /lib64/libm.so.6
Missing separate debuginfo for /lib64/libdl.so.2
Missing separate debuginfo for /lib64/libpthread.so.0
[Thread debugging using libthread_db enabled]
[New Thread 0x41001940 (LWP 22837)]
[New Thread 0x40800940 (LWP 22836)]
Missing separate debuginfo for /lib64/libc.so.6
Missing separate debuginfo for /lib64/ld-linux-x86-64.so.2
warning: no loadable sections found in added symbol-file system-supplied DSO at 0x7ffff51ff000
0x00002af0b5eef218 in epoll_wait () from /lib64/libc.so.6
# And now, result is different: great !
> ./redis-cli info | grep peak
used_memory_peak:718768
used_memory_peak_human:701.92K
This is a hack: think twice before applying this trick on a production instance.
A simple trick to clear the peak memory stat:
Step 1:
/home/logproc/redis/bin/redis-cli BGREWRITEAOF
Wait till it finishes rewriting the AOF file.
Step 2:
Restart the Redis server.
Done. That's it.

Where is the default max locked memory value coming from?

So on one system, I have values that are pretty wide open:
$ ulimit -a | grep mem
max locked memory (kbytes, -l) 40000
max memory size (kbytes, -m) unlimited
virtual memory (kbytes, -v) unlimited
Another system has much more restrictive values, but I can't for the life of me find out where the 32 MB upper limit (it is 32 MB despite the mislabeling) is being set:
# ulimit -a | grep mem
max locked memory (kbytes, -l) 32
max memory size (kbytes, -m) unlimited
virtual memory (kbytes, -v) unlimited
The second system is a RHEL 5.5 box. I am looking to increase this limit for at least one user: I need a bigger APC mmap memory allocation, but I can't go above 30 MB without running into the above limit, and I would rather not hack the provided Apache init script. Where should I be trying to override the system default value so I can map a bigger segment of memory? Doing it in limits.conf for the apache user doesn't do a whole lot, probably because the init script doesn't go through PAM.
If the per-user setting you tried isn't working, you should make sure that you've correctly matched which user is hitting the limit.
You should also be able to add a line like this to limits.conf:
* hard memlock 40000
That'll change the default setting for all users.
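Once the limits.conf change is in place (and the process has gone through a fresh PAM login), you can confirm what limit a process actually ends up with from inside the process itself; for example, via Python's standard resource module:
import resource

# RLIMIT_MEMLOCK values are reported in bytes; resource.RLIM_INFINITY means unlimited.
soft, hard = resource.getrlimit(resource.RLIMIT_MEMLOCK)
print('soft:', soft, 'hard:', hard)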
From the limits.conf manpage:
The syntax of the lines is as follows:
<domain> <type> <item> <value>
The fields listed above should be filled as follows:
<domain>
[snip]
· the wildcard *, for default entry.