I want to understand when a cache is created with native persistence enabled, will it store the data in the defined data region/RAM and in the disk at the same time? Is there any way I can restrict the disk utilization for storing the data?
Additionally, in a cluster of 3 due to any reason the disk got full for one of the nodes and there is not enough memory available, what will be the impact on the cluster?
Yes, data will be stored both in RAM and on the disk. It does not have to fit in RAM all at once.
If you run out of disk space, your persistent store will likely be corrupted.
I set up a Redis (version 4.0.6) Sentinel cluster on two CentOS 6 VMs. Both the master and the slave Redis servers have maxmemory set to 10GB and maxmemory_policy set to volatile-lru.
The problem is that both servers are taking a lot of memory.
Master
used_memory:8959732536
used_memory_human:8.34G
used_memory_rss:14763728896
used_memory_rss_human:13.75G
used_memory_peak:10002148536
used_memory_peak_human:9.32G
used_memory_peak_perc:89.58%
used_memory_overhead:1344839894
used_memory_startup:761776
used_memory_dataset:7614892642
used_memory_dataset_perc:85.00%
total_system_memory:20957556736
total_system_memory_human:19.52G
used_memory_lua:37888
used_memory_lua_human:37.00K
maxmemory:10000000000
maxmemory_human:9.31G
maxmemory_policy:volatile-lru
mem_fragmentation_ratio:1.65
mem_allocator:jemalloc-3.6.0
active_defrag_running:0
lazyfree_pending_objects:0
Slave
used_memory:8927665872
used_memory_human:8.31G
used_memory_rss:16422535168
used_memory_rss_human:15.29G
used_memory_peak:10000009472
used_memory_peak_human:9.31G
used_memory_peak_perc:89.28%
used_memory_overhead:1340505548
used_memory_startup:761792
used_memory_dataset:7587160324
used_memory_dataset_perc:84.99%
total_system_memory:20957556736
total_system_memory_human:19.52G
used_memory_lua:37888
used_memory_lua_human:37.00K
maxmemory:10000000000
maxmemory_human:9.31G
maxmemory_policy:volatile-lru
mem_fragmentation_ratio:1.84
mem_allocator:jemalloc-3.6.0
active_defrag_running:0
lazyfree_pending_objects:0
Redis is taking 14064.8 MB and 15664.2 MB on master and slave respectively.
I do have a lot of data stored in Redis. Most of the keys have an expiry set and some have none.
The problem is: even with maxmemory set to 10 GB, why is Redis taking around 15GB in the VM?
I see that the used memory is below 10GB and the RSS memory is 15GB.
I did run MEMORY PURGE, which clears some of the RSS memory, but it fills up again within a few minutes and keeps growing.
Any suggestions on how I can control the memory consumption, or a permanent solution for this issue? Should I increase the RAM in the VM? If yes, how much RAM should I add to handle this situation?
RSS memory will always be larger than the actual memory used by Redis for the dataset. It appears that, in your case, you're also suffering from memory fragmentation so you should consider enabling the active defragmentor.
That said, allocating more RAM to your servers will allow them to reach higher fragmentation rates, so the more you add the longer it will take to reach memory pressure. Since fragmentation is usage dependent, it is hard to say accurately how much more you'll need, but in most cases fragmentation plateaus after a while, so that should give you some indication.
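To make the gap between used memory and RSS concrete, here is a minimal Python sketch that recomputes the fragmentation ratio from the INFO fields posted in the question, as used_memory_rss divided by used_memory. The helper names (parse_info, fragmentation_ratio) are made up for illustration; only the field names and numbers come from the output above.

```python
def parse_info(text):
    """Parse 'key:value' lines from Redis INFO output into a dict."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.strip().partition(":")
        if sep:
            fields[key] = value
    return fields


def fragmentation_ratio(fields):
    """RSS divided by the memory Redis itself accounts for."""
    return int(fields["used_memory_rss"]) / int(fields["used_memory"])


# Values copied from the INFO output in the question.
MASTER = """used_memory:8959732536
used_memory_rss:14763728896"""
SLAVE = """used_memory:8927665872
used_memory_rss:16422535168"""

master_ratio = fragmentation_ratio(parse_info(MASTER))
slave_ratio = fragmentation_ratio(parse_info(SLAVE))
print(round(master_ratio, 2), round(slave_ratio, 2))
```

The results match the mem_fragmentation_ratio values reported by the servers (1.65 and 1.84), so watching this ratio over time is one way to see whether fragmentation has plateaued.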
I might be wrong but still asking this question. ;-)
So I am planning to use Redis as persistent storage (primary storage). I have AOF enabled. I know Redis will load this data during server startup. Let us say I have 10GB of data and 5GB of RAM. If I try to search for a key which is not loaded in RAM, will it check the AOF and load that data into RAM by offloading any unused keys?
You cannot have less memory than data size in Redis. In your example Redis would run out of memory during start-up. You can find more answers here: http://redis.io/topics/faq
We recently migrated to Couchbase 3.1.0. The odd thing is - when performing a full backup of a bucket, the web UI alerts "Hard Out Of Memory Error. Bucket X on node Y is full. All memory allocated to this bucket is used for metadata". The numbers from RAM usage in the web UI contradict that - about 75% is used, but not 100%. I looked into the logs, but haven't found any similar errors there.
Is that even normal?
This is a known issue in the Couchbase Server 3.x releases.
To understand the problem, we must also first understand Database Change Protocol (DCP), the protocol used to transfer data throughout the system. At a high level the flow-control for DCP is as follows:
1. The Consumer creates a connection with the Producer and sends an Open Connection message. The Consumer then sends a Control message to indicate per stream flow control. This message will contain “stream_buffer_size” in the key section and the buffer size the Consumer would like each stream to have in the value section.
2. The Consumer will then start opening streams so that it can receive data from the server.
3. The Producer will then continue to send data for the stream that has buffer space available until it reaches the maximum send size.
4. Steps 1-3 continue until the connection is closed, as the Consumer continues to consume items from the stream.
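The effect of the per-stream buffer limit can be illustrated with a toy model in Python. This is only a sketch under simplifying assumptions, not the real DCP wire protocol: the function name, the tick-based loop and all the numbers are made up, and passing None as the buffer size models a consumer that advertises no limit at all (a worst case in which the producer sends everything immediately).

```python
from collections import deque


def simulate(stream_buffer_size, item_size, total_items, drain_per_tick):
    """Toy model of one stream; returns the peak number of buffered bytes.

    stream_buffer_size=None models a consumer with no flow control.
    """
    if drain_per_tick < 1:
        raise ValueError("consumer must drain at least one item per tick")
    if stream_buffer_size is not None and item_size > stream_buffer_size:
        raise ValueError("a single item would never fit in the buffer")

    buffered = deque()   # items sent by the producer, not yet consumed
    buffered_bytes = 0
    peak = 0
    sent = 0
    while sent < total_items or buffered:
        # Producer: keep sending while the stream has buffer space available.
        while sent < total_items and (
            stream_buffer_size is None
            or buffered_bytes + item_size <= stream_buffer_size
        ):
            buffered.append(item_size)
            buffered_bytes += item_size
            sent += 1
        peak = max(peak, buffered_bytes)
        # Consumer: drain a few items, which frees buffer space again.
        for _ in range(min(drain_per_tick, len(buffered))):
            buffered_bytes -= buffered.popleft()
    return peak


with_limit = simulate(10 * 1024, 1024, 1000, 5)
without_limit = simulate(None, 1024, 1000, 5)
```

In this model the limited stream never holds more than the advertised 10 KiB, while the unlimited one peaks at the full size of everything the producer had to send.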
The cbbackup utility, however, does not implement any flow control (data buffer limits), and it will try to stream all vbuckets from all nodes at once, with no cap on the buffer size.
While this does not mean that it will use the same amount of memory as your overall data size (as the streams are being drained slowly by the cbbackup process), it does mean that a large memory overhead is required to be able to store the data streams.
When you are in a heavy DGM (disk greater than memory) scenario, the amount of memory required to store the streams is likely to grow more rapidly than cbbackup can drain them as it is streaming large quantities of data off of disk, leading to very large streams, which take up a lot of memory as previously mentioned.
The slightly misleading message about metadata taking up all of the memory is displayed as there is no memory left for the data, so all of the remaining memory is allocated to the metadata, which when using value eviction cannot be ejected from memory.
The reason that this only affects Couchbase Server versions prior to 4.0 is that in 4.0 a server-side improvement to DCP stream management was made that allows the pausing of DCP streams to keep the memory footprint down. This is tracked as MB-12179.
As a result, you should not experience the same issue on Couchbase Server versions 4.x+, regardless of how DGM your bucket is.
Workaround
If you find yourself in a situation where this issue is occurring, then terminating the backup job should release all of the memory consumed by the streams immediately.
Unfortunately if you have already had most of your data evicted from memory as a result of the backup, then you will have to retrieve a large quantity of data off of disk instead of RAM for a small period of time, which is likely to increase your get latencies.
Over time 'hot' data will be brought into memory when requested, so this will only be a problem for a small period of time. However, this is still a fairly undesirable situation to be in.
The workaround to avoid this issue completely is to only stream a small number of vbuckets at once when performing the backup, as opposed to all vbuckets which cbbackup does by default.
This can be achieved using cbbackupwrapper, which comes bundled with all Couchbase Server releases 3.1.0 and later. Details of using cbbackupwrapper can be found in the Couchbase Server documentation.
In particular the parameter to pay attention to is the -n flag, which specifies the number of vbuckets to be backed up in a batch at once.
As the name suggests, cbbackupwrapper is simply a wrapper script on top of cbbackup which partitions the vbuckets up and automatically handles all of the directory creation and backup generation, while still using cbbackup under the hood.
As an example, with a batch size of 50, cbbackupwrapper would back up vbuckets 0-49 first, followed by 50-99, then 100-149 etc.
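The batching described above can be sketched in a few lines of Python. This is not cbbackupwrapper's actual code, just an illustration of the partitioning; the function name is made up, and the total of 1024 vbuckets is an assumption (the usual default on Linux and Windows installs).

```python
def vbucket_batches(total_vbuckets, batch_size):
    """Split vbucket ids 0..total_vbuckets-1 into inclusive (first, last) ranges."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    return [
        (start, min(start + batch_size, total_vbuckets) - 1)
        for start in range(0, total_vbuckets, batch_size)
    ]


# Assumed total of 1024 vbuckets, batch size 50 as in the example above.
batches = vbucket_batches(1024, 50)
print(batches[:3], "...", batches[-1])
```

With these numbers the backup is split into 21 batches, the last of which is shorter than the others (vbuckets 1000-1023).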
It is suggested that you test cbbackupwrapper in a testing environment which mirrors your production environment to find suitable values for -n and -P. The -P flag controls how many backup processes run at once, and the combination of the two controls the amount of memory pressure caused by the backup as well as the overall speed.
You should not find that lowering the value of -n from its default of 100 decreases the backup speed. In some cases you may find that the backup speed actually increases, because there is far less memory pressure on the server.
You may however wish to sensibly adjust the -P parameter if you wish to speed up the backup further.
Below is an example command:
cbbackupwrapper http://[host]:8091 [backup_dir] -u [user_name] -p [password] -n 50
It should be noted that if you use cbbackupwrapper to perform your backup then you must also use cbrestorewrapper to restore the data, as cbrestorewrapper is automatically aware of the directory structures used by cbbackupwrapper.
When you run a full backup, by default the backup tool streams data from all nodes over the network. This is not the best way, because it causes a lot of extra load and increased memory usage, especially if you run cbbackup on one of the Couchbase nodes. I would use the data-copy mode of cbbackup, which copies data directly from the files on disk:
> sudo /opt/couchbase/bin/cbbackup couchstore-files:///opt/couchbase/var/lib/couchbase/data/ /tmp/backup
Of course, change the data path to wherever your Couchbase data is actually stored. (In my example it runs as sudo because only root has read access to /opt/couchbase/blabla..) Do this on every node, then collect all the backup folders and put them somewhere. Note that the backups are very compressible, so you might want to zip them before copying over the network.
I am facing some scaling issues with my redis instances and was wondering if there's a way to configure redis to save data only to disk (and not hold it in memory). That way I could just increase disk space and not RAM.
Right now my instances are getting stuck and just hang when they reach the memory limit.
Thanks!
No - Redis, atm, is an in-memory database. That means that all data that it manages resides first and foremost in RAM.