Is it possible to configure the working directory with multiple data folders - Redis

I've currently installed Redis on a VM which has two mounted disks. I'd like to use those two mounted disks as working directories for Redis.
So is it possible to configure the Redis working directory dir with multiple folder locations?
Thanks!
AVR

NO, you CANNOT do that.
Redis can only hold data that fits into memory. Normally that size is much smaller than the size of a disk, and there's no need to use multiple disks to extend the storage.
In some cases, multiple disks might help, e.g. when Redis is dumping the data set to disk while syncing with slaves, or when it writes both AOF and RDB files. In these cases there are multiple readers and writers working at the same time, which might cause performance issues (i.e. too many disk seeks).
However, since Redis is focused on being an in-memory store, I'm not sure this is a big enough problem to worry about.
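That said, if you do want to spread the persistence files across both disks, you can do it at the filesystem level, since dir only accepts a single path. A minimal sketch, assuming the disks are mounted at /mnt/disk1 and /mnt/disk2 (hypothetical paths) and Redis >= 7.0, where the append-only files live in their own directory (appenddirname, default appendonlydir):

    # redis.conf keeps a single working directory:
    #   dir /mnt/disk1/redis
    # Move the append-only file directory to the second disk and symlink it back:
    sudo systemctl stop redis
    sudo mv /mnt/disk1/redis/appendonlydir /mnt/disk2/redis-aof
    sudo ln -s /mnt/disk2/redis-aof /mnt/disk1/redis/appendonlydir
    sudo systemctl start redis

This way RDB snapshots stay on the first disk while the AOF is written to the second, without Redis needing to know about either.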

Related

How to save memory from unpopular/cold Redis instances?

We have a lot of Redis instances, consuming TBs of memory across hundreds of machines.
As our business activity goes up and down, some Redis instances are just not used that frequently any more -- they are "unpopular" or "cold". But Redis stores everything in memory, so a lot of infrequently accessed data that should be stored on cheap disk is occupying expensive memory.
We are exploring a way to save the memory from these unpopular/cold Redis instances, so as to reduce our machine usage.
We cannot delete data, nor can we migrate to another database. Is there some way to achieve our goal?
PS: We are thinking of some Redis-compatible product that can "mix" memory and disk, i.e. it stores hot data in memory but cold data on disk, USING LIMITED RESOURCES. We know RedisLabs' "Redis on Flash (ROF)" solution, but it uses RocksDB, which is very memory unfriendly. What we want is a very memory-restrained product. Besides, ROF is not open source :(
Thanks in advance!
ElastiCache Redis now supports data tiering. Data tiering provides a new cost optimal option for storing data in Redis by utilizing lower-cost local NVMe SSDs in each cluster node in addition to storing data in memory. It is ideal for workloads that access up to 20 percent of their overall dataset regularly, and for applications that can tolerate additional latency when accessing data on SSD. More details about data tiering can be found here.
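As a hedged sketch of what enabling it looks like with the AWS CLI (the identifiers are hypothetical; data tiering requires an r6gd node type):

    aws elasticache create-replication-group \
        --replication-group-id cold-cache \
        --replication-group-description "Redis with data tiering" \
        --engine redis \
        --cache-node-type cache.r6gd.xlarge \
        --num-cache-clusters 2 \
        --data-tiering-enabled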
Your problem might be solved with an orchestrator approach: scale down when not in use, scale up when in demand.
Implementation depends greatly on your infrastructure, but a base requirement is proper monitoring of Redis instance usage.
Based on that, if you are running on Kubernetes, you can leverage pod autoscaling.
Otherwise you can deploy Consul and use HAProxy to handle the shutdown/spin-up logic. A starting point for that strategy is this article.
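For the Kubernetes case, a minimal sketch of the scale-down half, assuming a StatefulSet named redis (hypothetical) and using connected clients as the idleness signal:

    # Scale the instance to zero when nobody is connected; run this
    # periodically (e.g. from a CronJob) once your monitoring agrees.
    CLIENTS=$(redis-cli -h redis-0.redis INFO clients \
        | awk -F: '/^connected_clients/ {print $2}' | tr -d '\r')
    if [ "$CLIENTS" -eq 0 ]; then
        kubectl scale statefulset redis --replicas=0
    fi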
Of course Reiner's idea of using swap is a quick win if it works as intended!

Redis snapshot overloading memory

I'm using Redis as a client-side caching mechanism,
implemented in C# with StackExchange.Redis.
I configured the snapshotting to "save 5 1" and rdbcompression is on.
The RDB mechanism loads the RDB file into memory every time it needs to append data.
The problem is when you have a fairly large RDB file and it's loaded into memory all at once. It chokes up the memory, disk and CPU for the average endpoint.
Is there a way to update the RDB file without loading the whole file into memory?
Also, any other solution that lowers the load on memory and CPU is welcome.
The RDB mechanism loads the RDB file into memory every time it needs to append data.
This isn't what the open source Redis server does (other variants, such as the MSFT fork, may behave differently) - RDBs are created by copying the contents of memory to disk with a forked process. The dump file is never loaded, except when used for recovery. The increase in memory usage during the save depends on the amount of writes performed while the dump is in progress, because of the copy-on-write (COW) mechanism.
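You can observe that copy-on-write overhead yourself after a save; the fields below come from INFO persistence in recent Redis versions:

    redis-cli BGSAVE
    redis-cli INFO persistence | grep -E 'rdb_last_cow_size|aof_last_cow_size'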
Also, any other solution that lowers the load on memory and CPU is welcome.
There are several ways to tackle this, depending on your requirements and budget. These include:
Using both RDB and AOF for data persistence, thus reducing the frequency of dumps.
Delegating persistence to a slave instance.
Sharding your databases and performing cascading dumps.
We tackled the problem by moving away from RDB and now use AOF exclusively.
We have reduced the memory peaks by reducing auto-aof-rewrite-percentage and also limiting auto-aof-rewrite-min-size to the desired size.
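For illustration, an AOF-only setup along those lines might look like this in redis.conf (the values here are placeholders and need tuning per workload):

    appendonly yes                    # persist via the append-only file
    save ""                           # disable RDB snapshots entirely
    auto-aof-rewrite-percentage 50    # rewrite once the AOF grows 50% past the last rewrite
    auto-aof-rewrite-min-size 256mb   # but never rewrite files smaller than this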

How to configure a namespace to keep part of the data as cache in RAM and the rest on hard disk?

I am trying to write some data to a namespace in Aerospike, but I don't have enough RAM for the whole data set.
How can I configure Aerospike so that a portion of the data is kept in RAM as cache and the rest is kept on the hard drive?
Can I reduce the number of copies of the data that Aerospike keeps in RAM?
It can be done by modifying the contents of the aerospike.conf file, but how exactly do I achieve that?
You should have looked at the configuration page in the Aerospike documentation before asking such a question:
http://www.aerospike.com/docs/operations/configure/namespace/storage/
How can I configure Aerospike so that a portion of the data is kept in RAM as cache and the rest is kept on the hard drive?
The post-write-queue parameter defines the amount of RAM used to keep recently written records in RAM. As long as these records are still in the post-write-queue, Aerospike will read them directly from RAM rather than from disk. This will allow you to configure an LRU cache for a namespace that uses storage-engine device with data-in-memory false. Note that this is a least recently updated (or created) rather than a least recently used (read or write) cache eviction algorithm.
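A minimal sketch of such a namespace stanza in aerospike.conf (the namespace name, sizes and device path are hypothetical, and exact parameters vary by server version):

    namespace mycache {
        replication-factor 2        # copies kept across the cluster
        memory-size 4G              # RAM for the primary index
        storage-engine device {
            device /dev/sdb         # records live on disk
            data-in-memory false    # do not mirror the full data set in RAM
            post-write-queue 1024   # write blocks cached in RAM after writing
        }
    }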

Running moses server on Amazon

I am trying to run a Moses server on an Amazon EC2 EBS-backed instance. The language models and translation models are about 200GB in total. I am thinking of having an instance with Moses installed load the language models and translation models stored on S3. But I do not know how to configure the moses.ini file so that Moses knows the paths of the ttable-file and lmodel-file. If anyone has done this before, any help would be greatly appreciated!!
Thank you.
I wouldn't recommend Amazon S3 for this. Amazon S3 is made for efficiently distributing files across the web; if your whole purpose is just to read these files inside a VM, then storing them in S3 is not the correct choice. Refer to this answer for more details.
To answer your question: yes, it is possible to mount an S3 bucket as a folder inside your server using S3FS. Here are instructions for Ubuntu and Red Hat.
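If you go that route anyway, the mount itself is short (bucket name and paths are hypothetical; s3fs-fuse reads credentials from ~/.passwd-s3fs):

    mkdir -p /mnt/models
    s3fs my-moses-models /mnt/models -o ro -o use_cache=/tmp/s3fs-cache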
But other, better approaches are:
If you don't have enough space on the hard disk, then install the Moses server on a different partition, format it using BTRFS and enable transparent compression (see the sketch after this list). This will automatically compress/decompress files when you save to/retrieve from the hard disk, so you will end up using much less space. In a lot of benchmarks transparent compression is also shown to be faster, since less data is transferred between the hard disk and RAM, especially with large files.
You can always attach a secondary EBS disk to your running VM (like a secondary hard disk). Use that for storing the translations/models (and you can enable transparent compression on it as above, too).
Run a separate VM without EBS, using just the normal instance storage, and use that to store the translations. Then, on your Moses server, you can mount the translations from this separate non-EBS VM using SSHFS.
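Illustrative commands for the compression and SSHFS options (device names, hostnames and paths are hypothetical):

    # Options 1/2: format the extra volume with BTRFS and mount it compressed
    mkfs.btrfs /dev/xvdf
    mount -o compress=zlib /dev/xvdf /opt/moses-models
    # Option 3: mount the model directory from the storage VM over SSHFS
    sshfs storage-vm:/data/models /opt/moses-models -o ro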
Overall, don't use S3, there are other much better ways.
Edit: Added link

Redis - Can data size be greater than memory size?

I'm rather new to Redis, and before using it I'd like to learn some important (to me) details about it. So....
Redis uses RAM and HDD for storing data. RAM is used as fast read/write storage, while HDD is used to make this data persistent. When Redis is started, does it load all data from HDD into RAM, or does it load only the often-queried data? What if I have 500MB of Redis storage on HDD, but only 100MB of RAM for Redis? Where can I read about this?
Redis loads everything into RAM. All the data is written to disk, but will only be read for things like restarting the server or making a backup.
There are a couple of ways you can use it with less RAM than data, though. You can set it up in combination with MySQL or another disk-based store to work much like memcached - you manage cache misses and persistence manually.
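A minimal cache-aside sketch of that pattern with redis-cli and the mysql client (key, table and database names are hypothetical):

    VAL=$(redis-cli GET user:42)
    if [ -z "$VAL" ]; then
        # cache miss: read from MySQL and repopulate with a 1-hour expiry
        VAL=$(mysql -N -e 'SELECT name FROM users WHERE id = 42' mydb)
        redis-cli SET user:42 "$VAL" EX 3600
    fi
    echo "$VAL"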
Redis has a VM mode where all keys must fit in RAM but infrequently accessed data can be on disk. However, I'm not sure if this is in the stable builds yet.
Recent versions (>2.0) have improved significantly and memory management is more efficient. See this blog post that explains how to use hashes to optimize the memory footprint: http://antirez.com/post/redis-weekly-update-7.html
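The trick from that post, roughly: bucket many small keys into hashes so that, while each hash stays small, Redis stores it in its compact encoding. A hypothetical sketch with redis-cli:

    # instead of: redis-cli SET user:4217 alice
    redis-cli HSET user:42 17 alice    # bucket = id / 100, field = id % 100
    redis-cli HGET user:42 17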
The feature is called Virtual Memory and it is officially deprecated:
Redis VM is now deprecated. Redis 2.4 will be the latest Redis version featuring Virtual Memory (but it also warns you that Virtual Memory usage is discouraged). We found that using VM has several disadvantages and problems. In the future of Redis we want to simply provide the best in-memory database (but persistent on disk as usual) ever, without considering at least for now the support for databases bigger than RAM. Our future efforts are focused into providing scripting, cluster, and better persistence.
More information about VM: https://redis.io/topics/virtual-memory