Dump the whole redis database instance using hiredis - redis

I have a buffer that needs to read all the data (hashes, fields and values) from Redis after a reboot. Is there a fast way to do that? I have approximately 100,000 hashes with 4 fields each.
Thanks!
EDIT:
The slow way: the current implementation gets all the hashes using
KEYS *
then
HGETALL xxx
to get all the fields' values.

There are two ways to approach this problem.
The first one is to try to optimize the KEYS/HGETALL combination you have described. Because you do not have millions of keys (100K is not that many by Redis standards), the KEYS command will not block the instance for long, and the output buffer required to return 100K items is probably acceptable. Once your program has received the list of keys, the next challenge is to run many HGETALL commands as fast as possible. The key is to pipeline them (for instance in synchronous batches of 1000 items), which is quite easy to implement with hiredis (just use redisAppendCommand / redisGetReply). The 100K items will then be retrieved in only 100 round trips. Because most Redis instances can sustain 100K op/s or more, it should not take more than a few seconds. A more efficient solution would be to use the asynchronous interface of hiredis to maximize throughput, but it is more complex to implement. I'm not sure it is worth it for 100K items.
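To make the batching idea concrete, here is a minimal sketch in Python rather than C. It assumes a redis-py style client object (one exposing keys(), pipeline() and hgetall()) is passed in, and that every key in the database is a hash. The names dump_all_hashes and iter_batches are illustrative, not part of any library. With hiredis, the same loop would queue each command with redisAppendCommand and read the replies with redisGetReply.

```python
from itertools import islice


def iter_batches(items, size):
    """Yield successive lists of at most `size` items."""
    iterator = iter(items)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


def dump_all_hashes(client, batch_size=1000):
    """Return {key: {field: value}} for every hash, one round trip per batch."""
    # Assumption: `client` behaves like a redis-py client and all keys are hashes.
    keys = client.keys("*")
    result = {}
    for batch in iter_batches(keys, batch_size):
        pipe = client.pipeline(transaction=False)  # plain pipelining, no MULTI/EXEC
        for key in batch:
            pipe.hgetall(key)
        # execute() sends the queued commands together and returns replies in order.
        for key, fields in zip(batch, pipe.execute()):
            result[key] = fields
    return result
```

With 100K keys and batch_size=1000, this issues 100 pipelined round trips after the initial KEYS call.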
The second approach is to use a BGSAVE command to take a snapshot of Redis content, retrieve the generated dump file, and then parse the file to extract the data. You can have a look at the excellent redis-rdb-tools package for a Python implementation. The main benefit of this approach is there is no impact on the Redis instance (no KEYS command to block the event loop) while still retrieving consistent data.

Related

Fetch keys with matched pattern

I have a Spring boot application which is connected to Redis.
I want to perform a Redis operation to fetch the keys that match a certain pattern.
I understand this can be achieved in multiple ways:
Redis Template and the KEYS command: not suitable for large data sets, because it may block the server for a long time and also exhaust server memory due to the size of the response buffer.
Redis Template and the SCAN command: the Redis docs recommend SCAN over KEYS, because it iterates incrementally in small, fast steps and is easier on server resources.
Spring Data Redis Repository: fetch all by creating a hash on the pattern in the Redis entity.
But I am not sure which will give me the fastest overall performance for fetching all the matching keys under high load, and which would be recommended for my scenario.
Best Regards,
Saurav
Redis executes commands on a single thread, so traversing all keys of a large dataset (a large number of keys) may block the server for a long time (even several seconds). That is why running KEYS in production is not advised at all.
The SCAN operation is built to run iteratively, but note that you might get the same key more than once, and keys added or removed while the iteration is in progress may or may not be returned. Overall, your system will stay more responsive with SCAN, because each call does only a small amount of work.
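A minimal sketch of the SCAN loop, in Python for brevity rather than Spring. It assumes a redis-py style client whose scan() returns a (cursor, keys) pair; scan_keys is an illustrative name. Collecting into a set handles the duplicate keys mentioned above.

```python
def scan_keys(client, pattern, count=1000):
    """Collect keys matching `pattern` with SCAN, dropping duplicates."""
    seen = set()
    cursor = 0
    while True:
        # Each call does a bounded amount of work; COUNT is only a hint.
        cursor, keys = client.scan(cursor=cursor, match=pattern, count=count)
        seen.update(keys)
        if cursor == 0:  # a returned cursor of 0 marks the end of the iteration
            break
    return seen
```

A single call may return an empty page while the cursor is still non-zero, so the loop stops on the cursor value and not on an empty result.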

How can you determine the memory occupied by each database separately in Redis?

I have several databases in Redis and I want to determine how much each of them takes up in RAM individually.
In the Redis documentation there is the INFO command https://redis.io/commands/info, but it only reports the occupied memory as a whole, without a breakdown by database.
There are also libraries like https://github.com/snmaynard/redis-audit, which essentially rely on the KEYS * command followed by manual calculation.
Is there a built-in ability in Redis to monitor the current value of the memory occupied by each database separately, or maybe there are libraries for this, that are not based on the KEYS * command?
Redis does not report memory usage per database. Usually we pay more attention to the total memory used, because that is what matters for performance.
If keys and values are of similar size across your databases, you can use the DBSIZE command to count the keys in each database and use those counts as a rough proxy, without the KEYS * command.
Do not forget to run SELECT n before each DBSIZE command.
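A rough sketch of that estimate in Python. It assumes make_client(db) is a hypothetical factory returning a redis-py style client already bound to database db (the equivalent of SELECT n), and that used_memory is the total taken from INFO. The proportional split is only meaningful under the similar-size assumption above.

```python
def key_counts(make_client, db_indexes):
    """Return {db_index: number_of_keys} using DBSIZE on each database."""
    # make_client(db) is a hypothetical factory: it must return a client
    # connected to database `db`, which stands in for SELECT db.
    return {db: make_client(db).dbsize() for db in db_indexes}


def estimate_memory_share(counts, used_memory):
    """Split `used_memory` bytes across databases in proportion to key counts."""
    total = sum(counts.values())
    if total == 0:
        return {db: 0.0 for db in counts}
    return {db: used_memory * n / total for db, n in counts.items()}
```

For example, with 300 keys in database 0 and 100 keys in database 1, database 0 would be credited with three quarters of the total.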

Max value size for Redis

I've been trying to make a replay system. Basically, when a player moves, the system saves their data (movements, location, animation, etc.) into a JSON file. By the end of a recording, the JSON file may be over 50 MB. I'd like to save this data into Redis with an expiry (24-48 hours).
My questions are:
Is it bad to save over 50 MB into Redis with an expiry?
How many items over 50 MB can Redis handle without performance loss?
If players make 500 recordings in 48 hours, would that be bad for Redis?
How many milliseconds does it take to fetch 50 MB of data from Redis on an average VDS/VPS?
Storing a large object (in terms of size) is not good practice. You may read about it here. One of the problems is the network: you need to send a 50 MB payload to the Redis server in a single call. Also, if you save everything as one big object, then to retrieve or update a single field or element you need to get 50 MB back from the server, parse it, update the field, and send it all back to the server. That's a serious problem in terms of network.
Instead of Redis strings, you may prefer sorted sets or lists, depending on your use case. If you are going to store events with timestamps and fetch the range of events between two timestamps, then sorted sets may be an ideal solution for you. They are good for pagination, etc. One crucial drawback is that the complexity of adding a new element is O(log(N)).
Lists may also be a good fit for your case. You may use LPUSH/RPUSH to add new events to your list, and since Redis lists are implemented as linked lists, adding an element at either the beginning or the end of the list costs the same, O(1), which is great.
Whenever an event happens, you call either ZADD or RPUSH/LPUSH to send the event to Redis. If you need to query the events, you may use commands such as ZRANGEBYSCORE or LRANGE, depending on your choice.
While designing your keys, you may use an identifier such as a user ID, as you mentioned in the comments. You will not have the problems with lists/sorted sets that you would have with strings. But which one is most suitable for you depends on your use case for reads/writes or your business rules.
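As an illustration of the sorted-set variant, here is a minimal Python sketch. It assumes a redis-py style client (zadd(), expire(), zrangebyscore()). The key layout replay:<player_id>, the payload shape and the function names are all hypothetical. The timestamp is stored inside the member as well, because sorted-set members are unique and two identical events would otherwise collapse into one.

```python
import json

REPLAY_TTL_SECONDS = 48 * 60 * 60  # the 48-hour upper bound from the question


def record_event(client, player_id, timestamp, event):
    """Add one replay event to the player's sorted set, scored by timestamp."""
    key = "replay:%s" % player_id  # hypothetical key layout
    # Embedding the timestamp keeps otherwise identical events distinct.
    member = json.dumps({"ts": timestamp, "data": event}, sort_keys=True)
    client.zadd(key, {member: timestamp})
    client.expire(key, REPLAY_TTL_SECONDS)  # refreshes the expiry on every write
    return key


def events_between(client, player_id, start, end):
    """Return the events whose timestamp lies in [start, end], oldest first."""
    key = "replay:%s" % player_id
    raw_members = client.zrangebyscore(key, start, end)
    return [json.loads(raw)["data"] for raw in raw_members]
```

Refreshing the expiry on each write is one possible choice; setting it once when the recording starts would work just as well if the 24-48 hours should count from the first event.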
Here are some useful links to read:
Redis data types intro
Redis data types
Redis labs documentation about data types

Is there a way to increment all fields in a Redis hash at once?

I want to use Redis to cache a large amount of highly dynamic counters. In my case users are subscribing to sources and I want to cache each user's counter for that source. When a new item arrives at the source it's natural that the counters for all users subscribed to this source should be incremented by 1. Some sources have hundreds of thousands of subscribers, so it's important that this happens immediately.
However, Redis doesn't have a native method to increment all hash fields at once, only HINCRBY. Is there a solution for this?
While searching, I stumbled upon some threads where people wanted a bulk version of HINCRBY, but my case is different. I want to increment all fields in the hash.
Such an operation can be achieved with a Lua script, see:
https://redis.io/commands/evalsha
https://redis.io/commands/eval
You can use a Lua script to "batch" a couple of operations on a single hash.
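For illustration, a minimal sketch of such a script, wrapped in Python. It assumes a redis-py style client whose eval() takes the script, the number of keys, and then the keys and arguments. INCREMENT_ALL_FIELDS and increment_all_fields are illustrative names. The script lists the hash fields with HKEYS and applies HINCRBY to each one.

```python
INCREMENT_ALL_FIELDS = """
local fields = redis.call('HKEYS', KEYS[1])
for _, field in ipairs(fields) do
    redis.call('HINCRBY', KEYS[1], field, ARGV[1])
end
return #fields
"""


def increment_all_fields(client, hash_key, amount=1):
    """Add `amount` to every field of `hash_key`; return the number of fields."""
    # EVAL arguments: script, number of keys, then the keys and the extra args.
    return client.eval(INCREMENT_ALL_FIELDS, 1, hash_key, amount)
```

The script runs atomically on the server, so other commands wait while it loops over the fields. With hundreds of thousands of subscribers per source, that cost is worth measuring before relying on it.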

Redis: Dump db and delete dumped key / value pairs

I have multiple servers that all store set members in a shared Redis cache. When the cache fills up, I need to persist the data to disk to free up RAM. I then plan to parse the dumped data such that I will be able to combine all of the values that belong to a given key in MongoDB.
My first plan was to have each server process attempt an sadd operation. If the request fails because Redis has reached maxmemory, I planned to query for each of my set keys, and write each to disk.
However, I am wondering if there is a way to use one of the inbuilt persistence methods in Redis to write the Redis data to disk and delete the key/value pairs after writing. If this is possible I could just parse the rdb dump and work with the data in that fashion. I'd be grateful for any help others can offer on this question.
Redis' persistence is meant to be used for whatever's in the RAM. Put differently, you can't persist what ain't in RAM.
To answer your question: no, you can't use persistence to "offload" data from RAM.
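That leaves the manual route described in the question: read each set, write it to disk, then remove it from Redis. A minimal Python sketch, assuming a redis-py style client created with decoded (string) responses and an already open text file. offload_set and the JSON-lines output format are hypothetical choices.

```python
import json


def offload_set(client, key, out_file):
    """Write a set's members to `out_file` as one JSON line, then remove them."""
    # Assumption: the client returns str members (decoded responses).
    members = sorted(client.smembers(key))
    if not members:
        return 0
    out_file.write(json.dumps({"key": key, "members": members}) + "\n")
    out_file.flush()
    # Remove only what was written; members added in the meantime stay in Redis.
    client.srem(key, *members)
    return len(members)
```

Removing members with SREM after the write, instead of deleting the key first, means that a failed write does not lose data. Redis drops the key by itself once the set is empty.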