I used Redis's FLUSHDB to flush all data, but it caused the redis-server to go away. I suspect the problem could be deleting a large number of keys. So is there any way to flush Redis more smoothly, perhaps by allowing more time for flushing all the data?
flushall is "delete all keys" as described here: http://redis.io/commands/flushall
Delete operations are blocking operations.
Large delete operations may block Redis for a minute or more (e.g. if you delete a 16 GB hash with tons of keys).
You should write a script that uses cursors to do this.
//edit:
I found my old answer here and wanted to be more specific by providing resources:
Large number of keys: use SCAN to iterate over them with a cursor and do a graceful cleanup in smaller batches (see the sketch after this list).
Large hash: either use the UNLINK command to delete it asynchronously, or use HSCAN to iterate over it with a cursor and do a graceful cleanup.
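For the first case, here is a minimal sketch in Python with redis-py; the 'cache:*' pattern, connection details, and batch size are illustrative assumptions, not from the original question:

import redis

r = redis.Redis(host="localhost", port=6379)

BATCH_SIZE = 500
batch = []
# scan_iter walks the keyspace with a cursor, so Redis is never blocked
# by one huge delete operation.
for key in r.scan_iter(match="cache:*", count=BATCH_SIZE):
    batch.append(key)
    if len(batch) >= BATCH_SIZE:
        # UNLINK (Redis >= 4.0) frees memory in a background thread;
        # fall back to r.delete(*batch) on older versions.
        r.unlink(*batch)
        batch = []
if batch:
    r.unlink(*batch)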
Related
I am using redis to store some pretty large bitsets. Redis is run in master/slave sentinel mode.
I got curious about the replication performance for very big bitsets (mine are roughly 100 KB each).
From the documentation: Async replication works by sending a stream of commands between master and slave.
Can I expect those commands to update a single bit in a slave or do they copy entire keys each time? Obviously I would prefer SETBIT commands to be passed instead of setting entire keys in order to decrease network traffic.
Async replication will only pass the write command, e.g. SETBIT, to the replica in most cases.
If the replica falls too far behind, however, it will be flushed (cleared out) and a full resync will occur. This happens when there is a lot of latency and a large number of writes. If you see this happening, you can tune your replication buffers to lower the likelihood of a full resync.
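For example, here is a minimal redis-py sketch that raises the replication buffers at runtime; the values are purely illustrative, the right numbers depend on your write volume, and the same settings can live in redis.conf:

import redis

r = redis.Redis(host="localhost", port=6379)

# A bigger backlog lets a lagging replica catch up with a partial resync
# instead of triggering a full one.
r.config_set("repl-backlog-size", "256mb")

# Raise the hard/soft output-buffer limits for replicas so the master does
# not drop the replication link under write bursts (the class is named
# "slave" instead of "replica" on Redis < 5).
r.config_set("client-output-buffer-limit", "replica 512mb 256mb 120")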
I am using Redis to synchronize some data.
Precondition: data is inserted into Redis continuously (about 30,000 entries every 10 minutes).
Here is the workflow that executes every 5 minutes:
Scan keys by a specific pattern (e.g. 'users*')
Get all values by keys
Flush all keys
In step 1, I used scan_iter() to avoid locking.
I wonder whether anything in my workflow could cause Redis to lock up.
If data insertion and key scanning occur simultaneously, can that cause locking?
If you are not using the ASYNC option, then FLUSHDB and FLUSHALL are blocking commands.
https://redis.io/commands/flushall
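Here is a minimal sketch of the 5-minute job in Python with redis-py, assuming plain string values under the 'users*' pattern; the pattern, connection details, and batch size are assumptions:

import redis

r = redis.Redis(host="localhost", port=6379)

# Step 1: scan keys with a cursor instead of KEYS, so Redis is not blocked.
keys = list(r.scan_iter(match="users*", count=1000))

# Step 2: fetch the values in one pipelined round trip instead of many GETs.
pipe = r.pipeline(transaction=False)
for key in keys:
    pipe.get(key)
values = pipe.execute()

# Step 3: delete only the keys that were just read, non-blockingly, instead of
# flushing the whole database (which would also drop keys inserted meanwhile).
if keys:
    r.unlink(*keys)

# Alternatively, on Redis >= 4.0, an asynchronous flush of the whole DB:
# r.flushdb(asynchronous=True)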
Recently I ran into a problem when using SETBIT in Redis. I use Redis as the store for a Bloom filter; 0.2 billion entries cost 380 MB of memory at 99.99% accuracy. Every day I need to delete the Bloom-filter key and create a new one, but I found slow-log entries, and this may affect other services in the production environment. Could anybody suggest a better way to avoid this? Thanks a lot.
The corresponding slow-log entry:
DEL bloomFilterKey
duration (microseconds): 83886
Freeing that large amount of memory, i.e. 380 MB, takes too much time and blocks Redis.
To avoid this, you can upgrade your Redis to version 4.0 and use the new UNLINK command to delete the key. This command frees the memory in a different thread, so it won't block Redis.
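For example, a minimal redis-py sketch of a hypothetical daily rotation, so yesterday's filter is reclaimed in the background instead of blocking Redis; the dated key naming is an assumption, not from the question:

import datetime
import redis

r = redis.Redis(host="localhost", port=6379)

today = datetime.date.today()
yesterday = today - datetime.timedelta(days=1)

# Write today's filter under a dated key name (example bit only).
r.setbit(f"bloomFilterKey:{today:%Y%m%d}", 12345, 1)

# Reclaim yesterday's key in a background thread (Redis >= 4.0).
r.unlink(f"bloomFilterKey:{yesterday:%Y%m%d}")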
In Redis 4.0, there is a new command, UNLINK, to delete keys in Redis memory.
This command is very similar to DEL: it removes the specified keys. Just like DEL, a key is ignored if it does not exist. However, the command performs the actual memory reclaiming in a different thread, so it is not blocking, while DEL is. This is where the command name comes from: the command just unlinks the keys from the keyspace. The actual removal will happen later, asynchronously.
So can one always (100% of the time) use UNLINK instead of DEL, since UNLINK is non-blocking, unlike DEL?
Before discussing which one is better, let's take a look at the difference between these commands. Both DEL and UNLINK free the key part in blocking mode; the difference is the way they free the value part.
DEL always frees the value part in blocking mode. However, if the value is too large, e.g. there are too many allocations for a large LIST or HASH, it blocks Redis for a long time. To solve this problem, Redis implemented the UNLINK command, i.e. a 'non-blocking' delete.
In fact, UNLINK is NOT always non-blocking/async. If the value is small, e.g. the size of the LIST or HASH is less than 64, the value will be freed immediately. In this case UNLINK is almost the same as DEL, except that it costs a few more function calls. However, if the value is large, Redis puts it into a list, and the value will be freed by another thread, i.e. a non-blocking free. In this case, the main thread has to do some synchronization with the background thread, and that is also a cost.
In conclusion, if the value is small, DEL is at least as good as UNLINK. If the value is very large, e.g. a LIST with thousands or millions of items, UNLINK is much better than DEL. You can always safely replace DEL with UNLINK. However, if you find that the thread synchronization becomes a problem (multi-threading is always a headache), you can roll back to DEL.
UPDATE:
Since Redis 6.0, there is a new configuration option, lazyfree-lazy-user-del. You can set it to yes, and Redis will run the DEL command as if it were an UNLINK command.
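For example, a minimal redis-py sketch that turns the option on at runtime; it can equally be set in redis.conf:

import redis

r = redis.Redis(host="localhost", port=6379)

# Redis >= 6.0: make DEL behave like UNLINK.
r.config_set("lazyfree-lazy-user-del", "yes")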
YES. Please have a read of Lazy Redis is better Redis by antirez. But the reason is not that UNLINK is a non-blocking command; the reason is that UNLINK is smarter than DEL.
UNLINK is a smart command: it calculates the deallocation cost of an object, and if it is very small it will just do what DEL is supposed to do and free the object ASAP. Otherwise the object is sent to the background queue for processing.
Also, I think the faster way is to make the decision for Redis ourselves: use DEL for small keys and UNLINK for huge keys such as a big list or set. That way we can spare Redis the needless calculation.
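For example, a minimal redis-py sketch of that decision, using MEMORY USAGE to estimate the value size; the threshold is an illustrative assumption:

import redis

r = redis.Redis(host="localhost", port=6379)

SMALL_KEY_BYTES = 64 * 1024  # illustrative cutoff

def smart_delete(key):
    size = r.memory_usage(key) or 0  # MEMORY USAGE, available since Redis 4.0
    if size <= SMALL_KEY_BYTES:
        r.delete(key)   # DEL: a small value is freed immediately anyway
    else:
        r.unlink(key)   # UNLINK: defer freeing a big value to a background thread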
I have a buffer that needs to read all values (hashes, fields and values) from Redis after a reboot. Is there a fast way to do that? I have approximately 100,000 hashes with 4 fields each.
Thanks!
EDIT:
The slow way: the current implementation gets all the hashes using
KEYS *
and then
HGETALL xxx
to get all the fields' values.
There are two ways to approach this problem.
The first one is to try to optimize the KEYS/HGETALL combination you have described. Because you do not have millions of keys (100K is not high by Redis standards), the KEYS command will not block the instance for a long time, and the output buffer size required to return 100K items is probably acceptable. Once the list of keys has been received by your program, the next challenge is to run the many HGETALL commands as fast as possible. The key is to pipeline them (for instance in synchronous batches of 1000 items), which is quite easy to implement with hiredis (just use redisAppendCommand / redisGetReply); a similar approach is sketched below. The 100K items will be retrieved in only 100 round trips. Because most Redis instances can sustain 100K op/s or more, it should not take more than a few seconds. A more efficient solution would be to use the asynchronous interface of hiredis to maximize throughput, but it is more complex to implement and I'm not sure it is worth it for 100K items.
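As an illustration of the batching idea, here is a minimal sketch in Python with redis-py rather than hiredis; the key pattern and batch size are assumptions:

import redis

r = redis.Redis(host="localhost", port=6379)

# Collect the key names first (SCAN here rather than KEYS, to stay friendly
# to the event loop; KEYS would also work for ~100K keys).
keys = list(r.scan_iter(match="*", count=1000))

# Pipeline the HGETALLs in synchronous batches so 100K hashes cost only a few
# hundred round trips instead of 100K.
BATCH = 1000
result = {}
for i in range(0, len(keys), BATCH):
    chunk = keys[i:i + BATCH]
    pipe = r.pipeline(transaction=False)
    for key in chunk:
        pipe.hgetall(key)
    for key, fields in zip(chunk, pipe.execute()):
        result[key] = fields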
The second approach is to use a BGSAVE command to take a snapshot of the Redis content, retrieve the generated dump file, and then parse the file to extract the data. You can have a look at the excellent redis-rdb-tools package for a Python implementation. The main benefit of this approach is that there is no impact on the Redis instance (no KEYS command to block the event loop) while still retrieving consistent data.