Is there a way to delete all objects in a Redis list faster than O(n)?
Like the way TRUNCATE works in a database: just point the first object to null, or something similar.
NO. There's no way to make the delete operation faster than O(n), since Redis has to free resources for each item one by one.
However, with the UNLINK command, you can ask Redis to delete the list asynchronously: the command returns immediately and the list is freed in a background thread, so the delete operation won't block Redis. Check this question for more info.
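To make the difference concrete, here is a toy sketch of the two behaviors. `MiniRedis` is a hypothetical in-memory stand-in, not the real server: the point is only that UNLINK removes the key from the keyspace immediately and defers the O(n) freeing, while DEL frees synchronously.

```python
from collections import deque

class MiniRedis:
    """Toy stand-in for a Redis keyspace, just to illustrate the difference."""
    def __init__(self):
        self.keyspace = {}          # key -> value (e.g. a list)
        self.free_queue = deque()   # values awaiting background free

    def delete(self, key):
        # DEL: remove the entry and free the value synchronously (O(n))
        self.keyspace.pop(key, None)

    def unlink(self, key):
        # UNLINK: remove the entry from the keyspace right away (cheap),
        # then hand the value to a background thread to free later
        value = self.keyspace.pop(key, None)
        if value is not None:
            self.free_queue.append(value)

r = MiniRedis()
r.keyspace["biglist"] = list(range(1_000_000))
r.unlink("biglist")
# The key is gone immediately; the actual freeing happens later.
print("biglist" in r.keyspace)  # False
print(len(r.free_queue))        # 1
```

With a real client (e.g. redis-py) the call would simply be `r.unlink("biglist")` against a live connection; the asynchronous freeing happens inside the server.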
Related
Recently I ran into a problem using SETBIT in Redis. I use Redis as a Bloom filter store: 0.2 billion entries cost 380 MB of memory at 99.99% accuracy. Every day I need to delete the Bloom filter key and create a new one, but the deletion shows up in the slow log, and this may affect other services in the production environment. Could anybody suggest a better way to avoid this? Thanks a lot!
Here is the cost of the command from the slow log:
DEL bloomFilterKey
duration: 83886 microseconds (~84 ms)
Freeing a large amount of memory, i.e. 380 MB, takes a long time and blocks Redis.
To avoid this, upgrade your Redis to version 4.0 and use the new UNLINK command to delete the key. This command frees the memory in a different thread and won't block Redis.
When I run a RENAME command, I think it does something like this:
Use the new name for the new data
Remove the reference to the old name
Remove the old data (this can take some time if it's large)
For clients accessing this data, is there ever a time when any of these happen?
The key does not exist
The data is not in a good state
Redis hangs during access
What steps are performed during a Redis rename command?
Since Redis executes commands on a single thread, the rename is atomic, so the answer to 1 and 2 is no. The part about "removing old data" only applies if the destination key already points to a large structure that Redis needs to delete (RENAME clobbers it). The original data object is not copied; only the hash table entries pointing to it might be moved around. Since rehashing in Redis is incremental, this should be essentially constant time.
Redis will always "hang" on slow commands due to its single-threaded command execution. So for 3, the answer can always be yes depending on what you're doing, but in this case only if the rename triggers a significantly large implicit delete.
Edit: as of Redis 4.0 you can specify the config option lazyfree-lazy-server-del yes (the default is no), and the server will delete asynchronously for side-effect deletes such as this. In other words, instead of the delete blocking, the object is queued for background deletion. This effectively makes RENAME constant time. See the sample config: https://raw.githubusercontent.com/antirez/redis/4.0/redis.conf
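In redis.conf terms (4.0+), the setting from the sample file linked above looks like this:

```
# Free the old value asynchronously when a key is deleted as a side
# effect of another command, e.g. RENAME clobbering the destination key.
lazyfree-lazy-server-del yes
```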
In Redis 4.0, there is a new command, UNLINK, to delete keys in Redis memory.
This command is very similar to DEL: it removes the specified keys. Just like DEL, a key is ignored if it does not exist. However, the command performs the actual memory reclaiming in a different thread, so it is not blocking, while DEL is. This is where the command name comes from: the command just unlinks the keys from the keyspace. The actual removal will happen later, asynchronously.
So can one always (100% of the time) use UNLINK instead of DEL, since UNLINK is non-blocking, unlike DEL?
Before discussing which one is better, let's look at the difference between these commands. Both DEL and UNLINK free the key part in blocking mode; the difference is how they free the value part.
DEL always frees the value part in blocking mode. If the value is large, e.g. many allocations for a big LIST or HASH, it blocks Redis for a long time. To solve this problem, Redis implemented the UNLINK command, i.e. a "non-blocking" delete.
In fact, UNLINK is NOT always non-blocking/async. If the value is small, e.g. a LIST or HASH with fewer than 64 items, the value is freed immediately. In that case UNLINK is almost the same as DEL, except that it costs a few more function calls. However, if the value is large, Redis puts it on a list and the value is freed by another thread, i.e. the non-blocking free. The main thread then has to do some synchronization with the background thread, and that is also a cost.
In conclusion, if the value is small, DEL is at least as good as UNLINK. If the value is very large, e.g. a LIST with thousands or millions of items, UNLINK is much better than DEL. You can always safely replace DEL with UNLINK. However, if you find that the thread synchronization becomes a problem (multi-threading is always a headache), you can roll back to DEL.
UPDATE:
Since Redis 6.0, there's a new configuration option: lazyfree-lazy-user-del. You can set it to yes, and Redis will run the DEL command as if it were an UNLINK command.
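In redis.conf terms (6.0+), that setting looks like this (the default is no):

```
# Make DEL behave like UNLINK, freeing large values asynchronously.
lazyfree-lazy-user-del yes
```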
YES. Please have a read of Lazy Redis is better Redis by antirez. But the reason is not just that UNLINK is a non-blocking command; the reason is that UNLINK is smarter than DEL.
UNLINK is a smart command: it calculates the deallocation cost of an object, and if it is very small it will just do what DEL is supposed to do and free the object ASAP. Otherwise the object is sent to the background queue for processing.
Also, I think an even faster way is to make the decision for Redis ourselves: use DEL for small keys and UNLINK for huge keys such as a big list or set. That way we avoid the needless calculation inside Redis.
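That idea can be sketched as a small helper. `LAZYFREE_THRESHOLD` mirrors the 64-item cutoff mentioned above, and `RecordingClient` is a hypothetical stand-in that just records which command would be sent, so the sketch runs without a server:

```python
LAZYFREE_THRESHOLD = 64  # mirrors Redis' internal cutoff for async free

class RecordingClient:
    """Hypothetical stand-in: records the command we would send to Redis."""
    def __init__(self):
        self.sent = []
    def execute(self, *args):
        self.sent.append(args)

def smart_delete(client, key, num_elements):
    """Pick DEL for small values, UNLINK for big ones."""
    if num_elements < LAZYFREE_THRESHOLD:
        client.execute("DEL", key)     # small: synchronous free is cheap
    else:
        client.execute("UNLINK", key)  # big: let a background thread free it

c = RecordingClient()
smart_delete(c, "small:list", 10)
smart_delete(c, "huge:list", 1_000_000)
print(c.sent)  # [('DEL', 'small:list'), ('UNLINK', 'huge:list')]
```

With a real client you would first need to know the size (e.g. via LLEN or SCARD), which itself costs a roundtrip, so this is only worthwhile when you already track sizes on the application side.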
I used FLUSHDB to flush all data in Redis, but it caused the redis-server to go away. I suspect the problem is cleaning up a large number of keys. Is there a way to flush Redis more smoothly, perhaps taking more time to flush all the data?
FLUSHALL means "delete all keys", as described here: http://redis.io/commands/flushall
Delete operations are blocking operations.
Large delete operations may block Redis for a minute or more (e.g. deleting a 16 GB hash with tons of fields).
You should write a script that uses cursors to do this.
//edit:
I found my old answer here and wanted to be more specific by providing resources:
Large number of keys: use SCAN to iterate over them with a cursor and clean up gracefully in smaller batches.
Large hash: either use the UNLINK command to delete it asynchronously, or use HSCAN to iterate over it with a cursor and clean up gracefully.
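A sketch of the cursor-based cleanup, using redis-py-style `scan`/`unlink` method names against an in-memory stand-in so it runs without a server (with a real client you would call the same methods on a live connection):

```python
class FakeRedis:
    """In-memory stand-in exposing redis-py-style scan/unlink."""
    def __init__(self, keys):
        self.keys = set(keys)

    def scan(self, cursor=0, match=None, count=10):
        # Real SCAN returns (new_cursor, batch); cursor 0 means iteration done.
        if cursor == 0:
            self._snapshot = sorted(self.keys)  # freeze the iteration order
        batch = [k for k in self._snapshot[cursor:cursor + count]
                 if match is None or k.startswith(match.rstrip("*"))]
        new_cursor = cursor + count
        return (0 if new_cursor >= len(self._snapshot) else new_cursor), batch

    def unlink(self, *keys):
        self.keys -= set(keys)

def cleanup(client, pattern, batch_size=1000):
    """Delete matching keys in small batches instead of one huge FLUSHDB."""
    cursor = 0
    while True:
        cursor, batch = client.scan(cursor, match=pattern, count=batch_size)
        if batch:
            client.unlink(*batch)  # or client.delete(*batch) before Redis 4.0
        if cursor == 0:
            break

r = FakeRedis([f"session:{i}" for i in range(5000)] + ["keepme"])
cleanup(r, "session:*", batch_size=500)
print(sorted(r.keys))  # ['keepme']
```

Each batch is a short blocking operation, so other clients get served between batches; adding a small sleep between iterations would smooth the load out further.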
I have a buffer that needs to read all values (hashes, fields, and values) from Redis after a reboot. Is there a way to do that quickly? I have approximately 100,000 hashes with 4 fields each.
Thanks!
EDIT:
The slow way: the current implementation gets all the hashes using
KEYS *
and then
HGETALL xxx
for each key to get all the field values.
There are two ways to approach this problem.
The first one is to try to optimize the KEYS/HGETALL combination you have described. Because you do not have millions of keys (100K is not high by Redis standards), the KEYS command will not block the instance for a long time, and the output buffer size required to return 100K items is probably acceptable. Once the list of keys has been received by your program, the next challenge is to run the many HGETALL commands as fast as possible. The key is to pipeline them (for instance in synchronous batches of 1000 items), which is quite easy to implement with hiredis (just use redisAppendCommand / redisGetReply). The 100K items will be retrieved in only 100 roundtrips. Because most Redis instances can sustain 100K op/s or more, it should not last more than a few seconds. A more efficient solution would be to use the asynchronous interface of hiredis to maximize throughput, but it is more complex to implement. I'm not sure it is worth it for 100K items.
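The batching structure can be sketched like this (in Python rather than hiredis C, and with hypothetical in-memory stand-ins for the client and pipeline so the sketch runs without a server; the batch-of-1000 shape is the same with redisAppendCommand / redisGetReply):

```python
class FakePipeline:
    """Stand-in pipeline: buffers commands, answers them all in one flush."""
    def __init__(self, store):
        self.store = store
        self.queued = []
    def hgetall(self, key):
        self.queued.append(key)  # buffered locally, nothing sent yet
    def execute(self):
        replies = [self.store[k] for k in self.queued]
        self.queued = []
        return replies

class FakeRedis:
    """Stand-in client; counts one roundtrip per pipeline flush."""
    def __init__(self, store):
        self.store = store
        self.roundtrips = 0
    def pipeline(self):
        self.roundtrips += 1  # each pipeline below is flushed exactly once
        return FakePipeline(self.store)

def fetch_all_hashes(client, keys, batch_size=1000):
    """HGETALL many keys in pipelined batches of batch_size."""
    result = {}
    for i in range(0, len(keys), batch_size):
        pipe = client.pipeline()
        batch = keys[i:i + batch_size]
        for key in batch:
            pipe.hgetall(key)
        for key, fields in zip(batch, pipe.execute()):
            result[key] = fields
    return result

store = {f"h:{i}": {"a": i, "b": i, "c": i, "d": i} for i in range(2500)}
r = FakeRedis(store)
data = fetch_all_hashes(r, list(store), batch_size=1000)
print(len(data), r.roundtrips)  # 2500 3
```

With 100K keys and batches of 1000, the same loop performs 100 roundtrips, matching the estimate above.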
The second approach is to use a BGSAVE command to take a snapshot of the Redis content, retrieve the generated dump file, and then parse the file to extract the data. You can have a look at the excellent redis-rdb-tools package for a Python implementation. The main benefit of this approach is that there is no impact on the Redis instance (no KEYS command to block the event loop) while still retrieving consistent data.