Delete a large number of keys in Redis

I have a large number of simple keys with a common prefix to delete from Redis, and I am trying to find the most efficient way to do this atomically, inside a transaction or a Lua script:
Iterate with SCAN and DEL keys?
Iterate with SCAN and EXPIRE each key?
Iterate with SCAN and UNLINK keys?
Which of the above is the recommended way to proceed? Should I take a different approach instead, such as using a hash and storing the keys as fields inside it? Would any of the above be a problem on a Redis Cluster?

I would suggest going with UNLINK with batch processing, like the following; it will clear the memory efficiently. I don't recommend relying on expiry: by default Redis runs its active expiry cycle 10 times per second (the default configuration), sampling keys to delete, and that may not be an efficient way to reclaim this many keys.
redis-cli --scan --pattern 'prefix:*' | xargs -L 1000 redis-cli unlink
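The same batched deletion can be driven from application code. A minimal sketch, assuming redis-py, where `r` is a hypothetical `redis.Redis` connection:

```python
# Sketch: SCAN + batched UNLINK, assuming redis-py (`r` is a
# hypothetical redis.Redis connection). UNLINK reclaims memory in a
# background thread, so large batches don't block the server.
def unlink_by_prefix(r, prefix, batch_size=1000):
    deleted = 0
    batch = []
    for key in r.scan_iter(match=prefix + "*", count=batch_size):
        batch.append(key)
        if len(batch) >= batch_size:
            deleted += r.unlink(*batch)
            batch = []
    if batch:
        deleted += r.unlink(*batch)
    return deleted
```

Note this is not atomic: keys created while the scan runs may or may not be caught, just as with the redis-cli pipeline above.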

If your keys are simple strings, you can organise each prefix into a hash, with the keys stored as field/value pairs.
Then just drop the entire hash with a single DEL.
PS: Yes, you have to rewrite all your read queries, but the performance impact should be negligible.
PS2: This doesn't address the Redis Cluster problems.
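A sketch of that hash layout, assuming redis-py and a hypothetical connection `r`: a key like `prefix:1234` becomes field `1234` of the hash `prefix`, so the whole group disappears in one call.

```python
# Sketch of the hash layout, assuming redis-py (`r` is a hypothetical
# connection): "prefix:1234 -> value" becomes field "1234" of the
# hash named "prefix".
def write_entry(r, prefix, entry_id, value):
    r.hset(prefix, entry_id, value)

def read_entry(r, prefix, entry_id):
    return r.hget(prefix, entry_id)

def drop_prefix(r, prefix):
    return r.delete(prefix)    # removes every entry in one call
```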

A Redis Lua script must be explicitly given all the key names it touches via the KEYS array. All three approaches violate that principle, since they fetch the key names with SCAN inside the script.
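For contrast, a sketch of a deletion script that does follow the rule: the client resolves the key names first (e.g. via SCAN) and passes them in through KEYS, so the script only performs the atomic deletion.

```python
# Sketch: a Lua script that only touches keys named in KEYS, invoked
# from redis-py (`r` below is a hypothetical connection). The client
# does the SCAN; the script does the atomic deletion.
DELETE_KEYS_LUA = """
local removed = 0
for i = 1, #KEYS do
    removed = removed + redis.call('UNLINK', KEYS[i])
end
return removed
"""
# Usage (hypothetical):
# keys = list(r.scan_iter(match="prefix:*", count=1000))
# removed = r.eval(DELETE_KEYS_LUA, len(keys), *keys)
```

On a cluster this still requires that all keys passed in one EVAL hash to the same slot, so batches would have to be grouped per slot.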

Related

Is it safe to delete keys from Redis while iterating through them using scan?

I want to iterate through a number of keys stored in Redis with the same prefix using SCAN, and do some processing in my app code with the keys' values. Is it safe to delete the keys returned by SCAN after processing them? I don't see this mentioned in the SCAN documentation: https://redis.io/commands/scan
Yes, it's safe to delete the returned keys.
Redis SCAN is stateless on the server side. Keyspace changes (e.g. adding new keys or removing old ones) during the scan process do not make the scan fail, although such changes may lead to some keys being missed or to duplicate keys being returned.
Check the SCAN documentation for the details of how Redis scans work.
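In code, that pattern might look like the following sketch, assuming redis-py and a hypothetical connection `r`:

```python
# Sketch: delete keys as they come back from SCAN, assuming redis-py
# (`r` is a hypothetical connection). A key that exists for the whole
# iteration is guaranteed to be returned at least once, so deleting
# processed keys doesn't make the scan skip the survivors.
def process_and_delete(r, pattern, handler):
    processed = 0
    for key in r.scan_iter(match=pattern, count=500):
        handler(key, r.get(key))   # app-specific processing
        r.delete(key)
        processed += 1
    return processed
```

Because SCAN may return a key more than once, a real handler should tolerate duplicates (here the second `r.get` would simply return None after deletion).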

How to get ALL LFU (Least Frequently Used) keys from Redis?

Is it possible to retrieve all LFU keys from a Redis database? If yes, please advise with a sample command line.
Thanks a million,
Mughrabi
Redis doesn't ship with a command that prints LFU keys as such; you'll probably have to write a script. Note that OBJECT IDLETIME returns the number of seconds for which a key has not been requested by a read or write operation, which corresponds to LRU rather than LFU. When maxmemory-policy is set to an LFU policy, you can use OBJECT FREQ to get each key's access-frequency counter, then sort the counters across all keys to find the least frequently used ones.
https://redis.io/commands/OBJECT
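A sketch of such a script, assuming redis-py, a hypothetical connection `r`, and a server whose maxmemory-policy is set to an LFU variant (OBJECT FREQ returns an error otherwise):

```python
# Sketch: rank keys by their LFU access-frequency counter, assuming
# redis-py (`r` is a hypothetical connection) and an LFU
# maxmemory-policy on the server.
def least_frequently_used(r, n=10):
    ranked = []
    for key in r.scan_iter(count=500):
        ranked.append((r.object("freq", key), key))
    ranked.sort()              # lowest counter = least frequently used
    return ranked[:n]
```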

Get all hashes exists in redis

I have hashes in my Redis cache like:
Hash Key    Value
hashme:1    Hello World
hashme:2    Here Iam
myhash:1    Next One
My goal is to get the Hashes as output in the CLI like:
hashme
myhash
If there's no such option, this is ok too:
hashme:1
hashme:2
myhash:1
I didn't find any relevant command for this in the Redis API.
Any suggestions?
You can use the SCAN command to get all keys from Redis. Then, for each key, use the TYPE command to check whether it's a hash.
UPDATE:
Since Redis 6.0, the SCAN command supports a TYPE option, which you can use to scan all keys of a specified type:
SCAN 0 TYPE hash
Also, never use the KEYS command in a production environment! It's a dangerous command that can block Redis for a long time.
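With redis-py the same filter is exposed through the `_type` argument of `scan_iter`; a sketch, assuming a hypothetical connection `r`:

```python
# Sketch: list all hash keys via SCAN's TYPE filter (Redis >= 6.0),
# assuming redis-py and a hypothetical connection `r`.
def all_hash_names(r):
    return sorted(r.scan_iter(_type="hash", count=500))
```

On real redis-py the names come back as bytes unless the connection was created with decode_responses=True.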
keys *
works for me; you can try it (but note the warning above about KEYS in production).
The idea of Redis (and other key/value stores) is that you build an index yourself. The database won't do it for you. That's a major difference from relational databases, and it contributes to better performance.
So when your app creates a hash, put its key into a SET. When your app deletes a hash, remove its key from the SET. Then, to get the list of hash IDs, just use SMEMBERS to get the content of the SET.
connection.keys('*') will bring back all the keys irrespective of data type, since everything in Redis is stored in key/value format.
For redis-py in Python, you can use the following to retrieve keys from the Redis DB:
def standardize_list(bytelist):
    return [x.decode('utf-8') for x in bytelist]
>>> standardize_list(r.keys())
['hat:1236154736', 'hat:56854717', 'hat:1326692461']
Here the r variable is the Redis connection object.

How can I get the count of keys in redis?

I can get some keys with this command:
keys *_products_*
However, that command returns all the keys, whereas I just need the count. How can I get it?
DBSIZE
Return the number of keys in the database
http://redis.io/commands/dbsize
You can use DBSIZE or INFO KEYSPACE
But if you want all the keys matching a certain pattern in the name, you need to use KEYS or SCAN. And pay attention to KEYS: running it in production can affect performance, so it should be used with caution.
Both the (evil) KEYS and the much-preferred SCAN do not return counts, only key names. You could wrap them in a Lua script that just returns the count.
However.
Doing KEYS (or SCAN) in real time is very expensive in terms of performance: it means you're iterating over the entire keyspace. What you want to do is keep the result ready beforehand, and then just fetch it. Usually in such cases you'd use a Redis Set in which you store each relevant key name: SADD to it whenever you create a key that matches the pattern *products* and remove from it whenever such a key is deleted.
But.
Since you're only interested in a count, you can replace that Set with a simple counter.
Use this command line:
redis-cli --scan --pattern '*_products_*' | wc -l
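The counter variant suggested above might be sketched like this, assuming redis-py, a hypothetical connection `r`, and an arbitrary counter key `products:count`:

```python
# Sketch: keep a live count instead of scanning, assuming redis-py
# (`r` is a hypothetical connection; "products:count" is an arbitrary
# counter key). The exists/set pair is not atomic; under heavy
# concurrency you'd wrap it in a MULTI/EXEC or a Lua script.
COUNTER = "products:count"

def set_product(r, key, value):
    if not r.exists(key):      # only count genuinely new keys
        r.incr(COUNTER)
    r.set(key, value)

def delete_product(r, key):
    if r.delete(key):
        r.decr(COUNTER)

def product_count(r):
    return int(r.get(COUNTER) or 0)
```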

Redis expire on large set of keys

My problem is: I have a set of values, and each of them has to have an expiry.
code:
set a:11111:22222 someValue
expire a:11111:22222 604800 // usually equals a week
In a perfect world I would have put all those values in a hash and given each of them its appropriate expire value, but Redis does not allow expiry on hash fields.
The problem is that I also have a process that needs to get all those keys about once an hour:
keys a:*
This command is really expensive and, according to the Redis documentation, can cause performance issues. I have about 25,000-30,000 keys at any given moment.
Does anyone know how I can solve such a problem?
Thumbs up guaranteed (-;
Roy
Let me propose an alternative solution.
Rather than asking Redis to scan all the keys, why not perform a background dump and parse the dump to extract the keys? This way, there is zero impact on the Redis instance itself.
Parsing the dump file is not as scary as it sounds, because you can use the excellent redis-rdb-tools package:
https://github.com/sripathikrishnan/redis-rdb-tools
You can either convert the dump file into a json file, and then parse the json file, or use the Python API to extract the keys by yourself.
As you've already mentioned, using keys is not a good solution to get your keys:
Warning: consider KEYS as a command that should only be used in production environments with extreme care. It may ruin performance when it is executed against large databases. This command is intended for debugging and special operations, such as changing your keyspace layout. Don't use KEYS in your regular application code. If you're looking for a way to find keys in a subset of your keyspace, consider using sets.
Source: Redis docs for KEYS
As the docs are suggesting, you should build your own indices!
A common way of building an index is to use a sorted set; I've written about how this works in a related question.
Building references to your a:* keys in a sorted set will also allow you to select only the required keys in relation to a date or any other integer value, in case you're filtering the results!
And yes: it would be awesome if hashes could expire. Sadly it looks like it's not going to happen, but there are in fact creative alternatives to take care of it yourself.
Why don't you use a sorted set?
Here is a sample data-creation sequence:
redis 127.0.0.1:6379> setex a:11111:22222 604800 someValue
OK
redis 127.0.0.1:6379> zadd user:index 1385112435 a:11111:22222 // 1384507635 + 604800
(integer) 1
redis 127.0.0.1:6379> setex a:11111:22223 604800 someValue2
OK
redis 127.0.0.1:6379> zadd user:index 1385113289 a:11111:22223 // 1384508489 + 604800
(integer) 1
redis 127.0.0.1:6379> zrangebyscore user:index 1385112435 1385113289
1) "a:11111:22222"
2) "a:11111:22223"
With this there is no read-performance issue,
but it costs more memory and adds insert overhead.
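The read side of the scheme above might be sketched as follows, assuming redis-py, a hypothetical connection `r`, and the `user:index` layout shown (member = key name, score = expiry timestamp):

```python
# Sketch: read live keys and prune expired index entries, assuming
# redis-py (`r` is a hypothetical connection) and the "user:index"
# sorted set above, where each member's score is its expiry time.
import time

def live_keys(r, index="user:index"):
    now = int(time.time())
    r.zremrangebyscore(index, "-inf", now)     # drop expired entries
    return r.zrangebyscore(index, now, "+inf")
```

The pruning step keeps the index from accumulating members whose underlying SETEX keys Redis has already expired.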