ServiceStack.Redis SearchKeys

I am using the ServiceStack.Redis client in C#.
I added about 5 million records of type1 using the pattern a::name::1, and 11 million records of type2 using the pattern b::RecId::1.
Now I am using the Redis typed client as client = redis.As<String>. I want to retrieve all the keys of type2, using the following pattern:
var keys = client.SearchKeys("b::RecID::*");
But it takes forever (approximately 3-5 minutes) to retrieve the keys.
Is there any faster and more efficient way to do this?

You should work hard to avoid the need to scan the keyspace. KEYS is literally a server-stopper, but even if you have SCAN available: don't do that. Now, you could choose to keep the keys of things you have available in a set somewhere, but there is no SRANGE etc. - in Redis 2.x you'd have to use SMEMBERS, which is still going to need to return a few million records - but at least they will all be available. In later server versions, you have access to SCAN (think: KEYS) and SSCAN (think: SMEMBERS), but ultimately you simply have the issue of wanting millions of rows, which is never free.
If possible, you could mitigate the impact by using a master/slave pair, and running the expensive operations on the slave. At least other clients will be able to do something while you're killing the server.
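A minimal sketch of the keep-the-keys-in-a-set idea with ServiceStack.Redis (the idx:b::RecID index key is a name invented here for illustration):

using ServiceStack.Redis;

using var redis = new RedisClient("localhost", 6379);

// At write time: store the record AND remember its key in a set.
var key = "b::RecID::42";
redis.SetValue(key, "{\"RecId\":42}");
redis.AddItemToSet("idx:b::RecID", key);   // SADD idx:b::RecID b::RecID::42

// At read time: one SMEMBERS against the index set, instead of walking
// the whole 16-million-key keyspace with KEYS.
var type2Keys = redis.GetAllItemsFromSet("idx:b::RecID");

With millions of members, SMEMBERS still transfers everything in one reply; on Redis 2.8+, SSCAN can page through the set instead.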

The KEYS command in Redis is slow (well, not slow exactly, but time consuming, since it walks the entire keyspace). It also blocks your server from accepting any other command while it's running.
If you really want to iterate over all of your keys, take a look at the SCAN command instead - although I can't speak to how ServiceStack exposes it.

You can use the SCAN command in a loop, where each iteration is restricted to a smaller number of keys. For a complete example, refer to this article: http://blog.bossma.cn/csharp/nservicekit-redis-support-scan-solution/
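A rough sketch, assuming a ServiceStack.Redis version recent enough to expose the SCAN-based ScanAllKeys iterator:

using ServiceStack.Redis;

using var redis = new RedisClient("localhost", 6379);

// ScanAllKeys lazily walks SCAN pages; pageSize maps to SCAN's COUNT hint,
// so the server is never tied up for more than one small page at a time.
foreach (var key in redis.ScanAllKeys("b::RecID::*", pageSize: 1000))
{
    Console.WriteLine(key);
}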

Related

Efficiently delete RedisKeys in bulk via wildcard pattern

Problem:
I need to efficiently delete keys from my Redis Cache using a wildcard pattern. I don't need atomicity; eventual consistency is acceptable.
Tech stack:
.NET 6 (async all the way through)
StackExchange.Redis 2.6.66
Redis Server 6.2.6
I currently have ~500k keys in Redis.
I'm not able to use RedisJSON for various reasons
Example:
I store the following 3 STRING types with keys:
dailynote:getitemsforuser:region:sw:user:123
dailynote:getitemsforuser:region:fl:user:123
dailynote:getitemsforuser:region:sw:user:456
...
where each STRING stores JSON like so:
> dump dailynote:getitemsforuser:region:fl:user:123
"{\"Name\":\"john\",\"Age\":22}"
The original solution used the KeysAsync method to retrieve the list of keys to delete via a wildcard pattern. Since the Redis server is 6.x, KeysAsync uses the SCAN feature internally in the StackExchange.Redis NuGet package.
The original implementation used the wildcard pattern dailynote:getitemsforuser:region:*. As one would expect, this solution didn't scale well and we started seeing RedisTimeoutExceptions.
I'm aware of the "avoid this in PROD if you can" and have seen Marc Gravell respond to a couple other questions/issues on SO and StackExchange.Redis GitHub. The only potential alternative I could think of is to use a Redis SET to "track" each RedisKey and then retrieve the list of values from the SET (which are the keys I need to remove). Then delete the SET as well as the returned keys.
Potential Solution?:
Create a Redis SET with a key of dailynote:getitemsforuser whose members are the keys of the form dailynote:getitemsforuser:region:XX...
The SET would look like:
dailynote:getitemsforuser (KEY)
dailynote:getitemsforuser:region:sw:user:123 (VALUE)
dailynote:getitemsforuser:region:fl:user:123 (VALUE)
dailynote:getitemsforuser:region:sw:user:456 (VALUE)
...
I would still have each individual STRING type as well:
dailynote:getitemsforuser:region:sw:user:123
dailynote:getitemsforuser:region:fl:user:123
dailynote:getitemsforuser:region:sw:user:456
...
When it is time to do the "wildcard" remove, I get the members of the dailynote:getitemsforuser SET, then call RemoveAsync passing the members of the set as the RedisKey[]. Then I call RemoveAsync with the key of the SET itself (dailynote:getitemsforuser).
I'm looking for feedback on how viable of a solution this is, alternative ideas, gotchas, and suggestions for improvement. TIA
UPDATE
Added my solution I went with below...
The big problem with both KEYS and SCAN with Redis is that they require a complete scan of the massive hash table that stores every Redis key. Even if you use a pattern, it still needs to check each entry in that hash table to see if it matches.
Assuming you are calling SADD when you are also setting the value in your key—and thus avoiding the call to SCAN—this should work. It is worth noting that calls to SMEMBERS to get all the members of a Set can also cause issues if the Set is big. Redis—being single-threaded—will block while all the members are returned. You can mitigate this by using SSCAN instead. StackExchange.Redis might do this already. I'm not sure.
You might also be able to write a Lua script that reads the Set and UNLINKs all the keys atomically. This would reduce network round trips but could tie Redis up if the script takes too long.
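A hypothetical sketch of that Lua approach with StackExchange.Redis. Note the tracked keys are not declared in KEYS[], which breaks Redis Cluster's rules, so this is only safe on a non-clustered deployment, and the server is blocked while the script runs:

using StackExchange.Redis;

// Reads the tracking set and UNLINKs every member, then the set itself,
// all server-side in one atomic script.
const string script = @"
local members = redis.call('SMEMBERS', KEYS[1])
for i = 1, #members do
    redis.call('UNLINK', members[i])
end
redis.call('UNLINK', KEYS[1])
return #members";

var mux = await ConnectionMultiplexer.ConnectAsync("localhost:6379");
var db = mux.GetDatabase();
var deleted = (long)await db.ScriptEvaluateAsync(
    script, new RedisKey[] { "dailynote:getitemsforuser" });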
I ended up using the solution I suggested above where I use a Redis SET with a known/fixed key to "track" each of the necessary keys.
When a key that needs to be tracked is added, I call StackExchange.Redis.IDatabase.SetAddAsync (SADD) while calling StackExchange.Redis.IDatabase.HashSetAsync (HSET) for adding the "tracked" key (along with its TTL).
When it is time to remove the "tracked" keys, I first call StackExchange.Redis.IDatabase.SetScanAsync (SSCAN) with a page size of 250, iterating the IAsyncEnumerable and calling StackExchange.Redis.IDatabase.KeyDeleteAsync (DEL) on chunks of the members of the SET. I then call KeyDeleteAsync on the key of the SET itself.
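In code, that update looks roughly like this (a sketch reconstructed from the description, not the poster's exact implementation):

using StackExchange.Redis;

var mux = await ConnectionMultiplexer.ConnectAsync("localhost:6379");
var db = mux.GetDatabase();
const string trackingSet = "dailynote:getitemsforuser";

// Page through the tracking set with SSCAN, deleting members in chunks
// so no single DEL holds up the server for long.
var chunk = new List<RedisKey>(250);
await foreach (var member in db.SetScanAsync(trackingSet, pageSize: 250))
{
    chunk.Add((string)member);
    if (chunk.Count == 250)
    {
        await db.KeyDeleteAsync(chunk.ToArray());
        chunk.Clear();
    }
}
if (chunk.Count > 0)
    await db.KeyDeleteAsync(chunk.ToArray());

// Finally remove the tracking set itself.
await db.KeyDeleteAsync(trackingSet);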
Hope this helps someone else.

Which one to use Hset or HMSet in Redis?

I am kind of new to Redis. I am just trying to store values in Redis using the HashSet method available in the StackExchange.Redis.StrongName assembly (let's say I have 4000 items). Is it better to store the 4000 items individually (using HSET), or shall I pass a dictionary (using HMSET) so that only one call to the Redis server is required, but carrying a huge amount of data? Which one is better?
Thanks in Advance
HMSET has been deprecated as of redis 4.0.0 in favor of using HSET with multiple key/value pairs:
https://redis.io/commands/hset
https://redis.io/commands/hmset
Performance will be O(N), where N is the number of fields being set.
TL;DR A single call is "better" in terms of performance.
Taking into consideration #dizzyf's answer about HMSET's deprecation, the question becomes "should I use a lot of small calls instead of one big one?". Because there is overhead in processing every command, it is usually preferable to "batch" calls together to reduce that cost.
Some commands in Redis are variadic, a.k.a. dynamic arity, meaning they can accept one or more values to eliminate multiple calls. That said, overloading a single call with a huge number of arguments is also not the best practice - it typically leads to massive memory allocations and blocks the server from serving other requests during processing.
I would approach this by dividing the values into constant-sized "batches" - 100 is a good start, but you can always tune it afterwards - and sending each such "batch" in a single HSET key field1 value1 ... field100 value100 call.
Pro tip: if your Redis client supports it, you can use pipelining to make everything more responsive ("better"?).
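Putting both suggestions together with StackExchange.Redis (hash name and sample data invented here; Enumerable.Chunk needs .NET 6+):

using StackExchange.Redis;

var mux = await ConnectionMultiplexer.ConnectAsync("localhost:6379");
var db = mux.GetDatabase();

// 4000 hypothetical field/value pairs.
var items = Enumerable.Range(0, 4000)
    .Select(i => new HashEntry($"field:{i}", $"value:{i}"))
    .ToArray();

// Each HashSetAsync is one variadic HSET carrying 100 pairs; issuing all
// the tasks before awaiting lets the client pipeline the batches.
var pending = items.Chunk(100)
    .Select(batch => db.HashSetAsync("myhash", batch))
    .ToArray();
await Task.WhenAll(pending);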

How many keys can be deleted in a single redis del command?

I want to delete multiple Redis keys using a single delete command on the Redis client.
Is there any limit on the number of keys that can be deleted?
I will be using del key1 key2 ....
There's no hard limit on the number of keys, but the query buffer limit does provide a bound. Connections are closed when the buffer hits 1 GB, so practically speaking this is somewhat difficult to hit.
Docs:
https://redis.io/topics/clients
However! You may want to take into consideration that Redis is single-threaded: a time-consuming command will block all other commands until completed. Depending on your use-case this may make a good case for "chunking" up your deletes into groups of, say, 1000 at a time, because it allows other commands to squeeze in between. (Whether or not this is tolerable is something you'll need to determine based on your specific scenario.)
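For example, with StackExchange.Redis (the key list is a made-up placeholder):

using StackExchange.Redis;

var mux = await ConnectionMultiplexer.ConnectAsync("localhost:6379");
var db = mux.GetDatabase();

RedisKey[] keysToDelete = Enumerable.Range(0, 50_000)
    .Select(i => (RedisKey)$"key:{i}")
    .ToArray();

// One giant DEL would block the server for its full duration; chunks of
// ~1000 let other clients' commands interleave between the calls.
foreach (var batch in keysToDelete.Chunk(1000))
    await db.KeyDeleteAsync(batch);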

Redis scan match performance with large number of keys?

I can't find any info about Redis SCAN with MATCH.
Does it mean that if I have 500,000 keys, it will iterate over all of them one by one and check whether they match the pattern? Or does it have some other clever trick to pull only the relevant keys?
If it actually scans them all, how is that performance-wise?
Thanks
SCAN is basically an alternative to the KEYS command, which is blocking. It returns a cursor, and with that cursor you need to call SCAN again, and the process continues until the cursor comes back to zero. Duplicates are also possible, so you need to handle them in the application logic, which means even if you have only 1 million keys and fetch 10,000 items per scan, it can take more than the expected 100 iterations.
So it's a trade-off: KEYS is quick but blocking, while SCAN is slower in comparison but does not block the production environment and still achieves what you need.
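To make the cursor loop and duplicate handling concrete, a sketch using StackExchange.Redis, whose IServer.KeysAsync drives the SCAN cursor internally on Redis 2.8+ (the pattern is hypothetical):

using StackExchange.Redis;

var mux = await ConnectionMultiplexer.ConnectAsync("localhost:6379");
var server = mux.GetServer("localhost", 6379);

// SCAN can report the same key more than once over a full iteration,
// so deduplicate in application logic as described above.
var seen = new HashSet<string>();
await foreach (var key in server.KeysAsync(pattern: "user:*", pageSize: 10_000))
{
    if (seen.Add((string)key))
    {
        // process the key
    }
}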
Hope this helps

Correct modeling in Redis for writing single entity but querying multiple

I'm trying to convert data that lives in a SQL DB to Redis, in order to gain much higher throughput for what is a very high-throughput workload. I'm aware of the downsides of persistence, storage costs, etc...
So, I have a table called "Users" with a few columns. Let's assume: ID, Name, Phone, Gender.
Around 90% of the requests are writes, updating a single row.
Around 10% of the requests are reads, getting 20 rows in each request.
I'm trying to get my head around the right modeling of this in order to get the max out of it.
If there were only updates - I would use Hashes.
But because of the 10% of Reads I'm afraid it won't be efficient.
Any suggestions?
Actually, the real question is whether you need to support partial updates.
Supposing partial update is not required, you can store your record in a blob associated with a key (i.e. the string datatype). All write operations can be done in one roundtrip, since the record is always written at once. Several read operations can be done in one roundtrip as well, using the MGET command.
Now, supposing partial update is required, you can store your record in a dictionary associated with a key (i.e. the hash datatype). All write operations can be done in one roundtrip (even if they are partial). Several read operations can also be done in one roundtrip, provided the HGETALL commands are pipelined.
Pipelining several HGETALL commands is a bit more CPU-consuming than using MGET, but not by much. In terms of latency, it should not be significantly different, except if you execute hundreds of thousands of them per second on the Redis instance.
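A sketch of both options with StackExchange.Redis (key names and sample values invented for illustration):

using StackExchange.Redis;

var mux = await ConnectionMultiplexer.ConnectAsync("localhost:6379");
var db = mux.GetDatabase();

// Option 1: record as a blob (string type). Writes replace the whole
// record; a 20-row read is a single MGET roundtrip.
await db.StringSetAsync("user:1", "{\"Name\":\"john\",\"Phone\":\"555\",\"Gender\":\"m\"}");
var keys = Enumerable.Range(1, 20).Select(i => (RedisKey)$"user:{i}").ToArray();
RedisValue[] blobs = await db.StringGetAsync(keys);        // MGET

// Option 2: record as a hash, which allows partial updates of one column.
await db.HashSetAsync("user:h:1", "Phone", "555-0199");    // HSET, one field
// Read 20 records by pipelining HGETALLs: issue all, then await together.
var reads = Enumerable.Range(1, 20)
    .Select(i => db.HashGetAllAsync((RedisKey)$"user:h:{i}"))
    .ToArray();
HashEntry[][] rows = await Task.WhenAll(reads);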