Redis: does separate database improve performance for KEYS and SORT?

Does separate database improve performance for KEYS and SORT?

If you mean that spreading the same number of keys across multiple databases will make your KEYS and SORT operations faster, then the answer is yes.
This is because there are fewer keys to check against, and the time complexity of both operations depends on the number of keys.
At the same time, if you then need a single combined result, sorting two result sets from two different databases will be far more costly.
See:
Redis commands - Sort
Redis commands - Keys
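As a rough illustration of the difference (a minimal sketch using the redis-py client, which is an assumption here; the question itself is client-agnostic):

import redis

# Each logical database has its own keyspace; the database is chosen per connection.
big = redis.Redis(db=0)    # hypothetical: holds millions of keys
small = redis.Redis(db=3)  # hypothetical: holds only a few thousand keys

# KEYS walks only the selected database's dictionary, so the second call
# has far fewer keys to check than the first.
big.keys("user::*")
small.keys("user::*")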

No. Both of those commands run against a single database. If you have two or more databases and want to run the command across them, you have to execute it in each database, therefore taking twice the amount of time.

Related

How many keys can be deleted in a single redis del command?

I want to delete multiple redis keys using a single delete command on redis client.
Is there any limit in the number of keys to be deleted?
i will be using del key1 key2 ....
There's no hard limit on the number of keys, but the query buffer limit does provide a bound. Connections are closed when the buffer hits 1 GB, so practically speaking this is somewhat difficult to hit.
Docs:
https://redis.io/topics/clients
However! You may want to take into consideration that Redis is single-threaded: a time-consuming command blocks all other commands until it completes. Depending on your use case, this may be a good reason to "chunk" your deletes into groups of, say, 1000 at a time, because it lets other commands squeeze in between. (Whether or not this is tolerable is something you'll need to determine based on your specific scenario.)
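As a minimal sketch of that chunking with the redis-py client (the chunk size of 1000 and the key names are illustrative assumptions):

import redis

r = redis.Redis()
keys_to_delete = [f"session::{i}" for i in range(100_000)]  # hypothetical keys

CHUNK = 1000
for start in range(0, len(keys_to_delete), CHUNK):
    # DEL accepts multiple keys; sending 1000 at a time keeps each command short,
    # so other clients' commands can run in between chunks.
    r.delete(*keys_to_delete[start:start + CHUNK])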

Correct modeling in Redis for writing single entity but querying multiple

I'm trying to move data that currently lives in a SQL database into Redis, in order to gain much higher throughput for a very high-throughput workload. I'm aware of the downsides regarding persistence, storage costs, etc.
So, I have a table called "Users" with few columns. Let's assume: ID, Name, Phone, Gender
Around 90% of the requests are writes that update a single row.
Around 10% of the requests are reads that fetch 20 rows each.
I'm trying to get my head around the right modeling of this in order to get the max out of it.
If there were only updates - I would use Hashes.
But because of the 10% of Reads I'm afraid it won't be efficient.
Any suggestions?
Actually, the real question is whether you need to support partial updates.
Supposing partial updates are not required, you can store each record in a blob associated with a key (i.e. the string datatype). All write operations can be done in one round trip, since the record is always written at once. Several read operations can be done in one round trip as well using the MGET command.
Now, supposing partial updates are required, you can store each record in a dictionary associated with a key (i.e. the hash datatype). All write operations can be done in one round trip (even if they are partial). Several read operations can also be done in one round trip, provided the HGETALL commands are pipelined.
Pipelining several HGETALL commands is a bit more CPU-consuming than using MGET, but not that much. In terms of latency, it should not be significantly different, unless you execute hundreds of thousands of them per second on the Redis instance.
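A minimal sketch of both approaches with redis-py (the key names and fields are assumptions for illustration):

import json
import redis

r = redis.Redis()

# Option 1: whole record as a blob under a string key (no partial updates).
r.set("user:1", json.dumps({"name": "Ann", "phone": "555", "gender": "F"}))
records = r.mget(["user:1", "user:2", "user:3"])  # several reads, one round trip

# Option 2: record as a hash (partial updates possible).
r.hset("user:1", mapping={"name": "Ann", "phone": "555", "gender": "F"})
r.hset("user:1", "phone", "556")  # partial update of a single field

# Several HGETALL reads pipelined into one round trip.
pipe = r.pipeline(transaction=False)
for key in ["user:1", "user:2", "user:3"]:
    pipe.hgetall(key)
rows = pipe.execute()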

ServiceStack.Redis SearchKeys

I am using the ServiceStack.Redis client on C#.
I added about 5 million records of type1 using the key pattern a::name::1, and 11 million records of type2 using the pattern b::RecId::1.
Now I am using the Redis typed client as client = redis.As<String>. I want to retrieve all the keys of type2, using the following pattern:
var keys = client.SearchKeys("b::RecID::*");
But it takes forever (approximately 3-5 mins) to retrieve the keys.
Is there any faster and more efficient way to do this?
You should work hard to avoid the need to scan the keyspace. KEYS is literally a server stopper, but even if you have SCAN available: don't do that. Now, you could choose to keep the keys of the things you have in a set somewhere, but there is no SRANGE etc - in older versions you'd have to use SMEMBERS, which is still going to need to return a few million records, but at least they will all be available. In later server versions you have access to SCAN (think: KEYS) and SSCAN (think: SMEMBERS), but ultimately you simply have the problem of wanting millions of rows, which is never free.
If possible, you could mitigate the impact by using a master/slave pair, and running the expensive operations on the slave. At least other clients will be able to do something while you're killing the server.
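One way to implement the "keep the keys in a set" idea is to maintain the index at write time and read it back with SMEMBERS or SSCAN instead of scanning the keyspace. A sketch with redis-py (the index set name keys:type2 is an assumption):

import redis

r = redis.Redis()

# At write time, record each key in an index set alongside the data itself.
r.set("b::RecID::42", "payload")
r.sadd("keys:type2", "b::RecID::42")

# At read time, iterate the index set instead of the whole keyspace.
for key in r.sscan_iter("keys:type2", count=1000):
    pass  # process each key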
The KEYS command in Redis is slow (well, not slow so much as time-consuming). It also blocks your server from accepting any other command while it runs.
If you really want to iterate over all of your keys, take a look at the SCAN command instead - although I can't speak to how ServiceStack exposes it.
You can use the SCAN command in a loop, where each iteration is restricted to a smaller number of keys. For a complete example, refer to this article: http://blog.bossma.cn/csharp/nservicekit-redis-support-scan-solution/
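If you do need to walk the keyspace, this is the idea of the SCAN loop sketched with redis-py (the linked article covers the C#/ServiceStack side; the pattern and count here are assumptions):

import redis

r = redis.Redis()

# SCAN iterates the keyspace in small batches instead of blocking like KEYS.
# COUNT is only a hint for how much work each round trip should do.
for key in r.scan_iter(match="b::RecID::*", count=1000):
    pass  # process each key as it arrives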

Sybase ASE data purge batch - design & performance

I am working on a Sybase ASE (migrating to 15.7) data purge utility to be used across multiple tables/databases to delete huge amounts of unwanted older data.
After receiving an input table name, the utility should automatically figure out the child tables and delete their data as well. But I couldn't find a hierarchical query clause like Oracle's "CONNECT BY ... PRIOR". Is there any other way to implement this?
I am deleting data by looping through multiple transactions/commits in small increments. After the deletes, at what interval should I run "reorg rebuild"?
Do I need to run update statistics? If so, what criteria should I consider before doing so?
Some tables may be partitioned. Is there anything I need to consider from a partitioning perspective?
Some of our DBs (I guess the indexes?) are clustered. I don't know much about clustering. Do I need to consider anything from a clustering perspective?
I want to send an email at the end of processing. Is there a built-in email package similar to Oracle's UTL_SMTP?
Some of the points are blank right now, and I will fill them as I get a chance.
1 - Check out this post on replicating this feature in Sybase ASE.
2 - My post over on the DBA stack covers a lot of the key points on determining when to run a reorg.
3 - Since updating statistics can be done more quickly than a reorg (which also updates statistics), it's sometimes used to help improve performance between reorgs. Deciding when to run them will depend on how quickly performance degrades when you do your purges. sp_sysmon is a valuable tool that can capture metrics to help you make your decision.
4 - Partitioned tables shouldn't really impact your purge. It's another case where it may improve performance for your deletes, as the data may be accessed more quickly than in other configurations.
5 - Not really. In theory your deletes should go a bit faster if your delete is using the clustered index. Clustered indexes are used to keep the data pages in order, as records are inserted, instead of heaping the inserts.
6 - For Windows based systems, xp_sendmail can be used. For *nix based systems, xp_cmdshell can be used to access sendmail. The documentation for those Extended Stored Procedures is here.

Segmenting Redis By Database

By default, Redis is configured with 16 databases, numbered 0-15. Is this simply a form of namespacing, or are there performance implications of segregating by database?
For example, if I use the default database (0) and I have 10 million keys, best practices suggest that using the keys command to find keys by wildcard patterns will be inefficient. But what if I store my major keys, perhaps the first 4 segments of 8-segment keys, in a separate database (say database 3), resulting in a much smaller subset of keys there? Will Redis see these as a smaller set of keys, or do all keys across all databases appear as one giant index of keys?
More explicitly put, in terms of time complexity, if my databases look like this:
Database 0: 10,000,000 keys
Database 3: 10,000 keys
will the time complexity of keys calls against Database 3 be O(10m) or will it be O(10k)?
Thanks for your time.
Redis has a separate dictionary for each database. From your example, the keys call against database 3 will be O(10K).
That said, using keys is against best practice. Additionally, using multiple databases for the same application is against best practice as well. If you want to iterate over keys, you should index them in an application-specific way; a SortedSet is a good way to build such an index (see the sketch after the references below).
References :
The structure redisServer has an array of redisDB. See redisServer in redis.h
Each redisDB has its own dictionary object. See redisDB in redis.h
keys command operates on the dictionary for the current database
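As a sketch of the application-specific index mentioned above (redis-py; the sorted set name and time-based scores are assumptions):

import time
import redis

r = redis.Redis(db=0)

# Instead of relying on KEYS, maintain a sorted set that indexes the "major" keys,
# scored here by insertion time so slices can be read back in order.
key = "a::b::c::d"
r.set(key, "payload")
r.zadd("idx:major-keys", {key: time.time()})

# Later, read a slice of the index instead of scanning the whole keyspace.
recent = r.zrange("idx:major-keys", 0, 99)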