Merge Sort in a data store? - redis

I'm trying to make a "friend stream" for the project I'm working on. I have individual users' streams saved in Redis ZSETs. Something like:
key : { stream_id : time }
user1-stream: { 1:9931112, 3:93291, 9:9181273, ...}
user2-stream: { 4:4239191, 2:92919, 7:3293021, ...}
user3-stream: { 8:3299213, 5:97313, 6:7919921, ...}
...
user4-friends: [1,2,3]
Right now, to make user4's friend stream, I would call:
ZUNIONSTORE user4-friend-stream 3 user1-stream user2-stream user3-stream
However, ZUNIONSTORE is slow when you try to merge ZSETs totaling more than 1,000-2,000 elements.
I'd really love to have Redis do a merge sort on the ZSETs, and limit the results to a few hundred elements. Are there any off-the-shelf data stores that will do what I want? If not, is there any kind of framework for developing Redis-like data stores?
I suppose I could just fork Redis and add the function I need, but I was hoping to avoid that.

People tend to think that a zset is just a skip list. This is wrong. It is a skip list (an ordered data structure) plus an unordered dictionary (implemented as a hash table). The semantics of a merge operation would have to be defined. For instance, how would you merge non-disjoint zsets whose common items do not have the same score?
To implement a merge algorithm for ZUNIONSTORE, you would have to get the items in order (easy with the skip lists) and merge them while building the output (which happens to be a zset as well: skip list plus dictionary).
Because the cardinality of the result cannot be guessed at the beginning of the algorithm, I don't think it is possible to build this skiplist + dictionary in linear time. It will be O(n log n) at best. So the merge is linear, but building the output is not: it defeats the benefit of using a merge algorithm.
Now, if you want to implement a ZUNION (i.e. directly returning the result, not building the result as a zset), and limit the result to a given number of items, a merge algorithm makes sense.
RDBMS supporting merge joins can typically do it (but this is usually not very efficient, due to the cost of random I/Os). I'm not aware of a NoSQL store supporting similar capabilities.
To implement it in Redis, you could try a Lua server-side script, but it may be complex, and I think it will be efficient only if the zsets are much larger than the limit provided in the zunion. In that case, the limit on the number of items will offset the overhead of running interpreted Lua code.
The last possibility is to implement it in C in the Redis source code, which is not that difficult. The drawback is the burden to maintain a patch for the Redis versions you use. Redis itself provides no framework to do that, and the idea of defining Redis plugins (isolated from Redis source code) is generally rejected by the author.
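As a rough illustration of the limited merge discussed above (a ZUNION that returns at most a given number of items without building a zset), here is a minimal client-side sketch in Python. It assumes each stream has already been fetched as a list of (score, member) pairs sorted by descending score, and that the streams are disjoint, as in the question's example. The function name merged_top_n is hypothetical.

```python
import heapq
from itertools import islice

def merged_top_n(streams, limit):
    """Return the `limit` highest-scored (score, member) pairs across streams.

    Assumption: every input stream is already sorted by descending score,
    so the merge can stop as soon as `limit` items have been produced.
    """
    merged = heapq.merge(*streams, key=lambda pair: pair[0], reverse=True)
    return list(islice(merged, limit))

# Sample data taken from the question, sorted newest first.
user1 = [(9931112, "1"), (9181273, "9"), (93291, "3")]
user2 = [(4239191, "4"), (3293021, "7"), (92919, "2")]
user3 = [(7919921, "6"), (3299213, "8"), (97313, "5")]

top = merged_top_n([user1, user2, user3], 4)
print(top)
```

Because the merge is lazy, only about `limit` items are consumed from the inputs, which is where the benefit over a full union comes from.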

Related

Which approach is better when using Redis?

I'm facing the following problem:
I want to keep track of tasks given to users and I want to store this state in Redis.
I can do:
1) create list called "dispatched_tasks" holding many objects (username, task)
2) create many (potentially thousands) lists called dispatched_tasks:username, each usually holding a few objects (task)
Which approach is better? If I only thought of my comfort, I would choose the second one, as from time to time I will have to search for particular user tasks, and this second approach gives this for free.
But how about Redis? Which approach will be more performant?
Thanks for any help.
Redis supports different kinds of data structures as shown here. There are different approaches you can take:
Scenario 1:
Using a list data type, your list will contain all the task/user combination for your problem. However, accessing and deleting a task runs in O(n) time complexity (it has to traverse the list to get to the element). This can have an impact in performance if your user has a lot of tasks.
Using sets:
Similar to lists, but you can add/delete/check for existence in O(1), and set elements are unique. So if you add a username/task pair that already exists, it won't be added again.
Scenario 2:
The data types do not change. The only difference is that there will be a lot more keys in Redis, which can increase the memory footprint.
From the FAQ:
What is the maximum number of keys a single Redis instance can hold? and what the max number of elements in a Hash, List, Set, Sorted Set?
Redis can handle up to 2^32 keys, and was tested in practice to handle at least 250 million keys per instance.
Every hash, list, set, and sorted set, can hold 2^32 elements.
In other words your limit is likely the available memory in your system.
What's the Redis memory footprint?
To give you a few examples (all obtained using 64-bit instances):
An empty instance uses ~ 3MB of memory.
1 Million small Keys -> String Value pairs use ~ 85MB of memory.
1 Million Keys -> Hash value, representing an object with 5 fields, use ~ 160 MB of memory.
To test your use case is trivial using the redis-benchmark utility to generate random data sets and check with the INFO memory command the space used.
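To make the difference between the two scenarios concrete, here is a small sketch in which plain Python containers stand in for the Redis structures (a list for the single shared list, a dict of sets for the per-user keys). The sample users and tasks are made up for illustration, and this is only a model of the access patterns, not Redis client code.

```python
from collections import defaultdict

# Scenario 1: one shared list of (username, task) pairs.
dispatched_tasks = [("alice", "t1"), ("bob", "t2"), ("alice", "t3")]

def tasks_for_scan(username):
    # Every lookup walks the whole list: O(n) in the total number of tasks.
    return [task for user, task in dispatched_tasks if user == username]

# Scenario 2: one small collection per user, keyed dispatched_tasks:<username>.
per_user = defaultdict(set)
for user, task in dispatched_tasks:
    per_user[f"dispatched_tasks:{user}"].add(task)

def tasks_for_key(username):
    # Direct key lookup; the cost depends only on that user's own tasks.
    return per_user.get(f"dispatched_tasks:{username}", set())

print(tasks_for_scan("alice"))
print(sorted(tasks_for_key("alice")))
```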

Redis PFADD to check an exists-in-set query

I have a requirement to process multiple records from a queue. But due to some external issues the items may sporadically occur multiple times.
I need to process each item only once.
What I planned to do is PFADD every record (as an md5sum) into Redis and then check the return value. If it shows no increment, the record is a duplicate; otherwise, process the record.
This seems pretty straightforward, but I am getting too many false positives while using PFADD.
Is there a better way to do this?
Being the probabilistic data structure that it is, Redis' HyperLogLog exhibits a 0.81% standard error. You can reduce (but never get rid of) the probability of false positives by using multiple HLLs, each counting the value of a different hash function on your record.
Also note that if you're using a single HLL there's no real need to hash the record - just PFADD as is.
Alternatively, use a Redis Set to keep all the identifiers/hashes/records and have 100%-accurate membership tests with SISMEMBER. This approach requires more (RAM) resources as you're storing each processed element, but unless your queue is really huge that shouldn't be a problem for a modest Redis instance. To keep memory consumption under control, switch between Sets according to the date and set an expiry on the Set keys (another approach is to use a single Sorted Set and manually remove old items from it by keeping their timestamp in the score).
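The date-based rotation of Sets described above can be modeled in a few lines. The sketch below is an in-memory stand-in, not Redis client code: a dict keyed by day plays the role of one Set per date, and deleting old days plays the role of the expiry. The class name DailyDedup and the keep_days parameter are hypothetical.

```python
from datetime import date, timedelta

class DailyDedup:
    """In-memory model of one Set per day, with old days dropped."""

    def __init__(self, keep_days=2):
        self.keep_days = keep_days
        self.sets = {}  # day -> set of record ids, like one Redis Set per date

    def first_time(self, record_id, today):
        # Drop sets outside the retention window (Redis would expire the key).
        cutoff = today - timedelta(days=self.keep_days)
        for day in [d for d in self.sets if d <= cutoff]:
            del self.sets[day]
        # Membership test against every live set, as SISMEMBER would do per key.
        if any(record_id in members for members in self.sets.values()):
            return False
        self.sets.setdefault(today, set()).add(record_id)
        return True

dedup = DailyDedup(keep_days=2)
day0 = date(2024, 1, 1)
print(dedup.first_time("a", day0))   # first sighting
print(dedup.first_time("a", day0))   # duplicate within the window
```

The trade-off is visible in the model: a duplicate that arrives after its day has been dropped is treated as new, so the retention window should exceed the longest delay you expect between repeats.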
In general in distributed systems you have to choose between processing items either:
at most once
at least once
Processing something exactly once would be convenient; however, this is generally impossible.
That being said there could be acceptable workarounds for your specific use case, and as you suggest storing the items already processed could be an acceptable solution.
Be aware though that PFADD uses HyperLogLog, which is fast and scales but is approximate about the count of the items, so in this case I do not think this is what you want.
However if you are fine with having a small probability of errors, the most appropriate data structure here would be a Bloom filter (as described here for Redis), which can be implemented in a very memory-efficient way.
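For readers unfamiliar with the structure, here is a minimal pure-Python sketch of a Bloom filter. It is not the Redis-based implementation referred to above; the bit-array size, the number of hash functions, and the use of SHA-256 with double hashing are all assumptions chosen for illustration.

```python
import hashlib

class BloomFilter:
    def __init__(self, num_bits=8192, num_hashes=4):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item):
        # Derive k bit positions from two halves of one SHA-256 digest.
        digest = hashlib.sha256(item.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1
        return [(h1 + i * h2) % self.num_bits for i in range(self.num_hashes)]

    def add(self, item):
        """Set the item's bits; return True if it was possibly seen before."""
        seen = True
        for pos in self._positions(item):
            byte, mask = pos // 8, 1 << (pos % 8)
            if not self.bits[byte] & mask:
                seen = False
                self.bits[byte] |= mask
        return seen

    def __contains__(self, item):
        return all(self.bits[p // 8] & (1 << (p % 8))
                   for p in self._positions(item))

bf = BloomFilter()
print(bf.add("record-1"))  # not seen before
print(bf.add("record-1"))  # now reported as seen
```

A "seen" answer can be a false positive, but a "not seen" answer is always correct, which is the opposite failure mode from silently processing a duplicate.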
A simple, efficient, and recommended solution would be to use a plain Redis key (or, for instance, a hash) storing a boolean-like value ("0", "1" or "true", "false"), written for instance with HSET, or with SET and the NX option. You could also put it under a namespace if you wish. It has the added benefit that keys can be expired.
It would spare you from using a set (not the SET command, but rather the SINTER, SUNION commands), which doesn't necessarily work well with Redis Cluster if you want to scale to more than one node. SISMEMBER is still fine though (but lacks some features from hashes such as time to live).
If you use a hash, I would also advise you to pick a hash function that has fewer chances of collisions than md5 (a collision means that two different objects end up with the same hash).
An alternative approach to the hash would be to assign a UUID to every item when putting it in the queue (or a squuid if you want to have some time information).

Correct modeling in Redis for writing single entity but querying multiple

I'm trying to move data that currently lives in a SQL DB to Redis, in order to gain much higher throughput, because the workload is very high-throughput. I'm aware of the downsides regarding persistence, storage costs, etc.
So, I have a table called "Users" with few columns. Let's assume: ID, Name, Phone, Gender
Around 90% of the requests are writes, each updating a single row.
Around 10% of the requests are reads, each getting 20 rows.
I'm trying to get my head around the right modeling of this in order to get the max out of it.
If there were only updates - I would use Hashes.
But because of the 10% of Reads I'm afraid it won't be efficient.
Any suggestions?
Actually, the real question is whether you need to support partial updates.
Supposing partial update is not required, you can store your record in a blob associated with a key (i.e. string datatype). All write operations can be done in one roundtrip, since the record is always written at once. Several read operations can be done in one roundtrip as well using the MGET command.
Now, supposing partial update is required, you can store your record in a dictionary associated with a key (i.e. hash datatype). All write operations can be done in one roundtrip (even if they are partial). Several read operations can also be done in one roundtrip provided HGETALL commands are pipelined.
Pipelining several HGETALL commands consumes a bit more CPU than using MGET, but not that much. In terms of latency, it should not be significantly different, except if you execute hundreds of thousands of them per second on the Redis instance.
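The two layouts above could look roughly like this from Python. This is a sketch that assumes a redis-py style client object is passed in (its set, mget, hset, pipeline, hgetall and execute methods are used). The key prefixes user:blob: and user:hash: and the helper names are hypothetical, and JSON is just one possible blob encoding.

```python
import json

def write_user_blob(client, user):
    # Whole-record write: one SET per update, no partial updates.
    client.set(f"user:blob:{user['ID']}", json.dumps(user))

def read_users_blob(client, user_ids):
    # One MGET round trip for all rows; missing keys come back as None.
    values = client.mget([f"user:blob:{uid}" for uid in user_ids])
    return [json.loads(v) if v is not None else None for v in values]

def update_user_fields(client, user_id, fields):
    # Partial update: only the given fields of the hash are rewritten.
    client.hset(f"user:hash:{user_id}", mapping=fields)

def read_users_hash(client, user_ids):
    # Queue one HGETALL per row and send them together in one pipeline.
    pipe = client.pipeline()
    for uid in user_ids:
        pipe.hgetall(f"user:hash:{uid}")
    return pipe.execute()
```

With a real client, hash fields and values come back as bytes unless the connection was created with response decoding enabled, so the read side may need a decoding step.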

Aerospike - Store *small quantity* of large values

Scenario
Let's say I am storing up to 5 byte arrays, each 50kB, per user.
Possible Implementations:
1) One byte array per record, indexed by secondary key.
Pros: Fast read/write.
Cons: High cardinality query (up to 5 results per query). Bad for horizontal scaling, if byte arrays are frequently accessed.
2) All byte arrays in single record in separate bins
Pros: Fast read
Neutral: Blocksize must be greater than 250kB
Cons: Slow write (one change means rewriting all byte arrays).
3) Store byte arrays in a LLIST LDT
Pros: Avoid the cons of solution (1) and (2)
Cons: LDTs are generally slow
4) Store each byte array in a separate record, keyed to a UUID. Store a UUID list in another record.
Pros: Writes to each byte array do not require rewriting all arrays. No low-cardinality concern of secondary indexes. Avoids use of LDT.
Cons: A client read is 2-stage: Get list of UUIDs from meta record, then multi-get for each UUID (very slow?)
5) Store each byte array as a separate record, using a pre-determined primary key scheme (e.g. userid_index, e.g. 123_0, 123_1, 123_2, 123_3, 123_4)
Pros: Avoid 2-stage read
Cons: Theoretical collision possibility with another user (e.g. user1_index1 and user2_index2 produce the same hash). I know this is (very, very) low-probability, but avoidance is still preferred (imagine one user being able to read the byte array of another user due to collision).
My Evaluation
For balanced read/write OR high read/low write situations, use #2 (One record, multiple bins). A rewrite is more costly, but avoids other cons (LDT penalty, 2-stage read).
For a high (re)write/low read situation, use #3 (LDT). This avoids having to rewrite all byte arrays when one of them is updated, due to the fact that records are copy-on-write.
Question
Which implementation is preferable, given the current data pattern (small quantity, large objects)? Do you agree with my evaluation (above)?
Here is some input. (I want to disclose that I do work at Aerospike).
Do avoid #3. Do not use LDT as the feature is definitely not as mature as the rest of the platform, especially when it comes to performance / reliability during cluster rebalance (migrations) situations when nodes leave/join a cluster.
I would try to stick as much as possible with basic Key/Value transactions. That should always be the fastest and most scalable. As you pointed out, option #1 would not scale. Secondary indices also do have an overhead in memory and currently do not allow for fast start (enterprise edition only anyways).
You are also correct on #2 for high write loads, especially if you are going to always update 1 bin...
So, this leaves options #4 and #5. For option #5, the collision will not happen in practice. You can go over the math, it will simply not happen. If it does, you will get famous and can publish a paper :) (there may even be a prize for having found a collision). Also, note that you have the option to store the key along with the record, which will provide you with a 'key check' on writes that should be very cheap (since records are read anyway before being written). Option #4 would work as well, it will just do an extra read (which should be super fast).
It all depends on where you want the bit of extra complexity. So you can do some simple benchmarking between the 2 options, if you have that luxury, before deciding.
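For anyone who wants to "go over the math", here is a small sketch using the standard birthday-bound approximation. It assumes a 160-bit key digest (Aerospike documents a RIPEMD-160 digest for record keys, but treat the exact width as an assumption here), and the figure of 10^12 records is an arbitrary illustration.

```python
from fractions import Fraction

def collision_probability(num_keys, digest_bits=160):
    """Birthday-bound estimate: n(n-1)/2 key pairs, each colliding
    with probability 2^-digest_bits."""
    pairs = Fraction(num_keys * (num_keys - 1), 2)
    return float(pairs / (2 ** digest_bits))

# Even with a trillion records the estimate is vanishingly small.
p = collision_probability(10**12)
print(p)
```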

What is the conventional way to store objects in a sorted set in redis?

What is the most convenient/fast way to implement a sorted set in Redis where the values are objects, not just strings?
Should I just store object ids in the sorted set and then query every one of them individually by its key, or is there a way that I can store them directly in the sorted set, i.e. must the value be a string?
It depends on your needs. If you need to share this data with other zsets/structures and want to write the value only once for every change, you can put an id as the zset value and add a hash to store the object. However, this implies making additional queries when you read data from the zset (one zrange + n hgetall for n values in the zset), but writing and synchronising the value between many structures is cheap (only updating the hash corresponding to the value).
But if it is "self-contained", with no or few accesses outside the zset, you can serialize your object to a chosen format (JSON, MessagePack, Kryo...) and then store it as the value of your zset entry. This way, you will have better performance when you read from the zset (only 1 query with O(log(N)+M), which is actually pretty good, probably the best you can get), but you may have to duplicate the value in other zsets / structures if you need to read / write this value outside, which also implies maintaining synchronisation by hand on the value.
Redis has good documentation on performance of each command, so check what queries you would write and calculate the total cost, so that you can make a good comparison of these two options.
Also, don't forget that Redis comes with optimistic locking, so if you need pessimistic locking (because of contention, for instance) you will have to do it by hand and/or using Lua scripts. If you need a lot of sync, the first option seems better (lower read performance, but still good; fewer queries and less complexity on writes), but if you have values that don't change a lot and memory space is not a problem, the second option will provide better performance on reads (you can duplicate the value in Redis and synchronize the values periodically, for instance).
Short answer: Yes, everything must be stored as a string.
Longer answer: you can serialize your object into any format of your choosing, text-based or binary, since Redis strings are binary-safe. Most people choose MsgPack or JSON because they are compact and serializers are available in just about any language.
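One practical detail when storing serialized objects as sorted-set members: the member string is the identity, so the same object should always serialize to the same bytes. The sketch below uses canonical JSON for that and a plain dict as a stand-in for the sorted set (member to score). The helper name to_member and the sample objects are hypothetical.

```python
import json

def to_member(obj):
    # Canonical JSON (sorted keys, fixed separators) so that equal objects
    # always produce the same member string.
    return json.dumps(obj, sort_keys=True, separators=(",", ":"))

zset = {}  # member -> score, standing in for a Redis sorted set

zset[to_member({"id": 7, "name": "Ann"})] = 3.0
zset[to_member({"id": 3, "name": "Bob"})] = 1.0
# Same object with keys in a different order: updates the score in place
# instead of creating a second member.
zset[to_member({"name": "Ann", "id": 7})] = 5.0

ranked = [json.loads(m) for m, _ in sorted(zset.items(), key=lambda kv: kv[1])]
print(ranked)
```

Without a canonical form, two serializations of the same object would end up as two distinct members with separate scores.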