How to efficiently union non-overlapping sets in redis? - redis

I have a use case where I know for a fact that some sets I have materialized in my redis store are disjoint. Some of my sets are quite large, as a result, their sunion or sunionstore takes quite a large amount of time. Does redis provide any functionality for handling such unions?
Alternatively, if there is a way to add elements to a set in Redis without checking for uniqueness before each insert, it could solve my issue.

Actually, there is no need for such feature, because of the relative cost of operations.
When you build Redis objects (such as sets or lists), the cost is not dominated by the data structure management (hash table or linked lists), because the amortized complexity of individual insertion operations is O(1). The cost is dominated by the allocation and initialization of all the items (i.e. the set objects or the list objects). When you retrieve those objects, the cost is dominated by the allocation and formatting of the output buffer, not by the access paths in the data structure.
So bypassing the uniqueness property of the sets does not bring a significant optimization.
To optimize a SUNION command if the sets are disjoint, the best is to replace it by a pipeline of several SMEMBERS commands to retrieve the individual sets (and build the union on client side).
Optimizing a SUNIONSTORE is not really possible since disjoint sets is a worst case for the performance. The performance is dominated by the number of resulting items, so the less items in common, the more response time.

Related

Correct modeling in Redis for writing single entity but querying multiple

I'm trying to convert data which is on a Sql DB to Redis. In order to gain much higher throughput because it's a very high throughput. I'm aware of the downsides of persistence, storage costs etc...
So, I have a table called "Users" with few columns. Let's assume: ID, Name, Phone, Gender
Around 90% of the requests are Writes. to update a single row.
Around 10% of the requests are Reads. to get 20 rows in each request.
I'm trying to get my head around the right modeling of this in order to get the max out of it.
If there were only updates - I would use Hashes.
But because of the 10% of Reads I'm afraid it won't be efficient.
Any suggestions?
Actually, the real question is whether you need to support partial updates.
Supposing partial update is not required, you can store your record in a blob associated to a key (i.e. string datatype). All write operations can be done in one roundtrip, since the record is always written at once. Several read operations can be done in one rountrip as well using the MGET command.
Now, supposing partial update is required, you can store your record in a dictionary associated to a key (i.e. hash datatype). All write operations can be done in one roundtrip (even if they are partial). Several read operations can also be done in one roundtrip provided HGETALL commands are pipelined.
Pipelining several HGETALL commands is a bit more CPU consuming than using MGET, but not that much. In term of latency, it should not be significantly different, except if you execute hundreds of thousands of them per second on the Redis instance.

Aerospike - Store *small quantity* of large values

Scenario
Let's say I am storing up to 5 byte arrays, each 50kB, per user.
Possible Implementations:
1) One byte array per record, indexed by secondary key.
Pros: Fast read/write.
Cons: High cardinality query (up to 5 results per query). Bad for horizontal scaling, if byte arrays are frequently accessed.
2) All byte arrays in single record in separate bins
Pros: Fast read
Neutral: Blocksize must be greater than 250kB
Cons: Slow write (one change means rewriting all byte arrays).
3) Store byte arrays in a LLIST LDT
Pros: Avoid the cons of solution (1) and (2)
Cons: LDTs are generally slow
4) Store each byte array in a separate record, keyed to a UUID. Store a UUID list in another record.
Pros: Writes to each byte array does not require rewriting all arrays. No low-cardinality concern of secondary indexes. Avoids use of LDT.
Cons: A client read is 2-stage: Get list of UUIDs from meta record, then multi-get for each UUID (very slow?)
5) Store each byte array as a separate record, using a pre-determined primary key scheme (e.g. userid_index, e.g. 123_0, 123_1, 123_2, 123_3, 123_4)
Pros: Avoid 2-stage read
Cons: Theoretical collision possibility with another user (e.g. user1_index1 and user2_index2 product same hash). I know this is (very, very) low-probability, but avoidance is still preferred (imagine one user being able to read the byte array of another user due to collision).
My Evaluation
For balanced read/write OR high read/low write situations, use #2 (One record, multiple bins). A rewrite is more costly, but avoids other cons (LDT penalty, 2-stage read).
For a high (re)write/low read situation, use #3 (LDT). This avoids having to rewrite all byte arrays when one of them is updated, due to the fact that records are copy-on-write.
Question
Which implementation is preferable, given the current data pattern (small quantity, large objects)? Do you agree with my evaluation (above)?
Here is some input. (I want to disclose that I do work at Aerospike).
Do avoid #3. Do not use LDT as the feature is definitely not as mature as the rest of the platform, especially when it comes to performance / reliability during cluster rebalance (migrations) situations when nodes leave/join a cluster.
I would try to stick as much as possible with basic Key/Value transactions. That should always be the fastest and most scalable. As you pointed out, option #1 would not scale. Secondary indices also do have an overhead in memory and currently do not allow for fast start (enterprise edition only anyways).
You are also correct on #2 for high write loads, especially if you are going to always update 1 bin...
So, this leaves options #4 and #5. For option #5, the collision will not happen in practice. You can go over the math, it will simply not happen. If it does, you will get famous and can publish a paper :) (there may even be a price for having found a collision). Also, note thatyou have the option to store the key along the record which will provide you with a 'key check' on writes which should be very cheap (since records are read anyway before being written). Option #4 would work as well, it will just do an extra read (which should be super fast).
It all depends on where you want the bit extra complexity. So you can do some simple benchmarking between the 2 options if you have that luxury before deciding.

How to implement a scalable, unordered collection in DynamoDB?

I am looking into implementing a scalable unordered collection of objects on top of Amazon DynamoDB. So far the following options have been considered:
Use DynamoDB document data types (map, list) and use document path to access stand-alone items. This has one obvious drawback for collection being limited to 400KB of data, meaning perhaps 1..10K objects depending on their size. Less obvious drawback is that cost of insertion of a new object into such collection is going to be huge: Amazon specifies that the write capacity will be deducted based on the total item size, not just newly added object -- therefore ~400 capacity units for inserting 1KB object when approaching the size limit. So considering this ruled out?
Using composite primary hash + range key, where primary hash remains the same for all objects in the collection, and range key is just something random or an atomic counter. Obvious drawback is that having identical hash key results in bad key distribution -- cardinality is low when there are collections with large number of objects. This means bad partitioning, and having a scale issue with all reads/writes on the same collection being stuck to one shard, becoming subject to 3000 reads / 1000 writes per second limitation of DynamoDB partition.
Using global secondary index with secondary hash + range key, where hash key remains the same for all objects belonging to the same collection, and range key is just something random or an atomic counter. Similar to above, partitioning becomes poor for the GSI, and it will become a bottleneck with too many identical hashes draining all the provisioned capacity to the index rapidly. I didn't find how the GSI is implemented exactly, thus not sure how badly it suffers from low cardinality.
Question is, whether I could live with (2) or (3) and suffer from non-ideal key distribution, or is there another way of implementing collection that was overlooked, or perhaps I should at all consider looking into another nosql database engine.
This is a "shooting from the hip" answer, what you end up doing may depend on how much and what type of reading and writing you do.
Two things the dynamo docs encourage you to avoid are hot keys and, in general, scans. You noted that in cases (2) and (3), you end up with a hot key. If you expect this to scale (large collections), the hot key will probably hurt more and more, especially if this is a write-intensive application.
The docs on Query and Scan operations (http://docs.aws.amazon.com/amazondynamodb/latest/developerguide/QueryAndScan.html) say that, for a query, "you must specify the hash key attribute name and value as an equality condition." So if you want to avoid scans, this might still force your hand and put you back into that hot key situation.
Maybe one route would be to embrace doing a scan operation, but just have one table devoted to your collection. Then you could just have a fully random (well distributed) hash key and do a scan every time. This assumes you always want everything from the collection (you didn't say). This will still hurt if you scale up to a large collection, but if you always want the full set back, you'll have to deal with that pain regardless. If you just want a subset, you can add a limit parameter. This would help performance, but you will always get back the same subset (or you can use the last evaluated key and keep going). The docs also mention parallel scans.
If you are using AWS, elasticache/redis might be another route to try? The first pass might code up a lot faster/cleaner than situation (1) that you mentioned.

Merge Sort in a data store?

I'm trying to make a "friend stream" for the project I'm working on. I have individual users streams saved in Redis ZSETS. Something like:
key : { stream_id : time }
user1-stream: { 1:9931112, 3:93291, 9:9181273, ...}
user2-stream: { 4:4239191, 2:92919, 7:3293021, ...}
user3-stream: { 8:3299213, 5:97313, 6:7919921, ...}
...
user4-friends: [1,2,3]
Right now, to make user4's friend stream, I would call:
ZUNIONSTORE user4-friend-stream, [user1-stream, user2-stream, user3-stream]
However, ZUNIONSTORE is slow when you try to merge ZSETS totaling more than 1-2000 elements.
I'd really love to have Redis do a merge sort on the ZSETS, and limit the results to a few hundred elements. Are there any off-the-shelf data stores that will do what I want? If not, is there any kind of framework for developing redis-like data stores?
I suppose I could just fork Redis and add the function I need, but I was hoping to avoid that.
People tend to think that a zset is just a skip list. This is wrong. It is a skip list (ordered data structure) plus a non ordered dictionary (implemented as a hash table). The semantic of a merge operation would have to be defined. For instance, how would you merge non disjoint zsets whose common items do not have the same score?
To implement a merge algorithm for ZUNIONSTORE, you would have to get the items ordered (easy with the skip lists), merge them while building the output (which happens to be a zset as well: skiplist plus dictionary).
Because the cardinality of the result cannot be guessed at the beginning of the algorithm, I don't think it is possible to build this skiplist + dictionary in linear time. It will be O(n log n) at best. So the merge is linear, but building the output is not: it defeats the benefit of using a merge algorithm.
Now, if you want to implement a ZUNION (i.e. directly returning the result, not building the result as a zset), and limit the result to a given number of items, a merge algorithm makes sense.
RDBMS supporting merge joins can typically do it (but this is usually not very efficient, due to the cost of random I/Os). I'm not aware of a NoSQL store supporting similar capabilities.
To implement it in Redis, you could try a Lua server-side script, but it may be complex, and I think it will be efficient only if the zsets are much larger than the limit provided in the zunion. In that case, the limit on the number of items will offset the overhead of running interpreted Lua code.
The last possibility is to implement it in C in the Redis source code, which is not that difficult. The drawback is the burden to maintain a patch for the Redis versions you use. Redis itself provides no framework to do that, and the idea of defining Redis plugins (isolated from Redis source code) is generally rejected by the author.

Efficient Hashmap Use

What is the more efficient approach for using hashmaps?
A) Use multiple smaller hashmaps, or
B) store all objects in one giant hashmap?
(Assume that the hashing algorithm for the keys is fairly efficient, resulting in few collisions)
CLARIFICATION: Option B implies segregation by primary key -- i.e. no additional lookup is necessary to determine which actual hashmap to use. (For example, if the lookup keys are alphanumeric, Hashmap 1 stores the A's, Hashmap 2 stores B's, and so on.)
Definitely B. The advantage of hash tables is that the average number of comparisons per lookup is independent of the size.
If you split your map into N smaller hashmaps, you will have to search half of them on average for each lookup. If the smaller hashmaps have the same load factor that the larger map would have had, you will increase the total number of comparisons by a factor of approximately N/2.
And if the smaller hashmaps have a smaller load factor, you are wasting memory.
All that is assuming you distribute the keys randomly between the smaller hashmaps. If you distribute them according to some function of the key (e.g. a string prefix) then what you have created is a trie, which is efficient for some applications (e.g. auto-complete in web forms.)
Are these maps used in logically distinct places? For instance, I wouldn't have one map containing users, cached query results, loggers etc, just because you happen to know the keys won't clash. However, I equally wouldn't split up a single map into multiple maps.
Keep one hashmap for each logical mapping from key to value.
In addition #Jon's answer, there can be practical reasons why you want to maintain separate hash tables.
If you have separate tables for different mappings you can 'clear' each of the mappings independently; e.g. by calling 'clear' or getting rid of the reference to the corresponding table.
If the separate tables hold mappings to cached entries, you can use different strategies to 'age' the respective entries.
If the application is multi-threaded, using separate tables may reduce lock contention, and may (for some processor architectures) increase processor memory cache hit ratios.