Best way to store data and have ordered list at the same time - redis

I have data that changes too often to live in my Postgres tables.
I would like to get top-N lists out of that data.
I'm trying to figure out a way to do this, considering:
Ease of use
Performance
1. Hashes + a cron job that rebuilds the sorted sets periodically
In this case, I have a lot of user data stored in hashes like this:
u:25463:d = { "xp": 45124, "lvl": 12, "like": 15, "liked": 2 }
u:2143:d = { "xp": 4523, "lvl": 10, "like": 12, "liked": 5 }
If I want the top 15 users by lvl, I don't think I can do it with a single command. I think I'd need to SCAN all the u:x:d keys and build sorted sets out of them. Am I mistaken?
What about performance in this case?
2. Multiple sorted sets
In this case, I duplicate data.
I still keep the hashes from the first case, but I also update the data in the different sorted sets on every write, so I don't need a cron job to build them.
I feel like the best approach is the first one, but what if I have 1,000,000 users?
Or is there another way?

One possibility would be to use a single sorted set plus hashes.
The sorted set would serve only as a lookup index: it stores the key of each user's hash as the member and the user's level as the score.
Any time you add a new player or update their level, you would both set the hash and insert the entry into the sorted set. You could do this in a transactional pipeline (MULTI/EXEC) or a Lua script so that both writes are applied atomically, keeping your data consistent.
Getting the top players then means grabbing the top entries of the sorted set and using those keys to look up each player's full data in the hashes.
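A minimal sketch of that flow in Python, assuming a redis-py-style client `r` (any object exposing `pipeline`, `hset`, `zadd`, `zrevrange` and `hgetall` with redis-py's signatures). The `leaderboard:lvl` key and both function names are made up for illustration; the `u:<id>:d` hash keys follow the question.

```python
LEADERBOARD_KEY = "leaderboard:lvl"  # hypothetical key name


def save_player(r, user_id, fields):
    """Write the user's hash and its leaderboard entry in one MULTI/EXEC."""
    user_key = f"u:{user_id}:d"
    pipe = r.pipeline(transaction=True)
    pipe.hset(user_key, mapping=fields)
    # Member = key of the user's hash, score = the user's level.
    pipe.zadd(LEADERBOARD_KEY, {user_key: fields["lvl"]})
    pipe.execute()


def top_players(r, n=15):
    """Return [(user_key, hash_fields), ...] for the n highest levels."""
    user_keys = r.zrevrange(LEADERBOARD_KEY, 0, n - 1)
    pipe = r.pipeline(transaction=False)  # plain batching, no MULTI/EXEC
    for user_key in user_keys:
        pipe.hgetall(user_key)
    return list(zip(user_keys, pipe.execute()))
```

Against a real server the client would typically be created with `decode_responses=True` so that keys and fields come back as strings rather than bytes. Ranking by another field (xp, like, ...) would need one more sorted set maintained the same way.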
Hope that helps.

Related

sdiff - limit the result set to X items

I want to get the diff of two sets in redis, but I don't need to return the entire array, just 10 items for example. Is there any way to limit the results?
I was thinking something like this:
SDIFF set1 set2 LIMIT 10
If not, are there any other options to achieve this in a performant way, considering that set1 can hold millions of objects and set2 is much, much smaller (hundreds)?
More info would be helpful on what you want to achieve. Something like this might require you to duplicate your data. Though I don’t know if it’s something you want.
One option is to chunk them:
1. Create a set with a uniquely generated id that holds at most 10 items.
2. Create a sorted set like so:
zadd(key, timestamp, chunkid)
where timestamp is a Unix time and chunkid is the key that points to the set. The key can be any name you like, or a uniquely generated id.
3. Use zrange to grab a specific chunk.
4. Repeat steps 1-3 for the second set.
5. Once you have one result from each of your sorted sets, you can run sdiff on the two chunk ids.
Note that there are advantages and disadvantages to doing this, such as more connection usage (if calling from a client) and, obviously, a little more processing. Putting this in a Lua script will help immensely, though.
Hope this helps or at least gives you an idea on how to model your data. Though if this is critical data, you might need an automated script of some sort to move your data around to meet the modeling requirement.
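A different route from the chunking above, offered only as a sketch: since set2 holds just a few hundred members, it can be fetched once, and set1 can be walked incrementally with SSCAN, stopping as soon as enough non-members have been found. This assumes a redis-py-style client `r` exposing `smembers` and `sscan_iter`; `sdiff_limited` and its parameters are hypothetical names, not a Redis feature.

```python
def sdiff_limited(r, big_key, small_key, limit=10, scan_count=100):
    """Return up to `limit` members of big_key that are not in small_key."""
    small = r.smembers(small_key)  # a few hundred members: cheap to fetch once
    result = []
    seen = set()  # SSCAN may return the same member more than once
    for member in r.sscan_iter(big_key, count=scan_count):
        if member in small or member in seen:
            continue
        seen.add(member)
        result.append(member)
        if len(result) == limit:
            break  # stop scanning; the rest of big_key is never read
    return result
```

Which 10 items come back depends on SSCAN's iteration order, so this only fits the case where any 10 members of the difference will do.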

Best way for getting users friends top rating with Redis SORTED SET

I have a SORTED SET of user_id:rating for every level in the game (2000+ levels). There are 2,000,000 users in the set.
I need to build two rankings: first, the top 100 across all users; second, the top 5 among each player's friends.
The first can be solved very easily with ZRANGE.
But the second is a problem, because on average every user has 500 friends.
There are two ways:
1) I can do 500 requests with ZSCORE/ZRANK and sort the users on the backend (too many requests, bad performance).
2) I can create a SORTED SET for each user and update it in the background on every user update (more data, more RAM, more complexity).
Are there any other options I missed?
I believe your main concern here should be your data model. Does every user have a sorted set of their friends?
I would recommend something like this:
users:{id}:friends, whose members are the ids of the user's friends
users:scoreboard, whose members are the user ids, each scored by that user's rating
As for your first option, you can consider using pipelines, which drastically reduce the number of round trips; nonetheless, you will still need to sort the results yourself.
The better answer to your problem, provided you have the two sorted sets described above:
Get the intersection of the two with the "zinterstore" command, storing the result in a sorted set created solely for this purpose. The new sorted set will contain all of the user's friends' ids with their ratings as scores (be careful here: you need to specify how the scores of the new sorted set are aggregated, which can be SUM, MIN or MAX).
ref: http://redis.io/commands/zinterstore
At this point a simple "zrevrangebyscore" with a limit will give you the sorted result you are looking for.
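The intersection step might look like the following in Python, assuming a redis-py-style client `r`. The destination key, the TTL and the function name are illustrative assumptions. Giving the friends key a weight of 0 is one way to make the stored score equal the rating alone under SUM, whether the friends key is a plain set or a sorted set.

```python
def top_friends(r, user_id, n=5, scoreboard_key="users:scoreboard", ttl_seconds=60):
    """Return the user's n best-rated friends as [(friend_id, rating), ...]."""
    friends_key = f"users:{user_id}:friends"
    dest_key = f"users:{user_id}:friends:scoreboard"  # hypothetical scratch key

    pipe = r.pipeline(transaction=True)
    # Weights: friends contribute 0, the scoreboard contributes its rating.
    pipe.zinterstore(dest_key, {friends_key: 0, scoreboard_key: 1}, aggregate="SUM")
    pipe.expire(dest_key, ttl_seconds)  # let the scratch key clean itself up
    pipe.zrevrangebyscore(dest_key, "+inf", "-inf", start=0, num=n, withscores=True)
    return pipe.execute()[-1]
```

With one scoreboard per level, as in the question, `scoreboard_key` would be that level's sorted set.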

What is the best way to retrieve soccer games by league names in redis?

I have hundreds of soccer games saved in my Redis database. They are saved in hashes under the key games:soccer:data. I have three sorted sets (zsets) to classify them into upcoming, live, and ended, all ordered by date (the score). This way I can easily retrieve them depending on whether they will start soon, are already happening, or have already ended. Now I want to be able to retrieve them by league name.
I came up with two alternatives:
First alternative: save single hashes containing the game id and the league name. This way I can get all live game ids and then check each id against its respective hash; if it matches the league name(s) I want, I push it into an array, and if not, I skip it. Finally, I return the array with all game ids for the leagues I wanted.
Second alternative: create keys for each league and have live, upcoming, and ended sets for each. This way, I think, it would be faster to retrieve the game ids; however, it would be a pain to maintain each set.
If you have any other way of doing this, please let me know. I don't know if sorting would be faster and save me some memory.
I am looking for speed and low memory usage.
EDIT (following hobbs's alternative):
const multi = client.multi();
const tempSet = 'users:data:14:sports:soccer:lists:temp_' + getTimestamp();
return multi
  .sunionstore(
    tempSet,
    [
      'sports:soccer:lists:leagueNames:Bundesliga',
      'sports:soccer:lists:leagueNames:La Liga'
    ]
  )
  .zinterstore(
    'users:data:14:sports:soccer:lists:live',
    2,
    'sports:lists:live',
    tempSet
  )
  .del(tempSet)
  .execAsync()
I need to add AGGREGATE MAX to my query and I have no idea how.
One way would be to use a SET containing all of the games for each league, and use ZINTERSTORE to compute the intersection between your league sets and your existing sets. You could do the ZINTERSTORE every time you query the data (it's not a horribly expensive operation unless your data is very large), or you could do it only when writing to one of the "parent" sets, or you could treat it as a sort of cache by giving it a short TTL and creating it only if it doesn't exist when you go to query it.
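As a rough illustration of the query-time variant, including the AGGREGATE MAX the question asks about, here is a Python sketch assuming a redis-py-style client `r`, where the option is the `aggregate` keyword of `zinterstore`. Key names are taken from the question; the function name and the uuid-based temp suffix (standing in for the question's `getTimestamp()`) are assumptions.

```python
import uuid

LIVE_KEY = "sports:lists:live"
LEAGUE_PREFIX = "sports:soccer:lists:leagueNames:"


def live_games_for_leagues(r, user_id, league_names):
    """Return [(game_id, date_score), ...] of live games in the given leagues."""
    base = f"users:data:{user_id}:sports:soccer:lists"
    temp_key = f"{base}:temp_{uuid.uuid4().hex}"
    dest_key = f"{base}:live"
    league_keys = [LEAGUE_PREFIX + name for name in league_names]

    pipe = r.pipeline(transaction=True)
    pipe.sunionstore(temp_key, league_keys)  # all games of the wanted leagues
    # Members of a plain set count as score 1, so MAX keeps the date score
    # coming from the live sorted set (assuming dates are stored as values > 1).
    pipe.zinterstore(dest_key, [LIVE_KEY, temp_key], aggregate="MAX")
    pipe.delete(temp_key)
    pipe.zrange(dest_key, 0, -1, withscores=True)
    return pipe.execute()[-1]
```

To use the result as a short-lived cache, as suggested above, an `expire` on the destination key could be queued in the same pipeline.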

Redis Sorted Set ... store data in "member"?

I am learning Redis and using an existing app (e.g. converting pieces of it) for practice.
I'm really struggling to understand first IF and then (if applicable) HOW to use Redis in one particular use-case ... apologies if this is super basic, but I'm so new that I'm not even sure if I'm asking correctly :/
Scenario:
Images are received by a server and info like time_taken and resolution is saved in a database entry. Images are then associated with (e.g. "belong_to") one Event ... all very straightforward for an RDBMS.
I'd like to use a Redis to maintain a list of the 50 most-recently-uploaded image objects for each Event, to be delivered to the client when requested. I'm thinking that a Sorted Set might be appropriate, but here are my concerns:
First, I'm not sure if a Sorted Set can/should be used in this associative manner? Can it reference other objects in Redis? Or is there just a better way to do this altogether?
Secondly, I need the ability to delete elements that are greater than X minutes old. I know about the EXPIRE command for keys, but I can't use this because not all images need to expire at the same periodicity, etc.
This second part seems more like a query on a field, which makes me think that Redis cannot be used ... but then I've read that I could maybe use the Sorted Set score to store a timestamp and find "older than X" in that way.
Can someone provide some clarity on these two issues? Thank you very much!
UPDATE
Knowing that the amount of data I need to store for each image is small and will be delivered to the client's browser, is there anything wrong with storing it in the member "field" of a sorted set?
For example Sorted Set => event:14:pictures <time_taken> "{id:3,url:/images/3.png,lat:22.8573}"
This saves the data I need and creates a rapidly-updatable list of the last X pictures for a given event with the ability to, if needed, identify pictures that are greater than X minutes old ...
"First, I'm not sure if a Sorted Set can/should be used in this associative manner? Can it reference other objects in Redis?"
Why do you need to reference other objects? An event may have n image objects, each with a time_taken and image data; a sorted set is perfect for this. The key identifies the event (e.g. event:14:pictures), the score is time_taken, and the member is the image data as JSON, XML, or whatever you like; you're good to go there.
"Secondly, I need the ability to delete elements that are greater than X minutes old"
If you want to delete elements more than X minutes old, use ZREMRANGEBYSCORE:
ZREMRANGEBYSCORE event:14:pictures -inf <currentTime - X minutes>
-inf stands for the lowest possible score, so you don't need to know the oldest member's time; the upper bound, however, must be computed from the current time before you issue the command (the line above is just an example).
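Putting the pieces together, a small Python sketch, assuming a redis-py-style client `r`, Unix timestamps in seconds as scores, and JSON as the member encoding. The function names and the rank-based trim to 50 entries are illustrative additions rather than part of the answer.

```python
import json
import time


def add_picture(r, event_id, picture, time_taken, keep=50):
    """Store the picture data as the member, scored by its timestamp."""
    key = f"event:{event_id}:pictures"
    pipe = r.pipeline(transaction=True)
    pipe.zadd(key, {json.dumps(picture, sort_keys=True): time_taken})
    # Ranks are ascending by score, so this drops all but the newest `keep`.
    pipe.zremrangebyrank(key, 0, -(keep + 1))
    pipe.execute()


def expire_old_pictures(r, event_id, max_age_minutes, now=None):
    """Delete pictures older than max_age_minutes; return how many were removed."""
    now = time.time() if now is None else now
    cutoff = now - max_age_minutes * 60
    return r.zremrangebyscore(f"event:{event_id}:pictures", "-inf", cutoff)


def recent_pictures(r, event_id, n=50):
    """Return the n newest pictures, newest first, decoded from JSON."""
    members = r.zrevrange(f"event:{event_id}:pictures", 0, n - 1)
    return [json.loads(m) for m in members]
```

Because the member is the serialized data itself, changing a picture's data means removing the old member and adding the new one; the sorted set cannot update a member in place.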

Redis Sorted Sets: How do I get the first intersecting element?

I have a number of large sorted sets (5m-25m) in Redis and I want to get the first element that appears in a combination of those sets.
E.g. I have 20 sets and want to take sets 1, 5, 7 and 12 and get only the first element in the intersection of just those sets.
It would seem that a ZINTERSTORE followed by a "ZRANGE foo 0 0" would be doing a lot more work than I require, as it would calculate all the intersections and then return the first one. Is there an alternative solution that does not need to calculate all the intersections?
There is no direct, native alternative, although I'd suggest this:
Create a hash whose fields are your elements. Upon each addition to one of your sorted sets, increment the relevant field (using HINCRBY). Of course, you should increment only after checking that the element does not already exist in the sorted set you are adding to.
That way, you can quickly tell which elements appear in 4 sets.
UPDATE: Now that I think about it again, querying the hash for items with a value of 4 might be too expensive (O(n)). Another option would be to create another Sorted Set whose members are your elements and whose scores get incremented (as described above, but using ZINCRBY); you can then quickly pull all elements with score 4 (using ZRANGEBYSCORE).
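A sketch of the counter idea from the update, in Python, assuming a redis-py-style client `r`; the `membership:count` key and both function names are hypothetical. Two caveats apply. The ZADD and ZINCRBY below are separate calls, so a Lua script or transaction would be needed to make them atomic. The counter also only reflects the sets that feed it, so a specific combination such as sets 1, 5, 7 and 12 would presumably need its own counter key.

```python
def add_to_set(r, set_key, member, score, counter_key="membership:count"):
    """Add member to a sorted set and bump its set-count only if it was new."""
    added = r.zadd(set_key, {member: score}, nx=True)  # nx: never touch existing
    if added:
        r.zincrby(counter_key, 1, member)
    return bool(added)


def members_in_n_sets(r, n, limit=1, counter_key="membership:count"):
    """Return up to `limit` members that were counted in exactly n sets."""
    return r.zrangebyscore(counter_key, n, n, start=0, num=limit)
```

Members with equal counter scores come back in lexicographic order, not in the order of the original sets, so "first" here means first by member name.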