Use Cases for Redis' "Score" and "Ranking" Features for Sets - redis

What are some use cases for Redis' "score" and "ranking" features for sets (outside of the typical "leaderboard" examples for games? I'm trying to figure out how to make use of these dynamic new features as I anticipate moving from using a traditional relational database to Redis as a persistent data store.

ZSETs are great for selections or ranges based on scores, but scores can be any numerical value, like a timestamp.
We store daily stock prices for all US stocks in redis. Here's an example for ebay...
ZADD key score member [score member ...]
...
ZADD stocks:ebay 1 30.39 2 32.70 3 31.25 4 31.75 5 29.12 6 29.87 7 29.93
The score values in this case would normally be long timestamps, with that aside, if we want daily prices for the last 3 days, we simply convert two dates to timestamps and pull from redis using the timestamp range 1 3...
zrangebyscore stocks:ebay 1 3
1) "30.39"
2) "32.70"
3) "31.25"
The query is very fast and works well for our needs.
Hope it helps!

zset is the only type of key who can be sorted
by example you can imagine puts all comments key id of a specific article in a zset,
users will vote up/down each comments and this will change the score value
after that when you need to draw comments you can get them ordered, better comments in first place (like here)
using ZREMRANGEBYSCORE you can imagine delete all pretty bad comments each days
but as each redis type, they still basic, give you a dedicated use case is hard there can be some :- )

Related

Can I define a custom compare function for redis zset?

Background:
I'm now using php+redis as my backend to store a rank.
And zset seems to be a good solution to handle this.
However the rank contains multiple scores, if the first score equals, I need to compare the second score to decide the ordering. There are 3 scores in total.
I thought there will be an interface that I can set a custom compare function for a specific zset so that I can do the sort job inside it but I failed to find it. Besides, I'd like the rank to be sorted when it's added. if I need to sort again everytime there is a request to get the rank, then this is wasteful I think.
expect result:
zadd myset 1000_100_3000 matchId1
zadd myset 1000_2500_250 matchId2
zadd myset 1000_2500_200 matchId3
zrange myset 0 -1 returns:
matchId2
matchId3
matchId1
something like this
Short answer: no, you can't do that
However, you can often compose multiple keys in such a way to make them sortable. In your case, an obvious candidate would be to determine the max range of the three pieces, and just compose them all into a single big number. For example, 1000_100_3000 could perhaps be the number 100001003000 (4 decimal digits per chunk), which can be trivially compared or decomposed. You might also want to think in terms of bits rather than digits, though. For example, maybe allow 20 bits per segment, and use shift/mask bit operations to compose/decompose (i.e. (1000 << 40) | (100 << 20) | (3000))

Best way for getting users friends top rating with Redis SORTED SET

I have SORTED SET user_id:rating for every level in the game(2000+ levels). There is 2 000 000 users in set.
I need to create 2 ratings - first - all users top 100, second - top 5 friends each player
First can be solved very easily with ZRANGE
But there is a problem with second, because in average - every user has 500 friends
There is 2 ways:
1) I can do 500 requests with ZSCORE\ZRANK and sort users on by backend (too many requests, bad performance)
2) I can create SORTED SET for each user and update it on background on every users update. (more data, more ram, more complex)
May be there are any others options I missed?
I believe your main concern here should be your data model. Does every user have a sorted set of his friends?
I would recommend something like this:
users:{id}:friends values as the ids of friends
users:scoreboard values as the users ids and score as the rating
of each
As an answer to your first concern, you can consider using pipelines, which will reduce the number of requests drastically, none the less you will still need to handle ordering the results.
The better answer for you problem would be, in case you have the two sorted sets as described earlier:
Get the intersection between the two, using the "zinterstore" command and storing the result in a sorted set created solely for this purpose. As a result, the new sorted set will contain all the user's friends ids with their rating as the score (need to be careful here since you will need to specify the score of the new sorted set, it can either be the SUM, MIN or MAX of the scores).
ref: http://redis.io/commands/zinterstore
At this point using a simple "zrevrangebyscore" and specifying a limit, will leverage the sorted result you are looking for.

How to intersect sorted sets with specific score?

tl:dr;
Can I do the intersection between two sorted sets, but only for the fields that have a specified score?
For example:
ZADD zset1 1 "first"
ZADD zset1 2 "second"
ZADD zset2 1 "first"
ZADD zset2 2 "second"
<command_that_I_need> <score> 2 zset1 zset2
And with score 1, would return:
1 "first"
Why do I want to do this?
Why? I'll need to explain some things first.
In my model, there are users, which can create articles.
Each article is associated to an article_id.
I want to give anyone the ability to search for article titles, and optionally specifying the user.
And I thought of this by splitting the query in words, and finding which articles contain those words.
And I'm able to do that because I store that information this way:
zadd search_words:<word> <user_id> <article_id>
Each time an article is created, its title is split in words. For each word in words, I create that sorted set.
That way, I would split each query in words, and intersect the search_words:<word> sorted sets. And I do want to filter by score because that would allow me to filter by user_id.
Note: Since the articles also have content, I'm aware that the user would want the search to look in the content, but that is out of the scope of my project, and thus I ignore that temporarily.
Note 2: Is there a better solution to this?
There is no single command to do your bidding in this case. One way to go about it would be to first do the ZINTERSTORE on all your sets and then do a ZRANGEBYSCORE on the result to filter by your user_id.
Note 2: your use of sorted sets' score is acceptable since apparently user_id is numeric. However, a simpler way would be to use regular sets. You can continue using the same mechanism for the words and articles but instead of storing the article_id<->user_id relationship in a score, use a dedicated set for each user, for example:
SADD articles:<user_id> <article_id>
Then, to do your query for articles for specific words and a user, just do SINTER[STORE] like so:
SINTER articles:<user_id> search_words:<word1> search_words:<word2>...

Redis: fan out news feeds in list or sorted set?

I'm caching fan-out news feeds with Redis in the following way:
each feed activity is a key/value, like activity:id where the value is a JSON string of the data.
each news feed is currently a list, the key is feed:user:user_id and the list contains the keys of the relevant activities.
to retrieve a news feed I use for example: 'sort feed:user:user_id by nosort get * limit 0 40'
I'm considering changing the feed to a sorted set where the score is the activity's timestamp, this way the feed is always sorted by time.
I read http://arindam.quora.com/Redis-sorted-sets-and-lists-Pertaining-to-Newsfeed which recommend using lists because of the time complexity of sorted sets, but by keep using lists I have to take care of the insert order,
inserting a past story requires to iterate through the list and finding the right index to push to. (which can cause new problems in distributed environments).
should I keep using lists or go for sorted sets?
is there a way to retrieve the news feed instantly from a sorted set, (like with the sort ... get * command for a list) or does it have to be zrange and then iterating through the results and getting each value?
Yes, sorted sets are very fast and powerful. They seem a much better match for your requirements than SORT operations. The time complexity is often misunderstood. O(log(N)) is very fast, and scales just fine. We use it for tens of millions of members in one sorted set. Retrieval and insertion is sub-millisecond.
Use ZRANGEBYSCORE key min max WITHSCORES [LIMIT offset count] to get your results.
Depending on how you store the timestamps as 'scores', ZREVRANGEBYSCORE might be better.
A small remark about the timestamps: Sorted set SCORES which don't need a decimal part should be using 15 digits or less. So the SCORE has to stay in the range -999999999999999 to 999999999999999. Note: These limits exist because Redis server actually stores the score (float) as a redis-string representation internally.
I therefore recommend this format, converted to Zulu Time: -20140313122802 for second-precision. You may add 1 digit for 100ms-precision, but no more if you want no loss in precision. It's still a float64 by the way, so loss of precision could be fine in some scenarios, but your case fits in the 'perfect precision' range, so that's what I recommend.
If your data expires within 10 years, you can also skip the three first digits (CCY of CCYY), to achieve .0001 second precision.
I suggest negative scores here, so you can use the simpler ZRANGEBYSCORE instead of the REV one. You can use -inf as the start score (minus infinity) and LIMIT 0 100 to get the top 100 results.
Two sorted set members (or 'keys' but that's ambiguous since the sorted set is also a key in itself) may share a score, that's no problem, the results within an identical score are alphabetical.
Hope this helps, TW
Edit after chat
The OP wanted to collect data (using a ZSET) from different keys (GET/SET or HGET/HSET keys). JOIN can do that for you, ZRANGEBYSCORE can't.
The preferred way of doing this, is a simple Lua script. The Lua script is executed on the server. In the example below I use EVAL for simplicity, in production you would use SCRIPT EXISTS, SCRIPT LOAD and EVALSHA. Most client libraries have some bookkeeping logic built-in, so you don't upload the script each time.
Here's an example.lua:
local r={}
local zkey=KEYS[1]
local a=redis.call('zrangebyscore', zkey, KEYS[2], KEYS[3], 'withscores', 'limit', 0, KEYS[4])
for i=1,#a,2 do
r[i]=a[i+1]
r[i+1]=redis.call('get', a[i])
end
return r
You use it like this (raw example, not coded for performance):
redis-cli -p 14322 set activity:1 act1JSON
redis-cli -p 14322 set activity:2 act2JSON
redis-cli -p 14322 zadd feed 1 activity:1
redis-cli -p 14322 zadd feed 2 activity:2
redis-cli -p 14322 eval '$(cat example.lua)' 4 feed '-inf' '+inf' 100
Result:
1) "1"
2) "act1JSON"
3) "2"
4) "act2JSON"

Creating a workable Redis store with several filters

I am working on a system to display information about real estate. It runs in angular with the data stored as a json file on the server, which is updated once a day.
I have filters on number of bedrooms, bathrooms, price and a free text field for the address. It's all very snappy, but the problem is the load time of the app. This is why I am looking at Redis. Trouble is, I just can't get my head round how to get data with several different filters running.
Let's say I have some data like this: (missing off lots of fields for simplicity)
id beds price
0 3 270000
1 2 130000
2 4 420000
etc...
I am thinking I could set up three sets, one to hold the whole dataset, one to create an index on bedrooms and another for price:
beds id
2 1
3 0
4 2
and the same for price:
price id
130000 1
270000 0
420000 2
Then I was thinking I could use SINTER to return the overlapping sets.
Let's say I looking for a house with more than 2 bedrooms that is less than 300000.
From the bedrooms set I get IDs 0,2 for beds > 2.
From the prices set I get IDs 0,1 for price < 300000
So the common id is 0, which I would then lookup in the main dataset.
It all sounds good in theory, but being a Redis newbie, I have no clue how to go about achieving it!
Any advice would be gratefully received!
You're on the right track; sets + sorted sets is the right answer.
Two sources for all of the information that you could ever want:
Chapter 7 of my book, Redis in Action - http://bitly.com/redis-in-action
My Python/Redis object mapper - https://github.com/josiahcarlson/rom (it uses ideas directly from chapter 7 of my book to implement sql-like indices)
Both of those resources use Python as the programming language, though chapter 7 has been translated into Java: https://github.com/josiahcarlson/redis-in-action/ (go to the java path to see the code).
... That said, a normal relational database (especially one with built-in Geo handling like Postgres) should handle this data with ease. Have you considered a relational database?