How to intersect sorted sets with specific score? - redis

tl:dr;
Can I do the intersection between two sorted sets, but only for the fields that have a specified score?
For example:
ZADD zset1 1 "first"
ZADD zset1 2 "second"
ZADD zset2 1 "first"
ZADD zset2 2 "second"
<command_that_I_need> <score> 2 zset1 zset2
And with score 1, would return:
1 "first"
Why do I want to do this?
Why? I'll need to explain some things first.
In my model, there are users, which can create articles.
Each article is associated to an article_id.
I want to give anyone the ability to search for article titles, and optionally specifying the user.
And I thought of this by splitting the query in words, and finding which articles contain those words.
And I'm able to do that because I store that information this way:
zadd search_words:<word> <user_id> <article_id>
Each time an article is created, its title is split in words. For each word in words, I create that sorted set.
That way, I would split each query in words, and intersect the search_words:<word> sorted sets. And I do want to filter by score because that would allow me to filter by user_id.
Note: Since the articles also have content, I'm aware that the user would want the search to look in the content, but that is out of the scope of my project, and thus I ignore that temporarily.
Note 2: Is there a better solution to this?

There is no single command to do your bidding in this case. One way to go about it would be to first do the ZINTERSTORE on all your sets and then do a ZRANGEBYSCORE on the result to filter by your user_id.
Note 2: your use of sorted sets' score is acceptable since apparently user_id is numeric. However, a simpler way would be to use regular sets. You can continue using the same mechanism for the words and articles but instead of storing the article_id<->user_id relationship in a score, use a dedicated set for each user, for example:
SADD articles:<user_id> <article_id>
Then, to do your query for articles for specific words and a user, just do SINTER[STORE] like so:
SINTER articles:<user_id> search_words:<word1> search_words:<word2>...

Related

How to get subset of SMEMBERS result? Or should I use SortedSet with the same value for both score & member?

I am new to Redis. For example, if I have the following schema:
INCR id:product
SET product:<id:product> value
SADD color:red <id:product>
(Aside: I am not sure how to express a variable in Redis. I will just use <id:product> as the primary key value. In production, I will use golang client for this job)
To query products which have red color, I can do:
SMEMBERS color:red
But the problem is I just want to display 10 of them in the first page, and then next 10 in the second page and so on. How to let Redis return only part of them by specifying offset and limit arguments?
What do redis experts normally do for this case? Return all IDs even if I just want 10 of them? Is that efficient? What if it has millions of values in the set, but I only want 10?
Edited 1
Incidentally, I use sets instead of lists and sorted sets because I will need to do SINTER jobs for other queries.
For example:
SADD type:foo <id:product>
SINTER color:red type:foo
And then I will have pagination problem again. Because I actually just want to find 10 of the intersection at a time. (eg: if the intersection returns millions of keys, but actually I just want 10 of them at a time for pagination).
Edited 2
Should I use a sorted set instead? I am not sure if this is the expert choice or not. Something like:
ZADD color:red <id:product> <id:product>
ZADD type:foo <id:product> <id:product>
ZRANGE color:red 0 9 // for the first page of red color products
ZINTERSTORE out 2 color:red type:foo AGGREGATE MIN
ZRANGE out 0 9 // for the first page of red color and type foo products
I have no ideas if the above way is suggested or not.
What will happen if multiple clients are creating the same out sorted set?
Is that meaningful to use the same value for both score and member?
Using sorted sets is the standard way to do pagination in Redis.
The documentation of ZINTERSTORE says that: "If destination already exists, it is overwritten."
Therefore, you shouldn't use "out" as the destination key name. You should instead use a unique or sufficiently random key name and then delete it when you're done.
I'm not sure what you mean by "meaningful". It's a fine choice if that's the order you want them to be in.

Best way for getting users friends top rating with Redis SORTED SET

I have SORTED SET user_id:rating for every level in the game(2000+ levels). There is 2 000 000 users in set.
I need to create 2 ratings - first - all users top 100, second - top 5 friends each player
First can be solved very easily with ZRANGE
But there is a problem with second, because in average - every user has 500 friends
There is 2 ways:
1) I can do 500 requests with ZSCORE\ZRANK and sort users on by backend (too many requests, bad performance)
2) I can create SORTED SET for each user and update it on background on every users update. (more data, more ram, more complex)
May be there are any others options I missed?
I believe your main concern here should be your data model. Does every user have a sorted set of his friends?
I would recommend something like this:
users:{id}:friends values as the ids of friends
users:scoreboard values as the users ids and score as the rating
of each
As an answer to your first concern, you can consider using pipelines, which will reduce the number of requests drastically, none the less you will still need to handle ordering the results.
The better answer for you problem would be, in case you have the two sorted sets as described earlier:
Get the intersection between the two, using the "zinterstore" command and storing the result in a sorted set created solely for this purpose. As a result, the new sorted set will contain all the user's friends ids with their rating as the score (need to be careful here since you will need to specify the score of the new sorted set, it can either be the SUM, MIN or MAX of the scores).
ref: http://redis.io/commands/zinterstore
At this point using a simple "zrevrangebyscore" and specifying a limit, will leverage the sorted result you are looking for.

Redis - Sorted set, find item by property value

In redis I store objects in a sorted set.
In my solution, it's important to be able to run a ranged query by dates, so I store the items with the score being the timestamp of each items, for example:
# Score Value
0 1443476076 {"Id":"92","Ref":"7ADT","DTime":1443476076,"ATime":1443901554,"ExTime":0,"SPName":"7ADT33CFSAU6","StPName":"7ADT33CFSAU6"}
1 1443482969 {"Id":"11","Ref":"DAJT","DTime":1443482969,"ATime":1443901326,"ExTime":0,"SPName":"DAJTJTT4T02O","StPName":"DAJTJTT4T02O"}
However, in other situations I need to find a single item in the set based on it's ID.
I know I can't just query this data structure as if it were a nosql db, but I tried using ZSCAN, which didn't work.
ZSCAN MySet 0 MATCH Id:92 count 1
It returns; "empty list or set"
Maybe I need to serialize different?
I have serialized using Json.Net.
How, if possible, can I achieve this; using dates as score and still be able to lookup an item by it's ID?
Many thanks,
Lars
Edit:
Assume it's not possible, but any thoughts or inputs are welcome:
Ref: http://openmymind.net/2011/11/8/Redis-Zero-To-Master-In-30-Minutes-Part-1/
In Redis, data can only be queried by its key. Even if we use a hash,
we can't say get me the keys wherever the field race is equal to
sayan.
Edit 2:
I tried to do:
ZSCAN MySet 0 MATCH *87*
127.0.0.1:6379> ZSCAN MySet 0 MATCH *87*
1) "192"
2) 1) "{\"Id\":\"64\",\"Ref\":\"XQH4\",\"DTime\":1443837798,\"ATime\":1444187707,\"ExTime\":0,\"SPName\":\"XQH4BPGW47FM\",\"StPName\":\"XQH4BPGW47FM\"}"
2) "1443837798"
3) "{\"Id\":\"87\",\"Ref\":\"5CY6\",\"DTime\":1443519199,\"ATime\":1444172326,\"ExTime\":0,\"SPName\":\"5CY6DHP23RXB\",\"StPName\":\"5CY6DHP23RXB\"}"
4) "1443519199"
And it finds the desired item, but it also finds another one with an occurance of 87 in the property ATime. Having more unique, longer IDs might work this way and I would have to filter the results in code to find the one with the exact value in its property.
Still open for suggestions.
I think it's very simple.
Solution 1(Inferior, not recommended)
Your way of ZSCAN MySet 0 MATCH Id:92 count 1 didn't work out because the stored string is "{\"Id\":\"92\"... not "{\"Id:92\".... The string has been changed into another format. So try to use MATCH Id\":\"64 or something like that to match the json serialized data in redis. I'm not familiar with json.net, so the actual string leaves for you to discover.
By the way, I have to ask you did ZSCAN MySet 0 MATCH Id:92 count 1 return a cursor? I suspect you used ZSCAN in a wrong way.
Solution 2(Better, strongly recommended)
ZSCAN is good when your sorted set is not large and you know how to save network roundtrip time by Redis' Lua transaction. This still make "look up by ID" operation O(n). Therefore, a better solution is to change you data model in the following way:
change sorted set
from
# Score Value
0 1443476076 {"Id":"92","Ref":"7ADT","DTime":1443476076,"ATime":1443901554,"ExTime":0,"SPName":"7ADT33CFSAU6","StPName":"7ADT33CFSAU6"}
1 1443482969 {"Id":"11","Ref":"DAJT","DTime":1443482969,"ATime":1443901326,"ExTime":0,"SPName":"DAJTJTT4T02O","StPName":"DAJTJTT4T02O"}
to
# Score Value
0 1443476076 Id:92
1 1443482969 Id:11
Move the rest detailed data in another set of hashes type keys:
# Key field-value field-value ...
0 Id:92 Ref-7ADT DTime-1443476076 ...
1 Id:11 Ref-7ADT DTime-1443476076 ...
Then, you locate by id by doing hgetall id:92. As to ranged query by date, you need do ZRANGEBYSCORE sortedset mindate maxdate then hgetall every id one by one. You'd better use lua to wrap these commands in one and it will still be super fast!
Data in NoSql database need to be organized in a redundant way like above. This may make some usual operation involve more than one commands and roundtrip, but it can be tackled by redis's lua feature. I strongly recommend the lua feature of redis, cause it wrap commands into one network roundtrip, which are all executed on the redis-server side and is atomic and super fast!
Reply if there's anything you don't know

Use Cases for Redis' "Score" and "Ranking" Features for Sets

What are some use cases for Redis' "score" and "ranking" features for sets (outside of the typical "leaderboard" examples for games? I'm trying to figure out how to make use of these dynamic new features as I anticipate moving from using a traditional relational database to Redis as a persistent data store.
ZSETs are great for selections or ranges based on scores, but scores can be any numerical value, like a timestamp.
We store daily stock prices for all US stocks in redis. Here's an example for ebay...
ZADD key score member [score member ...]
...
ZADD stocks:ebay 1 30.39 2 32.70 3 31.25 4 31.75 5 29.12 6 29.87 7 29.93
The score values in this case would normally be long timestamps, with that aside, if we want daily prices for the last 3 days, we simply convert two dates to timestamps and pull from redis using the timestamp range 1 3...
zrangebyscore stocks:ebay 1 3
1) "30.39"
2) "32.70"
3) "31.25"
The query is very fast and works well for our needs.
Hope it helps!
zset is the only type of key who can be sorted
by example you can imagine puts all comments key id of a specific article in a zset,
users will vote up/down each comments and this will change the score value
after that when you need to draw comments you can get them ordered, better comments in first place (like here)
using ZREMRANGEBYSCORE you can imagine delete all pretty bad comments each days
but as each redis type, they still basic, give you a dedicated use case is hard there can be some :- )

Redis Sorted Sets: How do I get the first intersecting element?

I have a number of large sorted sets (5m-25m) in Redis and I want to get the first element that appears in a combination of those sets.
e.g I have 20 sets and wanted to take set 1, 5, 7 and 12 and get only the first intersection of only those sets.
It would seem that a ZINTERSTORE followed by a "ZRANGE foo 0 0" would be doing a lot more work that I require as it would calculate all the intersections then return the first one. Is there an alternative solution that does not need to calculate all the intersections?
There is no direct, native alternative, although I'd suggest this:
Create a hash which its members are your elements. Upon each addition to one of your sorted sets, increment the relevant member (using HINCRBY). Of course, you'll make the increment only after you check that the element does not exist already in the sorted set you are attempting to add to.
That way, you can quickly know which elements appear in 4 sets.
UPDATE: Now that I rethink about it, it might be too expensive to query your hash to find items with value of 4 (O(n)). Another option would be creating another Sorted Set, which its members are your elements, and their score gets incremented (as I described before, but using ZINCRBY), and you can quickly pull all elements with score 4 (using ZRANGEBYSCORE).