I have a problem that is very similar to one mentioned here...
https://grokbase.com/t/gg/redis-db/123wv39cnt/filtering-zset-by-hash-field-value
But I am not able to understand the answer provided in that thread.
zadd scores 1.0 mary
zadd scores 2.0 sue
zadd scores 3.0 bob
zadd scores 4.0 bruce
zadd scores 5.0 maggie
sadd females mary sue maggie
zinterstore femscores 2 scores females
zrange femscores 0 50 withscores
How are the values 2, 3 and 6 calculated?
1) "mary"
2) "2"
3) "sue"
4) "3"
5) "maggie"
6) "6"
It is the result of the default WEIGHTS and AGGREGATE, as described in the ZUNIONSTORE documentation.
The default WEIGHTS is 1 and the default AGGREGATE is SUM. Members of the plain set females are treated as having a score of 1, so you are seeing each score incremented by 1.
If you want the original scores without modification, simply set the weight for females to zero:
ZINTERSTORE femscores 2 scores females WEIGHTS 1 0
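The arithmetic can be checked without a Redis server. The following is a minimal plain-Python sketch of the weighting and aggregation step, not Redis's actual implementation; the helper name zinterstore and the dict-based data model are assumptions made for illustration.

```python
def zinterstore(sources, weights=None, aggregate=sum):
    """Simulate ZINTERSTORE on plain dicts that map member -> score."""
    if weights is None:
        weights = [1] * len(sources)  # default WEIGHTS is 1 for every input
    # Only members present in every input survive the intersection.
    common = set(sources[0]).intersection(*sources[1:])
    return {
        member: aggregate(src[member] * w for src, w in zip(sources, weights))
        for member in common
    }

scores = {"mary": 1.0, "sue": 2.0, "bob": 3.0, "bruce": 4.0, "maggie": 5.0}
# A plain set takes part as if every member had score 1.
females = {name: 1.0 for name in ("mary", "sue", "maggie")}

default_result = zinterstore([scores, females])                      # WEIGHTS 1 1
unweighted_result = zinterstore([scores, females], weights=[1, 0])   # WEIGHTS 1 0

print(default_result)     # mary -> 2.0, sue -> 3.0, maggie -> 6.0
print(unweighted_result)  # mary -> 1.0, sue -> 2.0, maggie -> 5.0
```

With the default weights each surviving score is score * 1 + 1 * 1, which gives 2, 3 and 6. With WEIGHTS 1 0 the set contributes 0 and the original scores come through unchanged.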
df = pd.DataFrame({'Alice': [4,15,2], 'Bob': [9,3,5], 'Emma': [4,7,19]})
I can find who got the highest score in each round with
df.idxmax(1)
> 0 Bob
1 Alice
2 Emma
dtype: object
But I would like to find in which place Bob finished in each round. Output should be:
> 0
2
1
Seems like something with argsort should work, but can't quite get it.
(Here is the same question, but in SQL Server.)
You can use rank:
df.rank(axis=1, method='first', ascending=False)
NB. check the methods to find the one that better suits your need:
How to rank the group of records that have the same value (i.e. ties):
average: average rank of the group
min: lowest rank in the group
max: highest rank in the group
first: ranks assigned in order they appear in the array
dense: like ‘min’, but rank always increases by 1 between groups.
output:
Alice Bob Emma
0 2.0 1.0 3.0
1 1.0 3.0 2.0
2 3.0 2.0 1.0
NB. note that the ranks start at 1; you can add sub(1) to get ranks that start from 0
df.rank(axis=1, method='first', ascending=False).sub(1).convert_dtypes()
output:
Alice Bob Emma
0 1 0 2
1 0 2 1
2 2 1 0
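Selecting the 'Bob' column of that ranked frame gives the places asked for in the question. As a cross-check that does not need pandas, here is a plain-Python sketch of the same zero-based, descending, first-come tie-breaking rank; the function name places is made up for this example.

```python
rounds = {"Alice": [4, 15, 2], "Bob": [9, 3, 5], "Emma": [4, 7, 19]}

def places(scores_by_player, player):
    """Zero-based finishing place of `player` in each round (highest score = 0)."""
    names = list(scores_by_player)
    result = []
    for i in range(len(scores_by_player[player])):
        # sorted() is stable, so tied players keep their column order,
        # which mirrors what method='first' does.
        order = sorted(names, key=lambda name: -scores_by_player[name][i])
        result.append(order.index(player))
    return result

bob_places = places(rounds, "Bob")
print(bob_places)  # [0, 2, 1]
```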
I have 2 sorted sets acting as rankings. I want to get the top 5 from the union of the two. The scores are the same.
zadd rank1 1 aaa
zadd rank1 2 bbb
zadd rank1 3 ccc
zadd rank1 4 ddd
zadd rank2 1 aaa
zadd rank2 2 bbb
zadd rank2 3 ccc
zadd rank2 4 ddd
What is the best approach to do so?
ZUNION 2 rank1 rank2 AGGREGATE MAX 5
I would assume something like that, but the trailing "max 5" option doesn't exist.
EDIT
Just figured out that even ZUNION wouldn't help as my redis version is 6.0.5 and not 6.2.0
My sorted sets are huge, i.e. millions of members in each set. Plus, this union will happen many times per second (it is the top query on my site). Is this the fastest approach?
Just take the top 5 from each sorted set and choose the top 5 among those 10 elements in your server (or client) process. This would be the fastest and least complex approach for your scenario.
You can get top N elements from one SortedSet using ZREVRANGE command. But to unify/merge 2xN elements and choose top M elements, you would also require the respective scores of those elements. ZREVRANGE command with WITHSCORES keyword returns top N elements with their scores.
ZREVRANGE rank1 0 4 WITHSCORES
ZREVRANGE rank2 0 4 WITHSCORES
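The client-side merge step could look like the sketch below. It assumes the two ZREVRANGE replies have already been parsed into (member, score) pairs and that a member appearing in both sets should keep its higher score (the AGGREGATE MAX behaviour); the function name top_n_union is hypothetical.

```python
import heapq

def top_n_union(*ranked_lists, n=5):
    """Merge several (member, score) lists, keep the max score per member,
    and return the n highest-scoring pairs."""
    best = {}
    for ranked in ranked_lists:
        for member, score in ranked:
            if member not in best or score > best[member]:
                best[member] = score
    return heapq.nlargest(n, best.items(), key=lambda item: item[1])

# Parsed replies of: ZREVRANGE rank1 0 4 WITHSCORES / ZREVRANGE rank2 0 4 WITHSCORES
rank1_top = [("ddd", 4.0), ("ccc", 3.0), ("bbb", 2.0), ("aaa", 1.0)]
rank2_top = [("ddd", 4.0), ("ccc", 3.0), ("bbb", 2.0), ("aaa", 1.0)]

top = top_n_union(rank1_top, rank2_top, n=5)
print(top)  # [('ddd', 4.0), ('ccc', 3.0), ('bbb', 2.0), ('aaa', 1.0)]
```

Fetching only the top 5 of each set is exact when duplicates are combined with MAX. If the scores of duplicates had to be summed instead, a member outside both top-5 lists could still reach the combined top 5, so this shortcut would no longer be guaranteed correct.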
It depends on what you would like to do with the scores of the same key. For the two 'aaa', do you want to add them (aggregate) and then get the top five aggregated result?
You can use ZUNIONSTORE, which stores the temporary result under another key, and then get the top five from that key. (This also works on Redis versions below 6.)
To do this atomically, you'll need a Lua script that combines the following two commands. The first stores the temporary result in a zset called 'out'; the second ranks it from highest to lowest and returns the top five:
ZUNIONSTORE out 2 rank1 rank2
ZREVRANGE out 0 4
When I have a sorted set with scores, I'd like to have the right rank even when multiple items have the same score.
For instance, when there are 5 items with scores: 1, 2, 2, 2, 3, I'd like to have those three central items to have the same rank (1), while the highest score gets rank 0 (with ZREVRANGE), and the lowest gets rank 4.
I see that it's possible to query the number of members with the same score somewhat efficiently, in O(log(N)), but it looks like if I want the ranks the way I described, I'd have to use ZSCAN, which is O(N).
Edit: add complete example based on the accepted solution
Our dataset is a sorted set with scores. For example: a has score 1, b, c and d have score 2, and e has score 3:
127.0.0.1:6379> zadd aset 1 a
(integer) 1
127.0.0.1:6379> zadd aset 2 b
(integer) 1
127.0.0.1:6379> zadd aset 2 c
(integer) 1
127.0.0.1:6379> zadd aset 2 d
(integer) 1
127.0.0.1:6379> zadd aset 3 e
(integer) 1
ZREVRANK works for those items with a unique score:
127.0.0.1:6379> zrevrank aset a
(integer) 4
127.0.0.1:6379> zrevrank aset e
(integer) 0
But it fails for those items with the same score:
127.0.0.1:6379> zrevrank aset b
(integer) 3
127.0.0.1:6379> zrevrank aset c
(integer) 2
127.0.0.1:6379> zrevrank aset d
(integer) 1
To solve that, first get the score with ZSCORE:
127.0.0.1:6379> zscore aset c
"2"
The other items have the same score, of course:
127.0.0.1:6379> zscore aset b
"2"
127.0.0.1:6379> zscore aset d
"2"
To get their rank, just use ZCOUNT with the score:
127.0.0.1:6379> zcount aset (2 +inf
(integer) 1
This also works for those items that have a unique score:
127.0.0.1:6379> zcount aset (1 +inf
(integer) 4
127.0.0.1:6379> zcount aset (3 +inf
(integer) 0
Writing this as an atomic Lua script is left as an exercise for the reader.
For a given item with score x, you can determine its rank in O(log(N)) time with ZCOUNT key (x +inf, i.e. by counting the members whose score is strictly greater than x.
Exactly how you make use of that will depend on the details of your implementation.
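To make the counting idea concrete, here is a small plain-Python sketch that mimics it on a sorted list of scores. It is only a model of what ZCOUNT with an exclusive lower bound computes, not Redis code, and the helper name tie_aware_rank is an assumption.

```python
from bisect import bisect_right

def tie_aware_rank(sorted_scores, score):
    """Count entries with a strictly higher score, like ZCOUNT key (score +inf."""
    return len(sorted_scores) - bisect_right(sorted_scores, score)

aset = {"a": 1, "b": 2, "c": 2, "d": 2, "e": 3}
sorted_scores = sorted(aset.values())

ranks = {member: tie_aware_rank(sorted_scores, s) for member, s in aset.items()}
print(ranks)  # {'a': 4, 'b': 1, 'c': 1, 'd': 1, 'e': 0}
```

All three members with score 2 get rank 1, the top scorer gets rank 0, and the lowest gets rank 4, which matches the ranks the question asks for.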
ZREVRANGEBYLEX could be used in this case. The time complexity in this case would be O(log(N)+M) with N being the number of elements in the sorted set and M the number of elements being returned. Please look at ZRANGEBYLEX for syntax.
The Lex family of sorted set commands lets you query by lexicographical order among members that share the same score.
For example, I have a set in Redis:
5 7 11 15 19 2 1
I want to find the upper bound or lower bound of 12 in Redis.
They are 15 and 11 in this example.
How can I do it efficiently?
I can use a set or a sorted set.
Thanks!
I can think of two ways, neither of them perfect.
First, if they are all in a set, you can check ($r->sIsMember()) for the number 12 and then iteratively walk up and down until a match is found in each direction. This is not great, and I would suggest a Lua script to avoid a ton of back-and-forth if you go that route.
Second, put them in a sorted set, using the numbers as scores and some kind of primary key as the member. Then you add a temporary reference member with the target score, get its zRank(), and use zRange() to get the values just before and after it. I will do that here:
add them to a sorted set:
zadd mysset 5 "one"
zadd mysset 7 "two"
zadd mysset 11 "three"
zadd mysset 15 "four"
zadd mysset 19 "five"
zadd mysset 2 "six"
zadd mysset 1 "seven"
add one on the fly for reference
zadd mysset 12 "center"
now get the rank of that reference:
zRank mysset "center"
// 5 in this case (ranks are zero-based)
now get the ones just above and below, with scores:
zRange mysset 6 6 WITHSCORES
zRange mysset 4 4 WITHSCORES
// "four" 15
// "three" 11
good luck, have fun!
edit: oh yeh, remove the reference number:
zrem mysset "center"
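The neighbour lookup itself is a binary search over sorted values. The sketch below shows that logic in plain Python, assuming the numbers fit in client memory; the function name neighbours is made up for this example. On the Redis side, a score-range query with an exclusive bound and a limit of one (for example ZRANGEBYSCORE mysset (12 +inf LIMIT 0 1 and ZREVRANGEBYSCORE mysset (12 -inf LIMIT 0 1) should return the same two neighbours without the temporary member, though that variant is not part of the answer above.

```python
from bisect import bisect_left, bisect_right

def neighbours(sorted_values, target):
    """Return (largest value below target, smallest value above target).
    Either side is None when no such value exists."""
    lo = bisect_left(sorted_values, target)   # first index with value >= target
    hi = bisect_right(sorted_values, target)  # first index with value > target
    below = sorted_values[lo - 1] if lo > 0 else None
    above = sorted_values[hi] if hi < len(sorted_values) else None
    return below, above

values = sorted([5, 7, 11, 15, 19, 2, 1])
result = neighbours(values, 12)
print(result)  # (11, 15)
```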
Since the zrange documentation states that "Lexicographical order is used for elements with equal score", how do I get around this issue?
For example:
zadd s 0 1
zadd s 0 2
zadd s 0 10
zadd s 0 3
zrange s 0 4
1) 1
2) 10
3) 2
4) 3
How do I make it sort like this (while of course still honoring the score):
1) 1
2) 2
3) 3
4) 10
You cannot alter the lexicographic order.
However, you could store a value whose lexicographic order matches with numerical order. For instance instead of storing:
1
2
12
15
122
321
you could store:
A1
A2
B12
B15
C122
C321
The first letter is just a code to indicate the number of digits of the numerical value (A=1, B=2, etc ...), so that numerical and lexicographic order is the same. The client application can easily add/remove this prefix at store/retrieve time.
zadd s 0 A1
zadd s 0 A2
zadd s 0 B10
zadd s 0 A3
zrange s 0 4
1) "A1"
2) "A2"
3) "A3"
4) "B10"
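The prefixing can be done in a few lines on the client. This is a minimal sketch of the scheme described above, assuming non-negative integers with at most 26 digits (one letter per digit count); the function names encode and decode are illustrative only.

```python
def encode(n):
    """Prefix n with a letter for its digit count: A = 1 digit, B = 2 digits, ..."""
    digits = str(n)
    if n < 0 or len(digits) > 26:
        raise ValueError("only non-negative integers with at most 26 digits are supported")
    return chr(ord("A") + len(digits) - 1) + digits

def decode(member):
    """Strip the one-letter prefix and return the original integer."""
    return int(member[1:])

numbers = [1, 2, 10, 3, 122, 15, 321, 12]
encoded = sorted(encode(n) for n in numbers)  # plain string (lexicographic) sort
decoded = [decode(m) for m in encoded]

print(encoded)  # ['A1', 'A2', 'A3', 'B10', 'B12', 'B15', 'C122', 'C321']
print(decoded)  # [1, 2, 3, 10, 12, 15, 122, 321]
```

Because the prefix letter grows with the number of digits, and numbers with the same digit count already compare correctly as strings, the lexicographic order of the encoded members matches numeric order.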