How to get same rank for same scores in Redis' ZRANK? - redis

If I have 5 members with scores as follows
a - 1
b - 2
c - 3
d - 3
e - 5
ZRANK of c returns 2, ZRANK of d returns 3
Is there a way to get same rank for same scores?
Example: ZRANK c = 2, d = 2, e = 3
If yes, then how to implement that in spring-data-redis?

Any real solution needs to fit the requirements, which are kind of missing in the original question. My 1st answer had assumed a small dataset, but this approach does not scale as dense ranking is done (e.g. via Lua) in O(N) at least.
So, assuming that there are a lot of users with scores, the direction that for_stack suggested is better, in which multiple data structures are combined. I believe this is the gist of his last remark.
To store users' scores you can use a Hash. While conceptually you can use a single key to store a Hash of all users scores, in practice you'd want to hash the Hash so it will scale. To keep this example simple, I'll ignore Hash scaling.
This is how you'd add (update) a user's score in Lua:
local hscores_key = KEYS[1]
local user = ARGV[1]
local increment = ARGV[2]
local new_score = redis.call('HINCRBY', hscores_key, user, increment)
Next, we want to track the current count of users per discrete score value so we keep another hash for that:
local old_score = new_score - increment
local hcounts_key = KEYS[2]
local old_count = redis.call('HINCRBY', hcounts_key, old_score, -1)
local new_count = redis.call('HINCRBY', hcounts_key, new_score, 1)
Now, the last thing we need to maintain is the per score rank, with a sorted set. Every new score is added as a member in the zset, and scores that have no more users are removed:
local zdranks_key = KEYS[3]
if new_count == 1 then
redis.call('ZADD', zdranks_key, new_score, new_score)
end
if old_count == 0 then
redis.call('ZREM', zdranks_key, old_score)
end
This 3-piece-script's complexity is O(logN) due to the use of the Sorted Set, but note that N is the number of discrete score values, not the users in the system. Getting a user's dense ranking is done via another, shorter and simpler script:
local hscores_key = KEYS[1]
local zdranks_key = KEYS[2]
local user = ARGV[1]
local score = redis.call('HGET', hscores_key, user)
return redis.call('ZRANK', zdranks_key, score)

You can achieve the goal with two Sorted Set: one for member to score mapping, and one for score to rank mapping.
Add
Add items to member to score mapping: ZADD mem_2_score 1 a 2 b 3 c 3 d 5 e
Add the scores to score to rank mapping: ZADD score_2_rank 1 1 2 2 3 3 5 5
Search
Get score first: ZSCORE mem_2_score c, this should return the score, i.e. 3.
Get the rank for the score: ZRANK score_2_rank 3, this should return the dense ranking, i.e. 2.
In order to run it atomically, wrap the Add, and Search operations into 2 Lua scripts.

Then there's this Pull Request - https://github.com/antirez/redis/pull/2011 - which is dead, but appears to make dense rankings on the fly. The original issue/feature request (https://github.com/antirez/redis/issues/943) got some interest so perhaps it is worth reviving it /cc #antirez :)

The rank is unique in a sorted set, and elements with the same score are ordered (ranked) lexically.
There is no Redis command that does this "dense ranking"
You could, however, use a Lua script that fetches a range from a sorted set, and reduces it to your requested form. This could work on small data sets, but you'd have to devise something more complex for to scale.

unsigned long zslGetRank(zskiplist *zsl, double score, sds ele) {
zskiplistNode *x;
unsigned long rank = 0;
int i;
x = zsl->header;
for (i = zsl->level-1; i >= 0; i--) {
while (x->level[i].forward &&
(x->level[i].forward->score < score ||
(x->level[i].forward->score == score &&
sdscmp(x->level[i].forward->ele,ele) <= 0))) {
rank += x->level[i].span;
x = x->level[i].forward;
}
/* x might be equal to zsl->header, so test if obj is non-NULL */
if (x->ele && x->score == score && sdscmp(x->ele,ele) == 0) {
return rank;
}
}
return 0;
}
https://github.com/redis/redis/blob/b375f5919ea7458ecf453cbe58f05a6085a954f0/src/t_zset.c#L475
This is the piece of code redis uses to compute the rank in sorted sets. Right now ,it just gives rank based on the position in the Skiplist (which is sorted based on scores).
What does the skiplistnode variable "span" mean in redis.h? (what is span ?)

Related

Store targets as collections that handle logic operation

I think my title is kinda unclear but I don't konw how to tell that otherwise.
My problem is:
We have users that belong to groups, there are many types of groups and any user belong to exaclty one group for each type.
Example: With group types A, B and C, containing respectively the groups (A1; A2; A3), (B1; B2) and (C1; C2; C3)
Every User must have a list of groups like [A1, B1, C1] or [A1, B2, C3] but never [A1, A2, B1] or [A1, C2]
We have messages that target to certain groups but not just a union, it can be more complex collection operations
Example: we can have message intended to [A1, B1, C3], [A1, *, *], [A1|A2, *, *] or even like ([A1, B1, C2] | [A2, B2, C1])
(* = any group of the type, | = or)
Messages are stored in a SQL DB, and users can retrieve all messages intended to their groups
How may I store messages and make my Query to reproduce this behavior ?
An option could be to encode both the user groups and the message targets in a (big) integer built on the powers of 2, and then base your query on a bitwise AND between user group code and message target code.
The idea is, group 1 is 1, group 2 is 2, group 3 is 4 and so on.
Level 1:
Assumptions:
you know in advance how many group types you have, and you have very few of them
you don't have more than 64 groups per type (assuming you work with 64-bit integers)
the message has only one target: A1|A2,B..,C... is ok, A*,B...,C... is ok, (A1,B1,C1)|(A2,B2,C2) is not.
Solution:
Encode each user group as the corresponding power of 2
Encode each message target as the sum of the allowed values: if groups 1 and 3 are allowed (A1|A3) the code will be 1+4=5, if all groups are allowed (A*) the code will be 2**64-1
you will have a User table and a Message table, and both will have one field for each group type code
The query will be WHERE (u.g1 & m.g1) * (u.g2 & m.g2) * ... * (u.gN & m.gN) <> 0
Level 2:
Assumptions:
you have some more group types, and/or you don't know in advance how many they are, or how they are composed
you don't have more than 64 groups in total (e.g. 10 for the first type, 12 for the second, ...)
the message still has only one target as above
Solution:
encode each user group and each message target as a single integer, taking care of the offset: if the first type has 10 groups they will be encoded from 1 to 1023 (2**10-1), then if the second type has 12 groups they will go from 1024 (2**10) to 4194304 (2**(10+12)-1), and so on
you will still have a User table and a Message table, and both will have one single field for the cumulative code
you will need to define a function which is able to check the user group vs the message target separately by each range; this can be difficult to do in SQL, and depends on which engine you are using
following is a Python implementation of both the encoding and the check
class IdEncoder:
def __init__(self, sizes):
self.sizes = sizes
self.grouplimits = {}
offset = 0
for i,size in enumerate(sizes):
self.grouplimits[i] = (2**offset, 2**(offset + size)-1)
offset += size
def encode(self, vals):
n = 0
for i, val in enumerate(vals):
if val == '*':
g = self.grouplimits[i][1] - self.grouplimits[i][0] + 1
else:
svals = val.split('|')
g = 0
for sval in svals:
g += 2**(int(sval)-1)
if i > 0:
g *= self.grouplimits[i][0]
print(g)
n += g
return n
def check(self, user, message):
res = False
for i,size in enumerate(self.sizes):
if user%2**size & message%2**size == 0:
break
if i < len(self.sizes)-1:
user >>= size
message >>= size
else:
res = True
return res
c = IdEncoder([10,12,10])
m3 = c.encode(['1|2','*','*'])
u1 = c.encode(['1','1','1'])
c.check(u1,m3)
True
u2=c.encode(['4','1','1'])
c.check(u2,m3)
False
Level 3:
Assumptions:
you adopt one of the above solutions, but you need multiple targets for each message
Solution:
You will need a third table, MessageTarget, containing the target code fields as above and a FK linking to the message
The query will search for all the MessageTarget rows compatible with the User group code(s) and show the related Message data
So you have 3 main tables:
Messages
Users
Groups
You then create 2 relationship tables:
Message-Group
User-Group
If you want to limit users to have access to just "their" messages then you join:
User > User-Group > Message-Group > Message

How can i add an array of values to Google ortools versus a lower and upper bound?

In the documentation and all examples I can find... in terms of nurse scheduling at least, everyone just declares shift values within the search space of {1,4} lets say for shift 1,2,3,4....
solver = pywrapcp.Solver("schedule_shifts")
num_nurses = 4
num_shifts = 4 # Nurse assigned to shift 0 means not working that day.
num_days = 7
# [START]
# Create shift variables.
shifts = {}
for j in range(num_nurses):
for i in range(num_days):
shifts[(j, i)] = solver.IntVar(0, num_shifts - 1, "shifts(%i,%i)" % (j, i))
shifts_flat = [shifts[(j, i)] for j in range(num_nurses) for i in range(num_days)]
# Create nurse variables.
nurses = {}
for j in range(num_shifts):
for i in range(num_days):
nurses[(j, i)] = solver.IntVar(0, num_nurses - 1, "shift%d day%d" % (j,i))
I want to avoid the use of range of values when I call solver.IntVar(lowerbound, upperbound, ...)
I want IntSolver([available values that you can choose], ...)
I created a matrix of all shifts as the columns flowing from the first day to last. My row indexes don't matter but in each day/shift column, I have the index values of nurses in ranked descending order of who bid the highest for that shift. I want to create then a constraint where if I choose a nurse, I choose the maximum bid that is allowed via other constraints from the column, however I don't know how to do that given the limited documentation ortools has with python IntVar.
Can you try
solver.IntVar([values...], 'name')
It should work.
See https://github.com/google/or-tools/blob/master/examples/python/einav_puzzle2.py

How do I represent this data using Redis?

I want to be able to store data such as "store x is open between 9am and 5pm on Monday but it's only open during 9am and 12pm on Saturday"
What's the best way to store this using redis?
I would later like to query it using something like this. Show me all stores that are open on Saturday at 10:30am
In Redis, like most if not all other NoSQL databases, you want to store your data in the manner that's most suitable for answering the query. There are quite a few ways you can represent this data and answer the query, choosing between them requires knowledge about the other access patterns that you need to support.
However, in the context of this specific question alone, the simplest way of doing that IMO is to use two Sorted Sets per for each day of the week. Assuming that stores are open continuously and at most once each day (i.e. no siestas), the members of these Sorted Sets should be the store ids and the scores their opening hours - the first Sort Set's scores will denote the time that the store opens whereas the second's the time it closes. For example:
ZADD monday:open 9 store:x
ZADD monday:close 17 store:x
ZADD saturday:open 9 store:x
ZADD saturday:close 12 store:x
Once you have all the Sorted Sets in place, answering the query requires two calls to ZRANGEBYSCORE and intersecting the results. The snippet below demonstrates how to do it using Lua since doing using server scripts will be more efficient than moving the entire thing to the client in most cases.
Note: an alternative approach to doing the intersect in Lua is actually storing the temporary results in Redis' Sets and calling SINTER.
-- helper function to make a "set" out of a table
local function makeset(t)
local r = {}
for _, v in ipairs(t) do r[v] = true end
return(r)
end
-- get opening and closing hours for a given day
local ot = redis.call('ZRANGEBYSCORE', KEYS[1], '-inf', ARGV[1])
local ct = redis.call('ZRANGEBYSCORE', KEYS[2], '(' .. ARGV[1], '+inf')
-- convert to sets and choose the smaller set as s1
local s1 = {}
local s2 = {}
if #ot < #ct then
s1 = makeset(ot)
s2 = makeset(ct)
else
s1 = makeset(ct)
s2 = makeset(ot)
end
-- intersect s1 and s2
local t = {}
for k in pairs(s1) do
t[k] = s2[k]
end
-- prepare a response table
local r = {}
for k in pairs(t) do
r[#r+1] = k
end
return(r)
Run this script by passing to it the two keys and the hour, like so:
redis-cli --eval storehours.lua saturday:open saturday:close , 10.5

option for lexicographical order in zrange?

When i add a score for a key using zincrby, it increases the score and puts the element in lexicographical order.
Can i get this list in the order, in which the elements are updated or added ?
e.g>
If I execute
zincrby A 100 g
zincrby A 100 a
zincrby A 100 z
and then
zrange A 0 -1
then the result is
a->g->z
where, i want the result in order the entries are made so,
g->a->z
As score is same for all, redis is placing the elements in lexicographical order. Is there any way to prevent it ?
I don't think it is possible, but if you want to keep the order of insertion with scores, you should manipulate something like this:
<score><timestamp>
instead of
<score>
You will have to define a good time record (millis should be ok). Then you can use
zincrby A 100 * (10^nbdigitsformillis)
For instance:
Score = 100 and timestamps is 1381377600 seconds
That gives: 1001381377600
You incr by 200 the score: 1001381377600 + 200 * 10 = 3001381377600
Be careful with zset as it stores scores with double values (64 bits, but only 52 available for int value) so don't store more than 15-17 digits.
If you can't do that (need for great timestamp precision, and great score precision), you will have to manage two zsets (one for actual score, one for timestamp) and managing your ranking manual with the two values.

What is the most elegant way to pick a random value from a set defined in NS_OPTION in objective-c?

I have an NS_OPTION that I'm defining as such :
typedef NS_OPTIONS(NSInteger, PermittedSize) {
SmallSize = 1 << 0,
MediumSize = 1 << 1,
LargeSize = 1 << 2
};
And later I set the values I need :
PermittedSize size = SmallSize | MediumSize;
I'm using it to randomly generate an various objects of small and medium sizes(duh) for a particular level of a game.
What is the best way to go about selecting which size of an object to generate? Meaning, I'd like to choose randomly for each object I'm generating whether it will be one of the 2 options allowed (small and medium in this case). Normally I would use an arc4random function with the range of numbers I need - but in this case, how can it be done with bits? (and then mapped back to the values of the PermittedSize type?
Use the result from arc4random to determine the amount of bit shifting you want to do. Something like this:
int bitShiftAmount = arc4random_uniform(numberOfPermittedSizes);
PermittedSize size = 1 << bitShiftAmount;
You are still working with integers. SmallSize is 1. MediumSize is 2. And LargeSize is 4.
So pick a random number from 1 to 3. 1 is small, 2 is medium, 3 is both.
Once you have a random number, assign it.
NSInteger val = arc4random_uniform(3) + 1; // give 1-3
PermittedSize size = (PermittedSize)val;