Get random item from sorted set in Redis - redis

I needed to implement a set of items with individual expiration times, so I used a sorted set (zset) with the expiration timestamp as the score.
Now I want to get a random item from the range of non-expired items, or at least from all items in the set.
How can I do it?
Can I get the min and max rank of the range and pick a random rank between them via Lua scripting?
Redis version: 5.0.2

I solved this with the following script:
-- KEYS[1] - set key
-- ARGV[1] - seed timestamp
local count = redis.call('ZCARD', KEYS[1])
if count ~= 0 then
    math.randomseed(ARGV[1])
    local rank = math.random(0, count - 1)
    local range = redis.call('ZRANGE', KEYS[1], rank, rank)
    return range[1]
else
    return ''
end
Because I search among all items, I purge expired items every n seconds.

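Possible change: generate the seed inside the script instead of passing it as ARGV[1]. Note that os.time() is not exposed in Redis's Lua sandbox; redis.call('TIME') is the in-script way to read the clock.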
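To pick only among non-expired items, the random rank can be restricted to the part of the sorted set whose score is greater than the current time. In Redis this would roughly mean combining ZCARD and ZCOUNT to find the first live rank before calling ZRANGE. The sketch below is only an in-memory Python model of that rank arithmetic, not Redis client code; `random_live_member` and the sample data are hypothetical names used for illustration.

```python
import bisect
import random


def random_live_member(entries, now, rng=random):
    """Pick a random non-expired member.

    entries: list of (expire_ts, member) tuples sorted by expire_ts,
             standing in for a zset scored by expiration timestamp.
    Returns None when every entry has expired (score <= now).
    """
    scores = [score for score, _ in entries]
    # Rank of the first entry whose expiration is strictly after `now`.
    first_live = bisect.bisect_right(scores, now)
    if first_live == len(entries):
        return None
    # Uniform random rank inside the live range [first_live, len - 1].
    rank = rng.randrange(first_live, len(entries))
    return entries[rank][1]


if __name__ == "__main__":
    sample = [(100, "a"), (200, "b"), (300, "c"), (400, "d")]
    print(random_live_member(sample, now=250))  # prints either "c" or "d"
```

With this approach the periodic cleanup is still useful to bound memory, but it no longer affects which items can be returned.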

Related

Increment Redis counter used as a value only when the key is unique

I have to count unique entries from a stream of transactions using Redis. There will be at least 1K jobs concurrently checking whether a transaction is unique and, if it is, storing the transaction type as the key with an incremented counter as the value. This counter is shared by all threads.
If all threads do the following:
1. Check if the key exists: exists(transactionType)
2. Increment the counter: val count = incr(counter)
3. Set the new value: setnx(transactionType, count)
this creates two problems:
1. The counter is incremented unnecessarily, because another thread may already have set the value for that key.
2. It takes three operations: an exists, an increment, and then an insert.
Is there a better way to increment the counter and set the value only if the key does not exist?
private void checkAndIncrement(String transactionType, Jedis redisHandle) {
    if (transactionType != null) {
        // Only allocate a counter value when the key does not exist yet.
        if (!redisHandle.exists(transactionType)) {
            long count = redisHandle.incr("t_counter");
            redisHandle.setnx(transactionType, "" + count);
        }
    }
}
EDIT:
Once a value is created as say T1 = 100, the transaction should also be identifiable with the number 100. I would have to store another map with counter as key and transaction type as value.
Two options:
Use a hash, HSETNX to add keys to the hash (just set the value to 1 or "" or anything), and HLEN to get the count of keys in the hash. You can always start over with HDEL. You could also use HINCRBY instead of HSETNX to additionally find out how many times each key appears.
Use a hyperloglog. Use PFADD to insert elements and PFCOUNT to retrieve the count. HyperLogLog is a probabilistic algorithm; the memory usage for a HLL doesn't go up with the number of unique items the way a hash does, but the count returned is only approximate (usually within about 1% of the true value).
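The EDIT in the question also asks for a lookup from counter value back to transaction type. A minimal sketch of that "assign an id only when the key is new" logic is below, as a plain in-memory Python model rather than Redis calls. `TransactionRegistry` and its fields are hypothetical names. In Redis the check and the increment would have to run as one atomic step (for example inside a Lua script) to avoid the wasted increments described in the question.

```python
class TransactionRegistry:
    """In-memory model: each unique transaction type gets the next counter value."""

    def __init__(self):
        self.counter = 0
        self.id_by_type = {}   # transaction type -> assigned counter value
        self.type_by_id = {}   # counter value -> transaction type (reverse map)

    def get_or_assign(self, transaction_type):
        existing = self.id_by_type.get(transaction_type)
        if existing is not None:
            # Already known: do not touch the counter.
            return existing
        self.counter += 1
        self.id_by_type[transaction_type] = self.counter
        self.type_by_id[self.counter] = transaction_type
        return self.counter


if __name__ == "__main__":
    registry = TransactionRegistry()
    for t in ["T1", "T2", "T1", "T3"]:
        print(t, registry.get_or_assign(t))
```

Because the counter only moves when a new type is stored, the number of unique types is simply the current counter value.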

Queue or other methods to handle tick data?

In our electronic trading system, we need to do calculation based on tick data from 100+ contracts.
Tick data for all contracts does not arrive in one message; each message carries tick data for a single contract. The timestamps of the contracts differ slightly (sometimes by a lot, but let's ignore that case).
E.g. (the first column is the timestamp, the second is the contract name):
The two ticks below are 1 ms apart:
10:34:03.235,10002007,510050C2006A03500 ,0.0546
10:34:03.236,10001909,510050C2003A02750 ,0.3888
The two ticks below are 3 ms apart:
10:34:03.594,10002154,510300C2003M03700 ,0.4985
10:34:03.597,10002118,510300C2001M03700 ,0.4514
Only contracts whose price changed produce data, so I can't count contracts to know whether I have received all data for a tick.
On the other hand, we don't want to wait until all data for the tick has arrived, because data can sometimes be very late, and in that case we want to exclude it.
Low latency is required, so I think we will define a window, say 50 ms, and calculate based on whatever data was received in the past 50 ms.
What is the best way to handle this use case?
Originally I wanted to use a Redis stream to maintain a small queue: whenever a contract's data is received, I push it to the stream. But I couldn't figure out the best way to pull the data as soon as a specific time (say 50 ms) has passed.
Should I use some other technology instead?
Any suggestions are appreciated.
Use XRANGE myStream - + COUNT 1 to get the first entry.
Use XREVRANGE myStream + - COUNT 1 to get the last entry.
XINFO STREAM myStream also returns the first and last entries, but the docs say it is O(log N).
Assuming you are using a timestamp as ID, or as a field, then you can compute the time difference.
If you are using Redis Streams auto-ID (XADD myStream * ...), the first part of the ID is the UNIX timestamp in milliseconds.
Assuming the above, you can do the check atomically with a Lua script:
EVAL "local first = redis.call('XRANGE', KEYS[1], '-', '+', 'COUNT', '1') local firstTime = {} if next(first) == nil then return redis.error_reply('Stream is empty or key doesn`t exist') end for str in string.gmatch(first[1][1], '([^-]+)') do table.insert(firstTime, tonumber(str)) end local last = redis.call('XREVRANGE', KEYS[1], '+', '-', 'COUNT', '1') local lastTime = {} for str in string.gmatch(last[1][1], '([^-]+)') do table.insert(lastTime, tonumber(str)) end local ms = lastTime[1] - firstTime[1] if ms >= tonumber(ARGV[1]) then return redis.call('XRANGE', KEYS[1], '-', '+') else return redis.error_reply('Only '..ms..' ms') end" 1 myStream 50
The arguments are numKeys (1 here), streamKey, and timeInMs (50 here): 1 myStream 50.
Here is a more readable view of the Lua script:
local first = redis.call('XRANGE', KEYS[1], '-', '+', 'COUNT', '1')
local firstTime = {}
if next(first) == nil then
    return redis.error_reply('Stream is empty or key doesn`t exist')
end
for str in string.gmatch(first[1][1], '([^-]+)') do
    table.insert(firstTime, tonumber(str))
end
local last = redis.call('XREVRANGE', KEYS[1], '+', '-', 'COUNT', '1')
local lastTime = {}
for str in string.gmatch(last[1][1], '([^-]+)') do
    table.insert(lastTime, tonumber(str))
end
local ms = lastTime[1] - firstTime[1]
if ms >= tonumber(ARGV[1]) then
    return redis.call('XRANGE', KEYS[1], '-', '+')
else
    return redis.error_reply('Only '..ms..' ms')
end
It returns:
(error) Stream is empty or key doesn`t exist
(error) Only 34 ms, if the required time has not elapsed yet
The actual list of entries, if the required time between the first and last message has elapsed.
Make sure to check Introduction to Redis Streams to get familiar with Redis Streams, and EVAL command to learn about Lua scripts.
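The core of the script above is the ID arithmetic: with auto-generated IDs of the form `<milliseconds>-<sequence>`, the window check compares the millisecond parts of the first and last IDs. A small Python sketch of just that step follows; `stream_id_ms` and `window_ready` are hypothetical helper names, and the IDs are made-up sample values.

```python
def stream_id_ms(entry_id):
    """Extract the millisecond timestamp from a stream ID such as '1526919030474-55'."""
    return int(entry_id.split("-", 1)[0])


def window_ready(first_id, last_id, window_ms):
    """Return (ready, elapsed_ms) for the span between the first and last entry."""
    elapsed = stream_id_ms(last_id) - stream_id_ms(first_id)
    return elapsed >= window_ms, elapsed


if __name__ == "__main__":
    print(window_ready("1526919030474-0", "1526919030508-2", 50))
    print(window_ready("1526919030474-0", "1526919030530-0", 50))
```

Note that this measures the time between the first and last entries in the stream, so the window only closes once a later entry arrives.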

How to get same rank for same scores in Redis' ZRANK?

If I have 5 members with scores as follows
a - 1
b - 2
c - 3
d - 3
e - 5
ZRANK of c returns 2, ZRANK of d returns 3
Is there a way to get same rank for same scores?
Example: ZRANK c = 2, d = 2, e = 3
If yes, then how to implement that in spring-data-redis?
Any real solution needs to fit the requirements, which are kind of missing in the original question. My 1st answer had assumed a small dataset, but this approach does not scale as dense ranking is done (e.g. via Lua) in O(N) at least.
So, assuming that there are a lot of users with scores, the direction that for_stack suggested is better, in which multiple data structures are combined. I believe this is the gist of his last remark.
To store users' scores you can use a Hash. While conceptually you can use a single key to store a Hash of all users scores, in practice you'd want to hash the Hash so it will scale. To keep this example simple, I'll ignore Hash scaling.
This is how you'd add (update) a user's score in Lua:
local hscores_key = KEYS[1]
local user = ARGV[1]
local increment = ARGV[2]
local new_score = redis.call('HINCRBY', hscores_key, user, increment)
Next, we want to track the current count of users per discrete score value so we keep another hash for that:
local old_score = new_score - increment
local hcounts_key = KEYS[2]
local old_count = redis.call('HINCRBY', hcounts_key, old_score, -1)
local new_count = redis.call('HINCRBY', hcounts_key, new_score, 1)
Now, the last thing we need to maintain is the per score rank, with a sorted set. Every new score is added as a member in the zset, and scores that have no more users are removed:
local zdranks_key = KEYS[3]
if new_count == 1 then
    redis.call('ZADD', zdranks_key, new_score, new_score)
end
if old_count == 0 then
    redis.call('ZREM', zdranks_key, old_score)
end
The complexity of this three-part script is O(log N) due to the use of the Sorted Set, but note that N is the number of discrete score values, not the number of users in the system. Getting a user's dense ranking is done via another, shorter and simpler script:
local hscores_key = KEYS[1]
local zdranks_key = KEYS[2]
local user = ARGV[1]
local score = redis.call('HGET', hscores_key, user)
return redis.call('ZRANK', zdranks_key, score)
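To see how the three structures interact, here is a minimal in-memory Python model of the same bookkeeping: a dict for user scores, a dict for per-score user counts, and a sorted list of distinct scores standing in for the Sorted Set. `DenseRanking` is a hypothetical name, not part of any Redis client. One assumption differs from the Lua above: this sketch treats a first-time user explicitly, so no count is decremented for a score the user never held.

```python
import bisect


class DenseRanking:
    def __init__(self):
        self.scores = {}     # user -> score (the scores Hash)
        self.counts = {}     # score -> number of users (the counts Hash)
        self.distinct = []   # sorted distinct scores (the Sorted Set)

    def incr(self, user, increment):
        old = self.scores.get(user)
        new = (old or 0) + increment
        self.scores[user] = new
        if old is not None:
            # The user leaves the old score; drop the score if nobody holds it.
            self.counts[old] -= 1
            if self.counts[old] == 0:
                del self.counts[old]
                self.distinct.pop(bisect.bisect_left(self.distinct, old))
        # The user joins the new score; register the score if it is new.
        self.counts[new] = self.counts.get(new, 0) + 1
        if self.counts[new] == 1:
            bisect.insort(self.distinct, new)
        return new

    def rank(self, user):
        score = self.scores.get(user)
        if score is None:
            return None
        # Dense rank = position of the score among the distinct scores.
        return bisect.bisect_left(self.distinct, score)


if __name__ == "__main__":
    ranking = DenseRanking()
    for user, score in [("a", 1), ("b", 2), ("c", 3), ("d", 3), ("e", 5)]:
        ranking.incr(user, score)
    print({u: ranking.rank(u) for u in "abcde"})
```

With the data from the question, c and d share rank 2 and e gets rank 3, which is the dense ranking that was asked for.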
You can achieve the goal with two Sorted Sets: one for the member-to-score mapping, and one for the score-to-rank mapping.
Add
Add items to the member-to-score mapping: ZADD mem_2_score 1 a 2 b 3 c 3 d 5 e
Add the scores to the score-to-rank mapping: ZADD score_2_rank 1 1 2 2 3 3 5 5
Search
Get the score first: ZSCORE mem_2_score c, which should return the score, i.e. 3.
Get the rank for the score: ZRANK score_2_rank 3, which should return the dense ranking, i.e. 2.
To run this atomically, wrap the Add and Search operations in two Lua scripts.
Then there's this Pull Request - https://github.com/antirez/redis/pull/2011 - which is dead, but appears to make dense rankings on the fly. The original issue/feature request (https://github.com/antirez/redis/issues/943) got some interest so perhaps it is worth reviving it /cc #antirez :)
The rank is unique in a sorted set, and elements with the same score are ordered (ranked) lexically.
There is no Redis command that does this "dense ranking".
You could, however, use a Lua script that fetches a range from a sorted set and reduces it to the form you requested. This could work on small data sets, but you'd have to devise something more complex to scale.
unsigned long zslGetRank(zskiplist *zsl, double score, sds ele) {
    zskiplistNode *x;
    unsigned long rank = 0;
    int i;

    x = zsl->header;
    for (i = zsl->level-1; i >= 0; i--) {
        while (x->level[i].forward &&
            (x->level[i].forward->score < score ||
                (x->level[i].forward->score == score &&
                sdscmp(x->level[i].forward->ele,ele) <= 0))) {
            rank += x->level[i].span;
            x = x->level[i].forward;
        }

        /* x might be equal to zsl->header, so test if obj is non-NULL */
        if (x->ele && x->score == score && sdscmp(x->ele,ele) == 0) {
            return rank;
        }
    }
    return 0;
}
https://github.com/redis/redis/blob/b375f5919ea7458ecf453cbe58f05a6085a954f0/src/t_zset.c#L475
This is the code Redis uses to compute the rank in sorted sets. Right now, it just returns the rank based on the position in the skiplist (which is sorted by score).
What does the skiplistnode variable "span" mean in redis.h? (what is span ?)

How do I represent this data using Redis?

I want to be able to store data such as "store x is open between 9am and 5pm on Monday but only between 9am and 12pm on Saturday".
What's the best way to store this using redis?
I would later like to query it using something like this. Show me all stores that are open on Saturday at 10:30am
In Redis, like most if not all other NoSQL databases, you want to store your data in the manner that's most suitable for answering the query. There are quite a few ways you can represent this data and answer the query; choosing between them requires knowledge of the other access patterns that you need to support.
However, in the context of this specific question alone, the simplest way of doing that IMO is to use two Sorted Sets for each day of the week. Assuming that stores are open continuously and at most once each day (i.e. no siestas), the members of these Sorted Sets should be the store ids and the scores their opening hours: the first Sorted Set's scores denote the time that the store opens, whereas the second's denote the time it closes. For example:
ZADD monday:open 9 store:x
ZADD monday:close 17 store:x
ZADD saturday:open 9 store:x
ZADD saturday:close 12 store:x
Once you have all the Sorted Sets in place, answering the query requires two calls to ZRANGEBYSCORE and intersecting the results. The snippet below demonstrates how to do it in Lua, since running it as a server-side script will in most cases be more efficient than moving all the data to the client.
Note: an alternative approach to doing the intersect in Lua is actually storing the temporary results in Redis' Sets and calling SINTER.
-- helper function to make a "set" out of a table
local function makeset(t)
    local r = {}
    for _, v in ipairs(t) do r[v] = true end
    return(r)
end

-- get opening and closing hours for a given day
local ot = redis.call('ZRANGEBYSCORE', KEYS[1], '-inf', ARGV[1])
local ct = redis.call('ZRANGEBYSCORE', KEYS[2], '(' .. ARGV[1], '+inf')

-- convert to sets and choose the smaller set as s1
local s1 = {}
local s2 = {}
if #ot < #ct then
    s1 = makeset(ot)
    s2 = makeset(ct)
else
    s1 = makeset(ct)
    s2 = makeset(ot)
end

-- intersect s1 and s2
local t = {}
for k in pairs(s1) do
    t[k] = s2[k]
end

-- prepare a response table
local r = {}
for k in pairs(t) do
    r[#r+1] = k
end
return(r)
Run this script by passing to it the two keys and the hour, like so:
redis-cli --eval storehours.lua saturday:open saturday:close , 10.5
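The query logic itself is just "opened at or before the given time" intersected with "closes strictly after it". A small in-memory Python sketch of that intersection follows, using plain dicts in place of the two Sorted Sets. `open_stores` is a hypothetical helper, and `store:y` is made-up sample data that is not in the original example.

```python
def open_stores(open_hours, close_hours, at):
    """Return the set of store ids that are open at time `at`.

    open_hours / close_hours: dicts of store id -> hour, standing in for the
    '<day>:open' and '<day>:close' Sorted Sets.
    """
    # Mirrors ZRANGEBYSCORE <day>:open -inf <at>
    opened = {store for store, hour in open_hours.items() if hour <= at}
    # Mirrors ZRANGEBYSCORE <day>:close (<at> +inf (exclusive lower bound)
    not_closed = {store for store, hour in close_hours.items() if hour > at}
    return opened & not_closed


if __name__ == "__main__":
    saturday_open = {"store:x": 9, "store:y": 11}
    saturday_close = {"store:x": 12, "store:y": 18}
    print(open_stores(saturday_open, saturday_close, 10.5))
```

The exclusive bound on the closing time means a store that closes at 12 is not reported as open at exactly 12.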

Redis: Sum of SCORES in Sorted Set

What's the best way to get the sum of SCORES in a Redis sorted set?
The only option I think is iterating the sorted set and computing the sum client side.
Since Redis v2.6 you can execute Lua scripts on the Redis server, which makes summing a Sorted Set's scores trivial:
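local sum = 0
local z = redis.call('ZRANGE', KEYS[1], 0, -1, 'WITHSCORES')
-- with WITHSCORES the reply alternates member, score, so scores sit at even indexes
for i = 2, #z, 2 do
    sum = sum + z[i]
end
-- note: Redis converts a Lua number reply to an integer, so a fractional sum
-- is truncated; return tostring(sum) instead to keep the decimals
return sum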
Runtime example:
~$ redis-cli zadd z 1 a 2 b 3 c 4 d 5 e
(integer) 5
~$ redis-cli eval "local sum=0 local z=redis.call('ZRANGE', KEYS[1], 0, -1, 'WITHSCORES') for i=2, #z, 2 do sum=sum+z[i] end return sum" 1 z
(integer) 15
If the sets are small, and you don't need killer performance, I would just iterate (zrange/zrangebyscore) and sum the values client side.
If, on the other hand, you are talking about many thousands to millions of items, you can always keep a reference set with running totals for each user and increment/decrement them as the gifts are sent.
So when you do your ZINCRBY 123:gifts 1 "3|345", you could issue a separate ZINCRBY command, something like this:
ZINCRBY received-gifts 1 <user_id>
Then, to get the number of gifts received by a given user, you just need to run a ZSCORE:
ZSCORE received-gifts <user_id>
Here is a small Lua script that maintains the zset score total as you go, in a counter whose key is suffixed with '.ss'. You can use it instead of ZADD.
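local delta = 0
-- ARGV holds score/member pairs, in the same order ZADD expects
for i = 1, #ARGV, 2 do
    local oldScore = redis.call('zscore', KEYS[1], ARGV[i+1])
    if oldScore == false then
        oldScore = 0
    end
    delta = delta - oldScore + ARGV[i]
end
local val = redis.call('zadd', KEYS[1], unpack(ARGV))
-- INCRBY needs an integer delta; use INCRBYFLOAT if scores can be fractional
redis.call('INCRBY', KEYS[1]..'.ss', delta)
return val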
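The idea behind the script is that each write adjusts the running total by the difference between the new score and the member's previous score (0 for a new member). Below is a minimal in-memory Python model of that bookkeeping; `SummedScores` is a hypothetical name and this is not Redis client code. One assumption is labeled here: the sketch updates the stored score pair by pair, so a member that appears twice in one call is still counted correctly.

```python
class SummedScores:
    """Sorted-set stand-in that keeps the sum of all scores up to date."""

    def __init__(self):
        self.scores = {}   # member -> score
        self.total = 0     # running sum, like the '<key>.ss' counter

    def zadd(self, *pairs):
        """Add or update (score, member) pairs and return the new total."""
        for score, member in pairs:
            # Adjust by the change in this member's score.
            self.total += score - self.scores.get(member, 0)
            self.scores[member] = score
        return self.total


if __name__ == "__main__":
    z = SummedScores()
    z.zadd((1, "a"), (2, "b"), (3, "c"))
    z.zadd((10, "a"))          # update an existing member
    print(z.total, sum(z.scores.values()))
```

Reading the sum then costs a single counter lookup instead of a scan over the whole set.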