How to XTRIM (trim a Redis stream) in a safe way - redis

Redis XTRIM keeps a given number of records in a stream and removes the older ones.
Suppose I have 100 records and want to keep only the newest 10 (using the Java/Scala API):
redis = new Jedis(...)
redis.xlen("mystream") // returns 100
// new records arrive stream at here
redis.xtrim("mystream", 10)
However, new records can arrive between the execution of XLEN and XTRIM. Say five arrive: the stream now holds 105 records, and XTRIM MAXLEN 10 keeps entries 96-105. But I wanted to keep the newest 10 as of the XLEN call, i.e. entries 91-100, so entries 91-95 get trimmed as well, which is bad.
Any ideas?

Option 1:
If you always want to keep only the 10 newest ones, use:
XADD mystream MAXLEN 10 * field value ...
By moving the trimming into the addition, the stream stays at 10 entries at all times, and you can drop the separate trimming logic.
In Jedis,
xadd(String key, StreamEntryID id, Map<String,String> hash, long maxLen, boolean approximateLength)
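The effect of trim-on-add (the MAXLEN option, not the Jedis API itself) can be sketched outside Redis with a bounded buffer; the entry names and capacity here are illustrative:

```python
from collections import deque

# A deque with maxlen behaves like XADD ... MAXLEN n:
# every append evicts the oldest entry once capacity is reached,
# so there is never a separate trim step to race against.
stream = deque(maxlen=10)

for i in range(1, 106):          # 105 entries arrive over time
    stream.append(f"entry-{i}")  # each "add" trims atomically

print(len(stream))     # 10
print(stream[0])       # entry-96  (oldest surviving entry)
print(stream[-1])      # entry-105 (newest entry)
```

Because the cap is enforced at insertion time, there is no window in which a reader can observe more than 10 entries.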
Option 2:
Optimistic locking, using a transaction:
WATCH mystream
XLEN mystream
MULTI
XTRIM mystream MAXLEN 10
EXEC
Here, if the stream has been modified between the WATCH and the EXEC, the transaction fails: EXEC returns null and you retry.
Jedis transaction example
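The WATCH/MULTI/EXEC pattern amounts to compare-and-set with retries. A minimal sketch of that shape (the `Store` class is a stand-in for the server, not a Redis client):

```python
class Store:
    """Stand-in for Redis: entries plus a version that bumps on every write."""
    def __init__(self, entries):
        self.entries = list(entries)
        self.version = 0

    def write(self, entries):
        self.entries = list(entries)
        self.version += 1

def trim_with_retry(store, keep, max_retries=5):
    """Keep only the newest `keep` entries, retrying if a writer interferes."""
    for _ in range(max_retries):
        watched = store.version            # WATCH: remember the state we read
        n = len(store.entries)             # XLEN
        # ... a concurrent writer could slip in here ...
        if store.version != watched:       # EXEC returned null -> retry
            continue
        store.write(store.entries[n - keep:])  # XTRIM MAXLEN keep
        return True
    return False

store = Store(range(100))
trim_with_retry(store, 10)
print(store.entries[0], store.entries[-1])   # 90 99
```

With a real client the version check is done by the server at EXEC time; the retry loop is the part you write yourself.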
Option 3:
Use a Lua script to adjust your XTRIM target number. The script can run XLEN, apply your logic to decide how much to trim, and then run XTRIM, all on the Redis server and atomically, so there is no chance of the stream changing in between.
Here are some EVAL examples with Jedis.
Update:
A Lua script like the following should do the trick: atomically read the current number of myStream entries, then call XTRIM accordingly. Note that ARGV[1] here is the number of entries to remove, not the number to keep.
EVAL "local streamLen = redis.call('XLEN', KEYS[1]) \n if streamLen >= tonumber(ARGV[1]) then \n return redis.call('XTRIM', KEYS[1], 'MAXLEN', streamLen - tonumber(ARGV[1])) \n else return redis.error_reply(KEYS[1]..' has less than '..ARGV[1]..' items') end" 1 myStream 90
Let's take a look at the Lua script:
local streamLen = redis.call('XLEN', KEYS[1])
if streamLen >= tonumber(ARGV[1]) then
    return redis.call('XTRIM', KEYS[1], 'MAXLEN', streamLen - tonumber(ARGV[1]))
else
    return redis.error_reply(KEYS[1]..' has less than '..ARGV[1]..' items')
end
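The arithmetic the script performs (ARGV[1] entries removed, the rest kept as the new MAXLEN) can be checked quickly; `new_maxlen` is just an illustrative helper:

```python
def new_maxlen(stream_len, to_remove):
    """Mirror of the script's logic: XTRIM ... MAXLEN (streamLen - ARGV[1])."""
    if stream_len < to_remove:
        # the script returns an error reply in this case
        raise ValueError(f"stream has less than {to_remove} items")
    return stream_len - to_remove

print(new_maxlen(100, 90))   # 10 -> XTRIM myStream MAXLEN 10
```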
I posted more detail in a separate answer to How to delete or remove a given number of entries from a Redis stream?

Related

Redis: How do I calculate time differences in a sorted list time-series?

I'm trying to calculate the response times between messages stored in Redis. But I have no clue how to do this.
First, I store the time-stream of chat_messages like this:
ZADD conversation:CONVERSATION_ID:chat_messages TIMESTAMP CHAT_MESSAGE_ID
ZADD conversation:9:chat_messages 1511533205001 2583
ZADD conversation:9:chat_messages 1511533207057 732016
Afterward, I need the application to be able to calculate the time difference between timestamps using Redis because I need the extra speed of doing it without another (potentially slower) technology.
Is there a way of achieving this using plain Redis or Lua?
The timestamps you are using appear to be in milliseconds, so you only need to subtract and convert to your desired units.
You can get the score using ZSCORE for each message. Or use one of the ZRANGE methods to get multiple messages at once: ZRANGEBYSCORE ... WITHSCORES.
You can use a Lua script to get the time difference:
local t1 = redis.call('ZSCORE', KEYS[1], ARGV[1])
local t2 = redis.call('ZSCORE', KEYS[1], ARGV[2])
return tonumber(t2) - tonumber(t1)
Here the full EVAL command:
EVAL "local t1 = redis.call('ZSCORE', KEYS[1], ARGV[1]) local t2 = redis.call('ZSCORE', KEYS[1], ARGV[2]) return tonumber(t2) - tonumber(t1)" 1 conversation:9:chat_messages 2583 732016
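The same subtraction can of course be done client-side once the two scores are fetched. A sketch in Python, with the scores hard-coded from the ZADDs above:

```python
t1 = 1511533205001   # ZSCORE of message 2583 (milliseconds)
t2 = 1511533207057   # ZSCORE of message 732016 (milliseconds)

diff_ms = t2 - t1
print(diff_ms)           # 2056 -> the reply took ~2 seconds
print(diff_ms / 1000)    # 2.056 seconds
```

The Lua version only buys you one round trip instead of two ZSCORE calls; the arithmetic is identical.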

Is there any recommended value of COUNT for SCAN / HSCAN command in REDIS?

I have understood the meaning of COUNT in the case of REDIS SCAN. But, what is the ideal value for REDIS COUNT ?
The default value is 10. The command returns roughly 10 keys per call: fewer if the keys are sparsely distributed across the hash-table buckets or filtered out by the MATCH pattern, more if several keys share a bucket. Either way, the work performed is proportional to the COUNT parameter.
Redis is single-threaded. One of the reasons SCAN was introduced is to allow going through all the keys without blocking the server for a long time, by going a few steps at a time.
And that's precisely the criterion for choosing a good value: how long are you willing to block your Redis server with a single SCAN call? The higher the COUNT, the longer the block.
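For context, a full SCAN traversal is a cursor loop: you call SCAN repeatedly until the server hands back cursor 0. The loop shape can be sketched against a toy paginated interface (`toy_scan` is illustrative, not a Redis client call):

```python
def toy_scan(keys, cursor, count):
    """Toy stand-in for SCAN: return (next_cursor, page); cursor 0 means done."""
    page = keys[cursor:cursor + count]
    next_cursor = cursor + count
    if next_cursor >= len(keys):
        next_cursor = 0
    return next_cursor, page

keys = [f"key:{i}" for i in range(1000)]

# Same loop shape with a real client: start at cursor 0,
# keep scanning until the server returns cursor 0 again.
cursor, seen = 0, []
while True:
    cursor, page = toy_scan(keys, cursor, count=100)
    seen.extend(page)
    if cursor == 0:
        break

print(len(seen))   # 1000 -- smaller COUNT = more round trips, shorter blocks
```

A smaller COUNT spreads the same total work over more, shorter server-side steps, which is exactly the trade-off discussed above.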
Let's use a Lua script to get a sense of the COUNT impact. Use it on your environment to get the results based on your server resources.
The Lua script:
local t0 = redis.call('TIME')
local res = redis.call('SCAN', ARGV[1], 'COUNT', ARGV[2])
local t1 = redis.call('TIME')
local micros = (t1[1]-t0[1])*1000000 + t1[2]-t0[2]
table.insert(res,'Time taken: '..micros..' microseconds')
table.insert(res,'T0: '..t0[1]..string.format('%06d', t0[2]))
table.insert(res,'T1: '..t1[1]..string.format('%06d', t1[2]))
return res
Here we use the Redis TIME command, which returns two values: the Unix time in seconds, and the microseconds elapsed within the current second.
A few runs in my machine, with 1 million keys:
COUNT   TIME (microseconds)
10          37
100         257
1000        1685
10000       14438
Note these times don't include the time spent reading from the socket, buffering, and sending the response, so actual wall-clock times will be larger. However, once the response has left Redis, time spent traveling the network is not time your Redis server is blocked.
This is how I called the Lua script and the results:
> EVAL "local t0 = redis.call('TIME') \n local res = redis.call('SCAN', ARGV[1], 'COUNT', ARGV[2]) \n local t1 = redis.call('TIME') \n local micros = (t1[1]-t0[1])*1000000 + t1[2]-t0[2] \n table.insert(res,'Time taken: '..micros..' microseconds') \n table.insert(res,'T0: '..t0[1]..string.format('%06d', t0[2])) \n table.insert(res,'T1: '..t1[1]..string.format('%06d', t1[2])) \n return res" 0 0 5
1) "851968"
2) 1) "key:560785"
2) "key:114611"
3) "key:970983"
4) "key:626494"
5) "key:23865"
3) "Time taken: 36 microseconds"
4) "T0: 1580816056349600"
5) "T1: 1580816056349636"

Queue or other methods to handle tick data?

In our electronic trading system, we need to do calculations based on tick data from 100+ contracts.
Tick data is not received in one message: each message includes tick data for only one contract, and the timestamps of different contracts differ slightly (sometimes by a lot, but let's ignore that case).
eg: (first column is timestamp. Second is contract name)
the two rows below are 1 ms apart:
10:34:03.235,10002007,510050C2006A03500 ,0.0546
10:34:03.236,10001909,510050C2003A02750 ,0.3888
the two rows below are 3 ms apart:
10:34:03.594,10002154,510300C2003M03700 ,0.4985
10:34:03.597,10002118,510300C2001M03700 ,0.4514
Only contracts whose price changed produce data, so I can't count contracts to know whether I have received all data for a given tick.
On the other hand, we don't want to wait until we have all data for the tick, because data can sometimes arrive very late and we want to exclude it.
Low latency is required, so I think we will define a window, say 50 ms, and start calculating based on whatever data we received in the past 50 ms.
What will be the best way to handle such use case?
Originally I wanted to use a Redis stream as a small queue: whenever a contract's data is received, I push it to the stream. But I couldn't figure out the best way to pull the data as soon as a specific amount of time (say 50 ms) has passed.
Should I be using some other technique?
Any suggestions are appreciated.
Use XRANGE myStream - + COUNT 1 to get the first entry.
Use XREVRANGE myStream + - COUNT 1 to get the last entry.
XINFO STREAM myStream also brings first and last entry, but the docs say it is O(log N).
Assuming you are using a timestamp as ID, or as a field, then you can compute the time difference.
If you are using Redis Streams auto-ID (XADD myStream * ...), the first part of the ID is the UNIX timestamp in milliseconds.
Assuming the above, you can do the check atomically with a Lua script:
EVAL "local first = redis.call('XRANGE', KEYS[1], '-', '+', 'COUNT', '1') local firstTime = {} if next(first) == nil then return redis.error_reply('Stream is empty or key doesn`t exist') end for str in string.gmatch(first[1][1], '([^-]+)') do table.insert(firstTime, tonumber(str)) end local last = redis.call('XREVRANGE', KEYS[1], '+', '-', 'COUNT', '1') local lastTime = {} for str in string.gmatch(last[1][1], '([^-]+)') do table.insert(lastTime, tonumber(str)) end local ms = lastTime[1] - firstTime[1] if ms >= tonumber(ARGV[1]) then return redis.call('XRANGE', KEYS[1], '-', '+') else return redis.error_reply('Only '..ms..' ms') end" 1 myStream 50
The arguments are numKeys(1 here) streamKey timeInMs(50 here): 1 myStream 50.
Here a friendly view of the Lua script:
local first = redis.call('XRANGE', KEYS[1], '-', '+', 'COUNT', '1')
local firstTime = {}
if next(first) == nil then
    return redis.error_reply('Stream is empty or key doesn`t exist')
end
for str in string.gmatch(first[1][1], '([^-]+)') do
    table.insert(firstTime, tonumber(str))
end
local last = redis.call('XREVRANGE', KEYS[1], '+', '-', 'COUNT', '1')
local lastTime = {}
for str in string.gmatch(last[1][1], '([^-]+)') do
    table.insert(lastTime, tonumber(str))
end
local ms = lastTime[1] - firstTime[1]
if ms >= tonumber(ARGV[1]) then
    return redis.call('XRANGE', KEYS[1], '-', '+')
else
    return redis.error_reply('Only '..ms..' ms')
end
It returns:
(error) Stream is empty or key doesn`t exist, if the stream has no entries;
(error) Only 34 ms, if the required time has not yet elapsed;
the actual list of entries, if the required time between the first and last message has elapsed.
Make sure to check Introduction to Redis Streams to get familiar with Redis Streams, and EVAL command to learn about Lua scripts.
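The ID parsing the script does with string.gmatch looks like this client-side; the IDs below are illustrative auto-generated values, not real stream output:

```python
def id_millis(entry_id):
    """Auto-generated stream IDs have the form '<unix-ms>-<seq>'; take the ms part."""
    return int(entry_id.split("-")[0])

first_id = "1580816056349-0"   # ID of the oldest entry (illustrative)
last_id  = "1580816056399-2"   # ID of the newest entry (illustrative)

elapsed = id_millis(last_id) - id_millis(first_id)
print(elapsed)           # 50
print(elapsed >= 50)     # True -> window elapsed, safe to XRANGE and compute
```

Doing this check inside the Lua script keeps the read-then-decide step atomic; the client-side version above is only for understanding the format.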

Get sizes of all sorted sets with a given prefix

I got several sorted sets with a common prefix (itemmovements:) in Redis.
I know we can use ZCOUNT to get the number of items for a single (sorted set) key like this:
127.0.0.1:6379> zcount itemmovements:8 0 1000000000
(integer) 23
(I am able to do this, since I know the range of the item scores.)
How to run this in a loop for all keys prefixed itemmovements:?
Taking hint from How to atomically delete keys matching a pattern using Redis I tried this:
127.0.0.1:6379> EVAL "return redis.call('zcount', unpack(redis.call('keys', ARGV[1])), 0, 1000000000)" 0 itemmovements:*
(integer) 150
but as you can see it just returns a single number (which happens to be the size of itemmovements:0, the first key returned by KEYS).
I realized I did not understand what that lua code in EVAL was doing. The code below works fine:
eval "local a = {}; for _,k in ipairs(redis.call('keys', 'itemmovements:*')) do table.insert(a, k); table.insert(a, redis.call('zcount', k, 0, 1000000000)); end; return a" 0
Formatted for readability:
local a = {}
for _,k in ipairs(redis.call('keys', 'itemmovements:*')) do
    table.insert(a, k)
    table.insert(a, redis.call('zcount', k, 0, 1000000000))
end
return a

Redis: Sum of SCORES in Sorted Set

What's the best way to get the sum of SCORES in a Redis sorted set?
The only option I think is iterating the sorted set and computing the sum client side.
Available since Redis v2.6 is the most awesome ability to execute Lua scripts on the Redis server. This renders the challenge of summing up a sorted set's scores trivial:
local sum = 0
local z = redis.call('ZRANGE', KEYS[1], 0, -1, 'WITHSCORES')
for i=2, #z, 2 do
    sum = sum + z[i]
end
return sum
Runtime example:
~$ redis-cli zadd z 1 a 2 b 3 c 4 d 5 e
(integer) 5
~$ redis-cli eval "local sum=0 local z=redis.call('ZRANGE', KEYS[1], 0, -1, 'WITHSCORES') for i=2, #z, 2 do sum=sum+z[i] end return sum" 1 z
(integer) 15
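Client-side, the WITHSCORES reply arrives as a flat member/score array; summing it mirrors the Lua loop's stride of 2 (the reply list here is hard-coded from the example set above):

```python
# Flat reply shape for ZRANGE z 0 -1 WITHSCORES on the example set:
reply = ["a", "1", "b", "2", "c", "3", "d", "4", "e", "5"]

# Scores sit at the odd indices (1, 3, 5, ...), just like i=2,#z,2 in Lua
# (Lua arrays are 1-based, Python lists are 0-based).
total = sum(float(score) for score in reply[1::2])
print(total)   # 15.0
```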
If the sets are small, and you don't need killer performance, I would just iterate (zrange/zrangebyscore) and sum the values client side.
If, on the other hand, you are talking about many thousands - millions of items, you can always keep a reference set with running totals for each user and increment/decrement them as the gifts are sent.
So when you do your ZINCRBY 123:gifts 1 "3|345", you could issue a separate ZINCRBY command, something like this:
ZINCRBY received-gifts 1 <user_id>
Then, to get the # of gifts received for a given user, you just need to run a ZSCORE:
ZSCORE received-gifts <user_id>
Here is a little Lua script that maintains the zset's score total as you go, in a counter whose key is the zset key with '.ss' appended. Use it in place of ZADD.
local delta = 0
for i=1, #ARGV, 2 do
    local oldScore = redis.call('zscore', KEYS[1], ARGV[i+1])
    if oldScore == false then
        oldScore = 0
    end
    delta = delta - oldScore + ARGV[i]
end
local val = redis.call('zadd', KEYS[1], unpack(ARGV))
redis.call('INCRBY', KEYS[1]..'.ss', delta)
return val
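The delta bookkeeping the script performs can be checked with a plain dict standing in for the zset (no Redis needed; `zadd_with_total` is an illustrative mirror of the script, taking (score, member) pairs like ZADD):

```python
def zadd_with_total(zset, total, pairs):
    """Apply (score, member) pairs; the total moves by (new - old) per member."""
    delta = 0
    for score, member in pairs:
        old = zset.get(member, 0)      # ZSCORE; 0 when the member is new
        delta += score - old
        zset[member] = score           # ZADD
    return total + delta               # INCRBY on the '.ss' counter

zset, total = {}, 0
total = zadd_with_total(zset, total, [(3, "a"), (5, "b")])
print(total)   # 8
total = zadd_with_total(zset, total, [(4, "a")])   # re-score a: 3 -> 4
print(total)   # 9
```

Re-adding an existing member only shifts the total by the score difference, which is exactly why the script reads the old score before calling ZADD.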