Keep item in list for certain time - redis

I am not an expert in Redis at all. Today I ran into an idea, but I don't know whether it is possible in Redis.
I want to store a list of values, but only for some time: for example, the list of IP addresses that visited a page in the last 5 minutes. As far as I know I can't set EXPIRE on a single list/hash item, right? So I am pushing 1, 2, 3 into a list/hash, and after a certain constant time I want each item to expire/disappear. Or maybe a hash would be more suitable than a list: { '1': timestamp-when-disappear, ... }?
Or maybe the only solution is:
SET test.1.1 1
EXPIRE test.1.1 60
SET test.1.2 2
EXPIRE test.1.2 60
SET test.1.3 3
EXPIRE test.1.3 60
# to retrieve, can I pipeline KEYS output to MGET?
KEYS test.1.*

Use a sorted set instead.
Log each IP as a member of the sorted set, with the visit timestamp as its score. During retrieval, use that timestamp to select the entries you need. In a scheduler, periodically delete the members whose scores fall outside the range.
Example:
zadd test 1465371055 1.1
zadd test 1465381055 1.3
zadd test 1465391055 1.1
Your sorted set will contain 1.1 and 1.3, where 1.1 now carries the updated score 1465391055.
For retrieval, use
zrangebyscore test min max
min -> currenttime - (5*60)
max -> currenttime
The scores above are in seconds, so the window is 5*60; use 5*60*1000 only if you store milliseconds. You will get the IPs that visited in the last 5 minutes.
In another scheduler-like thread you need to delete the unwanted entries:
zremrangebyscore test min max
min -> -inf
max -> currenttime - (10*60) -> any cutoff at least as old as the retrieval window will do
Also understand that if the number of distinct IPs is large, the sorted set will grow rapidly. Your scheduler thread must run reliably to keep memory under control.
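The record / retrieve / prune cycle described above could be sketched in Python roughly as follows. This is only a sketch under assumptions: `r` is taken to be a redis-py client, timestamps are in seconds, and the key name `visitors` and the three function names are made up for illustration.

```python
import time

WINDOW = 5 * 60  # seconds; the "last 5 minutes" from the question

def record_visit(r, ip, now=None, key="visitors"):
    """Add or refresh an IP; its score is the time of the latest visit."""
    now = time.time() if now is None else now
    r.zadd(key, {ip: now})

def recent_visitors(r, now=None, key="visitors"):
    """Return the IPs whose latest visit falls inside the window."""
    now = time.time() if now is None else now
    return r.zrangebyscore(key, now - WINDOW, now)

def prune(r, now=None, key="visitors"):
    """Remove IPs last seen before the window; meant to run periodically."""
    now = time.time() if now is None else now
    # "(" makes the upper bound exclusive, so entries exactly on the edge stay.
    return r.zremrangebyscore(key, "-inf", f"({now - WINDOW}")
```

Passing `now` explicitly is optional; it is there mainly so the functions can be exercised with fixed timestamps.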

Related

Implementing bursts or spikes detection in a counter using Redis

I wanna use Redis to keep track of certain numbers. Basically, they're counters. Is there a way to use Redis to sort of track the rate at which these counters increase?
For example, let's say a counter is being incremented at a rate of 10 per minute for the most of the time but suddenly it's being incremented at a rate of 40 per minute. How can I detect that?
You cannot do that directly, but you can do it with a sorted set, for example, plus a bit of client-side or Lua-based processing.
Let's say that you use a sorted set whose members are time-window timestamps; for each event you increment the score of the current window:
ZINCRBY mykey 1 timestamp
Then you have a simple counter per timestamp.
When you want to analyze it, fetch the windows with ZRANGE or ZREVRANGE, using WITHSCORES to get the counts, and do some processing on the differences to detect anomalies. Note that the set is ordered by score (here the count), so sort the result by timestamp on the client. There are many ways to do it; here's a link with a few pointers: https://stats.stackexchange.com/questions/152644/what-algorithm-should-i-use-to-detect-anomalies-on-time-series
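As one possible form of that client-side processing, here is a small sketch that flags a window when its count exceeds a multiple of the average of the preceding windows. The rule, the function name `find_spikes`, and the `factor` and `history` parameters are assumptions chosen for illustration, not a recommended detector; the input is assumed to be (timestamp, count) pairs with numeric timestamps.

```python
def find_spikes(counts, factor=3.0, history=5):
    """Return timestamps whose count exceeds `factor` times the mean
    of the previous `history` windows.

    counts: iterable of (timestamp, count) pairs in any order.
    """
    ordered = sorted(counts)  # order by timestamp, not by count
    spikes = []
    for i in range(history, len(ordered)):
        previous = ordered[i - history:i]
        baseline = sum(count for _, count in previous) / history
        ts, count = ordered[i]
        if baseline > 0 and count > factor * baseline:
            spikes.append(ts)
    return spikes

# Five quiet minutes at 10 per minute, then a jump to 40.
samples = [(0, 10), (60, 10), (120, 10), (180, 10), (240, 10), (300, 40)]
print(find_spikes(samples))  # [300]
```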

Expire geospatial items in Redis

There are proposals for sorted set item expiration in Redis (see https://groups.google.com/d/msg/redis-db/rXXMCLNkNSs/Bcbd5Ae12qQJ and https://quickleft.com/blog/how-to-create-and-expire-list-items-in-redis/). I tried the worker approach to expire geospatial index items with the ZREMRANGEBYSCORE and ZREMRANGEBYRANK commands, without success (nothing was removed).
I succeeded using ZREMRANGEBYLEX.
Is there a way to work with a geospatial item's score other than as strings?
Update:
For example, if the time to live (TTL) of an item is 30 seconds, I add it as:
geoadd 1 -8.616021 41.154503 30
Now, suppose the worker executes after 40 seconds. I was expecting that
zremrangebyscore 1 0 40
would do the job, but it does not, whereas
ZREMRANGEBYLEX 1 [0 [40
does. Why this behavior? Does that mean the score of a geospatial item supports only lexicographical operations?
Sorted Sets have elements (strings), and every element has a score (floating-point). Geosets use the score to encode the coordinate, so in your GEOADD call 30 is the member's name, not its score; that is why only the lexicographical command matched it.
Redis doesn't expire members in a Sorted Set (or a Geoset). You have to remove them yourself if that is required.
In your case, you'll need to keep two Sorted Sets: one as your Geoset and one for managing TTLs as scores.
For example, assuming your member is called 'foo', to add it:
ZADD ttls 30 foo
GEOADD elems -8.616021 41.154503 foo
To expire manually, first find the expired members with a call to ZRANGEBYSCORE on ttls, and then remove them from both sets with ZREM.
Tip: it is preferable to use an absolute expiry timestamp as the score instead of a relative number of seconds.
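A rough Python sketch of the two-set idea, following the tip and storing the expiry time as the score. It assumes `r` is a redis-py client; the function names are hypothetical. GEOADD is sent through `execute_command` only because the `geoadd` helper's signature differs between redis-py versions.

```python
import time

def add_with_ttl(r, member, lon, lat, ttl, now=None,
                 geo_key="elems", ttl_key="ttls"):
    """Store the position in the geoset and the expiry time in a second set."""
    now = time.time() if now is None else now
    r.execute_command("GEOADD", geo_key, lon, lat, member)
    r.zadd(ttl_key, {member: now + ttl})  # score = absolute expiry time

def expire_members(r, now=None, geo_key="elems", ttl_key="ttls"):
    """Remove every member whose expiry time has passed from both sets."""
    now = time.time() if now is None else now
    expired = r.zrangebyscore(ttl_key, "-inf", now)
    if expired:
        r.zrem(geo_key, *expired)  # a geoset is a sorted set, so ZREM works
        r.zrem(ttl_key, *expired)
    return expired
```

A worker would call `expire_members` on a timer. The two removals are not atomic here; wrapping them in a transaction or a Lua script is one way to close that gap.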

Time-barred list in redis

It's easy to keep a fixed-size list in Redis by performing LTRIM at the required size. However, how can one maintain a list whose data is time-barred rather than size-barred?
E.g. how would I maintain a list of all user_ids that logged into my website in the last 10 mins? Please provide an illustrative example of the most efficient way to accomplish this. Maybe I'm approaching this with the wrong data-type?
Just use a sorted set instead of a list. Use unix timestamps as score of items.
To add an item in the Zset:
ZADD myzset <current timestamp> item
To retrieve the items of the last ten minutes, sorted by insertion time, oldest first:
ZRANGEBYSCORE myzset (<current timestamp - 600 seconds> +inf
To get the newest first, replace ZRANGEBYSCORE with ZREVRANGEBYSCORE and swap the two bounds (max comes first).
To remove the expired items:
ZREMRANGEBYSCORE myzset -inf (<current timestamp - 600 seconds>

How to store every second data in redis

I am trying to find a usage pattern. For every request I will take the corresponding second of the day and mark an entry. At the end I will read the usage pattern from these entries. What is the best structure for this in Redis?
You can store it in three ways:
1) setbit operations storing in a single key
You can use setbit operations if the frequency is very high. That is, if you mark almost every second, you have to store 86400 values. Even so, this takes only 86400 bits, about 10.8 KB (roughly 0.1 Mb).
Even if you store only one entry, at the last second of the day, you still lose that 0.1 Mb, because the string is allocated up to the highest bit set; it never grows beyond that size, though. Also beware that GET returns the whole thing as a string, which you have to convert to bits yourself.
setbit date second 1
get date
2) sets
You can use sets if the frequency is a little lower. Then only the seconds in which a request arrived will be in your set.
Sadd date second
smembers date
3) hashes
You can use hashes if you want to know the count for each second.
Hincrby date second 1
hgetall date
Also run a sample test with all three and compare their size and efficiency.
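For the setbit option, the conversion from the string returned by GET back to seconds could look like the sketch below. It assumes the client hands back raw bytes (no response decoding) and relies on Redis numbering bits from the most significant bit of the first byte; the function name is made up.

```python
def seconds_from_bitmap(raw):
    """Return the offsets of all set bits in the bytes returned by GET.

    Offset 0 is the most significant bit of raw[0], matching SETBIT.
    """
    seconds = []
    for byte_index, byte in enumerate(raw):
        for bit in range(8):
            if byte & (0x80 >> bit):
                seconds.append(byte_index * 8 + bit)
    return seconds

# After "SETBIT date 0 1" and "SETBIT date 9 1" the value is b"\x80\x40".
print(seconds_from_bitmap(b"\x80\x40"))  # [0, 9]
```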
I'd use a hash to group by day.
// Set the counter to 0 if it doesn't exist (optional: HINCRBY treats a missing field as 0).
HSETNX [DAY] [SECOND] 0
// Increment the counter by 1 each request
HINCRBY [DAY] [SECOND] 1
Then use HGETALL or HSCAN to get the seconds with their counts (HKEYS returns only the field names).

Is there any option to use redis.expire more elastically?

I have a quick, simple question.
Assume that if the server receives 10 messages from a user within 10 minutes, the server sends a push email.
At first I thought this would be very simple using Redis:
incr("foo"), expire("foo",60*10)
and in Java, handle the occurrence count like below:
if (Integer.parseInt(jedis.get("foo")) >= 10) { sendEmail(); jedis.del("foo"); }
But imagine the user sends one message in the first minute and 8 messages in the 10th minute.
Then the key expires, and the user sends 3 more messages in the next minute.
The Redis key will be created again with the value 3, which will not trigger sendEmail() even though the user actually sent 11 messages in 2 minutes.
We're going to use Redis, and we don't want to put receive-time values into Redis.
Is there any solution?
So, there are 2 ways of solving this: one optimizes for space and the other for speed (though really the speed difference should be marginal).
Optimizing for Space:
Keep up to 9 different counters: foo1 ... foo9. Basically, we'll keep one counter for each of the up to 9 messages that can arrive before we email the user, and let each one expire as it hits the 10-minute mark. This works like a circular queue. Now do this for every incoming message (in Python for simplicity, assuming we have a connection to Redis called r):
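new_created = False
for i in range(1, 10):
    var_name = 'foo%d' % i
    # Start one new counter per message, in the first free slot.
    if not (new_created or r.exists(var_name)):
        r.set(var_name, 0, ex=600)  # expires 10 minutes after creation
        new_created = True
    if not r.exists(var_name):
        continue
    # Every live counter counts this message; INCR leaves the TTL intact.
    if r.incr(var_name) >= 10:
        send_email(user)
        r.delete(var_name)  # `del` is a Python keyword; redis-py uses delete()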
If you go with this approach, put the above logic in a Lua script instead of the example Python, and it should be quite fast. Since you'll at most be storing 9 counters per user, it'll also be quite space efficient.
Optimizing for speed:
Keep one Redis Sorted Set per user. Every time a user sends a message, add an entry to their sorted set with the timestamp as the score and a unique value (for example a message ID) as the member. Then do a ZCOUNT(now - 10 minutes, now) and send an email if the result is at least 10. Then ZREMRANGEBYSCORE(-inf, now - 10 minutes) to drop the old entries. I know you said you didn't want to keep timestamps in Redis, but IMO this is a better solution, and you're going to have to hold some variant of timestamps somewhere no matter what.
Personally I'd go with the latter approach because the space differences are probably not that big, and the code can be done quickly in pure Redis, but it's up to you.
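The sorted-set approach might be sketched in Python like this. It is an illustration under assumptions, not a finished implementation: `r` is taken to be a redis-py client, the `msgs:<user>` key layout and the function name are invented, and a random UUID stands in for a unique message ID.

```python
import time
import uuid

WINDOW = 10 * 60  # seconds
THRESHOLD = 10    # messages within the window that trigger the email

def on_message(r, user_id, now=None):
    """Record one message; return True when the caller should send the email."""
    now = time.time() if now is None else now
    key = f"msgs:{user_id}"  # hypothetical key layout
    # Unique member per message, timestamp as score.
    r.zadd(key, {uuid.uuid4().hex: now})
    # Drop entries that have slid out of the 10-minute window.
    r.zremrangebyscore(key, "-inf", f"({now - WINDOW}")
    # Let the whole key vanish if the user goes quiet.
    r.expire(key, WINDOW)
    if r.zcount(key, now - WINDOW, now) >= THRESHOLD:
        r.delete(key)  # start counting afresh after the email
        return True
    return False
```

The caller would run `send_email(user)` whenever `on_message` returns True. The commands are issued one by one here; a Lua script or a MULTI/EXEC transaction would be one way to make the sequence atomic.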