Redis Snapshot-Configuration: How are multiple changes on the same key counted? - redis

in Redis you can configure the creation of snapshots, e.g. "save 60 10" would save the database after 60 seconds if at least 10 keys were changed.
If the SAME key was changed 10 times, would a snapshot be saved? Or does this refer to 10 unique/different keys that have to be changed?
Thank you!

The documented config doesn't say anything about "if at least 10 keys were changed". It says the snapshot will happen if "the given number of write operations against the DB occurred". Simple commands like SET and DEL count as one write operation. More complicated commands like HMSET and ZINTERSTORE might count as more than one write operation depending on the number of values they affect. Nothing takes into account the number of unique keys that were written to since the last snapshot.

Related

Redis: clean up members of ZSET

I'm currently studying Redis, and have the following case:
So what I have is a sorted set by google place id, for which all posts are sorted from recent to older.
The first page that is requested fetches posts < current timestamp.
When a cursor is sent to the backend, this cursor is a simple timestamp that indicates from where to fetch the next posts from the ZSET.
The query to retrieve posts by place id would be:
ZREVRANGEBYSCORE <gplaceId> <cur_timestamp> -INF WITHSCORES LIMIT <offset:timestamp as from where to fetch> <count:number of posts>
My question is what is the recommended way to clean up members of the ZSET.
Since I want to use Redis as a cache, I would prefer to limit the number of posts per place for example up until 50. When places get new posts when already 50 posts have been added to the set, I want to drop the last post from the set.
Of course I realise that I could make a manual check on every insert and perform operations as such, but I wonder if Redis has a recommended way of performing this cleanup. Alternatively I could create a scheduler for this, but I would prefer not having to do this.
Unfortunately Redis sorted set do not come with an out of the box feature for this. If sorted sets allowed a max size attribute with a configurable eviction strategy - you could have avoided some extra work.
See this related question:
How to specify Redis Sorted Set a fixed size?
In absence of such a feature, the two approaches you mentioned are right.
You can replace inserts with a transaction : insert, check size, delete if > 50
A thread that checks size of the set and trims it periodically

What Redis data type fit the most for following example

I have following scenario:
Fetch array of numbers (from REDIS) conditionally
For each number do some async stuff (fetch something from DB based on number)
For each thing in result set from DB do another async stuff
Periodically repeat 1. 2. 3. because new numbers will be constantly added to REDIS structure.Those numbers represent unix timestamp in milliseconds so out of the box those numbers will always be sorted in time of addition
Conditionally means fetch those unix timestamp from REDIS that are less or equal to current unix timestamp in milliseconds(Date.now())
Question is what REDIS data type fit the most for this use case having in mind that this code will be scaled up to N instances, so N instances will share access to single REDIS instance. To equally share the load each instance will read for example first(oldest) 5 numbers from REDIS. Numbers are unique (adding same number should fail silently) so REDIS SET seems like a good choice but reading M first elements from REDIS set seems impossible.
To prevent two different instance of the code to read same numbers REDIS read operation should be atomic, it should read the numbers and delete them. If any async operation fail on specific number (steps 2. and 3.), numbers should be added again to REDIS to be handled again. They should be re-added back to the head not to the end to be handled again as soon as possible. As far as i know SADD would push it to the tail.
SMEMBERS key would read everything, it looks like a hammer to me. I would need to include some application logic to get first five than to check what is less or equal to Date.now() and then to delete those and to wrap somehow everything in single transaction. Besides that set cardinality can be huge.
SSCAN sounds interesting but i don't have any clue how it works in "scaled" environment like described above. Besides that, per REDIS docs: The SCAN family of commands only offer limited guarantees about the returned elements since the collection that we incrementally iterate can change during the iteration process. Like described above collection will be changed frequently
A more appropriate data structure would be the Sorted Set - members have a float score that is very suitable for storing a timestamp and you can perform range searches (i.e. anything less or equal a given value).
The relevant starting points are the ZADD, ZRANGEBYSCORE and ZREMRANGEBYSCORE commands.
To ensure the atomicity when reading and removing members, you can choose between the the following options: Redis transactions, Redis Lua script and in the next version (v4) a Redis module.
Transactions
Using transactions simply means doing the following code running on your instances:
MULTI
ZRANGEBYSCORE <keyname> -inf <now-timestamp>
ZREMRANGEBYSCORE <keyname> -inf <now-timestamp>
EXEC
Where <keyname> is your key's name and <now-timestamp> is the current time.
Lua script
A Lua script can be cached and runs embedded in the server, so in some cases it is a preferable approach. It is definitely the best approach for short snippets of atomic logic if you need flow control (remember that a MULTI transaction returns the values only after execution). Such a script would look as follows:
local r = redis.call('ZRANGEBYSCORE', KEYS[1], '-inf', ARGV[1])
redis.call('ZREMRANGEBYSCORE', KEYS[1], '-inf', ARGV[1])
return r
To run this, first cache it using SCRIPT LOAD and then call it with EVALSHA like so:
EVALSHA <script-sha> 1 <key-name> <now-timestamp>
Where <script-sha> is the sha1 of the script returned by SCRIPT LOAD.
Redis modules
In the near future, once v4 is GA you'll be able to write and use modules. Once this becomes a reality, you'll be able to use this module we've made that provides the ZPOP command and could be extended to cover this use case as well.

Is there any option to use redis.expire more elastically?

I got a quick simple question,
Assume that if server receives 10 messages from user within 10 minutes, server sends a push email.
At first I thought it very simple using redis,
incr("foo"), expire("foo",60*10)
and in Java, handle the occurrence count like below
if(jedis.get("foo")>=10){sendEmail();jedis.del("foo");}
but imagine if user send one message at first minute and send 8 messages at 10th minute.
and the key expires, and user again send 3 messages in the next minute.
redis key will be created again with value 3 which will not trigger sendEmail() even though user send 11 messages in 2 minutes actually.
we're gonna use Redis and we don't want to put receive time values to redis.
is there any solution ?
So, there's 2 ways of solving this-- one to optimize on space and the other to optimize on speed (though really the speed difference should be marginal).
Optimizing for Space:
Keep up to 9 different counters; foo1 ... foo9. Basically, we'll keep one counter for each of the possible up to 9 different messages before we email the user, and let each one expire as it hits the 10 minute mark. This will work like a circular queue. Now do this (in Python for simplicity, assuming we have a connection to Redis called r):
new_created = False
for i in xrange(1,10):
var_name = 'foo%d' % i
if not (new_created or r.exists(var_name)):
r.set(var_name, 0)
r.expire(var_name, 600)
new_created = True
if not r.exists(var_name): continue
r.incr(var_name, 1)
if r.get(var_name) >= 10:
send_email(user)
r.del(var_name)
If you go with this approach, put the above logic in a Lua script instead of the example Python, and it should be quite fast. Since you'll at most be storing 9 counters per user, it'll also be quite space efficient.
Optimizing for speed:
Keep one Redis Sortet Set per user. Every time a user sends a message, add to his sorted set with a key equal to the timestamp and an arbitrary value. Then just do a ZCOUNT(now, now - 10 minutes) and send an email if that's greater than 10. Then ZREMRANGEBYSCORE(now - 10 minutes, inf). I know you said you didn't want to keep timestamps in Redis, but IMO this is a better solution, and you're going to have to hold some variant on timestamps somewhere no matter what.
Personally I'd go with the latter approach because the space differences are probably not that big, and the code can be done quickly in pure Redis, but up to you.

Redis: How to delete all keys older than 3 months

I want to flush out all keys older than 3 months. These keys were not set with an expire date.
Or if that is not possible, can I then delete maybe the oldest 1000 keys?
Using the object idletime you can delete all keys that have not been used since three months. It is not exactly what you ask. If you created a key 6 months ago, but the key is accessed everyday, then idletime is updated and this script will not delete it. I hope the script can help:
#!/usr/bin/env python
import redis
r = redis.StrictRedis(host='localhost', port=6379, db=0)
for key in r.scan_iter("*"):
idle = r.object("idletime", key)
# idle time is in seconds. This is 90days
if idle > 7776000:
r.delete(key)
Are you NOW using an expire? If so, you could loop through all keys if no TTL is set then add one.
Python example:
for key in redis.keys('*'):
if redis.ttl(key) == -1:
redis.expire(key, 60 * 60 * 24 * 7)
# This would clear them out in a week
EDIT
As #kouton pointed out use scan over keys in production, see a discussion on that at: SCAN vs KEYS performance in Redis
A bit late, but check out the OBJECT command. There you will find object idle time (with 10 second resolution). It's used for debugging purposes but still, could be a nice workaround for your need.
References: http://redis.io/commands/object
Sorry, that's not possible, as stated in the comments above. In Redis it's crucial to create your own indexes to support your access patterns.
Tip: What you should do is to create a sorted set (ZADD) with all new or modified keys, and set the score to a timestamp. This way you can with ease fetch the keys within a time period using ZRANGEBYSCORE.
If you want to expire your existing keys, get all keys (expensive) and set a TTL for each using the EXPIRE command.

Best way to delete members of one set from multiple other sets

Let say I have a community with 50 000 members.
So there are one Redis set called community_###_members with 50 000 sha1 keys of users and for each user his own set user_###_communities exists with sha1 hash of community above
In some time I decide to delete community.. What is the best algorithm to kill all members?
Thanks.
assuming you really want to actually delete members of a community that does not exist, it's something like (python code example):
memebrs = redis.smembers('community_members')
pipe = redis.pipeline()
pipe.delete('community_memebrs')
for member in members:
pipe.delete(member)
pipe.execute()
if the community is very large you might want to do that N memebrs at a time, so the server will not stall until you finish all.