Best way to delete members of one set from multiple other sets - redis

Let's say I have a community with 50,000 members.
There is one Redis set called community_###_members holding the 50,000 SHA1 keys of its users, and each user has their own set user_###_communities containing the SHA1 hash of the community above.
At some point I decide to delete the community. What is the best algorithm to remove all the members?
Thanks.

Assuming you really want to actually delete the members of a community that no longer exists, it's something like this (Python code example):
members = redis.smembers('community_members')
pipe = redis.pipeline()
pipe.delete('community_members')
for member in members:
    pipe.delete(member)  # queue a DEL for each member's own key
pipe.execute()
If the community is very large, you might want to do that N members at a time, so the server does not stall until you finish them all.
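For illustration, a rough sketch of that batched variant (assuming the redis-py client; SSCAN is one way to walk the set incrementally, and the batch size is an arbitrary choice):
import redis
r = redis.Redis()
BATCH = 1000  # assumed batch size; tune to your workload
cursor = 0
while True:
    # SSCAN returns an updated cursor plus a batch of members
    cursor, members = r.sscan('community_members', cursor, count=BATCH)
    if members:
        pipe = r.pipeline()
        for member in members:
            pipe.delete(member)  # delete this batch of member keys
        pipe.execute()
    if cursor == 0:  # a zero cursor means the iteration is complete
        break
r.delete('community_members')  # finally drop the community set itself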

Related

Redis Snapshot-Configuration: How are multiple changes on the same key counted?

In Redis you can configure the creation of snapshots, e.g. "save 60 10" would save the database after 60 seconds if at least 10 keys were changed.
If the SAME key was changed 10 times, would a snapshot be saved? Or does this refer to 10 unique/different keys that have to be changed?
Thank you!
The documented config doesn't say anything about "if at least 10 keys were changed". It says the snapshot will happen if "the given number of write operations against the DB occurred". Simple commands like SET and DEL count as one write operation. More complicated commands like HMSET and ZINTERSTORE might count as more than one write operation depending on the number of values they affect. Nothing takes into account the number of unique keys that were written to since the last snapshot.
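As a small illustration (assuming the redis-py client), ten writes to the SAME key count as ten write operations toward "save 60 10", exactly as ten writes to ten different keys would:
import redis
r = redis.Redis()
for _ in range(10):
    r.set('same_key', 'value')  # each SET counts as one write operation
# Once 60 seconds pass, these ten writes alone can trigger the snapshot,
# even though only a single key was ever touched.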

REDIS usecase using large keys with small values

I have a use-case for Redis that is a little bit different.
In my MySQL database I have an entity, let's call it HumanEntity. This HumanEntity has many-to-many relations:
HumanEntity.Urls - Many URLs per HumanEntity.
HumanEntity.UserNames - Many UserNames per HumanEntity.
HumanEntity.Phones ...
HumanEntity.Emails ...
In a normal hour, the application creates hundreds of these values.
The use-case is that the application receives an HTTP call (100 per second) with a HumanEntity value (URL, UserName, Phone or Email).
I need to scan my MySQL table (1,000,000 records) and return the HumanEntity.Id (integer).
Since it's OK to have some latency in the data integrity, I thought about Redis.
Can I store the values as Redis keys and the HumanEntity.Id (integer) as the value?
My API needs to return the HumanEntity.Id (integer).
Does it make sense to have such a long key and such a short value? The URL, for example, may be 1,500 bytes while the value can be 1 byte.
What is the best Redis method to implement that?
Thanks
If the values are not unique then you may have a problem. Phones, emails or usernames may be unique per user, but I am not sure about URLs or any other property stored in your database. You may overwrite one user's identifier with another user's.
If you don't have any problem like that, you may proceed with plain string types; the time complexity of GET and SET is O(1), which is the best you can get.
In some cases, such as checking whether a user has used a coupon, you may use a long (let's say 64-character) user id as the key and 1 as the value, and use EXISTS to check it. So it's perfectly valid to use a long key with a short value.
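A minimal sketch of that key/value lookup (assuming redis-py; the 'human:' key prefix is made up for namespacing):
import redis
r = redis.Redis()
# Write side: index each value under its own key, pointing at the id.
r.set('human:url:https://example.com/profile/3', 42)
r.set('human:email:someone@example.com', 42)
# Read side: an O(1) lookup on every HTTP call.
human_id = r.get('human:email:someone@example.com')
if human_id is not None:
    print(int(human_id))  # -> 42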

Is there any option to use redis.expire more elastically?

I have a quick, simple question.
Assume that if the server receives 10 messages from a user within 10 minutes, it sends a push email.
At first I thought it would be very simple using Redis:
incr("foo"), expire("foo", 60*10)
and in Java, handle the occurrence count like below:
if (Integer.parseInt(jedis.get("foo")) >= 10) { sendEmail(); jedis.del("foo"); }
But imagine the user sends one message in the first minute and 8 messages in the 10th minute.
Then the key expires, and the user sends 3 more messages in the next minute.
The Redis key will be created again with value 3, which will not trigger sendEmail(), even though the user actually sent 11 messages within 2 minutes.
We're going to use Redis, and we don't want to store receive-time values in it.
Is there any solution?
So, there are two ways of solving this: one optimizes for space and the other for speed (though really the speed difference should be marginal).
Optimizing for Space:
Keep up to 9 different counters: foo1 ... foo9. Basically, we keep one counter for each of the up to 9 messages that can arrive before we email the user, and let each one expire as it hits the 10-minute mark. This works like a circular queue. Now do this (in Python for simplicity, assuming we have a connection to Redis called r):
new_created = False
for i in xrange(1, 10):
    var_name = 'foo%d' % i
    # create at most one fresh counter per incoming message
    if not (new_created or r.exists(var_name)):
        r.set(var_name, 0)
        r.expire(var_name, 600)
        new_created = True
    if not r.exists(var_name):
        continue
    r.incr(var_name, 1)
    # values come back as strings, so convert before comparing
    if int(r.get(var_name)) >= 10:
        send_email(user)
        r.delete(var_name)  # 'del' is a Python keyword; the client method is delete()
If you go with this approach, put the above logic in a Lua script instead of the example Python, and it should be quite fast. Since you'll be storing at most 9 counters per user, it'll also be quite space-efficient.
Optimizing for Speed:
Keep one Redis Sorted Set per user. Every time a user sends a message, add a member to their sorted set with a score equal to the timestamp and an arbitrary unique value. Then just do ZCOUNT(now - 10 minutes, now) and send an email if the count is at least 10. Then ZREMRANGEBYSCORE(-inf, now - 10 minutes) to clear out entries older than the window. I know you said you didn't want to keep timestamps in Redis, but IMO this is a better solution, and you're going to have to hold some variant of timestamps somewhere no matter what.
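A minimal sketch of that sorted-set approach (assuming redis-py, timestamps in seconds, and a hypothetical send_email helper):
import time
import uuid
import redis
r = redis.Redis()
WINDOW = 600     # 10 minutes, in seconds
THRESHOLD = 10   # messages within the window that trigger an email
def record_message(user_id):
    key = 'messages:%s' % user_id
    now = time.time()
    # score = timestamp, member = an arbitrary unique value
    r.zadd(key, {str(uuid.uuid4()): now})
    # drop everything older than the window
    r.zremrangebyscore(key, '-inf', now - WINDOW)
    if r.zcount(key, now - WINDOW, now) >= THRESHOLD:
        send_email(user_id)  # assumed helper
        r.delete(key)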
Personally I'd go with the latter approach because the space differences are probably not that big, and the code can be done quickly in pure Redis, but up to you.

Redis Sorted Set ... store data in "member"?

I am learning Redis and using an existing app (e.g. converting pieces of it) for practice.
I'm really struggling to understand first IF and then (if applicable) HOW to use Redis in one particular use-case ... apologies if this is super basic, but I'm so new that I'm not even sure if I'm asking correctly :/
Scenario:
Images are received by a server, and info like time_taken and resolution is saved in a database entry. Images are then associated (e.g. "belong to") with one Event ... all very straightforward for an RDBMS.
I'd like to use Redis to maintain a list of the 50 most-recently-uploaded image objects for each Event, to be delivered to the client when requested. I'm thinking that a Sorted Set might be appropriate, but here are my concerns:
First, I'm not sure if a Sorted Set can/should be used in this associative manner? Can it reference other objects in Redis? Or is there just a better way to do this altogether?
Secondly, I need the ability to delete elements that are greater than X minutes old. I know about the EXPIRE command for keys, but I can't use this because not all images need to expire at the same periodicity, etc.
This second part seems more like a query on a field, which makes me think that Redis cannot be used ... but then I've read that I could maybe use the Sorted Set score to store a timestamp and find "older than X" in that way.
Can someone provide some clarity on these two issues? Thank you very much!
UPDATE
Knowing that the amount of data I need to store for each image is small and will be delivered to the client's browser, is there anything wrong with storing it in the member "field" of a sorted set?
For example: Sorted Set => event:14:pictures <time_taken> "{id:3,url:/images/3.png,lat:22.8573}"
This saves the data I need and creates a rapidly-updatable list of the last X pictures for a given event, with the ability, if needed, to identify pictures that are more than X minutes old ...
First, I'm not sure if a Sorted Set can/should be used in this
associative manner? Can it reference other objects in Redis?
Why do you need to reference other objects? An event may have n image objects, each with a time_taken and image data; a sorted set is perfect for this. The event is the sorted set's key (e.g. event:14:pictures), the score is time_taken, and the member is the image data as JSON/XML, whatever; you're good to go there.
Secondly, I need the ability to delete elements that are greater than
X minutes old
If you want to delete elements more than X minutes old, use ZREMRANGEBYSCORE:
ZREMRANGEBYSCORE event:14:pictures -inf (currentTime - X minutes)
-inf is just a way of saying "from the oldest member" without knowing the oldest member's score; for the upper bound of the range you need to calculate currentTime - X minutes yourself before issuing the command (the above is just an example).
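Putting the pieces together, a minimal sketch (assuming redis-py; the key name, payload fields, and 50-item limit come from the question, while the function names are made up):
import json
import time
import redis
r = redis.Redis()
def add_picture(event_id, picture, time_taken):
    key = 'event:%d:pictures' % event_id
    # member = the small JSON payload, score = time_taken
    r.zadd(key, {json.dumps(picture): time_taken})
    # keep only the 50 newest members; ranks 0..-51 are the older ones
    r.zremrangebyrank(key, 0, -51)
def recent_pictures(event_id, minutes=10):
    key = 'event:%d:pictures' % event_id
    cutoff = time.time() - minutes * 60
    # newest first, restricted to the last X minutes
    return [json.loads(m) for m in r.zrevrangebyscore(key, '+inf', cutoff)]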

Redis unique increment

I am trying to implement a scoring system on Redis. I have no experience with it whatsoever.
What my app should do is increase a value ONLY if the user has not already voted, so I was thinking of something like this:
INCR voteme
but only if it has not already been increased for this voter, so I wanted to do the following:
SET voteme:voterip 1
so that I could then count the elements. The problem is I think this is not doable in Redis, and I have to think of another approach.
Any ideas?
EXTRA question:
I want to make this data persistent by writing the resulting count (e.g. 24) to the corresponding user in MongoDB. Some pseudo-code would be of great help.
I would not store a counter but directly a set containing all the users who have already voted.
Let's suppose a vote is organized for user 1. Each time a user X votes for user 1, you can execute:
SADD user:1:votes X
The number of votes for user 1 can be easily retrieved:
SCARD user:1:votes
Now if you need to keep this count in sync with another store, you can execute (still supposing user X votes for user 1):
MULTI
SADD user:1:votes X
SCARD user:1:votes
EXEC
The trick is that the SADD command returns the number of items effectively added to the set. If the item already exists, it returns 0. So it is quite easy to run this MULTI/EXEC block, check the result of SADD, get the cardinality of the set (the number of votes), and push the cardinality to the other store only if the set was altered by the transaction.
This way, you keep the counter up-to-date in your persistent store (in real time), while filtering useless voting events.
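A minimal sketch of that flow (assuming redis-py and pymongo; the MongoDB database, collection, and field names are assumptions):
import redis
from pymongo import MongoClient
r = redis.Redis()
users = MongoClient().mydb.users  # assumed database/collection
def vote(candidate_id, voter_id):
    key = 'user:%d:votes' % candidate_id
    # MULTI/EXEC: both commands run atomically
    pipe = r.pipeline(transaction=True)
    pipe.sadd(key, voter_id)
    pipe.scard(key)
    added, count = pipe.execute()
    if added:  # SADD returned 1: a genuinely new vote
        users.update_one({'_id': candidate_id},
                         {'$set': {'vote_count': count}},
                         upsert=True)
    return count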