Redis unique increment - redis

I am trying to implement a scoring system on Redis. I have no experience with it whatsoever.
What my app should be doing is increasing a value ONLY if the user has not already voted, so I was thinking of something like this:
INCR voteme
but only if this user has not increased it already, so I wanted to do the following:
SET voteme:voterip 1
and then count the elements. The problem is that I think this is not doable in Redis, so I have to think of another approach.
Any ideas?
EXTRA question:
I want to make this data persistent by writing the resulting count (e.g: 24) to the corresponding user, in mongodb. Some pseudo code would be of great help

I would not store a counter but directly a set containing all the users who have already voted.
Let's suppose a vote is organized for user 1. Each time a user X votes for user 1, you can execute:
SADD user:1:votes X
The number of votes for user 1 can be easily retrieved:
SCARD user:1:votes
Now if you need to keep this count in sync with another store, you can execute (still supposing user X votes for user 1):
MULTI
SADD user:1:votes X
SCARD user:1:votes
EXEC
The trick is the SADD command returns the number of items effectively added to the set. If the item already exists, it returns 0. So it is quite easy to run this multi/exec block, check the result of SADD, get the cardinality of the set (number of votes), and push the cardinality to another store only if the set has been altered by the transaction.
This way, you keep the counter up-to-date in your persistent store (in real time), while filtering useless voting events.
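As a rough sketch of the MULTI/EXEC block above from application code: this assumes a redis-py style client is passed in as `client`, and `on_new_count` is a hypothetical callback standing in for whatever writes the count to the persistent store (for example a MongoDB update). Neither name comes from the original answer.

```python
from typing import Any, Callable


def cast_vote(client: Any, candidate_id: int, voter: str,
              on_new_count: Callable[[int], None]) -> bool:
    """Record a vote; call on_new_count only when the set actually changed."""
    key = f"user:{candidate_id}:votes"
    # pipeline(transaction=True) wraps the queued commands in MULTI/EXEC.
    pipe = client.pipeline(transaction=True)
    pipe.sadd(key, voter)   # replies 1 if the voter was new, 0 if already present
    pipe.scard(key)         # replies the current number of distinct voters
    added, count = pipe.execute()
    if added:
        on_new_count(count)  # hypothetical hook: push the count to the other store
    return bool(added)
```

A repeated vote from the same voter returns False and never touches the persistent store.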

Related

Redis: How do I count the elements in a stream in a certain range?

Business Objective
I'm creating a dashboard that will depend on some time-series and I'll use Redis to implement it. I'm new to using Redis and I'm trying to use Redis-Streams to count the elements in a stream.
XADD conversation:9:chat_messages * id 2583 user_type Bot
XADD conversation:9:chat_messages * id 732016 user_type User
XADD conversation:9:chat_messages * id 732017 user_type Staff
XRANGE conversation:9:chat_messages - +
I'm aware that I can get the total count of the elements using the XLEN command like this:
XLEN conversation:9:chat_messages
but I want to also know the elements in a period, for example:
XLEN conversation:9:chat_messages 1579551316273 1579551321872
I know I can use Lua to count those elements, but I want a REALLY fast way to achieve this, and I know that using a native Redis command will be the fastest way.
Is there any way to achieve this with a straight forward Redis command? Or do I have to write a Lua script to do this?
Additional information
I'm limited by AWS ElastiCache to Redis 5.0.6 only, and I cannot install other modules such as RedisTimeSeries. I'd like to use that module, but it's not possible at the moment.
While the Redis Stream data structure doesn't support this, you can use a Sorted Set alongside it for keeping track of message ranges.
Basically, for each message ID you get from XADD - e.g. "1579551316273-0" - you need to do a ZADD conversation:9:ids 0 1579551316273-0. Then, you can use ZLEXCOUNT to get the "length" of a range.
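ZLEXCOUNT compares members as strings, so the range bounds need a little care. The helper below is a minimal sketch (the function name is made up for illustration) that builds the min/max arguments for a millisecond range. It assumes every ID has the same number of digits in its millisecond part, which holds for current 13-digit timestamps.

```python
from typing import Tuple


def lex_bounds(start_ms: int, end_ms: int) -> Tuple[str, str]:
    """ZLEXCOUNT min/max covering IDs whose millisecond part is in [start_ms, end_ms]."""
    # "[1579551316273" sorts before "1579551316273-0" because it is a prefix of it.
    # "(1579551321873" sorts after every "1579551321872-N" and matches no real ID.
    return f"[{start_ms}", f"({end_ms + 1}"
```

For the range in the question this yields `ZLEXCOUNT conversation:9:ids [1579551316273 (1579551321873`.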
Sorry, there is no single command that achieves this.
Your best option with Redis Streams would be to use a Lua script. You will get O(N) with N being the number of elements being counted, instead of O(log N) if a command existed.
-- KEYS[1] = stream key, ARGV[1] = start ID, ARGV[2] = end ID
local entries = redis.call('XRANGE', KEYS[1], ARGV[1], ARGV[2])
-- XRANGE replies with an array, so its length is the number of entries
return #entries
Note that the difference between O(N) and O(log N) is significant for large N. For a chat application tracked per conversation, though, it won't matter much while chats have hundreds or even thousands of entries, because round-trip time dominates the total command time. The Lua script above removes the network payload and the client-side processing time.
You can switch to sorted sets if you really want O(log N) and you don't need consumer groups and other stream features. See "How to store in Redis sorted set with server-side timestamp as score?" if you want to use the Redis server timestamp atomically.
Then you can use ZCOUNT which is O(log(N)).
If you do need Stream features, then you would need to keep the sorted set as a secondary index.
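A minimal sketch of such a secondary index, assuming a redis-py style client is passed in as `client`. The function names and the index key are illustrative only. The two writes are not atomic here, so a Lua script would be needed if that matters.

```python
from typing import Any, Dict


def add_message(client: Any, stream_key: str, index_key: str,
                fields: Dict[str, str]) -> str:
    """XADD the entry, then index its ID in a sorted set scored by its millisecond part."""
    entry_id = client.xadd(stream_key, fields)
    if isinstance(entry_id, bytes):  # redis-py returns bytes unless decode_responses=True
        entry_id = entry_id.decode()
    ms = int(entry_id.split("-")[0])
    client.zadd(index_key, {entry_id: ms})
    return entry_id


def count_between(client: Any, index_key: str, start_ms: int, end_ms: int) -> int:
    """Number of indexed entries with start_ms <= timestamp <= end_ms (ZCOUNT)."""
    return client.zcount(index_key, start_ms, end_ms)
```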

Redis: clean up members of ZSET

I'm currently studying Redis, and have the following case:
What I have is a sorted set per Google place ID, in which all posts are sorted from most recent to oldest.
The first page that is requested fetches posts < current timestamp.
When a cursor is sent to the backend, this cursor is a simple timestamp that indicates from where to fetch the next posts from the ZSET.
The query to retrieve posts by place id would be:
ZREVRANGEBYSCORE <gplaceId> <cur_timestamp> -INF WITHSCORES LIMIT <offset:timestamp as from where to fetch> <count:number of posts>
My question is what is the recommended way to clean up members of the ZSET.
Since I want to use Redis as a cache, I would prefer to limit the number of posts per place to, for example, 50. When a place that already has 50 posts in its set gets a new one, I want to drop the oldest post from the set.
Of course I realise that I could make a manual check on every insert and perform operations as such, but I wonder if Redis has a recommended way of performing this cleanup. Alternatively I could create a scheduler for this, but I would prefer not having to do this.
Unfortunately, Redis sorted sets do not come with an out-of-the-box feature for this. If sorted sets allowed a max size attribute with a configurable eviction strategy, you could have avoided some extra work.
See this related question:
How to specify Redis Sorted Set a fixed size?
In the absence of such a feature, the two approaches you mentioned are right:
1) Replace inserts with a transaction: insert, check size, delete if > 50
2) Run a thread that checks the size of the set and trims it periodically
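For the transactional variant, one common shortcut is to skip the size check and trim by rank on every insert with ZREMRANGEBYRANK. The sketch below assumes a redis-py style client passed in as `client`; the function and parameter names are illustrative.

```python
from typing import Any


def add_post(client: Any, place_key: str, post_id: str,
             timestamp: float, max_posts: int = 50) -> None:
    """Insert a post and trim the set to the max_posts highest scores in one MULTI/EXEC."""
    pipe = client.pipeline(transaction=True)
    pipe.zadd(place_key, {post_id: timestamp})
    # Ranks run from the lowest score (0) to the highest (-1), so this removes
    # everything except the max_posts newest members. It is a no-op while the
    # set holds max_posts members or fewer.
    pipe.zremrangebyrank(place_key, 0, -(max_posts + 1))
    pipe.execute()
```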

Best way for getting users friends top rating with Redis SORTED SET

I have a SORTED SET user_id:rating for every level in the game (2000+ levels). There are 2 000 000 users in the set.
I need to create 2 ratings: first, the top 100 of all users; second, the top 5 among each player's friends.
The first can be solved very easily with ZRANGE.
But there is a problem with the second, because on average every user has 500 friends.
There are 2 ways:
1) I can do 500 requests with ZSCORE/ZRANK and sort the users on the backend (too many requests, bad performance)
2) I can create a SORTED SET for each user and update it in the background on every user's update (more data, more RAM, more complex)
Maybe there are other options I missed?
I believe your main concern here should be your data model. Does every user have a sorted set of his friends?
I would recommend something like this:
users:{id}:friends values as the ids of friends
users:scoreboard values as the users ids and score as the rating of each
As an answer to your first concern, you can consider using pipelines, which will reduce the number of round trips drastically; nonetheless, you will still need to handle ordering the results.
The better answer to your problem, in case you have the two sorted sets as described earlier, would be:
Get the intersection between the two, using the "zinterstore" command and storing the result in a sorted set created solely for this purpose. As a result, the new sorted set will contain all the user's friends ids with their rating as the score (need to be careful here since you will need to specify the score of the new sorted set, it can either be the SUM, MIN or MAX of the scores).
ref: http://redis.io/commands/zinterstore
At this point, a simple "zrevrangebyscore" with a limit will give you the sorted result you are looking for.
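The two steps might look like this from application code. This is a sketch under assumptions: a redis-py style client is passed in as `client`, the scratch key name is made up, and ZREVRANGE by rank is used in place of ZREVRANGEBYSCORE, since only the top N is needed. Passing a dict to `zinterstore` sets WEIGHTS, and weight 0 on the friends key keeps the friend-side score out of the default SUM.

```python
from typing import Any, List, Tuple


def top_friends(client: Any, user_id: int,
                scoreboard_key: str = "users:scoreboard",
                limit: int = 5) -> List[Tuple[Any, float]]:
    """Return the user's friends with the highest ratings, best first."""
    friends_key = f"users:{user_id}:friends"
    tmp_key = f"users:{user_id}:friends:top"  # hypothetical scratch key
    # ZINTERSTORE tmp 2 friends scoreboard WEIGHTS 0 1
    client.zinterstore(tmp_key, {friends_key: 0, scoreboard_key: 1})
    top = client.zrevrange(tmp_key, 0, limit - 1, withscores=True)
    client.delete(tmp_key)
    return top
```

Concurrent calls for the same user would share the scratch key, so a per-request key or a transaction may be needed in practice.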

What Redis data type fit the most for following example

I have following scenario:
1. Fetch array of numbers (from REDIS) conditionally
2. For each number do some async stuff (fetch something from DB based on number)
3. For each thing in result set from DB do another async stuff
4. Periodically repeat 1. 2. 3. because new numbers will be constantly added to REDIS structure. Those numbers represent unix timestamp in milliseconds so out of the box those numbers will always be sorted in time of addition
Conditionally means fetch those unix timestamp from REDIS that are less or equal to current unix timestamp in milliseconds(Date.now())
The question is which REDIS data type fits this use case best, keeping in mind that this code will be scaled up to N instances, so N instances will share access to a single REDIS instance. To share the load equally, each instance will read, for example, the first (oldest) 5 numbers from REDIS. Numbers are unique (adding the same number should fail silently), so a REDIS SET seems like a good choice, but reading the first M elements from a REDIS set seems impossible.
To prevent two different instances of the code from reading the same numbers, the REDIS read operation should be atomic: it should read the numbers and delete them. If any async operation fails on a specific number (steps 2. and 3.), that number should be added to REDIS again to be handled again. It should be re-added at the head, not at the end, so that it is handled again as soon as possible. As far as I know, SADD would push it to the tail.
SMEMBERS key would read everything, which looks like a hammer to me. I would need to include some application logic to get the first five, then check which are less than or equal to Date.now(), then delete those, and somehow wrap everything in a single transaction. Besides that, the set cardinality can be huge.
SSCAN sounds interesting but i don't have any clue how it works in "scaled" environment like described above. Besides that, per REDIS docs: The SCAN family of commands only offer limited guarantees about the returned elements since the collection that we incrementally iterate can change during the iteration process. Like described above collection will be changed frequently
A more appropriate data structure would be the Sorted Set: members have a float score that is very suitable for storing a timestamp, and you can perform range searches (i.e. anything less than or equal to a given value).
The relevant starting points are the ZADD, ZRANGEBYSCORE and ZREMRANGEBYSCORE commands.
To ensure atomicity when reading and removing members, you can choose between the following options: Redis transactions, a Redis Lua script and, in the next version (v4), a Redis module.
Transactions
Using a transaction simply means running the following from your instances:
MULTI
ZRANGEBYSCORE <keyname> -inf <now-timestamp>
ZREMRANGEBYSCORE <keyname> -inf <now-timestamp>
EXEC
Where <keyname> is your key's name and <now-timestamp> is the current time.
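A minimal sketch of that transaction from a worker, assuming a redis-py style client is passed in as `client`. The function names are illustrative. Because the score is the timestamp itself, re-adding a failed number with ZADD restores its original position near the head, which covers the retry requirement in the question.

```python
import time
from typing import Any, List


def claim_due(client: Any, key: str) -> List[Any]:
    """Atomically read and remove every member whose score is <= now (in ms)."""
    now_ms = int(time.time() * 1000)
    pipe = client.pipeline(transaction=True)  # queued between MULTI and EXEC
    pipe.zrangebyscore(key, "-inf", now_ms)
    pipe.zremrangebyscore(key, "-inf", now_ms)
    due, _removed = pipe.execute()
    return due


def requeue(client: Any, key: str, timestamp_ms: int) -> None:
    """Put a failed timestamp back; its score restores its original ordering."""
    client.zadd(key, {str(timestamp_ms): timestamp_ms})
```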
Lua script
A Lua script can be cached and runs embedded in the server, so in some cases it is a preferable approach. It is definitely the best approach for short snippets of atomic logic if you need flow control (remember that a MULTI transaction returns the values only after execution). Such a script would look as follows:
local r = redis.call('ZRANGEBYSCORE', KEYS[1], '-inf', ARGV[1])
redis.call('ZREMRANGEBYSCORE', KEYS[1], '-inf', ARGV[1])
return r
To run this, first cache it using SCRIPT LOAD and then call it with EVALSHA like so:
EVALSHA <script-sha> 1 <key-name> <now-timestamp>
Where <script-sha> is the sha1 of the script returned by SCRIPT LOAD.
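Since the question asks for each instance to take only a handful of numbers, the script can be given a batch size. The variant below is a sketch, not the answer's original script: it adds a LIMIT clause and removes exactly the members it returns. It is sent with plain EVAL for brevity through a redis-py style client passed in as `client`. redis-py's `register_script` can take care of the SCRIPT LOAD / EVALSHA caching instead.

```python
from typing import Any, List

# Variant of the script above, capped at ARGV[2] members per call.
CLAIM_SCRIPT = """
local due = redis.call('ZRANGEBYSCORE', KEYS[1], '-inf', ARGV[1], 'LIMIT', 0, ARGV[2])
if #due > 0 then
  redis.call('ZREM', KEYS[1], unpack(due))
end
return due
"""


def claim_batch(client: Any, key: str, now_ms: int, batch_size: int = 5) -> List[Any]:
    """Atomically pop up to batch_size members with score <= now_ms."""
    # Equivalent to: EVAL <script> 1 <key> <now_ms> <batch_size>
    return client.eval(CLAIM_SCRIPT, 1, key, now_ms, batch_size)
```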
Redis modules
In the near future, once v4 is GA you'll be able to write and use modules. Once this becomes a reality, you'll be able to use this module we've made that provides the ZPOP command and could be extended to cover this use case as well.

Redis Sorted Set ... store data in "member"?

I am learning Redis and using an existing app (e.g. converting pieces of it) for practice.
I'm really struggling to understand first IF and then (if applicable) HOW to use Redis in one particular use-case ... apologies if this is super basic, but I'm so new that I'm not even sure if I'm asking correctly :/
Scenario:
Images are received by a server and info like time_taken and resolution is saved in a database entry. Images are then associated (e.g. "belong_to") with one Event ... all very straightforward for an RDBMS.
I'd like to use Redis to maintain a list of the 50 most-recently-uploaded image objects for each Event, to be delivered to the client when requested. I'm thinking that a Sorted Set might be appropriate, but here are my concerns:
First, I'm not sure if a Sorted Set can/should be used in this associative manner? Can it reference other objects in Redis? Or is there just a better way to do this altogether?
Secondly, I need the ability to delete elements that are greater than X minutes old. I know about the EXPIRE command for keys, but I can't use this because not all images need to expire at the same periodicity, etc.
This second part seems more like a query on a field, which makes me think that Redis cannot be used ... but then I've read that I could maybe use the Sorted Set score to store a timestamp and find "older than X" in that way.
Can someone provide some clarity on these two issues? Thank you very much!
UPDATE
Knowing that the amount of data I need to store for each image is small and will be delivered to the client's browser, is there anything wrong with storing it in the member "field" of a sorted set?
For example Sorted Set => event:14:pictures <time_taken> "{id:3,url:/images/3.png,lat:22.8573}"
This saves the data I need and creates a rapidly-updatable list of the last X pictures for a given event with the ability to, if needed, identify pictures that are greater than X minutes old ...
"First, I'm not sure if a Sorted Set can/should be used in this associative manner? Can it reference other objects in Redis?"
Why do you need to reference other objects? An event may have n image objects, each with a time_taken and image data; a sorted set is perfect for this. The event id goes into the key (e.g. event:14:pictures), the score is time_taken, and the member is the image data as JSON, XML or whatever; you're good to go there.
"Secondly, I need the ability to delete elements that are greater than X minutes old"
If you want to delete elements greater than X minutes old, use ZREMRANGEBYSCORE:
ZREMRANGEBYSCORE event:14:pictures -inf <currentTime - X minutes>
-inf stands for the lowest possible score, so you can target the oldest members without knowing their timestamps. The upper bound has to be calculated from the current time before you issue the command (the above is just an example).
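Computing that upper bound in application code could look like the sketch below. It assumes a redis-py style client passed in as `client` and scores stored as Unix seconds. The function names and the JSON layout are illustrative.

```python
import json
import time
from typing import Any, Dict, Optional


def add_picture(client: Any, event_id: int, picture: Dict[str, Any],
                time_taken: float) -> None:
    """Store the picture's JSON as the member, scored by time_taken (Unix seconds)."""
    member = json.dumps(picture, sort_keys=True)
    client.zadd(f"event:{event_id}:pictures", {member: time_taken})


def drop_older_than(client: Any, event_id: int, minutes: float,
                    now: Optional[float] = None) -> int:
    """Remove members scored at or before now - minutes; returns how many were removed."""
    now = time.time() if now is None else now
    cutoff = now - minutes * 60
    return client.zremrangebyscore(f"event:{event_id}:pictures", "-inf", cutoff)
```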