How to find results between two values of different keys with Redis? - redis

I'm creating a game matchmaking system using Redis based on MMR, a number that pretty much sums up the skill of a player, so the system can match a player with others of roughly the same skill.
For example, if a player with an MMR of 1000 joins the queue, the system will try to find other players with an MMR in the range of 950 to 1050 to match with this player. If after one minute it cannot find any such player, it widens the range by a constant threshold, to 900 to 1100.
What I want to do is really easy with relational database design but I can't figure out how to do it with Redis.
The queue table implementation would be like this:
+----+---------+------+-------+
| ID | USER_ID | MMR | TRIES |
+----+---------+------+-------+
| 1 | 50 | 1000 | 1 |
| 2 | 70 | 1500 | 1 |
| 3 | 350 | 1200 | 1 |
+----+---------+------+-------+
So when a new player queues up, the system checks their MMR against the other players in the queue. If it finds one within the 5% threshold, it matches the two players; if not, it adds the new player to the table and waits either for new players to queue up and be compared against, or for one minute to pass, at which point a cron job increments the tries and retries the matching.
The only way I can imagine doing this is to use two separate keys for the low and high bounds of each player in the queue, like this:
MatchMakingQueue:User:1:Low => 900
MatchMakingQueue:User:1:High => 1100
but the keys are all different, and I can't query for, say, all users between a low of 900 and a high of 1100!
I hope I've been clear enough; any help would be much appreciated.

As @Guy Korland suggested, a Sorted Set can be used to track and match players based on their MMR, and I do not agree with the OP's "won't scale" comment.
Basically, when a new player joins, the ID is added to a zset with the MMR as its score.
ZADD players:mmrs 1000 id:50
Matchmaking is done per user, e.g. for id:50, with the following query:
ZREVRANGEBYSCORE players:mmrs 1050 950 LIMIT 0 2
A match is found if two IDs are returned and at least one of them is different from that of the new player. To make the match, both IDs (the new player's and the matched one's) need to be removed from the set - I'd use a Lua script to implement this piece of logic (matching and removing) for atomicity and to reduce communication, but it can be done in the client as well.
There are different ways to keep track of the retries, but perhaps the simplest one is to use another Sorted Set, where the score is that metric.
The following Redis Lua script is a minimal example of the approach:
local kmmrs, kretries = KEYS[1], KEYS[2]
local id = ARGV[1]

-- The player's MMR and retry count; ZSCORE returns nil for a missing
-- member, so default the retries to 1 on the first attempt
local mmr = tonumber(redis.call('ZSCORE', kmmrs, id))
if not mmr then return redis.error_reply('player not queued') end
local retries = tonumber(redis.call('ZSCORE', kretries, id)) or 1

-- Widen the acceptable window by 5% per retry
local min, max = mmr*(1-0.05*retries), mmr*(1+0.05*retries)
local candidates = redis.call('ZREVRANGEBYSCORE', kmmrs, max, min, 'LIMIT', 0, 2)
if #candidates < 2 then
  -- No opponent found yet: bump the retry counter for the next attempt
  redis.call('ZINCRBY', kretries, 1, id)
  return nil
end

-- Pick the candidate that isn't the requesting player
local reply
if candidates[1] ~= id then
  reply = candidates[1]
else
  reply = candidates[2]
end

-- Remove both players from the queue atomically
redis.call('ZREM', kmmrs, id, reply)
redis.call('ZREM', kretries, id, reply)
return reply
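
A minimal sketch of driving that script from a client, assuming redis-py 3.x; MATCH_LUA is a hypothetical variable holding the Lua source above, and the key names are the ones used in this answer:

import redis

r = redis.Redis()

# MATCH_LUA is assumed to hold the Lua script shown above
match_player = r.register_script(MATCH_LUA)

# Enqueue the new player first, then attempt a match
r.zadd('players:mmrs', {'id:50': 1000})
opponent = match_player(keys=['players:mmrs', 'players:retries'], args=['id:50'])
if opponent is None:
    print('no match yet - the cron job retries later with a bumped counter')
else:
    print('matched with %s' % opponent)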

Let me get the problem right: you want to find all the users within a given range of MMR values. What if you make the other users say "I fall in this range" themselves?
Read about Redis Pub/Sub.
When a user joins, publish their MMR to the rest of the players.
Write code on the user side that checks whether their own MMR falls in the announced range.
If it does, the user publishes back to a channel that they fall in that range; otherwise, the user silently discards the message.
Repeat these steps if you get no response back within 1 minute.
You can make one channel (let's say MatchMMR) for all users to publish MMR match requests to, which should be subscribed to by all the users, and a user-specific channel in case somebody has an MMR in the calculated range.
Format your published messages so that you can send all the information - like "retry count", "match range percentage", and "MMR value" - so that your code on the user side can calculate whether it is the right fit for the MMR.
Read more about Redis Pub/Sub at: https://redis.io/topics/pubsub
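
Purely as a sketch of that flow with redis-py - the MatchMMR channel name comes from this answer, while the message fields and the 5%-per-retry window are assumptions:

import json
import redis

r = redis.Redis()

# Requester side: a new player announces their MMR on the shared channel
def announce(user_id, mmr, retries):
    msg = {'user_id': user_id, 'mmr': mmr, 'retries': retries}
    r.publish('MatchMMR', json.dumps(msg))

# Player side: check each announcement against our own MMR and, if we
# fall in the requested range, answer on the requester's own channel
def listen(my_id, my_mmr):
    p = r.pubsub()
    p.subscribe('MatchMMR')
    for m in p.listen():
        if m['type'] != 'message':
            continue
        req = json.loads(m['data'])
        window = req['mmr'] * 0.05 * req['retries']  # assumed widening rule
        if req['user_id'] != my_id and abs(req['mmr'] - my_mmr) <= window:
            r.publish('MatchMMR:%s' % req['user_id'], my_id)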

Related

Insert and Select Data in Redis

I need to enter the below data in Redis:
Atlanta_96_Bronze Ana_Moser Ida_Alvares Ana_Paula Hilma_Caldeira Leila_Barros Virna_Dias Marcia_Fu Ericleia_Bodziak Ana_Flavia_Sanglard Fernanda_Venturini Fofao_Helia_Souza Sandra_Suruagy
Sidney_00_Bronze Elisangela_Oliveira Erika_Coimbra Fofao_Helia_Souza Janina_Conceicao Karin_Rodrigues Katia_Lopes Kely_Fraga Leila_Barros Raquel_Silva Ricarda_Lima Virna_Dias Walewska_Oliveira
Pequim_08_Gold Marianne_Steinbrecher Fofao_Helia_Souza Paula_Pequeno Walewska_Oliveira Thaisa_Menezes Valeska_Menezes Welissa_Gonzaga Fabiana_Oliveira Fabiana_Claudino Sheilla_Castro Jaqueline_Carvalho Carolina_Albuquerque
Londres_12_Gold Fabiana_Claudino Dani_Lins Paula_Pequeno Adenizia_Silva Thaisa_Menezes Jaqueline_Carvalho Fernanda_Ferreira Tandara_Caixeta Natalia_Pereira Sheilla_Castro Fabiana_Oliveira Fernanda_Garay
And then perform the following queries:
Which players won gold and silver medals?
Which players won two gold medals?
Which players only won medal in '96?
Which players were in the 96, 00 and 08 Olympics?
Which players were only in the 12 olympics?
But I have never touched Redis; I come from a relational world and need help.
Redis doesn't really seem like the right tool for the job as you've described it: it is a key-value data structure store rather than a relational database, and there's no concept of running an ad hoc "query" in Redis.
If you must use Redis, I would recommend storing the data in multiple sets to facilitate getting to the answers you need.
You might make a set for each type of medal, for example: a set of gold medalists, a set of silver medalists, and a set of bronze medalists. Then you could ask for the union of the gold and silver sets (Redis's SUNION command) to get the answer to your first question.
You might also make a set for each year, so that you could retrieve information by year (for your last three questions).
In some cases, there may be no way around doing some coding to refine the results to give you exactly the answers you need.
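
A rough sketch of that layout, using a few names from the sample data (the key names are my own invention):

SADD medal:gold Fofao_Helia_Souza Paula_Pequeno Sheilla_Castro
SADD medal:bronze Ana_Moser Leila_Barros Fofao_Helia_Souza
SADD year:96 Ana_Moser Leila_Barros Fofao_Helia_Souza
SADD year:00 Leila_Barros Fofao_Helia_Souza
SADD year:08 Fofao_Helia_Souza Paula_Pequeno Sheilla_Castro
SADD year:12 Paula_Pequeno Sheilla_Castro Fabiana_Claudino

SUNION medal:gold medal:silver then lists everyone with a gold or silver medal, SINTER year:96 year:00 year:08 lists the players present at all three of those Olympics, and SDIFF year:12 year:96 year:00 year:08 lists the players who were only at the 12 Olympics. Counting questions like "two gold medals" aren't answerable with plain sets; a Sorted Set of per-player medal counts (maintained with ZINCRBY) would cover those.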

Trigger splunk alert when received values do not change

I receive exchange rates from an external web service and log the response received as below (note: both lines contain data from a single response):
com.test.Currency#366c1a1e[Id=<Null>,Code=<Null>,Feedcode=Gbparslite,Rate=<Null>,Percentaqechangetrigger=<Null>,Bid=93.4269,Offer=93.43987,Mustinvertprice=False],
com.test.Currency#54acb93a[Id=<Null>,Code=<Null>,Feedcode=Gbphkdlite,Rate=<Null>,Percentaqechangetrigger=<Null>,Bid=10.04629,Offer=10.04763,Mustinvertprice=False],
I want to set up an alert which triggers when the last x (x=5) values received did not change.
Assuming you're looking to alert when a particular field doesn't change after 5 events, you can try the following.
index=data | head 5 | stats dc(Bid) as dv
Then alert if dv equals 1. dc(Bid) calculates the number of distinct values of Bid, in this case over the last 5 events. If the value never changed, it will be 1; if there are multiple values, dc(Bid) will be greater than 1.

Splunk: Find the difference between 2 events

I have a server with 2 APIs: /migrate/start and /migrate/end
For each request, I log the userID (field usrid="") of the user using my service to be migrated and the api called (field api="").
Users call /migrate/start, then call /migrate/end. I would like to write a Splunk query to list the userIDs that are being migrated, i.e. those that called /migrate/start but have yet to call /migrate/end. How would I write that query?
Thank you
Assuming you have only 2 api calls (start/end) in the logs, you can use a stats command to do this.
| your_search
| stats values(api) as api by usrid
| where api!="/migrate/end"
This groups all the API calls made per user and filters out the users who have called /migrate/end.
The general method is to get all the start and end events and match them up by user ID. Take the most recent event for each user and throw out the ones that are "migrate/end". What's left are all the in-progress migrations. Something like this:
index = foo (api="/migrate/start" OR api="/migrate/end")
| stats latest(api) as api by usrid
| where api="/migrate/start"

Finding value collisions from a changing collection of Redis keys

On my website, users are allowed to keep the same usernames. Moreover, whenever a user logs in, I temporarily save their username in a Redis key with a TTL of 10 minutes.
The question is: is there any way - using Redis - to find all user ids online within the last 10 mins, sharing the same username?
Currently, I'm extracting all the keys' values and finding collisions in Python - which doesn't really help since I need to do this multiple times at runtime (and there's lots of user traffic).
I suppose I could have created sets with a unique username as the key, storing all the matching user ids in each set to give me O(1) lookups of users sharing the same username. But then I'd have to sacrifice the 10-minute TTL condition (which I need for every username individually).
Btw Redis/Lua beginner here, hence the noob question (if it is).
Where there is a will, there is a way... :)
Begin by storing the logins in a Sorted Set. Assuming that user id 123 had logged in at time 456 with the username "foo", you can represent that as:
ZADD logins 456 123:foo
Note: you'll also have to remove old elements from that Sorted Set so it doesn't just grow out of control.
Next, you want to search for the users from the last 10 minutes, so you'd use ZRANGEBYSCORE for that. Instead of shipping the entire thing back to your client, use Lua to process it and check for collisions.
The following script example wraps together all of the above:
-- Keys: 1) The logins Sorted Set
-- Args: 1) The epoch value of 'now'
--       2) The logged in user id
--       3) The logged in user name

-- Get logins from the last 10 minutes
local l = redis.call('ZRANGEBYSCORE', KEYS[1], ARGV[1]-600, '+inf')
-- "Evict" old logins
redis.call('ZREMRANGEBYSCORE', KEYS[1], '-inf', '(' .. ARGV[1]-600)
-- Store the new login
redis.call('ZADD', KEYS[1], ARGV[1], ARGV[2] .. ':' .. ARGV[3])

local c = {}                -- detected name collisions
for _, v in pairs(l) do
  local p = v:find(':')     -- no string.split in Lua
  local i = v:sub(1, p-1)   -- id
  local n = v:sub(p+1)      -- name
  if n == ARGV[3] then
    c[#c+1] = i
  end
end
return c
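
A sketch of calling that script from redis-py, with LOGIN_LUA standing for the script above and 'logins' being the Sorted Set key:

import time
import redis

r = redis.Redis()

# LOGIN_LUA is assumed to hold the Lua script shown above
check_logins = r.register_script(LOGIN_LUA)

# Record that user 123 ("foo") just logged in, and get back the ids of
# any other users seen with that username in the last 10 minutes
colliding = check_logins(keys=['logins'], args=[int(time.time()), 123, 'foo'])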

Redis sorted sets and best way to store uids

I have data consisting of user_ids and tags of these user ids.
The user_ids occur multiple times and have a pre-specified number of tags (500), though that might change in the future. What must be stored is the user_id, their tags, and their counts.
Later on, I want to easily find the tags with the top scores, etc. Every time a tag appears, its count is incremented.
My implementation in Redis uses sorted sets: every user_id is a sorted set, whose key is the user_id (a hex number). It works like this:
zincrby user_id:x 1 "tag0"
zincrby user_id:x 1 "tag499"
zincrby user_id:y 1 "tag3"
and so on
Keeping in mind that I want to get the tags with the highest scores, is there a better way?
The second issue is that right now I'm using "KEYS *" to retrieve these keys for client-side manipulation, which I know is not meant for production systems.
Also, it would be great for memory reasons to iterate through the keys a batch at a time (in the range of 10000). I know the keys have to be stored in memory, but they don't follow a specific pattern that would allow partial retrieval, which I need to avoid a "zmalloc" error (this is a 4 GB, 64-bit Debian server).
The keys amount to around 20 million.
Any thoughts?
My first point would be to note that 4 GB is tight for storing 20M sorted sets. A quick try shows that 20M users, each of them with 20 tags, would take about 8 GB on a 64-bit box (and that accounts for the sorted set ziplist memory optimizations provided with Redis 2.4 - don't even try this with earlier versions).
Sorted sets are the ideal data structure to support your use case. I would use them exactly as you described.
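
For instance, with the scheme from the question, a user's five highest-scored tags come back with a single command:

ZREVRANGE user_id:x 0 4 WITHSCORES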
As you pointed out, KEYS cannot be used to iterate over keys; it is rather meant as a debugging command. To support key iteration, you need to add a data structure providing this access path. The only structures in Redis which can support iteration are the list and the sorted set (through the range methods). However, they tend to transform O(n) iteration algorithms into O(n^2) (for lists) or O(n log n) (for zsets). A list is also a poor choice to store keys, since it will be difficult to maintain as keys are added and removed.
A more efficient solution is to add an index composed of regular sets. You need a hash function to associate a given user with a bucket, and add the user id to the set corresponding to this bucket. If the user ids are numeric values, a simple modulo function will be enough; if they are not, a simple string hashing function will do the trick.
So to support iteration on user:1000, user:2000 and user:1001, let's choose a modulo 1000 function. user:1000 and user:2000 will be put in bucket index:0 while user:1001 will be put in bucket index:1.
So on top of the zsets, we now have the following keys:
index:0 => set[ 1000, 2000 ]
index:1 => set[ 1001 ]
In the sets, the prefix of the keys is not needed, and this allows Redis to optimize memory consumption by serializing the sets, provided they are kept small enough (the integer set optimization proposed by Sripathi Krishnan).
The global iteration consists in a simple loop on the buckets from 0 to 1000 (excluded). For each bucket, the SMEMBERS command is applied to retrieve the corresponding set, and the client can then iterate on the individual items.
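
You can verify that the compact encoding actually kicks in on such a bucket (hypothetical contents); OBJECT ENCODING reports "intset" as long as the set holds only integers and stays below the set-max-intset-entries limit:

SADD index:0 1000 2000
OBJECT ENCODING index:0
SMEMBERS index:0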
Here is an example in Python:
#!/usr/bin/env python
# -*- coding: utf-8 -*-

# ----------------------------------------------------

import redis, random

POOL = redis.ConnectionPool(host='localhost', port=6379, db=0)

NUSERS = 10000
NTAGS = 500
NBUCKETS = 1000

# ----------------------------------------------------
# Fill redis with some random data
def fill(r):
  p = r.pipeline()
  # Create only 10000 users for this example
  for id in range(0,NUSERS):
    user = "user:%d" % id
    # Add the user in the index: a simple modulo is used to hash the user id
    # and put it in the correct bucket
    p.sadd( "index:%d" % (id%NBUCKETS), id )
    # Add random tags to the user
    for x in range(0,20):
      tag = "tag:%d" % (random.randint(0,NTAGS))
      p.zincrby( user, tag, 1 )
    # Flush the pipeline every 1000 users
    if id % 1000 == 0:
      p.execute()
      print id
  # Flush one last time
  p.execute()

# ----------------------------------------------------
# Iterate on all the users and display their 5 highest ranked tags
def iterate(r):
  # Iterate on the buckets of the key index
  # The range depends on the function used to hash the user id
  for x in range(0,NBUCKETS):
    # Iterate on the users in this bucket
    for id in r.smembers( "index:%d" % (x) ):
      user = "user:%d" % int(id)
      print user, r.zrevrangebyscore( user, "+inf", "-inf", 0, 5, True )

# ----------------------------------------------------
# Main function
def main():
  r = redis.Redis(connection_pool=POOL)
  r.flushall()
  m = r.info()["used_memory"]
  fill(r)
  info = r.info()
  print "Keys: ", info["db0"]["keys"]
  print "Memory: ", info["used_memory"] - m
  iterate(r)

# ----------------------------------------------------
main()
By tweaking the constants, you can also use this program to evaluate the global memory consumption of this data structure.
IMO this strategy is simple and efficient, because it offers O(1) complexity to add/remove users, and true O(n) complexity to iterate on all items. The only downside is the key iteration order is random.