Let's say we want to design a gaming platform for chess, with a userbase of ~50M. Whenever a player comes to our platform to play chess, we assign them a random opponent with roughly the same rating. Every player has one of 4 rating levels: [Easy, Medium, Hard, Expert].
My approach was to keep all users in a Redis cache (assume all users/bots are live and waiting for an opponent), so we keep data in the format below:
"chess:easy" : [u1, u2, u3]
"chess:medium" : [u4, u5, u6]
So when a user comes, I will remove a user from the cache and assign them as the opponent.
For example: if u7 (easy) wants to play a chess game, then his opponent will be u1 (easy).
But won't this create a problem for concurrent requests, since we read and then remove from the Redis list, which will be blocking?
Can anyone suggest a better approach with or without cache?
But won't this create a problem for concurrent requests, since we read and then remove from the Redis list, which will be blocking?
No, because Redis executes commands on a single thread, so writes are serialized. But assuming the 50M users are equally distributed across the chess levels, you would get 4 lists, one per difficulty, each with a length of 12.5 million.
Manipulating the list could also be costly: [LPOP][1] is O(1) when popping a single element, but it always pops from the head, and removing a specific user from the middle of the list (e.g. with LREM) is O(N), so a user at the tail of a 12.5-million-entry list may wait a long time before getting an opponent.
I think you should instead use a HASH data structure and split the users across different databases:
db0: Easy
db1: Medium
db2: Hard
db3: Expert
In this way, you can register the users who want to play with HSET <USERID> status "ready to play", then benefit from the RANDOMKEY command to select an opponent and delete it with HDEL <KEY RETURNED BY RANDOMKEY> status.
Doing so, you will only execute O(1) commands, providing a fast and reliable matching system that you can optimize further with the Redis pipelining feature.
NB: if you instead use one hash per difficulty level and add multiple fields to it, you will hit O(N) complexity due to the HDEL command!
[1]: https://redis.io/commands/lpop
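Here is a minimal redis-py sketch of that flow, assuming one database per difficulty as listed above (the db numbers and function names are illustrative):

```python
import redis

# One database per difficulty, mirroring db0..db3 above (numbers assumed).
LEVEL_DB = {"easy": 0, "medium": 1, "hard": 2, "expert": 3}

def wait_for_game(user_id, level):
    """Register a user as waiting for an opponent at this level."""
    r = redis.Redis(db=LEVEL_DB[level], decode_responses=True)
    r.hset(user_id, "status", "ready to play")   # O(1)

def find_opponent(user_id, level):
    """Draw a random waiting player and remove them from the pool."""
    r = redis.Redis(db=LEVEL_DB[level], decode_responses=True)
    opponent = r.randomkey()                     # O(1), random key in this db
    if opponent is None or opponent == user_id:
        return None                              # nobody else is waiting
    r.hdel(opponent, "status")                   # O(1) for a single field
    return opponent
```

Note that RANDOMKEY followed by HDEL is not atomic on its own: two concurrent requests could draw the same opponent, so in practice you would wrap the pair in a Lua script or a WATCH/MULTI/EXEC transaction.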
Can anyone suggest a better approach with or without cache?
There are many algorithms you may use such as:
Stable marriage problem
Exact cover
but it depends on the user experience you want to implement for your clients.
You can explore the gamedev.stackexchange.com community
I want to implement basket functionality and store the basket using Redis in my WebApi. Throughout the project I am using CQRS and MediatR for database operations, but I don't know how this should be implemented in the case of Redis.
Should I implement operations on my basket the same way, like GetBasketByIdRequest, GetBasketByIdResponse, GetBasketByIdHandler, GetBasketByIdCommand, GetBasketByIdQuery, etc.?
Or handle it separately, via an IBasketRepository?
I am really curious whether creating MediatR handlers the same way they are typically created for a database makes sense, or if creating some service class would be better.
The repo I'm referring to: https://github.com/TryCatchLearn/skinet7/commit/73ecdb7626a36611686fad16c2c5108afb9c7534
Thanks for any help and advice!
You should treat Redis much like you do any database access.
Redis read/write should "just" surface as an abstract Unit of work or repository pattern.
A quick look at your code and I see your project has an IRepository. Redis should "just" be surfaced to the app as another implementation of IRepository.
The fact that it is in-memory and key-value rather than an RDBMS (or a NoSQL store, for that matter) is irrelevant.
This is your data store, and you should abstract your reads and writes to any data store in a consistent manner. Granted, once you pick a database you're quite unlikely to change it; it's a fundamental choice. But Redis, being a sort of cache, is an exception: you might later decide to move to, say, MongoDB as the flower empire grows, or Redis may prove expensive at low order volumes.
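The repo in question is C#, but the shape of the abstraction is language-agnostic. A minimal sketch of the idea in Python (the class names and key scheme are illustrative, not taken from the repo):

```python
import abc
import json

class BasketRepository(abc.ABC):
    """What the rest of the app depends on; no Redis types leak out."""

    @abc.abstractmethod
    def get(self, basket_id): ...

    @abc.abstractmethod
    def save(self, basket_id, basket): ...

class RedisBasketRepository(BasketRepository):
    """Redis surfaced as 'just another' repository implementation."""

    def __init__(self, client):
        self.client = client  # a redis.Redis instance

    def get(self, basket_id):
        raw = self.client.get(f"basket:{basket_id}")
        return json.loads(raw) if raw else None

    def save(self, basket_id, basket):
        self.client.set(f"basket:{basket_id}", json.dumps(basket))
```

Swapping Redis for MongoDB later then means writing one new implementation, not touching the handlers.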
There is one caveat though with baskets held in redis.
Abandoned baskets are a thing. You need to track how long ago a basket was last used. You also need to link a basket to an account, so you should either key baskets by account id or add a basket id to the user accounts table if you have one. (I didn't look at your code in enough depth to check.)
You will want a batch process of some sort that removes any baskets which have not been accessed for a month or so; abandoned baskets will clog up Redis eventually if you don't do this. In any case, should a user log back in come December, they could well be surprised to see that potential valentine present still lingering.
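One way to get that cleanup almost for free is Redis key expiry, rather than (or in addition to) a batch job. A sketch, assuming baskets are serialized as JSON under basket:<id> keys and a 30-day abandonment window:

```python
import json
import redis

r = redis.Redis(decode_responses=True)
BASKET_TTL = 60 * 60 * 24 * 30   # ~one month, matching the policy above

def save_basket(basket_id, basket):
    # EX sets a TTL, so Redis evicts abandoned baskets by itself.
    r.set(f"basket:{basket_id}", json.dumps(basket), ex=BASKET_TTL)

def get_basket(basket_id):
    raw = r.get(f"basket:{basket_id}")
    if raw is None:
        return None
    r.expire(f"basket:{basket_id}", BASKET_TTL)  # sliding expiry on access
    return json.loads(raw)
```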
I'm trying to build a freelance platform using MongoDB as the main database and Redis for caching, but I really couldn't figure out the proper way to cache. Basically, I'll store JWT tokens, verification codes and other data with expiration dates. On the other hand, let's say I'll store 5 big collections: Gigs, JobOffers, Reviews, Users, Companies. I also want to query them.
Example Use Case 1
Getting job offers only categorised as "Web Design"
Example Use Case 2
Getting job offers only posted by Company X
Option 1
For these two queries I can create two hashes:
hash1
"job-offers:categoryId", jobOfferId, JobOffer
hash2
"job-offers:companyId", jobOfferId, JobOffer
Option 2
Using RedisJSON and RediSearch for querying, holding everything in JSON format
Option 3
Using RediSearch with multiple hashes
I couldn't figure out which approach would be best, or whether there is another approach better than both of them.
Option 1 seems suitable for your scenario. Binding job offers to category or company ids is the smartest solution.
You can use HGETALL to get all of the fields' data from your hash.
When using Redis as a request-caching mechanism, remember that you have to keep the Redis cache consistent with the SQL or NoSQL database it is generated from.
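For illustration, a small redis-py sketch of Option 1 (the key scheme and field names such as categoryId are assumptions):

```python
import json
import redis

r = redis.Redis(decode_responses=True)

def cache_offer(offer):
    payload = json.dumps(offer)
    # Option 1's two hashes: one per category, one per company,
    # each mapping the job offer id to the serialized JobOffer.
    r.hset(f"job-offers:category:{offer['categoryId']}", offer["id"], payload)
    r.hset(f"job-offers:company:{offer['companyId']}", offer["id"], payload)

def offers_in_category(category_id):
    # Use case 1: every offer in a category via a single HGETALL.
    raw = r.hgetall(f"job-offers:category:{category_id}")
    return [json.loads(v) for v in raw.values()]

def evict_offer(offer):
    # Keep the cache consistent: call this whenever the offer changes in MongoDB.
    r.hdel(f"job-offers:category:{offer['categoryId']}", offer["id"])
    r.hdel(f"job-offers:company:{offer['companyId']}", offer["id"])
```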
Good question.
As far as I can see, Redis keeps its data (and Mongo keeps part of its data) in RAM, and RAM is more expensive than disk. If you don't care about the price, can handle your use cases with Redis/Mongo, and the data can be recovered from AOF/RDB files (or things like that), then you can use whichever you want.
If you do care about the price of RAM, just use MySQL with the InnoDB engine: it is cheap, stores data on disk, can recover, and you know a lot of people use these systems (MySQL, Postgres).
If I were you, I would probably choose MySQL with InnoDB and build the right indexes; it is fast enough for tables that hold millions of rows (it gets less comfortable at hundreds of millions of rows).
I started using Redis today and I've been through the tutorial and some links on Stack Overflow, but I'm failing to understand how to properly use Redis for what seems to be a very simple use case.
Goal: Save several users data into redis and read all of the users at once.
I start a redis client and I start by adding the first user which has id 1:
127.0.0.1:6379> hmset user:1 name "vitor" age 35
OK
127.0.0.1:6379> hgetall user:1
1) "name"
2) "vitor"
3) "age"
4) "35"
I add a couple of more users, doing several command like this one:
127.0.0.1:6379> hmset user:2 name "nuno" age 10
I was (probably wrongly) expecting to be able to now query all my users by doing:
hgetall "user:"
or even
hgetall "user:*"
The fact that I've not seen anything like this in the tutorials kind of tells me that I'm not using Redis right for this use case.
Would you be able to tell me what should be the approach for this use case?
To understand why these kind of operations seem non-trivial in NoSQL implementations, it's good to think about why NoSQL exists (and has become very popular) at all.
When you look at an early NoSQL implementation like memcached, the first use case was very simple but very important: a blazingly fast cache for distributed data, for example to cache web page data. Very quickly, features like clustering and sharding were added, so that not all data has to be present at every single node in the cluster, but can be gathered on demand.
NoSQL is very different from relational data storage. Don't overuse it. Consider relational databases as well, as they are sometimes far more suited for what you are trying to accomplish. In everything you design, ask yourself "Does this scale well?".
Okay, back to your question. It is in general bad practice to do wildcard searches. You prepare your data in a way that lets you retrieve it in a scalable way.
Redis is a very chic solution, allowing you to overcome a lot of NoSQL limitations in an elegant way.
If getting "a list of all users" isn't something you have to do very often, doesn't need to scale well, or really is "I always want all users" because it's for a daily scan anyway, use SCAN (or HSCAN if you keep all users in a single hash). SCAN operations with a proper batch size don't get in the way of other clients; you can retrieve your records a couple of thousand at a time, and after a few calls you've got everything.
You can also store your users in a SET. There's no ordering in a set, so no pagination. It can help to keep your user names unique.
If you want to do things like "get me all users that start with the letter 'a'", I'd use a ZSET. I'd wait a week or two for ZRANGEBYLEX, which is just about to be released and in the works as we speak. Or use an ORM like Josiah Carlson's 'rom' package.
When you ask yourself "But now I have to do three calls instead of one when storing my data...?!": yup, that's how it works. If you need atomicity, use a Lua script, or MULTI+EXEC pipelining. Lua is generally easier.
You can also ask yourself if using a HSET is needed. Do you need to retrieve the individual data members? Each key or member has some overhead. On top of that, HGETALL has a Big-O specification of O(N), so it doesn't scale well. It might be better to serialize your row as a whole, using JSON or MsgPack, and store it in one HSET member, or just a simple GET/SET. Also read up on SORT.
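A small redis-py sketch of the SCAN approach, assuming users live under user:<id> hashes as in your example:

```python
import redis

r = redis.Redis(decode_responses=True)

def all_users(batch=1000):
    """Yield every user:<id> hash without blocking the server."""
    # scan_iter wraps SCAN with MATCH/COUNT and handles the cursor for you.
    for key in r.scan_iter(match="user:*", count=batch):
        yield key, r.hgetall(key)

for key, fields in all_users():
    print(key, fields)  # e.g. user:1 {'name': 'vitor', 'age': '35'}
```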
Hope this helps, TW
If you still want to use Redis, you can use something like:
SADD users '{"userId":1,"name":"vitor","age":35}'
SADD users '{"userId":2,"name":"nuno","age":10}'
...
And you can retrieve the same using:
SMEMBERS users
There are many accounts, which receive events (data points with timestamps) stored in real time. I read that it is a good idea to store events using a sorted set. I tried to store events for multiple accounts in one sorted set, but then couldn't figure out how to filter events by account id.
Is it a good idea to create a separate sorted set for each account (>1000 accounts)?
Questions:
How long will you keep these events in memory?
Won't your number of accounts grow?
Are you sure you will have enough memory?
... but yes, you should definitely create a sorted set for each account; that's the state of the art when using Redis.
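A minimal sketch of that layout (the events:<accountId> key scheme is an assumption):

```python
import time
import redis

r = redis.Redis(decode_responses=True)

def record_event(account_id, payload, ts=None):
    # One sorted set per account; the event timestamp is the score.
    # NB: members must be unique, so identical payloads need a unique suffix.
    score = time.time() if ts is None else ts
    r.zadd(f"events:{account_id}", {payload: score})

def events_between(account_id, start, end):
    # Time-range queries become a single ZRANGEBYSCORE per account.
    return r.zrangebyscore(f"events:{account_id}", start, end)
```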
However, if it's all about real-time events (storing and retrieval), you may want to give a try to a database like InfluxDB, which provides a powerful SQL-like query system. It seems a better answer to your problem.
I'm implementing a Leaderboard into my django web app and don't know the best way to do it. Currently, I'm just using SQL to order my users and, from that, make a Leaderboard, however, this creates two main problems:
Performance is shocking. I've only tried scaling it to a few hundred users, but I can tell calculating rankings is slow, and excessive caching is annoying since I need users to see their ranking right after they are added to the leaderboard.
It's near-impossible to tell a user what position they are without performing the whole Leaderboard calculation again.
I haven't deployed yet, but I estimate about 5% of traffic will be updates to the leaderboard vs 95% reads (probably more, actually). So my latest idea is to recalculate the leaderboard each time a user is added, with a position field I can easily sort by and no need to recalculate just to display a user's ranking.
However, could this be a problem if multiple users are committing at the same time? Will locking be enough, or will rankings get messed up? Additionally, I plan to put this on a separate database solely for these leaderboards; which one is best? I hear good things about Redis...
Any better ways to solve this problem? (anyone know how SO makes their leaderboards?)
I've written a number of leaderboard libraries that could help you out here. The one of immediate use is python-leaderboard, which is based on the reference implementation, the leaderboard Ruby gem. Using Redis sorted sets, your leaderboard will be ranked in real time, and there is a section on the leaderboard page covering performance metrics for inserting a large number of members at once. You can expect to rank 1 million members in around 30 seconds if you pipeline writes.
If you're worried about the data changing too often in real-time, you could operate Redis in a master-slave configuration and have the leaderboards pull data from the slave, which would only poll periodically from the master.
Hope this helps!
You will appreciate the concept of sorted sets in Redis.
Don't miss the paragraph which describes your problem :D
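A minimal redis-py sketch of what a Redis sorted-set leaderboard looks like (the "leaderboard" key name is an assumption):

```python
import redis

r = redis.Redis(decode_responses=True)

def set_score(user_id, score):
    r.zadd("leaderboard", {user_id: score})    # O(log N) per update

def rank_of(user_id):
    # ZREVRANK is O(log N): no full recalculation to show one user's position.
    rank = r.zrevrank("leaderboard", user_id)
    return None if rank is None else rank + 1  # 1-based position

def top(n=10):
    return r.zrevrange("leaderboard", 0, n - 1, withscores=True)
```

Both the 95% reads (top-N plus "where am I?") and the 5% updates stay cheap, and concurrent writes can't corrupt rankings because Redis executes each command atomically.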
Make a table that stores user id and user score. Just pull the leaderboard using
ORDER BY user_score DESC
and join the Main table for the User name or whatever else you need.
Unless the total number of tests is a variable in your equation, each user's ranking calculation stays the same, so you can just update individual entries.