Node.js & Redis:
I have a LIST (users:waiting) storing a queue of users waiting to join games.
I have a SORTED SET (games:waiting) of games waiting for users. Each server updates its entry every 30s with a new date. This way, if a server crashes, its game is no longer used. If the server is running and fills up, it removes itself from the sorted set.
Each game has a SET (game:id:users) containing the users that are in it. Each game can accept no more than 6 players.
Multiple servers are using BRPOP to pick up users from the LIST (users:waiting).
Once a server has a user id, it gets the waiting games ids, then proceeds to run SCARD on their game:id:users SET. If the result of this is less than 6, it adds them to the set.
The problem:
If multiple servers are doing this at once, we could end up with more than 6 users being added to a set at a time. For example if one server requests SCARD and immediately after another runs SADD, the number in the set will have increased but the first server won't know.
Is there any way of preventing this?
You need transactions, which Redis supports: http://redis.io/topics/transactions
In your case in particular, pay attention to the WATCH command: http://redis.io/topics/transactions#cas
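To make the WATCH suggestion concrete, here is a minimal sketch in Python using the redis-py pipeline API (the question uses Node.js, but the command sequence is the same). The function name `try_join_game`, the `max_attempts` parameter, and the `client` argument (assumed to be a `redis.Redis` instance) are illustrative, not part of the original posts.

```python
try:
    from redis import WatchError
except ImportError:
    # Fallback so the sketch can be exercised without redis-py installed.
    class WatchError(Exception):
        pass

MAX_PLAYERS = 6  # limit stated in the question


def try_join_game(client, game_id, user_id, max_attempts=10):
    """Add user_id to game:<game_id>:users only if the set has room.

    Returns True if the user was added, False if the game is full or
    every attempt was interrupted by a concurrent writer.
    """
    key = "game:{0}:users".format(game_id)
    with client.pipeline() as pipe:
        for _ in range(max_attempts):
            try:
                # WATCH makes the later EXEC fail if `key` changes in between.
                pipe.watch(key)
                if pipe.scard(key) >= MAX_PLAYERS:
                    pipe.unwatch()
                    return False
                pipe.multi()             # start queuing commands
                pipe.sadd(key, user_id)
                pipe.execute()           # raises WatchError if key was modified
                return True
            except WatchError:
                continue                 # another server got in first; re-check
    return False
```

If another server runs SADD between the SCARD and the EXEC, the transaction is aborted and the loop re-reads the cardinality, so the set should never grow past 6 through this code path.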
Problem
we have ~50k scheduled financial reports that we periodically deliver to clients via email
reports have their own delivery frequency (date&time format - as configured by clients)
weekly
daily
hourly
weekdays only
etc.
Current architecture
we have a table called report_metadata that holds report information
report_id
report_name
report_type
report_details
next_run_time
last_run_time
etc...
every week, all 6 instances of our scheduler service poll the report_metadata database, extract metadata for all reports that are to be delivered in the following week, and put them in an in-memory timed queue.
Only in the master/leader instance (which is one of the 6 instances):
data in the timed-queue is popped at the appropriate time
processed
a few API calls are made to get a fully-complete and current/up-to-date report
and the report is emailed to clients
the other 5 instances do nothing - they simply exist for redundancy
Proposed architecture
Numbers:
db can handle up to 1000 concurrent connections - which is good enough
total existing report number (~50k) is unlikely to get much larger in the near/distant future
Solution:
instead of polling the report_metadata db every week and storing data in a timed-queue in-memory, all 6 instances will poll the report_metadata db every 60 seconds (with a 10 s offset for each instance)
on average the scheduler will attempt to pick up work every 10 seconds
data for any single report whose next_run_time is in the past is extracted, the table row is locked, and the report is processed/delivered to clients by that specific instance
after the report is successfully processed, the table row is unlocked and the next_run_time, last_run_time, etc. for the report are updated
In general, the database serves as the master, individual instances of the process can work independently and the database ensures they do not overlap.
It would help if you could let me know:
whether the proposed architecture is a good/correct solution
which table columns can/should be indexed
any other considerations
I have worked on a different kind of scheduler, for a program that reported analyses at a specific moment of the month/week. What I did was group the reports into so-called business-cycle-based moments, such as "start of a new week", "start of the month", or "start/end of a D/W/M/Q/Y". So I standardised the moments at which reports are sent and added the ids to a table that carries the details of the report. Now you add things to a cycle or remove them when needed; you could do this by adding a tag such as EOD (end of day), EOM (end of month), SOW (start of week), etc.
So you could index the moments of when the clients want to receive the reports and build on that track. Hope that this comment can help you with your challenge.
It seems good to simply query that metadata table by all 6 instances to check which is the next report to process as you are suggesting.
It seems odd though to have a staggered approach with a check once every 60 seconds offset by 10 seconds for your servers. You have 6 servers now but that may change. Also, I don't understand the "locking" you are suggesting. Why not simply set a flag on the row such as [State] = "processing", so that the next scheduler knows to skip that row and move on to the next available one? Once a run is processed, you can simply update a [Date_last_processed] column, or maybe something like [last_cycle_complete] = 'YES'.
Alternatively, you could have one server process go through the table and, for each available row, send it off to one of the instances in a round-robin fashion (or keep track of who is busy and who isn't).
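The row-flag idea can be sketched as a conditional UPDATE, so that only one instance wins each row. This is a minimal illustration in Python with SQLite standing in for the real database; the `state` and `claimed_by` columns, the index, and all function names are assumptions added for the example, and timestamps are plain integers for brevity.

```python
import sqlite3


def create_schema(conn):
    # `state` and `claimed_by` are hypothetical columns, not in the original table.
    conn.execute(
        "CREATE TABLE report_metadata ("
        "report_id INTEGER PRIMARY KEY, next_run_time INTEGER, "
        "last_run_time INTEGER, state TEXT NOT NULL DEFAULT 'idle', "
        "claimed_by TEXT)"
    )
    # An index on the columns used by the polling query (an assumption).
    conn.execute("CREATE INDEX idx_due ON report_metadata (state, next_run_time)")
    conn.commit()


def claim_next_report(conn, instance_name, now):
    """Mark one due report as 'processing' and return its id, or None."""
    while True:
        row = conn.execute(
            "SELECT report_id FROM report_metadata "
            "WHERE state = 'idle' AND next_run_time <= ? "
            "ORDER BY next_run_time LIMIT 1",
            (now,),
        ).fetchone()
        if row is None:
            return None
        # The WHERE clause re-checks state, so only one instance can win the row.
        cur = conn.execute(
            "UPDATE report_metadata SET state = 'processing', claimed_by = ? "
            "WHERE report_id = ? AND state = 'idle'",
            (instance_name, row[0]),
        )
        conn.commit()
        if cur.rowcount == 1:
            return row[0]
        # rowcount 0: another instance claimed it first; look for the next one.


def finish_report(conn, report_id, now, next_run_time):
    """Release the row and record when it ran and when it is due next."""
    conn.execute(
        "UPDATE report_metadata SET state = 'idle', claimed_by = NULL, "
        "last_run_time = ?, next_run_time = ? WHERE report_id = ?",
        (now, next_run_time, report_id),
    )
    conn.commit()
```

One thing to consider with a flag like this is that an instance that crashes mid-run leaves its row in 'processing', so some form of timeout on claimed rows would probably be needed.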
I like this article: http://technet.microsoft.com/en-us/library/dd576261(v=sql.100).aspx because of the receive top (10000) into a table variable. Processing a table variable with 10000 messages would give me a giant boost in performance.
receive top (10000) message_type_name, message_body, conversation_handle
from MySSBLabTestQueue
into #receive
From reading, the receive provides messages given a single conversation_handle. I have 200+ stores all sending messages with the same message type and contract to the same server. Can I implement the server to get all the messages from these stores on a single call to receive?
Thanks
A target can consolidate multiple conversations into a few conversation groups using MOVE CONVERSATION. RECEIVE restricts the result set to one single conversation group, so moving many individual conversations into a single group can result in bigger result sets, as you desire.
For the record, initiators can also consolidate conversations using MOVE CONVERSATION; there is nothing role-specific here. But initiators can also use the RELATED_CONVERSATION_GROUP clause of BEGIN DIALOG to start the conversation directly in the desired group, achieving consolidation and thus bigger result sets without having to use MOVE. This is useful because you can simply reverse the roles in the app, i.e. instead of the stores starting the dialogs with the central server, have the central server start the dialogs with each store, and the central server can start the dialogs in as few conversation groups as it likes, even 1. This removes the need to issue MOVE CONVERSATION.
Previously, for my PHP app, I used a cron job that ran every 10 minutes and incremented the health of all users in SQL.
For my next app, I tried using MySQL events to increment the health every minute for each individual user, and ran into problems with them no longer working after a while (MySQL events stop working after a while).
What's the best way to do this if I were to create a new app in Ruby on Rails? I'm open to using MySQL or PostgreSQL.
This is for a game where users will fight each other and lose health.
edit: Sometimes the user will encounter another user, and I need to select that user based on their health among other things. So I need the actual health stored in the database.
Instead of updating every record in the database every 10 minutes, store a last-modified timestamp in the same row as the health. Every time you read the player_health from the database, add (current_time - last_modified) / (10 min) to the value. Every time you write player_health to the database, update the last_modified.
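As a rough sketch of that read/write rule (in Python rather than Ruby, with timestamps as Unix seconds), the regeneration amount per tick and the health cap below are assumptions, since the question does not state them:

```python
REGEN_INTERVAL_SECONDS = 10 * 60  # one regeneration tick every 10 minutes
REGEN_PER_TICK = 1                # assumed: +1 health per tick
MAX_HEALTH = 100                  # assumed cap; not stated in the question


def current_health(stored_health, last_modified, now):
    """Health as seen at `now`, derived from the stored value and its timestamp."""
    ticks = int((now - last_modified) // REGEN_INTERVAL_SECONDS)
    return min(MAX_HEALTH, stored_health + ticks * REGEN_PER_TICK)


def apply_damage(stored_health, last_modified, now, damage):
    """Return (new_stored_health, new_last_modified) to write back to the row."""
    health = max(0, current_health(stored_health, last_modified, now) - damage)
    return health, now
```

For example, a player stored with 50 health at time 0 reads as 53 at time 1800, without any background job having touched the row.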
I would create a rake task that increases all users' health by 10, and call it using the awesome whenever gem every 10 minutes.
UPDATE
However, as Dan said in his comment, it might be inefficient to do such a huge DB update every 10 minutes (especially if you have a huge number of users) if you can instead update each user's health when they request it. But that depends on how your game actually works.
The correct fix, given a health bump of 10 points every minute, is a hitpoints variable and a timestamp for the last time it was set. Then the select statement will say "hitpoints + minutes(now - timestamp) * 10". Converting that to SQL is left as an exercise for the reader.
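One possible answer to that exercise, shown here against SQLite from Python so it can be run as-is. The table name `users`, the column `hp_set_at` (Unix seconds), and the helper name are hypothetical; MySQL or PostgreSQL would use their own date arithmetic in place of the integer subtraction.

```python
import sqlite3

# hitpoints + minutes(now - timestamp) * 10, with timestamps in Unix seconds.
# Integer division by 60 yields whole elapsed minutes.
HEALTH_SQL = "hitpoints + ((:now - hp_set_at) / 60) * 10"


def users_with_health_at_least(conn, threshold, now):
    """Return (user_id, health) rows whose computed health is >= threshold."""
    rows = conn.execute(
        "SELECT user_id, " + HEALTH_SQL + " AS health FROM users "
        "WHERE " + HEALTH_SQL + " >= :threshold ORDER BY user_id",
        {"now": now, "threshold": threshold},
    )
    return rows.fetchall()
```

Because the expression is evaluated inside the query, opponents can be selected by their effective health even though the stored `hitpoints` value is only written when something changes it.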
I was told Redis was born for analytics, and I came across some bitmap use cases. They are useful when counting based on yes/no (0/1) flags, but I can't find an efficient way to count the number of users who logged in at least 4 times during the last 10 days. Because Redis runs in memory, I tried using a bitmap to keep track of each user's login flags and BITCOUNT to filter. On my laptop, it took a minute to return the count from about 4 million users' login activity.
Is there any way to solve this problem? I guess the round trips between my node redis client and redis server may be the issue, I'll try batch command or lua script to see if it works.
I think you need to use a sorted set with the user id as the value and the timestamp as the score.
When a user logs in, the score (timestamp) for that user is updated to the current time. Then you can get either the N most recently logged-in users (ZREVRANGE) or the users who logged in within some datetime range (ZRANGEBYSCORE).
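A small sketch of that sorted-set layout in Python, assuming `client` is a redis-py `Redis` instance; the key name `users:last_login` and the function names are made up for the example. Since each member has a single score, this keeps only the latest login per user.

```python
import time

LOGINS_KEY = "users:last_login"  # hypothetical key name


def record_login(client, user_id, now=None):
    """Set the user's score to the login time (ZADD overwrites the old score)."""
    now = time.time() if now is None else now
    client.zadd(LOGINS_KEY, {user_id: now})


def users_logged_in_between(client, start, end):
    """Users whose latest login falls within [start, end] (ZRANGEBYSCORE)."""
    return client.zrangebyscore(LOGINS_KEY, start, end)


def most_recent_users(client, n):
    """The n most recently logged-in users, newest first (ZREVRANGE)."""
    return client.zrevrange(LOGINS_KEY, 0, n - 1)
```

Both lookups are a single round trip, regardless of how many users are stored in the set.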
I'm making a game that uses redis to store game state. It keeps track of locations and players pretty well, but I don't have a good way to clean up inactive players.
Every time a player moves (it's a semi-slow moving game. Think 1-5 frames per second), I update a hash with the new location and remove the old location key.
What would be the best way to keep track of active players? I've thought of the following
Set some key on the user to expire. Update every heartbeat or move. Problem is the locations are stored in a hash, so if the user key expires the player will still be in the same spot.
Same, but use pub/sub to listen for the expiration and finish cleaning up (seems overly complicated)
Store heartbeats in a sorted set, have a process run every X seconds to look for old players. Update score every heartbeat.
Completely revamp the way I store locations so I can use expire.. somehow?
Any other ideas?
Perhaps use separate Redis data structures (though in the same database) to track user activity and user location.
For instance, track users currently online separately using redis sets:
[my code snippet is in python using the redis-python bindings, and adapted from example app in Flask (python micro-framework); example app and the framework both by Armin Ronacher.]
from redis import Redis as redis
from time import time
r1 = redis(db=1)
# When the function below is called, it creates a key based on the current Unix time in minutes
# and then adds a user to the set stored at that key. I would imagine you would want to
# set the expiry at, say, 10 minutes, so at any given time you have 10 live keys
# (one per minute).
def record_online(player_id):
    now = int(time())
    ttl = 600  # 10 minutes TTL, in seconds
    k1 = "playersOnline:{0}".format(now // 60)
    r1.sadd(k1, player_id)
    r1.expire(k1, ttl)  # EXPIRE takes a relative TTL, not an absolute timestamp
So to get all active users, just union all of the live keys (in this example, that's 10 keys, a purely arbitrary number), like so:
def active_users(listOfKeys):
    return r1.sunion(listOfKeys)
This solves your "clean-up" issues because of the TTL: inactive users do not appear in your live keys because the keys constantly recycle. In other words, inactive users are only keyed to old timestamps, which don't persist in this example (but could be written to a permanent store before expiry). In any event, this clears inactive users from your active Redis db.
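The list of live keys passed to active_users is not shown above; one way to build it, assuming the same playersOnline:<minute> naming as record_online, might look like this. The helper name `live_keys` and its parameters are illustrative.

```python
from time import time


def live_keys(window_minutes=10, now=None):
    """Key names for the last `window_minutes` one-minute buckets, newest first."""
    now = int(time()) if now is None else int(now)
    current_minute = now // 60
    return [
        "playersOnline:{0}".format(current_minute - i)
        for i in range(window_minutes)
    ]
```

Calling active_users(live_keys()) would then return every player seen in the last 10 minutes; keys that have already expired simply contribute nothing to the union.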