Redis queue with claim expire - redis

I have a queue interface I want to implement in redis. The trick is that each worker can claim an item for N seconds after that it's presumed the worker has crashed and the item needs to be claimable again. It's the worker's responsibility to remove the item when finished. How would you do this in redis? I am using phpredis but that's kind of irrelevant.

To realize a simple queue in redis that can be used to resubmit crashed jobs I'd try something like this:
1 list "up_for_grabs"
1 list "being_worked_on"
auto expiring locks
a worker trying to grab a job would do something like this:
timeout = 3600
#wrap this in a transaction so our cleanup wont kill the task
#Move the job away from the queue so nobody else tries to claim it
job = RPOPLPUSH(up_for_grabs, being_worked_on)
#Set a lock and expire it, the value tells us when that job will time out. This can be arbitrary though
SETEX('lock:' + job, Time.now + timeout, timeout)
#our application logic
do_work(job)
#Remove the finished item from the queue.
LREM being_worked_on -1 job
#Delete the item's lock. If it crashes here, the expire will take care of it
DEL('lock:' + job)
And every now and then, we could just grab our list and check that all jobs that are in there actually have a lock.
If we find any jobs that DON'T have a lock, this means it expired and our worker probably crashed.
In this case we would resubmit.
This would be the pseudo code for that:
loop do
items = LRANGE(being_worked_on, 0, -1)
items.each do |job|
if !(EXISTS("lock:" + job))
puts "We found a job that didn't have a lock, resubmitting"
LREM being_worked_on -1 job
LPUSH(up_for_grabs, job)
end
end
sleep 60
end

You can set up standard synchronized locking scheme in Redis with [SETNX][1]. Basically, you use SETNX to create a lock that everyone tries to acquire. To release the lock, you can DEL it and you can also set up an EXPIRE to make the lock releasable. There are other considerations here, but nothing out of the ordinary in setting up locking and critical sections in a distributed application.

Related

Can I get the current queue in perform method of worker with sidekiq/redis?

I want to be able to delete all job in queue, but I don't know what queue is it. I'm in perform method of my worker and I need to get the "current queue", the queue where the current job is come from.
for this time I use :
require 'sidekiq/api'
queue = Sidekiq::Queue.new
queue.each do |job|
job.delete
end
because I just use "default queue", It's work.
But now I will use many queues and I can't specify only one queue for this worker because I need use a lots for a server load balancing.
So how I can get the queue where we are in perform method?
thx.
You can't by design, that's orthogonal context to the job. If your job needs to know a queue name, pass it explicitly as an argument.
This is much faster:
Sidekiq::Queue.new.clear
These docs show that you can access all running job information wwhich includes the jid (job ID) and queue name for each job
inside the perform method you have access to the jid with the jid accessor. From that you can find the current job and get the queue name
workers = Sidekiq::Workers.new
this_worker = workers.find { |_, _, work|
work['payload']['jid'] == jid
}
queue = this_worker[2]['queue']
however, the content of Sidekiq::Workers can be up to 5 seconds out of date, so you should only try this after your worker has been running at least 5 seconds, which may not be ideal

MSMQ queue with multiple processes reading

I had a MSMQ application setup where data was being pushed into one queue. Initially I only had one process reading from it and processing it. Since the volume has increased I started multiple processes to read from it which is basically a new instance of my original process. I do not see any errors but the performance has really dropped. My understanding is that each process will read from a queue and receive a new message that has not yet been processed and continue with that. Is this correct or is it possible that multiple processes could end up processing the same message?
Dim q As MessageQueue
If MessageQueue.Exists(".\private$\MsgsIQueue") Then
q = New MessageQueue(".\private$\MsgsIQueue")
Else
'GS - If there is no queue then we're done here
Console.WriteLine("Queue has not been created!")
Return
End If
While True
Dim message As Message
counter += 1
Try
If q.Transactional = True Then
Thread.Sleep(2000)
End If
q.MessageReadPropertyFilter.ArrivedTime = True
message = q.Peek(TimeSpan.FromSeconds(20.0))
message.UseJournalQueue = True
message = q.Receive(New TimeSpan(0, 0, 60))
message.Formatter = New XmlMessageFormatter
(New [String]() {"System.String"})
ProcessMessage(message)
....
Ok, are you sure that it is the queue reading that is actually causing the performance degradation? I would suspect that there is some other bottleneck in your pipeline as MSMQ is really good at handling reading from multiple processes/threads.
If I take a look at your code I would suggest the following changes:
Why sleep for 2 secs if is a tx queue? Always use tx queues and move the call to Sleep to the catch block to have a wait interval if the queue is empty.
Move the setting of the filter outside of the loop.
Remove the call to Peek as it performs nothing of value.
Use journal queue is only of use when sending messages. So remove it.
Set the formatter on the queue instead and it will be used for all reads.
You should also wrap the call to Read and ProcessMessage within a TransactionScope where you also wrap ProcessMessage in another try/catch block. This way you can commit the read if everything went Ok in ProcessMessage or otherwise choose to abort the read or move the message to a dead letter queue.

Why does celery add thousands of queues to rabbitmq that seem to persist long after the tasks completel?

I am using celery with a rabbitmq backend. It is producing thousands of queues with 0 or 1 items in them in rabbitmq like this:
$ sudo rabbitmqctl list_queues
Listing queues ...
c2e9b4beefc7468ea7c9005009a57e1d 1
1162a89dd72840b19fbe9151c63a4eaa 0
07638a97896744a190f8131c3ba063de 0
b34f8d6d7402408c92c77ff93cdd7cf8 1
f388839917ff4afa9338ef81c28aad75 0
8b898d0c7c7e4be4aa8007b38ccc00ea 1
3fb4be51aaaa4ac097af535301084b01 1
This seems to be inefficient, but further I have observed that these queues persist long after processing is finished.
I have found the task that appears to be doing this:
#celery.task(ignore_result=True)
def write_pages(page_generator):
g = group(render_page.s(page) for page in page_generator)
res = g.apply_async()
for rendered_page in res:
print rendered_page # TODO: print to file
It seems that because these tasks are being called in a group, they are being thrown into the queue but never being released. However, I am clearly consuming the results (as I can view them being printed when I iterate through res. So, I do not understand why those tasks are persisting in the queue.
Additionally, I am wondering if the large number queues that are being created is some indication that I am doing something wrong.
Thanks for any help with this!
Celery with the AMQP backend will store task tombstones (results) in an AMQP queue named with the task ID that produced the result. These queues will persist even after the results are drained.
A couple recommendations:
Apply ignore_result=True to every task you can. Don't depend on results from other tasks.
Switch to a different backend (perhaps Redis -- it's more efficient anyway): http://docs.celeryproject.org/en/latest/userguide/tasks.html
Use CELERY_TASK_RESULT_EXPIRES (or on 4.1 CELERY_RESULT_EXPIRES) to have a periodic cleanup task remove old data from rabbitmq.
http://docs.celeryproject.org/en/master/userguide/configuration.html#std:setting-result_expires

In celery, how to ensure tasks are retried when worker crashes

First of all please don't consider this question as a duplicate of this question
I have a setup an environment which uses celery and redis as broker and result_backend. My question is how can I make sure that when the celery workers crash, all the scheduled tasks are re-tried, when the celery worker is back up.
I have seen advice on using CELERY_ACKS_LATE = True , so that the broker will re-drive the tasks until it get an ACK, but in my case its not working. Whenever I schedule a task its immediately goes to the worker which persists it until the scheduled time of execution. Let me give some example:
I am scheduling a task like this: res=test_task.apply_async(countdown=600) , but immediately in celery worker logs i can see something like : Got task from broker: test_task[a137c44e-b08e-4569-8677-f84070873fc0] eta:[2013-01-...] . Now when I kill the celery worker, these scheduled tasks are lost. My settings:
BROKER_URL = "redis://localhost:6379/0"
CELERY_ALWAYS_EAGER = False
CELERY_RESULT_BACKEND = "redis://localhost:6379/0"
CELERY_ACKS_LATE = True
Apparently this is how celery behaves.
When worker is abruptly killed (but dispatching process isn't), the message will be considered as 'failed' even though you have acks_late=True
Motivation (to my understanding) is that if consumer was killed by OS due to out-of-mem, there is no point in redelivering the same task.
You may see the exact issue here: https://github.com/celery/celery/issues/1628
I actually disagree with this behaviour. IMO it would make more sense not to acknowledge.
I've had the issue, where I was using some open-source C libraries that went totaly amok and crashed my worker ungraceful without throwing an exception. For any reason whatsoever, one can simply wrap the content of a task in a child process and check its status in the parent.
n = os.fork()
if n > 0: //inside the parent process
status = os.wait() //wait until child terminates
print("Signal number that killed the child process:", status[1])
if status[1] > 0: // if the signal was something other then graceful
// here one can do whatever they want, like restart or throw an Exception.
self.retry(exc=SomeException(), countdown=2 ** self.request.retries)
else: // here comes the actual task content with its respected return
return myResult // Make sure there are not returns in child and parent at the same time.

Scalable delayed task execution with Redis

I need to design a Redis-driven scalable task scheduling system.
Requirements:
Multiple worker processes.
Many tasks, but long periods of idleness are possible.
Reasonable timing precision.
Minimal resource waste when idle.
Should use synchronous Redis API.
Should work for Redis 2.4 (i.e. no features from upcoming 2.6).
Should not use other means of RPC than Redis.
Pseudo-API: schedule_task(timestamp, task_data). Timestamp is in integer seconds.
Basic idea:
Listen for upcoming tasks on list.
Put tasks to buckets per timestamp.
Sleep until the closest timestamp.
If a new task appears with timestamp less than closest one, wake up.
Process all upcoming tasks with timestamp ≤ now, in batches (assuming
that task execution is fast).
Make sure that concurrent worker wouldn't process same tasks. At the same time, make sure that no tasks are lost if we crash while processing them.
So far I can't figure out how to fit this in Redis primitives...
Any clues?
Note that there is a similar old question: Delayed execution / scheduling with Redis? In this new question I introduce more details (most importantly, many workers). So far I was not able to figure out how to apply old answers here — thus, a new question.
Here's another solution that builds on a couple of others [1]. It uses the redis WATCH command to remove the race condition without using lua in redis 2.6.
The basic scheme is:
Use a redis zset for scheduled tasks and redis queues for ready to run tasks.
Have a dispatcher poll the zset and move tasks that are ready to run into the redis queues. You may want more than 1 dispatcher for redundancy but you probably don't need or want many.
Have as many workers as you want which do blocking pops on the redis queues.
I haven't tested it :-)
The foo job creator would do:
def schedule_task(queue, data, delay_secs):
# This calculation for run_at isn't great- it won't deal well with daylight
# savings changes, leap seconds, and other time anomalies. Improvements
# welcome :-)
run_at = time.time() + delay_secs
# If you're using redis-py's Redis class and not StrictRedis, swap run_at &
# the dict.
redis.zadd(SCHEDULED_ZSET_KEY, run_at, {'queue': queue, 'data': data})
schedule_task('foo_queue', foo_data, 60)
The dispatcher(s) would look like:
while working:
redis.watch(SCHEDULED_ZSET_KEY)
min_score = 0
max_score = time.time()
results = redis.zrangebyscore(
SCHEDULED_ZSET_KEY, min_score, max_score, start=0, num=1, withscores=False)
if results is None or len(results) == 0:
redis.unwatch()
sleep(1)
else: # len(results) == 1
redis.multi()
redis.rpush(results[0]['queue'], results[0]['data'])
redis.zrem(SCHEDULED_ZSET_KEY, results[0])
redis.exec()
The foo worker would look like:
while working:
task_data = redis.blpop('foo_queue', POP_TIMEOUT)
if task_data:
foo(task_data)
[1] This solution is based on not_a_golfer's, one at http://www.saltycrane.com/blog/2011/11/unique-python-redis-based-queue-delay/, and the redis docs for transactions.
You didn't specify the language you're using. You have at least 3 alternatives of doing this without writing a single line of code in Python at least.
Celery has an optional redis broker.
http://celeryproject.org/
resque is an extremely popular redis task queue using redis.
https://github.com/defunkt/resque
RQ is a simple and small redis based queue that aims to "take the good stuff from celery and resque" and be much simpler to work with.
http://python-rq.org/
You can at least look at their design if you can't use them.
But to answer your question - what you want can be done with redis. I've actually written more or less that in the past.
EDIT:
As for modeling what you want on redis, this is what I would do:
queuing a task with a timestamp will be done directly by the client - you put the task in a sorted set with the timestamp as the score and the task as the value (see ZADD).
A central dispatcher wakes every N seconds, checks out the first timestamps on this set, and if there are tasks ready for execution, it pushes the task to a "to be executed NOW" list. This can be done with ZREVRANGEBYSCORE on the "waiting" sorted set, getting all items with timestamp<=now, so you get all the ready items at once. pushing is done by RPUSH.
workers use BLPOP on the "to be executed NOW" list, wake when there is something to work on, and do their thing. This is safe since redis is single threaded, and no 2 workers will ever take the same task.
once finished, the workers put the result back in a response queue, which is checked by the dispatcher or another thread. You can add a "pending" bucket to avoid failures or something like that.
so the code will look something like this (this is just pseudo code):
client:
ZADD "new_tasks" <TIMESTAMP> <TASK_INFO>
dispatcher:
while working:
tasks = ZREVRANGEBYSCORE "new_tasks" <NOW> 0 #this will only take tasks with timestamp lower/equal than now
for task in tasks:
#do the delete and queue as a transaction
MULTI
RPUSH "to_be_executed" task
ZREM "new_tasks" task
EXEC
sleep(1)
I didn't add the response queue handling, but it's more or less like the worker:
worker:
while working:
task = BLPOP "to_be_executed" <TIMEOUT>
if task:
response = work_on_task(task)
RPUSH "results" response
EDit: stateless atomic dispatcher :
while working:
MULTI
ZREVRANGE "new_tasks" 0 1
ZREMRANGEBYRANK "new_tasks" 0 1
task = EXEC
#this is the only risky place - you can solve it by using Lua internall in 2.6
SADD "tmp" task
if task.timestamp <= now:
MULTI
RPUSH "to_be_executed" task
SREM "tmp" task
EXEC
else:
MULTI
ZADD "new_tasks" task.timestamp task
SREM "tmp" task
EXEC
sleep(RESOLUTION)
If you're looking for ready solution on Java. Redisson is right for you. It allows to schedule and execute tasks (with cron-expression support) in distributed way on Redisson nodes using familiar ScheduledExecutorService api and based on Redis queue.
Here is an example. First define a task using java.lang.Runnable interface. Each task can access to Redis instance via injected RedissonClient object.
public class RunnableTask implements Runnable {
#RInject
private RedissonClient redissonClient;
#Override
public void run() throws Exception {
RMap<String, Integer> map = redissonClient.getMap("myMap");
Long result = 0;
for (Integer value : map.values()) {
result += value;
}
redissonClient.getTopic("myMapTopic").publish(result);
}
}
Now it's ready to sumbit it into ScheduledExecutorService:
RScheduledExecutorService executorService = redisson.getExecutorService("myExecutor");
ScheduledFuture<?> future = executorService.schedule(new CallableTask(), 10, 20, TimeUnit.MINUTES);
future.get();
// or cancel it
future.cancel(true);
Examples with cron expressions:
executorService.schedule(new RunnableTask(), CronSchedule.of("10 0/5 * * * ?"));
executorService.schedule(new RunnableTask(), CronSchedule.dailyAtHourAndMinute(10, 5));
executorService.schedule(new RunnableTask(), CronSchedule.weeklyOnDayAndHourAndMinute(12, 4, Calendar.MONDAY, Calendar.FRIDAY));
All tasks are executed on Redisson node.
A combined approach seems plausible:
No new task timestamp may be less than current time (clamp if less). Assuming reliable NTP synch.
All tasks go to bucket-lists at keys, suffixed with task timestamp.
Additionally, all task timestamps go to a dedicated zset (key and score — timestamp itself).
New tasks are accepted from clients via separate Redis list.
Loop: Fetch oldest N expired timestamps via zrangebyscore ... limit.
BLPOP with timeout on new tasks list and lists for fetched timestamps.
If got an old task, process it. If new — add to bucket and zset.
Check if processed buckets are empty. If so — delete list and entrt from zset. Probably do not check very recently expired buckets, to safeguard against time synchronization issues. End loop.
Critique? Comments? Alternatives?
Lua
I made something similar to what's been suggested here, but optimized the sleep duration to be more precise. This solution is good if you have few inserts into the delayed task queue. Here's how I did it with a Lua script:
local laterChannel = KEYS[1]
local nowChannel = KEYS[2]
local currentTime = tonumber(KEYS[3])
local first = redis.call("zrange", laterChannel, 0, 0, "WITHSCORES")
if (#first ~= 2)
then
return "2147483647"
end
local execTime = tonumber(first[2])
local event = first[1]
if (currentTime >= execTime)
then
redis.call("zrem", laterChannel, event)
redis.call("rpush", nowChannel, event)
return "0"
else
return tostring(execTime - currentTime)
end
It uses two "channels". laterChannel is a ZSET and nowChannel is a LIST. Whenever it's time to execute a task, the event is moved from the the ZSET to the LIST. The Lua script with respond with how many MS the dispatcher should sleep until the next poll. If the ZSET is empty, sleep forever. If it's time to execute something, do not sleep(i e poll again immediately). Otherwise, sleep until it's time to execute the next task.
So what if something is added while the dispatcher is sleeping?
This solution works in conjunction with key space events. You basically need to subscribe to the key of laterChannel and whenever there is an add event, you wake up all the dispatcher so they can poll again.
Then you have another dispatcher that uses the blocking left pop on nowChannel. This means:
You can have the dispatcher across multiple instances(i e it's scaling)
The polling is atomic so you won't have any race conditions or double events
The task is executed by any of the instances that are free
There are ways to optimize this even more. For example, instead of returning "0", you fetch the next item from the zset and return the correct amount of time to sleep directly.
Expiration
If you can not use Lua scripts, you can use key space events on expired documents.
Subscribe to the channel and receive the event when Redis evicts it. Then, grab a lock. The first instance to do so will move it to a list(the "execute now" channel). Then you don't have to worry about sleeps and polling. Redis will tell you when it's time to execute something.
execute_later(timestamp, eventId, event) {
SET eventId event EXP timestamp
SET "lock:" + eventId, ""
}
subscribeToEvictions(eventId) {
var deletedCount = DEL eventId
if (deletedCount == 1) {
// move to list
}
}
This however has it own downsides. For example, if you have many nodes, all of them will receive the event and try to get the lock. But I still think it's overall less requests any anything suggested here.