Do RabbitMQ, Beanstalk or Resque support scheduling a task at a certain date? - rabbitmq

I've looked at RabbitMQ, Beanstalk and Resque, which all seem geared towards asynchronous, non-delayed tasks (i.e., run all of these as quickly as possible).
Do any of them support scheduling a task on a certain timestamp?

Beanstalk has provisions for a "delay" parameter whereby you can delay the message on a delay queue for a specific period of time.
Resque has one or more scheduling add-ons to it that will provide for scheduling tasks.
With queues, the delay is often an integer specifying the number of seconds to delay (in which case you'll need to convert to the delta you need). More robust scheduling -- as part of a task queue for example -- will often take datetime values via a client library.
Note that you can also use IronMQ push queues (with a delay like beanstalk) or IronWorker (scheduling a task instead of queuing it). (Note that I work for Iron.io.)

Deplayed_job does this:
Delayed::Job.enqueue(MailingJob.new(params[:id]), 3, 3.days.from_now)
http://railscasts.com/episodes/171-delayed-job?view=asciicast

Related

Are delayed messages in Redis reliable?

I have an architecture solution that relies on the delayed messages.
In short:
There are many clients (mostly mobile devices running android or ios) that can process a given job.
I am creating a job delegation (in RDBMS) for a given client expecting it to be picked up within a certain period of time and the "chosen" client receives a push notification that there is something for it to process. IMO the details about the algorithm of choosing single client out of many is irrelevant to the problem so skipping this part.
When the client pulls a job delegation then the status of it is changed from pending to processing.
As mentioned clients are mobile devices and are often carried by people in move and thus can, due to many reasons, be unable to pull the job delegation from the server and process it.
That's why during the creation of the job delegation, there is also a delayed message dispatched in Redis which is supposed to check in now() + 40 seconds if the job was pulled or not (so if the status is pending or not).
If the delegation hasn't been pulled by the client (status = pending) server times it out and creates a new job delegation with status = pending for a different client. As so on as so for.
It works pretty well except the fact that I've noticed the "check if should timeout" jobs do not ALWAYS run at the time I would expect them to be run. The average is 7 seconds later and the max is 29 seconds later for the analyzed sample of few thousands of jobs. Redis is used as a queue but also as a key-value cache store and in general heavily utilized by the system. May it become that much impacted by the load? I've sort of "reproduced" the issue also on my local environment with a containerized setup with much less load so I doubt it's entirely due to the Redis being busy.
The delay in execution (vs expected) is quite a problem here because it may happen that, especially in case of trying few clients from the list, the total time since creation of the job till it's successfully processed can increase a lot.
So back to the original question. Is the delayed messaging functionality in Redis reliable?
Are there any good recommended docs about it?
Are there any more reliable solutions designed to solve that issue?
Expecting that messages set to be executed in a given timestamp is executed no later than 2-3 seconds from that timestamp.

When to use - Delayed Job vs RabbitMQ

Can someone give me the clarity of the advantages of using RabbitMQ(message queue) instead of Delayed Job(background processing) ?
Basically I want to know when to use background processing and messaging queue ?
My web application has 3 components one main server which will handle all user requests and 2 app servers where all the background jobs(like es reindex, es record update, sending emails, crons) should be run.
I saw articles which say Database as a queue(delayed job) is very bad as the consumers will be polling the database for new jobs and updating the statuses of jobs which will lock the tables. Then how does rabbit MQ or other messaging queues store to avoid this problem.
There are other alternatives for delayed job like sidekiq which will run over redis instead of mysql. It is better to use sidekiq instead of rabbitmq?
And are there any advantages of using sidekiq over delayed job?
You have 2 workers and 1 web server: I guess your web app dispatches some delayed jobs to your workers. So you need a way to store the data related to those background jobs.
For that, you can use both a database (like Redis, this is what sidekick is doing) or a message queue (like RabbitMQ). A message queue is a specialized system that is very efficient for this use case (allowing a much higher throughput). A database would let you have a better introspection (as you can request the jobs table to see what your current situation is), while the queuing system would be more efficient but also is more a black box and will require new skills.
If you do not have performance issues, the simpler the better, even a simple mysql database should be enough. If you want a more powerful system or need a lot of monitoring you can also consider using a specialized hosted service such as zenaton (I'm founder) that will do all the heavy lifting for you, including scheduling or more sophisticated orchestration of your background jobs.
Both perform the same task, i.e executing jobs in the background, but go about it differently.
With delayed job one uses some sort of a database for storage, queries for the jobs thereafter then processes them. It's simple to set up but the performance and scalability aren't great.
RabbitMQ or its alternatives Redis e.t.c are harder to set up but their performance, flexibility and scalability is great, we are talking in the upwards of 5000 jobs per second besides you have tend to use less code.
Another option is to use task orchestration system like Cadence Workflow. It supports both delayed execution and queueing, but provides higher level programming model and tons of features that neither queues or delayed execution frameworks.
Cadence offers a lot of advantages over using queues for task processing.
Built it exponential retries with unlimited expiration interval
Failure handling. For example it allows to execute a task that notifies another service if both updates couldn't succeed during a configured interval.
Support for long running heartbeating operations
Ability to implement complex task dependencies. For example to implement chaining of calls or compensation logic in case of unrecoverble failures (SAGA)
Gives complete visibility into current state of the update. For example when using queues all you know if there are some messages in a queue and you need additional DB to track the overall progress. With Cadence every event is recorded.
Ability to cancel an update in flight.
Built in distributed CRON
See the presentation that goes over Cadence programming model.

What's the recommended way to queue "delayed execution" messages via ServiceStack/Redis MQ?

I would like to queue up messages to be processed, only after a given duration of time elapses (i.e., a minimum date/time for execution is met), and/or at processing time of a message, defer its execution to a later point in time (say some prerequisite checks are not met).
For example, an event happens which defines a process that needs to run no sooner than 1 hour from the time of the initial event.
Is there any built in/suggested model to orchestrate this using https://github.com/ServiceStack/ServiceStack/wiki/Messaging-and-Redis?
I would probably build this in a two step approach.
Queue the Task into your Queueing system, which will process it into a persistence store: SQL Server, MongoDB, RavenDB.
Have a service polling your "Queued" tasks for when they should be reinserted back into the Queue.
Probably the safest way, since you don't want to lose these jobs presumably.
If you use RabbitMQ instead of Redis you could use Dead Letter Queues to get the same behavior. Dead letter queues essentially are catchers for expired messages.
So you push your messages into a queue with no intention of processing them, and they have a specific expiration in minutes. When they expire they pop over into the queue that you will process out of. Pretty slick way to queue things for later.
You could always use https://github.com/dominionenterprises/mongo-queue-csharp or https://github.com/dominionenterprises/mongo-queue-php or https://github.com/gaillard/mongo-queue-java which provides delayed messages and other uncommon features.

Scheduled Post implemantation in Redis

-User can prepare Post to be published for future.
So
Post.PostState is PostState.Scheduled.
Post.PublishDate is FutureDate
When futuredate comes PostState will be PostState.Published.
How can I implement this in Redis.
Sorry for duplication: I found that Delayed execution / scheduling with Redis?
Delayed execution / scheduling with Redis?
It seems an answer will be more related with code than db, so
c# reliable delayed/scheduled execution best practice
There is no scheduling as such, but you could set the values for both keys and put an expire on the scheduled date. Always lookup both keys and prefer the first. When the schedule expires, you will get back the actual as the first (and only) result.
You could also hide all that behind a lua script.
Sorry but using REDIS key expiration for scheduling won't work.
expiration can happen before, or very far in the future (eg. depending on available memory).
I think you might want to use another tool for delayed execution depending on you development platform. (eg. polling a REDIS queue, linux cron, timers, etc.)

Simple queue with Celery and RabbitMQ

I'm trying to implement a simple queue that performs one task at a time. Offloading tasks off the main thread using Celery and setting concurrency=1 in the Celery config works fine, but I might want to use more concurrent workers for other tasks.
Is there a way to tell Celery or RabbitMQ to not use multiple concurrent workers for a task (except by forcing concurrency=1)? I can't find anything in the documentation but maybe these tools are not designed for a linear queue?
Thanks!
I think what you need is a separate queue for each type of task. Create separate workers that consume from each queue, with concurrency set to 1.