apscheduler at 90 second intervals? - apscheduler

Is it possible to set an apscheduler cron job to run at 90 second intervals? (I have 40 machines that I'd like to schedule evenly over an hour without hard coding time info into the script). I've tried various kinds of this:
job = sched.add_cron_job(_test, minute='*/1', second='30')
job = sched.add_cron_job(_test, minute='*', second='90')

Try this instead:
job = sched.add_interval_job(_test, seconds=90)

Based on your question you want to start a cron job at a particular time and run it indefinitely with 90 seconds interval. You can achieve this by combining triggers
from apscheduler.schedulers.background import BackgroundScheduler
from apscheduler.triggers.combining import AndTrigger
from apscheduler.triggers.interval import IntervalTrigger
from apscheduler.triggers.cron import CronTrigger
def _test():
print("code comes here")
scheduler = BackgroundScheduler()
# Runs on 2019-12-30 at 5:30 (am) & repeats every 90 seconds interval
trigger = AndTrigger([IntervalTrigger(seconds=90),
CronTrigger(start_date='2019-12-30', hour=5, minute=30)])
scheduler.add_job(_test, trigger)
scheduler.start()

Interval code example:
sched = BlockingScheduler()
sched.add_job(ClassTest, 'interval', seconds=90)
sched.start()

Related

How to add 2 minutes delay between jobs in a queue?

I am using Hangfire in ASP.NET Core with a server that has 20 workers, which means 20 jobs can be enqueued at the same time.
What I need is to enqueue them one by one with 2 minutes delay between each one and another. Each job can take 1-45 minutes, but I don't have a problem running jobs concurrently, but I do have a problem starting 20 jobs at the same time. That's why changing the worker count to 1 is not practical for me (this will slow the process a lot).
The idea is that I just don't want 2 jobs to run at the same second since this may make some conflicts in my logic, but if the second job started 2 minutes after the first one, then I am good.
How can I achieve that?
You can use BackgroundJob.Schedule() to run your job run at a specific time:
BackgroundJob.Schedule(() => Console.WriteLine("Hello"), dateTimeToExecute);
Based on that set a date for the first job to execute, and then increase this date to 2 minutes for each new job.
Something like this:
var dateStartDate = DateTime.Now;
foreach (var j in listOfjobsToExecute)
{
BackgroundJob.Schedule(() => j.Run(), dateStartDate);
dateStartDate = dateStartDate.AddMinutes(2);
}
See more here:
https://docs.hangfire.io/en/latest/background-methods/calling-methods-with-delay.html?highlight=delay

APScheduler: Interval trigger with end_date

If I run the following:
def my_job_2(text):
for i in range(7):
print("\ni: ",text)
logging.basicConfig()
logging.getLogger('apscheduler').setLevel(logging.DEBUG)
scheduler = BackgroundScheduler()
#scheduler = BlockingScheduler()
scheduler.add_job(my_job_2,'interval',id="my_job_interval", seconds=10,
args=['Running a Job, every 10 seconds'], end_date=datetime(2021,2,22,10,15,0))
try:
scheduler.start()
except (KeyboardInterrupt, SystemExit):
print("Exception Caught")
According to the documentation, this should work. However, when I run it, the script prints:
INFO:apscheduler.scheduler:Added job "my_job_2" to job store "default"
INFO:apscheduler.scheduler:Scheduler started
DEBUG:apscheduler.scheduler:Looking for jobs to run
DEBUG:apscheduler.scheduler:No jobs; waiting until a job is added

Long running jobs redelivering after broker visibility timeout with celery and redis

I am using celery 4.3 + Redis + flower. I have a few long-running jobs with acks_late=True and task_reject_on_worker_lost=True. I am using celery grouping to group jobs run parallelly and append result and use in parent job.
In this scenario, my few jobs will run more than an hour, after every one hour the same child jobs are redelivering again to the worker.
The sample jobs as below.
#app.task(queue='q1', bind=True, acks_late=True, task_reject_on_worker_lost=True, max_retries=3)
def job_1(self):
do_something()
task_group = group(job_2.s(batch) for batch in range(0, len([1,2,3,4,5,6]), 3))
result_group = task_group.apply_async()
#app.task(queue='q1', bind=True, acks_late=True, task_reject_on_worker_lost=True, max_retries=3)
def job_2(self, batch):
do_something()
return result
The above job_2 will run more than hour and after one hour the same is redelivering again to the worker.
My celery setup and config as shown below:
c = Celery(app.import_name,
backend=app.config['CELERY_RESULT_BACKEND'],
broker=app.config['CELERY_BROKER_URL'])
config.py
CELERY_BROKER_URL = os.environ['CELERY_BROKER_URL']
CELERY_RESULT_BACKEND = os.environ['CELERY_RESULT_BACKEND']
CELERY_BROKER_TRANSPORT_OPTIONS = {'visibility_timeout': 36000}
I tried to increase visibility timeout to 10 hours after the redelivering issue like above in configuration. but it looks like not working.
Please help with this issue and let me know if any solution.

Run Job every 4 days but first run should happen now

I am trying to setup APScheduler to run every 4 days, but I need the job to start running now. I tried using interval trigger but I discovered it waits the specified period before running. Also I tried using cron the following way:
sched = BlockingScheduler()
sched.add_executor('processpool')
#sched.scheduled_job('cron', day='*/4')
def test():
print('running')
One final idea I got was using a start_date in the past:
#sched.scheduled_job('interval', seconds=10, start_date=datetime.datetime.now() - datetime.timedelta(hours=4))
but that still waits 10 seconds before running.
Try this instead:
#sched.scheduled_job('interval', days=4, next_run_time=datetime.datetime.now())
Similar to the above answer, only difference being it uses add_job method.
scheduler = BlockingScheduler()
scheduler.add_job(dump_data, trigger='interval', days=21,next_run_time=datetime.datetime.now())

Celery task schedule (Celery, Django and RabbitMQ)

I want to have a task that will execute every 5 minutes, but it will wait for last execution to finish and then start to count this 5 minutes. (This way I can also be sure that there is only one task running) The easiest way I found is to run django application manage.py shell and run this:
while True:
result = task.delay()
result.wait()
sleep(5)
but for each task that I want to execute this way I have to run it's own shell, is there an easy way to do it? May be some king custom ot django celery scheduler?
Wow it's amazing how no one understands this person's question. They are asking not about running tasks periodically, but how to ensure that Celery does not run two instances of the same task simultaneously. I don't think there's a way to do this with Celery directly, but what you can do is have one of the tasks acquire a lock right when it begins, and if it fails, to try again in a few seconds (using retry). The task would release the lock right before it returns; you can make the lock auto-expire after a few minutes if it ever crashes or times out.
For the lock you can probably just use your database or something like Redis.
You may be interested in this simpler method that requires no changes to a celery conf.
#celery.decorators.periodic_task(run_every=datetime.timedelta(minutes=5))
def my_task():
# Insert fun-stuff here
All you need is specify in celery conf witch task you want to run periodically and with which interval.
Example: Run the tasks.add task every 30 seconds
from datetime import timedelta
CELERYBEAT_SCHEDULE = {
"runs-every-30-seconds": {
"task": "tasks.add",
"schedule": timedelta(seconds=30),
"args": (16, 16)
},
}
Remember that you have to run celery in beat mode with the -B option
manage celeryd -B
You can also use the crontab style instead of time interval, checkout this:
http://ask.github.com/celery/userguide/periodic-tasks.html
If you are using django-celery remember that you can also use tha django db as scheduler for periodic tasks, in this way you can easily add trough the django-celery admin panel new periodic tasks.
For do that you need to set the celerybeat scheduler in settings.py in this way
CELERYBEAT_SCHEDULER = "djcelery.schedulers.DatabaseScheduler"
To expand on #MauroRocco's post, from http://docs.celeryproject.org/en/v2.2.4/userguide/periodic-tasks.html
Using a timedelta for the schedule means the task will be executed 30 seconds after celerybeat starts, and then every 30 seconds after the last run. A crontab like schedule also exists, see the section on Crontab schedules.
So this will indeed achieve the goal you want.
Because of celery.decorators deprecated, you can use periodic_task decorator like that:
from celery.task.base import periodic_task
from django.utils.timezone import timedelta
#periodic_task(run_every=timedelta(seconds=5))
def my_background_process():
# insert code
Add that task to a separate queue, and then use a separate worker for that queue with the concurrency option set to 1.