Do Heroku Scheduler tasks cost money?

I've been reading through Heroku's documentation but just found it plain confusing. I have an app up that has both a web-based front-end (with web process) and a task that's set to run every day at midnight by Heroku Scheduler (shows up on heroku ps as run.1).
So, my heroku ps looks like this:
Process State Command
------- ---------- ------------------------------------
run.1 up for 21h python webpage/listings.py
web.1 up for 8m python ./manage.py runserver 0.0.0..
What I'm trying to figure out is: is this considered two dynos? Is the run task considered a background task?
Main question: Will this cost money?

Yes, Heroku Scheduler tasks accrue usage and will cost money if you go over the 750 free dyno-hours you are given per app each month. As long as you stay within that limit, you won't be charged.
Scheduler runs one-off dynos, which accrue usage just like regular dynos. They will appear with a “scheduler” dyno type in your Heroku invoice.

There are 750 free dyno-hours per app per month.
For billing, dynos are divided into four groups: worker (background dynos), web dynos, rake, and one-off processes (used when executing "heroku run", for example, which is what the scheduler does).
More at https://devcenter.heroku.com/articles/usage-and-billing

Related

Schedule background task with Sidekiq

I have a Rails 3 app deployed to Heroku. I have a Sidekiq worker at app/workers/task_worker.rb:
class TaskWorker
  include Sidekiq::Worker

  def perform
    ...
  end
end
How do I schedule TaskWorker.perform_async to run daily at 12:01 a.m.?
You might want to have a look at sidetiq too: https://github.com/tobiassvn/sidetiq. The gem supports complex timing expressions via the ice_cube gem.
I personally found it convenient to have a gem that integrates seamlessly with Sidekiq.
Something like this should work:
class TaskWorker
  include Sidekiq::Worker
  include Sidetiq::Schedulable

  recurrence do
    daily.hour_of_day(0).minute_of_hour(1)
  end

  def perform
    # do magic
  end
end
Be careful when using this gem, though, since there are some performance-related issues with certain time expressions: https://github.com/tobiassvn/sidetiq/wiki/Known-Issues. The expression I gave you should circumvent this issue, though.
I don't like the overhead Sidetiq adds to Sidekiq so I sought out a different solution.
Apparently Heroku has a little-known but free scheduler add-on that allows you to run rake tasks every 10 minutes, hourly, or daily. This is Heroku's answer to cron jobs, and it's nice that it's a free add-on. It should work for most non-critical scheduling.
Heroku states in their docs that the scheduler is a "Best Effort" service which may occasionally (but rarely) miss a scheduled event. If it is critical that this job is run, you'll probably want to use a custom clock process. Custom clock processes are more reliable but they count toward your dyno hours. (And as such, incur fees just like any other process.)
Currently it looks like clockwork is the recommended clock process on Heroku.
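For illustration, here is a minimal sketch of what such a clock process could look like with clockwork, wired to the TaskWorker from the question above. The file name, the Rails boot requires, and the job label are assumptions for the sake of the example, not anything prescribed by Heroku or this thread:
# clock.rb -- hypothetical clock process run as its own dyno
require 'clockwork'
require './config/boot'        # boot the Rails app so TaskWorker is loadable
require './config/environment'

module Clockwork
  # enqueue the Sidekiq worker once a day at 00:01 (i.e. 12:01 a.m.)
  every(1.day, 'task_worker.enqueue', at: '00:01') do
    TaskWorker.perform_async
  end
end
As noted above, this runs as its own long-lived process (for example a clock entry in the Procfile), so it accrues dyno hours like any other process.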
I'm stating the obvious, but what's wrong with having a Cron Job that invokes the Sidekiq job every night at that time?

What is the best practice for running a scheduler or delayed job on Heroku?

I've seen several solutions for doing this:
Redis / Resque
Delayed Job
Heroku Scheduler
Clockwork
Heroku scheduler won't work because it runs at random times and only once per 10 minutes at its most frequent.
Running on Cedar. Running multiple web dynos.
EDIT: Here's what I want to do:
Call an arbitrary method with params at an arbitrary point in the future. Something like Schedule.set(Notification.send_update_to_user(574), Time.now + 1.days)
I would choose Sidekiq, though there are several other options suitable for your example. Sidekiq lets you schedule jobs to run at arbitrary times in the future:
NotificationUpdateWorker.perform_at(Time.now + 1.day, 574)
The delayed extensions would let you write instead:
Notification.delay_for(1.day).send_update_to_user(574)
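For completeness, a rough sketch of the worker class those two calls assume; the class name and the user-id argument are just the running example from the question, not real application code:
class NotificationUpdateWorker
  include Sidekiq::Worker

  # the argument passed to perform_at/perform_async arrives here
  def perform(user_id)
    Notification.send_update_to_user(user_id)
  end
end
The delay_for form additionally relies on Sidekiq's delayed extensions being enabled.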
Try rufus-scheduler.
require 'rubygems'
require 'rufus/scheduler'

scheduler = Rufus::Scheduler.start_new

scheduler.every '1m' do
  Checkin.check_checkin()
end

scheduler.join # keep the process alive so the scheduler thread keeps running
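One caveat (my note, not part of the original answer): if this runs inside the web process and you have multiple web dynos, as mentioned in the question, each dyno runs its own copy of the schedule and the job fires once per dyno. Running it as a single dedicated clock process avoids the duplicate executions, at the cost of the extra dyno hours discussed above.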
After looking at different options, I chose Delayed Job, which is well documented on Heroku.
For jobs that need to run at a certain time each day or once an hour, Heroku scheduler works well, but sometimes it doesn't run.
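As a rough sketch of how the two can be combined, Heroku Scheduler can simply invoke a rake task that enqueues the delayed job. The task name and the Notification call below are illustrative placeholders borrowed from the example earlier on this page, not code from the original answer:
# lib/tasks/scheduler.rake -- hypothetical task for Heroku Scheduler to run
desc 'Enqueue the daily notification job (invoked by Heroku Scheduler)'
task send_daily_updates: :environment do
  # .delay comes from delayed_job; the method and argument are placeholders
  Notification.delay.send_update_to_user(574)
end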

Running Clockwork and Delayed Job on Heroku

I am migrating my existing Rails app to Heroku. I have memory- and time-intensive delayed jobs that run almost 20 hours a day, and I use clockwork to handle the time-specific jobs. The clockwork jobs are not heavy and run only a few times a day.
Is it possible to run both the delayed job process and the clockwork process in a single Heroku process using bluepill?
I do not want to pay for one more worker just for the sake of clockwork processes.
Try using Heroku Scheduler; it lets you schedule jobs at a daily, hourly, or 10-minute interval.
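If you do want to keep clockwork and Delayed Job on a single worker dyno rather than switching to Heroku Scheduler, one rough sketch (not from this thread, and untested with bluepill; the file name and commands are assumptions) is a small wrapper script that the Procfile's worker entry runs, supervising both as child processes:
# run_worker.rb -- hypothetical wrapper run by the worker dyno
pids = [
  spawn('bundle exec clockwork clock.rb'),  # light-weight clock process
  spawn('bundle exec rake jobs:work')       # delayed_job worker
]

# forward Heroku's SIGTERM to both children on dyno shutdown
trap('TERM') do
  pids.each { |pid| Process.kill('TERM', pid) rescue nil }
end

Process.waitall
This keeps everything on one paid worker dyno, but it is more fragile than letting Heroku supervise each process type separately.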

Rails processing background jobs in real-time

I use the hirefire gem with Delayed Job 3 on the Heroku Cedar stack, and it is working pretty well in terms of hiring/firing, but the performance of job execution is terrible: firing up a background job and seeing the results in the UI takes about 5-8 seconds locally and about 25-30 seconds (!) on Heroku.
Processing time of the jobs is about the same locally and deployed, but hiring workers (scaling up, starting, ...) seems to take a lot of time(?).
Is this a common issue? Is there a solution (rake tasks, etc.)?
Thanks a lot.
Best, Phil
It's down to the fact that your worker isn't running all the time but spinning up for each individual job. The lag is the code start-up time.
If you have a full time dyno the jobs should process almost instantaneously.

Gems/Services for autoscaling Heroku's dynos and workers

I want to know if there are any good solutions for autoscaling dynos AND workers on Heroku in a production environment (probably a different solution for each of those, as they are pretty unrelated). What are you/companies using, regarding this?
I found lots of options, but none of them seem really mature for a production environment.
There is Heroscale, which seems to introduce some latency as it does not run locally, and I have also heard of some downtime. There are modifications of delayed_job, which have not been updated for a long time and have issues with current bundlers. There are also some alternatives related to resque, which seem not to handle some HTTP exceptions very well (resulting in the app crashing), and others which seem to need an always-running worker to schedule other workers and may suffer from the same HTTP-exception problems.
Well. In the end. What is being used, nowadays, for autoscaling Heroku's dynos and workers on a production environment with Rails3?
Thanks in advance.
We ran into this a while ago and I spent quite a bit of time on it, to my great frustration. I'll try to stick to the salient points. There are several Heroku autoscaling solutions that seem decent at first glance.
The example that has already been given, heroku-autoscaler, is actually for autoscaling dynos and is pretty much the only solution out there that claims to do this (and it certainly doesn't do it well). Most others will only claim to autoscale workers for you. So, let's focus on that first. The autoscalers you'll look at for workers depend on what you're actually using for your background workers, e.g. delayed_job or resque. Those are the most common background-processing libs, so the autoscalers will try to hook into one of them. You can use things like:
workless
hirefire
heroku-resque-auto-scale
etc
Some of these work on the Cedar stack; some might need a bit of tweaking. The problem with all of them is that it's like trying to pull yourself out of the swamp by your own hair. Let's take hirefire as an example (it's probably the best one of the lot). It modifies delayed_job so that the workers themselves can look at the queue and spin up more workers if necessary; if there are no more jobs in the queue, the workers will all shut each other down. There are several problems:
If you want to put a job on the queue to be executed in the future rather than right now, you're out of luck (see the sketch just after this list). A worker starts up when a job enters the queue, but since the job is to be executed in the future, the worker will shut down and will not start up again unless another job enters the queue (that's the only thing that prompts workers to start up).
You lose the ability to retry failed jobs. delayed_job retries failures by default, but it waits a little while (and progressively longer) before retrying a job that has failed multiple times; the workers will shut down during that delay, and there is nothing to prompt them to start up again (in essence this is the same issue as in the first scenario).
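To make the first problem concrete, this is the kind of call that trips these autoscalers up (the method name is just the running example from this page, not real code):
# delayed_job happily accepts a run_at in the future, but an autoscaled
# worker that only wakes when a job is enqueued will have shut itself
# down long before run_at arrives, so nothing picks the job up on time.
Notification.delay(run_at: 1.day.from_now).send_update_to_user(574)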
The thing that solves this problem is to have one worker running continuously: it can monitor the queue periodically and execute jobs when necessary, or even spin up more workers. But if you do that, you're not saving any money (you have a worker running 24/7 and have to pay for it), and that's the whole premise behind autoscalers on Heroku. In essence, if you only have occasional background processing to do, or you have background jobs that are likely to fail but succeed on retry, or you have background jobs that don't need to be executed instantly, there is no autoscaling library that will work for you.
Here is one alternative. The guy who wrote hirefire later spun it off into a web app (HireFire app), the essence of which is to externally monitor your Heroku workers/dynos for you and spin up/shut down worker dynos as necessary. This was free in beta, but it now costs money: less than what you'd pay to run a worker 24/7, but still not insignificant if you only need a few background jobs once in a while. Either way, this is the only viable way to make sure your background-job infrastructure does what you want (well, that and rolling your own solution, which means having a machine like an EC2 instance where you can put some scripts that ping your Heroku app and spin up/shut down workers as needed - a non-trivial amount of effort).
Now HireFire app does offer to autoscale your dynos for you as well; it does this by hooking into the latency of your Heroku request queue. However, I found that this didn't work well. Perhaps if you're close to the Amazon datacenter where your Heroku app actually lives (we weren't), you might have a different experience, but for us it unnecessarily spun up a whole bunch of dynos and would never spin them down, no matter how much I tweaked the settings. You can put it down to the fact that it was a beta; it may have improved since then, but that's the experience I had.
Long story short, if you want to autoscale your workers, use Hirefire app, you'll be saving a lot less money than you thought, but it is still the cheapest option. If you want to autoscale dynos you're basically out of luck. This is just one of those limitations you live with for having the convenience of a platform like Heroku.
Heroku is offering a new add-on called AdeptScale which is now just out of Beta.
Here is the add-on page for AdeptScale
Here is the more detailed documentation for AdeptScale
Here is the form to sign up for Heroku's Beta Program
Hopefully this will be a robust solution for autoscaling Heroku dynos, as I'm still not happy with the current options.
Update (2/4/13): I signed up for Heroku's beta program to try out this add-on, and it's worked really well for me: occasionally scaling up with traffic, but mostly sitting at the minimum of 2 dynos I've set. It has greatly reduced my bill and eliminated the worry that I might be slow during peak usage times.
Update (3/6/13): Added link to Heroku's Sign up page for their beta program.
Update (4/14/13): Looks like auto-scaling is out of Beta. It's still working really well for me.
HireFire.io (The Service, not the Open Source Project) now allows you to use your New Relic metrics to auto-scale your web dynos. New Relic is a performance monitoring tool provided as an add-on through Heroku. They have a free tier and it's sufficient to use with HireFire.
You can auto-scale based on:
Response Time
This is the Response Time you find on the New Relic Dashboard. It's a combination of various factors including Request Queuing, Database Performance, App-Layer, Router, etc.
Apdex Score
This allows you to scale based on your New Relic Apdex Score, enabling you to scale based on user experience/satisfaction, which is determined by this score.
Aside from this, we have become language/framework agnostic. For worker dynos, all you have to do to get auto-scaling working is to set up a JSON end-point at a certain path in your app that returns a very simple JSON string containing the queue size (we provide convenient, but not required, macros for the Ruby language and some out-of-the-box support for Django apps, but as I said, it works for any language/framework by manually setting up a JSON end-point - it's very easy). For web dynos, you can use the HireFire Metric Source with basically any language/framework, and the above-mentioned New Relic Metric Source for languages/frameworks that are supported by New Relic (common languages such as Ruby, Python, Java, etc.).
Disclaimer: I built HireFire.
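As an illustration only (this is a sketch of the idea, not HireFire's documented wire format; the route, controller name, and payload shape are assumptions), such a queue-size end-point in a Rails app using Delayed Job might look like:
# config/routes.rb (hypothetical):
#   get '/autoscale/queue' => 'autoscale#queue'
class AutoscaleController < ApplicationController
  def queue
    # report the number of pending background jobs so the external
    # service can decide how many worker dynos to run
    render json: [{ name: 'worker', quantity: Delayed::Job.where(failed_at: nil).count }]
  end
end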
I'm trying to find a good way to autoscale dynos too.
https://github.com/ddollar/heroku-autoscale does this but has a disclaimer about its immaturity.
I've recently written a heroku auto scaling system called Heroku Vector:
https://github.com/wpeterson/heroku-vector
It allows you to scale multiple types of dynos based on different traffic sources. It currently supports New Relic throughput and the number of busy Sidekiq threads. As traffic goes up or down, it will scale the number of dynos up or down. It's a daemon process that can run in its own dyno on Heroku or elsewhere.