Create many IronWorker tasks in one API call - iron.io

Right now, the iron_worker_ruby_ng gem allows one to create tasks one at a time:
client.tasks.create('MyWorker', {:client => 'Joe'})
Some scenarios require the creation of thousands of tasks. In this instance, it would be faster and more efficient if one could create many jobs at once:
client.tasks.create('MyWorker', [{:client => 'Joe'}, {:client => 'Bob'}, ..]) # batch of 100
--
I've forked the gem and made the changes but unfortunately the service endpoint returns 400. Is there any way to do this? If not, any chance this could be a feature?
def tasks_create_bulk(code_name, payloads, options = {})
  payloads_arg = payloads.map do |payload|
    {:code_name => code_name, :payload => payload}.merge(options)
  end
  parse_response(post("projects/#{@project_id}/tasks", {:tasks => payloads_arg}))
end
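The payload-building half of that method can be exercised on its own, without the HTTP call. A minimal sketch (the method name mirrors the forked gem code, but nothing below talks to the IronWorker API):

```ruby
# Standalone sketch of the bulk-payload construction step: one hash per
# task, all sharing the same code_name and any extra options.
def build_bulk_payloads(code_name, payloads, options = {})
  payloads.map do |payload|
    {:code_name => code_name, :payload => payload}.merge(options)
  end
end

tasks = build_bulk_payloads('MyWorker',
                            [{:client => 'Joe'}, {:client => 'Bob'}],
                            :priority => 1)
# tasks => [{:code_name=>"MyWorker", :payload=>{:client=>"Joe"}, :priority=>1},
#           {:code_name=>"MyWorker", :payload=>{:client=>"Bob"}, :priority=>1}]
```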
Thanks,
Dimitri

Related

Rails5, how to retry perform an ActiveJob a specific number of times (max attempts)

Right now my background processes run with DelayedJob, and I find its max_attempts functionality very convenient. How can I replicate this in ActiveJob?
Does this depend on the queue backend, or does ActiveJob have its own abstraction for configuring it?
The gem ActiveJob::Retry did the trick:
class MyJob < ActiveJob::Base
  include ActiveJob::Retry.new(
    :strategy => :variable,
    :delays => [10.seconds, 1.minute, 10.minutes, 15.minutes]
  )
end
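The idea behind the :variable strategy is simple: attempt N waits delays[N - 1], and once the list is exhausted the job gives up. A plain-Ruby sketch of that delay lookup (an illustration of the concept, not ActiveJob::Retry's actual internals):

```ruby
# Variable-backoff delay lookup: attempt 1 waits delays[0], attempt 2 waits
# delays[1], and so on; nil signals "no more retries".
DELAYS = [10, 60, 600, 900] # seconds, mirroring the example above

def retry_delay(attempt, delays = DELAYS)
  return nil if attempt > delays.length # out of attempts, give up
  delays[attempt - 1]
end

retry_delay(1) # => 10
retry_delay(4) # => 900
retry_delay(5) # => nil
```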

How to manually run a Sidekiq job

I have an application which uses Sidekiq. The web server process will sometimes put a job on Sidekiq, but I won't necessarily have the worker running. Is there a utility which I could call from the Rails console which would pull one job off the Redis queue and run the appropriate Sidekiq worker?
Here's a way that you'll likely need to modify to get the job you want (maybe like g8M suggests above), but it should do the trick:
> job = Sidekiq::Queue.new("your_queue").first
> job.klass.constantize.new.perform(*job.args)
If you want to delete the job:
> job.delete
Tested on sidekiq 5.2.3.
I wouldn't try to hack Sidekiq's API to run the jobs manually, since it could leave some unwanted internal state, but I believe the following code would work:
# Fetch the Queue
queue = Sidekiq::Queue.new # default queue
# OR
# queue = Sidekiq::Queue.new(:my_queue_name)
# Fetch the job
# job = queue.first
# OR
job = queue.find do |job|
  meta = job.args.first
  # => {"job_class" => "MyJob", "job_id"=>"1afe424a-f878-44f2-af1e-e299faee7e7f", "queue_name"=>"my_queue_name", "arguments"=>["Arg1", "Arg2", ...]}
  meta['job_class'] == 'MyJob' && meta['arguments'].first == 'Arg1'
end
# Removes from queue so it doesn't get processed twice
job.delete
meta = job.args.first
klass = meta['job_class'].constantize
# => MyJob
# Performs the job without using Sidekiq's API, does not count as performed job and so on.
klass.new.perform(*meta['arguments'])
# OR
# Perform the job using Sidekiq's API so it counts as performed job and so on.
# klass.new(*meta['arguments']).perform_now
Please let me know if this doesn't work or if someone knows a better way to do this.
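The queue-scanning step above can be mimicked with plain hashes, which is handy for understanding the shape of the data before touching Redis. When jobs are enqueued through ActiveJob, each Sidekiq job's first argument is a wrapper hash like the one shown in the comment; a mock of the find-by-metadata logic (no Sidekiq involved, names here are illustrative):

```ruby
# Plain-Ruby mock of scanning a Sidekiq queue for a specific ActiveJob:
# each "job" carries the ActiveJob wrapper hash as its first argument.
Job = Struct.new(:args)

queue = [
  Job.new([{'job_class' => 'OtherJob', 'arguments' => ['X']}]),
  Job.new([{'job_class' => 'MyJob',    'arguments' => ['Arg1', 'Arg2']}])
]

job = queue.find do |j|
  meta = j.args.first
  meta['job_class'] == 'MyJob' && meta['arguments'].first == 'Arg1'
end
# job.args.first['arguments'] => ["Arg1", "Arg2"]
```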

Access the last_error in failure method of Delayed Job Rails

I am using Delayed Job in a Rails application. I want to notify Airbrake whenever a delayed job fails. I checked on GitHub and learnt about the failure method.
I want to send the last_error attribute of failed delayed job to airbrake. Something like this:
class ParanoidNewsletterJob < NewsletterJob
  def perform
  end

  def failure
    Airbrake.notify(:message => self.last_error, :error_class => self.handler)
  end
end
But it gives me the following runtime error:
undefined method `last_error' for #<struct ParanoidNewsletterJob>
Please help me figure out how I can notify Airbrake the last_error of a failed delayed_job.
Many Thanks!!
You should be able to pass the job to the failure method, and then extract the last_error from the job. i.e.
def failure(job)
  Airbrake.notify(:message => job.last_error, :error_class => job.handler)
end
this should work fine
def failure(job)
  Airbrake.notify(:message => job.error, :error_class => job.error.class, :backtrace => job.error.backtrace)
end
There are two ways you can achieve what you want:
A job specific method which only applies to the type of job you want by implementing the failure method with the job as the parameter. The job will contain error and last_error. And this is what other answers are about.
A global option where a plugin can be developed to apply it to any job type created. This is desired if all jobs need to be monitored. The plugin can be registered and perform actions around various events in the lifecycle of a job. For example, below is a plugin to update the last_error if we want to process it before storing to database
One example below:
require 'delayed_job'

class ErrorDelayedJobPlugin < Delayed::Plugin
  def self.update_last_error(event, job)
    begin
      unless job.last_error.nil?
        job.last_error = job.last_error.gsub("\u0000", '') # Replace null byte
        job.last_error = job.last_error.encode('UTF-8', invalid: :replace, undef: :replace, replace: '')
      end
    rescue => e
    end
  end

  callbacks do |lifecycle|
    lifecycle.around(:failure) do |worker, job, *args, &block|
      update_last_error(:around_failure, job)
      block.call(worker, job)
    end
  end
end
Basically it will be called whenever any job fails. For details on how this callback mechanism works, you can refer to A plugin to update last_error in Delayed Job.
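The sanitisation step inside that plugin is runnable on its own and worth understanding: Postgres text columns reject null bytes, and invalid UTF-8 sequences break serialisation, so both are stripped before the error is stored. A self-contained version:

```ruby
# Clean an error string before persisting it: drop null bytes (rejected by
# Postgres text columns) and replace invalid/undefined UTF-8 sequences.
def sanitize_error(text)
  return nil if text.nil?
  text.gsub("\u0000", '')
      .encode('UTF-8', invalid: :replace, undef: :replace, replace: '')
end

sanitize_error("boom\u0000!") # => "boom!"
```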

How to process a long web request in Heroku?

According to Heroku's documentation (https://devcenter.heroku.com/articles/request-timeout), a timeout error fires after 30 seconds.
I'm uploading and parsing a CSV file to save to my database. One of those files is 1.7MB in size and has 37000 rows.
This process takes a bit long to process, certainly more than 30 seconds.
What can I do in these cases? What options do I have?
require 'csv'

class DatabaseImporterController < ApplicationController
  def index
  end

  def import
    # Receive the uploaded CSV file and import to the database.
    csv_file = params[:csv_file].tempfile
    i = 0
    CSV.foreach(csv_file) do |row|
      # Structure for CSV file: Year, Make, Model, Trim
      if i > 0
        make = Make.find_or_create_by_name(row[1])
        model = make.model.create(:year => row[0], :name => row[2], :trim => row[3])
      end
      i += 1
    end
    redirect_to :action => 'list'
  end

  def list
    @models = Model.all
  end
end
Instead of processing your CSV file in the controller have it push a notification to a queue with the location of the uploaded file. Then have a worker dyno handle that processing.
You'll pay a little bit more, especially if you're trying to stick with the free single dyno tier, but this is a scalable design (which is why I'd imagine there's a 30 second timeout on HTTP processing).
An alternative is to push the data directly into a table and execute a stored procedure asynchronously. This pushes the work off the HTTP thread and onto Postgres, and may bring your request under the 30 second limit, though with larger files you may breach the cap anyway.
Before you bother restructuring your entire application, you'll want to run a test to ensure that Heroku has not disabled libpq-asynch.
The big cost in your code above is Make.find_or_create_by_name, which, per your example input, issues 37,000 separate SELECTs and possibly an INSERT for each row in your CSV. If libpq-asynch is not an option you'll have to create a stored procedure that performs this work in batches of 100 or 1000 rows at a time, so your controller code isn't making so many round trips to the database. Postgres supports arrays in the classical ordinal-index style as well as arrays of row types, so this is actually much less painful than it sounds.
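The batching idea reduces round trips by a factor of the batch size regardless of how the batch is ultimately executed. A minimal sketch of the grouping step (the commented-out call is a hypothetical stored-procedure invocation, not a real API):

```ruby
# Group CSV rows into fixed-size batches so each batch costs one database
# round trip instead of one per row.
rows = Array.new(2500) { |i| ["2012", "Make#{i % 5}", "Model#{i}", "Base"] }

batches = 0
rows.each_slice(1000) do |batch|
  # Hypothetical: hand the whole batch to Postgres in one call, e.g.
  # connection.execute("SELECT import_models($1)", batch.to_json)
  batches += 1
end
batches # => 3 (1000 + 1000 + 500 rows)
```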

Is it possible to terminate an already running delayed job using Ruby Threading?

Let's say I have delayed_job running in the background. Tasks can be scheduled or run immediately (some are long tasks, some are not).
If a task is too long, a user should be able to cancel it. Is this possible in delayed_job? I checked the docs and can't seem to find a terminate method or anything similar. They only provide a hook to cancel delayed_job itself (thus cancelling all tasks; I need to cancel just a certain running task).
UPDATE
My boss (who's a great programmer, btw) suggested using Ruby threading for this feature of ours. Is this possible? Like creating a new thread per task and killing that thread while it's running?
something like:
t1 = Thread.new { task.run }
self.delay.t1.join (?) -- still reading up on threads so correct me if I'm wrong
then to stop it I'll just use t1.stop (?) again, don't know yet
Is this possible? Thanks!
It seems that my boss hit the spot, so here's what we did (please tell us if there's some possibility this is bad practice so I can bring it up):
First, we have a Job model that has def execute! (which runs what it's supposed to do).
Next, we have a delayed_job worker running in the background, listening for new jobs. When you create a job, you can schedule it to run immediately or on a recurring schedule (we use rufus for this one).
When a job is created, it checks whether it's supposed to run immediately. If it is, it adds itself to the delayed job queue. The execute function creates a Thread, so each job has its own thread.
A user in the UI can see whether a job is running (there's a started_at and no finished_at). If it IS running, there's a button to cancel it. Cancelling just sets the job's canceled_at to Time.now.
While the job is running, it also checks whether it has a canceled_at or whether Time.now > finished_at. If so, it kills its thread.
Voila! We've tested it for one job and it seems to work. Now the only problem is scaling...
If you see any problems with this please do so in the comments or give more suggestions if ever :) I hope this helps some one too!
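The cancel mechanism described above boils down to cooperative cancellation: the job's thread periodically checks a flag and exits early when it is set. A minimal self-contained sketch, with a local boolean standing in for the canceled_at column in the database:

```ruby
# Cooperative thread cancellation: the worker thread checks a shared flag
# between units of work; setting the flag plays the role of the UI's
# "Cancel" button writing canceled_at.
cancelled = false
finished  = false
mutex = Mutex.new

worker = Thread.new do
  100.times do
    break if mutex.synchronize { cancelled } # the job's self-check
    sleep 0.01                               # a slice of "real work"
  end
  finished = true
end

sleep 0.05
mutex.synchronize { cancelled = true }       # user presses "Cancel"
worker.join
# finished => true; the thread noticed the flag and stopped early
```

Checking a flag between work units is safer than Thread#kill, which can interrupt the job mid-operation and leave state half-written.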
Delayed::Job is an ActiveRecord::Base subclass, so you can query it just like you normally would, e.g. Delayed::Job.all(:conditions => {:last_error => nil}).
Delayed::Job objects have a payload field which contains a serialized version of the method or job that you're attempting to run. This object is accessed by their #payload_object method, which loads the object in question.
You can combine these two capabilities to make queriable job workers, for instance, if you have a User model, and the user has a paperclip'ed :avatar, then you can make a method to delete unprocessed jobs like so:
class User < ActiveRecord::Base
  has_attached_file :avatar, PaperclipOptions.new(:avatar)

  before_create :process_avatar_later

  def process_avatar_later
    filename = Rails.root.join('tmp/avatars_for_processing/', self.id)
    open(filename, 'w') { |file| file << self.avatar.to_file }
    Delayed::Job.enqueue(WorkAvatar.new(self.id, filename))
    self.avatar = nil
  end

  def cancel_future_avatar_processing
    WorkAvatar.future_jobs_for_user(self.id).each(&:destroy)
    # ummm... tell them to reupload their avatar, I guess?
  end

  class WorkAvatar < Struct.new(:user_id, :path)
    def user
      @user ||= User.find(self.user_id)
    end

    def self.all_jobs
      Delayed::Job.scoped(:conditions => 'payload LIKE "%WorkAvatar%"')
    end

    def self.future_jobs_for_user(user_id)
      all_jobs.scoped(:conditions => {:locked_at => nil}).select do |job|
        job.payload_object.user_id == user_id
      end
    end

    def perform
      user.avatar = File.open(path, 'rb')
      user.save
    end
  end
end
It's possible someone has made a plugin to make queryable job objects like this. Perhaps searching on GitHub would be fruitful.
Note also that you'd have to work with any process monitoring tools you might have to cancel any running job worker processes that are being executed if you want to cancel a job that has locked_at and locked_by set.
You can wrap the task into a Timeout statement.
require 'timeout'
class TaskWithTimeout < Struct.new(:parameter)
def perform
Timeout.timeout(10) do
# ...
end
rescue Timeout::Error => e
# the task took longer than 10 seconds
end
end
No, there's no way to do this. If you're concerned about a runaway job you should definitely wrap it in a timeout as Simone suggests. However, it sounds like you're in search of something more, but I'm unclear on your end goal.
There will never be a way for a user to have a "cancel" button, since this would involve finding a method to communicate directly with the worker process running the job. It would be possible to add a signal handler to the worker so that you could do something like kill -USR1 pid to have it abort the job it's currently working on and move on. Would this accomplish your goal?
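The signal-handler idea can be sketched in plain Ruby: trap USR1 and flip a flag that the worker loop checks between work units (here the process signals itself to simulate kill -USR1 pid; this is an illustration of the trap mechanism, not delayed_job's own signal handling):

```ruby
# Trap USR1 so an external `kill -USR1 <pid>` asks the worker to abort the
# job it's currently working on. The handler only flips a flag; the worker
# loop is responsible for noticing it at a safe point.
abort_job = false
Signal.trap('USR1') { abort_job = true }

Process.kill('USR1', Process.pid)  # simulate `kill -USR1 pid` from outside
sleep 0.1                          # give the handler a moment to run
# abort_job => true; the worker loop would now skip to the next job
```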