How to manually run a Sidekiq job - ruby-on-rails-3

I have an application which uses Sidekiq. The web server process will sometimes put a job on Sidekiq, but I won't necessarily have the worker running. Is there a utility which I could call from the Rails console which would pull one job off the Redis queue and run the appropriate Sidekiq worker?

Here's a way that you'll likely need to modify to get the job you want (maybe like g8M suggests above), but it should do the trick:
> job = Sidekiq::Queue.new("your_queue").first
> job.klass.constantize.new.perform(*job.args)
If you want to delete the job:
> job.delete
Tested on sidekiq 5.2.3.

I wouldn't try to hack sidekiq's API to run the jobs manually since it could leave some unwanted internal state but I believe the following code would work
# Fetch the Queue
queue = Sidekiq::Queue.new # default queue
# OR
# queue = Sidekiq::Queue.new(:my_queue_name)
# Fetch the job
# job = queue.first
# OR
job = queue.find do |job|
meta = job.args.first
# => {"job_class" => "MyJob", "job_id"=>"1afe424a-f878-44f2-af1e-e299faee7e7f", "queue_name"=>"my_queue_name", "arguments"=>["Arg1", "Arg2", ...]}
meta['job_class'] == 'MyJob' && meta['arguments'].first == 'Arg1'
end
# Removes from queue so it doesn't get processed twice
job.delete
meta = job.args.first
klass = meta['job_class'].constantize
# => MyJob
# Performs the job without using Sidekiq's API, does not count as performed job and so on.
klass.new.perform(*meta['arguments'])
# OR
# Perform the job using Sidekiq's API so it counts as performed job and so on.
# klass.new(*meta['arguments']).perform_now
Please let me know if this doesn't work or if someone knows a better way to do this.

Related

Scrapyd: How to cancel all jobs with one command?

I am running over 40 spiders which are until now scheduled via cron and issued via scrapy crawl Due to several reasons I am now switching to scrapyd, one of them is to be able to see which jobs are running in case I need to do maintenance and reboot - so I can cancel a job.
Is it possible to cancel multiple jobs at once? I noticed that multiple jobs might be running at once with many waiting in quene with status "pending". Stopping the crawl might therefore require multiple calls of the cancel.json endpoint.
How to stop (or better pause) all jobs?
The scrapyd API (as of v1.3.0) does not support pausing. It does have stopping one job per call, however, so you have to loop the jobs yourself
I took #kolas's script from this question and update it to work with python 3.
import json, os
PROJECT_NAME = "MY_PROJECT"
cd = os.system('curl http://localhost:6800/listjobs.json?project={} > kill_job.text'.format(PROJECT_NAME))
with open('kill_job.text', 'r') as f:
a = json.loads(f.readlines()[0])
pending_jobs = list(a.values())[2]
for job in pending_jobs:
job_id = job['id']
kill = 'curl http://localhost:6800/cancel.json -d project={} -d job={}'.format(PROJECT_NAME, job_id)
os.system(kill)

How to start and stop multiple weblogic managed servers at one go through WLST

I am writing a code to start , stop, undeploy and deploy my application on weblogc.
My components need to be deployed on few managed servers.
When I do new deployments manually I can start and stop the servers in parallel, by ticking multiple boxes and selecting start and stop from the dop down. See below.
but when trying from WLST, i could do that in one server at a time.
ex:
start(name='ServerX',type='Server',block='true')
start(name='ServerY',type='Server',block='true')
shutdown(name='ServerX',entityType='Server',ignoreSessions='true',timeOut=600,force='true',block='true')
shutdown(name='ServerY',entityType='Server',ignoreSessions='true',timeOut=600,force='true',block='true')
Is there a way I can start stop multiple servers in once command?
Instead of directly starting and stopping servers, you create tasks, then wait for them to complete.
e.g.
tasks = []
for server in cmo.getServerLifeCycleRuntimes():
# to shut down all servers
if (server.getName() != ‘AdminServer’ and server.getState() != ‘RUNNING’ ):
tasks.append(server.start())
#or to start them up:
#if (server.getName() != ‘AdminServer’ and server.getState() != ‘SHUTDOWN’ ):
# tasks.append(server.shutdown())
#wait for tasks to complete
while len(tasks) > 0:
for task in tasks:
if task.getStatus() != ‘TASK IN PROGRESS’ :
tasks.remove(task)
java.lang.Thread.sleep(5000)
I know this is an old post, today I was reading this book "Advanced WebLogic Server Automation" written by Martin Heinzl so in the page 282 I found this.
def startCluster(clustername):
try:
start(clustername, 'Cluster')
except Exception, e:
print 'Error while starting cluster', e
dumpStack()
I tried it and it started managed servers in parallel.
Just keep in mind the AdminServer must be started first and your script must connect to the AdminServer before trying it.
Perhaps this would not be useful for you as the servers should be in a cluster, but I wanted to share this :)

Can I get the current queue in perform method of worker with sidekiq/redis?

I want to be able to delete all job in queue, but I don't know what queue is it. I'm in perform method of my worker and I need to get the "current queue", the queue where the current job is come from.
for this time I use :
require 'sidekiq/api'
queue = Sidekiq::Queue.new
queue.each do |job|
job.delete
end
because I just use "default queue", It's work.
But now I will use many queues and I can't specify only one queue for this worker because I need use a lots for a server load balancing.
So how I can get the queue where we are in perform method?
thx.
You can't by design, that's orthogonal context to the job. If your job needs to know a queue name, pass it explicitly as an argument.
This is much faster:
Sidekiq::Queue.new.clear
These docs show that you can access all running job information wwhich includes the jid (job ID) and queue name for each job
inside the perform method you have access to the jid with the jid accessor. From that you can find the current job and get the queue name
workers = Sidekiq::Workers.new
this_worker = workers.find { |_, _, work|
work['payload']['jid'] == jid
}
queue = this_worker[2]['queue']
however, the content of Sidekiq::Workers can be up to 5 seconds out of date, so you should only try this after your worker has been running at least 5 seconds, which may not be ideal

Dynamically change the periodic interval of celery task at runtime

I have a periodic celery task running once per minute, like so:
#tasks.py
#periodic_task(run_every=(crontab(hour="*", minute="*", day_of_week="*")))
def scraping_task():
result = pollAPI()
Where the function pollAPI(), as you might have guessed from the name, polls an API. The catch is that the API has a rate limit that is undisclosed, and sometimes gives an error response, if that limit is hit. I'd like to be able to take that response, and if the limit is hit, decrease the periodic task interval dynamically (or even put the task on pause for a while). Is this possible?
I read in the docs about overwriting the is_due method of schedules, but I am lost on exactly what to do to give the behaviour I'm looking for here. Could anyone help?
You could try using celery.conf.update to update your CELERYBEAT_SCHEDULE.
You can add a model in the database that will store the information if the rate limit is reached. Before doing an API poll, you can check the information in the database. If there is no limit, then just send an API request.
The other approach is to use PeriodicTask from django-celery-beat. You can update the interval dynamically. I created an example project and wrote an article showing how to use dynamic periodic tasks in Celery and Django.
The example code that updates the task when the limit reached:
def scraping_task(special_object_id, larger_interval=1000):
try:
result = pollAPI()
except Exception as e:
# limit reached
special_object = ModelWithTask.objects.get(pk=special_object_id)
task = PeriodicTask.objects.get(pk=special_object.task.id)
new_schedule, created = IntervalSchedule.objects.get_or_create(
every=larger_inerval,
period=IntervalSchedule.SECONDS,
)
task.interval = new_schedule
task.save()
You can pass the parameters to the scraping_task when creating a PeriodicTask object. You will need to have an additional model in the database to have access to the task:
from django.db import models
from django_celery_beat.models import PeriodicTask
class ModelWithTask(models.Model):
task = models.OneToOneField(
PeriodicTask, null=True, blank=True, on_delete=models.SET_NULL
)
# create periodic task
special_object = ModelWithTask.objects.create_or_get()
schedule, created = IntervalSchedule.objects.get_or_create(
every=10,
period=IntervalSchedule.SECONDS,
)
task = PeriodicTask.objects.create(
interval=schedule,
name="Task 1",
task="scraping_task",
kwargs=json.dumps(
{
"special_obejct_id": special_object.id,
}
),
)
special_object.task = task
special_object.save()

Does Delayed::Worker.new.work_off work jobs with a future run_at time?

I have a controller action that creates 2 background jobs to be run at a future date.
I am trying to test that the background jobs get run
# spec/controllers/job_controller_spec.rb
setup
post :create, {:job => valid_attributes}
Delayed::Job.count.should == 2
Delayed::Worker.logger = Rails.logger
#Delayed::Worker.new.work_off.should == [2,0]
Delayed::Worker.new.work_off
Delayed::Job.count.should == 0 # this is where it fails
This is the error:
1) JobsController POST create with valid params queues up delayed job and fires
Failure/Error: Delayed::Job.count.should == 0
expected: 0
got: 2 (using ==)
For some reason it seems like it is not firing.
You can try to use
Delayed::Worker.new(quiet: false).work_off
to debug the result of your background jobs, this could help you to find out if the fact that they're supposed to run in the future is messing with the assert itself.
Don't forget to take off the "quiet:false" when you're done, otherwise your tests will always output the results of the background jobs.