Rails processing background jobs in real-time - ruby-on-rails-3

I use hirefire-gem with Delayed-Job 3 on heroku cedar-stack and it is working pretty good in terms of hiring/firing but performance of the job execution is terrible. firing up the background job and seeing the results in the UI takes about 5-8 seconds locally and about 25-30 seconds (!) on heroku.
Processing time of the jobs is about the same locally/deployed but hiring workers (scaling, up, starting, ...) seems to take a lot of time(?).
is that a common issue? is there a solution (rake tasks, etc.)?
Thanks a lot.
Best, Phil

It's down to the fact that your worker isn't running all the time but spinning up for each individual job. The lag is the code start-up time.
If you have a full time dyno the jobs should process almost instantaneously.

Related

Celery - automatic retrying of long running tasks running on crashed worker

I'm using Celery with Django over Redis.
Some of my tasks are quite long, taking about 1 hour. I'm aware that this is suboptimal, and preferably I should use shorter tasks, but this is what I got...
Sometimes the task/worker crash. This can happen for various unimportant reasons. Maybe this worker crashed, network problem, spot-instance when preempted, killed by OOM, or any other unexpected reason that I can't "catch" and handle.
I want to make sure the task will be tried again as fast as possible.
I can use ack_late, but the problem is that this task has a very long timeout (about 90 minutes), which means that if the task started and the worker crashed after 2 minutes, I will now wait for another 88 minutes until the task will get back to the queue and will start executing again on another worker.
I'm wondering if there exists another solution, that will see the worker "disappeared" and will put the task back in the queue?
You could give task_reject_on_worker_lost a try... It is a tricky one, but have a look...

aws ECS- Farget task submition pending time

we have an ECS fargate cluster,that we have just created, and when testing, we noticed, that the submission of a new task takes about 2-3 minutes (PENDING to RUNNING).
since we run there a new task every minute, it's not good enough for us.
is there any way to optimize the PENDING to RUNNING time?
This is largely dependent on the size of your container. For example I use go from scratch containers heavily so they are only about 15MB, and I get launch times from nothing -> running in roughly 15-20 seconds.
The biggest thing you can do right now to increase launch times is to reduce the size of your container.

How to prevent ironworker from enqueuing tasks of workers that are still running?

I have this worker whose runtime greatly varies from 10 seconds to up to an hour. I want to run this worker every five minutes. This is fine as long as the job finishes within five minutes. However, If the job takes longer Iron.io keeps enqueuing the same task over and over and a bunch of tasks of the same type accumulate while the worker is running.
Furthermore, it is crucial that the task may not run concurrently, so max concurrency for this worker is set to one.
So my question is: Is there a way to prevent Iron.io from enqueuing tasks of workers that are still running?
Answering my own question.
According to Iron.io support it is not possible to prevent IronWorker from enqueuing tasks of workers that are still running. For cases like mine it is better to have master workers that do the scheduling, i.e. creating/enqueuing tasks from script via one of the client libraries.
The best option would be to enqueue new task from the worker's code. For example, your task is running for 10 sec - 1 hour and enqueues itself at the end (last line of code). This will prevent the tasks from accumulating while the worker is running.

SQL Server Log Messages / Memory Clog

I have a dedicated server that's been running for years, with no recent code or configuration changes, but suddenly about a week ago, the MS SQL Server DB has started becoming unresponsive, and shortly thereafter, the entire site goes down due to memory issues on the server. It is sporadic, which leads me to believe it could be a malicious DDOS-like attack, but I am not sure how to confirm what's going on.
After a reboot, it can stay up for a few days, or only a few hours before I start seeing rampant occurrances of these Info messages in the Windows logs, shortly before it seizing up and failing. Research has not yielded any actionable info as of yet, please help, and thank you.
Process 52:0:2 (0xaa0) Worker 0x07E340E8 appears to be non-yielding on Scheduler 0. Thread creation time: 13053491255443. Approx Thread CPU Used: kernel 280 ms, user 35895 ms. Process Utilization 0%%. System Idle 93%%. Interval: 6505497 ms.
New queries assigned to process on Node 0 have not been picked up by a worker thread in the last 2940 seconds. Blocking or long-running queries can contribute to this condition, and may degrade client response time. Use the "max worker threads" configuration option to increase number of allowable threads, or optimize current running queries. SQL Process Utilization: 0%%. System Idle: 91%%.
Here's a blog about the issue: danieladeniji.wordpress that should help you get started.
Seems unlikely that it would be a DDOS.

SQL Server running slow due to thread count increase

On our production server, for some reason at specific amount of time, thread count goes over and over to the certain point that though CPU Utilization is normal(30-50%), but the query starting to run slow we so lot more blocking statements.
I am not sure where to look at it, basically when our site runs normally, the thread count is around 150 threads, but during the specific time in a day(during 1:30 to 2:30) it come up to 270 threads.there are no extra sql transaction goes on, everything as normal as it was before but thread count grows and sql start behaving very very slow.
After restarting the SQL service immediately thread count comes to normal, and our site function fine for another 24 hours.
We are using SQL Server 2005, it is 24 core machine.
any idea?
Blocking statements steal workers (sys.dm_os_workers) so the server will spawn more workers to handle the incoming tasks. At 24 cores you'll have some 700 max worker threads out-of-the-box by default. So seeing 270 'threads' is not an issue, is well within the normal functioning parameters. You real problem must be the blocking, and you have to investigate it accordingly: who is blocking who and why. My bet is that you have a job running between 1:30 and 2:30 that is locking large portions of the database (a delete job perhaps?) and your queries block on locked rows. You'll have to investigate, find the root cause, and act accordingly. Reboot is not a solution, nor is blaming unrelated components (thread count). Use Activity Monitor, use Who Is Active, follow the methodical approach of Waits and Queues methodology. There are plenty of ways to identify the real problem. SQL Server will never appear slow due to thread count. It simply doesn't work like that.
You can control the degree of parallalism using the MAXDOP query hint. For more details please check this article:
http://blog.sqlauthority.com/2010/03/15/sql-server-maxdop-settings-to-limit-query-to-run-on-specific-cpu/
Thanks for your valuable feedback, yes it is true it is nothing which sql is behaving weiredly, it is something our Site which is based on Ektron CMS is responsible for, one of the Functionality of Ektron CMS (which is PageBuilder), while doing operation on this piece of Content was holding of the table badly, we have around 10 million users to our site, and probably since this was blocking the tables SQL Server goes nuts and does not respond very well.
We have finally eliminated the issue.