I have a Rails 3.2.20 app into which I'm introducing Resque so that mail and SMS alerts run as background jobs. I have everything set up properly in development, and now I'm preparing to merge my branch and push to staging, then to production, for testing. But I have a few questions.
1.) How much memory will I need to run Resque in production (approximately)? I understand that starting a Resque worker loads the full Rails environment. I'm a bit tight on memory and don't want to have any issues. I'll be using a single Resque worker, as our email/SMS traffic is very light and we don't mind queues backing up for a few seconds to a minute. I know this question is vague, but I'd like to get a feel for the memory footprint Resque requires.
2.) I will have Redis running, but I need to figure out how to start a Resque worker on deployment as well as kill the existing worker. I've come up with the following, which I would add as a Capistrano after-deploy action.
task :resque_restart do
  run "kill $(ps aux | grep 'resque' | awk '{print $2}')"
  run "cd #{current_path}; bundle exec rake resque:work QUEUE=*"
end
I haven't actually tested this with Capistrano 2 (which I'm using), but I have tested the commands manually: the first command kills all the Resque rake tasks and the second starts up the worker with all queues enabled.
I'm not sure if this is the best way to go or not, so I'd really like to hear some feedback on this simple Capistrano task I wrote.
3.) What is the best way to monitor my Resque rake task? For instance, if it crashes or catches a signal to terminate, how can I have it restarted so the app doesn't break and the worker rake task is always running?
Thanks in advance for any advice or guidance you can provide.
It really depends on the size of your app. From my experience, generally a single Resque worker isn't much larger than your app's footprint. However, if your Resque worker will instantiate a lot of large objects, the size of the Resque instance could grow very quickly.
Check out the capistrano-resque gem. It provides all this functionality for you, plus more. :)
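For reference, wiring it into deploy.rb looks roughly like this (a sketch based on my memory of the gem's README for Capistrano 2; the host name and worker count are placeholders, so check the current docs):

require "capistrano-resque"

role :resque_worker, "app.example.com"    # host(s) that should run workers
set :workers, { "*" => 1 }                # a single worker listening on all queues

after "deploy:restart", "resque:restart"  # bounce workers on each deploy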
There are several options for this. A lot of people have followed something similar to this post about running Resque in Production and using the God gem. Personally, I've used a process similar to what is described in this post using Monit. Monit can be a bit of a pain to set up, so I'd strongly recommend checking out the God gem.
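To give a feel for the God approach, a minimal watch for a single worker might look something like the sketch below. The app path, worker name, and memory threshold are placeholders, and the DSL calls are from memory of God's docs, so double-check them before relying on this:

rails_env = ENV['RAILS_ENV'] || 'production'
app_root  = '/var/www/myapp/current'   # hypothetical deploy path

God.watch do |w|
  w.name     = 'resque-worker'
  w.dir      = app_root
  w.env      = { 'RAILS_ENV' => rails_env, 'QUEUE' => '*' }
  w.start    = 'bundle exec rake resque:work'
  w.interval = 30.seconds
  w.keepalive                     # restart the worker if the process dies

  # also restart if the worker starts leaking memory
  w.restart_if do |restart|
    restart.condition(:memory_usage) do |c|
      c.above = 300.megabytes
      c.times = 2
    end
  end
end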
Related
I am trying to deploy a pod to the cluster. The application I am deploying is not a web server. I have an issue with setting up the liveness and readiness probes. Usually, I would use something like /isActive and /buildInfo endpoints for that.
I've read this https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-startup-probes/#define-a-liveness-command.
I'm wondering whether I need to code a mechanism that creates a file and then somehow probe it from the deployment.yaml file?
Edit: this is what I used to keep the container running; I'm not sure if it is the best way to do it:
- touch /tmp/healthy; while true; do sleep 30; done;
It does not make sense to create files in your application just for the liveness probe. In the K8s documentation that is just an example to show you how the exec command probe works.
The idea behind these probes is twofold:
Keep traffic away from your Pods before they have fully started (the readiness probe).
Detect unresponsive applications due to lack of resources or deadlocks, where the application's main process is still running (the liveness probe).
Given that your deployment doesn't seem to expect external traffic, you don't need a probe for the first case. Regarding the second case, the question is how your application could lock up and how you would notice that externally, e.g. by monitoring a log file or similar.
Bear in mind that K8s will still monitor whether your application's main process is running, so restarts on application failure will still occur even without a liveness probe. Therefore, if you can be fairly sure that your application is not prone to becoming unresponsive while still running, you can do without a liveness probe.
I'm looking into an issue where we're seeing CPU usage maxed out on a RavenDB instance using NServiceBus outbox implementation.
The design currently has all outbox workers configuring the deduplication data cleanup settings in their start-up configuration, i.e. these settings:
endpointConfiguration.SetTimeToKeepDeduplicationData(TimeSpan.FromDays(7));
endpointConfiguration.SetFrequencyToRunDeduplicationDataCleanup(TimeSpan.FromMinutes(1));
If you've got multiple workers processing messages for the system, should the cleanup be implemented on each of those, or should it be treated like a cron job, where you run the cleanup process on just one of the workers, or on a dedicated system in the environment that is not a worker but more of a utility role?
I would imagine the latter; otherwise, if you scale out workers, all of them are going to try to run the cleanup process every minute. Or am I misunderstanding the way this configuration executes the cleanup?
Thanks
I am running Redis/Resque locally on my development machine. During a long-running task, I would like to be able to scale down my worker count to free up some bandwidth without having to cancel the command below and restart with a lower COUNT.
rake environment resque:workers QUEUE='*' COUNT=10
Is it possible to increase/decrease workers dynamically while a queue is processing?
Take a look at the following plugins:
https://github.com/kmullin/resque-sliders - Gives you control via the web UI
https://github.com/frausto/resque-director - Allows you to define "rules" for scaling up and down.
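If you just want a quick way to shed workers without adding a plugin, one option is to send QUIT to individual worker processes: a Resque worker finishes its current job and then exits gracefully on QUIT, so the remaining workers keep draining the queue. A console sketch, assuming the workers run on the machine you're on and that two is the number you want to drop:

require 'resque'

local_host = `hostname`.strip
Resque.workers
      .select { |w| w.to_s.start_with?(local_host) }   # only workers on this box
      .first(2)                                        # pick two of them
      .each { |w| Process.kill('QUIT', w.pid) }        # finish current job, then exit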
The Rails application I'm currently working on is hosted on Amazon EC2 servers. It uses Resque for running background jobs, and there are two such instances (a would-be production and a stage). I've also mounted the Resque monitoring web app at the /resque route (on stage only).
Here is my question:
Why are there workers from multiple hosts registered within my stage system, and how can I avoid this?
Some additional details:
I see workers from what appear to be three different machines, but I've only managed to identify two of them: the stage (obviously) and the production. The third has a different address format (it starts with domU) and I have no clue what it could be.
It looks like you're sharing a single Redis server across multiple resque server environments.
The best way to do this safely is to use separate Redis servers, or separate Redis databases or namespaces. The redis-namespace gem can be used with Resque to isolate each environment's Resque queues and worker data.
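As a rough sketch of the namespace approach (the initializer path and Redis host are placeholders), each environment points Resque at its own namespace in an initializer:

# config/initializers/resque.rb
require 'resque'
require 'redis-namespace'

redis = Redis.new(host: 'your-redis-host', port: 6379)
# e.g. "resque:staging" vs "resque:production", so the environments
# never see each other's queues or workers
Resque.redis = Redis::Namespace.new("resque:#{Rails.env}", redis: redis)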
I can't really help you with what the unknown one is, but I had something similar happen when moving hosts and having DNS names change. The only way I found to clear out the old ones was to stop all workers on the machine, fire up IRB, require 'resque', and look at Resque.workers. This lists all the workers Resque knows about, which in your case will include about 20 bogus ones. You can then do:
Resque.workers.each { |worker| worker.unregister_worker }
This should prune all the not-really-there workers and get you back to a proper display of the real workers.
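As an aside, if you want to see where the phantom workers claim to live before unregistering everything, a worker's string id encodes host, pid, and queues, so a quick console sketch like this groups them by host:

Resque.workers.group_by { |w| w.to_s.split(':').first }.each do |host, workers|
  puts "#{host}: #{workers.size} worker(s)"
end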
I want to start redis and resque-scheduler from a rake task, so I'm doing the following:
namespace :raketask do
  task :start do
    system("QUEUE=* rake resque:work &")
    system("rake redis:start")
    system("rake resque:scheduler")
  end
end
The problem is that redis starts in the foreground, so this never kicks off the scheduler. It won't start in the background (even using &). The scheduler must be started AFTER redis is up and running.
Similar to nirvdrum's answer: the Resque workers are going to fail/quit if Redis isn't already running and accepting connections.
Check out this gist for an example of how to get things started with Monit (Linux stuff).
Monit allows one service to be dependent on another, and makes sure they stay alive by monitoring a .pid file.
That strikes me as not a great idea. You should have your Redis server started via an init script or something similar. But if you really want to go this way, you probably need to modify your redis:start task to use nohup and background the process, so you can disconnect from the TTY and keep the process running.
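If you do go that route, a rough sketch of the ordering fix (assuming redis-server and redis-cli are on the PATH; the log paths are placeholders) is to background Redis with nohup, wait until it answers PING, and only then start the scheduler and worker:

namespace :raketask do
  task :start do
    # background redis and detach it from the TTY
    system("nohup redis-server > log/redis.log 2>&1 &")

    # wait until redis accepts connections before starting anything else
    sleep 0.5 until system("redis-cli ping > /dev/null 2>&1")

    # scheduler and worker also go to the background so this task can return
    system("nohup rake resque:scheduler > log/scheduler.log 2>&1 &")
    system("QUEUE='*' nohup rake resque:work > log/worker.log 2>&1 &")
  end
end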