Delay restart of processes in monit - monit

Can I modify montrc so that it will not restart process immediately. The process has to be down for a full cycle before a restart is triggered. This is so I can keep my existing capistrano deploys.

you can use something like :
check process x with pidfile /var/run/x.pid
every y cycles
or
start program = "/etc/init.d/x start" with timeout 90 seconds

I do not think it is currently possible to do that if you're monitoring only the PID file. If however, you are also monitoring the service by listening in on a port, you can add a if failed port 8080 X times within Y cycles then restart clause. Monit will then curl that port every cycle, and when the count of failures reaches X across Y cycles, it will attempt to restart the service.
Keep in mind that this only affects the port monitor. If monit notices the PID file is gone, it will immediately try to restart it.

Try
check process x with pidfile /var/run/x.pid
if does not exist for 2 cycles then start
This will wait a minimum of 1 full cycle before restarting the dead process.

Related

How to sync time on host wake-up within VirtualBox?

I am running an Ubuntu 12.04-based box inside of Vagrant using VirtualBox. So far, everything is fine - except for one thing:
Let's assume that the VM is running. Then, the host goes to standby-mode. After waking it up again, the VM is still running, but its internal clock continues where it stopped when the host went down. So this basically means: Put the host to sleep for 15 minutes, wake it up again, then the VM's internal clock is 15 minutes late.
How can I fix this (setting the time manually is not an option for obvious reasons ;-))? Is there a way to run a script inside of a Vagrant VM whenever the host system changes its state?
I've read in the documentation that by default the VirtualBox Guest Additions sync the time with the host every 10 seconds. Apparently this is not happening, but I can not find any place where it is disabled. So any ideas?
PS: The Guest Additions are installed and match the version of VirtualBox being used.
The documentation lacks some details here.
What VirtualBox does every 10 seconds is just slight adjustement (something like 0.005 seconds). Only when the time difference reaches a threshold (20 minutes by default) a "real" resync is done.
You can reduce the thresold (i.e. to 10 seconds) with the following command:
VBoxManage guestproperty set <vm-name> "/VirtualBox/GuestAdd/VBoxService/--timesync-set-threshold" 10000
Summarizing answers of #zilupe and #Slobodan Kovacevic, solution is to add following to Vagrantfile:
config.vm.provider 'virtualbox' do |vb|
vb.customize [ "guestproperty", "set", :id, "/VirtualBox/GuestAdd/VBoxService/--timesync-set-threshold", 1000 ]
end
This will synchronize clocks each time when desync becomes > 1s (1000ms)
I give an other solution to sync time between guest & host without installing Virtualbox guest addition:
install ntp on your guest, and de-comment these lines in /etc/ntp.conf:
disable auth
broadcastclient
Then, restart ntp with service ntp restart
Active broadcast on your host:
For Linux users, edit your /etc/ntp.conf file and configure broadcast (you must adapt IP):
broadcast 192.168.123.255
For Windows users, activate the "Windows Time" service. You can then read this page to configure it to broadcast time
Then, restart time service on host.
For me to get timesync working I had to do this:
vboxmanage setextradata «machine-name» "VBoxInternal/Devices/VMMDev/0/Config/GetHostTimeDisabled" 0
It turns the timesync on. It was, for some reason, off.
I found a solution:
install ntpdate
add "s" permission for ntpdate, this allows non-root users to run ntpdate as root: sudo chmod u+s /usr/sbin/ntpdate
add one line in ~/.bashrc: ntpdate -u ntp.ubuntu.com
After that, each time you login to the linux system, the time will be sync once.
you can install the VirtualBox Guest Additions in the VM to sync the time automatically by VB.

how do I kill idle redis clients

I want to timeout and kill idle redis clients. Is there a setting I can set to do this? I seem to remember setting a configuration somewhere but I can't seem to find it again.
I want this to be done automatically, rather than manually calling the client kill command.
Have a look into the Redis configuration file (the one you use to launch Redis).
# Close the connection after a client is idle for N seconds (0 to disable)
timeout 0
Just check the parameter is not commented out, and change the timeout parameter to put a non zero value in seconds. The instance should be restarted to take this parameter in account.
To change this parameter on a running Redis instance, you can use a client command:
> src/redis-cli config set timeout 10
OK
> src/redis-cli config get timeout
1) "timeout"
2) "10"

Running Redis in daemonized form and using Upstart to manage it doesn't work

I've written an Upstart script for Redis as follows:
description "Redis Server"
start on runlevel [2345]
stop on shutdown
expect daemon
exec sudo -u redis /usr/local/bin/redis-server /etc/redis/redis.conf
respawn
respawn limit 10 5
I then configure redis via it's redis.conf to:
daemonize yes
All the documentation and my own experimentation says Redis forks twice in daemonized form and "expect daemon" should work, but the Upstart script is always holding on to the PID of the former parent (PID - 1). Has anyone got this working?
The following upstart config seems to be working for me, with upstart 1.5 on ubuntu 12.04, with redis.conf daemonize set to yes:
description "redis server"
start on (local-filesystems and net-device-up IFACE=eth0)
stop on shutdown
setuid redis
setgid redis
expect fork
exec /opt/redis/redis-server /opt/redis/redis.conf
respawn
Other people have the same problem. See this gist.
When the daemonize option is activated, Redis does not check if the process is already a daemon (there is no call to getppid). It systematically forks, but only once. It is somewhat unusual, other daemonization mechanisms may require the initial check on getppid, and fork to be called twice (before and after the setsid call), but on Linux this is not strictly required.
See this faq for more information about daemonization.
Redis daemonize function is extremely simple:
void daemonize(void) {
int fd;
if (fork() != 0) exit(0); /* parent exits */
setsid(); /* create a new session */
/* Every output goes to /dev/null. If Redis is daemonized but
* the 'logfile' is set to 'stdout' in the configuration file
* it will not log at all. */
if ((fd = open("/dev/null", O_RDWR, 0)) != -1) {
dup2(fd, STDIN_FILENO);
dup2(fd, STDOUT_FILENO);
dup2(fd, STDERR_FILENO);
if (fd > STDERR_FILENO) close(fd);
}
}
Upstart documentation says:
expect daemon
Specifies that the job's main process is a daemon, and will fork twice after being run.
init(8) will follow this daemonisation, and will wait for this to occur before running
the job's post-start script or considering the job to be running.
Without this stanza init(8) is unable to supervise daemon processes and will
believe them to have stopped as soon as they daemonise on startup.
expect fork
Specifies that the job's main process will fork once after being run. init(8) will
follow this fork, and will wait for this to occur before running the job's post-start
script or considering the job to be running.
Without this stanza init(8) is unable to supervise forking processes and will believe
them to have stopped as soon as they fork on startup.
So I would either deactivate daemonization on Redis side, either try to use expect fork rather than expect daemon in upstart configuration.

Monit: delay the next monitor cycle after a service test action condition has been met

When my server gets into high load, a graceful restart of Apache seems to bring things back under control. So I set up monit, with this configuration:
set daemon 10
check system localhost
if loadavg (1min) > 5 then exec "/etc/init.d/apache2 graceful"
So every 10 seconds, I poll the server load, and when it gets above 5, I gracefully restart Apache. However, that temporarily raises the load, so we get into a death spiral. What I want is for it to notice after 10 seconds that the load is 5 or more, and gracefully restart Apache, then wait for 5 minutes or so before checking that particular metric again.
Is there a way to do this with monit?
It's not entirely within monit, but it's close enough
set daemon 10
check system localhost
if loadavg (1min) > 5 then unmonitor
if loadavg (1min) > 5 then exec "/etc/init.d/apache2 graceful"
if loadavg (1min) > 5 then exec "python /scripts/remonitor.py"
Then you have a python script, like so:
import time, os
time.sleep(5*60)
os.system("monit monitor system")
So this will:
1. unmonitor "system" when it reaches too much load, to prevent the death spiral
2. restart apache gracefully
3. start the script that will re-monitor the "system" in 5 minutes
What about
set daemon 10
set limits { programtimeout: 300 seconds }
check system localhost
if loadavg (1min) > 5 then exec "/bin/sh -c '/etc/init.d/apache2 graceful && sleep 5m'"
or even
set daemon 10
check system localhost
start program = "/bin/sh -c '/etc/init.d/apache2 graceful && sleep 5m'" with timeout 330 seconds
if loadavg (1min) > 5 then start
I.e., just add the sleep 5m shell command after the command to restart Apache and add the appropriate timeout to the monitrc.

Monit on CentOS causes httpd.pid not to be created

The solution was to replace this line:
check process apache with pidfile /var/run/httpd.pid
With this line:
check process httpd with pidfile /var/run/httpd/httpd.pid
And I also removed the 'group apache'.
Original post:
After installing Monit on CentOS, and setting an alert for the Apache (httpd) service, the service no longer creates the /var/run/httpd.pid file.
The httpd service IS running properly.
On top of it, as if that's not enough, Monit reports the status of the service as: Execution failed
Naturally, the only way to restart such a service is by killing it, since the 'restart' script doesn't see any running process.
These are the contents of the /etc/monit.d/monitrc file:
set daemon 10
set logfile syslog facility log_daemon
set mailserver localhost
set mail-format { from: me#server.com }
set alert bugs#server.com
set httpd port 2812 and
# SSL ENABLE
# PEMFILE /var/certs/monit.pem
allow user:password
check process apache with pidfile /var/run/httpd.pid
group apache
start program = "/etc/init.d/httpd start"
stop program = "/etc/init.d/httpd stop"
if cpu is greater than 180% for 1 cycles then alert
if totalmem > 1200 MB for 2 cycles then restart
if children > 250 then restart
check process sshd with pidfile /var/run/sshd.pid
start program "/etc/init.d/sshd start"
stop program "/etc/init.d/sshd stop"
if failed port 22 protocol ssh for 5 cycles then restart
if 5 restarts within 25 cycles then timeout
Output of "service httpd restart":
Stopping httpd: [FAILED]
Starting httpd: (98)Address already in use: make_sock: could not bind to address 0.0.0.0:80
no listening sockets available, shutting down
Unable to open logs
[FAILED]
Any help will be greatly appreciated.
Try to replace stop program with /usr/sbin/httpd -k stop. It work for me.
I had the same problem but /usr/sbin/httpd -k stop didn't seem to help since this still tries to look up the process id from the pid file.
I opted for stop program = "/usr/bin/killall httpd". I don't think this is very elegant (probably kills open requests) but it was the only way I could find to restart apache and have the pid file recreated by monit.
I think that monit is doing a restart as 'stop; start' and is not waiting for 'stop' to finish before starting a new process, and thus is deleting the pid file at an inappropriate time. At least, that's my conclusion after tinkering with all this.
I found a reference to someone who fixed this issue by making monit sleep after the 'stop' statement.
Personally, I found that replacing 'restart' with 'start' when the http server is down worked just fine.