Monit: restart service when it fails

I have a service: a server listening on port 7000.
I want to verify that the service is always up, and when it fails I want to start it again.
I wrote the following configuration in /etc/monit.d/myserver:
check process myserver with pidfile /var/run/myserver.pid
start program = "/etc/init.d/myserver start" with timeout 5 seconds
stop program = "/etc/init.d/myserver stop" with timeout 5 seconds
if failed host 127.0.0.1 port 7000
protocol HTTP request /testcheck then restart
if 5 restarts within 5 cycles then timeout
But I notice that even when the process is running, Monit restarts the service and writes the following to the log:
[EST Dec 18 03:05:13] error : HTTP: error receiving data -- Resource temporarily unavailable
[EST Dec 18 03:05:13] error : 'myserver ' failed protocol test [HTTP] at INET[127.0.0.1:7000] via TCP
[EST Dec 18 03:05:13] info : 'myserver ' trying to restart
[EST Dec 18 03:05:13] info : 'myserver ' stop: /etc/init.d/myserver
[EST Dec 18 03:05:14] info : 'myserver ' start: /etc/init.d/myserver
How can I set up the check so that the service is restarted only when it is actually down?

I had the same problem, and in the end I found out that I was not running the monit daemon properly. Take a look at this post: Rerun a process in Monit if process stops
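For what it's worth, the "error receiving data -- Resource temporarily unavailable" line in the log above usually suggests that the TCP connection was accepted but no HTTP response arrived before the check timed out, so Monit may be restarting a process that is running but slow to answer. Below is a minimal Python sketch for measuring how /testcheck behaves from the same host. The function name probe_http, the 5-second timeout in the usage note, and the "status below 400 counts as success" rule are assumptions made for illustration, not Monit internals.

```python
import http.client
import socket
import time


def probe_http(host, port, path, timeout):
    """Send one GET request and return (ok, detail) for a timed health check."""
    start = time.monotonic()
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        conn.request("GET", path)
        response = conn.getresponse()
        response.read()
        elapsed = time.monotonic() - start
        # Assumption of this sketch: any status below 400 counts as healthy.
        return response.status < 400, f"HTTP {response.status} in {elapsed:.2f}s"
    except socket.timeout:
        # The server accepted the connection but did not answer in time.
        return False, f"no response within {timeout}s"
    except (OSError, http.client.HTTPException) as exc:
        # Connection refused, reset, or a malformed response.
        return False, f"connection error: {exc}"
    finally:
        conn.close()
```

Calling probe_http("127.0.0.1", 7000, "/testcheck", timeout=5) a few times while the service is under load shows whether the endpoint is merely slow. If it is, a longer timeout on the connection test, or requiring several failed cycles before the restart action, is the direction I would look in.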

Unable to ssh VM after hardware configuration change

I followed the recommendation to reduce the size of my VM (number of CPUs from 4 to 2 and memory from 16 GB to 8 GB). After updating the configuration and restarting the VM, I was not able to access the VM via ssh.
The VM has an external IP.
The troubleshooting diagnostics using gcloud do not show any error or issue in the log. Everything is fine regarding the firewall configuration.
I tried to create a new VM under my project (the same project as the original VM), and I cannot access it with ssh either. If I create a new project and a new VM instance under that new project, then I can ssh into it. So the problem seems to be related to the project itself.
I tried to access the VM via the serial port and I am getting these errors:
Mar 8 20:31:11 myvm systemd[1]: Started Google OSConfig Agent.
Mar 8 20:32:11 myvm OSConfigAgent[1173]: 2022-03-08T20:32:11.5643Z OSConfigAgent Critical main.go:100: Error parsing metadata, agent cannot start: network error when requesting metadata, make sure your instance has an active network and can reach the metadata server: Get http://169.254.169.254/computeMetadata/v1/?recursive=true&alt=json&wait_for_change=true&last_etag=0&timeout_sec=60: dial tcp 169.254.169.254:80: connect: network is unreachable
Mar 8 20:32:11 myvm systemd[1]: google-osconfig-agent.service: Main process exited, code=exited, status=1/FAILURE
Mar 8 20:32:11 myvm systemd[1]: google-osconfig-agent.service: Failed with result 'exit-code'.
Mar 8 20:32:12 myvm systemd[1]: google-osconfig-agent.service: Service hold-off time over, scheduling restart.
Mar 8 20:32:12 myvm systemd[1]: google-osconfig-agent.service: Scheduled restart job, restart counter is at 4.
I am blocked... I am asking for your support. Any idea or suggestion?

Cloudstack KVM installation failed

I'm installing CloudStack on Ubuntu 20.04 by following this document.
I installed qemu-kvm and cloudstack-agent successfully, but I'm not able to start libvirtd.service. Its status shows the following errors:
● libvirtd.service - Virtualization daemon
Loaded: loaded (/lib/systemd/system/libvirtd.service; enabled; vendor preset: enabled)
Active: failed (Result: exit-code) since Tue 2021-03-16 18:00:09 IST; 1min 28s ago
TriggeredBy: ● libvirtd-admin.socket
● libvirtd.socket
● libvirtd-ro.socket
Docs: man:libvirtd(8)
https://libvirt.org
Process: 232313 ExecStart=/usr/sbin/libvirtd $libvirtd_opts (code=exited, status=6)
Main PID: 232313 (code=exited, status=6)
Mar 16 18:00:09 host systemd[1]: libvirtd.service: Scheduled restart job, restart counter is at 5.
Mar 16 18:00:09 host systemd[1]: Stopped Virtualization daemon.
Mar 16 18:00:09 host systemd[1]: libvirtd.service: Start request repeated too quickly.
Mar 16 18:00:09 host systemd[1]: libvirtd.service: Failed with result 'exit-code'.
Mar 16 18:00:09 host systemd[1]: Failed to start Virtualization daemon.
The output of journalctl -xe shows cloudstack-usage.service: Failed with result 'exit-code'
Can anyone suggest what the issue could be?
Are you trying this on a virtual machine, a bare-metal host, or a Raspberry Pi? The error means that some other service which libvirtd may depend on hasn't started. Try running "systemctl daemon-reload", then start libvirtd manually with "systemctl start libvirtd", and then try the rest. The cloudstack-usage service can be started once the MySQL server is running. If you have further questions, I encourage you to join the CloudStack users mailing list and ask there: http://cloudstack.apache.org/mailing-lists.html
I got the same error message when starting the MySQL server while following the official install guide. The problem for me was that the [mysqld] section header was missing in the my.cnf file before the config snippet. The documentation is misleading in that respect (it reads as if the section header is only relevant when editing the alternative MySQL config file mentioned later there).
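A quick way to spot this kind of mistake is to look for option lines that appear before any [section] header in the file. The sketch below is only an illustration in Python: the function name options_outside_section is made up, it works on the file contents passed in as a string, and it does not follow !include directives.

```python
def options_outside_section(text):
    """Return option lines that appear before the first [section] header."""
    stray = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", ";", "!")):
            continue  # blank line, comment, or !include directive
        if line.startswith("["):
            break  # first section header reached; later lines belong to a section
        stray.append(line)
    return stray
```

Reading the config file and passing its contents to this function returns an empty list when every option sits under a header such as [mysqld]. Any lines it does return are candidates for the missing-header problem described above.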

Radius server failed to start in centos 7

At the beginning I successfully configured the RADIUS server with mariadb and httpd. Then I changed the hostname of the server and rebooted. Now mariadb and httpd are running, but radiusd fails to start. Here is the output of journalctl -xe. Please help me.
Jan 10 12:34:08 cpe.twcny.res.rr.com systemd[1]: Unit radiusd.service entered failed state.
Jan 10 12:34:08 cpe.twcny.res.rr.com systemd[1]: radiusd.service failed.
Jan 10 12:34:08 cpe.twcny.res.rr.com polkitd[963]: Unregistered Authentication Agent for unix-process:2183:15540 (system bus name :1.43, object path /org/
Jan 10 12:40:01 cpe.twcny.res.rr.com systemd[1]: Created slice User Slice of root.

Monit: search for text at a url with https protocol

For some reason, monit configuration for monitoring the presence of text at a URL has been failing constantly in the last 48 hours. Here is the relevant config data:
if failed (url https://www.Example.com.com/where-to-buy/ and content == 'Online Retail Partners' and timeout 40 seconds)
then alert
if failed (url https://www.Example.com.com/products/high-absorption and content == 'You May Also Like' and timeout 20 seconds)
then alert
if failed (url https://www.Example.com.com/health-interests/bone-health and content == 'Refine' and timeout 20 seconds)
then alert
if failed (url https://www.Example.com.com/search?keywords=vitamin+d and content == 'Vegan D3' and timeout 20 seconds)
then alert
This all worked great for months/years.
We are getting inundated with monit alerts as follows:
Date: 21 Feb 12:11:32 -0600
Host: Example.com.
Service: httpd
Action: Alert
Description: connection succeeded to [www.Example.com]:443/health-interests/bone-health [TCP/IP TLS]
Date: 21 Feb 12:11:33 -0600
Host: Example.com
Service: httpd
Action: Alert
Description: failed protocol test [HTTP] at [www.Example.com]:443/products/high-absorption [TCP/IP TLS] -- Cannot resolve [www.Example.com]:443
Your faithful employee,
M/Monit
Date: 21 Feb 12:14:00 -0600
Host: Example.com
Service: httpd
Action: Alert
Description: connection succeeded to [www.Example.com]:443/products/high-absorption [TCP/IP TLS]
I'm not sure why we are failing the protocol tests.
Is there a different way to set port 443, https protocol while searching for text in a URL?
Cannot resolve [www.Example.com]
Monit is not able to resolve the IP address of the remote host. Please investigate name resolution at the host level (DNS, etc.).
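To see whether the resolver on the Monit host fails intermittently, you could run a small lookup in a loop from that same machine. This is a rough Python sketch, not what Monit does internally; the function name resolve and the default port 443 are assumptions for illustration.

```python
import socket


def resolve(host, port=443):
    """Return the sorted unique IP addresses the local resolver gives for host."""
    try:
        infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    except socket.gaierror:
        # Name resolution failed: the same class of failure as "Cannot resolve".
        return []
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return sorted({info[4][0] for info in infos})
```

If resolve("www.Example.com") returns an empty list now and then while returning addresses the rest of the time, the problem is in DNS on that host (resolver configuration, upstream name server, or a cache) rather than in the Monit check itself.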

Monit cannot open connection errors resulting in alerts from M/Monit that server is down

I'm using monit and M/Monit to monitor my application infrastructure. But every once in a while, M/Monit will show a "No report" error from a server and mark it down. A few seconds later, the issue clears at the next check in for the server to M/Monit.
The monit logs on some of the servers have these events in them:
Oct 14 12:19:11 ip-10-203-51-199 monit[30307]: M/Monit: cannot open a connection to http://example.com:8080/collector -- Connection timed out
Oct 14 12:20:16 ip-10-203-51-199 monit[30307]: M/Monit: cannot open a connection to http://example.com:8080/collector -- Connection timed out
Oct 14 12:22:21 ip-10-203-51-199 monit[30307]: M/Monit: cannot open a connection to http://example.com:8080/collector -- Connection timed out
What config do I need to tune to increase the threshold until M/Monit considers the server actually down?
Here is the config from the server that has the most trouble:
set httpd port 2812 and
allow xxx:xxx
set mailserver xxx.xxx.xxx port xxx username "xxx" password "xxx" using tlsv1 with timeout 15 seconds
set daemon 30
with start delay 120
set logfile syslog facility log_daemon
set alert xxx
set mail-format {
subject: $EVENT $SERVICE on $HOST
from: monit@$HOST
message: Monit $ACTION $SERVICE at $DATE on $HOST: $DESCRIPTION.
}
set mmonit http://xxx:xxx@example.com:8080/collector
There doesn't appear to be any problem with the config file.
The intermittent problem you are experiencing is because monit is failing to open a socket on the port and timing out. See the source code for reference (handle_mmonit()):
http://fossies.org/linux/privat/monit-5.6.tar.gz:a/monit-5.6/src/collector.c
Search for the string "M/Monit: cannot open a connection to".
The timeout value appears to be fixed at 5 seconds in the code. But 5 seconds is ample time to open a socket connection on that port.
How often does monit post events to mmonit?
Had the same problem
[MST Apr 5 11:24:11] error : 'apache' failed protocol test [APACHESTATUS] at [phoenix.example.com]:80 [TCP/IP] -- APACHE-STATUS: error -- no scoreboard found
[MST Apr 5 11:24:16] error : Cannot create socket to [10x.xx.xx.x4]:8080 -- Connection timed out
We had another firewall on top of iptables. Opening port 8080 on both the input and the output side fixed it!
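To tell a firewall drop apart from a service that is simply down, it can help to try a plain TCP connection to the collector port from the monitored host. A minimal Python sketch follows; the function name can_connect is made up, and the 5-second default only mirrors the timeout mentioned in the answer above.

```python
import socket


def can_connect(host, port, timeout=5.0):
    """Return (True, "") if a TCP connection opens within timeout, else (False, reason)."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True, ""
    except socket.timeout:
        # No reply at all: packets are likely being dropped on the way.
        return False, "timed out (packets may be dropped by a firewall)"
    except OSError as exc:
        # For example "Connection refused": the host answered but nothing listens.
        return False, str(exc)
```

Running can_connect("example.com", 8080) from the affected server gives a hint: a timeout points toward filtering somewhere on the path, while an immediate "Connection refused" points toward the collector not listening.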