Can someone please tell me how to define a check_disk service with check_nrpe in Icinga 2?

I'm trying to check the disk status of an Ubuntu 16.04 client instance from an Icinga 2 master server, using the NRPE plugin. I ran into trouble when defining the service in the service.conf file. Can someone please tell me which files need to be changed when using NRPE? I'm new to both Icinga and NRPE.

I was able to find the solution to my problem, and I'm putting it here because it may help someone else. I'll use check_load as the example.
First of all, you need to create a .conf file (name: 192.168.30.40-host.conf) for the client server that you are going to monitor with Icinga 2. It should be placed in the /etc/icinga2/conf.d/ directory:
/etc/icinga2/conf.d/192.168.30.40-host.conf
object Host "host1" {
  import "generic-host"
  display_name = "host1"
  address = "192.168.30.40"
}
Next, you should create a service file for your client:
/etc/icinga2/conf.d/192.168.30.40-service.conf
object Service "LOAD AVERAGE" {
  import "generic-service"
  host_name = "host1"
  check_command = "nrpe"
  vars.nrpe_command = "check_load"
}
This is the important part of the problem. You should add this line to the nrpe.cfg file on the client (the machine running the NRPE daemon, e.g. the nagios-nrpe-server package on Ubuntu):
/etc/nagios/nrpe.cfg file
command[check_load]=/usr/lib64/nagios/plugins/check_load -w 15,10,5 -c 20,15,10
Finally, make sure to restart the icinga2 service on the master and the NRPE service on the client after making any change.
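For the check_disk service from the original question, the same pattern applies. A sketch, assuming the same plugin directory as the check_load example (the thresholds are placeholders):

On the Icinga 2 master, in /etc/icinga2/conf.d/192.168.30.40-service.conf:
object Service "DISK" {
  import "generic-service"
  host_name = "host1"
  check_command = "nrpe"
  vars.nrpe_command = "check_disk"
}

On the client, in /etc/nagios/nrpe.cfg:
command[check_disk]=/usr/lib64/nagios/plugins/check_disk -w 20% -c 10%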

You could also use an icinga2 agent instead of nrpe. The agent will be able to receive its configuration from a master or satellite, and perform local checks on the server.
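For completeness, a rough sketch of what that can look like on the master, assuming the master zone is named "master" and the agent has been set up with the icinga2 node wizard (names and the built-in "disk" check are illustrative):

object Endpoint "host1" {
  host = "192.168.30.40"
}
object Zone "host1" {
  endpoints = [ "host1" ]
  parent = "master"
}
object Service "DISK" {
  import "generic-service"
  host_name = "host1"
  check_command = "disk"
  command_endpoint = "host1"
}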

Related

How to set a specific port for single-user Jupyterhub server REST API calls?

I have set up Spark SQL on JupyterHub using the Apache Toree SQL kernel. I wrote a Python function to update Spark configuration options in the kernel.json file, so my team can change the configuration based on their queries and cluster configuration. But I have to shut down the running notebook and re-open it, or restart the kernel, after running the Python function; that is how I force the Toree kernel to re-read the JSON file and pick up the new configuration.
I thought of implementing this shutdown and restart of the kernel programmatically. I found the JupyterHub REST API documentation and am able to implement it by invoking the related APIs. The problem is that the single-user server API port is set randomly by the Spawner object of JupyterHub, and it changes every time I spin up a cluster. I want it to be fixed before launching the JupyterHub service.
Here is a solution I tried based on Jupyterhub docs:
sudo echo "c.Spawner.port = 35289
c.Spawner.ip = '127.0.0.1'" >> /etc/jupyterhub/jupyterhub_config.py
But this did not work; the port was again set randomly by the Spawner. I think there is a way to fix this. Any help would be greatly appreciated. Thanks.
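One detail worth double-checking in the snippet above (an observation, not a confirmed fix): with sudo echo ... >> file, the redirection is performed by the invoking shell, not by root, so the lines may never have reached the config file. A sketch of appending them with tee instead and restarting the Hub (the service name is an assumption):

sudo tee -a /etc/jupyterhub/jupyterhub_config.py <<'EOF'
# pin the single-user server port and bind address (the options from the question)
c.Spawner.port = 35289
c.Spawner.ip = '127.0.0.1'
EOF
sudo systemctl restart jupyterhub

Note that pinning Spawner.port only works if at most one single-user server runs per host; otherwise they would collide on the same port.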

Configuring SSL channel connectivity on MQ client machine

From a Linux server with the MQ client installed, we are trying to set up a connection to a secured channel. I am an ETL person and our MQ admin is struggling, so I will explain what I tried (which obviously hasn't worked yet); please let me know what else needs to be done to set up the connectivity. Thanks :)
tmp/mqmutility/keyrepmodmq> ls
AMQCLCHL.TAB key.kdb key.rdb key.sth MODE_MODELTAP_DEV_keyStLst.txt
export MQSSLKEYR=/tmp/mqmutility/keyrepmodmq/key
export MQCHLLIB=/tmp/mqmutility/keyrepmodmq
export MQCHLTAB=AMQCLCHL.TAB
/opt/mqm/samp/bin> amqsputc <queue_name> <queue_manager_name>
Sample AMQSPUT0 start
MQCONN ended with reason code 2058
Note: I can connect to the same queue manager over a non-SSL channel.
Any help will be great and other approaches you follow for SSL channel connectivity from client machine will also be helpful.
When using a Client Channel Definition Table (CCDT) file - your AMQCLCHL.TAB file, a return code of 2058 usually means that the queue manager name the application tried to use - your 'queue_manager_name' - was not found in any of the channel entries in the CCDT file.
If you're using MQ V8, you can very easily display the entries in your CCDT file, and the queue manager names they are configured for, using the following command:
runmqsc -n
DISPLAY CHANNEL(*) QMNAME
If none of the channels in your file have the queue manager name you are using when running the amqsputc sample then this is the cause of your 2058 reason code.
Hopefully it will be clear when you see the entries in the file listed out which queue manager name you should be using, but if not, update your question with some more details (like the contents of said file and the queue manager details) and we can help further.
You must ensure that you have a CLNTCONN channel defined which has the queue manager name you want to use in the QMNAME field, and that you have a matching named SVRCONN channel defined on the queue manager. Since you are using SSL, you must also ensure that these two channels are using the same SSLCIPH.
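A rough MQSC sketch of such a pair (channel name, connection name, queue manager name, and CipherSpec are all placeholders; define the CLNTCONN where the CCDT is generated and the SVRCONN on the queue manager):

DEFINE CHANNEL(APP.SVRCONN) CHLTYPE(SVRCONN) TRPTYPE(TCP) +
       SSLCIPH(TLS_RSA_WITH_AES_256_CBC_SHA256)
DEFINE CHANNEL(APP.SVRCONN) CHLTYPE(CLNTCONN) TRPTYPE(TCP) +
       CONNAME('mqhost.example.com(1414)') +
       QMNAME(QMGR1) +
       SSLCIPH(TLS_RSA_WITH_AES_256_CBC_SHA256)

The two definitions share the channel name and the SSLCIPH, and the QMNAME on the CLNTCONN is what amqsputc matches against.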
Please read Creating server-connection and client-connection definitions on the server and its child topics.

Setting Up Fabric SSH, Error: Timed Out

I'm new to Fabric, so this might have a simple answer I've missed due to bad search terminology.
I'm trying to start a new ubuntu EC2 instance in AWS, then connect to it with Fabric and have it execute several commands. However, it seems there is a problem with Fabric's SSH connection, maybe I'm defining some env variable wrong?
@task  # starts a new EC2 instance and sets env variables
def prep_deploy():
    # ...code to start a new EC2 instance, named buildhost...
    env.hosts = [buildhost.public_dns_name]
    env.user = "ubuntu"
    env.key_filename = ".../keypair.pem"
    env.port = 22

@task
def deploy():
    run("echo $HOME")  # code fails here
    ....
I run fab prep_deploy deploy, since I read you need a new task for the new env variables to take effect.
I get
Fatal error: Timed out trying to connect to ...amazonaws.com (tried 1 time)
Underlying exception: timed out
The security groups for the instance are open to SSH: I can connect through Putty. In fact, if I empty the `env.host_string` variable at the start of deploy(), when it prompts me to manually input a host, I can type in "ubuntu@...amazonaws.com:22", with the host name exactly as seen in the output at the task start, and it will connect to the instance. But I can't figure out how to manipulate the environment variables so that it understands the host name.
It looks like your Fabric settings and use of variables are correct; I was able to use the code you provided to connect to my Ubuntu VM. I wonder if you are having a connection issue because the Amazon instance is not fully booted and ready for connections when your script runs the second task. I have run into that issue with other VM hosts.
I added the following code to check and retry the connection. This might help you:
import socket
import time

def waitforssh():
    # poll the SSH port until the host accepts a TCP connection
    address = env.host_string
    port = 22
    while True:
        time.sleep(5)
        s = socket.socket()  # fresh socket each attempt; a failed socket can't be reused
        try:
            s.connect((address, port))
            s.close()
            return
        except Exception, e:
            print "failed to connect to %s:%s" % (address, port)
Insert the function call into your deploy task:
def deploy():
    waitforssh()
This should test the connection: if the port does not respond, it waits 5 seconds and tries again. That could also explain why your manual attempt succeeds: by the time you type the host string in, the instance has finished booting.
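If the retry alone doesn't help, another thing worth trying (a sketch modeled on the manual input that works for you, not a confirmed fix) is to set env.host_string explicitly, in the same user@host:port form, at the end of prep_deploy():

env.host_string = "ubuntu@%s:22" % buildhost.public_dns_name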

RabbitMQ error on startup - erlexec: HOME must be set

For some reason I cannot start RabbitMQ anymore after it crashed.
I am getting the following error:
erlexec: HOME must be set
I've tried exporting HOME as /home/ubuntu but am still getting the same error.
Any ideas?
I'm assuming that you are trying to start RabbitMQ with something like service rabbitmq-server start. If so, note that the service command strips out environment variables, so you will need to define HOME either in your startup script or in a config file for your startup script (see https://unix.stackexchange.com/a/44378).
Additionally, I believe the RabbitMQ home directory is actually /var/lib/rabbitmq/.
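For example, under systemd one way to define it is a drop-in override for the unit (a sketch, assuming the unit is named rabbitmq-server; create the file with systemctl edit rabbitmq-server):

[Service]
Environment=HOME=/var/lib/rabbitmq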
I found a suitable solution myself: run the epmd service before the RabbitMQ server. This fixes the issue with the HOME variable, among others.
erlexec needs the environment variable HOME to be set in order to place the Erlang cookie (a file containing a shared-secret string). If HOME is for some reason unset in the environment in which you are running rabbitmq (or rabbitmqctl), you will get this error.
Try to check if HOME is defined by typing:
$ env
to get the list of defined environment variables. If HOME is not defined, try defining it with:
$ export HOME=/var/lib/rabbitmq
If you are using python3 and tox, note that by default tox does not pass the current environment variables through to the test environment. You will have to add the following to tox.ini:
setenv =
    HOME=/var/lib/rabbitmq
Just wanted to mention it because it gave me a headache the entire day today; I finally understood what the issue was, and I thought I should share this hint.
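For context, a minimal tox.ini sketch showing where that setting lives; passenv is the standard alternative if you would rather forward the invoking user's real HOME:

[testenv]
setenv =
    HOME=/var/lib/rabbitmq
# or instead:
# passenv = HOME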

I can't log in to a cloned image on Rackspace

I have an instance in my account that I can log in to over SSH on port 27891. When I clone it (make an image, and from the image a server), I can't access it with the password Rackspace gave me for root.
I've tried port 22 and port 27891, and neither works.
I'll appreciate any help!
Thanks.
Cloned servers don't have the same password. You'll need to grab the admin password either from the UI or the API you're using. If you need to, you can always change the password as well.
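A one-liner sketch for that password change, using the same pyrax client as the example below (the method comes from the underlying novaclient Server object):

# `server` is a pyrax cloudservers Server instance
server.change_password("NewR00tPassw0rd")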
On the other hand, I highly recommend using SSH keypairs with Rackspace boxes. You'll need to use the API (or one of the SDKs), but it makes it much easier to manage the boxes. Additionally, the authorized_keys file will be on each of the cloned images.
Example cloning code, using pyrax (the Python library for Rackspace):
import os
import pyrax

# Authenticate with Rackspace -- also works against OpenStack/Keystone
pyrax.set_setting("identity_type", "rackspace")
pyrax.set_credential_file(os.path.expanduser("~/.rax_creds"))
cs = pyrax.connect_to_cloudservers(region="DFW")

# Could use the API to get image IDs
# This one is for Ubuntu 13.04 (Raring Ringtail) (PVHVM beta)
image = u'62df001e-87ee-407c-b042-6f4e13f5d7e1'
flavor = u'performance1-1'

# Create a server (solely to make an image out of it, for this example)
base_server = cs.servers.create("makeanimageoutofme",
                                image,
                                flavor,
                                key_name="rgbkrk")
# It takes a little while to create the server, so we poll it until it completes
base_server = pyrax.utils.wait_for_build(base_server, verbose=True)

# Create image
im = base_server.create_image("base_image")
image = cs.images.get(im)
# Wait for image creation to finish
image = pyrax.utils.wait_until(image, "status", "ACTIVE", attempts=0)

# Note that a clone can be a bigger flavor (VM size) if needed as well
# Here we use the flavor of the base server
clone_server = cs.servers.create(name="iamtheclone",
                                 image=image.id,
                                 flavor=base_server.flavor['id'])
clone_server = pyrax.utils.wait_for_build(clone_server, verbose=True)
Within each of the created machines, the generated credentials are in base_server.adminPass and clone_server.adminPass. To access the box, use base_server.accessIPv4 and clone_server.accessIPv4.
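For instance, a quick sketch using just the attributes named above:

# print what to connect to (Python 2, matching the example above)
print "clone: root@%s password: %s" % (clone_server.accessIPv4, clone_server.adminPass)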
However, I highly recommend using SSH keypairs.