How to get detailed log/info about rabbitmq connection action?

I have a Python program connecting to a RabbitMQ server. When the program starts, it connects fine. But when the RabbitMQ server restarts, my program cannot reconnect to it, and the only error it leaves is "Socket closed" (produced by kombu), which tells me nothing.
I want to know the detailed reason for the connection failure. On the server side there is nothing useful in the RabbitMQ log file either; it just says "connection failed" with no reason given.
I tried the trace plugin (https://www.rabbitmq.com/firehose.html), but found that no trace info was published to the amq.rabbitmq.trace exchange when the connection failure happened. I enabled the plugin with:
rabbitmq-plugins enable rabbitmq_tracing
systemctl restart rabbitmq-server
rabbitmqctl trace_on
and then I wrote a client to consume messages from the amq.rabbitmq.trace exchange:
#!/usr/bin/env python
from kombu.connection import BrokerConnection
from kombu.messaging import Exchange, Queue, Consumer, Producer

def on_message(body, message):
    print("RECEIVED MESSAGE: %r" % (body, ))
    message.ack()

def main():
    # Note: '@' (not '#') separates the credentials from the host.
    conn = BrokerConnection('amqp://admin:pass@localhost:5672//')
    channel = conn.channel()
    queue = Queue('debug', channel=channel, durable=False)
    queue.bind_to(exchange='amq.rabbitmq.trace',
                  routing_key='publish.amq.rabbitmq.trace')
    consumer = Consumer(channel, queue)
    consumer.register_callback(on_message)
    consumer.consume()
    while True:
        conn.drain_events()

if __name__ == '__main__':
    main()
I also tried to get some debug logging from the RabbitMQ server. I reconfigured rabbitmq.config according to https://www.rabbitmq.com/configure.html and set log_levels to
{log_levels, [{connection, info}]}
but with that the RabbitMQ server failed to start. It seems the official doc does not match my setup; my RabbitMQ server version is 3.3.5. However,
{log_levels, [connection,debug,info,error]}
or
{log_levels, [connection,debug]}
does work, but then no DEBUG info shows up in the logs, and I don't know whether that is because the log_levels setting is not taking effect or because no DEBUG messages are ever printed.

I know that this answer comes massively late, but for future readers, this worked for me:
[
  {rabbit,
    [
      {log_levels, [{connection, debug}, {channel, debug}]}
    ]
  }
].
Basically, you just need to wrap the parameters you want to set inside the module/plugin section they belong to (here, rabbit).
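For what it's worth, more detail can also be squeezed out on the client side: kombu and its py-amqp transport log through Python's standard logging module, so turning on DEBUG logging usually shows the underlying socket/heartbeat errors behind the bare "Socket closed" message. A minimal sketch (assuming kombu with the py-amqp transport; the URL is the one from the question):

import logging
from kombu import Connection

# Assumption: kombu and its py-amqp transport log through the standard logging
# module, so DEBUG level surfaces transport-level errors and reconnect attempts.
logging.basicConfig(level=logging.DEBUG)
logging.getLogger('amqp').setLevel(logging.DEBUG)

with Connection('amqp://admin:pass@localhost:5672//') as conn:
    # ensure_connection retries and logs the reason for each failed attempt
    conn.ensure_connection(max_retries=3)
    print('connected:', conn.connected)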

Related

Syslog-ng logs not processing certain logs possibly due to journal cursor issue

I'm using syslog-ng 3.37.1 on a VMware Photon 3.0 virtual appliance (a preconfigured VM). The appliance is configured to write logs into certain files under the /var/log folder as well as to remote syslog servers (optional).
Logs from the 'auth' and 'authpriv' facilities are configured to be written to /var/log/auth.log, and also sent to a remote syslog server when enabled.
In addition, there are other messages from the kernel, systemd services, and other processes that are configured to be processed via syslog-ng.
The issue is that logs from a few facilities (such as auth, authpriv, cron, etc.) are not processed (received?) by syslog-ng initially. So SSH events and TTY login events are not logged to the file or sent to the remote server. However, many other events from the kernel, systemd, and other processes are logged fine.
Below is the configuration for auth.log, which does not log anything on the first boot.
filter f_auth { facility(auth) or facility(authpriv); };
destination authlog { file("/var/log/auth.log" perm(0600)); };
log { source(s_local); filter(f_auth); destination(authlog); };
I updated the filter as below without any success
filter f_auth {
    facility(auth) or facility(authpriv) or
    match('sshd' value('PROGRAM')) or match('systemd-logind' value('PROGRAM'));
};
In the journal I can see the relevant logs; for example, the command below shows the SSH logs:
journalctl -f -u sshd
An additional syslog-ng service restart or config reload during appliance startup does not fix this.
The log file /var/log/auth.log (and also the cron log, etc.) shows zero size during this time. The syslog-ng log looks fine too.
However, if I generate an auth-facility event (say, an SSH/TTY login) and manually restart syslog-ng, all the log entries (including the old events) are immediately written to the log file (/var/log/auth.log) and also sent to the remote syslog server.
In syslog-ng.log I find the entry below when it starts working that way:
syslog-ng[481]: [date] Failed to seek journal to the saved cursor position; cursor='', error='Invalid argument (22)'
This makes me wonder if it is due to a bad cursor position. However, I can still see other systemd and kernel logs being logged fine, so I'm not sure.
What could be causing such behaviour? How can I ensure that syslog-ng is able to receive and process these logs without manual intervention?
Below is more detailed configuration for reference:
@version: 3.37
@include "scl.conf"
source s_local {
    system();
    internal();
    udp();
};
destination d_local {
    file("/var/log/messages");
    file("/var/log/messages-kv.log" template("$ISODATE $HOST $(format-welf --scope all-nv-pairs)\n") frac-digits(3));
};
log {
    source(s_local);
    # uncomment this line to open port 514 to receive messages
    #source(s_network);
    destination(d_local);
};
filter f_auth {
    facility(auth) or facility(authpriv); # Also tried facility(auth, authpriv)
};
destination authlog { file("/var/log/auth.log" perm(0600)); };
log { source(s_local); filter(f_auth); destination(authlog); };
destination d_kern { file("/dev/console" perm(0600)); };
filter f_kern { facility(kern); };
log { source(s_local); filter(f_kern); destination(d_kern); };
destination d_cron { file("/var/log/cron" perm(0600)); };
filter f_cron { facility(cron); };
log { source(s_local); filter(f_cron); destination(d_cron); };
destination d_syslogng { file("/var/log/syslog-ng.log" perm(0600)); };
filter f_syslogng { program(syslog-ng); };
log { source(s_local); filter(f_syslogng); destination(d_syslogng); };
# A few more of the above kind of configuration follows here.
# Add configuration files that have remote destination, filter and log configuration for remote servers
#include "remote/*.conf"
As can be seen, /var/log/auth.log should hold logs from the auth facility, but the log remains blank until a subsequent restart of syslog-ng after a syslog config change (via API) or a manual login into the system. However, triggering an automated restart of syslog-ng via cron (without an additional syslog config change) does not help.
Any thoughts, suggestions?
This is probably caused by your real time clock going backwards. The notification mechanism in libsystemd does not work in this case.
There's a proof-of-concept patch in this syslog-ng issue:
https://github.com/syslog-ng/syslog-ng/issues/2836
But I've increased the priority of tackling that problem and fixing this, as it is causing issues more often than I anticipated.
As a workaround, you should synchronize the time on your VM, preferably so that during boot it waits for a sync, and then keep the time synchronized via NTP.
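For the boot-time part of that workaround, a rough sketch (a hypothetical helper, assuming a systemd host where timedatectl and systemctl are available) that waits for NTP synchronization before restarting syslog-ng:

import subprocess
import time

def ntp_synchronized() -> bool:
    # `timedatectl show -p NTPSynchronized --value` prints "yes" once the
    # clock has been synchronized by the configured NTP client.
    out = subprocess.run(
        ["timedatectl", "show", "-p", "NTPSynchronized", "--value"],
        capture_output=True, text=True,
    )
    return out.stdout.strip() == "yes"

def main() -> None:
    for _ in range(60):  # wait up to ~5 minutes
        if ntp_synchronized():
            subprocess.run(["systemctl", "restart", "syslog-ng"], check=False)
            return
        time.sleep(5)

if __name__ == "__main__":
    main()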

Error while running query on Impala with Superset

I'm trying to connect Impala to Superset. When I test the connection it prints "Seems OK!", and when I browse the Impala databases with the SQL Editor on the left side, it shows all databases without problems.
Preview of Databases/Tables
But when I write a query and click "Run Query", it gives the error: "Could not start SASL: b'Error in sasl_client_start (-1) SASL(-1): generic failure: GSSAPI Error: Unspecified GSS failure. Minor code may provide more information (Ticket expired)'"
Error running query
I'm running Superset with SSL and in production mode (with Gunicorn), and Impala with SSL in a Kerberized Hadoop cluster. My Impala database config is:
Impala Config
And in the extras I put:
{
    "metadata_params": {},
    "engine_params": {
        "connect_args": {
            "port": 21050,
            "use_ssl": "True",
            "ca_cert": "path/to/my/ca_cert.pem",
            "auth_mechanism": "GSSAPI"
        }
    },
    "metadata_cache_timeout": {},
    "schemas_allowed_for_csv_upload": []
}
How can I solve this error? In my superset log it only shows:
Triggering query_id: 65
INFO:superset.views.core:Triggering query_id: 65
Query 65: Running query on a Celery worker
INFO:superset.views.core:Query 65: Running query on a Celery worker
Versions: Superset 0.36.0, Impyla 0.16.2
I was able to fix this error by doing these steps:
1 - Created a service user for the celery worker, created a Kerberos ticket for it, and created a crontab entry to renew the ticket.
2 - Ran the celery worker as this service user instead of running it as root.
3 - Killed a celery worker that was running on another machine of my cluster.
4 - Restarted Impala and Superset.
I think this error occurred because some queries, instead of using the celery worker on my Superset machine, were using the celery worker on another machine that had no valid Kerberos ticket. I was able to pin this down because the celery worker log showed a failed connection to the celery worker on the other machine while a query was running.
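For step 1, the ticket renewal can be scripted; a hedged sketch (the principal and keytab path are placeholders, not from the original setup) that cron can run periodically for the celery-worker service user:

import subprocess

# Placeholders (not from the original setup): adjust the keytab path and principal.
KEYTAB = "/etc/security/keytabs/celery-worker.keytab"
PRINCIPAL = "celery-worker@EXAMPLE.COM"

def renew_ticket() -> None:
    # `kinit -kt <keytab> <principal>` obtains/renews a ticket non-interactively,
    # so the celery worker always has a valid Kerberos ticket.
    subprocess.run(["kinit", "-kt", KEYTAB, PRINCIPAL], check=True)

if __name__ == "__main__":
    renew_ticket()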

RabbitMQ Shovel Error NOT_ALLOWED

I am trying to create a shovel from my old (RabbitMQ Server 3.6.1) broker to my new (RabbitMQ 3.6.6) broker and I keep getting this error:
=ERROR REPORT==== 25-Nov-2016::18:42:45 ===
** Generic server <0.23222.460> terminating
** Last message in was {'$gen_cast',init}
** When Server state == {state,undefined,undefined,undefined,undefined,
{<<"/myvhost">>,<<"Test">>},
dynamic,
{shovel,
{endpoint,
["amqp://user:password#localhost:5672/myvhost"],
#Fun<rabbit_shovel_parameters.4.22841904>},
{endpoint,
["amqp://user:password0#XXX.XXX.XXX.XXX:5672/myvhost"],
#Fun<rabbit_shovel_parameters.5.22841904>},
1000,no_ack,
#Fun<rabbit_shovel_parameters.6.22841904>,
#Fun<rabbit_shovel_parameters.7.22841904>,
<<"queue_name">>,
1,'queue-length'},
undefined,undefined,undefined,undefined,undefined}
** Reason for termination ==
** {{badmatch,{error,not_allowed}},
[{rabbit_shovel_worker,make_conn_and_chan,1,
[{file,"src/rabbit_shovel_worker.erl"},{line,236}]},
{rabbit_shovel_worker,handle_cast,2,
[{file,"src/rabbit_shovel_worker.erl"},{line,62}]},
{gen_server2,handle_msg,2,[{file,"src/gen_server2.erl"},{line,1049}]},
{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,239}]}]}
The log lines above appear on the source server, and nothing shows up in the logs on the destination server.
I have verified (using e.g. telnet) that the destination server:port is reachable and that the user and password being used are correct.
I created the shovel using source and destination queue names, not exchanges. The queue and vhost already exist on the destination server.
Does anyone have any ideas about this issue, or has anyone run into it in the past? This is the first time I am working with Shovel, so I may be missing a basic step.
Thank you.

how to force close a client connection rabbitmq

I have a client-server application that uses a RabbitMQ broker.
The client connects to RabbitMQ and sends messages to the server. At some point, if the server decides that this client should no longer be connected to RabbitMQ, I want to be able to force-disconnect the client from the RabbitMQ broker.
Note that in my case I don't want to send the client a message asking it to disconnect; on the server side I just want to force-disconnect this client from RabbitMQ.
I couldn't find an API to do this. Any help is appreciated.
You can use the management console plug-in in two ways:
Manually, by going to the connection and clicking "force close".
Through the HTTP API, using DELETE on /api/connections/name; here is a Python example:
import urllib2, base64

def calljsonAPI(rabbitmqhost, api):
    request = urllib2.Request("http://" + rabbitmqhost + ":15672/api/" + api)
    base64string = base64.encodestring('%s:%s' % ('guest', 'guest')).replace('\n', '')
    request.add_header("Authorization", "Basic %s" % base64string)
    request.get_method = lambda: 'DELETE'
    urllib2.urlopen(request)

if __name__ == '__main__':
    RabbitmqHost = "localhost"
    # here you should get the connection detail through the api,
    calljsonAPI(RabbitmqHost, "connections/127.0.0.1%3A49258%20-%3E%20127.0.0.1%3A5672")
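Along the same lines, a hedged Python 3 sketch (management plugin on its default port 15672; guest/guest and the user name below are placeholders) that looks up connections via GET /api/connections and force-closes the ones belonging to a given user:

import base64
import json
import urllib.parse
import urllib.request

API = "http://localhost:15672/api"                 # management plugin default port
AUTH = base64.b64encode(b"guest:guest").decode()   # placeholder credentials

def call(method, path):
    req = urllib.request.Request(API + path, method=method)
    req.add_header("Authorization", "Basic " + AUTH)
    with urllib.request.urlopen(req) as resp:
        body = resp.read()
    return json.loads(body) if body else None

def close_connections_of(user):
    # GET /api/connections returns one JSON object per open connection;
    # DELETE /api/connections/<name> force-closes that connection.
    for conn in call("GET", "/connections"):
        if conn.get("user") == user:
            name = urllib.parse.quote(conn["name"], safe="")
            call("DELETE", "/connections/" + name)

if __name__ == "__main__":
    close_connections_of("some_client_user")       # placeholder user name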
You can also use rabbitmqctl to close (force-close) connections:
rabbitmqctl close_connection <connectionpid> <explanation>
<connectionpid> can be obtained from:
rabbitmqctl list_connections pid
#or
rabbitmqctl list_consumers

rabbitmq and logstash configuration not working

I am trying to get Logstash to read messages from a RabbitMQ queue so that they get indexed in Elasticsearch. I initially had it working with a shipper sending messages to the Logstash port, but now even that is not working. This is the error when running the Logstash conf file:
RabbitMq connection error: . Will reconnect in 10 seconds... {:level=>error}
//not sure if the next piece is related:
WARN: org.elasticsearch.discovery.zen.ping.unicast: [Hellstrom, Damion] failed to send ping
to [[#zen_unicast_2#][inet[localhost/127.0.0.1:9301]]]
org.elasticsearch.transport.ReceiveTimeoutTransportException: []
[inet[localhost/127.0.0.1:9301]][discovery/zen/unicast] request_id [0] timed out after [3752ms]
at org.elasticsearch.transport.TransportService$TimeoutHandler.run(TransportService.java:356)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:744)
log4j, [2014-03-17T14:48:20.197] WARN: org.elasticsearch.discovery.zen.ping.
unicast: [Hellstrom, Damion] failed to send ping to
[[#zen_unicast_4#] [inet[localhost/127.0.0.1:9303]]]
org.elasticsearch.transport.ReceiveTimeoutTransportException:
[] [inet[localhost/127.0.0.1:9303]][discovery/zen/unicast]
request_id [3]
timed out after [3752ms]
at org.elasticsearch.transport.TransportService$TimeoutHandler.run(TransportService.java:356)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:744)
log4j, [2014-03-17T14:48:20.198] WARN: org.elasticsearch.discovery.zen.ping.unicast:
[Hellstrom, Damion] failed to send ping to
[[#zen_unicast_3#] [inet[localhost/127.0.0.1:9302]]]
I would really appreciate help on this. I have spent all weekend trying to get it to work. I even tried Redis initially, but that had its own set of errors.
Oh yes, here is my conf file:
input {
    rabbitmq {
        queue => "input.queue"
        host => "192.xxx.x.xxx"
        exchange => "exchange.output"
        vhost => "myhost"
    }
}
output {
    elasticsearch {
        embedded => true
        index => "board-feed"
    }
}
The problem is related to authentication with the RabbitMQ server. For the RabbitMQ transport, the default values for user/password are guest/guest, which by default in Rabbit will only work when connecting locally (to 127.0.0.1), whereas you are connecting to 192.xxx.x.xxx. (https://www.rabbitmq.com/access-control.html)
My guess is that when it worked before, you were running the Logstash Server on the same machine as RabbitMQ.
To fix the problem, set up an account in RabbitMQ and fill in the user/password options of the rabbitmq input to match.
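To double-check the new account from the Logstash host before touching the config, a quick hedged test with kombu (the account name, password, host, and vhost below are placeholders matching the question's setup):

from kombu import Connection

# Placeholders: substitute the RabbitMQ account you created and the broker
# address/vhost from the question (192.xxx.x.xxx, vhost "myhost").
with Connection('amqp://logstash_user:secret@192.xxx.x.xxx:5672/myhost') as conn:
    conn.connect()   # raises if the credentials or vhost permissions are wrong
    print('connected:', conn.connected)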