How to understand the output of rabbitmqctl commands - rabbitmq

$ rabbitmqctl list_queues
Timeout: 60.0 seconds ...
Listing queues for vhost / ...
privateTransactionQ 2
amq.gen-o9dl3Zj7HxS50gkTC2xbBQ 0
task_queue 0
The output of rabbitmqctl looks like this. I can't make out what each column is meant for. How can I find out the meaning of each column?

There is no "easy" built-in solution for this, but we work in IT and we can build one. I'm not an expert in RabbitMQ nor in programming, but I'll do my best to give a good answer, in case someone lands here looking for help.
Let's take the exact case of listing the queues from rabbitmqctl console. By typing "rabbitmqctl" you get the list of available commands:
Commands:
[...]
list_queues [-p <vhost>] [--online] [--offline] [--local] [<queueinfoitem> ...] [-t <timeout>]
[...]
Assuming you know what a vhost and queue are, let's say you want to list all the queues in vhost "TEST", then you would need to type:
> rabbitmqctl list_queues -p TEST
Timeout: 60.0 seconds ...
Listing queues for vhost TEST ...
test.queue 0
By default, you only get the name of the queue and its current depth (number of messages).
Where do you find all the parameters of the queues? Pay special attention to the word "queueinfoitem" in the help instruction you typed first. At the end of the rabbitmqctl help output (shown by typing "rabbitmqctl"), you can see the list of values available for the <queueinfoitem> parameter.
Now let's see an example where you want a more detailed status of the queue, say: messages ready, messages unacknowledged, messages in RAM, consumers, consumer utilisation, the state of the queue and, of course, its name.
You are right about one thing: rabbitmqctl doesn't return the result in a friendly way. By default, you get this:
rabbitmqctl list_queues -p TEST messages_ready messages_unacknowledged messages_ram consumers consumer_utilisation state name
Timeout: 60.0 seconds ...
Listing queues for vhost TEST ...
0 0 0 0 running test.queue
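Since rabbitmqctl separates the columns with tab characters, in the order the info items were requested, you can pair each value with its column name yourself. A small sketch (the sample line is made up to match the output above):

```python
# Columns requested on the command line, in order.
columns = ["messages_ready", "messages_unacknowledged", "messages_ram",
           "consumers", "consumer_utilisation", "state", "name"]

# A sample tab-separated output line (consumer_utilisation is empty
# here because the queue has no consumers).
line = "0\t0\t0\t0\t\trunning\ttest.queue"

row = dict(zip(columns, line.split("\t")))
print(row["name"], row["state"])  # test.queue running
```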
But with a bit of imagination, you can achieve this:
----------------------------------------------------------
Msg. * Msg. * Msg. ** ** Cons. ** **** Name
Rdy * Unack * RAM *** Cons. * Util. ** State ***
----------------------------------------------------------
0 0 0 0 running test.queue
It's no big deal, but it's better than the default.
I achieved that with a small python script:
import os
import logging

logging.basicConfig(level=logging.INFO)

# Table decoration (the same headers as shown above).
header_a = "Msg. * Msg. * Msg. ** ** Cons. ** **** Name\n"
header_b = "Rdy * Unack * RAM *** Cons. * Util. ** State *** \n"
dash = "----------------------------------------------------------\n"

# List all vhosts; drop the "Listing vhosts ..." header line and the
# trailing newline, then split into one vhost name per entry.
vhosts = os.popen("rabbitmqctl list_vhosts name").read()
logging.info(vhosts)
vhosts = vhosts.split("\n", 1)[1]
vhosts = vhosts[:-1]
vhosts = vhosts.split("\n")

for vhost in vhosts:
    # Drop the "Timeout ..." and "Listing queues ..." lines from the output.
    queues = os.popen("rabbitmqctl list_queues -p " + vhost
                      + " messages_ready messages_unacknowledged messages_ram"
                      + " consumers consumer_utilisation state name").read()
    queues = queues.split("\n", 2)[2]
    print(dash + header_a + header_b + dash + queues)
Of course this can be improved in so many ways and critics are always welcome, I still hope it helps someone.
Cheers.

Related

Why redis pubsub working is independent of database?

I am a newbie to Redis and am trying to understand the concept of Redis Pub/Sub.
Step 1:
root@01a623a828db:/data# redis-cli -n 1
127.0.0.1:6379[1]> subscribe foo
Reading messages... (press Ctrl-C to quit)
1) "subscribe"
2) "foo"
3) (integer) 1
In the 1st step, I subscribed to channel "foo" on database 1.
Step 2:
root@01a623a828db:/data# redis-cli -n 4
127.0.0.1:6379[4]> publish foo 2
(integer) 1
In the 2nd step, I published a message on channel "foo" from database 4.
Step 3:
root@01a623a828db:/data# redis-cli -n 1
127.0.0.1:6379[1]> subscribe foo
Reading messages... (press Ctrl-C to quit)
..........................................
1) "message"
2) "foo"
3) "2"
In the 3rd step, the subscriber on database 1 got the message that was published on database 4 in the 2nd step.
I tried to find out the reason behind it but I found same answer everywhere- "Pub/Sub has no relation to the key space. It was made to not interfere with it on any level, including database numbers. Publishing on db 10, will be heard by a subscriber on db 1. If you need scoping of some kind, prefix the channels with the name of the environment (test, staging, production)- This is as per official documentation of Redis PubSub."
Questions:
Why is the Redis Pub/Sub architecture independent of the database?
How do I implement "If you need scoping of some kind, prefix the channels with the name of the environment (test, staging, production)"?
"Publishing on db 10, will be heard by a subscriber on db 1." - this does not seem in line with the statement "It was made to not interfere with it on any level, including database numbers."
It's a matter of design choice, really.
If you need scoping, you can always prefix the channel pattern. E.g. the pattern productupdate on the test environment would be watched via test:productupdate, and on the staging environment via staging:productupdate.
And it does line up well with the statement: the database number simply doesn't matter here.
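As a minimal Python sketch of that prefixing idea (the scoped_channel helper and the environment names are illustrative, not part of the Redis API):

```python
# Scope Pub/Sub channels by prefixing the environment name.
def scoped_channel(env, channel):
    """Build an environment-scoped channel name, e.g. "test:productupdate"."""
    return "%s:%s" % (env, channel)

# A subscriber in staging listens on "staging:productupdate" and will
# never hear messages published to "test:productupdate".
print(scoped_channel("test", "productupdate"))     # test:productupdate
print(scoped_channel("staging", "productupdate"))  # staging:productupdate
```

With a client such as redis-py you would then publish with r.publish(scoped_channel(env, "productupdate"), payload) and subscribe via the same helper, so each environment only hears its own traffic.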

RabbitMQ lager_error_logger_h dropped messages

Please help me solve this problem. My setup:
RabbitMQ - 3.7.2
Erlang - 20.1
Connections: 527
Channels: 500
Exchanges: 49
Queues: 4437
Consumers: 131
Publish rate ~ 200/s
Ack rate ~ 200/s
Config:
disk_free_limit.absolute = 5GB
log.default.level = warning
log.file.level = warning
In the logs constantly appear such messages:
11:42:16.000 [warning] <0.32.0> lager_error_logger_h dropped 105 messages in the last second that exceeded the limit of 100 messages/sec
11:42:17.000 [warning] <0.32.0> lager_error_logger_h dropped 101 messages in the last second that exceeded the limit of 100 messages/sec
11:42:18.000 [warning] <0.32.0> lager_error_logger_h dropped 177 messages in the last second that exceeded the limit of 100 messages/sec
How do I get rid of them correctly? How can I remove these messages from the logs?
The RabbitMQ team monitors the rabbitmq-users mailing list and only sometimes answers questions on StackOverflow.
The message means that RabbitMQ is generating a very large number of error messages and that they are being dropped to avoid filling the log rapidly. If "dropped X messages in the last second" is the only message you are seeing in the logs, you need to determine what the messages are that are being dropped to find the root of the problem. You can do this by temporarily raising that limit by running the following command:
rabbitmqctl eval '[lager:set_loghwm(H, 250) || H <- gen_event:which_handlers(lager_event)].'
You should then see a much larger number of messages that will reveal the underlying issue. To revert back to the previous setting, run this command:
rabbitmqctl eval '[lager:set_loghwm(H, 50) || H <- gen_event:which_handlers(lager_event)].'

Can I use "stopsignal=WINCH" to have supervisord gracefully stop an Apache process?

According to the Apache documentation, the WINCH signal can be used to gracefully stop Apache.
So it would seem that, in supervisord, I should be able to use stopsignal=WINCH to configure supervisord to stop Apache gracefully.
However, Google turns up 0 results for "stopsignal=WINCH". It seems odd that no-one has tried this before.
Just wanted to confirm: is stopsignal=WINCH the way to get supervisord to stop Apache gracefully?
I had the same problem running/stopping apache2 under supervisord inside a Docker container. I don't know if your problem is related to Docker or not, or how familiar you are with Docker, but just to give some context: when calling docker stop <container-name>, Docker sends SIGTERM to the process with PID 1 running inside the container (some details on the topic), in this case supervisord. I wanted supervisord to pass the signal on to all its programs to gracefully terminate them, because I realized that if you don't gracefully terminate apache2, you might not be able to restart it, since the PID file is not removed. I tried both with and without stopsignal=WINCH, and the result didn't change for me: in both cases apache2 was gently terminated (exit status was 0 and no PID file was left in /var/run/apache2). To stay on the safe side, I kept stopsignal=WINCH in the supervisord config, but as of today I was also not able to find a clear answer online, neither here nor by googling.
According to the supervisord's source code:
# all valid signal numbers
SIGNUMS = [getattr(signal, k) for k in dir(signal) if k.startswith('SIG')]

def signal_number(value):
    try:
        num = int(value)
    except (ValueError, TypeError):
        name = value.strip().upper()
        if not name.startswith('SIG'):
            name = 'SIG' + name
        num = getattr(signal, name, None)
        if num is None:
            raise ValueError('value %r is not a valid signal name' % value)
    if num not in SIGNUMS:
        raise ValueError('value %r is not a valid signal number' % value)
    return num
It recognizes all valid signals, and even if your signal name doesn't start with 'SIG', it adds that prefix automatically. So stopsignal=WINCH resolves to SIGWINCH.
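A quick self-contained sketch of that normalization, using only the stdlib signal module (the variable names here are mine, not supervisord's):

```python
import signal

# Mimic supervisord's normalization: "WINCH" -> "SIGWINCH" -> signal number.
name = "WINCH".strip().upper()
if not name.startswith("SIG"):
    name = "SIG" + name
num = getattr(signal, name, None)
print(num == signal.SIGWINCH)  # True on POSIX systems
```

So stopsignal=WINCH in a [program:apache2] section is accepted, and supervisord will send SIGWINCH to the process when stopping it.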

Wrong qstat GPU resource count SGE

I have a GPU resource called gpus. When I run qstat -F gpus I get weird output of the format "qc:gpus=-1", i.e. a negative number of available GPUs is reported. If I run qstat -g c, it says I have multiple GPUs available. Multiple jobs fail because of "unavailable gpus". It's as if the counting of GPUs starts from 1 instead of 8 on each node, so as soon as I use more than 1 the count becomes negative. My queue is:
hostlist node-01 node-02 node-03 node-04 node-05
seq_no 0
load_thresholds NONE
suspend_thresholds NONE
nsuspend 1
suspend_interval 00:05:00
priority 0
min_cpu_interval 00:05:00
processors UNDEFINED
qtype BATCH INTERACTIVE
ckpt_list NONE
pe_list smp mpich2
rerun FALSE
slots 1,[node-01=8],[node-02=8],[node-03=8],[node-04=8],[node-05=8]
Does anyone have any idea why this is happening?
I believe you have set the "gpus" complex in the host configuration. You can see it by running
qconf -se node-01
And you can check the definition of the "gpus" complex with
qconf -sc
For instance, my UGE has this definition for the "ngpus" complex:
#name   shortcut  type  relop  requestable  consumable  default  urgency
ngpus   gpu       INT   <=     YES          YES         0        1000
And an example node "qconf -se gpu01":
hostname gpu01.cm.cluster
...
complex_values exclusive=true,m_mem_free=65490.000000M, \
m_mem_free_n0=32722.546875M,m_mem_free_n1=32768.000000M, \
ngpus=2,slots=16,vendor=intel
You can modify the value by "qconf -me node-01". See the man page complex(5) for details.

Procmail sends an extra email

I use procmail to forward mail from certain 'From' addresses to a Gmail account.
/home/user/.procmailrc
:0c
* !^FROM_MAILER
* ^From: .*aaa | bbb | ccc.*
! ^X-Loop: user@gmail\.com
| formail -k -X "From:" -X "Subject:" \
-I "To: user@gmail.com" \
-I "X-Loop: user@gmail.com"
:0
* ^From: .*aaa | bbb | ccc.*
$DEFAULT
This works fine, but in my server inbox I also get an 'undelivered mail' bounce:
The mail system <"^X-Loop:"@my-name-server.com> (expanded from
<"^X-Loop:">): unknown user:
"^x-loop:"
How can I avoid this?
I've tried to delete these bounce mails with the recipe below. I know this is not the best way; in any case, it does not work:
:0B
* <"\^X-Loop:"@my-name-server.com>
/dev/null
The recipe contains multiple syntax errors, but the bounce message comes because you lack an asterisk on one of the condition lines, which makes it an action line instead.
The general syntax of a Procmail recipe is
:0flags # "prelude", with optional flags
* condition # optional, can have zero conditions
* condition # ...
action
The action can be a mailbox name, or ! followed by a destination mailbox to forward the message to, or | followed by a shell pipeline.
So your first recipe says "if not from a mailer and matching From: ..., forward to '^X-Loop:'".
The | formail ... line after that is then simply a syntax error and ignored, because it needs to come after a prelude line :0 and (optionally) some condition lines.
Additionally, the ^From: regex is clearly wrong. It will match From: .*aaa or bbb (with spaces on both sides, in any header, not just the From: header) or ccc.
Finally, the intent is apparently to actually forward the resulting message somewhere.
:0c
* ! ^FROM_MAILER
* ^From:(.*\<)?(aaa|bbb|ccc)
* ! ^X-Loop: user@gmail\.com
| formail -I "X-Loop: user@gmail.com" | $SENDMAIL $SENDMAILFLAGS user@gmail.com
If you simply want to forward the incoming message, the other -X and -I options, and certainly the -k option, are superfluous or wrong. If they do accomplish something that is merely irrelevant for this question, you may need to add some or all of them back (and also remember to extract with -X any new headers you add with -I, as otherwise they will be suppressed; this sucks).
Your second recipe is also superfluous, unless you have more Procmail recipes later in the file which should specifically be bypassed for these messages. (If so, you will need to fix the From: regex there as well.)