RabbitMQ: list limited (full) queues

Is there an API or rabbitmqctl call to list full queues, i.e. queues that have hit their max-length or max-length-bytes limit? Sometimes certain queues reach this threshold, and we would like to monitor for this.
Thanks
rabbitmqctl list_queues name arguments doesn't seem to show limits applied globally via policies.

rabbitmqctl list_queues name arguments doesn't seem to show global limits from policies
No, it won't; you would have to use rabbitmqctl list_policies for that.
You can use the HTTP API to determine whether your queue is close to its maximum. Here is example output for a single queue; note that messages and message_bytes appear in the output, as well as the effective policy:
{
"consumer_details": [],
"arguments": {
"x-queue-type": "quorum",
"x-quorum-initial-group-size": 1
},
"auto_delete": false,
"consumer_capacity": 0,
"consumer_utilisation": 0,
"consumers": 0,
"deliveries": [],
"durable": true,
"effective_policy_definition": {
"max-length-bytes": 10000000
},
"exclusive": false,
"garbage_collection": {
"fullsweep_after": 65535,
"max_heap_size": 0,
"min_bin_vheap_size": 46422,
"min_heap_size": 233,
"minor_gcs": 11
},
"incoming": [],
"leader": "rabbit-1#nkarlVMD6R",
"members": [
"rabbit-3#nkarlVMD6R",
"rabbit-1#nkarlVMD6R"
],
"memory": 47929396,
"message_bytes": 9984000,
"message_bytes_dlx": 0,
"message_bytes_persistent": 9984000,
"message_bytes_ram": 0,
"message_bytes_ready": 9984000,
"message_bytes_unacknowledged": 0,
"messages": 39,
This data is refreshed about every 5 seconds.
Note that you shouldn't hammer the HTTP API with frequent requests, either!
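As a sketch of how you could monitor this (assumptions: the management plugin is reachable on localhost:15672 with guest/guest credentials, and the policy caps appear under effective_policy_definition as in the output above; adjust for your setup):

```python
import json
from base64 import b64encode
from urllib.request import Request, urlopen

def near_limit(queue, ratio=0.9):
    """Return True if a queue's bytes or message count is within `ratio` of its policy cap."""
    policy = queue.get("effective_policy_definition") or {}
    checks = [("message_bytes", "max-length-bytes"), ("messages", "max-length")]
    for used_key, cap_key in checks:
        cap = policy.get(cap_key)
        if cap and queue.get(used_key, 0) >= ratio * cap:
            return True
    return False

def list_full_queues(host="localhost:15672", user="guest", password="guest"):
    """Poll the management HTTP API and return names of queues near their limit."""
    req = Request(f"http://{host}/api/queues")
    token = b64encode(f"{user}:{password}".encode()).decode()
    req.add_header("Authorization", f"Basic {token}")
    queues = json.load(urlopen(req))
    return [q["name"] for q in queues if near_limit(q)]
```

Poll at a modest interval (the data is only refreshed every few seconds anyway) and alert on any non-empty result.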

Related

Merge two message threads into one

I have two message threads; together they consist of ten messages. I need a search that displays these two chains as one. The merged thread must consist of ten distinct messages: five from one system and five from the other (backup) system. Messages within a chain share the same SrcMsgId value, and each system's SrcMsgId is unique within a chain. The chain from the backup system arrives in Splunk immediately after the messages from the main system. Messages from the standby system also carry a Mainsys_srcMsgId value, which is identical to the main system's SrcMsgId. How can I display a chain of all ten messages? Perhaps first the messages from the first (main) system, then from the second (backup), with the time of arrival at the server displayed.
Specifically, we want to see all ten messages one after another, in the order in which they arrived at the server: five messages from the primary, for example ("srcMsgId": "rwfsdfsfqwe121432gsgsfgd71"), and five from the backup ("srcMsgId": "rwfsdfsfqwe121432gsgsfgd72"). The problem is that messages from other systems also arrive at the server, so all messages are mixed chaotically, which is why we want to group each system's messages with their counterparts in the search. Messages from the backup system are associated with the main system only by the "Mainsys_srcMsgId" parameter; through this key we know that messages come from the backup system (secondary to the main one).
Examples of messages from the primary and secondary system:
Main system:
{
"event": "Sourcetype test please",
"sourcetype": "testsystem-2",
"host": "some-host-123",
"fields":
{
"messageId": "ED280816-E404-444A-A2D9-FFD2D171F32",
"srcMsgId": "rwfsdfsfqwe121432gsgsfgd71",
"Mainsys_srcMsgId": "",
"baseSystemId": "abc1",
"routeInstanceId": "abc2",
"routepointID": "abc3",
"eventTime": "1985-04-12T23:20:50Z",
"messageType": "abc4",
.....................................
Message from backup system:
{
"event": "Sourcetype test please",
"sourcetype": "testsystem-2",
"host": "some-host-123",
"fields":
{
"messageId": "ED280816-E404-444A-A2D9-FFD2D171F23",
"srcMsgId": "rwfsdfsfqwe121432gsgsfgd72",
"Mainsys_srcMsgId": "rwfsdfsfqwe121432gsgsfgd71",
"baseSystemId": "abc1",
"routeInstanceId": "abc2",
"routepointID": "abc3",
"eventTime": "1985-04-12T23:20:50Z",
"messageType": "abc4",
"GISGMPRequestID": "PS000BA780816-E404-444A-A2D9-FFD2D1712345",
"GISGMPResponseID": "PS000BA780816-E404-444B-A2D9-FFD2D1712345",
"resultcode": "abc7",
"resultdesc": "abc8"
}
}
When we want to combine in one query only the five messages of a single chain, related by "srcMsgId", we make the following request:
index="bl_logging" sourcetype="testsystem-2"
| transaction maxpause=5m srcMsgId Mainsys_srcMsgId messageId
| table _time srcMsgId Mainsys_srcMsgId messageId duration eventcount
| sort srcMsgId _time
| streamstats current=f window=1 values(_time) as prevTime by subject
| eval timeDiff=_time-prevTime
| delta _time as timediff
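No answer is recorded for this thread, but one possible approach (a sketch only, assuming the field names shown in the sample events) is to derive a common chain key: backup messages carry the main chain's id in Mainsys_srcMsgId, so coalescing it with srcMsgId groups all ten messages, and sorting by _time orders them by arrival:

```
index="bl_logging" sourcetype="testsystem-2"
| eval chainId=if(Mainsys_srcMsgId!="", Mainsys_srcMsgId, srcMsgId)
| sort 0 chainId _time
| table _time chainId srcMsgId Mainsys_srcMsgId messageId
```

Filtering on a specific chainId then shows the five primary and five backup messages together, in arrival order.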

How frequently are the Azure Storage Queue metrics updated?

I observed that it took about 6 hours from the time of setting up Diagnostics (the newer offering still in preview) for the Queue Message Count metric to move from 0 to the actual total number of messages in queue. The other capacity metrics Queue Capacity and Queue Count took about 1 hour to reflect actual values.
Can anyone shed light on how these metrics are updated? It would be good to know how to predict the accuracy of the graphs.
I am concerned because if the latency of these metrics is typically this large then an alert based on queue metrics could take too long to raise.
Update:
Platform metrics are created by Azure resources and give you visibility into their health and performance. Each type of resource creates a distinct set of metrics without any configuration required. Platform metrics are collected from Azure resources at one-minute frequency unless specified otherwise in the metric's definition.
And 'Queue Message Count' is platform metrics.
So it should update the data every 1 minute.
But it didn't. And this is not a problem that occurs only in the portal; even if you use the REST API to get QueueMessageCount, it is still not updated after 1 minute:
https://management.azure.com/subscriptions/xxx-xxx-xxx-xxx-xxx/resourceGroups/0730BowmanWindow/providers/Microsoft.Storage/storageAccounts/0730bowmanwindow/queueServices/default/providers/microsoft.insights/metrics?interval=PT1H&metricnames=QueueMessageCount&aggregation=Average&top=100&orderby=Average&api-version=2018-01-01&metricnamespace=Microsoft.Storage/storageAccounts/queueServices
{
"cost": 59,
"timespan": "2021-05-17T08:57:56Z/2021-05-17T09:57:56Z",
"interval": "PT1H",
"value": [
{
"id": "/subscriptions/xxx-xxx-xxx-xxx-xxx/resourceGroups/0730BowmanWindow/providers/Microsoft.Storage/storageAccounts/0730bowmanwindow/queueServices/default/providers/Microsoft.Insights/metrics/QueueMessageCount",
"type": "Microsoft.Insights/metrics",
"name": {
"value": "QueueMessageCount",
"localizedValue": "Queue Message Count"
},
"displayDescription": "The number of unexpired queue messages in the storage account.",
"unit": "Count",
"timeseries": [
{
"metadatavalues": [],
"data": [
{
"timeStamp": "2021-05-17T08:57:00Z",
"average": 1.0
}
]
}
],
"errorCode": "Success"
}
],
"namespace": "Microsoft.Storage/storageAccounts/queueServices",
"resourceregion": "centralus"
}
This may be an issue that needs to be reported to the Azure team. It is so slow that it loses its practicality; I think sending an alert based on this is a bad idea (it's too slow).
Maybe you can design your own logic in code to check the QueueMessageCount.
Just a sample (C#):
1. Get the queues, then collect all of the queue names.
2. Get the properties of each queue, then read the number of messages in it.
3. Sum the obtained numbers.
4. Send a custom alert.
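The same four steps can be sketched in Python (an assumption on my part: this uses the azure-storage-queue v12 SDK rather than C#; names and credentials are illustrative):

```python
def total_queue_message_count(connection_string):
    """Steps 1-3: list all queues and sum their approximate message counts."""
    # Imported lazily; requires `pip install azure-storage-queue`.
    from azure.storage.queue import QueueServiceClient
    service = QueueServiceClient.from_connection_string(connection_string)
    total = 0
    for queue in service.list_queues():                  # 1) get queues
        client = service.get_queue_client(queue.name)
        props = client.get_queue_properties()            # 2) get properties
        total += props.approximate_message_count         # 3) sum the counts
    return total

def should_alert(total, threshold):
    """Step 4: decide whether to send a custom alert."""
    return total > threshold
```

Run this on a timer (for example from a Function or a small worker) and send your own notification when should_alert returns True; the count is near-real-time, unlike the platform metric.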
Original Answer:
At first, after I sent a message to one queue in Queue Storage, the 'Queue Message Count' also remained stubbornly at zero on my side, but a few hours later it did show the count:
I thought it was a bug, but it seems to work well now.

How to resolve Aerospike error, Operation not allowed at this time?

I am using Aerospike 4.8 and persisting my data on disk, making parallel write requests to Aerospike. With 10 parallel requests it works fine, but with 100 parallel requests it gives the error Operation not allowed at this time, with code 22. I would expect Aerospike to be capable of handling hundreds or thousands of requests in parallel, but I don't know what's wrong, so any guidance would be helpful.
Error log:
error : { AerospikeError: Operation not allowed at this time.
at Function.fromASError (/data/codebase/lib/node_modules/aerospike/lib/error.js:113:21)
at QueryCommand.convertError (/data/codebase/lib/node_modules/aerospike/lib/commands/command.js:91:27)
at QueryCommand.convertResponse (/data/codebase/lib/node_modules/aerospike/lib/commands/command.js:101:24)
at asCallback (/data/codebase/lib/node_modules/aerospike/lib/commands/command.js:163:24)
name: 'AerospikeError',
code: 22,
command:
QueryCommand {
client:
Client {
domain: null,
_events: {},
_eventsCount: 0,
_maxListeners: undefined,
config: [Object],
as_client: AerospikeClient {},
connected: true,
captureStackTraces: false },
args: [ 'antiSpamming', 'userTargetingMatrix', [Object], undefined ],
captureStackTraces: false,
key: undefined,
ensureConnected: true,
stream:
RecordStream {
aborted: false,
client: [Object],
_events: [Object],
_eventsCount: 3 } },
func: 'as_query_parse_records_async',
file: 'src/main/aerospike/aerospike_query.c',
line: 246,
inDoubt: false }
Warning logs:
Jan 28 06:03:25 ip-1-0-4-78 asd[32437]: Jan 28 2020 06:03:25 GMT: WARNING (scan): (scan_manager.c:103) at scan threads limit - can't start new scan
Jan 28 06:03:25 ip-1-0-4-78 asd[32437]: Jan 28 2020 06:03:25 GMT: WARNING (scan): (scan.c:676) basic scan job 5614303283813349397 failed to start (22)
You are most likely exceeding the default limit related to queries. First, see the definition for error code 22:
For scans with server 4.7 or later, no threads available
(scan-threads-limit reached).
Based on your other question, you're doing scans rather than a query with a secondary index. You'll need to increase that limit, as suggested in the 'Additional Information'. However, you have a pretty weak system in terms of CPU, so you should adjust that value and benchmark, comparing performance before and after with the same workload. In a real production system you'd have multiple nodes, probably want more than two CPU cores, and similarly tune the scan threads as needed.
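As an illustration (the value 64 here is an example to benchmark against, not a recommendation), scan-threads-limit can be raised dynamically with asinfo or persisted in aerospike.conf:

```
# Dynamic, takes effect immediately:
asinfo -v 'set-config:context=service;scan-threads-limit=64'

# Static, in aerospike.conf:
service {
    scan-threads-limit 64
}
```

Remember to apply the static setting as well, or the dynamic change will be lost on restart.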
This happened to me because a job was performing a backup while I was trying to scan the partitions.
Recommendation: do not run another job while a backup is in progress.

Azure Function Apps - maintain max batch size with maxDequeueCount

I have the following host.json file:
{
"version": "2.0",
"extensions": {
"queues": {
"maxPollingInterval": "00:00:02",
"visibilityTimeout": "00:00:30",
"batchSize": 16,
"maxDequeueCount": 3,
"newBatchThreshold": 8
}
}
}
I would expect that with this setup there could never be more than batchSize + newBatchThreshold instances running. But I realized that when messages are dequeued they are run instantly, not just added to the back of the queue. This means you can end up with a very high number of instances, causing a lot of 429s (Too Many Requests). Is there any way to configure the function app to just add the dequeued messages to the back of the queue?
It was not related to maxDequeueCount. The problem was that it was a Consumption plan, where you can't control the number of instances. After changing to a Standard plan, it worked as expected.
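For reference, the ceiling the question expects applies per host instance, and the host count itself can scale out; with the host.json above the arithmetic works out as follows (a sketch of the calculation, not an Azure API):

```python
# Each Functions host instance processes up to batchSize queue messages in
# parallel and fetches a new batch once the in-flight count drops to
# newBatchThreshold, so the per-instance ceiling is batchSize + newBatchThreshold.
# On a Consumption plan the number of instances also grows, multiplying this.
def max_concurrent_executions(batch_size: int, new_batch_threshold: int, instances: int = 1) -> int:
    return (batch_size + new_batch_threshold) * instances

print(max_concurrent_executions(16, 8))      # 24 per instance
print(max_concurrent_executions(16, 8, 10))  # 240 across 10 scaled-out instances
```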

StackExchange.Redis.RedisTimeoutException

We are experiencing timeouts in our application using Redis. We have already investigated, but without success. See the timeout error below:
StackExchange.Redis.RedisTimeoutException: Timeout performing GET
USERORGANIZATIONS_D96510A4-A9A2-4DAA-84A9-BB77363DD3EA, inst: 9, mgr:
ProcessReadQueue, err: never, queue: 24, qu: 0, qs: 24, qc: 0, wr: 1, wq: 1,
in: 65536, ar: 1, clientName: RD00155D008B42, serverEndpoint:
Unspecified/xxxxxxx.redis.cache.windows.net:xxxx, keyHashSlot: 9735, IOCP:
(Busy=0,Free=1000,Min=4,Max=1000), WORKER:
(Busy=27,Free=32740,Min=200,Max=32767) (Please take a look at this article
for some common client-side issues that can cause timeouts:
http://stackexchange.github.io/StackExchange.Redis/Timeouts)
If need some more information, just ask me that I'll try to provide. Thanks in advance.
The "in: 65536" value in the timeout is very high. This value indicates how much data is sitting in the client's socket kernel buffer: the data has arrived at the local machine but has not yet been read by the application layer. This typically happens when 1) thread pool settings need to be adjusted or 2) the client CPU is running high. Here are some articles I suggest you read:
 
Diagnosing Redis errors on the client side
Azure Redis Best Practices
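A common client-side mitigation for the thread pool case (a sketch in C#, since StackExchange.Redis is a .NET client; the values are illustrative and should be benchmarked for your workload) is to raise the minimum thread pool sizes at application startup so bursts of completions are not throttled by thread injection:

```csharp
using System.Threading;

static class RedisStartup
{
    // Call once at application startup, before heavy Redis traffic begins.
    // Raising the minimums lets the pool grow quickly during bursts instead
    // of adding threads at the default slow injection rate.
    public static void ConfigureThreadPool()
    {
        ThreadPool.SetMinThreads(workerThreads: 200, completionPortThreads: 200);
    }
}
```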