RabbitMQ - Many queues or one with routing keys?
I'm running a distributed application (on multiple servers) that fetches messages from our main backend server(s) running a RabbitMQ cluster.
The messages are almost all of the same kind, but I use one queue per customer.
I notice that our load and memory usage is pretty high. Could using only one queue, with customer IDs as routing keys, solve the problem?
Currently I'm using one channel per consumer and at most 20 channels per connection, so one server accessing the RabbitMQ server can have multiple connections. Around 500-800 connections are not unusual.
UPDATE
Here are some metrics:
Connections: 748
Channels: 6577
Exchanges: 8
Queues: 1590
Consumers: 1098
Messages Total: 153,394
Messages unacked: 152,848
Acknowledge: 2674/s
Publish: 704/s
Deliver: 586/s
And rabbitmqctl status output:
Status of node rabbit@masternode ...
[{pid,10814},
{running_applications,
[{rabbitmq_management,"RabbitMQ Management Console","3.5.6"},
{rabbitmq_web_dispatch,"RabbitMQ Web Dispatcher","3.5.6"},
{webmachine,"webmachine","1.10.3-rmq3.5.6-gite9359c7"},
{mochiweb,"MochiMedia Web Server","2.7.0-rmq3.5.6-git680dba8"},
{rabbitmq_management_agent,"RabbitMQ Management Agent","3.5.6"},
{rabbit,"RabbitMQ","3.5.6"},
{os_mon,"CPO CXC 138 46","2.2.14"},
{inets,"INETS CXC 138 49","5.9.7"},
{mnesia,"MNESIA CXC 138 12","4.11"},
{amqp_client,"RabbitMQ AMQP Client","3.5.6"},
{xmerl,"XML parser","1.3.5"},
{sasl,"SASL CXC 138 11","2.3.4"},
{stdlib,"ERTS CXC 138 10","1.19.4"},
{kernel,"ERTS CXC 138 10","2.16.4"}]},
{os,{unix,linux}},
{erlang_version,
"Erlang R16B03 (erts-5.10.4) [source] [64-bit] [smp:32:32] [async-threads:64] [kernel-poll:true]\n"},
{memory,
[{total,1093604792},
{connection_readers,8069400},
{connection_writers,6168304},
{connection_channels,115667448},
{connection_other,20448952},
{queue_procs,526134000},
{queue_slave_procs,3045928},
{plugins,1638160},
{other_proc,20891248},
{mnesia,5975616},
{mgmt_db,63193376},
{msg_index,2245016},
{other_ets,3895632},
{binary,214973160},
{code,20000582},
{atom,703377},
{other_system,80554593}]},
{alarms,[]},
{listeners,[]},
{vm_memory_high_watermark,0.4},
{vm_memory_limit,54036645478},
{disk_free_limit,50000000},
{disk_free,100918980608},
{file_descriptors,
[{total_limit,49900},
{total_used,1231},
{sockets_limit,44908},
{sockets_used,243}]},
{processes,[{limit,1048576},{used,15377}]},
{run_queue,1},
{uptime,2241834}]
Publish and deliver rates sometimes stall (drop very low).
UPDATE 2
Nothing shows up in the logs, and the Java driver doesn't call the callback when something is blocked.
I have several use cases. One example is batching documents that I load into our search server (Solr). Many producers (>50) generate around 50,000 messages per minute, and the consumer(s) read that queue without autoack.
After the messages have been successfully sent (or retried up to 5 times), they are acked. Maybe this could block everything? I switched to autoack and everything runs a lot smoother now.
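The ack-after-success-or-retry logic described here can be sketched without a broker. This is a minimal, hypothetical sketch (`process_with_retry` and `handler` are invented names, not RabbitMQ API); in a real consumer the True/False result would map to `basic_ack`/`basic_nack`:

```python
# Hypothetical sketch of the "ack only after success, or after 5 attempts" pattern.
def process_with_retry(message, handler, max_attempts=5):
    """Try handler(message) up to max_attempts times.

    Returns True if the message should be acked, False if it should be
    rejected/requeued (e.g. via basic_nack in a real consumer).
    """
    for attempt in range(1, max_attempts + 1):
        try:
            handler(message)
            return True   # success: safe to ack
        except Exception:
            if attempt == max_attempts:
                return False  # give up: nack / dead-letter instead of ack
    return False
```

Note that while this loop runs, the message stays unacked and counts against the consumer's prefetch window, which is one way slow retries can make a large unacked backlog (like the 152,848 above) pile up.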
What my initial question was about: each of our customers has a single queue, which is currently on autoack. It can happen that one of those queues suddenly has no consumer, but that's no problem at all. So, would using a single queue with routing keys improve performance?
Currently I'm publishing to the default (empty) exchange, using the queue name as the routing key, so messages go directly to the customer queue. The messages are gzipped JSON, so pretty small, a few KB on average.
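To make the alternative concrete, here is a toy in-process model of AMQP direct-exchange routing (no broker involved; class and customer names are hypothetical). Whether you declare 1,590 per-customer queues or bind one shared queue under several customer-ID routing keys, the exchange just matches the routing key against its bindings:

```python
from collections import defaultdict, deque

# Toy model of an AMQP direct exchange: bindings map a routing key
# to one or more queues, and publish() delivers to every match.
class DirectExchange:
    def __init__(self):
        self.bindings = defaultdict(list)

    def bind(self, queue, routing_key):
        self.bindings[routing_key].append(queue)

    def publish(self, routing_key, message):
        for queue in self.bindings[routing_key]:
            queue.append(message)

# One shared queue bound under several customer IDs: the broker keeps a
# single queue process, and consumers dispatch per customer themselves.
shared = deque()
ex = DirectExchange()
for customer_id in ("cust-1", "cust-2"):
    ex.bind(shared, customer_id)

ex.publish("cust-1", {"customer": "cust-1", "body": "hello"})
```

The trade-off this illustrates: fewer queues means fewer queue processes (relevant given the large `queue_procs` memory figure above), at the cost of losing per-customer isolation and per-customer consumer assignment.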
Connections and channels are both lightweight objects, though there is additional setup and socket overhead in creating a connection compared with a channel, which is simply a single-packet command. I would not expect 800 connections to even begin to stretch RabbitMQ. Memory and disk use are primarily driven by the number of messages sitting in the queues, and to some extent by the number of queues. Without details on your application throughput it is difficult to say more, but I would start by ensuring that your producers and consumers are roughly matched in volume.
To find out more about what is using memory, invoke rabbitmqctl status (see documentation here).
Related
How to process messages in parallel from a Weblogic JMS queue?
I am new to JMS, and I am trying to understand whether there is a way to consume messages from a JMS queue in parallel and process them using Spring JMS. I checked a few answers on Stack Overflow, but I am still confused. The application I am working on uses Spring Boot and WebLogic JMS as the messaging broker. It listens to a JMS queue from a single producer using the JmsListener class. In the JMS ConnectionFactory configuration of the application, the following parameter has been set: DefaultJmsListenerContainerFactory.setConcurrency("6-10"); Does that mean that if there are 100 messages currently in the queue, then 10 messages will be consumed and processed in parallel? If so, can I increase the value to process more messages in parallel, and are there any limitations? Also, I am confused about what DefaultJmsListenerContainerFactory.setConcurrency and setConcurrentConsumers do. Currently the processing of the JMS client app is very slow, so I need suggestions for implementing parallel processing.
concurrentConsumers is a fixed number of consumers, whereas concurrency can specify a variable range that scales up/down as needed. Also see maxConcurrentConsumers. The actual behavior also depends on prefetch: if each consumer prefetches 100 messages, then one consumer might get them all. There is no hard limit (aside from memory/CPU constraints).
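As a language-neutral illustration of what an upper concurrency bound like "6-10" means (this is plain Python with a thread pool, not Spring JMS): with 100 queued messages and a cap of 10 workers, at most 10 messages are in flight at any moment, and the rest simply wait their turn.

```python
import concurrent.futures
import threading

# Concept sketch: a bounded worker pool consuming 100 queued "messages".
in_flight = 0   # messages currently being processed
peak = 0        # highest observed concurrency
done = 0        # messages fully processed
lock = threading.Lock()

def handle(msg):
    global in_flight, peak, done
    with lock:
        in_flight += 1
        peak = max(peak, in_flight)
    # ... process the message here ...
    with lock:
        in_flight -= 1
        done += 1

# max_workers=10 plays the role of the upper concurrency bound.
with concurrent.futures.ThreadPoolExecutor(max_workers=10) as pool:
    list(pool.map(handle, range(100)))
```

Raising the bound increases parallelism only up to what the broker, prefetch setting, and CPU allow; beyond that, extra consumers just sit idle.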
what is the difference between seneca redis pubsub transport and seneca redis queue transport?
I'm learning how to get data from Redis using Seneca.js, but Seneca provides multiple plugins to connect to Redis; the available plugins are the ones mentioned in the title. Which should I use just to fetch a couple of keys from Redis, and what is the difference between the two?
seneca-redis-pubsub-transport and seneca-redis-queue-transport are both used for transporting messages between services using Redis. seneca-redis-pubsub-transport is a broadcast transport: all subscribed services receive all messages. seneca-redis-queue-transport, on the other hand, is a queue transport: each message is sent to only one of possibly multiple subscribed services. If you only want to get/set some values, take a look at seneca-redis-store instead; that plugin allows you to get and set values using Redis.
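The broadcast-vs-queue distinction can be shown with a broker-free sketch (no Redis or Seneca involved; the helper names are invented): a broadcast transport copies every message to every subscriber, while a queue transport hands each message to exactly one consumer.

```python
from itertools import cycle

# Broadcast transport: every subscriber gets every message.
def broadcast(subscribers, messages):
    inboxes = {s: [] for s in subscribers}
    for m in messages:
        for s in subscribers:
            inboxes[s].append(m)
    return inboxes

# Queue transport: each message goes to exactly one consumer
# (modeled here with simple round-robin dispatch).
def queue_transport(consumers, messages):
    inboxes = {c: [] for c in consumers}
    rr = cycle(consumers)
    for m in messages:
        inboxes[next(rr)].append(m)
    return inboxes
```

Use the broadcast shape when every service must react to an event, and the queue shape when work should be load-balanced across workers.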
Redis publish-subscribe: Is Redis guaranteed to deliver the message even under massive stress?
Provided that both the subscribing client and the publishing server retain the connection, is Redis guaranteed to always deliver the published message to the subscribed client eventually, even when the client and/or server are under massive stress? Or should I plan for the possibility that Redis might occasionally drop messages as things get "hot"?
Redis absolutely does not provide any guaranteed delivery for publish/subscribe traffic. This mechanism is based only on sockets and event loops; there is no queue involved (even in memory). If a subscriber is not listening while a publication occurs, the event is lost for that subscriber. It is possible to implement guaranteed-delivery mechanisms on top of Redis, but not with the publish/subscribe API. The list data type in Redis can be used as a queue, and as the foundation of more advanced queuing systems, but it does not provide multicast capabilities (so no publish/subscribe). AFAIK, there is no obvious way to easily implement publish/subscribe and guaranteed delivery at the same time with Redis.
Redis does not provide guaranteed delivery with its Pub/Sub mechanism. Moreover, if a subscriber is not actively listening on a channel, it will not receive messages published in the meantime. I previously wrote a detailed article that describes how one can use Redis lists in combination with BLPOP to implement reliable multicast pub/sub delivery: http://blog.radiant3.ca/2013/01/03/reliable-delivery-message-queues-with-redis/
For the record, here's the high-level strategy:
When each consumer starts up and gets ready to consume messages, it registers by adding itself to a Set representing all consumers registered on a queue.
When a producer publishes a message on a queue, it saves the content of the message in a Redis key, then iterates over the set of consumers registered on the queue and pushes the message ID onto a List for each registered consumer.
Each consumer continuously watches for a new entry in its consumer-specific list; when one arrives, it removes the entry (using a BLPOP operation), handles the message, and moves on to the next one.
I have also made a Java implementation of these principles available open source: https://github.com/davidmarquis/redisq
These principles have been used to process about 1,000 messages per second from a single Redis instance and two instances of the consumer application, each consuming messages with 5 threads.
Maximum message size for RabbitMQ
What is the maximum size of a message when publishing to a RabbitMQ queue (pub/sub model)? I can't see any explicit limits in the docs, but I assume there are some guidelines. Thanks in advance.
I was comparing Amazon's queue service (SQS) with RabbitMQ and other streaming/messaging platforms like Kinesis and Kafka. SQS only supports messages from 2^10 bytes (1 KB) up to 2^18 bytes (256 KB), and Kinesis has size limits too (I don't know why). In theory, the AMQP protocol would handle up to 2^64 bytes. So even for a huge message, RabbitMQ might work on a single broker (possibly taking minutes or hours to persist it), but might not in a cluster of brokers: if the message transfer time between nodes exceeds the heartbeat interval between them, it can cause the cluster to consider the connection dead and lose the message. This thread is useful -> Can RabbitMQ handle big messages?
References
http://grokbase.com/t/rabbitmq/rabbitmq-discuss/127wsy1h92/limiting-the-size-of-a-message
http://comments.gmane.org/gmane.comp.networking.rabbitmq.general/14665
http://rabbitmq.1065348.n5.nabble.com/Max-messages-allowed-in-a-queue-in-RabbitMQ-td26063.html
https://www.rabbitmq.com/heartbeats.html
Redis pubsub vs blocking operations
How should I choose between Pub/Sub and blocking operations in Redis? Redis provides blocking operations like BLPOP, which blocks until an element can be popped from the list. Why should I not use this to achieve the functionality of PUBSUB? PUBSUB lets you define channels, which are a higher-level construct than basic lists. If my use case is simple, without multiple channels, can I go with the basic blocking operations?
There is an important difference between using lists with blocking operations and the pub/sub facilities. A list with blocking operations can easily be used as a queue, while pub/sub channels do not involve any queuing: the only buffers involved in pub/sub are related to communication (i.e. socket management). This means that when a message is published, it is transmitted to the subscribers as soon as possible and is never kept in Redis. A consequence is that if the subscribers are no longer listening on the Redis socket, the items are lost for those subscribers. Another important difference is that the pub/sub mechanism can multicast items: when items are published, they are sent to all subscribers. By contrast, with multiple daemons dequeuing a list in Redis using blocking operations, pushing an item to the list results in the item being dequeued by one and only one daemon. Blocking lists (i.e. queues) and pub/sub channels are really complementary facilities. If you do not need to multicast items, you should rather use lists with blocking operations, since they are much more reliable.