I did a base install of RabbitMQ (the latest version) without any configuration changes. It is running, and I have 2 million messages sitting in a queue, but I'm only able to pull them off about 20 at a time; I was expecting thousands. I haven't found anything obvious in the configuration documentation that says to explicitly configure anything for a high-throughput production environment. Could someone point me in the right direction to figure out why I'm not getting any throughput? I have 1 consumer with a prefetch count of 24. The consumer does nothing with the message; it just pulls it off for testing purposes.
I have a base rabbitmq.config file, which only contains this:
[
{rabbit, [{vm_memory_high_watermark, 0.7}]}
].
I have looked through the example config at https://github.com/rabbitmq/rabbitmq-server/blob/master/docs/rabbitmq.config.example but nothing stands out as something that needs to be configured in order to speed up the consumers.
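For context, the prefetch count mentioned above is not set in rabbitmq.config; it is set per channel from the consumer side. A minimal sketch with the Java client, where the host and queue name are assumptions:

import com.rabbitmq.client.*;

public class PrefetchSketch {
    public static void main(String[] args) throws Exception {
        ConnectionFactory factory = new ConnectionFactory();
        factory.setHost("localhost"); // assumed broker address
        Connection conn = factory.newConnection();
        Channel channel = conn.createChannel();

        // basicQos sets the per-consumer prefetch; raising it lets the broker
        // deliver more unacknowledged messages to this consumer at once.
        channel.basicQos(24);

        channel.basicConsume("test-queue", false, // manual acks; queue name assumed
            (tag, delivery) ->
                channel.basicAck(delivery.getEnvelope().getDeliveryTag(), false),
            tag -> { });
    }
}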
I'm using RabbitMQ 3.8.9 with Erlang 23.1.
In the web management panel, on the queues page, I have enabled the "consumers count" column.
When I start n consumers, after a few seconds, as expected, I see the number in that column increase by n.
The problem is that the count does not always seem to update correctly.
For instance, I know for sure that a queue (in-aws) has no consumers, because the machine hosting the consumer is down, but I still see the old number of consumers.
Am I missing something?
Thanks a lot
Background
I have a distributed system with many machines. I have two types of applications - Producer and Consumer. To be more specific - a single producer and multiple consumers. Each "consumer machine" has multiple consumers.
All the messages in the system go to the same queue. A message looks like this:
{
    "Id": "Thisismyid",
    "CacheId": "CacheID"
    ...
}
My consumers apply a caching strategy in order to process queue messages faster. Once a message has been downloaded by a consumer, the consumer checks whether its CacheId has already been cached. If yes, it continues; if not, it caches it first and then continues.
All the consumers on the same machine share the same cache repository.
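A minimal sketch of that per-machine cache check in Java (the cache type and the loading helper are hypothetical stand-ins):

import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

public class SharedCacheSketch {
    // One cache instance shared by all consumers on the machine.
    private static final ConcurrentMap<String, String> CACHE = new ConcurrentHashMap<>();

    // Called for every message pulled off the queue, with the CacheId from the body.
    static String lookup(String cacheId) {
        // computeIfAbsent caches the entry on a miss and reuses it on a hit.
        return CACHE.computeIfAbsent(cacheId, SharedCacheSketch::loadExpensiveData);
    }

    private static String loadExpensiveData(String cacheId) {
        return "data-for-" + cacheId; // stand-in for the real, costly fetch
    }

    public static void main(String[] args) {
        System.out.println(lookup("CacheID")); // miss: loads and caches
        System.out.println(lookup("CacheID")); // hit: reuses the cached entry
    }
}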
The problem
This structure is "optimal" when I have 1 consumer, since the same machine caches the items and reuses them multiple times.
As the number of consumers goes up, the efficiency of the cache goes down, because it becomes more likely that an item will be downloaded by a node that does not have a ready-to-use cache entry.
The Question
How can I use RabbitMQ to "route" messages with the same CacheId to the same consumer/machine for processing, to increase cache efficiency? And what is the "cost" in terms of RabbitMQ resources?
You might be able to do this with a topic exchange: https://www.rabbitmq.com/tutorials/tutorial-five-dotnet.html
But this would quickly get complicated if you have many CacheIds.
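As a rough sketch of that idea with the Java client, each consumer machine binds its own queue to a routing-key pattern derived from the CacheId (the exchange, queue, and pattern names here are all assumptions):

import com.rabbitmq.client.*;

public class CacheRoutingSketch {
    public static void main(String[] args) throws Exception {
        ConnectionFactory factory = new ConnectionFactory(); // assumes a local broker
        try (Connection conn = factory.newConnection();
             Channel channel = conn.createChannel()) {

            // Topic exchange; producers put the CacheId into the routing key.
            channel.exchangeDeclare("cache-routing", BuiltinExchangeType.TOPIC, true);

            // Each consumer machine binds its own queue to the CacheIds it "owns".
            channel.queueDeclare("machine-a", true, false, false, null);
            channel.queueBind("machine-a", "cache-routing", "cache.a.*");

            // Producer side: route by CacheId so the same machine always gets it.
            channel.basicPublish("cache-routing", "cache.a.CacheID", null,
                "{\"Id\":\"Thisismyid\",\"CacheId\":\"CacheID\"}".getBytes());
        }
    }
}

The hard part is deciding up front how to partition the CacheId space into binding patterns, which is exactly what gets complicated with many ids.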
But I would use a centralized cache instead. It could be Redis: https://redis.io/
I'm working with a product suite which uses RabbitMQ as a back end for service bus messaging. Many of the clients use software (NeuronESB) which is supposed to automatically configure exchanges, queues and channels as needed. Somewhere in the system exchanges in Rabbit are being deleted and not re-created, resulting in unexpected issues. Because of the size of the system and closed source nature of at least one of the service bus clients, an audit of code has been unsuccessful in determining the source of the deletion of these exchanges.
I have tried using the firehose functionality of Rabbit, but that only provides the messages being sent through Rabbit, not the internal activities I need.
What methods are available for logging the creation and deletion of exchanges in RabbitMQ? Ideally I would like to know the date, time and client IP of the deleter, but even just getting the date and time would allow me to narrow my search of logs to help find the offender.
Try the Event Exchange plugin; that should do the trick.
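The plugin (enabled with rabbitmq-plugins enable rabbitmq_event_exchange) republishes internal broker events to a topic exchange called amq.rabbitmq.event. A sketch of an audit consumer in Java, binding only to exchange deletions (the broker host is assumed):

import com.rabbitmq.client.*;

public class ExchangeDeletionAudit {
    public static void main(String[] args) throws Exception {
        ConnectionFactory factory = new ConnectionFactory(); // assumes a local broker
        Connection conn = factory.newConnection();
        Channel channel = conn.createChannel();

        // Bind a private queue to the event exchange; "exchange.deleted" is the
        // routing key used for exchange deletions.
        String q = channel.queueDeclare().getQueue();
        channel.queueBind(q, "amq.rabbitmq.event", "exchange.deleted");

        channel.basicConsume(q, true, (tag, delivery) -> {
            // Event details (exchange name, vhost, etc.) arrive as headers;
            // log them with a timestamp to narrow the search for the offender.
            System.out.println(new java.util.Date() + " " +
                delivery.getProperties().getHeaders());
        }, tag -> { });
    }
}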
If that does not work for some reason, the last resort I can think of:
Get a test environment with fewer clients/messages if your app is busy, then analyse your traffic with Wireshark (it understands AMQP) to filter out the requests that delete exchanges.
I've been tasked with investigating why the db-*.log files are not clearing.
From what I have found through extensive searching, everything points to messages still being on a queue. I've looked in hawtio at the queues on all the configured topics, and the queue size is zero.
From my understanding, the Enqueue size and Dequeue size should in theory be the same, but they're not; my Dequeue size is 0.
I've looked at the topics and there's no operation to purge them.
I'd like to be able to clear out all messages so that the kahadb logs will disappear.
I think you have hit on one weakness of ActiveMQ itself: it cannot guarantee that consumers are really strict when consuming messages.
We have similar problems with our ActiveMQ (5.10.7), because the KahaDB seems to suffer from something like "disk fragmentation", and we noticed this can come from at least two issues with consumers:
Case 1: Slow consumer
We have a consumer in our system which cannot consume many messages at once. If even one unconsumed message stays in a KahaDB page, the whole page is kept (together with all the other messages in it which are already consumed and acknowledged).
To prevent the KahaDB storage from reaching 100% (which would slow down the producers), we transfer the messages to a temporary queue on another ActiveMQ instance, like this:
from("activemqPROD:queue:BIG_QUEUE_UNCONSUMED")
.to("activemqTEMP:queue:TEMP_BIG_QUEUE");
then pushing them back:
from("activemqTEMP:queue:TEMP_BIG_QUEUE")
.to("activemqPROD:queue:BIG_QUEUE_UNCONSUMED");
The alternative is to store them on the file system and then reload them, but you lose the JMS (and custom) headers. With the temporary-queue solution you keep all the headers.
Case 2: Consumers that never acknowledge
Sometimes, even after we perform the operation above and all unconsumed queues are empty, the storage stays above 0%.
Looking into the KahaDB files, we can see that pages are still present even though there are no more messages in any of the QUEUES.
For the TOPICS, we stopped using durable subscriptions, so the storage should also stay at 0%.
The likely cause (this is a supposition, but one we hold with strong confidence) is that some of the consumed messages were never acknowledged properly.
We think this is the cause because we can still see messages like this in the logs:
"not removing data file: 12345 as contained ack(s) refer to referenced file: [12344, 12345]"
This can happen, for example, when a consumer disconnects abruptly (it has consumed some messages but disconnects before sending the acks).
In our case the messages never expire, so this could also be a contributing factor. However, it is not clear whether setting an expiration would destroy non-acked messages.
Because we do not want to lose any events, there is no expiration time on these specific queues.
According to your question, it looks like you are in the second case, so our solution is:
Make sure no producers or consumers are still connected to the ActiveMQ instance
Make sure all queues and durable topics are empty
Delete all files in the KahaDB storage (on the file system)
Restart ActiveMQ (fresh)
Unfortunately we have not found a better way to deal with these cases; if someone has a better alternative, we would be happy to hear it.
This article can also give you some solutions (like setting an expiry policy for the ActiveMQ.DLQ queue).
Add this log config to log4j.properties. Then you can see in kahadb.log exactly what is holding the KahaDB files:
log4j.appender.kahadb=org.apache.log4j.RollingFileAppender
log4j.appender.kahadb.file=${activemq.base}/data/kahadb.log
log4j.appender.kahadb.maxFileSize=1024KB
log4j.appender.kahadb.maxBackupIndex=5
log4j.appender.kahadb.append=true
log4j.appender.kahadb.layout=org.apache.log4j.PatternLayout
log4j.appender.kahadb.layout.ConversionPattern=%d [%-15.15t] %-5p %-30.30c{1} - %m%n
log4j.logger.org.apache.activemq.store.kahadb.MessageDatabase=TRACE, kahadb
As an alternative: once you've found out which queue is causing the log files to stick around, you could map it to its own KahaDB instance, as described here: http://activemq.apache.org/kahadb.html
I have a middleware based on Apache Camel which does a transaction like this:
from("amq:job-input")
to("inOut:businessInvoker-one") // Into business processor
to("inOut:businessInvoker-two")
to("amq:job-out");
Currently it works perfectly, but I can't scale it up, say from 100 TPS to 500 TPS. To speed up the transaction, I have already:
Raised the concurrent consumers setting and used an empty businessProcessor
Configured JAVA_XMX and PERMGEN
According to the ActiveMQ web console, there are many messages waiting to be processed in the 500 TPS scenario. I guess one solution is to scale ActiveMQ up, so I want to use multiple brokers in a cluster.
According to http://fuse.fusesource.org/mq/docs/mq-fabric.html (section "Topologies"), configuring ActiveMQ in clustering mode is suitable only for non-persistent messages. IMHO it is true that it's not suitable, because all running brokers use the same store file. But what about separating the store files? Is that possible now?
Could anybody explain this? If it's not possible, what is the best way to load-balance persistent messages?
Thanks
You can share the load of persistent messages by creating two master/slave pairs. The master and slave share their state either through a database or a shared filesystem, so you need to duplicate that setup.
Create two master/slave pairs and configure so-called "network connectors" between the two pairs. This will double your performance without the risk of losing messages.
See http://activemq.apache.org/networks-of-brokers.html
This answer relates to a version of the question before the Camel details were added.
It is not immediately clear what exactly it is that you want to load balance and why. Messages across consumers? Producers across brokers? What sort of concern are you trying to address?
In general you should avoid using networks of brokers unless you are trying to address some sort of geographical use case, have too many connections for a single broker to handle, or a single broker (which could be a pair of brokers configured in HA) is not giving you the throughput that you require (in 90% of cases it will).
In a broker network, each node has its own store and passes messages around by way of a mechanism called store-and-forward. Have a read of Understanding broker networks for an explanation of how this works.
ActiveMQ already works as a kind of load balancer by distributing messages evenly, in a round-robin fashion, among the subscribers on a queue. So if you have 2 subscribers on a queue and send it a stream of messages A, B, C, D, one subscriber will receive A & C while the other receives B & D.
If you want to take this a step further and group related messages on a queue so that they are processed consistently by only one subscriber, you should consider Message Groups.
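A minimal sketch of the producer side (the broker URL, queue name, and group key are assumptions); the consumers need no changes, since the broker pins each group to a single consumer:

import javax.jms.*;
import org.apache.activemq.ActiveMQConnectionFactory;

public class MessageGroupSketch {
    public static void main(String[] args) throws JMSException {
        ConnectionFactory factory = new ActiveMQConnectionFactory("tcp://localhost:61616");
        Connection conn = factory.createConnection();
        conn.start();
        Session session = conn.createSession(false, Session.AUTO_ACKNOWLEDGE);
        MessageProducer producer = session.createProducer(session.createQueue("jobs"));

        TextMessage msg = session.createTextMessage("payload");
        // All messages sharing a JMSXGroupID go to the same consumer.
        msg.setStringProperty("JMSXGroupID", "group-42");
        producer.send(msg);

        conn.close();
    }
}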
Adding consumers might help up to a point (it depends on the number of cores/CPUs your server has). Adding threads beyond the point where your "Camel server" is using all available CPU for the business processing makes no sense and can be counterproductive.
Adding more ActiveMQ machines is probably needed. You can use an ActiveMQ "network" to communicate between instances that have separate persistence files. It should be straightforward to add more brokers and put them into a network.
Make sure you performance-test along the way, so you know what kind of load the broker can handle and what load the Camel processor can handle (if they are on different machines).
When you do persistent messaging, you likely also want transactions. Make sure you are using them.
If all running brokers use the same store file or transaction-capable database for persistence, then only the first broker to start will be active; the others stay in standby mode until the first one loses its lock.
If you want to load-balance your persistence, there are two ways you could try:
Configure several brokers in network-bridge mode, then send messages to any one of them and consume messages from more than one of them. This load-balances both the brokers and the persistence stores.
Override the persistenceAdapter and use database-sharding middleware (such as tddl: https://github.com/alibaba/tb_tddl) to store the messages in partitions.
Your first step is to increase the number of workers processing messages from ActiveMQ. The way to do this is to add the ?concurrentConsumers=10 attribute to the starting URI. The default behaviour is that only one thread consumes from that endpoint, leading to a pile-up of messages in ActiveMQ. Adding more brokers won't help.
Secondly, what you appear to be doing could benefit from a Staged Event-Driven Architecture (SEDA). In a SEDA, processing is broken down into a number of stages which can have different numbers of consumers on them to even out throughput. Your threads consuming from ActiveMQ only do one step of the process, hand the Exchange off to the next phase, and go back to pulling messages from the input queue.
Your route can therefore be rewritten as 2 smaller routes:
from("activemq:input?concurrentConsumers=10").id("FirstPhase")
.process(businessInvokerOne)
.to("seda:invokeSecondProcess");
from("seda:invokeSecondProcess?concurentConsumers=20").id("SecondPhase")
.process(businessInvokerTwo)
.to("activemq:output");
The two stages can have different numbers of concurrent consumers so that the rate of message consumption from the input queue matches the rate of output. This is useful if one of the invokers is much slower than another.
The seda: endpoint can be replaced with another intermediate activemq: endpoint if you want message persistence.
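For example, the hand-off above could be made persistent like this (a sketch reusing the hypothetical names from the route; the intermediate queue name is arbitrary):

from("activemq:input?concurrentConsumers=10").id("FirstPhase")
    .process(businessInvokerOne)
    .to("activemq:invokeSecondProcess"); // broker-backed hand-off instead of seda:

from("activemq:invokeSecondProcess?concurrentConsumers=20").id("SecondPhase")
    .process(businessInvokerTwo)
    .to("activemq:output");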
Finally to increase throughput, you can focus on making the processing itself faster, by profiling the invokers themselves and optimising that code.