Is it possible to mirror a single queue in ActiveMQ? - activemq

I'm running ActiveMQ in a production system. Some of our queues are very high volume and some are very low volume. I'm interested in mirroring one of the low volume queues so that I can build informal monitoring services around the messages being received.
Unfortunately, the only documentation I've been able to find seems to imply that Mirrored Queues are are all-or-nothing: you either create a topic for every single queue you have (and suffer the performance penalty of copying every message flowing through your system), or you can't use the feature at all.
Is there no way of enabling this helpful functionality for a single, known queue name?

You can mirror single queues like this:
<destinationInterceptors>
<virtualDestinationInterceptor>
<virtualDestinations>
<compositeQueue name="YOUR.QUEUE" forwardOnly="false">
<forwardTo>
<queue physicalName="MIRROR.OF.YOUR.QUEUE"/>
</forwardTo>
</compositeQueue>
</virtualDestinations>
</virtualDestinationInterceptor>
</destinationInterceptors>
You can find the documentation for virtual destinations here.

Despite all the time passed since this question was asked, I am going to publish this answer just in case someone reaches here looking for the same thing as I did. I hope it helps the discussion.
Unfortunately, as far as I know and have tried (for solving a particular issue similar to yours), it is not possible to enable the functionality of Mirrored Queues for just one queue in a broker configuration with multiple queues. As of today, as you said, you would have to suffer that performance penalty by copying every message flowing through that broker.
But the Mirrored Queues are a setting scoped in the broker configuration and therefore, there is an alternative to that (not saying is going to be easier or free of performance cost). You might have another broker that communicates with your main broker, in which to include only those queues that you need to mirror/monitor. Of course this would make the architecture of your system more complex since you need to communicate 2 brokers (with its subsequent latency impact in the communication between them), but at least you would not need to create a mirrored queue for all the queues in your system, and in case you had a system with thousands of queues and just wanted to mirror very few of them, this might be an alternative to consider.
The answer from Ralf I believe does not answer your question, since for forwarding to a mirrored queue, you would have to activate that functionality previously for all the queues in the broker, that is what you do not want. Due to my account reputation is under 50 I cannot comment under his answer, that's why I write it here.

Related

ActiveMQ datastore for cluster setup

We have been using ActiveMQ version 5.16.0 broker with single instances in production. Now we are planning to use cluster of AMQ brokers for HA and load distribution with consistency in message data. Currently we are using only one queue
HA can be achieved using failover but do we need to use the same datastore or it can be separated? If I use different instances for AMQ brokers then how to setup a common datastore.
Please guide me how to setup datastore for HA and load distribution
Multiple ActiveMQ servers clustered together can provide HA in a couple ways:
Scale message flow by using compute resources across multiple broker nodes
Maintain message flow during single node planned or unplanned outage of a broker node
Share data store in the event of ActiveMQ process failure.
Network of brokers solve #1 and #2. A standard 3-node cluster will give you excellent performance and ability to scale the number of producers and consumers, along with splitting the full flow across 3-nodes to provide increased capacity.
Solving for #3 is complicated-- in all messaging products. Brokers are always working to be completely empty-- so clustering the data store of a single-broker becomes an anti-pattern of sorts. Many times, relying on RAID disk with a single broker node will provide higher reliability than adding NFSv4, GFSv2, or JDBC and using shared-store.
That being said, if you must use a shared store-- follow best practices and use GFSv2 or NFSv4. JDBC is much slower and requires significant DB maintenance to keep running efficiently.
Note: [#Kevin Boone]'s note about CIFS/SMB is incorrect and CIFS/SMB should not be used. Otherwise, his responses are solid.
You can configure ActiveMQ so that instances share a message store, or so they have separate message stores. If they share a message store, then (essentially) the brokers will automatically form a master-slave configuration, such that only one broker (at a time) will accept connections from clients, and only one broker will update the store. Clients need to identify both brokers in their connection URIs, and will connect to whichever broker happens to be master.
With a shared message store like this, locks in the message store coordinate the master-slave assignment, which makes the choice of message store critical. Stores can be shared filesystems, or shared databases. Only a few shared filesystem implementations work properly -- anything based on NFS 4.x should work. CIFS/SMB stores can work, but there's so much variation between providers that it's hard to be sure. NFS v3 doesn't work, however well-implemented, because the locking semantics are inappropriate. In any case, the store needs to be robust, or replicated, or both, because the whole broker cluster depends on it. No store, no brokers.
In my experience, it's easier to get good throughput from a shared file store than a shared database although, of course, there are many factors to consider. Poor network connectivity will make it hard to get good throughput with any kind of shared store (or any kind of cluster, for that matter).
When using individual message stores, it's typical to put the brokers into some kind of mesh, with 'network connectors' to pass messages from one broker to another. Both brokers will accept connections from clients (there is no master), and the network connections will deal with the situation where messages are sent to one broker, but need to be consumed from another.
Clients' don't necessarily need to specify all brokers in their connection URIs, but generally will, in case one of the brokers is down.
A mesh is generally easier to set up, and (broadly speaking) can handle more client load, than a master-slave with shared filestore. However, (a) losing a broker amounts to losing any messages that were associated with it (until the broker can be restored) and (b) the mesh interferes with messaging patterns like message grouping and exclusive consumers.
There's really no hard-and-fast rule to determine which configuration to use. Many installers who already have some sort of shared store infrastructure (a decent relational database, or a clustered NFS, for example) will tend to want to use it. The rise in cloud deployments has had the effect that mesh operation with no shared store has become (I think) a lot more popular, because it's so symmetric.
There's more -- a lot more -- that could be said here. As a broad question, I suspect the OP is a bit out-of-scope for SO. You'll probably get more traction if you break your question up into smaller, more focused, parts.

In RabbitMQ, How to make Queues in different clusters to Be Highly Available without Clustering?

In RabbitMQ, if two clusters are hosted on geographical different locations, then we can’t use Clustering. Then how to make them highly available I.e. if one site’s whole cluster goes down then the messages should be mirrored to other site and other site should be able to cater those messages. Note : sites are connected by WAN
See I can’t lose any message on the both sites. Publishing message to the right site can be taken care of, but if the messages are in queue(work queue) or messages are being processed by consumer and suddenly if the site goes down which includes the broker and consumer, how can those messages be catered by the other site. Like in a cluster if one node dies, the other one has all the messages mirrored and knows which were acknowledged, but how to achieve this on WAN, cause clustering cross WAN is not practical.
I think the question illustrates a conceptual problem with the design. To summarize,
There are two sites, connected via WAN
One site is the primary, while one is the active standby
There is a desire for complete replication of system state (total consistency) between site A and B, to include the status of messages in the queue and messages being processed.
Essentially, you want 100% consistency, availability, and partition tolerance. Such a design is not possible according to CAP Theorem. What RabbitMQ provides is either consistency and availability, with low partition tolerance via clustering, or availability and partition tolerance via federation or shovel. RabbitMQ does not deal very well with the case of needing consistency and partition tolerance, since message brokers really handle highly transient traffic.
Instead, what is needed is to fully scope the problem to something that can be solved using the available technologies. It sounds to me like the correct approach (since it's over a WAN) is to sacrifice availability for consistency and partition tolerance, and have your application handle the failover case. You may be able to configure RabbitMQ sufficiently in this regard - see https://www.rabbitmq.com/partitions.html.

Handling RabbitMQ node failures in a cluster in order to continue publishing and consuming

I would like to create a cluster for high availability and put a load balancer front of this cluster. In our configuration, we would like to create exchanges and queues manually, so one exchanges and queues are created, no client should make a call to redeclare them. I am using direct exchange with a routing key so its possible to route the messages into different queues on different nodes. However, I have some issues with clustering and queues.
As far as I read in the RabbitMQ documentation a queue is specific to the node it was created on. Moreover, we can only one queue with the same name in a cluster which should be alive in the time of publish/consume operations. If the node dies then the queue on that node will be gone and messages may not be recovered (depends on the configuration of course). So, even if I route the same message to different queues in different nodes, still I have to figure out how to use them in order to continue consuming messages.
I wonder if it is possible to handle this failover scenario without using mirrored queues. Say I would like switch to a new node in case of a failure and continue to consume from the same queue. Because publisher is just using routing key and these messages can go into more than one queue, same situation is not possible for the consumers.
In short, what can I to cope with the failures in an environment explained in the first paragraph. Queue mirroring is the best approach with a performance penalty in the cluster or a more practical solution exists?
Data replication (mirrored queues in RabbitMQ) is a standard approach to achieve high availability. I suggest to use those. If you don't replicate your data, you will lose it.
If you are worried about performance - RabbitMQ does not scale well.
The only way I know to improve performance is just to make your nodes bigger or create second cluster. Adding nodes to cluster does not really improve things. Also if you are planning to use TLS it will decrease throughput significantly as well. If you have high throughput requirement +HA I'd consider Apache Kafka.
If your use case allows not to care about HA, then just re-declare queues/exchanges whenever your consumers/publishers connect to the broker, which is absolutely fine. When you declare queue that's already exists nothing wrong will happen, queue won't be purged etc, same with exchange.
Also, check out RabbitMQ sharding plugin, maybe that will do for your usecase.

RabbitMQ - federated queues Vs exchange federation

I have set up a rabbit cluster and I publish messages into a fanout exchange every time something changes in a database.
I have dedicated queues bound to this exchange for some of my microservices that consume these updates and I also originally set up a dedicated queue for an external client so that they can federate it with their own rabbit infrastructure and consume a copy of every message.
Now I'm wondering whether allowing exchange federation rather than creating a new dedicated queue for each new external consumer would be a better approach since more and more users will come.
What are the pros and cons?
Thanks
As long as you manage permissions properly, the final decision is up to you. You can give a try to all variants first and find what will fit your actual needs.
Having local queue may have it pros and cons: it allows end-user to survive some outage with their infrastructure or network issue at the cost of your disk/memory, however, you may limit queue length and/or size.
I'd suggest you to take a look at Shovel plugin and Dynamic shovels. With local queue it may server a good job.
Comparing to federation, shovel is much simpler, e.g. it doesn't sync content between upstream and downstream but simply moves message from one queue to another in a reliable manner. As long as you don't need what federation provides, shovel could be a good choice.
Also, you may find this q/a useful (however, it might be a bit outdated) - https://stackoverflow.com/a/19357272.

How distributed should queues be in a RabbitMQ cluster?

Assume you have a small rabbitmq system of 3 nodes that is supposed to handle 100+ decently high volume queues in the same exchange. Given that queues only exist on the node they are created on (we're not using replicated, High Availability queues), what's the best way to create the queues? Is there any benefit to having the queues distributed among the cluster nodes, or is it better to keep them all on one node and have rmq do the routing?
It depends on your application, really.
RabbitMQ is smart about sending messages, so it'll only send a message to a node in the cluster if
a queue that holds that message resides on that node or
if a consumer has connected to that node and has requested the message.
In general, you should aim to declare queues on the nodes on which both the publishers and the consumers for that queue will connect to. In other words, you should aim to connect publishers and consumers to the node that holds the queues they use. This assumes you're trying to conserve bandwidth used overall.
If you're using clustering to improve throughput (and you probably are), and you don't care about internal bandwidth used, you should aim to connect your publishers/consumers to the nodes in a balanced way and not worry about the internal routing mechanisms.
One last thing to think about is memory and disk-space. Queues store messages in main memory, and fallback to disk if that's insufficient. So, if you declare all your queues in one place, that'll result in one node that's "over-worked" and two nodes with memory to spare.
As part of a move towards redundancy and failover in an application I'm working on, I've just finished setting up a RabbitMQ cluster behind a proxy, and have all of my publishers and consumers connect via the proxy, which round robins connections to the individual nodes as they come in from the clients. Prior to upgrading RabbitMQ to 2.7.1, this seemed to pretty evenly distribute queues to the separate nodes, though this would of course depend pretty heavily on how your proxy balances the requests and when your clients try to connect (and declare a queue)...
Having said all that, I just upgraded to RabbitMQ 2.7.1, which was pretty painless, and gave us HA queues, which is a pretty big win for our apps. At any rate, if you're interested in the set up, and think it would be of benefit to your queue problem, I'd be happy to share the setup.