Can we define my architecture as an ESB? - activemq

I have read many different definitions of ESB (enterprise service bus) and it is still not clear to me.
Here is my own definition: An ESB is an architecture and not a tool that allows heterogeneous applications to communicate with each other through a BUS. The particularity of an ESB is that it can have producers and consumers. For example, a producer can send a message to a topic/queue inside the bus and three consumers who are subscribers will receive the same message, so it avoids point-to-point flows.
The second particularity of the ESB is that it allows managing the security and logs in one place as everything goes inside the ESB.
I've also heard about "routes" that set rules in moving a message (with Talend ESB), but I don't really see the point (if you have any examples I'm interested). And of course, Web services can be created to expose data. These services must be scalable and resistant to "Single Point of Failure".
I created an architecture and would have liked to know if it's an ESB architecture.
(I made a mistake in my drawing: it's not a Queue but a Topic!)
The steps of the process above:
Producer: it listens for changes (update, insert, ...) in different databases and, as soon as there is a change, it retrieves the data and sends it to the queue.
Queue: The queue contains all the messages sent by the producer and will send them to the consumers.
Consumers: Consumers will run data-quality checks and insert the new data into a database.
For me, this architecture respects the ESB definition because ActiveMQ acts like a bus. It acts here as a mediator. What do you think?

I think you are on the right track. However, there is an important distinction to make: each message flow should use its own queue. It is generally a best practice to have one queue per message type.
The message flows can all co-exist on the same broker infrastructure, allowing you to have higher density, better utilization, and the ability to wiretap message flows in one place as needed.
In your case:
Database A -> queue://A -> Consumer A
Database B -> queue://B -> Consumer B
Database C -> queue://C -> Consumer C
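The queue-per-flow layout above can be sketched roughly as follows. This is a toy in-memory stand-in written with the Python standard library, not the ActiveMQ API; the `InMemoryBroker` class, the `queue_for` helper and the message fields are all made up for illustration.

```python
from collections import defaultdict, deque


class InMemoryBroker:
    """Toy stand-in for a broker: one FIFO queue per queue name."""

    def __init__(self):
        self.queues = defaultdict(deque)

    def send(self, queue_name, message):
        self.queues[queue_name].append(message)

    def receive(self, queue_name):
        # Returns the oldest message, or None if the queue is empty.
        q = self.queues[queue_name]
        return q.popleft() if q else None


def queue_for(source):
    # One queue per message flow (here: per source database).
    return f"queue://{source}"


broker = InMemoryBroker()
broker.send(queue_for("A"), {"table": "customers", "op": "insert"})
broker.send(queue_for("B"), {"table": "orders", "op": "update"})

# Consumer A only ever sees messages that came from Database A.
msg_a = broker.receive(queue_for("A"))
msg_b = broker.receive(queue_for("B"))
empty_c = broker.receive(queue_for("C"))  # nothing was sent for C
```

All three flows share the same broker object, which is the "higher density, one place to wiretap" point, while each consumer still reads only its own queue.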

Related

How to get delivery path in rabbitmq to become message property?

The underlying use case
It is a typical pubsub use case: Consider we have M news sources, and there are N subscribers who subscribe to the desired news sources and who want to get news updates. However, we want these updates to end up in mongodb, essentially maintaining the most recent 'k' updates (which can be indexed, searched, etc.). We want to design for M to scale up to a million publishers, and N to scale to a few million.
Subscribers' updates are finally received and stored on more than one host, each with its own local mongodb.
Modeling in rabbitmq
Rabbitmq will be used to persist the mappings (who subscribes to which news source).
I have setup a pubsub system in this way: We create publisher exchanges (each mapping to one news source) and of type 'fanout'.
For modelling subscribers, there are two options.
In the first option, have one queue for each subscriber bound to the relevant publisher exchanges. And let the client process open connections to all these subscriber queues and receive the updates (and persist them to mongodb). Note that in this option, when the client is restarted, it has to manage the list of all subscribers and open connections to all the subscriber queues it is responsible for.
In the second option, we want to remove the overhead of having to explicitly open a connection to each user queue upon startup. Instead, we want to listen to only one queue, representing all subscribers whose updates go to this client host.
For achieving this, we first create one exchange for each subscriber and let it bind to the publisher exchange(s) that it follows. We create a single queue for each client, and let the subscriber exchange bind to this queue (type=direct) if the subscriber belongs to that client.
Once the client receives the update message, it needs to know which subscriber exchange it came from. Only then can we add it to mongodb for the relevant subscriber. Presumably the subscriber exchange should add this information as a new header on the message.
As per the rabbitmq docs, I believe there is no way to achieve this. (Or more specifically, to get the 'delivery path' property from the delivered message, from which we can get this information).
My questions:
Is it possible to add a new header to message as it passes through exchange?
If this is not possible, then can we achieve it through custom exchange and relevant plugin? Any plugin that I can readily use for this purpose?
I am curious as to why rabbitmq is not providing delivery path property as an optional configuration?
Is there any other way I can achieve the same? (See pubsubhubbub note below)
PubSubHubBub
The use case is very similar to what pubsubhubbub protocol provides for. And there is rabbitmq plugin too called rabbithub. However, our system will be a closed system, and I believe that the webhook approach of the protocol is going to be too much of overhead compared to listening on single queue (and from performance perspective.)
The producer (RMQ Client) of the message should add all the required headers (including the originator's identity) before producing (publishing) it on RMQ. These headers are used for routing.
If, while in transit, the message (including headers) needs to be transformed (e.g. adding new headers), it needs to be sent to the transformer (another RMQ Client). This transformer will essentially become the new publisher.
The actual consumer should receive its intended messages (for which it has subscribed to) through single queue. The routing of all its subscribed messages should be arranged on the RMQ Exchange.
Managing the last 'K' updates should neither be the responsibility of the producer nor the consumer. So, it should be done in the transformer. Producers' messages should be routed to this transformer (for storage) before further re-routing to exchange(s) from where consumers consume.
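A rough sketch of that flow, using only the Python standard library rather than a real RabbitMQ client. The subscription table, the header names (`source`, `subscriber`), the value of `K` and the idea of having the transformer stamp the subscriber onto each copy are assumptions made for illustration; they are not something RabbitMQ provides by itself.

```python
from collections import defaultdict, deque

K = 3  # how many recent updates to keep per subscriber (assumed value)

# Hypothetical subscription table: news source -> subscribers.
SUBSCRIPTIONS = {"news.sports": ["alice", "bob"], "news.tech": ["bob"]}


def publish(source, body):
    """Producer: attach the originator's identity before publishing."""
    return {"headers": {"source": source}, "body": body}


class Transformer:
    """Consumes producer messages, keeps the last K bodies per subscriber,
    and re-publishes one enriched copy per subscriber."""

    def __init__(self, k):
        self.recent = defaultdict(lambda: deque(maxlen=k))

    def handle(self, message):
        out = []
        for subscriber in SUBSCRIPTIONS.get(message["headers"]["source"], []):
            enriched = {
                "headers": {**message["headers"], "subscriber": subscriber},
                "body": message["body"],
            }
            self.recent[subscriber].append(enriched["body"])
            out.append(enriched)
        return out


transformer = Transformer(K)
client_queue = []  # the single queue one client host consumes from
for i in range(4):
    client_queue.extend(transformer.handle(publish("news.tech", f"tech-{i}")))
client_queue.extend(transformer.handle(publish("news.sports", "goal")))
```

Because the transformer is the new publisher, the client reading `client_queue` finds the subscriber in the message headers and never needs a "delivery path" from the broker.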

Message bus: sender must wait for acknowledgements from multiple recipients

In our application the publisher creates a message and sends it to a topic.
It then needs to wait until all of the topic's subscribers ack the message.
It does not appear that message bus implementations can do this automatically, so we are leaning towards making each subscriber send its own new message to the client when it is done.
Now, the client can receive all such messages and, when it has got one from each destination, do whatever clean-up it has to do. But what if the client (sender) crashes part way through the stream of acknowledgments? To handle such a misfortune, I need to (re)implement on the client what the buses already implement: save the incoming acknowledgments until I get enough of them.
I don't believe our needs are that esoteric. How would you handle the situation where the sender (publisher) must wait for confirmations from multiple recipients (subscribers)? Sort of like requesting (and awaiting) Return-Receipts from each subscriber to a mailing list...
We are using RabbitMQ, if it matters. Thanks!
The functionality that you are looking for sounds like a messaging solution that can perform transactions across publishers and subscribers of a message. In the Java world, JMS specifies such transactions. One example of a JMS implementation is HornetQ.
RabbitMQ does not provide such functionality, and for good reasons. RabbitMQ is built to be extremely robust and very fast at the same time. The transactional behavior that you describe is only achievable at the cost of a noticeable performance loss (especially if you want to keep outstanding robustness).
With RabbitMQ, one way to make sure that a message was consumed successfully is indeed to publish an answer message on the consumer side that is then consumed by the original publisher. This can be achieved through RabbitMQ's RPC pattern, which might help you get a clean solution for your problem setting.
If the (original) publisher crashes before all answers have been received, you can assume that all outstanding answers are still queued on the broker. So you would have to build your publisher in a way that it is able to resume processing those remaining messages. This might turn out to be non-trivial.
Finally, I recommend the following solution: Design your producing component in a way that you can consume the answers with one or more dedicated answer consumers that are separated from the origin publisher.
Benefits of this solution are:
the origin publisher can finish its task independent of consumer success
the origin publisher is independent of consumer availability and speed
the origin publisher implementation is far less complex
in a crash scenario, the answer consumer can resume with processing answers
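A minimal sketch of such a dedicated answer consumer, assuming each answer message carries a message id and the name of the subscriber that sent it. The `AnswerTracker` class and the JSON snapshot are hypothetical; a real component would read answers from a RabbitMQ queue and keep its state in a database.

```python
import json


class AnswerTracker:
    """Tracks which subscribers have confirmed each published message."""

    def __init__(self, state=None):
        # message_id -> {"expected": [...], "acked": [...]}
        self.state = state or {}

    def expect(self, message_id, subscribers):
        self.state[message_id] = {"expected": sorted(subscribers), "acked": []}

    def on_answer(self, message_id, subscriber):
        """Record one answer; return True once every subscriber has answered."""
        entry = self.state[message_id]
        if subscriber in entry["expected"] and subscriber not in entry["acked"]:
            entry["acked"].append(subscriber)  # redelivered answers are ignored
        return sorted(entry["acked"]) == entry["expected"]

    def dump(self):
        return json.dumps(self.state)

    @classmethod
    def load(cls, text):
        return cls(json.loads(text))


tracker = AnswerTracker()
tracker.expect("msg-1", ["billing", "audit", "search"])
done_after_first = tracker.on_answer("msg-1", "billing")
tracker.on_answer("msg-1", "billing")  # duplicate answer, no effect

snapshot = tracker.dump()  # state persisted before a simulated crash
resumed = AnswerTracker.load(snapshot)
resumed.on_answer("msg-1", "audit")
done_after_all = resumed.on_answer("msg-1", "search")
```

The publisher only has to register what it expects and can then move on; the tracker can be restarted from its snapshot and keep consuming whatever answers are still queued on the broker.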
Now to a more general point: One of the major benefits of messaging is the decoupling of application components by the broker. In AMQP, this is achieved with exchanges and bindings that allow you to move message distribution logic from your application to a central point of configuration.
If you add RPC-style calls to your clients, then your components are most likely closely coupled again, meaning that the publishing component fails if one of the consuming components fails / is not available / too slow. This is exactly what you will want to avoid. Otherwise, why would you have split the components then?
My recommendation is that you design your application in a way that publishers can complete their tasks independently of the success of consumers wherever possible. Back-channels should be an exceptional case and be implemented in the loosely coupled way described above.

Message broker vs. MOM (Message-Oriented Middleware)

I'm a little confused as to what the difference is between a message broker, e.g. RabbitMQ, and message-oriented middleware. I can't find much info apart from what's on Wikipedia. When searching for MOM I find info on AMQP, which is said to be a protocol for MOM. What does this mean? What is MOM then? I have also read that RabbitMQ implements the AMQP protocol, so why does that make RabbitMQ a message broker? Are a message broker and MOM the same thing?
Hope someone can unravel my confusion. Thanks.
An overview -
A protocol - A set of rules.
AMQP - AMQP is an open internet protocol for reliably sending and receiving messages.
MOM (message-oriented-middleware) - is an approach, an architecture for distributed system i.e. a middle layer for the whole distributed system, where there's lot of internal communication (a component is querying data, and then needs to send it to the other component, which will be doing some processing on the data) so components have to share info/data among them.
Message broker - is any system (in MOM) which handles messages (sending as well as receiving), or to be more precise which routes messages to the specific consumer/recipient. A Message Broker is typically built upon a MOM. The MOM provides the base communication among the applications, and things like message persistence and guaranteed delivery. "Message brokers are a building block of Message oriented middleware."
Rabbitmq - a message broker; a MOM implementation; an open-source implementation of AMQP; as per Wikipedia:
RabbitMQ is an open source message broker software (sometimes
called message-oriented middleware) that implements the Advanced
Message Queuing Protocol (AMQP).
As you asked:
When searching MOM I find info on AMQP which states is a protocol for MOM.. what does this mean?
MOM is about having a messaging middleware (middle layer) between (distributed) system components, and AMQP is a protocol (set of rules) for reliably sending and receiving messages. So, a MOM implementation (e.g. Rabbitmq) may use AMQP.
What is MOM then?
Message-Oriented-Middleware - is an approach, an architecture for distributed system i.e. a middle layer for the whole distributed system, where there's lot of internal communication (a component is querying data, and then needs to send it to the other component, which will be doing some processing on the data) so components have to share info/data among them.
In short it's a way to design a system, for example: depending upon the overall requirements we need to develop a distributed system, with some internal communication. The biggest advantage of the MOM architecture/decision is decoupling of the components, i.e. if we're going to change the data query component it'll have no effect on the data processing components, as they're communicating via MOM (e.g. a Rabbitmq cluster): the data processing component gets the data in the form of messages, which it then parses and processes.
MOM at the end is just a design decision, that we use a middleware for gluing our (distributed) system components, a middleware for handling communication between them, in the form of messages (e.g. JSON). To implement a message-oriented middleware we need more: a set of specific rules, i.e. how messages are published and consumed, how acknowledgement works, how long a message lives (until it is consumed), how messages are persisted, etc. AMQP is basically this set of rules, i.e. a standard/protocol for implementing a MOM. A messaging system that uses AMQP conforms to the stated rules. From Wikipedia:
AMQP mandates the behavior of the messaging provider and client to the
extent that implementations from different vendors are inter-operable,
in the same way as SMTP, HTTP, FTP, etc. have created inter-operable
systems.
I also have read that RabbitMQ implements the AMQP protocol.. so why does that make a RabbitMQ a message broker?
Yes, Rabbitmq is a message broker (publisher -> exchange -> queue -> consumer). It's an open source AMQP implementation, i.e. a messaging system/broker which conforms to AMQP (the AMQP rules). One can use Rabbitmq as the middleware, hence MOM.
AMQP - is just a set of rules, i.e. how messages will be published, kept (in queues), consumed, acknowledged on delivery, etc.
Are a message broker and MOM the same thing?
In simple words, Yes. If we need to go with MOM design for our distributed system, we can simply use Rabbitmq (a message broker; an AMQP implementation) as the middleware.
"MOM" broadly means any technology that can deliver "messages" from one user-space application to another. A message is usually understood to be a discrete piece of information, as compared to a stream.
MOM products used to be quite large and complex: CORBA, JMS, TIBCO, WebsphereMQ, etc. and tried to do a lot more than simply deliver messages.
A broker is a particular set of routing and queuing patterns, and we usually use the term "broker" specifically in MOM (as compared to HTTP, email, XMPP, etc.) Routing means, one message goes to one peer, to one of many peers, to all of many peers, etc. Queuing means messages are held in memory or disk until they can be delivered (and in some cases, acknowledged).
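To make the three routing cases and the queuing idea concrete, here is a small sketch in plain Python. The `Peer` class and the helper functions are invented for illustration and do not correspond to any particular broker's API.

```python
from collections import deque
from itertools import cycle


class Peer:
    def __init__(self, name):
        self.name = name
        self.inbox = deque()  # queuing: messages are held until the peer reads them


def send_to_one(peer, message):
    """One message goes to one specific peer."""
    peer.inbox.append(message)


def make_round_robin(peers):
    """One message goes to one of many peers, taking turns."""
    ring = cycle(peers)

    def send(message):
        next(ring).inbox.append(message)

    return send


def send_to_all(peers, message):
    """One message goes to all of many peers."""
    for peer in peers:
        peer.inbox.append(message)


a, b, c = Peer("a"), Peer("b"), Peer("c")
send_to_one(a, "direct")
send_to_workers = make_round_robin([a, b])
for job in ("job-1", "job-2", "job-3"):
    send_to_workers(job)
send_to_all([a, b, c], "news")
```

A broker packages these patterns behind one central process, so the sending application does not need to know which peers exist.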
AMQP used to specify those broker patterns, so an application could rely on consistent behavior from any AMQP-compatible broker (thus RabbitMQ and OpenAMQ looked much the same to a client app, like two HTTP or two XMPP servers would look the same). AMQP/1.0 specifies just the connection between nodes, so you don't have guarantees of behavior. This makes AMQP/1.0 much easier for firms to implement, but doesn't deliver interoperability.
ZeroMQ is message-oriented middleware that defines, like AMQP/1.0, the connections between pieces rather than the behaviour of a central broker. However it's relatively easy to write MOM brokers using 0MQ, and we've done a few of these (like Majordomo).
Message brokers are one (quite popular) kind of MOM. Another kind of MOM would be brokerless MOM, like ZeroMQ. With broker-based MOM, all messages go to one central place, the broker, and get distributed from there. Brokerless MOM usually allows for peer-to-peer messaging (but does not exclude the option of a central server as well).
AMQP is broker based MOM protocol definition (at least all versions prior to 1.0, which drifts into more general MOM), and there are several different Message brokers implementing that protocol, RabbitMQ is just one of them.

How to implement single-consumer-multi-queue model for rabbitMQ

I have found that this image is very similar to my business model. I need to split messages across several queues.
For some heavy work, I can add more worker threads. But for some work that is not very heavy, I can
let a single consumer subscribe to those messages. But how do I do that in RabbitMQ?
In their documentation, I only found the single-queue-multi-consumer model.
You can add multiple workers to a queue
There can be multiple queues bound to an exchange.
In RabbitMQ, the producer always sends the message to an exchange. So, in your case, I think one exchange is enough. If you want to load balance on the consumer side, you have the two options above.
You can also read my article:
https://techietweak.wordpress.com/2015/08/14/rabbitmq-a-cloud-based-message-oriented-middleware/
RabbitMQ has a very flexible model, which enables a wide variety of routing scenarios to take place.
I need to split messages across several queues. For some heavy work, I can add more worker threads.
Yes, this is supported via a direct exchange. Publish a message using a routing key that is the same as the name of the queue. For convenience, let's say you use the fully-qualified object name (e.g. MyApp.Objects.DataTypeOne). All you need to do is subscribe multiple consuming processes to this queue, and RabbitMQ will load-balance using a round-robin approach.
But for some work that is not very heavy, I can let a single consumer subscribe to those messages.
Yes, you can do this also. Same process as in the paragraph above. Just don't attach multiple consuming processes.
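Both cases can be sketched with a toy model of a direct exchange. This is an in-memory simulation in plain Python, not the RabbitMQ client API; the `DirectExchange` and `WorkQueue` classes and the second queue name (`MyApp.Objects.DataTypeTwo`) are assumptions made for illustration.

```python
class WorkQueue:
    """A queue that hands each message to its consumers in round-robin order."""

    def __init__(self, name):
        self.name = name
        self.consumers = []  # each consumer is modelled as a plain list
        self.backlog = []    # messages held while nobody is consuming
        self._turn = 0

    def subscribe(self, consumer):
        self.consumers.append(consumer)

    def enqueue(self, body):
        if not self.consumers:
            self.backlog.append(body)
            return
        self.consumers[self._turn % len(self.consumers)].append(body)
        self._turn += 1


class DirectExchange:
    """Routes a message to every queue bound with exactly the same routing key."""

    def __init__(self):
        self.bindings = {}  # routing key -> list of queues

    def bind(self, queue, routing_key):
        self.bindings.setdefault(routing_key, []).append(queue)

    def publish(self, routing_key, body):
        for queue in self.bindings.get(routing_key, []):
            queue.enqueue(body)


exchange = DirectExchange()
heavy = WorkQueue("MyApp.Objects.DataTypeOne")
light = WorkQueue("MyApp.Objects.DataTypeTwo")
exchange.bind(heavy, heavy.name)  # routing key equals the queue name
exchange.bind(light, light.name)

worker_1, worker_2, single = [], [], []
heavy.subscribe(worker_1)  # heavy work: several consuming processes
heavy.subscribe(worker_2)
light.subscribe(single)    # light work: just one consumer

for i in range(4):
    exchange.publish(heavy.name, f"heavy-{i}")
for i in range(2):
    exchange.publish(light.name, f"light-{i}")
```

The only difference between the two cases is how many consumers are attached to the queue; the publisher and the bindings look the same.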
I have found this image is very similar to my business model.
The diagram isn't very useful, because it lacks information about the type of messages being published. In that sense, it is only an interconnect diagram. The interesting lines are the ones connecting the queues to the exchange, as that is what you specify within RabbitMQ via Queue Bindings. You can also bind exchanges to one another, but that's a bit further than we probably need to go.
Everything else on the diagram is fully under your control as the user of the RabbitMQ/AMQP system. You can create an arbitrary number of publishers and have an arbitrary number of consuming processes each consuming from an arbitrary number of queues. There are no hard and fast limits, though there are some practical aspects you probably will want to think about to ensure your system is maintainable.

PubSub + Reliable message delivery to unreliably present subscribers

I need to build a system that uses a Publish/Subscribe bus (e.g. Mule, ZeroMQ, RabbitMQ), but the literature all implies that subscriber applications are reliably available to receive messages from topics to which they subscribe as soon as the Pub/Sub bus is able to deliver the message.
I have a system where some of the applications will be reliably connected to the Publish/Subscribe bus, but other applications will not be active or connected to the bus all the time.
The obvious solution is to have some sort of "presence" protocol between the unreliable application and the Publish/Subscribe bus so that "present" applications get their messages delivered immediately, and "not present" applications have their messages queued up in a persistent buffer of some kind, and as soon as they complete the "presence handshake", the queued messages are delivered to the newly present application.
Are there any Publish/Subscribe buses which have this kind of feature built in, or are there any open-source add-ons which do this? Can you point me to any URLs which describe this?
You can achieve this behaviour quite easily with any AMQP-compliant broker (such as RabbitMQ).
Choose the correct exchange type for your usage model. You'll want to use a direct exchange if you're always sending to absolutely named destinations, something like chat.messages.
If you want to do pattern-based routing, you'll want to use a topic exchange. Then you can route based on patterns such as chat.messages.*.
Routing is described in more detail in the RabbitMQ Tutorials.
To create the kind of persistent subscription that you mention, have each subscriber create a queue that is private to that subscriber. The queue is then bound to the relevant routing keys on your chosen exchange.
Since each subscriber has its own queue, messages will be consumed by the subscriber when active and stored when subscriber is inactive or disconnected.
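The idea can be sketched with a small in-memory simulation. This is plain Python, not a RabbitMQ client; `TopicExchange`, `Subscriber` and the routing keys are made up for illustration, and the matcher follows the AMQP topic convention where `*` stands for exactly one word and `#` for zero or more words. Persistence to disk is not modelled.

```python
from collections import deque


def topic_matches(pattern, key):
    """AMQP-style topic match on dot-separated words."""

    def match(p, k):
        if not p:
            return not k
        if p[0] == "#":
            # '#' may absorb zero or more words.
            return any(match(p[1:], k[i:]) for i in range(len(k) + 1))
        if not k:
            return False
        return (p[0] == "*" or p[0] == k[0]) and match(p[1:], k[1:])

    return match(pattern.split("."), key.split("."))


class Subscriber:
    def __init__(self, name):
        self.name = name
        self.queue = deque()  # private queue that outlives disconnects
        self.received = []

    def consume(self):
        """Called while the subscriber is connected: drain the private queue."""
        while self.queue:
            self.received.append(self.queue.popleft())


class TopicExchange:
    def __init__(self):
        self.bindings = []  # (pattern, queue) pairs

    def bind(self, queue, pattern):
        self.bindings.append((pattern, queue))

    def publish(self, routing_key, body):
        targets = []
        for pattern, queue in self.bindings:
            matched = topic_matches(pattern, routing_key)
            if matched and all(queue is not t for t in targets):
                targets.append(queue)  # at most one copy per queue
        for queue in targets:
            queue.append(body)


exchange = TopicExchange()
always_on, sometimes_on = Subscriber("always_on"), Subscriber("sometimes_on")
exchange.bind(always_on.queue, "chat.messages.*")
exchange.bind(sometimes_on.queue, "chat.messages.*")

exchange.publish("chat.messages.room1", "hello")
always_on.consume()                        # present: gets it right away
exchange.publish("chat.messages.room2", "anyone there?")
always_on.consume()
exchange.publish("chat.other", "ignored")  # matches no binding

waiting = len(sometimes_on.queue)          # messages stored while absent
sometimes_on.consume()                     # reconnects later and catches up
```

No separate presence protocol is needed here: connecting and consuming from the private queue plays the role of the presence handshake.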
You haven't mentioned your language of choice, but in Java you can accomplish this with JMS using durable subscribers. Any implementation of JMS (there are many; RabbitMQ can also be used from JMS through its JMS client library) will support this feature.