What is an MQ and why do I want to use it? - activemq

On my team at work, we use IBM MQ a lot for cross-application communication. Lately I've seen mentions on Hacker News and elsewhere of other MQ technologies like RabbitMQ. I have a basic understanding of what an MQ is (a commonly checked place to put and get messages), but what I want to know is: what exactly is it good at? How will I know where and when I want to use it? Why not just stick with more rudimentary forms of interprocess messaging?

All the explanations so far are accurate and to the point, but they might be missing one of the main benefits of message queuing: resilience.
Imagine this: you need to communicate with two or three other systems. A common approach these days is web services, which is fine if you need an answer right away.
However, web services can be down and unavailable - what do you do then? Putting your message into a message queue (which has a component on your machine/server, too) typically still works in this scenario: your message just doesn't get delivered and processed right now, but it will later on, when the other side of the service comes back online.
So in many cases, using message queues to connect disparate systems is a more reliable, more robust way of sending messages back and forth. It doesn't work well for everything (if you want to know the current stock price for MSFT, putting that request into a queue might not be the best of ideas) - but in lots of cases, like putting an order into your supplier's message queue, it works really well and can help ease some of the reliability issues with other technologies.
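As a concrete illustration of the "order into your supplier's queue" case, here is a minimal sketch using the Python pika client against RabbitMQ; the queue name and order payload are made up for the example, and any broker with durable queues and persistent messages works the same way:

    import json
    import pika  # RabbitMQ client; any broker with durable queues works similarly

    # Hypothetical order payload and queue name, purely for illustration.
    order = {"order_id": 1234, "sku": "WIDGET-7", "quantity": 50}

    connection = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
    channel = connection.channel()

    # durable=True: the queue definition itself survives a broker restart
    channel.queue_declare(queue="supplier-orders", durable=True)

    # delivery_mode=2 marks the message as persistent, so it is written to disk
    # and handed to the supplier's consumer whenever it comes back online.
    channel.basic_publish(
        exchange="",
        routing_key="supplier-orders",
        body=json.dumps(order),
        properties=pika.BasicProperties(delivery_mode=2),
    )
    connection.close()

Because both the queue and the message are marked durable/persistent, the order survives a broker restart and gets delivered whenever the supplier's side is back up.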

MQ stands for message queue.
It's an abstraction layer that allows multiple processes (likely on different machines) to communicate via various models (e.g., point-to-point, publish/subscribe, etc.). Depending on the implementation, it can be configured for things like guaranteed reliability, error reporting, security, discovery, performance, etc.
You can do all this manually with sockets, but it's very difficult.
For example: suppose you want two processes to communicate, but one of them can die in the middle and later get reconnected. How would you ensure that interim messages are not lost? MQ solutions can do that for you.
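For instance, here is a minimal consumer sketch (again using the Python pika client against RabbitMQ; the queue name and work function are made up) that survives the "die in the middle and reconnect" case by acknowledging each message only after it has been processed:

    import pika

    connection = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
    channel = connection.channel()

    # Durable queue: both producer and consumer can declare it, so either side
    # can start first and the queue survives broker restarts.
    channel.queue_declare(queue="work", durable=True)

    def handle(channel, method, properties, body):
        process(body)  # hypothetical work function
        channel.basic_ack(delivery_tag=method.delivery_tag)
        # If this process dies before the ack, the broker re-queues the message
        # and delivers it again once a consumer reconnects - nothing is lost.

    channel.basic_consume(queue="work", on_message_callback=handle)
    channel.start_consuming()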

Message queuing systems are supposed to give you several bonuses. Among the most important are monitoring and transactional behavior.
Transactional design is important if you want to be immune to failures such as power failure. Imagine that you want to notify a bank system of an ATM money withdrawal, and it has to be done exactly once per request, no matter which servers failed temporarily in the middle. MQ systems allow you to coordinate transactions across multiple databases, MQ, and other systems.
Needless to say, such systems are very slow compared to named pipes, TCP, or other non-transactional tools. If high performance is required, you cannot afford to write every message through to disk; achieving communication that is both reliable and fast complicates the design and pushes the designer into genuinely non-trivial tricks.
MQ systems normally allow users to watch the queue contents, write plugins, clear queues, etc.
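As a small, hedged illustration of the transactional behavior mentioned above: many brokers support broker-level transactions on a channel. With pika against RabbitMQ that looks roughly like the sketch below (queue name and payload are invented). Note that this only makes the publish itself atomic; true exactly-once coordination with a database, as in the ATM example, needs something heavier such as two-phase commit or an outbox table.

    import pika

    connection = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
    channel = connection.channel()
    channel.queue_declare(queue="withdrawals", durable=True)

    channel.tx_select()  # open an AMQP transaction on this channel
    try:
        channel.basic_publish(
            exchange="",
            routing_key="withdrawals",
            body=b'{"atm_id": 42, "amount": 100}',  # hypothetical payload
            properties=pika.BasicProperties(delivery_mode=2),
        )
        channel.tx_commit()    # the message becomes visible only on commit
    except Exception:
        channel.tx_rollback()  # nothing is enqueued on failure
        raise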

MQ simply stands for Message Queue.
You would use one when you need to reliably send an inter-process/cross-platform/cross-application message that isn't time-dependent.
The Message Queue receives the message, places it in the proper queue, and waits for the application to retrieve the message when ready.

Reference: "web services can be down and not available - what do you do then?"
As an extension to that: what if your local network and your local PC are down as well? While you wait for the system to recover, the dependent systems deployed elsewhere that are waiting for that data need to see an alternative data stream.
Otherwise, the response might not be "real-time" enough for today's, and very soon the future's, Internet of Things (IoT) requirements.
If you want truly parallel, non-volatile storage of various FIFO streams (at least at some point along the signal chain), use an FPGA and FRAM memory. FRAM runs at clock speed, and FPGA devices can be reprogrammed on the fly, adding and removing however many independent parallel data streams are needed (within established constraints, of course).

Related

Redirect NServiceBus message based on Endpoint availability

I'm new to NServiceBus, but currently using it with SQL Server Transport to send messages between three machines: one belongs to an endpoint called Server, and two belong to an endpoint called Agent. This is working as expected, with messages sent to the Agent endpoint distributed to one of the two machines via the default round-robin.
I now want to add a new endpoint called PriorityAgent with a different queue and two additional machines. While all endpoints use the same message type, I know where each message should be handled prior to sending it, so normally I can just choose the correct destination endpoint and the message will be processed accordingly.
However, I need to build in a special case: if all machines on the PriorityAgent endpoint are currently down, messages that ordinarily should be sent there should be sent to the Agent endpoint instead, so they can be processed without delay. On the other hand, if all machines on the Agent endpoint are currently down, any Agent messages should not be sent to PriorityAgent, they can simply wait for an Agent machine to return.
I've been researching the proper way to implement this, and haven't seen many results. I imagine this isn't an unheard-of scenario, so my assumption is that I'm searching for the wrong things or thinking about this problem in the wrong way. Still, I came up with a couple potential solutions:
Separately track heartbeats of PriorityAgent machines, and add a mutator or behavior to change the destination of outgoing PriorityAgent messages to the Agent endpoint if those heartbeats stop.
Give PriorityAgent messages a short expiration, and somehow handle the expiration to redirect messages to the Agent endpoint. I'm not sure if this is actually possible.
Is one of these solutions on the right track, or am I off-base entirely?
You have not seen many do this because it's considered an antipattern. Or rather one of two antipatterns.
1) Either you are sending a command, in which case the RECEIVER of the command defines the contract. Why are you sending a command defined by PriorityAgent to Agent? There should be no coupling there. A command belongs to ONE logical endpoint/queue.
2) Or you are publishing an event defined by whoever publishes, with both PriorityAgent and Agent as subscribers. The two subscribers should be 100% autonomous and share nothing. Checking heartbeats/sharing info between these two logically separate entities is a bad thing. Why have them separately in the first place then? If they know each other's "dirty secrets," they should be the same thing.
If your primary concern is that the PriorityAgent messages will not be handled if the machines hosting it are down, and want to use the machines hosting Agent as a backup, simply deploy PriorityAgent there as well. One machine can run more than one endpoint just fine.
That way you can leverage the additional machines, but don't have to get dirty with sending the same command to a different logical endpoint or coupling two different logical endpoints together through some back channel.
I'm Dennis van der Stelt and I work for Particular Software, makers of NServiceBus.
From what I understand, both PriorityAgent and Agent are already scaled out over multiple machines? Then they both work according to the competing consumers pattern. In other words, the machines behind each endpoint try to pick up messages from the same queue, where only one will win and start processing the message.
You're also talking about high availability. So when PriorityAgent goes down, another machine will pick it up. That's what I don't understand. Why fail over to Agent, which seems to me to be a logically different endpoint? If it is logically different, how can it handle PriorityAgent messages? If it can handle the same message, it seems logically the same endpoint. Then why make the difference between PriorityAgent and Agent?
Besides that, SQL Server has all kinds of features (like Always On) to make sure it does not (completely) go down. Why try to solve difficult scenarios with custom-built solutions when SQL Server can already solve this for you?
Another scenario could be that PriorityAgent should handle priority cases: something like preferred customers, or high-value customers. That is sometimes used when (for example) a lot of orders (read: messages) come in, but we want to deal with high-value customers sooner than regular customers. Due to the volume of messages coming in, high-value customers would otherwise end up at the back of the queue, together with regular customers. A solution could be to publish these messages and have two different endpoints (with different queues) both subscribed to this message. Both receive each unique message, but check whether it's a message they should handle. The Agent will ignore high-value customers; the PriorityAgent will ignore regular customers.
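The question is about NServiceBus on the SQL Server transport, but the publish-and-filter idea described above is framework-agnostic. A rough sketch of the PriorityAgent side in Python with pika and RabbitMQ (exchange, queue, flag, and handler names are all invented for illustration):

    import json
    import pika

    connection = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
    channel = connection.channel()

    # Each subscriber gets its own queue bound to the same fanout exchange, so
    # both Agent and PriorityAgent receive a copy of every published order event.
    channel.exchange_declare(exchange="orders", exchange_type="fanout")
    channel.queue_declare(queue="priority-agent", durable=True)
    channel.queue_bind(queue="priority-agent", exchange="orders")

    def handle(channel, method, properties, body):
        order = json.loads(body)
        if not order.get("high_value"):  # hypothetical flag on the event
            channel.basic_ack(delivery_tag=method.delivery_tag)  # not ours, skip
            return
        handle_priority_order(order)  # hypothetical handler
        channel.basic_ack(delivery_tag=method.delivery_tag)

    channel.basic_consume(queue="priority-agent", on_message_callback=handle)
    channel.start_consuming()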
These are some of the solutions available as standard messaging patterns, or infrastructural solutions, to solve your issue. Again, it's not completely clear to me what you're looking for. If you'd like to continue the discussion, perhaps email support#particular.net.

Persisting Data in a Twisted App

I'm trying to understand how to persist data in a Twisted application. Let's say I've decided to write a Twisted server that:
Accepts inbound SMTP requests
Sends the message to a 3rd party system for modification
Relays the modified message to its destination
A typical Twisted tutorial would have you build this app using Deferreds and callbacks, roughly:
A Factory handles inbound requests
Each time a full email is received a call is sent to the remote message processor, returning a deferred
Add an errback that substitutes the original message if anything goes wrong in the modify call.
Add a callback to send the message on to the recipient, which again returns a deferred.
A real server would add/include additional call/errbacks to retry or notify the sender or whatnot. Again for simplicity, assume we consider this an acceptable amount of effort and just log errors.
Of course, this persists NO data in the event of a crash/restart/something else. I get that a solution involves a 3rd party persistent datastore (RabbitMQ is often mentioned) and could probably come up with a dozen random ways to achieve the outcome.
However, I imagine there are a few approaches that work best in a Twisted app. What do they look like? How do they store (and restore in the event of a crash) the in-process messages?
If you found this question, you probably already know that Twisted is event-based. It sounds simple, but the "hardest" part of the answer is to get the persistence platform generating the events we need when we need them. Naturally, you can persist the data in a DB or a message queue, but some platforms don't naturally generate events. For example:
ZeroMQ has (or at least had) no callback for new data. It's also relatively poor at persistence.
In other cases, events are easy but reliability is a problem:
PostgreSQL can be configured to generate events using triggers, but the notifications are one-time things, so you can't resume incomplete events
The light at the end of the tunnel seems to be something like RabbitMQ.
RabbitMQ can persist the message in a database to survive a crash
We can use acknowledgements on both legs (incoming SMTP to RabbitMQ, and RabbitMQ to outgoing SMTP) to ensure the application is reliable - importantly, RabbitMQ supports acknowledgements.
Finally, several of the RabbitMQ clients provide full asynchronous support (see for example pika, txamqp, and puka)
It's enough for our purposes that the RabbitMQ client provides us an event-based interface.
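As a rough illustration of what "event-based interface" means here, a minimal consumer using pika's asynchronous SelectConnection might look like this (queue name and handler are hypothetical; inside a Twisted process you would more likely reach for pika's Twisted adapter or txamqp, but the callback shape is the same):

    import pika

    def on_message(channel, method, properties, body):
        handle(body)  # hypothetical message handler
        # Explicit ack: RabbitMQ only drops its persisted copy once we confirm.
        channel.basic_ack(delivery_tag=method.delivery_tag)

    def on_channel_open(channel):
        channel.basic_consume(queue="smtp-relay", on_message_callback=on_message)

    def on_connection_open(connection):
        connection.channel(on_open_callback=on_channel_open)

    # Everything is driven by callbacks; no thread ever blocks waiting for data.
    connection = pika.SelectConnection(
        pika.ConnectionParameters("localhost"),
        on_open_callback=on_connection_open,
    )
    connection.ioloop.start()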
At a more theoretical level, however, this need not be the case. In fact, despite the "notification" issue above, ZeroMQ has an event-based client. Even if our software is elegantly event-based, we will run into systems that aren't. In those cases, we have no choice but to fall back on polling: in principle, if not in practice, we just query the message provider for messages. When we exhaust the current queue (and immediately if there are no messages), we use a callLater call to check again in the future. It may feel like an anti-pattern, but it's (to the best of my knowledge, anyway) the right way to handle this particular case.
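A bare-bones sketch of that polling fallback in Twisted (the provider object and handler are hypothetical stand-ins for whatever non-event-based system you are stuck with):

    from twisted.internet import reactor

    POLL_INTERVAL = 5.0  # seconds to wait once the provider's queue is drained

    def poll_provider():
        # Drain everything the provider currently has, then yield to the reactor.
        while True:
            message = provider.get_message()  # hypothetical non-blocking fetch
            if message is None:
                break
            handle(message)                   # hypothetical handler
        # The event-loop-friendly "sleep": schedule the next poll and return.
        reactor.callLater(POLL_INTERVAL, poll_provider)

    reactor.callWhenRunning(poll_provider)
    reactor.run()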

Is Message Queuing the right strategy for a high-bandwidth data feed?

I have a huge network of data-collection servers which generate a large volume of real-time data.
In the past I've provided partners with the ability to get this data in near-real-time using HTTP GETs. But for many reasons I'm eager to ditch this.
So yeah... I'm eager to build out a new distribution system and I was thinking that a Message Queuing System was the way to go.
I need to be able to distribute data from my sources to a number of different partners. Some partners receive all of it, others just get a portion. And if a partner gets disconnected, they need to be able to reconnect and not miss any data. (Although, for the sake of disk and memory, I'd like their queued messages to expire after an hour or so.)
Lastly, I need the system to be able to handle tens of thousands of enqueues per minute.
Do you think Message Queuing is an appropriate scheme?
I was looking at using RabbitMQ. Is it difficult to maintain?
Thanks Very Much!
-Z
I cannot tell you if it is the right strategy in your specific case, but message products are indeed used in high message rate systems every day.
Much of the investment world uses various products, both commercial (Tibco) and open source (ZeroMQ), to name just two, to handle market data from exchanges and other sources. These feeds are likely at least as active as your data sensors are.
The publish/subscribe model, where some receivers want some messages and some receivers want all, along with late-join or other so-called guaranteed messaging are indeed standard features on most of these products.
So do go ahead and investigate products. I have not used RabbitMQ myself, so I cannot comment on it specifically; however, with a minimal abstraction layer you should be able to insulate yourself from too many platform-specific calls, which lets you swap message-bus implementations if the need arises. (You may even want to build such a shim as part of a proof-of-concept to test out more than one product for your specific purpose. You get experience in multiple products, flesh out the facade layer, and get up to speed on the products.)
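A minimal sketch of such a facade layer in Python (the class, exchange, and topic names are invented; pika/RabbitMQ is just one possible implementation hidden behind it):

    import abc
    import json
    import pika

    class FeedPublisher(abc.ABC):
        """Thin facade so collection code never sees broker-specific calls."""

        @abc.abstractmethod
        def publish(self, topic: str, payload: dict) -> None: ...

    class RabbitFeedPublisher(FeedPublisher):
        def __init__(self, host="localhost", exchange="feed"):
            self._conn = pika.BlockingConnection(pika.ConnectionParameters(host))
            self._channel = self._conn.channel()
            # Topic exchange: partners bind queues with wildcard patterns to
            # receive either the whole feed or just a portion of it.
            self._channel.exchange_declare(exchange=exchange, exchange_type="topic")
            self._exchange = exchange

        def publish(self, topic, payload):
            self._channel.basic_publish(
                exchange=self._exchange,
                routing_key=topic,  # e.g. "sensors.europe.temperature"
                body=json.dumps(payload),
            )

Per-partner queues bound to that exchange can also set RabbitMQ's per-queue message TTL (the x-message-ttl argument) to get the "expire after an hour or so" behavior, and swapping brokers later only means writing another FeedPublisher implementation.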
Good Luck

a completely decoupled OO system?

To make an OO system as decoupled as possible, I'm thinking of the following approach:
1) we run an RMI/directory-like service where objects can register and discover each other. They talk to this service through an interface
2) we run a messaging service to which objects can publish messages, and register subscription callbacks. Again, this happens through interfaces
3) when object A wants to invoke a method on object B, it discovers the target object's unique identity through #1 above, and publishes a message on the message service for object B
4) the message service invokes B's callback to give it the message
5) B processes the request and sends the response for A on the message service
6) A's callback is called and it gets the response.
I feel this system is as decoupled as practically possible, but it has the following problems:
1) communication is typically asynchronous
2) hence it's non real time
3) the system as a whole is less efficient.
Are there any other practical problems where this design obviously won't be applicable? What are your thoughts on this design in general?
Books
Enterprise Integration Patterns
It appears he's talking about using a Message Oriented Middleware
Here are some things to consider
Security
What will prevent another rogue service from registering as a key component in your system? You will need a way to validate and verify that services are who they say they are. This can be done through a PKI system. There are scenarios where you might not need to do this, e.g. if your system is hosted entirely on your intranet. If that is the case, social engineering and rogue employees will be your biggest threats.
Contract
What kind of contract will your clients have with the services? Will messages all be serialized as XML and sent as a TextMessage? If you use a pure byte message you'll have to be careful about byte order if your services are to run on multiple platforms.
Synchronization
Most developers are not able to comprehend and utilize asynchronous messages correctly. Where possible it might be in the best interest of your design to create a way to invoke "synchronous" messages. I've done this in the past by creating a sendMessageAndWait() method with a timeout and a return object. Within the method you can create a temporary topic id to receive the response, register a listener for it, then use locks to wait for a message to be returned on your temporary topic.
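A rough sketch of that sendMessageAndWait() idea, written here in Python with pika and RabbitMQ's reply-queue/correlation-id pattern rather than JMS temporary topics (queue names, timeout handling, and payload format are illustrative only):

    import time
    import uuid
    import pika

    class SyncRequester:
        """Blocks the caller until a correlated reply arrives, or times out."""

        def __init__(self, host="localhost"):
            self._conn = pika.BlockingConnection(pika.ConnectionParameters(host))
            self._channel = self._conn.channel()
            # An exclusive, auto-named queue plays the role of the "temporary topic".
            declared = self._channel.queue_declare(queue="", exclusive=True)
            self._reply_queue = declared.method.queue

        def send_message_and_wait(self, queue, body, timeout=10.0):
            correlation_id = str(uuid.uuid4())
            self._channel.basic_publish(
                exchange="",
                routing_key=queue,
                body=body,
                properties=pika.BasicProperties(
                    reply_to=self._reply_queue,
                    correlation_id=correlation_id,
                ),
            )
            deadline = time.monotonic() + timeout
            # Poll the reply queue until the matching correlation id shows up.
            for frame, props, reply in self._channel.consume(
                    self._reply_queue, inactivity_timeout=0.5):
                if frame is not None and props.correlation_id == correlation_id:
                    self._channel.basic_ack(frame.delivery_tag)
                    self._channel.cancel()
                    return reply
                if time.monotonic() > deadline:
                    self._channel.cancel()
                    raise TimeoutError("no reply within %.1f seconds" % timeout)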
Unsolicited Messages
What happens if you want to allow your service(s) to send unsolicited messages to your clients? A critical event happened in Service A and it must notify your clients or possibly a Watch Dog service. Allow for your design to register for a common communication channel for services to communicate with clients without clients initiating the conversation.
Failover
What happens if a critical service processing your credit cards goes down? You'll need to implement a failover and watchdog service to ensure that your key infrastructure is always up and running. You could register a list of services within your registry; the registry could then give out the primary service, falling back to a secondary service if the primary stops communicating. Or, if your Message Oriented Middleware can handle round-robin messaging, you might be able to register all the services on the same topic. Think about creating a way to know when a service has died. Since most messages are asynchronous, it will be difficult to determine when a service has gone offline. This can be done with a heartbeat and watchdog.
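A bare-bones sketch of the heartbeat/watchdog bookkeeping (transport-agnostic; wire the heartbeat() call to whatever channel your services publish their heartbeats on):

    import time

    class Watchdog:
        """Tracks last-seen heartbeat times and reports services that went silent."""

        def __init__(self, timeout_seconds=30.0):
            self.timeout = timeout_seconds
            self.last_seen = {}  # service name -> time of last heartbeat

        def heartbeat(self, service_name):
            # Call this from the listener on the shared heartbeat topic.
            self.last_seen[service_name] = time.monotonic()

        def dead_services(self):
            # Services heard from before, but not within the timeout window.
            now = time.monotonic()
            return [name for name, seen in self.last_seen.items()
                    if now - seen > self.timeout]

The failover logic can then promote a secondary service whenever the primary shows up in dead_services().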
I've created this type of system a few times in my past for large systems that needed to communicate. If you and other developers understand the pros and cons of such a system it can be very powerful and flexible.
The biggest piece of advice I can give is to build a toolkit for your other developers so they don't have to think about how to register a service, or implement failover, or respond to messages from a client. These are the sorts of things that will kill your system and have others say it is too complicated. Making it painless for them will allow your system to work the way you need it with flexibility and decoupling while not burdening your developers with understanding enterprise design patterns.
This is not an Ivory Tower Architect/Architecture. It would be if he said, "This is how it will be done; now go do it and don't bother me about it, because I know I'm right." If you really wanted to reference an anti-pattern, it could be Kitchen Sink, maybe. No, now that I think about it, it isn't Kitchen Sink either.
Anti-Patterns
If you can find one, please post it as a comment.
Coupling is simply a balance between efficiency and re-usability. If you wish the modules of your system to be as reusable as possible then that will undoubtedly come at a cost.
Personally I think it best to define some key assumptions which may tighten coupling, but bring increased efficiency.
There are design patterns which never see the light of day just because the benefit they provide is not worth the cost in complexity.
What's the simplest thing that could possibly work? Do modularize into reasonably sized routines, but avoid interfaces, services, messages, and all of this unless you are going to have multiple implementations or multiple hardware resources to divide a job across.
Make it simple, then refactor those parts that turned out to matter.

MSMQ, WCF, and Flaky Servers

I have two applications, let us call them A and B. Currently A uses WCF to send messages to B. A doesn't need a response and B never sends messages back to A.
Unfortunately, there is a flaky network connection between the servers A and B are running on. This results in A getting timeout errors from time to time.
I would like to use WCF+MSMQ as a buffer between the two applications. That way if B goes down temporarily, or is otherwise inaccessible, the messages are not lost.
From an architectural standpoint, how should I configure this?
I think you might have inflated your question a bit with the inclusion of the word "architectural".
If you truly need an architectural overview of this issue from that high a level, including SLA concerns: your service level will only be as good as your MSMQ deployment, so if you are concerned about service levels, just look at the documentation available online about MSMQ and SLAs.
If you are looking more for the actual implementation from a code standpoint, this article is excellent:
http://code.msdn.microsoft.com/msmqpluswcf
It goes over a lot of the things you'll need to know, including how to set up MSMQ and how to implement chunking to get around MSMQ's 4MB limit (if that's even necessary... I hope it's not).
Here's a good article about creating a durable and transactional queue that will cross machines using an MSMQ cluster: http://www.devx.com/enterprise/Article/39015/1954