Persisting Data in a Twisted App - twisted

I'm trying to understand how to persist data in a Twisted application. Let's say I've decided to write a Twisted server that:
Accepts inbound SMTP requests
Sends the message to a 3rd party system for modification
Relays the modified message to its destination
A typical Twisted tutorial would have you build this app using Deferreds and callbacks, roughly:
A Factory handles inbound requests
Each time a full email is received a call is sent to the remote message processor, returning a deferred
Add an errback that substitutes the original message if anything goes wrong in the modify call.
Add a callback to send the message on to the recipient, which again returns a deferred.
A real server would add/include additional call/errbacks to retry or notify the sender or whatnot. Again for simplicity, assume we consider this an acceptable amount of effort and just log errors.
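The callback chain above can be sketched roughly as follows. To keep it runnable without extra dependencies, this uses the standard library's asyncio rather than Twisted itself; in Twisted the same shape would be a Deferred with an errback followed by a callback. The names `process_message`, `modify` and `relay` are hypothetical and only illustrate the flow.

```python
import asyncio
import logging

log = logging.getLogger("relay")


async def process_message(message, modify, relay):
    """Modify a message remotely, fall back to the original on error, then relay it."""
    try:
        # Plays the role of the first Deferred (the remote modify call).
        outgoing = await modify(message)
    except Exception:
        # Plays the role of the errback: substitute the original message.
        log.exception("modify failed; relaying the original message")
        outgoing = message
    try:
        # Plays the role of the second Deferred (sending to the recipient).
        await relay(outgoing)
    except Exception:
        # As in the question, errors are only logged here.
        log.exception("relay failed")
        return None
    return outgoing
```

Note that nothing in this sketch survives a restart: the message lives only in memory while the coroutine runs, which is exactly the gap the question is about.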
Of course, this persists NO data in the event of a crash/restart/something else. I get that a solution involves a 3rd party persistent datastore (RabbitMQ is often mentioned) and could probably come up with a dozen random ways to achieve the outcome.
However, I imagine there are a few approaches that work best in a Twisted app. What do they look like? How do they store (and restore in the event of a crash) the in-process messages?

If you found this question, you probably already know that Twisted is event-based. It sounds simple, but the "hardest" part of the answer is to get the persistence platform generating the events we need when we need them. Naturally, you can persist the data in a DB or a message queue, but some platforms don't naturally generate events. For example:
ZeroMQ has (or at least had) no callback for new data. It's also relatively poor at persistence.
In other cases, events are easy but reliability is a problem:
PostgreSQL can be configured to generate events using triggers, but they're one-time things, so you can't resume incomplete events.
The light at the end of the tunnel seems to be something like RabbitMQ.
RabbitMQ can persist the message in a database to survive a crash
We can use acknowledgements on both legs (incoming SMTP to RabbitMQ and RabbitMQ to outgoing SMTP) to ensure the application is reliable. Importantly, RabbitMQ supports acknowledgements.
Finally, several of the RabbitMQ clients provide full asynchronous support (see, for example, pika, txAMQP, and puka).
It's enough for our purposes that the RabbitMQ client provides us an event-based interface.
At a more theoretical level, however, this need not be the case. In fact, despite the "notification" issue above, ZeroMQ has an event-based client. Even if our software is elegantly event-based, we will run into systems that aren't. In these cases, we have no choice but to fall back on polling. In principle, if not in practice, we just query the message provider for messages. When we exhaust the current queue (and immediately if there are no messages), we use reactor.callLater to check again in the future. It may feel like an anti-pattern, but it's (to the best of my knowledge anyway) the right way to handle this particular case.
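A minimal sketch of that polling fallback, assuming the message provider can be modeled as a `queue.Queue`. It uses asyncio's `loop.call_later` so it runs on the standard library alone; in a Twisted app, `reactor.callLater(interval, poll)` would play the same role. `start_polling` and `demo` are hypothetical names.

```python
import asyncio
import queue


def start_polling(loop, source, handle, interval=0.01):
    """Drain `source`, then schedule another check `interval` seconds from now."""

    def poll():
        while True:
            try:
                item = source.get_nowait()
            except queue.Empty:
                break  # queue exhausted for now
            handle(item)
        # Re-arm the timer so the provider is checked again in the future.
        state["timer"] = loop.call_later(interval, poll)

    state = {"timer": loop.call_soon(poll)}
    return state  # cancel state["timer"] to stop polling


async def demo():
    loop = asyncio.get_running_loop()
    source, seen = queue.Queue(), []
    state = start_polling(loop, source, seen.append)
    source.put("a")
    await asyncio.sleep(0.05)
    source.put("b")  # arrives after the first drain; a later poll picks it up
    await asyncio.sleep(0.05)
    state["timer"].cancel()
    return seen
```

The interval is the trade-off knob: a shorter one lowers latency, a longer one lowers the load on the provider.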

Related

RabbitMQ Architecture Gut Check

So I'm thinking of using RabbitMQ to send messages between all the varied apps in our organization. In the attached image is essentially the picture in my mind of how things would work.
So the message goes into the exchange, and splits out into three queues.
Payloads are always JSON text.
The consumers are long-running Windows services whose only job is to sit and listen for messages destined for their particular application. When a message comes in, they look at the header to determine how this payload JSON should be interpreted, and which REST endpoint it should be sent to. e.g., "When I see a 'WORK_ORDER_COMPLETE' header I am going to parse this as a WorkOrderCompleteDto and send it as a POST to the CompletedWorkOrder WebAPI method at timelabor-api.mycompany.com. If the API returns other than 200, I reject the message and let rabbit handle it. If I get a 200 back from the API, then I ack the message to rabbit."
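The consumer logic described above might look roughly like this. It is only a sketch: the full URL path, the `ROUTES` table, and the injected `post` callable (which stands in for a real HTTP client and returns a status code) are assumptions, and the returned strings stand in for the broker client's real ack/reject calls.

```python
import json

# Hypothetical routing table: header value -> REST endpoint (path is assumed).
ROUTES = {
    "WORK_ORDER_COMPLETE": "https://timelabor-api.mycompany.com/CompletedWorkOrder",
}


def handle_delivery(header, body, post):
    """Decide whether one message should be acked or rejected.

    `post(url, payload)` is an injected callable returning an HTTP status code.
    """
    url = ROUTES.get(header)
    if url is None:
        return "reject"  # unknown message type
    try:
        payload = json.loads(body)
    except ValueError:
        return "reject"  # payload is not valid JSON
    status = post(url, payload)
    # Ack only on 200, as described; anything else goes back to the broker.
    return "ack" if status == 200 else "reject"
```

Keeping the decision in a plain function like this makes it easy to test without a broker or an API running.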
The end applications are simply our internal line-of-business apps that we use for inventory, billing, etc. Those applications are then responsible for performing their respective function (decrementing inventory, creating a billing record, yadda yadda).
Does this reflect a sensible understanding of a proper way to use Rabbit?
Conceptually, I believe you may be relying on RabbitMQ to do things that your application needs to do.
The assumption of the architecture seems to be that each message is processed by each of your consuming applications totally in a vacuum. What this means is that you don't care that a message processed successfully by Billing_App ultimately failed with Inventory_App. Maybe this is true, but in my experience, it isn't.
If the end goal is to achieve some consistent state in the overall data, you're going to need some supervisory component orchestrating and monitoring the various operations to ensure that the state is consistent. In effect, your plan to reject a message back to RabbitMQ means you have a bit more thought to put into what happens when something fails.
I would focus on identifying some UML activity diagrams that describe your behavior and how it achieves the end-state, and use that as a guide to determine how the orchestration of your application needs to be designed.

RabbitMQ+MassTransit: how to cancel queued message from processing?

In some exceptional situations I need somehow to tell the consumer at the receiving end that some messages shouldn't be processed. Otherwise two systems will become out-of-sync (we deal with some outdated external systems, and if, for example, a connection is dropped we have to discard all queued operations in scope of that connection).
Take a risk and resolve problem messages manually? Compensation actions (that could be tough to support in my case)? Anything else?
There are a few ways:
You can set a time-to-live when sending a message: await endpoint.Send(myMessage, c => c.TimeToLive = TimeSpan.FromHours(1));, but this will apply to all messages that are sent (or published) like this. I would consider this, after looking at your requirements. This is technical, but it is a proper messaging pattern.
Make TTL and generation timestamp properties of your message itself and let the consumer decide if the message is still worth processing. This is more business and, probably, the most correct way.
Combine tech and business - keep the timestamp and TTL in message headers so they don't pollute your message contracts, and filter them out using a custom middleware. In this case, you need to be careful to log such drops so you won't be left wondering why messages disappear now and then.
Almost any unreliable integration can be monitored using sagas, with timeouts. For example, we use a saga to integrate with Twilio. Since we have no ability to open a webhook for them, we poll after some interval to check the message status. You can start a saga when you get a message and schedule a message to check if the processing is still waiting. As discussed in comments, you can either use the "human intervention required" way to fix the issue or let the saga decide to drop the message.
A similar way could be to use a lookup table, where you put the list of messages that aren't relevant for processing. Such a table would be similar to the list of sagas. It seems that this way would also require scheduling. Both here and for the saga, I'd recommend using a separate receive endpoint (a queue) for the DropIt message, with only one consumer. It would prevent DropIt messages from getting stuck behind the integration messages that are waiting to be processed (some of which should already be dropped).
Use RMQ management API to remove messages from the queue. This is the worst method, I won't recommend it.
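To make the second option concrete (TTL and generation timestamp carried with the message, with the consumer deciding), here is a minimal sketch in Python rather than MassTransit's C#. The `Envelope` type and the `consume`, `process` and `log_drop` names are hypothetical; this is not MassTransit's API.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class Envelope:
    body: str
    created_at: datetime  # when the sender generated the message
    ttl: timedelta        # how long the message stays worth processing


def is_expired(envelope, now=None):
    now = now or datetime.now(timezone.utc)
    return now - envelope.created_at > envelope.ttl


def consume(envelope, process, log_drop, now=None):
    """Process the message unless it has outlived its TTL; log every drop."""
    if is_expired(envelope, now):
        log_drop(envelope)  # so nobody wonders why messages disappear
        return False
    process(envelope.body)
    return True
```

The same check could live in a middleware step if the timestamp and TTL are kept in headers instead of the message body, as in the third option.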
From what I understand, you're building a system that sends messages to 3rd party systems. In other words, systems you don't control. It has an API but compensating actions aren't always possible, because the API doesn't provide it or because actions are performed inside the 3rd party system that can't be compensated or rolled back?
If possible, try to solve this via sagas. Make sure the saga executes the different steps (the sending of messages) in the right order, so that messages that cannot be compensated are sent last. This way, messages that can be compensated will be compensated by the saga if they fail. The ones that cannot be compensated should be sent last, when you're as sure as possible that they won't have to be compensated, because that last message is the last step in synchronizing all systems.
All in all, this is one of the problems with distributed systems: keeping everything in sync. Compensating actions are the way to deal with this. If compensating actions aren't possible, you're in a very difficult situation. Try to see if the business can help by becoming more flexible and accepting that you need to compensate things, where they'll tell you it's not possible.
In some exceptional situations I need somehow to tell consumer on receiving point that some messages shouldn’t be processed.
Can't you invert this into:
Tell the consumer that an earlier message can be processed.
This way you can easily turn this in a state machine (like a saga) that acts on two messages. If the 2nd message never arrives then you can discard the 1st after a while or do something else.
The strategy here is to halt/wait until certain that no actions need to be reverted.
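A rough sketch of that two-message idea, assuming each operation carries a correlation id and that time is passed in explicitly (which keeps the example deterministic). `PendingOperations` is a hypothetical in-memory stand-in for a saga's state; a real saga would persist it.

```python
class PendingOperations:
    """Hold each operation until a confirmation arrives or a deadline passes."""

    def __init__(self, timeout):
        self.timeout = timeout
        self._pending = {}  # correlation id -> (operation, received_at)

    def on_operation(self, op_id, operation, now):
        """First message: remember the operation but do not execute it yet."""
        self._pending[op_id] = (operation, now)

    def on_confirmation(self, op_id, execute):
        """Second message: run the held operation, if it is still held."""
        entry = self._pending.pop(op_id, None)
        if entry is None:
            return False  # unknown id, or already discarded
        execute(entry[0])
        return True

    def discard_stale(self, now):
        """Drop operations whose confirmation never arrived in time."""
        stale = [k for k, (_, t) in self._pending.items() if now - t > self.timeout]
        for k in stale:
            del self._pending[k]
        return stale
```

Calling `discard_stale` from a scheduled job covers the "discard the 1st after a while" branch.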

Message bus: sender must wait for acknowledgements from multiple recipients

In our application the publisher creates a message and sends it to a topic.
It then needs to wait until all of the topic's subscribers ack the message.
It does not appear that the message bus implementations can do this automatically. So we are leaning towards making each subscriber send its own new message for the client when it is done.
Now, the client can receive all such messages and, when it has got one from each destination, do whatever clean-ups it has to do. But what if the client (sender) crashes part way through the stream of acknowledgments? To handle such a misfortune, I need to (re)implement on the client what the buses already implement: save the incoming acknowledgments until I get enough of them.
I don't believe our needs are that esoteric. How would you handle the situation where the sender (publisher) must wait for confirmations from multiple recipients (subscribers)? Sort of like requesting (and awaiting) Return-Receipts from each subscriber to a mailing list...
We are using RabbitMQ, if it matters. Thanks!
The functionality that you are looking for sounds like a messaging solution that can perform transactions across publishers and subscribers of a message. In the Java world, JMS specifies such transactions. One example of a JMS implementation is HornetQ.
RabbitMQ does not provide such functionality, and for good reasons. RabbitMQ is built to be extremely robust and to perform like hell at the same time. The transactional behavior that you describe is only achievable at the cost of a significant performance loss (especially if you want to keep outstanding robustness).
With RabbitMQ, one way to ensure that a message was consumed successfully is indeed to publish an answer message on the consumer side that is then consumed by the original publisher. This can be achieved through RabbitMQ's RPC (remote procedure call) pattern, which might help you get a clean solution for your problem setting.
If the (original) publisher crashes before all answers could be received, you can assume that all outstanding answers are still queued on the broker. So you would have to build your publisher in a way that it is capable of resuming the processing of those remaining messages. This might turn out to be non-trivial.
Finally, I recommend the following solution: Design your producing component in a way that you can consume the answers with one or more dedicated answer consumers that are separated from the origin publisher.
Benefits of this solution are:
the origin publisher can finish its task independent of consumer success
the origin publisher is independent of consumer availability and speed
the origin publisher implementation is far less complex
in a crash scenario, the answer consumer can resume with processing answers
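As one possible shape for such a dedicated answer consumer, the sketch below records each subscriber's answer durably so that processing can resume after a crash. It uses SQLite from the standard library purely for illustration; `AckStore`, the table layout and the subscriber names are assumptions, not anything RabbitMQ provides.

```python
import sqlite3


class AckStore:
    """Record subscriber acknowledgements durably, keyed by message id."""

    def __init__(self, path):
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS acks ("
            "message_id TEXT, subscriber TEXT, "
            "PRIMARY KEY (message_id, subscriber))"
        )
        self.db.commit()

    def record(self, message_id, subscriber):
        # INSERT OR IGNORE makes a redelivered answer harmless.
        with self.db:
            self.db.execute(
                "INSERT OR IGNORE INTO acks VALUES (?, ?)",
                (message_id, subscriber),
            )

    def is_complete(self, message_id, expected):
        """True once every expected subscriber has answered for this message."""
        rows = self.db.execute(
            "SELECT subscriber FROM acks WHERE message_id = ?", (message_id,)
        )
        return {r[0] for r in rows} >= set(expected)
```

The answer consumer would call `record` before acking the answer message to the broker, so an answer is either still queued or already stored.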
Now to a more general point: One of the major benefits of messaging is the decoupling of application components by the broker. In AMQP, this is achieved with exchanges and bindings that allow you to move message distribution logic from your application to a central point of configuration.
If you add RPC-style calls to your clients, then your components are most likely closely coupled again, meaning that the publishing component fails if one of the consuming components fails / is not available / too slow. This is exactly what you will want to avoid. Otherwise, why would you have split the components then?
My recommendation is that you design your application in a way that publishers can complete their tasks independent of the success of consumers wherever possible. Back-channels should be an exceptional case and be implemented in the loosely coupled way described above.

a completely decoupled OO system?

To make an OO system as decoupled as possible, I'm thinking of the following approach:
1) we run an RMI/directory like service where objects can register and discover each other. They talk to this service through an interface
2) we run a messaging service to which objects can publish messages, and register subscription callbacks. Again, this happens through interfaces
3) when object A wants to invoke a method on object B, it discovers the target object's unique identity through #1 above, and publishes a message on the message service for object B
4) message services invokes B's callback to give it the message
5) B processes the request and sends the response for A on message service
6) A's callback is called and it gets the response.
I feel this system is as decoupled as practically possible, but it has the following problems:
1) communication is typically asynchronous
2) hence it's non real time
3) the system as a whole is less efficient.
Are there any other practical problems where this design obviously won't be applicable? What are your thoughts on this design in general?
Books
Enterprise Integration Patterns
It appears he's talking about using a Message Oriented Middleware
Here are some things to consider
Security
What will prevent a rogue service from registering as a key component in your system? You will need a way to validate and verify that services are who they say they are. This can be done through a PKI system. There are scenarios where you might not need to do this, such as when your system is hosted entirely on your intranet. If that is the case, Social Engineering and Rogue Employees will be your biggest threat.
Contract
What kind of contract will your clients have with the services? Will messages all be serialized as XML and sent as a TextMessage? If you use a pure byte message you'll have to be careful about byte order if your services are to run on multiple platforms.
Synchronization
Most developers are not able to comprehend and utilize asynchronous messages correctly. Where possible it might be in the best interest of your design to create a way to invoke "synchronous" messages. I've done this in the past by creating a sendMessageAndWait() method with a timeout and a return object. Within the method you can create a temporary topic id to receive the response, register a listener for it, then use locks to wait for a message to be returned on your temporary topic.
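A minimal sketch of that `sendMessageAndWait()` idea, written in Python with a toy in-memory bus so it is self-contained. `InMemoryBus` and `send_message_and_wait` are hypothetical; a real message-oriented middleware would supply the temporary topic and listener registration.

```python
import threading
import uuid


class InMemoryBus:
    """Toy stand-in for a message bus: topic name -> list of listeners."""

    def __init__(self):
        self._listeners = {}
        self._lock = threading.Lock()

    def subscribe(self, topic, listener):
        with self._lock:
            self._listeners.setdefault(topic, []).append(listener)

    def unsubscribe(self, topic):
        with self._lock:
            self._listeners.pop(topic, None)

    def publish(self, topic, message, reply_to=None):
        with self._lock:
            listeners = list(self._listeners.get(topic, []))
        for listener in listeners:  # called outside the lock
            listener(message, reply_to)


def send_message_and_wait(bus, topic, message, timeout):
    """Publish `message` and block until a reply arrives or `timeout` elapses."""
    reply_topic = "reply-" + uuid.uuid4().hex  # temporary topic id
    done = threading.Event()
    result = []

    def on_reply(reply, _reply_to):
        result.append(reply)
        done.set()

    bus.subscribe(reply_topic, on_reply)
    try:
        bus.publish(topic, message, reply_to=reply_topic)
        if not done.wait(timeout):
            raise TimeoutError("no reply within %s seconds" % timeout)
        return result[0]
    finally:
        bus.unsubscribe(reply_topic)  # always clean up the temporary topic
```

The timeout matters: without it, a service that never replies would block the caller forever.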
Unsolicited Messages
What happens if you want to allow your service(s) to send unsolicited messages to your clients? A critical event happened in Service A and it must notify your clients or possibly a Watch Dog service. Allow for your design to register for a common communication channel for services to communicate with clients without clients initiating the conversation.
Failover
What happens if a critical service processing your credit cards goes down? You'll need to implement a Failover and Watch Dog service to ensure that your key infrastructure is always up and running. You could register a list of services within your registry; then your registry could give out the primary service, falling back to a secondary service if your primary stops communicating. Or if your Message Oriented Middleware can handle Round Robin messaging, you might be able to register all the services on the same topic. Think about creating a way to know when a service has died. Since most messages are asynchronous, it will be difficult to determine when a service has gone offline. This can be done with a Heartbeat and Watch Dog.
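The registry-with-heartbeat idea could be sketched like this. `ServiceRegistry`, the addresses and the `max_silence` threshold are hypothetical, and time is passed in explicitly so the example stays deterministic; a real watchdog would read a clock and run on a timer.

```python
class ServiceRegistry:
    """Track heartbeats and hand out the first instance that still looks alive."""

    def __init__(self, max_silence):
        self.max_silence = max_silence  # seconds without a heartbeat before "dead"
        self._instances = {}            # service name -> ordered list of addresses
        self._last_seen = {}            # address -> time of last heartbeat

    def register(self, name, address, now):
        """Registration order defines priority: primary first, then fallbacks."""
        self._instances.setdefault(name, []).append(address)
        self._last_seen[address] = now

    def heartbeat(self, address, now):
        self._last_seen[address] = now

    def is_alive(self, address, now):
        return now - self._last_seen[address] <= self.max_silence

    def lookup(self, name, now):
        """Return the primary if alive, else the next live fallback, else None."""
        for address in self._instances.get(name, []):
            if self.is_alive(address, now):
                return address
        return None
```

A `None` result is the point at which the watchdog would raise an alert, since no registered instance is responding.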
I've created this type of system a few times in my past for large systems that needed to communicate. If you and other developers understand the pros and cons of such a system it can be very powerful and flexible.
The biggest piece of advice I can give is to build a toolkit for your other developers so they don't have to think about how to register a service, or implement failover, or respond to messages from a client. These are the sorts of things that will kill your system and have others say it is too complicated. Making it painless for them will allow your system to work the way you need it with flexibility and decoupling while not burdening your developers with understanding enterprise design patterns.
This is not an Ivory Tower Architect/Architecture. It would be if he said, "This is how it will be done, now go do it and don't bother me about it because I know I'm right." If you really wanted to reference an Anti-Pattern it could be Kitchen Sink, maybe. Nah, now that I think about it, it isn't Kitchen Sink either.
If you can find one please post it as a comment.
Anti-Patterns
Coupling is simply a balance between efficiency and re-usability. If you wish the modules of your system to be as reusable as possible then that will undoubtedly come at a cost.
Personally I think it best to define some key assumptions which may tighten coupling, but bring increased efficiency.
There are design patterns which never see the light of day just because the benefit they provide is not worth the cost in complexity.
What's the simplest thing that could possibly work? Do modularize into reasonable size routines, but avoid interfaces, services, messages and all of this unless you are going to have multiple implementations or multiple hardware resources to divide a job.
Make it simple, then refactor those parts that turned out to matter.

What is an MQ and why do I want to use it?

On my team at work, we use the IBM MQ technology a lot for cross-application communication. Lately I've seen mentions on Hacker News and other places of other MQ technologies like RabbitMQ. I have a basic understanding of what it is (a commonly checked area to put and get messages), but what I want to know is: what exactly is it good at? How will I know where I want to use it and when? Why not just stick with more rudimentary forms of interprocess messaging?
All the explanations so far are accurate and to the point - but might be missing something: one of the main benefits of message queueing: resilience.
Imagine this: you need to communicate with two or three other systems. A common approach these days will be web services, which is fine if you need an answer right away.
However: web services can be down and not available - what do you do then? Putting your message into a message queue (which has a component on your machine/server, too) typically will work in this scenario - your message just doesn't get delivered and thus processed right now - but it will later on, when the other side of the service comes back online.
So in many cases, using message queues to connect disparate systems is a more reliable, more robust way of sending messages back and forth. It doesn't work well for everything (if you want to know the current stock price for MSFT, putting that request into a queue might not be the best of ideas) - but in lots of cases, like putting an order into your supplier's message queue, it works really well and can help ease some of the reliability issues with other technologies.
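The store-and-forward behavior described above can be illustrated with a small sketch. `Outbox` and its `send` callable are hypothetical, and the queue here lives only in memory; a real message queue would also persist pending messages to disk so they survive a restart.

```python
from collections import deque


class Outbox:
    """Queue messages locally and deliver them when the remote side is reachable."""

    def __init__(self, send):
        self._send = send        # callable that raises ConnectionError when down
        self._pending = deque()

    def put(self, message):
        """Accept the message immediately, then try to deliver."""
        self._pending.append(message)
        self.flush()

    def flush(self):
        """Deliver queued messages in order; stop at the first failure."""
        while self._pending:
            try:
                self._send(self._pending[0])
            except ConnectionError:
                return False         # remote still down; keep the message queued
            self._pending.popleft()  # remove only after a successful send
        return True

    def __len__(self):
        return len(self._pending)
```

The sender's call to `put` succeeds whether or not the other side is up, which is the resilience benefit in a nutshell.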
MQ stands for messaging queue.
It's an abstraction layer that allows multiple processes (likely on different machines) to communicate via various models (e.g., point-to-point, publish subscribe, etc.). Depending on the implementation, it can be configured for things like guaranteed reliability, error reporting, security, discovery, performance, etc.
You can do all this manually with sockets, but it's very difficult.
For example: Suppose you want two processes to communicate, but one of them can die in the middle and later get reconnected. How would you ensure that interim messages were not lost? MQ solutions can do that for you.
Message queueing systems are supposed to give you several bonuses. Among the most important ones are monitoring and transactional behavior.
Transactional design is important if you want to be immune to failures, such as power failure. Imagine that you want to notify a bank system of ATM money withdrawal, and it has to be done exactly once per request, no matter what servers failed temporarily in the middle. MQ systems would allow you to coordinate transactions across multiple database, MQ and other systems.
Needless to say, such systems are very slow compared to named pipes, TCP or other non-transactional tools. If high performance is required, you would not allow your messages to be written to disk. Instead, you would complicate your design to achieve communication that is both reliable AND fast, which pushes the designer into really non-trivial tricks.
MQ systems normally allow users to watch the queue contents, write plugins, clear queues, etc.
MQ simply stands for Message Queue.
You would use one when you need to reliably send an inter-process/cross-platform/cross-application message that isn't time dependent.
The Message Queue receives the message, places it in the proper queue, and waits for the application to retrieve the message when ready.
reference: web services can be down and not available - what do you do then?
As an extension to that: what if your local network and your local PC are down as well? While you wait for the system to recover, the dependent systems deployed elsewhere that are waiting for that data need to see an alternative data stream.
Otherwise, that might not be good enough 'real time' response for today's and very soon in the future Internet of Things (IOT) requirements.
If you want true parallel, non-volatile storage of various FIFO streams (at least at some point along the signal chain), use an FPGA and FRAM memory. FRAM runs at clock speed, and FPGA devices can be reprogrammed on the fly, adding and taking away however many independent parallel data streams are needed (within established constraints, of course).