I've got a JMS messaging system implemented with two queues. One is used as a standard queue second is an error queue.
This system was implemented to handle database concurrency in my application. Basically, there are users and users have assets. One user can interact with another user and as a result of this interaction their assets can change. One user can interact with single user at once, so they cannot start another interaction before the first one finishes. However, one user can be in interaction with other users multiple times [as long as they started the interaction].
What I did was: crated an "interaction registry" in redis, where I store the ID of users who begin an interaction. During interaction I gather all changes that should be made to the second user's assets, and after interaction is finished I send those changes to the queue [user who has started the interaction is saved within the original transaction]. After the interaction is finished I clear the ID from registry in redis.
Listener of my queue will receive a message with information about changes to the user that need to be done. Listener will get all objects which require a change from the database and update it. Listener will check before each update if there is an interaction started by the user being updated. If there is - listener will rollback the transaction and put the message back on the queue. However, if there's something else wrong, message will be put on to the error queue and will be retried several times before it is logged and marked as failed. Phew.
Now I'm at the point where I need to create a proper integration test, so that I make sure no future changes will screw this up.
Positive testing is easy, unfortunately I have to test scenarios, where during updates there's an OptimisticLockFailureException, my own UserInteractingException & some other exceptions [catch (Exception e) that is].
I can simulate my UserInteractingException by creating a payload with hundreds of objects to be updated by the listener and changing one of it in the test. Same thing with OptimisticLockFailureException. But I have no idea how to simulate something else [I can't even think of what could it be].
Also, this testing scenario based on a fluke [well, chance that presented scenario will not trigger an error is very low] is not something I like. I would like to have something more concrete.
Is there any other, good, way to test this scenarios?
Thanks
I did as I described in the original question and it seems to work fine.
Any delays I can test with camel.
Related
I am currently developing a Microservice that is interacting with other microservices.
The problem now is that those interactions are really time-consuming. I already implemented concurrent calls via Uni and uses caching where useful. Now I still have some calls that still need some seconds in order to respond and now I thought of another thing, which I could do, in order to improve the performance:
Is it possible to send a response before the sucessfull persistence of data? I send requests to the other microservices where they have to persist the results of my methods. Can I already send the user the result in a first response and make a second response if the persistence process was sucessfull?
With that, the front-end could already begin working even though my API is not 100% finished.
I saw that there is a possible status-code 207 but it's rather used with streams where someone wants to split large files. Is there another possibility? Thanks in advance.
"Is it possible to send a response before the sucessfull persistence of data? Can I already send the user the result in a first response and make a second response if the persistence process was sucessfull? With that, the front-end could already begin working even though my API is not 100% finished."
You can and should, but it is a philosophy change in your API and possibly you have to consider some edge cases and techniques to deal with them.
In case of a long running API call, you can issue an "ack" response, a traditional 200 one, only the answer would just mean the operation is asynchronous and will complete in the future, something like { id:49584958, apicall:"create", status:"queued", result:true }
Then you can
poll your API with the returned ID to see if the operation that is still ongoing, has succeeded or failed.
have a SSE channel (realtime server side events) where your server can issue status messages as pending operations finish
maybe using persistent connections and keepalives, or flushing the response in the middle, you can achieve what you point out, ie. like a segmented response. I am not familiar with that approach as I normally go for the suggesions above.
But in any case, edge cases apply exactly the same: For example, what happens if then through your API a user issues calls dependent on the success of an ongoing or not even started previous command? like for example, get information about something still being persisted?
You will have to deal with these situations with mechanisms like:
Reject related operations until pending call is resolved "server side": Api could return ie. a BUSY error informing that operations are still ongoing when you want to, for example, delete something that still is being created.
Queue all operations so the server executes all them sequentially.
Allow some simulatenous operations if you find they will not collide (ie. create 2 unrelated items)
I have a requirement to design a notification system for multi-user(~1000 users) application, here are the high level requirements.
System event gets triggered on specific operations.
On event trigger, individual notification for all(or sometimes only for relevant) users gets generated and stored in database.
While user logs in, all unread notifications for him will be pulled and displayed in ui.
While user reads the notification, we capture the read status.
A scheduler in background evicts all the stale notifications.
This seems like a very typical use case and straight forward to implement with the database.
But my doubt is, is there any way we can replace the Database with the Queue based messaging system? The reason I think this way is because, the use case I have seems like asynchronous in nature(like events, notifications and timely eviction of messages).
While I replace the Database with Queues, the first 2 points from above fits well, but on later part I have some doubts -
In General, are queues flexible to store and query notifications based on user ids ?
Consider this scenario - Notifications gets generated and stored in the queue, and the user is not logged in, what is the best way to handle consumer messages.
a. Should the consumer constantly listen for the messages ?, If so should the messages be stored in application memory(does not seems to be good option) ?
b. Or the consumers should be created for each users dynamically on user login? Is this a regular pattern ?
Any other recommended ways ?
Thanks
Your use-case is suited to a database, not a message queue. While conceptually similar to the use case, a message queue is intended for extremely short-duration storage (i.e. to buffer data moving between running processes). Since you have no control over when users log in, these notifications will potentially be stored for minutes, hours, maybe even weeks. You need a persistent storage mechanism.
In some exceptional situations I need somehow to tell consumer on receiving point that some messages shouldn’t be processed. Otherwise two systems will become out-of-sync (we deal with some outdates external systems, and if, for example, connection is dropped we have to discard all queued operations in scope of that connection).
Take a risk and resolve problem messages manually? Compensation actions (that could be tough to support in my case)? Anything else?
There are a few ways:
You can set a time-to-live when sending a message: await endpoint.Send(myMessage, c => c.TimeToLive = TimeSpan.FromHours(1));, but this will apply to all messages that are sent (or published) like this. I would consider this, after looking at your requirements. This is technical, but it is a proper messaging pattern.
Make TTL and generation timestamp properties of your message itself and let the consumer decide if the message is still worth processing. This is more business and, probably, the most correct way.
Combine tech and business - keep the timestamp and TTL in message headers so they don't pollute your message contracts, and filter them out using a custom middleware. In this case, you need to be careful to log such drops so you won't be left wonder why messages disappear now and then.
Almost any unreliable integration can be monitored using sagas, with timeouts. For example, we use a saga to integrate with Twilio. Since we have no ability to open a webhook for them, we poll after some interval to check the message status. You can start a saga when you get a message and schedule a message to check if the processing is still waiting. As discussed in comments, you can either use the "human intervention required" way to fix the issue or let the saga decide to drop the message.
A similar way could be to use a lookup table, where you put the list of messages that aren't relevant for processing. Such a table would be similar to the list of sagas. It seems that this way would also require scheduling. Both here, and for the saga, I'd recommend using a separate receive endpoint (a queue) for the DropIt message, with only one consumer. It would prevent DropIt messages from getting stuck behind the integration messages that are waiting to be processed (and some should be already dropped)
Use RMQ management API to remove messages from the queue. This is the worst method, I won't recommend it.
From what I understand, you're building a system that sends messages to 3rd party systems. In other words, systems you don't control. It has an API but compensating actions aren't always possible, because the API doesn't provide it or because actions are performed inside the 3rd party system that can't be compensated or rolled back?
If possible try to solve this via sagas. Make sure the saga executes the different steps (the sending of messages) in the right order. So that messages that cannot be compensated are sent last. This way message that can be compensated if they fail, will be compensated by the saga. The ones that cannot be compensated should be sent last, when you're as sure as possible that they don't have to be compensated. Because that last message is the last step in synchronizing all systems.
All in all this is one of the problems with distributed systems, keeping everything in sync. Compensating actions is the way to deal with this. If compensating actions aren't possible, you're in a very difficult situation. Try to see if the business can help by becoming more flexible and accepting that you need to compensate things, where they'll tell you it's not possible.
In some exceptional situations I need somehow to tell consumer on receiving point that some messages shouldn’t be processed.
Can't you revert this into:
Tell the consumer that an earlier message can be processed.
This way you can easily turn this in a state machine (like a saga) that acts on two messages. If the 2nd message never arrives then you can discard the 1st after a while or do something else.
The strategy here is to halt/wait until certain that no actions need to be reverted.
I'm struggling to understand how to implement Eventual Consistency with the exposed example of BacklogItems and Tasks from Vaughn Vernon. The statement I've understood so far is (considering the case where he splits BacklogItem and Task into separate aggregate roots):
A BacklogItem can contain one or more tasks. When all remaining hours from a the tasks of a BacklogItem are 0, the status of the BacklogItem should change to "DONE"
I'm aware about the rule that says that you should not update two aggregate roots in the same transaction, and that you should accomplish that with eventual consistency.
Once a Domain Service updates the amount of hours of a Task, a TaskRemainingHoursUpdated event should be published to a DomainEventPublisher which lives in the same thread as the executing code. And here it is where I'm at a loss with the following questions:
I suppose that there should be a subscriber (also living in the same thread I guess) that should react to TaskRemainingHoursUpdated events. At which point in your Desktop/Web application you perform this subscription to the Bus? At the very initialization of your app? In the application code? Is there any reasoning to place domain subscriptors in a specific place?
Should that subscriptor (in the same thread) call a BacklogItem repository and perform the update? (But that would be a violation of the rule of not updating two aggregates in the same transaction since this would happen synchronously, right?).
If you want to achieve eventual consistency to fulfil the previously mentioned rule, do I really need a Message Broker like RabbitMQ even though both BacklogItem and Task live inside the same Bounded Context?
If I use this message broker, should I have a background thread or something that just consumes events from a RabbitMQ queue and then dispatches the event to update the product?
I'd appreciate if someone can shed some clear light over this since it is quite complex to picture in its completeness.
So to start with, you need to recognize that, if the BacklogItem is the authority for whether or not it is "Done", then it needs to have all of the information to compute that for itself.
So somewhere within the BacklogItem is data that is tracking which Tasks it knows about, and the known state of those tasks. In other words, the BacklogItem has a stale copy of information about the task.
That's the "eventually consistent" bit; we're trying to arrange the system so that the cached copy of the data in the BacklogItem boundary includes the new changes to the task state.
That in turn means we need to send a command to the BacklogItem advising it of the changes to the task.
From the point of view of the backlog item, we don't really care where the command comes from. We could, for example, make it a manual process "After you complete the task, click this button here to inform the backlog item".
But for the sanity of our users, we're more likely to arrange an event handler to be running: when you see the output from the task, forward it to the corresponding backlog item.
At which point in your Desktop/Web application you perform this subscription to the Bus? At the very initialization of your app?
That seems pretty reasonable.
Should that subscriptor (in the same thread) call a BacklogItem repository and perform the update? (But that would be a violation of the rule of not updating two aggregates in the same transaction since this would happen synchronously, right?).
Same thread and same transaction are not necessarily coincident. It can all be coordinated in the same thread; but it probably makes more sense to let the consequences happen in the background. At their core, events and commands are just messages - write the message, put it into an inbox, and let the next thread worry about processing.
If you want to achieve eventual consistency to fulfil the previously mentioned rule, do I really need a Message Broker like RabbitMQ even though both BacklogItem and Task live inside the same Bounded Context?
No; the mechanics of the plumbing matter not at all.
I have a WF4 Service with a flowchart as the root activity. It contains multiple correlated receive activites and decision branching to step through an approval process. The receive activities work perfectly until I try and use one as the trigger for a pick branch.
I am running tracking so can see that the receive is opened and in the persistance I can see the associated bookmark. When I send a client message with the receive type it does not trigger. I have a delay pick branch that fires OK but then the subsequent receive also does not work.
I have checked these receive activities individually and they work OK when not used as the pick trigger. I have tried the pick within a Sequence and a While but no difference.
I cannot see any difference between my implementation and may examples on the web. Am I missing something extra required when the receive is encapsulated by a pick branch?
There is nothing special about a PickBranch trigger that would cause a receive to behave differently so I suspect it is something with the Receive itself. What kind of errors are you seeing at the client application?