Practical examples of how correlation id is used in messaging?

Practical examples of how correlation id is used in messaging? - rabbitmq

Can anyone give me examples of how in production a correlation id can be used?
I have read it is used in request/response type messages but I don't understand where I would use it?
One example (which maybe wrong) I can think off is in a publish subscribe scenario where I could have 5 subscribers and if I get 5 replies with the same correlation id then I could say all my subscribers have received it. Not sure if this would the be correct usage of it.
Or if I send a simple message, the I can use the correlation to guarantee that the client received it.
Any other examples?

A web application that is providing HTTP API for outsiders for performing a processing task and you want to give the results for the caller as a response to the HTTP request they made.
A request comes in, message describing the task is pushed to queue by the frontend server. After that the frontend server blocks to wait for response message with the same correlation id. A pool of worker machines are listening on queue and one of them picks up the task, performs it and returns the result as message. Once a message with right correlation id comes in, frontend server continues to return the response to the caller.

In the context of CQRS and EventSourcing a command message correlation id will most likely get stored togehter with the corresponding events from the domain. This information can later be used to form an audit trail.

Streaming engines like Apache Flink use correlation ids, much like you said, to guarantee exactness of processing.

Related

Can RabbitMQ (or similar message queuing system) be used to single thread requests per user?

The issue is we have some modern web applications that are integrated with a legacy system that was never designed to support multiple concurrent requests from a single user. Basically there are certain types of requests that the legacy system can only handle one-at-a-time from a single user. It can handle multiple concurrent requests coming from different users, but for technical reasons cannot handle multiple from a single user. In these situations, the user's first request will complete successfully, but any subsequent requests from that same user that come in while the first request is still executing will fail.
Because our apps are ajax enabled, multi-tab/multi-browser friendly, and just the fact that there are multiple apps - there are certain scenarios where a user could wind up having more than one of these types of requests being sent to the legacy system at the same time.
I'm trying to determine if something like RabbitMQ could be positioned in front of the legacy system and leveraged to single-thread requests per user/IP. The thinking being that the web apps would send all requests to MQ, and they'd stack into per-user queues and pass on to the legacy system one at a time.
I don't know if there would be concerns about the potential number of queues this could create - we have a user-base of approx 4,000.
And I know we could somewhat address this in the web apps individually, but since there are multiple apps it'd be duplicating logic across them, and you'd still have the potential for two different apps to fire off concurrent requests.
Any feedback would be appreciated. Thanks-

I'm not sure a unique queue per user will work as you would need to have a backend worker process listening for messages on that queue that would need to be dynamically created.
Below is one option but it does have a performance bottleneck potential as a single backend process would be handling all requests sequentially. You could use multiple worker processes but you wouldn't know if one had completed before the other causing a race condition if your app requires a specific sequence of actions.
You could simply put all transactions (from all users) into a single queue and have a backend process pull off of that queue and service the request. If there needs to be a response back to the user once the request was serviced, then the worker process could respond back to a separate queue with a correlationID that could be used to send the response date back to the correct user.
I've done this before with ExpressJS apps where the following flow would happen:
The user/process/ajax makes a request
Express takes the payload from the request object and sends it to a RabbitMQ queue with a unique correlationId (e.g. UUID).
Express then takes the response object and stores it in a responseStore object with the key being the correlationId
Meanwhile, a backend worker process pulls the item from the queue, does some work and then sends a message to a different response queue with the same correlationId
The ExpressJS application has a connection to the response queue and when it receives a message, it takes the correlationId from the response and looks for a response object stored with same correlationId in the responseStore. If it finds it, it takes the payload from the message and does something like response.send(payload) or response.json(payload)
To do this, you should also have a mechanism that stores the creation time of the response object in the responseStore along with the response object. Then have a separate process that will check the responseStore and clean up old response objects after a certain timeout in case there are issues with the backend process completing.
Look here for more info on RPC with RabbitMQ:
https://www.rabbitmq.com/tutorials/tutorial-six-javascript.html
Hope this helps.

Camel route "to" specific websocket endpoint

I have some camel routes with mina sockets and jetty websockets. I am able to broadcast a message to all the clients connected to the websocket but how do i send a message to a specific endpoint. How do i maintain a list of all connected clients with a client id as reference so i can route to a specific client. Is that possible? Will i be able to mention a dynamic client in the to URI?
Or maybe i am thinking about this wrong and i need to create topics on active mq and have the clients subscribe to it. That would mean that i create a topic for every websocket client? and route the message to the right topic.
Am i atleast on the right track here, any examples you can point out? Google was not helpful.

The approach you take depends on how sensitive the client information is. The downside of a single topic with selectors is that anyone can subscribe to the topic without a selector and see all the information for everyone - not usually something that you want to do.
A better scheme is to use a message distribution mechanism (set of Camel routes) that act as an intermediary between the websocket clients and the system producing the messages. This mechanism is responsible for distributing messages from a single destination to client-specitic destinations. I have worked on a couple of banking web front-ends that used a similar scheme.
In order for this to work you first generate for each user a distinct token/UUID; this is presented to the user when the session is established (usually through some sort of profile query/message).
It's essential that the UUID can be worked out as a hash of the clientId rather than being stored in a DB, as it will be used all the time and you want to make sure this is worked out quickly.
The user then uses that information to connect to specific topics that use that UUID as a suffix. For example two users subscribing to an orderConfirmation topic would each subscribe to their own version of that topic:
clientA -> orderConfirmation.71jqsd87162iuhw78162wd7168
clientB -> orderConfirmation.76232hdwe7r23j92irjh291e0d
To keep track of "presence", your clients would need to periodically send a heartbeat message containing their clientId to a well-known topic that your distribution mechanism listens on. Clients should not be able to subscribe to this topic for reads (see ActiveMQ Security). The message distribution mechanism needs to keep in memory a data structure that contains the clientId and the time a heartbeat was last seen.
When a message is received by the distribution mechanism, it checks whether the clientID for which it received the message has a "live/present" session, determines the UUID for the client, and broadcasts the message on the appropriate topic.
Over time this will create a large number of topics on your broker that you don't want hanging around when the user has gone away. You can configure ActiveMQ to delete these if they have been inactive for some time.

You definitely do not want to create separate endpoint for each client.
Topic and a subscription with selector is an elegant way to resolve it.
I would say the best one.
You need single topic, which every client would subscribe to with the selector looking like where clientId in ('${myClientId}', 'EVERYONE'). Now when you want to publish a message to specific client, you set a property clientId to the id of this client. If you want to broadcast, you set it to 'EVERYONE'
I hope I understand the problem right...

NServiceBus message types and thought process

In our scenario I'm thinking of using the pub sub technique. However I don't know which is the better option.
1 ########
A web service of ours will publish a message that something has happened when it is called externally, ExternalPersonCreatedMessage!
This message will contain a field that represents the destinations to process the message into (multiple allowed).
Various subscribers will subscribe. These subscribers will filter the message to see if any action is required by checking the destination field.
2 ########
A web service of ours will parse the incoming call and publish specific types of messages depending on the destinations supplied in the field. i.e. many Destination[n]PersonCreatedMessage messages would be created.
Subscribers will subscribe to only the specific message they care for. i.e. not having to filter any messages
QUESTIONS
Which of the above is the better option and why? And how do I stop myself from making RequestMessages. From what I've read/seen I should be trying to structure this in a way of PersonCreated, PersonDeleted i.e. SOMETHING HAS HAPPENED and NOT in the REQUEST SOMETHING TO HAPPEN form such as CreatePerson or DeletePerson
Are my thoughts correct? I've been looking for guidance on how to structure messages and making sure I don't go down a wrong path but have found no guidance out there on do's and dont's. Can any one help and guide? I want to try and get this correct from the off :)

Based on the integration scenario in the referenced article, it appears to me that you may need a Saga to complete the workflow of accept message -> operate on message -> send confirmation. In the case that the confirmation is sent immediately after the operation, you could use NSBs message handler pipeline feature which allows you to chain handlers in a specified sequence such as...
First<FilterHandler>.Then<DoWorkHandler>().AndThen<SendConfirmationHandler>();
In terms of the content filtering, you can do this although you incur some transport overhead, meaning the queue will have to accept the message and the process will always call the first handler on every message(you can short-circuit the above pipeline at any point). It may be the case that what you really want is a Distributor/Worker setup where all Workers are the same and you can handle some load.
If you truly have different endpoints with completely different logic, then I would have the Publisher process(only accepts and Publishes message) do the work of translating the inbound message to something else a Subscriber can then be interested in. If then you find that a given Published message only ever has 1 Subscriber, then you don't need to Publish at all, you need to just Bus.Send() to the correct endpoint.

The way NServiceBus handles pub-sub is more like your option two.
A publisher service has an input queue and a subscription store.
A subscriber service has an input queue
The subscriber, on start-up will send a subscription message to the input queue of the publisher
The subscription message contains the type of message subscriber is interested in and the subscribers queue address
The publisher records the subscription in the subscription store.
The publisher receives a message.
The publisher evaluates the message type against the list of subscriptions
For each match found the publisher sends the message to the queue address.
In my opinion, you should stop thinking about destinations. Messages are messages. They should not have any inherent destination information in them. The subscription mechanism defines the addressing/routing requirements for the solution.

What is the best way to route NServiceBus messages to specific clients?

Let's say I have a ClientRequestMessage message that contains a request for a specific Client. A web application will generate these requests and they need to be sent to the correct Client for handling. I can think of a few options for this.
I could have a single queue that all messages go to and specific client handlers check a property (like ClientId) to decide whether they care about it. This feels wrong on many levels to me though.
I could publish a message to all of the clients and they could decide whether or not they care about it during handling. This seems like too much traffic and wastes each client's time handling messages they shouldn't care about in the first place though.
I could have client specific queues that these messages get routed too. This one feels the best to me, but I am unsure of how to do it. I'd like to keep it simple and avoid client specific message types, but I am not sure how to tell NServiceBus "for client A send it to client A's queue and for client B send it to client B's queue".
So my question is, what is the best (most efficient? easiest to manage?) way to set this up? I am pretty sure I need to use the distributor, but not positive so thought I would ask.
BONUS QUESTION:
Let's say each client has multiple handlers. How can I make sure only one of them handles a given message? Would I need a distributor per client?

If what you really want is the solution that allows you to have just a single message where you can place a specific filter on the message based on clientId and only route the message to the client when it relates to them then I would use PServiceBus(pservicebus.codeplex.com). It will make it easier for you specific a set of subscriptions for each of your client where their messages are all filtered by clientId into a specific queue or what transport you have available. The below example shows filtering a ChatTopic by the UserName Property and the subscriber only receives the message at the specified transport when the message been published UserName property is not TJ. You are also allowed to use complex filter where you do thing such as GreaterThan("MyComplexProperty.Blah.ID", 5)
Subscriber.New("MyUserName").Durable(false)
.SubscribeTo(Topic.Select<ChatTopic>().NotEqual("UserName", "TJ"))
.AddTransport("Tcp",
Transport.New<TcpTransport>(
transport => {
transport.Format = TransportFormat.Json;
transport.IPAddress = "127.0.0.1";
transport.Port = port;
}), "ChatTopic")
.Save();

You can tell NSB where to put messages by using the MessageEndpointMappings configuration section. You can map a specific message type or a whole assembly to a queue. If you don't want to create specific message types and map them, then I would recommend the publish approach. The overhead of removing a message from the queue is pretty minimal.
If your "client" has many instances of NSB to pick up messages then you will need to use a Distributor. Check out the distributed Pub/Sub documentation.

REST, WCF and Queues

I created a RESTful service using WCF which calculates some value and then returns a response to the client.
I am expecting a lot of traffic so I am not sure whether I need to manually implement queues or it is not neccessary in order to process all client requests.
Actually I am receiving measurements from clients which have to be stored to the database - each client sends a measurement every 200 ms so if there are a multiple clients there could be a lot of requests.
And the other operation performed on received data. For example a client could send an instruction "give me the average of the last 200 measurements" so it could take some time to calculate this value and in the meantime the same request could come from another client.
I would be very thankful if anyone could give any advice on how to create a reliable service using WCF.
Thanks!

You could use the MsmqBinding and utilize the method implemented by eedsi9n. However, from what I'm gathering from this post is that you're looking for something along the lines of a pub/sub type of architecture.
This can be implemented with the WSDualHttpBinding which allows subscribers to subscribe to events. The publisher will then notify the user when the action is completed.
Therefore you could have Msmq running behind the scenes. The client subscribes to the certain events, then perhaps it publishes a message that needs to be processed. THe client sits there and does work (because its all async) and when the publisher is done working on th message it can publish an event (The event your client subscribed to) letting you know that its done. That way you don't have to implement a polling strategy.
There are pre-canned solutions for this as well. Such as NService Bus, Mass Transit, and Rhino Bus.

If you are using Web Service, Transmission Control Protocol (TCP/IP) will act as the queue to a certain degree.
TCP provides reliable, ordered
delivery of a stream of bytes from one
program on one computer to another
program on another computer.
This guarantees that if client sends packet A, B, then C, the server will received it in that order: A, B, then C. If you must reply back to the client in the same order as request, then you might need a queue.
By default maximum ASP.NET worker thread is set to 12 threads per CPU core. So on a dual core machine, you can run 24 connections at a time. Depending on how long the calculation takes and what you mean by "a lot of traffic" you could try different strategies.
The simplest one is to use serviceTimeouts and serviceThrottling and only handle what you can handle, and reject the ones you can't.
If that's not an option, increase hardware. That's the second option.
Finally you could make the service completely asynchronous. Implement two methods
string PostCalc(...) and double GetCalc(string id). PostCalc accepts the parameters, stuff them into a queue (or a database) and returns a GUID immediately (I like using string instead of Guid). The client can use the returned GUID as a claim ticket and call GetCalc(string id) every few seconds, if the calculation has not finished yet, you can return 404 for REST. Calculation must now be done by a separate process that monitors the queue.
The third option is the most complicated, but the outcome is similar to that of the first option of putting cap on incoming request.

It will depend on what you mean by "calculates some value" and "a lot of traffic". You could do some load testing and see how the #requests/second evolves with the traffic.

There's nothing WCF specific here if you are RESTful
the GET for an Average would give a URI where the answer would wait once the server finish calculating (if it is indeed a long operation)
Regarding getting the measurements - you didn't specify the freshness needed (i.e. when you get a request for an average - how fresh do you need the results to be) Also you did not specify the relative frequency of queries vs. new measurements
In any event you can (and IMHO should) use the queue (assuming measuring your performance proves it) behind the endpoint. If you change the WCF binding you might still be RESTful but will not benefit from the standard based approach of REST over HTTP

We Keep Coding

sql objective-c vba vb.net react-native apache vue.js tensorflow api pandas