In our scenario I'm thinking of using the pub sub technique. However I don't know which is the better option.
Option 1
When one of our web services is called externally, it will publish a message, ExternalPersonCreatedMessage, saying that something has happened.
This message will contain a field that represents the destinations to process the message into (multiple allowed).
Various subscribers will subscribe. These subscribers will filter the message to see if any action is required by checking the destination field.
Option 2
A web service of ours will parse the incoming call and publish specific types of messages depending on the destinations supplied in the field, i.e. many Destination[n]PersonCreatedMessage messages would be created.
Subscribers will subscribe only to the specific message they care about, i.e. they do not have to filter any messages.
QUESTIONS
Which of the above is the better option, and why? And how do I stop myself from making request messages? From what I've read and seen, I should structure messages as PersonCreated or PersonDeleted, i.e. in the "something has happened" form, and not in the "request something to happen" form such as CreatePerson or DeletePerson.
Are my thoughts correct? I've been looking for guidance on how to structure messages so that I don't go down the wrong path, but I have found no guidance on the dos and don'ts. Can anyone help and guide me? I want to get this right from the off :)
Based on the integration scenario in the referenced article, it appears to me that you may need a Saga to complete the workflow of accept message -> operate on message -> send confirmation. If the confirmation is sent immediately after the operation, you could use NServiceBus's message handler pipeline feature, which allows you to chain handlers in a specified sequence, such as:
First<FilterHandler>.Then<DoWorkHandler>().AndThen<SendConfirmationHandler>();
In terms of the content filtering, you can do this, although you incur some transport overhead: the queue has to accept every message, and the process will always call the first handler on every message (you can short-circuit the above pipeline at any point). It may be that what you really want is a Distributor/Worker setup, where all Workers are the same and you can handle some load.
If you truly have different endpoints with completely different logic, then I would have the Publisher process (which only accepts and publishes messages) do the work of translating the inbound message into something a Subscriber can then be interested in. If you then find that a given published message only ever has one Subscriber, you don't need to Publish at all; you just need to Bus.Send() to the correct endpoint.
The way NServiceBus handles pub-sub is more like your option two.
A publisher service has an input queue and a subscription store.
A subscriber service has an input queue.
The subscriber, on start-up, will send a subscription message to the input queue of the publisher.
The subscription message contains the type of message the subscriber is interested in and the subscriber's queue address.
The publisher records the subscription in the subscription store.
The publisher receives a message.
The publisher evaluates the message type against the list of subscriptions.
For each match found the publisher sends the message to the queue address.
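The steps above can be sketched as a small in-memory model. This is not NServiceBus code: it is a Python illustration that assumes plain deques stand in for queues, and the names Publisher, billing_queue, and audit_queue are hypothetical.

```python
from collections import deque


class Publisher:
    """Toy publisher with a subscription store keyed by message type."""

    def __init__(self):
        # Subscription store: message type name -> list of subscriber queues.
        self.subscriptions = {}

    def handle_subscription(self, message_type, subscriber_queue):
        # Record which queue address wants which message type.
        self.subscriptions.setdefault(message_type, []).append(subscriber_queue)

    def publish(self, message_type, body):
        # Match the message type against the store and send a copy to each match.
        for queue in self.subscriptions.get(message_type, []):
            queue.append((message_type, body))


# Hypothetical subscriber input queues.
billing_queue = deque()
audit_queue = deque()

publisher = Publisher()
publisher.handle_subscription("PersonCreated", billing_queue)
publisher.handle_subscription("PersonCreated", audit_queue)
publisher.handle_subscription("PersonDeleted", audit_queue)

publisher.publish("PersonCreated", {"id": 1})
publisher.publish("PersonDeleted", {"id": 1})
```

Note that the message bodies carry no destination field at all; the routing lives entirely in the subscription store.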
In my opinion, you should stop thinking about destinations. Messages are messages. They should not have any inherent destination information in them. The subscription mechanism defines the addressing/routing requirements for the solution.
As I have been able to verify, in MassTransit with Azure Service Bus, each type of object consumed by a "Consumer" generates a Topic for that type regardless of whether it is only consumed in a specific "receive endpoint" (queue). When sending a message of this type with the "Send()" method, the message is sent directly to the "receive endpoint" (queue) without going through the topic. If this same message is published with the "Publish()" method, it is published in the Topic, and is forwarded to the receive endpoint (queue) from the corresponding subscriber.
My application uses a CQRS pattern where the messages are divided into commands and events. Commands use the send-receive pattern and are therefore always dispatched in MassTransit with the "Send()" method. The events, however, are based on the publish-subscribe pattern, and therefore are always dispatched in MassTransit with the "Publish()" method. As a result, a large number of topics are created on the bus that are never used (one for each type of command), since the messages belonging to these topics are sent directly to the receiver's queue.
For all these reasons, the question I ask is whether it is possible to configure MassTransit so that it does not automatically create the topics of some types of messages consumed because they will only be sent using the "Send()" method? Does this make sense in MassTransit or is it not possible/recommended?
Thank you!
Regards
Edited 16/04/2021
After doing some testing, I am editing this question to clarify the intention: to configure MassTransit so that it does not automatically create the topics for some of the message types consumed, where all of them are received on the same receive endpoint. That is, I want to configure (dynamically if possible, based on the object type) which consumed message types create a topic and which do not, within the same receive endpoint. Imagine a receive endpoint (a queue) associated with a service that can consume both commands and events. Since commands are only dispatched through Send(), no topic needs to be created for them. Events, however, are dispatched via Publish(), so their topic (and its subscribers) must exist for the message to be delivered and consumed.
Thanks in advance
Yes, for a receive endpoint hosting a consumer that will only receive Sent messages, you can specify ConfigureConsumeTopology = false for that receive endpoint. You can do that via a ConsumerDefinition, or when configuring the receive endpoint directly.
UPDATE
It is also possible to disable topology configuration per message type using an attribute on the message contract:
[ConfigureConsumeTopology(false)]
public interface SomeCommand
{
}
This will prevent the topic/exchange from being created and bound to the receive endpoint.
While I can understand the desire to be "pure to the CQRS mantra" and only Send commands, I'd suggest you read this answer and take it into consideration before overburdening your developers with knowing every single endpoint in the system by name...
Requirement
A system undergoes some state change, and multiple other parts of the system (let's call them observers) have to know about it so that they can perform some actions based on the current state. The actions of the observers are important: if some observers are not online (not listening at the moment because of some trouble, but back soon), the message should not be discarded until all the observers have received it.
I am trying to accomplish this with a pub/sub model. Here are my findings (please correct me if this understanding is wrong):
The publisher creates an event on a specific topic, and multiple subscribers can consume the same message. This model either provides no delivery guarantee (in Redis), or delivery is guaranteed once (with message queues), i.e. when one of the consumers acknowledges a message, the message is discarded (RabbitMQ).
Example
A new Person Profile entity gets created in DB
Now,
A background verification service has to know this to trigger the verification process.
Subscriptions service has to know this to add default subscriptions to the user.
Now both the tasks are important, unrelated and can run in parallel.
Now, in the queue model, if the subscription service is down for some reason and the background verification process acknowledges the message, the message will be removed from the queue. Or, if it is fire-and-forget like most pub/sub, delivery is not guaranteed for either service anyway.
One more point is both the tasks are unrelated and need not be triggered one after other.
In short, I need to make sure all the consumers get the same message and can acknowledge it individually; the message should be evicted only after all the consumers have acknowledged it. Neither of the above approaches does this.
Is there anything I am missing here? How should I approach this problem?
This scenario is explicitly supported by RabbitMQ's model, which separates "exchanges" from "queues":
A publisher always sends a message to an "exchange", which is just a stateless routing address; it doesn't need to know what queue(s) the message should end up in
A consumer always reads messages from a "queue", which contains its own copy of messages, regardless of where they originated
Multiple consumers can subscribe to the same queue, and each message will be delivered to exactly one consumer
Crucially, an exchange can route the same message to multiple queues, and each will receive a copy of the message
The key thing to understand here is that while we talk about consumers "subscribing" to a queue, the "subscription" part of a "pub-sub" setup is actually the routing from the exchange to the queue.
So a RabbitMQ pub-sub system might look like this:
A new Person Profile entity gets created in DB
This event is published as a message to an "events" topic exchange with a routing key of "entity.profile.created"
The exchange routes copies of the message to multiple queues:
A "verification_service" queue has been bound to this exchange to receive a copy of all messages matching "entity.profile.#"
A "subscription_setup_service" queue has been bound to this exchange to receive a copy of all messages matching "entity.profile.created"
The consuming scripts don't know anything about this routing, they just know that messages will appear in the queue for events that are relevant to them:
The verification service picks up the copy of the message on the "verification_service" queue, processes, and acknowledges it
The subscription setup service picks up the copy of the message on the "subscription_setup_service" queue, processes, and acknowledges it
If there are multiple consuming scripts looking at the same queue, they'll share the messages on that queue between them, but still completely independent of any other queue.
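To make the routing concrete, here is a minimal in-memory sketch of a topic exchange in Python. It is a simulation, not RabbitMQ client code: the TopicExchange class, the deque-backed queues, and the "entity.profile.updated" routing key are assumptions for illustration. The matcher follows the AMQP topic rules, where "*" stands for exactly one word and "#" for zero or more words.

```python
from collections import deque


def topic_matches(pattern, routing_key):
    """AMQP-style topic match: '*' is exactly one word, '#' is zero or more."""
    def match(p, k):
        if not p:
            return not k
        if p[0] == "#":
            # '#' either absorbs nothing, or absorbs one word and stays active.
            return match(p[1:], k) or (bool(k) and match(p, k[1:]))
        if not k:
            return False
        return (p[0] == "*" or p[0] == k[0]) and match(p[1:], k[1:])
    return match(pattern.split("."), routing_key.split("."))


class TopicExchange:
    """Toy exchange: holds bindings only, never stores messages itself."""

    def __init__(self):
        self.bindings = []  # (pattern, queue) pairs

    def bind(self, queue, pattern):
        self.bindings.append((pattern, queue))

    def publish(self, routing_key, body):
        # Every binding whose pattern matches puts a copy on its queue.
        for pattern, queue in self.bindings:
            if topic_matches(pattern, routing_key):
                queue.append((routing_key, body))


events = TopicExchange()
verification_queue = deque()
subscription_setup_queue = deque()
events.bind(verification_queue, "entity.profile.#")
events.bind(subscription_setup_queue, "entity.profile.created")

events.publish("entity.profile.created", {"profile_id": 42})
events.publish("entity.profile.updated", {"profile_id": 42})

# The verification service processes and acknowledges one message;
# the copy on the other queue is untouched.
verification_queue.popleft()
```

Because each queue holds its own copy, a service that is offline simply finds its messages waiting when it comes back, as long as its queue exists and is bound.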
Here's a screenshot from this interactive visualisation tool that shows this scenario:
As you mentioned, this is not something that you can control with the Redis Pub/Sub data structure.
But you can do it easily with Redis Streams.
Streams allow you to post messages using the XADD command, and then control which consumers are dealing with each message and acknowledge that a message has been processed.
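As a rough illustration of why consumer groups fit this requirement, here is a toy in-memory model in Python. It is not the Redis client API: the Stream class and its method names are hypothetical, and each method is only loosely analogous to the Redis command named in its comment.

```python
class Stream:
    """Toy model of a stream with consumer groups (not the Redis client API)."""

    def __init__(self):
        self.entries = []  # append-only log of (entry_id, fields)
        self.groups = {}   # group name -> {"next": index, "pending": set of ids}

    def add(self, fields):
        # Analogous to XADD: append an entry and return its id.
        entry_id = len(self.entries)
        self.entries.append((entry_id, fields))
        return entry_id

    def create_group(self, group):
        # Analogous to XGROUP CREATE with a start id of 0 (read from the beginning).
        self.groups[group] = {"next": 0, "pending": set()}

    def read_group(self, group):
        # Analogous to XREADGROUP with '>': hand out the next undelivered entry.
        state = self.groups[group]
        if state["next"] >= len(self.entries):
            return None
        entry = self.entries[state["next"]]
        state["next"] += 1
        state["pending"].add(entry[0])
        return entry

    def ack(self, group, entry_id):
        # Analogous to XACK: the entry is done for this group only.
        self.groups[group]["pending"].discard(entry_id)


stream = Stream()
stream.create_group("verification")
stream.create_group("subscriptions")
stream.add({"event": "profile.created", "profile": 7})

# The verification group reads and acknowledges its delivery.
entry = stream.read_group("verification")
stream.ack("verification", entry[0])

# The subscriptions group was offline; the same entry is still there for it.
late_entry = stream.read_group("subscriptions")
```

Each group tracks its own position and its own pending set, so one group's acknowledgement does not affect what another group still has to process.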
You can look at these sample applications, which provide examples (in Java) of:
posting and consuming messages
creating multiple consumer groups
managing exceptions
Links:
Getting Started with Redis Streams and Java
Redis Streams in Action ( Project that shows how to use ADD/ACK/PENDING/CLAIM and build an error proof streaming application with Redis Streams and SpringData )
I am trying to build a system where I need to select the next available and suitable consumer to send a message to from a queue (or maybe some other solution that does not use a queue).
Requirements
We have multiple publishers/clients who send objects (images) to be processed on one side, and multiple Analysts who process them. Once an object is processed, the publisher should get the corresponding response.
The publishers do not care which Analyst is going to process the data.
Users have a web app where they can map each client/publisher to one, several, or all agents. For instance, if Publisher P1 is mapped to Agents A & B, all objects coming from P1 can be processed by Agent A or Agent B. Note: an object can be processed by one agent only.
Depending on the mapping, I should have a middleware which consumes the messages from all publishers and distributes them to the agents.
Solution 1
My initial thoughts were to have a queue where all publishers post their messages. Another queue where Agents publish message saying they are waiting to process an object.
A middleware picks up the message, gets the list of agents it can send the message to (from a cached database), goes through the agents queue to find the next suitable and available agent, and publishes the message to that agent.
The issue with this solution is that if the agents queue holds a, b, c, d and the message I receive can only be processed by agent b, I will be rejecting agents d & c and they would end up at the tail of the queue. I have around 180 agents, so some might never be picked; or, if the next message can only be processed by agent d (for example), we have to reject all the other agents to get there.
Solution 2
The first bit, from publishers to middleware, is still the same.
Have a scaled, fast NoSQL database where agents add a record to announce their availability; basically a key-value pair.
The middleware gets the config from cache, gets the next available and suitable agent from the NoSQL database, sends the message to the agent's queue (through a direct exchange), updates the NoSQL record to set isavailable to false, and gets the next message.
The issue with this solution is that the DB and middleware can become a bottleneck. Also, if I scale the middleware I will run into database concurrency issues. For example, suppose I have two copies of the middleware running, each receives a message which can be processed by Agents A & B, and both agents are available.
The two middleware copies would query the DB, might both get A as available, and end up sending both messages to A while B is still waiting for a message to process.
I will have around 100 publishers and 180 agents to start with.
Any ideas on how to improve these solutions, or any other feasible solution, would be highly appreciated.
Depending on this I also need to figure out how the Agent would send response back to the publisher.
Thank you
I'll answer this from the perspective of my open-source service bus: Shuttle.Esb
Typically one would ignore any content-based routing and simply have a distributor pattern. All messages go to the primary endpoint and it will distribute the messages. However, if you decide to stick to these logical groupings you could have primary endpoints for each logical grouping (per agent group). You would still have the primary endpoint, but instead of having worker endpoints mapped to agents you would have agent groupings map to the logical primary endpoint with workers backing that.
Then in the primary endpoint you would, based on your content (being the agent identifier), forward the message to the relevant logical primary endpoint. All the while you keep track of the original sender. In the worker you would then send a message back to the queue of the original sender.
I'm sure you could do pretty much the same using any service bus.
I see several requirements in here, that can be boiled down to a few things, I think:
publisher does not care which agent processes the image
publisher needs to know when the image processing is done
agent can only process 1 image at a time
agent can only process certain images
Are these assumptions correct? Did I miss anything important?
If not, then your solution is pretty much built into RabbitMQ with routing and queues. There should be no need to build a custom middle-tier service to manage this.
With RabbitMQ, you can have a consumer set to only process 1 message at a time. The consumer sets its "prefetch" limit to 1, and retrieves a message from the queue with "no ack" set to false, meaning it must acknowledge the message when it is done processing it.
To consume only messages that a particular agent can handle, use RabbitMQ's routing capabilities with multiple queues. The queues would be created based on the type of image or some other criteria by which the consumers can select images.
For example, if there are two types of images: TypeA and TypeB, you would have 2 queues - one for TypeA and one for TypeB.
Then, if Agent1 can only handle TypeA images, it would only consume from the TypeA queue. If Agent2 can handle both types of images, it would consume from both queues.
To put the right images in the right queue, the publisher would need to use the right routing key. If you know the image type (or whatever the selection criteria is), you would change the routing key on the publisher side to match that selection criteria. The routing in RabbitMQ would be set up to move messages for TypeA into the TypeA queue, etc.
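The queue-per-type idea can be sketched in a few lines of Python. This is an in-memory simulation rather than RabbitMQ client code; the agent-to-queue mapping, the image names, and the take_one helper are hypothetical, and take_one stands in for a consumer with a prefetch limit of 1.

```python
from collections import deque

# One queue per image type, as in the TypeA/TypeB example.
queues = {"TypeA": deque(), "TypeB": deque()}

# Which queues each agent consumes from (hypothetical mapping).
agent_queues = {"Agent1": ["TypeA"], "Agent2": ["TypeA", "TypeB"]}


def publish(image_type, image_name):
    # The routing key is the image type; it selects the queue.
    queues[image_type].append(image_name)


def take_one(agent):
    """Return the next image this agent may process, or None if there is none."""
    for name in agent_queues[agent]:
        if queues[name]:
            return queues[name].popleft()
    return None


publish("TypeA", "a1.png")
publish("TypeB", "b1.png")
publish("TypeB", "b2.png")

first = take_one("Agent1")   # Agent1 only sees TypeA
second = take_one("Agent2")  # TypeA is empty by now, so this comes from TypeB
third = take_one("Agent1")   # nothing left that Agent1 can handle
```

An agent that calls take_one again only after finishing its current image never holds more than one message, which is the behaviour a prefetch limit of 1 gives you.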
The last part is getting a response when the image is done processing. That can be accomplished through RabbitMQ's "reply to" field and related code. The gist of it is that the publisher has its own exclusive queue. When it publishes a message, it includes the name of its exclusive queue in the "reply to" header of the message. When the agent finishes processing the image, it sends a status update message back through the queue found in the "reply to" header. That status update message tells the publisher the status of the request.
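A minimal in-memory sketch of the "reply to" flow, in Python, might look like the following. It is a simulation, not RabbitMQ client code: the function names and the image name are hypothetical, and the correlation id is an assumption borrowed from the usual request/response pattern so the publisher can match a reply to its request.

```python
from collections import deque
import uuid

work_queue = deque()
reply_queues = {}  # reply queue name -> deque


def declare_reply_queue():
    # Stand-in for the publisher's own exclusive queue with a generated name.
    name = "reply-" + uuid.uuid4().hex
    reply_queues[name] = deque()
    return name


def publish_request(image, reply_to):
    # The request carries the reply queue name and a correlation id.
    correlation_id = uuid.uuid4().hex
    work_queue.append(
        {"image": image, "reply_to": reply_to, "correlation_id": correlation_id}
    )
    return correlation_id


def agent_process_one():
    # The agent takes one request and reports status to the queue named in it.
    request = work_queue.popleft()
    reply_queues[request["reply_to"]].append(
        {"correlation_id": request["correlation_id"], "status": "done"}
    )


my_queue = declare_reply_queue()
request_id = publish_request("scan-001.png", reply_to=my_queue)
agent_process_one()
reply = reply_queues[my_queue].popleft()
```

The agent never needs to know who the publisher is; it only follows the address carried in the message.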
From a RabbitMQ perspective, these pieces can be put together using the examples and documentation found here:
http://www.rabbitmq.com/getstarted.html
Look at these specifically:
Work Queues: http://www.rabbitmq.com/tutorials/tutorial-two-python.html
Topics: http://www.rabbitmq.com/tutorials/tutorial-five-python.html
RPC (aka Request/Response): http://www.rabbitmq.com/tutorials/tutorial-six-python.html
You'll find examples in many languages, in these docs.
I also cover most of these scenarios (and others) in my RabbitMQ Patterns eBook
Since the total number of senders and receivers is only in the hundreds, how about creating one queue for each of your senders? Based on your sender-receiver mapping, receivers subscribe to the sender queues (updating the subscriptions when the mapping changes). You could configure each receiver to take the next message from the queues it subscribes to (in a random way) only when it finishes processing the current one.
I have clients that use an API. The API sends messages to RabbitMQ, and RabbitMQ delivers them to workers.
I need to reply to clients if something went wrong, i.e. the message wasn't routed to a certain queue or wasn't picked up for processing right away (full confirmation).
A task that starts after 5-10 seconds does not make sense.
Accordingly, I must use the mandatory and immediate flags.
I can't increase the number of workers, and I can't run workers on other servers. That is a requirement.
However, as far as I could find, the immediate flag has not been supported since RabbitMQ v3.0.x.
The RabbitMQ developers suggest using TTL=0 for a queue instead, but then I will not be able to check the status of a message.
Is there any way to change that behavior? Please share your experience of how you solved problems like this.
Thank you.
I'm not sure, but after reading your original question in Russian, it may be that using both publisher and consumer confirms is what you want. See the last three paragraphs of this answer.
As you want to get a result from your worker for a published message, it looks like the RPC pattern is what you want. See the RabbitMQ RPC tutorial. Pick the section for the programming language you are most comfortable with; the overall concept is the same. You may also find Direct reply-to useful.
It's not the same as the immediate flag functionality, but if all your publishers operate in the immediate scenario, AMQP may not be the best choice for this kind of task. Immediate means "deliver this message right now or burn in hell", and you may end up publishing more than you can process. In such cases RPC plus a response timeout on the application side (e.g. a socket timeout) may be a good choice. But it doesn't work well for non-idempotent RPC calls while the message is still being processed, so you may want to use a per-queue or per-message TTL (or set a queue length limit). If a message is dead-lettered, you can pick it up from there (in case you need that for some reason).
TL;DR
As for "something" going wrong, it can go wrong at different levels, which for simplicity we define as:
before RabbitMQ, like sending application failure and network problems;
inside RabbitMQ, say, missed destination queue, message timeout, queue length limit, some hard and unexpected internal error;
after RabbitMQ, in most cases - messages processing application error or some third-party services like data persistence or caching layer outage.
Some errors like network outage or hardware error are a bit epic and are not a subject of this q/a.
The typical scenario for guaranteed message delivery is to use publisher confirms or transactions (which are slower). Once you get a confirm, it means that RabbitMQ has received your message and, if it has a route, placed it in a queue. If it has no route, it is dropped, or, if the mandatory flag is set, returned with the basic.return method.
For consumers it's similar: after basic.consume/basic.get, once the client has ack'ed the message, it is considered received and removed from the queue.
So when you use confirms on both ends, you are protected from message loss (we'll set aside the possibility of a bug in RabbitMQ itself).
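The publisher-side outcomes described above can be illustrated with a toy broker in Python. This is a simulation and not RabbitMQ client code: the Broker class, the queue names, and the (confirmed, returned) tuple are assumptions that stand in for a confirm and for basic.return.

```python
from collections import deque


class Broker:
    """Toy broker that routes by queue name, like the default exchange."""

    def __init__(self):
        self.queues = {}

    def declare_queue(self, name):
        self.queues[name] = deque()

    def publish(self, routing_key, body, mandatory=False):
        """Return (confirmed, returned) for a single published message."""
        queue = self.queues.get(routing_key)
        if queue is not None:
            queue.append(body)
            return True, False
        # Unroutable: silently dropped unless the mandatory flag is set,
        # in which case the message is handed back to the publisher.
        return True, mandatory


broker = Broker()
broker.declare_queue("tasks")

routed = broker.publish("tasks", "job-1", mandatory=True)
dropped = broker.publish("missing", "job-2")
returned = broker.publish("missing", "job-3", mandatory=True)
```

In this sketch an HTTP API sitting in front of the broker could answer "negative" whenever the second element is True, and "positive" otherwise.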
Bogdan, thank you for your reply.
It seems I did not express my thoughts clearly enough.
The scheme may look like this. Each component of the system must do what it must do :)
The idea is to make every component simpler.
How a task is performed.
Clients send requests to the HTTP API and must obtain a response like this:
Positive - it has been put in the queue
Negative - response with error and a reason
When I was talking about confirmation, I meant that I must know that a message was delivered (if there are no free workers, RabbitMQ can remove the message), and the client must be notified.
If a sent message couldn't be delivered to a certain queue, the client must be notified.
How a message is handled.
Messages are sent for processing.
The processing status is written into HeartBeat Status.
Clients fetch the status from HeartBeat themselves and then decide what to do.
I'm not sure that RPC would be useful for us, since RPC means that clients must wait for a response from the server. Tasks may run for a long time. It adds excess coupling between clients and servers and additional logic on the client side.
A limited queue size may not be useful either.
A situation is possible where the queue size is greater than the number of workers (a problem in configuration or defined settings).
Then the idea with 5-10 seconds doesn't make sense.
TTL is not useful because of:
Setting the TTL to 0 causes messages to be expired upon reaching a
queue unless they can be delivered to a consumer immediately. Thus
this provides an alternative to basic.publish's immediate flag, which
the RabbitMQ server does not support. Unlike that flag, no
basic.returns are issued, and if a dead letter exchange is set then
messages will be dead-lettered.
direct reply-to :
The RPC server will then see a reply-to property with a generated
name. It should publish to the default exchange ("") with the routing
key set to this value (i.e. just as if it were sending to a reply
queue as usual). The message will then be sent straight to the client
consumer.
Then I will not be able to route messages.
So, I'm sorry. I may be floundering with the terms, as I'm new to AMQP and RabbitMQ.
Let's say I have a ClientRequestMessage message that contains a request for a specific Client. A web application will generate these requests and they need to be sent to the correct Client for handling. I can think of a few options for this.
I could have a single queue that all messages go to and specific client handlers check a property (like ClientId) to decide whether they care about it. This feels wrong on many levels to me though.
I could publish a message to all of the clients and they could decide whether or not they care about it during handling. This seems like too much traffic and wastes each client's time handling messages they shouldn't care about in the first place though.
I could have client-specific queues that these messages get routed to. This one feels the best to me, but I am unsure of how to do it. I'd like to keep it simple and avoid client-specific message types, but I am not sure how to tell NServiceBus "for client A send it to client A's queue and for client B send it to client B's queue".
So my question is, what is the best (most efficient? easiest to manage?) way to set this up? I am pretty sure I need to use the distributor, but not positive so thought I would ask.
BONUS QUESTION:
Let's say each client has multiple handlers. How can I make sure only one of them handles a given message? Would I need a distributor per client?
If what you really want is a solution that lets you keep a single message type, place a filter on it based on clientId, and route the message to a client only when it relates to them, then I would use PServiceBus (pservicebus.codeplex.com). It makes it easier to specify a set of subscriptions for each of your clients, where their messages are filtered by clientId into a specific queue or whatever transport you have available. The example below filters a ChatTopic by the UserName property: the subscriber only receives the message at the specified transport when the published message's UserName property is not TJ. You can also use complex filters, such as GreaterThan("MyComplexProperty.Blah.ID", 5).
Subscriber.New("MyUserName").Durable(false)
    .SubscribeTo(Topic.Select<ChatTopic>().NotEqual("UserName", "TJ"))
    .AddTransport("Tcp",
        Transport.New<TcpTransport>(
            transport => {
                transport.Format = TransportFormat.Json;
                transport.IPAddress = "127.0.0.1";
                transport.Port = port;
            }), "ChatTopic")
    .Save();
You can tell NSB where to put messages by using the MessageEndpointMappings configuration section. You can map a specific message type or a whole assembly to a queue. If you don't want to create specific message types and map them, then I would recommend the publish approach. The overhead of removing a message from the queue is pretty minimal.
If your "client" has many instances of NSB to pick up messages then you will need to use a Distributor. Check out the distributed Pub/Sub documentation.