I have a scenario where I need to implement a server farm (say, 5 servers), each server running 4 instances of a calculation engine:
Server 1: E1 E2 E3 E4
Server 2: E1 E2 E3 E4
Server 3: E1 E2 E3 E4
Server 4: E1 E2 E3 E4
Server 5: E1 E2 E3 E4
I'd like to utilize a message queue solution whereby each engine listens on the same queue (e.g. WORK.QUEUE) for incoming work. In the initial state, if work is added I'd like it to go to Server-1/E1. Then, if more work arrives while that instance is busy, I'd like it to go to Server-2/E1, and so on. I only want work to go to an E2 instance if all the E1 instances are busy.
This sounds to me like a form of round-robin load balancing, but I suspect that this isn't the correct terminology in the message queuing space.
Is this architecture possible using either MSMQ or MQSeries, or does it require some sort of load balancer running on each server to farm out work at the server level?
Round-robin load balancing of messages is certainly a term in the messaging space. In the IBM WebSphere MQ (aka MQSeries) case, it means that each new message will go to the next instance of the queue:
Message 1 -> Server-1/E1
Message 2 -> Server-2/E1
Message 3 -> Server-3/E1
Message 4 -> Server-4/E1
That is, distribution is not based on how busy the consumer on each server is.
MSMQ just delivers messages; it doesn't care about workload. The scenario you have described is multiple readers accessing a single common remote queue, so any load balancing would need to be performed by your own code. One approach would be to have multiple queues and a simple service that moves a message from WORK.QUEUE to WORK.QUEUE2 if it isn't processed within a set time. All the E1 instances would read messages from WORK.QUEUE, E2 instances from WORK.QUEUE2, E3 from WORK.QUEUE3, and so on. If all the instances at one level were busy, a message would be pushed/cascaded down the queues until it was processed. This is similar to how WCF and MSMQ sub-queues can work.
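To make that cascade concrete, here is a rough sketch of the mover service's loop. The `Queue` trait and `Message` struct are hypothetical stand-ins (real code would wrap the MSMQ/System.Messaging API, which supports peeking at the head of a queue); only the age-check-then-forward logic is the point:

```rust
use std::time::{Duration, Instant};

// Hypothetical message and queue handle; a real implementation would wrap
// the MSMQ API (or any broker that supports peeking at the queue head).
struct Message {
    id: String,
    enqueued: Instant,
    body: Vec<u8>,
}

trait Queue {
    /// Look at the head message without removing it.
    fn peek(&self) -> Option<Message>;
    /// Remove a specific message, if it is still on the queue.
    fn receive_by_id(&self, id: &str) -> Option<Message>;
    fn send(&self, msg: &Message);
}

/// One pass of the mover: any message that has sat at level i longer than
/// `max_wait` (i.e. no Ei instance picked it up in time) is pushed down to
/// level i + 1, where the Ei+1 instances are listening.
fn cascade(levels: &[Box<dyn Queue>], max_wait: Duration) {
    for pair in levels.windows(2) {
        while let Some(head) = pair[0].peek() {
            if head.enqueued.elapsed() < max_wait {
                break; // head is still fresh; an Ei may yet pick it up
            }
            // Stale: cascade it down one level. The receive may miss if an
            // Ei grabbed it between the peek and now, which is fine.
            if let Some(msg) = pair[0].receive_by_id(&head.id) {
                pair[1].send(&msg);
            }
        }
    }
}
```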
I'm using RabbitMQ and the amiquip Rust crate to build out several services that will be processing some data in multiple steps. Roughly, this might look like:
Service A ingests data from external source, publishes its results to Topic A
Service B subscribes to Topic A, does some processing, publishes results to Topic B
Service C subscribes to Topic B, does some processing, publishes results to Topic C
Each step along the way, the data are further refined. I will need to be able to shut down different services for maintenance without missing messages that they're reading (e.g. Service B may be taken down briefly, but the messages published by Service A to Topic A must remain in the queue until Service B comes back online). I am okay with setting some TTL/expiration (not sure what the right AMQP terminology is): for example, if Service B doesn't come back online within 5 minutes, it's okay if messages published to the topic are lost.
Additionally, there may be another service that should also be able to subscribe to a topic without interfering with another service reading it. For example, Service C2 gets a copy of all messages in Topic B and does something with them; every message read by Service C2 is also read by Service C (no stepping on each other's toes).
I don't know the right terminology used here, so I'm at a bit of a loss for what I should be looking for. Is this possible with AMQP & RabbitMQ?
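For reference, the pattern described maps in AMQP terms to one exchange per "topic" (a fanout exchange works here) with one durable, named queue per consuming service bound to it. Because queues, not consumers, hold the messages, Service B can restart without losing anything, and Service C and C2 each get their own full copy via separate queues. A minimal sketch of Service C2's side with amiquip (all names, and the broker URL, are illustrative):

```rust
use amiquip::{
    Connection, ConsumerMessage, ConsumerOptions, ExchangeDeclareOptions, ExchangeType,
    FieldTable, QueueDeclareOptions, Result,
};

fn main() -> Result<()> {
    let mut connection = Connection::insecure_open("amqp://guest:guest@localhost:5672")?;
    let channel = connection.open_channel(None)?;

    // "Topic B" becomes a fanout exchange: every bound queue gets a copy.
    let exchange = channel.exchange_declare(
        ExchangeType::Fanout,
        "topic_b",
        ExchangeDeclareOptions {
            durable: true,
            ..ExchangeDeclareOptions::default()
        },
    )?;

    // Each service owns a named durable queue, so messages accumulate while
    // that service is down. Service C would declare its own queue ("svc_c")
    // the same way, so C and C2 never compete for messages.
    // (The 5-minute expiry would be a per-queue "x-message-ttl" argument or
    // a broker policy; the argument setup is omitted here.)
    let queue = channel.queue_declare(
        "svc_c2",
        QueueDeclareOptions {
            durable: true,
            ..QueueDeclareOptions::default()
        },
    )?;
    queue.bind(&exchange, "", FieldTable::new())?;

    let consumer = queue.consume(ConsumerOptions::default())?;
    for message in consumer.receiver().iter() {
        match message {
            ConsumerMessage::Delivery(delivery) => {
                println!("C2 received {} bytes", delivery.body.len());
                consumer.ack(delivery)?;
            }
            _ => break, // consumer cancelled or connection closed
        }
    }
    connection.close()
}
```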
I have around 300 different consumers / 300 message types / 300 queues, with wildly different functionality behind them.
Considering the extremes:
Is the best choice one Windows service (easier to deploy) with 300 consumers listening?
Or 300 Windows services (easier to split between devs), each an independent single consumer, but impossible for support to maintain?
Update: changed from 1 queue to 300 queues.
RabbitMQ can support hundreds of queues simultaneously, and each queue should be responsible for one specific type of message, e.g. a response status, an online order, or a stack trace for further processing by some other unit of work. These three are not the same, so if you are keeping them all in one queue, please segregate them into different queues.
Keeping all the data in one queue will also affect your application's performance: each queue delivers in sequential order, and since you have 300 consumers waiting for 300 types of messages, almost all of them could be sitting in a waiting state. It also forces a complex decision-making algorithm, if you are using one, to figure out the correct consumer for each message.
A single queue is also a bottleneck that could obstruct the functioning of the whole application if that queue fails, because every consumer listens to it. With different queues, the rest of the system can keep processing if one particular queue has an issue.
Instead of going for one consumer per service, check whether the services have anything in common and whether each service can host more than one consumer once you have increased the number of queues from 1 to many, as sketched below.
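On the hosting question: a single process can host many consumers over one connection, with one channel (and here, one thread) per queue, so "one service vs. 300" need not mean one consumer. A minimal amiquip sketch, with illustrative queue names:

```rust
use amiquip::{Connection, ConsumerMessage, ConsumerOptions, QueueDeclareOptions, Result};
use std::thread;

fn main() -> Result<()> {
    let mut connection = Connection::insecure_open("amqp://guest:guest@localhost:5672")?;

    // One channel and one thread per queue; a real service would list all
    // of its queues here.
    let mut handles = Vec::new();
    for name in ["orders", "status", "stacktraces"] {
        let channel = connection.open_channel(None)?;
        handles.push(thread::spawn(move || -> Result<()> {
            let queue = channel.queue_declare(name, QueueDeclareOptions::default())?;
            let consumer = queue.consume(ConsumerOptions::default())?;
            for message in consumer.receiver().iter() {
                match message {
                    ConsumerMessage::Delivery(delivery) => {
                        // Dispatch to the handler for this message type.
                        println!("[{}] {} bytes", name, delivery.body.len());
                        consumer.ack(delivery)?;
                    }
                    _ => break, // cancelled or connection closed
                }
            }
            Ok(())
        }));
    }

    // The consumer threads run until the connection closes or the process
    // is stopped.
    for handle in handles {
        handle.join().expect("consumer thread panicked")?;
    }
    connection.close()
}
```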
Microservice architecture and sharing common application data.
Scenario being:
Today there are 17 microservices for an online social media service, and 9 of them need to know who is connected to whom in order to function. To prevent each service from constantly asking the "authentication" or "connections" microservice for the list, all services register to receive a copy of the connections per user and store it in a cache.
One proposed mechanism for delivering the data, or the instruction to fetch it, is RabbitMQ.
However, each microservice is a cluster of docker containers orchestrated by k8s for scalability.
Each container registers to listen to a collection of exchanges it is interested in... so for the "news feed" service that could be, say, 5 connections...
Below is an illustration of the proposed setup:
T1 - User A accepts a friend request
T2 - The connections service (MS1) makes the connection in its primary database
T3 - MS1 publishes the said event to a RabbitMQ exchange
T4 - The RabbitMQ exchange emits the event to all bound queues (i.e. all the other registered microservices)
T5 - All the nodes within the MS2 cluster pick up the event and act on it; their action (in this case) is to update their cache of friend connections
T6 - User A requests the data for their news feed; MS2 now queries its database with the help of its local cache
This is all good:
The connections service doesn't know or care who gets the data, only that it should emit to one exchange via the single RabbitMQ entry point
The developer of MS2 only needs to know the location of the RabbitMQ instance
The same goes for the developers of all the other services; they handle the data in their own brilliant way.
The one exception: there were 3 instances of MS2, so that would be 3 database writes; if the system scales to 10 instances, that would be 10 db writes, and so on.
Question
How is this problem avoided? How do I ensure that only one of the MS2 instances will act on each message?
Should the newsfeed microservice be delivered with its own internal queue system to manage the data from the exchange? Is it possible to route all the messages via the load balancer so only one instance of MS2 gets a given message? I don't want to start managing lots and lots of queues by hand, as this would be a pain and defeat the simplicity of the exchange design.
So, all instances of M2 will share a queue and work using the competing consumers pattern: every message is consumed exactly once, and if all instances of M2 go down, the queue grows until they come back up again.
M2, M3 and M4 will each create ONE queue for what M1 publishes.
Let's name them M2_from_M1, M3_from_M1 and M4_from_M1.
Each will also create a binding against the exchange M1 publishes to, on the routing key for this message.
Now, instances of M2 will all consume from M2_from_M1, instances of M3 will all consume from M3_from_M1 and so on.
If all instances of any of these services are down, its queue will start to fill up, but that is fine, since the messages will be consumed later.
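For illustration, here is roughly what every M2 instance would run with the amiquip crate (the exchange name and routing key are placeholders):

```rust
use amiquip::{
    Connection, ConsumerMessage, ConsumerOptions, ExchangeDeclareOptions, ExchangeType,
    FieldTable, QueueDeclareOptions, Result,
};

fn main() -> Result<()> {
    let mut connection = Connection::insecure_open("amqp://guest:guest@localhost:5672")?;
    let channel = connection.open_channel(None)?;

    // The exchange M1 publishes to; name and routing key are assumptions.
    let exchange = channel.exchange_declare(
        ExchangeType::Topic,
        "m1_events",
        ExchangeDeclareOptions {
            durable: true,
            ..ExchangeDeclareOptions::default()
        },
    )?;

    // ONE shared durable queue for the whole M2 cluster. Every M2 instance
    // declares the same name (declares are idempotent), so the broker gives
    // each message to exactly one instance: competing consumers.
    let queue = channel.queue_declare(
        "M2_from_M1",
        QueueDeclareOptions {
            durable: true,
            ..QueueDeclareOptions::default()
        },
    )?;
    queue.bind(&exchange, "user.connected", FieldTable::new())?;

    let consumer = queue.consume(ConsumerOptions::default())?;
    for message in consumer.receiver().iter() {
        match message {
            ConsumerMessage::Delivery(delivery) => {
                // Update this instance's cache of friend connections, then ack.
                consumer.ack(delivery)?;
            }
            _ => break, // cancelled or connection closed
        }
    }
    connection.close()
}
```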
Regarding the overall architecture: first try actually making the call between M2 and M1; access time between pods is probably very fast, and you could probably cache for a while in both M1 and M2. The worst outcome is that you see news from people you no longer follow, or that you don't get news from new contacts.
I want to create a consumer that processes messages from a variable number of sources, which are connected and disconnected dynamically.
What I need is for each consumer to prioritize the first N messages of each source, and then to run multiple consumers to improve speed.
I have been reading the docs for Work Queues, Routing, and Topics, and a lot of other docs, without identifying how to implement this. I also ran some tests, without luck.
Can someone point me how to do it or where to read about it?
--EDIT--
QueueA-----A3--A2--A1-┐
QueueB-----B3--B2--B1-┼------ Consumer
QueueC-----C3--C2--C1-┘
The desired effect is that each consumer gets first messages of each queue. For example: A1, B1, C1, A2, B2, C2, A3, B3, C3, and so on. If a new queue is created (QueueD), the consumer would start receiving messages from it in the same fashion.
Thanks in advance
What I need is for each consumer to prioritize the first N messages of each source, and then to run multiple consumers to improve speed.
All message queues that I know of only provide ordering guarantees within a single queue (Kafka guarantees ordering not at the topic level but within each partition of a topic). Here, however, you are asking to serialize across multiple queues, which is not possible in a distributed-system context.
Why? Because if you have more than one consumer on these queues, messages will be delivered to each connected consumer of a queue in round-robin fashion.
Assuming prefetch_count=1 and two connected consumers, say the first set of messages is delivered as follows:
A1, B1 & C1 delivered to consumer 1 (X)
A2, B2 & C2 delivered to consumer 2 (Y)
Now, in a distributed system, everything is async, and things could go wrong. For example:
If X acks A1, A3 will be delivered to X. But if Y acks A2 before X acks A1, A3 will be delivered to Y.
Who acks first is not within your control in a distributed system. Consider the following scenarios:
X might have had to wait on an I/O- or CPU-bound task, while Y might have been lucky and not had to wait at all. Then Y will advance through the messages in the queue.
Or Y got killed (a partition), or the network got slow; then X will continue consuming the queue.
I strongly advise you to re-think your requirements and consider the guarantees you can actually expect in an async context (you wouldn't be considering a MoM otherwise, would you?).
PS: it is possible to implement what you are asking for with some consumer-side logic (with a penalty on performance/throughput), as sketched after this list:
A single consumer connects to all the queues,
waits for a message from every queue before acking any of them, and,
once a message from every queue has been received, groups them into a single message and publishes it to another queue (P).
Now many consumers can subscribe to P to process the ordered groups of messages.
I do not advise it, but hey, it is your system, who is going to stop you ;)
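For completeness, a minimal sketch of that aggregating consumer with the amiquip crate, assuming the three queues from the question plus an output queue named P (broker URL and the `qos` prefetch setting are assumptions):

```rust
use amiquip::{
    Connection, ConsumerMessage, ConsumerOptions, Exchange, Publish, QueueDeclareOptions, Result,
};

fn main() -> Result<()> {
    let mut connection = Connection::insecure_open("amqp://guest:guest@localhost:5672")?;
    let channel = connection.open_channel(None)?;

    // basic.qos: at most one unacked message per consumer at a time.
    channel.qos(0, 1, false)?;

    let names = ["QueueA", "QueueB", "QueueC"];
    let queues = names
        .iter()
        .map(|n| channel.queue_declare(*n, QueueDeclareOptions::default()))
        .collect::<Result<Vec<_>>>()?;
    let consumers = queues
        .iter()
        .map(|q| q.consume(ConsumerOptions::default()))
        .collect::<Result<Vec<_>>>()?;

    // Output queue P, reached through the default (direct) exchange.
    channel.queue_declare("P", QueueDeclareOptions::default())?;
    let out = Exchange::direct(&channel);

    loop {
        let mut group = Vec::new();
        let mut pending = Vec::new();
        // Block until one message from *every* queue arrives: A1, B1, C1, ...
        // (This stalls if any queue stays empty; that's the throughput cost.)
        for consumer in &consumers {
            match consumer.receiver().recv() {
                Ok(ConsumerMessage::Delivery(delivery)) => {
                    group.extend_from_slice(&delivery.body);
                    pending.push(delivery);
                }
                _ => return connection.close(), // cancelled or closed
            }
        }
        // Publish the grouped round to P, then ack the originals.
        out.publish(Publish::new(&group, "P"))?;
        for (consumer, delivery) in consumers.iter().zip(pending) {
            consumer.ack(delivery)?;
        }
    }
}
```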
Let me explain what I am trying to achieve here:
Create 5 copies of the same service, each listening to a queue specific to it. The message they listen for would be the same:
SVC1 listening to Q1
SVC2 listening to Q2
SVC3 listening to Q3
SVC4 listening to Q4
SVC5 listening to Q5
Say they all listen to a message called TestMessage.
Do round-robin load balancing between these 5 services and drop the message onto the applicable queue based on the output of my round-robin logic.
My question is: how do I configure things to drop TestMessage onto only one queue at any point in time?
Thanks in advance
Please take a look at the built-in NServiceBus Distributor. It will perform the load balancing for you and take care of all the work distribution.
The Distributor is what you need to use. It has its own input queue plus a queue of available workers. Every time a worker becomes available, it places a message on the distributor's queue of available workers, and the distributor then sends it the next message from its input queue.
Below is a sample application by Mikael Koskinen that demonstrates how to use this:
http://mikaelkoskinen.net/nservicebus-distributor-sample-application/
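The Distributor is the NServiceBus-native answer. For comparison, if you were to hand-roll the round-robin dispatch the question describes on a plain broker, the sender-side logic is just a rotating index over the queues. A minimal sketch with RabbitMQ and the amiquip crate (not NServiceBus; queue names from the question, broker URL assumed):

```rust
use amiquip::{Connection, Exchange, Publish, QueueDeclareOptions, Result};

fn main() -> Result<()> {
    let mut connection = Connection::insecure_open("amqp://guest:guest@localhost:5672")?;
    let channel = connection.open_channel(None)?;

    // Q1..Q5, one per service instance, addressed through the default
    // exchange (routing key == queue name).
    let queues = ["Q1", "Q2", "Q3", "Q4", "Q5"];
    for name in &queues {
        channel.queue_declare(*name, QueueDeclareOptions::default())?;
    }
    let exchange = Exchange::direct(&channel);

    // The "round-robin logic": a rotating index picks exactly one queue
    // per message.
    let mut next = 0;
    for i in 0..20 {
        let body = format!("TestMessage {}", i);
        exchange.publish(Publish::new(body.as_bytes(), queues[next]))?;
        next = (next + 1) % queues.len();
    }
    connection.close()
}
```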