I need to know the actual difference between them. I just learned of these techs at the high level.
No. AWS SWF is a workflow orchestration engine which has internal queueing support to deliver activity tasks. It is focused on coordinating execution of those tasks. SQS is a pure queue without any other additional features.
My understanding is that RubbitMQ is more like SQS, just not as fault tolerant and scalable and Celery is just a Python client side library to consume from it.
AWS SWF provides its own client side libraries to consume from its internal queues (called task lists)
Related
I have a producer of tasks and multiple workers to consume those tasks. Many places recommend rabbitmq and/or celery. However python has a builtin multiprocessing queue that can be shared on an ip/port using a manager/proxy. What would be the advantages of using something like rabbitmq instead?
RabbitMq is an enterprise level tool, typically deployed separately on out-of-process servers / VMs / Containers, and plays in the enterprise service bus space.
Rabbit has reliable messaging as an objective - e.g. messages are persisted, and nodes in the cluster can be restarted without losing messages.
Supports a large range of messaging topologies, such as Point-Point, Fan out, and Topic subscriptions
Can be scaled for volume by adding multiple nodes to a cluster
Allows for conditional routing of messages to queues using routing keys or header filters
Agnostic of client technology, i.e. Clients can be on any platform which support the AMQP protocol
Has an out of the box administration, monitoring and diagnostics UI
Has a wide range of extensions and tools, such as shovels allowing messages to be replicated across multiple RabbitMQ clusters.
I'm no Python expert, but from what I understand of the multiprocessing package, it serves as an manager for distributing work between worker processes and threads, so IMO would be regarded as a more local system concern, as opposed to 'enterprise' level.
e.g. you would need to handle persistence, i.e. so messages are not lost during a crash / restart, and would likely need to built your own administration and monitoring tools.
We have our Active-MQ on-premise and I'm exploring options in AWS for managed messaging service. All of our applications use ActiveMQ's Virtual Topic feature. It appears that in AWS, Fanout can be achieved using SNS->SQS. But unfortunately, SNS support SQS standard Queues and FIFO queues are not supported yet. What is the best way to achieve Fanout cases when message ordering is also important?
We could use Kinesis and AWS ActiveMQ as well. But with Kinesis I couldn't imagine how the VirtualTopic feature can be achieved in Kinesis. How shards are work in multiple Topics.
So, what is the best way to achieve ActiveMQ Virtual Topic functionality in the AWS world using SQS-SNS?
Well, now you can use SNS FIFO topics with SQS FIFO queues.
https://aws.amazon.com/about-aws/whats-new/2020/10/amazon-sns-introduces-fifo-topics-with-strict-ordering-and-deduplication-of-messages/
you can use 'Amazon MQ' since it is mentioned in FAQ that 'Amazon MQ is suitable for enterprise IT pros, developers, and architects who are managing a message broker themselves–whether on-premises or in the cloud–and want to move to a fully managed cloud service without rewriting the messaging code in their applications.'
Ref: https://aws.amazon.com/amazon-mq/faqs/
Is not Apache Kafka another implementation of JMS?
I am using JMS+AMQ in my application, and migrating to Apache Kafka. Do I have to change all JMS codes?
No, Kafka is different from JMS systems such as ActiveMQ.
see ActiveMQ vs Apollo vs Kafka
Kafka has less features than ActiveMQ, as the stress has been put on performances. So before migrating, check that the features you use in AMQ are in Kafka.
However, there is an open suggestion for a bridge between JMS and Kafka, to allow exactly what you need. Maybe the provided links can help you
https://issues.apache.org/jira/browse/KAFKA-1995
Actually, the two are not the same. And with a little more time seeing the two co-exist - and listening to problems and happy points from those deploying each in the field - there is a little more to say about each one.
Firstly, JMS supports both point-to-point messaging (where messages are sent to single consumers; the consumers themselves maintain their message queues) and the publish-and-subscribe (pub/sub) model (where messages are written to a single topic, and consumers, independently, decide which messages to consume).
In a point-to-point messaging architecture, message producers and consumers know each other, where as in a pub/sub model they do not. Apache Kafka focuses on a pub/sub model, maintaining a separate log/topic from which consumers read from offsets. Kafka is also built for the cloud, with high-throughput a core consideration.
Many in our community and at meetups throw their hands up in frustration at MOMs (message-oriented middlewares) like JMS and switch to Kafka, for, what boils down to one reason: scalability. They argue that Kafka is better suited for scale than other MOMs because Kafka maintains a partitioned topic log. In so doing, Kafka can split up message flow to groups of consumers by partition and batch transmit the messages.
This concept also allows Kafka to have more granular control over ACLs (access control) to Kafka Consumers, although there are some issues there, which Apache Pulsar is addressing.
Finally, on Kafka, since the client/consumer decides which messages to consume (by offset in the topic), this removes some of the producer-side complexity of routing rules built into MOMs like JMS.
There's more differences than that, but this is a distillation of some of the ones that keep coming up! Hope this helps.
No, Kafka uses its own non-standard protocol and clients.
However, there's a 3rd-party JMS Client for Kafka from Confluent.
NServiceBus Distributor/Worker pattern makes perfect sense for MSMQ due to the hard requirement of local input queues.
But this is not the case with RabbitMQ, I am trying to understand how and when the NServiceBus distributor is relevant with RabbitMQ. With RabbitMQ multiple workers can read from the same remote queue.
The actual scenario is similar to using an AWS auto-scaling group to scale out workers pointing to a high available RabbitMQ cluster. Now avoiding distributor altogether makes the setup much simpler to build, test and provision.
Thoughts?
As RabbitMQ transport falls into the broker style bus, so, in your use case, it would make more sense not to use the distributor.
The same goes for all broker-style transports, where you can use a competing consumer pattern to scale out.
NServiceBus is an excellent system and does wonders in most message queuing system where you don't have an integrated distributor (which you do with exchanges in RabbitMQ). We use NServiceBus here at our company.
Azure Queues and MSMQ are perfect examples of such queuing technologies.
NServiceBus handles the distribution internally and therefore reproduces this capability for you.
However... If you are blessed with the possibility of imposing what queuing technology you can use, then I would highly encourage you to look into RabbitMQ and a product (Open Source) called MassTransit
http://masstransit-project.com/
MassTransit can in turn function in the two modes and will either delegate or simulate the distribution for you - however I nonetheless have a soft spot for NServiceBus as do our senior devs here.
Per this page...
http://docs.particular.net/nservicebus/load-balancing-with-the-distributor
Using the distributor is only useful when using MSMQ - if you aren't using MSMQ then there is no point. RabbitMQ and other transport will allow access to the same queue from multiple consumers, while MSMQ will not. The distributor in a nutshell will take messages from the main queue and distribute them across multiple worker queues as they report that they are done with whatever they are working on.
I am wondering if there is a way to setup NServiceBus so that the machine actually getting the message from a publisher does not have the InputQueue on it. Also, I would like to publish to a general queue (though this can be accomplished with a web service.)
I am thinking I may use this to allow client machines to post and receive events. But the client machines are fairly locked down. If I need to have queues created on them I can, but it would be easier to have the queues uniquely named and in a more central location.
I am new to NServiceBus and pub/sub in general. So if I am off base on what I want please say so.
This sounds like the perfect candidate for an alternate queuing infrastructure beyond MSMQ--such as Azure Queues or Amazon SQS. With those types of queues you have no infrastructure to install on the client machines and everything is much more centralized.
Before you go down that road though, you'll want to get the basics of publish/subscribe under you. Pub/sub using MSMQ and NServiceBus has a decent learning curve to it and if you aren't familiar with how things work at that level then moving to cloud queues may be even more tricky.