CloudHub workers are NOT clustered; however, we get message-loss protection and workload distribution across Mule instances using persistent queues. We can also use the default persistent object store (_defaultUserObjectStore) for distributed caching (with a tweak). Correct me if I am wrong here.
With the above features present, what are we missing in CloudHub compared to on-premise clusters? (Is it concurrency control, or once-only message delivery guarantees?)
First of all, why did MuleSoft not enable the clustering feature on CloudHub?
I would say that with the above features present you do not miss out on anything. Also keep in mind that even in an on-prem HA cluster, the shared queues and states (object stores) are by default kept in shared memory, and there is no persistence if the complete cluster goes down. To get persistence you need to apply tweaks for an on-prem cluster as well. As such, for true message reliability I would suggest you look at an external message broker or service such as Anypoint MQ.
As for why MuleSoft did not enable clustering, I cannot answer since I'm not a MuleSoft employee. However, best practice in integration and API design is to keep the application stateless. When this is followed and you use an external message broker, such as Anypoint MQ, to implement the reliable messaging pattern, the need for the Mule runtime HA cluster capabilities is small.
This is regarding a use case where we are trying to use Redis in PCF (Pivotal Cloud Foundry). We will refresh the Redis cache once or twice daily with the required data, and the API will then query Redis and provide the response.
Of particular concern for us is that we want API queries to be served from Redis only, which means Redis must be available at all times. But whenever we are refreshing the Redis DB, Redis would not be able to serve the APIs since it is refreshing the keys. To avoid that, we wanted to set up Redis in cluster mode or master-slave mode, so that while one instance is being written to, another can be read from.
How can we set up Redis in cluster or master-slave mode in PCF to fulfil this requirement?
Please provide any other suggestions you may have as well.
At the time of writing, the Redis for Pivotal Platform product does not support clustering. See "Availability" in the docs here -> https://docs.pivotal.io/redis/2-3/erc.html#offerings.
All Redis for Pivotal Platform services are single VMs without clustering capabilities. This means that planned maintenance jobs (e.g., upgrades) can result in 2–10 minutes of downtime, depending on the nature of the upgrade. Unplanned downtime (e.g., VM failure) also affects the Redis service.
Redis for Pivotal Platform has been used successfully in enterprise-ready apps that can tolerate downtime. Pre-existing data is not lost during downtime with the default persistence configuration. Successful apps include those where the downtime is passively handled or where the app handles failover logic.
If you require clustered Redis, you'd need to look at a different offering. Redis Labs has some offerings that integrate with PCF, you could use a Cloud Provider's Redis offering, or you could host your own.
If the solution you use isn't integrated into PCF, you can create a user-provided service with cf cups and provide the Redis credentials to your application that way. It will function just like a Redis service instance created through the marketplace.
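For example, an app bound to such a user-provided Redis service could read its credentials from VCAP_SERVICES roughly like this minimal sketch (the credential field names are just an assumption of what you would pass to cf cups):

```python
import json
import os

import redis  # redis-py client

# Cloud Foundry injects bound service credentials via VCAP_SERVICES;
# user-provided services appear under the "user-provided" key.
vcap = json.loads(os.environ["VCAP_SERVICES"])
creds = vcap["user-provided"][0]["credentials"]  # assumes a single user-provided service

r = redis.Redis(
    host=creds["host"],            # hypothetical credential fields
    port=int(creds["port"]),
    password=creds.get("password"),
)
print(r.ping())
```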
I have a producer of tasks and multiple workers to consume those tasks. Many places recommend RabbitMQ and/or Celery. However, Python has a built-in multiprocessing queue that can be shared over an IP/port using a manager/proxy. What would be the advantages of using something like RabbitMQ instead?
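To make that concrete, the built-in approach I'm referring to is a multiprocessing BaseManager exposing a queue over an IP/port, roughly like this sketch (host, port and authkey are placeholder values):

```python
# producer / queue server (one process)
import queue
from multiprocessing.managers import BaseManager

task_queue = queue.Queue()

class QueueManager(BaseManager):
    pass

# Expose the queue to remote workers over TCP.
QueueManager.register("get_tasks", callable=lambda: task_queue)
manager = QueueManager(address=("", 50000), authkey=b"change-me")
manager.get_server().serve_forever()
```

```python
# worker (any number of these, on any host)
from multiprocessing.managers import BaseManager

class QueueManager(BaseManager):
    pass

QueueManager.register("get_tasks")
manager = QueueManager(address=("producer-host", 50000), authkey=b"change-me")
manager.connect()

tasks = manager.get_tasks()
while True:
    work = tasks.get()  # blocks until a task is available
    ...                 # process the task
```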
RabbitMQ is an enterprise-level tool, typically deployed separately on out-of-process servers / VMs / containers, and plays in the enterprise service bus space.
Rabbit has reliable messaging as an objective - e.g. messages are persisted, and nodes in the cluster can be restarted without losing messages.
Supports a large range of messaging topologies, such as Point-Point, Fan out, and Topic subscriptions
Can be scaled for volume by adding multiple nodes to a cluster
Allows for conditional routing of messages to queues using routing keys or header filters
Agnostic of client technology, i.e. clients can be on any platform which supports the AMQP protocol
Has an out-of-the-box administration, monitoring and diagnostics UI
Has a wide range of extensions and tools, such as shovels allowing messages to be replicated across multiple RabbitMQ clusters.
I'm no Python expert, but from what I understand of the multiprocessing package, it serves as a manager for distributing work between worker processes and threads, so IMO it would be regarded as a more local system concern, as opposed to an 'enterprise'-level one.
e.g. you would need to handle persistence yourself, i.e. so messages are not lost during a crash / restart, and would likely need to build your own administration and monitoring tools.
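To make the persistence point concrete, here's a minimal pika sketch of a durable queue with persistent messages (the host and queue name are placeholders):

```python
import pika

# Publisher: declare a durable queue and mark messages persistent so
# both survive a broker restart.
connection = pika.BlockingConnection(pika.ConnectionParameters(host="rabbitmq.example.com"))
channel = connection.channel()
channel.queue_declare(queue="tasks", durable=True)
channel.basic_publish(
    exchange="",
    routing_key="tasks",
    body=b"work item",
    properties=pika.BasicProperties(delivery_mode=2),  # 2 = persistent message
)
connection.close()
```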
Isn't Apache Kafka just another implementation of JMS?
I am using JMS + ActiveMQ (AMQ) in my application and migrating to Apache Kafka. Do I have to change all of my JMS code?
No, Kafka is different from JMS systems such as ActiveMQ.
see ActiveMQ vs Apollo vs Kafka
Kafka has fewer features than ActiveMQ, as the emphasis has been put on performance. So before migrating, check that the features you use in AMQ are available in Kafka.
However, there is an open suggestion for a bridge between JMS and Kafka, to allow exactly what you need. Maybe the link below can help you:
https://issues.apache.org/jira/browse/KAFKA-1995
Actually, the two are not the same. And with a little more time seeing the two co-exist - and listening to problems and happy points from those deploying each in the field - there is a little more to say about each one.
Firstly, JMS supports both point-to-point messaging (where messages are sent to single consumers; the consumers themselves maintain their message queues) and the publish-and-subscribe (pub/sub) model (where messages are written to a single topic, and consumers, independently, decide which messages to consume).
In a point-to-point messaging architecture, message producers and consumers know each other, whereas in a pub/sub model they do not. Apache Kafka focuses on a pub/sub model, maintaining a separate log/topic from which consumers read from offsets. Kafka is also built for the cloud, with high throughput as a core consideration.
Many in our community and at meetups throw their hands up in frustration at MOMs (message-oriented middlewares) like JMS and switch to Kafka, for what boils down to one reason: scalability. They argue that Kafka is better suited for scale than other MOMs because Kafka maintains a partitioned topic log. In so doing, Kafka can split up message flow to groups of consumers by partition and batch-transmit the messages.
This concept also allows Kafka to have more granular control over ACLs (access control lists) for Kafka consumers, although there are some issues there, which Apache Pulsar is addressing.
Finally, on Kafka, since the client/consumer decides which messages to consume (by offset in the topic), this removes some of the producer-side complexity of routing rules built into MOMs like JMS.
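As a rough sketch of that consumer-driven model, using the kafka-python client (the topic, partition, broker and group names here are made up):

```python
from kafka import KafkaConsumer, TopicPartition

tp = TopicPartition("orders", 0)
consumer = KafkaConsumer(
    bootstrap_servers="kafka.example.com:9092",
    group_id="billing",
    enable_auto_commit=False,
)
consumer.assign([tp])

# The consumer, not the broker, decides where in the log to start reading.
consumer.seek(tp, 42)

for record in consumer:
    print(record.offset, record.value)
    consumer.commit()  # acknowledge progress explicitly
```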
There are more differences than that, but this is a distillation of some of the ones that keep coming up! Hope this helps.
No, Kafka uses its own non-standard protocol and clients.
However, there's a 3rd-party JMS Client for Kafka from Confluent.
We're using Spring Cloud Config Server. Spring config clients get updates using Spring Cloud Bus (RabbitMQ).
It looks like every config client instance creates a queue connected to the 'spring.cloud.bus' exchange.
Are there any scalability limits on how many app instances can connect to the 'spring.cloud.bus' exchange?
I suppose RabbitMQ could be scaled to handle this.
Looking for any guidelines on this.
The Spring Cloud Config Server can have multiple instances since it is stateless. That, coupled with a RabbitMQ cluster, should scale to a very large number of client instances.
A viable solution would be Spring Cloud Config Server behind a load balancer, with a RabbitMQ cluster for the bus.
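As a rough illustration, each client could then be configured along these lines (the host names are placeholders):

```properties
spring.cloud.config.uri=http://config-lb.example.com:8888
spring.rabbitmq.addresses=rabbit1.example.com:5672,rabbit2.example.com:5672
```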
The NServiceBus distributor/worker pattern makes perfect sense for MSMQ due to the hard requirement for local input queues.
But this is not the case with RabbitMQ, so I am trying to understand how and when the NServiceBus distributor is relevant with RabbitMQ, where multiple workers can read from the same remote queue.
The actual scenario is similar to using an AWS auto-scaling group to scale out workers pointing at a highly available RabbitMQ cluster. Avoiding the distributor altogether would make the setup much simpler to build, test and provision.
Thoughts?
As the RabbitMQ transport falls into the broker-style bus category, in your use case it would make more sense not to use the distributor.
The same goes for all broker-style transports, where you can use a competing consumer pattern to scale out.
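As a minimal sketch of the competing consumer pattern itself (independent of NServiceBus's own API), you just start more copies of a consumer like this and the broker spreads deliveries across them; the queue name, host and process() handler are hypothetical:

```python
import pika

# One of N identical worker processes; RabbitMQ load-balances
# deliveries across every consumer attached to the same queue.
connection = pika.BlockingConnection(pika.ConnectionParameters(host="rabbitmq.example.com"))
channel = connection.channel()
channel.queue_declare(queue="orders", durable=True)
channel.basic_qos(prefetch_count=1)  # at most one unacked message per worker

def handle(ch, method, properties, body):
    process(body)  # hypothetical message handler
    ch.basic_ack(delivery_tag=method.delivery_tag)

channel.basic_consume(queue="orders", on_message_callback=handle)
channel.start_consuming()
```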
NServiceBus is an excellent system and does wonders in most message queuing systems where you don't have an integrated distributor (which you do with exchanges in RabbitMQ). We use NServiceBus here at our company.
Azure Queues and MSMQ are perfect examples of such queuing technologies.
NServiceBus handles the distribution internally and therefore reproduces this capability for you.
However... if you are blessed with the ability to choose which queuing technology you use, then I would highly encourage you to look into RabbitMQ and an open-source product called MassTransit:
http://masstransit-project.com/
MassTransit can in turn function in both modes and will either delegate or simulate the distribution for you; however, I nonetheless have a soft spot for NServiceBus, as do our senior devs here.
Per this page...
http://docs.particular.net/nservicebus/load-balancing-with-the-distributor
Using the distributor is only useful with MSMQ; if you aren't using MSMQ then there is no point. RabbitMQ and other transports allow access to the same queue from multiple consumers, while MSMQ does not. In a nutshell, the distributor takes messages from the main queue and distributes them across multiple worker queues as the workers report that they are done with whatever they are working on.