Hazelcast WAN replication vs Solace replication

We are evaluating IMDG technologies: Apache Geode vs Hazelcast. Are there any real differences?
Hazelcast has WAN replication. Hazelcast can also be used with Solace.
What's the difference?

Hazelcast supports different implementations for WAN replication through its WanPublisher and WanConsumer interfaces.
By default, Hazelcast uses the WanBatchReplication implementation, which creates TCP connections to target members and sends the WAN events in batches. This implementation also has in-memory (replicated) queues to send events asynchronously and to absorb differences in WAN link throughput.
You can replace this implementation with SolaceWanPublisher, which publishes WAN events to Solace queues and doesn't use the direct TCP link approach used by WanBatchReplication.
Replication of WAN events is then provided by Solace, and target members consume events from those queues.
As a result, source and target members are unaware of each other's network topology.
You may want to check Hazelcast Documentation or this white paper for details.
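
The contrast between the two approaches can be sketched without any library. This is not Hazelcast's or Solace's API (both are used from Java); `BatchingPublisher`, `BrokerQueue` and the batch size of 3 are hypothetical names and values chosen only for illustration. The first class buffers events in memory and pushes them to the target in batches over a direct link; the second stands in for an external broker queue that the source writes to and the target reads from, so neither side needs to know the other's address.

```python
from collections import deque


class BatchingPublisher:
    """Hypothetical stand-in for the direct, batching approach."""

    def __init__(self, send, batch_size=3):
        self._send = send              # callable that delivers one batch to the target
        self._batch_size = batch_size  # illustrative value, not a Hazelcast default
        self._buffer = deque()         # in-memory queue that absorbs bursts

    def publish(self, event):
        self._buffer.append(event)
        if len(self._buffer) >= self._batch_size:
            self.flush()

    def flush(self):
        if self._buffer:
            batch = list(self._buffer)
            self._buffer.clear()
            self._send(batch)


class BrokerQueue:
    """Hypothetical stand-in for a queue hosted by an external broker."""

    def __init__(self):
        self._events = deque()

    def put(self, event):
        self._events.append(event)

    def drain(self):
        drained = list(self._events)
        self._events.clear()
        return drained


# Direct approach: the source pushes batches straight to the target.
target_map = {}
received_batch_sizes = []


def send_to_target(batch):
    received_batch_sizes.append(len(batch))
    for key, value in batch:
        target_map[key] = value


publisher = BatchingPublisher(send_to_target, batch_size=3)
for i in range(7):
    publisher.publish((f"key-{i}", i))
publisher.flush()  # send the last, partially filled batch

# Broker approach: the source only talks to the broker, and so does the target.
broker = BrokerQueue()
for i in range(7):
    broker.put((f"key-{i}", i))

broker_target_map = {}
for key, value in broker.drain():
    broker_target_map[key] = value
```

Both paths end with the same data on the target side. What differs is who owns the link: in the first case the source must reach the target members directly, in the second the broker sits in between.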

Related

Why use RabbitMQ or similar versus Python's built-in multiprocessing queue?

I have a producer of tasks and multiple workers to consume those tasks. Many places recommend RabbitMQ and/or Celery. However, Python has a built-in multiprocessing queue that can be shared on an IP/port using a manager/proxy. What would be the advantages of using something like RabbitMQ instead?
RabbitMQ is an enterprise-level tool, typically deployed separately on out-of-process servers / VMs / containers, and plays in the enterprise service bus space.
Rabbit has reliable messaging as an objective - e.g. messages are persisted, and nodes in the cluster can be restarted without losing messages.
Supports a large range of messaging topologies, such as Point-Point, Fan out, and Topic subscriptions
Can be scaled for volume by adding multiple nodes to a cluster
Allows for conditional routing of messages to queues using routing keys or header filters
Agnostic of client technology, i.e. Clients can be on any platform which support the AMQP protocol
Has an out of the box administration, monitoring and diagnostics UI
Has a wide range of extensions and tools, such as shovels allowing messages to be replicated across multiple RabbitMQ clusters.
I'm no Python expert, but from what I understand of the multiprocessing package, it serves as a manager for distributing work between worker processes and threads, so IMO it would be regarded as a more local system concern, as opposed to 'enterprise' level.
For example, you would need to handle persistence yourself so that messages are not lost during a crash / restart, and you would likely need to build your own administration and monitoring tools.
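
To give a feel for the persistence you would have to build yourself, here is a minimal sketch of a journaled queue. `JournaledQueue` and its file format are hypothetical, written only for this illustration; it is neither how RabbitMQ stores messages nor part of the multiprocessing package. Every put and ack is appended to a file, and a fresh instance replays that file to recover the messages that were still unacknowledged.

```python
import json
import os
import tempfile


class JournaledQueue:
    """Hypothetical minimal durable FIFO backed by an append-only journal."""

    def __init__(self, path):
        self._path = path
        self._pending = []
        self._recover()

    def _recover(self):
        # Replay the journal to rebuild the list of unacknowledged messages.
        if not os.path.exists(self._path):
            return
        with open(self._path, "r", encoding="utf-8") as journal:
            for line in journal:
                record = json.loads(line)
                if record["op"] == "put":
                    self._pending.append(record["msg"])
                elif record["op"] == "ack":
                    self._pending.remove(record["msg"])

    def _append(self, record):
        # Write the record to disk before changing the in-memory state.
        with open(self._path, "a", encoding="utf-8") as journal:
            journal.write(json.dumps(record) + "\n")
            journal.flush()
            os.fsync(journal.fileno())

    def put(self, msg):
        self._append({"op": "put", "msg": msg})
        self._pending.append(msg)

    def ack(self, msg):
        self._append({"op": "ack", "msg": msg})
        self._pending.remove(msg)

    def pending(self):
        return list(self._pending)


journal_path = os.path.join(tempfile.mkdtemp(), "tasks.journal")
tasks = JournaledQueue(journal_path)
tasks.put("task-1")
tasks.put("task-2")
tasks.put("task-3")
tasks.ack("task-1")

# Simulate a crash and restart: a new instance sees only the journal file.
restarted = JournaledQueue(journal_path)
recovered = restarted.pending()
```

Even this toy version leaves out journal compaction, concurrent writers, redelivery and monitoring, which is the kind of work a broker already does for you.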

Failover with Spring AMQP and RabbitMQ HA

There are multiple articles suggesting that a load balancer should be used in front of a RabbitMQ cluster.
However, there are also multiple references saying that Spring AMQP has its own failover implementation, such as resetting the connection when the broker comes back to life.
I have several questions regarding this topic (given that those articles are more or less old and it's 2018 today)
1. When using Spring AMQP, is load balancing still required?
2. If load balancing is still suggested, how would I solve the affinity of a primary queue to its node? There would be a lot of traffic between cluster nodes, because a round-robin load balancer has only a 1/n chance of hitting the correct cluster node, i.e. a miss rate of 1-(1/n).
3. Does Spring AMQP support some kind of topology awareness, which would allow it to consume from the correct node?
4. There were some articles suggesting that clients should publish/consume to nodes respecting the locality of queues. Does this still apply? How does this all fit together given load balancing, Spring AMQP failover and CachingConnectionFactory?
Can anybody please provide answers to those topics and also provide relevant references, which would provide additional information for verification?
Thanks a lot
For each of your bullets:
1. A load balancer makes little sense with the default configuration of Spring AMQP, since it opens a single, long-lived connection that is shared across all consumers. In 2.0, you can configure the RabbitTemplate to use a separate connection, because using different connections for publishers and consumers is a recommended configuration; this will be the default in 2.1.
It might make sense to use a load balancer if you configure the connection factory to cache connections (instead of just channels), since each component then gets its own connection.
2. See next bullet.
3. See Queue Affinity and the LocalizedQueueConnectionFactory. It uses the management plugin to determine which node currently hosts the queue and connects to that node. It will not work with a load balancer, since it needs to connect to the actual node.
4. It is my understanding from several discussions that queue affinity is only needed in the most extreme environments and that, in most environments, the difference is immeasurable. However, environments and networks differ so much that YMMV, so you may want to test. My general rule of thumb is to avoid premature optimization, since the added complexity of the configuration may simply not be worth the benefit (and you may not have a problem in the first place).
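
The arithmetic behind the locality concern can be checked with a short simulation. It assumes a single queue hosted on one node and a plain round-robin balancer; `round_robin_hit_rate` is a hypothetical helper written for this illustration, not part of Spring AMQP or RabbitMQ. With n nodes, 1/n of the connections land on the node that hosts the queue and the remaining 1-(1/n) have to be forwarded between nodes.

```python
from fractions import Fraction
from itertools import cycle


def round_robin_hit_rate(node_count, queue_node, connection_count):
    """Fraction of connections that a round-robin balancer sends to queue_node."""
    nodes = cycle(range(node_count))
    hits = sum(1 for _ in range(connection_count) if next(nodes) == queue_node)
    return Fraction(hits, connection_count)


# Three nodes, queue hosted on node 1, 300 connections handed out in turn.
hit_rate = round_robin_hit_rate(node_count=3, queue_node=1, connection_count=300)
miss_rate = 1 - hit_rate
```

Whether that extra hop matters in practice is exactly what the last bullet suggests measuring before optimizing.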

Is Apache Kafka another API for JMS?

Isn't Apache Kafka just another implementation of JMS?
I am using JMS+AMQ in my application and am migrating to Apache Kafka. Do I have to change all my JMS code?
No, Kafka is different from JMS systems such as ActiveMQ.
See ActiveMQ vs Apollo vs Kafka.
Kafka has fewer features than ActiveMQ, because its emphasis is on performance. So before migrating, check that the features you use in AMQ are available in Kafka.
However, there is an open suggestion for a bridge between JMS and Kafka, which would allow exactly what you need. Maybe the following link can help you:
https://issues.apache.org/jira/browse/KAFKA-1995
Actually, the two are not the same. And with a little more time seeing the two co-exist - and listening to problems and happy points from those deploying each in the field - there is a little more to say about each one.
Firstly, JMS supports both point-to-point messaging (where messages are sent to single consumers; the consumers themselves maintain their message queues) and the publish-and-subscribe (pub/sub) model (where messages are written to a single topic, and consumers, independently, decide which messages to consume).
In a point-to-point messaging architecture, message producers and consumers know each other, whereas in a pub/sub model they do not. Apache Kafka focuses on a pub/sub model, maintaining a separate log per topic that consumers read from by offset. Kafka is also built for the cloud, with high throughput as a core consideration.
Many in our community and at meetups throw their hands up in frustration at MOMs (message-oriented middleware) like JMS and switch to Kafka for what boils down to one reason: scalability. They argue that Kafka is better suited for scale than other MOMs because Kafka maintains a partitioned topic log. In so doing, Kafka can split up the message flow to groups of consumers by partition and transmit the messages in batches.
This concept also allows Kafka to have more granular control over ACLs (access control) to Kafka Consumers, although there are some issues there, which Apache Pulsar is addressing.
Finally, on Kafka, since the client/consumer decides which messages to consume (by offset in the topic), this removes some of the producer-side complexity of routing rules built into MOMs like JMS.
There are more differences than that, but this is a distillation of some of the ones that keep coming up! Hope this helps.
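
The partitioned-log idea described above can be sketched in a few lines. This is an in-memory toy, not the Kafka client API: `PartitionedLog` and `Consumer` are hypothetical classes, and CRC32 is swapped in as a simple stand-in for whatever hash a real partitioner uses. It shows the two points made above: records are spread over partitions by key, and each consumer keeps its own offset and decides what to read, including rereading from an earlier offset.

```python
from zlib import crc32


class PartitionedLog:
    """Hypothetical in-memory stand-in for a partitioned topic log."""

    def __init__(self, partition_count):
        self._partitions = [[] for _ in range(partition_count)]

    def append(self, key, value):
        # Same key always maps to the same partition (CRC32 is an illustrative choice).
        partition = crc32(key.encode("utf-8")) % len(self._partitions)
        self._partitions[partition].append(value)
        return partition

    def read(self, partition, offset, max_records=10):
        # Reading never removes records; it only returns a slice of the log.
        return self._partitions[partition][offset:offset + max_records]


class Consumer:
    """Hypothetical consumer that tracks its own offset per assigned partition."""

    def __init__(self, log, partitions):
        self._log = log
        self._offsets = {partition: 0 for partition in partitions}

    def poll(self):
        records = []
        for partition, offset in list(self._offsets.items()):
            batch = self._log.read(partition, offset)
            self._offsets[partition] = offset + len(batch)
            records.extend(batch)
        return records

    def seek(self, partition, offset):
        self._offsets[partition] = offset


log = PartitionedLog(partition_count=2)
for i in range(6):
    log.append(key=f"user-{i % 3}", value=f"event-{i}")

first = Consumer(log, partitions=[0])
second = Consumer(log, partitions=[1])

seen = first.poll() + second.poll()  # each partition is read by one consumer
nothing_new = first.poll()           # offsets have advanced, so nothing is returned
first.seek(0, 0)                     # the consumer rewinds itself
replayed = first.poll()              # and rereads partition 0 from the start
```

Because the log keeps records after they are read, replay is a consumer-side decision. That is a different model from a JMS queue, where a consumed message is normally gone.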
No, Kafka uses its own non-standard protocol and clients.
However, there's a 3rd-party JMS Client for Kafka from Confluent.

RabbitMQ federation vs ActiveMQ Master/Slave

I am trying to set up a cluster of brokers that has the same features as a RabbitMQ cluster, but over a WAN (my machines are in different locations), so a RabbitMQ cluster does not work.
I am looking at alternatives. RabbitMQ federation just backs up the messages in the downstream; it cannot guarantee that both sides have exactly the same messages available at any time (the downstream still keeps old messages that were already consumed in the upstream).
How about ActiveMQ Master/Slave? I have found:
http://activemq.apache.org/how-do-distributed-queues-work.html
"queues and topics are all replicated between each broker in the cluster (so often to a master and maybe a single slave). So each broker in the cluster has exactly the same messages available at any time so if a master fails, clients failover to a slave and you don't loose a message."
My question is whether it updates automatically so that master and slave always have the same messages, which would mean that messages consumed on the master also disappear from the slaves.
Thanks :)
ActiveMQ has various clustering features.
First there is High Availability - "Master/Slave". The idea is that several physical servers act as a single logical ActiveMQ broker. If one goes down, another takes its place without losing data. You can do that by sharing the message store (shared file system or shared JDBC), or you could set up a replicated cluster, which replicates read/writes to the master down to all slaves (you need three+ servers). ActiveMQ uses LevelDB and Apache ZooKeeper to achieve this.
The other form of clustering available in ActiveMQ distributes load and separates security over several logical brokers. The brokers are then connected in a network of brokers. By default, messages are passed around to the broker that has available consumers for that message. However, there is a rich toolbox of features in ActiveMQ to tweak a network of brokers to do things such as always sending a copy of a message to a specific broker. It takes some messing with the more advanced features though (static network connectors and queue mirroring, maybe more).
Maybe there is a better way to solve your requirements, which are not really specified in the question?
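
As a rough illustration of the replicated master/slave idea, the sketch below assumes that every write to the master's store is replicated, including the removal that follows a consumed message. That assumption is mine, made for the sake of the example. `ReplicatedStore` is a hypothetical toy class and says nothing about how ActiveMQ, LevelDB or ZooKeeper implement this.

```python
class ReplicatedStore:
    """Hypothetical message store: each write on the master is applied to every slave."""

    def __init__(self, slave_count):
        self.master = []
        self.slaves = [[] for _ in range(slave_count)]

    def _replicate(self, operation, message):
        # Apply the same operation to the master and to all slaves.
        for store in [self.master, *self.slaves]:
            if operation == "add":
                store.append(message)
            else:
                store.remove(message)

    def enqueue(self, message):
        self._replicate("add", message)

    def consume(self):
        # Assumption: removing a consumed message is a replicated write too.
        message = self.master[0]
        self._replicate("remove", message)
        return message

    def fail_over(self):
        """Promote the first slave to master, as if the old master had died."""
        self.master = self.slaves.pop(0)


store = ReplicatedStore(slave_count=2)
for name in ["m1", "m2", "m3"]:
    store.enqueue(name)

consumed = store.consume()
store.fail_over()
remaining_after_failover = list(store.master)
```

Under that assumption, the promoted slave holds only the messages that were not yet consumed, which is the behaviour the question asks about. With a shared message store the point is moot, because there is only one copy of the data.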

Are ActiveMQ, Redis and Apache Camel the right combination?

Are ActiveMQ, Redis and Apache Camel the right combination?
I am planning a high-performance, enterprise-level integration solution across multiple applications.
My objective is to make the solution
a. independent of the consumers performance
b. easy to troubleshoot in case of any issue
c. highly available with failover support
d. able to handle 10k messages per second
Here I'm planning to have
a. a network of ActiveMQ brokers running on all app servers and storing the consumed messages in a Redis data store
b. from the Redis data store, the application can retrieve the messages through Camel endpoints
(a Camel endpoint is chosen to process the messages before they reach the app).
Also, can ActiveMQ be removed, leaving only Redis + Apache Camel? I see from discussion forums that Redis does most of what ActiveMQ does.
Could anyone advise on this technology stack?
ActiveMQ and Camel work great together and scale very well - it should be no problem to handle the load given proper hardware.
ActiveMQ and Camel works great together and scales very well - should be no problem to handle the load given proper hardware.
Are you thinking about something like this?
Message producer App -> ActiveMQ -> Camel -> Redis
Message Consumer App <- Camel [some endpoint] <- Redis
Putting ActiveMQ in between is usually a very good way to achieve HA and load balancing and to make the solution elastic. Depending on your specific setup with machines etc., ActiveMQ can help in many ways to solve HA issues.
Removing ActiveMQ can be a good option if your apps use some protocol other than JMS/ActiveMQ messaging, e.g. HTTP, raw TCP or similar. Can you elaborate on how the apps will communicate with Camel? ActiveMQ, by default, supports transactions and guaranteed delivery, and you can live with a limited number of threads on the server, even for your heavy traffic. For other protocols, this might be a bit trickier to achieve. Without an HA layer (cluster) in ActiveMQ, you need to set up Redis to handle HA in all aspects, which might be just as easy, but Redis is a bit memory hungry, so be aware of that.