MQTT backend scaling - load-balancing

I am currently developing a typical IoT service. At the moment, multiple devices connect to one MQTT broker (Mosquitto), and my Java backend also connects to the broker (using Paho).
The problem I see is the following:
When I run multiple instances of my Java backend, every instance will receive and process every message. That's a big issue: I want each message to be delivered to only one Java backend. Does anybody have an idea how to deal with this problem?
Btw: Java backends will be added or removed depending on the load.

There are a couple of options:
Place a queuing system between your application and the MQTT broker, possibly something like Apache Kafka.
The HiveMQ and IBM MessageSight brokers support (with different implementations) something called shared subscriptions. This allows messages to be shared out between more than one client. Shared subscriptions are likely to be formally added to the MQTTv5 spec, which should mean that they will be added to more brokers and have a standard implementation.
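To make the shared-subscription option a bit more concrete, here is a minimal sketch of how each backend instance could build its subscription filter. It assumes the `$share/<group>/<topicFilter>` form that MQTT 5 ended up standardizing; brokers with their own implementation may expect a different syntax. `SharedSubscriptions`, the group name `backend` and the topic `devices/+/telemetry` are made up for illustration and are not part of Paho or any broker.

```java
// Sketch only: SharedSubscriptions is a hypothetical helper,
// not part of Paho or of any broker API.
final class SharedSubscriptions {

    private SharedSubscriptions() {}

    /**
     * Builds a shared-subscription topic filter in the MQTT 5 form
     * $share/<group>/<topicFilter>. All clients that subscribe with the
     * same group name share the matching messages between them.
     */
    static String filter(String group, String topicFilter) {
        // The group (share name) must be non-empty and must not
        // contain '/', '+' or '#'.
        if (group == null || group.isEmpty()
                || group.contains("/") || group.contains("+") || group.contains("#")) {
            throw new IllegalArgumentException("invalid share group: " + group);
        }
        if (topicFilter == null || topicFilter.isEmpty()) {
            throw new IllegalArgumentException("topic filter must not be empty");
        }
        return "$share/" + group + "/" + topicFilter;
    }

    public static void main(String[] args) {
        // Assumed names: group "backend", topic "devices/+/telemetry".
        String filter = filter("backend", "devices/+/telemetry");
        System.out.println(filter);
        // With Paho, every backend instance would then subscribe with this
        // same string, for example: client.subscribe(filter, 1);
    }
}
```

Because every instance uses the same group name, adding or removing backends only changes how many clients share the messages; the devices keep publishing to the plain topic. This only works if the broker in use actually supports shared subscriptions.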

Related

Load Balancing with multi broker ActiveMQ artemis instance

I need a suggestion on how best to achieve load balancing with the setup in the diagram below. I am trying to create two machines with master brokers, and I expect the consumer/publisher applications to use one common (load-balanced) URL, so that I do not have to expose the individual VM details and port numbers. The load balancer should take care of the routing.
This is what we typically do with an F5 or HTTP load balancer. Can the same be achieved with ActiveMQ, and is it advisable?
On the other side, I also tried configuring WebLogic this way to consume data from an ActiveMQ queue:
failover://(tcp://localhost:61616,tcp://localhost:61617)?randomize=true
but this does not help, or WebLogic does not understand this format.
Messaging connections are stateful. They are not stateless like HTTP connections, and therefore cannot be load-balanced in the same way as HTTP connections. It may be possible to configure an F5 to deal with stateful messaging connections, but I can't say for sure. I'm not an expert on F5.
Both the ActiveMQ Artemis broker itself as well as the JMS client shipped with the broker have load-balancing functionality built in. There's too much to cover here so I recommend you review the clustering documentation for the relevant details.
You might also try using the broker balancer feature. It's currently experimental, but it should be ready to use in the 2.21.0 release coming in the March/April time-frame. It can act like an F5 for your messaging connections, but it can do some more intelligent things like always sending certain clients to the same node which can facilitate certain use-cases which are not possible in a traditional cluster.
The URL failover://(tcp://localhost:61616,tcp://localhost:61617)?randomize=true which you are using is for the OpenWire JMS client shipped with ActiveMQ 5.x. If you're using the core JMS client shipped with ActiveMQ Artemis then you should be using a URL like this instead:
(tcp://localhost:61616,tcp://localhost:61617)?ha=true
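If the broker addresses come from configuration rather than being hard-coded, the URL can be assembled programmatically. The sketch below is only an illustration: `ArtemisUrls` and `haUrl` are hypothetical names, and the only thing taken from the answer is the URL shape itself.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Sketch only: ArtemisUrls is a hypothetical helper, not part of the Artemis client.
final class ArtemisUrls {

    private ArtemisUrls() {}

    /**
     * Builds a URL of the form (tcp://host1:port1,tcp://host2:port2)?ha=true
     * from a list of "host:port" strings.
     */
    static String haUrl(List<String> hostPorts) {
        if (hostPorts == null || hostPorts.isEmpty()) {
            throw new IllegalArgumentException("at least one broker address is required");
        }
        return hostPorts.stream()
                .map(hostPort -> "tcp://" + hostPort)
                .collect(Collectors.joining(",", "(", ")?ha=true"));
    }

    public static void main(String[] args) {
        String url = haUrl(Arrays.asList("localhost:61616", "localhost:61617"));
        System.out.println(url);
        // The resulting string is what would be handed to the Artemis JMS
        // connection factory in place of the failover:// URL.
    }
}
```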

Load balancing WebSocket with Redis and RabbitMQ

Consider a small chat server. In this server, the actual processing of messages is done by nodes of a service called "chat". Communications of this service along with a "user" service are then aggregated via a "gateway" service in front that is the only service that actually communicates with the users and is in charge of passing requests received to other services via the RabbitMQ channel they share.
In a system designed like this, each user is connected to one of the instances of the "gateway" service and when sending and receiving messages indirectly communicates with the private "chat" or "user" services behind. To load balance this, we have an Nginx reverse-proxy on the edge that tries to distribute requests to different "gateway" instances. But since WebSocket connection is real-time, "chat" instances should also be able to send messages to the right instance of the "gateway" in charge of that specific user for user-specific messages and to all "gateway" instances for site-wide messages. This is a problem since with RabbitMQ I don't believe we can target a specific subscriber and even if we could, we don't know to which instance that specific user is connected right now.
Therefore, since we are using Socket.io for the WebSocket connection, I am thinking of adding a new Redis node to the stack to allow this communication between different instances of the "gateway" service. This is directly supported by Socket.io, works alright, and removes the limitations imposed by RabbitMQ. However, we are still using RabbitMQ to route a message from a "chat" instance to a "gateway" instance; the message then propagates through the Redis service and, once it reaches the "gateway" instance that holds the user's connection, is delivered to the user.
This adds unnecessary lag to user-specific outbound messages. So here I am asking if anyone has a better idea of how this problem should be approached and how to decrease this lag.
Personally, I have this idea of adding Socket.io to the "chat" services (with no client access) and using its backend to send the message directly to the Redis store, so that the "gateway" instance connected to the user can route it straight to them, skipping the whole RabbitMQ hop for this type of message.
It might be important to mention that none of these services are here just to do this specific thing, RabbitMQ is heavily used for communication between different services acting as the message broker and the "gateway" service works with multiple other services for data aggregation, authentication and data validation and transformation. The above example was a simplified version of the problem at hand with the minimum number of moving parts that I could easily describe here.
Edit: To send messages directly to the socket.io Redis store, the following library can apparently be used to avoid loading the whole socket.io library:
https://github.com/socketio/socket.io-redis-emitter

Client queue persistence

AMQP brokers have persistence settings that allow guaranteed delivery, but that only works if the message actually reaches the broker. If there is a network failure and a subsequent client crash/reboot, messages could be lost. Is there some way in RabbitMQ or ActiveMQ or some other messaging framework for the client (producer) to persist messages to disk, so that any unsent messages are not lost if the client crashes or is rebooted?
I have seen people run a broker locally in order to get around this issue. That seems like an unnecessary amount of work, especially if you don't have much control over the deployment of your client.
In reality you've answered your own question pretty well. Many people looking for client-side persistence turn to embedded brokers because it's actually a very good solution. Having a local broker that can store and forward gives you a lot more flexibility than just a built-in persistence layer in each client. All local clients can share one broker instance, which lets you move storage as needed if you find that your stored local messages are building up due to unforeseen remote downtime.
There are of course some client implementations that do offer storage, but finding one depends on your chosen broker / protocol and of course your willingness to pay for support or licensing if that client happens not to be from, say, an open source implementation. I think the MQTT Paho client has a local storage option, as do some others.
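To give an idea of what a client-side persistence layer has to do, here is a minimal store-and-forward sketch using only the Java standard library. It is an illustration under stated assumptions, not how any particular client implements it: `FileOutbox`, its method names and the one-file-per-message layout are all hypothetical, and the `sender` callback stands in for whatever publish call your client offers (it should return true only once the broker has confirmed the message).

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.List;
import java.util.function.Predicate;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Sketch only: FileOutbox is a hypothetical class, not part of any messaging client.
final class FileOutbox {
    private static final String SUFFIX = ".msg";

    private final Path dir;
    private long nextId;

    FileOutbox(Path dir) throws IOException {
        this.dir = Files.createDirectories(dir);
        // Continue numbering after any messages left over from a previous run.
        List<Path> pending = pendingFiles();
        this.nextId = pending.isEmpty() ? 0 : idOf(pending.get(pending.size() - 1)) + 1;
    }

    /** Stores the message on disk before any attempt is made to send it. */
    synchronized void enqueue(String payload) throws IOException {
        Path tmp = dir.resolve(nextId + ".tmp");
        Path msg = dir.resolve(String.format("%020d%s", nextId, SUFFIX));
        Files.write(tmp, payload.getBytes(StandardCharsets.UTF_8));
        // Rename into place so a crash never leaves a half-written .msg file.
        Files.move(tmp, msg, StandardCopyOption.ATOMIC_MOVE);
        nextId++;
    }

    /**
     * Tries to send pending messages in order. A message is deleted only after
     * the sender reports success; the first failure stops the run.
     * Returns the number of messages sent.
     */
    synchronized int flush(Predicate<String> sender) throws IOException {
        int sent = 0;
        for (Path file : pendingFiles()) {
            String payload = new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
            if (!sender.test(payload)) {
                break; // keep this and all later messages for the next attempt
            }
            Files.delete(file);
            sent++;
        }
        return sent;
    }

    synchronized int pendingCount() throws IOException {
        return pendingFiles().size();
    }

    private List<Path> pendingFiles() throws IOException {
        try (Stream<Path> files = Files.list(dir)) {
            // Zero-padded names make lexicographic order equal to insertion order.
            return files.filter(p -> p.getFileName().toString().endsWith(SUFFIX))
                    .sorted()
                    .collect(Collectors.toList());
        }
    }

    private static long idOf(Path file) {
        String name = file.getFileName().toString();
        return Long.parseLong(name.substring(0, name.length() - SUFFIX.length()));
    }
}
```

A producer would call `enqueue` for every message and `flush` whenever the connection is up; after a crash, constructing a new `FileOutbox` on the same directory picks up whatever was not yet sent. Note the limits of the sketch: delivery is at-least-once (a crash between sending and deleting produces a duplicate), and it does not force data to disk with an fsync, so it does not cover power loss.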

Does Apache Apollo have failover support?

I'm looking to use a message queue system for an ongoing project, which now is relying on a custom (and brittle) message subsystem to interconnect multiple applications. Both the pub/sub and queue patterns are heavily used in my system.
Apache Apollo is one of the message queue systems I'm taking into account, but I can't find information about how I can handle (for instance) an Apollo server failure.
Is there a way to provide failover support in Apollo?
No, as of now this has not been resolved. Apollo is a very good broker, indeed, but it lacks some production-critical features like failover. Apollo was an attempt to make a core for the next generation of ActiveMQ. However, development is no longer active.
Have you considered other brokers like Apache Artemis? It's basically a new attempt to remake ActiveMQ with code from HornetQ, ActiveMQ and Apollo. Development is very active at the moment and there is support for failover etc.

What solution should I use for this webapp with websockets. ActiveMQ?

I'm currently in the middle of developing a web application which needs a websocket connection to receive notifications of events from the server.
The clients are separated in groups and all the clients in a group must receive the same event notifications.
I thought that ActiveMQ could probably support this model, using different queues for each group of clients. It would also be relatively easy to push events to ActiveMQ using stomp, and then use stomp-over-websockets for the clients.
The problem I see is that messages should not be consumed by only one client, but distributed to all the clients connected to the queue.
Also the queue should not be stored. If a client is not connected when the event is generated, then it will never receive it.
I don't know ActiveMQ that much, so I'm not sure if this is possible or if there is another easy solution that could be used instead of writing my own message server.
Thanks
ActiveMQ 5.4.1 supports WebSockets natively (just like Stomp, JMS, etc.).
There is the concept of queues (you mentioned these), but also of topics.
In a queue, a single message will be received by exactly one consumer; in a topic, it goes to all the subscribers. See: http://activemq.apache.org/how-does-a-queue-compare-to-a-topic.html
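The difference is easy to see in a toy model. The following sketch is plain in-memory Java, not ActiveMQ code; `SimpleQueue` and `SimpleTopic` are hypothetical classes that only mimic the two delivery rules. The topic here is also non-durable, which matches the requirement that a client which is not connected when the event is generated never receives it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Sketch only: an in-memory model of the two delivery rules, not a broker.
final class QueueVsTopic {

    private QueueVsTopic() {}

    /** Queue rule: each message goes to exactly one consumer (round-robin here). */
    static final class SimpleQueue {
        private final List<Consumer<String>> consumers = new ArrayList<>();
        private int next = 0;

        void addConsumer(Consumer<String> consumer) {
            consumers.add(consumer);
        }

        void send(String message) {
            if (consumers.isEmpty()) {
                return; // this sketch drops the message; a real queue would store it
            }
            consumers.get(next).accept(message);
            next = (next + 1) % consumers.size();
        }
    }

    /** Topic rule: each message goes to every subscriber registered at publish time. */
    static final class SimpleTopic {
        private final List<Consumer<String>> subscribers = new ArrayList<>();

        void subscribe(Consumer<String> subscriber) {
            subscribers.add(subscriber);
        }

        void publish(String message) {
            for (Consumer<String> subscriber : subscribers) {
                subscriber.accept(message);
            }
        }
    }

    public static void main(String[] args) {
        SimpleTopic groupA = new SimpleTopic(); // one topic per client group
        groupA.subscribe(m -> System.out.println("client 1 got " + m));
        groupA.subscribe(m -> System.out.println("client 2 got " + m));
        groupA.publish("event"); // both clients are notified

        SimpleQueue work = new SimpleQueue();
        work.addConsumer(m -> System.out.println("worker 1 got " + m));
        work.addConsumer(m -> System.out.println("worker 2 got " + m));
        work.send("job"); // only one worker is notified
    }
}
```

For the use case in the question, one topic per client group gives the "everyone in the group sees every event" behaviour, whereas one queue per group would hand each event to a single client.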
There are some Stomp-WebSocket JS libraries floating around. Kaazing has a bundle that includes ActiveMQ and supports JMS API/Stomp protocol over WebSockets with support for older browsers, different client technologies, and Cross-Site security.
Look at Pusher; otherwise you'll need something that supports topic-based pub/sub. You could look at Redis or RabbitMQ.