Data broadcasting between instances of distributed server - udp

I'm trying to get some feedback on the recommendations for a service 'roster' in my specific application. I have a server app that maintains persistent socket connections with clients. I want to further develop the server to support distributed instances. Server "A" would need to be able to broadcast data to the other online server instances. Same goes for all other active instances.
Options I am trying to research:
Redis / Zookeeper / Doozer - Each server instance would register itself with the configuration server, and all connected servers would receive configuration updates as they change. What then?
Maintain end-to-end connections with each server instance and iterate over the list with each outgoing data?
Some custom UDP multicast, but I would need to roll my own added reliability on top of it.
Custom message broker - A service that runs and maintains a registry as each server connects and informs it. Maintains a connection with each server to accept data and re-broadcast it to the other servers.
Some reliable UDP multicast transport where each server instance just broadcasts directly and no roster is maintained.
Here are my concerns:
I would love to avoid relying on external apps like Zookeeper or Doozer, but I would obviously use them if that's the best solution.
With a custom message broker, I wouldn't want it to become a throughput bottleneck, which might mean I would also have to run multiple message brokers behind a load balancer when scaling?
Multicast doesn't require any external processes if I manage to roll my own, but otherwise I would probably need to use ZMQ, which again leaves me with an external dependency.
I realize that I am also talking about message delivery, but it goes hand in hand with the solution I go with.
By the way, my server is written in Go. Any ideas on a best recommended way to maintain scalability?
* EDIT of goal *
What I am really asking is what is the best way to implement broadcasting data between instances of a distributed server given the following:
Each server instance maintains persistent TCP socket connections with its remote clients and passes messages between them.
Messages need to be broadcast to the other running instances so they can be delivered to the relevant client connections.
Low latency is important because the messaging can be high speed.
Sequence and reliability are important.
* Updated Question Summary *
If you have multiple servers / multiple end points that need to pub/sub between each other, what is a recommended mode of communication between them? One or more message brokers to re-pub messages to a roster of the discovered servers? Reliable multicast directly from each server?
How do you connect multiple end points in a distributed system while keeping latency low, speed high, and delivery reliable?

Assuming all of your client-facing endpoints are on the same LAN (which they can be for the first reasonable step in scaling), reliable UDP multicast would allow you to send published messages directly from the publishing endpoint to any of the endpoints who have clients subscribed to the channel. This also satisfies the low-latency requirement much better than proxying data through a persistent storage layer.
Multicast groups
A central database (say, Redis) could track a map of multicast groups (IP:PORT) <--> channels.
When an endpoint receives a new client with a new channel to subscribe, it can ask the database for the channel's multicast address and join the multicast group.
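The channel-to-group lookup could be sketched roughly like this in Go. The `GroupRegistry` type, the `239.1.x.y` address scheme and the port are assumptions made up for illustration, and the in-memory map only stands in for the central database (Redis or similar). `Join` uses the standard library's `net.ListenMulticastUDP` and is not exercised in `main` because it needs a multicast-capable interface.

```go
package main

import (
	"fmt"
	"net"
	"sync"
)

// GroupRegistry maps channel names to multicast group addresses.
// Hypothetical in-memory stand-in for the central database.
type GroupRegistry struct {
	mu     sync.Mutex
	port   int
	next   int
	groups map[string]string
}

func NewGroupRegistry(port int) *GroupRegistry {
	return &GroupRegistry{port: port, next: 1, groups: make(map[string]string)}
}

// GroupFor returns the channel's multicast address, allocating one on first use.
// Addresses come from 239.0.0.0/8, the administratively scoped multicast range.
// This toy scheme only has room for 65535 channels.
func (r *GroupRegistry) GroupFor(channel string) string {
	r.mu.Lock()
	defer r.mu.Unlock()
	if addr, ok := r.groups[channel]; ok {
		return addr
	}
	addr := fmt.Sprintf("239.1.%d.%d:%d", r.next/256, r.next%256, r.port)
	r.next++
	r.groups[channel] = addr
	return addr
}

// Join subscribes to the channel's group on the default interface.
func Join(r *GroupRegistry, channel string) (*net.UDPConn, error) {
	addr, err := net.ResolveUDPAddr("udp4", r.GroupFor(channel))
	if err != nil {
		return nil, err
	}
	return net.ListenMulticastUDP("udp4", nil, addr)
}

func main() {
	reg := NewGroupRegistry(9999)
	fmt.Println(reg.GroupFor("chat")) // 239.1.0.1:9999
	fmt.Println(reg.GroupFor("news")) // 239.1.0.2:9999
	fmt.Println(reg.GroupFor("chat")) // same address as the first call
}
```

With a real shared store the allocation step would have to be atomic across endpoints, so that two endpoints asking for the same new channel get the same group.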
Reliable UDP multicast
When an endpoint receives a published message for a channel, it sends the message to that channel's multicast socket.
Message packets will contain ordered identifiers per server per multicast group. If an endpoint receives a message without having received the previous message from that server, it will send a "not acknowledged" (NAK) message back to the publishing server for each message it missed.
The publishing server tracks a list of recent messages, and resends NAK'd messages.
To handle the edge case of a server sending only one message and having it fail to reach another server, each server can periodically announce its packet count to the multicast group for as long as those messages remain in its NAK queue ("I've sent 24 messages"), giving other servers a chance to NAK any messages they missed.
You might want to just implement PGM (Pragmatic General Multicast, RFC 3208).
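As a rough illustration of the receiving side of the NAK scheme above, here is a minimal Go sketch. It is not PGM. The `Packet` and `Receiver` types, and the assumption that sequence numbers start at 1 for each sender, are invented for the example. Network I/O, the publisher's resend buffer and NAK rate limiting are left out.

```go
package main

import "fmt"

// Packet carries a per-sender sequence number (hypothetical wire format).
type Packet struct {
	Sender  string
	Seq     uint64
	Payload string
}

// Receiver tracks, for one multicast group, the next sequence number
// expected from each sender and any packets that arrived early.
type Receiver struct {
	next    map[string]uint64
	pending map[string]map[uint64]Packet
}

func NewReceiver() *Receiver {
	return &Receiver{next: map[string]uint64{}, pending: map[string]map[uint64]Packet{}}
}

// expected assumes every sender starts counting at 1.
func (r *Receiver) expected(sender string) uint64 {
	if n, ok := r.next[sender]; ok {
		return n
	}
	return 1
}

// missing lists sequence numbers in [from, to] that are not buffered.
func (r *Receiver) missing(sender string, from, to uint64) []uint64 {
	var naks []uint64
	for s := from; s <= to; s++ {
		if _, held := r.pending[sender][s]; !held {
			naks = append(naks, s)
		}
	}
	return naks
}

// Receive returns the packets that can now be delivered in order and
// the sequence numbers that should be NAK'd back to the sender.
func (r *Receiver) Receive(p Packet) (deliver []Packet, naks []uint64) {
	next := r.expected(p.Sender)
	if p.Seq < next {
		return nil, nil // duplicate of something already delivered
	}
	if r.pending[p.Sender] == nil {
		r.pending[p.Sender] = map[uint64]Packet{}
	}
	r.pending[p.Sender][p.Seq] = p
	for {
		q, ok := r.pending[p.Sender][next]
		if !ok {
			break
		}
		deliver = append(deliver, q)
		delete(r.pending[p.Sender], next)
		next++
	}
	r.next[p.Sender] = next
	return deliver, r.missing(p.Sender, next, p.Seq)
}

// Heartbeat handles "I've sent N messages" and returns what to NAK.
func (r *Receiver) Heartbeat(sender string, sent uint64) []uint64 {
	return r.missing(sender, r.expected(sender), sent)
}

func main() {
	r := NewReceiver()
	for _, seq := range []uint64{1, 3, 2} {
		deliver, naks := r.Receive(Packet{Sender: "A", Seq: seq})
		fmt.Printf("got %d: deliver %d packet(s), NAK %v\n", seq, len(deliver), naks)
	}
	fmt.Println("after heartbeat 5, NAK", r.Heartbeat("A", 5))
}
```

Packet 3 is held back until the resend of packet 2 arrives, which preserves per-sender ordering. A real implementation would also bound the pending buffer and give up on a sender whose history has already expired.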
Persistent storage
If you do end up storing data long-term, storage services can join the multicast groups just like endpoints... but store the messages in a database instead of sending them to clients.

Related

Load balancing WebSocket with Redis and RabbitMQ

Consider a small chat server. In this server, the actual processing of messages is done by nodes of a service called "chat". Communications of this service along with a "user" service are then aggregated via a "gateway" service in front that is the only service that actually communicates with the users and is in charge of passing requests received to other services via the RabbitMQ channel they share.
In a system designed like this, each user is connected to one of the instances of the "gateway" service and when sending and receiving messages indirectly communicates with the private "chat" or "user" services behind. To load balance this, we have an Nginx reverse-proxy on the edge that tries to distribute requests to different "gateway" instances. But since WebSocket connection is real-time, "chat" instances should also be able to send messages to the right instance of the "gateway" in charge of that specific user for user-specific messages and to all "gateway" instances for site-wide messages. This is a problem since with RabbitMQ I don't believe we can target a specific subscriber and even if we could, we don't know to which instance that specific user is connected right now.
Therefore, since we are using Socket.io for the WebSocket connection, I am thinking of adding a new Redis node to the stack to allow this communication between different instances of the "gateway" service. This is directly supported by Socket.io, works well, and removes the limitations imposed by RabbitMQ. However, we would still be using RabbitMQ to route a message from a "chat" instance to a "gateway" instance, which then propagates it through Redis until the "gateway" instance holding the user's connection is found and delivers it.
This adds unnecessary lag to user-specific outbound messages. So here I am asking if anyone has a better idea of how this problem should be approached and how to decrease this lag.
Personally, I have the idea of adding Socket.io to the "chat" services (with no client access) and using its backend to send the message directly to the Redis store, so that the "gateway" instance connected to the user can route it directly, bypassing RabbitMQ entirely for this type of message.
It might be important to mention that none of these services are here just to do this specific thing, RabbitMQ is heavily used for communication between different services acting as the message broker and the "gateway" service works with multiple other services for data aggregation, authentication and data validation and transformation. The above example was a simplified version of the problem at hand with the minimum number of moving parts that I could easily describe here.
Edit: To send messages directly to the socket.io Redis store, apparently the following library can be used without loading the whole socket.io library:
https://github.com/socketio/socket.io-redis-emitter

Client queue persistence

AMQP brokers have persistence settings that allow guaranteed delivery, but that only works if the message actually reaches the broker. If there is a network failure and a subsequent client crash/reboot, messages could be lost. Is there some way in RabbitMQ, ActiveMQ, or some other messaging framework for the client (producer) to persist messages to disk, so that any unsent messages are not lost if the client crashes or is rebooted?
I have seen people run a broker locally in order to get around this issue. That seems like an unnecessary amount of work, especially if you don't have much control over the deployment of your client.
In reality you've answered your own question pretty well. Many people looking for client-side persistence turn to embedded brokers because it's actually a very good solution. Having a local broker that can store and forward gives you a lot more flexibility than a built-in persistence layer in each client. All local clients can share one broker instance, which lets you move storage as needed if you find that locally stored messages are building up due to unforeseen remote downtime.
There are of course some client implementations that do offer storage but finding one depends on your chosen broker / protocol and of course your willingness to shell out the money to buy support or licensing if that client happens to not be from say an open source implementation. The MQTT Paho client does I think have a local storage option as do some others.

MassTransit - Distributed Messaging Model - Reliable/Durable - NServiceBus too expensive

I would like to use MassTransit the way NServiceBus works, where every publisher and subscriber has a local queue. However, I want to use RabbitMQ.
So do all my desktop clients need to have RabbitMQ installed? I think so. Should I then just connect the 50 desktop clients and 2 servers into a cluster?
I know the two servers must be in the same cluster. However, 50 client nodes seems a bit much to put in one cluster. Or should I shovel or federate them to the server cluster exchange?
The desktop machines send messages like: LockOrder, UnLock Order.
The Servers are dealing with backend hl7 messages.
Any help and advice here is much appreciated, this is all on windows machines.
Basically I am leaving NServiceBus behind, as it is now too expensive; they are aiming it at large corporations with big budgets, hence MassTransit.
However I want reliable/durable messaging, hence local queues on ALL publishers and ALL subscribers.
The desktops also use CQS to update their views.
should I just connect the 50 desktop clients and 2 servers into a cluster?
Yes, you have to connect your clients to the cluster.
However 50 client nodes, seems a bit much to put in one cluster.
No, 50 clients is a small number (though it depends on how big your servers are).
Or should I shovel them or Federate them to the server cluster exchange?
The desktop machines send messages like: LockOrder, UnLock Order.
I think the cluster is the better choice, because federation and shovel are asynchronous, which means your LockOrder might not be replicated in time.
However I want reliable/durable messaging, hence local queues on ALL publishers and ALL subscribers
With RabbitMQ you can create persistent queues and messages, so the client does not need to be connected when a message is published. It will get the messages when it next connects to the broker.
I hope it helps.
I have a FOSS ESB project called Shuttle, if you would like to give it a spin: https://github.com/Shuttle/shuttle-esb
I haven't used NServiceBus for a while and actually started Shuttle when it went commercial. The implementation is somewhat different from NServiceBus. I don't know MassTransit at all, though. Currently process managers (sagas) have to be hand-rolled in Shuttle whereas MassTransit and NServiceBus have this incorporated. If I do get around to adding sagas I'll be adding them as a Module that can be plugged into the receiving pipeline. This way one could have various implementations and choose the flavour you like :)
Back to your issue. Shuttle has the concept of an optional outbox for queuing technologies like RabbitMQ. Shuttle does have a RabbitMQ implementation. I believe the outbox works somewhat like 'shovel' does. So the outbox would be local and sending messages would first go to the outbox. It would periodically try to send messages on to the recipients and, after a configurable number of attempts, send the message to an error queue. It can then be returned to the outbox for further attempts, or even moved directly to the recipient queue once it is up.
Documentation here: http://shuttle.github.io/shuttle-esb/
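The outbox behaviour described above (queue locally, retry, park in an error queue after too many attempts, optionally return to the outbox) can be sketched in a few lines of Go. This is not Shuttle's API. The `Outbox` and `Message` types and the `Pump`/`RetryFailed` names are invented for illustration, and everything is kept in memory for brevity where a real outbox would be durable.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrUnavailable is a placeholder error for an unreachable recipient.
var ErrUnavailable = errors.New("recipient unavailable")

// Message is an outgoing message plus its delivery attempt count.
type Message struct {
	To, Body string
	Attempts int
}

// Outbox holds messages locally until the recipient accepts them.
type Outbox struct {
	MaxAttempts int
	pending     []Message
	failed      []Message // the "error queue"
}

// Send never talks to the network; it only queues locally.
func (o *Outbox) Send(to, body string) {
	o.pending = append(o.pending, Message{To: to, Body: body})
}

// Pump makes one delivery pass. Messages that fail are kept for the next
// pass, or moved to the error queue once MaxAttempts is reached.
func (o *Outbox) Pump(deliver func(Message) error) {
	var keep []Message
	for _, m := range o.pending {
		if deliver(m) == nil {
			continue
		}
		m.Attempts++
		if m.Attempts >= o.MaxAttempts {
			o.failed = append(o.failed, m)
		} else {
			keep = append(keep, m)
		}
	}
	o.pending = keep
}

// RetryFailed returns error-queue messages to the outbox with a fresh count.
func (o *Outbox) RetryFailed() {
	for _, m := range o.failed {
		m.Attempts = 0
		o.pending = append(o.pending, m)
	}
	o.failed = nil
}

func main() {
	o := &Outbox{MaxAttempts: 2}
	o.Send("server", "LockOrder")
	down := func(Message) error { return ErrUnavailable }
	up := func(m Message) error { fmt.Println("delivered", m.Body, "to", m.To); return nil }

	o.Pump(down)
	o.Pump(down)
	fmt.Println("pending:", len(o.pending), "failed:", len(o.failed)) // pending: 0 failed: 1
	o.RetryFailed()
	o.Pump(up)
	fmt.Println("pending:", len(o.pending), "failed:", len(o.failed)) // pending: 0 failed: 0
}
```

In practice `Pump` would run on a timer, which matches the "periodically try to send" behaviour described above.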

Is it possible to thread pool IMAP connections?

From what I understand IMAP requires a connection per each user. I'm writing an IMAP client (currently just gmail) that supports many (100s, 1000s maybe 10000s+) users at a time. Obviously cutting down the number of open connections would be great. I'm wondering if it's possible to use thread pooling on my side to connect to gmail via IMAP or if that simply isn't supported by the IMAP protocol.
IMAP typically uses SSL over TCP/IP. And a TCP/IP connection will need to be maintained per IMAP client connection, meaning that there will be many simultaneous open connections.
These multiple simultaneous connections can easily be maintained in a non-threaded (single thread) implementation without affecting the state of the TCP connections. You'll need some sort of flow concept per IMAP TCP/IP connection, and store all of the flows in a container (a C++ STL map, for instance) using the TCP/IP five-tuple (or socket fd) as a key. For each data packet received, look up the flow and handle the packet accordingly. Nothing about this approach affects the TCP or IMAP connections.
Since this works in a single-threaded environment, adding a thread pool will only increase the throughput of the application, because you can handle data packets for several flows simultaneously (assuming it's a multi-core CPU). You just need to make sure that two threads don't handle data packets for the same flow at the same time, which could cause the packets to be handled out of order. One approach is to assign a group of flows to each thread, maybe using IP pools or something similar.
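One way to get the "group of flows per thread" behaviour is to hash the flow key to a fixed worker, so that every packet of a given flow is always handled by the same worker, in arrival order. Below is a minimal Go sketch using goroutines in place of threads. The `Pool` and `Packet` types and the FNV hash are illustrative assumptions, and the flow key would be the five-tuple or socket fd in practice.

```go
package main

import (
	"fmt"
	"hash/fnv"
	"sync"
)

// Packet is a chunk of data belonging to one flow (one IMAP connection).
type Packet struct {
	Flow string // e.g. the five-tuple rendered as a string
	Seq  int
	Data string
}

// Pool runs a fixed number of workers, each with its own FIFO queue.
type Pool struct {
	queues []chan Packet
	wg     sync.WaitGroup
}

// NewPool starts the workers. handle is called with the worker index
// and must not block indefinitely.
func NewPool(workers int, handle func(worker int, pkt Packet)) *Pool {
	p := &Pool{queues: make([]chan Packet, workers)}
	for i := range p.queues {
		p.queues[i] = make(chan Packet, 64)
		p.wg.Add(1)
		go func(id int, in <-chan Packet) {
			defer p.wg.Done()
			for pkt := range in {
				handle(id, pkt)
			}
		}(i, p.queues[i])
	}
	return p
}

// workerFor pins a flow to one worker by hashing its key.
func (p *Pool) workerFor(flow string) int {
	h := fnv.New32a()
	h.Write([]byte(flow))
	return int(h.Sum32() % uint32(len(p.queues)))
}

// Dispatch queues the packet on its flow's worker. Packets of one flow
// never run concurrently, so they are handled in dispatch order.
func (p *Pool) Dispatch(pkt Packet) {
	p.queues[p.workerFor(pkt.Flow)] <- pkt
}

// Close stops accepting packets and waits for the queues to drain.
func (p *Pool) Close() {
	for _, q := range p.queues {
		close(q)
	}
	p.wg.Wait()
}

func main() {
	pool := NewPool(4, func(worker int, pkt Packet) {
		fmt.Printf("worker %d: flow %s packet %d\n", worker, pkt.Flow, pkt.Seq)
	})
	for seq := 0; seq < 3; seq++ {
		pool.Dispatch(Packet{Flow: "10.0.0.1:50000", Seq: seq})
		pool.Dispatch(Packet{Flow: "10.0.0.2:50001", Seq: seq})
	}
	pool.Close()
}
```

The trade-off is that one very busy flow can back up its worker while others sit idle. That is the price of never having to lock per-flow state.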

Advice on disconnected messages with WCF through firewalls

All,
I'm looking for advice over the following scenario:
I have a component running in one part of the corporate network that sends messages to an application logic component for processing. These components might reside on the same server, on different servers in the same network (LAN or WAN), or live outside in the cloud. The application server should be scalable and resilient.
The messages are related in that the sequence they arrive is important. They are time-stamped with the client timestamp.
My thinking is that I'll get the clients to use WCF basicHttpBinding (some are based on .NET CF which only has basic) to send messages to the Application Server (this is because we can guarantee port 80/443 will be open for outgoing connections). Server accepts these, and writes these into a queue. This queue can be scaled out if needed over multiple machines.
I'm hesitant to use MSMQ for the queue though, as to properly scale out we would have to install separate private queues on each application server and monitor the queues round-robin. I'm concerned that a message on a server that has gone down would be unavailable until that server is restored, and we could end up processing a later message from a different server and disrupting the sequence.
What I'd prefer is a central queue (e.g. a database table) that all application servers monitor.
With this in mind, what I'd like to do is create a custom WCF binding, similar to netMsmqBinding, that uses the DB table instead. I'm confused as to whether I can simply create a custom transport or whether I need a full binding, and whether the binding will allow the client to send over HTTP. I've looked around the internet but I'm not sure where to start.
I could skip the custom WCF binding, but it seems a good way to introduce scalability if I do need to separate the servers.
Any suggestions please would be helpful, including alternatives.
Many thanks
I would start with MSMQ because it is built exactly for this purpose. Use a single transactional queue on a clustered machine and let the application servers take messages from this queue for processing. Each message's processing has to be part of a distributed transaction (MSDTC).
This scenario will ensure:
A clustered queue host ensures that if one cluster node fails, the other will still be able to handle requests.
Sending each message as recoverable means the message is persisted to disk (not only held in memory), so even after a critical failure of the whole cluster you will still have all messages.
A transactional queue ensures that all message transport operations are atomic: moving a message from the outgoing queue to the destination queue is processed as a transaction. The original message is kept in the outgoing queue until the ack from the destination queue arrives. Transactional processing can also ensure in-order delivery.
A distributed transaction allows application servers to consume messages within a transaction. A message will not be deleted from the queue until the application server commits the transaction; if the transaction times out, it is rolled back and the message remains in the queue.
MSMQ is also available on .NET CF so you can send messages directly to queue without intermediate non-reliable web service layer.
It should be possible to configure MSMQ over HTTP (but I have never used it, so I'm not sure how it cooperates with the previously mentioned features).
Your proposed solution will be pretty hard. You will end up building BizTalk's MessageBox. But if you really want to do it, check Omar's post about building a database queue table.
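The "not deleted until commit" behaviour that both the transactional queue and a database queue table rely on can be illustrated with a small in-memory sketch in Go. It is not MSMQ, MSDTC or WCF code. The `Queue` and `Tx` types are hypothetical, transaction timeouts are left out, and only one message may be in flight at a time so that strict ordering is preserved.

```go
package main

import (
	"fmt"
	"sync"
)

// Queue is a FIFO whose head can be held by at most one open transaction.
type Queue struct {
	mu   sync.Mutex
	msgs []string
	inTx bool
}

// Tx represents one consumer's claim on the head message.
type Tx struct {
	q    *Queue
	done bool
}

func (q *Queue) Enqueue(m string) {
	q.mu.Lock()
	defer q.mu.Unlock()
	q.msgs = append(q.msgs, m)
}

// Begin hands out the head message without removing it. It reports false
// if the queue is empty or another consumer is still processing the head.
func (q *Queue) Begin() (*Tx, string, bool) {
	q.mu.Lock()
	defer q.mu.Unlock()
	if q.inTx || len(q.msgs) == 0 {
		return nil, "", false
	}
	q.inTx = true
	return &Tx{q: q}, q.msgs[0], true
}

func (t *Tx) finish(remove bool) {
	t.q.mu.Lock()
	defer t.q.mu.Unlock()
	if t.done {
		return
	}
	t.done = true
	if remove {
		t.q.msgs = t.q.msgs[1:]
	}
	t.q.inTx = false
}

// Commit deletes the message; only now is it gone from the queue.
func (t *Tx) Commit() { t.finish(true) }

// Rollback releases the claim so the same message is delivered again.
func (t *Tx) Rollback() { t.finish(false) }

func main() {
	q := &Queue{}
	q.Enqueue("m1")
	q.Enqueue("m2")

	tx, msg, _ := q.Begin()
	fmt.Println("processing", msg) // processing m1
	tx.Rollback()                  // simulated failure: m1 stays at the head

	tx, msg, _ = q.Begin()
	fmt.Println("processing", msg) // processing m1
	tx.Commit()

	tx, msg, _ = q.Begin()
	fmt.Println("processing", msg) // processing m2
	tx.Commit()
}
```

Because a later message cannot be claimed while the head is in flight, a failed consumer cannot cause messages to be processed out of sequence, which is the concern raised in the question. The cost is that processing is effectively serialized per queue.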