CometD Failover Ability - VM Switch During Restart - virtual-machine

I have a chat implementation working with CometD.
On front end I have a Client that has a clientId=123 and is talking to VirtualMachine-1
The longpolling connection between the VirtualMachine-1 and the Client is done through the clientId. When the connection is established during the handshake, VirtualMachine-1 registers the 123 clientId as it's own and accepts its data.
For some reason, if VM-1 is restarted or FAILS. The longpolling connection between Client and VM-1 is disconnected (since the VirtualMachine-1 is dead, the heartbeats would fail, thus it would become disconnected).
In which case, CometD loadBalancer will re-route the Client communication to a new VirtualMachine-2. However, since VirtualMachine-2 has different clientId it is not able to understand the "123" coming from the Client.
My question is - what is the cometD behavior in this case? How does it re-route the traffic from VM-1 to a new VM-2 to successfully go through handshaking process?

When a CometD client is redirected to the second server by the load balancer, the second server does not know about this client.
The client will send a /meta/connect message with clientId=123, and the second server will reply with a 402::unknown_session and advice: {reconnect: "handshake"}.
When receiving the advice to re-handshake, the client will send a /meta/handshake message and will get a new clientId=456 from the second server.
Upon handshake, a well written CometD application will subscribe (even for dynamic subscriptions) to all needed channels, and eventually be restored to function as before, almost transparently.
Messages published to the client during the switch from one server to the other are completely lost: CometD does not implement any persistent feature.
However, persisting messages until the client acknowledged them is possible: CometD offers a number of listeners that are invoked by the CometD implementation, and through these listeners an application can persist messages (or other information) into their own choice of persistent (and possibly distributed) store: Redis, RDBMS, etc.
CometD handles reconnection transparently for you - it just takes a few messages between client and the new server.
You also want to read about CometD's in-memory clustering features.

Related

Load balancing WebSocket with Redis and RabbitMQ

Consider a small chat server. In this server, the actual processing of messages is done by nodes of a service called "chat". Communications of this service along with a "user" service are then aggregated via a "gateway" service in front that is the only service that actually communicates with the users and is in charge of passing requests received to other services via the RabbitMQ channel they share.
In a system designed like this, each user is connected to one of the instances of the "gateway" service and when sending and receiving messages indirectly communicates with the private "chat" or "user" services behind. To load balance this, we have an Nginx reverse-proxy on the edge that tries to distribute requests to different "gateway" instances. But since WebSocket connection is real-time, "chat" instances should also be able to send messages to the right instance of the "gateway" in charge of that specific user for user-specific messages and to all "gateway" instances for site-wide messages. This is a problem since with RabbitMQ I don't believe we can target a specific subscriber and even if we could, we don't know to which instance that specific user is connected right now.
Therefore, since we are using Socket.io for WebSocket connection, I am thinking of adding a new Redis node to the stack to allow this communication between different instances of the "gateway" service. This is directly supported by Socket.io and works alright and removes all sorts of limitations imposed by the RabbitMQ, however, we are still using RabbitMQ to route a message from a "chat" instance to a "gateway" instance that then will propagate through the Redis service and when the right "gateway" instance having access to the user is found, delivered to them.
This adds unnecessary lag to user-specific outbound messages. So here I am asking if anyone has a better idea of how this problem should be approached and how to decrease this lag.
Personally, I have this idea of adding Socket.io to "chat" services (with no client access) and use its backend to send the message directly to the Redis store so that the instance of the "gateway" connected to it can route it directly to the user, going over the whole RabbitMQ thing for this type of messages.
It might be important to mention that none of these services are here just to do this specific thing, RabbitMQ is heavily used for communication between different services acting as the message broker and the "gateway" service works with multiple other services for data aggregation, authentication and data validation and transformation. The above example was a simplified version of the problem at hand with the minimum number of moving parts that I could easily describe here.
Edit: To send messages directly to socket.io redis store, the following library can be used apparently not to load the whole socket.io library:
https://github.com/socketio/socket.io-redis-emitter

How SSL server handle multiple connections?

On the law lvl of SSL protocols there are 4 types of messages:
Handshake Protocol
ChangeCipherSpec Protocol
Alert Protocol
Application Data Protocol
After the handshaking is completed and the symmetric private key been exchanged, the client will send Application Data messages to the server.
How ever same server can handle multiple clients, and each of those clients got it's own symmetric key.
Does the server keeps the connection open with all of the clients? If not how does the server know what symmetric key to use for an incoming connection? Does Application Data Protocol provide some sort of session id that the server can use to map to the right key?
Does the server keeps the connection open with all of the clients?
It can, depending on how the server is implemented.
how does the server know what symmetric key to use for an incoming connection?
Imagine you have a multiplayer game. Your server code typically looks something like this:
sock = socket.listen(port)
clients = []
if sock.new_client_waiting:
new_client = sock.accept()
clients.append(new_client)
for client in clients:
if client.new_data_waiting:
data = client.read()
# handle incoming actions...
So if there are two clients, the server will just know about both and have a socket object for both. Therein lies your answer: the OS (its TCP stack) handles the concept of connections and by providing you a socket, you can read/write from and to that socket, and you know from which clients it originates (to some degree of certainty anyway).
Many servers work differently, e.g. web server code is much more like this:
sock = socket.listen(port)
while True: # infinitely loop...
client = sock.accept() # This call will block until a client is available
spawn_new_http_handler(client)
Whenever a new person connects, a worker thread will pick it up and manage things from there. But still, it will have its socket to read from and write to.
Does Application Data Protocol provide some sort of session id that the server can use to map to the right key?
I do not know these specs by heart, but I am pretty sure the answer is no. Session resumption is done earlier, at the handshake stage, and is meant for clients returning. E.g. if I connected to https://example.com 30 minutes ago and return now, it might have my session and we don't need to do the whole handshake again. It doesn't have anything to do with telling clients apart.

How to determine and verify which RabbitMQ Queues are using SSL

I am trying to demonstrate to others that my queue is using SSL, however from the RabbitMQ web management tools there seems to be no distinction over which queues are using SSL and which are not.
Using RabbitMQ management on localhost, I am able to see all my queues. I have set up SSL on port 5671 successfully using the troubleshooting from RabbitMQ website.
Using MassTransit I have configured my incoming bus to use localhost:5671/my_queue_name with a client certificate and all is working successfully - I just can't confirm to others that the queue is secure. If I get a message from the queue using the web management tools, I can read the (JSON) message in plain text. Any ideas how I can prove my messages are secure?
I've attempted using BusDriver to peek the queues but get nothing back (independent of whether is SSL or not).
SSL is used to secure connections, not to encrypt queue contents.
What SSL gives you is that communication from clients to RabbitMQ will be encrypted, so you could theoretically be sure that nobody tampered with your messages.
Also if you need to validate that the sender of the message is a particular user, you could use this RabbitMQ extension: http://www.rabbitmq.com/validated-user-id.html

recv() fails on UDP

I’m writing a simple client-server app which for the time being will be for my own personal use. I’m using Winsock for the net communication. I have not done any networking for the last 10 years, so I am quite rusty. I’d like to use as little external code as possible, so I have written a home-made server discovery mechanism, as follows.
The client broadcasts a message containing the ‘name’ of a client UDP socket bound to an arbitrary port, which I will call the client’s discovery socket. The server recv() the broadcast and then sendto() the client discovery socket the ‘name’ of its listening socket. The client then uses this info to connect to the server (on a different socket). This mechanism should allow the server to bind its listening socket to the first port it can within the dynamic port range (49152-65535) and to the clients to discover where the server is and on which port it is listening.
The server part works fine: the server receives the broadcast messages and successfully sends its response.
On the client side the firewall log shows that the server’s response arrives to the machine and that it is addressed to the correct port (to the client’s discovery socket).
But the message never makes it to the client app. I’ve tried doing a recv() in blocking and non-blocking mode, and there is never any data available. ioctlsocket() always shows no data is available, even though I know the packet got it to the machine.
The server succeeds on doing a recv() on broadcasted data. But the client fails on doing a recv() of the server’s response which is addressed to its discovery socket.
The question is very vague: what gotchas should I watch for in this scenario? Why would recv() fail to get a packet which has actually arrived to the machine? The sockets are UDP, so the fact that they are not connected is irrelevant. Or is it?
Many thanks in advance.
The client broadcasts a message containing the ‘name’ of a client UDP socket bound to an arbitrary port, which I will call the client’s discovery socket.
The message doesn't need to contain anything. Just broadcast an empty message from the 'discovery socket'. recvfrom() will tell the server where it came from, and it can just reply directly.
The server recv() the broadcast and then sendto() the client discovery socket the ‘name’ of its listening socket.
Fair enough, although actually the server could just broadcast its own TCP listening port every 5 seconds or whatever.
On the client side the firewall log shows that the server’s response arrives to the machine and that it is addressed to the correct port (to the client’s discovery socket). But the message never makes it to the client app
If it got to the host it must get to the application. You must have got the ports mixed up somehow. Simplify it as above and retry.
Well, it was one of those stupid situations: Windows Firewall was active, besides the other firewall, and silently dropping packets. Deactivating it solved the problem.
But I still don’t understand how it works, as it was allowing the server to receive packets sent through broadcasting. And when I got at my wits end and set the server to answer back through a broadcast, THOSE packets got dropped.
Two days of frustration. I hope someone profits from my experience.

wcf and duplex communication

I have a lot of client programs and one service.
This Client programs communicate with the server with http channel with WCF.
The clients have dynamic IP.
They are online 24h/day.
I need the following:
The server should notify all the clients in 3 min interval. If the client is new (started in the moment), is should notify it immediately.
But because the clients have dynamic IP and they are working 24h/day and sometimes the connection is unstable, is it good idea to use wcf duplex?
What happens when the connection goes down? Will it automatically recover?
Is is good idea to use remote MSMQ for this type of notification ?
Regards,
WCF duplex is very resource hungry and per rule of thumb you should not use more than 10. There is a lot of overhead involved with duplex channels. Also there is not auto-recover.
If you know the interval of 3 minutes and you want the client to get information when it starts why not let the client poll the information from the server?
When the connection goes down the callback will throw an exception and the channel will close.
I am not sure MSMQ will work for you unless each client will create an MSMQ queue for you and you push messages to each one of them. Again with an unreliable connection it will not help. I don't think you can "push" the data if you loose the connection to a client, client goes off-line or changes an IP without notifying your system.