WebSocket connection over Microsoft network load balancer (NLB) - asp.net-core

I have an on-premises NLB with 4 nodes, and on each of those nodes there is an app that communicates with the client over WebSockets using SignalR.
I have noticed through testing that at some point the client stops receiving messages over the WebSocket; there are no errors, the socket just stops receiving messages.
My suspicion is that the client connects to node 1 through the NLB, but when node 1 has too much traffic the NLB switches the client to node 2, and since the client isn't registered on node 2 the messages stop arriving and the client doesn't notice the change.
My questions are:
Am I correct in assuming that, in order to keep the default NLB configuration, I would have to add a Redis backplane that keeps the list of all connections and allows each node to communicate with the client regardless of which node was first to receive the connection? (Microsoft's suggested solution: https://learn.microsoft.com/en-us/aspnet/core/signalr/scale?view=aspnetcore-6.0)
Is there a way to change the NLB configuration so that specific apps behave differently from others? For example, can I set up the NLB so that all of the SignalR WebSocket traffic goes to a specific node, while the rest of the apps on the servers keep the default behavior?
If I were to move to the cloud, would I face the same issue, or is this something that happens only with the Microsoft on-premises NLB? Could it also happen in Kubernetes or on AWS? Does the cloud NLB operate at L7 while the on-premises one operates at L3?

Related

RabbitMQ with HA req/reply

I have the following scenario which I want to fulfill:
RabbitMQ must be load balanced (is this something RabbitMQ provides out of the box, or would something like an HAProxy load balancer work better?)
Can HAProxy push messages directly to RabbitMQ? (Let's say a POST request coming to http://localhost:3333/redirectToRabbit gets redirected to RabbitMQ, and optionally either the ACK or the response goes back to the client. Also note that HAProxy would load balance the request.)
With HA, what is the best configuration (exchange with durable queue, durable queue, or something else)? NOTE: how would messages get redirected to another RabbitMQ instance if one of the RabbitMQ instances goes down -- persisted and automatically redirected to an available RabbitMQ node?
Assuming you set up a two-node RabbitMQ cluster: before talking about HAProxy, you need to understand HA policies and the behavior of HA queues first. Different HA options can cause completely different RabbitMQ message-replication and node-failover behavior. RabbitMQ is very flexible, so don't expect one golden configuration that covers every scenario.
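On the durable-queue part of the question, here is a minimal, hedged sketch with the RabbitMQ .NET client (other client libraries have equivalents); the host and queue name are placeholders, and note that the HA/mirroring policy itself is applied on the broker through policies, not in client code:

```csharp
using System.Text;
using RabbitMQ.Client;

var factory = new ConnectionFactory { HostName = "rabbit-node-1" }; // placeholder host
using var connection = factory.CreateConnection();
using var channel = connection.CreateModel();

// A durable queue survives a broker restart.
channel.QueueDeclare(queue: "orders",        // placeholder queue name
                     durable: true,
                     exclusive: false,
                     autoDelete: false,
                     arguments: null);

// Persistent messages are written to disk, so a queued message is not lost
// when the node hosting it restarts; whether another node holds a copy is
// decided by the mirroring/HA policy on the broker.
var props = channel.CreateBasicProperties();
props.Persistent = true;

channel.BasicPublish("", "orders", props, Encoding.UTF8.GetBytes("hello"));
```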
Then, since you have two nodes that can accept connections, your client can either use a load balancer (such as HAProxy) or use a client driver that supports connecting to multiple nodes of a cluster. Either way will work.
When using HAProxy, you have one load balancer IP. The client connects only to this IP, and the load balancer forwards the connection to one of the underlying nodes. Once a connection is created, that client connection keeps talking to the same node. If one of the nodes goes down and no health-checking options are configured in your load balancer, clients might get random connection failures. With health checking configured correctly, the load balancer knows which nodes are down, so clients will only be connected to healthy nodes, which solves the issue.
When not using a load balancer and relying only on the client driver to connect to all the nodes, the driver should be able to handle connection failures or health checks internally and do failover/retry, etc., to ensure connections go to healthy nodes.
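For the client-driver route, a rough sketch with the RabbitMQ .NET client (host names are placeholders): the factory is handed every node, connects to one that is reachable, and automatic recovery reconnects, possibly to another node, after a failure.

```csharp
using System;
using System.Collections.Generic;
using RabbitMQ.Client;

var factory = new ConnectionFactory
{
    AutomaticRecoveryEnabled = true,                  // re-establish the connection after a node failure
    NetworkRecoveryInterval = TimeSpan.FromSeconds(5) // wait between recovery attempts
};

// List every cluster node; the client picks a reachable one and fails over
// to the others when the current node becomes unavailable.
var endpoints = new List<AmqpTcpEndpoint>
{
    new AmqpTcpEndpoint("rabbit-node-1"),             // placeholder host names
    new AmqpTcpEndpoint("rabbit-node-2")
};

using var connection = factory.CreateConnection(endpoints);
using var channel = connection.CreateModel();
```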

Mule HA Cluster - Application configuration issue

We are working on a Mule HA cluster PoC with two separate server nodes. We were able to create the cluster. We developed a small dummy application with an HTTP endpoint and a reliability-pattern implementation that loops for a period and prints a value. When we deploy the application to the Mule HA cluster, even though it deploys successfully to the cluster and an application log file is generated on both servers, it runs on only one server. In the application we can point the HTTP endpoint to only one server IP. Could anyone please clarify the following queries?
In our case, why is the application running on only one server (whichever server's IP it points to gets executed)?
Will the Mule HA cluster create a virtual IP?
If not, which IP do we need to configure in the application for the HTTP endpoints?
Do we need a load balancer for HTTP-based endpoint requests? If so, which IP needs to be configured in the application for the HTTP endpoint, given that we don't have a virtual IP for the Mule HA cluster?
Really appreciate any help on this.
Environment: Mule EE ESB v 3.4.2 & Private cloud.
1) You are seeing one server processing requests because you are sending them to the same server each time.
2) Mule HA will not create a virtual IP.
3/4) You need to place a load balancer in front of the Mule nodes in order to distribute the load when using HTTP inbound endpoints. You do not need to decide which IP to place in the HTTP connector within the application; the load balancer will route each request to one of the nodes.
Creating a Mule cluster will just allow your Mule applications to share information through its shared memory (VM transport and Object Store) and make the polling endpoints poll from only a single node. In the case of HTTP, it will listen on each of the nodes, but you need to put a load balancer in front of your Mule nodes to distribute the load. I recommend you read the High Availability documentation. But the more important question is: why do you need to create a cluster at all? You can have two separate Mule servers with your application deployed and have a load balancer send requests to them.

Backplane vs Sticky Load balancer

I am developing a SignalR application. There will be multiple instances of my application running on different servers behind a load balancer. I read about the backplane and found out that it mainly serves to handle server failure and the hopping of requests between multiple servers (there might be other benefits).
Please consider the scenario below and suggest whether I still need a backplane.
I am using sticky load balancing (i.e. all subsequent requests from a client go to the same server), so in the normal case there is no chance of a request hopping to another server.
How I handled the server-down scenario: when a server goes down, the client tries to reconnect and gets a "404 - not found" error. At this point the client starts a new connection and it works.
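For reference, a hedged sketch of that reconnect handling with the ASP.NET Core SignalR C# client (the URL is a placeholder for the load-balanced address): WithAutomaticReconnect retries the existing connection, and the Closed handler starts a brand-new connection once those retries give up.

```csharp
using System;
using System.Threading.Tasks;
using Microsoft.AspNetCore.SignalR.Client;

var connection = new HubConnectionBuilder()
    .WithUrl("https://my-nlb.example.com/chat") // placeholder: the load balancer address
    .WithAutomaticReconnect()                   // built-in retries for transient drops
    .Build();

// Fired when automatic reconnection gives up (e.g. the node that owned the
// connection is gone and the server answers 404 for the old connection id).
connection.Closed += async error =>
{
    await Task.Delay(Random.Shared.Next(1000, 5000)); // back off briefly
    await connection.StartAsync();                    // brand-new connection, possibly on another node
};

await connection.StartAsync();
```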
The main reason for having a backplane when developing SignalR application comes from the following scenario:
let's say you have 2 web servers hosting your application, serverA and serverB
you have 2 clients connecting to your application, client1 who is served by serverA and client2 who is served by serverB
A good assumption when developing a SignalR application is that you want these 2 clients to communicate with one another. So client1 sends a message to client2.
The moment client1 sends a message, its request is completed by serverA. But serverA keeps its mapping of connected users in memory. It looks for client2, but client2 is kept in the memory of serverB, so the message will never get there.
By using a backplane, essentially every message that comes in on one server is broadcast to all other servers.
One solution is to forward messages between servers, using a component called a backplane. With a backplane enabled, each application instance sends messages to the backplane, and the backplane forwards them to the other application instances.
Taken from SignalR Introduction to scaleout
Be sure to check this backplane with Redis from the SignalR documentation.
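In ASP.NET Core SignalR (the stack from the original question), wiring up the Redis backplane is essentially one call. A minimal sketch, assuming the Microsoft.AspNetCore.SignalR.StackExchangeRedis package, with the Redis address and ChatHub as placeholders:

```csharp
using Microsoft.AspNetCore.SignalR;

var builder = WebApplication.CreateBuilder(args);

// Every node publishes and subscribes through the same Redis server, so a hub
// message sent from serverA also reaches clients whose WebSocket is on serverB.
builder.Services
    .AddSignalR()
    .AddStackExchangeRedis("my-redis:6379"); // placeholder connection string

var app = builder.Build();
app.MapHub<ChatHub>("/chat");
app.Run();

public class ChatHub : Hub { }               // placeholder hub
```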
Hope this helps. Best of luck!

Best Practice for setting up RabbitMQ cluster in production with NServiceBus

Currently we have 2 load-balanced web servers. We are just starting to expose some functionality over NSB. If I create two "app" servers, would I create a cluster between all 4 servers? Or should I create 2 clusters?
i.e.
Cluster1: Web Server A, App Server A
Cluster2: Web Server B, App Server B
If it is one cluster, how do I keep a published message from being handled more than once by the same logical subscriber, given that the subscriber is deployed to both app server A and app server B?
Is message durability the only reason I would put RabbitMQ on the web servers (assuming I didn't have any of the app services running on the web servers as well)? In that case my assumption is that I would be using cluster mirroring to get the message to the app server. Is this correct?
Endpoints vs Servers
NServiceBus uses the concept of endpoints. An endpoint is tied to a queue from which it receives messages. If this endpoint is scaled out for either high availability or performance, you still have one queue (with RabbitMQ). So if you have an instance running on server A and one on server B, they both (with RabbitMQ) get their messages from the same queue.
I wouldn't think in terms of app servers but in terms of endpoints and their non-functional requirements with regard to deployment, availability and performance.
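A hedged sketch of that idea (the exact transport API varies by NServiceBus.RabbitMQ version; "Sales" and the connection string are placeholders): both app servers run this same code, and because every instance uses the same endpoint name and therefore the same queue, each published message is processed by exactly one instance.

```csharp
using System;
using System.Threading.Tasks;
using NServiceBus;

class Program
{
    static async Task Main()
    {
        // The logical endpoint; every instance uses the same name and therefore
        // the same RabbitMQ queue, so instances compete for messages.
        var endpointConfiguration = new EndpointConfiguration("Sales");

        var transport = endpointConfiguration.UseTransport<RabbitMQTransport>();
        transport.UseConventionalRoutingTopology();
        transport.ConnectionString("host=rabbit-node-1"); // placeholder broker address

        var endpoint = await Endpoint.Start(endpointConfiguration);
        Console.WriteLine("Endpoint started. Press any key to stop.");
        Console.ReadKey();
        await endpoint.Stop();
    }
}
```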
Availability vs Performance vs Deployment
It is not required to host all endpoints on both server A and server B. You can also run services X and Y on server A and services U and V on server B. You then scale out for performance but not for availability, although availability is already less of an issue because of the async nature of messaging. This can make deployment easier.
Pubsub vs Request Response
If the same logical endpoint has multiple instances deployed, then it should not matter which instance processes an event. If it does matter, then it probably isn't pub/sub but async request/response. NServiceBus handles this by creating a queue for each instance (with RabbitMQ) on which the response can be received if that response requires affinity to the requesting instance.
Topology
You have:
Load balanced web farm cluster
Load balanced RabbitMQ cluster
NServiceBus endpoints, which can be deployed as:
highly available, with multiple instances on different machines
spread across various machines (could even be a machine per endpoint)
a combination of both
Infrastructure
You could choose to run the RabbitMQ cluster on the same infrastructure as your web farm or keep it separate. It depends on your requirements and available resources. If the web farm and the RabbitMQ cluster are separate, then you can more easily scale them out independently.

service discovery, load balancing and connection pooling approach

There are two approaches that can be used for service interaction in an SOA for large systems deployed on a cloud like AWS.
Have each service cluster behind an internal ELB. The client maintains a connection pool to the corresponding ELB and the ELB does round-robin balancing.
Go with a service discovery approach like Netflix Eureka.
Currently we are using the 1st approach, where each service cluster is behind an internal ELB and clients communicate via the ELBs, so each client instance has to maintain only 1 pool, i.e. with the ELB endpoint.
I have the following doubts regarding the 2nd approach.
Is there a benefit in moving to service discovery and a smart-client architecture, where the service client knows all service instances (via a Eureka service or equivalent) and does its own load balancing?
In the above case, how does connection pooling work? Currently each client instance has to maintain exactly 1 connection pool, i.e. with the corresponding service's ELB. But with a rich client, each client will have all the service instance endpoints to communicate with directly. Making a new connection on each request would not be efficient, and having so many connection pools (1 for each service instance) for each client seems like overkill.
I need inputs/suggestions on the above two questions.
First question.
Yes, there is. First, you can do better failure recovery - for example, retry a failed request against another node without showing any error to the client. Next, you can do better balancing than the ELB offers. Next, you can automatically add/remove nodes to/from the cluster without altering the ELB configuration; this is very useful if your nodes have health checks. More importantly, a software balancer can do this fast.
Second question.
Have a connection pool per node, i.e.
[api method in client code] -> [software balancer] -> [node connection pool] -> [node connection] -> [use this connection to make request]
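A rough sketch of that layering in C# (all names are hypothetical): one HttpClient, which keeps its own connection pool, per discovered node, a round-robin pick per request, and a retry against the next node when a request fails.

```csharp
using System;
using System.Collections.Generic;
using System.Net.Http;
using System.Threading.Tasks;

// Hypothetical smart client: [api method] -> [software balancer] -> [node pool] -> [request]
class SoftwareBalancer
{
    private readonly List<string> _nodes;                        // e.g. instance URLs from Eureka
    private readonly Dictionary<string, HttpClient> _pools = new();
    private int _next;

    public SoftwareBalancer(IEnumerable<string> nodeBaseUrls)
    {
        _nodes = new List<string>(nodeBaseUrls);
        foreach (var node in _nodes)
            _pools[node] = new HttpClient { BaseAddress = new Uri(node) }; // per-node connection pool
    }

    public async Task<string> GetAsync(string path)
    {
        // Try each node at most once, starting from the round-robin position.
        for (var attempt = 0; attempt < _nodes.Count; attempt++)
        {
            var node = _nodes[_next];
            _next = (_next + 1) % _nodes.Count;
            try
            {
                return await _pools[node].GetStringAsync(path);  // reuses pooled connections to this node
            }
            catch (HttpRequestException)
            {
                // Node looks unhealthy: fall through and retry the next one
                // without surfacing an error to the caller yet.
            }
        }
        throw new InvalidOperationException("All nodes failed.");
    }
}
```

Each client instance keeps one such balancer per service, so the number of pools equals the number of that service's instances rather than one pool per request, which keeps the overhead manageable.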