How does a cluster of servers behind a load balancer achieve replication of variables that are initialized during a client/server session? For example, when a client (let's say a game client) starts a session with a load balancer that forwards the request to an available server, how does that server replicate that session, and its in-memory state, to the other servers in the cluster?
              LB
               |
     App      App      App
      |
(memory variable)
      |------------ replicated?
Or are the memory variables for the established session not replicated at all, with only session data replicated in a database tier? That wouldn't account for all the variables a server must keep in memory.
It seems to me that, to achieve synchronization in a multiplayer game, a cluster of servers must replicate all of their state to the other servers. But does that mean replicating all of their memory variables?
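For illustration, the common alternative to replicating process memory is the database-tier option mentioned above: each server externalizes the session variables it needs into a shared store, so whichever server the load balancer picks next can load them. A minimal sketch, assuming a Redis instance at redis-host:6379 as that shared tier and the redis-py client (both assumptions, not anything the load balancer provides):

    # Sketch only: servers do not copy process memory to each other; they
    # externalize session state to an assumed shared Redis instance instead.
    import json
    import redis

    store = redis.Redis(host="redis-host", port=6379)

    def save_session(session_id, state):
        # Any app server can write the session's variables as a serialized blob.
        store.set(f"session:{session_id}", json.dumps(state), ex=3600)

    def load_session(session_id):
        # Any other app server the LB picks can rebuild the same state.
        raw = store.get(f"session:{session_id}")
        return json.loads(raw) if raw else {}

    # Server A handles the first request...
    save_session("abc123", {"player": "p1", "score": 42})
    # ...and server B handles the next one without any memory replication.
    state = load_session("abc123")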
Related
When a read replica is created, two IPs are assigned: one to the master and one to the read replica.
So when an application is connected to Cloud SQL using the master IP, does it only use the master instance, or is it connected to both instances?
Does Cloud SQL load balance the traffic among the replicas, or does the application have to manually connect to the replicas?
Is there a way to achieve this without manually connecting to each instance?
So when an application is connected to Cloud SQL using the master IP, does it only use the master instance, or is it connected to both instances?
When the client is connected to the IP address of the master, it is only connected to the master.
Does Cloud SQL load balance the traffic among the replicas, or does the application have to manually connect to the replicas?
Google Cloud SQL does not load balance. If you wish to distribute read-only traffic, the client must perform that function.
Is there a way to achieve this without manually connecting to each instance?
No. The client must connect to the master and the replicas itself to distribute read-only traffic. Logic must be present to send write traffic to the master only.
I wrote an in-depth article on this topic:
Google Cloud SQL for MySQL – Connection Security, High Availability and Failover
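As a rough illustration of that client-side logic (the IP addresses and the mysql-connector-python driver below are assumptions, not something Cloud SQL configures for you), reads can be rotated across the replicas while every write goes to the master:

    # Sketch only: MASTER_IP / REPLICA_IPS are placeholders for your instances'
    # addresses; mysql-connector-python is just one possible driver.
    import itertools
    import mysql.connector

    MASTER_IP = "10.0.0.1"            # assumed master IP
    REPLICA_IPS = ["10.0.0.2"]        # assumed read replica IP(s)
    _replicas = itertools.cycle(REPLICA_IPS)

    def connect(host):
        return mysql.connector.connect(host=host, user="app",
                                       password="...", database="appdb")

    def connection_for(query):
        # The application, not Cloud SQL, decides where each statement goes.
        if query.lstrip().upper().startswith(("SELECT", "SHOW")):
            return connect(next(_replicas))   # round-robin the read replicas
        return connect(MASTER_IP)             # all writes go to the master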
As I understand it, L4 load balancers, e.g. Azure Load Balancer, are almost always stateless, i.e. they do not keep per-flow state about which server handles which TCP connection.
What is the behaviour of such load balancers when servers are added to the DIP pool? Do they lose some of the connections because the corresponding packets get sent to the new server?
As I understand it, L4 load balancers, e.g. Azure Load Balancer, are almost always stateless, i.e. they do not keep per-flow state about which server handles which TCP connection.
That is not true.
By default, Azure Load Balancer distributes network traffic equally among multiple VM instances in a 5-tuple hash distribution mode (the source IP, source port, destination IP, destination port, and protocol type). You can also configure session affinity. For more information, see Load Balancer distribution mode. For session affinity, the mode uses a 2-tuple (source IP and destination IP) or 3-tuple (source IP, destination IP, and protocol type) hash to map traffic to the available servers. By using source IP affinity, connections that are initiated from the same client computer go to the same DIP endpoint.
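As a toy sketch of what such hash-based distribution amounts to (an illustration of the general technique, not Azure's actual implementation; the backend addresses are made up):

    # Illustration only: pick a backend from a hash of the flow tuple.
    # With the 5-tuple, different connections from the same client can land on
    # different backends; with the 2-tuple (source IP, destination IP), every
    # connection from that client maps to the same backend (source IP affinity).
    import hashlib

    backends = ["10.0.0.4", "10.0.0.5", "10.0.0.6"]   # assumed DIP pool

    def pick_backend(src_ip, src_port, dst_ip, dst_port, proto, mode="5-tuple"):
        if mode == "5-tuple":
            key = f"{src_ip}:{src_port}:{dst_ip}:{dst_port}:{proto}"
        else:  # 2-tuple source IP affinity
            key = f"{src_ip}:{dst_ip}"
        digest = hashlib.sha256(key.encode()).hexdigest()
        return backends[int(digest, 16) % len(backends)]

    print(pick_backend("203.0.113.7", 50123, "198.51.100.10", 443, "tcp"))
    print(pick_backend("203.0.113.7", 50123, "198.51.100.10", 443, "tcp",
                       mode="2-tuple"))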
What is the behaviour of such load balancers when servers are added to the DIP pool? Do they lose some of the connections because the corresponding packets get sent to the new server?
They do not lose connections.
The load balancing rules rely on health probes to detect the failure of an application on a backend instance. Refer to the probe-down behavior. If a backend instance's health probe fails, established TCP connections to that backend instance continue; new TCP connections go to the remaining healthy instances. Load Balancer does not terminate or originate flows. It is a pass-through service (it does not terminate TCP connections), and the flow is always between the client and the VM's guest OS and application.
Found the Ananta: Cloud Scale Load Balancing SIGCOMM paper, which says that per-flow state is stored in a single MUX machine (not replicated) that receives the associated traffic from the router. Hence adding a server doesn't affect existing TCP connections as long as the MUX machines stay as they are.
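A hypothetical sketch of why that works (loosely modeled on the paper's description, not actual MUX code): once a flow has been recorded, later packets keep going to the backend remembered for that flow, even if the pool has since grown and the hash now points elsewhere.

    # Hypothetical MUX-style forwarding: existing flows are pinned by the
    # per-flow table; only new flows see the updated backend pool.
    flow_table = {}                    # 5-tuple -> chosen backend (DIP)
    pool = ["dip-1", "dip-2"]

    def forward(flow):                 # flow is a 5-tuple
        if flow in flow_table:
            return flow_table[flow]    # established connection: unaffected by pool changes
        backend = pool[hash(flow) % len(pool)]   # new connection: hash over current pool
        flow_table[flow] = backend
        return backend

    f = ("203.0.113.7", 50123, "198.51.100.10", 80, "tcp")
    first = forward(f)
    pool.append("dip-3")               # a server is added to the DIP pool
    assert forward(f) == first         # the established flow still reaches the same DIP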
I am writing a TCP/IP server that handles persistent connections. I'll be using TLS to secure the communication, and have a question about how to do this:
Currently I have a load balancer (AWS ELB) in front of a single server. In order for the load balancer to do the TLS termination for the duration of the connection it must hold on to the connection and forward the plain text to the application behind it.
client ---tls---> Load Balancer ---plain text---> App Server
This works great. Yay! My concern is that I'll need a load balancer in front of every app server because, presumably, the number of connections the load balancer can handle is the same as the number of connections the app server can handle (assuming the same OS and NIC). This means that if I had 1 load balancer and 2 app servers, I could wind up in a situation where the load balancer is at full capacity and each app server is at half capacity. In order to avoid this problem I'd have to create a 1 to 1 relationship between the load balancers and app servers.
I'd prefer the app server to not have to do the TLS termination because, well, why recreate the wheel? Are there better methods than a 1-to-1 relationship between the load balancer and the app server to avoid the capacity issue mentioned above?
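For context, terminating TLS in the app server itself is fairly little code; a minimal sketch with Python's standard ssl module (cert.pem and key.pem are assumed placeholder paths) is shown below, although it does not change the capacity trade-off described above:

    # Minimal sketch of an app server terminating TLS itself, standard library only.
    # cert.pem / key.pem are assumed to exist; the port and response are toy values.
    import socket
    import ssl

    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    ctx.load_cert_chain(certfile="cert.pem", keyfile="key.pem")

    with socket.create_server(("0.0.0.0", 8443)) as listener:
        with ctx.wrap_socket(listener, server_side=True) as tls_listener:
            while True:                              # toy accept loop
                conn, addr = tls_listener.accept()   # TLS handshake happens here
                with conn:
                    data = conn.recv(4096)           # already-decrypted application data
                    conn.sendall(b"ok\n")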
There are two probable flaws in your presumption.
The first is the assumption that your application server will experience the same amount of load for a given number of connections as the load balancer. Unless your application server is extremely well-written, it seems reasonable that it would run out of CPU or memory, or encounter other scaling issues, before it reached the theoretical maximum of ~64K concurrent connections IPv4 can handle on a given IP address. If that's really true, then great -- well done.
The second issue is that a single load balancer from ELB is not necessarily a single machine. A single ELB launches a hidden virtual machine in each availability zone where you've attached the ELB to a subnet, regardless of the number of instances attached, and the number of ELB nodes scales up automatically as load increases. (If I remember right, I've seen as many as 8 nodes running at the same time -- for a single ELB.) Presumably the class of those ELB instances could change, too, but that's not a facet that's well documented. There's no charge for these machines, as they are included in the ELB price, so as they scale up, the monthly cost for the ELB doesn't change... but provisioning qty = 1 ELB does not mean you get only 1 ELB node.
We have a number of web apps running on IIS 6 in a cluster of machines. One of those machines is also a state server for the cluster. We do not use sticky IPs.
When we need to take down the state server machine, the entire cluster has to be offline for a few minutes while the state server is switched from one machine to another.
Is there a way to switch a state server from one machine to another with zero downtime?
You could use Velocity, which is a distributed caching technology from Microsoft. You would install the cache on two or more servers. Then you would configure your web app to store session data in the Velocity cache. If you needed to reboot one of your servers, the entire state for your cluster would still be available.
You could use the SQL server option to store state. I've used this in the past and it works well as long as the ASPState table it creates is in memory. I don't know how well it would scale as an on-disk table.
If SQL server is not an option for whatever reason, you could use your load balancer to create a virtual IP for your state server and point it at the new state server when you need to change. There'd be no downtime, but people who are on your site at the time would lose their session state. I don't know what you're using for load balancing, so I don't know how difficult this would be in your environment.
I have a Glassfish v2u2 cluster with two instances and I want to fail over between them. Every document that I read on this subject says that I should use a load balancer in front of Glassfish, like Apache httpd. In this scenario failover works, but I again have a single point of failure.
Is Glassfish able to do that fail-over without a load balancer in front?
The way we solved this is that we have two IP addresses which both respond to the URL. The DNS provider (DNS Made Easy) will round-robin between the two. Setting the timeout low will ensure that if one server fails, the other will answer. When one server stops responding, DNS Made Easy will only send the other host as the server to respond to this URL. You will have to trust the DNS provider, but you can buy service with extremely high availability of the DNS lookup.
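As a rough sketch of the client-side behaviour being relied on here (the hostname and the 2-second timeout are made-up values), the client resolves the name to every address the DNS provider returns and falls through to the next one when a connect times out:

    # Sketch of client-side failover over a round-robin DNS name.
    # "app.example.com" and the 2-second timeout are assumptions.
    import socket

    def connect_any(hostname, port, timeout=2.0):
        last_error = None
        # getaddrinfo returns every address record the DNS provider hands back.
        for family, socktype, proto, _, sockaddr in socket.getaddrinfo(
                hostname, port, type=socket.SOCK_STREAM):
            try:
                return socket.create_connection(sockaddr[:2], timeout=timeout)
            except OSError as exc:     # this server is down or slow; try the next one
                last_error = exc
        raise last_error or OSError(f"no addresses resolved for {hostname}")

    # conn = connect_any("app.example.com", 443)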
As for high availability, you can have a cluster setup which allows for session replication, so that the user won't lose more than potentially one request which fails.
Hmm... JBoss can do failover without a load balancer, according to the docs (http://docs.jboss.org/jbossas/jboss4guide/r4/html/cluster.chapt.html), Chapter 16.1.2.1, Client-side interceptor.
As far as I know, a Glassfish cluster provides in-memory session replication between nodes. If I use Sun's Glassfish Enterprise Application Server, I can use HADB, which promises 99.999% availability.
No, you can't do it at the application level.
Your options are:
Round-robin DNS - expose both your servers to the internet and let the client do the load-balancing - this is quite attractive as it will definitely enable fail-over.
Use a different layer 3 load balancing system - such as "Windows Network Load Balancing", "Linux Network Load Balancing", or the one I wrote called "Fluffy Linux cluster"
Use a separate load-balancer that has a failover hot spare
In any of these cases you still need to ensure that your database, session data, etc. are available and in sync between the members of your cluster, which in practice is much harder.