What can we do when a load balancer becomes the bottleneck? - load-balancing

I just started learning about load balancers. Taking a server-side application (HTTP/HTTPS) load balancer as an example, I assume it listens on a specific IP address and then forwards the HTTP requests to available servers based on its algorithm.
So is it possible for a load balancer to become a bottleneck? Because it listens on a specific IP address, all requests first go to that single load balancer. So I think there could be a scenario where the amount of traffic exceeds the capacity of the load balancer.
When it becomes a bottleneck, what can we do? Can we use multiple load balancers?
I think one possible solution is to use multiple load balancers and expose all of their IPs to clients (this sounds like client-side load balancing). When a client wants to send a request, it can pick an address from the IP pool and send the request to one of the load balancers. (For example, ZooKeeper could be used here.) Is this a working solution? Is there any other way to use multiple load balancers?
Thanks.
Ethan

Your last suggestion works with a little twist: the usual approach is to publish the load balancer IP addresses under the same domain name.
This is called DNS load balancing. Clients ask for the IP resolution of your load balancer's domain name and get different IP addresses in a round-robin fashion.
To configure DNS load balancing, add multiple A records for your load balancer's domain name to your DNS configuration. Here you can find an example guide for that.
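To make the round-robin behaviour concrete, here is a minimal Python sketch that simulates it. It is a toy model, not a real DNS server: the class name `RoundRobinZone`, the hostname `lb.example.com` and the `192.0.2.x` addresses are all made up for illustration.

```python
from collections import deque


class RoundRobinZone:
    """Toy model of a DNS zone that rotates its A records on every query."""

    def __init__(self, records):
        # records: hostname -> list of IP address strings (hypothetical data)
        self._records = {name: deque(ips) for name, ips in records.items()}

    def resolve(self, name):
        ips = self._records[name]
        answer = list(ips)  # snapshot of the current order
        ips.rotate(-1)      # the next query starts with the next address
        return answer


zone = RoundRobinZone(
    {"lb.example.com": ["192.0.2.10", "192.0.2.11", "192.0.2.12"]}
)
# A client that always uses the first address in the answer ends up
# cycling through all three load balancers.
first_choices = [zone.resolve("lb.example.com")[0] for _ in range(6)]
print(first_choices)
```

Each of the three addresses appears twice in `first_choices`, which is the spreading effect the multiple A records are meant to give.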

Related

Application load balancer vs network load balancer

I am new to AWS. I can't get a clear idea of the difference between ALB and NLB. Could anyone explain it in a simple way?
There are some excellent answers out there already; let me pick out some key points that may help.
Network Load Balancer
As the name implies, this works at the network level only, typically layer 4.
It neither sees nor cares about anything at the application layer, such as cookies or headers.
It is context-less, caring only about the network-layer information contained within the packets it is directing this way and that.
The 'balancing' here is done solely with IP addresses, port numbers, and other network variables.
Application Load Balancer
This takes into account multiple variables, from the application to the network. It can route its traffic based on this.
It is context-aware and can direct requests based on any single variable as easily as it can a combination of variables.
Key Differences
The network load balancer just forwards requests, whereas the application load balancer examines the contents of the HTTP request header to determine where to route the request.
Network load balancing cannot assure availability of the application, because its decisions rest only on network-level variables and it has no awareness of the application itself, whereas application load balancing can.
Some good sources from which I drew this information:
https://medium.com/awesome-cloud/aws-difference-between-application-load-balancer-and-network-load-balancer-cb8b6cd296a4
https://linuxacademy.com/community/show/22677-application-load-balancer-vs-network-load-balancer/
https://aws.amazon.com/elasticloadbalancing/features/#compare
In the main answer by #james above, the network level is mentioned several times, along with network-layer information. However, I would like to point out that NLB does operate at layer 4, but layer 4 is the transport layer, not the network layer. NLB also preserves the client's source IP, and an Elastic IP can be assigned to an NLB.
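The difference can be sketched in a few lines of Python. This is only an illustration of the idea, not how AWS implements either product: the backend addresses, the rule table and the function names `route_layer4` and `route_layer7` are all assumptions made for the example.

```python
import hashlib

BACKENDS = ["10.0.0.1", "10.0.0.2", "10.0.0.3"]  # hypothetical targets


def route_layer4(src_ip, src_port, dst_ip, dst_port, protocol="tcp"):
    """Pick a backend from connection-level fields only (no HTTP parsing)."""
    key = f"{protocol}|{src_ip}|{src_port}|{dst_ip}|{dst_port}".encode()
    digest = hashlib.sha256(key).digest()
    return BACKENDS[int.from_bytes(digest[:8], "big") % len(BACKENDS)]


# Hypothetical layer-7 rules: (host or None for any host, path prefix, target).
RULES = [
    ("api.example.com", "/", "api-servers"),
    (None, "/images/", "static-servers"),
]


def route_layer7(host, path, default="web-servers"):
    """Pick a target group by inspecting the HTTP Host header and path."""
    for rule_host, prefix, target in RULES:
        if (rule_host is None or rule_host == host) and path.startswith(prefix):
            return target
    return default


print(route_layer4("198.51.100.7", 50000, "203.0.113.1", 443))
print(route_layer7("api.example.com", "/v1/users"))
print(route_layer7("www.example.com", "/images/logo.png"))
```

The layer-4 function never looks at anything HTTP-specific, so the same connection always maps to the same backend; the layer-7 function can only work after the request has been parsed.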

Why do we need web servers if we have load balancer to direct the requests?

Suppose we have two servers serving requests through a load balancer. Is it necessary to have a web server on both of our servers to process the requests? Can the load balancer itself act as a web server? Suppose we are using the Apache web server and HAProxy. Does that mean the web server (Apache) should be installed on both servers, and the load balancer on one of them? Why can't we have a load balancer on both of our server machines, receiving the requests and talking to each other to process them?
At the most basic level, you want web servers to fulfill requests for static content, while application servers handle business logic, i.e. requests for dynamic content.
But web servers can do many other things as well, such as authenticating and validating requests and logging metrics. Another important job of the web server is combining the content it gets from the application servers with a view to present to the client.
You want an LB sitting in front of both web and app servers if you have more than one server. Also, nothing prevents you from putting both the web and app server on one machine.
The load balancer sits in front of your webserver(s) to redirect requests according to the number of sessions, a hash of source IP and destination IP, the requested URL or other criteria. Additionally, it checks the availability of the backend servers to ensure requests get answered even if one server fails.
It's not installed on every webserver - you only need one instance. It could be a hardware appliance or software (like HAProxy), which may or may not be installed on one of the webservers. Installing it there would not be prudent, though, as that webserver could fail and the proxy would then be unable to redirect traffic to the remaining server.
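As a rough sketch of one of these criteria, the following Python picks the backend with the fewest active sessions and skips any backend whose health check has failed. The `Backend` class, the server names and the session counts are made up for the example; a real load balancer tracks this state itself.

```python
from dataclasses import dataclass


@dataclass
class Backend:
    name: str
    healthy: bool = True       # result of the most recent health check
    active_sessions: int = 0


def pick_backend(backends):
    """Return the healthy backend with the fewest active sessions."""
    candidates = [b for b in backends if b.healthy]
    if not candidates:
        raise RuntimeError("no healthy backend available")
    return min(candidates, key=lambda b: b.active_sessions)


pool = [Backend("web1", active_sessions=12), Backend("web2", active_sessions=3)]
print(pick_backend(pool).name)  # web2 (fewest sessions)
pool[1].healthy = False         # simulate a failed health check on web2
print(pick_backend(pool).name)  # web1 (the only healthy backend left)
```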
There are several different scenarios for this. One is load balancing requests to 2 webservers which serve the same HTML content, to provide redundancy.
Another would be to provide multiple websites using just one public address, i.e. applying destination NAT according to the requested URL. For this, the software has to determine the URL in the HTTP request and redirect traffic to the backend webserver servicing that site. This is sometimes called a 'reverse proxy', as it hides the internal server addresses from the outside.

Does a load-balancer need a load balancer?

Consider a situation in which we have a web application deployed on multiple servers, with client requests landing on a load balancer, which in turn routes them to the actual servers.
Now, if we have too many requests coming concurrently, would the load balancer itself fail? Suppose we get 1 million requests per second, won't that be beyond the processing capacity of a single load balancer?
How do we design (at least conceptually) a system which handles situations like this?
Putting a load balancer in front of your load balancer will not solve the problem, simply because if one load balancer would fail under the high traffic, so would the one in front!
You can achieve what you're looking for with DNS. You can register multiple IP addresses for a domain name and hence have multiple load balancers.
Let's say you're making a request to www.example.com. Your browser will look up the record in DNS and receive a list of corresponding IP addresses. The request then goes to the first address on the list. If it's unavailable, it goes to the next one on the list. DNS servers typically randomize or rotate the order of the list to spread the load, and some DNS services can also run periodic health checks to remove unresponsive IPs. That means your requests will be split among your load balancers instead of hitting just the one.
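The "try the first address, then the next" behaviour can be sketched like this in Python. The function name `first_reachable` and the `try_connect` callback are assumptions made for the example (a real client would attempt an actual TCP connection), and the addresses are placeholders.

```python
def first_reachable(addresses, try_connect):
    """Return the first address for which try_connect(address) succeeds.

    try_connect is a caller-supplied function (an assumption of this sketch)
    that returns True on success and returns False or raises OSError on
    failure.
    """
    for address in addresses:
        try:
            if try_connect(address):
                return address
        except OSError:
            continue  # treat a connection error as "try the next one"
    raise ConnectionError("no load balancer address was reachable")


down = {"192.0.2.10"}  # pretend this load balancer is offline
answer = ["192.0.2.10", "192.0.2.11", "192.0.2.12"]  # list returned by DNS
chosen = first_reachable(answer, lambda ip: ip not in down)
print(chosen)  # 192.0.2.11
```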

Are AWS ELB IP addresses unique to an ELB?

Does anyone know how AWS ELB with SSL works behind the scenes? Running an nslookup on my ELB's domain name, I get 4 unique IP addresses. If my ELB is SSL-enabled, is it possible for AWS to share these same IPs with other SSL-enabled ELBs (not necessarily owned by me)?
As I understand it, the hostname in a web request is inside the encrypted part of an HTTPS request. If this is the case, does AWS have to give each SSL-enabled ELB unique IP addresses that are never shared with anyone else's SSL ELB instance? Put another way -- does AWS give 4 unique IP addresses for every SSL ELB you've requested?
Does anyone know how AWS ELB with SSL works behind the scenes? [...] Put another way --
does AWS give 4 unique IP addresses for every SSL ELB you've
requested?
Elastic Load Balancing (ELB) employs a scalable architecture in itself, meaning the number of unique IP addresses assigned to your ELB does in fact vary depending on the capacity needs and respective scaling activities of your ELB, see section Scaling Elastic Load Balancers within Best Practices in Evaluating Elastic Load Balancing (which provides a pretty detailed explanation of the Architecture of the Elastic Load Balancing Service and How It Works):
The controller will also monitor the load balancers and manage the
capacity [...]. It increases
capacity by utilizing either larger resources (resources with higher
performance characteristics) or more individual resources. The Elastic
Load Balancing service will update the Domain Name System (DNS) record
of the load balancer when it scales so that the new resources have
their respective IP addresses registered in DNS. The DNS record that
is created includes a Time-to-Live (TTL) setting of 60 seconds,[...]. By default, Elastic Load Balancing will return multiple IP
addresses when clients perform a DNS resolution, with the records
being randomly ordered [...]. As the traffic
profile changes, the controller service will scale the load balancers
to handle more requests, scaling equally in all Availability Zones. [emphasis mine]
This is further detailed in section DNS Resolution, including an important tip for load testing an ELB setup:
When Elastic Load Balancing scales, it updates the DNS record with the
new list of IP addresses. [...] It is critical that you factor this
changing DNS record into your tests. If you do not ensure that DNS is
re-resolved or use multiple test clients to simulate increased load,
the test may continue to hit a single IP address when Elastic Load
Balancing has actually allocated many more IP addresses. [emphasis mine]
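A test client can honour that advice by re-resolving the name once the TTL has passed instead of holding on to the first answer forever. Below is a minimal Python sketch under stated assumptions: the class `TtlResolver` and its `lookup` and `clock` parameters are made up for the example, and the 60-second default simply mirrors the TTL quoted above.

```python
import time


class TtlResolver:
    """Caches a DNS answer and looks it up again once the TTL has expired."""

    def __init__(self, lookup, ttl_seconds=60, clock=time.monotonic):
        self._lookup = lookup    # function: hostname -> list of IP strings
        self._ttl = ttl_seconds
        self._clock = clock      # injectable so the sketch can be tested
        self._cache = {}         # hostname -> (expires_at, ips)

    def resolve(self, hostname):
        now = self._clock()
        cached = self._cache.get(hostname)
        if cached is None or now >= cached[0]:
            # No entry yet, or the entry is stale: ask DNS again.
            ips = list(self._lookup(hostname))
            self._cache[hostname] = (now + self._ttl, ips)
            return ips
        return cached[1]
```

In a real test harness `lookup` could wrap `socket.getaddrinfo`; note that the operating system or an intermediate resolver may cache answers as well, which this sketch does not account for.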
The entire topic is explored in much more detail in Shlomo Swidler's excellent analysis The “Elastic” in “Elastic Load Balancing”: ELB Elasticity and How to Test it. That article by now also refers to the aforementioned Best Practices in Evaluating Elastic Load Balancing by AWS, which basically confirms his analysis but lacks the illustrative step-by-step samples Shlomo provides.

high availability websites

What's the best way to achieve high availability for a dynamic website? If I create a second copy on another server and do not wish to use a load balancer, since it will mess up user sessions, what are the best alternatives?
You can store session data in a database instead, which gets around that problem, then you can round-robin the requests to the application servers.
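Here is a small Python sketch of that idea, using an in-memory SQLite database as a stand-in for whatever shared database you actually run. The `SessionStore` class, the table layout and the sample session are assumptions made for the example.

```python
import json
import sqlite3


class SessionStore:
    """Session data kept in a shared database rather than in server memory."""

    def __init__(self, conn):
        self._conn = conn
        conn.execute(
            "CREATE TABLE IF NOT EXISTS sessions (id TEXT PRIMARY KEY, data TEXT)"
        )

    def save(self, session_id, data):
        self._conn.execute(
            "INSERT OR REPLACE INTO sessions (id, data) VALUES (?, ?)",
            (session_id, json.dumps(data)),
        )
        self._conn.commit()

    def load(self, session_id):
        row = self._conn.execute(
            "SELECT data FROM sessions WHERE id = ?", (session_id,)
        ).fetchone()
        return json.loads(row[0]) if row else None


shared_db = sqlite3.connect(":memory:")  # stand-in for a real shared database
server_a = SessionStore(shared_db)       # two "application servers" ...
server_b = SessionStore(shared_db)       # ... backed by the same store
server_a.save("abc123", {"user": "alice"})
print(server_b.load("abc123"))           # {'user': 'alice'}
```

Because either server can read a session written by the other, it no longer matters which server the round-robin sends the next request to.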
(Good) load balancers can be configured to be "sticky", which means they send requests from the same IP to the same server each time.
Even if you have a load balancer sitting in front of two backend webservers, you have only moved the single point of failure from the webserver onto the load balancer. So your application would still not be highly available.
I highly recommend using a load balancer and at least a pair of web servers. At work, we use HAProxy, which is fully capable of ensuring sessions are 'sticky' and are sent to the same web server unless it goes down, in which case it fails over.
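The "sticky unless it goes down" behaviour can be sketched in Python as a hash of the client IP with a fallback to the next server. This is not how HAProxy is implemented or configured; the function `sticky_backend` and the `is_up` callback are assumptions made for the example.

```python
import hashlib


def sticky_backend(client_ip, backends, is_up):
    """Map a client IP to a preferred backend, skipping backends that are down."""
    digest = hashlib.sha256(client_ip.encode()).digest()
    start = int.from_bytes(digest[:8], "big") % len(backends)
    # Walk the list from the preferred position so that a client only moves
    # to another server when its preferred one is unavailable.
    for offset in range(len(backends)):
        candidate = backends[(start + offset) % len(backends)]
        if is_up(candidate):
            return candidate
    raise RuntimeError("no backend is up")


servers = ["web1", "web2"]
preferred = sticky_backend("203.0.113.5", servers, lambda s: True)
fallback = sticky_backend("203.0.113.5", servers, lambda s: s != preferred)
print(preferred, fallback)
```

Note that after a failover the session data on the failed server is gone unless it is stored somewhere shared.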
To make your load balancer highly available, you can set up two load balancing servers which are a mirror image of each other. Assign a single virtual IP to both of your load balancers. Write a script that will poll the other server to check if it's down; if it's down, have that script pick up that virtual IP address. The script should be running on both servers.
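The core of such a script might look like the Python sketch below. It only shows the decision logic: `peer_is_alive`, `holds_vip` and `claim_vip` are hypothetical hooks you would have to implement for your own system, and the `max_failures` threshold is an addition of this sketch (to avoid taking over after a single lost probe), not part of the description above.

```python
def failover_step(peer_is_alive, holds_vip, claim_vip, failures, max_failures=3):
    """Run one polling step and return the updated consecutive-failure count.

    peer_is_alive, holds_vip and claim_vip are caller-supplied callables
    (hypothetical hooks): how to probe the peer and how to bind the virtual
    IP depends on the operating system and the network setup.
    """
    if peer_is_alive():
        return 0  # peer answered: reset the counter
    failures += 1
    if failures >= max_failures and not holds_vip():
        claim_vip()  # take over the shared virtual IP
    return failures
```

Each load balancer would call `failover_step` in a loop with a short sleep between calls, carrying the returned count into the next call.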
This link describes one way of managing a virtual IP address. Similar articles have been written for a large number of linux distros, but they are all based on the same method.
Load balancers. They should be configured so that they can handle sessions, for example by sending the same IP to the same backend every time. Alternatively, store the sessions in a database, or in some shared memory if it needs to be really fast for some reason I haven't thought of.