Why are load balancers usually implemented in high-availability pairs? - load-balancing

Currently, I am doing some research on load balancers.
On Wikipedia (see http://en.wikipedia.org/wiki/Load_balancing_(computing)), it says: "Usually load balancers are implemented in high-availability pairs which may also replicate session persistence data if required by the specific application."
I have also used search engines to find articles about why, and in which cases, a system needs two load balancers, but I did not find any good information.
So why do we need two load balancers in most cases, and in which cases do we need two or more load balancers instead of one?

Nowadays applications need to be highly available, so the load balancer itself should be deployed as a highly available pair.
If you use a single load balancer node, it may go down or need to be taken offline for maintenance. That causes application downtime, or forces you to redirect all requests to a single server, which severely affects performance.
To avoid this, it is recommended that load balancers be deployed in highly available pairs, so that the load-balancing layer remains continuously operational for as long as possible, ideally all the time.
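The core of such a pair is a standby node that takes over when the active one stops responding. Here is a minimal sketch of that decision, with hypothetical node names; real pairs typically use VRRP (e.g. keepalived) and a shared virtual IP rather than hand-rolled logic:

```python
# Sketch of active/passive failover in an HA pair: the standby watches
# the active node's heartbeat and takes over when it goes silent.
# Node names and the timeout value are illustrative assumptions.
import time

HEARTBEAT_TIMEOUT = 3.0   # seconds without a heartbeat before failover

def choose_active(last_heartbeat: float, now: float) -> str:
    """Return which node should currently hold the virtual IP."""
    if now - last_heartbeat <= HEARTBEAT_TIMEOUT:
        return "lb-primary"        # primary is alive, keep serving
    return "lb-standby"            # primary silent too long, take over

now = time.time()
print(choose_active(now - 1.0, now))   # prints "lb-primary"
print(choose_active(now - 10.0, now))  # prints "lb-standby"
```

Because clients keep talking to the same virtual IP, a failover is invisible to them, which is what makes the pair "continuously operational".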

Related

Are Google Compute Engine Load Balancers Highly Available?

Maybe this is an obvious question, but I didn't find it stated explicitly anywhere. In contrast, Linode load balancers are explicitly documented as highly available.
Any guesses?
Google Compute Engine Load Balancing is a highly available and fault-tolerant service. You don't need to worry about scaling it, or about failing over to a backup node if something goes wrong, as you would if you managed the load balancer yourself.
That doesn't mean it has a 100% SLA. Like other parts of Google Cloud Platform, it is covered by a 99.95% SLA, which means it can be unavailable for roughly 4 hours 23 minutes per year without that being considered an SLA breach.
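The downtime budget implied by that figure is easy to check. This is illustrative arithmetic only; the actual SLA terms define how downtime is measured (typically per month rather than per year):

```python
# Rough yearly downtime budget implied by a 99.95% availability SLA.
sla = 0.9995
hours_per_year = 365 * 24                      # 8760
downtime_hours = (1 - sla) * hours_per_year
hours = int(downtime_hours)
minutes = round((downtime_hours - hours) * 60)
print(f"{downtime_hours:.2f} h ≈ {hours} h {minutes} min per year")
# prints "4.38 h ≈ 4 h 23 min per year"
```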

Uneven cache hits

I have integrated twemproxy into the web layer, and I have 6 ElastiCache nodes (1 master, 5 read replicas). The issue is that although all replicas hold the same keys, the cache hits on one replica are far higher than on the others; I performed several load tests and got the same result every time. A separate data engine writes to the master of this cluster, and the remaining 5 replicas sync from it, so I am using twemproxy only for reading data from ElastiCache, not for sharding. My simple question is: why am I getting 90% of hits on a single read replica? Shouldn't the hits be distributed evenly among all read replicas?
Thank you in advance
Twemproxy hashes everything, as I recall. This means it tries to split keys among the servers you give it. If you give it one server, it hashes everything to that one server. Thus, as far as twemproxy is concerned, you have a single backend for all queries, so it isn't helping you in this case.
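You can see why a hashing proxy behaves this way with a toy sketch. This uses simple modulo hashing for illustration; twemproxy's actual distribution algorithms (e.g. ketama) differ in detail but share the property that a one-server pool receives every key:

```python
# Key-hash routing: with one server in the pool, every key maps to it.
import hashlib

def pick_server(key: str, servers: list) -> str:
    h = int(hashlib.md5(key.encode()).hexdigest(), 16)
    return servers[h % len(servers)]

one = ["replica-1:6379"]                       # hypothetical endpoint
assert all(pick_server(f"key{i}", one) == "replica-1:6379"
           for i in range(100))                # everything lands on it

three = ["r1:6379", "r2:6379", "r3:6379"]      # hypothetical endpoints
hits = {s: 0 for s in three}
for i in range(1000):
    hits[pick_server(f"key{i}", three)] += 1
print(hits)  # roughly even spread across the three servers
```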
If you want a single endpoint that distributes reads across a bank of identical slaves, you will need to put a TCP load balancer in front of the slaves and have your application talk to the IP:port of the load balancer. Common software options are Nginx and HAProxy; on AWS you could use their load balancer, though you may run into resource limits outside your control there; and pretty much any hardware load balancer would also work (though that is difficult, if not impossible, on AWS).
Which load balancer to use depends on your (or your personnel's) comfort and knowledge level with each option.
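The behavior you want from that TCP load balancer is simple read distribution, for example round robin. A minimal sketch, with hypothetical replica endpoints (a real deployment would configure this in HAProxy or Nginx rather than in application code):

```python
# Round-robin read distribution across identical read replicas.
from itertools import cycle

replicas = ["r1:6379", "r2:6379", "r3:6379", "r4:6379", "r5:6379"]
next_replica = cycle(replicas)

def route_read() -> str:
    """Return the replica that should serve the next read."""
    return next(next_replica)

counts = {r: 0 for r in replicas}
for _ in range(1000):
    counts[route_read()] += 1
print(counts)  # each replica serves exactly 200 of the 1000 reads
```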

Load balancing Vs MyMethod

I am planning to move one of my websites to a web farm (only 2 servers). I understand the basic concept of how load balancing works but need help comparing 2 different ideas I have.
Load balancing with a web farm
I am worried about SEO, duplicate content, and different IPs.
My method
Most of the resource consumption on my server comes from a long, memory-heavy process that runs for every query. I have in mind to set up a separate server (not hosting a website), create web services for the memory-heavy processes, and call these web methods from my main server. If need be, I can add a 3rd server, replicate the same web methods on it, and multiply.
The only downside I see is that, before every call to a web method, I need to write an algorithm to find the server with the most memory available and call the web method on that server.
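The selection step described above is essentially "least-loaded" balancing. A minimal sketch, where the worker endpoints and the free_memory_mb() probe are hypothetical stand-ins for a real metrics check on each server:

```python
# Pick the worker with the most free memory before each web-method call.
import random

workers = ["worker-1:8080", "worker-2:8080", "worker-3:8080"]

def free_memory_mb(worker: str) -> int:
    # Placeholder: in practice this would query the worker's metrics
    # endpoint (or a monitoring agent) for its current free memory.
    return random.randint(512, 4096)

def pick_worker() -> str:
    """Return whichever worker currently reports the most free memory."""
    return max(workers, key=free_memory_mb)

print(pick_worker())  # one of the three workers
```

Note that probing every server before every call adds latency; an off-the-shelf load balancer's least-connections policy is a common approximation of the same idea without per-call probing.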
Any ideas if this is a sound idea?
Lastly, quite a bit of resources on my main server are used by large file uploads. Is there a way to redirect these to the server with low memory usage?
Regards,
Prasad..

Isn't a load balancer a single point of failure

I am studying scalability design and I'm having a hard time thinking of ways to ensure a load balancer does not become a single point of failure. If a load balancer goes down, who makes the decision to route to a backup load balancer? What if that "decision maker" goes down too?
The way to avoid the load balancer becoming a single point of failure is to run the load balancer(s) in a high-availability cluster with hardware backup.
I believe the answer to this question is redundancy.
The load balancer, instead of being a single computer/service/module/whatever, should be several instances of that computer/service/whatever.
The clients should be aware of the alternatives they have in case their preferred load balancer goes down.
If a client is timing out on its preferred load balancer, it already has the logic to fall back to the next one.
This is the most straightforward way I can think of to get rid of single points of failure, but I'm sure there are many others that have been researched.
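The client-side fallback logic described above can be sketched as follows. The endpoints and the send_request() stub are hypothetical; a real client would make an HTTP/TCP call with a short timeout:

```python
# Client-side failover: try each known load balancer in order,
# falling back to the next one on timeout.
load_balancers = ["lb-a.example.com", "lb-b.example.com"]

class Timeout(Exception):
    pass

def send_request(endpoint: str, payload: str) -> str:
    # Placeholder for a real network call with a timeout.
    if endpoint == "lb-a.example.com":   # simulate the primary being down
        raise Timeout(endpoint)
    return f"{endpoint} handled: {payload}"

def request_with_failover(payload: str) -> str:
    last_error = None
    for lb in load_balancers:            # preferred LB first, then backups
        try:
            return send_request(lb, payload)
        except Timeout as e:
            last_error = e               # timed out: try the next one
    raise RuntimeError("all load balancers unreachable") from last_error

print(request_with_failover("GET /"))  # served by lb-b after lb-a times out
```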
Note that any system component can still fail, no matter how much redundancy you put in. The question is: "How sure do you want to be that it will not go down?"
If the probability of a single instance going down is p, then the probability that all n instances go down together (assuming they fail independently) is p^n. Pick how sure you want to be, or how many resources you can spend, and solve for the other side of the equation.
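A worked example of that p^n math, using an assumed per-instance failure probability of 1%:

```python
# If one load balancer is down 1% of the time (p = 0.01), two
# independent ones are both down only p**2 = 0.0001 of the time,
# i.e. 99.99% availability for the pair.
p = 0.01                       # assumed per-instance failure probability

for n in (1, 2, 3):
    all_down = p ** n
    print(f"n={n}: P(all down) = {all_down:.6f}, "
          f"availability = {(1 - all_down) * 100:.4f}%")
```

Each extra independent instance multiplies the all-down probability by p, which is why even one backup buys a large availability gain.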

Is a load balancer node a bottleneck?

In a web application and service I'm creating, it's necessary that it scales horizontally. I want to have 1 load balancer, 3 web server nodes, and 2 database nodes (one will be solely for redundancy and will be scaled up vertically as soon as the first DB node goes down).
Does the load balancer act as a proxy?
If so, doesn't that defeat the purpose of scaling horizontally, since the load balancer would become a massive bottleneck?
Should it have significantly more resources than the web server nodes?
Yes, typical load balancers are proxies.
The load balancer will only be a bottleneck if your load is network-bound. Usually load is CPU-bound, so the load balancer distributes the CPU load, not the network load.
No, load balancers don't need many resources, since they don't hold or process much data. They should, however, have as much network capacity (bandwidth) as possible.