When do we switch load balancing algorithm in cloud computing? - load-balancing

I have set up an HAProxy load balancer. When do I need to switch the load balancing algorithm?

Well, this depends on your requirements.
Take a look at the documentation for the balance keyword to find out which algorithm is the right one for your setup.
The blog post Test Driving “Power of Two Random Choices” Load Balancing may also help you understand the balancing algorithms in HAProxy.

You can switch by editing the HAProxy configuration file. You only need to modify one setting: the balance directive in the backend section, which can be roundrobin, source, static-rr, etc.
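To get a feel for what changes when you switch, here is a minimal Python sketch of three of the policies. It is a toy model, not HAProxy's implementation: the server names, the CRC32 hash and the function names are assumptions made up for illustration.

```python
import itertools
import zlib

SERVERS = ["s1", "s2", "s3"]  # hypothetical backend server names


def make_roundrobin(servers):
    """Return a picker that cycles through the servers in order."""
    cycle = itertools.cycle(servers)
    return lambda: next(cycle)


def pick_by_source(servers, client_ip):
    """Hash the client IP so the same client always lands on the same server.

    CRC32 is only an illustrative hash, not necessarily what HAProxy uses.
    """
    return servers[zlib.crc32(client_ip.encode()) % len(servers)]


def pick_leastconn(open_connections):
    """Pick the server that currently has the fewest open connections."""
    return min(open_connections, key=open_connections.get)
```

Roughly: round robin suits short, uniform requests; a source hash gives client affinity; least connections suits long-lived sessions of uneven length.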

Related

Custom load balance logic in HAProxy

I am working on a video-conferencing application. We have a pool of servers where rooms are created, and a room can have any number of users. I was exploring HAProxy and several other load balancers, but couldn't find a solution for what I was looking for.
My requirements are as follows:
A room should be created on the server with the lowest load at the time of creation.
All users of that room should join on the same server.
I have tried url_param balance logic with consistent hashing, but it is distributing load randomly. Is it even possible with modern L7 load balancers or do I need to write some custom logic (in some load balancer) or a separate application for this scenario?
Is there any way of balancing load based on connections or CPU usage while maintaining the session stickiness?
The balance documentation says you can choose an algorithm like leastconn, and that the algorithm only applies when no persistence information is available or when a connection is redispatched to another server.
So the second part of the answer is stick tables. Read the docs about stick match and the other stick keywords.
With a stick table it looks like this:
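backend foo
    mode http
    # used only when the stick table has no entry for the client yet
    balance leastconn
    # "stick on" both matches and stores; "stick store-request" alone only stores
    stick on src
    stick-table type ip size 200k expire 30m
    server s1 192.168.1.1:8080
    server s2 192.168.1.2:8080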
There are more examples in the docs.
What you need to figure out (or tell us) is how we can know, based on the request, which room the client wants, and then build the stick table and rules around that. If it's in the URL or an HTTP header, then it is perfectly doable in HAProxy.
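The combination of leastconn and a stick table keyed on the room can be sketched in Python as below. This is only a model of the routing decision, assuming the room id can be extracted from the request; the class and attribute names are hypothetical and nothing here is HAProxy code.

```python
class RoomRouter:
    """Place new rooms on the least loaded server, then keep them there."""

    def __init__(self, servers):
        # current number of users per server (stands in for open connections)
        self.load = {name: 0 for name in servers}
        # plays the role of the stick table: room id -> server
        self.room_to_server = {}

    def route(self, room_id):
        server = self.room_to_server.get(room_id)
        if server is None:
            # no persistence entry yet: fall back to "least connections"
            server = min(self.load, key=self.load.get)
            self.room_to_server[room_id] = server
        self.load[server] += 1
        return server
```

The first user of a room triggers the least-loaded choice; everyone who joins later follows the stored entry, even if another server has become less busy in the meantime.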
If leastconn is not good enough, there is the option of dynamically adjusting the servers' weights through HAProxy's Unix socket CLI and using the roundrobin algorithm. The agent options can also be configured on servers so that an agent sets their weights dynamically.
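As a rough illustration of the agent approach: an agent check accepts a percentage such as "75%" as its reply and scales the server's weight accordingly. The Python sketch below only builds that reply string from a CPU reading. The mapping from CPU usage to weight is an assumption of mine, and the TCP listener that would actually serve the reply is left out.

```python
def agent_reply(cpu_percent):
    """Build an agent-check style reply from the current CPU usage.

    Assumed policy (not from HAProxy): weight = spare CPU capacity,
    clamped to 1..100 so the server is never drained entirely.
    """
    spare = round(100 - cpu_percent)
    weight = max(1, min(100, spare))
    return f"{weight}%\n"
```

A busier server then reports a lower percentage and receives a smaller share of new requests under roundrobin.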

How to scale out apache atlas

The Atlas documentation provides no information on how to scale it.
Apache Atlas is connected to Cassandra or HBase in the backend, which can scale out, but I don't know how the Apache Atlas engine (REST web service and request processor) can scale out.
I can install multiple instances of it on different machines and put a load balancer in front of them to fan out the requests. But would this model help? Does Atlas do any kind of locking or DB transactions that would make this model unworkable?
Does someone know how Apache Atlas scales out?
Thanks.
Apache Atlas runs Kafka as the message queue under the covers, and in my experience the way they have designed the Kafka queue (a consumer group that says you should ONLY have ONE consumer) is the choke point.
Not only that: when you look at the code, the consumer has a poll time for the broker of 1 second hard-coded into it. Put these two together, and that means that if the consumer can't process the messages from the various producers (Hive, Spark, etc.) within that second, the broker disengages the ONLY consumer and waits for a non-existent consumer to pick up messages...
I need to design something similar, but this is as far as I have got...
Hope that helps somewhat...
Please refer to this page: http://atlas.apache.org/#/HighAvailability
Atlas does not support actual horizontal scale-out.
All requests are handled by the 'Active instance'. The 'Passive instances' just forward all requests to the 'Active instance'.

IP Load Balancing - Number of requests limit

I want to configure IP Load Balancing service for our VPS. I have got the documentation at http://docs.ovh.ca/en/products-iplb.html#presentation where I can integrate it.
I want to limit the number of requests on each server (S1, S2). How can I achieve this?
Suppose I want S1 to handle all requests as long as fewer than 3500 requests per minute are sent to the load balancer.
If there are more than 3500 requests per minute, the load balancer should forward all the extra requests to S2.
Regards,
I just had a look, and I believe you won't be able to achieve what you are looking for with the available load balancing algorithms.
If you look at the available ones, you can see five load balancing algorithms. From my experience with load balancers (not OVH's), I would say they do the following:
first: probably the first real server to reply (to the health monitor) gets the query.
leastconn: distributes connections to the server that is managing the fewest open connections at the time the new connection request is received.
roundrobin: the next connection is given to the next real server in line.
source: not sure about this one, but I believe it load-balances per source IP, e.g. if a request is coming from 143.32.Y.Z, send it to server A, etc.
uri: I believe it load-balances by URI. Typical if you are hosting different web servers.
I would advise checking with OVH what you can do. Typically in such scenarios, with an F5 load balancer for example, you can configure a simple script for this, or use groups: if the first group fails, traffic is sent to the second one.
A ratio (also called weighted) algorithm can do part of the job, though not exactly what you want.
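For reference, the overflow rule from the question is simple to express if you control the routing logic yourself. Below is a minimal Python sketch using a fixed one-minute window. It is not something the OVH service offers as far as I can tell, and the class name and the fixed-window choice are my own assumptions.

```python
class OverflowRouter:
    """Send requests to S1 until the per-minute limit is hit, then to S2."""

    def __init__(self, limit_per_minute=3500):
        self.limit = limit_per_minute
        self.window = None  # index of the minute currently being counted
        self.count = 0      # requests seen in that minute

    def route(self, now_seconds):
        window = int(now_seconds // 60)
        if window != self.window:
            # a new minute started: reset the counter
            self.window = window
            self.count = 0
        self.count += 1
        return "S1" if self.count <= self.limit else "S2"
```

A fixed window lets up to twice the limit through around a minute boundary; a sliding window or token bucket would smooth that out at the cost of a bit more state.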
Cheers

how does zeromq determine order of load balancing?

The documentation explains which socket types use load balancing between connected peers, but it doesn't say how it does the load balancing. I'm curious if it's deterministic and if so, what it's based on (order connected, based on address, some internal hash, etc).
It's a simple list, with round-robin load balancing.
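To make that concrete, here is a small Python model of round robin over a list kept in connection order. It is not ZeroMQ's code (the class and method names are made up); it just shows why the order is deterministic given the order in which peers connected.

```python
class RoundRobinPeers:
    """Peers are kept in connection order and picked in rotation."""

    def __init__(self):
        self.peers = []
        self.next_index = 0

    def connect(self, peer):
        # a new peer is appended to the end of the list
        self.peers.append(peer)

    def pick(self):
        if not self.peers:
            raise LookupError("no connected peers")
        peer = self.peers[self.next_index % len(self.peers)]
        self.next_index = (self.next_index + 1) % len(self.peers)
        return peer
```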

what are some good "load balancing issues" to know?

Hey there guys, I am a recent grad, and looking at a couple of jobs I am applying for, I see that I need to know things like runtime complexity (straightforward enough), caching (memcached!), and load balancing issues (no idea on this!!)
So, what kind of load balancing issues and solutions should I try to learn about, or at least be vaguely familiar with, for .NET or Java jobs?
Googling around gives me things like network load balancing, but wouldn't that usually not be administered by a software developer?
One thing I can think of is session management. By default, whenever you get a session ID, that session ID points to some in-memory data on the server. However, when you use load balancing, there are multiple servers. What happens when data is stored in the session on machine 1, but for the next request the user is sent to machine 2? His session data would be lost.
So, you'll have to make sure that either the user gets back to the same machine for every subsequent request ('sticky connection'), or you do not use in-proc session state but out-of-proc session state, where session data is stored in, for example, a database.
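A tiny Python sketch of the difference, with a plain dict standing in for the out-of-proc store (in practice a database or cache). The Server class and its methods are invented for the example.

```python
class Server:
    """A web server that keeps session data in some store."""

    def __init__(self, shared_store=None):
        # in-proc: every server gets its own private dict
        # out-of-proc: all servers are handed the same store
        self.sessions = shared_store if shared_store is not None else {}

    def put(self, session_id, key, value):
        self.sessions.setdefault(session_id, {})[key] = value

    def get(self, session_id, key):
        return self.sessions.get(session_id, {}).get(key)
```

With in-proc state, data written through machine 1 is invisible to machine 2 unless the balancer is sticky; with a shared store, either machine can serve the next request.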
There is a concept of load distribution, where requests are sprayed across a number of servers (usually with session affinity). Here there is no feedback on how busy any particular server may be; we just rely on statistical sharing of the load. You could view the WebSphere HTTP plugin in WAS ND as doing this. It actually works pretty well, even for substantial web sites.
Load balancing tries to be cleverer than that: some feedback on the relative load of the servers determines where new requests go (even then, session affinity tends to be treated as higher priority than balancing load). The WebSphere On Demand Router that was originally delivered in XD does this. If you read this article you will see the kind of algorithms used.
You can achieve balancing with network spraying devices; they can consult "agents" running on the servers, which give feedback to the sprayer as a basis for deciding where requests should go. Hence even this hardware-based approach can have a software element. See Dynamic Feedback Protocol.
Network combinatorics, max-flow min-cut theorems and their use.