Hi, could someone please list all the available load-balancing algorithms for HAProxy, with a simple definition of each?
Regards
From the manual:
<algorithm> is the algorithm used to select a server when doing load
balancing. This only applies when no persistence information
is available, or when a connection is redispatched to another
server. <algorithm> may be one of the following :
roundrobin Each server is used in turns, according to their weights.
This is the smoothest and fairest algorithm when the server's
processing time remains equally distributed. This algorithm
is dynamic, which means that server weights may be adjusted
on the fly for slow starts for instance. It is limited by
design to 4095 active servers per backend. Note that in some
large farms, when a server becomes up after having been down
for a very short time, it may sometimes take a few hundreds
requests for it to be re-integrated into the farm and start
receiving traffic. This is normal, though very rare. It is
indicated here in case you would have the chance to observe
it, so that you don't worry.
static-rr Each server is used in turns, according to their weights.
This algorithm is similar to roundrobin except that it is
static, which means that changing a server's weight on the
fly will have no effect. On the other hand, it has no design
limitation on the number of servers, and when a server goes
up, it is always immediately reintroduced into the farm, once
the full map is recomputed. It also uses slightly less CPU to
run (around -1%).
leastconn The server with the lowest number of connections receives the
connection. Round-robin is performed within groups of servers
of the same load to ensure that all servers will be used. Use
of this algorithm is recommended where very long sessions are
expected, such as LDAP, SQL, TSE, etc... but is not very well
suited for protocols using short sessions such as HTTP. This
algorithm is dynamic, which means that server weights may be
adjusted on the fly for slow starts for instance.
source The source IP address is hashed and divided by the total
weight of the running servers to designate which server will
receive the request. This ensures that the same client IP
address will always reach the same server as long as no
server goes down or up. If the hash result changes due to the
number of running servers changing, many clients will be
directed to a different server. This algorithm is generally
used in TCP mode where no cookie may be inserted. It may also
be used on the Internet to provide a best-effort stickiness
to clients which refuse session cookies. This algorithm is
static by default, which means that changing a server's
weight on the fly will have no effect, but this can be
changed using "hash-type".
uri This algorithm hashes either the left part of the URI (before
the question mark) or the whole URI (if the "whole" parameter
is present) and divides the hash value by the total weight of
the running servers. The result designates which server will
receive the request. This ensures that the same URI will
always be directed to the same server as long as no server
goes up or down. This is used with proxy caches and
anti-virus proxies in order to maximize the cache hit rate.
Note that this algorithm may only be used in an HTTP backend.
This algorithm is static by default, which means that
changing a server's weight on the fly will have no effect,
but this can be changed using "hash-type".
This algorithm supports two optional parameters "len" and
"depth", both followed by a positive integer number. These
options may be helpful when it is needed to balance servers
based on the beginning of the URI only. The "len" parameter
indicates that the algorithm should only consider that many
characters at the beginning of the URI to compute the hash.
Note that having "len" set to 1 rarely makes sense since most
URIs start with a leading "/".
The "depth" parameter indicates the maximum directory depth
to be used to compute the hash. One level is counted for each
slash in the request. If both parameters are specified, the
evaluation stops when either is reached.
url_param The URL parameter specified in argument will be looked up in
the query string of each HTTP GET request.
If the modifier "check_post" is used, then an HTTP POST
request entity will be searched for the parameter argument,
when it is not found in a query string after a question mark
('?') in the URL. Optionally, specify a number of octets to
wait for before attempting to search the message body. If the
entity can not be searched, then round robin is used for each
request. For instance, if your clients always send the LB
parameter in the first 128 bytes, then specify that. The
default is 48. The entity data will not be scanned until the
required number of octets have arrived at the gateway, this
is the minimum of: (default/max_wait, Content-Length or first
chunk length). If Content-Length is missing or zero, it does
not need to wait for more data than the client promised to
send. When Content-Length is present and larger than
<max_wait>, then waiting is limited to <max_wait> and it is
assumed that this will be enough data to search for the
presence of the parameter. In the unlikely event that
Transfer-Encoding: chunked is used, only the first chunk is
scanned. Parameter values separated by a chunk boundary, may
be randomly balanced if at all.
If the parameter is found followed by an equal sign ('=') and
a value, then the value is hashed and divided by the total
weight of the running servers. The result designates which
server will receive the request.
This is used to track user identifiers in requests and ensure
that a same user ID will always be sent to the same server as
long as no server goes up or down. If no value is found or if
the parameter is not found, then a round robin algorithm is
applied. Note that this algorithm may only be used in an HTTP
backend. This algorithm is static by default, which means
that changing a server's weight on the fly will have no
effect, but this can be changed using "hash-type".
hdr(<name>) The HTTP header <name> will be looked up in each HTTP request.
Just as with the equivalent ACL 'hdr()' function, the header
name in parenthesis is not case sensitive. If the header is
absent or if it does not contain any value, the roundrobin
algorithm is applied instead.
An optional 'use_domain_only' parameter is available, for
reducing the hash algorithm to the main domain part with some
specific headers such as 'Host'. For instance, in the Host
value "haproxy.1wt.eu", only "1wt" will be considered.
This algorithm is static by default, which means that
changing a server's weight on the fly will have no effect,
but this can be changed using "hash-type".
rdp-cookie
rdp-cookie(name)
The RDP cookie <name> (or "mstshash" if omitted) will be
looked up and hashed for each incoming TCP request. Just as
with the equivalent ACL 'req_rdp_cookie()' function, the name
is not case-sensitive. This mechanism is useful as a degraded
persistence mode, as it makes it possible to always send the
same user (or the same session ID) to the same server. If the
cookie is not found, the normal roundrobin algorithm is
used instead.
Note that for this to work, the frontend must ensure that an
RDP cookie is already present in the request buffer. For this
you must use 'tcp-request content accept' rule combined with
a 'req_rdp_cookie_cnt' ACL.
This algorithm is static by default, which means that
changing a server's weight on the fly will have no effect,
but this can be changed using "hash-type".
<arguments> is an optional list of arguments which may be needed by some
algorithms. Right now, only "url_param" and "uri" support an
optional argument.
balance uri [len <len>] [depth <depth>]
balance url_param <param> [check_post [<max_wait>]]
The load balancing algorithm of a backend is set to roundrobin when no other
algorithm, mode nor option have been set. The algorithm may only be set once
for each backend.
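As an illustration of the hash-based entries above ("source", "uri"), here is a minimal Python sketch. It is not HAProxy's code: the names uri_hash_key and pick_server, the use of CRC32, and reading "divided by the total weight" as "take the remainder modulo the total weight" are all assumptions made for the example. The "len" and "depth" handling only approximates the behaviour the manual describes.

```python
import zlib


def uri_hash_key(uri, length=None, depth=None, whole=False):
    """Return the part of the URI that would be hashed (illustrative only)."""
    if length is not None:
        uri = uri[:length]          # "len": keep only the first characters
    slashes = 0
    end = 0
    for ch in uri:
        if ch == "/":
            slashes += 1            # "depth": one level per slash
            if depth is not None and slashes == depth + 1:
                break
        elif ch == "?" and not whole:
            break                   # default: ignore the query string
        end += 1
    return uri[:end]


def pick_server(key, weights):
    """Map a key to a server through a static weighted table (assumed scheme)."""
    # A server with weight w occupies w slots, so it gets w shares of the keys.
    slots = [name for name, w in weights.items() for _ in range(w)]
    return slots[zlib.crc32(key.encode()) % len(slots)]


weights = {"s1": 2, "s2": 1}        # hypothetical servers and weights
key = uri_hash_key("/a/b/c?x=1", depth=1)
server = pick_server(key, weights)
```

Because the table depends on the set of running servers and their weights, any change to that set moves many keys at once, which is the behaviour the manual warns about for the static hash algorithms.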
I am working on a video-conferencing application. We have a pool of servers where rooms are created; a room can have any number of users. I was exploring HAProxy and several other load balancers, but couldn't find a solution for what I was looking for.
My requirements are as follows:
A room should be created on the server with the lowest load at the time of creation.
All users of that room should join on the same server.
I have tried url_param balance logic with consistent hashing, but it is distributing load randomly. Is it even possible with modern L7 load balancers or do I need to write some custom logic (in some load balancer) or a separate application for this scenario?
Is there any way of balancing load based on connections or CPU usage while maintaining the session stickiness?
The documentation for balance says that you can choose an algorithm such as leastconn, and that it only applies when no persistence information is available, or when a connection is redispatched to another server.
So the second part of the answer is stick tables. Read the docs about stick match and the other stick keywords.
With a stick table it looks like this:
backend foo
    mode http
    balance leastconn
    stick-table type ip size 200k expire 30m
    # "stick on" both matches and stores; "stick store-request" alone never reads the table
    stick on src
    server s1 192.168.1.1:8080
    server s2 192.168.1.2:8080
There are more examples in the docs.
What you need to figure out (or tell us) is how to tell from the request which room a client wants, and then build the stick table and rules around that. If the room is in the URL or an HTTP header, this is perfectly doable in HAProxy.
If leastconn is not good enough, you can adjust the servers' weights dynamically through HAProxy's Unix socket CLI and use the roundrobin algorithm. Agent checks can also be configured on servers to set their weights dynamically.
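The interplay between leastconn and a stick table can be sketched in a few lines of Python. This is a toy model, not HAProxy behaviour in detail: the Backend class, the route method and the room identifiers are hypothetical, and it assumes the room ID can be extracted from each request. A room that is already in the table goes to its stored server; a new room goes to the server with the fewest connections at that moment.

```python
class Backend:
    """Toy model: least-connections choice plus a sticky room -> server table."""

    def __init__(self, servers):
        self.conns = {s: 0 for s in servers}   # current connections per server
        self.stick = {}                        # room id -> server (the "stick table")

    def route(self, room_id):
        server = self.stick.get(room_id)
        if server is None:
            # No persistence entry yet: fall back to the balancing algorithm.
            server = min(self.conns, key=self.conns.get)
            self.stick[room_id] = server
        self.conns[server] += 1
        return server


backend = Backend(["s1", "s2"])
first = backend.route("room-A")    # new room: least-loaded server
second = backend.route("room-A")   # same room: same server, whatever the load
other = backend.route("room-B")    # new room: the other, emptier server
```

The balancing algorithm only decides where a room is created; every later join is decided by the table, which matches the two requirements in the question.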
When a load balancer can use the round-robin algorithm to distribute incoming requests evenly across the nodes, why do we need consistent hashing to distribute the load? What are the best scenarios for consistent hashing and for round robin?
From this blog,
With traditional “modulo hashing”, you simply consider the request
hash as a very large number. If you take that number modulo the number
of available servers, you get the index of the server to use. It’s
simple, and it works well as long as the list of servers is stable.
But when servers are added or removed, a problem arises: the majority
of requests will hash to a different server than they did before. If
you have nine servers and you add a tenth, only one-tenth of requests
will (by luck) hash to the same server as they did before.
Then
there’s consistent hashing. Consistent hashing uses a more elaborate
scheme, where each server is assigned multiple hash values based on
its name or ID, and each request is assigned to the server with the
“nearest” hash value. The benefit of this added complexity is that
when a server is added or removed, most requests will map to the same
server that they did before. So if you have nine servers and add a
tenth, about 1/10 of requests will have hashes that fall near the
newly-added server’s hashes, and the other 9/10 will have the same
nearest server that they did before. Much better! So consistent
hashing lets us add and remove servers without completely disturbing
the set of cached items that each server holds.
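The nine-to-ten-servers claim in the quote can be checked with a small Python experiment. This is a sketch under assumptions: MD5 is used only as a convenient stable hash, each server gets 100 points on the ring, and a request goes to the next point clockwise rather than the strictly "nearest" one. The names modulo_pick, Ring and moved_fraction are made up for the example.

```python
import bisect
import hashlib


def h(text):
    """Stable 64-bit hash of a string."""
    return int.from_bytes(hashlib.md5(text.encode()).digest()[:8], "big")


def modulo_pick(key, servers):
    return servers[h(key) % len(servers)]


class Ring:
    """Consistent-hash ring with several points ("virtual nodes") per server."""

    def __init__(self, servers, vnodes=100):
        self.points = sorted((h(f"{s}#{i}"), s) for s in servers for i in range(vnodes))
        self.hashes = [p for p, _ in self.points]

    def pick(self, key):
        # First point at or after the key's hash, wrapping around the ring.
        i = bisect.bisect(self.hashes, h(key)) % len(self.points)
        return self.points[i][1]


def moved_fraction(before, after, keys):
    """Share of keys whose server differs between the two pick functions."""
    return sum(before(k) != after(k) for k in keys) / len(keys)


keys = [f"req-{i}" for i in range(10000)]
nine = [f"s{i}" for i in range(9)]
ten = nine + ["s9"]

modulo_moved = moved_fraction(lambda k: modulo_pick(k, nine),
                              lambda k: modulo_pick(k, ten), keys)
ring9, ring10 = Ring(nine), Ring(ten)
ring_moved = moved_fraction(ring9.pick, ring10.pick, keys)
```

With modulo hashing roughly nine keys in ten change server; with the ring only about one in ten does, and every key that moves lands on the newly added server.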
In short, round robin suits the case where the list of servers is stable and it does not matter which server handles a given request. Consistent hashing suits the case where the backend servers need to scale out or in and most requests should keep mapping to the same server as before. Consistent hashing can also achieve a well-distributed, uniform spread.
Let's say we want to maintain user sessions on servers, so we want all requests from a user to go to the same server. Round robin won't help here, as it blindly forwards requests in a circular fashion among the available servers.
To get a stable mapping from each user to one server, we need a hash-based load balancer. Consistent hashing works on this idea, and it also elegantly handles the cases where we want to add or remove servers.
References: see Gaurav Sen's videos below for further explanation.
https://www.youtube.com/watch?v=K0Ta65OqQkY
https://www.youtube.com/watch?v=zaRkONvyGr8
For completeness, I want to point out one other important feature of Consistent Hashing that hasn't yet been mentioned: DOS mitigation.
If a load balancer is getting spammed with requests (from too many customers, an attack, or a haywire local service), a round-robin approach will spread the request spam evenly across all upstream services. Even spread out, this load might be too much for each service to handle. So what happens? Your load balancer, in trying to be helpful, has brought down your entire system.
If you use a modulus or consistent hashing approach, then only a small subset of services will be DOS'd by the barrage.
Being able to "limit the blast radius" in this manner is a critical feature of production systems.
Consistent hashing fits stateful systems well, that is, systems where the context of a previous request is needed to serve the current one. In such a system, if the previous and current requests land on different servers, the context is lost and the current request cannot be fulfilled. With consistent hashing we can route all requests from a particular user to the same server. Round robin cannot do this, so it is better suited to stateless systems.
My project uses the Presets plugin with the flag onlyAllowPresets=true.
The reason for this is to close a potential vulnerability where a script might request an image thousands of times, resizing it in 1px increments or something like that.
My question is: Is this a real vulnerability? Or does ImageResizer have some kind of protection built-in?
I kind of want to set the onlyAllowPresets to false, because it's a pain in the butt to deal with all the presets in such a large project.
I only know of one instance where this kind of attack was performed. If you're that valuable of a target, I'd suggest using a firewall (or CloudFlare) that offers DDOS protection.
An attack that targets cache-misses can certainly eat a lot of CPU, but it doesn't cause paging and destroy your disk queue length (bitmaps are locked to physical ram in the default pipeline). Cached images are still typically served with a reasonable response time, so impact is usually limited.
That said, run a test, fake an attack, and see what happens under your network/storage/cpu conditions. We're always looking to improve attack handling, so feedback from more environments is great.
Most applications or CMSes will have multiple endpoints that are storage- or CPU-intensive (often a wildcard search). Not to say that this is good (it's not), but the most cost-effective layer to handle this is often the firewall or CDN. And today, most CMSes include some (often poor) form of dynamic image processing, so remember to test or disable that as well.
Request signing
If your image URLs are originating from server-side code, then there's a clean solution: sign the urls before spitting them out, and validate during the Config.Current.Pipeline.Rewrite event. We'd planned to have a plugin for this shipping in v4, but it was delayed - and we've only had ~3 requests for the functionality in the last 5 years.
The sketch for signing would be:
Sort the querystring by key.
Concatenate the path and the pairs.
HMACSHA256 the result with a secret key.
Append the result to the end of the querystring.
For verification:
Parse the query.
Remove the hmac.
Sort the query and concatenate it with the path as before.
HMACSHA256 the result and compare it to the value we removed.
Raise an exception if it's wrong.
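The signing and verification steps above can be sketched in Python (shown in Python for brevity, not as ImageResizer code). The parameter name "hmac", the secret, the canonical form "path?sorted-pairs" and the function names are assumptions for illustration; in ImageResizer the verification would live in the Rewrite event handler mentioned earlier.

```python
import hashlib
import hmac
from urllib.parse import parse_qsl, urlencode, urlsplit

SECRET = b"replace-with-a-real-secret"   # hypothetical key, keep it server-side


def _digest(path, pairs):
    # Sort pairs by key, concatenate with the path, then HMAC-SHA256 the result.
    canonical = path + "?" + urlencode(sorted(pairs))
    return hmac.new(SECRET, canonical.encode(), hashlib.sha256).hexdigest()


def sign(url):
    """Return the URL with its signature appended to the querystring."""
    parts = urlsplit(url)
    pairs = parse_qsl(parts.query, keep_blank_values=True)
    signature = _digest(parts.path, pairs)
    return parts.path + "?" + urlencode(pairs + [("hmac", signature)])


def verify(url):
    """Raise ValueError unless the URL carries exactly one valid signature."""
    parts = urlsplit(url)
    pairs = parse_qsl(parts.query, keep_blank_values=True)
    sent = [v for k, v in pairs if k == "hmac"]
    rest = [(k, v) for k, v in pairs if k != "hmac"]
    expected = _digest(parts.path, rest)
    # compare_digest avoids leaking how many leading characters matched.
    if len(sent) != 1 or not hmac.compare_digest(sent[0].encode(), expected.encode()):
        raise ValueError("bad or missing signature")


signed = sign("/img/photo.jpg?width=200&height=100")
verify(signed)                                   # passes silently
tampered = signed.replace("width=200", "width=201")
```

Calling verify(tampered) raises ValueError, so a script cannot walk through sizes one pixel at a time without knowing the secret.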
Our planned implementation would allow 'whitelisted' variations: certain values that a signature would permit the client to modify, say for breakpoint-based width values. This would be done by replacing the targeted key/value pairs with a serialized whitelist policy prior to signing. For validation, pairs targeted by a policy would be removed prior to signature verification, and policy enforcement would happen if the signature was otherwise a match.
Perhaps you could add more detail about your workflow and what is possible?
I'm trying to implement a Telnet client using C++, with Qt for the GUI.
I have no idea how to handle the telnet negotiations.
Every telnet command is preceded by IAC, e.g.
IAC WILL SUPPRESS_GO_AHEAD
The following is how I handle the negotiation:
Search for the IAC character in the received buffer.
Respond to the request according to the command and option.
My questions are described as follows:
It seems that the telnet server won't wait for a client response after a negotiation command is sent.
e.g. (it sends two or more commands without waiting for a client response)
IAC WILL SUPPRESS_GO_AHEAD
IAC WILL ECHO
How should I handle such situation? Handle two requests or just the last one?
What would the option values be if I don't respond to the request? Are they set to their defaults?
Why won't an IAC character (255) be treated as data instead of a command?
Yes, it is allowed to send out several negotiations for different options without synchronously waiting for a response after each of them.
Actually it's important for each side to try to continue (possibly after some timeout if you did decide to wait for a response) even if it didn't receive a reply, as there are legitimate situations according to the RFC when there shouldn't or mustn't be a reply and also the other side might just ignore the request for whatever reason and you don't have control over that.
You need to consider both negotiation requests the server sent, as they are both valid requests (you may choose to deny one or both, of course).
I suggest you handle both of them (whatever "handling" means in your case) as soon as you notice them, so as not to risk getting the server stuck if it decides to wait for your replies.
One possible way to go about it is presented by Daniel J. Bernstein in RFC 1143. It uses a finite state machine (FSM) and is quite robust against negotiation loops.
A compliant server (the same goes for a compliant client) defaults all negotiable options to WON'T and DON'T (i.e. disabled) at the start of the connection and doesn't consider them enabled until a request for DO or WILL was acknowledged by a WILL or DO reply, respectively.
Not all servers (or clients for that matter) behave properly, of course, but you cannot anticipate all ways a peer might misbehave, so just assume that all options are disabled until enabling them was requested and the reply was positive.
I'll assume here that what you're actually asking is how the server is going to send you a byte of 255 as data without you misinterpreting it as an IAC control sequence (and vice versa, how you should send a byte of 255 as data to the server without it misinterpreting it as a telnet command).
The answer is simply that instead of a single byte of 255, the server (and your client in the opposite direction) sends IAC followed by another byte of 255, so in effect doubling all values of 255 that are part of the data stream.
Upon receiving an IAC followed by 255 over the network, your client (and the server in the opposite direction) must replace that with a single data byte of 255 in the data stream it returns.
This is also covered in RFC 854.
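A minimal Python sketch of the receive side, under simplifying assumptions: it refuses every option (answering WILL with DONT and DO with WONT), treats a doubled IAC as one data byte of 255, and does not handle subnegotiation (SB ... SE), which a server should not start for options that were refused. It is far simpler than the RFC 1143 state machine, and the function name process is made up. The same loop translates directly to C++ over a QByteArray.

```python
IAC, DONT, DO, WONT, WILL = 255, 254, 253, 252, 251


def process(buf):
    """Split received bytes into (data, replies, unparsed_tail)."""
    data = bytearray()
    replies = bytearray()
    i = 0
    while i < len(buf):
        byte = buf[i]
        if byte != IAC:
            data.append(byte)                # ordinary data
            i += 1
        elif i + 1 >= len(buf):
            break                            # lone IAC: wait for more bytes
        elif buf[i + 1] == IAC:
            data.append(IAC)                 # IAC IAC is one data byte of 255
            i += 2
        elif buf[i + 1] in (WILL, DO, WONT, DONT):
            if i + 2 >= len(buf):
                break                        # option byte not received yet
            command, option = buf[i + 1], buf[i + 2]
            if command == WILL:
                replies += bytes([IAC, DONT, option])   # refuse the server's offer
            elif command == DO:
                replies += bytes([IAC, WONT, option])   # refuse the server's request
            # WONT / DONT: the option is already off, so no reply is sent
            i += 3
        else:
            i += 2                           # other two-byte command, ignored here
    return bytes(data), bytes(replies), buf[i:]


received = bytes([IAC, WILL, 3, IAC, WILL, 1]) + b"hi" + bytes([IAC, IAC])
data, replies, tail = process(received)
```

Both back-to-back requests (option 3 is SUPPRESS_GO_AHEAD, option 1 is ECHO) are answered in one pass, and the returned tail should be prepended to the next chunk read from the socket so that a sequence split across reads is not lost.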
I have a file download site. What I'm looking for is a way to limit bandwidth per IP (!). The limit should be set dynamically by an HTTP header from the backend.
My current implementation uses X-Accel-Limit-Rate (I can change that header, it's not hard-coded anywhere), but it only limits the current connection/request.
Is my idea doable in G-Wan?
Yes, this can be done.
Write a G-WAN handler to extract the X-Accel-Limit-Rate HTTP header. Then enforce this policy by using the throttle_reply() G-WAN API call documented here.
An available example called throttle.c might help you further.
The throttle_reply() G-WAN function lets you apply throttling on a global basis or per connection, so you will just apply the relevant throttling values for either IP addresses or authenticated users, depending on your needs.
throttle_reply() can change the download speed dynamically during the lifespan of each connection so you can slow-down old connections and create new ones with an adaptive download rate.
Of course, this can be enforced per client IP address (or cookie, or even ISP/datacenter AS record) to deal with huge workloads.
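The bookkeeping needed to turn a per-connection limit into a per-IP limit can be sketched independently of G-WAN. The Python below is not G-WAN code and does not call throttle_reply(): the PerIpLimiter class is hypothetical, and it assumes the header value is a rate in bytes per second. It simply divides an IP's budget among that IP's open connections; the resulting number is what a handler would then pass to the server's throttling call.

```python
from collections import defaultdict


class PerIpLimiter:
    """Share one rate budget among all open connections of the same client IP."""

    def __init__(self):
        self.limit = {}                    # ip -> budget from the backend header
        self.active = defaultdict(int)     # ip -> number of open connections

    def open(self, ip, limit_rate):
        self.limit[ip] = limit_rate        # the latest header value wins
        self.active[ip] += 1

    def close(self, ip):
        self.active[ip] -= 1
        if self.active[ip] == 0:           # last connection gone: forget the IP
            del self.active[ip]
            del self.limit[ip]

    def rate_per_connection(self, ip):
        return self.limit[ip] // self.active[ip]


limiter = PerIpLimiter()
limiter.open("203.0.113.7", 1_000_000)
single = limiter.rate_per_connection("203.0.113.7")
limiter.open("203.0.113.7", 1_000_000)
shared = limiter.rate_per_connection("203.0.113.7")
```

Re-evaluating the per-connection rate whenever a connection opens or closes is what allows existing downloads to slow down when the same IP starts another one.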