I am using HAProxy for load balancing my HTTP requests. I would like to know if there is any way to customize the selection of backend server based on the responses returned by each server. I have a servlet which can return the responses (number of clients connected to it). I would like to use this information and route the request to the backend server which has the lowest number.
My HAProxy configuration looks like:
listen http_front xx.xx.xx.xx:8080
    mode http
    option httpchk GET /servlet/GetClientCountServlet
    server app1 xx.xx.xx.xx:8080 check port 8080
    server app2 xx.xx.xx.xx:8080 check port 8080
    server app3 xx.xx.xx.xx:8080 check port 8080
Wouldn't the leastconn balance mode work for your use case? Otherwise, you can use Lua scripts to customize how HAProxy does its load balancing.
As I am searching for a solution in the same direction, maybe this helps as a base:
Load balancing via a custom Lua script
Create a file called least_sessions.lua and add the following code:
local function backend_with_least_sessions(txn)
    -- Get the frontend that was used
    local fe_name = txn.f:fe_name()
    local least_sessions_backend = ""
    local least_sessions = 99999999999

    -- Loop through all the backends. You could change this
    -- so that the backend names are passed into the function too.
    for _, backend in pairs(core.backends) do
        -- Look at only backends that have names that start with
        -- the name of the frontend, e.g. "www_" prefix for "www" frontend.
        if backend and backend.name:sub(1, #fe_name + 1) == fe_name .. '_' then
            local total_sessions = 0

            -- Using the backend, loop through each of its servers
            for _, server in pairs(backend.servers) do
                -- Get server's stats
                local stats = server:get_stats()

                -- Get the backend's total number of current sessions
                if stats['status'] == 'UP' then
                    total_sessions = total_sessions + stats['scur']
                    core.Debug(backend.name .. ": " .. total_sessions)
                end
            end

            if least_sessions > total_sessions then
                least_sessions = total_sessions
                least_sessions_backend = backend.name
            end
        end
    end

    -- Return the name of the backend that has the fewest sessions
    core.Debug("Returning: " .. least_sessions_backend)
    return least_sessions_backend
end

core.register_fetches('leastsess_backend', backend_with_least_sessions)
This code will loop through all of the backends that start with the same letters as the current frontend, for example finding the backends www_dc1 and www_dc2 for the frontend www. It will then find the backend that currently has the fewest sessions and return its name.
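To make the selection rule easier to follow outside HAProxy, here is a minimal Python sketch of the same logic. It is only a model: the `backends` dict and its `status`/`scur` entries are hypothetical stand-ins for what `server:get_stats()` returns in the Lua script, and nothing here talks to HAProxy.

```python
def backend_with_least_sessions(frontend_name, backends):
    """Return the matching backend name with the fewest current sessions.

    `backends` maps a backend name to a list of per-server stat dicts
    with 'status' and 'scur' keys (hypothetical sample data, not a real
    HAProxy API).
    """
    prefix = frontend_name + "_"
    best_name = ""
    best_total = float("inf")
    for name, servers in backends.items():
        # Only consider backends named "<frontend>_...".
        if not name.startswith(prefix):
            continue
        # Sum current sessions over servers that are UP.
        total = sum(s["scur"] for s in servers if s["status"] == "UP")
        if total < best_total:
            best_name, best_total = name, total
    return best_name


sample = {
    "www_dc1": [{"status": "UP", "scur": 12}, {"status": "UP", "scur": 3}],
    "www_dc2": [{"status": "UP", "scur": 7}, {"status": "DOWN", "scur": 40}],
    "api_dc1": [{"status": "UP", "scur": 0}],
}
print(backend_with_least_sessions("www", sample))  # prints: www_dc2
```

One thing the sketch makes visible: as in the Lua version, a backend whose servers are all down sums to zero sessions and would be picked, so it may be worth skipping backends with no UP servers.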
Use a lua-load directive to load the file into HAProxy. Then add a use_backend line to your frontend to route traffic to the backend that has the fewest active sessions.
global
    lua-load /path/to/least_sessions.lua

frontend www
    bind :80
    use_backend %[lua.leastsess_backend]

backend www_dc1
    balance roundrobin
    server server1 192.168.10.5:8080 check maxconn 30

backend www_dc2
    balance roundrobin
    server server1 192.168.11.5:8080 check maxconn 30
More details:
https://www.haproxy.com/de/blog/5-ways-to-extend-haproxy-with-lua/
I'd like to store a custom "value" in a stick table and use it in another ACL to select the server.
I have this config, which creates a stick table keyed on the value of the "x-external-id" header, with the server ID as its value.
frontend frontend
    bind 125.213.51.144:8080
    default_backend backend

backend backend
    balance roundrobin
    stick store-request req.hdr(x-external-id)
    stick-table type string len 50 size 200k nopurge
    server gw1 125.213.51.100:8080 check id 1
    server gw2 125.213.51.101:8080 check id 2
This config produced this stick table:
# table: backend, type: string, size:204800, used:3
0x558955d52ac4: key=00000000000 use=0 exp=0 server_id=1
0x558955d53114: key=11111111111 use=0 exp=0 server_id=2
0x558955d87a34: key=22222222222 use=0 exp=0 server_id=2
The value (server_id) is set by HAProxy based on the server that handled the request. But I'd like to save a custom value here. Is that possible?
Apparently HAProxy doesn't allow storing custom values. Only server_id and tracking counters can be stored in a stick table.
So I defined two backends with one stick table each. Each client hits its own backend and populates that backend's stick table.
From another HAProxy section, I can then use table_server_id to look up the key in each stick table and route to the backend whose stick table contains the entry.
############## Frontend ################
frontend my-frontend
    bind 125.213.51.100:38989
    acl is_service1 req.hdr(x-external-id),table_server_id(stick-table-1) -m int gt 0
    use_backend my-backend if is_service1
    acl is_service2 req.hdr(x-external-id),table_server_id(stick-table-2) -m int gt 0
    use_backend my-backend-2 if is_service2
    default_backend my-backend-default

############## Backend 1 ################
backend my-backend
    balance roundrobin
    server service1 125.213.51.100:18989 check id 1 inter 10s fall 1 rise 1
    server service2 125.213.51.200:18989 check id 2 backup

############## Backend 2 ################
backend my-backend-2
    balance roundrobin
    server service2 125.213.51.100:18989 check id 2 inter 10s fall 1 rise 1
    server service1 125.213.51.200:18989 check id 1 backup

############## Backend Default ################
backend my-backend-default
    balance roundrobin
    server service1 125.213.51.100:18989 check id 1
    server service2 125.213.51.200:28989 check id 2
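The routing decision made by the two ACLs can be modelled with a short Python sketch. This is an illustration only: the stick tables are plain dicts holding made-up entries, and `choose_backend` is a hypothetical helper name, not anything HAProxy provides. It relies on the fact that table_server_id returns 0 when the key is not in the table, which is why the ACLs test for "gt 0".

```python
def choose_backend(external_id, tables, default="my-backend-default"):
    """Pick the first backend whose stick table knows this key.

    `tables` is an ordered list of (backend_name, {key: server_id}),
    mirroring the order of the use_backend rules in the frontend.
    """
    for backend_name, table in tables:
        # A missing key looks up as 0, so "> 0" means "entry found".
        if table.get(external_id, 0) > 0:
            return backend_name
    return default


# Hypothetical table contents for illustration.
tables = [
    ("my-backend", {"00000000000": 1}),
    ("my-backend-2", {"11111111111": 2, "22222222222": 2}),
]
print(choose_backend("22222222222", tables))  # prints: my-backend-2
```

Because the rules are evaluated in order, a key present in both tables would go to the first backend listed.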
I am trying to solve the following scenario using HAProxy:
Block all IPs by default
Allow connections only from a specific IP address
If connections come from a whitelisted IP, reject them once that IP exceeds 10 concurrent connections in 30 seconds
I want to do this to reduce the number of API calls to my server. Could anyone please help me with this?
Thanks
The first two requirements are easy: simply allow only the whitelisted IP.
acl whitelist src 10.12.12.23
use_backend SOMESERVER if whitelist
The third, throttling, requires stick tables (there are many data types: counters for conn, sess, http, rates...) used as a rate counter:
# up to 200k entries; count requests over 60s periods
stick-table type ip size 200k expire 100s store http_req_rate(60s)
Next you have to fill the table by tracking each request, e.g. by source IP:
tcp-request content track-sc0 src
# more info at http://cbonte.github.io/haproxy-dconv/1.5/configuration.html#4.2-tcp-request%20connection
and finally the acl:
# is there more than 5req/1min from IP
acl http_rate_abuse sc0_http_req_rate gt 5
# update use_backend condition
use_backend SOMESERVER if whitelist !http_rate_abuse
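The combination of tracking, rate threshold and whitelist can be sketched in Python as follows. This is a simplified model, not how HAProxy works internally: it keeps exact request timestamps per source IP, whereas HAProxy's http_req_rate counter is computed differently. The class and function names are hypothetical.

```python
from collections import defaultdict, deque


class RequestRateTracker:
    """Counts requests per source IP over a sliding period (in seconds)."""

    def __init__(self, period=60.0, limit=5):
        self.period = period
        self.limit = limit
        self._hits = defaultdict(deque)

    def track(self, src, now):
        """Record one request at time `now` and return the current rate."""
        hits = self._hits[src]
        hits.append(now)
        # Forget requests that fell out of the period.
        while hits and hits[0] <= now - self.period:
            hits.popleft()
        return len(hits)


def route(src, now, tracker, whitelist):
    # Every request is tracked, as with "track-sc0 src".
    rate = tracker.track(src, now)
    if src not in whitelist:
        return "denied"
    if rate > tracker.limit:  # same test as "sc0_http_req_rate gt 5"
        return "rate_limited"
    return "SOMESERVER"


tracker = RequestRateTracker(period=60.0, limit=5)
whitelist = {"10.12.12.23"}
for t in range(7):
    print(t, route("10.12.12.23", float(t), tracker, whitelist))
```

With one request per second, the first five requests are routed to SOMESERVER and the sixth and seventh are rate limited; once the period has passed without traffic, requests are accepted again.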
For example, here is a working config file with customized errors:
global
    log /dev/log local1 debug

defaults
    log global
    mode http
    option httplog
    retries 3
    option redispatch
    maxconn 2000
    contimeout 5000
    clitimeout 50000
    srvtimeout 50000

frontend http
    bind *:8181
    stick-table type ip size 200k expire 100s store http_req_rate(60s)
    tcp-request content track-sc0 src
    acl whitelist src 127.0.0.1
    acl http_rate_abuse sc0_http_req_rate gt 5
    use_backend error401 if !whitelist
    use_backend error429 if http_rate_abuse
    use_backend realone

backend realone
    server local stackoverflow.com:80

# too many requests
backend error429
    mode http
    errorfile 503 /etc/haproxy/errors/429.http

# unauthenticated
backend error401
    mode http
    errorfile 503 /etc/haproxy/errors/401.http
Note: the error handling is a bit tricky. Because the error backends above have no server entries, HAProxy will respond with HTTP 503; the errorfile directives catch that 503 and send a different response (with a different status code) instead.
Example /etc/haproxy/errors/401.http content:
HTTP/1.0 401 Unauthenticated
Cache-Control: no-cache
Connection: close
Content-Type: text/html

<html><body><h1>401 Unauthenticated</h1>
</body></html>
Example /etc/haproxy/errors/429.http content:
HTTP/1.0 429 Too many requests
Cache-Control: no-cache
Connection: close
Content-Type: text/html

<html><body><h1>429 Too many requests</h1>
</body></html>
UPDATE / SUMMARY:
I created a blog article here about the process I went through and my config file has changed slightly from below:
https://medium.com/#silverbackdan/installing-couchdb-2-0-nosql-with-centos-7-and-certbot-lets-encrypt-f412198c3051#.216m9mk1m
Main issues with HTTPS:
If running HTTP and HTTPS, shard dbs appear on HTTPS
Fauxton features lacking over HTTPS (admin user management, config management, setup wizard, Mango indexing/querying)
Not sure if they should be, but databases over HTTP and HTTPS are not the same
I hope I'm just missing something really obvious
ORIGINAL POST:
I'm trying to configure HTTPS (SSL) with CouchDB 2.0. I'm compiling a guide for others to be able to follow as well but have come across some issues.
I think over HTTPS, I don't have the same permissions as when I enable HTTP and use that instead. In Fauxton over HTTP I can see the configuration and I can run the setup procedure. With HTTPS I'm getting errors where it says I cannot create a database (which it tries to do automatically) because they start with an underscore. Most databases get set up but there's a few which show errors such as "_cluster_setup" when I visit the Configuration page.
Additionally, I get repeating error messages which do not stop CouchDB but say that the database "_users" does not exist (database_does_not_exist). It doesn't exist when I enable and connect over HTTP, but it does exist when I connect over HTTPS. If I enable both HTTP and HTTPS, then over my HTTPS connection I end up with a lot of shard databases (I'm new to NoSQL and CouchDB so I'm not sure what that's about, but they appear when errors similar to the above show up, i.e. when creating databases starting with underscores). Either way, I see those shard databases when logged in via HTTPS but not HTTP (Fauxton shows them as "unable to load", and for now I am just deleting them from the data directory).
There are also issues with accessing Fauxton over HTTPS using Chrome, but I think that's a known bug and it's OK to use Firefox or Safari at the moment.
Can anybody tell me if there are any settings which mean that a connection over port 6984 using HTTPS can have the same administrative rights as 5984 of HTTP? ...Or what the permissions issues there may be that results in the HTTPS connection bringing up these errors about underscores at the beginning of table names as I think that could basically resolve my main issues.
Here's my local.ini file, which may be of some use (I have also commented out ";httpd={couch_httpd, start_link, []}" in default.ini, as it says to here: https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=48203146):
; CouchDB Configuration Settings
; Custom settings should be made in this file. They will override settings
; in default.ini, but unlike changes made to default.ini, this file won't be
; overwritten on server upgrade.
[couchdb]
;max_document_size = 4294967296 ; bytes
;os_process_timeout = 5000
uuid = **REMOVED**
[couch_peruser]
; If enabled, couch_peruser ensures that a private per-user database
; exists for each document in _users. These databases are writable only
; by the corresponding user. Databases are in the following form:
; userdb-{hex encoded username}
;enable = true
; If set to true and a user is deleted, the respective database gets
; deleted as well.
;delete_dbs = true
[chttpd]
;port = 5984
;bind_address = 0.0.0.0
; Options for the MochiWeb HTTP server.
;server_options = [{backlog, 128}, {acceptor_pool_size, 16}]
; For more socket options, consult Erlang's module 'inet' man page.
;socket_options = [{recbuf, 262144}, {sndbuf, 262144}, {nodelay, true}]
[httpd]
; NOTE that this only configures the "backend" node-local port, not the
; "frontend" clustered port. You probably don't want to change anything in
; this section.
; Uncomment next line to trigger basic-auth popup on unauthorized requests.
WWW-Authenticate = Basic realm="administrator"
bind_address = 0.0.0.0
; Uncomment next line to set the configuration modification whitelist. Only
; whitelisted values may be changed via the /_config URLs. To allow the admin
; to change this value over HTTP, remember to include {httpd,config_whitelist}
; itself. Excluding it from the list would require editing this file to update
; the whitelist.
config_whitelist = [{httpd,config_whitelist}, {log,level}, {etc,etc}]
[query_servers]
;nodejs = /usr/local/bin/couchjs-node /path/to/couchdb/share/server/main.js
[httpd_global_handlers]
;_google = {couch_httpd_proxy, handle_proxy_req, <<"http://www.google.com">>}
[couch_httpd_auth]
; If you set this to true, you should also uncomment the WWW-Authenticate line
; above. If you don't configure a WWW-Authenticate header, CouchDB will send
; Basic realm="server" in order to prevent you getting logged out.
require_valid_user = true
secret = **REMOVED**
[os_daemons]
; For any commands listed here, CouchDB will attempt to ensure that
; the process remains alive. Daemons should monitor their environment
; to know when to exit. This can most easily be accomplished by exiting
; when stdin is closed.
;foo = /path/to/command -with args
[daemons]
; enable SSL support by uncommenting the following line and supply the PEM's below.
; the default ssl port CouchDB listens on is 6984
httpsd = {couch_httpd, start_link, [https]}
[ssl]
cert_file = /home/couchdb/couchdb/certs/cert.pem
key_file = /home/couchdb/couchdb/certs/privkey.pem
;password = somepassword
; set to true to validate peer certificates
;verify_ssl_certificates = false
; Set to true to fail if the client does not send a certificate. Only used if verify_ssl_certificates is true.
;fail_if_no_peer_cert = false
; Path to file containing PEM encoded CA certificates (trusted
; certificates used for verifying a peer certificate). May be omitted if
; you do not want to verify the peer.
cacert_file = /home/couchdb/couchdb/certs/chain.pem
; The verification fun (optional) if not specified, the default
; verification fun will be used.
;verify_fun = {Module, VerifyFun}
; maximum peer certificate depth
ssl_certificate_max_depth = 1
;
; Reject renegotiations that do not live up to RFC 5746.
secure_renegotiate = true
; The cipher suites that should be supported.
; Can be specified in erlang format "{ecdhe_ecdsa,aes_128_cbc,sha256}"
; or in OpenSSL format "ECDHE-ECDSA-AES128-SHA256".
;ciphers = ["ECDHE-ECDSA-AES128-SHA256", "ECDHE-ECDSA-AES128-SHA"]
ciphers = undefined
; The SSL/TLS versions to support
tls_versions = [tlsv1, 'tlsv1.1', 'tlsv1.2']
; To enable Virtual Hosts in CouchDB, add a vhost = path directive. All requests to
; the Virual Host will be redirected to the path. In the example below all requests
; to http://example.com/ are redirected to /database.
; If you run CouchDB on a specific port, include the port number in the vhost:
; example.com:5984 = /database
[vhosts]
REMOVEDDOMAIN.COM:* = ./database
[update_notification]
;unique notifier name=/full/path/to/exe -with "cmd line arg"
; To create an admin account uncomment the '[admins]' section below and add a
; line in the format 'username = password'. When you next start CouchDB, it
; will change the password to a hash (so that your passwords don't linger
; around in plain-text files). You can add more admin accounts with more
; 'username = password' lines. Don't forget to restart CouchDB after
; changing this.
[admins]
;admin = mysecretpassword
**REMOVED** = **REMOVED**
[cors]
origins = *
credentials = true
headers = accept, authorization, content-type, origin, referer
methods = GET, PUT, POST, HEAD, DELETE
I've been in touch with the CouchDB team via chat. CouchDB has been well tested behind HAProxy, so I've been advised to simply use HAProxy for SSL instead, as Erlang can be very difficult to configure for SSL. I'll update the article I've written with complete instructions using HAProxy once I've got everything working.
I am using haproxy to balance a cluster of servers. I am attempting to add a maintenance page to the haproxy configuration. I believe I can do this by defining a server declaration in the backend with the 'backup' modifier. My question is: how can I use a maintenance page hosted remotely in an AWS S3 bucket (static website) without actually redirecting the user to that page (i.e. without the haproxy server 'redir' definition)?
Say I have servers a, b and c. If all of them go down for maintenance, I want all requests to be handled by server definition d (which is labeled with 'backup'), pointing to a static address on S3. Note that I don't want paths to carry over and be evaluated on S3; it should always render the static maintenance page.
This is definitely possible.
First, declare a backup server, which will only be used if the non-backup servers are down.
server s3-fallback example.com.s3-website-us-east-1.amazonaws.com:80 backup
The following configuration entries are used to modify the request or the response only if we're using the alternate path. We're using two tests in the following examples:
# { nbsrv le 1 } -- if the number of servers in this backend is <= 1
# (and)
# { srv_is_up(s3-fallback) } -- if the server named "s3-fallback" is up; "server name" is the arbitrary name we gave the server in the config file
# (which would mean it's the "1" server that is up for this backend)
So, now that we have a backup back-end, we need a couple of other directives.
Force the path to / regardless of the request path.
http-request set-path / if { nbsrv le 1 } { srv_is_up(s3-fallback) }
If you're using an essentially empty bucket with an error document, then this isn't really needed, since any request path would generate the same error.
Next, we need to set the Host: header in the outgoing request to match the name of the bucket. This isn't technically needed if the bucket is named the same as the Host: header that's already present in the request we received from the browser, but probably still a good idea. If the bucket name is different, it needs to go here.
http-request set-header host example.com if { nbsrv le 1 } { srv_is_up(s3-fallback) }
If the bucket name is not a valid DNS name, then you should include the entire web site endpoint here. For a bucket called "example" --
http-request set-header host example.s3-website-us-east-1.amazonaws.com if { nbsrv le 1 } { srv_is_up(s3-fallback) }
If your clients are sending you their cookies, there's no need to relay these to S3. If the clients are HTTPS and the S3 connection is HTTP, you definitely want to strip these.
http-request del-header cookie if { nbsrv le 1 } { srv_is_up(s3-fallback) }
Now, handling the response...
You probably don't want browsers to cache the responses from this alternate back-end.
http-response set-header cache-control no-cache if { nbsrv le 1 } { srv_is_up(s3-fallback) }
You also probably don't want to return "200 OK" for these responses, since technically, you are displaying an error page, and you don't want search engines to try to index this stuff. Here, I've chosen "503 Service Unavailable" but any valid response code would work... 500 or 502, for example.
http-response set-status 503 if { nbsrv le 1 } { srv_is_up(s3-fallback) }
And, there you have it -- using an S3 bucket website endpoint as a backup backend, behaving no differently than any other backend. No browser redirect.
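To summarize the request-side rules in one place, here is a small Python model of what they do. It is a sketch under stated assumptions, not HAProxy code: `rewrite_for_fallback` is a hypothetical helper, `nbsrv` and `fallback_up` are passed in as plain values, and the bucket host "example.com" is taken from the example above.

```python
def rewrite_for_fallback(path, headers, nbsrv, fallback_up,
                         bucket_host="example.com"):
    """Return (path, headers) as they would be sent to the chosen server."""
    # Same condition as: if { nbsrv le 1 } { srv_is_up(s3-fallback) }
    if not (nbsrv <= 1 and fallback_up):
        return path, dict(headers)

    # Drop cookies and any existing Host header (case-insensitively).
    rewritten = {
        name: value
        for name, value in headers.items()
        if name.lower() not in ("cookie", "host")
    }
    # Point the Host header at the bucket and force the path to "/".
    rewritten["host"] = bucket_host
    return "/", rewritten


request_headers = {
    "Host": "www.example.org",
    "Cookie": "sid=1",
    "Accept": "*/*",
}
print(rewrite_for_fallback("/a/b?x=1", request_headers, 1, True))
print(rewrite_for_fallback("/a/b?x=1", request_headers, 3, False))
```

The first call models the "only the fallback is up" case and yields the path "/" with the cookie removed; the second leaves the request untouched.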
You could also configure the request to S3 to use HTTPS, but since you're just fetching static content, that seems unnecessary. If the browser is connecting to the proxy with HTTPS, that section of the connection will still be secure, although you do need to scrub anything sensitive from the browser's request, since it will be forwarded to S3 unencrypted (see "cookie," above).
This solution is tested on HAProxy 1.6.4.
Note that by default, the DNS lookup for the S3 endpoint will only be done when HAProxy is restarted. If that IP address changes, HAProxy will not see the change, without additional configuration -- which is outside the scope of this question, but see the resolvers section of the configuration manual.
I do use S3 as a back-end server behind HAProxy in several different systems, and I find this to be an excellent solution to a number of different issues.
However, there is a simpler way to have a custom error page for use when all the backends are down, if that's what you want.
errorfile 503 /etc/haproxy/errors/503.http
This directive is usually found in the defaults section, but it's also valid in a backend -- so this raw file will be automatically returned by the proxy for any request that tries to use this back-end, if all of the servers in this back-end are unhealthy.
The file is a raw HTTP response. It's essentially just written out to the client as it exists on the disk, with zero processing, so you have to include the desired response headers, including Connection: close. Each line of the headers and the line after the headers must end with \r\n to be a valid HTTP response. You can also just copy one of the others, and modify it as needed.
These files are limited by the size of a response buffer, which I believe is tune.bufsize, which defaults to 16,384 bytes... so it's only really good for small files.
HTTP/1.0 503 Service Unavailable\r\n
Cache-Control: no-cache\r\n
Connection: close\r\n
Content-Type: text/plain\r\n
\r\n
This site is offline.
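Getting the \r\n line endings right by hand is easy to get wrong, so one option is to generate the file. The following Python sketch builds the raw response shown above and checks it against the 16,384-byte figure mentioned earlier; `build_errorfile` is a hypothetical helper name, and the size limit is the answer's assumption about tune.bufsize rather than something verified here.

```python
def build_errorfile(status_line, headers, body, bufsize=16384):
    """Return the raw HTTP response bytes for an HAProxy errorfile.

    `headers` is a list of (name, value) pairs. Every header line ends
    with CRLF and an empty CRLF line separates headers from the body.
    """
    lines = [status_line] + ["%s: %s" % (name, value) for name, value in headers]
    raw = ("\r\n".join(lines) + "\r\n\r\n" + body).encode("ascii")
    if len(raw) > bufsize:
        raise ValueError("errorfile is %d bytes, larger than %d" % (len(raw), bufsize))
    return raw


raw = build_errorfile(
    "HTTP/1.0 503 Service Unavailable",
    [
        ("Cache-Control", "no-cache"),
        ("Connection", "close"),
        ("Content-Type", "text/plain"),
    ],
    "This site is offline.\n",
)
print(repr(raw))
```

Writing the returned bytes to the errorfile path in binary mode (for example with `open(path, "wb")`) keeps the line endings from being translated.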
Finally, note that in spite of the fact that you're wanting to "transparently proxy a request," I don't think the phrase "transparent proxy" is the correct one for what you're trying to do, because a "transparent proxy" implies that either the client or the server or both would see each other's IP addresses on the connection and think they were communicating directly, with no proxy in between, because of some skullduggery done by the proxy and/or network infrastructure to conceal the proxy's existence in the path. This is not what you're looking for.
I want to implement a proxy server that intercepts both HTTP and HTTPS requests. I came across libmproxy (http://mitmproxy.org/doc/scripting/libmproxy.html), which is SSL-capable. I started with this simple proxy, which just prints the headers of all requests and responses and forwards them to clients and servers normally.
#!/usr/bin/env python
from libmproxy import controller, proxy
import os


class Master(controller.Master):
    def __init__(self, server):
        controller.Master.__init__(self, server)
        self.stickyhosts = {}

    def run(self):
        try:
            return controller.Master.run(self)
        except KeyboardInterrupt:
            self.shutdown()

    def handle_request(self, msg):
        print "handle request.................................................."
        print msg.headers
        msg.reply()

    def handle_response(self, msg):
        print "handle response................................................."
        print msg.headers
        msg.reply()


config = proxy.ProxyConfig(
    cacert = os.path.expanduser("~/.mitmproxy/mitmproxy-ca.pem")
)
server = proxy.ProxyServer(config, 1234)
m = Master(server)
m.run()
Then I configure the HTTP and SSL proxy in Firefox to 127.0.0.1 port 1234. HTTP seems to work fine, as I can see all the headers printed out. However, when the browser sends HTTPS requests, the proxy server does not print anything at all, and the browser displays a "the connection was interrupted" error.
Further investigation reveals that the HTTPS requests go through the proxy server but not through controller.Master. I see that proxy.ProxyHandler.establish_ssl() is called when there is an HTTPS request, but the request does not go through controller.Master.handle_request(). Even though establish_ssl() is called, the browser does not seem to get any response back. I tested this with https://www.google.com.
First, how can I make proxy.ProxyHandler work properly with HTTPS requests/responses? Second, how can I modify controller.Master so that it can intercept HTTPS requests? I'm also open to other tools that I can build a custom HTTP/HTTPS proxy server on top of.
You need to install the mitmproxy CA in the browser you are testing with.
Please see details here ("Installing the mitmproxy CA" section):
http://mitmproxy.org/doc/ssl.html
This solved the problem for me.