Apache proxy load balancing backend server failure detection - apache

Here's my scenario (designed by my predecessor):
Two Apache servers serving reverse proxy duty for a number of mixed backend web servers (Apache, IIS, Tomcat, etc.). There are some sites for which we have multiple backend web servers, and in those cases, we do something like:
<Proxy balancer://www.example.com>
BalancerMember http://192.168.1.40:80
BalancerMember http://192.168.1.41:80
</Proxy>
<VirtualHost *:80>
ServerName www.example.com:80
CustomLog /var/log/apache2/www.example.com.log combined
<Location />
Order allow,deny
Allow from all
ProxyPass balancer://www.example.com/
ProxyPassReverse balancer://www.example.com/
</Location>
</VirtualHost>
So in this example, I've got one site (www.example.com) in the proxy servers' configs, and that site is proxied to one or the other of the two backend servers, 192.168.1.40 and .41.
I'm evaluating this to make sure that we are fault tolerant on all of our web services (I've already put the two reverse proxy servers into a shared IP cluster for this reason), and I want to make sure that the load-balanced backend servers are fault tolerant as well. But I'm having trouble figuring out if backend failure detection (and the logic to avoid the failed backend server) is built into the mod_proxy_balancer module...
So if 192.168.1.40 goes down, will Apache detect this (I'll understand if it takes a failed request first) and automatically route all requests to the other backend, 192.168.1.41? Or will it continue to balance requests between the failed backend and the operational backend?
I've found some clues in the Apache documentation for mod_proxy and mod_proxy_balancer that seem to indicate that failure can be detected ("maxattempts = Maximum number of failover attempts before giving up.", "failonstatus = A single or comma-separated list of HTTP status codes. If set this will force the worker into error state when the backend returns any status code in the list."), but after a few days of searching, I've found nothing conclusive saying for sure that it will (or at least "should") detect backend failure and recovery.
I will say that most of the search results reference using the AJP protocol to pass the traffic to the backend servers, and this apparently does support failure detection, but my backends are a mixture of Apache, IIS, Tomcat and others, and I am fairly sure that many of them don't support AJP. They are also a mixture of Windows 2k3/2k8 and Linux (mostly Ubuntu Lucid) boxes running various different applications with various different requirements, so add-on modules like Backhand and LVS aren't an option for me.
I've also tried to empirically test this feature, by creating a new test site like this:
<Proxy balancer://test.example.com>
BalancerMember http://192.168.1.40:80
BalancerMember http://192.168.1.200:80
</Proxy>
<VirtualHost *:80>
ServerName test.example.com:80
CustomLog /var/log/apache2/test.example.com.log combined
LogLevel debug
<Location />
Order allow,deny
Allow from all
ProxyPass balancer://test.example.com/
ProxyPassReverse balancer://test.example.com/
</Location>
</VirtualHost>
Where 192.168.1.200 is a bogus address that isn't running any web server, to simulate a backend failure. The test site was served up without a problem for a bunch of different client machines, but even with the LogLevel set to debug, I didn't see anything logged to indicate that it detected that one of the backend servers was down... And I'd like to make 100% sure that I can take our load-balanced backends down for maintenance (one at a time, of course) without affecting production sites.

From http://httpd.apache.org/docs/2.4/mod/mod_proxy.html, section "BalancerMember parameters", parameter retry:
If the connection pool worker to the backend server is in the error
state, Apache httpd will not forward any requests to that server until
the timeout expires. This enables [one] to shut down the backend
server for maintenance, and bring it back online later. A value of 0
means always retry workers in an error state with no timeout.
However, there are other failure conditions that a connection-level check won't catch, for example an IIS backend running an application which is down. IIS is up, so a connection can be made and a page can be read; it's just that the page will always be a 500 Internal Server Error. Here you will have to use failonstatus to catch it and force the worker into an error state.
In all cases, once the worker is in an error state, traffic will not be directed to it. I've been trying different ways of consuming that first failure and retrying it, but there always seem to be cases where an error page makes it back to the client.
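As a rough illustration of the two settings discussed here, applied to the balancer from the question. The values retry=30 and failonstatus=500,503 are assumptions chosen for the example, not recommendations, and I have not tested this against the asker's setup:

```apache
<Proxy balancer://www.example.com>
    # retry=30: leave a member that is in the error state alone for 30 seconds
    # before trying it again (the documented default is 60).
    BalancerMember http://192.168.1.40:80 retry=30
    BalancerMember http://192.168.1.41:80 retry=30
    # Treat these backend status codes as failures and force the member
    # that returned them into the error state.
    ProxySet failonstatus=500,503
</Proxy>
```

Note that the response which triggers failonstatus may still reach the client, which matches the behaviour described above.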

There is a parameter 'ping' in the 'BalancerMember parameters'.
Reading the documentation, it sounds like 'ping' set to 500ms will send a test request before mod_proxy directs you to a BalancerMember. mod_proxy will wait 500ms for a response from the BalancerMember, and if it doesn't get one it will put the BalancerMember into an error state.
I tried implementing this, but it did not appear to help with directing requests to a live BalancerMember.
<Proxy balancer://APICluster>
BalancerMember https://api01 route=qa-api1 ttl=5 ping=500ms
BalancerMember https://api02 route=qa-api2 ttl=5 ping=500ms
ProxySet lbmethod=bybusyness stickysession=ROUTEID
</Proxy>
http://httpd.apache.org/docs/2.4/mod/mod_proxy.html
Ping property tells the webserver to "test" the connection to the backend before forwarding the request. For AJP, it causes mod_proxy_ajp to send a CPING request on the ajp13 connection (implemented on Tomcat 3.3.2+, 4.1.28+ and 5.0.13+). For HTTP, it causes mod_proxy_http to send a 100-Continue to the backend (only valid for HTTP/1.1 - for non HTTP/1.1 backends, this property has no effect). In both cases, the parameter is the delay in seconds to wait for the reply. This feature has been added to avoid problems with hung and busy backends. This will increase the network traffic during the normal operation which could be an issue, but it will lower the traffic in case some of the cluster nodes are down or busy. By adding a postfix of ms, the delay can be also set in milliseconds.

Related

apache mod_proxy_balancer randomly stops sending traffic to backend server, but no errors

I am using mod_proxy_balancer to load balance two back-end IIS servers. When monitoring the balancer-manager GUI, I noticed that occasionally Apache will stop sending traffic to one of the members. However, there are no errors present in the logs, and nothing to indicate that the server is unavailable. I have tried various lbmethods (bytraffic, bybusyness) and see the same result. I need to determine why traffic stops going to a member that is seemingly in good health and not returning errors. This generally happens under heavy load, which results in performance issues as one server is handling all requests.
Relevant config:
<Proxy balancer://cluster1>
BalancerMember http://iis1:80 route=2 timeout=45 keepAlive=On
BalancerMember http://iis2:80 route=1 timeout=45 keepAlive=On
ProxySet stickysession=ROUTEID
</Proxy>
Figured this out: it's because we are using sticky sessions on both our hardware load balancer and our Apache balancer configs. So when we run a load test using JMeter, all of the traffic goes to one server. I hope this helps.
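To make the mechanism concrete: with stickysession=ROUTEID, a client that presents the same ROUTEID cookie on every request keeps being sent to the member whose route matches it. The sketch below follows the pattern shown in the mod_proxy_balancer documentation. The Header line is an assumption, since the config above does not show how the cookie is issued:

```apache
# Requires mod_headers. Issue the cookie whenever the balancer picks a new route.
Header add Set-Cookie "ROUTEID=.%{BALANCER_WORKER_ROUTE}e; path=/" env=BALANCER_ROUTE_CHANGED

<Proxy balancer://cluster1>
    BalancerMember http://iis1:80 route=2 timeout=45 keepAlive=On
    BalancerMember http://iis2:80 route=1 timeout=45 keepAlive=On
    # Requests carrying ROUTEID=.1 go to iis2, requests carrying ROUTEID=.2 go to iis1.
    ProxySet stickysession=ROUTEID
</Proxy>
```

A load test that replays a single cookie, or that sits behind another sticky layer, will therefore land on one member only. Giving each simulated user its own cookie jar should spread the load again.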

Golang doesn't recognize Close-Notifier

When I use Apache mod_proxy to forward requests to my Go web server, the Go server doesn't recognize when the client disconnects. I am using the close notifier:
notify := rw.(http.CloseNotifier).CloseNotify()
go func() {
    // Blocks until the underlying client connection goes away.
    <-notify
    brk.closingClients <- cl.session.Value
}()
When I use firewall sitepath routing it doesn't work either.
But when I use my own Go reverse proxy, it works very well without any problems.
With Apache mod_proxy, the notification only arrives after some more real data has been sent to the Go web server.
Perhaps somebody has an idea how I can solve this, so that I recognize client disconnects directly, without having to receive any more data.
Here are my mod_proxy configs:
SSLProxyEngine On
ProxyRequests On
SSLProxyVerify none
SSLProxyCheckPeerCN off
SSLProxyCheckPeerName off
ProxyPass /event https://xxx.xxx.xxx.xxx:8888/event flushpackets=on keepalive=on
The Apache server isn't going to close the connection when the client disconnects. It's much more efficient for it to reuse the connection for as long as possible.
If you really want the reverse proxy to reconnect every time (beware you may run into performance or port allocation issues), you can force mod_proxy to use HTTP/1.0 or explicitly close the connections every time with either of:
SetEnv force-proxy-request-1.0 1
SetEnv proxy-nokeepalive 1
https://httpd.apache.org/docs/2.4/mod/mod_proxy.html#envsettings
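If the performance cost is a concern, one option might be to scope the setting to the affected path only, so other proxied URLs keep their persistent backend connections. This is a sketch, not something I have tested with the setup above. The Location wrapper is my assumption and it needs mod_env:

```apache
<Location /event>
    # Ask mod_proxy to close the backend connection after each request
    # for this path only.
    SetEnv proxy-nokeepalive 1
</Location>
```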

Apache mod_proxy on Azure

I keep running into an issue with Apache's mod_proxy where it won't forward any traffic. I'm using a Windows Azure virtual machine running Ubuntu 13.04 and have configured the proper HTTPS endpoint (port 443) for it. The proper Apache modules (proxy, ssl, etc.) are all installed, and the error logs show nothing, not even a warning to explain why this is happening. My VirtualHost setup is as follows:
<VirtualHost *:443>
RequestHeader set X-Forwarded-Proto "https"
ProxyPreserveHost On
ServerName www.example.com
SSLEngine On
#SSLProxyEngine On
SSLCertificateFile /ssl/my.com.crt
SSLCertificateKeyFile /ssl/my.key
<Proxy *>
Order deny,allow
Allow from all
</Proxy>
<Location />
SSLRequireSSL
Order deny,allow
Allow from all
</Location>
ProxyPass / http://127.0.0.1:8080/
ProxyPassReverse / http://127.0.0.1:8080/
</VirtualHost>
I have Listen 443 and NameVirtualHost *:443 all set as well. My service on the other port is running fine as doing a wget responds with an HTTP 200 OK response and I can reach it by manually inputting the port number. I have disabled all firewalls (for testing) to no avail as well. However, whenever I try to reach the service from the outside world through mod_proxy (port 443), the request times out and I get the usual "website not available" browser error.
If it means anything, the app I am running on the other port I need to forward HTTPS traffic to is a Play Framework 2.1 application. I set the server up exactly as in their documentation but still have these problems, so I'm assuming it may have something to do with Azure.
Any ideas? Is there some other type of endpoint configuration that I need to do specific for Windows Azure virtual machines to support SSL/TLS?
So, apparently, I have no idea how or why - but the Azure Gods decided to shine upon my setup all of a sudden. Overnight, without so much as a reboot or anything, mod_proxy on Azure just started working. I have no idea what the issue was, or even if there was one in the first place, but apparently the problem lies with something in the Azure infrastructure.
Sorry I couldn't be of more help for others encountering similar issues, but just giving it time worked for some unknown reason.

Setting up a websocket on Apache?

So I'm doing some research on websockets, and I have a few questions I can't seem to find a definitive answer for:
1. How can I set up a web socket on my Linux server? Is there an Apache module? Would I have to use 3rd-party PHP code or similar?
2. Are there any kinds of drawbacks to the method described in question 1 that I should be aware of other than browser compatibility?
3. How could I "upgrade" my websocket installation to a secure websocket installation (ws:// to wss://)? Would this be made easier or more difficult if SSL was already set up on my Apache server?
4. Is there any language I could use to connect to my web socket other than JavaScript?
5. What is the default request method for a web socket?
The new version 2.4 of Apache HTTP Server has a module called mod_proxy_wstunnel which is a websocket proxy.
http://httpd.apache.org/docs/2.4/mod/mod_proxy_wstunnel.html
I can't answer all questions, but I will do my best.
As you already know, WS is only a persistent full-duplex TCP connection with framed messages where the initial handshaking is HTTP-like. You need some server that's listening for incoming WS requests and that binds a handler to them.
Now it might be possible with Apache HTTP Server, and I've seen some examples, but there's no official support and it gets complicated. What would Apache do? Where would be your handler? There's a module that forwards incoming WS requests to an external shared library, but this is not necessary with the other great tools to work with WS.
WS server trends now include: Autobahn (Python) and Socket.IO (Node.js = JavaScript on the server). The latter also supports other hackish "persistent" connections like long polling and all the COMET stuff. There are other little known WS server frameworks like Ratchet (PHP, if you're only familiar with that).
In any case, you will need to listen on a port, and of course that port cannot be the same as the Apache HTTP Server already running on your machine (default = 80). You could use something like 8080, but even if this particular one is a popular choice, some firewalls might still block it since it's not supposed to be Web traffic. This is why many people choose 443, which is the HTTP Secure port that, for obvious reasons, firewalls do not block. If you're not using SSL, you can use 80 for HTTP and 443 for WS. The WS server doesn't need to be secure; we're just using the port.
Edit: According to Iharob Al Asimi, the previous paragraph is wrong. I have no time to investigate this, so please see his work for more details.
About the protocol, as Wikipedia shows, it looks like this:
Client sends:
GET /mychat HTTP/1.1
Host: server.example.com
Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Key: x3JJHMbDL1EzLkh9GBhXDw==
Sec-WebSocket-Protocol: chat
Sec-WebSocket-Version: 13
Origin: http://example.com
Server replies:
HTTP/1.1 101 Switching Protocols
Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Accept: HSmrc0sMlYUkAGmm5OPpG2HaGWk=
Sec-WebSocket-Protocol: chat
and keeps the connection alive. If you can implement this handshaking and the basic message framing (encapsulating each message with a small header describing it), then you can use any client-side language you want. JavaScript is only used in Web browsers because it's built-in.
As you can see, the default "request method" is an initial HTTP GET, although this is not really HTTP and loses everything in common with HTTP after this handshaking. I guess servers that do not support
Upgrade: websocket
Connection: Upgrade
will reply with an error or with a page content.
I struggled to understand the proxy settings for websockets over HTTPS, so let me clarify here what I figured out.
First you need to enable the proxy and proxy_wstunnel Apache modules. The Apache configuration file will then look like this:
<IfModule mod_ssl.c>
<VirtualHost _default_:443>
ServerName www.example.com
ServerAdmin webmaster@localhost
DocumentRoot /var/www/your_project_public_folder
SSLEngine on
SSLCertificateFile /etc/ssl/certs/path_to_your_ssl_certificate
SSLCertificateKeyFile /etc/ssl/private/path_to_your_ssl_key
<Directory /var/www/your_project_public_folder>
Options Indexes FollowSymLinks
AllowOverride All
Require all granted
php_flag display_errors On
</Directory>
ProxyRequests Off
ProxyPass /wss/ ws://example.com:port_no
ErrorLog ${APACHE_LOG_DIR}/error.log
CustomLog ${APACHE_LOG_DIR}/access.log combined
</VirtualHost>
</IfModule>
In your frontend application, use the URL "wss://example.com/wss/". This is very important: if you are stuck with websockets, the mistake is often in the frontend URL. You are probably writing the URL incorrectly, like one of these:
wss://example.com:8080/wss/ -> the port number should not be mentioned
ws://example.com/wss/ -> the URL should start with wss
wss://example.com/wss -> the URL should end with / (most important)
Also note that the trailing /wss/ is the same as the ProxyPass path: if you write ProxyPass /ws/, then the frontend URL should end with /ws/.
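For the "enable the modules" step at the start of this answer, the lines might look roughly like this. The module file paths are assumptions and depend on how httpd was packaged:

```apache
# Paths are illustrative; adjust them to your installation.
LoadModule proxy_module modules/mod_proxy.so
LoadModule proxy_wstunnel_module modules/mod_proxy_wstunnel.so
LoadModule ssl_module modules/mod_ssl.so
```

On Debian or Ubuntu style installs (which the ${APACHE_LOG_DIR} variable above suggests), the a2enmod helper is normally used instead of editing LoadModule lines by hand, for example a2enmod proxy proxy_wstunnel ssl followed by a restart.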

Apache Timeout Configuration

We are using Apache 2.2 and have three servers configured for load balancing, as below:
BalancerMember http://node1:port/ route=node1
BalancerMember xxxx://node2:xxxx/ route=node2
BalancerMember xxxx://node3:xxxx/ route=node3
However, the backend application nodes configured as balancer members require a lot of processing time, and hence we were facing timeout issues like the one below:
“The timeout specified has expired: proxy: error reading status line
from remote server ”
As I had a customised .conf file, I had to add the lines below explicitly to avoid picking up the default timeout value from the default httpd-default.conf file:
<VirtualHost server:port>
Timeout 500
<Proxy balancer://xxxxx>
BalancerMember http://node1:port/ route=node1 timeout=500
</Proxy>
</VirtualHost>
So now my questions are:
Do I need to explicitly configure the timeout value at both levels, as shown above?
a) Timeout 500 outside Proxy.
b) timeout=500 at the BalancerMember level.
I read on the internet that if the timeout of the Apache BalancerMember is
not configured, the global Apache timeout is inherited there. Please advise.
Also, please suggest the exact parameters that need to be tuned when a huge number of
concurrent requests is anticipated.
Thanks
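On the first question, the mod_proxy documentation lists ProxyTimeout as the default for a BalancerMember's timeout parameter, and the value of Timeout as the default for ProxyTimeout, so setting the value at one level may be enough. A minimal, untested sketch that reuses the placeholders from the question and only lengthens the timeout for proxied requests:

```apache
<VirtualHost server:port>
    # Applies to proxied requests only; when unset it falls back to Timeout.
    ProxyTimeout 500
    <Proxy balancer://xxxxx>
        # No timeout= on the member: it defaults to ProxyTimeout.
        BalancerMember http://node1:port/ route=node1
    </Proxy>
</VirtualHost>
```

Setting timeout=500 on an individual BalancerMember would then only be needed where one node should differ from the rest.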