We use the apache2-worker package with mod_proxy, mod_proxy_balancer and mod_status.
Apache is configured as a load balancer / dispatcher to WFS servers.
OS: SuSE SLES 11 SP2
Apache httpd: version 2.2.12
All of our workers (WFS servers) can handle only one request at a time, so in /etc/apache2/server-tuning.conf, section <IfModule worker.c>, we set the parameter ServerLimit to 1.
In the configuration of a BalancerMember we used the parameter max=1.
That is, /etc/apache2/conf.d/proxy.conf looks like this:
<Proxy balancer://wfscluster>
BalancerMember http://wfsserver01:9090 max=1 timeout=3600 acquire=30000
BalancerMember http://wfsserver01:9091 max=1 timeout=3600 acquire=30000
BalancerMember http://wfsserver02:9090 max=1 timeout=3600 acquire=30000
</Proxy>
ProxyPass /wfs balancer://wfscluster/ nofailover=On
The parameter acquire:
Documentation says:
If set this will be the maximum time to wait for a free connection in the connection pool, in milliseconds. If there are no free connections in the pool the Apache will return SERVER_BUSY status to the client.
My understanding of the parameter acquire is the following, which is also my desired behaviour:
The load balancer gets some requests from clients. At some point in time all workers are busy.
The next request remains on hold by the load balancer until a worker becomes free. If a worker becomes free the pending request is assigned to the free worker, which then accepts the connection.
If there are no free workers in the time specified by parameter acquire the client gets an error response.
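To make the desired behaviour concrete, here is a minimal sketch in Java. It is only a toy model of the queueing I would like to see, not how mod_proxy is implemented; the class name AcquireModel and the 503 status for the busy case are assumptions made for illustration.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

// Toy model of the desired behaviour; NOT Apache's implementation.
class AcquireModel {
    // One permit per backend worker that can serve a single request at a time.
    private final Semaphore freeWorkers;
    private final long acquireMillis;

    AcquireModel(int workers, long acquireMillis) {
        this.freeWorkers = new Semaphore(workers, true); // fair: waiters are served in arrival order
        this.acquireMillis = acquireMillis;
    }

    // Returns 200 if a worker became free within the acquire timeout, 503 otherwise.
    int handle(Runnable request) {
        boolean acquired;
        try {
            acquired = freeWorkers.tryAcquire(acquireMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return 503;
        }
        if (!acquired) {
            return 503; // no worker became free in time
        }
        try {
            request.run(); // the request is handed to whichever worker is free
            return 200;
        } finally {
            freeWorkers.release();
        }
    }

    private static void awaitQuietly(CountDownLatch latch) {
        try {
            latch.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        AcquireModel balancer = new AcquireModel(1, 100);
        CountDownLatch started = new CountDownLatch(1);
        CountDownLatch finish = new CountDownLatch(1);

        // First request occupies the only worker until we let it finish.
        Thread busy = new Thread(() -> balancer.handle(() -> {
            started.countDown();
            awaitQuietly(finish);
        }));
        busy.start();
        started.await();

        System.out.println(balancer.handle(() -> { })); // prints 503: worker busy past the timeout
        finish.countDown();
        busy.join();
        System.out.println(balancer.handle(() -> { })); // prints 200: worker is free again
    }
}
```

The point of the model is that a pending request is never bound to a particular busy worker; it simply waits for the next free one, up to the acquire timeout.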
But the parameter acquire doesn't work as expected. The load balancer assigns the next request to a busy worker.
Even if another worker becomes free in the meantime, the request is still assigned to the busy worker and the client has to wait until the busy worker has finished the current request and accepts the new one.
If you do a /etc/init.d/apache2 reload you get an error message in Apache's error_log:
BalancerMember Acquire timeout has wrong format
After that message httpd dies.
If you only start or restart the httpd you don't get that message and the httpd is alive.
I also tried to specify a unit, as in acquire=30000ms, but the error remains.
The only thing which helps is to remove the acquire parameter, but the described behaviour is the same.
So the question is:
How do I have to use the parameter acquire? Does anyone have a working example?
Do I have to use other parameters to get the desired behaviour?
I have apache sitting in front of my node server. Node is running on a certain port; I am using apache to proxy to that port and also have apache configured for https.
When I start apache and then start my node server everything runs great. If I bring down the node server and try to hit my service apache says 'Service Temporarily Unavailable'. This is expected as my node server is down.
However, when I bring my server back up without touching apache and try to hit my service again, apache still says 'Service Temporarily Unavailable'. It's like apache is not trying again. If I bounce apache all is well again.
Since I am running with forever, there is a chance my server could be down for a few seconds if a fatal error happens. I don't want to have to bounce apache if that happens.
Is there any way to get apache to always try, and not cache the fact that a service it recently tried to hit was unavailable?
You need to add retry=0 to the ProxyPass directive.
So it will be something like:
ProxyPass /example http://backend.example.com retry=0
Check some info here: http://httpd.apache.org/docs/current/mod/mod_proxy.html#proxypass
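As a rough illustration of why retry=0 helps, here is a toy model in Java of the retry timeout as the mod_proxy documentation describes it: once a backend worker is in the error state, no requests are forwarded to it until the retry timeout (in seconds, 60 by default) has expired. The class RetryModel and its method names are made up for this sketch and are not Apache code.

```java
// Toy model of the documented retry timeout; NOT Apache's implementation.
class RetryModel {
    private final long retrySeconds;
    private long errorAt = -1; // time of the last failed attempt, -1 if the worker is healthy

    RetryModel(long retrySeconds) {
        this.retrySeconds = retrySeconds;
    }

    // now: current time in seconds; backendUp: whether the backend would answer if contacted.
    int request(long now, boolean backendUp) {
        boolean inErrorState = errorAt >= 0 && now - errorAt < retrySeconds;
        if (inErrorState) {
            return 503; // the backend is not even contacted
        }
        if (!backendUp) {
            errorAt = now; // remember the failure
            return 503;
        }
        errorAt = -1;
        return 200;
    }

    public static void main(String[] args) {
        RetryModel withDefault = new RetryModel(60);
        System.out.println(withDefault.request(0, false)); // prints 503: backend is down
        System.out.println(withDefault.request(5, true));  // prints 503: still in the error state
        System.out.println(withDefault.request(61, true)); // prints 200: retry timeout expired

        RetryModel retryZero = new RetryModel(0);
        System.out.println(retryZero.request(0, false));   // prints 503: backend is down
        System.out.println(retryZero.request(5, true));    // prints 200: tried again immediately
    }
}
```

With retry=0 the error state never suppresses a request, so the proxy contacts the backend again as soon as it is back.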
This is a rather complicated scenario, so I would highly appreciate any pointer to the correct direction.
So I have set up apache on server A to proxy https traffic to server B, which is a plone site behind varnish and apache.
I connect to A and can browse the site on https, everything is fine. However, problems start when I upload files via plone's POST forms. I can upload small files (~1 MB), but when I try to upload a 50MB file, I wait until the file is fully uploaded, and when the indication is 100%, I get a Bad gateway (The proxy server received an invalid response from an upstream server.)
It seems to me that something times out in the communication between A and B, and instead of being redirected to the correct url I get a Bad gateway, not to mention that the file is not uploaded.
On the apache log I see
[error] proxy: pass request body failed
As suggested on other threads, I've experimented with the following values with no luck
force-proxy-request-1.0
proxy-nokeepalive
KeepAlive
KeepAliveTimeout
proxy-initial-not-pooled
Timeout
ProxyTimeout
Sooooo..any suggestions? Thanks a million in advance!
Did you check the varnish configuration? Varnish has some timeouts of its own. I am familiar with send_timeout, which usually breaks downloads if they fail to finish within a few seconds (Varnish really isn't any good for large downloads, because you end up doing stupid things like configuring send_timeout=7200 to make it work).
Also, set first_byte_timeout to a larger number for that backend, because a large file upload might delay plone's response just enough to cause this.
Setting the Timeout and KeepAliveTimeout in the apache virtual host file worked for me.
Example:
Timeout 3600
KeepAliveTimeout 50
I am using the following Apache config to forward requests to a Tomcat server:
ProxyPass /myapp ajp://localhost:8009/myapp max=2
This is a simplified config, but it is enough to reproduce the issue, which is that the max parameter has no effect. If I throw 10 concurrent requests at Apache, all 10 are forwarded to Tomcat at the same time, while I would like to have them forwarded 2 by 2. Should I use something other than the max parameter for this?
The max=2 failed to limit the number of requests concurrently forwarded to Tomcat because I was running this on UNIX, and my Apache came preconfigured with prefork MPM, which creates one process per request. The max applies per process, hence doesn't have the desired effect.
If you are in this situation and need to limit the number of concurrent requests forwarded to Tomcat, then you'll need to replace your Apache with a worker or event MPM Apache, and in the config set ServerLimit to 1, and ThreadsPerChild and MaxClients to the same value, which will be the total number of concurrent connections your Apache will be able to process. You can find more information about this in this section documenting the recommended Apache configuration for Orbeon Forms.
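The arithmetic behind this can be sketched as follows. This is a simplified model written for illustration, not Apache code; the class and method names are hypothetical, and it assumes only that max caps the connection pool of each httpd child process separately.

```java
// Simplified model, NOT Apache code: how many requests can reach the backend at once.
class ProxyConcurrencyModel {
    // processes: httpd child processes; threadsPerProcess: request threads in each process;
    // maxPerProcess: the ProxyPass max= value, which caps each process's own connection pool.
    static int backendConcurrency(int processes, int threadsPerProcess, int maxPerProcess) {
        return processes * Math.min(threadsPerProcess, maxPerProcess);
    }

    public static void main(String[] args) {
        // prefork: 10 single-threaded processes with max=2, so the cap never applies
        System.out.println(backendConcurrency(10, 1, 2)); // prints 10
        // worker MPM with ServerLimit 1 and ThreadsPerChild 10, max=2
        System.out.println(backendConcurrency(1, 10, 2)); // prints 2
    }
}
```

With a single multi-threaded process, the per-process limit and the global limit coincide, which is why ServerLimit has to be 1 for max to behave as a global cap.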
service apache2 restart
This error has been driving me nuts. We have a server running Apache and Tomcat, serving multiple different sites. Normally the server runs fine, but sometimes an error happens where people are served the wrong page - the page that somebody else requested!
Clues:
The pages being delivered are those that another user requested recently, and are otherwise delivered correctly. It's been known for two simultaneous requests to be swapped. As far as I can tell, none of the pages being incorrectly delivered are older than a few minutes.
It only affects the files that are being served by Tomcat. Static files like images are unaffected.
It doesn't happen all the time. When it does happen, it happens for everybody.
It seems to happen at times of peak demand. However, the demand is not yet very high - it's certainly well within the bounds of what Apache can cope with.
Restarting Tomcat fixed it, but only for a few minutes. Restarting Apache fixed it, but only for a few minutes.
The server is running Apache 2 and Tomcat 6, using a Java 6 VM on Gentoo. The connection is with AJP13, and JkMount directives within <VirtualHost> blocks are correct.
There's nothing of use in any of the log files.
Further information:
Apache does not have any form of caching turned on. All the caching-related entries in httpd.conf and related imports say, for example:
<IfDefine CACHE>
LoadModule cache_module modules/mod_cache.so
</IfDefine>
While the options for Apache don't include that flag:
APACHE2_OPTS="-D DEFAULT_VHOST -D INFO -D LANGUAGE -D SSL -D SSL_DEFAULT_VHOST -D PHP5 -D JK"
Tomcat likewise has no caching options switched on, that I can find.
toolkit's suggestion was good, but not appropriate in this case. What leads me to believe that the error can't be within my own code is that it isn't simply a few values that are being transferred - it's the entire request, including the URL, parameters, session cookies, the whole thing. People are getting pages back saying "You are logged in as John", when they clearly aren't.
Update:
Based on suggestions from several people, I'm going to add the following HTTP headers to Tomcat-served pages to disable all forms of caching:
Cache-Control: no-store
Vary: *
Hopefully these headers will be respected not just by Apache, but also by any other caches or proxies that may be in the way. Unfortunately I have no way of deliberately reproducing this error, so I'm just going to have to wait and see if it turns up again.
I notice that the following headers are being included - could they be related in any way?
Connection: Keep-Alive
Keep-Alive: timeout=5, max=66
Update:
Apparently this happened again while I was asleep, but has stopped happening now I'm awake to see it. Again, there's nothing useful in the logs that I can see, so I have no clues to what was actually happening or how to prevent it.
Is there any extra information I can put in Apache or Tomcat's logs to make this easier to diagnose?
Update:
Since this has happened again a couple of times, we've changed how Apache connects to Tomcat to see if it affects things. We were using mod_jk with a directive like this:
JkMount /portal ajp13
We've switched now to using mod_proxy_ajp, like so:
ProxyPass /portal ajp://localhost:8009/portal
We'll see if it makes any difference. This error was always annoyingly unpredictable, so we can never definitively say if it's worked or not.
Update:
We just got the error briefly on a site that was left using mod_jk, while a sister site on the same server using mod_proxy_ajp didn't show the error. This doesn't prove anything, but it does provide evidence that switching to mod_proxy_ajp may have helped.
Update:
We just got the error again last night on a site using mod_proxy_ajp, so clearly that hasn't solved it - mod_jk wasn't the source of the problem. I'm going to try the anonymous suggestion of turning off persistent connections:
KeepAlive Off
If that fails as well, I'm going to be desperate enough to start investigating GlassFish.
Update:
Dammit! The problem just came back. I hadn't seen it in a while, so I was starting to think we'd finally sorted it. I hate heisenbugs.
Could it be the thread-safety of your servlets?
Do your servlets store any information in instance members?
For example, something as simple as the following may cause thread-related issues:
public class MyServlet ... {
    // Instance member: shared by every thread that serves a request
    private String action;

    public void doGet(...) {
        action = request.getParameter("action");
        processAction(response);
    }

    public void processAction(...) {
        if (action.equals("foo")) {
            // send foo page
        } else if (action.equals("bar")) {
            // send bar page
        }
    }
}
Because the servlet is accessed by multiple threads, there is no guarantee that the action instance member will not be clobbered by someone else's request, and end up sending the wrong page back.
The simple solution to this issue is to use local variables instead of instance members:
public class MyServlet ... {
    public void doGet(...) {
        // Local variable: each request thread gets its own copy
        String action = request.getParameter("action");
        processAction(action, response);
    }

    public void processAction(String action, ...) {
        if (action.equals("foo")) {
            // send foo page
        } else if (action.equals("bar")) {
            // send bar page
        }
    }
}
Note: this extends to JavaServer Pages too, if you were dispatching to them for your views?
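If it helps to see the effect without a servlet container, here is a small self-contained sketch that replays one unlucky interleaving on a single thread so that the result is deterministic. UnsafeHandler and SafeHandler are hypothetical stand-ins for the two servlet variants, not real servlet API classes.

```java
// Deterministic replay of the race; the handler classes are stand-ins, not servlet API.
class SharedStateDemo {
    // Mimics the first variant: one instance, request state kept in a shared field.
    static class UnsafeHandler {
        private String action;

        void receive(String requestedAction) {
            action = requestedAction;
        }

        String respond() {
            return action + " page";
        }
    }

    // Mimics the fix: request state travels as an argument, nothing is shared.
    static class SafeHandler {
        String handle(String requestedAction) {
            return requestedAction + " page";
        }
    }

    public static void main(String[] args) {
        UnsafeHandler unsafe = new UnsafeHandler();
        unsafe.receive("foo"); // request A arrives
        unsafe.receive("bar"); // request B arrives before A has been answered
        System.out.println(unsafe.respond()); // A's response: prints "bar page"

        SafeHandler safe = new SafeHandler();
        System.out.println(safe.handle("foo")); // prints "foo page"
    }
}
```

In a real container the interleaving depends on thread scheduling, which would explain why such a bug shows up mostly under load.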
Check if your headers allow caching without the correct Vary HTTP header (if you use session cookies, for instance, and allow caching, you need an entry in the Vary HTTP header for the cookie header, or a cache/proxy might serve the cached version of a page intended for one user to another user).
The problem might not be with caching on your web server, but on another layer of caching (either on a reverse proxy in front of your web server, or on a proxy near the users). If the clients are behind a NAT, they might also be behind a transparent proxy (and, to make things even harder to debug, the transparent proxy might be configured to not be visible in the headers).
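To illustrate the Vary point, here is a toy cache in Java. It is a deliberately simplified model, not how any real proxy is implemented: the class VaryCacheModel is hypothetical, and "varying on the cookie" is modelled by simply including the cookie value in the cache key.

```java
import java.util.HashMap;
import java.util.Map;

// Toy shared cache; a simplified model, NOT a real proxy implementation.
class VaryCacheModel {
    private final Map<String, String> store = new HashMap<>();
    private final boolean varyOnCookie;

    VaryCacheModel(boolean varyOnCookie) {
        this.varyOnCookie = varyOnCookie;
    }

    // Returns the page the client receives; on a cache miss the "origin" renders it for that user.
    String fetch(String url, String sessionUser) {
        String key = varyOnCookie ? url + "|" + sessionUser : url;
        return store.computeIfAbsent(key, k -> "You are logged in as " + sessionUser);
    }

    public static void main(String[] args) {
        VaryCacheModel withoutVary = new VaryCacheModel(false);
        withoutVary.fetch("/portal", "John");
        System.out.println(withoutVary.fetch("/portal", "Mary")); // prints "You are logged in as John"

        VaryCacheModel withVary = new VaryCacheModel(true);
        withVary.fetch("/portal", "John");
        System.out.println(withVary.fetch("/portal", "Mary")); // prints "You are logged in as Mary"
    }
}
```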
Eight updates to the question later, here is one more thing to try in order to test or reproduce the issue, although it might be difficult (or expensive) for public sites.
You could enable https on the sites. This would at least rule out any other proxies' caches along the way. It would be bad to find that there are some forgotten load balancers or company caches on the way that interfere with your traffic.
For public sites this would imply trusted certificates on the keys, so some money will be involved. For testing, self-signed keys might suffice. Also, check that there's no transparent proxy involved that decrypts and re-encrypts the traffic (these are easily detectable, as they can't use the same certificate/key as the original server).
Although you mentioned that mod_cache is not enabled in your setup, for others who may have encountered the same issue with mod_cache enabled (even on static content), the solution is to make sure the following directive is set, so that the Set-Cookie HTTP header is not stored in the cache:
CacheIgnoreHeaders Set-Cookie
The reason is that mod_cache will otherwise cache the Set-Cookie header, which may then get served to other users. This would leak the session ID of the user who last filled the cache to another user.
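The leak can be pictured with a small Java model. This is only a sketch of the idea (a cache that stores response headers, with an optional list of headers it must not store); HeaderCacheModel is a hypothetical class, not mod_cache code, and header names are matched exactly for simplicity.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

// Toy model of a cache that stores response headers; NOT mod_cache code.
class HeaderCacheModel {
    private final Set<String> ignoredHeaders;
    private final Map<String, Map<String, String>> cachedHeaders = new HashMap<>();

    HeaderCacheModel(Set<String> ignoredHeaders) {
        this.ignoredHeaders = ignoredHeaders;
    }

    // Returns the headers the client receives; originHeaders is what the backend sends on a miss.
    Map<String, String> fetch(String url, Map<String, String> originHeaders) {
        Map<String, String> hit = cachedHeaders.get(url);
        if (hit != null) {
            return hit; // served from cache, backend not contacted
        }
        Map<String, String> toStore = new HashMap<>(originHeaders);
        toStore.keySet().removeAll(ignoredHeaders); // drop headers that must not be cached
        cachedHeaders.put(url, toStore);
        return originHeaders; // the first client still gets its own full response
    }

    public static void main(String[] args) {
        HeaderCacheModel leaky = new HeaderCacheModel(Set.of());
        leaky.fetch("/logo.png", Map.of("Set-Cookie", "JSESSIONID=john"));
        // The second user is handed the first user's session cookie:
        System.out.println(leaky.fetch("/logo.png", Map.of("Set-Cookie", "JSESSIONID=mary")));

        HeaderCacheModel fixed = new HeaderCacheModel(Set.of("Set-Cookie"));
        fixed.fetch("/logo.png", Map.of("Set-Cookie", "JSESSIONID=john"));
        // The cached copy carries no cookie at all:
        System.out.println(fixed.fetch("/logo.png", Map.of("Set-Cookie", "JSESSIONID=mary")));
    }
}
```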
I had this problem and it really drove me nuts. I don't know why, but I solved it by turning off KeepAlive in httpd.conf
from
KeepAlive On
to
KeepAlive Off
My application doesn't use the keepalive feature, so it worked very well for me.
Try this:
response.setHeader("Cache-Control", "no-cache"); //HTTP 1.1
response.setHeader("Pragma", "no-cache"); //HTTP 1.0
response.setDateHeader("Expires", 0); //prevents caching at the proxy server
Have a look at this site; it describes an issue with mod_jk. I came across your posting while looking at a very similar issue. Basically the fix is to upgrade to a newer version of mod_jk. I haven't had a chance to implement the change in our server yet, but I'm going to try this tomorrow and see if it helps.
http://securitytracker.com/alerts/2009/Apr/1022001.html
I'm no expert, but could it be some weird Network Address Translation issue?
We switched Apache from proxying with AJP to proxying with HTTP. So far it appears to have solved the issue, or at least vastly reduced it - the problem hasn't been reported in months, and the app's use has increased since then.
The change is in Apache's httpd.conf. Having started with mod_jk:
JkMount /portal ajp13
We switched to mod_proxy_ajp:
ProxyPass /portal ajp://localhost:8009/portal
Then finally to straight mod_proxy:
ProxyPass /portal http://localhost:8080/portal
You'll need to make sure Tomcat is set up to serve HTTP on port 8080. And remember that if you're serving /, you need to include / on both sides of the proxy or it starts crying:
ProxyPass / http://localhost:8080/
It may not be a caching issue at all. Try to increase the MaxClients parameter in apache2.conf. If it is too low (150 by default?), Apache starts to queue requests. When it decides to serve a queued request via mod_proxy it pulls out a wrong page (or maybe it is just stressed doing all the queuing).
Are you sure that it is the page that somebody else requested, and not a page without parameters?
You could get weird errors if your connectionTimeout is too short in server.xml on the Tomcat server behind Apache. Increase it to a bigger number:
default configuration:
<Connector port="8080" protocol="HTTP/1.1"
connectionTimeout="20000"
redirectPort="8443" />
changed:
<Connector port="8080" protocol="HTTP/1.1"
connectionTimeout="2000000"
redirectPort="8443" />