How to stop Tomcat from processing old requests / clear the cache? - apache

In order to test the execution time of some methods, I sent around 250 requests to my backend, which uses Tomcat and Jersey.
Shortly after, I realized that there was a mistake in the code, and changed it. Now I have to wait until Tomcat has finished processing all the previous requests. I tried restarting the Tomcat server and redeploying the backend, but no luck.
When I start the server it continues to process the "old" requests.
How do I flush the cache?

Related

Server sends "internal error" response faster after Tomcat upgrade

I recently upgraded our Tomcat server from 7.0.85 to 9.0.70. I am using Apache 2.4.
My Java application runs in a cluster, and it is expected that if the master node fails during a command, the secondary node will take the master role and finish the action.
I have a test that starts an action, performs a failover, and ensures that the secondary node completes the action.
The client sends the request and loops up to 8 times trying to get an answer from the server.
Before the upgrade, the client gets a read-timeout for the first 3 or 4 tries, and then the secondary finishes the action, sends a 200 response, and the test passes. I can see in the Apache access log that the server is trying to send a 500 (internal error) response for the first tries, but I guess it takes too long and I get a read timeout before that.
After the upgrade, I am getting a read-timeout for the first try, but after that the client receives the internal error response and stops trying. I can see that on the second try the Apache response is much faster than on the first try, and also much faster than tries 2-4 were before the upgrade.
I can see in the tcpdump that on the first try (both before and after the upgrade) the connection between Apache and Tomcat reaches the timeout. On the following tries Tomcat sends Apache a connection reset. The difference is that after the upgrade Tomcat sends the connection reset immediately after the request, whereas before the upgrade it takes a few seconds to send it.
My socket timeout is 20 seconds, and the AJP timeout is 10 seconds (as it was before the upgrade). I am using the same configuration files as before the upgrade (except for some refactoring changes I had to make because of Tomcat changes). I tried changing the AJP timeout to 20 seconds, but it didn't help.
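For reference, these are the kinds of directives involved, shown as a sketch (hostnames are placeholders, and the mapping of the timeouts above onto these attributes is an assumption):

    # Apache side (mod_proxy_ajp): 10-second timeout on the AJP worker
    ProxyPass "/app" "ajp://tomcat-host:8009/app" timeout=10

    <!-- Tomcat side (server.xml): AJP connector with a 20-second connection timeout.
         Note that Tomcat 9 requires secretRequired/secret or an explicit address,
         unlike Tomcat 7. -->
    <Connector protocol="AJP/1.3" port="8009" address="0.0.0.0"
               secretRequired="false" connectionTimeout="20000" />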
Is this a configuration issue? Is there a way to “undo” this change?

Apache mod_wsgi slowloris DoS protection

Assuming the following setup:
Apache server 2.4
mpm_prefork with default settings (256 workers?)
Default Timeout (300s)
High KeepAliveTimeout (100s)
reqtimeout_mod enabled with the following config: RequestReadTimeout header=62,MinRate=500 body=62,MinRate=500
Outdated mod_wsgi 3.5 using Daemon mode with 15 threads and 1 process
AWS Elastic Beanstalk's load balancer acting as a reverse proxy to Apache, with a 60s idle connection timeout
Python/Django being the wsgi application
A simple slowloris attack like the one described here, using a "slow" request body: https://www.blackmoreops.com/2015/06/07/attack-website-using-slowhttptest-in-kali-linux/
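For reference, the attack from the linked article boils down to something like the following slowhttptest invocation (flag values are illustrative; check the tool's man page):

    # slow POST body: 15 connections, dripping a few bytes of body every 10 seconds
    slowhttptest -B -c 15 -r 15 -i 10 -s 8192 -l 300 -u http://target.example.com/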
The above attack, with just 15 requests (the same as the number of mod_wsgi threads), can easily lock the server until a timeout happens, either due to:
Load balancer timeout (60s) due to no data being sent; this kills the Apache connection and mod_wsgi can once again serve requests
Apache RequestReadTimeout due to data being sent, but not enough; again, mod_wsgi is able to serve requests after this
However, with just 15 concurrent "slow" requests, I was able to lock the server for up to 60 seconds.
Repeating the same attack with a larger number, like 4096 requests, locks the server more or less permanently, since there will always be a new request waiting to be served by mod_wsgi once a previous one times out.
I would expect the load balancer to detect/handle this before even sending requests to Apache, which it already does for similar attacks (partial headers or TCP SYN flood attacks never hit Apache, which is nice).
What options are available to help against this? I know there's no foolproof option, since these kinds of attacks are difficult to detect and protect against, but it's quite silly that the server can be locked that easily.
Also, if the wsgi application never reads the request body, I would expect the issue not to occur, since the request should return immediately. But I'm not sure about this, or about the internals of mod_wsgi. For example, it holds when using a local dev wsgi server (the attack fails since the request body is never read), but the attack succeeds when using mod_wsgi, which leads me to think it tries to read the body even before handing the request to the wsgi code.
Slowloris is a very simple Denial-of-Service attack. This is easy to detect and block.
Detecting and preventing DoS and DDoS attacks is a complex topic with many solutions. In your case you are making the situation worse by using outdated software and picking a low worker thread count, so that the problem arises quickly.
A combination of services is typically used to manage DoS and DDoS attacks.
The front end of the total system would be protected by a firewall. Typically this firewall would include a Web Application Firewall that understands the nuances of HTTP protocols. In the AWS world, AWS WAF and AWS Shield are commonly used.
Another service that helps is a CDN. Amazon CloudFront uses AWS Shield, so it has good DDoS support.
The next step is to combine load balancers with auto scaling mechanisms. When the health checks start to fail (caused by Slowloris), the auto scaler will begin launching new instances and terminating failed instances. However, a sustained Slowloris attack will just hit the new servers. This is why the Web Application Firewall needs to detect the attack and start blocking it.
For your studies, take a look at mod_reqtimeout. This is an effective and tuneable solution for Apache for most Slowloris attacks.
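As a sketch, a tighter variant of the configuration already quoted in the question (timeout values are illustrative, not recommendations):

    # mod_reqtimeout: fail requests whose headers or body arrive too slowly
    <IfModule reqtimeout_module>
        RequestReadTimeout header=20-40,MinRate=500 body=20,MinRate=500
    </IfModule>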
[Update]
In the Amazon DDoS White Paper June 2015, Slowloris is specifically mentioned.
On AWS, you can use Amazon CloudFront and AWS WAF to defend your application against these attacks. Amazon CloudFront allows you to cache static content and serve it from AWS Edge Locations that can help reduce the load on your origin. Additionally, Amazon CloudFront can automatically close connections from slow-reading or slow-writing attackers (e.g., Slowloris).
-- Amazon DDoS White Paper, June 2015
In mod_wsgi daemon mode there are a number of options that further help to combat such attacks, by recovering from them and discarding queued requests that have been waiting too long. Try your tests using mod_wsgi-express, as it defines defaults for a lot of these options, whereas when using mod_wsgi directly yourself there are no defaults. Use mod_wsgi-express start-server --help to see what the defaults are. The options to look at for mod_wsgi daemon mode are request-timeout, connect-timeout, socket-timeout and queue-timeout. There are also other options related to buffer sizes and the listener backlog you can play with. Do note that ultimately the listen backlog of the main Apache worker processes can still be an issue, because it usually defaults to 500, which means a lot of requests can queue up before you can even tag them with a time so as to help discard the backlog by tracking queue time.
You can find the documentation at:
http://modwsgi.readthedocs.io/en/develop/configuration-directives/WSGIDaemonProcess.html
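As a rough sketch, such a daemon-mode configuration might look like this (the application name, script path and timeout values are illustrative):

    # mod_wsgi daemon mode with the timeout options mentioned above
    WSGIDaemonProcess myapp processes=1 threads=15 \
        request-timeout=60 connect-timeout=10 socket-timeout=20 queue-timeout=30
    WSGIScriptAlias / /srv/myapp/wsgi.py process-group=myapp application-group=%{GLOBAL}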
On the point of whether mod_wsgi reads the request body before sending it: no, it doesn't. Apache itself, because it reads in blocks, may partially read the request body while reading the headers, but it shouldn't block on it. Once the full request headers have been passed off to mod_wsgi and sent through to the daemon process, mod_wsgi will start transferring the request body.
Solution:
If you are getting hit, I recommend you go to a provider that protects against DDoS attacks. However, your best bet would be to programmatically block an IP once it has been determined to be malicious. If you receive two large Content-Length POST requests from it, then you should block that IP for a few minutes for suspicious activity. Many of these providers are very cheap, and some of them, such as Cloudflare, are free for the basic package. I use them for my company and I am beyond happy to have them!
Edit: Their job is literally just to protect you. That is it.
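As a sketch of the programmatic-blocking idea in Apache 2.4 syntax (the IP address is a placeholder; in practice you would generate such rules from your detection logic):

    # Deny a client that has been flagged as malicious
    <Location "/">
        <RequireAll>
            Require all granted
            Require not ip 203.0.113.45
        </RequireAll>
    </Location>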

How does Apache detect a stopped Tomcat JVM?

We are running multiple Tomcat JVMs under a single Apache cluster. If we shut down all the JVMs except one, sometimes we get 503s. If we increase the retry interval to 180 (from retry=10), the problem goes away. That brings me to this question: how does Apache detect a stopped Tomcat JVM? If I have a cluster which contains multiple JVMs and some of them are down, how does Apache find that out? Somewhere I read that Apache uses a real request to determine the health of a backend JVM. In that case, will that request fail (with a 5xx) if the JVM is stopped? Why does a higher retry value make the difference? Do you think introducing ping might help?
If someone can explain a bit or point me to some docs, that would be awesome.
We are using Apache 2.4.10, mod_proxy, the byrequests LB algorithm, sticky sessions, keepalive on, and ttl=300 for all balancer members.
Thanks!
Well, let's examine what your configuration is actually doing in action, and then move on to what might help.
[docs]
retry - Whether you've set it to 10 or 180, what you specify is for how long Apache will consider your backend server down and thus won't send it requests. The higher the value, the more time you give your backend to come up completely, but you put more load on the others, since you are one server short for longer.
stickysession - If you lose a backend server for whatever reason, all the sessions that are on it get an error.
All right, now that we've described the relevant variables for your situation, let's make clear that Apache mod_proxy does not have an embedded health-check mechanism; it updates the status of your backends based on the responses to real requests.
So your current configuration works as follows:
Request arrives at Apache
Apache sends it to a live backend
If the request gets an error HTTP code in response, or gets no response at all, Apache puts that backend in the ERROR state.
After the retry time has passed, Apache sends requests to that backend server again.
Reading the above, you can see that the first request to reach a backend server that is down will get an error page.
One of the things you can do is indeed ping, which according to the docs will check the backend before sending any request. Consider, of course, the overhead that this produces.
I would also suggest configuring mod_proxy_ajp, which offers extra functionality (and configuration, of course) for your Tomcat backend failover detection, as in the sketch below.
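A rough sketch of such a balancer configuration (hostnames, routes and values are placeholders, not recommendations):

    # mod_proxy_balancer over AJP; ping makes Apache send a CPING check before each request
    <Proxy "balancer://mycluster">
        BalancerMember "ajp://tomcat1:8009" route=node1 retry=10 ping=3 ttl=300
        BalancerMember "ajp://tomcat2:8009" route=node2 retry=10 ping=3 ttl=300
        ProxySet lbmethod=byrequests stickysession=JSESSIONID
    </Proxy>
    ProxyPass "/app" "balancer://mycluster/app"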

Performance issues with Apache as a reverse proxy and an AJAX-heavy JSF application

I am currently developing a JSF application (running in JBoss 7 with PrimeFaces 3.5 and push via PrimePush, which basically uses the Atmosphere framework to hide all the transport-specific stuff behind a layer of abstraction).
As long as I am running just JBoss, the application works fine and responds quickly, as would be expected. However, when deploying to production, where JBoss runs behind an Apache reverse proxy, several problems appear.
The first problem is that Apache seems to kill the long-polling connection, which causes the client to miss out on push messages (even after configuring Atmosphere to use a broadcast cache). I currently work around that by periodically refreshing the whole page when the user is idle, although this smells really bad..
Second, Apache seems to really slow down the whole application. Watching the Apache error log, I am seeing a lot of messages like "error reading chunk" (I will post the exact message later, as I am currently writing this post on the go with my smartphone). Lots of digging around in the Atmosphere documentation and trying out different broadcasters did not change this in any way.
My question would be this: would I be better off using nginx, especially in the context of push via long polling?
I know I have given only a little detail; I will edit this post later when I'm at home ;)
Just so this topic gets closed: if you have an Atmosphere-based application running behind an Apache reverse proxy, be sure to set the ttl parameter on the ProxyPass directive. Setting this parameter to 5 worked for me; Apache now discards old connections fast enough that it doesn't run out of worker threads.
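For reference, a sketch of that directive (path and backend address are placeholders):

    # Discard backend connections that have been idle for more than 5 seconds
    ProxyPass "/app" "http://localhost:8080/app" ttl=5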

How to configure Glassfish to drop hanging requests?

Can I configure Glassfish to drop any request that takes longer than 10 seconds to process?
Example:
I'm using Glassfish to host my web service. The thread pool is configured with a maximum of 5 connections.
My service has a method that does this:
// log the request, then simulate a hanging request (~16 minutes)
System.out.println("New request");
Thread.sleep(1000 * 1000); // throws InterruptedException, declared by the web method
I'm sending 5 requests to the service, and I see 5 "New request" messages in the log. Then the server stops responding for a looong time.
In the live environment, all requests must be processed in less than a second. If one takes longer, then there is a problem with the request, and I want Glassfish to drop such requests but stay alive and serve other requests.
Currently I'm using a workaround in the code. At the beginning of my web method I launch a separate thread for request processing, with a timeout, as was suggested here: How to timeout a thread
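A minimal sketch of that workaround, assuming a JAX-RS-style resource method and a hypothetical process() doing the real work (uses java.util.concurrent and javax.ws.rs):

    // Run the real work on a separate thread and give up after 10 seconds
    ExecutorService executor = Executors.newSingleThreadExecutor();
    Future<String> future = executor.submit(() -> process()); // process() is hypothetical
    try {
        return future.get(10, TimeUnit.SECONDS);
    } catch (TimeoutException | InterruptedException | ExecutionException e) {
        future.cancel(true); // interrupt the hanging worker thread
        throw new WebApplicationException(Response.Status.SERVICE_UNAVAILABLE);
    } finally {
        executor.shutdownNow();
    }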
I do not like this solution and still believe that there must be a configuration setting in Glassfish to apply this logic to all requests, not just to one method.