While I can get microsecond resolution for the time taken to process a request (%D), I would like to relate that to the times of the multiple requests generated by a particular page so I can reconstruct the sequence of requests. However, as far as I can tell, the %t specifier only provides accuracy to the nearest second, which makes it impossible to reconstruct the original sequence of events.
Is there another way to get this information in my access_log files?
TIA
This is now possible with Apache 2.4. Use, for example, the following log format instead of %t:
[%{%d/%b/%Y:%H:%M:%S}t.%{msec_frac}t %{%z}t]
This will give times like [10/Apr/2012:10:47:22.027 +0000]
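For example, a complete LogFormat/CustomLog pair using it might look like this (the format name and log path are just placeholders):

LogFormat "%h %l %u [%{%d/%b/%Y:%H:%M:%S}t.%{msec_frac}t %{%z}t] \"%r\" %>s %b %D" combined_ms
CustomLog "logs/access_log" combined_ms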
Unfortunately, no. This got covered a while back (How to timestamp request logs with millisecond accuracy in Apache 2.0), and it's still true for the most recent stable (2.2.x) Apache branch.
I know of at least one workaround, though, if you're interested: You can pipe Apache's logs to an external process (see the docs page at http://httpd.apache.org/docs/current/mod/mod_log_config.html, under the "CustomLog" directive) which would add timestamps and actually write to the log file.
Note that this method does NOT capture the true request RX time. Apache doesn't output an access log entry until AFTER it has completed sending its response. Plus, there's an additional variable delay while Apache writes into the pipe and your timestamper reads from it (possibly including some buffering there). If you have turned on Apache's "BufferedLogs" directive, there will be even more variable buffering delay. When the system is under load, and perhaps in other edge cases, the average delay could easily grow to a second or more.
If the delays aren't too bad (i.e., "BufferedLogs off", low system load), you can probably get a pretty tight estimate by subtracting the "%D" value from your external timestamp.
Some people (including me) pipe Apache's access logs to the local Syslog daemon (via the 'logger' command or whatever). The syslog daemon takes care of timestamping, among other things.
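As a concrete sketch of both workarounds (the timestamper script path and the syslog tag/facility are placeholders, not recommendations):

# Pipe each log line to an external program that prepends its own timestamp
CustomLog "|/usr/local/bin/add-timestamp" common

# Or hand the lines to syslog, which timestamps them for you
CustomLog "|/usr/bin/logger -t httpd -p local6.info" common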
I implemented the express-rate-limit npm module in my Node.js code, and I have also seen the ddos module.
Can anyone with good Node.js expertise tell me whether I need to use the ddos module as well?
I installed the ddos module and it blocks requests once a limit is exceeded; from what I have read, express-rate-limit works in much the same way.
Someone suggested that I use ddos. I told them I already use express-rate-limit, but they said I should use ddos as well.
I am confused now. Please give me proper guidance on this. Any help is really appreciated.
It's fine as a basic shield against DDoS, or for handling external requests to your API methods that can go over the limit.
But if you want to prevent real DDoS attacks, you should look into debouncing and event throttling, and also think about per-machine custom firewall configuration ;)
Dig a bit more into the docs of this module ;)
burst
Burst is the number or amount of allowable burst requests before the client starts being penalized. When the client is penalized, the expiration is increased by twice the previous expiration.
In other words, burst is the base request counter for one unit of time, which is 1 second by default but can be customized.
limit
limit is the number of maximum counts allowed (do not confuse that with maxcount). count increments with each request. If the count exceeds the limit, then the request is denied. Recommended limit is to use a multiple of the number of bursts.
So: each request received is checked against the limit, and once the limit is reached the requester gets a penalty. Seeing a lot of requests (multiple bursts detected) is the real detection of the request limit being exceeded. For example, with burst set to 5 and limit set to 20, once a burst of 5 is detected the counter keeps incrementing, and at 20 requests the client is flagged as fully over the limit.
maxexpiry
maxexpiry is the seconds of maximum amount of expiration time. In order for the user to use whatever service you are providing again, they have to wait through the expiration time.
And that's it. Just dive into testing this stuff;)
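For reference, here is a minimal sketch of wiring both modules into an Express app, following each module's documented usage; the option values are illustrative assumptions, not tuned recommendations:

// Minimal sketch (Node.js/Express); option values are illustrative only.
const express = require('express');
const rateLimit = require('express-rate-limit');
const Ddos = require('ddos');

const app = express();

// express-rate-limit: cap each client at 100 requests per 15-minute window
app.use(rateLimit({ windowMs: 15 * 60 * 1000, max: 100 }));

// ddos: penalize clients that exceed the burst/limit thresholds described above
const ddos = new Ddos({ burst: 5, limit: 20, maxexpiry: 120 });
app.use(ddos.express);

app.get('/', (req, res) => res.send('ok'));
app.listen(3000);

In practice, express-rate-limit alone is usually enough at the application layer; protection against real DDoS traffic is better handled in front of the application, for example with the per-machine firewall configuration mentioned above.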
I am using ModSecurity to look for specific values in POST parameters and block the request if a duplicate comes in. I am using a ModSecurity user collection to do just that. The problem is that my requests are long-running, so a single request can take more than 5 minutes. I assume the user collection does not get written to disk until the first request has been processed, so if another request comes in with the duplicate POST parameter value while the first request is still executing, the second request does not get blocked, since the collection is not available yet. I need to avoid this situation. Can I use memory-based collections shared across requests in ModSecurity? Is there any other way? Snippet below:
SecRule ARGS_NAMES "uploadfilename" "id:400000,phase:2,nolog,setuid:%{ARGS.uploadfilename},initcol:USER=%{ARGS.uploadfilename},setvar:USER.duplicaterequests=+1,expirevar:USER.duplicaterequests=3600"
SecRule USER:duplicaterequests "@gt 1" "id:400001,phase:2,deny,status:409,msg:'Duplicate Request!'"
ErrorDocument 409 "<h1>Duplicate request!</h1><p>Looks like this is a duplicate request, if this is not on purpose, your original request is most likely still being processed. If this is on purpose, you'll need to go back, refresh the page, and re-submit the data."
ModSecurity is really not a good place to put this logic.
As you rightly state there is no guarantee when a collection is written, so even if collections were otherwise reliable (which they are not - see below), you shouldn't use them for absolutes like duplicate checks. They are OK for things like brute force or DoS checks where, for example, stopping after 11 or 12 checks rather than 10 checks isn't that big a deal. However for absolute checks, like stopping duplicates, the lack of certainty here means this is a bad place to do this check. A WAF to me should be an extra layer of defence, rather than be something you depend on to make your application work (or at least stop breaking). To me, if a duplicate request causes a real problem to the transactional integrity of the application, then those checks belong in the application rather than in the WAF.
In addition to this, the disk-based way that collections work in ModSecurity causes lots of problems - especially when multiple processes/threads try to access them at once - which makes them unreliable both for persisting data and for removing persisted data. Many folks on the ModSecurity and OWASP ModSecurity CRS mailing lists have reported errors in the log file when ModSecurity tries to automatically clean up collections, and have seen collection files grow and grow until they start to have detrimental effects on Apache. In general I don't recommend user collections for production usage - especially for web servers with any volume.
There was a memcache version of ModSecurity that was created to move away from the disk-based SDBM format, which might have addressed a lot of the above issues; however, it was never completed, though it may become part of ModSecurity v3. I still maintain, however, that a WAF is not the place for this check.
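To illustrate the point about doing this in the application rather than in the WAF, here is a minimal sketch of an application-level duplicate guard. It assumes a Node.js/Express-style handler purely for illustration (the question doesn't say what the application is written in), and processUpload is a hypothetical stand-in for the long-running work:

// Minimal sketch, single-process only; not the asker's actual stack.
const inFlight = new Set();

async function handleUpload(req, res) {
  const key = req.body.uploadfilename;          // same parameter the SecRules key on
  if (inFlight.has(key)) {
    res.status(409).send('Duplicate request!'); // same status the ErrorDocument uses
    return;
  }
  inFlight.add(key);
  try {
    await processUpload(req);                   // hypothetical long-running work
    res.send('Done');
  } finally {
    inFlight.delete(key);                       // always release, even on error
  }
}

Because the guard lives in the request handler, the second request is rejected immediately, with no dependency on when a collection gets flushed to disk. With multiple worker processes you would need a shared store (a database row or similar) instead of an in-memory Set.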
I'm doing website optimisations using Google's Pagespeed Insights to test improvements. Among the high-priority fix suggestions, is this:
Reduce server response time
In our test, your server responded in 2.1 seconds.
I read the 'helpful' doc linked in this section, and now I'm really confused.
Is the server response time the DNS response, the time to first-byte, or a combination? Is it purely a server-side thing, or could this be affected by, for example, a slow JavaScript resource or ready events in the DOM?
My first guess would have been that it's the time taken from the moment the request was issued to the first byte received from the server; however, Google's definition is not quite that (from this page: https://developers.google.com/speed/docs/insights/Server):
Server response time measures how long it takes to load the necessary HTML to begin rendering the page from your server, subtracting out the network latency between Google and your server. There may be variance from one run to the next, but the differences should not be too large. In fact, highly variable server response time may indicate an underlying performance issue.
Taking 2.1 seconds would suggest to me that your application/webserver is buffering its output, so all your server-side processing happens before it sends any content. If you don't buffer, the HTML can begin being sent to the browser more quickly, which may help; however, you lose the ability to do things like change response headers late in your logic.
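For illustration, here is what "not buffering" can look like in a Node.js handler (the question doesn't say what the server stack is, so this is only a sketch of the idea): the top of the page is flushed before the slow server-side work runs.

// Illustrative only - the original server stack isn't stated.
const http = require('http');

const slowWork = () =>            // stand-in for the ~2 seconds of server-side processing
  new Promise(resolve => setTimeout(() => resolve('<p>real content</p>'), 2000));

http.createServer(async (req, res) => {
  res.writeHead(200, { 'Content-Type': 'text/html' });
  res.write('<!doctype html><html><head><title>Page</title></head><body>'); // sent immediately
  const body = await slowWork();  // the browser already has the head while this runs
  res.end(body + '</body></html>');
}).listen(8080);

The trade-off is exactly the one mentioned above: once the first bytes have been flushed, you can no longer change the status code or response headers.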
I'm a little new to Redis, but I'd like to see if it can be used to keep track of how many concurrent HTTP connections I'm making.
Here's the high level plan:
INCR requests
// request begins
HTTP.get(...)
// request ends
DECR requests
Then at any point, just call GET requests to see how many are currently open.
The ultimate goal here is to throttle my http requests to stay below some arbitrary amount, say 50 requests/s.
Is this the right way to do it? Are there any pitfalls?
As for pitfalls, the only one I can see is that a server that goes down or loses its connection to Redis mid-request may never call DECR.
Since you don't know which server made which request, you can never reset the count to the correct value without bringing the system to a halt and resetting it to 0.
I'm not clear what you'd gain by using redis in this situation. It seems to me it would be more suitable to use just a global variable in your server. If your server goes down, so does your counter, so you don't have to put complicated things in place to deal with disconnection, inconsistencies, etc...
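If you do go the Redis route anyway, here is a minimal sketch of the INCR/DECR pattern in Node.js using the node-redis client, with a try/finally so the decrement isn't skipped when a request fails (the URL and key name are just placeholders):

// Minimal sketch using node-redis v4; assumes a Redis server on localhost.
const { createClient } = require('redis');
const https = require('https');

const redis = createClient();

async function trackedGet(url) {
  await redis.incr('requests');                    // request begins
  try {
    return await new Promise((resolve, reject) => {
      https.get(url, res => {
        res.resume();                              // drain the response body
        res.on('end', () => resolve(res.statusCode));
      }).on('error', reject);
    });
  } finally {
    await redis.decr('requests');                  // request ends, even on failure
  }
}

(async () => {
  await redis.connect();
  console.log('status:', await trackedGet('https://example.com/'));
  console.log('open requests:', await redis.get('requests'));
  await redis.quit();
})();

Note that the try/finally only protects against in-process errors; it does nothing for the case described above where the whole process dies mid-request and the counter is left too high.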
I'm trying to provide live parsing of a file upload in CGI and show the data on screen as it's being uploaded.
However, Apache2 seems to want to wait for the full POST to complete before sending the CGI application anything at all.
How can I force Apache2 to stop buffering the POST to my CGI application?
EDIT
It appears that it's actually the output of the CGI that's being buffered. I started streaming the data to a temp file to watch its progress. That, and I have another problem:
1) The output is being buffered. I've tried SetEnvIf (and simply SetEnv) for "!nogzip", "nogzip", and "!gzip" without success (within the CGI Directory definition).
2) Apache2 appears to not be reading the CGI's output until the CGI process exits? I notice that my CGI app (flushing or not) is hanging up permanently on a "fwrite(..., stdout)" line at around 80K.
EDIT
Okay, Firefox is messing with me. If I send a 150K file, then there's no CGI lockup around 80K. If the file is 2G, then there's a lockup. So, Firefox is not reading the output from the server while it's trying to send the file... is there any header or alternate content type to change that behavior?
EDIT
Okay, I suppose the CGI output lockup on big files isn't important actually. I don't need to echo the file! I'm debugging a problem caused by debugging aids. :)
I guess this works well enough then. Thanks!
FINAL NOTE
Just as a note... the reason I thought Apache2 was buffering the input was that I always got a "Content-Length" environment variable. I guess Firefox is smart enough to precalculate the content length of a multipart form upload, and Apache2 was passing that on. I thought Apache2 was buffering the input and reporting the length itself.
Are you sure it's the input being buffered that's the problem? Output buffering problems are much more common, and might not be distinguishable from input buffering, if your method of debugging is something like just printing to the response.
(Output buffering is commonly caused either by unflushed stdout in the script or by filters. The usual culprit is the DEFLATE filter, which is often used to compress all text/ responses, whether they come from a static file or a script. In general it's a good idea to compress the output of scripts, but a side-effect is that it will cause the response to be fully buffered. If you need immediate response, you'll need to turn it off for that one script or for all scripts, by limiting the application of AddOutputFilterByType to particular <Directory>s, or by using mod_setenvif to set the no-gzip environment variable.)
Similarly, an input filter (including, again DEFLATE) might cause CGI input to be buffered, if you're using any. But they're less widely-used.
Edit: for now, just comment out any httpd conf you have enabling the deflate filter. You can put it back selectively once you're happy that your IO is unbuffered without it.
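As a sketch of the selective approach (the path here is a placeholder for wherever your streaming CGI lives):

# Keep DEFLATE enabled elsewhere, but exempt the streaming CGI so its output isn't buffered
<Location "/cgi-bin/live-parse.cgi">
    SetEnv no-gzip 1
</Location>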
I notice that my CGI app (flushing or not) is hanging up permanently on a "fwrite(..., stdout)" line at around 80K.
Yeah... if you haven't read all your input, you can deadlock when trying to write output, if you write too much. You can block on an output call, waiting for the network buffers to unclog so you can send the new data you've got, but they never will because the browser is trying to send all its data before it will start to read the output.
What are you working on here? In general it doesn't make sense to write progress-info output in response to a direct form POST, because browsers typically won't display it. If you want to provide upload-progress feedback on a plain HTML form submission, this is usually done with hacks like having an AJAX connection check back to see how the upload is going (meaning progress information has to be shared, e.g. in a database), or using a Flash upload component.
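For what it's worth, the AJAX-style check mentioned above usually looks something like the client-side sketch below; the /upload-progress endpoint and its JSON shape are hypothetical, and the server still has to record and expose the progress itself:

// Hypothetical client-side polling sketch; endpoint and response shape are assumptions.
function pollProgress(uploadId) {
  const timer = setInterval(async () => {
    const res = await fetch('/upload-progress?id=' + encodeURIComponent(uploadId));
    const { bytesReceived, bytesTotal } = await res.json();
    document.getElementById('progress').textContent =
      Math.round((bytesReceived / bytesTotal) * 100) + '%';
    if (bytesReceived >= bytesTotal) clearInterval(timer);  // stop once the upload completes
  }, 1000);
}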
From an (old) version of the Apache HTTP Server manual:
Every time your script does a "flush" to output data, that data gets relayed on to the client. Some scripting languages, for example Perl, have their own buffering for output - this can be disabled by setting the $| special variable to 1. Of course this does increase the overall number of packets being transmitted, which can result in a sense of slowness for the end user.
Have you tried flushing STDOUT or checking if the language you are using has buffering you can disable?
Here's a useful guide for controlling buffering when using Perl on the server side:
http://perl.plover.com/FAQs/Buffering.html
Many of the ideas and concepts apply to other languages too, such as using buffered vs. unbuffered output, and raw system calls to read and write data vs. I/O libraries which do their own buffering.