I have been trying to debug this for weeks. All of the browsers on all of the clients on my home network are sending 'Accept-Encoding: gzip,deflate'. However, that header is somehow, somewhere being dropped before the request makes it to a web server. For example, http://www.whatsmyip.org/http_compression/ says 'No, your browser is not requesting compressed content'.
I've used Fiddler to make sure that all of my browsers are indeed sending the header. I've swapped out my router. I've turned off all anti-virus software.
Brighthouse/Roadrunner (the local cable ISP) says they are not doing any filtering (and I can't see why they would in this case).
Any suggestions would be most welcome!
Try it with HTTPS.
If you are browsing a site via HTTPS, nothing between your browser and the web server can alter any HTTP-level aspect of the the request or response, including whether compression is enabled, without you having immediate and clear knowledge of that fact (check the site's certificate in your browser address bar and see if it's legit).
I had the Accept-Xncoding issue and determined it was CA Internet Security Suite causing the issue. Disabling wan't enought, you had to uninstall and then clear IE Cache.
Check your antivirus software. It's probably intercepting your outbound traffic and modifying headers on the fly in order to get uncompressed content. Lazy programmers don't like to include decompression methods themselves, or deal with chunked encoding.
Norton Internet Security will overwrite accept encoding with this line:
---------------: ----- -------
McAfee overwrites with this:
X-McProxyFilter: *************
something I haven't identified yet overwrites with this:
Accept-Xncoding: gzip, deflate
You are probably in the same boat. I read Zone Alarm wipes out the encoding header entirely (which means recalculating the size of the packet, but why should they care how much load they introduce on your system?). If you're running Zone Alarm, turn off the 'internet privacy option' or whatever it is, and try again.
Everytime I've seen this problem, it has been the result of shitty antivirus. Completely disabling someone's ability to receive compressed content without letting them know is dirty.
Related
Dig, wget, nslookup and curl commands work perfectly for a specific URL I have pointed to another server less than 24 hours ago.
Problem is, it just refuses to be resolved by the browser (Chrome, Safari and Firefox). The strangest part is that it is being successfully resolved by Postman (by testing the OPTIONS and the GET methods separately), but still doesn't return a proper response on the browser side of things.
DNS checks are returning positive, so this is when I started suspecting that the problem is actually within the headers of the HTTP protocol's requests which are sent - alongside the fact that different responses are being returned for the requests that don't include the default browser headers (being issued through the different command-line tools & Postman) and the ones who do (being issued by the browsers automatically or manually using the dev tools).
After fully flushing the current local system's DNS cache, including the browsers's and even trying another device on another network - I still get still no response on the browser.
Kept going, and attempted to verify that with a VPN (locally - which didn't work), and an online web proxy tool (which did work).
Finally, I extracted the router's default DNS server address, used nslookup to look up the URL again, this time specifically mentioning the desired DNS server (the one stated above), and after getting a successful response with the correct values, I am now pretty much sure the HTTP request is causing the problem.
The URL is hosted on Amazon S3 Static Hosting option, which I used many times before, and didn't have a problem with, with that exact same configuration. Looking up the recent changes/features that were possibly added, pointed out that I may need to explicitly set a CORS policy for the newly created bucket, on top of the usual public access policy that is needed.
After applying that as-well - it still doesn't seem to work.
As a quick change in direction that may possibly make some parts clearer about what's going on (and as I started to think that the browser might not be getting the correct Content-Type header in the response, which should be text/html header as its response, and therefore, possibly doesn't resolve the URL with the expected behavior), I went ahead and applied a 301 redirection on the S3 bucket, instead of the static files hosting, and again, it all works perfectly through the command line tools, but not through the browsers.
Anyway, the browser just doesn't seem to complete any of the requests being sent to the URL.
That might be the OPTIONS pre-flight request failing to respond correctly, and the browser just doesn't continue to issuing the GET request, or the URL is not being found by the DNS route the browser is taking, which is unclear to me currently if that is the option.
Any ideas? (besides the fact that sometimes it just takes longer time for some DNS servers that happen to be on the chosen route to update/refresh their cache, which doesn't appear to be affecting my local machine's DNS route specifically for this case. That, being said with caution, was verified by validating the different parts of DNS configuration and prioritization throughout the different possible parts on my system (Mac OS X), including the fact that the response gets back with the correct address successfully).
Found my answer here:
https://serverfault.com/questions/942030/aws-s3-static-hosting-how-to-debug-connection-timeout
As linked there, more details can be found here:
Non-Authoritative-Reason header field [HTTP]
Solution & Explanation: Because of the nature of the domain extension I have purchased (.dev extension) Chrome was silently using HTTPS because of the URL being part of Chrome's HTTP Strict Transport Security (HSTS), because all .dev domains should be using HTTPS only. Therefore, the issue was still showing up, even when explicitly typing http:// into the URL address bar.
This can be overridden by applying a CloudFront distribution with HTTPS support on top of the S3 Static Hosting, as usual (but still, it should be noted as HSTS listings can cause that for different cases, including this one as part of them, because of the .dev domain extension).
Useful Resources (for debugging purposes)
In addition to what is stated here:
https://gist.github.com/stollcri/7c09bafc97223481920e
You can issue a lookup query (and also add or delete your local set of HSTS listings) through the following Chrome's settings URL:
You can also check the current listings here: https://hstspreload.org/
I have a question. Can I find out the version of Apache when full signature is disabled. Is it even possible? If it is, how? I think that is possible because blackhats hacking big, corporate servers while knowledge of the version of the victim services is essential. What do you think? Thanks.
Well for a start there are two (or even three) things to hide:
ServerHeader - which shows the version in the Server response field. This cannot be turned of in Apache config but can be reduced to just "Apache".
ServerSignature - which displays the server version in the footer of error pages.
X-Powered-By which is not used by Apache but used by back end servers and services it might send requests to (e.g. PHP, J2EE servers... etc.).
Now servers do show some information due to differences in how the operate or how they interpret spec. For example the order of response headers, capitalisation, how they respond to certain requests all give clues to what server software might be being used to answer http requests. However using this to fingerprint a specific version of that software is more tricky unless there was an obvious, observable change from the client side.
Other options include looking at server status-page - though you would hope any administrator clever enough to reduce the default server header would also restrict access to the server-status page. Or through another security hole (e.g. being able to upload load executable scripts or the like).
I would guess most hackers would more be aware of bugs and exploits in some versions of Apache or other web servers and try to see if any of those can be exploited rather than trying to guess the specific version first.
In fact, as an interesting aside, Apache themselves have long been of the opinion that hiding server header information is pointless "security through obscurity" (a point I, and many others, disagree with them on) even putting this in their documention:
Setting ServerTokens to less than minimal is not recommended because
it makes it more difficult to debug interoperational problems. Also
note that disabling the Server: header does nothing at all to make
your server more secure. The idea of "security through obscurity" is a
myth and leads to a false sense of safety.
And even allowing open access to their server-status page.
Edit: I found out my problem. I was using the network inspector wrong (in both Chrome and FF) -- basically I was clicking "refresh"
and watching the network inspector, but it would re-download
everything. What you need to do is go to the URL, then open the network inspector, then go to the URL
again ( Don't "refresh", just re-access the URL a second time). The
network inspector will notify you of which resources were pulled from
cache :)
Original question below:
I am trying to set the image cache settings in Apache. I have the following in .htaccess for 1 week image caching:
FileETag MTime Size
ExpiresActive On
<FilesMatch "\.(gif|jpg|jpeg|png)$">
ExpiresDefault A604800
</FilesMatch>
This looks correct when I check the network tab of the Firefox developer console, but I don't understand why the Request Header says "no-cache"
Note: I removed the lines that do not matter for this question.
I am also serving some images dynamically with PHP. I have caching for those images set for 2 days, but again, the response header says "no-cache". Is this anything to worry about? The images do not appear to be cached when I refresh Firefox. They look like they are being redownloaded:
Any help understanding these headers would be appreciated. If there is an easy way to determine if images are being pulled from cache or not, I'm not seeing it.
The Pragma and Cache-Control request headers mean the same thing, one's from HTTP 1.0 and the other is from 1.1. It's used to tell the server, or a proxy that does caching, that it wants fresh versions of the resource. It's NOT for telling the server that the browser won't cache, or that the browser won't be honoring the cache control the server responds with.
Ultimately, the server can tell a user agent "Here's the resource, cache it for 1 week", but it's still up to the user agent (e.g. browser) to honor that. It could always request the uncached version of the resoure every time instead of not sending the request and loading the locally cached copy.
I'm evaluating the front end performance of a secure (SSL) web app here at work and I'm wondering if it's possible to compress text files (html/css/javascript) over SSL. I've done some googling around but haven't found anything specifically related to SSL. If it's possible, is it even worth the extra CPU cycles since responses are also being encrypted? Would compressing responses hurt performance?
Also, I'm wanting to make sure we're keeping the SSL connection alive so we're not making SSL handshakes over and over. I'm not seeing Connection: Keep-Alive in the response headers. I do see Keep-Alive: 115 in the request headers but that's only keeping the connection alive for 115 milliseconds (seems like the app server is closing the connection after a single request is processed?) Wouldn't you want the server to be setting that response header for as long as the session inactivity timeout is?
I understand browsers don't cache SSL content to disk so we're serving the same files over and over and over on subsequent visits even though nothing has changed. The main optimization recommendations are reducing the number of http requests, minification, moving scripts to bottom, image optimization, possible domain sharding (though need to weigh the cost of another SSL handshake), things of that nature.
Yes, compression can be used over SSL; it takes place before the data is encrypted so can help over slow links. It should be noted that this is a bad idea: this also opens a vulnerability.
After the initial handshake, SSL is less of an overhead than many people think* - even if the client reconnects, there's a mechanism to continue existing sessions without renegotiating keys, resulting in less CPU usage and fewer round-trips.
Load balancers can screw with the continuation mechanism, though: if requests alternate between servers then more full handshakes are required, which can have a noticeable impact (~few hundred ms per request). Configure your load balancer to forward all requests from the same IP to the same app server.
Which app server are you using? If it can't be configured to use keep-alive, compress files and so on then consider putting it behind a reverse proxy that can (and while you're at it, relax the cache headers sent with static content - HttpWatchSupport's linked article has some useful hints on that front).
(*SSL hardware vendors will say things like "up to 5 times more CPU" but some chaps from Google reported that when Gmail went to SSL by default, it only accounted for ~1% CPU load)
You should probably never use TLS compression. Some user agents (at least Chrome) will disable it anyways.
You can selectively use HTTP compression
You can always minify
Let's talk about caching too
I am going to assume you are using an HTTPS Everywhere style web site.
Scenario:
Static content like css or js:
Use HTTP compression
Use minification
Long cache period (like a year)
etag is only marginally useful (due to long cache)
Include some sort of version number in the URL in your HTML pointing to this asset so you can cache-bust
HTML content with ZERO sensitive info (like an About Us page):
Use HTTP compression
Use HTML minification
Use a short cache period
Use etag
HTML content with ANY sensitive info (like a CSRF token or bank account number):
NO HTTP compression
Use HTML minification
Cache-Control: no-store, must-revalidate
etag is pointless here (due to revalidation)
some logic to redirect the page after session timeout (taking into account multiple tabs). If someone presses the browser's Back button, the sensitive info is not displayed due to the cache header.
You can use HTTP compression with sensitive data IF:
You never return user input in the response (got a search box? don't use HTTP compression)
Or you do return user input in the response but randomly pad the response
Using compression with SSL opens you up to vulnerabilities like BREACH, CRIME, or other chosen plain-text attacks.
You should disable compression as SSL/TLS have no way to currently mitigate against these length oracle attacks.
To your first question: SSL is working on a different layer than compression. In a sense these two are features of a web server that can work together and not overlap. Yes, by enabling compression you'll use more CPU on your server but have less of outgoing traffic. So it's more of a tradeoff.
To your second question: Keep-Alive behavior is really dependent on HTTP version. You could move your static content to a non-ssl server (may include images, movies, audio, etc)
Using Apache with mod_rewrite, when I load a .css or .js file and view the HTTP headers, the Content-type is only set correctly the first time I load it - subsequent refreshes are missing Content-type altogether and it's creating some problems for me.
I can get around this by appending a random query string value to the end of each filename, eg. http://www.site.com/script.js?12345
However, I don't want to have to do that, since caching is good and all I want is for the Content-type to be present. I've tried using a RewriteRule to force the type but still didn't solve the problem. Any ideas?
Thanks, Brian
The answer depends on information you've not provided here, specifically where are you seeing these headers?
Unless it's from sniffing the network traffic between the browser and client, then you can't be sure if you are looking at a real request to the server or a request which has been satisfied from the cache. Indeed changing the URL as you describe is a very simple way to force a reload from the server rather than a load from the cache.
I don't think its as broken as you seem to. Fire up Wireshark and see for yourself - or just disable caching for these content types.
C.