Serving statically and dynamically compressed content with Apache

I have a website which has both dynamically generated (PHP) and static content. Setting up Apache to transparently compress everything in accordance with content negotiation is a trifle.
However, I am interested in not compressing static content that rarely, if ever, changes, but instead serving precompressed data in an "asis" manner.
The idea behind this is to reduce latency and save CPU power, and at the same time compress better. Basically, instead of compressing the same data over and over again, I would like the server to sendfile the contents without touching it, but with proper headers. And, ideally, it would work seamlessly with .html and .html.gz files, using transparent compression in one case and none in the other.
There is mod_asis, but this will not provide proper headers (most importantly the ones affecting cache and proxy operation), and it is agnostic of content negotiation. Adding a content-encoding for .gz seems to be the right thing, but it does nothing: the `.html.gz` pages appear as downloads (maybe this interferes with some default type map?).
It seems that the gatling webserver does just what I want in this respect, but I'd really prefer to stay with Apache: whatever one might blame Apache for, it's the one mainstream server that has been working reasonably well for many years.
Another workaround would be to serve the static content with another server on a different port or subdomain, but I'd prefer if it just worked "invisibly", and if the system was not made more complex than necessary.
Is there a well-known configuration idiom that makes Apache behave in the way indicated?
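For concreteness, the kind of idiom I have in mind would look roughly like this (an untested sketch; it assumes mod_rewrite, mod_headers and mod_deflate are loaded and that each foo.html.gz sits next to its foo.html):

# Untested sketch: when the client accepts gzip and a precompressed
# foo.html.gz exists next to foo.html, rewrite the request to it.
RewriteEngine On
RewriteCond %{HTTP:Accept-Encoding} gzip
RewriteCond %{REQUEST_FILENAME}.gz -f
RewriteRule ^(.+)\.html$ $1.html.gz [QSA]

# Deliver the variant as HTML with the right encoding headers, and keep
# mod_deflate from compressing it a second time.
RewriteRule \.html\.gz$ - [T=text/html,E=no-gzip:1]
<FilesMatch "\.html\.gz$">
    Header set Content-Encoding gzip
    Header append Vary Accept-Encoding
</FilesMatch>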

Related

Benefits of mod_pagespeed over mod_deflate

The SO post here explains what mod_pagespeed does, but I'm wondering if I would notice any significant difference in page load time with this installed on a server that is already using mod_deflate to compress files.
If it is worth installing, are there any special considerations to take into account with regard to configuration when running both modules, or should one replace the other? The server is running EasyApache4.
Yes, you will, because these modules do different things.
mod_deflate handles data compression
The mod_deflate module provides the DEFLATE output filter that allows output from your server to be compressed before being sent to the client over the network.
Simply put, its sole purpose is to reduce the number of bytes sent from your server, regardless of what kind of data is sent
mod_pagespeed performs optimizations that speed up the resulting page from the end user's perspective by applying a number of web-page optimization best practices
Here's a simple example:
imagine we have 1 HTML page and 1 small external JavaScript file
if we use mod_deflate, both of them will be gzipped, BUT the browser still needs to make 2 HTTP requests to fetch them
mod_pagespeed may decide it's worth inlining the contents of this JS file into the .html page
if we use mod_deflate together with mod_pagespeed in this case, the resulting number of bytes downloaded would be the same, BUT the page would render faster as it needs to make only 1 HTTP request
Such optimizations of the original .html page and its dependent resources can make a huge difference in load time, especially on slow mobile networks
So the idea is to always enable mod_deflate, and either apply these best practices manually or use mod_pagespeed, which applies them automatically
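Configuration-wise the two don't really interact, so running both is mostly a matter of enabling each one; a minimal sketch (directive names as in the Apache and mod_pagespeed docs; EasyApache4 may manage these through its own include files):

# mod_deflate: compress text responses on the way out.
AddOutputFilterByType DEFLATE text/html text/plain text/css application/javascript

# mod_pagespeed: rewrite pages and resources before they are served,
# e.g. inlining small CSS/JS files as in the example above.
ModPagespeed on
ModPagespeedEnableFilters inline_css,inline_javascript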

How can I use gzip with SSL, or any alternatives?

Google now treats HTTP as insecure (check here), and Chrome shows a warning if we access an HTTP site. We also now have free SSL from Let's Encrypt, so I assume nearly every server will use HTTPS.
Then I found that using gzip with SSL has a security issue, called the BREACH attack. I really wonder, then: how can we achieve the purpose of gzip while using SSL?
This matters especially with Angular: the built output is quite large. For now I have a main file (the Angular application code), a styles file (CSS/SCSS/whatever, bundled with Webpack), and a scripts file (external JavaScript files). For my application it looks like this (Angular 2.3.1, AoT, production build):
main.js: 739K
main.js.gz: 151K
styles.js: 394K
styles.js.gz: 100K
scripts.js: 1.8M
scripts.js.gz: 415K
For the main and styles files, it seems okay without gzip. But the scripts file is really big uncompressed: 1.8 megabytes would definitely be heavy for mobile.
But my application uses WebRTC, which requires HTTPS, so I'm kind of stuck. Is there any good solution?
The BREACH attack is only a problem for content which contains secrets the attacker would like to guess (like CSRF tokens) and in which attacker-controlled data are also reflected. Static JavaScript files and other static files don't have this property, so they can safely be compressed. See also Is gzipping content via TLS allowed? or Current State of BREACH (GZIP SSL Attack)?
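In Apache terms, that usually means keeping DEFLATE on for static asset types while switching it off for dynamic responses that might reflect attacker-controlled input; a rough sketch (the /account path is only a placeholder):

# Compress static asset types only.
AddOutputFilterByType DEFLATE text/css application/javascript image/svg+xml

# Leave responses that may echo user input uncompressed; mod_deflate
# honours the no-gzip environment variable.
<Location "/account">
    SetEnv no-gzip 1
</Location>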

Serving dynamic zip files through Apache

One of the responsibilities of my Rails application is to create and serve signed XMLs. Any signed XML, once created, never changes, so I store every XML in the public folder and redirect the client appropriately to avoid unnecessary processing in the controller.
Now I want a new feature: every XML is associated with a date, and I'd like to implement the ability to serve a compressed file containing every XML whose date lies in a period specified by the client. However, the period cannot be limited to less than one month for the feature to be useful, and this implies some of the zip files served will be as big as 50 MB.
My application is deployed as a Passenger module of Apache. Thus, it's totally unacceptable to serve the file with send_data, since the client would have to wait for the entire compressed file to be generated before the actual download begins. Although I have an idea of how to implement the feature in Rails so that the compressed file is produced while being served, I worry my server will run short of resources once some lengthy Ruby/Passenger processes are tied up serving big zip files.
I've read about a better solution to serve static files through Apache, but not dynamic ones.
So, what's the solution to the problem? Do I need something like a custom Apache handler? How do I inform Apache, from my application, how to handle the request, compressing the files and streaming the result simultaneously?
Check out my mod_zip module for Nginx:
http://wiki.nginx.org/NgxZip
You can have a backend script tell Nginx which URL locations to include in the archive, and Nginx will dynamically stream a ZIP file to the client containing those files. The module leverages Nginx's single-threaded proxy code and is extremely lightweight.
The module was first released in 2008 and is fairly mature at this point. From your description I think it will suit your needs.
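Roughly, the backend response sets an X-Archive-Files: zip header and lists one file per line as CRC-32 (or "-" if unknown), size in bytes, source URL, and the name to use inside the archive; see the mod_zip README for the exact manifest format. The paths below are just placeholders:

X-Archive-Files: zip

- 52123 /xmls/2011-01/invoice-0001.xml 2011-01/invoice-0001.xml
- 48770 /xmls/2011-01/invoice-0002.xml 2011-01/invoice-0002.xml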
You simply need to use whatever API you have available to create a zip file and write it to the response, flushing the output periodically. If this is serving large zip files, or will be requested frequently, consider running it in a separate process with a high nice/ionice value (low priority).
Worst case, you could run a command-line zip in a low priority process and pass the output along periodically.
It's tricky to do, but I've made a gem called zipline ( http://github.com/fringd/zipline ) that gets things working for me. I want to update it so that it can support plain file handles or paths; right now it assumes you're using CarrierWave...
Also, you probably can't stream the response with Passenger... I had to use Unicorn to make streaming work properly... and certain Rack middleware can even screw that up (calling response.to_s breaks it).
If anybody still needs this, bother me on the GitHub page.

HTTP compression - How to send precompressed files that exist in an EAR file?

Is it possible to send pre-compressed files that are contained within an EAR file? More specifically, the JSP and JS files within the WAR file. I am using Apache HTTP Server as the web server, and although it is simple to turn on the deflate module and set it up to use a pre-compressed version of the files, I would like to apply this to files that are contained within an EAR file deployed to JBoss. The reason is that the content is quite static, and compressing it on the fly each time is quite costly in terms of CPU time.
Quite frankly, I am not entirely familiar with how JBoss deploys these EAR files and 'serves' them. The gist of what I want to do is pre-compress the files contained inside the WAR so that when they are requested, they are sent back to the client with gzip as the Content-Encoding.
In theory, you could compress them before packaging them in the EAR, and then serve them up with a custom controller which adds the HTTP header to the response telling the client they're compressed, but that seems like a lot of effort to go to.
When you say that on-the-fly compression is quite costly, have you actually measured it? Have you tried requesting a large number of uncompressed pages, measured the CPU usage, and then tried it again with compressed pages? I think you may be over-estimating the impact. mod_deflate uses quite low-intensity stream compression, designed to use little CPU.
You need to be very sure that you have a real performance problem before going to such lengths to mitigate it.
I don't frequent this site often and I seem to have left this thread hanging. Sorry about that. I did succeed in getting compression for my JavaScript and CSS files. What I did was precompress them in the Ant build process using gzip. I then had to spoof the name to get rid of the gzip extension: I had foo.js, compressed it into foo.js.gzip, and renamed foo.js.gzip back to foo.js, and that is the file that gets packaged into the WAR file. So that handles the precompression part. To get this file served up properly, we just have to tell the browser that the file is compressed, via the Content-Encoding header of the HTTP response. This was done via an output filter applied to files matching the *.js extension (some Java/JBoss configuration in WEB-INF/web.xml, if it helps; I'm not too familiar with this, so sorry guys).
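In case it helps, the web.xml part was just an ordinary filter mapping along these lines (the filter class name here is made up; ours simply sets the Content-Encoding: gzip header on the response):

<!-- WEB-INF/web.xml: mark the precompressed .js files as gzip-encoded -->
<filter>
    <filter-name>GzipHeaderFilter</filter-name>
    <filter-class>com.example.GzipHeaderFilter</filter-class>
</filter>
<filter-mapping>
    <filter-name>GzipHeaderFilter</filter-name>
    <url-pattern>*.js</url-pattern>
</filter-mapping>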

How do I configure Apache - which does not have mod_expires or mod_headers - to send expiry headers?

The web server hosting my website is not returning Last-Modified or Expires headers. I would like to rectify this to ensure my web content is cacheable.
I don't have access to the Apache config files because the site is hosted in a shared environment that I have no control over. I can, however, make configurations via an .htaccess file. The server - Apache 1.3 - is not configured with mod_expires or mod_headers, and the company will not install these for me.
With these limitations in mind, what are my options?
Sorry for the post here. I recognise this question is not strictly a programming question and is more of a sysadmin question. When Server Fault is public, I'll make sure I direct questions of this nature there.
What sort of content? If it's static (HTML, images, CSS), then really the only way to attach headers is via the front-end web server. I'm surprised the hosting company doesn't have mod_headers enabled, although they might not enable it for .htaccess. It's costing them more bandwidth and CPU (i.e., money) not to cache.
If it's dynamic content, then you'll have control when generating the page. This will depend on your language; here's an example for PHP (it's from the PHP manual, and is a bad example, as it should also set the response code):
if (!headers_sent()) {
    header('Location: http://www.example.com/');
    exit;
}
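For the caching headers themselves, something along these lines would do it (a sketch; the four-hour lifetime is just an example):

// Emit caching headers from PHP, since the server config can't.
$lifetime = 4 * 3600; // 4 hours
header('Cache-Control: public, max-age=' . $lifetime);
header('Expires: ' . gmdate('D, d M Y H:i:s', time() + $lifetime) . ' GMT');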
Oh, and one thing about setting caching headers: don't set them for too long a duration, particularly for CSS and scripts. You may not think you want to change these, but you don't want a broken site while people still have the old content in their browsers. I would recommend maximum cache settings in the 4-8 hour range: good for a single user's session, or a work day, but not much more.