How do I add a Vary: Accept-Encoding header to the files of a static website hosted by Amazon S3?
This is the only thing keeping me from getting a 100/100 score from Google PageSpeed, I'd love to get this solved!
It's not possible to set the Vary: Accept-Encoding header for S3 objects.
Just to add some perspective, the reason for this is that S3 doesn't currently support on-the-fly compression, so there is nothing for the response to vary on. If in the future Amazon does add automatic compression, then that header would presumably be set automatically.
With a static site, you're limited to either:
Serving uncompressed assets and having full support, but a slower site/more bandwidth.
Serving compressed assets by compressing them manually, but making the site look like garbage to any browser that doesn't support gzip (there are very few of them now). Note that the extension would still be .html (you don't want to set it to .gz because that implies an archive) but its content would be gzipped.
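As a rough illustration of the second option, here is a minimal Python sketch that gzips an asset and builds the upload parameters while keeping the original .html key. The helper name prepare_gzipped_object, the bucket name, and the use of boto3 for the actual upload are assumptions for illustration, not something from the answer above.

```python
import gzip
import mimetypes


def prepare_gzipped_object(key: str, raw: bytes) -> dict:
    """Build upload parameters for a pre-compressed asset.

    The key keeps its normal extension (e.g. index.html); only the body
    is gzipped, and Content-Encoding tells the browser to decompress it.
    """
    content_type, _ = mimetypes.guess_type(key)
    return {
        "Key": key,
        "Body": gzip.compress(raw),
        "ContentEncoding": "gzip",
        "ContentType": content_type or "application/octet-stream",
    }


params = prepare_gzipped_object("index.html", b"<html>...</html>")

# Assumption: with boto3 installed and credentials configured, the upload
# could then look like this ("my-bucket" is a placeholder):
#   boto3.client("s3").put_object(Bucket="my-bucket", **params)
```

Every client then receives the gzipped body, which is exactly the trade-off described above.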
Related
I have a Cloudfront distribution which has a single S3 origin serving static files. These files all have cache-control: public,max-age=31536000 (one year) as metadata, but when I view the distribution statistics, I see a consistent 60% Miss rate.
I can see that the objects with the lowest Hit rates (usually around 50%) are all from a specific folder, which my Django app uploads thumbnails to. These files still have the same headers, though, so I can't figure out why they're different.
For example, when I load this file (S3 origin, Cloudfront link) in my browser, I see the headers age: 1169380 and x-cache: Hit from cloudfront. But if I curl the same URL, I see x-cache: Miss from cloudfront and no age header. If I curl again, the age begins incrementing from 0 (and I see a Hit).
This feels wrong to me – the cache policy I'm using is a Cloudfront default (Managed-CachingOptimized) which doesn't forward any headers or querystrings, so why does my curl command trigger a call to origin when I just loaded the same file via my browser, and got a cached response?
It's possible I've misunderstood how Cloudfront is supposed to cache these files so would appreciate any pointers.
(If it helps, this page will give a bunch of URLs which show the issue, e.g. any image under https://static.scenepointblank.com/thumbs/*)
I have a cloudfront distribution that serves files from AWS S3.
Check the following files:
http://s3.amazonaws.com/kekanto_static/css/site.v70.css
http://s3.amazonaws.com/kekanto_static/css/site.v71.css
If you take a look at the headers, you'll realize that BOTH files contain the entry:
Content-Encoding:gzip
Both are properly gzipped in the S3 bucket, as they should be.
But when served from Cloudfront, the Content-Encoding header is missing:
http://img.kekanto.com/css/site.v70.css
http://img.kekanto.com/css/site.v71.css
Any idea why this is happening?
I also checked the Cloudfront endpoints:
http://d1l2emnu9r9dtc.cloudfront.net/css/site.v70.css
http://d1l2emnu9r9dtc.cloudfront.net/css/site.v71.css
And the problem remains the same.
EDIT:
Looks like after an invalidation everything is working properly again, so you will be unable to test the URIs I've given.
All I can think of is that the S3 bucket makes the file available first and the headers only become available a while later, so sometimes Cloudfront caches the file without them.
How can I prevent this, other than making my upload script sleep for a while before letting my web servers know there's a new version?
I'm looking for an alternative to Amazon S3 for hosting static sites, but that allows GZIP compression depending on the Accept-Encoding header.
In other words, I'm looking for something that will return a different version of a file depending on the client's Accept-Encoding header. I don't mind uploading compressed and uncompressed files myself (it can be automated easily).
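To make the requirement concrete, here is a minimal Python sketch of the kind of negotiation being asked for. It assumes the compressed variant is stored next to the original with a ".gz" suffix; that naming convention and the function names are hypothetical, and the Accept-Encoding parsing is deliberately simplified.

```python
def accepts_gzip(accept_encoding: str) -> bool:
    """Return True if the Accept-Encoding value allows gzip (q > 0)."""
    weights = {}
    for part in accept_encoding.split(","):
        token, _, param = part.strip().partition(";")
        token = token.strip().lower()
        if not token:
            continue
        q = 1.0
        param = param.strip().lower()
        if param.startswith("q="):
            try:
                q = float(param[2:])
            except ValueError:
                q = 0.0
        weights[token] = q
    # An explicit gzip entry wins over the "*" wildcard.
    if "gzip" in weights:
        return weights["gzip"] > 0
    return weights.get("*", 0.0) > 0


def choose_variant(path: str, accept_encoding: str):
    """Pick the stored file to serve and the response headers to send."""
    # Vary is sent on both variants so caches keep them apart.
    headers = {"Vary": "Accept-Encoding"}
    if accepts_gzip(accept_encoding):
        headers["Content-Encoding"] = "gzip"
        return path + ".gz", headers  # hypothetical naming convention
    return path, headers
```

For example, choose_variant("site.css", "gzip, deflate") selects "site.css.gz" with Content-Encoding: gzip, while an empty Accept-Encoding value selects the uncompressed "site.css".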
Besides that, the service needs to be capable of hosting websites (allows an index page and 404 page to be set). I don't need CNAME capabilities since I'll be using a CDN. I'd also like the capability to set caching headers (Expires, Last-Modified).
FYI, my startup, Site44, does static hosting by syncing with Dropbox. We support all of this except the customized caching headers, but we'd be open to that feature request. (Today we set the Cache-Control header using a heuristic based on the last modified date, with a maximum value of 24 hours.)
Rackspace Cloudfiles supports everything mentioned.
I have added the "Content-Encoding: gzip" header to my S3 files, and now when I try to access them I get "Error 330 (net::ERR_CONTENT_DECODING_FAILED)".
Note that my files are simply images, js and css.
How do I solve that issue?
You're going to have to manually gzip them and then upload them to S3. S3 doesn't have the ability to gzip on the fly like your web server does.
EDIT: Images are already compressed so don't gzip them.
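The decoding error comes from declaring gzip on a body that isn't actually gzipped. A minimal Python sketch of a pre-upload check follows: it leaves images alone and only marks a body as gzip once it really is. The extension list and function names are illustrative assumptions, not a complete policy.

```python
import gzip
from pathlib import PurePosixPath

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream
TEXT_EXTENSIONS = {".js", ".css", ".html", ".svg", ".json"}  # illustrative


def is_gzipped(data: bytes) -> bool:
    """Check the gzip magic number at the start of the payload."""
    return data[:2] == GZIP_MAGIC


def body_and_encoding(key: str, data: bytes):
    """Return (body, content_encoding) to use when uploading `key`.

    Images and other non-text files are passed through with no
    Content-Encoding; text assets are gzipped unless they already are.
    """
    ext = PurePosixPath(key).suffix.lower()
    if ext not in TEXT_EXTENSIONS:
        return data, None
    if is_gzipped(data):
        return data, "gzip"
    return gzip.compress(data), "gzip"
```

Only set the Content-Encoding: gzip metadata on an object when the second value returned is "gzip".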
I don't know if you are using Grunt as your deployment tool, but if so, use this to compress your files:
https://github.com/gruntjs/grunt-contrib-compress
Then:
https://github.com/MathieuLoutre/grunt-aws-s3
To upload compressed files to Amazon S3. Et voila!
Is it possible to send pre-compressed files that are contained within an EAR file? More specifically, the jsp and js files within the WAR file. I am using Apache HTTP as the web server, and although it is simple to turn on the deflate module and set it up to use a pre-compressed version of the files, I would like to apply this to files that are contained within an EAR file that is deployed to JBoss. The reason is that the content is quite static, and compressing it on the fly each time is quite costly in terms of CPU time.
Quite frankly, I am not entirely familiar with how JBoss deploys these EAR files and 'serves' them. The gist of what I want to do is pre-compress the files contained inside the war so that when they are requested they are sent back to the client with gzip for Content-Encoding.
In theory, you could compress them before packaging them in the EAR, and then serve them up with a custom controller which adds the HTTP header to the response that tells the client they're compressed, but that seems like a lot of effort to go to.
When you say that on-the-fly compression is quite costly, have you actually measured it? Have you tried requesting a large number of uncompressed pages, measured the CPU usage, then tried it again with compressed pages? I think you may be over-estimating the impact. It uses quite low-intensity stream compression, designed to use little CPU.
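As a first rough estimate before load-testing the real server, the cost of gzipping a representative payload can be timed in isolation. This Python sketch is only a proxy: it does not measure Apache or JBoss themselves, and the compression level (6, zlib's usual default) and the sample payload are assumptions.

```python
import gzip
import time


def measure_gzip(payload: bytes, level: int = 6, rounds: int = 50):
    """Return (average seconds per compression, compressed/original size ratio).

    `rounds` must be at least 1.
    """
    start = time.perf_counter()
    for _ in range(rounds):
        compressed = gzip.compress(payload, compresslevel=level)
    elapsed = time.perf_counter() - start
    return elapsed / rounds, len(compressed) / len(payload)


# Hypothetical sample: replace with the bytes of a real page or script.
sample = b"function foo() { return 1; }\n" * 1000
seconds, ratio = measure_gzip(sample)
print(f"{seconds * 1000:.3f} ms per response, ratio {ratio:.2f}")
```

Multiplying the per-response time by the expected request rate gives a ballpark for the CPU time spent on compression, which can then be compared against actual server measurements.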
You need to be very sure that you have a real performance problem before going to such lengths to mitigate it.
I don't frequent this site often and I seem to have left this thread hanging. Sorry about that. I did succeed in getting compression working for my JavaScript and CSS files. What I did was precompress them in the Ant build process using gzip. I then had to spoof the name to get rid of the gzip extension. So I had foo.js and compressed it into foo.js.gzip. I renamed this foo.js.gzip to foo.js, and this is the file that gets packaged into the WAR file. So that handles the precompression part. To get this file served up properly, we just have to tell the browser that this file is compressed, via the Content-Encoding header of the HTTP response. This was done via an output filter that is applied to files that match the *.js extension (some Java/JBoss, WEB-INF/web.xml if it helps; I'm not too familiar with this, so sorry guys).