Flushing CloudFront's cache when S3 files are deleted - amazon-s3

I have set up S3 with CloudFront as my CDN. As you know, when files from an S3 bucket are served through CloudFront, they get cached at CloudFront's edge locations for best performance.
If I delete files from S3, they remain in the CDN's cache and are still served to end users. How can I prevent this behavior? I want CloudFront to serve only the files that actually exist in S3.
Thanks in advance.

You can invalidate objects in CloudFront using the API or the console. When you do this, the files are removed from the CloudFront edge locations.
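If you want to script the invalidation, a minimal sketch using boto3 might look like this (the distribution ID and object keys are placeholders); note that invalidation paths are relative to the distribution root and must begin with a slash:

```python
import time
import boto3

cloudfront = boto3.client("cloudfront")

def invalidate_deleted_objects(distribution_id, keys):
    """Invalidate CloudFront cache entries for S3 keys that were deleted.

    distribution_id and keys are placeholders; substitute your own
    distribution ID and the keys you removed from the bucket.
    """
    # Invalidation paths are relative to the distribution root and must
    # begin with "/".
    paths = ["/" + key.lstrip("/") for key in keys]
    response = cloudfront.create_invalidation(
        DistributionId=distribution_id,
        InvalidationBatch={
            "Paths": {"Quantity": len(paths), "Items": paths},
            # CallerReference must be unique for each invalidation request.
            "CallerReference": str(time.time()),
        },
    )
    return response["Invalidation"]["Id"]

# Example usage (hypothetical values):
# invalidate_deleted_objects("E1ABCDEF123456", ["images/old-logo.png"])
```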

Related

How best to serve versioned S3 files from CloudFront?

I have a CloudFront distribution that has two origins in it: one origin that is a static S3 bucket (which is cached by CloudFront), and another origin that is a dynamic server (which is not cached). When users log into my application, the dynamic origin redirects users to the static, cached S3 bucket origin.
Right now, I'm handling the versioning of my S3 bucket by doing the following: I prepend the version of my code to the path of the S3 bucket on every release (so if my S3 bucket's path is normally /static/ui, it now becomes /v1.2/static/ui). The S3 bucket's cache behavior in CloudFront has the path pattern /static/ui, BUT the origin settings for the S3 bucket have the origin path /v1.2. Unfortunately, because the origin path isn't included in my cache behavior, whenever I change it to point to a new version, I have to invalidate my cache so that CloudFront will check the new origin path.
So, the release process goes like this:
I create a new version of my UI code and add it to S3, and prepend the version to my S3 bucket's path (creating a path that looks like this /v1.2/static/ui).
I change my "Origin Path" value in CloudFront associated with my S3 origin to have the new version in it (so it becomes /v1.2). This makes it so that all requests to my CloudFront distribution get forwarded to my origin with /v1.2 prepended to the origin path.
I invalidate my CloudFront cache.
This method of versioning my UI works, but is there a better way to do it? I'd like to be able to handle versioning of my S3 bucket without having to invalidate the cache every time I change versions.
I ended up versioning my S3 bucket and handling my CloudFront cache busting by doing the following:
I changed the way that Webpack built my static assets by adding a hash to all of the files Webpack builds except for my index.html, which now points to all of my hashed filenames. So, for example, previously, a given JavaScript chunk that Webpack builds was called 12.js, and now it might be called 0b2fb3071ad992833dd4.js, and the index.html now references that new hashed filename. The file hash is generated by the content within the file, and will change if the content in the file changes.
I make sure that when my static files are uploaded to S3, the index.html file is served with the header Cache-Control: no-cache on every request. This makes it so that CloudFront never caches the index.html file, but will still cache the hashed files that index.html points to.
I prepend the version of my static files to the S3 bucket's path where I upload my static assets (so for example, my S3 bucket's path might look like /v1.2/static/ui/index.html when I make a request to my index.html file).
Upon every new release, I update my CloudFront origin path for my S3 bucket origin to point to the new version of my UI (so it might change from /v1.2 to /v1.3).
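If that origin-path switch is scripted, a rough sketch with boto3 might look like the following (the distribution ID, origin ID, and version string are placeholders, not values from this answer):

```python
import boto3

cloudfront = boto3.client("cloudfront")

def switch_origin_path(distribution_id, origin_id, new_version):
    """Point an existing S3 origin at a new versioned prefix such as /v1.3.

    distribution_id, origin_id, and new_version are placeholders.
    """
    # Fetch the current config along with its ETag; the ETag must be passed
    # back as IfMatch when the config is written.
    current = cloudfront.get_distribution_config(Id=distribution_id)
    config = current["DistributionConfig"]

    for origin in config["Origins"]["Items"]:
        if origin["Id"] == origin_id:
            origin["OriginPath"] = "/" + new_version.strip("/")

    cloudfront.update_distribution(
        Id=distribution_id,
        DistributionConfig=config,
        IfMatch=current["ETag"],
    )

# Example usage (hypothetical values):
# switch_origin_path("E1ABCDEF123456", "my-s3-origin", "v1.3")
```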
Because my index.html isn't cached by CloudFront, and because all other static files that the index.html points to are hashed and will have different file names on each release (if the file changed at all), I no longer need to manually send a CloudFront cache invalidation for my S3 bucket's path on every release. On each release, the index.html file points to entirely new files, which will not be cached in CloudFront until users make new requests for them.
This also means that switching my CloudFront origin path will now instantaneously switch users to different versions of my UI, and I don't have to wait the estimated 5 minutes for a manual CloudFront cache invalidation to take effect.
Lastly, this method of cache busting also works with browser-side caching, so I was able to have the files in my S3 bucket send a Cache-Control: max-age=604800 header for everything except the index.html file, enabling browser-side caching for my users. The browser-side cache is invalidated when index.html points to a new hashed filename for its static assets. This greatly improved the performance of my application.
This all came at the cost of not being able to cache my index.html file in CloudFront or the user's browser, but I think the benefits of caching this way outweigh the cost.
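For completeness, here is a rough sketch of the upload step with these Cache-Control headers, using boto3; the bucket name, version prefix, and local build directory are assumptions rather than values from the answer:

```python
import mimetypes
import os
import boto3

s3 = boto3.client("s3")

def upload_build(bucket, version, build_dir):
    """Upload a built UI to a versioned prefix with the caching headers above.

    bucket, version, and build_dir are placeholders for your own values.
    """
    for root, _dirs, files in os.walk(build_dir):
        for name in files:
            local_path = os.path.join(root, name)
            rel_path = os.path.relpath(local_path, build_dir).replace(os.sep, "/")
            key = f"{version}/static/ui/{rel_path}"

            if name == "index.html":
                # Never cache index.html so a new release is picked up immediately.
                cache_control = "no-cache"
            else:
                # Hashed filenames can safely be cached for a week.
                cache_control = "max-age=604800"

            content_type = mimetypes.guess_type(name)[0] or "binary/octet-stream"
            s3.upload_file(
                local_path,
                bucket,
                key,
                ExtraArgs={"CacheControl": cache_control, "ContentType": content_type},
            )

# Example usage (hypothetical values):
# upload_build("my-ui-bucket", "v1.3", "./dist")
```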

deleting a file from AWS cloudfront distribution

I have created a CloudFront distribution in AWS using an S3 bucket.
I have uploaded some files to the S3 bucket and given all of them public read-only permissions.
I can see all the files using the CloudFront URL.
I want to delete some files from CloudFront. I deleted those files from the S3 bucket and also ran a CloudFront invalidation to remove them from the edge servers immediately, but the files are still there. I can access those files from the CloudFront URL, and they remain on the edge servers even after the TTL.
Can someone please tell me how to solve this problem?
Thanks in advance!
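Two things worth checking in a case like this are whether the invalidation paths exactly match the deleted keys (paths are case-sensitive and must begin with a slash) and whether the invalidation has actually finished. A small sketch with boto3 that waits for an invalidation to complete (the IDs are placeholders):

```python
import boto3

cloudfront = boto3.client("cloudfront")

def wait_for_invalidation(distribution_id, invalidation_id):
    """Block until a previously created invalidation reaches Completed status.

    Both arguments are placeholders; use the IDs from your own request.
    """
    waiter = cloudfront.get_waiter("invalidation_completed")
    waiter.wait(DistributionId=distribution_id, Id=invalidation_id)

    status = cloudfront.get_invalidation(
        DistributionId=distribution_id, Id=invalidation_id
    )["Invalidation"]["Status"]
    print(f"Invalidation {invalidation_id}: {status}")

# Example usage (hypothetical values):
# wait_for_invalidation("E1ABCDEF123456", "I2J3K4EXAMPLE")
```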

S3 Replication Between Regions

I am looking for a way to replicate between S3 buckets across regions.
The purpose is that if a file is accidentally deleted because of a bug in my application, I can restore it from the other bucket.
Is there any way to do this without uploading the file twice (that is, not in the application layer)?
Enable versioning on your S3 bucket. S3 will then keep every version of each file you upload or update, and you can restore any version from the version listing. See Amazon S3 Object Lifecycle Management.
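A minimal sketch with boto3 of enabling versioning and restoring a deleted object from its most recent surviving version (the bucket name and key are hypothetical):

```python
import boto3

s3 = boto3.client("s3")

# Enable versioning so overwrites and deletes keep the older versions around.
s3.put_bucket_versioning(
    Bucket="my-bucket",  # placeholder bucket name
    VersioningConfiguration={"Status": "Enabled"},
)

def restore_latest_version(bucket, key):
    """Restore a deleted object by copying its newest surviving version back
    on top of the key. bucket and key are placeholders.
    """
    versions = s3.list_object_versions(Bucket=bucket, Prefix=key)
    for version in versions.get("Versions", []):
        if version["Key"] == key:
            # Versions for a key are returned newest first.
            s3.copy_object(
                Bucket=bucket,
                Key=key,
                CopySource={"Bucket": bucket, "Key": key, "VersionId": version["VersionId"]},
            )
            return version["VersionId"]
    return None

# Example usage (hypothetical values):
# restore_latest_version("my-bucket", "reports/2020-01.csv")
```

Note that S3's built-in cross-region replication also requires versioning to be enabled on both the source and destination buckets.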

Amazon Cloudfront sometimes ignoring http headers from S3

I have a CloudFront distribution that serves files from AWS S3.
Check the following files:
http://s3.amazonaws.com/kekanto_static/css/site.v70.css
http://s3.amazonaws.com/kekanto_static/css/site.v71.css
If you take a look at the headers, you'll realize that BOTH files contain the entry:
Content-Encoding: gzip
Both are properly gzipped in the S3 bucket as it should be.
But when served from CloudFront, the Content-Encoding header is missing:
http://img.kekanto.com/css/site.v70.css
http://img.kekanto.com/css/site.v71.css
Any idea why this is happening?
I also checked the CloudFront endpoints:
http://d1l2emnu9r9dtc.cloudfront.net/css/site.v70.css
http://d1l2emnu9r9dtc.cloudfront.net/css/site.v71.css
And the problem remains the same.
EDIT:
Looks like everything is working properly again after an invalidation, so you won't be able to test the URIs I've given.
The only explanation I can think of is that S3 makes the file available before the headers become available, which sometimes causes the headers to get skipped.
How can I prevent this, other than having my upload process sleep for a while before letting my web servers know there's a new version?
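One way to sketch that verification step with boto3 (the bucket, key, and local file are hypothetical): upload the file with Content-Encoding set explicitly, then confirm the stored headers with a HEAD request before letting the web servers know about the new version.

```python
import gzip
import boto3

s3 = boto3.client("s3")

def upload_gzipped_css(bucket, key, local_path):
    """Upload a gzip-compressed CSS file with explicit headers, then verify
    them before the new version is announced. All arguments are placeholders.
    """
    with open(local_path, "rb") as f:
        body = gzip.compress(f.read())

    s3.put_object(
        Bucket=bucket,
        Key=key,
        Body=body,
        ContentType="text/css",
        ContentEncoding="gzip",
    )

    # Check the stored headers before telling the web servers about the file.
    head = s3.head_object(Bucket=bucket, Key=key)
    assert head.get("ContentEncoding") == "gzip", "Content-Encoding not set"
    return head

# Example usage (hypothetical values):
# upload_gzipped_css("kekanto_static", "css/site.v72.css", "build/site.css")
```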

Amazon S3 Cloudfront Deployment Best Practice

Our current plan for a site is to use Amazon's CloudFront service as a CDN for asset files such as CSS, JavaScript, images, and any other static files.
We currently have one bucket in S3 that contains all of these static files. The files are separated into different folders depending on what they are: "Scripts" are JS files, "Images" are images, etc.
So, what I didn't realize from the start was that once you deploy an S3 bucket to a CloudFront distribution, subsequent updates to the bucket aren't deployed again to that same distribution. It looks as if you have to redeploy the bucket to another CloudFront instance every time you have a static file update.
That's fine for images, because we can easily make sure that if there is a change to an image, then we just create a new image. But, that's difficult to do for CSS and JS.
So, that gets me to the Best Practice questions:
Is it best practice to create another CloudFront distribution for every production deployment? The problem here is that this causes trouble with CNAME records.
Is it best practice NOT to warehouse CSS and JS in CloudFront because of the nature of those files and their need to be easily modified? It seems like the answer would be no, because that's the purpose of a CDN.
Is there some other method with CloudFront that I don't know about?
You can issue invalidation requests to CloudFront.
http://docs.amazonwebservices.com/AmazonCloudFront/latest/DeveloperGuide/Invalidation.html
Instead of an S3 bucket, though, we use our own server as a custom origin. We have an .htaccess rule that aliases style_*.css to style.css, and we inject the file modification time of style.css into the HTML. Since CloudFront sees a totally different URL, it fetches the new version.
(Note: Some CDNs let you do that via query string, but CloudFront ignores all query string data for caching, hence the .htaccess solution.)
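A tiny sketch of the modification-time injection (the path and URL pattern are assumptions; the .htaccess alias itself is Apache configuration and not shown here):

```python
import os

def versioned_css_url(path="static/style.css"):
    """Build a cache-busting URL such as /style_1336500000.css from the file's
    modification time. The path and URL pattern are placeholders.
    """
    mtime = int(os.path.getmtime(path))
    return f"/style_{mtime}.css"

# The generated URL is embedded in the rendered HTML; the .htaccess rule on
# the origin maps style_*.css back to style.css, so CloudFront sees a new
# URL (and fetches a fresh copy) whenever the file changes.
```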
Edit: CloudFront can now (optionally) be configured to use query strings.
CloudFront has started supporting query strings, which you can use for cache busting.
http://aws.typepad.com/aws/2012/05/amazon-cloudfront-support-for-dynamic-content.html