Amazon S3 CloudFront Deployment Best Practice

Our current plan for a site is to use Amazon's CloudFront service as a CDN for asset files such as CSS, JavaScript, images, and any other static files.
We currently have one bucket in S3 that contains all of these static files. The files are separated into folders by type: "Scripts" holds JS files, "Images" holds images, and so on.
What I didn't realize from the start was that once you deploy a bucket from S3 to a CloudFront distribution, subsequent updates to the bucket don't automatically show up in that same distribution. So it looks as if you have to redeploy the bucket to another CloudFront distribution every time you have a static file update.
That's fine for images: if an image changes, we can simply upload it as a new image. But that's difficult to do for CSS and JS.
So, that gets me to the Best Practice questions:
Is it best practice to create another CloudFront distribution for every production deployment? The problem is that this causes trouble with CNAME records.
Is it best practice NOT to keep CSS and JS in CloudFront, given the nature of those files and their need to be easily modified? It seems the answer should be no, because serving exactly those files is the purpose of a CDN.
Is there some other method with CloudFront that I don't know about?

You can issue invalidation requests to CloudFront.
http://docs.amazonwebservices.com/AmazonCloudFront/latest/DeveloperGuide/Invalidation.html
Instead of an S3 bucket, though, we use our own server as a custom origin. An .htaccess rule aliases style_*.css to style.css, and we inject the modification time of style.css into the HTML. Since CloudFront sees a completely different URL, it fetches the new version.
(Note: Some CDNs let you do that via query string, but CloudFront ignores all query string data for caching, hence the .htaccess solution.)
edit: CloudFront can be (optionally) configured to use query strings now.
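The mtime-versioning trick described above can be sketched in a few lines. This is an illustrative sketch only (the class and method names are ours, not an AWS API), using a temp file as a stand-in for the real stylesheet:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Illustrative sketch of mtime-based asset versioning: the file's
// last-modified time is baked into the URL, so CloudFront sees a
// brand-new URL whenever the file changes.
public class AssetVersioner {

    // "style.css" + 1715000000 -> "style_1715000000.css"
    public static String versionedName(String filename, long mtimeSeconds) {
        int dot = filename.lastIndexOf('.');
        return filename.substring(0, dot) + "_" + mtimeSeconds + filename.substring(dot);
    }

    // Server-side inverse of the .htaccess alias: strip the "_<digits>"
    // so every versioned URL maps back to the one real file on disk.
    public static String canonicalName(String versioned) {
        return versioned.replaceFirst("_\\d+(\\.[^.]+)$", "$1");
    }

    public static void main(String[] args) throws IOException {
        Path css = Files.createTempFile("style", ".css"); // stand-in for the real stylesheet
        long mtime = Files.getLastModifiedTime(css).toMillis() / 1000;
        // This is the URL you would inject into the HTML at render time.
        System.out.println("https://dxxx.cloudfront.net/css/" + versionedName("style.css", mtime));
    }
}
```

On the server, the .htaccess alias maps any style_&lt;digits&gt;.css request back to the canonical style.css, so no files are actually duplicated on disk.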

CloudFront has started supporting query strings, which you can use to invalidate the cache.
http://aws.typepad.com/aws/2012/05/amazon-cloudfront-support-for-dynamic-content.html

Related

AWS Next.js on EC2 caching

I'm planning to use Next.js SSR/SSG/ISR on Amazon's EC2 and store images on S3 Bucket. Also to add CloudFront CDN on top of it.
The question is:
Should I cache images from S3 in Next.js (on EC2), thereby "doubling" the images (originals in S3, optimised instances in the EC2 Next.js cache)? Or does that make no sense, since everything is located within one cloud (AWS) and covered by a CDN layer (CloudFront)?
Or is there a way to move the Next.js caching to CloudFront?
I do understand that next/image is providing image optimisation (different sizes and quality), but I'm bothered by "doubling" the images, thus paying more for storage.
P.S. I've seen this question, I'm just not experienced with lambda, so currently looking for something I understand already.
CloudFront gives you the option to use a different origin per behaviour, and you can also apply a different cache policy per behaviour. What you can do is create a behaviour for /images that points to the S3 origin, while the default behaviour points to the EC2 origin.
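As a toy model of how those behaviours route requests (origin names here are made up; the real routing is configured on the distribution, with the most specific path pattern winning):

```java
// Toy model of the suggested CloudFront setup: behaviors are matched by
// path pattern, most specific first, and the default behavior catches
// everything else. Origin names are placeholders, not real AWS values.
public class BehaviorRouter {

    public static String originFor(String path) {
        if (path.startsWith("/images/")) {
            return "s3-images-origin";   // behavior "/images/*": S3 origin, long-lived cache policy
        }
        return "ec2-nextjs-origin";      // default behavior "*": EC2/Next.js origin, short or no caching
    }

    public static void main(String[] args) {
        System.out.println(originFor("/images/logo.png")); // served from S3
        System.out.println(originFor("/products/42"));     // served by Next.js on EC2
    }
}
```

With this split, the optimised image files never need to live in the Next.js cache at all; CloudFront caches them at the edge straight from S3.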

How best to serve versioned S3 files from CloudFront?

I have a CloudFront distribution that has two origins in it: one origin that is a static S3 bucket (which is cached by CloudFront), and another origin that is a dynamic server (which is not cached). When users log into my application, the dynamic origin redirects users to the static, cached S3 bucket origin.
Right now, I'm handling the versioning of my S3 bucket by doing the following: I prepend the version of my code to the path of the S3 bucket on every release (so if my S3 bucket's path is normally /static/ui, it now becomes /v1.2/static/ui). The S3 bucket's cache behavior in CloudFront has the path pattern /static/ui, BUT the origin settings for the S3 bucket has the origin path /v1.2. Unfortunately, because the origin path isn't included in my cache behavior, whenever I have to change it to point to a new version, I have to invalidate my cache so that CloudFront will check the new origin path.
So, the release process goes like this:
I create a new version of my UI code and add it to S3, and prepend the version to my S3 bucket's path (creating a path that looks like this /v1.2/static/ui).
I change my "Origin Path" value in CloudFront associated with my S3 origin to have the new version in it (so it becomes /v1.2). This makes it so that all requests to my CloudFront distribution get forwarded to my origin with /v1.2 prepended to the origin path.
I invalidate my CloudFront cache.
This method of versioning my UI works - but is there a better way to do it? I'd like to be able to handle versioning of my S3 bucket without having to invalidate the cache every time I change versions.
I ended up versioning my S3 bucket and handling my CloudFront cache busting by doing the following:
I changed the way that Webpack built my static assets by adding a hash to all of the files Webpack builds except for my index.html, which now points to all of my hashed filenames. So, for example, previously, a given JavaScript chunk that Webpack builds was called 12.js, and now it might be called 0b2fb3071ad992833dd4.js, and the index.html now references that new hashed filename. The file hash is generated by the content within the file, and will change if the content in the file changes.
I make sure that when my static files are uploaded to S3, the index.html file has the header Cache-Control: no-cache sent out with every request to the index.html file in my S3 bucket. This makes it so that CloudFront never caches the index.html file, but will still cache the hashed filenames that the index.html points to.
I prepend the version of my static files to the S3 bucket's path where I upload my static assets (so for example, my S3 bucket's path might look like /v1.2/static/ui/index.html when I make a request to my index.html file).
Upon every new release, I update my CloudFront origin path for my S3 bucket origin to point to the new version of my UI (so it might change from /v1.2 to /v1.3).
Because my index.html isn't cached by CloudFront, and because all the other static files that the index.html points to are hashed and will have different file names on each release (if the file changed at all), I no longer need to manually send a CloudFront cache invalidation for my S3 bucket's path on every release: the index.html will point to entirely new files, which won't be cached in CloudFront until users first request them.
This also means that switching my CloudFront origin path will now instantaneously switch users to different versions of my UI, and I don't have to wait the estimated 5 minutes for a manual CloudFront cache invalidation to take effect.
Lastly, this method of cache busting also works with browser-side caching, so I was able to have files saved in my S3 bucket send a Cache-Control: max-age=604800 header to users for all files except for my index.html file and enable browser-side caching for my users. The browser-side cache is invalidated when the index.html points to a new hashed filename for its static assets. This greatly improved the performance of my application.
This all came at the cost of not being able to cache my index.html file in CloudFront or the user's browser, but I think the benefits of caching this way outweigh the cost.
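The content-hash naming that drives this whole scheme can be sketched as follows. This is an illustrative stand-in for what webpack's content hashing does, not the author's actual build code:

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Illustrative sketch of content-hash file naming: derive the chunk's
// filename from a digest of its bytes, so the name changes exactly when
// the content changes (and only then).
public class ContentHasher {

    // e.g. hashedName("main", "js", bytes) -> "main.0b2fb3071ad992833dd4.js"
    public static String hashedName(String base, String ext, byte[] content) {
        try {
            byte[] digest = MessageDigest.getInstance("SHA-256").digest(content);
            StringBuilder hex = new StringBuilder();
            for (int i = 0; i < 10; i++) {           // 20 hex chars, like the example above
                hex.append(String.format("%02x", digest[i]));
            }
            return base + "." + hex + "." + ext;
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);      // SHA-256 is always available
        }
    }

    public static void main(String[] args) {
        byte[] v1 = "console.log('v1');".getBytes(StandardCharsets.UTF_8);
        byte[] v2 = "console.log('v2');".getBytes(StandardCharsets.UTF_8);
        System.out.println(hashedName("main", "js", v1));
        System.out.println(hashedName("main", "js", v2)); // different name, since content differs
    }
}
```

Because the name is derived from the bytes, re-deploying an unchanged chunk produces the same name, so CloudFront and browser caches keep serving it; only changed files get new names, and therefore cold caches.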

How does versioning work on Amazon Cloudfront?

I've just set up a static website on Amazon S3. I'm also using the Cloudfront CDN service.
According to Amazon, there are 2 methods available for clearing the Cloudfront cache: invalidation and versioning. My question is regarding the latter.
Consider the following example:
I link to an image file (image.jpg) from my index.html file. I then decide to replace the image. I upload a second image with the filename: image_2.jpg and change the link in my index.html file.
Will the changes automatically take effect or is some further action required?
What triggers the necessary changes if the edited and newly uploaded files are located in the bucket and not the cache?
Versioning in CloudFront is nothing more than adding (or prefixing) a version to the name of the object, or to the 'folder' the objects are stored in. You can either:
put all objects in a folder v1 and use a URL like
https://xxx.cloudfront.net/v1/image.png
or include a version in each object's name, like image_v1.png, and use a URL like https://xxx.cloudfront.net/image_v1.png
The second option is often a bit more work, but you don't need to re-upload files that haven't changed (cheaper in terms of storage). The first option is often clearer and requires less work.
Using CloudFront versioning requires more S3 storage, but is often cheaper than creating many invalidations.
The other way to invalidate the cache is to create invalidations (which can get expensive). If you don't really need invalidations, but just want faster cache refreshes (the default TTL is 24 hours), you can lower the TTL in the origin settings (origin level) or set the cache duration for an individual object (object level).
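The two layouts described above can be expressed as tiny URL builders; a sketch only, with a placeholder distribution domain:

```java
// The two versioning layouts from the answer above, as URL builders.
// The distribution domain is a placeholder.
public class VersionedUrls {
    static final String CDN = "https://xxx.cloudfront.net";

    // Option 1: version as a folder prefix -> /v1/image.png
    public static String folderVersioned(String version, String object) {
        return CDN + "/" + version + "/" + object;
    }

    // Option 2: version embedded in the object name -> image_v1.png
    public static String nameVersioned(String object, String version) {
        int dot = object.lastIndexOf('.');
        return CDN + "/" + object.substring(0, dot) + "_" + version + object.substring(dot);
    }

    public static void main(String[] args) {
        System.out.println(folderVersioned("v1", "image.png")); // https://xxx.cloudfront.net/v1/image.png
        System.out.println(nameVersioned("image.png", "v1"));   // https://xxx.cloudfront.net/image_v1.png
    }
}
```

Note the trade-off this makes concrete: folder versioning changes every URL on each release (everything is re-uploaded and re-fetched), while per-object versioning only changes the names of the objects you actually bumped.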
Your CloudFront configuration has a cache TTL, which determines how long a file is served from the cache regardless of when the source changes.
If you need it updated right away, run an invalidation on your index.html file.
I'll chime in on this in case anyone else comes here looking for what I did. You can set up Cloudfront with S3 versioning enabled and reference specific S3 versions if you know which version you need. I put it behind a presigned Cloudfront URL and ended up with this in the Java SDK:
// Custom properties pulled from a config file
S3Properties s3Properties...

// Point at the specific S3 object version via the versionId query string
String cloudfrontUrl = "https://" + s3Properties.getCloudfrontDomain() + "/"
        + documentS3Key + "?versionId=" + documentS3VersionId;

// CloudFrontUrlSigner and SignerUtils come from the AWS SDK for Java
URL cloudfrontSignedUrl = new URL(CloudFrontUrlSigner.getSignedURLWithCannedPolicy(
        cloudfrontUrl,
        s3Properties.getCloudfrontKeypairId(),
        SignerUtils.loadPrivateKey(s3Properties.getCloudfrontKeyfilePath()),
        getPresignedUrlExpiration()));  // expiry Date for the canned policy

How to make browser download html when its content changed in s3?

I am using an S3 bucket to host my web site. Whenever I release a new version of the site, I want all clients to download it from S3 instead of reading it from their browser cache. I know I can set an expiry time on the objects in the S3 bucket, but that is not an ideal solution, since users would keep seeing the cached content for a period of time. Is there a way to force the browser to re-download content when it has changed in the S3 bucket?
Irrespective of whether you are using an S3 bucket for hosting or any other hosting server, caching can be controlled by appending a hash to the file name.
For example, your JS bundle name should look like bundle.7e2c49a622975ebd9b7e.js.
When you deploy again, it will change to some other hash value, e.g. bundle.205199ab45963f6a62ec.js.
This way the browser automatically knows that a new file has arrived and downloads it again.
This can easily be done with any popular bundler such as grunt, gulp, or webpack; in webpack, for example, the [contenthash] substitution in the output filename does exactly this.

Why do I need Amazon S3 and Cloudfront?

I've read a lot of articles stating that I should be using Amazon S3 in conjunction with the CDN Cloudfront. I'm currently not doing this. I'm simply using Cloudfront with my standard shared hosting package.
Is it OK to use Cloudfront on its own with my standard shared hosting package? Surely there is no added benefit to using S3 also as the files are already located within Cloudfront.
Any enlightenment on this is much appreciated.
Leigh
S3 allows you to do things like static web hosting, with logging and redirection, e.g. redirecting www.example.com to example.com. You can then use CloudFront to place your assets as close to the end user as possible (the "nearest edge location"). An excellent guide on how to do this is in the AWS docs. Two main points: S3 supports HTTPS, and changes to files in S3 are reflected instantly. Because CloudFront is a CDN, you have to manually expire files if you change them; otherwise it could take up to 24 hours for your changes to show up.
http://docs.aws.amazon.com/gettingstarted/latest/swh/website-hosting-intro.html
A quick comparison between the two is given here:
http://www.bucketexplorer.com/documentation/cloudfront--amazon-s3-vs-amazon-cloudfront.html
There is no problem using CloudFront with your own origin server as opposed to an S3 origin.
There are some benefits of using S3:
Data transfer is faster between S3 and CloudFront
You don't need to worry about the stability and maintenance of the origin server (S3 handles that)
Multiple origin regions
There are also benefits if you use your own server:
You save the cost of S3 hosting (this depends on whether you already pay for your own server)
Easier customization should you need it
Data can be stored in a specific location for company/country regulations
So it all depends on your specific circumstances: how much you pay for your hosting package, whether you need low-level configuration of your origin server, and how sensitive your data is.
I would say for majority of the small/medium projects, S3 is a perfect place to store data.