Consider a dynamic content html web site with lots of static .js and image baggage must be hosted in a single location. The site will soon have a few 1000 new users clustered in a single country on the other side of the world. This new remote country has a Amazon S3 node and all users in that country will be within 1000Km of the S3 node.
To improve the user expperience in the remote country I propose to locate the largest and most referenced static files on a local server close to that remote user community and rewrite URLs when servicing those users.
My feeling is that using a commercial CDN would be overkill in this situation and directly referencing our own manually managed S3 static content would give us more control particularly for occasional urgent patches to JavaScript.
If you are already using Amazon S3 to store your static content, it makes sense to use Amazon's CloudFront CDN.
You can start using it and get all the benefits of the CDN without too much effort.
Related
Currently I’m using a shared storage(azure file storage) to store profile pictures and company logos and also some custom python scripts uploaded by admins. My rest services are running in a docker swarm cluster where all the nodes have access to the shared location. Are there any drawbacks to this kind of design? I’m currently saving the files to the location and creating a url for that file and serving it as a static resource using my nginx reverse proxy/load balancer. So I was curious to know if there are any drawbacks to this design and how can I make it better?
There are several ways to access, store, and manipulate files in Azure file storage using REST API:
The Azure File service offers the following four resources: the storage account, shares, directories, and files. Shares provide a way to organize sets of files and also can be mounted as an SMB file share that is hosted in the cloud.
More info here
When it comes to the design, it will depend of what kind of concerns your customers may have, slow connectivity, are they going to need these files permanently etc ...
I've read a lot of articles stating that I should be using Amazon S3 in conjunction with the CDN Cloudfront. I'm currently not doing this. I'm simply using Cloudfront with my standard shared hosting package.
Is it OK to use Cloudfront on its own with my standard shared hosting package? Surely there is no added benefit to using S3 also as the files are already located within Cloudfront.
Any enlightenment on this is much appreciated.
Leigh
S3 allows you to do things like static webhosting, with logging and redirection. I.E www.example.com redirects to example.com. You can then use Cloudfront to place your assets as close to the end user as possible ("nearest edge location"). An excellent guide on how to do this is in the AWS docs. Two main things are that S3 supports https, and changes to files in S3 are reflected instantly. Because Cloudfront is a CDN, you have to manually expire files if you change them, otherwise is could take up to 24 hours to reflect your changes.
http://docs.aws.amazon.com/gettingstarted/latest/swh/website-hosting-intro.html
A quick comparison between the two is given here:
http://www.bucketexplorer.com/documentation/cloudfront--amazon-s3-vs-amazon-cloudfront.html
There is no problem of using CloudFront against your own origin server comparing to a S3 server.
There are some benefits of using S3:
Data transfer is faster between S3 and CloudFront
Don't need to worry about the stability and maintenance of origin S3 server
Multiple origin regions
There are also benefits if you use your own server:
Cost saving of S3 hosting (this depends on whether you need to pay for your own server)
Easy for customization should you need it
Data storage location for company/country regulation
So it's all depending on your specific circumstances, such as how much you pay for your hosting package, do you need low-level configuration of your origin server, and how sensitivity your data is.
I would say for majority of the small/medium projects, S3 is a perfect place to store data.
Current Situation
I have a project on GitHub that builds after every commit on Travis-CI. After each successful build Travis uploads the artifacts to an S3 bucket. Is there some way for me to easily let anyone access the files in the bucket? I know I could generate a read-only access key, but it'd be easier for the user to access the files through their web browser.
I have website hosting enabled with the root document of "." set.
However, I still get an 403 Forbidden when trying to go to the bucket's endpoint.
The Question
How can I let users easily browse and download artifacts stored on Amazon S3 from their web browser? Preferably without a third-party client.
I found this related question: Directory Listing in S3 Static Website
As it turns out, if you enable public read for the whole bucket, S3 can serve directory listings. Problem is they are in XML instead of HTML, so not very user-friendly.
There are three ways you could go for generating listings:
Generate index.html files for each directory on your own computer, upload them to s3, and update them whenever you add new files to a directory. Very low-tech. Since you're saying you're uploading build files straight from Travis, this may not be that practical since it would require doing extra work there.
Use a client-side S3 browser tool.
s3-bucket-listing by Rufus Pollock
s3-file-list-page by Adam Pritchard
Use a server-side browser tool.
s3browser (PHP)
s3index Scala. Going by the existence of a Procfile, it may be readily deployable to Heroku. Not sure since I don't have any experience with Scala.
Filestash is the perfect tool for that:
login to your bucket from https://www.filestash.app/s3-browser.html:
create a shared link:
Share it with the world
Also Filestash is open source. (Disclaimer: I am the author)
I had the same problem and I fixed it by using the
new context menu "Make Public".
Go to https://console.aws.amazon.com/s3/home,
select the bucket and then for each Folder or File (or multiple selects) right click and
"make public"
You can use a bucket policy to give anonymous users full read access to your objects. Depending on whether you need them to LIST or just perform a GET, you'll want to tweak this. (I.e. permissions for listing the contents of a bucket have the action set to "s3:ListBucket").
http://docs.aws.amazon.com/AmazonS3/latest/dev/AccessPolicyLanguage_UseCases_s3_a.html
Your policy will look something like the following. You can use the S3 console at http://aws.amazon.com/console to upload it.
{
"Version":"2008-10-17",
"Statement":[{
"Sid":"AddPerm",
"Effect":"Allow",
"Principal": {
"AWS": "*"
},
"Action":["s3:GetObject"],
"Resource":["arn:aws:s3:::bucket/*"
]
}
]
}
If you're truly opening up your objects to the world, you'll want to look into setting up CloudWatch rules on your billing so you can shut off permissions to your objects if they become too popular.
https://github.com/jupierce/aws-s3-web-browser-file-listing is a solution I developed for this use case. It leverages AWS CloudFront and Lambda#Edge functions to dynamically render and deliver file listings to a client's browser.
To use it, a simple CloudFormation template will create an S3 bucket and have your file server interface up and running in just a few minutes.
There are many viable alternatives, as already suggested by other posters, but I believe this approach has a unique range of benefits:
Completely serverless and built for web-scale.
Open source and free to use (though, of course, you must pay AWS for resource utilization -- such S3 storage costs).
Simple / static client browser content:
No Ajax or third party libraries to worry about.
No browser compatibility worries.
All backing systems are native AWS components.
You never share account credentials or rely on 3rd party services.
The S3 bucket remains private - allowing you to only expose parts of the bucket.
A custom hostname / SSL certificate can be established for your file server interface.
Some or all of the host files can be protected behind Basic Auth username/password.
An AWS WebACL can be configured to prevent abusive access to the service.
I want to upload pictures to the AWS s3 through the iPhone. Every user should be able to upload pictures but they must remain private for each one of them.
My question is very simple. Since I have no real experience with servers I was wondering which of the following two approaches is better.
1) Use some kind of token vending machine system to grant the user access to the AWS s3 database to upload directly.
2) Send the picture to the EC2 Servlet and have the virtual server place it on the S3 storage.
Edit: I would also need to retrieve, should i do it directly or through the servlet?
Thanks in advance.
Hey personally I don't think it's a good idea to use token vending machine to directly upload the data via the iPhone, because it's much harder to control the access privileges, etc. If you have a chance use ec2 and servlet, but that will add costs to your solution.
Also when dealing with S3 you need to take in consideration that some files are not available right after you save them. Look at this answer from S3 FAQ.
For retrieving data directly from S3 you will need to deal with the privileges issue again. Check the access model for S3, but again it's probably easier to manage the access for non public files via the servlet. The good news is that there is no data transfer charge for data transferred between EC2 and S3 within the same region.
Another important point to consider the latter solution
High performance in handling load and network speeds within amazon ecosystem. With direct uploads the client would have to handle complex asynchronous operations of multipart uploads etc instead of focusing on the presentation and rendering of the image.
The servlet hosted on EC2 would be way more powerful than what you can do on your phone.
Our current plan for a site is to use Amazon's Cloudfront service as a CDN for asset files such as CSS, JavaScript, and Images, and any other static files.
We currently have 1 bucket in S3 that contains all of these static files. The files are separated into different folders depending on what they are, "Scripts" are JS files, "Images" are Images, etc yadda yadda yadda.
So, what I didn't realize from the start was that once you deploy a Bucket from S3 to a Cloudfront Distribution, then every subsequent update to the bucket won't deploy again to that same Distribution. So, it looks as if you have to redeploy the bucket to another Cloudfront instance every time you have a static file update.
That's fine for images, because we can easily make sure that if there is a change to an image, then we just create a new image. But, that's difficult to do for CSS and JS.
So, that gets me to the Best Practice questions:
Is it best practice to create another Cloudfront Distribution for every production deployment? The problem here would be that causes trouble with CNAME records.
Is it best practice to NOT warehouse CSS and JS in Cloudfront because of the nature of those files, and their need to be easily modified? Seems like the answer to this would be NO because that's the purpose of a CDN.
Is there some other method with Cloudfront that I don't know about?
You can issue invalidation requests to CloudFront.
http://docs.amazonwebservices.com/AmazonCloudFront/latest/DeveloperGuide/Invalidation.html
Instead of an S3 bucket, though, we use our own server as a custom origin. We have .htaccess alias style_*.css to style.css, and we inject the file modification time for style.css in the HTML. As CloudFront sees a totally different URL, it'll fetch the new version.
(Note: Some CDNs let you do that via query string, but CloudFront ignores all query string data for caching, hence the .htaccess solution.)
edit: CloudFront can be (optionally) configured to use query strings now.
CloudFront has started supporting query strings, which you can use to invalidate cache.
http://aws.typepad.com/aws/2012/05/amazon-cloudfront-support-for-dynamic-content.html