Serving files from an S3 bucket through Cloudflare

I wanted to serve S3 bucket files through the Cloudflare network, but encountered some issues. Integration instructions are given here, but they only work for new buckets, since the bucket is required to be named subdomain.domain.com while my bucket is named domain.
Are there any other solutions for using Cloudflare with S3 without copying files from one bucket to another, like setting up redirects? The problem is that my bucket contains more than 6 million files, which take up 200 GB of storage.
Amazon S3's pricing rules are also hard to understand. I'm struggling to find information on how much it costs to transfer data from one bucket to another when both are in the same region.
Thanks for any answers.

Unfortunately, Amazon S3 requires that the CNAME conform to the bucket name, as you already found out. So basically you'll have to fix the name.
Here https://serverfault.com/questions/349460/how-to-move-files-between-two-s3-buckets-with-minimum-cost you can find out how to copy files between buckets with minimum cost. Within the same region, and with the right tools, you will not incur bandwidth costs, only the duplicate storage cost for the duration and the request costs; details are in the linked answer.
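For what it's worth, the copy can run entirely server-side, so nothing is downloaded or uploaded through your own connection. A minimal sketch with boto3 (the bucket names are placeholders, and copy_object is limited to objects up to 5 GB):

import boto3

s3 = boto3.client("s3")

# Server-side copy: S3 copies each object internally within the region.
paginator = s3.get_paginator("list_objects_v2")
for page in paginator.paginate(Bucket="old-bucket"):
    for obj in page.get("Contents", []):
        s3.copy_object(
            Bucket="subdomain.domain.com",  # the new, CNAME-compatible bucket
            Key=obj["Key"],
            CopySource={"Bucket": "old-bucket", "Key": obj["Key"]},
        )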
Your link to the Cloudflare docs doesn't seem to be working anymore; this is the correct link: https://support.cloudflare.com/hc/en-us/articles/200168926-How-do-I-use-CloudFlare-with-Amazon-s-S3-Service-

Related

Presigned S3 URLs with Cloudfront

I want to append my pre-signed URL to a CloudFront URL and use that instead.
Any idea how to achieve this?
Use an Amazon CloudFront Signed URL instead of attempting to use an Amazon S3 pre-signed URL with CloudFront.
See: Using Signed URLs - Amazon CloudFront
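For illustration, a sketch of generating one with botocore's CloudFrontSigner; the key pair ID, private key path, and distribution domain below are all placeholders:

import datetime
from botocore.signers import CloudFrontSigner
from cryptography.hazmat.primitives import hashes, serialization
from cryptography.hazmat.primitives.asymmetric import padding

def rsa_signer(message):
    # Sign with the private key matching your CloudFront key pair.
    with open("private_key.pem", "rb") as f:
        key = serialization.load_pem_private_key(f.read(), password=None)
    return key.sign(message, padding.PKCS1v15(), hashes.SHA1())

signer = CloudFrontSigner("KEY_PAIR_ID", rsa_signer)
url = signer.generate_presigned_url(
    "https://d1234example.cloudfront.net/private/file.pdf",
    date_less_than=datetime.datetime.utcnow() + datetime.timedelta(hours=1),
)
print(url)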
I find the question relevant; it matches my needs. I have files stored in S3 Singapore and external consumers in Europe. AWS's default bandwidth is quite poor (it takes several minutes to download a 50 MB file for quite a few of my end users), so I'd like to optimize their network path through a layer of "dumb" CDN (not leveraging any caching, just using it for better network paths).
Turns out "Amazon S3 Transfer Acceleration" does exactly that:
https://docs.aws.amazon.com/AmazonS3/latest/dev/transfer-acceleration.html
============
Why Use Amazon S3 Transfer Acceleration?
You might want to use Transfer Acceleration on a bucket for various reasons, including the following:
You have customers that upload to a centralized bucket from all over the world.
You transfer gigabytes to terabytes of data on a regular basis across continents.
You are unable to utilize all of your available bandwidth over the Internet when uploading to Amazon S3.
Getting Started with Amazon S3 Transfer Acceleration
To get started using Amazon S3 Transfer Acceleration, perform the following steps:
Enable Transfer Acceleration on a bucket
Transfer data to and from the acceleration-enabled bucket by using one of the following s3-accelerate endpoint domain names:
bucketname.s3-accelerate.amazonaws.com – to access an acceleration-enabled bucket.
============
Remarks:
It's more expensive than S3 + CloudFront. You pay normal S3 bandwidth plus something like 0.04 USD/GB for the acceleration (whereas with CloudFront, the S3 <-> CloudFront bandwidth is free).
You will probably need to re-sign the URLs. Usually the host is part of the signature, and acceleration requires using a different host. However, this is just normal S3 signing, not the completely different CloudFront signing.
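To make those remarks concrete, a sketch with boto3 (the bucket and key names are placeholders): enable acceleration once, then presign against the accelerate endpoint so the signed host matches the one clients actually hit:

import boto3
from botocore.config import Config

# One-time: turn Transfer Acceleration on for the bucket.
boto3.client("s3").put_bucket_accelerate_configuration(
    Bucket="my-bucket",
    AccelerateConfiguration={"Status": "Enabled"},
)

# Re-sign using the bucketname.s3-accelerate.amazonaws.com endpoint.
accelerated = boto3.client("s3", config=Config(s3={"use_accelerate_endpoint": True}))
url = accelerated.generate_presigned_url(
    "get_object",
    Params={"Bucket": "my-bucket", "Key": "big-file.bin"},
    ExpiresIn=3600,
)
print(url)  # host will be my-bucket.s3-accelerate.amazonaws.com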

Why do I need Amazon S3 and Cloudfront?

I've read a lot of articles stating that I should be using Amazon S3 in conjunction with the CloudFront CDN. I'm currently not doing this; I'm simply using CloudFront with my standard shared hosting package.
Is it OK to use CloudFront on its own with my standard shared hosting package? Surely there is no added benefit to using S3 as well, since the files are already located within CloudFront.
Any enlightenment on this is much appreciated.
Leigh
S3 allows you to do things like static web hosting, with logging and redirection, e.g. www.example.com redirecting to example.com. You can then use CloudFront to place your assets as close to the end user as possible (the "nearest edge location"). An excellent guide on how to do this is in the AWS docs. Two main points are that S3 supports HTTPS, and changes to files in S3 are reflected instantly. Because CloudFront is a CDN, you have to manually expire files if you change them; otherwise it could take up to 24 hours for your changes to be reflected.
http://docs.aws.amazon.com/gettingstarted/latest/swh/website-hosting-intro.html
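To manually expire a changed file, you create an invalidation against the distribution. A minimal sketch with boto3, using a placeholder distribution ID and path:

import time
import boto3

cloudfront = boto3.client("cloudfront")
cloudfront.create_invalidation(
    DistributionId="E1234EXAMPLE",
    InvalidationBatch={
        "Paths": {"Quantity": 1, "Items": ["/images/logo.png"]},
        "CallerReference": str(time.time()),  # must be unique per request
    },
)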
A quick comparison between the two is given here:
http://www.bucketexplorer.com/documentation/cloudfront--amazon-s3-vs-amazon-cloudfront.html
There is no problem with using CloudFront against your own origin server rather than an S3 origin.
There are some benefits of using S3:
Data transfer is faster between S3 and CloudFront
You don't need to worry about the stability and maintenance of the origin server
Multiple origin regions
There are also benefits if you use your own server:
You save the cost of S3 hosting (this depends on whether you have to pay for your own server anyway)
Easy for customization should you need it
Data storage location for company/country regulation
So it all depends on your specific circumstances: how much you pay for your hosting package, whether you need low-level configuration of your origin server, and how sensitive your data is.
I would say that for the majority of small/medium projects, S3 is a perfect place to store data.

Amazon S3 bucket-level stats

I'd like to know if there's a way to get bucket-level stats in Amazon S3.
Basically, I want to charge customers for storage and GET requests on my system (which is hosted on S3).
So I created a specific bucket for each client, but I can't seem to get the stats for just a specific bucket.
I see the API lets me
GET Bucket
or
GET Bucket requestPayment
But I just can't find how to get the number of requests issued to a given bucket and its total size.
Thanks for the help!
Regards
I don't think that what you are trying to achieve is possible using the Amazon API. The GET Bucket request does not contain usage statistics (requests, etc.) other than the timestamp of the latest modification (LastModified).
My suggestion would be to enable logging on your buckets and perform the analysis you want from there.
The S3 starting page gives you an overview of it:
Amazon S3 also supports logging of requests made against your Amazon S3 resources. You can configure your Amazon S3 bucket to create access log records for the requests made against it. These server access logs capture all requests made against a bucket or the objects in it and can be used for auditing purposes.
And I am sure there is plenty of documentation on that matter.
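As a sketch of both halves (assuming boto3 and placeholder bucket names): enable server access logging on a client's bucket, and total up its storage by listing the objects. Note the target bucket must grant S3's log delivery write access.

import boto3

s3 = boto3.client("s3")

# Write access logs for client-bucket into a separate logs bucket.
s3.put_bucket_logging(
    Bucket="client-bucket",
    BucketLoggingStatus={
        "LoggingEnabled": {
            "TargetBucket": "my-log-bucket",
            "TargetPrefix": "client-bucket/",
        }
    },
)

# Total size of the bucket, e.g. for storage billing.
total_bytes = 0
for page in s3.get_paginator("list_objects_v2").paginate(Bucket="client-bucket"):
    for obj in page.get("Contents", []):
        total_bytes += obj["Size"]
print(total_bytes)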
HTH.

How do I use Amazon's new RRS for S3?

Reduced Redundancy Storage (RRS) is a new storage option from Amazon that is a bit cheaper than standard S3 because there is less redundancy.
However, I cannot find any information on how to specify that my data should use RRS rather than standard S3. In fact, there doesn't seem to be any web interface for the S3 service. If I log into AWS, there are only options for EC2, Elastic MapReduce, CloudFront and RDS, none of which I use.
I know this question is old, but it's worth mentioning that Amazon's S3 console now has an option to change your files (recursively) to RRS. Select a folder, right-click on it, and under Properties change the storage class to RRS.
You can use S3 Browser to switch to Reduced Redundancy Storage. It allows you to view/edit the storage class for a single file or for multiple files. Moreover, you can configure a default storage class for the bucket, so S3 Browser will automatically apply the predefined storage class to all new files you upload through it.
If you are using S3 Browser to work with RRS, the following article may be helpful:
Working with Amazon S3 Reduced Redundancy Storage (RRS)
Note: storage class preferences are stored in a local settings file. Other S3 applications use their own ways of storing bucket defaults, and currently there is no single standard for this.
All objects in Amazon S3 have a storage class setting. The default setting is STANDARD. You can use an optional header on a PUT request to specify the setting REDUCED_REDUNDANCY.
From: http://aws.amazon.com/s3/faqs/#How_do_I_specify_that_I_want_to_store_my_data_using_RRS
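With boto3, that optional header corresponds to the StorageClass parameter on put_object; a minimal sketch with placeholder bucket and key names:

import boto3

# Upload an object directly into Reduced Redundancy Storage.
boto3.client("s3").put_object(
    Bucket="my-bucket",
    Key="thumbnails/img001.jpg",
    Body=open("img001.jpg", "rb"),
    StorageClass="REDUCED_REDUNDANCY",
)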
If you are looking for a way to convert existing data in Amazon S3, you can use a fairly recent version of boto and a script I wrote. Details are explained on my blog:
http://www.bryceboe.com/2010/07/02/amazon-s3-convert-objects-to-reduced-redundancy-storage/
If you're on a Mac, the free Cyberduck FTP program will do it. Log into S3, right-click on the bucket (or folder, or file), choose 'Info', and change the storage class from 'Unknown' or 'Regular S3 Storage' to 'Reduced Redundancy Storage'. It took about 2 hours to change 30,000 files for me...
If you use boto (version 2), you can do this:
key.change_storage_class('REDUCED_REDUNDANCY')
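For context, the fuller boto 2.x version of that one-liner might look like this (the bucket name is a placeholder); change_storage_class rewrites each object in place with the new class:

import boto

# Connect using credentials from the environment or boto config.
conn = boto.connect_s3()
bucket = conn.get_bucket("my-bucket")
for key in bucket.list():
    key.change_storage_class("REDUCED_REDUNDANCY")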

What's the best way to serve images across an EC2 cluster on AWS?

We want to be able to have a folder that can securely serve images across a cluster of web servers. What's the best way to handle this with Amazon Web Services (AWS)? Amazon S3? Amazon Elastic Block Store (EBS)? Amazon Cloudfront?
EDIT: Answer no longer needed...thanks.
I'm not sure what your main goal is, or whether you have read about the services you're asking about, but I'll try to explain your choices as far as I understand AWS:
S3 is STORAGE (buckets and objects, a sort of folder structure with metadata and access control)
EBS is a VOLUME (attached to an EC2 instance as an extra drive you can access like a local hard drive)
CloudFront is a WEB CACHE (you point it at an S3 bucket and Amazon replicates the content to its edge locations for you)
So we only need to figure out what you mean by "securely", as there are two options as I see it:
You can protect buckets in S3 with account-based access levels, e.g. administrator-only access versus publicly readable (see the policy sketch at the end of this answer)...
You can store the data on an EBS volume and keep it there; then it is very secure and NOT public, but shareable (I believe) among the servers (I've planned to check this out myself within the next week)
You cannot separately protect the CloudFront data, as it's controlled by the bucket permissions from S3...
Hope you can use this a little. I've not said anything regarding SPEED or COST; that's for you to benchmark/test with your data requirements. :o)
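To illustrate the bucket-permission option above, a minimal sketch with boto3 and a placeholder bucket name: a bucket policy that makes objects publicly readable while writes stay restricted to your account:

import json
import boto3

# Allow anyone to GET objects; all other actions remain private.
policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Sid": "PublicReadGetObject",
        "Effect": "Allow",
        "Principal": "*",
        "Action": "s3:GetObject",
        "Resource": "arn:aws:s3:::my-image-bucket/*",
    }],
}

boto3.client("s3").put_bucket_policy(
    Bucket="my-image-bucket",
    Policy=json.dumps(policy),
)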