What's the best way to serve images across an EC2 cluster on AWS? - amazon-s3

We want to be able to have a folder that can securely serve images across a cluster of web servers. What's the best way to handle this with Amazon Web Services (AWS)? Amazon S3? Amazon Elastic Block Store (EBS)? Amazon Cloudfront?
EDIT: Answer no longer needed...thanks.

I'm not sure what your main goal is or if you have read about the services you ask about. But I will try to explain it as far as I've understood AWS and your choices:
S3 is a STORAGE (with buckets and objects, a sort of folder structure with meta access)
EBS is a VOLUME (these are attached to an EC2 instance as extra drive you can access as a local harddrive)
CloudFront is a WEB-CACHE (you select which datacenter you want them in, and then you point at a S3 bucket and Amazon will replicate the content for you)
So we only need to figure out what you mean by "securely" as there are two options as I see it:
You can protect buckets in the S3 or make access levels with accounts, for "administrator access" only and PUBLIC READABLE...
You can store the data in a EBS volume and keep them there, then they are very secure and NOT public, but shareable (I believe) among the servers (I've planned to check out this myself within the next week)
You cannot protect "cloudfront" data as it's controlled by the Bucket permissions from S3...
Hope you can use this a little. I've not stated anything regarding SPEED nor COST, thats for you to benchmark/test with your data requirements. :o)

Related

Syncing buckets between two S3 Storage Providers

I am currently using RIAK CS as an S3 Provider but I want to change to Scality S3. Therefore, I need to migrate the existing data from RIAK to Scality. Is there a quick an easy way of syncing buckets between the two different storage providers? I have got two docker containers running containing the docker images for the two.
One way of doing it would be to simply download the contents of the buckets to a local folder and then upload to Scality using s3cmd or a similar tool. However, I was hoping there was a direct route between the buckets.
Any ideas?
There would not be a "direct route between the buckets".
While the Amazon S3 CopyObject command can copy objects between different Amazon S3 buckets (even if they are in different regions), it will not work with a non-Amazon endpoint.
Your only hope is if Riak/Scality have somehow built-in connectivity with each other.

Why do I need Amazon S3 and Cloudfront?

I've read a lot of articles stating that I should be using Amazon S3 in conjunction with the CDN Cloudfront. I'm currently not doing this. I'm simply using Cloudfront with my standard shared hosting package.
Is it OK to use Cloudfront on its own with my standard shared hosting package? Surely there is no added benefit to using S3 also as the files are already located within Cloudfront.
Any enlightenment on this is much appreciated.
Leigh
S3 allows you to do things like static webhosting, with logging and redirection. I.E www.example.com redirects to example.com. You can then use Cloudfront to place your assets as close to the end user as possible ("nearest edge location"). An excellent guide on how to do this is in the AWS docs. Two main things are that S3 supports https, and changes to files in S3 are reflected instantly. Because Cloudfront is a CDN, you have to manually expire files if you change them, otherwise is could take up to 24 hours to reflect your changes.
http://docs.aws.amazon.com/gettingstarted/latest/swh/website-hosting-intro.html
A quick comparison between the two is given here:
http://www.bucketexplorer.com/documentation/cloudfront--amazon-s3-vs-amazon-cloudfront.html
There is no problem of using CloudFront against your own origin server comparing to a S3 server.
There are some benefits of using S3:
Data transfer is faster between S3 and CloudFront
Don't need to worry about the stability and maintenance of origin S3 server
Multiple origin regions
There are also benefits if you use your own server:
Cost saving of S3 hosting (this depends on whether you need to pay for your own server)
Easy for customization should you need it
Data storage location for company/country regulation
So it's all depending on your specific circumstances, such as how much you pay for your hosting package, do you need low-level configuration of your origin server, and how sensitivity your data is.
I would say for majority of the small/medium projects, S3 is a perfect place to store data.

Is it possible to restrict access from EC2 instance to use only S3 buckets from specific account?

Goal: I would like to keep sensitive data in s3 buckets and process it on EC2 instances, located in the private cloud. I researched that there is possbility to set up S3 buckets policy by IP and user(iam) arn's thus i consider that data in s3 bucket is 'on the safe side'. But i am worriyng about the next scenario: 1) there is vpc 2) inside theres is an ec2 isntance 3) there is an user under controlled(allowed) account with permissions to connect and work with ec2 instance and buckets. Buckets are defined and configured to work with only with known(authorized) ec2-instances. Security leak: user uploads malware application on ec2 instance and during processing data executes malware application that transfer data to other(unauthorized) buckets under different AWS account. Disabling uploading data to ec2-instance is not an option in my case. Question: is it possible to restrict access on vpc firewal in such way that it will be access to some specific s3 buckets but it will be denied access to any other buckets? Assumed that user might upload malware application to ec2 instance and within it upload data to other buckets(under third-party AWS account).
There is not really a solution for what you are asking, but then again, you seem to be attempting to solve the wrong problem (if I understand your question correctly).
If you have a situation where untrustworthy users are in a position where they are able to "connect and work with ec2 instance and buckets" and upload and execute application code inside your VPC, then all bets are off and the game is already over. Shutting down your application is the only fix available to you. Trying to limit the damage by preventing the malicious code from uploading sensitive data to other buckets in S3 should be the absolute least of your worries. There are so many other options available to a malicious user other than putting the data back into S3 but in a different bucket.
It's also possible that I am interpreting "connect and work with ec2 instance and buckets" more broadly than you intended, and all you mean is that users are able to upload data to your application. Well, okay... but your concern still seems to be focused on the wrong point.
I have applications where users can upload data. They can upload all the malware they want, but there's no way any code -- malicious or benign -- that happens to be contained in the data they upload will ever get executed. My systems will never confuse uploaded data with something to be executed or handle it in a way that this is even remotely possible. If your code will, then you again have a problem that can only be fixed by fixing your code -- not by restricting which buckets your instance can access.
Actually, I lied, when I said there wasn't a solution. There is a solution, but it's fairly preposterous:
Set up a reverse web proxy, either in EC2 or somewhere outside, but of course make its configuration inaccessible to the malicious users. In this proxy's configuration, configure it to only allow access to the desired bucket. With apache, for example, if the bucket were called "mybucket," that might look something like this:
ProxyPass /mybucket http://s3.amazonaws.com/mybucket
Additional configuration on the proxy would deny access to the proxy from anywhere other than your instance. Then instead of allowing your instance to access the s3 endpoints directly, only allow outbound http toward the proxy (via the security group for the compromised instance). Requests for buckets other than yours will not make it through the proxy, which is now the only way "out." Problem solved. At least, the specific problem you were hoping to solved should be solvable by some variation of this approach.
Update to clarify:
To access the bucket called "mybucket" in the normal way, there are two methods:
http://s3.amazonaws.com/mybucket/object_key
http://mybucket.s3.amazonaws.com/object_key
With this configuration, you would block (not allow) all access to all S3 endpoints from your instances via your security group configuration, which would prevent accessing buckets with either method. You would, instead, allow access from your instances to the proxy.
If the proxy, for example, were at 172.31.31.31 then you would access buckets and their objects like this:
http://172.31.31.31/mybucket/object_key
The proxy, being configured to only permit certain patterns in the path to be forwarded -- and any others denied -- would be what controls whether a particular bucket is accessible or not.
Use VPC Endpoints. This allows you to restrict which S3 buckets your EC2 instances in a VPC can access. It also allows you to create a private connection between your VPC and the S3 service, so you don't have to allow wide open outbound internet access. There are sample IAM policies showing how to control access to buckets.
There's an added bonus with VPC Endpoints for S3 that certain major software repos, such as Amazon's yum repos and Ubuntu's apt-get repos, are hosted in S3 so you can also allow your EC2 instances to get their patches without giving them wide open internet access. That's a big win.

Correct Server Schema to upload pictures in Amazon Web Services

I want to upload pictures to the AWS s3 through the iPhone. Every user should be able to upload pictures but they must remain private for each one of them.
My question is very simple. Since I have no real experience with servers I was wondering which of the following two approaches is better.
1) Use some kind of token vending machine system to grant the user access to the AWS s3 database to upload directly.
2) Send the picture to the EC2 Servlet and have the virtual server place it on the S3 storage.
Edit: I would also need to retrieve, should i do it directly or through the servlet?
Thanks in advance.
Hey personally I don't think it's a good idea to use token vending machine to directly upload the data via the iPhone, because it's much harder to control the access privileges, etc. If you have a chance use ec2 and servlet, but that will add costs to your solution.
Also when dealing with S3 you need to take in consideration that some files are not available right after you save them. Look at this answer from S3 FAQ.
For retrieving data directly from S3 you will need to deal with the privileges issue again. Check the access model for S3, but again it's probably easier to manage the access for non public files via the servlet. The good news is that there is no data transfer charge for data transferred between EC2 and S3 within the same region.
Another important point to consider the latter solution
High performance in handling load and network speeds within amazon ecosystem. With direct uploads the client would have to handle complex asynchronous operations of multipart uploads etc instead of focusing on the presentation and rendering of the image.
The servlet hosted on EC2 would be way more powerful than what you can do on your phone.

Optimal storing structure for an EC2 instance?

I'm developing a website in EC2 and I have a lampp server in the original /opt/lampp folder. The thing is that I store all my images related to the website including users' profile images there(/opt/lampp/htdocs). I doubt this is the most efficient way? I have links to the images in my MySQL server.
I actually have no idea of what Amazon EBS and Amazon S3 is, how can I utulize it?
EBS is like an external usb hard drive. You can easily access content from filesystem (/mnt/).
S3 is more like an API based cloud storage. You'll have much more work to integrate it into your system.
You have a pretty good summary here :
http://www.differencebetween.net/technology/internet/difference-between-amazon-s3-and-amazon-ebs/
Google has a lot of infos about this.