Amazon ec2 and s3 [duplicate] - amazon-s3

Closed 12 years ago as a possible duplicate of: How we can mount amazon s3 on amazon ec2
Hi,
I have an Amazon EC2 account and an Amazon S3 account. I want to store some files in S3 and retrieve them for some computation on EC2. How can I upload files into S3 buckets, and how can I access those files from EC2? How do I make a connection between the two, and how do I locate S3?

Everything is done through standard HTTP methods: GET, PUT, etc.
Amazon has produced some very clear documentation explaining how to work with S3: http://docs.amazonwebservices.com/AmazonS3/latest/dev/
There are also open source libraries published for today's mainstream languages (PHP, .NET, Java, Ruby, Python, etc.). These can greatly reduce your development time, but it helps to read through the AWS docs to know what's happening behind the scenes (especially when something breaks).
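As a minimal sketch of the library route in Python, assuming the boto3 package (the answer above does not name a specific library) and placeholder bucket, key, and path names: the helpers below wrap the client's `upload_file` and `download_file` calls. On an EC2 instance the credentials would typically come from an attached IAM role, which is also an assumption here.

```python
def make_s3_client():
    """Create an S3 client (assumes boto3 is installed and credentials are configured)."""
    import boto3  # imported lazily so the helpers below do not require boto3 to be defined
    return boto3.client("s3")


def upload_file(s3, local_path, bucket, key):
    """Store a local file in the bucket under the given key (an HTTP PUT under the hood)."""
    s3.upload_file(Filename=local_path, Bucket=bucket, Key=key)


def download_file(s3, bucket, key, local_path):
    """Fetch an object from the bucket to a local path (an HTTP GET under the hood)."""
    s3.download_file(Bucket=bucket, Key=key, Filename=local_path)
```

From your own machine you might call `upload_file(make_s3_client(), "input.csv", "my-bucket", "data/input.csv")`, and on the EC2 instance `download_file(make_s3_client(), "my-bucket", "data/input.csv", "/tmp/input.csv")`. The bucket and file names are hypothetical.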

Related

Merging pdf files stored on Amazon S3

Currently I'm using pdfbox to download all my PDF files onto my server and then merge them together. It works perfectly fine but it's very slow, since I have to download them all.
Is there a way to perform all of this on S3 directly? I've been trying to find a way to do it, in Java or even in Python, but have been unable to.
I read the following:
Merging files on S3 Amazon
https://github.com/boazsegev/combine_pdf/issues/18
Is there a way to merge files stored in S3 without having to download them?
EDIT
I ended up using concurrent.futures.ThreadPoolExecutor with a maximum of 8 worker threads to download all the PDF files from S3.
Once all files were downloaded I merged them with pdfbox. Simple.
S3 is just a data store, so at some level you need to transfer the PDF files from S3 to a server and then back. You'll probably gain the best speed by doing your conversions on an EC2 instance located in the same region as your S3 bucket.
If you don't want to spin up an EC2 instance yourself just to do this then another alternative may be to make use of AWS Lambda, which is a compute service where you can upload your code and have AWS manage the execution of it.
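The thread-pool download described in the EDIT can be sketched roughly as follows. This is not the asker's actual code: `fetch_one` is a hypothetical callable that downloads a single key to a local path (with boto3 it could wrap the client's `download_file`), and the merge step with pdfbox is left out.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def download_all(keys, fetch_one, dest_dir, max_workers=8):
    """Download every key concurrently and return the local paths in the order of `keys`.

    `fetch_one(key, local_path)` is a caller-supplied function (hypothetical here)
    that performs the actual transfer of one object.
    """
    def task(key):
        # Files are named after the last path component of the key,
        # so keys with the same basename would overwrite each other.
        local_path = os.path.join(dest_dir, os.path.basename(key))
        fetch_one(key, local_path)
        return local_path

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order and re-raises any download error.
        return list(pool.map(task, keys))
```

Threads suit this job because the work is network-bound. Once `download_all` returns, the list of local paths can be handed to the merge step.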

Why do I need Amazon S3 and Cloudfront?

I've read a lot of articles stating that I should be using Amazon S3 in conjunction with the CDN Cloudfront. I'm currently not doing this. I'm simply using Cloudfront with my standard shared hosting package.
Is it OK to use Cloudfront on its own with my standard shared hosting package? Surely there is no added benefit to using S3 also as the files are already located within Cloudfront.
Any enlightenment on this is much appreciated.
Leigh
S3 allows you to do things like static web hosting, with logging and redirection, e.g. www.example.com redirects to example.com. You can then use CloudFront to place your assets as close to the end user as possible ("nearest edge location"). An excellent guide on how to do this is in the AWS docs. Two main things are that S3 supports HTTPS, and changes to files in S3 are reflected instantly. Because CloudFront is a CDN, you have to manually expire files if you change them, otherwise it could take up to 24 hours to reflect your changes.
http://docs.aws.amazon.com/gettingstarted/latest/swh/website-hosting-intro.html
A quick comparison between the two is given here:
http://www.bucketexplorer.com/documentation/cloudfront--amazon-s3-vs-amazon-cloudfront.html
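For the manual expiry mentioned above, a rough sketch in Python, assuming boto3's CloudFront client and its `create_invalidation` call; the distribution ID and paths are placeholders you would replace with your own.

```python
import time


def build_invalidation_batch(paths, caller_reference=None):
    """Build the InvalidationBatch structure for a list of paths (each starting with '/')."""
    items = list(paths)
    return {
        "Paths": {"Quantity": len(items), "Items": items},
        # CallerReference must be unique per request; a timestamp is one simple choice.
        "CallerReference": caller_reference or str(time.time()),
    }


def invalidate(cloudfront, distribution_id, paths):
    """Ask CloudFront to expire the given paths at its edge locations."""
    return cloudfront.create_invalidation(
        DistributionId=distribution_id,
        InvalidationBatch=build_invalidation_batch(paths),
    )
```

With boto3 installed, `invalidate(boto3.client("cloudfront"), "YOUR_DISTRIBUTION_ID", ["/css/site.css"])` would request the expiry. An alternative that avoids invalidations altogether is to put a version in the file name whenever a file changes.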
There is no problem with using CloudFront in front of your own origin server instead of S3.
There are some benefits of using S3:
Data transfer is faster between S3 and CloudFront
No need to worry about the stability and maintenance of the origin server
Multiple origin regions
There are also benefits if you use your own server:
Cost saving on S3 hosting (this depends on whether you need to pay for your own server)
Easy to customize should you need it
Control over data storage location for company/country regulation
So it all depends on your specific circumstances, such as how much you pay for your hosting package, whether you need low-level configuration of your origin server, and how sensitive your data is.
I would say that for the majority of small/medium projects, S3 is a perfect place to store data.

AWS S3 and AjaXplorer

I'm using AjaXplorer to give my clients access to a shared directory stored in Amazon S3. I installed the SDK, configured the plugin (http://ajaxplorer.info/plugins/access/s3/) and could upload and download files, but the upload size is limited to my host's PHP limit, which is 64MB.
Is there a way I can upload directly to S3 without going through my host, to improve speed and be bound by S3's limit rather than PHP's?
Thanks
I think that is not possible, because the file is first uploaded to the server through PHP and only then transferred to the bucket.
Maybe
The only way around this is to use some jQuery or JS that can bypass your server/PHP entirely and stream directly into S3. This involves enabling CORS and creating a signed policy on the fly to allow your uploads, but it can be done!
I ran into just this issue with some inordinately large media files for our website users that I no longer wanted to host on the web servers themselves.
The best place to start, IMHO is here:
https://github.com/blueimp/jQuery-File-Upload
A demo is here:
https://blueimp.github.io/jQuery-File-Upload/
This was written to upload+write files to a variety of locations, including S3. The only tricky bits are getting your MIME type correct for each particular upload, and getting your bucket policy the way you need it.
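The "signed policy on the fly" part could look roughly like this on the server side. This is a sketch in Python with boto3's `generate_presigned_post` rather than the PHP the question uses, and the bucket name, key, and size limit are placeholders. The browser then posts the file straight to S3 using the returned URL and form fields, so the host's PHP upload limit never comes into play.

```python
def make_upload_form(s3, bucket, key, max_bytes, expires_in=3600):
    """Return the URL and signed form fields a browser needs to POST a file directly to S3.

    The policy restricts the upload size to `max_bytes` and stops being valid
    after `expires_in` seconds.
    """
    return s3.generate_presigned_post(
        Bucket=bucket,
        Key=key,
        Conditions=[["content-length-range", 0, max_bytes]],
        ExpiresIn=expires_in,
    )
```

The result is a dict with `url` and `fields` entries; the fields go into the upload form as hidden inputs ahead of the file input. The bucket still needs a CORS configuration that allows POST from your site's origin.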

Best way of storing and retrieving files - BaaS or S3

We are facing the following dilemma:
Our mobile client application will authenticate users through a BaaS (Backend-as-a-Service) and will then need to send a file to the cloud, specifically to an Amazon EC2 server where the main processing will take place. Since processing may happen later, the files need to be stored (and there is also a prospect of keeping an archive of them for future use by the users). Which of the following would you suggest as the preferred way:
a) send the file to the EC2 server directly which will then issue an Amazon S3 request to save the file there
OR
b) store the file to the BaaS (which in our case is parse.com which uses S3 as its data-storage) and retrieve it later by the EC2 server
The cost of transferring a file from EC2 to S3 and back is 0 as long as both are in the same region, which is true in both cases a) and b). The problem is that each user needs to be mapped to the files he has access to, and a) and b) differ a lot in this respect.
So basically you are sending a file to EC2, and EC2 is either processing it and saving it to S3, or just saving it to S3? I used a very easy way of transferring data from EC2 to S3: s3fuse. You can basically mount an S3 bucket as a drive on EC2, so when you store something on that drive, it is automatically stored in S3 as well. Might be handy for you.
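For the user-to-file mapping in option a), one possible approach (an assumption, not something stated in the thread) is to encode the owner in the object key, so that the EC2 server can list a user's files by prefix. A sketch in Python, assuming boto3's `list_objects_v2`; the `users/<id>/` layout and the names are hypothetical.

```python
def user_prefix(user_id):
    """Key prefix under which all files of one user are stored (hypothetical layout)."""
    return f"users/{user_id}/"


def user_key(user_id, filename):
    """Full S3 key for a file owned by the given user."""
    return user_prefix(user_id) + filename


def list_user_files(s3, bucket, user_id):
    """Return the keys stored under the user's prefix.

    A single list_objects_v2 call returns at most 1000 keys; larger listings
    would need pagination, which is omitted here.
    """
    response = s3.list_objects_v2(Bucket=bucket, Prefix=user_prefix(user_id))
    return [obj["Key"] for obj in response.get("Contents", [])]
```

The trailing slash in the prefix matters: without it, user `1` would also match the keys of user `10`.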

What's the best way to serve images across an EC2 cluster on AWS?

We want to be able to have a folder that can securely serve images across a cluster of web servers. What's the best way to handle this with Amazon Web Services (AWS)? Amazon S3? Amazon Elastic Block Store (EBS)? Amazon Cloudfront?
EDIT: Answer no longer needed...thanks.
I'm not sure what your main goal is or if you have read about the services you ask about. But I will try to explain it as far as I've understood AWS and your choices:
S3 is STORAGE (with buckets and objects, a sort of folder structure with meta access)
EBS is a VOLUME (attached to an EC2 instance as an extra drive you can access like a local hard drive)
CloudFront is a WEB CACHE (you select which datacenters you want your content in, then you point it at an S3 bucket and Amazon will replicate the content for you)
So we only need to figure out what you mean by "securely" as there are two options as I see it:
You can protect buckets in S3, or set access levels per account, e.g. "administrator access" only or PUBLIC READABLE...
You can store the data in an EBS volume and keep it there; it is then very secure and NOT public. Note that a standard EBS volume attaches to one EC2 instance at a time, so sharing it among the servers would mean re-exporting it from that instance (for example over NFS).
You cannot protect "CloudFront" data, as it's controlled by the bucket permissions from S3...
Hope you can use this a little. I've not stated anything regarding SPEED or COST; that's for you to benchmark/test with your data requirements. :o)
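If "securely" means the images should stay private in S3 while the web servers hand out temporary links, one option is a presigned URL. A minimal sketch in Python, assuming boto3's `generate_presigned_url`; the bucket and key names are placeholders.

```python
def signed_image_url(s3, bucket, key, expires_in=300):
    """Return a time-limited GET URL for a private object.

    The bucket itself stays non-public; anyone holding the URL can fetch
    the image until `expires_in` seconds have passed.
    """
    return s3.generate_presigned_url(
        "get_object",
        Params={"Bucket": bucket, "Key": key},
        ExpiresIn=expires_in,
    )
```

Every server in the cluster can generate such links from the same bucket, so no shared disk is needed.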