Unique challenge of an S3 bucket policy for 'grant/restrict' access permissions

I read the directions for posting, so I will be as specific as possible.
I have an S3 bucket with numerous FLV files that I will be allowing customers to stream on THEIR domains.
What I am trying to accomplish is:
1 - Setting a bucket policy that 'GRANTS' access to a list of specific domains, so they can stream my bucket files from their sites.
2 - A bucket policy that restricts each customer to 'one stream' per domain. In other words, each domain listed in the above policy can only stream one file at a time on its site.
The premise is a video site where customers will be streaming videos specific to their niche. I host and deliver the videos, but need some control over their delivery.
All files are in ONE bucket. There aren't any weird things going on with the files. It's very straightforward.
I just need the bucket policy control that would Grant and also Restrict the ability of my customers to stream my content from their domains.
I PRAY I have been clear enough, but please don't hesitate to ask if I have confused you...
Thanks VERY much
A

I don't think you can achieve what you want by simply setting access permissions to the bucket.
I checked in AccessControlList and CannedAccessControlList.
Your best bet will be to write a webservice wrapper to access the bucket data.
You will have better control over the data you serve, and you might also explore the option of serving a cached copy of the data for better performance.
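For illustration, here is a minimal sketch of such a wrapper in Python with boto3; the bucket name, the allow-list, and the per-domain bookkeeping are all hypothetical. The point is just that your own service decides who may stream what before handing out a short-lived pre-signed URL:

```python
import boto3

s3 = boto3.client("s3")

BUCKET = "my-flv-bucket"  # hypothetical bucket name
ALLOWED_DOMAINS = {"customer-one.example", "customer-two.example"}  # hypothetical allow-list

def get_stream_url(requesting_domain, key):
    """Return a short-lived URL for the file, or None if the domain isn't allowed."""
    if requesting_domain not in ALLOWED_DOMAINS:
        return None
    # Here you could also check your own records to enforce
    # "one active stream per domain" before handing out a URL.
    return s3.generate_presigned_url(
        "get_object",
        Params={"Bucket": BUCKET, "Key": key},
        ExpiresIn=300,  # URL is valid for 5 minutes
    )
```

S3 itself won't count concurrent streams, so the "one stream per domain" rule would have to be enforced by your wrapper's own state (for example, a record of active sessions per domain).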

Related

Difference between Data Transfer and GET request for Amazon S3

I was looking at my billing and noticed that Data Transfer made up almost 100% of my bill, so I want to be sure I understand exactly what Data Transfer entails versus a GET request.
Just for context, I host my website on a different server and have it hooked up to an S3 bucket to store user-generated files. These files are then made available for download. Does Data Transfer just cover the bandwidth used to download a file, or does it also cover displaying one of the files stored in my S3 bucket on my site? So for example, if I store an mp3 file in my S3 bucket and display it on the site to play (excluding downloading), is that just a GET request that's being sent to fetch and display the file? To me the definitions are a little ambiguous. Any help!?
The GET per-request charge is the charge for handling the actual request for the file (checking whether it exists, checking permissions, fetching it from storage, and preparing to return it to the requester), each time it is downloaded.
The data transfer charge is for the actual transfer of the file's contents from S3 to the requester, over the Internet, each time it is downloaded.
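As a rough worked example (using illustrative rates on the order of $0.004 per 10,000 GET requests and $0.09 per GB transferred out; check the current pricing page for real numbers): a 5 MB mp3 downloaded 10,000 times incurs about $0.004 in request charges, but roughly 50 GB of data transfer, around $4.50, which is why data transfer tends to dominate the bill.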
If you include a link to a file on your site but the user doesn't download it and the browser doesn't load it to automatically play, or pre-load it, or something like that, S3 would not know anything about that, so you wouldn't be billed. That's also true if you are using pre-signed URLs -- those don't result in any billing unless they're actually used, because they're generated on your server.
If you include an image on a page, and the image is in S3, every time the page is viewed, you're billed for the request and the transfer, unless the browser has cached the image.
If you use CloudFront in front of S3, so that your image or download links point to CloudFront, you would pay only the request charge from S3, not the transfer charge, because CloudFront would bill you for the transfer instead of S3 (plus a CloudFront per-request charge; but since CloudFront's data transfer charges are slightly cheaper than S3's in some regions, it's not necessarily a bad deal, by any means).

Using Dropbox API for (subscription) content delivery

I run a multi-gigabyte audio content subscription service. Right now all of our clients get download links via email for all of the content.
I had an idea of employing the Dropbox API after a "successful charge" webhook and giving (read-only) access to a shared Dropbox folder with all of the content. That way, the customer would stay in sync with all updates, changes etc...
The way I picture it, the user checks out and is immediately asked if he would like to add our company's folder to his/her Dropbox.
Does this seem feasible/practical?
Looking at the API, I only see an option to provide a download link but not an actual shared folder. Am I correct in this observation?
That's correct, the Dropbox API doesn't currently offer any API calls for managing shared folders. It only has a way to get the read-only share links like you mentioned.
However, if you'd be interested in potentially participating in a shared folder API beta in the future, please sign up here.
Greg's answer is correct, but I thought I'd mention a couple of other options:
You could use the Saver to let users save the files directly into their Dropbox. This wouldn't help you to push new content to them—they'd still have to visit your site to save the new files—but it would let you cut down on your bandwidth costs, since Dropbox would cache the files for you.
You could use a combination of /copy_ref and /fileops/copy to copy the contents from a central Dropbox account into each user's Dropbox. This wouldn't use any of your bandwidth (once the file was in the central Dropbox account).
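As a rough sketch of that second option, assuming the v1 REST endpoints those calls belong to (the access tokens and file path below are placeholders):

```python
import requests

CENTRAL_TOKEN = "CENTRAL_ACCOUNT_ACCESS_TOKEN"   # token for your central Dropbox account (placeholder)
CUSTOMER_TOKEN = "CUSTOMER_ACCESS_TOKEN"         # token for the paying customer's account (placeholder)

def copy_to_customer(path):
    # 1. Get a copy_ref for the file in the central account.
    ref = requests.get(
        "https://api.dropbox.com/1/copy_ref/auto" + path,
        headers={"Authorization": "Bearer " + CENTRAL_TOKEN},
    ).json()["copy_ref"]

    # 2. Use fileops/copy in the customer's account with from_copy_ref,
    #    which copies the file server-side, without using your bandwidth.
    return requests.post(
        "https://api.dropbox.com/1/fileops/copy",
        headers={"Authorization": "Bearer " + CUSTOMER_TOKEN},
        data={"root": "auto", "from_copy_ref": ref, "to_path": path},
    ).json()

# copy_to_customer("/albums/new-release.mp3")  # hypothetical path
```

The copy happens entirely inside Dropbox, so none of the audio bytes pass through your servers.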
Please note, however, that free Dropbox accounts only start with 2GB of storage space. Since you mentioned "multi-gigabyte," you'll need to keep in mind whether your customers will actually have sufficient Dropbox space to store the files you want to share with them. (Even if you were able to use a shared folder, they would need to have enough space left to accept the shared folder invitation.)

Google Drive to be used as our SaaS storage

I've seen the recent Google Drive pricing changes and they are amazing.
1 TB in Google Drive = $9.99
1 TB in Amazon S3 = $85 ($43 if you have more than 5000 TB with them)
This changes everything!
We have a SaaS website in which we keep customers' files. Does anyone know if Google Drive can be used for this kind of files/service, or is it just for personal use?
Does it have a robust API for uploading, downloading, and creating public URLs to access files, as S3 has?
Edit: I saw the SDK here (https://developers.google.com/drive/v2/reference/). The main concern is whether this service can be used for keeping customers' files, I mean, a SaaS website offering a service and keeping files there.
This doesn't really change anything.
“Google Drive storage is for users and Google Cloud Storage is for developers.”
— https://support.google.com/a/answer/2490100?hl=en
The analogous service with comparable functionality to S3 is Google Cloud Storage, which is remarkably similar to S3 in pricing.
https://developers.google.com/storage/pricing
Does anyone know if Google Drive can be used for this kind of files/service, or is it just for personal use?
Yes you can. That's exactly why the Drive SDK exists. You can either store files under the user's own account, or under an "app" account called a Service Account.
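For what it's worth, a minimal sketch of uploading a file into a Service Account's own storage with the Drive v2 Python client might look like the following; the key file and file names are placeholders:

```python
import httplib2
from oauth2client.service_account import ServiceAccountCredentials
from googleapiclient.discovery import build
from googleapiclient.http import MediaFileUpload

# Authenticate as the app's Service Account (hypothetical key file).
creds = ServiceAccountCredentials.from_json_keyfile_name(
    "service-account.json",
    ["https://www.googleapis.com/auth/drive"],
)
drive = build("drive", "v2", http=creds.authorize(httplib2.Http()))

# Upload a customer file into the Service Account's own Drive storage.
uploaded = drive.files().insert(
    body={"title": "customer-report.pdf"},            # hypothetical file name
    media_body=MediaFileUpload("customer-report.pdf", resumable=True),
).execute()

print(uploaded["id"], uploaded.get("webContentLink"))
```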
Does it have a robust API for uploading, downloading, and creating public URLs to access files, as S3 has?
"Robust" is a bit subjective, but there is certainly an API.
There are a number of techniques you can use to access the stored files. Look at https://developers.google.com/drive/v2/reference/files to see the various URLs which are provided.
For true public access, you will probably need to have the files under a public directory. See https://support.google.com/drive/answer/2881970?hl=en
NB. If you are in the TB space, be very aware that Drive has a bunch of quotas, some of which are unpublished. Make sure you test any proof of concept at full scale.
Sorry to spoil your party, but before you get too excited, look at this issue. It is in Google's own product, and has been active since November 2013 (i.e. 4 months). Now imagine re-syncing a few hundred GB of files once in a while. Or better, ask your customers to do it with their files after you recommended Drive to them.

Can I restrict a S3 bucket's size?

I've been looking all over and I can't find a yes or no answer. Can I restrict a bucket in S3 to a specific size?
If so, could you please point me into the right direction in doing so? Thanks.
Well, you can do this from within the application you are building: if you need to know the size of the bucket in question, you can use the AWS API to get it. However, there seems to be no way to accomplish this from within the AWS dashboard, nor can it be done with an S3 bucket policy.
Bummer, because I think this would be a great feature as well.
My advice is to be careful about which applications are uploading content to your S3 bucket, or to have your application use the AWS API to check the bucket size before inserting content. This is not ideal, however.
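If you go the application-side route, a minimal sketch with boto3 (the bucket name and the 5 GB cap are hypothetical) would be to total up the object sizes before each upload:

```python
import boto3

s3 = boto3.client("s3")
BUCKET = "my-bucket"          # hypothetical bucket name
LIMIT_BYTES = 5 * 1024 ** 3   # hypothetical 5 GB cap

def bucket_size(bucket):
    """Sum the size of every object in the bucket (slow for huge buckets)."""
    total = 0
    paginator = s3.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket):
        for obj in page.get("Contents", []):
            total += obj["Size"]
    return total

def upload_if_room(local_path, key):
    if bucket_size(BUCKET) >= LIMIT_BYTES:
        raise RuntimeError("Bucket has reached its size limit")
    s3.upload_file(local_path, BUCKET, key)
```

Listing every object on each upload is slow for large buckets, so in practice you would cache the running total somewhere, which is part of why this approach is not ideal.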

Possible to get image from Amazon S3 but create it if it doesn't exist

I'm not sure how to word the question but here is what I am looking to do.
I have a site that uses custom map tile overlays on a google map.
The javascript calls a php file on my server that checks to see if an existing map tile exists for the given x, y, and zoom level.
If it exists, it displays that image using file_get_contents.
If it doesn't exist, it creates the new tile then displays it.
I would like to utilize Amazon S3 to store and serve the images, since there could end up being a lot of them and my server is slow. If I have my script check to see if the image exists on Amazon and then display it, I am guessing I am not getting the speed benefits of Amazon's CDN. Is there a way to do this?
Or is there a way to try and pull the file from Amazon first, then set up something on Amazon to redirect to my script if the file's not there?
Maybe host the script on another of Amazon's services? The tile generation is also quite slow in some cases.
Thanks
Ideas:
1 - Use CloudFront, but point it to a cluster of tile generation machines. This way, you can generate the tiles on demand, and any future requests are served right from CloudFront.
2 - Use CloudFront, but back it with an S3 store of generated tiles. Turn on logging for the S3 bucket so you can detect failed requests. Consume those logs on a schedule and generate the missing tiles. This is a cheaper way of generating tiles, but means that when a tile is missing the user gets nothing.
3 - Just pre-generate all the tiles. Throw tasks in an SQS queue, then spin up a collection of EC2 instances to generate the tiles. This will cost the most up front, but all users get a fast experience.
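If you keep the original "check, then generate on a miss" flow from the question but move the storage to S3, a rough Python sketch could look like the following (the bucket name and the generate_tile helper are placeholders; your existing script is PHP, but the S3 calls are analogous):

```python
import boto3
from botocore.exceptions import ClientError

s3 = boto3.client("s3")
BUCKET = "my-tile-bucket"   # hypothetical bucket name

def serve_tile(x, y, zoom):
    key = "tiles/%d/%d/%d.png" % (zoom, x, y)
    try:
        # If the tile already exists, just return its URL.
        s3.head_object(Bucket=BUCKET, Key=key)
    except ClientError as err:
        if err.response["Error"]["Code"] not in ("404", "NoSuchKey"):
            raise
        # Otherwise generate it once, upload it, and serve it from then on.
        png_bytes = generate_tile(x, y, zoom)   # your existing tile code (hypothetical)
        s3.put_object(Bucket=BUCKET, Key=key, Body=png_bytes,
                      ContentType="image/png")
    return "https://%s.s3.amazonaws.com/%s" % (BUCKET, key)
```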
I've written a blog post with a strategy for dealing with this. It's designed to make intelligent and thrifty use of CloudFront, maximize caching and deal with new versions of existing images. You may find the technique described there helpful. The example code shows how to handle different dimensions (i.e. thumbnails) of images. You could modify it to handle different zoom levels.
I need to update that post to support CloudFront custom origins, and I think that for your application you might be better off skipping S3 and using a custom origin. The advantage of a custom origin is simply that it's probably going to be easier to manage all of your images on your local filesystem compared to managing them on S3.