How to store pointer to S3 objects in Amazon SimpleDB? - amazon-s3

I'm trying to figure out how to store a database consisting of metadata in Amazon SimpleDB, with the actual content the metadata refers to (videos) in S3. As I understand it, I should place a pointer in SimpleDB that refers to the videos in S3. What is this pointer, exactly? Is it the URL of the video located in S3?
Also, are there any code samples that would pertain to this?
Thanks!

You're right: just store the URL of the S3 object in SimpleDB and you're done.
What you're trying to do is listed as a use case: http://aws.amazon.com/en/simpledb/usecases_metadata_indexing/
Taking a look at the code library, you can filter by S3 or SimpleDB and you'll find examples like SimpleDB PHP Sample Program Set and Travel Log - Sample Java Web Application.
Regards.
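As a rough illustration of the "pointer" idea, here is a minimal Python sketch. It assumes the boto3 library, and the attribute names (title, s3_bucket, s3_key, s3_url), the bucket name and the helper functions are all made up for this example. Storing the bucket and key alongside the URL is a design choice, not something SimpleDB requires; it just makes it easier to rebuild the URL later.

```python
from urllib.parse import quote


def s3_url(bucket: str, key: str) -> str:
    """Build a virtual-hosted-style URL for an S3 object (the key is URL-encoded)."""
    return f"https://{bucket}.s3.amazonaws.com/{quote(key)}"


def video_attributes(bucket: str, key: str, title: str) -> list:
    """Return SimpleDB attributes: the metadata plus the pointer to the S3 object."""
    # Attribute names are hypothetical; pick whatever suits your schema.
    pointer = {
        "title": title,
        "s3_bucket": bucket,
        "s3_key": key,
        "s3_url": s3_url(bucket, key),
    }
    return [{"Name": n, "Value": v, "Replace": True} for n, v in pointer.items()]


def save_video_metadata(domain: str, item_name: str, attributes: list) -> None:
    """Write the item to SimpleDB. Needs boto3 and configured AWS credentials."""
    import boto3  # third-party; imported lazily so the helpers above work without it

    sdb = boto3.client("sdb")
    sdb.put_attributes(DomainName=domain, ItemName=item_name, Attributes=attributes)
```

Usage would look something like save_video_metadata("videos", "video-0001", video_attributes("my-videos", "clips/intro.mp4", "Intro")), where the domain and item names are again placeholders.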

Related

Proper way to name objects in mass storage service

As one of my personal projects moves forward, I wonder how I should organize the files (images, videos, audio files) uploaded by users onto AWS S3 or Google Cloud Storage. I'm used to seeing URLs like these:
Facebook fbcdn-sphotos-g-a.akamaihd.net/hphotos-ak-xft1/v/t1.0-9/11873531_1015...750483_5263546700711467249_n.jpg?oh=b3f06f7e...b7ebf7&oe=56392950&__gda__=1446569890_628...c7765669456
Tumblr 36.media.tumblr.com/686b47...e93fa09c2478/tumblr_nt7lnyP3ld1rqbl96o1_500.png
Twitter pbs.twimg.com/media/CMimixsV...AcZeM.jpg
Do these random characters carry some kind of meaning, or are they just "UUIDs"? Is there a performance or organization issue in using, for instance, a URL like the one below?
content.socialnetworkX.com/userY/post/customName_dinosaurs.jpg
EDIT: To be clear, I'm considering millions of files.
For S3, see the Performance Considerations page where it talks about object naming. Specifically, if you plan to upload objects at a high rate, you should avoid sequentially named objects, as they can be a bottleneck.
Google Cloud Storage does not have this performance bottleneck. See this answer.
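One commonly described way to avoid sequential names is to put a short hash in front of the natural key, so that consecutive uploads spread across the key space. The sketch below is only an illustration of that idea in Python; the function name, the prefix length and the use of SHA-256 are assumptions, not something the Performance Considerations page prescribes.

```python
import hashlib


def prefixed_key(natural_key: str, prefix_len: int = 4) -> str:
    """Prepend a short, deterministic hash so sequential names no longer sort together."""
    digest = hashlib.sha256(natural_key.encode("utf-8")).hexdigest()
    return f"{digest[:prefix_len]}/{natural_key}"


if __name__ == "__main__":
    # Sequential names get unrelated-looking prefixes.
    for n in range(3):
        print(prefixed_key(f"userY/post/{n:06d}.jpg"))
```

Because the prefix is derived from the name itself, the full key can always be recomputed from the human-readable path stored in your database.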

Software that supports S3 API from HTML Forms?

I've tried Riak CS and Walrus, and read a few others' documentation pages, but can't tell whether they would support this or not.
What I have is an application that uses S3 policies to allow the client to upload and download directly from their browser. I'm looking for a way to replace S3 (for some customers, who would prefer their data not in Amazon's cloud), without having to maintain two different branches of code everywhere I currently talk to S3.
Example of what I do now:
https://aws.amazon.com/articles/1434
Help would be greatly appreciated, I'm stumped!
What you want is called "POST Object". Check the documentation of each S3-compatible implementation to see whether it is supported.
For example, Walrus claims that it supports POST OBJECT. https://github.com/eucalyptus/eucalyptus/wiki/Walrus-S3-API
On the other hand, Riak CS does not support POST OBJECT http://docs.basho.com/riakcs/latest/references/apis/storage/s3/
P.S.: Many S3 implementations claim to be compatible with AWS S3. In fact, that is not true. If you want full compatibility, try searching for Cloudian.
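For context, a browser-based POST Object upload relies on a signed policy document that the server hands to the HTML form. The Python sketch below shows the older HMAC-SHA1 signing scheme described in the linked AWS article, using only the standard library; the function name, bucket and prefix are placeholders, and whether a given S3-compatible store accepts this scheme (or requires Signature Version 4 instead) is something to confirm in its documentation.

```python
import base64
import hashlib
import hmac
import json
from datetime import datetime, timedelta, timezone


def sign_post_policy(secret_key, bucket, key_prefix, max_bytes, ttl_minutes=10):
    """Return the base64 policy and its HMAC-SHA1 signature for an HTML upload form."""
    expiration = datetime.now(timezone.utc) + timedelta(minutes=ttl_minutes)
    policy = {
        "expiration": expiration.strftime("%Y-%m-%dT%H:%M:%SZ"),
        "conditions": [
            {"bucket": bucket},                       # upload must target this bucket
            ["starts-with", "$key", key_prefix],     # object key must begin with the prefix
            ["content-length-range", 0, max_bytes],  # cap the upload size
        ],
    }
    policy_b64 = base64.b64encode(json.dumps(policy).encode("utf-8")).decode("ascii")
    digest = hmac.new(secret_key.encode("utf-8"), policy_b64.encode("ascii"), hashlib.sha1).digest()
    return {"policy": policy_b64, "signature": base64.b64encode(digest).decode("ascii")}
```

The two returned values go into the form's policy and signature fields, next to key, AWSAccessKeyId and the file input. Since the form is the same whichever endpoint it posts to, only the form's action URL and the credentials should need to change per customer, provided the replacement store supports POST Object.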

Meteor Amazon S3 image upload with thumbnails

I'm using Meteor and would like to create a form with an image upload field that saves the uploaded file to an Amazon S3 bucket in its original size as well as multiple thumbnails sizes defined (passed) via the code.
So far I'm using the lepozepo:s3 package which works great but doesn't seem to allow options for generating additional thumbnails.
Given I can upload the original files onto S3 I'm considering looking into a service on Amazon that can generate the desired thumbnails and then notify my Meteor app. But I'm not sure how to achieve that.
Can anyone point me in the right direction or share some insight into the best approach for this?
PS: I want to avoid using Filepicker.io if possible.
Seems I was following the wrong path. CollectionFS has everything I need and more. I now have this working with plenty of scope to do more later. This is one brilliant collection of packages with clear guides on respective Github pages.
Here are the packages I ended up using:
cfs:standard-packages - base
cfs:gridfs - required for some reason, not sure why
cfs:graphicsmagick - thumbnailing/cropping
cfs:s3 - S3 upload
Code sample →
CollectionFS is now deprecated, but there are other options:
Only upload, without S3 integration*: https://github.com/tomitrescak/meteor-uploads
Use the jQuery-File-Upload (which is great), it generates thumbs, has size and format validation, etc. Using basically these two packages together:
https://atmospherejs.com/tomi/upload-jquery
https://atmospherejs.com/tomi/upload-server
You can use another package for S3 integration, such as:
https://github.com/peerlibrary/meteor-aws-sdk/
Upload + Integration with S3: https://github.com/Lepozepo/S3
Good, but if you need to generate thumbnails, for example, you will need to integrate it with another package or do it yourself. I have not tested it, but I got this suggestion: https://github.com/jamgold/cropuploader
Upload only, but with examples of how to generate thumbs or integrate with S3 / DropBox / GridFS /: https://github.com/VeliovGroup/Meteor-Files/
Rich documentation, and it does well what it sets out to do: upload images.
Use whichever best fits your needs.
Look at blueimp's "jQuery File Upload" for client-side and server-side image resizing. On the client your options are somewhat limited quality-wise; on the server you can use the full power of ImageMagick. Or look at my blog post on http://doctorllama.wordpress.com for file uploads in Meteor in general.
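Whichever package does the actual resizing, the "multiple thumbnail sizes passed via code" part usually comes down to a table of named sizes. The Python sketch below only computes the target dimensions and object keys; it does not touch any image data, and the size names, key naming scheme and function names are assumptions made for illustration.

```python
def fit_within(width, height, max_width, max_height):
    """Scale (width, height) down to fit the box, keeping aspect ratio; never upscale."""
    scale = min(max_width / width, max_height / height, 1.0)
    return max(1, round(width * scale)), max(1, round(height * scale))


def thumbnail_plan(key, width, height, sizes):
    """Map each thumbnail's object key to its target dimensions."""
    stem, dot, ext = key.rpartition(".")
    if not dot:  # key has no file extension
        stem, ext = key, ""
    return {
        f"{stem}_{name}{dot}{ext}": fit_within(width, height, max_w, max_h)
        for name, (max_w, max_h) in sizes.items()
    }


if __name__ == "__main__":
    sizes = {"small": (200, 200), "medium": (800, 800)}  # hypothetical size table
    print(thumbnail_plan("uploads/cat.jpg", 1600, 1200, sizes))
```

The resulting plan can then be handed to GraphicsMagick, ImageMagick or any other resizer, which produces one file per entry to upload next to the original.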
cfs:gridfs - required for some reason, not sure why
Meteor uses GridFS to store file chunks inside the Mongo database. In the case of S3, it serves as temporary storage.

Can I restrict a S3 bucket's size?

I've been looking all over and I can't find a yes or no answer. Can I restrict a bucket in S3 to specific size?
If so, could you please point me into the right direction in doing so? Thanks.
You can do this from within the application you are building: the AWS API lets you work out the current size of the bucket, so your code can enforce a limit against it. However, there seems to be no way to accomplish this from within the AWS dashboard, nor can it be done with an S3 bucket policy.
Bummer, because I think this would be a great feature as well.
My advice is to be careful about which applications upload content to your S3 bucket, or to have your application check the bucket size through the AWS API before inserting content. This is not ideal, however.
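A minimal Python sketch of that "check before inserting" approach, assuming the boto3 library; the function names and the quota value are made up for this example. It sums the sizes of all objects in the bucket, which can be slow for large buckets, and the check is not atomic: two concurrent uploads can both pass it.

```python
def fits_in_quota(existing_sizes, new_object_size, quota_bytes):
    """Return True if adding the new object keeps the bucket within the quota."""
    return sum(existing_sizes) + new_object_size <= quota_bytes


def bucket_object_sizes(bucket):
    """Yield the size in bytes of every object in the bucket. Needs boto3 and credentials."""
    import boto3  # third-party; imported lazily so fits_in_quota works without it

    paginator = boto3.client("s3").get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket):
        for obj in page.get("Contents", []):  # "Contents" is absent for an empty bucket
            yield obj["Size"]
```

An application would call something like fits_in_quota(bucket_object_sizes("my-bucket"), upload_size, 5 * 1024**3) and refuse the upload when it returns False.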

Possible to get image from Amazon S3 but create it if it doesn't exist

I'm not sure how to word the question but here is what I am looking to do.
I have a site that uses custom map tile overlays on a google map.
The javascript calls a php file on my server that checks to see if an existing map tile exists for the given x, y, and zoom level.
If it exists, it displays that image using file_get_contents.
If it doesn't exist, it creates the new tile then displays it.
I would like to use Amazon S3 to store and serve the images, since there could end up being a lot of them and my server is slow. If I have my script check whether the image exists on Amazon and then display it, I am guessing I won't get the benefits of the speed and Amazon's CDN. Is there a way to do this?
Or is there a way to try to pull the file from Amazon first, then set up something on Amazon to redirect to my script if the file's not there?
Maybe host the script on another of Amazon's services? The tile generation is also quite slow in some cases.
Thanks
Ideas:
1 - Use CloudFront, but point it to a cluster of tile generation machines. This way, you can generate the tiles on demand, and any future requests are served right from Cloudfront.
2 - Use CloudFront, but back it with an S3 store of generated tiles. Turn on logging for the S3 bucket, so you can detect failed requests. Consume those logs on a schedule, and generate the missing tiles. This results in a cheaper way of generating tiles, but means that when a tile is missing the user gets nothing.
3 - Just pre-generate all the tiles. Throw tasks in an SQS queue, then spin up a collection of EC2 instances to generate the tiles. This will cost the most up front, but all users get a fast experience.
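The first idea boils down to a "get or create" step at the origin, with CloudFront caching whatever the origin returns. A minimal Python sketch of that step follows; the key layout and function names are assumptions, and a plain dict stands in for the real tile store (S3 or a local filesystem).

```python
def tile_key(x, y, zoom):
    """Build a storage key for a tile (hypothetical layout)."""
    return f"tiles/{zoom}/{x}/{y}.png"


def get_or_create_tile(store, x, y, zoom, render):
    """Return the stored tile, rendering and saving it first if it is missing."""
    key = tile_key(x, y, zoom)
    tile = store.get(key)
    if tile is None:
        tile = render(x, y, zoom)  # the slow step; runs once per tile
        store[key] = tile
    return tile


if __name__ == "__main__":
    store = {}  # stand-in for S3 or a local cache directory
    fake_render = lambda x, y, zoom: f"tile-{zoom}-{x}-{y}".encode()
    get_or_create_tile(store, 3, 5, 12, fake_render)  # renders and stores the tile
    get_or_create_tile(store, 3, 5, 12, fake_render)  # served from the store
    print(sorted(store))
```

With a CDN in front, the origin only sees a request the first time a tile is asked for (or after it expires from the cache), so the slow rendering path is hit rarely.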
I've written a blog post with a strategy for dealing with this. It's designed to make intelligent and thrifty use of CloudFront, maximize caching and deal with new versions of existing images. You may find the technique described there helpful. The example code shows how to handle different dimensions (i.e. thumbnails) of images. You could modify it to handle different zoom levels.
I need to update that post to support CloudFront custom origins, and I think that for your application you might be better off skipping S3 and using a custom origin. The advantage of a custom origin is simply that it's probably going to be easier to manage all of your images on your local filesystem compared to managing them on S3.