I know that there are plenty of services and tools to read logs.
Here is the problem:
If I'm planning to charge my users by their usage, should I create a bucket for each of them, or can I create a folder for each of them and log something like per-folder bandwidth or usage?
I also want to use a CloudFront distribution. If I create a bucket for each user, how do I manage the CloudFront distributions too? (Is every distribution associated with a specific bucket?)
I want to hear your suggestions.
Thanks for any ideas.
Each AWS account can have a maximum of 100 buckets, so unless you plan on having fewer than 100 users, you'll need to use folders (key prefixes) within buckets.
http://docs.amazonwebservices.com/AmazonS3/latest/dev/BucketRestrictions.html
"planning to charge my users by their usage"
I think you can use Bucket Tagging, which can also help with AWS cost allocation for customer bills. See the About Bucket Tagging documentation.
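If it helps, here is a minimal boto3 sketch of tagging a per-customer bucket; the bucket name and tag key are illustrative, not prescribed:

```python
# Minimal sketch: tag a bucket so its costs can be attributed to a customer.
# Assumes boto3 is installed and AWS credentials are configured.
import boto3

s3 = boto3.client("s3")

s3.put_bucket_tagging(
    Bucket="example-user-bucket",  # hypothetical bucket name
    Tagging={
        "TagSet": [
            {"Key": "customer-id", "Value": "user-42"},  # illustrative tag
        ]
    },
)
```

Note that tags only show up in billing reports after being activated as cost allocation tags in the Billing console.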
Hi, is there any version of Amazon Web Services that provides unlimited buckets?
No. From the S3 documentation:
Each AWS account can own up to 100 buckets at a time.
S3 buckets are expensive (in terms of resources) to create and destroy:
The high availability engineering of Amazon S3 is focused on get, put, list, and delete operations. Because bucket operations work against a centralized, global resource space, it is not appropriate to make bucket create or delete calls on the high availability code path of your application. It is better to create or delete buckets in a separate initialization or setup routine that you run less often.
There's also no good reason to use lots of buckets:
There is no limit to the number of objects that can be stored in a bucket and no variation in performance whether you use many buckets or just a few. You can store all of your objects in a single bucket, or you can organize them across several buckets.
You want a separate space for each of your users to put things. Fine: create a single bucket and give your user-specific information a <user_id>/ prefix. Better yet, put it in users/<user_id>/, so you can use the same bucket for other non-user-specific things later, or change naming schemes, or anything else you might want.
ListObjects accepts a prefix parameter (users/<user_id>/), and has special provisions for hierarchical keys that might be relevant.
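As a rough sketch of what that looks like with boto3 (the bucket name is illustrative), here is a prefixed listing that also totals a user's storage, which is handy if you bill by usage:

```python
# Minimal sketch: list one user's objects via a key prefix and sum their size.
import boto3

s3 = boto3.client("s3")
user_id = "42"

# Page through every object under this user's prefix.
paginator = s3.get_paginator("list_objects_v2")
total_bytes = 0
for page in paginator.paginate(Bucket="app-media-bucket", Prefix=f"users/{user_id}/"):
    for obj in page.get("Contents", []):  # "Contents" is absent on empty pages
        total_bytes += obj["Size"]

print(f"user {user_id} stores {total_bytes} bytes")
```

Passing Delimiter="/" as well makes the listing behave hierarchically, returning "subfolders" as CommonPrefixes instead of flat keys.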
Will is correct. For cases where you can really prove a decent use case, I would imagine that AWS would consider bumping your quota (as they will do for almost any service). I wouldn't be surprised if particular users have 200-300 buckets or more, but not without justifying it on good grounds to AWS.
With that said, I cannot find any S3 Quota Increase form alongside the other quota increase forms.
I'm setting up my client with a system that allows users to upload a video or two. These videos will be stored on Amazon S3, which I've not used before. I'm unsure about buckets and what they represent. Should I have a single bucket for my application, a bucket per user, or a bucket per file?
If I were to just have the one bucket, presumably I'd have to have really long, illogical file names to prevent a file name clash.
There is no limit to the number of objects you can store in a bucket, so generally you would have a single bucket per application, or even one shared across multiple applications. Bucket names have to be globally unique across S3, so it would certainly be impossible to manage a bucket per object. A bucket per user would also be difficult if you had more than a handful of users.
For more background on buckets you can try reading Working with Amazon S3 Buckets
Your application should generate unique keys for the objects you add to the bucket. Try to avoid ascending numeric IDs, as these are considered inefficient; simply reversing a numeric ID can usually make an effective object key. See Amazon S3 Performance Tips & Tricks for a more detailed explanation.
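For illustration, a minimal sketch of that key scheme in Python (the videos/ prefix and helper names are my own, not from the linked article):

```python
# Sketch of the key scheme described above: reverse the numeric id so keys
# are spread across the keyspace instead of sharing a hot ascending prefix.
import uuid

def object_key(video_id: int, filename: str) -> str:
    reversed_id = str(video_id)[::-1]  # 123456 -> "654321"
    return f"videos/{reversed_id}/{filename}"

# Alternatively, a random UUID prefix avoids name clashes entirely
# (this variant is my assumption, not from the article):
def random_key(filename: str) -> str:
    return f"videos/{uuid.uuid4()}/{filename}"

print(object_key(123456, "clip.mp4"))  # videos/654321/clip.mp4
```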
I have two WordPress blogs, and I am planning to use Amazon S3 with one blog and Amazon S3 + CloudFront with the other.
I read that we need to choose a location when we start our AWS account.
However, for one site (the one using CloudFront and Amazon S3), my target market is the US and UK, and for the other site (using Amazon S3 alone), my target market is India.
In this case, should I use two separate accounts? Or can I have one single account with two locations (US and Asia)?
The one I am using CloudFront for will have video streaming, and the one where I use S3 alone will be heavy on images.
Thank you in advance
You can have multiple locations within the same account. When creating a bucket, you are given a choice of which region to create it in, so you can have different buckets within the same account located in different regions.
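A minimal boto3 sketch of that, with illustrative bucket names and regions (and assuming the client's default region is us-east-1):

```python
# Two buckets in one account, created in different regions.
import boto3

s3 = boto3.client("s3")

# us-east-1 is the default region and takes no LocationConstraint.
s3.create_bucket(Bucket="example-blog-useu")

# A second bucket in Singapore, closer to Indian visitors.
s3.create_bucket(
    Bucket="example-blog-india",
    CreateBucketConfiguration={"LocationConstraint": "ap-southeast-1"},
)
```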
Thanks,
Andy
I am migrating my Java, Tomcat, MySQL server to AWS EC2.
I have already attached an EBS volume for storing the MySQL data. In my web application, people may upload images, so I need to persist them. There are two alternatives in my mind:
Save uploaded images to EBS volume.
Use the S3 service.
The following are my notes; please be skeptical about them, as my expertise is in software development, not servers.
EBS plus: S3 storage is more expensive ($0.15/GB > $0.10/GB).
S3 plus: Serving static files from EBS may hurt my web server's performance. Is this true? Does serving images affect server performance notably? With S3, my server is not responsible for serving static files.
S3 plus: Serving static files from EBS incurs I/O costs, though probably minor ones.
EBS plus: People say EBS is faster.
S3 plus: People say S3 is safer for persistence.
EBS plus: No need to learn an API; it is straightforward to save the images to an EBS volume.
In short, I cannot decide, and would be happy if you could guide me.
Thanks
The price comparison is not quite right:
S3 charges are $0.14 per GB USED, whereas EBS charges are $0.10 per GB PROVISIONED (the size of your EBS volume), whether you use it or not. As a result, S3 may or may not be cheaper than EBS.
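To make that concrete with the rates above: a 100 GB EBS volume costs $10/month whether it holds 10 GB or 90 GB of images, while storing those same images in S3 would cost $1.40/month and $12.60/month respectively. S3 wins while your data is small relative to the provisioned volume, and loses once the volume is nearly full.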
I'm currently using S3 for a project and it's working extremely well.
EBS means you need to manage a volume + machines to attach it to. You need to add space as it's filling up and perform backups (not saying you shouldn't back up your S3 data, just that it's not as critical).
It also makes it harder to scale: when you want to add machines, you either need to move the images off to a separate machine or clone them across all of them. That also adds a bottleneck: you'll have to manage your own upload process, which will either upload to every machine or have a single machine managing it.
I recommend S3: it's set-and-forget. Any number of machines can perform uploads in parallel, and you don't really need to notify other machines about the upload.
In addition, you can use Amazon CloudFront as a cheap CDN in front of the images instead of downloading directly from S3.
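A minimal boto3 sketch of that upload path (the bucket name and CloudFront domain are illustrative assumptions):

```python
# Upload an image to S3, then serve it via a CloudFront distribution
# that has been set up in front of the bucket.
import boto3

s3 = boto3.client("s3")

BUCKET = "example-app-images"
CDN_DOMAIN = "dxxxxxxxx.cloudfront.net"  # hypothetical distribution domain

def store_image(local_path: str, key: str) -> str:
    """Upload an image and return the CDN URL to serve it from."""
    s3.upload_file(local_path, BUCKET, key)
    return f"https://{CDN_DOMAIN}/{key}"

print(store_image("/tmp/photo.jpg", "images/photo.jpg"))
```

Because every app server calls the same function against the same bucket, no machine needs to know what the others have uploaded.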
I have architected solutions on AWS for stock-photography sites storing millions of images spanning terabytes of data. I would like to share some best practices in AWS for your requirement:
P1) Store the original image files using the S3 Standard storage option.
P2) Store reproducible images like thumbnails in the S3 Reduced Redundancy Storage (RRS) option to save costs.
P3) Metadata about the images, including the S3 URL, can be stored in Amazon RDS or Amazon DynamoDB, depending on query complexity, and queried from there. If your queries are complex, it is also common practice to store the metadata in Amazon CloudSearch or Apache Solr.
P4) Deliver your thumbnails to users with low latency using Amazon CloudFront.
P5) Queue your image conversions through either SQS or RabbitMQ on Amazon EC2 (see the sketch after this list).
P6) If you are planning to use EBS, note that a volume cannot be shared across EC2 instances. Ideally, use GlusterFS as a common storage pool for all your images; multiple auto-scaled Amazon EC2 instances can still connect to it and read/write images.
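As a rough illustration of P5, a boto3 sketch that enqueues a conversion job on SQS (the queue name and message shape are assumptions, not a fixed format):

```python
# Enqueue a thumbnail-generation job; worker EC2 instances poll the queue
# and perform the actual conversion asynchronously.
import json
import boto3

sqs = boto3.client("sqs")
queue_url = sqs.get_queue_url(QueueName="image-conversion")["QueueUrl"]

sqs.send_message(
    QueueUrl=queue_url,
    MessageBody=json.dumps({
        "bucket": "example-photos",       # hypothetical bucket
        "key": "originals/12345.jpg",     # object to convert
        "sizes": ["thumb", "medium"],     # target renditions
    }),
)
```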
You already outlined the advantages and disadvantages of both.
If you are planning to store terabytes of images, with storage requirements increasing day after day, S3 will probably be your best bet as it is built especially for these kinds of situations. You get unlimited storage space, without having to worry about sharding your data over many EBS volumes.
The recurring cost of S3 is that it is roughly 50% more expensive than EBS. You will also have to learn the API and implement it in your application, but that is a one-off expense which I think you should be able to absorb very quickly.
Do you expect the images to last indefinitely?
The Amazon EBS FAQ is pretty clear; the annual failure rate is not "essentially zero"; they quote 0.1% to 0.5%. It's better than the disk under your desk, but it would need some kind of backup.
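Put another way, a 0.1% to 0.5% annual failure rate means somewhere between 1 in 1,000 and 1 in 200 volumes failing in a given year; across even a modest fleet, an eventual failure is an expectation rather than bad luck.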
I'm trying to host videos on S3 and want to put data-transfer limits in place so I don't get charged for more than, say, 20 GB. Is there any way to do that?
Not that I know of.
But you could try issuing query-string-authenticated URLs to S3 resources with expiration dates set, and then use the server access logs to track total downloads. If you miss by a GB or so, it's not going to cost too much :-)
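In boto3 terms (presigned URLs are the current name for query-string authentication), a minimal sketch with an illustrative bucket and key:

```python
# Generate a time-limited download link for a single video object.
import boto3

s3 = boto3.client("s3")

url = s3.generate_presigned_url(
    "get_object",
    Params={"Bucket": "example-videos", "Key": "videos/intro.mp4"},
    ExpiresIn=3600,  # the link stops working after one hour
)
print(url)
```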