Is it possible to set a maximum file (object) size using a bucket policy?
I found a similar question here, but there is no size limitation in the examples.
No, you can't do this with a bucket policy. Check the Element Descriptions page of the S3 documentation for an exhaustive list of the things you can do in a bucket policy.
However, you can specify a content-length-range restriction within a Browser Uploads policy document. This feature is commonly used for giving untrusted users write access to specific keys within an S3 bucket you control (e.g. user-facing media uploads), and it provides the appropriate tools for limiting the location, size, and data types that can be uploaded without needing to expose your S3 credentials.
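For example, here is a minimal sketch of generating such a policy with boto3's generate_presigned_post (the bucket name, key, and 10 MiB limit are placeholder assumptions):

import boto3

s3 = boto3.client('s3')

# Presigned POST policy that rejects any upload larger than 10 MiB.
post = s3.generate_presigned_post(
    Bucket='my-upload-bucket',
    Key='uploads/${filename}',
    Conditions=[
        ['content-length-range', 0, 10 * 1024 * 1024],
    ],
    ExpiresIn=3600,
)

# post['url'] and post['fields'] are what the browser's upload form submits.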
Digging through our Pulumi code, I see that it sets the hostedZoneId on the S3 bucket it creates and I don't understand what it's used for.
The bucket holds internal content (Pulumi state files) and is not set as a static website.
The Pulumi docs for the AWS S3 Bucket hostedZoneId attribute state only:
The Route 53 Hosted Zone ID for this bucket's region.
with a link to what appears to be an irrelevant page (looks like a copy-paste error since that link is mentioned earlier on the page).
S3 API docs don't mention the field either.
Terraform's S3 bucket docs, which Pulumi builds on and which are also a good reference for the S3 API in general, expose this as an output attribute, but not as an input.
Does anyone know what this attribute is used for?
I am using AWS S3 in a Ruby on Rails project to store images for my models. Everything is working fine. I was just wondering if it is okay/normal that if someone right-clicks an image, it shows the following URL:
https://mybucketname.s3.amazonaws.com/uploads/photo/picture/100/batman.jpg
Is this a hacking risk, letting people see your bucket name? I guess I was expecting to see a bunch of randomized letters or something. /Noob
Yes, it's normal.
It's not a security risk unless your bucket permissions allow unauthenticated actions like uploading and deleting objects by anonymous users (obviously, having the bucket name would be necessary if a malicious user wanted to overwrite your files) or your bucket name itself provides some kind of information you don't want revealed.
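If you want to verify this for your own bucket, a quick check with boto3 might look like the following sketch (the bucket name is a placeholder; note that get_public_access_block raises an error if no configuration has ever been set):

import boto3

s3 = boto3.client('s3')

# Check the Block Public Access settings for the bucket.
resp = s3.get_public_access_block(Bucket='my-bucket')
print(resp['PublicAccessBlockConfiguration'])

# Look for ACL grants to the anonymous AllUsers group.
acl = s3.get_bucket_acl(Bucket='my-bucket')
for grant in acl['Grants']:
    if grant['Grantee'].get('URI', '').endswith('AllUsers'):
        print('Anonymous grant:', grant['Permission'])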
If it makes you feel better, you can always associate a CloudFront distribution with your bucket -- a CloudFront distribution has a default hostname like d1a2b3c4dexample.cloudfront.net, which you can use in your links, or you can associate a vanity hostname with the CloudFront distribution, like assets.example.com, neither of which will reveal the bucket name.
But your bucket name, itself, is not considered sensitive information. It is common practice to use links to objects in buckets, which necessarily include the bucket name.
It appears as though I can only use tags at the bucket level in S3. That seems to make sense in a way, because you would likely only do billing at that kind of macro level. However, I can see a few use cases for tagging so that different folks get billed for different objects in the same bucket.
Can you tag individual S3 objects?
Object tagging is a new feature, announced in December 2016. From the announcement:
With S3 Object Tagging you can manage and control access for Amazon S3 objects. S3 Object Tags are key-value pairs applied to S3 objects which can be created, updated or deleted at any time during the lifetime of the object. With these, you’ll have the ability to create Identity and Access Management (IAM) policies, setup S3 Lifecycle policies, and customize storage metrics. These object-level tags can then manage transitions between storage classes and expire objects in the background.
See also: S3 » Objects » Object Tagging
At the moment, it doesn't look like you can search by tags, or that object tagging affects billing.
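As an illustration of the lifecycle use case from the announcement, a rule that acts only on tagged objects might be sketched with boto3 like this (the bucket name, tag, and transition are placeholder assumptions):

import boto3

s3 = boto3.client('s3')

# Transition only objects tagged CONF=No to Glacier after 30 days.
s3.put_bucket_lifecycle_configuration(
    Bucket='my-bucket',
    LifecycleConfiguration={
        'Rules': [
            {
                'ID': 'archive-tagged-objects',
                'Status': 'Enabled',
                'Filter': {'Tag': {'Key': 'CONF', 'Value': 'No'}},
                'Transitions': [{'Days': 30, 'StorageClass': 'GLACIER'}],
            },
        ],
    },
)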
It's not "tagging" for the purpose of AWS-side billing, but you can use object metadata to store whatever data you'd like for an object.
http://docs.amazonwebservices.com/AmazonS3/latest/dev/UsingMetadata.html
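A minimal sketch of attaching custom metadata at upload time with boto3 (all names are placeholders; note that metadata can only be set when an object is written or copied):

import boto3

s3 = boto3.client('s3')

# Custom entries are served back as x-amz-meta-* headers.
s3.put_object(
    Bucket='my-bucket',
    Key='report.pdf',
    Body=open('report.pdf', 'rb'),
    Metadata={'owner': 'team-billing', 'project': 'q4-report'},
)

head = s3.head_object(Bucket='my-bucket', Key='report.pdf')
print(head['Metadata'])  # {'owner': 'team-billing', 'project': 'q4-report'}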
Now, we can add tags to each object.
Using AWS S3API,
aws s3api put-object-tagging --bucket bucket_name --key key_name --tagging 'TagSet=[{Key=type,Value=text1}]'
We can also add tags to objects using the Python API. The following code snippet adds tags to all objects in a bucket; pass an object name if you want to tag just one object.
import boto3

# Credentials are resolved from the environment or shared config files;
# you can also pass them to boto3.Session() explicitly.
session = boto3.Session()

bucketName = 'bucketName'
bucket = session.resource('s3').Bucket(bucketName)
object_list = bucket.objects.all()

s3 = session.client('s3')
tagging = {'TagSet': [{'Key': 'CONF', 'Value': 'No'}]}

for obj in object_list:
    s3.put_object_tagging(
        Bucket=bucketName,
        Key=obj.key,
        Tagging=tagging,
    )
According to the documentation, you can only tag buckets:
Cost allocation tagging allows you to label S3 buckets so you can more easily track their cost against projects or other criteria.
This is consistent with what you can see in both the management console and the SDK documentation.
Of course, you could use folder/object metadata to do finer-grained "tagging" on your own, but I think you will find a better solution.
S3 tags are a new feature, released on 29 November 2016. Tags can be added to buckets and to individual objects. S3 tags are an exciting feature because you can keep business taxonomy data in them and even control access; see the release announcement for the S3 tag feature.
S3 tags can be added from the browser using the new S3 console. Assuming you are on the new console, select the item --> More --> Add tag.
To view tags, click on the object in the new console and view its properties.
The high-level AWS S3 CLI does not currently support tagging, but the S3 API (aws s3api) provides ways to add and read tags on an object: put-object-tagging and get-object-tagging.
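For completeness, reading tags back with boto3 might look like this (bucket and key names are placeholders):

import boto3

s3 = boto3.client('s3')

resp = s3.get_object_tagging(Bucket='bucket_name', Key='key_name')
for tag in resp['TagSet']:
    print(tag['Key'], '=', tag['Value'])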
I don't think you can tag individual items in S3 the same way you can generally tag resources.
However you can add metadata to items in S3 to identify them. You could then report on items with different types by either:
- Paging through items in the bucket (obviously rather slow) and collating any information you want about them (see the sketch after this list)
- Having an external metadata store in a database of your choice, which you could then use to run reports. For example how many items of different types, total size, etc. Of course anything you would want to report on would have to be added to the database first
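Here is a rough sketch of the first approach with boto3, collating object counts and sizes by a custom 'type' metadata entry (the bucket name and metadata key are placeholder assumptions; note the one HEAD request per object, which is why it's slow):

import boto3
from collections import defaultdict

s3 = boto3.client('s3')
totals = defaultdict(lambda: {'count': 0, 'bytes': 0})

paginator = s3.get_paginator('list_objects_v2')
for page in paginator.paginate(Bucket='my-bucket'):
    for obj in page.get('Contents', []):
        head = s3.head_object(Bucket='my-bucket', Key=obj['Key'])
        kind = head['Metadata'].get('type', 'unknown')
        totals[kind]['count'] += 1
        totals[kind]['bytes'] += obj['Size']

for kind, stats in totals.items():
    print(kind, stats)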
I would definitely be interested in any better solutions though!
Yes, you can tag objects...but not for cost allocation:
Perhaps it is important to draw a distinction between cost allocation tags, and labelling objects with tags. To quote the Amazon documentation: "Cost allocation tags can only be used to label buckets. For information about tags used for labeling objects, see Object Tagging"
Labels: tagging an object in a bucket. These are much like metadata key-value pairs defined by the users themselves.
Just wondering if there is a recommended strategy for storing different types of assets/files: in separate S3 buckets, or all in one bucket? The different types of assets that I have include: static site images, user's profile images, and user-generated content like documents, files, and videos.
As far as how to group files into buckets goes, that is really not that critical an issue unless you want to have different domain names or CNAMEs for different types of content, in which case you would need a separate bucket for each domain name you want to use.
I would tend to group them by functionality. Perhaps static files used in your application that you have full control over you might deploy into a separate bucket from content that is going to be user generated. Or you might want to have video in a different bucket than images, etc.
To add to my earlier comments about S3 metadata: it is going to be a critical part of optimizing how you serve up content from S3/Cloudfront.
Basically, S3 metadata consists of key-value pairs. So you could have Content-Type as a key with a value of image/jpeg for example if the file is .jpg. This will automatically send appropriate Content-Type headers corresponding to your values for requests made directly to S3 URL or via Cloudfront. The same is true of Cache-Control metatags. You can also use your own custom metatags. For example, I use a custom metatag named x-amz-meta-md5 to store an md5 hash of the file. It is used for simple bucket comparisons against content stored in a revision control system, so we don't have to make checksums of each file in the bucket on the fly. We use this for pushing differential content updates to the buckets (i.e. only push those that have changed).
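A sketch of setting those headers and the custom md5 entry at upload time with boto3 (the bucket and key are placeholders; boto3 adds the x-amz-meta- prefix to Metadata keys automatically):

import hashlib

import boto3

s3 = boto3.client('s3')

body = open('bigimage1.jpg', 'rb').read()
s3.put_object(
    Bucket='my-bucket',
    Key='images/bigimage1.jpg',
    Body=body,
    ContentType='image/jpeg',                 # sent back as the Content-Type header
    CacheControl='public, max-age=31536000',  # long-lived edge/browser caching
    Metadata={'md5': hashlib.md5(body).hexdigest()},  # becomes x-amz-meta-md5
)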
As far as how revision control goes. I would HIGHLY recommend using versioned file names. In other words say you have bigimage.jpg and you want to make an update, call it bigimage1.jpg and change your code to reflect this. Why? Because optimally, you would like to set long expiration time frames in your Cache-Control headers. Unfortunately, if you then want to deploy a file of the same name and you are using Cloudfront, it becomes problematic to invalidate the edge caching locations. Whereas if you have a new file name, Cloudfront would just begin to populate the edge nodes and you don't have to worry about invalidating the cache at all.
Similarly for user-produced content, you might want to include an md5 or some other (mostly) unique identifier scheme, so that each video/image can have its own unique filename and place in the cache.
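One possible naming scheme along those lines, using a short content hash (the scheme itself is just an illustration):

import hashlib

def versioned_name(path):
    # Derive a content-addressed name so every revision gets a fresh cache entry.
    digest = hashlib.md5(open(path, 'rb').read()).hexdigest()[:8]
    stem, _, ext = path.rpartition('.')
    return f'{stem}.{digest}.{ext}'

print(versioned_name('bigimage.jpg'))  # e.g. bigimage.0a1b2c3d.jpg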
For your reference, here is a link to the AWS documentation on setting up streaming in Cloudfront:
http://docs.amazonwebservices.com/AmazonCloudFront/latest/DeveloperGuide/CreatingStreamingDistributions.html
I am syndicating my multimedia content (mp4 and images) out to several clients. So I create one S3 object for every mp4, say "my_content_that_pays_my_bills.mp4", and let the clients access the S3 URL for the objects and embed it wherever they want.
What I want is for client A to access this MP4 as "A_my_content_that_pays_my_bills.mp4"
and Client B to access this as "B_my_content_that_pays_my_bills.mp4" and so on.
I want to bill the clients by usage: so I could process access logs and count access to "B_my_content_that_pays_my_bills.mp4" and bill client B for usage.
I know that S3 allows only one key per object. So how do I get around this?
I don't know that you can alias file names in the way you'd like. Here are a couple of hacks I can think of for public files embedded freely by a customer:
1) Create one Cloudfront distribution per client, each pointing at the same bucket. Each AWS account can have 100 distributions, so you could support only that many clients. Or,
2) Duplicate the files, using the client-specific names that you'd like. This is simpler, but your file storage costs scale with your clients (which may or may not be significant).
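For option 2, the duplication can at least be done server-side so nothing is re-uploaded; a minimal sketch with boto3 (the bucket name and client list are placeholders):

import boto3

s3 = boto3.client('s3')

clients = ['A', 'B']
source_key = 'my_content_that_pays_my_bills.mp4'

# Server-side copy: one object per client, no re-upload from your machine.
for client in clients:
    s3.copy_object(
        Bucket='media-bucket',
        Key=f'{client}_{source_key}',
        CopySource={'Bucket': 'media-bucket', 'Key': source_key},
    )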