How to make 10,000 files in S3 public - amazon-s3

I have a folder in a bucket with 10,000 files. There seems to be no way to upload them and make them public straight away. So I uploaded them all, they're private, and I need to make them all public.
I've tried the AWS console, but it just gives an error (it works fine with folders that contain fewer files).
I've tried using S3 organizing in Firefox, same thing.
Is there some software or some script I can run to make all these public?

You can generate a bucket policy (see example below) which gives access to all the files in the bucket. The bucket policy can be added to a bucket through AWS console.
{
  "Id": "...",
  "Statement": [
    {
      "Sid": "...",
      "Action": [
        "s3:GetObject"
      ],
      "Effect": "Allow",
      "Resource": "arn:aws:s3:::bucket/*",
      "Principal": {
        "AWS": [ "*" ]
      }
    }
  ]
}
Also look at following policy generator tool provided by Amazon.
http://awspolicygen.s3.amazonaws.com/policygen.html
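If you would rather generate the policy from a script than from the policy generator, here is a minimal Python sketch. It assumes boto3 is what you would pass in as s3_client; the function names build_public_read_policy and apply_policy, the Sid value and the bucket/prefix arguments are made up for illustration and are not from the answer above.

```python
import json


def build_public_read_policy(bucket, prefix=""):
    """Return a bucket policy dict that lets anyone GET objects under prefix."""
    # Hypothetical helper: the trailing * matches every key below the prefix.
    resource = f"arn:aws:s3:::{bucket}/{prefix}*"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "PublicReadGetObject",
                "Effect": "Allow",
                "Principal": "*",
                "Action": "s3:GetObject",
                "Resource": resource,
            }
        ],
    }


def apply_policy(s3_client, bucket, policy):
    """Attach the policy; s3_client is assumed to be boto3.client("s3")."""
    # put_bucket_policy expects the policy as a JSON string, not a dict.
    s3_client.put_bucket_policy(Bucket=bucket, Policy=json.dumps(policy))
```

Calling apply_policy(boto3.client("s3"), "my-bucket", build_public_read_policy("my-bucket", "path/")) would attach it. If Block Public Access is enabled on the bucket, S3 will reject a public policy until that setting is relaxed.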

If you are uploading for the first time, you can set the files to be public on upload on the command line:
aws s3 sync . s3://my-bucket/path --acl public-read
As documented in Using High-Level s3 Commands with the AWS Command Line Interface
Unfortunately it only applies the ACL when the files are uploaded. It does not (in my testing) apply the ACL to already uploaded files.
If you do want to update existing objects, you used to be able to sync the bucket to itself, but this seems to have stopped working.
[Not working anymore] This can be done from the command line:
aws s3 sync s3://my-bucket/path s3://my-bucket/path --acl public-read
(So this no longer answers the question, but leaving answer for reference as it used to work.)

I had to change several hundred thousand objects. I fired up an EC2 instance to run this, which makes it all go faster. You'll want to install the aws-sdk gem first.
Here's the code:
require 'rubygems'
require 'aws-sdk'

# Change this stuff.
AWS.config({
  :access_key_id => 'YOURS_HERE',
  :secret_access_key => 'YOURS_HERE',
})
bucket_name = 'YOUR_BUCKET_NAME'

s3 = AWS::S3.new()
bucket = s3.buckets[bucket_name]

# Walk every object in the bucket and switch its ACL to public-read.
bucket.objects.each do |object|
  puts object.key
  object.acl = :public_read
end

I had the same problem. The solution by @DanielVonFange is outdated now that a new version of the SDK is out.
Here is a code snippet that works for me right now with the AWS Ruby SDK:
require 'aws-sdk'

Aws.config.update({
  region: 'REGION_CODE_HERE',
  credentials: Aws::Credentials.new(
    'ACCESS_KEY_ID_HERE',
    'SECRET_ACCESS_KEY_HERE'
  )
})
bucket_name = 'BUCKET_NAME_HERE'

s3 = Aws::S3::Resource.new

# Walk every object in the bucket and apply the public-read canned ACL.
s3.bucket(bucket_name).objects.each do |object|
  puts object.key
  object.acl.put({ acl: 'public-read' })
end

Just wanted to add that with the new S3 Console you can select your folder(s) and select Make public to make all files inside the folders public. It works as a background task so it should handle any number of files.

Using the cli:
aws s3 ls s3://bucket-name --recursive > all_files.txt && \
grep '\.jpg$' all_files.txt > files.txt && \
awk '{cmd="aws s3api put-object-acl --acl public-read --bucket bucket-name --key "$4; system(cmd)}' files.txt

I had this need myself, but the number of files makes it WAY too slow to do in serial. So I wrote a script that does it on iron.io's IronWorker service. Their 500 free compute hours per month are enough to handle even large buckets (and if you do exceed that, the pricing is reasonable). Since it is done in parallel, it completes in less than a minute for the 32,000 objects I had. Also, I believe their servers run on EC2, so the communication between the job and S3 is quick.
Anybody is welcome to use my script for their own needs.
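The IronWorker script itself is not shown here, but the same idea of issuing the ACL calls in parallel can be sketched locally with a thread pool. This is a rough Python sketch, assuming s3_client is a boto3 S3 client (boto3 documents its clients as thread-safe); make_public_parallel and the workers default are made up for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def make_public_parallel(s3_client, bucket, keys, workers=16):
    """Apply the public-read ACL to each key using a pool of threads."""

    def set_acl(key):
        s3_client.put_object_acl(Bucket=bucket, Key=key, ACL="public-read")
        return key

    with ThreadPoolExecutor(max_workers=workers) as pool:
        # list() drains the iterator so an exception from any worker surfaces here.
        return list(pool.map(set_acl, keys))
```

The requests are network-bound, so threads are enough; the right number of workers depends on how quickly S3 starts throttling your requests.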

Have a look at BucketExplorer. It manages bulk operations very well and is a solid S3 client.

You would think they would make public read the default behavior, wouldn't you? : )
I shared your frustration while building a custom API to interface with S3 from a C# solution. Here is the snippet that accomplishes uploading an S3 object and setting it to public-read access by default:
public void Put(string bucketName, string id, byte[] bytes, string contentType, S3ACLType acl) {
    string uri = String.Format("https://{0}/{1}", BASE_SERVICE_URL, bucketName.ToLower());
    DreamMessage msg = DreamMessage.Ok(MimeType.BINARY, bytes);
    msg.Headers[DreamHeaders.CONTENT_TYPE] = contentType;
    msg.Headers[DreamHeaders.EXPECT] = "100-continue";
    // The x-amz-acl header carries the canned ACL (e.g. public-read).
    msg.Headers[AWS_ACL_HEADER] = ToACLString(acl);
    try {
        Plug s3Client = Plug.New(uri).WithPreHandler(S3AuthenticationHeader);
        s3Client.At(id).Put(msg);
    } catch (Exception ex) {
        throw new ApplicationException(String.Format("S3 upload error: {0}", ex.Message));
    }
}
The ToACLString(acl) function returns public-read, BASE_SERVICE_URL is s3.amazonaws.com and the AWS_ACL_HEADER constant is x-amz-acl. The plug and DreamMessage stuff will likely look strange to you as we're using the Dream framework to streamline our http communications. Essentially we're doing an http PUT with the specified headers and a special header signature per aws specifications (see this page in the aws docs for examples of how to construct the authorization header).
To change the ACLs of 1,000 existing objects you could write a script, but it's probably easier to use a GUI tool to fix the immediate issue. The best I've used so far is from a company called CloudBerry, for S3; it looks like they have a free 15-day trial for at least one of their products. I've just verified that it will allow you to select multiple objects at once and set their ACL to public through the context menu. Enjoy the cloud!

If your filenames have spaces, we can take Alexander Vitanov's answer above and run it through jq:
#!/bin/bash
# Example: make every file in a bucket public, even when keys contain spaces.
bucket=www.example.com
# Split the key list on newlines only, so spaces inside a key are preserved.
IFS=$'\n'
for tricky_file in $(aws s3api list-objects --bucket "${bucket}" | jq -r '.Contents[].Key')
do
  echo "$tricky_file"
  aws s3api put-object-acl --acl public-read --bucket "${bucket}" --key "$tricky_file"
done
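The same loop can be written without shell word splitting at all, which sidesteps the problem of spaces in keys. This is a minimal Python sketch, assuming s3_client is a boto3 S3 client; make_objects_public and its prefix argument are made up for illustration.

```python
def make_objects_public(s3_client, bucket, prefix=""):
    """Set the public-read ACL on every object under prefix; return the count."""
    paginator = s3_client.get_paginator("list_objects_v2")
    count = 0
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        # "Contents" is absent from a page when nothing matches.
        for obj in page.get("Contents", []):
            s3_client.put_object_acl(
                Bucket=bucket, Key=obj["Key"], ACL="public-read"
            )
            count += 1
    return count
```

The paginator follows continuation tokens, so buckets with more than 1,000 keys are covered, and each key is passed as a plain string, so no quoting is involved.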

Related

In AWS S3 how do I grant a permission to an account if the file exists already

I have already uploaded about 500 files to an S3 bucket. Now I want to add an account to the permissions for each object (adding a bucket permission doesn't give that account read access to the files themselves).
How do I do it? I don't want to re-upload 500 large video files twice just to get the granted permissions correct.
I tried aws s3 mv s3://mybucket/mybigvideo.mp4 s3://mybucket/ --grants read=id=abcde... but I can't move a file to itself.
You can actually copy the file to itself. This is allowed as long as some attribute is changing, such as the Access Control List (ACL).
aws s3 cp s3://bucket/foo.mp4 s3://bucket/foo.mp4 --grants read=id=abcd...
Dang, this isn't elegant but it works: create a dummy s3 bucket, move each file into that bucket and when you move it back, include the --grants flag.
So I listed all 500 files into a file and edited the file to look like this:
aws s3 mv s3://myrealbucket/bigvideo-001.mp4 s3://tempbucket/; aws s3 mv s3://tempbucket/bigvideo-001.mp4 s3://myrealbucket/ --grants read=id=abcd...
aws s3 mv s3://myrealbucket/bigvideo-002.mp4 s3://tempbucket/; aws s3 mv s3://tempbucket/bigvideo-002.mp4 s3://myrealbucket/ --grants read=id=abcd...
That'll take an hour or two to complete, but it'll work.
Anybody got a nicer way to do it?
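One way to avoid moving data at all is to edit each object's ACL in place. This is a rough Python sketch, assuming s3_client is a boto3 S3 client and that you know the other account's canonical user ID (the id=abcd... value above); add_read_grant is a made-up helper name.

```python
def add_read_grant(s3_client, bucket, key, canonical_user_id):
    """Append a READ grant for one account to an object's existing ACL."""
    # Fetch the current ACL so the existing owner and grants are preserved.
    acl = s3_client.get_object_acl(Bucket=bucket, Key=key)
    new_grant = {
        "Grantee": {"Type": "CanonicalUser", "ID": canonical_user_id},
        "Permission": "READ",
    }
    # put_object_acl replaces the whole ACL, so send old grants plus the new one.
    s3_client.put_object_acl(
        Bucket=bucket,
        Key=key,
        AccessControlPolicy={
            "Owner": acl["Owner"],
            "Grants": acl["Grants"] + [new_grant],
        },
    )
```

Looping this over the 500 keys only changes metadata, so no video is copied or re-uploaded.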
You can use an assumed role:
https://aws.amazon.com/blogs/security/how-to-restrict-amazon-s3-bucket-access-to-a-specific-iam-role/
To control access to buckets from a different account:
{
  "type": "AssumedRole",
  "principalId": "AROAJI4AVVEXAMPLE:ROLE-SESSION-NAME",
  "arn": "arn:aws:sts::ACCOUNTNUMBER:assumed-role/ROLE-NAME/ROLE-SESSION-NAME",
  "accountId": "ACCOUNTNUMBER",
  "accessKeyId": "ASIAEXAMPLEKEY",
  "sessionContext": {
    "attributes": {
      "mfaAuthenticated": "false",
      "creationDate": "XXXX-XX-XXTXX:XX:XXZ"
    },
    "sessionIssuer": {
      "type": "Role",
      "principalId": "AROAJI4AVV3EXAMPLEID",
      "arn": "arn:aws:iam::ACCOUNTNUMBER:role/ROLE-NAME",
      "accountId": "ACCOUNTNUMBER",
      "userName": "ROLE-SESSION-NAME"
    }
  }
}
Hope it helps.

aws s3 bucket delete issue

I am deleting bucket from AWS S3 and versioning is enabled, but it's showing this error:
aws_s3_bucket.bucket: Error deleting S3 Bucket: BucketNotEmpty: The bucket you tried to delete is not empty. You must delete all versions of the bucket.
resource "aws_s3_bucket" "bucket" {
  bucket        = "${module.naming.aws_s3_bucket}"
  acl           = "log-delivery-write"
  force_destroy = true
  versioning {
    enabled = true
  }
}
I am using Terraform version 10.8
force_destroy is not taking effect here because it only acts on the S3 bucket itself; it is not inherited by the objects the bucket contains.
If you want to delete all objects you need to run:
aws s3 rm s3://name-of-bucket --recursive
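To get rid of all object versions that still exist, note that delete-objects does not expand wildcards, so list the versions first and pass them explicitly (up to 1,000 per call; repeat with DeleteMarkers[] in the query to remove delete markers):
aws s3api delete-objects --bucket name-of-bucket --delete "$(aws s3api list-object-versions --bucket name-of-bucket --query '{Objects: Versions[].{Key: Key, VersionId: VersionId}}' --output json)"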
To be fair to others who tried to answer your question, the obvious answer is indeed no: force_destroy can't manage that. But I hope my explanation gives you some insight into why that is the case. Good luck!
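Emptying a versioned bucket means listing every version and delete marker and deleting each one by its explicit key and version ID. This is a minimal Python sketch of that loop, assuming s3_client is a boto3 S3 client; delete_all_versions is a made-up name, and the returned count is the number of deletions requested, not confirmed.

```python
def delete_all_versions(s3_client, bucket):
    """Request deletion of every object version and delete marker in bucket."""
    paginator = s3_client.get_paginator("list_object_versions")
    requested = 0
    for page in paginator.paginate(Bucket=bucket):
        entries = page.get("Versions", []) + page.get("DeleteMarkers", [])
        objects = [{"Key": e["Key"], "VersionId": e["VersionId"]} for e in entries]
        # delete_objects accepts at most 1000 keys per request.
        for start in range(0, len(objects), 1000):
            batch = objects[start:start + 1000]
            s3_client.delete_objects(
                Bucket=bucket, Delete={"Objects": batch, "Quiet": True}
            )
            requested += len(batch)
    return requested
```

Once this has run, the bucket holds no versions, so the BucketNotEmpty error should no longer apply when the bucket is destroyed.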
"force_destroy": true,
"versioning": [
{
"enabled": true,
"mfa_delete": false
}
is working as expected
I have Terraform v0.13.4

Connection to Cloudformation generated S3bucket times out

I'm trying to solve an issue with my AWS Cloudformation template. The template I have includes a VPC with a private subnet, and a VPC endpoint to allow connections to S3 buckets.
The template itself includes 3 buckets, and I have a couple of preexisting buckets already set up in the same region (in this case, eu-west-1).
I use aws-cli to log into an EC2 instance in the private subnet, then use aws-cli commands to access S3 (e.g. sudo aws s3 ls bucketname)
My problem is that I can only list the content of pre-existing buckets in that region, or new buckets that I create manually through the website. When I try to list cloudformation-generated buckets it just hangs and times out:
[ec2-user@ip-10-44-1-129 ~]$ sudo aws s3 ls testbucket
HTTPSConnectionPool(host='vltestbucketxxx.s3.amazonaws.com', port=443): Max retries exceeded with url: /?delimiter=%2F&prefix=&encoding-type=url (Caused by ConnectTimeoutError(<botocore.awsrequest.AWSHTTPSConnection object at 0x7f2cc0bcf110>, 'Connection to vltestbucketxxx.s3.amazonaws.com timed out. (connect timeout=60)'))
It does not seem to be related to the VPC endpoint (setting the config to allow everything has no effect)
{
  "Statement": [
    {
      "Action": "*",
      "Effect": "Allow",
      "Resource": "*",
      "Principal": "*"
    }
  ]
}
nor does accesscontrol seem to affect it.
{
  "Resources": {
    "testbucket": {
      "Type": "AWS::S3::Bucket",
      "Properties": {
        "AccessControl": "PublicReadWrite",
        "BucketName": "testbucket"
      }
    }
  }
}
Bucket policies don't seem to be the issue either (I've generated buckets with no policy attached, and again only the cloudformation generated ones time out). On the website, configuration for a bucket that connects and one that times out looks identical to me.
Trying to access buckets in other regions also times out, but as I understood it cloudformation generates buckets in the same region as the VPC, so that shouldn't be it (the website also shows the buckets to be in the same region).
Does anyone have an idea of what the issue might be?
Edit: I can connect from the VPC public subnet, so maybe it is an endpoint problem after all?
When using a VPC endpoint, make sure that you've configured your client to send requests to the same endpoint that your VPC Endpoint is configured for via the ServiceName property (e.g., com.amazonaws.eu-west-1.s3).
To do this using the AWS CLI, set the AWS_DEFAULT_REGION environment variable or the --region command line option, e.g., aws s3 ls testbucket --region eu-west-1. If you don't set the region explicitly, the S3 client will default to using the global endpoint (s3.amazonaws.com) for its requests, which does not match your VPC Endpoint.
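The same applies when the requests come from code rather than the CLI. This is a small Python sketch of pinning a client to the regional endpoint, assuming boto3 is the SDK in use; regional_s3_kwargs is a made-up helper. Setting endpoint_url explicitly is usually redundant once region_name is given, and is spelled out here only to make the target host visible.

```python
def regional_s3_kwargs(region):
    """Build keyword arguments that pin an S3 client to one region's endpoint."""
    return {
        "region_name": region,
        "endpoint_url": f"https://s3.{region}.amazonaws.com",
    }


# Intended use (requires boto3, which is not imported in this sketch):
#   s3 = boto3.client("s3", **regional_s3_kwargs("eu-west-1"))
```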

Granting read access to the Authenticated Users group for a file

How do I grant read access to the Authenticated Users group for a file? I'm using s3cmd and want to do it while uploading but I'm just focusing directly on changing the acl. What should I put in for http://acs.amazonaws.com/groups/global/AuthenticatedUsers? I have tried every combination of AuthenticatedUsers possible.
./s3cmd setacl
--acl-grant=read:http://acs.amazonaws.com/groups/global/AuthenticatedUsers
s3://BUCKET/FILE
./s3cmd setacl
--acl-grant=read:AuthenticatedUsers
s3://BUCKET/FILE
This doesn't seem to be possible with s3cmd. Instead I had to switch to the aws cli tools.
Here are the directions to install them:
http://docs.aws.amazon.com/cli/latest/userguide/installing.html
It's possible to set the acl to read by authenticated users during upload with the command:
aws s3 cp <file-to-upload> s3://<bucket>/ --acl authenticated-read
Plus a whole load of other combinations you can check out here:
http://docs.aws.amazon.com/cli/latest/reference/s3/index.html#cli-aws-s3
The following commands work for me with s3cmd version 1.6.0.
For an individual file:
s3cmd setacl s3://<bucket>/<file-name> --acl-grant='read:http://acs.amazonaws.com/groups/global/AuthenticatedUsers'
For all files in a directory:
s3cmd setacl s3://<bucket>/<dir-name> --acl-grant='read:http://acs.amazonaws.com/groups/global/AuthenticatedUsers' --recursive
This is from http://s3tools.org/s3cmd:
Upload a file into the bucket:
~$ s3cmd put addressbook.xml s3://logix.cz-test/addrbook.xml
File 'addressbook.xml' stored as s3://logix.cz-test/addrbook.xml (123456 bytes)
Note about ACL (Access control lists): a file uploaded to Amazon S3 bucket can either be private, that is readable only by you, possessor of the access and secret keys, or public, readable by anyone. Each file uploaded as public is not only accessible using s3cmd but also has a HTTP address, URL, that can be used just like any other URL and accessed for instance by web browsers.
~$ s3cmd put --acl-public --guess-mime-type storage.jpg s3://logix.cz-test/storage.jpg
File 'storage.jpg' stored as s3://logix.cz-test/storage.jpg (33045 bytes)
Public URL of the object is: http://logix.cz-test.s3.amazonaws.com/storage.jpg
Now anyone can display the storage.jpg file in their browser. Cool, eh?
Try changing public to authenticated; that should work.
See http://docs.amazonwebservices.com/AmazonS3/latest/dev/ACLOverview.html#CannedACL
It explains, on the Amazon side, how to use their ACLs. Supposedly, if you use public in s3cmd, this translates to public-read in Amazon, so authenticated should translate to authenticated-read.
If you're willing to use Python, the boto library provides all the functionality to get and set an ACL; from the boto S3 documentation:
b.set_acl('public-read')
Where b is a bucket. Of course in your case you should change 'public-read' to 'authenticated-read'. You can do something similar for keys (files).
If you want to do it at bucket level you can do -
aws s3api put-bucket-acl --bucket bucketname --grant-full-control uri=http://acs.amazonaws.com/groups/global/AuthenticatedUsers
Docs - http://docs.aws.amazon.com/cli/latest/reference/s3api/put-bucket-acl.html
Here is an example command that will set the ACL on an S3 object to authenticated-read.
aws s3api put-object-acl --acl authenticated-read --bucket mybucket --key myfile.txt

Amazon S3 Permission problem - How to set permissions for all files at once?

I have uploaded some files using the Amazon AWS management console.
I got an HTTP 403 Access denied error. I found out that I needed to set the permission to view.
How would I do that for all the files on the bucket?
I know that it is possible to set permission on each file, but it's time-consuming when having many files that need to be viewable for everyone.
I suggest that you apply a bucket policy [1] to the bucket where you want to store public content. This way you don't have to set an ACL for every object. Here is an example of a policy that will make all the files in the bucket mybucket public.
{
  "Version": "2008-10-17",
  "Id": "http better policy",
  "Statement": [
    {
      "Sid": "readonly policy",
      "Effect": "Allow",
      "Principal": "*",
      "Action": "s3:GetObject",
      "Resource": "arn:aws:s3:::mybucket/sub/dirs/are/supported/*"
    }
  ]
}
That * in "Resource": "arn:aws:s3:::mybucket/sub/dirs/are/supported/*" allows recursion.
[1] Note that a Bucket Policy is different from an IAM Policy. (For one, you will get an error if you try to include Principal in an IAM Policy.) The Bucket Policy can be edited by going to the root of the bucket in your AWS web console and expanding Properties > Permissions. Subdirectories of a bucket also have Properties > Permissions, but there is no option to edit the bucket policy there.
You can select the directory you want to make public.
Click "More" and mark it as public; this makes the directory and all the files in it publicly accessible.
You can only modify the ACL of a single item (a bucket or an object) at a time, so you will have to change them one by one.
Some S3 management applications allow you to apply the same ACL to all items in a bucket, but internally they apply the ACL to each item one by one.
If you upload your files programmatically, it's important to specify the ACL as you upload each file, so you don't have to modify it later. The problem with using an S3 management application (like Cloudberry, Transmit, ...) is that most of them use the default ACL (private, readable only by the owner) when you upload each file.
I used Cloudberry Explorer to do the job :)
Using S3 Browser you can update permissions using the gui, also recursively. It's a useful tool and free for non-commercial use.
To make a bulk of files public, do the following:
Go to S3 web interface
Open the required bucket
Select the required files and folders by clicking the checkboxes at the left of the list
Click «More» button at the top of the list, click «Make public»
Confirm by clicking «Make public». The files won't get public write access, even though the warning says «...read this object, read and write permissions».
You could set ACL on each file using aws cli:
BUCKET_NAME=example
BUCKET_DIR=media
NEW_ACL=public-read
aws s3 ls $BUCKET_NAME/$BUCKET_DIR/ | \
awk '{$1=$2=$3=""; print $0}' | \
xargs -t -I _ \
aws s3api put-object-acl --acl $NEW_ACL --bucket $BUCKET_NAME --key "$BUCKET_DIR/_"
I had the same problem while uploading files to an S3 bucket from a Java program:
Error: No 'Access-Control-Allow-Origin' header is present on the requested resource.
Origin 'http://localhost:9000' is therefore not allowed access. The response had HTTP status code 403
I added the origin identity and changed the bucket policy and CORS configuration then everything worked fine.
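The answer does not show the CORS configuration it used. As a rough Python sketch of what such a rule could look like, assuming boto3 is what you pass in as s3_client: build_cors_configuration and apply_cors are made-up helpers, and the allowed methods, headers and max age are illustrative guesses rather than the values the author used.

```python
def build_cors_configuration(allowed_origins, methods=("GET",)):
    """Return a CORS configuration dict allowing the given origins and methods."""
    return {
        "CORSRules": [
            {
                "AllowedOrigins": list(allowed_origins),
                "AllowedMethods": list(methods),
                "AllowedHeaders": ["*"],   # illustrative: accept any request header
                "MaxAgeSeconds": 3000,     # illustrative preflight cache lifetime
            }
        ]
    }


def apply_cors(s3_client, bucket, configuration):
    """Attach the configuration; s3_client is assumed to be boto3.client("s3")."""
    s3_client.put_bucket_cors(Bucket=bucket, CORSConfiguration=configuration)
```

For the error above, the origin to allow would be http://localhost:9000. CORS only controls which browser origins may read a response; the 403 itself still has to be resolved through the bucket policy or ACLs.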
Transmit 5
I wanted to add this here for potential macOS users that already have the beautifully-crafted FTP app called Transmit by Panic.
I already had Transmit and it supports S3 buckets (not sure which version added this, but I think the upgrades were free). It also supports recursively updating Read and Write permissions.
You simply right click the directory you want to update and select the Read and Write permissions you want to set them to.
It doesn't seem terribly fast but you can open up the log file by going Window > Transcript so you at least know that it's doing something.
Use AWS policy generator to generate a policy which fits your need. The principal in the policy generator should be the IAM user/role which you'd be using for accessing the object(s).
Resource ARN should be arn:aws:s3:::mybucket/sub/dirs/are/supported/*
Next, click on "Add statement" and follow through. You'll finally get a JSON representing the policy. Paste this in your s3 bucket policy management section which is at "your s3 bucket page in AWS -> permissions -> bucket policy".
This worked for me on DigitalOcean, which allegedly has the same API as S3:
s3cmd modify s3://[BUCKETNAME]/[DIRECTORY] --recursive --acl-public
The above sets all files to public.