Is there a way to touch() a file in Amazon S3? - amazon-s3

I'm currently working with Amazon S3 and I am writing a program that uses the modified dates. I'm looking for a way to edit the modified dates.
I could loop through all the files and save them as they are, but this sounds like a bad solution.
In PHP there is this function touch().
Does anyone know a solution, or has the same problem?

In response to @Daniel Golden's comment on @tkotisis's answer: it looks like at least the AWS CLI tools do not let you copy an item onto itself. You can, however, 'force' a copy by updating the metadata.
$ aws s3 cp --metadata '{"touched":"now"}' s3://path/to/object s3://path/to/object
This recreates the object (the CLI downloads it to the caller and re-uploads it), replacing its content, owner, and metadata. This will also trigger any attached Lambda events.

You can achieve the same thing through a copy object request, specifying the CopySource to be the same as the target key.
In essence, this will issue a PUT Object - COPY request to S3 with the corresponding source and target bucket/key.
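For example, here is a minimal boto3 sketch of that self-copy (the bucket and key names are placeholders, not from the question):
import boto3

s3 = boto3.client("s3")

# Copy the object onto itself; MetadataDirective="REPLACE" is required,
# otherwise S3 rejects a copy whose source and target are identical.
# The copy rewrites the object and refreshes its LastModified timestamp.
s3.copy_object(
    Bucket="my-bucket",
    Key="path/to/object",
    CopySource={"Bucket": "my-bucket", "Key": "path/to/object"},
    Metadata={"touched": "now"},
    MetadataDirective="REPLACE",
)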

Here is another way to upload a null (or 0-byte) file to S3; I verified that this works. You can use the S3 API to upload a file with no body, like so:
aws s3api put-object --bucket "myBucketName" --key "dir-1/my_null_file"
Normally you would specify a --body blob, but it is optional, and omitting it will just add the key as expected. See more on S3 API put-object.
The version of AWS CLI tested with is: aws-cli/2.0.4 Python/3.7.5 Windows/10 botocore/2.0.0dev8
Here's how I did it in PHP (even works in outdated 5.4, had to go way back):
// Init an S3Client
$awsConfig = $app->config('aws');
$aws = Aws::factory($awsConfig);
$s3Bucket = $app->config('S3_Bucket');
$s3Client = $aws->get('s3');
// Set null/empty file.
$result = $s3Client->putObject([
    'Bucket' => $s3Bucket,
    'Key' => "dir-1/my_null_file",
    'Body' => '',
    'ServerSideEncryption' => 'AES256',
]);

I find myself performing the copy trick pretty often when testing, to the extent that I've added a handy function to my .bashrc:
s3-touch() {
  aws s3 cp \
    --metadata 'touched=touched' \
    --recursive --exclude="*" \
    --include="$2" \
    "${@:3}" \
    "$1" "$1"
}
Example usage:
# will do a dryrun on a copy operation
s3-touch s3://bucket/prefix/ "20200311*" --dryrun
# the real thing, creating events for all objects
# in s3://bucket/prefix/ that start with 20200311
s3-touch s3://bucket/prefix/ "20200311*"
I'm doing this mainly for the S3-events that I want to trigger.

Following @g-io's answer, which simplified my day, here is another version of the same function that makes it easy to touch a single file:
s3-touch-single() {
  aws s3 cp \
    --metadata 'touched=touched' \
    "${@:2}" \
    "$1" "$1"
}
For example, looping over an array of files we need to touch:
paths=("mydir/image.png" "mydir2/image2.png")
for i in "${paths[#]}"; do s3-touch-single "s3://my-bucket/$i"; done

check out https://github.com/emdgroup/awscli-s3touch
It's a plugin to the AWS CLI that adds a touch command.
Usage:
aws s3 touch my-bucket --prefix myfolder/
It works by reading the events attached to the bucket and simulating them client side.

How to make AWS S3 Glacier files available for retrieval recursively with AWS CLI

How can I make files stored at AWS S3 Glacier available for retrieval recursively from CLI?
I ran the following command:
aws s3 cp "s3://mybucket/remotepath/" localpath --recursive
and got the following line for each of the files:
warning: Skipping file s3://mybucket/remotepath/subdir/filename.xml. Object is of storage class GLACIER. Unable to perform download operations on GLACIER objects. You must restore the object to be able to perform the operation. See aws s3 download help for additional parameter options to ignore or force these transfers.
However, aws s3api restore-object has a --key parameter that specifies a single file, with no ability to recursively traverse directories.
How can I recursively restore files for retrieval from AWS CLI?
The Perl script to restore the files
You can use the following Perl script to start the restore process for the files recursively and to monitor its progress. After the restore is completed, you can copy the files within the specified number of days.
#!/usr/bin/perl
use strict;

my $bucket = "yourbucket";
my $path = "yourdir/yoursubdir/";
my $days = 5; # the number of days you want the restored file to be accessible for
my $retrievaloption = "Bulk"; # retrieval option: Bulk, Standard, or Expedited
my $checkstatus = 0;
my $dryrun = 0;

my $cmd = "aws s3 ls s3://$bucket/$path --recursive";
print "$cmd\n";
my @lines = `$cmd`;
my @cmds;

foreach (@lines) {
    my $pos = index($_, $path);
    if ($pos > 0) {
        my $s = substr($_, $pos);
        chomp $s;
        if ($checkstatus) {
            $cmd = "aws s3api head-object --bucket $bucket --key \"$s\"";
        } else {
            $cmd = "aws s3api restore-object --bucket $bucket --key \"$s\" --restore-request Days=$days,GlacierJobParameters={\"Tier\"=\"$retrievaloption\"}";
        }
        push @cmds, $cmd;
    } else {
        die $_;
    }
}
undef @lines;

foreach (@cmds) {
    print "$_\n";
    unless ($dryrun) { print `$_`; print "\n"; }
}
Before running the script, modify the $bucket and $path values. Then run the script and watch the output.
You can first run it in a "dry run" mode that only prints the AWS CLI commands to the screen without actually restoring the files. To do that, set the $dryrun value to 1. You can also redirect the output of the dry run to a batch file and execute it separately.
Monitor the restoration status
After you run the script and start the restore process, it will take from a few minutes to a few hours for the files to become available for copying.
You will only be able to copy the files after the restore process completes for each of them.
To monitor the status, set the $checkstatus value to 1 and run the script again. While the restoration is still in progress, you will see output similar to the following for each of the files:
{
    "AcceptRanges": "bytes",
    "Restore": "ongoing-request=\"true\"",
    "LastModified": "2022-03-07T11:13:53+00:00",
    "ContentLength": 1219493888,
    "ETag": "\"ad02c999d7fe6f1fb5ddb0734017d3b0-146\"",
    "ContentType": "binary/octet-stream",
    "Metadata": {},
    "StorageClass": "GLACIER"
}
When the files finally become available for retrieval, the "Restore" line will look like the following:
"Restore": "ongoing-request=\"false\", expiry-date=\"Wed, 20 Apr 2022 00:00:00 GMT\"",
After that, you will be able to copy the files from AWS S3 to your local disk, e.g.
aws s3 cp "s3://yourbucket/yourdir/yoursubdir/" yourlocaldir --recursive --force-glacier-transfer
Restore options
Depending on the retrieval option you selected in the script for files stored in the Amazon S3 Glacier Flexible Retrieval (formerly S3 Glacier) archive tier, "Expedited" retrievals complete in 1-5 minutes, "Standard" in 3-5 hours, and "Bulk" in 5-12 hours. The "Bulk" retrieval option is the cheapest, if not free (it depends on the Glacier tier you chose to keep your files in). "Expedited" is the most expensive retrieval option and may not be available for retrievals from the Amazon S3 Glacier Deep Archive storage tier, for which restoration may take up to 48 hours.
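For reference, the restore request that the Perl script builds as an AWS CLI command can also be issued directly with boto3. A sketch, using the same placeholder bucket, key, days, and tier values as above:
import boto3

s3 = boto3.client("s3")

# Request a Bulk restore that keeps the restored copy available for 5 days.
s3.restore_object(
    Bucket="yourbucket",
    Key="yourdir/yoursubdir/filename.xml",
    RestoreRequest={
        "Days": 5,
        "GlacierJobParameters": {"Tier": "Bulk"},
    },
)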
Improve the script to accept command-line parameters
By the way, you can modify the script to accept the bucket name and the directory name from the command line. In this case, replace the following two lines:
my $bucket = "yourbucket";
my $path = "yourdir/yoursubdir/";
with the following lines:
my $numargs = $#ARGV + 1;
unless ($numargs == 2) {die "Usage: perl restore-aws.pl bucket path/\n";}
my $bucket=$ARGV[0];
my $path=$ARGV[1];

Is it possible to trigger lambda by changing the file of local s3 manually in serverless framework?

I used the serverless-s3-local to trigger aws lambda locally with serverless framework.
It worked when I created or updated a file via a function in the local S3 folder, but when I added a file or changed the contents of a file in the local S3 folder manually, it didn't trigger the Lambda.
Is there any good way to solve it?
Thanks for using serverless-s3-local. I'm the author of serverless-s3-local.
How did you add a file or change the contents of the file? Did you use the AWS command as follows?
$ AWS_ACCESS_KEY_ID=S3RVER AWS_SECRET_ACCESS_KEY=S3RVER aws --endpoint http://localhost:8000 s3 cp ./face.jpg s3://local-bucket/incoming/face.jpg
{
"ETag": "\"6fa1ab0763e315d8b1a0e82aea14a9d0\""
}
If you don't use the aws command and instead apply these operations directly to the files, the modifications aren't detected by S3rver, which is the local S3 emulator. The resize_image example may be useful for you.

Is there a way to check if folder exists in s3 using aws cli?

Let's say I have a bucket named Test which has the folders Alpha/TestingOne and Alpha/TestingTwo. I want to check if a folder named Alpha/TestingThree is present in my bucket using the AWS CLI. I did try:
aws s3api head-object --bucket Test --key Alpha/TestingThree
But it seems head-object is for files and not for folders. So is there a way to check if a folder exists in AWS S3 using the AWS CLI?
Using the AWS CLI:
aws s3 ls s3://Test/Alpha/TestingThree
If it exists, it shows something like the output below; otherwise it returns nothing.
PRE TestingThree
Note that S3 has a flat structure (there is actually no directory-like hierarchy).
https://docs.aws.amazon.com/AmazonS3/latest/user-guide/using-folders.html
There seems to be no way to test whether a 'folder' exists in an S3 bucket. That is related to the fact that everything is an 'object' with a key and a value.
As an option, you could utilize list-objects-v2 or list-objects to check for the 'contents' of a prefix (aka 'folder'):
$ aws s3api list-objects-v2 --bucket <bucket_name> --prefix <non_exist_prefix_name> --query 'Contents[]'
null
A prefix (aka 'folder') that doesn't exist will always return 'null' for such a query. Then you can compare against the 'null' value:
$ test "$result" == "null" && echo 'yes' || echo 'no'
Folders do not actually exist in Amazon S3. For example, you could use this command:
aws s3 cp foo.txt s3://my-bucket/folder1/folder2/foo.txt
This would work successfully even if folder1 and folder2 do not exist. This is because the filename (Key) of an Amazon S3 object contains the full path. Amazon S3 is a flat storage system that does not use folders. However, to make things easier for humans, the S3 management console makes it "appear" as though there are folders, and it is possible to list objects that have a CommonPrefix (which is like a path).
If a new folder is created in the S3 management console, it actually creates a zero-length object with the same name as the folder. This makes it possible to show "empty folders" even though they don't actually exist.
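If you specifically want to list console-style "folders" under a prefix (including the zero-length marker objects described above), here is a sketch using Delimiter and CommonPrefixes; the bucket and prefix names are again placeholders:
import boto3

s3 = boto3.client("s3")

# Group keys by "/" the way the console does and report the "sub-folders".
resp = s3.list_objects_v2(Bucket="Test", Prefix="Alpha/", Delimiter="/")
folders = [p["Prefix"] for p in resp.get("CommonPrefixes", [])]
print(folders)
print("Alpha/TestingThree/" in folders)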

Amazon S3 console: download multiple files at once

When I log to my S3 console I am unable to download multiple selected files (the WebUI allows downloads only when one file is selected):
https://console.aws.amazon.com/s3
Is this something that can be changed in the user policy or is it a limitation of Amazon?
It is not possible through the AWS Console web user interface.
But it's a very simple task if you install AWS CLI.
You can check the installation and configuration steps at Installing the AWS Command Line Interface.
After that you go to the command line:
aws s3 cp --recursive s3://<bucket>/<folder> <local_folder>
This will copy all the files from the given S3 path to your given local path.
Selecting a bunch of files and clicking Actions->Open opened each in a browser tab, and they immediately started to download (6 at a time).
If you use the AWS CLI, you can use the --exclude along with the --include and --recursive flags to accomplish this:
aws s3 cp s3://path/to/bucket/ . --recursive --exclude "*" --include "things_you_want"
Eg.
--exclude "*" --include "*.txt"
will download all files with .txt extension. More details - https://docs.aws.amazon.com/cli/latest/reference/s3/
I believe it is a limitation of the AWS console web interface, having tried (and failed) to do this myself.
Alternatively, perhaps use a 3rd party S3 browser client such as http://s3browser.com/
If you have Visual Studio with the AWS Explorer extension installed, you can also browse to Amazon S3 (step 1), select your bucket (step 2), select all the files you want to download (step 3), and right-click to download them all (step 4).
The S3 service has no meaningful limits on simultaneous downloads (easily several hundred downloads at a time are possible) and there is no policy setting related to this... but the S3 console only allows you to select one file for downloading at a time.
Once the download starts, you can start another and another, as many as your browser will let you attempt simultaneously.
In case someone is still looking for an S3 browser and downloader, I have just tried FileZilla Pro (it's a paid version). It worked great.
I created a connection to S3 with Access key and secret key set up via IAM. Connection was instant and downloading of all folders and files was fast.
Using AWS CLI, I ran all the downloads in the background using "&" and then waited on all the pids to complete. It was amazingly fast. Apparently the "aws s3 cp" knows to limit the number of concurrent connections because it only ran 100 at a time.
aws --profile $awsProfile s3 cp "$s3path" "$tofile" &
pids[${npids}]=$! ## save the spawned pid
let "npids=npids+1"
followed by
echo "waiting on $npids downloads"
for pid in ${pids[*]}; do
    echo $pid
    wait $pid
done
I downloaded 1500+ files (72,000 bytes) in about a minute
I wrote a simple shell script to download not just all files but also all versions of every file from a specific folder under an AWS S3 bucket. Here it is; you may find it useful.
# Script generates the version info file for all the
# content under a particular bucket and then parses
# the file to grab the versionId for each of the versions
# and finally generates a fully qualified http url for
# the different versioned files and use that to download
# the content.
s3region="s3.ap-south-1.amazonaws.com"
bucket="your_bucket_name"
# note the location has no forward slash at beginning or at end
location="data/that/you/want/to/download"
# file names were like ABB-quarterly-results.csv, AVANTIFEED--quarterly-results.csv
fileNamePattern="-quarterly-results.csv"
# AWS CLI command to get version info
content="$(aws s3api list-object-versions --bucket $bucket --prefix "$location/")"
#save the file locally, if you want
echo "$content" >> version-info.json
versions=$(echo "$content" | grep -ir VersionId | awk -F ":" '{gsub(/"/, "", $3);gsub(/,/, "", $3);gsub(/ /, "", $3);print $3 }')
for version in $versions
do
    echo "############### $fileId ###################"
    #echo $version
    url="https://$s3region/$bucket/$location/$fileId$fileNamePattern?versionId=$version"
    echo $url
    content="$(curl -s "$url")"
    echo "$content" >> "$fileId$fileNamePattern-$version.csv"
    echo "############### $i ###################"
done
Also, you could use --include "filename" multiple times in a single command, each time including a different filename within the double quotes, e.g.
aws s3 mycommand --include "file1" --include "file2"
This will save you time compared to repeating the command to download one file at a time.
Also, if you are running Windows(tm), WinSCP now allows drag and drop of a selection of multiple files, including sub-folders.
Many enterprise workstations will have WinSCP installed for editing files on servers by means of SSH.
I am not affiliated, I simply think this was really worth doing.
In my case Aur's answer didn't work, and if you're looking for a quick solution to download all files in a folder using just the browser, you can try entering this snippet in your dev console:
(function() {
    const rows = Array.from(document.querySelectorAll('.fix-width-table tbody tr'));
    const downloadButton = document.querySelector('[data-e2e-id="button-download"]');
    const timeBetweenClicks = 500;

    function downloadFiles(remaining) {
        if (!remaining.length) {
            return
        }
        const row = remaining[0];
        row.click();
        downloadButton.click();
        setTimeout(() => {
            downloadFiles(remaining.slice(1));
        }, timeBetweenClicks)
    }

    downloadFiles(rows)
}())
I have done this by creating a shell script using the AWS CLI (e.g. example.sh):
#!/bin/bash
aws s3 cp s3://s3-bucket-path/example1.pdf LocalPath/Download/example1.pdf
aws s3 cp s3://s3-bucket-path/example2.pdf LocalPath/Download/example2.pdf
Give executable rights to example.sh (e.g. sudo chmod 777 example.sh).
Then run your shell script: ./example.sh
I think the simplest way to download or upload files is to use the aws s3 sync command. You can also use it to sync two S3 buckets at the same time.
aws s3 sync <LocalPath> <S3Uri> or <S3Uri> <LocalPath> or <S3Uri> <S3Uri>
# Download file(s)
aws s3 sync s3://<bucket_name>/<file_or_directory_path> .
# Upload file(s)
aws s3 sync . s3://<bucket_name>/<file_or_directory_path>
# Sync two buckets
aws s3 sync s3://<1st_s3_path> s3://<2nd_s3_path>
What I usually do is mount the S3 bucket (with s3fs) on a Linux machine, zip the files I need into one archive, and then just download that file from any PC/browser.
# mount bucket in file system
/usr/bin/s3fs s3-bucket -o use_cache=/tmp -o allow_other -o uid=1000 -o mp_umask=002 -o multireq_max=5 /mnt/local-s3-bucket-mount
# zip files into one
cd /mnt/local-s3-bucket-mount
zip all-processed-files.zip *.jpg
import os
import boto3

s3 = boto3.resource('s3', aws_access_key_id="AKIAxxxxxxxxxxxxJWB",
                    aws_secret_access_key="LV0+vsaxxxxxxxxxxxxxxxxxxxxxry0/LjxZkN")
my_bucket = s3.Bucket('s3testing')

# download files into the current directory
for s3_object in my_bucket.objects.all():
    # Need to split s3_object.key into path and file name, else it will give a "file not found" error.
    path, filename = os.path.split(s3_object.key)
    my_bucket.download_file(s3_object.key, filename)

S3: make a public folder private again?

How do you make an AWS S3 public folder private again?
I was testing out some staging data, so I made the entire folder public within a bucket. I'd like to restrict its access again. So how do I make the folder private again?
The accepted answer works well - seems to set ACLs recursively on a given s3 path too. However, this can also be done more easily by a third-party tool called s3cmd - we use it heavily at my company and it seems to be fairly popular within the AWS community.
For example, suppose you had this kind of s3 bucket and dir structure: s3://mybucket.com/topleveldir/scripts/bootstrap/tmp/. Now suppose you had marked the entire scripts "directory" as public using the Amazon S3 console.
Now to make the entire scripts "directory-tree" recursively (i.e. including subdirectories and their files) private again:
s3cmd setacl --acl-private --recursive s3://mybucket.com/topleveldir/scripts/
It's also easy to make the scripts "directory-tree" recursively public again if you want:
s3cmd setacl --acl-public --recursive s3://mybucket.com/topleveldir/scripts/
You can also choose to set the permission/ACL only on a given s3 "directory" (i.e. non-recursively) by simply omitting --recursive in the above commands.
For s3cmd to work, you first have to provide your AWS access and secret keys to s3cmd via s3cmd --configure (see http://s3tools.org/s3cmd for more details).
From what I understand, the 'Make public' option in the management console recursively adds a public grant for every object 'in' the directory.
You can see this by right-clicking on one file, then clicking on 'Properties'. You then need to click on 'Permissions', and there should be a line:
Grantee: Everyone [x] open/download [] view permissions [] edit permission.
If you upload a new file within this directory, it won't have this public access set and will therefore be private.
You need to remove public read permission one by one, either manually if you only have a few keys or by using a script.
I wrote a small script in Python with the boto3 module to recursively remove the 'public read' attribute of all keys in an S3 folder:
#!/usr/bin/env python
# remove public read right for all keys within a directory
# usage: remove_public.py bucketName folderName
import sys
import boto3

BUCKET = sys.argv[1]
PATH = sys.argv[2]

s3client = boto3.client("s3")
paginator = s3client.get_paginator('list_objects_v2')
page_iterator = paginator.paginate(Bucket=BUCKET, Prefix=PATH)
for page in page_iterator:
    keys = page['Contents']
    for k in keys:
        response = s3client.put_object_acl(
            ACL='private',
            Bucket=BUCKET,
            Key=k['Key']
        )
I tested it in a folder with (only) 2 objects and it worked. If you have lots of keys, it may take some time to complete, and a parallel approach might be necessary.
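If you do need to go parallel, here is a sketch of the same ACL reset using a thread pool; the worker count is an arbitrary assumption, not a recommendation:
#!/usr/bin/env python
# usage: remove_public_parallel.py bucketName folderName
import sys
from concurrent.futures import ThreadPoolExecutor

import boto3

BUCKET = sys.argv[1]
PATH = sys.argv[2]
s3client = boto3.client("s3")

def make_private(key):
    s3client.put_object_acl(ACL="private", Bucket=BUCKET, Key=key)

paginator = s3client.get_paginator("list_objects_v2")
with ThreadPoolExecutor(max_workers=16) as pool:
    for page in paginator.paginate(Bucket=BUCKET, Prefix=PATH):
        for obj in page.get("Contents", []):
            pool.submit(make_private, obj["Key"])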
For the AWS CLI, it is fairly straightforward.
If the object is: s3://<bucket-name>/file.txt
For single object:
aws s3api put-object-acl --acl private --bucket <bucket-name> --key file.txt
For all objects in the bucket (bash one-liner):
aws s3 ls --recursive s3://<bucket-name> | cut -d' ' -f5- | awk '{print $NF}' | while read line; do
    echo "$line"
    aws s3api put-object-acl --acl private --bucket <bucket-name> --key "$line"
done
From the AWS S3 bucket listing (the AWS S3 UI), you can modify individual files' permissions after making either one file public manually or the whole folder content public (to clarify, I'm referring to a folder inside a bucket). To revert the public attribute back to private, you click on the file, then go to Permissions and click the radio button under the "EVERYONE" heading. You get a second floating window where you can uncheck the "read object" attribute. Don't forget to save the change. If you try to access the link, you should get the typical "Access Denied" message. I have attached two screenshots: the first one shows the folder listing, and the second, which you reach by clicking the file and following the aforementioned procedure, shows the 4 steps. Note that to modify multiple files, one would need to use the scripts proposed in previous posts. -Kf
I actually used Amazon's UI following this guide http://aws.amazon.com/articles/5050/
While @Varun Chandak's answer works great, it's worth mentioning that, due to the awk part, the script only accounts for the last part of the ls results. If the filename has spaces in it, awk will get only the last segment of the filename split by spaces, not the entire filename.
Example: A file with a path like folder1/subfolder1/this is my file.txt would result in an entry called just file.txt.
In order to prevent that while still using his script, you'd have to replace $NF in awk {print $NF} with a sequence of variable placeholders that accounts for the number of segments the 'split by space' operation would produce. Since filenames might have quite a large number of spaces in their names, I've gone with an exaggeration, but to be honest, I think a completely new approach would probably be better to deal with these cases. Here's the updated code:
#!/bin/sh
aws s3 ls --recursive s3://my-bucket-name | awk '{print $4,$5,$6,$7,$8,$9,$10,$11,$12,$13,$14,$15,$16,$17,$18,$19,$20,$21,$22,$23,$24,$25}' | while read line; do
    echo "$line"
    aws s3api put-object-acl --acl private --bucket my-bucket-name --key "$line"
done
I should also mention that using cut didn't produce any results for me, so I removed it. Credits still go to @Varun Chandak, since he built the script.
As of now, according to the boto docs, you can do it this way:
#!/usr/bin/env python
# remove public read right for all keys within a directory
# usage: remove_public.py bucketName folderName
import sys
import boto

bucketname = sys.argv[1]
dirname = sys.argv[2]
s3 = boto.connect_s3()
bucket = s3.get_bucket(bucketname)
keys = bucket.list(dirname)
for k in keys:
    # options are 'private', 'public-read',
    # 'public-read-write', 'authenticated-read'
    k.set_acl('private')
Also, you may consider removing any bucket policies under the Permissions tab of the S3 bucket.
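If you prefer to drop the bucket policy from code instead of the console, here is a one-line boto3 sketch; only do this if the policy itself is what makes the objects public, and note that the bucket name is a placeholder:
import boto3

# Deletes the bucket policy entirely.
boto3.client("s3").delete_bucket_policy(Bucket="my-bucket")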
I did this today. My situation was I had certain top level directories whose files needed to be made private. I did have some folders that needed to be left public.
I decided to use the s3cmd like many other people have already shown. But given the massive number of files, I wanted to run parallel s3cmd jobs for each directory. And since it was going to take a day or so, I wanted to run them as background processes on an EC2 machine.
I set up an Ubuntu machine using the t2.xlarge type. I chose the xlarge after s3cmd failed with out of memory messages on a micro instance. xlarge is probably overkill but this server will only be up for a day.
After logging into the server, I installed and configured s3cmd:
sudo apt-get install python-setuptools
wget https://sourceforge.net/projects/s3tools/files/s3cmd/2.0.2/s3cmd-2.0.2.tar.gz/download
mv download s3cmd.tar.gz
tar xvfz s3cmd.tar.gz
cd s3cmd-2.0.2/
python setup.py install
sudo python setup.py install
cd ~
s3cmd --configure
I originally tried using screen but had some problems, mainly processes were dropping from screen -r despite running the proper screen command like screen -S directory_1 -d -m s3cmd setacl --acl-private --recursive --verbose s3://my_bucket/directory_1. So I did some searching and found the nohup command. Here's what I ended up with:
nohup s3cmd setacl --acl-private --recursive --verbose s3://my_bucket/directory_1 > directory_1.out &
nohup s3cmd setacl --acl-private --recursive --verbose s3://my_bucket/directory_2 > directory_2.out &
nohup s3cmd setacl --acl-private --recursive --verbose s3://my_bucket/directory_3 > directory_3.out &
With a multi-cursor editor this becomes pretty easy (I used aws s3 ls s3://my_bucket to list the directories).
That way you can log out whenever you want, log back in, and tail any of your logs. You can tail multiple files like:
tail -f directory_1.out -f directory_2.out -f directory_3.out
So set up s3cmd then use nohup as I demonstrated and you're good to go. Have fun!
It looks like this is now addressed by Amazon:
Selecting the following checkbox makes the bucket and its contents private again:
Block public and cross-account access if bucket has public policies
https://aws.amazon.com/blogs/aws/amazon-s3-block-public-access-another-layer-of-protection-for-your-accounts-and-buckets/
UPDATE: The above link was updated in August 2019. The console options described above no longer exist; they have been replaced by the newer Block Public Access settings.
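The same Block Public Access settings can also be enabled from code. A minimal boto3 sketch (the bucket name is a placeholder):
import boto3

s3 = boto3.client("s3")

# Turn on all four Block Public Access settings for the bucket.
s3.put_public_access_block(
    Bucket="my-bucket",
    PublicAccessBlockConfiguration={
        "BlockPublicAcls": True,
        "IgnorePublicAcls": True,
        "BlockPublicPolicy": True,
        "RestrictPublicBuckets": True,
    },
)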
If you have S3 Browser, you will have an option to make it public or private.
If you want a delightfully simple one-liner, you can use the AWS PowerShell Tools. The reference for the AWS PowerShell Tools can be found here. We'll be using the Get-S3Object and Set-S3ACL cmdlets.
$TargetS3Bucket = "myPrivateBucket"
$TargetDirectory = "accidentallyPublicDir"
$TargetRegion = "us-west-2"
Set-DefaultAWSRegion $TargetRegion
Get-S3Object -BucketName $TargetS3Bucket -KeyPrefix $TargetDirectory | Set-S3ACL -CannedACLName private
There are two ways to manage this:
Block the whole bucket (simpler, but does not apply to all use cases, such as an S3 bucket with a static website and a sub-folder for a CDN) - https://aws.amazon.com/blogs/aws/amazon-s3-block-public-access-another-layer-of-protection-for-your-accounts-and-buckets/
Block access to a directory of the S3 bucket that was granted the 'Make public' option, where you can execute the script from ascobol (I just rewrote it with boto3):
#!/usr/bin/env python
# remove public read right for all keys within a directory
# usage: remove_public.py bucketName folderName
import sys
import boto3

BUCKET = sys.argv[1]
PATH = sys.argv[2]

s3client = boto3.client("s3")
paginator = s3client.get_paginator('list_objects_v2')
page_iterator = paginator.paginate(Bucket=BUCKET, Prefix=PATH)
for page in page_iterator:
    keys = page['Contents']
    for k in keys:
        response = s3client.put_object_acl(
            ACL='private',
            Bucket=BUCKET,
            Key=k['Key']
        )
cheers
Use @ascobol's script, above. Tested with ~2300 items in 1250 subfolders and it appears to have worked (lifesaver, thanks!).
I'll provide some additional steps for less experienced folks, but if anyone with more reputation would like to delete this answer and comment on his post stating that it works with 2000+ folders, that'd be fine with me.
Install the AWS CLI
Install Python 3 if not present (on Mac/Linux, check with python3 --version)
Install the boto3 package for Python 3 with pip install boto3
Create a text file named remove_public.py, and paste in the contents of @ascobol's script
Run python3 remove_public.py bucketName folderName
Script contents from ascobol's answer, above
#!/usr/bin/env python
# remove public read right for all keys within a directory
# usage: remove_public.py bucketName folderName
import sys
import boto3

BUCKET = sys.argv[1]
PATH = sys.argv[2]

s3client = boto3.client("s3")
paginator = s3client.get_paginator('list_objects_v2')
page_iterator = paginator.paginate(Bucket=BUCKET, Prefix=PATH)
for page in page_iterator:
    keys = page['Contents']
    for k in keys:
        response = s3client.put_object_acl(
            ACL='private',
            Bucket=BUCKET,
            Key=k['Key']
        )