I want to check if some files are really in my S3 bucket. Using the AWS CLI I can do it for one file with ls, like
aws s3 ls s3://mybucket/file/path/file001.jpg
but I need to be able to do it for several files
aws s3 ls s3://mybucket/file/path/file001.jpg ls s3://mybucket/file/path/file002.jpg
won't work, and neither does
aws s3 ls s3://mybucket/file/path/file001.jpg s3://mybucket/file/path/file005.jpg
Of course
aws s3 ls s3://mybucket/file/path/file001.jpg;
aws s3 ls s3://mybucket/file/path/file005.jpg
works perfectly, but slowly. It takes about one second per file, because a connection is opened and closed each time.
I have hundreds of files to check on a regular basis, so I need a fast way to do it. Thanks.
I'm not insisting on using ls or passing a path; a "find" of the filenames would also do (though the AWS CLI seems to lack a find). Another tool would be fine too, as long as it can be invoked from the command line.
I don't want to get a list of all files, or have a script look at all files and then post-process. I need a way to ask S3 to give me files a, r, z in one go.
I think the s3api list-objects call should be the one, but I can't work out the syntax for asking about several file names at once.
You can easily do that using the Python boto3 SDK for AWS:
import boto3

s3 = boto3.resource('s3')
bucket = s3.Bucket('mausamrest')

# Print the key of every object in the bucket
for obj in bucket.objects.all():
    print(obj.key)
where mausamrest is the bucket
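Building on that snippet, here is a minimal sketch for checking a specific set of keys without listing the whole bucket; the bucket and key names are just the ones from the question, and it reuses a single boto3 client (one connection pool) instead of starting a new CLI process per file:
import boto3
from botocore.exceptions import ClientError

s3 = boto3.client('s3')
bucket = 'mybucket'  # assumption: the bucket name from the question
keys = ['file/path/file001.jpg', 'file/path/file002.jpg', 'file/path/file005.jpg']

for key in keys:
    try:
        # One HEAD request per key; the underlying connection is reused
        s3.head_object(Bucket=bucket, Key=key)
        print(key, 'exists')
    except ClientError as e:
        if e.response['Error']['Code'] == '404':
            print(key, 'missing')
        else:
            raise
Note that head_object reports a missing key as a 403 rather than a 404 if the caller lacks s3:ListBucket permission on the bucket.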
In my pipeline I am trying to sync my local folder (or should I say repository folder) to an S3 bucket. I can run aws s3 sync . s3:// but this of course gives an error, since the bucket is not specified. But basically that is exactly what I want: the folder structure I have locally is exactly how I want it in S3.
so locally:
bucket1/file1.txt
bucket1/file2.txt
bucket1/subbucket1/file3.txt
needs to go exactly to the root of my S3 account... how do I fix this?
By the way, sync might be overkill, since I only want to copy (and overwrite!) into the S3 folders from the root. I'm not (yet) interested in deleting, etc.
What can I do?
The AWS Command-Line Interface (CLI) aws s3 sync command requires a bucket name.
Therefore, you will either need to write a script that extracts the bucket name and inserts it into the aws s3 sync command, or you'll need to write your own program to use in place of the AWS CLI.
If you have a limited number of buckets and they don't change that often, you could just write a script that repeatedly calls the AWS CLI, such as:
aws s3 sync bucket1/ s3://bucket1/
aws s3 sync bucket2/ s3://bucket2/
etc.
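If the top-level folder names really are the bucket names (as in the question), one way to avoid hard-coding each pair is a small wrapper script. This is only a sketch under that assumption, shelling out to the same aws s3 sync command:
import subprocess
from pathlib import Path

# Assumption: every top-level directory in the current folder is also the name
# of an S3 bucket that already exists.
for local_dir in sorted(p for p in Path('.').iterdir() if p.is_dir()):
    bucket = local_dir.name
    # Mirrors "aws s3 sync bucket1/ s3://bucket1/" for each directory
    subprocess.run(['aws', 's3', 'sync', str(local_dir), f's3://{bucket}/'], check=True)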
In case somebody comes to the same question:
for file in $(find . -type f); do
    # Strip the leading "./" that find adds to each path
    newFilename="${file#./}"
    dirName="$ENVIRONMENT-$(dirname "$newFilename")"
    # Get the first part of the dir (only the root)
    dirName="${dirName%%/*}"
    echo "bucket: $dirName"
    if aws s3api head-bucket --bucket "$dirName" 2>/dev/null; then
        echo "bucket already exists"
    else
        if [[ $dirName == *"/"* ]]; then
            echo "$dirName"
            echo "This bucket is a subfolder and will not be created"
        else
            aws s3 mb "s3://$dirName"
        fi
    fi
    aws s3 cp "$newFilename" "s3://$ENVIRONMENT-$newFilename"
done
The script retrieves all the files it can find;
then it takes the root directory (relative to the current folder)
and checks whether that directory exists as a bucket. If not, it will be created.
Then every file is copied.
Since I do not know whether a root directory exists (as a bucket), it has to be checked manually.
I couldn't use sync because the bucket might not exist yet.
If you do know that your root directory exists as a bucket, then I would use sync: a one-liner vs. a 10-liner :see_no_evil:.
anyway, that was it for me!
Is there a way I can autosave AutoCAD files, or changes to AutoCAD files, directly to an S3 bucket? Is there perhaps an API I can utilize for this workflow?
While I was not able to quickly find a plug-in that does that for you, what you can do is one of the following:
Mount S3 bucket as a drive. You can read more at CloudBerry Drive - Mount S3 bucket as Windows drive
This might create some performance issues with AutoCAD.
Sync saved files to S3
You can set up a script that runs every n minutes and automatically syncs your files to S3 using aws s3 sync. You can read more about aws s3 sync in the AWS CLI documentation. Your command might look something like
aws s3 sync /path/to/cad/files s3://bucket-with-cad/project/test
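If you don't want to depend on an external scheduler such as cron or Task Scheduler, a rough sketch of the "run every n minutes" loop could look like this; the interval is a placeholder and nothing here is AutoCAD-specific:
import subprocess
import time

SYNC_INTERVAL_SECONDS = 300  # "every n minutes" -- 5 minutes here, purely as an example

while True:
    # aws s3 sync only uploads new or changed files, so repeated runs are cheap
    subprocess.run(['aws', 's3', 'sync', '/path/to/cad/files',
                    's3://bucket-with-cad/project/test'], check=False)
    time.sleep(SYNC_INTERVAL_SECONDS)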
I'm using the S3 adapter to copy files from a Snowball device to a local machine.
Everything appears to be in order as I was able to run this command and see the bucket name:
aws s3 ls --endpoint http://snowballip:8080
But beyond this, AWS doesn't offer any examples of calling the cp command. How do I provide the bucket name and the key along with this --endpoint flag?
Further, when I ran this:
aws s3 ls --endpoint http://snowballip:8080/bucketname
It returned 'Bucket'... I'm not sure what that means, because I expected to see the files.
I can confirm the following is correct for Snowball and Snowball Edge, as sqlbot says in the comments:
aws s3 ls --endpoint http://snowballip:8080 s3://bucketname/[optionalprefix]
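If the CLI flag placement keeps tripping you up, the same endpoint override is also available in boto3. A sketch, assuming the adapter address and bucket name from the question (the object key and local file name are placeholders):
import boto3

# Assumption: the S3 adapter on the Snowball is reachable at this address
s3 = boto3.client('s3', endpoint_url='http://snowballip:8080')

# Equivalent of "aws s3 ls --endpoint http://snowballip:8080 s3://bucketname/"
for obj in s3.list_objects_v2(Bucket='bucketname').get('Contents', []):
    print(obj['Key'])

# Copy an object from the device to the local machine (placeholder key and file name)
s3.download_file('bucketname', 'optionalprefix/somekey.dat', 'somekey.dat')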
References:
http://docs.aws.amazon.com/cli/latest/reference/
http://docs.aws.amazon.com/snowball/latest/ug/using-adapter-cli.html
Just got one in the post
I've been struggling with this one for quite a while. I thought it would work out of the box, based on the AWS documentation supporting the acl header.
I'm using the AWS S3 CLI in order to download files from my S3 bucket. Some of the files will need to have 'exec' permissions (running on Linux).
I can chmod the files but I would like to control that during the upload rather than during the download.
So, the question is whether I can use the AWS CLI so that it automatically grants execute (or other) permissions, based on something I can set during the upload or afterwards on the uploaded file.
Thanks,
My Amazon S3 bucket has millions of files, and I am mounting it using s3fs. Any time an ls command is issued (not intentionally), the terminal hangs.
Is there a way to limit the number of results returned to 100 when an ls command is issued in an s3fs-mounted path?
Try goofys (https://github.com/kahing/goofys). It doesn't limit the number of items returned for ls, but ls is about 40x faster than s3fs when there are lots of files.
It is not recommended to use s3fs in production situations. Amazon S3 is not a filesystem, so attempting to mount it can lead to synchronization problems (and other issues like the one you have experienced).
It would be better to use the AWS Command-Line Interface (CLI), which has commands to list, copy and sync files to/from Amazon S3. It can also do partial listing of S3 buckets by path.
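For example, a capped listing might look like the boto3 sketch below; the bucket name and prefix are placeholders, and MaxKeys keeps each request to at most 100 keys instead of walking the whole bucket:
import boto3

s3 = boto3.client('s3')

# Placeholders: substitute your own bucket name and path prefix
response = s3.list_objects_v2(Bucket='my-bucket', Prefix='some/path/', MaxKeys=100)

for obj in response.get('Contents', []):
    print(obj['Key'])

# response['IsTruncated'] tells you whether there are more keys beyond the first 100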