Is it possible to sync a single file to s3? - amazon-s3

I'd like to sync a single file from my filesystem to s3.
Is this possible, or can only directories be synced?

Use the include/exclude options of the sync command.
e.g. To sync just /var/local/path/filename.xyz to S3 use:
aws s3 sync /var/local/path s3://bucket/path --exclude='*' --include='filename.xyz'

cp can be used to copy a single file to S3. If a file with the same name already exists at the destination, it will be replaced:
aws s3 cp local/path/to/file.js s3://bucket/path/to/file.js
Keep in mind that, per the docs, sync only updates the target if the source file has changed since the last run: s3 sync updates any files that have a size or modified time that are different from files with the same name at the destination. cp, on the other hand, always copies to the target regardless of whether the source file has been modified.
Reference: AWS CLI Command Reference: cp

Just to comment on pythonjsgeo's answer: that seems to be the right solution, but make sure to execute the command without the = symbol after the include and exclude flags. I was including the = symbol and getting weird behavior with the sync command.
aws s3 sync /var/local/path s3://bucket/path --exclude '*' --include 'filename.xyz'

You can mount an S3 bucket as a local folder (using RioFS, for example) and then use your favorite tool to synchronize files or directories.
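A rough sketch of that approach, assuming RioFS's basic bucketname/mountpoint invocation (check riofs --help for your version) and hypothetical bucket, mount point, and file names:
# Mount the bucket (credentials typically come from environment variables or a config file).
riofs mybucket /mnt/s3
# Then use any ordinary tool to copy or sync the single file.
rsync -av /var/local/path/filename.xyz /mnt/s3/path/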

Related

Prevent rclone from re-copying files to AWS S3 Deep Archive

I'm using rclone in order to copy some files to an S3 bucket (deep archive). The command I'm using is:
rclone copy --ignore-existing --progress --max-delete 0 "/var/vmail" foo-backups:foo-backups/vmail
This makes rclone copy files that I know for sure already exist in the bucket. I tried removing the --ignore-existing flag (which IMHO is badly named, as it does exactly the opposite of what you'd initially expect), but I still get the same behaviour.
I also tried adding --size-only, but the "bug" doesn't get fixed.
How can I make rclone copy only new files?
You could use rclone sync; check out https://rclone.org/commands/rclone_sync/
Doesn’t transfer unchanged files, testing by size and modification time or MD5SUM. Destination is updated to match source, including deleting files if necessary.
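A minimal sketch of the equivalent sync invocation using the asker's paths (note that sync also deletes objects in the destination that no longer exist in the source, so keeping --max-delete 0 is a reasonable safeguard):
rclone sync --progress --max-delete 0 "/var/vmail" foo-backups:foo-backups/vmail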
It turned out to be a bug in rclone. https://github.com/rclone/rclone/issues/3834

Can make figure out file dependencies in AWS S3 buckets?

The source directory contains numerous large image and video files.
These files need to be uploaded to an AWS S3 bucket with the aws s3 cp command. For example, as part of this build process, I copy my image file my_image.jpg to the S3 bucket like this: aws s3 cp my_image.jpg s3://mybucket.mydomain.com/
I have no problem doing this copy to AWS manually, and I can script it too. But I want the makefile to upload my image file my_image.jpg only if the same-named file in my S3 bucket is older than the one in my source directory.
Generally make is very good at this kind of dependency checking based on file dates. However, is there a way I can tell make to get the file dates from files in S3 buckets and use that to determine if dependencies need to be rebuilt or not?
The AWS CLI has an s3 sync command that can take care of a fair amount of this for you. From the documentation:
A s3 object will require copying if:
the sizes of the two s3 objects differ,
the last modified time of the source is newer than the last modified time of the destination,
or the s3 object does not exist under the specified bucket and prefix destination.
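For this case, a single sync call (a sketch using the bucket name from the question; the '*.mp4' include is an assumption about the video files) only re-uploads files whose size or modification time has changed:
aws s3 sync . s3://mybucket.mydomain.com/ --exclude '*' --include '*.jpg' --include '*.mp4'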
I think you'll need to make S3 look like a file system for make to work directly. On Linux it is common to use FUSE to build adapters like that, and there are several projects that present S3 as a local filesystem. I haven't tried any of them, but it seems like the way to go.
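If you'd rather keep plain make/shell rules and compare timestamps yourself, here's a rough sketch using aws s3api head-object (hypothetical bucket and key names; assumes GNU stat and date):
# Upload my_image.jpg only if it is newer than the S3 object, or missing from the bucket.
s3_time=$(aws s3api head-object --bucket mybucket.mydomain.com --key my_image.jpg \
          --query LastModified --output text 2>/dev/null)
if [ -z "$s3_time" ] || [ "$(stat -c %Y my_image.jpg)" -gt "$(date -d "$s3_time" +%s)" ]; then
    aws s3 cp my_image.jpg s3://mybucket.mydomain.com/
fi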

Sync with S3 with s3cmd, but not re-download files that only changed name

I'm syncing a bunch of files between my computer and Amazon S3. Say a couple of the files change name, but their content is still the same. Do I have to have the local file removed by s3cmd and then the "new" file re-downloaded, just because it has a new name? Or is there any other way of checking for changes? I would like s3cmd to, in that case, simply change the name of the local file in accordance with the new name on the server.
s3cmd upstream (the master branch at github.com/s3tools/s3cmd) and the latest published version, 1.5.0-rc1, can figure this out, provided you used a recent version with the --preserve option to put the file into S3 in the first place, so that the md5sum of each file is stored. Using the md5sums, it knows that you have a duplicate (even if renamed) file locally and won't re-download it; instead it will do a local copy (or hardlink) from the existing file system name to the name from S3.
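A rough sketch of that workflow (hypothetical bucket and file names; check your s3cmd version for the exact option behaviour):
# Upload with --preserve so attribute/checksum metadata is stored alongside the object.
s3cmd put --preserve myfile.dat s3://mybucket/path/myfile.dat
# Later, sync down: an object that was merely renamed on S3 can be copied or hardlinked
# from the existing local file instead of being re-downloaded.
s3cmd sync s3://mybucket/path/ ./local-dir/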

Exclude folders for s3cmd sync

I am using s3cmd and I would like to know how to exclude all folders within a bucket and just sync the bucket root.
for example
bucket
folder/two/
folder/two/file.jpg
get.jpg
With the sync I just want it to sync get.jpg and ignore the folder and its contents.
s3cmd --config sync s3://s3bucket (only sync root) local/
If someone could help, that would be amazing. I have already tried --exclude but I'm not sure how to use it in this situation.
You should indeed use the --exclude option. If you want to sync every file at the root but not the folders, you should try:
s3cmd --exclude="/*/*" sync local/ s3://s3bucket
Keep in mind that a folder doesn't really exist on S3. What looks like a file named file inside a folder named folder is just an object whose key is folder/file! So you just have to exclude keys matching the pattern /*/*.
As mentioned by @physiocoder, excluding a specific folder is done the following way:
s3cmd sync --exclude 'foldername/*' local/ s3://s3bucket
So that is different from the question but I landed on this page due to its title.

Moving many files in the same bucket

I've got 200k files in a bucket which I need to move into a sub folder within the same bucket. What's the best approach?
I recently encountered the same problem. I solved it using the command line API.
http://docs.aws.amazon.com/cli/latest/index.html
http://docs.aws.amazon.com/cli/latest/reference/s3/mv.html
aws s3 mv s3://BUCKETNAME/myfolder/photos/ s3://BUCKETNAME/myotherfolder/photos/ --recursive
I had a need for the objects to be publicly viewable, so I added the --acl public-read option.
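For reference, that is the same command as above with the flag added:
aws s3 mv s3://BUCKETNAME/myfolder/photos/ s3://BUCKETNAME/myotherfolder/photos/ --recursive --acl public-read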
I was recently able to do this with one command. It went much faster than making individual requests for each file, too.
Running a snippet like this:
aws s3 mv s3://bucket-name/ s3://bucket-name/subfolder --recursive --exclude "*" --include "*.txt"
Use the --include flag to selectively pick up the files you want.
There is no 'Rename' operation, though it would be great if there were.
Instead, you need to loop through each item that you want to rename, perform a copy to a new object, and then delete the old object.
http://docs.amazonwebservices.com/AmazonS3/latest/API/RESTObjectCOPY.html
http://docs.amazonwebservices.com/AmazonS3/latest/API/RESTObjectDELETE.html
Note: for simplicity, I'm assuming you don't have versioning enabled on your bucket.
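A minimal sketch of that copy-then-delete loop with the AWS CLI (hypothetical bucket and prefix names; assumes keys contain no whitespace):
# Move every object under old-prefix/ to new-prefix/ by copying it and then deleting the original.
aws s3api list-objects-v2 --bucket BUCKETNAME --prefix old-prefix/ \
    --query 'Contents[].Key' --output text | tr '\t' '\n' | while read -r key; do
    aws s3api copy-object --bucket BUCKETNAME \
        --copy-source "BUCKETNAME/$key" \
        --key "new-prefix/${key#old-prefix/}"
    aws s3api delete-object --bucket BUCKETNAME --key "$key"
done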
I had this same problem and I ended up using aws s3 mv along with a bash for loop.
I did aws s3 ls s3://bucket-name to get all of the files in the bucket. Then I decided which files I wanted to move and added them to file_names.txt.
Then I ran the following snippet to move all of the files:
for f in $(cat file_names.txt)
do
aws s3 mv s3://bucket-name/$f s3://bucket-name/subfolder/$f
done
If your files are in a folder, you can use the s3cmd tool:
s3cmd cp --recursive s3://bucket/folder/ s3://bucket/sub_folder/
PS: I'm assuming you have already installed and configured s3cmd.
The script below works perfectly for me without any issues:
for i in `cat s3folders`
do
aws s3 mv s3://Bucket_Name/"$i"/ s3://Another_Bucket_Name/ --recursive
done
It also deletes the empty folders from the source once the files have been moved to the target.