Copy to destination and delete from source using a single command - gsutil

I want to copy files to a destination bucket and, if the copy succeeds, delete them from the source; the source and destination are two different buckets in different regions.
Does the gsutil cp command support this (with a -d flag or similar), or does anyone have another suggestion? Any help would be greatly appreciated. Thanks

Sorry for not noticing it properly; I was able to do that using gsutil -m mv gs://s_bucketname/foldername gs://d_bucketname/foldername
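For reference, a minimal sketch of that approach; gsutil mv performs the copy to the destination and then deletes the source objects, so it covers both steps in one command (bucket and folder names below are placeholders):
# -m parallelises the transfer for many files; mv = copy to destination, then delete from source
gsutil -m mv gs://source-bucket/foldername gs://destination-bucket/foldername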

Related

Prevent rclone from re-copying files to AWS S3 Deep Archive

I'm using rclone in order to copy some files to an S3 bucket (deep archive). The command I'm using is:
rclone copy --ignore-existing --progress --max-delete 0 "/var/vmail" foo-backups:foo-backups/vmail
This makes rclone copy files that I know for sure already exist in the bucket. I tried removing the --ignore-existing flag (which IMHO is badly named, as it does exactly the opposite of what you'd initially expect), but I still get the same behaviour.
I also tried adding --size-only, but that doesn't fix the "bug" either.
How can I make rclone copy only new files?
You could use rclone sync; check out https://rclone.org/commands/rclone_sync/
Doesn’t transfer unchanged files, testing by size and modification time or MD5SUM. Destination is updated to match source, including deleting files if necessary.
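As a sketch, reusing the paths and remote name from the question; --dry-run is added here only to preview what sync would change, since sync can also delete files at the destination:
rclone sync --progress --dry-run "/var/vmail" foo-backups:foo-backups/vmail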
It turned out to be a bug in rclone. https://github.com/rclone/rclone/issues/3834

gsutil rsync only files matching a pattern

I need to rsync files from a bucket to a local machine every day, and the bucket contains 20k files. I need to download only the changed files that end with *some_naming_convention.csv.
What's the best way to do that? Using a wildcard in the download source gave me an error.
I don't think you can do that with rsync. As Christopher told you, you can skip files by using the "-x" flag, but not sync just those [1]. I created a public Feature Request on your behalf [2] for you to follow updates there.
As I say in the FR, IMHO this doesn't follow the purpose of rsync, which is to keep folders/buckets synchronised; synchronising only some of the files doesn't fall within that purpose.
There is a possible "workaround" by using gsutil cp to copy files and -n to skip the ones that already exist. The whole command for your case should be:
gsutil -m cp -n gs://<bucket>/*some_naming_convention.csv <directory>
Another option, maybe a little bit more far-fetched, is to copy/move those files to a folder and then use that folder to rsync.
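A rough sketch of that workaround, using a hypothetical csv-staging/ prefix as the intermediate folder (bucket name and local directory are placeholders):
# copy only the matching files into a staging prefix, then rsync that prefix locally
gsutil -m cp gs://<bucket>/*some_naming_convention.csv gs://<bucket>/csv-staging/
gsutil -m rsync -r gs://<bucket>/csv-staging <directory>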
I hope this works for you ;)
Original Answer
From here, you can do something like gsutil rsync -r -x '^(?!.*\.json$).*' gs://mybucket mydir to rsync all json files. The key is the ?! prefix to the pattern you actually want.
Edit
The -x flag excludes a pattern. The pattern ^(?!.*\.json$).* uses negative look-ahead to specify patterns not ending in .json. It follows that the result of the gsutil rsync call will get all files which end in .json.
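Adapting that pattern to the naming convention in the question might look like the following sketch (bucket name and local directory are placeholders):
# exclude everything whose path does not end in some_naming_convention.csv
gsutil -m rsync -r -x '^(?!.*some_naming_convention\.csv$).*' gs://mybucket mydir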
Rsync lets you include and exclude files matching patterns.
For each file, rsync applies the first pattern that matches, so if you want to sync only selected files you need to include those and then exclude everything else.
Add the following to your rsync options:
--include='*some_naming_convention.csv' --exclude='*'
That's enough if all your files are in one directory. If you also want to search subfolders then you need a little bit more:
--include='*/' --include='*some_naming_convention.csv' --exclude='*'
This will replicate the whole directory tree, but only copy the files you want. If that leaves empty directories you don't want, then add --prune-empty-dirs.
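Put together, a sketch of a full command with those options might look like this (source and destination paths are placeholders):
# recurse into subfolders, copy only the matching CSVs, and drop any empty directories left over
rsync -av --prune-empty-dirs --include='*/' --include='*some_naming_convention.csv' --exclude='*' /source/dir/ /dest/dir/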

Non-coder needs to download from S3 using AWS CLI

I am trying to download a folder from S3 using the AWS CLI, and I think the issue I am having is with the target folder and how I need to describe where the downloaded folder should go.
I have all the initial steps in place (configure, keys, region) and that is all good, but I think the issue is the call itself and the place to deliver to.
aws s3 cp s3://arn:aws:s3:::temporary-bucket-to-restore-website-files/ folder/file --profile pname --exclude "*" --recursive
The mistake you made is that you did not pass the bucket name correctly. You have to pass the S3Uri (s3://temporary-bucket-to-restore-website-files/), not the bucket ARN. Modify your command as given below and it will work.
aws s3 cp s3://temporary-bucket-to-restore-website-files/ folder/file --profile pname --exclude "*" --recursive
Ref: https://docs.aws.amazon.com/cli/latest/reference/s3/cp.html
I hope this helps you!

Is it possible to sync a single file to s3?

I'd like to sync a single file from my filesystem to s3.
Is this possible, or can only directories be synced?
Use the include/exclude options with the sync command on the containing directory:
e.g. To sync just /var/local/path/filename.xyz to S3, use:
aws s3 sync /var/local/path s3://bucket/path --exclude='*' --include='*/filename.xyz'
cp can be used to copy a single file to S3. If the filename already exists in the destination, this will replace it:
aws s3 cp local/path/to/file.js s3://bucket/path/to/file.js
Keep in mind that per the docs, sync will only make updates to the target if there have been file changes to the source file since the last run: s3 sync updates any files that have a size or modified time that are different from files with the same name at the destination. However, cp will always make updates to the target regardless of whether the source file has been modified.
Reference: AWS CLI Command Reference: cp
Just to comment on pythonjsgeo's answer: that seems to be the right solution, but make sure to execute the command without the = symbol after the include and exclude flags. I was including the = symbol and getting weird behaviour with the sync command.
aws s3 sync /var/local/path s3://bucket/path --exclude '*' --include '*/filename.xyz'
You can mount the S3 bucket as a local folder (using RioFS, for example) and then use your favourite tool to synchronize files or directories.

Moving many files in the same bucket

I've got 200k files in a bucket which I need to move into a subfolder within the same bucket. What's the best approach?
I recently encountered the same problem. I solved it using the AWS command line interface (CLI).
http://docs.aws.amazon.com/cli/latest/index.html
http://docs.aws.amazon.com/cli/latest/reference/s3/mv.html
aws s3 mv s3://BUCKETNAME/myfolder/photos/ s3://BUCKETNAME/myotherfolder/photos/ --recursive
I had a need for the objects to be publicly viewable, so I added the --acl public-read option.
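With that option added, the full command looks like:
aws s3 mv s3://BUCKETNAME/myfolder/photos/ s3://BUCKETNAME/myotherfolder/photos/ --recursive --acl public-read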
I was recently able to do this with one command. It also went much faster than making individual requests for each file.
Running a snippet like this:
aws s3 mv s3://bucket-name/ s3://bucket-name/subfolder --recursive --exclude "*" --include "*.txt"
Use the --include flag to selectively pick up the files you want
There is no 'Rename' operation though it would be great if there was.
Instead, you need to loop through each item that you want to rename, perform a copy to a new object and then a delete on the old object.
http://docs.amazonwebservices.com/AmazonS3/latest/API/RESTObjectCOPY.html
http://docs.amazonwebservices.com/AmazonS3/latest/API/RESTObjectDELETE.html
Note: for simplicity's sake, I'm assuming you don't have versioning enabled on your bucket.
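As a sketch, the same copy-then-delete pattern can be done per object with the lower-level s3api commands that wrap those REST operations (bucket and key names below are placeholders):
# copy the object under its new key, then remove the old one
aws s3api copy-object --copy-source my-bucket/old/key.txt --bucket my-bucket --key subfolder/key.txt
aws s3api delete-object --bucket my-bucket --key old/key.txt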
I had this same problem and I ended up using aws s3 mv along with a bash for loop.
I ran aws s3 ls s3://bucket_name to get all of the files in the bucket. Then I decided which files I wanted to move and added them to file_names.txt.
Then I ran the following snippet to move all of the files:
for f in $(cat file_names.txt)
do
aws s3 mv s3://bucket-name/$f s3://bucket-name/subfolder/$f
done
If your files are in a folder, you can use the s3cmd tool:
s3cmd cp --recursive s3://bucket/folder/ s3://bucket/sub_folder/
PS: I'm assuming you have already installed and configured s3cmd.
The script below works perfectly for me, without any issues:
for i in `cat s3folders`
do
aws s3 mv s3://Bucket_Name/"$i"/ s3://Another_Bucket_Name/ --recursive
done
It also deletes the empty folders from the source once the files have been moved to the target.