aws-cli fails to work with one particular S3 bucket on one particular machine

I'm trying to remove the objects (empty bucket) and then copy new ones into an AWS S3 bucket:
aws s3 rm s3://BUCKET_NAME --region us-east-2 --recursive
aws s3 cp ./ s3://BUCKET_NAME/ --region us-east-2 --recursive
The first command fails with the following error:
An error occurred (InvalidRequest) when calling the ListObjects
operation: You are attempting to operate on a bucket in a region that
requires Signature Version 4. You can fix this issue by explicitly
providing the correct region location using the --region argument, the
AWS_DEFAULT_REGION environment variable, or the region variable in the
AWS CLI configuration file. You can get the bucket's location by
running "aws s3api get-bucket-location --bucket BUCKET". Completed 1
part(s) with ... file(s) remaining
Well, the error message is self-explanatory, but the problem is that I've already applied the suggested solution (I've added the --region argument) and I'm certain it is the correct region (I obtained it exactly the way the error message suggests).
To make things even more interesting, the error happens in a GitLab CI environment (let's just call it a server). Just before this error occurs, the exact same command runs successfully against other buckets. It's worth mentioning that those other buckets are in different regions.
To top it all off, I can execute the command on my personal computer with the same credentials as on the CI server! To summarize:
server$ aws s3 rm s3://OTHER_BUCKET --region us-west-2 --recursive <== works
server$ aws s3 rm s3://BUCKET_NAME --region us-east-2 --recursive <== fails
my_pc$ aws s3 rm s3://BUCKET_NAME --region us-east-2 --recursive <== works
Does anyone have any pointers as to what the problem might be?
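A sanity check that may help anyone debugging a similar mismatch (a sketch; BUCKET_NAME is a placeholder) is to confirm, on each machine, which region the bucket reports and which identity the CLI is actually using:
# which region does the bucket report?
aws s3api get-bucket-location --bucket BUCKET_NAME
# which credentials/identity is this machine actually using?
aws sts get-caller-identity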

For anyone else who might be facing the same problem, make sure your AWS CLI is up to date!
server$ aws --version
aws-cli/1.10.52 Python/2.7.14 Linux/4.13.9-coreos botocore/1.4.42
my_pc$ aws --version
aws-cli/1.14.58 Python/3.6.5 Linux/4.13.0-38-generic botocore/1.9.11
Once I updated the server's aws cli tool, everything worked. Now my server is:
server$ aws --version
aws-cli/1.14.49 Python/2.7.14 Linux/4.13.5-coreos-r2 botocore/1.9.2
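If the server's CLI was installed with pip, as the version strings above suggest (this is an assumption; other install methods need their own update path), upgrading is a one-liner:
# upgrade the v1 AWS CLI in place; use the pip that matches the Python it runs under
pip install --upgrade awscli
aws --version  # confirm the new version is picked up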

Related

AWS cli throws error when copying large files

I'm trying to copy objects from an s3 bucket to another using aws cli tool.
It works fine for small objects, but for large files, as soon as the copy starts, I get one of the following errors:
copy failed: s3://bucket/file.ogv to s3://bucket-tmp/file.ogv ('Connection aborted.', OSError(0, 'Error'))
or
copy failed: s3://bucket/file.ogv to s3://bucket-tmp/file.ogv An error occurred (NoSuchKey) when calling the UploadPartCopy operation: Unknown
If I include --no-guess-mime-type, I get:
fatal error: ('Connection aborted.', OSError(0, 'Error'))
I tried --debug, but I didn't understand much of the output; I could see OSError(0, 'Error') again in the log.
Has anyone seen anything like this? In another answer (this one), people mentioned another tool, s3cmd, but I couldn't make it work.
I'm trying to access Ceph on a corporate server with path-style URLs and an HTTPS endpoint.
My command:
aws --endpoint-url https://myendpoint.url s3 cp s3://mybucket s3://mybucket-tmp --recursive
Also, when I tried to configure s3cmd, I got an ugly Python traceback with OSError: [Errno 0] Error in the middle.
I discovered that if I use the s3api command instead of the s3 command, it works. The format of the working command is:
aws --endpoint-url <my-endpoint-url> s3api copy-object --copy-source my-source-bucket/whatever/path/file.txt --key whatever/path/file.txt --bucket my-destination-bucket
It only copies one file at a time. You can get a list of objects in the bucket using the s3 ls command or the s3api list-objects command.
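To copy a whole bucket this way, a shell loop can feed the key list into copy-object one object at a time. A minimal sketch, assuming the same <my-endpoint-url> placeholder and keys without whitespace:
# list every key in the source bucket, then copy each one individually
for key in $(aws --endpoint-url <my-endpoint-url> s3api list-objects-v2 \
    --bucket my-source-bucket --query 'Contents[].Key' --output text); do
  aws --endpoint-url <my-endpoint-url> s3api copy-object \
      --copy-source "my-source-bucket/$key" \
      --key "$key" \
      --bucket my-destination-bucket
done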

Move files in S3 bucket to folder based on file name pattern

I have an S3 bucket with a few thousand files where the file names always match the pattern {hostname}.{contenttype}.{yyyyMMddHH}.zip. I want to create a script that will run once a day to move these files into folders based on the year and month in the file name.
If I try the following aws-cli command
aws s3 mv s3://mybucket/*.202001* s3://mybucket/202001/
I get the following error:
fatal error: An error occurred (404) when calling the HeadObject operation: Key "*.202001*" does not exist
Is there an aws-cli command that I could run on a schedule to achieve this?
I think the way forward would be through the --exclude/--include filter parameters of the S3 CLI commands.
So, for your case,
aws s3 mv s3://mybucket/ s3://mybucket/202001/ --recursive --exclude "*" --include "*.202001*"
should probably do the trick.
For scheduling the CLI command to run daily, I think you can refer to "On AWS, run an AWS CLI command daily".
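As a rough sketch of the scheduling part (assuming a Linux host with the AWS CLI configured, and mybucket still being a placeholder), a cron entry can derive the current year and month and move that month's files once a day; note that % must be escaped inside crontab:
# crontab entry: run daily at 01:00
0 1 * * * ym=$(date +\%Y\%m); aws s3 mv s3://mybucket/ s3://mybucket/$ym/ --recursive --exclude "*" --include "*.${ym}*"
The existing backlog could be handled once by running the same mv command manually for each older month value.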

gsutil: specify project on copy

I'm attempting to come up with commands to facilitate deployment to different environments (production, staging) in my GCP project using gsutil.
The following deploys to production without issue:
gsutil cp -r ./build/* gs://<production-project-name>/
I'd like to deploy to a bucket in another project. The gsutil help page alludes to a -p option for ls and mb used to change the project context of the gsutil command.
I'd like to use a command like this to deploy my app to a staging environment:
gsutil cp -r ./build/* gs://<existing-bucket-in-staging-project>/ -p <staging-project-name>
Alas, the -p option is not available for the cp command. I confirmed on the gsutil cp doc page.
What is the best way to deploy a build artifact to a Google Cloud Storage bucket that lives in a project other than the one currently specified in the terminal environment?
The bucket namespace is global, so as long as the credentials you're using have permission on the bucket in the other project, you shouldn't need a project parameter with the cp command. In other words, this command should work fine:
gsutil cp -r ./build/* gs://<bucket-in-staging-project>
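If the copy does fail, it is usually an access question rather than a missing project flag. Two quick checks (a sketch; the bucket name is a placeholder):
gcloud auth list                                  # which account gsutil/gcloud is currently using
gsutil iam get gs://<bucket-in-staging-project>   # which roles the destination bucket grants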

Files will not move or copy from folder on file system to local bucket

I am using the command
aws s3 mv --recursive Folder s3://bucket/dsFiles/
The aws command is not giving me any feedback. I changed the permissions of the directory:
sudo chmod -R 666 ds000007_R2.0.1/
It looks like the AWS CLI is skipping those files and reporting "File does not exist" for every directory.
I am confused about why it is not actually performing the copy. Is there some size limitation or recursion depth limitation?
I believe you want to cp, not mv. Try the following:
aws s3 cp $local/folder s3://your/bucket --recursive --include "*"
Source, my answer here.
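One more thing worth checking, as an aside (this is an assumption based on the chmod shown in the question, not part of the answer above): chmod -R 666 strips the execute bit from directories, which makes them untraversable and can by itself produce "file does not exist" style failures. Restoring it would look like:
# 'X' re-adds execute only on directories (and on files that already had it)
chmod -R u+rwX ds000007_R2.0.1/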

AWS EMR --steps

I am running the following .sh to run a command on AWS using EMR:
aws emr create-cluster --name "Big Matrix Re Run 5" --ami-version 3.1.0 --auto-terminate --log-uri FILE LOCATION --enable-debugging --instance-groups InstanceGroupType=MASTER,InstanceCount=1,InstanceType=c3.xlarge InstanceGroupType=CORE,InstanceCount=3,InstanceType=c3.xlarge --steps NAME AND LOCATION OF FILE
I've removed the actual file names and locations since those aren't the issue, but I am having trouble with the --steps portion of the script.
How do I specify the steps that I want to run in the cluster? The documentation doesn't give any examples.
Here is the error:
Error parsing parameter '--steps': should be: Key value pairs, where values are separated by commas, and multiple pairs are separated by spaces.
--steps Name=string1,Jar=string1,ActionOnFailure=string1,MainClass=string1,Type=string1,Properties=string1,Args=string1,string2 Name=string1,Jar=string1,ActionOnFailure=string1,MainClass=string1,Type=string1,Properties=string1,Args=string1,string2
Thanks!
The documentation page for the AWS Command-Line Interface create-cluster command shows examples for using the --steps parameter.
Steps can be supplied inline on the command line, or loaded from a JSON file.
From a local JSON file:
aws emr create-cluster --steps file://./multiplefiles.json --ami-version 3.3.1 --instance-groups InstanceGroupType=MASTER,InstanceCount=1,InstanceType=m3.xlarge InstanceGroupType=CORE,InstanceCount=2,InstanceType=m3.xlarge --auto-terminate
Inline, with the step's script stored in Amazon S3:
aws emr create-cluster --steps Type=HIVE,Name='Hive program',ActionOnFailure=CONTINUE,Args=[-f,s3://elasticmapreduce/samples/hive-ads/libs/model-build.q,-d,INPUT=s3://elasticmapreduce/samples/hive-ads/tables,-d,OUTPUT=s3://mybucket/hive-ads/output/2014-04-18/11-07-32,-d,LIBS=s3://elasticmapreduce/samples/hive-ads/libs] --applications Name=Hive --ami-version 3.1.0 --instance-groups InstanceGroupType=MASTER,InstanceCount=1,InstanceType=m3.xlarge InstanceGroupType=CORE,InstanceCount=2,InstanceType=m3.xlarge
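For the file:// form, the JSON file holds a list of step objects using the same field names the error message lists (Name, Type, ActionOnFailure, Jar, MainClass, Properties, Args). A hedged sketch of what multiplefiles.json might contain (the S3 paths are placeholders):
[
  {
    "Type": "HIVE",
    "Name": "Hive program",
    "ActionOnFailure": "CONTINUE",
    "Args": ["-f", "s3://mybucket/scripts/model-build.q",
             "-d", "INPUT=s3://mybucket/tables",
             "-d", "OUTPUT=s3://mybucket/output"]
  },
  {
    "Type": "CUSTOM_JAR",
    "Name": "Custom JAR step",
    "ActionOnFailure": "CONTINUE",
    "Jar": "s3://mybucket/jars/my-job.jar",
    "Args": ["arg1", "arg2"]
  }
]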