I want to change all the files in a bucket to private, and I'm wondering how to do it with aws-shell. I think the mv command might be useful for this, but I can't figure out how to use it; this is the first time I've used aws-shell.
Edit 1
I tried using s3 mv s3://bucket s3://temporary --recursive --acl private, but I had to create a temporary bucket to make the swap because of this error:
Cannot mv a file onto itself [...]
Is there a way to do this without creating a temporary bucket? Doing it that way could incur charges for the extra requests and for the space used by the duplicated files.
You can copy the files onto themselves and change the Access Control List.
Test it out, but it would be something like:
aws s3 cp s3://bucket s3://bucket --recursive --acl private
Keep the source and destination the same.
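If you want to avoid re-copying the objects at all, roughly the same thing can be done with boto3 by setting each object's ACL directly; the sketch below is untested and the bucket name is a placeholder.

import boto3

# Walk every object in the bucket and set its ACL to private,
# instead of copying the objects onto themselves.
s3 = boto3.client("s3")
paginator = s3.get_paginator("list_objects_v2")

for page in paginator.paginate(Bucket="bucket"):
    for obj in page.get("Contents", []):
        s3.put_object_acl(Bucket="bucket", Key=obj["Key"], ACL="private")

Either way, every existing object ends up with a private ACL.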
I have a command to move files between S3 folders, and I am getting the bucket name from a context variable.
I placed the command in the array line:
"aws s3 mv s3://"+context.bucket+"/Egress/test1/abc/xyz.dat s3://"+context.bucket+"/Egress/test1/abc/archive/archive_xyz.dat"
The command fetches the bucket name from the context variable, but it fails with a "no such file or directory" error (error=2).
I think it is due to the double quotes (") at the beginning and end.
Is there any way to solve this? Please help.
You probably want to use an array command: /bin/bash (or cmd on Windows) as the first element, then your command.
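To illustrate the idea (just a sketch in Python; the bucket value stands in for context.bucket), the shell and the command string become separate elements of the array, so nothing inside the command itself needs extra quoting:

import subprocess

bucket = "my-bucket"  # placeholder for the value normally taken from context.bucket
cmd = (
    "aws s3 mv "
    f"s3://{bucket}/Egress/test1/abc/xyz.dat "
    f"s3://{bucket}/Egress/test1/abc/archive/archive_xyz.dat"
)

# Pass the shell and the command as separate array elements rather than one
# big quoted string, so /bin/bash parses the aws call instead of the whole
# string being looked up as a single executable name (typically what produces error=2).
subprocess.run(["/bin/bash", "-c", cmd], check=True)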
I am trying to generate a file from a DataFrame that I have created in AWS Glue, and I want to give it a specific name. Most answers on Stack Overflow use filesystem modules, but here the CSV file is generated in S3, and I want to give the file a name while generating it, not rename it after it is generated. Is there any way to do that?
I have tried using df.save("s3://PATH/filename.csv"), which actually creates a new directory in S3 named filename.csv and then generates part-*.csv files inside that directory.
df.repartition(1).write.mode('append').format('csv').option("header", "true").save('s3://PATH')
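Note that Spark can only write a directory of part-* files; save() cannot produce a single file with a chosen name. If a fixed key is required, a common workaround is to copy the single part file to the desired key after the write, for example with boto3 (bucket, prefix and target key below are placeholders; untested):

import boto3

s3 = boto3.client("s3")
bucket = "my-bucket"              # placeholder
prefix = "PATH/"                  # directory the DataFrame was written to
target_key = "PATH/filename.csv"  # the name you actually want

# Locate the single part-*.csv produced by repartition(1), copy it to the
# desired key, then delete the original part file.
objects = s3.list_objects_v2(Bucket=bucket, Prefix=prefix).get("Contents", [])
part_key = next(o["Key"] for o in objects if o["Key"].endswith(".csv"))
s3.copy_object(Bucket=bucket, CopySource={"Bucket": bucket, "Key": part_key}, Key=target_key)
s3.delete_object(Bucket=bucket, Key=part_key)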
I want to use the S3 CSV Input step to load multiple files from an S3 bucket, then transform them and load them back into S3. But as far as I can see, this step supports only one file at a time, and I need to supply the file names. Is there any way to load all files at once by supplying only the bucket name, i.e. <s3-bucket-name>/*?
S3-CSV-Input is inspired by CSV-Input and doesn't support multi-file-processing like Text-File-Input does, for example. You'll have to retrieve the filenames first, so you can loop over the filename list as you would do with CSV-Input.
Two options:
AWS CLI method
Write a simple shell script that calls the AWS CLI; put it in your path and call it s3.sh (a boto3 alternative is sketched after the step list below).
#!/bin/bash
aws s3 ls s3://bucket.name/path | cut -c32-  # drop the date/time/size columns, keep only the object key
In PDI:
Generate Rows: Limit 1, Fields: Name: process, Type: String, Value s3.sh
Execute a Process: Process field: process, Output Line Delimiter |
Split Field to Rows: Field to split: Result output. Delimiter | New field name: filename
S3 CSV Input: The filename field: filename
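If you would rather not depend on the column layout that cut -c32- assumes, a small boto3 script (bucket and prefix are placeholders; untested) can print one key per line and be called from the same Execute a Process step:

import boto3

# Print one object key per line; PDI's Execute a Process step can capture
# this output exactly like the s3.sh output above.
s3 = boto3.client("s3")
paginator = s3.get_paginator("list_objects_v2")

for page in paginator.paginate(Bucket="bucket.name", Prefix="path/"):
    for obj in page.get("Contents", []):
        print(obj["Key"])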
S3 Local Sync
Mount the S3 directory to a local directory, using s3fs
If you have many large files in that bucket directory, this won't be very fast, though it might be okay if your PDI instance runs on an Amazon machine.
Then use the standard file reading tools
$ s3fs my-bucket.example.com:/path ~/my-s3-files -o use_path_request_style -o url=https://s3.us-west-2.amazonaws.com
I have a huge (~6 GB) file on Amazon S3 and want to get the first 100 lines of it without having to download the whole thing. Is this possible?
Here's what I'm doing now:
aws s3 cp s3://foo/bar - | head -n 100
But this takes a while to execute. I'm confused: shouldn't head close the pipe once it has read enough lines, causing aws s3 cp to fail with a BrokenPipeError before it has time to download the entire file?
Using the Range HTTP header in a GET request, you can retrieve a specific range of bytes in an object stored in Amazon S3. (see http://docs.aws.amazon.com/AmazonS3/latest/API/RESTObjectGET.html)
If you use the AWS CLI, you can use aws s3api get-object --range bytes=0-xxx; see http://docs.aws.amazon.com/cli/latest/reference/s3api/get-object.html
It is not specified as a number of lines, but it should allow you to retrieve part of your file and avoid downloading the full object.
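As a rough boto3 sketch of the same idea (bucket, key and byte range are placeholders), request only the first chunk of the object and take the first 100 lines from it:

import boto3

s3 = boto3.client("s3")

# Fetch only the first 1 MB of the object; enlarge the range if 100 lines
# need more bytes than that.
resp = s3.get_object(Bucket="foo", Key="bar", Range="bytes=0-1048575")
body = resp["Body"].read().decode("utf-8", errors="replace")

for line in body.splitlines()[:100]:
    print(line)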
I'm trying to use s3.exe, a Windows CLI for S3 from s3.codeplex.com, to PUT an object.
Here is the command I'm running:
c:\>s3 put My-Bucket file.txt /key:MYKEY /secret:MYSECRET
It returns: <403> Forbidden.
But when I try to PUT the file into a bucket without a hyphen, it works.
c:\>s3 put MyNoHyphenBucket file.txt /key:MYKEY /secret:MYSECRET
Can someone else try it and see if they have the same issue? Any help on how to get it working with hyphenated bucket names would be greatly appreciated.
I'd be open to trying alternative s3 CLI for Windows.
Are you using an EU or NA bucket?
I found this:
"European Bucket allows only lower case letters. Although Buckets created in the US may contain lower case and upper case both, Amazon recommends that you use all lower case letters when creating a bucket."
Apparently whatever's behind that also impacts hyphens.
With an EU bucket, I get the same behaviour (403) as you. Repeating the experiment with an NA bucket, it succeeds.
I saw this error on non-US buckets.
So I created a US bucket (selecting the US Standard region when creating it), and now everything works fine!