Updating existing metadata of my S3 object - amazon-s3

I am trying to update the existing metadata of my S3 object, but instead of updating it, the call creates a new metadata entry. I followed the approach shown in the documentation, and I don't know why it does not update the existing value.
k = s3.head_object(Bucket='test-bucket', Key='test.json')
s3.copy_object(Bucket='test-bucket', Key='test.json', CopySource='test-bucket' + '/' + 'test.json', Metadata={'Content-Type': 'text/plain'}, MetadataDirective='REPLACE')

I was able to update it using the copy_from method:
import boto3
s3 = boto3.resource('s3')
# bucketName, uploadedKey and value (the new content type) come from the caller.
obj = s3.Object(bucketName, uploadedKey)
obj.copy_from(
    CopySource={'Bucket': bucketName, 'Key': uploadedKey},
    MetadataDirective="REPLACE",
    ContentType=value
)

S3 metadata is read-only once the object exists, so you cannot update only the metadata of an S3 object. The only way to change it is to recreate or copy the object. The first paragraph of the official docs says:
You can set object metadata at the time you upload it. After you upload the object, you cannot modify object metadata. The only way to modify object metadata is to make a copy of the object and set the metadata.
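The call in the question passes 'Content-Type' inside Metadata, which S3 stores as a user-defined x-amz-meta-* entry rather than as the object's content type. A minimal sketch of the arguments for an in-place copy follows; the helper name build_copy_args and its parameters are made up for illustration and are not part of boto3.

```python
def build_copy_args(bucket, key, content_type, user_metadata=None):
    """Build keyword arguments for an in-place S3 copy that replaces metadata."""
    return {
        "Bucket": bucket,
        "Key": key,
        # Source and destination are the same object.
        "CopySource": {"Bucket": bucket, "Key": key},
        # System-defined header: it has its own parameter and does not go in Metadata.
        "ContentType": content_type,
        # User-defined metadata only; S3 stores these as x-amz-meta-* headers.
        "Metadata": dict(user_metadata or {}),
        # The default directive is COPY, which keeps the source object's metadata.
        "MetadataDirective": "REPLACE",
    }
```

Assuming a client created with boto3.client('s3'), the result could be passed as s3.copy_object(**build_copy_args('test-bucket', 'test.json', 'text/plain')).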

Related

AWS structure of S3 trigger

I am building a Python Lambda in AWS and want to add an S3 trigger to it. Following these instructions, I saw how to get the bucket and key that fired the trigger:
def func(event):
    bucket = event['Records'][0]['s3']['bucket']['name']
    key = urllib.parse.unquote_plus(event['Records'][0]['s3']['object']['key'], encoding='utf-8')
The link includes an example of such an object, but I could not find a description of the entire event object anywhere in AWS's documentation.
Is there documentation for this object's structure? Where might I find it?
The whole event object is described in the S3 documentation:
https://docs.aws.amazon.com/AmazonS3/latest/userguide/notification-content-structure.html
I would also advise iterating over the records, because one event can contain several of them:
for record in event['Records']:
    bucket = record['s3']['bucket']['name']
    key = record['s3']['object']['key']
    [...]
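Combining the two snippets above, a handler might look roughly like this. It is a sketch only: iter_s3_objects and lambda_handler are illustrative names, and the decoding step is the same unquote_plus call used in the question, since object keys arrive URL-encoded in the event.

```python
import urllib.parse


def iter_s3_objects(event):
    """Yield (bucket, key) for every record in an S3 notification event."""
    for record in event.get("Records", []):
        s3_info = record["s3"]
        bucket = s3_info["bucket"]["name"]
        # Keys are URL-encoded in the event (a space arrives as '+'), so decode them.
        key = urllib.parse.unquote_plus(s3_info["object"]["key"], encoding="utf-8")
        yield bucket, key


def lambda_handler(event, context):
    # Return the "bucket/key" path of each object named in the event.
    return [f"{bucket}/{key}" for bucket, key in iter_s3_objects(event)]
```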

how to update metadata on an S3 object larger than 5GB?

I am using the boto3 API to update the S3 metadata on an object.
I am following the answer to "How to update metadata of an existing object in AWS S3 using python boto3?"
My code looks like this:
s3_object = s3.Object(bucket,key)
new_metadata = {'foo':'bar'}
s3_object.metadata.update(new_metadata)
s3_object.copy_from(CopySource={'Bucket':bucket,'Key':key}, Metadata=s3_object.metadata, MetadataDirective='REPLACE')
This code fails when the object is larger than 5GB. I get this error:
botocore.exceptions.ClientError: An error occurred (InvalidRequest) when calling the CopyObject operation: The specified copy source is larger than the maximum allowable size for a copy source: 5368709120
How does one update the metadata on an object larger than 5GB?
Because of the size of your object, a single CopyObject call will not work. Try starting a multipart upload and copying the object into it part by part with the copy_from method of MultipartUploadPart. See the boto3 docs for more information:
https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/s3.html#S3.MultipartUploadPart.copy_from
Apparently, you can't just update the metadata; you need to re-copy the object to S3. You can copy it from S3 back to S3, but you can't update it in place, which is annoying for objects in the 100-500GB range.
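A rough sketch of that multipart copy, written against the low-level client operations rather than the resource API linked above. The function names and the 1 GiB default part size are assumptions made for illustration. Only the metadata passed in is set on the new object, so anything else you need (content type, ACL, and so on) would have to be supplied to create_multipart_upload as well.

```python
def part_ranges(size, part_size):
    """Yield (part_number, 'bytes=start-end') pairs that cover `size` bytes."""
    start, number = 0, 1
    while start < size:
        end = min(start + part_size, size) - 1
        yield number, f"bytes={start}-{end}"
        start, number = end + 1, number + 1


def replace_metadata_multipart(client, bucket, key, metadata, part_size=1024 ** 3):
    """Copy an object onto itself in parts, attaching new user metadata.

    `client` is expected to behave like boto3.client('s3').
    """
    size = client.head_object(Bucket=bucket, Key=key)["ContentLength"]
    upload = client.create_multipart_upload(Bucket=bucket, Key=key, Metadata=metadata)
    upload_id = upload["UploadId"]
    parts = []
    try:
        for number, byte_range in part_ranges(size, part_size):
            result = client.upload_part_copy(
                Bucket=bucket, Key=key, UploadId=upload_id, PartNumber=number,
                CopySource={"Bucket": bucket, "Key": key},
                CopySourceRange=byte_range,
            )
            parts.append({"PartNumber": number, "ETag": result["CopyPartResult"]["ETag"]})
        client.complete_multipart_upload(
            Bucket=bucket, Key=key, UploadId=upload_id,
            MultipartUpload={"Parts": parts},
        )
    except Exception:
        # Abort so the already-copied parts do not stay behind in the bucket.
        client.abort_multipart_upload(Bucket=bucket, Key=key, UploadId=upload_id)
        raise
    return parts
```

With the names from the question, a call could look like replace_metadata_multipart(boto3.client('s3'), bucket, key, {'foo': 'bar'}).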

Add Metadata on Amazon S3 object is not persistent

I am trying to add a new metadata field to an S3 object so that I know which objects I have already processed. But for some reason, the field I add is not persistent: when I exit the program, it is gone. In the output below, the new field "processed" is absent from the old metadata and present in the new metadata, as expected. I expected the newly added field to stay on the object permanently, but it disappears after I exit the program.
ObjectMetadata objMetaData = new ObjectMetadata();
objMetaData = s3.getObjectMetadata(bucketName,prefix);
Map<String, String> map = new HashMap<String, String>();
Map<String, String> newMap = new HashMap<String, String>();
map = objMetaData.getUserMetadata();
System.out.println("old Meta data is " + map.toString());
objMetaData.addUserMetadata("x-amz-meta-processed", "true");
newMap = objMetaData.getUserMetadata();
System.out.println("New processed data is" +newMap.toString());
I suspect your confusion comes from not understanding an important part of the design of S3:
S3 objects can't be modified, and neither can their metadata. It's all immutable.
Wait, what? It's technically true.
The only way to modify object metadata is to make a copy of the object and set the metadata.
http://docs.aws.amazon.com/AmazonS3/latest/dev/UsingMetadata.html
You can, in practice, "add" metadata to an object, but what that really means is that you're asking S3 to make a copy of the object, with the same key for the source and target, but using "different" metadata.
If the "different" metadata you want on the object includes metadata that was already present, you have to include that in the request.
S3 supports copying an object onto itself, so you don't actually have to re-upload the object.
What you are doing now is just changing the values in your local code's data structures.
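The question uses Java, but the same idea can be sketched in Python with boto3, the library used elsewhere on this page. The function name add_user_metadata is hypothetical. The sketch reads the current metadata, merges the new field into it, and copies the object onto itself. It also passes the existing content type along because, as far as I know, the REPLACE directive applies to system metadata such as Content-Type too. boto3 expects user metadata keys without the x-amz-meta- prefix.

```python
def add_user_metadata(client, bucket, key, extra):
    """Merge `extra` into an object's user metadata via a copy onto itself.

    `client` is expected to behave like boto3.client('s3').
    """
    head = client.head_object(Bucket=bucket, Key=key)
    # Keep what is already there; REPLACE would otherwise drop it.
    merged = {**head.get("Metadata", {}), **extra}
    args = {
        "Bucket": bucket,
        "Key": key,
        "CopySource": {"Bucket": bucket, "Key": key},
        "Metadata": merged,
        "MetadataDirective": "REPLACE",
    }
    if "ContentType" in head:
        # Carry the content type over so the copy does not lose it.
        args["ContentType"] = head["ContentType"]
    client.copy_object(**args)
    return merged
```

For the case in the question, the call would be something like add_user_metadata(client, bucket_name, key, {'processed': 'true'}).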

How to rename objects boto3 S3?

I have about 1000 objects in S3 that are named like this:
abcyearmonthday1
abcyearmonthday2
abcyearmonthday3
...
I want to rename them to:
abc/year/month/day/1
abc/year/month/day/2
abc/year/month/day/3
How can I do this with boto3? Is there an easier way of doing it?
As explained in "Boto3/S3: Renaming an object using copy_object", you cannot rename an object in S3. You have to copy the object under a new name and then delete the old object:
s3 = boto3.resource('s3')
s3.Object('my_bucket','my_file_new').copy_from(CopySource='my_bucket/my_file_old')
s3.Object('my_bucket','my_file_old').delete()
There is no direct way to rename an S3 object.
Two steps are needed:
1. Copy the S3 object to the same location under the new name.
2. Delete the old object.
I had the same problem (in my case, I wanted to rename files that the Redshift UNLOAD command had generated in S3). I solved it by creating a boto3 S3 resource and then copying and deleting the files one by one, like this:
import boto3
# Create an S3 resource from a session with explicit credentials.
s3_session = boto3.session.Session(aws_access_key_id=my_access_key_id, aws_secret_access_key=my_secret_access_key).resource('s3')
# List of (old_s3_file_path, new_s3_file_path) tuples, prefix included, e.g. ('prefix/old_filename.csv000', 'prefix/new_filename.csv')
s3_files_to_rename = []
s3_files_to_rename.append((old_file, new_file))
for old_file, new_file in s3_files_to_rename:
    # Copy to the new key first, then delete the old key.
    s3_session.Object(s3_bucket_name, new_file).copy_from(CopySource=s3_bucket_name + '/' + old_file)
    s3_session.Object(s3_bucket_name, old_file).delete()
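For the naming scheme in the question, the key mapping could be sketched as below. The exact layout of abcyearmonthdayN is not given, so the pattern assumes a literal 'abc', a 4-digit year, a 2-digit month, a 2-digit day and then the index. The names new_key and rename_all are illustrative, and rename_all expects a boto3 Bucket resource.

```python
import re

# Assumed layout: 'abc' + 4-digit year + 2-digit month + 2-digit day + index.
OLD_KEY = re.compile(r"^(abc)(\d{4})(\d{2})(\d{2})(\d+)$")


def new_key(old_key):
    """Map 'abc202401153' to 'abc/2024/01/15/3'; return None if it does not match."""
    match = OLD_KEY.match(old_key)
    return "/".join(match.groups()) if match else None


def rename_all(bucket, prefix="abc"):
    """Copy-then-delete every matching object in a boto3 s3.Bucket resource."""
    renamed = []
    # Materialise the listing first so new keys cannot affect the iteration.
    for summary in list(bucket.objects.filter(Prefix=prefix)):
        target = new_key(summary.key)
        if target is None:
            continue  # already renamed, or not part of this naming scheme
        bucket.Object(target).copy_from(
            CopySource={"Bucket": bucket.name, "Key": summary.key}
        )
        summary.delete()
        renamed.append((summary.key, target))
    return renamed
```

A call might look like rename_all(boto3.resource('s3').Bucket('my_bucket')). Keys that do not match the assumed pattern are skipped, so running it twice should be harmless.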

Amazon S3 boto: How do you rename a file in a bucket?

How do you rename a S3 key in a bucket with boto?
You can't rename files in Amazon S3. You can copy them with a new name, then delete the original, but there's no proper rename function.
Here is an example of a Python function that will copy an S3 object using Boto 2:
import boto

def copy_object(src_bucket_name,
                src_key_name,
                dst_bucket_name,
                dst_key_name,
                metadata=None,
                preserve_acl=True):
    """
    Copy an existing object to another location.

    src_bucket_name   Bucket containing the existing object.
    src_key_name      Name of the existing object.
    dst_bucket_name   Bucket to which the object is being copied.
    dst_key_name      The name of the new object.
    metadata          A dict containing new metadata that you want
                      to associate with this object. If this is None
                      the metadata of the original object will be
                      copied to the new object.
    preserve_acl      If True, the ACL from the original object
                      will be copied to the new object. If False
                      the new object will have the default ACL.
    """
    s3 = boto.connect_s3()
    bucket = s3.lookup(src_bucket_name)
    # Look up the existing object in S3
    key = bucket.lookup(src_key_name)
    # Copy the key to the destination, optionally with new metadata
    return key.copy(dst_bucket_name, dst_key_name,
                    metadata=metadata, preserve_acl=preserve_acl)
There is no direct method to rename a file in S3. What you have to do is copy the existing file under a new name (just set the target key) and then delete the old one.
// Copy the object
AmazonS3Client s3 = new AmazonS3Client("AWSAccessKey", "AWSSecretKey");
CopyObjectRequest copyRequest = new CopyObjectRequest()
    .WithSourceBucket("SourceBucket")
    .WithSourceKey("SourceKey")
    .WithDestinationBucket("DestinationBucket")
    .WithDestinationKey("DestinationKey")
    .WithCannedACL(S3CannedACL.PublicRead);
s3.CopyObject(copyRequest);
// Delete the original
DeleteObjectRequest deleteRequest = new DeleteObjectRequest()
    .WithBucketName("SourceBucket")
    .WithKey("SourceKey");
s3.DeleteObject(deleteRequest);