boto3 S3: update `expiry-date` on object - amazon-s3

My object has the attribute 'Expiration': 'expiry-date="Sun, 16 Jul 2017 00:00:00 GMT"' that defines when this object will be deleted; the date is set by S3 from a lifecycle rule. Is there any way to update this date from boto3 so the object is auto-deleted later? By the way, I found the same datetime in the x-amz-expiration attribute.

While your object is probably already gone, there is already an answered question for that specific topic: s3 per object expiry
tl;dr: Expiration is per S3 bucket, but by emulating touch you can extend the expiry date of individual objects.
As you asked for a boto3 solution and such a solution isn't given in the linked question, here is one with boto3:
#!/usr/bin/env python3
import boto3

client = boto3.client('s3')

# Upload the object initially.
client.put_object(Body='file content',
                  Bucket='your-bucket',
                  Key='testfile')

# Replace the object with itself. That will "reset" the expiry timer.
# As S3 only allows that in combination with changing metadata, storage
# class, website redirect location or encryption attributes, simply
# add some metadata.
client.copy_object(CopySource='your-bucket/testfile',
                   Bucket='your-bucket',
                   Key='testfile',
                   Metadata={'foo': 'bar'},
                   MetadataDirective='REPLACE')
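If you want to confirm that the self-copy really pushed the expiry date out, the same Expiration value mentioned in the question is returned by head_object. A minimal check, reusing the client and the placeholder bucket/key from the snippet above:

# Read back the lifecycle expiration after the copy; the Expiration field
# mirrors the x-amz-expiration header, e.g.
# 'expiry-date="Sun, 16 Jul 2017 00:00:00 GMT", rule-id="..."'.
response = client.head_object(Bucket='your-bucket', Key='testfile')
print(response.get('Expiration'))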

Related

Concurrent invalidations of the same object in Cloudfront via Lambda

I have a problem with concurrent creations of Cloudfront invalidations from AWS Lambda for the same object.
I have set up a Lambda handler to be triggered by specific S3 object creations and removals, in order to invalidate cached versions on my CloudFront distribution. This is the function code, written in Python. The code does not detect whether an invalidation is currently in progress:
from __future__ import print_function
import time

import boto3
from botocore.config import Config

config = Config(
    retries={
        'max_attempts': 6,
        'mode': 'standard'
    }
)

cloudfront = boto3.client('cloudfront', config=config)

def lambda_handler(event, context):
    # Each record corresponds to one S3 object creation or removal.
    for items in event["Records"]:
        path = "/" + items["s3"]["object"]["key"]
        print(path)
        invalidation = cloudfront.create_invalidation(
            DistributionId='xxxxx',
            InvalidationBatch={
                'Paths': {
                    'Quantity': 1,
                    'Items': [path]
                },
                'CallerReference': str(time.time())
            })
How would I tell the function to only trigger when there is no invalidation with status InProgress for that same object?
The function will always trigger. There is no way to tell it to not trigger based on something happening in CloudFront.
However, you could add some logic in the function to only send an invalidation request to CloudFront if one isn't already running for that path. To do this you would list the current invalidations, and then get the details of each invalidation to see if it has the same path.
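A minimal boto3 sketch of that check; the helper name invalidation_in_progress is made up for illustration, it reuses the cloudfront client created above, and you would call it from lambda_handler before create_invalidation:

def invalidation_in_progress(distribution_id, path):
    """Return True if an InProgress invalidation already covers this path."""
    # List the distribution's recent invalidations.
    listing = cloudfront.list_invalidations(DistributionId=distribution_id)
    for summary in listing.get('InvalidationList', {}).get('Items', []):
        if summary['Status'] != 'InProgress':
            continue
        # Fetch the full invalidation to see which paths it covers.
        detail = cloudfront.get_invalidation(DistributionId=distribution_id,
                                             Id=summary['Id'])
        if path in detail['Invalidation']['InvalidationBatch']['Paths']['Items']:
            return True
    return False

Note that list_invalidations is paginated, so for a distribution with many invalidations you would also have to follow NextMarker.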

Elixir Arc: Extend S3 Header Expiry Time

I'm using the arc attachment library for Elixir: https://github.com/stavro/arc, and I want to increase the expiry time of the signed generated URLs.
The default expiry time for S3 headers is set here:
https://github.com/stavro/arc/blob/3d1754b3e65e0f43b87c38c8ba696eadaeeeae27/lib/arc/storage/s3.ex#L3
Which produces the following in the link request to S3:
...&X-Amz-Date=20180125T203430Z&X-Amz-Expires=300&X-Amz-SignedHeaders=host&X-Amz-Signature=...
The readme says that you can extend the S3 header expiry by adding an s3_object_headers method to your uploader.
Presuming that this is what I needed to do, here's what I added:
def s3_object_headers(version, {file, scope}) do
  [expires: 600]
end
But I still get the same Amz-Expires value (300). I also tried using :expires_in and :expires_at as the code seemed to reference those values, but got the same result.
What have I done wrong or failed to understand about how this works?
expires_in needs to be passed in the last argument to your module's url/3 function, not put in s3_object_headers/2:
YourModule.url(..., ..., expires_in: 600)
Reading the signing code, I think the readme might be wrong; it's :expires_in (or :expire_in) that you need to define in s3_object_headers.

S3 java SDK - set expiry to object

I am trying to upload a file to S3 and set an expiration date for it using the Java SDK.
This is the code I've got:
Instant expiration = Instant.now().plus(3, ChronoUnit.DAYS);
ObjectMetadata metadata = new ObjectMetadata();
metadata.setExpirationTime(Date.from(expiration));
metadata.setHeader("Expires", Date.from(expiration));
s3Client.putObject(bucketName, keyName, new FileInputStream(file), metadata);
The object has no expiration date on it in the S3 console.
What can I do?
These are two unrelated things. The expiration time shown in the console is x-amz-expiration, which is populated by the system, by lifecycle policies. It is read-only.
x-amz-expiration
Amazon S3 will return this header if an Expiration action is configured for the object as part of the bucket's lifecycle configuration. The header value includes an "expiry-date" component and a URL-encoded "rule-id" component.
https://docs.aws.amazon.com/AmazonS3/latest/API/RESTObjectHEAD.html
Expires is a header which, when set on an object, is returned in the response when the object is downloaded.
Expires
The date and time at which the object is no longer able to be cached. For more information, go to http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.21.
https://docs.aws.amazon.com/AmazonS3/latest/API/RESTObjectPUT.html
It isn't possible to tell S3 when to expire (delete) a specific object -- this is only done as part of bucket lifecycle policies, as described in the User Guide under Object Lifecycle Management.
According to the documentation, the method setExpirationTime() is for internal use only and does not set the expiration time for the uploaded object:
public void setExpirationTime(Date expirationTime)
For internal use only. This will not set the object's expiration
time, and is only used to set the value in the object after receiving
the value in a response from S3.
So you can't directly set an expiration date for a particular object. To solve this problem you can:
Define a lifecycle rule for the whole bucket (remove all objects after a number of days)
Define a lifecycle rule at the bucket level that removes objects with a specific tag or prefix after a number of days (a boto3 sketch follows the documentation link below)
To define those rules, see the documentation:
https://docs.aws.amazon.com/AmazonS3/latest/userguide/how-to-set-lifecycle-configuration-intro.html
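The question is about the Java SDK, but since the rest of this thread uses Python, here is a hedged boto3 sketch of the second option; the bucket name, tag key and tag value are placeholders, and the Java SDK exposes the equivalent operation through its bucket lifecycle configuration classes:

import boto3

s3 = boto3.client('s3')

# Lifecycle rule: expire any object tagged delete-after=3-days
# three days after it was created. Bucket and tag are placeholders.
s3.put_bucket_lifecycle_configuration(
    Bucket='your-bucket',
    LifecycleConfiguration={
        'Rules': [
            {
                'ID': 'expire-tagged-objects',
                'Filter': {'Tag': {'Key': 'delete-after', 'Value': '3-days'}},
                'Status': 'Enabled',
                'Expiration': {'Days': 3}
            }
        ]
    }
)

Objects you want to expire then just need that tag applied at upload time.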

boto - more concise way to get key's value from bucket?

I'm trying to figure out a concise way to get data from s3 via boto
My current code looks like this. s3_manager is simply a class that does all the S3 setup for my app.
log.debug("generating downloader")
downloader = s3_manager()
log.debug("accessing bucket")
bucket_archive = downloader.s3_buckets['#archive']
log.debug("getting key")
key = bucket_archive.get_key(archive_filename)
log.debug("getting key into string")
source = key.get_contents_as_string()
The problem is that, looking at my debug logs, I'm making two requests to Amazon S3:
key = bucket_archive.get_key(archive_filename)
source = key.get_contents_as_string()
Looking at the docs [ http://boto.readthedocs.org/en/latest/ref/s3.html ], it seems that the call to get_key checks whether the key exists, while the second call gets the actual data. Does anyone know of a method to do both at once? A more concise way of doing this with one request would be preferable for our app.
The get_key() method performs a HEAD request on the object to verify that it exists. If you are certain that the bucket and key exist and would prefer not to have the overhead of a HEAD request, you can simply create a Key object directly. Something like this would work:
import boto
s3 = boto.connect_s3()
bucket = s3.get_bucket('mybucket', validate=False)
key = bucket.new_key('myexistingkey')
contents = key.get_contents_as_string()
The validate=False on the call to get_bucket eliminates a GET request that also is intended to validate that the bucket exists.

Deleting logs file in Amazon s3 bucket according to created date

How do I delete log files in Amazon S3 according to date? I have log files in a logs folder inside my bucket.
string sdate = datetime.ToString("yyyy-MM-dd");
string key = "logs/" + sdate + "*" ;
AmazonS3 s3Client = AWSClientFactory.CreateAmazonS3Client();
DeleteObjectRequest delRequest = new DeleteObjectRequest()
.WithBucketName(S3_Bucket_Name)
.WithKey(key);
DeleteObjectResponse res = s3Client.DeleteObject(delRequest);
I tried this but it doesn't seem to work. I can delete individual files if I put the whole name in the key, but I want to delete all the log files created for a particular date.
You can use S3's Object Lifecycle feature, specifically Object Expiration, to delete all objects under a given prefix and over a given age. It's not instantaneous, but it beats having to make myriad individual requests. To delete everything, just make the age small.
http://docs.aws.amazon.com/AmazonS3/latest/dev/object-lifecycle-mgmt.html
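The question's code is C#, but to keep the examples in this thread in Python, here is a hedged sketch of such a rule using the same put_bucket_lifecycle_configuration call shown earlier, this time scoped to the logs/ prefix; the bucket name and the one-day age are placeholders:

import boto3

s3 = boto3.client('s3')

# Expire everything under logs/ one day after creation. Lifecycle rules are
# evaluated asynchronously, so the deletions are not immediate.
s3.put_bucket_lifecycle_configuration(
    Bucket='your-bucket',
    LifecycleConfiguration={
        'Rules': [
            {
                'ID': 'expire-old-logs',
                'Filter': {'Prefix': 'logs/'},
                'Status': 'Enabled',
                'Expiration': {'Days': 1}
            }
        ]
    }
)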