Concurrent invalidations of the same object in CloudFront via Lambda - amazon-s3

I have a problem with concurrent creation of CloudFront invalidations from AWS Lambda for the same object.
I have set up a Lambda handler to be triggered by specific S3 object creations and removals, in order to invalidate the cached versions on my CloudFront distribution. This is the function code, written in Python. The code does not detect whether an invalidation is currently in progress:
from __future__ import print_function
import time

import boto3
from botocore.config import Config

config = Config(
    retries={
        'max_attempts': 6,
        'mode': 'standard'
    }
)

cloudfront = boto3.client('cloudfront', config=config)

def lambda_handler(event, context):
    for items in event["Records"]:
        path = "/" + items["s3"]["object"]["key"]
        print(path)
        invalidation = cloudfront.create_invalidation(
            DistributionId='xxxxx',
            InvalidationBatch={
                'Paths': {
                    'Quantity': 1,
                    'Items': [path]
                },
                'CallerReference': str(time.time())
            }
        )
How would I tell the function to only trigger when there is no invalidation with a status of InProgress for that same object?

The function will always trigger; there is no way to prevent it from triggering based on something happening in CloudFront.
However, you can add logic to the function so it only sends an invalidation request to CloudFront if one isn't already running for that path: list the current invalidations, then get the details of each in-progress invalidation to see whether it includes the same path (see the sketch below).
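A minimal sketch of that check, reusing the cloudfront client from the question; the helper name is illustrative and the distribution ID is the same placeholder used above:

def invalidation_in_progress(distribution_id, path):
    # List recent invalidations and inspect only the ones still running.
    response = cloudfront.list_invalidations(DistributionId=distribution_id)
    for summary in response.get('InvalidationList', {}).get('Items', []):
        if summary['Status'] != 'InProgress':
            continue
        detail = cloudfront.get_invalidation(DistributionId=distribution_id,
                                             Id=summary['Id'])
        paths = detail['Invalidation']['InvalidationBatch']['Paths']['Items']
        if path in paths:
            return True
    return False

# Inside lambda_handler, only create the invalidation when nothing is running for that path:
#     if not invalidation_in_progress('xxxxx', path):
#         cloudfront.create_invalidation(...)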

Related

Import Telethon Session AWS Lambda

I am trying to use Telethon with AWS Lambda. More precisely, I am trying to get messages from some public channels using the client object.
Is there a way to import an existing session in AWS Lambda, in order to prevent Telegram/Telethon from asking for a validation code (which is not possible to input)?
Here is the code I am using to try to connect to Telegram through Telethon in AWS Lambda:
import os
from telethon import TelegramClient

api_id = os.environ.get('TELEGRAM_API_ID')
api_hash = os.environ.get('TELEGRAM_API_HASH')
userName = os.environ.get('TELEGRAM_USERNAME')
phone = os.environ.get('TELEGRAM_PHONE')

os.chdir("/tmp")
client = TelegramClient(userName, api_id, api_hash)
Here is the session file I have imported into AWS Lambda through Layers (same name as userName).
But it seems the session file is not used/read, as Telethon still asks for the phone number and verification code.
Does anyone know how to fix this? Thanks
It took some time, but I found a solution to this problem and got a Telegram client running on Lambda.
All you need to do is use a different session type, namely StringSession.
As indicated in the official documentation, you generate a StringSession in your local environment, save the string in a file or an environment variable, and use it in your Lambda code.
Generate the StringSession; in this case you will see the output in your terminal:
from telethon.sync import TelegramClient
from telethon.sessions import StringSession

with TelegramClient(StringSession(), api_id, api_hash) as client:
    print(client.session.save())
Save your newly created StringSession into an environment variable in Lambda, as described here, and now you can do something like this:
from telethon.sync import TelegramClient
from telethon.sessions import StringSession
import os

string = os.environ.get('session')  # env variable named "session"

with TelegramClient(StringSession(string), api_id, api_hash) as client:
    client.loop.run_until_complete(client.send_message('me', 'Hi'))
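For completeness, a minimal sketch of how this might be wrapped in a Lambda handler; the environment variable names here are assumptions, not part of the answer above:

import os
from telethon.sync import TelegramClient
from telethon.sessions import StringSession

def lambda_handler(event, context):
    # Assumed variable names; use whatever you configured in the Lambda console.
    api_id = int(os.environ['TELEGRAM_API_ID'])
    api_hash = os.environ['TELEGRAM_API_HASH']
    session_string = os.environ['session']
    with TelegramClient(StringSession(session_string), api_id, api_hash) as client:
        client.loop.run_until_complete(client.send_message('me', 'Hi'))
    return {'status': 'sent'}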

Boto3 generate presigned url does not work

Here is the code I use to create an S3 client and generate a presigned URL; it is fairly standard and has been running on the server for quite a while. I pulled it out and ran it locally in a Jupyter notebook:
import os
import boto3

def get_s3_client():
    return get_s3(create_session=False)

def get_s3(create_session=False):
    session = boto3.session.Session() if create_session else boto3
    S3_ENDPOINT = os.environ.get('AWS_S3_ENDPOINT')
    if S3_ENDPOINT:
        AWS_ACCESS_KEY_ID = os.environ['AWS_ACCESS_KEY_ID']
        AWS_SECRET_ACCESS_KEY = os.environ['AWS_SECRET_ACCESS_KEY']
        AWS_DEFAULT_REGION = os.environ["AWS_DEFAULT_REGION"]
        s3 = session.client('s3',
                            endpoint_url=S3_ENDPOINT,
                            aws_access_key_id=AWS_ACCESS_KEY_ID,
                            aws_secret_access_key=AWS_SECRET_ACCESS_KEY,
                            region_name=AWS_DEFAULT_REGION)
    else:
        s3 = session.client('s3', region_name='us-east-2')
    return s3

s3 = get_s3_client()

BUCKET = [my-bucket-name]
OBJECT_KEY = [my-object-name]

signed_url = s3.generate_presigned_url(
    'get_object',
    ExpiresIn=3600,
    Params={
        "Bucket": BUCKET,
        "Key": OBJECT_KEY,
    }
)
print(signed_url)
When I tried to download the file using the URL in the browser, I got an error message saying "The specified key does not exist." I noticed in the error message that my object key becomes "[my-bucket-name]/[my-object-name]" rather than just "[my-object-name]".
Then I used the same bucket/key combination to generate a presigned URL with the AWS CLI, which worked as expected. I found out that somehow the boto3 client method inserted [my-bucket-name] in front of [my-object-name] compared to the AWS CLI method. Here are the results:
From s3.generate_presigned_url()
https://[my-bucket-name].s3.us-east-2.amazonaws.com/[my-bucket-name]/[my-object-name]?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=AKIAV17K253JHUDLKKHB%2F20210520%2Fus-east-2%2Fs3%2Faws4_request&X-Amz-Date=20210520T175014Z&X-Amz-Expires=3600&X-Amz-SignedHeaders=host&X-Amz-Signature=5cdcc38e5933e92b5xed07b58e421e5418c16942cb9ac6ac6429ac65c9f87d64
From aws cli s3 presign
https://[my-bucket-name].s3.us-east-2.amazonaws.com/[my-object-name]?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=AKIAYA7K15LJHUDAVKHB%2F20210520%2Fus-east-2%2Fs3%2Faws4_request&X-Amz-Date=20210520T155926Z&X-Amz-Expires=3600&X-Amz-SignedHeaders=host&X-Amz-Signature=58208f91985bf3ce72ccf884ba804af30151d158d6ba410dd8fe9d2457369894
I've been working on this and searching for solutions for a day and a half, and I couldn't find out what was wrong with my implementation. I guess I might have missed some basic but important setting for creating an S3 client with boto3, or something else. Thanks for the help!
OK, mystery solved: I shouldn't provide the endpoint_url=S3_ENDPOINT param when I create the S3 client; boto3 will figure the endpoint out on its own. After I removed it, everything worked as expected.
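A minimal sketch of the corrected setup under that assumption; the bucket and key names below are placeholders:

import boto3

# Let boto3 derive the regional S3 endpoint instead of passing endpoint_url.
s3 = boto3.client('s3', region_name='us-east-2')

signed_url = s3.generate_presigned_url(
    'get_object',
    ExpiresIn=3600,
    Params={'Bucket': 'my-bucket-name', 'Key': 'my-object-name'},
)
print(signed_url)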

boto3 S3: update `expiry-date` on object

My object has the attribute 'Expiration': 'expiry-date="Sun, 16 Jul 2017 00:00:00 GMT"', which defines when this object will be deleted; this date is set by S3 from a lifecycle rule. Is there any way to update this date from boto3 so the object is auto-deleted later? By the way, I found the same datetime in the x-amz-expiration attribute.
While your object is probably already gone, there is already an answered question for that specific topic: s3 per object expiry
tl;dr: Expiration is per S3 bucket, but by emulating touch you can extend the expiry date of individual objects.
Since you asked for a boto3 solution and such a solution isn't noted in the linked question, here is one with boto3:
#!/usr/bin/env python3
import boto3

client = boto3.client('s3')

# Upload the object initially.
client.put_object(Body='file content',
                  Bucket='your-bucket',
                  Key='testfile')

# Replace the object with itself. That will "reset" the expiry timer.
# As S3 only allows that in combination with changing metadata, storage
# class, website redirect location or encryption attributes, simply
# add some metadata.
client.copy_object(CopySource='your-bucket/testfile',
                   Bucket='your-bucket',
                   Key='testfile',
                   Metadata={'foo': 'bar'},
                   MetadataDirective='REPLACE')

Exporting CloudWatch Logs to S3 using Lambda

I have some logs in CloudWatch, and every day I keep getting new logs. I want to keep today's and yesterday's logs in CloudWatch itself, but logs that are more than 2 days old have to be moved to S3.
I have tried using the code below to export CloudWatch Logs to S3:
import boto3
import collections

region = 'us-east-1'

def lambda_handler(event, context):
    s3 = boto3.client('s3')
    response = s3.create_export_task(
        taskName='export_task',
        logGroupName='/aws/lambda/test2',
        logStreamNamePrefix='2016/11/29/',
        fromTime=1437584472382,
        to=1437584472402,
        destination='prudhvi1234',
        destinationPrefix='AWS'
    )
    print(response)
When I ran this, I got the following error:
'S3' object has no attribute 'create_export_task': AttributeError
Traceback (most recent call last):
File "/var/task/lambda_function.py", line 10, in lambda_handler
response = s3.create_export_task(
AttributeError: 'S3' object has no attribute 'create_export_task'
What might the mistake be?
client = boto3.client('logs')
create_export_task is an operation on the CloudWatch Logs client, not the S3 client; you are calling it on the wrong client, hence the error.
Subsequently:
response = client.create_export_task(
    taskName='export_task',
    logGroupName='/aws/lambda/test2',
    logStreamNamePrefix='2016/11/29/',
    fromTime=1437584472382,
    to=1437584472402,
    destination='prudhvi1234',
    destinationPrefix='AWS'
)
http://boto3.readthedocs.io/en/latest/reference/services/logs.html#CloudWatchLogs.Client.create_export_task
Check out this link for more information.
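As a side note, the fromTime and to parameters are epoch timestamps in milliseconds. A minimal sketch of computing an "everything older than two days" window, assuming the same log group and bucket as above (the cutoff logic is an assumption based on the question, not part of the answer):

import time

# fromTime/to are milliseconds since the epoch.
now_ms = int(time.time() * 1000)
two_days_ms = 2 * 24 * 60 * 60 * 1000

from_time = 0                    # beginning of time, or your last export point
to_time = now_ms - two_days_ms   # everything older than two days

response = client.create_export_task(
    taskName='export_task',
    logGroupName='/aws/lambda/test2',
    fromTime=from_time,
    to=to_time,
    destination='prudhvi1234',
    destinationPrefix='AWS'
)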

boto - more concise way to get key's value from bucket?

I'm trying to figure out a concise way to get data from S3 via boto.
My current code looks like this; s3_manager is simply a class that does all the S3 setup for my app:
log.debug("generating downloader")
downloader = s3_manager()
log.debug("accessing bucket")
bucket_archive = downloader.s3_buckets['#archive']
log.debug("getting key")
key = bucket_archive.get_key(archive_filename)
log.debug("getting key into string")
source = key.get_contents_as_string()
The problem is that, looking at my debug logs, I'm making two requests to Amazon S3:
key = bucket_archive.get_key(archive_filename)
source = key.get_contents_as_string()
Looking at the docs [http://boto.readthedocs.org/en/latest/ref/s3.html], it seems that the call to get_key checks whether the key exists, while the second call gets the actual data. Does anyone know of a method to do both at once? A more concise way of doing this with one request is preferable for our app.
The get_key() method performs a HEAD request on the object to verify that it exists. If you are certain that the bucket and key exist and would prefer not to have the overhead of a HEAD request, you can simply create a Key object directly. Something like this would work:
import boto
s3 = boto.connect_s3()
bucket = s3.get_bucket('mybucket', validate=False)
key = bucket.new_key('myexistingkey')
contents = key.get_contents_as_string()
The validate=False on the call to get_bucket eliminates a GET request that is otherwise made to validate that the bucket exists.