How can AWS Lambda pick the latest version of the script from S3 - amazon-s3

I have an S3 bucket that we use as a code repository to store our Lambda code, which is then read by Lambda.
The S3 bucket is versioned, so every time we upload the script again (after altering the code) a new version of the zip file is created for the existing file.
Now I want Lambda to automatically pick up the latest version of the zip file instead of me changing it manually in the CloudFormation template and running it, OR attaching it manually to the Lambda every time.

I was able to resolve the issue, so I just wanted to post the solution for reference.
I followed the steps below:
Make sure that the name of the Lambda function and the name of the zip file (deployment package) are exactly the same.
Create a Lambda function that is triggered when you upload any new code to your S3 bucket.
Process the event information and use the S3 API to fetch the latest version of the file from S3.
Use the boto3 API to reconfigure your final Lambda:
import boto3

client = boto3.client("s3")
lambda_client = boto3.client("lambda")

def lambda_handler(event, context):
    bucket = event["Records"][0]["s3"]["bucket"]["name"]
    key = event["Records"][0]["s3"]["object"]["key"]
    # Fetch the object to get the latest version of the uploaded code
    get_version = client.get_object(Bucket=bucket, Key=key)
    version_id = get_version["VersionId"]
    # The function name is derived from the zip file name (they must match)
    lambda_client.update_function_code(
        FunctionName=key.split("/")[-1].split(".")[0],
        S3Bucket=bucket,
        S3Key=key,
        S3ObjectVersion=version_id
    )

If you want to deploy a new version of a Lambda function's code automatically as it is uploaded to the S3 bucket, then you can use S3 Event Notifications to e.g. notify an SNS topic and subscribe another Lambda function which performs the deployment (for example via CloudFormation or the AWS SDK).
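The notification wiring itself can be set up in the console, in CloudFormation, or with boto3. As a rough sketch of the boto3 route (the bucket name and function ARN below are placeholders, and the event is delivered straight to the deployer Lambda rather than through SNS):

import boto3

s3 = boto3.client("s3")

# Placeholder bucket and function ARN, purely for illustration
s3.put_bucket_notification_configuration(
    Bucket="my-code-bucket",
    NotificationConfiguration={
        "LambdaFunctionConfigurations": [
            {
                "LambdaFunctionArn": "arn:aws:lambda:us-east-1:123456789012:function:deployer",
                "Events": ["s3:ObjectCreated:*"],
                # Only react to uploaded deployment packages
                "Filter": {"Key": {"FilterRules": [{"Name": "suffix", "Value": ".zip"}]}}
            }
        ]
    }
)
# Note: S3 must also be granted permission to invoke the function
# (lambda add-permission / AWS::Lambda::Permission), which is omitted here.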

Related

AWS SageMaker Notebook's Default S3 Bucket - Can't Access Uploaded Files within Notebook

In SageMaker Studio, I created directories and uploaded files to my SageMaker's default S3 bucket using the GUI, and was exploring how to work with those uploaded files using a SageMaker Studio Notebook.
Within the SageMaker Studio Notebook, I ran
import boto3
import sagemaker

sess = sagemaker.Session()
bucket = sess.default_bucket()  # sagemaker-abcdef
prefix = "folderJustBelowRoot"

conn = boto3.client('s3')
conn.list_objects(Bucket=bucket, Prefix=prefix)
# This returns a response dictionary whose metadata includes 'HTTPStatusCode': 200 and
# 'server': 'AmazonS3', which means the request-response was successful
What I don't understand is why the 'Contents' key and its value are missing from the 'conn.list_objects' dictionary response?
And when I go to 'my SageMaker's default bucket' in the S3 console, I am wondering why my uploaded files are not appearing.
===============================================================
I was expecting
the response from conn.list_objects(Bucket=bucket, Prefix=prefix) to contain the 'Contents' key (within my SageMaker Studio Notebook)
the S3 console to show the files I uploaded to 'my SageMaker's default bucket'
Question 2: And when I go to 'my SageMaker's default bucket' in the S3 console, I am wondering why my uploaded files are not appearing.
It seems that when you upload files from your local desktop/laptop onto AWS SageMaker Studio using the GUI, your files end up in the Elastic Block Storage (EBS) of your SageMaker Studio instance.
To access the following items within your SageMaker Studio instance:
Folder Path - "subFolderLayer1/subFolderLayer2/subFolderLayer3" => to access 'subFolderLayer3'
File Path - "subFolderLayer1/subFolderLayer2/subFolderLayer3/fileName.extension" => to access 'fileName.extension' within your subFolderLayers
=========
To access the files on the default S3 storage bucket for your AWS SageMaker instance, first identify it by
sess = sagemaker.Session()
bucket = sess.default_bucket() #sagemaker-abcdef
Then go to the bucket and upload your files and folders. When you have done that, move to the response for question 1.
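Alternatively, files can be pushed into that bucket straight from the notebook with boto3 (the local file name and prefix below are placeholders):

import boto3
import sagemaker

bucket = sagemaker.Session().default_bucket()  # sagemaker-abcdef
s3 = boto3.client("s3")
# Upload a local notebook file into the default bucket under a chosen prefix
s3.upload_file("train.csv", bucket, "folderJustBelowRoot/train.csv")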
=================================================================
Question 1: What I don't understand is why the 'Contents' key and its value are missing from the 'conn.list_objects' dictionary response?
prefix = "folderJustBelowYourBucket"
conn = boto3.client('s3')
conn.list_objects(Bucket=bucket, Prefix=prefix)
The 'conn.list_objects' dictionary response now contains a 'Contents' key, containing a list of metadata as its values - 1 metadata dictionary for each file/folder within that 'prefix'/'folderJustBelowYourBucket'.
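As a small sketch of what to expect (the prefix is the same placeholder as above), the 'Contents' key only appears once at least one object exists under that prefix:

import boto3
import sagemaker

bucket = sagemaker.Session().default_bucket()
conn = boto3.client("s3")
response = conn.list_objects(Bucket=bucket, Prefix="folderJustBelowYourBucket")
# 'Contents' is absent when no objects match the prefix, so use .get() to be safe
for obj in response.get("Contents", []):
    print(obj["Key"], obj["Size"], obj["LastModified"])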
You can upload and download files between Amazon SageMaker and Amazon S3 using the SageMaker Python SDK. The SageMaker S3 utilities provide S3Uploader and S3Downloader classes to easily work with S3 from within SageMaker Studio notebooks.
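A minimal sketch of those utilities (the local paths and S3 URI are placeholders):

from sagemaker.s3 import S3Uploader, S3Downloader

# Upload a local file to an S3 prefix and get back its full S3 URI
s3_uri = S3Uploader.upload("train.csv", "s3://sagemaker-abcdef/folderJustBelowRoot")

# Download the uploaded object back into a local directory
S3Downloader.download(s3_uri, "downloaded/")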
A comment about the 'file system' in your question 2: the files are stored on the SageMaker Studio user profile's Amazon Elastic File System (Amazon EFS) volume, not EBS (SageMaker classic notebooks use EBS volumes). Refer to this blog for a more detailed overview of the SageMaker architecture.

How to dynamically change the "S3 object key" in AWS CodePipeline when source is S3 bucket

I am trying to use an S3 bucket as the source for CodePipeline. We want to save a source code version like "1.0.1" or "1.0.2" in the S3 bucket each time we trigger the Jenkins pipeline, dynamically. But since the "S3 object key" is not dynamic, we cannot build the artifact based on version numbers that Jenkins generates dynamically. Is there a way to make the "S3 object key" dynamic and take its value from the Jenkins pipeline when CodePipeline is triggered?
Not possible natively, but you can do it by writing your own Lambda function. It requires Lambda because CodePipeline restricts you to specifying a fixed object key name while setting up the pipeline.
So, let's say you have 2 pipelines, CircleCI (CCI) & CodePipeline (CP). CCI generates some files and pushes them to your S3 bucket (S3-A). Now, you want CP to pick up the latest zip file as a source. But since the latest zip file will have a different name each time (1.0.1 or 1.0.2), you can't do that dynamically.
So, on that S3 bucket (S3-A), you can have an S3 event notification trigger enabled with your custom Lambda function. Whenever any new object gets uploaded to that S3 bucket (S3-A), your Lambda function will be triggered; it'll fetch the latest uploaded object from S3-A, zip/unzip that object if needed, and push it to another S3 bucket (S3-B) with some fixed name like file.zip, which you'll configure as the source of your CP. As there's a new object with the key file.zip in your S3 bucket (S3-B), your CP will be triggered automatically.
PS: You'll have to write your own Lambda function such that it does all those operations: zipping/unzipping the newly uploaded object in S3-A, uploading it to S3-B, etc.
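A rough sketch of such a function, assuming the uploaded artifact is already a zip so a plain server-side copy suffices (the destination bucket name is a placeholder; file.zip is the fixed key from the description above):

import boto3

s3 = boto3.client("s3")

DEST_BUCKET = "s3-b-pipeline-source"  # placeholder for S3-B, the CodePipeline source bucket
DEST_KEY = "file.zip"                 # fixed key that CodePipeline is configured to watch

def lambda_handler(event, context):
    # The S3 event notification carries the bucket and key of the new upload in S3-A
    record = event["Records"][0]["s3"]
    src_bucket = record["bucket"]["name"]
    src_key = record["object"]["key"]
    # Copy the newly uploaded artifact to the fixed key so CodePipeline picks it up
    s3.copy_object(
        Bucket=DEST_BUCKET,
        Key=DEST_KEY,
        CopySource={"Bucket": src_bucket, "Key": src_key}
    )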

How to create a log and export to S3 bucket by executing a Python Lambda function

From my Lambda Python code, I'm trying to write a log to my S3 bucket (s3://my_bucket/logs/) and it throws an error. It works fine when I execute it outside Lambda and generates the logs.
The error is below:
[Errno 2] No such file or directory: '/var/task/s3:/my_bucket/logs/error.log': FileNotFoundError
When this line of code is encountered in my local environment, it creates the log properly:
LOGFILE_PATH = "D:\logs\error.log"
But when I tried it in Lambda after updating it to:
LOGFILE_PATH = "s3://my_bucket/logs/"
it throws the error above when executed by the Lambda function.
It should generate the log in my S3 bucket, but it's not creating it. Can we even write logs to S3 from a Lambda execution?
Thanks.
The notation you've used, s3://my_bucket/logs/, is not a real file path. It's a kind of shorthand, mostly used by the AWS CLI s3 commands, and it does not work the way a URL or file system path does. If you want to write to a bucket (instead of a local file) from a Python Lambda, you should probably use boto3 and its S3 client to store the file. It also depends on what exactly your code does with the LOGFILE_PATH variable.
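A minimal sketch of that approach, assuming the bucket name from the question and a hypothetical log key (Lambda can only write to /tmp, so the log is written there first and then uploaded):

import logging
import boto3

LOCAL_LOGFILE = "/tmp/error.log"   # /tmp is the only writable path inside Lambda
BUCKET = "my_bucket"               # bucket name from the question
S3_KEY = "logs/error.log"          # hypothetical object key for the log

logging.basicConfig(filename=LOCAL_LOGFILE, level=logging.ERROR)
s3 = boto3.client("s3")

def lambda_handler(event, context):
    logging.error("example log entry from Lambda")  # whatever the real code logs
    # Ship the local log file to S3 at the end of the invocation
    s3.upload_file(LOCAL_LOGFILE, BUCKET, S3_KEY)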

How to trigger Update of an AWS Cloudformation stack when an object is uploaded to an S3 bucket?

We have a list of AWS Lambda functions that are deployed using AWS CloudFormation and their code is placed in an Amazon S3 bucket.
We update the Lambda code by uploading the latest code to S3 and running the update-stack command, which has an S3 object version parameter. So when an object version change is detected (i.e. new Lambda code is uploaded to S3), we run the update-stack command and CloudFormation deploys the new code.
I was thinking of automating this process by triggering the stack update when an object is uploaded to S3. Can this be done?
You can use an Amazon S3 Event to trigger an AWS Lambda function when an object is uploaded to Amazon S3.
See: Using AWS Lambda with Amazon S3
This Lambda function could then call the CloudFormation UpdateStack() function to update the stack. You can use your choice of language in the Lambda function.
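A rough sketch of such a handler with boto3 (the stack name and parameter key are placeholders; it assumes the template takes the new S3 object version as a parameter and otherwise reuses the previous template):

import boto3

s3 = boto3.client("s3")
cfn = boto3.client("cloudformation")

STACK_NAME = "my-lambda-stack"       # placeholder stack name
VERSION_PARAM = "LambdaCodeVersion"  # placeholder template parameter

def lambda_handler(event, context):
    record = event["Records"][0]["s3"]
    bucket = record["bucket"]["name"]
    key = record["object"]["key"]
    # Look up the version id of the freshly uploaded object
    version_id = s3.head_object(Bucket=bucket, Key=key)["VersionId"]
    # Update the stack, passing the new object version as a parameter
    cfn.update_stack(
        StackName=STACK_NAME,
        UsePreviousTemplate=True,
        Parameters=[{"ParameterKey": VERSION_PARAM, "ParameterValue": version_id}],
        Capabilities=["CAPABILITY_NAMED_IAM"]
    )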

Apache Airflow: operator to copy s3 to s3

What is the best operator to copy a file from one S3 bucket to another S3 bucket in Airflow?
I tried S3FileTransformOperator already, but it requires either a transform_script or a select_expression. My requirement is to copy the exact file from source to destination.
Use S3CopyObjectOperator:
# Import path in recent Amazon provider versions; it may differ in older ones
from airflow.providers.amazon.aws.operators.s3 import S3CopyObjectOperator

copy_step = S3CopyObjectOperator(
    task_id='copy_step',
    source_bucket_key='source_file',
    dest_bucket_key='dest_file',
    aws_conn_id='aws_connection_id',
    source_bucket_name='source-bucket',
    dest_bucket_name='dest-bucket'
)
You have 2 options (even when I disregard Airflow):
Use the AWS CLI cp command:
aws s3 cp <source> <destination>
In Airflow this command can be run using the BashOperator (local machine) or the SSHOperator (remote machine).
Use the AWS SDK, aka boto3.
Here you'll be using boto3's S3 client.
Airflow already provides a wrapper over it in the form of S3Hook.
Even the copy_object(..) method of the S3 client is available in S3Hook as (again) copy_object(..).
You can use S3Hook inside any suitable custom operator or just a PythonOperator, as in the sketch below.
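A minimal sketch of the S3Hook route (the connection id, bucket names and keys below are placeholders):

from airflow.operators.python import PythonOperator
from airflow.providers.amazon.aws.hooks.s3 import S3Hook

def copy_exact_file():
    hook = S3Hook(aws_conn_id='aws_connection_id')
    # Server-side copy of a single object, with no transformation involved
    hook.copy_object(
        source_bucket_key='path/in/source/file.csv',
        dest_bucket_key='path/in/dest/file.csv',
        source_bucket_name='source-bucket',
        dest_bucket_name='dest-bucket'
    )

copy_task = PythonOperator(task_id='copy_s3_to_s3', python_callable=copy_exact_file)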