Should we handle a lambda container crash? - amazon-s3

I've been reading a lot about error handling for AWS Lambdas, but nothing covers the topic of a running Lambda container just crashing.
Is this a possibility? It seems like one. I'm building an event-driven system using Lambdas, triggered by file uploads to S3, and am uncertain whether I should bother building in logic to pick up processing if a Lambda has died.
e.g. a file object is created on S3 -> S3 notifies Lambda of the event -> the Lambda instance happens to crash before it can start processing -> the event is now gone forever* (an assumption here; I'm unsure if that's true, but can't find anything to say the contrary).
I'm debating building in logic to reconcile what is on S3 with what was processed each day, so I can detect the (albeit rare) scenario where a Lambda died (died and couldn't even write a failure to a DLQ) and we need to process those files. Is this worth it? Would S3 somehow know that the Lambda died and put the event on a DLQ of its own?
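For context, the kind of reconciliation job I have in mind would be something like the sketch below (boto3; the bucket name and the processed-keys store are placeholders, since how completed work is recorded is application-specific):

    import boto3

    s3 = boto3.client("s3")

    def find_unprocessed(bucket, prefix, processed_keys):
        # Compare what is actually in S3 against what we recorded as
        # processed. `processed_keys` is assumed to come from wherever the
        # pipeline tracks completed work (e.g. a DynamoDB table).
        missed = []
        paginator = s3.get_paginator("list_objects_v2")
        for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
            for obj in page.get("Contents", []):
                if obj["Key"] not in processed_keys:
                    missed.append(obj["Key"])
        return missed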

From https://docs.aws.amazon.com/fr_fr/lambda/latest/dg/with-s3.html, S3 event notifications invoke Lambda asynchronously.
And from https://docs.aws.amazon.com/lambda/latest/dg/invocation-retries.html, asynchronous Lambda invocations are retried twice, without any further queuing.
If more attempts than that are needed, it's better to set up SNS/SQS queuing.
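You can also make the failure path explicit by attaching an on-failure destination to the function's asynchronous invocation config. A minimal boto3 sketch (the function name and queue ARN are placeholders):

    import boto3

    lam = boto3.client("lambda")

    # Keep the default two retries, but capture anything that still fails
    # in an SQS queue so the event is not lost.
    lam.put_function_event_invoke_config(
        FunctionName="process-upload",  # placeholder
        MaximumRetryAttempts=2,
        DestinationConfig={
            "OnFailure": {
                "Destination": "arn:aws:sqs:us-east-1:123456789012:upload-failures"
            }
        },
    )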

Related

How to architect a solution where one Lambda writes to S3 on a timer and another Lambda reads from it on demand

We need to implement two Lambdas: one writes a file to S3 and the other reads that file. The "write" Lambda is on a timer, the "read" Lambda is on demand. I'm not sure about the best practices and options to synchronize these independent processes. Please advise on some options.
What you describe is directly supported by AWS, so the setup is simple:
Configure the first Lambda to be triggered on a schedule.
Configure the second Lambda to be triggered by S3 new object notifications.
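If it helps, here is roughly how both triggers could be wired up with boto3 (all names and ARNs are placeholders, and the Lambda resource-based permissions that allow EventBridge and S3 to invoke the functions are assumed to already exist):

    import boto3

    events = boto3.client("events")
    s3 = boto3.client("s3")

    # 1. Run the "write" Lambda on a schedule via an EventBridge rule.
    events.put_rule(Name="writer-schedule", ScheduleExpression="rate(1 hour)")
    events.put_targets(
        Rule="writer-schedule",
        Targets=[{
            "Id": "writer",
            "Arn": "arn:aws:lambda:us-east-1:123456789012:function:writer",
        }],
    )

    # 2. Invoke the "read" Lambda whenever a new object lands in the bucket.
    s3.put_bucket_notification_configuration(
        Bucket="my-shared-bucket",
        NotificationConfiguration={
            "LambdaFunctionConfigurations": [{
                "LambdaFunctionArn": "arn:aws:lambda:us-east-1:123456789012:function:reader",
                "Events": ["s3:ObjectCreated:*"],
            }]
        },
    )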

Load Control Function in AWS Step Function

An AWS Step Function state machine has a Lambda function at its core that does heavy writes to an S3 bucket. When the state machine gets a usage spike, the function starts failing because S3 throttles further requests (com.amazonaws.services.s3.model.AmazonS3Exception: Please reduce your request rate.). This obviously leads to failures of the state machine execution as a whole, and it takes the whole system some minutes to fully recover.
I looked into the AWS Lambda Function Scaling documentation and found that when we reduce the reserved concurrency setting, the function starts to return 429 status codes as soon as it can't handle new events.
So my idea for load-controlling the function execution can be summarized as follows:
Set the reserved concurrency to some lower value.
Catch the 429 errors in the Step Function and retry with a backoff rate.
I'd like to have feedback from you guys, on the following aspects:
a. Does my approach make sense, or am I missing some obviously better way? I first thought of managing the load with AWS SQS or some execution-wide locking/semaphore, but didn't get any further with that.
b. Is there maybe another way to tackle the issue from the S3 side?
This approach worked well for me:
    States:
      MyFunction:
        Type: Task
        End: true
        Resource: "..."
        Retry:
          - ErrorEquals:
              - TooManyRequestsException
            IntervalSeconds: 30
            MaxAttempts: 5
            BackoffRate: 2
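For the other half of the approach, the reserved concurrency cap can be set from the console or, as a sketch, with boto3 (the function name and the limit are placeholders):

    import boto3

    lam = boto3.client("lambda")

    # Cap concurrent executions so bursts surface as throttles (429s)
    # that the state machine's Retry block above can absorb.
    lam.put_function_concurrency(
        FunctionName="heavy-s3-writer",   # placeholder
        ReservedConcurrentExecutions=10,  # tune to your S3 request budget
    )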

AWS Lambda getting executed multiple times

I have implemented a simple Lambda function which gets triggered whenever an object is created on an S3 bucket.
Whenever an object is created on S3, the Lambda gets triggered. However, once the Lambda is triggered, it keeps executing at a certain interval even when nothing new is uploaded to the S3 bucket.
Any suggestions would be really helpful.
Your function is timing out because you aren't calling the callback or using the context.succeed() method. I believe the retry count for errors is two, with backoff; but for timeouts, S3 will keep retrying for a period of time that is not guaranteed but is usually quite long (a day?).

AWS: Broadcast notifications for multiple worker processes running on multiple instances

I have multiple application instances inside of Amazon EC2, each running several worker processes. What I want is for each worker process to be subscribed to some notification (e.g. a configuration change). This notification should basically be a broadcast message, so that once it is sent, every worker receives it.
I know SQS does not support message broadcast. Looking through similar questions/threads, I see suggestions to use SNS instead of SQS. I'm not sure this will work for me, for the following reasons:
Application instances are part of an autoscaling group, so they can be dynamically added and removed. I don't see any clear way to unsubscribe every worker (I have multiple workers per instance) once an instance gets terminated, which means I'll end up with a mess of dead subscribers after some time.
The protocol to use for the subscription is also not clear. An HTTP endpoint looks like the only option, which means every worker would have to run an HTTP server on its own port. It also looks like I would have to listen on the instance's public IP, which adds one more layer of complexity and insecurity.
At the moment I have a solution based on a third party - I'm using a 0MQ pub/sub server. But I'm looking for an out-of-the-box solution that AWS provides.
Thanks,
Vovan
The out-of-the-box AWS solution that comes to mind would be to create one SNS topic, and then for each instance, when the instance boots up, it would create its own SQS queue and subscribe the queue to the SNS topic, so that each individual queue gets a broadcast copy of each message you publish to SNS.
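A boot-time sketch of that pattern with boto3 (the topic ARN and queue naming are placeholders):

    import json

    import boto3

    sns = boto3.client("sns")
    sqs = boto3.client("sqs")

    TOPIC_ARN = "arn:aws:sns:us-east-1:123456789012:broadcasts"  # placeholder

    def subscribe_this_instance(instance_id):
        # One queue per instance; every queue gets a copy of each broadcast.
        queue_url = sqs.create_queue(QueueName=f"broadcast-{instance_id}")["QueueUrl"]
        queue_arn = sqs.get_queue_attributes(
            QueueUrl=queue_url, AttributeNames=["QueueArn"]
        )["Attributes"]["QueueArn"]

        # Allow this SNS topic (and only it) to send to the new queue.
        policy = {
            "Version": "2012-10-17",
            "Statement": [{
                "Effect": "Allow",
                "Principal": {"Service": "sns.amazonaws.com"},
                "Action": "sqs:SendMessage",
                "Resource": queue_arn,
                "Condition": {"ArnEquals": {"aws:SourceArn": TOPIC_ARN}},
            }],
        }
        sqs.set_queue_attributes(
            QueueUrl=queue_url, Attributes={"Policy": json.dumps(policy)}
        )

        sns.subscribe(TopicArn=TOPIC_ARN, Protocol="sqs", Endpoint=queue_arn)
        return queue_url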
You'd want to unsubscribe and delete these queues on instance termination, which could be done with lifecycle hooks.
If you didn't want to run a server to process the lifecycle hooks (which publish launch and termination events to SNS or SQS), you could create an Amazon API Gateway endpoint that fires an AWS Lambda function, subscribe that endpoint to the SNS topic over HTTPS, and handle the cleanup tasks in Lambda with no server needed.
That's several services working together and may sound a little complicated, but would be very inexpensive and require little maintenance or attention.
One more solution I've figured out is to use Amazon Kinesis. The implication here is that each subscriber has to maintain its own checkpoint to receive only the most recent notifications.
I realize this is an old thread, but I'd like to share my experience with this. Kinesis has a throttle of 5 reads per second per shard. So if you have 10 nodes each polling the stream once per second, you're going to be in a constant state of throttling.
Kinesis looks to be primarily for massive writes with just a few readers, which doesn't quite fit a broadcast-to-many-nodes use case.
Redis is a handy solution for broadcasting a message to all subscribers on a topic. It is convenient because it can be run as a Docker container for rapid prototyping, but it is also offered by AWS as a managed service (ElastiCache) for multi-node clusters.
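For illustration, the Redis pattern is only a few lines with the redis-py client (the host and channel name are placeholders):

    import redis

    r = redis.Redis(host="localhost", port=6379)  # or your ElastiCache endpoint

    # Worker side: block on the channel and react to each broadcast.
    pubsub = r.pubsub()
    pubsub.subscribe("config-changes")
    for message in pubsub.listen():
        if message["type"] == "message":
            print("received:", message["data"])

    # Publisher side (from another process):
    #     r.publish("config-changes", "reload")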

How can I get notification about new S3 objects?

I have a scenario where we have many clients uploading to S3.
What is the best approach to knowing that there is a new file?
Is it realistic/a good idea for me to poll the bucket every few seconds?
UPDATE:
Since November 2014, S3 supports the following event notifications:
s3:ObjectCreated:Put – An object was created by an HTTP PUT operation.
s3:ObjectCreated:Post – An object was created by HTTP POST operation.
s3:ObjectCreated:Copy – An object was created by an S3 copy operation.
s3:ObjectCreated:CompleteMultipartUpload – An object was created by the completion of an S3 multipart upload.
s3:ObjectCreated:* – An object was created by one of the event types listed above or by a similar object creation event added in the future.
s3:ReducedRedundancyObjectLost – An S3 object stored with Reduced Redundancy has been lost.
These notifications can be issued to Amazon SNS, SQS or Lambda. Check out the blog post that's linked in Alan's answer for more information on these new notifications.
Original Answer:
Although Amazon S3 has a bucket notification system in place, it does not support notifications for anything but the s3:ReducedRedundancyLostObject event (see the GET Bucket notification section in their API).
Currently the only way to check for new objects is to poll the bucket at a preset time interval or to build your own notification logic into the upload clients (possibly based on Amazon SNS).
Push notifications are now built into S3:
http://aws.amazon.com/blogs/aws/s3-event-notification/
You can send notifications to SQS or SNS when an object is created via PUT or POST or a multi-part upload is finished.
Your best option nowadays is to use the AWS Lambda service. You can write a Lambda function in Node.js (JavaScript), Java, or Python (more options will probably be added over time).
The Lambda service allows you to write functions that respond to events from S3, such as a file upload. It is cost-effective, scalable, and easy to use.
You can implement a pub-sub mechanism relatively simply by using SNS, SQS, and AWS Lambda; see the steps below. Whenever a new file is added to the bucket, a notification is raised and acted on automatically.
Step 1
Simply configure the S3 bucket event notification to notify an SNS topic. You can do this from the S3 console (Properties tab)
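As a boto3 sketch (the bucket name and topic ARN are placeholders; the topic's access policy must already allow S3 to publish to it):

    import boto3

    s3 = boto3.client("s3")

    # Publish an SNS notification for every object created in the bucket.
    s3.put_bucket_notification_configuration(
        Bucket="my-upload-bucket",
        NotificationConfiguration={
            "TopicConfigurations": [{
                "TopicArn": "arn:aws:sns:us-east-1:123456789012:new-uploads",
                "Events": ["s3:ObjectCreated:*"],
            }]
        },
    )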
Step 2
Make an SQS queue subscribed to this topic, so that whenever an object is uploaded to the S3 bucket, a message is added to the queue.
Step 3
Create an AWS Lambda function to read messages from the SQS queue. AWS Lambda supports SQS events as a trigger, so whenever a message appears in the queue, Lambda will fire and read it. Once a message is successfully processed it is automatically deleted from the queue; messages that can't be processed (erroneous messages) are not deleted and will pile up. To prevent this behavior, using a Dead Letter Queue (DLQ) is a good idea.
In your Lambda function, add your logic to handle what to do when users upload files to the bucket
Note: a DLQ is nothing more than a normal queue.
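A minimal Python handler for this step might look like the following; it assumes raw message delivery is disabled on the SNS subscription (the default), so the S3 event arrives doubly wrapped in JSON, and process_upload is a placeholder for your own logic:

    import json

    def handler(event, context):
        # Lambda's SQS trigger delivers a batch of queue messages.
        for record in event["Records"]:
            sns_envelope = json.loads(record["body"])       # SQS body = SNS message
            s3_event = json.loads(sns_envelope["Message"])  # SNS message = S3 event
            for s3_record in s3_event.get("Records", []):
                bucket = s3_record["s3"]["bucket"]["name"]
                key = s3_record["s3"]["object"]["key"]
                process_upload(bucket, key)
        # Returning normally lets Lambda delete the batch from the queue;
        # raising an exception leaves failed messages for retry / the DLQ.

    def process_upload(bucket, key):
        print(f"processing s3://{bucket}/{key}")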
Step 4
Debugging and analyzing the process
Make use of Amazon CloudWatch to log details. Each Lambda function writes its logs to its own log group. This is a good place to check if something went wrong.