Is it reasonable to be concerned an SES object won't be available in S3? - amazon-s3

I've set up an SES rule in the following way:
Actions:
1) S3: Saves SES object to an S3 bucket
2) Lambda: Triggers my lambda function for email processing
In my testing, I've always been able to retrieve the SES object from the bucket using the message ID in the very first line of code, and I can then parse and read it without issue.
My question is: is it reasonable to be concerned that the SES object may not always be immediately available? I'm considering adding error handling in case the object isn't there; basically, wait half a second and try again until the Lambda times out. But I don't want to complicate the code if this is not a reasonable concern, is already handled by boto3, etc. Thoughts?
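If I do add that safety net, this is roughly what I have in mind: a minimal sketch using boto3's built-in object_exists waiter (the bucket and key names are placeholders, and the timings are just a guess):
import boto3

s3 = boto3.client("s3")

def fetch_ses_object(bucket, key):
    # Poll HeadObject until the object shows up; raises WaiterError if it never does.
    waiter = s3.get_waiter("object_exists")
    waiter.wait(
        Bucket=bucket,
        Key=key,
        WaiterConfig={"Delay": 1, "MaxAttempts": 10},  # ~10 s total, well under my Lambda timeout
    )
    return s3.get_object(Bucket=bucket, Key=key)["Body"].read()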

In your case, it is best to use only one S3 action, configured with a notification to an SNS topic, and have your Lambda subscribe to that topic.
Your Lambda will receive an SNS event containing the stringified SES event in its Message field:
{
"Records": [
{
"EventSource": "aws:sns",
"EventVersion": "1.0",
...
"Sns": {
"Type": "Notification",
"MessageId": "...",
"TopicArn": "...",
"Subject": "Amazon SES Email Receipt Notification",
"Message": "<STRINGIFIED SES EVENT>",
...
}
}
]
}
If you parse the Message, you will get something like this:
{
"notificationType": "Received",
"mail": {
"timestamp": "...",
"source": "...",
"messageId": "...",
"destination": [
...
],
"headersTruncated": false,
"headers": [
...
],
"commonHeaders": {
"returnPath": "...",
"from": [
"..."
],
"date": "...",
"to": [
...
],
"messageId": "...",
"subject": "..."
}
},
"receipt": {
...
"action": {
"type": "S3",
"topicArn": "...",
"bucketName": "<YOUR_BUCKET>",
"objectKey": "<YOUR_OBJECT_KEY>"
}
}
}
where you will find the exact reference to the uploaded object in your bucket (receipt.action.bucketName and receipt.action.objectKey).
With this setup, it is reasonable to assume that the object is already available when your Lambda is triggered.
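A minimal Python sketch of such a subscribed Lambda, assuming the event shape shown above (the handler name and the processing step are up to you):
import json
import boto3

s3 = boto3.client("s3")

def handler(event, context):
    # The SNS message body is the stringified SES event shown above.
    ses_event = json.loads(event["Records"][0]["Sns"]["Message"])
    # The receipt action tells you exactly where the S3 action stored the email.
    action = ses_event["receipt"]["action"]
    obj = s3.get_object(Bucket=action["bucketName"], Key=action["objectKey"])
    raw_email = obj["Body"].read()
    # ... parse raw_email here ...
    return {"messageId": ses_event["mail"]["messageId"]}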

Related

How do I use ContentMD5 Properly in Step Function Definition when using PutObject on an ObjectLock Enabled S3 Bucket

Currently, I have written a Step Functions definition that uses the PutObject SDK integration against an S3 bucket that has Object Lock enabled. Because Object Lock is enabled on my bucket, I need to pass ContentMD5 from the definition. This is the definition I am currently using:
{
"Comment": "PutObject to S3",
"StartAt": "PutObject",
"States": {
"PutObject": {
"Type": "Task",
"End": true,
"Parameters": {
"Body": "test data",
"Bucket": "worm-bucket-test",
"Key": "logs.txt",
"ContentMD5": "States.Base64Encode(States.Hash('test data', 'MD5'))" },
"Resource": "arn:aws:states:::aws-sdk:s3:putObject",
"Resource": "arn:aws:states:::aws-sdk:s3:putObject",
"Catch": [ {
"ErrorEquals": [ "States.TaskFailed" ],
"Next": "Wait1Sec"
} ]
},
"Wait1Sec": {
"Type": "Wait",
"Seconds": 1,
"Next": "PutObject"
}
}
}
Unfortunately, I continue to receive the following error:
{
"Error": "S3.S3Exception",
"Cause": "The Content-MD5 you specified was invalid. (Service: S3, Status Code: 400, Request ID: xxx, Extended Request ID: xxx)"
}
I am able to create a Lambda function and write code that handles the Content-MD5 computation for S3, but my goal is to have this same functionality from Step Functions to S3 directly, without having to use a Lambda function. Any help will be much appreciated.
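For what it's worth, my understanding is that S3 expects Content-MD5 to be the base64 encoding of the raw 16-byte MD5 digest of the body, not of its hex representation; this small Python snippet shows the kind of value I believe it wants for the body above:
import base64
import hashlib

body = b"test data"
# Base64 of the raw digest bytes, not of hashlib's hexdigest() string.
content_md5 = base64.b64encode(hashlib.md5(body).digest()).decode("ascii")
print(content_md5)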

How to pass AWS Lambda error in AWS SNS notification through AWS Step Functions?

I have created an AWS Step Function which triggers a Lambda Python function; it terminates without error if the Lambda succeeds, otherwise it calls an SNS topic to message the subscribed users. It is running, but the message is fixed (hard-coded). The Step Function JSON is as follows:
{
"StartAt": "Lambda Trigger",
"States": {
"Lambda Trigger": {
"Type": "Task",
"Resource": "arn:aws:lambda:us-east-2:xxxxxxxxxxxx:function:helloworldTest",
"End": true,
"Catch": [
{
"ErrorEquals": [
"States.ALL"
],
"ResultPath": "$.error",
"Next": "Notify Failure"
}
]
},
"Notify Failure": {
"Type": "Task",
"Resource": "arn:aws:states:::sns:publish",
"Parameters": {
"Message": "Batch job submitted through Step Functions failed with the following error, $.error",
"TopicArn": "arn:aws:sns:us-east-2:xxxxxxxxxxxx:lambda-execution-failure"
},
"End": true
}
}
}
The only thing is, I want to append the failure error message to my message string, which I tried, but it is not working as expected; the mail I receive does not contain the actual error details. How do I go about it?
I could solve the problem using "Error.$": "$.Cause".
The following is a working example of the failure portion of state machine:
"Job Failure": {
"Type": "Task",
"Resource": "arn:aws:states:::sns:publish",
"Parameters": {
"Subject": "Lambda Job Failed",
"Message": {
"Alarm": "Lambda Job Failed",
"Error.$": "$.Cause"
},
"TopicArn": "arn:aws:sns:us-east-2:xxxxxxxxxxxx:Job-Run-Notification"
},
"End": true
}
Hope this helps!
Here is the full version of the code
{
"Comment": "A Hello World example of the Amazon States Language using an AWS Lambda function",
"StartAt": "HelloWorld",
"States": {
"HelloWorld": {
"Type": "Task",
"Resource": "arn:aws:lambda:us-east-1:XXXXXXXXXXXXX:function:StepFunctionTest",
"End": true,
"Catch": [
{
"ErrorEquals": [
"States.ALL"
],
"Next": "NotifyFailure"
}
]
},
"NotifyFailure": {
"Type": "Task",
"Resource": "arn:aws:states:::sns:publish",
"Parameters": {
"Subject": "[ERROR]: Task failed",
"Message": {
"Alarm": "Batch job submitted through Step Functions failed with the following error",
"Error.$": "$.Cause"
},
"TopicArn": "arn:aws:sns:us-east-1:XXXXXXXXXXXXX:Notificaiton"
},
"End": true
}
}
}
This line is already appending the exception object to the 'error' path:
"ResultPath": "$.error"
We just need to pass '$' as Message.$ in the SNS task; both the input and the error details will then be sent to SNS:
{
"TopicArn":"${SnsTopic}",
"Message.$":"$"
}
If we don't want the Lambda's input to be appended to the email, we can omit ResultPath or set it to just '$'; the error output then replaces the input object:
"ResultPath": "$"

Can step functions wait on a static website?

If I deploy a static website with S3 and API Gateway, is there any way for a step function to wait for some activity, then redirect the user on that static website to another page?
WeCanBeFriends,
This is possible using the Job Status Poller pattern, but tweaked slightly. If the "job" is to deploy the website, then the condition to "complete" the job is to see some activity come in (ideally through CloudWatch metrics).
Once you see enough metrics to be comfortable with your deployment, you can either push a notification to the webapp to tell it to redirect (using a Lambda function that calls SNS, as in the wait-timer sample) or have the webapp poll the execution status until it's complete (see the polling sketch after the state machine below).
Below I've posted a very simple variation to the Job Status Poller to illustrate my example:
{
"Comment": "A state machine that publishes to SNS after a deployment completes.",
"StartAt": "StartDeployment",
"States": {
"StartDeployment": {
"Type": "Task",
"Resource": "arn:aws:lambda:us-east-1:012345678912:function:KickOffDeployment",
"ResultPath": "$.guid",
"Next": "CheckIfDeploymentComplete"
},
"CheckIfDeploymentComplete": {
"Type": "Task",
"Resource": "arn:aws:lambda:us-east-1:012345678912:function:CheckIfDeploymentComplete",
"Next": "TriggerWebAppRefresh",
"InputPath": "$.guid",
"ResultPath": "$.status",
"Retry": [ {
"ErrorEquals": [ "INPROGRESS" ],
"IntervalSeconds": 5,
"MaxAttempts": 240,
"BackoffRate": 1.0
} ],
"Catch": [ {
"ErrorEquals": ["FAILED"],
"Next": "DeploymentFailed"
}]
},
"DeploymentFailed": {
"Type": "Fail",
"Cause": "Deployment failed",
"Error": "Deployment FAILED"
},
"TriggerWebAppRefresh": {
"Type": "Task",
"Resource": "arn:aws:lambda:us-east-1:012345678912:function:SendSNSToWebapp",
"InputPath": "$.guid",
"End": true
}
}
}
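If you go the polling route, a minimal boto3 sketch of checking the execution status from a backend might look like this (the execution ARN is a placeholder):
import time
import boto3

sfn = boto3.client("stepfunctions")

def wait_for_execution(execution_arn, poll_seconds=5):
    # Poll DescribeExecution until the state machine leaves the RUNNING state.
    while True:
        status = sfn.describe_execution(executionArn=execution_arn)["status"]
        if status != "RUNNING":
            return status  # SUCCEEDED, FAILED, TIMED_OUT, or ABORTED
        time.sleep(poll_seconds)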

AWS SES S3 process inbound email

I'm working on a publish-by-email system based on AWS SES. For all incoming emails I've set up routing to save messages in an S3 bucket so I can process them asynchronously. The problem I have is that the messages are saved in the S3 bucket in raw format: headers, email body, etc., plus the encoded attachment (a huge string), all in a single file.
Is there a way to break the email message apart from the attachment and save both in separate files at the AWS SES level? I'm trying to get the data in the format I need straight from AWS and avoid adding another processing step.
If AWS SES doesn't provide such a feature, what would be the proper way to process these messages to obtain the result described above?
It doesn't look possible to have SES automatically split up the email for you. As per the documentation here:
Amazon SES provides you the raw, unmodified email, which is typically in Multipurpose Internet Mail Extensions (MIME) format.
I would use S3 or SNS to trigger a Lambda function whenever SES puts a new email file to S3. The Lambda function could split the file however you wish, then write those new files to another S3 bucket.
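A rough Python sketch of such a Lambda, using the standard email module (the output bucket and key layout are just assumptions, and it assumes a plain S3 event notification rather than an SNS-wrapped one):
import email
from email import policy

import boto3

s3 = boto3.client("s3")
OUTPUT_BUCKET = "my-processed-mail"  # assumption: a second bucket for the split parts

def handler(event, context):
    record = event["Records"][0]["s3"]
    bucket = record["bucket"]["name"]
    key = record["object"]["key"]

    raw = s3.get_object(Bucket=bucket, Key=key)["Body"].read()
    msg = email.message_from_bytes(raw, policy=policy.default)

    # Write the message body as one object...
    body = msg.get_body(preferencelist=("plain", "html"))
    if body is not None:
        s3.put_object(
            Bucket=OUTPUT_BUCKET,
            Key=key + "/body.txt",
            Body=body.get_content().encode("utf-8"),
        )

    # ...and each attachment as its own object, decoded from its transfer encoding.
    for i, part in enumerate(msg.iter_attachments()):
        name = part.get_filename() or "attachment-{}".format(i)
        s3.put_object(
            Bucket=OUTPUT_BUCKET,
            Key=key + "/attachments/" + name,
            Body=part.get_payload(decode=True),
        )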
For anyone coming back later on to this question, this is the link to the JSON structure that you get when you invoke a Lambda function from SES.
http://docs.aws.amazon.com/ses/latest/DeveloperGuide/receiving-email-notifications-examples.html
It took some searching to arrive at that page ;-)
From the link, a Lambda notification would look like this:
{
"notificationType": "Received",
"receipt": {
"timestamp": "2015-09-11T20:32:33.936Z",
"processingTimeMillis": 406,
"recipients": [
"recipient#example.com"
],
"spamVerdict": {
"status": "PASS"
},
"virusVerdict": {
"status": "PASS"
},
"spfVerdict": {
"status": "PASS"
},
"dkimVerdict": {
"status": "PASS"
},
"action": {
"type": "S3",
"topicArn": "arn:aws:sns:us-east-1:012345678912:example-topic",
"bucketName": "my-S3-bucket",
"objectKey": "\email"
}
},
"mail": {
"timestamp": "2015-09-11T20:32:33.936Z",
"source": "0000014fbe1c09cf-7cb9f704-7531-4e53-89a1-5fa9744f5eb6-000000#amazonses.com",
"messageId": "d6iitobk75ur44p8kdnnp7g2n800",
"destination": [
"recipient#example.com"
],
"headersTruncated": false,
"headers": [
{
"name": "Return-Path",
"value": "<0000014fbe1c09cf-7cb9f704-7531-4e53-89a1-5fa9744f5eb6-000000#amazonses.com>"
},
{
"name": "Received",
"value": "from a9-183.smtp-out.amazonses.com (a9-183.smtp-out.amazonses.com [54.240.9.183]) by inbound-smtp.us-east-1.amazonaws.com with SMTP id d6iitobk75ur44p8kdnnp7g2n800 for recipient#example.com; Fri, 11 Sep 2015 20:32:33 +0000 (UTC)"
},
{
"name": "DKIM-Signature",
"value": "v=1; a=rsa-sha256; q=dns/txt; c=relaxed/simple; s=ug7nbtf4gccmlpwj322ax3p6ow6yfsug; d=amazonses.com; t=1442003552; h=From:To:Subject:MIME-Version:Content-Type:Content-Transfer-Encoding:Date:Message-ID:Feedback-ID; bh=DWr3IOmYWoXCA9ARqGC/UaODfghffiwFNRIb2Mckyt4=; b=p4ukUDSFqhqiub+zPR0DW1kp7oJZakrzupr6LBe6sUuvqpBkig56UzUwc29rFbJF hlX3Ov7DeYVNoN38stqwsF8ivcajXpQsXRC1cW9z8x875J041rClAjV7EGbLmudVpPX 4hHst1XPyX5wmgdHIhmUuh8oZKpVqGi6bHGzzf7g="
},
{
"name": "From",
"value": "sender#example.com"
},
{
"name": "To",
"value": "recipient#example.com"
},
{
"name": "Subject",
"value": "Example subject"
},
{
"name": "MIME-Version",
"value": "1.0"
},
{
"name": "Content-Type",
"value": "text/plain; charset=UTF-8"
},
{
"name": "Content-Transfer-Encoding",
"value": "7bit"
},
{
"name": "Date",
"value": "Fri, 11 Sep 2015 20:32:32 +0000"
},
{
"name": "Message-ID",
"value": "<61967230-7A45-4A9D-BEC9-87CBCF2211C9#example.com>"
},
{
"name": "X-SES-Outgoing",
"value": "2015.09.11-54.240.9.183"
},
{
"name": "Feedback-ID",
"value": "1.us-east-1.Krv2FKpFdWV+KUYw3Qd6wcpPJ4Sv/pOPpEPSHn2u2o4=:AmazonSES"
}
],
"commonHeaders": {
"returnPath": "0000014fbe1c09cf-7cb9f704-7531-4e53-89a1-5fa9744f5eb6-000000#amazonses.com",
"from": [
"sender#example.com"
],
"date": "Fri, 11 Sep 2015 20:32:32 +0000",
"to": [
"recipient#example.com"
],
"messageId": "<61967230-7A45-4A9D-BEC9-87CBCF2211C9#example.com>",
"subject": "Example subject"
}
}
}
Regarding the question of how to write the Lambda: here is a portion of ours. The main thing to take away from it is the validation of the incoming record and the use of event.Records[0], which gives you the details:
exports.handler = function(event, context, callback) {
    var AWS = require('aws-sdk');
    // Validate characteristics of an SES event record; stop the rule set otherwise.
    if (!event ||
        !event.hasOwnProperty('Records') ||
        event.Records.length !== 1 ||
        !event.Records[0].hasOwnProperty('eventSource') ||
        event.Records[0].eventSource !== 'aws:ses' ||
        event.Records[0].eventVersion !== '1.0') {
        callback(null, {'disposition': 'STOP_RULE_SET'});
        return;
    }
    // The parsed message details live under Records[0].ses.mail.
    var email = event.Records[0].ses.mail;
    var subjectLine = event.Records[0].ses.mail.commonHeaders.subject;
    // ... rest of the processing elided ...
};
The key is event.Records[0].ses.mail. Unfortunately, I can't find the documentation of its structure via a Google search, though I am sure I have seen it before.
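In case it helps, the same fields can be read from a Python handler; a minimal sketch based on the event shape above (the returned disposition mirrors the one used in the JavaScript version):
def handler(event, context):
    # SES receipt-rule invocations put the notification under Records[0].ses.
    mail = event["Records"][0]["ses"]["mail"]
    subject = mail["commonHeaders"]["subject"]
    message_id = mail["messageId"]
    print("Received {} with subject {}".format(message_id, subject))
    return {"disposition": "CONTINUE"}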

Invalid Path error while inserting job from google cloud storage to google bigquery

I am trying to insert a job through an HTTP POST request, but I am getting an "Invalid path" error.
My request body is as follows:
{
"configuration": {
"load": {
"sourceUris": [
"gs://onianalytics/PersData.csv"
],
"schema": {
"fields": [
{
"name": "Name",
"type": "STRING"
},
{
"name": "Age",
"type": "INTEGER"
}
]
},
"destinationTable": {
"datasetId": "Test_Dataset",
"projectId": "lithe-anvil-404",
"tableId": "tb_test_Pers"
}
}
},
"jobReference": {
"jobId": "10",
"projectId": "lithe-anvil-404"
}
}
For the sourceUris parameter, I am passing "gs://onianalytics/PersData.csv", where onianalytics is my bucket name and PersData.csv is my CSV file (from which I want to upload data into Google BigQuery).
I am getting the following response:
"status": {
"state": "DONE",
"errorResult": {
"reason": "invalid",
"message": "Invalid path: gs://onianalytics/PersData.csv"
},
"errors": [
{
"reason": "invalid",
"message": "Invalid path: gs://onianalytics/PersData.csv"
}
]
},
"statistics": {
"creationTime": "1387276603674",
"startTime": "1387276603751",
"endTime": "1387276603751"
}
}
My bucket is under the same project ID, which has the BigQuery service activated. Also, I have Google Cloud Storage enabled under APIs & Auth. The following scopes are added while authenticating:
googleapis.com/auth/bigquery, googleapis.com/auth/cloud-platform, googleapis.com/auth/devstorage.full_control, googleapis.com/auth/devstorage.read_only, googleapis.com/auth/devstorage.read_write
I am inserting this job via the "Try it!" link, which is available at developers.google.com/bigquery/docs/reference/v2/jobs/insert.
In fact, I am able to create buckets and objects in Google Cloud Storage through the APIs. But when I try to insert a job from the uploaded object (which is a CSV file), I get the "Invalid Path" error. Can anyone please help me identify why this error is occurring?
The error I get when trying the code above is "Not found: URI gs://onianalytics/PersData.csv".
I'm wondering if instead of /onianalytics/ you had a different path with invalid characters?
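One quick way to rule out a bad path is to check the object directly with the Cloud Storage client library; a small sketch using the names from the question (assumes the google-cloud-storage package and credentials for the same project):
from google.cloud import storage

client = storage.Client(project="lithe-anvil-404")
blob = client.bucket("onianalytics").blob("PersData.csv")

# If this prints False, the "Invalid path" / "Not found" errors from the load job are expected.
print("object exists:", blob.exists())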