How to achieve idempotent behavior in a Lambda that writes to S3?

I have an AWS Lambda that does some calculation and writes its output to an S3 location. The Lambda is triggered by a CloudWatch cron expression. Since the Lambda can be triggered multiple times, I want to modify the Lambda code so that it handles repeated triggers.
The only major side effects of my Lambda are writing to S3 and sending a mail. In this case, how do I ensure idempotent behavior even when the Lambda executes multiple times?

You need a unique ID, and you need to check for it before processing.
See if the event has one. For example, the sample event for a "cron expression" CloudWatch rule suggests that you will get something like this:
{
"version": "0",
"id": "89d1a02d-5ec7-412e-82f5-13505f849b41",
"detail-type": "Scheduled Event",
"source": "aws.events",
"account": "123456789012",
"time": "2016-12-30T18:44:49Z",
"region": "us-east-1",
"resources": [
"arn:aws:events:us-east-1:123456789012:rule/SampleRule"
],
"detail": {}
}
In this case your code would read event.id and write it to S3 as a marker key (say yourbucket/running/89d1a02d-5ec7-412e-82f5-13505f849b41). When the Lambda starts, it can list the keys under "yourbucket/running" and see if event.id matches any of them.
If none matches, create the key and carry on with the work. If one matches, the event has already been handled, so exit without repeating the side effects.
It's not a bullet-proof solution. You might conceivably run into a race condition, say if a second Lambda starts up and checks for the key before the first one has finished creating it, but that is what you have to live with if you would like to use S3 for this.
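The check described above can be sketched in Python roughly as follows. This is a minimal sketch, not a definitive implementation: the function names, the "running/" prefix handling, and the injectable s3 parameter are assumptions made for illustration, and the check-then-create race mentioned above still applies.

```python
def claim_event(s3, bucket, event_id, prefix="running/"):
    """Return True if this invocation is the first to see event_id.

    `s3` is expected to behave like a boto3 S3 client.
    """
    key = f"{prefix}{event_id}"
    # Look for the exact marker key; MaxKeys=1 is enough because the
    # exact key sorts before any longer key sharing the same prefix.
    listing = s3.list_objects_v2(Bucket=bucket, Prefix=key, MaxKeys=1)
    if any(obj["Key"] == key for obj in listing.get("Contents", [])):
        return False  # a previous invocation already handled this event
    # No marker yet: create it so later duplicates are skipped.
    s3.put_object(Bucket=bucket, Key=key, Body=b"")
    return True


def handler(event, context, s3=None, bucket="yourbucket"):
    """Lambda entry point; `s3` and `bucket` are injectable for testing."""
    if s3 is None:
        import boto3  # imported lazily so the sketch can be tested offline
        s3 = boto3.client("s3")
    if not claim_event(s3, bucket, event["id"]):
        return "skipped"
    # The calculation, the S3 output write and the mail would go here.
    return "processed"
```

Calling handler twice with the same event should do the work once and return "skipped" the second time.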

Related

Filter EC2 based on tag while sending EC2 Instance State-change Notification through SNS using Cloudwatch event rule

I am trying to configure an AWS event rule using an event pattern. By default the pattern is
{
"source": [
"aws.ec2"
],
"detail-type": [
"EC2 Instance State-change Notification"
]
}
I want to filter the EC2 instances by tag. Say all of my instances have a unique App ID tag attached, e.g. 20567. I want to filter because other teams have EC2 instances in the same AWS account, and I want to configure SNS only for the instances that belong to me, based on the 'App ID' tag.
As the target I have selected an SNS topic, using an input formatter with the value
{"instance":"$.detail.instance-id","state":"$.detail.state","time":"$.time","region":"$.region","account":"$.account"}
Any suggestions on where I can pass the tag key and value to filter my EC2 instances?
I can only speak for CloudWatch Events (now called EventBridge). We do not get tag information from EC2 prior to rule matching, so a rule cannot filter on tags. A sample EC2 event is shown at https://docs.aws.amazon.com/eventbridge/latest/userguide/event-types.html#ec2-event-type
{
"id":"7bf73129-1428-4cd3-a780-95db273d1602",
"detail-type":"EC2 Instance State-change Notification",
"source":"aws.ec2",
"account":"123456789012",
"time":"2015-11-11T21:29:54Z",
"region":"us-east-1",
"resources":[
"arn:aws:ec2:us-east-1:123456789012:instance/i-abcd1111"
],
"detail":{
"instance-id":"i-abcd1111",
"state":"pending"
}
}
So your best course of action would be to fetch the tags for the resource once the event arrives and filter out the events you don't want after reading them.
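One way to do that filtering is to point the rule at a Lambda instead of SNS and let the Lambda publish only for matching instances. The following is a rough sketch under that assumption: the handler, the TOPIC_ARN environment variable, and the injectable ec2 and sns parameters are hypothetical, while the 'App ID' tag and the value 20567 come from the question.

```python
import json
import os


def instance_has_tag(ec2, instance_id, tag_key, tag_value):
    """Ask EC2 whether the instance carries tag_key with tag_value."""
    resp = ec2.describe_tags(
        Filters=[
            {"Name": "resource-id", "Values": [instance_id]},
            {"Name": "key", "Values": [tag_key]},
        ]
    )
    return any(tag["Value"] == tag_value for tag in resp.get("Tags", []))


def handler(event, context, ec2=None, sns=None, topic_arn=None):
    """Forward the state change to SNS only for instances tagged as ours."""
    if ec2 is None or sns is None:
        import boto3  # imported lazily so the sketch can be tested offline
        ec2 = ec2 or boto3.client("ec2")
        sns = sns or boto3.client("sns")
    # TOPIC_ARN is an assumed environment variable, not part of the question.
    topic_arn = topic_arn or os.environ["TOPIC_ARN"]

    detail = event["detail"]
    if not instance_has_tag(ec2, detail["instance-id"], "App ID", "20567"):
        return False  # someone else's instance: drop the event

    message = {
        "instance": detail["instance-id"],
        "state": detail["state"],
        "time": event["time"],
        "region": event["region"],
        "account": event["account"],
    }
    sns.publish(TopicArn=topic_arn, Message=json.dumps(message))
    return True
```

The message mirrors the fields from the input formatter in the question, so subscribers should see the same shape as before.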

Exporting Cloudwatch logs in original format

I am looking for a way to export CloudWatch logs to S3 in their original form. I used the console to export a day's worth of logs from a log group, and it seems that a timestamp was prepended to each line, breaking the original JSON formatting. I wanted to import this into Glue as a JSON file for a test transformation script. The original data is formatted as a normal JSON string when it is sent to CloudWatch, and it normally looks like:
{ "a": 123, "b": "456", "c": 789 }
After exporting and decompressing the data it looks like
2019-06-28T00:00:00.099Z { "a": 123, "b": "456", "c": 789 }
This breaks reading the line as a JSON string, since it is no longer in a standard format.
The dataset is fairly large (100GB+) for this run and will possibly grow larger in the future, so running a CLI command and processing each line locally isn't feasible in my opinion. Is there any known way to do what I am looking to do?
Thank you
Timestamps are added automatically when you push logs to CloudWatch.
Every log event in CloudWatch has a timestamp.
You can create a subscription filter that sends the log group to Kinesis Firehose, use a Lambda function on the Firehose stream to format the log events (remove the timestamp), and then store the logs in S3.
https://docs.aws.amazon.com/AmazonCloudWatch/latest/logs/Subscriptions.html
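A Firehose transformation Lambda for this could look roughly like the sketch below. It assumes the payload format documented for CloudWatch Logs subscriptions (base64-encoded, gzip-compressed JSON with a messageType and a logEvents list whose message field holds the original line); the function names are illustrative.

```python
import base64
import gzip
import json


def transform_record(record):
    """Turn one Firehose record into newline-delimited original log lines."""
    payload = json.loads(gzip.decompress(base64.b64decode(record["data"])))

    # Control messages only test connectivity and carry no log data.
    if payload.get("messageType") != "DATA_MESSAGE":
        return {"recordId": record["recordId"], "result": "Dropped"}

    # Keep only the message text; the timestamp is a separate field and
    # is simply not written out.
    lines = "".join(
        log_event["message"].rstrip("\n") + "\n"
        for log_event in payload["logEvents"]
    )
    return {
        "recordId": record["recordId"],
        "result": "Ok",
        "data": base64.b64encode(lines.encode("utf-8")).decode("ascii"),
    }


def handler(event, context):
    """Firehose data-transformation entry point."""
    return {"records": [transform_record(r) for r in event["records"]]}
```

Each output record then contains one JSON document per line, which is the layout a Glue JSON reader would typically expect.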

How can I delete an existing S3 event notification?

When I try to delete an event notification from S3, I get the following message:
Unable to validate the following destination configurations. Not authorized to invoke function [arn:aws:lambda:eu-west-1:FOOBAR:function:FOOBAR]. (arn:aws:lambda:eu-west-1:FOOBAR:function:FOOBAR, null)
Nobody in my organization seems to be able to delete that - not even admins.
When I try to set the same S3 event notification in AWS Lambda as a trigger via the web interface, I get
Configuration is ambiguously defined. Cannot have overlapping suffixes in two rules if the prefixes are overlapping for the same event type. (Service: Amazon S3; Status Code: 400; Error Code: InvalidArgument; Request ID: FOOBAR; S3 Extended Request ID: FOOBAR/FOOBAR/FOOBAR)
How can I delete that existing event notification? How can I further investigate the problem?
I was having the same problem tonight and did the following:
1) Issue the command:
aws s3api put-bucket-notification-configuration --bucket=mybucket --notification-configuration="{}"
2) In the console, delete the troublesome event.
Assuming you have better permissions from the CLI:
aws s3api put-bucket-notification-configuration --bucket=mybucket --notification-configuration='{"LambdaFunctionConfigurations": []}'
Retrieve all the notification configurations of a specific bucket:
aws s3api get-bucket-notification-configuration --bucket=mybucket > notification.sh
The notification.sh file would look like the following:
{
"LambdaFunctionConfigurations": [
{
"Id": ...,
"LambdaFunctionArn": ...,
"Events": [...],
"Filter": {...}
},
{ ... }
]
}
Remove the unwanted notification object from notification.sh, then modify the file so that it looks like the following:
#! /bin/zsh
aws s3api put-bucket-notification-configuration --bucket=mybucket --notification-configuration='{
"LambdaFunctionConfigurations": [
{
"Id": ...,
"LambdaFunctionArn": ...,
"Events": [...],
"Filter": {...}
},
{ ... }
]
}'
Run the shell script:
source notification.sh
There is no 's3api delete notification-configuration' command in the AWS CLI. Only 's3api put-bucket-notification-configuration' is available, and it overwrites any previously existing events on the S3 bucket. So if you wish to delete one specific event only, you need to handle that programmatically.
Something like this:
Step 1. Do a 's3api get-bucket-notification-configuration' and get the s3-notification.json file.
Step 2. Now edit this file to reach the required s3-notification.json file using your code.
Step 3. Finally, do 's3api put-bucket-notification-configuration' (aws s3api put-bucket-notification-configuration --bucket my-bucket --notification-configuration file://s3-notification.json)
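The three steps can also be done in one go with boto3. The sketch below is an assumption-laden illustration rather than the answerer's code: the helper names are made up, and it assumes the notification to delete can be identified by its Id.

```python
def without_notification(config, notification_id):
    """Return a copy of a notification configuration minus one rule.

    `config` is the dict returned by get_bucket_notification_configuration.
    """
    kept = {}
    for section, rules in config.items():
        if section == "ResponseMetadata":
            continue  # response bookkeeping, not part of the configuration
        if isinstance(rules, list):
            # Topic, queue and Lambda sections are lists of rules with an Id.
            kept[section] = [r for r in rules if r.get("Id") != notification_id]
        else:
            kept[section] = rules
    return kept


def delete_notification(s3, bucket, notification_id):
    """Get the configuration, drop one rule, and put the rest back."""
    current = s3.get_bucket_notification_configuration(Bucket=bucket)
    s3.put_bucket_notification_configuration(
        Bucket=bucket,
        NotificationConfiguration=without_notification(current, notification_id),
    )
```

With a real client this would be called as delete_notification(boto3.client("s3"), "my-bucket", "<Id of the rule>"). Because the put replaces the whole configuration, every rule that is not filtered out is written back unchanged.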
I worked out this logic with the AWS CLI; it requires the jq command to merge the JSON output.
I tried this but it didn't work for me. What did work: I uploaded a Lambda with the same function name but without any events, then went to the function in the dashboard and added a trigger with the same prefix and suffix. When you apply the changes the dashboard reports an error, but if you go back to the Lambda function you can see that the trigger is now linked to it, so after that you can remove the Lambda or the events.

Amazon Step Function with a Lambda that takes trigger from Kinesis

I am trying to create a simple pipeline in AWS. I want to execute a Step Functions state machine using data generated by a stream, which triggers the first Lambda of the state machine.
What I want to do is the following:
Input data is streamed by AWS Kinesis
This Kinesis stream is used as a trigger for a lambda1 that executes and writes to S3 Bucket.
This would trigger (using step function) a lambda2 that would read the content from the given bucket and write it to another bucket
Now I want to implement a state machine using Amazon Step Function. I have created the state machine which is quite straightforward
{
"Comment": "Linear step function test",
"StartAt": "lambda1",
"States": {
"lambda1": {
"Type": "Task",
"Resource": "arn:....",
"Next": "lambda2"
},
"lambda2": {
"Type": "Task",
"Resource": "arn:...",
"End": true
}
}
}
What I want is for Kinesis to trigger the first Lambda and, once it has executed, for the step function to execute lambda2. That does not seem to happen. The step function does nothing, even though lambda1 is triggered by the stream and writes to the S3 bucket. I have the option to manually start a new execution and pass JSON as input, but that is not the workflow I am looking for.
You are kicking off the state machine the wrong way.
You need to add another starter Lambda function that uses the SDK to start the state machine. The process looks like this:
kinesis -> starter(lambda) -> StateMachine (start Lambda 1 and Lambda 2)
The problem with Step Functions is the lack of triggers. There are only 3 ways to start an execution: CloudWatch Events, the SDK, or API Gateway.
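A starter Lambda along those lines might look like the sketch below. The STATE_MACHINE_ARN environment variable, the injectable sfn parameter, and the shape of the execution input are assumptions for illustration. Note that with this layout lambda1 receives the stream data as state input instead of being triggered by Kinesis directly.

```python
import base64
import json
import os


def handler(event, context, sfn=None, state_machine_arn=None):
    """Start one state machine execution per Kinesis record."""
    if sfn is None:
        import boto3  # imported lazily so the sketch can be tested offline
        sfn = boto3.client("stepfunctions")
    # STATE_MACHINE_ARN is an assumed environment variable.
    state_machine_arn = state_machine_arn or os.environ["STATE_MACHINE_ARN"]

    started = []
    for record in event["Records"]:
        # Kinesis delivers the payload base64-encoded.
        data = base64.b64decode(record["kinesis"]["data"]).decode("utf-8")
        response = sfn.start_execution(
            stateMachineArn=state_machine_arn,
            input=json.dumps({"data": data}),
        )
        started.append(response["executionArn"])
    return started
```

The starter is the function subscribed to the Kinesis stream; lambda1 and lambda2 stay inside the state machine definition shown in the question.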

Checking a SQL Azure Database is available from a c# code

I scale up an Azure SQL Database with code like this:
ALTER DATABASE [MYDATABASE] Modify (SERVICE_OBJECTIVE = 'S1');
How can I tell from C# code when Azure has completed the job and the database is available again?
Checking the SERVICE_OBJECTIVE value is not enough; the process still continues after it changes.
Instead of performing this task in T-SQL, I would perform it from C# by calling the REST API; you can find all of the details on MSDN.
Specifically, you should look at the Get Create or Update Database Status API method, which allows you to call the following URL:
GET https://management.azure.com/subscriptions/{subscription-id}/resourceGroups/{resource-group-name}/providers/Microsoft.Sql/servers/{server-name}/databases/{database-name}/operationResults/{operation-id}?api-version={api-version}
The JSON body carries the following properties:
{
"id": "{uri-of-database}",
"name": "{database-name}",
"type": "{database-type}",
"location": "{server-location}",
"tags": {
"{tag-key}": "{tag-value}",
...
},
"properties": {
"databaseId": "{database-id}",
"edition": "{database-edition}",
"status": "{database-status}",
"serviceLevelObjective": "{performance-level}",
"collation": "{collation-name}",
"maxSizeBytes": {max-database-size},
"creationDate": "{date-create}",
"currentServiceLevelObjectiveId":"{current-service-id}",
"requestedServiceObjectiveId":"{requested-service-id}",
"defaultSecondaryLocation": "{secondary-server-location}"
}
}
In the properties section, serviceLevelObjective is the property you can use to resize the database. To finish off, perform a GET on the Get Database API method and compare the currentServiceLevelObjectiveId and requestedServiceObjectiveId properties to confirm that your command has completed successfully.
Note: Don't forget to pass all of the common parameters required to make API calls in Azure.
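The comparison described above amounts to a polling loop. The sketch below shows the idea in Python for brevity (the same loop translates directly to C#). It is only an illustration: fetch_properties is a hypothetical callable standing in for the authenticated GET on the Get Database method, and the interval and timeout defaults are arbitrary.

```python
import time


def scale_finished(properties):
    """True once the current service objective matches the requested one."""
    return (
        properties["currentServiceLevelObjectiveId"]
        == properties["requestedServiceObjectiveId"]
    )


def wait_for_scale(fetch_properties, poll_seconds=10.0, timeout_seconds=1800.0,
                   sleep=time.sleep, clock=time.monotonic):
    """Poll until the scale operation completes or the timeout expires.

    `fetch_properties` is a hypothetical callable that performs the GET
    and returns the "properties" object of the response as a dict.
    """
    deadline = clock() + timeout_seconds
    while True:
        if scale_finished(fetch_properties()):
            return True
        if clock() >= deadline:
            return False  # gave up; the operation may still be running
        sleep(poll_seconds)
```

The sleep and clock parameters exist only so the loop can be exercised without real waiting.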