Make CloudWatch Event pass an integer timestamp instead of a UTC time string

I'm invoking a scheduled Step Function with a CloudWatch event. The input of the first batch job in the Step Functions state machine looks like the following:
{
  "version": "0",
  "id": "sdjlafgdf05-7c32-435hf-aa3b5a8sfade815",
  "detail-type": "Scheduled Event",
  "source": "aws.events",
  "account": "xxxxxxxx",
  "time": "2022-01-14T19:46:49Z",
  "region": "us-east-1",
  "resources": [
    "arn:aws:events:us-east-1:xxxxxxxxxxxx:rule/adfnwelkqnlngqrej-SAFFJKHF734"
  ],
  "detail": {}
}
I want the "time" field to give me an integer. Specifically, instead of "2022-01-14T19:46:49Z", I want "1642189609" (epoch in seconds), so I don't need to parse it in my batch job code. I'm using CDK to build the infrastructure. Is there any way of doing this?

At the moment this is not possible directly with native Step Functions intrinsic functions, so you have two options:
Do it via a Lambda function, as explained in this other answer
Pass it first to EventBridge and then create an EventBridge rule to forward it to CloudWatch
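A minimal sketch of the first option, assuming a small Lambda function is added as the first state of the state machine before the batch job; the handler below is illustrative, not the code from the linked answer:

# Hypothetical Lambda placed as the first state in the state machine.
# It parses the scheduled event's ISO-8601 "time" field and replaces it
# with epoch seconds, leaving the rest of the event untouched.
from datetime import datetime, timezone

def handler(event, context):
    iso_time = event["time"]  # e.g. "2022-01-14T19:46:49Z"
    parsed = datetime.strptime(iso_time, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)
    event["time"] = int(parsed.timestamp())  # e.g. 1642189609
    return event

The batch job state then receives the transformed event as its input, so no date parsing is needed in the job code itself.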

Related

How to save queries executed by Athena in a CloudWatch log group

I want to save the queries executed by Athena in a CloudWatch log group.
In CloudWatch, I created this rule:
{
  "source": [
    "aws.athena"
  ],
  "detail-type": [
    "Athena Query State Change"
  ],
  "detail": {
    "currentState": [
      "QUEUED",
      "RUNNING",
      "SUCCEEDED",
      "FAILED",
      "CANCELLED"
    ]
  }
}
And I attached the rule to a CloudWatch log group as its target.
I managed to register logs in CloudWatch -> Log groups -> /aws/events/TestAthena, but I don't have the information I want:
{
  "version": "0",
  "id": "a8bad43b-1b9a-da7e-c004-f3c920e1bddd",
  "detail-type": "Athena Query State Change",
  "source": "aws.athena",
  "account": "<account_id>",
  "time": "2021-08-23T15:54:13Z",
  "region": "eu-west-3",
  "resources": [],
  "detail": {
    "currentState": "RUNNING",
    "previousState": "QUEUED",
    "queryExecutionId": "b0fe7373-676d-43d5-b866-19d701c9dc56",
    "sequenceNumber": "2",
    "statementType": "DML",
    "versionId": "0",
    "workgroupName": "dev-Connect-CardBulk"
  }
}
I wish to have:
The query that was executed
The time the query was executed
The user who executed the query
Is it possible to have this with CloudWatch?
Thank you in advance for your help.
Out of the box, you can get metrics such as QueryPlanningTime, QueryQueueTime, etc.
Nonetheless, you need CloudTrail to track who executed a query.
Refer to these links:
List of CloudWatch Metrics and Dimensions for Athena
Monitoring Amazon Athena Queries using Amazon CloudWatch
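Since CloudTrail records the StartQueryExecution call, a hedged boto3 sketch of looking up who ran recent queries could look like the following; it assumes CloudTrail is recording management events in the region, and the field names come from the standard CloudTrail record format:

# Sketch: list recent Athena StartQueryExecution calls recorded by CloudTrail,
# printing when each query started and which IAM identity ran it.
import json
import boto3

cloudtrail = boto3.client("cloudtrail", region_name="eu-west-3")

response = cloudtrail.lookup_events(
    LookupAttributes=[
        {"AttributeKey": "EventName", "AttributeValue": "StartQueryExecution"}
    ],
    MaxResults=50,
)

for item in response["Events"]:
    record = json.loads(item["CloudTrailEvent"])  # full CloudTrail record as JSON
    print(
        item["EventTime"],                                             # when the query was executed
        record["userIdentity"].get("arn", "unknown"),                  # who executed it
        record.get("requestParameters", {}).get("queryString", ""),    # the query text, if recorded
    )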

How to monitor whether an EMR cluster is terminated using CloudWatch

I want to set an alarm for when any EMR cluster is terminated (caused by internal errors). I know there is an "IsIdle" metric, but my EMR clusters are designed to be persistent, so "IsIdle" does not really fit my case. Is there a health-check metric that I can use?
You can configure Amazon CloudWatch to send a "State Change" event to another service like an AWS Lambda function or an Amazon SNS topic.
To achieve this, open the CloudWatch console and, in the navigation pane, click Rules > Create rule.
Service Name: EMR
Event Type: State Change
Specific detail type(s): EMR Cluster State Change
Specific State: TERMINATED and TERMINATED_WITH_ERRORS
Targets: Put the receiving service of your choice.
Here's an example of such an event:
{
  "version": "0",
  "id": "8535abb0-f87e-4640-b7b6-8de000dfc30a",
  "detail-type": "EMR Cluster State Change",
  "source": "aws.emr",
  "account": "123456789012",
  "time": "2016-12-16T21:00:23Z",
  "region": "us-east-1",
  "resources": [],
  "detail": {
    "severity": "INFO",
    "stateChangeReason": "{\"code\":\"USER_REQUEST\",\"message\":\"Terminated by user request\"}",
    "name": "Development Cluster",
    "clusterId": "j-1YONHTCP3YZKC",
    "state": "TERMINATED",
    "message": "Amazon EMR Cluster j-1YONHTCP3YZKC (Development Cluster) has terminated at 2016-12-16 21:00 UTC with a reason of USER_REQUEST."
  }
}
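If you prefer to set the rule up programmatically rather than through the console, a rough boto3 equivalent of the steps above could look like this; the rule name and SNS topic ARN are placeholders:

# Sketch: create a CloudWatch Events rule that fires when an EMR cluster
# reaches TERMINATED or TERMINATED_WITH_ERRORS, and sends the event to SNS.
import json
import boto3

events = boto3.client("events")

event_pattern = {
    "source": ["aws.emr"],
    "detail-type": ["EMR Cluster State Change"],
    "detail": {"state": ["TERMINATED", "TERMINATED_WITH_ERRORS"]},
}

events.put_rule(
    Name="emr-cluster-terminated",          # placeholder rule name
    EventPattern=json.dumps(event_pattern),
    State="ENABLED",
)

events.put_targets(
    Rule="emr-cluster-terminated",
    Targets=[{"Id": "notify", "Arn": "arn:aws:sns:us-east-1:123456789012:emr-alerts"}],  # placeholder ARN
)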

CloudWatch rule to match an SSM hierarchy

I'd like to create a CloudWatch rule to trigger an action whenever an SSM parameter in a given hierarchy is updated (in my example, anything in the /config hierarchy).
If I put a rule matching the whole name of the parameter, the action gets triggered correctly.
I have tried the following so far:
{
  "source": [
    "aws.ssm"
  ],
  "detail-type": [
    "Parameter Store Change"
  ],
  "detail": {
    "name": [
      "/config/",
      "/config/*",
      "/config/%"
    ],
    "operation": [
      "Update"
    ]
  }
}
Is there any way to achieve such behaviour?
Not exactly what you want, but you can leave off the "name" array entirely. You will then get notifications for all parameters and can filter on the receiving side.
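As a rough illustration of that receive-side filtering, the rule (with no "name" constraint) could target a Lambda function like the sketch below; the prefix and the placeholder action are assumptions, not something from the original question:

# Sketch: act only on Parameter Store Change events whose parameter name
# falls under the /config/ hierarchy; everything else is ignored.
PREFIX = "/config/"

def handler(event, context):
    detail = event.get("detail", {})
    name = detail.get("name", "")
    if detail.get("operation") == "Update" and name.startswith(PREFIX):
        act_on_update(name)

def act_on_update(parameter_name):
    # placeholder for whatever action the rule should trigger
    print(f"Parameter updated: {parameter_name}")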

CloudWatch event for out-of-region creation

I am trying to create an auto-remediation process that will stop/delete any VPC, CloudFormation stack, Lambda function, internet gateway or EC2 instance created outside of the eu-central-1 region. My first step is to configure a CloudWatch event rule to detect any of the previously mentioned events.
{
  "source": [
    "aws.cloudtrail"
  ],
  "detail-type": [
    "AWS API Call via CloudTrail"
  ],
  "detail": {
    "eventSource": [
      "ec2.amazonaws.com",
      "cloudformation.amazonaws.com",
      "lambda.amazonaws.com"
    ],
    "eventName": [
      "CreateStack",
      "CreateVpc",
      "CreateFunction20150331",
      "CreateInternetGateway",
      "RunInstances"
    ],
    "awsRegion": [
      "us-east-1",
      "us-east-2",
      "us-west-1",
      "us-west-2",
      "ap-northeast-1",
      "ap-northeast-2",
      "ap-south-1",
      "ap-southeast-1",
      "ap-southeast-2",
      "ca-central-1",
      "eu-west-1",
      "eu-west-2",
      "eu-west-3",
      "sa-east-1"
    ]
  }
}
For now, the event should only trigger an SNS topic that will send me an email, but in the future there will be a Lambda function to do the remediation.
Unfortunately, when I go and create an internet gateway in another region (let's say eu-west-1), no notification occurs. The event does not appear if I want to set an alarm on it either, although it does appear in CloudWatch Events.
Any idea what could be wrong with my event config?
OK, I figured it out. The source of the event changes even if the notification comes from CloudTrail. The "source" parameter should therefore be:
"source": [
"aws.cloudtrail",
"aws.ec2",
"aws.cloudformation",
"aws.lambda"
]
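For completeness, a hedged boto3 sketch of creating the corrected rule follows; the rule name and SNS target ARN are placeholders, and the region list is the one from the question:

# Sketch: recreate the rule with the corrected "source" list and send matches to SNS.
import json
import boto3

WATCHED_REGIONS = [
    "us-east-1", "us-east-2", "us-west-1", "us-west-2",
    "ap-northeast-1", "ap-northeast-2", "ap-south-1",
    "ap-southeast-1", "ap-southeast-2", "ca-central-1",
    "eu-west-1", "eu-west-2", "eu-west-3", "sa-east-1",
]

event_pattern = {
    "source": ["aws.cloudtrail", "aws.ec2", "aws.cloudformation", "aws.lambda"],
    "detail-type": ["AWS API Call via CloudTrail"],
    "detail": {
        "eventSource": [
            "ec2.amazonaws.com",
            "cloudformation.amazonaws.com",
            "lambda.amazonaws.com",
        ],
        "eventName": [
            "CreateStack",
            "CreateVpc",
            "CreateFunction20150331",
            "CreateInternetGateway",
            "RunInstances",
        ],
        "awsRegion": WATCHED_REGIONS,
    },
}

events = boto3.client("events")
events.put_rule(
    Name="out-of-region-creation",           # placeholder rule name
    EventPattern=json.dumps(event_pattern),
    State="ENABLED",
)
events.put_targets(
    Rule="out-of-region-creation",
    Targets=[{"Id": "email", "Arn": "arn:aws:sns:eu-central-1:111122223333:out-of-region-alerts"}],  # placeholder
)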

AWS data pipeline activity with multiple inputs

As part of an AWS Data Pipeline, I have a Hive activity using two unstaged S3 data nodes as input. What I want is to be able to set two script variables on the activity, each pointing to an input data node, but I can't get the syntax right. With a single input, I could write the following and it would work just fine:
INPUT_FOO=#{input.directoryPath}
When I add the second input, I run into a problem of how to reference them since they are now an array of inputs, as you can see in the pipeline definition below. Essentially, I want to achieve the following, but can't figure out the correct syntax:
INPUT_FOO=#{input[1].directoryPath}
INPUT_BAR=#{input[2].directoryPath}
Here's the activity portion of the pipeline definition:
{
  "id": "ActivityId_7u1sR",
  "input": [
    {
      "ref": "DataNodeId_iYnxf"
    },
    {
      "ref": "DataNodeId_162Ka"
    }
  ],
  "schedule": {
    "ref": "DefaultSchedule"
  },
  "scriptUri": "#{myS3ScriptLocation}calculate-results.q",
  "name": "Perform Calculations",
  "runsOn": {
    "ref": "EmrClusterId_jHeiV"
  },
  "scriptVariable": [
    "INPUT_SOURCE1=#{input[1].directoryPath}",
    "OUTPUT=#{output.directoryPath}Results/",
    "INPUT_SOURCE2=#{input[2].directoryPath}"
  ],
  "output": {
    "ref": "DataNodeId_2jY6v"
  },
  "type": "HiveActivity",
  "stage": "false"
}
I plan to keep the tables unstaged and take care of table creation in the Hive script so that it's easier to run each Hive activity in isolation as well as in the pipeline itself.
Here's the error I see when using array syntax:
Unable to resolve input[1].directoryPath for object ActivityId_7u1sR'
As it stands now, this scenario is not supported, but a feature request was added to support it in the future.