How to troubleshoot logging to cloudwatch from docker in an ECS container? - amazon-cloudwatch

I'm tearing my hair out over this.
Here's my setup:
- an ECS container instance (an EC2 instance) running
- an ECS task definition that runs
- a Docker container with a Python script that logs to stderr
I see a CloudWatch log group for the ECS task get created, but nothing I print to stderr appears.
My ECS container has the default ecsInstanceRole.
The ecsInstanceRole has the AmazonEC2ContainerServiceforEC2Role policy which is as follows:
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Action": [
"ecs:CreateCluster",
"ecs:DeregisterContainerInstance",
"ecs:DiscoverPollEndpoint",
"ecs:Poll",
"ecs:RegisterContainerInstance",
"ecs:StartTelemetrySession",
"ecs:UpdateContainerInstancesState",
"ecs:Submit*",
"ecr:GetAuthorizationToken",
"ecr:BatchCheckLayerAvailability",
"ecr:GetDownloadUrlForLayer",
"ecr:BatchGetImage",
"logs:CreateLogStream",
"logs:PutLogEvents"
],
"Resource": "*"
}
]
}
Task definition
{
"containerDefinitions": [
{
"dnsSearchDomains": [],
"logConfiguration": {
"logDriver": "awslogs",
"options": {
"awslogs-group": "torres",
"awslogs-region": "us-west-2",
"awslogs-stream-prefix": "torres"
}
},
"command": [
"sleep",
"infinity"
],
"cpu": 100,
"mountPoints": [
{
"readOnly": null,
"containerPath": "/mnt/efs",
"sourceVolume": "efs"
}
],
"memoryReservation": 512,
"image": "X.dkr.ecr.us-west-2.amazonaws.com/torres-docker:latest",
"interactive": true,
"essential": true,
"pseudoTerminal": true,
"readonlyRootFilesystem": false,
"privileged": false,
"name": "torres"
}
],
"family": "torres",
"volumes": [
{
"name": "efs",
"host": {
"sourcePath": "/mnt/efs"
}
}
]
}

I figured out the fix, but I don't know why it works.
When I changed my command from sleep infinity to one that actually runs the code (python script.py), the logging in CloudWatch started working.
Before, I was launching the container with sleep infinity, ssh-ing into the container, and launching script.py from the shell, and that was NOT logging. (In hindsight this makes sense: the awslogs driver only forwards the stdout/stderr of the container's main process, not the output of processes started from a separate shell session.)
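For reference, here is a minimal sketch of how the script itself can log so that the awslogs driver picks it up, assuming script.py runs as the task definition's command rather than from an exec'd or ssh'd shell:

```python
import logging
import sys

# The awslogs driver forwards only the stdout/stderr of the container's
# main process, so attach the log handler to stderr and make sure this
# script is the container's command, not something launched over ssh.
logger = logging.getLogger("torres")
handler = logging.StreamHandler(sys.stderr)
handler.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(message)s"))
logger.addHandler(handler)
logger.setLevel(logging.INFO)

logger.info("this line reaches CloudWatch when script.py is the container command")
```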

Related

Set Subnet ID and EC2 Key Name in EMR Cluster Config via Step Functions

As of November 2019, AWS Step Functions has native support for orchestrating EMR clusters, so we are trying to configure a cluster and run some jobs on it.
We could not find any documentation on how to set the SubnetId or the key name used for the EC2 instances in the cluster. Is there any such possibility?
As of now, our create-cluster step looks like the following:
"States": {
"Create an EMR cluster": {
"Type": "Task",
"Resource": "arn:aws:states:::elasticmapreduce:createCluster.sync",
"Parameters": {
"Name": "TestCluster",
"VisibleToAllUsers": true,
"ReleaseLabel": "emr-5.26.0",
"Applications": [
{ "Name": "spark" }
],
"ServiceRole": "SomeRole",
"JobFlowRole": "SomeInstanceProfile",
"LogUri": "s3://some-logs-bucket/logs",
"Instances": {
"KeepJobFlowAliveWhenNoSteps": true,
"InstanceFleets": [
{
"Name": "MasterFleet",
"InstanceFleetType": "MASTER",
"TargetOnDemandCapacity": 1,
"InstanceTypeConfigs": [
{
"InstanceType": "m3.2xlarge"
}
]
},
{
"Name": "CoreFleet",
"InstanceFleetType": "CORE",
"TargetSpotCapacity": 2,
"InstanceTypeConfigs": [
{
"InstanceType": "m3.2xlarge",
"BidPriceAsPercentageOfOnDemandPrice": 100 }
]
}
]
}
},
"ResultPath": "$.cluster",
"End": "true"
}
}
As soon as we try to add a "SubnetId" key to any of the sub-objects in Parameters, or to Parameters itself, we get the error:
Invalid State Machine Definition: 'SCHEMA_VALIDATION_FAILED: The field "SubnetId" is not supported by Step Functions at /States/Create an EMR cluster/Parameters' (Service: AWSStepFunctions; Status Code: 400; Error Code: InvalidDefinition;
Referring to the SF docs on the emr integration we can see that createCluster.sync uses the emr API RunJobFlow. In RunJobFlow we can specify the Ec2KeyName and Ec2SubnetId located at the paths $.Instances.Ec2KeyName and $.Instances.Ec2SubnetId.
With that said, I managed to create a state machine with the following definition. (On a side note, your definition has a syntax error: "End": "true" should be "End": true.)
{
"Comment": "A Hello World example of the Amazon States Language using Pass states",
"StartAt": "Create an EMR cluster",
"States": {
"Create an EMR cluster": {
"Type": "Task",
"Resource": "arn:aws:states:::elasticmapreduce:createCluster.sync",
"Parameters": {
"Name": "TestCluster",
"VisibleToAllUsers": true,
"ReleaseLabel": "emr-5.26.0",
"Applications": [
{
"Name": "spark"
}
],
"ServiceRole": "SomeRole",
"JobFlowRole": "SomeInstanceProfile",
"LogUri": "s3://some-logs-bucket/logs",
"Instances": {
"Ec2KeyName": "ENTER_EC2KEYNAME_HERE",
"Ec2SubnetId": "ENTER_EC2SUBNETID_HERE",
"KeepJobFlowAliveWhenNoSteps": true,
"InstanceFleets": [
{
"Name": "MasterFleet",
"InstanceFleetType": "MASTER",
"TargetOnDemandCapacity": 1,
"InstanceTypeConfigs": [
{
"InstanceType": "m3.2xlarge"
}
]
},
{
"Name": "CoreFleet",
"InstanceFleetType": "CORE",
"TargetSpotCapacity": 2,
"InstanceTypeConfigs": [
{
"InstanceType": "m3.2xlarge",
"BidPriceAsPercentageOfOnDemandPrice": 100
}
]
}
]
}
},
"ResultPath": "$.cluster",
"End": true
}
}
}
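For comparison, the same Instances shape can be expressed as parameters to the underlying RunJobFlow API that createCluster.sync wraps. This boto3 sketch only builds the request dict; the actual call is commented out because it needs credentials, and every name and ID is a placeholder:

```python
# Sketch of the RunJobFlow request that elasticmapreduce:createCluster.sync
# wraps. All names, roles, and IDs below are placeholders.
params = {
    "Name": "TestCluster",
    "VisibleToAllUsers": True,
    "ReleaseLabel": "emr-5.26.0",
    "Applications": [{"Name": "spark"}],
    "ServiceRole": "SomeRole",
    "JobFlowRole": "SomeInstanceProfile",
    "LogUri": "s3://some-logs-bucket/logs",
    "Instances": {
        # Ec2KeyName and Ec2SubnetId sit under Instances, matching the
        # $.Instances.Ec2KeyName / $.Instances.Ec2SubnetId paths above.
        "Ec2KeyName": "ENTER_EC2KEYNAME_HERE",
        "Ec2SubnetId": "ENTER_EC2SUBNETID_HERE",
        "KeepJobFlowAliveWhenNoSteps": True,
    },
}

# import boto3
# emr = boto3.client("emr", region_name="us-west-2")
# response = emr.run_job_flow(**params)
```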

Can step functions wait on a static website?

If I deploy a static website with S3 and API Gateway, is there any way for a step function to wait for some activity, then redirect the user from that static website to another page?
WeCanBeFriends,
This is possible using the Job Status Poller pattern, tweaked slightly. If the "job" is to deploy the website, then the condition to complete the job is to see some activity come in (ideally through CloudWatch metrics).
Once you've seen enough metrics to be comfortable with the deployment, you can either push a notification to the web app to tell it to redirect (using a Lambda function that calls SNS, as in the wait-timer sample) or have the web app poll the execution status until it's complete.
Below is a very simple variation on the Job Status Poller that illustrates the idea:
{
"Comment": "A state machine that publishes to SNS after a deployment completes.",
"StartAt": "StartDeployment",
"States": {
"StartDeployment": {
"Type": "Task",
"Resource": "arn:aws:lambda:us-east-1:012345678912:function:KickOffDeployment",
"ResultPath": "$.guid",
"Next": "CheckIfDeploymentComplete"
},
"CheckIfDeploymentComplete": {
"Type": "Task",
"Resource": "arn:aws:lambda:us-east-1:012345678912:function:CheckIfDeploymentComplete",
"Next": "TriggerWebAppRefresh",
"InputPath": "$.guid",
"ResultPath": "$.status",
"Retry": [ {
"ErrorEquals": [ "INPROGRESS" ],
"IntervalSeconds": 5,
"MaxAttempts": 240,
"BackoffRate": 1.0
} ],
"Catch": [ {
"ErrorEquals": ["FAILED"],
"Next": "DeploymentFailed"
}]
},
"DeploymentFailed": {
"Type": "Fail",
"Cause": "Deployment failed",
"Error": "Deployment FAILED"
},
"TriggerWebAppRefresh": {
"Type": "Task",
"Resource": "arn:aws:lambda:us-east-1:012345678912:function:SendSNSToWebapp",
"InputPath": "$.guid",
"End": true
}
}
}
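To make the Retry/Catch wiring concrete: for a Python Lambda, Step Functions matches ErrorEquals against the exception class name, so CheckIfDeploymentComplete might look roughly like this sketch (get_deployment_status is a hypothetical helper standing in for whatever metric or deployment check you use):

```python
# Sketch of the CheckIfDeploymentComplete Lambda. Raising an exception
# named INPROGRESS triggers the 5-second Retry loop in the state machine;
# raising FAILED takes the Catch branch to DeploymentFailed.

class INPROGRESS(Exception):
    """Deployment not finished yet; Step Functions will retry."""

class FAILED(Exception):
    """Deployment failed; Step Functions takes the Catch branch."""

def get_deployment_status(guid):
    # Hypothetical helper: inspect CloudWatch metrics or deployment
    # state for the given guid here.
    return "COMPLETE"

def handler(event, context):
    status = get_deployment_status(event)
    if status == "IN_PROGRESS":
        raise INPROGRESS("deployment still in progress")
    if status == "FAILED":
        raise FAILED("deployment failed")
    return {"status": "COMPLETE"}
```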

Unable to access S3 from EC2 Instance in Cloudformation -- A client error (301) occurred when calling the HeadObject operation: Moved Permanently

I'm trying to download a file from an S3 bucket to an instance through the userdata property of the instance. However, I get the error:
A client error (301) occurred when calling the HeadObject operation:
Moved Permanently.
I use an IAM role, a managed policy, and an instance profile to give the instance access to the S3 bucket:
"Role": {
"Type": "AWS::IAM::Role",
"Properties": {
"AssumeRolePolicyDocument": {
"Statement": [
{
"Effect": "Allow",
"Principal": {
"Service": [
"ec2.amazonaws.com",
"s3.amazonaws.com"
]
},
"Action": [
"sts:AssumeRole"
]
}
]
},
"Path": "/",
"ManagedPolicyArns": [
{
"Ref": "ManagedPolicy"
}
]
},
"Metadata": {
"AWS::CloudFormation::Designer": {
"id": "069d4411-2718-400f-98dd-529bb95fd531"
}
}
},
"RolePolicy": {
"Type": "AWS::IAM::Policy",
"Properties": {
"PolicyName": "S3Download",
"PolicyDocument": {
"Statement": [
{
"Action": [
"s3:*"
],
"Effect": "Allow",
"Resource": "arn:aws:s3:::mybucket/*"
}
]
},
"Roles": [
{
"Ref": "Role"
}
]
},
"Metadata": {
"AWS::CloudFormation::Designer": {
"id": "babd8869-948c-4b8a-958d-b1bff9d3063b"
}
}
},
"InstanceProfile": {
"Type": "AWS::IAM::InstanceProfile",
"Properties": {
"Path": "/",
"Roles": [
{
"Ref": "Role"
}
]
},
"Metadata": {
"AWS::CloudFormation::Designer": {
"id": "890c4df0-5d25-4f2c-b81e-05a8b8ab37c4"
}
}
},
And I attempt to download the file using this line in the userdata property:
aws s3 cp s3://mybucket/login.keytab destination_directory/
Any thoughts as to what is going wrong? I can download the file successfully if I make it public then use wget from the command line, but for some reason the bucket/file can't be found when using cp and the file isn't publicly accessible.
Moved Permanently normally indicates that you are being redirected to the location of the object. This is normally because the request is being sent to an endpoint that is in a different region.
Add a --region parameter where the region matches the bucket's region. For example:
aws s3 cp s3://mybucket/login.keytab destination_directory/ --region ap-southeast-2
Alternatively, you can add a default region to the instance's AWS configuration (the /root/.aws/config file) with a line like region = ap-southeast-2.
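One way to avoid guessing the region: ask S3 where the bucket lives first. This Python sketch normalizes the get_bucket_location response; the boto3 call itself is commented out because it needs credentials and a real bucket. (The normalization quirk is real: the API reports us-east-1 as a null LocationConstraint, and some legacy eu-west-1 buckets as "EU".)

```python
def bucket_region(location_constraint):
    # get_bucket_location returns None for us-east-1 and the legacy
    # value "EU" for older eu-west-1 buckets; map both to usable names.
    if location_constraint is None:
        return "us-east-1"
    if location_constraint == "EU":
        return "eu-west-1"
    return location_constraint

# import boto3
# s3 = boto3.client("s3")
# constraint = s3.get_bucket_location(Bucket="mybucket")["LocationConstraint"]
# print(bucket_region(constraint))  # pass this as --region to aws s3 cp
```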

Create env fails when using a daemonset to create processes in Kubernetes

I want to deploy a piece of software to nodes with a DaemonSet, but it is not a Docker app. I created a DaemonSet JSON like this:
"template": {
"metadata": {
"creationTimestamp": null,
"labels": {
"app": "uniagent"
},
"annotations": {
"scheduler.alpha.kubernetes.io/tolerations": "[{\"key\":\"beta.k8s.io/accepted-app\",\"operator\":\"Exists\", \"effect\":\"NoSchedule\"}]"
},
"enable": true
},
"spec": {
"restartPolicy": "Always",
"terminationGracePeriodSeconds": 30,
"dnsPolicy": "ClusterFirst",
"securityContext": {},
"processes": [
{
"name": "foundation",
"package": "xxxxx",
"resources": {
"limits": {
"cpu": "100m",
"memory": "1Gi"
}
},
"lifecyclePlan": {
"kind": "ProcessLifecycle",
"namespace": "engb",
"name": "app-plc"
},
"env": [
{
"name": "SECRET_USERNAME",
"valueFrom": {
"secretKeyRef": {
"name": "key-secret",
"key": "uniagentuser"
}
}
},
{
"name": "SECRET_PASSWORD",
"valueFrom": {
"secretKeyRef": {
"name": "key-secret",
"key": "uniagenthash"
}
}
}
]
},
When the deployment succeeds, the env variables do not exist at all.
What should I do to solve this problem?
Thanks
DaemonSets have to be Docker containers; you can't run non-containerized programs as a DaemonSet (https://kubernetes.io/docs/concepts/workloads/controllers/daemonset/). Kubernetes only launches containers.
Also, your manifest contains a "processes" key, which is not valid, so I doubt you actually deployed it successfully.
You have not pasted the full file, but I'm guessing the "template" key at the beginning is the spec.template key of the manifest.
Run kubectl explain daemonset.spec.template.spec and you'll see that there is no "processes" field.
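For comparison, a minimal valid pod template uses a containers array instead of "processes". This is only a sketch: the image name is a placeholder, and the software would first have to be packaged as a container image.

```json
"template": {
  "metadata": {
    "labels": { "app": "uniagent" }
  },
  "spec": {
    "containers": [
      {
        "name": "uniagent",
        "image": "registry.example.com/uniagent:latest",
        "env": [
          {
            "name": "SECRET_USERNAME",
            "valueFrom": {
              "secretKeyRef": { "name": "key-secret", "key": "uniagentuser" }
            }
          }
        ]
      }
    ]
  }
}
```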

Take Scheduled EBS Snapshots using Cloud Watch and Cloud Formation

The task seems to be simple: I want to take scheduled EBS snapshots of my EBS volumes on a daily basis. According to the documentation, CloudWatch seems to be the right place to do that:
http://docs.aws.amazon.com/AmazonCloudWatch/latest/events/TakeScheduledSnapshot.html
Now I want to create such a scheduled rule when launching a new stack with CloudFormation. For this, there is a new resource type AWS::Events::Rule:
http://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-resource-events-rule.html
But now comes the tricky part: how can I use this resource type to create a built-in target that creates my EBS snapshot, as described in the scenario above?
I'm pretty sure there is a way to do it, but I can't figure it out right now. My resource template looks like this at the moment:
"DailyEbsSnapshotRule": {
"Type": "AWS::Events::Rule",
"Properties": {
"Description": "creates a daily snapshot of EBS volume (8 a.m.)",
"ScheduleExpression": "cron(0 8 * * ? *)",
"State": "ENABLED",
"Targets": [{
"Arn": { "Fn::Join": [ "", [ "arn:aws:ec2:", { "Ref": "AWS::Region" }, ":", { "Ref": "AWS::AccountId" }, ":volume/", { "Ref": "EbsVolume" } ] ] },
"Id": "SomeId1"
}]
}
}
Any ideas?
I found the solution to this over on a question about how to do it in Terraform. The solution in plain CloudFormation JSON seems to be:
"EBSVolume": {
"Type": "AWS::EC2::Volume",
"Properties": {
"Size": 1
}
},
"EBSSnapshotRole": {
"Type": "AWS::IAM::Role",
"Properties": {
"AssumeRolePolicyDocument": {
"Version" : "2012-10-17",
"Statement": [ {
"Effect": "Allow",
"Principal": {
"Service": ["events.amazonaws.com", "ec2.amazonaws.com"]
},
"Action": ["sts:AssumeRole"]
}]
},
"Path": "/",
"Policies": [{
"PolicyName": "root",
"PolicyDocument": {
"Version" : "2012-10-17",
"Statement": [ {
"Effect": "Allow",
"Action": [
"ec2:CreateSnapshot"
],
"Resource": "*"
} ]
}
}]
}
},
"EBSSnapshotRule": {
"Type": "AWS::Events::Rule",
"Properties": {
"Description": "creates a daily snapshot of EBS volume (1 a.m.)",
"ScheduleExpression": "cron(0 1 * * ? *)",
"State": "ENABLED",
"Name": {"Ref": "AWS::StackName"},
"RoleArn": {"Fn::GetAtt" : ["EBSSnapshotRole", "Arn"]},
"Targets": [{
"Arn": {
"Fn::Join": [
"",
[
"arn:aws:automation:",
{"Ref": "AWS::Region"},
":",
{"Ref": "AWS::AccountId"},
":action/",
"EBSCreateSnapshot/EBSCreateSnapshot_",
{"Ref": "AWS::StackName"}
]
]
},
"Input": {
"Fn::Join": [
"",
[
"\"arn:aws:ec2:",
{"Ref": "AWS::Region"},
":",
{"Ref": "AWS::AccountId"},
":volume/",
{"Ref": "EBSVolume"},
"\""
]
]
},
"Id": "EBSVolume"
}]
}
}
Unfortunately, it is not (yet) possible to set up scheduled EBS snapshots via CloudWatch Events within a CloudFormation stack.
It is a bit hidden in the docs: http://docs.aws.amazon.com/AmazonCloudWatchEvents/latest/APIReference/API_PutTargets.html
Note that creating rules with built-in targets is supported only in the AWS Management Console.
And "EBSCreateSnapshot" is one of these so-called "built-in targets".
Amazon seems to have removed these "built-in" targets, and it has now become possible to create CloudWatch rules that schedule EBS snapshots.
First you must create a rule, which the targets will be attached to.
Replace XXXXXXXXXXXXX with your AWS account ID:
aws events put-rule \
--name create-disk-snapshot-for-ec2-instance \
--schedule-expression 'rate(1 day)' \
--description "Create EBS snapshot" \
--role-arn arn:aws:iam::XXXXXXXXXXXXX:role/AWS_Events_Actions_Execution
Then you simply add your targets (up to 10 targets are allowed per rule):
aws events put-targets \
--rule create-disk-snapshot-for-ec2-instance \
--targets "[{ \
\"Arn\": \"arn:aws:automation:eu-central-1:XXXXXXXXXXXXX:action/EBSCreateSnapshot/EBSCreateSnapshot_mgmt-disk-snapshots\", \
\"Id\": \"xxxx-yyyyy-zzzzz-rrrrr-tttttt\", \
\"Input\": \"\\\"arn:aws:ec2:eu-central-1:XXXXXXXXXXXXX:volume/<VolumeId>\\\"\" }]"
These days there's a better way to automate EBS snapshots: Data Lifecycle Manager (DLM), which is also available through CloudFormation. See these for details:
https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/snapshot-lifecycle.html
https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-resource-dlm-lifecyclepolicy.html
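For illustration, a minimal AWS::DLM::LifecyclePolicy resource might look like the sketch below. DlmRole is an assumed IAM role resource (it needs the DLM service permissions), and the Backup tag key/value is a placeholder, so treat this as a starting point rather than a drop-in:

```json
"DailySnapshotPolicy": {
  "Type": "AWS::DLM::LifecyclePolicy",
  "Properties": {
    "Description": "Daily EBS snapshots, retained for 7 days",
    "State": "ENABLED",
    "ExecutionRoleArn": { "Fn::GetAtt": ["DlmRole", "Arn"] },
    "PolicyDetails": {
      "ResourceTypes": ["VOLUME"],
      "TargetTags": [{ "Key": "Backup", "Value": "true" }],
      "Schedules": [{
        "Name": "DailySnapshots",
        "CreateRule": { "Interval": 24, "IntervalUnit": "HOURS", "Times": ["01:00"] },
        "RetainRule": { "Count": 7 }
      }]
    }
  }
}
```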