Set Subnet ID and EC2 Key Name in EMR Cluster Config via Step Functions

As of November 2019, AWS Step Functions has native support for orchestrating EMR clusters, so we are trying to configure a cluster and run some jobs on it.
We could not find any documentation on how to set the SubnetId or the key name used for the EC2 instances in the cluster. Is this possible?
As of now, our create-cluster step looks as follows:
"States": {
"Create an EMR cluster": {
"Type": "Task",
"Resource": "arn:aws:states:::elasticmapreduce:createCluster.sync",
"Parameters": {
"Name": "TestCluster",
"VisibleToAllUsers": true,
"ReleaseLabel": "emr-5.26.0",
"Applications": [
{ "Name": "spark" }
],
"ServiceRole": "SomeRole",
"JobFlowRole": "SomeInstanceProfile",
"LogUri": "s3://some-logs-bucket/logs",
"Instances": {
"KeepJobFlowAliveWhenNoSteps": true,
"InstanceFleets": [
{
"Name": "MasterFleet",
"InstanceFleetType": "MASTER",
"TargetOnDemandCapacity": 1,
"InstanceTypeConfigs": [
{
"InstanceType": "m3.2xlarge"
}
]
},
{
"Name": "CoreFleet",
"InstanceFleetType": "CORE",
"TargetSpotCapacity": 2,
"InstanceTypeConfigs": [
{
"InstanceType": "m3.2xlarge",
"BidPriceAsPercentageOfOnDemandPrice": 100 }
]
}
]
}
},
"ResultPath": "$.cluster",
"End": "true"
}
}
As soon as we try to add a "SubnetId" key to any of the sub-objects of Parameters, or to Parameters itself, we get the error:
Invalid State Machine Definition: 'SCHEMA_VALIDATION_FAILED: The field "SubnetId" is not supported by Step Functions at /States/Create an EMR cluster/Parameters' (Service: AWSStepFunctions; Status Code: 400; Error Code: InvalidDefinition;

Referring to the Step Functions docs on the EMR integration, we can see that createCluster.sync uses the EMR API RunJobFlow. In RunJobFlow we can specify Ec2KeyName and Ec2SubnetId at the paths $.Instances.Ec2KeyName and $.Instances.Ec2SubnetId.
With that said, I managed to create a state machine with the following definition (on a side note, your definition has a syntax error: "End": "true" should be "End": true):
{
  "Comment": "A Hello World example of the Amazon States Language using Pass states",
  "StartAt": "Create an EMR cluster",
  "States": {
    "Create an EMR cluster": {
      "Type": "Task",
      "Resource": "arn:aws:states:::elasticmapreduce:createCluster.sync",
      "Parameters": {
        "Name": "TestCluster",
        "VisibleToAllUsers": true,
        "ReleaseLabel": "emr-5.26.0",
        "Applications": [
          {
            "Name": "spark"
          }
        ],
        "ServiceRole": "SomeRole",
        "JobFlowRole": "SomeInstanceProfile",
        "LogUri": "s3://some-logs-bucket/logs",
        "Instances": {
          "Ec2KeyName": "ENTER_EC2KEYNAME_HERE",
          "Ec2SubnetId": "ENTER_EC2SUBNETID_HERE",
          "KeepJobFlowAliveWhenNoSteps": true,
          "InstanceFleets": [
            {
              "Name": "MasterFleet",
              "InstanceFleetType": "MASTER",
              "TargetOnDemandCapacity": 1,
              "InstanceTypeConfigs": [
                {
                  "InstanceType": "m3.2xlarge"
                }
              ]
            },
            {
              "Name": "CoreFleet",
              "InstanceFleetType": "CORE",
              "TargetSpotCapacity": 2,
              "InstanceTypeConfigs": [
                {
                  "InstanceType": "m3.2xlarge",
                  "BidPriceAsPercentageOfOnDemandPrice": 100
                }
              ]
            }
          ]
        }
      },
      "ResultPath": "$.cluster",
      "End": true
    }
  }
}

Related

OpenDistro Kibana Index Policy Stopped Working

I've been using the Index Management Policy below for a little less than a year now with no issues, but a few months back it apparently stopped working: all indices that should fall into the "delete" category are now stuck in the "Evaluating transition conditions..." state. I have been searching for possible changes to the syntax but have not found any. I am also not aware of any updates having been applied to either the host machine or Kibana/Elasticsearch. What could possibly be the issue?
{
  "policy_id": "delete_14d",
  "description": "Deletes old indices after 14 days",
  "last_updated_time": 1661536875977,
  "schema_version": 1,
  "error_notification": null,
  "default_state": "hot",
  "states": [
    {
      "name": "hot",
      "actions": [
        {
          "read_write": {}
        }
      ],
      "transitions": [
        {
          "state_name": "Delete",
          "conditions": {
            "min_index_age": "14d"
          }
        }
      ]
    },
    {
      "name": "Delete",
      "actions": [
        {
          "delete": {}
        }
      ],
      "transitions": []
    }
  ],
  "ism_template": {
    "index_patterns": [
      "staging*"
    ],
    "priority": 100,
    "last_updated_time": 1632510094716
  }
}

How is S3 bucket name being derived in CloudFormation?

I have this CloudFormation script, template.js, that creates a bucket. I'm a bit unsure how the bucket name is being assembled.
Assuming my stack name is my-service, the bucket gets created as my-service-s3bucket-1p3s4szy5bomf.
I want to know how this name was derived.
I also want to get rid of that suffix at the end: -1p3s4szy5bomf.
Can I skip Outputs at the end? I'm not sure what they do.
Code in template.js
var stackTemplate = {
  "AWSTemplateFormatVersion": "2010-09-09",
  "Description": "with S3",
  "Resources": {
    "S3Bucket": {
      "Type": "AWS::S3::Bucket",
      "DeletionPolicy": "Retain",
      "Properties": {},
      "Metadata": {
        "AWS::CloudFormation::Designer": {
          "id": "bba483af-4ae6-4d3d-b37d-435f66c42e44"
        }
      }
    },
    "S3BucketAccessPolicy": {
      "Type": "AWS::IAM::Policy",
      "Properties": {
        "PolicyName": "S3BucketAccessPolicy",
        "Roles": [
          {
            "Ref": "IAMServiceRole"
          }
        ],
        "PolicyDocument": {
          "Version": "2012-10-17",
          "Statement": [
            {
              "Effect": "Allow",
              "Action": [
                "s3:DeleteObject",
                "s3:GetObject",
                "s3:PutObject",
                "s3:PutObjectAcl",
                "s3:List*"
              ],
              "Resource": [
                {
                  "Fn::Sub": [
                    "${S3BucketArn}",
                    {
                      "S3BucketArn": {
                        "Fn::GetAtt": ["S3Bucket", "Arn"]
                      }
                    }
                  ]
                },
                {
                  "Fn::Sub": [
                    "${S3BucketArn}/*",
                    {
                      "S3BucketArn": {
                        "Fn::GetAtt": ["S3Bucket", "Arn"]
                      }
                    }
                  ]
                }
              ]
            }
          ]
        }
      }
    }
  },
  "Outputs": {
    "s3Bucket": {
      "Description": "The created S3 bucket.",
      "Value": {
        "Ref": "S3Bucket"
      },
      "Export": {
        "Name": {
          "Fn::Sub": "${AWS::StackName}-S3Bucket"
        }
      }
    },
    "s3BucketArn": {
      "Description": "The ARN of the created S3 bucket.",
      "Value": {
        "Fn::GetAtt": ["S3Bucket", "Arn"]
      },
      "Export": {
        "Name": {
          "Fn::Sub": "${AWS::StackName}-S3BucketArn"
        }
      }
    }
  }
};
stackUtils.assembleStackTemplate(stackTemplate, module);
I want to know how this name was derived
If you don't specify a name for your bucket, CloudFormation generates one based on the pattern $name-of-stack-s3bucket-$generatedId.
From the documentation: https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-resource-s3-bucket.html
BucketName
A name for the bucket. If you don't specify a name, AWS CloudFormation generates a unique ID and uses that ID for the bucket name.
I also want to get rid of that suffix at the end: -1p3s4szy5bomf.
You can assign a name to your bucket, but AWS recommends leaving it empty so that CloudFormation generates a unique one and avoids collisions with existing buckets of the same name (stack sets, etc.). Example:
"Resources": {
"S3Bucket": {
"Type": "AWS::S3::Bucket",
"DeletionPolicy": "Retain",
"Properties": {
"BucketName": "DesiredNameOfBucket" <==
},
"Metadata": {
"AWS::CloudFormation::Designer": {
"id": "bba483af-4ae6-4d3d-b37d-435f66c42e44"
}
}
},
Can I skip Outputs at the end? I'm not sure what they do.
Outputs surface information about the created bucket, here its name and its ARN, and the Export entries make those values available to other stacks. If you don't need them, you can delete the Outputs section from your template.
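For illustration, here is a minimal sketch of how another stack could consume the exported bucket name with Fn::ImportValue, assuming the stack above is called my-service (so the export name resolves to my-service-S3Bucket); the SSM parameter resource shown here is just a hypothetical consumer:
{
  "Resources": {
    "BucketNameParameter": {
      "Type": "AWS::SSM::Parameter",
      "Properties": {
        "Name": "/my-service/bucket-name",
        "Type": "String",
        "Value": { "Fn::ImportValue": "my-service-S3Bucket" }
      }
    }
  }
}
If nothing imports these exports, removing the Outputs section has no effect on the bucket itself.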

Alexa testing console doesn't answer me

I'm working on an Alexa skill. I tested it in AWS and it works, but when I go to the Test tab and try to ask something, the only result is " ". What could the problem be? My skill is named "test" and has an intent called "sensor" that receives a numeric {sensorid} slot. To trigger Alexa I ask "Alexa, start test sensor 1234", but the answer is the one I indicated above.
"interactionModel": {
"languageModel": {
"invocationName": "test",
"intents": [
{
"name": "AMAZON.CancelIntent",
"samples": []
},
{
"name": "AMAZON.HelpIntent",
"samples": []
},
{
"name": "AMAZON.StopIntent",
"samples": []
},
{
"name": "AMAZON.NavigateHomeIntent",
"samples": []
},
{
"name": "sensor",
"slots": [
{
"name": "sensorid",
"type": "AMAZON.NUMBER"
}
],
"samples": [
"sensore {sensorid}",
]
}
],
"types": []
}
}
}

How to get currently running containers of a service using Docker Engine API?

I'm trying to get the currently running containers of a service to visualize them like in Portainer.io.
Portainer shows the currently running machines and replicas like 5/8.
I can get the desired replica number from the Engine API's /services endpoint.
What I couldn't find is the currently running containers of a service.
The /services endpoint returns a result like:
{
  "ID": "frf43534t43543t43gt435",
  "Version": {
    "Index": 10936
  },
  "CreatedAt": "2019-12-11T14:36:03.361254384Z",
  "UpdatedAt": "2019-12-11T14:40:19.911714617Z",
  "Spec": {
    "Name": "connector-service",
    "Labels": {
      "com.docker.stack.image": "connector",
      "com.docker.stack.namespace": "conn"
    },
    "TaskTemplate": {
      "ContainerSpec": {
        "Image": "connector:latest",
        "Labels": {
          "com.docker.stack.namespace": "conn"
        },
        "Hostname": "connector-service{{.Task.Slot}}",
        "Env": [
          "CONNECT_CONFIG_STORAGE_REPLICATION_FACTOR=3",
          "CONNECT_STATUS_STORAGE_REPLICATION_FACTOR=3"
        ],
        "Privileges": {
          "CredentialSpec": null,
          "SELinuxContext": null
        },
        "Isolation": "default"
      },
      "Resources": {},
      "Placement": {},
      "Networks": [
        {
          "Target": "sfer32432fr4ewt4r3g4tr54",
          "Aliases": [
            "connector-service"
          ]
        }
      ],
      "ForceUpdate": 0,
      "Runtime": "container"
    },
    "Mode": {
      "Replicated": {
        "Replicas": 6
      }
    },
    "EndpointSpec": {
      "Mode": "vip",
      "Ports": [
        {
          "Protocol": "tcp",
          "TargetPort": 8083,
          "PublishedPort": 8083,
          "PublishMode": "ingress"
        }
      ]
    }
  },
  "Endpoint": {
    "Spec": {
      "Mode": "vip",
      "Ports": [
        {
          "Protocol": "tcp",
          "TargetPort": 8083,
          "PublishedPort": 8083,
          "PublishMode": "ingress"
        }
      ]
    },
    "Ports": [
      {
        "Protocol": "tcp",
        "TargetPort": 8083,
        "PublishedPort": 8083,
        "PublishMode": "ingress"
      }
    ],
    "VirtualIPs": [
      {
        "NetworkID": "safcedsvcsg4425r32dsf",
        "Addr": "10.0.0.55/24"
      },
      {
        "NetworkID": "sfsfe4233fr3g435432greg43",
        "Addr": "10.0.3.11/24"
      }
    ]
  }
}
I've realized that in the Engine API containers can be retrieved through two endpoints: /containers and /tasks. To get the running containers of a service, the /tasks endpoint can be used with two filters, for example: http://192.168.4.142:1777/v1.40/tasks?filters={"service":{"my-service":true},"desired-state":{"running":true}}
This endpoint returns the running containers (tasks) of a service, while the /services endpoint returns the desired number, so one can work out how many of the desired containers are running.
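Adapting that call to the service shown above, the filters parameter (which should be URL-encoded in practice) is just this JSON document, where connector-service is the Spec.Name from the /services output and the host/port are the example address from the URL above:
{
  "service": { "connector-service": true },
  "desired-state": { "running": true }
}
Counting the task objects returned and comparing that count with Spec.Mode.Replicated.Replicas from /services (6 in the sample above) gives the running/desired figure, like Portainer's 5/8.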

Create env fails when using a daemonset to create processes in Kubernetes

I want to deploy a piece of software onto nodes with a DaemonSet, but it is not a Docker app. I created a DaemonSet JSON like this:
"template": {
"metadata": {
"creationTimestamp": null,
"labels": {
"app": "uniagent"
},
"annotations": {
"scheduler.alpha.kubernetes.io/tolerations": "[{\"key\":\"beta.k8s.io/accepted-app\",\"operator\":\"Exists\", \"effect\":\"NoSchedule\"}]"
},
"enable": true
},
"spec": {
"restartPolicy": "Always",
"terminationGracePeriodSeconds": 30,
"dnsPolicy": "ClusterFirst",
"securityContext": {},
"processes": [
{
"name": "foundation",
"package": "xxxxx",
"resources": {
"limits": {
"cpu": "100m",
"memory": "1Gi"
}
},
"lifecyclePlan": {
"kind": "ProcessLifecycle",
"namespace": "engb",
"name": "app-plc"
},
"env": [
{
"name": "SECRET_USERNAME",
"valueFrom": {
"secretKeyRef": {
"name": "key-secret",
"key": "uniagentuser"
}
}
},
{
"name": "SECRET_PASSWORD",
"valueFrom": {
"secretKeyRef": {
"name": "key-secret",
"key": "uniagenthash"
}
}
}
]
},
When the app deploys successfully, the env variables do not exist at all.
What should I do to solve this problem?
Thanks
DaemonSets have to run Docker containers; you can't have non-containerized programs running as DaemonSets (https://kubernetes.io/docs/concepts/workloads/controllers/daemonset/). Kubernetes only launches containers.
Also, your manifest contains a "processes" key, and I have reason to believe that is not valid, so I doubt you deployed it successfully.
You have not pasted the full manifest, but I'm guessing the "template" key at the beginning is the spec.template key of the file.
Run kubectl explain daemonset.spec.template.spec and you'll see that there is no "processes" field.
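To illustrate the shape a valid pod template would take, here is a rough sketch of spec.template.spec that runs the agent as a container instead of a "processes" entry, reusing the resources and env from the snippet above; the image name uniagent:latest is a placeholder, and packaging the software as a container image is the real prerequisite:
"spec": {
  "restartPolicy": "Always",
  "terminationGracePeriodSeconds": 30,
  "dnsPolicy": "ClusterFirst",
  "containers": [
    {
      "name": "foundation",
      "image": "uniagent:latest",
      "resources": {
        "limits": {
          "cpu": "100m",
          "memory": "1Gi"
        }
      },
      "env": [
        {
          "name": "SECRET_USERNAME",
          "valueFrom": {
            "secretKeyRef": {
              "name": "key-secret",
              "key": "uniagentuser"
            }
          }
        },
        {
          "name": "SECRET_PASSWORD",
          "valueFrom": {
            "secretKeyRef": {
              "name": "key-secret",
              "key": "uniagenthash"
            }
          }
        }
      ]
    }
  ]
}
The "package" and "lifecyclePlan" fields have no counterpart in the core pod spec, so only the resources and env carry over.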