Fargate environment variable redis.yaml - aws-fargate

I have a microservice and I need to pass in a file redis.yaml to configure ElastiCache for Redis.
Assume I have a file called redis.yaml with contents:
clusterServersConfig:
  idleConnectionTimeout: 10000
  pingTimeout: 1000
  connectTimeout: 10000
  timeout: 60000
  retryAttempts: 3
  retryInterval: 60000
And in my application.properties I use:
redis.config.location=file:/opt/usr/conf/redis.yaml
In Kubernetes, I can just create a secret with --from-file redis.yaml and the application runs properly.
I do not know how to do the same with AWS Fargate. I believe it could be done with AWS SSM but any help/steps on how to do it would be appreciated.

For externalized configuration, Fargate supports environment variables, which can be passed in the task definition:
"environment": [
{ "name": "env_name1", "value": "value1" },
{ "name": "env_name2", "value": "value2" }
]
If it's sensitive information, store it in the AWS SSM Parameter Store (optionally encrypted with KMS) and reference the parameter in the task definition:
{
  "containerDefinitions": [{
    "secrets": [{
      "name": "environment_variable_name",
      "valueFrom": "arn:aws:ssm:region:aws_account_id:parameter/parameter_name"
    }]
  }]
}
In your case, you can convert your YAML to JSON, store it in the Parameter Store, and reference it in the task definition.
https://docs.aws.amazon.com/AmazonECS/latest/developerguide/specifying-sensitive-data.html
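As a concrete sketch of that approach (the parameter name /myapp/redis-config and the entrypoint snippet are assumptions, not part of the original answer): store the raw file contents as a parameter, inject it as a secret, and have the container write it back to the path your application.properties expects, since Fargate delivers secrets as environment variables rather than files.
# Store the file contents as a KMS-encrypted parameter
# (standard parameters hold up to 4 KB, plenty for this config)
aws ssm put-parameter \
  --name "/myapp/redis-config" \
  --type SecureString \
  --value file://redis.yaml
Task definition excerpt:
"secrets": [{
  "name": "REDIS_CONFIG",
  "valueFrom": "arn:aws:ssm:region:aws_account_id:parameter/myapp/redis-config"
}]
Container entrypoint, before the app starts:
# Recreate the file at the location application.properties points to
echo "$REDIS_CONFIG" > /opt/usr/conf/redis.yaml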

Related

Define a condition in cloudformation template with alarms

How do I define/declare a condition to create an alarm only in prod?
Would using Condition: IsProd work to create the alarm in prod?
Would the snippet below work, and how should the condition itself be defined?
LambdaInvocationsAlarm:
  Condition: IsProd
  Type: AWS::CloudWatch::Alarm
  Properties:
    AlarmDescription: Lambda invocations
    AlarmName: LambdaInvocationsAlarm
    ComparisonOperator: LessThanLowerOrGreaterThanUpperThreshold
    EvaluationPeriods: 1
    Metrics:
      - Expression: ANOMALY_DETECTION_BAND(m1, 2)
        Id: ad1
      - Id: m1
        MetricStat:
          Metric:
            MetricName: Invocations
            Namespace: AWS/Lambda
          Period: !!int 86400
          Stat: Sum
    ThresholdMetricId: ad1
    TreatMissingData: breaching
As @Marcin said, you should explain more precisely what you have tried and what is blocking you.
But yes, what you suggest could work: you can define a Condition named isProd and use it to create (or not create) resources. Regarding the condition itself: AWS does not know what a production stage is in your environment, so you need to specify that. Does your production stage match an account? Does it match a region? Something else?
As an example, if we assume that your production stage matches a specific AWS account, you could define the condition as below (it's JSON, feel free to convert to YAML):
{
  "Parameters": {
    "ProdAccountParameter": {
      "Type": "String",
      "Description": "Enter the production account identifier."
    }
  },
  "Conditions": {
    "isProd": {
      "Fn::Equals": [
        { "Ref": "ProdAccountParameter" },
        { "Ref": "AWS::AccountId" }
      ]
    }
  },
  ...
}
(Then, when deploying the template, you'll need to provide your AWS production account ID as a parameter.)
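Converted to YAML, the same parameter and condition would look roughly like this (condition names are case-sensitive, so use the IsProd casing your alarm references):
Parameters:
  ProdAccountParameter:
    Type: String
    Description: Enter the production account identifier.
Conditions:
  IsProd: !Equals [!Ref ProdAccountParameter, !Ref "AWS::AccountId"]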

Mapping EMR steps to YARN applications

I am aggregating EMR YARN application logs to S3 using the YARN configuration below:
[
  {
    "Classification": "yarn-site",
    "Properties": {
      "yarn.log-aggregation-enable": "true",
      "yarn.log-aggregation.retain-seconds": "-1",
      "yarn.nodemanager.remote-app-log-dir": "s3://mybucket/logs"
    }
  }
]
The logs are grouped by YARN application ID in S3:
s3://mybucket/logs/application_id_001/
s3://mybucket/logs/application_id_002/
s3://mybucket/logs/application_id_003/
I want to map the EMR step ID to YARN application ID, so that given a step ID I will be able to fetch its logs.
The reason I need this is that I am using Apache Airflow for orchestration and would like to fetch the logs and show them in Airflow. My Airflow DAG looks like this:
create-cluster
-> add-step-1 -> watch-step-1
-> add-step-2 -> watch-step-2
-> add-step-3 -> watch-step-3
At the end of every watch-step-n task I would like to fetch the logs for that step from S3 and print them. Since none of the tasks in the DAG are aware of the YARN application ID, I am trying to find a way to get the application ID for a step ID.
EDIT
I couldn't find a way to map an EMR step ID to a YARN application ID. However, I was able to group the logs by cluster ID. For example:
s3://mybucket/logs/cluster-id-0/application_id_001/
s3://mybucket/logs/cluster-id-0/application_id_002/
s3://mybucket/logs/cluster-id-1/application_id_001/
I found a way to get the cluster ID from within the EMR nodes:
cat /mnt/var/lib/info/job-flow.json | jq -r '.jobFlowId'
I exported the cluster ID as an environment variable named JOB_FLOW_ID in the yarn-env configuration and passed it to YARN as a system property (job_flow_id) via YARN_OPTS. Then I set yarn.nodemanager.remote-app-log-dir-suffix to ${job_flow_id} in the yarn-site configuration.
yarn-env configuration:
{
  "Classification": "yarn-env",
  "Properties": {},
  "Configurations": [
    {
      "Classification": "export",
      "Properties": {
        "JOB_FLOW_ID": "\"$(cat /mnt/var/lib/info/job-flow.json | jq -r '.jobFlowId')\"",
        "YARN_OPTS": "\"$YARN_OPTS -Djob_flow_id=$JOB_FLOW_ID\""
      },
      "Configurations": []
    }
  ]
}
yarn-site configuration:
{
  "Classification": "yarn-site",
  "Properties": {
    "yarn.log-aggregation.retain-seconds": "-1",
    "yarn.log-aggregation-enable": "true",
    "yarn.nodemanager.remote-app-log-dir": "s3://mybucket/logs",
    "yarn.nodemanager.remote-app-log-dir-suffix": "${job_flow_id}"
  }
}
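With those two settings, the aggregated logs land under the cluster ID, so any watch-step-n task that knows which cluster the DAG created can pull the logs straight from S3. A minimal sketch (bucket and cluster ID are placeholders):
# List the per-application log prefixes for this cluster
aws s3 ls "s3://mybucket/logs/j-ABC123EXAMPLE/"
# Download them all, e.g. to print in the Airflow task log
aws s3 cp --recursive "s3://mybucket/logs/j-ABC123EXAMPLE/" ./logs/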

Substitute parts of a typed array in ASP.NET core appsettings.json from secrets/environment variables?

We have an ASP.NET Core web app with this appsettings.json:
{
"Subscriptions": [
{
"Name": "Production",
"PublishSettings": "<PublishData>SECRET</PublishData>",
"Environments": [
{
"Name": "Prod",
"DeploymentServiceNames": [
"api1",
"api2",
"api3"
]
}
]
},
{
"Name": "Test",
"PublishSettings": "<PublishData>SECRET</PublishData>",
"Environments": [
{
"Name": "Test1",
"DeploymentServiceNames": [
"api1",
"api2"
]
},
{
"Name": "Test2",
"DeploymentServiceNames": [
"api1",
"api2"
]
}
]
}
]
}
The PublishSettings values are secret, so I want them in my local user-secrets file, and in environment variables for my deployments. But because Subscriptions is an array, I'm not sure how. I don't particularly want to swap in the entire Subscriptions section. Is there a way to swap in a single property for each item in such an array, perhaps by defining a key property on the strongly typed subscription model?
When you load configuration in .NET Core, it's represented under the hood as a set of key-value pairs (both keys and values are strings) supplied by the added configuration providers.
For example, appsettings.json will be represented by JsonConfigurationProvider as the following list of settings:
{Subscriptions:0:Environments:0:DeploymentServiceNames:0, api1}
{Subscriptions:0:Environments:0:DeploymentServiceNames:1, api2}
{Subscriptions:0:Environments:0:DeploymentServiceNames:2, api3}
{Subscriptions:0:Environments:0:Name, Prod}
{Subscriptions:0:Name, Production}
{Subscriptions:0:PublishSettings, <PublishData>SECRET</PublishData>}
{Subscriptions:1:Environments:0:DeploymentServiceNames:0, api1}
{Subscriptions:1:Environments:0:DeploymentServiceNames:1, api2}
{Subscriptions:1:Environments:0:Name, Test1}
{Subscriptions:1:Environments:1:DeploymentServiceNames:0, api1}
{Subscriptions:1:Environments:1:DeploymentServiceNames:1, api2}
{Subscriptions:1:Environments:1:Name, Test2}
{Subscriptions:1:Name, Test}
{Subscriptions:1:PublishSettings, <PublishData>SECRET</PublishData>}
As you can see, the JSON structure was flattened, and keys are built by joining nested section names with a colon; array elements get their index as a name.
If you add another configuration source, e.g. environment variables or another secrets JSON file, any settings with the same keys will overwrite the earlier values.
So if you want to add or overwrite PublishSettings, you could either add another JSON file as a configuration source:
{
  "Subscriptions": [
    { "PublishSettings": "<PublishData>SECRET</PublishData>" },
    { "PublishSettings": "<PublishData>SECRET</PublishData>" }
  ]
}
Or add it as environment variables with the following keys:
Subscriptions:0:PublishSettings
Subscriptions:1:PublishSettings
Such an override (or addition) is transparent to the .NET Core configuration binder: the settings POCO will contain the value of PublishSettings from the last configuration source that provides it.
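For example (values are placeholders), you can set the per-index keys locally with the user-secrets CLI, and for deployments use environment variables, replacing the colon with a double underscore (__), which the configuration system accepts on platforms where : is not allowed in variable names:
# Local development: user secrets
dotnet user-secrets set "Subscriptions:0:PublishSettings" "<PublishData>SECRET</PublishData>"
dotnet user-secrets set "Subscriptions:1:PublishSettings" "<PublishData>SECRET</PublishData>"
# Deployments: environment variables, __ as separator
export Subscriptions__0__PublishSettings='<PublishData>SECRET</PublishData>'
export Subscriptions__1__PublishSettings='<PublishData>SECRET</PublishData>'
Note that the match is purely positional: index 0 in the overriding source must correspond to index 0 in appsettings.json, so there is no built-in way to key array items by a property like Name.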

List of all environment variables for a Pod

I have a web app on OpenShift v3 (All-in-One), using the WildFly builder image. In addition, I created a service named xyz to point to an external host and IP, something like this:
"kind": "Service",
"apiVersion": "v1",
"metadata": { "name": "xyz" },
"spec": {
"ports": [
{ "port": 61616,
"protocol": "TCP",
"targetPort": 61616
}
],
"selector": {}
}
I also have an endpoint, pointing externally, but that is not relevant for this question.
When deployed, my program can access an environment variable named XYZ_PORT with the value tcp://172.30.192.186:61616.
However, I cannot figure out how to see the values of all such variables, either via the web console or using the CLI. Using the web console, I cannot see them being injected into the YAML.
I tried some of the oc env options, but none seem to list what I want.
Let's say you are deploying kitchensink; then the below CLI should list all the environment variables:
oc env bc/kitchensink --list
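Note that Docker-link-style variables such as XYZ_PORT are injected by Kubernetes when the pod starts, not defined on your objects, so --list on a build or deployment config won't show them. A quick way to see the complete environment (the pod name here is a placeholder) is to run env inside the running pod:
# Variables defined on the deployment config
oc env dc/kitchensink --list
# Full environment of a running pod, including service-injected
# variables such as XYZ_PORT
oc exec kitchensink-1-abcde -- env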

AWS data pipeline activity with multiple inputs

As part of an Amazon AWS Data Pipeline, I have a Hive activity using two unstaged S3 data nodes as input. What I want is to be able to set two script variables on the activity, each pointing to an input data node, but I can't get the syntax right. With a single input, I could write the following and it would work just fine:
INPUT_FOO=#{input.directoryPath}
When I add the second input, I run into the problem of how to reference them, since they are now an array of inputs, as you can see in the pipeline definition below. Essentially, I want to achieve the following but can't figure out the correct syntax:
INPUT_FOO=#{input[1].directoryPath}
INPUT_BAR=#{input[2].directoryPath}
Here's the activity portion of the pipeline definition:
{
  "id": "ActivityId_7u1sR",
  "input": [
    { "ref": "DataNodeId_iYnxf" },
    { "ref": "DataNodeId_162Ka" }
  ],
  "schedule": { "ref": "DefaultSchedule" },
  "scriptUri": "#{myS3ScriptLocation}calculate-results.q",
  "name": "Perform Calculations",
  "runsOn": { "ref": "EmrClusterId_jHeiV" },
  "scriptVariable": [
    "INPUT_SOURCE1=#{input[1].directoryPath}",
    "OUTPUT=#{output.directoryPath}Results/",
    "INPUT_SOURCE2=#{input[2].directoryPath}"
  ],
  "output": { "ref": "DataNodeId_2jY6v" },
  "type": "HiveActivity",
  "stage": "false"
}
I plan to keep the tables unstaged and take care of table creation in the Hive script, so that it's easier to run each Hive activity in isolation as well as in the pipeline itself.
Here's the error I see when using array syntax:
Unable to resolve input[1].directoryPath for object ActivityId_7u1sR'
As it stands now, this scenario is not supported, but a feature request was added to support it in the future.
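One possible workaround in the meantime (my own sketch, not from the original answer) is to bypass the input references and pass each path as a custom pipeline parameter; Data Pipeline requires user-defined parameter IDs to start with my, and the names below are invented:
"scriptVariable": [
  "INPUT_FOO=#{myInputFooDir}",
  "INPUT_BAR=#{myInputBarDir}",
  "OUTPUT=#{output.directoryPath}Results/"
]
You would then supply myInputFooDir and myInputBarDir with the same S3 paths as the two data nodes when activating the pipeline. This duplicates the paths, but it keeps each Hive activity runnable in isolation, as the question intends.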