How do you obtain an aws-iam-token to access S3 using IRSA?

I've created an IRSA role in Terraform so that the associated service account can be used by a K8s job to access an S3 bucket, but I keep getting an AccessDenied error within the job.
I first enabled IRSA in our EKS cluster with enable_irsa = true in our eks module.
I then created a simple aws_iam_policy as:
resource "aws_iam_policy" "eks_s3_access_policy" {
name = "eks_s3_access_policy"
policy = jsonencode({
Version = "2012-10-17"
Statement = [
{
Action = [
"s3:*",
]
Effect = "Allow"
Resource = "arn:aws:s3:::*"
},
]
})
}
and an iam-assumable-role-with-oidc module:
module "iam_assumable_role_with_oidc_for_s3_access" {
source = "terraform-aws-modules/iam/aws//modules/iam-assumable-role-with-oidc"
version = "~> 3.0"
create_role = true
role_name = "eks-s3-access"
role_description = "Role to access s3 bucket"
tags = { Role = "eks_s3_access_policy" }
provider_url = replace(module.eks.cluster_oidc_issuer_url, "https://", "")
role_policy_arns = [aws_iam_policy.eks_s3_access_policy.arn]
number_of_role_policy_arns = 1
oidc_fully_qualified_subjects = ["system:serviceaccount:default:my-user"]
}
I created a K8s service account using Helm, which looks like this:
Name:                my-user
Namespace:           default
Labels:              app.kubernetes.io/managed-by=Helm
Annotations:         eks.amazonaws.com/role-arn: arn:aws:iam::111111:role/eks-s3-access
                     meta.helm.sh/release-name: XXXX
                     meta.helm.sh/release-namespace: default
Image pull secrets:  <none>
Mountable secrets:   my-user-token-kwwpq
Tokens:              my-user-token-kwwpq
Events:              <none>
Finally, jobs are created using the K8s API from a job template:
apiVersion: batch/v1
kind: Job
metadata:
  name: job
  namespace: default
spec:
  template:
    spec:
      serviceAccountName: my-user
      containers:
        - name: {{ .Chart.Name }}
          env:
            - name: AWS_ROLE_ARN
              value: arn:aws:iam::746181457053:role/eks-s3-access
            - name: AWS_WEB_IDENTITY_TOKEN_FILE
              value: /var/run/secrets/eks.amazonaws.com/serviceaccount/token
          volumeMounts:
            - mountPath: /var/run/secrets/eks.amazonaws.com/serviceaccount
              name: aws-iam-token
              readOnly: true
      volumes:
        - name: aws-iam-token
          projected:
            defaultMode: 420
            sources:
              - serviceAccountToken:
                  audience: sts.amazonaws.com
                  expirationSeconds: 86400
                  path: token
When the job attempts to fetch the credentials, however, the token file is not there:
2021-08-03 18:02:41 Refreshing temporary credentials failed during mandatory refresh period.
Traceback (most recent call last):
  File "/usr/local/lib/python3.7/site-packages/aiobotocore/credentials.py", line 291, in _protected_refresh
    metadata = await self._refresh_using()
  File "/usr/local/lib/python3.7/site-packages/aiobotocore/credentials.py", line 345, in fetch_credentials
    return await self._get_cached_credentials()
  File "/usr/local/lib/python3.7/site-packages/aiobotocore/credentials.py", line 355, in _get_cached_credentials
    response = await self._get_credentials()
  File "/usr/local/lib/python3.7/site-packages/aiobotocore/credentials.py", line 410, in _get_credentials
    kwargs = self._assume_role_kwargs()
  File "/usr/local/lib/python3.7/site-packages/aiobotocore/credentials.py", line 420, in _assume_role_kwargs
    identity_token = self._web_identity_token_loader()
  File "/usr/local/lib/python3.7/site-packages/botocore/utils.py", line 2365, in __call__
    with self._open(self._web_identity_token_path) as token_file:
FileNotFoundError: [Errno 2] No such file or directory: '/var/run/secrets/eks.amazonaws.com/serviceaccount/token'
From what is described in https://aws.amazon.com/blogs/opensource/introducing-fine-grained-iam-roles-service-accounts/ a webhook normally injects these credentials when the pod is created. However, since we're creating the new K8s job on demand from within the K8s cluster, I suspect that the webhook is not creating any such credentials.
How can I request the correct credentials to be created from within a K8s cluster? Is there a way to instantiate the webhook from within the cluster?

There are a couple of things that could cause this to fail.
Check all settings for the IRSA role. In the trust relationship, check that the namespace and the service account name are correct; the role can only be assumed if these settings match.
While the pod is running, open a shell into it and check the contents of the AWS_* environment variables. Check that AWS_ROLE_ARN points to the correct role, and that the file AWS_WEB_IDENTITY_TOKEN_FILE points to exists and is readable. A simple cat on the file will show whether it is readable.
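As a quick in-pod check, here is a minimal Python sketch (assuming boto3 is available in the job image) that prints the injected variables, verifies the token file, and asks STS which identity the SDK actually resolves:

import os
import boto3

# Print the IRSA-related environment variables the webhook should inject.
for var in ("AWS_ROLE_ARN", "AWS_WEB_IDENTITY_TOKEN_FILE"):
    print(var, "=", os.environ.get(var))

# Confirm the projected token file exists before the SDK tries to read it.
token_path = os.environ.get("AWS_WEB_IDENTITY_TOKEN_FILE", "")
print("token file exists:", os.path.isfile(token_path))

# If the web identity flow works, this prints the assumed IRSA role ARN,
# not the node's instance role.
print(boto3.client("sts").get_caller_identity()["Arn"])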
If you are running your pod as non-root (which is recommended for security reasons), make sure the user running the pod has access to the token file. If not, adjust the securityContext for the pod; the fsGroup setting may help here. https://kubernetes.io/docs/reference/kubernetes-api/workload-resources/pod-v1/#security-context
Make sure the SDK your pod is using supports IRSA. If you are using an older SDK, IRSA may not be supported. Check the IRSA documentation for the supported SDK versions. https://docs.aws.amazon.com/eks/latest/userguide/iam-roles-for-service-accounts-minimum-sdk.html
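To see which SDK versions the job image actually ships, a small Python check you can run in the container (compare the output against the minimum versions listed in the AWS documentation above):

import boto3
import botocore

# Compare these against the minimums in the EKS "IAM roles for service
# accounts minimum SDK versions" documentation linked above.
print("boto3:", boto3.__version__)
print("botocore:", botocore.__version__)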

Related

How do I add users to Argo CD using terraform resource?

I've deployed ArgoCD using the following Terraform, which uses the argoproj helm chart.
resource "helm_release" "argo_cd" {
chart = "argo-cd"
repository = "https://argoproj.github.io/argo-helm"
name = "argocd"
namespace = var.namespace
}
I knew when I created this that I needed to do two things. I needed to:
Add API access to the default admin user
Add new users
It appears that the solution to both #1 and #2 is to edit the configmap deployed with the helm chart. I've seen the question here about this very thing, and I've tried the approaches below.
I did try adding a kubernetes_config_map like the following to add a new local user (I didn't see how to add apiKey access to the default admin account yet):
resource "kubernetes_config_map" "local_users" {
metadata {
#Also tried generate_name here
name = "argocd-cm"
namespace = "argocd"
labels = {
"app.kubernetes.io/name" = "argocd-cm"
"app.kubernetes.io/part-of" = "argocd"
}
}
data = {
"accounts.new-user" = "apiKey, login"
"accounts.new-user.enabled" = "true"
}
}
When I deployed this, it said the configmap named argocd-cm already exists, which makes sense. So, as seen in the question posted above, I tried switching to generate_name, which went through, but I don't see the new user. And I can't list accounts with argocd account list, because my default admin user doesn't have API access.
So, I need to know how to edit the existing configmap in Terraform in order to add new users (assuming that is in fact the best method), and how to grant the default admin user permission to use the API. Any help is hugely appreciated.
Also from the question above, I did try using the set values, but I'm fairly certain I'm formatting them incorrectly, or this isn't how you do it (shown below):
resource "helm_release" "argo_cd" {
chart = "argo-cd"
repository = "https://argoproj.github.io/argo-helm"
name = "argocd"
namespace = var.namespace
set {
name = "accounts.new-user"
value = "apiKey, login"
}
}
Which just gives me a parsing error: failed parsing key "accounts.new-user" with value apiKey, login, key " login" has no value
The current configmap named argocd-cm looks like this:
kubectl get configmap argocd-cm -n argocd -o yaml

apiVersion: v1
data:
  application.instanceLabelKey: argocd.argoproj.io/instance
  exec.enabled: "false"
  server.rbac.log.enforce.enable: "false"
kind: ConfigMap
metadata:
  annotations:
    meta.helm.sh/release-name: argocd
    meta.helm.sh/release-namespace: argocd
  creationTimestamp: "2022-07-02T14:06:49Z"
  labels:
    app.kubernetes.io/component: server
    app.kubernetes.io/instance: argocd
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/name: argocd-cm
    app.kubernetes.io/part-of: argocd
    helm.sh/chart: argo-cd-4.9.7
  name: argocd-cm
  namespace: argocd
  resourceVersion: "4382002"
  uid: da102b9c-45d9-41j8-9c9a-f8cbca79b003

serverless-s3-local writing to real S3 bucket

I am using Serverless framework with the serverless-s3-local plugin to test my code during development. However, despite being in offline mode, the real S3 bucket is being written to. How can I alter my configuration to use a local fake s3 bucket when in offline mode?
Relevant serverless.yml sections:
plugins:
  - serverless-stack-output
  - serverless-plugin-include-dependencies
  - serverless-layers
  - serverless-deployment-bucket
  - serverless-s3-local
  - serverless-offline

custom:
  # ...
  s3:
    bucketName: test-s3-buck
    host: localhost
  serverless-offline:
    ignoreJWTSignature: true
    httpPort: 4000
    noAuth: true
    directory: /tmp

resources:
  Resources:
    # ...
    Bucket:
      Type: AWS::S3::Bucket
      Properties:
        BucketName: ${self:custom.s3.bucketName}
Endpoint Calling S3:
import boto3

def post(event, context):
    s3_path = "/test.txt"
    body = "test"
    encoded_string = body.encode("utf-8")

    s3 = boto3.resource("s3")
    bucket_name = "test-s3-buck"
    s3.Bucket(bucket_name).put_object(Key=s3_path, Body=encoded_string)

    response = {
        "statusCode": 200,
        "body": "Created."
    }
    return response
Launching Serverless Offline:
serverless offline start
In the readme file of serverless-s3-local we have:
const S3 = new AWS.S3({
  s3ForcePathStyle: true,
  accessKeyId: 'S3RVER', // This specific key is required when working offline
  secretAccessKey: 'S3RVER',
  endpoint: new AWS.Endpoint('http://localhost:4569'),
});
You can achieve the same with boto3:
import boto3

client = boto3.client(
    's3',
    aws_access_key_id='S3RVER',
    aws_secret_access_key='S3RVER',
    # Point the client at the local S3 instance, mirroring the endpoint
    # used in the JS example above (serverless-s3-local's port 4569).
    endpoint_url='http://localhost:4569',
)
Which means, when you run serverless offline start, you need to set the AWS access key ID to S3RVER and the AWS secret access key to S3RVER; otherwise, the real bucket will be used.
The readme also has instructions for setting up an s3local AWS profile: https://github.com/ar90n/serverless-s3-local#triggering-aws-events-offline
Another way to achieve this is to run your command with environment variables:
AWS_ACCESS_KEY_ID=S3RVER AWS_SECRET_ACCESS_KEY=S3RVER serverless offline start
That way, the aws-sdk inside your code will read the correct values for offline mode.
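If you want a single handler that works both offline and when deployed, one option is to switch the client configuration based on the environment. This is only a sketch; it assumes serverless-offline exposes the IS_OFFLINE environment variable and that serverless-s3-local is listening on its default port 4569:

import os
import boto3

def get_s3_resource():
    # When running under serverless-offline, talk to the local
    # serverless-s3-local instance instead of the real AWS endpoint.
    if os.environ.get("IS_OFFLINE"):
        return boto3.resource(
            "s3",
            endpoint_url="http://localhost:4569",
            aws_access_key_id="S3RVER",
            aws_secret_access_key="S3RVER",
        )
    return boto3.resource("s3")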

Authenticate AppSync queries console with Cognito User Pools

I am trying to authenticate the Queries playground in the AWS AppSync console. I have created a User Pool and linked it to the AppSync API, and I have also created an App Client in the Cognito User Pool (deployed using CloudFormation). It appears under "Select the authorization provider to use for running queries on this page:" in the console.
When I run a test query I get:
{
  "errors": [
    {
      "errorType": "UnauthorizedException",
      "message": "Unable to parse JWT token."
    }
  ]
}
This is what I would expect. There is an option to Login with User Pools. The issue is that I can't select any Client ID, and when I choose to insert the Client ID manually, whatever I enter gives me Invalid UserPoolId format. I am trying to copy the Pool ID from the User Pool General settings (format eu-west-2_xxxxxxxxx) but no joy. Btw, I am not using Amplify and I have not configured any Identity Pools.
EDIT:
Here is the CloudFormation GraphQLApi definition:
MyApi:
  Type: AWS::AppSync::GraphQLApi
  Properties:
    Name: !Sub "${AWS::StackName}-api"
    AuthenticationType: AMAZON_COGNITO_USER_POOLS
    UserPoolConfig:
      UserPoolId: !Ref UserPoolClient
      AwsRegion: !Sub ${AWS::Region}
      DefaultAction: ALLOW
To set up the stack using CloudFormation I have followed these 2 examples:
https://adrianhall.github.io/cloud/2018/04/17/deploy-an-aws-appsync-graphql-api-with-cloudformation/
https://gist.github.com/adrianhall/f330a10451f05a529680f32978dddb64
Turns out they both (same author) have an issue in the section where the AWS::AppSync::GraphQLApi resource is defined. This:
MyApi:
  Type: AWS::AppSync::GraphQLApi
  Properties:
    Name: !Sub "${AWS::StackName}-api"
    AuthenticationType: AMAZON_COGNITO_USER_POOLS
    UserPoolConfig:
      UserPoolId: !Ref UserPoolClient
      AwsRegion: !Sub ${AWS::Region}
      DefaultAction: ALLOW
Should be:
MyApi:
  Type: AWS::AppSync::GraphQLApi
  Properties:
    Name: !Sub "${AWS::StackName}-api"
    AuthenticationType: AMAZON_COGNITO_USER_POOLS
    UserPoolConfig:
      UserPoolId: !Ref UserPool
      AwsRegion: !Sub ${AWS::Region}
      DefaultAction: ALLOW
Thank you @Myz for pointing me back to review the whole CF yaml file.

How to get SNS topic ARN inside lambda handler and set permissions to write to it?

I have two lambda functions defined in serverless.yml: graphql and convertTextToSpeech. The former (in one of the GraphQL endpoints) should write to an SNS topic to trigger the latter. Here is my serverless.yml file:
service: hello-world

provider:
  name: aws
  runtime: nodejs6.10

plugins:
  - serverless-offline

functions:
  graphql:
    handler: dist/app.handler
    events:
      - http:
          path: graphql
          method: post
          cors: true
  convertTextToSpeach:
    handler: dist/tasks/convertTextToSpeach.handler
    events:
      - sns:
          topicName: convertTextToSpeach
          displayName: Convert text to speach
And the GraphQL endpoint writing to SNS:
// ...
const sns = new AWS.SNS()
const params = {
  Message: 'Test',
  Subject: 'Test SNS from lambda',
  TopicArn: 'arn:aws:sns:us-east-1:101972216059:convertTextToSpeach'
}
await sns.publish(params).promise()
// ...
There are two issues here:
The topic ARN (which is required to publish to a topic) is hardcoded. How can I get this in my code dynamically? Is it provided somehow by the Serverless framework?
Even when the topic ARN is hardcoded, the lambda function does not have permission to write to that topic. How can I define such permissions in the serverless.yml file?
1) You can resolve the topic dynamically.
This can be done through CloudFormation Intrinsic Functions, which are available within the serverless template (see the added environment section).
functions:
  graphql:
    handler: handler.hello
    environment:
      topicARN:
        Ref: SNSTopicConvertTextToSpeach
    events:
      - http:
          path: graphql
          method: post
          cors: true
  convertTextToSpeach:
    handler: handler.hello
    events:
      - sns:
          topicName: convertTextToSpeach
          displayName: Convert text to speach
In this case, the actual topic reference name (generated by the serverless framework) is SNSTopicConvertTextToSpeach. The generation of those names is explained in the serverless docs.
2) Now that the ARN of the topic is mapped into an environment variable, you can access it within the GraphQL lambda through the process variable (process.env.topicARN).
// ...
const sns = new AWS.SNS()
const params = {
  Message: 'Test',
  Subject: 'Test SNS from lambda',
  TopicArn: process.env.topicARN
}
await sns.publish(params).promise()
// ...

Spring Cloud Config (Vault backend) terminating too early

I am using Spring Cloud Config Server to serve configuration for my client apps. To facilitate secrets configuration I am using HashiCorp Vault as a back end. For the remainder of the configuration I am using a Git repo, so I have configured the config server in composite mode. See my config server bootstrap.yml below:
server:
  port: 8888

spring:
  profiles:
    active: local, git, vault
  application:
    name: my-domain-configuration-server
  cloud:
    config:
      server:
        git:
          uri: https://mygit/my-domain-configuration
          order: 1
        vault:
          order: 2
          host: vault.mydomain.com
          port: 8200
          scheme: https
          backend: mymount/generic
This is all working as expected. However, the token I am using is secured with a Vault auth policy. See below:
{
  "rules": "path "mymount/generic/myapp-app,local" {
    policy = "read"
  }
  path "mymount/generic/myapp-app,local/*" {
    policy = "read"
  }
  path "mymount/generic/myapp-app" {
    policy = "read"
  }
  path "mymount/generic/myapp-app/*" {
    policy = "read"
  }
  path "mymount/generic/application,local" {
    policy = "read"
  }
  path "mymount/generic/application,local/*" {
    policy = "read"
  }
  path "mymount/generic/application" {
    policy = "read"
  }
  path "mymount/generic/application/*" {
    policy = "read"
  }"
}
My issue is that I am not storing secrets in all these scopes. I need to specify all these paths just so I can authorize the token to read one secret from mymount/generic/myapp-app,local. If I do not authorize all the other paths the VaultEnvironmentRepository.read() method returns a 403 HTTP status code (Forbidden) and throws a VaultException. This results in complete failure to retrieve any configuration for the app, including GIT based configuration. This is very limiting as client apps may have multiple Spring profiles that have nothing to do with retrieving configuration items. The issue is that config server will attempt to retrieve configuration for all the active profiles provided by the client.
Is there a way to enable fault tolerance or lenience on the config server, so that VaultEnvironmentRepository does not abort and returns any configuration that it is actually authorized to return?
Do you absolutely need the local profile? Would you not be able to get by with just the 'vault' and 'git' profiles in Config Server and use the 'default' profile in each Spring Boot application?
If you use the above suggestion then the only two paths you'd need in your rules (.hcl) file are:
path "mymount/generic/application" {
capabilities = ["read", "list"]
}
and
path "mymount/generic/myapp-app" {
capabilities = ["read", "list"]
}
This assumes that you're writing configuration to
vault write mymount/generic/myapp-app
and not
vault write mymount/generic/myapp-app,local
or similar.