Assign roles to EKS cluster in manifest file? - amazon-eks

I'm new to Kubernetes, and am playing with eksctl to create an EKS cluster in AWS. Here's my simple manifest file
kind: ClusterConfig
apiVersion: eksctl.io/v1alpha5
metadata:
  name: sandbox
  region: us-east-1
  version: "1.18"
managedNodeGroups:
- name: ng-sandbox
  instanceType: r5a.xlarge
  privateNetworking: true
  desiredCapacity: 2
  minSize: 1
  maxSize: 4
  ssh:
    allow: true
    publicKeyName: my-ssh-key
fargateProfiles:
- name: fp-default
  selectors:
    # All workloads in the "default" Kubernetes namespace will be
    # scheduled onto Fargate:
    - namespace: default
    # All workloads in the "kube-system" Kubernetes namespace will be
    # scheduled onto Fargate:
    - namespace: kube-system
- name: fp-sandbox
  selectors:
    # All workloads in the "sandbox" Kubernetes namespace matching the
    # following label selectors will be scheduled onto Fargate:
    - namespace: sandbox
      labels:
        env: sandbox
        checks: passed
I created 2 roles, EKSClusterRole for cluster management, and EKSWorkerRole for the worker nodes. Where do I use them in this file? I'm looking at the eksctl Config file schema page and it's not clear to me where in the manifest file to use them.

As you mentioned, it's in the managedNodeGroups docs
managedNodeGroups:
- ...
  iam:
    instanceRoleARN: my-role-arn
    # or
    # instanceRoleName: my-role-name
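Putting it together with your config, a minimal sketch covering both roles might look like this (the account ID and role ARNs are placeholders; if I read the eksctl schema right, the cluster-wide role goes under the top-level iam.serviceRoleARN field):
kind: ClusterConfig
apiVersion: eksctl.io/v1alpha5
metadata:
  name: sandbox
  region: us-east-1
  version: "1.18"
iam:
  # role used for managing the cluster itself (your EKSClusterRole)
  serviceRoleARN: arn:aws:iam::111122223333:role/EKSClusterRole
managedNodeGroups:
- name: ng-sandbox
  instanceType: r5a.xlarge
  iam:
    # role attached to the worker node instances (your EKSWorkerRole)
    instanceRoleARN: arn:aws:iam::111122223333:role/EKSWorkerRole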
You should also read about
Creating a cluster with Fargate support using a config file
AWS Fargate


How can I configure the AdmissionConfiguration > PodSecurity > PodSecurityConfiguration in an EKS cluster?

If I understand right from Apply Pod Security Standards at the Cluster Level, in order to have a PSS (Pod Security Standard) as the default for the whole cluster I need to create an AdmissionConfiguration in a file that the API server consumes during cluster creation.
I don't see any way to configure or provide the AdmissionConfiguration at CreateCluster, and I'm not sure how to provide this AdmissionConfiguration on a managed EKS node.
From the tutorials that use KinD or minikube it seems that the AdmissionConfiguration must be in a file that is referenced in the cluster-config.yaml, but if I'm not mistaken the EKS API server is managed and does not let you change or even see this file.
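(For reference, the kind of file those tutorials pass to the API server via --admission-control-config-file looks roughly like the sketch below, following the upstream Kubernetes docs; the point of this question is that EKS gives you no way to supply it.)
apiVersion: apiserver.config.k8s.io/v1
kind: AdmissionConfiguration
plugins:
- name: PodSecurity
  configuration:
    apiVersion: pod-security.admission.config.k8s.io/v1beta1   # v1 on newer clusters
    kind: PodSecurityConfiguration
    defaults:
      enforce: "restricted"
      enforce-version: "latest"
    exemptions:
      usernames: []
      runtimeClasses: []
      namespaces: []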
The GitHub issue aws/container-roadmap "Allow Access to AdmissionConfiguration" seems to suggest that there is currently no way to provide an AdmissionConfiguration at creation time, but on the other hand aws-eks-best-practices says "These exemptions are applied statically in the PSA admission controller configuration as part of the API server configuration".
So, is there a way to provide a PodSecurityConfiguration for the whole cluster in EKS, or am I forced to just use per-namespace labels?
See also Enforce Pod Security Standards by Configuring the Built-in Admission Controller and EKS Best Practices: PSS and PSA.
I don't think there is any way currently in EKS to provide configuration for the built-in PSA controller (Pod Security Admission controller).
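(If the per-namespace route turns out to be enough for you, it is just the standard PSA labels on each namespace; a minimal sketch with a placeholder namespace name:)
apiVersion: v1
kind: Namespace
metadata:
  name: my-namespace   # placeholder
  labels:
    pod-security.kubernetes.io/enforce: restricted
    pod-security.kubernetes.io/enforce-version: latest
    pod-security.kubernetes.io/warn: restricted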
But if you want to implement a cluster-wide default for PSS (Pod Security Standards) you can do that by installing the official pod-security-webhook as a Dynamic Admission Controller in EKS.
git clone https://github.com/kubernetes/pod-security-admission
cd pod-security-admission/webhook
make certs
kubectl apply -k .
The default podsecurityconfiguration.yaml in pod-security-admission/webhook/manifests/020-configmap.yaml allows EVERYTHING so you should edit it and write something like
apiVersion: v1
kind: ConfigMap
metadata:
  name: pod-security-webhook
  namespace: pod-security-webhook
data:
  podsecurityconfiguration.yaml: |
    apiVersion: pod-security.admission.config.k8s.io/v1beta1
    kind: PodSecurityConfiguration
    defaults:
      enforce: "restricted"
      enforce-version: "latest"
      audit: "restricted"
      audit-version: "latest"
      warn: "restricted"
      warn-version: "latest"
    exemptions:
      # Array of authenticated usernames to exempt.
      usernames: []
      # Array of runtime class names to exempt.
      runtimeClasses: []
      # Array of namespaces to exempt.
      namespaces: ["policy-test2"]
then
kubectl apply -k .
kubectl -n pod-security-webhook rollout restart deployment/pod-security-webhook # otherwise the pods won't reread the configuration changes
After those changes you can verify that the default forbids privileged pods with:
kubectl --context aihub-eks-terraform create ns policy-test1
kubectl --context aihub-eks-terraform -n policy-test1 run --image=ecerulm/ubuntu-tools:latest --rm -ti rubelagu-$RANDOM --privileged
Error from server (Forbidden): admission webhook "pod-security-webhook.kubernetes.io" denied the request: pods "rubelagu-32081" is forbidden: violates PodSecurity "restricted:latest": privileged (container "rubelagu-32081" must not set securityContext.privileged=true), allowPrivilegeEscalation != false (container "rubelagu-32081" must set securityContext.allowPrivilegeEscalation=false), unrestricted capabilities (container "rubelagu-32081" must set securityContext.capabilities.drop=["ALL"]), runAsNonRoot != true (pod or container "rubelagu-32081" must set securityContext.runAsNonRoot=true), seccompProfile (pod or container "rubelagu-32081" must set securityContext.seccompProfile.type to "RuntimeDefault" or "Localhost")
Note that you get the error forbidding privileged pods even when the namespace policy-test1 has no pod-security.kubernetes.io/enforce label, so you know that this rule comes from the pod-security-webhook that we just installed and configured.
Now if you want to create a pod you will be forced to create it in a way that complies with the restricted PSS, by specifying runAsNonRoot, seccompProfile.type and capabilities. For example:
apiVersion: v1
kind: Pod
metadata:
  name: test-1
spec:
  restartPolicy: Never
  securityContext:
    runAsNonRoot: true
    runAsUser: 1000
    runAsGroup: 3000
    fsGroup: 2000
    seccompProfile:
      type: RuntimeDefault
  containers:
  - name: test
    image: ecerulm/ubuntu-tools:latest
    imagePullPolicy: Always
    command: ["/bin/bash", "-c", "--", "sleep 900"]
    securityContext:
      privileged: false
      allowPrivilegeEscalation: false
      capabilities:
        drop:
        - ALL

Registered Targets Disappear

I have a working EKS cluster. It is using an ALB for ingress.
When I apply a service and then an ingress, most of these work as expected. However, some target groups eventually have no registered targets. If I get the service IP address with kubectl describe svc my-service-name and manually register the endpoints in the target group, the pods are reachable again, but that's not a sustainable process.
Any ideas on what might be happening? Why doesn't EKS find the target groups as pods cycle?
Each service (secrets, deployment, service and ingress) consists of a set of .yaml files applied like:
deploy.sh
#!/bin/bash
set -e
kubectl apply -f ./secretsMap.yaml
kubectl apply -f ./configMap.yaml
kubectl apply -f ./deployment.yaml
kubectl apply -f ./service.yaml
kubectl apply -f ./ingress.yaml
service.yaml
apiVersion: v1
kind: Service
metadata:
  name: "site-bob"
  namespace: "next-sites"
spec:
  ports:
    - port: 80
      targetPort: 3000
      protocol: TCP
  type: NodePort
  selector:
    app: "site-bob"
ingress.yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: "site-bob"
  namespace: "next-sites"
  annotations:
    kubernetes.io/ingress.class: alb
    alb.ingress.kubernetes.io/tags: Environment=Production,Group=api
    alb.ingress.kubernetes.io/backend-protocol: HTTP
    alb.ingress.kubernetes.io/ip-address-type: ipv4
    alb.ingress.kubernetes.io/target-type: ip
    alb.ingress.kubernetes.io/scheme: internet-facing
    alb.ingress.kubernetes.io/listen-ports: '[{"HTTP":80},{"HTTPS":443}]'
    alb.ingress.kubernetes.io/load-balancer-name: eks-ingress-1
    alb.ingress.kubernetes.io/group.name: eks-ingress-1
    alb.ingress.kubernetes.io/certificate-arn: arn:aws:acm:us-east-2:402995436123:certificate/9db9dce3-055d-4655-842e-xxxxx
    alb.ingress.kubernetes.io/healthcheck-port: traffic-port
    alb.ingress.kubernetes.io/healthcheck-path: /
    alb.ingress.kubernetes.io/healthcheck-interval-seconds: '30'
    alb.ingress.kubernetes.io/healthcheck-timeout-seconds: '16'
    alb.ingress.kubernetes.io/success-codes: 200,201
    alb.ingress.kubernetes.io/healthy-threshold-count: '2'
    alb.ingress.kubernetes.io/unhealthy-threshold-count: '2'
    alb.ingress.kubernetes.io/load-balancer-attributes: idle_timeout.timeout_seconds=60
    alb.ingress.kubernetes.io/target-group-attributes: deregistration_delay.timeout_seconds=30
    alb.ingress.kubernetes.io/actions.ssl-redirect: >
      {
        "type": "redirect",
        "redirectConfig": { "Protocol": "HTTPS", "Port": "443", "StatusCode": "HTTP_301"}
      }
    alb.ingress.kubernetes.io/actions.svc-host: >
      {
        "type":"forward",
        "forwardConfig":{
          "targetGroups":[
            {
              "serviceName":"site-bob",
              "servicePort": 80,"weight":20}
          ],
          "targetGroupStickinessConfig":{"enabled":true,"durationSeconds":200}
        }
      }
  labels:
    app: site-bob
spec:
  rules:
    - host: "staging-bob.imgeinc.net"
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: ssl-redirect
                port:
                  name: use-annotation
          - backend:
              service:
                name: svc-host
                port:
                  name: use-annotation
            pathType: ImplementationSpecific
Something in my configuration tagged two security groups as being owned by the cluster. When I checked the load balancer controller logs:
kubectl logs -n kube-system aws-load-balancer-controller-677c7998bb-l7mwb
I saw many lines like:
{"level":"error","ts":1641996465.6707578,"logger":"controller-runtime.manager.controller.targetGroupBinding","msg":"Reconciler error","reconciler group":"elbv2.k8s.aws","reconciler kind":"TargetGroupBinding","name":"k8s-nextsite-sitefest-89a6f0ff0a","namespace":"next-sites","error":"expect exactly one securityGroup tagged with kubernetes.io/cluster/imageinc-next-eks-4KN4v6EX for eni eni-0c5555fb9a87e93ad, got: [sg-04b2754f1c85ac8b9 sg-07b026b037dd4d6a4]"}
sg-07b026b037dd4d6a4 has description: EKS created security group applied to ENI that is attached to EKS Control Plane master nodes, as well as any managed workloads.
sg-04b2754f1c85ac8b9 has description: Security group for all nodes in the cluster.
I removed the tag:
{
  Key: 'kubernetes.io/cluster/_cluster name_',
  Value: 'owned'
}
from sg-04b2754f1c85ac8b9
and the TargetGroups started to fill in and everything is now working. Both groups were created and tagged by Terraform. I suspect my worker group configuration is off.
I was facing the same issue when creating the cluster with Terraform. It was solved by updating the AWS Load Balancer Controller from 2.3 to 2.4.4.

NIFI CLUSTER AND ZOOKEEPER CLUSTER

I want to configure a NiFi cluster with an external TLS ZooKeeper cluster (deployed in a Kubernetes cluster). All is OK (quorum, ZooKeeper TLS...), but when I set the ZooKeeper connection string to … myzk:3181,myzk2:3181… and NiFi tries to connect to the ZooKeeper cluster, I get this message:
io.netty.handler.codec.DecoderException: io.netty.handler.ssl.NotSslRecordException: not an SSL/TLS record: 0000002d0000
I think that is because NiFi is talking to ZooKeeper in plaintext and 3181 is the TLS port.
Thanks in advance, regards
NIFI Version : 1.12.1
Zookeeper 3.7.0 (QUORUM IS OK)
#nifi.properties
# Site to Site properties
nifi.remote.input.host=nifi-0.nifi-headless.nifi-pro.svc.cluster.local
nifi.remote.input.secure=true
nifi.remote.input.socket.port=10443
nifi.remote.input.http.enabled=true
nifi.remote.input.http.transaction.ttl=30 sec
nifi.remote.contents.cache.expiration=30 secs
# web properties #
nifi.web.war.directory=./lib
nifi.web.proxy.host=my_proxy.com
nifi.web.http.port=
nifi.web.https.port=9443
nifi.web.http.host=nifi-0.nifi-headless.nifi-pro.svc.cluster.local
nifi.web.http.network.interface.default=eth0
nifi.web.https.host=nifi-0.nifi-headless.nifi-pro.svc.cluster.local
nifi.web.https.network.interface.default=
nifi.web.jetty.working.directory=./work/jetty
nifi.web.jetty.threads=200
# nifi.web.proxy.context.path=
# security properties #
nifi.sensitive.props.key=
nifi.sensitive.props.key.protected=
nifi.sensitive.props.algorithm=PBEWITHMD5AND256BITAES-CBC-OPENSSL
nifi.sensitive.props.provider=BC
nifi.sensitive.props.additional.keys=
nifi.security.keystore=/opt/nifi/nifi-current/config-data/certs/keystore.jks
nifi.security.keystoreType=jks
nifi.security.keystorePasswd=tym6nSAHI7xwnqUdwi4OGn2RpXtq9zLpqurol1lLqVg
nifi.security.keyPasswd=tym6nSAHI7xwnqUdwi4OGn2RpXtq9zLpqurol1lLqVg
nifi.security.truststore=/opt/nifi/nifi-current/config-data/certs/truststore.jks
nifi.security.truststoreType=jks
nifi.security.truststorePasswd=wRbjBPa62GLnlWaGMIMg6Ak6n+AyCeUKEquGSwyJt24
nifi.security.needClientAuth=true
nifi.security.user.authorizer=managed-authorizer
nifi.security.user.login.identity.provider=
nifi.security.ocsp.responder.url=
nifi.security.ocsp.responder.certificate=
# OpenId Connect SSO Properties #
nifi.security.user.oidc.discovery.url=https://my_url_oidc
nifi.security.user.oidc.connect.timeout=5 secs
nifi.security.user.oidc.read.timeout=5 secs
nifi.security.user.oidc.client.id=lkasdnlnsda
nifi.security.user.oidc.client.secret=fdjksalfnslknasfiDHn
nifi.security.user.oidc.preferred.jwsalgorithm=
nifi.security.user.oidc.claim.identifying.user=email
nifi.security.user.oidc.additional.scopes=
# Apache Knox SSO Properties #
nifi.security.user.knox.url=
nifi.security.user.knox.publicKey=
nifi.security.user.knox.cookieName=hadoop-jwt
nifi.security.user.knox.audiences=
# Identity Mapping Properties #
# These properties allow normalizing user identities such that identities coming from different identity providers
# (certificates, LDAP, Kerberos) can be treated the same internally in NiFi. The following example demonstrates normalizing
# DNs from certificates and principals from Kerberos into a common identity string:
#
# nifi.security.identity.mapping.pattern.dn=^CN=(.*?), OU=(.*?), O=(.*?), L=(.*?), ST=(.*?), C=(.*?)$
# nifi.security.identity.mapping.value.dn=$1#$2
# nifi.security.identity.mapping.pattern.kerb=^(.*?)/instance#(.*?)$
# nifi.security.identity.mapping.value.kerb=$1#$2
# cluster common properties (all nodes must have same values) #
nifi.cluster.protocol.heartbeat.interval=5 sec
nifi.cluster.protocol.is.secure=true
# cluster node properties (only configure for cluster nodes) #
nifi.cluster.is.node=true
nifi.cluster.node.address=nifi-0.nifi-headless.nifi-pro.svc.cluster.local
nifi.cluster.node.protocol.port=11443
nifi.cluster.node.protocol.threads=10
nifi.cluster.node.protocol.max.threads=50
nifi.cluster.node.event.history.size=25
nifi.cluster.node.connection.timeout=5 sec
nifi.cluster.node.read.timeout=5 sec
nifi.cluster.node.max.concurrent.requests=100
nifi.cluster.firewall.file=
nifi.cluster.flow.election.max.wait.time=1 mins
nifi.cluster.flow.election.max.candidates=
# zookeeper properties, used for cluster management #
nifi.zookeeper.connect.string=nifi-zk:2181
nifi.zookeeper.connect.timeout=3 secs
nifi.zookeeper.session.timeout=3 secs
nifi.zookeeper.root.node=/nifi
nifi.zookeeper.client.secure=true
## BY DEFAULT, NIFI CLIENT WILL USE nifi.security.* if you require separate keystore and truststore uncomment below section
nifi.zookeeper.security.keystore=/opt/nifi/nifi-current/config-data/certs/zk/keystore.jks
nifi.zookeeper.security.keystoreType=JKS
nifi.zookeeper.security.keystorePasswd=123456
nifi.zookeeper.security.truststore=/opt/nifi/nifi-current/config-data/certs/zk/truststore.jks
nifi.zookeeper.security.truststoreType=JKS
nifi.zookeeper.security.truststorePasswd=123456
# Zookeeper properties for the authentication scheme used when creating acls on znodes used for cluster management
# Values supported for nifi.zookeeper.auth.type are "default", which will apply world/anyone rights on znodes
# and "sasl" which will give rights to the sasl/kerberos identity used to authenticate the nifi node
# The identity is determined using the value in nifi.kerberos.service.principal and the removeHostFromPrincipal
# and removeRealmFromPrincipal values (which should align with the kerberos.removeHostFromPrincipal and kerberos.removeRealmFromPrincipal
# values configured on the zookeeper server).
nifi.zookeeper.auth.type=
nifi.zookeeper.kerberos.removeHostFromPrincipal=
nifi.zookeeper.kerberos.removeRealmFromPrincipal=
# kerberos #
nifi.kerberos.krb5.file=
# kerberos service principal #
nifi.kerberos.service.principal=
nifi.kerberos.service.keytab.location=
# kerberos spnego principal #
nifi.kerberos.spnego.principal=
nifi.kerberos.spnego.keytab.location=
nifi.kerberos.spnego.authentication.expiration=12 hours
# external properties files for variable registry
# supports a comma delimited list of file locations
nifi.variable.registry.properties=
You usually see this when you have an HTTP vs HTTPS mismatch; ideally you would be calling your service over plain HTTP, for example:
spring:
  cloud:
    gateway:
      discovery:
        locator:
          url-expression: "'lb:http://'+serviceId"
For reference, using client port 2181:
apiVersion: v1
kind: Service
metadata:
  name: zk-hs
  labels:
    app: zk
spec:
  ports:
  - port: 2888
    name: server
  - port: 3888
    name: leader-election
  clusterIP: None
  selector:
    app: zk
---
apiVersion: v1
kind: Service
metadata:
  name: zk-cs
  labels:
    app: zk
spec:
  ports:
  - port: 2181
    name: client
  selector:
    app: zk
---
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: zk-pdb
spec:
  selector:
    matchLabels:
      app: zk
  maxUnavailable: 1
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: zk
spec:
  selector:
    matchLabels:
      app: zk
  serviceName: zk-hs
  replicas: 3
  updateStrategy:
    type: RollingUpdate
  podManagementPolicy: OrderedReady
  template:
    metadata:
      labels:
        app: zk
    spec:
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            - labelSelector:
                matchExpressions:
                  - key: "app"
                    operator: In
                    values:
                      - zk
              topologyKey: "kubernetes.io/hostname"
      containers:
      - name: kubernetes-zookeeper
        imagePullPolicy: Always
        image: "k8s.gcr.io/kubernetes-zookeeper:1.0-3.4.10"
        resources:
          requests:
            memory: "1Gi"
            cpu: "0.5"
        ports:
        - containerPort: 2181
          name: client
        - containerPort: 2888
          name: server
        - containerPort: 3888
          name: leader-election
        command:
        - sh
        - -c
        - "start-zookeeper \
          --servers=3 \
          --data_dir=/var/lib/zookeeper/data \
          --data_log_dir=/var/lib/zookeeper/data/log \
          --conf_dir=/opt/zookeeper/conf \
          --client_port=2181 \
          --election_port=3888 \
          --server_port=2888 \
          --tick_time=2000 \
          --init_limit=10 \
          --sync_limit=5 \
          --heap=512M \
          --max_client_cnxns=60 \
          --snap_retain_count=3 \
          --purge_interval=12 \
          --max_session_timeout=40000 \
          --min_session_timeout=4000 \
          --log_level=INFO"
        readinessProbe:
          exec:
            command:
            - sh
            - -c
            - "zookeeper-ready 2181"
          initialDelaySeconds: 10
          timeoutSeconds: 5
        livenessProbe:
          exec:
            command:
            - sh
            - -c
            - "zookeeper-ready 2181"
          initialDelaySeconds: 10
          timeoutSeconds: 5
        volumeMounts:
        - name: datadir
          mountPath: /var/lib/zookeeper
      securityContext:
        runAsUser: 1000
        fsGroup: 1000
  volumeClaimTemplates:
  - metadata:
      name: datadir
    spec:
      accessModes: [ "ReadWriteOnce" ]
      resources:
        requests:
          storage: 10Gi
NiFi did not support TLS with ZooKeeper until release 1.13.0. If you're using NiFi 1.12.1 it will not support configuring TLS for ZooKeeper.

Debugging istio rate limiting handler

I'm trying to apply rate limiting on some of our internal services (inside the mesh).
I used the example from the docs and generated redis rate limiting configurations that include a (redis) handler, quota instance, quota spec, quota spec binding and rule to apply the handler.
This redis handler:
apiVersion: config.istio.io/v1alpha2
kind: handler
metadata:
  name: redishandler
  namespace: istio-system
spec:
  compiledAdapter: redisquota
  params:
    redisServerUrl: <REDIS>:6379
    connectionPoolSize: 10
    quotas:
    - name: requestcountquota.instance.istio-system
      maxAmount: 10
      validDuration: 100s
      rateLimitAlgorithm: FIXED_WINDOW
      overrides:
      - dimensions:
          destination: s1
        maxAmount: 1
      - dimensions:
          destination: s3
        maxAmount: 1
      - dimensions:
          destination: s2
        maxAmount: 1
The quota instance (I'm only interested in limiting by destination at the moment):
apiVersion: config.istio.io/v1alpha2
kind: instance
metadata:
  name: requestcountquota
  namespace: istio-system
spec:
  compiledTemplate: quota
  params:
    dimensions:
      destination: destination.labels["app"] | destination.service.host | "unknown"
A quota spec, charging 1 per request if I understand correctly:
apiVersion: config.istio.io/v1alpha2
kind: QuotaSpec
metadata:
  name: request-count
  namespace: istio-system
spec:
  rules:
  - quotas:
    - charge: 1
      quota: requestcountquota
A quota binding spec that all participating services pre-fetch. I also tried with service: "*" which also did nothing.
apiVersion: config.istio.io/v1alpha2
kind: QuotaSpecBinding
metadata:
  name: request-count
  namespace: istio-system
spec:
  quotaSpecs:
  - name: request-count
    namespace: istio-system
  services:
  - name: s2
    namespace: default
  - name: s3
    namespace: default
  - name: s1
    namespace: default
  # - service: '*'  # Uncomment this to bind *all* services to request-count
A rule to apply the handler. Currently it applies on all occasions (I tried with match conditions, but that didn't change anything either):
apiVersion: config.istio.io/v1alpha2
kind: rule
metadata:
  name: quota
  namespace: istio-system
spec:
  actions:
  - handler: redishandler
    instances:
    - requestcountquota
The VirtualService definitions are pretty similar for all participants:
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: s1
spec:
  hosts:
  - s1
  http:
  - route:
    - destination:
        host: s1
The problem is nothing really happens and no rate limiting takes place. I tested with curl from pods inside the mesh. The Redis instance is empty (no keys on db 0, which I assume is what the rate limiting would use), so I know it can't practically be rate limiting anything.
The handler seems to be configured properly (how can I make sure?) because I had some errors in it which were reported in mixer (policy). There are still some errors, but none that I associate with this problem or the configuration. The only line in which the redis handler is mentioned is this:
2019-12-17T13:44:22.958041Z info adapters adapter closed all scheduled daemons and workers {"adapter": "redishandler.istio-system"}
But it's unclear if it's a problem or not. I assume it's not.
These are the rest of the lines from the reload once I deploy:
2019-12-17T13:44:22.601644Z info Built new config.Snapshot: id='43'
2019-12-17T13:44:22.601866Z info adapters getting kubeconfig from: "" {"adapter": "kubernetesenv.istio-system"}
2019-12-17T13:44:22.601881Z warn Neither --kubeconfig nor --master was specified. Using the inClusterConfig. This might not work.
2019-12-17T13:44:22.602718Z info adapters Waiting for kubernetes cache sync... {"adapter": "kubernetesenv.istio-system"}
2019-12-17T13:44:22.903844Z info adapters Cache sync successful. {"adapter": "kubernetesenv.istio-system"}
2019-12-17T13:44:22.903878Z info adapters getting kubeconfig from: "" {"adapter": "kubernetesenv.istio-system"}
2019-12-17T13:44:22.903882Z warn Neither --kubeconfig nor --master was specified. Using the inClusterConfig. This might not work.
2019-12-17T13:44:22.904808Z info Setting up event handlers
2019-12-17T13:44:22.904939Z info Starting Secrets controller
2019-12-17T13:44:22.904991Z info Waiting for informer caches to sync
2019-12-17T13:44:22.957893Z info Cleaning up handler table, with config ID:42
2019-12-17T13:44:22.957924Z info adapters deleted remote controller {"adapter": "kubernetesenv.istio-system"}
2019-12-17T13:44:22.957999Z info adapters adapter closed all scheduled daemons and workers {"adapter": "prometheus.istio-system"}
2019-12-17T13:44:22.958041Z info adapters adapter closed all scheduled daemons and workers {"adapter": "redishandler.istio-system"}
2019-12-17T13:44:22.958065Z info adapters shutting down daemon... {"adapter": "kubernetesenv.istio-system"}
2019-12-17T13:44:22.958050Z info adapters shutting down daemon... {"adapter": "kubernetesenv.istio-system"}
2019-12-17T13:44:22.958096Z info adapters shutting down daemon... {"adapter": "kubernetesenv.istio-system"}
2019-12-17T13:44:22.958182Z info adapters shutting down daemon... {"adapter": "kubernetesenv.istio-system"}
2019-12-17T13:44:23.958109Z info adapters adapter closed all scheduled daemons and workers {"adapter": "kubernetesenv.istio-system"}
2019-12-17T13:55:21.042131Z info transport: loopyWriter.run returning. connection error: desc = "transport is closing"
2019-12-17T14:14:00.265722Z info transport: loopyWriter.run returning. connection error: desc = "transport is closing"
I'm using the demo profile with disablePolicyChecks: false to enable rate limiting. This is on istio 1.4.0, deployed on EKS.
I also tried memquota (this is our staging environment) with low limits and nothing seems to work. I never got a 429 no matter how much I went over the rate limit configured.
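(For comparison, a memquota handler has the same shape as the redis handler above; this is only a rough sketch based on the Istio 1.4 rate-limiting docs, reusing the names from this setup:)
apiVersion: config.istio.io/v1alpha2
kind: handler
metadata:
  name: memhandler
  namespace: istio-system
spec:
  compiledAdapter: memquota
  params:
    quotas:
    - name: requestcountquota.instance.istio-system
      maxAmount: 10
      validDuration: 100s
      overrides:
      - dimensions:
          destination: s1
        maxAmount: 1
        validDuration: 100s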
I don't know how to debug this and see where the configuration is wrong causing it to do nothing.
Any help is appreciated.
I too spent hours trying to decipher the documentation and get a sample working.
The documentation recommends enabling policy checks:
https://istio.io/docs/tasks/policy-enforcement/rate-limiting/
However when that did not work, I did an "istioctl profile dump", searched for policy, and tried several settings.
I used Helm install and passed the following and then was able to get the described behaviour:
--set global.disablePolicyChecks=false \
--set values.pilot.policy.enabled=true \ ===> this made it work, but it's not in the docs.
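(The same overrides expressed as a values file would look roughly like this; it is just the flags above rewritten as YAML, not something taken from the docs:)
global:
  disablePolicyChecks: false
pilot:
  policy:
    enabled: true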

Spinnaker on Titus cloud provider

Are there any steps for configuring Spinnaker/Halyard to work with a Titus-based cluster? - https://netflix.github.io/titus/
There aren't any steps described in the documentation: https://www.spinnaker.io/setup/install/providers/
Also, check this Github issue: https://github.com/spinnaker/spinnaker.github.io/issues/869
There is a sample config in the github repo:
titus:
  enabled: true
  awsVpc: vpc0 # this is the default vpc used by titus
  accounts:
  - name: titusdevint
    environment: test
    discovery: "http://discovery.compary.com/v2"
    discoveryEnabled: true
    registry: testregistry # reference to the docker registry being used
    awsAccount: test # aws account underpinning
    autoscalingEnabled: true
    loadBalancingEnabled: false # load balancing will be released at a later date
    regions:
    - name: us-east-1
      url: https://myTitus.us-east-1.company.com/
      port: 7104
      autoscalingEnabled: true
      loadBalancingEnabled: false
    - name: eu-west-1
      url: https://myTitus.eu-west-1.company.com/
      port: 7104
      autoscalingEnabled: true
      loadBalancingEnabled: false
https://github.com/spinnaker/clouddriver/tree/master/clouddriver-titus
Right now you'll have to edit clouddriver.yml manually and then update via Halyard.
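(One common way to do that, assuming a standard Halyard setup, is a clouddriver-local.yml custom profile; the path and the "default" deployment name below are assumptions about your environment:)
# ~/.hal/default/profiles/clouddriver-local.yml
titus:
  enabled: true
  awsVpc: vpc0
  accounts:
  - name: titusdevint
    environment: test
    # ... rest of the sample config shown above
# then apply it with: hal deploy apply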