Spinnaker Halyard deployment - overriding readinessProbe configuration

How can I override the readinessProbe configuration for a Spinnaker deployment through Halyard?
Front50 takes about 60 seconds to start, so the health check fails because the default timeout is set to 1 second.
Default readinessProbe config for the Front50 deployment through Halyard:
readinessProbe:
  failureThreshold: 3
  periodSeconds: 10
  successThreshold: 1
  timeoutSeconds: 1

I think this might help:
hal config deploy edit --liveness-probe-enabled true --liveness-probe-initial-delay-seconds $LONGEST_SERVICE_STARTUP_TIME
hal deploy apply
The hal config options seem to offer only a liveness probe initial-delay argument.
Ref:
https://www.spinnaker.io/setup/install/environment/#distributed-installation

How can I configure the AdmissionConfiguration > PodSecurity > PodSecurityConfiguration in an EKS cluster?

If I understand Apply Pod Security Standards at the Cluster Level correctly, making a PSS (Pod Security Standard) the default for the whole cluster requires an AdmissionConfiguration in a file that the API server consumes during cluster creation.
I don't see any way to provide the AdmissionConfiguration at CreateCluster, and I'm not sure how to provide it on a managed EKS node.
From the tutorials that use KinD or minikube, it seems the AdmissionConfiguration must be in a file referenced from cluster-config.yaml, but if I'm not mistaken the EKS API server is managed and does not allow changing or even viewing this file.
The GitHub issue aws/container-roadmap "Allow Access to AdmissionConfiguration" seems to suggest that there is currently no way to provide an AdmissionConfiguration at creation. On the other hand, aws-eks-best-practices says: "These exemptions are applied statically in the PSA admission controller configuration as part of the API server configuration."
So, is there a way to provide a PodSecurityConfiguration for the whole cluster in EKS, or am I forced to use per-namespace labels?
See also Enforce Pod Security Standards by Configuring the Built-in Admission Controller and EKS Best practices PSS and PSA.
I don't think EKS currently offers any way to provide configuration for the built-in PSA controller (Pod Security Admission controller).
But if you want a cluster-wide default for PSS (Pod Security Standards), you can get one by installing the official pod-security-webhook as a Dynamic Admission Controller in EKS.
git clone https://github.com/kubernetes/pod-security-admission
cd pod-security-admission/webhook
make certs
kubectl apply -k .
The default podsecurityconfiguration.yaml in pod-security-admission/webhook/manifests/020-configmap.yaml allows EVERYTHING, so you should edit it and write something like:
apiVersion: v1
kind: ConfigMap
metadata:
  name: pod-security-webhook
  namespace: pod-security-webhook
data:
  podsecurityconfiguration.yaml: |
    apiVersion: pod-security.admission.config.k8s.io/v1beta1
    kind: PodSecurityConfiguration
    defaults:
      enforce: "restricted"
      enforce-version: "latest"
      audit: "restricted"
      audit-version: "latest"
      warn: "restricted"
      warn-version: "latest"
    exemptions:
      # Array of authenticated usernames to exempt.
      usernames: []
      # Array of runtime class names to exempt.
      runtimeClasses: []
      # Array of namespaces to exempt.
      namespaces: ["policy-test2"]
then
kubectl apply -k .
kubectl -n pod-security-webhook rollout restart deployment/pod-security-webhook # otherwise the pods won't reread the configuration changes
After those changes you can verify that the default forbids privileged pods with:
kubectl --context aihub-eks-terraform create ns policy-test1
kubectl --context aihub-eks-terraform -n policy-test1 run --image=ecerulm/ubuntu-tools:latest --rm -ti rubelagu-$RANDOM --privileged
Error from server (Forbidden): admission webhook "pod-security-webhook.kubernetes.io" denied the request: pods "rubelagu-32081" is forbidden: violates PodSecurity "restricted:latest": privileged (container "rubelagu-32081" must not set securityContext.privileged=true), allowPrivilegeEscalation != false (container "rubelagu-32081" must set securityContext.allowPrivilegeEscalation=false), unrestricted capabilities (container "rubelagu-32081" must set securityContext.capabilities.drop=["ALL"]), runAsNonRoot != true (pod or container "rubelagu-32081" must set securityContext.runAsNonRoot=true), seccompProfile (pod or container "rubelagu-32081" must set securityContext.seccompProfile.type to "RuntimeDefault" or "Localhost")
Note that you get the error forbidding privileged pods even though the namespace policy-test1 has no pod-security.kubernetes.io/enforce label, so you know that this rule comes from the pod-security-webhook that we just installed and configured.
Now, if you want to create a pod, you are forced to create it in a way that complies with the restricted PSS, by specifying runAsNonRoot, seccompProfile.type, allowPrivilegeEscalation and capabilities. For example:
apiVersion: v1
kind: Pod
metadata:
  name: test-1
spec:
  restartPolicy: Never
  securityContext:
    runAsNonRoot: true
    runAsUser: 1000
    runAsGroup: 3000
    fsGroup: 2000
    seccompProfile:
      type: RuntimeDefault
  containers:
  - name: test
    image: ecerulm/ubuntu-tools:latest
    imagePullPolicy: Always
    command: ["/bin/bash", "-c", "--", "sleep 900"]
    securityContext:
      privileged: false
      allowPrivilegeEscalation: false
      capabilities:
        drop:
        - ALL

Spring Cloud Gateway Request Rate Limiter is not working with Redis Cluster

I am trying to add a Redis request rate limiter to a gateway project. A Redis cluster is already up with 6 nodes in Docker, but the request rate limiter does not seem to work in the gateway project.
Here is the config:
spring:
  redis:
    cluster:
      nodes: ${REDIS_CLUSTER_NODES}
      maxRedirects: ${REDIS_CLUSTER_MAX_REDIRECTS}
  ...
        filters:
          - name: RequestRateLimiter
            args:
              key-resolver: "#{@userRemoteAddressResolver}"
              redis-rate-limiter.replenishRate: 1
              redis-rate-limiter.burstCapacity: 2
              redis-rate-limiter.requestedTokens: 1
There is no error message and no 429 HTTP status in the responses. Does RequestRateLimiter not work with Redis Cluster? Am I missing something? Thanks in advance.
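For reference, the three redis-rate-limiter arguments in the config describe a token bucket. Below is a minimal in-memory sketch of those semantics in Java, assuming the bucket starts full and that time is passed in as whole seconds. The class and method names are hypothetical, and this is not Spring Cloud Gateway's Redis-backed implementation. With replenishRate 1 and burstCapacity 2, a third request within the same second would be rejected, which is where a 429 would be expected.

```java
// Hypothetical illustration of token-bucket semantics; not Spring Cloud Gateway code.
class TokenBucketSketch {
    private final long replenishRate;   // tokens added per second
    private final long burstCapacity;   // maximum number of tokens the bucket can hold
    private final long requestedTokens; // tokens each request costs
    private long tokens;                // tokens currently available
    private long lastRefillSecond;      // time of the last refill, in seconds

    TokenBucketSketch(long replenishRate, long burstCapacity, long requestedTokens, long nowSecond) {
        this.replenishRate = replenishRate;
        this.burstCapacity = burstCapacity;
        this.requestedTokens = requestedTokens;
        this.tokens = burstCapacity; // assumption: the bucket starts full
        this.lastRefillSecond = nowSecond;
    }

    /** Returns true if the request is allowed, false if it would be answered with 429. */
    boolean tryAcquire(long nowSecond) {
        // Refill according to the elapsed time, capped at burstCapacity.
        long elapsed = Math.max(0, nowSecond - lastRefillSecond);
        tokens = Math.min(burstCapacity, tokens + elapsed * replenishRate);
        lastRefillSecond = Math.max(lastRefillSecond, nowSecond);
        if (tokens < requestedTokens) {
            return false;
        }
        tokens -= requestedTokens;
        return true;
    }

    public static void main(String[] args) {
        // Same values as in the question: replenishRate 1, burstCapacity 2, requestedTokens 1.
        TokenBucketSketch bucket = new TokenBucketSketch(1, 2, 1, 0);
        System.out.println(bucket.tryAcquire(0)); // true
        System.out.println(bucket.tryAcquire(0)); // true
        System.out.println(bucket.tryAcquire(0)); // false: bucket is empty
        System.out.println(bucket.tryAcquire(1)); // true: one token was replenished
    }
}
```

If three quick requests from the same client never produce a rejection, the limiter is probably not being applied at all (for example, the filter is not attached to the route being called), rather than being configured too generously.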

Debugging istio rate limiting handler

I'm trying to apply rate limiting on some of our internal services (inside the mesh).
I used the example from the docs and generated redis rate limiting configurations that include a (redis) handler, quota instance, quota spec, quota spec binding and rule to apply the handler.
This redis handler:
apiVersion: config.istio.io/v1alpha2
kind: handler
metadata:
  name: redishandler
  namespace: istio-system
spec:
  compiledAdapter: redisquota
  params:
    redisServerUrl: <REDIS>:6379
    connectionPoolSize: 10
    quotas:
    - name: requestcountquota.instance.istio-system
      maxAmount: 10
      validDuration: 100s
      rateLimitAlgorithm: FIXED_WINDOW
      overrides:
      - dimensions:
          destination: s1
        maxAmount: 1
      - dimensions:
          destination: s3
        maxAmount: 1
      - dimensions:
          destination: s2
        maxAmount: 1
The quota instance (I'm only interested in limiting by destination at the moment):
apiVersion: config.istio.io/v1alpha2
kind: instance
metadata:
  name: requestcountquota
  namespace: istio-system
spec:
  compiledTemplate: quota
  params:
    dimensions:
      destination: destination.labels["app"] | destination.service.host | "unknown"
A quota spec, charging 1 per request if I understand correctly:
apiVersion: config.istio.io/v1alpha2
kind: QuotaSpec
metadata:
  name: request-count
  namespace: istio-system
spec:
  rules:
  - quotas:
    - charge: 1
      quota: requestcountquota
A quota binding spec that all participating services pre-fetch. I also tried with service: "*" which also did nothing.
apiVersion: config.istio.io/v1alpha2
kind: QuotaSpecBinding
metadata:
  name: request-count
  namespace: istio-system
spec:
  quotaSpecs:
  - name: request-count
    namespace: istio-system
  services:
  - name: s2
    namespace: default
  - name: s3
    namespace: default
  - name: s1
    namespace: default
  # - service: '*' # Uncomment this to bind *all* services to request-count
A rule to apply the handler. Currently it applies to all requests (I tried adding a match, but that didn't change anything either):
apiVersion: config.istio.io/v1alpha2
kind: rule
metadata:
  name: quota
  namespace: istio-system
spec:
  actions:
  - handler: redishandler
    instances:
    - requestcountquota
The VirtualService definitions are pretty similar for all participants:
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: s1
spec:
  hosts:
  - s1
  http:
  - route:
    - destination:
        host: s1
The problem is nothing really happens and no rate limiting takes place. I tested with curl from pods inside the mesh. The redis instance is empty (no keys on db 0, which I assume is what the rate limiting would use) so I know it can't practically rate-limit anything.
The handler seems to be configured properly (how can I make sure?) because I had some errors in it which were reported in mixer (policy). There are still some errors but none which I associate to this problem or the configuration. The only line in which redis handler is mentioned is this:
2019-12-17T13:44:22.958041Z info adapters adapter closed all scheduled daemons and workers {"adapter": "redishandler.istio-system"}
But it's unclear whether that is a problem. I assume it's not.
These are the rest of the lines from the reload once I deploy:
2019-12-17T13:44:22.601644Z info Built new config.Snapshot: id='43'
2019-12-17T13:44:22.601866Z info adapters getting kubeconfig from: "" {"adapter": "kubernetesenv.istio-system"}
2019-12-17T13:44:22.601881Z warn Neither --kubeconfig nor --master was specified. Using the inClusterConfig. This might not work.
2019-12-17T13:44:22.602718Z info adapters Waiting for kubernetes cache sync... {"adapter": "kubernetesenv.istio-system"}
2019-12-17T13:44:22.903844Z info adapters Cache sync successful. {"adapter": "kubernetesenv.istio-system"}
2019-12-17T13:44:22.903878Z info adapters getting kubeconfig from: "" {"adapter": "kubernetesenv.istio-system"}
2019-12-17T13:44:22.903882Z warn Neither --kubeconfig nor --master was specified. Using the inClusterConfig. This might not work.
2019-12-17T13:44:22.904808Z info Setting up event handlers
2019-12-17T13:44:22.904939Z info Starting Secrets controller
2019-12-17T13:44:22.904991Z info Waiting for informer caches to sync
2019-12-17T13:44:22.957893Z info Cleaning up handler table, with config ID:42
2019-12-17T13:44:22.957924Z info adapters deleted remote controller {"adapter": "kubernetesenv.istio-system"}
2019-12-17T13:44:22.957999Z info adapters adapter closed all scheduled daemons and workers {"adapter": "prometheus.istio-system"}
2019-12-17T13:44:22.958041Z info adapters adapter closed all scheduled daemons and workers {"adapter": "redishandler.istio-system"}
2019-12-17T13:44:22.958065Z info adapters shutting down daemon... {"adapter": "kubernetesenv.istio-system"}
2019-12-17T13:44:22.958050Z info adapters shutting down daemon... {"adapter": "kubernetesenv.istio-system"}
2019-12-17T13:44:22.958096Z info adapters shutting down daemon... {"adapter": "kubernetesenv.istio-system"}
2019-12-17T13:44:22.958182Z info adapters shutting down daemon... {"adapter": "kubernetesenv.istio-system"}
2019-12-17T13:44:23.958109Z info adapters adapter closed all scheduled daemons and workers {"adapter": "kubernetesenv.istio-system"}
2019-12-17T13:55:21.042131Z info transport: loopyWriter.run returning. connection error: desc = "transport is closing"
2019-12-17T14:14:00.265722Z info transport: loopyWriter.run returning. connection error: desc = "transport is closing"
I'm using the demo profile with disablePolicyChecks: false to enable rate limiting. This is on istio 1.4.0, deployed on EKS.
I also tried memquota (this is our staging environment) with low limits and nothing seems to work. I never got a 429 no matter how much I went over the rate limit configured.
I don't know how to debug this and see where the configuration is wrong causing it to do nothing.
Any help is appreciated.
I too spent hours trying to decipher the documentation and get a sample working.
According to the documentation, they recommended that we enable policy checks:
https://istio.io/docs/tasks/policy-enforcement/rate-limiting/
However when that did not work, I did an "istioctl profile dump", searched for policy, and tried several settings.
I used Helm install and passed the following and then was able to get the described behaviour:
--set global.disablePolicyChecks=false \
--set values.pilot.policy.enabled=true \ ===> this made it work, but it's not in the docs.
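To make the expected behaviour concrete when testing, here is a minimal sketch of a fixed-window quota with per-destination overrides, mirroring the maxAmount, validDuration and overrides fields of the handler in the question. It is an in-memory Java illustration with hypothetical class and method names, not the redisquota adapter's code, and it assumes windows are aligned to multiples of validDuration.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration of a fixed-window quota; not Istio's redisquota adapter.
class FixedWindowQuotaSketch {
    private final long defaultMaxAmount;     // limit when no override matches
    private final long validDurationSeconds; // length of one window, in seconds
    private final Map<String, Long> overrides = new HashMap<>(); // destination -> limit
    // "destination#windowIndex" -> requests counted in that window.
    // Old windows are never evicted here; a real implementation would expire them.
    private final Map<String, Long> used = new HashMap<>();

    FixedWindowQuotaSketch(long defaultMaxAmount, long validDurationSeconds) {
        this.defaultMaxAmount = defaultMaxAmount;
        this.validDurationSeconds = validDurationSeconds;
    }

    void override(String destination, long maxAmount) {
        overrides.put(destination, maxAmount);
    }

    /** Charges 1 against the destination's current window; false means "over quota". */
    boolean allow(String destination, long nowSeconds) {
        long window = nowSeconds / validDurationSeconds; // assumption: aligned windows
        String key = destination + "#" + window;
        long limit = overrides.getOrDefault(destination, defaultMaxAmount);
        long count = used.getOrDefault(key, 0L);
        if (count + 1 > limit) {
            return false;
        }
        used.put(key, count + 1);
        return true;
    }

    public static void main(String[] args) {
        // Values from the question: maxAmount 10, validDuration 100s, override s1 -> 1.
        FixedWindowQuotaSketch quota = new FixedWindowQuotaSketch(10, 100);
        quota.override("s1", 1);
        System.out.println(quota.allow("s1", 0));   // true: first request in the window
        System.out.println(quota.allow("s1", 50));  // false: same window, limit is 1
        System.out.println(quota.allow("s1", 100)); // true: a new window has started
    }
}
```

Under this reading of the configuration, a second curl to s1 within 100 seconds should already be rejected, so never seeing a 429 suggests the policy check is not being invoked at all.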

serverless-api-gateway-caching plugin is not setting the cache size

I am trying to configure the AWS API Gateway cache using the serverless-api-gateway-caching plugin.
Everything works fine except the cache size.
This is my caching configuration:
caching:
  enabled: true
  clusterSize: '13.5'
  ttlInSeconds: 3600
  cacheKeyParameters:
    - name: request.path.param1
    - name: request.querystring.param2
The cache is configured correctly, but the cache size is always the default, '0.5'.
Any idea about what is wrong?
sls -v
1.42.3
node --version
v9.11.2
serverless-api-gateway-caching: 1.4.0
Regards
Because the "Cache Capacity" setting is global per stage, it cannot be set per endpoint.
So the plugin reads this parameter only from the global serverless configuration and ignores it at the endpoint level.
This means that the right configuration is:
custom:
  apiGatewayCaching:
    enabled: true
    clusterSize: '13.5'

Redis ha helm chart error - NOREPLICAS Not enough good replicas to write

I am trying to set up the redis-ha Helm chart on my local Kubernetes (Docker for Windows).
The Helm values file I am using is:
## Configure resource requests and limits
## ref: http://kubernetes.io/docs/user-guide/compute-resources/
##
image:
  repository: redis
  tag: 5.0.3-alpine
  pullPolicy: IfNotPresent
## replicas number for each component
replicas: 3
## Custom labels for the redis pod
labels: {}
## Pods Service Account
## ref: https://kubernetes.io/docs/tasks/configure-pod-container/configure-service-account/
serviceAccount:
  ## Specifies whether a ServiceAccount should be created
  ##
  create: false
  ## The name of the ServiceAccount to use.
  ## If not set and create is true, a name is generated using the redis-ha.fullname template
  # name:
## Role Based Access
## Ref: https://kubernetes.io/docs/admin/authorization/rbac/
##
rbac:
  create: false
## Redis specific configuration options
redis:
  port: 6379
  masterGroupName: mymaster
  config:
    ## Additional redis conf options can be added below
    ## For all available options see http://download.redis.io/redis-stable/redis.conf
    min-slaves-to-write: 1
    min-slaves-max-lag: 5 # Value in seconds
    maxmemory: "0" # Max memory to use for each redis instance. Default is unlimited.
    maxmemory-policy: "volatile-lru" # Max memory policy to use for each redis instance. Default is volatile-lru.
    # Determines if scheduled RDB backups are created. Default is false.
    # Please note that local (on-disk) RDBs will still be created when re-syncing with a new slave. The only way to prevent this is to enable diskless replication.
    save: "900 1"
    # When enabled, directly sends the RDB over the wire to slaves, without using the disk as intermediate storage. Default is false.
    repl-diskless-sync: "yes"
    rdbcompression: "yes"
    rdbchecksum: "yes"
  ## Custom redis.conf files used to override default settings. If this file is
  ## specified then the redis.config above will be ignored.
  # customConfig: |-
    # Define configuration here
  resources:
    requests:
      memory: 200Mi
      cpu: 100m
    limits:
      memory: 700Mi
      cpu: 250m
## Sentinel specific configuration options
sentinel:
  port: 26379
  quorum: 2
  config:
    ## Additional sentinel conf options can be added below. Only options that
    ## are expressed in the format similar to 'sentinel xxx mymaster xxx' will
    ## be properly templated.
    ## For available options see http://download.redis.io/redis-stable/sentinel.conf
    down-after-milliseconds: 10000
    ## Failover timeout value in milliseconds
    failover-timeout: 180000
    parallel-syncs: 5
  ## Custom sentinel.conf files used to override default settings. If this file is
  ## specified then the sentinel.config above will be ignored.
  # customConfig: |-
    # Define configuration here
  resources:
    requests:
      memory: 200Mi
      cpu: 100m
    limits:
      memory: 200Mi
      cpu: 250m
securityContext:
  runAsUser: 1000
  fsGroup: 1000
  runAsNonRoot: true
## Node labels, affinity, and tolerations for pod assignment
## ref: https://kubernetes.io/docs/concepts/configuration/assign-pod-node/#nodeselector
## ref: https://kubernetes.io/docs/concepts/configuration/assign-pod-node/#taints-and-tolerations-beta-feature
## ref: https://kubernetes.io/docs/concepts/configuration/assign-pod-node/#affinity-and-anti-affinity
affinity: {}
# Prometheus exporter specific configuration options
exporter:
  enabled: false
  image: oliver006/redis_exporter
  tag: v0.31.0
  pullPolicy: IfNotPresent
  # prometheus port & scrape path
  port: 9121
  scrapePath: /metrics
  # cpu/memory resource limits/requests
  resources: {}
  # Additional args for redis exporter
  extraArgs: {}
podDisruptionBudget: {}
  # maxUnavailable: 1
  # minAvailable: 1
## Configures redis with AUTH (requirepass & masterauth conf params)
auth: false
# redisPassword:
## Use existing secret containing "auth" key (ignores redisPassword)
# existingSecret:
persistentVolume:
  enabled: true
  ## redis-ha data Persistent Volume Storage Class
  ## If defined, storageClassName: <storageClass>
  ## If set to "-", storageClassName: "", which disables dynamic provisioning
  ## If undefined (the default) or set to null, no storageClassName spec is
  ## set, choosing the default provisioner. (gp2 on AWS, standard on
  ## GKE, AWS & OpenStack)
  ##
  # storageClass: "-"
  accessModes:
    - ReadWriteOnce
  size: 1Gi
  annotations: {}
init:
  resources: {}
# To use a hostPath for data, set persistentVolume.enabled to false
# and define hostPath.path.
# Warning: this might overwrite existing folders on the host system!
hostPath:
  ## path is evaluated as template so placeholders are replaced
  # path: "/data/{{ .Release.Name }}"
  # if chown is true, an init-container with root permissions is launched to
  # change the owner of the hostPath folder to the user defined in the
  # security context
  chown: true
redis-ha is getting deployed correctly and when I do kubectl get all,
NAME                       READY   STATUS    RESTARTS   AGE
pod/rc-redis-ha-server-0   2/2     Running   0          1h
pod/rc-redis-ha-server-1   2/2     Running   0          1h
pod/rc-redis-ha-server-2   2/2     Running   0          1h
NAME                             TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)              AGE
service/kubernetes               ClusterIP   10.96.0.1        <none>        443/TCP              23d
service/rc-redis-ha              ClusterIP   None             <none>        6379/TCP,26379/TCP   1h
service/rc-redis-ha-announce-0   ClusterIP   10.105.187.154   <none>        6379/TCP,26379/TCP   1h
service/rc-redis-ha-announce-1   ClusterIP   10.107.36.58     <none>        6379/TCP,26379/TCP   1h
service/rc-redis-ha-announce-2   ClusterIP   10.98.38.214     <none>        6379/TCP,26379/TCP   1h
NAME                                  DESIRED   CURRENT   AGE
statefulset.apps/rc-redis-ha-server   3         3         1h
I am trying to access redis-ha from a Java application that uses the Lettuce driver to connect to Redis. Sample Java code to access Redis:
package io.c12.bala.lettuce;

import io.lettuce.core.RedisClient;
import io.lettuce.core.api.StatefulRedisConnection;
import io.lettuce.core.api.sync.RedisCommands;
import java.util.logging.Logger;

public class RedisClusterConnect {
    private static final Logger logger = Logger.getLogger(RedisClusterConnect.class.getName());

    public static void main(String[] args) {
        logger.info("Starting test");
        // Syntax: redis-sentinel://[password@]host[:port][,host2[:port2]][/databaseNumber]#sentinelMasterId
        RedisClient redisClient = RedisClient.create("redis-sentinel://rc-redis-ha:26379/0#mymaster");
        StatefulRedisConnection<String, String> connection = redisClient.connect();
        RedisCommands<String, String> command = connection.sync();
        command.set("Hello", "World");
        logger.info("Ran set command successfully");
        logger.info("Value from Redis - " + command.get("Hello"));
        connection.close();
        redisClient.shutdown();
    }
}
I packaged the application as a runnable jar, built a container image from it, and deployed it to the same Kubernetes cluster where Redis is running. The application now throws an error:
Exception in thread "main" io.lettuce.core.RedisCommandExecutionException: NOREPLICAS Not enough good replicas to write.
at io.lettuce.core.ExceptionFactory.createExecutionException(ExceptionFactory.java:135)
at io.lettuce.core.LettuceFutures.awaitOrCancel(LettuceFutures.java:122)
at io.lettuce.core.FutureSyncInvocationHandler.handleInvocation(FutureSyncInvocationHandler.java:69)
at io.lettuce.core.internal.AbstractInvocationHandler.invoke(AbstractInvocationHandler.java:80)
at com.sun.proxy.$Proxy0.set(Unknown Source)
at io.c12.bala.lettuce.RedisClusterConnect.main(RedisClusterConnect.java:22)
Caused by: io.lettuce.core.RedisCommandExecutionException: NOREPLICAS Not enough good replicas to write.
at io.lettuce.core.ExceptionFactory.createExecutionException(ExceptionFactory.java:135)
at io.lettuce.core.ExceptionFactory.createExecutionException(ExceptionFactory.java:108)
at io.lettuce.core.protocol.AsyncCommand.completeResult(AsyncCommand.java:120)
at io.lettuce.core.protocol.AsyncCommand.complete(AsyncCommand.java:111)
at io.lettuce.core.protocol.CommandHandler.complete(CommandHandler.java:646)
at io.lettuce.core.protocol.CommandHandler.decode(CommandHandler.java:604)
at io.lettuce.core.protocol.CommandHandler.channelRead(CommandHandler.java:556)
I also tried the Jedis driver and a Spring Boot application, and got the same error from the redis-ha cluster.
** UPDATE **
When I run the info command inside redis-cli, I get:
connected_slaves:2
min_slaves_good_slaves:0
It seems the slaves are not behaving properly. After switching to min-slaves-to-write: 0, I am able to read from and write to the Redis cluster.
Any help on this is appreciated.
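The two INFO fields in the update point at the cause: the master sees two connected replicas but counts none of them as good, so with min-slaves-to-write: 1 it refuses writes. Below is a minimal Java sketch of that check as I understand it from the Redis documentation. The class and method names are hypothetical, and the lag values are invented for illustration rather than taken from the cluster above.

```java
// Hypothetical illustration of the min-slaves-to-write check; not Redis source code.
class MinSlavesCheckSketch {

    /** Counts replicas whose last acknowledgement is at most maxLagSeconds old. */
    static int goodReplicas(long[] replicaLagSeconds, long maxLagSeconds) {
        int good = 0;
        for (long lag : replicaLagSeconds) {
            if (lag <= maxLagSeconds) {
                good++;
            }
        }
        return good;
    }

    /** Returns false where the master would answer a write with NOREPLICAS. */
    static boolean acceptsWrites(long[] replicaLagSeconds, int minSlavesToWrite, long maxLagSeconds) {
        // Setting either option to 0 disables the check.
        if (minSlavesToWrite == 0 || maxLagSeconds == 0) {
            return true;
        }
        return goodReplicas(replicaLagSeconds, maxLagSeconds) >= minSlavesToWrite;
    }

    public static void main(String[] args) {
        // Invented lags: two connected replicas, both too far behind to count as good.
        long[] lags = {30, 45};
        System.out.println(goodReplicas(lags, 5));     // 0, like min_slaves_good_slaves:0
        System.out.println(acceptsWrites(lags, 1, 5)); // false: NOREPLICAS
        System.out.println(acceptsWrites(lags, 0, 5)); // true: check disabled
    }
}
```

Setting min-slaves-to-write to 0 therefore hides the symptom; the open question is why the replicas' acknowledgements are not reaching the master in time.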
It seems you have to edit the redis-ha-configmap ConfigMap and set min-slaves-to-write to 0.
After deleting all the Redis pods (so they pick up the change), it works like a charm.
So:
helm install stable/redis-ha
kubectl edit cm redis-ha-configmap # change min-slaves-to-write from 1 to 0
kubectl delete pod redis-ha-0
If you are deploying this Helm chart locally on your computer, you only have 1 node available. If you install the Helm chart with --set hardAntiAffinity=false, it will put all the required replica pods on the same node, so it will start up correctly and not give you that error. This hardAntiAffinity value has a documented default of true:
Whether the Redis server pods should be forced to run on separate nodes.
When I deployed the Helm chart with the same values to a Kubernetes cluster running on AWS, it worked fine.
It seems to be an issue with Kubernetes on Docker for Windows.