Can I change the compute resource limits of unscheduled pods with a custom scheduler in Kubernetes?

I am new to Kubernetes and I am working on compute resource management for a Kubernetes cluster. To that end, I downloaded a toy scheduler written in Go (https://github.com/kelseyhightower/scheduler). I know that once compute resource requests are set on a pod, they cannot be changed. However, suppose I have not set any resource requirements in the pod's YAML file, e.g. nginx.yaml:
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: nginx-deployment
  labels:
    app: nginx
spec:
  replicas: 2
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      schedulerName: hightower
      containers:
      - name: nginx
        image: nginx
        ports:
        - containerPort: 8080
          protocol: TCP
Can I apply resource requests to each pod that the custom scheduler tries to schedule?

According to the official Kubernetes documentation, to manage compute resources in a cluster you specify resource requests and limits (for example, CPU and memory) for each container in the Pod manifest, or in the pod template of a Deployment if you want replicas. When you create a Pod, the scheduler then selects a Node that has enough unreserved capacity to satisfy the Pod's requests.
Custom schedulers can extend the functionality and flexibility of the native Kubernetes scheduler; however, they cannot change how resource requests and resource limits are provisioned and managed in a Kubernetes cluster.

Thank you for the response. I agree that a custom scheduler is basically concerned with how pods are bound to nodes; however, I thought I could also enforce resource limits on pods (if none exist) before binding, while the pods are waiting to be scheduled. It turns out that the only thing I could do is use deployments instead of bare pods and change the deployment's pod template so that it also applies compute resource limits. This is not as "clean" as I would like, because the pending pods are destroyed and Kubernetes spawns new ones that include the resource limits. If anybody is interested, this is how I did it:
imports ...
...
// Load the kubeconfig the same way kubectl does.
kubeconfig := clientcmd.NewNonInteractiveDeferredLoadingClientConfig(
	clientcmd.NewDefaultClientConfigLoadingRules(),
	&clientcmd.ConfigOverrides{},
)
namespace, _, err := kubeconfig.Namespace()
if err != nil {
	panic(err)
}
restconfig, err := kubeconfig.ClientConfig()
if err != nil {
	panic(err)
}
clientset, err := kubernetes.NewForConfig(restconfig)
if err != nil {
	panic(err)
}
...
deploymentsClient := clientset.AppsV1().Deployments(namespace)
// Re-read and update the Deployment until the update no longer conflicts.
retryErr := retry.RetryOnConflict(retry.DefaultRetry, func() error {
	// Older client-go signatures; newer releases also take a context.Context
	// as the first argument and metav1.UpdateOptions on Update.
	d, getErr := deploymentsClient.Get("deployment name", metav1.GetOptions{})
	if getErr != nil {
		return getErr
	}
	// cores is an int64 placeholder for the number of CPU cores to request.
	d.Spec.Template.Spec.Containers[0].Resources.Requests = v1core.ResourceList{
		v1core.ResourceCPU: *resource.NewQuantity(cores, resource.DecimalSI),
	}
	_, updateErr := deploymentsClient.Update(d)
	return updateErr
})
if retryErr != nil {
	panic(retryErr)
}
...
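As a possible alternative to the read-modify-update loop, the same change to the pod template could be expressed as a patch body. The sketch below only builds that body with the standard library; the container name "nginx" comes from the manifest in the question, while the CPU value "500m" and the helper name buildRequestsPatch are assumptions made for illustration. With client-go, a body like this would typically be sent through the Deployments client's Patch method using types.StrategicMergePatchType, which merges the containers list by name. It still changes the pod template, so the pending pods are replaced just as described above.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// buildRequestsPatch returns a patch body that sets the CPU request of one
// container in a Deployment's pod template. Both arguments are illustrative
// inputs chosen by the caller.
func buildRequestsPatch(container, cpu string) ([]byte, error) {
	patch := map[string]any{
		"spec": map[string]any{
			"template": map[string]any{
				"spec": map[string]any{
					// Containers are matched by name when the patch is applied
					// as a strategic merge patch.
					"containers": []map[string]any{{
						"name": container,
						"resources": map[string]any{
							"requests": map[string]string{"cpu": cpu},
						},
					}},
				},
			},
		},
	}
	return json.Marshal(patch)
}

func main() {
	body, err := buildRequestsPatch("nginx", "500m")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(body))
}
```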

Related

Istio AuthorizationPolicy not working as expected from 1.8 to 1.14

I was using Istio 1.8.6 and have now migrated to 1.14.5.
After this upgrade, the AuthorizationPolicy stopped working the way it did before.
In my case, I have 2 namespaces, and I want namespace-1 to accept only requests coming from namespace-2. Services in namespace-1 must not call other services in that same namespace directly.
This is the AuthorizationPolicy:
apiVersion: security.istio.io/v1beta1
kind: AuthorizationPolicy
metadata:
  name: allow-only-ns-1
  namespace: namespace-1
spec:
  action: ALLOW
  rules:
  - from:
    - source:
        namespaces: ["namespace-2"]
I have an API gateway running in namespace-2 that maps/routes to all services in namespace-1.
So, if a service in namespace-1 needs to call another service in that namespace, it must go through the API gateway running in namespace-2.
This is a flow example allowed:
service-1.namespace-1 -> api-gateway.namespace-2 -> service-2.namespace-1
This is a flow example NOT allowed:
service-1.namespace-1 -> service-2.namespace-1
After this Istio upgrade (1.14.5), the AuthorizationPolicy stopped working. The new version blocks those requests with the error 403 Forbidden (RBAC). The services are not allowed to receive requests from anywhere.
The old version (1.8.6) was working correctly in namespace-1, blocking requests coming from namespace-1 and allowing requests from namespace-2.
Any idea what is going on?
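To make the intended behaviour concrete, here is a toy model of how a single ALLOW rule with source.namespaces could be evaluated. It is a sketch, not Istio's implementation, and the function name allowedByPolicy is invented for illustration. It encodes one documented detail: the namespaces source field relies on mutual TLS, so a caller whose identity is unknown (modelled here as an empty namespace) never matches. Whether that is what happens after this particular upgrade is only a hypothesis worth checking, not a diagnosis.

```go
package main

import "fmt"

// allowedByPolicy models a single ALLOW rule with a source.namespaces list.
// sourceNamespace is the namespace taken from the caller's mTLS identity;
// an empty string stands for a caller whose identity is unknown (for
// example, a plaintext connection).
func allowedByPolicy(sourceNamespace string, allowedNamespaces []string) bool {
	if sourceNamespace == "" {
		return false
	}
	for _, ns := range allowedNamespaces {
		if ns == sourceNamespace {
			return true
		}
	}
	return false
}

func main() {
	allow := []string{"namespace-2"}
	fmt.Println("api-gateway.namespace-2 -> service-2:", allowedByPolicy("namespace-2", allow))
	fmt.Println("service-1.namespace-1 -> service-2:", allowedByPolicy("namespace-1", allow))
	fmt.Println("caller without identity -> service-2:", allowedByPolicy("", allow))
}
```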

How can I configure the AdmissionConfiguration > PodSecurity > PodSecurityConfiguration in an EKS cluster?

If I understand Apply Pod Security Standards at the Cluster Level correctly, to make a PSS (Pod Security Standard) the default for the whole cluster I need to create an AdmissionConfiguration in a file that the API server consumes during cluster creation.
I don't see any way to provide the AdmissionConfiguration at CreateCluster, and I'm not sure how to provide it on a managed EKS node either.
From the tutorials that use KinD or minikube, it seems that the AdmissionConfiguration must be in a file referenced from cluster-config.yaml, but if I'm not mistaken the EKS API server is managed and does not let you change or even see this file.
The GitHub issue aws/container-roadmap Allow Access to AdmissionConfiguration seems to suggest that there is currently no way to provide an AdmissionConfiguration at creation, but on the other hand aws-eks-best-practices says: "These exemptions are applied statically in the PSA admission controller configuration as part of the API server configuration".
So, is there a way to provide a PodSecurityConfiguration for the whole cluster in EKS, or am I forced to use per-namespace labels?
See also Enforce Pod Security Standards by Configuring the Built-in Admission Controller and EKS Best practices PSS and PSA
I don't think there is currently any way in EKS to provide configuration for the built-in PSA controller (Pod Security Admission controller).
But if you want to implement a cluster-wide default for PSS (Pod Security Standards), you can do that by installing the official pod-security-webhook as a Dynamic Admission Controller in EKS.
git clone https://github.com/kubernetes/pod-security-admission
cd pod-security-admission/webhook
make certs
kubectl apply -k .
The default podsecurityconfiguration.yaml in pod-security-admission/webhook/manifests/020-configmap.yaml allows EVERYTHING so you should edit it and write something like
apiVersion: v1
kind: ConfigMap
metadata:
  name: pod-security-webhook
  namespace: pod-security-webhook
data:
  podsecurityconfiguration.yaml: |
    apiVersion: pod-security.admission.config.k8s.io/v1beta1
    kind: PodSecurityConfiguration
    defaults:
      enforce: "restricted"
      enforce-version: "latest"
      audit: "restricted"
      audit-version: "latest"
      warn: "restricted"
      warn-version: "latest"
    exemptions:
      # Array of authenticated usernames to exempt.
      usernames: []
      # Array of runtime class names to exempt.
      runtimeClasses: []
      # Array of namespaces to exempt.
      namespaces: ["policy-test2"]
then
kubectl apply -k .
kubectl -n pod-security-webhook rollout restart deployment/pod-security-webhook # otherwise the pods won't reread the configuration changes
After those changes you can verify that the default forbids privileged pods with:
kubectl --context aihub-eks-terraform create ns policy-test1
kubectl --context aihub-eks-terraform -n policy-test1 run --image=ecerulm/ubuntu-tools:latest --rm -ti rubelagu-$RANDOM --privileged
Error from server (Forbidden): admission webhook "pod-security-webhook.kubernetes.io" denied the request: pods "rubelagu-32081" is forbidden: violates PodSecurity "restricted:latest": privileged (container "rubelagu-32081" must not set securityContext.privileged=true), allowPrivilegeEscalation != false (container "rubelagu-32081" must set securityContext.allowPrivilegeEscalation=false), unrestricted capabilities (container "rubelagu-32081" must set securityContext.capabilities.drop=["ALL"]), runAsNonRoot != true (pod or container "rubelagu-32081" must set securityContext.runAsNonRoot=true), seccompProfile (pod or container "rubelagu-32081" must set securityContext.seccompProfile.type to "RuntimeDefault" or "Localhost")
Note that you get the error forbidding privileged pods even though the namespace policy-test1 has no pod-security.kubernetes.io/enforce label, so you know that this rule comes from the pod-security-webhook that we just installed and configured.
Now, if you want to create a pod, you are forced to do so in a way that complies with the restricted PSS, by specifying runAsNonRoot, seccompProfile.type, allowPrivilegeEscalation and capabilities. For example:
apiVersion: v1
kind: Pod
metadata:
  name: test-1
spec:
  restartPolicy: Never
  securityContext:
    runAsNonRoot: true
    runAsUser: 1000
    runAsGroup: 3000
    fsGroup: 2000
    seccompProfile:
      type: RuntimeDefault
  containers:
  - name: test
    image: ecerulm/ubuntu-tools:latest
    imagePullPolicy: Always
    command: ["/bin/bash", "-c", "--", "sleep 900"]
    securityContext:
      privileged: false
      allowPrivilegeEscalation: false
      capabilities:
        drop:
        - ALL
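The five checks named in the error message can be summarised in a few lines of Go. This is a deliberately simplified sketch, not the pod-security-admission code: the containerSecurity struct and the restrictedViolations function are invented for illustration, and pod-level and container-level settings are assumed to be already merged into one value.

```go
package main

import "fmt"

// containerSecurity is a simplified, hypothetical view of the settings the
// error message complains about.
type containerSecurity struct {
	Privileged           bool   // securityContext.privileged
	EscalationDisabled   bool   // allowPrivilegeEscalation explicitly set to false
	DropsAllCapabilities bool   // capabilities.drop contains "ALL"
	RunAsNonRoot         bool   // runAsNonRoot set to true
	SeccompType          string // seccompProfile.type
}

// restrictedViolations lists which of the five checks from the error
// message a container fails.
func restrictedViolations(c containerSecurity) []string {
	var v []string
	if c.Privileged {
		v = append(v, "privileged")
	}
	if !c.EscalationDisabled {
		v = append(v, "allowPrivilegeEscalation != false")
	}
	if !c.DropsAllCapabilities {
		v = append(v, "unrestricted capabilities")
	}
	if !c.RunAsNonRoot {
		v = append(v, "runAsNonRoot != true")
	}
	if c.SeccompType != "RuntimeDefault" && c.SeccompType != "Localhost" {
		v = append(v, "seccompProfile")
	}
	return v
}

func main() {
	// Roughly what "kubectl run --privileged" produces: nothing else is set.
	privileged := containerSecurity{Privileged: true}
	// The settings used by the test-1 pod.
	compliant := containerSecurity{
		EscalationDisabled:   true,
		DropsAllCapabilities: true,
		RunAsNonRoot:         true,
		SeccompType:          "RuntimeDefault",
	}
	fmt.Println(restrictedViolations(privileged))
	fmt.Println(len(restrictedViolations(compliant)))
}
```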

How to use EKS with suitable volumes and resolve subnet IP insufficient issue on AWS?

I deployed an application in EKS. The deployment is always pending; when I checked the events, I found these issues.
$ kubectl get events
LAST SEEN TYPE REASON OBJECT MESSAGE
89s Warning FailedScheduling pod/awx-demo-111111111-122222 running PreBind plugin "VolumeBinding": binding volumes: provisioning failed for PVC "awx-demo-projects-claim"
49m Warning FailedDeployModel ingress/awx-demo-ingress Failed deploy model due to InvalidSubnet: Not enough IP space available in subnet-031f9c702bc474e8f. ELB requires at least 8 free IP addresses in each subnet.
status code: 400, request id: 11111111-2222-3333-4444-555555555555
32m Warning FailedDeployModel ingress/awx-demo-ingress Failed deploy model due to InvalidSubnet: Not enough IP space available in subnet-01322i912fas0123na. ELB requires at least 8 free IP addresses in each subnet.
status code: 400, request id: 11111111-2222-3333-4444-555555555515
15m Warning FailedDeployModel ingress/awx-demo-ingress Failed deploy model due to InvalidSubnet: Not enough IP space available in subnet-031f9c702bc474e8f. ELB requires at least 8 free IP addresses in each subnet.
status code: 400, request id: 11111111-2222-3333-4444-555555555525
89s Normal WaitForPodScheduled persistentvolumeclaim/awx-demo-projects-claim waiting for pod awx-demo-111111111-122222 to be scheduled
21m Warning ProvisioningFailed persistentvolumeclaim/awx-demo-projects-claim Failed to provision volume with StorageClass "gp2": invalid AccessModes [ReadWriteMany]: only AccessModes [ReadWriteOnce] are supported
It seems there is a volume (device) issue and a subnet issue. I created the EKS cluster and node group with these configurations:
resource "aws_eks_cluster" "this" {
  encryption_config {
    resources = ["secrets"]
    provider {
      key_arn = aws_kms_key.this.arn
    }
  }
  enabled_cluster_log_types = ["api", "authenticator", "audit", "scheduler", "controllerManager"]
  name = local.cluster_name
  version = "1.20"
  role_arn = aws_iam_role.eks_cluster.arn
  vpc_config {
    subnet_ids = [
      data.aws_ssm_parameter.private_subnet_0_id.value,
      data.aws_ssm_parameter.private_subnet_1_id.value,
    ]
    security_group_ids = [aws_security_group.this.id]
    endpoint_public_access = true
  }
  depends_on = [
    aws_iam_role_policy_attachment.eks_cluster_policy,
    aws_iam_role_policy_attachment.eks_vpc_resource_controller,
    aws_iam_role_policy_attachment.eks_service_policy,
  ]
  tags = merge(
    local.tags,
  )
}
resource "aws_eks_node_group" "this" {
  cluster_name = local.cluster_name
  node_group_name = local.node_group_name
  node_role_arn = aws_iam_role.eks_nodes.arn
  instance_types = ["m5.2xlarge"]
  subnet_ids = [
    data.aws_ssm_parameter.private_subnet_0_id.value,
    data.aws_ssm_parameter.private_subnet_1_id.value,
  ]
  scaling_config {
    desired_size = 2
    max_size = 2
    min_size = 2
  }
  lifecycle {
    ignore_changes = [scaling_config[0].desired_size]
  }
  depends_on = [
    aws_iam_role_policy_attachment.eks_worker_node_policy,
    aws_iam_role_policy_attachment.eks_cni_policy,
    aws_iam_role_policy_attachment.ec2_container_register_readonly,
  ]
  tags = merge(
    local.tags,
  )
}
I didn't define the volume type for EBS, so it may be using the default setting. How do I fix this issue?
As for the insufficient IP addresses in the VPC: if I create a new subnet for EKS to use, is it necessary to delete the EKS cluster or node group?
By the way, the deployment I used was https://raw.githubusercontent.com/ansible/awx-operator/0.13.0/deploy/awx-operator.yaml.
The install followed https://github.com/ansible/awx-operator#basic-install.
@miantian, continuing our discussion from the comments:
A subnet's size cannot simply be increased. If you change the subnet size, the subnet will be recreated, and because the EKS cluster is still using it, the recreation will fail. So I would say: delete everything and start fresh.
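When sizing the replacement subnets, it may help to estimate how many addresses a given CIDR block actually offers. The sketch below assumes IPv4 subnets and relies on the fact that AWS reserves five addresses in every subnet; the CIDR blocks in main are hypothetical, since the original subnets' ranges are not shown in the question. The result is the capacity before any nodes or pods take addresses, so it has to stay comfortably above the 8 free addresses the load balancer asks for in the events above.

```go
package main

import (
	"fmt"
	"net"
)

// awsReservedPerSubnet is the number of addresses AWS reserves in every
// subnet (the first four and the last one).
const awsReservedPerSubnet = 5

// usableIPs returns how many addresses an IPv4 subnet with the given CIDR
// offers before anything is launched into it.
func usableIPs(cidr string) (int, error) {
	_, ipnet, err := net.ParseCIDR(cidr)
	if err != nil {
		return 0, err
	}
	ones, bits := ipnet.Mask.Size()
	if bits != 32 {
		return 0, fmt.Errorf("%s is not an IPv4 CIDR", cidr)
	}
	return (1 << uint(bits-ones)) - awsReservedPerSubnet, nil
}

func main() {
	// Hypothetical CIDR blocks, chosen only to show the arithmetic.
	for _, cidr := range []string{"10.0.0.0/28", "10.0.1.0/24", "10.0.16.0/20"} {
		n, err := usableIPs(cidr)
		if err != nil {
			panic(err)
		}
		fmt.Printf("%s: %d usable addresses\n", cidr, n)
	}
}
```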
Regarding the volume issue: the default gp2 StorageClass on EKS is backed by EBS and only supports the ReadWriteOnce access mode, as the ProvisioningFailed event shows. This comes from an AWS limitation: an EBS volume of this type can only be attached to one EC2 instance at a time. If you want to use the ReadWriteMany access mode, you need to use EFS.
If you want to use EFS, look up the NFS/EFS client provisioner for EKS. There are a few steps you need to follow to create an EFS provisioner in EKS. Then you can start using the ReadWriteMany access mode.

Assign roles to EKS cluster in manifest file?

I'm new to Kubernetes, and am playing with eksctl to create an EKS cluster in AWS. Here's my simple manifest file
kind: ClusterConfig
apiVersion: eksctl.io/v1alpha5
metadata:
  name: sandbox
  region: us-east-1
  version: "1.18"
managedNodeGroups:
  - name: ng-sandbox
    instanceType: r5a.xlarge
    privateNetworking: true
    desiredCapacity: 2
    minSize: 1
    maxSize: 4
    ssh:
      allow: true
      publicKeyName: my-ssh-key
fargateProfiles:
  - name: fp-default
    selectors:
      # All workloads in the "default" Kubernetes namespace will be
      # scheduled onto Fargate:
      - namespace: default
      # All workloads in the "kube-system" Kubernetes namespace will be
      # scheduled onto Fargate:
      - namespace: kube-system
  - name: fp-sandbox
    selectors:
      # All workloads in the "sandbox" Kubernetes namespace matching the
      # following label selectors will be scheduled onto Fargate:
      - namespace: sandbox
        labels:
          env: sandbox
          checks: passed
I created 2 roles: EKSClusterRole for cluster management and EKSWorkerRole for the worker nodes. Where do I use them in the file? I'm looking at the eksctl Config file schema page, and it's not clear to me where in the manifest file to use them.
As you mentioned, it's in the managedNodeGroups docs
managedNodeGroups:
  - ...
    iam:
      instanceRoleARN: my-role-arn
      # or
      # instanceRoleName: my-role-name
You should also read about
Creating a cluster with Fargate support using a config file
AWS Fargate

Debugging istio rate limiting handler

I'm trying to apply rate limiting on some of our internal services (inside the mesh).
I used the example from the docs and generated redis rate limiting configurations that include a (redis) handler, quota instance, quota spec, quota spec binding and rule to apply the handler.
This redis handler:
apiVersion: config.istio.io/v1alpha2
kind: handler
metadata:
  name: redishandler
  namespace: istio-system
spec:
  compiledAdapter: redisquota
  params:
    redisServerUrl: <REDIS>:6379
    connectionPoolSize: 10
    quotas:
    - name: requestcountquota.instance.istio-system
      maxAmount: 10
      validDuration: 100s
      rateLimitAlgorithm: FIXED_WINDOW
      overrides:
      - dimensions:
          destination: s1
        maxAmount: 1
      - dimensions:
          destination: s3
        maxAmount: 1
      - dimensions:
          destination: s2
        maxAmount: 1
The quota instance (I'm only interested in limiting by destination at the moment):
apiVersion: config.istio.io/v1alpha2
kind: instance
metadata:
  name: requestcountquota
  namespace: istio-system
spec:
  compiledTemplate: quota
  params:
    dimensions:
      destination: destination.labels["app"] | destination.service.host | "unknown"
A quota spec, charging 1 per request if I understand correctly:
apiVersion: config.istio.io/v1alpha2
kind: QuotaSpec
metadata:
  name: request-count
  namespace: istio-system
spec:
  rules:
  - quotas:
    - charge: 1
      quota: requestcountquota
A quota binding spec that all participating services pre-fetch. I also tried with service: "*" which also did nothing.
apiVersion: config.istio.io/v1alpha2
kind: QuotaSpecBinding
metadata:
  name: request-count
  namespace: istio-system
spec:
  quotaSpecs:
  - name: request-count
    namespace: istio-system
  services:
  - name: s2
    namespace: default
  - name: s3
    namespace: default
  - name: s1
    namespace: default
  # - service: '*' # Uncomment this to bind *all* services to request-count
A rule to apply the handler. Currently on all occasions (tried with matches but didn't change anything as well):
apiVersion: config.istio.io/v1alpha2
kind: rule
metadata:
  name: quota
  namespace: istio-system
spec:
  actions:
  - handler: redishandler
    instances:
    - requestcountquota
The VirtualService definitions are pretty similar for all participants:
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: s1
spec:
  hosts:
  - s1
  http:
  - route:
    - destination:
        host: s1
The problem is that nothing really happens and no rate limiting takes place. I tested with curl from pods inside the mesh. The redis instance is empty (no keys on db 0, which I assume is what the rate limiting would use), so I know it can't be rate-limiting anything in practice.
The handler seems to be configured properly (how can I make sure?), because I had some errors in it which were reported in mixer (policy). There are still some errors, but none that I associate with this problem or the configuration. The only line in which the redis handler is mentioned is this:
2019-12-17T13:44:22.958041Z info adapters adapter closed all scheduled daemons and workers {"adapter": "redishandler.istio-system"}
But it's unclear whether it's a problem or not. I assume it's not.
These are the rest of the lines from the reload once I deploy:
2019-12-17T13:44:22.601644Z info Built new config.Snapshot: id='43'
2019-12-17T13:44:22.601866Z info adapters getting kubeconfig from: "" {"adapter": "kubernetesenv.istio-system"}
2019-12-17T13:44:22.601881Z warn Neither --kubeconfig nor --master was specified. Using the inClusterConfig. This might not work.
2019-12-17T13:44:22.602718Z info adapters Waiting for kubernetes cache sync... {"adapter": "kubernetesenv.istio-system"}
2019-12-17T13:44:22.903844Z info adapters Cache sync successful. {"adapter": "kubernetesenv.istio-system"}
2019-12-17T13:44:22.903878Z info adapters getting kubeconfig from: "" {"adapter": "kubernetesenv.istio-system"}
2019-12-17T13:44:22.903882Z warn Neither --kubeconfig nor --master was specified. Using the inClusterConfig. This might not work.
2019-12-17T13:44:22.904808Z info Setting up event handlers
2019-12-17T13:44:22.904939Z info Starting Secrets controller
2019-12-17T13:44:22.904991Z info Waiting for informer caches to sync
2019-12-17T13:44:22.957893Z info Cleaning up handler table, with config ID:42
2019-12-17T13:44:22.957924Z info adapters deleted remote controller {"adapter": "kubernetesenv.istio-system"}
2019-12-17T13:44:22.957999Z info adapters adapter closed all scheduled daemons and workers {"adapter": "prometheus.istio-system"}
2019-12-17T13:44:22.958041Z info adapters adapter closed all scheduled daemons and workers {"adapter": "redishandler.istio-system"}
2019-12-17T13:44:22.958065Z info adapters shutting down daemon... {"adapter": "kubernetesenv.istio-system"}
2019-12-17T13:44:22.958050Z info adapters shutting down daemon... {"adapter": "kubernetesenv.istio-system"}
2019-12-17T13:44:22.958096Z info adapters shutting down daemon... {"adapter": "kubernetesenv.istio-system"}
2019-12-17T13:44:22.958182Z info adapters shutting down daemon... {"adapter": "kubernetesenv.istio-system"}
2019-12-17T13:44:23.958109Z info adapters adapter closed all scheduled daemons and workers {"adapter": "kubernetesenv.istio-system"}
2019-12-17T13:55:21.042131Z info transport: loopyWriter.run returning. connection error: desc = "transport is closing"
2019-12-17T14:14:00.265722Z info transport: loopyWriter.run returning. connection error: desc = "transport is closing"
I'm using the demo profile with disablePolicyChecks: false to enable rate limiting. This is on istio 1.4.0, deployed on EKS.
I also tried memquota (this is our staging environment) with low limits and nothing seems to work. I never got a 429 no matter how much I went over the rate limit configured.
I don't know how to debug this and see where the configuration is wrong causing it to do nothing.
Any help is appreciated.
I too spent hours trying to decipher the documentation and get a sample working.
According to the documentation, they recommended that we enable policy checks:
https://istio.io/docs/tasks/policy-enforcement/rate-limiting/
However when that did not work, I did an "istioctl profile dump", searched for policy, and tried several settings.
I used Helm install and passed the following and then was able to get the described behaviour:
--set global.disablePolicyChecks=false \
--set values.pilot.policy.enabled=true \ ===> this made it work, but it's not in the docs.