GCE: load balancer health check fails when routing traffic to pods - load-balancing

I am following the steps at https://codelabs.developers.google.com/codelabs/cloud-hello-kubernetes and can successfully expose my pod to the outside world with a command like:
kubectl expose deployment hello --type="LoadBalancer"
I have set up a static IP, and when I run
$ kubectl get services
NAME         CLUSTER-IP      EXTERNAL-IP       PORT(S)   AGE
kubernetes   10.111.xxx.x    <none>            443/TCP   13d
hello        10.111.xxx.xx   104.155.xxx.xxx   80/TCP    12d
Everything looks OK and works for a couple of days, but after a while the traffic from 104.155.xxx.xxx stops getting routed to my pod and I start getting errors like this when I check the load balancer:
Instance gke-k8-default-pool-xxxx is unhealthy for 104.155.xxx.xxx
This always happens after a few days. I have no clue what I am doing wrong.

The load-balancer functionality is provided by the underlying infrastructure (in your case GCE), so this has little to do with Kubernetes itself.
Instance gke-k8-default-pool-xxxx is unhealthy for 104.155.xxx.xxx
From the log you provided, I can only tell that the instance (VM) in your GCE project is failing the health check defined for IP 104.155.xxx.xxx. So there are a few things you need to check:
Did anything unusual happen to instance gke-k8-default-pool-xxxx?
What health check did you define for 104.155.xxx.xxx, and why would it fail for this instance?
You can either investigate the true cause of the above, or simply restart instance gke-k8-default-pool-xxxx and check whether it passes the health check again.
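A few gcloud commands can help narrow this down. This is only a sketch: the zone is a placeholder you would replace with your own, and the instance name is taken from the error message above.

```shell
# List HTTP health checks and their paths/thresholds
gcloud compute http-health-checks list

# Inspect the node's status (zone is a placeholder)
gcloud compute instances describe gke-k8-default-pool-xxxx --zone us-central1-a

# As a blunt fix, reset the instance and watch whether it passes again
gcloud compute instances reset gke-k8-default-pool-xxxx --zone us-central1-a
```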

Related

Which is the correct IP to run API tests on kubernetes cluster

I have a Kubernetes cluster with pods whose Services are of type ClusterIP. Which is the correct IP to hit if I want to run integration tests: the IP (10.102.222.181) or the Endpoints (10.244.0.157:80, 10.244.5.243:80)?
for example:
Type: ClusterIP
IP Families: <none>
IP: 10.102.222.181
IPs: <none>
Port: http 80/TCP
TargetPort: 80/TCP
Endpoints: 10.244.0.157:80,10.244.5.243:80
Session Affinity: None
Events: <none>
If your test runner is running inside the cluster, use the name: of the Service as a host name. Don't use any of these IP addresses directly. Kubernetes provides a DNS service that will translate the Service's name to its address (the IP: from the kubectl describe service output), and the Service itself just forwards network traffic to the Endpoints: (individual pod addresses).
If the test runner is outside the cluster, none of these DNS names or IP addresses are reachable at all. For basic integration tests, it should be enough to kubectl port-forward service/its-name 12345:80, and then you can use http://localhost:12345 to reach the service (actually a fixed single pod from it). This isn't a good match for performance or load tests, and you'll either need to launch these from inside the cluster, or to use a NodePort or LoadBalancer service to make the service accessible from outside.
The IPs in the Endpoints are individual pod IPs, which are subject to change when new pods are created to replace old ones. The ClusterIP is a stable IP that does not change unless you delete and recreate the Service, so the recommendation is to use the ClusterIP.
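A concrete version of the out-of-cluster approach, assuming a Service named my-service (a placeholder) exposing port 80:

```shell
# Forward local port 12345 to port 80 of the Service (hypothetical name)
kubectl port-forward service/my-service 12345:80 &

# Once the forward is established, the test runner can hit it locally
curl http://localhost:12345/
```

Note that the forward pins a single pod for its lifetime, which is why this suits functional tests rather than load tests.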

Running an apache container on a port > 1024

I've built a docker image based on httpd:2.4. In my k8s deployment I've defined the following securityContext:
securityContext:
  privileged: false
  runAsNonRoot: true
  runAsUser: 431
  allowPrivilegeEscalation: false
In order to get this container to run properly as non-root, Apache needs to be configured to bind to a port > 1024, as opposed to the default 80. As far as I can tell, this means editing Listen 80 in httpd.conf to Listen <some port > 1024>.
When I run the docker image I've built normally (i.e. on the default port 80), I have the following port settings:
deployment
spec.template.spec.containers[0].ports[0].containerPort: 80
service
spec.ports[0].targetPort: 80
spec.ports[0].port: 8080
ingress
spec.rules[0].http.paths[0].backend.servicePort: 8080
Given these settings, the service becomes accessible at the host URL provided in the ingress manifest. Again, this is without the changes to httpd.conf. When I make those changes (using Listen 8000) and add the securityContext section to the deployment, I change the various manifests accordingly:
deployment
spec.template.spec.containers[0].ports[0].containerPort: 8000
service
spec.ports[0].targetPort: 8000
spec.ports[0].port: 8080
ingress
spec.rules[0].http.paths[0].backend.servicePort: 8080
Yet for some reason, when I try to access a URL that should be working I get a 502 Bad Gateway error. Have I set the ports correctly? Is there something else I need to do?
Check if the pod is Running:
kubectl get pods
kubectl logs <pod_name>
Check if the URL is accessible within the pod:
kubectl exec -it <pod_name> -- bash
$ curl http://localhost:8000
If the above didn't work, check your httpd.conf.
Check with the service name from another pod (e.g. the ingress controller pod):
kubectl exec -it <ingress_pod_name> -- bash
$ curl http://<service_name>:8080
You can check the ingress logs too.
In order to get this container to run properly as non-root apache
needs to be configured to bind to a port > 1024, as opposed to the
default 80
You got it: that's the hard requirement for running the Apache container as non-root, so this change needs to be made at the container level, not in Kubernetes abstractions like the Deployment's Pod spec or the Service/Ingress resource definitions. The only thing left in your case is to build a custom httpd image that listens on a port > 1024. The same approach applies to the NGINX Docker containers.
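A minimal sketch of such a custom image, assuming the stock config location inside the httpd:2.4 image:

```dockerfile
# Hypothetical custom image: rebind Apache to an unprivileged port
FROM httpd:2.4
# Change "Listen 80" to "Listen 8000" in the default config
RUN sed -i 's/^Listen 80$/Listen 8000/' /usr/local/apache2/conf/httpd.conf
EXPOSE 8000
```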
One key piece of information about the 'containerPort' field in the Pod spec, which you are trying to adjust manually and which is not so apparent: it is there primarily for informational purposes and does not cause a port to be opened at the container level. According to the Kubernetes API reference:
Not specifying a port here DOES NOT prevent that port from being
exposed. Any port which is listening on the default "0.0.0.0" address
inside a container will be accessible from the network. Cannot be updated.
I hope this helps you move on.
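Putting it together, the non-root variant's port wiring might look like the sketch below. Names and the image tag are hypothetical; the behavioral change lives in the image's httpd.conf, not in containerPort:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: httpd-nonroot          # hypothetical name
spec:
  replicas: 1
  selector:
    matchLabels:
      app: httpd-nonroot
  template:
    metadata:
      labels:
        app: httpd-nonroot
    spec:
      containers:
      - name: httpd
        image: my-httpd:8000    # custom image whose httpd.conf says "Listen 8000"
        securityContext:
          privileged: false
          runAsNonRoot: true
          runAsUser: 431
          allowPrivilegeEscalation: false
        ports:
        - containerPort: 8000   # informational; the real binding is in httpd.conf
---
apiVersion: v1
kind: Service
metadata:
  name: httpd-nonroot
spec:
  selector:
    app: httpd-nonroot
  ports:
  - port: 8080
    targetPort: 8000
```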

How to connect to redis-ha cluster in Kubernetes cluster?

So I recently installed the stable/redis-ha chart (https://github.com/helm/charts/tree/master/stable/redis-ha) on my Google Cloud based Kubernetes cluster. The cluster was installed as a "Headless Service" without a ClusterIP. There are three pods that make up this cluster, one of which is elected master.
The cluster has installed with no issues and can be accessed via redis-cli from my local pc (after port-forwarding with kubectl).
The output from the cluster install provided me with a DNS name for the cluster. Because the service is headless, I am using the following DNS name:
port_name.port_protocol.svc.namespace.svc.cluster.local (As specified by the documentation)
When attempting to connect I get the following error:
"redis.exceptions.ConnectionError: Error -2 connecting to
port_name.port_protocol.svc.namespace.svc.cluster.local :6379. Name does not
resolve."
This is not working.
Not sure what to do here. Any help would be greatly appreciated.
The DNS name appears to be incorrect. It should be in the following format:
<redis-service-name>.<namespace>.svc.cluster.local:6379
Say the Redis service name is redis and the namespace is default; then it would be:
redis.default.svc.cluster.local:6379
You can also use the pod DNS name, like below:
<redis-pod-name>.<redis-service-name>.<namespace>.svc.cluster.local:6379
Say the Redis pod name is redis-0, the service name is redis, and the namespace is default; then it would be:
redis-0.redis.default.svc.cluster.local:6379
This assumes the service port is the same as the container port, namely 6379.
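The naming rules above can be captured in a small helper. This is just an illustration of the format; the defaults (namespace, port 6379) are assumptions matching the example:

```python
def service_dns(service: str, namespace: str = "default", port: int = 6379) -> str:
    """In-cluster DNS address for a Service."""
    return f"{service}.{namespace}.svc.cluster.local:{port}"


def pod_dns(pod: str, service: str, namespace: str = "default", port: int = 6379) -> str:
    """Per-pod DNS address behind a headless Service (each pod gets its own record)."""
    return f"{pod}.{service}.{namespace}.svc.cluster.local:{port}"
```

For example, service_dns("redis") yields redis.default.svc.cluster.local:6379, matching the answer above.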
Not sure if this is still relevant. Just enhance the chart, similar to other charts that support NodePort (e.g. rabbitmq-ha), so that you can use any node IP and the configured node port if you want to access Redis from outside the cluster.

Can not link a HTTP Load Balancer to a backend (502 Bad Gateway)

On the backend I have a Kubernetes Service of type NodePort exposing port 32656 on the node. If I create a firewall rule for <node_ip>:32656 to allow traffic, I can open the backend in the browser at http://<node_ip>:32656.
What I try to achieve now is creating an HTTP Load Balancer and link it to the above backend. I use the following script to create the infrastructure required:
#!/bin/bash
GROUP_NAME="gke-service-cluster-61155cae-group"
HEALTH_CHECK_NAME="test-health-check"
BACKEND_SERVICE_NAME="test-backend-service"
URL_MAP_NAME="test-url-map"
TARGET_PROXY_NAME="test-target-proxy"
GLOBAL_FORWARDING_RULE_NAME="test-global-rule"
NODE_PORT="32656"
PORT_NAME="http"
# instance group named ports
gcloud compute instance-groups set-named-ports "$GROUP_NAME" --named-ports "$PORT_NAME:$NODE_PORT"
# health check
gcloud compute http-health-checks create --format none "$HEALTH_CHECK_NAME" --check-interval "5m" --healthy-threshold "1" --timeout "5m" --unhealthy-threshold "10"
# backend service
gcloud compute backend-services create "$BACKEND_SERVICE_NAME" --http-health-check "$HEALTH_CHECK_NAME" --port-name "$PORT_NAME" --timeout "30"
gcloud compute backend-services add-backend "$BACKEND_SERVICE_NAME" --instance-group "$GROUP_NAME" --balancing-mode "UTILIZATION" --capacity-scaler "1" --max-utilization "1"
# URL map
gcloud compute url-maps create "$URL_MAP_NAME" --default-service "$BACKEND_SERVICE_NAME"
# target proxy
gcloud compute target-http-proxies create "$TARGET_PROXY_NAME" --url-map "$URL_MAP_NAME"
# global forwarding rule
gcloud compute forwarding-rules create "$GLOBAL_FORWARDING_RULE_NAME" --global --ip-protocol "TCP" --ports "80" --target-http-proxy "$TARGET_PROXY_NAME"
But I get the following response from the Load Balancer accessed through the public IP in the Frontend configuration:
Error: Server Error
The server encountered a temporary error and could not complete your
request. Please try again in 30 seconds.
The health check is left with default values: (/ and 80) and the backend service responds quickly with a status 200.
I have also created the firewall rule to accept any source and all ports (tcp) and no target specified (i.e. all targets).
Considering that regardless of the port I choose (in the instance group) I get the same result (Server Error), the problem should be somewhere in the configuration of the HTTP Load Balancer (something with the health checks, maybe?).
What am I missing from completing the linking between the frontend and the backend?
I assume you actually have instances in the instance group and that the firewall rule is not restricted to a specific source range. Can you check your logs for a Google health check? (The User-Agent will contain "google".)
What version of Kubernetes are you running? FYI, there's a resource in 1.2 that hooks this up for you automatically: http://kubernetes.io/docs/user-guide/ingress/; just make sure you do these: https://github.com/kubernetes/contrib/blob/master/ingress/controllers/gce/BETA_LIMITATIONS.md.
More specifically: in 1.2 you need to create a firewall rule, a Service of type NodePort (both of which you already seem to have), and a health check on that service at "/" (which you don't have; this requirement is alleviated in 1.3, but 1.3 is not out yet).
Also note that you can't put the same instance into two load-balanced instance groups, so to use the Ingress mentioned above you will have to clean up your existing load balancer (or at least remove the instances from the instance group and free up enough quota so the Ingress controller can do its thing).
There can be a few things wrong among those mentioned:
firewall rules need to be open to all hosts, or they need to have the same network tag as the machines in the instance group
by default, the node should return 200 at /; configuring readiness and liveness probes to change this did not work for me
It seems you are trying to do things by hand that are all automated, so I can really recommend:
https://cloud.google.com/kubernetes-engine/docs/how-to/load-balance-ingress
This shows the steps that do the firewall and port forwarding for you, which may also show you what you are missing.
I noticed myself, when using an app on 8080 exposed on 80 (like one of the deployments in the example), that the load balancer stayed unhealthy until / returned 200 (I also added /healthz). So basically that container now exposes a webserver on port 8080 returning 200, and the other config wires that up to port 80.
When it comes to firewall rules, make sure they apply to all machines or that the network tag matches, or they won't work.
The 502 error usually comes from the load balancer, which will not pass your request on if the health check does not pass.
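For reference, a readiness probe that answers 200 at / would be declared in the Pod spec roughly as below (names are hypothetical; on newer GKE versions the ingress controller derives the load balancer health check from this probe):

```yaml
containers:
- name: web
  image: my-app:latest    # hypothetical image
  ports:
  - containerPort: 8080
  readinessProbe:
    httpGet:
      path: /
      port: 8080
    periodSeconds: 5
```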
Could you make your Service type LoadBalancer (http://kubernetes.io/docs/user-guide/services/#type-loadbalancer), which would set this all up automatically? This assumes you have the cloud-provider flag set for Google Cloud.
After you deploy, describe the service by name and it should give you the endpoint that was assigned.
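A sketch of such a Service, with hypothetical names and the NodePort target port from the question:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: my-backend          # hypothetical name
spec:
  type: LoadBalancer
  selector:
    app: my-backend
  ports:
  - port: 80
    targetPort: 32656       # the backend port from the question
```

Then `kubectl describe service my-backend` shows the external IP once the cloud provider has provisioned it.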

Kubernetes: get real client source IP of incoming packet

I want to get the actual IP from which the client sent the packet, in my app running in a Kubernetes pod.
I did some searching and found that this was not supported earlier but was supported later.
I upgraded my setup; here is the current version:
$ kubectl version
Client Version: version.Info{Major:"1", Minor:"1", GitVersion:"v1.1.3", GitCommit:"6a81b50c7e97bbe0ade075de55ab4fa34f049dc2", GitTreeState:"clean"}
Server Version: version.Info{Major:"1", Minor:"1", GitVersion:"v1.1.3", GitCommit:"6a81b50c7e97bbe0ade075de55ab4fa34f049dc2", GitTreeState:"clean"}
$ kubectl api-versions
extensions/v1beta1
v1
I also ran:
$ for node in $(kubectl get nodes -o name); do kubectl annotate $node net.beta.kubernetes.io/proxy-mode=iptables; done
This now gives:
error: --overwrite is false but found the following declared
annotation(s): 'net.beta.kubernetes.io/proxy-mode' already has a value
(iptables)
error: --overwrite is false but found the following declared
annotation(s): 'net.beta.kubernetes.io/proxy-mode' already has a value
(iptables)
I also rebooted all the boxes.
However, I still get the IP of the worker node's docker0 interface when the packet is received inside my application.
Here, I read:
But that will not expose external client IPs, just intra cluster IPs.
So, the question is how to get the real, external client IP when I get a packet.
The packets are not http/websocket packets, but plain TCP packets if this is relevant to get an answer.
I also tried following this comment but did not get lucky; the app continued to receive packets with the docker0 interface IP as the source IP. Maybe I failed to copy-paste the steps correctly. I don't know how to get the kube-proxy IP and just used the worker machine's IP there. I am just getting started with Kubernetes and CoreOS.
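For clarity on what "source IP" means at the application level: for plain TCP, the address the app sees is the peer address returned by accept(). A self-contained local sketch (loopback only; in the cluster scenario above this is where the NATed docker0 address shows up instead of the real client):

```python
import socket
import threading


def peer_of_one_connection(host: str = "127.0.0.1") -> tuple:
    """Accept a single TCP connection and return the peer (source) address
    exactly as the server application observes it."""
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.bind((host, 0))            # port 0: let the OS pick a free port
    srv.listen(1)
    port = srv.getsockname()[1]

    result = {}

    def accept_once():
        conn, addr = srv.accept()  # addr is the (ip, port) the client used
        result["peer"] = addr
        conn.close()

    t = threading.Thread(target=accept_once)
    t.start()
    cli = socket.create_connection((host, port))
    cli.close()
    t.join()
    srv.close()
    return result["peer"]
```

On loopback the observed peer IP is 127.0.0.1; behind kube-proxy in userspace mode it is the proxy's cluster-internal address, which is exactly the problem described above.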