When using the Bitnami Helm chart for Redis-Cluster, there is a redis-cluster-cluster-create job. However, when enabling istio-injection, this job never ends. If I disable istio-injection, the job quickly ends. Any solutions or reason why this phenomenon is happening?
Answering the main question
there is a redis-cluster-cluster-create job. However, when enabling istio-injection, this job never ends. If I disable istio-injection, the job quickly ends. Any solutions or reason why this phenomenon is happening?
The main issue here is that a Job is not considered complete until all of its containers have stopped running, and the Istio sidecar runs indefinitely. So while your task may have completed, the Job as a whole will not appear as completed in Kubernetes.
There is a GitHub issue about that.
This is one of the workarounds, and you can find more workarounds here.
I can change the podAnnotations from the Redis-Cluster Helm Chart, and when disabling the istio-injection, the Job doesn't spin up istio-proxy. However, the main 'cluster-create' job never ends, and eventually fails the deploy
As mentioned here
So as a temporary workaround adding sidecar.istio.io/inject: “false” is possible but this disables Istio for any traffic to/from the annotated Pod. As mentioned, we leveraged Kubernetes Jobs for Integration Testing, which meant some tests may need to access the service mesh. Disabling Istio essentially means breaking routing — a show stopper.
So disabling injection might not actually work here. I suggest trying the quitquitquit approach, as it's the most recommended workaround.
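A minimal sketch of the quitquitquit approach, assuming an Istio version whose sidecar agent exposes the /quitquitquit endpoint on localhost port 15020. The Job name, image, and run-task.sh script are hypothetical placeholders, and for the Bitnami chart you would need to adapt the Job that the chart itself generates:

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: example-job                        # hypothetical name
spec:
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: main
          image: example/task-image:latest # hypothetical; must contain a shell and curl
          command: ["/bin/sh", "-c"]
          args:
            - |
              # Run the real task first and remember its exit code.
              ./run-task.sh                # hypothetical task script
              rc=$?
              # Ask the Istio sidecar to shut down so the Pod can complete.
              # Port 15020 is an assumption; check it against your Istio version.
              curl -fsS -X POST http://127.0.0.1:15020/quitquitquit
              # Report the task's result, not curl's.
              exit $rc
```

The point of the wrapper is that the sidecar is only asked to exit after the main task has finished, so mesh traffic keeps working while the task runs.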
It is also worth checking these GitHub issues:
List of applications incompatible with Istio
helm stable/redis does not work with istio sidecar
Using Istio with CronJobs
Sidecar Containers
I am working on an application that, as far as I can see, performs multiple health checks:
DB readiness probe
Another API dependency readiness probe
When I look at cluster logs, I realize that my service, when it fails a DB check, just throws a 500 and goes down. What I am failing to understand is this: if the DB or another API were down and I did not have a readiness probe, my container would go down anyway. Also, I would see that my application threw some 500s because the DB or another service was off.
What is the benefit of the readiness probe if my container was going down anyway? Another question I have: is a health check something that I should consider only if I am deploying my service to a cluster? If it were not a clustered microservice environment, would that increase or decrease the benefit of performing health checks?
There are three types of probes that Kubernetes uses to check the health of a Pod:
Liveness: Tells Kubernetes that something went wrong inside the container, and it's better to restart it to see if Kubernetes can resolve the error.
Readiness: Tells Kubernetes that the Pod is ready to receive traffic. Sometimes something happens that doesn't wholly incapacitate the Pod but makes it impossible to fulfill the client's request. For example: losing connection to a database or a failure on a third party service. In this case, we don't want Kubernetes to restart the Pod, but we also don't want it to send the Pod traffic that it can't fulfill. When a Readiness probe fails, Kubernetes removes the Pod from the Service and stops routing traffic to it. Once the error is resolved, Kubernetes can add it back.
Startup: Tells Kubernetes when the application inside the container has finished starting. These probes are especially useful for applications that take a while to start. Until the Startup probe succeeds, Kubernetes doesn't run Liveness or Readiness probes. If it did, they might interfere with the app startup.
You can get more information about how probes work on this link:
https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-startup-probes/
Readiness probes are used in a few places. A big one is that non-ready pods are removed from all Services that reference them. They also matter for rolling updates on Deployments/StatefulSets, as the rollout won't continue until the new pods reach a ready state. In general the checks used for readiness probes should only be checking the current service, so they shouldn't be reaching out to a database. Sometimes that's hard to implement and does indeed make them less useful. But do check per-pod things, such as whether the web server is listening on its port and can return HTTP responses.
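As a rough illustration of how the three probe types are declared, here is a minimal Pod sketch. The image name, port, and the /healthz and /ready paths are assumptions made for the example, and the timing values are placeholders rather than recommendations:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: probe-demo                 # hypothetical name
spec:
  containers:
    - name: app
      image: example/app:1.0       # hypothetical image
      ports:
        - containerPort: 8080
      # Liveness and readiness checks are held back until this succeeds.
      startupProbe:
        httpGet:
          path: /healthz           # assumed endpoint
          port: 8080
        periodSeconds: 5
        failureThreshold: 30       # allows up to about 150s for startup
      # Repeated failure here makes the kubelet restart the container.
      livenessProbe:
        httpGet:
          path: /healthz
          port: 8080
        periodSeconds: 10
        failureThreshold: 3
      # Failure here only removes the Pod from Service endpoints.
      readinessProbe:
        httpGet:
          path: /ready             # assumed endpoint
          port: 8080
        periodSeconds: 5
        failureThreshold: 2
```

Keeping the readiness endpoint separate from the liveness endpoint lets a Pod stop receiving traffic without being restarted.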
I am trying to deploy a pod to the cluster. The application I am deploying is not a web server. I have an issue with setting up the liveness and readiness probes. Usually, I would use something like /isActive and /buildInfo endpoint for that.
I've read this https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-startup-probes/#define-a-liveness-command.
Wondering if I need to code a mechanism which will create a file and then somehow probe it from the deployment.yaml file?
Edit: this is what I used to keep the container running, not sure if that is the best way to do it?
- touch /tmp/healthy; while true; do sleep 30; done;
It does not make sense to create files in your application just for the liveness probe. In the K8s documentation this is just an example to show you how the exec command probe works.
The idea behind health probes is twofold:
Keep traffic away from your Pods before they have fully started (strictly speaking, this is what the readiness probe is for).
Detect unresponsive applications, caused by a lack of resources or by deadlocks, where the application's main process is still running (this is what the liveness probe is for).
Given that your deployments don't seem to expect external traffic, you don't need a probe for the first case. Regarding the second case, the question is how your application could lock up and how you would notice that externally, e.g. by monitoring a log file or similar.
Bear in mind that K8s will still monitor whether your application's main process is running. So restarts on application failure will still occur if your application stops running, even without a liveness probe. If you can be fairly sure that your application is not prone to becoming unresponsive while still running, you can also do without a liveness probe.
I have one namespace and one Deployment (ReplicaSet). My Apache logs should be written outside the pod. How is this possible in Kubernetes?
You should specify more precisely what exactly you mean by outside the pod, but as David Maze has already suggested in his comment, take a closer look at the Logging Architecture section in the official Kubernetes documentation.
Depending on what you mean by "outside the Pod", a different solution may be the best fit in your case.
As you can read there:
Kubernetes provides no native storage solution for log data, but you can integrate many existing logging solutions into your Kubernetes cluster ... Cluster-level logging architectures are described in assumption that a logging backend is present inside or outside of your cluster.
The three most popular cluster-level logging architectures mentioned there are:
Use a node-level logging agent that runs on every node.
Include a dedicated sidecar container for logging in an application pod.
Push logs directly to a backend from within an application.
The second solution is widely used. Unlike the third one, where pushing the logs needs to be handled by your application container, the sidecar approach is application independent, which makes it a much more flexible solution.
To complicate matters slightly, the sidecar approach can itself be implemented in two different ways:
Streaming sidecar container
Sidecar container with a logging agent
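A minimal sketch of the streaming sidecar variant, assuming Apache is configured to write its access log to a file under /var/log/apache2 (some images log to stdout instead, in which case no sidecar is needed). The Pod name, images, and paths are hypothetical:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: apache-with-log-sidecar      # hypothetical name
spec:
  containers:
    - name: apache
      image: example/apache:latest   # hypothetical image writing logs to files
      volumeMounts:
        - name: logs
          mountPath: /var/log/apache2   # assumed log directory
    # Sidecar that streams the log file to its own stdout, where the
    # node-level logging pipeline (or kubectl logs) can pick it up.
    - name: access-log-streamer
      image: busybox:1.36
      args: ["/bin/sh", "-c", "tail -n+1 -F /logs/access.log"]
      volumeMounts:
        - name: logs
          mountPath: /logs
  volumes:
    # Shared scratch volume; its contents are lost when the Pod is deleted.
    - name: logs
      emptyDir: {}
```

With this layout the logs leave the application container through the sidecar's stdout, so shipping them outside the cluster becomes the job of whatever logging backend you attach.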
We are running one of our services in a newly created Kubernetes cluster. Because of that, we have now switched it from the previous "in-memory" cache to a Redis cache.
Preliminary tests on our application, which exposes an API, show that we experience timeouts from our application to the Redis cache. I have no idea why, and the issue pops up very irregularly.
So I'm thinking maybe the reason for these timeouts is actually network related. Is it a good idea to put in affinity so we always run the Redis cache on the same nodes as the application to prevent network issues?
The issues have not arisen during "very high load" situations so it's concerning me a bit.
This is an opinion question so I'll answer in an opinionated way:
Like you mentioned, I would try to put the Redis and application pods on the same node, which would rule out wire networking issues. You can accomplish that with Kubernetes pod affinity. But you can also try nodeSelector, so that you always pin your Redis and application pods to a specific node.
Another way to do this is to taint your nodes where you want to run your workloads and then add a toleration to the Redis and your application pods.
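The pod affinity option mentioned above could look roughly like the sketch below. It assumes the Redis Pods carry the label app: redis, and the Deployment name and image are hypothetical:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: api                          # hypothetical application Deployment
spec:
  replicas: 2
  selector:
    matchLabels:
      app: api
  template:
    metadata:
      labels:
        app: api
    spec:
      affinity:
        podAffinity:
          # Only schedule onto a node that already runs a Pod labelled app=redis.
          requiredDuringSchedulingIgnoredDuringExecution:
            - labelSelector:
                matchLabels:
                  app: redis         # assumed label on the Redis Pods
              topologyKey: kubernetes.io/hostname
      containers:
        - name: api
          image: example/api:1.0     # hypothetical image
```

Because the rule is required rather than preferred, application Pods will stay Pending if no node is running Redis, so preferredDuringSchedulingIgnoredDuringExecution may be a safer choice while you are only testing whether the timeouts are network related.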
Hope it helps!
I just started trying out Spinnaker. I have gone through the tutorial, https://www.spinnaker.io/guides/tutorials/codelabs/gcp-kubernetes-source-to-prod/, and got it working without issues.
Now I want to go a bit more advanced and do a rolling release or a canary deployment (https://www.spinnaker.io/concepts/#deployment-strategies), where it is possible, for instance, to only expose a new release to 5% of the customers.
I cannot find any guide on spinnaker.io (or google) on how to set that up. Can anyone guide me in the right direction?
I have been experimenting with Spinnaker and canary deployments and doing PoCs myself of late, and here is what I have found thus far.
To implement a rolling release, just create a Deploy stage in Spinnaker, and set the Deployment Strategy to RollingUpdate in your Server Group config. You will need to make sure that the Deployment checkbox is checked before you can change the Deployment Strategy.
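In plain Kubernetes manifest terms (this is not Spinnaker's own stage configuration), a rolling update strategy on a Deployment looks roughly like this. The names, image, and surge values are assumptions for illustration:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app                       # hypothetical name
spec:
  replicas: 4
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1                    # at most one extra Pod during the rollout
      maxUnavailable: 0              # never drop below the desired replica count
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec:
      containers:
        - name: my-app
          image: example/my-app:2.0  # hypothetical image
```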
For the Canary Deployment, it is a little more involved. I don't think that the Canary Stage currently supports Kubernetes Deployments (yet), but apparently you can manually deploy a canary (e.g. 1 replica) into the same Kubernetes LoadBalancer where your app is running. This is done using a separate Spinnaker Server Group.
Then you can add a Manual Judgement to your Spinnaker pipeline that will pause until you test/validate the canary. Once the canary has been validated, you "Continue" the Manual Judgement, and the new Server Group gets deployed, and the old Server Group gets disabled, and the canary destroyed.
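Expressed as plain Kubernetes manifests rather than Spinnaker Server Groups, the manual canary idea can be sketched as a second, single-replica Deployment whose Pods match the same Service selector as the stable ones. All names, labels, and images here are hypothetical:

```yaml
# Service fronting both stable and canary Pods.
apiVersion: v1
kind: Service
metadata:
  name: my-app
spec:
  selector:
    app: my-app                      # matches stable and canary Pods alike
  ports:
    - port: 80
      targetPort: 8080
---
# Canary Deployment: one replica of the new version next to the stable one.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app-canary
spec:
  replicas: 1
  selector:
    matchLabels:
      app: my-app
      track: canary
  template:
    metadata:
      labels:
        app: my-app
        track: canary                # distinguishes canary from stable Pods
    spec:
      containers:
        - name: my-app
          image: example/my-app:2.0  # hypothetical new version
          ports:
            - containerPort: 8080
```

With this pattern the canary's share of traffic only approximates the replica ratio, so one canary Pod next to 19 stable Pods gives roughly 5%, not an exact split.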
If you don't want to use a Manual Judgement, and want this fully automated, you can add an ACA (Automated Canary Analysis) Stage. This involves setting up a judge that Spinnaker can connect to, which will gather various metrics and provide an ACA score. You can then use that score to decide whether to proceed with a deployment or stop it.