I have a situation where my acceptance test makes a connection with a rabbitMQ instance during the pipeline. But the rabbitMQ instance is private, making not possible to make this connection in the pipeline.
I was wondering if making an api endpoint that run this test and adding to the startup probe would be a good approach to make sure this test is passing.
If the rabbitmq is a container in your pod yes, if it isn't then you shouldn't.
There's no final answer to this, but the startup probe is just there to ensure that your pod is not being falsly considered unhealthy by other probes just because it takes a little longer to start. It's aimed at legacy applications that need to build assets or compile stuff at startup.
If there was a place to put a connectivity test to rabitmq would be the liveness probe, but you should only do that if your application is entirely dependent on a connection to rabbitmq, otherwise your authentication would fail because you couldn't connect to the messaging queue. And if you have a second app that tries to connect to your endpoint as a liveness probe? And a third app that connects the second one to check if that app is alive? You could kill an entire ecosystem just because rabbitmq rebooted or crashed real quick.
Not recommended.
You could have that as part of your liveness probe IF your app is a worker, then, not having a connection to rabbitmq would make the worker unusable.
Your acceptance tests should be placed on your CD or in a post-deploy script step if you don't have a CD.
Related
I am working on an application that, as I can see is doing multiple health checks?
DB readiness probe
Another API dependency readiness probe
When I look at cluster logs, I realize that my service, when it fails a DB-check, just throws 500 and goes down. What I am failing to understand here is that if DB was down or another API was down and IF I do not have a readiness probe then my container is going down anyway. Also, I will see that my application did throw some 500 because DB or another service was off.
What is the benefit of the readiness probe of my container was going down anyway? Another question I have is that is Healthcheck something that I should consider only if I am deploying my service to a cluster? If it was not a cluster microservice environment, would it increase/decrease benefits of performing healtheck?
There are three types of probes that Kubernetes uses to check the health of a Pod:
Liveness: Tells Kubernetes that something went wrong inside the container, and it's better to restart it to see if Kubernetes can resolve the error.
Readiness: Tells Kubernetes that the Pod is ready to receive traffic. Sometimes something happens that doesn't wholly incapacitate the Pod but makes it impossible to fulfill the client's request. For example: losing connection to a database or a failure on a third party service. In this case, we don't want Kubernetes to reset the Pod, but we also don't wish for it to send it traffic that it can't fulfill. When a Readiness probe fails, Kubernetes removes the Pod from the service and stops communication with the Pod. Once the error is resolved, Kubernetes can add it back.
Startup: Tells Kubernetes when a Pod has started and is ready to receive traffic. These probes are especially useful on applications that take a while to begin. While the Pod initiates, Kubernetes doesn't send Liveness or Readiness probes. If it did, they might interfere with the app startup.
You can get more information about how probes work on this link:
https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-startup-probes/
Readiness probes are used in a few places. A big one is that non-ready pods are removed from all Services that reference them. They also matter for rolling updates on Deployments/StatefulSets as the roll won't continue until the new pods reach a ready state. In general the checks used for readiness probes should only be checking the current service. So it shouldn't be reaching out to a database. Sometimes that's hard to implement and does indeed make them less useful. But check per-pod stuff like the web server is listening on the port and can return HTTP responses.
I am trying to deploy a pod to the cluster. The application I am deploying is not a web server. I have an issue with setting up the liveness and readiness probes. Usually, I would use something like /isActive and /buildInfo endpoint for that.
I've read this https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-startup-probes/#define-a-liveness-command.
Wondering if I need to code a mechanism which will create a file and then somehow prob it from the deployment.yaml file?
Edit: this is what I used to keep the container running, not sure if that is the best way to do it?
- touch /tmp/healthy; while true; do sleep 30; done;
It does not make sense to create files in your application just for the liveness probe. On the K8s documentation this is just an example to show you how the exec command probe works.
The idea behind the liveness probe is bipartite:
Avoid traffic on your Pods, before they have been fully started.
Detect unresponsive applications due to lack of resources or deadlocks where the application main process is still running.
Given that your deployments don't seem to expect external traffic, you don't require a liveness probe for the first case. Regarding the second case, question is how your application could lock up and how you would notice externally, e.g. by monitoring a log file or similar.
Bear in mind, that K8s will still monitor whether your applications main process is running. So, restarts on application failure will still occur, if you application stops running without a liveness probe. So, if you can be fairly sure that your application is not prone to becoming unresponsive while still running, you can also do without a liveness probe.
I'd like to run a perl6/raku Cro app as a service behind a frontend webserver.
Just running cro run won't handle restarting after segfaults & reboots.
Previously with perl5 I've used FastCGI - however Cro::HTTP::Server's Cro::HTTP::Server.new().start() idiom doesn't look compatible with FastCGI::Native's while $fcgi.accept() {} example.
The service.p6 generated by cro stub does have a SIGINT handler, however I'm unsure whether this is sufficient to point to it in a systemctl service, i.e.
[Service]
ExecStart = /path/to/service.p6
How are people currently hosting Cro apps?
cro run is intended as a development tool, not a deployment one, and so is indeed not a good choice for hosting the services.
All of the Cro services I directly take care of are containerized (some guidance on that here) and then run on a hosted Kubernetes cluster. Kubernetes takes care of the automatic restarts, rolling out new versions, etc. I'm also aware of docker-compose being used in place of Kubernetes, which I guess works, though I believe that's also considered as primarily a development tool.
Setting it up as a systemctl service should also work fine, provided that's configured to always restart. However, it seems that you'd want to handle SIGTERM for the clean shutdown to work instead of SIGINT (nothing wrong with handling both).
I do also place a frontend web server in front of Cro (using Apache, though nginx would be a fine choice too), and also use that to do some caching of static content also (using content-control in my routes to describe the cachability).
I need to run a component using Apache Camel (or Spring Integration) under WAS ND 8.0 cluster. They both run some threads on startup, and stop them on shutdown normally. No problem to supply WAS managed threadpool. But that threads must run on single cluster's node at the same time. Moreover it must be high-available i.e. switch to other node when active node falls.
Solution I found - is WAS Partitioning Facility. It requires additional Extended Deployment licenses. Is it the only way, or there is some way to implement this using Network Deployment license only?
Thanks in advance.
I think that there is not a feature that address this interesting requirement.
I can imagine a "trick":
A Timer EJB send a message on a queue (let's say 1 per minute)
Configure a Service Integration Bus (SIB) with High Availability and No Scalability, so the HA Manager ensure that only one messaging engine (ME) is alive.
Create a non-reliable queue for high performances and low resource consumption.
The Activation Spec should be configured to listen only local ME.
A MDB implement the following logic: when the message arrives, it check if the singleton thread is alive, otherwise it start the thread.
Does it make sense?
I have a WCF Web Service that has no concurrency configuration in the web.config, so I believe it is running as the default as persession. In the service, it uses a COBOL Virtual Machine to execute code that pulls data from COBOL Vision files. Per the developer of the COBOL VM, it is a singleton.
When more than one person accesses the service at a time, I'll get periodic crashes of the web service. What I believe is happening is that as one process is executing another separate process comes in at about the same time. The first process ends and closes the VM down through normal closing procedures. The second process is still executing and attempting to read/write data, but the VM was shutdown and it crashes. In the constructor for the web service, an instance of the VM is created and when a series of methods complete, the service is cleaned up and the VM closed out.
I have been reading up on Singleton concurrency in WCF web services and thinking I might need to switch to this instead. This way I can open the COBOL VM and keep it alive forever and eliminate my code shutting down the VM in my methods. The only data I need to share between requests is the status of the COBOL VM.
My alternative I'm thinking of is creating a server process that manages opening the VM and keeping it alive and allowing the web service to make read/write requests through that process instead.
Does this sound like the right path? I'm basically looking for a way to keep the Virtual Machine alive in a WCF web service situation and just keep executing code against it. The COBOL VM system sends me back locking information on the read/writes which I can use to handle retries or waits against.
Thanks,
Martin
The web service is now marked as:
[ServiceBehavior(ConcurrencyMode = ConcurrencyMode.Single)]
From what I understand, this only allows a single thread to run through the web service at a time. Other requests are queued until the first completes. This was a quick fix that works in my situation because my web service doesn't require high concurrency. There are never more than a handful of requests coming in at a time.