Restrict Log Analytics logging per deployment or container - azure-container-service

We've seen our Log Analytics costs spike and found that the ContainerLog table had grown drastically. This appears to be all stdout/stderr logs from the containers.
Is it possible to restrict logging to this table, at least for some deployments or containers, without disabling Log Analytics on the cluster? We still want performance logging and insights.

AFAIK, the stdout and stderr logs in the ContainerLog table are basically the logs you see when you manually run "kubectl logs", so it is possible to restrict logging to the ContainerLog table without disabling Log Analytics on the cluster by having the deployment write its logs to a file inside the container, something like the manifest below.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: xxxxxxx
spec:
  selector:
    matchLabels:
      app: xxxxxxx
  template:
    metadata:
      labels:
        app: xxxxxxx
    spec:
      containers:
      - name: xxxxxxx
        image: xxxxxxx/xxxxxxx:latest
        # redirect stdout and stderr to a file inside the container
        command: ["sh", "-c", "./xxxxxxx.sh > /logfile 2>&1"]
However, the best practice for applications running in a container is to send log messages to stdout, so the approach above is not preferable.
Instead, you may create an alert for when data collection is higher than expected, as explained in this article, and/or occasionally delete unwanted data by leveraging the purge REST API, as explained in this article (but make sure you are purging only unwanted data, because deletes in Log Analytics are irreversible!).
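For illustration, a rough sketch of the kind of purge call that article describes, using the generic az rest command (the subscription, resource group, workspace name and cutoff date are placeholders, and you should double-check the current api-version; again, purges cannot be undone):
az rest --method post \
  --url "https://management.azure.com/subscriptions/<subscription-id>/resourceGroups/<resource-group>/providers/Microsoft.OperationalInsights/workspaces/<workspace-name>/purge?api-version=2020-08-01" \
  --body '{
    "table": "ContainerLog",
    "filters": [
      { "column": "TimeGenerated", "operator": "<", "value": "2021-01-01T00:00:00Z" }
    ]
  }'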
Hope this helps!!

Recently faced a similar problem in one of our Azure clusters. Due to some incessant logging in the code, the container logs went berserk. It is possible to restrict logging per namespace at the level of stdout or stderr.
You configure this by deploying a ConfigMap to the kube-system namespace, after which log ingestion into the Log Analytics workspace can be disabled/restricted per namespace.
The omsagent pods in the kube-system namespace will pick up the new config within a few minutes.
Download the file below and apply it to your Azure Kubernetes cluster:
container-azm-ms-agentconfig.yaml
The file contains the flags to enable/disable logging, and namespaces can be excluded in the rules; the relevant part is sketched below.
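The stdout/stderr section of that file looks roughly like this (the namespace names here are just examples; check the downloaded file for the exact schema):
kind: ConfigMap
apiVersion: v1
metadata:
  name: container-azm-ms-agentconfig
  namespace: kube-system
data:
  schema-version: v1
  config-version: ver1
  log-data-collection-settings: |-
    [log_collection_settings]
      [log_collection_settings.stdout]
        enabled = true
        # stdout from these namespaces is NOT sent to Log Analytics
        exclude_namespaces = ["kube-system", "my-noisy-namespace"]
      [log_collection_settings.stderr]
        enabled = true
        exclude_namespaces = ["kube-system", "my-noisy-namespace"]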
# kubectl apply -f <path to container-azm-ms-agentconfig.yaml>
This only prevents log collection into the Log Analytics workspace; it does not stop log generation in the individual containers.
Details on each config flag in the file are available here

Related

Named processes grafana dashboard not working

I created this dashboard by importing its ID.
Then, in order to have the necessary metrics, I used this chart to install this exporter in my EKS cluster:
helm repo add prometheus-process-exporter-charts https://raw.githubusercontent.com/mumoshu/prometheus-process-exporter/master/docs
helm install --generate-name prometheus-process-exporter-charts/prometheus-process-exporter
All the prometheus-process-exporter pods are up and running, but the only log line they have is:
2022/11/23 18:26:55 Reading metrics from /host/proc based on "/var/process-exporter/config.yml"
I was expecting to automatically have all default processes listed in the dashboard as soon as I deployed the exporter, but the dashboard still says "No data".
Do you have any ideas on why this is happening? Did I miss any step in configuring this exporter?

S3 - Kubernetes probe

I have the following situation:
The application uses S3 on Amazon to store data and is deployed as a pod in Kubernetes. Sometimes one of the developers messes with the S3 access data (e.g. user/password) and the application fails to connect to S3, but the pod starts normally and kills the previous pod version that worked OK (since all readiness and liveness probes pass). I thought of adding an S3 check to the readiness probe, executing a HeadBucketRequest on S3; if it succeeds, the application is able to connect to S3. The problem here is that these requests cost money, and I really only need them when the pod starts.
Are there any best-practices related to this one?
If you (quote) "... really need them [the probes] only on start of the pod" then look into adding a startup probe.
In addition to what startup probes usually help with - pods that take a longer time to start - a startup probe makes it possible to verify a condition only at pod startup time.
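A minimal sketch of such a probe (this assumes the AWS CLI is available in the image and credentials are already configured; the bucket name and thresholds are placeholders):
containers:
- name: my-app
  image: my-app:latest
  startupProbe:
    exec:
      command:
      - /bin/sh
      - -c
      # paid S3 call, but only executed at startup
      - aws s3api head-bucket --bucket my-bucket
    periodSeconds: 10
    failureThreshold: 6
Once the startup probe succeeds it is not run again for the lifetime of the pod, so the paid S3 requests happen only a handful of times at startup.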
Readiness and liveness probes are for checking the health of the pod or container while it is running. Your scenario is quite unusual, but readiness and liveness probes won't work well for it because they fire on an interval, which costs money.
In this case you could use a lifecycle hook:
containers:
- image: IMAGE_NAME
  lifecycle:
    postStart:
      exec:
        command: ["/bin/sh", "-c", "script.sh"]
This will run the hook when the container starts; you can keep the shell script inside the pod or image.
Inside the shell script you can write the logic: if you get a 200 response, move ahead and the container gets started. A rough sketch of such a script follows the link below.
https://kubernetes.io/docs/concepts/containers/container-lifecycle-hooks/
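As a sketch (the endpoint and the plain curl check are placeholders; in practice you would run whatever authenticated S3 check your application needs), script.sh could look something like:
#!/bin/sh
# Hypothetical S3 connectivity check run by the postStart hook.
STATUS=$(curl -s -o /dev/null -w '%{http_code}' "https://my-bucket.s3.amazonaws.com/")
if [ "$STATUS" != "200" ]; then
  echo "S3 check failed with HTTP $STATUS" >&2
  exit 1   # a failing postStart hook causes the container to be killed
fi
Keep in mind that if the postStart hook fails, Kubernetes kills the container and restarts it according to its restart policy.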

Spinnaker - SQL backend for front50

I am trying to setup SQL backend for front50 using the document below.
https://www.spinnaker.io/setup/productionize/persistence/front50-sql/
I have front50-local.yaml for the MySQL config.
But I am not sure how to disable persistent storage in the halyard config. I cannot completely remove persistentStorage, and persistentStoreType must be one of a3, azs, gcs, redis, s3, oracle.
There is no option to disable persistent storage here.
persistentStorage:
  persistentStoreType: s3
  azs: {}
  gcs:
    rootFolder: front50
  redis: {}
  s3:
    bucket: spinnaker
    rootFolder: front50
    maxKeys: 1000
  oracle: {}
So within your front50-local.yaml you will want to disable the service you previously used, e.g.:
spinnaker:
  gcs:
    enabled: false
  s3:
    enabled: false
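For completeness, the SQL side of front50-local.yaml looks roughly like the snippet below, based on the linked Spinnaker doc (the host, database, user names and password handling are placeholders; double-check the keys against the doc for your Spinnaker version):
sql:
  enabled: true
  connectionPools:
    default:
      default: true
      jdbcUrl: jdbc:mysql://your-db-host:3306/front50
      user: front50_service
      # password: supply via your secrets mechanism
  migration:
    user: front50_migrate
    jdbcUrl: jdbc:mysql://your-db-host:3306/front50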
You may need/want to remove the section from your halconfig and run your apply with
hal deploy apply --no-validate
There are a number of users dealing with these same issues and some more help might be found on the Slack: https://join.spinnaker.io/
I've noticed the same issue just recently. Maybe this is because, for example, Kayenta (an optional component you can enable) is still missing non-object-storage persistence support, or...
I've created a GitHub issue on this here: https://github.com/spinnaker/spinnaker/issues/5447

Managing the health and well being of multiple pods with dependencies

We have several pods (as service/deployments) in our k8s workflow that are dependent on each other, such that if one goes into a CrashLoopBackOff state, then all these services need to be redeployed.
Instead of having to do this manually, is there a programmatic way of handling it?
Of course we are trying to figure out why the pod in question is crashing.
If these are so tightly dependent on each other, I would consider these options:
a) Rearchitect your system to be more resilient to failure and tolerate a pod being temporarily unavailable
b) Put all parts into one pod as separate containers, making the atomic design more explicit
If these don't fit your needs, you can use the Kubernetes API to create a program that automates the task of restarting all dependent parts. There are client libraries for multiple languages and integration is quite easy. The next step would be a custom resource definition (CRD) so you can manage your own system using an extension to the Kubernetes API.
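If a full controller feels like overkill, even a small script around kubectl can do the job; a rough sketch (the deployment names and namespace are placeholders):
# Trigger a rolling restart of every deployment in the dependency group.
for d in frontend backend worker; do
  kubectl rollout restart deployment/"$d" -n my-namespace
done
# Optionally wait for each rollout to finish.
for d in frontend backend worker; do
  kubectl rollout status deployment/"$d" -n my-namespace
done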
The first thing to do is make sure that the pods are started in the correct sequence. This can be done using initContainers, like this:
spec:
  initContainers:
  - name: waitfor
    image: jwilder/dockerize
    args:
    - -wait
    - "http://config-srv/actuator/health"
    - -wait
    - "http://registry-srv/actuator/health"
    - -wait
    - "http://rabbitmq:15672"
    - -timeout
    - 600s
Here your pod will not start until all the services in the list are responding to HTTP probes.
Next, you may want to define a liveness probe that periodically runs curl against the same services:
spec:
  containers:
  - livenessProbe:   # liveness probes are defined per container
      exec:
        command:
        - /bin/sh
        - -c
        - curl http://config-srv/actuator/health &&
          curl http://registry-srv/actuator/health &&
          curl http://rabbitmq:15672
Now if any of those services fails, your pod will fail the liveness probe, be restarted, and wait for the services to come back online.
That's just an example of how it can be done. In your case the checks may of course be different.

How can I configure OpenShift to find my RabbitMQ definitions.json?

I am experiencing this problem or a similar one:
https://access.redhat.com/solutions/2374351 (RabbitMQ users and its permission are deleted after resource restart)
But the proposed fix is not public.
I would like to have a user name & password hash pair which can survive a complete crash.
I am not sure how to programmatically upload or define definitions.json with OpenShift templates. I can upload definitions.json to /var/lib/rabbitmq/etc/rabbitmq/definitions.json with WinSCP.
If my definitions.json is uploaded by hand, the user names and hashes are reloaded after a crash. However, I don't want to upload it by hand; I would like to configure OpenShift and save that configuration.
My only idea is to try to access one OpenShift ConfigMap from another.
I have two config maps:
plattform-rabbitmq-configmap
definitions.json
I want the ConfigMap plattform-rabbitmq-configmap to reference the ConfigMap definitions.json.
plattform-rabbitmq-configmap contains my rabbitmq.config. In plattform-rabbitmq-configmap I want to access or load definitions.json.
Using the oc get configmaps command I got a selflink for definitions.json. Using the selflink I try to load the definitions.json as follows (in plattform-rabbitmq-configmap):
{load_definitions, "/api/v1/namespaces/my-app/configmaps/definitions.json"}
But that doesn't work:
=INFO REPORT==== 15-Mar-2018::15:08:40 ===
application: rabbit
exited: {bad_return,
{{rabbit,start,[normal,[]]},
{'EXIT',
{error,
{could_not_read_defs,
{"/api/v1/namespaces/my-app/configmaps/definitions.json",
enoent}}}}}}
type: transient
Is there any way to do this? Or another way?
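For what it's worth, the enoent error means load_definitions expects a filesystem path inside the container, and /api/v1/... is a REST API path rather than a file. A common approach (sketched below with placeholder deployment/container names, and assuming the definitions.json ConfigMap stores its content under a definitions.json key) is to mount that ConfigMap as a volume at the path mentioned above, so the definitions survive restarts without manual uploads:
spec:
  template:
    spec:
      containers:
      - name: rabbitmq
        volumeMounts:
        - name: rabbitmq-definitions
          mountPath: /var/lib/rabbitmq/etc/rabbitmq/definitions.json
          subPath: definitions.json
      volumes:
      - name: rabbitmq-definitions
        configMap:
          name: definitions.json   # the ConfigMap holding the definitions
Then rabbitmq.config (in plattform-rabbitmq-configmap) can point at that path:
{load_definitions, "/var/lib/rabbitmq/etc/rabbitmq/definitions.json"}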