I have an instance of MuleSoft's Flex Gateway (v 1.2.0) installed on a Linux machine in a podman container. I am trying to forward container as well as API logs to Splunk. Below is my log.yaml file in /home/username/app folder. Not sure what I am doing wrong, but the logs are not getting forwarded to Splunk.
apiVersion: gateway.mulesoft.com/v1alpha1
kind: Configuration
metadata:
name: logging-config
spec:
logging:
outputs:
- name: default
type: splunk
parameters:
host: <instance-name>.splunkcloud.com
port: "443"
splunk_token: xxxxx-xxxxx-xxxx-xxxx
tls: "on"
tls.verify: "off"
splunk_send_raw: "on"
runtimeLogs:
logLevel: info
outputs:
- default
accessLogs:
outputs:
- default
Please advise.
The endpoint for Splunk's HTTP Event Collector (HEC) is https://http-input.<instance-name>.splunkcloud.com:443/services/collector/raw. If you're using a free trial of Splunk Cloud then change the port number to 8088. See https://docs.splunk.com/Documentation/Splunk/latest/Data/UsetheHTTPEventCollector#Send_data_to_HTTP_Event_Collector_on_Splunk_Cloud_Platform for details.
I managed to get this work. The issue was that I had to give full permissions to the app folder using "chmod" command. After it was done, the fluent-bit.conf file had an entry for Splunk and logs started flowing.
Related
If I understand right from Apply Pod Security Standards at the Cluster Level, in order to have a PSS (Pod Security Standard) as default for the whole cluster I need to create an AdmissionConfiguration in a file that the API server needs to consume during cluster creation.
I don't see any way to configure / provide the AdmissionConfiguration at CreateCluster , also I'm not sure how to provide this AdmissionConfiguration in a managed EKS node.
From the tutorials that use KinD or minikube it seems that the AdmissionConfiguration must be in a file that is referenced in the cluster-config.yaml, but if I'm not mistaken the EKS API server is managed and does not allow to change or even see this file.
The GitHub issue aws/container-roadmap Allow Access to AdmissionConfiguration seems to suggest that currently there is no possibility of providing AdmissionConfiguration at creation, but on the other hand aws-eks-best-practices says These exemptions are applied statically in the PSA admission controller configuration as part of the API server configuration
so, is there a way to provide PodSecurityConfiguration for the whole cluster in EKS? or I'm forced to just use per-namespace labels?
See also Enforce Pod Security Standards by Configuration the Built-in Admission Controller and EKS Best practices PSS and PSA
I don't think there is any way currently in EKS to provide configuration for the built-in PSA controller (Pod Security Admission controller).
But if you want to implement a cluster-wide default for PSS (Pod Security Standards) you can do that by installing the the official pod-security-webhook as a Dynamic Admission Controller in EKS.
git clone https://github.com/kubernetes/pod-security-admission
cd pod-security-admission/webhook
make certs
kubectl apply -k .
The default podsecurityconfiguration.yaml in pod-security-admission/webhook/manifests/020-configmap.yaml allows EVERYTHING so you should edit it and write something like
apiVersion: v1
kind: ConfigMap
metadata:
name: pod-security-webhook
namespace: pod-security-webhook
data:
podsecurityconfiguration.yaml: |
apiVersion: pod-security.admission.config.k8s.io/v1beta1
kind: PodSecurityConfiguration
defaults:
enforce: "restricted"
enforce-version: "latest"
audit: "restricted"
audit-version: "latest"
warn: "restricted"
warn-version: "latest"
exemptions:
# Array of authenticated usernames to exempt.
usernames: []
# Array of runtime class names to exempt.
runtimeClasses: []
# Array of namespaces to exempt.
namespaces: ["policy-test2"]
then
kubectl apply -k .
kubectl -n pod-security-webhook rollout restart deployment/pod-security-webhook # otherwise the pods won't reread the configuration changes
After those changes you can verify that the default forbids privileged pods with:
kubectl --context aihub-eks-terraform create ns policy-test1
kubectl --context aihub-eks-terraform -n policy-test1 run --image=ecerulm/ubuntu-tools:latest --rm -ti rubelagu-$RANDOM --privileged
Error from server (Forbidden): admission webhook "pod-security-webhook.kubernetes.io" denied the request: pods "rubelagu-32081" is forbidden: violates PodSecurity "restricted:latest": privileged (container "rubelagu-32081" must not set securityContext.privileged=true), allowPrivilegeEscalation != false (container "rubelagu-32081" must set securityContext.allowPrivilegeEscalation=false), unrestricted capabilities (container "rubelagu-32081" must set securityContext.capabilities.drop=["ALL"]), runAsNonRoot != true (pod or container "rubelagu-32081" must set securityContext.runAsNonRoot=true), seccompProfile (pod or container "rubelagu-32081" must set securityContext.seccompProfile.type to "RuntimeDefault" or "Localhost")
Note: that you get the error forbidding privileged pods even when the namespace policy-test1 has no label pod-security.kubernetes.io/enforce, so you know that this rule comes from the pod-security-webhook that we just installed and configured.
Now if you want to create a pod you will be forced to create in a way that complies with the restricted PSS, by specifying runAsNonRoot, seccompProfile.type and capabilities and For example:
apiVersion: v1
kind: Pod
metadata:
name: test-1
spec:
restartPolicy: Never
securityContext:
runAsNonRoot: true
runAsUser: 1000
runAsGroup: 3000
fsGroup: 2000
seccompProfile:
type: RuntimeDefault
containers:
- name: test
image: ecerulm/ubuntu-tools:latest
imagePullPolicy: Always
command: ["/bin/bash", "-c", "--", "sleep 900"]
securityContext:
privileged: false
allowPrivilegeEscalation: false
capabilities:
drop:
- ALL
I have a rancher 2.6.67 server and RKE2 downstream cluster. The cluster was created without authorized cluster endpoint. How to add an authorised cluster endpoint to a RKE2 cluster created by Rancher article describes how to add it in an existing cluster, however although the answer looks promising, I still must miss some detail, because it does not work for me.
Here is what I did:
Created /var/lib/rancher/rke2/kube-api-authn-webhook.yaml file with contents:
apiVersion: v1
kind: Config
clusters:
- name: Default
cluster:
insecure-skip-tls-verify: true
server: http://127.0.0.1:6440/v1/authenticate
users:
- name: Default
user:
insecure-skip-tls-verify: true
current-context: webhook
contexts:
- name: webhook
context:
user: Default
cluster: Default
and added
"kube-apiserver-arg": [
"authentication-token-webhook-config-file=/var/lib/rancher/rke2/kube-api-authn-webhook.yaml"
to the /etc/rancher/rke2/config.yaml.d/50-rancher.yaml file.
After restarting rke2-server I found the network configuration tab in Rancher and was able to enable authorized endpoint. Here is where my success ends.
I tried to create a serviceaccount and got the secret to have token authorization, but it failed when connecting directly to the api endpoint on the master.
kube-api-auth pod logs this:
time="2022-10-06T08:42:27Z" level=error msg="found 1 parts of token"
time="2022-10-06T08:42:27Z" level=info msg="Processing v1Authenticate request..."
Also the log is full of messages like this:
E1006 09:04:07.868108 1 reflector.go:139] pkg/mod/github.com/rancher/client-go#v1.22.3-rancher.1/tools/cache/reflector.go:168: Failed to watch *v3.ClusterAuthToken: failed to list *v3.ClusterAuthToken: the server could not find the requested resource (get clusterauthtokens.meta.k8s.io)
E1006 09:04:40.778350 1 reflector.go:139] pkg/mod/github.com/rancher/client-go#v1.22.3-rancher.1/tools/cache/reflector.go:168: Failed to watch *v3.ClusterAuthToken: failed to list *v3.ClusterAuthToken: the server could not find the requested resource (get clusterauthtokens.meta.k8s.io)
E1006 09:04:45.171554 1 reflector.go:139] pkg/mod/github.com/rancher/client-go#v1.22.3-rancher.1/tools/cache/reflector.go:168: Failed to watch *v3.ClusterUserAttribute: failed to list *v3.ClusterUserAttribute: the server could not find the requested resource (get clusteruserattributes.meta.k8s.io)
I found that SA tokens will not work this way so I tried to use a rancher user token, but that fails as well:
time="2022-10-06T08:37:34Z" level=info msg=" ...looking up token for kubeconfig-user-qq9nrc86vv"
time="2022-10-06T08:37:34Z" level=error msg="clusterauthtokens.cluster.cattle.io \"cattle-system/kubeconfig-user-qq9nrc86vv\" not found"
Checking the cattle-system namespace, there are no SA and secret entries corresponding to the users created in rancher, however I found SA and secret entries related in cattle-impersonation-system.
I tried creating a new user, but that too, only resulted in new entries in cattle-impersonation-system namespace, so I presume kube-api-auth wrongly assumes the location of the secrets to be cattle-system namespace.
Now the questions:
Can I authenticate with downstream RKE2 cluster using normal SA tokens (not ones created through Rancher server)? If so, how?
What did I do wrong about adding the webhook authentication configuration? How to make it work?
I noticed, that since I made the modifications described above, I cannot download the kubeconfig file from the rancher UI for this cluster. What went wrong there?
Thanks in advance for any advice.
I'm running a .dotnet 3.1 RestAPI inside a pod in Openshift, and it process every request smoothly - all transactions to the database (outside the Openshift network) are executed properly, all programatic executions are being finalized without errors. However, 1 in 15 requests will always ECONNRESET and fail to return the HTTP request.
Let's say I make a GET to /users/id/3 - I can see this request hitting my restAPI, being processed all the way down the infra layer, fetch data from the DB, wrap the return, and send it back finishing the request, however at this point, no return is received by the frontend, or postman, but I can see on the API logs that the request was finished and returned.
All of this requests take 2.3min to execute, and often finish in a ECONNRESET. I'm at odds at how to troubleshoot this. I have tried curl'ing the resource in another pod and the same behaviour appears.
I think these requests sometimes are getting lost in the cluster network, so I tried playing with the sessionaffinity of the service config but it's not really tied to this, as far as I understood. Do I have a wrong route config, or service config?
Route config
spec:
host: api.com.cloud
to:
kind: Service
name: api
weight: 100
port:
targetPort: 8080-tcp
tls:
termination: edge
wildcardPolicy: None
status:
ingress:
- host: api.com.cloud
routerName: default
conditions:
- type: Admitted
status: 'True'
lastTransitionTime: XXXX
wildcardPolicy: None
routerCanonicalHostname: router-default.apps.com.cloud
Service config
spec:
clusterIP: XXXX
ipFamilies:
- IPv4
ports:
- name: 8080-tcp
protocol: TCP
port: 8080
targetPort: 8080
internalTrafficPolicy: Cluster
clusterIPs:
- XXX
type: ClusterIP
ipFamilyPolicy: SingleStack
sessionAffinity: None
selector:
deploymentconfig: api
status:
loadBalancer: {}
I'm trying to apply rate limiting on some of our internal services (inside the mesh).
I used the example from the docs and generated redis rate limiting configurations that include a (redis) handler, quota instance, quota spec, quota spec binding and rule to apply the handler.
This redis handler:
apiVersion: config.istio.io/v1alpha2
kind: handler
metadata:
name: redishandler
namespace: istio-system
spec:
compiledAdapter: redisquota
params:
redisServerUrl: <REDIS>:6379
connectionPoolSize: 10
quotas:
- name: requestcountquota.instance.istio-system
maxAmount: 10
validDuration: 100s
rateLimitAlgorithm: FIXED_WINDOW
overrides:
- dimensions:
destination: s1
maxAmount: 1
- dimensions:
destination: s3
maxAmount: 1
- dimensions:
destination: s2
maxAmount: 1
The quota instance (I'm only interested in limiting by destination at the moment):
apiVersion: config.istio.io/v1alpha2
kind: instance
metadata:
name: requestcountquota
namespace: istio-system
spec:
compiledTemplate: quota
params:
dimensions:
destination: destination.labels["app"] | destination.service.host | "unknown"
A quota spec, charging 1 per request if I understand correctly:
apiVersion: config.istio.io/v1alpha2
kind: QuotaSpec
metadata:
name: request-count
namespace: istio-system
spec:
rules:
- quotas:
- charge: 1
quota: requestcountquota
A quota binding spec that all participating services pre-fetch. I also tried with service: "*" which also did nothing.
apiVersion: config.istio.io/v1alpha2
kind: QuotaSpecBinding
metadata:
name: request-count
namespace: istio-system
spec:
quotaSpecs:
- name: request-count
namespace: istio-system
services:
- name: s2
namespace: default
- name: s3
namespace: default
- name: s1
namespace: default
# - service: '*' # Uncomment this to bind *all* services to request-count
A rule to apply the handler. Currently on all occasions (tried with matches but didn't change anything as well):
apiVersion: config.istio.io/v1alpha2
kind: rule
metadata:
name: quota
namespace: istio-system
spec:
actions:
- handler: redishandler
instances:
- requestcountquota
The VirtualService definitions are pretty similar for all participants:
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
name: s1
spec:
hosts:
- s1
http:
- route:
- destination:
host: s1
The problem is nothing really happens and no rate limiting takes place. I tested with curl from pods inside the mesh. The redis instance is empty (no keys on db 0, which I assume is what the rate limiting would use) so I know it can't practically rate-limit anything.
The handler seems to be configured properly (how can I make sure?) because I had some errors in it which were reported in mixer (policy). There are still some errors but none which I associate to this problem or the configuration. The only line in which redis handler is mentioned is this:
2019-12-17T13:44:22.958041Z info adapters adapter closed all scheduled daemons and workers {"adapter": "redishandler.istio-system"}
But its unclear if its a problem or not. I assume its not.
These are the rest of the lines from the reload once I deploy:
2019-12-17T13:44:22.601644Z info Built new config.Snapshot: id='43'
2019-12-17T13:44:22.601866Z info adapters getting kubeconfig from: "" {"adapter": "kubernetesenv.istio-system"}
2019-12-17T13:44:22.601881Z warn Neither --kubeconfig nor --master was specified. Using the inClusterConfig. This might not work.
2019-12-17T13:44:22.602718Z info adapters Waiting for kubernetes cache sync... {"adapter": "kubernetesenv.istio-system"}
2019-12-17T13:44:22.903844Z info adapters Cache sync successful. {"adapter": "kubernetesenv.istio-system"}
2019-12-17T13:44:22.903878Z info adapters getting kubeconfig from: "" {"adapter": "kubernetesenv.istio-system"}
2019-12-17T13:44:22.903882Z warn Neither --kubeconfig nor --master was specified. Using the inClusterConfig. This might not work.
2019-12-17T13:44:22.904808Z info Setting up event handlers
2019-12-17T13:44:22.904939Z info Starting Secrets controller
2019-12-17T13:44:22.904991Z info Waiting for informer caches to sync
2019-12-17T13:44:22.957893Z info Cleaning up handler table, with config ID:42
2019-12-17T13:44:22.957924Z info adapters deleted remote controller {"adapter": "kubernetesenv.istio-system"}
2019-12-17T13:44:22.957999Z info adapters adapter closed all scheduled daemons and workers {"adapter": "prometheus.istio-system"}
2019-12-17T13:44:22.958041Z info adapters adapter closed all scheduled daemons and workers {"adapter": "redishandler.istio-system"}
2019-12-17T13:44:22.958065Z info adapters shutting down daemon... {"adapter": "kubernetesenv.istio-system"}
2019-12-17T13:44:22.958050Z info adapters shutting down daemon... {"adapter": "kubernetesenv.istio-system"}
2019-12-17T13:44:22.958096Z info adapters shutting down daemon... {"adapter": "kubernetesenv.istio-system"}
2019-12-17T13:44:22.958182Z info adapters shutting down daemon... {"adapter": "kubernetesenv.istio-system"}
2019-12-17T13:44:23.958109Z info adapters adapter closed all scheduled daemons and workers {"adapter": "kubernetesenv.istio-system"}
2019-12-17T13:55:21.042131Z info transport: loopyWriter.run returning. connection error: desc = "transport is closing"
2019-12-17T14:14:00.265722Z info transport: loopyWriter.run returning. connection error: desc = "transport is closing"
I'm using the demo profile with disablePolicyChecks: false to enable rate limiting. This is on istio 1.4.0, deployed on EKS.
I also tried memquota (this is our staging environment) with low limits and nothing seems to work. I never got a 429 no matter how much I went over the rate limit configured.
I don't know how to debug this and see where the configuration is wrong causing it to do nothing.
Any help is appreciated.
I too spent hours trying to decipher the documentation and get a sample working.
According to the documentation, they recommended that we enable policy checks:
https://istio.io/docs/tasks/policy-enforcement/rate-limiting/
However when that did not work, I did an "istioctl profile dump", searched for policy, and tried several settings.
I used Helm install and passed the following and then was able to get the described behaviour:
--set global.disablePolicyChecks=false \
--set values.pilot.policy.enabled=true \ ===> this made it work, but it's not in the docs.
I'm having a very hard time to get two WildFly swarm apps (based on 2017.9.5 version) communicate with each other over a standalone ActiveMQ 5.14.3 broker. All done using YAML config as I can't have a main method in my case.
after reading hundreds of outdated examples and inaccurate pages of documentation, I settled with following settings for both producer and consumer apps:
swarm:
messaging-activemq:
servers:
default:
jms-topics:
domain-events: {}
messaging:
remote:
name: remote-mq
host: localhost
port: 61616
jndi-name: java:/jms/remote-mq
remote: true
Now it seems that at least part of the setting is correct as the apps start except for following warning:
2017-09-16 14:20:04,385 WARN [org.jboss.activemq.artemis.wildfly.integration.recovery] (MSC service thread 1-2) AMQ122018: Could not start recovery discovery on XARecoveryConfig [transportConfiguration=[TransportConfiguration(name=, factory=org-apache-activemq-artemis-core-remoting-impl-netty-NettyConnectorFactory) ?port=61616&localAddress=::&host=localhost], discoveryConfiguration=null, username=null, password=****, JNDI_NAME=java:/jms/remote-mq], we will retry every recovery scan until the server is available
Also when producer tries to send messages it just times out and I get following exception (just the last part):
Caused by: javax.jms.JMSException: Failed to create session factory
at org.apache.activemq.artemis.jms.client.ActiveMQConnectionFactory.createConnectionInternal(ActiveMQConnectionFactory.java:727)
at org.apache.activemq.artemis.jms.client.ActiveMQConnectionFactory.createXAConnection(ActiveMQConnectionFactory.java:304)
at org.apache.activemq.artemis.jms.client.ActiveMQConnectionFactory.createXAConnection(ActiveMQConnectionFactory.java:300)
at org.apache.activemq.artemis.ra.ActiveMQRAManagedConnection.setup(ActiveMQRAManagedConnection.java:785)
... 127 more
Caused by: ActiveMQConnectionTimedOutException[errorType=CONNECTION_TIMEDOUT message=AMQ119013: Timed out waiting to receive cluster topology. Group:null]
at org.apache.activemq.artemis.core.client.impl.ServerLocatorImpl.createSessionFactory(ServerLocatorImpl.java:797)
at org.apache.activemq.artemis.jms.client.ActiveMQConnectionFactory.createConnectionInternal(ActiveMQConnectionFactory.java:724)
... 130 more
I suspect that the problem is ActiveMQ has security turned on, but I found no place to give username and password to swarm config.
The ActiveMQ instance is running using Docker and following compose file:
version: '2'
services:
activemq:
image: webcenter/activemq
environment:
- ACTIVEMQ_NAME=amqp-srv1
- ACTIVEMQ_REMOVE_DEFAULT_ACCOUNT=true
- ACTIVEMQ_ADMIN_LOGIN=admin
- ACTIVEMQ_ADMIN_PASSWORD=your_password
- ACTIVEMQ_WRITE_LOGIN=producer_login
- ACTIVEMQ_WRITE_PASSWORD=producer_password
- ACTIVEMQ_READ_LOGIN=consumer_login
- ACTIVEMQ_READ_PASSWORD=consumer_password
- ACTIVEMQ_JMX_LOGIN=jmx_login
- ACTIVEMQ_JMX_PASSWORD=jmx_password
- ACTIVEMQ_MIN_MEMORY=1024
- ACTIVEMQ_MAX_MEMORY=4096
- ACTIVEMQ_ENABLED_SCHEDULER=true
ports:
- "1883:1883"
- "5672:5672"
- "8161:8161"
- "61616:61616"
- "61613:61613"
- "61614:61614"
any idea what's going wrong?
I had bad times trying to get it working too. The following YML solved my problem:
swarm:
network:
socket-binding-groups:
standard-sockets:
outbound-socket-bindings:
myapp-socket-binding:
remote-host: localhost
remote-port: 61616
messaging-activemq:
servers:
default:
remote-connectors:
myapp-connector:
socket-binding: myapp-socket-binding
pooled-connection-factories:
myAppRemote:
user: username
password: password
connectors:
- myapp-connector
entries:
- 'java:/jms/remote-mq'