RKE2 Authorized endpoint configuration help required - authentication

I have a Rancher 2.6.67 server and an RKE2 downstream cluster. The cluster was created without an authorized cluster endpoint. The article "How to add an authorised cluster endpoint to a RKE2 cluster created by Rancher" describes how to add one to an existing cluster; although the answer looks promising, I must still be missing some detail, because it does not work for me.
Here is what I did:
Created the /var/lib/rancher/rke2/kube-api-authn-webhook.yaml file with the following contents:
apiVersion: v1
kind: Config
clusters:
- name: Default
  cluster:
    insecure-skip-tls-verify: true
    server: http://127.0.0.1:6440/v1/authenticate
users:
- name: Default
  user:
    insecure-skip-tls-verify: true
current-context: webhook
contexts:
- name: webhook
  context:
    user: Default
    cluster: Default
and added
"kube-apiserver-arg": [
"authentication-token-webhook-config-file=/var/lib/rancher/rke2/kube-api-authn-webhook.yaml"
to the /etc/rancher/rke2/config.yaml.d/50-rancher.yaml file.
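A quick way to confirm the flag actually reached the apiserver after the restart (a sketch assuming the default RKE2 paths):

ps aux | grep '[k]ube-apiserver' | tr ' ' '\n' | grep authentication-token-webhook-config-file
grep authentication-token-webhook-config-file /var/lib/rancher/rke2/agent/pod-manifests/kube-apiserver.yaml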
After restarting rke2-server I found the network configuration tab in Rancher and was able to enable the authorized endpoint. Here is where my success ends.
I created a ServiceAccount and retrieved its secret so I would have a token to authenticate with, but connecting directly to the API endpoint on the master with that token failed.
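For reference, this is roughly the kind of direct test I mean (the ServiceAccount name and master address are placeholders):

kubectl -n default create serviceaccount ace-test
TOKEN=$(kubectl -n default create token ace-test)   # Kubernetes 1.24+; on older clusters read the token from the SA's secret instead
curl -k -H "Authorization: Bearer $TOKEN" https://<master-ip>:6443/version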
kube-api-auth pod logs this:
time="2022-10-06T08:42:27Z" level=error msg="found 1 parts of token"
time="2022-10-06T08:42:27Z" level=info msg="Processing v1Authenticate request..."
Also the log is full of messages like this:
E1006 09:04:07.868108 1 reflector.go:139] pkg/mod/github.com/rancher/client-go#v1.22.3-rancher.1/tools/cache/reflector.go:168: Failed to watch *v3.ClusterAuthToken: failed to list *v3.ClusterAuthToken: the server could not find the requested resource (get clusterauthtokens.meta.k8s.io)
E1006 09:04:40.778350 1 reflector.go:139] pkg/mod/github.com/rancher/client-go#v1.22.3-rancher.1/tools/cache/reflector.go:168: Failed to watch *v3.ClusterAuthToken: failed to list *v3.ClusterAuthToken: the server could not find the requested resource (get clusterauthtokens.meta.k8s.io)
E1006 09:04:45.171554 1 reflector.go:139] pkg/mod/github.com/rancher/client-go#v1.22.3-rancher.1/tools/cache/reflector.go:168: Failed to watch *v3.ClusterUserAttribute: failed to list *v3.ClusterUserAttribute: the server could not find the requested resource (get clusteruserattributes.meta.k8s.io)
I found that SA tokens will not work this way, so I tried to use a Rancher user token, but that fails as well:
time="2022-10-06T08:37:34Z" level=info msg=" ...looking up token for kubeconfig-user-qq9nrc86vv"
time="2022-10-06T08:37:34Z" level=error msg="clusterauthtokens.cluster.cattle.io \"cattle-system/kubeconfig-user-qq9nrc86vv\" not found"
Checking the cattle-system namespace, there are no ServiceAccount and Secret entries corresponding to the users created in Rancher; however, I found related ServiceAccount and Secret entries in cattle-impersonation-system.
I tried creating a new user, but that, too, only resulted in new entries in the cattle-impersonation-system namespace, so I presume kube-api-auth wrongly assumes the secrets live in the cattle-system namespace.
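A hedged way to check whether the objects kube-api-auth is looking for exist at all on the downstream cluster (resource names taken from the errors above):

kubectl get crd | grep -E 'clusterauthtokens|clusteruserattributes'
kubectl -n cattle-system get clusterauthtokens.cluster.cattle.io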
Now the questions:
Can I authenticate against the downstream RKE2 cluster using normal SA tokens (not ones created through the Rancher server)? If so, how?
What did I do wrong when adding the webhook authentication configuration? How do I make it work?
I noticed that since I made the modifications described above, I can no longer download the kubeconfig file from the Rancher UI for this cluster. What went wrong there?
Thanks in advance for any advice.

Related

Health Check on Fabric CA

I have a hyperledger fabric network v2.2.0 deployed with 2 peer orgs and an orderer org in a kubernetes cluster. Each org has its own CA server. The CA pod keeps on restarting sometimes. In order to know whether the service of the CA server is reachable or not, I am trying to use the healthz API on port 9443.
I have used the livenessProbe condition in the CA deployment like so:
livenessProbe:
  failureThreshold: 3
  httpGet:
    path: /healthz
    port: 9443
    scheme: HTTP
  initialDelaySeconds: 10
  periodSeconds: 10
  successThreshold: 1
  timeoutSeconds: 1
After configuring this liveness probe, the pod keeps on restarting with the event Liveness probe failed: HTTP probe failed with status code: 400. Why might this be happening?
HTTP 400 code:
The HTTP 400 Bad Request response status code indicates that the server cannot or will not process the request due to something that is perceived to be a client error (for example, malformed request syntax, invalid request message framing, or deceptive request routing).
This indicates that Kubernetes is sending the data in a way hyperledger is rejecting, but without more information it is hard to say where the problem is. Some quick checks to start with:
Send some GET requests directly to the hyperledger /healthz resource yourself (a curl sketch follows below). What do you get? You should get back either a 200 "OK" if everything is functioning, or a 503 "Service Unavailable" with details of which nodes are down (docs).
kubectl describe pod liveness-request. You should see a few lines towards the bottom describing the state of the liveness probe in more detail:
Restart Count: 0
.
.
.
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled <unknown> default-scheduler Successfully assigned example-dc/liveness-request to dcpoz-d-sou-k8swor3
Normal Pulling 4m45s kubelet, dcpoz-d-sou-k8swor3 Pulling image "nginx"
Normal Pulled 4m42s kubelet, dcpoz-d-sou-k8swor3 Successfully pulled image "nginx"
Normal Created 4m42s kubelet, dcpoz-d-sou-k8swor3 Created container liveness
Normal Started 4m42s kubelet, dcpoz-d-sou-k8swor3 Started container liveness
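Regarding the first check, a minimal curl sketch (host and port are placeholders for wherever the CA's operations service listens):

# expect a 200 when everything is healthy, or a 503 listing the failing components
curl -v http://<ca-pod-ip>:9443/healthz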
Some other things to investigate:
httpGet options that might be helpful:
scheme – Protocol type HTTP or HTTPS
httpHeaders – Custom headers to set in the request
Have you configured the operations service?
You may need a valid client certificate (if TLS is enabled, and clientAuthRequired is set to true).
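If that is the case, a manual probe needs the client certificate as well; a sketch with placeholder file names:

curl -v --cacert ca.pem --cert client.pem --key client-key.pem https://<ca-pod-ip>:9443/healthz

Note that a kubelet httpGet probe cannot present a client certificate, so with clientAuthRequired enabled you may need an exec-based probe instead (or client auth disabled on the operations listener).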

Why can I read ksqldb streams but not topics within ksql client?

I am testing ksqldb on AWS EC2 instances with the latest release (Confluent 5.5.1) and have an access problem that I can't solve.
I have a secured Kafka server (SASL_SSL, SASL mechanism PLAIN), an unsecured Schema Registry (another issue with Avro serializers, but OK for the moment), and a secured KSQL server and client.
Topics are filled properly with AVRO data (value only, no key) from a JDBC source connector.
I can access the KSQL Server with ksql without issues
I can access KSQL REST API without issues
When I list topics within ksql, I get the correct list.
When I run a push query on a stream, I get messages whenever something is pushed into the topic (by Kafka Connect, in my case).
BUT: when I call "print topic", the client blocks for about 60 seconds and then reports 'Timeout expired while fetching topic metadata'.
The ksql-kafka.log goes wild with repeated entries like
[2020-09-02 18:52:46,246] WARN [Consumer clientId=consumer-2, groupId=null] Bootstrap broker ip-10-1-2-10.eu-central-1.compute.internal:9093 (id: -3 rack: null) disconnected (org.apache.kafka.clients.NetworkClient:1037)
The corresponding broker log shows
Sep 2 18:52:44 ip-10-1-6-11 kafka-server-start: [2020-09-02 18:52:44,704] INFO [SocketServer brokerId=1002] Failed authentication with ip-10-1-2-231.eu-central-1.compute.internal/10.1.2.231 (Unexpected Kafka request of type METADATA during SASL handshake.) (org.apache.kafka.common.network.Selector)
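(The broker message suggests a client that is not performing the SASL handshake at all. As a cross-check outside of ksql, a sketch: an admin client pointed at the same listener should connect with an explicitly SASL_SSL-configured properties file, and should reproduce the error with a plain-SSL one; the config path is a placeholder.)

kafka-topics --bootstrap-server ip-10-1-2-10.eu-central-1.compute.internal:9093 \
  --command-config /path/to/sasl-ssl-client.properties --list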
This is my ksql-server.properties file:
ksql.service.id= hf_kafka_ksql_001
bootstrap.servers=ip-10-1-11-229.eu-central-1.compute.internal:9093,ip-10-1-6-11.eu-central-1.compute.internal:9093,ip-10-1-2-10.eu-central-1.compute.internal:9093
ksql.streams.state.dir=/var/data/ksqldb
ksql.schema.registry.url=http://ip-10-1-1-22.eu-central-1.compute.internal:8081
ksql.output.topic.name.prefix=ksql-interactive-
ksql.internal.topic.replicas=3
confluent.support.metrics.enable=false
# currently the keystore contains only the ksql server and the certificate chain to the CA
ssl.keystore.location=/var/kafka-ssl/ksql.keystore.jks
ssl.keystore.password=kspassword
ssl.key.password=kspassword
ssl.client.auth=true
# Need to set this to empty, otherwise the REST API is not accessible with the client key.
ssl.endpoint.identification.algorithm=
# currently the truststore contains only the CA certificate
ssl.truststore.location=/var/kafka-ssl/client.truststore.jks
ssl.truststore.password=ctpassword
security.protocol=SASL_SSL
sasl.mechanism=PLAIN
sasl.jaas.config=org.apache.kafka.common.security.plain.PlainLoginModule required \
username="ksql" \
password="ksqlsecret";
listeners=https://0.0.0.0:8088
advertised.listener=https://ip-10-1-2-231.eu-central-1.compute.internal:8088
authentication.method=BASIC
authentication.roles=admin,ksql,cli
authentication.realm=KsqlServerProps
# authentication for producers, needed for ksql commands like "Create Stream"
producer.ssl.endpoint.identification.algorithm=HTTPS
producer.security.protocol=SASL_SSL
producer.sasl.mechanism=PLAIN
producer.ssl.truststore.location=/var/kafka-ssl/client.truststore.jks
producer.ssl.truststore.password=ctpassword
producer.sasl.mechanism=PLAIN
producer.sasl.jaas.config=org.apache.kafka.common.security.plain.PlainLoginModule required \
username="ksql" \
password="ksqlsecret";
# authentication for consumers, needed for ksql commands like "Create Stream"
consumer.ssl.endpoint.identification.algorithm=HTTPS
consumer.security.protocol=SASL_SSL
consumer.ssl.truststore.location=/var/kafka-ssl/client.truststore.jks
consumer.ssl.truststore.password=ctpassword
consumer.sasl.mechanism=PLAIN
consumer.sasl.jaas.config=org.apache.kafka.common.security.plain.PlainLoginModule required \
username="ksql" \
password="ksqlsecret";
I call ksql with
ksql --user cli --password test --config-file /var/kafka-ssl/ksql_cli.properties https://ip-10-1-2-231.eu-central-1.compute.internal:8088
This is my ksql client configuration ksql_cli.properties:
security.protocol=SSL
#ssl.client.auth=true
ssl.truststore.location=/var/kafka-ssl/client.truststore.jks
ssl.truststore.password=ctpassword
ssl.keystore.location=/var/kafka-ssl/ksql.keystore.jks
ssl.keystore.password=kspassword
ssl.key.password=kspassword
JAAS config, included as a parameter on service start:
KsqlServerProps {
    org.eclipse.jetty.jaas.spi.PropertyFileLoginModule required
    file="/var/kafka-ssl/cli.password"
    debug="false";
};
with cli.password containing the authentication users and passwords for the ksql client.
I have probably tried every permutation of keys and settings, but to no avail. Obviously something is wrong in the key management. To me it is surprising that using streams works but accessing the low-level topics does not.
Has someone found a solution for this issue? I am really running out of ideas here. Thanks.
Found it! It was easy to overlook: the client's configuration of course also needs a SASL setting...
security.protocol=SASL_SSL
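A minimal sketch of that change against the ksql_cli.properties above (whether the matching sasl.mechanism and sasl.jaas.config entries are also required depends on the setup):

# switch the CLI's security protocol from plain SSL to SASL_SSL to match the brokers
sed -i 's/^security.protocol=SSL$/security.protocol=SASL_SSL/' /var/kafka-ssl/ksql_cli.properties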

kubectl exec "error: unable to upgrade connection: Unauthorized"

I was using our Kubernetes cluster; I don't think I have changed anything recently since deployment, but I am encountering this error.
kubectl error log with verbose output:
I0514 01:49:42.691510 30028 round_trippers.go:444] Response Headers:
I0514 01:49:42.691526 30028 round_trippers.go:447] Content-Length: 12
I0514 01:49:42.691537 30028 round_trippers.go:447] Content-Type: text/plain; charset=utf-8
I0514 01:49:42.691545 30028 round_trippers.go:447] Date: Tue, 14 May 2019 08:49:42 GMT
F0514 01:49:42.691976 30028 helpers.go:119] error: unable to upgrade connection: Unauthorized
The kubelet is running with the options below:
/usr/local/bin/kubelet --logtostderr=true --v=2 --address=0.0.0.0 --node-ip=1******
--hostname-override=***** --allow-privileged=true --bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet.conf --authentication-token-webhook --enforce-node-allocatable= --client-ca-file=/etc/kubernetes/ssl/ca.crt --pod-manifest-path=/etc/kubernetes/manifests --pod-infra-container-image=gcr.io/google_containers/pause-amd64:3.1 --node-status-update-frequency=10s --cgroup-driver=cgroupfs --max-pods=110 --anonymous-auth=false --read-only-port=0 --fail-swap-on=True --runtime-cgroups=/systemd/system.slice --kubelet-cgroups=/systemd/system.slice --cluster-dns=10.233.0.3 --cluster-domain=cluster.local --resolv-conf=/etc/resolv.conf --kube-reserved cpu=200m,memory=512M --node-labels=node-role.kubernetes.io/master=,node-role.kubernetes.io/node= --network-plugin=cni --cni-conf-dir=/etc/cni/net.d --cni-bin-dir=/opt/cni/bin
The API server is running with the options below:
kube-apiserver --allow-privileged=true --apiserver-count=2 --authorization-mode=Node,RBAC --bind-address=0.0.0.0 --endpoint-reconciler-type=lease --insecure-port=0 --kubelet-preferred-address-types=InternalDNS,InternalIP,Hostname,ExternalDNS,ExternalIP --runtime-config=admissionregistration.k8s.io/v1alpha1 --service-node-port-range=30000-32767 --storage-backend=etcd3 --advertise-address=******* --client-ca-file=/etc/kubernetes/ssl/ca.crt --enable-admission-plugins=NodeRestriction --enable-bootstrap-token-auth=true --etcd-cafile=/etc/kubernetes/ssl/etcd/ca.pem --etcd-certfile=/etc/kubernetes/ssl/etcd/node-bg-kub-dev-1.pem --etcd-keyfile=/etc/kubernetes/ssl/etcd/node-bg-kub-dev-1-key.pem --etcd-servers=https://*******:2379,https://********:2379,https://*****:2379 --kubelet-client-certificate=/etc/kubernetes/ssl/apiserver-kubelet-client.crt --kubelet-client-key=/etc/kubernetes/ssl/apiserver-kubelet-client.key --proxy-client-cert-file=/etc/kubernetes/ssl/front-proxy-client.crt --proxy-client-key-file=/etc/kubernetes/ssl/front-proxy-client.key --requestheader-allowed-names=front-proxy-client --requestheader-client-ca-file=/etc/kubernetes/ssl/front-proxy-ca.crt --requestheader-extra-headers-prefix=X-Remote-Extra- --requestheader-group-headers=X-Remote-Group --requestheader-username-headers=X-Remote-User --secure-port=6443 --service-account-key-file=/etc/kubernetes/ssl/sa.pub --service-cluster-ip-range=10.233.0.0/18 --tls-cert-file=/etc/kubernetes/ssl/apiserver.crt --tls-private-key-file=/etc/kubernetes/ssl/apiserver.key
I think you have messed up your cert files or changed the RBAC profiles.
You can have a look at the great guide by Kelsey Hightower called kubernetes-the-hard-way.
It shows how to set up a whole cluster from the beginning, without any automation tools like kubeadm.
In part 04-certificate-authority - Provisioning a CA and Generating TLS Certificates -
you have examples of the certs being used in Kubernetes.
The Kubelet Client Certificates
Kubernetes uses a special-purpose authorization mode called Node Authorizer, that specifically authorizes API requests made by Kubelets. In order to be authorized by the Node Authorizer, Kubelets must use a credential that identifies them as being in the system:nodes group, with a username of system:node:<nodeName>. In this section you will create a certificate for each Kubernetes worker node that meets the Node Authorizer requirements.
Once the certs are generated for the workers and uploaded, you need to generate a kubeconfig for each worker.
The kubelet Kubernetes Configuration File
When generating kubeconfig files for Kubelets the client certificate matching the Kubelet's node name must be used. This will ensure Kubelets are properly authorized by the Kubernetes Node Authorizer.
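A quick, hedged way to inspect what a node's kubelet is actually presenting (the certificate path varies by setup; this one is common when --bootstrap-kubeconfig is in use):

# the subject should show O = system:nodes, CN = system:node:<nodeName>
openssl x509 -in /var/lib/kubelet/pki/kubelet-client-current.pem -noout -subject -issuer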
Also this case might be helpful "kubectl exec" results in "error: unable to upgrade connection: Unauthorized"
I got this issue fixed.
The /etc/kubernetes/ssl/ca.crt files on both of my masters were the same, but on the worker nodes /etc/kubernetes/ssl/ca.crt was completely different. So I copied /etc/kubernetes/ssl/ca.crt from a master to my worker nodes and restarted the kubelet on the workers, which fixed my issue.
But I am not sure I made the right change for the fix.
I believe --client-ca-file=/etc/kubernetes/ssl/ca.crt should be the same for every kubelet, on both masters and workers.
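A sketch of the check and fix described above (paths from this cluster; the worker hostname is a placeholder):

# compare the CA file across nodes; a differing checksum on a worker points at the problem
md5sum /etc/kubernetes/ssl/ca.crt
# copy the master's CA to the worker and restart its kubelet
scp /etc/kubernetes/ssl/ca.crt <worker>:/etc/kubernetes/ssl/ca.crt
ssh <worker> systemctl restart kubelet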

ERR_SSL_VERSION_OR_CIPHER_MISMATCH from AWS API Gateway into Lambda

I have set up a lambda and attached an API Gateway deployment to it. The tests in the gateway console all work fine. I created an AWS certificate for *.hazeapp.net. I created a custom domain in the API gateway and attached that certificate. In the Route 53 zone, I created the alias record and used the target that came up under API gateway (the only one available). I named the alias rest.hazeapp.net. My client gets the ERR_SSL_VERSION_OR_CIPHER_MISMATCH error. Curl indicates that the TLS server handshake failed, which agrees with the SSL error. Curl indicates that the certificate CA checks out.
Am I doing something wrong?
I had this problem when my DNS entry pointed directly at the API Gateway deployment rather than at the one backing the custom domain name.
To find the domain name to point to:
aws apigateway get-domain-name --domain-name "<YOUR DOMAIN>"
The response contains the domain name to use. In my case I had a Regional deployment so the result was:
{
    "domainName": "<DOMAIN_NAME>",
    "certificateUploadDate": 1553011117,
    "regionalDomainName": "<API_GATEWAY_ID>.execute-api.eu-west-1.amazonaws.com",
    "regionalHostedZoneId": "...",
    "regionalCertificateArn": "arn:aws:acm:eu-west-1:<ACCOUNT>:certificate/<CERT_ID>",
    "endpointConfiguration": {
        "types": [
            "REGIONAL"
        ]
    }
}
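The Route 53 alias should then target that regionalDomainName / regionalHostedZoneId pair rather than the API's own invoke URL. A hedged sketch of the record change (the zone ID and target values are placeholders to fill in from the response above):

aws route53 change-resource-record-sets --hosted-zone-id <ZONE_ID> --change-batch '{
  "Changes": [{
    "Action": "UPSERT",
    "ResourceRecordSet": {
      "Name": "rest.hazeapp.net",
      "Type": "A",
      "AliasTarget": {
        "HostedZoneId": "<regionalHostedZoneId>",
        "DNSName": "<regionalDomainName>",
        "EvaluateTargetHealth": false
      }
    }
  }]
}'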

Kubernetes cluster role admin not able to get deployment status

I have the following role:
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: admin
When I do a kubectl proxy --port 8080 and then try doing
http://127.0.0.1:8080/apis/extensions/v1beta1/namespaces/cdp/deployments/{deploymentname}
I get a 200 and everything works fine. However when I do:
http://127.0.0.1:8080/apis/extensions/v1beta1/namespaces/cdp/deployments/{deploymentname}/status
I get Forbidden and a 403 status back.
I am also able to do get, create, list, and watch on deployments with my admin role.
Any idea as to why /status would give Forbidden when I clearly have all the necessary permissions as admin for my namespace?
You mentioned the verbs of the role but you didn't mention the resources and apiGroups. Make sure the following are set (you can also verify your effective permissions with the commands after the snippet):
- apiGroups:
  - apps
  - extensions
  resources:
  - deployments/status
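A quick way to check which verbs you actually hold on the subresource (namespace taken from the question):

kubectl auth can-i get deployments --subresource=status -n cdp
kubectl auth can-i update deployments --subresource=status -n cdp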
The status subresource doesn't give you any more information than simply fetching the deployment.
The admin role permissions do not let you write deployment status. They let you create and delete the deployment objects, controlling the "spec" portion of the object. Status modification permissions are granted to the deployment controller.