Only 1 of 2 pods deploys; the other pod is not authorised to pull its container - openshift-origin

I set up a production project using the same structure as my staging project minus the BuildConfigurations and then tagged my containers from the staging image stream to the prod image stream.
oc tag my-staging/nginx:latest my-prod/nginx:prod
oc tag my-staging/gunicorn:latest my-prod/gunicorn:prod
oc tag my-staging/celery-worker:latest my-prod/celery-worker:prod
Each of these has a DeploymentConfig with 2 replicas. The first two have come up with both pods, but the celery-worker is only coming up with a single pod. The other pod generates this error:
Failed to pull image
"172.x.x.x:5000/my-staging/celery-worker#sha256:xxx":
unauthorized: authentication required
I don't understand how one kubelet can have access but not another, especially since all of the other pods are up.
Here are the logs from the registry:
10.1.3.1 - - [22/Feb/2016:02:52:58 +0000] "GET /v2/cwl-staging/cwl-leadershift-20-celery-worker/manifests/sha256:7a2608ce648b767d65209410fd9f0e8d2fe3f559367c77ba45ba9a713940f83a HTTP/1.1" 401 176 "" "docker/1.8.2-el7.centos go/go1.4.2 kernel/3.10.0-327.4.5.el7.x86_64 os/linux arch/amd64"
time="2016-02-22T02:52:58.297372303Z" level=error msg="OpenShift access denied: User \"system:serviceaccount:cwl-production:default\" cannot get imagestreams/layers in project \"cwl-staging\"" go.version=go1.4.2 http.request.host="172.30.140.184:5000" http.request.id=71a32c41-9e91-40be-9774-166bfa7264f8 http.request.method=GET http.request.remoteaddr="10.1.3.1:48777" http.request.uri="/v2/cwl-staging/cwl-leadershift-20-celery-worker/manifests/sha256:7a2608ce648b767d65209410fd9f0e8d2fe3f559367c77ba45ba9a713940f83a" http.request.useragent="docker/1.8.2-el7.centos go/go1.4.2 kernel/3.10.0-327.4.5.el7.x86_64 os/linux arch/amd64" instance.id=180a3a82-b568-40ab-aaa0-538588e8e765 vars.name="cwl-staging/cwl-leadershift-20-celery-worker" vars.reference="sha256:7a2608ce648b767d65209410fd9f0e8d2fe3f559367c77ba45ba9a713940f83a"
time="2016-02-22T02:52:58.297449598Z" level=error msg="error authorizing context: access denied" go.version=go1.4.2 http.request.host="172.30.140.184:5000" http.request.id=71a32c41-9e91-40be-9774-166bfa7264f8 http.request.method=GET http.request.remoteaddr="10.1.3.1:48777" http.request.uri="/v2/cwl-staging/cwl-leadershift-20-celery-worker/manifests/sha256:7a2608ce648b767d65209410fd9f0e8d2fe3f559367c77ba45ba9a713940f83a" http.request.useragent="docker/1.8.2-el7.centos go/go1.4.2 kernel/3.10.0-327.4.5.el7.x86_64 os/linux arch/amd64" instance.id=180a3a82-b568-40ab-aaa0-538588e8e765 vars.name="cwl-staging/cwl-leadershift-20-celery-worker" vars.reference="sha256:7a2608ce648b767d65209410fd9f0e8d2fe3f559367c77ba45ba9a713940f83a"

The problem is that the system:image-puller role wasn't granted to the default service account of the my-prod project.
Grant the role on the my-staging project:
oc policy add-role-to-user system:image-puller system:serviceaccount:my-prod:default -n my-staging
Delete the stuck pods so they are recreated with the new credentials and can pull the image.
See the appropriate section of the OpenShift documentation.
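A minimal sketch of recycling the failing pods once the role is granted; the label selector is an assumption based on the usual deploymentconfig pod label, so adjust it to your DeploymentConfig:
# Delete the pods that failed to pull so the replication controller recreates them
oc delete pod -n my-prod -l deploymentconfig=celery-worker
# Watch the replacements come up with the newly granted pull permission
oc get pods -n my-prod -w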

Related

RKE2 Authorized endpoint configuration help required

I have a Rancher 2.6.67 server and an RKE2 downstream cluster. The cluster was created without an authorized cluster endpoint. The article "How to add an authorised cluster endpoint to a RKE2 cluster created by Rancher" describes how to add one to an existing cluster; although the answer looks promising, I must still be missing some detail, because it does not work for me.
Here is what I did:
Created the /var/lib/rancher/rke2/kube-api-authn-webhook.yaml file with the following contents:
apiVersion: v1
kind: Config
clusters:
- name: Default
  cluster:
    insecure-skip-tls-verify: true
    server: http://127.0.0.1:6440/v1/authenticate
users:
- name: Default
  user:
    insecure-skip-tls-verify: true
current-context: webhook
contexts:
- name: webhook
  context:
    user: Default
    cluster: Default
and added
"kube-apiserver-arg": [
"authentication-token-webhook-config-file=/var/lib/rancher/rke2/kube-api-authn-webhook.yaml"
to the /etc/rancher/rke2/config.yaml.d/50-rancher.yaml file.
After restarting rke2-server, I found the network configuration tab in Rancher and was able to enable the authorized endpoint. Here is where my success ends.
I tried creating a service account and retrieved its secret to use token authentication, but it failed when connecting directly to the API endpoint on the master.
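Roughly the kind of check I mean; the service-account name and master address are illustrative placeholders:
# Create a service account and read its token
kubectl create serviceaccount test-sa -n default
SECRET=$(kubectl get sa test-sa -n default -o jsonpath='{.secrets[0].name}')
TOKEN=$(kubectl get secret "$SECRET" -n default -o jsonpath='{.data.token}' | base64 -d)
# (on clusters that no longer auto-create SA secrets, "kubectl create token test-sa" works instead)
# Call the RKE2 API server (the authorized endpoint) directly with that token
curl -k -H "Authorization: Bearer $TOKEN" https://<master-ip>:6443/version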
kube-api-auth pod logs this:
time="2022-10-06T08:42:27Z" level=error msg="found 1 parts of token"
time="2022-10-06T08:42:27Z" level=info msg="Processing v1Authenticate request..."
Also the log is full of messages like this:
E1006 09:04:07.868108 1 reflector.go:139] pkg/mod/github.com/rancher/client-go@v1.22.3-rancher.1/tools/cache/reflector.go:168: Failed to watch *v3.ClusterAuthToken: failed to list *v3.ClusterAuthToken: the server could not find the requested resource (get clusterauthtokens.meta.k8s.io)
E1006 09:04:40.778350 1 reflector.go:139] pkg/mod/github.com/rancher/client-go@v1.22.3-rancher.1/tools/cache/reflector.go:168: Failed to watch *v3.ClusterAuthToken: failed to list *v3.ClusterAuthToken: the server could not find the requested resource (get clusterauthtokens.meta.k8s.io)
E1006 09:04:45.171554 1 reflector.go:139] pkg/mod/github.com/rancher/client-go@v1.22.3-rancher.1/tools/cache/reflector.go:168: Failed to watch *v3.ClusterUserAttribute: failed to list *v3.ClusterUserAttribute: the server could not find the requested resource (get clusteruserattributes.meta.k8s.io)
I found that SA tokens will not work this way, so I tried to use a Rancher user token, but that fails as well:
time="2022-10-06T08:37:34Z" level=info msg=" ...looking up token for kubeconfig-user-qq9nrc86vv"
time="2022-10-06T08:37:34Z" level=error msg="clusterauthtokens.cluster.cattle.io \"cattle-system/kubeconfig-user-qq9nrc86vv\" not found"
Checking the cattle-system namespace, there are no SA and secret entries corresponding to the users created in Rancher; however, I found related SA and secret entries in cattle-impersonation-system.
I tried creating a new user, but that, too, only resulted in new entries in the cattle-impersonation-system namespace, so I presume kube-api-auth wrongly assumes the secrets are located in the cattle-system namespace.
Now the questions:
Can I authenticate with the downstream RKE2 cluster using normal SA tokens (not ones created through the Rancher server)? If so, how?
What did I do wrong when adding the webhook authentication configuration? How can I make it work?
I noticed that since I made the modifications described above, I cannot download the kubeconfig file from the Rancher UI for this cluster. What went wrong there?
Thanks in advance for any advice.

Installing Kong API Gateway on EKS

Following the super-simple instructions in this link (the helm method), I'm trying to install Kong on a test EKS cluster (in an empty kong namespace), and it's a disaster. The service never gets an external IP (stuck at "pending"). The ingress-controller container inside the main pod fails with:
time="2022-06-15T19:14:52Z" level=info msg="successfully synced configuration to kong."
time="2022-06-15T19:14:54Z" level=error msg="checking config status failed: %!w(*kong.APIError=&{500 An unexpected error occurred})"
time="2022-06-15T19:14:57Z" level=error msg="checking config status failed: %!w(*kong.APIError=&{500 An unexpected error occurred})"
time="2022-06-15T19:15:00Z" level=error msg="checking config status failed: %!w(*kong.APIError=&{500 An unexpected error occurred})"
(the last line repeats every 3 seconds)
...while the proxy container inside the same pod fails with this repeating error:
2022/06/15 19:47:27 [error] 1110#0: *11302 [lua] api_helpers.lua:511: handle_error(): /usr/local/share/lua/5.1/lapis/application.lua:424: /usr/local/share/lua/5.1/kong/api/routes/health.lua:45: http2 requests not supported yet
I'm not doing any customization with the values file; I'm just installing it as it comes by default. The instructions (just a "helm repo add" and a "helm install") come from the official Kong site, so what is amiss? Helm is v3.8, K8s is v1.21.
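For reference, the install is essentially the default two-step helm procedure run against the empty kong namespace; a sketch of the commands (release name is illustrative, no custom values):
helm repo add kong https://charts.konghq.com
helm repo update
helm install kong kong/kong --namespace kong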

Health Check on Fabric CA

I have a Hyperledger Fabric network v2.2.0 deployed with 2 peer orgs and an orderer org in a Kubernetes cluster. Each org has its own CA server. The CA pod sometimes keeps restarting. In order to know whether the CA server's service is reachable or not, I am trying to use the healthz API on port 9443.
I have used the livenessProbe condition in the CA deployment like so:
livenessProbe:
  failureThreshold: 3
  httpGet:
    path: /healthz
    port: 9443
    scheme: HTTP
  initialDelaySeconds: 10
  periodSeconds: 10
  successThreshold: 1
  timeoutSeconds: 1
After configuring this liveness probe, the pod keeps restarting with the event Liveness probe failed: HTTP probe failed with status code: 400. Why might this be happening?
HTTP 400 code:
The HTTP 400 Bad Request response status code indicates that the server cannot or will not process the request due to something that is perceived to be a client error (for example, malformed request syntax, invalid request message framing, or deceptive request routing).
This indicates that Kubernetes is sending the request in a way Hyperledger is rejecting, but without more information it is hard to say where the problem is. Some quick checks to start with:
Send some GET requests directly to the Hyperledger /healthz resource yourself (a curl sketch follows these checks). What do you get? You should get back either a 200 "OK" if everything is functioning, or a 503 "Service Unavailable" with details of which nodes are down (docs).
kubectl describe pod liveness-request. You should see a few lines towards the bottom describing the state of the liveness probe in more detail:
Restart Count: 0
.
.
.
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled <unknown> default-scheduler Successfully assigned example-dc/liveness-request to dcpoz-d-sou-k8swor3
Normal Pulling 4m45s kubelet, dcpoz-d-sou-k8swor3 Pulling image "nginx"
Normal Pulled 4m42s kubelet, dcpoz-d-sou-k8swor3 Successfully pulled image "nginx"
Normal Created 4m42s kubelet, dcpoz-d-sou-k8swor3 Created container liveness
Normal Started 4m42s kubelet, dcpoz-d-sou-k8swor3 Started container liveness
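For the first check above, a minimal sketch of hitting the operations endpoint directly; the address is an assumption, and if the operations service has TLS enabled you will need HTTPS and possibly a client certificate:
# Hit the CA's operations endpoint from inside the cluster (address is illustrative)
curl -v http://<ca-service>:9443/healthz
# Expect a 200 with a small JSON body such as {"status":"OK",...},
# or a 503 listing the failed health checks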
Some other things to investigate:
httpGet options that might be helpful:
scheme – Protocol type HTTP or HTTPS
httpHeaders – Custom headers to set in the request
Have you configured the operations service? (A config sketch follows this list.)
You may need a valid client certificate (if TLS is enabled, and clientAuthRequired is set to true).
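If the operations service is the issue, this is a minimal sketch of the relevant section of fabric-ca-server-config.yaml; the values shown are assumptions, not your actual configuration:
operations:
  listenAddress: 0.0.0.0:9443   # must be reachable from the kubelet, not only 127.0.0.1
  tls:
    enabled: false              # if true, the probe needs scheme: HTTPS
    clientAuthRequired: false   # if true, a plain httpGet probe will be rejected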

MinIO Signature Mismatch

I have set up MinIO behind a reverse proxy in EKS. Everything worked well until MinIO was updated to RELEASE.2021-11-03T03-36-36Z. Now I am getting the following error when trying to access my MinIO bucket using the mc command-line utility: mc: <ERROR> Unable to list folder. The request signature we calculated does not match the signature you provided. Check your key and signing method.
mc version is RELEASE.2021-11-16T20-37-36Z. When I port-forward the MinIO container to localhost and access it in a browser at http://localhost:9001 I can get to it, but I can't log in anymore. I get the error:
Invalid Login, 403 Forbidden. This is seen in my MinIO container.
It also logs the following:
API: SYSTEM()
Time: 03:19:57 UTC 11/23/2021
DeploymentID: 60a8ed7a-7448-4a3d-9220-ff823facd54e
Error: The request signature we calculated does not match the signature you provided. Check your key and signing method. (*errors.errorString)
requestHeaders={"method":"POST","reqURI":"/minio/admin/v3/update?updateURL=","header":{"Authorization":["AWS4-HMAC-SHA256 Credential=<credential-scrubbed>/20211123//s3/aws4_request, SignedHeaders=host;x-amz-content-sha256;x-amz-date, Signature=37850012ca8d27498793c514aa826f1c29b19ceae96057b9d46e24599cc8081b"],"Connection":["keep-alive"],"Content-Length":["0"],"Host":["<host-info-scrubbed>"],"User-Agent":["MinIO (darwin; amd64) madmin-go/0.0.1 mc/RELEASE.2021-11-16T20-37-36Z"],"X-Amz-Content-Sha256":["<scrubbed>"],"X-Amz-Date":["20211123T031957Z"],"X-Forwarded-For":["10.192.57.142"],"X-Forwarded-Host":["<host-info-scurbbed>"],"X-Forwarded-Path":["/minio/admin/v3/update"],"X-Forwarded-Port":["80"],"X-Forwarded-Proto":["http"],"X-Real-Ip":["10.192.57.142"]}}
5: cmd/auth-handler.go:154:cmd.validateAdminSignature()
4: cmd/auth-handler.go:165:cmd.checkAdminRequestAuth()
3: cmd/admin-handler-utils.go:41:cmd.validateAdminReq()
2: cmd/admin-handlers.go:87:cmd.adminAPIHandlers.ServerUpdateHandler()
1: net/http/server.go:2046:http.HandlerFunc.ServeHTTP()
When checking the proxy logs (NGINX), I see:
10.192.57.142 - - [24/Nov/2021:21:18:17 +0000] "GET / HTTP/1.1" 403 334 "-" "MinIO (darwin; amd64) minio-go/v7.0.16 mc/RELEASE.2021-11-16T20-37-36Z"
Any suggestions or advice on what I can do to resolve this would be great! I'm using the mc client on OSX.
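For context, a sketch of the kind of NGINX proxy block commonly used in front of MinIO; signature validation is sensitive to the Host header and the forwarded scheme/port, so these directives matter. The upstream address is illustrative, not my actual config:
location / {
    proxy_set_header Host $http_host;
    proxy_set_header X-Real-IP $remote_addr;
    proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    proxy_set_header X-Forwarded-Proto $scheme;
    proxy_pass http://minio.minio.svc.cluster.local:9000;
}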

Kubernetes cluster role admin not able to get deployment status

I have the following role:
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: admin
When I do a kubectl proxy --port 8080 and then try doing
http://127.0.0.1:8080/apis/extensions/v1beta1/namespaces/cdp/deployments/{deploymentname}
I get a 200 and everything works fine. However, when I do:
http://127.0.0.1:8080/apis/extensions/v1beta1/namespaces/cdp/deployments/{deploymentname}/status
I get forbidden and a 403 status back.
I am also able to do get, create, list, and watch on deployments with my admin role.
Any idea why /status would give forbidden when I clearly have all the necessary permissions as admin for my namespace?
You mentioned the verbs of the role, but not its resources and apiGroups. Make sure the following are set:
- apiGroups:
  - apps
  - extensions
  resources:
  - deployments/status
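A sketch of a complete rule covering the status subresource; the verbs are an assumption based on what you listed:
- apiGroups: ["apps", "extensions"]
  resources: ["deployments", "deployments/status"]
  verbs: ["get", "list", "watch"]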
The status subresource doesn't give you any more information than simply fetching the deployment.
The admin role permissions do not let you write deployment status. They let you create and delete the deployment objects, controlling the "spec" portion of the object. Status modification permissions are granted to the deployment controller.
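A quick way to confirm what the bound role actually allows for the subresource (namespace taken from the question; run it as the affected user):
kubectl auth can-i get deployments/status -n cdp
kubectl auth can-i update deployments/status -n cdp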