Gitpod cannot resolve workspace image: hostname required on workspace start - amazon-eks

I have a self-hosted Gitpod installation on EKS. When I try to start a new workspace I get this error:
Request createWorkspace failed with message: 13 INTERNAL: cannot
resolve workspace image: hostname required
Unknown Error: { "code": -32603 }
I haven't found any solution.
Any idea?
Thank you
Here is my gitpod-config.yaml:
apiVersion: v1
authProviders: []
blockNewUsers:
  enabled: false
  passlist: []
certificate:
  kind: secret
  name: https-certificates
containerRegistry:
  inCluster: true
  s3storage:
    bucket: custom-bucket
    certificate:
      kind: secret
      name: object-storage-gitpod-token
database:
  inCluster: false
  external:
    certificate:
      kind: secret
      name: mysql-gitpod-token
domain: my-domain.com
imagePullSecrets: null
jaegerOperator:
  inCluster: true
kind: Full
metadata:
  region: eu-west-1
objectStorage:
  inCluster: true
observability:
  logLevel: info
repository: eu.gcr.io/gitpod-core-dev/build
workspace:
  resources:
    requests:
      cpu: "1"
      memory: 2Gi
  runtime:
    containerdRuntimeDir: /var/lib/containerd/io.containerd.runtime.v2.task/k8s.io
    containerdSocket: /run/containerd/containerd.sock
    fsShiftMethod: shiftfs

I'm a Gitpodder and wrote most of the Installer. This is usually down to a misconfiguration.
Can you post your config.yaml (redacting the domain) please and hopefully I'll be able to see the issue?

Check that your .env file has an entry for CERTIFICATE_ARN= and that the cert has three entries for the base domain.
e.g. if DOMAIN=mygitpod.domain.com
The cert needs these three:
mygitpod.domain.com
*.mygitpod.domain.com
*.ws.mygitpod.domain.com

I had this error and I resolved it by adding DNS records for:
$DOMAIN
*.$DOMAIN
*.ws.$DOMAIN
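The three names can be derived mechanically from the base domain; a minimal shell sketch (the domain is a placeholder, and each printed name can then be checked with e.g. `dig +short`):

```shell
# Print the three DNS names a Gitpod installation needs.
# DOMAIN is a placeholder; substitute your real base domain.
DOMAIN=mygitpod.domain.com
for name in "$DOMAIN" "*.$DOMAIN" "*.ws.$DOMAIN"; do
  printf '%s\n' "$name"
done
```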

I run v2022.03.1 and now have all three DNS records configured.
It works.
Thanks to everybody who responded.

Related

cert-manager challenges are pending

I'm using the cert-manager to manage my ssl certificates in my Kubernetes cluster. The cert-manager creates the pods and the challenges, but the challenges are never getting fulfilled. They're always saying:
Waiting for HTTP-01 challenge propagation: failed to perform self check GET request 'http://somedomain/.well-known/acme-challenge/VqlmMCsb019CCFDggs03RyBLZJ0jo53LO...': Get "http://somedomain/.well-known/acme-challenge/VqlmMCsb019CCFDggs03RyBLZJ0jo53LO...": EOF
But when I open the url (http:///.well-known/acme-challenge/VqlmMCsb019CCFDggs03RyBLZJ0jo53LO...), it returns the expected code:
vzCVdTk1q55MQCNH...zVkKYGvBJkRTvDBHQ.YfUcSfIKvWo_MIULP9jvYcgtsGxwfJMLWUGsB5kFKRc
When I do kubectl get certs, it says that the certs are ready:
NAME          READY   SECRET        AGE
crt1          True    crt1-secret   65m
crt1-secret   True    crt1-secret   65m
crt2          True    crt2-secret   65m
crt2-secret   True    crt2-secret   65m
It looks like Let's Encrypt never calls (or cert-manager never instructs it to call) these URLs to verify.
When I list the challenges kubectl describe challenges, it says:
Name:         crt-secret-hcgcf-269956107-974455061
Namespace:    default
Labels:       <none>
Annotations:  <none>
API Version:  acme.cert-manager.io/v1
Kind:         Challenge
Metadata:
  Creation Timestamp:  2021-07-23T10:47:27Z
  Finalizers:
    finalizer.acme.cert-manager.io
  Generation:  1
  Managed Fields:
    API Version:  acme.cert-manager.io/v1
    Fields Type:  FieldsV1
    fieldsV1:
      f:metadata:
        f:finalizers:
          .:
          v:"finalizer.acme.cert-manager.io":
        f:ownerReferences:
          .:
          k:{"uid":"09e39ad0-cc39-421f-80d2-07c2f82680af"}:
            .:
            f:apiVersion:
            f:blockOwnerDeletion:
            f:controller:
            f:kind:
            f:name:
            f:uid:
      f:spec:
        .:
        f:authorizationURL:
        f:dnsName:
        f:issuerRef:
          .:
          f:group:
          f:kind:
          f:name:
        f:key:
        f:solver:
          .:
          f:http01:
            .:
            f:ingress:
              .:
              f:class:
              f:ingressTemplate:
    UID:             09e39ad0-cc39-421f-80d2-07c2f82680af
  Resource Version:  19014474
  UID:               b914ad18-2f5c-45cd-aa34-4ad7a2786536
Spec:
  Authorization URL:  https://acme-v02.api.letsencrypt.org/acme/authz-v3/1547...9301
  Dns Name:           mydomain.something
  Issuer Ref:
    Group:  cert-manager.io
    Kind:   Issuer
    Name:   letsencrypt
  Key:      VqlmMCsb019CCFDggs03RyBLZ...nc767h_g.YfUcSfIKv...GxwfJMLWUGsB5kFKRc
  Solver:
    http01:
      Ingress:
        Class:  nginx
        Ingress Template:
          Metadata:
            Annotations:
              nginx.org/mergeable-ingress-type:  minion
        Service Type:  ClusterIP
  Token:     VqlmMCsb019CC...03RyBLZJ0jo53LOiqnc767h_g
  Type:      HTTP-01
  URL:       https://acme-v02.api.letsencrypt.org/acme/chall-v3/15...49301/X--4pw
  Wildcard:  false
Events:      <none>
Any idea how I can solve this issue?
Update 1:
When I run curl http://some-domain.tld/.well-known/acme-challenge/VqlmMCsb019CC...gs03RyBLZJ0jo53LOiqnc767h_g in another pod, it returns:
curl: (52) Empty reply from server
When I do it locally (on my PC), it returns the expected challenge response.
Make sure your pod returns something on the home URL/home page of the domain you are configuring on the ingress host.
You can also use the DNS-01 method for verification if HTTP-01 is not working.
Here is an example for DNS-01:
Wildcard certificate with cert-manager example:
apiVersion: cert-manager.io/v1
kind: Issuer
metadata:
  name: letsencrypt-prod
spec:
  acme:
    email: test123@gmail.com
    server: https://acme-v02.api.letsencrypt.org/directory
    privateKeySecretRef:
      name: letsencrypt-prod
    solvers:
    - selector:
        dnsZones:
        - "devops.example.in"
      dns01:
        route53:
          region: us-east-1
          hostedZoneID: Z0152EXAMPLE
          accessKeyID: AKIA5EXAMPLE
          secretAccessKeySecretRef:
            name: route53-secret
            key: secret-access-key
---
apiVersion: cert-manager.io/v1alpha2
kind: Certificate
metadata:
  name: le-crt
spec:
  secretName: tls-secret
  issuerRef:
    kind: Issuer
    name: letsencrypt-prod
  commonName: "*.devops.example.in"
  dnsNames:
  - "*.devops.example.in"
Try using dns01 challenge instead of HTTP-01

FluxCD on EKS cannot read a private repo on GitHub

After installing FluxCD v2 on my EKS cluster, I defined a GitRepository definition pointing to a repo on GitHub.
---
apiVersion: source.toolkit.fluxcd.io/v1beta1
kind: GitRepository
metadata:
  name: springbootflux-infra
  namespace: flux-system
spec:
  interval: 30s
  ref:
    branch: master
  url: https://github.com/***/privaterepo

As the name says, the privaterepo repository on GitHub is private. The problem is that FluxCD cannot read the repo. What can I do to allow FluxCD on EKS to read it?
For private repositories you need to define a secret which contains the credentials.
Create a secret:
apiVersion: v1
kind: Secret
metadata:
  name: repository-creds
type: Opaque
data:
  username: <BASE64>
  password: <BASE64>
Refer to the secret in your GitRepository object:
secretRef:
  name: repository-creds
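Putting it together, the GitRepository from the question with the secret reference added might look like this (a sketch using the names above):

```yaml
apiVersion: source.toolkit.fluxcd.io/v1beta1
kind: GitRepository
metadata:
  name: springbootflux-infra
  namespace: flux-system
spec:
  interval: 30s
  ref:
    branch: master
  # credentials for the private repo
  secretRef:
    name: repository-creds
  url: https://github.com/***/privaterepo
```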
Official documentation:
https://fluxcd.io/docs/components/source/gitrepositories/#secret-reference

istio getting "RBAC: access denied" even the servicerolebinding checked to be allowed

I've been struggling with istio... so here I am seeking help from the experts!
Background
I'm trying to deploy my kubeflow application for multi-tenancy with dex.
Referring to the kubeflow official document with the manifest file from github
Here is a list of component/version information:
I'm running kubernetes 1.15 on GKE
Istio 1.1.6 is used in kubeflow for the service mesh
Trying to deploy kubeflow 1.0 for ML
Deployed dex 1.0 for authn
With the manifest file I successfully deployed kubeflow on my cluster. Here's what I've done:
1. Deploy the kubeflow application on the cluster
2. Deploy Dex with an OIDC service to enable authn against Google OAuth 2.0
3. Enable RBAC
4. Create an envoy filter to append the header "kubeflow-userid" as the login user
Here is a verification of steps 3 and 4:
Check RBAC enabled and envoyfilter added for kubeflow-userid
[root@gke-client-tf leilichao]# k get clusterrbacconfigs -o yaml
apiVersion: v1
items:
- apiVersion: rbac.istio.io/v1alpha1
  kind: ClusterRbacConfig
  metadata:
    annotations:
      kubectl.kubernetes.io/last-applied-configuration: |
        {"apiVersion":"rbac.istio.io/v1alpha1","kind":"ClusterRbacConfig","metadata":{"annotations":{},"name":"default"},"spec":{"mode":"ON"}}
    creationTimestamp: "2020-07-04T01:28:52Z"
    generation: 2
    name: default
    resourceVersion: "5986075"
    selfLink: /apis/rbac.istio.io/v1alpha1/clusterrbacconfigs/default
    uid: db70920e-f364-40ec-a93b-a3364f88650f
  spec:
    mode: "ON"
kind: List
metadata:
  resourceVersion: ""
  selfLink: ""
[root@gke-client-tf leilichao]# k get envoyfilter -n istio-system -o yaml
apiVersion: v1
items:
- apiVersion: networking.istio.io/v1alpha3
  kind: EnvoyFilter
  metadata:
    annotations:
      kubectl.kubernetes.io/last-applied-configuration: |
        {"apiVersion":"networking.istio.io/v1alpha3","kind":"EnvoyFilter","metadata":{"annotations":{},"labels":{"app.kubernetes.io/component":"oidc-authservice","app.kubernetes.io/instance":"oidc-authservice-v1.0.0","app.kubernetes.io/managed-by":"kfctl","app.kubernetes.io/name":"oidc-authservice","app.kubernetes.io/part-of":"kubeflow","app.kubernetes.io/version":"v1.0.0"},"name":"authn-filter","namespace":"istio-system"},"spec":{"filters":[{"filterConfig":{"httpService":{"authorizationRequest":{"allowedHeaders":{"patterns":[{"exact":"cookie"},{"exact":"X-Auth-Token"}]}},"authorizationResponse":{"allowedUpstreamHeaders":{"patterns":[{"exact":"kubeflow-userid"}]}},"serverUri":{"cluster":"outbound|8080||authservice.istio-system.svc.cluster.local","failureModeAllow":false,"timeout":"10s","uri":"http://authservice.istio-system.svc.cluster.local"}},"statusOnError":{"code":"GatewayTimeout"}},"filterName":"envoy.ext_authz","filterType":"HTTP","insertPosition":{"index":"FIRST"},"listenerMatch":{"listenerType":"GATEWAY"}}],"workloadLabels":{"istio":"ingressgateway"}}}
    creationTimestamp: "2020-07-04T01:40:43Z"
    generation: 1
    labels:
      app.kubernetes.io/component: oidc-authservice
      app.kubernetes.io/instance: oidc-authservice-v1.0.0
      app.kubernetes.io/managed-by: kfctl
      app.kubernetes.io/name: oidc-authservice
      app.kubernetes.io/part-of: kubeflow
      app.kubernetes.io/version: v1.0.0
    name: authn-filter
    namespace: istio-system
    resourceVersion: "4715289"
    selfLink: /apis/networking.istio.io/v1alpha3/namespaces/istio-system/envoyfilters/authn-filter
    uid: e599ba82-315a-4fc1-9a5d-e8e35d93ca26
  spec:
    filters:
    - filterConfig:
        httpService:
          authorizationRequest:
            allowedHeaders:
              patterns:
              - exact: cookie
              - exact: X-Auth-Token
          authorizationResponse:
            allowedUpstreamHeaders:
              patterns:
              - exact: kubeflow-userid
          serverUri:
            cluster: outbound|8080||authservice.istio-system.svc.cluster.local
            failureModeAllow: false
            timeout: 10s
            uri: http://authservice.istio-system.svc.cluster.local
        statusOnError:
          code: GatewayTimeout
      filterName: envoy.ext_authz
      filterType: HTTP
      insertPosition:
        index: FIRST
      listenerMatch:
        listenerType: GATEWAY
    workloadLabels:
      istio: ingressgateway
kind: List
metadata:
  resourceVersion: ""
  selfLink: ""
RBAC issue problem analysis
After I finished my deployment, I performed the following functional testing:
I can log in with my google account via google oauth
I was able to create my own profile/namespace
I was able to create a notebook server
However, I can NOT connect to the notebook server
RBAC issue investigation
I'm getting the "RBAC: access denied" error after I successfully create the notebook server on kubeflow and try to connect to it.
I managed to update the envoy log level and got the log below.
[2020-08-06 13:32:43.290][26][debug][rbac] [external/envoy/source/extensions/filters/http/rbac/rbac_filter.cc:64] checking request: remoteAddress: 10.1.1.2:58012, localAddress: 10.1.2.66:8888, ssl: none, headers: ':authority', 'compliance-kf-system.ml'
':path', '/notebook/roger-l-c-lei/aug06/'
':method', 'GET'
'user-agent', 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/83.0.4103.116 Safari/537.36'
'accept', 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9'
'accept-encoding', 'gzip, deflate'
'accept-language', 'en,zh-CN;q=0.9,zh;q=0.8'
'cookie', 'authservice_session=MTU5NjY5Njk0MXxOd3dBTkZvMldsVllVMUZPU0VaR01sSk5RVlJJV2xkRFVrRTFTVUl5V0RKV1EwdEhTMU5QVjFCVlUwTkpSVFpYUlVoT1RGVlBUa0U9fN3lPBXDDSZMT9MTJRbG8jv7AtblKTE3r84ayeCYuKOk; _xsrf=2|1e6639f2|10d3ea0a904e0ae505fd6425888453f8|1596697030'
'referer', 'http://compliance-kf-system.ml/jupyter/'
'upgrade-insecure-requests', '1'
'x-forwarded-for', '10.10.10.230'
'x-forwarded-proto', 'http'
'x-request-id', 'babbf884-4cec-93fd-aea6-2fc60d3abb83'
'kubeflow-userid', 'roger.l.c.lei@XXXX.com'
'x-istio-attributes', 'CjAKHWRlc3RpbmF0aW9uLnNlcnZpY2UubmFtZXNwYWNlEg8SDXJvZ2VyLWwtYy1sZWkKIwoYZGVzdGluYXRpb24uc2VydmljZS5uYW1lEgcSBWF1ZzA2Ck4KCnNvdXJjZS51aWQSQBI+a3ViZXJuZXRlczovL2lzdGlvLWluZ3Jlc3NnYXRld2F5LTg5Y2Q0YmQ0Yy1kdnF3dC5pc3Rpby1zeXN0ZW0KQQoXZGVzdGluYXRpb24uc2VydmljZS51aWQSJhIkaXN0aW86Ly9yb2dlci1sLWMtbGVpL3NlcnZpY2VzL2F1ZzA2CkMKGGRlc3RpbmF0aW9uLnNlcnZpY2UuaG9zdBInEiVhdWcwNi5yb2dlci1sLWMtbGVpLnN2Yy5jbHVzdGVyLmxvY2Fs'
'x-envoy-expected-rq-timeout-ms', '300000'
'x-b3-traceid', '3bf35cca1f7b75e7a42a046b1c124b1f'
'x-b3-spanid', 'a42a046b1c124b1f'
'x-b3-sampled', '1'
'x-envoy-original-path', '/notebook/roger-l-c-lei/aug06/'
'content-length', '0'
'x-envoy-internal', 'true'
, dynamicMetadata: filter_metadata {
key: "istio_authn"
value {
}
}
[2020-08-06 13:32:43.290][26][debug][rbac] [external/envoy/source/extensions/filters/http/rbac/rbac_filter.cc:108] enforced denied
From the source code it looks like the allowed function is returning false, so it's giving the "RBAC: access denied" response.
if (engine.has_value()) {
  if (engine->allowed(*callbacks_->connection(), headers,
                      callbacks_->streamInfo().dynamicMetadata(), nullptr)) {
    ENVOY_LOG(debug, "enforced allowed");
    config_->stats().allowed_.inc();
    return Http::FilterHeadersStatus::Continue;
  } else {
    ENVOY_LOG(debug, "enforced denied");
    callbacks_->sendLocalReply(Http::Code::Forbidden, "RBAC: access denied", nullptr,
                               absl::nullopt);
    config_->stats().denied_.inc();
    return Http::FilterHeadersStatus::StopIteration;
  }
}
I searched the dumped Envoy config. It looks like the rule should allow any request with a header value equal to my mail address, and from the log above I can confirm that header is present in my request.
{
  "name": "envoy.filters.http.rbac",
  "config": {
    "rules": {
      "policies": {
        "ns-access-istio": {
          "permissions": [
            {
              "and_rules": {
                "rules": [
                  {
                    "any": true
                  }
                ]
              }
            }
          ],
          "principals": [
            {
              "and_ids": {
                "ids": [
                  {
                    "header": {
                      "exact_match": "roger.l.c.lei@XXXX.com"
                    }
                  }
                ]
              }
            }
          ]
        }
      }
    }
  }
}
Understanding that the Envoy config used to validate RBAC authz comes from this config, and that it's distributed to the sidecar by Mixer, the log and the code lead me to the rbac.istio.io ServiceRoleBinding config.
[root@gke-client-tf leilichao]# k get servicerolebinding -n roger-l-c-lei -o yaml
apiVersion: v1
items:
- apiVersion: rbac.istio.io/v1alpha1
  kind: ServiceRoleBinding
  metadata:
    annotations:
      role: admin
      user: roger.l.c.lei@XXXX.com
    creationTimestamp: "2020-07-04T01:35:30Z"
    generation: 5
    name: owner-binding-istio
    namespace: roger-l-c-lei
    ownerReferences:
    - apiVersion: kubeflow.org/v1
      blockOwnerDeletion: true
      controller: true
      kind: Profile
      name: roger-l-c-lei
      uid: 689c9f04-08a6-4c51-a1dc-944db1a66114
    resourceVersion: "23201026"
    selfLink: /apis/rbac.istio.io/v1alpha1/namespaces/roger-l-c-lei/servicerolebindings/owner-binding-istio
    uid: bbbffc28-689c-4099-837a-87a2feb5948f
  spec:
    roleRef:
      kind: ServiceRole
      name: ns-access-istio
    subjects:
    - properties:
        request.headers[]: roger.l.c.lei@XXXX.com
  status: {}
kind: List
metadata:
  resourceVersion: ""
  selfLink: ""
I wanted to try updating this ServiceRoleBinding to validate some assumptions, since I can't debug the Envoy source code and there's not enough logging to show why exactly the "allowed" method is returning false.
However, I find I cannot update the servicerolebinding: it reverts to its original version every time, right after I finish editing it.
I found that there's an istio-galley ValidatingWebhookConfiguration (code block below) that monitors these istio rbac resources.
[root@gke-client-tf leilichao]# k get validatingwebhookconfigurations istio-galley -oyaml
apiVersion: admissionregistration.k8s.io/v1beta1
kind: ValidatingWebhookConfiguration
metadata:
  creationTimestamp: "2020-08-04T15:00:59Z"
  generation: 1
  labels:
    app: galley
    chart: galley
    heritage: Tiller
    istio: galley
    release: istio
  name: istio-galley
  ownerReferences:
  - apiVersion: extensions/v1beta1
    blockOwnerDeletion: true
    controller: true
    kind: Deployment
    name: istio-galley
    uid: 11fef012-4145-49ac-a43c-2e1d0a460ea4
  resourceVersion: "22484680"
  selfLink: /apis/admissionregistration.k8s.io/v1beta1/validatingwebhookconfigurations/istio-galley
  uid: 6f485e28-3b5a-4a3b-b31f-a5c477c82619
webhooks:
- admissionReviewVersions:
  - v1beta1
  clientConfig:
    caBundle:
    .
    .
    .
    service:
      name: istio-galley
      namespace: istio-system
      path: /admitpilot
      port: 443
  failurePolicy: Fail
  matchPolicy: Exact
  name: pilot.validation.istio.io
  namespaceSelector: {}
  objectSelector: {}
  rules:
  - apiGroups:
    - config.istio.io
    apiVersions:
    - v1alpha2
    operations:
    - CREATE
    - UPDATE
    resources:
    - httpapispecs
    - httpapispecbindings
    - quotaspecs
    - quotaspecbindings
    scope: '*'
  - apiGroups:
    - rbac.istio.io
    apiVersions:
    - '*'
    operations:
    - CREATE
    - UPDATE
    resources:
    - '*'
    scope: '*'
  - apiGroups:
    - authentication.istio.io
    apiVersions:
    - '*'
    operations:
    - CREATE
    - UPDATE
    resources:
    - '*'
    scope: '*'
  - apiGroups:
    - networking.istio.io
    apiVersions:
    - '*'
    operations:
    - CREATE
    - UPDATE
    resources:
    - destinationrules
    - envoyfilters
    - gateways
    - serviceentries
    - sidecars
    - virtualservices
    scope: '*'
  sideEffects: Unknown
  timeoutSeconds: 30
- admissionReviewVersions:
  - v1beta1
  clientConfig:
    caBundle:
    .
    .
    .
    service:
      name: istio-galley
      namespace: istio-system
      path: /admitmixer
      port: 443
  failurePolicy: Fail
  matchPolicy: Exact
  name: mixer.validation.istio.io
  namespaceSelector: {}
  objectSelector: {}
  rules:
  - apiGroups:
    - config.istio.io
    apiVersions:
    - v1alpha2
    operations:
    - CREATE
    - UPDATE
    resources:
    - rules
    - attributemanifests
    - circonuses
    - deniers
    - fluentds
    - kubernetesenvs
    - listcheckers
    - memquotas
    - noops
    - opas
    - prometheuses
    - rbacs
    - solarwindses
    - stackdrivers
    - cloudwatches
    - dogstatsds
    - statsds
    - stdios
    - apikeys
    - authorizations
    - checknothings
    - listentries
    - logentries
    - metrics
    - quotas
    - reportnothings
    - tracespans
    scope: '*'
  sideEffects: Unknown
  timeoutSeconds: 30
Long story short
I've been banging my head against this istio issue for more than 2 weeks. I'm sure there are plenty of people feeling the same while trying to troubleshoot istio on k8s. Any suggestion is welcome!
Here's how I understand the problem, please correct me if I'm wrong:
The log evidence shows the RBAC rules are not allowing my access to the resource
I need to update the RBAC rules
Rules are distributed by mixer to the envoy container according to the ServiceRoleBinding
So I need to update the ServiceRoleBinding instead
I cannot update the ServiceRoleBinding because either the validating admission webhook or the istio mixer is preventing me from doing it
I've run into the problems below:
I cannot update the ServiceRoleBinding even after I deleted the validating webhook
I tried to delete this validating webhook to update the servicerolebinding, but the resource reverts right after I save the edit.
The validating webhook is actually generated automatically from a configmap, so I had to update that to update the webhook.
Is there some kind of cache in galley that mixer uses to distribute the config?
I can't find any relevant log indicating that the rbac.istio.io resources are protected/validated by any service in the istio-system namespace.
How can I get the logs of the mixer?
I need to understand which component exactly controls the policy. I managed to update the log level but failed to find anything useful.
Most importantly, how do I debug an envoy container?
I need to debug the envoy app to understand why it's returning false for the allow function.
If we cannot debug it easily, is there a document that lets me update the code to add more logging and build a new image to GCR, so I can do another run and see what's going on behind the scenes from the logs?
Answering my own question since I've made some progress on it.
I cannot update the ServiceRoleBinding even after I deleted the validating webhook
That's because the ServiceRoleBinding is actually generated/monitored/managed by the profile controller in the kubeflow namespace, not by the validating webhook.
I'm having this RBAC issue because, based on the params.yaml in the profiles manifest folder, the rule is generated as
request.headers[]: roger.l.c.lei@XXXX.com
instead of
request.headers[kubeflow-userid]: roger.l.c.lei@XXXX.com
since I misconfigured the value as blank instead of userid-header=kubeflow-userid in params.yaml.
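For reference, with userid-header=kubeflow-userid set in params.yaml, the generated binding subject should come out like this (a sketch based on the ServiceRoleBinding shown earlier):

```yaml
subjects:
- properties:
    # header name filled in from userid-header=kubeflow-userid
    request.headers[kubeflow-userid]: roger.l.c.lei@XXXX.com
```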
Check your authorizationpolicy resource in your application namespace.
For new clusters please see this comment from issue 4440
https://github.com/kubeflow/pipelines/issues/4440
cat << EOF | kubectl apply -f -
apiVersion: security.istio.io/v1beta1
kind: AuthorizationPolicy
metadata:
  name: bind-ml-pipeline-nb-kubeflow-user-example-com
  namespace: kubeflow
spec:
  selector:
    matchLabels:
      app: ml-pipeline
  rules:
  - from:
    - source:
        principals: ["cluster.local/ns/kubeflow-user-example-com/sa/default-editor"]
---
apiVersion: networking.istio.io/v1alpha3
kind: EnvoyFilter
metadata:
  name: add-header
  namespace: kubeflow-user-example-com
spec:
  configPatches:
  - applyTo: VIRTUAL_HOST
    match:
      context: SIDECAR_OUTBOUND
      routeConfiguration:
        vhost:
          name: ml-pipeline.kubeflow.svc.cluster.local:8888
          route:
            name: default
    patch:
      operation: MERGE
      value:
        request_headers_to_add:
        - append: true
          header:
            key: kubeflow-userid
            value: user@example.com
  workloadSelector:
    labels:
      notebook-name: test2
EOF
In my notebook
import kfp
client = kfp.Client()
print(client.list_experiments())
Output
{'experiments': [{'created_at': datetime.datetime(2021, 8, 12, 9, 14, 20, tzinfo=tzlocal()),
'description': None,
'id': 'b2e552e5-3324-483a-8ec8-b32894f49281',
'name': 'test',
'resource_references': [{'key': {'id': 'kubeflow-user-example-com',
'type': 'NAMESPACE'},
'name': None,
'relationship': 'OWNER'}],
'storage_state': 'STORAGESTATE_AVAILABLE'}],
'next_page_token': None,
'total_size': 1}

debugging cert-manager certificate creation failure on AKS

I'm deploying cert-manager on Azure AKS and trying to have it request a Let's Encrypt certificate. It fails with a certificate signed by unknown authority error and I'm having trouble troubleshooting it further.
I'm not sure whether this is a problem with trusting the LE server, the tunnelfront pod, or maybe an internal AKS self-generated CA. So my questions would be:
how can I force cert-manager to display more information about the certificate it does not trust?
is the problem occurring regularly, with a known solution?
what steps should be undertaken to debug the issue further?
I created an issue on jetstack/cert-manager's GitHub page but got no answer, so I came here.
The whole story is as follows:
Certificates are not created. The following errors are reported:
the certificate:
Error from server: conversion webhook for &{map[apiVersion:cert-manager.io/v1alpha2 kind:Certificate metadata:map[creationTimestamp:2020-05-13T17:30:48Z generation:1 name:xxx-tls namespace:test ownerReferences:[map[apiVersion:extensions/v1beta1 blockOwnerDeletion:true controller:true kind:Ingress name:xxx-ingress uid:6d73b182-bbce-4834-aee2-414d2b3aa802]] uid:d40bc037-aef7-4139-868f-bd615a423b38] spec:map[dnsNames:[xxx.test.domain.com] issuerRef:map[group:cert-manager.io kind:ClusterIssuer name:letsencrypt-prod] secretName:xxx-tls] status:map[conditions:[map[lastTransitionTime:2020-05-13T18:55:31Z message:Waiting for CertificateRequest "xxx-tls-1403681706" to complete reason:InProgress status:False type:Ready]]]]} failed: Post https://cert-manager-webhook.cert-manager.svc:443/convert?timeout=30s: x509: certificate signed by unknown authority
cert-manager-webhook container:
cert-manager 2020/05/15 14:22:58 http: TLS handshake error from 10.20.0.19:35350: remote error: tls: bad certificate
Where 10.20.0.19 is the IP of tunnelfront pod.
Debugging with https://cert-manager.io/docs/faq/acme/ sort of "fails" when trying kubectl describe order..., since kubectl describe certificaterequest... returns the CSR contents with the error (as above) but not the order ID.
Environment details:
Kubernetes version: 1.15.10
Cloud-provider/provisioner : Azure (AKS)
cert-manager version: 0.14.3
Install method: static manifests (see below) + cluster issuer (see below) + regular CRDs (not legacy)
cluster issuer:
kind: ClusterIssuer
metadata:
  name: letsencrypt-prod
  namespace: cert-manager
spec:
  acme:
    server: https://acme-v02.api.letsencrypt.org/directory
    email: x
    privateKeySecretRef:
      name: letsencrypt-prod
    solvers:
    - dns01:
        azuredns:
          clientID: x
          clientSecretSecretRef:
            name: cert-manager-stage
            key: CLIENT_SECRET
          subscriptionID: x
          tenantID: x
          resourceGroupName: dns-stage
          hostedZoneName: x
the manifest:
imagePullSecrets: []
isOpenshift: false
priorityClassName: ""
rbac:
  create: true
podSecurityPolicy:
  enabled: false
logLevel: 2
leaderElection:
  namespace: "kube-system"
replicaCount: 1
strategy: {}
image:
  repository: quay.io/jetstack/cert-manager-controller
  pullPolicy: IfNotPresent
  tag: v0.14.3
clusterResourceNamespace: ""
serviceAccount:
  create: true
  name:
  annotations: {}
extraArgs: []
extraEnv: []
resources: {}
securityContext:
  enabled: false
  fsGroup: 1001
  runAsUser: 1001
podAnnotations: {}
podLabels: {}
nodeSelector: {}
ingressShim:
  defaultIssuerName: letsencrypt-prod
  defaultIssuerKind: ClusterIssuer
prometheus:
  enabled: true
  servicemonitor:
    enabled: false
    prometheusInstance: default
    targetPort: 9402
    path: /metrics
    interval: 60s
    scrapeTimeout: 30s
    labels: {}
affinity: {}
tolerations: []
webhook:
  enabled: true
  replicaCount: 1
  strategy: {}
  podAnnotations: {}
  extraArgs: []
  resources: {}
  nodeSelector: {}
  affinity: {}
  tolerations: []
  image:
    repository: quay.io/jetstack/cert-manager-webhook
    pullPolicy: IfNotPresent
    tag: v0.14.3
  injectAPIServerCA: true
  securePort: 10250
cainjector:
  replicaCount: 1
  strategy: {}
  podAnnotations: {}
  extraArgs: []
  resources: {}
  nodeSelector: {}
  affinity: {}
  tolerations: []
  image:
    repository: quay.io/jetstack/cert-manager-cainjector
    pullPolicy: IfNotPresent
    tag: v0.14.3
It seems that v0.14.3 had a bug of some sort; the problem does not occur with v0.15.0.
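If you are stuck on v0.14.3 with the manifest above, one way to apply that fix is to bump the three image tags to v0.15.0 (a sketch against the values shown; verify the exact tags available for your install method before applying):

```yaml
# Bump all three cert-manager images from v0.14.3 to v0.15.0
image:
  repository: quay.io/jetstack/cert-manager-controller
  tag: v0.15.0
webhook:
  image:
    repository: quay.io/jetstack/cert-manager-webhook
    tag: v0.15.0
cainjector:
  image:
    repository: quay.io/jetstack/cert-manager-cainjector
    tag: v0.15.0
```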

K8S: Unable to Create wildcard SSL using Issuer with acmedns provider

I have tried to create a wildcard SSL certificate using the k8s cert-manager and an Issuer with the acmedns ACME provider. I created the credentials by POST requesting the /register URL and tested acme-dns successfully. However, I am unable to create the wildcard SSL certificate using the k8s Issuer. I am adding my Issuer YAML file below:
apiVersion: certmanager.k8s.io/v1alpha1
kind: Issuer
metadata:
  annotations:
  name: letsencrypt-wildcard-prod
  namespace: default
spec:
  acme:
    dns01:
      providers:
        acmedns:
          accountSecretRef:
            key: acmedns.json
            name: acme-dns
          host: http://auth.mydomain.com
    email: info@mydomain.com
    privateKeySecretRef:
      name: letsencrypt-prod
    server: https://acme-v02.api.letsencrypt.org/directory
I have created the secret acme-dns using the JSON output from the /register call.
Also adding the k8s certificate YAML here:
apiVersion: certmanager.k8s.io/v1alpha1
kind: Certificate
metadata:
  name: wildcard-mydomain.com
  namespace: default
spec:
  acme:
    config:
    - dns01:
        provider: acmedns
      domains:
      - '*.mydomain.com'
  commonName: '*.mydomain.com'
  dnsNames:
  - '*.mydomain.com'
  issuerRef:
    kind: Issuer
    name: letsencrypt-wildcard-prod
  secretName: wildcard-mydomain.com-tls
I am getting the following error from the cert-manager:
E1129 16:30:31.881025 1 reflector.go:205] github.com/jetstack/cert-manager/pkg/client/informers/externalversions/factory.go:71: Failed to list *v1alpha1.Issuer: v1alpha1.IssuerList: Items: []v1alpha1.Issuer: v1alpha1.Issuer: Spec: v1alpha1.IssuerSpec: IssuerConfig: ACME: v1alpha1.ACMEIssuer: DNS01: v1alpha1.ACMEIssuerDNS01Config: Providers: []v1alpha1.ACMEIssuerDNS01Provider: ReadArrayCB: expect [ or n, but found {, error found in #10 byte of ...|oviders":{"acmedns":|..., bigger context ...|81551da95"},"spec":{"acme":{"dns01":{"providers":{"acmedns":{"accountSecretRef":{"key":"acmedns.json|...
E1129 16:30:32.887374 1 reflector.go:205] github.com/jetstack/cert-manager/pkg/client/informers/externalversions/factory.go:71: Failed to list *v1alpha1.Issuer: v1alpha1.IssuerList: Items: []v1alpha1.Issuer: v1alpha1.Issuer: Spec: v1alpha1.IssuerSpec: IssuerConfig: ACME: v1alpha1.ACMEIssuer: DNS01: v1alpha1.ACMEIssuerDNS01Config: Providers: []v1alpha1.ACMEIssuerDNS01Provider: ReadArrayCB: expect [ or n, but found {, error found in #10 byte of ...|oviders":{"acmedns":|..., bigger context ...|81551da95"},"spec":{"acme":{"dns01":{"providers":{"acmedns":{"accountSecretRef":{"key":"acmedns.json|...
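The parse error ("expect [ or n, but found {") says that spec.acme.dns01.providers must be a YAML list, while the Issuer above defines it as a map. A sketch of the corrected Issuer with acmedns as a named list entry (names taken from the question; the exact field layout is my assumption for the certmanager.k8s.io/v1alpha1 schema, so verify against your installed CRDs):

```yaml
apiVersion: certmanager.k8s.io/v1alpha1
kind: Issuer
metadata:
  name: letsencrypt-wildcard-prod
  namespace: default
spec:
  acme:
    email: info@mydomain.com
    server: https://acme-v02.api.letsencrypt.org/directory
    privateKeySecretRef:
      name: letsencrypt-prod
    dns01:
      # providers is a list; each entry has a name plus one provider block
      providers:
      - name: acmedns
        acmedns:
          host: http://auth.mydomain.com
          accountSecretRef:
            name: acme-dns
            key: acmedns.json
```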