Cannot use Spring Cloud Config and Istio 1.1.1 together - cannot recover from HTTP 404 error when getting remote config - spring-cloud-config

When I try to mix Spring Cloud Config with Istio 1.1.1 and my app container (with the Istio Envoy sidecar auto-injected) starts, the Spring Cloud Config client tries to fetch its config (applicationContext.yaml) from the remote config server (started in advance and healthy), but it fails with an HTTP 404 error. Even though I have configured the config client to retry, every retry fails with the same HTTP 404 (I have confirmed from another container that the config server URL is correct), and the app never recovers. This happens intermittently. I know that the Istio Envoy sidecar and my app run in the same Kubernetes pod, and the app may start before Envoy does, in which case a network error is expected; but as soon as Envoy is up, everything should be OK. I really don't understand why my app cannot recover automatically. Here are my diagnostic steps:
1. Add a retry mechanism in my app (retry libs included in the POM, plus the YAML below) - retry works, but each retry fails with HTTP 404. A fuller bootstrap.yml sketch follows this step:
spring:
  cloud:
    config:
      fail-fast: true
      retry:
        initial-interval: 10000
        max-attempts: 100
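For reference, a minimal sketch of the bootstrap context these properties normally live in, assuming the standard Spring Cloud Config property layout; the application name and server URI below are inferred from the access logs further down and may need adjusting (e.g. a context path). Note that fail-fast plus retry also requires spring-retry and spring-boot-starter-aop on the classpath.
# bootstrap.yml (sketch; verify the URI and any context path for your setup)
spring:
  application:
    name: fota-task
  cloud:
    config:
      uri: http://fota-spring-config.ns-fota.svc.cluster.local:9083
      fail-fast: true             # abort startup if the config server is unreachable...
      retry:                      # ...but retry first (needs spring-retry + AOP)
        initial-interval: 10000   # ms before the first retry
        max-attempts: 100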
2. Add a 'sleep xx' before my Java app starts in the k8s deployment file - this reduces the chance of the HTTP 404 error, but the problem is not eliminated (a variant that polls the sidecar instead of sleeping is sketched after the command):
command: ["/bin/sh","-c","sleep 20; java -jar -Xms512m -Xmx1024m app.jar"]
3. Get the Istio Envoy access logs and compare the failing app's with a healthy app's - the good log has values for the upstream_cluster and upstream_host keys, while those fields are empty in the bad log. Note also "response_flags": "NR" in the bad log, which is Envoy's "no route configured" flag, suggesting the sidecar had not yet received its route configuration when the request went through.
The good access log:
{
  "response_code": "200",
  "user_agent": "Java/1.8.0_121",
  "response_flags": "-",
  "start_time": "2019-06-25T01:17:29.661Z",
  "method": "2019-06-25T01:17:29.661Z",
  "request_id": "d3d27512-161b-4303-bb48-05a6e19e05b7",
  "upstream_host": "172.20.3.104:9083",
  "x_forwarded_for": "-",
  "requested_server_name": "-",
  "bytes_received": "0",
  "istio_policy_status": "-",
  "bytes_sent": "1144",
  "upstream_cluster": "outbound|9083||fota-spring-config.ns-fota.svc.cluster.local",
  "downstream_remote_address": "172.20.2.115:45816",
  "path": "/fota-spring-config/fota-task/dev/master",
  "authority": "fota-spring-config.ns-fota.svc.cluster.local:9083",
  "protocol": "HTTP/1.1",
  "upstream_service_time": "289",
  "upstream_local_address": "-",
  "duration": "290",
  "downstream_local_address": "172.21.1.152:9083"
}
The bad access log:
{
  "upstream_cluster": "-",
  "downstream_remote_address": "172.20.2.118:41980",
  "path": "/fota-spring-config/fota-dmserver/dev/master",
  "authority": "fota-spring-config.ns-fota.svc.cluster.local:9083",
  "protocol": "HTTP/1.1",
  "upstream_service_time": "-",
  "upstream_local_address": "-",
  "duration": "0",
  "downstream_local_address": "172.21.1.152:9083",
  "response_code": "404",
  "user_agent": "Java/1.8.0_121",
  "response_flags": "NR",
  "start_time": "2019-06-25T01:21:24.197Z",
  "method": "2019-06-25T01:21:24.197Z",
  "request_id": "346716e4-1def-465f-b370-cb1e71e30d25",
  "upstream_host": "-",
  "x_forwarded_for": "-",
  "requested_server_name": "-",
  "bytes_received": "0",
  "istio_policy_status": "-",
  "bytes_sent": "0"
}

The K8s deployment file is attached:
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: fota-car
spec:
  template:
    metadata:
      labels:
        app: fota-car
        version: v1
    spec:
      serviceAccountName: fota-serviceaccount
      imagePullSecrets:
        - name: uaes-docker2
      containers:
        - name: fota-car
          image: 192.168.119.22:18080/uaes-fota/fota-car:dev-release-1.0.0
          imagePullPolicy: Always
          ports:
            - containerPort: 8085
          env:
            - name: SPRING_DATASOURCE_URL
              value: jdbc:mysql://mysql-ali-dev.ns-fota-ext-svc/fota-car?useUnicode=true&characterEncoding=utf-8&useSSL=false
            - name: SPRING_DATASOURCE_USERNAME
              valueFrom:
                secretKeyRef:
                  name: mysql-ali-dev-secret
                  key: username
            - name: SPRING_DATASOURCE_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: mysql-ali-dev-secret
                  key: password
          command: ["/bin/sh","-c","java -jar -Xms512m -Xmx1024m app.jar"]
          readinessProbe:
            httpGet:
              path: /actuator/health
              port: 18085
            initialDelaySeconds: 60
            timeoutSeconds: 1
---
kind: Service
apiVersion: v1
metadata:
  labels:
    app: fota-car
  name: fota-car
spec:
  ports:
    - name: http
      port: 8085
  selector:
    app: fota-car
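Not a fix for Istio 1.1.1, but worth noting for readers on newer meshes: since Istio 1.7 this startup race can be avoided by holding the app container until the proxy is ready, via a pod-template annotation. A sketch (not available in 1.1):
template:
  metadata:
    annotations:
      proxy.istio.io/config: '{ "holdApplicationUntilProxyStarts": true }'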

Related

Traefik listens on port 80 and forwards the request to the minio console (5000) - 404

I deployed MinIO and its console in K8s, using a ClusterIP service to expose ports 9000 and 5000.
Traefik listens on ports 80 and 5000 and forwards requests to the minio service (ClusterIP).
Requesting the console through port 5000 works fine.
Requesting the console through port 80 shows the console page, but its API requests return 404 in the browser.
apiVersion: v1
kind: Service
metadata:
  namespace: {{ .Release.Namespace }}
  name: minio-headless
  labels:
    app: minio-headless
spec:
  type: ClusterIP
  clusterIP: None
  ports:
    - name: server
      port: 9000
      targetPort: 9000
    - name: console
      port: 5000
      targetPort: 5000
  selector:
    app: minio
---
apiVersion: traefik.containo.us/v1alpha1
kind: IngressRoute
metadata:
  name: ingress-route-minio
  namespace: {{ .Release.Namespace }}
spec:
  entryPoints:
    - minio
    - web
  routes:
    - kind: Rule
      match: Host(`minio-console.{{ .Release.Namespace }}.k8s.zszc`)
      priority: 10
      services:
        - kind: Service
          name: minio-headless
          namespace: {{ .Release.Namespace }}
          port: 5000
          responseForwarding:
            flushInterval: 1ms
          scheme: http
          strategy: RoundRobin
          weight: 10
Traefik access log:
{
  "ClientAddr": "192.168.4.250:55485",
  "ClientHost": "192.168.4.250",
  "ClientPort": "55485",
  "ClientUsername": "-",
  "DownstreamContentSize": 19,
  "DownstreamStatus": 404,
  "Duration": 688075,
  "OriginContentSize": 19,
  "OriginDuration": 169976,
  "OriginStatus": 404,
  "Overhead": 518099,
  "RequestAddr": "minio-console.etb-0-0-1.k8s.zszc",
  "RequestContentSize": 0,
  "RequestCount": 1018,
  "RequestHost": "minio-console.etb-0-0-1.k8s.zszc",
  "RequestMethod": "GET",
  "RequestPath": "/api/v1/login",
  "RequestPort": "-",
  "RequestProtocol": "HTTP/1.1",
  "RequestScheme": "http",
  "RetryAttempts": 0,
  "RouterName": "traefik-traefik-dashboard-6e26dcbaf28841493448@kubernetescrd",
  "StartLocal": "2023-01-27T13:20:06.337540015Z",
  "StartUTC": "2023-01-27T13:20:06.337540015Z",
  "entryPointName": "web",
  "level": "info",
  "msg": "",
  "time": "2023-01-27T13:20:06Z"
}
It looks to me like the request for /api is conflicting with rules for the Traefik dashboard. If you look at the access log in your question, we see:
"RouterName": "traefik-traefik-dashboard-6e26dcbaf28841493448#kubernetescrd",
If you have installed Traefik from the Helm chart, it installs an IngressRoute with the following rules:
- kind: Rule
  match: PathPrefix(`/dashboard`) || PathPrefix(`/api`)
  services:
    - kind: TraefikService
      name: api@internal
In theory those are bound only to the traefik entrypoint, but it looks like you may have customized your entrypoint configuration.
Take a look at the IngressRoute resource for your Traefik dashboard and ensure that it's not sharing an entrypoint with minio.
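For example, a sketch of a correctly scoped dashboard IngressRoute - the resource name and namespace here are assumptions for illustration, not taken from your cluster:
apiVersion: traefik.containo.us/v1alpha1
kind: IngressRoute
metadata:
  name: traefik-dashboard     # name/namespace assumed
  namespace: default
spec:
  entryPoints:
    - traefik                 # dedicated entrypoint only; not shared with `web`/`minio`
  routes:
    - kind: Rule
      match: PathPrefix(`/dashboard`) || PathPrefix(`/api`)
      services:
        - kind: TraefikService
          name: api@internal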

How can I create a router and load-balance a service added to Traefik via consulCatalog?

I have Nextcloud running on 2 bare-metal nodes:
node1: 192.168.1.10
node2: 192.168.1.11
In Consul I have defined the nextcloud service as follows on both nodes:
{
  "service": {
    "name": "nextcloud",
    "tags": ["nextcloud", "traefik"],
    "port": 80,
    "check": {
      "tcp": "localhost:80",
      "args": ["ping", "-c1", "127.0.0.1"],
      "interval": "10s",
      "status": "passing",
      "success_before_passing": 3,
      "failures_before_critical": 3
    }
  }
}
Now this shows up in Consul fine.
Static config (traefik.yaml):
global:
  # Send anonymous usage data
  sendAnonymousUsage: true
api:
  dashboard: true
  debug: true
log:
  level: DEBUG
entryPoints:
  http:
    address: ":80"
  https:
    address: ":443"
serversTransport:
  insecureSkipVerify: true
providers:
  docker:
    endpoint: "unix:///var/run/docker.sock"
    exposedByDefault: false
  file:
    directory: "/config/"
    watch: true
  consulCatalog:
    defaultRule: "Host(`{{ .Name }}.sub.mydomain.com`)"
    endpoint:
      address: http://127.0.0.1:8500
certificatesResolvers:
  linode:
    acme:
      caServer: https://acme-staging-v02.api.letsencrypt.org/directory
      email: myemail@domain.com
      storage: acme.json
      dnsChallenge:
        provider: linode
        resolvers:
          - "1.1.1.1:53"
          - "1.0.0.1:53"
And then the dynamic /config/config.yaml:
http:
  routers:
    nextcloud@consulCatalog:
      entryPoints:
        - "https"
      rule: "Host(`home.sub.mydomain.com`) && Path(`/nextcloud`)"
      tls:
        certResolver: linode
      service: nextcloud
  services:
    nextcloud:
      loadBalancer:
        servers:
          - url: http://192.168.1.10
          - url: http://192.168.1.11
        passHostHeader: true
But this shows up as a file provider with TLS, in addition to the existing consulCatalog provider, and is not mapped to an IP or domain.
The actual consulCatalog provider shows up, but with no TLS.
I am wondering why my dynamic configuration under http did not update nextcloud@consulCatalog and set the https entrypoint.
Any help would be greatly appreciated; I am struggling very hard to get this to work.
I have tried following the Traefik docs, but they are very confusing, especially the consulCatalog part.
Your configuration is showing up as being defined via the file provider because you are statically defining it in the file at /config/config.yaml.
In order to dynamically retrieve this configuration from Consul, you should not define it in the file provider; instead, configure tags on the Consul service registrations that will instruct Traefik to route traffic to your service.
For example:
{
  "service": {
    "name": "nextcloud",
    "tags": [
      "nextcloud",
      "traefik.enable=true",
      "traefik.http.routers.nextcloud.entrypoints=https",
      "traefik.http.routers.nextcloud.rule=(Host(`home.sub.mydomain.com`) && Path(`/nextcloud`))",
      "traefik.http.routers.nextcloud.tls.certresolver=linode",
      "traefik.http.services.nextcloud.loadbalancer.passhostheader=true"
    ],
    "port": 80,
    "check": {
      "tcp": "localhost:80",
      "args": [
        "ping",
        "-c1",
        "127.0.0.1"
      ],
      "interval": "10s",
      "status": "passing",
      "success_before_passing": 3,
      "failures_before_critical": 3
    }
  }
}
More info can be found in the Routing Configuration docs for Traefik's Consul catalog provider.
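For comparison, roughly the router those tags should produce - an illustrative sketch of what the consulCatalog provider derives, not actual Traefik output:
http:
  routers:
    nextcloud:
      entryPoints: ["https"]
      rule: (Host(`home.sub.mydomain.com`) && Path(`/nextcloud`))
      tls:
        certResolver: linode
      service: nextcloud   # backend servers come from Consul's registered, health-checked instances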

istio getting "RBAC: access denied" even the servicerolebinding checked to be allowed

I've been struggling with Istio... So here I am seeking help from the experts!
Background
I'm trying to deploy my kubeflow application for multi-tenancy with dex,
referring to the kubeflow official document, with the manifest files from github.
Here is a list of component/version information:
I'm running Kubernetes 1.15 on GKE
Istio 1.1.6 is used in kubeflow for the service mesh
Trying to deploy kubeflow 1.0 for ML
Deployed dex 1.0 for authn
With the manifest files I successfully deployed kubeflow on my cluster. Here's what I've done:
Deploy the kubeflow application on the cluster
Deploy Dex with an OIDC service to enable authn against Google OAuth 2.0
Enable RBAC
Create an envoy filter to append the header "kubeflow-userid" as the logged-in user
Here is a verification of steps 3 and 4.
Check that RBAC is enabled and the envoyfilter for kubeflow-userid is added:
[root@gke-client-tf leilichao]# k get clusterrbacconfigs -o yaml
apiVersion: v1
items:
- apiVersion: rbac.istio.io/v1alpha1
  kind: ClusterRbacConfig
  metadata:
    annotations:
      kubectl.kubernetes.io/last-applied-configuration: |
        {"apiVersion":"rbac.istio.io/v1alpha1","kind":"ClusterRbacConfig","metadata":{"annotations":{},"name":"default"},"spec":{"mode":"ON"}}
    creationTimestamp: "2020-07-04T01:28:52Z"
    generation: 2
    name: default
    resourceVersion: "5986075"
    selfLink: /apis/rbac.istio.io/v1alpha1/clusterrbacconfigs/default
    uid: db70920e-f364-40ec-a93b-a3364f88650f
  spec:
    mode: "ON"
kind: List
metadata:
  resourceVersion: ""
  selfLink: ""
[root@gke-client-tf leilichao]# k get envoyfilter -n istio-system -o yaml
apiVersion: v1
items:
- apiVersion: networking.istio.io/v1alpha3
  kind: EnvoyFilter
  metadata:
    annotations:
      kubectl.kubernetes.io/last-applied-configuration: |
        {"apiVersion":"networking.istio.io/v1alpha3","kind":"EnvoyFilter","metadata":{"annotations":{},"labels":{"app.kubernetes.io/component":"oidc-authservice","app.kubernetes.io/instance":"oidc-authservice-v1.0.0","app.kubernetes.io/managed-by":"kfctl","app.kubernetes.io/name":"oidc-authservice","app.kubernetes.io/part-of":"kubeflow","app.kubernetes.io/version":"v1.0.0"},"name":"authn-filter","namespace":"istio-system"},"spec":{"filters":[{"filterConfig":{"httpService":{"authorizationRequest":{"allowedHeaders":{"patterns":[{"exact":"cookie"},{"exact":"X-Auth-Token"}]}},"authorizationResponse":{"allowedUpstreamHeaders":{"patterns":[{"exact":"kubeflow-userid"}]}},"serverUri":{"cluster":"outbound|8080||authservice.istio-system.svc.cluster.local","failureModeAllow":false,"timeout":"10s","uri":"http://authservice.istio-system.svc.cluster.local"}},"statusOnError":{"code":"GatewayTimeout"}},"filterName":"envoy.ext_authz","filterType":"HTTP","insertPosition":{"index":"FIRST"},"listenerMatch":{"listenerType":"GATEWAY"}}],"workloadLabels":{"istio":"ingressgateway"}}}
    creationTimestamp: "2020-07-04T01:40:43Z"
    generation: 1
    labels:
      app.kubernetes.io/component: oidc-authservice
      app.kubernetes.io/instance: oidc-authservice-v1.0.0
      app.kubernetes.io/managed-by: kfctl
      app.kubernetes.io/name: oidc-authservice
      app.kubernetes.io/part-of: kubeflow
      app.kubernetes.io/version: v1.0.0
    name: authn-filter
    namespace: istio-system
    resourceVersion: "4715289"
    selfLink: /apis/networking.istio.io/v1alpha3/namespaces/istio-system/envoyfilters/authn-filter
    uid: e599ba82-315a-4fc1-9a5d-e8e35d93ca26
  spec:
    filters:
    - filterConfig:
        httpService:
          authorizationRequest:
            allowedHeaders:
              patterns:
              - exact: cookie
              - exact: X-Auth-Token
          authorizationResponse:
            allowedUpstreamHeaders:
              patterns:
              - exact: kubeflow-userid
          serverUri:
            cluster: outbound|8080||authservice.istio-system.svc.cluster.local
            failureModeAllow: false
            timeout: 10s
            uri: http://authservice.istio-system.svc.cluster.local
        statusOnError:
          code: GatewayTimeout
      filterName: envoy.ext_authz
      filterType: HTTP
      insertPosition:
        index: FIRST
      listenerMatch:
        listenerType: GATEWAY
    workloadLabels:
      istio: ingressgateway
kind: List
metadata:
  resourceVersion: ""
  selfLink: ""
RBAC issue problem analysis
After I finished my deployment, I performed the functional testing below:
I can log in with my Google account via Google OAuth
I was able to create my own profile/namespace
I was able to create a notebook server
However, I can NOT connect to the notebook server
RBAC issue investigation
I'm getting the "RBAC: access denied" error after I successfully create the notebook server on kubeflow and try to connect to it.
I managed to update the Envoy log level and got the log below.
[2020-08-06 13:32:43.290][26][debug][rbac] [external/envoy/source/extensions/filters/http/rbac/rbac_filter.cc:64] checking request: remoteAddress: 10.1.1.2:58012, localAddress: 10.1.2.66:8888, ssl: none, headers: ':authority', 'compliance-kf-system.ml'
':path', '/notebook/roger-l-c-lei/aug06/'
':method', 'GET'
'user-agent', 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/83.0.4103.116 Safari/537.36'
'accept', 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9'
'accept-encoding', 'gzip, deflate'
'accept-language', 'en,zh-CN;q=0.9,zh;q=0.8'
'cookie', 'authservice_session=MTU5NjY5Njk0MXxOd3dBTkZvMldsVllVMUZPU0VaR01sSk5RVlJJV2xkRFVrRTFTVUl5V0RKV1EwdEhTMU5QVjFCVlUwTkpSVFpYUlVoT1RGVlBUa0U9fN3lPBXDDSZMT9MTJRbG8jv7AtblKTE3r84ayeCYuKOk; _xsrf=2|1e6639f2|10d3ea0a904e0ae505fd6425888453f8|1596697030'
'referer', 'http://compliance-kf-system.ml/jupyter/'
'upgrade-insecure-requests', '1'
'x-forwarded-for', '10.10.10.230'
'x-forwarded-proto', 'http'
'x-request-id', 'babbf884-4cec-93fd-aea6-2fc60d3abb83'
'kubeflow-userid', 'roger.l.c.lei#XXXX.com'
'x-istio-attributes', 'CjAKHWRlc3RpbmF0aW9uLnNlcnZpY2UubmFtZXNwYWNlEg8SDXJvZ2VyLWwtYy1sZWkKIwoYZGVzdGluYXRpb24uc2VydmljZS5uYW1lEgcSBWF1ZzA2Ck4KCnNvdXJjZS51aWQSQBI+a3ViZXJuZXRlczovL2lzdGlvLWluZ3Jlc3NnYXRld2F5LTg5Y2Q0YmQ0Yy1kdnF3dC5pc3Rpby1zeXN0ZW0KQQoXZGVzdGluYXRpb24uc2VydmljZS51aWQSJhIkaXN0aW86Ly9yb2dlci1sLWMtbGVpL3NlcnZpY2VzL2F1ZzA2CkMKGGRlc3RpbmF0aW9uLnNlcnZpY2UuaG9zdBInEiVhdWcwNi5yb2dlci1sLWMtbGVpLnN2Yy5jbHVzdGVyLmxvY2Fs'
'x-envoy-expected-rq-timeout-ms', '300000'
'x-b3-traceid', '3bf35cca1f7b75e7a42a046b1c124b1f'
'x-b3-spanid', 'a42a046b1c124b1f'
'x-b3-sampled', '1'
'x-envoy-original-path', '/notebook/roger-l-c-lei/aug06/'
'content-length', '0'
'x-envoy-internal', 'true'
, dynamicMetadata: filter_metadata {
key: "istio_authn"
value {
}
}
[2020-08-06 13:32:43.290][26][debug][rbac] [external/envoy/source/extensions/filters/http/rbac/rbac_filter.cc:108] enforced denied
From the source code it looks like the allowed function is returning false, so it's giving the "RBAC: access denied" response.
if (engine.has_value()) {
  if (engine->allowed(*callbacks_->connection(), headers,
                      callbacks_->streamInfo().dynamicMetadata(), nullptr)) {
    ENVOY_LOG(debug, "enforced allowed");
    config_->stats().allowed_.inc();
    return Http::FilterHeadersStatus::Continue;
  } else {
    ENVOY_LOG(debug, "enforced denied");
    callbacks_->sendLocalReply(Http::Code::Forbidden, "RBAC: access denied", nullptr,
                               absl::nullopt);
    config_->stats().denied_.inc();
    return Http::FilterHeadersStatus::StopIteration;
  }
}
I searched the dumped Envoy config; it looks like the rule should allow any request whose header value matches my mail address, and from the log above I can confirm that header is present in my request.
{
  "name": "envoy.filters.http.rbac",
  "config": {
    "rules": {
      "policies": {
        "ns-access-istio": {
          "permissions": [
            {
              "and_rules": {
                "rules": [
                  {
                    "any": true
                  }
                ]
              }
            }
          ],
          "principals": [
            {
              "and_ids": {
                "ids": [
                  {
                    "header": {
                      "exact_match": "roger.l.c.lei@XXXX.com"
                    }
                  }
                ]
              }
            }
          ]
        }
      }
    }
  }
}
Understanding that the Envoy config used to validate RBAC authz comes from this config, and that it is distributed to the sidecar by Mixer, the log and code lead me to the rbac.istio.io ServiceRoleBinding config.
[root@gke-client-tf leilichao]# k get servicerolebinding -n roger-l-c-lei -o yaml
apiVersion: v1
items:
- apiVersion: rbac.istio.io/v1alpha1
  kind: ServiceRoleBinding
  metadata:
    annotations:
      role: admin
      user: roger.l.c.lei@XXXX.com
    creationTimestamp: "2020-07-04T01:35:30Z"
    generation: 5
    name: owner-binding-istio
    namespace: roger-l-c-lei
    ownerReferences:
    - apiVersion: kubeflow.org/v1
      blockOwnerDeletion: true
      controller: true
      kind: Profile
      name: roger-l-c-lei
      uid: 689c9f04-08a6-4c51-a1dc-944db1a66114
    resourceVersion: "23201026"
    selfLink: /apis/rbac.istio.io/v1alpha1/namespaces/roger-l-c-lei/servicerolebindings/owner-binding-istio
    uid: bbbffc28-689c-4099-837a-87a2feb5948f
  spec:
    roleRef:
      kind: ServiceRole
      name: ns-access-istio
    subjects:
    - properties:
        request.headers[]: roger.l.c.lei@XXXX.com
  status: {}
kind: List
metadata:
  resourceVersion: ""
  selfLink: ""
I wanted to try updating this ServiceRoleBinding to validate some assumptions, since I can't debug the Envoy source code and there's not enough log to show why exactly the "allowed" method returns false.
However, I found I cannot update the ServiceRoleBinding: it reverts to its original version every time, right after I finish editing it.
I found that there is an istio-galley ValidatingWebhookConfiguration (code block below) that monitors these Istio RBAC resources.
[root@gke-client-tf leilichao]# k get validatingwebhookconfigurations istio-galley -oyaml
apiVersion: admissionregistration.k8s.io/v1beta1
kind: ValidatingWebhookConfiguration
metadata:
  creationTimestamp: "2020-08-04T15:00:59Z"
  generation: 1
  labels:
    app: galley
    chart: galley
    heritage: Tiller
    istio: galley
    release: istio
  name: istio-galley
  ownerReferences:
  - apiVersion: extensions/v1beta1
    blockOwnerDeletion: true
    controller: true
    kind: Deployment
    name: istio-galley
    uid: 11fef012-4145-49ac-a43c-2e1d0a460ea4
  resourceVersion: "22484680"
  selfLink: /apis/admissionregistration.k8s.io/v1beta1/validatingwebhookconfigurations/istio-galley
  uid: 6f485e28-3b5a-4a3b-b31f-a5c477c82619
webhooks:
- admissionReviewVersions:
  - v1beta1
  clientConfig:
    caBundle:
    .
    .
    .
    service:
      name: istio-galley
      namespace: istio-system
      path: /admitpilot
      port: 443
  failurePolicy: Fail
  matchPolicy: Exact
  name: pilot.validation.istio.io
  namespaceSelector: {}
  objectSelector: {}
  rules:
  - apiGroups:
    - config.istio.io
    apiVersions:
    - v1alpha2
    operations:
    - CREATE
    - UPDATE
    resources:
    - httpapispecs
    - httpapispecbindings
    - quotaspecs
    - quotaspecbindings
    scope: '*'
  - apiGroups:
    - rbac.istio.io
    apiVersions:
    - '*'
    operations:
    - CREATE
    - UPDATE
    resources:
    - '*'
    scope: '*'
  - apiGroups:
    - authentication.istio.io
    apiVersions:
    - '*'
    operations:
    - CREATE
    - UPDATE
    resources:
    - '*'
    scope: '*'
  - apiGroups:
    - networking.istio.io
    apiVersions:
    - '*'
    operations:
    - CREATE
    - UPDATE
    resources:
    - destinationrules
    - envoyfilters
    - gateways
    - serviceentries
    - sidecars
    - virtualservices
    scope: '*'
  sideEffects: Unknown
  timeoutSeconds: 30
- admissionReviewVersions:
  - v1beta1
  clientConfig:
    caBundle:
    .
    .
    .
    service:
      name: istio-galley
      namespace: istio-system
      path: /admitmixer
      port: 443
  failurePolicy: Fail
  matchPolicy: Exact
  name: mixer.validation.istio.io
  namespaceSelector: {}
  objectSelector: {}
  rules:
  - apiGroups:
    - config.istio.io
    apiVersions:
    - v1alpha2
    operations:
    - CREATE
    - UPDATE
    resources:
    - rules
    - attributemanifests
    - circonuses
    - deniers
    - fluentds
    - kubernetesenvs
    - listcheckers
    - memquotas
    - noops
    - opas
    - prometheuses
    - rbacs
    - solarwindses
    - stackdrivers
    - cloudwatches
    - dogstatsds
    - statsds
    - stdios
    - apikeys
    - authorizations
    - checknothings
    - listentries
    - logentries
    - metrics
    - quotas
    - reportnothings
    - tracespans
    scope: '*'
  sideEffects: Unknown
  timeoutSeconds: 30
Long story short
I've been banging my head against this Istio issue for more than 2 weeks. I'm sure there are plenty of people feeling the same while troubleshooting Istio on k8s. Any suggestion is welcome!
Here's how I understand the problem, please correct me if I'm wrong:
The log evidence shows the RBAC rules are not allowing my access to the resource
I need to update the RBAC rules
Rules are distributed by Mixer to the Envoy container according to the ServiceRoleBinding
So I need to update the ServiceRoleBinding instead
I cannot update the ServiceRoleBinding, because either the validating admission webhook or the Istio Mixer is preventing me from doing it
I've run into the problems below:
I cannot update the ServiceRoleBinding even after I deleted the validating webhook
I tried to delete this validating webhook to update the ServiceRoleBinding; the resource reverts right after I save the edit.
The validating webhook is actually generated automatically from a configmap, so I had to update that to update the webhook.
Is there some kind of cache in Galley that Mixer uses to distribute the config?
I can't find any relevant log indicating that the rbac.istio.io resources are protected/validated by any service in the istio-system namespace.
How can I get the logs of the Mixer?
I need to understand which component exactly controls the policy. I managed to update the log level but failed to find anything useful.
Most importantly: how do I debug an Envoy container?
I need to debug the Envoy app to understand why it's returning false from the allowed function.
If we cannot debug it easily, is there a document that shows how to modify the code to add more logging and build a new image to push to GCR, so I can do another run and see from the logs what's going on behind the scenes?
Answering my own question, since I've made some progress on it.
I cannot update the ServiceRoleBinding even after I deleted the validating webhook
That's because the ServiceRoleBinding is actually generated/monitored/managed by the profile controller in the kubeflow namespace, not by the validating webhook.
I'm having this RBAC issue because, based on the params.yaml in the profiles manifest folder, the rule is generated as
request.headers[]: roger.l.c.lei@XXXX.com
instead of
request.headers[kubeflow-userid]: roger.l.c.lei@XXXX.com
because I mis-configured the value as blank instead of userid-header=kubeflow-userid in params.yaml. The corrected binding is sketched below.
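For comparison, with the header key in place the generated subject should look like this - a sketch assembled from the ServiceRoleBinding dumped above, not a fresh dump:
spec:
  roleRef:
    kind: ServiceRole
    name: ns-access-istio
  subjects:
  - properties:
      # the header name is now present, so Envoy's RBAC principal can match the request
      request.headers[kubeflow-userid]: roger.l.c.lei@XXXX.com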
Check your AuthorizationPolicy resource in your application namespace.
For new clusters, please see this comment from issue 4440:
https://github.com/kubeflow/pipelines/issues/4440
cat << EOF | kubectl apply -f -
apiVersion: security.istio.io/v1beta1
kind: AuthorizationPolicy
metadata:
  name: bind-ml-pipeline-nb-kubeflow-user-example-com
  namespace: kubeflow
spec:
  selector:
    matchLabels:
      app: ml-pipeline
  rules:
    - from:
        - source:
            principals: ["cluster.local/ns/kubeflow-user-example-com/sa/default-editor"]
---
apiVersion: networking.istio.io/v1alpha3
kind: EnvoyFilter
metadata:
  name: add-header
  namespace: kubeflow-user-example-com
spec:
  configPatches:
    - applyTo: VIRTUAL_HOST
      match:
        context: SIDECAR_OUTBOUND
        routeConfiguration:
          vhost:
            name: ml-pipeline.kubeflow.svc.cluster.local:8888
            route:
              name: default
      patch:
        operation: MERGE
        value:
          request_headers_to_add:
            - append: true
              header:
                key: kubeflow-userid
                value: user@example.com
  workloadSelector:
    labels:
      notebook-name: test2
EOF
In my notebook
import kfp
client = kfp.Client()
print(client.list_experiments())
Output:
{'experiments': [{'created_at': datetime.datetime(2021, 8, 12, 9, 14, 20, tzinfo=tzlocal()),
                  'description': None,
                  'id': 'b2e552e5-3324-483a-8ec8-b32894f49281',
                  'name': 'test',
                  'resource_references': [{'key': {'id': 'kubeflow-user-example-com',
                                                   'type': 'NAMESPACE'},
                                           'name': None,
                                           'relationship': 'OWNER'}],
                  'storage_state': 'STORAGESTATE_AVAILABLE'}],
 'next_page_token': None,
 'total_size': 1}

Ocelot ApiGateway unable to reach other services on Kubernetes

I'm working on a simple project to learn more about microservices.
I have a very simple .NET Core web app with a single GET endpoint.
I have added an ApiGateway web app with Ocelot, and everything seems to be working fine except when I deploy to a local Kubernetes cluster.
These are the yaml files I'm using for the deployment:
ApiGateway.yaml
kind: Pod
apiVersion: v1
metadata:
  name: api-gateway
  labels:
    app: api-gateway
spec:
  containers:
    - name: api-gateway
      image: apigateway:dev
---
kind: Service
apiVersion: v1
metadata:
  name: api-gateway-service
spec:
  selector:
    app: api-gateway
  ports:
    - port: 80
TestService.yaml
kind: Pod
apiVersion: v1
metadata:
  name: testservice
  labels:
    app: testservice
spec:
  containers:
    - name: testservice
      image: testbuild:latest
---
kind: Service
apiVersion: v1
metadata:
  name: testservice-service
spec:
  selector:
    app: testservice
  ports:
    - port: 80
ocelot.json
{
"ReRoutes": [
{
"DownstreamPathTemplate": "/endpoint",
"DownstreamScheme": "http",
"DownstreamHostAndPorts": [
{
"Host": "testservice-service",
"Port": 80
}
],
"UpstreamPathTemplate": "/test",
"UpstreamHttpMethod": [ "Get" ]
}
]
}
If I try to make a request with curl directly from the ApiGateway pod to the TestService service, it works with no issue. But when I try to request it through Ocelot, it returns a 500 error, saying:
warn: Ocelot.Responder.Middleware.ResponderMiddleware[0]
requestId: 0HLUEDTNVVN26:00000002, previousRequestId: no previous request id, message: Error Code: UnableToCompleteRequestError Message: Error making http request, exception: System.Net.Http.HttpRequestException: Name or service not known
---> System.Net.Sockets.SocketException (0xFFFDFFFF): Name or service not known
Also, I tried following https://ocelot.readthedocs.io/en/latest/features/kubernetes.html, but honestly the doc is not really clear and I haven't had any success with it so far.
Any idea what the problem might be?
(I'm also using an Ingress resource for exposing the services, but that works with no problem, so I will keep it out of this thread for now.)

I have a problem with a Kubernetes deployment. Can anybody help? I always get this error when trying to connect to the cluster IP

I have problems with Kubernetes. I have been trying to deploy my service for two days now, but I'm doing something wrong.
{
  "kind": "Status",
  "apiVersion": "v1",
  "metadata": {},
  "status": "Failure",
  "message": "forbidden: User \"system:anonymous\" cannot get path \"/\": No policy matched.",
  "reason": "Forbidden",
  "details": {},
  "code": 403
}
Does anybody know what the problem could be?
Here is also my yaml file:
# Certificate
apiVersion: certmanager.k8s.io/v1alpha1
kind: Certificate
metadata:
  name: ${APP_NAME}
spec:
  secretName: ${APP_NAME}-cert
  dnsNames:
    - ${URL}
    - www.${URL}
  acme:
    config:
      - domains:
          - ${URL}
          - www.${URL}
        http01:
          ingressClass: nginx
  issuerRef:
    name: ${CERT_ISSUER}
    kind: ClusterIssuer
---
# Ingress
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: ${APP_NAME}
  annotations:
    kubernetes.io/ingress.class: nginx
    nginx.ingress.kubernetes.io/ssl-redirect: 'true'
    nginx.ingress.kubernetes.io/from-to-www-redirect: 'true'
spec:
  tls:
    - secretName: ${APP_NAME}-cert
      hosts:
        - ${URL}
        - www.${URL}
  rules:
    - host: ${URL}
      http:
        paths:
          - backend:
              serviceName: ${APP_NAME}-service
              servicePort: 80
---
# Service
apiVersion: v1
kind: Service
metadata:
  name: ${APP_NAME}-service
  labels:
    app: ${CI_PROJECT_NAME}
spec:
  selector:
    name: ${APP_NAME}
    app: ${CI_PROJECT_NAME}
  ports:
    - name: http
      port: 80
      targetPort: http
---
# Deployment
apiVersion: apps/v1
kind: Deployment
metadata:
  name: ${APP_NAME}
  labels:
    app: ${CI_PROJECT_NAME}
spec:
  replicas: ${REPLICAS}
  revisionHistoryLimit: 0
  selector:
    matchLabels:
      app: ${CI_PROJECT_NAME}
  template:
    metadata:
      labels:
        name: ${APP_NAME}
        app: ${CI_PROJECT_NAME}
    spec:
      containers:
        - name: webapp
          image: eu.gcr.io/my-site/my-site.com:latest
          imagePullPolicy: Always
          ports:
            - name: http
              containerPort: 80
          env:
            - name: COMMIT_SHA
              value: ${CI_COMMIT_SHA}
          livenessProbe:
            tcpSocket:
              port: 80
            initialDelaySeconds: 30
            timeoutSeconds: 1
          readinessProbe:
            tcpSocket:
              port: 80
            initialDelaySeconds: 5
            timeoutSeconds: 1
          resources:
            requests:
              memory: '16Mi'
            limits:
              memory: '64Mi'
      imagePullSecrets:
        - name: ${REGISTRY_PULL_SECRET}
Can anybody help me with this? I'm stuck and I've no idea what could be the problem. This is also my first Kubernetes project.
"message": "forbidden: User \"system:anonymous\" cannot get path \"/\": No policy matched.",
.. means just what it says: your request to the kubernetes api was not authenticated (that's the system:anonymous part), and your RBAC configuration does not tolerate the anonymous user making any requests to the API
No one here is going to be able to help you straighten out that problem, because fixing that depends on a horrific number of variables. Perhaps ask your cluster administrator to provide you with the correct credentials.
I have explained it in this post. You will need a ServiceAccount, a ClusterRole, and a RoleBinding. You can find an explanation in this article, or, as Matthew L Daniel mentioned, in the Kubernetes documentation. A sketch of those three resources follows below.
If you still have problems, provide the method/tutorial you used to deploy the cluster (as "Gitlab Kubernetes integration" does not say much about the method you used).
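A minimal sketch of those three resources - every name, resource list, and verb below is a placeholder to adapt, not a value from the original cluster:
apiVersion: v1
kind: ServiceAccount
metadata:
  name: deployer              # placeholder name
  namespace: default
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: deployer-role         # placeholder name
rules:
  - apiGroups: ["", "apps", "extensions"]
    resources: ["deployments", "services", "pods"]   # narrow to what you need
    verbs: ["get", "list", "watch", "create", "update", "patch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: deployer-binding      # placeholder name
  namespace: default
subjects:
  - kind: ServiceAccount
    name: deployer
    namespace: default
roleRef:
  kind: ClusterRole
  name: deployer-role
  apiGroup: rbac.authorization.k8s.io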