Why do I have a Kubernetes API permission problem?

I have a Python script that tries to scale a StatefulSet from inside a pod, but it gets a Forbidden error from the API server. The following YAML file shows my Role and RoleBinding:
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: server-controller
  namespace: code-server
rules:
- apiGroups: ["*"]
  resources:
  - statefulsets
  verbs: ["update", "patch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: server-controller
  namespace: code-server
subjects:
- kind: ServiceAccount
  name: server-controller
  namespace: code-server
roleRef:
  kind: Role
  name: server-controller
  apiGroup: rbac.authorization.k8s.io
The following Python snippet shows how I call the API:
import kubernetes.client
import kubernetes.config

# Use the credentials of the pod's service account.
kubernetes.config.load_incluster_config()
app = kubernetes.client.AppsV1Api()
body = {"spec": {"replicas": 1}}
app.patch_namespaced_stateful_set_scale(
    name="jim",
    namespace="code-server",
    body=body)
I get the following error:
kubernetes.client.exceptions.ApiException: (403)
Reason: Forbidden
HTTP response headers: HTTPHeaderDict({'Cache-Control': 'no-cache', 'Content-Type': 'application/json', 'X-Content-Type-Options': 'nosniff', 'Date': 'Fri, 15 Oct 2021 15:25:24 GMT', 'Content-Length': '469'})
HTTP response body: {
  "kind": "Status",
  "apiVersion": "v1",
  "metadata": {
  },
  "status": "Failure",
  "message": "statefulsets.apps \"jim\" is forbidden: User \"system:serviceaccount:code-server:server-controller\" cannot patch resource \"statefulsets/scale\" in API group \"apps\" in the namespace \"code-server\"",
  "reason": "Forbidden",
  "details": {
    "name": "jim",
    "group": "apps",
    "kind": "statefulsets"
  },
  "code": 403
}

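The solution was to change "statefulsets" to "statefulsets/scale" in the "resources" field under "rules" in the Role. The scale subresource is authorized separately from its parent resource, which is why the error message names "statefulsets/scale" rather than "statefulsets".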
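To illustrate why the original rule did not match, here is a minimal sketch of rule matching. It is a simplified model written for this answer, not the API server's actual authorizer, and the helper name rule_allows is hypothetical. It only shows that a subresource such as statefulsets/scale has to be listed explicitly (or covered by a wildcard).

```python
def rule_allows(rule, api_group, resource, verb):
    """Simplified model of matching one RBAC rule (not the real authorizer)."""
    def matches(allowed, wanted):
        # "*" grants everything; otherwise the entry must match exactly.
        return "*" in allowed or wanted in allowed

    return (
        matches(rule["apiGroups"], api_group)
        and matches(rule["resources"], resource)
        and matches(rule["verbs"], verb)
    )


# The rule from the question and the corrected rule from the answer.
original_rule = {
    "apiGroups": ["*"],
    "resources": ["statefulsets"],
    "verbs": ["update", "patch"],
}
fixed_rule = {
    "apiGroups": ["*"],
    "resources": ["statefulsets/scale"],
    "verbs": ["update", "patch"],
}

# The client call patches the "scale" subresource in the "apps" group.
print(rule_allows(original_rule, "apps", "statefulsets/scale", "patch"))
print(rule_allows(fixed_rule, "apps", "statefulsets/scale", "patch"))
```

Under this model the first call is rejected and the second is accepted, which mirrors the 403 in the question and the fix described above.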

Related

How can I create router and load balance service added to traefik via consulCatalog?

I have Nextcloud running on two bare-metal nodes:
node1: 192.168.1.10
node2: 192.168.1.11
In Consul I have defined the nextcloud service as follows on both nodes:
{
  "service": {
    "name": "nextcloud",
    "tags": ["nextcloud", "traefik"],
    "port": 80,
    "check": {
      "tcp": "localhost:80",
      "args": ["ping", "-c1", "127.0.0.1"],
      "interval": "10s",
      "status": "passing",
      "success_before_passing": 3,
      "failures_before_critical": 3
    }
  }
}
This shows up in Consul fine.
Static config, traefik.yaml:
global:
  # Send anonymous usage data
  sendAnonymousUsage: true
api:
  dashboard: true
  debug: true
log:
  level: DEBUG
entryPoints:
  http:
    address: ":80"
  https:
    address: ":443"
serversTransport:
  insecureSkipVerify: true
providers:
  docker:
    endpoint: "unix:///var/run/docker.sock"
    exposedByDefault: false
  file:
    directory: "/config/"
    watch: true
  consulCatalog:
    defaultRule: "Host(`{{ .Name }}.sub.mydomain.com`)"
    endpoint:
      address: http://127.0.0.1:8500
certificatesResolvers:
  linode:
    acme:
      caServer: https://acme-staging-v02.api.letsencrypt.org/directory
      email: myemail@domain.com
      storage: acme.json
      dnsChallenge:
        provider: linode
        resolvers:
          - "1.1.1.1:53"
          - "1.0.0.1:53"
And then the dynamic config, /config/config.yaml:
http:
  routers:
    nextcloud@consulCatalog:
      entryPoints:
        - "https"
      rule: "Host(`home.sub.mydomain.com`) && Path(`/nextcloud`)"
      tls:
        certResolver: linode
      service: nextcloud
  services:
    nextcloud:
      loadBalancer:
        servers:
          - url: http://192.168.1.10
          - url: http://192.168.1.11
        passHostHeader: true
But this shows up as a separate router under the file provider, with TLS, in addition to the existing consulCatalog router,
and with no IP or domain mapped.
The actual consulCatalog router shows up too, but without TLS.
I am wondering why my dynamic HTTP configuration did not update nextcloud@consulcatalog and set the https entrypoint.
Any help will be greatly appreciated; I am struggling very hard to get this to work.
I have tried following the Traefik docs, but they are very confusing, especially the Consul Catalog part.
Your configuration shows up under the file provider because you are defining it statically in the file at /config/config.yaml.
To have Traefik retrieve this configuration dynamically from Consul, do not define the router and service in that file; instead, configure tags on the Consul service registrations that instruct Traefik how to route traffic to your service.
For example:
{
  "service": {
    "name": "nextcloud",
    "tags": [
      "nextcloud",
      "traefik.enable=true",
      "traefik.http.routers.nextcloud.entrypoints=https",
      "traefik.http.routers.nextcloud.rule=(Host(`home.sub.mydomain.com`) && Path(`/nextcloud`))",
      "traefik.http.routers.nextcloud.tls.certresolver=linode",
      "traefik.http.services.nextcloud.loadbalancer.passhostheader=true"
    ],
    "port": 80,
    "check": {
      "tcp": "localhost:80",
      "args": [
        "ping",
        "-c1",
        "127.0.0.1"
      ],
      "interval": "10s",
      "status": "passing",
      "success_before_passing": 3,
      "failures_before_critical": 3
    }
  }
}
More info can be found in the Routing Configuration docs for Traefik's Consul Catalog provider.
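If the same registration has to be produced for several services, the tag list can be generated rather than typed by hand. The following is only a sketch: the helper names traefik_tags and service_definition are made up for this example, and the tag keys are copied from the registration above rather than taken from any Traefik or Consul client library.

```python
import json


def traefik_tags(router, rule, entrypoint, cert_resolver):
    """Build the Consul tags used above for one Traefik HTTP router."""
    prefix = f"traefik.http.routers.{router}"
    return [
        "traefik.enable=true",
        f"{prefix}.entrypoints={entrypoint}",
        f"{prefix}.rule={rule}",
        f"{prefix}.tls.certresolver={cert_resolver}",
    ]


def service_definition(name, port, tags):
    """Wrap the tags in a Consul service definition (health check omitted)."""
    return {"service": {"name": name, "port": port, "tags": tags}}


tags = traefik_tags(
    router="nextcloud",
    rule="Host(`home.sub.mydomain.com`) && Path(`/nextcloud`)",
    entrypoint="https",
    cert_resolver="linode",
)
# Print the JSON that would go into the Consul service definition file.
print(json.dumps(service_definition("nextcloud", 80, tags), indent=2))
```

The printed JSON can be saved as the service definition on each node; the health check from the original registration would still need to be added.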

Istio: getting "RBAC: access denied" even though the ServiceRoleBinding appears to allow access

I've been struggling with Istio, so here I am seeking help from the experts!
Background
I'm trying to deploy my Kubeflow application for multi-tenancy with Dex,
following the official Kubeflow documentation with the manifest file from GitHub.
Here is the component/version information:
I'm running Kubernetes 1.15 on GKE
Istio 1.1.6 is used in Kubeflow as the service mesh
I'm trying to deploy Kubeflow 1.0 for ML
Dex 1.0 is deployed for authn
With the manifest file I successfully deployed Kubeflow on my cluster. Here's what I've done:
1. Deployed the Kubeflow application on the cluster
2. Deployed Dex with the OIDC service to enable authn against Google OAuth 2.0
3. Enabled RBAC
4. Created an Envoy filter to append the header "kubeflow-userid" with the logged-in user
Here is the verification of steps 3 and 4:
Check that RBAC is enabled and that the EnvoyFilter for kubeflow-userid is added
[root@gke-client-tf leilichao]# k get clusterrbacconfigs -o yaml
apiVersion: v1
items:
- apiVersion: rbac.istio.io/v1alpha1
kind: ClusterRbacConfig
metadata:
annotations:
kubectl.kubernetes.io/last-applied-configuration: |
{"apiVersion":"rbac.istio.io/v1alpha1","kind":"ClusterRbacConfig","metadata":{"annotations":{},"name":"default"},"spec":{"mode":"ON"}}
creationTimestamp: "2020-07-04T01:28:52Z"
generation: 2
name: default
resourceVersion: "5986075"
selfLink: /apis/rbac.istio.io/v1alpha1/clusterrbacconfigs/default
uid: db70920e-f364-40ec-a93b-a3364f88650f
spec:
mode: "ON"
kind: List
metadata:
resourceVersion: ""
selfLink: ""
[root@gke-client-tf leilichao]# k get envoyfilter -n istio-system -o yaml
apiVersion: v1
items:
- apiVersion: networking.istio.io/v1alpha3
kind: EnvoyFilter
metadata:
annotations:
kubectl.kubernetes.io/last-applied-configuration: |
{"apiVersion":"networking.istio.io/v1alpha3","kind":"EnvoyFilter","metadata":{"annotations":{},"labels":{"app.kubernetes.io/component":"oidc-authservice","app.kubernetes.io/instance":"oidc-authservice-v1.0.0","app.kubernetes.io/managed-by":"kfctl","app.kubernetes.io/name":"oidc-authservice","app.kubernetes.io/part-of":"kubeflow","app.kubernetes.io/version":"v1.0.0"},"name":"authn-filter","namespace":"istio-system"},"spec":{"filters":[{"filterConfig":{"httpService":{"authorizationRequest":{"allowedHeaders":{"patterns":[{"exact":"cookie"},{"exact":"X-Auth-Token"}]}},"authorizationResponse":{"allowedUpstreamHeaders":{"patterns":[{"exact":"kubeflow-userid"}]}},"serverUri":{"cluster":"outbound|8080||authservice.istio-system.svc.cluster.local","failureModeAllow":false,"timeout":"10s","uri":"http://authservice.istio-system.svc.cluster.local"}},"statusOnError":{"code":"GatewayTimeout"}},"filterName":"envoy.ext_authz","filterType":"HTTP","insertPosition":{"index":"FIRST"},"listenerMatch":{"listenerType":"GATEWAY"}}],"workloadLabels":{"istio":"ingressgateway"}}}
creationTimestamp: "2020-07-04T01:40:43Z"
generation: 1
labels:
app.kubernetes.io/component: oidc-authservice
app.kubernetes.io/instance: oidc-authservice-v1.0.0
app.kubernetes.io/managed-by: kfctl
app.kubernetes.io/name: oidc-authservice
app.kubernetes.io/part-of: kubeflow
app.kubernetes.io/version: v1.0.0
name: authn-filter
namespace: istio-system
resourceVersion: "4715289"
selfLink: /apis/networking.istio.io/v1alpha3/namespaces/istio-system/envoyfilters/authn-filter
uid: e599ba82-315a-4fc1-9a5d-e8e35d93ca26
spec:
filters:
- filterConfig:
httpService:
authorizationRequest:
allowedHeaders:
patterns:
- exact: cookie
- exact: X-Auth-Token
authorizationResponse:
allowedUpstreamHeaders:
patterns:
- exact: kubeflow-userid
serverUri:
cluster: outbound|8080||authservice.istio-system.svc.cluster.local
failureModeAllow: false
timeout: 10s
uri: http://authservice.istio-system.svc.cluster.local
statusOnError:
code: GatewayTimeout
filterName: envoy.ext_authz
filterType: HTTP
insertPosition:
index: FIRST
listenerMatch:
listenerType: GATEWAY
workloadLabels:
istio: ingressgateway
kind: List
metadata:
resourceVersion: ""
selfLink: ""
RBAC issue problem analysis
After I finished my deployment, I performed the functional testing below:
I can log in with my Google account via Google OAuth
I was able to create my own profile/namespace
I was able to create a notebook server
However, I can NOT connect to the notebook server
RBAC issue investigation
I get an "RBAC: access denied" error after I successfully create the notebook server on Kubeflow and try to connect to it.
I managed to raise the Envoy log level and got the log below.
[2020-08-06 13:32:43.290][26][debug][rbac] [external/envoy/source/extensions/filters/http/rbac/rbac_filter.cc:64] checking request: remoteAddress: 10.1.1.2:58012, localAddress: 10.1.2.66:8888, ssl: none, headers: ':authority', 'compliance-kf-system.ml'
':path', '/notebook/roger-l-c-lei/aug06/'
':method', 'GET'
'user-agent', 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/83.0.4103.116 Safari/537.36'
'accept', 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9'
'accept-encoding', 'gzip, deflate'
'accept-language', 'en,zh-CN;q=0.9,zh;q=0.8'
'cookie', 'authservice_session=MTU5NjY5Njk0MXxOd3dBTkZvMldsVllVMUZPU0VaR01sSk5RVlJJV2xkRFVrRTFTVUl5V0RKV1EwdEhTMU5QVjFCVlUwTkpSVFpYUlVoT1RGVlBUa0U9fN3lPBXDDSZMT9MTJRbG8jv7AtblKTE3r84ayeCYuKOk; _xsrf=2|1e6639f2|10d3ea0a904e0ae505fd6425888453f8|1596697030'
'referer', 'http://compliance-kf-system.ml/jupyter/'
'upgrade-insecure-requests', '1'
'x-forwarded-for', '10.10.10.230'
'x-forwarded-proto', 'http'
'x-request-id', 'babbf884-4cec-93fd-aea6-2fc60d3abb83'
'kubeflow-userid', 'roger.l.c.lei@XXXX.com'
'x-istio-attributes', 'CjAKHWRlc3RpbmF0aW9uLnNlcnZpY2UubmFtZXNwYWNlEg8SDXJvZ2VyLWwtYy1sZWkKIwoYZGVzdGluYXRpb24uc2VydmljZS5uYW1lEgcSBWF1ZzA2Ck4KCnNvdXJjZS51aWQSQBI+a3ViZXJuZXRlczovL2lzdGlvLWluZ3Jlc3NnYXRld2F5LTg5Y2Q0YmQ0Yy1kdnF3dC5pc3Rpby1zeXN0ZW0KQQoXZGVzdGluYXRpb24uc2VydmljZS51aWQSJhIkaXN0aW86Ly9yb2dlci1sLWMtbGVpL3NlcnZpY2VzL2F1ZzA2CkMKGGRlc3RpbmF0aW9uLnNlcnZpY2UuaG9zdBInEiVhdWcwNi5yb2dlci1sLWMtbGVpLnN2Yy5jbHVzdGVyLmxvY2Fs'
'x-envoy-expected-rq-timeout-ms', '300000'
'x-b3-traceid', '3bf35cca1f7b75e7a42a046b1c124b1f'
'x-b3-spanid', 'a42a046b1c124b1f'
'x-b3-sampled', '1'
'x-envoy-original-path', '/notebook/roger-l-c-lei/aug06/'
'content-length', '0'
'x-envoy-internal', 'true'
, dynamicMetadata: filter_metadata {
key: "istio_authn"
value {
}
}
[2020-08-06 13:32:43.290][26][debug][rbac] [external/envoy/source/extensions/filters/http/rbac/rbac_filter.cc:108] enforced denied
From the source code, it looks like the allowed function is returning false, which produces the "RBAC: access denied" response.
if (engine.has_value()) {
  if (engine->allowed(*callbacks_->connection(), headers,
                      callbacks_->streamInfo().dynamicMetadata(), nullptr)) {
    ENVOY_LOG(debug, "enforced allowed");
    config_->stats().allowed_.inc();
    return Http::FilterHeadersStatus::Continue;
  } else {
    ENVOY_LOG(debug, "enforced denied");
    callbacks_->sendLocalReply(Http::Code::Forbidden, "RBAC: access denied", nullptr,
                               absl::nullopt);
    config_->stats().denied_.inc();
    return Http::FilterHeadersStatus::StopIteration;
  }
}
I searched the dumped Envoy config, and it looks like the rule should allow any request that carries a header with my email address as its value. I can confirm from the log above that I have that in my headers.
{
  "name": "envoy.filters.http.rbac",
  "config": {
    "rules": {
      "policies": {
        "ns-access-istio": {
          "permissions": [
            {
              "and_rules": {
                "rules": [
                  {
                    "any": true
                  }
                ]
              }
            }
          ],
          "principals": [
            {
              "and_ids": {
                "ids": [
                  {
                    "header": {
                      "exact_match": "roger.l.c.lei@XXXX.com"
                    }
                  }
                ]
              }
            }
          ]
        }
      }
    }
  }
}
My understanding is that this is the Envoy config used to validate RBAC authz, and that it is distributed to the sidecar by Mixer. The log and the code lead me to the rbac.istio.io ServiceRoleBinding config.
[root@gke-client-tf leilichao]# k get servicerolebinding -n roger-l-c-lei -o yaml
apiVersion: v1
items:
- apiVersion: rbac.istio.io/v1alpha1
kind: ServiceRoleBinding
metadata:
annotations:
role: admin
user: roger.l.c.lei@XXXX.com
creationTimestamp: "2020-07-04T01:35:30Z"
generation: 5
name: owner-binding-istio
namespace: roger-l-c-lei
ownerReferences:
- apiVersion: kubeflow.org/v1
blockOwnerDeletion: true
controller: true
kind: Profile
name: roger-l-c-lei
uid: 689c9f04-08a6-4c51-a1dc-944db1a66114
resourceVersion: "23201026"
selfLink: /apis/rbac.istio.io/v1alpha1/namespaces/roger-l-c-lei/servicerolebindings/owner-binding-istio
uid: bbbffc28-689c-4099-837a-87a2feb5948f
spec:
roleRef:
kind: ServiceRole
name: ns-access-istio
subjects:
- properties:
request.headers[]: roger.l.c.lei@XXXX.com
status: {}
kind: List
metadata:
resourceVersion: ""
selfLink: ""
I wanted to try updating this ServiceRoleBinding to validate some assumptions, since I can't debug the Envoy source code and there isn't enough logging to show exactly why the "allowed" method returns false.
However, I find that I cannot update the ServiceRoleBinding. It reverts to its original version every time, right after I finish editing it.
I found that there's this istio-galley ValidatingWebhookConfiguration (code block below) that monitors these Istio RBAC resources.
[root@gke-client-tf leilichao]# k get validatingwebhookconfigurations istio-galley -oyaml
apiVersion: admissionregistration.k8s.io/v1beta1
kind: ValidatingWebhookConfiguration
metadata:
creationTimestamp: "2020-08-04T15:00:59Z"
generation: 1
labels:
app: galley
chart: galley
heritage: Tiller
istio: galley
release: istio
name: istio-galley
ownerReferences:
- apiVersion: extensions/v1beta1
blockOwnerDeletion: true
controller: true
kind: Deployment
name: istio-galley
uid: 11fef012-4145-49ac-a43c-2e1d0a460ea4
resourceVersion: "22484680"
selfLink: /apis/admissionregistration.k8s.io/v1beta1/validatingwebhookconfigurations/istio-galley
uid: 6f485e28-3b5a-4a3b-b31f-a5c477c82619
webhooks:
- admissionReviewVersions:
- v1beta1
clientConfig:
caBundle:
.
.
.
service:
name: istio-galley
namespace: istio-system
path: /admitpilot
port: 443
failurePolicy: Fail
matchPolicy: Exact
name: pilot.validation.istio.io
namespaceSelector: {}
objectSelector: {}
rules:
- apiGroups:
- config.istio.io
apiVersions:
- v1alpha2
operations:
- CREATE
- UPDATE
resources:
- httpapispecs
- httpapispecbindings
- quotaspecs
- quotaspecbindings
scope: '*'
- apiGroups:
- rbac.istio.io
apiVersions:
- '*'
operations:
- CREATE
- UPDATE
resources:
- '*'
scope: '*'
- apiGroups:
- authentication.istio.io
apiVersions:
- '*'
operations:
- CREATE
- UPDATE
resources:
- '*'
scope: '*'
- apiGroups:
- networking.istio.io
apiVersions:
- '*'
operations:
- CREATE
- UPDATE
resources:
- destinationrules
- envoyfilters
- gateways
- serviceentries
- sidecars
- virtualservices
scope: '*'
sideEffects: Unknown
timeoutSeconds: 30
- admissionReviewVersions:
- v1beta1
clientConfig:
caBundle:
.
.
.
service:
name: istio-galley
namespace: istio-system
path: /admitmixer
port: 443
failurePolicy: Fail
matchPolicy: Exact
name: mixer.validation.istio.io
namespaceSelector: {}
objectSelector: {}
rules:
- apiGroups:
- config.istio.io
apiVersions:
- v1alpha2
operations:
- CREATE
- UPDATE
resources:
- rules
- attributemanifests
- circonuses
- deniers
- fluentds
- kubernetesenvs
- listcheckers
- memquotas
- noops
- opas
- prometheuses
- rbacs
- solarwindses
- stackdrivers
- cloudwatches
- dogstatsds
- statsds
- stdios
- apikeys
- authorizations
- checknothings
- listentries
- logentries
- metrics
- quotas
- reportnothings
- tracespans
scope: '*'
sideEffects: Unknown
timeoutSeconds: 30
Long story short
I've been banging my head against this Istio issue for more than two weeks. I'm sure plenty of people feel the same when troubleshooting Istio on k8s. Any suggestion is welcome!
Here's how I understand the problem; please correct me if I'm wrong:
The log evidence shows the RBAC rules are not allowing my access to the resource
I need to update the RBAC rules
The rules are distributed by Mixer to the Envoy container according to the ServiceRoleBinding
So I need to update the ServiceRoleBinding instead
I cannot update the ServiceRoleBinding because either the validating admission webhook or Istio Mixer is preventing me from doing it
I've run into the problems below:
I cannot update the ServiceRoleBinding even after I deleted the validating webhook
I tried to delete this validating webhook to update the ServiceRoleBinding. The resource reverts right after I save the edit.
The validating webhook is actually generated automatically from a ConfigMap, so I had to update that to update the webhook.
Is there some kind of cache in Galley that Mixer uses to distribute the config?
I can't find any relevant log that indicates the rbac.istio.io resource is protected/validated by any service in the istio-system namespace.
How can I get the Mixer logs?
I need to understand exactly which component controls the policy. I managed to update the log level but failed to find anything useful.
Most importantly, how do I debug an Envoy container?
I need to debug the Envoy app to understand why it's returning false for the allowed function.
If it cannot be debugged easily, is there a document that explains how to update the code to add more logging and build a new image to GCR, so I can do another run and see from the log what's going on behind the scenes?
Answering my own question since I've made some progress.
I cannot update the ServiceRoleBinding even after I deleted the validating webhook
That's because the ServiceRoleBinding is actually generated/monitored/managed by the profile controller in the kubeflow namespace, not by the validating webhook.
I'm having this RBAC issue because, based on the params.yaml in the profiles manifest folder, the rule is generated as
request.headers[]: roger.l.c.lei@XXXX.com
instead of
request.headers[kubeflow-userid]: roger.l.c.lei@XXXX.com
This happened because I misconfigured the value as blank instead of userid-header=kubeflow-userid in params.yaml.
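A minimal sketch of how a blank setting ends up in the generated rule. The helper header_property_key is hypothetical and only mimics the string shape shown above; it is not the profile controller's actual code.

```python
def header_property_key(userid_header):
    """Hypothetical helper: build the subject property key from a header name."""
    return f"request.headers[{userid_header}]"


def names_a_header(key):
    """Return True if the key actually names a header between the brackets."""
    return not key.endswith("[]")


blank = header_property_key("")                  # userid-header left blank
correct = header_property_key("kubeflow-userid")

print(blank, names_a_header(blank))
print(correct, names_a_header(correct))
```

A quick check like names_a_header on the rendered ServiceRoleBinding would have flagged the blank header name before the policy reached Envoy.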
Check the AuthorizationPolicy resource in your application namespace.
For new clusters, please see this comment from issue 4440:
https://github.com/kubeflow/pipelines/issues/4440
cat << EOF | kubectl apply -f -
apiVersion: security.istio.io/v1beta1
kind: AuthorizationPolicy
metadata:
  name: bind-ml-pipeline-nb-kubeflow-user-example-com
  namespace: kubeflow
spec:
  selector:
    matchLabels:
      app: ml-pipeline
  rules:
  - from:
    - source:
        principals: ["cluster.local/ns/kubeflow-user-example-com/sa/default-editor"]
---
apiVersion: networking.istio.io/v1alpha3
kind: EnvoyFilter
metadata:
  name: add-header
  namespace: kubeflow-user-example-com
spec:
  configPatches:
  - applyTo: VIRTUAL_HOST
    match:
      context: SIDECAR_OUTBOUND
      routeConfiguration:
        vhost:
          name: ml-pipeline.kubeflow.svc.cluster.local:8888
          route:
            name: default
    patch:
      operation: MERGE
      value:
        request_headers_to_add:
        - append: true
          header:
            key: kubeflow-userid
            value: user@example.com
  workloadSelector:
    labels:
      notebook-name: test2
EOF
In my notebook
import kfp
client = kfp.Client()
print(client.list_experiments())
Output
{'experiments': [{'created_at': datetime.datetime(2021, 8, 12, 9, 14, 20, tzinfo=tzlocal()),
'description': None,
'id': 'b2e552e5-3324-483a-8ec8-b32894f49281',
'name': 'test',
'resource_references': [{'key': {'id': 'kubeflow-user-example-com',
'type': 'NAMESPACE'},
'name': None,
'relationship': 'OWNER'}],
'storage_state': 'STORAGESTATE_AVAILABLE'}],
'next_page_token': None,
'total_size': 1}

Ocelot ApiGateway unable to reach other services on Kubernetes

I'm working on a simple project to learn more about microservices.
I have a very simple .NET Core web app with a single GET endpoint.
I have added an ApiGateway web app with Ocelot, and everything seems to work fine except when I deploy to a local Kubernetes cluster.
These are the YAML files I'm using for the deployment:
ApiGateway.yaml
kind: Pod
apiVersion: v1
metadata:
  name: api-gateway
  labels:
    app: api-gateway
spec:
  containers:
  - name: api-gateway
    image: apigateway:dev
---
kind: Service
apiVersion: v1
metadata:
  name: api-gateway-service
spec:
  selector:
    app: api-gateway
  ports:
  - port: 80
TestService.yaml
kind: Pod
apiVersion: v1
metadata:
  name: testservice
  labels:
    app: testservice
spec:
  containers:
  - name: testservice
    image: testbuild:latest
---
kind: Service
apiVersion: v1
metadata:
  name: testservice-service
spec:
  selector:
    app: testservice
  ports:
  - port: 80
ocelot.json
{
  "ReRoutes": [
    {
      "DownstreamPathTemplate": "/endpoint",
      "DownstreamScheme": "http",
      "DownstreamHostAndPorts": [
        {
          "Host": "testservice-service",
          "Port": 80
        }
      ],
      "UpstreamPathTemplate": "/test",
      "UpstreamHttpMethod": [ "Get" ]
    }
  ]
}
If I make a request with curl directly from the ApiGateway pod to the TestService service, it works with no issue. But when I request it through Ocelot, it returns a 500 error, saying:
warn: Ocelot.Responder.Middleware.ResponderMiddleware[0]
requestId: 0HLUEDTNVVN26:00000002, previousRequestId: no previous request id, message: Error Code: UnableToCompleteRequestError Message: Error making http request, exception: System.Net.Http.HttpRequestException: Name or service not known
---> System.Net.Sockets.SocketException (0xFFFDFFFF): Name or service not known
Also, I tried with this https://ocelot.readthedocs.io/en/latest/features/kubernetes.html but honestly the doc is not really clear and I haven't had any success so far.
Any idea what the problem might be?
(I'm also using an Ingress resource for exposing the services, but that works with no problem so I will keep it out of this thread for now)
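Since the exception is "Name or service not known", one way to narrow the problem down is to check which host names actually resolve from inside the gateway pod. The sketch below uses only the Python standard library; the helper name resolve is made up for this example, and it assumes a Python interpreter is available in (or can be attached to) the pod.

```python
import socket


def resolve(host, port=80):
    """Return the sorted IP addresses that host resolves to, or [] on failure."""
    try:
        infos = socket.getaddrinfo(host, port, proto=socket.IPPROTO_TCP)
    except socket.gaierror:
        # This is the Python counterpart of "Name or service not known".
        return []
    # Each entry is (family, type, proto, canonname, sockaddr); keep the address.
    return sorted({info[4][0] for info in infos})
```

For example, calling resolve("testservice-service") and resolve("testservice-service.default.svc.cluster.local") from the gateway pod (the "default" namespace is an assumption, since the manifests above do not set one) shows whether the short name, the fully qualified name, or neither can be resolved there. Comparing that with the exact host string Ocelot loads from ocelot.json may help locate the mismatch.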

Cannot use Spring Cloud Config and Istio 1.1.1 together: cannot recover from HTTP 404 error when getting remote config

I'm trying to combine Spring Cloud Config with Istio 1.1.1. When my app container (with the Istio Envoy sidecar auto-injected) starts, the Spring Cloud Config client tries to fetch its config (applicationContext.yaml) from the remote config server (started in advance and healthy), but unfortunately it fails with an HTTP 404 error. Even though I've configured the config client to retry, every retry fails with HTTP 404 (I've confirmed from another container that the config server URL is correct), and it never recovers. This happens only some of the time. I know that the Istio Envoy sidecar and my app run in the same Kubernetes pod and that the app may start before Envoy, in which case there might be network errors; but as soon as Envoy is up, everything should be OK. I really don't understand why my app cannot recover automatically. Here are my diagnostic steps:
1. Add a retry mechanism to my app (retry libs included in the POM and the YAML modified): retry works, but each retry fails with HTTP 404
spring-config/
fail-fast: true
retry:
initial-interval: 10000
max-attempts: 100
2. Add 'sleep xx' before my Java app starts in the app's k8s deployment file: the HTTP 404 error is less likely, but the problem is not eliminated
command: ["/bin/sh","-c","sleep 20; java -jar -Xms512m -Xmx1024m app.jar"]
3. Get the Istio Envoy access logs and compare the failing app's with a good app's: the good log has values for upstream_host and upstream_cluster, while those fields are empty ("-") in the bad log
the good access log
{
"response_code": "200",
"user_agent": "Java/1.8.0_121",
"response_flags": "-",
"start_time": "2019-06-25T01:17:29.661Z",
"method": "2019-06-25T01:17:29.661Z",
"request_id": "d3d27512-161b-4303-bb48-05a6e19e05b7",
"upstream_host": "172.20.3.104:9083",
"x_forwarded_for": "-",
"requested_server_name": "-",
"bytes_received": "0",
"istio_policy_status": "-",
"bytes_sent": "1144",
"upstream_cluster": "outbound|9083||fota-spring-config.ns-fota.svc.cluster.local",
"downstream_remote_address": "172.20.2.115:45816",
"path": "/fota-spring-config/fota-task/dev/master",
"authority": "fota-spring-config.ns-fota.svc.cluster.local:9083",
"protocol": "HTTP/1.1",
"upstream_service_time": "289",
"upstream_local_address": "-",
"duration": "290",
"downstream_local_address": "172.21.1.152:9083"
}
the bad access log:
{
"upstream_cluster": "-",
"downstream_remote_address": "172.20.2.118:41980",
"path": "/fota-spring-config/fota-dmserver/dev/master",
"authority": "fota-spring-config.ns-fota.svc.cluster.local:9083",
"protocol": "HTTP/1.1",
"upstream_service_time": "-",
"upstream_local_address": "-",
"duration": "0",
"downstream_local_address": "172.21.1.152:9083",
"response_code": "404",
"user_agent": "Java/1.8.0_121",
"response_flags": "NR",
"start_time": "2019-06-25T01:21:24.197Z",
"method": "2019-06-25T01:21:24.197Z",
"request_id": "346716e4-1def-465f-b370-cb1e71e30d25",
"upstream_host": "-",
"x_forwarded_for": "-",
"requested_server_name": "-",
"bytes_received": "0",
"istio_policy_status": "-",
"bytes_sent": "0"
}
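The comparison in step 3 can be automated. In Envoy access logs the response flag NR means that no route was configured for the request, which matches the 404 with an empty upstream_cluster in the bad entry. The sketch below is an illustration only: the helper name diagnose is made up, and the two sample entries are trimmed copies of the logs above.

```python
import json


def diagnose(entry):
    """Summarise one Envoy access-log entry given as a dict parsed from JSON."""
    flags = entry.get("response_flags", "-").split(",")
    if "NR" in flags:
        return "no route configured for this request"
    if entry.get("upstream_cluster", "-") == "-":
        return "no upstream cluster selected"
    return "routed to " + entry["upstream_cluster"]


good = json.loads(
    '{"response_code": "200", "response_flags": "-", '
    '"upstream_cluster": "outbound|9083||fota-spring-config.ns-fota.svc.cluster.local"}'
)
bad = json.loads(
    '{"response_code": "404", "response_flags": "NR", "upstream_cluster": "-"}'
)

print(diagnose(good))
print(diagnose(bad))
```

Running this over the sidecar's access log shows whether the 404 responses come from the config server itself or are generated by Envoy before any upstream is contacted.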
The K8s deployment file is below.
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: fota-car
spec:
  template:
    metadata:
      labels:
        app: fota-car
        version: v1
    spec:
      serviceAccountName: fota-serviceaccount
      imagePullSecrets:
      - name: uaes-docker2
      containers:
      - name: fota-car
        image: 192.168.119.22:18080/uaes-fota/fota-car:dev-release-1.0.0
        imagePullPolicy: Always
        ports:
        - containerPort: 8085
        env:
        - name: SPRING_DATASOURCE_URL
          value: jdbc:mysql://mysql-ali-dev.ns-fota-ext-svc/fota-car?useUnicode=true&characterEncoding=utf-8&useSSL=false
        - name: SPRING_DATASOURCE_USERNAME
          valueFrom:
            secretKeyRef:
              name: mysql-ali-dev-secret
              key: username
        - name: SPRING_DATASOURCE_PASSWORD
          valueFrom:
            secretKeyRef:
              name: mysql-ali-dev-secret
              key: password
        command: ["/bin/sh","-c","java -jar -Xms512m -Xmx1024m app.jar"]
        readinessProbe:
          httpGet:
            path: /actuator/health
            port: 18085
          initialDelaySeconds: 60
          timeoutSeconds: 1
---
kind: Service
apiVersion: v1
metadata:
  labels:
    app: fota-car
  name: fota-car
spec:
  ports:
  - name: http
    port: 8085
  selector:
    app: fota-car

AWS Api Gateway proxy resource using Cloudformation?

I'm trying to proxy an S3 bucket configured as a website from an API Gateway endpoint. I configured an endpoint successfully using the console, but I am unable to recreate the configuration using Cloudformation.
After lots of trial and error and guessing, I've come up with the following CF stack template that gets me pretty close:
Resources:
  Api:
    Type: 'AWS::ApiGateway::RestApi'
    Properties:
      Name: ApiDocs
  Resource:
    Type: 'AWS::ApiGateway::Resource'
    Properties:
      ParentId: !GetAtt Api.RootResourceId
      RestApiId: !Ref Api
      PathPart: '{proxy+}'
  RootMethod:
    Type: 'AWS::ApiGateway::Method'
    Properties:
      HttpMethod: ANY
      ResourceId: !GetAtt Api.RootResourceId
      RestApiId: !Ref Api
      AuthorizationType: NONE
      Integration:
        IntegrationHttpMethod: ANY
        Type: HTTP_PROXY
        Uri: 'http://my-bucket.s3-website-${AWS::Region}.amazonaws.com/'
        PassthroughBehavior: WHEN_NO_MATCH
        IntegrationResponses:
        - StatusCode: 200
  ProxyMethod:
    Type: 'AWS::ApiGateway::Method'
    Properties:
      HttpMethod: ANY
      ResourceId: !Ref Resource
      RestApiId: !Ref Api
      AuthorizationType: NONE
      RequestParameters:
        method.request.path.proxy: true
      Integration:
        CacheKeyParameters:
        - 'method.request.path.proxy'
        RequestParameters:
          integration.request.path.proxy: 'method.request.path.proxy'
        IntegrationHttpMethod: ANY
        Type: HTTP_PROXY
        Uri: 'http://my-bucket.s3-website-${AWS::Region}.amazonaws.com/{proxy}'
        PassthroughBehavior: WHEN_NO_MATCH
        IntegrationResponses:
        - StatusCode: 200
  Deployment:
    DependsOn:
    - RootMethod
    - ProxyMethod
    Type: 'AWS::ApiGateway::Deployment'
    Properties:
      RestApiId: !Ref Api
      StageName: dev
Using this template I can successfully get the root of the bucket website, but the proxy resource gives me a 500:
curl -i https://abcdef.execute-api.eu-west-1.amazonaws.com/dev/index.html
HTTP/1.1 500 Internal Server Error
Content-Type: application/json
Content-Length: 36
Connection: keep-alive
Date: Mon, 11 Dec 2017 16:36:02 GMT
x-amzn-RequestId: 6014a809-de91-11e7-95e4-dda6e24d156a
X-Cache: Error from cloudfront
Via: 1.1 8f6f9aba914cc74bcbbf3c57e10df26a.cloudfront.net (CloudFront)
X-Amz-Cf-Id: TlOCX3eemHfY0aiVk9MLCp4qFzUEn5I0QUTIPkh14o6-nh7YAfUn5Q==
{"message": "Internal server error"}
I have no idea how to debug that 500.
To track down what may be wrong, I've compared the output of aws apigateway get-resource on the resource I created manually in the console (which is working) with the one Cloudformation made (which isn't). The resources look exactly alike. The output of get-method however, is subtly different, and I'm not sure it's possible to make them exactly the same using Cloudformation.
Working method configuration:
{
"apiKeyRequired": false,
"httpMethod": "ANY",
"methodIntegration": {
"integrationResponses": {
"200": {
"responseTemplates": {
"application/json": null
},
"statusCode": "200"
}
},
"passthroughBehavior": "WHEN_NO_MATCH",
"cacheKeyParameters": [
"method.request.path.proxy"
],
"requestParameters": {
"integration.request.path.proxy": "method.request.path.proxy"
},
"uri": "http://muybucket.s3-website-eu-west-1.amazonaws.com/{proxy}",
"httpMethod": "ANY",
"cacheNamespace": "abcdefg",
"type": "HTTP_PROXY"
},
"requestParameters": {
"method.request.path.proxy": true
},
"authorizationType": "NONE"
}
Configuration that doesn't work:
{
"apiKeyRequired": false,
"httpMethod": "ANY",
"methodIntegration": {
"integrationResponses": {
"200": {
"responseParameters": {},
"responseTemplates": {},
"statusCode": "200"
}
},
"passthroughBehavior": "WHEN_NO_MATCH",
"cacheKeyParameters": [
"method.request.path.proxy"
],
"requestParameters": {
"integration.request.path.proxy": "method.request.path.proxy"
},
"uri": "http://mybucket.s3-website-eu-west-1.amazonaws.com/{proxy}",
"httpMethod": "ANY",
"requestTemplates": {},
"cacheNamespace": "abcdef",
"type": "HTTP_PROXY"
},
"requestParameters": {
"method.request.path.proxy": true
},
"requestModels": {},
"authorizationType": "NONE"
}
The differences:
The working configuration has responseTemplates set to "application/json": null. As far as I can tell, there's no way to set a mapping explicitly to null using CloudFormation. My CF method just has an empty object here instead.
My CF method has "responseParameters": {}, while the working configuration does not have responseParameters at all.
My CF method has "requestModels": {}, while the working configuration does not have requestModels at all.
Comparing the two in the console, they seem to be exactly the same.
I'm at my wit's end here: what am I doing wrong? Is this possible to achieve using CloudFormation?
Answer: The above is correct. I had arrived at this solution through a series of steps, and re-applied the template over and over. Deleting the stack and deploying it anew with this configuration had the desired effect.
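Since the template turned out to be correct, the null and empty-object differences listed in the question were evidently not the cause. When comparing two get-method outputs, it can help to strip such fields first so that only meaningful differences remain. The following is a sketch under that assumption; the helper name normalize is made up, and the two dictionaries are trimmed copies of the integrationResponses sections above.

```python
def normalize(value):
    """Recursively drop dict keys whose values are None or empty containers."""
    if isinstance(value, dict):
        cleaned = {key: normalize(item) for key, item in value.items()}
        return {k: v for k, v in cleaned.items() if v not in (None, {}, [])}
    if isinstance(value, list):
        return [normalize(item) for item in value]
    return value


# Created manually in the console (working).
console_made = {
    "200": {
        "responseTemplates": {"application/json": None},
        "statusCode": "200",
    }
}
# Created by CloudFormation.
cloudformation_made = {
    "200": {
        "responseParameters": {},
        "responseTemplates": {},
        "statusCode": "200",
    }
}

print(normalize(console_made) == normalize(cloudformation_made))
```

After normalization both sections reduce to the same status code entry, so any difference that survives this step is one worth investigating.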