Argocd Failed to Get Static Asset when Loading UI - amazon-eks

New to ArgoCD. I have deployed ArgoCD on my EKS cluster fronted with an AWS ALB Controller.
...
kubernetes.io/ingress.class: alb
alb.ingress.kubernetes.io/listen-port: '[{"HTTPS":443}]'
name: argo-ingress
namespace: argocd
spec:
rules:
- host: argocd.example.com
http:
paths:
- backend:
serviceName: argocd-server
servicePort: 80
path: /
Given that the SSL is terminated at ALB, I deployed API server with the API server with the following parameters:
spec:
containers:
- command:
- argocd-server
- --insecure
- --staticassets
- /shared/app
When I port forward ArgoCD on the cluster, I am able to retrieve the objects locally.
HTTP request sent, awaiting response... 200 OK
Length: 2080536 (2.0M) [application/javascript]
Saving to: ‘main.12b930b6a3d660c9da5a.js.2’
100%[===================================================================================================================>] 2,080,536 --.-K/s in 0.03s
2020-10-26 02:14:53 (64.2 MB/s) - ‘main.12b930b6a3d660c9da5a.js.2’ saved [2080536/2080536]
However, when I use the browser to access the UI, I get 200 MSG and get a blank UI page and I get 400 error for the main.js and images.
Can anyone help me to troubleshoot this?

I managed to find the issue.
There was a typo in the ingress controller rules. As a result, all requests were being handled by the last ALB rule which resulted in 404. The fix was to include a '*' in the path. See below:
...
kubernetes.io/ingress.class: alb
alb.ingress.kubernetes.io/listen-port: '[{"HTTPS":443}]'
name: argo-ingress
namespace: argocd
spec:
rules:
- host: argocd.example.com
http:
paths:
- backend:
serviceName: argocd-server
servicePort: 80
path: /*

Related

AWS EKS ingress - Entity too large

I am running Laravel 8 api in the cluster and I have this ingress:
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
annotations:
alb.ingress.kubernetes.io/scheme: internet-facing
alb.ingress.kubernetes.io/target-type: ip
kubectl.kubernetes.io/last-applied-configuration: |
{"apiVersion":"extensions/v1beta1","kind":"Ingress","metadata":{"annotations":{"alb.ingress.kubernetes.io/scheme":"internet-facing","alb.ingress.kubernetes.io/target-type":"ip","kubernetes.io/ingress.class":"alb"},"labels":{"app":"voterapi"},"name":"rapp-ingress","namespace":"voterapp"},"spec":{"rules":[{"http":{"paths":[{"backend":{"serviceName":"app-service","servicePort":80},"path":"/*"}]}}]}}
kubernetes.io/ingress.class: alb
nginx.ingress.kubernetes.io/proxy-body-size: 100m
creationTimestamp: "2022-05-26T08:25:50Z"
finalizers:
- ingress.k8s.aws/resources
generation: 1
labels:
app: appapi
name: app-ingress
namespace: app
resourceVersion: "94262558"
uid: ec29661a-f4be-4ae1-a0e0-29c3d8bff0e5
spec:
rules:
- http:
paths:
- backend:
service:
name: app-service
port:
number: 80
path: /*
pathType: ImplementationSpecific
status:
loadBalancer:
ingress:
- hostname: XXX
I am trying to upload the file using the API and I am getting
413 Request Entity Too Large
I dont see this error in my PHP log so looks like it is not even getting there.
Can anyone help me to solve the issue?
Update: Try to update your ingress adding nginx.org/client-max-body-size
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
name: app-ingress
annotations:
nginx.org/proxy-read-timeout: "40s"
nginx.org/proxy-connect-timeout: "40s"
nginx.org/client-max-body-size: "100m"
In some cases, you might need to increase the max size for all post body data and file uploads.
Try to update the post_max_size and upload_max_file_size values in the php.ini configuration:
post_max_size = 100M
upload_max_filesize = 100M
Reference:
NGINX Ingress Controller to increase the client request body
NGINX Ingress Controller: Advanced Configuration with Annotations
https://laracasts.com/discuss/channels/laravel/increase-file-upload-size

OpenShift: Pod often does not return expected requests

I'm running a .dotnet 3.1 RestAPI inside a pod in Openshift, and it process every request smoothly - all transactions to the database (outside the Openshift network) are executed properly, all programatic executions are being finalized without errors. However, 1 in 15 requests will always ECONNRESET and fail to return the HTTP request.
Let's say I make a GET to /users/id/3 - I can see this request hitting my restAPI, being processed all the way down the infra layer, fetch data from the DB, wrap the return, and send it back finishing the request, however at this point, no return is received by the frontend, or postman, but I can see on the API logs that the request was finished and returned.
All of this requests take 2.3min to execute, and often finish in a ECONNRESET. I'm at odds at how to troubleshoot this. I have tried curl'ing the resource in another pod and the same behaviour appears.
I think these requests sometimes are getting lost in the cluster network, so I tried playing with the sessionaffinity of the service config but it's not really tied to this, as far as I understood. Do I have a wrong route config, or service config?
Route config
spec:
host: api.com.cloud
to:
kind: Service
name: api
weight: 100
port:
targetPort: 8080-tcp
tls:
termination: edge
wildcardPolicy: None
status:
ingress:
- host: api.com.cloud
routerName: default
conditions:
- type: Admitted
status: 'True'
lastTransitionTime: XXXX
wildcardPolicy: None
routerCanonicalHostname: router-default.apps.com.cloud
Service config
spec:
clusterIP: XXXX
ipFamilies:
- IPv4
ports:
- name: 8080-tcp
protocol: TCP
port: 8080
targetPort: 8080
internalTrafficPolicy: Cluster
clusterIPs:
- XXX
type: ClusterIP
ipFamilyPolicy: SingleStack
sessionAffinity: None
selector:
deploymentconfig: api
status:
loadBalancer: {}

Registered Targets Disappear

I have a working EKS cluster. It is using a ALB for ingress.
When I apply a service and then an ingress most of these work as expected. However some target groups eventually have no registered targets. If I get the service IP address kubectl describe svc my-service-name and manually register the EndPoints in the target group the pods are reachable again but that's not a sustainable process.
Any ideas on what might be happening? Why doesn't EKS find the target groups as pods cycle?
Each service (secrets, deployment, service and ingress consists of a set of .yaml files applied like:
deploy.sh
#!/bin/bash
set -e
kubectl apply -f ./secretsMap.yaml
kubectl apply -f ./configMap.yaml
kubectl apply -f ./deployment.yaml
kubectl apply -f ./service.yaml
kubectl apply -f ./ingress.yaml
service.yaml
apiVersion: v1
kind: Service
metadata:
name: "site-bob"
namespace: "next-sites"
spec:
ports:
- port: 80
targetPort: 3000
protocol: TCP
type: NodePort
selector:
app: "site-bob"
ingress.yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
name: "site-bob"
namespace: "next-sites"
annotations:
kubernetes.io/ingress.class: alb
alb.ingress.kubernetes.io/tags: Environment=Production,Group=api
alb.ingress.kubernetes.io/backend-protocol: HTTP
alb.ingress.kubernetes.io/ip-address-type: ipv4
alb.ingress.kubernetes.io/target-type: ip
alb.ingress.kubernetes.io/scheme: internet-facing
alb.ingress.kubernetes.io/listen-ports: '[{"HTTP":80},{"HTTPS":443}]'
alb.ingress.kubernetes.io/load-balancer-name: eks-ingress-1
alb.ingress.kubernetes.io/group.name: eks-ingress-1
alb.ingress.kubernetes.io/certificate-arn: arn:aws:acm:us-east-2:402995436123:certificate/9db9dce3-055d-4655-842e-xxxxx
alb.ingress.kubernetes.io/healthcheck-port: traffic-port
alb.ingress.kubernetes.io/healthcheck-path: /
alb.ingress.kubernetes.io/healthcheck-interval-seconds: '30'
alb.ingress.kubernetes.io/healthcheck-timeout-seconds: '16'
alb.ingress.kubernetes.io/success-codes: 200,201
alb.ingress.kubernetes.io/healthy-threshold-count: '2'
alb.ingress.kubernetes.io/unhealthy-threshold-count: '2'
alb.ingress.kubernetes.io/load-balancer-attributes: idle_timeout.timeout_seconds=60
alb.ingress.kubernetes.io/target-group-attributes: deregistration_delay.timeout_seconds=30
alb.ingress.kubernetes.io/actions.ssl-redirect: >
{
"type": "redirect",
"redirectConfig": { "Protocol": "HTTPS", "Port": "443", "StatusCode": "HTTP_301"}
}
alb.ingress.kubernetes.io/actions.svc-host: >
{
"type":"forward",
"forwardConfig":{
"targetGroups":[
{
"serviceName":"site-bob",
"servicePort": 80,"weight":20}
],
"targetGroupStickinessConfig":{"enabled":true,"durationSeconds":200}
}
}
labels:
app: site-bob
spec:
rules:
- host: "staging-bob.imgeinc.net"
http:
paths:
- path: /
pathType: Prefix
backend:
service:
name: ssl-redirect
port:
name: use-annotation
- backend:
service:
name: svc-host
port:
name: use-annotation
pathType: ImplementationSpecific
Something in my configuration added tagged two security groups as being owned by the cluster. When I checked the load balancer controller logs:
kubectl logs -n kube-system aws-load-balancer-controller-677c7998bb-l7mwb
I saw many lines like:
{"level":"error","ts":1641996465.6707578,"logger":"controller-runtime.manager.controller.targetGroupBinding","msg":"Reconciler error","reconciler group":"elbv2.k8s.aws","reconciler kind":"TargetGroupBinding","name":"k8s-nextsite-sitefest-89a6f0ff0a","namespace":"next-sites","error":"expect exactly one securityGroup tagged with kubernetes.io/cluster/imageinc-next-eks-4KN4v6EX for eni eni-0c5555fb9a87e93ad, got: [sg-04b2754f1c85ac8b9 sg-07b026b037dd4d6a4]"}
sg-07b026b037dd4d6a4 has description: EKS created security group applied to ENI that is attached to EKS Control Plane master nodes, as well as any managed workloads.
sg-04b2754f1c85ac8b9 has description: Security group for all nodes in the cluster.
I removed the tag:
{
Key: 'kubernetes.io/cluster/_cluster name_',
value:'owned'
}
from sg-04b2754f1c85ac8b9
and the TargetGroups started to fill in and everything is now working. Both groups were created and tagged by Terraform. I suspect my worker group configuration is off.
facing the same issue when creating the cluster with terraform. Solved updating aws load balancer controller from 2.3 to 2.4.4

Istio with SDS and Mutual TLS: upstream connect error or disconnect/reset before headers. reset reason: connection failure

I am trying to set up a cluster with Istio on it, where the SSL traffic gets terminated at the Ingress. I have deployed Istio with SDS and Mutual TLS. With the below yaml, I only get the error message upstream connect error or disconnect/reset before headers. reset reason: connection failure when accessing my cluster in the browser:
apiVersion: networking.istio.io/v1alpha3
kind: Gateway
metadata:
name: default-gateway
namespace: istio-system
spec:
selector:
istio: ingressgateway
servers:
- hosts:
- '*'
port:
name: http
number: 80
protocol: HTTP
---
apiVersion: v1
kind: Pod
metadata:
creationTimestamp: null
labels:
run: nginx1
name: nginx1
spec:
containers:
- image: nginx
name: nginx
resources: {}
ports:
- containerPort: 80
dnsPolicy: ClusterFirst
restartPolicy: Never
status: {}
---
apiVersion: v1
kind: Service
metadata:
labels:
run: nginx1
name: nginx1
spec:
ports:
- port: 80
protocol: TCP
targetPort: 80
selector:
run: nginx1
sessionAffinity: None
type: ClusterIP
status:
loadBalancer: {}
---
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
name: nginx1
spec:
hosts:
- "*"
gateways:
- istio-system/default-gateway
http:
- match:
- uri:
prefix: /nginx1
route:
- destination:
port:
number: 80
host: nginx1.default.svc.cluster.local
The ingressgateway logs show the following TLS error:
[2019-07-09 09:07:24.907][29][debug][pool] [external/envoy/source/common/http/http1/conn_pool.cc:88] creating a new connection
[2019-07-09 09:07:24.907][29][debug][client] [external/envoy/source/common/http/codec_client.cc:26] [C4759] connecting
[2019-07-09 09:07:24.907][29][debug][connection] [external/envoy/source/common/network/connection_impl.cc:702] [C4759] connecting to 100.200.1.59:80
[2019-07-09 09:07:24.907][29][debug][connection] [external/envoy/source/common/network/connection_impl.cc:711] [C4759] connection in progress
[2019-07-09 09:07:24.907][29][debug][pool] [external/envoy/source/common/http/conn_pool_base.cc:20] queueing request due to no available connections
[2019-07-09 09:07:24.907][29][debug][connection] [external/envoy/source/common/network/connection_impl.cc:550] [C4759] connected
[2019-07-09 09:07:24.907][29][debug][connection] [external/envoy/source/extensions/transport_sockets/tls/ssl_socket.cc:168] [C4759] handshake error: 2
[2019-07-09 09:07:24.907][29][debug][connection] [external/envoy/source/extensions/transport_sockets/tls/ssl_socket.cc:168] [C4759] handshake error: 1
[2019-07-09 09:07:24.907][29][debug][connection] [external/envoy/source/extensions/transport_sockets/tls/ssl_socket.cc:201] [C4759] TLS error: 268435703:SSL routines:OPENSSL_internal:WRONG_VERSION_NUMBER
[2019-07-09 09:07:24.907][29][debug][connection] [external/envoy/source/common/network/connection_impl.cc:188] [C4759] closing socket: 0
[2019-07-09 09:07:24.907][29][debug][client] [external/envoy/source/common/http/codec_client.cc:82] [C4759] disconnect. resetting 0 pending requests
[2019-07-09 09:07:24.907][29][debug][pool] [external/envoy/source/common/http/http1/conn_pool.cc:129] [C4759] client disconnected, failure reason: TLS error: 268435703:SSL routines:OPENSSL_internal:WRONG_VERSION_NUMBER
[2019-07-09 09:07:24.907][29][debug][pool] [external/envoy/source/common/http/http1/conn_pool.cc:164] [C4759] purge pending, failure reason: TLS error: 268435703:SSL routines:OPENSSL_internal:WRONG_VERSION_NUMBER
[2019-07-09 09:07:24.907][29][debug][router] [external/envoy/source/common/router/router.cc:671] [C4753][S3527573287149425977] upstream reset: reset reason connection failure
[2019-07-09 09:07:24.907][29][debug][http] [external/envoy/source/common/http/conn_manager_impl.cc:1137] [C4753][S3527573287149425977] Sending local reply with details upstream_reset_before_response_started{connection failure,TLS error: 268435703:SSL routines:OPENSSL_internal:WRONG_VERSION_NUMBER}
Reading though this blog I thought I might need to add
- hosts:
- '*'
port:
name: https
number: 443
protocol: HTTPS
tls:
mode: SIMPLE
serverCertificate: /etc/istio/ingressgateway-certs/tls.crt
privateKey: /etc/istio/ingressgateway-certs/tls.key
to the ingressgateway configuration. However, this did not solve the problem. Additionally, since I am using SDS, there won't be any certificates in ingressgateway-certs (see https://istio.io/docs/tasks/security/auth-sds/#verifying-no-secret-volume-mounted-file-is-generated) as it is described in https://istio.io/docs/tasks/traffic-management/ingress/secure-ingress-mount/
Can anyone point me to a correct configuration? Most of what I find online is referring to the "old" filemount approach...
The issue has been resolved by not using istio-cni. See https://github.com/istio/istio/issues/15701
You may have to specify the minimum or maximum TLS version. The options are documented here, under minProtocolVersion and maxProtocolVersion:
https://istio.io/docs/reference/config/networking/v1alpha3/gateway/#Server-TLSOptions
Under the hood, these values map to the following Envoy parameters:
https://www.envoyproxy.io/docs/envoy/latest/api-v2/api/v2/auth/cert.proto#auth-tlsparameters

Cert-manager certificates not found and challenges not created

I followed https://docs.cert-manager.io/en/venafi/tutorials/quick-start/index.html from start to end and everything seems to be working except that I'm not getting an external ip for my ingress.
NAME HOSTS ADDRESS PORTS AGE
staging-site-ingress staging.site.io,staging.admin.site.io, 80, 443 1h
Altough I'm able to use the nginx ingress controller external ip and use dns to access the sites. When I'm going to the urls I'm being redirected to https, so I assume that's working fine.
It redirects to https but still says "not secured", so he don't get a certificate issued.
When I'm debugging I get the following information:
Ingress:
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal CreateCertificate 54m cert-manager Successfully created Certificate "tls-secret-staging"
Normal UPDATE 35m (x3 over 1h) nginx-ingress-controller Ingress staging/staging-site-ingress
Normal CreateCertificate 23m (x2 over 35m) cert-manager Successfully created Certificate "letsencrypt-staging-tls"
Certificate:
Status:
Conditions:
Last Transition Time: 2019-02-27T14:02:29Z
Message: Certificate does not exist
Reason: NotFound
Status: False
Type: Ready
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal OrderCreated 3m (x2 over 14m) cert-manager Created Order resource "letsencrypt-staging-tls-593754378"
Secret:
Name: letsencrypt-staging-tls
Namespace: staging
Labels: certmanager.k8s.io/certificate-name=staging-site-io
Annotations: <none>
Type: kubernetes.io/tls
Data
====
ca.crt: 0 bytes
tls.crt: 0 bytes
tls.key: 1679 bytes
Order:
Status:
Certificate: <nil>
Finalize URL:
Reason:
State:
URL:
Events: <none>
So it seems something goes wrong in order and no challenges are created.
Here are my ingress.yaml and issuer.yaml:
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
name: staging-site-ingress
annotations:
kubernetes.io/ingress.class: "nginx"
certmanager.k8s.io/issuer: "letsencrypt-staging"
certmanager.k8s.io/acme-challenge-type: http01
spec:
tls:
- hosts:
- staging.site.io
- staging.admin.site.io
- staging.api.site.io
secretName: letsencrypt-staging-tls
rules:
- host: staging.site.io
http:
paths:
- backend:
serviceName: frontend-service
servicePort: 80
path: /
- host: staging.admin.site.io
http:
paths:
- backend:
serviceName: frontend-service
servicePort: 80
path: /
- host: staging.api.site.io
http:
paths:
- backend:
serviceName: gateway-service
servicePort: 9000
path: /
apiVersion: certmanager.k8s.io/v1alpha1
kind: Issuer
metadata:
name: letsencrypt-staging
namespace: staging
spec:
acme:
server: https://acme-staging-v02.api.letsencrypt.org/directory
email: hello#site.io
privateKeySecretRef:
name: letsencrypt-staging-tls
http01: {}
Anyone knows what I can do to fix this or what went wrong? Certmanager is installed correctly 100%, I'm just not sure about the ingress and what went wrong in the order.
Thanks in advance!
EDIT: I found this in the nginx-ingress-controller:
W0227 14:51:02.740081 8 controller.go:1078] Error getting SSL certificate "staging/letsencrypt-staging-tls": local SSL certificate staging/letsencrypt-staging-tls was not found. Using default certificate
It's getting spammed & the CPU load is always at 0.003 and the cpu graph is full (the other services are almost nothing)
I stumbled over the same issue once, following exactly the same official tutorial.
As #mikebridge mentioned, the issue is with Issuer/Secret's namespace mismatch.
For me, the best was to switch from Issuer to ClusterIssuer, which is not scoped to a single namespace.
The reason your certificate order is not completing is because the challenge is failing to successfully complete. Review your solver configuration in either your Issuer or ClusterIssuer.
See my answer here for more details.
https://stackoverflow.com/a/75454772/4820940