Connection refused when using load balancing with Traefik 2 - traefik

I have yml configuration file with a router and a service. Every time I get a 404 error. I know the URL works and I can access the server from Traefik server. What am I missing? Also, for some reason the request reroutes to https. Perhaps a conflicting rule?
Also note, Traefik runs in docker, but the connecting server does not. The goal here is to add multiple nodes to the load balancer.
http:
routers:
demo_1-rtr:
rule: "Host(`http://demo.lab.local`)"
service: demo_1
entryPoints:
- http
services:
demo_1:
loadBalancer:
servers:
- url: "http://172.16.9.90:16000"
Traefik Config:
global:
checkNewVersion: true
sendAnonymousUsage: true
api:
insecure: true
providers:
docker:
endpoint: "unix://var/run/docker.sock"
exposedByDefault: false
file:
directory: /rules
watch: true
log:
level: DEBUG
accessLog: {}
entryPoints:
http:
address: ":80"

I suspect it would be this
--api.insecure=true global argument and it should work.
So in your case add the following in traefik.toml
[api]
insecure = true
Otherwise I would need more information to debug more.

Related

OpenShift: Pod often does not return expected requests

I'm running a .dotnet 3.1 RestAPI inside a pod in Openshift, and it process every request smoothly - all transactions to the database (outside the Openshift network) are executed properly, all programatic executions are being finalized without errors. However, 1 in 15 requests will always ECONNRESET and fail to return the HTTP request.
Let's say I make a GET to /users/id/3 - I can see this request hitting my restAPI, being processed all the way down the infra layer, fetch data from the DB, wrap the return, and send it back finishing the request, however at this point, no return is received by the frontend, or postman, but I can see on the API logs that the request was finished and returned.
All of this requests take 2.3min to execute, and often finish in a ECONNRESET. I'm at odds at how to troubleshoot this. I have tried curl'ing the resource in another pod and the same behaviour appears.
I think these requests sometimes are getting lost in the cluster network, so I tried playing with the sessionaffinity of the service config but it's not really tied to this, as far as I understood. Do I have a wrong route config, or service config?
Route config
spec:
host: api.com.cloud
to:
kind: Service
name: api
weight: 100
port:
targetPort: 8080-tcp
tls:
termination: edge
wildcardPolicy: None
status:
ingress:
- host: api.com.cloud
routerName: default
conditions:
- type: Admitted
status: 'True'
lastTransitionTime: XXXX
wildcardPolicy: None
routerCanonicalHostname: router-default.apps.com.cloud
Service config
spec:
clusterIP: XXXX
ipFamilies:
- IPv4
ports:
- name: 8080-tcp
protocol: TCP
port: 8080
targetPort: 8080
internalTrafficPolicy: Cluster
clusterIPs:
- XXX
type: ClusterIP
ipFamilyPolicy: SingleStack
sessionAffinity: None
selector:
deploymentconfig: api
status:
loadBalancer: {}

When using "tls-alpn-01" challenge for let's encrypt certs in kubernetes using traefik, I'm getting "acme: error: 400 Timeout during connect"

I'm following the tutorial to use traefik as the ingress and ingress controller for Azure Kubernetes Service (AKS) cluster. I'm using terraform to deploy the traefik (version 1.7.24) helm chart.
resource "helm_release" "traefik" {
name = "traefik"
namespace = "traefik"
repository = "https://charts.helm.sh/stable"
chart = "traefik"
version = "1.87.2"
values = [<<EOF
loadBalancerIP: "50.100.200.300"
service:
annotations:
service.beta.kubernetes.io/azure-load-balancer-resource-group: "aks-rg"
kubernetes:
ingressClass: traefik
ingressEndpoint:
useDefaultPublishedService: true
dashboard:
enabled: true
domain: traefik.mydomain.tld
ingress:
annotations:
kubernetes.io/ingress.class: traefik
metrics:
serviceMonitor:
enabled: true
rbac:
enabled: true
ssl:
enabled: true
enforced: true
acme:
enabled: true
email: admin#mydomain.tld
staging: true
tlsChallenge: true
entrypoint: https
ports: "443:443"
challengeType: tls-alpn-01
onHostRule: true
domains:
enabled: true
domainsList:
- main: "mydomain.tld"
- sans:
- "traefik.mydomain.tld"
EOF
]
}
The DNS records are correctly pointed to the AKS Load Balancer IP.
When I check the traefik logs, I could see that "tls-alpn-01" challenge fails with the following error:
{"level":"error","msg":"Unable to obtain ACME certificate for domains \"mydomain.tld,traefik.mydomain.tld\" : unable to generate a certificatefor the domains [mydomain.tld traefik.mydomain.tld]: acme: Error -\u003e One or more domains had a problem:\n[mydomain.tld] acme: error: 400 :: urn:ietf:params:acme:error:connection :: Timeout during connect (likely firewall problem), url: \n[traefik.mydomain.tld] acme: error: 400 :: urn:ietf:params:acme:error:connection :: Timeout during connect (likely firewall problem), url: \n","time":"2021-02-26T02:32:05Z"}
Full log is given below:
{"level":"info","msg":"Using TOML configuration file /config/traefik.toml","time":"2021-02-26T02:31:38Z"}
{"level":"info","msg":"No tls.defaultCertificate given for https: using the first item in tls.certificates as a fallback.","time":"2021-02-26T02:31:38Z"}
{"level":"info","msg":"Traefik version v1.7.24 built on 2020-03-25_04:34:11PM","time":"2021-02-26T02:31:38Z"}
{"level":"info","msg":"\nStats collection is disabled.\nHelp us improve Traefik by turning this feature on :)\nMore details on: https://docs.traefik.io/v1.7/basics/#collected-data\n","time":"2021-02-26T02:31:38Z"}
{"level":"info","msg":"Preparing server traefik \u0026{Address::8080 TLS:\u003cnil\u003e Redirect:\u003cnil\u003e Auth:\u003cnil\u003e WhitelistSourceRange:[] WhiteList:\u003cnil\u003e Compress:false ProxyProtocol:\u003cnil\u003e ForwardedHeaders:0xc000851700} with readTimeout=0s writeTimeout=0s idleTimeout=3m0s","time":"2021-02-26T02:31:38Z"}
{"level":"info","msg":"Preparing server http \u0026{Address::80 TLS:\u003cnil\u003e Redirect:0xc0000b1b80 Auth:\u003cnil\u003e WhitelistSourceRange:[] WhiteList:\u003cnil\u003e Compress:true ProxyProtocol:\u003cnil\u003e ForwardedHeaders:0xc0008516a0} with readTimeout=0s writeTimeout=0s idleTimeout=3m0s","time":"2021-02-26T02:31:38Z"}
{"level":"info","msg":"Starting server on :8080","time":"2021-02-26T02:31:38Z"}
{"level":"info","msg":"Preparing server https \u0026{Address::443 TLS:0xc000431e60 Redirect:\u003cnil\u003e Auth:\u003cnil\u003e WhitelistSourceRange:[] WhiteList:\u003cnil\u003e Compress:true ProxyProtocol:\u003cnil\u003e ForwardedHeaders:0xc0008516c0} with readTimeout=0s writeTimeout=0s idleTimeout=3m0s","time":"2021-02-26T02:31:38Z"}
{"level":"info","msg":"Starting provider configuration.ProviderAggregator {}","time":"2021-02-26T02:31:38Z"}
{"level":"info","msg":"Starting server on :80","time":"2021-02-26T02:31:38Z"}
{"level":"info","msg":"Starting server on :443","time":"2021-02-26T02:31:38Z"}
{"level":"info","msg":"Starting provider *kubernetes.Provider {\"Watch\":true,\"Filename\":\"\",\"Constraints\":[],\"Trace\":false,\"TemplateVersion\":0,\"DebugLogGeneratedTemplate\":false,\"Endpoint\":\"\",\"Token\":\"\",\"CertAuthFilePath\":\"\",\"DisablePassHostHeaders\":false,\"EnablePassTLSCert\":false,\"Namespaces\":null,\"LabelSelector\":\"\",\"IngressClass\":\"traefik\",\"IngressEndpoint\":{\"IP\":\"\",\"Hostname\":\"\",\"PublishedService\":\"traefik/traefik\"},\"ThrottleDuration\":0}","time":"2021-02-26T02:31:38Z"}
{"level":"info","msg":"ingress label selector is: \"\"","time":"2021-02-26T02:31:38Z"}
{"level":"info","msg":"Creating in-cluster Provider client","time":"2021-02-26T02:31:38Z"}
{"level":"info","msg":"Starting provider *acme.Provider {\"Email\":\"admin#mydomain.tld\",\"ACMELogging\":false,\"CAServer\":\"https://acme-staging-v02.api.letsencrypt.org/directory\",\"Storage\":\"/acme/acme.json\",\"EntryPoint\":\"https\",\"KeyType\":\"RSA4096\",\"OnHostRule\":true,\"OnDemand\":false,\"DNSChallenge\":null,\"HTTPChallenge\":null,\"TLSChallenge\":{},\"Domains\":[{\"Main\":\"mydomain.tld\",\"SANs\":[\"traefik.mydomain.tld\"]}],\"Store\":{}}","time":"2021-02-26T02:31:38Z"}
{"level":"info","msg":"Testing certificate renew...","time":"2021-02-26T02:31:38Z"}
{"level":"info","msg":"Server configuration reloaded on :8080","time":"2021-02-26T02:31:38Z"}
{"level":"info","msg":"Server configuration reloaded on :80","time":"2021-02-26T02:31:38Z"}
{"level":"info","msg":"Server configuration reloaded on :443","time":"2021-02-26T02:31:38Z"}
{"level":"info","msg":"Server configuration reloaded on :443","time":"2021-02-26T02:31:39Z"}
{"level":"info","msg":"Server configuration reloaded on :8080","time":"2021-02-26T02:31:39Z"}
{"level":"info","msg":"Server configuration reloaded on :80","time":"2021-02-26T02:31:39Z"}
{"level":"info","msg":"Register...","time":"2021-02-26T02:31:42Z"}
{"level":"info","msg":"Updated status on ingress traefik/traefik-dashboard","time":"2021-02-26T02:31:42Z"}
{"level":"info","msg":"Server configuration reloaded on :443","time":"2021-02-26T02:31:55Z"}
{"level":"info","msg":"Server configuration reloaded on :8080","time":"2021-02-26T02:31:55Z"}
{"level":"info","msg":"Server configuration reloaded on :80","time":"2021-02-26T02:31:55Z"}
{"level":"error","msg":"Unable to obtain ACME certificate for domains \"mydomain.tld,traefik.mydomain.tld\" : unable to generate a certificatefor the domains [mydomain.tld traefik.mydomain.tld]: acme: Error -\u003e One or more domains had a problem:\n[mydomain.tld] acme: error: 400 :: urn:ietf:params:acme:error:connection :: Timeout during connect (likely firewall problem), url: \n[traefik.mydomain.tld] acme: error: 400 :: urn:ietf:params:acme:error:connection :: Timeout during connect (likely firewall problem), url: \n","time":"2021-02-26T02:32:05Z"}
I have even added a "AllowAll" rule in AKS LoadBalancer NSG (firewall). But still tls-alpn-01 validation is facing the timeout error. The ssl certificate generation doesn't happen and my website is using the default example.com expired ssl certificate.
I can confirm that telnet to port 443 of mydomain.tld also works fine.
PS: I don't want to use "dns-01" challenge for ssl certificate as the dns provider doesn't have the APIs for let's encrypt. I can't use "http-01" as these are backend servers which do not have any web server.
Any help is much appreciated. I would also like to know how the tls-alpn-01 challenge works.

Traefik 2 set passHostHeader globally and not per service?

I'm using traefik internally as a gateway and thus want to always have passtHostHeader: false for default as to not always have it repeatedly set on each service.
I looked through the
ENVs
cli arguments &
dynamic configuration reference
but to no avail.
Is there a way to achieve this?
my dynamic-file.yml
routers:
ext-service-router:
rule: "PathPrefix(`/ext-service`)
service: ext-service-service
services:
ext-service-service:
loadBalancer:
passHostHeader: false # I want this to be the default and not need to set it
servers:
- URL: "https://my-ext-service.example.com"

Let's encrypt SSL with traefick on ECS Fargate

I've been trying to solve this for days, but without any luck:
Situation:
I have a ECS cluster on AWS using Fargate, this cluster contains an instance of Traefick 2.3.4 and other containers. I'm using Traefick as reverse proxy to forward the requests to the other containers.
Using HTTP everything works fine, so I've decided to add also the secure connection to Traefick. I've tried everything that I could find on the Internet but nothing works, when I try to connect to the specified domain with curl it returns:
curl: (35) error:1408F10B:SSL routines:ssl3_get_record:wrong version number
Here there are some test that I've done:
traefick.yml:
log:
level: DEBUG
api:
dashboard: true
entryPoints:
web:
address: :80
http:
redirections:
entryPoint:
to: websecure
scheme: https
websecure:
address: ":443"
providers:
ecs:
clusters:
- tools-cluster
region: eu-west-2
exposedByDefault: false
certificatesResolvers:
letsencrypt:
acme:
caServer: https://acme-staging-v02.api.letsencrypt.org/directory
email: #########################
storage: acme.json
httpchallenge:
entrypoint: web
Labels:
"dockerLabels": {
"traefik.enable": "true",
"traefik.http.services.traefik.loadbalancer.server.port": "8080",
"traefik.http.routers.traefik.rule": "Host(`${host}`)",
"traefik.http.routers.traefik.entrypoints": "websecure",
"traefik.http.routers.traefik.tls.certresolver": "letsencrypt",
"traefik.http.routers.traefik.service": "api#internal"
}
this version returns this error:
rror: 400 :: urn:ietf:params:acme:error:connection :: Fetching https://traefik.baaluu.com/.well-known/acme-challenge/td8IdOvJ1_GkigY-jPYaA4YsgeiS5FUiuUS-avbpsuY: Error getting validation data, url
It tries to retrieve that data but it can't because it is redirected to the https and it can't retrieve because https doesn't work, I've tried also without the auto redirect, and it returns a similar error, it can't retrieve that data.
But following this guide it should work correctly.
So I've decided to move to the dnsChallenge with this configuration:
Traefick.yml
log:
level: DEBUG
api:
dashboard: true
entryPoints:
web:
address: :80
websecure:
address: ":443"
providers:
ecs:
clusters:
- tools-cluster
region: eu-west-2
exposedByDefault: false
certificatesResolvers:
letsencrypt:
acme:
caServer: https://acme-staging-v02.api.letsencrypt.org/directory
email: ######################
storage: acme.json
dnsChallenge:
provider: route53
delayBeforeCheck: 3
and same labels as before:
"dockerLabels": {
"traefik.enable": "true",
"traefik.http.services.traefik.loadbalancer.server.port": "8080",
"traefik.http.routers.traefik.rule": "Host(`${host}`)",
"traefik.http.routers.traefik.entrypoints": "websecure",
"traefik.http.routers.traefik.tls.certresolver": "letsencrypt",
"traefik.http.routers.traefik.service": "api#internal"
}
Still nothing, and I've this inside the logs:AuthURL: https://acme-staging-v02.api.letsencrypt.org/acme/authz-v3/170242259"
That url contains:
{
"type": "urn:ietf:params:acme:error:malformed",
"detail": "Method not allowed",
"status": 405
}
The latest test that I did is to remove the staging ca server:
log:
level: DEBUG
api:
dashboard: true
entryPoints:
web:
address: :80
websecure:
address: :443
providers:
ecs:
clusters:
- tools-cluster
region: eu-west-2
exposedByDefault: false
certificatesResolvers:
letsencrypt:
acme:
email: ###############
storage: acme.json
dnsChallenge:
provider: route53
delayBeforeCheck: 2
The ssl still doesn't work but I don't see any error message inside the logs: this is the last message that I get about a certificate:
Try to challenge certificate for domain [traefik.baaluu.com] found in HostSNI rule" providerName=letsencrypt.acme routerName=traefik#ecs rule="Host(`traefik.baaluu.com`)"
And there is not much more after that:
(I'm sorry for the picture but I don't find a way to extract that logs from ECS)
The other containers are still reachable on the http protocol.
If I try to connect to it using telnet I can reach the service:
telnet traefik.baaluu.com 443
Trying 3.8.30.164...
Connected to traefik-1547500306.eu-west-2.elb.amazonaws.com.
Escape character is '^]'.
Same goes for the 80
Looking better inside the logs I've also find this
retry due to: acme: error: 400 :: POST :: https://acme-v02.api.letsencrypt.org/acme/chall-v3/9205340157/1Wh0tQ :: urn:ietf:params:acme:error:badNonce :: JWS has an invalid anti-replay nonce: \"0004cbkFTGjCALFGDYOmhruMl6_F_fRSj33cOMvdpx5Xd2M\", url: "
time="2020-12-10T13:08:21Z" level=debug msg="legolog: [INFO] retry due to: acme: error: 400 :: POST :: https://acme-v02.api.letsencrypt.org/acme/chall-v3/9205340157/1Wh0tQ :: urn:ietf:params:acme:error:badNonce :: JWS has an invalid anti-replay nonce: \"0004cbkFTGjCALFGDYOmhruMl6_F_fRSj33cOMvdpx5Xd2M\", url: "
that contains this url: https://acme-v02.api.letsencrypt.org/acme/chall-v3/9205340157/1Wh0tQ
{
"type": "dns-01",
"status": "valid",
"url": "https://acme-v02.api.letsencrypt.org/acme/chall-v3/9205340157/1Wh0tQ",
"token": "44R4gD4_ZmemiCn5rtkqJyWOcjoj09sEgobUvZLH6yc",
"validationRecord": [
{
"hostname": "traefik.baaluu.com"
}
]
}
So I suppose that the ssl has been generated correctly but I'm not sure.
Any idea or suggestion?
Thanks in advance.
H2K
Edit:
I've removed the ssl from the dashboard and I've put it on another container, now entering inside the dashboard I can see this:
So I suppose that the ssl is working for that domain, but I still can't connect to it.
Edit 2:
with telnet if I connect to that url on the port 443 and I request the page I can see the content:
telnet xxxxxxxxxxxxxxxxx 443
Trying 3.10.148.201...
Connected to traefik-1547500306.eu-west-2.elb.amazonaws.com.
Escape character is '^]'.
GET /index.html HTTP/1.1
Host: xxxxxxxxxxxxxxxxx
And the content of the page appears, so it is not a load balacer problem or routing problem, it seems that I can reach the container using the 443, simply the ssl is not there. It is like to have 2 http port and both are behaving in the same way. The 443 at the moment is like a port 80.
I've have also spent a number of days trying to work it out so i feel your pain.
The error is misleading, the request doesn't even make it past the ALB let alone traefik.
There are two factors to this issue,
The first being that when you specify a port 443 through docker compose as "443:443" you would assume that this creates a HTTPS listener, it actually creates a listener for 443 on the HTTP protocol. In addition the listener also sent the data to the fargate HTTP port and didn't redirect. I'm not sure if this is a bug, or because because i haven't specified that the protocol should be "x-aws-protocol: https" on the target port.
I also found some AWS documentation that said if you use a HTTPS port on a ALB that you need an SSL certificate in place at a ALB level. This kind of makes sense that you can't terminate the connection at a task level if you consider the swarm nature and security implications (better minds are welcome to explain)
With the above in mind i created a certificate in the ACM that covered all the the domains that i needed, changed the listener to the HTTPS protocol and specified the certificate i created. At this point i was able to configure traefik to accept traefik to the frontend.

Traefik v2 as a reverse proxy without docker

I have read the documentation but I can not figure out how to configure Traefik v2 to replace Nginx as a reverse proxy for web sites (virtual hosts) without involving Docker. Ideally there would be let'sencrypt https as well.
I have a service running at http://127.0.0.1:4000 which I would like to reverse proxy to from http://myhost.com:80
This is the configuration i've come up with so far:
[Global]
checkNewVersion = true
[log]
level = "DEBUG"
filePath = "log-file.log"
[accessLog]
filePath = "log-access.log"
bufferingSize = 100
[entrypoints]
[entrypoints.http]
address = ":80"
[http]
[http.routers]
[http.routers.my-router]
rule = "Host(`www.myhost.com`)"
service = "http"
entrypoint=["http"]
[http.services]
[http.services.http.loadbalancer]
[[http.services.http.loadbalancer.servers]]
url = "http://127.0.0.1:4000"
I figured it out,
the first part to note is that in traefik v2 there are two types of configuration, static and dynamic. So I created two files, traefik.toml and traefik-dynamic.toml.
contents of traefik.toml:
[log]
level = "DEBUG"
filePath = "log-file.log"
[accessLog]
filePath = "log-access.log"
bufferingSize = 100
[providers]
[providers.file]
filename = "traefik-dynamic.toml"
[api]
dashboard = true
debug = true
[entryPoints]
[entryPoints.web]
address = ":80"
[entryPoints.web-secure]
address = ":443"
[entryPoints.dashboard]
address = ":8080"
[certificatesResolvers.sample.acme]
email = "myemail#example.com"
storage = "acme.json"
[certificatesResolvers.sample.acme.httpChallenge]
# used during the challenge
entryPoint = "web"
traefik-dynamic.toml:
[http]
# Redirect to https
[http.middlewares]
[http.middlewares.test-redirectscheme.redirectScheme]
scheme = "https"
[http.routers]
[http.routers.my-router]
rule = "Host(`www.example.com`)"
service = "phx"
entryPoints = ["web-secure"]
[http.routers.my-router.tls]
certResolver = "sample"
[http.services]
[http.services.phx.loadbalancer]
[[http.services.phx.loadbalancer.servers]]
url = "http://127.0.0.1:4000"
You can also use Traefik v2 to reverse proxy to a service running on the localhost without using Nginx as explained here using File (and not Docker provider) for Traefik.
First, route calls to myhost.com through localhost by updating /etc/hosts like:
127.0.0.1 myhost.com
Create a minimal docker-compose.yml like:
version: "3.7"
services:
proxy:
image: traefik:2.0
command:
- "--providers.file.filename=/etc/traefik/proxy-config.toml"
- "--entrypoints.web.address=:80"
ports:
- "80:80"
volumes:
- ./proxy-config.toml:/etc/traefik/proxy-config.toml:ro
This Compose file creates a read-only volume containing the dynamic configuration for the Traefik reverse proxy standing in for Nginx as requested. It uses the File provider for Traefik and not Docker and a blank HTTP address mapped to port 80 for the entrypoint. This is a complete Compose file in itself. Beyond that all that's needed is the reverse proxy configuration for Traefik.
Configure the Traefik reverse proxy proxy-config.toml in the same directory:
[http.routers.test-streamrouter]
rule = "Host(`myhost.com`)"
service = "test-loadbalancer"
entryPoints = ["web"]
[[http.services.test-loadbalancer.loadBalancer.servers]]
url = "http://host.docker.internal:4000"
This is a sample reverse proxy in its entirety. It can be enhanced with middlewares to perform URL rewriting, update domain names or even redirect users if that's your aim. A single load balancer is used as shown in this answer. And host.docker.internal is used to return the host's internal networking address.
Note: At time of writing "host.docker.internal" only works with Docker for Mac and will fail on Linux. However, you may be able to use the Compose service name instead (i.e. "proxy").
Once you get this working you can set up the Let's Encrypt stuff or swap between development and production configurations using the TRAEFIK_PROVIDERS_FILE_FILENAME environment variable.
You can
use container names within the same bridged network instead of localhost
link middlewares and services without #file suffix
Please mind, that in the yaml and toml file, you need to pay attention to lower-uppercase of the properties. Whereas in docker it is loadbalancer, you need to write loadBalencer in the config file.
http:
middlewares:
docs:
stripPrefix:
prefixes:
- "/docs"
restapi:
stripPrefix:
prefixes:
- "/api/v1"
routers:
restapi:
rule: "PathPrefix(`/api/v1`)"
middlewares:
- "restapi"
service: "restapi"
entryPoints:
- http
docs:
rule: "PathPrefix(`/docs`)"
middlewares:
- "docs"
service: "docs"
entryPoints:
- http
client:
rule: "PathPrefix(`/`)"
service: "client"
entryPoints:
- http
help:
rule: "PathPrefix(`/server/sicon/help`)"
services:
restapi:
loadBalancer:
servers:
- url: "http://sicon_backend:1881"
docs:
loadBalancer:
servers:
- url: "http://sicon_backend:1882"
client:
loadBalancer:
servers:
- url: "http://sicon_client"