I'm trying to configure an HTTPS/Layer 7 Load Balancer with GKE. I'm following SSL certificates overview and GKE Ingress for HTTP(S) Load Balancing.
My config. has worked for some time. I wanted to test Google's managed service.
This is how I've set it up so far:
k8s/staging/staging-ssl.yml:
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
name: my-staging-lb-ingress
annotations:
kubernetes.io/ingress.global-static-ip-name: "my-staging-global"
ingress.gcp.kubernetes.io/pre-shared-cert: "staging-google-managed-ssl"
kubernetes.io/ingress.allow-http: "false"
spec:
rules:
- host: staging.my-app.no
http:
paths:
- path: /*
backend:
serviceName: my-svc
servicePort: 3001
gcloud compute addresses list
#=>
NAME REGION ADDRESS STATUS
my-staging-global 35.244.160.NNN RESERVED
host staging.my-app.no
#=>
35.244.160.NNN
but it is stuck on FAILED_NOT_VISIBLE:
gcloud beta compute ssl-certificates describe staging-google-managed-ssl
#=>
creationTimestamp: '2018-12-20T04:59:39.450-08:00'
id: 'NNNN'
kind: compute#sslCertificate
managed:
domainStatus:
staging.my-app.no: FAILED_NOT_VISIBLE
domains:
- staging.my-app.no
status: PROVISIONING
name: staging-google-managed-ssl
selfLink: https://www.googleapis.com/compute/beta/projects/my-project/global/sslCertificates/staging-google-managed-ssl
type: MANAGED
Any idea on how I can fix or debug this further?
I found a section in the doc I linked to at the beginning of the post
Associating SSL certificate resources with a target proxy:
Use the following gcloud command to associate SSL certificate resources with a target proxy, whether the SSL certificates are self-managed or Google-managed.
gcloud compute target-https-proxies create [NAME] \
--url-map=[URL_MAP] \
--ssl-certificates=[SSL_CERTIFICATE1][,[SSL_CERTIFICATE2], [SSL_CERTIFICATE3],...]
Is that necessary when I have this line in k8s/staging/staging-ssl.yml?
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
annotations:
. . .
ingress.gcp.kubernetes.io/pre-shared-cert: "staging-google-managed-ssl"
. . .
I have faced this issue recently. You need to check whether your A Record correctly points to the Ingress static IP.
If you are using a service like Cloudflare, then disable the Cloudflare proxy setting so that ping to the domain will give the actual IP of Ingress. THis will create the Google Managed SSL certificate correctly with 10 to 15 minutes.
Once the certificate is up, you can again enable Cloudflare proxy setting.
I'm leaving this for anyone who might end up in the same situation as me. I needed to migrate from a self-managed certificate to a google-managed one.
I did create the google-managed certificate following the guide and was expecting to see it being activated before applying the certificate to my Kubernetes ingress (to avoid the possibility of a downtime)
Turns out, as stated by the docs,
the target proxy must reference the Google-managed certificate
resource
So applying the configuration with kubectl apply -f ingress-conf.yaml made the load balancer use the newly created certificate, which became active shortly after (15 min or so)
What worked for me after checking the answers here (I worked with a load balancer but IMO this is correct for all cases):
If some time passed this certificate will not work for you (It may be permamnently gone and it will take time to show that) - I created a new one and replaced it in the Load Balancer (just edit it)
Make sure that the certificate is being used a few minutes after creating it
Make sure that the DNS points to your service. And that your configuration is working when using http!! - This is the best and safest way (also if you just moved a domain - make sure that when you check it you reach to the correct IP)
After creating a new cert or if the problem was fixed - your domain will turn green but you still need to wait (can take an hour or more)
As per the following documentation which you provided, this should help you out:
The status FAILED_NOT_VISIBLE indicates that certificate provisioning failed for a domain because of a problem with DNS or the load balancing configuration. Make sure that DNS is configured so that the certificate's domain resolves to the IP address of the load balancer.
What is the TTL (time to live) of the A Resource Record for staging.my-app.no?
Use, e.g.,
dig +nocmd +noall +answer staging.my-app.no
to figure it out.
In my case, increasing the TTL from 60 seconds to 7200 let the domainStatus finally arrive in ACTIVE.
In addition to the other answers, when migrating from self-managed to google-managed certs I had to:
Enable http to my ingress service with kubernetes.io/ingress.allow-http: true
Leave the existing SSL cert running in the original ingress service until the new managed cert was Active
I also had an expired original SSL cert, though I'm not sure this mattered.
In my case, at work. We are leveraging the managed certificate a lot in order to provide dynamic environment for Developers & QA. As a result, we are provisioning & removing managed certificate quite a lot. This mean that we are also updating the Ingress resource as we are generating & removing managed certificate.
What we have founded out is that even if you delete the reference of the managed certificate from this annotation:
networking.gke.io/managed-certificates: <list>
It seems that randomly the Ingress does not remove the associated ssl-certificates from the LoadBalancer.
ingress.gcp.kubernetes.io/pre-shared-cert: <list>
As a result, when the managed certificate is deleted. The ingress will be "stuck" in a way, that no new managed certificate could be provision. Hence, new managed-ceritifcate will after some times transition from PROVISIONING state to FAILED_NOT_VISIBLE state
The only solution that we founded out so far, is that if a new certificate does not get provision after 30min. We will check if the annotation ingress.gcp.kubernetes.io/pre-shared-cert contains ssl-certificate that does not exist anymore.
You can check existing ssl-certificate with the command below
gcloud compute ssl-certificates list
If it happens that one ssl-certificate that does not exist anymore is still hanging around in the annotation. We'll then remove the unnecessary ssl-certificate from the ingress.gcp.kubernetes.io/pre-shared-cert annotation manually.
After applying the updated configuration, in about 5 minutes, the new managed certificate which was in FAILED_NOT_VISIBLE state should be provision and in ACTIVE state.
As already pointed by Mitzi https://stackoverflow.com/a/66578266/7588668
This is what worked for me
Create cert with subdomains/domains
Must Add it load balancer ( I was waiting for it to become active but only when you add it becomes active !! )
Add static IP as A record for domains/subdomain
It worked in 5min
In my case I needed alter the healthcheck and point it to the proper endpoint ( /healthz on nginx-ingress) and after the healtcheck returned true I had to make sure the managed certificate was created in the same namespace as the gce-ingress. After these two things were done it finally went through, otherwise I got the same error. "FAILED_NOT_VISIBLE"
I met the same issue.
I fixed it by re-looking at the documentation.
https://cloud.google.com/load-balancing/docs/ssl-certificates/troubleshooting?_ga=2.107191426.-1891616718.1598062234#domain-status
FAILED_NOT_VISIBLE
Certificate provisioning failed for the domain. Either of the following might be the issue:
The domain's DNS record doesn't resolve to the IP address of the Google Cloud load balancer. To resolve this issue, update the DNS records to point to your load balancer's IP address.
The SSL certificate isn't attached to the load balancer's target proxy. To resolve this issue, update your load balancer configuration.
Google Cloud continues to try to provision the certificate while the managed status is PROVISIONING.
Because my loadbalancer is behind cloudflare. By default cloudflare has cdn proxy enabled, and i need to first disable it after the DNS verified by Google, the cert state changed to active.
I had this problem for days. Even though the FQDN in Google Cloud public DNS zone correctly resolved to the IP of the HTTPS Load Balancer, certificate created failed with FAILED_NOT_VISIBLE. I eventually resolved the problem as my domain was set up in Google Domains with DNSSEC but had an incorrect DNSSEC record when pointing to the Google Cloud Public DNS zone. DNSSEC configuration can be verified using https://dnsviz.net/
I had the same problem. But my problem was in the deployment. I ran
kubectl describe ingress [INGRESS-NAME] -n [NAMESPACE]
The result shows an error in the resources.timeoutsec for the deployment. Allowed values must be less than 300 sec. My original value was above that. I reduced readinessProbe.timeoutSeconds to a lower number. After 30 mins the SSL cert was generated and the subdomain was verified.
It turns out that I had mistakenly done some changes to the production environment and others to staging. Everything worked as expected when I figured that out and followed the guide. :-)
I recently changed my certificate to LetsEncrypt's.
I placed the new certificate in the location of the old one:
cat /etc/haproxy/certs/fullchain.pem /etc/haproxy/certs/privkey.pem > /etc/haproxy/certs/mydomain.com.pem
And in my haproxy.cfg I have:
frontend https
bind :::8443 v4v6 ssl crt /etc/haproxy/certs/mydomain.com.pem no-sslv3
Then I ran systemctl reload haproxy, but it still brings the old one when I access it in my browser or using SSLLabs.
If I use curl -kv mydomain.com it shows the correct certificate.
I had this same issue where even after reloading the config, haproxy would randomly serve old certs. After looking around for many days the issue was that "reload" operation created a new process without killing the old one. The old processes were serving the outdated certs. You can check this by "ps aux | grep haproxy".
Fix
If your environment allows for a few seconds of downtime run "service haproxy stop" until no haproxy processes are left and then start haproxy.
**OR**
Sort by starting time and kill old processes while checking if the service is still running in between.
1 Year later EDIT
Instead of manually doing the fix mentioned above after every reload, we added a "hard-stop-after" for 600 seconds. Made sure to kill all old processes after adding the param and checked the same using ps aux. So now, older processes have to die after 600 seconds, and cannot serve outdated certs.
If you have the old pem file in /etc/haproxy/certs, HAproxy might be using it instead of new one.
I had a similar problem. HAproxy was using expired certificate that was first created for only dev.domain.com with Let's Encrypt. Later I changed certificate creation process to include multiple domains:
domain.dom www.domain.com and dev.domain.com.
The old dev.domain.com.pem was still in /etc/haproxy/certs folder. When I visited https://dev.domain.com, HAproxy used old pem certificate file and Chrome issued a warning for expired certificate.
When I deleted dev.domain.com.pem file and reloaded HAproxy, it started using new certificate and SSL is working correctly again.
My problem was historic and outdated wildcard cert that HAProxy (HA-Proxy version 1.8.19-1+deb10u3 2020/08/01) erroneously picked up and spitted out as outdated subdomain cert, both in the browser and in cURL.
Reloading, restarting, stopping+starting and even upgrading Debian did not help. What did help was to remove the outdated wildcard cert and reload.
I switched with my Domain to Cloudflare and now I'm trying to use CloudFlare's SSL Feature.
I already own a SSL cert from StartSSL so I would be possible to set the settings to 'Full (Strict)' but I don't want to so I turned it to 'Full'.
Now I'm getting 525 Errors, after a 'Retry for a live Version' everything is okay.
But I'm getting this Error everytime.
Has anyone an idea ?
Thank you
Picture of my Error
Change Cloudflare SSL/TLS encryption mode in to Flexible. it worked for me.
A 525 error indicates that CloudFlare was unable to contact your origin server and create a SSL connection with it.
This can be due to:
Your servers not having matching or compatible SSL Ciphers
Your website may not have a certificate installed properly
Your website may not have a dedicated IP OR is not configured to use SNI
Attempt to contact your hosting provider for assistance to ensure that your SSL certificate is setup correctly. If you are using a control panel, a quick google search can help you find a install guide for that said control panel.
Visit SSL/TLS tab in Cloudflare. Then:
Switch Your SSL/TLS encryption mode to Flexible.
Make sure to switch On "Always Use HTTPS" under "Edge Certificate" tab.
This will transfer all your request from Http to Https automatically. And if you'll implement custom SSL certificate on your hosting server then this 525 error will automatically disappear without changing anything on Cloudflare.
Got the same problem a few days ago.
Our DevOps contacted support and found out that Cloudflare changed certificate type or smth in that way. Asked to return everything back.
That helped.
I went through the same problem today and found that (at least in my case) it was the lack of TLS v1.3
I had just made a server using nginx + php-fpm and a self signed ssl to use below CloudFlare proxy.
When I switched from the production server to this new one, it gave error 525.
I gave the command: curl -I https://your_server_public_ip/ and it returned the error:
error: 1408F10B: SSL routines: ssl3_get_record: wrong version number
This error is described in the CloudFlare community at:
https://community.cloudflare.com/t/community-tip-fixing-error-525-ssl-handshake-failed/44256
There they advise turning off TLS v1.3 on the CloudFlare panel, but I decided to try installing it.
Using nginx is so easy that I don’t know why to have it shut down.
Only add TLSv1.3 like this-> ssl_protocols TLSv1.2 TLSv1.3; in your nginx/snippets/ssl-params.conf file (default Ubuntu 20 and 18) that will work and you still use the latest and most secure protocols.
I have a heroku app hosted at www.example.com.
I have a certificate issued for that address (www.example.com). I've installed the certificate successfully according to the heroku docs.
However, I how have a problem:
when I visit www.example.com, I get an invalid CN error (says that it is issued for *.herokuapp.com)
when I visit example.herokuapp.com, I also get an invalid CN error (this time the cert CN is for www.example.com)
So the certificates are pretty much flipped. This is still pretty fresh (<1 hour) - could waiting solve the problem?
Also: This part of the heroku docs shows an endpoint in the form example.herokussl.com
$ heroku certs:info
Fetching SSL Endpoint example-2121.herokussl.com info for example... done
And I'm getting the standard example.herokuapp.com endpoint, so I did not have to change the DNS settings after installing the certificate. Could that be some clue?
If you have already configured your domain DNS for non-SSL, standard http:// access, keep in mind that you need to update it again for SSL. From your Heroku account, go to your app settings, Domain section. You will see something like:
In this example, the example.com's DNS should be updated to point to example-ssl.herokussl.com (and not example-standard.herokuapp.com).
Turns out this was a DNS caching issue on my machine. Since this was Linux, changing browsers did not work (also for some reason restarting the dns deamon).
Anyway, it's fine now.
I cannot get ssl to work properly on my heroku app.
I have successfully add the crt key witch gives me:
Resolving trust chain... done
Updating SSL Endpoint aichi-7001.herokussl.com for mysite... done
Updated certificate details:
Common Name(s): mysite.com
www.mysite.com
Expires At: 2013-11-03 23:59 UTC
Issuer: /OU=Domain Control Validated/OU=Free SSL/CN=www.mysite.com
Starts At: 2013-08-05 00:00 UTC
Subject: /OU=Domain Control Validated/OU=Free SSL/CN=www.mysite.com
SSL certificate is verified by a root authority.
but when i try to load the page on the browser, i get the following message:
This is probably not the site you are looking for!
You attempted to reach www.mysite.com.br, but instead you actually reached a server identifying itself as *.herokuapp.com. This may be caused by a misconfiguration on the server or by something more serious. An attacker on your network could be trying to get you to visit a fake (and potentially harmful) version of www.mysite.com.br.
You should not proceed, especially if you have never seen this warning before for this site.
Any ideias on where did I mistake?
You need to add the domain www.mysite.com to your Heroku app like so:
$ heroku domains:add www.mysite.com
Added www.mysite.com to example... done