Google Cloud Load Balancer - 502 - Unmanaged instance group failing health checks - ssl

I currently have an HTTPS Load Balancer set up with a 443 frontend, backend, and health check, serving a single-host nginx instance.
When navigating directly to the host via browser the page loads correctly with valid SSL certs.
When trying to access the site through the load balancer IP, I receive a 502 - Server Error message. Checking the Google logs, I see "failed_to_pick_backend" errors at the load balancer, and I also notice that it is failing health checks.
Some digging around leads me to these two links: https://cloudplatform.googleblog.com/2015/07/Debugging-Health-Checks-in-Load-Balancing-on-Google-Compute-Engine.html
https://github.com/coreos/bugs/issues/1195
Issue #1 - Not sure if google-address-manager is running on the server (RHEL 7). I do not see an entry for the HTTPS load balancer IP in the routes. The Google SDK is installed. This is a Google-provided image, and if I update the IP address in the console, it also gets updated on the host. How do I check if google-address-manager is running on RHEL 7?
[root@server]# ip route ls table local type local scope host
10.212.2.40 dev eth0 proto kernel src 10.212.2.40
127.0.0.0/8 dev lo proto kernel src 127.0.0.1
127.0.0.1 dev lo proto kernel src 127.0.0.1
Output of all google services
[root@server]# systemctl list-unit-files
google-accounts-daemon.service enabled
google-clock-skew-daemon.service enabled
google-instance-setup.service enabled
google-ip-forwarding-daemon.service enabled
google-network-setup.service enabled
google-shutdown-scripts.service enabled
google-startup-scripts.service enabled
Issue #2: Not receiving a 200 OK response. The certificate is valid and the same on both the LB and the server. When running curl against the app server, I receive this response:
root@server.com curl -I https://app-server.com
curl: (60) SSL certificate problem: unable to get local issuer certificate
More details here: https://curl.haxx.se/docs/sslcerts.html
Thoughts?

You should add firewall rules for the health check source ranges (https://cloud.google.com/compute/docs/load-balancing/health-checks#health_check_source_ips_and_firewall_rules) and make sure that your backend service listens on the load balancer IP (easiest is to bind to 0.0.0.0). This is definitely true for an internal load balancer; I'm not sure about HTTPS with an external IP.
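For reference, a minimal sketch of such a firewall rule with gcloud (the rule name and network are assumptions; adjust the port to whatever your backends listen on):
# Allow Google health-check probes through to the backends
gcloud compute firewall-rules create allow-lb-health-checks --network default --source-ranges 130.211.0.0/22,35.191.0.0/16 --allow tcp:443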

A couple of updates and lessons learned:
I have found out that "google-address-manager" is now deprecated and has been replaced by "google-ip-forwarding-daemon", which is running.
[root@server ~]# sudo service google-ip-forwarding-daemon status
Redirecting to /bin/systemctl status google-ip-forwarding-daemon.service
google-ip-forwarding-daemon.service - Google Compute Engine IP Forwarding Daemon
Loaded: loaded (/usr/lib/systemd/system/google-ip-forwarding-daemon.service; enabled; vendor preset: enabled)
Active: active (running) since Fri 2017-12-22 20:45:27 UTC; 17h ago
Main PID: 1150 (google_ip_forwa)
CGroup: /system.slice/google-ip-forwarding-daemon.service
└─1150 /usr/bin/python /usr/bin/google_ip_forwarding_daemon
There is an active firewall rule allowing IP ranges 130.211.0.0/22 and 35.191.0.0/16 for port 443. The target is also properly set.
Finally, the health check is currently using the default "/" path. The developers had put authentication in front of the site during the development process, and when I bypassed the SSL cert error, curl returned a 401 Unauthorized. This was the root cause of the issue we were experiencing. To remedy it, we modified the nginx basic authentication configuration to disable authentication for a new route (e.g. /health).
Once the nginx configuration was updated and the health check path was changed to the new /health route, we received valid 200 responses. This allowed the health check to mark the instances healthy and let the LB pass traffic through.
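For anyone hitting the same 401, here is a minimal sketch of the nginx change, assuming a site-wide auth_basic block and a made-up /health location:
location = /health {
    auth_basic off;    # bypass the site-wide basic auth just for the health check path
    access_log off;
    return 200;
}
After reloading nginx, something like curl -kI https://app-server.com/health (with -k only to skip certificate verification while testing) should return 200 instead of 401.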

Related

How to make my Google Cloud Load Balancer work?

I followed the documentation for creating content-based load balancing: https://cloud.google.com/load-balancing/docs/https/content-based-example
I want to reach the external address over HTTPS and have the load balancer connect to the VMs over plain HTTP.
Both VMs work as expected and return the proper answer when reached by IP address. The LB's settings seem fine: both health checks are passing and the Google-managed SSL certificate is ACTIVE.
However, when I try to reach Load Balancer's IP address or domain I get 502.
LB IP is 35.244.161.226 wciel.pl
Load Balancer's logs show statusDetails: "failed_to_connect_to_backend"
I attached screenshots of my Google Cloud Console.
Please advise.
me#machine:$ gcloud beta compute ssl-certificates list
NAME TYPE CREATION_TIMESTAMP EXPIRE_TIME MANAGED_STATUS
wciel-pl-certificate2 MANAGED 2019-08-11T03:20:15.971-07:00 2019-11-09T01:27:44.000-08:00 ACTIVE
www.wciel.pl: ACTIVE
I think there is a mismatch in the backend service configuration. From the details of web-map-backend-service, it seems your service is listening on port 80; however, you have configured the backend service with port 443.
If you don't require secure communication between the LB and the VMs, I would recommend the following:
Change backend protocol from HTTPS to HTTP
Edit backend Port numbers from 443 to 80
Save and update the configuration.
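A hedged sketch of those changes with gcloud (the backend service name comes from the question above; the instance group name, zone and named port are placeholders):
# Point the backend service at plain HTTP and a port-80 named port
gcloud compute backend-services update web-map-backend-service --global --protocol HTTP --port-name http
# Make sure the instance group maps that named port to 80
gcloud compute instance-groups set-named-ports YOUR_INSTANCE_GROUP --named-ports http:80 --zone YOUR_ZONE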

Cannot link an HTTP Load Balancer to a backend (502 Bad Gateway)

On the backend I have a Kubernetes node exposing a Service of type NodePort on port 32656. If I create a firewall rule for <node_ip>:32656 to allow traffic, I can open the backend in the browser at this address: http://<node_ip>:32656.
What I am trying to achieve now is to create an HTTP Load Balancer and link it to the above backend. I use the following script to create the required infrastructure:
#!/bin/bash
GROUP_NAME="gke-service-cluster-61155cae-group"
HEALTH_CHECK_NAME="test-health-check"
BACKEND_SERVICE_NAME="test-backend-service"
URL_MAP_NAME="test-url-map"
TARGET_PROXY_NAME="test-target-proxy"
GLOBAL_FORWARDING_RULE_NAME="test-global-rule"
NODE_PORT="32656"
PORT_NAME="http"
# instance group named ports
gcloud compute instance-groups set-named-ports "$GROUP_NAME" --named-ports "$PORT_NAME:$NODE_PORT"
# health check
gcloud compute http-health-checks create --format none "$HEALTH_CHECK_NAME" --check-interval "5m" --healthy-threshold "1" --timeout "5m" --unhealthy-threshold "10"
# backend service
gcloud compute backend-services create "$BACKEND_SERVICE_NAME" --http-health-check "$HEALTH_CHECK_NAME" --port-name "$PORT_NAME" --timeout "30"
gcloud compute backend-services add-backend "$BACKEND_SERVICE_NAME" --instance-group "$GROUP_NAME" --balancing-mode "UTILIZATION" --capacity-scaler "1" --max-utilization "1"
# URL map
gcloud compute url-maps create "$URL_MAP_NAME" --default-service "$BACKEND_SERVICE_NAME"
# target proxy
gcloud compute target-http-proxies create "$TARGET_PROXY_NAME" --url-map "$URL_MAP_NAME"
# global forwarding rule
gcloud compute forwarding-rules create "$GLOBAL_FORWARDING_RULE_NAME" --global --ip-protocol "TCP" --ports "80" --target-http-proxy "$TARGET_PROXY_NAME"
But I get the following response from the Load Balancer accessed through the public IP in the Frontend configuration:
Error: Server Error
The server encountered a temporary error and could not complete your
request. Please try again in 30 seconds.
The health check is left with the default values (/ and 80), and the backend service responds quickly with a status 200.
I have also created a firewall rule that accepts any source and all TCP ports, with no target specified (i.e. all targets).
Considering that I get the same result (Server Error) regardless of the port I choose in the instance group, the problem should be somewhere in the configuration of the HTTP Load Balancer (something with the health checks, maybe?).
What am I missing from completing the linking between the frontend and the backend?
I assume you actually have instances in the instance group and that the firewall rule is not restricted to a specific source range. Can you check your logs for a Google health check request? (The User-Agent will have "Google" in it.)
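One hedged way to look for those probes, assuming the node's web server writes standard access logs (the log path is only an example; GCE health checkers typically identify themselves with a User-Agent containing "GoogleHC"):
sudo grep -i "googlehc" /var/log/nginx/access.log | tail -n 5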
What version of Kubernetes are you running? FYI, there's a resource in 1.2 that hooks this up for you automatically: http://kubernetes.io/docs/user-guide/ingress/; just make sure you follow these: https://github.com/kubernetes/contrib/blob/master/ingress/controllers/gce/BETA_LIMITATIONS.md.
More specifically: in 1.2 you need to create a firewall rule and a Service of type=NodePort (both of which you already seem to have), plus a health check on that service at "/" (which you don't have; this requirement is relaxed in 1.3, but 1.3 is not out yet).
Also note that you can't put the same instance into two load-balanced IGs, so to use the Ingress mentioned above you will have to clean up your existing load balancer (or at least remove the instances from the IG and free up enough quota so the Ingress controller can do its thing).
A few of the things already mentioned can be wrong here:
firewall rules need to apply to all hosts, or they need to target the same network tag that the machines in the instance group have
by default, the node should return 200 at /; configuring readiness and liveness probes to use a different path did not work for me
It seems you are doing by hand things that can be automated, so I can really recommend:
https://cloud.google.com/kubernetes-engine/docs/how-to/load-balance-ingress
This walks through the steps that set up the firewall and port forwarding for you, and it may also show you what you are missing.
I noticed myself, when using an app on 8080 exposed on 80 (like one of the deployments in the example), that the load balancer stayed unhealthy until I had / returning 200 (and a /healthz path I added as well). So basically that container now exposes a web server on port 8080 returning 200, and the other config wires that up to port 80.
When it comes to firewall rules, make sure they either apply to all machines or match the network tag of your nodes; otherwise they won't work.
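For example, a sketch of such a rule scoped to the nodes via a tag (the rule name and tag are placeholders; 32656 is the NodePort from the question):
gcloud compute firewall-rules create allow-lb-to-nodeport --source-ranges 130.211.0.0/22,35.191.0.0/16 --target-tags YOUR_NODE_TAG --allow tcp:32656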
The 502 error usually comes from the load balancer, which will not forward your request if the health check does not pass.
Could you make your service type LoadBalancer (http://kubernetes.io/docs/user-guide/services/#type-loadbalancer), which would set this all up automatically? This assumes you have the cloud provider flag set for Google Cloud.
After you deploy, describe the service and it should give you the endpoint that has been assigned.
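A rough sketch of that flow, assuming a deployment named my-app listening on 8080 (both names are made up):
# Provisions the Google Cloud forwarding rule, target pool and health check for you
kubectl expose deployment my-app --type=LoadBalancer --port=80 --target-port=8080
# The assigned endpoint shows up under "LoadBalancer Ingress" once provisioning finishes
kubectl describe service my-app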

503 Service Unavailable for Elastic Beanstalk HTTPS Configuration

I have a PHP app deployed on Elastic Beanstalk, currently with a single instance behind a load balancer, and am attempting to enable SSL. The current configuration is as follows:
- I've uploaded my certs to IAM successfully
- On the EB console load balancer config, "Listener Port" is off, "Secure Listener Port" is "443", and "Protocol" is set to "HTTPS"
- In my load balancer, accessed through the EC2 console, the Load Balancer Port/Protocol is 443/HTTPS and the Instance Port/Protocol is 80/HTTP (the default HTTP/80 to HTTP/80 listener is still there, but I've tried removing it to no joy)
- My security groups for both the load balancer and the instance are configured the same: inbound allows all connections from either security group, plus inbound HTTP on 80 and HTTPS on 443 (source = 0.0.0.0/0)
When attempting to access the URL https://myurl.com, I get a 503 Service Unavailable (server at capacity). I suspect there is an issue with my security group configuration, but can't figure out what it is (I have tried referring to this thread).
Any ideas?
I just experienced this on my Elastic Beanstalk deployment, and the reason was that my Elastic Load Balancer had 0 healthy instances in service. There are different health check settings: one checks over HTTP:80 and one checks over TCP:80. I haven't investigated thoroughly, but for some reason the HTTP:80 setting resulted in my servers being marked as unhealthy, while TCP:80 tested correctly. If this comes up again, I would suggest looking there.
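If it helps, a hedged way to inspect and adjust this from the AWS CLI for a classic ELB (the load balancer name is a placeholder and the thresholds are just example values):
# See which instances the ELB currently considers healthy
aws elb describe-instance-health --load-balancer-name YOUR_ELB_NAME
# Switch the health check target from HTTP:80/ to a plain TCP check on port 80
aws elb configure-health-check --load-balancer-name YOUR_ELB_NAME --health-check Target=TCP:80,Interval=30,Timeout=5,UnhealthyThreshold=5,HealthyThreshold=3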

How to set up a Puppet master as a node

Currently I have a master and an agent working on separate CentOS 6.5 VMs. I would like to be able to configure the master itself, as I will be tearing it down and building a new master every time.
How can I get puppet agent --test --noop to work on my master machine as well?
Currently I receive an error:
Error: Could not request certificate: 502 "Proxy Error ( The specified Secure Sockets Layer (SSL) port is not allowed. ISA Server is not configured to allow SSL requests from this port. Most Web browsers use port 443 for SSL requests. )"
SSL requests seem to be set up for port 443. Any thoughts?
Thank you very much!
Jason
Credit to Felix Frank, mr_tron
The issue seemed to be solved by removing the http_proxy declaration from the .bashrc file and anywhere else it appeared.
The Puppet master is now able to act as an agent.
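For anyone else chasing this, a quick sketch of how to hunt down and clear stray proxy settings before retesting (the file paths are just the usual suspects):
grep -R "http_proxy" ~/.bashrc /etc/profile.d/ /etc/environment 2>/dev/null
unset http_proxy https_proxy HTTP_PROXY HTTPS_PROXY
puppet agent --test --noop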
Thank you,
Jason

HTTP access on GCE instance after firewall rule added

I'm trying to get Apache working on a GCE instance.
Following GCE's Quickstart guide, I did the following:
Created instance "my-instance" in "my-project" (CentOS image)
Installed httpd, verified it's running
Added the following firewall rule:
gcutil addfirewall http2 --description="Incoming http allowed." --allowed="tcp:http"
and did the same for HTTPS and ICMP
Verified through the GCE GUI that these rules were added to the default network
I can ping my instance's IP address, but I can't get an HTTP response. I've tried through the browser and with a curl command; no dice. It works fine on localhost, so I know Apache is returning the index.html page.
When I use curl from a remote host, the error is:
curl: (7) Failed connect to (instance ip addr):80; Connection refused
Thoughts?
I did some experiments to replicate this. In short, I believe HTTP port 80 may be blocked by iptables firewall rules on the local CentOS instance. This appears to be the default behavior.
I have a GCE firewall rule set up to allow port 80 traffic to all instances. I created a CentOS-based image via the Cloud Console (which is indeed using the v1 API), logged in via SSH, and started a web server on port 80. I was not able to hit the web server from my laptop. However, I was also not able to hit it from another instance in my project. This led me to suspect a firewall local to the instance rather than Compute Engine's firewall.
I ran this command (which drops the default reject of all ports for testing - this is unsafe to do for machines which are directly exposed to the internet):
$ sudo iptables -D INPUT -j REJECT --reject-with icmp-host-prohibited
After running that, I was able to hit my web server from both another instance and my laptop. Note that this change is lost after restarting the instance. I don't know the correct procedure for changing the default firewall rules on CentOS.
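One commonly used approach on CentOS 6 (hedging here, since firewall management varies between setups) is to insert an ACCEPT rule for just port 80 and persist it, instead of dropping the whole REJECT rule:
sudo iptables -I INPUT -p tcp --dport 80 -j ACCEPT
sudo service iptables save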
Please try a similar experiment on your instances; in particular, try to hit the web server from another Compute Engine instance, since service-level firewalls do not block traffic between instances on the same network.
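For the instance-to-instance test, something like this from a second VM in the same network should work (the internal IP is a placeholder):
curl -I http://INTERNAL_IP_OF_INSTANCE
If that still returns connection refused even though the GCE firewall allows port 80, the blocker is almost certainly local to the instance (iptables rules or the service binding).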