I have a set of applications that we're currently transitioning into a more "cloud-native" architecture. We're beginning by moving them into containers (Docker on Windows), and as part of this, we're hoping to use a load-balancing proxy to handle traffic to the containers.
Having looked at several options, we're hoping to use Traefik as a load-balancing proxy in this iteration of our architecture. It may or may not be important to note that all traffic through Traefik in this setup will be internal; it will not be serving any external traffic. I am also working in a self-hosted situation; because of contractual concerns, cloud providers such as AWS and Azure are not currently available to me.
My question, now, is how Traefik might best be deployed. I can see at least two options:
First, Traefik could be deployed on separate "load-balancer" hosts. I could use round-robin DNS for these hosts, and pass traffic through them.
Second, Traefik could be deployed on each of my application hosts. In this setup, Traefik would be more of a "side-car", and applications on each host would use the local Traefik instance as a proxy to the other hosts' services.
An issue I see is that in neither of these setups is true high availability achieved. In either case, a Traefik instance crashing would result in unavailability for at least some services. In the first case, round-robin DNS and short TTL might mitigate this?
Is lack of high-availability avoidable? Is there an alternative way to architect this solution? Does Traefik itself offer guidance on how this solution should be structured?
Related
Maybe this is a dumb question, but I really don't know if I have to secure applications with tokens etc. within a kubernetes cluster.
So for example I make a grpc-call from a client within the cluster to a server within the cluster.
I thought this should be secure without authenticating the client with a token or something like that, because (if I understood it right) kubernetes pods and services work within a VPN which won't be exposed as long as it's not told to.
But is this really secure, should I somehow build an authorization system within my cluster?
Also how can I use a service to load balance the grpc-calls over the server pods without exposing the server outside the cluster?
If you have a service, it already has built-in load balancer when you have more than one replica out of the box.
Also Kubernetes traffic is internal within the cluster out of the box, unless you explicitly expose traffic using LoadBalancer, Ingress or NodePort.
Does it mean traffic is safe? No.
By default, everything is allowed within Kubernetes cluster so every service can reach every service or pod in StatefulSet apps.
You can use NetworkPolicy to allow traffic from one service to another service and nothing else. That would increase security.
Does it mean traffic is safe now? It depends.
Authentication would add an additional security layer in case container is hacked. There could be more scenarios, but I can't think of for now.
So internal authentication is usually used to improve security in production systems.
I hope it answers the question.
Our services use a K8s service with a reverse proxy to receive a request by multiple domains and redirect to our services, additionally, we manage SSL certificates powered by let's encrypt for every user that configures their domain in our service. Resuming I have multiple .conf files in the nginx for every domain that is configured. Works really great.
But now we need to increase our levels of security and availability and now we ready to configure the ingress in K8s to handle this problem for us because they are built for it.
Everything looks fine until we discover that every time that I need to configure a new domain as a host in the ingress I need to alter the config file and re-apply.
So that's the problem, I want to apply the same concept that I already have running, but in the nginx ingress controller. It's that possible? I have more than 10k domains up and running, I can't configure all in my ingress resource file.
Any thoughts?
In terms of scaling Kubernetes 10k domains should be fine to be configured in an Ingress resource. You might want to check how much storage you have in the etcd nodes to make sure you can store enough data there.
The default etcd storage is 2Gb, but if you keep increasing it's something to keep in mind.
You can also refer to the K8s best practices when it comes to building large clusters.
Another practice that you can use is to use apply and not create when changing the ingress resource, that way the changes are incremental. Furthermore, if you are using K8s 1.18 or later you can take advantage of Server Side Apply.
I have a few spring boot microservices running on Docker, and Apache web server (also running on Docker) for all the static stuff. The microservices are consumed by the web browser. Problem is, I don't know how I should reference the microservices from html or javascript:
the microservice runs on a different port
also might run on a different host
the browser complains about links
Googling the problem points me toward Netflix eureka or Apache Camel, but I'm not sure these are the right solutions.
Let's first think about deployment. You mention that the Docker containers might run on different machines. I recommend using container orchestrators like Docker Swarm or Kubernetes to manage a cluster and communication between microservices (typically via DNS).
Generally, you want to hide all your microservices behind one API path. The outside world does not need to know that your server application consists of multiple microservices. You can use a simple reverse proxy for this. I personally like Traefik because you can configure the routing paths in the Docker ecosystem via labels.
You say you consume the microservice APIs with a browser, so is it a web client application? If so, I recommend serving it as Docker container as well and embed it into the routing by using relative paths. E.g. UI is served as / and microservices as /api/{service}/{path}. Then the UI application can use relative paths because they are served by the same reverse proxy and such under the same URL (=> no CORS issues). Additionally, you can deploy to any IP, the routing stays the same and does not have to be adjusted
I am newbie on Docker and Kubernetes. And now I am developing Restful APIs which later be deployed to Docker containers in a Kubernetes cluster.
How the path of the endpoints will be changed? I have heard that Docker-Swarm and Kubernetes add some ords on the endpoints.
The "path" part of the endpoint URLs themselves (for this SO question, the /questions/53008947/... part) won't change. But the rest of the URL might.
Docker publishes services at a TCP-port level (docker run -p option, Docker Compose ports: section) and doesn't look at what traffic is going over a port. If you have something like an Apache or nginx proxy as part of your stack that might change the HTTP-level path mappings, but you'd probably be aware of that in your environment.
Kubernetes works similarly, but there are more layers. A container runs in a Pod, and can publish some port out of the Pod. That's not used directly; instead, a Service refers to the Pod (by its labels) and republishes its ports, possibly on different port numbers. The Service has a DNS name service-name.namespace.svc.cluster.local that can be used within the cluster; you can also configure the Service to be reachable on a fixed TCP port on every node in the service (NodePort) or, if your Kubernetes is running on a public-cloud provider, to create a load balancer there (LoadBalancer). Again, all of this is strictly at the TCP level and doesn't affect HTTP paths.
There is one other Kubernetes piece, an Ingress controller, which acts as a declarative wrapper around the nginx proxy (or something else with similar functionality). That does operate at the HTTP level and could change paths.
The other corollary to this is that the URL to reach a service might be different in different environments: http://localhost:12345/path in a local development setup, http://other_service:8080/path in Docker Compose, http://other-service/path in Kubernetes, https://api.example.com/other/path in production. You need some way to make that configurable (often an environment variable).
In a scenario where there are thousands of webservices are there reasons to use also a signed cert for each microservice or it's just going to add overhead? Services communicate via VPC sitting behind a firewall while Public endpoints are behind a nginx public facing a valid CA cert.
Services are on multiple servers on aws.
From my limited experience, I believe that it is overkill. If an attacker has access to listen in or interact with your internal network then there are most likely other issues which you should be contending with.
This article on auth0.com explains the use of SSL only on connections to the external client. I also share this view and believe implementing SSL at an individual service level would get extremely difficult unless you where running some form of proxy such as HAProxy or Nginx on each individual host which is sub-optimal, especially if you're using a form of managed cluster like Kubernetes or Docker Swarm.
My current thoughts are its fine to run SSL just for your edge services, ensure you lock down your AWS network using something like Scout2 and run unencrypted for inter-service communication on your lan.
unless all intranet in the cloud are fully VLAN-configured and isolated, is it possible for other hosts that you don't own that are on the same LAN to steal your password by running a simple tcpdump? if that's the case, we need ssl or other encryption internally on the cloud too.