Endpoint Paths for APIs inside Docker and Kubernetes

I am a newbie to Docker and Kubernetes, and I am now developing RESTful APIs that will later be deployed to Docker containers in a Kubernetes cluster.
How will the paths of the endpoints change? I have heard that Docker Swarm and Kubernetes add some words to the endpoints.

The "path" part of the endpoint URLs themselves (for this SO question, the /questions/53008947/... part) won't change. But the rest of the URL might.
Docker publishes services at a TCP-port level (docker run -p option, Docker Compose ports: section) and doesn't look at what traffic is going over a port. If you have something like an Apache or nginx proxy as part of your stack that might change the HTTP-level path mappings, but you'd probably be aware of that in your environment.
Kubernetes works similarly, but there are more layers. A container runs in a Pod, and can publish some port out of the Pod. That's not used directly; instead, a Service refers to the Pod (by its labels) and republishes its ports, possibly on different port numbers. The Service has a DNS name service-name.namespace.svc.cluster.local that can be used within the cluster; you can also configure the Service to be reachable on a fixed TCP port on every node in the cluster (NodePort) or, if your Kubernetes is running on a public-cloud provider, to create a load balancer there (LoadBalancer). Again, all of this is strictly at the TCP level and doesn't affect HTTP paths.
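As a sketch of what that republishing looks like (the names and ports here are hypothetical), a Service selecting a Pod by its labels might be:

```yaml
# Hypothetical Service: selects Pods labeled app: my-api and
# republishes container port 8080 as Service port 80.
apiVersion: v1
kind: Service
metadata:
  name: my-api
  namespace: default
spec:
  selector:
    app: my-api
  ports:
    - port: 80          # port clients use: my-api.default.svc.cluster.local:80
      targetPort: 8080  # port the container actually listens on
```

Inside the cluster, other Pods would then reach it at http://my-api.default.svc.cluster.local/some/path, with the path untouched.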
There is one other Kubernetes piece, an Ingress controller, which acts as a declarative wrapper around the nginx proxy (or something else with similar functionality). That does operate at the HTTP level and could change paths.
The other corollary to this is that the URL to reach a service might be different in different environments: http://localhost:12345/path in a local development setup, http://other_service:8080/path in Docker Compose, http://other-service/path in Kubernetes, https://api.example.com/other/path in production. You need some way to make that configurable (often an environment variable).
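One common way to make that configurable is an environment variable set in the Deployment; a sketch (image and variable names are assumptions):

```yaml
# Hypothetical Deployment fragment: the base URL of a dependency is
# injected as an environment variable so each environment can override it.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-api
spec:
  replicas: 1
  selector:
    matchLabels:
      app: my-api
  template:
    metadata:
      labels:
        app: my-api
    spec:
      containers:
        - name: my-api
          image: registry.example.com/my-api:latest
          env:
            - name: OTHER_SERVICE_URL
              # Kubernetes value; Docker Compose would set
              # http://other_service:8080/path instead.
              value: "http://other-service/path"
```

The application then builds request URLs from OTHER_SERVICE_URL instead of hard-coding a host and port.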

Related

TCP route binding to specific hosts in traefik

We are using Traefik to simulate our production environment. We have multiple services running in Kubernetes on Docker; a few of them are Java applications. In this stack, a developer can come and deploy code from whichever Git branch they are working on, so at a given point we can have hundreds of full-fledged stacks running. We use Traefik for certificate resolution so that each stack can be hosted based on branch names and so on.
Now I want to give developers the facility to debug their Java applications. It's fairly simple to do in Java: you attach a Java agent while starting up the Docker image for the application. Basically, we pass -agentlib:jdwp=transport=dt_socket,server=y,suspend=n,address=37000 as a JVM argument and the JVM is ready to accept remote debuggers.
Now, the JVM is using the JDWP protocol which, as far as I understand, is a TCP protocol. My problem is: I want Traefik to create routes dynamically based on my Docker service labels. That much I was also able to figure out; I used these labels in the Docker service.
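The labels in question would look roughly like this (a sketch only; the router name, entrypoint name, and image are assumptions, not the poster's actual configuration):

```yaml
# Hypothetical docker-compose fragment: a Traefik v2 TCP router for the
# JDWP debug port. HostSNI(`*`) matches any connection on the entrypoint,
# which is why Traefik cannot tell per-branch stacks apart here: JDWP is
# plain TCP without TLS, so there is no SNI hostname to route on.
services:
  my-java-app:
    image: registry.example.com/my-java-app:branch-foo
    labels:
      - "traefik.tcp.routers.debug-foo.rule=HostSNI(`*`)"
      - "traefik.tcp.routers.debug-foo.entrypoints=jdwp"
      - "traefik.tcp.services.debug-foo.loadbalancer.server.port=37000"
```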
And this is how you connect to JVM remotely.
Now, if in the rule I use HostSNI(`*`), then I am able to connect to the container. But the problem is that when I make a remote connection for debugging, Traefik can direct my request to any container, so the whole thing won't work as expected.
I believe there must be some other supported matcher for TCP rules, apart from only HostSNI. What is your opinion on this? Or have I missed something here?

Apache web server and microservices with Docker

I have a few Spring Boot microservices running on Docker, and an Apache web server (also running on Docker) for all the static content. The microservices are consumed by the web browser. The problem is, I don't know how I should reference the microservices from HTML or JavaScript:
the microservice runs on a different port
also might run on a different host
the browser complains about links
Googling the problem points me toward Netflix eureka or Apache Camel, but I'm not sure these are the right solutions.
Let's first think about deployment. You mention that the Docker containers might run on different machines. I recommend using container orchestrators like Docker Swarm or Kubernetes to manage a cluster and communication between microservices (typically via DNS).
Generally, you want to hide all your microservices behind one API path. The outside world does not need to know that your server application consists of multiple microservices. You can use a simple reverse proxy for this. I personally like Traefik because you can configure the routing paths in the Docker ecosystem via labels.
You say you consume the microservice APIs with a browser, so is it a web client application? If so, I recommend serving it as a Docker container as well and embedding it into the routing using relative paths. E.g. the UI is served at / and the microservices at /api/{service}/{path}. Then the UI application can use relative paths, because everything is served by the same reverse proxy and thus under the same URL (=> no CORS issues). Additionally, you can deploy to any IP; the routing stays the same and does not have to be adjusted.
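A minimal sketch of that routing with Traefik labels (service names, images, and paths are hypothetical):

```yaml
# Hypothetical docker-compose fragment: Traefik routes / to the UI and
# /api/users to a microservice, so the browser sees a single origin and
# the UI can use relative paths (no CORS issues).
services:
  ui:
    image: registry.example.com/ui:latest
    labels:
      - "traefik.http.routers.ui.rule=PathPrefix(`/`)"
  users:
    image: registry.example.com/users:latest
    labels:
      - "traefik.http.routers.users.rule=PathPrefix(`/api/users`)"
```

Traefik matches the longest path prefix first, so /api/users/... goes to the users service and everything else falls through to the UI.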

How should I deploy Traefik in my environment?

I have a set of applications that we're currently transitioning into a more "cloud-native" architecture. We're beginning by moving them into containers (Docker on Windows), and as part of this, we're hoping to use a load-balancing proxy to handle traffic to the containers.
Having looked at several options, we're hoping to use Traefik as a load-balancing proxy in this iteration of our architecture. It may or may not be important to note that all traffic through Traefik in this setup will be internal; it will not be serving any external traffic. I am also working in a self-hosted situation; because of contractual concerns, cloud providers such as AWS and Azure are not currently available to me.
My question, now, is how Traefik might best be deployed. I can see at least two options:
First, Traefik could be deployed on separate "load-balancer" hosts. I could use round-robin DNS for these hosts, and pass traffic through them.
Second, Traefik could be deployed on each of my application hosts. In this setup, Traefik would be more of a "side-car", and applications on each host would use the local Traefik instance as a proxy to the other hosts' services.
An issue I see is that neither of these setups achieves true high availability. In either case, a Traefik instance crashing would result in unavailability for at least some services. In the first case, round-robin DNS and a short TTL might mitigate this?
Is lack of high-availability avoidable? Is there an alternative way to architect this solution? Does Traefik itself offer guidance on how this solution should be structured?

How can you publish a Kubernetes Service without using the type LoadBalancer (on GCP)

I would like to avoid using type: "LoadBalancer" for a certain Kubernetes Service, but still to be able to publish it on the Internet. I am using Google Cloud Platform (GCP) to run a Kubernetes cluster currently running on a single node.
I tried to use the externalIPs Service configuration, giving it, in turn, the IPs of:
the instance hosting the Kubernetes cluster (its external IP, which also coincides with the IP address of the Kubernetes node as reported by kubectl describe node)
the Kubernetes cluster endpoint (as reported by the Google Cloud Console in the details of the cluster)
the public/external IP of another Kubernetes Service of type LoadBalancer running on the same node.
None of the above helped me reach my application using the Kubernetes Service with an externalIPs configuration.
So, how can I publish a service on the Internet without using a LoadBalancer-type Kubernetes Service?
If you don't want to use a LoadBalancer service, other options for exposing your service publicly are:
Type NodePort
Create your service with type set to NodePort, and Kubernetes will allocate a port on all of your node VMs on which your service will be exposed (docs). E.g. if you have 2 nodes, w/ public IPs 12.34.56.78 and 23.45.67.89, and Kubernetes assigns your service port 31234, then the service will be available publicly on both 12.34.56.78:31234 & 23.45.67.89:31234
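A minimal sketch of such a Service (the selector and the specific nodePort are assumptions):

```yaml
# Hypothetical NodePort Service: Kubernetes allocates (or you request)
# a port in the 30000-32767 range, opened on every node in the cluster.
apiVersion: v1
kind: Service
metadata:
  name: my-service
spec:
  type: NodePort
  selector:
    app: my-app
  ports:
    - port: 80
      targetPort: 8080
      nodePort: 31234   # omit this field to let Kubernetes pick a port
```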
Specify externalIPs
If you have the ability to route public IPs to your nodes, you can specify externalIPs in your service to tell Kubernetes "If you see something come in destined for that IP w/ my service port, route it to me." (docs)
The cluster endpoint won't work for this because that is only the IP of your Kubernetes master. The public IP of another LoadBalancer service won't work because the LoadBalancer is only configured to route the port of that original service. I'd expect the node IP to work, but it may conflict if your service port is a privileged port.
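A sketch of the externalIPs variant, assuming 12.34.56.78 is a public IP that actually routes to one of your nodes:

```yaml
# Hypothetical Service using externalIPs: traffic arriving at a node
# destined for 12.34.56.78:8080 is routed to this Service's endpoints.
apiVersion: v1
kind: Service
metadata:
  name: my-service
spec:
  selector:
    app: my-app
  ports:
    - port: 8080
      targetPort: 8080
  externalIPs:
    - 12.34.56.78
```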
Use the /proxy/ endpoint
The Kubernetes API includes a /proxy/ endpoint that allows you to access services on the cluster endpoint IP. E.g. if your cluster endpoint is 1.2.3.4, you could reach my-service in namespace my-ns by accessing https://1.2.3.4/api/v1/proxy/namespaces/my-ns/services/my-service with your cluster credentials. This should really only be used for testing/debugging, as it takes all traffic through your Kubernetes master on the way to the service (extra hops, SPOF, etc.).
There's another option: set the hostNetwork flag on your pod.
For example, you can use helm3 to install nginx this way:
helm install --set controller.hostNetwork=true nginx-ingress nginx-stable/nginx-ingress
nginx is then available on ports 80 and 443 on the IP address of the node that runs the pod. You can use node selectors, affinity, or other tools to influence this choice.
There are a few idiomatic ways to expose a service externally in Kubernetes (see note#1):
Service.Type=LoadBalancer, as the OP pointed out.
Service.Type=NodePort, which exposes the Service on each node's IP at a static port.
Service.Type=ExternalName, which maps the Service to the contents of the externalName field by returning a CNAME record (you need CoreDNS version 1.7 or higher to use the ExternalName type).
Ingress. This is a newer concept that exposes external HTTP and/or HTTPS routes to services within the Kubernetes cluster; you can even map one route to multiple services. However, it only maps HTTP and/or HTTPS routes. (See note#2)
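A sketch of the Ingress option using the networking.k8s.io/v1 API (the host, paths, and Service names are hypothetical):

```yaml
# Hypothetical Ingress: routes two HTTP paths on one host to two
# different Services behind a single external entry point.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: example-ingress
spec:
  rules:
    - host: api.example.com
      http:
        paths:
          - path: /users
            pathType: Prefix
            backend:
              service:
                name: users-service
                port:
                  number: 80
          - path: /orders
            pathType: Prefix
            backend:
              service:
                name: orders-service
                port:
                  number: 80
```

Note that an Ingress resource only takes effect if an Ingress controller (nginx, Traefik, GKE's built-in controller, etc.) is running in the cluster.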

Kubernetes API for provisioning pods-as-a-service?

Currently I have an app (myapp) that deploys as a Java web app running on top of a "raw" (Ubuntu) VM. In production there are essentially 5 - 10 VMs running at any given time, all load balanced behind an nginx load balancer. Each VM is managed by Chef, which injects the correct env vars and provides the app with runtime arguments that make sense for production. So again: load balancing via nginx and configuration via Chef.
I am now interested in containerizing my future workloads, and porting this app over to Docker/Kubernetes. I'm trying to see what features Kubernetes offers that could replace my app's dependency on nginx and Chef.
So my concerns:
Does kube-proxy (or any other Kubernetes tool) provide subdomains or otherwise load-balanced URLs that could load balance across any number of pod replicas? In other words, if I "push" my newly containerized app/image to the Kubernetes API, is there a way for Kubernetes to make the image available as, say, 10 pod replicas all load balanced behind myapp.example.com? If not, what integration between Kubernetes and networking software (DNS/DHCP) is available?
Does Kubernetes (say, perhaps via etcd?) offer any sort of key-value-based configuration? It would be nice to send a command to the Kubernetes API and give it labels like myapp:nonprod or myapp:prod and have Kubernetes "inject" the correct KV pairs into the running containers. For instance, perhaps in the "nonprod" environment the app connects to a MySQL database named mydb-nonprod.example.com, but in prod it connects to an RDS cluster. Or something.
Does Kubernetes offer service-registry-like features that could replace Consul/ZooKeeper?
Answers:
1) DNS subdomains in Kubernetes:
https://github.com/kubernetes/kubernetes/tree/master/cluster/addons/dns
Additionally, each Service load balancer gets a static IP address, so you can also program other DNS names if you want to target that IP address.
2) Key/Value pairs
At creation time you can inject arbitrary key/value environment variables and then use those in your scripts/config, e.g. you could connect to ${DB_HOST}.
Though for your concrete example, we suggest using Namespaces (http://kubernetes.io/v1.0/docs/admin/namespaces/README.html): you can have a "prod" namespace and a "dev" namespace, and the DNS names of services resolve within those namespaces (e.g. mysql.prod.cluster.internal and mysql.dev.cluster.internal).
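A sketch of how the two combine (image and names are hypothetical): the same manifest can be deployed into either namespace, because a short Service name resolves within the Pod's own namespace.

```yaml
# Hypothetical Pod fragment deployed into the "prod" or "dev" namespace:
# the short DNS name "mysql" resolves to the mysql Service in whichever
# namespace the Pod runs in, so no per-environment edits are needed.
apiVersion: v1
kind: Pod
metadata:
  name: myapp
spec:
  containers:
    - name: myapp
      image: registry.example.com/myapp:latest
      env:
        - name: DB_HOST
          value: "mysql"   # resolves to mysql.<current-namespace>.svc.cluster.local
```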
3) Yes, this is what the DNS and Service object provide (http://kubernetes.io/v1.0/docs/user-guide/walkthrough/k8s201.html#services)