Kubernetes API for provisioning pods-as-a-service? - load-balancing

Currently I have an app (myapp) that deploys as a Java web app running on top of a "raw" (Ubuntu) VM. In production there are essentially 5 - 10 VMs running at any given time, all load balanced behind an nginx load balancer. Each VM is managed by Chef, which injects the correct env vars and provides the app with runtime arguments that make sense for production. So again: load balancing via nginx and configuration via Chef.
I am now interested in containerizing my future workloads, and porting this app over to Docker/Kubernetes. I'm trying to see what features Kubernetes offers that could replace my app's dependency on nginx and Chef.
So my concerns:
Does kube-proxy (or any other Kubernetes tool) provide subdomains or otherwise load-balanced URLs that could load balance across any number of pod replicas? In other words, if I "push" my newly containerized app/image to the Kubernetes API, is there a way for Kubernetes to make that image available as, say, 10 pod replicas all load balanced behind myapp.example.com? If not, what integration between Kubernetes and networking software (DNS/DHCP) is available?
Does Kubernetes (say, perhaps via etcd?) offer any sort of key-value based configuration? It would be nice to send a command to the Kubernetes API and give it labels like myapp:nonprod or myapp:prod and have Kubernetes "inject" the correct KV pairs into the running containers. For instance, perhaps in the "nonprod" environment the app connects to a MySQL database named mydb-nonprod.example.com, but in prod it connects to an RDS cluster. Or something.
Does Kubernetes offer service-registry-like features that could replace Consul/ZooKeeper?

Answers:
1) DNS subdomains in Kubernetes:
https://github.com/kubernetes/kubernetes/tree/master/cluster/addons/dns
Additionally, each Service loadbalancer gets a static IP address, so you can also program other DNS names if you want to target that IP address.
2) Key/Value pairs
At creation time you can inject arbitrary key/value environment variables and then use those in your scripts/config. e.g. you could connect to ${DB_HOST}
Though for your concrete example, we suggest using Namespaces (http://kubernetes.io/v1.0/docs/admin/namespaces/README.html): you can have a "prod" namespace and a "dev" namespace, and the DNS names of services resolve within those namespaces (e.g. mysql.prod.cluster.internal and mysql.dev.cluster.internal).
3) Yes, this is what the DNS and Service object provide (http://kubernetes.io/v1.0/docs/user-guide/walkthrough/k8s201.html#services)
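To make answers 1) and 2) concrete, here is a minimal sketch in Python that builds a Deployment with 10 replicas and a Service that load balances across them, with per-environment key/value pairs injected as environment variables. The image name example/myapp:1.0, the container port 8080, and the helper name build_manifests are assumptions for illustration, and the apps/v1 API version is that of current Kubernetes releases rather than the v1.0 docs linked above.

```python
import json


def build_manifests(name, image, replicas, env, port=8080):
    """Build a Deployment and a Service as plain dicts (hypothetical helper)."""
    labels = {"app": name}
    deployment = {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name, "labels": labels},
        "spec": {
            "replicas": replicas,
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    "containers": [
                        {
                            "name": name,
                            "image": image,
                            "ports": [{"containerPort": port}],
                            # Key/value pairs injected at creation time.
                            "env": [
                                {"name": key, "value": value}
                                for key, value in sorted(env.items())
                            ],
                        }
                    ]
                },
            },
        },
    }
    service = {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {"name": name},
        # The Service selects pods by label and spreads traffic across them.
        "spec": {"selector": labels, "ports": [{"port": 80, "targetPort": port}]},
    }
    return deployment, service


if __name__ == "__main__":
    deployment, service = build_manifests(
        "myapp",
        "example/myapp:1.0",  # assumed image name
        10,
        {"DB_HOST": "mydb-nonprod.example.com"},
    )
    bundle = {"apiVersion": "v1", "kind": "List", "items": [deployment, service]}
    print(json.dumps(bundle, indent=2))
```

The printed JSON could then be applied to a namespace of your choice, for example by piping it to kubectl apply -f - with a -n flag, so the same manifests serve "dev" and "prod" with different env values.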

Related

Deploy a dynamic Service Gateway for Lagom in production

I have developed a set of Lagom microservices. The development environment provides a default Service Gateway and Service Locator.
In a production environment I would like my services to:
register to a service registry
be available to a web app through a service locator that uses this registry
What should I use as Service Registry / Service Locator / Service Gateway ?
A simple NGINX would be a reasonable service gateway but it implies a very static configuration based on redirect rules (no actual registration).
I cannot find any code sample on this subject and the documentation is very poor (it describes well development tools but doesn't help when it comes to actual production).
The documentation on that area is vague on purpose because the ecosystem is very vast and changes fast.
You could, for example, use Consul or ZooKeeper to keep track of the instances that are running for each service and where they are running (where meaning IP:PORT). Then you would need to use a Consul-based or a ZooKeeper-based Service Locator instance. The preferred target deployment environment these days is Kubernetes (in any of its flavors), so service location is based on DNS-SRV lookups against the DNS server provided by k8s. The registration step happens automatically in a k8s setup for each pod, so you won't need to take care of that.
Then, the reverse proxy on the edge capable of directing each request to the appropriate process is a plain-old HTTP proxy that can check your service location (or cache the service location information). These days the recommendation is configuring an Ingress/Route (for k8s or OpenShift) edge proxy for each of your lagom services.
See the guide on Deploying a Lagom application to Openshift for a thorough explanation.
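As a rough illustration of the DNS-SRV based service location mentioned above, the sketch below builds the SRV record name that Kubernetes DNS publishes for a named Service port and picks an endpoint from a list of SRV answers. The service name hello, the port name http, and both function names are assumptions for illustration. The actual DNS query is left out because it needs a resolver library (dnspython is one option), and the selection rule here (lowest priority, then highest weight) is a simplification of the weighted choice described in RFC 2782.

```python
def srv_record_name(service, namespace, port_name,
                    protocol="tcp", cluster_domain="cluster.local"):
    """Name of the SRV record Kubernetes DNS serves for a named Service port."""
    return f"_{port_name}._{protocol}.{service}.{namespace}.svc.{cluster_domain}"


def pick_endpoint(records):
    """Pick one (host, port) from SRV answers.

    `records` is an iterable of (priority, weight, port, target) tuples,
    the four fields of an SRV record. Simplified rule: lowest priority
    wins, ties are broken by the highest weight.
    """
    priority, weight, port, target = min(records, key=lambda r: (r[0], -r[1]))
    return target.rstrip("."), port


if __name__ == "__main__":
    print(srv_record_name("hello", "default", "http"))
```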

Apache web server and microservices with Docker

I have a few spring boot microservices running on Docker, and Apache web server (also running on Docker) for all the static stuff. The microservices are consumed by the web browser. Problem is, I don't know how I should reference the microservices from html or javascript:
the microservice runs on a different port
also might run on a different host
the browser complains about links
Googling the problem points me toward Netflix eureka or Apache Camel, but I'm not sure these are the right solutions.
Let's first think about deployment. You mention that the Docker containers might run on different machines. I recommend using container orchestrators like Docker Swarm or Kubernetes to manage a cluster and communication between microservices (typically via DNS).
Generally, you want to hide all your microservices behind one API path. The outside world does not need to know that your server application consists of multiple microservices. You can use a simple reverse proxy for this. I personally like Traefik because you can configure the routing paths in the Docker ecosystem via labels.
You say you consume the microservice APIs with a browser, so is it a web client application? If so, I recommend serving it as a Docker container as well and embedding it into the routing by using relative paths. E.g. the UI is served at / and the microservices at /api/{service}/{path}. Then the UI application can use relative paths, because everything is served by the same reverse proxy and thus under the same origin (=> no CORS issues). Additionally, you can deploy to any IP; the routing stays the same and does not have to be adjusted.
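The routing scheme described above can be sketched as a small Python function. This is only an illustration of the path-matching idea, not how Traefik is configured: the upstream addresses, the service names, and the choice to strip the /api/{service} prefix before forwarding are all assumptions.

```python
def route(path, services, ui_upstream):
    """Map a request path to (upstream, forwarded_path).

    /api/{service}/{path} goes to the matching microservice with the
    prefix stripped (an assumed design choice); everything else goes to
    the UI container. Returns (None, path) for an unknown service.
    """
    prefix = "/api/"
    if path.startswith(prefix):
        service, _, rest = path[len(prefix):].partition("/")
        if service in services:
            return services[service], "/" + rest
        return None, path  # the proxy would answer 404 here
    return ui_upstream, path


if __name__ == "__main__":
    # Hypothetical upstreams, resolved by the orchestrator's internal DNS.
    services = {"orders": "http://orders:8080", "users": "http://users:8080"}
    print(route("/api/orders/42/items", services, "http://ui:80"))
    print(route("/index.html", services, "http://ui:80"))
```

Because the browser only ever sees one host, the front end can call fetch("/api/orders/42/items") with a relative path regardless of where the stack is deployed.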

Endpoint Paths for APIs inside Docker and Kubernetes

I am a newbie to Docker and Kubernetes, and I am now developing RESTful APIs which will later be deployed to Docker containers in a Kubernetes cluster.
How will the path of the endpoints change? I have heard that Docker Swarm and Kubernetes add some words to the endpoints.
The "path" part of the endpoint URLs themselves (for this SO question, the /questions/53008947/... part) won't change. But the rest of the URL might.
Docker publishes services at a TCP-port level (docker run -p option, Docker Compose ports: section) and doesn't look at what traffic is going over a port. If you have something like an Apache or nginx proxy as part of your stack that might change the HTTP-level path mappings, but you'd probably be aware of that in your environment.
Kubernetes works similarly, but there are more layers. A container runs in a Pod, and can publish some port out of the Pod. That's not used directly; instead, a Service refers to the Pod (by its labels) and republishes its ports, possibly on different port numbers. The Service has a DNS name service-name.namespace.svc.cluster.local that can be used within the cluster; you can also configure the Service to be reachable on a fixed TCP port on every node in the cluster (NodePort) or, if your Kubernetes is running on a public-cloud provider, to create a load balancer there (LoadBalancer). Again, all of this is strictly at the TCP level and doesn't affect HTTP paths.
There is one other Kubernetes piece, an Ingress controller, which acts as a declarative wrapper around the nginx proxy (or something else with similar functionality). That does operate at the HTTP level and could change paths.
The other corollary to this is that the URL to reach a service might be different in different environments: http://localhost:12345/path in a local development setup, http://other_service:8080/path in Docker Compose, http://other-service/path in Kubernetes, https://api.example.com/other/path in production. You need some way to make that configurable (often an environment variable).
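One way to make that configurable is sketched below in Python: the base URL of the other service comes from an environment variable and the path part stays fixed in code. The variable name OTHER_SERVICE_URL, the function name, and the localhost default are assumptions for illustration.

```python
import os


def other_service_url(path, environ=os.environ):
    """Join an environment-specific base URL with a fixed endpoint path."""
    # Fall back to the local development address when the variable is unset.
    base = environ.get("OTHER_SERVICE_URL", "http://localhost:12345")
    return base.rstrip("/") + "/" + path.lstrip("/")


if __name__ == "__main__":
    # Local development: no variable set.
    print(other_service_url("/path", {}))
    # Kubernetes: the variable points at the Service's DNS name.
    print(other_service_url("/path", {"OTHER_SERVICE_URL": "http://other-service"}))
```

The same container image can then run unchanged in each environment, with only the injected variable differing.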

Can you create Kerberos principals where the hostname is flexible? (Docker)

I'm specifically trying to do this with Apache Storm (1.0.2), but it's relevant to any service that is secured with Kerberos. I'm trying to run a secured Storm cluster in Docker. There are a number of out-of-the-box docker images out there for Storm, and they work great unsecured. I'm using https://github.com/Baqend/docker-storm. I also have Storm running securely on RHEL VM's.
However, my understanding is that Kerberos ties hostnames to principals, so if I'm making service foobar available to clients, I need to create a principal of foobar/hostname@REALM. Then a client service might connect to hostname with principal foobar, Kerberos will look up foobar/hostname@REALM in its database, find that it's there (because we created a principal with exactly that name), and everything will work.
In my case, it's described here: https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.3.0/bk_installing_manually_book/content/configure_kerberos_for_storm.html. The nimbus authenticates as storm/<nimbus host>@REALM, and the supervisors and outside clients authenticate as storm/REALM. Everything works.
But here in 2017, we have containers and hostnames are no longer static. So how would I Kerberize a service that runs in Docker Data Center (or Kubernetes, etc)? I have to attach an unknown hostname to the server authentication. I imagine I could create a principal for all possible hostnames and dynamically pick the right one at startup based on where the container lives, but that's kludgy.
Am I misunderstanding how Kerberos works? Is there a solution here that I don't see? I see multiple examples online of people running Storm in Docker, but I can't imagine that nobody's clusters are secure.
I don't know Apache Storm or Docker, but based on previous work with JBoss in a cluster, where an inbound client could be connecting to any one of a number of different hosts, you would simply assign a virtual name to the entire pool at the load balancer and kerberize the service according to that virtual name instead of the individual host name at the host level. So if you're making service foobar available to clients, you need to create a service principal name (SPN) of foobar/virtualhostname@REALM in your directory to kerberize the service with. You assign that SPN to a user account (not a computer account) to give it the flexibility to work with any Kerberized service which uses that SPN. If you are using Active Directory, you must create a keytab with the SPN inside of it, and place the keytab on each host running the kerberized service instance foobar/virtualhostname@REALM.

Azure Container Services Port Load Balancer

While trying to port my application, which runs on Docker Swarm locally, to Azure Container Service, I am stuck on the Azure load balancer part.
Locally I have a container instance of HAproxy running on Swarm Master and multiple web containers running.
Web containers have just exposed the ports and they are not mapped to machines on which they are running.
HAproxy container has mapped port to the master and internally is talking to my web containers for load balancing.
This gives me the leverage to run any number of containers with limited number of workers in Docker Swarm.
In Azure Container Service I see that the Azure load balancer will talk only to ports that are mapped. That means that I can either run only 1 container per agent, or keep an internal load balancer in my containers, which implies that users will be going through 2 load balancers before hitting my application.
Not an ideal scenario when my application uses sticky sessions.
So apparently Microsoft's statement "Everything works same in Azure containers" goes for a toss?
What solutions are available, or am I doing something wrong here?
Regards,
Harneet
The solution in ACS is almost identical. Use HAProxy and have the Azure LB talk to that. The only difference is that you will not be running the proxy on the master, you will have Swarm deploy it to an agent for you.
You shouldn't really be running workloads on your masters. What would you do if you had a DDoS attack and couldn't reach your masters, for example? Having Swarm deploy the proxy for you means that you can also have Swarm monitor the health of the proxy.
You could, if you really wanted to, run the proxy on the master as you do now. The solution would be the same, have the Azure LB provide a public connection to the proxy just as you currently do.