Openshift Origin Node scaling

We are testing OpenShift Origin (latest) for a PoC. We have researched pod scaling and it has worked very well for us. Adding nodes can be done with Ansible/Puppet, but how does one automate that completely?
Using OpenShift, how does one achieve the following?
1. Create a new EC2 node when the current nodes are at capacity
(e.g. a pod which is guaranteed certain resources cannot be created; see the sketch after this list)
2. Add this node to the current cluster
3. Scale pods onto this node
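For step 1, the trigger condition surfaces as pods stuck in Pending because the scheduler cannot place them; a minimal way to detect it from a script (pod and project names below are placeholders) is:
# list pods the scheduler could not place; Pending here usually means insufficient resources
oc get pods --all-namespaces | grep Pending
# the pod's events show the exact reason, e.g. "Insufficient cpu" or "Insufficient memory"
oc describe pod my-pod -n myproject
An autoscaling setup would poll for this condition (or watch the corresponding events) and then run the Ansible playbooks that provision and register the new EC2 node.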


Tag based policies in Apache Ranger not working

I am new to Apache Ranger and the big data field in general. I am working on an on-prem big data pipeline. I have configured resource-based policies in Apache Ranger (ver 2.2.0) using the Ranger Hive plugin (Hive ver 2.3.8) and they seem to be working fine. But I am having problems with tag-based policies and would like someone to tell me where I am going wrong. I have configured a tag-based policy in Ranger by doing the following:
1. Create a tag in Apache Atlas (e.g. TAG_C1) on a Hive column (column C1); for this I first installed Apache Atlas and the Atlas Hook for Hive, then created the tag in Atlas. This seems to be working fine.
2. Install Atlas plugin in Apache Ranger.
3. Install RangerTagSync (but did not install Kafka).
4. The Atlas tag (TAG_C1) is visible in Apache Ranger when I create a tag-based masking policy in Ranger.
5. But the masking is not applied in Hive, which I access via Beeline.
Is Kafka required for tag-based policies in Apache Ranger? What am I doing wrong in these steps?
Kafka is important for TagSync and for Atlas too. Kafka is what notifies Ranger TagSync about the tag assignments/changes in Apache Atlas.
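For reference, the Atlas side of that wiring lives in atlas-application.properties; a minimal sketch, assuming an external Kafka broker (host names and ports are placeholders):
# do not use Atlas's embedded Kafka; point Atlas at the shared broker
atlas.notification.embedded=false
atlas.kafka.zookeeper.connect=zk-host:2181
atlas.kafka.bootstrap.servers=kafka-host:9092
Ranger TagSync then consumes Atlas's entity-change notifications (the ATLAS_ENTITIES topic) from that same broker; without Kafka the tag association never reaches Ranger's tag store, so the masking policy has nothing to match.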

Copy / update the code in docker container without stopping container

I have a docker-compose setup in which I am uploading source code for a server, say a Flask API. Now when I change my Python code, I have to follow steps like this:
stop the running containers (docker-compose stop)
build and load updated code in container (docker-compose up --build)
This takes quite a long time. Is there any better way? Like updating the code in the running container and then restarting the Apache server without stopping the whole container?
There are a few dirty ways you can modify the file system of a running container.
First you need to find the path of the directory which is used as the runtime root for the container. Run docker container inspect id/name and look for the key UpperDir in the JSON output. You can edit/copy/delete files in that directory.
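For example, assuming the overlay2 storage driver (where UpperDir sits under GraphDriver.Data) and a container named my-container, this prints the writable layer's path directly:
docker container inspect --format '{{ .GraphDriver.Data.UpperDir }}' my-container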
Another way is to get the process ID of the process running within the container and go to the /proc/process_id/root directory. This is the root directory for the process running inside Docker. You can edit files there on the fly and the changes will appear inside the container.
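A sketch of that second approach (the container name is a placeholder; you need root on the host):
# find the PID of the container's main process
pid=$(docker inspect --format '{{ .State.Pid }}' my-container)
# the container's filesystem as that process sees it
sudo ls /proc/$pid/root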
You can run the docker build while the old container is still running, and your downtime is limited to the changeover period.
It can be helpful for a couple of reasons to put a load balancer in front of your container. Depending on your application this could be a "dumb" load balancer like HAProxy, or a full Web server like nginx, or something your cloud provider makes available. This allows you to have multiple copies of the application running at once, possibly on different hosts (helps for scaling and reliability). In this case the sequence becomes:
1. docker build the new image
2. docker run it
3. Attach it to the load balancer (now traffic goes to both old and new containers)
4. Test that the new container works correctly
5. Detach the old container from the load balancer
6. docker stop && docker rm the old container
If you don't mind heavier-weight infrastructure, this sequence is basically exactly what happens when you change the image tag in a Kubernetes Deployment object, but adopting Kubernetes is something of a substantial commitment.
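As a concrete sketch of the Kubernetes variant (names and tags are placeholders): changing the image on a Deployment triggers exactly this kind of rolling replacement.
# point the hypothetical deployment "myapp" at the new image
kubectl set image deployment/myapp myapp=myrepo/myapp:2.0
# watch the old pods drain as the new ones come up
kubectl rollout status deployment/myapp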

How to correctly setup RabbitMQ on Openshift

I have created new app on OpenShift using this image: https://hub.docker.com/r/luiscoms/openshift-rabbitmq/
It runs successfully and I can use it. I have added a persistent volume to it.
However, every time a pod is restarted, I lose all my data. This is because RabbitMQ uses the hostname to create the database directory.
For example:
node : rabbit@openshift-rabbitmq-11-9b6p7
home dir : /var/lib/rabbitmq
config file(s) : /etc/rabbitmq/rabbitmq.config
cookie hash : BsUC9W6z5M26164xPxUTkA==
log : tty
sasl log : tty
database dir : /var/lib/rabbitmq/mnesia/rabbit@openshift-rabbitmq-11-9b6p7
How can I set RabbitMq to always use same database dir?
You should be able to set the environment variable RABBITMQ_MNESIA_DIR to override the default configuration. This can be done via the OpenShift console by adding an entry to the environment in the deployment config, or via the oc tool, for example:
oc set env dc/my-rabbit RABBITMQ_MNESIA_DIR=/myDir
You would then need to mount the persistent volume inside the pod at the required path. Since you have said it is already created, you just need to update it, for example:
oc volume dc/my-rabbit --add --overwrite --name=my-pv-name --mount-path=/myDir
You will need to make sure you have correct read/write access on the provided mount path.
EDIT: Some additional workarounds based on issues in comments
The issues caused by the dynamic hostname could be solved in a number of ways:
1. (Preferred IMO) Move the deployment to a StatefulSet. A StatefulSet provides stability in the naming, and hence the network identity, of the pod, and must be fronted by a headless service. This feature is out of beta as of Kubernetes 1.9 and has been in tech preview in OpenShift since version 3.5 (a minimal sketch follows after this list).
2. Set the hostname for the pod if StatefulSets are not an option. This can be done by setting an environment variable, e.g. oc set env dc/example HOSTNAME=example, to make the hostname static, and setting RABBITMQ_NODENAME to do likewise.
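A minimal sketch of option 1, assuming a single-replica StatefulSet named rabbitmq (names, image and storage size are placeholders; a real cluster needs more configuration): the pod is always called rabbitmq-0, so the mnesia directory name stays stable.
apiVersion: v1
kind: Service
metadata:
  name: rabbitmq
spec:
  clusterIP: None        # headless service, gives the pod a stable DNS identity
  selector:
    app: rabbitmq
  ports:
  - port: 5672
    name: amqp
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: rabbitmq
spec:
  serviceName: rabbitmq   # ties the pod's DNS name to the headless service
  replicas: 1
  selector:
    matchLabels:
      app: rabbitmq
  template:
    metadata:
      labels:
        app: rabbitmq
    spec:
      containers:
      - name: rabbitmq
        image: rabbitmq:3
        volumeMounts:
        - name: data
          mountPath: /var/lib/rabbitmq/mnesia
  volumeClaimTemplates:
  - metadata:
      name: data
    spec:
      accessModes: [ "ReadWriteOnce" ]
      resources:
        requests:
          storage: 1Gi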
I was able to get it to work by setting the HOSTNAME environment variable. OSE normally sets that value to the pod name, so it changes every time the pod restarts. By setting it explicitly, the pod's hostname doesn't change when the pod restarts.
Combined with a persistent volume, the queues, messages, users and (I assume) any other configuration are persisted through pod restarts.
This was done on an OSE 3.2 server. I just added an environment variable to the deployment config. You can do it through the UI or with the OC CLI:
oc set env dc/my-rabbit HOSTNAME=some-static-name
This will probably be an issue if you run multiple pods for the service, but in that case you would need to set up proper RabbitMQ clustering, which is a whole different beast.
The easiest and most production-safe way to run RabbitMQ on K8s, including OpenShift, is the RabbitMQ Cluster Operator.
See this video on how to deploy RabbitMQ on OpenShift.
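Once the operator is installed, a minimal cluster definition looks roughly like this (the name and replica count are placeholders; the RabbitmqCluster custom resource is provided by the operator):
apiVersion: rabbitmq.com/v1beta1
kind: RabbitmqCluster
metadata:
  name: my-rabbit
spec:
  replicas: 1
The operator creates the StatefulSet, headless service and persistent volume claims for you, so node names and data directories survive pod restarts without manual HOSTNAME workarounds.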

openshift origin - configure max pods

I recently started working a bit with OpenShift and it looks promising so far, but I keep running into issues, mostly finding outdated documentation or looking in completely the wrong place.
For example, I currently have an OpenShift installation of ~150 cores spread over a couple of servers; some of these nodes have only 4 cores and others have 48.
I would like to modify all my nodes to have pods = 1.5 * cores or so.
Is this possible?
I tried to use:
oc edit node node0
and change pods from the default 40 to, say, 6, but sadly oc never saves my values and always resets back to the default of 40.
kind regards
my openshift information:
oc v1.0.7-2-gd775557-dirty
kubernetes v1.2.0-alpha.1-1107-g4c8e6f4
installation done using ansible, single master, external dns.
Max pods per node is set on the node; you can add the stanza below to the node config YAML file to set it:
kubeletArguments:
  max-pods:
  - "100"
The string is important: this stanza passes arguments directly to the Kubelet invocation (so any argument you can pass to a Kubelet you can pass via this config).
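A sketch of applying this, assuming the usual Origin node config path and service name (both can differ depending on the installer and version):
# edit the node's kubelet arguments (path may vary by installation)
sudo vi /etc/origin/node/node-config.yaml
# restart the node service so the kubelet picks up the change
sudo systemctl restart origin-node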

Is there a way to tell kubernetes to update your containers?

I have a Kubernetes cluster, and I am wondering how (best practice) to update containers. I know the idea is to tear down the old containers and put up new ones, but is there a one-liner I can use? Do I have to remove the replication controller or pod(s) and then spin up new ones (pods or replication controllers)? With this I am using a self-hosted private registry that I know I have to build from the Dockerfile and push to anyway; that part I can automate with gulp (or any other build tool), but can I automate the Kubernetes update/tear-down and bring-up?
Kubectl can automate the process of rolling updates for you. Check out the docs here:
https://github.com/GoogleCloudPlatform/kubernetes/blob/master/docs/kubectl_rolling-update.md
A rolling update of an existing replication controller foo running Docker image bar:1.0 to image bar:2.0 can be as simple as running
kubectl rolling-update foo --image=bar:2.0.
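To tie this into the build pipeline described in the question, the end-to-end sequence could look like this (the registry address, names and tags are placeholders):
# build and push the new image to the self-hosted registry
docker build -t registry.example.com/myapp:2.0 .
docker push registry.example.com/myapp:2.0
# roll the existing replication controller over to the new image
kubectl rolling-update foo --image=registry.example.com/myapp:2.0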
Found where in the Kubernetes docs they mention updates: https://github.com/GoogleCloudPlatform/kubernetes/blob/master/docs/replication-controller.md#rolling-updates. Wish it was more automated, but it works.
EDIT 1
For automating this, I've found https://www.npmjs.com/package/node-kubernetes-client; since I already automate the rest of my build and deployment with a Node process, this will work really well.
The OpenShift Origin project (https://github.com/openshift/origin) runs an embedded Kubernetes cluster, but provides automated build and deployment workflows on top of your cluster if you do not want to roll your own solution.
I recommend looking at the example here:
https://github.com/openshift/origin/tree/master/examples/sample-app
It's possible some of the build and deployment hooks may move upstream into the Kubernetes project in the future, but this would serve as a good example of how deployment solutions can be built on top of Kubernetes.