Debugging CrashLoopBackOff for an image running as root in OpenShift Origin

I wanted to start an Ubuntu container on OpenShift Origin. I have my own local registry, and pulling from it is successful. The container starts but immediately goes into CrashLoopBackOff and stops. The Ubuntu image that I have runs as root.
Started container with docker id 28250a528e69
Created container with docker id 28250a528e69
Successfully pulled image "ns1.myregistry.com:5000/ubuntu@sha256:6d9a2a1bacdcb2bd65e36b8f1f557e89abf0f5f987ba68104bcfc76103a08b86"
pulling image "ns1.myregistry.com:5000/ubuntu@sha256:6d9a2a1bacdcb2bd65e36b8f1f557e89abf0f5f987ba68104bcfc76103a08b86"
Error syncing pod, skipping: failed to "StartContainer" for "ubuntu" with CrashLoopBackOff: "Back-off 5m0s restarting failed container=ubuntu pod=ubuntu-2-suy6p_testproject(69af5cd9-5dff-11e6-940e-0800277bbed5)"
The container runs with restricted privileges. I don't know how to start the pod in privileged mode, so I edited the restricted SCC as follows so that my image with root access will run:
> NAME         PRIV   CAPS   SELINUX    RUNASUSER   FSGROUP    SUPGROUP   PRIORITY   READONLYROOTFS   VOLUMES
> restricted   true   []     RunAsAny   RunAsAny    RunAsAny   RunAsAny   <none>     false            [configMap downwardAPI emptyDir persistentVolumeClaim secret]
But I still couldn't successfully start my container.
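(As an aside, rather than editing the restricted SCC globally, the more common way to let a single project run images as root is to grant the anyuid SCC to that project's service account. A sketch, assuming the default service account and the testproject namespace visible in the events above:)
# grant the anyuid SCC to the project's default service account, then redeploy the pod
oc adm policy add-scc-to-user anyuid -z default -n testproject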

There are two commands that are helpful for CrashLoopBackOff debugging.
oc debug pod/your-pod-name will create a very similar pod and exec into it. You can look at the different options for launching it; some deal with SCC options. You can also point it at a dc, rc, is, or most other things that can stamp out pods.
oc logs -p pod/your-pod-name will retrieve the logs from the last run of the pod, which may have useful information too.
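For example, using the pod name from the events above:
# start a copy of the pod and open an interactive shell in it
oc debug pod/ubuntu-2-suy6p
# fetch the logs of the previous (crashed) run of the container
oc logs -p pod/ubuntu-2-suy6p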

Related

How to Use Docker Build Secrets with Kaniko

Context
Our current build system builds docker images inside of a docker container (Docker in Docker). Many of our docker builds need credentials to be able to pull from private artifact repositories.
We've handled this with Docker secrets: passing the secret to the docker build command and, in the Dockerfile, referencing the secret in the RUN command where it's needed. This means we're using Docker BuildKit. This article explains it.
We are moving to a different build system (GitLab), and the admins have disabled Docker in Docker (for security reasons), so we are moving to Kaniko for Docker builds.
Problem
Kaniko doesn't appear to support secrets the way Docker does (there are no command-line options to pass a secret through the Kaniko executor).
The credentials the docker build needs are stored in GitLab variables. For DinD, you simply add those variables to the docker build as a secret:
DOCKER_BUILDKIT=1 docker build . \
--secret=type=env,id=USERNAME \
--secret=type=env,id=PASSWORD
And then in the Dockerfile, use the secret:
RUN --mount=type=secret,id=USERNAME --mount=type=secret,id=PASSWORD \
USER=$(cat /run/secrets/USERNAME) \
PASS=$(cat /run/secrets/PASSWORD) \
./scriptThatUsesTheseEnvVarCredentialsToPullArtifacts
...rest of build..
Without a --secret flag on the Kaniko executor, I'm not sure how to take advantage of Docker secrets, nor do I understand the alternatives. I also want to continue to support developer builds. We have a build.sh script that takes care of gathering credentials and adding them to the docker build command.
Current Solution
I found this article and was able to sort out a working solution. I want to ask the experts if this is valid or what the alternatives might be.
I discovered that when the kaniko executor runs, it appears to mount a volume into the image that's being built at: /kaniko. That directory does not exist when the build is complete and does not appear to be cached in the docker layers.
I also found out that if the Dockerfile secret is not passed in via the docker build command, the build still executes.
So my gitlab-ci.yml file has this excerpt (the REPO_USER/REPO_PWD variables are GitLab CI variables):
- echo "${REPO_USER}" > /kaniko/repo-credentials.txt
- echo "${REPO_PWD}" >> /kaniko/repo-credentials.txt
- /kaniko/executor
--context "${CI_PROJECT_DIR}/docker/target"
--dockerfile "${CI_PROJECT_DIR}/docker/target/Dockerfile"
--destination "${IMAGE_NAME}:${BUILD_TAG}"
The key piece here is echoing the credentials to a file in the /kaniko directory before calling the executor. That directory is (temporarily) mounted into the image the executor is building. And since all of this happens inside the Kaniko image, that file disappears when the Kaniko (GitLab) job completes.
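For context, that excerpt sits inside a Kaniko build job. A sketch of the surrounding job definition, assuming the usual GitLab + Kaniko setup (the job name, stage, and the :debug executor image/entrypoint below are assumptions, not taken from the original config):
build-image:
  stage: build
  image:
    name: gcr.io/kaniko-project/executor:debug
    entrypoint: [""]
  script:
    # ... the echo and /kaniko/executor lines from the excerpt above ...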
The developer build script (snip):
# To keep it simple, this assumes that the developer has their credentials
# cached in a file (ignored by git) called dev-credentials.txt
DOCKER_BUILDKIT=1 docker build . \
--secret id=repo-creds,src=dev-credentials.txt
Basically same as before. Had to put it in a file instead of environment variables.
The dockerfile (snip):
RUN --mount=type=secret,id=repo-creds,target=/kaniko/repo-credentials.txt \
    USER=$(sed '1q;d' /kaniko/repo-credentials.txt) \
    PASS=$(sed '2q;d' /kaniko/repo-credentials.txt) \
    ./scriptThatUsesTheseEnvVarCredentialsToPullArtifacts
...rest of build..
This Works!
By mounting the secret at a path under /kaniko in the Dockerfile, it works with both the DinD developer build and the CI Kaniko executor.
For dev builds, the DinD secret works as always (I had to change it to a file rather than environment variables, which I didn't love).
When the build is run by Kaniko, I suppose that since the secret in the RUN command is not found, it doesn't even try to write the temporary credentials file (which I expected would fail the build). Instead, because I wrote the variables directly to the temporarily mounted /kaniko directory, the rest of the RUN command was happy.
Advice
To me this seems kludgier than expected. I want to find out about other/alternative solutions. Finding out that the /kaniko folder is mounted into the image at build time seems to open a lot of possibilities.

Minikube on Windows and HyperV: Stuck on prompt "minikube login"

I'm "extremely" new to Kubernetes, and I wanted to try it out on my local machine, which is running Windows 10 along with HyperV. I saw that minikube is used for local development, and I was able to find in on Chocolatey, so I installed it using that:
choco install minikube -y
(I think this also installs kubectl)
The problem I have is that I'm not able to start it; I'm running the following command:
minikube start --vm-driver=hyperv
I have an external switch configured in HyperV (I found it as a suggestion somewhere), but when I run the command, it's stuck in Creating VM ...
I thought maybe it would give me a clue if I looked at the VM created in Hyper-V, and when I open its console, I see a "minikube login:" prompt.
So, it seems that it's waiting for input, and that's why it's stuck! I tried searching for the problem, but to no avail.
I would appreciate any help
PS: It seems to me that if I wait long enough, the following message appears on the console:
Temporary Error: provisioning: error getting ssh client: Error dialing
tcp via ssh client: ssh: handshake failed: ssh: unable to authenticate,
attempted methods [none publickey], no supported methods remain
So, somehow by chance, I think I found how to resolve the issue.
First things first: the fact that the VM is displaying that prompt (minikube login) seems to be normal, and it does NOT prevent minikube start from succeeding.
To resolve the issue, this is what I did:
Delete ~/.kube directory
Delete ~/.minikube directory (in case it exists)
The MOST IMPORTANT step: stop/start the Hyper-V Virtual Machine Management Windows service (see the command sketch after these steps)
These steps seem to have solved the issue for me
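A sketch of the equivalent commands from an elevated command prompt (assuming the default Hyper-V Virtual Machine Management service name, vmms, and the default locations of the .kube and .minikube directories):
:: remove stale minikube/kubectl state
rmdir /s /q %USERPROFILE%\.kube
rmdir /s /q %USERPROFILE%\.minikube
:: restart the Hyper-V Virtual Machine Management service
net stop vmms
net start vmms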
PS: I used this command to start minikube and enable verbose logging:
minikube start --vm-driver hyperv -v 7 --alsologtostderr
Farzad, what resources have you used for setting up minikube? Can you please clarify what you mean by "unable to start"? Are the regular kubectl commands working?
For example, kubectl get nodes? That is, of course, in case the steps below don't help you.
The screenshot you shared shows a running VM:
Minikube runs a single-node Kubernetes cluster inside a VM on your
laptop for users looking to try out Kubernetes or develop with it
day-to-day.
You mentioned that you've created the vSwitch; you should be using a flag that points minikube at that external vSwitch:
minikube start --vm-driver hyperv --hyperv-virtual-switch "vSwitch name"
You also mentioned choco; did you install kubernetes-cli (you did not mention it in the question)? It might be the reason why your commands do not work (it seems the new version downloads kubectl along with choco install minikube):
kubectl is a command line interface for running commands against
Kubernetes clusters
At this moment I recommend stopping the minikube VM:
minikube stop
Delete the cluster
minikube delete
Sometimes a regular minikube stop, minikube delete does not work, so you might have to manually turn off the minikube VM in Hyper-V; then I recommend going to c:\users\%username%\ and deleting .kube and .minikube.
Use cuninst minikube
Restart and install again as specified in the minikube documentation:
choco install minikube
choco install kubernetes-cli
As for the error you mentioned, let's try to run the cluster properly, and if this persists, we will take care of it.
Try this:
kubectl config use-context minikube
I encountered the same issue. The reason was that I chose the wrong disk file to start my VM after creating it in VirtualBox.
This solved my issue.
minikube delete
minikube start --vm-driver hyperv -v 7 --alsologtostderr

Changing permissions of added file to a Docker volume

In the Docker best practices guide it states:
You are strongly encouraged to use VOLUME for any mutable and/or user-serviceable parts of your image.
And by looking at the source code of e.g. the cpuguy83/nagios image, this can clearly be seen, as everything from the Nagios to the Apache config directories is made available as volumes.
However, looking at the same image, the Apache service (and the CGI scripts for Nagios) runs as the nagios user by default. So now I'm in a pickle, as I can't seem to figure out how to add my own config files in order to e.g. define more hosts for Nagios monitoring. I've tried:
FROM cpuguy83/nagios
ADD my_custom_config.cfg /opt/nagios/etc/conf.d/
RUN chown nagios: /opt/nagios/etc/conf.d/my_custom_config.cfg
CMD ["/opt/local/bin/start_nagios"]
I build as normal, and try to run it with docker run -d -p 8000:80 <image_hash>, however I get the following error:
Error: Cannot open config file '/opt/nagios/etc/conf.d/my_custom_config.cfg' for reading: Permission denied
And sure enough, the permissions in the folder look like this (whilst the Apache process runs as nagios):
# ls -l /opt/nagios/etc/conf.d/
-rw-rw---- 1 root root 861 Jan 5 13:43 my_custom_config.cfg
Now, this has been answered before (why doesn't chown work in Dockerfile), but no proper solution other than "change the original Dockerfile" has been proposed.
To be honest, I think there's some core concept here I haven't grasped (as I can't see the point of declaring config directories as VOLUME, nor of running services as anything other than root). So, given a Dockerfile like the one above (based on an image that follows Docker best practices by declaring multiple volumes), is the solution/problem:
To change NAGIOS_USER/APACHE_RUN_USER to 'root' and run everything as root?
To remove the VOLUME declarations in the Dockerfile for nagios?
Other approaches?
How would you extend the nagios dockerfile above with your own config file?
Since you are adding your own my_custom_config.cfg file directly into the container at build time, just change the permissions of the my_custom_config.cfg file on your host machine and then build your image using docker build. The host machine permissions are copied into the container image.
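A minimal sketch of that flow (my-nagios is just an illustrative tag; the original question ran the image by hash):
# on the host, make the config world-readable so the nagios user inside the image can read it
chmod 644 my_custom_config.cfg
docker build -t my-nagios .
docker run -d -p 8000:80 my-nagios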

Vagrant stuck in "Waiting for VM to Boot"

I want to preface this question by mentioning that I have indeed looked over most if not all vagrant "Waiting for VM to Boot" troubleshooting threads:
Things I've tried include:
vagrant failed to connect VM
https://superuser.com/questions/342473/vagrant-ssh-fails-with-virtualbox
https://github.com/mitchellh/vagrant/issues/410
http://vagrant.wikia.com/wiki/Usage
http://scotch.io/tutorials/get-vagrant-up-and-running-in-no-time
And more.
Here's how I setup my Vagrant:
Note: We are using Vagrant 1.2.2 since we do not at the moment have time to change configs to newer versions. I am also using VirtualBox 4.2.26.
My office has an /official/ folder which includes things such as Vagrantfile inside. Inside my Vagrantfile are these custom settings:
config.vm.box = "my_box"
config.ssh.private_key_path = "~/.ssh/github_rsa"
config.ssh.forward_agent = true
config.ssh.forward_x11 = true
config.ssh.max_tries = 300
config.vm.provision :shell, :inline => "/etc/init.d/networking restart"
I installed our custom box (called package.box) via vagrant box add my_box absolute_path/package.box which went without a hitch.
Running vagrant up, I would look at the "preview" of the VirtualBox VM, and it would simply be stuck at the login page. My terminal would also only say: Waiting for VM to boot. This can take a few minutes. As far as I know, this is an SSH issue, or a private key issue, though in my Vagrantfile I explicitly pointed to my private key location.
Interesting Notes:
Running dhclient within the VirtualBox GUI, it says command not found. Running sudo dhclient eth0 was one of the suggested fixes.
This fix: https://superuser.com/a/343775/298915 of "modify the /etc/rc.local file to include the line sh /etc/init.d/networking restart just before exit 0." did nothing to fix the issue.
Conclusion:
Having tried re-installing everything, thinking I had messed up a file, that did not seem to ameliorate the issue. I am unable to get past this problem. Could someone give me some insight?
So after around twelve hours of dejected troubleshooting, I was able to (finally) get the VM to boot.
Set up your private/public keys using the link provided. My box is a Debian Linux 3.2.0-4-amd64, so instead of /root/.ssh/id_rsa.pub, you have to use /home/vagrant/.ssh/id_rsa.pub (and the respective id_rsa path for the private key).
Note: make sure your files have the right permissions. Check using ls -l path, and change using chmod. Your machine may not have /home/vagrant/.ssh/authorized_keys, so generate that file with touch /home/vagrant/.ssh/authorized_keys.
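For reference, a sketch of those key and permission fixes inside the VM (assuming the /home/vagrant/.ssh paths from the step above and the standard modes sshd expects):
mkdir -p /home/vagrant/.ssh
touch /home/vagrant/.ssh/authorized_keys
# append the public key you set up above
cat /home/vagrant/.ssh/id_rsa.pub >> /home/vagrant/.ssh/authorized_keys
chmod 700 /home/vagrant/.ssh
chmod 600 /home/vagrant/.ssh/authorized_keys
chown -R vagrant:vagrant /home/vagrant/.ssh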
Boot your VM with the GUI showing (either through the Vagrantfile GUI option, or by starting your VM directly in VirtualBox). Log in with vagrant / vagrant when prompted.
Within the GUI, manually start dhclient using sudo dhclient eth0 -v. Why is it off by default? I have no idea. I found out that it was off when I tried to wget the private/public keys in the tutorial above, but was unable to.
Go to your local machine's command line and reload vagrant using vagrant reload. It should boot, and no longer hang at "Waiting for VM to Boot."
This worked for me. Though it may be different for other machines, for whatever reason Vagrant likes to break.
Suggestion: can this be saved as a script so we don't need to do this manually every time?
EDIT: Update to the latest version of Vagrant, and you will never see this issue again. About time, huh?

Docker: How to live sync host folder with container folder?

I am working on a website powered by Node. So I have made a simple Dockerfile that adds my site's files to the container's FS, installs Node and runs the app when I run the container, exposing the private port 80.
But if I want to change a file for that app, I have to rebuild the container image and re-run it. That takes some seconds.
Is there an easy way to have some sort of "live sync", NFS like, to have my host system's app files be in sync with the ones from the running container?
This way I only have to relaunch it to have changes apply, or even better, if I use something like supervisor, it will be done automatically.
You can use volumes in order to do this. You have two options:
Docker managed volumes:
docker run -v /src/path nodejsapp
docker run -i -t --volumes-from <container id> <image> bash
The file you edit in the second container will update the first one.
Host directory volume:
docker run -v `pwd`/host/src/path:/container/src/path nodejsapp
The changes you make on the host will update the container.
If you are on OS X, these kinds of volume shares can become very slow, especially with Node-based apps (a lot of files). For this issue, http://docker-sync.io can help by providing volume-share-like synchronisation without using volume shares. This usually speeds up container read/write of the code directory by 50-80 times, depending on which docker-machine you use.
For performance, see https://github.com/EugenMayer/docker-sync/wiki/4.-Performance, and for easy examples of how to use it, see the boilerplates at https://github.com/EugenMayer/docker-sync-boilerplate. For your case, the unison example https://github.com/EugenMayer/docker-sync-boilerplate/tree/master/unison is the one you would need for NFS-like sync.
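A rough sketch of what a docker-sync.yml for the unison strategy looks like (the sync name and paths here are placeholders; see the linked boilerplates for a complete, working setup):
version: "2"
syncs:
  app-unison-sync:              # volume name referenced from docker-compose.yml
    src: './app'                # host directory to synchronise
    sync_strategy: 'unison'
    sync_excludes: ['node_modules']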
docker run -dit -v ~/my/local/path:/container/path/ myimageId
For /container/path/ you could use for instance /usr/src/app.
The flags:
-d = detached mode,
-it = interactive with a pseudo-TTY,
-v + paths = specifies the volume.
(If you just care about the volume, you can drop the -dit flag.)
Docker run reference
I use Skaffold's File Sync functionality for this. It gets the job done, and without needing overly complex configuration.
Setting up Skaffold in my project was as simple as installing it (through Chocolatey, since I'm on Windows), running skaffold init --generate-manifests in my project folder, and answering a couple of questions it asked.
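For reference, the file-sync part of the generated skaffold.yaml ends up looking roughly like this (the apiVersion, image name, and src/dest patterns below are placeholders, not taken from the original project):
apiVersion: skaffold/v2beta29
kind: Config
build:
  artifacts:
    - image: my-node-app
      sync:
        manual:
          - src: "src/**/*.js"   # copy changed JS files...
            dest: .              # ...into the matching path in the running container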