I am following the steps in the getting started guide for Kubeflow and got stuck at verifying that the setup works.
I managed to get this:
$ kubectl get ns
NAME             STATUS   AGE
default          Active   2m
kube-public      Active   2m
kube-system      Active   2m
kubeflow-admin   Active   14s
but when I do
$ kubectl -n kubeflow get svc
No resources found.
I also got
$ kubectl -n kubeflow get pods
No resources found.
I repeated both commands on my Mac and on my Ubuntu VM, and got the same result in both places. Am I missing something here?
Thanks.
Yes, you're missing something here, and that is the correct namespace. Use:
$ kubectl -n kubeflow-admin get all
Installing the Kubeflow services involves downloading container images, so it takes time for the services to come up.
You can run the command periodically until they are up.
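For example, you can watch the pods until they reach Running status (a small sketch, assuming the kubeflow-admin namespace shown above):
# Stream pod status changes; press Ctrl+C to stop watching
kubectl -n kubeflow-admin get pods -w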
I wanted to share a solution I built with Kubernetes and get your opinion on best practice in such a case. I'm still new to Kubernetes.
My problem: I wanted to be able to update my application by restarting my deployment's pod, which already executes all the necessary update actions in its start command.
I'm using microk8s and I wanted to just go to the right folder, execute microk8s kubectl apply -f myfilename, and let Kubernetes handle the rest with a rolling update.
My issue was how to set a dynamic value inside my .yaml file so the apply command would detect a change and start the rollout.
I planned to do it with a bash script like the following:
file="my-file-deployment.yaml"
oldstr=`grep 'my' $file | xargs`
timestamp="$(date +"%Y-%m-%d-%H:%M:%S")"
newstr="value: my-version-$timestamp"
sed -i "s/$oldstr/$newstr/g" $file
echo "old version : $oldstr"
echo "Replaced String : $newstr"
sudo microk8s kubectl apply -f $file
In my deployment.yaml file I'm setting the following env:
env:
  - name: version
    value: my-version-2022-09-27-00:57:15
I switch the timestamp to a new value, then I launch the command:
microk8s kubectl apply -f myfilename
It is working great for the moment. I still have to configure a startupProbe to get a smoother rolling update, because I'm seeing a few seconds of downtime, which isn't cool.
Is there a better solution for doing rolling updates with microk8s?
If you are trying to trigger a rolling update on your deployment (assuming it is a deployment), you can patch the deployment and let the cluster handle the rollout. Here's a trick I use and it's literally a one-liner:
kubectl -n {namespace} patch deployment {name-of-your-deployment} \
-p "{\"spec\":{\"template\":{\"metadata\":{\"annotations\":{\"date\":\"`date +'%s'`\"}}}}}"
This will patch your deployment, adding an annotation to the template block. In this way, the cluster thinks there is a change requiring an update to the deployment's pods, and will cycle them while following the rollingUpdate clause.
The date +'%s' resolves to a different number each time, so every time you run this it causes the cluster to cycle the deployment's pods.
We use this trick to force a rolling update when we have done an update that requires our pods to be restarted.
You can accompany this with the rollout status command to wait for the update to complete:
kubectl rollout status deployment/{name-of-your-deployment} -n {namespace}
So a complete line would be something like this if I wanted to rolling update my nginx deployment and wait for it to complete:
kubectl -n nginx patch deployment nginx \
-p "{\"spec\":{\"template\":{\"metadata\":{\"annotations\":{\"date\":\"`date +'%s'`\"}}}}}" \
&& kubectl rollout status deployment/nginx -n nginx
One caveat, though: kubectl patch does not change the yamls on disk, so if you want a copy of the change recorded locally, for example for auditing purposes, similar to what you are doing at the moment, you can adapt this to run as a dry-run and redirect the output to a file:
kubectl -n nginx patch deployment nginx \
-p "{\"spec\":{\"template\":{\"metadata\":{\"annotations\":{\"date\":\"`date +'%s'`\"}}}}}" \
--dry-run=client \
-o yaml >patched-nginx.yaml
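As a side note, newer kubectl releases (v1.15+) ship a built-in command that triggers the same kind of annotation-based restart under the hood, so you may prefer it over the manual patch:
# Built-in alternative to the manual annotation patch (kubectl v1.15+)
kubectl -n {namespace} rollout restart deployment {name-of-your-deployment}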
I was setting up my new Mac for my EKS environment.
After installing kubectl and aws-iam-authenticator and placing the kubeconfig file in the default location, I ran a kubectl command and got the error shown in the command block below.
My cluster uses the v1alpha1 client auth API version, so basically I wanted to use the same one on my Mac as well.
I tried with the latest version (1.23.0) of kubectl as well; still the same error. As for aws-iam-authenticator, I have version 0.5.5 and was not able to download a lower version.
Can someone help me to resolve it?
% kubectl version
Client Version: version.Info{Major:"1", Minor:"20", GitVersion:"v1.20.0", GitCommit:"af46c47ce925f4c4ad5cc8d1fca46c7b77d13b38", GitTreeState:"clean", BuildDate:"2020-12-08T17:59:43Z", GoVersion:"go1.15.5", Compiler:"gc", Platform:"darwin/amd64"}
Unable to connect to the server: getting credentials: exec plugin is configured to use API version client.authentication.k8s.io/v1alpha1, plugin returned version client.authentication.k8s.io/v1beta1
Thanks and Regards,
Saravana
I have the same problem
You're using aws-iam-authenticator 0.5.5; AWS changed the way it behaves in 0.5.4 to require v1beta1.
It depends on your configuration, but you can try to change the K8s context you're using to v1beta1
by editing your kubeconfig file (usually in ~/.kube/config), changing client.authentication.k8s.io/v1alpha1 to client.authentication.k8s.io/v1beta1.
Otherwise, switch back to aws-iam-authenticator 0.5.3. You might need to build it from source if you're on the M1 architecture, as there's no darwin-arm64 binary built for it.
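For reference, the relevant kubeconfig section looks roughly like this after the change (a sketch; my-eks-user and my-cluster are placeholder names, yours will differ):
users:
- name: my-eks-user
  user:
    exec:
      apiVersion: client.authentication.k8s.io/v1beta1   # was v1alpha1
      command: aws-iam-authenticator
      args:
        - token
        - -i
        - my-cluster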
This worked for me on an M1 chip:
# BSD/macOS sed: -i .bak edits in place and keeps a .bak backup
sed -i .bak -e 's/v1alpha1/v1beta1/' ~/.kube/config
I fixed the issue with the command below:
aws eks update-kubeconfig --name mycluster
I also solved this by updating the apiVersion value in my kube config file (~/.kube/config).
client.authentication.k8s.io/v1alpha1 to client.authentication.k8s.io/v1beta1
Also make sure the AWS CLI version is up-to-date. Otherwise, AWS IAM Authenticator might not work with v1beta1:
curl "https://awscli.amazonaws.com/awscli-exe-linux-x86_64.zip" -o "awscliv2.zip"
unzip awscliv2.zip
sudo ./aws/install --update
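You can then confirm that the v2 CLI is the one on your PATH:
# Should report aws-cli/2.x
aws --version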
This might help those who hit this issue while using GitHub Actions.
In my case I was using kodermax/kubectl-aws-eks with GitHub Actions.
I added the KUBECTL_VERSION and IAM_VERSION environment variables to each step that uses kodermax/kubectl-aws-eks, to pin them to fixed versions.
- name: deploy to cluster
  uses: kodermax/kubectl-aws-eks@master
  env:
    KUBE_CONFIG_DATA: ${{ secrets.KUBE_CONFIG_DATA_STAGING }}
    ECR_REGISTRY: ${{ steps.login-ecr.outputs.registry }}
    ECR_REPOSITORY: my-app
    IMAGE_TAG: ${{ github.sha }}
    KUBECTL_VERSION: "v1.23.6"
    IAM_VERSION: "0.5.3"
Using kubectl 1.21.9 fixed it for me, with asdf:
asdf plugin-add kubectl https://github.com/asdf-community/asdf-kubectl.git
asdf install kubectl 1.21.9
And I would recommend having a .tool-versions file with:
kubectl 1.21.9
This question is a duplicate of error: exec plugin: invalid apiVersion "client.authentication.k8s.io/v1alpha1" CircleCI
Please change the authentication apiVersion from v1alpha1 to v1beta1.
Old
apiVersion: client.authentication.k8s.io/v1alpha1
New
apiVersion: client.authentication.k8s.io/v1beta1
Sometimes this can happen if the kube cache is corrupted (which happened in my case).
Deleting and recreating the folder below worked for me.
sudo rm -rf $HOME/.kube && mkdir -p $HOME/.kube
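A less destructive variant, assuming only the cache is corrupted, is to remove just the cache directory and leave your config untouched:
# kubectl rebuilds its discovery cache on demand
rm -rf $HOME/.kube/cache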
I am trying to start a busybox container as a non-root user on a CentOS 8 server, but it gives the message below.
What is the correct way to start the container as a non-root user?
podman run -it --name busy docker.io/library/busybox sh
Trying to pull docker.io/library/busybox...
Getting image source signatures
Copying blob bdbbaa22dec6 done
Copying config 6d5fcfe5ff done
Writing manifest to image destination
Storing signatures
ERRO[0003] Error pulling image ref //busybox:latest: Error committing the finished image: error adding layer with blob "sha256:bdbbaa22dec6b7fe23106d2c1b1f43d9598cd8fc33706cc27c1d938ecd5bffc7": Error processing tar file(exit status 1): there might not be enough IDs available in the namespace (requested 65534:65534 for /home): lchown /home: invalid argument
Failed
Error: unable to pull docker.io/library/busybox: unable to pull image: Error committing the finished image: error adding layer with blob "sha256:bdbbaa22dec6b7fe23106d2c1b1f43d9598cd8fc33706cc27c1d938ecd5bffc7": Error processing tar file(exit status 1): there might not be enough IDs available in the namespace (requested 65534:65534 for /home): lchown /home: invalid argument
Yes, the command you run is correct. On my Fedora 31 system it works just fine.
[testuser@fedora31 ~]$ podman run -it --name busy docker.io/library/busybox sh
Trying to pull docker.io/library/busybox...
Getting image source signatures
Copying blob bdbbaa22dec6 done
Copying config 6d5fcfe5ff done
Writing manifest to image destination
Storing signatures
/ # exit
[testuser@fedora31 ~]$ podman --version
podman version 1.8.0
[testuser@fedora31 ~]$
The flag --rm is also often useful.
It seems the error you get is related to the UID mapping.
Here is some information regarding running "rootless" podman:
https://github.com/containers/libpod/blob/master/docs/tutorials/rootless_tutorial.md
What also might be interesting:
"Does not work on NFS or parallel filesystem homedirs"
Quote from
https://github.com/containers/libpod/blob/master/rootless.md
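In practice, "not enough IDs available in the namespace" usually means your user has no subordinate UID/GID ranges configured. A hedged sketch of the usual fix (the username and range here are examples; the rootless tutorial above covers the details):
# Give the user a range of subordinate UIDs/GIDs in /etc/subuid and /etc/subgid
echo "testuser:100000:65536" | sudo tee -a /etc/subuid /etc/subgid
# Make podman pick up the new mappings
podman system migrate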
When creating Docker containers I keep running into the issue of the UID/GID not being reflected in the container (I realize this is by design). What I am looking for is a way to keep host permissions reasonable and/or to replicate the UID/GID from the host user/group accounts in my Docker container. For instance:
host -
woot4moo:x:504:504:woot4moo:/home/woot4moo:/bin/bash
I would like this same behavior in the Docker container. That being said, is this even the right way to do this type of thing? My belief is I could simply run:
useradd -u 504 -g 504 woot4moo
as part of my Dockerfile, but I am not sure if that is valid.
You wouldn't want to run that as part of the image build process (in your Dockerfile), because the host on which someone is running a container is often not the host on which you are building the image.
One way of solving this is passing in UID/GID information via environment variables:
docker run -e APP_UID=100 -e APP_GID=100 ...
And then have an ENTRYPOINT script that includes something like the following before running the CMD:
useradd -c 'container user' -u $APP_UID -g $APP_GID appuser
chown -R $APP_UID:$APP_GID /app/data
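Putting it together, a minimal entrypoint sketch might look like this (assumptions: the image has groupadd/useradd and gosu installed, and /app/data is your data path; none of these names come from the original answer):
#!/bin/sh
# entrypoint.sh: create a user matching the UID/GID passed in via the
# environment, fix ownership, then drop privileges before running the CMD
set -e
APP_UID="${APP_UID:-1000}"
APP_GID="${APP_GID:-1000}"
# Create the group first so useradd -g can reference it
groupadd -g "$APP_GID" appuser
useradd -c 'container user' -u "$APP_UID" -g "$APP_GID" appuser
chown -R "$APP_UID:$APP_GID" /app/data
# Replace PID 1 with the CMD, running as the unprivileged user
exec gosu appuser "$@"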
I had similar issues and typically included entrypoint scripts in every image, as already mentioned (using https://github.com/ncopa/su-exec for interactive terminal programs). However, I kept repeating the same steps in multiple Dockerfiles. After I used "docker.inside" from Jenkins Pipeline, which handles the user id mapping auto-magically, I decided to build a Python 3 package based on docker-py that does this in a (hopefully) similar way (with some extended features I found helpful):
https://github.com/boon-code/docker-inside
I realize that the post is rather old; maybe it's still helpful to someone with the same problem...
Here I am creating a test machine (dev) using docker-machine:
$ docker-machine create -d virtualbox dev
Creating CA: C:\Users\xxx\.docker\machine\certs\ca.pem
Creating client certificate: C:\Users\xxx\.docker\machine\certs\cert.pem
Creating VirtualBox VM...
Creating SSH key...
Starting VirtualBox VM...
Starting VM...
The VM gets created and runs without flaws.
And here is the error when I run the following command:
$ docker-machine env dev
open C:\Users\xxx\.docker\machine\machines\dev\ca.pem: The system cannot find the file specified.
I have no idea how to deal with this problem. I tried restarting boot2docker.
You should try docker-machine regenerate-certs dev. The problem, I think, is that somehow your .pem file got deleted or was never created. I had the same issue, and regenerating the certs fixed the problem (a reboot did not help, by the way).
I guess you are getting the Docker-machine: ca.pem not found error even when you use docker info or any other docker command.
Try this command: docker-machine env -u
output will be similar to:
unset DOCKER_TLS_VERIFY
unset DOCKER_HOST
unset DOCKER_CERT_PATH
unset DOCKER_MACHINE_NAME
# Run this command to configure your shell:
# eval $(docker-machine env -u)
Now enter eval $(docker-machine env -u).
This should do the trick. Finally, run docker info to be sure.
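Once docker info responds, you can point your shell back at the machine (using the dev machine from the question as an example):
# Configure the current shell to talk to the dev machine's Docker daemon
eval $(docker-machine env dev)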
I was getting the exact same error. It turned out to be the Cisco AnyConnect client affecting my networking settings. It's not enough to quit AnyConnect, you have to reboot your machine to restore your settings.
If someone knows more about how AnyConnect is affecting things and if there are solutions better than rebooting, I'd love to hear about it!
Copy the certificates from "C:\Users\xxx\.docker\machine\certs"
and paste them into "C:\Users\xxx\.docker\machine\machines\dev".
NOTE: This error was on Windows 10 Docker.
Here was my error:
#user ➜ git-repo git(users/user/dev) ✗ docker
unable to resolve docker endpoint: open C:\Users\user\.docker\ca.pem: The system cannot find the file specified.
Here is the link to the shell script I used to recreate the certificates; I named it generate_docker_cert.sh:
https://gist.github.com/bradrydzewski/a6090115b3fecfc25280
So I went to the directory from the error output:
cd C:\Users\user\.docker\
Created that file:
notepad generate_docker_cert.sh
Copied the values from the link into there and saved.
Then ran that .sh file:
.\generate_docker_cert.sh
Then the docker command worked:
#user ➜ git-repo git(users/user/dev) ✗ docker
Usage: docker [OPTIONS] COMMAND
A self-sufficient runtime for containers
...