"MountVolume.SetUp failed for volume" with EKS and EFS - amazon-eks

I'm trying to set up EFS with EKS, but when I deploy my pod, I get errors like MountVolume.SetUp failed for volume "efs-pv3" : rpc error: code = DeadlineExceeded desc = context deadline exceeded
in my events.
What is the cause of this?

This turned out to be an issue with the security groups assigned to the EFS mount points. I had created the mount points, but the security groups did not allow traffic from the VPC holding my EKS nodes.
Once I added a properly configured security group to the EFS mount points, that error disappeared.

Here is an automated way to use Phyxx's answer.
At first, I was using the private subnets of my cluster, this led to the error mentioned by this thread. After seeing the answered mentioned, I noticed all my nodes were in the public subnet. Therefore I only had to switch "Private" to "Public" to make it work.
Here is script that worked for me :
CLUSTER_REGION=<YOUR_CLUSTER_REGION>
CLUSTER_NAME=<YOUR_CLUSTER_NAME>
EFS_SECURITY_GROUP_NAME=<YOUR_EFS_SECURITY_GROUP_NAME>
EFS_FILE_SYSTEM_NAME<YOUR_EFS_FILE_SYSTEM_NAME>
create_efs_mount_targets() {
file_system_id=$(aws efs describe-file-systems \
--region $CLUSTER_REGION \
--query "FileSystems[?Name=='$EFS_FILE_SYSTEM_NAME'].FileSystemId" \
--output text) \
&& security_group_id=$(aws ec2 describe-security-groups \
--region $CLUSTER_REGION \
--query 'SecurityGroups[*]' \
--output json \
| jq -r 'map(select(.GroupName=="'$EFS_SECURITY_GROUP_NAME'")) | .[].GroupId') \
&& public_cluster_subnets=$(aws ec2 describe-subnets \
--region $CLUSTER_REGION \
--output json \
--filters Name=tag:alpha.eksctl.io/cluster-name,Values=$CLUSTER_NAME Name=tag:aws:cloudformation:logical-id,Values=SubnetPublic* \
| jq -r '.Subnets[].SubnetId')
if [[ $? != 0 ]]; then
exit 1
fi
for subnet in ${public_cluster_subnets[#]}
do
echo "Attempting to create mount target in "$subnet"..."
aws efs create-mount-target \
--file-system-id $file_system_id \
--subnet-id $subnet \
--security-groups $security_group_id \
&> /dev/null \
&& echo "Mount target created!"
done
}
create_efs_mount_targets

Related

AWS Codebuild build stage shows nginx was installed but container doesn't have nginx

I'm new with AWS Codebuild and trying to deploy my docker application in EKS. On my Dockerfile I'm running RUN apk add nginx and build stage shows that it's being installed.
Step 6/24 : RUN apk add nginx
---> Running in cbd7a4e37bb7
(1/2) Installing pcre (8.44-r0)
(2/2) Installing nginx (1.18.0-r15)
Executing nginx-1.18.0-r15.pre-install
Executing nginx-1.18.0-r15.post-install
Executing busybox-1.32.1-r6.trigger
OK: 15 MiB in 34 packages
Removing intermediate container cbd7a4e37bb7
---> a890efd77e97
After the BUILD stage has been completed, I enter my EKS pod but didn't see any nginx under /etc
/etc # ls
ImageMagick-7 fstab issue my.cnf.d periodic securetty sysctl.d
alpine-release group logrotate.d mysql php7 services terminfo
apk group- modprobe.d network pkcs11 shadow udhcpd.conf
ca-certificates hostname modules openldap profile shadow- wgetrc
ca-certificates.conf hosts modules-load.d opt profile.d shells
conf.d init.d motd os-release protocols ssh
crontabs inittab mtab passwd resolv.conf ssl
fonts inputrc my.cnf passwd- rsyncd.conf sysctl.conf
I manually executed apk add nginx inside my pod and that's the only time nginx appeared under /etc
My AWS Codebuild is using the following configurations:
Managed image
OS: Amazon Linux 2
Runtime: Standard
Image: aws/codebuild/amazonlinux2-x86_64-standard:3.0
Image version: Always use the latest image for this runtime version
Privilaged: Enabled
Also,
I noticed that AWS Build doesn't execute the following COPY command in my Dockerfile:
COPY docker/php/php.ini /usr/local/etc/php/ (this works)
COPY docker/nginx/conf.d/default.conf /etc/nginx/conf.d/default.conf (this doesn't, probably because nginx didn't get installed on runtime)
COPY docker/nginx/docker-entrypoint.sh /usr/local/bin/docker-entrypoint-d9 (this doen't work.. still a mystery)
I hope I was able to provide the full details. Please let me know if you need more information. Thank you.
buildspec.yml
version: 0.2
env:
variables:
PHP_VERSION: 7.4
APP_HOME: /var/www/html
phases:
pre_build:
commands:
- echo Logging in to Amazon ECR...test4
- aws --version
- docker --version
- echo $AWS_DEFAULT_REGION
- $(aws ecr get-login --region $AWS_DEFAULT_REGION --no-include-email)
- export KUBECONFIG=$HOME/.kube/config
- export PHP_VERSION=$PHP_VERSION
- REPOSITORY_URI=xxxxxxx.xxx.xxx.xx-xxxxxxx-x.amazonaws.com/development-ecr
- COMMIT_HASH=$(echo $CODEBUILD_RESOLVED_SOURCE_VERSION | cut -c 1-7)
- IMAGE_TAG=$CODEBUILD_BUILD_NUMBER
# - IMAGE_TAG=build-$(echo $CODEBUILD_BUILD_ID | awk -F":" '{print $2}')
build:
commands:
- echo "PHP Version is $PHP_VERSION"
- echo "$APP_HOME"
- echo Running Docker compose..
- echo Build started on `date`
- echo Building the Docker image...
- docker build --build-arg PHP_VERSION=${PHP_VERSION} -t $REPOSITORY_URI:latest test/
- docker tag $REPOSITORY_URI:latest $REPOSITORY_URI:$IMAGE_TAG
post_build:
commands:
- echo Build completed on `date`
- echo Pushing the Docker images...
- docker push $REPOSITORY_URI:latest
- docker push $REPOSITORY_URI:$IMAGE_TAG
- echo "Setting Environment Variables related to AWS CLI for Kube Config Setup"
- CREDENTIALS=$(aws sts assume-role --role-arn $EKS_KUBECTL_ROLE_ARN --role-session-name codebuild-kubectl --duration-seconds 900)
- export AWS_ACCESS_KEY_ID="$(echo ${CREDENTIALS} | jq -r '.Credentials.AccessKeyId')"
- export AWS_SECRET_ACCESS_KEY="$(echo ${CREDENTIALS} | jq -r '.Credentials.SecretAccessKey')"
- export AWS_SESSION_TOKEN="$(echo ${CREDENTIALS} | jq -r '.Credentials.SessionToken')"
- export AWS_EXPIRATION=$(echo ${CREDENTIALS} | jq -r '.Credentials.Expiration')
# Setup kubectl with our EKS Cluster
- echo "Update Kube Config"
- aws eks update-kubeconfig --name $EKS_CLUSTER_NAME
# Apply changes to our Application using kubectl
- echo "Delete current deployment"
- kubectl delete deployment dev-spectrum-app
- sleep 10
- echo "Apply changes to kube manifests"
- kubectl apply -f kube-yaml/
- echo "Completed applying changes to Kubernetes Objects"
pipeline for other stages
- echo Writing image definitions file...
- printf '[{"name":"dev-spectrum","imageUri":"%s"}]' $REPOSITORY_URI:$IMAGE_TAG > imagedefinitions.json
- cat imagedefinitions.json
artifacts:
files: imagedefinitions.json
Dockerfile
ARG PHP_VERSION=7.4
FROM php:${PHP_VERSION:+${PHP_VERSION}-}fpm-alpine
ARG APP_HOME
RUN echo "CHECKING ARGS: $APP_HOME ${PHP_VERSION}"
RUN apk update; \
apk upgrade;
RUN apk add nginx
# Install library extension
RUN apk add libpng libpng-dev libjpeg-turbo libjpeg-turbo-dev freetype-dev libwebp-dev zlib-dev libxpm-dev
# Install MySql
RUN docker-php-ext-install mysqli pdo pdo_mysql
RUN docker-php-ext-install opcache
RUN docker-php-ext-configure gd --with-jpeg --with-freetype \
&& docker-php-ext-install gd
# PHP packages
RUN apk add libressl \
ca-certificates \
openssh-client \
rsync \
git \
curl \
wget \
gzip \
tar \
patch \
perl \
pcre \
imap \
imagemagick \
mariadb-client \
build-base \
autoconf \
libtool \
php7-dev \
pcre-dev \
imagemagick-dev \
php7 \
php7-fpm \
php7-opcache \
php7-session \
php7-dom \
php7-xml \
php7-xmlreader \
php7-ctype \
php7-ftp \
php7-json \
php7-posix \
php7-curl \
php7-pdo \
php7-pdo_mysql \
php7-sockets \
php7-zlib \
php7-mcrypt \
php7-mysqli \
php7-sqlite3 \
php7-bz2 \
php7-phar \
php7-openssl \
php7-posix \
php7-zip \
php7-calendar \
php7-iconv \
php7-imap \
php7-soap \
php7-dev \
php7-pear \
php7-redis \
php7-mbstring \
php7-xdebug \
php7-exif \
php7-xsl \
php7-ldap \
php7-bcmath \
php7-memcached \
php7-oauth \
php7-apcu
# Install Composer
RUN curl -sS https://getcomposer.org/installer | php && \
mv composer.phar /usr/local/bin/composer && \
ln -s /root/.composer/vendor/bin/drush /usr/local/bin/drush
# Install Drush
RUN composer global require drush/drush && \
composer global update
# Copy php.ini memory limit increase
COPY docker/php/php.ini /usr/local/etc/php/
COPY docker/nginx/conf.d/default.conf /etc/nginx/conf.d/default.conf
COPY docker/nginx/docker-entrypoint.sh /usr/local/bin/docker-entrypoint.sh
#COPY docker/nginx/docker-entrypoint.sh /usr/local/bin/docker-entrypoint-d9
RUN echo ${APP_HOME} ${PHP_VERSION}
WORKDIR /var/www/html
RUN mkdir /run/nginx;
#Copy composer files
COPY composer.json composer.lock ./
RUN composer install --prefer-dist --no-interaction
RUN composer dump-autoload
RUN set -eux; \
chmod +x /usr/local/bin/docker-entrypoint.sh
# chmod +x /usr/local/bin/docker-entrypoint-d9
EXPOSE 80 433
CMD ["nginx"]
#RUN nginx
#RUN php-fpm
ENTRYPOINT ["docker-entrypoint.sh"]

Which API can I use to add my own logs to QEMU for debugging purpose

I tried to add my own logs to qemu by using fprintf(stdout, "my own log") and qemu_log("my own log"), and then compiled the qemu from source code and started a VM by the following command:
/usr/bin/qemu-system-x86_64 \
-D /home/VM1-qemu-log.txt \
-d cpu_reset \
-enable-kvm \
-m 4096 \
-nic user,model=virtio \
-drive file=/var/lib/libvirt/images/VM1.qcow2,media=disk,if=virtio \
-nographic
There are CPU-related logs in VM1-qemu-log.txt, however, I cannot find where "my own log" is. Can anyone advise? Thanks!
qemu_log("my own log") works, I added it to the wrong place(i.e., beginning of the 'main()' in 'qemu/vl.c', where the logging has not been setup yet). By adding it to another place(e.g. in virtio_blk_get_request() under qemu/hw/block/virtio-blk.c), I will be able to see "my own log" in /home/VM1-qemu-log.txt. The VM is created by:
/usr/bin/qemu-system-x86_64 \
-D /home/VM1-qemu-log.txt \
-enable-kvm \
-m 4096 \
-nic user,model=virtio \
-drive file=/var/lib/libvirt/images/VM1.qcow2,media=disk,if=virtio \
-nographic

Tensorflow Serving loading multiple models from S3

We have successfully integrated our container to load models from an s3 bucket. However, we are unable to figure out how to load multiple models. Ideally, we would like to start our container pointing to a bucket that has no models. Over time we would like copy models to s3 which would be served by tensorflow-serving.
docker run \
-p 8501:8501 \
-e S3_USE_HTTPS=0 \
-e S3_VERIFY_SSL=0 \
-e AWS_ACCESS_KEY_ID=<key> \
-e AWS_SECRET_ACCESS_KEY=<access_key> \
-e MODEL_BASE_PATH=s3://test-bucket/tf_model_directory/ \
-e MODEL_NAME=SAR \
-e AWS_LOG_LEVEL=3 \
-t tensorflow/serving
The above command expects a model name to be specified. I was wondering if there is a way to start it without the model name or force tensorflow-serving to pick the config from the s3 path.

Error while running docker container

I am running a docker image using the following command.
docker run -it -p 8080:8080 -p 29418:29418 --rm \
-e AUTH_TYPE='DEVELOPMENT_BECOME_ANY_ACCOUNT' \
-v /home/gerrit-site:/home/gerrit/site \
-v /home/nidhi/.ssh/id_rsa.pub:/root/.ssh/id_admin_rsa.pub \
-v /home/nidhi/.ssh/id_rsa:/root/.ssh/id_admin_rsa \
-e GERRIT_ADMIN_USER='admin' \
-e GERRIT_ADMIN_EMAIL='admin#fabric8.io' \
-e GERRIT_ADMIN_FULLNAME='Administrator' \
-e GERRIT_ADMIN_PWD='mysecret' \
-e GERRIT_ADMIN_PRIVATE_KEY='/home/gerrit/ssh-keys/id_admin_rsa' \
-e GERRIT_PUBLIC_KEYS_PATH='/home/gerrit/ssh-keys' \
-v /home/nidhi/.ssh:/home/gerrit/ssh-keys \
--name gerrit admin_gerrit
I know the command is right cause I had used this command before and it worked perfectly fine. But now, when I run this command I get the following error,
Error response from daemon: Cannot start container 2c9514c3b0d953344e66525d083c7ec3921cb9cde2185f43ec3bec2579597485: stat /home/nidhi/.ssh/id_rsa: permission denied
I checked the permission for the ssh public and private keys. The permission is 700 and is owned by nidhi. Please can someone point out what my error is.
When docker runs, the uid in your container will likely not match the uid on the host. So with a host volume containing files with 700 permissions, that will not be readable by the uid inside the container. Three options come to mind:
To keep the 700 permissions and same image, you'd need to chown the file on the host to match the uid inside the container.
You can use a named volume instead of a host volume, add your credentials to that named volume, and then set permissions inside there to match the containers where you'll use the volume.
Or you can use a different image that's been rebuilt to change the uid to match your own on the host.

multiple KVM guests script using virt-install

I would like install 3 KVM guests automatically using kickstart.
I have no problem installing it manually using virt-install command.
virt-install \
-n dal \
-r 2048 \
--vcpus=1 \
--os-variant=rhel6 \
--accelerate \
--network bridge:br1,model=virtio \
--disk path=/home/dal_internal,size=128 --force \
--location="/home/kvm.iso" \
--nographics \
--extra-args="ks=file:/dal_kick.cfg console=tty0 console=ttyS0,115200n8 serial" \
--initrd-inject=/opt/dal_kick.cfg \
--virt-type kvm
I have 3 scripts like the one above - i would like to install all 3 at the same time, how can i disable the console? or running it in the background?
Based on virt-install man page:
http://www.tin.org/bin/man.cgi?section=1&topic=virt-install
--noautoconsole
Don't automatically try to connect to the guest console. The
default behaviour is to launch virt-viewer(1) to display the
graphical console, or to run the "virsh" "console" command to
display the text console. Use of this parameter will disable this
behaviour.
virt-install will connect console automatically. If you don't want,
just simply add --noautoconsole in your cmd like
virt-install \
-n dal \
-r 2048 \
--vcpus=1 \
--quiet \
--noautoconsole \
...... other options
We faced the same problem and at the end the only way we found was to create new threads with the &.
We also include the quiet option, not mandatory.
---quiet option (Only print fatal error messages).
virt-install \
-n dal \
-r 2048 \
--vcpus=1 \
--quiet \
--os-variant=rhel6 \
--accelerate \
--network bridge:br1,model=virtio \
--disk path=/home/dal_internal,size=128 --force \
--location="/home/kvm.iso" \
--nographics \
--extra-args="ks=file:/dal_kick.cfg console=tty0 console=ttyS0,115200n8 serial" \
--initrd-inject=/opt/dal_kick.cfg \
--virt-type kvm &
I know this is kind of old, but I wanted to share my thoughts.
I ran into the same problem, but due to the environment we work in, we need to use sudo with a password (compliance reasons). The solution I came up with was to use timeout instead of &. When we fork it right away, it would hang due to the sudo prompt never appearing. So using timeout with your example above: (we obviously did timeout 10 sudo virt-instal...)
timeout 15 virt-install \
-n dal \
-r 2048 \
--vcpus=1 \
--quiet \
--os-variant=rhel6 \
--accelerate \
--network bridge:br1,model=virtio \
--disk path=/home/dal_internal,size=128 --force \
--location="/home/kvm.iso" \
--nographics \
--extra-args="ks=file:/dal_kick.cfg console=tty0 console=ttyS0,115200n8 serial" \
--initrd-inject=/opt/dal_kick.cfg \
--virt-type kvm
This allowed us to interact with our sudo prompt and send the password over, and then start the build. The timeout doesnt kill the process, it will continue on and so can your script.