Scaling AKS from 1 to 2 nodes: cannot create pods on Node0 and exec (or view logs) on Node1 - azure-container-service

After scaling AKS from 1 to 2 nodes, I ran into 2 issues.
Issue 1: Node0 cannot create new pods. They get stuck in ContainerCreating status:
Failed create pod sandbox. Error syncing pod.
Issue 2: Node1 cannot EXEC or view LOGS:
the server has asked for the client to provide credentials.
Kubernetes version: 1.8.7
Please advise, thanks!!
UPDATE Jun 14, 2018
Issue 1: RESOLVED by restarting the node from Azure Portal and kubectl drain.
Added some screenshots for Issue 2 (cannot EXEC and LOGS): Logs screenshot, Exec screenshot.


Digital Ocean droplet & gitlab runner problem

I have recently started working with GitLab CI/CD and want to set up a runner on a DigitalOcean droplet. However, I get the following error:
$ docker network create web
Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running?
Cleaning up project directory and file based variables
ERROR: Job failed: exit code 1
How can I avoid this problem, given that Docker is up and running on the droplet (Ubuntu, 8 GB RAM)?
It is most likely one of these reasons:
a) The gitlab-runner user is not in the docker group.
id gitlab-runner
should show something like
uid=998(gitlab-runner) gid=998(gitlab-runner) groups=998(gitlab-runner),1001(docker)
b) The Docker service is not running on the droplet.
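Both checks can also be scripted. The following is a minimal Python sketch, not part of the original answer: it parses the output of id gitlab-runner to see whether docker is among the groups, and it checks that the daemon socket named in the error message exists. The function names are made up for illustration.

```python
import os
import re
import stat
from typing import Set


def groups_from_id_output(id_output: str) -> Set[str]:
    """Return the group names found in the groups= field of `id <user>` output."""
    match = re.search(r"groups=(\S+)", id_output)
    if match is None:
        return set()
    # Each entry looks like 1001(docker); keep only the name in parentheses.
    return set(re.findall(r"\(([^)]+)\)", match.group(1)))


def docker_socket_present(path: str = "/var/run/docker.sock") -> bool:
    """Return True if `path` exists and is a Unix socket."""
    try:
        return stat.S_ISSOCK(os.stat(path).st_mode)
    except OSError:
        return False


# Sample line copied from the answer above.
sample = "uid=998(gitlab-runner) gid=998(gitlab-runner) groups=998(gitlab-runner),1001(docker)"
print("docker" in groups_from_id_output(sample))  # prints True
```

If docker is missing from the groups, sudo usermod -aG docker gitlab-runner followed by a restart of the gitlab-runner service is the usual fix. If the socket is missing, systemctl status docker is a reasonable next check.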

How to connect to an MSK cluster from an EKS cluster

I am having difficulties connecting to my MSK cluster from my EKS cluster even though both clusters share the same VPC and the same subnets.
The security group used by the MSK cluster has the following inbound rules
port range
all traffic
all traffic
anywhere ipv4
where SG_ID is the EKS cluster's security group (the one labeled: EKS created security group applied...).
In the EKS cluster, I am using the following commands to test connectivity:
kubectl run kafka-consumer \
-ti \ \
--rm=true \
--restart=Never \
-- bin/ --create --topic test --bootstrap-server --replication-factor 2 --partitions 1 --if-not-exists
With the following result
Error while executing topic command : Call(callName=createTopics, deadlineMs=1635906680860, tries=1, nextAllowedTryMs=1635906680961) timed out at 1635906680861 after 1 attempt(s)
[2021-11-03 02:31:20,865] ERROR org.apache.kafka.common.errors.TimeoutException: Call(callName=createTopics, deadlineMs=1635906680860, tries=1, nextAllowedTryMs=1635906680961) timed out at 1635906680861 after 1 attempt(s)
Caused by: org.apache.kafka.common.errors.TimeoutException: Timed out waiting for a node assignment. Call: createTopics
pod "kafka-consumer" deleted
pod default/kafka-consumer terminated (Error)
Sadly, the second bootstrap server displayed on the MSK Page gives the same result.
nc eventually times out
kubectl run busybox -ti --image=busybox --rm=true --restart=Never -- nc
nslookup fails as well
kubectl run busybox -ti --image=busybox --rm=true --restart=Never -- nslookup
If you don't see a command prompt, try pressing enter.
*** Can't find No answer
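The same two checks (name resolution, then a TCP connection) can be reproduced from any pod that has Python. This is only a sketch: the function names are illustrative, the host is a placeholder for one of the bootstrap brokers shown on the MSK page, and the port depends on the listener (9098 is the port used with IAM auth further down in this thread).

```python
import socket


def resolves(host: str) -> bool:
    """Return True if the resolver can turn `host` into at least one address."""
    try:
        socket.getaddrinfo(host, None)
        return True
    except socket.gaierror:
        return False


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and resolution failures.
        return False


# Usage (placeholder host, not a real broker name):
#   resolves("<bootstrap-broker-host>")
#   can_connect("<bootstrap-broker-host>", 9098)
```

If resolves() is False, the problem is DNS inside the cluster rather than the security group. If it is True but can_connect() is False, the security group rules or routing are the more likely suspects.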
Could anyone please give me a hint?
I needed to connect to MSK from my EKS pod, so I went through this doc. I want to share my solution in the hope that it helps others.
This is my config file:
root@kain:~/work# cat kafkaconfig
sasl.mechanism=AWS_MSK_IAM required;
This is my command:
./ --list --bootstrap-server <My MSK bootstrap server>:9098 --command-config ./kafkaconfig
This command has 2 preconditions:
1. You have access to AWS MSK. (I access MSK from my EKS pod, and the pod uses OIDC to access AWS.)
2. You have the AWS auth jar file, aws-msk-iam-auth.jar. Put it in the Kafka client libs directory, or export CLASSPATH=/aws-msk-iam-auth-1.1.4-all.jar
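For reference, the client configuration documented in the aws-msk-iam-auth README consists of four properties. The sketch below renders them and can write them to a file. It is an assumption that this matches the kafkaconfig used above, since only part of that file is shown; the helper names and the output path are illustrative.

```python
from pathlib import Path
from typing import Dict

# Client properties as documented for the aws-msk-iam-auth library.
MSK_IAM_CLIENT_PROPERTIES: Dict[str, str] = {
    "security.protocol": "SASL_SSL",
    "sasl.mechanism": "AWS_MSK_IAM",
    "sasl.jaas.config": "software.amazon.msk.auth.iam.IAMLoginModule required;",
    "sasl.client.callback.handler.class": "software.amazon.msk.auth.iam.IAMClientCallbackHandler",
}


def render_properties(props: Dict[str, str]) -> str:
    """Render a dict as Java .properties text, one key=value pair per line."""
    return "".join(f"{key}={value}\n" for key, value in props.items())


def write_client_config(path: str) -> Path:
    """Write the IAM client properties to `path` and return the path."""
    target = Path(path)
    target.write_text(render_properties(MSK_IAM_CLIENT_PROPERTIES))
    return target


# Usage: write_client_config("kafkaconfig"), then pass the file via --command-config.
```

With IAM auth the bootstrap servers listen on port 9098, which is why the command above uses that port instead of the plaintext or TLS ports.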
reference doc:

Unable to create AKS cluster in westeurope location

I am trying to set up an AKS cluster using this guide in the westeurope location, but it keeps failing at this step.
When executing this command az aks create --location westeurope --resource-group <myResourceGroup> --name <myAKSCluster> --node-count 1 --generate-ssh-keys
I continuously get the following error message:
Operation failed with status: 'Bad Request'. Details: The VM size of Agent is not allowed in your subscription in location 'westeurope'. Agent VM size 'Standard_DS1_v2' is available in locations: australiaeast,australiasoutheast,brazilsouth,canadacentral,canadaeast,centralindia,centralus,centraluseuap,eastasia,eastus,eastus2euap,japaneast,japanwest,koreacentral,koreasouth,northcentralus,northeurope,southcentralus,southindia,uksouth,ukwest,westcentralus,westindia,westus,westus2.
Even when I explicitly set the VM size to a different type of VM I still get a similar error. For example:
az aks create --location westeurope --resource-group <myResourceGroup> --name <myAKSCluster> --node-vm-size Standard_B1s --node-count 1 --generate-ssh-keys
results in:
Operation failed with status: 'Bad Request'. Details: The VM size of Agent is not allowed in your subscription in location 'westeurope'. Agent VM size 'Standard_B1s' is available in locations: australiaeast,australiasoutheast,brazilsouth,canadacentral,canadaeast,centralindia,centralus,centraluseuap,eastasia,eastus,eastus2euap,japaneast,japanwest,koreacentral,koreasouth,northcentralus,northeurope,southcentralus,southindia,uksouth,ukwest,westcentralus,westindia,westus,westus2.
It looks like creating an AKS cluster in westeurope is forbidden or not possible at all. Has anybody created a cluster in this location successfully?
This is a common problem at the moment for westeurope and looks like a bug in Azure AKS. The VMs can be created through "Virtual machines" but not through AKS.
Here is a different thread on this topic:
You just need to add --node-vm-size Standard_D2s_v3 to your command. It resolved my issue.
Note: pass a VM size that your region supports rather than copying Standard_D2s_v3 blindly; for example, my region, WestUS, supports Standard_d16ads_v5. The error returned by the command in the question lists what is available.
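Since the error text itself names the regions where the requested size is offered, it can be parsed instead of read by eye. A small Python sketch, assuming the message keeps the "is available in locations: a,b,c." shape quoted in the question; the function name is made up.

```python
import re
from typing import List


def locations_from_error(message: str) -> List[str]:
    """Extract the comma-separated location list from an AKS 'Bad Request' message."""
    match = re.search(r"is available in locations:\s*([A-Za-z0-9,]+)", message)
    if match is None:
        return []
    return [name for name in match.group(1).split(",") if name]


# Error text from the question, with the location list shortened for readability.
error = (
    "Operation failed with status: 'Bad Request'. Details: The VM size of Agent "
    "is not allowed in your subscription in location 'westeurope'. Agent VM size "
    "'Standard_DS1_v2' is available in locations: australiaeast,northeurope,westus2."
)
locations = locations_from_error(error)
print("westeurope" in locations)  # prints False
```

Going the other way, az vm list-skus --location westeurope --output table lists the sizes a subscription can use in a region, which may be a quicker way to pick a value for --node-vm-size.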

Ambari cluster : Host registration failed

I am setting up an Ambari cluster with 3 VirtualBox VMs running Ubuntu 16.04 LTS.
I followed this Hortonworks tutorial.
However, when I create a cluster using the Ambari Cluster Install Wizard, I get the error below during step 3, "Confirm Hosts".
26 Jun 2017 16:41:11,553 WARN [Thread-34] BSRunner:292 - Bootstrap process timed out. It will be destroyed.
26 Jun 2017 16:41:11,554 INFO [Thread-34] BSRunner:309 - Script log Mesg
INFO:root:BootStrapping hosts ['', ''] using /usr/lib/python2.6/site-packages/ambari_server cluster primary OS: ubuntu16 with user 'thanuja'with ssh Port '22' sshKey File /var/run/ambari-server/bootstrap/5/sshKey password File null using tmp dir /var/run/ambari-server/bootstrap/5 ambari:; server_port: 8080; ambari version:; user_run_as: root
INFO:root:Executing parallel bootstrap
Bootstrap process timed out. It was destroyed.
I have read a number of posts saying that this is related to not enabling passwordless SSH to the hosts. But I can ssh to the hosts without a password from the server.
I am running Ambari as a non-root user with root privileges.
This post helped me.
I modified the users on the host machines with visudo so that they can execute sudo commands without a password.
Please post if you have any alternative answers.

SSH into Kubernetes cluster running on Amazon

Created a 2 node Kubernetes cluster as:
This shows the output as:
Found 2 node(s).
NAME STATUS AGE Ready 57s Ready 55s
Validate output:
controller-manager Healthy ok
scheduler Healthy ok
etcd-0 Healthy {"health": "true"}
etcd-1 Healthy {"health": "true"}
Cluster validation succeeded
Done, listing cluster services:
Kubernetes master is running at
Elasticsearch is running at
Heapster is running at
Kibana is running at
KubeDNS is running at
kubernetes-dashboard is running at
Grafana is running at
InfluxDB is running at
I can see the instances in EC2 console. How do I ssh into the master node?
Here is the exact command that worked for me:
ssh -i ~/.ssh/kube_aws_rsa admin@<masterip>
kube_aws_rsa is the default key that gets generated; otherwise the key is controlled with the AWS_SSH_KEY environment variable. For AWS, it is specified in the file cluster/aws/
More details about the cluster can be found using config view.
"Creates an AWS SSH key named kubernetes-. Fingerprint here is the OpenSSH key fingerprint, so that multiple users can run the script with different keys and their keys will not collide (with near-certainty). It will use an existing key if one is found at AWS_SSH_KEY, otherwise it will create one there. (With the default Ubuntu images, if you have to SSH in: the user is ubuntu and that user can sudo"
You should see the SSH key fingerprint locally in your SSH config; otherwise, set the environment variable and recreate.
If you bring up your cluster on AWS with kops and use CoreOS as your image, the login name is "core".
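The answers above differ only in the login user, so the command can be assembled from a small lookup. This is a sketch under assumptions: the dictionary keys are labels invented here, the users are the ones mentioned in this thread (admin for the command that worked with the default setup, ubuntu for Ubuntu images, core for CoreOS), and the default key path is the kube_aws_rsa key named above.

```python
import os
from typing import Dict, List

# Login users mentioned in this thread; the keys are illustrative labels only.
LOGIN_USERS: Dict[str, str] = {
    "kube-up-default": "admin",
    "ubuntu": "ubuntu",
    "coreos": "core",
}


def ssh_argv(image: str, host: str, key_path: str = "~/.ssh/kube_aws_rsa") -> List[str]:
    """Build the ssh argument list for the given image label and master address."""
    user = LOGIN_USERS[image]  # raises KeyError for an unknown label
    # Expand ~ here because no shell is involved when the list is passed to subprocess.
    return ["ssh", "-i", os.path.expanduser(key_path), f"{user}@{host}"]


# Usage: subprocess.run(ssh_argv("coreos", "<masterip>")) with the real master IP.
```

Returning an argument list rather than a single string avoids shell quoting issues when the result is handed to subprocess.run.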