Redis cluster HA not working in Kubernetes

I deployed https://github.com/bitnami/charts/tree/master/bitnami/redis-cluster
with the following command:
helm install redis-cluster bitnami/redis-cluster --create-namespace -n redis -f redis-values.yaml
redis-values.yaml
cluster:
  init: true
  nodes: 6
  replicas: 1
podDisruptionBudget:
  minAvailable: 2
persistence:
  size: 1Gi
password: "redis#pass"
redis:
  configmap: |-
    maxmemory 600mb
    maxmemory-policy allkeys-lru
    maxclients 40000
    cluster-require-full-coverage no
    cluster-allow-reads-when-down yes
sysctlImage:
  enabled: true
  mountHostSys: true
  command:
    - /bin/sh
    - -c
    - |-
      insta
      sysctl -w net.core.somaxconn=10000
      echo never > /host-sys/kernel/mm/transparent_hugepage/enabled
      # echo never > /host-sys/kernel/mm/transparent_hugepage/defrag
metrics:
  enabled: true
The cluster itself is working fine; the only issue is that if I delete any pod, Redis goes down and I start getting Redis errors.
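A quick way to confirm whether the cluster itself fails over after the pod deletion is to check the cluster state from one of the remaining pods (a sketch; pod names, namespace, and password are the ones from the chart install above):
kubectl exec -n redis redis-cluster-1 -- redis-cli -a 'redis#pass' cluster info
kubectl exec -n redis redis-cluster-1 -- redis-cli -a 'redis#pass' cluster nodes
If cluster_state stays ok here while the application keeps failing, the problem is on the client side rather than in the cluster.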
Here is my Quarkus config for the connection:
quarkus.redis.hosts=redis://redis-cluster.redis.svc.local:6379
quarkus.redis.master-name=redis-cluster
quarkus.redis.password=redis#pass
quarkus.redis.client-type=cluster

Don't connect through the service; use the individual node addresses instead. Change from
quarkus.redis.hosts=redis://redis-cluster:6379
To
quarkus.redis.hosts=redis://redis-cluster-0.redis-cluster-headless.redis.svc.cluster.local:6379,redis://redis-cluster-1.redis-cluster-headless.redis.svc.cluster.local:6379,redis://redis-cluster-2.redis-cluster-headless.redis.svc.cluster.local:6379,redis://redis-cluster-3.redis-cluster-headless.redis.svc.cluster.local:6379,redis://redis-cluster-4.redis-cluster-headless.redis.svc.cluster.local:6379,redis://redis-cluster-5.redis-cluster-headless.redis.svc.cluster.local:6379
The format for each host is:
POD-NAME.HEADLESS-SVC-NAME.NAMESPACE.svc.cluster.local:PORT
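To double-check those per-pod DNS names before putting them into the Quarkus config, the headless service can be resolved from inside the cluster (a sketch; service name and namespace as above):
kubectl get svc -n redis redis-cluster-headless
kubectl run -n redis dns-test --rm -it --restart=Never --image=busybox -- nslookup redis-cluster-0.redis-cluster-headless.redis.svc.cluster.local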

Related

Recv failure when I use docker-compose to set up Redis

Sorry, but I'm new to Redis and Docker and I'm getting stuck.
I want to connect Redis to my localhost with docker-compose. When I run docker-compose, both my web and redis services show that they are up, but when I test with curl -L http://localhost:8081/ping I get the message "curl: (56) Recv failure:".
I tried changing my docker-compose.yaml, but it is not working.
docker-compose:
version: '3'
services:
  redis:
    image: "redis:latest"
    ports:
      - "6379:6379"
  web:
    build: .
    ports:
      - "8081:6379"
    environment:
      REDIS_HOST: 0.0.0.0
      REDIS_PORT: 6379
      REDIS_PASSWORD: ""
    depends_on:
      - redis
Dockerfile
FROM python:3-onbuild
COPY requirements.txt requirements.txt
RUN pip3 install -r requirements.txt
CMD ["python", "main.py"]
My expected results are these:
curl -L http://localhost:8081/ping
pong
curl -L http://localhost:8081/redis-status
{"redis_connectivity": "OK"}

Running dockerized Behat BDD tests using Zalenium (scalable Selenium grid)

I am trying to run Behat BDD tests using the docksal/behat docker-compose setup (ref: https://github.com/docksal/behat).
Looking at the Zalenium documentation:
# Pull docker-selenium
docker pull elgalu/selenium
# Pull Zalenium
docker pull dosel/zalenium
# Run it!
docker run --rm -ti --name zalenium -p 4444:4444 \
  -v /var/run/docker.sock:/var/run/docker.sock \
  -v /tmp/videos:/home/seluser/videos \
  --privileged dosel/zalenium start
# Point your tests to http://localhost:4444/wd/hub and run them
Integrating docksal/behat and Zalenium is not clear.
I am using the following docker-compose.yml:
# The purpose of this -test example is for the e2e tests of this project
#
# Usage:
#   docker-compose -p grid up --force-recreate
#   docker-compose -p grid scale mock=1 hub=1 chrome=3 firefox=3
version: '2.1'
services:
  hub:
    image: elgalu/selenium
    ports:
      - ${VNC_FROM_PORT-40650}-${VNC_TO_PORT-40700}:${VNC_FROM_PORT-40650}-${VNC_TO_PORT-40700}
  zalenium:
    image: "dosel/zalenium"
    container_name: zalenium
    hostname: zalenium
    tty: true
    volumes:
      - /tmp/videos:/home/seluser/videos
      - /var/run/docker.sock:/var/run/docker.sock
    privileged: true
    ports:
      - 4444:4444
    command: >
      start --desiredContainers 2
      --maxDockerSeleniumContainers 8
      --screenWidth 800 --screenHeight 600
      --timeZone "America/New_York"
      --videoRecordingEnabled false
      --sauceLabsEnabled false
      --browserStackEnabled false
      --testingBotEnabled false
      --cbtEnabled false
      --startTunnel false
  behat:
    hostname: behat
    image: docksal/behat
    volumes:
      - .:/src
    # Run a built-in web server for access to HTML reports
    ports:
      - 8000:8000
    entrypoint: "php -S 0.0.0.0:8000"
  # browser:
  #   hostname: browser
  #   # Pick/uncomment one
  #   image: selenium/standalone-chrome
  #   # image: selenium/standalone-firefox
I can bring up the following containers:
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
9a1620da4c71 elgalu/selenium:latest "entry.sh" 4 seconds ago Up 4 seconds 40001/tcp, 50001/tcp zalenium_52g9tn
82975a246be8 elgalu/selenium:latest "entry.sh" 5 seconds ago Up 4 seconds 40000/tcp, 50000/tcp zalenium_fdCCbk
862017957ba1 docksal/behat "php -S 0.0.0.0:8000" 2 days ago Up 8 seconds 0.0.0.0:8000->8000/tcp behat-selenium_behat_1
2da2c165a211 elgalu/selenium "entry.sh" 4 days ago Up 6 seconds 0.0.0.0:40650-40700->40650-40700/tcp behat-selenium_hub_1
10df443d8378 dosel/zalenium "entry.sh start --de…" 4 days ago Up 8 seconds 0.0.0.0:4444->4444/tcp, 4445/tcp zalenium
Now, looking at run-behat in the examples directory, I am basically executing:
$ docker exec $(docker-compose ps -q behat) behat --colors --format=pretty --out=std --format=html --out=html_report "$@"
I get an error:
Error response from daemon: Container 411c6b89d8f382a64ed567dbe4d02a2840f06d4778f1ce7bcf955e720d96ab02 is not running
whereas:
$ docker-compose ps -q behat
411c6b89d8f382a64ed567dbe4d02a2840f06d4778f1ce7bcf955e720d96ab02
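To see why that container ID is not running, its state and recent output can be checked (a sketch):
docker-compose ps behat
docker-compose logs behat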
Perhaps wd_host: http://localhost:4444/wd/hub in behat.yml should not point to localhost but rather to the hub, since wd_host should point to the Selenium grid running in the container?
This
hub:
  image: elgalu/selenium
  ports:
    - ${VNC_FROM_PORT-40650}-${VNC_TO_PORT-40700}:${VNC_FROM_PORT-40650}-${VNC_TO_PORT-40700}
is not needed at all.
Just point your tests to the zalenium service, since you are using docker-compose.
http://zalenium:4444/wd/hub
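To verify the grid is reachable from the behat container under that name, a quick status call works (a sketch; it assumes curl is available inside the docksal/behat image):
docker-compose exec behat curl -s http://zalenium:4444/wd/hub/status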

Setting up SSL between Helm and Tiller

I am following these instructions to set up SSL between Helm and Tiller.
When I run helm init like this, I get an error:
helm init --tiller-tls --tiller-tls-cert ./tiller.cert.pem --tiller-tls-key ./tiller.key.pem --tiller-tls-verify --tls-ca-cert ca.cert.pem
$HELM_HOME has been configured at /Users/Koustubh/.helm.
Warning: Tiller is already installed in the cluster.
(Use --client-only to suppress this message, or --upgrade to upgrade Tiller to the current version.)
Happy Helming!
When I check my pods, I get
tiller-deploy-6444c7d5bb-chfxw 0/1 ContainerCreating 0 2h
and after describing the pod, I get
Warning FailedMount 7m (x73 over 2h) kubelet, gke-myservice-default-pool-0198f291-nrl2 Unable to mount volumes for pod "tiller-deploy-6444c7d5bb-chfxw_kube-system(3ebae1df-e790-11e8-98ae-42010a9800f9)": timeout expired waiting for volumes to attach or mount for pod "kube-system"/"tiller-deploy-6444c7d5bb-chfxw". list of unmounted volumes=[tiller-certs]. list of unattached volumes=[tiller-certs default-token-9x886]
Warning FailedMount 1m (x92 over 2h) kubelet, gke-myservice-default-pool-0198f291-nrl2 MountVolume.SetUp failed for volume "tiller-certs" : secrets "tiller-secret" not found
If I try to delete the running tiller pod like this, it just gets stuck
helm reset --debug --force
How can I solve this issue? I have also tried the --upgrade flag with helm init, but that doesn't work either.
I had this issue but resolved it by deleting both the tiller deployment and the service and re-initialising.
I'm also using RBAC so have added those commands too:
# Remove existing tiller:
kubectl delete deployment tiller-deploy -n kube-system
kubectl delete service tiller-deploy -n kube-system
# Re-init with your certs
helm init --tiller-tls --tiller-tls-cert ./tiller.cert.pem --tiller-tls-key ./tiller.key.pem --tiller-tls-verify --tls-ca-cert ca.cert.pem
# Add RBAC service account and role
kubectl create serviceaccount --namespace kube-system tiller
kubectl create clusterrolebinding tiller-cluster-rule --clusterrole=cluster-admin --serviceaccount=kube-system:tiller
kubectl patch deploy --namespace kube-system tiller-deploy -p '{"spec":{"template":{"spec":{"serviceAccount":"tiller"}}}}'
# Re-initialize
helm init --service-account tiller --upgrade
# Test the pod is up
kubectl get pods -n kube-system
NAME READY STATUS RESTARTS AGE
tiller-deploy-69775bbbc7-c42wp 1/1 Running 0 5m
# Copy the certs to `~/.helm`
cp tiller.cert.pem ~/.helm/cert.pem
cp tiller.key.pem ~/.helm/key.pem
Validate that helm only responds via TLS:
$ helm version
Client: &version.Version{SemVer:"v2.10.0", GitCommit:"9ad53aac42165a5fadc6c87be0dea6b115f93090", GitTreeState:"clean"}
Error: cannot connect to Tiller
$ helm version --tls
Client: &version.Version{SemVer:"v2.10.0", GitCommit:"9ad53aac42165a5fadc6c87be0dea6b115f93090", GitTreeState:"clean"}
Server: &version.Version{SemVer:"v2.10.0", GitCommit:"9ad53aac42165a5fadc6c87be0dea6b115f93090", GitTreeState:"clean"}
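If the pod instead stays stuck in ContainerCreating with the tiller-certs mount error from the question, the TLS secret was probably never created; a quick check (the secret name is the one from the error message above):
kubectl get secret tiller-secret -n kube-system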
Thanks to
https://github.com/helm/helm/issues/4691#issuecomment-430617255
https://medium.com/@pczarkowski/easily-install-uninstall-helm-on-rbac-kubernetes-8c3c0e22d0d7

Create a Docker Swarm v1.12.3 service and mount an NFS volume

I'm unable to get an NFS volume mounted for a Docker Swarm, and the lack of proper official documentation regarding the --mount syntax (https://docs.docker.com/engine/reference/commandline/service_create/) doesn't help.
I have tried basically this command line to create a simple nginx service with a /kkk directory mounted to an NFS volume:
docker service create --mount type=volume,src=vol_name,volume-driver=local,dst=/kkk,volume-opt=type=nfs,volume-opt=device=192.168.1.1:/your/nfs/path --name test nginx
The command line is accepted and the service is scheduled by Swarm, but the container never reaches "running" state and swarm tries to start a new instance every few seconds. I set the daemon to debug but no error regarding the volume shows...
What is the right syntax to create a service with an NFS volume?
Thanks a lot
I found an article here that shows how to mount an NFS share (and it works for me): http://collabnix.com/docker-1-12-swarm-mode-persistent-storage-using-nfs/
sudo docker service create \
--mount type=volume,volume-opt=o=addr=192.168.x.x,volume-opt=device=:/data/nfs,volume-opt=type=nfs,source=vol_collab,target=/mount \
--replicas 3 --name testnfs \
alpine /bin/sh -c "while true; do echo 'OK'; sleep 2; done"
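Once the service is up, you can verify the volume options and the actual mount on a node that runs one of the tasks (a rough check; volume and service names as above):
docker volume inspect vol_collab
docker exec "$(docker ps -q -f name=testnfs | head -n1)" mount | grep nfs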
Update:
In case you want to use it with docker-compose, you can do it as follows:
version: '3'
services:
  alpine:
    image: alpine
    volumes:
      - vol_collab:/mount
    deploy:
      mode: replicated
      replicas: 2
    command: /bin/sh -c "while true; do echo 'OK'; sleep 2; done"
volumes:
  vol_collab:
    driver: local
    driver_opts:
      type: nfs
      o: addr=192.168.xx.xx
      device: ":/data/nfs"
and then run it with
docker stack deploy -c docker-compose.yml test
You could also do this in docker-compose to create an NFS volume:
data:
  driver: local
  driver_opts:
    type: "nfs"
    o: addr=<nfs-Host-domain-name>,rw,sync,nfsvers=4.1
    device: ":<path to directory in nfs server>"

Docker Swarm TLS: Failed to validate pending node

I am seeing this log on my Swarm manager container:
time="2016-04-15T02:47:59Z" level=debug msg="Failed to validate pending node: lookup node1 on 10.0.2.3:53: server misbehaving" Addr="node1:2376"
I have set up a github repo to reproduce my problem: https://github.com/casertap/playing-with-swarm-tls
I am running a cluster of 2 machines (built with Vagrant):
$script2 = <<STOP
service docker stop
sed -i 's/DOCKER_OPTS=/DOCKER_OPTS="-H tcp:\\/\\/0.0.0.0:2376 -H unix:\\/\\/\\/var\\/run\\/docker.sock --tlsverify --tlscacert=\\/home\\/vagrant\\/.certs\\/ca.pem --tlscert=\\/home\\/vagrant\\/.certs\\/cert.pem --tlskey=\\/home\\/vagrant\\/.certs\\/key.pem"/' /etc/init/docker.conf
service docker start
STOP
Vagrant.configure(VAGRANTFILE_API_VERSION) do |config|
  config.vm.box = "ubuntu/trusty64"
  config.vm.define "node1" do |app|
    app.vm.network "private_network", ip: "192.168.33.10"
    app.vm.provision "file", source: "ca.pem", destination: "~/.certs/ca.pem"
    app.vm.provision "file", source: "node1-cert.pem", destination: "~/.certs/cert.pem"
    app.vm.provision "file", source: "node1-priv-key.pem", destination: "~/.certs/key.pem"
    app.vm.provision "file", source: "node1.csr", destination: "~/.certs/node1.csr"
    app.vm.provision "docker"
    app.vm.provision :shell, :inline => $script2
  end
  config.vm.define "swarm" do |app|
    app.vm.network "private_network", ip: "192.168.33.12"
    app.vm.provision "shell", inline: "echo '192.168.33.10 node1' >> /etc/hosts"
    app.vm.provision "shell", inline: "echo '192.168.33.12 swarm' >> /etc/hosts"
    app.vm.provision "docker"
    app.vm.provision "file", source: "ca.pem", destination: "~/.certs/ca.pem"
    app.vm.provision "file", source: "swarm-cert.pem", destination: "~/.certs/cert.pem"
    app.vm.provision "file", source: "swarm-priv-key.pem", destination: "~/.certs/key.pem"
    app.vm.provision "file", source: "swarm.csr", destination: "~/.certs/swarm.csr"
  end
end
As you can see, my node1 /etc/init/docker.conf has the options:
DOCKER_OPTS="-H tcp:\\/\\/0.0.0.0:2376 -H unix:\\/\\/\\/var\\/run\\/docker.sock --tlsverify --tlscacert=\\/home\\/vagrant\\/.certs\\/ca.pem --tlscert=\\/home\\/vagrant\\/.certs\\/cert.pem --tlskey=\\/home\\/vagrant\\/.certs\\/key.pem"
I do
vagrant up
then I connect to swarm
vagrant ssh swarm
export TOKEN=$(docker run swarm create)
#dd182b8d2bc8c03f417376296558ba29
docker run -d swarm join --advertise node1:2376 token://dd182b8d2bc8c03f417376296558ba29
node1 is defined in the /etc/hosts file, as you can see in the Vagrant provisioning file.
Start the swarm manager with log level debug (without -d):
docker run -p 3376:3376 -v /home/vagrant/.certs:/certs:ro swarm -l debug manage --tlsverify --tlscacert=/certs/ca.pem --tlscert=/certs/cert.pem --tlskey=/certs/key.pem --host=0.0.0.0:3376 token://dd182b8d2bc8c03f417376296558ba29
The log is showing me:
time="2016-04-15T02:47:59Z" level=debug msg="Failed to validate pending node: lookup node1 on 10.0.2.3:53: server misbehaving" Addr="node1:2376"
my node1 ip address in /etc/hosts is actually:
192.168.33.10 node1
It seems that Docker is trying to look up the node1 alias on the wrong bridge network?
========== more info:
You can check this URL to see if the discovery service found your node1, and it does:
https://discovery.hub.docker.com/v1/clusters/dd182b8d2bc8c03f417376296558ba29
Now if you run the swarm manager with -d and do:
vagrant@vagrant-ubuntu-trusty-64:~$ docker --tlsverify --tlscacert=/home/vagrant/.certs/ca.pem --tlscert=/home/vagrant/.certs/cert.pem --tlskey=/home/vagrant/.certs/key.pem -H swarm:3376 info
Containers: 0
Running: 0
Paused: 0
Stopped: 0
Images: 0
Server Version: swarm/1.2.0
Role: primary
Strategy: spread
Filters: health, port, dependency, affinity, constraint
Nodes: 1
(unknown): node1:2376
└ Status: Pending
└ Containers: 0
└ Reserved CPUs: 0 / 0
└ Reserved Memory: 0 B / 0 B
└ Labels:
└ Error: (none)
└ UpdatedAt: 2016-04-15T03:03:28Z
└ ServerVersion:
Plugins:
Volume:
Network:
Kernel Version: 3.13.0-85-generic
Operating System: linux
Architecture: amd64
CPUs: 0
Total Memory: 0 B
Name: ee85273cbb64
Docker Root Dir:
Debug mode (client): false
Debug mode (server): false
WARNING: No kernel memory limit support
You can see the node's status is Pending.
Although you define node1 in your machine's /etc/hosts, the container that the Swarm manager runs in doesn't have node1 in its /etc/hosts file. By default a container doesn't share the host's file system. See https://docs.docker.com/engine/userguide/containers/dockervolumes/. The Swarm manager tries to look up node1 through the DNS resolver and fails.
There are several options to resolve this:
Use a resolvable FQDN so the Swarm manager in the container can resolve the node.
Or provide node1's IP address in the swarm join command.
Or mount the host's /etc/hosts file into the Swarm manager container using the -v option (see the link above), as sketched below.
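A concrete form of that last option, reusing the manage command from the question, is to mount the host's /etc/hosts read-only so node1 resolves inside the container (a sketch):
docker run -p 3376:3376 \
  -v /etc/hosts:/etc/hosts:ro \
  -v /home/vagrant/.certs:/certs:ro \
  swarm -l debug manage --tlsverify --tlscacert=/certs/ca.pem --tlscert=/certs/cert.pem --tlskey=/certs/key.pem --host=0.0.0.0:3376 token://dd182b8d2bc8c03f417376296558ba29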