I installed OKD 3.11 (single master, multiple nodes) via the openshift-ansible playbooks. When I try the nginx quickstart, I get the following error when I ask for the events of the pod:
Warning FailedCreatePodSandBox 23m kubelet, Failed create pod sandbox: rpc error: code = Unknown desc = [failed to set up sandbox container "3d57f7bf012f8737e202ac2db7291b58a3d5fde376ff431395584c165928d475" network for pod "nginx-example-1-build": NetworkPlugin cni failed to set up pod "nginx-example-1-build_geoffrey-samper" network: error getting ClusterInformation: Get https://10.43.0.1:443/apis/crd.projectcalico.org/v1/clusterinformations/default: dial tcp 10.43.0.1:443: i/o timeout, failed to clean up sandbox container "3d57f7bf012f8737e202ac2db7291b58a3d5fde376ff431395584c165928d475" network for pod "nginx-example-1-build": NetworkPlugin cni failed to teardown pod "nginx-example-1-build_geoffrey-samper" network: error getting ClusterInformation: Get https://10.43.0.1:443/apis/crd.projectcalico.org/v1/clusterinformations/default: dial tcp 10.43.0.1:443: i/o timeout
When I run oc status, I can't find any pod in the 10.* IP range:
In project default on server https://p04:8443
https://docker-registry-default (passthrough) (svc/docker-registry)
dc/docker-registry deploys docker.io/openshift/origin-docker-registry:v3.11
deployment #1 deployed about an hour ago - 1 pod
svc/kubernetes - xxx ports 443->8443, 53->8053, 53->8053
https://registry-console-default (passthrough) (svc/registry-console)
dc/registry-console deploys docker.io/cockpit/kubernetes:latest
deployment #1 deployed about an hour ago - 1 pod
svc/router - xxx ports 80, 443, 1936
dc/router deploys docker.io/openshift/origin-haproxy-router:v3.11
deployment #1 deployed about an hour ago - 1 pod
Does anyone know how to resolve this, or where to begin?
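So far, this is how I have been checking whether the node can reach the API service IP from the error at all (the Calico namespace is a guess on my part):
# ClusterIP of the kubernetes service that the CNI plugin is trying to reach
oc get svc kubernetes -n default
# Are the Calico pods healthy? (the namespace may differ on your install)
oc get pods -n kube-system -o wide | grep -i calico
# From the affected node: is the service IP reachable at all?
curl -k --max-time 5 https://10.43.0.1:443/version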
I am configuring an ELK stack, version 8.1, based on two virtual machines that both run Oracle Linux 8. I need to send logs from one VM to the other using rsyslog. On the receiving machine, the logs will be acquired using Filebeat. The rsyslog.conf file has been configured on the sending machine, adding the target machine's parameters. The filebeat.yml file has been configured to receive logs from rsyslog like this:
- type: syslog
  enabled: true
  format: auto
  protocol.tcp:
    host: "X.X.X.X:10514"
firewalld on the receiving machine has been configured to open port 10514.
Since rebooting after the configuration, the only thing I get is the error:
cannot connect to X.X.X.X:10514: Connection refused
How can I solve this problem?
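For completeness, here is my understanding of the two ends involved (the @@ prefix in the rsyslog rule means TCP, a single @ would be UDP; the exact line in my rsyslog.conf may differ):
# Sender, /etc/rsyslog.conf: forward everything over TCP
*.* @@X.X.X.X:10514
# then: systemctl restart rsyslog
# Receiver: check that Filebeat is actually listening on the port...
ss -tlnp | grep 10514
# ...and that firewalld really allows it
firewall-cmd --permanent --add-port=10514/tcp
firewall-cmd --reload
Also worth noting: the host: value in filebeat.yml is the address Filebeat binds to on the receiver, so it should be local to that machine (e.g. "0.0.0.0:10514").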
My Windows machine recently auto-updated, and since then RabbitMQ is not working. I know the errors I'm getting appear in many Stack Overflow questions, but none of them has helped me resolve my issue. Any rabbitmqctl command I run returns the same result: that the node is unreachable (see below).
What I want to know:
How can I diagnose my issue?
Where can I find RabbitMQ or Erlang logs for what is happening? I can't find any, and the only environment variable defined on my machine is the RABBIT_MQ_HOME var.
Any suggestions for fixing the issue? (I have listed what I have tried below.)
The error I am getting on any rabbitmqctl command:
rabbitmqctl.bat start_app
Starting node rabbit#DESKTOP-BG3LMOM ...
Error: unable to perform an operation on node 'rabbit#DESKTOP-BG3LMOM'. Please see diagnostics information and suggestions below.
Most common reasons for this are:
* Target node is unreachable (e.g. due to hostname resolution, TCP connection or firewall issues)
* CLI tool fails to authenticate with the server (e.g. due to CLI tool's Erlang cookie not matching that of the server)
* Target node is not running
In addition to the diagnostics info below:
* See the CLI, clustering and networking guides on https://rabbitmq.com/documentation.html to learn more
* Consult server logs on node rabbit#DESKTOP-BG3LMOM
* If target node is configured to use long node names, don't forget to use --longnames with CLI tools
DIAGNOSTICS
===========
attempted to contact: ['rabbit#DESKTOP-BG3LMOM']
rabbit#DESKTOP-BG3LMOM:
* connected to epmd (port 4369) on DESKTOP-BG3LMOM
* epmd reports: node 'rabbit' not running at all
no other nodes on DESKTOP-BG3LMOM
* suggestion: start the node
Current node details:
* node name: 'rabbitmqcli-352-rabbit#DESKTOP-BG3LMOM'
System information:
RabbitMQ version: 3.9.13
Erlang version: 12.0
Windows 10 build: 19044.1526
What I have tried:
I've checked that the Erlang cookie is synced in all locations.
I've uninstalled RabbitMQ and Erlang and reinstalled them both at the latest version (via choco), with machine restarts between uninstalling and reinstalling.
I've followed every suggestion listed in this thread.
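For reference, these are the places I have been poking at so far (the log path and service name are my assumptions from a default Windows install):
:: Default log directory on Windows, as I understand it
dir %APPDATA%\RabbitMQ\log
:: State of the Windows service
sc query RabbitMQ
:: Starting the server in a console prints startup errors directly
:: (the versioned path is from my machine; adjust as needed)
"C:\Program Files\RabbitMQ Server\rabbitmq_server-3.9.13\sbin\rabbitmq-server.bat"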
I've installed Spinnaker in a Kubernetes cluster; Halyard is running on an Ubuntu machine.
To access the Spinnaker UI from my laptop as localhost:9000, I ran hal deploy connect on the Ubuntu machine and created SSH tunnels in PuTTY for ports 9000, 8084, and 8087 from my laptop to the Ubuntu system where Halyard is running.
hal deploy connect
+ Get current deployment
Success
+ Connect to Spinnaker deployment.
Success
Forwarding from 127.0.0.1:8084 -> 8084
Forwarding from [::1]:8084 -> 8084
Forwarding from 127.0.0.1:9000 -> 9000
Forwarding from [::1]:9000 -> 9000
But Spinnaker is not connecting, and the SSH logs even say the connection is refused.
However, I've tried running another application directly on Ubuntu and created an SSH tunnel for it, and reaching localhost:portNumber from my laptop works fine.
Please advise. Thanks.
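For comparison, the plain OpenSSH equivalent of my PuTTY tunnels is below (user and ubuntu-host are placeholders). As I understand it, hal deploy connect binds only to the loopback on the Ubuntu machine, so the tunnel destinations have to be localhost there, not the machine's external IP:
# Laptop side: forward the UI (9000) and API (8084) ports, plus 8087,
# all targeting localhost on the remote side where hal deploy connect listens
ssh -L 9000:localhost:9000 -L 8084:localhost:8084 -L 8087:localhost:8087 user@ubuntu-host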
Please take a look at: helm-chart, spinnaker-deployment.
As a temporary workaround (until this is fixed), you can add:
spec:
  ...
    - --runtime-config=apps/v1beta1=true,apps/v1beta2=true,extensions/v1beta1/daemonsets=true,extensions/v1beta1/deployments=true,extensions/v1beta1/replicasets=true,extensions/v1beta1/networkpolicies=true,extensions/v1beta1/podsecuritypolicies=true
... to /etc/kubernetes/manifests/kube-apiserver.yaml.
At the end, restart the kubelet.
Make sure you followed every step from this tutorial.
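For clarity, a sketch of where the flag lands in that manifest; the surrounding fields are a typical kubeadm layout, so adjust to what your file already contains:
# /etc/kubernetes/manifests/kube-apiserver.yaml (excerpt)
spec:
  containers:
  - command:
    - kube-apiserver
    - --runtime-config=apps/v1beta1=true,apps/v1beta2=true,extensions/v1beta1/daemonsets=true,extensions/v1beta1/deployments=true,extensions/v1beta1/replicasets=true,extensions/v1beta1/networkpolicies=true,extensions/v1beta1/podsecuritypolicies=true
    # ...keep the existing flags...
# then restart the kubelet so the static pod is re-created:
systemctl restart kubelet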
I want to get the actual IP address from which the client sent the packet, in my app sitting in a Kubernetes pod.
I did some searching and found that this was not supported earlier but was supported later.
I upgraded my setup, and here is the current version:
$ kubectl version
Client Version: version.Info{Major:"1", Minor:"1", GitVersion:"v1.1.3", GitCommit:"6a81b50c7e97bbe0ade075de55ab4fa34f049dc2", GitTreeState:"clean"}
Server Version: version.Info{Major:"1", Minor:"1", GitVersion:"v1.1.3", GitCommit:"6a81b50c7e97bbe0ade075de55ab4fa34f049dc2", GitTreeState:"clean"}
$ kubectl api-versions
extensions/v1beta1
v1
I also ran:
$ for node in $(kubectl get nodes -o name); do kubectl annotate $node net.beta.kubernetes.io/proxy-mode=iptables; done
This now gives:
error: --overwrite is false but found the following declared
annotation(s): 'net.beta.kubernetes.io/proxy-mode' already has a value
(iptables)
error: --overwrite is false but found the following declared
annotation(s): 'net.beta.kubernetes.io/proxy-mode' already has a value
(iptables)
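As an aside, my understanding is that this error only means the annotation was already set by an earlier run; re-running with --overwrite applies it cleanly:
for node in $(kubectl get nodes -o name); do
  kubectl annotate --overwrite $node net.beta.kubernetes.io/proxy-mode=iptables
done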
I also rebooted all the boxes.
However, I still get the IP of the worker node's docker0 interface when the packet is received inside my application.
Here, I read:
But that will not expose external client IPs, just intra cluster IPs.
So, the question is how to get the real, external client IP when I receive a packet.
The packets are not HTTP/WebSocket packets but plain TCP packets, if that is relevant to the answer.
I also tried following this comment but did not get lucky: the app continued to receive packets with the docker0 interface IP as the source IP. Maybe I didn't reproduce the steps exactly. I don't know how to get the kube-proxy IP, so I just used the worker machine's IP there. I am just getting started with Kubernetes and CoreOS.
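In case it matters, this is how I have been confirming that kube-proxy really is in iptables mode on a worker (the unit name is a guess; kube-proxy may run differently on CoreOS):
# The iptables proxier installs a KUBE-SERVICES chain in the nat table
sudo iptables -t nat -L KUBE-SERVICES | head
# kube-proxy logs its chosen proxy mode at startup
journalctl -u kube-proxy | grep -i proxy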
I set up a 2-node Hadoop cluster, and running start-dfs.sh and start-yarn.sh works nicely (i.e. all expected services are running, and there are no errors in the logs).
However, when I actually try to run an application, several tasks fail:
15/04/01 15:27:53 INFO mapreduce.Job: Task Id :
attempt_1427894767376_0001_m_000008_2, Status : FAILED
I checked the YARN and datanode logs, but nothing is reported there.
In the userlogs, the syslog files on the slave node all contain the following error message:
2015-04-01 15:27:21,077 INFO [main] org.apache.hadoop.ipc.Client:
Retrying connect to server:
slave.domain.be./127.0.1.1:53834. Already tried 9 time(s);
retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10,
sleepTime=1000 MILLISECONDS)
2015-04-01 15:27:21,078 WARN [main]
org.apache.hadoop.mapred.YarnChild:
Exception running child :
java.net.ConnectException: Call From
slave.domain.be./127.0.1.1 to
slave.domain.be.:53834 failed on connection exception:
java.net.ConnectException: Connection refused; For more details see:
http://wiki.apache.org/hadoop/ConnectionRefused at
sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
So the problem is that the slave cannot connect to itself.
I checked whether there is a process running on the slave node listening on port 53834, but there is none.
However, all 'expected' ports are being listened on (50020, 50075, ...). Nowhere in my configuration have I used port 53834, and it's a different port on every run.
Any ideas on fixing this issue?
Your error might be due to the loopback address in your hosts file. Go to the /etc/hosts file and comment out the line with 127.0.1.1 on your slave nodes and on the master node (if necessary). Then start the Hadoop processes again.
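For example, a slave's /etc/hosts might change like this (the hostname and addresses are placeholders):
# before
127.0.0.1   localhost
127.0.1.1   slave.domain.be
# after: loopback alias commented out, hostname mapped to the real address
127.0.0.1      localhost
#127.0.1.1     slave.domain.be
192.168.0.11   slave.domain.be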
EDITED:
Do this in a terminal to edit the hosts file when your user doesn't have write permission:
sudo bash
Enter your current user's password to get a root shell. You can now edit your hosts file using:
nano /etc/hosts