geth --rinkeby cannot find peers

I want to run a Rinkeby full node. This is my startup script for geth:
geth --rinkeby \
--cache=2048 \
--http --http.port=8545 --http.addr=127.0.0.1 --http.api=eth,web3,net,personal --http.corsdomain "*" \
--syncmode=fast
geth starts, but apparently it cannot connect to any peers. I get the following debug output:
INFO [08-05|07:03:02.696] Looking for peers peercount=0 tried=137 static=0
INFO [08-05|07:03:12.832] Looking for peers peercount=0 tried=180 static=0
INFO [08-05|07:03:23.506] Looking for peers peercount=0 tried=81 static=0
INFO [08-05|07:03:35.003] Looking for peers peercount=0 tried=198 static=0
INFO [08-05|07:03:45.042] Looking for peers peercount=0 tried=105 static=0
INFO [08-05|07:03:56.216] Looking for peers peercount=0 tried=96 static=0
INFO [08-05|07:04:06.978] Looking for peers peercount=0 tried=118 static=0
INFO [08-05|07:04:17.244] Looking for peers peercount=0 tried=110 static=0
INFO [08-05|07:04:27.273] Looking for peers peercount=0 tried=189 static=0
INFO [08-05|07:04:39.046] Looking for peers peercount=0 tried=81 static=0
It has been running like this for more than 12 hours. I also tried stopping the firewall, but that didn't change anything.
Here is the output of geth version:
Geth
Version: 1.9.19-unstable
Git Commit: 8e7bee9b56763e94c06e597bf968838e7ea2d03b
Git Commit Date: 20200727
Architecture: amd64
Protocol Versions: [65 64 63]
Go Version: go1.13.14
Operating System: linux
GOPATH=
GOROOT=/usr/lib64/go/1.13
What can I do so that geth finds other peers?
Thanks for your help.

If you update to the latest stable version (1.10.x) and still have the discovery issue, try running the node with the --v5disc option, which enables the experimental RLPx V5 (Topic Discovery) mechanism.
Edit:
Also, if the discovery port (30303) is not set up correctly, nodes may be unable to find peers.
Make sure that only one process is trying to listen on that port.

Well, your geth version is unstable. Try to install the latest stable version. This should solve the problem.

Kafka-s3-connect killed instantly after start

I want to connect AWS Kafka to S3 using the Confluent connector on my EC2 server. I tried to configure everything as in the tutorials. When I run connect-standalone or connect-distributed, everything goes well at first and I don't get any errors in the logs, but right after the messages about the connection starting, my connector dies instantly without any further information. Has anybody had the same problem?
config/connect-standalone.properties
bootstrap.servers=msk-connection-string
plugin.path=/home/ubuntu/connectors/confluentinc-kafka-connect-s3
key.converter=org.apache.kafka.connect.converters.ByteArrayConverter
value.converter=org.apache.kafka.connect.converters.ByteArrayConverter
offset.storage.file.filename=/tmp/connect.offsets
connector.properties
connector.class=io.confluent.connect.s3.S3SinkConnector
format.class=io.confluent.connect.s3.format.bytearray.ByteArrayFormat
flush.size=1
topics=SomeTopic
s3.bucket.name=bucket-name-here
s3.region=us-west-2
s3.part.size=5242880
aws.access.key.id=****
aws.secret.access.key=****
behavior.on.null.values=ignore
storage.class=io.confluent.connect.s3.storage.S3Storage
topics.dir=../topics
store.url=http://bucket-name.s3-website-Region.amazonaws.com
key.converter=org.apache.kafka.connect.converters.ByteArrayConverter
value.converter=org.apache.kafka.connect.converters.ByteArrayConverter
logs:
[2021-08-20 06:32:35,954] INFO Kafka version: 2.7.0 (org.apache.kafka.common.utils.AppInfoParser:119)
[2021-08-20 06:32:35,954] INFO Kafka commitId: 448719dc99a19793 (org.apache.kafka.common.utils.AppInfoParser:120)
[2021-08-20 06:32:35,954] INFO Kafka startTimeMs: 1629441155953 (org.apache.kafka.common.utils.AppInfoParser:121)
Killed
Please help!
MSK requires a TLS connection.
After adding the following SSL settings to config/connect-standalone.properties:
producer.security.protocol=SSL
consumer.security.protocol=SSL
security.protocol=SSL
ssl.protocol=TLS
ssl.truststore.location=/your/path/to/truststore/kafka.client.truststore.jks
It starts working properly!
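As a quick sanity check before starting the worker, the settings above can be verified with a small script. This is only a sketch under stated assumptions: the helper names `parse_properties` and `missing_ssl_settings` are made up for illustration, the parser handles only simple `key=value` lines (not the full Java properties syntax), and the list of required keys simply mirrors the lines in this answer.

```python
# Keys taken from the answer above; adjust the tuple if your setup differs.
REQUIRED_SSL_KEYS = (
    "security.protocol",
    "producer.security.protocol",
    "consumer.security.protocol",
)


def parse_properties(text):
    """Parse simple key=value lines, skipping blanks and '#' or '!' comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith(("#", "!")):
            continue
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


def missing_ssl_settings(props):
    """Return the required keys that are absent or not set to SSL."""
    return [k for k in REQUIRED_SSL_KEYS if props.get(k, "").upper() != "SSL"]


# Hypothetical worker config that sets only the worker-level protocol.
sample = """
bootstrap.servers=msk-connection-string
security.protocol=SSL
"""
print(missing_ssl_settings(parse_properties(sample)))
```

An empty list means all three protocol settings are present; anything else names the keys still to be added.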

Communication issue between high-level app and RTApp

I am trying to create a high-level app based on the IntercoreComms samples from azure-sphere-samples. The high-level sample code itself works without issue on my MT3620 dev board, but when I add my code on top of it and run it, the high-level app prints the error message below.
ERROR: Unable to create socket: 13 (Permission denied)
The error message from the RTApp is as below:
TargetName Type Endian TapName State
0* io0 cortex_m little mt3620.cpu unknown
Info : Listening on port 6666 for tcl connections
Info : Listening on port 23 for telnet connections
Info : clock speed 4800 kHz
Info : SWD DPIDR 0x3ba02477
Info : io0: hardware has 6 breakpoints, 4 watchpoints
Info : io0: external reset detected
Info : Listening on port 4444 for gdb connections
Info : accepting 'gdb' connection on tcp/4444
target halted due to debug-request, current mode: Thread
xPSR: 0x61000000 pc: 0x001008ea msp: 0x0012fb90
Warn : target io0 is not halted (gdb fileio)
Polling target io0 failed, trying to reexamine
Info : SWD DPIDR 0x3ba02477
Info : SWD DPIDR 0x3ba02477
Info : SWD DPIDR 0x3ba02477
Info : SWD DPIDR 0x3ba02477
Info : SWD DPIDR 0x3ba02477
To give your high-level app permission to talk to your real-time app, and vice versa, the "AllowedApplicationConnections" field of the app_manifest.json for each app must contain the component ID of the other app. See here for details. The "ComponentId" is itself a field in the app manifest: your new app(s) likely have different IDs from the sample apps.
Also, if you are deploying through Visual Studio (Code), you need to declare each app as a 'partner' of the other so one is not deleted when the other is deployed. See here for details of that. The RT app error that you see may come from it being deleted when the high-level app is deployed.
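The cross-reference between the two manifests can be checked mechanically. The sketch below assumes that "AllowedApplicationConnections" sits under the "Capabilities" object of app_manifest.json and that component IDs can be compared case-insensitively; the function names and the two IDs are placeholders invented for this example.

```python
def allowed_connections(manifest):
    """Lower-cased IDs listed under Capabilities.AllowedApplicationConnections."""
    caps = manifest.get("Capabilities", {})
    return {cid.lower() for cid in caps.get("AllowedApplicationConnections", [])}


def partner_problems(hl_manifest, rt_manifest):
    """Return a description of each direction that is not permitted."""
    problems = []
    hl_id = hl_manifest["ComponentId"].lower()
    rt_id = rt_manifest["ComponentId"].lower()
    if rt_id not in allowed_connections(hl_manifest):
        problems.append("high-level app does not allow the real-time app")
    if hl_id not in allowed_connections(rt_manifest):
        problems.append("real-time app does not allow the high-level app")
    return problems


# Placeholder manifests; in practice, load each app_manifest.json with json.load.
HL_ID = "11111111-1111-1111-1111-111111111111"
RT_ID = "22222222-2222-2222-2222-222222222222"
high_level = {
    "ComponentId": HL_ID,
    "Capabilities": {"AllowedApplicationConnections": [RT_ID]},
}
real_time = {
    "ComponentId": RT_ID,
    "Capabilities": {"AllowedApplicationConnections": []},
}
print(partner_problems(high_level, real_time))
```

An empty result means each manifest lists the other app's component ID.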

Node not starting after creating a new node in rabbitmq

I want to create a cluster of 3 nodes. I created two nodes with the command:
RABBITMQ_NODE_PORT=5680 RABBITMQ_NODENAME=rabbit1@localhost rabbitmq-server -detached
Now, when I try to stop the node in order to join it to the cluster, I get an error stating that the node is not started at all.
So far I have installed RabbitMQ and started it using rabbitmq-server.
rabbit1@localhost.log
Error description:
init:do_boot/3
init:start_em/1
rabbit:start_it/1 line 480
rabbit:broker_start/0 line 356
rabbit:start_apps/2 line 575
app_utils:manage_applications/6 line 126
lists:foldl/3 line 1263
rabbit:'-handle_app_error/1-fun-0-'/3 line 696
throw:{could_not_start,rabbitmq_mqtt,
{rabbitmq_mqtt,
{{shutdown,
{failed_to_start_child,'rabbit_mqtt_listener_sup_:::1883',
{shutdown,
{failed_to_start_child,
{ranch_listener_sup,{acceptor,{0,0,0,0,0,0,0,0},1883}},
{shutdown,
{failed_to_start_child,ranch_acceptors_sup,
{listen_error,
{acceptor,{0,0,0,0,0,0,0,0},1883},
eaddrinuse}}}}}}},
{rabbit_mqtt,start,[normal,[]]}}}}
Log file(s) (may contain more information):
/usr/local/var/log/rabbitmq/rabbit1@localhost.log
/usr/local/var/log/rabbitmq/rabbit1@localhost_upgrade.log
Terminal:
Most common reasons for this are:
* Target node is unreachable (e.g. due to hostname resolution, TCP connection or firewall issues)
* CLI tool fails to authenticate with the server (e.g. due to CLI tool's Erlang cookie not matching that of the server)
* Target node is not running
In addition to the diagnostics info below:
* See the CLI, clustering and networking guides on https://rabbitmq.com/documentation.html to learn more
* Consult server logs on node rabbit1@localhost
* If target node is configured to use long node names, don't forget to use --longnames with CLI tools
DIAGNOSTICS
===========
attempted to contact: [rabbit1@localhost]
rabbit1@localhost:
* connected to epmd (port 4369) on localhost
* epmd reports: node 'rabbit1' not running at all
other nodes on localhost: [rabbit]
* suggestion: start the node
Current node details:
* node name: 'rabbitmqcli-9206-rabbit@localhost'
* effective user's home directory: /Users/yashparekh
* Erlang cookie hash: +/3SPQl4T2w3zA11j1+o4Q==
I expect the stop_app command to work so that I can join the node to the cluster.
Please let me know where I'm going wrong.
Thanks in advance.
{failed_to_start_child,
{ranch_listener_sup,{acceptor,{0,0,0,0,0,0,0,0},1883}},
{shutdown,
{failed_to_start_child,ranch_acceptors_sup,
{listen_error,
{acceptor,{0,0,0,0,0,0,0,0},1883},
eaddrinuse}}}}}}},
This means that port 1883 (the MQTT port) is already in use. You also have to give each node its own MQTT port, just as you did for the node port with RABBITMQ_NODE_PORT.
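To confirm which ports are already taken before starting another node, a bind test like the one below can help. It is a minimal sketch: `port_in_use` is a name chosen for this example, it checks IPv4 TCP only, and the two port numbers are simply the ones that appear in the question (1883 from the eaddrinuse error, 5680 from RABBITMQ_NODE_PORT).

```python
import errno
import socket


def port_in_use(port, host="0.0.0.0"):
    """Return True if a TCP bind to host:port fails with EADDRINUSE."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError as exc:
            if exc.errno == errno.EADDRINUSE:
                return True
            raise  # some other problem, e.g. missing permissions
    return False


if __name__ == "__main__":
    for port in (1883, 5680):
        print(port, "in use" if port_in_use(port) else "free")
```

If 1883 is reported as in use while the first node is running, the second node needs a different MQTT listener port.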

locksmithd doesn't work properly with etcd tls

I have CoreOS beta (1153.4.0) installed.
I have etcd2 configured with TLS, and it works properly.
I'm trying to configure locksmithd to work with the TLS certificates by updating /var/lib/coreos-install/user_data and adding:
coreos:
  locksmith:
    endpoint: "https://coreos-2.tux-in.com:2379,https://coreos-3.tux-in.com:2379"
    etcd_cafile: /etc/ssl/etcd/ca.pem
    etcd_certfile: /etc/ssl/etcd/etcd1.pem
    etcd_keyfile: /etc/ssl/etcd/etcd1-key.pem
which created the file /run/systemd/system/locksmithd.service.d/20-cloudinit.conf with the content:
[Service]
Environment="LOCKSMITHD_ENDPOINT=https://coreos-2.tux-in.com:2379"
Environment="LOCKSMITHD_ETCD_CAFILE=/etc/ssl/etcd/ca.pem"
Environment="LOCKSMITHD_ETCD_CERTFILE=/etc/ssl/etcd/etcd1.pem"
Environment="LOCKSMITHD_ETCD_KEYFILE=/etc/ssl/etcd/etcd1-key.pem"
locksmithctl status returns Error initializing etcd client: client: etcd cluster is unavailable or misconfigured.
How can I debug this issue further? Or, even better, solve it? :)
Any information regarding the issue would be greatly appreciated.
I needed to run systemctl unmask update-engine.service on both servers. Now locksmithd runs properly: if I lock one of the servers, the other one notices it with 'locksmithctl status'.
Everything works.

Error running topology in production cluster with Apache Storm 1.0.0, topology does not start

I have a topology that runs well on a local cluster.
But when I try to run it on a production cluster, the following happens:
Nimbus is up
The Storm UI is up
The two workers I use are up
ZooKeeper is up
I run Storm with
storm jar myjar.jar MyClass
Nimbus submits the topology
The topologies and the workers appear in the Storm UI
BUT:
The topology does not start, even though its status is ACTIVE
The topology's log file does not appear on the workers.
I have the following in supervisor.log on the worker:
2016-04-15 13:18:19.831 o.a.s.d.supervisor [WARN] There was a connection problem with nimbus. #error {
:cause jobs-rec-storm-nimbus
:via
[{:type java.lang.RuntimeException
:message org.apache.storm.thrift.transport.TTransportException: java.net.UnknownHostException: jobs-rec-storm-nimbus
:at [org.apache.storm.security.auth.TBackoffConnect retryNext TBackoffConnect.java 64]}
{:type org.apache.storm.thrift.transport.TTransportException
:message java.net.UnknownHostException: jobs-rec-storm-nimbus
:at [org.apache.storm.thrift.transport.TSocket open TSocket.java 226]}
{:type java.net.UnknownHostException
:message jobs-rec-storm-nimbus
:at [java.net.AbstractPlainSocketImpl connect AbstractPlainSocketImpl.java 184]}]
:trace
[[java.net.AbstractPlainSocketImpl connect AbstractPlainSocketImpl.java 184]
[java.net.SocksSocketImpl connect SocksSocketImpl.java 392]
[java.net.Socket connect Socket.java 589]
[org.apache.storm.thrift.transport.TSocket open TSocket.java 221]
[org.apache.storm.thrift.transport.TFramedTransport open TFramedTransport.java 81]
[org.apache.storm.security.auth.SimpleTransportPlugin connect SimpleTransportPlugin.java 103]
[org.apache.storm.security.auth.TBackoffConnect doConnectWithRetry TBackoffConnect.java 53]
[org.apache.storm.security.auth.ThriftClient reconnect ThriftClient.java 99]
[org.apache.storm.security.auth.ThriftClient <init> ThriftClient.java 69]
[org.apache.storm.utils.NimbusClient <init> NimbusClient.java 106]
[org.apache.storm.utils.NimbusClient getConfiguredClientAs NimbusClient.java 78]
[org.apache.storm.utils.NimbusClient getConfiguredClient NimbusClient.java 41]
[org.apache.storm.blobstore.NimbusBlobStore prepare NimbusBlobStore.java 268]
[org.apache.storm.utils.Utils getClientBlobStoreForSupervisor Utils.java 462]
[org.apache.storm.daemon.supervisor$fn__9590 invoke supervisor.clj 942]
[clojure.lang.MultiFn invoke MultiFn.java 243]
[org.apache.storm.daemon.supervisor$mk_synchronize_supervisor$this__9351$fn__9369 invoke supervisor.clj 582]
[org.apache.storm.daemon.supervisor$mk_synchronize_supervisor$this__9351 invoke supervisor.clj 581]
[org.apache.storm.event$event_manager$fn__8903 invoke event.clj 40]
[clojure.lang.AFn run AFn.java 22]
[java.lang.Thread run Thread.java 745]]}
2016-04-15 13:18:19.831 o.a.s.d.supervisor [INFO] Finished downloading code for storm id jobs-KafkaMigration-topology-3-1460740616
2016-04-15 13:18:19.850 o.a.s.d.supervisor [INFO] Missing topology storm code, so can't launch worker with assignment ...(some more numbers)
So I assume that I have a connection problem with nimbus, but the properties file on the worker is:
storm.zookeeper.servers:
- "192.168.22.209"
- "192.168.22.216"
- "192.168.22.217"
storm.local.dir: "/app/home/storm"
storm.zookeeper.root: "/storm-prod"
#
nimbus.seeds: ["192.168.120.96"]
And if I ping the nimbus IP from the workers, it returns OK.
Where is the error, and how can I fix it?
Thanks!
What appears to happen here is that the Storm supervisor resolves nimbus from whatever is configured in the storm.yaml seeds/host setting the first time, and from then on uses the nimbus host name to download the topology artifacts.
If that is correct, DNS is mandatory for a cluster setup. This is far from ideal, especially when using containers in an orchestrated environment like Kubernetes.
The workaround I'm currently using is adding
storm.local.hostname: "<local.ip.value>"
to storm.yaml.
Thanks to @bastien, who provided the tip on the Storm user mailing list.
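Since the supervisor log fails with UnknownHostException for the nimbus host name, it can be worth checking name resolution directly on each supervisor machine. The following is a rough sketch, not part of Storm: `unresolvable_hosts` is a hypothetical helper, and the `resolver` parameter exists only so the function can be exercised without real DNS lookups.

```python
import socket


def unresolvable_hosts(hostnames, resolver=socket.gethostbyname):
    """Return the names that the resolver cannot turn into an address."""
    failed = []
    for name in hostnames:
        try:
            resolver(name)
        except OSError:  # socket.gaierror is a subclass of OSError
            failed.append(name)
    return failed
```

Calling `unresolvable_hosts(["jobs-rec-storm-nimbus"])` on a supervisor returns the name back if that machine cannot resolve it, which points to a DNS or /etc/hosts problem rather than a connectivity problem.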
I ran into a similar issue. It turned out my firewall rules were blocking the supervisor ports. Make sure the supervisor and nimbus are able to talk to each other.
I found that the hostnames of the boxes need to match the names I was using for them in the /etc/hosts file.
In the hosts file I had
xxx.xxx.xxx.xxx nimbus
but the hostname on the box was different, and it was pulling the hostname from the OS.
Changing the hostname in the OS of the nimbus server resolved my issue.