I have set up a 3-node Ambari cluster (3 VMs running CentOS 7), with Hive as just one of the services. All the other services have started, along with the Hive clients on all the hosts and the Hive Metastore. However, starting HiveServer2 fails.
The starting logs show the following exception:
caught exception: ZooKeeper node /hiveserver2 is not ready yet. Sleeping for 10 sec(s)
Could anybody help me out please?
In the Ambari home page, go to /Services/HDFS/Configs, search for hadoop.proxyuser.hive.hosts, and append the hostname of the host on which HiveServer2 is installed.
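For example, if HiveServer2 is installed on a host named hs2.example.com (a hypothetical hostname), the resulting value would be a comma-separated list along these lines (or * to allow all hosts):
hadoop.proxyuser.hive.hosts=master1.example.com,hs2.example.com
After saving the change, restart the services Ambari marks as affected so the proxyuser setting takes effect.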
I have configured the ZK server to use SSL (signed cert, truststore, keystore, and a modified zookeeper.properties; all setup done and good). ZooKeeper starts and listens on port 2182 for SSL requests, and there are no errors in the ZooKeeper and Kafka server logs.
#new properties added in kafka/config/zookeeper.properties
secureClientPort=2182
authProvider.x509=org.apache.zookeeper.server.auth.X509AuthenticationProvider
serverCnxnFactory=org.apache.zookeeper.server.NettyServerCnxnFactory
ssl.trustStore.location=/path/to/ssl/kafka.zookeeper.truststore.jks
ssl.trustStore.password=serversecret
ssl.keyStore.location=/path/to/ssl/kafka.zookeeper.keystore.jks
ssl.keyStore.password=serversecret
ssl.clientAuth=need
Now, to connect to the secure ZooKeeper with the ZK CLI, I am following a similar approach: create a zk-client cert, get it signed, and create a truststore and keystore for it. I create the properties file and try to connect to the ZK server, but I get an error:
Command not found: Command not found /path/to/ssl/zookeeper-client.properties
$ kafka/bin/zookeeper-shell.sh localhost:2182 -zk-tls-config-file /Users/path/to/ssl/zookeeper-client.properties
Connecting to localhost:2182
ZooKeeper -server host:port cmd args
addauth scheme auth
close
.....
Command not found: Command not found /Users/path/to/ssl/zookeeper-client.properties
My zookeeper-client.properties looks like this:
$cat /Users/path/to/ssl/zookeeper-client.properties
#zookeeper.connect=localhost:2182
zookeeper.clientCnxnSocket=org.apache.zookeeper.ClientCnxnSocketNetty
zookeeper.ssl.client.enable=true
zookeeper.ssl.protocol=TLSv1.2
zookeeper.ssl.truststore.location=/Users/path/to/ssl/kafka.zookeeper-client.truststore.jks
zookeeper.ssl.truststore.password=serversecret
zookeeper.ssl.keystore.location=/Users/path/to/ssl/kafka.zookeeper-client.keystore.jks
zookeeper.ssl.keystore.password=serversecret
Kafka server logs at the start of ZooKeeper:
[2021-07-16 11:27:38,676] INFO binding to port 0.0.0.0/0.0.0.0:2181 (org.apache.zookeeper.server.NettyServerCnxnFactory)
[2021-07-16 11:27:43,760] INFO bound to port 2181 (org.apache.zookeeper.server.NettyServerCnxnFactory)
.....
[2021-07-16 11:27:43,819] INFO Using org.apache.zookeeper.server.NettyServerCnxnFactory as server connection factory (org.apache.zookeeper.server.ServerCnxnFactory)
[2021-07-16 11:27:43,819] INFO binding to port 0.0.0.0/0.0.0.0:2182 (org.apache.zookeeper.server.NettyServerCnxnFactory)
[2021-07-16 11:27:43,821] INFO bound to port 2182 (org.apache.zookeeper.server.NettyServerCnxnFactory)
...
When I try to connect to port 2182 with the zk-client, the server logs don't show an entry (probably because it never connects, since the command to initiate the connection fails).
I am using kafka_2.12, which bundles zookeeper-3.5.7.
What am I missing here? To me the configurations look as expected, and the zk-cli shouldn't throw this error.
References:
https://atsc.com.sg/docs/edp/7-security/zookeeper-mutual-tls/
https://docs.confluent.io/platform/current/security/zk-security.html
Thanks,
JE
I think the problem is that your CLI is running from an older version that does not yet support this parameter. Check your execution path: are you truly executing from the "current" version?
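For example, a quick way to confirm which Kafka you are actually invoking and which ZooKeeper jar it ships with (paths assume the layout shown in the question):
$ kafka/bin/kafka-topics.sh --version
$ ls kafka/libs/ | grep -i zookeeper
If the zookeeper-shell.sh being run comes from an older Kafka whose shell does not yet support -zk-tls-config-file, the extra arguments are passed through as a ZooKeeper shell command, which would explain the "Command not found" output above.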
I have installed RabbitMQ on two Linux machines, and both worked well. Then I ran the command rabbitmqctl join_cluster rabbit@gz2, but it does not work. The error info is:
Error: unable to connect to nodes [rabbit@gz2]: nodedown
attempted to contact: [rabbit@gz2]
rabbit@gz2:
connected to epmd (port 4369) on gz2
epmd reports node 'rabbit' running on port 25672
TCP connection succeeded but Erlang distribution failed
suggestion: hostname mismatch?
suggestion: is the cookie set correctly?
suggestion: is the Erlang distribution using TLS?
You need to ensure both RabbitMQ nodes are using the same cookie file. Copy the file /var/lib/rabbitmq/.erlang.cookie from one node to the other, then restart RabbitMQ on the node to which you copied the file. You will be able to create a cluster after that.
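A minimal sketch of that, assuming the node issuing join_cluster is called gz1 (a hypothetical hostname) and RabbitMQ is managed by systemd:
# on gz2, copy its cookie over to the other node
scp /var/lib/rabbitmq/.erlang.cookie root@gz1:/var/lib/rabbitmq/.erlang.cookie
# on gz1, fix ownership and permissions, restart, then join the cluster
chown rabbitmq:rabbitmq /var/lib/rabbitmq/.erlang.cookie
chmod 400 /var/lib/rabbitmq/.erlang.cookie
systemctl restart rabbitmq-server
rabbitmqctl stop_app
rabbitmqctl join_cluster rabbit@gz2
rabbitmqctl start_app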
Clustering and the Erlang cookie is documented here.
NOTE: the RabbitMQ team monitors the rabbitmq-users mailing list and only sometimes answers questions on StackOverflow.
I have a NiFi cluster with one ZooKeeper node and five NiFi nodes. I want SSL encryption from the ZooKeeper server to the NiFi client.
The NiFi documentation says:
Support for SSL in ZooKeeper is being actively developed and is expected to be available in the 3.5.x release version.
The new ZooKeeper 3.5.3-beta has SSL capabilities.
I installed ZooKeeper 3.5.3, but I am unable to secure the connection with SSL: I am getting NotSslRecordException.
How can I run Nifi with a secure zookeeper using SSL?
Thank you
It requires more than just running ZooKeeper 3.5.x. There is code in NiFi that uses the ZooKeeper client, and that code is not based on the 3.5.x client, so there is no way for NiFi to make an SSL connection.
Note that you also need to set up ZooKeeper to use SSL security, for example:
zookeeper.ssl.keyStore.location="/path/to/your/keystore"
zookeeper.ssl.keyStore.password="keystore_password"
zookeeper.ssl.trustStore.location="/path/to/your/truststore"
zookeeper.ssl.trustStore.password="truststore_password"
Full documentation here: https://cwiki.apache.org/confluence/display/ZOOKEEPER/ZooKeeper+SSL+User+Guide
I set up a 2-node Hadoop cluster, and running start-dfs.sh and start-yarn.sh works nicely (i.e. all expected services are running, no errors in the logs).
However, when I actually try to run an application, several tasks fail:
15/04/01 15:27:53 INFO mapreduce.Job: Task Id :
attempt_1427894767376_0001_m_000008_2, Status : FAILED
I checked the yarn and datanode logs, but nothing is reported there.
In the userlogs, the syslog files on the slave node all contain the following error message:
2015-04-01 15:27:21,077 INFO [main] org.apache.hadoop.ipc.Client:
Retrying connect to server:
slave.domain.be./127.0.1.1:53834. Already tried 9 time(s);
retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10,
sleepTime=1000 MILLISECONDS)
2015-04-01 15:27:21,078 WARN [main]
org.apache.hadoop.mapred.YarnChild:
Exception running child :
java.net.ConnectException: Call From
slave.domain.be./127.0.1.1 to
slave.domain.be.:53834 failed on connection exception:
java.net.ConnectException: Connection refused; For more details see:
http://wiki.apache.org/hadoop/ConnectionRefused at
sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
So the problem is that the slave cannot connect to itself.
I checked whether there is a process running on the slave node listening at port 53834, but there is none.
However, all 'expected' ports are being listened on (50020, 50075, ...). Nowhere in my configuration have I used port 53834, and it's a different port on every run.
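(For reference, checking for a listener on a given port looks roughly like this; netstat is assumed here, ss works as well:)
$ sudo netstat -tlnp | grep 53834             # nothing listening on the failing port
$ sudo netstat -tlnp | grep -E '50020|50075'  # the expected Hadoop ports are bound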
Any ideas on fixing this issue?
Your error might be due to the loopback address in your hosts file. Go to the /etc/hosts file and comment out the line with 127.0.1.1 on your slave nodes and, if necessary, on the master node. Then start the Hadoop processes.
EDITED:
Do this in a terminal to edit the hosts file when you are not logged in as root:
sudo bash
Enter your current user's password to get a root shell. You can now edit your hosts file using:
nano /etc/hosts
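For example, the slave's /etc/hosts would end up looking roughly like this (the 192.168.x.x addresses are placeholders for your actual cluster IPs):
# 127.0.1.1   slave.domain.be    <- commented out
127.0.0.1     localhost
192.168.1.10  master.domain.be   master
192.168.1.11  slave.domain.be    slave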
Apache Mesos fails to show slave usage when you select slaves in the Mesos GUI. The web console also shows "failing when trying to load resource."
This is a common issue when running on EC2 or other cloud providers where machines have both an external and an internal IP. Mesos reports the internal IP in the UI, so if you're using the web UI from outside of EC2, the URLs won't work.
The current Mesos master and the latest 0.15 release candidate fix this issue by adding a --hostname command-line option to set the hostname that gets reported in the UI.
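For example, on an EC2 master (any flags beyond --hostname depend on how you launch it; the FQDN below is the one from the example further down):
mesos-master --hostname=ec2-54-224-191-136.compute-1.amazonaws.com ...
The slaves accept a --hostname option as well, so their public FQDNs can be set the same way.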
If you're running <0.15, you can fix the issue by adding all the hosts in your Mesos cluster to /etc/hosts like so:
<private ip> <public fqdn> <machine hostname>
for example:
10.98.58.170 ec2-54-224-191-136.compute-1.amazonaws.com ec2-54-224-191-136