How to fix the Zookeeper error for HBase - virtual-machine

My host OS is 64-bit Windows 7. I used VM Player to create two CentOS 5.6 virtual machines, both with bridged networking. I installed HBase on both systems; one is the master and the other is the slave. When I enter the HBase shell and run status 'details', I get an error on each machine.
The error from the master is:
zookeeper.ZKConfig: no valid quorum servers found in zoo.cfg ERROR:
org.apache.hadoop.hbase.ZooKeeperConnectionException: An error is
preventing HBase from connecting to ZooKeeper
And the error from the slave is:
ERROR: org.apache.hadoop.hbase.ZooKeeperConnectionException: HBase is
able to connect to ZooKeeper but the connection closes immediately.
This could be a sign that the server has too many connections (30 is
the default). Consider inspecting your ZK server logs for that error
and then make sure you are reusing HBaseConfiguration as often as you
can. See HTable's javadoc for more information.
Any suggestions would be appreciated.
Thanks a lot.

Check whether the following lines are in your .bashrc. If not, add them and restart all HBase services (do not forget to also run the exports manually in your current shell). That did it for me with a pseudo-distributed installation. My problem (and maybe yours as well) was that HBase wasn't detecting its configuration.
export HADOOP_CONF_DIR=/etc/hadoop/conf
export HBASE_CONF_DIR=/etc/hbase/conf
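To confirm that the configuration directory is actually visible from your shell, a small check like the one below may help. It is only a sketch: it assumes the usual hbase-site.xml file name and the hbase.zookeeper.quorum property, and the helper name read_quorum is made up for illustration.

```python
import os
import xml.etree.ElementTree as ET


def read_quorum(conf_dir):
    """Return the hbase.zookeeper.quorum value from hbase-site.xml, or None."""
    site = os.path.join(conf_dir, "hbase-site.xml")
    if not os.path.isfile(site):
        return None
    root = ET.parse(site).getroot()
    for prop in root.findall("property"):
        if prop.findtext("name", "").strip() == "hbase.zookeeper.quorum":
            # An empty value is treated the same as a missing one.
            return prop.findtext("value", "").strip() or None
    return None


if __name__ == "__main__":
    conf_dir = os.environ.get("HBASE_CONF_DIR")
    if not conf_dir:
        print("HBASE_CONF_DIR is not set in this shell")
    else:
        print("hbase.zookeeper.quorum =", read_quorum(conf_dir))
```

If this prints None on the master, the "no valid quorum servers" message is at least consistent with HBase not finding a quorum setting in the directory it reads.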

I see this very often on my machine. I don't have a failsafe cure, but I end up running stop-all.sh and deleting every place where Hadoop and DFS (it's a DFS failure) store their temp files. It seems to happen after my computer goes to sleep while DFS is running.
I am going to experiment with single-user mode to avoid this. I don't need distribution while developing.

Related

Migrate a (storm+nimbus) cluster to a new Zookeeper, without losing the information or having downtime

I have a nimbus+storm cluster using Zookeeper, and I wish to move my cluster and point it to a new Zookeeper. Do you know if this is possible? Can I keep all the information of the old zookeeper and save it in the new one? Is it possible to do it without downtime?
I have searched the internet for this procedure but have not found much.
Would it be as simple as changing the storm.yml file on both the master and worker nodes? Do I need a restart afterwards?
# storm.zookeeper.servers:
# - "server1"
# - "server2"
If you just change storm.yml, you'd be pointing Storm at a new empty Zookeeper cluster, and it will be like you just installed Storm from scratch. More likely, you want to grow your Zookeeper cluster to include your new machines, then update storm.yml to point at the new machines, then shrink the cluster to exclude the machines you want to move away from. That way, your Zookeeper quorum is preserved even though you've moved to other physical machines.
This is easier to do on Zookeeper 3.5 with dynamic reconfiguration http://zookeeper.apache.org/doc/r3.5.5/zookeeperReconfig.html. I'm unsure whether Storm will run on Zookeeper 3.5, but you may consider investigating whether you can upgrade to 3.5 before growing/shrinking the cluster.
Otherwise you will have to do a rolling restart to add the new Zookeeper nodes, then do another one to remove the old machines once the cluster has stabilized.
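As a rough illustration of the grow-then-shrink idea described above, the sketch below lists the ensemble membership at each phase and the majority each phase needs. The host names are hypothetical, and a real migration may well add or remove servers one at a time rather than in a single step.

```python
def majority(n):
    """Smallest number of servers that forms a strict majority of n."""
    return n // 2 + 1


def migration_phases(old_hosts, new_hosts):
    """List the ensemble membership at each phase of a grow-then-shrink move."""
    combined = list(old_hosts) + [h for h in new_hosts if h not in old_hosts]
    return [
        ("start", list(old_hosts)),
        ("grown", combined),      # point storm.yml at the new hosts during this phase
        ("shrunk", list(new_hosts)),
    ]


if __name__ == "__main__":
    # Hypothetical host names, for illustration only.
    old = ["zk-old1", "zk-old2", "zk-old3"]
    new = ["zk-new1", "zk-new2", "zk-new3"]
    for label, members in migration_phases(old, new):
        print(f"{label}: {members} (quorum needs {majority(len(members))})")
```

The point of the middle phase is that the data is replicated onto the new machines before the old ones leave, so Storm never sees an empty Zookeeper.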
Let me suggest a hack here. This script was provided by Microsoft for migration on HDInsight clusters, but you can change it and use it for your needs.
The script can be downloaded from https://github.com/hdinsight/hdinsight-storm-examples/tree/master/tools/zkdatatool-1.0 and you can read more about it here:
https://blogs.msdn.microsoft.com/azuredatalake/2017/02/24/restarting-storm-eventhub/
I have used it in the past when I had to migrate some stuff between PaaS clusters, and I can confirm it works OK!

Meld error when setting up a new cluster

I am evaluating DataStax OpsCenter on a virtual machine to start managing/monitoring Cassandra. I am following the online docs to create cluster topology models via OpsCenter LCM, but the error message doesn't give me enough information to continue. The job status is:
error- MeldError, 400 Client Error: Bad Request for url: http://[ip_address]:8888/api/v1/lcm/internal/nodes/6185c776-9034-45b4-a54f-6eb9511274a2/package_information
Meld failed on name="testnode1" ssh-management-address=[ip_address]" node-id="6185c776-9034-45b4-a54f-6eb9511274a2" node-name="testnode1" job-id="1b792c69-bcca-489f-ad12-a6285ba84d59" stdout=" Meld has started... " stderr=""
My question is: what might be wrong, and is there any hint on how to resolve it?
I am new to the Cassandra and DataStax communities, so please forgive me if this is a silly question!
Q: I used to be a Buildbot user, and the DataStax agent looks like a Buildbot slave. Why don't we need to set up the agent on the remote machine for it to work with OpsCenter? Is the agent's working directory configured in OpsCenter?
The opscenterd.log: https://pastebin.com/TJsvmr6t
Following the tool compatibility table at https://docs.datastax.com/en/landing_page/doc/landing_page/compatibility.html#compatibilityDocument__opsc-compatibility , I ended up using OpsCenter v5.2 for monitoring and basic DB operations. After some trial and error with the agent's .yaml and the .conf of Cassandra 2.2, the Dashboard works!
Knowledge gained:
OpsCenter 5.2 actually works with Cassandra 2.2, which is not listed in the compatibility table.
For beginners: if you are not sure where to start, try installing all the components on one machine to get an idea of the minimal working setup, and from there configure the actual dev/test/production environment.

Unable to connect to local RabbitMQ on Windows 10

I've installed RabbitMQ (the latest version downloadable from the RabbitMQ website) on my Windows 10 machine. It was installed with Erlang 19.1.
I'm trying to install RabbitMQ Web UI Management Tools using the following command (using RabbitMQ Command Prompt):
rabbitmq-plugins enable rabbitmq_management
I'm getting the following error:
The directory name is invalid.
The filename, directory name, or volume label syntax is incorrect.
The filename, directory name, or volume label syntax is incorrect.
Plugin configuration unchanged.
Applying plugin configuration to rabbit@[0x7FF9A8527044]... failed.
* Could not contact node rabbit@[0x7FF9A8527044].
Changes will take effect at broker restart.
* Options: --online - fail if broker cannot be contacted.
--offline - do not try to contact broker.
I've searched SO and tried stopping and restarting the service and overriding the Erlang cookie, but nothing helps.
I think there's a problem with RabbitMQ itself. The service is marked as started, but if I try to telnet to the default port (5672), it fails (it's not a firewall issue; I've disabled the firewall).
Also, I don't see any log files created for RabbitMQ or any related Event Log messages, so it's hard to diagnose the exact problem.
I also tried uninstalling and reinstalling both Erlang and RabbitMQ. That still didn't help.
How do I further diagnose the problem?
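The telnet test on port 5672 described above can also be scripted, which makes it easier to repeat after each change. This is a minimal sketch; the function name port_is_open is made up, and it only shows whether something accepts TCP connections on the port, not whether it is a healthy broker.

```python
import socket


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False


if __name__ == "__main__":
    # 5672 is the default port mentioned in the question.
    print("5672 reachable:", port_is_open("localhost", 5672))
```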
I found a solution to the problem. Downgrading Erlang did not work in my case, but I left it on Erlang 18 in case there were other issues with version 19.
What caught my eye was this line: Applying plugin configuration to rabbit@[0x7FF9A8527044]... failed. It seems to be trying to connect to a rabbit instance at the wrong machine name.
I then ran rabbitmqctl.bat status, which also failed but again showed that it was trying to connect to [0x7FF9A8527044], while the node name was rabbit@my-machine-name. So I started reading the configuration section on the RabbitMQ website, and the solution was simple: set the node name manually.
All I had to do was add an environment variable named RABBITMQ_NODENAME with the value rabbit@localhost. And that's it. Problem solved!
You may be running into an Erlang 19 incompatibility. There has been some history of Erlang 19 support problems with RabbitMQ. Try installing Erlang 18 instead.
If that fails, I would recommend using Docker for Windows and installing / running RabbitMQ in that. I've moved all my services like RabbitMQ, MongoDB, etc. into Docker containers and it's made my life as a dev so much simpler.
In my case I had to trash the local account config located at %APPDATA%\RabbitMQ\.
Deleting the entire folder and reinstalling the service did the trick.
RabbitMQ 3.6.14
Erlang 20.1 OTP

Why is Apache Accumulo not running after restart?

I just installed Apache Accumulo. It initialized and ran successfully, but after a restart, when I run the start-all.sh command, it gets stuck on waiting for Accumulo to be initialized. What's wrong here?
If you've restarted your computer, be sure that you have also restarted Hadoop (HDFS) and Zookeeper, and verify that they are running correctly. Accumulo requires both of them to be running.
It sounds like you might be running this locally on a single machine. If that's the case, also verify your Hadoop HDFS settings and make sure it's not writing its data to /tmp, which gets wiped out occasionally between restarts.
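One way to check the /tmp concern is to scan the Hadoop site files for storage directories that point under /tmp. The sketch below assumes the Hadoop 2+ property names hadoop.tmp.dir, dfs.namenode.name.dir and dfs.datanode.data.dir; the helper name dirs_under_tmp is made up, and you would feed it the contents of your own core-site.xml or hdfs-site.xml.

```python
import xml.etree.ElementTree as ET

# Property names assumed here; older Hadoop releases used different names.
WATCHED = ("hadoop.tmp.dir", "dfs.namenode.name.dir", "dfs.datanode.data.dir")


def dirs_under_tmp(site_xml_text):
    """Return {property: value} for watched properties that point under /tmp."""
    flagged = {}
    root = ET.fromstring(site_xml_text)
    for prop in root.findall("property"):
        name = prop.findtext("name", "").strip()
        value = prop.findtext("value", "").strip()
        if name not in WATCHED:
            continue
        # Values may be comma-separated lists and may carry a file:// prefix.
        for part in value.split(","):
            path = part.strip()
            if path.startswith("file://"):
                path = path[len("file://"):]
            if path == "/tmp" or path.startswith("/tmp/"):
                flagged[name] = value
    return flagged
```

An empty result does not prove the setup is safe: when these properties are not set at all, Hadoop falls back to defaults that live under hadoop.tmp.dir, which itself defaults to a directory in /tmp.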

Zookeeper: It is probably not running

I am trying to start Zookeeper on a remote virtual machine. I use it regularly for my project and have not had any problems starting it. But lately, when I try to start the server, I get an error.
When I run ./zkServer.sh start, it reports that the Zookeeper server started.
When I check the status using ./zkServer.sh status, it shows "Error contacting service. It is probably not running."
I am working with 5 virtual machines in total. All of them were fine initially. The problem first appeared on machine 1, but recently I have had the same problem on all of my virtual machines. Can someone tell me what the issue is and suggest a way to fix it?
Most probably the Zookeeper server exited.
If you are running it on a Linux box, use the usual Linux commands to check whether the process is still alive. Some of them:
ps -ef | grep -i zookeeper
jps
etc.
Also, try running it in the foreground:
zkServer.sh start-foreground
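Besides looking for the process, you can ask the server directly with Zookeeper's four-letter command ruok, which a healthy server answers with imok. The sketch below assumes the default client port 2181 and that four-letter commands are allowed on your server (newer releases restrict them through the 4lw.commands.whitelist setting); the function name zk_ruok is made up.

```python
import socket


def zk_ruok(host, port=2181, timeout=3.0):
    """Send 'ruok' and return True only if the server answers 'imok'."""
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            sock.sendall(b"ruok")
            chunks = []
            while True:
                data = sock.recv(64)
                if not data:  # server closed the connection after replying
                    break
                chunks.append(data)
            return b"".join(chunks) == b"imok"
    except OSError:
        # Connection refused, timeout, or host not reachable.
        return False


if __name__ == "__main__":
    print("imok" if zk_ruok("localhost") else "no healthy answer on port 2181")
```

A False result only tells you that the server did not answer as expected; the foreground run is still the better way to see why.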
In my case the issue was a $PATH problem.
You can find out what the issue is by running Zookeeper in the foreground:
zkServer.sh start-foreground
I encountered the same problem too. In my case, the Zookeeper server list was not configured identically on each node, so Zookeeper could not form a quorum and the affected nodes could not join the cluster.
Please make sure the server definitions are the same on every node.
For example, every node must have the same server definitions, as below:
server.0=ip0:2888:3888
server.1=ip1:2888:3888
server.2=ip2:2888:3888
server.3=ip3:2888:3888
server.4=ip4:2888:3888
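If you have copies of zoo.cfg from each node, comparing the server entries can be automated. This is a sketch under the assumption that you have already collected each node's file contents as text; the function names are made up for illustration.

```python
def server_lines(cfg_text):
    """Extract the server.N=host:port:port entries from zoo.cfg text."""
    entries = {}
    for raw in cfg_text.splitlines():
        line = raw.strip()
        # Commented-out lines start with '#', so they are skipped here too.
        if line.startswith("server.") and "=" in line:
            key, value = line.split("=", 1)
            entries[key.strip()] = value.strip()
    return entries


def mismatched_nodes(configs):
    """Given {node: zoo.cfg text}, return the nodes whose server list
    differs from the first node's list."""
    items = list(configs.items())
    if not items:
        return []
    reference = server_lines(items[0][1])
    return [node for node, text in items[1:] if server_lines(text) != reference]
```

The order of the entries inside a file does not matter for this comparison, only the set of server.N definitions and their values.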
In my case, the clientPort attribute's value was somehow missing on one of the boxes, so the console showed an invalid config path. Running 'zkServer.sh start-foreground' helped me investigate and find the root cause.