Hive query is giving error - hive

I have a Hive table called testdata with the following columns:
name
age
gender
From the Hive prompt, when I issue the command "select * from testdata", it shows me the whole dataset. But when I issue the command "select name from testdata", it fails with this error:
java.net.NoRouteToHostException: No Route to Host from [NAMENODE_IP] to [CLUSTER_IP]:35946 failed on socket timeout exception: java.net.NoRouteToHostException: No route to host; For more details see: http://wiki.apache.org/hadoop/NoRouteToHost.
Can anybody please help me find out what exactly I am doing wrong?
My Hadoop version is 2.2.0 and my Hive version is 0.11.0.

Check your hosts file: the system's current IP should be the first entry in it.
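As a rough way to check this, the sketch below prints the first non-blank, non-comment entry of a hosts file so it can be compared with the machine's current IP. The helper name first_hosts_entry is hypothetical, and the default path /etc/hosts assumes a typical Linux layout.

```shell
#!/bin/sh
# Print the first non-blank, non-comment line of a hosts file.
# first_hosts_entry is a hypothetical helper name; the default path
# /etc/hosts is an assumption about the host's layout.
first_hosts_entry() {
    hosts_file="${1:-/etc/hosts}"
    awk '
        /^[ \t]*#/ { next }      # skip comment lines
        /^[ \t]*$/ { next }      # skip blank lines
        { print; exit }          # first real entry, then stop
    ' "$hosts_file"
}

# Usage:
#   first_hosts_entry              # inspects /etc/hosts
#   first_hosts_entry /tmp/hosts   # inspects another file
```

If the line printed is a loopback entry rather than the node's real address, that is the case this answer is describing.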

I ran into the same trouble, so I thought I would share what I did to fix it.
1) Disable the firewall on the name node and see if the query works.
2) If it does, your firewall is blocking network communication between nodes. You will have to manually add rules to allow connections between your nodes' IPs.
This very detailed answer on iptables explains how to do so:
https://serverfault.com/questions/30026/whitelist-allowed-ips-in-out-using-iptables/30031#30031
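A minimal sketch of what such rules could look like, assuming iptables and hypothetical node addresses (10.0.0.11 and 10.0.0.12). The helper only prints the commands so they can be reviewed before being run as root; print_allow_rules is an illustrative name and does not come from the linked answer.

```shell
#!/bin/sh
# Print (do not run) one iptables rule per peer node IP. Each rule is
# inserted at the top of the INPUT chain and accepts all traffic from
# that address. The helper name and the addresses are hypothetical.
print_allow_rules() {
    for ip in "$@"; do
        echo "iptables -I INPUT -s $ip -j ACCEPT"
    done
}

# Usage with hypothetical addresses:
#   print_allow_rules 10.0.0.11 10.0.0.12
# After reviewing the output, it can be applied as root, e.g.:
#   print_allow_rules 10.0.0.11 10.0.0.12 | sudo sh
```

Rules like these would have to be added on every node that needs to accept connections from the others, and they are not persistent across reboots unless saved with your distribution's mechanism.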

You need a correct Hadoop configuration. It seems that Hive cannot connect to the JobTracker (on Hadoop 2.x with YARN, the ResourceManager).
When you run select * from testdata, Hive does not use MapReduce to get the result.
When you run select name from testdata, Hive calls Hadoop to start a MapReduce job.
So make sure your Hadoop configuration is correct.

Installing RCU for Oracle Data Integrator fails with errors

I am trying to run the Oracle Repository Creation Utility (RCU), but every time I try to install the development repository it fails with errors.
Steps to reproduce the issue:
1. I run the rcu bat file.
2. Create repository / system load and product load.
3. Choose Oracle as the database type, hostname localhost, port 1521, service name xe, username sys and the password (I am able to log in to the Oracle database with this login information; I am using Oracle 18c Express).
4. I am using the prefix dev.
5. I am adding passwords for the schemas, the supervisor, and the repository user.
6. I am not touching the tablespaces.
7. I start the install and several error messages appear, such as:
ORA-65096 invalid common user or role name
ORA-01917 DEV_STB user does not exist
ORA-00955 The name is already in use by another object.
My question: what could be the problem with the RCU installation, and how can I resolve it?
The funny thing is that I am installing ODI and RCU by following a step-by-step video, and still something went wrong...
The issue here is that you are logging in to the CDB with the service name XE. You need to log in to the PDB (pluggable database).
Just change the service name from XE to XEPDB1 when connecting to the DB through RCU and your issue should be resolved.

Zeek cluster fails with pcap_error: socket: Operation not permitted (pcap_activate)

I'm trying to set up a Zeek IDS cluster (v.3.2.0-dev.271) on 3 Ubuntu 18.04 LTS hosts, to no avail. Running the zeek deploy command fails with the following output:
fatal error: problem with interface ens3 (pcap_error: socket: Operation not permitted (pcap_activate))
I have followed the official documentation (which is pretty generic at best) and set up passwordless SSH authentication between the zeek nodes.
I also preemptively created the /usr/local/zeek path on all hosts and gave the zeek user full permissions on that directory. The documentation says: "The Zeek user must be able to either create this directory or, where it already exists, must have write permission inside this directory on all hosts."
The documentation also says that on the worker nodes this user must have access to the target network interface in promiscuous mode.
My zeek user is a sudoer AND a member of netdev group on all 3 nodes. Yet, the cluster deployment fails. Apparently, when zeekctl establishes the SSH connection to the workers it cannot get a hold of the network interfaces and set caps.
Eventually I was able to successfully run the cluster by following this article - however it requires you to set up the entire cluster as root, which I would like to avoid if at all possible.
So my question is, is there anything blatantly obvious that I am missing? To the best of my knowledge this setup should work, otherwise I don't know how to force zeekctl to run 'sudo' in front of every SSH command it is supposed to run on the workers, or how to satisfy this requirement.
Any guidance will be greatly appreciated, thanks!
I was experiencing the same error on my standalone setup and found this question by googling it. More googling brought me to a few blogs, including one where the comments mentioned the same error. The author suggested granting the binaries capabilities using setcap:
$ sudo setcap cap_net_raw,cap_net_admin=eip /usr/local/zeek/bin/zeek
$ sudo setcap cap_net_raw,cap_net_admin=eip /usr/local/zeek/bin/zeekctl
After running them both, my instance of zeek is now running successfully.
Source: https://www.ericooi.com/zeekurity-zen-part-i-how-to-install-zeek-on-centos-8/#comment-1586
So, just in case someone else stumbles upon the same issue - I figured out what was happening.
I streamlined the cluster deployment with Ansible (using 'become' directive at task level) and did not elevate when running the handlers responsible for issuing the zeekctl deploy command.
Once I did, the Zeek Cluster deployment succeeded.

Aginity Workbench Redshift Server Connection Error

I've downloaded Aginity Workbench for Redshift, version 4.3.
I'm receiving the error message
The Connection is not open
I selected my server endpoint by using this document: http://docs.aws.amazon.com/redshift/latest/gsg/rs-gsg-connect-to-cluster.html
example from link: examplecluster.userid.us-west-2.redshift.amazonaws.com
Port is 5439
I noticed right away that I could select a database from the dropdown. If I supply the database name I still get the error message "The connection is not open". Does anybody know what I'm missing? Thanks.
Using a lowercase login ID should fix your issue.
This could be several things.
Does the instance require SSL connections? If so, select 'require' for the SSL mode in the connection setup.
Also, the link doesn't typically contain a 'userid' (examplecluster.userid.us-west-2.redshift.amazonaws.com). You can get the endpoint from the configuration tab in AWS management console for the redshift instance. The page looks just like the image referenced in the doc here.
http://docs.aws.amazon.com/redshift/latest/gsg/rs-gsg-connect-to-cluster.html
"The connection is not open" is an error raised at the initial connection stage, so please check whether a firewall is in the way. Run this command from your computer:
telnet examplecluster.userid.us-west-2.redshift.amazonaws.com 5439
If it cannot connect, check the security group settings on that Redshift cluster and open inbound port 5439 to the public or to your own IP address.
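If telnet is not installed, the same reachability check can be sketched in shell. This assumes bash is available (for its /dev/tcp redirection) along with the coreutils timeout command; port_open is a hypothetical helper name, and the host in the usage comment is the placeholder endpoint from the question.

```shell
#!/bin/sh
# Return 0 if a TCP connection to host:port succeeds within 5 seconds,
# non-zero otherwise. Delegates to bash for its /dev/tcp pseudo-device
# and to coreutils `timeout` to bound the wait. port_open is a
# hypothetical helper name.
port_open() {
    timeout 5 bash -c 'exec 3<>"/dev/tcp/$1/$2"' _ "$1" "$2" 2>/dev/null
}

# Usage (placeholder endpoint from the question):
#   if port_open examplecluster.userid.us-west-2.redshift.amazonaws.com 5439; then
#       echo "port reachable"
#   else
#       echo "blocked or unreachable"
#   fi
```

A failure here only says the TCP connection could not be made; it does not distinguish a security group rule from a local firewall or a wrong endpoint.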

Hive multiple users on same tables

Is it possible to have tables that are shared in Hive?
I mean a user creates a Hive table, and later multiple users can work on that same table simultaneously.
I heard about Derby and an individual metastore for each user. But the individual metastore option does not allow users to work simultaneously on the same set of tables, right?
Is there any other way to do this?
I ask because when we try to access Hive at the same time, we get the following error:
Caused by: ERROR XSDB6: Another instance of Derby may have already booted the database /root/metastore_db.
ERROR XSDB6: Another instance of Derby may have already booted the database /root/metastore_db.
This error can occur when you try to start more than one instance of the Hive shell. The lock may persist in the background (due to an improper disconnection) even after closing the tab/terminal.
The solution is to find the process using grep:
ps aux | grep hive
Now kill the process using:
kill -9 hive_process_id (e.g. kill -9 21765)
Restart the Hive shell. It should work fine.
I use Ubuntu, and this error occurred when I opened Hive from the same location in two separate terminal windows. The system interprets this as multiple users. Close one of the terminal windows/tabs and that should do the trick.
This occurs when two instances of a Spark application (e.g. spark-shell, spark-sql, or start-thriftserver) are started in the same directory using the embedded Derby metastore.
When not configured by hive-site.xml, the Spark context automatically creates metastore_db in the current directory (see Spark docs). To avoid this, start the second Spark application in a different directory, or use a persistent metastore (e.g. Hive Derby in Server Mode) and configure it via hive-site.xml.
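A minimal sketch of the server-mode configuration, written as a shell helper that generates the file. The two property names are the standard metastore JDBC settings; the host, port (1527 is Derby's default network port) and database name are assumptions, and write_hive_site is a hypothetical helper name.

```shell
#!/bin/sh
# Write a minimal hive-site.xml that points the metastore at a Derby
# network server instead of the embedded, single-process Derby.
# Host, port and database name in the URL are assumptions; adjust them
# to wherever the Derby server actually runs.
write_hive_site() {
    cat > "$1" <<'EOF'
<?xml version="1.0"?>
<configuration>
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:derby://localhost:1527/metastore_db;create=true</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>org.apache.derby.jdbc.ClientDriver</value>
  </property>
</configuration>
EOF
}

# Usage (the target path is an assumption about your install):
#   write_hive_site "$HIVE_HOME/conf/hive-site.xml"
```

For this to work, a Derby network server has to be running at that address and the Derby client jar has to be on the classpath. With every shell pointing at the same server, several users can then work against the same tables at once.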

How to connect to a file-based HSQLDB database with sqltool?

I have tried to follow the instructions in chapter 1 of the HSQLDB doc and started my server like:
java -cp hsqldb-2.2.5/hsqldb/lib/hsqldb.jar org.hsqldb.Server -database.0 file:#pathtodb# -dbname.0 xdb
and I have reason to believe that worked, because it said (among other things):
Database [index=0, id=0, db=file:#pathtodb#, alias=xdb] opened sucessfully in 2463 ms.
However at the next step I try to connect using SqlTool and based on chapter 8 of the documentation I came up with this command to connect:
java -jar hsqldb-2.2.5/hsqldb/lib/sqltool.jar localhost-sa
Which gives the following error:
Failed to get a connection to 'jdbc:hsqldb:hsql://localhost' as user "SA".
Cause: General error: database alias does not exist
while the server says:
[Server#60072ffb]: [Thread[HSQLDB Connection #4ceafb71,5,HSQLDB Connections #60072ffb]]: database alias= does not exist
I am at a loss. Should I specify an alias when connecting somehow? What alias would my database have then? The server did not say anything about that...
(Also, yes, I have copied the sqltool.rc file to my home folder.)
Your server has -dbname.0 xdb as the database alias. Therefore the connection URL should include xdb. For example jdbc:hsqldb:hsql://localhost/xdb
The server can serve several databases with different aliases. The URL without alias corresponds to a server command line that does not include the alias setting.
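One way to apply this with SqlTool is to add a urlid block to sqltool.rc whose URL carries the alias. The sketch below appends such a block; the urlid name localhost-xdb and the helper name add_xdb_urlid are hypothetical, and the empty password assumes the default SA account.

```shell
#!/bin/sh
# Append a urlid block to a SqlTool rc file. The URL includes the
# server's database alias (xdb, from -dbname.0 in the question).
# "localhost-xdb" is a hypothetical urlid; the empty password assumes
# the default SA account has not been changed.
add_xdb_urlid() {
    cat >> "$1" <<'EOF'

urlid localhost-xdb
url jdbc:hsqldb:hsql://localhost/xdb
username SA
password
EOF
}

# Usage:
#   add_xdb_urlid "$HOME/sqltool.rc"
#   java -jar hsqldb-2.2.5/hsqldb/lib/sqltool.jar localhost-xdb
```

The existing localhost-sa entry is left untouched, so it can still be used against a server that was started without an alias.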
java -jar /hsqldb-2.3.2/hsqldb/lib/sqltool.jar --inlineRc=url=jdbc:hsqldb:localhost:3333/runtime,user=sa
Enter password for sa: as2dbadmin
SqlTool v. 5337.
JDBC Connection established to a HSQL Database Engine v. 2.3.2 database
This error has been haunting me for the last 5 hours.
Together with this stupid error: HSQL Driver not working?
If you want to run your HSQLDB from a servlet on Apache Tomcat, you must CLOSE runManagerSwing.bat first. I know it sounds trivial, but even if you create the desired database and then run the Eclipse J22 Servlet with Tomcat afterwards, you will get a bunch of errors. So runManagerSwing.bat must be closed.
See my sqltool answer over on the question "How to see all the tables in an HSQLDB database". The critical piece is setting up your sqltool.rc correctly and putting it in the right location.
You can also use the following statement to get a connection to a file-based store. This can be used if you are running the application on Windows.
connection = DriverManager.getConnection("jdbc:hsqldb:file:///c:/hsqldb/mydb", "SA", "");