RJDBC Hive, connect failed

I followed multiple tutorials to try to connect to Hive with RJDBC, without success.
Here is what I have:
library(DBI)
library(rJava)
library(RJDBC)

driver <- JDBC("org.apache.hive.jdbc.HiveDriver",
               classPath = list.files("/home/cdsw/R", pattern = "jar$", full.names = TRUE),
               identifier.quote = "`")

USERNAME <- "MyUser"
PASSWORD <- "MySecretPassWord"
HOSTNAME <- "my.host.net"
PORT     <- 10000
server   <- sprintf("jdbc:hive2://%s:%s", HOSTNAME, PORT)

conn <- dbConnect(driver, server, USERNAME, PASSWORD)
I have downloaded the jar files and placed them at "/home/cdsw/R/":
list.files("/home/cdsw/R",pattern="jar$",full.names=T)
[1] "/home/cdsw/R/hadoop-common-2.6.0-cdh5.16.99.jar"
[2] "/home/cdsw/R/hive-jdbc-1.1.0-cdh5.16.99.jar"
I've also tried more recent versions, always in sync with the same Cloudera version (mine is 5.XX).
I'm quite sure the HOSTNAME is correct since I've made it work with impyla in Python with the same Hostname/port.
The Error:
Error in .jcall(drv@jdrv, "Ljava/sql/Connection;", "connect", as.character(url)[1], :
java.lang.NoClassDefFoundError: org/apache/thrift/TException
From what I understand, I don't have the correct .jars?
Remark:
I cannot install hive-jdbc on the machine since I'm not root. Can I do without it, since I have placed the
hive-jdbc-1.1.0-cdh5.16.99.jar in a folder?
Also, could Kerberos trigger this error?

I needed to download the standalone version of the Hive driver,
hive-jdbc-3.1.2-standalone.jar. The standalone jar does not require a full install of the Hive client.
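The NoClassDefFoundError for org/apache/thrift/TException means that none of the jars on the classpath bundles the Thrift classes, which the standalone jar does. If you want to confirm which jars actually contain a given class before wiring them into RJDBC, a small illustrative helper can scan them (the function name and usage paths are made up for this sketch; only the class resource comes from the error above):

```python
import glob
import zipfile

def jars_containing(jar_glob, class_resource):
    """Return the jars matching jar_glob that bundle the given class
    resource, e.g. 'org/apache/thrift/TException.class'."""
    hits = []
    for jar in glob.glob(jar_glob):
        with zipfile.ZipFile(jar) as zf:
            if class_resource in zf.namelist():
                hits.append(jar)
    return hits

# e.g. jars_containing("/home/cdsw/R/*.jar",
#                      "org/apache/thrift/TException.class")
```

An empty result for the Thrift class reproduces the error above; the standalone jar should show up as a hit.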

Related

Unable to connect to SQL Server with Kerberos when transformers library is installed

I'm trying to connect to an MSSQL database using Kerberos authentication in Python. When my Anaconda environment has just pyodbc installed, I can connect and send queries to the database. But when I add Hugging Face's transformers library to the environment, I get the following error:
Error: ('HY000', '[HY000] [Microsoft][ODBC Driver 17 for SQL Server]SSPI Provider: No credentials were supplied, or the credentials were unavailable or inaccessible. No Kerberos credentials available: No KCM server found (458752) (SQLDriverConnect)')
An example function that works without the transformers library installed is:
import pyodbc

def pyodbc_query(query):
    cnxn = pyodbc.connect(
        Trusted_Connection='Yes',
        Driver='{/opt/microsoft/msodbcsql17/lib64/libmsodbcsql-17.2.so.0.1}',
        Server='servername',
        Database='database'
    )
    cursor = cnxn.cursor()
    cursor.execute(query)
    result = cursor.fetchall()
    return result
I've also tried using sqlalchemy instead of pyodbc, with the same results. My pyodbc version is 4.0.35 and my transformers version is 4.26.0. Has anyone had the same problem?
In case anyone else stumbles on this:
The issue is that transformers pulls in the krb5 library as a dependency, and this package can interfere with the system's Kerberos configuration. You can uninstall just krb5 with conda remove krb5 --force -y if you don't need it for anything else. More information is available in a GitHub issue:
https://github.com/ContinuumIO/anaconda-issues/issues/10772
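As a quick sanity check before (or instead of) removing the package, you can look for the files the conda krb5 package drops into the active environment, since those are what shadow the system Kerberos setup. A minimal sketch; the candidate file names are assumptions about a typical Linux conda layout, not an exhaustive list:

```python
import os
import sys

def conda_krb5_files(prefix=None):
    """Return krb5 files present in a conda env prefix that can shadow
    the system Kerberos configuration and libraries (assumed layout)."""
    prefix = prefix or sys.prefix
    candidates = [
        os.path.join(prefix, "etc", "krb5.conf"),
        os.path.join(prefix, "lib", "libkrb5.so"),
        os.path.join(prefix, "lib", "libgssapi_krb5.so"),
    ]
    return [p for p in candidates if os.path.exists(p)]
```

A non-empty result suggests the environment's own krb5 is intercepting Kerberos calls instead of the system install.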

Why the installing process of R package "RODBC" in "R CMD INSTALL" can't find ODBC driver manager?

I am trying to connect to a Vertica DB from R using the "RODBC" package. The machine I am using is a remote server without direct internet access, so I basically "transfer" all source files from my local machine to the remote server to build the system. To give you clear context, I am listing all my steps in attempting to install the "RODBC" package below:
Step 1 - I downloaded the RODBC_1.3-13.tar.gz source file for RODBC and then tried to install it directly with "R CMD INSTALL". However, I encountered the error "ODBC headers sql.h and sqlext.h not found".
Step 2 - After some research, I found that installing "unixodbc-dev" would potentially solve this issue. Therefore, I downloaded all needed dependencies for "unixodbc-dev" and transferred them to the server, as you can see in the list:
Therefore, I also successfully installed "unixodbc-dev":
However, another error message appears when I tried to re-install the "RODBC" using "sudo R CMD INSTALL /home/mli/RODBC_1.3-13.tar.gz" in which it returns error "no ODBC driver manager found":
As the message indicates, the installation program can't locate my ODBC driver manager. So, I downloaded "vertica-client-7.2.3-0.x86_64.tar.gz" and unzipped it on the server:
So, now my question is: how can I customize the "R CMD INSTALL" command, say with some parameters, to direct the installation program to the driver manager? Or am I going in the right direction? Please let me know. Any help would be really appreciated!!! :)
ADDITION:
I have also tried JDBC: I successfully loaded the "RJDBC" package in R and used the JDBC driver from vertica-client-7.2.3-0.x86_64.tar.gz. Also, I already had "rJava" installed. However, I still got an error when I tried to make the connection. I am listing my result below:
I successfully installed "RJDBC" with "$R CMD INSTALL RJDBC_0.2-5.tar.gz --library=/usr/local/lib/R/site-library/" and then tried the following scripts in R. All the lines execute successfully except line 16:
Based on the error message, I assumed the version of the JDBC driver I was using is too new for the Vertica server. So, I tried an older JDBC driver instead, "vertica-jdk5-6.1.0-0.jar", which I downloaded from this link: http://www.java2s.com/Code/Jar/v/Downloadverticajdk56100jar.htm
So, I moved the file "vertica-jdk5-6.1.0-0.jar" to my home directory on the server and then changed the JDBC driver path in the R script:
As you can see, it still returns the error "FATAL: Unsupported frontend protocol 3.6: server supports 3.0 to 3.5". Am I doing it right? Or is there an issue with the new driver that I downloaded? How can I make it work? Please, any help will be really appreciated! Thanks!!!
A few things:
First, just do sudo apt-get install r-cran-rodbc. The package was created (by yours truly) in no small part because dealing with unixODBC or iODBC is not fun. But even once you have that, you still need the ODBC driver for Linux from Vertica. And that part is fiddly.
Second, I did something similar just the other day using JDBC, which worked. You do of course need sudo apt-get install r-cran-rjava, which has its own can of worms (but I already mentioned Java...). Still, maybe try that instead?
Third, you can cheat and just use psql pointed to the Vertica port (usually one above the PostgreSQL port).
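On the original "sql.h and sqlext.h not found" error: before rebuilding, it is worth confirming where (or whether) the unixODBC headers actually landed, since the RODBC configure script only searches the standard include paths. A throwaway script such as the following can walk candidate directories; the roots in the usage comment are just examples:

```python
import os

def find_odbc_headers(roots, names=("sql.h", "sqlext.h")):
    """Walk the candidate include roots and map each ODBC header
    to every location where it was found."""
    found = {name: [] for name in names}
    for root in roots:
        for dirpath, _dirs, files in os.walk(root):
            for name in names:
                if name in files:
                    found[name].append(os.path.join(dirpath, name))
    return found

# e.g. find_odbc_headers(["/usr/include", "/usr/local/include"])
```

If the headers exist but in a non-standard location, that location is what you need to hand to the install step.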

DBVisualizer and HIVE

I am using DBVisualizer 9.2 and Cloudera 5.4.1
I want to set up DBVisualizer so that I can query the Hive database from the DBVisualizer tool.
I downloaded the jdbc driver for HIVE from here
http://www.cloudera.com/downloads/connectors/hive/jdbc/2-5-16.html
I extracted all the jar files in /Users/User1/.dbvis/jdbc
But now, when I start dbvisualizer, I get an error
Ignored as there is no matching Default Driver for "com.cloudera.hive.jdbc41.HS1Driver", "com.cloudera.hive.jdbc41.HS2Driver"
/Users/User1/.dbvis/jdbc
HiveJDBC41.jar
TCLIServiceClient.jar
hive_metastore.jar
hive_service.jar
libfb303-0.9.0.jar
libthrift-0.9.0.jar
log4j-1.2.14.jar
ql.jar
slf4j-api-1.5.11.jar
slf4j-log4j12-1.5.11.jar
zookeeper-3.4.6.jar
So my question is, has anyone successfully configured the DBVisualizer tool to connect to cloudera hive server?
After several hours of troubleshooting, I was able to resolve the error and successfully connect to Hive from DBVisualizer using the Hive JDBC driver from Cloudera.
These are the steps I took:
First, go to Tools -> Tool Properties -> Driver finder paths.
Here, register a new empty directory; this will be the place where you put all your jars.
Into this directory, extract all the JAR files that come with the Cloudera JDBC Hive driver:
http://www.cloudera.com/downloads/connectors/hive/jdbc/2-5-4.html
Now go to Tools -> Driver Manager and select Hive. In the "user specified" tab, click on the folder icon on the right-hand side and select all the jar files which you just unzipped (not just the folder: select all the jars).
Make sure you select com.cloudera.hive.jdbc41.HS2Driver
Now define connection to Hive using these parameters
url: jdbc:hive2://foo:10000/default
user: admin
password: admin
Now when I tried to connect, I still got errors.
"Type: java.lang.reflect.UndeclaredThrowableException"
In order to resolve the above, you need to look at the error log (this was the most important step):
Tools -> Debug Window -> Error log
Here I saw that the mysterious "UndeclaredThrowableException" occurs because a bunch of classes, such as the HTTP utils, HTTP core, Hadoop core, Hive core and Hive CLI jar files, were missing. I downloaded these jars from Maven Central:
hadoop-core-0.20.2.jar
hive-exec-2.0.0.jar
hive-service-1.1.1.jar
httpclient-4.5.2.jar
httpcore-4.4.4.jar
and again I went to Tools -> Driver Manager -> Hive -> "user specified", clicked on the folder on the right-hand side, and selected each of these jars as well.
Now when I restarted DBVisualizer, I connected to hive just fine and I can query it using DBVisualizer.
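Since the missing-class errors above surface only one jar at a time, it can save a few restarts to verify up front that every jar you expect is in the driver folder. A small illustrative check; the function is hypothetical, and the required list mirrors the jars named in this answer (adjust to your driver version):

```python
import os

def missing_jars(driver_dir, required_prefixes):
    """Return the required jar-name prefixes for which no jar file
    exists in driver_dir."""
    jars = [f for f in os.listdir(driver_dir) if f.endswith(".jar")]
    return [p for p in required_prefixes
            if not any(j.startswith(p) for j in jars)]

required = ["HiveJDBC41", "TCLIServiceClient", "libthrift", "hadoop-core",
            "hive-exec", "hive-service", "httpclient", "httpcore"]
# e.g. missing_jars("/Users/User1/.dbvis/jdbc", required)
```

Anything reported as missing is a candidate for the next UndeclaredThrowableException.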

Play! Framework 2.2.3 SQL server connection

I have a Play! web app that I am developing, and I'm currently using an eBean YAML database. I have a 2008 SQL Server that was set up for me by a coworker, which I'd like to connect to. I tried following another tutorial on this site:
PlayFramework MSSQL Database error
and downloaded the jtds jar file and placed it into the proper directories, but I get an error that the driver is not found. This is my current configuration file:
db.default.url="jdbc:jtds:sqlserver://LSA5A:1433/DatabaseName=hr_site;instance=SQL2008"
db.default.driver=net.sourceforge.jtds.jdbc.Driver
db.default.user=HUser
db.default.password="RaeSusdaRasdh!123"
I have never set up a database like this before, so I'm having difficulty understanding how to set it up and how it all works together. I didn't understand the solution from the Play! docs for this, and since I'm not using MySQL I couldn't find the help I needed online. I'm not sure what other information I need to provide, but I'm running SQL Server 2008; the db name is SVFSSQL5A with user HRTUser and password testPass12. Thanks for your help!!
Also I have the dependency in my build file:
val appDependencies = Seq(
  "net.sourceforge.jtds" % "jtds" % "1.2"
)
I did this:
Download the SQL Server JDBC driver from Microsoft (http://www.microsoft.com/de-DE/download/details.aspx?id=11774), install it, and put the sqljdbc4.jar file in the lib folder (\YourProject\lib).
application.conf file:
db.default.driver=com.microsoft.sqlserver.jdbc.SQLServerDriver
db.default.url="jdbc:sqlserver://localhost\\instancename:1433;databaseName=MyDBName"
db.default.user="sa"
db.default.password="MyPassword"
db.default.logStatements=true
Enable the TCP/IP protocol and port 1433 in the SQL Server Configuration Tool.
That should do the job!
Manfred

How to define MySQL data source in TomEE?

Platform: TomEE Web profile 1.5.0.
I am trying to do a very basic thing: set up a data source for MySQL. I have read the official guide (http://openejb.apache.org/configuring-datasources.html). It asks us to enter a Resource element in openejb.xml, but I cannot find that file anywhere in tomee-webprofile-1.5.0. I read elsewhere that I could use tomee.xml for the same purpose, so I added this to my conf/tomee.xml:
<Resource id="TestDS" type="DataSource">
    JdbcDriver com.mysql.jdbc.Driver
    JdbcUrl jdbc:mysql://localhost/test
    UserName root
    Password some_pass
</Resource>
I copied MySQL driver JAR to tomee/lib folder.
I wrote this code. Showing snippets here:
@Resource(name = "TestDS")
DataSource ds;

Connection con = ds.getConnection();
PreparedStatement ps = con.prepareStatement("select * from UserProfile");
The prepareStatement() call is throwing this exception:
java.sql.SQLSyntaxErrorException: user lacks privilege or object not found: USERPROFILE
at org.hsqldb.jdbc.Util.sqlException(Unknown Source)
at org.hsqldb.jdbc.Util.sqlException(Unknown Source)
Why is the system using the HSQLDB driver? In fact, no matter what I use as the name for @Resource, I get the same exception.
What am I doing wrong? I am starting TomEE from Eclipse, if that makes any difference.
I have tracked down the root cause. The problem happens only when I start TomEE from Eclipse. If I start it from command line, my data source definition works just fine.
It appears that when I run TomEE from Eclipse, it uses configuration files from /.metadata/.plugins/org.eclipse.wst.server.core/tmp0/conf. To change this, I had to take these steps in Eclipse:
Remove all deployed projects from the server.
Open server settings and from "Server Locations" choose "Use Tomcat installation". This section is greyed out if you have at least one project still deployed to the server. So, make sure you have done step #1.
Restart the server and redeploy the application. Now, my application is finding the data source.
Normally the installation is explained here: http://tomee.apache.org/tomee-and-eclipse.html
[I would make this a comment to the answer of RajV, but do not have enough reputation to do so.]
Platform: Tomee 1.6.0 Webprofile, eclipse-jee-kepler-SR2-linux-gtk-x86_64 and OpenJDK 1.7.0_51
After doing the steps in http://tomee.apache.org/tomee-and-eclipse.html (including "Workspace Metadata Installation") I got the same error "user lacks privilege or object not found".
My reaction was to:
$ ln -s [workspace_path]/Servers/tomee.xml \
[workspace_path]/.metadata/.plugins/org.eclipse.wst.server.core/tmp0/conf/
As an advantage of this solution TomEE in eclipse is always using the current version of Workspace/Servers/tomee.xml without any further manual operation.
For me, a better solution is to put the tomee.xml file in your WTP server directory (/.metadata/.plugins/org.eclipse.wst.server.core/tmp0/conf) and define your datasource there.