Determine Hive version via query

With Apache Drill I can get the version through a JDBC connection by dispatching the query: SELECT version FROM sys.version. Is there an analogous way to determine the Hive version?
I know I can use hive --version from a machine where Hive is running and available via the command line. However, a query-based approach would fit my use case a little better, as JDBC connections may be made from anywhere inside my network.

It's easy if you have a JDBC Connection:
Connection conn = ...; // obtained elsewhere, e.g. via DriverManager.getConnection()
DatabaseMetaData md = conn.getMetaData();
System.out.println(md.getDatabaseMajorVersion() + "." + md.getDatabaseMinorVersion());
I don't know if you can get the information from a SQL/HiveQL query.

You can try this query:
select version();
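For remote use over JDBC, here is a minimal Java sketch combining both approaches; the connection URL (jdbc:hive2://localhost:10000/default) is an assumption, and it presumes the Hive JDBC driver is on the classpath. It tries select version(), which only newer Hive releases accept, and falls back to the driver metadata:
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class HiveVersionCheck {
    public static void main(String[] args) throws Exception {
        // Assumed HiveServer2 endpoint; substitute your own host, port and database.
        try (Connection conn = DriverManager.getConnection("jdbc:hive2://localhost:10000/default");
             Statement stmt = conn.createStatement()) {
            try (ResultSet rs = stmt.executeQuery("select version()")) {
                if (rs.next()) {
                    System.out.println("Hive version: " + rs.getString(1));
                }
            }
            // Fallback for servers that reject select version():
            System.out.println(conn.getMetaData().getDatabaseProductVersion());
        }
    }
}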

Related

Zeppelin alternative for K-V store

Is there any alternative for checking key-value entries while debugging an Ignite application? Zeppelin can only do SQL querying. The Visor command modify -get -c=CName is very tedious to work with, and it can't fetch entries by wildcard searching of keys. Or is there any way we can query the K-V store via SQL queries?
You can use:
1) REST:
https://apacheignite.readme.io/docs/rest-api#get-and-remove
2) Thick Java, .NET, or C++ clients that use the native cache API (see the sketch after this answer)
3) The Node.js client:
https://github.com/apache/ignite/blob/master/modules/platforms/nodejs/examples/CachePutGetExample.js
4) The Python client:
https://apacheignite.readme.io/docs/python-thin-client-key-value
5) The PHP client:
https://apacheignite.readme.io/docs/php-thin-client-key-value
I have probably missed some integrations.
Also, as far as I know, Zeppelin supports the cache API using Scala syntax:
https://zeppelin.apache.org/docs/0.8.0/interpreter/ignite.html
val cache: IgniteCache[AffinityUuid, String] = ignite.cache("words")
Finally, you can add a query entity to your cache and run queries like:
select _key, _val from table;
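To expand on option 2, here is a minimal Java sketch using Ignite's thin client (available in recent Ignite versions); the address, the cache name "words", and the key/value types are assumptions for illustration:
import org.apache.ignite.Ignition;
import org.apache.ignite.client.ClientCache;
import org.apache.ignite.client.IgniteClient;
import org.apache.ignite.configuration.ClientConfiguration;

public class CachePeek {
    public static void main(String[] args) throws Exception {
        // Assumed thin-client address of a local node.
        ClientConfiguration cfg = new ClientConfiguration().setAddresses("127.0.0.1:10800");
        try (IgniteClient client = Ignition.startClient(cfg)) {
            // Assumed cache name and types.
            ClientCache<Integer, String> cache = client.cache("words");
            // Inspect a single entry by key while debugging.
            System.out.println(cache.get(1));
        }
    }
}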

Hive on Spark execution engine failed

I am trying the Hive on Spark execution engine. I am using Hadoop 2.6.0, Hive 1.2.1, and Spark 1.6.0. Hive runs successfully on the MapReduce engine; now I am trying the Spark engine. Individually, all components work properly. In Hive I set these properties:
set hive.execution.engine=spark;
set spark.master=spark://INBBRDSSVM294:7077;
set spark.executor.memory=2g;
set spark.serializer=org.apache.spark.serializer.KryoSerializer;
I added the spark-assembly jar to Hive's lib directory.
Then I run this command:
select count(*) from sample;
and I get output like this:
Starting Spark Job = b1410161-a414-41a9-a45a-cb7109028fff
Status: SENT
Failed to execute spark task, with exception 'java.lang.IllegalStateException(RPC channel is closed.)'
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.spark.SparkTask
Am I missing any other required settings? Please guide me.
I think the problem is that you are using incompatible versions. If you check the version compatibility table on Hive on Spark: Getting Started, you'll see that these two specific versions are not guaranteed to work together.
I advise you to switch to the compatible versions they recommend. I had the same problem and solved it by changing to the compatible versions.

Prepared Statement support in Apache Ignite Cache API

Is any facility like prepared statements supported in the IgniteCache API, to avoid parsing the query each time? I saw that a Jira issue was raised for this, and it says it was resolved in version 1.5.0.final:
https://issues.apache.org/jira/browse/IGNITE-1856, but I could not find any documentation for it on the Apache Ignite site. I know we can use a prepared statement by connecting via a JDBC Connection, but that does not suit my use case.
My code looks like the following; this query will be called again and again with different parameters:
IgniteCache<Integer, Subscriber> subscriberCache = rocCachemanager.getCache("subscriberCache");
SqlQuery<Integer, Subscriber> sql = new SqlQuery<>(Subscriber.class,
    "from Subscriber where Subscriber.MSISDNNo = ? and Subscriber.status = 'Active'");
sql.setArgs("SomeNumber");
QueryCursor<Entry<Integer, Subscriber>> cursor = subscriberCache.query(sql);
Statements are cached automatically; no action is required. If the query text does not change and only the parameters do, Ignite will not parse the query again.
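A minimal sketch of what that looks like in practice, reusing the SqlQuery object from the question and changing only its arguments (Subscriber and subscriberCache come from the code above; the sample MSISDN values are assumptions, and the snippet needs java.util.Arrays and javax.cache.Cache.Entry imports):
SqlQuery<Integer, Subscriber> sql = new SqlQuery<>(Subscriber.class,
    "from Subscriber where Subscriber.MSISDNNo = ? and Subscriber.status = 'Active'");
for (String msisdn : Arrays.asList("111111", "222222")) { // assumed sample values
    sql.setArgs(msisdn); // only the arguments change; the query text is parsed once
    try (QueryCursor<Entry<Integer, Subscriber>> cursor = subscriberCache.query(sql)) {
        cursor.forEach(e -> System.out.println(e.getValue()));
    }
}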

Specifying database other than default with Impala JDBC driver

I'm using the Impala JDBC driver (or I guess it's actually the Hive Server 2 JDBC driver). I have a view created in another database -- let's call it "store55".
Let's say my view is defined as follows:
CREATE VIEW good_customers AS
SELECT * from customers WHERE good = true;
When I try to query this view using JDBC as follows:
SELECT * FROM store55.good_customers LIMIT 10
I get an error such as:
java.sql.SQLException: AnalysisException: Table does not exist: default.customers
Ideally, I'd like to specify the database name somewhere in the JDBC URL or as a parameter, but when I try this JDBC URL, I still get the same error:
jdbc:hive2://<host>:<port>/store55;auth=noSasl
Does the Hive2 JDBC driver just ignore the database part of the URL and assume all queries are executed against the default database?
The only way I was able to have the queries return is to change the view definition itself to include the database name:
CREATE VIEW good_customers AS
SELECT * from store55.customers WHERE good = true;
However, I'd like to keep the view definition free of database names.
Thanks!
You might want to issue a "USE xxxxx;" statement through JDBC.
Also, if you are already using the database, try an "invalidate metadata" statement.
Is the URL jdbc:hive2://<host>:<port>/store55;auth=noSasl correct?
Can you run a few diagnostics, such as:
SHOW TABLES - to ensure that the view is created in store55
Are you using the USE <database> command in the DDLs?
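To test the first suggestion from Java, here is a minimal sketch that issues USE over the same JDBC connection before querying the view; the <host> and <port> placeholders follow the question's URL, and this assumes the driver accepts USE as a plain statement (which HiveServer2 does):
try (Connection conn = DriverManager.getConnection("jdbc:hive2://<host>:<port>/;auth=noSasl");
     Statement stmt = conn.createStatement()) {
    stmt.execute("USE store55"); // switch the session's current database
    try (ResultSet rs = stmt.executeQuery("SELECT * FROM good_customers LIMIT 10")) {
        while (rs.next()) {
            System.out.println(rs.getString(1));
        }
    }
}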

Validate Hive HQL syntax?

Is there a programmatic way to validate HiveQL statements for errors like basic syntax mistakes? I'd like to check statements before sending them off to Elastic Map Reduce in order to save debugging time.
Yes there is!
It's pretty easy actually.
Steps:
1. Get a Hive Thrift client in your language.
I'm in Ruby so I use this wrapper - https://github.com/forward/rbhive (gem install rbhive)
If you're not in Ruby, you can download the Hive source and run Thrift on the included Thrift configuration files to generate client code in most languages.
2. Connect to Hive on port 10001 and run a describe query.
In Ruby this looks like this:
RBHive.connect(host, port) do |connection|
  connection.fetch("describe select * from categories limit 10")
end
If the query is invalid, the client will throw an exception with details of why the syntax is invalid. Describe will return a query tree if the syntax IS valid (which you can ignore in this case).
Hope that helps.
"describe select * from categories limit 10" didn't work for me.
Maybe this is related to the Hive version one is using.
I'm using Hive 0.8.1.4
After doing some research I found a similar solution to the one Matthew Rathbone provided:
Hive provides an EXPLAIN command that shows the execution plan for a query. The syntax for this statement is as follows:
EXPLAIN [EXTENDED] query
So for everyone who's also using rbhive:
RBHive.connect(host, port) do |c|
  c.execute("explain select * from categories limit 10")
end
Note that you have to substitute c.fetch with c.execute, since explain won't return any results if it succeeds; with fetch, rbhive would throw an exception whether or not your syntax is correct.
execute will throw an exception if you've got a syntax error or if the table/column you are querying doesn't exist. If everything is fine, no exception is thrown but you'll also receive no results, which is not an evil thing.
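The same EXPLAIN trick works from Java over the Hive JDBC driver, for anyone not using rbhive. A minimal sketch, with the connection URL as an assumption:
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;
import java.sql.Statement;

public class HqlValidator {
    static boolean isValidHql(String query) {
        // Assumed HiveServer2 endpoint; adjust for your cluster.
        try (Connection conn = DriverManager.getConnection("jdbc:hive2://localhost:10000/default");
             Statement stmt = conn.createStatement()) {
            stmt.execute("EXPLAIN " + query); // throws on syntax errors or missing tables/columns
            return true;
        } catch (SQLException e) {
            System.err.println("Invalid HQL: " + e.getMessage());
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(isValidHql("select * from categories limit 10"));
    }
}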
Hive 2.0 and later come with the hplsql tool, which allows us to validate Hive commands without actually running them.
Configuration:
Add the XML below to the hive/conf folder and restart Hive:
https://github.com/apache/hive/blob/master/hplsql/src/main/resources/hplsql-site.xml
To run hplsql and validate a query, use the commands below.
To validate a single query:
hplsql -offline -trace -e 'select * from sample'
(or)
To validate an entire file:
hplsql -offline -trace -f samplehql.sql
If the query syntax is correct, the response from hplsql would be something like this:
Ln:1 SELECT // type
Ln:1 select * from sample // command
Ln:1 Not executed - offline mode set // execution status
If the query syntax is wrong, the syntax issue in the query will be reported.
If the Hive version is older, we need to manually place the hplsql jars inside hive/lib and proceed.