Atlas with impala 3.2 - impala

How do I configure the hook when integrating Atlas with Impala 3.2? I could not find the parameters for configuring the hook.
Impala 3.3 and higher can be configured with query_event_hook_classes=org.apache.atlas.impala.hook.ImpalaLineageHook

Related

NiFi connection to Hive fails

I am trying to insert into Hive version 2.3.2 using NiFi 1.9.2 on Docker. It works well with the PutHiveQL processor but always fails with PutHiveStreaming.
The Hive components included with Apache NiFi are not compatible with
Hive 2.x; they are built against Hive 1.2.x. There is a Jira to add Hive
2 support (NIFI-6456), but it is not yet in NiFi.

Which Phoenix version should I use with HBase in Cloudera 5.5 and Hortonworks 2.4?

Is there a single version of Phoenix that is compatible with the HBase provided in both Cloudera 5.5 and Hortonworks 2.4?
Hortonworks provides custom fixes and "backports" to their version of Phoenix in their HDP distribution. Cloudera may do the same as well.
I am assuming that you are asking about a client version that is compatible with both server versions.
Are you using the "thin" client jars? Do you find that your application does not work for one distribution or the other (dependent on which version jars you have)? Your application may work for both distributions if you use the non-thin jars.
If you would like to continue using the thin client, you may have to set phoenix.queryserver.serialization to JSON. HDP 2.3.4+ uses PROTOBUF by default, whereas CDH does not currently support PROTOBUF.
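For the thin client, the serialization can also be stated explicitly on the client side. The sketch below builds a thin-client JDBC URL with a `serialization` option; the host name `pqs.example.com` and port 8765 are placeholders I am assuming, and you should confirm the URL option against the documentation of the Phoenix version shipped with your distribution.

```python
# Minimal sketch: build a Phoenix thin-client JDBC URL that pins the
# serialization format. Host and port are hypothetical placeholders.

ALLOWED_SERIALIZATIONS = {"JSON", "PROTOBUF"}


def thin_client_url(host: str, port: int = 8765, serialization: str = "JSON") -> str:
    """Return a thin-client JDBC URL with an explicit serialization option."""
    value = serialization.upper()
    if value not in ALLOWED_SERIALIZATIONS:
        raise ValueError(
            f"serialization must be one of {sorted(ALLOWED_SERIALIZATIONS)}"
        )
    return f"jdbc:phoenix:thin:url=http://{host}:{port};serialization={value}"


if __name__ == "__main__":
    # JSON is the choice that should work against both distributions,
    # per the answer above.
    print(thin_client_url("pqs.example.com"))
```

The client value has to match what the query server is configured to use, so setting it on only one side is not enough.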
If you are asking about manually installing a version of the Phoenix server that can be installed on both distributions, both use HBase 1.1.x. Any Phoenix version 4.4+ can be used on either distribution. But I recommend using the version that is distributed with the platform.
A Phoenix 4.5.2 package for CDH 5.5.x is available via Cloudera Labs:
http://blog.cloudera.com/blog/2015/11/new-apache-phoenix-4-5-2-package-from-cloudera-labs/
Note however that Cloudera Labs packages are for dev/test only (not supported by Cloudera).

Upgrading Hive to allow Update/Delete Transactions within Ambari

I have created a Hadoop Cluster with Ambari 2.1 including Hive. I would like to be able to do Update and Delete queries within Hive, but it looks like I currently have version 0.12.0.2.0 of Hive. I would like to upgrade to 0.13 or 0.14 to enable these transactions, but I am not sure how to do that with an existing installation of Ambari. Any help would be appreciated.
I think you could follow the HDP docs on the Hortonworks website:
Manual Upgrade of HDP
Upgrading Stack - Ambari
Performing upgrade - Hortonworks
Hope this is helpful.
P.S.: UPDATE and DELETE are not supported in Hive 0.13. You will need 0.14 or later for them.
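The 0.14 minimum mentioned above can be checked against a version string such as the one in the question. This is only a rough sketch: the helper names are my own, and it assumes the first two numbers of the string are the Hive major and minor version (with the rest being the distribution's build suffix).

```python
import re

# Minimum Hive version for UPDATE/DELETE, as stated in the answer above.
MIN_UPDATE_DELETE_VERSION = (0, 14)


def hive_major_minor(version: str) -> tuple:
    """Extract (major, minor) from a string like '0.12.0.2.0'."""
    parts = re.findall(r"\d+", version)
    if len(parts) < 2:
        raise ValueError(f"cannot parse Hive version: {version!r}")
    return int(parts[0]), int(parts[1])


def supports_update_delete(version: str) -> bool:
    """True if the Hive version is at least 0.14."""
    return hive_major_minor(version) >= MIN_UPDATE_DELETE_VERSION


if __name__ == "__main__":
    for v in ("0.12.0.2.0", "0.13.1", "0.14.0"):
        print(v, supports_update_delete(v))
```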

Error while executing select query in Hive - how to update Hadoop version

I am currently using Hadoop 1.0.3. I recently installed Apache Hive to run with it. Running a select * query gave me a NoSuchMethodError: org.apache.hadoop.mapred.JobConf.unset
I then found out that it is a compatibility issue with my current version of Hadoop, and that I need to upgrade to 1.2 or later.
I am fairly new to Hadoop and would like to upgrade my current version to 1.2 or later. How do I go about doing that?
I could not find any resources online that explain how.
Thanks.
Just download Hadoop 1.2.x from here and make the necessary configuration changes in the new installation. Change HADOOP_HOME to point to the new Hadoop folder.
NOTE: Update all environment variables (including those set in .bashrc) to point to the new Hadoop installation.
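To catch leftover references after switching, something like the following can scan a shell profile for lines that still mention the old installation. It is a sketch only: the paths `/usr/local/hadoop-1.0.3` and `/usr/local/hadoop-1.2.1` are made-up examples, so substitute wherever your installations actually live.

```python
def stale_hadoop_lines(profile_text: str, old_home: str) -> list:
    """Return (line_number, line) pairs that still mention the old Hadoop path.

    Commented-out lines are ignored.
    """
    hits = []
    for number, line in enumerate(profile_text.splitlines(), start=1):
        stripped = line.strip()
        if stripped.startswith("#"):
            continue
        if old_home in stripped:
            hits.append((number, stripped))
    return hits


if __name__ == "__main__":
    # Hypothetical .bashrc contents, for illustration only.
    sample = (
        "export HADOOP_HOME=/usr/local/hadoop-1.0.3\n"
        "# old: /usr/local/hadoop-1.0.3\n"
        "export PATH=$PATH:/usr/local/hadoop-1.2.1/bin\n"
    )
    for number, line in stale_hadoop_lines(sample, "/usr/local/hadoop-1.0.3"):
        print(f"line {number}: {line}")
```

To run it on a real profile, read the file (for example `~/.bashrc`) and pass its contents as `profile_text`.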

Can Presto connect to other Hadoop distributions and run queries

I see Presto has a plugin only for CDH4. Can I connect to other distributions such as Hortonworks, and what does it take to do so?
Without a specific plugin, I am running into "path host null" errors when executing queries from Presto. I would appreciate your help.
The Presto Hive connector supports multiple versions of Hadoop:
hive-hadoop1: Apache Hadoop 1.x
hive-hadoop2: Apache Hadoop 2.x
hive-cdh4: Cloudera CDH 4
hive-cdh5: Cloudera CDH 5
See the Hive Connector documentation for more details.
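As a rough illustration of picking one of the connectors listed above, the sketch below renders the contents of a Hive catalog properties file. The metastore address `thrift://metastore.example.com:9083` is a placeholder, and choosing `hive-hadoop2` for a Hortonworks cluster is my assumption, based on HDP 2.x being built on Hadoop 2.x.

```python
# Connector names taken from the list in the answer above.
SUPPORTED_CONNECTORS = {
    "hive-hadoop1": "Apache Hadoop 1.x",
    "hive-hadoop2": "Apache Hadoop 2.x",
    "hive-cdh4": "Cloudera CDH 4",
    "hive-cdh5": "Cloudera CDH 5",
}


def hive_catalog_properties(connector: str, metastore_uri: str) -> str:
    """Render the text of a Presto Hive catalog properties file."""
    if connector not in SUPPORTED_CONNECTORS:
        raise ValueError(
            f"unknown connector {connector!r}; "
            f"expected one of {sorted(SUPPORTED_CONNECTORS)}"
        )
    return (
        f"connector.name={connector}\n"
        f"hive.metastore.uri={metastore_uri}\n"
    )


if __name__ == "__main__":
    # Hypothetical metastore host; replace with your own.
    print(hive_catalog_properties(
        "hive-hadoop2", "thrift://metastore.example.com:9083"
    ))
```

The output would typically be saved as a catalog file such as etc/catalog/hive.properties on each Presto node.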
Where is the code for the CDH connector on GitHub?
Briefly looking at the code on GitHub, I don't see anything specific to CDH, other than the name, in presto/presto-hive-cdh4/src/main/java. Am I looking at the wrong thing?