I am trying to connect Grafana to data in Hive tables and build a dashboard from it.
I am not sure whether this is possible, hence seeking the SO community's help.
Any other open-source suggestions for creating dashboards from Hive would also help.
Related
Is there any way to get table statistics from the Hive metastore so that I can use them for processing via an API?
I found two ways to get statistics:
the Hive metatool client
the IMetaStoreClient API
Can anyone tell me the difference between these two? Or is there another way to access the statistics?
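Roughly: the metatool is a command-line utility aimed at admins (e.g. listing/updating filesystem roots), while IMetaStoreClient is the programmatic Thrift API, which is the natural fit for "use it for processing via an API". A minimal sketch of the API route, assuming a metastore reachable at thrift://localhost:9083 and a table name taken from this thread (both are assumptions, and basic statistics are only present if they have been gathered, e.g. via ANALYZE TABLE):

```java
import org.apache.hadoop.hive.conf.HiveConf;
import org.apache.hadoop.hive.metastore.HiveMetaStoreClient;
import org.apache.hadoop.hive.metastore.api.Table;

public class TableStats {
    public static void main(String[] args) throws Exception {
        HiveConf conf = new HiveConf();
        // Assumed metastore URI; point this at your own metastore service.
        conf.set("hive.metastore.uris", "thrift://localhost:9083");
        HiveMetaStoreClient client = new HiveMetaStoreClient(conf);
        // Basic statistics (numRows, totalSize, ...) are stored as table parameters.
        Table t = client.getTable("default", "amzn_order_details");
        System.out.println("numRows = " + t.getParameters().get("numRows"));
        client.close();
    }
}
```

HiveMetaStoreClient implements the IMetaStoreClient interface, so this is the same API the question mentions; for per-column statistics there are dedicated calls such as getTableColumnStatistics.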
I would like to know whether we can access Spark external tables (with MS SQL as the metastore and external files on Azure Data Lake) from Presto via the Hive metastore service, on a Linux machine.
We are trying to access Spark Delta tables backed by Parquet files on ADLS through Presto. The scenario is below. I would like to know whether there is a way to achieve this. We are doing this as a POC only, and we believe knowing the answer will take us to the next step.
Our central data repository consists of Spark Delta tables created by many pipelines. The data is stored in Parquet format, with MS SQL as the external metastore. Other teams/applications use the data in these Spark tables, and they would like to access it through Presto.
We learnt that Presto uses the Hive metastore service to access Hive table details, so we tried accessing the tables from Hive first (reasoning that if this works, Presto will also work). But we ran into problems with the different filesystems. We have set up Hadoop and Hive on a single Linux machine, versions 3.1.2 and 3.1.1 respectively. The Hive service connects to the SQL metastore and returns results for a few basic commands. However, when it comes to accessing the actual Parquet data at an ADLS path, it fails with a filesystem exception. I understand that the problem is the interaction of several file systems (ADLS, HDFS, the local Linux filesystem), but I have not found any blogs that cover this. Kindly help.
Hive Show Database command:
Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider using a different execution engine (i.e. spark, tez) or using Hive 1.X releases.
hive> SHOW DATABASES;
OK
7nowrtpocqa
c360
default
digital
Hive Listing tables:
hive> SHOW TABLES;
OK
amzn_order_details
amzn_order_items
amzn_product_details
Query data from Orders table:
hive> select * from dlvry_orders limit 3;
OK
Failed with exception java.io.IOException:org.apache.hadoop.fs.UnsupportedFileSystemException: No FileSystem for scheme "dbfs"
Time taken: 3.37 seconds
How can I make my setup access the Data Lake files and bring in the data?
I believe my metastore stores the exact full ADLS path where the files live. If so, how will Hive/Hadoop on the Linux machine understand that path?
If it can recognise the path, in which configuration file (some .xml?) should I give the credentials for accessing the Data Lake?
How do the different file systems interact?
Kindly help. Thanks for all the inputs.
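On the credentials question, a sketch only: assuming the files sit in ADLS Gen2 and the hadoop-azure (ABFS) connector jars are on Hive's classpath, a storage-account key can be supplied in core-site.xml (the account name below is a placeholder):

```xml
<!-- core-site.xml: hypothetical ADLS Gen2 (ABFS) account-key authentication -->
<property>
  <name>fs.azure.account.key.mystorageaccount.dfs.core.windows.net</name>
  <value>YOUR_STORAGE_ACCOUNT_KEY</value>
</property>
```

Note, however, that this only helps for abfss:// paths. The error above complains about the "dbfs" scheme, which is the Databricks-specific filesystem: if the metastore has recorded table locations as dbfs:/ paths, a plain Hadoop/Hive install has no FileSystem implementation for that scheme, regardless of credentials.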
I have a Hive table, let's call it table A. My requirement is to capture all DML and DDL operations on table A in table B. Is there any way to do this?
Thanks in advance.
I have not come across any such tool; however, Cloudera Navigator helps manage this. Refer to the detailed documentation.
Cloudera Navigator
Cloudera Navigator auditing supports tracking access to:
HDFS entities accessed by the HDFS, Hive, HBase, Impala, and Solr services
HBase and Impala operations
Hive metadata
Sentry
Solr
Cloudera Navigator Metadata Server
Alternatively, if you are not using a Cloudera distribution, you can still inspect the hive-metastore log file under /var/log/hive/hadoop-cmf-hive-HIVEMETASTORE.log.out and check the changes applied to the different tables.
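As a sketch of what that log filtering looks like (using a fabricated two-line sample, since the path above only exists on a Cloudera-managed host — substitute the real log file there):

```shell
# Fabricated sample of metastore audit lines; on a Cloudera host you would
# read /var/log/hive/hadoop-cmf-hive-HIVEMETASTORE.log.out instead.
printf 'cmd=create_table: db=digital tbl=orders\ncmd=get_table : db=digital tbl=orders\ncmd=drop_table: db=digital tbl=orders\n' > /tmp/metastore_sample.log

# Keep only DDL events (create/alter/drop), filtering out read-only lookups.
grep -E 'cmd=(create|alter|drop)' /tmp/metastore_sample.log
```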
I haven't used Apache Atlas yet, but from the documentation it looks like it has an audit store and a Hive bridge, which also work for operational events.
https://docs.hortonworks.com/HDPDocuments/HDP3/HDP-3.0.1/atlas-overview/content/apache_atlas_features.html
Is there any way to connect Tableau Desktop to plain Apache HBase or plain Hive?
I could only find Tableau drivers for Hortonworks/MapR/Cloudera etc.
Install the drivers on the machine where Tableau Desktop is installed.
You can't connect to an HBase table directly from Tableau; you need to connect to a Hive table that is internally mapped to the HBase table.
Follow these links:
http://thinkonhadoop.blogspot.in/2014/01/access-hbase-table-with-tableau-desktop.html
http://grokbase.com/t/cloudera/cdh-user/141px9aqg5/hbase-connectivity-with-tableau
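For reference, that Hive-to-HBase mapping is done with Hive's HBase storage handler. A hypothetical example (the table, column family, and column names are made up; they must match your actual HBase table):

```sql
-- Expose an existing HBase table "orders" (column family "d") to Hive,
-- so Tableau can query it through the Hive driver.
CREATE EXTERNAL TABLE hbase_orders (
  rowkey STRING,
  order_total DOUBLE
)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,d:total")
TBLPROPERTIES ("hbase.table.name" = "orders");
```

Here :key maps the Hive rowkey column to the HBase row key, and d:total maps order_total to the total qualifier in family d.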
Our ODBC Driver for HBase will allow you to connect to your HBase data from Tableau. The driver is currently in beta, so you can download it for free from here.
You can read about setting up the connection in our Knowledge Base, but in short, you'll need to:
Create/configure a DSN from the ODBC Driver (set the server address and port)
Click through the Connect to Data options to find Other Database (ODBC) and select the DSN you configured
Select CData as the database
Enter a table name (or leave the Table field blank and click Search to see a list of tables).
Once you have access to the tables, you can work with them exactly as you would any other table in Tableau (drag the table to the join area, manipulate Measures and Dimensions to view your data, etc.). If you have any questions, I or our Support Team will be happy to help.
Tableau internally uses SQL to fetch raw data, so in theory it can support any data source that comes with a SQL interface, such as Hive.
Plain HBase does not provide a SQL interface, so you must add an intermediate layer that translates SQL queries into HBase queries. That layer could be an ODBC driver, or an open-source project such as Apache Drill.
I have Oracle Big Data. I have created a table in Hive and am able to view the data through Hue. But when I try to browse to the underlying file, I get an error related to the HDFS superuser.
Please assist.
Make sure that WebHDFS is configured properly.
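As a sketch of what "configured properly" typically means for Hue's file browser (both snippets are standard Hadoop settings, but verify them against your distribution): WebHDFS must be enabled in hdfs-site.xml, and Hue must be allowed to impersonate users via proxyuser settings in core-site.xml — superuser-related errors usually point at the latter.

```xml
<!-- hdfs-site.xml: enable the WebHDFS REST API used by Hue's file browser -->
<property>
  <name>dfs.webhdfs.enabled</name>
  <value>true</value>
</property>

<!-- core-site.xml: let the "hue" service user impersonate end users -->
<property>
  <name>hadoop.proxyuser.hue.hosts</name>
  <value>*</value>
</property>
<property>
  <name>hadoop.proxyuser.hue.groups</name>
  <value>*</value>
</property>
```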