Invalid operation: relation "fs_fsentry" already exists; - jcr

I have configured icCube (icCubeRepository.xml) to use Postgres as the JCR repository. When starting icCube I get the following error:
java.sql.SQLException: Amazon Invalid operation: relation "fs_fsentry" already exists;
It looks as if the driver used by the JCR is the Redshift one instead of the expected Postgres driver.

This was happening because the Postgres driver was not added to the classpath in the bin/iccube.sh file. Adding the Postgres JDBC driver, which is already available in the lib folder of icCube, to that classpath fixes it, as sketched below.
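A minimal sketch of the change, assuming the startup script assembles its classpath in a variable and that the driver jar sits in icCube's lib folder; the variable name and the jar file name are assumptions, so adapt them to your bin/iccube.sh and lib contents.
# bin/iccube.sh (sketch): append the Postgres JDBC driver to the startup classpath.
# ICCUBE_CLASSPATH and the jar name are assumptions; check your script and lib folder.
ICCUBE_HOME="$(cd "$(dirname "$0")/.." && pwd)"
ICCUBE_CLASSPATH="$ICCUBE_CLASSPATH:$ICCUBE_HOME/lib/postgresql-42.2.5.jar"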

Related

Facing Hive query error on SHOW DATABASES query - Unable to instantiate

I initialized Hive and it worked; later I ran the SHOW DATABASES command, but I got the error below.
I am using MySQL for the metastore.
adminn@master:~$ hive
Hive Session ID = e9e9145a-0c38-4007-a9af-ded86a4226ea
Logging initialized using configuration in jar:file:/home/adminn/apache-hive-3.1.1-bin/lib/hive-common-3.1.1.jar!/hive-log4j2.properties Async: true
Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider using a different execution engine (i.e. spark, tez) or using Hive 1.X releases.
hive> show databases;
FAILED: HiveException java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient
I added the below property to the hive-site.xml file, and this resolved the issue.
<property>
  <name>javax.jdo.option.ConnectionDriverName</name>
  <value>com.mysql.jdbc.Driver</value>
</property>
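For this driver class to actually load, the MySQL connector jar also has to be on Hive's classpath. A minimal sketch, assuming the usual $HIVE_HOME layout and a locally downloaded connector jar (the jar name and version are assumptions):
# Make the MySQL JDBC driver visible to Hive (jar name/version are assumptions).
cp mysql-connector-java-5.1.49.jar $HIVE_HOME/lib/
# Then retry the failing statement.
hive -e "SHOW DATABASES;"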

Hive - derby - java.lang.SecurityException: sealing violation: package org.apache.derby.impl.services.locks is sealed

After installing Hive and Derby, and before running Hive, I wanted to create the metastore schema with:
schematool -initSchema -dbType derby
It was giving me the following error:
"Underlying cause: java.sql.SQLException : No suitable driver found
for jdbc:derby://home/hadoop/metastore_db;create=true"
I checked the CLASSPATH which is as follows:
.:/usr/lib/jvm/jre-1.8.0-openjdk/jre/lib:/usr/lib/jvm/jre-1.8.0-openjdk/lib:/usr/lib/jvm/jre-1.8.0-openjdk/lib/tools.jar:/usr/local/derby/db-derby-10.4.2.0-bin/lib/derbyclient.jar:/usr/local/derby/db-derby-10.4.2.0-bin/lib/derby.jar:/usr/local/derby/db-derby-10.4.2.0-bin/lib/derbytools.jar
The required jar files are there.
However, I copied derbytools.jar, derby.jar, and derbyclient.jar from /usr/local/derby/db-derby-10.4.2.0-bin/lib/ to /usr/local/hive/apache-hive-3.1.2-bin/lib/ (the copy commands are sketched below), and this solved the above error.
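A minimal sketch of those copies, using the paths from the question (adjust them if your install locations differ):
# Copy the Derby jars into Hive's lib directory (paths taken from the question).
DERBY_LIB=/usr/local/derby/db-derby-10.4.2.0-bin/lib
HIVE_LIB=/usr/local/hive/apache-hive-3.1.2-bin/lib
cp "$DERBY_LIB"/derby.jar "$DERBY_LIB"/derbytools.jar "$DERBY_LIB"/derbyclient.jar "$HIVE_LIB"/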
But now I am getting the following error.
"java.lang.SecurityException: sealing violation: package
org.apache.derby.impl.services.locks is sealed."
Some people on this mailing list suggested checking whether I am pointing to the Derby jar files twice in the classpath. Apparently, they are not repeated in the CLASSPATH.
Please let me know where I am going wrong.
Entries in the conf/hive-site.xml are as follows:
<property>
  <name>javax.jdo.option.ConnectionURL</name>
  <value>jdbc:derby://home/hadoop/metastore_db;create=true </value>
  <!--<value>jdbc:derby://localhost:1527/metastore_db;create=true </value>-->
  <description>JDBC connect string for a JDBC metastore </description>
</property>
Thanks,
Soumyadeep

java.lang.ClassNotFoundException: com.microsoft.azure.storage.blob.BlobListingDetails Exception

I am trying to read a table that is on Azure blob storage via PySpark, and the exception below is raised even though I have added the following jars to pyspark --jars:
azure-storage-2.0.0.jar
hadoop-azure-2.7.0.jar
Exception:
py4j.protocol.Py4JJavaError: An error occurred while calling o38.showString.
: java.lang.NoClassDefFoundError: com/microsoft/azure/storage/blob/BlobListingDetails
Caused by: java.lang.ClassNotFoundException: com.microsoft.azure.storage.blob.BlobListingDetails
Any idea which specific jar needs to be added to resolve the issue and read Azure tables in Spark?
My suggestion is as follows.
Download the jar files of the newest versions of the Azure Storage Java Client and Hadoop Azure Support instead of their old versions.
Check whether the paths of these jars have been added to the SPARK_CLASSPATH environment variable in the conf/spark-env file, or programmatically add the jar path via SparkContext.addJar("Path to jar created from maven [hint: mvn package]"); a sketch is given below.
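A minimal sketch of both approaches, assuming the newer jars have been downloaded locally; the file names, versions, and paths are placeholders, so substitute whatever you actually downloaded.
# Option 1: pass the newer jars directly to PySpark (names/versions are placeholders).
pyspark --jars /path/to/azure-storage-<version>.jar,/path/to/hadoop-azure-<version>.jar
# Option 2: export them via SPARK_CLASSPATH in conf/spark-env.sh (same placeholders).
export SPARK_CLASSPATH=/path/to/azure-storage-<version>.jar:/path/to/hadoop-azure-<version>.jar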
Hope it helps.

Use Saiku with Apache Hive

Have you ever used Saiku to do data analysis on a Big Data platform (Hadoop)? My recent work needs to integrate some legacy BI tools with Hadoop to support common OLAP queries on HDFS/HBase.
I found a solution implemented with Phoenix & HBase here, which bridges Saiku and HBase with the SQL dialect in Phoenix, and it worked. However, this method can only handle data that is inside HBase through the HBase API; it cannot launch any MapReduce-style jobs when building the data cube. I would prefer a more Big-Data-compatible alternative, such as going through Apache Hive.
Saiku is based on Mondrian. My version of Saiku uses Mondrian-4.0.0.0-SNAPSHOT.jar, which I found can already work well with Hive, and there are many Hive 0.13 jars in Saiku's lib directory. So I thought a simple configuration of a hive2 datasource would work. I started a HiveServer2 on the namenode of my HDFS cluster and added the following datasource to Saiku:
Name: hive2
Connection Type: Mondrian
URL: jdbc:hive2://localhost:10000/default
Schema: /datasources/movie.xml
Jdbc Driver: org.apache.hive.jdbc.HiveDriver
Username: ubuntu
Password: XXXX
Saiku indeed connected to the HiveServer2 successfully but failed to load the datasource. I found the following error in the Saiku log:
name:hive2
driver:mondrian.olap4j.MondrianOlap4jDriver
url:jdbc:mondrian:Jdbc=jdbc:hive2://localhost:10000/default;Catalog=mondrian:///datasources/movie.xml;JdbcDrivers=org.apache.hive.jdbc.HiveDriver
12:41:48,110 WARN [RolapSchema] Model is in legacy format
12:41:50,464 ERROR [SecurityAwareConnectionManager] Error connecting: hive2
mondrian.olap.MondrianException: Mondrian Error:Internal error: while quoting identifier
at mondrian.resource.MondrianResource$_Def0.ex(MondrianResource.java:992)
at mondrian.olap.Util.newInternal(Util.java:2543)
at mondrian.spi.impl.JdbcDialectImpl.deduceIdentifierQuoteString(JdbcDialectImpl.java:245)
at mondrian.spi.impl.JdbcDialectImpl.<init>(JdbcDialectImpl.java:146)
at mondrian.spi.DialectManager$DialectManagerImpl$1.createDialect(DialectManager.java:210)
...
Caused by: java.sql.SQLException: Method not supported
at org.apache.hive.jdbc.HiveDatabaseMetaData.getIdentifierQuoteString(HiveDatabaseMetaData.java:342)
at org.apache.commons.dbcp.DelegatingDatabaseMetaData.getIdentifierQuoteString(DelegatingDatabaseMetaData.java:306)
at mondrian.spi.impl.JdbcDialectImpl.deduceIdentifierQuoteString(JdbcDialectImpl.java:238)
... 99 more
I looked into the Hive 0.13 source and found that getIdentifierQuoteString isn't implemented yet and simply throws an exception:
public String getIdentifierQuoteString() throws SQLException {
  throw new SQLException("Method not supported");
}
By now I'm puzzled. Is it practical to use Saiku with Hive? It ships Hive 0.13 jars in its lib dir, yet cannot load a simple Hive datasource? Should I simply modify the Hive source? I found that in the newly released Hive 1.0, this method is implemented by simply returning an empty string.
Does anyone have a good idea? Thanks!

Apache hive error Merging of credentials not supported in this version of hadoop

I am using Hadoop 1.2.1, HBase 0.94.14 and Hive 1.0.0. There are three datanodes in my cluster and also three regionservers. I have to import some data from HBase to Hive. I have configured Hive successfully, but when I ran a command to count the number of rows in a Hive table, it gives the following:
ERROR [main]: exec.Task (SessionState.java:printError(833)) - Job Submission failed with exception 'java.lang.RuntimeException(java.io.IOException: Merging of credentials not supported in this version of hadoop)'
java.lang.RuntimeException: java.io.IOException: Merging of credentials not supported in this version of hadoop
at org.apache.hadoop.hive.hbase.HBaseStorageHandler.configureJobConf(HBaseStorageHandler.java:485)
at org.apache.hadoop.hive.ql.plan.PlanUtils.configureJobConf(PlanUtils.java:856)
at org.apache.hadoop.hive.ql.plan.MapWork.configureJobConf(MapWork.java:540)
I have changed the Hive version to 0.14 but got the same error.
What is the solution?
Note: I cannot upgrade hadoop.
Although your version of Hive is current, this is not the source of your error. You need to upgrade your Hadoop version to 2.4.0 or above.
The error originates from here: https://github.com/apache/hive/blob/3b6825b5b61e943e8e41743f5cbf6d640e0ebdf5/shims/0.20S/src/main/java/org/apache/hadoop/hive/shims/Hadoop20SShims.java#L579
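To double-check which Hadoop version Hive is actually running against (and hence which shim it selects), you can query the installation directly; this is just a sanity check, on the assumption that the hadoop binary on the PATH is the one Hive uses.
# Print the Hadoop version on the PATH. On Hadoop 1.x Hive falls back to the 0.20S shims,
# which are the ones throwing "Merging of credentials not supported in this version of hadoop".
hadoop version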