Hive-HBase integration throws ClassNotFoundException: NULL::character varying

Following this link https://cwiki.apache.org/confluence/display/Hive/HBaseIntegration#HBaseIntegration-HiveMAPtoHBaseColumnFamily,
I'm trying to integrate Hive and HBase. I have this configuration in hive-site.xml:
<property>
<name>hive.aux.jars.path</name>
<value>
file:///$HIVE_HOME/lib/hive-hbase-handler-2.0.0.jar,
file:///$HIVE_HOME/lib/hive-ant-2.0.0.jar,
file:///$HIVE_HOME/lib/protobuf-java-2.5.0.jar,
file:///$HIVE_HOME/lib/hbase-client-1.1.1.jar,
file:///$HIVE_HOME/lib/hbase-common-1.1.1.jar,
file:///$HIVE_HOME/lib/zookeeper-3.4.6.jar,
file:///$HIVE_HOME/lib/guava-14.0.1.jar
</value>
</property>
Then I create a table named 'ts:testTable' in HBase:
hbase> create 'ts:testTable','pokes'
hbase> put 'ts:testTable', '10000', 'pokes:value','val_10000'
hbase> put 'ts:testTable', '10001', 'pokes:value','val_10001'
...
hbase> scan 'ts:testTable'
ROW COLUMN+CELL
10000 column=pokes:value, timestamp=1462782972084, value=val_10000
10001 column=pokes:value, timestamp=1462783514212, value=val_10001
....
And then I create an external table in Hive:
Hive> CREATE EXTERNAL TABLE hbase_test_table(key int, value string )
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key, pokes:value")
TBLPROPERTIES ("hbase.table.name" = "ts:testTable",
"hbase.mapred.output.outputtable" = "ts:testTable");
So far so good. But when I tried to select data from the test table, an exception was thrown:
Hive> select * from hbase_test_table;
FAILED: RuntimeException java.lang.ClassNotFoundException: NULL::character varying
Error: Error while compiling statement: FAILED: RuntimeException java.lang.ClassNotFoundException: NULL::character varying (state=42000,code=40000)
Am I missing anything?
I'm trying Hive 2.0.0 with HBase 1.2.1.

OK, I figured it out. The "NULL::character varying" is not part of Hive; it comes from PostgreSQL, which I'm using as the backend of the metastore. The problem is that Hive doesn't recognize this value coming back from PostgreSQL. Hive 2.0.0 has the following code:
300: if (inputFormatClass == null) {
301: try {
302: String className = tTable.getSd().getInputFormat();
303: if (className == null) {
304: if (getStorageHandler() == null) {
305: return null;
306: }
307: inputFormatClass = getStorageHandler().getInputFormatClass();
308: } else {
309: inputFormatClass = (Class<? extends InputFormat>)
310: Class.forName(className, true, Utilities.getSessionSpecifiedClassLoader());
}
Line 302 does not return null as it is supposed to, so line 310 tries to load a non-existent class. That's why the program fails.
I believe it is a compatibility bug. One way to fix it would be to change the metastore database, which I'd rather not do. So I simply replaced line 302 with
if (className == null || className.toLowerCase().startsWith("null::")) {
and did the same thing in the getOutputFormat() method, then recompiled the jar. That's it.
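For reference, here is the idea behind that workaround as a small stand-alone sketch. The class and method names are mine for illustration only; in the real fix the check sits directly in Hive's Table class, in both the input-format and output-format lookups quoted above.
class MetastoreNullGuard {
    // Treat PostgreSQL's literal "NULL::character varying" the same as a real null
    // before the string is handed to Class.forName.
    static boolean isEffectivelyNull(String className) {
        return className == null || className.toLowerCase().startsWith("null::");
    }
}
With a helper like this, the null check becomes if (isEffectivelyNull(className)), so the storage handler's input format is used instead of attempting to load the bogus class name.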

Related

Apache Ignite .Net 2.8.1: Affinity collocation with QueryEntity

I'm using Apache Ignite v.2.8.1 .Net on a Windows 10 machine.
I am trying to use affinity collocation on a query-enabled cache. The entities I store in cache have a primary key named "Id" and an affinity key named "PartnerId". Both keys are of the type Int32. I'm defining the cache as follows:
new CacheConfiguration("BespokeCharge")
{
KeyConfiguration = new[]
{
new CacheKeyConfiguration()
{
AffinityKeyFieldName = "PartnerId",
TypeName = typeof(BespokeCharge).Name
}
}
};
Next I use the following code to add the data:
var cache = Ignite.GetCache<AffinityKey, BespokeCharge>("BespokeCharge");
cache.Put(new AffinityKey(entity.Id, entity.PartnerId), entity);
So far so good. Since I want to be able to use SQL to search for bespoke charges, I also add a QueryEntity configuration:
new CacheConfiguration("BespokeCharge",
new QueryEntity(typeof(AffinityKey), typeof(BespokeCharge))
{
KeyFieldName = "Id",
TableName = "BespokeCharge"
})
{
KeyConfiguration = new[]
{
new CacheKeyConfiguration()
{
AffinityKeyFieldName = "PartnerId",
TypeName = typeof(BespokeCharge).Name
}
}
};
When I run the code, both Ignite and my app crash and the following error is logged:
JVM will be halted immediately due to the failure: [failureCtx=FailureContext [type=CRITICAL_ERROR, err=class o.a.i.i.processors.cache.persistence.tree.CorruptedTreeException: B+Tree is corrupted [pages(groupId, pageId)=[IgniteBiTuple [val1=1565129718, val2=844420635164729]], cacheId=-1278247946, cacheName=BESPOKECHARGES, indexName=BESPOKECHARGES_ID_ASC_IDX, msg=Runtime failure on row: Row#29db2fbe[ key: AffinityKey [idHash=8137191, hash=783909474, key=3, affKey=2], val: UtilityClick.BillValidation.Shared.InMemory.Model.BespokeCharge [idHash=889383506, hash=-399638125, BespokeChargeTypeId=4, ChargeValue=100.0000, ChargeValueIncCommission=100.0000, Id=3, PartnerId=2, QuoteRecordId=5] ][ 4, 100.0000, 100.0000, , 2, 5 ]]]]
When I tried to define QueryEntity with the key type of int instead of AffinityKey, I got a different error with the same outcome -- a crash.
JVM will be halted immediately due to the failure: [failureCtx=FailureContext [type=CRITICAL_ERROR, err=java.lang.ClassCastException: class o.a.i.cache.affinity.AffinityKey cannot be cast to class java.lang.Integer (o.a.i.cache.affinity.AffinityKey is in unnamed module of loader 'app'; java.lang.Integer is in module java.base of loader 'bootstrap')]]
What am I doing wrong? Thank you for your help!
KeyFieldName = "Id" setting is the problem. It sets Id field to be used as a cache key, which is int, but then we use AffinityKey as a cache key, which causes a type mismatch.
In this case I don't think we need KeyFieldName at all, removing it fixes the problem, and SELECT queries should not be affected by this change.
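Concretely, that is just the QueryEntity configuration from the question with the KeyFieldName line removed, everything else unchanged:
new CacheConfiguration("BespokeCharge",
    new QueryEntity(typeof(AffinityKey), typeof(BespokeCharge))
    {
        // KeyFieldName removed: the actual cache key is AffinityKey, not the int Id field
        TableName = "BespokeCharge"
    })
{
    KeyConfiguration = new[]
    {
        new CacheKeyConfiguration()
        {
            AffinityKeyFieldName = "PartnerId",
            TypeName = typeof(BespokeCharge).Name
        }
    }
};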

Unloading data from Amazon Redshift to Amazon S3

I am trying to use the following code to unload data into an S3 bucket. It works, but after unloading it throws an error.
Properties props = new Properties();
props.setProperty("user", MasterUsername);
props.setProperty("password", MasterUserPassword);
conn = DriverManager.getConnection(dbURL, props);
stmt = conn.createStatement();
String sql;
sql = "unload('select * from part where p_partkey in (select p_partkey from
part limit 10)') to"
+ " 's3://redshiftdump.****' "
+ " DELIMITER AS ','"
+ "ADDQUOTES "
+ "NULL AS ''"
+ "credentials 'aws_access_key_id=****;aws_secret_access_key=***' "
+ "parallel off" +
";";
boolean i = stmt.execute(sql);
stmt.close();
conn.close();
The unload works and creates a file in the bucket, but it gives me this error:
java.sql.SQLException:
dataengine.impl.DSISimpleRowCountResult cannot be cast to com.amazon.dsi.dataengine.interfaces.IResultSet
at com.amazon.redshift.core.jdbc42.PGJDBC42Statement.createResultSet(Unknown Source)
at com.amazon.jdbc.common.SStatement.executeQuery(Unknown Source)
What is this error and how do I avoid it? Is there any way to dump the table in CSV format? Right now it is dumping the file in FILE format.
You say the UNLOAD works but you receive this error. That suggests to me that you are connecting successfully, but there is a problem in the way your code interacts with the JDBC driver when the query completes.
We provide an example that may be helpful in our documentation, on the page "Connect to Your Cluster Programmatically".
Regarding the output file format, you will get whatever is specified in your UNLOAD SQL but the filename will have a suffix (for example "000" or "6411_part_00") to indicate which part of the UNLOAD it is.
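A minimal sketch of that adjustment applied to the question's code, reusing dbURL, MasterUsername, MasterUserPassword and the UNLOAD string sql from above (this is my illustration, not the AWS documentation example; the next answer shows the same approach in Scala):
// Requires java.sql.Connection, java.sql.DriverManager, java.sql.Statement, java.util.Properties.
// UNLOAD returns a row count rather than a ResultSet, so executeUpdate avoids the
// driver trying to build a ResultSet from it (the cast shown in the stack trace above).
Properties props = new Properties();
props.setProperty("user", MasterUsername);
props.setProperty("password", MasterUserPassword);
try (Connection conn = DriverManager.getConnection(dbURL, props);
     Statement stmt = conn.createStatement()) {
    int rows = stmt.executeUpdate(sql);  // sql = the UNLOAD statement built above
    System.out.println("UNLOAD row count: " + rows);
}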
Use executeUpdate:
// Assumes java.sql.{DriverManager, Statement} are imported and that url, username, password,
// redshiftTimeoutInSeconds, logger and IngestionException are defined elsewhere.
def runQuery(sql: String) = {
Class.forName("com.amazon.redshift.jdbc.Driver")
val connection = DriverManager.getConnection(url, username, password)
var statement: Statement = null
try {
statement = connection.createStatement()
statement.setQueryTimeout(redshiftTimeoutInSeconds)
val result = statement.executeUpdate(sql)
logger.info(s"statement response code : ${result}")
} catch {
case e: Exception => {
logger.error(s"statement.isCloseOnCompletion :${e.getMessage} ::: ${e.printStackTrace()}")
throw new IngestionException(e.getMessage)
}
}
finally {
if(statement != null ) statement.close()
connection.close()
}
}

Unable to update UniData from .NET

I've been attempting for the last couple of days to update UniData from .NET, using sample code as a basis, without success. I can read the database successfully and view the raw data within Visual Studio. The error reported back is an out-of-range error. The program is attempting to update the unit price of a purchase order.
Error:
{" Error on Socket Receive. Index was outside the bounds of the array.POD"}
[IBMU2.UODOTNET.UniFileException]: {" Error on Socket Receive. Index was outside the bounds of the array.POD"}
Data: {System.Collections.ListDictionaryInternal}
HelpLink: null
HResult: -2146232832
InnerException: null
Message: " Error on Socket Receive. Index was outside the bounds of the array.POD"
Source: "UniFile Class"
StackTrace: " at IBMU2.UODOTNET.UniFile.Write()\r\n at IBMU2.UODOTNET.UniFile.Write(String aRecordID, UniDynArray aRecordData)\r\n at ReadXlsToUnix.Form1.TestUpdate(String PO_LINE_SHIP, String price) in c:\Users\xxx\Documents\Visual Studio 2013\Projects\ReadXlsToUnix\ReadXlsToUnix\Form1.cs:line 330"
TargetSite: {Void Write()}
The failing test code is:
private void TestUpdate(string PO_LINE_SHIP,string price)
{
UniFile pod =null;
UniSession uniSession =null;
//connection string
uniSession = UniObjects.OpenSession("unixMachine", "userid", Properties.Settings.Default.PWD, "TRAIN", "udcs");
//open file
pod = uniSession.CreateUniFile("POD");
//read data
pod.Read(PO_LINE_SHIP);
//locking strategy
pod.UniFileLockStrategy = 1;
pod.UniFileReleaseStrategy = 1;
if (pod.RecordID == ""){
pod.UnlockRecord();
}
//replace existing value with one entered by user
pod.Record.Replace(4, (string)uniSession.Iconv(price, "MD4"));
try
{
pod.Write(pod.RecordID,pod.Record); //RecordId and Record both show correctly hover/immediate window
//pod.Write() fails with same message
}
catch (Exception err)
{
MessageBox.Show("Error" + err);
}
pod.Close();
UniObjects.CloseSession(uniSession);
}
}
Running on HP-UX 11.31 with UniData 7.2, using UODOTNET.dll 2.2.3.7377.
Any help greatly appreciated.
This is the write-record version; I have also tried the WriteField functionality with the same error.
Rajan - thanks for the update and link. I have tried, unsuccessfully, to read/update my UniData tables using the U2 Toolkit. I can, however, read/update a file I have created within the same account. Does this mean there is a missing entry somewhere, in the VOC or DICT for example?

Spring Liquibase / HSQLDB integration

I am trying to load an embedded HSQLDB with data using the SpringLiquibase integration.
Java config:
@Configuration
@Profile("testing")
...
@Bean
public DataSource dataSource() {
DataSource ds = new EmbeddedDatabaseBuilder()
.setType(EmbeddedDatabaseType.HSQL)
.setName("junittest")
.build();
return ds;
}
@Bean
SpringLiquibase liquibase(){
SpringLiquibase lb = new SpringLiquibase();
lb.setDataSource(dataSource());
lb.setChangeLog("classpath:liquibase/example.xml");
lb.setContexts("testing");
return lb;
}
...
Liquibase schema (example.sql):
create table xyz(
id integer,
name varchar(10)
);
But I get the following exception:
Caused by: liquibase.exception.LockException: liquibase.exception.UnexpectedLiquibaseException: liquibase.snapshot.InvalidExampleException: Found multiple catalog/schemas matching null.PUBLIC
at liquibase.lockservice.StandardLockService.acquireLock(StandardLockService.java:211)
at liquibase.lockservice.StandardLockService.waitForLock(StandardLockService.java:151)
at liquibase.Liquibase.update(Liquibase.java:182)
at liquibase.Liquibase.update(Liquibase.java:174)
at liquibase.integration.spring.SpringLiquibase.performUpdate(SpringLiquibase.java:345)
at liquibase.integration.spring.SpringLiquibase.afterPropertiesSet(SpringLiquibase.java:302)
at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.invokeInitMethods(AbstractAutowireCapableBeanFactory.java:1571)
at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.initializeBean(AbstractAutowireCapableBeanFactory.java:1509)
... 40 more
Caused by: liquibase.exception.UnexpectedLiquibaseException: liquibase.snapshot.InvalidExampleException: Found multiple catalog/schemas matching null.PUBLIC
at liquibase.snapshot.SnapshotGeneratorFactory.hasDatabaseChangeLogLockTable(SnapshotGeneratorFactory.java:168)
at liquibase.lockservice.StandardLockService.hasDatabaseChangeLogLockTable(StandardLockService.java:138)
at liquibase.lockservice.StandardLockService.init(StandardLockService.java:85)
at liquibase.lockservice.StandardLockService.acquireLock(StandardLockService.java:185)
... 47 more
Caused by: liquibase.snapshot.InvalidExampleException: Found multiple catalog/schemas matching null.PUBLIC
at liquibase.snapshot.jvm.SchemaSnapshotGenerator.snapshotObject(SchemaSnapshotGenerator.java:74)
at liquibase.snapshot.jvm.JdbcSnapshotGenerator.snapshot(JdbcSnapshotGenerator.java:59)
at liquibase.snapshot.SnapshotGeneratorChain.snapshot(SnapshotGeneratorChain.java:50)
at liquibase.snapshot.jvm.JdbcSnapshotGenerator.snapshot(JdbcSnapshotGenerator.java:62)
at liquibase.snapshot.SnapshotGeneratorChain.snapshot(SnapshotGeneratorChain.java:50)
at liquibase.snapshot.jvm.JdbcSnapshotGenerator.snapshot(JdbcSnapshotGenerator.java:62)
at liquibase.snapshot.SnapshotGeneratorChain.snapshot(SnapshotGeneratorChain.java:50)
at liquibase.snapshot.jvm.JdbcSnapshotGenerator.snapshot(JdbcSnapshotGenerator.java:62)
at liquibase.snapshot.SnapshotGeneratorChain.snapshot(SnapshotGeneratorChain.java:50)
at liquibase.snapshot.DatabaseSnapshot.include(DatabaseSnapshot.java:153)
at liquibase.snapshot.DatabaseSnapshot.init(DatabaseSnapshot.java:56)
at liquibase.snapshot.DatabaseSnapshot.<init>(DatabaseSnapshot.java:33)
at liquibase.snapshot.JdbcDatabaseSnapshot.<init>(JdbcDatabaseSnapshot.java:22)
at liquibase.snapshot.SnapshotGeneratorFactory.createSnapshot(SnapshotGeneratorFactory.java:126)
at liquibase.snapshot.SnapshotGeneratorFactory.createSnapshot(SnapshotGeneratorFactory.java:119)
at liquibase.snapshot.SnapshotGeneratorFactory.createSnapshot(SnapshotGeneratorFactory.java:107)
at liquibase.snapshot.SnapshotGeneratorFactory.has(SnapshotGeneratorFactory.java:97)
at liquibase.snapshot.SnapshotGeneratorFactory.hasDatabaseChangeLogLockTable(SnapshotGeneratorFactory.java:166)
... 50 more
Versions of HSQLDB / Liquibase: 1.8.0.10 / 3.2.3.
Am I missing some config here?

Unable to open transformation: null - Running a Job from Java

I have a simple job (named A) which starts a simple transformation (also named A). The transformation contains only a dummy component.
They are both stored in a DB repository.
If I start the job from kitchen, everything runs fine:
./kitchen.sh -rep=spoon -user=<user> -pass=<pwd> -job A
Then I wrote some simple Java code:
JobMeta jobMeta = repository.loadJob(jobName, directory, null, null);
org.pentaho.di.job.Job job = new org.pentaho.di.job.Job(null, jobMeta);
job.getJobMeta().setInternalKettleVariables(job);
job.setLogLevel(LogLevel.ERROR);
job.setName(Thread.currentThread().getName());
job.start();
job.waitUntilFinished();
if (job.getResult() != null && job.getResult().getNrErrors() != 0) {
...
}
else {
...
}
The problem is that when running the Java program I always get the following error:
A - Unable to open transformation: null
A - java.lang.NullPointerException
at org.pentaho.di.job.entries.trans.JobEntryTrans.execute(JobEntryTrans.java:698)
at org.pentaho.di.job.Job.execute(Job.java:589)
at org.pentaho.di.job.Job.execute(Job.java:728)
at org.pentaho.di.job.Job.execute(Job.java:443)
at org.pentaho.di.job.Job.run(Job.java:363)
I have googled this error without success and I am stuck.
Any suggestions?
The solution seems to be replacing this line in my Java code:
org.pentaho.di.job.Job job = new org.pentaho.di.job.Job(null, jobMeta);
with
org.pentaho.di.job.Job job = new org.pentaho.di.job.Job(repository, jobMeta);
Hoping that this helps someone else.
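For completeness, a minimal sketch of the corrected loading code under the same assumptions as the question (repository already connected, jobName and directory resolved as above):
JobMeta jobMeta = repository.loadJob(jobName, directory, null, null);
// Pass the repository instead of null so that the job entry can load the
// transformation "A" from the same repository at run time.
org.pentaho.di.job.Job job = new org.pentaho.di.job.Job(repository, jobMeta);
job.start();
job.waitUntilFinished();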