Querying Hive from Apache Drill causes StackOverflowError

I am trying to query a table named customers in Hive from Drill. I am running Drill in embedded mode, with the default Derby database for the Hive metastore.
When I do a describe, it shows all the columns and types.
But when I run a select like this,
select * from customers limit 10;
this is what I get in the Web UI:
org.apache.drill.common.exceptions.UserRemoteException: SYSTEM ERROR: StackOverflowError
Hive plugin:
{
  "type": "hive",
  "enabled": true,
  "configProps": {
    "hive.metastore.uris": "thrift://ip_address:9083",
    "javax.jdo.option.ConnectionURL": "jdbc:derby:;databaseName=../sample-data/drill_hive_db;create=true",
    "hive.metastore.warehouse.dir": "/user/hive/warehouse",
    "fs.default.name": "file:///",
    "hive.metastore.sasl.enabled": "false"
  }
}
Errors shown in the log file:
org.apache.drill.exec.work.foreman.ForemanException: Unexpected exception during fragment initialization: java.lang.AssertionError: Internal error: Error while applying rule DrillPushProjIntoScan
java.lang.StackOverflowError: null
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:355) ~[hadoop-common-2.7.1.jar:na]
And finally, this:
Query failed: org.apache.drill.common.exceptions.UserRemoteException: SYSTEM ERROR: StackOverflowError
The versions I am using are:
Apache Drill : 1.3.0
Hive : 0.13.1-cdh5.3.0
Hadoop : 2.5.0-cdh5.3.0

This looks like a version conflict.
According to Drill's documentation:
Drill 1.0 supports Hive 0.13. Drill 1.1 supports Hive 1.0.
So with Drill 1.1+ you may run into issues against Hive 0.13.
Upgrade Hive to 1.0 or downgrade Drill to 1.0 to test this.
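As a quick sanity check before changing anything, you can confirm the exact Drill build you are querying from; sys.version is a built-in Drill system table:
select * from sys.version;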

Related

How to configure applications in AMI 4

The documentation says
In Amazon EMR releases 4.0 and greater, the only accepted parameter is the application name. To pass arguments to applications, you supply a configuration for each application.
But I cannot find an example that shows how to pass arguments in AMI 4. All I can find are examples configuring exports, such as the one below. I am trying to figure out how to set the version of Spark to use.
[
  {
    "Classification": "hadoop-env",
    "Properties": {},
    "Configurations": [
      {
        "Classification": "export",
        "Properties": {
          "HADOOP_USER_CLASSPATH_FIRST": "true",
          "HADOOP_CLASSPATH": "/path/to/my.jar"
        }
      }
    ]
  }
]
You cannot set an arbitrary version of Spark to use like you could with 3.x AMI versions. Rather, the version of Spark (and other apps, of course) is determined by the release label. For example, the latest release is currently emr-5.2.1, which includes Spark 2.0.2. If you want a 1.x version of Spark, the latest version available is Spark 1.6.3 on release emr-4.8.3.
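For illustration, this is roughly how you would pin the Spark version through the release label when launching a cluster from the AWS CLI; the cluster name, instance type, and instance count below are placeholders:
# launch a cluster on release emr-4.8.3, which bundles Spark 1.6.3
aws emr create-cluster \
  --name "spark-1.6-cluster" \
  --release-label emr-4.8.3 \
  --applications Name=Spark \
  --instance-type m3.xlarge \
  --instance-count 3 \
  --use-default-roles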

MapR Pig throwing error

If I use the following load command:
A = LOAD '/home/mapr/resoucr' USING PigStorage(',');
it throws the following error:
org.apache.hadoop.conf.Configuration.deprecation - io.bytes.per.checksum is deprecated. Instead, use dfs.bytes-per-checksum
These are deprecation warnings, not errors: newer Hadoop versions renamed io.bytes.per.checksum to dfs.bytes-per-checksum, and the warning fires whenever the old name is still in use.
To get rid of it, update your Hadoop configuration files to use the new property names.
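If you just want the warning out of your logs in the meantime, a common workaround (assuming the stock log4j.properties that Hadoop and Pig read) is to raise the threshold for the deprecation logger:
# in conf/log4j.properties: silence property-deprecation warnings
log4j.logger.org.apache.hadoop.conf.Configuration.deprecation=ERROR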

ERROR: Talend S3 - AWS authentication requires a valid Date or x-amz-date header

I'm using Talend Open Studio to push Salesforce data to my Redshift database, using the following flow:
1. tSalesforceInput
2. tMap
3. tFileOutputDelimited
4. tRedshiftOutput
I am only getting about 2-5 rows/s, which does not work at all for me.
Pushing the delimited file through tS3Put and then loading the data into Redshift makes the transfer go MUCH faster, about 500 rows/s. The issue I keep running into is this error:
AWS authentication requires a valid Date or x-amz-date header (Service: Amazon S3; Status Code: 403; Error Code: AccessDenied; Request ID: CC9C86CCC65625C0
And I have no idea how to solve it. I have tried using tLibraryLoad to load joda-time 2.8.2 before the job runs, but it still fails. Any advice is greatly appreciated.
I also had this problem using Talend 6.1. The issue is an incompatibility between Java 8, the AWS SDK, and the joda-time 2.3 library that Talend bundles.
The solution I found was adapted from TalendForge:
Download the joda-time 2.8.2 jar from the Joda-Time site.
Add a tLibraryLoad component and point it to the new joda-time jar you downloaded.
Go to your project's Run tab > Advanced settings and add an additional JVM argument:
-Xbootclasspath/p:$ROOT_PATH/../lib/joda-time-2.8.2.jar
The /p variant prepends the jar to the boot classpath, so the newer joda-time takes precedence over the bundled 2.3 version.

Liquibase 3.4.1 doesn't work with Java 7

I'm trying to switch my Liquibase dependencies from 3.0.2 to 3.4.1 in a Java-based tool, but when running with Java 7 (I tried different updates, including the latest, update 80) I'm getting strange exceptions like:
2015-11-13T11:55:43,351+02:00 ERROR java.lang.IllegalStateException: Cannot find generators for database class liquibase.database.core.MSSQLDatabase, statement: liquibase.statement.core.UpdateStatement#5232d51 - [pool-3-thread-1]
liquibase.exception.LockException: java.lang.IllegalStateException: Cannot find generators for database class liquibase.database.core.MSSQLDatabase, statement: liquibase.statement.core.UpdateStatement#5232d51
or
liquibase.exception.UnexpectedLiquibaseException: liquibase.exception.ServiceNotFoundException: liquibase.exception.ServiceNotFoundException: Could not find unique implementation of liquibase.executor.Executor. Found 0 implementations
at liquibase.executor.ExecutorService.getExecutor(ExecutorService.java:31) ~[liquibase-core-3.4.1.jar:na]
With Java 8 everything works fine.
Is this a known issue? Is there any documentation stating that 3.4.1 works only with Java 8? I couldn't find anything.
Thanks,
Dan
It looks like you're experiencing the issue fixed in this bug report. If you upgrade to 3.5.x you should be okay.
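If you pull Liquibase in through Maven, the upgrade is just a version bump on the liquibase-core artifact (3.5.3 is shown here as one example of a 3.5.x release):
<dependency>
    <groupId>org.liquibase</groupId>
    <artifactId>liquibase-core</artifactId>
    <version>3.5.3</version>
</dependency>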

RavenDB export / import data between RavenDB servers with Raven.Smuggler

I'm trying to export data with Raven.Smuggler from a RavenDB server running version 1.0 and import it into another RavenDB server running version 2.0.
I'm getting a FileLoadException due to Lucene.Net version differences.
Is it possible to migrate data from one version of RavenDB to another? What is the best way to do it?
I've already read the following url:
http://ravendb.net/docs/server/administration/export-import
Thanks for your help :-)
EDIT:
"unhandled Exceptions: system.net.webException: Error: System.IO.FileLoadException: could not load file or assembly 'Lucene.Net, Version=2.9.4.1.....' or one of its dependencies.
The located assembly's manifest definition does not match the assembly reference..."
In the older version of Raven I was using Lucene Analyzer 2.9, which does not exist in the new version. I'm guessing that's the problem.
You can absolutely move data between versions using Smuggler.
Please post the full error you are seeing, including the stack trace.
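In the meantime, here is a minimal sketch of the round trip, assuming you run the Smuggler executable that ships with the newer (2.0) server; the server URLs and database names are placeholders:
# export everything from the source (1.0) server to a dump file
Raven.Smuggler out http://server1:8080/databases/SourceDb dump.ravendump
# import the dump into the target (2.0) server
Raven.Smuggler in http://server2:8080/databases/TargetDb dump.ravendump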