Flink 1.9 SQL Client throws ClassNotFoundException: org.apache.kafka.clients.consumer.ConsumerRecord

I am trying to use the Flink 1.9 SQL Client with Kafka, without success.
After figuring out the required jar files and copying them into the lib directory, I get the following run-time exception when doing SELECT * FROM table-name:
Flink SQL> select * from default_catalog.default_database.member_customer_newsletters ;
[ERROR] Could not execute SQL statement. Reason:
java.lang.ClassNotFoundException: org.apache.kafka.clients.consumer.ConsumerRecord
Doing jar -tf of the jar-file, I can see the class ConsumerRecord being there:
jar -tf ./lib/flink-sql-connector-kafka-0.11_2.12-1.9.0.jar|grep 'ConsumerRecord'
org/apache/flink/kafka011/shaded/org/apache/kafka/clients/consumer/ConsumerRecord.class
So I am not sure why it is throwing a ClassNotFoundException when the class is already in the jar file.
I should add that the runtime is looking for "org.apache.kafka.clients.consumer.ConsumerRecord", but because this jar file is shaded, the fully qualified name of the class is "org.apache.flink.kafka011.shaded.org.apache.kafka.clients.consumer.ConsumerRecord".
But that should be true of any other class in this shaded jar as well!

There are two different classes:
org.apache.kafka.clients.consumer.ConsumerRecord
org.apache.flink.kafka011.shaded.org.apache.kafka.clients.consumer.ConsumerRecord
In flink-sql-connector-kafka-0.11_2.12-1.9.0.jar, you found the class
org.apache.flink.kafka011.shaded.org.apache.kafka.clients.consumer.ConsumerRecord
while Flink is complaining about:
org.apache.kafka.clients.consumer.ConsumerRecord
The first (the shaded one you found in the jar) is a class used internally by Flink, a relocated copy of the Kafka class.
The second (the one in the error message) is a class from kafka-clients-0.11.0.2.jar.
So Flink is right to complain about a missing library.
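If you want to verify this on your own setup, a small Java check along these lines (class name and setup here are just for illustration) can be run with the SQL Client's lib jars on the classpath; it shows that only the relocated name resolves from the shaded connector jar, while the plain org.apache.kafka.clients.consumer.ConsumerRecord has to come from a kafka-clients jar:
public class ConsumerRecordCheck {
    public static void main(String[] args) {
        // The name Flink's runtime asks for (provided by kafka-clients):
        check("org.apache.kafka.clients.consumer.ConsumerRecord");
        // The relocated copy bundled inside the shaded connector jar:
        check("org.apache.flink.kafka011.shaded.org.apache.kafka.clients.consumer.ConsumerRecord");
    }

    private static void check(String className) {
        try {
            Class.forName(className);
            System.out.println("FOUND:   " + className);
        } catch (ClassNotFoundException e) {
            System.out.println("MISSING: " + className);
        }
    }
}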

Related

Blazegraph INSERT DATA crashes with NoSuchMethodError

In Blazegraph I attempted the following query:
INSERT DATA {
  <http://my.site/User/instances/1>
    :comment
    <http://my.site/Comment/instances/16>.
}
and it crashes with the following exception trace:
java.lang.NoSuchMethodError: com.bigdata.rdf.sail.sparql.SPARQLStarUpdateDataBlockParser.read()I
at com.bigdata.rdf.sail.sparql.SPARQLStarUpdateDataBlockParser.checkSparqlStarSyntax(SPARQLStarUpdateDataBlockParser.java:107)
at com.bigdata.rdf.sail.sparql.SPARQLStarUpdateDataBlockParser.parseValue(SPARQLStarUpdateDataBlockParser.java:100)
at org.openrdf.rio.trig.TriGParser.parseGraph(TriGParser.java:158)
at org.openrdf.repository.sail.helpers.SPARQLUpdateDataBlockParser.parseGraph(SPARQLUpdateDataBlockParser.java:87)
at org.openrdf.rio.trig.TriGParser.parseStatement(TriGParser.java:128)
at org.openrdf.rio.turtle.TurtleParser.parse(TurtleParser.java:214)
at com.bigdata.rdf.sail.sparql.UpdateExprBuilder.doUnparsedQuadsDataBlock(UpdateExprBuilder.java:746)
at com.bigdata.rdf.sail.sparql.UpdateExprBuilder.visit(UpdateExprBuilder.java:161)
at com.bigdata.rdf.sail.sparql.UpdateExprBuilder.visit(UpdateExprBuilder.java:119)
at com.bigdata.rdf.sail.sparql.ast.ASTInsertData.jjtAccept(ASTInsertData.java:23)
at com.bigdata.rdf.sail.sparql.Bigdata2ASTSPARQLParser.parseUpdate2(Bigdata2ASTSPARQLParser.java:289)
at com.bigdata.rdf.sail.BigdataSailRepositoryConnection.prepareNativeSPARQLUpdate(BigdataSailRepositoryConnection.java:278)
at com.bigdata.rdf.sail.BigdataSailRepositoryConnection.prepareUpdate(BigdataSailRepositoryConnection.java:182)
at org.openrdf.repository.base.RepositoryConnectionBase.prepareUpdate(RepositoryConnectionBase.java:180)
However, normal DELETE INSERT WHERE queries work fine.
Any ideas how to solve this?
I simply removed the following line from my build.gradle:
compile('org.openrdf.sesame:sesame-runtime:2.8.6')
and refreshed the dependencies. You simply don't need to declare a Sesame dependency if you're using bigdata-core, since Sesame is already a transitive dependency of it.
The crash happened because bigdata-core relies on Sesame 2.7.12, which my 2.8.6 declaration was overriding, and 2.8.6 does not provide an implementation of the method being called.
Thanks to @AKSW for the guidance behind this solution.
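For anyone hitting a similar NoSuchMethodError, a quick Java diagnostic like the sketch below (class and file names are just an example) shows which jar a conflicting class is actually being loaded from; if it points at the sesame-runtime 2.8.6 artifact rather than the Sesame version pulled in via bigdata-core, you have the same kind of classpath conflict:
import org.openrdf.rio.turtle.TurtleParser;

public class WhichJar {
    public static void main(String[] args) {
        // Prints the location (jar) the class was loaded from, which reveals
        // which Sesame artifact won on the classpath.
        System.out.println(
            TurtleParser.class.getProtectionDomain().getCodeSource().getLocation());
    }
}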

XSSFWorkbook Creation failure

I am creating a .xlsx file using the POI jars poi-3.15.jar, poi-ooxml-3.15.jar, poi-ooxml-schemas-3.15.jar, ooxml-schemas-1.3.jar, xml-apis-2.0.2.jar, xbean-2.2.0.jar, xmlschema-1.4.7.jar and commons-collections4-4.0.jar, but I am still getting the following error: Exception in thread "main" java.lang.NoClassDefFoundError: org.apache.commons.collections4.ListValuedMap
Can you please help me create a simple .xlsx file using the XSSFWorkbook class?
ListValuedMap is contained in the commons-collections4 jar. You should also use XmlBeans 2.3.0 or later.
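Once the classpath is sorted out, a minimal sketch like this (the file name and cell values are arbitrary) is enough to create a simple .xlsx with XSSFWorkbook:
import java.io.FileOutputStream;

import org.apache.poi.ss.usermodel.Row;
import org.apache.poi.ss.usermodel.Sheet;
import org.apache.poi.xssf.usermodel.XSSFWorkbook;

public class SimpleXlsx {
    public static void main(String[] args) throws Exception {
        // Create a workbook with one sheet, one row and two cells,
        // then write it out to disk.
        try (XSSFWorkbook workbook = new XSSFWorkbook();
             FileOutputStream out = new FileOutputStream("simple.xlsx")) {
            Sheet sheet = workbook.createSheet("Sheet1");
            Row row = sheet.createRow(0);
            row.createCell(0).setCellValue("Hello");
            row.createCell(1).setCellValue(42.0);
            workbook.write(out);
        }
    }
}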

org.apache.ignite.IgniteCheckedException: Failed to read class name from file

I have a 3-node Apache Ignite cluster. I have created a cache with Integer as the key and a 'Subscriber' POJO as the value. When I connect to the cluster from inside a Java program and access the cache, I get the above-mentioned exception. I have the 'peerclassloading' property set to false, and I have deployed the 'Subscriber' POJO binaries on all the nodes. Please find the complete stack trace below. What am I missing here? Why is it looking for some file inside my IGNITE_HOME when I am starting the client inside my Java program with Ignition.start()?
class org.apache.ignite.IgniteCheckedException: Failed to read class name from file [id=-1219769240, file=/home/benakaraj/Downloads/apache-ignite-fabric-1.5.0.final-bin/work/marshaller/-1219769240.classname]
at org.apache.ignite.internal.MarshallerContextImpl.className(MarshallerContextImpl.java:158)
at org.apache.ignite.internal.MarshallerContextAdapter.getClass(MarshallerContextAdapter.java:174)
at org.apache.ignite.internal.binary.BinaryContext.descriptorForTypeId(BinaryContext.java:483)
at org.apache.ignite.internal.binary.BinaryReaderExImpl.deserialize(BinaryReaderExImpl.java:1443)
at org.apache.ignite.internal.binary.BinaryObjectImpl.deserializeValue(BinaryObjectImpl.java:537)
at org.apache.ignite.internal.binary.BinaryObjectImpl.value(BinaryObjectImpl.java:117)
at org.apache.ignite.internal.processors.cache.CacheObjectContext.unwrapBinary(CacheObjectContext.java:280)
at org.apache.ignite.internal.processors.cache.CacheObjectContext.unwrapBinaryIfNeeded(CacheObjectContext.java:145)
at org.apache.ignite.internal.processors.cache.CacheObjectContext.unwrapBinaryIfNeeded(CacheObjectContext.java:132)
at org.apache.ignite.internal.processors.cache.GridCacheContext.unwrapBinaryIfNeeded(GridCacheContext.java:1748)
at org.apache.ignite.internal.processors.cache.distributed.dht.GridPartitionedSingleGetFuture.setResult(GridPartitionedSingleGetFuture.java:598)
at org.apache.ignite.internal.processors.cache.distributed.dht.GridPartitionedSingleGetFuture.onResult(GridPartitionedSingleGetFuture.java:454)
at org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtCacheAdapter.processNearSingleGetResponse(GridDhtCacheAdapter.java:153)
at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.access$1200(GridDhtAtomicCache.java:128)
at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache$11.apply(GridDhtAtomicCache.java:295)
at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache$11.apply(GridDhtAtomicCache.java:293)
at org.apache.ignite.internal.processors.cache.GridCacheIoManager.processMessage(GridCacheIoManager.java:582)
at org.apache.ignite.internal.processors.cache.GridCacheIoManager.onMessage0(GridCacheIoManager.java:280)
at org.apache.ignite.internal.processors.cache.GridCacheIoManager.handleMessage(GridCacheIoManager.java:204)
at org.apache.ignite.internal.processors.cache.GridCacheIoManager.access$000(GridCacheIoManager.java:80)
at org.apache.ignite.internal.processors.cache.GridCacheIoManager$1.onMessage(GridCacheIoManager.java:163)
at org.apache.ignite.internal.managers.communication.GridIoManager.processRegularMessage0(GridIoManager.java:821)
at org.apache.ignite.internal.managers.communication.GridIoManager.access$1600(GridIoManager.java:103)
at org.apache.ignite.internal.managers.communication.GridIoManager$5.run(GridIoManager.java:784)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.FileNotFoundException: /home/benakaraj/Downloads/apache-ignite-fabric-1.5.0.final-bin/work/marshaller/-1219769240.classname (No such file or directory)
at java.io.FileInputStream.open(Native Method)
at java.io.FileInputStream.<init>(FileInputStream.java:138)
at java.io.FileReader.<init>(FileReader.java:72)
at org.apache.ignite.internal.MarshallerContextImpl.className(MarshallerContextImpl.java:154)
... 26 more
Looks like the cache tries to deserialize the value after retrieving it from cache, but you don't have a class for it on the node where IgniteCache.get() was called. You can either deploy the class, or use IgniteCache.withKeepBinary() to avoid deserialization: https://apacheignite.readme.io/docs/binary-marshaller#binaryobject-cache-api
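A minimal sketch of the withKeepBinary() approach (the config path, cache name and field name below are placeholders) that reads fields of the value without having the Subscriber class on the client's classpath:
import org.apache.ignite.Ignite;
import org.apache.ignite.IgniteCache;
import org.apache.ignite.Ignition;
import org.apache.ignite.binary.BinaryObject;

public class KeepBinaryRead {
    public static void main(String[] args) {
        // Assumes client-config.xml enables client mode and that the cache
        // is called "subscriberCache" -- both are placeholders.
        try (Ignite ignite = Ignition.start("client-config.xml")) {
            IgniteCache<Integer, BinaryObject> cache =
                ignite.<Integer, BinaryObject>cache("subscriberCache").withKeepBinary();

            BinaryObject subscriber = cache.get(1);
            if (subscriber != null) {
                // Access individual fields without deserializing the POJO.
                String name = subscriber.field("name");
                System.out.println("Subscriber name: " + name);
            }
        }
    }
}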
The issue turned out to be pretty simple. Ignite looks for user-defined POJOs among the classes loaded by the default class loader; if it does not find the class there, it looks inside the marshalled classes. In my case, my value POJO was inside the test resources, so the default class loader was not loading the class, causing Ignite to look inside the marshalled classes folder (IGNITE_HOME/work/marshaller/).

Spark Redshift saving into s3 as Parquet

I'm having issues saving a Redshift table into S3 as a Parquet file... The error is coming from the date field. I'm going to try converting the column to a long and storing it as a Unix timestamp for now.
Caused by: java.lang.NumberFormatException: multiple points
at sun.misc.FloatingDecimal.readJavaFormatString(FloatingDecimal.java:1110)
at java.lang.Double.parseDouble(Double.java:540)
at java.text.DigitList.getDouble(DigitList.java:168)
at java.text.DecimalFormat.parse(DecimalFormat.java:1321)
at java.text.SimpleDateFormat.subParse(SimpleDateFormat.java:1793)
at java.text.SimpleDateFormat.parse(SimpleDateFormat.java:1455)
at com.databricks.spark.redshift.Conversions$$anon$1.parse(Conversions.scala:54)
at java.text.DateFormat.parse(DateFormat.java:355)
at com.databricks.spark.redshift.Conversions$.com$databricks$spark$redshift$Conversions$$parseTimestamp(Conversions.scala:67)
at com.databricks.spark.redshift.Conversions$$anonfun$1.apply(Conversions.scala:122)
at com.databricks.spark.redshift.Conversions$$anonfun$1.apply(Conversions.scala:108)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:108)
at scala.collection.TraversableLike$class.map(TraversableLike.scala:244)
at scala.collection.mutable.ArrayOps$ofRef.map(ArrayOps.scala:108)
at com.databricks.spark.redshift.Conversions$.com$databricks$spark$redshift$Conversions$$convertRow(Conversions.scala:108)
at com.databricks.spark.redshift.Conversions$$anonfun$createRowConverter$1.apply(Conversions.scala:135)
at com.databricks.spark.redshift.Conversions$$anonfun$createRowConverter$1.apply(Conversions.scala:135)
at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
at org.apache.spark.sql.execution.datasources.DefaultWriterContainer.writeRows(WriterContainer.scala:241)
... 8 more
These are my gradle dependencies:
dependencies {
    compile 'com.amazonaws:aws-java-sdk:1.10.31'
    compile 'com.amazonaws:aws-java-sdk-redshift:1.10.31'
    compile 'org.apache.spark:spark-core_2.10:1.5.1'
    compile 'org.apache.spark:spark-sql_2.10:1.5.1'
    compile 'com.databricks:spark-redshift_2.10:0.5.1'
    compile 'com.fasterxml.jackson.module:jackson-module-scala_2.10:2.6.3'
}
EDIT 1: df.write.parquet("s3n://bucket/path/log.parquet") is how I'm saving the dataframe after I load in the redshift data using spark-redshift.
EDIT 2: I'm running all of this on my MacBook Air; maybe too much data corrupts the DataFrame? Not sure... It works when I add 'limit 1000', just not for the entire table, so "query" works but "table" doesn't in the spark-redshift options params.
spark-redshift maintainer here. I believe that the error that you're seeing is caused by a thread-safety bug in spark-redshift (Java DecimalFormat instances are not thread-safe and we were sharing a single instance across multiple threads).
This has been fixed in the 0.5.2 release, which is available on Maven Central and Spark Packages. Upgrade to 0.5.2 and this should work!
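For reference, the usual pattern for this class of bug (not the actual spark-redshift patch, just an illustration of the fix described above) is to stop sharing the non-thread-safe formatter and give each thread its own instance:
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;

public class ThreadSafeTimestampParsing {

    // SimpleDateFormat (and the DecimalFormat it uses internally) is not
    // thread-safe, so a single shared instance must not be used for parsing
    // from multiple threads. A ThreadLocal gives each thread its own copy.
    private static final ThreadLocal<SimpleDateFormat> FORMAT =
        ThreadLocal.withInitial(() -> new SimpleDateFormat("yyyy-MM-dd HH:mm:ss"));

    public static Date parseTimestamp(String value) throws ParseException {
        return FORMAT.get().parse(value);
    }
}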

Dbeaver Connecting to Hive - SQLException: Method not supported

I'm getting this error when trying to run a select after connecting to Hive.
Is this a bad jar file?
org.jkiss.dbeaver.model.impl.jdbc.JDBCException: SQL Error: Method not supported
at org.jkiss.dbeaver.model.impl.jdbc.exec.JDBCConnectionImpl.prepareStatement(JDBCConnectionImpl.java:170)
at org.jkiss.dbeaver.model.impl.jdbc.exec.JDBCConnectionImpl.prepareStatement(JDBCConnectionImpl.java:1)
at org.jkiss.dbeaver.model.DBUtils.createStatement(DBUtils.java:985)
at org.jkiss.dbeaver.model.DBUtils.prepareStatement(DBUtils.java:963)
at org.jkiss.dbeaver.runtime.sql.SQLQueryJob.executeSingleQuery(SQLQueryJob.java:313)
at org.jkiss.dbeaver.runtime.sql.SQLQueryJob.extractData(SQLQueryJob.java:633)
at org.jkiss.dbeaver.ui.editors.sql.SQLEditor$QueryResultsProvider.readData(SQLEditor.java:1169)
at org.jkiss.dbeaver.ui.controls.resultset.ResultSetDataPumpJob.run(ResultSetDataPumpJob.java:132)
at org.jkiss.dbeaver.runtime.AbstractJob.run(AbstractJob.java:91)
at org.eclipse.core.internal.jobs.Worker.run(Worker.java:54)
Caused by: java.sql.SQLException: Method not supported
at org.apache.hadoop.hive.jdbc.HiveConnection.createStatement(HiveConnection.java:229)
at org.jkiss.dbeaver.model.impl.jdbc.exec.JDBCConnectionImpl.createStatement(JDBCConnectionImpl.java:350)
at org.jkiss.dbeaver.model.impl.jdbc.exec.JDBCConnectionImpl.prepareStatement(JDBCConnectionImpl.java:138)
... 9 more
There is a class in the Hive JDBC jar called org.apache.hive.jdbc.HiveResultSetMetaData. This class contains a method, isWritable, which is not yet supported by Hive. This is the reason you get the error "Method not supported".
Take the source code of this class, update the method above, then recompile the class and replace it in the jar. This worked for me.
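To make the suggested change concrete, here is a compilable sketch of what the patch could look like; the real fix would replace the body of isWritable(int) in the copied HiveResultSetMetaData source, and returning false (treating every column as read-only) is an assumption:
import java.sql.SQLException;

public class IsWritablePatchSketch {

    // Behaviour the answer describes: an unsupported metadata method that
    // throws, which is what DBeaver surfaces as "Method not supported".
    public boolean isWritableOriginal(int column) throws SQLException {
        throw new SQLException("Method not supported");
    }

    // Patched behaviour: report every column as read-only instead of
    // throwing, then recompile the class and put it back into the jar.
    public boolean isWritablePatched(int column) {
        return false;
    }
}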