I am trying to read streaming data into Azure Databricks . This is the code i've been using:
And its giving me an error saying:
my Databrick Runtime is : 6.4 Extended Support (includes Apache Spark 2.4.5, Scala 2.11)
and i install the package : com.microsoft.azure:azure-eventhubs-spark_2.11:2.3.17
but i still get the same error
Related
I am facing errors when I deploy the hello samza tutorial on yarn following the documentation. Particularly, I was getting errors when I run the run-app.sh script as mentioned.
I am currently using Samza 1.1.0 on AWS EMR (emr - 5.13.0, amazon 2.8.3, zookeeper 3.4.10) and I am trying to deploy hello samza using the documentation provided on samza (https://samza.apache.org/learn/documentation/latest/deployment/yarn.html). Firstly, I could not find bin/build-package.sh but I used bin/deploy.sh to build the maven package. Then, I tried running the following script run-app.sh
./deploy/samza/bin/run-app.sh --config-factory=org.apache.samza.config.factories.PropertiesConfigFactory --config-path ./deploy/samza/config/filter-example.properties
The documentation says that I should be able to deploy the job now but I get the following error which says
Error: Could not find or load main class org.apache.samza.runtime.ApplicationRunnerMain
I deployed an Aerospike container using the official docker hub image. When I try to execute test_list = client.llist(key, 'test_list'), my Python client script returns the following error:
exception.UDFError: (100L, 'UDF: Execution Error 1', 'src/main/llist/llist_operations.c', 93)
I looked at the Aerospike logs and found that each time this code is executed, the error below gets printed:
: WARNING (udf): (src/main/mod_lua.c:599) Lua Create Error: module 'llist' not found:
no field package.preload['llist']
no file './llist.lua'
no file '/usr/local/share/luajit-2.0.3/llist.lua'
no file '/usr/local/share/lua/5.1/llist.lua'
no file '/usr/local/share/lua/5.1/llist/init.lua'
no file '/opt/aerospike/sys/udf/lua/llist.lua'
no file '/opt/aerospike/sys/udf/lua/external/llist.lua'
no file '/opt/aerospike/usr/udf/lua/llist.lua'
no file './llist.so'
no file '/usr/local/lib/lua/5.1/llist.so'
no file '/usr/local/lib/lua/5.1/loadall.so'
no file '/opt/aerospike/sys/udf/lua/llist.so'
no file '/opt/aerospike/sys/udf/lua/external/llist.so'
no file '/opt/aerospike/usr/udf/lua/llist.so'
: INFO (udf): (udf.c:954) lua error, ret:1
I could not find the relevant lua files or a lua installation in the container. I have my code working fine when I run it directly on the host. Is there some extra configuration that needs to be done to the container?
LDTs were dropped in 3.15.
https://www.aerospike.com/docs/guide/ldt_guide.html
Excerpt:
Aerospike has removed the Large Data Type feature as of server version 3.15 after deprecating this functionality 12 months earlier. Please see the removal notice and deprecation notice. The features listed below are no longer in Aerospike servers.
I am trying to configure Hive on Spark but even after trying for 5 days i am not getting any solution..
Steps followed:
1.After spark installation,going in hive console and setting below proeprties
set hive.execution.engine=spark;
set spark.master=spark://INBBRDSSVM294:7077;
set spark.executor.memory=2g;
set spark.serializer=org.apache.spark.serializer.KryoSerializer;
2.Added spark -asembly jar in hive lib.
3.When running select count(*) from table_name I am getting below error:
2016-08-08 15:17:30,207 ERROR [main]: spark.SparkTask (SparkTask.java:execute(131))
- Failed to execute spark task, with exception
'org.apache.hadoop.hive.ql.metadata.HiveException (Failed to create spark client.)'
Hive version: 1.2.1
Spark version: tried with 1.6.1,1.3.1 and 2.0.0
Would appreciate if any one can suggest something.
You can download spark-1.3.1 src from spark download website and try to build spark-1.3.1 without hive version using:
./make-distribution.sh --name "hadoop2-without-hive" --tgz "-Pyarn,hadoop-provided,hadoop-2.4" -Dhadoop.version=2.7.1 -Dyarn.version=2.7.1 –DskipTests
Then copy spark-assembly-1.3.1-hadoop2.7.1.jar to hive/lib folder.
And follow https://cwiki.apache.org/confluence/display/Hive/Hive+on+Spark%3A+Getting+Started#HiveonSpark:GettingStarted-SparkInstallation to set necessary properties.
First of all, you need to pay attention to which versions are compatible. If you choose Hive 1.2.1, I advise you to use Spark 1.3.1. You can see the version compatibility list here.
The mistake you have is a general mistake. You need to start Spark and see what errors the Spark Workers says. However, have you already copied the hive-site.xml to spark/conf?
I am currently configuring a Cloudera HDP dev image using this tutorial on CentOS 6.5, installing the base and then adding the different components as I need them. Currently, I am installing / testing HCatalog using this section of the tutorial linked above.
I have successfully installed the package and am now testing HCatalog integration with Pig with the following script:
A = LOAD 'groups' USING org.apache.hcatalog.pig.HCatLoader();
DESCRIBE A;
I have previously created and populated a 'groups' table in Hive before running the command. When I run the script with the command pig -useHCatalog test.pig I get an exception rather than the expected output. Below is the initial part of the stacktrace:
Pig Stack Trace
---------------
ERROR 2245: Cannot get schema from loadFunc org.apache.hcatalog.pig.HCatLoader
org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1000: Error during parsing. Cannot get schema from loadFunc org.apache.hcatalog.pig.HCatLoader
at org.apache.pig.PigServer$Graph.parseQuery(PigServer.java:1608)
at org.apache.pig.PigServer$Graph.registerQuery(PigServer.java:1547)
at org.apache.pig.PigServer.registerQuery(PigServer.java:518)
at org.apache.pig.tools.grunt.GruntParser.processPig(GruntParser.java:991)
at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:412)
...
Has anyone encountered this error before? Any help would be much appreciated. I would be happy to provide more information if you need it.
The error was caused by HBase's Thrift server not being proper configured. I installed/configured Thrift and added the following to my hive-xml.site with the proper server information added:
<property>
<name>hive.metastore.uris</name>
<value>thrift://<!--URL of Your Server-->:9083</value>
<description>IP address (or fully-qualified domain name) and port of the metastore host</description>
</property>
I thought the snippet above was not required since I am running Cloudera HDP in pseudo-distributed mode.Turns out, it and HBase Thrift are required to use HCatalog with Pig.
I am currently using HADOOP 2.2.0 , HIVE 0.12.0 and Impala 1.2.3. When i am trying to start imapala -server its not getting started. When i checked the log directory , i am getting the following error.
Any help is highly appreciated.
Failed on local exception: com.google.protobuf.InvalidProtocolBufferException: Message missing required fields: callId, status;
Host Details : local host is: "XXXX/[IP-ADDESS]"; destination host is: "hadoop-master":9000;
E0219 13:15:16.223870 22635 impala-server.cc:403] Aborting Impala Server startup due to improper configuration
Hadoop 2.2 is using protobuf 2.5 and Impala is using protobuf 2.4.0a .
Unfortunately code generated with protobuf 2.5 is incompatible with older protobuf libraries.
You can check JIRA ISSUE(HADOOP-9845) for the background or design decision to upgrade protobuf in Hadoop.
SOLUTION
Remove older protobuf .
Install protbuf 2.5
Build Impala