SPARK Exception thrown in awaitResult

I am running Spark locally (I am not using Mesos), and when running a chain of joins such as d3 = join(d1, d2) followed by d5 = join(d3, d4), I get the following exception: "org.apache.spark.SparkException: Exception thrown in awaitResult".
Googling for it, I found the following two related links:
1) https://github.com/apache/spark/commit/947b9020b0d621bc97661a0a056297e6889936d3
2) https://github.com/apache/spark/pull/12433
Both explain why it happens, but neither says what to do to solve it.
A bit more about my running configuration:
1) I am using spark-core_2.11 and spark-sql_2.11.
2) The session is created as follows:
SparkSession spark = SparkSession
        .builder()
        .master("local[6]")
        .appName("DatasetForCaseNew")
        .config("spark.executor.memory", "4g")
        .config("spark.shuffle.blockTransferService", "nio")
        .getOrCreate();
3) The method that fails:
public Dataset buildDataset(){
...
    // STEP A
    // Join prdDS with cmpDS
    Dataset<Row> prdDS_Join_cmpDS = res1
            .join(res2, res1.col("PRD_asin#100").equalTo(res2.col("CMP_asin")), "inner");
    prdDS_Join_cmpDS.take(1);

    // STEP B
    // Join the STEP A result with res3
    Dataset<Row> prdDS_Join_cmpDS_Join = prdDS_Join_cmpDS
            .join(res3, prdDS_Join_cmpDS.col("PRD_asin#100").equalTo(res3.col("ORD_asin")), "inner");
    prdDS_Join_cmpDS_Join.take(1);
    prdDS_Join_cmpDS_Join.show();
}
The exception (see below for the stack trace) is thrown when the computation reaches STEP B; everything up to and including STEP A works fine.
Is there anything wrong or missing?
Thanks for your help in advance. 
Best Regards,
Carlo
=== STACK TRACE
Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 422.102 sec <<< FAILURE!
testBuildDataset(org.mksmart.amaretto.ml.DatasetPerHourVerOneTest)  Time elapsed: 421.994 sec  <<< ERROR!
org.apache.spark.SparkException: Exception thrown in awaitResult: 
at org.apache.spark.util.ThreadUtils$.awaitResult(ThreadUtils.scala:194)
at org.apache.spark.sql.execution.exchange.BroadcastExchangeExec.doExecuteBroadcast(BroadcastExchangeExec.scala:102)
at org.apache.spark.sql.execution.InputAdapter.doExecuteBroadcast(WholeStageCodegenExec.scala:229)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeBroadcast$1.apply(SparkPlan.scala:125)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeBroadcast$1.apply(SparkPlan.scala:125)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:136)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:133)
at org.apache.spark.sql.execution.SparkPlan.executeBroadcast(SparkPlan.scala:124)
at org.apache.spark.sql.execution.joins.BroadcastHashJoinExec.prepareBroadcast(BroadcastHashJoinExec.scala:98)
at org.apache.spark.sql.execution.joins.BroadcastHashJoinExec.codegenInner(BroadcastHashJoinExec.scala:197)
at org.apache.spark.sql.execution.joins.BroadcastHashJoinExec.doConsume(BroadcastHashJoinExec.scala:82)
at org.apache.spark.sql.execution.CodegenSupport$class.consume(WholeStageCodegenExec.scala:153)
at org.apache.spark.sql.execution.joins.SortMergeJoinExec.consume(SortMergeJoinExec.scala:35)
at org.apache.spark.sql.execution.joins.SortMergeJoinExec.doProduce(SortMergeJoinExec.scala:565)
at org.apache.spark.sql.execution.CodegenSupport$$anonfun$produce$1.apply(WholeStageCodegenExec.scala:83)
at org.apache.spark.sql.execution.CodegenSupport$$anonfun$produce$1.apply(WholeStageCodegenExec.scala:78)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:136)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:133)
at org.apache.spark.sql.execution.CodegenSupport$class.produce(WholeStageCodegenExec.scala:78)
at org.apache.spark.sql.execution.joins.SortMergeJoinExec.produce(SortMergeJoinExec.scala:35)
at org.apache.spark.sql.execution.joins.BroadcastHashJoinExec.doProduce(BroadcastHashJoinExec.scala:77)
at org.apache.spark.sql.execution.CodegenSupport$$anonfun$produce$1.apply(WholeStageCodegenExec.scala:83)
at org.apache.spark.sql.execution.CodegenSupport$$anonfun$produce$1.apply(WholeStageCodegenExec.scala:78)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:136)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:133)
at org.apache.spark.sql.execution.CodegenSupport$class.produce(WholeStageCodegenExec.scala:78)
at org.apache.spark.sql.execution.joins.BroadcastHashJoinExec.produce(BroadcastHashJoinExec.scala:38)
at org.apache.spark.sql.execution.WholeStageCodegenExec.doCodeGen(WholeStageCodegenExec.scala:304)
at org.apache.spark.sql.execution.WholeStageCodegenExec.doExecute(WholeStageCodegenExec.scala:343)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:115)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:115)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:136)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:133)
at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:114)
at org.apache.spark.sql.execution.SparkPlan.getByteArrayRdd(SparkPlan.scala:240)
at org.apache.spark.sql.execution.SparkPlan.executeTake(SparkPlan.scala:323)
at org.apache.spark.sql.execution.CollectLimitExec.executeCollect(limit.scala:38)
at org.apache.spark.sql.Dataset$$anonfun$org$apache$spark$sql$Dataset$$execute$1$1.apply(Dataset.scala:2122)
at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:57)
at org.apache.spark.sql.Dataset.withNewExecutionId(Dataset.scala:2436)
at org.apache.spark.sql.Dataset.org$apache$spark$sql$Dataset$$execute$1(Dataset.scala:2121)
at org.apache.spark.sql.Dataset.org$apache$spark$sql$Dataset$$collect(Dataset.scala:2128)
at org.apache.spark.sql.Dataset$$anonfun$head$1.apply(Dataset.scala:1862)
at org.apache.spark.sql.Dataset$$anonfun$head$1.apply(Dataset.scala:1861)
at org.apache.spark.sql.Dataset.withTypedCallback(Dataset.scala:2466)
at org.apache.spark.sql.Dataset.head(Dataset.scala:1861)
at org.apache.spark.sql.Dataset.take(Dataset.scala:2078)
at org.mksmart.amaretto.ml.DatasetPerHourVerOne.buildDataset(DatasetPerHourVerOne.java:115)
at org.mksmart.amaretto.ml.DatasetPerHourVerOneTest.testBuildDataset(DatasetPerHourVerOneTest.java:76)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
at org.apache.maven.surefire.junit4.JUnit4TestSet.execute(JUnit4TestSet.java:53)
at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:123)
at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:104)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at org.apache.maven.surefire.util.ReflectionUtils.invokeMethodWithArray(ReflectionUtils.java:164)
at org.apache.maven.surefire.booter.ProviderFactory$ProviderProxy.invoke(ProviderFactory.java:110)
at org.apache.maven.surefire.booter.SurefireStarter.invokeProvider(SurefireStarter.java:175)
at org.apache.maven.surefire.booter.SurefireStarter.runSuitesInProcessWhenForked(SurefireStarter.java:107)
at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:68)
Caused by: java.util.concurrent.TimeoutException: Futures timed out after [300 seconds]
at scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:219)
at scala.concurrent.impl.Promise$DefaultPromise.result(Promise.scala:223)
at scala.concurrent.Await$$anonfun$result$1.apply(package.scala:190)
at scala.concurrent.BlockContext$DefaultBlockContext$.blockOn(BlockContext.scala:53)
at scala.concurrent.Await$.result(package.scala:190)
at org.apache.spark.util.ThreadUtils$.awaitResult(ThreadUtils.scala:190)
... 85 more

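Just to add to what Carlo said, I used the following line of code, which disables automatic broadcast joins (the stack trace shows the timeout occurring in BroadcastExchangeExec):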
sqlContext.setConf("spark.sql.autoBroadcastJoinThreshold", "-1")

I was running into this problem too, and none of my online searches led to the right solution.
Adding sparkConf.set("spark.sql.autoBroadcastJoinThreshold", "-1") solves the problem.

Export all jobs from jenkins including run history

Ideally I need a script that outputs the following information in a CSV format that's easy to import into Excel:
job name,number of times run in last year,number of times run overall,last run status
For that job, output no individual run details.
I tried the script from "List Jenkins job build details for last one year along with the user who triggered the build" on my Jenkins, but got an error:
java.lang.NullPointerException: Cannot invoke method getShortDescription() on null object
at org.codehaus.groovy.runtime.NullObject.invokeMethod(NullObject.java:91)
at org.codehaus.groovy.runtime.callsite.PogoMetaClassSite.call(PogoMetaClassSite.java:48)
at org.codehaus.groovy.runtime.callsite.CallSiteArray.defaultCall(CallSiteArray.java:48)
Any idea what needs changing in the Groovy? Or is there a better solution?
Thanks all!

Thanks to @daggett and @ian. Both worked.
I went with Ian's:
def jobNamePattern ='.*'   // adjust to folder/job regex as needed
def daysBack = 365   // adjust to how many days back to report on
def timeToDays = 24*60*60*1000L  // milliseconds per day (long, so daysBack*timeToDays does not overflow int)
println "Job Name: ( # builds: last ${daysBack} days / overall )  Last Status\n   Number | Trigger | Status | Date | Duration\n"
Jenkins.instance.allItems.findAll() {
  it instanceof Job && it.fullName.matches(jobNamePattern)
}.each { job ->
  builds = job.getBuilds().byTimestamp(System.currentTimeMillis() - daysBack*timeToDays, System.currentTimeMillis())
  println job.fullName + ' ( ' + builds.size() + ' / ' + job.builds.size() + ' )  ' + job.getLastBuild()?.result
  
  // individual build details
  builds.each { build ->
    println '   ' + build.number + ' | ' + build.getCauses()[0]?.getShortDescription() + ' | ' + build.result + ' | ' + build.getTimestampString2() + ' | ' + build.getDurationString()
  }
}
return

spark sql request two phases NPE

I am new to Spark (using 2.4.0). I ran into a strange (to me) NullPointerException. The following code throws an NPE:
val ds = "2020-04-01"
spark.sql("select ds, db_name, table_name, type FROM datainfra.hive_tables " +
"where ds = '%s' and db_name = 'db_exports' limit 1".format(ds)).map(table =>
spark.sql("select col_name FROM datainfra.hive_columns " +
"where ds = '%s' and db_name = '%s' and table_name = '%s' and table_type = '%s' and col_type = 'string'"
.format(table.getAs[String]("ds"),
table.getAs[String]("db_name"),
table.getAs[String]("table_name"),
table.getAs[String]("type")))
.map(columnNameRow => columnNameRow.getAs[String](0)).collect().mkString("||")
)
But each of the DataFrames works fine separately:
spark.sql("select ds, db_name, table_name, type FROM datainfra.hive_tables " +
"where ds = '%s' and db_name = 'db_exports' limit 1".format(ds)).show // returns results
spark.sql("select col_name FROM datainfra.hive_columns " +
("where ds = '%s' and db_name = '%s' and table_name = '%s' and table_type = '%s' and col_type = 'string' " +
"and col_name != 'ds'")
.format(ds,
"hardcode_db_name",
"hardcode_table_name",
"hardcode_type")).map(columnNameRow => columnNameRow.getAs[String](0)).collect().mkString("||")
How can that be?

Q: I am new to Spark (using 2.4.0). I ran into a strange (to me) NPE. The following code throws an NPE. How can that be?
The spark.sql("sql").map(row => spark.sql("some sql")) pattern is the problem.
In your case it is the cause of the NullPointerException:
val ds = "2020-04-01"
val test1: Dataset[String] = spark.sql("select ds, db_name, table_name, type FROM datainfra.hive_tables " +
"where ds = '%s' and db_name = 'db_exports' limit 1".format(ds))
.map(table =>
spark.sql("select col_name FROM datainfra.hive_columns " +
"where ds = '%s' and db_name = '%s' and table_name = '%s' and table_type = '%s' and col_type = 'string'"
.format(table.getAs[String]("ds"),
table.getAs[String]("db_name"),
table.getAs[String]("table_name"),
table.getAs[String]("type")))
.map(columnNameRow => columnNameRow.getAs[String](0)).collect().mkString("||")
)
To demonstrate this, I prepared a similar example (see below). It reproduces the same NullPointerException, so it looks like this pattern is not supported.
package com.examples
import org.apache.log4j.Level
import org.apache.spark.sql.{DataFrame, Dataset, SparkSession}
/**
  * Created by Ram Ghadiyaram
  */
object RDDOfTupleExample {
  org.apache.log4j.Logger.getLogger("org").setLevel(Level.ERROR)
  def main(args: Array[String]) {
    val spark = SparkSession.builder.
      master("local")
      .appName(this.getClass.getName)
      .getOrCreate()
    import spark.implicits._
    val donuts: DataFrame = Seq(("plain donut", 1.50), ("plain donut", 1.50)
      , ("vanilla donut", 2.0), ("vanilla donut", 2.0)
      , ("glazed donut", 2.50))
      .toDF("Donut_Name", "Price")
    // Suppose this is your Hive table; since I don't have Hive, I simulated it with a temp view
    donuts.createOrReplaceTempView("mydonuts")
    val test: Dataset[String] = spark.sql("select \"NCA-15\" as mylabel, count(Donut_Name) as mydonutcount from mydonuts")
      .map(x => spark.sql(s"select ${x.get(0)}, ${x.get(1)} ").collect().mkString(",")) // this is the problem
    test.show
  }
}
Result:
[2020-04-11 16:27:45,687] ERROR Exception in task 0.0 in stage 1.0 (TID 1) (org.apache.spark.executor.Executor:91)
java.lang.NullPointerException
at org.apache.spark.sql.SparkSession.sessionState$lzycompute(SparkSession.scala:143)
at org.apache.spark.sql.SparkSession.sessionState(SparkSession.scala:141)
at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:642)
at com.examples.RDDOfTupleExample$$anonfun$1.apply(RDDOfTupleExample.scala:29)
at com.examples.RDDOfTupleExample$$anonfun$1.apply(RDDOfTupleExample.scala:29)
at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage2.mapelements_doConsume_0$(Unknown Source)
at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage2.deserializetoobject_doConsume_0$(Unknown Source)
at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage2.processNext(Unknown Source)
at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
at org.apache.spark.sql.execution.WholeStageCodegenExec$$anonfun$13$$anon$1.hasNext(WholeStageCodegenExec.scala:636)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$2.apply(SparkPlan.scala:255)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$2.apply(SparkPlan.scala:247)
at org.apache.spark.rdd.RDD$$anonfun$mapPartitionsInternal$1$$anonfun$apply$24.apply(RDD.scala:858)
at org.apache.spark.rdd.RDD$$anonfun$mapPartitionsInternal$1$$anonfun$apply$24.apply(RDD.scala:858)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:346)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:310)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:346)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:310)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
at org.apache.spark.scheduler.Task.run(Task.scala:123)
at org.apache.spark.executor.Executor$TaskRunner$$anonfun$10.apply(Executor.scala:408)
at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1360)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:414)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
[2020-04-11 16:27:45,710] ERROR Task 0 in stage 1.0 failed 1 times; aborting job (org.apache.spark.scheduler.TaskSetManager:70)
Exception in thread "main" org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 1.0 failed 1 times, most recent failure: Lost task 0.0 in stage 1.0 (TID 1, localhost, executor driver): java.lang.NullPointerException
at org.apache.spark.sql.SparkSession.sessionState$lzycompute(SparkSession.scala:143)
at org.apache.spark.sql.SparkSession.sessionState(SparkSession.scala:141)
at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:642)
at com.examples.RDDOfTupleExample$$anonfun$1.apply(RDDOfTupleExample.scala:29)
at com.examples.RDDOfTupleExample$$anonfun$1.apply(RDDOfTupleExample.scala:29)
at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage2.mapelements_doConsume_0$(Unknown Source)
at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage2.deserializetoobject_doConsume_0$(Unknown Source)
at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage2.processNext(Unknown Source)
at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
at org.apache.spark.sql.execution.WholeStageCodegenExec$$anonfun$13$$anon$1.hasNext(WholeStageCodegenExec.scala:636)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$2.apply(SparkPlan.scala:255)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$2.apply(SparkPlan.scala:247)
at org.apache.spark.rdd.RDD$$anonfun$mapPartitionsInternal$1$$anonfun$apply$24.apply(RDD.scala:858)
at org.apache.spark.rdd.RDD$$anonfun$mapPartitionsInternal$1$$anonfun$apply$24.apply(RDD.scala:858)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:346)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:310)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:346)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:310)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
at org.apache.spark.scheduler.Task.run(Task.scala:123)
at org.apache.spark.executor.Executor$TaskRunner$$anonfun$10.apply(Executor.scala:408)
at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1360)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:414)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Driver stacktrace:
at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1891)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1879)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1878)
at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48)
at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1878)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:927)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:927)
at scala.Option.foreach(Option.scala:257)
at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:927)
at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:2112)
at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:2061)
at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:2050)
at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:49)
at org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:738)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:2061)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:2082)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:2101)
at org.apache.spark.sql.execution.SparkPlan.executeTake(SparkPlan.scala:365)
at org.apache.spark.sql.execution.CollectLimitExec.executeCollect(limit.scala:38)
at org.apache.spark.sql.Dataset.org$apache$spark$sql$Dataset$$collectFromPlan(Dataset.scala:3389)
at org.apache.spark.sql.Dataset$$anonfun$head$1.apply(Dataset.scala:2550)
at org.apache.spark.sql.Dataset$$anonfun$head$1.apply(Dataset.scala:2550)
at org.apache.spark.sql.Dataset$$anonfun$52.apply(Dataset.scala:3370)
at org.apache.spark.sql.execution.SQLExecution$$anonfun$withNewExecutionId$1.apply(SQLExecution.scala:80)
at org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:127)
at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:75)
at org.apache.spark.sql.Dataset.withAction(Dataset.scala:3369)
at org.apache.spark.sql.Dataset.head(Dataset.scala:2550)
at org.apache.spark.sql.Dataset.take(Dataset.scala:2764)
at org.apache.spark.sql.Dataset.getRows(Dataset.scala:254)
at org.apache.spark.sql.Dataset.showString(Dataset.scala:291)
at org.apache.spark.sql.Dataset.show(Dataset.scala:751)
at org.apache.spark.sql.Dataset.show(Dataset.scala:710)
at org.apache.spark.sql.Dataset.show(Dataset.scala:719)
at com.examples.RDDOfTupleExample$.main(RDDOfTupleExample.scala:30)
at com.examples.RDDOfTupleExample.main(RDDOfTupleExample.scala)
Caused by: java.lang.NullPointerException
at org.apache.spark.sql.SparkSession.sessionState$lzycompute(SparkSession.scala:143)
at org.apache.spark.sql.SparkSession.sessionState(SparkSession.scala:141)
at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:642)
at com.examples.RDDOfTupleExample$$anonfun$1.apply(RDDOfTupleExample.scala:29)
at com.examples.RDDOfTupleExample$$anonfun$1.apply(RDDOfTupleExample.scala:29)
at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage2.mapelements_doConsume_0$(Unknown Source)
at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage2.deserializetoobject_doConsume_0$(Unknown Source)
at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage2.processNext(Unknown Source)
at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
at org.apache.spark.sql.execution.WholeStageCodegenExec$$anonfun$13$$anon$1.hasNext(WholeStageCodegenExec.scala:636)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$2.apply(SparkPlan.scala:255)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$2.apply(SparkPlan.scala:247)
at org.apache.spark.rdd.RDD$$anonfun$mapPartitionsInternal$1$$anonfun$apply$24.apply(RDD.scala:858)
at org.apache.spark.rdd.RDD$$anonfun$mapPartitionsInternal$1$$anonfun$apply$24.apply(RDD.scala:858)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:346)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:310)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:346)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:310)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
at org.apache.spark.scheduler.Task.run(Task.scala:123)
at org.apache.spark.executor.Executor$TaskRunner$$anonfun$10.apply(Executor.scala:408)
at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1360)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:414)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
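Conclusion: the nested spark.sql pattern shown above does not work (NPE). The function passed to map runs inside a task on an executor (the trace fails in SparkSession.sessionState within Executor$TaskRunner), and the SparkSession is apparently not usable there. You have to execute the queries separately, for example by collecting the first result on the driver and then running the second query for each row, or find some other way such as a join.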

How to send Ionic push notifications using Parse-Server?

I want to send push notifications from one installation of my Ionic app to others (app to app). I wrote Parse Cloud Code and plain TypeScript, but neither works. My requirement is to send push notifications to all devices and also to a specific device. Please review my code below and help me.
My Cloud Code:
Parse.Cloud.define("send", (request) => {
    return Parse.Push.send({
        channels: ["News"],
        data: {
            title: "Hello from the Cloud Code",
            alert: "Back4App rocks!",
        }
    }, { useMasterKey: true });
});
TypeScript code calling the Cloud Code:
Parse.Cloud.run('send').then(function (ratings) {
    debugger
    console.log("updated"); // result should be 'Update object successfully'
}).catch((error) => {
    console.log(error)
    console.log("fail")
});

You should add a query parameter. That way the push knows which user to send the push to.
You don't need the query parameter for channels. Does your installation have the channel?
Feel free to open an issue on the JS SDK.

FileUpload: Get the full path of the CSV file

I'm working on a project where I want to upload an Excel CSV file containing data for a table in my database, using a LOAD DATA LOCAL query. When I test the query with the full path of the file written directly in the query (for example C: //../file.csv), it works without problems. I wanted to use the PrimeFaces p:fileUpload component, but when I choose a CSV file from my desktop or from another directory on my PC, it returns only the name of the selected file and not its full path, so I get an error:
org.hibernate.exception.GenericJDBCException: could not execute statement
java.io.FileNotFoundException: extraction1.csv (The specified file can not be found)
This is to be expected, because the full path with the folder name is missing. What I want is to get the name of the file together with the directory it was selected from, so that my query can execute correctly. As the line below shows, I would like the path of the selected file to be returned along with the file name. Thanks.
prelevServ.importToDB("C:\\Users\\helyoubi\\Desktop\\Japon 2\\"+fichierUpload.getFileName());
My JSF form:
<h:form enctype="multipart/form-data">
    <p:growl id="messages" showDetail="true" />
<p:fileUpload label="Choisir" value="#{importFichier.fichierUpload}" mode="simple" skinSimple="true"/>
<p:separator/>
<p:commandButton value="Envoyer" ajax="false" action="#{importFichier.importation}" />
</h:form>
My managed bean:
@ManagedBean
public class ImportFichier implements Serializable {

    private static final long serialVersionUID = 1L;

    private UploadedFile fichierUpload;

    private PrelevementServices prelevServ = new PrelevementServicesImpl();

    public UploadedFile getFichierUpload() {
        return fichierUpload;
    }

    public void setFichierUpload(UploadedFile fichierUpload) {
        this.fichierUpload = fichierUpload;
    }

    public void importation() {
        if (fichierUpload.getFileName() != null) {
            //prelevServ.importToDB("C:\\Users\\helyoubi\\Desktop\\Japon 2\\"+fichierUpload.getFileName());
            prelevServ.importToDB(fichierUpload.getFileName());

            FacesMessage message = new FacesMessage("Succesful", fichierUpload.getFileName() + " is uploaded.");
            FacesContext.getCurrentInstance().addMessage(null, message);
        } else {
            FacesMessage message = new FacesMessage("Le chemin du fichier : " + fichierUpload.getFileName() + " est introuvable");
            FacesContext.getCurrentInstance().addMessage(null, message);
        }

        System.out.println("CSV added to your the DB Table");
    }
}
My query:
@Override
public void importToDB(String cheminFichier) {
    session.beginTransaction();

    session.createSQLQuery("LOAD DATA LOCAL INFILE :filename INTO TABLE Prelevement_saisons FIELDS TERMINATED BY ',' ENCLOSED BY '\"'(espece,saison,departement,commune,code,attributions,realisations)")
            .setString("filename", cheminFichier)
            .executeUpdate();

    session.getTransaction().commit();
}

Try sending .getPath():
prelevServ.importToDB(fichierUpload.getPath());
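Note that browsers normally send only the file name and the file's bytes, not the directory on the client, so a full client-side path may simply not be available on the server. One possible workaround, sketched below under that assumption, is to copy the uploaded bytes to a temporary file on the server and pass that file's absolute path to importToDB. The names UploadToTempFile and saveToTempFile are hypothetical (they are not part of PrimeFaces or of the code above), and the InputStream is assumed to come from the uploaded file object.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper: not part of PrimeFaces or of the original code.
final class UploadToTempFile {

    // Copies the uploaded bytes to a temporary file and returns its absolute path.
    static String saveToTempFile(InputStream uploaded, String suffix) throws IOException {
        Path tmp = Files.createTempFile("upload-", suffix);
        // createTempFile has already created an empty file, so overwrite it.
        Files.copy(uploaded, tmp, StandardCopyOption.REPLACE_EXISTING);
        return tmp.toAbsolutePath().toString();
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the stream of the uploaded CSV (sample data only).
        byte[] csv = "espece,saison\nchevreuil,2019\n".getBytes(StandardCharsets.UTF_8);
        String path = saveToTempFile(new ByteArrayInputStream(csv), ".csv");
        // This is the value that would be passed to importToDB(...).
        System.out.println(path);
    }
}
```

With this approach the temporary file should be deleted once the import has finished.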

Version Portable for Hazelcast : Converting Date to Long throws Exception in Predicates

I am trying to use com.hazelcast.nio.serialization.VersionedPortable to serialize a Customer class. It does not support Date serialization by default, so we need to convert the date to a Long:
@Override
public void writePortable(PortableWriter writer) throws IOException {
    if (dob != null) {
        Long dobLong = dob.getTime();
        writer.writeLong(DOB_FIELD, dobLong);
    } else {
        writer.writeLong(DOB_FIELD, -1);
    }
}

@Override
public void readPortable(PortableReader reader) throws IOException {
    if (reader.hasField(DOB_FIELD)) {
        Long dobLong = reader.readLong(DOB_FIELD);
        dob = dobLong == -1 ? null : new Date(dobLong);
    }
}
In CustomerService I have findCustomersByDob, which uses com.hazelcast.query.Predicate:
public Collection<Customer> findCustomersByDob(Date dobStart, Date dobEnd) {
    Predicate dobStartPredicate = Predicates.greaterEqual("dob", dobStart);
    Predicate dobEndPredicate = Predicates.lessThan("dob", dobEnd);
    Predicate andPredicate = Predicates.and(dobStartPredicate, dobEndPredicate);
    return idToCustomerMap.values(andPredicate);
}
At idToCustomerMap.values(andPredicate); I get the following exception:
java.lang.IllegalArgumentException: Cannot convert [Tue Jan 01 00:00:00 IST 1980] to long
at com.hazelcast.query.impl.TypeConverters$LongConverter.convert(TypeConverters.java:159)
at com.hazelcast.query.impl.IndexImpl.convert(IndexImpl.java:154)
at com.hazelcast.query.impl.IndexImpl.getSubRecords(IndexImpl.java:148)
at com.hazelcast.query.Predicates$GreaterLessPredicate.filter(Predicates.java:691)
at com.hazelcast.query.Predicates$AndPredicate.filter(Predicates.java:477)
at com.hazelcast.query.impl.IndexService.query(IndexService.java:97)
at com.hazelcast.map.impl.operation.QueryOperation.run(QueryOperation.java:92)
at com.hazelcast.spi.impl.operationservice.impl.OperationRunnerImpl.run(OperationRunnerImpl.java:137)
at com.hazelcast.spi.impl.operationservice.impl.OperationRunnerImpl.run(OperationRunnerImpl.java:309)
at com.hazelcast.spi.impl.operationexecutor.classic.OperationThread.processPacket(OperationThread.java:142)
at com.hazelcast.spi.impl.operationexecutor.classic.OperationThread.process(OperationThread.java:115)
at com.hazelcast.spi.impl.operationexecutor.classic.OperationThread.doRun(OperationThread.java:101)
at com.hazelcast.spi.impl.operationexecutor.classic.OperationThread.run(OperationThread.java:76)
at ------ End remote and begin local stack-trace ------.(Unknown Source)
at com.hazelcast.spi.impl.operationservice.impl.InvocationFuture.resolveApplicationResponse(InvocationFuture.java:384)
at com.hazelcast.spi.impl.operationservice.impl.InvocationFuture.resolveApplicationResponseOrThrowException(InvocationFuture.java:334)
at com.hazelcast.spi.impl.operationservice.impl.InvocationFuture.get(InvocationFuture.java:225)
at com.hazelcast.spi.impl.operationservice.impl.InvocationFuture.get(InvocationFuture.java:204)
at com.hazelcast.map.impl.client.AbstractMapQueryRequest.collectResults(AbstractMapQueryRequest.java:103)
at com.hazelcast.map.impl.client.AbstractMapQueryRequest.invoke(AbstractMapQueryRequest.java:77)
at com.hazelcast.client.impl.client.InvocationClientRequest.process(InvocationClientRequest.java:27)
at com.hazelcast.client.impl.ClientEngineImpl$ClientPacketProcessor.processRequest(ClientEngineImpl.java:463)
at com.hazelcast.client.impl.ClientEngineImpl$ClientPacketProcessor.run(ClientEngineImpl.java:379)
at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
at java.lang.Thread.run(Unknown Source)
at com.hazelcast.util.executor.HazelcastManagedThread.executeRun(HazelcastManagedThread.java:76)
at com.hazelcast.util.executor.HazelcastManagedThread.run(HazelcastManagedThread.java:92)
at ------ End remote and begin local stack-trace ------.(Unknown Source)
at com.hazelcast.client.spi.impl.ClientInvocationFuture.resolveResponse(ClientInvocationFuture.java:147)
at com.hazelcast.client.spi.impl.ClientInvocationFuture.get(ClientInvocationFuture.java:114)
at com.hazelcast.client.spi.impl.ClientInvocationFuture.get(ClientInvocationFuture.java:89)
at com.hazelcast.client.spi.ClientProxy.invoke(ClientProxy.java:151)
at com.hazelcast.client.proxy.ClientMapProxy.values(ClientMapProxy.java:837)
at com.foo.hazelcast.client.services.CustomerServiceImpl.findCustomersByDob(CustomerServiceImpl.java:99)
at com.foo.hazelcast.client.services.CustomerServiceTest.searchCustomersByDobRange(CustomerServiceTest.java:104)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
at java.lang.reflect.Method.invoke(Unknown Source)
at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
at org.springframework.test.context.junit4.statements.RunBeforeTestMethodCallbacks.evaluate(RunBeforeTestMethodCallbacks.java:75)
at org.springframework.test.context.junit4.statements.RunAfterTestMethodCallbacks.evaluate(RunAfterTestMethodCallbacks.java:86)
at org.springframework.test.context.junit4.statements.SpringRepeat.evaluate(SpringRepeat.java:84)
at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
at org.springframework.test.context.junit4.SpringJUnit4ClassRunner.runChild(SpringJUnit4ClassRunner.java:252)
at org.springframework.test.context.junit4.SpringJUnit4ClassRunner.runChild(SpringJUnit4ClassRunner.java:94)
at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
at org.springframework.test.context.junit4.statements.RunBeforeTestClassCallbacks.evaluate(RunBeforeTestClassCallbacks.java:61)
at org.springframework.test.context.junit4.statements.RunAfterTestClassCallbacks.evaluate(RunAfterTestClassCallbacks.java:70)
at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
at org.springframework.test.context.junit4.SpringJUnit4ClassRunner.run(SpringJUnit4ClassRunner.java:191)
at org.eclipse.jdt.internal.junit4.runner.JUnit4TestReference.run(JUnit4TestReference.java:50)
at org.eclipse.jdt.internal.junit.runner.TestExecution.run(TestExecution.java:38)
at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:459)
at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:675)
at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.run(RemoteTestRunner.java:382)
at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.main(RemoteTestRunner.java:192)
Although it is clear that the exception is caused by this special handling of Date, I would like to know the exact reason for the type conversion here. I read that Hazelcast keeps the data in serialized form, so this should not be a problem, right?
Also, how do I overcome this issue?
EDIT:
I fixed this issue by passing date.getTime() in the predicates as well:
public Collection<Customer> findCustomersByDob(Date dobStart, Date dobEnd) {
    Predicate dobStartPredicate = Predicates.greaterEqual("dob", dobStart.getTime());
    Predicate dobEndPredicate = Predicates.lessThan("dob", dobEnd.getTime());
    Predicate andPredicate = Predicates.and(dobStartPredicate, dobEndPredicate);
    return idToCustomerMap.values(andPredicate);
}
I am guessing this is because Hazelcast keeps the data in serialized form and therefore cannot compare the stored long against the Date in the predicate.
Still, this approach is not very clean. Is there a cleaner way to avoid this with VersionedPortable?

@vinodhini-chockalingam, Portable is designed to work with different languages, not just Java, so you can write a value from Node.js and read it from Java with Portable. This is why you cannot write a java.util.Date object to Portable directly; you need to convert it to a supported type.
And since you write the Date as a long, Hazelcast treats that field as a long. Only you know that it is really a Date field. When you run a predicate on it, Hazelcast expects a long value to compare against, which is why passing a Date fails.
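
One way to make this less error-prone is to keep the Date-to-long convention in a single place and use it both when writing the field and when building predicates. The following is only a minimal sketch: DobCodec and its method names are hypothetical and not part of Hazelcast, and the -1 sentinel is simply carried over from the question. Note that -1 is itself a valid timestamp (one millisecond before the epoch), so such a date would be read back as null, and a range predicate whose bounds include -1 would also match records without a date.

```java
import java.util.Date;

// Hypothetical helper (not a Hazelcast class): one place for the
// Date <-> long convention used for the "dob" Portable field.
final class DobCodec {
    // Sentinel the question's writePortable() stores for a null date.
    static final long NULL_DOB = -1L;

    private DobCodec() {}

    // Value to hand to writer.writeLong(...) and to the predicate factories.
    static long toMillis(Date date) {
        return date == null ? NULL_DOB : date.getTime();
    }

    // Inverse mapping, for use in readPortable().
    static Date fromMillis(long millis) {
        return millis == NULL_DOB ? null : new Date(millis);
    }

    public static void main(String[] args) {
        Date dob = new Date(315532800000L); // 1980-01-01T00:00:00Z
        long stored = toMillis(dob);
        System.out.println(stored);
        System.out.println(fromMillis(stored).equals(dob));
        System.out.println(fromMillis(toMillis(null)));
    }
}
```

With such a helper, the predicate from the question would be built as Predicates.greaterEqual("dob", DobCodec.toMillis(dobStart)), so the query always uses the same representation as the stored field.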


SPARK Exception thrown in awaitResult - sql

Just to add to what Carlo said, I used the following line of code: sqlContext.setConf("spark.sql.autoBroadcastJoinThreshold", "-1")

I was running into this problem, and none of my online searches led to the right solution. Adding sparkConf.set("spark.sql.autoBroadcastJoinThreshold", "-1") solves the problem.
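
For the SparkSession-based setup shown in the question, the same setting can be passed on the builder. This is a configuration sketch only, assuming Spark 2.x is on the classpath; the class name is hypothetical, while the master and application name are copied from the question. Setting the threshold to -1 disables automatic broadcast joins, which presumably avoids the broadcast step whose failure surfaces as the awaitResult exception.

```java
import org.apache.spark.sql.SparkSession;

// Hypothetical class name; master and appName are taken from the question.
public class DisableBroadcastJoin {
    public static void main(String[] args) {
        SparkSession spark = SparkSession
                .builder()
                .master("local[6]")
                .appName("DatasetForCaseNew")
                // -1 disables automatic broadcast joins.
                .config("spark.sql.autoBroadcastJoinThreshold", "-1")
                .getOrCreate();

        // The setting can also be changed on an already running session.
        spark.conf().set("spark.sql.autoBroadcastJoinThreshold", "-1");

        System.out.println(spark.conf().get("spark.sql.autoBroadcastJoinThreshold"));
        spark.stop();
    }
}
```

Disabling broadcast joins can make joins against small tables slower, so it may be worth re-enabling the threshold once the underlying cause has been found.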
