Failed Vertex re-running - hive
I am trying to run a Hive query on EMR in AWS using Tez, on a 20-node cluster, and I am getting the error below. This is the query:
insert overwrite table temp_table partition(hour)
select a,b,c,hour
from stats_temp_table;
Status: Failed
Vertex re-running, vertexName=Map 1, vertexId=vertex_1523328121104_1005_1_00
Vertex re-running, vertexName=Map 1, vertexId=vertex_1523328121104_1005_1_00
Vertex re-running, vertexName=Map 1, vertexId=vertex_1523328121104_1005_1_00
Vertex re-running, vertexName=Map 1, vertexId=vertex_1523328121104_1005_1_00
Vertex re-running, vertexName=Map 1, vertexId=vertex_1523328121104_1005_1_00
Vertex re-running, vertexName=Map 1, vertexId=vertex_1523328121104_1005_1_00
Vertex failed, vertexName=Reducer 2, vertexId=vertex_1523328121104_1005_1_01, diagnostics=[Task failed, taskId=task_1523328121104_1005_1_01_000002, diagnostics=[TaskAttempt 0 failed, info=[Error: Error while running task ( failure ) : org.apache.tez.runtime.library.common.shuffle.orderedgrouped.Shuffle$ShuffleError: error in shuffle in Fetcher {Map_1} #2
at org.apache.tez.runtime.library.common.shuffle.orderedgrouped.Shuffle$RunShuffleCallable.callInternal(Shuffle.java:301)
at org.apache.tez.runtime.library.common.shuffle.orderedgrouped.Shuffle$RunShuffleCallable.callInternal(Shuffle.java:285)
at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.io.IOException: Map_1: Shuffle failed with too many fetch failures and insufficient progress!failureCounts=5, pendingInputs=12, fetcherHealthy=false, reducerProgressedEnough=false, reducerStalled=true
at org.apache.tez.runtime.library.common.shuffle.orderedgrouped.ShuffleScheduler.isShuffleHealthy(ShuffleScheduler.java:977)
at org.apache.tez.runtime.library.common.shuffle.orderedgrouped.ShuffleScheduler.copyFailed(ShuffleScheduler.java:718)
at org.apache.tez.runtime.library.common.shuffle.orderedgrouped.FetcherOrderedGrouped.setupConnection(FetcherOrderedGrouped.java:376)
at org.apache.tez.runtime.library.common.shuffle.orderedgrouped.FetcherOrderedGrouped.copyFromHost(FetcherOrderedGrouped.java:260)
at org.apache.tez.runtime.library.common.shuffle.orderedgrouped.FetcherOrderedGrouped.fetchNext(FetcherOrderedGrouped.java:178)
at org.apache.tez.runtime.library.common.shuffle.orderedgrouped.FetcherOrderedGrouped.callInternal(FetcherOrderedGrouped.java:191)
at org.apache.tez.runtime.library.common.shuffle.orderedgrouped.FetcherOrderedGrouped.callInternal(FetcherOrderedGrouped.java:54)
... 5 more
, errorMessage=Shuffle Runner Failed:org.apache.tez.runtime.library.common.shuffle.orderedgrouped.Shuffle$ShuffleError: error in shuffle in Fetcher {Map_1} #2
at org.apache.tez.runtime.library.common.shuffle.orderedgrouped.Shuffle$RunShuffleCallable.callInternal(Shuffle.java:301)
at org.apache.tez.runtime.library.common.shuffle.orderedgrouped.Shuffle$RunShuffleCallable.callInternal(Shuffle.java:285)
at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.io.IOException: Map_1: Shuffle failed with too many fetch failures and insufficient progress!failureCounts=5, pendingInputs=12, fetcherHealthy=false, reducerProgressedEnough=false, reducerStalled=true
at org.apache.tez.runtime.library.common.shuffle.orderedgrouped.ShuffleScheduler.isShuffleHealthy(ShuffleScheduler.java:977)
at org.apache.tez.runtime.library.common.shuffle.orderedgrouped.ShuffleScheduler.copyFailed(ShuffleScheduler.java:718)
at org.apache.tez.runtime.library.common.shuffle.orderedgrouped.FetcherOrderedGrouped.setupConnection(FetcherOrderedGrouped.java:376)
at org.apache.tez.runtime.library.common.shuffle.orderedgrouped.FetcherOrderedGrouped.copyFromHost(FetcherOrderedGrouped.java:260)
at org.apache.tez.runtime.library.common.shuffle.orderedgrouped.FetcherOrderedGrouped.fetchNext(FetcherOrderedGrouped.java:178)
at org.apache.tez.runtime.library.common.shuffle.orderedgrouped.FetcherOrderedGrouped.callInternal(FetcherOrderedGrouped.java:191)
at org.apache.tez.runtime.library.common.shuffle.orderedgrouped.FetcherOrderedGrouped.callInternal(FetcherOrderedGrouped.java:54)
... 5 more
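For context, the failure is not in the map work itself but in the shuffle between Map 1 and Reducer 2: "too many fetch failures and insufficient progress" means the reducers repeatedly failed to pull map output. Below is a minimal sketch of the session settings a dynamic-partition insert like this one normally runs with, plus the Hive/Tez knobs most often revisited when shuffle fetches keep failing. The property names are standard Hive/Tez settings, but the values are illustrative assumptions, not tuned recommendations for this cluster:

-- required for insert overwrite ... partition(hour) with a dynamic partition column
set hive.exec.dynamic.partition=true;
set hive.exec.dynamic.partition.mode=nonstrict;

-- knobs commonly adjusted when reducers cannot fetch map output (example values only)
set hive.tez.container.size=4096;         -- memory per Tez container, in MB
set tez.runtime.io.sort.mb=1024;          -- buffer used to sort map output
set tez.am.task.max.failed.attempts=4;    -- task attempts allowed before the vertex fails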
Related
Catch Scrapy exception when crawling from Airflow
I'm trying to catch the exception that occurs on my spider, in a way that lets me mark the task instance as failed. Currently the task finishes and is marked as succeeded. I'm calling crawl() from a PythonOperator in Airflow, as follows:

with DAG(
        'MySpider',
        default_args=default_args,
        schedule_interval=None) as dag:
    t1 = python_task = PythonOperator(
        task_id="crawler_task",
        python_callable=run_crawler,
        op_kwargs=dag_kwargs
    )

Here is my run_crawler() method:

def run_crawler(**kwargs):
    project_settings = set_project_settings({
        'FEEDS': {
            f'{kwargs["bucket"]}%(time)s.{kwargs["format"]}': {
                'format': kwargs["format"],
                'encoding': 'utf8',
                'store_empty': kwargs["store_empty"]
            }
        }
    })
    print("Project settings: ")
    pprint(project_settings.attributes.items())
    set_connection("airflow", kwargs["gcs_connection_id"])
    process = CrawlerProcess(project_settings)
    process.crawl(spider.MySpider)
    print("Starting crawler...")
    process.start()

When running, I'm having problems with GCS credentials, which leads to an exception, as follows:

google.auth.exceptions.DefaultCredentialsError: The file /tmp/file_my_credentials.json does not have a valid type. Type is None, expected one of ('authorized_user', 'service_account', 'external_account', 'external_account_authorized_user', 'impersonated_service_account', 'gdch_service_account').

{logging_mixin.py:115} WARNING - [scrapy.statscollectors] INFO: Dumping Scrapy stats:
{'downloader/request_bytes': 21087,
 'downloader/request_count': 68,
 'downloader/request_method_count/GET': 68,
 'downloader/response_bytes': 1863876,
 'downloader/response_count': 68,
 'downloader/response_status_count/200': 68,
 'elapsed_time_seconds': 25.647386,
 'feedexport/failed_count/GCSFeedStorage': 1,
 'httpcompression/response_bytes': 9212776,
 'httpcompression/response_count': 68,
 'item_scraped_count': 66,
 'log_count/DEBUG': 136,
 'log_count/ERROR': 1,
 'log_count/INFO': 10,
 'log_count/WARNING': 3,
 'memusage/max': 264441856,
 'memusage/startup': 264441856,
 'request_depth_max': 1,
 'response_received_count': 68,
 'scheduler/dequeued': 68,
 'scheduler/dequeued/memory': 68,
 'scheduler/enqueued': 68,
 'scheduler/enqueued/memory': 68,
[2032-13-13, 09:04:28 UTC] {engine.py:389} INFO - Spider closed (finished)
[2032-13-13, 09:04:28 UTC] {logging_mixin.py:115} WARNING - [scrapy.core.engine] INFO: Spider closed (finished)
[2032-13-13, 09:04:28 UTC] {python.py:173} INFO - Done. Returned value was: None
[2032-13-13, 09:04:28 UTC] {taskinstance.py:1408} INFO - Marking task as SUCCESS. dag_id=MySpider, task_id=crawler_task, execution_date=2032-13-13, start_date=2032-13-13, end_date=2032-13-13
[2032-13-13, 09:04:28 UTC] {local_task_job.py:156} INFO - Task exited with return code 0
[2032-13-13, 09:04:28 UTC] {local_task_job.py:279} INFO - 0 downstream tasks scheduled from follow-on schedule check

As you can see, even with this exception, the task itself is marked as "SUCCESS". Is it possible to catch it in order to mark the task as FAILED, so that we can follow it in the Airflow (Composer) interface? Thank you
I don't understand why in this case the exception doesn't break the task. You can add a try/except in the run_crawler method and then raise your own exception in the except block:

import logging

def run_crawler(**kwargs):
    class CustomException(Exception):
        pass

    try:
        project_settings = set_project_settings({
            'FEEDS': {
                f'{kwargs["bucket"]}%(time)s.{kwargs["format"]}': {
                    'format': kwargs["format"],
                    'encoding': 'utf8',
                    'store_empty': kwargs["store_empty"]
                }
            }
        })
        print("Project settings: ")
        pprint(project_settings.attributes.items())
        set_connection("airflow", kwargs["gcs_connection_id"])
        process = CrawlerProcess(project_settings)
        process.crawl(spider.MySpider)
        print("Starting crawler...")
        process.start()
    except Exception as err:
        logging.error("Error in the Airflow task: %s", err)
        raise CustomException("Error in the Airflow task") from err

This way, when your custom exception is raised, it will break the Airflow task and mark it as failed.
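If the try/except above never fires, it is likely because Scrapy traps errors raised inside the crawl: the feed-export failure shows up in the stats dump above ('feedexport/failed_count/GCSFeedStorage': 1) instead of propagating out of process.start(). An alternative sketch, under that assumption: keep a handle on the Crawler via create_crawler, then inspect its stats after the crawl and raise explicitly. create_crawler and stats.get_stats() are standard Scrapy APIs; set_project_settings, set_connection, and spider are the question's own helpers, and both stat keys below are taken from the log in the question.

from scrapy.crawler import CrawlerProcess

def run_crawler(**kwargs):
    # settings built as in the question (feed URI assembled first for clarity)
    feed_uri = f"{kwargs['bucket']}%(time)s.{kwargs['format']}"
    project_settings = set_project_settings({
        "FEEDS": {
            feed_uri: {
                "format": kwargs["format"],
                "encoding": "utf8",
                "store_empty": kwargs["store_empty"],
            }
        }
    })
    set_connection("airflow", kwargs["gcs_connection_id"])

    process = CrawlerProcess(project_settings)
    crawler = process.create_crawler(spider.MySpider)  # keep a handle on the Crawler
    process.crawl(crawler)
    process.start()  # blocks until the crawl is over

    # Stats are final once the reactor has stopped; fail the task loudly.
    stats = crawler.stats.get_stats()
    if stats.get("log_count/ERROR", 0) or stats.get("feedexport/failed_count/GCSFeedStorage", 0):
        raise RuntimeError(f"Crawl finished with errors: {stats}")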
Spark 3 is failing when I try to execute a simple query
I have this table on Hive:

CREATE TABLE `mydb`.`raw_sales` (
  `combustivel` STRING,
  `regiao` STRING,
  `estado` STRING,
  `jan` STRING,
  `fev` STRING,
  `mar` STRING,
  `abr` STRING,
  `mai` STRING,
  `jun` STRING,
  `jul` STRING,
  `ago` STRING,
  `set` STRING,
  `out` STRING,
  `nov` STRING,
  `dez` STRING,
  `total` STRING,
  `created_at` TIMESTAMP,
  `ano` STRING)
USING orc
LOCATION 'hdfs://localhost:9000/jobs/etl/tables/raw_sales.orc'
TBLPROPERTIES (
  'transient_lastDdlTime' = '1601322056',
  'ORC.COMPRESS' = 'SNAPPY')

There is data in the table, but when I try the query below:

spark.sql("SELECT * FROM mydb.raw_sales WHERE ano = '2000' AND combustivel like '%GASOLINA%'").show()

it crashes!

>>> spark.sql("SELECT * FROM mydb.raw_sales WHERE ano = 2000 AND combustivel like '%GASOLINA%'").show()
20/09/28 19:25:30 ERROR executor.Executor: Exception in task 0.0 in stage 61.0 (TID 133)
java.lang.ClassCastException: java.lang.String cannot be cast to java.lang.Number at org.apache.spark.sql.execution.datasources.orc.OrcFilters$.castLiteralValue(OrcFilters.scala:163) at org.apache.spark.sql.execution.datasources.orc.OrcFilters$.buildLeafSearchArgument(OrcFilters.scala:235) at org.apache.spark.sql.execution.datasources.orc.OrcFilters$.convertibleFiltersHelper$1(OrcFilters.scala:134) at org.apache.spark.sql.execution.datasources.orc.OrcFilters$.$anonfun$convertibleFilters$4(OrcFilters.scala:137) at scala.collection.TraversableLike.$anonfun$flatMap$1(TraversableLike.scala:245) at scala.collection.immutable.List.foreach(List.scala:392) at scala.collection.TraversableLike.flatMap(TraversableLike.scala:245) at scala.collection.TraversableLike.flatMap$(TraversableLike.scala:242) at scala.collection.immutable.List.flatMap(List.scala:355) at org.apache.spark.sql.execution.datasources.orc.OrcFilters$.convertibleFilters(OrcFilters.scala:136) at org.apache.spark.sql.execution.datasources.orc.OrcFilters$.createFilter(OrcFilters.scala:75) at org.apache.spark.sql.execution.datasources.orc.OrcFileFormat.$anonfun$buildReaderWithPartitionValues$4(OrcFileFormat.scala:189) at org.apache.spark.sql.execution.datasources.orc.OrcFileFormat.$anonfun$buildReaderWithPartitionValues$4$adapted(OrcFileFormat.scala:188) at scala.Option.map(Option.scala:230) at org.apache.spark.sql.execution.datasources.orc.OrcFileFormat.$anonfun$buildReaderWithPartitionValues$1(OrcFileFormat.scala:188) at org.apache.spark.sql.execution.datasources.FileScanRDD$$anon$1.org$apache$spark$sql$execution$datasources$FileScanRDD$$anon$$readCurrentFile(FileScanRDD.scala:116) at org.apache.spark.sql.execution.datasources.FileScanRDD$$anon$1.nextIterator(FileScanRDD.scala:169) at org.apache.spark.sql.execution.datasources.FileScanRDD$$anon$1.hasNext(FileScanRDD.scala:93) at org.apache.spark.sql.execution.FileSourceScanExec$$anon$1.hasNext(DataSourceScanExec.scala:491) at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage1.columnartorow_nextBatch_0$(Unknown Source) at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage1.processNext(Unknown Source) at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43) at org.apache.spark.sql.execution.WholeStageCodegenExec$$anon$1.hasNext(WholeStageCodegenExec.scala:729) at org.apache.spark.sql.execution.SparkPlan.$anonfun$getByteArrayRdd$1(SparkPlan.scala:340) at org.apache.spark.rdd.RDD.$anonfun$mapPartitionsInternal$2(RDD.scala:872) at org.apache.spark.rdd.RDD.$anonfun$mapPartitionsInternal$2$adapted(RDD.scala:872) at 
org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:349) at org.apache.spark.rdd.RDD.iterator(RDD.scala:313) at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90) at org.apache.spark.scheduler.Task.run(Task.scala:127) at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:446) at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1377) at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:449) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:748) 20/09/28 19:25:30 WARN scheduler.TaskSetManager: Lost task 0.0 in stage 61.0 (TID 133, 639773a482b8, executor driver): java.lang.ClassCastException: java.lang.String cannot be cast to java.l ang.Number at org.apache.spark.sql.execution.datasources.orc.OrcFilters$.castLiteralValue(OrcFilters.scala:163) at org.apache.spark.sql.execution.datasources.orc.OrcFilters$.buildLeafSearchArgument(OrcFilters.scala:235) at org.apache.spark.sql.execution.datasources.orc.OrcFilters$.convertibleFiltersHelper$1(OrcFilters.scala:134) at org.apache.spark.sql.execution.datasources.orc.OrcFilters$.$anonfun$convertibleFilters$4(OrcFilters.scala:137) at scala.collection.TraversableLike.$anonfun$flatMap$1(TraversableLike.scala:245) at scala.collection.immutable.List.foreach(List.scala:392) at scala.collection.TraversableLike.flatMap(TraversableLike.scala:245) at scala.collection.TraversableLike.flatMap$(TraversableLike.scala:242) at scala.collection.immutable.List.flatMap(List.scala:355) at org.apache.spark.sql.execution.datasources.orc.OrcFilters$.convertibleFilters(OrcFilters.scala:136) at org.apache.spark.sql.execution.datasources.orc.OrcFilters$.createFilter(OrcFilters.scala:75) [112/3679] at org.apache.spark.sql.execution.datasources.orc.OrcFileFormat.$anonfun$buildReaderWithPartitionValues$4(OrcFileFormat.scala:189) at org.apache.spark.sql.execution.datasources.orc.OrcFileFormat.$anonfun$buildReaderWithPartitionValues$4$adapted(OrcFileFormat.scala:188) at scala.Option.map(Option.scala:230) at org.apache.spark.sql.execution.datasources.orc.OrcFileFormat.$anonfun$buildReaderWithPartitionValues$1(OrcFileFormat.scala:188) at org.apache.spark.sql.execution.datasources.FileScanRDD$$anon$1.org$apache$spark$sql$execution$datasources$FileScanRDD$$anon$$readCurrentFile(FileScanRDD.scala:116) at org.apache.spark.sql.execution.datasources.FileScanRDD$$anon$1.nextIterator(FileScanRDD.scala:169) at org.apache.spark.sql.execution.datasources.FileScanRDD$$anon$1.hasNext(FileScanRDD.scala:93) at org.apache.spark.sql.execution.FileSourceScanExec$$anon$1.hasNext(DataSourceScanExec.scala:491) at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage1.columnartorow_nextBatch_0$(Unknown Source) at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage1.processNext(Unknown Source) at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43) at org.apache.spark.sql.execution.WholeStageCodegenExec$$anon$1.hasNext(WholeStageCodegenExec.scala:729) at org.apache.spark.sql.execution.SparkPlan.$anonfun$getByteArrayRdd$1(SparkPlan.scala:340) at org.apache.spark.rdd.RDD.$anonfun$mapPartitionsInternal$2(RDD.scala:872) at org.apache.spark.rdd.RDD.$anonfun$mapPartitionsInternal$2$adapted(RDD.scala:872) at 
org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:349) at org.apache.spark.rdd.RDD.iterator(RDD.scala:313) at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90) at org.apache.spark.scheduler.Task.run(Task.scala:127) at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:446) at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1377) at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:449) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:748) 20/09/28 19:25:30 ERROR scheduler.TaskSetManager: Task 0 in stage 61.0 failed 1 times; aborting job Traceback (most recent call last): File "<stdin>", line 1, in <module> File "/opt/spark/current/python/pyspark/sql/dataframe.py", line 440, in show print(self._jdf.showString(n, 20, vertical)) File "/opt/spark/current/python/lib/py4j-0.10.9-src.zip/py4j/java_gateway.py", line 1304, in __call__ File "/opt/spark/current/python/pyspark/sql/utils.py", line 128, in deco return f(*a, **kw) File "/opt/spark/current/python/lib/py4j-0.10.9-src.zip/py4j/protocol.py", line 326, in get_return_value py4j.protocol.Py4JJavaError: An error occurred while calling o122.showString. : org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 61.0 failed 1 times, most recent failure: Lost task 0.0 in stage 61.0 (TID 133, 639773a482b8, executor dr iver): java.lang.ClassCastException: java.lang.String cannot be cast to java.lang.Number at org.apache.spark.sql.execution.datasources.orc.OrcFilters$.castLiteralValue(OrcFilters.scala:163) at org.apache.spark.sql.execution.datasources.orc.OrcFilters$.buildLeafSearchArgument(OrcFilters.scala:235) at org.apache.spark.sql.execution.datasources.orc.OrcFilters$.convertibleFiltersHelper$1(OrcFilters.scala:134) at org.apache.spark.sql.execution.datasources.orc.OrcFilters$.$anonfun$convertibleFilters$4(OrcFilters.scala:137) at scala.collection.TraversableLike.$anonfun$flatMap$1(TraversableLike.scala:245) at scala.collection.immutable.List.foreach(List.scala:392) at scala.collection.TraversableLike.flatMap(TraversableLike.scala:245) at scala.collection.TraversableLike.flatMap$(TraversableLike.scala:242) at scala.collection.immutable.List.flatMap(List.scala:355) at org.apache.spark.sql.execution.datasources.orc.OrcFilters$.convertibleFilters(OrcFilters.scala:136) at org.apache.spark.sql.execution.datasources.orc.OrcFilters$.createFilter(OrcFilters.scala:75) at org.apache.spark.sql.execution.datasources.orc.OrcFileFormat.$anonfun$buildReaderWithPartitionValues$4(OrcFileFormat.scala:189) at scala.Option.map(Option.scala:230) [59/3679] at org.apache.spark.sql.execution.datasources.orc.OrcFileFormat.$anonfun$buildReaderWithPartitionValues$1(OrcFileFormat.scala:188) at org.apache.spark.sql.execution.datasources.FileScanRDD$$anon$1.org$apache$spark$sql$execution$datasources$FileScanRDD$$anon$$readCurrentFile(FileScanRDD.scala:116) at org.apache.spark.sql.execution.datasources.FileScanRDD$$anon$1.nextIterator(FileScanRDD.scala:169) at org.apache.spark.sql.execution.datasources.FileScanRDD$$anon$1.hasNext(FileScanRDD.scala:93) at org.apache.spark.sql.execution.FileSourceScanExec$$anon$1.hasNext(DataSourceScanExec.scala:491) at 
org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage1.columnartorow_nextBatch_0$(Unknown Source) at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage1.processNext(Unknown Source) at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43) at org.apache.spark.sql.execution.WholeStageCodegenExec$$anon$1.hasNext(WholeStageCodegenExec.scala:729) at org.apache.spark.sql.execution.SparkPlan.$anonfun$getByteArrayRdd$1(SparkPlan.scala:340) at org.apache.spark.rdd.RDD.$anonfun$mapPartitionsInternal$2(RDD.scala:872) at org.apache.spark.rdd.RDD.$anonfun$mapPartitionsInternal$2$adapted(RDD.scala:872) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:349) at org.apache.spark.rdd.RDD.iterator(RDD.scala:313) at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90) at org.apache.spark.scheduler.Task.run(Task.scala:127) at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:446) at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1377) at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:449) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:748) Driver stacktrace: at org.apache.spark.scheduler.DAGScheduler.failJobAndIndependentStages(DAGScheduler.scala:2059) at org.apache.spark.scheduler.DAGScheduler.$anonfun$abortStage$2(DAGScheduler.scala:2008) at org.apache.spark.scheduler.DAGScheduler.$anonfun$abortStage$2$adapted(DAGScheduler.scala:2007) at scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:62) at scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:55) at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:49) at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:2007) at org.apache.spark.scheduler.DAGScheduler.$anonfun$handleTaskSetFailed$1(DAGScheduler.scala:973) at org.apache.spark.scheduler.DAGScheduler.$anonfun$handleTaskSetFailed$1$adapted(DAGScheduler.scala:973) at scala.Option.foreach(Option.scala:407) at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:973) at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:2239) at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:2188) at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:2177) at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:49) at org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:775) at org.apache.spark.SparkContext.runJob(SparkContext.scala:2099) at org.apache.spark.SparkContext.runJob(SparkContext.scala:2120) at org.apache.spark.SparkContext.runJob(SparkContext.scala:2139) at org.apache.spark.sql.execution.SparkPlan.executeTake(SparkPlan.scala:467) at org.apache.spark.sql.execution.SparkPlan.executeTake(SparkPlan.scala:420) at org.apache.spark.sql.execution.CollectLimitExec.executeCollect(limit.scala:47) at org.apache.spark.sql.Dataset.collectFromPlan(Dataset.scala:3627) at org.apache.spark.sql.Dataset.$anonfun$head$1(Dataset.scala:2697) at org.apache.spark.sql.Dataset.$anonfun$withAction$1(Dataset.scala:3618) at 
org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$5(SQLExecution.scala:100) at org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:160) [7/3679] at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$1(SQLExecution.scala:87) at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:764) at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:64) at org.apache.spark.sql.Dataset.withAction(Dataset.scala:3616) at org.apache.spark.sql.Dataset.head(Dataset.scala:2697) at org.apache.spark.sql.Dataset.take(Dataset.scala:2904) at org.apache.spark.sql.Dataset.getRows(Dataset.scala:300) at org.apache.spark.sql.Dataset.showString(Dataset.scala:337) at sun.reflect.GeneratedMethodAccessor64.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244) at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357) at py4j.Gateway.invoke(Gateway.java:282) at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132) at py4j.commands.CallCommand.execute(CallCommand.java:79) at py4j.GatewayConnection.run(GatewayConnection.java:238) at java.lang.Thread.run(Thread.java:748) Caused by: java.lang.ClassCastException: java.lang.String cannot be cast to java.lang.Number at org.apache.spark.sql.execution.datasources.orc.OrcFilters$.castLiteralValue(OrcFilters.scala:163) at org.apache.spark.sql.execution.datasources.orc.OrcFilters$.buildLeafSearchArgument(OrcFilters.scala:235) at org.apache.spark.sql.execution.datasources.orc.OrcFilters$.convertibleFiltersHelper$1(OrcFilters.scala:134) at org.apache.spark.sql.execution.datasources.orc.OrcFilters$.$anonfun$convertibleFilters$4(OrcFilters.scala:137) at scala.collection.TraversableLike.$anonfun$flatMap$1(TraversableLike.scala:245) at scala.collection.immutable.List.foreach(List.scala:392) at scala.collection.TraversableLike.flatMap(TraversableLike.scala:245) at scala.collection.TraversableLike.flatMap$(TraversableLike.scala:242) at scala.collection.immutable.List.flatMap(List.scala:355) at org.apache.spark.sql.execution.datasources.orc.OrcFilters$.convertibleFilters(OrcFilters.scala:136) at org.apache.spark.sql.execution.datasources.orc.OrcFilters$.createFilter(OrcFilters.scala:75) at org.apache.spark.sql.execution.datasources.orc.OrcFileFormat.$anonfun$buildReaderWithPartitionValues$4(OrcFileFormat.scala:189) at org.apache.spark.sql.execution.datasources.orc.OrcFileFormat.$anonfun$buildReaderWithPartitionValues$4$adapted(OrcFileFormat.scala:188) at scala.Option.map(Option.scala:230) at org.apache.spark.sql.execution.datasources.orc.OrcFileFormat.$anonfun$buildReaderWithPartitionValues$1(OrcFileFormat.scala:188) at org.apache.spark.sql.execution.datasources.FileScanRDD$$anon$1.org$apache$spark$sql$execution$datasources$FileScanRDD$$anon$$readCurrentFile(FileScanRDD.scala:116) at org.apache.spark.sql.execution.datasources.FileScanRDD$$anon$1.nextIterator(FileScanRDD.scala:169) at org.apache.spark.sql.execution.datasources.FileScanRDD$$anon$1.hasNext(FileScanRDD.scala:93) at org.apache.spark.sql.execution.FileSourceScanExec$$anon$1.hasNext(DataSourceScanExec.scala:491) at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage1.columnartorow_nextBatch_0$(Unknown Source) at 
org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage1.processNext(Unknown Source) at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43) at org.apache.spark.sql.execution.WholeStageCodegenExec$$anon$1.hasNext(WholeStageCodegenExec.scala:729) at org.apache.spark.sql.execution.SparkPlan.$anonfun$getByteArrayRdd$1(SparkPlan.scala:340) at org.apache.spark.rdd.RDD.$anonfun$mapPartitionsInternal$2(RDD.scala:872) at org.apache.spark.rdd.RDD.$anonfun$mapPartitionsInternal$2$adapted(RDD.scala:872) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:349) at org.apache.spark.rdd.RDD.iterator(RDD.scala:313) at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90) at org.apache.spark.scheduler.Task.run(Task.scala:127) at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:446) at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1377) at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:449) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) ... 1 more

However, the query below works!

spark.sql("SELECT * FROM mydb.raw_sales WHERE ano = 2000").show()

And this works too!

spark.sql("SELECT combustivel FROM mydb.raw_sales WHERE combustivel like '%GASOLINA C%' ").show()

My environment:

Hadoop 2.10.0
Hive 2.3.7
Tez 0.9.2
Spark 3.0.1

The result is the same for PySpark and Scala. I'm completely lost here! Maybe you can help me! Thanks!
I've changed the 'ano' field's data type to 'int'. It's not an ideal solution, since even using 'cast' was not enough to resolve this issue! I have a good friend who is a monster on Spark and who says this is possibly a bug in Spark. I'm not sure about that, but I would like someone else to corroborate this issue if possible! Anyway, changing the data type to 'int' works!
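A possible session-level workaround, if changing the column type is undesirable: the crash happens while Spark converts the ano = 2000 predicate into an ORC search argument, so disabling ORC predicate pushdown keeps that comparison out of the ORC reader entirely. spark.sql.orc.filterPushdown is a standard Spark SQL option; that it sidesteps this particular ClassCastException on Spark 3.0.1 is an assumption, not something verified here:

# Sketch: turn off ORC predicate pushdown for this session, so the
# string-column-vs-numeric-literal filter is evaluated by Spark itself
# instead of being converted into an ORC SearchArgument.
spark.conf.set("spark.sql.orc.filterPushdown", "false")
spark.sql("SELECT * FROM mydb.raw_sales WHERE ano = 2000 AND combustivel like '%GASOLINA%'").show()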
Insertion into hive ORC format table from other hive formats cannot be selected from
After creating a new Hive external table in ORC format, bucketed, and inserting into it from another table with the exact same schema but in Avro format (and non-bucketed), selecting from the new table produces many errors. I put the error stack here (some parts are repeated, and I had to delete from the end due to lack of space):

Status: Failed Vertex failed, vertexName=Map 1, vertexId=vertex_1572541024266_0020_1_00, diagnostics=[Task failed, taskId=task_1572541024266_0020_1_00_000000, diagnostics=[TaskAttempt 0 failed, info=[Error: Error while running task ( failure ) : attempt_1572541024266_0020_1_00_000000_0:java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: java.io.IOException: java.io.IOException: Error reading file: /path/000003_0 at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:211) at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:168) at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:374) at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:73) at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:61) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1893) at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:61) at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:37) at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:748) Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: java.io.IOException: java.io.IOException: Error reading file: path/000003_0 at org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:74) at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.run(MapRecordProcessor.java:419) at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:185) ... 14 more Caused by: java.io.IOException: java.io.IOException: Error reading file: path/000003_0 at org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderNextException(HiveIOExceptionHandlerChain.java:121) at org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderNextException(HiveIOExceptionHandlerUtil.java:77) at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:365) at org.apache.hadoop.hive.ql.io.HiveRecordReader.doNext(HiveRecordReader.java:79) at org.apache.hadoop.hive.ql.io.HiveRecordReader.doNext(HiveRecordReader.java:33) at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.next(HiveContextAwareRecordReader.java:116) at org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat$TezGroupedSplitsRecordReader.next(TezGroupedSplitsInputFormat.java:151) at org.apache.tez.mapreduce.lib.MRReaderMapred.next(MRReaderMapred.java:116) at org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:62) ... 
16 more Caused by: java.io.IOException: Error reading file: /path/000003_0 at org.apache.orc.impl.RecordReaderImpl.nextBatch(RecordReaderImpl.java:1191) at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.ensureBatch(RecordReaderImpl.java:77) at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.hasNext(RecordReaderImpl.java:93) at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$OrcRecordReader.next(OrcInputFormat.java:238) at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$OrcRecordReader.next(OrcInputFormat.java:213) at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:360) ... 22 more Caused by: java.io.EOFException: Read past EOF for compressed stream Stream for column 44 kind DATA position: 0 length: 0 range: 0 offset: 0 limit: 0 at org.apache.orc.impl.SerializationUtils.readFully(SerializationUtils.java:119) at org.apache.orc.impl.SerializationUtils.readLongLE(SerializationUtils.java:102) at org.apache.orc.impl.SerializationUtils.readDouble(SerializationUtils.java:98) at org.apache.orc.impl.TreeReaderFactory$DoubleTreeReader.nextVector(TreeReaderFactory.java:762) at org.apache.orc.impl.TreeReaderFactory$StructTreeReader.nextVector(TreeReaderFactory.java:1833) at org.apache.orc.impl.TreeReaderFactory$ListTreeReader.nextVector(TreeReaderFactory.java:2001) at org.apache.orc.impl.TreeReaderFactory$StructTreeReader.nextBatch(TreeReaderFactory.java:1815) at org.apache.orc.impl.RecordReaderImpl.nextBatch(RecordReaderImpl.java:1184) ... 27 more ], TaskAttempt 1 failed, info=[Error: Error while running task ( failure ) : attempt_1572541024266_0020_1_00_000000_1:java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: java.io.IOException: java.io.IOException: Error reading file: /path/000003_0 at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:211) at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:168) at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:374) at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:73) at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:61) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1893) at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:61) at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:37) at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:748) Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: java.io.IOException: java.io.IOException: Error reading file: /path/000003_0 at org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:74) at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.run(MapRecordProcessor.java:419) at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:185) ... 
14 more Caused by: java.io.IOException: java.io.IOException: Error reading file: /path/000003_0 at org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderNextException(HiveIOExceptionHandlerChain.java:121) at org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderNextException(HiveIOExceptionHandlerUtil.java:77) at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:365) at org.apache.hadoop.hive.ql.io.HiveRecordReader.doNext(HiveRecordReader.java:79) at org.apache.hadoop.hive.ql.io.HiveRecordReader.doNext(HiveRecordReader.java:33) at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.next(HiveContextAwareRecordReader.java:116) at org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat$TezGroupedSplitsRecordReader.next(TezGroupedSplitsInputFormat.java:151) at org.apache.tez.mapreduce.lib.MRReaderMapred.next(MRReaderMapred.java:116) at org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:62) ... 16 more Caused by: java.io.IOException: Error reading file: /path/000003_0 at org.apache.orc.impl.RecordReaderImpl.nextBatch(RecordReaderImpl.java:1191) at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.ensureBatch(RecordReaderImpl.java:77) at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.hasNext(RecordReaderImpl.java:93) at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$OrcRecordReader.next(OrcInputFormat.java:238) at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$OrcRecordReader.next(OrcInputFormat.java:213) at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:360) ... 22 more Caused by: java.io.EOFException: Read past EOF for compressed stream Stream for column 44 kind DATA position: 0 length: 0 range: 0 offset: 0 limit: 0 at org.apache.orc.impl.SerializationUtils.readFully(SerializationUtils.java:119) at org.apache.orc.impl.SerializationUtils.readLongLE(SerializationUtils.java:102) at org.apache.orc.impl.SerializationUtils.readDouble(SerializationUtils.java:98) at org.apache.orc.impl.TreeReaderFactory$DoubleTreeReader.nextVector(TreeReaderFactory.java:762) at org.apache.orc.impl.TreeReaderFactory$StructTreeReader.nextVector(TreeReaderFactory.java:1833) at org.apache.orc.impl.TreeReaderFactory$ListTreeReader.nextVector(TreeReaderFactory.java:2001) at org.apache.orc.impl.TreeReaderFactory$StructTreeReader.nextBatch(TreeReaderFactory.java:1815) at org.apache.orc.impl.RecordReaderImpl.nextBatch(RecordReaderImpl.java:1184) ... 
27 more ], TaskAttempt 2 failed, info=[Error: Error while running task ( failure ) : attempt_1572541024266_0020_1_00_000000_2:java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: java.io.IOException: java.io.IOException: Error reading file: /path/000003_0 at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:211) at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:168) at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:374) at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:73) at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:61) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1893) at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:61) at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:37) at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:748) Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: java.io.IOException: java.io.IOException: Error reading file: /path/000003_0 at org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:74) at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.run(MapRecordProcessor.java:419) at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:185) ... 14 more Caused by: java.io.IOException: java.io.IOException: Error reading file: /path/000003_0 at org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderNextException(HiveIOExceptionHandlerChain.java:121) at org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderNextException(HiveIOExceptionHandlerUtil.java:77) at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:365) at org.apache.hadoop.hive.ql.io.HiveRecordReader.doNext(HiveRecordReader.java:79) at org.apache.hadoop.hive.ql.io.HiveRecordReader.doNext(HiveRecordReader.java:33) at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.next(HiveContextAwareRecordReader.java:116) at org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat$TezGroupedSplitsRecordReader.next(TezGroupedSplitsInputFormat.java:151) at org.apache.tez.mapreduce.lib.MRReaderMapred.next(MRReaderMapred.java:116) at org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:62) ... 16 more Caused by: java.io.IOException: Error reading file: /path/000003_0 at org.apache.orc.impl.RecordReaderImpl.nextBatch(RecordReaderImpl.java:1191) at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.ensureBatch(RecordReaderImpl.java:77) at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.hasNext(RecordReaderImpl.java:93) at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$OrcRecordReader.next(OrcInputFormat.java:238) at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$OrcRecordReader.next(OrcInputFormat.java:213) at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:360) ... 
22 more Caused by: java.io.EOFException: Read past EOF for compressed stream Stream for column 44 kind DATA position: 0 length: 0 range: 0 offset: 0 limit: 0 at org.apache.orc.impl.SerializationUtils.readFully(SerializationUtils.java:119) at org.apache.orc.impl.SerializationUtils.readLongLE(SerializationUtils.java:102) at org.apache.orc.impl.SerializationUtils.readDouble(SerializationUtils.java:98) at org.apache.orc.impl.TreeReaderFactory$DoubleTreeReader.nextVector(TreeReaderFactory.java:762) at org.apache.orc.impl.TreeReaderFactory$StructTreeReader.nextVector(TreeReaderFactory.java:1833) at org.apache.orc.impl.TreeReaderFactory$ListTreeReader.nextVector(TreeReaderFactory.java:2001) at org.apache.orc.impl.TreeReaderFactory$StructTreeReader.nextBatch(TreeReaderFactory.java:1815) at org.apache.orc.impl.RecordReaderImpl.nextBatch(RecordReaderImpl.java:1184) ... 27 more ], TaskAttempt 3 failed, info=[Error: Error while running task ( failure ) : attempt_1572541024266_0020_1_00_000000_3:java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: java.io.IOException: java.io.IOException: Error reading file: /path/000003_0 at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:211) at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:168) at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:374) at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:73) at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:61) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1893) at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:61) at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:37) at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:748) Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: java.io.IOException: java.io.IOException: Error reading file: /path/000003_0 at org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:74) at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.run(MapRecordProcessor.java:419) at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:185) ... 
14 more Caused by: java.io.IOException: java.io.IOException: Error reading file: /path/000003_0 at org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderNextException(HiveIOExceptionHandlerChain.java:121) at org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderNextException(HiveIOExceptionHandlerUtil.java:77) at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:365) at org.apache.hadoop.hive.ql.io.HiveRecordReader.doNext(HiveRecordReader.java:79) at org.apache.hadoop.hive.ql.io.HiveRecordReader.doNext(HiveRecordReader.java:33) at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.next(HiveContextAwareRecordReader.java:116) at org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat$TezGroupedSplitsRecordReader.next(TezGroupedSplitsInputFormat.java:151) at org.apache.tez.mapreduce.lib.MRReaderMapred.next(MRReaderMapred.java:116) at org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:62) ... 16 more Caused by: java.io.IOException: Error reading file: /path/000003_0 at org.apache.orc.impl.RecordReaderImpl.nextBatch(RecordReaderImpl.java:1191) at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.ensureBatch(RecordReaderImpl.java:77) at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.hasNext(RecordReaderImpl.java:93) at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$OrcRecordReader.next(OrcInputFormat.java:238) at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$OrcRecordReader.next(OrcInputFormat.java:213) at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:360) ... 22 more Caused by: java.io.EOFException: Read past EOF for compressed stream Stream for column 44 kind DATA position: 0 length: 0 range: 0 offset: 0 limit: 0 at org.apache.orc.impl.SerializationUtils.readFully(SerializationUtils.java:119) at org.apache.orc.impl.SerializationUtils.readLongLE(SerializationUtils.java:102) at org.apache.orc.impl.SerializationUtils.readDouble(SerializationUtils.java:98) at org.apache.orc.impl.TreeReaderFactory$DoubleTreeReader.nextVector(TreeReaderFactory.java:762) at org.apache.orc.impl.TreeReaderFactory$StructTreeReader.nextVector(TreeReaderFactory.java:1833) at org.apache.orc.impl.TreeReaderFactory$ListTreeReader.nextVector(TreeReaderFactory.java:2001) at org.apache.orc.impl.TreeReaderFactory$StructTreeReader.nextBatch(TreeReaderFactory.java:1815) at org.apache.orc.impl.RecordReaderImpl.nextBatch(RecordReaderImpl.java:1184) ... 27 more ]], Vertex did not succeed due to OWN_TASK_FAILURE, failedTasks:1 killedTasks:0, Vertex vertex_1572541024266_0020_1_00 [Map 1] killed/failed due to:OWN_TASK_FAILURE] DAG did not succeed due to VERTEX_FAILURE. failedVertices:1 killedVertices:0 FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.tez.TezTask. 
Vertex failed, vertexName=Map 1, vertexId=vertex_1572541024266_0020_1_00, diagnostics=[Task failed, taskId=task_1572541024266_0020_1_00_000000, diagnostics=[TaskAttempt 0 failed, info=[Error: Error while running task ( failure ) : attempt_1572541024266_0020_1_00_000000_0:java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: java.io.IOException: java.io.IOException: Error reading file: /path/000003_0 at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:211) at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:168) at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:374) at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:73) at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:61) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1893) at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:61) at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:37) at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:748) Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: java.io.IOException: java.io.IOException: Error reading file: /path/000003_0 at org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:74) at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.run(MapRecordProcessor.java:419) at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:185) ... 14 more Caused by: java.io.IOException: java.io.IOException: Error reading file: /path/000003_0 at org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderNextException(HiveIOExceptionHandlerChain.java:121) at org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderNextException(HiveIOExceptionHandlerUtil.java:77) at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:365) at org.apache.hadoop.hive.ql.io.HiveRecordReader.doNext(HiveRecordReader.java:79) at org.apache.hadoop.hive.ql.io.HiveRecordReader.doNext(HiveRecordReader.java:33) at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.next(HiveContextAwareRecordReader.java:116) at org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat$TezGroupedSplitsRecordReader.next(TezGroupedSplitsInputFormat.java:151) at org.apache.tez.mapreduce.lib.MRReaderMapred.next(MRReaderMapred.java:116) at org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:62) ... 
16 more Caused by: java.io.IOException: Error reading file: /path/000003_0 at org.apache.orc.impl.RecordReaderImpl.nextBatch(RecordReaderImpl.java:1191) at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.ensureBatch(RecordReaderImpl.java:77) at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.hasNext(RecordReaderImpl.java:93) at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$OrcRecordReader.next(OrcInputFormat.java:238) at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$OrcRecordReader.next(OrcInputFormat.java:213) at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:360) ... 22 more Caused by: java.io.EOFException: Read past EOF for compressed stream Stream for column 44 kind DATA position: 0 length: 0 range: 0 offset: 0 limit: 0 at org.apache.orc.impl.SerializationUtils.readFully(SerializationUtils.java:119) at org.apache.orc.impl.SerializationUtils.readLongLE(SerializationUtils.java:102) at org.apache.orc.impl.SerializationUtils.readDouble(SerializationUtils.java:98) at org.apache.orc.impl.TreeReaderFactory$DoubleTreeReader.nextVector(TreeReaderFactory.java:762) at org.apache.orc.impl.TreeReaderFactory$StructTreeReader.nextVector(TreeReaderFactory.java:1833) at org.apache.orc.impl.TreeReaderFactory$ListTreeReader.nextVector(TreeReaderFactory.java:2001) at org.apache.orc.impl.TreeReaderFactory$StructTreeReader.nextBatch(TreeReaderFactory.java:1815) at org.apache.orc.impl.RecordReaderImpl.nextBatch(RecordReaderImpl.java:1184) ... 27 more ], TaskAttempt 1 failed, info=[Error: Error while running task ( failure ) : attempt_1572541024266_0020_1_00_000000_1:java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: java.io.IOException: java.io.IOException: Error reading file: /path/000003_0 at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:211) at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:168) at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:374) at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:73) at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:61) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1893) at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:61) at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:37) at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:748) Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: java.io.IOException: java.io.IOException: Error reading file: /path/000003_0 at org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:74) at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.run(MapRecordProcessor.java:419) at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:185) ... 
14 more Caused by: java.io.IOException: java.io.IOException: Error reading file: /path/000003_0 at org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderNextException(HiveIOExceptionHandlerChain.java:121) at org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderNextException(HiveIOExceptionHandlerUtil.java:77) at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:365) at org.apache.hadoop.hive.ql.io.HiveRecordReader.doNext(HiveRecordReader.java:79) at org.apache.hadoop.hive.ql.io.HiveRecordReader.doNext(HiveRecordReader.java:33) at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.next(HiveContextAwareRecordReader.java:116) at org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat$TezGroupedSplitsRecordReader.next(TezGroupedSplitsInputFormat.java:151) at org.apache.tez.mapreduce.lib.MRReaderMapred.next(MRReaderMapred.java:116) at org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:62) ... 16 more Caused by: java.io.IOException: Error reading file:/path/000003_0 at org.apache.orc.impl.RecordReaderImpl.nextBatch(RecordReaderImpl.java:1191) at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.ensureBatch(RecordReaderImpl.java:77) at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.hasNext(RecordReaderImpl.java:93) at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$OrcRecordReader.next(OrcInputFormat.java:238) at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$OrcRecordReader.next(OrcInputFormat.java:213) at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:360) ... 22 more

Any suggestions on how to resolve this issue?
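A diagnostic sketch rather than a fix (the path below is the redacted placeholder from the stack trace): Hive ships a built-in ORC dump tool that prints a file's metadata, so you can check whether column 44's DATA stream really has length 0 in the file the reader chokes on:

hive --orcfiledump /path/000003_0

If the dump shows the same zero-length stream, the problem is on the writer side (the file the insert produced), not the reader. Hive also has a hive.exec.orc.skip.corrupt.data setting that skips unreadable data instead of failing, though whether it covers this EOF case is an assumption on my part.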
An exception thrown out while running an HQL in hive
I want to run a select statement and put the result into a table; I'm sure it's not a syntax error. HQL:
INSERT overwrite table datalake_rci.MID_DealerVehicleOutputValue --MIDDealerVehicleOutputValueID int,
select b.MIDDealerVehicleOutputValueID
,b.DealerID --string,
,b.CHASSIS --string,
,b.DIMDealerID --bigint,
,b.DIMVehicleID --bigint,
,case when a.DIMDealerID is not null then 1 else b.DIMOutputValueID end DIMOutputValueID --int
,b.OutputValueName --string,
,b.OutputValueName_CN --string,
,b.OutputValueCode --varchar(50),
,b.OutputValueOrder --int
from datalake_rci.MID_DealerVehicleOutputValue b
left outer join
(select w.low, w.DIMDealerID, w.DIMVehicleID, w.OutputValueOrder, w.row_num
 from
 (select z.low, z.DIMDealerID, z.DIMVehicleID, z.OutputValueOrder,
         row_number() over(partition by z.DIMDealerID order by z.OutputValueOrder desc) row_num
  from
  (select t1.low, y.DIMDealerID, y.DIMVehicleID, y.OutputValueOrder
   from
   (select b.DIMDealerID, b.cnt*l.Rate low
    from (select DIMDealerID, count(*) cnt
          from datalake_rci.MID_DealerVehicleOutputValue
          group by DIMDealerID) b
    cross join (select Rate from datalake_rci.DIM_OutputValue where OutputValueCode = 'Low') l
   ) t1
   inner join (select DIMDealerID, DIMVehicleID, OutputValueOrder
               from datalake_rci.MID_DealerVehicleOutputValue) y
     on t1.DIMDealerID = y.DIMDealerID
  ) z
 ) w
 where w.row_num <= w.low
) a on b.DIMDealerID = a.DIMDealerID;
and then I got the output below:
--------------------------------------------------------------------------------
        VERTICES      STATUS  TOTAL  COMPLETED  RUNNING  PENDING  FAILED  KILLED
--------------------------------------------------------------------------------
Map 1 .........      RUNNING    207        200        0        7       0       0
Map 6 ..........   SUCCEEDED      1          1        0        0       0       0
Map 7 .........      RUNNING    207        198        0        9       0       0
Map 8 .........      RUNNING    207        201        0        6       0       0
Reducer 2 .....      RUNNING     40         37        0        3       0       0
Reducer 3 .....      RUNNING     98         94        0        4       0       0
Reducer 4 .....      RUNNING     44         41        0        3       0       0
Reducer 5 ...        RUNNING    746        376        0      370      35     233
--------------------------------------------------------------------------------
VERTICES: 01/08 [===================>>-------] 74% ELAPSED TIME: 5795.98 s
--------------------------------------------------------------------------------
ERROR : Status: Failed
ERROR : Vertex re-running, vertexName=Map 8, vertexId=vertex_1549678950511_24672_1_00
ERROR : Vertex re-running, vertexName=Map 1, vertexId=vertex_1549678950511_24672_1_03
ERROR : Vertex re-running, vertexName=Map 7, vertexId=vertex_1549678950511_24672_1_01
ERROR : Vertex re-running, vertexName=Reducer 3, vertexId=vertex_1549678950511_24672_1_05
ERROR : Vertex re-running, vertexName=Reducer 4, vertexId=vertex_1549678950511_24672_1_06
ERROR : Vertex re-running, vertexName=Reducer 2, vertexId=vertex_1549678950511_24672_1_04
[... the same six "Vertex re-running" messages for Map 1, Map 7, Map 8 and Reducers 2-4 repeat many more times ...]
ERROR : Vertex failed, vertexName=Reducer 5, vertexId=vertex_1549678950511_24672_1_07, diagnostics=[Task failed, taskId=task_1549678950511_24672_1_07_000375, diagnostics=[TaskAttempt 0 failed, info=[Container container_1549678950511_24672_01_000184 finished with diagnostics set to [Container failed, exitCode=-100. Container released on a *lost* node]],
TaskAttempt 1 failed, info=[Error: exceptionThrown=org.apache.tez.runtime.library.common.shuffle.orderedgrouped.Shuffle$ShuffleError: error in shuffle in fetcher {Reducer_4} #2
at org.apache.tez.runtime.library.common.shuffle.orderedgrouped.Shuffle$RunShuffleCallable.callInternal(Shuffle.java:382)
at org.apache.tez.runtime.library.common.shuffle.orderedgrouped.Shuffle$RunShuffleCallable.callInternal(Shuffle.java:334)
at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.NullPointerException
at sun.net.www.protocol.http.HttpURLConnection.getInputStream0(HttpURLConnection.java:1601)
at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1492)
at org.apache.tez.runtime.library.common.shuffle.HttpConnection.getInputStream(HttpConnection.java:253)
at org.apache.tez.runtime.library.common.shuffle.orderedgrouped.FetcherOrderedGrouped.setupConnection(FetcherOrderedGrouped.java:356)
at org.apache.tez.runtime.library.common.shuffle.orderedgrouped.FetcherOrderedGrouped.copyFromHost(FetcherOrderedGrouped.java:264)
at org.apache.tez.runtime.library.common.shuffle.orderedgrouped.FetcherOrderedGrouped.fetchNext(FetcherOrderedGrouped.java:176)
at org.apache.tez.runtime.library.common.shuffle.orderedgrouped.FetcherOrderedGrouped.run(FetcherOrderedGrouped.java:191)
, errorMessage=Shuffle Runner Failed:org.apache.tez.runtime.library.common.shuffle.orderedgrouped.Shuffle$ShuffleError: error in shuffle in fetcher {Reducer_4} #2
[... identical stack trace repeated ...]
],
TaskAttempt 2 failed, info=[Container container_1549678950511_24672_01_000292 finished with diagnostics set to [Container failed, exitCode=-100. Container released on a *lost* node]],
TaskAttempt 3 failed, info=[Container container_1549678950511_24672_01_000312 finished with diagnostics set to [Container failed, exitCode=-100. Container released on a *lost* node]]],
Vertex did not succeed due to OWN_TASK_FAILURE, failedTasks:1 killedTasks:369, Vertex vertex_1549678950511_24672_1_07 [Reducer 5] killed/failed due to:OWN_TASK_FAILURE]
ERROR : Vertex killed, vertexName=Reducer 4, vertexId=vertex_1549678950511_24672_1_06, diagnostics=[Vertex received Kill while in RUNNING state., Vertex did not succeed due to OTHER_VERTEX_FAILURE, failedTasks:0 killedTasks:3, Vertex vertex_1549678950511_24672_1_06 [Reducer 4] killed/failed due to:OTHER_VERTEX_FAILURE]
ERROR : Vertex killed, vertexName=Reducer 3, vertexId=vertex_1549678950511_24672_1_05, diagnostics=[Vertex received Kill while in RUNNING state., Vertex did not succeed due to OTHER_VERTEX_FAILURE, failedTasks:0 killedTasks:4, Vertex vertex_1549678950511_24672_1_05 [Reducer 3] killed/failed due to:OTHER_VERTEX_FAILURE]
ERROR : Vertex killed, vertexName=Reducer 2, vertexId=vertex_1549678950511_24672_1_04, diagnostics=[Vertex received Kill while in RUNNING state., Vertex did not succeed due to OTHER_VERTEX_FAILURE, failedTasks:0 killedTasks:3, Vertex vertex_1549678950511_24672_1_04 [Reducer 2] killed/failed due to:OTHER_VERTEX_FAILURE]
ERROR : Vertex killed, vertexName=Map 1, vertexId=vertex_1549678950511_24672_1_03, diagnostics=[Vertex received Kill while in RUNNING state., Vertex did not succeed due to OTHER_VERTEX_FAILURE, failedTasks:0 killedTasks:7, Vertex vertex_1549678950511_24672_1_03 [Map 1] killed/failed due to:OTHER_VERTEX_FAILURE]
ERROR : Vertex killed, vertexName=Map 7, vertexId=vertex_1549678950511_24672_1_01, diagnostics=[Vertex received Kill while in RUNNING state., Vertex did not succeed due to OTHER_VERTEX_FAILURE, failedTasks:0 killedTasks:9, Vertex vertex_1549678950511_24672_1_01 [Map 7] killed/failed due to:OTHER_VERTEX_FAILURE]
ERROR : Vertex killed, vertexName=Map 8, vertexId=vertex_1549678950511_24672_1_00, diagnostics=[Vertex received Kill while in RUNNING state., Vertex did not succeed due to OTHER_VERTEX_FAILURE, failedTasks:0 killedTasks:6, Vertex vertex_1549678950511_24672_1_00 [Map 8] killed/failed due to:OTHER_VERTEX_FAILURE]
ERROR : DAG did not succeed due to VERTEX_FAILURE. failedVertices:1 killedVertices:6
Error: Error while processing statement: FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.tez.TezTask.
[The same "Vertex re-running" messages and the Reducer 5 / Vertex killed diagnostics above are then repeated verbatim in the error summary, ending with:]
DAG did not succeed due to VERTEX_FAILURE. failedVertices:1 killedVertices:6 (state=08S01,code=2)
I've tried twice but I get the same result.
By the way, there are 336258079 rows in total in the MID_DealerVehicleOutputValue table. Could this be causing the error? Some other similar statements ran successfully before this one, but they did not have as many rows to process.
Before doing any memory adjustments, try to rewrite your query in a better way: it does unnecessary work performing extra joins. First of all, you can greatly simplify it by removing the unnecessary inner join with the subquery in which you calculate count(*). Use the analytic count(*) over(partition by DIMDealerID) instead (a toy sketch of this rewrite follows this answer):
INSERT overwrite table datalake_rci.MID_DealerVehicleOutputValue --MIDDealerVehicleOutputValueID int,
select b.MIDDealerVehicleOutputValueID
,b.DealerID --string,
,b.CHASSIS --string,
,b.DIMDealerID --bigint,
,b.DIMVehicleID --bigint,
,case when a.DIMDealerID is not null then 1 else b.DIMOutputValueID end DIMOutputValueID --int
,b.OutputValueName --string,
,b.OutputValueName_CN --string,
,b.OutputValueCode --varchar(50),
,b.OutputValueOrder --int
from datalake_rci.MID_DealerVehicleOutputValue b
left outer join
(select w.low, w.DIMDealerID, w.DIMVehicleID, w.OutputValueOrder, w.row_num
 from
 (select z.low, z.DIMDealerID, z.DIMVehicleID, z.OutputValueOrder,
         row_number() over(partition by z.DIMDealerID order by z.OutputValueOrder desc) row_num
  from
  (select DIMDealerID, DIMVehicleID, OutputValueOrder,
          count(*) over(partition by DIMDealerID) * l.Rate low
   from datalake_rci.MID_DealerVehicleOutputValue
   cross join (select Rate from datalake_rci.DIM_OutputValue where OutputValueCode = 'Low') l
  ) z
 ) w
 where w.row_num <= w.low
) a on b.DIMDealerID = a.DIMDealerID;
As the next step in optimizing your query, try to remove the unnecessary left join as well. Like this:
INSERT overwrite table datalake_rci.MID_DealerVehicleOutputValue --MIDDealerVehicleOutputValueID int,
select s.MIDDealerVehicleOutputValueID
,s.DealerID --string,
,s.CHASSIS --string,
,s.DIMDealerID --bigint,
,s.DIMVehicleID --bigint,
,case when s.row_num <= s.low --you do not need a join to calculate this
      then 1 else s.DIMOutputValueID end DIMOutputValueID --int
,s.OutputValueName --string,
,s.OutputValueName_CN --string,
,s.OutputValueCode --varchar(50),
,s.OutputValueOrder --int
from
(select s.low, s.DIMDealerID, s.DIMVehicleID, s.OutputValueOrder,
        s.MIDDealerVehicleOutputValueID, s.DealerID, s.CHASSIS, s.DIMOutputValueID,
        s.OutputValueName, s.OutputValueName_CN, s.OutputValueCode,
        row_number() over(partition by s.DIMDealerID order by s.OutputValueOrder desc) row_num
 from
 (select s.DIMDealerID, s.DIMVehicleID, s.OutputValueOrder,
         s.MIDDealerVehicleOutputValueID, s.DealerID, s.CHASSIS, s.DIMOutputValueID,
         s.OutputValueName, s.OutputValueName_CN, s.OutputValueCode,
         count(*) over(partition by s.DIMDealerID) * l.Rate low
  from datalake_rci.MID_DealerVehicleOutputValue s
  cross join (select Rate from datalake_rci.DIM_OutputValue where OutputValueCode = 'Low') l
 ) s
) s; --one or two subqueries can also be removed
Of course my queries may contain some bugs and should be tested carefully, but I hope you get the idea: it is almost always possible to eliminate self-joins. In the end your query will read each table only once, and many other heavy steps will be eliminated. I expect you will get rid of at least two reducer and two mapper vertices. Also, I'd suggest increasing mapper parallelism. Tune these settings, reducing the figures until you get more mappers running:
--tune mapper parallelism
set hive.input.format=org.apache.hadoop.hive.ql.io.HiveInputFormat;
set tez.grouping.max-size=67108864;
set tez.grouping.min-size=32000000;
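To make the key rewrite concrete, here is a minimal, hypothetical sketch (table t and column dealer_id are illustrative, not from the question) showing how the analytic count replaces the aggregate-plus-self-join pattern:
-- join-based pattern: scans t twice and shuffles for the join
select t.dealer_id, c.cnt
from t
join (select dealer_id, count(*) cnt from t group by dealer_id) c
  on t.dealer_id = c.dealer_id;
-- analytic equivalent: a single scan of t, no self-join
select t.dealer_id,
       count(*) over(partition by dealer_id) cnt
from t;
Both forms return one row per row of t with the per-dealer count attached; the analytic form lets Tez drop the extra map and reducer vertices that the join would require.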
You should change the value of mapreduce.reduce.shuffle.memory.limit.percent. This parameter sets the maximum percentage of the reducer's in-memory shuffle buffer that a single shuffle segment (the output copied from a single map task) may occupy. Segments larger than this limit are not copied into the memory buffer; they are written directly to the reducer's disk. Try reducing the value of this parameter and then run the query again. Also make sure that the value of mapreduce.reduce.shuffle.merge.percent is lower than mapreduce.reduce.shuffle.memory.limit.percent.
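As a minimal sketch, these could be set per session before re-running the statement; the 0.15/0.10 values below are illustrative starting points (not from the original answer), chosen only so that merge.percent stays below memory.limit.percent as advised:
-- illustrative values; tune for your cluster
set mapreduce.reduce.shuffle.memory.limit.percent=0.15;
set mapreduce.reduce.shuffle.merge.percent=0.10;
-- then re-run the failing INSERT ... SELECT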
Redis timeout issue
I am getting a timeout issue with Redis. The following is the exception:
System.TimeoutException: Timeout performing EVAL, inst: 1, mgr: ProcessReadQueue, err: never, queue: 7, qu: 0, qs: 7, qc: 0, wr: 0, wq: 0, in: 507, ar: 1, clientName: EG-APP04, serverEndpoint: Unspecified/####, keyHashSlot: 12598, IOCP: (Busy=0,Free=1600,Min=800,Max=1600), WORKER: (Busy=187,Free=1413,Min=800,Max=1600), Local-CPU: unavailable
Configuration connection string:
connectionString="####,connectRetry=10, abortConnect=false, allowAdmin=true, ssl=false"
Am I missing something?
Try increasing your syncTimeout to a higher value, like below:
connectionString="####,connectRetry=10, abortConnect=false, allowAdmin=true, ssl=false, syncTimeout=30000"