[Amazon](500310) Invalid operation: Assert Query caught Query_abort exception

I'm running a heavy query with multiple joins on Redshift, and I continuously get the following error (running via SQL Workbench):
Amazon Invalid operation: Assert
Details:
-----------------------------------------------
error: Assert
code: 1000
context: dex < m_num_colflds && dex >= 0 - m_num_colflds:1 dex:4
query: 162544
location: tbl_trans.hpp:398
process: padbmaster [pid=1572]
-----------------------------------------------;
I searched for this error but couldn't get to the root cause. When I query the stl_error table, however, I get the following information.
Query 162511 caught Query_abort exception
dex < m_num_colflds && dex >= 0 - m_num_colflds:1 dex:4
We have migrated about 2.5 TB of sample data from Netezza to Redshift and are trying to run the Netezza ETL queries on Redshift; I have done the necessary conversions. We have a Redshift cluster of 2 nodes, node type ds2.xlarge. Can someone point out what the cause could be?
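In case it helps, here is a sketch of the system-table lookups for pulling the full error context and the SQL text of the failing statement (the pid and query id below are the ones reported in the error above):

select process, pid, errcode, file, linenum, context, error
from stl_error
where pid = 1572;       -- pid from the error message

select query, starttime, endtime, aborted, substring(querytxt, 1, 200) as query_text
from stl_query
where query = 162544;   -- query id from the error message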
Here is the query

Related

Delta Table: org.apache.spark.sql.catalyst.parser.ParseException: mismatched input 'FROM'

I am trying to run the following query on EMR/EMR Notebooks (Spark with Scala) -
SELECT max(version), max(timestamp) FROM (DESCRIBE HISTORY delta.`s3://a/b/c/d`)
But I am getting the following error -
The same query works fine on Databricks.
Another doubt that I have: why does the colour of the s3 location change after the //?
So I tried to break up the above query and run only the DESCRIBE HISTORY part. And for some reason it says -
Error Log -
An error was encountered:
org.apache.spark.sql.AnalysisException: Table or view not found: HISTORY;
at org.apache.spark.sql.catalyst.analysis.package$AnalysisErrorAt.failAnalysis(package.scala:47)
at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations$.org$apache$spark$sql$catalyst$analysis$Analyzer$ResolveRelations$$lookupTableFromCatalog(Analyzer.scala:835)
at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations$.resolveRelation(Analyzer.scala:787)
at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations$$anonfun$apply$8.applyOrElse(Analyzer.scala:817)
at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations$$anonfun$apply$8.applyOrElse(Analyzer.scala:810)
at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper$$anonfun$resolveOperatorsUp$1$$anonfun$apply$1.apply(AnalysisHelper.scala:90)
at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper$$anonfun$resolveOperatorsUp$1$$anonfun$apply$1.apply(AnalysisHelper.scala:90)
at org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(TreeNode.scala:71)
at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper$$anonfun$resolveOperatorsUp$1.apply(AnalysisHelper.scala:89)
at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper$$anonfun$resolveOperatorsUp$1.apply(AnalysisHelper.scala:86)
at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper$.allowInvokingTransformsInAnalyzer(AnalysisHelper.scala:194)
at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper$class.resolveOperatorsUp(AnalysisHelper.scala:86)
at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.resolveOperatorsUp(LogicalPlan.scala:30)
at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations$.apply(Analyzer.scala:810)
at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations$.apply(Analyzer.scala:756)
at org.apache.spark.sql.catalyst.rules.RuleExecutor$$anonfun$execute$1$$anonfun$apply$1$$anonfun$2.apply(RuleExecutor.scala:92)
at org.apache.spark.sql.catalyst.rules.RuleExecutor$$anonfun$execute$1$$anonfun$apply$1$$anonfun$2.apply(RuleExecutor.scala:92)
at org.apache.spark.sql.execution.QueryExecutionMetrics$.withMetrics(QueryExecutionMetrics.scala:141)
at org.apache.spark.sql.catalyst.rules.RuleExecutor$$anonfun$execute$1$$anonfun$apply$1.apply(RuleExecutor.scala:91)
at org.apache.spark.sql.catalyst.rules.RuleExecutor$$anonfun$execute$1$$anonfun$apply$1.apply(RuleExecutor.scala:88)
at scala.collection.LinearSeqOptimized$class.foldLeft(LinearSeqOptimized.scala:124)
at scala.collection.immutable.List.foldLeft(List.scala:84)
at org.apache.spark.sql.catalyst.rules.RuleExecutor$$anonfun$execute$1.apply(RuleExecutor.scala:88)
at org.apache.spark.sql.catalyst.rules.RuleExecutor$$anonfun$execute$1.apply(RuleExecutor.scala:80)
at scala.collection.immutable.List.foreach(List.scala:392)
at org.apache.spark.sql.catalyst.rules.RuleExecutor.execute(RuleExecutor.scala:80)
at org.apache.spark.sql.catalyst.analysis.Analyzer.org$apache$spark$sql$catalyst$analysis$Analyzer$$executeSameContext(Analyzer.scala:164)
at org.apache.spark.sql.catalyst.analysis.Analyzer$$anonfun$execute$1.apply(Analyzer.scala:156)
at org.apache.spark.sql.catalyst.analysis.Analyzer$$anonfun$execute$1.apply(Analyzer.scala:156)
at org.apache.spark.sql.catalyst.analysis.AnalysisContext$.withLocalMetrics(Analyzer.scala:104)
at org.apache.spark.sql.catalyst.analysis.Analyzer.execute(Analyzer.scala:155)
at org.apache.spark.sql.catalyst.analysis.Analyzer$$anonfun$executeAndCheck$1.apply(Analyzer.scala:126)
at org.apache.spark.sql.catalyst.analysis.Analyzer$$anonfun$executeAndCheck$1.apply(Analyzer.scala:125)
at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper$.markInAnalyzer(AnalysisHelper.scala:201)
at org.apache.spark.sql.catalyst.analysis.Analyzer.executeAndCheck(Analyzer.scala:125)
at org.apache.spark.sql.execution.QueryExecution.analyzed$lzycompute(QueryExecution.scala:76)
at org.apache.spark.sql.execution.QueryExecution.analyzed(QueryExecution.scala:74)
at org.apache.spark.sql.execution.QueryExecution.assertAnalyzed(QueryExecution.scala:66)
at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:80)
at org.apache.spark.sql.SparkSession.table(SparkSession.scala:630)
at org.apache.spark.sql.execution.command.DescribeColumnCommand.run(tables.scala:714)
at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:70)
at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:68)
at org.apache.spark.sql.execution.command.ExecutedCommandExec.executeCollect(commands.scala:79)
at org.apache.spark.sql.Dataset$$anonfun$6.apply(Dataset.scala:196)
at org.apache.spark.sql.Dataset$$anonfun$6.apply(Dataset.scala:196)
at org.apache.spark.sql.Dataset$$anonfun$52.apply(Dataset.scala:3391)
at org.apache.spark.sql.execution.SQLExecution$.org$apache$spark$sql$execution$SQLExecution$$executeQuery$1(SQLExecution.scala:83)
at org.apache.spark.sql.execution.SQLExecution$$anonfun$withNewExecutionId$1$$anonfun$apply$1.apply(SQLExecution.scala:94)
at org.apache.spark.sql.execution.QueryExecutionMetrics$.withMetrics(QueryExecutionMetrics.scala:141)
at org.apache.spark.sql.execution.SQLExecution$.org$apache$spark$sql$execution$SQLExecution$$withMetrics(SQLExecution.scala:178)
at org.apache.spark.sql.execution.SQLExecution$$anonfun$withNewExecutionId$1.apply(SQLExecution.scala:93)
at org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:200)
at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:92)
at org.apache.spark.sql.Dataset.org$apache$spark$sql$Dataset$$withAction(Dataset.scala:3390)
at org.apache.spark.sql.Dataset.<init>(Dataset.scala:196)
at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:81)
at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:644)
... 50 elided
Caused by: org.apache.spark.sql.catalyst.analysis.NoSuchTableException: Table or view 'history' not found in database 'default';
at org.apache.spark.sql.hive.client.HiveClient$$anonfun$getTable$1.apply(HiveClient.scala:81)
at org.apache.spark.sql.hive.client.HiveClient$$anonfun$getTable$1.apply(HiveClient.scala:81)
at scala.Option.getOrElse(Option.scala:121)
at org.apache.spark.sql.hive.client.HiveClient$class.getTable(HiveClient.scala:81)
at org.apache.spark.sql.hive.client.HiveClientImpl.getTable(HiveClientImpl.scala:84)
at org.apache.spark.sql.hive.HiveExternalCatalog.getRawTable(HiveExternalCatalog.scala:141)
at org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$getTable$1.apply(HiveExternalCatalog.scala:723)
at org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$getTable$1.apply(HiveExternalCatalog.scala:723)
at org.apache.spark.sql.hive.HiveExternalCatalog.withClient(HiveExternalCatalog.scala:98)
at org.apache.spark.sql.hive.HiveExternalCatalog.getTable(HiveExternalCatalog.scala:722)
at org.apache.spark.sql.catalyst.catalog.ExternalCatalogWithListener.getTable(ExternalCatalogWithListener.scala:138)
at org.apache.spark.sql.catalyst.catalog.SessionCatalog.lookupRelation(SessionCatalog.scala:706)
at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations$.org$apache$spark$sql$catalyst$analysis$Analyzer$ResolveRelations$$lookupTableFromCatalog(Analyzer.scala:832)
UPDATE (18-Feb-2021) -> What I have tried so far.
Query using Spark SQL -
spark.sql("SELECT max(version), max(timestamp) FROM (DESCRIBE HISTORY delta.`s3://a/b/c/d`)")
But this didn't work; same error.
Create the Spark session with -
spark.sql.extensions=io.delta.sql.DeltaSparkSessionExtension
spark.sql.catalog.spark_catalog=org.apache.spark.sql.delta.catalog.DeltaCatalog
But it's throwing the same error.
UPDATE 2 (18-Feb-2021): Tried the approach mentioned by @Alex, using PySpark.
It was working partly, but not completely.
Thanks in advance.
Per the documentation, to get support for DESCRIBE HISTORY you need to configure the Spark SQL extensions and catalog by passing two properties:
spark.sql.extensions to value io.delta.sql.DeltaSparkSessionExtension
spark.sql.catalog.spark_catalog to value org.apache.spark.sql.delta.catalog.DeltaCatalog
Update:
For Spark 2.4.x, Delta 0.6.1 should be used, and its documentation has the following code snippet to activate the extensions:
from pyspark.sql import SparkSession
# Register the Delta SQL extensions on the running session's JVM,
# then clone the session so the parser picks them up.
spark.sparkContext._jvm.io.delta.sql.DeltaSparkSessionExtension() \
    .apply(spark._jsparkSession.extensions())
spark = SparkSession(spark.sparkContext, spark._jsparkSession.cloneSession())

'An invalid floating point operation occurred' using LOG function SQL Server

I am trying to perform an operation in SQL Server using LOG() and a few fields that are within my table. When I try to run the query, I get the error
'An invalid floating point operation occurred.'
I tried the solution provided in An invalid floating point operation occurred, but it did not solve the problem. This is the original code that I used:
SELECT
    MIN(ROUND((ROUND(LOG(amt1 / ABS(rate * amt2 - amt1)), 5) / ROUND(LOG(1 + rate), 10)), 0))
FROM table
WHERE amt2 - amt1 > 0
I expected the output to show a value. I can run this for one specific field, but as I expand it out to the whole data set the error occurs.
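One likely cause, hinted at by the fact that a single row works while the full table fails, is that some rows feed LOG() a zero or negative argument, or make one of the divisors zero. Below is a hedged sketch of a guarded version of the same query (table and column names are the ones from the question; the table name is bracketed because TABLE is a reserved word):

SELECT
    MIN(ROUND(
        ROUND(LOG(CASE WHEN amt1 / ABS(NULLIF(rate * amt2 - amt1, 0)) > 0
                       THEN amt1 / ABS(NULLIF(rate * amt2 - amt1, 0)) END), 5)
      / NULLIF(ROUND(LOG(CASE WHEN 1 + rate > 0 THEN 1 + rate END), 10), 0)
    , 0))
FROM [table]
WHERE amt2 - amt1 > 0
-- The CASE yields NULL instead of a non-positive LOG argument, NULLIF turns a
-- zero divisor into NULL, and MIN() simply ignores the resulting NULL rows.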

DBeaver Exception: Data Source was invalid

I am trying to work with DBeaver, processing data via Spark Hive. The connection is stable, as the following command works:
select * from database.table limit 100
However, as soon as I deviate from that simple fetch query I get an exception. E.g., running the query
select count(*) from database.table limit 100
results in the exception:
SQL Error [2] [08S01]: org.apache.hive.service.cli.HiveSQLException: Error
while processing statement: FAILED: Execution Error, return code 2
from org.apache.hadoop.hive.ql.exec.tez.TezTask. Vertex failed,
vertexName=Map 1, vertexId=vertex_1526294345914_23590_12_00,
diagnostics=[Vertex vertex_1526294345914_23590_12_00 [Map 1]
killed/failed due to:ROOT_INPUT_INIT_FAILURE, Vertex Input: postings
initializer failed, vertex=vertex_1526294345914_23590_12_00 [Map 1],
com.amazon.ws.emr.hadoop.fs.shaded.com.amazonaws.services.s3.model.AmazonS3Exception:
Bad Request (Service: Amazon S3; Status Code: 400; Error Code: 400 Bad
Request; Request ID: 95BFFF20D13AECDA), S3 Extended Request ID:
fSbzZDf/Xi0b+CL99c5DKi8GYrJ7TQXj5/WWGCiCpGa6JU5SGeoxA4lunoxPCNBJ2MPA3Hxh14M=
Can someone help me here?
400/Bad Request is the S3/AWS generic "didn't like your payload/request/auth" response. There are some details in the ASF S3A docs, but those are for the ASF connector, not the Amazon one (which yours is, from the stack trace). A bad endpoint for v4-authenticated buckets is usually problem #1; after that... who knows?
Try some basic hadoop fs -ls s3://bucket/path operations first.
You can try running the cloudstore diagnostics against it; that's my first call for debugging a client. It's not explicitly EMR-s3-connector aware though, so it won't look at the credentials in any detail.

[Amazon](500310) Invalid operation: Assert

I am using spark-redshift and querying Redshift data using PySpark for processing.
The query works fine if I run it on Redshift using Workbench etc., but spark-redshift unloads data to S3 and then retrieves it, and it throws the following error when I run it.
py4j.protocol.Py4JJavaError: An error occurred while calling o124.save.
: java.sql.SQLException: [Amazon](500310) Invalid operation: Assert
Details:
-----------------------------------------------
error: Assert
code: 1000
context: !AmLeaderProcess -
query: 583860
location: scheduler.cpp:642
process: padbmaster [pid=31521]
-----------------------------------------------;
at com.amazon.redshift.client.messages.inbound.ErrorResponse.toErrorException(ErrorResponse.java:1830)
at com.amazon.redshift.client.PGMessagingContext.handleErrorResponse(PGMessagingContext.java:822)
at com.amazon.redshift.client.PGMessagingContext.handleMessage(PGMessagingContext.java:647)
at com.amazon.jdbc.communications.InboundMessagesPipeline.getNextMessageOfClass(InboundMessagesPipeline.java:312)
at com.amazon.redshift.client.PGMessagingContext.doMoveToNextClass(PGMessagingContext.java:1080)
at com.amazon.redshift.client.PGMessagingContext.getErrorResponse(PGMessagingContext.java:1048)
at com.amazon.redshift.client.PGClient.handleErrorsScenario2ForPrepareExecution(PGClient.java:2524)
at com.amazon.redshift.client.PGClient.handleErrorsPrepareExecute(PGClient.java:2465)
at com.amazon.redshift.client.PGClient.executePreparedStatement(PGClient.java:1420)
at com.amazon.redshift.dataengine.PGQueryExecutor.executePreparedStatement(PGQueryExecutor.java:370)
at com.amazon.redshift.dataengine.PGQueryExecutor.execute(PGQueryExecutor.java:245)
at com.amazon.jdbc.common.SPreparedStatement.executeWithParams(Unknown Source)
at com.amazon.jdbc.common.SPreparedStatement.execute(Unknown Source)
at com.databricks.spark.redshift.JDBCWrapper$$anonfun$executeInterruptibly$1.apply(RedshiftJDBCWrapper.scala:108)
at com.databricks.spark.redshift.JDBCWrapper$$anonfun$executeInterruptibly$1.apply(RedshiftJDBCWrapper.scala:108)
at com.databricks.spark.redshift.JDBCWrapper$$anonfun$2.apply(RedshiftJDBCWrapper.scala:126)
at scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24)
at scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
Caused by: com.amazon.support.exceptions.ErrorException: [Amazon](500310) Invalid operation: Assert
The query which gets generated:
UNLOAD ('SELECT "x","y" FROM (select x,y from table_name where
((load_date=20171226 and hour>=16) or (load_date between 20171227 and
20171226) or (load_date=20171227 and hour<=16))) ') TO 's3:s3path' WITH
CREDENTIALS 'aws_access_key_id=xxx;aws_secret_access_key=yyy' ESCAPE
MANIFEST
What is the issue here, and how can I resolve it?
An Assert error usually happens when something is wrong with interpreting data types, for example in the two parts of a union query where column N is a varchar in one part and an integer (or null) in the other. Maybe your assertion error happens for data that comes from different nodes (just like in a union query). Try adding explicit data formatting for each column, like x::integer.
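For illustration, a hedged sketch of the generated UNLOAD with explicit casts added in the inner query (the part you pass to spark-redshift); the ::integer/::varchar types and the s3://bucket/path target are placeholders, not taken from the question:

UNLOAD ('SELECT "x","y" FROM
         (select x::integer as x, y::varchar as y
          from table_name
          where ((load_date=20171226 and hour>=16) or
                 (load_date between 20171227 and 20171226) or
                 (load_date=20171227 and hour<=16)))')
TO 's3://bucket/path'
CREDENTIALS 'aws_access_key_id=xxx;aws_secret_access_key=yyy'
ESCAPE MANIFEST;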

Orchard.Alias.Implementation.Updater.AliasHolderUpdater - Exception during Alias refresh

crosspost: https://orchard.codeplex.com/discussions/473454
I want to start by saying that I'm currently migrating from Orchard CMS 1.6 to 1.7.2, so it used to work in 1.6 but I'm now having issues with 1.7.2.
Two of my content types are having issues when creating items: they never finish saving, and when I check the logs I get this:
Orchard.Alias.Implementation.Updater.AliasHolderUpdater - Exception during Alias refresh
NHibernate.Exceptions.GenericADOException: could not execute query
[ select aliasrecor0_.Id as Id1829_, aliasrecor0_.Path as Path1829_, aliasrecor0_.RouteValues as RouteVal3_1829_, aliasrecor0_.Source as Source1829_, aliasrecor0_.Action_id as Action5_1829_ from Orchard_Alias_AliasRecord aliasrecor0_ where aliasrecor0_.Id>#p0 order by aliasrecor0_.Id asc ]
Name:p1 - Value:48
[SQL: select aliasrecor0_.Id as Id1829_, aliasrecor0_.Path as Path1829_, aliasrecor0_.RouteValues as RouteVal3_1829_, aliasrecor0_.Source as Source1829_, aliasrecor0_.Action_id as Action5_1829_ from Orchard_Alias_AliasRecord aliasrecor0_ where aliasrecor0_.Id>#p0 order by aliasrecor0_.Id asc] ---> System.Data.SqlClient.SqlException: Timeout expired. The timeout period elapsed prior to completion of the operation or the server is not responding. ---> System.ComponentModel.Win32Exception: The wait operation timed out
When I stop it and view the site (anywhere, really), it's entirely wrecked with this error:
Exception Details: System.ComponentModel.Win32Exception: The wait operation timed out
[Win32Exception (0x80004005): The wait operation timed out]
[SqlException (0x80131904): Timeout expired. The timeout period elapsed prior to completion of the operation or the server is not responding.]
Line 162: return criteria
Line 163: .List<ContentItemVersionRecord>()
Line 164: .Select(x => ContentManager.Get(x.ContentItemRecord.Id, _versionOptions != null && _versionOptions.IsDraftRequired ? _versionOptions : VersionOptions.VersionRecord(x.Id)))
Source File: d:\Projects\Office Ignite\Main-1.7\src\Orchard\ContentManagement\DefaultContentQuery.cs Line: 162
I don't know why this is isolated to those two content types. They don't have parts with custom tables or anything.
Any piece of information would be highly appreciated. Thanks!
I have the same error, but it seems the problem is not directly related to my code.
I have found two possible explanations so far:
1.) Taxonomy corruption problem: https://orchard.codeplex.com/workitem/20411
2.) Stale ("dirty") statistics, plus heavy use of the locking that SELECT statements take by default (see the sketch below): https://serverfault.com/questions/419997/the-wait-operation-timed-out-when-running-sql-server-in-hyper-v
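If it turns out to be the locking case, here is a diagnostic sketch using standard SQL Server DMVs (run it from a separate connection while the save hangs) to see which session is blocking:

-- List requests that are currently blocked, the session blocking them,
-- their wait type, and the SQL text they are executing.
SELECT r.session_id,
       r.blocking_session_id,
       r.wait_type,
       r.wait_time,
       t.text AS running_sql
FROM sys.dm_exec_requests AS r
CROSS APPLY sys.dm_exec_sql_text(r.sql_handle) AS t
WHERE r.blocking_session_id <> 0;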