I am using spark-redshift to query Redshift data from PySpark for processing.
The query works fine when I run it directly on Redshift (using a workbench, etc.). But spark-redshift unloads the data to S3 and then retrieves it, and it throws the following error when I run it:
py4j.protocol.Py4JJavaError: An error occurred while calling o124.save.
: java.sql.SQLException: [Amazon](500310) Invalid operation: Assert
Details:
-----------------------------------------------
error: Assert
code: 1000
context: !AmLeaderProcess -
query: 583860
location: scheduler.cpp:642
process: padbmaster [pid=31521]
-----------------------------------------------;
at com.amazon.redshift.client.messages.inbound.ErrorResponse.toErrorException(ErrorResponse.java:1830)
at com.amazon.redshift.client.PGMessagingContext.handleErrorResponse(PGMessagingContext.java:822)
at com.amazon.redshift.client.PGMessagingContext.handleMessage(PGMessagingContext.java:647)
at com.amazon.jdbc.communications.InboundMessagesPipeline.getNextMessageOfClass(InboundMessagesPipeline.java:312)
at com.amazon.redshift.client.PGMessagingContext.doMoveToNextClass(PGMessagingContext.java:1080)
at com.amazon.redshift.client.PGMessagingContext.getErrorResponse(PGMessagingContext.java:1048)
at com.amazon.redshift.client.PGClient.handleErrorsScenario2ForPrepareExecution(PGClient.java:2524)
at com.amazon.redshift.client.PGClient.handleErrorsPrepareExecute(PGClient.java:2465)
at com.amazon.redshift.client.PGClient.executePreparedStatement(PGClient.java:1420)
at com.amazon.redshift.dataengine.PGQueryExecutor.executePreparedStatement(PGQueryExecutor.java:370)
at com.amazon.redshift.dataengine.PGQueryExecutor.execute(PGQueryExecutor.java:245)
at com.amazon.jdbc.common.SPreparedStatement.executeWithParams(Unknown Source)
at com.amazon.jdbc.common.SPreparedStatement.execute(Unknown Source)
at com.databricks.spark.redshift.JDBCWrapper$$anonfun$executeInterruptibly$1.apply(RedshiftJDBCWrapper.scala:108)
at com.databricks.spark.redshift.JDBCWrapper$$anonfun$executeInterruptibly$1.apply(RedshiftJDBCWrapper.scala:108)
at com.databricks.spark.redshift.JDBCWrapper$$anonfun$2.apply(RedshiftJDBCWrapper.scala:126)
at scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24)
at scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
Caused by: com.amazon.support.exceptions.ErrorException: [Amazon](500310) Invalid operation: Assert
The query which gets generated:
UNLOAD ('SELECT "x","y" FROM (select x,y from table_name where
((load_date=20171226 and hour>=16) or (load_date between 20171227 and
20171226) or (load_date=20171227 and hour<=16))) ') TO 's3:s3path' WITH
CREDENTIALS 'aws_access_key_id=xxx;aws_secret_access_key=yyy' ESCAPE
MANIFEST
What is the issue here, and how can I resolve it?
An Assert error usually means something went wrong while interpreting data types. For example, in a UNION query, column N might be varchar in one part and integer or null in the other. Your assertion error may be caused by data that comes from different nodes (much like the parts of a UNION query). Try adding an explicit cast to each column, such as x::integer.
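The casting advice can be sketched on the PySpark side as a small helper that builds the SELECT with an explicit cast on every column before it is handed to spark-redshift. The helper name build_cast_query and the column types below are assumptions for illustration; the real types of x and y are not given in the question.

```python
def build_cast_query(table, column_types, where=None):
    """Build a SELECT that casts every column explicitly (Redshift `::` syntax).

    column_types maps column name -> Redshift type. The types used in the
    example below are assumptions, not taken from the original table.
    """
    select_list = ", ".join(
        f"{name}::{sql_type} AS {name}" for name, sql_type in column_types.items()
    )
    query = f"SELECT {select_list} FROM {table}"
    if where:
        query += f" WHERE {where}"
    return query


query = build_cast_query(
    "table_name",
    {"x": "integer", "y": "varchar(256)"},  # hypothetical column types
    where="load_date = 20171226 AND hour >= 16",
)
print(query)
```

The resulting string could then be passed as the query option of the spark-redshift reader in place of the original SELECT.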
I've got a Hive SQL script/action as part of an Oozie workflow. I'm doing a CREATE TABLE AS SELECT to output the results. I want to name the table using the username plus an appended string (e.g. "User123456_output_table"), but can't seem to get the correct syntax.
set tablename=${hivevar:current_user()};
CREATE TABLE `${hiveconf:tablename}_output_table` AS SELECT ...
That doesn't work and gives:
Error while compiling statement: FAILED: IllegalArgumentException java.net.URISyntaxException: Relative path in absolute URI: ${hivevar:current_user()%7D_output_table
Or changing the first line to set tablename=${current_user()}; starts running the SELECT query but eventually stops with:
Error while processing statement: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. org.apache.hadoop.hive.ql.metadata.HiveException: [${current_user()}_output_table]: is not a valid table name
Or changing the first line to set tablename=current_user(); starts running the SELECT query but eventually stops with:
Error while processing statement: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. org.apache.hadoop.hive.ql.metadata.HiveException: [current_user()_output_table]: is not a valid table name
Alternatively, is there a way to pass the username from the Oozie workflow via a parameter?
I'm using Hue to do all this rather than the command line.
Thanks
This is wrong: set tablename=${hivevar:current_user()}; The function call will not be evaluated; the text is substituted as is.
Hive does not evaluate variables before substitution. It substitutes them as is, and functions inside variables are NOT evaluated. Variables are just text replacement.
This:
set tablename=current_user();
CREATE TABLE `${hiveconf:tablename}_output_table` ...
gets resolved as
CREATE TABLE `current_user()_output_table` ...
Functions are not supported in table names, so it will not work this way.
The solution is to compute the value outside the script and pass it in as a parameter.
See this blog: https://prodlife.wordpress.com/2013/12/06/parameterizing-hive-actions-in-oozie-workflows/
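To make the "just text replacement" point concrete, here is a toy model in Python. It is not Hive's implementation: the substitute function and its regular expression are assumptions that only mimic the textual replacement of ${hiveconf:name} references, and User123456 is the example user name from the question.

```python
import re


def substitute(script, variables):
    """Replace ${hiveconf:name} with the raw text stored for `name`.

    Toy model only: nothing stored in a variable is ever evaluated.
    Unknown variable names are left untouched.
    """
    def repl(match):
        return variables.get(match.group(1), match.group(0))

    return re.sub(r"\$\{hiveconf:(\w+)\}", repl, script)


script = "CREATE TABLE `${hiveconf:tablename}_output_table` AS SELECT ..."

# Function text stored in the variable is pasted in verbatim, not called.
broken = substitute(script, {"tablename": "current_user()"})

# A value computed outside the script yields a usable table name.
fixed = substitute(script, {"tablename": "User123456"})

print(broken)
print(fixed)
```

The first result still contains the literal text current_user(), which is why the table name is rejected; the second shows what the script looks like once the user name is supplied from outside.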
I'm running a heavy query on RedShift with multiple joins. I'm continuously getting the following error (running via SQL Workbench):
Amazon Invalid operation: Assert
Details:
-----------------------------------------------
error: Assert
code: 1000
context: dex < m_num_colflds && dex >= 0 - m_num_colflds:1 dex:4
query: 162544
location: tbl_trans.hpp:398
process: padbmaster [pid=1572]
-----------------------------------------------;
I searched for this error but couldn't find the root cause. However, when I query the stl_error table, I get the following information:
Query 162511 caught Query_abort exception
dex < m_num_colflds && dex >= 0 - m_num_colflds:1 dex:4
We have migrated a sample of about 2.5 TB of data from Netezza to Redshift and are trying to run the Netezza ETL queries on Redshift. I have done the necessary conversions. We have a Redshift cluster of 2 nodes, and the node type is ds2.xlarge. Can someone point out what the cause could be?
Here is the query
I have a simple example that is causing an error using Firebird SQL.
I have a table with a column called Details which is defined as:
DETAILS varchar(261) COLLATE UNICODE
If I try to do the following query:
SELECT a.DETAILS
FROM MODHISTORY a
WHERE
a.DETAILS LIKE '%Â%'
I get the error:
Error: *** IBPP::SQLException ***
Context: Statement::Prepare( SELECT a.DETAILS
FROM MODHISTORY a
WHERE
a.DETAILS LIKE '%Â%'
)
Message: isc_dsql_prepare failed
SQL Message : -104
Invalid token
Engine Code : 335544849
Engine Message :
Malformed string
If I connect to the database using CHARSET=UTF8 in the connection string this error goes away but unfortunately I cannot use UTF8 as the character set when connecting to the database because some other tables contain, for example:
SampleData blob sub_type 1 CHARACTER SET ASCII,
I solved my problem as suggested: I now bind the parameters, and everything works fine.
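Binding the search pattern as a parameter, rather than embedding the non-ASCII literal in the statement text, might look like the following. This is only a sketch: it uses Python's built-in sqlite3 module and an in-memory table as a stand-in, since the original Firebird driver and connection settings are not shown, and the sample rows are made up.

```python
import sqlite3

# In-memory stand-in for the real database (assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE MODHISTORY (DETAILS TEXT)")
conn.execute("INSERT INTO MODHISTORY (DETAILS) VALUES (?)", ("price Â£5",))
conn.execute("INSERT INTO MODHISTORY (DETAILS) VALUES (?)", ("plain text",))

# The pattern travels as a bound parameter, so the statement text itself
# contains only the ? placeholder and no non-ASCII characters.
pattern = "%Â%"
rows = conn.execute(
    "SELECT a.DETAILS FROM MODHISTORY a WHERE a.DETAILS LIKE ?", (pattern,)
).fetchall()
print(rows)
conn.close()
```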
My application throws an exception while applying a parameter map to an SQL statement. The error is:
Caused By: com.ibatis.common.jdbc.exception.NestedSQLException:
--- The error occurred in /com/***/cusman/cusbilman/postpaid/main/product/data/ibatis/sqlMap/THSSqlMap.xml.
--- The error occurred while applying a parameter map.
--- Check the invoicing.invoice.ths.paymentInfoMap.
--- Check the statement (query failed).
--- Cause: java.sql.SQLException: ORA-00904: : invalid identifier
at com.ibatis.sqlmap.engine.mapping.statement.MappedStatement.executeQueryWithCallback(MappedStatement.java:201)
at com.ibatis.sqlmap.engine.mapping.statement.MappedStatement.executeQueryForList(MappedStatement.java:139)
at com.ibatis.sqlmap.engine.impl.SqlMapExecutorDelegate.queryForList(SqlMapExecutorDelegate.java:567)
at com.ibatis.sqlmap.engine.impl.SqlMapExecutorDelegate.queryForList(SqlMapExecutorDelegate.java:541)
at com.ibatis.sqlmap.engine.impl.SqlMapSessionImpl.queryForList(SqlMapSessionImpl.java:118)
at org.springframework.orm.ibatis.SqlMapClientTemplate$3.doInSqlMapClient(SqlMapClientTemplate.java:298)
at org.springframework.orm.ibatis.SqlMapClientTemplate.execute(SqlMapClientTemplate.java:209)
at org.springframework.orm.ibatis.SqlMapClientTemplate.executeWithListResult(SqlMapClientTemplate.java:249)
at org.springframework.orm.ibatis.SqlMapClientTemplate.queryForList(SqlMapClientTemplate.java:296)
The definitions are completely consistent with each other (on both the Java side and the XML side).
Any ideas?
I found it. The problem is that Oracle does not provide stack-trace-style detail for errors. I was calling a function in a SELECT, but my DB user had not been granted permission to execute it, so Oracle tried to treat the function name as a column name. It could not find a column with that name, and the resulting error hid the real problem.
statement.executeUpdate("INSERT INTO countrylookup (Country, DialCode) VALUES('Iran', '957')")
Running this statement gives me no error output in the console, but when I check the database no update/insert is made. What could be the reason for this?
The access to the database itself is successful, and fetching values with a statement such as SELECT * FROM countrylookup succeeds.
I tried the PreparedStatement approach as well, with the exact same result. The file is not open when I execute the command.
UPDATE: Stack trace (the first line is in Swedish and means "The INSERT INTO expression contains the following unknown field name: 'Pa_RaM000'. Please check that the name is spelled correctly and try again."):
Exception in thread "main" java.sql.SQLException: [Microsoft][Drivrutin f?r ODBC Microsoft Access] INSERT INTO-uttrycket inneh?ller f?ljande ok?nda f?ltnamn: 'Pa_RaM000'. Kontrollera att namnet ?r r?ttstavat och f?rs?k igen.
at sun.jdbc.odbc.JdbcOdbc.createSQLException(Unknown Source)
at sun.jdbc.odbc.JdbcOdbc.standardError(Unknown Source)
at sun.jdbc.odbc.JdbcOdbc.SQLExecute(Unknown Source)
at sun.jdbc.odbc.JdbcOdbcPreparedStatement.execute(Unknown Source)
at sun.jdbc.odbc.JdbcOdbcPreparedStatement.executeUpdate(Unknown Source)
at MDBAccessor.insertValueIntoField(MDBAccessor.java:43)
at TestRunner.main(TestRunner.java:28)
Is dialcode numeric? If so, remove the quotes from the value.
VALUES('Iran', 957)
For the INSERT INTO statement to actually be reflected in the database, you have to call connection.commit() followed by connection.close(). Another similar thread describes this: Java General Error On Insert...???