ParseException when having a field name which contains '#' in SparkSQL - apache-spark-sql

I'm doing a simple operation: inserting some fields of one table into another table. Both are Hive tables in Databricks, so I can do it with a simple query like:
INSERT INTO <BBDD_NAME>.<TABLE1_NAME> (<FIELD_1>, <FIELD_2>)
SELECT <FIELD_1>, <FIELD_2># FROM <BBDD_NAME>.<TABLE2_NAME>
The problem is that one of the fields has a '#' in its name, and consequently I get a ParseException:
Error in SQL statement: ParseException:
mismatched input '#' expecting {<EOF>, ';'}
TABLE2 is F0911 from JDE (JDE table doc) and is loaded directly into Databricks via Spark, which infers the schema from the source. So the table was created without any problem, including the field whose name contains '#'.
Is there any way of avoiding this ParseException Error?
Thanks in advance.

You can use backticks to quote column names that contain characters which are not allowed in a plain identifier, such as '#'.
This should work:
INSERT INTO <BBDD_NAME>.<TABLE1_NAME> (<FIELD_1>, <FIELD_2>)
SELECT <FIELD_1>, `<FIELD_2>#` FROM <BBDD_NAME>.<TABLE2_NAME>
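As a concrete sketch, assume hypothetical tables demo_db.src, demo_db.dst and demo_db.dst2, where the source has a column literally named amt#. These names are made up for illustration and are not taken from F0911. The backticks go around the whole column name, '#' included:

```sql
-- Hypothetical names: demo_db.src, demo_db.dst, demo_db.dst2, id, amt and amt#
-- are illustrative only.
-- Quote the whole column name, '#' included, wherever it appears.
INSERT INTO demo_db.dst (id, amt)
SELECT id, `amt#`
FROM demo_db.src;

-- If the target column name also contains '#', quote it in the column list too.
INSERT INTO demo_db.dst2 (id, `amt#`)
SELECT id, `amt#`
FROM demo_db.src;
```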

Related

Moving a table from one schema to another in Exasol

I am trying to move a table from one schema to a different schema, keeping the same table name. I have tried the following statements, but neither works:
rename <OLD_SCHEMA_NAME>.<TABLE_NAME> TO <NEW_SCHEMA_NAME>.<TABLE_NAME>;
The error that appears is:
SQL Error [42000]: invalid identifier chain for new name [line 1, column 100] (Session: 1722923178259251200)
and
ALTER TABLE <OLD_SCHEMA_NAME>.<TABLE_NAME> RENAME <NEW_SCHEMA_NAME>.<TABLE_NAME>;
The error that appears is:
SQL Error [42000]: syntax error, unexpected IDENTIFIER_PART_, expecting COLUMN_ or CONSTRAINT_ [line 1, column 62] (Session: 1722923178259251200)
Many Thanks!

According to the Exasol documentation, there is no way to move a table between schemas using the RENAME statement:
Schema objects cannot be shifted to another schema with the RENAME
statement. For example, 'RENAME TABLE s1.t1 TO s2.t2' is not allowed.
I would move the table this way:
create table <NEW_SCHEMA_NAME>.<TABLE_NAME>
like <OLD_SCHEMA_NAME>.<TABLE_NAME>
including defaults
including identity
including comments;
insert into <NEW_SCHEMA_NAME>.<TABLE_NAME>
select *
from <OLD_SCHEMA_NAME>.<TABLE_NAME>;
drop table <OLD_SCHEMA_NAME>.<TABLE_NAME>;
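Before running the final drop, it may be worth confirming that the copy is complete. A minimal sketch, assuming the hypothetical names s1.t1 (old table) and s2.t1 (new copy) in place of the placeholders above:

```sql
-- Hypothetical names: s1.t1 is the old table, s2.t1 is the new copy.
-- The two counts should match before the old table is dropped.
SELECT COUNT(*) AS old_rows FROM s1.t1;
SELECT COUNT(*) AS new_rows FROM s2.t1;

-- Only once the counts agree:
DROP TABLE s1.t1;
```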

Hive conditionally select column name

I have multiple tables with very similar schemas, except for one column, which can have different names.
I want to make some complicated calculations using Hive and would like to have one piece of code for all tables, with possible parametrisation. For some reason, I can't parametrise queries using a language like Python or Scala, so I decided to go with pure Hive SQL.
I want to conditionally select the appropriate column, but it seems that Hive evaluates all parts of a conditional expression/statement regardless of the condition.
What did I do wrong?
DROP TABLE IF EXISTS `so_sample`;
CREATE TABLE `so_sample` (
`app_version` string
);
SELECT
if (true, app_version, software_version) AS firmware
FROM so_sample
;
Output:
Error: Error while compiling statement: FAILED: SemanticException [Error 10004]: Line 2:25 Invalid table alias or column reference 'software_version': (possible column names are: app_version) (state=42000,code=10004)
Regards
Pawel

Try using a regex column specification to select the column whose name differs between tables; see the manual for more information. Don't forget to set:
set hive.support.quoted.identifiers=none;
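Hive resolves every column reference when it compiles the query, which is why the untaken branch of if() still fails. A regex column specification sidesteps this, because the pattern is expanded against the columns that actually exist. A sketch against the so_sample table from the question, assuming the varying column is always named either app_version or software_version:

```sql
-- With this setting, a backquoted name in the SELECT list is treated as a Java regex.
set hive.support.quoted.identifiers=none;

-- Matches app_version or software_version, whichever the table has.
SELECT `(app|software)_version`
FROM so_sample;
```

As far as I know, a regex column cannot be given an AS alias directly, so the selected column keeps its original name.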

psql column doesn't exist but it does

I am trying to select a single column in my data table using raw SQL in a postgresql database from the psql command line. I am getting an error message that says the column does not exist. Then it gives me a hint to use the exact column that I referenced in the select statement. Here is the query:
SELECT insider_app_ownershipdocument.transactionDate FROM insider_app_ownershipdocument;
Here is the error message:
ERROR: column insider_app_ownershipdocument.transactiondate does not exist
SELECT insider_app_ownershipdocument.transactionDate FROM in...
HINT: Perhaps you meant to reference the column "insider_app_ownershipdocument.transactionDate".
I have no idea why this is not working.

PostgreSQL automatically converts unquoted names to lower case, although it supports case-sensitive names when they are double-quoted. So
SELECT insider_app_ownershipdocument.transactionDate FROM insider_app_ownershipdocument;
will be equivalent to:
SELECT insider_app_ownershipdocument.transactiondate FROM insider_app_ownershipdocument;
You should protect the column name with double quotes to avoid this effect:
SELECT insider_app_ownershipdocument."transactionDate" FROM insider_app_ownershipdocument;
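To see the exact spelling PostgreSQL stored for each column, you can query information_schema.columns; any name that is not all lower case needs double quotes in queries. A short sketch using the table from the question (the alias d is only for illustration):

```sql
-- List column names exactly as stored; mixed-case names must be double-quoted in queries.
SELECT column_name
FROM information_schema.columns
WHERE table_name = 'insider_app_ownershipdocument';

-- A table alias keeps the quoted form readable.
SELECT d."transactionDate"
FROM insider_app_ownershipdocument AS d;
```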

SELECT database.table.column in Hive

Is it possible to use
SELECT DB.TABLE.COLUMN from DB.TABLE
in Hive?
I know it's possible to alias DB.TABLE as follows
SELECT T1.COLUMN FROM DB.TABLE AS T1
But, is there any way in Hive to select a column fully qualified by its database and table name, as shown in the first query above? I've done this before in MySQL but I don't know if there's a way to make Hive work this way.

No, that is not possible in Hive; you will get an exception:
SemanticException [Error 10004]: Line 1:7 Invalid table alias or column reference 'DB': (possible column names are: col)
Your second SELECT statement is valid.
To specify a database, either qualify the table names with database names ("db_name.table_name" starting in Hive 0.7) or issue the USE statement before the query statement (starting in Hive 0.6).
See language manual here: LanguageManual+Select
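The two supported forms can be sketched as follows, using the hypothetical names db, tbl and col in place of DB, TABLE and COLUMN:

```sql
-- Option 1: qualify the table with its database and refer to columns through an alias.
SELECT t1.col
FROM db.tbl t1;

-- Option 2: switch the current database first, then use the bare table name.
USE db;
SELECT t1.col
FROM tbl t1;
```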

Using values from two tables to run query in Hive

I would like to run a hive query to be able to divide a column from one table by the total sum of a column from another table.
Do I have to join the tables?
The code below generates errors:
Select 100*(Num_files/total_Num_files) from jvros_p2, jvros_p3;
FAILED: Parse Error: line 1:75 mismatched input ',' expecting EOF near 'jvros_p2'
Yes, jvros_p3 is a single row single column table
Num_files is a column in jvros_p2 and total_Num_files is a single value in jvros_p3.

Your older Hive version may be why that notation isn't working. Try this:
SELECT 100 * (Num_files / total_Num_files) FROM jvros_p2 JOIN jvros_p3;
I suspect that if you are eventually able to upgrade to at least 0.13, implicit join notation via comma-separated tables will be supported per HIVE-5558.
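Written with table aliases, so that each column's origin is explicit, the suggested query might look like the sketch below. The aliases f and t and the output name pct_of_total are illustrative. Because jvros_p3 holds a single row, the join without an ON clause simply attaches that total to every row of jvros_p2.

```sql
-- jvros_p3 has exactly one row, so this cross product yields one output row
-- per row of jvros_p2.
SELECT 100 * (f.Num_files / t.total_Num_files) AS pct_of_total
FROM jvros_p2 f
JOIN jvros_p3 t;
```

If hive.mapred.mode is set to strict, Hive may reject a join without a condition as a Cartesian product, so that setting is worth checking if the query is refused.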