Using CASE as a join condition in Hive - sql

I'm trying to run a query in Hive where I run a join based on a case statement. For some reason, I'm having problems on lines 7 and 8. I have not been able to resolve the error which is
line 7: Expected: AND, AS, BETWEEN, DIV, ILIKE, IN, IREGEXP, IS, LIKE, NOT, OR, REGEXP, RLIKE CAUSED BY: Exception: Syntax error
line 8: Encountered: AS Expected: AND, BETWEEN, DIV, ILIKE, IN, IREGEXP, IS, LIKE, NOT, OR, REGEXP, RLIKE CAUSED BY: Exception: Syntax error
select * from dra_record_set.mark_set inv
INNER JOIN innerdb.name_set roll_table
on inv.record_id = roll_table.ply_record_id AND
roll_table.date =
(CASE
WHEN inv.purchase_day>0 AND inv.purhcase_date BETWEEN roll_table.discount_start_dt AND roll_table.discount_end_dt) THEN inv.purchase_date
ELSE WHEN (CONCAT(inv.purchase_yr,inv.purchase_mo,(CAST("15")AS INT))) AS temp_var BETWEEN roll_table.discount_start_dt AND roll_table.discount_end_dt) THEN inv.purchase_date
END AS temp_pur_dt)
WHERE inv.inroll_discount_eligible_flag =1
limit 10

case is an expression. It returns a value. Yours seems to be used more like a macro substitution for SQL code.
Remove the case and just use boolean logic:
select *
from dra_record_set.mark_set inv join
innerdb.name_set roll_table
on inv.record_id = roll_table.ply_record_id and
( (inv.purchase_day > 0 and
inv.purchase_date between roll_table.discount_start_dt and roll_table.discount_end_dt
) or
(inv.purchase_day <= 0 and
CONCAT(inv.purchase_yr, inv.purchase_mo, 15) between roll_table.discount_start_dt AND roll_table.discount_end_dt
)
)
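As a rough illustration of the same idea, here is a minimal sketch that runs the two-branch join condition against an in-memory SQLite database from Python. The table names `inv` and `roll`, the sample rows, the ISO date strings, and the use of SQLite's `printf` to build the mid-month date are all assumptions made for the demo; Hive would need its own way of building a comparable date value.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE inv (record_id INTEGER, purchase_day INTEGER, purchase_yr INTEGER,
                  purchase_mo INTEGER, purchase_date TEXT);
CREATE TABLE roll (ply_record_id INTEGER, discount_start_dt TEXT, discount_end_dt TEXT);
INSERT INTO inv VALUES
  (1, 20, 2020, 3, '2020-03-20'),  -- day known, inside the window
  (2, 0,  2020, 3, NULL),          -- day unknown, fall back to the 15th
  (3, 5,  2020, 3, '2020-03-05');  -- day known, outside the window
INSERT INTO roll VALUES
  (1, '2020-03-10', '2020-03-31'),
  (2, '2020-03-10', '2020-03-31'),
  (3, '2020-03-10', '2020-03-31');
""")

# Each branch of the OR carries its own guard, so no CASE is needed.
QUERY = """
SELECT inv.record_id
FROM inv
JOIN roll
  ON inv.record_id = roll.ply_record_id
 AND (
       (inv.purchase_day > 0
        AND inv.purchase_date BETWEEN roll.discount_start_dt AND roll.discount_end_dt)
    OR (inv.purchase_day <= 0
        AND printf('%04d-%02d-15', inv.purchase_yr, inv.purchase_mo)
            BETWEEN roll.discount_start_dt AND roll.discount_end_dt)
     )
ORDER BY inv.record_id
"""

matched_ids = [row[0] for row in conn.execute(QUERY)]
print(matched_ids)  # [1, 2]
```

Record 3 drops out because its known purchase date falls before the discount window, and the fallback branch does not apply to it.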

Related

Postgres SQL state: 22P02 - invalid input syntax for integer

I'm using a sql query to export a database from my company's program.
Everything seems to be fine until I change the date in the "where" clause to an earlier one.
Please find below the code:
SELECT p."Index", p."PSN" || CAST(p."PNR"as int) AS ID,
p."PSN" AS Serie, cast(p."PNR"as int) AS Numar,
pr."PINDate" AS r_gdate,
CASE WHEN pr."AsigEID"='10' THEN pr."PrimSUM" ELSE
pr."PrimSUM"*valuta1."EXCValue" END AS r_prima_lei,
CASE WHEN pr."AsigEID"='2'
THEN pr."PrimSUM"
ELSE CASE WHEN pr."AsigEID"='10' THEN pr."PrimSUM"/valuta2."EXCValue"
ELSE pr."PrimSUM"*valuta1."EXCValue"/valuta2."EXCValue"
END
END AS r_prima_eur,
CASE WHEN pr."AsigEID"='10' THEN pr."AsigSUM" ELSE
pr."AsigSUM"*valuta1."EXCValue" END as r_sa_lei,
CASE WHEN pr."AsigEID"='2'
THEN pr."AsigSUM"
ELSE CASE WHEN pr."AsigEID"='10' THEN pr."AsigSUM"/valuta2."EXCValue"
ELSE pr."AsigSUM"*valuta1."EXCValue"/valuta2."EXCValue"
END
END AS r_sa_eur,
pr."AsigStart", pr."AsigEnd", risc."Code", plink."Index"
FROM "PolsRisc" AS pr
LEFT JOIN "Pols" as p ON p."Index" = pr."PID"
LEFT JOIN "Riscs" as risc ON pr."RID" = risc."Index"
LEFT JOIN "PRLNK" plink ON plink."PTID" = p."PTID" AND plink."RID" = risc."Index"
LEFT JOIN "EXCValues" valuta1 ON valuta1."AtDate" = pr."AsigStart" AND valuta1."EID" = pr."AsigEID"
LEFT JOIN "EXCValues" valuta2 ON valuta2."AtDate" = pr."AsigStart" AND valuta2."EID"='2'
WHERE pr."PINDate" > '2020-08-01' AND pr."IsRezil" = 'false';
When I'm using '2020-08-01' the query works well. When I try to change it to an earlier date, e.g. '2010-01-01', I get an error:
ERROR: invalid input syntax for integer: ""
SQL state: 22P02
I was looking for a solution on the previous posts but I didn't manage to solve this issue.
It looks like one of the columns you cast to integer contains an empty string (""). The later date is just filtering out the rows that would crash it.
You may need to use nullif to turn the empty strings into nulls (and coalesce if you want them to count as 0) before casting to int
select
cast(coalesce(nullif(table.column, ''), '0') as int) as result
from table
I would advise reading the chapter http://www.postgresql.org/docs/current/interactive/sql-syntax-lexical.html#SQL-SYNTAX-CONSTANTS. It's a brief and informative read.
The cause of the error message is that '' is an empty string, which has no representation in a numeric type like integer.

What is the syntax problem here using this subquery inside where clause

SELECT p.pnum, p.pname
FROM professor p, class c
WHERE p.pnum = c.pnum AND c.cnum = CS245 AND (SELECT COUNT(*) FROM (SELECT MAX(m.grade), MAX(m.grade) - MIN(m.grade) AS diff
FROM mark m WHERE m.cnum = c.cnum AND m.term = c.term AND m.section = c.section AND diff <= 20)) = 3
Incorrect syntax near ')'. Expecting AS, FOR_PATH, ID, or QUOTED_ID.
Consider the following example:
SELECT COUNT(1)
FROM SYSCAT.TABLES T
WHERE
(
-- SELECT COUNT(1)
-- FROM
-- (
SELECT COUNT(1)
FROM SYSCAT.COLUMNS C
WHERE C.TABSCHEMA=T.TABSCHEMA AND C.TABNAME=T.TABNAME
-- )
) > 50;
The query above works as is. But if you uncomment the commented-out lines, you get the following error message: "T.TABNAME" is an undefined name. At least that is the case in Db2 for Linux, Unix and Windows.
You can't reference columns of the outer query from a sub-select that is nested too deeply.
So, your query is incorrect.
It's hard to correct it, until you provide the task description with data sample and the result expected.
I can see a potential syntax error: it seems that CS245 refers to a value c.cnum may take and not a column name. If that is the case, it should be enclosed in single quotes.

Syntax Error: ON RIGHT when trying to match a substring in Impala

Does anyone know why I am receiving this error? I am using SQL in IMPALA and it won't run. There's a yellow underline under mem_register_hsty_view and transparency_services_summary_2018.
Here is my code:
use sndbx_dx;
SELECT
r.member_identifier,
n.fst_nme
FROM mem_register_hsty_view n
JOIN transparency_services_summary_2018 r
ON RIGHT(TRIM(r.member_identifier),4) = LEFT(n.fst_nme,4)
ORDER BY
r.id_key,
r.group_number,
n.fst_nme;
Here is the error:
AnalysisException: Syntax error in line 1:undefined: ...ervices_summary_2018 r ON RIGHT(TRIM(r.member_identifi... ^ Encountered: RIGHT Expected: CASE, CAST, DEFAULT, EXISTS, FALSE, IF, INTERVAL, NOT, NULL, REPLACE, TRUNCATE, TRUE, IDENTIFIER CAUSED BY: Exception: Syntax error
According to the current Impala documentation, the functions for taking some number of characters from the left or right of a string appear to be STRLEFT and STRRIGHT, respectively. Applying this to your current query gives:
SELECT
r.member_identifier,
n.fst_nme
FROM mem_register_hsty_view n
INNER JOIN transparency_services_summary_2018 r
ON STRRIGHT(TRIM(r.member_identifier), 4) = STRLEFT(n.fst_nme, 4)
ORDER BY
r.id_key,
r.group_number,
n.fst_nme;

Why is my case statement not working with a calculation while using signed over punch?

I'm trying to write a query to go against "Signed Over Punch" using the following query:
SELECT
CASE when substring(MyField,16,1)='C' then cast(substring(MyField,9,7)+'0' AS decimal(20,2)*-1 FROM MyTable
Here's some sample data:
0000069A0000006C00000000#0000000#
From the above data, the position starts at 16 ("C") with a length of 1
And the other starts at position 9 (0) with a length of 7
But I keep getting this error:
Msg 102, Level 15, State 1, Line 139
Incorrect syntax near '*'.
Desired Output:
00000063 (The C = 3)
What am I doing wrong?
Please refer to this page for reference for signed over punch:
https://en.wikipedia.org/wiki/Signed_overpunch
You need to learn about operator precedence. This code:
[..snip..] then cast(substring(MyField,9,7)+'0' AS decimal(20,2)*-1
is executing as if it had been written
[..snip..] then cast(...) AS (decimal(20,2) * -1)
                             ^------------------^
You're not multiplying the result of the cast, you're trying to multiply the decimal(20,2), which is NOT a multiplicable value.
Try
then (cast(substring(MyField,9,7)+'0' AS decimal(20,2))) * -1
     ^-------------------------------------------------^
instead.
Not sure if this will fix it or not, but you are missing a closing parenthesis. You are also missing the END keyword. I recommend indenting things like this to make it easier to spot these problems.
SELECT CASE
WHEN substring(MyField,16,1)='C'
THEN cast(substring(MyField,9,7)+'0' AS decimal(20,2))*-1
END
FROM MyTable
Try this:
SELECT CASE when substring(MyField,16,1)='C' then cast(substring(MyField,9,7)+'0' AS decimal(20,2)) *-1 end FROM MyTable
The problems were a missing ")" before the "*-1" and a missing "end" to terminate the case expression.
With these two fixes, the result for your example '0000069A0000006C00000000#0000000#' is -60.00.
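Note that the corrected queries above fix the syntax but return -60.00, while the question asks for 63 (C = 3). A minimal sketch of decoding the field directly is shown below in Python, assuming the mapping listed on the Wikipedia page linked in the question: `{` and `A` to `I` stand for positive 0 to 9, `}` and `J` to `R` stand for negative 0 to 9. The function name `decode_overpunch` is hypothetical, and the field is taken as positions 9 to 16 of the sample row.

```python
# Mapping assumed from the Wikipedia "Signed overpunch" table:
# the index of the character within each string is the digit it encodes.
POSITIVE = "{ABCDEFGHI"
NEGATIVE = "}JKLMNOPQR"


def decode_overpunch(field: str) -> int:
    """Decode a signed-overpunch field whose last character carries the sign."""
    body, last = field[:-1], field[-1]
    if last in POSITIVE:
        sign, digit = 1, POSITIVE.index(last)
    elif last in NEGATIVE:
        sign, digit = -1, NEGATIVE.index(last)
    elif last.isdigit():
        # A plain trailing digit is treated as unsigned, i.e. positive.
        sign, digit = 1, int(last)
    else:
        raise ValueError(f"unexpected overpunch character: {last!r}")
    return sign * int(body + str(digit))


sample = "0000069A0000006C00000000#0000000#"
print(decode_overpunch(sample[8:16]))  # 63, from "0000006C"
print(decode_overpunch(sample[0:8]))   # 691, from "0000069A"
```

In SQL the same logic would need one CASE branch (or a lookup table) per overpunch character, rather than a single branch that multiplies by -1.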

HIVE - hive subquery is not working with case when statement with IN clause

I am trying to migrate data from MySQL to Hive. I am not able to write a CASE WHEN statement that uses a subquery with an IN clause. This is my query. Can you please help in this regard? Am I not following the proper syntax?
CREATE TABLE HIVE_TPCE_TEMP.TMP_CDMA_CD AS
SELECT A.DRI,C.BOUND_ID,A.CT_ID,A.CD_ID,A.CID,
A.TID,A.TASK_SEQ_ID,A.DIV_ID,C.BLOCK_GROUP_ID,C.ZIP_CODE,C.ROAD_CATEGORY_ID,A.RXPOWER,"${hiveconf:C_CDMA_DEVICE_ONLINE_RXPOWER_METRIC_ID}" METRIC_ID,
CASE WHEN
((A.DRI,A.DIV_ID,A.RFID) in (SELECT DRI,DIV_ID,HOME_RFID FROM HIVE_TPCE_TEMP.TMP_HOME_NETWORKS)) THEN
CASE WHEN MODE IN ('A','N') THEN "${hiveconf:HAD}" ELSE "${hiveconf:HD}" END
WHEN (COALESCE(A.RFID,0) = 0) AND ((A.DRI,A.DIV_ID,D.FR,D.SUBBAND) IN (SELECT DRI,DIV_ID,HOME_FR,
HOME_SUBBAND FROM HIVE_TPCE_TEMP.TMP_HOME_NETWORKS))
THEN CASE WHEN MODE IN ('A','N') THEN "${hiveconf:HAD}" ELSE "${hiveconf:HD}" END
ELSE CASE WHEN MODE IN ('A','N') THEN "${hiveconf:PAI}" ELSE "${hiveconf:PDI}" END END HPDA_ID
FROM HIVE_TPCE.VW_CDMA_CD A INNER JOIN HIVE_TPCE.STG_CURRENT_FILES B
ON A.DRI = B.DRI AND A.SOURCE_FILE_ID = B.SOURCE_FILE_ID
INNER JOIN
HIVE_TPCE.WRK_LOCATION C
ON A.DRI = C.DRI AND A.LOCATION_ID = C.LOCATION_ID
INNER JOIN
HIVE_TPCE.LU_RADIO D
ON A.RADIO_ID = D.RADIO_ID WHERE A.CID > 0 AND D.MODE IN ('A','N') AND A.RXPOWER IS NOT NULL AND A.CALL_RESULT_ID BETWEEN 1 AND 16;
My error signature is
FAILED: ParseException line 10:42 mismatched input ',' expecting ) near 'DRI' in expression specification
According to the Hive Language Manual: "Hive supports subqueries only in the FROM clause".
Your CASE WHEN is part of the SELECT clause, but it includes a SELECT subquery. It seems that this is not supported, so your syntax is not correct (in Hive).
Perhaps you could stage the data in MySQL using the query you have and then load it into Hive using a simple SELECT without CASE WHEN?
See the official documentation.
It says:
Assumptions
We plan to limit the scope with the following assumptions and limitations.
Subqueries could only be top-level expressions in SELECT. That is, subqueries in complex expressions, aggregates, UDFs, etc. will not be supported for now
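One workaround that is often used in this situation (it is not taken from the documentation quoted above, so treat it as an assumption about what fits this schema) is to replace the multi-column IN subquery with a LEFT JOIN against a de-duplicated subquery, and then test for a match inside the CASE. The sketch below runs the pattern against an in-memory SQLite database from Python only to show the shape of the query; the tables `cd` and `home_networks` and the labels 'HOME' and 'ROAMING' are hypothetical stand-ins, and the Hive syntax may need adjusting.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE cd (dri INTEGER, div_id INTEGER, rfid INTEGER);
CREATE TABLE home_networks (dri INTEGER, div_id INTEGER, home_rfid INTEGER);
INSERT INTO cd VALUES (1, 10, 100), (1, 10, 999), (2, 20, 200);
-- The duplicate row shows why the joined subquery needs DISTINCT.
INSERT INTO home_networks VALUES (1, 10, 100), (1, 10, 100), (2, 20, 200);
""")

# The subquery lives in the FROM clause; the CASE only checks whether a match exists.
QUERY = """
SELECT a.dri, a.div_id, a.rfid,
       CASE WHEN h.dri IS NOT NULL THEN 'HOME' ELSE 'ROAMING' END AS network_class
FROM cd a
LEFT JOIN (SELECT DISTINCT dri, div_id, home_rfid FROM home_networks) h
  ON a.dri = h.dri AND a.div_id = h.div_id AND a.rfid = h.home_rfid
ORDER BY a.dri, a.rfid
"""

rows = conn.execute(QUERY).fetchall()
for row in rows:
    print(row)
```

The DISTINCT keeps the join from multiplying rows of `cd` when the lookup table contains repeated key combinations, which is the behaviour the original IN test had.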