HIVE - hive subquery is not working with case when statement with IN clause - sql

I am trying to migrate data from mysql to hive.I am not able to write a subquery case when statement with IN clause.This is my query. Can you Please help in this regard. AM i not following the proper syntax .
CREATE TABLE HIVE_TPCE_TEMP.TMP_CDMA_CD AS
SELECT A.DRI,C.BOUND_ID,A.CT_ID,A.CD_ID,A.CID,
A.TID,A.TASK_SEQ_ID,A.DIV_ID,C.BLOCK_GROUP_ID,C.ZIP_CODE,C.ROAD_CATEGORY_ID,A.RXPOWER,"${hiveconf:C_CDMA_DEVICE_ONLINE_RXPOWER_METRIC_ID}" METRIC_ID,
CASE WHEN
((A.DRI,A.DIV_ID,A.RFID) in (SELECT DRI,DIV_ID,HOME_RFID FROM HIVE_TPCE_TEMP.TMP_HOME_NETWORKS)) THEN
CASE WHEN MODE IN ('A','N') THEN "${hiveconf:HAD}" ELSE "${hiveconf:HD}" END
WHEN (COALESCE(A.RFID,0) = 0) AND ((A.DRI,A.DIV_ID,D.FR,D.SUBBAND) IN (SELECT DRI,DIV_ID,HOME_FR,
HOME_SUBBAND FROM HIVE_TPCE_TEMP.TMP_HOME_NETWORKS))
THEN CASE WHEN MODE IN ('A','N') THEN "${hiveconf:HAD}" ELSE "${hiveconf:HD}" END
ELSE CASE WHEN MODE IN ('A','N') THEN "${hiveconf:PAI}" ELSE "${hiveconf:PDI}" END END HPDA_ID
FROM HIVE_TPCE.VW_CDMA_CD A INNER JOIN HIVE_TPCE.STG_CURRENT_FILES B
ON A.DRI = B.DRI AND A.SOURCE_FILE_ID = B.SOURCE_FILE_ID
INNER JOIN
HIVE_TPCE.WRK_LOCATION C
ON A.DRI = C.DRI AND A.LOCATION_ID = C.LOCATION_ID
INNER JOIN
HIVE_TPCE.LU_RADIO D
ON A.RADIO_ID = D.RADIO_ID WHERE A.CID > 0 AND D.MODE IN ('A','N') AND A.RXPOWER IS NOT NULL AND A.CALL_RESULT_ID BETWEEN 1 AND 16;
My error signature is
FAILED: ParseException line 10:42 mismatched input ',' expecting ) near 'DRI' in expression specification

According to the Hive Language Manual: "Hive supports subqueries only in the FROM clause".
Your CASE WHEN is part of the SELECT clause, but it includes includes a SELECT subquery. Seems like that is not supported, so your syntax is not correct (in Hive).
Perhaps you could stage the data in MySQL using the query you have and then load it into Hive using a simple SELECT without CASE WHEN?

See official document.
It says
Assumptions
We plan to limit the scope with the following assumptions and limitations.
Subqueries could only be top-level expressions in SELECT. That is, subqueries in complex expressions, aggregates, UDFs, etc. will not be supported for now

Related

Oracle database error 907: ORA-00907: missing right parenthesis

I transferred this code directly from SQL developer. Works perfectly in there.
SELECT
a.INCIDENT_NUMBER,
a.DETAILED_DESCRIPTION,
a.INCIDENT_ROOT_CAUSE
FROM
N_EVALUATION as a
INNER JOIN N_DISPOSITION as b
ON (a.INCIDENT_NUMBER = b.INCIDENT_NUMBER)
WHERE
b.DISPOSITION_LINE_NUM in (NULL, 1) AND
a.ACTIVE_FLAG = 'Y' AND
b.ACTIVE_FLAG = 'Y' AND
a.DETAILED_DESCRIPTION IS NOT NULL
However when I transfer the same exact code into Tableau to create a custom SQL query. It gives me an error;
An error occurred while communicating with the data source. Bad
Connection: Tableau could not connect to the data source. Oracle
database error 907: ORA-00907: missing right parenthesis
This has me completely stumped, not really sure what to do here. Any help or advice is much appreciated. I am more concerned regarding the missing right parenthesis rather than the bad connection.
Remove the AS from the FROM clause. Oracle does not recognize that.
In addition, this condition:
b.DISPOSITION_LINE_NUM in (NULL, 1)
Does not do what you expect. It never evaluates to true if b.DISPOSITION_LINE_NUM is NULL.
You should replace it with:
(b.DISPOSITION_LINE_NUM IS NULL OR b.DISPOSITION_LINE_NUM = 1)
Otherwise, your query looks like it has balanced parentheses, but you should write it as:
SELECT e.INCIDENT_NUMBER, e.DETAILED_DESCRIPTION, e.INCIDENT_ROOT_CAUSE
FROM N_EVALUATION e JOIN
N_DISPOSITION d
ON e.INCIDENT_NUMBER = d.INCIDENT_NUMBER
WHERE (d.DISPOSITION_LINE_NUM IS NULL OR d.DISPOSITION_LINE_NUM = 1) AND
e.ACTIVE_FLAG = 'Y' AND
d.ACTIVE_FLAG = 'Y' AND
e.DETAILED_DESCRIPTION IS NOT NULL;
Notes:
User meaningful table aliases rather than arbitrary letters (this uses abbreviations).
Do not use as in the FROM clause.
Be careful with NULL comparisons.
Finally, your original query is equivalent to:
SELECT e.INCIDENT_NUMBER, e.DETAILED_DESCRIPTION, e.INCIDENT_ROOT_CAUSE
FROM N_EVALUATION e JOIN
N_DISPOSITION d
ON e.INCIDENT_NUMBER = d.INCIDENT_NUMBER
WHERE d.DISPOSITION_LINE_NUM = 1 AND
e.ACTIVE_FLAG = 'Y' AND
d.ACTIVE_FLAG = 'Y' AND
e.DETAILED_DESCRIPTION IS NOT NULL;
This has no parentheses. So it cannot return that particular error.

Issue With SQL Pivot Function

I have a SQL query where I am trying to replace null results with zero. My code is producing an error
[1]: ORA-00923: FROM keyword not found where expected
I am using an Oracle Database.
Select service_sub_type_descr,
nvl('Single-occupancy',0) as 'Single-occupancy',
nvl('Multi-occupancy',0) as 'Multi-occupancy'
From
(select s.service_sub_type_descr as service_sub_type_descr, ch.claim_id,nvl(ci.item_paid_amt,0) as item_paid_amt
from table_1 ch, table_" ci, table_3 s, table_4 ppd
where ch.claim_id = ci.claim_id and ci.service_type_id = s.service_type_id
and ci.service_sub_type_id = s.service_sub_type_id and ch.policy_no = ppd.policy_no)
Pivot (
count(distinct claim_id), sum(item_paid_amt) as paid_amount For service_sub_type_descr IN ('Single-occupancy', 'Multi-occupancy')
)
This expression:
nvl('Single-occupancy',0) as 'Single-occupancy',
is using an Oracle bespoke function to say: If the value of the string Single-occupancy' is not null then return the number 0.
That logic doesn't really make sense. The string value is never null. And, the return value is sometimes a string and sometimes a number. This should generate a type-conversion error, because the first value cannot be converted to a number.
I think you intend:
coalesce("Single-occupancy", 0) as "Single-occupancy",
The double quotes are used to quote identifiers, so this refers to the column called Single-occupancy.
All that said, fix your data model. Don't have identifiers that need to be quoted. You might not have control in the source data but you definitely have control within your query:
coalesce("Single-occupancy", 0) as Single_occupancy,
EDIT:
Just write the query using conditional aggregation and proper JOINs:
select s.service_sub_type_descr, ch.claim_id,
sum(case when service_sub_type_descr = 'Single-occupancy' then item_paid_amt else 0 end) as single_occupancy,
sum(case when service_sub_type_descr = 'Multi-occupancy' then item_paid_amt else 0 end) as multi_occupancy
from table_1 ch join
table_" ci
on ch.claim_id = ci.claim_id join
table_3 s
on ci.service_type_id = s.service_type_id join
table_4 ppd
on ch.policy_no = ppd.policy_no
group by s.service_sub_type_descr, ch.claim_id;
Much simpler in my opinion.
for column aliases, you have to use double quotes !
don't use
as 'Single-occupancy'
but :
as "Single-occupancy",

Use Case Statement in Join

Hi every one i want to use case statement in join using this query and got error
Select CONVERT(VARCHAR(10), SII.SIDATE,103)DATE,SII.SALEID,SII.ItemName,SI.TenancyID
FROM F_SALESINVOICEITEM SII
INNER JOIN F_SALESINVOICE SI ON SI.SALEID=SII.SALEID
INNER JOIN #TempTableSearch ts ON CASE
WHEN ts.ACCOUNTTYPE = '1' THEN ts.ACCOUNTID=SI.TENANCYID
WHEN ts.ACCOUNTTYPE='2' THEN ts.ACCOUNTID=SI.EMPLOYEEID
WHEN ts.ACCOUNTTYPE='3' THEN ts.ACCOUNTID=SI.SUPPLIERID
WHEN ts.ACCOUNTTYPE='4' THEN ts.ACCOUNTID=SI.SALESCUSTOMERID
Error
Incorrect syntax near '='.
Please help me to solve this error.
IT should be,
ON
ts.ACCOUNTID = CASE
WHEN ts.ACCOUNTTYPE = '1' THEN SI.TENANCYID
WHEN ts.ACCOUNTTYPE = '2' THEN SI.EMPLOYEEID
WHEN ts.ACCOUNTTYPE = '3' THEN SI.SUPPLIERID
WHEN ts.ACCOUNTTYPE = '4' THEN SI.SALESCUSTOMERID
END
Instead of using CASE, I'd much rather do this:
Select CONVERT(VARCHAR(10), SII.SIDATE,103)DATE,SII.SALEID,SII.ItemName,SI.TenancyID
FROM F_SALESINVOICEITEM SII
INNER JOIN F_SALESINVOICE SI ON SI.SALEID=SII.SALEID
INNER JOIN #TempTableSearch ts ON
(ts.ACCOUNTTYPE='1' AND ts.ACCOUNTID=SI.TENANCYID)
OR (ts.ACCOUNTTYPE='2' AND ts.ACCOUNTID=SI.EMPLOYEEID)
OR (ts.ACCOUNTTYPE='3' AND ts.ACCOUNTID=SI.SUPPLIERID)
OR (ts.ACCOUNTTYPE='4' AND ts.ACCOUNTID=SI.SALESCUSTOMERID)
To explain why the query didn't work for you: the syntax of the CASE requires an END at the end of the clause. It would work, as the other solutions proposed suggest, but I find this version to be more convenient to understand - although this part is highly subjective.
you can do this, so you have no chance to misspell something (note that ACCOUNTTYPE and ACCOUNTID used only when needed, you don't have to copy-paste it)
select
convert(varchar(10), SII.SIDATE,103) as DATE,
SII.SALEID, SII.ItemName, SI.TenancyID
from F_SALESINVOICEITEM as SII
inner join F_SALESINVOICE as SI on SI.SALEID = SII.SALEID
outer apply (
'1', SI.TENANCYID
'2', SI.EMPLOYEEID
'3', SI.SUPPLIERID
'4', SI.SALESCUSTOMERID
) as C(ACCOUNTTYPE, ACCOUNTID)
inner join #TempTableSearch as ts on
ts.ACCOUNTTYPE = C.ACCOUNTTYPE and ts.ACCOUNTID = C.ACCOUNTID
You have syntax error. You are missing END there.
You must understand that CASE ... END block is NOT equivalent to IF { } from C-like languages. Much rather this is equivalent to elaborate version of ... ? ... : ... operator from C-like languages. What it means that the WHOLE CASE block must essentially evaluate to single value and that this value has to be the same type no matter which case of the block is executed. This means that:
CASE
WHEN ts.ACCOUNTTYPE = '1' THEN ts.ACCOUNTID=SI.TENANCYID ...
END
Is fundamentally incorrect unless you work on a version of database that will allow you bool value as a value (SQL Server won't allow it for example but I think some of MySQL version used to allow it - not sure about this). You probably should write something like:
CASE
WHEN ts.ACCOUNTTYPE = '1' AND ts.ACCOUNTID=SI.TENANCYID THEN 1
WHEN ts.ACCOUNTTYPE='2' AND ts.ACCOUNTID=SI.EMPLOYEEID THEN 1
WHEN ts.ACCOUNTTYPE='3' AND ts.ACCOUNTID=SI.SUPPLIERID THEN 1
WHEN ts.ACCOUNTTYPE='4' AND ts.ACCOUNTID=SI.SALESCUSTOMERID THEN 1
ELSE 0
END = 1
Notice how the whole CASE block evaluates to 1 or 0 and then it is compared to 1. Of course instead of 4 WHEN's you could use one WHEN with combination of AND's, OR's and ( ) brackets. Of course in this particular case answer by #ppeterka 66 is correct as CASE is not suited for what you really wanted to do - I'm just trying to clarify what CASE really is.

HiveQL : use SELECT in clause WHERE

Is there a way to do this in HiveQL :
SELECT ......
from
default.thm_renta_produits_jour rpj
WHERE
rpj.co_societe = '${hiveconf:in_co_societe}'
AND rpj.dt_jour >= (SELECT MIN(dt_jour) FROM default.calendrier WHERE co_an_semaine = '${hiveconf:in_co_an_sem}')
Because when i do this, i get this error :
FAILED: ParseException line 51:26 cannot recognize input near 'SELECT' 'MIN' '(' in expression specification
Thanks,
Hive does not support sub queries in where clause it supports sub queries in from clause only.
Hive does not support sub queries in the WHERE clause. Perhaps you can work around this by moving your sub query to a JOIN clause like so:
SELECT
rpj.*
FROM
default.thm_renta_produits_jour rpj
JOIN
( SELECT MIN(dt_jour) AS min_dt_jour
FROM default.calendrier
WHERE co_an_semaine = '${hiveconf:in_co_an_sem}'
) m
WHERE
rpj.co_societe = '${hiveconf:in_co_societe}'
AND rpj.dt_jour >= m.min_dt_jour;
Hope that helps.
I know that this is an old post, but the previous answers are now outdated. Newer versions of Hive (0.13+) support subqueries of where clauses, so your query should run.
https://cwiki.apache.org/confluence/display/Hive/LanguageManual+SubQueries#LanguageManualSubQueries-SubqueriesintheWHEREClause

SQL Server Compact won't allow subselect but inner join with groupby not allowed on text datatype

I have the following sql syntax that I used in my database query (SQL Server)
SELECT Nieuwsbrief.ID
, Nieuwsbrief.Titel
, Nieuwsbrief.Brief
, Nieuwsbrief.NieuwsbriefTypeCode
, (SELECT COUNT(*) AS Expr1
FROM NieuwsbriefCommentaar
WHERE (Nieuwsbrief.ID = NieuwsbriefCommentaar.NieuwsbriefID
AND NieuwsbriefCommentaar.Goedgekeurd = 1)) AS AantalCommentaren
FROM Nieuwsbrief
I'm changing now to sql-server-ce (compact edition) which won't allow me to have subqueries like this. Proposed solution : inner join. But as I only need a count of the subtable 'NieuwsbriefCommentaar', I have to use a 'group by' clause on my base table attributes to avoid doubles in the result set.
However the 'Nieuwbrief.Brief' attribute is of datatype 'text'. Group by clauses are not allowed on 'text' datatype in sql-server-ce. 'Text' datatype is deprecated, but sql-server-ce doesn't support 'nvarchar(max)' yet...
Any idea how to solve this? Thx for your help.
I think that the solution could be easier. I don't know exactly how is your metadata but I think that this code could fit your requirements by simply using LEFT JOIN.
SELECT Nieuwsbrief.ID
, Nieuwsbrief.Titel
, Nieuwsbrief.Brief
, Nieuwsbrief.NieuwsbriefTypeCode
, COUNT(NieuwsbriefCommentaar.NieuwsbriefID) AS AantalCommentaren
FROM Nieuwsbrief
LEFT JOIN NieuwsbriefCommentaar ON (Nieuwsbrief.ID = NieuwsbriefCommentaar.NieuwsbriefID)
WHERE NieuwsbriefCommentaar.Goedgekeurd = 1
Edited: 2ndOption
SELECT N.ID, N.Titel, N.Brief, N.NieuwsbriefTypeCode, G.AantalCommentaren FROM Nieuwsbrief as N LEFT JOIN (SELECT NieuwsbriefID, COUNT(*) AS AantalCommentaren FROM NieuwsbriefCommentaar GROUP BY NieuwsbriefID) AS G ON (N.ID = G.NieuwsbriefID)
Please, let me know if this code works in order to find out another workaround..
regards,