HIVE CTE with UNION ALL is throwing table not found exception - hive

My intention is to load the table only once with a CTE and reuse it, to avoid multiple table scans and stages in Hive.
However, the Hive query below, run through Hue on Cloudera 5.11, throws a "table cases not found" exception.
Is there an error in how the query uses the CTE?
WITH cases
AS (
SELECT nbr
,id
,date_l2
,date_l3
,date_l4
,date_l5
,level_2
,level_3
,level_4
,level_5
FROM volume
)
SELECT nbr
,id
,CONCAT (
nbr
,'-L2'
) AS enbr
,'L2' AS level_nm
,date_l2 AS dt
FROM cases
WHERE level_2 = true
UNION ALL
SELECT nbr
,id
,CONCAT (
nbr
,'-L3'
) AS enbr
,'L3' AS level_nm
,date_l3 AS dt
FROM cases
WHERE level_3 = true
UNION ALL
SELECT nbr
,id
,CONCAT (
nbr
,'-L4'
) AS enbr
,'L4' AS level_nm
,date_l4 AS dt
FROM cases
WHERE level_4 = true
UNION ALL
SELECT nbr
,id
,CONCAT (
nbr
,'-L5'
) AS enbr
,'L5' AS level_nm
,date_l5 AS dt
FROM cases
WHERE level_5 = true
Output:
nbr id enbr level_nm dt
00193092 84575 00193092-L2 L2 2016-10-19
00193092 84575 00193092-L3 L3 2016-10-20
00193092 84575 00193092-L4 L4 2016-10-20

select t.nbr
,t.id
,concat(t.nbr,'-L',s.nm) as enbr
,concat('L',s.nm) as level_nm
,s.dt
from volume t
lateral view stack
(
4
,'2',t.level_2,t.date_l2
,'3',t.level_3,t.date_l3
,'4',t.level_4,t.date_l4
,'5',t.level_5,t.date_l5
) s as nm,lvl,dt
where s.lvl = true
;
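The query above reads volume only once: stack(4, ...) turns each input row into four (nm, lvl, dt) rows, and the WHERE clause keeps the ones whose flag is true. As a rough illustration of that logic outside Hive, here is a minimal Python sketch. The sample row is reconstructed from the output shown in the question; the level 5 flag and date are assumed values, and the function name unpivot_levels is made up for this example.

```python
# One sample row shaped like the `volume` table. The level 2-4 values come from
# the question's output; the level 5 flag and date are assumed for illustration.
volume = [
    {
        "nbr": "00193092", "id": 84575,
        "level_2": True, "date_l2": "2016-10-19",
        "level_3": True, "date_l3": "2016-10-20",
        "level_4": True, "date_l4": "2016-10-20",
        "level_5": False, "date_l5": None,
    },
]


def unpivot_levels(rows):
    """Mimic LATERAL VIEW stack(4, ...): read each row once and expand it
    into one candidate row per level, keeping those whose flag is true."""
    for t in rows:
        for nm in ("2", "3", "4", "5"):
            lvl, dt = t["level_" + nm], t["date_l" + nm]
            if lvl is True:  # WHERE s.lvl = true
                yield (t["nbr"], t["id"], t["nbr"] + "-L" + nm, "L" + nm, dt)


result = list(unpivot_levels(volume))
for row in result:
    print(*row)
```

With the assumed sample row, this yields the three L2, L3 and L4 rows from the question's output and skips L5.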

Integrate a tree in oracle

I have a query like below:
select * from (
select * from (
select distinct * from TBL_IDPS_TREE
START WITH LEDGER_CODE in (
10912520000
,10825060000
,10912380000
,11311110201
)
CONNECT BY PRIOR parent_CODE = LEDGER_CODE
)a left join (
select /*+ PARALLEL(AUTO) */ balance as "y300" , ledger_code as "id",'' as "x300" , round(abs(balance)/30835,2) as "z300",name as "name" from tbl_ledger_archive where ledger_code in (
10912520000
,10825060000
,10912380000
,11311110201) and eff_date ='29-MAY-19'
) b
on a.LEDGER_CODE = b."id")
START WITH PARENT_CODE is null
connect by PRIOR LEDGER_CODE = Parent_CODE
;
In the result, x300, y300 and z300 hold the values for each tree node.
I want to change the query so that it aggregates the x300, y300 and z300 values up the tree.
That is, each node should carry the totals of the leaves below it, from the leaves up to the root.
Use CONNECT_BY_ROOT in the account tree subquery to tag every ancestor row with the leaf it was reached from, then join to the ledger on that value and sum per node.
Demo:
with TBL_IDPS_TREE as (
select 10912520000 LEDGER_CODE, 1091252 parent_CODE from dual union all
select 10825060000, 1091252 from dual union all
select 1091252, 1091 from dual union all
select 1091, null from dual
), tbl_ledger_archive as (
select 500000 as "y300" , 10912520000 as "id", '' as "x300" , round(500000/30835, 2) as "z300", 'abc' as "name" from dual union all
select 600000 as "y300" , 10825060000 as "id", '' as "x300" , round(600000/30835, 2) as "z300", 'abc' as "name" from dual
)
select a.LEDGER_CODE, a.parent_CODE, l."x300", sum(l."y300") "y300", sum(l."z300") "z300"
from (
select distinct t.*, connect_by_root LEDGER_CODE as accRoot
from TBL_IDPS_TREE t
START WITH LEDGER_CODE in (
10912520000
,10825060000
,10912380000
,11311110201
)
CONNECT BY PRIOR parent_CODE = LEDGER_CODE
) a
left join tbl_ledger_archive l
on l."id" = a.accRoot
group by a.LEDGER_CODE, a.parent_CODE, l."x300" ;
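To make the roll-up easier to follow, here is a small Python sketch of what the demo query computes: starting from each leaf, walk up through the parents and add the leaf's balance to every node on the way. The data is copied from the demo above; the names parent_of, balance and roll_up are invented for this illustration, and the sketch starts from every leaf that has a balance rather than from an explicit START WITH list.

```python
from collections import defaultdict

# child -> parent edges, copied from the demo's TBL_IDPS_TREE
parent_of = {
    10912520000: 1091252,
    10825060000: 1091252,
    1091252: 1091,
    1091: None,
}

# leaf balances ("y300"), copied from the demo's tbl_ledger_archive
balance = {10912520000: 500000, 10825060000: 600000}


def roll_up(parent_of, balance):
    """Add each leaf's balance to the leaf itself and to all of its ancestors.

    This mirrors CONNECT BY PRIOR parent_CODE = LEDGER_CODE (walk upwards)
    combined with CONNECT_BY_ROOT (remember which leaf the walk started from).
    """
    totals = defaultdict(int)
    for leaf, amount in balance.items():
        node = leaf
        while node is not None:
            totals[node] += amount
            node = parent_of.get(node)
    return dict(totals)


totals = roll_up(parent_of, balance)
for code, total in totals.items():
    print(code, total)
```

Both leaves keep their own balance, while 1091252 and 1091 each receive the sum of the two.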

PL/SQL: Invalid Number error

I am creating a procedure for which I collect data by repeatedly running the following query.
SELECT ATTRIBUTE_VALUE,
COUNT(src1) CNT1,
COUNT(src2) CNT2
FROM (
SELECT a.ATTRIBUTE_VALUE,
1 src1,
TO_NUMBER(NULL) src2
FROM (
SELECT DECODE(
L,
1, IP_ADDRESS,
DECODE(
L,
2, IP_SUBNET_MASK,
DECODE(
L,
3, IP_DEFAULT_GATEWAY
)
)
) ATTRIBUTE_VALUE
FROM ( SELECT LEVEL L FROM DUAL X CONNECT BY LEVEL <= 3 ),
REUT_LOAD_IP_ADDRESSES
WHERE LIP_IPT_NAME = 'CE'
AND IP_LNT_ID IN (
SELECT LNT_ID
FROM REUT_LOAD_NTN
WHERE LNT_ID IN (
SELECT RLPN.LPN_LNT_ID
FROM REUT_LOAD_PI_NTN RLPN
WHERE LPN_LPI_ID IN (
SELECT RLPI.LPI_ID
FROM REUT_LOAD_PAC_INS RLPI
WHERE RLPI.LPI_DATE_ADDED IN (
SELECT MAX(RLPI2.LPI_DATE_ADDED)
FROM REUT_LOAD_PAC_INS RLPI2
WHERE RLPI2.PI_JOB_ID = P_ORDER_ID
)
)
)
AND IP_CEASE_DATE IS NULL
AND LNT_SERVICE_INSTANCE = 'PRIMARY'
)
It runs fine in SQL Developer, but when I execute it as a procedure I get ORA-01722 (invalid number) at this point in the code:
AND IP_LNT_ID IN (
SELECT LNT_ID
Can anyone help?
The error is pretty clear. You're comparing a number to another type of value.
Example:
SELECT 'x'
FROM DUAL
WHERE 1 IN (SELECT 'a'
FROM DUAL)
This means that each pair of compared columns must have the same type: IP_LNT_ID and LNT_ID, LNT_ID and LPN_LNT_ID, and LPN_LPI_ID and LPI_ID should all be NUMBER, and LPI_DATE_ADDED (in both RLPI and RLPI2) should be DATE or TIMESTAMP.
If this is not possible you could compare everything as char:
SELECT ATTRIBUTE_VALUE, COUNT(src1) CNT1, COUNT(src2) CNT2
FROM (SELECT a.ATTRIBUTE_VALUE, 1 src1, TO_NUMBER(NULL) src2
FROM (SELECT
DECODE(L,1,IP_ADDRESS,DECODE(L,2,IP_SUBNET_MASK,DECODE(L,3,IP_DEFAULT_GATEWAY) ) ) ATTRIBUTE_VALUE
FROM
(
SELECT LEVEL L FROM DUAL X CONNECT BY LEVEL <= 3
),
REUT_LOAD_IP_ADDRESSES
WHERE LIP_IPT_NAME = 'CE'
AND to_char(IP_LNT_ID) IN (
SELECT LNT_ID
FROM REUT_LOAD_NTN
WHERE to_char(LNT_ID) IN (
SELECT RLPN.LPN_LNT_ID
FROM REUT_LOAD_PI_NTN RLPN
WHERE to_char(LPN_LPI_ID) IN (
SELECT RLPI.LPI_ID
FROM REUT_LOAD_PAC_INS RLPI
WHERE to_char(RLPI.LPI_DATE_ADDED) IN (
SELECT MAX(RLPI2.LPI_DATE_ADDED)
FROM REUT_LOAD_PAC_INS RLPI2
WHERE RLPI2.PI_JOB_ID = P_ORDER_ID
)
)
)
AND IP_CEASE_DATE IS NULL
AND LNT_SERVICE_INSTANCE = 'PRIMARY'
)
But this should be avoided at all costs. Unfortunately, sometimes we have to cheat a little to work with our existing infrastructure ;-)
You need to make sure that REUT_LOAD_IP_ADDRESSES.IP_LNT_ID and REUT_LOAD_NTN.LNT_ID have the same data type, or cast/convert one of them so that they match.
There are multiple other issues:
You have aggregated and non-aggregated values:
SELECT ATTRIBUTE_VALUE,
COUNT(src1) CNT1,
COUNT(src2) CNT2
FROM ( ... )
Without a GROUP BY clause.
src2 is TO_NUMBER(NULL), which is just NULL, and COUNT(NULL) is always 0, so your query is equivalent to:
SELECT ATTRIBUTE_VALUE,
COUNT(src1) CNT1,
0 CNT2
...
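The COUNT behaviour is easy to check in isolation. The sketch below uses Python's built-in sqlite3 module as a stand-in for Oracle (an assumption made purely so the example is runnable; COUNT(expr) skipping NULLs is standard SQL). The table t and its rows are made up, and CAST(NULL AS INTEGER) plays the role of TO_NUMBER(NULL).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (attribute_value TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?)",
    [("10.0.0.1",), ("10.0.0.1",), ("255.255.255.0",)],
)

# COUNT(expr) only counts rows where expr is not NULL, so cnt2 is 0 for
# every group. The GROUP BY is what makes the non-aggregated column legal.
rows = conn.execute(
    """
    SELECT attribute_value,
           COUNT(src1) AS cnt1,
           COUNT(src2) AS cnt2
    FROM (SELECT attribute_value,
                 1 AS src1,
                 CAST(NULL AS INTEGER) AS src2
          FROM t)
    GROUP BY attribute_value
    ORDER BY attribute_value
    """
).fetchall()

for row in rows:
    print(row)
```

Every group reports its row count in cnt1 and 0 in cnt2, whatever the data looks like.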
This code:
SELECT DECODE(
L,
1, IP_ADDRESS,
DECODE(
L,
2, IP_SUBNET_MASK,
DECODE(
L,
3, IP_DEFAULT_GATEWAY
)
)
) ATTRIBUTE_VALUE
FROM ( SELECT LEVEL L FROM DUAL X CONNECT BY LEVEL <= 3 ),
REUT_LOAD_IP_ADDRESSES
Can be rewritten as:
SELECT DECODE(
L,
1, IP_ADDRESS,
2, IP_SUBNET_MASK,
3, IP_DEFAULT_GATEWAY
) ATTRIBUTE_VALUE
FROM ( SELECT LEVEL L FROM DUAL X CONNECT BY LEVEL <= 3 ),
REUT_LOAD_IP_ADDRESSES
Or, without the join as:
SELECT attribute_value
FROM REUT_LOAD_IP_ADDRESSES
UNPIVOT ( attribute_value FOR L IN (
IP_ADDRESS AS 1,
IP_SUBNET_MASK AS 2,
IP_DEFAULT_GATEWAY AS 3
) )
The innermost query:
SELECT RLPI.LPI_ID
FROM REUT_LOAD_PAC_INS RLPI
WHERE RLPI.LPI_DATE_ADDED IN (
SELECT MAX(RLPI2.LPI_DATE_ADDED)
FROM REUT_LOAD_PAC_INS RLPI2
WHERE RLPI2.PI_JOB_ID = P_ORDER_ID
)
The inner query is restricted to RLPI2.PI_JOB_ID = P_ORDER_ID, but it is not correlated with the outer query, so the outer query can return rows that do not match P_ORDER_ID and merely happen to have the same date as a matching row.
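A small runnable sketch of that pitfall, again using sqlite3 in place of Oracle (an assumption for the sake of a self-contained example). The table pac_ins and its three rows are hypothetical; jobs 100 and 200 share the same latest date on purpose. Repeating the job filter in the outer query is one possible way to close the gap, not necessarily the fix the original procedure needs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE pac_ins (lpi_id INTEGER, pi_job_id INTEGER, lpi_date_added TEXT)"
)
conn.executemany(
    "INSERT INTO pac_ins VALUES (?, ?, ?)",
    [
        (1, 100, "2019-05-01"),
        (2, 100, "2019-05-29"),  # latest row for job 100
        (3, 200, "2019-05-29"),  # other job, same date
    ],
)

# Shape of the original query: only the subquery filters on the job id.
uncorrelated = [
    r[0]
    for r in conn.execute(
        """
        SELECT lpi_id
        FROM pac_ins
        WHERE lpi_date_added IN (SELECT MAX(lpi_date_added)
                                 FROM pac_ins
                                 WHERE pi_job_id = :job)
        ORDER BY lpi_id
        """,
        {"job": 100},
    )
]

# Same query with the job filter repeated in the outer query.
filtered = [
    r[0]
    for r in conn.execute(
        """
        SELECT lpi_id
        FROM pac_ins
        WHERE pi_job_id = :job
          AND lpi_date_added IN (SELECT MAX(lpi_date_added)
                                 FROM pac_ins
                                 WHERE pi_job_id = :job)
        ORDER BY lpi_id
        """,
        {"job": 100},
    )
]

print(uncorrelated)
print(filtered)
```

The first query also returns the row belonging to job 200; the second returns only the latest row of job 100.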

TSQL Pivot Throwing Syntax Errors

I would like to PIVOT the following query result to display a column for each Project Status Code.
WITH r AS (
SELECT ROW_NUMBER() OVER (ORDER BY ph.InsertedDateTime) rownum,
CAST(ph.InsertedDateTime AS DATE) InsertedDate, ph.Gate_1_TargetDate, ph.Gate_2_TargetDate, ph.Gate_3_TargetDate
FROM PROJECT_HIST ph
JOIN (
SELECT ProjectID, MAX(InsertedDateTime) InsertedDateTime
FROM PROJECT_HIST
GROUP BY ProjectID, CAST(InsertedDateTime AS DATE)
) ph_distinct_date ON ph_distinct_date.InsertedDateTime = ph.InsertedDateTime
AND ph_distinct_date.ProjectID = ph.ProjectID
WHERE ph.projectid = 100957
AND NOT (
ph.Gate_1_TargetDate IS NULL
AND ph.Gate_2_TargetDate IS NULL
AND ph.Gate_3_TargetDate IS NULL
)
),
fubar AS (
SELECT rownum, InsertedDate, 0 gateName, NULL targetDate FROM r
UNION ALL
SELECT rownum, InsertedDate, 1, Gate_1_TargetDate FROM r
UNION ALL
SELECT rownum, InsertedDate, 2, Gate_2_TargetDate FROM r
UNION ALL
SELECT rownum, InsertedDate, 3, Gate_3_TargetDate FROM r
)
SELECT f1.InsertedDate 'Change Date', f1.gateName 'ProjectStageCode', f1.targetDate
FROM fubar f1
LEFT JOIN fubar f2 ON f2.rownum = f1.rownum - 1
AND f2.gateName = f1.gateName
PIVOT(min(f1.InsertedDate) FOR f1.gateName IN ([0],[1],[2],[3])) AS p
WHERE f1.rownum = 1
OR f1.targetDate <> f2.targetDate
ORDER BY f1.InsertedDate
;
Without the pivot attempt, this query currently returns this result for this particular project:
What I would like to do is pivot the query to create columns for each Project Stage Code to match the following result:
Essentially, I need to have a row for each unique Change Date and have the targetDate column value fill in the respective newly pivoted numerical ProjectStageCode column.
It looks like you just need to wrap the query in a subquery before you PIVOT the data. You also need to aggregate targetDate instead of InsertedDate:
WITH r AS
(
SELECT ROW_NUMBER() OVER (ORDER BY ph.InsertedDateTime) rownum,
CAST(ph.InsertedDateTime AS DATE) InsertedDate, ph.Gate_1_TargetDate, ph.Gate_2_TargetDate, ph.Gate_3_TargetDate
FROM PROJECT_HIST ph
JOIN
(
SELECT ProjectID, MAX(InsertedDateTime) InsertedDateTime
FROM PROJECT_HIST
GROUP BY ProjectID, CAST(InsertedDateTime AS DATE)
) ph_distinct_date
ON ph_distinct_date.InsertedDateTime = ph.InsertedDateTime
AND ph_distinct_date.ProjectID = ph.ProjectID
WHERE ph.projectid = 100957
AND NOT (ph.Gate_1_TargetDate IS NULL
AND ph.Gate_2_TargetDate IS NULL
AND ph.Gate_3_TargetDate IS NULL)
),
fubar AS
(
SELECT rownum, InsertedDate, 0 gateName, NULL targetDate FROM r
UNION ALL
SELECT rownum, InsertedDate, 1, Gate_1_TargetDate FROM r
UNION ALL
SELECT rownum, InsertedDate, 2, Gate_2_TargetDate FROM r
UNION ALL
SELECT rownum, InsertedDate, 3, Gate_3_TargetDate FROM r
)
SELECT ChangeDate, [0],[1],[2],[3]
FROM
(
SELECT f1.InsertedDate ChangeDate, f1.gateName, f1.targetDate
FROM fubar f1
LEFT JOIN fubar f2
ON f2.rownum = f1.rownum - 1
AND f2.gateName = f1.gateName
WHERE f1.rownum = 1
OR f1.targetDate <> f2.targetDate
) d
PIVOT
(
min(targetDate)
FOR gateName IN ([0],[1],[2],[3])
) AS p;
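The subquery matters because PIVOT groups by every column of its source that is neither aggregated nor pivoted, so the source should expose only ChangeDate, gateName and targetDate. The sketch below shows the same reshaping written as conditional aggregation, which is a different technique from T-SQL's PIVOT but produces the same shape. It runs on sqlite3 so that it is self-contained; the table d and its rows are hypothetical stand-ins for the derived table d above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE d (ChangeDate TEXT, gateName INTEGER, targetDate TEXT)")
conn.executemany(
    "INSERT INTO d VALUES (?, ?, ?)",
    [
        ("2013-01-05", 1, "2013-03-01"),
        ("2013-01-05", 2, "2013-06-01"),
        ("2013-02-10", 2, "2013-07-15"),
    ],
)

# One output row per ChangeDate, one column per gate. MIN() skips NULLs,
# so each cell is the target date for that gate, or NULL if there is none.
rows = conn.execute(
    """
    SELECT ChangeDate,
           MIN(CASE WHEN gateName = 0 THEN targetDate END) AS "0",
           MIN(CASE WHEN gateName = 1 THEN targetDate END) AS "1",
           MIN(CASE WHEN gateName = 2 THEN targetDate END) AS "2",
           MIN(CASE WHEN gateName = 3 THEN targetDate END) AS "3"
    FROM d
    GROUP BY ChangeDate
    ORDER BY ChangeDate
    """
).fetchall()

for row in rows:
    print(row)
```

Each change date collapses to a single row, with empty cells for gates whose target date did not change on that date.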

"end-of-file on communication channel" error with UNION of DISTINCT columns

I'm cleaning up SQL code from a previous engineer (not a programmer).
One query UNIONs the results of two almost identical queries, each with an identical sub-query, and the original code has many WHERE clauses (in both queries) to filter the data.
I am trying to use "with" tables to filter the data first, and then do the sub-queries and the union.
I keep getting a generic "end-of-file on communication channel" error during the "Prepare" step, but when I remove the DISTINCT clause from the sub-queries, it works - but it doesn't give me the results I need.
Here is the code I've "reduced" to show the error:
with
FilteredData as
(
select
ST.part
, ST.order_No
, ST.induct_Date
, ST.complete_Date
from
Some_Table ST
where
(
ST.part is not null
and ST.order_No is not null
)
-- MUCH more filtering goes on here, to limit the number of records to look at
)
,
TempTable_01A as
(
select
FD.part
, count( DISTINCT FD.part ) Count_1 -- The DISTINCT needs to be removed for it to compile
, 0 Count_2
, 0 AvgLengthOpen
from
FilteredData FD
where
FD.induct_Date is not null
and ( FD.induct_Date >= to_date( '01-01-2013', 'MM-DD-YYYY' ) )
and ( FD.induct_Date < ( to_date( '01-31-2013', 'MM-DD-YYYY' ) + 1 ) )
group by
FD.part
)
,
TempTable_01B as
(
select
FD.part
, 0 Count_1
, count( DISTINCT FD.part ) Count_2 -- The DISTINCT needs to be removed for it to compile
, avg( FD.complete_Date - FD.induct_Date ) AvgLengthOpen
from
FilteredData FD
where
FD.complete_Date is not null
and ( FD.complete_Date >= to_date( '01-01-2013', 'MM-DD-YYYY' ) )
and ( FD.complete_Date < ( to_date( '01-31-2013', 'MM-DD-YYYY' ) + 1 ) )
group by
FD.part
)
,
UnionTable as
(
select
TT_A.part
, TT_A.Count_1
, TT_A.Count_2
, TT_A.AvgLengthOpen
from
TempTable_01A TT_A
union
select
TT_B.part
, TT_B.Count_1
, TT_B.Count_2
, TT_B.AvgLengthOpen
from
TempTable_01B TT_B
)
select
UT.part
, max( UT.Count_1 ) MaxCount_1
, max( UT.Count_2 ) MaxCount_2
, max( UT.AvgLengthOpen ) MaxAvgLengthOpen
from
UnionTable UT
group by
UT.part
order by
1
NOTE: I am using Oracle SQL, version 10.0.2.1697. I get this same error whether I'm using PLSQL Developer, or my Perl program.

Subquery within SubQuery in SQL - DB2

I am having an issue making the subquery shown below dynamic, so that it filters on one of the values returned by the outer query. Can someone please tell me what I am doing wrong? A correlated reference like this worked in my first subquery.
( SELECT
MAX( MAX_DATE - MIN_DATE ) AS NUM_CONS_DAYS
FROM
(
SELECT
MIN(TMP.D_DAT_INDEX_DATE) AS MIN_DATE,
MAX(TMP.D_DAT_INDEX_DATE) AS MAX_DATE,
SUM(INDEX_COUNT) AS SUM_INDEX
FROM
(
SELECT
D_DAT_INDEX_DATE,
INDEX_COUNT,
D_DAT_INDEX_DATE - (DENSE_RANK() OVER(ORDER BY D_DAT_INDEX_DATE)) DAYS AS G
FROM
DWH.MQT_SUMMARY_WATER_READINGS
WHERE
N_COD_METER_CNTX_KEY = 79094
) AS TMP
GROUP BY
TMP.G
ORDER BY
1
) ) AS MAX_NUM_CONS_DAYS
Above is the subquery in which I am trying to replace the hard-coded key (79094) with CTXKEY, that is, CTXT.N_COD_METER_CNTX_KEY from the outer query. Below is the full code. Please note that the correlated reference works in the subquery just before MAX_NUM_CONS_DAYS (DAYSRECEIVED3); however, that subquery is only one level deep.
SELECT
N_COD_WM_DWH_KEY,
V_COD_WM_SN_2,
N_COD_SP_ID,
CTXKEY,
V_COD_MIU_SN,
N_COD_POD,
MIU_CAT,
V_COD_SITR_ASSOCIATED,
WO_INST_DATE,
WO_MIU_CAT,
DAYSRECEIVED3,
MAX_NUM_CONS_DAYS,
( CASE WHEN ( DAYSRECEIVED3 = 3 ) THEN 'Y' ELSE 'N' END ) AS GREEN,
( CASE WHEN ( DAYSRECEIVED3 < 3 AND DAYSRECEIVED3 > 0 ) THEN 'Y' ELSE 'N' END ) AS BLUE,
( CASE WHEN ( DAYSRECEIVED3 = 0 AND MAX_NUM_CONS_DAYS >= 5 ) THEN 'Y' ELSE 'N' END ) AS ORANGE,
( CASE WHEN ( DAYSRECEIVED3 = 0 AND MAX_NUM_CONS_DAYS BETWEEN 1 and 4 ) THEN 'Y' ELSE 'N' END ) AS RED
FROM
(
SELECT
WMETER.N_COD_WM_DWH_KEY,
WMETER.V_COD_WM_SN_2,
WMETER.N_COD_SP_ID,
CTXT.N_COD_METER_CNTX_KEY AS CTXKEY,
CTXT.V_COD_MIU_SN,
CTXT.N_COD_POD,
MIU.N_COD_MIU_CATEGORY AS MIU_CAT,
CTXT.V_COD_SITR_ASSOCIATED,
T1.D_DAT_PLAN_INST AS WO_INST_DATE,
T1.N_COD_MIU_CATEGORY AS WO_MIU_CAT,
( SELECT COUNT( DISTINCT D_DAT_INDEX_DATE ) FROM DWH.MQT_SUMMARY_WATER_READINGS WHERE ( N_COD_METER_CNTX_KEY = CTXT.N_COD_METER_CNTX_KEY ) AND D_DAT_INDEX_DATE BETWEEN ( '2013-07-10' ) AND ( '2013-07-12' ) ) AS DAYSRECEIVED3,
( SELECT
MAX( MAX_DATE - MIN_DATE ) AS NUM_CONS_DAYS
FROM
(
SELECT
MIN(TMP.D_DAT_INDEX_DATE) AS MIN_DATE,
MAX(TMP.D_DAT_INDEX_DATE) AS MAX_DATE,
SUM(INDEX_COUNT) AS SUM_INDEX
FROM
(
SELECT
D_DAT_INDEX_DATE,
INDEX_COUNT,
D_DAT_INDEX_DATE - (DENSE_RANK() OVER(ORDER BY D_DAT_INDEX_DATE)) DAYS AS G
FROM
DWH.MQT_SUMMARY_WATER_READINGS
WHERE
N_COD_METER_CNTX_KEY = 79094
) AS TMP
GROUP BY
TMP.G
ORDER BY
1
) ) AS MAX_NUM_CONS_DAYS
FROM DWH.DWH_WATER_METER AS WMETER
LEFT JOIN DWH.DWH_WMETER_CONTEXT AS CTXT
ON WMETER.N_COD_WM_DWH_KEY = CTXT.N_COD_WM_DWH_KEY
LEFT JOIN DWH.DWH_MIU AS MIU
ON CTXT.V_COD_MIU_SN = MIU.V_COD_MIU_SN
LEFT JOIN
( SELECT V_COD_CORR_WAT_METER_SN, D_DAT_PLAN_INST, N_COD_MIU_CATEGORY
FROM DWH.DWH_ORDER_MANAGEMENT_FACT
JOIN DWH.DWH_MIU
ON DWH.DWH_ORDER_MANAGEMENT_FACT.V_COD_MIU_SN = DWH.DWH_MIU.V_COD_MIU_SN
) AS T1
ON WMETER.V_COD_WM_SN_2 = T1.V_COD_CORR_WAT_METER_SN
WHERE
( V_COD_SITR_ASSOCIATED = 'X' )
AND ( ( MIU.N_COD_MIU_CATEGORY <> 4 ) OR ( ( MIU.N_COD_MIU_CATEGORY IS NULL ) AND ( ( T1.N_COD_MIU_CATEGORY <> 4 ) OR ( T1.N_COD_MIU_CATEGORY IS NULL ) ) ) )
)
Error I am getting is:
Error Code: -204, SQL State: 42704
I would say that a good option here would be to use a CTE, or Common Table Expression. You can do something similar to the following:
WITH CTE_X AS(
SELECT VAL_A
,VAL_B
FROM TABLE_A)
,CTE_Y AS(
SELECT VAL_C
,VAL_B
FROM TABLE_B)
SELECT X.VAL_A
,X.VAL_B
FROM CTE_X X
JOIN CTE_Y Y
ON X.VAL_A = Y.VAL_C;
While this isn't specific to your example, it shows that a CTE acts as a sort of temporary, named result set that you can reference in a subsequent query. This should allow you to define your two inner subselects as CTEs and then use them in the "SELECT MAX( MAX_DATE - MIN_DATE ) AS NUM_CONS_DAYS" query.
You cannot reference columns from the outer select inside a subselect that is nested more than one level deep. If I understand correctly what you're doing, you'll probably need to join DWH.MQT_SUMMARY_WATER_READINGS and DWH.DWH_WMETER_CONTEXT in the outer select.
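One way to avoid the deep correlation is to compute the longest run of consecutive days for every meter key at once and then join on the key. The sketch below illustrates that idea with sqlite3 rather than DB2, so the date arithmetic uses julianday() instead of DB2's DAYS duration; the table readings, its columns and its rows are hypothetical. It relies on window functions, which SQLite supports from version 3.25.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE readings (meter_key INTEGER, index_date TEXT)")
conn.executemany(
    "INSERT INTO readings VALUES (?, ?)",
    [
        (1, "2013-07-01"), (1, "2013-07-02"), (1, "2013-07-03"), (1, "2013-07-05"),
        (2, "2013-07-10"), (2, "2013-07-11"),
    ],
)

# Gaps-and-islands per key: date minus its dense rank is constant within a
# run of consecutive days, so grouping by (meter_key, g) isolates each run.
rows = conn.execute(
    """
    WITH runs AS (
        SELECT meter_key,
               index_date,
               julianday(index_date)
                 - DENSE_RANK() OVER (PARTITION BY meter_key
                                      ORDER BY index_date) AS g
        FROM readings
    ),
    islands AS (
        SELECT meter_key,
               MIN(index_date) AS min_date,
               MAX(index_date) AS max_date
        FROM runs
        GROUP BY meter_key, g
    )
    SELECT meter_key,
           MAX(CAST(julianday(max_date) - julianday(min_date) AS INTEGER))
               AS num_cons_days
    FROM islands
    GROUP BY meter_key
    ORDER BY meter_key
    """
).fetchall()

for row in rows:
    print(row)
```

The result has one row per meter key, so it can be joined to the context table on the key instead of being evaluated as a nested scalar subquery. As in the original query, the value is MAX_DATE minus MIN_DATE, which is one less than the number of days in the run.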