BigQuery code: unexpected results with sqlfluff code formatting

I have a query that compares the data in one table with another (see below). It works and runs fine in BigQuery:
with source1 as (
select
b.id,
b.qty,
a.price
from <table> as a
,unnest <details> as b
where b.status != 'canceled'
),
source2 as (
select id_, qty_, price_ from <table2>
where city != 'delhi'
)
select *
from source1 s1
full outer join source2 s2
on id = id_
where format('%t', s1) != format('%t', s2)
However, the code above fails in sqlfluff, a SQL linter that enforces formatting rules and that I can't bypass or turn off. See the error from sqlfluff below:
ERROR FROM SQLFLUFF:
*'s1' found in select with more than one referenced table/view' and 's2' found in select with more than one referenced table/view
Does anybody know how I can fix it?

A window function may perform better.
Instead of joining the tables, union them, then use a window function to look for matching combinations. Since you tagged this question [Big-Query], this was tested on BigQuery:
with s1 as (
select id, qty, city from <table> where x != 'pending'
),
s2 as (
select id_, qty_, city_ from <table2>
),
concat_ as (
select 1 as dummy, * , format('%t',s1) as dummy_all
from s1
union all select 2 as dummy, * , format('%t',s2) as dummy_all
from s2
)
,combine as (
select *,
sum(if(dummy=1,1,0)) over win1 = sum(if(dummy=2,1,0)) over win1 as dummy_flag
from concat_
window win1 as (partition by id,dummy_all)
)
Select * from combine
where dummy_flag is false

I fixed the code by adding extra CTEs and adjusting the where clause so that it no longer trips the sqlfluff rules. In the second version (see below) I:
added additional CTEs
adjusted the where clause so that I get all the rows where the id exists in one source but not in the other, and vice versa
The code seems to work, but it would be great if someone could suggest how to reduce the number of CTEs while still passing sqlfluff:
with s1 as (
select
b.id,
b.qty,
a.price
from <table> as a
,unnest <details> as b
where b.status != 'canceled'
),
s2 as (
select id_, qty_, price_ from <table2>
where city != 'delhi'
)
,concat_s1 as (
select
*
, format('%t',s1) as l1
from s1
)
,concat_s2 as (
select
*
, format('%t',s2) as l2
from s2
)
, combined as (
select
source1.*
,source2.*
from concat_s1 as source1
full outer join concat_s2 as source2
on source1.id = source2.id_
where source1.l1 != source2.l2
or source1.id is null or source2.id_ is null
)
select * from combined
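One way to cut the CTE count, as a rough sketch: build the formatted row string inside each source CTE, so a table alias is never passed to format() in a select that references more than one table. The inline unnest([...]) data and the column names row_text / row_text_ are made up here so the query runs on its own in BigQuery; swap in the real tables and filters. I have not run this through sqlfluff, so whether it lints cleanly is an assumption.

```sql
-- Hypothetical sample data; replace the unnest([...]) sources with the real tables.
with s1 as (
    select
        t.id,
        t.qty,
        t.price,
        -- row_text is an illustrative name for the stringified row
        format('%t', (t.id, t.qty, t.price)) as row_text
    from unnest([
        struct(1 as id, 10 as qty, 2.5 as price),
        struct(2 as id, 5 as qty, 1.0 as price)
    ]) as t
),

s2 as (
    select
        t.id_,
        t.qty_,
        t.price_,
        format('%t', (t.id_, t.qty_, t.price_)) as row_text_
    from unnest([
        struct(1 as id_, 10 as qty_, 2.5 as price_),
        struct(3 as id_, 7 as qty_, 4.0 as price_)
    ]) as t
)

-- Keep rows that differ, plus rows that exist on only one side.
select
    s1.*,
    s2.*
from s1
full outer join s2
    on s1.id = s2.id_
where
    s1.row_text != s2.row_text_
    or s1.id is null
    or s2.id_ is null
```

With the sample rows above, id 1 is identical on both sides and drops out, while id 2 (only in s1) and id 3 (only in s2) remain. Every column reference is qualified, which is what the sqlfluff message appears to be asking for.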

Related

Using Analytical Clauses with DISTINCT

The goal is to query multiple tables using DISTINCT (otherwise I get millions of rows), while also using SAMPLE to draw a 10% sample from the results, which should all be unique. I am getting the following error:
ORA-01446: cannot select ROWID from, or sample, a view with DISTINCT, GROUP BY, etc.
Here is the code I have written:
WITH V AS (SELECT DISTINCT AL1."NO", AL3."IR", AL1."ACCT", AL3."CUST_DA", AL1."NA",
AL3."1_LINE", AL3."2_LINE", AL3."3_LINE", AL1."DA",
AL1."CD", AL1."TITLE_NA", AL1."ENT_NA", AL3."ACCT",
AL3."ACCTLNK_ENRL_CNT"
FROM "DOC"."DOCUMENT" AL1, "DOC"."VNDR" AL2, "DOC"."CUST_ACCT" AL3
WHERE (AL1."ACCT"=AL2."VNDR"
AND AL2."ACCT"=AL3."ACCT")
AND ((AL1."IMG_DA" >= Trunc(sysdate-1)
AND AL1."PROC"='A'
AND AL3."ACCT"<>'03')))
SELECT * FROM V SAMPLE(10.0)
You can't sample a join view like this.
Simpler test case (MCVE):
with v as
( select d1.dummy from dual d1
join dual d2 on d2.dummy = d1.dummy
)
select * from v sample(10);
Fails with:
ORA-01445: cannot select ROWID from, or sample, a join view without a key-preserved table
The simplest fix would be to move the sample clause to the driving table:
with v as
( select d1.dummy from dual sample(10) d1
join dual d2 on d2.dummy = d1.dummy
)
select * from v;
I would therefore rewrite your view as:
with v as
( select distinct
d.no
, a.ir
, d.acct
, a.cust_da
, d.na
, a."1_LINE", a."2_LINE", a."3_LINE"
, d.da, d.cd, d.title_na, d.ent_na
, a.acct
, a.acctlnk_enrl_cnt
from doc.document sample(10) d
join doc.vndr v
on v.vndr = d.acct
join doc.cust_acct a
on a.acct = v.acct
and d.img_da >= trunc(sysdate - 1)
and d.proc = 'A'
and a.acct <> '03'
)
select * from v;
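Note that sampling the driving table happens before the DISTINCT is applied. If the sample needs to be drawn from the distinct rows themselves, one alternative (a sketch only, shown on the same DUAL test case) is to filter the finished result with DBMS_RANDOM.VALUE. This gives roughly 10% of the rows, not an exact 10%, and it still has to build the full distinct set first, so it may be slow on a large join.

```sql
-- Sketch: approximate 10% sample taken AFTER the distinct, not before it.
with v as
( select distinct d1.dummy
  from dual d1
  join dual d2 on d2.dummy = d1.dummy
)
select *
from v
-- dbms_random.value returns a number >= 0 and < 1,
-- so each row is kept with a probability of about 0.1
where dbms_random.value < 0.1;
```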

Error incorporating a SELECT within an IFNULL in MariaDB

I'm creating a view in MariaDB and I'm having trouble making it work for a couple of fields. Currently this is working:
( SELECT DISTINCT IFNULL(grades.`grade`,'No Grade')
FROM `table` grades
WHERE userinfo.`id` = grades.`id`
AND grades.`Item Name` = 'SOMEINFO'
) 'SOMENAME',
But I need to add a select where the 'No Grade' is, in the following form:
( SELECT DISTINCT IFNULL( grades.`grade`,
SELECT IF( EXISTS
( SELECT *
FROM `another_table`
WHERE userid = 365
AND courseid = 2
), 'Enrolled', 'Not enrolled'
)
)
FROM `table` grades
WHERE userinfo.`id` = grades.`id`
AND grades.`Item Name` = 'SOMEINFO'
) 'SOMENAME',
I know that
SELECT IF( EXISTS( SELECT *
FROM `another_table`
WHERE userid = 365
AND courseid = 2
),
'Enrolled', 'Not enrolled'
)
is working too, but now the whole thing is giving me an error, so any suggestions would be greatly appreciated.
Thanks
This looks like a subquery:
(SELECT DISTINCT IFNULL(grades.`grade`,
SELECT IF( EXISTS (SELECT *
FROM `another_table`
WHERE userid = 365 AND courseid = 2
), 'Enrolled', 'Not enrolled'
)
)
FROM `table` grades
WHERE userinfo.`id` = grades.`id` AND
grades.`Item Name` = 'SOMEINFO'
) as SOMENAME,
You are using a subquery that returns two columns in a position where a scalar subquery is expected. A scalar subquery returns one column in at most one row.
Unfortunately, there is no easy way to do what you want in MySQL, because of the restrictions on views. I would advise you to rewrite the logic so the exists is handled using a left join in the from clause.
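A rough sketch of what the left join version might look like. Several things here are assumptions, since the full view is not shown: `users` is a stand-in for whatever table the alias userinfo points to, the enrolment check is correlated to the current user instead of the hard-coded userid = 365, and MAX() is used to collapse the grade to a single value per user.

```sql
-- Hypothetical sketch; `users` is an assumed table name behind the alias userinfo.
SELECT
    userinfo.`id`,
    COALESCE(
        MAX(grades.`grade`),
        -- no grade found: fall back to the enrolment status
        IF(COUNT(enrol.userid) > 0, 'Enrolled', 'Not enrolled')
    ) AS SOMENAME
FROM `users` userinfo
LEFT JOIN `table` grades
    ON grades.`id` = userinfo.`id`
   AND grades.`Item Name` = 'SOMEINFO'
LEFT JOIN `another_table` enrol
    ON enrol.userid = userinfo.`id`   -- assumption: replaces userid = 365
   AND enrol.courseid = 2
GROUP BY userinfo.`id`;
```

Both lookups live in the FROM clause, so the select list no longer needs a nested SELECT inside IFNULL.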

SQL query top 2 columns of joined table?

I am having no luck attempting to get the top (x number) of rows from a joined table. I want the top 2 resources (ordered by name) which in this case should be Katie and Simon and regardless of what I've tried, I can't seem to get it right. You can see below what I've commented out - and what looks like it should work (but doesn't). I cannot use a union. Any ideas?
select distinct
RTRESOURCE.RNAME as Resource,
RTTASK.TASK as taskname, SUM(distinct SOTRAN.QTY2BILL) AS quantitytobill from SOTRAN AS SOTRAN INNER JOIN RTTASK AS RTTASK ON sotran.taskid = rttask.taskid
left outer JOIN RTRESOURCE AS RTRESOURCE ON rtresource.keyno=sotran.resid
WHERE sotran.phantom<>'y' and sotran.pgroup = 'L' and sotran.timesheet = 'y' and sotran.taskid >0 AND RTRESOURCE.KEYNO in ('193','159','200') AND ( SOTRAN.ADDDATE>='8/15/2015 12:00:00 AM' AND SOTRAN.ADDDATE<'9/3/2015 11:59:59 PM' )
--and RTRESOURCE.RNAME in ( select distinct top 2 RTRESOURCE.RNAME from RTRESOURCE order by RTRESOURCE.RNAME)
--and ( select count(*) from RTRESOURCE RTRESOURCE2 where RTRESOURCE2.RNAME = RTRESOURCE.RNAME ) <= 2
GROUP BY RTRESOURCE.rname,RTTASK.task,RTTASK.taskid,RTTASK.mdsstring ORDER BY Resource,taskname
You should provide a schema.
But let's assume your query works. You can wrap it in a CTE.
WITH yourQuery as (
SELECT *
FROM < your big join query>
), maxBill as (
SELECT Resource, Max(quantitytobill) as Bill
FROM yourQuery
GROUP BY Resource
)
SELECT top 2 *
FROM maxBill
ORDER BY Bill DESC
If you want the top 2 alphabetically:
WITH yourQuery as (
SELECT *
FROM < your big join query>
), Names as (
SELECT distinct Resource
FROM yourQuery
)
SELECT top 2 *
FROM Names
ORDER BY Resource
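If what you need is every task row for the first two resources by name (not just the two names), a window function avoids both the union and the extra CTE. This is only a sketch against the tables in the question: the date and flag filters are left out for brevity, the join to RTRESOURCE is written as an inner join because the KEYNO filter already discards unmatched rows, and name_rank is a made-up column name.

```sql
-- Sketch: keep all task rows belonging to the first two resource names.
SELECT q.Resource, q.taskname, q.quantitytobill
FROM (
    SELECT
        r.RNAME AS Resource,
        t.TASK AS taskname,
        SUM(DISTINCT s.QTY2BILL) AS quantitytobill,
        -- rows sharing a resource name share a rank, so ranks 1 and 2
        -- cover every task row of the first two names alphabetically
        DENSE_RANK() OVER (ORDER BY r.RNAME) AS name_rank
    FROM SOTRAN AS s
    INNER JOIN RTTASK AS t ON s.taskid = t.taskid
    INNER JOIN RTRESOURCE AS r ON r.keyno = s.resid
    WHERE r.KEYNO IN ('193', '159', '200')
    GROUP BY r.RNAME, t.TASK
) AS q
WHERE q.name_rank <= 2
ORDER BY q.Resource, q.taskname;
```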

Subquery within SubQuery in SQL - DB2

I am having an issue when trying to make the subquery shown in the first filter dynamic, based on one of the results returned from the query. Can someone please tell me what I am doing wrong? In the first subquery it worked.
( SELECT
MAX( MAX_DATE - MIN_DATE ) AS NUM_CONS_DAYS
FROM
(
SELECT
MIN(TMP.D_DAT_INDEX_DATE) AS MIN_DATE,
MAX(TMP.D_DAT_INDEX_DATE) AS MAX_DATE,
SUM(INDEX_COUNT) AS SUM_INDEX
FROM
(
SELECT
D_DAT_INDEX_DATE,
INDEX_COUNT,
D_DAT_INDEX_DATE - (DENSE_RANK() OVER(ORDER BY D_DAT_INDEX_DATE)) DAYS AS G
FROM
DWH.MQT_SUMMARY_WATER_READINGS
WHERE
N_COD_METER_CNTX_KEY = 79094
) AS TMP
GROUP BY
TMP.G
ORDER BY
1
) ) AS MAX_NUM_CONS_DAYS
Above is the subquery in which I am trying to replace the literal 79094 with CTXKEY or CTXT.N_COD_METER_CNTX_KEY from the query. Below is the full code. Please note that in the subquery before "MAX_NUM_CONS_DAYS" it worked. However, it was only one subquery down.
SELECT
N_COD_WM_DWH_KEY,
V_COD_WM_SN_2,
N_COD_SP_ID,
CTXKEY,
V_COD_MIU_SN,
N_COD_POD,
MIU_CAT,
V_COD_SITR_ASSOCIATED,
WO_INST_DATE,
WO_MIU_CAT,
DAYSRECEIVED3,
MAX_NUM_CONS_DAYS,
( CASE WHEN ( DAYSRECEIVED3 = 3 ) THEN 'Y' ELSE 'N' END ) AS GREEN,
( CASE WHEN ( DAYSRECEIVED3 < 3 AND DAYSRECEIVED3 > 0 ) THEN 'Y' ELSE 'N' END ) AS BLUE,
( CASE WHEN ( DAYSRECEIVED3 = 0 AND MAX_NUM_CONS_DAYS >= 5 ) THEN 'Y' ELSE 'N' END ) AS ORANGE,
( CASE WHEN ( DAYSRECEIVED3 = 0 AND MAX_NUM_CONS_DAYS BETWEEN 1 and 4 ) THEN 'Y' ELSE 'N' END ) AS RED
FROM
(
SELECT
WMETER.N_COD_WM_DWH_KEY,
WMETER.V_COD_WM_SN_2,
WMETER.N_COD_SP_ID,
CTXT.N_COD_METER_CNTX_KEY AS CTXKEY,
CTXT.V_COD_MIU_SN,
CTXT.N_COD_POD,
MIU.N_COD_MIU_CATEGORY AS MIU_CAT,
CTXT.V_COD_SITR_ASSOCIATED,
T1.D_DAT_PLAN_INST AS WO_INST_DATE,
T1.N_COD_MIU_CATEGORY AS WO_MIU_CAT,
( SELECT COUNT( DISTINCT D_DAT_INDEX_DATE ) FROM DWH.MQT_SUMMARY_WATER_READINGS WHERE ( N_COD_METER_CNTX_KEY = CTXT.N_COD_METER_CNTX_KEY ) AND D_DAT_INDEX_DATE BETWEEN ( '2013-07-10' ) AND ( '2013-07-12' ) ) AS DAYSRECEIVED3,
( SELECT
MAX( MAX_DATE - MIN_DATE ) AS NUM_CONS_DAYS
FROM
(
SELECT
MIN(TMP.D_DAT_INDEX_DATE) AS MIN_DATE,
MAX(TMP.D_DAT_INDEX_DATE) AS MAX_DATE,
SUM(INDEX_COUNT) AS SUM_INDEX
FROM
(
SELECT
D_DAT_INDEX_DATE,
INDEX_COUNT,
D_DAT_INDEX_DATE - (DENSE_RANK() OVER(ORDER BY D_DAT_INDEX_DATE)) DAYS AS G
FROM
DWH.MQT_SUMMARY_WATER_READINGS
WHERE
N_COD_METER_CNTX_KEY = 79094
) AS TMP
GROUP BY
TMP.G
ORDER BY
1
) ) AS MAX_NUM_CONS_DAYS
FROM DWH.DWH_WATER_METER AS WMETER
LEFT JOIN DWH.DWH_WMETER_CONTEXT AS CTXT
ON WMETER.N_COD_WM_DWH_KEY = CTXT.N_COD_WM_DWH_KEY
LEFT JOIN DWH.DWH_MIU AS MIU
ON CTXT.V_COD_MIU_SN = MIU.V_COD_MIU_SN
LEFT JOIN
( SELECT V_COD_CORR_WAT_METER_SN, D_DAT_PLAN_INST, N_COD_MIU_CATEGORY
FROM DWH.DWH_ORDER_MANAGEMENT_FACT
JOIN DWH.DWH_MIU
ON DWH.DWH_ORDER_MANAGEMENT_FACT.V_COD_MIU_SN = DWH.DWH_MIU.V_COD_MIU_SN
) AS T1
ON WMETER.V_COD_WM_SN_2 = T1.V_COD_CORR_WAT_METER_SN
WHERE
( V_COD_SITR_ASSOCIATED = 'X' )
AND ( ( MIU.N_COD_MIU_CATEGORY <> 4 ) OR ( ( MIU.N_COD_MIU_CATEGORY IS NULL ) AND ( ( T1.N_COD_MIU_CATEGORY <> 4 ) OR ( T1.N_COD_MIU_CATEGORY IS NULL ) ) ) )
)
Error I am getting is:
Error Code: -204, SQL State: 42704
I would say that a good option here would be to use a CTE, or Common Table Expression. You can do something similar to the following:
WITH CTE_X AS(
SELECT VAL_A
,VAL_B
FROM TABLE_A)
,CTE_Y AS(
SELECT VAL_C
,VAL_B
FROM TABLE_B)
SELECT VAL_A
,VAL_B
FROM CTE_X X
JOIN CTE_Y Y
ON X.VAL_A = Y.VAL_C;
While this isn't specific to your example, it does show that CTEs create a sort of temporary "in memory" table that you can access in a subsequent query. This should allow you to issue your inner two subselects as a CTE, and then use the CTE in the "SELECT MAX( MAX_DATE - MIN_DATE ) AS NUM_CONS_DAYS" query.
You cannot reference columns from the outer select in a subselect that is more than one level deep. If I understand correctly what you're doing, you'll probably need to join DWH.MQT_SUMMARY_WATER_READINGS and DWH.DWH_WMETER_CONTEXT in the outer select.
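Putting the two suggestions together, a sketch could look like the following. It computes the consecutive-day runs for every meter key at once (note the added PARTITION BY) and then joins the result, so no deeply nested correlated reference is needed. The CTE names ISLANDS, RUNS and MAX_RUN and the alias RUN_LEN are made up for illustration, and only the context table is shown in the final select; the remaining joins and columns from the original query would be added back in.

```sql
-- Sketch: gaps-and-islands per meter key, then join instead of correlating.
WITH ISLANDS AS (
    SELECT
        N_COD_METER_CNTX_KEY,
        D_DAT_INDEX_DATE,
        -- consecutive dates for the same key collapse to the same G value
        D_DAT_INDEX_DATE
            - (DENSE_RANK() OVER (PARTITION BY N_COD_METER_CNTX_KEY
                                  ORDER BY D_DAT_INDEX_DATE)) DAYS AS G
    FROM DWH.MQT_SUMMARY_WATER_READINGS
),
RUNS AS (
    SELECT
        N_COD_METER_CNTX_KEY,
        MAX(D_DAT_INDEX_DATE) - MIN(D_DAT_INDEX_DATE) AS RUN_LEN
    FROM ISLANDS
    GROUP BY N_COD_METER_CNTX_KEY, G
),
MAX_RUN AS (
    SELECT
        N_COD_METER_CNTX_KEY,
        MAX(RUN_LEN) AS MAX_NUM_CONS_DAYS
    FROM RUNS
    GROUP BY N_COD_METER_CNTX_KEY
)
SELECT
    CTXT.N_COD_METER_CNTX_KEY AS CTXKEY,
    MR.MAX_NUM_CONS_DAYS
FROM DWH.DWH_WMETER_CONTEXT AS CTXT
LEFT JOIN MAX_RUN AS MR
    ON MR.N_COD_METER_CNTX_KEY = CTXT.N_COD_METER_CNTX_KEY;
```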

Oracle SQL - Sub-queries

I am making a report in EM and I need to figure something out.
I have this query that I made:
SELECT
*
FROM
(
SELECT DISTINCT
patch.host as "PHost",
patch.home_location as "PDirectory",
patch.home_name as "PHome",
MAX(patch.INSTALLATION_TIME) as "Patched (Date)",
MAX(patch.PATCH_RELEASE) as "PVersion",
listagg(patch,',') WITHIN GROUP (ORDER BY patch) "Patches"
FROM
mgmt$applied_patches patch
GROUP BY patch.host, patch.home_location,patch.home_name
ORDER BY patch.host, patch.home_location
) "PCH",
(
SELECT DISTINCT
T1.PROPERTY_VALUE as "MHost",
T2.PROPERTY_VALUE as "MDirectory",
T3.PROPERTY_VALUE as "MVersion",
count(T4.PROPERTY_VALUE) as "Count of SID",
listagg(T4.PROPERTY_VALUE,',') WITHIN GROUP (ORDER BY T4.PROPERTY_VALUE) as "SID"
FROM
MGMT$TARGET_PROPERTIES T1,
MGMT$TARGET_PROPERTIES T2,
MGMT$TARGET_PROPERTIES T3,
MGMT$TARGET_PROPERTIES T4
WHERE
T1.TARGET_GUID = T2.TARGET_GUID
and T1.TARGET_GUID = T3.TARGET_GUID
and T1.TARGET_GUID = T4.TARGET_GUID
and T1.PROPERTY_NAME = 'MachineName'
and T2.PROPERTY_NAME = 'OracleHome'
and T3.PROPERTY_NAME = 'Version'
and T4.PROPERTY_NAME = 'SID'
GROUP BY T1.PROPERTY_VALUE, T2.PROPERTY_VALUE, T3.PROPERTY_VALUE
) "MGM"
WHERE
PDirectory = MDirectory
I'm getting error ORA-00904: "MDIRECTORY":....
I tried many combinations (PCH.PDirectory = MGM.MDirectory, ...) but nothing works.
cheers
Mixed case names in Oracle are an abomination, and you've just been bitten by them.
Use:
"PDirectory" = "MDirectory"
... or, better still, do not use names that need quoting.
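For example, here is a minimal sketch with unquoted aliases on DUAL. Oracle folds unquoted identifiers to upper case, so any spelling of the alias matches:

```sql
-- Unquoted aliases are stored as PDIRECTORY / MDIRECTORY,
-- so they can be referenced in any letter case.
SELECT pch.pdirectory, mgm.mdirectory
FROM
    (SELECT dummy AS pdirectory FROM dual) pch,
    (SELECT dummy AS mdirectory FROM dual) mgm
WHERE pch.PDirectory = mgm.MDirectory;
```

The mixed-case column headings can then be applied in the report layer, or only in the outermost select list if they are really needed.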
Well...
This doesn't work:
SELECT * FROM
(SELECT DUMMY AS "PDirectory" FROM DUAL) "PCH",
(SELECT DUMMY AS "MDirectory" FROM DUAL) "MGR"
WHERE PDIRECTORY = MDIRECTORY
But this works:
SELECT * FROM
(SELECT DUMMY AS "PDirectory" FROM DUAL) "PCH",
(SELECT DUMMY AS "MDirectory" FROM DUAL) "MGR"
where "PCH"."PDirectory" = "MGR"."MDirectory"
Change your query accordingly.