Converting a correlated sub-query to JOIN - sql

I have a correlated subquery that is causing performance problems. Because it is a correlated subquery, it doesn't accept index hints either, so I am trying to convert it into a JOIN. The original query and the modified query are below. The modified query returns no rows, but the original returns 224 rows.
I'd appreciate any insight into what is wrong with my JOIN query, and whether it even makes sense to use a JOIN instead of a subquery in this case. Thanks.
select Area_CODE,
due_DATE,
RATE
from SCHED S
where (s.Area_CODE = 11001 and
(s.COMP_CODE = 'a'
or
(s.COMPANY_CODE = 'b'
and s.due_DATE <
(
select
nvl( min(s1.due_DATE), to_date ( '31-DEC-2999', 'DD-MM-YYYY') )
from SCHED s1
where s1.AREA_CODE = s.AREA_CODE
and s1.COMP_CODE = 'c'
)
)
)
)
order by s.EFF_DATE asc, s.due_DATE asc
Modified Query:
SELECT
Area_CODE,
due_DATE,
RATE
from SCHED S
LEFT JOIN
(
SELECT
NVL( MIN(s1.due_DATE), to_date ( '31-DEC-2999', 'DD-MM-YYYY') ) AS
min_date,
s1.AREA_CODE AS a_code
FROM
SCHED s1
WHERE
s1.COMPANY_CODE = 'c'
GROUP BY
s1.AREA_code
)
s2
ON
s2.A_CODE = s.area_code
WHERE
(
s.area_code = 11001
AND
(
s.COMP_CODE = 'a'
OR
(
s.COMP_CODE = 'b'
and s.due_DATE < s2.min_date
)
)
)
order by s.EFF_DATE asc, s.due_DATE asc

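The difference shows up when an area code has no 'c' rows. In your derived table the GROUP BY then produces no row at all for that area, so the NVL never runs; after the LEFT JOIN, s2.min_date is NULL and s.due_DATE < s2.min_date is never true. The correlated subquery has no GROUP BY, so it always returns one row and the NVL supplies the 31-DEC-2999 default.
Change the left join (s2) to an inner join on a derived table that always contains the default date for every area code: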
inner JOIN
(
SELECT
min(s1.due_DATE) AS min_date,
s1.AREA_CODE AS a_code
FROM
(
select Area_Code, due_date from SCHED WHERE COMPANY_CODE = 'c'
union
select Area_Code, to_date('31-dec-2999','DD-MM-YYYY') from SCHED
) s1
GROUP BY
s1.AREA_code
) s2
ON s2.a_code = s.area_code
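To see the effect in isolation, here is a minimal sketch using Python's built-in sqlite3 module rather than Oracle. The three-row SCHED table, the ISO date strings, and COALESCE standing in for NVL are all assumptions made for illustration. It compares the correlated subquery, the LEFT JOIN as written in the question, and a variant that applies the default after the join, which is an alternative to the UNION approach above, not the same query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sched (area_code INTEGER, comp_code TEXT, due_date TEXT)")
conn.executemany(
    "INSERT INTO sched VALUES (?, ?, ?)",
    [
        (11001, "a", "2020-01-01"),
        (11001, "b", "2020-02-01"),  # area 11001 has no 'c' rows at all
        (11002, "c", "2020-03-01"),
    ],
)

SELECT = "SELECT s.comp_code, s.due_date FROM sched s "
WHERE = (
    " WHERE s.area_code = 11001"
    " AND (s.comp_code = 'a' OR (s.comp_code = 'b' AND s.due_date < {cutoff}))"
    " ORDER BY s.due_date"
)
# Scalar aggregate without GROUP BY: always one row, so the default applies.
SUBQUERY = (
    "(SELECT COALESCE(MIN(s1.due_date), '2999-12-31') FROM sched s1"
    " WHERE s1.area_code = s.area_code AND s1.comp_code = 'c')"
)
# Grouped derived table: an area without 'c' rows gets no row, hence NULL after the join.
JOIN = (
    "LEFT JOIN (SELECT s1.area_code AS a_code,"
    " COALESCE(MIN(s1.due_date), '2999-12-31') AS min_date"
    " FROM sched s1 WHERE s1.comp_code = 'c' GROUP BY s1.area_code) s2"
    " ON s2.a_code = s.area_code"
)

correlated = conn.execute(SELECT + WHERE.format(cutoff=SUBQUERY)).fetchall()
naive_join = conn.execute(SELECT + JOIN + WHERE.format(cutoff="s2.min_date")).fetchall()
fixed_join = conn.execute(
    SELECT + JOIN + WHERE.format(cutoff="COALESCE(s2.min_date, '2999-12-31')")
).fetchall()

print(correlated)  # both the 'a' and the 'b' row
print(naive_join)  # the 'b' row is lost
print(fixed_join)  # matches the correlated result again
```

In Oracle the same idea would be NVL(s2.min_date, ...) in the outer WHERE clause while keeping the LEFT JOIN; whether that or the UNION version performs better would need checking against the real execution plan.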

Related

How to mimic an ON condition while using Union?

I have a query like this:
select
yyyy_mm_dd,
xml_id,
feature,
status
from
schema.t1
where
yyyy_mm_dd >= '2019-02-02'
union all
select
yyyy_mm_dd,
p_id as xml_id,
'payment' as feature,
case
when payment = 1 then 1
else 0
end as status
from
schema.t2
where
yyyy_mm_dd >= '2019-02-02'
Is there a way I can ensure no side of the union has a greater date than the other? With a join I could enforce this with an on condition on yyyy_mm_dd. I want to maintain the union but only until the max date which is available in both tables.
Is there a more efficient way to solve this than the solution I've come up with?
select
c.yyyy_mm_dd,
xml_id,
feature,
status
from
schema.t1 c
left join(
select
max(yyyy_mm_dd) as yyyy_mm_dd
from
schema.t2
where
yyyy_mm_dd >= '2020-10-01'
) m on m.yyyy_mm_dd = c.yyyy_mm_dd
where
c.yyyy_mm_dd >= '2020-10-01'
and m.yyyy_mm_dd is null
union all
select
c.yyyy_mm_dd,
p_id as xml_id,
'payment' as feature,
case
when payment = 1 then 1
else 0
end as status
from
schema.t2 c
left join(
select
max(yyyy_mm_dd) as yyyy_mm_dd
from
schema.t1
where
yyyy_mm_dd >= '2020-10-01'
) m on m.yyyy_mm_dd = c.yyyy_mm_dd
where
c.yyyy_mm_dd >= '2020-10-01'
and m.yyyy_mm_dd is not null
Create one CTE for each of your two queries, and then select only the rows of each CTE whose yyyy_mm_dd also appears in the other CTE:
with
cte1 as (
select yyyy_mm_dd, xml_id, feature, status
from schema.t1
where yyyy_mm_dd >= '2019-02-02'
),
cte2 as (
select yyyy_mm_dd, p_id as xml_id, 'payment' as feature,
case when payment = 1 then 1 else 0 end as status
from schema.t2
where yyyy_mm_dd >= '2019-02-02'
)
select c1.* from cte1 c1
where exists (select 1 from cte2 c2 where c2.yyyy_mm_dd = c1.yyyy_mm_dd)
union all
select c2.* from cte2 c2
where exists (select 1 from cte1 c1 where c1.yyyy_mm_dd = c2.yyyy_mm_dd)
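As a quick check of how the EXISTS filters behave, the following sketch runs the same CTE query against an in-memory SQLite database from Python. The sample rows, the 'login' feature value, and the dropped schema prefix are assumptions for illustration only. Note that this keeps dates present in both tables, so a date missing from one table in the middle of the range is dropped as well, not only dates past the common maximum.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE t1 (yyyy_mm_dd TEXT, xml_id INTEGER, feature TEXT, status INTEGER);
CREATE TABLE t2 (yyyy_mm_dd TEXT, p_id INTEGER, payment INTEGER);
INSERT INTO t1 VALUES ('2019-02-02', 1, 'login', 1),
                      ('2019-02-03', 2, 'login', 0),
                      ('2019-02-04', 3, 'login', 1);
INSERT INTO t2 VALUES ('2019-02-02', 10, 1),
                      ('2019-02-03', 11, 0);
""")

# t2 has no row for 2019-02-04, so that date should vanish from both halves.
rows = conn.execute("""
WITH
cte1 AS (
    SELECT yyyy_mm_dd, xml_id, feature, status
    FROM t1
    WHERE yyyy_mm_dd >= '2019-02-02'
),
cte2 AS (
    SELECT yyyy_mm_dd, p_id AS xml_id, 'payment' AS feature,
           CASE WHEN payment = 1 THEN 1 ELSE 0 END AS status
    FROM t2
    WHERE yyyy_mm_dd >= '2019-02-02'
)
SELECT c1.* FROM cte1 c1
WHERE EXISTS (SELECT 1 FROM cte2 c2 WHERE c2.yyyy_mm_dd = c1.yyyy_mm_dd)
UNION ALL
SELECT c2.* FROM cte2 c2
WHERE EXISTS (SELECT 1 FROM cte1 c1 WHERE c1.yyyy_mm_dd = c2.yyyy_mm_dd)
ORDER BY 1, 2
""").fetchall()

for row in rows:
    print(row)
```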

OPTIMIZING QUERY - ORACLE SQL

I have a query that fetches information from a large database. Fetching the records for one customer takes at least 12 seconds, and I estimate that going through the whole table will take no less than 12 days. Please have a look at it and suggest how it can be optimized.
SELECT gam.CIF_ID
, TO_CHAR(T1.TRAN_DATE, 'MM') AS MONTH
, SUM(TBAADM.COMMONPACKAGE.getConvertedAmount('TZ', tran_amt, tran_crncy_code, 'TZS', 'REV', tran_date)) AS TOTAL_DEBIT_AMT
, ROUND(AVG(TBAADM.COMMONPACKAGE.getConvertedAmount('TZ', tran_amt, tran_crncy_code, 'TZS', 'REV', tran_date)), 3) AS AVG_DEBIT_AMT
, COUNT(T1.PART_TRAN_TYPE) AS NUMBER_OF_DEBIT_TRANSACTIONS
FROM TBAADM.HIST_TRAN_DTL_TABLE T1
LEFT JOIN TBAADM.GENERAL_ACCT_MAST_TABLE gam ON ( gam.ACID = T1.ACID )
WHERE T1.PART_TRAN_TYPE = 'D'
AND NOT EXISTS (
SELECT 1
FROM TBAADM.HIST_TRAN_DTL_TABLE T3
INNER JOIN TBAADM.GENERAL_ACCT_MAST_TABLE g ON ( g.ACID = T3.ACID )
WHERE T3.PART_TRAN_TYPE ='C'
AND T3.TRAN_DATE || T3.TRAN_ID=T1.TRAN_DATE || T1.TRAN_ID
AND g.cif_id = gam.cif_id
)
AND (
T1.TRAN_SUB_TYPE='CI'
OR T1.TRAN_SUB_TYPE='SI'
OR T1.TRAN_SUB_TYPE='PI'
OR T1.TRAN_SUB_TYPE='RI'
)
AND gam.ACCT_OWNERSHIP <> 'O'
AND T1.TRAN_SUB_TYPE <> 'CP'
AND gam.SCHM_TYPE <> 'LAA'
AND tran_particular LIKE 'POS PURCHASE %'
AND T1.TRAN_DATE BETWEEN '1/JAN/14' AND '30/JUNE/18'
GROUP BY TO_CHAR(T1.TRAN_DATE, 'MM')
, gam.CIF_ID
;

TERADATA: query optimization

This query works, but it seems to take longer than usual to retrieve the data. Is there a better way to write it so that it runs faster? I need to get every PRD_ID from T1 and T2 even if there is no match in S1 and S2.
SELECT DISTINCT T.PRD_ID, T.AMOUNT, T.DATE, T.REGION
FROM
(
SELECT DISTINCT T1.PRD_ID, T1.PRD_CODE, S1.ORDER_DATE AS DATE, T1.REGION
FROM
(
(SELECT PRD_ID, PRD_CODE,AMOUNT,REGION
FROM PRODUCT
WHERE REGION='CA') T1
LEFT JOIN SERVICE_1 S1
ON S1.PRD_ID = T1.PRD_ID
AND S1.PRD_CODE= T1.PRD_CODE
AND S1.AMT = T1.AMOUNT
AND S1.ORDER_DATE >= '01/01/2015'
AND S1.ORDER_DATE <= '02/28/2015'
)
UNION ALL
SELECT DISTINCT T2.PRD_ID, T2.PRD_CODE, S2.ACCT_CALENDAR_DT AS DATE, T2.REGION
FROM
(
(SELECT PRD_ID, PRD_CODE,AMOUNT,REGION
FROM PRODUCT
WHERE REGION='IL') T2
LEFT JOIN SERVICE_2 S2
ON S2.PRD_ID = T2.PRD_ID
AND S2.PRD_CODE= T2.PRD_CODE
AND S2.AMT = T2.AMOUNT
AND S2.ACCT_CALENDAR_DT >= '20150101'
AND S2.ACCT_CALENDAR_DT <= '20150228'
)
) T
ORDER BY REGION, ORDER_DATE DESC, PRD_ID
I can't see why you need all these (3!) levels of nested tables. The following should be equivalent:
SELECT DISTINCT
T1.PRD_ID, T1.PRD_CODE, S1.ORDER_DATE AS DATE, T1.REGION
FROM
PRODUCT T1
LEFT JOIN SERVICE_1 S1
ON S1.PRD_ID = T1.PRD_ID
AND S1.PRD_CODE= T1.PRD_CODE
AND S1.AMT = T1.AMOUNT
AND S1.ORDER_DATE >= DATE '2015-01-01' -- converted '01/01/2015'
AND S1.ORDER_DATE <= DATE '2015-02-28' -- converted '02/28/2015'
WHERE T1.REGION = 'CA'
UNION ALL -- No need for DISTINCT here. The Region
-- is different between the 2 parts.
SELECT DISTINCT
T2.PRD_ID, T2.PRD_CODE, S2.ACCT_CALENDAR_DT AS DATE, T2.REGION
FROM
PRODUCT T2
LEFT JOIN SERVICE_2 S2
ON S2.PRD_ID = T2.PRD_ID
AND S2.PRD_CODE= T2.PRD_CODE
AND S2.AMT = T2.AMOUNT
AND S2.ACCT_CALENDAR_DT >= DATE '2015-01-01'
AND S2.ACCT_CALENDAR_DT <= DATE '2015-02-28'
WHERE T2.REGION = 'IL'
ORDER BY REGION, DATE DESC, PRD_ID ;
or:
SELECT DISTINCT
T1.PRD_ID, T1.PRD_CODE, S1.ORDER_DATE AS DATE, 'CA' AS REGION
FROM
( SELECT PRD_ID, PRD_CODE, AMOUNT
FROM PRODUCT
WHERE REGION = 'CA'
) T1
LEFT JOIN SERVICE_1 S1
ON S1.PRD_ID = T1.PRD_ID
AND S1.PRD_CODE= T1.PRD_CODE
AND S1.AMT = T1.AMOUNT
AND S1.ORDER_DATE >= DATE '2015-01-01'
AND S1.ORDER_DATE <= DATE '2015-02-28'
UNION ALL
SELECT DISTINCT
T2.PRD_ID, T2.PRD_CODE, S2.ACCT_CALENDAR_DT AS DATE, 'IL' AS REGION
FROM
( SELECT PRD_ID, PRD_CODE, AMOUNT
FROM PRODUCT
WHERE REGION = 'IL'
) T2
LEFT JOIN SERVICE_2 S2
ON S2.PRD_ID = T2.PRD_ID
AND S2.PRD_CODE= T2.PRD_CODE
AND S2.AMT = T2.AMOUNT
AND S2.ACCT_CALENDAR_DT >= DATE '2015-01-01'
AND S2.ACCT_CALENDAR_DT <= DATE '2015-02-28'
ORDER BY REGION, DATE DESC, PRD_ID ;

How to join the next date value of the same table

I have a table in SQL with the following fields:
The timestamp field will have all the punches that an employee has in a day.
So having the following data:
I need to create 2 different queries:
1. Select all the IN timestamps with their corresponding next OUT timestamp.
2. Select all the OUT timestamps with their corresponding previous IN timestamp.
So, in the first query, I should get the following:
In the second query, I should get the following:
Any clue on how to build such queries?
HERE IS THE Fiddle: http://sqlfiddle.com/#!6/a137d/1
I believe this is what you're looking for. These queries should work on most DBMSs; the second one uses SQL Server's ISNULL, for which COALESCE is the portable equivalent.
First
SELECT ea1.employeeid, ea1.timestamp AS instamp, ea2.timestamp AS outstamp
FROM employee_attendance ea1
LEFT JOIN employee_attendance ea2
ON ea2.employeeid=ea1.employeeid
AND ea2.accesscode = 'OUT'
AND ea2.timestamp = (
SELECT MIN(ea3.timestamp)
FROM employee_attendance ea3
WHERE ea3.timestamp > ea1.timestamp
AND ea3.employeeid = ea1.employeeid
)
WHERE ea1.accessCode = 'IN'
AND ea1.employeeid = 4;
Second
SELECT ea1.employeeid, ea1.timestamp AS outstamp, ea2.timestamp AS instamp
FROM employee_attendance ea1
LEFT JOIN employee_attendance ea2
ON ea2.employeeid=ea1.employeeid
AND ea2.accesscode = 'IN'
AND ea2.timestamp = (
SELECT MIN(ea3.timestamp)
FROM employee_attendance ea3
WHERE ea3.timestamp < ea1.timestamp
AND ea3.employeeid = ea1.employeeid
AND ea3.timestamp > ISNULL((
SELECT MAX(ea4.timestamp)
FROM employee_attendance ea4
WHERE ea4.accesscode = 'OUT'
AND ea4.timestamp < ea1.timestamp
AND ea4.employeeid = ea1.employeeid
), '2000-1-1')
)
WHERE ea1.accessCode = 'OUT'
AND ea1.employeeid = 4;
This looks like a nice use case for the LEAD and LAG analytic functions in SQL Server 2012.
SELECT * FROM
(
SELECT EMPLOYEEID, TIMESTAMP,
LEAD(timestamp) OVER (ORDER BY TIMESTAMP
) OUTTIMESTAMP, ACCESSCODE
FROM [dbo].[employee_attendance]
WHERE EMPLOYEEID =4
) T
where T.ACCESSCODE ='IN'
Second query:
SELECT * FROM
(
SELECT EMPLOYEEID, TIMESTAMP,
LAG(timestamp) OVER (ORDER BY TIMESTAMP
) INTIMESTAMP, ACCESSCODE
FROM [dbo].[employee_attendance]
WHERE EMPLOYEEID =4
) T
where T.ACCESSCODE ='OUT'
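The LEAD query above filters on a single employee. If all employees are needed at once, one option is to add PARTITION BY so that each employee's punches are paired only with their own. Below is a minimal sketch of that variant, run against SQLite from Python (window functions need SQLite 3.25 or later). The sample punches are made up, and it assumes IN and OUT punches strictly alternate, since LEAD returns the next punch regardless of its access code.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee_attendance (employeeid INTEGER, accesscode TEXT, timestamp TEXT)"
)
conn.executemany(
    "INSERT INTO employee_attendance VALUES (?, ?, ?)",
    [
        (4, "IN", "2014-01-01 08:00"),
        (5, "IN", "2014-01-01 09:00"),  # falls between employee 4's punches
        (4, "OUT", "2014-01-01 12:00"),
        (4, "IN", "2014-01-01 13:00"),
        (4, "OUT", "2014-01-01 17:00"),
        (5, "OUT", "2014-01-01 18:00"),
    ],
)

# PARTITION BY restarts the window per employee, so employee 5's 09:00 punch
# is never picked up as the "next" punch after employee 4's 08:00 IN.
pairs = conn.execute("""
SELECT employeeid, timestamp AS instamp, outstamp
FROM (
    SELECT employeeid, accesscode, timestamp,
           LEAD(timestamp) OVER (PARTITION BY employeeid ORDER BY timestamp) AS outstamp
    FROM employee_attendance
) t
WHERE accesscode = 'IN'
ORDER BY employeeid, instamp
""").fetchall()

for pair in pairs:
    print(pair)
```

The OUT-to-previous-IN direction would follow the same pattern with LAG in place of LEAD and the outer filter on 'OUT'.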

Selecting the Max Date From Multiple Tables

Any help would be greatly appreciated. I am trying to select the last activity date from a group of tables. The tables include Entry Date, Note Date, Payment Date, and Claim Date. I would like to return only the max value across all these dates. Furthermore, I only want records where there has been no activity for over 45 days. I am currently using the following SQL to bring all the dates in and then using calculated fields in Excel to figure the rest out. Is it possible to do this all with SQL?
Thanks in advance.
SELECT xrxTrnLgr.PatId, xrxTrnLgr.Balance,
Max(xrxPatNotes.NoteDate) AS 'Max of NoteDate',
Max(xrxTrnIcf.PostDate) AS 'Max of IcfPostDate',
Max(xrxPat.EntryDate) AS 'Entry Date',
Max(xrxPat.Coverage) AS 'Coverage',
Max(xrxTrnPay.PostDate) AS 'Last Payment'
FROM xrxTrnLgr
LEFT OUTER JOIN xrxPatNotes ON xrxTrnLgr.PatId = xrxPatNotes.PatId
LEFT OUTER JOIN xrxTrnIcf ON xrxTrnLgr.PatId = xrxTrnIcf.PatId
LEFT OUTER JOIN xrxPat ON xrxTrnLgr.PatId = xrxPat.PatId
LEFT OUTER JOIN xrxTrnPay ON xrxTrnLgr.PatId = xrxTrnPay.PatId
GROUP BY xrxTrnLgr.PatId, xrxTrnLgr.Balance
HAVING (xrxTrnLgr.Balance>$.01)
I think this might do it all in SQL:
select t.patid, t.balance,
max(case when which = 'note' then thedate end) as note,
max(case when which = 'post' then thedate end) as post,
max(case when which = 'entry' then thedate end) as entry,
max(case when which = 'coverage' then thedate end) as coverage,
max(case when which = 'lastPayment' then thedate end) as lastPayment
from xrxTrnLgr t left join
((select patid, notedate as thedate, 'note' as which
from xrxPatNotes
) union all
(select patid, postdate, 'post'
from xrxtrnIcf
) union all
(select patid, EntryDate, 'entry'
from xrxPat
) union all
(select patid, Coverage, 'coverage'
from xrxPat
) union all
(select patid, PostDate, 'lastPayment'
from xrxTrnPay
)
)
) d
on t.patid = d.patid
group by t.patid, t.balance
having min(now() - thedate) >= 45
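The core of this approach (stack every date into one column with UNION ALL, then keep patients whose latest date is old enough) can be tried out on a small scale. The sketch below uses SQLite from Python with only two of the activity tables, invented sample rows, and a fixed reference date instead of now() so the result is reproducible; julianday() is SQLite's way of subtracting dates and would need replacing in another DBMS.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE xrxTrnLgr (PatId INTEGER, Balance REAL);
CREATE TABLE xrxPatNotes (PatId INTEGER, NoteDate TEXT);
CREATE TABLE xrxTrnPay (PatId INTEGER, PostDate TEXT);
INSERT INTO xrxTrnLgr VALUES (1, 100.0), (2, 50.0);
INSERT INTO xrxPatNotes VALUES (1, '2024-01-10'), (2, '2024-03-20');
INSERT INTO xrxTrnPay VALUES (1, '2024-02-01');
""")

# MAX(thedate) is the last activity of any kind; "now - MAX(thedate)" is the
# same quantity as MIN(now - thedate) in the query above.
# Patients with no activity rows at all get a NULL maximum and are excluded.
stale = conn.execute("""
SELECT t.PatId, t.Balance, MAX(d.thedate) AS last_activity
FROM xrxTrnLgr t
LEFT JOIN (
    SELECT PatId, NoteDate AS thedate FROM xrxPatNotes
    UNION ALL
    SELECT PatId, PostDate FROM xrxTrnPay
) d ON d.PatId = t.PatId
WHERE t.Balance > 0.01
GROUP BY t.PatId, t.Balance
HAVING julianday(:today) - julianday(MAX(d.thedate)) > 45
""", {"today": "2024-04-01"}).fetchall()

print(stale)
```

With the reference date of 2024-04-01, patient 1 was last active 60 days earlier and is returned, while patient 2 was active 12 days earlier and is filtered out.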