PL/SQL - How to use CASE in JOIN ON specific condition - sql

Could you please advise how I can achieve the following?
I'm joining the subquery below (quote_history) to the quote table (aliased q).
Note that the subquery returns 2 rows for each quote_ref.
Currently the join is on quote_ref only, but I'd like to add a CASE so that if q.quote_description = 'XYZ' the join also requires quote_history.rown = 1 (I want only the first row in this case); otherwise the extra AND should be ignored (join on quote_ref only).
For example, for q.quote_description = 'XYZ' I'd like to get something like
LEFT JOIN (SELECT quote_ref
,activity_date_key
,quote_status_key
,ROW_NUMBER() OVER (PARTITION BY quote_ref ORDER BY activity_date_key) rown
,COUNT(*) OVER (PARTITION BY quote_ref, quote_status_key) max_rown
FROM [...]
) quote_history
ON q.quote_ref = quote_history.quote_ref
AND quote_history.rown = 1 --this is only when q.quote_description = 'XYZ'
but for other cases (q.quote_description <> 'XYZ') I want only
LEFT JOIN (SELECT quote_ref
,activity_date_key
,quote_status_key
,ROW_NUMBER() OVER (PARTITION BY quote_ref ORDER BY activity_date_key) rown
,COUNT(*) OVER (PARTITION BY quote_ref, quote_status_key) max_rown
FROM [...]
) quote_history
ON q.quote_ref = quote_history.quote_ref
Alternatively (though I think it's an uglier solution) I could write something like
JOIN [...]
ON q.quote_ref = quote_history.quote_ref
AND IF q.quote_description = 'XYZ' THEN quote_history.rown in (1)
ELSE quote_history.rown in (1,2)

You can use:
ON q.quote_ref = quote_history.quote_ref AND
(quote_history.rown = 1 or q.quote_description <> 'XYZ')
Note: This assumes that q.quote_description is never NULL. The logic can be adjusted for this if necessary (otherwise it just complicates the idea).
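To see how the OR predicate behaves, here is a minimal sketch that runs the same join shape in SQLite through Python's sqlite3 module. The table contents are made up for illustration, and the quote with a NULL description is included only to show the caveat from the note above. The null_safe variant is one possible adjustment, not necessarily the one the answer had in mind.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Illustrative data: two history rows per quote_ref.
    CREATE TABLE quote (quote_ref INTEGER, quote_description TEXT);
    CREATE TABLE quote_history (quote_ref INTEGER, activity_date_key INTEGER);
    INSERT INTO quote VALUES (1, 'XYZ'), (2, 'ABC'), (3, NULL);
    INSERT INTO quote_history VALUES
        (1, 20200101), (1, 20200102),
        (2, 20200101), (2, 20200102),
        (3, 20200101), (3, 20200102);
""")

JOIN_SQL = """
    SELECT q.quote_ref, h.rown
    FROM quote q
    LEFT JOIN (
        SELECT quote_ref,
               ROW_NUMBER() OVER (PARTITION BY quote_ref
                                  ORDER BY activity_date_key) AS rown
        FROM quote_history
    ) h
      ON q.quote_ref = h.quote_ref
     AND {extra}
    ORDER BY q.quote_ref, h.rown
"""

# Predicate from the answer: only the first row for 'XYZ', all rows otherwise.
basic = "(h.rown = 1 OR q.quote_description <> 'XYZ')"
# Possible adjustment that treats a NULL description like "not XYZ".
null_safe = ("(h.rown = 1 OR q.quote_description <> 'XYZ' "
             "OR q.quote_description IS NULL)")

basic_rows = conn.execute(JOIN_SQL.format(extra=basic)).fetchall()
null_safe_rows = conn.execute(JOIN_SQL.format(extra=null_safe)).fetchall()

print(basic_rows)      # quote 3 (NULL description) keeps only rown = 1
print(null_safe_rows)  # quote 3 keeps both rows
```

With the basic predicate, the comparison against a NULL description evaluates to unknown, so a NULL quote is treated like 'XYZ' and keeps only its first row.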

Related

How to optimize this T-sql query?

There is a loop in this query, in the last WHERE condition, and it
causes a severe performance problem.
I have no idea how to modify it.
select pr.tavpun
from mta110 pr
where pr.taisoc = mta110.taisoc
and pr.taitar = mta110.taitar
and pr.taydat = mta110.taydat
and pr.tairef = mta110.tairef
and pr.tatind = (select max(pr2.tatind) from mta110 pr2
where pr2.taisoc = mta110.taisoc
and pr2.taitar = mta110.taitar
and pr2.taydat = mta110.taydat
and pr2.tairef = mta110.tairef
and pr2.tatind <= mgc100.gntind)) AS SalesPrice
Your query makes little sense, because pr is not a reasonable alias for mta110, and mta110 is not recognized in the outer query.
I speculate that you have two tables, pr and mta110 which are joined and you want the "most recent" row of mta110 for each matching row.
If this interpretation is correct, then you can use row_number() and a proper join:
select . . .
from pr join
(select m.*,
row_number() over (partition by taisoc, taitar, taydat, tairef order by gntind desc) as seqnum
from mta110 m
) m
on pr.? = m.?
where seqnum = 1;
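If the intent of the original correlated subquery is "the row with the highest tatind that does not exceed mgc100.gntind", the cutoff can be moved into the join before ranking. The sketch below is only an illustration run in SQLite via Python. It assumes simplified schemas: a single key column (tairef) instead of four, and a hypothetical doc_id column on mgc100, since the real layout of that table is not shown in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Simplified, hypothetical schemas (one key column instead of four).
    CREATE TABLE mta110 (tairef TEXT, tatind INTEGER, tavpun REAL);
    CREATE TABLE mgc100 (doc_id INTEGER, tairef TEXT, gntind INTEGER);
    INSERT INTO mta110 VALUES ('A', 1, 10.0), ('A', 3, 12.0), ('A', 5, 15.0),
                              ('B', 2, 20.0);
    INSERT INTO mgc100 VALUES (100, 'A', 4), (101, 'A', 5), (102, 'B', 1);
""")

rows = conn.execute("""
    SELECT doc_id, tavpun AS SalesPrice
    FROM (
        SELECT g.doc_id, m.tavpun,
               -- Rank candidate rows per outer row, highest tatind first.
               ROW_NUMBER() OVER (PARTITION BY g.doc_id
                                  ORDER BY m.tatind DESC) AS seqnum
        FROM mgc100 g
        JOIN mta110 m
          ON m.tairef = g.tairef
         AND m.tatind <= g.gntind   -- the cutoff from the original subquery
    ) ranked
    WHERE seqnum = 1
    ORDER BY doc_id
""").fetchall()

print(rows)
```

Document 102 has no mta110 row at or below its cutoff, so the inner join drops it; a LEFT JOIN would keep it with a NULL price, which is closer to what a scalar subquery returns.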

Should a subquery on a join use tables from an outer query in the where clause?

I need to add a subquery to a join, because one payment can have more than one allotment, so I only need to account for the first match (where rownum = 1).
However, I'm not sure if adding pmt from the outer query to the subquery on the allotment join is best.
Should I be doing this differently because of possible performance hits, etc.?
SELECT
pmt.payment_uid,
alt.allotment_uid
FROM
payment pmt
/* HERE: is the reference to pmt.pay_key and pmt.client_id
incorrect in the below subquery? */
INNER JOIN allotment alt ON alt.allotment_uid = (
SELECT
allotment_uid
FROM
allotment
WHERE
pay_key = pmt.pay_key
AND
pay_code = 'xyz'
AND
deleted = 'N'
AND
client_id = pmt.client_id
AND
ROWNUM = 1
)
WHERE
pmt.deleted = 'N'
AND
pmt.date_paid >= TO_DATE('2017-07-01')
AND
pmt.date_paid < TO_DATE('2017-10-01') + 1;
It's difficult to identify the performance issue in your query without seeing the explain plan output. Your query does seem to run an additional SELECT on allotment for every record from the main query.
Here is a version that doesn't use a correlated subquery. Obviously I haven't been able to test it. It does a simple join and then filters out all but one of the allotments for each payment. Hope this helps.
WITH v_payment
AS
(
SELECT
pmt.payment_uid,
alt.allotment_uid,
ROW_NUMBER() OVER (PARTITION BY pmt.payment_uid ORDER BY alt.allotment_uid) r_num
FROM
payment pmt JOIN allotment alt
ON (pmt.pay_key = alt.pay_key AND
pmt.client_id = alt.client_id)
WHERE pmt.deleted = 'N' AND
pmt.date_paid >= TO_DATE('2017-07-01') AND
pmt.date_paid < TO_DATE('2017-10-01') + 1 AND
alt.pay_code = 'xyz' AND
alt.deleted = 'N'
)
SELECT payment_uid,
allotment_uid
FROM v_payment
WHERE r_num = 1;
Let us know how this performs!
You can phrase the query that way. I would be more likely to do:
SELECT . . .
FROM payment p INNER JOIN
(SELECT a.*,
ROW_NUMBER() OVER (PARTITION BY pay_key, client_id
ORDER BY allotment_uid
) as seqnum
FROM allotment a
WHERE pay_code = 'xyz' AND deleted = 'N'
) a
ON a.pay_key = p.pay_key AND a.client_id = p.client_id AND
seqnum = 1
WHERE p.deleted = 'N' AND
p.date_paid >= DATE '2017-07-01' AND
p.date_paid < (DATE '2017-10-01') + 1;
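As a rough check of the derived-table approach, the sketch below runs the same join shape in SQLite via Python on a few made-up rows (the date filter is left out to keep it short). Unlike ROWNUM = 1 without an ordering, which picks an arbitrary row, the explicit ORDER BY allotment_uid makes "first" deterministic. Whether the lowest allotment_uid is the right notion of "first" is an assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Made-up sample data; column names follow the question.
    CREATE TABLE payment (payment_uid INTEGER, pay_key INTEGER,
                          client_id INTEGER, deleted TEXT);
    CREATE TABLE allotment (allotment_uid INTEGER, pay_key INTEGER,
                            client_id INTEGER, pay_code TEXT, deleted TEXT);
    INSERT INTO payment VALUES (1, 10, 7, 'N'), (2, 11, 7, 'N'), (3, 12, 7, 'Y');
    INSERT INTO allotment VALUES
        (501, 10, 7, 'xyz', 'N'),
        (502, 10, 7, 'xyz', 'N'),
        (503, 11, 7, 'xyz', 'Y'),
        (504, 11, 7, 'xyz', 'N'),
        (505, 12, 7, 'xyz', 'N');
""")

rows = conn.execute("""
    SELECT p.payment_uid, a.allotment_uid
    FROM payment p
    INNER JOIN (
        SELECT al.*,
               ROW_NUMBER() OVER (PARTITION BY pay_key, client_id
                                  ORDER BY allotment_uid) AS seqnum
        FROM allotment al
        WHERE pay_code = 'xyz' AND deleted = 'N'
    ) a
      ON a.pay_key = p.pay_key
     AND a.client_id = p.client_id
     AND a.seqnum = 1            -- keep only the first allotment per payment
    WHERE p.deleted = 'N'
    ORDER BY p.payment_uid
""").fetchall()

print(rows)
```

Payment 1 has two matching allotments but yields a single row, and payment 2 skips its deleted allotment because the filter is applied before the rows are numbered.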

Query specific field based on max sequence number

I have a record set where some of the rows are duplicated; specifically, the last three rows of the record set are duplicates. Of the four rows, the result set I want would include only the first row and the last row, because for a particular SARAPPD_TERM_CODE_ENTRY the record needed is the one where SARAPPD_SEQ_NO is at its maximum. So I want the first row because, for that term, the sequence number maxes out at one, and the last row because the sequence number maxes out at six. The query is below.
select ppd.sarappd_seq_no, ppd.sarappd_term_code_entry, ppd.sarappd_apdc_code,
dap.saradap_term_code_entry,
spri.spriden_id,
t.sgbstdn_astd_code, t.*
from sgbstdn t
left join spriden spri on t.sgbstdn_pidm = spri.spriden_pidm
left join saradap dap on spri.spriden_pidm = dap.saradap_pidm
join sarappd ppd on dap.saradap_pidm = ppd.sarappd_pidm
where t.sgbstdn_astd_code not in ('AS', 'DS', 'WD', 'SU', 'LA')
and t.sgbstdn_stst_code = 'AS'
and spri.spriden_change_ind is null
and spri.spriden_id = '123456789'
and (ppd.sarappd_apdc_code = 25 or ppd.sarappd_apdc_code = 30
or ppd.sarappd_apdc_code =35)
and ppd.sarappd_term_code_entry = dap.saradap_term_code_entry
--where b.sarappd_term_code_entry = ppd.sarappd_term_code_entry)
order by ppd.sarappd_term_code_entry
I believe this is a simple "where this = (select max())" type of query, but I've been trying some different things and nothing is working. I'm not getting the results I want. Any help on this would be greatly appreciated. Thanks in advance.
You could use ROW_NUMBER():
;WITH cte
AS
(select
ROW_NUMBER() OVER (PARTITION BY ppd.sarappd_term_code_entry ORDER BY ppd.sarappd_seq_no DESC) AS RN,
ppd.sarappd_seq_no,
ppd.sarappd_term_code_entry,
ppd.sarappd_apdc_code,
dap.saradap_term_code_entry,
spri.spriden_id,
t.sgbstdn_astd_code, t.*
from sgbstdn t
left join spriden spri on t.sgbstdn_pidm = spri.spriden_pidm
left join saradap dap on spri.spriden_pidm = dap.saradap_pidm
join sarappd ppd on dap.saradap_pidm = ppd.sarappd_pidm
where t.sgbstdn_astd_code not in ('AS', 'DS', 'WD', 'SU', 'LA')
and t.sgbstdn_stst_code = 'AS'
and spri.spriden_change_ind is null
and spri.spriden_id = '123456789'
and (ppd.sarappd_apdc_code = 25 or ppd.sarappd_apdc_code = 30
or ppd.sarappd_apdc_code =35)
and ppd.sarappd_term_code_entry = dap.saradap_term_code_entry)
SELECT *
FROM cte WHERE rn = 1
ORDER BY sarappd_term_code_entry
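For comparison, the "where this = (select max())" form the question mentions can also be written as a correlated subquery. This is a minimal sketch in SQLite via Python against a cut-down, hypothetical version of sarappd (only the pidm, term and sequence columns, with invented values that mirror the four rows described in the question), not the full Banner query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Cut-down, hypothetical version of sarappd with invented values.
    CREATE TABLE sarappd (sarappd_pidm INTEGER,
                          sarappd_term_code_entry TEXT,
                          sarappd_seq_no INTEGER);
    INSERT INTO sarappd VALUES
        (42, '201710', 1),
        (42, '201810', 4),
        (42, '201810', 5),
        (42, '201810', 6);
""")

rows = conn.execute("""
    SELECT ppd.sarappd_term_code_entry, ppd.sarappd_seq_no
    FROM sarappd ppd
    WHERE ppd.sarappd_seq_no = (
        -- Highest sequence number for the same person and term.
        SELECT MAX(b.sarappd_seq_no)
        FROM sarappd b
        WHERE b.sarappd_pidm = ppd.sarappd_pidm
          AND b.sarappd_term_code_entry = ppd.sarappd_term_code_entry
    )
    ORDER BY ppd.sarappd_term_code_entry
""").fetchall()

print(rows)
```

In the real query, any filter that restricts which sarappd rows are eligible (such as the sarappd_apdc_code list) would presumably need to be repeated inside the subquery; otherwise the maximum may belong to a row the outer query excludes.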

SQL Server / T-SQL : query optimization assistance

I have this QA logic that checks every AuditID within a RoomID for errors: whether its AuditType was never marked Complete, or whether it has two Complete statuses. Finally, it picks only the maximum AuditDate of the RoomIDs with errors to avoid showing multiple instances of the same RoomID, since there are many audits per room.
The issue is that the AUDIT table is very large and the query takes a long time to run. I was wondering if there is any way to reach the same result faster.
Thank you in advance !
IF object_ID('tempdb..#AUDIT') is not null drop table #AUDIT
IF object_ID('tempdb..#ROOMS') is not null drop table #ROOMS
IF object_ID('tempdb..#COMPLETE') is not null drop table #COMPLETE
IF object_ID('tempdb..#FINALE') is not null drop table #FINALE
SELECT distinct
oc.HotelID, o.RoomID
INTO #ROOMS
FROM dbo.[rooms] o
LEFT OUTER JOIN dbo.[hotels] oc on o.HotelID = oc.HotelID
WHERE
o.[status] = '2'
AND o.orderType = '2'
SELECT
t.AuditID, t.RoomID, t.AuditDate, t.AuditType
INTO
#AUDIT
FROM
[dbo].[AUDIT] t
WHERE
t.RoomID IN (SELECT RoomID FROM #ROOMS)
SELECT
t1.RoomID, t3.AuditType, t3.AuditDate, t3.AuditID, t1.CompleteStatus
INTO
#COMPLETE
FROM
(SELECT
RoomID,
SUM(CASE WHEN AuditType = 'Complete' THEN 1 ELSE 0 END) AS CompleteStatus
FROM
#AUDIT
GROUP BY
RoomID) t1
INNER JOIN
#AUDIT t3 ON t1.RoomID = t3.RoomID
WHERE
t1.CompleteStatus = 0
OR t1.CompleteStatus > 1
SELECT
o.HotelID, o.RoomID,
a.AuditID, a.RoomID, a.AuditDate, a.AuditType, a.CompleteStatus,
c.ClientNum
INTO
#FINALE
FROM
#ROOMS O
LEFT OUTER JOIN
#COMPLETE a on o.RoomID = a.RoomID
LEFT OUTER JOIN
[dbo].[clients] c on o.clientNum = c.clientNum
SELECT
t.*,
Complete_Error_Status = CASE WHEN t.CompleteStatus = 0
THEN 'Not Complete'
WHEN t.CompleteStatus > 1
THEN 'Complete More Than Once'
END
FROM
#FINALE t
INNER JOIN
(SELECT
RoomID, MAX(AuditDate) AS MaxDate
FROM
#FINALE
GROUP BY
RoomID) tm ON t.RoomID = tm.RoomID AND t.AuditDate = tm.MaxDate
One section you could improve would be this one. See the inline comments.
SELECT
t1.RoomID, t3.AuditType, t3.AuditDate, t3.AuditID, t1.CompleteStatus
INTO
#COMPLETE
FROM
(SELECT
RoomID,
COUNT(1) AS CompleteStatus
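-- Use the above along with the WHERE clause below
-- so that you are aggregating fewer records and
-- avoiding a CASE expression. Remove this next line.
-- Caveat: with the WHERE filter, a room with no 'Complete'
-- audits produces no group, so CompleteStatus = 0 can never match.
--SUM(CASE WHEN AuditType = 'Complete' THEN 1 ELSE 0 END) AS CompleteStatus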
FROM
#AUDIT
WHERE
AuditType = 'Complete'
GROUP BY
RoomID) t1
INNER JOIN
#AUDIT t3 ON t1.RoomID = t3.RoomID
WHERE
t1.CompleteStatus = 0
OR t1.CompleteStatus > 1
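One thing worth checking before adopting the COUNT plus WHERE variant: filtering on AuditType = 'Complete' before grouping means a room with no Complete audits produces no group at all, so the "never completed" case cannot come back with a count of 0. The sketch below compares the two aggregations in SQLite via Python on invented rows; the table name audit stands in for #AUDIT.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Invented rows: room 1 completed once, room 2 never, room 3 twice.
    CREATE TABLE audit (RoomID INTEGER, AuditType TEXT);
    INSERT INTO audit VALUES
        (1, 'Start'), (1, 'Complete'),
        (2, 'Start'),
        (3, 'Complete'), (3, 'Complete');
""")

# Conditional aggregation over all rows: every room gets a group.
sum_case_rows = conn.execute("""
    SELECT RoomID,
           SUM(CASE WHEN AuditType = 'Complete' THEN 1 ELSE 0 END) AS CompleteStatus
    FROM audit
    GROUP BY RoomID
    HAVING SUM(CASE WHEN AuditType = 'Complete' THEN 1 ELSE 0 END) = 0
        OR SUM(CASE WHEN AuditType = 'Complete' THEN 1 ELSE 0 END) > 1
    ORDER BY RoomID
""").fetchall()

# Filter first, then count: rooms without a 'Complete' row disappear.
count_where_rows = conn.execute("""
    SELECT RoomID, COUNT(1) AS CompleteStatus
    FROM audit
    WHERE AuditType = 'Complete'
    GROUP BY RoomID
    HAVING COUNT(1) = 0 OR COUNT(1) > 1
    ORDER BY RoomID
""").fetchall()

print(sum_case_rows)
print(count_where_rows)
```

If the never-completed rooms matter, the filtered count would need an outer join back to the full room list (treating a missing count as 0), or the original SUM(CASE ...) form can be kept.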
Just a thought: streamline your code and your solution. You are not effectively filtering your data sets down, so you keep querying the entire tables, which takes a lot of resources, and your temp tables become full copies of those columns without the indexes (PK, FK, etc.) that exist on the original tables. This is by no means a perfect solution, but it is an idea of how you can consolidate your logic and reduce your overall data set. Give it a try and see if it performs better for you.
Note this will return the last audit record for any room that has either not had an audit completed or completed more than once.
;WITH cte AS (
SELECT
o.RoomId
,o.HotelId
,o.clientNum
,a.AuditId
,a.AuditDate
,a.AuditType
,NumOfAuditsComplete = SUM(CASE WHEN a.AuditType = 'Complete' THEN 1 ELSE 0 END) OVER (PARTITION BY o.RoomId)
,RowNum = ROW_NUMBER() OVER (PARTITION BY o.RoomId ORDER BY a.AuditDate DESC)
FROM
dbo.Rooms o
LEFT JOIN dbo.Audit a
ON o.RoomId = a.RoomId
WHERE
o.[Status] = 2
AND o.OrderType = 2
)
SELECT
oc.HotelId
,cte.RoomId
,cte.AuditId
,cte.AuditDate
,cte.AuditType
,cte.NumOfAuditsComplete
,cte.clientNum
,Complete_Error_Status = CASE WHEN cte.NumOfAuditsComplete > 1 THEN 'Complete More Than Once' ELSE 'Not Complete' END
FROM
cte
LEFT JOIN dbo.Hotels oc
ON cte.HotelId = oc.HotelId
LEFT JOIN dbo.clients c
ON cte.clientNum = c.clientNum
WHERE
cte.RowNum = 1
AND cte.NumOfAuditsComplete != 1
Also note I changed your
WHERE
o.[status] = '2'
AND o.orderType = '2'
TO
WHERE
o.[status] = 2
AND o.orderType = 2
to be numeric, without the single quotes. If the data type is truly varchar, add them back; but when you query a numeric column as a varchar, it will do a data conversion and may not take advantage of indexes that you have built on the table.
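The windowed approach from the CTE above can be tried on a small data set. This is a minimal sketch in SQLite via Python with invented rows and only the columns needed for the logic. SQLite does not accept the T-SQL alias = expression syntax, so the aliases use AS here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Invented rows; room 4 has no audits at all.
    CREATE TABLE rooms (RoomId INTEGER);
    CREATE TABLE audit (AuditId INTEGER, RoomId INTEGER,
                        AuditDate TEXT, AuditType TEXT);
    INSERT INTO rooms VALUES (1), (2), (3), (4);
    INSERT INTO audit VALUES
        (1, 1, '2020-01-01', 'Start'),
        (2, 1, '2020-01-05', 'Complete'),
        (3, 2, '2020-01-02', 'Start'),
        (4, 2, '2020-01-03', 'Review'),
        (5, 3, '2020-01-01', 'Complete'),
        (6, 3, '2020-01-04', 'Complete');
""")

rows = conn.execute("""
    WITH cte AS (
        SELECT o.RoomId, a.AuditId,
               -- Per-room count of 'Complete' audits, repeated on every row.
               SUM(CASE WHEN a.AuditType = 'Complete' THEN 1 ELSE 0 END)
                   OVER (PARTITION BY o.RoomId) AS NumOfAuditsComplete,
               -- 1 marks the most recent audit in each room.
               ROW_NUMBER() OVER (PARTITION BY o.RoomId
                                  ORDER BY a.AuditDate DESC) AS RowNum
        FROM rooms o
        LEFT JOIN audit a ON o.RoomId = a.RoomId
    )
    SELECT RoomId, AuditId, NumOfAuditsComplete,
           CASE WHEN NumOfAuditsComplete > 1 THEN 'Complete More Than Once'
                ELSE 'Not Complete' END AS Complete_Error_Status
    FROM cte
    WHERE RowNum = 1 AND NumOfAuditsComplete <> 1
    ORDER BY RoomId
""").fetchall()

for row in rows:
    print(row)
```

Room 1 (completed exactly once) is filtered out, while room 4 still appears with a NULL AuditId because of the LEFT JOIN. Whether rooms without any audit should be reported is a requirement the question does not settle.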

Using Aliased column in a join

I have an aliased column:
FIRST_VALUE(SUBSTR(ba.CREATED,1,18))
OVER (PARTITION BY bsh.STRUCTURE_ELEMENT_ID, bal.BUDGET_CYCLE_ID
ORDER BY ba.CREATED DESC NULLS FIRST) AS UPDATED_DATE
I need to use this new field in a join; how can I do it? I've tried copying the same syntax, but I get an error message stating that window functions are not allowed here. I also tried using the UPDATED_DATE alias name, but that says the field does not exist.
Can anyone advise please?
Edited 24/10/15:
I've tried the suggestions I've been given, but they don't seem to be working. I'm not sure if it's because it's already a complex statement with other joins in it, so this is the full code as it currently stands:
SELECT /* State Change 2 to 3 */
bal.BUDGET_CYCLE_ID AS BUDGET_CYCLE_ID,
bal.STRUCTURE_ELEMENT_ID AS COST_CENTRE,
FIRST_VALUE(SUBSTR(ba.CREATED,1,15))
OVER (PARTITION BY bsh.STRUCTURE_ELEMENT_ID,
bal.BUDGET_CYCLE_ID,
bsh.PREVIOUS_STATE || bsh.NEW_STATE
ORDER BY ba.CREATED DESC NULLS FIRST) AS UPDATED_DATE,
bsh.PREVIOUS_STATE AS PREVIOUS_STATE,
bsh.NEW_STATE AS NEW_STATE,
ba.USER_ID AS USER_ID
FROM BUDGET_ACTIVITY ba
LEFT JOIN BUDGET_ACTIVITY_LINK bal
ON ba.BUDGET_ACTIVITY_ID = bal.BUDGET_ACTIVITY_ID
AND ba.ACTIVITY_TYPE = 5
LEFT JOIN BUDGET_STATE_HISTORY bsh
ON bal.STRUCTURE_ELEMENT_ID = bsh.STRUCTURE_ELEMENT_ID
AND bal.BUDGET_CYCLE_ID = bsh.BUDGET_CYCLE_ID
AND SUBSTR(ba.CREATED,1,15) = SUBSTR(bsh.CHANGED_TIME,1,15)
WHERE PREVIOUS_STATE || NEW_STATE = 23
AND bal.budget_cycle_ID = '227565'
AND bal.structure_element_ID = '418'
I need to change the SUBSTR(ba.created,1,15) to the UPDATED_DATE field derived above. I'm relatively new to SQL and this one is beyond me.
You'll need to put your aliased column in a subquery:
SELECT
...
FROM
(
SELECT
...
FIRST_VALUE(SUBSTR(ba.CREATED,1,18))
OVER (PARTITION BY bsh.STRUCTURE_ELEMENT_ID, bal.BUDGET_CYCLE_ID
ORDER BY ba.CREATED DESC NULLS FIRST) AS UPDATED_DATE
FROM
...
) T1
JOIN TABLE2 T2 ON T1.UPDATED_DATE = T2.DATE_FIELD;
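Here is a small runnable sketch of the subquery pattern, using SQLite via Python. The two tables are heavily simplified, hypothetical stand-ins for BUDGET_ACTIVITY and BUDGET_STATE_HISTORY (invented columns and values), and the date is cut to 10 characters purely for the example. The point is only that the window function is computed in the inner query, so the outer join can refer to its alias.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical, simplified stand-ins for the real tables.
    CREATE TABLE budget_activity (activity_id INTEGER, element_id INTEGER,
                                  created TEXT);
    CREATE TABLE state_history (element_id INTEGER, changed_time TEXT,
                                new_state INTEGER);
    INSERT INTO budget_activity VALUES
        (1, 418, '2015-10-20 09:00'),
        (2, 418, '2015-10-23 14:30'),
        (3, 500, '2015-10-01 08:00');
    INSERT INTO state_history VALUES
        (418, '2015-10-20 09:01', 2),
        (418, '2015-10-23 14:31', 3),
        (500, '2015-10-01 08:05', 3);
""")

rows = conn.execute("""
    SELECT DISTINCT t1.element_id, t1.updated_date, sh.new_state
    FROM (
        -- The window function lives here, so its alias exists for the join.
        SELECT ba.element_id,
               FIRST_VALUE(SUBSTR(ba.created, 1, 10))
                   OVER (PARTITION BY ba.element_id
                         ORDER BY ba.created DESC) AS updated_date
        FROM budget_activity ba
    ) t1
    JOIN state_history sh
      ON sh.element_id = t1.element_id
     AND SUBSTR(sh.changed_time, 1, 10) = t1.updated_date
    ORDER BY t1.element_id
""").fetchall()

print(rows)
```

For element 418 only the state change on the latest activity date joins, which is the effect of matching on the derived UPDATED_DATE instead of each row's own CREATED value.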