Query specific field based on max sequence number

Query specific field based on max sequence number - sql

I have a record set where some of the rows are duplicated. In particular, they are duplicated for the last three rows of the record set. Of the entire four rows, the correct result set that I desire would include the first row and the last row. I desire this because for a particular SARAPPD_TERM_CODE_ENTRY, the record needed is the one where the SARAPPD_SEQ_NO value is at its max. So, the first row because for that particular term, the sequence number is maxed at one and the last row because the sequence number is maxed at six. Image and query are below.
select ppd.sarappd_seq_no, ppd.sarappd_term_code_entry, ppd.sarappd_apdc_code,
dap. dap.saradap_term_code_entry,
spri.spriden_id,
t.sgbstdn_astd_code, t.*
from sgbstdn t
left join spriden spri on t.sgbstdn_pidm = spri.spriden_pidm
left join saradap dap on spri.spriden_pidm = dap.saradap_pidm
join sarappd ppd on dap.saradap_pidm = ppd.sarappd_pidm
where t.sgbstdn_astd_code not in ('AS', 'DS', 'WD', 'SU', 'LA')
and t.sgbstdn_stst_code = 'AS'
and spri.spriden_change_ind is null
and spri.spriden_id = '123456789'
and (ppd.sarappd_apdc_code = 25 or ppd.sarappd_apdc_code = 30
or ppd.sarappd_apdc_code =35)
and ppd.sarappd_term_code_entry = dap.saradap_term_code_entry
--where b.sarappd_term_code_entry = ppd.sarappd_term_code_entry)
order by ppd.sarappd_term_code_entry
I believe this is a simple "where this = ( select max() ) type of query but I've been trying some different things and nothing is working. I'm not getting the results I want. So with that said, any help on this would be greatly appreciated. Thanks in advance.

You could use ROW_NUMBER()
;WITH cte
AS
(select
ROW_NUMBER() OVER (PARTITION BY SARAPPD_TERM_CODE_ENTRY ORDER BY SARAPPD_SEQ_NO DESC) AS RN
ppd.sarappd_seq_no,
ppd.sarappd_term_code_entry,
ppd.sarappd_apdc_code,
dap. dap.saradap_term_code_entry,
spri.spriden_id,
t.sgbstdn_astd_code, t.*
from sgbstdn t
left join spriden spri on t.sgbstdn_pidm = spri.spriden_pidm
left join saradap dap on spri.spriden_pidm = dap.saradap_pidm
join sarappd ppd on dap.saradap_pidm = ppd.sarappd_pidm
where t.sgbstdn_astd_code not in ('AS', 'DS', 'WD', 'SU', 'LA')
and t.sgbstdn_stst_code = 'AS'
and spri.spriden_change_ind is null
and spri.spriden_id = '123456789'
and (ppd.sarappd_apdc_code = 25 or ppd.sarappd_apdc_code = 30
or ppd.sarappd_apdc_code =35)
and ppd.sarappd_term_code_entry = dap.saradap_term_code_entry
order by ppd.sarappd_term_code_entry) a
SELECT *
FROM cte WHERE rn = 1

Related

SQL Last DateTime value

I am trying to query some 'job' databases where I need the last datetime value for each operation_service in a given job. I also need to know whether the operation is complete or not.
SELECT Job.Job, Job_Operation.Operation_Service, Job_Operation.Sequence,
MAX(Job_Operation_Time.Last_Updated) AS 'Last_Updated', Job_Operation_Time.Operation_Complete
FROM Job_Operation
LEFT JOIN Job ON Job.Job = Job_Operation.Job
LEFT JOIN Job_Operation_Time ON Job_Operation_Time.Job_Operation = Job_Operation.Job_Operation
WHERE Job.Status = 'Active'
GROUP BY Job.Job, Job_Operation.Operation_Service, Job_Operation.Sequence, Job_Operation_Time.Last_Updated, Job_Operation_Time.Operation_Complete
ORDER BY Job.Job, Sequence
A snippet of some results here:
What I would like is a query that returns all the highlighted records but does not return the records with a red line through the job field. NULL values are possible for both Operation_Complete and Last_Updated.

use row_number()
with cte as
(
SELECT Job.Job, Job_Operation.Operation_Service, Job_Operation.Sequence,
(Job_Operation_Time.Last_Updated) AS 'Last_Updated', Job_Operation_Time.Operation_Complete
,row_number()over(partition by Job_Operation.Operation_Service order by Job_Operation_Time.Last_Updated desc) rn
FROM Job_Operation
LEFT JOIN Job ON Job.Job = Job_Operation.Job
LEFT JOIN Job_Operation_Time ON Job_Operation_Time.Job_Operation = Job_Operation.Job_Operation
WHERE Job.Status = 'Active'
) select * from cte where rn=1

Removing duplicates rows in left Joining query

SELECT Rec.[Reg_ID]
,Rec.[Reg_No]
,Rec.[Case_ID]
,Det.Deleted AS CaseDeleted
,[Status].[Status]
,Det.[Unit_Submission_Date] AS [Signature]
,TD.TargetDate AS [Target]
,TD.TargetID
FROM [dbo].[Regestrations] Rec
LEFT JOIN [dbo].[Reg_Details] Det ON Rec.Case_ID = Det.CaseID
LEFT JOIN [dbo].[lkpStatus] [Status] ON Rec.Status_ID = [Status].StatusID
LEFT JOIN TargetDate TD ON TD.RecommId = Rec.Reg_ID
WHERE (Det.MissionID = 50 AND [Status].[Status] = 1 AND Rec.Deleted = 0 AND Det.Deleted = 0)
GROUP BY Rec.[Reg_ID],Rec.[Reg_No],Rec.[Case_ID]
,[Status].[Status]
,Det.[Unit_Submission_Date]
,TD.TargetDate
,Det.Deleted
,TD.TargetID
ORDER BY TD.TargetID desc
I have the above query that is supposed to return rows with unique Rec.[Reg_No]. But joined table TargetDate can have duplicate Rec.[Reg_ID] and if thats the case i get duplicate Rec.[Reg_No] rows in my results.
Table TargetDate has a date time column so i want to eliminate the duplicate Rec.[Reg_No] by selecting 1 row with the latest date value from table TargetDate.
How do modify my Join condition or the query where clause to achive the above?

One way is to use a window function such as ROW_NUMBER() that will generate sequential number based on the specified partition. This generated number can then be used to get the latest row.
SELECT Reg_ID, Reg_No, Case_ID, CaseDeleted, [Status], Signature, [Target], TargetID
FROM
(
SELECT Rec.[Reg_ID]
,Rec.[Reg_No]
,Rec.[Case_ID]
,Det.Deleted AS CaseDeleted
,[Status].[Status]
,Det.[Unit_Submission_Date] AS [Signature]
,TD.TargetDate AS [Target]
,TD.TargetID
,RN = ROW_NUMBER() OVER (PARTITION BY Rec.[Reg_No] ORDER BY TD.TargetID DESC)
FROM [BOI].[dbo].[Regestrations] Rec
LEFT JOIN [dbo].[Reg_Details] Det ON Rec.Case_ID = Det.CaseID
LEFT JOIN [dbo].[lkpStatus] [Status] ON Rec.Status_ID = [Status].StatusID
LEFT JOIN TargetDate TD ON TD.RecommId = Rec.Reg_ID
WHERE (Det.MissionID = 50 AND [Status].[Status] = 1 AND Rec.Deleted = 0 AND Det.Deleted = 0)
) subQuery
WHERE RN = 1
ORDER BY TargetID desc

This query can work correctly if you remove TD.TargetDate from GROUP BY clause and compute what you really need in output - MAX(TD.TargetDate)
But preferable way it to avoid GROUP BY clause at all:
...
FROM [BOI].[dbo].[Regestrations] Rec
LEFT JOIN [dbo].[Reg_Details] Det ON Rec.Case_ID = Det.CaseID
LEFT JOIN [dbo].[lkpStatus] [Status] ON Rec.Status_ID = [Status].StatusID
OUTER APPLY(
SELECT TOP 1 td.TargetDate, td.TargetID
FROM TargetDate TD
WHERE TD.RecommId = Rec.Reg_ID
ORDER BY TD.TargetDate DESC
) td
...

You should first find latest TargetDate for each Reg_ID or RecommId. Then you can use your normal join with TargetDate table just this time with matching both the RecommId and TargetDate.
Try this Query:
SELECT Rec.[Reg_ID]
,Rec.[Reg_No]
,Rec.[Case_ID]
,Det.Deleted AS CaseDeleted
,[Status].[Status]
,Det.[Unit_Submission_Date] AS [Signature]
,TD.TargetDate AS [Target]
,TD.TargetID
FROM [BOI].[dbo].[Regestrations] Rec
LEFT JOIN [dbo].[Reg_Details] Det ON Rec.Case_ID = Det.CaseID
LEFT JOIN [dbo].[lkpStatus] [Status] ON Rec.Status_ID = [Status].StatusID
LEFT JOIN (SELECT RecommId, MAX(TargetDate) MaxTargetDate GROUP BY RecommId) TDWithLatestDate ON TDWithLatestDate.RecommId = Rec.Reg_ID
LEFT OUTER JOIN TargetDate ON TD.RecommId = TDWithLatestDate.RecommId AND TD.TargetDate = TDWithLatestDate.MaxTargetDate
WHERE (Det.MissionID = 50
AND [Status].[Status] = 1
AND Rec.Deleted = 0
AND Det.Deleted = 0
)
GROUP BY Rec.[Reg_ID]
,Rec.[Reg_No]
,Rec.[Case_ID]
,[Status].[Status]
,Det.[Unit_Submission_Date]
,TD.TargetDate
,Det.Deleted
,TD.TargetID
ORDER BY TD.TargetID desc
You can improve this query if you want to avoid tie when there more than one record fighting to be latest.

where statement execute before inner join

I'm trying to grab the first instance of each result with a sysAddress of less than 4. However my statement currently grabs the min(actionTime) result first before applying the where sysAddress < 4. I'm trying to have the input for the inner join as the where sysAddress < 4 however i cant seem to figure out how to do it.
Should i be nesting it all differently? I didnt want to create an additional layer of table joins. Is this possible? I'm a bit lost at all the answers ive found.
SELECT
tblHistoryObject.info,
tblHistory.actionTime,
tblHistoryUser.userID,
tblHistoryUser.firstName,
tblHistoryUser.surname,
tblHistory.eventID,
tblHistoryObject.objectID,
tblHistorySystem.sysAddress
FROM tblHistoryObject
JOIN tblHistory
ON (tblHistory.historyObjectID = tblHistoryObject.historyObjectID)
JOIN tblHistorySystem
ON (tblHistory.historySystemID = tblHistorySystem.historySystemID)
JOIN tblHistoryUser
ON (tblHistory.historyUserID = tblHistoryUser.historyUserID)
INNER JOIN (SELECT
MIN(actionTime) AS recent_date,
historyObjectID
FROM tblHistory
GROUP BY historyObjectID) AS t2
ON t2.historyObjectID = tblHistoryObject.historyObjectID
AND tblHistory.actionTime = t2.recent_date
WHERE sysAddress < 4
ORDER BY actionTime ASC

WITH
all_action_times AS
(
SELECT
tblHistoryObject.info,
tblHistory.actionTime,
tblHistoryUser.userID,
tblHistoryUser.firstName,
tblHistoryUser.surname,
tblHistory.eventID,
tblHistoryObject.objectID,
tblHistorySystem.sysAddress,
ROW_NUMBER() OVER (PARTITION BY tblHistoryObject.historyObjectID
ORDER BY tblHistory.actionTime
)
AS historyObjectID_SeqByActionTime
FROM
tblHistoryObject
INNER JOIN
tblHistory
ON tblHistory.historyObjectID = tblHistoryObject.historyObjectID
INNER JOIN
tblHistorySystem
ON tblHistory.historySystemID = tblHistorySystem.historySystemID
INNER JOIN
tblHistoryUser
ON tblHistory.historyUserID = tblHistoryUser.historyUserID
WHERE
tblHistorySystem.sysAddress < 4
)
SELECT
*
FROM
all_action_times
WHERE
historyObjectID_SeqByActionTime = 1
ORDER BY
actionTime ASC
This does exactly what your original query did, without trying to filter by action_time.
Then it appends a new column, using ROW_NUMBER() to generate sequences from 1 for each individual tblHistoryObject.historyObjectID. Then it takes only the rows where this sequence value is 1 (the first row per historyObjectID, when sorted in action_time order).

Should a subquery on a join use tables from an outer query in the where clause?

I need to add a subquery to a join, because one payment can have more than one allotment, so I only need to account for the first match (where rownum = 1).
However, I'm not sure if adding pmt from the outer query to the subquery on the allotment join is best.
Should I be doing this differently in the event of performance hits, etc.. ?
SELECT
pmt.payment_uid,
alt.allotment_uid,
FROM
payment pmt
/* HERE: is the reference to pmt.pay_key and pmt.client_id
incorrect in the below subquery? */
INNER JOIN allotment alc ON alt.allotment_uid = (
SELECT
allotment_uid
FROM
allotment
WHERE
pay_key = pmt.pay_key
AND
pay_code = 'xyz'
AND
deleted = 'N'
AND
client_id = pmt.client_id
AND
ROWNUM = 1
)
WHERE
AND
pmt.deleted = 'N'
AND
pmt.date_paid >= TO_DATE('2017-07-01')
AND
pmt.date_paid < TO_DATE('2017-10-01') + 1;

It's difficult to identify the performance issue in your query without seeing an explain plan output. You query does seem to do an additional SELECT on the allotment for every record from the main query.
Here is a version which doesn't use correlated sub query. Obviously I haven't been able to test it. It does a simple join in and then filters all records except one of the allotments. Hope this helps.
WITH v_payment
AS
(
SELECT
pmt.payment_uid,
alt.allotment_uid,
ROW_NUMBER () OVER(PARTITION BY allotment_id) r_num
FROM
payment pmt JOIN allotment alt
ON (pmt.pay_key = alt.pay_key AND
pmt.client_id = alt.client_id)
WHERE pmt.deleted = 'N' AND
pmt.date_paid >= TO_DATE('2017-07-01') AND
pmt.date_paid < TO_DATE('2017-10-01') + 1 AND
alt.pay_code = 'xyz' AND
alt.deleted = 'N'
)
SELECT payment_uid,
allotment_uid
FROM v_payment
WHERE r_num = 1;
Let's know how this performs!

You can phrase the query that way. I would be more likely to do:
SELECT . . .
FROM payment p INNER JOIN
(SELECT a.*,
ROW_NUMBER() OVER (PARTITION BY pay_key, client_id
ORDER BY allotment_uid
) as seqnum
FROM allotment a
WHERE pay_code = 'xyz' AND deleted = 'N'
) a
ON a.pay_key = p.pay_key AND a.client_id = p.client_id AND
seqnum = 1
WHERE p.deleted = 'N' AND
p.date_paid >= DATE '2017-07-01' AND
p.date_paid < (DATE '2017-10-01') + 1;

Distinct, count, group by query madness

I am trying to return a count of tests taken per term. I can get the count to return, but I can't get it grouped by term.
I've tried everything and the closest I get is grouping by term but then my count only = 1, which isn't right.
Here is what I have now. It just returns a count, how do I group it by term_id?
SELECT COUNT(*)
FROM (SELECT DISTINCT ON(student_id, test_event_id, terf.term_id) student_id
FROM report.test_event_result_fact terf
JOIN report.growth_measurement_window gw on gw.term_id = terf.term_id
JOIN report.term t on t.term_id = terf.term_id
JOIN report.test tt on tt.test_id = terf.test_id
WHERE terf.partner_id = 98
AND growth_event_yn = 't'
AND gw.test_window_complete_yn = 't'
AND gw.growth_window_type = 'DISTRICT'
AND tt.test_type_description = 'SURVEY_WITH_GOALS') as TestEvents

Without knowing more about your setup, that's my best bet:
select term_id, count(*) AS count_per_term
from (
select Distinct on (student_id, test_event_id, terf.term_id)
terf.term_id, student_id
from report.test_event_result_fact terf
join report.growth_measurement_window gw using (term_id)
join report.term t using (term_id)
join report.test tt using (term_id)
where terf.partner_id = 98
and growth_event_yn = 't'
and gw.test_window_complete_yn = 't'
and gw.growth_window_type = 'DISTRICT'
and tt.test_type_description = 'SURVEY_WITH_GOALS') as TestEvents
group by 1;

We Keep Coding

sql objective-c vba vb.net react-native apache vue.js tensorflow api pandas