How to run Row_Number() with filtering - sql

I have a table with multiple rows for each 'Case Number'. I want to pick one row for each Case Number and join this back to another table maintaining a one-to-one relationship.
The conditions to pick this row are:
1) First of all, filter out all rows for each Case Number that have Stage = 'Cancelled'
2) If you find Stage = 'In Progress' or 'Paused', pick that row. (Only one of these two can be present for a Case Number)
3) If not (2), then pick Stage = 'Completed' but for the latest 'Stop Time'. (This is where I thought we might have to use ROW_NUMBER())
I've already created a query to push in row numbers and pick up one row based on the latest 'Stop time' but I'm not able to figure out how to add the above filters and if-else conditions in there.
SELECT [Case Number],
ROW_NUMBER ( )
OVER ( PARTITION BY [Case Number] order by [Stop time] desc ) idx
,[Stage]
,[Time left]
,[SLA definition]
,[Elapsed time]
,[Elapsed percentage]
,[Start time]
,[Stop time]
,[Has breached]
,[Breach time]
,[Updated]
,[Updated by]
,[Created]
,[Created by]
FROM ( select * from [SLA_Data] where Stage != 'Cancelled' )v1

It's a bit hard to tell from your question but something like this is my interpretation (I'm not able to access SQL, nor did you provide enough test data so cannot test it)
select * from
(
SELECT [Task],...,...,
ROW_NUMBER ( )
OVER ( PARTITION BY Task order by
case
when Stage in('In Progress' ,'Paused') then 1
when Stage='Completed' then 2 end,
[Stop time] desc ) idx
FROM [SLA_Data]
WHERE Stage != 'Cancelled'
) v1
where idx=1
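To check the ordering idea without a SQL Server instance, here is a minimal sketch that runs the same pattern through Python's built-in sqlite3 module (SQLite 3.25 or later is needed for window functions). The sample rows are made up for illustration, and the table is cut down to three columns; only the column names come from the question.

```python
import sqlite3

# In-memory database with a cut-down, made-up version of the SLA table.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE SLA_Data ([Case Number] TEXT, Stage TEXT, [Stop time] TEXT);
INSERT INTO SLA_Data VALUES
  ('C1', 'Completed',   '2020-01-01'),
  ('C1', 'In Progress', NULL),
  ('C1', 'Cancelled',   '2020-01-05'),
  ('C2', 'Completed',   '2020-02-01'),
  ('C2', 'Completed',   '2020-02-09'),
  ('C3', 'Cancelled',   '2020-03-01');
""")

query = """
SELECT [Case Number], Stage, [Stop time]
FROM (
    SELECT *,
           ROW_NUMBER() OVER (
               PARTITION BY [Case Number]
               -- active stages sort first, then the latest stop time
               ORDER BY CASE WHEN Stage IN ('In Progress', 'Paused') THEN 1 ELSE 2 END,
                        [Stop time] DESC
           ) AS idx
    FROM SLA_Data
    WHERE Stage <> 'Cancelled'   -- rule 1: drop cancelled rows before ranking
) ranked
WHERE idx = 1
ORDER BY [Case Number]
"""
picked = conn.execute(query).fetchall()
for row in picked:
    print(row)
```

Case C1 keeps its 'In Progress' row, C2 keeps the 'Completed' row with the later stop time, and C3 drops out because its only row is cancelled.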

The code below should work, assuming Case_Number is present in the table below; if it is not, that table must first be joined with the Case_Number table.
create view [dbo].[SLA_View] as select * from (
SELECT * FROM
(
SELECT [Task]
,[Stage]
,[Time left]
,[SLA definition]
,[Elapsed time]
,[Elapsed percentage]
,[Start time]
,[Stop time]
,[Has breached]
,[Breach time]
,[Updated]
,[Updated by]
,[Created]
,[Created by]
FROM
(
/* GETS SINGLE CASE FOR MULTIPLE STAGES */
SELECT *,ROW_NUMBER ( ) OVER ( PARTITION BY Task order by [Stage] desc) RNK
FROM [SLA_Data] WHERE [CASE_NUMBER] IN
(
/* GETS DISTINCT CASE NUMBER WITH STAGE = 'PAUSED' OR 'IN PROGRESS' */
SELECT DISTINCT [CASE_NUMBER]
FROM [SLA_Data]
WHERE [Stage] != 'Cancelled'
AND [Stage] IN ('Paused','In Progress')
GROUP BY [CASE_NUMBER]
HAVING COUNT(*) >= 1
)
)Y
WHERE RNK = 1
)Z
UNION
SELECT [Task]
,[Stage]
,[Time left]
,[SLA definition]
,[Elapsed time]
,[Elapsed percentage]
,[Start time]
,[Stop time]
,[Has breached]
,[Breach time]
,[Updated]
,[Updated by]
,[Created]
,[Created by]
FROM
(
SELECT *, ROW_NUMBER ( ) OVER ( PARTITION BY Task order by [Stop time] desc) idx
FROM [SLA_Data]
WHERE [Stage] != 'Cancelled'
AND [CASE_NUMBER] NOT IN (
SELECT DISTINCT [CASE_NUMBER]
FROM [SLA_Data]
WHERE [Stage] != 'Cancelled'
AND [Stage] IN ('Paused','In Progress')
GROUP BY [CASE_NUMBER]
HAVING COUNT(*) >= 1
)
)v1 where idx = 1
) combined

Related

Query to show each day data when we have records for 2 distinct dates

I have a SQL Table as below:
DEVICE ID    STATUS      Created Date
Device 1     ACTIVE      1/10/2022
Device 1     INACTIVE    5/10/2022
Now I need to write a query to show the status of every day. My Output should be as below:
Device 1 - 1/10/2022 - ACTIVE
Device 1 - 2/10/2022 - ACTIVE
Device 1 - 3/10/2022 - ACTIVE
Device 1 - 4/10/2022 - ACTIVE
Device 1 - 5/10/2022 - INACTIVE
I have tried a few queries, but none of them gives me the correct result, so I would appreciate some help with this. Thanks in advance.
You may try a recursive CTE as the following:
WITH CTE AS
(
SELECT DEVICE_ID, STATUS, Created_Date
FROM table_name
UNION ALL
SELECT C.DEVICE_ID, C.STATUS, DATEADD(DAY, 1, C.Created_Date)
FROM CTE C
WHERE DATEADD(DAY, 1, C.Created_Date) NOT IN (SELECT Created_Date FROM table_name T WHERE T.DEVICE_ID=C.DEVICE_ID AND T.STATUS='INACTIVE')
AND DATEADD(DAY, 1, C.Created_Date)<=GETDATE()
AND STATUS='ACTIVE'
)
SELECT DEVICE_ID, STATUS, Created_Date
FROM CTE
ORDER BY DEVICE_ID, Created_Date
See a demo with extended data sample.
1) Create a table with every day as a record (the date seed).
2) For each row in YourStatusTable, find the later status dates for the same device with a self join and rank them with row_number, so that rn = 1 is the next status change.
3) Join the status rows to the date seed, keeping only rn = 1.
with YourStatusTable as (
select 'Device 1' [Device ID], 'ACTIVE' [Status], cast('01-Oct-2022' as date) [Created Date]
union all select 'Device 1' [Device ID], 'INACTIVE' [Status], '05-Oct-2022'
union all select 'Device 1' [Device ID], 'ACTIVE' [Status], '08-Oct-2022'
union all select 'Device 1' [Device ID], 'INACTIVE' [Status], '12-Oct-2022'
union all select 'Device 2' [Device ID], 'ACTIVE' [Status], '20-Oct-2022'
union all select 'Device 2' [Device ID], 'INACTIVE' [Status], '25-Oct-2022'
),
seed as (
select null as n
union all select null
),
dates as (
select dateadd(day,row_number() OVER ( ORDER BY a.n )-1, '-Sep-2022') date_value
from seed a,
seed b,
seed c,
seed d,
seed e,
seed f
),
status_boundaries as (
select
a.[Device ID],
a.[Status],
a.[Created Date],
b.[Created Date] [next_status_date],
row_number() over ( partition by a.[Device ID],a.[Created Date] order by b.[Created Date]) rn
from YourStatusTable a
left join YourStatusTable b on a.[Device ID] = b.[Device ID] and a.[Created Date] < b.[Created Date]
)
select *
from dates
inner join status_boundaries on date_value >= [Created Date]
where ( date_value < next_status_date or next_status_date is null )
and rn = 1
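For anyone who wants to try the calendar-join idea quickly, here is a rough sketch using Python's built-in sqlite3 module rather than SQL Server. It is not the query above: it swaps the self join plus row_number for LEAD() to find the next status date, builds the calendar with a recursive CTE, and stores dates as ISO strings. The table and column names (device_status, device_id, created_date) are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE device_status (device_id TEXT, status TEXT, created_date TEXT);
INSERT INTO device_status VALUES
  ('Device 1', 'ACTIVE',   '2022-10-01'),
  ('Device 1', 'INACTIVE', '2022-10-05');
""")

query = """
WITH RECURSIVE calendar(day) AS (
    -- one row per day between the first and last status change
    SELECT MIN(created_date) FROM device_status
    UNION ALL
    SELECT date(day, '+1 day') FROM calendar
    WHERE day < (SELECT MAX(created_date) FROM device_status)
),
boundaries AS (
    -- each status is valid from created_date until the next change
    SELECT device_id, status, created_date,
           LEAD(created_date) OVER (
               PARTITION BY device_id ORDER BY created_date
           ) AS next_date
    FROM device_status
)
SELECT b.device_id, c.day, b.status
FROM boundaries b
JOIN calendar c
  ON c.day >= b.created_date
 AND (c.day < b.next_date OR b.next_date IS NULL)
ORDER BY b.device_id, c.day
"""
daily = conn.execute(query).fetchall()
for row in daily:
    print(row)
```

With the two sample rows from the question this gives ACTIVE for 1 to 4 October and INACTIVE for 5 October. Note that the last status of a device runs to the end of the calendar, so the calendar's upper bound decides how far it is carried forward.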

Return only the highest row number for a partitioned column

I'm trying to partition a list of submitted machining jobs by the date they were submitted and return a maximum row number for each partition.
I have tried using Group By, but I want to retain all rows in the result. Partition By does what I need, but I want to display all rows except the maximum row number as blank.
SELECT [Created Date]
,row_number() over(partition by format([Created Date],'d','en-gb') order by [Created Date] desc) AS [Jobs Submitted That Day]
FROM [UK_App].[dbo].[rvxDevMCRequests]
order by [Created Date] desc
Results:
Created Date Jobs Submitted That Day
31/12/2014 1
31/10/2019 1
31/10/2019 2
31/10/2019 3
31/10/2018 1
31/10/2018 2
The order by function is not working correctly, and I can't figure out how to display only the highest row number. I would like it to output this:
Created Date Jobs Submitted That Day
31/12/2014 1
31/10/2018
31/10/2018 2
31/10/2019
31/10/2019
31/10/2019 3
Not an elegant solution:
SELECT [Created Date]
, case when row_number() over(partition by format([Created Date],'d','en-gb') order by [Created Date] desc)
= count(*) over(partition by format([Created Date],'d','en-gb'))
then count(*) over(partition by format([Created Date],'d','en-gb'))
else null end AS [Jobs Submitted That Day]
FROM [UK_App].[dbo].[rvxDevMCRequests]
order by [Created Date] desc
Try this one:
SELECT
a.CreatedDate,
CASE
WHEN y.rnum IS NULL
THEN ''
ELSE
a.JobsSubmitted
END AS JobsSubmitted
FROM
input a
LEFT OUTER JOIN
(
SELECT
x.CreatedDate, x.JobsSubmitted, x.rnum
FROM (
SELECT
a.*,
ROW_NUMBER() OVER(PARTITION BY a.CreatedDate ORDER BY a.JobsSubmitted DESC) AS rnum
FROM
input a
) x
WHERE
x.rnum = 1
) y
ON (
a.CreatedDate = y.CreatedDate
AND a.JobsSubmitted = y.JobsSubmitted
);
SQL Fiddle link for demo: http://www.sqlfiddle.com/#!18/511abf/17
Why are you using format()? There is no reason to convert a date to a string, especially in this case.
One significant issue is that the column [Created Date] has duplicates. When you order by that column, the duplicates can be in any order. In fact, two different order bys on the column in the same query can result in different ordering.
The solution to that is to capture the ordering once in a subquery and then use that:
select [Created Date],
(case when cnt = seqnum then seqnum
end) as [Jobs Submitted That Day]
from (select r.*,
row_number() over (partition by [Created Date] order by [Created Date] desc) as seqnum,
count(*) over (partition by [Created Date]) as cnt
from [UK_App].[dbo].[rvxDevMCRequests]
) r
order by [Created Date] desc, seqnum;
In the above query, seqnum captures the ordering, so it is used for the outer order by.
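Here is a small runnable sketch of the same seqnum/cnt pattern, using Python's built-in sqlite3 module instead of SQL Server. It assumes a unique id column (not in the original table as posted) to break ties between rows with the same date, which is what makes the ordering stable; the table name requests and the sample dates are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE requests (id INTEGER PRIMARY KEY, created_date TEXT);
INSERT INTO requests (created_date) VALUES
  ('2014-12-31'),
  ('2018-10-31'), ('2018-10-31'),
  ('2019-10-31'), ('2019-10-31'), ('2019-10-31');
""")

query = """
SELECT created_date,
       -- show the count only on the last row of each day, NULL elsewhere
       CASE WHEN seqnum = cnt THEN cnt END AS jobs_submitted
FROM (
    SELECT r.*,
           ROW_NUMBER() OVER (PARTITION BY created_date ORDER BY id) AS seqnum,
           COUNT(*)     OVER (PARTITION BY created_date)             AS cnt
    FROM requests r
) numbered
ORDER BY created_date, seqnum
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Because seqnum is computed once and reused in the outer ORDER BY, the row carrying the count is always the last one listed for its date.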

Perform recursive calculation based on previous days calculation

I have a table dbo.TrueMarginCalc on which I need to calculate a weighted cost based on the date. I also have this spreadsheet which illustrates what I need to do. The table in the database is just like this.
In the image below this table is sorted by [Date] ASC and I calculate the first instance as so:
I need to then recursively calculate this again using the prior day's "Weighted True Cost" so on and so on:
My code is as such:
CREATE TABLE dbo.TrueMarginCalc
(
[WHS] varchar(3),
[PRODUCT] varchar(5),
[TRANS DATE] DATE,
[RECEIPTS] INT,
[TRUE COST] NUMERIC(18,8),
[RUNNING_SALES] INT,
)
INSERT INTO dbo.TrueMarginCalc
SELECT
'350','54710','2018-09-06',42,0.7128,52 UNION ALL SELECT
'350','54710','2018-09-07',42,0.7154,61 UNION ALL SELECT
'350','54710','2018-09-08',42,0.715 ,42 UNION ALL SELECT
'350','54710','2018-09-10',0 ,0 ,37 UNION ALL SELECT
'350','54710','2018-09-11',42,0.7124,44 UNION ALL SELECT
'350','54710','2018-09-12',42,0.7125,42 UNION ALL SELECT
'350','54710','2018-09-13',42,0.7147,77 UNION ALL SELECT
'350','54710','2018-09-14',0 ,0 ,35 UNION ALL SELECT
'350','54710','2018-09-15',42,0.7123,47 UNION ALL SELECT
'350','54710','2018-09-17',0 ,0 ,22 UNION ALL SELECT
'350','54710','2018-09-18',42,0.7183,45 UNION ALL SELECT
'350','54710','2018-09-19',42,0.71 ,42 UNION ALL SELECT
'350','54710','2018-09-20',42,0.7124,56 UNION ALL SELECT
'350','54710','2018-09-21',0 ,0 ,10 UNION ALL SELECT
'350','54710','2018-09-22',42,0.7124,43 UNION ALL SELECT
'350','54710','2018-09-24',0 ,0 ,0 UNION ALL SELECT
'350','54710','2018-09-25',42,0.71 ,41 UNION ALL SELECT
'350','54710','2018-09-26',42,0.71 ,54
select *, (Running_Sales*[TRUE COST])/NULLIF(Running_Sales,0) As [Weighted True Cost]
FROM dbo.TrueMarginCalc order by [TRANS DATE]
All this does is calculate that first day's Weighted True Cost. Does this require some sort of cursor or recursion to perform this in T-SQL?
;
WITH CTE AS (
select ROW_NUMBER() OVER (PARTITION BY WHS, PRODUCT ORDER BY [TRANS DATE]) AS RNK, WHS,PRODUCT,[TRANS DATE], [RECEIPTS], [TRUE COST], RUNNING_SALES
, CAST((Running_Sales*[TRUE COST])/NULLIF(Running_Sales,0) AS MONEY) As [Weighted True Cost] FROM dbo.TrueMarginCalc
UNION ALL
select ROW_NUMBER() OVER (PARTITION BY WHS, PRODUCT ORDER BY [TRANS DATE]) AS RNK, WHS,PRODUCT,[TRANS DATE], [RECEIPTS], [TRUE COST], RUNNING_SALES
, CAST((Receipts*[TRUE COST])+([Weighted True Cost] * Running_Sales)/(Receipts+Running_Sales) AS MONEY) AS [Weighted True Cost] FROM CTE
WHERE RNK = RNK - 1
)
SELECT * FROM CTE
ORDER BY [TRANS DATE] asc
The above code is theoretically what I need but I am most certainly using the terminator incorrectly as it spits out the exact same calculation as the anchor of the recursive CTE.
You could use a recursive cte for this, something like:
WITH cte AS (
-- numbering is required for rcte
SELECT *, ROW_NUMBER() OVER (PARTITION BY WHS, PRODUCT ORDER BY [TRANS DATE]) AS rn
FROM dbo.TrueMarginCalc
), rcte AS (
-- base row for each partition
SELECT *, CAST([TRUE COST] * RUNNING_SALES / NULLIF([TRUE COST], 0) AS DECIMAL(18, 8)) AS WTC
FROM cte AS base
WHERE rn = 1
UNION ALL
-- next row for each partition (NULLIF avoids a divide-by-zero on days with a zero cost)
SELECT curr.*, CAST(prev.WTC * curr.RUNNING_SALES / NULLIF(curr.[TRUE COST], 0) AS DECIMAL(18, 8))
FROM cte AS curr
INNER JOIN rcte AS prev ON curr.WHS = prev.WHS AND curr.PRODUCT = prev.PRODUCT AND curr.rn = prev.rn + 1
)
SELECT *
FROM rcte
Unfortunately the formula is incomplete but the above query shows you how to access columns from current row and previous iteration.
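To see the "current row plus previous iteration" mechanics end to end, here is a hedged sketch run through Python's built-in sqlite3 module. The formula is an assumption, taken from the asker's own attempt with the parentheses I think were intended: (receipts * true_cost + previous_wtc * running_sales) / (receipts + running_sales), seeded with the first day's true cost. The table name margin, the snake_case columns, and the three sample rows are made up so the arithmetic is easy to follow.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE margin (
    whs TEXT, product TEXT, trans_date TEXT,
    receipts INTEGER, true_cost REAL, running_sales INTEGER
);
INSERT INTO margin VALUES
  ('350', '54710', '2018-09-06', 10, 2.0, 10),
  ('350', '54710', '2018-09-07', 10, 4.0, 10),
  ('350', '54710', '2018-09-08',  0, 0.0,  5);
""")

query = """
WITH RECURSIVE numbered AS (
    -- consecutive row numbers so each row can find its predecessor
    SELECT *, ROW_NUMBER() OVER (
                 PARTITION BY whs, product ORDER BY trans_date) AS rn
    FROM margin
),
rcte AS (
    -- anchor: first day of each partition starts from its own cost
    SELECT whs, product, trans_date, rn, true_cost AS wtc
    FROM numbered
    WHERE rn = 1
    UNION ALL
    -- step: combine the current row with the previous day's result
    SELECT cur.whs, cur.product, cur.trans_date, cur.rn,
           (cur.receipts * cur.true_cost + prev.wtc * cur.running_sales)
           / (cur.receipts + cur.running_sales)
    FROM numbered cur
    JOIN rcte prev
      ON cur.whs = prev.whs
     AND cur.product = prev.product
     AND cur.rn = prev.rn + 1
)
SELECT trans_date, wtc FROM rcte ORDER BY trans_date
"""
weighted = conn.execute(query).fetchall()
for row in weighted:
    print(row)
```

The join on rn = prev.rn + 1 is what terminates the recursion: once a partition runs out of rows, the step returns nothing. A day where receipts and running sales are both zero would need its own handling, since the denominator is then zero.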
I'm not 100% about the formula, but you can try this.
SELECT WHS, PRODUCT, [TRANS DATE], RECEIPTS, [TRUE COST], [RUNNING_SALES]
, (
CASE [TRUE COST]
WHEN 0.0 THEN LAG([True Cost]) OVER(PARTITION BY WHS, PRODUCT ORDER BY [TRANS DATE])
ELSE (COALESCE( LAG([True Cost]) OVER(PARTITION BY WHS, PRODUCT ORDER BY [TRANS DATE]), [TRUE COST] ) + [True Cost]) / 2
END
) AS [WeightedTrueCost]
FROM dbo.TrueMarginCalc

SQL IF type logic help requested

I have a SQL problem that I have been stuck on for days. So this is the context. I work for a company where employees have timesheets. Each timesheet has an ID but it is not unique because it is possible for an employee to have 2 timesheets for the same ID. The difference is that normally when you submit the sheet your status is ‘Posted’. But, sometimes people screw up their entries and it has to get re-submitted with changes. Therefore, the status ‘Adjusted’.
The logic I need is the following
-Where timesheet ID’s only have one value (count=1) always use ‘Posted’ status. If there is only one value but it is not ‘Posted’ return an error string saying ‘Error’.
-where timesheet IDs have more than one value and BOTH ‘Posted’ and ‘Adjusted’ show up as status always default to ‘Adjusted’. BOTH posted and adjusted must be present in this.
I have tried case and subquery but no luck. I also have a column ‘timesheet post date’ and logic is earliest date is always posted and later date is ‘adjusted’, but in some cases the posting dates are identical.
So, as you can see, I need to look at the duplicate count in one column and then, if that count is > 1, choose the value from another column.
SELECT t1.[Resource NUID]
,t1.[Timesheet ID]
,t1.[Timesheet Start Date]
,t1.[Timesheet End Date]
,t1.[Timesheet Posted Date]
,t1.[Timesheet Status]
,t1.[RunSourceID]
,t1.[SpanStartDate]
,t1.[SpanEndDate]
FROM [TIME_DW].[dbo].[Timecard_Timesheets] as t1, [TIME_DW].[dbo]. [Timecard_Timesheets] as t2
where t1.[Timesheet ID]=t2.[Timesheet ID]
and t1.[Resource NUID]='e066308' and t1.[Timesheet Status]<>'Open' and t1.[Timesheet Status]<>'Submitted'
group by
t1.[Resource NUID]
,t1.[Timesheet ID]
,t1.[Timesheet Start Date]
,t1.[Timesheet End Date]
,t1.[Timesheet Posted Date]
,t1.[Timesheet Status]
,t1.[RunSourceID]
,t1.[SpanStartDate]
,t1.[SpanEndDate]
order by t1.[Timesheet Start Date] asc
this is an example of an actual record that has two statuses
thanks
I am expecting logic like this:
select timesheet_id,
(case when count(*) = 1 and min(status) = 'Posted' then min(status)
when count(*) = 1 then 'Error'
when min(status) = 'Adjusted' and max(status) = 'Posted' then 'Adjusted'
else NULL -- this case is not covered in the description
end) as new_status
from [TIME_DW].[dbo].[Timecard_Timesheets]
group by timesheet_id;
I don't understand what all the other columns are doing in the code in the question.
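A quick way to sanity-check this kind of conditional aggregation is to run it on a handful of rows. The sketch below does that with Python's built-in sqlite3 module; the table name timesheets and the four sample rows are made up, and it relies on 'Adjusted' sorting before 'Posted' alphabetically, so MIN/MAX can detect that both are present.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE timesheets (timesheet_id INTEGER, status TEXT);
INSERT INTO timesheets VALUES
  (1, 'Posted'),
  (2, 'Posted'),
  (2, 'Adjusted'),
  (3, 'Adjusted');
""")

query = """
SELECT timesheet_id,
       CASE WHEN COUNT(*) = 1 AND MIN(status) = 'Posted' THEN 'Posted'
            WHEN COUNT(*) = 1 THEN 'Error'           -- single row, not Posted
            WHEN MIN(status) = 'Adjusted'
             AND MAX(status) = 'Posted' THEN 'Adjusted'  -- both are present
       END AS new_status                             -- NULL: case not described
FROM timesheets
GROUP BY timesheet_id
ORDER BY timesheet_id
"""
statuses = conn.execute(query).fetchall()
for row in statuses:
    print(row)
```

Timesheet 1 stays 'Posted', timesheet 2 resolves to 'Adjusted', and timesheet 3 is flagged as 'Error' because its only row is not 'Posted'.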
This should get you going in the right direction:
First we count rows by TimeSheet Id (CN) and we assign a row_number (RN) ordered by "Adjusted" records first, then everything else (you might want to add an adjustment date as a 2nd order by to get the most recent one first).
Then we add an error status if the first and only row is not a Status of "Posted".
Finally we select out only the rows WHERE RN=1
DECLARE @TimeSheet TABLE (Id INT, Status VARCHAR(15))
INSERT INTO @TimeSheet (Id,Status)
VALUES
(1,'Posted'),
(2,'Posted'),
(2,'Adjusted'),
(3,'Adjusted')
;WITH X AS
(
SELECT COUNT(*) OVER(PARTITION BY Id) AS CN,
ROW_NUMBER() OVER(PARTITION BY Id ORDER BY CASE WHEN Status='Adjusted' THEN 0 ELSE 1 END) AS RN,
*
FROM @TimeSheet
), Y AS
(
SELECT CASE WHEN CN=1 AND RN=1 AND Status<>'Posted' THEN 'Error'
ELSE ''
END AS Error,
*
FROM X
)
SELECT *
FROM Y
WHERE RN=1
From your code, I think you were trying to do this:
SELECT
ts.[Resource NUID]
, ts.[Timesheet ID]
, ts.[Timesheet Start Date]
, ts.[Timesheet End Date]
, ts.[Timesheet Posted Date]
, ts.[Timesheet Status]
, ts.[RunSourceID]
, ts.[SpanStartDate]
, ts.[SpanEndDate]
FROM
[TIME_DW].[dbo].[Timecard_Timesheets] as ts
JOIN (
SELECT *
, CASE
WHEN TimeSheetCount > 1 AND [Timesheet Status] <> 'Posted' THEN 'Adjusted'
WHEN TimeSheetCount = 1 AND [Timesheet Status] <> 'Posted' THEN 'Error'
ELSE 'Posted'
END NewStatus
FROM (
SELECT *
, COUNT(*) OVER(PARTITION BY t1.[Timesheet ID]) TimeSheetCount
, ROW_NUMBER() OVER(ORDER BY t1.[Timesheet Start Date]) RN
FROM
[TIME_DW].[dbo].[Timecard_Timesheets] as t1
) D
) t2 ON ts.[Timesheet ID] = t2.[Timesheet ID]
WHERE
ts.[Resource NUID] = 'e066308'
AND ts.[Timesheet Status] <> 'Open'
AND ts.[Timesheet Status] <> 'Submitted'

How to group some records by name and pivot some values

I am working on a SQL query to group the results of a view by Id, so that there is only one row per Id with a maximum of three pivoted results, while keeping some columns static: TestCaseId, TestName, Test Case Num, Owner.
This is the query I created to get the desired output, but it is not working as expected: MAX always retrieves the maximum value, so I get only one row, but the pivoted values are repeated to the right.
SELECT DISTINCT TBL1.[TestName], TBL1.[Test Case Num], TBL1.[Owner], MAX(TBL1.[Browser]) as 'Column1', MAX(TBL1.[Run Date]) as 'Column2', MAX(TBL1.[Status]) as 'Column3', MAX(TBL1.[Duration]) as 'Column4', MAX(TBL1.[ErrorMsg]) as 'Column5', MAX(TBL2.[Browser]) as 'Column6', MAX(TBL2.[Run Date]) as 'Column7', MAX(TBL2.[Status]) as 'Column8', MAX(TBL2.[Duration]) as 'Column9', MAX(TBL2.[ErrorMsg]) as 'Column10', MAX(TBL3.[Browser]) as 'Column11' , MAX(TBL3.[Run Date]) as 'Column12', MAX(TBL3.[Status]) as 'Column13', MAX(TBL3.[Duration]) as 'Column14', MAX(TBL3.[ErrorMsg]) as 'Column15'
FROM (SELECT DISTINCT T1.[TestCaseId], T1.[TestName], T1.[Test Case Num], T1.[Owner], T1.[Browser], T1.[Run Date], T1.[Status], T1.[Duration], T1.[ErrorMsg]
FROM [TestRunner].[dbo].RunsRawResults T1) TBL1
cross apply (SELECT DISTINCT T2.[TestCaseId], T2.[Browser], T2.[Run Date], T2.[Status], T2.[Duration], T2.[ErrorMsg]
FROM [TestRunner].[dbo].RunsRawResults T2
WHERE T2.[TestCaseId] = TBL1.[TestCaseId] AND T2.[Run Date] TBL1.[Run Date]) TBL2
cross apply (SELECT DISTINCT T3.[TestCaseId], T3.[Browser], T3.[Run Date], T3.[Status], T3.[Duration], T3.[ErrorMsg]
FROM [TestRunner].[dbo].RunsRawResults T3
WHERE T3.[TestCaseId] = TBL2.[TestCaseId] AND T3.[Run Date] TBL2.[Run Date] AND T3.[Run Date] TBL1.[Run Date]) TBL3
GROUP BY TBL1.[TestCaseId], TBL1.[TestName], TBL1.[Test Case Num], TBL1.[Owner]
Input -
Raw Data (Comes from the RunRawResults View)
Desired and Pivoted Output
Using a common table expression (cte) and row_number() we can simplify the identification and order of multiple run dates. This also lets us skip using distinct and group by.
Switching to outer apply lets us include results where there are fewer than 3 runs per TestCaseId.
;with cte as (
select *
, rn = row_number() over (
partition by TestCaseId
order by [Run Date]
)
from TestRunner.dbo.RunsRawResults
)
select
tbl1.TestName
, tbl1.[Test Case Num]
, tbl1.Owner
, Browser_tbl1 = tbl1.Browser
, [Run Date_tbl1] = tbl1.[Run Date]
, Status_tbl1 = tbl1.Status
, Duration_tbl1 = tbl1.Duration
, ErrorMsg_tbl1 = tbl1.ErrorMsg
, Browser_tbl2 = tbl2.Browser
, [Run Date_tbl2] = tbl2.[Run Date]
, Status_tbl2 = tbl2.Status
, Duration_tbl2 = tbl2.Duration
, ErrorMsg_tbl2 = tbl2.ErrorMsg
, Browser_tbl3 = tbl3.Browser
, [Run Date_tbl3] = tbl3.[Run Date]
, Status_tbl3 = tbl3.Status
, Duration_tbl3 = tbl3.Duration
, ErrorMsg_tbl3 = tbl3.ErrorMsg
from cte as tbl1
-- second run for the same test case, if there is one
outer apply (
select
i.Browser
, i.[Run Date]
, i.Status
, i.Duration
, i.ErrorMsg
from cte as i
where i.TestCaseId = tbl1.TestCaseId
and i.rn = 2
) as tbl2
-- third run for the same test case, if there is one
outer apply (
select
i.Browser
, i.[Run Date]
, i.Status
, i.Duration
, i.ErrorMsg
from cte as i
where i.TestCaseId = tbl1.TestCaseId
and i.rn = 3
) as tbl3
where tbl1.rn = 1
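Here is a cut-down, runnable sketch of the same row_number pivot, using Python's built-in sqlite3 module. SQLite has no OUTER APPLY, so the sketch uses LEFT JOIN on rn = 2 and rn = 3 in its place, which behaves the same for this lookup. The table name runs, the snake_case columns, and the sample rows are made up, and only the status column is pivoted to keep it short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE runs (test_case_id INTEGER, run_date TEXT, status TEXT);
INSERT INTO runs VALUES
  (1, '2017-01-01', 'Passed'),
  (1, '2017-01-02', 'Failed'),
  (1, '2017-01-03', 'Passed'),
  (2, '2017-01-01', 'Failed');
""")

query = """
WITH numbered AS (
    -- number the runs of each test case from oldest to newest
    SELECT *, ROW_NUMBER() OVER (
                 PARTITION BY test_case_id ORDER BY run_date) AS rn
    FROM runs
)
SELECT r1.test_case_id,
       r1.status AS status_1,
       r2.status AS status_2,
       r3.status AS status_3
FROM numbered r1
-- LEFT JOIN keeps test cases that have fewer than three runs
LEFT JOIN numbered r2 ON r2.test_case_id = r1.test_case_id AND r2.rn = 2
LEFT JOIN numbered r3 ON r3.test_case_id = r1.test_case_id AND r3.rn = 3
WHERE r1.rn = 1
ORDER BY r1.test_case_id
"""
pivoted = conn.execute(query).fetchall()
for row in pivoted:
    print(row)
```

Test case 1 fills all three status columns, while test case 2 has a single run and gets NULL in the second and third.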