SQL to ensure unique node names in adjacency list
I have an adjacency list that forms a hierarchy simulating a versioned file structure. The problem is that the incoming file names are not currently unique under the same parent, and they need to be. To make things slightly more interesting, a file may have several versions, and every version should keep the name given to the first version (note that all versions of a file share the same NodeID).
Adjacency List

ParentID  NodeID  VersionNum  FileName
--------  ------  ----------  --------------
-1        1       1           FirstFolder
1         2       1           SecondFolder
1         3       1           ThirdFolder
1         4       1           FirstDocument
1         4       2           FirstDocument
1         5       1           FirstDocument
1         5       2           FirstDocument
2         6       1           FirstDocument
2         6       2           FirstDocument
2         7       1           SecondDocument
3         8       1           SecondDocument
3         9       1           ThirdDocument
3         9       2           ThirdDocument
3         10      1           ThirdDocument
3         11      1           ThirdDocument
Targeted Result

ParentID  NodeID  VersionNum  FileName
--------  ------  ----------  ---------------
-1        1       1           FirstFolder
1         2       1           SecondFolder
1         3       1           ThirdFolder
1         4       1           FirstDocument
1         4       2           FirstDocument
1         5       1           FirstDocument_1
1         5       2           FirstDocument_1
2         6       1           FirstDocument
2         6       2           FirstDocument
2         7       1           SecondDocument
3         8       1           SecondDocument
3         9       1           ThirdDocument
3         9       2           ThirdDocument
3         10      1           ThirdDocument_1
3         11      1           ThirdDocument_2
*I should also note that the folder names are already guaranteed to be unique (they already exist, it is the documents that are incoming) and they only have 1 version.
CREATE TABLE #tmp_tree
(
    ParentID INT,
    NodeID INT,
    VersionNum INT,
    FileName VARCHAR(50)
);
INSERT INTO #tmp_tree (ParentID, NodeID, VersionNum, FileName)
VALUES (-1, 1, 1, 'FirstFolder'),
       (1, 2, 1, 'SecondFolder'),
       (1, 3, 1, 'ThirdFolder'),
       (1, 4, 1, 'FirstDocument'),
       (1, 4, 2, 'FirstDocument'),
       (1, 5, 1, 'FirstDocument'),
       (1, 5, 2, 'FirstDocument'),
       (2, 6, 1, 'FirstDocument'),
       (2, 6, 2, 'FirstDocument'),
       (2, 7, 1, 'SecondDocument'),
       (3, 8, 1, 'SecondDocument'),
       (3, 9, 1, 'ThirdDocument'),
       (3, 9, 2, 'ThirdDocument'),
       (3, 10, 1, 'ThirdDocument'),
       (3, 11, 1, 'ThirdDocument');
I really don't know how to approach this without resorting to a stored procedure. Adjacency lists scream CTEs to me, but that got me nowhere fast. GROUP BY loses the NodeID, so while I can find the names of the documents that need to be renamed, I don't know how to use that to select the second occurrence of the name (ordered by NodeID).
-- I don't see how this helps... but this finds the names that need to change.
select ParentID, FileName, VersionNum, count(*)
from #tmp_tree
group by ParentID, FileName, VersionNum
having VersionNum = 1 and count(*) > 1
order by FileName
I know how to solve this procedurally but not declaratively.
I don't know if this is closer to or farther from the solution:
select f2.*, row_number() over (order by f2.FileName)
from (
    select top 10 f.*,
           count(FileName) over (partition by ParentID, FileName) as n
    from (select * from #tmp_tree where VersionNum = 1) as f
    order by f.ParentID, f.FileName
) as f2
where n > 1
I would assume the last line (3, 11) in the targeted result is a mistake.
You can find the repeated names with a window function in a subquery and then join it during the update. In short, you can do:
update #tmp_tree
set #tmp_tree.filename = concat(#tmp_tree.filename, '_', x.rn)
from #tmp_tree
join (
    select *,
           row_number() over(partition by parentid, filename order by nodeid) as rn
    from #tmp_tree
    where versionnum = 1
) x on x.rn > 1 and x.nodeid = #tmp_tree.nodeid;
Result:
ParentID  NodeID  VersionNum  FileName
--------  ------  ----------  ---------------
-1        1       1           FirstFolder
1         2       1           SecondFolder
1         3       1           ThirdFolder
1         4       1           FirstDocument
1         4       2           FirstDocument
1         5       1           FirstDocument_2
1         5       2           FirstDocument_2
2         6       1           FirstDocument
2         6       2           FirstDocument
2         7       1           SecondDocument
3         8       1           SecondDocument
3         9       1           ThirdDocument
3         9       2           ThirdDocument
3         10      1           ThirdDocument_2
See running example at db<>fiddle.
You don't need to self-join the table. You can update the derived table directly, after calculating a rank for each row with DENSE_RANK:
update x
set filename = concat(x.filename, '_', x.rn)
from (
    select *,
           dense_rank() over(partition by parentid, filename order by nodeid) as rn
    from #tmp_tree
) x
where x.rn > 1;
db<>fiddle
DENSE_RANK returns the same number for rows that tie on the ordering clause, so every version of a node (same NodeID) gets the same rank and therefore the same suffix.
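Both updates above number duplicates starting from 2, so the renamed rows come out as FirstDocument_2 rather than the FirstDocument_1 shown in the question's targeted result. A minimal sketch of one way to match the target, assuming the same #tmp_tree table and that shifting the suffix by one is the only change wanted:

```sql
-- Sketch only: same derived-table update as above, but the suffix is rn - 1
-- so the first duplicate becomes _1, the second _2, and so on.
UPDATE x
SET FileName = CONCAT(x.FileName, '_', x.rn - 1)
FROM (
    SELECT FileName,
           -- All versions of a node share a NodeID, so they tie and get the same rank.
           DENSE_RANK() OVER (PARTITION BY ParentID, FileName ORDER BY NodeID) AS rn
    FROM #tmp_tree
) AS x
WHERE x.rn > 1;   -- rank 1 is the first node with that name; it keeps its name
```

With the question's sample data (including node 11), this should leave node 5 as FirstDocument_1, node 10 as ThirdDocument_1, and node 11 as ThirdDocument_2. It does not check whether a generated name already exists under the same parent, nor whether the longer name still fits in VARCHAR(50).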
Related
Calculate loads and avoiding cursors
The following table represents a bus route where a door sensor counts passengers getting on and off the bus. A person also sits on the bus with a clipboard and records a spot count.

CREATE TABLE BusLoad(
    ROUTE CHAR(4) NOT NULL,
    StopNumber INT NOT NULL,
    ONS INT,
    OFFS INT,
    SPOT_CHECK INT)
go
INSERT BusLoad VALUES('AAAA', 1, 5, 0, null)
INSERT BusLoad VALUES('AAAA', 2, 0, 0, null)
INSERT BusLoad VALUES('AAAA', 3, 2, 1, null)
INSERT BusLoad VALUES('AAAA', 4, 6, 3, 8)
INSERT BusLoad VALUES('AAAA', 5, 1, 0, null)
INSERT BusLoad VALUES('AAAA', 6, 0, 1, 7)
INSERT BusLoad VALUES('AAAA', 7, 0, 3, null)

I want to add a column "LOAD" to this table that calculates the load at each stop:
If SPOT_CHECK is null, LOAD = previous stop's LOAD + current stop's ONS - current stop's OFFS.
Otherwise, LOAD = SPOT_CHECK.

Expected Results:

ROUTE  StopNumber  ONS  OFFS  SPOT_CHECK  LOAD
AAAA   1           5    0     NULL        5
AAAA   2           0    0     NULL        5
AAAA   3           2    1     NULL        6
AAAA   4           6    3     8           8
AAAA   5           1    0     NULL        9
AAAA   6           0    1     7           7
AAAA   7           0    3     NULL        4

I can do this with a cursor, but is there a way to do it using a query?
You can use the following query:

select ROUTE, StopNumber, ONS, OFFS, SPOT_CHECK,
       COALESCE(SPOT_CHECK, ONS - OFFS) AS ld,
       SUM(CASE WHEN SPOT_CHECK IS NULL THEN 0 ELSE 1 END)
           OVER (PARTITION BY ROUTE ORDER BY StopNumber) AS grp
from BusLoad

to get:

ROUTE  StopNumber  ONS  OFFS  SPOT_CHECK  ld  grp
-------------------------------------------------
AAAA   1           5    0     NULL        5   0
AAAA   2           0    0     NULL        0   0
AAAA   3           2    1     NULL        1   0
AAAA   4           6    3     8           8   1
AAAA   5           1    0     NULL        1   1
AAAA   6           0    1     7           7   2
AAAA   7           0    3     NULL        -3  2

All you want now is the running total of ld over ROUTE, grp partitions of data:

;WITH CTE AS (
    .... previous query here
)
select ROUTE, StopNumber, ONS, OFFS, SPOT_CHECK, grp,
       sum(ld) over (PARTITION BY ROUTE, grp ORDER BY StopNumber) as load
from cte

Demo here

Note: The above query works for versions starting from 2012. If you want a query for 2008 you have to somehow simulate sum() over (order by ...). You can find many relevant posts here in SO.
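Putting the two steps of this answer together, the full statement might look like the sketch below. It assumes the BusLoad table from the question and SQL Server 2012 or later. The output alias is bracketed as [LOAD] here, which is a change from the answer's `as load`, because LOAD appears on SQL Server's reserved keyword list.

```sql
-- Sketch only: the answer's first query inlined as the CTE body.
WITH cte AS (
    SELECT ROUTE, StopNumber, ONS, OFFS, SPOT_CHECK,
           -- A spot check replaces the net change for that stop.
           COALESCE(SPOT_CHECK, ONS - OFFS) AS ld,
           -- Running count of spot checks: each spot check starts a new group.
           SUM(CASE WHEN SPOT_CHECK IS NULL THEN 0 ELSE 1 END)
               OVER (PARTITION BY ROUTE ORDER BY StopNumber) AS grp
    FROM BusLoad
)
SELECT ROUTE, StopNumber, ONS, OFFS, SPOT_CHECK,
       -- Running total restarts at every spot check.
       SUM(ld) OVER (PARTITION BY ROUTE, grp ORDER BY StopNumber) AS [LOAD]
FROM cte
ORDER BY ROUTE, StopNumber;
```

For the sample route this should reproduce the LOAD column from the question's expected results, since each group begins with the spot-check value and then accumulates ONS - OFFS.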
You may use a recursive query:

with act_load as (
    select *, ons load
    from busload
    where stopnumber = 1 and route = 'AAAA'
    union all
    select b.*,
           case when b.spot_check is null then l.load + b.ons - b.offs
                else b.spot_check end load
    from busload b
    join act_load l on b.StopNumber = l.StopNumber + 1 and b.route = l.route
)
select * from act_load

dbfiddle demo
skip consecutive rows after specific value
Note: I have a working query, but am looking for optimisations to use it on large tables.

Suppose I have a table like this:

id  session_id  value
1   5           7
2   5           1
3   5           1
4   5           12
5   5           1
6   5           1
7   5           1
8   6           7
9   6           1
10  6           3
11  6           1
12  7           7
13  8           1
14  8           2
15  8           3

I want the ids of all rows with value 1, with one exception: skip groups with value 1 that directly follow a value 7 within the same session_id. Basically, I would look for groups of value 1 that directly follow a value 7, limited by the session_id, and ignore those groups. I then show all the remaining value 1 rows.

The desired output showing the ids:

5
6
7
11
13

I took some inspiration from this post and ended up with this code:

create table #req_data (
    id int primary key identity,
    session_id int,
    value int
)
insert into #req_data(session_id, value) values (5, 7)
insert into #req_data(session_id, value) values (5, 1)  -- preceded by value 7 in same session, should be ignored
insert into #req_data(session_id, value) values (5, 1)  -- ignore this one too
insert into #req_data(session_id, value) values (5, 12)
insert into #req_data(session_id, value) values (5, 1)  -- preceded by value != 7, show this
insert into #req_data(session_id, value) values (5, 1)  -- show this too
insert into #req_data(session_id, value) values (5, 1)  -- show this too
insert into #req_data(session_id, value) values (6, 7)
insert into #req_data(session_id, value) values (6, 1)  -- preceded by value 7 in same session, should be ignored
insert into #req_data(session_id, value) values (6, 3)
insert into #req_data(session_id, value) values (6, 1)  -- preceded by value != 7, show this
insert into #req_data(session_id, value) values (7, 7)
insert into #req_data(session_id, value) values (8, 1)  -- new session_id, show this
insert into #req_data(session_id, value) values (8, 2)
insert into #req_data(session_id, value) values (8, 3)

select id
from (
    select session_id, id, max(skip) over (partition by grp) as 'skip'
    from (
        select tWithGroups.*,
               ( row_number() over (partition by session_id order by id)
                 - row_number() over (partition by value order by id) ) as grp
        from (
            select session_id, id, value,
                   case when lag(value) over (partition by session_id order by session_id) = 7
                        then 1 else 0 end as 'skip'
            from #req_data
        ) as tWithGroups
    ) as tWithSkipField
    where tWithSkipField.value = 1
) as tYetAnotherOutput
where skip != 1
order by id

This gives the desired result, but with 4 select blocks I think it's way too inefficient to use on large tables. Is there a cleaner, faster way to do this?
The following should work well for this.

WITH cte_ControlValue AS (
    SELECT rd.id, rd.session_id, rd.value,
           ControlValue = ISNULL(CAST(SUBSTRING(MAX(bv.BinVal) OVER (PARTITION BY rd.session_id ORDER BY rd.id), 5, 4) AS INT), 999)
    FROM #req_data rd
    CROSS APPLY ( VALUES (CAST(rd.id AS BINARY(4)) + CAST(NULLIF(rd.value, 1) AS BINARY(4))) ) bv (BinVal)
)
SELECT cv.id, cv.session_id, cv.value
FROM cte_ControlValue cv
WHERE cv.value = 1
  AND cv.ControlValue <> 7;

Results...

id          session_id  value
----------- ----------- -----------
5           5           1
6           5           1
7           5           1
11          6           1
13          8           1

Edit: How and why it works...

The basic premise is taken from Itzik Ben-Gan's "The Last non NULL Puzzle". Essentially, we are relying on two different behaviors that most people don't usually think about:
1) NULL + anything = NULL.
2) You can CAST or CONVERT an INT into a fixed-length BINARY data type and it will continue to sort as an INT (as opposed to sorting like a text string).

This is easier to see when the intermediate steps are added to the query in the CTE...

SELECT rd.id, rd.session_id, rd.value, bv.BinVal,
       SmearedBinVal = MAX(bv.BinVal) OVER (PARTITION BY rd.session_id ORDER BY rd.id),
       SecondHalfAsINT = CAST(SUBSTRING(MAX(bv.BinVal) OVER (PARTITION BY rd.session_id ORDER BY rd.id), 5, 4) AS INT),
       ControlValue = ISNULL(CAST(SUBSTRING(MAX(bv.BinVal) OVER (PARTITION BY rd.session_id ORDER BY rd.id), 5, 4) AS INT), 999)
FROM #req_data rd
CROSS APPLY ( VALUES (CAST(rd.id AS BINARY(4)) + CAST(NULLIF(rd.value, 1) AS BINARY(4))) ) bv (BinVal)

Results...

id  session_id  value  BinVal              SmearedBinVal       SecondHalfAsINT  ControlValue
--  ----------  -----  ------------------  ------------------  ---------------  ------------
1   5           7      0x0000000100000007  0x0000000100000007  7                7
2   5           1      NULL                0x0000000100000007  7                7
3   5           1      NULL                0x0000000100000007  7                7
4   5           12     0x000000040000000C  0x000000040000000C  12               12
5   5           1      NULL                0x000000040000000C  12               12
6   5           1      NULL                0x000000040000000C  12               12
7   5           1      NULL                0x000000040000000C  12               12
8   6           7      0x0000000800000007  0x0000000800000007  7                7
9   6           1      NULL                0x0000000800000007  7                7
10  6           3      0x0000000A00000003  0x0000000A00000003  3                3
11  6           1      NULL                0x0000000A00000003  3                3
12  7           7      0x0000000C00000007  0x0000000C00000007  7                7
13  8           1      NULL                NULL                NULL             999
14  8           2      0x0000000E00000002  0x0000000E00000002  2                2
15  8           3      0x0000000F00000003  0x0000000F00000003  3                3

Looking at the BinVal column, we see an 8-byte hex value for all rows where [value] is not 1, and NULLs where [value] = 1. The first 4 bytes are the id (used for ordering) and the second 4 bytes are [value] (used to set the "previous non-1 value", or to set the whole thing to NULL).

The second step is to "smear" the non-NULL values into the NULLs using the window-framed MAX function, partitioned by session_id and ordered by id.

The third step is to parse out the last 4 bytes and convert them back to an INT data type (SecondHalfAsINT), and to deal with any NULLs that result from not having any preceding non-1 value (ControlValue).

Since we can't reference a windowed function in the WHERE clause, we have to put the query into a CTE (a derived table would work just as well) so that we can use the new ControlValue in the WHERE clause.
SELECT CRow.id
FROM #req_data AS CRow
CROSS APPLY (
    SELECT MAX(id) AS id
    FROM #req_data PRev
    WHERE PRev.Id < CRow.id
      AND PRev.session_id = CRow.session_id
      AND PRev.value <> 1
) MaxPRow
LEFT JOIN #req_data AS PRow ON MaxPRow.id = PRow.id
WHERE CRow.value = 1
  AND ISNULL(PRow.value,1) <> 7
You can use the following query:

select id, session_id, value,
       coalesce(sum(case when value <> 1 then 1 end)
                    over (partition by session_id order by id), 0) as grp
from #req_data

to get:

id  session_id  value  grp
--------------------------
1   5           7      1
2   5           1      1
3   5           1      1
4   5           12     2
5   5           1      2
6   5           1      2
7   5           1      2
8   6           7      1
9   6           1      1
10  6           3      2
11  6           1      2
12  7           7      1
13  8           1      0
14  8           2      1
15  8           3      2

So, this query detects islands of consecutive 1 records that belong to the same group, as specified by the first preceding row with value <> 1.

You can use a window function once more to detect all 7 islands. If you wrap this in a second cte, then you can finally get the desired result by filtering out all 7 islands:

;with session_islands as (
    select id, session_id, value,
           coalesce(sum(case when value <> 1 then 1 end)
                        over (partition by session_id order by id), 0) as grp
    from #req_data
), islands_with_7 as (
    select id, grp, value,
           count(case when value = 7 then 1 end)
               over (partition by session_id, grp) as cnt_7
    from session_islands
)
select id
from islands_with_7
where cnt_7 = 0 and value = 1
SQL Recursive CTE unexpectedly returns alternating sets
I am trying to use a recursive CTE to repeat the same pattern over and over, resetting when "Scenario" increases in value. RowNumber repeats 1-21 (as desired), but whenever "Scenario" is an even number, there are too few items in the "Vals" column to feed into "Value". I can't figure out which part of the code is causing me to be 1 short for only even Scenarios.

Below are the results of the code I'm using at the bottom.

Scenario  RowNumber  Value  Vals
1         1          A      A,A,A,A,A,A,A,A,A,A,A,A,A,A,A,A,A,A,B,C
1         2          A      A,A,A,A,A,A,A,A,A,A,A,A,A,A,A,A,A,B,C
1         3          A      A,A,A,A,A,A,A,A,A,A,A,A,A,A,A,A,B,C
1         4          A      A,A,A,A,A,A,A,A,A,A,A,A,A,A,A,B,C
1         5          A      A,A,A,A,A,A,A,A,A,A,A,A,A,A,B,C
1         6          A      A,A,A,A,A,A,A,A,A,A,A,A,A,B,C
1         7          A      A,A,A,A,A,A,A,A,A,A,A,A,B,C
1         8          A      A,A,A,A,A,A,A,A,A,A,A,B,C
1         9          A      A,A,A,A,A,A,A,A,A,A,B,C
1         10         A      A,A,A,A,A,A,A,A,A,B,C
1         11         A      A,A,A,A,A,A,A,A,B,C
1         12         A      A,A,A,A,A,A,A,B,C
1         13         A      A,A,A,A,A,A,B,C
1         14         A      A,A,A,A,A,B,C
1         15         A      A,A,A,A,B,C
1         16         A      A,A,A,B,C
1         17         A      A,A,B,C
1         18         A      A,B,C
1         19         A      B,C
1         20         B      C
1         21         C
2         1          A      A,A,A,A,A,A,A,A,A,A,A,A,A,A,A,A,B,B,C
2         2          A      A,A,A,A,A,A,A,A,A,A,A,A,A,A,A,B,B,C
2         3          A      A,A,A,A,A,A,A,A,A,A,A,A,A,A,B,B,C
2         4          A      A,A,A,A,A,A,A,A,A,A,A,A,A,B,B,C
2         5          A      A,A,A,A,A,A,A,A,A,A,A,A,B,B,C
2         6          A      A,A,A,A,A,A,A,A,A,A,A,B,B,C
2         7          A      A,A,A,A,A,A,A,A,A,A,B,B,C
2         8          A      A,A,A,A,A,A,A,A,A,B,B,C
2         9          A      A,A,A,A,A,A,A,A,B,B,C
2         10         A      A,A,A,A,A,A,A,B,B,C
2         11         A      A,A,A,A,A,A,B,B,C
2         12         A      A,A,A,A,A,B,B,C
2         13         A      A,A,A,A,B,B,C
2         14         A      A,A,A,B,B,C
2         15         A      A,A,B,B,C
2         16         A      A,B,B,C
2         17         A      B,B,C
2         18         B      B,C
2         19         B      C
2         20         C
2         21         A      A,A,A,A,A,A,A,A,A,A,A,A,A,A,A,A,A,B,B,C
3         1          A      A,A,A,A,A,A,A,A,A,A,A,A,A,A,A,A,A,B,C,C
3         2          A      A,A,A,A,A,A,A,A,A,A,A,A,A,A,A,A,B,C,C
3         3          A      A,A,A,A,A,A,A,A,A,A,A,A,A,A,A,B,C,C
3         4          A      A,A,A,A,A,A,A,A,A,A,A,A,A,A,B,C,C
3         5          A      A,A,A,A,A,A,A,A,A,A,A,A,A,B,C,C
3         6          A      A,A,A,A,A,A,A,A,A,A,A,A,B,C,C
3         7          A      A,A,A,A,A,A,A,A,A,A,A,B,C,C
3         8          A      A,A,A,A,A,A,A,A,A,A,B,C,C
3         9          A      A,A,A,A,A,A,A,A,A,B,C,C
3         10         A      A,A,A,A,A,A,A,A,B,C,C
3         11         A      A,A,A,A,A,A,A,B,C,C
3         12         A      A,A,A,A,A,A,B,C,C
3         13         A      A,A,A,A,A,B,C,C
3         14         A      A,A,A,A,B,C,C
3         15         A      A,A,A,B,C,C
3         16         A      A,A,B,C,C
3         17         A      A,B,C,C
3         18         A      B,C,C
3         19         B      C,C
3         20         C      C
3         21         C
4         1          A      A,A,A,A,A,A,A,A,A,A,A,A,A,A,A,B,B,B,C
4         2          A      A,A,A,A,A,A,A,A,A,A,A,A,A,A,B,B,B,C
4         3          A      A,A,A,A,A,A,A,A,A,A,A,A,A,B,B,B,C
4         4          A      A,A,A,A,A,A,A,A,A,A,A,A,B,B,B,C
4         5          A      A,A,A,A,A,A,A,A,A,A,A,B,B,B,C
4         6          A      A,A,A,A,A,A,A,A,A,A,B,B,B,C
4         7          A      A,A,A,A,A,A,A,A,A,B,B,B,C
4         8          A      A,A,A,A,A,A,A,A,B,B,B,C
4         9          A      A,A,A,A,A,A,A,B,B,B,C
4         10         A      A,A,A,A,A,A,B,B,B,C
4         11         A      A,A,A,A,A,B,B,B,C
4         12         A      A,A,A,A,B,B,B,C
4         13         A      A,A,A,B,B,B,C
4         14         A      A,A,B,B,B,C
4         15         A      A,B,B,B,C
4         16         A      B,B,B,C
4         17         B      B,B,C
4         18         B      B,C
4         19         B      C
4         20         C

This is the code I used to generate the above sample. Where am I going wrong?

CREATE TABLE #temp3
(
    Scenario INT
    ,Vals VARCHAR(64)
    ,LEN INT
)
;
WITH vals AS (
    SELECT v.*
    FROM (VALUES ('A'), ('B'), ('C')) v(x)
),
CTE AS (
    SELECT CAST('A' AS VARCHAR(MAX)) AS STR, 0 AS LEN
    UNION ALL
    SELECT (CTE.STR + ',' + vals.x), CTE.LEN + 1
    FROM CTE
    JOIN vals ON vals.x >= RIGHT(CTE.STR, 1)
    WHERE CTE.LEN < 19
)
INSERT INTO #temp3
SELECT ROW_NUMBER() OVER(ORDER BY STR + ',C') AS Scenario
    ,STR + ',C' AS Vals
    ,LEN
FROM CTE
WHERE STR + 'C' LIKE '%B%'
  AND LEN = 19
;
-- Split strings created above into individual characters
WITH cte(Scenario, Value, Vals) AS (
    SELECT Scenario
        ,CAST(LEFT(Vals, CHARINDEX(',',Vals+',')-1) AS VARCHAR(10)) AS Value
        ,STUFF(Vals, 1, CHARINDEX(',',Vals+','), '') AS Vals
    FROM #temp3
    UNION ALL
    SELECT Scenario
        ,CAST(LEFT(Vals, CHARINDEX(',',Vals+',')-1) AS VARCHAR(10))
        ,STUFF(Vals, 1, CHARINDEX(',',Vals+','), '')
    FROM cte
    WHERE Vals > ''
)
SELECT Scenario
    ,ROW_NUMBER() OVER (PARTITION BY Scenario ORDER BY Scenario) RowNumber
    ,Value
    ,Vals
FROM cte t
I'm not exactly sure what the problem you are describing is, but the ROW_NUMBER() should use an ORDER BY clause that completely orders the rows in each partition. When you use "PARTITION BY Scenario ORDER BY Scenario", the order in which the ROW_NUMBER() values are assigned is undefined. Try something like:

WITH cte(Scenario, depth, Value, Vals) AS (
    SELECT Scenario, 0 depth
        ,CAST(LEFT(Vals, CHARINDEX(',',Vals+',')-1) AS VARCHAR(10)) AS Value
        ,STUFF(Vals, 1, CHARINDEX(',',Vals+','), '') AS Vals
    FROM #temp3
    UNION ALL
    SELECT Scenario, depth+1
        ,CAST(LEFT(Vals, CHARINDEX(',',Vals+',')-1) AS VARCHAR(10))
        ,STUFF(Vals, 1, CHARINDEX(',',Vals+','), '')
    FROM cte
    WHERE Vals > ''
)
SELECT Scenario
    ,depth
    ,ROW_NUMBER() OVER (PARTITION BY Scenario ORDER BY depth) RowNumber
    ,Value
    ,Vals
FROM cte t
Different select criteria in odd and even events
I have a table that looks like this (10 billion rows):

AID  BID  CID
1    2    1
1    6    9
0    1    4
1    3    2
1    100  2
0    4    2
0    0    1

The AID can only be 0 or 1. BID and CID can be anything. Now I want to select events first with AID=1 and then AID=0, and again AID=1 and then AID=0. The idea is to select equal numbers of AID=1 and AID=0 events. How can I achieve that?

The expected result is:

AID  BID  CID
1    2    1
0    1    4
1    6    9
0    4    2
1    3    2
0    0    1
;WITH cte AS (
    select *
    FROM (VALUES (1, 2, 1),
                 (1, 6, 9),
                 (0, 1, 4),
                 (1, 3, 2),
                 (1, 100, 2),
                 (0, 4, 2),
                 (0, 0, 1)
         ) as t(AID, BID, CID)
), withrow AS (
    SELECT ROW_NUMBER() OVER (PARTITION BY AID ORDER BY AID) as RN, *
    FROM cte
)
SELECT AID, BID, CID
FROM withrow
ORDER BY RN asc, aid desc

Output:

AID         BID         CID
----------- ----------- -----------
1           100         2
0           4           2
1           3           2
0           1           4
1           6           9
0           0           1
1           2           1

(7 row(s) affected)
Assign rownumber in SQL grouped on value and n rows per rownumber
I am trying to generate a report with 3 rows per page for each order number using the following SQL. As you can see from the results, the fields Actual and Expected do not match up. Any help would be appreciated.

set nocount on
CREATE TABLE #Orders (Expected int, OrderNumber INT, OrderDetailsNumber int)
Insert into #orders values (0,1,1)
Insert into #orders values (0,1,2)
Insert into #orders values (0,1,3)
Insert into #orders values (1,1,4)
Insert into #orders values (2,2,5)
Insert into #orders values (2,2,6)
Insert into #orders values (2,2,7)
Insert into #orders values (3,2,8)
Insert into #orders values (3,2,9)

select cast(((row_number() over( order by OrderNumber)) -1) /3 as int) as [Actual]
    ,*
from #orders

Actual      Expected    OrderNumber OrderDetailsNumber
----------- ----------- ----------- ------------------
0           0           1           1
0           0           1           2
0           0           1           3
1           1           1           4
1           2           2           5
1           2           2           6
2           2           2           7
2           3           2           8
2           3           2           9
Right, after a couple of edits I have the final answer:

SELECT DENSE_RANK() OVER (Order BY OrderNumber, floor(RowNumber/3)) - 1 AS Actual,
       Expected, OrderNumber, OrderDetailsNumber
FROM (
    SELECT *,
           ROW_NUMBER() OVER ( PARTITION BY OrderNumber ORDER BY OrderDetailsNumber ) - 1 AS RowNumber
    FROM #Orders
) RowNumberTable

Gives the result (with extra rows for testing):

Actual               Expected    OrderNumber OrderDetailsNumber
-------------------- ----------- ----------- ------------------
0                    0           1           1
0                    0           1           2
0                    0           1           3
1                    1           1           4
1                    1           1           12
2                    2           2           5
2                    2           2           6
2                    2           2           7
3                    3           2           8
3                    3           2           9
3                    4           2           11
4                    3           2           27
5                    5           3           10

This only works where OrderDetailsNumber is unique such that the result is deterministic.

Edit: I've now got the complete code working; however, the dependence on OrderDetailsNumber being in order is very iffy. Hopefully you can test and edit as required.

Edit 2: I've put the 'golfed' version in the main answer.

WITH FirstCTE AS (
    SELECT OrderNumber, OrderDetailsNumber, Expected,
           ROW_NUMBER() OVER ( PARTITION BY OrderNumber ORDER BY OrderDetailsNumber ) - 1 AS RowNumber
    FROM #Orders
), SecondCTE AS (
    SELECT OrderDetailsNumber as odn,
           floor(RowNumber/3) as page_for_order_number,
           DENSE_RANK() OVER (Order BY OrderNumber, floor(RowNumber/3)) - 1 AS Actual
    FROM FirstCTE
)
SELECT c2.page_for_order_number, c1.RowNumber, C2.Actual, c1.Expected, c1.OrderNumber, c1.OrderDetailsNumber
FROM FirstCTE AS c1
INNER JOIN SecondCTE AS c2 on c2.odn = c1.OrderDetailsNumber
This strikes me as a bit of a hack, but it works... Divide the row_number() by 3, and use CEILING to get the smallest integer greater than or equal to the result of that division.

select row_number() over( order by OrderNumber) as [Actual],
       cast (row_number() over(order by ordernumber) as decimal(5,1)) / 3,
       CEILING(cast (row_number() over(order by ordernumber) as decimal(5,1)) / 3) as GRPR,
       *
from #orders

EDIT: Dang it, can never get results to line up. The 3rd column in the result set is your "page number".

Which yields:

Actual  (No column name)  PG_NBR  Expected  OrderNumber  OrderDetailsNumber
1       0.333333          1       0         1            1
2       0.666666          1       0         1            2
3       1.000000          1       0         1            3
4       1.333333          2       1         1            4
5       1.666666          2       2         2            5
6       2.000000          2       2         2            6
7       2.333333          3       2         2            7
8       2.666666          3       3         2            8
9       3.000000          3       3         2            9