SQL to ensure unique node names in adjacency list - sql

So I have an adjacency list that forms a hierarchy simulating a versioned file structure. The problem is that the incoming file names are not currently unique and they need to be. To make things slightly more interesting the files may have different versions which should keep the name of the first version (note the versions all have the same NodeID).
Adjacency List
ParentID
NodeID
VersionNum
FileName
-1
1
1
FirstFolder
1
2
1
SecondFolder
1
3
1
ThirdFolder
1
4
1
FirstDocument
1
4
2
FirstDocument
1
5
1
FirstDocument
1
5
2
FirstDocument
2
6
1
FirstDocument
2
6
2
FirstDocument
2
7
1
SecondDocument
3
8
1
SecondDocument
3
9
1
ThirdDocument
3
9
2
ThirdDocument
3
10
1
ThirdDocument
3
11
1
ThirdDocument
Targeted Result
ParentID
NodeID
VersionNum
FileName
-1
1
1
FirstFolder
1
2
1
SecondFolder
1
3
1
ThirdFolder
1
4
1
FirstDocument
1
4
2
FirstDocument
1
5
1
FirstDocument_1
1
5
2
FirstDocument_1
2
6
1
FirstDocument
2
6
2
FirstDocument
2
7
1
SecondDocument
3
8
1
SecondDocument
3
9
1
ThirdDocument
3
9
2
ThirdDocument
3
10
1
ThirdDocument_1
3
11
1
ThirdDocument_2
*I should also note that the folder names are already guaranteed to be unique (they already exist, it is the documents that are incoming) and they only have 1 version.
CREATE TABLE #tmp_tree
(
ParentID INT,
NodeID INT,
VersionNum INT,
FileName VARCHAR(50),
);
INSERT INTO #tmp_tree (ParentID, NodeID, VersionNum, FileName)
VALUES (-1, 1, 1, 'FirstFolder' ),
(1, 2, 1, 'SecondFolder' ),
(1, 3, 1, 'ThirdFolder' ),
(1, 4, 1, 'FirstDocument' ),
(1, 4, 2, 'FirstDocument' ),
(1, 5, 1, 'FirstDocument' ),
(1, 5, 2, 'FirstDocument' ),
(2, 6, 1, 'FirstDocument' ),
(2, 6, 2, 'FirstDocument' ),
(2, 7, 1, 'SecondDocument' ),
(3, 8, 1, 'SecondDocument' ),
(3, 9, 1, 'ThirdDocument' ),
(3, 9, 2, 'ThirdDocument' ),
(3, 10, 1, 'ThirdDocument' )
(3, 11, 1, 'ThirdDocument' )
I really don't know how to approach this though resorting to a stored procedure. Adjacency list scream CTEs to me but that got me no where real fast. Group By loses the NodeID so while I can find the names of the documents that need to be renamed - I don't know how to use that to select the second occurrence of the name (ordered by NodeID).
-- I don't see how this helps... but this finds the names that need to change.
select ParentID, FileName,VersionNum, count(*) from #tmp_tree
GROUP BY ParentID, FileName, VersionNum
HAVING VersionNum = 1 and count(*) > 1
order by FileName
I know how to solve this procedural but not declaratively.
I don't know if this is closer or farther away from the solution:
select f2.*, Row_Number() over (order by f2.FileName) from
(select top 10 f.*, count(FileName) over (PARTITION by ParentID, FileName) as n from (select * from #tmp_tree where versionNum = 1) as f
order by f.ParentID, f.FileName) as f2
Where n > 1

I would assume the last line (3, 11) in the targeted result is a mistake.
You can find the repeated names with a window function in a subquery and then join it during the update. In short, you can do:
update #tmp_tree
set #tmp_tree.filename = concat(#tmp_tree.filename, '_', x.rn)
from #tmp_tree
join (
select *,
row_number() over(partition by parentid, filename order by nodeid) as rn
from #tmp_tree
where versionnum = 1
) x on x.rn > 1 and x.nodeid = #tmp_tree.nodeid;
Result:
ParentID NodeID VersionNum FileName
--------- ------- ----------- ---------------
-1 1 1 FirstFolder
1 2 1 SecondFolder
1 3 1 ThirdFolder
1 4 1 FirstDocument
1 4 2 FirstDocument
1 5 1 FirstDocument_2
1 5 2 FirstDocument_2
2 6 1 FirstDocument
2 6 2 FirstDocument
2 7 1 SecondDocument
3 8 1 SecondDocument
3 9 1 ThirdDocument
3 9 2 ThirdDocument
3 10 1 ThirdDocument_2
See running example at db<>fiddle.

You don't need to self-join the table, you can update the derived table directly, after calculating the row-number using DENSE_RANK
update x
set filename = concat(x.filename, '_', x.rn)
from (
select *,
dense_rank() over(partition by parentid, filename order by nodeid) as rn
from #tmp_tree
) x
where x.rn > 1;
db<>fiddle
DENSE_RANK will return the same number for tied results according to the ordering clause.

Related

Calculate loads and avoiding cursors

Given the following table structure, which is a representation of a bus route where passengers get on and off the bus with a door sensor. And, there is a person who sits on that bus with a clipboard holding a spot count.
CREATE TABLE BusLoad(
ROUTE CHAR(4) NOT NULL,
StopNumber INT NOT NULL,
ONS INT,
OFFS INT,
SPOT_CHECK INT)
go
INSERT BusLoad VALUES('AAAA', 1, 5, 0, null)
INSERT BusLoad VALUES('AAAA', 2, 0, 0, null)
INSERT BusLoad VALUES('AAAA', 3, 2, 1, null)
INSERT BusLoad VALUES('AAAA', 4, 6, 3, 8)
INSERT BusLoad VALUES('AAAA', 5, 1, 0, null)
INSERT BusLoad VALUES('AAAA', 6, 0, 1, 7)
INSERT BusLoad VALUES('AAAA', 7, 0, 3, null)
I want to add a column "LOAD" to this table that calculates the load at each stop.
Load = Previous stops load + current stop ONS - Current stop's OFFS if
SPOT_CHECK is null, otherwise LOAD = SPOT_CHECK
Expected Results:
ROUTE StopNumber ONS OFFS SPOT_CHECK LOAD
AAAA 1 5 0 NULL 5
AAAA 2 0 0 NULL 5
AAAA 3 2 1 NULL 6
AAAA 4 6 3 8 8
AAAA 5 1 0 NULL 9
AAAA 6 0 1 7 7
AAAA 7 0 3 NULL 4
I can do this with a cursor, but is there a way to do it using a query?
You can use the following query:
select ROUTE, StopNumber, ONS, OFFS, SPOT_CHECK,
COALESCE(SPOT_CHECK, ONS - OFFS) AS ld,
SUM(CASE WHEN SPOT_CHECK IS NULL THEN 0 ELSE 1 END)
OVER (PARTITION BY ROUTE ORDER BY StopNumber) AS grp
from BusLoad
to get:
ROUTE StopNumber ONS OFFS SPOT_CHECK ld grp
----------------------------------------------------
AAAA 1 5 0 NULL 5 0
AAAA 2 0 0 NULL 0 0
AAAA 3 2 1 NULL 1 0
AAAA 4 6 3 8 8 1
AAAA 5 1 0 NULL 1 1
AAAA 6 0 1 7 7 2
AAAA 7 0 3 NULL -3 2
All you want now is the running total of ld over ROUTE, grp partitions of data:
;WITH CTE AS (
....
previous query here
)
select ROUTE, StopNumber, ONS, OFFS, SPOT_CHECK, grp,
sum(ld) over (PARTITION BY ROUTE, grp ORDER BY StopNumber) as load
from cte
Demo here
Note: The above query works for versions starting from 2012. If you want a query for 2008 you have to somehow simulate sum() over (order by ...). You can find many relevant posts here in SO.
You may use recursive query
with act_load as
(
select *, ons load
from busload
where stopnumber = 1 and route = 'AAAA'
union all
select b.*, case when b.spot_check is null then l.load + b.ons - b.offs
else b.spot_check
end load
from busload b
join act_load l on b.StopNumber = l.StopNumber + 1 and
b.route = l.route
)
select *
from act_load
dbfiddle demo

skip consecutive rows after specific value

Note: I have a working query, but am looking for optimisations to use it on large tables.
Suppose I have a table like this:
id session_id value
1 5 7
2 5 1
3 5 1
4 5 12
5 5 1
6 5 1
7 5 1
8 6 7
9 6 1
10 6 3
11 6 1
12 7 7
13 8 1
14 8 2
15 8 3
I want the id's of all rows with value 1 with one exception:
skip groups with value 1 that directly follow a value 7 within the same session_id.
Basically I would look for groups of value 1 that directly follow a value 7, limited by the session_id, and ignore those groups. I then show all the remaining value 1 rows.
The desired output showing the id's:
5
6
7
11
13
I took some inspiration from this post and ended up with this code:
declare #req_data table (
id int primary key identity,
session_id int,
value int
)
insert into #req_data(session_id, value) values (5, 7)
insert into #req_data(session_id, value) values (5, 1) -- preceded by value 7 in same session, should be ignored
insert into #req_data(session_id, value) values (5, 1) -- ignore this one too
insert into #req_data(session_id, value) values (5, 12)
insert into #req_data(session_id, value) values (5, 1) -- preceded by value != 7, show this
insert into #req_data(session_id, value) values (5, 1) -- show this too
insert into #req_data(session_id, value) values (5, 1) -- show this too
insert into #req_data(session_id, value) values (6, 7)
insert into #req_data(session_id, value) values (6, 1) -- preceded by value 7 in same session, should be ignored
insert into #req_data(session_id, value) values (6, 3)
insert into #req_data(session_id, value) values (6, 1) -- preceded by value != 7, show this
insert into #req_data(session_id, value) values (7, 7)
insert into #req_data(session_id, value) values (8, 1) -- new session_id, show this
insert into #req_data(session_id, value) values (8, 2)
insert into #req_data(session_id, value) values (8, 3)
select id
from (
select session_id, id, max(skip) over (partition by grp) as 'skip'
from (
select tWithGroups.*,
( row_number() over (partition by session_id order by id) - row_number() over (partition by value order by id) ) as grp
from (
select session_id, id, value,
case
when lag(value) over (partition by session_id order by session_id) = 7
then 1
else 0
end as 'skip'
from #req_data
) as tWithGroups
) as tWithSkipField
where tWithSkipField.value = 1
) as tYetAnotherOutput
where skip != 1
order by id
This gives the desired result, but with 4 select blocks I think it's way too inefficient to use on large tables.
Is there a cleaner, faster way to do this?
The following should work well for this.
WITH
cte_ControlValue AS (
SELECT
rd.id, rd.session_id, rd.value,
ControlValue = ISNULL(CAST(SUBSTRING(MAX(bv.BinVal) OVER (PARTITION BY rd.session_id ORDER BY rd.id), 5, 4) AS INT), 999)
FROM
#req_data rd
CROSS APPLY ( VALUES (CAST(rd.id AS BINARY(4)) + CAST(NULLIF(rd.value, 1) AS BINARY(4))) ) bv (BinVal)
)
SELECT
cv.id, cv.session_id, cv.value
FROM
cte_ControlValue cv
WHERE
cv.value = 1
AND cv.ControlValue <> 7;
Results...
id session_id value
----------- ----------- -----------
5 5 1
6 5 1
7 5 1
11 6 1
13 8 1
Edit: How and why it works...
The basic premise is taken from Itzik Ben-Gan's "The Last non NULL Puzzle".
Essentially, we are relying 2 different behaviors that most people don't usually think about...
1) NULL + anything = NULL.
2) You can CAST or CONVERT an INT into a fixed length BINARY data type and it will continue to sort as an INT (as opposed to sorting like a text string).
This is easier to see when the intermittent steps are added to the query in the CTE...
SELECT
rd.id, rd.session_id, rd.value,
bv.BinVal,
SmearedBinVal = MAX(bv.BinVal) OVER (PARTITION BY rd.session_id ORDER BY rd.id),
SecondHalfAsINT = CAST(SUBSTRING(MAX(bv.BinVal) OVER (PARTITION BY rd.session_id ORDER BY rd.id), 5, 4) AS INT),
ControlValue = ISNULL(CAST(SUBSTRING(MAX(bv.BinVal) OVER (PARTITION BY rd.session_id ORDER BY rd.id), 5, 4) AS INT), 999)
FROM
#req_data rd
CROSS APPLY ( VALUES (CAST(rd.id AS BINARY(4)) + CAST(NULLIF(rd.value, 1) AS BINARY(4))) ) bv (BinVal)
Results...
id session_id value BinVal SmearedBinVal SecondHalfAsINT ControlValue
----------- ----------- ----------- ------------------ ------------------ --------------- ------------
1 5 7 0x0000000100000007 0x0000000100000007 7 7
2 5 1 NULL 0x0000000100000007 7 7
3 5 1 NULL 0x0000000100000007 7 7
4 5 12 0x000000040000000C 0x000000040000000C 12 12
5 5 1 NULL 0x000000040000000C 12 12
6 5 1 NULL 0x000000040000000C 12 12
7 5 1 NULL 0x000000040000000C 12 12
8 6 7 0x0000000800000007 0x0000000800000007 7 7
9 6 1 NULL 0x0000000800000007 7 7
10 6 3 0x0000000A00000003 0x0000000A00000003 3 3
11 6 1 NULL 0x0000000A00000003 3 3
12 7 7 0x0000000C00000007 0x0000000C00000007 7 7
13 8 1 NULL NULL NULL 999
14 8 2 0x0000000E00000002 0x0000000E00000002 2 2
15 8 3 0x0000000F00000003 0x0000000F00000003 3 3
Looking at the BinVal column, we see an 8 byte hex value for all non-[value] = 1 rows and NULLS where [value] = 1... The 1st 4 bytes are the Id (used for ordering) and the 2nd 4 bytes are [value] (used to set the "previous non-1 value" or set the whole thing to NULL.
The 2nd step is to "smear" the non-NULL values into the NULLs using the window framed MAX function, partitioned by session_id and ordered by id.
The 3rd step is to parse out the last 4 bytes and convert them back to an INT data type (SecondHalfAsINT) and deal with any nulls that result from not having any non-1 preceding value (ControlValue).
Since we can't reference a windowed function in the WHERE clause, we have to throw the query into a CTE (a derived table would work just as well) so that we can use the new ControlValue in the where clause.
SELECT CRow.id
FROM #req_data AS CRow
CROSS APPLY (SELECT MAX(id) AS id FROM #req_data PRev WHERE PRev.Id < CRow.id AND PRev.session_id = CRow.session_id AND PRev.value <> 1 ) MaxPRow
LEFT JOIN #req_data AS PRow ON MaxPRow.id = PRow.id
WHERE CRow.value = 1 AND ISNULL(PRow.value,1) <> 7
You can use the following query:
select id, session_id, value,
coalesce(sum(case when value <> 1 then 1 end)
over (partition by session_id order by id), 0) as grp
from #req_data
to get:
id session_id value grp
----------------------------
1 5 7 1
2 5 1 1
3 5 1 1
4 5 12 2
5 5 1 2
6 5 1 2
7 5 1 2
8 6 7 1
9 6 1 1
10 6 3 2
11 6 1 2
12 7 7 1
13 8 1 0
14 8 2 1
15 8 3 2
So, this query detects islands of consecutive 1 records that belong to the same group, as specified by the first preceding row with value <> 1.
You can use a window function once more to detect all 7 islands. If you wrap this in a second cte, then you can finally get the desired result by filtering out all 7 islands:
;with session_islands as (
select id, session_id, value,
coalesce(sum(case when value <> 1 then 1 end)
over (partition by session_id order by id), 0) as grp
from #req_data
), islands_with_7 as (
select id, grp, value,
count(case when value = 7 then 1 end)
over (partition by session_id, grp) as cnt_7
from session_islands
)
select id
from islands_with_7
where cnt_7 = 0 and value = 1

SQL Recursive CTE unexpectedly returns alternating sets

I am trying to get the use recursive CTE to repeat the same pattern over and over, resetting when "Scenario" increases in value. RowNumber repeats 1-21 (as desired), but whenever "Scenario" is an even number, there are too few items in the "Vals" column to feed into "Value". I can't figure out which part of the code is causing me to be 1 short for only even Scenarios.
Below are the results of the code I'm using at the bottom.
Scenario RowNumber Value Vals
1 1 A A,A,A,A,A,A,A,A,A,A,A,A,A,A,A,A,A,A,B,C
1 2 A A,A,A,A,A,A,A,A,A,A,A,A,A,A,A,A,A,B,C
1 3 A A,A,A,A,A,A,A,A,A,A,A,A,A,A,A,A,B,C
1 4 A A,A,A,A,A,A,A,A,A,A,A,A,A,A,A,B,C
1 5 A A,A,A,A,A,A,A,A,A,A,A,A,A,A,B,C
1 6 A A,A,A,A,A,A,A,A,A,A,A,A,A,B,C
1 7 A A,A,A,A,A,A,A,A,A,A,A,A,B,C
1 8 A A,A,A,A,A,A,A,A,A,A,A,B,C
1 9 A A,A,A,A,A,A,A,A,A,A,B,C
1 10 A A,A,A,A,A,A,A,A,A,B,C
1 11 A A,A,A,A,A,A,A,A,B,C
1 12 A A,A,A,A,A,A,A,B,C
1 13 A A,A,A,A,A,A,B,C
1 14 A A,A,A,A,A,B,C
1 15 A A,A,A,A,B,C
1 16 A A,A,A,B,C
1 17 A A,A,B,C
1 18 A A,B,C
1 19 A B,C
1 20 B C
1 21 C
2 1 A A,A,A,A,A,A,A,A,A,A,A,A,A,A,A,A,B,B,C
2 2 A A,A,A,A,A,A,A,A,A,A,A,A,A,A,A,B,B,C
2 3 A A,A,A,A,A,A,A,A,A,A,A,A,A,A,B,B,C
2 4 A A,A,A,A,A,A,A,A,A,A,A,A,A,B,B,C
2 5 A A,A,A,A,A,A,A,A,A,A,A,A,B,B,C
2 6 A A,A,A,A,A,A,A,A,A,A,A,B,B,C
2 7 A A,A,A,A,A,A,A,A,A,A,B,B,C
2 8 A A,A,A,A,A,A,A,A,A,B,B,C
2 9 A A,A,A,A,A,A,A,A,B,B,C
2 10 A A,A,A,A,A,A,A,B,B,C
2 11 A A,A,A,A,A,A,B,B,C
2 12 A A,A,A,A,A,B,B,C
2 13 A A,A,A,A,B,B,C
2 14 A A,A,A,B,B,C
2 15 A A,A,B,B,C
2 16 A A,B,B,C
2 17 A B,B,C
2 18 B B,C
2 19 B C
2 20 C
2 21 A A,A,A,A,A,A,A,A,A,A,A,A,A,A,A,A,A,B,B,C
3 1 A A,A,A,A,A,A,A,A,A,A,A,A,A,A,A,A,A,B,C,C
3 2 A A,A,A,A,A,A,A,A,A,A,A,A,A,A,A,A,B,C,C
3 3 A A,A,A,A,A,A,A,A,A,A,A,A,A,A,A,B,C,C
3 4 A A,A,A,A,A,A,A,A,A,A,A,A,A,A,B,C,C
3 5 A A,A,A,A,A,A,A,A,A,A,A,A,A,B,C,C
3 6 A A,A,A,A,A,A,A,A,A,A,A,A,B,C,C
3 7 A A,A,A,A,A,A,A,A,A,A,A,B,C,C
3 8 A A,A,A,A,A,A,A,A,A,A,B,C,C
3 9 A A,A,A,A,A,A,A,A,A,B,C,C
3 10 A A,A,A,A,A,A,A,A,B,C,C
3 11 A A,A,A,A,A,A,A,B,C,C
3 12 A A,A,A,A,A,A,B,C,C
3 13 A A,A,A,A,A,B,C,C
3 14 A A,A,A,A,B,C,C
3 15 A A,A,A,B,C,C
3 16 A A,A,B,C,C
3 17 A A,B,C,C
3 18 A B,C,C
3 19 B C,C
3 20 C C
3 21 C
4 1 A A,A,A,A,A,A,A,A,A,A,A,A,A,A,A,B,B,B,C
4 2 A A,A,A,A,A,A,A,A,A,A,A,A,A,A,B,B,B,C
4 3 A A,A,A,A,A,A,A,A,A,A,A,A,A,B,B,B,C
4 4 A A,A,A,A,A,A,A,A,A,A,A,A,B,B,B,C
4 5 A A,A,A,A,A,A,A,A,A,A,A,B,B,B,C
4 6 A A,A,A,A,A,A,A,A,A,A,B,B,B,C
4 7 A A,A,A,A,A,A,A,A,A,B,B,B,C
4 8 A A,A,A,A,A,A,A,A,B,B,B,C
4 9 A A,A,A,A,A,A,A,B,B,B,C
4 10 A A,A,A,A,A,A,B,B,B,C
4 11 A A,A,A,A,A,B,B,B,C
4 12 A A,A,A,A,B,B,B,C
4 13 A A,A,A,B,B,B,C
4 14 A A,A,B,B,B,C
4 15 A A,B,B,B,C
4 16 A B,B,B,C
4 17 B B,B,C
4 18 B B,C
4 19 B C
4 20 C
This is the code I used to generate the above sample. Where am I going wrong?
CREATE TABLE #temp3
(
Scenario INT
,Vals VARCHAR(64)
,LEN INT
)
;
WITH vals AS
(
SELECT
v.*
FROM
(VALUES ('A'), ('B'), ('C')) v(x)
),
CTE AS
(
SELECT CAST('A' AS VARCHAR(MAX)) AS STR, 0 AS LEN
UNION ALL
SELECT (CTE.STR + ',' + vals.x), CTE.LEN + 1
FROM
CTE
JOIN vals
ON vals.x >= RIGHT(CTE.STR, 1)
WHERE CTE.LEN < 19
)
INSERT INTO #temp3
SELECT
ROW_NUMBER() OVER(ORDER BY STR + ',C') AS Scenario
,STR + ',C' AS Vals
,LEN
FROM
CTE
WHERE
STR + 'C' LIKE '%B%'
AND LEN = 19
;
-- Split strings created above into individual characters
WITH cte(Scenario, Value, Vals) AS
(
SELECT
Scenario
,CAST(LEFT(Vals, CHARINDEX(',',Vals+',')-1) AS VARCHAR(10)) AS Value
,STUFF(Vals, 1, CHARINDEX(',',Vals+','), '') AS Vals
FROM #temp3
UNION ALL
SELECT
Scenario
,CAST(LEFT(Vals, CHARINDEX(',',Vals+',')-1) AS VARCHAR(10))
,STUFF(Vals, 1, CHARINDEX(',',Vals+','), '')
FROM cte
WHERE Vals > ''
)
SELECT
Scenario
,ROW_NUMBER() OVER (PARTITION BY Scenario ORDER BY Scenario) RowNumber
,Value
,Vals
FROM cte t
I'm not exactly sure what the problem you are describing is, but the ROW_NUMBER() should use an ORDER BY clause that completely orders the rows in each partition.
When you use "PARTITION BY Scenario ORDER BY Scenario" the order in which the ROW_NUMBER() values are assigned is undefined. Try something like
WITH cte(Scenario, depth, Value, Vals) AS
(
SELECT
Scenario, 0 depth
,CAST(LEFT(Vals, CHARINDEX(',',Vals+',')-1) AS VARCHAR(10)) AS Value
,STUFF(Vals, 1, CHARINDEX(',',Vals+','), '') AS Vals
FROM #temp3
UNION ALL
SELECT
Scenario, depth+1
,CAST(LEFT(Vals, CHARINDEX(',',Vals+',')-1) AS VARCHAR(10))
,STUFF(Vals, 1, CHARINDEX(',',Vals+','), '')
FROM cte
WHERE Vals > ''
)
SELECT
Scenario
,depth
,ROW_NUMBER() OVER (PARTITION BY Scenario ORDER BY depth ) RowNumber
,Value
,Vals
FROM cte t

Different select criteria in odd and even events

I have a table which looks like this ( 10 billion rows)
AID BID CID
1 2 1
1 6 9
0 1 4
1 3 2
1 100 2
0 4 2
0 0 1
The AID could only be 0 or 1. BID and CID could be anything.
Now I want to select events first with AID=1 and then AID=0, and again AID=1 and then AID=0.
The idea is to select equal numbers of AID=1 and AID=0 event.
How can I achieve that?
The expected result is
AID BID CID
1 2 1
0 1 4
1 6 9
0 4 2
1 3 2
0 0 1
;WITH cte AS (
select *
FROM (VALUES
(1, 2, 1),
(1, 6, 9),
(0, 1, 4),
(1, 3, 2),
(1, 100, 2),
(0, 4, 2),
(0, 0, 1)
) as t(AID, BID, CID)
),
withrow AS (
SELECT ROW_NUMBER() OVER (PARTITION BY AID ORDER BY AID) as RN, *
FROM cte)
SELECT AID,BID,CID
FROM withrow
ORDER BY RN asc , aid desc
Output:
AID BID CID
----------- ----------- -----------
1 100 2
0 4 2
1 3 2
0 1 4
1 6 9
0 0 1
1 2 1
(7 row(s) affected)

Assign rownumber in SQL grouped on value and n rows per rownumber

I am trying to generate a report with 3 rows per page for each order number using the following SQL.
As you can see from the results the fields Actual & Expected do not match up.
Any help would be appreciated.
set nocount on
DECLARE #Orders TABLE (Expected int, OrderNumber INT, OrderDetailsNumber int)
Insert into #orders values (0,1,1)
Insert into #orders values (0,1,2)
Insert into #orders values (0,1,3)
Insert into #orders values (1,1,4)
Insert into #orders values (2,2,5)
Insert into #orders values (2,2,6)
Insert into #orders values (2,2,7)
Insert into #orders values (3,2,8)
Insert into #orders values (3,2,9)
select cast(((row_number() over( order by OrderNumber)) -1) /3 as int) as [Actual]
,*
from #orders
Actual Expected OrderNumber OrderDetailsNumber
----------- ----------- ----------- ------------------
0 0 1 1
0 0 1 2
0 0 1 3
1 1 1 4
1 2 2 5
1 2 2 6
2 2 2 7
2 3 2 8
2 3 2 9
Right, after a couple of edits I have the final answer:
SELECT DENSE_RANK() OVER (Order BY OrderNumber, floor(RowNumber/3)) - 1 AS Actual,
Expected,
OrderNumber,
OrderDetailsNumber
FROM
(
SELECT *,
ROW_NUMBER() OVER (
PARTITION BY OrderNumber
ORDER BY OrderDetailsNumber
) - 1 AS RowNumber
FROM #Orders
) RowNumberTable
Gives the result (with extra rows for testing):
Actual Expected OrderNumber OrderDetailsNumber
-------------------- ----------- ----------- ------------------
0 0 1 1
0 0 1 2
0 0 1 3
1 1 1 4
1 1 1 12
2 2 2 5
2 2 2 6
2 2 2 7
3 3 2 8
3 3 2 9
3 4 2 11
4 3 2 27
5 5 3 10
This only works where OrderDetailsNumber is unique such that the result is deterministic.
Edit
I've now got the complete code working, however the dependence on OrderDetailsNumber being in order is very iffy, hopefully you can test and edit as required.
Edit 2
I've put the 'golfed' version in the main answer.
WITH FirstCTE AS
(
SELECT
OrderNumber,
OrderDetailsNumber,
Expected,
ROW_NUMBER() OVER (
PARTITION BY OrderNumber
ORDER BY OrderDetailsNumber
) - 1 AS RowNumber
FROM #Orders
)
, SecondCTE AS
(
SELECT OrderDetailsNumber as odn,
floor(RowNumber/3) as page_for_order_number,
DENSE_RANK() OVER (Order BY OrderNumber, floor(RowNumber/3)) - 1 AS Actual
FROM FirstCTE
)
SELECT c2.page_for_order_number,
c1.RowNumber,
C2.Actual,
c1.Expected,
c1.OrderNumber,
c1.OrderDetailsNumber
FROM FirstCTE AS c1
INNER JOIN SecondCTE AS c2
on c2.odn = c1.OrderDetailsNumber
This strikes me as a bit of a hack, but it works...
Divide the row_number() by 3, and use CEILINGto get the smallest integer greater than or equal to the result of that division.
select row_number() over( order by OrderNumber) as [Actual],
cast (row_number() over(order by ordernumber) as decimal(5,1)) / 3,
CEILING(cast (row_number() over(order by ordernumber) as decimal(5,1)) / 3)as GRPR,
*
from #orders
EDIT: Dang it, can never get results to line up. The 3rd column in the result set is your "page number".
Which yields:
Actual (No column name) PG_NBR Expected OrderNumber OrderDetailsNumber
1 0.333333 1 0 1 1
2 0.666666 1 0 1 2
3 1.000000 1 0 1 3
4 1.333333 2 1 1 4
5 1.666666 2 2 2 5
6 2.000000 2 2 2 6
7 2.333333 3 2 2 7
8 2.666666 3 3 2 8
9 3.000000 3 3 2 9