I have a problem with finding matches in one Oracle-Table and I hope you can help me.
I have one bill-table containing booking data looking like this:
ID GROUP Bill-Number Value Partner-Number
1 1 111 10,90
2 1 751 40,28
3 1 438 125,60
4 1 659 -10,90 987
5 1 387 -165,88 755
6 1 774 -100,10
7 1 664 -80,12
8 1 259 180,22 999
9 2 774 -200,10
10 2 664 -80,12
11 2 259 280,22 777
As you can see, we have some bills that are containing costs.
Some time later, there is coming the counter-bill that is summarizing the previous costs. The sum of the bills and the associated counter-bill is creating a Value of 0.
Example: value of (id 2 + id 3 = id 5*-1)
or in numbers: 40,28 + 125,60 + (-165,88) = 0
The counter-bills are containing a "Partner-Number". I need to add this information to the associated bills.
The solution should look like this:
ID GROUP Bill-Number Value Partner-Number
1 1 111 10,90 987
2 1 751 40,28 755
3 1 438 125,60 755
4 1 659 -10,90 987
5 1 387 -165,88 755
6 1 774 -100,10 999
7 1 664 -80,12 999
8 1 259 180,22 999
9 2 774 -200,10 777
10 2 664 -80,12 777
11 2 259 280,22 777
I have to match the bills only inside a group. (ID is my primary key)
As long as the group is containing one counter-bill with a 1:1 relation to a bill it is doable for me.
But how can I find the matches like in group 1 where the relation is 1:N? (the group contains multiple counter-bills)
I hope you can help me - thank you in advance :)
The following SQL code, has been tested with Oracle 12c and 18c, respectively. Ideas/steps:
{1} Split the original table into a MINUSES and PLUSES table, containing just positive numbers, saving us a few function calls later.
{2} Create 2 views that will find combinations of pluses that fit a particular minus (and vice versa).
{3} List all components, in "comma-separated" form in a table called ALLCOMPONENTS.
{4} Table GAPFILLERS: Expand the (comma separated) IDs of all components, thereby obtaining all necessary values to fill the gaps in the original table.
{5} LEFT JOIN the original table to the GAPFILLERS.
Original table/data
create table bills ( id primary key, bgroup, bnumber, bvalue, partner )
as
select 1, 1, 111, 10.90, null from dual union all
select 2, 1, 751, 40.28, null from dual union all
select 3, 1, 438, 125.60, null from dual union all
select 4, 1, 659, -10.90, 987 from dual union all
select 5, 1, 387, -165.88, 755 from dual union all
select 6, 1, 774, -100.10, null from dual union all
select 7, 1, 664, -80.12, null from dual union all
select 8, 1, 259, 180.22, 999 from dual union all
select 9, 2, 774, -200.10, null from dual union all
select 10, 2, 664, -80.12, null from dual union all
select 11, 2, 259, 280.22, 777 from dual ;
{1} split the table into a PLUS and a MINUS table
-- MINUSes
create table minuses as
select id
, bgroup as mgroup
, bnumber as mnumber
, bvalue * -1 as mvalue
, partner as mpartner
from bills where bvalue < 0 ;
-- PLUSes
create table pluses as
select id
, bgroup as pgroup
, bnumber as pnumber
, bvalue as pvalue
, partner as ppartner
from bills where bvalue >= 0 ;
{2} View: find components of PLUSvalues
-- used here: "recursive subquery factoring"
-- and LATERAL join (needs Oracle 12c or later)
create or replace view splitpluses
as
with recursiveclause ( nextid, mgroup, tvalue, componentid )
as (
select -- anchor member
id as nextid
, mgroup as mgroup
, mvalue as tvalue -- total value
, to_char( id ) as componentid
from minuses
union all
select -- recursive member
M.id
, R.mgroup
, R.tvalue + M.mvalue
, R.componentid || ',' || to_char( M.id )
from recursiveclause R
join minuses M
on M.id > R.nextid and M.mgroup = R.mgroup -- only look at values in the same group
)
--
select
mgroup
, tvalue as plusvalue
, componentid as minusids
, ppartner
from
recursiveclause R
, lateral ( select ppartner from pluses P where R.tvalue = P.pvalue ) -- fetch the partner id
where
tvalue in ( select pvalue from pluses where ppartner is not null ) -- get all relevant pvalues that must be broken down into components
and ppartner is not null -- do this for all pluses that have a partner id
;
{2b} View: find components of MINUSvalues
create or replace view splitminuses
as
with recursiveclause ( nextid, pgroup, tvalue, componentid )
as (
select -- anchor member
id as nextid
, pgroup as pgroup
, pvalue as tvalue -- total value
, to_char( id ) as componentid
from pluses
union all
select -- recursive member
P.id
, R.pgroup
, R.tvalue + P.pvalue
, R.componentid || ',' || to_char( P.id )
from recursiveclause R
join pluses P
on P.id > R.nextid and P.pgroup = R.pgroup
)
--
select
pgroup
, tvalue as minusvalue
, componentid as plusids
, mpartner
from
recursiveclause R
, lateral ( select mpartner from minuses M where R.tvalue = M.mvalue )
where
tvalue in ( select mvalue from minuses where mpartner is not null )
and mpartner is not null
;
The views give us the following result sets:
SQL> select * from splitpluses;
MGROUP PLUSVALUE MINUSIDS PPARTNER
1 180.22 6,7 999
2 280.22 9,10 777
SQL> select * from splitminuses ;
PGROUP MINUSVALUE PLUSIDS MPARTNER
1 10.9 1 987
1 165.88 2,3 755
{3} Table ALLCOMPONENTS: list of all "components"
create table allcomponents ( type_, group_, value_, cids_, partner_ )
as
select 'components of PLUS' as type_, M.* from splitminuses M
union all
select 'components of MINUS', P.* from splitpluses P
;
SQL> select * from allcomponents ;
TYPE_ GROUP_ VALUE_ CIDS_ PARTNER_
components of PLUS 1 10.9 1 987
components of PLUS 1 165.88 2,3 755
components of MINUS 1 180.22 6,7 999
components of MINUS 2 280.22 9,10 777
{4} Table GAPFILLERS: derived from ALLCOMPONENTS, contains all values we need to fill the "gaps" in the original table.
-- One row for each CSV (comma-separated value) of ALLCOMPONENTS
create table gapfillers
as
select unique type_, group_, value_
, trim( regexp_substr( cids_, '[^,]+', 1, level ) ) cids_
, partner_
from (
select type_, group_, value_, cids_, partner_
from allcomponents
) AC
connect by instr( cids_, ',', 1, level - 1 ) > 0
order by group_, partner_ ;
SQL> select * from gapfillers ;
TYPE_ GROUP_ VALUE_ CIDS_ PARTNER_
components of PLUS 1 165.88 2 755
components of PLUS 1 165.88 3 755
components of PLUS 1 10.9 1 987
components of MINUS 1 180.22 6 999
components of MINUS 1 180.22 7 999
components of MINUS 2 280.22 10 777
components of MINUS 2 280.22 9 777
7 rows selected.
{5} The final LEFT JOIN
select
B.id, bgroup, bnumber, bvalue
, case
when B.partner is null then G.partner_
else B.partner
end as partner
from bills B
left join gapfillers G on B.id = G.cids_
order by 1 ;
-- result
ID BGROUP BNUMBER BVALUE PARTNER
1 1 111 10.9 987
2 1 751 40.28 755
3 1 438 125.6 755
4 1 659 -10.9 987
5 1 387 -165.88 755
6 1 774 -100.1 999
7 1 664 -80.12 999
8 1 259 180.22 999
9 2 774 -200.1 777
10 2 664 -80.12 777
11 2 259 280.22 777
11 rows selected.
DBFIDDLE here.
SQLis idealy designe for a brute force approach to solve the problems (the only problm is, that for large data the query would hang forever).
Here a possible step by step approach, considering only one counter-bill in teh first step, than two in the second step etc.
I'm showing queries for the first two steps, you should get the idea how to proceed - most probably with dynamic SQL in a loop.
The first step is trivial self-joining joining the table and constraining the GROUP and value. A result table is created, whichis used further to limit already matched rows.
create table tab_match as
-- 1 row match
select b.ID, b.GROUP_ID, b.BILL_NUMBER, b.VALUE, a.partner_number from tab a
join tab b
on a.group_id = b.group_id and /* same group */
-1 * a.value = b.value /* oposite value */
where a.partner_number is not NULL /* consider group row only */
In the second step you repeats the same, only adding one join (we investigate two sub.bills) with an additional constraint on the total value -1 * a.value = (b.value + c.value)
Also we suppress all partner_numbers and bills already assigned. The result is inserted in the temporary table.
insert into tab_match (ID, GROUP_ID, BILL_NUMBER, VALUE, PARTNER_NUMBER)
select b.ID, b.GROUP_ID, b.BILL_NUMBER, b.VALUE, a.partner_number partner_number_match from tab a
join tab b
on a.group_id = b.group_id and /* same group */
sign(a.value) * sign(b.value) < 0 and /* values must have oposite signs */
abs(a.value) > abs(b.value) /* the partial value is lower than the sum */
join tab c /* join to 2nd table */
on a.group_id = c.group_id and
sign(a.value) * sign(c.value) < 0 and
abs(a.value) > abs(c.value) and
-1 * a.value = (b.value + c.value)
where a.partner_number is not NULL and /* consider open group row only */
a.partner_number not in (select partner_number from tab_match) and
a.id not in (select id from tab_match) /* ignore matched rows */
;
You must proceed with processing of 3,4 etc. rows until all partner_numbers and bills are assigned.
Add a next join
join tab d
on a.group_id = d.group_id and
sign(a.value) * sign(d.value) < 0 and
abs(a.value) > abs(d.value)
and adjust the total sum predicate in each step
-1 * a.value = (b.value + c.value + d.value)
Good Luck;)
Related
Lets say I have a table called #OrgList
CREATE TABLE #OrgList (
OrgUnitId int,
ParentOrgUnitId int,
PersonId int,
isExcluded bit
);
INSERT INTO #OrgList(OrgUnitId, ParentOrgUnitId, PersonId, isExcluded) VALUES
(1, NULL, 100, 0), (2,1, 101, 0), (3,1,102,0), (4,2,103,1), (5,2,104,0), (6,3,105,0), (7,4,106,0), (8,4,107,0), (9,4,108,0), (10,4,109,0), (11,4,110,1), (12,11,111,0)
My hierarchy tree structure looks like this
and my cte like this:
;
with cte as (
select OrgUnitId, ParentOrgUnitId, PersonId, isExcluded , 0 as level_num
from #OrgList
where ParentOrgUnitId is null
UNION ALL
select o.OrgUnitId, o.ParentOrgUnitId, o.PersonId, o.isExcluded , level_num+1 as level_num
from #OrgList o
join cte on o.ParentOrgUnitId=cte.OrgUnitId
)
select * from cte
I exclude OrgUnitId=4 and =11, then I want to update my recursive query that will recalculate the tree and show the new tree details, including level moves (there can be more levels and more consecutive exclusions, of course except the root node):
You should just add a second UNION ALL in your cte:
with cte as (
select OrgUnitId, ParentOrgUnitId, PersonId, isExcluded , 0 as level_num
from #OrgList
where ParentOrgUnitId is null
UNION ALL
select o.OrgUnitId, o.ParentOrgUnitId, o.PersonId, o.isExcluded , level_num+1 as level_num
from #OrgList o
join cte on o.ParentOrgUnitId=cte.OrgUnitId
where cte.isExcluded = 0
UNION ALL
select o.OrgUnitId, cte.ParentOrgUnitId, o.PersonId, o.isExcluded , level_num as level_num
from #OrgList o
join cte on o.ParentOrgUnitId=cte.OrgUnitId
where cte.isExcluded = 1
)
select * from cte
My approach:
Extend your initial CTE with an exclusion counter (ExclusionCount), counting the number of excluded nodes going from root to leaf nodes.
Add another recursive CTE to construct the upward path (cte_upwards) for each leaf node. Now decrement the counter added in the initial CTE.
Use a cross apply to select the first node where the upward path reaches an exlusion count of zero.
Solution:
with cte as -- initial CTE
(
select OrgUnitId,
ParentOrgUnitId,
PersonId,
IsExcluded,
convert(int, IsExcluded) as 'ExclusionCount', -- new counter
0 as 'level_num'
from #OrgList
where ParentOrgUnitId is null
union all
select o.OrgUnitId,
o.ParentOrgUnitId,
o.PersonId,
o.IsExcluded,
cte.ExclusionCount + convert(int, o.isExcluded), -- increment counter
cte.level_num + 1
from #OrgList o
join cte on o.ParentOrgUnitId = cte.OrgUnitId
),
cte_upwards as
(
select cte.OrgUnitId,
cte.ParentOrgUnitId as 'NewParentOrgUnitId',
cte.IsExcluded,
cte.ExclusionCount,
cte.level_num
from cte
where cte.ParentOrgUnitId is not null -- only leaf nodes (not a root)
and not exists ( select top 1 'x' -- only leaf nodes (not an intermediate node)
from cte cp
where cp.ParentOrgUnitId = cte.OrgUnitId )
union all
select cte_upwards.OrgUnitId,
cte.ParentOrgUnitId,
cte.IsExcluded,
cte_upwards.ExclusionCount - cte.IsExcluded, -- decrement counter
cte.level_num
from cte_upwards
join cte
on cte.OrgUnitId = cte_upwards.NewParentOrgUnitId
)
select cte.OrgUnitId,
cte.ParentOrgUnitId,
cte.IsExcluded,
x.NewParentOrgUnitId,
coalesce(x.NewParentOrgUnitId, cte.ParentOrgUnitId) as 'Recalculated'
from cte
outer apply ( select top 1 cu.NewParentOrgUnitId
from cte_upwards cu
where cu.OrgUnitId = cte.OrgUnitId
and cu.ExclusionCount = 0 -- node without excluded parent nodes
order by cu.level_num desc ) x -- select lowest node in upwards path
order by cte.OrgUnitId;
Result:
OrgUnitId ParentOrgUnitId IsExcluded NewParentOrgUnitId Recalculated
----------- --------------- ---------- ------------------ ------------
1 NULL 0 NULL NULL
2 1 0 NULL 1
3 1 0 NULL 1
4 2 1 NULL 2
5 2 0 2 2
6 3 0 3 3
7 4 0 2 2
8 4 0 2 2
9 4 0 2 2
10 4 0 2 2
11 4 1 NULL 4
12 11 0 2 2
I've added a VirtualParentOrgUnitId, which contains the parent with excluded nodes taken into account. I've also added a counter, VirtualDistance, which will report how many real hops there are between this node and it's virtual parent.
VirtualParentOrgUnitId will use the parent's ID if is not excluded, otherwise it will use it's parents's VirtualParentOrgUnitId, which allows chaining of multiple levels.
DROP TABLE IF EXISTS #OrgList
CREATE TABLE #OrgList (
OrgUnitId int,
ParentOrgUnitId int,
PersonId int,
isExcluded bit
);
INSERT INTO #OrgList(OrgUnitId, ParentOrgUnitId, PersonId, isExcluded) VALUES
(1, NULL, 100, 0), (2,1, 101, 0), (3,1,102,0), (4,2,103,1), (5,2,104,0), (6,3,105,0), (7,4,106,0), (8,4,107,0), (9,4,108,0), (10,4,109,0), (11,4,110,1), (12,11,111,0)
DROP TABLE IF EXISTS #Excludes
CREATE Table #Excludes (
OrgUnitId int
);
INSERT INTO #Excludes VALUES (4), (11);
with cte as (
select OrgUnitId, ParentOrgUnitId, ParentOrgUnitId VirtualParentOrgUnitId, 1 as VirtualDistance , PersonId, isExcluded , 0 as level_num
from #OrgList
where ParentOrgUnitId is null
UNION ALL
select o.OrgUnitId, o.ParentOrgUnitId, IIF(o.ParentOrgUnitId IN (SELECT OrgUnitId FROM #Excludes),cte.VirtualParentOrgUnitId, o.ParentOrgUnitId ), IIF(o.ParentOrgUnitId IN (SELECT OrgUnitId FROM #Excludes),VirtualDistance + 1, 1 ), o.PersonId, o.isExcluded , level_num+1 as level_num
from #OrgList o
join cte on o.ParentOrgUnitId=cte.OrgUnitId
)
select * from cte
Here are the results:
OrgUnitId ParentOrgUnitId VirtualParentOrgUnitId VirtualDistance PersonId isExcluded level_num
----------- --------------- ---------------------- --------------- ----------- ---------- -----------
1 NULL NULL 0 100 0 0
2 1 1 0 101 0 1
3 1 1 0 102 0 1
6 3 3 0 105 0 2
4 2 2 0 103 1 2
5 2 2 0 104 0 2
7 4 2 1 106 0 3
8 4 2 1 107 0 3
9 4 2 1 108 0 3
10 4 2 1 109 0 3
11 4 2 1 110 1 3
12 11 2 2 111 0 4
;
with cte as (
select OrgUnitId, ParentOrgUnitId, PersonId, isExcluded , 0 as level_num, 0 as level_after_exclusions,
cast(',' as varchar(max)) + case isExcluded when 1 then cast(OrgUnitId as varchar(20)) else '' end as excludedmembers,
case isExcluded when 1 then ParentOrgUnitId end as newParentId
from #OrgList
where ParentOrgUnitId is null
UNION ALL
select o.OrgUnitId, o.ParentOrgUnitId, o.PersonId, o.isExcluded , level_num + 1, level_after_exclusions + case o.isExcluded when 1 then 0 else 1 end,
excludedmembers + case o.isExcluded when 1 then cast(o.OrgUnitId as varchar(20))+',' else '' end,
case when excludedmembers like '%,'+cast(o.ParentOrgUnitId as varchar(20))+',%' then newParentId else o.ParentOrgUnitId end
from #OrgList o
join cte on o.ParentOrgUnitId=cte.OrgUnitId
)
select *, level_num - level_after_exclusions as shiftbylevels
from cte
I have parent child relation SQL table
LOCATIONDETAIL Table
OID NAME PARENTOID
1 HeadSite 0
2 Subsite1 1
3 subsite2 1
4 subsubsite1 2
5 subsubsite2 2
6 subsubsite3 3
RULESETCONFIG
OID LOCATIONDETAILOID VALUE
1 1 30
2 4 15
If i provide Input as LOCATIONDETAIL 6, i should get RULESETCONFIG value as 30
because for
LOCATIONDETAIL 6, parentid is 3 and for LOCATIONDETAIL 3 there is no value in RULESETCONFIG,
LOCATIONDETAIL 3 has parent 1 which has value in RULESETCONFIG
if i provide Input as LOCATIONDETAIL 4, i should get RULESETCONFIG value 15
i have code to populate the tree, but don't know how to find the next available Parent
;WITH GLOBALHIERARCHY AS
(
SELECT A.OID,A.PARENTOID,A.NAME
FROM LOCATIONDETAIL A
WHERE OID = #LOCATIONDETAILOID
UNION ALL
SELECT A.OID,A.PARENTOID,A.NAME
FROM LOCATIONDETAIL A INNER JOIN GLOBALHIERARCHY GH ON A.PARENTOID = GH.OID
)
SELECT * FROM GLOBALHIERARCHY
This will return the next parent with a value. If you want to see all, remove the top 1 from the final select.
dbFiddle
Example
Declare #Fetch int = 4
;with cteHB as (
Select OID
,PARENTOID
,Lvl=1
,NAME
From LOCATIONDETAIL
Where OID=#Fetch
Union All
Select R.OID
,R.PARENTOID
,P.Lvl+1
,R.NAME
From LOCATIONDETAIL R
Join cteHB P on P.PARENTOID = R.OID)
Select Top 1
Lvl = Row_Number() over (Order By A.Lvl Desc )
,A.OID
,A.PARENTOID
,A.NAME
,B.Value
From cteHB A
Left Join RULESETCONFIG B on A.OID=B.OID
Where B.VALUE is not null
and A.OID <> #Fetch
Order By 1 Desc
Returns when #Fetch=4
Lvl OID PARENTOID NAME Value
2 2 1 Subsite1 15
Returns when #Fetch=6
Lvl OID PARENTOID NAME Value
1 1 0 HeadSite 30
This should do the job:
;with LV as (
select OID ID,PARENTOID PID,NAME NAM, VALUE VAL FROM LOCATIONDETAIL
left join RULESETCONFIG ON LOCATIONDETAILOID=OID
), GH as (
select ID gID,PID gPID,NAM gNAM,VAL gVAL from LV where ID=#OID
union all
select ID,PID,NAM,VAL FROM LV INNER JOIN GH ON gVAL is NULL AND gPID=ID
)
select * from GH WHERE gVAL>0
See here for e little demo: http://rextester.com/OXD40496
My table is like this:
DrawDate num DrawName etc...
2015-08-01 2 Draw 2
2015-08-01 3 Draw 3
2015-08-02 1 Draw 1
2015-08-02 2 Draw 2
2015-08-02 4 Draw 4
2015-08-03 3 Draw 3
2015-08-04 1 Draw 1
2015-08-04 2 Draw 2
2015-08-04 3 Draw 3
2015-08-04 4 Draw 4
i would like to get the missing sequence number (num column) in the Table.
How could i achieve this?
I had find so many solutions but there are drawbacks of not including start number or the hardcoding the first and last number for sequence. But i could not hardcode there any value. Start value is specified by 1 but no end sequence is defined. it could be 4 as in the table above or in some cases may be 18 or 20 but i need to find the duplicates by drawdate maximum value (the last one).
Edit
The Final Result would be
DrawDate missingnum
2015-08-01 1
2015-08-02 3
2015-08-03 1
2015-08-03 2
Let me guess:
declare #t table
(
DrawDate date,
Num int
)
insert into #t values
('2015-08-01' , 2 ),
('2015-08-01' , 3 ),
('2015-08-02' , 1 ),
('2015-08-02' , 2 ),
('2015-08-02' , 4 ),
('2015-08-03' , 3 ),
('2015-08-04' , 1 ),
('2015-08-04' , 2 ),
('2015-08-04' , 3 ),
('2015-08-04' , 4 );
with AvailableDates as
(
select distinct DrawDate from #t
),
AvailableNumbers as
(
select distinct Num from #t
),
CrossJoined as
(
select
DrawDate,
Num
from
AvailableDates
cross join AvailableNumbers
)
select
*
from
CrossJoined cj
left join #t t on t.DrawDate = cj.DrawDate and t.Num = cj.Num
where
t.Num is null;
Edit
The second guess (after the comment reading):
with MaxNumberPerDate as
(
select DrawDate, MaxNum = max(Num) from #t group by DrawDate
),
N as
(
select n from (values (1),(2),(3),(4),(5),(6),(7),(8),(9),(10)) t(n)
),
Multiplied as
(
select
DrawDate,
Num = N.n
from
MaxNumberPerDate md
inner join N on N.n <= md.MaxNum
)
select
m.*
from
Multiplied m
left join #t t on t.DrawDate = m.DrawDate and t.Num = m.Num
where
t.Num is null;
I have the following table in Postgres that has overlapping data in the two columns a_sno and b_sno.
create table data
( a_sno integer not null,
b_sno integer not null,
PRIMARY KEY (a_sno,b_sno)
);
insert into data (a_sno,b_sno) values
( 4, 5 )
, ( 5, 4 )
, ( 5, 6 )
, ( 6, 5 )
, ( 6, 7 )
, ( 7, 6 )
, ( 9, 10)
, ( 9, 13)
, (10, 9 )
, (13, 9 )
, (10, 13)
, (13, 10)
, (10, 14)
, (14, 10)
, (13, 14)
, (14, 13)
, (11, 15)
, (15, 11);
As you can see from the first 6 rows data values 4,5,6 and 7 in the two columns intersects/overlaps that need to partitioned to a group. Same goes for rows 7-16 and rows 17-18 which will be labeled as group 2 and 3 respectively.
The resulting output should look like this:
group | value
------+------
1 | 4
1 | 5
1 | 6
1 | 7
2 | 9
2 | 10
2 | 13
2 | 14
3 | 11
3 | 15
Assuming that all pairs exists in their mirrored combination as well (4,5) and (5,4). But the following solutions work without mirrored dupes just as well.
Simple case
All connections can be lined up in a single ascending sequence and complications like I added in the fiddle are not possible, we can use this solution without duplicates in the rCTE:
I start by getting minimum a_sno per group, with the minimum associated b_sno:
SELECT row_number() OVER (ORDER BY a_sno) AS grp
, a_sno, min(b_sno) AS b_sno
FROM data d
WHERE a_sno < b_sno
AND NOT EXISTS (
SELECT 1 FROM data
WHERE b_sno = d.a_sno
AND a_sno < b_sno
)
GROUP BY a_sno;
This only needs a single query level since a window function can be built on an aggregate:
Get the distinct sum of a joined table column
Result:
grp a_sno b_sno
1 4 5
2 9 10
3 11 15
I avoid branches and duplicated (multiplicated) rows - potentially much more expensive with long chains. I use ORDER BY b_sno LIMIT 1 in a correlated subquery to make this fly in a recursive CTE.
Create a unique index on a non-unique column
Key to performance is a matching index, which is already present provided by the PK constraint PRIMARY KEY (a_sno,b_sno): not the other way round (b_sno, a_sno):
Is a composite index also good for queries on the first field?
WITH RECURSIVE t AS (
SELECT row_number() OVER (ORDER BY d.a_sno) AS grp
, a_sno, min(b_sno) AS b_sno -- the smallest one
FROM data d
WHERE a_sno < b_sno
AND NOT EXISTS (
SELECT 1 FROM data
WHERE b_sno = d.a_sno
AND a_sno < b_sno
)
GROUP BY a_sno
)
, cte AS (
SELECT grp, b_sno AS sno FROM t
UNION ALL
SELECT c.grp
, (SELECT b_sno -- correlated subquery
FROM data
WHERE a_sno = c.sno
AND a_sno < b_sno
ORDER BY b_sno
LIMIT 1)
FROM cte c
WHERE c.sno IS NOT NULL
)
SELECT * FROM cte
WHERE sno IS NOT NULL -- eliminate row with NULL
UNION ALL -- no duplicates
SELECT grp, a_sno FROM t
ORDER BY grp, sno;
Less simple case
All nodes can be reached in ascending order with one or more branches from the root (smallest sno).
This time, get all greater sno and de-duplicate nodes that may be visited multiple times with UNION at the end:
WITH RECURSIVE t AS (
SELECT rank() OVER (ORDER BY d.a_sno) AS grp
, a_sno, b_sno -- get all rows for smallest a_sno
FROM data d
WHERE a_sno < b_sno
AND NOT EXISTS (
SELECT 1 FROM data
WHERE b_sno = d.a_sno
AND a_sno < b_sno
)
)
, cte AS (
SELECT grp, b_sno AS sno FROM t
UNION ALL
SELECT c.grp, d.b_sno
FROM cte c
JOIN data d ON d.a_sno = c.sno
AND d.a_sno < d.b_sno -- join to all connected rows
)
SELECT grp, sno FROM cte
UNION -- eliminate duplicates
SELECT grp, a_sno FROM t -- add first rows
ORDER BY grp, sno;
Unlike the first solution, we don't get a last row with NULL here (caused by the correlated subquery).
Both should perform very well - especially with long chains / many branches. Result as desired:
SQL Fiddle (with added rows to demonstrate difficulty).
Undirected graph
If there are local minima that cannot be reached from the root with ascending traversal, the above solutions won't work. Consider Farhęg's solution in this case.
I want to say another way, it may be useful, you can do it in 2 steps:
1. take the max(sno) per each group:
select q.sno,
row_number() over(order by q.sno) gn
from(
select distinct d.a_sno sno
from data d
where not exists (
select b_sno
from data
where b_sno=d.a_sno
and a_sno>d.a_sno
)
)q
result:
sno gn
7 1
14 2
15 3
2. use a recursive cte to find all related members in groups:
with recursive cte(sno,gn,path,cycle)as(
select q.sno,
row_number() over(order by q.sno) gn,
array[q.sno],false
from(
select distinct d.a_sno sno
from data d
where not exists (
select b_sno
from data
where b_sno=d.a_sno
and a_sno>d.a_sno
)
)q
union all
select d.a_sno,c.gn,
d.a_sno || c.path,
d.a_sno=any(c.path)
from data d
join cte c on d.b_sno=c.sno
where not cycle
)
select distinct gn,sno from cte
order by gn,sno
Result:
gn sno
1 4
1 5
1 6
1 7
2 9
2 10
2 13
2 14
3 11
3 15
here is the demo of what I did.
Here is a start that may give some ideas on an approach. The recursive query starts with a_sno of each record and then tries to follow the path of b_sno until it reaches the end or forms a cycle. The path is represented by an array of sno integers.
The unnest function will break the array into rows, so a sno value mapped to the path array such as:
4, {6, 5, 4}
will be transformed to a row for each value in the array:
4, 6
4, 5
4, 4
The array_agg then reverses the operation by aggregating the values back into a path, but getting rid of the duplicates and ordering.
Now each a_sno is associated with a path and the path forms the grouping. dense_rank can be used to map the grouping (cluster) to a numeric.
SELECT array_agg(DISTINCT map ORDER BY map) AS cluster
,sno
FROM ( WITH RECURSIVE x(sno, path, cycle) AS (
SELECT a_sno, ARRAY[a_sno], false FROM data
UNION ALL
SELECT b_sno, path || b_sno, b_sno = ANY(path)
FROM data, x
WHERE a_sno = x.sno
AND NOT cycle
)
SELECT sno, unnest(path) AS map FROM x ORDER BY 1
) y
GROUP BY sno
ORDER BY 1, 2
Output:
cluster | sno
--------------+-----
{4,5,6,7} | 4
{4,5,6,7} | 5
{4,5,6,7} | 6
{4,5,6,7} | 7
{9,10,13,14} | 9
{9,10,13,14} | 10
{9,10,13,14} | 13
{9,10,13,14} | 14
{11,15} | 11
{11,15} | 15
(10 rows)
Wrap it one more time for the ranking:
SELECT dense_rank() OVER(order by cluster) AS rank
,sno
FROM (
SELECT array_agg(DISTINCT map ORDER BY map) AS cluster
,sno
FROM ( WITH RECURSIVE x(sno, path, cycle) AS (
SELECT a_sno, ARRAY[a_sno], false FROM data
UNION ALL
SELECT b_sno, path || b_sno, b_sno = ANY(path)
FROM data, x
WHERE a_sno = x.sno
AND NOT cycle
)
SELECT sno, unnest(path) AS map FROM x ORDER BY 1
) y
GROUP BY sno
ORDER BY 1, 2
) z
Output:
rank | sno
------+-----
1 | 4
1 | 5
1 | 6
1 | 7
2 | 9
2 | 10
2 | 13
2 | 14
3 | 11
3 | 15
(10 rows)
My table structure is below :
MyTable (ID Int, AccID1 Int, AccID2 Int, AccID3 int)
ID AccID1 AccID2 AccID3
---- -------- -------- --------
1 12 2 NULL
2 4 12 1
3 NULL NULL 5
4 7 NULL 1
I want to create indexed view with below output :
ID Level Value
---- ----- -------
1 1 12
1 2 2
2 1 4
2 2 12
2 3 1
3 3 5
4 1 7
4 3 1
EDIT :
My table is very huge and I want to have above output.
I can Get my query such as below :
Select ID,
Case StrLevel
When 'AccID1' Then 1
When 'AccID2' Then 2
Else 3
End AS [Level],
AccID as Value
From (
Select A.ID, A.AccID1, A.AccID2, A.AccID3
From MyTable A
)as p
UNPIVOT (AccID FOR [StrLevel] IN (AccID1, AccID2, AccID3)) AS unpvt
or
Select *
from (
select MyTable.ID,
num.n as [Level],
Case Num.n
When 1 Then MyTable.AccID1
When 2 Then MyTable.AccID2
Else MyTable.AccID3
End AS AccID
from myTable
cross join (select 1
union select 2
union select 3)Num(n)
)Z
Where Z.AccID IS NOT NULL
or
Select A.ID,
2 AS [Level],
A.AccID1 AS AccID
From MyTable A
Where A.AccID1 IS NOT NULL
Union
Select A.ID,
2 AS [Level],
A.AccID2
From MyTable A
Where A.AccID2 IS NOT NULL
Union
Select A.ID,
3 AS [Level],
A.AccID3
From MyTable A
Where A.AccID3 IS NOT NULL
But Above query is slow and I want to have indexed view to have better performance.
and in indexed view I can't use UNION or UNPIVOT or CROSS JOIN in indexed view.
What if you created a Numbers table to essentially do the work of your illegal CROSS JOIN?
Create Table Numbers (number INT NOT NULL PRIMARY KEY)
Go
Insert Numbers
Select top 30000 row_number() over (order by (select 1)) as rn
from sys.all_objects s1 cross join sys.all_objects s2
go
Create view v_unpivot with schemabinding
as
Select MyTable.ID,
n.number as [Level],
Case n.number
When 1 Then MyTable.AccID1
When 2 Then MyTable.AccID2
Else MyTable.AccID3
End AS AccID
From dbo.Mytable
Join dbo.Numbers n on n.number BETWEEN 1 AND 3
go
Create unique clustered index pk_v_unpivot on v_unpivot (ID, [Level])
go
Select
ID,
[Level],
AccID
From v_unpivot with (noexpand)
Where AccID IS NOT NULL
Order by ID, [Level]
The WHERE AccID IS NOT NULL must be part of the query because derived tables are not allowed in indexed views.