Flatten the tree path in SQL Server HierarchyID

I am using the SQL Server hierarchyid data type to model a taxonomy structure in my application.
The taxonomy can have the same name at different levels.
During setup, this data needs to be uploaded from an Excel sheet.
Before inserting any node, I would like to check whether a node at that particular path already exists, so that I don't duplicate entries.
What is the easiest way to check whether a node at a particular absolute path already exists?
For example, before inserting "Retail" under "Bank 2", I should be able to check that "/Bank 2/Retail" does not already exist.
Is there any way to provide a flattened representation of the entire tree structure so that I can check for the absolute path and then proceed?

Yes, you can do it using a recursive CTE.
In each iteration of the query you can append a new level of the hierarchy name.
There are lots of examples of this technique on the internet.
For example, with this sample data:
CREATE TABLE Test
(id INT,
parent_id INT null,
NAME VARCHAR(50)
)
INSERT INTO Test VALUES(1, NULL, 'L1')
INSERT INTO Test VALUES(2, 1, 'L1-A')
INSERT INTO Test VALUES(3, 2, 'L1-A-1')
INSERT INTO Test VALUES(4, 2, 'L1-A-2')
INSERT INTO Test VALUES(5, 1, 'L1-B')
INSERT INTO Test VALUES(6, 5, 'L1-B-1')
INSERT INTO Test VALUES(7, 5, 'L1-B-2')
you can write a recursive CTE like this:
WITH H AS
(
-- Anchor: the first level of the hierarchy
SELECT id, parent_id, name, CAST(name AS NVARCHAR(300)) AS path
FROM Test
WHERE parent_id IS NULL
UNION ALL
-- Recursive: join the original table to the anchor, and combine data from both
SELECT T.id, T.parent_id, T.name, CAST(H.path + '\' + T.name AS NVARCHAR(300))
FROM Test T INNER JOIN H ON T.parent_id = H.id
)
-- You can query H as if it was a normal table or View
SELECT * FROM H
WHERE PATH = 'L1\L1-A' -- for example to see if this exists
The result of the query (without the WHERE filter) looks like this:
id parent_id name   path
-- --------- ------ ----------------
1  NULL      L1     L1
2  1         L1-A   L1\L1-A
5  1         L1-B   L1\L1-B
6  5         L1-B-1 L1\L1-B\L1-B-1
7  5         L1-B-2 L1\L1-B\L1-B-2
3  2         L1-A-1 L1\L1-A\L1-A-1
4  2         L1-A-2 L1\L1-A\L1-A-2
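To connect this back to the original question (checking a path before inserting), the flattened paths can be wrapped in an existence check. The following is a minimal sketch under stated assumptions, not a definitive implementation: it uses Python's built-in sqlite3 module so that it runs anywhere, which means SQLite syntax (WITH RECURSIVE, and || instead of + for concatenation) rather than T-SQL. The helper name path_exists is hypothetical, and the data mirrors the Test sample above.

```python
import sqlite3

# Same sample rows as the Test table above, in an in-memory SQLite database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Test (id INTEGER, parent_id INTEGER NULL, name TEXT)")
conn.executemany(
    "INSERT INTO Test VALUES (?, ?, ?)",
    [(1, None, "L1"), (2, 1, "L1-A"), (3, 2, "L1-A-1"), (4, 2, "L1-A-2"),
     (5, 1, "L1-B"), (6, 5, "L1-B-1"), (7, 5, "L1-B-2")],
)

# SQLite spelling of the recursive CTE. A raw string keeps the backslash
# separator readable; SQLite does not treat backslash as an escape character.
PATH_QUERY = r"""
WITH RECURSIVE H(id, path) AS (
    -- Anchor: root nodes start the path with their own name
    SELECT id, name FROM Test WHERE parent_id IS NULL
    UNION ALL
    -- Recursive step: append the child's name to the parent's path
    SELECT T.id, H.path || '\' || T.name
    FROM Test T JOIN H ON T.parent_id = H.id
)
SELECT EXISTS (SELECT 1 FROM H WHERE path = ?)
"""

def path_exists(path):
    """Return True if a node with this absolute path is already in the tree."""
    return bool(conn.execute(PATH_QUERY, (path,)).fetchone()[0])

print(path_exists(r"L1\L1-A"))       # an existing node
print(path_exists(r"L1\L1-A\new"))   # not present yet, so safe to insert
```

Passing the path as a bound parameter avoids building SQL from spreadsheet values. Note that the CTE recomputes every path on each call, so for a large upload it may be cheaper to load all paths once into a set and check against that.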

Related

Combine values into string inside recursive CTE

I am using a recursive CTE to navigate down a tree structure of id values. At each iteration of the recursion I would like to have a column that holds a string 'list' of all id values (nodes) visited in the iteration steps so far. At first glance it seems like I would need a GROUP BY or an aggregate function (like string_agg()) to accomplish this, but these are not allowed in the recursive part of a recursive CTE.
My question seems similar to the one in "recursive CTE to combine values", but I am hoping for a slightly more straightforward answer that relates to the sample data below.
This first table is essentially the top of the tree. There could be one or multiple nodes at the top, depending on how many records are in this first table. Here there is just 1 record:
name_id group_id
------- --------
100     15
where name_id you could think of as the name of the tree, which is essentially irrelevant to this example since there is only one name_id, and group_id is the top of the tree which will then branch out based on the following table:
group_id group_id_member
-------- ---------------
10       15
11       15
4        10
11       4
3        11
10       3
where the group_id_member is saying that, for example, the group_id from the first table, 15, is a member of the group_id 10 and also 11. So the tree would look like 15 at the top, with a branch down for 10 and another branch for 11. Then off of 10 comes a branch of 4, and off of 11 comes a branch of 3. Then off of 4 comes a branch of 11, and off of 3 comes a branch of 10. Thus, this is an infinite loop since those branches are already in the table.
Essentially, the goal of this recursive query is to return all the branch nodes but not return any of them more than once. This recursive cte will run forever unless we do a check, like checking whether or not the group_id value is already in rpath. To check that, I am wanting to create a column which is a list of all nodes that have gotten hit from current and previous iterations (rpath column), and append to that column at each iteration, making the result table look something like:
name_id group_id group_id_member iter rpath
------- -------- --------------- ---- ---------------------
100     15       null            1    /15/
100     10       15              2    /15/10/11/
100     11       15              2    /15/10/11/
100     4        10              3    /15/10/11/4/3/
100     3        11              3    /15/10/11/4/3/
100     10       3               4    /15/10/11/4/3/10/11/
100     11       4               4    /15/10/11/4/3/10/11/
Where I included the 4th iteration in the table, which shows that the group_id values 10 and 11 are now in the rpath two times since this loop has been fully completed. Ultimately I would want to terminate this recursion at the 3rd iteration since at the 4th, for both group_id values, they appear in the rpath list already.
So my primary question is, what is the best way to create this rpath string inside the recursive part of the cte? See below for code to create the temp tables, and my code which attempts to solve this but is returning an error since I am trying to use string_agg() in the recursive part of the cte which is not allowed. Thanks!
if object_id('tempdb..#t1') is not null drop table #t1
CREATE TABLE #t1 (group_id int, group_id_member int)
INSERT into #t1 VALUES
(10,15),
(11, 15),
(4, 10),
(11, 4),
(3, 11),
(10, 3);
if object_id('tempdb..#t2') is not null drop table #t2
CREATE TABLE #t2 (name_id int, group_id int)
INSERT into #t2 VALUES
(100, 15)
; with rec as (
select name_id,
group_id,
cast(null as int) as group_id_member,
1 as iter,
convert(varchar(128),concat('/', str_agg.agg, '/')) as rpath
from #t2
cross apply (select string_agg(group_id, '/') as agg from #t2) as str_agg -- this does work since it is not in the recursive part of the cte
union all
select rec.name_id,
t1.group_id,
t1.group_id_member,
iter + 1 as iter,
convert(varchar(128), concat(rec.rpath, str_agg1.agg1, '/')) as rpath -- trying to create the string that represents all of the nodes that have been visited
from #t1 t1
inner join rec
on t1.group_id_member = rec.group_id
cross apply (select string_agg(t1.group_id_member ,'/') as agg1 from #t1 t1 where t1.group_id_member = rec.group_id) str_agg1 -- cross apply to create the string_agg, if this was allowed in recursive cte...
where rec.rpath not like '%/' + convert(varchar(128), t1.group_id) + '/%'
)
select * from rec order by iter asc
At each iteration of the recursion I would like to have a column that represents a string 'list' of all id values (nodes) that have been visited in the iteration steps so far
As I understand your question, you don’t need aggregation here. The recursive query processes iteratively, so you can just accumulate the ids of the visited nodes along the way, using concat.
with rec as (
    select
        name_id,
        group_id,
        cast(null as int) as group_id_member,
        1 as iter,
        -- wrap every id in slashes so the like check below also matches the last id in the path
        concat('/', convert(varchar(max), group_id), '/') as rpath
    from #t2
    union all
    select
        r.name_id,
        t1.group_id,
        t1.group_id_member,
        r.iter + 1 as iter,
        concat(r.rpath, t1.group_id, '/')
    from #t1 t1
    inner join rec r
        on t1.group_id_member = r.group_id
    -- skip a node that already appears in the path leading to this row
    where r.rpath not like concat('%/', t1.group_id, '/%')
)
select * from rec order by iter
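The accumulate-and-check idea can be tried end to end with a small, hedged sketch. It runs the same shape of query through Python's built-in sqlite3 module, so the dialect is SQLite (WITH RECURSIVE, || for concatenation) rather than T-SQL, and the temp tables are plain tables named t1 and t2 here. Every id is wrapped in slashes so that the LIKE test also matches the last id in the path. One thing this makes visible: the path is tracked per branch, so a node can still appear in several rows when it is reachable along different branches.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE t1 (group_id INTEGER, group_id_member INTEGER);
INSERT INTO t1 VALUES (10, 15), (11, 15), (4, 10), (11, 4), (3, 11), (10, 3);
CREATE TABLE t2 (name_id INTEGER, group_id INTEGER);
INSERT INTO t2 VALUES (100, 15);
""")

rows = conn.execute("""
WITH RECURSIVE rec(name_id, group_id, group_id_member, iter, rpath) AS (
    -- Anchor: the top of the tree, with its id wrapped in slashes
    SELECT name_id, group_id, NULL, 1, '/' || group_id || '/'
    FROM t2
    UNION ALL
    -- Recursive step: append the new id, unless this branch already visited it
    SELECT r.name_id, t1.group_id, t1.group_id_member, r.iter + 1,
           r.rpath || t1.group_id || '/'
    FROM t1 JOIN rec r ON t1.group_id_member = r.group_id
    WHERE r.rpath NOT LIKE '%/' || t1.group_id || '/%'
)
SELECT group_id, iter, rpath FROM rec ORDER BY iter, rpath
""").fetchall()

for group_id, iteration, rpath in rows:
    print(group_id, iteration, rpath)

# Each node once, regardless of how many branches reached it.
distinct_nodes = sorted({group_id for group_id, _, _ in rows})
print(distinct_nodes)
```

The recursion stops on its own because no branch can revisit a node it already contains. If only the set of reachable nodes is needed, deduplicating afterwards (as in distinct_nodes) is one simple option.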

SQL Tree / Hierarchical Data

This is my first post. I am trying to build a SQL tree table that can be traversed. For example, if a person clicks on a drop-down list called Categories, it will display Electric and InterC. Then, if the user clicks on Electric, it will drop down Relays and Switches; next, if the person clicks on Relays it will drop down X Relays, and if the person clicks on Switches it will drop down Y Switches. My attempt is below, but the part I don't understand is this: if I have another category, InterC, how do I make that another level of drop-downs?
Table Category
create table test(id int, parentId int, name varchar(50));
insert test select 1, 0,'Electric'
insert test select 2, 1,'Relays'
insert test select 3, 1,'Switches'
insert test select 5, 2,'X Relays'
insert test select 6, 2,'Y Switches'
insert test select 7, 0,'InterC'
insert test select 8, 1,'x Sockets'
insert test select 9, 1,'y Sockets'
insert test select 10, 2,'X Relays'
insert test select 11, 2,'Y Relays'
;
WITH tree (id, parentid, level, name) as (
SELECT id, parentid, 0 as level, name
FROM test WHERE parentid = 0
UNION ALL
SELECT c2.id, c2.parentid, tree.level + 1, c2.name
FROM test c2
INNER JOIN tree ON tree.id = c2.parentid
)
SELECT *
FROM tree
order by parentid
Your hierarchical T-SQL query should return all the records in the table, both those under Electric and InterC.
However, you should make parentId nullable and have the root records have a null rather than 0. That will let you add a foreign key that protects your data integrity (it won't be possible to add orphaned records by mistake).
Your hierarchy query returns all of your records; I'm guessing that you want to return just one hierarchy at a time. To do that, add a WHERE condition to the anchor query:
WITH tree (id, parentid, level, name) as (
SELECT id, parentid, 0 as level, name
FROM test
WHERE name = @category AND
parentId is null
UNION ALL
SELECT c2.id, c2.parentid, tree.level + 1, c2.name
FROM test c2
INNER JOIN tree ON tree.id = c2.parentid
)
SELECT *
FROM tree
order by parentid
Then set @category to 'Electric' or 'InterC' to get one or the other hierarchy.
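As a rough illustration of both suggestions together (NULL for root parents, and a filter on the anchor query), here is a sketch using Python's built-in sqlite3 module. It is written in the SQLite dialect, not T-SQL, and the bound ? parameter plays the role of the category variable. The sample rows are an assumption: they are a cleaned-up version of the question's data in which the InterC children point at id 7, and the helper name subtree is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test (id INTEGER PRIMARY KEY, parentId INTEGER NULL, name TEXT)")
conn.executemany("INSERT INTO test VALUES (?, ?, ?)", [
    (1, None, "Electric"), (2, 1, "Relays"), (3, 1, "Switches"),
    (5, 2, "X Relays"), (6, 3, "Y Switches"),
    # Assumed fix to the sample data: InterC's children reference id 7
    (7, None, "InterC"), (8, 7, "x Sockets"), (9, 7, "y Sockets"),
])

def subtree(category):
    """Return (name, level) rows for one root category and all its descendants."""
    return conn.execute("""
        WITH RECURSIVE tree(id, parentId, level, name) AS (
            -- Anchor: only the requested root category
            SELECT id, parentId, 0, name
            FROM test
            WHERE name = ? AND parentId IS NULL
            UNION ALL
            -- Recursive step: children of rows already in the tree
            SELECT c2.id, c2.parentId, tree.level + 1, c2.name
            FROM test c2 JOIN tree ON tree.id = c2.parentId
        )
        SELECT name, level FROM tree ORDER BY level, id
    """, (category,)).fetchall()

print(subtree("Electric"))
print(subtree("InterC"))
```

Because the filter sits in the anchor, the recursion never touches the other category's rows, which is usually what a per-click drop-down needs.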

Querying Parent-Child relationship in a consecutive way

I'm trying to write an import tool to convert my database from one schema to another.
So now I've come across a table that uses a Parent-Child relationship (via PK ID FK ParentID) and I want to select all records consecutively.
The risk with my query is that I might try to import a child element whose parent element has not been imported yet. That row would then fail to import, which I want to avoid.
The query I've worked on so far is as follows:
SELECT * FROM Table a INNER JOIN Table b ON (b.ParentID=a.ID and a.ID= b.ParentID)
Unfortunately that doesn't work (it doesn't give me all the records in the table), so I need a query that gives me all rows in the table, ordered so that parents come before their children, that I can simply loop over to import.
Can someone guide me the way?
What you're looking for is a recursive common table expression which can be found at this link:
http://technet.microsoft.com/en-us/library/ms186243%28v=sql.105%29.aspx
You can use this to tell your downstream ETL the sequence things should be loaded in. For instance, all 1's go first and 2's second and so on.
DECLARE @Table TABLE (
    ID INT,
    ParentId INT)
INSERT INTO @Table
VALUES
(1, 0),
(2, 1),
(3, 1),
(4, 0),
(5, 4),
(6, 4),
(7, 1),
(8, 7)
;WITH cte_Recursive AS (
    -- Anchor query: selects the top-level records
    SELECT ID, ParentId, 1 [Depth]
    FROM @Table
    WHERE ParentId = 0
    UNION ALL
    -- Recursive query: each child is one level deeper than its parent
    SELECT T.ID, T.ParentId
    ,R.Depth + 1 [Depth]
    FROM @Table T
    INNER JOIN cte_Recursive R ON R.ID = T.ParentId
)
SELECT *
FROM cte_Recursive
-- Lowest depth first, so every parent is listed before its children
ORDER BY [Depth]
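To show how the depth column can drive the import loop, here is a small sketch using Python's built-in sqlite3 module (SQLite dialect, so WITH RECURSIVE instead of the T-SQL form). The table name Source and the function import_in_order are hypothetical; the list it builds merely stands in for the target table, and 0 is treated as the "no parent" marker as in the sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Source (ID INTEGER, ParentId INTEGER)")
conn.executemany(
    "INSERT INTO Source VALUES (?, ?)",
    [(1, 0), (2, 1), (3, 1), (4, 0), (5, 4), (6, 4), (7, 1), (8, 7)],
)

ORDERED = """
WITH RECURSIVE cte(ID, ParentId, Depth) AS (
    -- Anchor: top-level records
    SELECT ID, ParentId, 1 FROM Source WHERE ParentId = 0
    UNION ALL
    -- Recursive step: children sit one level below their parent
    SELECT s.ID, s.ParentId, r.Depth + 1
    FROM Source s JOIN cte r ON r.ID = s.ParentId
)
SELECT ID, ParentId, Depth FROM cte ORDER BY Depth, ID
"""

def import_in_order():
    """Walk the rows depth by depth, verifying each parent was handled first."""
    imported = []   # stands in for inserts into the target schema
    seen = {0}      # 0 marks "no parent" in this sample
    for row_id, parent_id, _depth in conn.execute(ORDERED):
        if parent_id not in seen:
            raise ValueError(f"parent {parent_id} of {row_id} not imported yet")
        seen.add(row_id)
        imported.append(row_id)
    return imported

print(import_in_order())
```

Rows whose parent chain never reaches a top-level record (orphans) do not appear in the CTE at all, so it may be worth comparing the row count against the source table before trusting the import.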

Matching a set of child records between two similar table hierarchies

I have two similar table hierarchies:
Owner -> OwnerGroup -> Parent
and
Owner2 -> OwnerGroup2
I would like to determine if there is an exact match of Owners that exists in Owner2 based on a set of values. There are approximately a million rows in each Owner table. Some OwnerGroups contain up to 100 Owners.
So basically, if there is an OwnerGroup that contains the Owners "Smith, John" and "Smith, Jane", I want to know the ids of the OwnerGroup2s that are exact matches.
The first attempt at this was to generate a join per Owner (which required dynamic SQL to be generated in the application):
select og.id
from owner_group2 og
-- dynamic bit starts here
join owner2 o1 on
(og.id = o1.og_id) AND
(o1.given_names = 'JOHN' and o1.surname='SMITH')
-- dynamic bit ends here
join owner2 o2 on
(og.id = o2.og_id) AND
(o2.given_names = 'JANE' and o2.surname='SMITH');
This works fine for small numbers of owners, but in the scenario with 100 Owners in a group, the query plan contains 100 nested loops and it takes almost a minute to run.
Another option I had was to use something around the intersect operator. E.g.
select * from (
select o1.surname, o1.given_names
from owner1 o1
join owner_group1 og1 on o1.og_id = og1.id
where
og1.parent_id = 1936233
)
intersect
select o2.surname, o2.given_names
from owner2 o2
join owner_group2 og2 on og2.id = o2.og_id;
I'm not sure how to suck out the owner2.id in this scenario either - and it was still running in the 4-5 second range.
I feel like I am missing something obvious - so please feel free to provide some better solutions!
You're on the right track with intersect; you just need to go a bit further. You need to join its results back to the owner_group2 table to find the ids.
You can use the listagg function to convert the groups into comma-separated lists of the names (note: this requires 11g). You can then take the intersection of these name lists to find the matches and join this back to the list in owner_group2.
I've created a simplified example below. In it, "Dave,Jill" is the group that is present in both tables.
create table grps (id integer, name varchar2(100));
create table grps2 (id integer, name varchar2(100));
insert into grps values (1, 'Dave');
insert into grps values(1, 'Jill');
insert into grps values (2, 'Barry');
insert into grps values(2, 'Jane');
insert into grps2 values(3, 'Dave');
insert into grps2 values(3, 'Jill');
insert into grps2 values(4, 'Barry');
with grp1 as (
SELECT id, listagg(name, ',') within group (order by name) n
FROM grps
group by id
), grp2 as (
SELECT id, listagg(name, ',') within group (order by name) n
FROM grps2
group by id
)
SELECT * FROM grp2
where n in (
-- find the duplicates
select n from grp1
intersect
select n from grp2
);
Note this will still require a full scan of owner_group2; I can't think of a way you can avoid this. So your query is likely to remain slow.
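For comparison, an exact-set match can also be expressed by counting rather than by building name lists. This is a swapped-in technique (relational division by counts), not the listagg approach described above, and it is only a sketch: it runs on Python's built-in sqlite3 module in the SQLite dialect, reuses the simplified grps and grps2 tables, uses the hypothetical helper name matching_groups, and assumes a name never appears twice within one group.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE grps  (id INTEGER, name TEXT);
CREATE TABLE grps2 (id INTEGER, name TEXT);
INSERT INTO grps  VALUES (1, 'Dave'), (1, 'Jill'), (2, 'Barry'), (2, 'Jane');
INSERT INTO grps2 VALUES (3, 'Dave'), (3, 'Jill'), (4, 'Barry');
""")

MATCH = """
SELECT g2.id
FROM grps2 g2
LEFT JOIN grps g1 ON g1.name = g2.name AND g1.id = :src
GROUP BY g2.id
HAVING COUNT(*) = COUNT(g1.name)   -- every member of the grps2 group is in the source group
   AND COUNT(*) = (SELECT COUNT(*) FROM grps WHERE id = :src)   -- and both groups have the same size
ORDER BY g2.id
"""

def matching_groups(source_id):
    """Return ids of grps2 groups whose member set equals the given grps group."""
    return [row[0] for row in conn.execute(MATCH, {"src": source_id})]

print(matching_groups(1))   # the Dave/Jill group
print(matching_groups(2))   # Barry/Jane has no exact counterpart
```

Whether this is faster than comparing aggregated strings depends on the indexes available; an index on the name columns is what would let the join avoid scanning every row.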

Ordering parent rows by date descending with child rows ordered independently beneath each

This is a contrived version of my table schema to illustrate my problem:
QuoteID, Details, DateCreated, ModelQuoteID
Where QuoteID is the primary key and ModelQuoteID is a nullable foreign key back onto this table to represent a quote which has been modelled off another quote (and may have subsequently had its Details column etc changed).
I need to return a list of quotes ordered by DateCreated descending with the exception of modelled quotes, which should sit beneath their parent quote, ordered by date descending within any other sibling quotes (quotes can only be modelled one level deep).
So, for example, if I have these 5 quote rows:
1, 'Fix the roof', '01/01/2012', null
2, 'Clean the drains', '02/02/2012', null
3, 'Fix the roof and door', '03/03/2012', 1
4, 'Fix the roof, door and window', '04/04/2012', 1
5, 'Mow the lawn', '05/05/2012', null
Then I need to get the results back in this order:
5 - Mow the lawn
2 - Clean the drains
1 - Fix the roof
4 - -> Fix the roof, door and window
3 - -> Fix the roof and door
I'm also passing in search criteria such as keywords for Details, and I'm returning modelled quotes even if they don't contain the search term but their parent quote does. I've got that part working using a common table expression to get the original quotes, unioned with a join for modelled ones.
That works nicely but currently I'm having to do the rearrangement of the modelled quotes into the correct order in code. That's not ideal because my next step is to implement paging in the SQL, and if the rows are not grouped properly at that time then I won't have the children present in the current page to do the re-ordering in code. Generally speaking they will be naturally grouped together anyway, but not always. You could create a model quote today for a quote from a month back.
I've spent quite some time on this, can any SQL gurus help? Much appreciated.
EDIT: Here is a contrived version of my SQL to fit my contrived example :-)
;with originals as (
select
q.*
from
Quote q
where
Details like @details
)
select
*
from
(
select
o.*
from
originals o
union
select
q2.*
from
Quote q2
join
originals o on q2.ModelQuoteID = o.QuoteID
)
as combined
order by
combined.DateCreated desc
Watching the Olympics -- just skimmed your post -- looks like you want to control the sort at each level (root and one level in), and make sure the data is returned with the children directly beneath its parent (so you can page the data...). We do this all the time. You can add an order by to each inner query and create a sort column. I contrived a slightly different example that should be easy for you to apply to your circumstance. I sorted the root ascending and level one descending just to illustrate how you can control each part.
declare @tbl table (id int, parent int, name varchar(10))
insert into @tbl (id, parent, name)
values (1, null, 'def'), (2, 1, 'this'), (3, 1, 'is'), (4, 1, 'a'), (5, 1, 'test'),
(6, null, 'abc'), (7, 6, 'this'), (8, 6, 'is'), (9, 6, 'another'), (10, 6, 'test')
;with cte (id, parent, name, sort) as (
-- root rows: a 4-digit rank by name ascending starts the sort key
select id, parent, name, cast(right('0000' + cast(row_number() over (order by name) as varchar(4)), 4) as varchar(1024))
from @tbl
where parent is null
union all
-- child rows: the parent's sort key plus a 4-digit rank by name descending
select t.id, t.parent, t.name, cast(cte.sort + right('0000' + cast(row_number() over (order by t.name desc) as varchar(4)), 4) as varchar(1024))
from @tbl t inner join cte on t.parent = cte.id
)
select * from cte
order by sort
This produces these results:
id parent name sort
---- -------- ------- ----------
6 NULL abc 0001
7 6 this 00010001
10 6 test 00010002
8 6 is 00010003
9 6 another 00010004
1 NULL def 0002
2 1 this 00020001
5 1 test 00020002
3 1 is 00020003
4 1 a 00020004
You can see that the root nodes are sorted ascending and the inner nodes are sorted descending.
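Since the question states that quotes are modelled only one level deep, the same grouping can also be sketched without a recursive sort column, by sorting each row on its parent's date first. This is a different, simpler technique than the CTE above, offered as an assumption-laden sketch: it runs on Python's built-in sqlite3 module (SQLite dialect, with LIMIT/OFFSET standing in for paging), stores the dates as ISO strings so that text ordering matches date ordering, and uses the hypothetical helper name page.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Quote (QuoteID INTEGER PRIMARY KEY, Details TEXT, "
    "DateCreated TEXT, ModelQuoteID INTEGER NULL)"
)
conn.executemany("INSERT INTO Quote VALUES (?, ?, ?, ?)", [
    (1, "Fix the roof", "2012-01-01", None),
    (2, "Clean the drains", "2012-02-02", None),
    (3, "Fix the roof and door", "2012-03-03", 1),
    (4, "Fix the roof, door and window", "2012-04-04", 1),
    (5, "Mow the lawn", "2012-05-05", None),
])

ORDERED = """
SELECT q.QuoteID
FROM Quote q
LEFT JOIN Quote p ON p.QuoteID = q.ModelQuoteID
ORDER BY COALESCE(p.DateCreated, q.DateCreated) DESC,   -- family position: the parent's date
         COALESCE(p.QuoteID, q.QuoteID),                -- keeps a family together on date ties
         CASE WHEN q.ModelQuoteID IS NULL THEN 0 ELSE 1 END,  -- parent before its children
         q.DateCreated DESC                             -- newest child first
LIMIT ? OFFSET ?
"""

def page(limit, offset=0):
    """Return one page of QuoteIDs in display order."""
    return [row[0] for row in conn.execute(ORDERED, (limit, offset))]

print(page(5))      # full list in display order
print(page(2, 2))   # a later page still follows the same family grouping
```

Because the grouping is part of the ORDER BY itself, paging can be applied directly in SQL. A page boundary can still fall between a parent and its children, so the caller may need to decide how to render a family that spans two pages.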