SQL Find all direct descendants in a tree

I have a tree in my database that is stored using parent id links.
A sample of what I have for data in the table is:
id | name       | parent id
---+------------+-----------
0  | root       | NULL
1  | Node 1     | 0
2  | Node 2     | 0
3  | Node 1.1   | 1
4  | Node 1.1.1 | 3
5  | Node 1.1.2 | 3
Now I would like to get a list of all the direct descendants of a given node but if none exist I would like to have it just return the node itself.
I want the return for the query for children of id = 3 to be:
children
--------
4
5
Then the query for the children of id = 4 to be:
children
--------
4
I can change the way I am storing the tree to a nested set but I don't see how that would make the query I want possible.

In PostgreSQL 8.4 and later you can do it with a recursive CTE:
WITH RECURSIVE q AS
(
SELECT h, 1 AS level, ARRAY[id] AS breadcrumb
FROM t_hierarchy h
WHERE parent = 0
UNION ALL
SELECT hi, q.level + 1 AS level, breadcrumb || id
FROM q
JOIN t_hierarchy hi
ON hi.parent = (q.h).id
)
SELECT REPEAT(' ', level) || (q.h).id,
(q.h).parent,
(q.h).value,
level,
breadcrumb::VARCHAR AS path
FROM q
ORDER BY
breadcrumb
See this article in my blog for details:
PostgreSQL 8.4: preserving order for hierarchical query
In 8.3 or earlier, you'll have to write a function:
CREATE TYPE tp_hierarchy AS (node t_hierarchy, level INT);
CREATE OR REPLACE FUNCTION fn_hierarchy_connect_by(INT, INT)
RETURNS SETOF tp_hierarchy
AS
$$
SELECT CASE
WHEN node = 1 THEN
(t_hierarchy, $2)::tp_hierarchy
ELSE
fn_hierarchy_connect_by((q.t_hierarchy).id, $2 + 1)
END
FROM (
SELECT t_hierarchy, node
FROM (
SELECT 1 AS node
UNION ALL
SELECT 2
) nodes,
t_hierarchy
WHERE parent = $1
ORDER BY
id, node
) q;
$$
LANGUAGE 'sql';
and select from this function:
SELECT *
FROM fn_hierarchy_connect_by(4, 1)
The first parameter is the root id, the second should be 1.
See this article in my blog for more detail:
Hierarchical queries in PostgreSQL
Update:
To show only the first level children, or the node itself if the children do not exist, issue this query:
SELECT *
FROM t_hierarchy
WHERE parent = @start
UNION ALL
SELECT *
FROM t_hierarchy
WHERE id = @start
AND NOT EXISTS
(
SELECT NULL
FROM t_hierarchy
WHERE parent = @start
)
This is more efficient than a JOIN, since the second query takes at most two index scans: the first to check whether a child exists, the second to select the parent row if no children exist.
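The "children, or the node itself" query above can be tried out with a minimal sketch like the following. It uses Python's built-in sqlite3 module instead of PostgreSQL, loads the sample rows from the question, and binds the start id as a named parameter `:start`. The helper names `make_db` and `children_or_self` are made up for this illustration.

```python
import sqlite3

# Sample rows from the question: (id, name, parent)
ROWS = [
    (0, "root", None),
    (1, "Node 1", 0),
    (2, "Node 2", 0),
    (3, "Node 1.1", 1),
    (4, "Node 1.1.1", 3),
    (5, "Node 1.1.2", 3),
]

# First branch: the direct children. Second branch: the node itself,
# but only when the NOT EXISTS check finds no children.
QUERY = """
SELECT id FROM t_hierarchy WHERE parent = :start
UNION ALL
SELECT id FROM t_hierarchy
WHERE id = :start
  AND NOT EXISTS (SELECT NULL FROM t_hierarchy WHERE parent = :start)
ORDER BY id
"""


def make_db():
    """Create an in-memory table holding the sample tree (hypothetical helper)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE t_hierarchy (id INTEGER PRIMARY KEY, name TEXT, parent INTEGER)"
    )
    conn.executemany("INSERT INTO t_hierarchy VALUES (?, ?, ?)", ROWS)
    return conn


def children_or_self(conn, start):
    """Return the ids of the direct children of start, or [start] for a leaf."""
    return [row[0] for row in conn.execute(QUERY, {"start": start})]


if __name__ == "__main__":
    conn = make_db()
    print(children_or_self(conn, 3))  # [4, 5]
    print(children_or_self(conn, 4))  # [4]
```

Only one of the two branches can return rows for a given start id, so UNION ALL is safe here and no deduplication is needed.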

Found a query that works the way I wanted.
SELECT * FROM
( SELECT id FROM t_tree WHERE name = '' ) AS i,
t_tree g
WHERE
( ( i.id = g.id ) AND
NOT EXISTS ( SELECT * FROM t_tree WHERE parentid = i.id ) ) OR
( ( i.id = g.parentid ) AND
EXISTS ( SELECT * FROM t_tree WHERE parentid = i.id ) )


How to group hierarchical relationships together in SQL Server

I have columns named Parent and Child in a table named Example. Below is the table data:
| Parent | Child |
|--------|-------|
| 100    | 101   |
| 101    | 102   |
| 200    | 201   |
| 103    | 102   |
| 202    | 201   |
If I give 100 as input, I should get 100, 101, 102, 103, since 100->101->102->103. If I give 102 as input, I should get the same result, since 102->101->100 and 102->103. I need to achieve this using a stored procedure only.
Below is the sample code I am trying:
CREATE PROCEDURE GetAncestors(@thingID varchar(MAX))
AS
BEGIN
SET NOCOUNT ON;
WITH
CTE
AS
(
SELECT
Example.Parent, Example.Child
FROM Example
WHERE Parent = @thingID or Child = @thingID
UNION ALL
SELECT
Example.Parent, Example.Child
FROM
CTE
INNER JOIN Example ON Example.Parent = CTE.Child
)
SELECT
Parent AS Result
FROM CTE
UNION
SELECT
Child AS Result
FROM CTE
;
END
GO
The problem with your attempt is the filtering at the start. If I'm right, you want to cluster your data (group the rows together) by their relationships, whether ascendant, descendant, or a mix of both. For example, ID 100 has child 101, which has another child 102, but 102 also has a parent 103, and you want the result to be these four (100, 101, 102, 103) for any input that is in that set. This is why you can't filter at the start: you have no way of knowing in advance which relationships will be chained through other relationships.
Solving this isn't as simple as it seems, and you won't be able to solve it with just one recursion.
The following is a solution I made a long time ago to group all these relationships together. Keep in mind that for large datasets (over 100k rows) it might take a while to calculate, since it has to identify all groups first and select the result at the end.
CREATE PROCEDURE GetAncestors(@thingID INT)
AS
BEGIN
SET NOCOUNT ON
-- Load your data
IF OBJECT_ID('tempdb..#TreeRelationship') IS NOT NULL
DROP TABLE #TreeRelationship
CREATE TABLE #TreeRelationship (
RelationID INT IDENTITY(1,1) PRIMARY KEY NONCLUSTERED,
Parent INT,
Child INT,
GroupID INT)
INSERT INTO #TreeRelationship (
Parent,
Child)
SELECT
Parent = D.Parent,
Child = D.Child
FROM
Example AS D
UNION -- Data has to be loaded in both ways (direct and reverse) for algorithm to work correctly
SELECT
Parent = D.Child,
Child = D.Parent
FROM
Example AS D
-- Start algorithm
IF OBJECT_ID('tempdb..#FirstWork') IS NOT NULL
DROP TABLE #FirstWork
CREATE TABLE #FirstWork (
Parent INT,
Child INT,
ComponentID INT)
CREATE CLUSTERED INDEX CI_FirstWork ON #FirstWork (Parent, Child)
INSERT INTO #FirstWork (
Parent,
Child,
ComponentID)
SELECT DISTINCT
Parent = T.Parent,
Child = T.Child,
ComponentID = ROW_NUMBER() OVER (ORDER BY T.Parent, T.Child)
FROM
#TreeRelationship AS T
IF OBJECT_ID('tempdb..#SecondWork') IS NOT NULL
DROP TABLE #SecondWork
CREATE TABLE #SecondWork (
Component1 INT,
Component2 INT)
CREATE CLUSTERED INDEX CI_SecondWork ON #SecondWork (Component1)
DECLARE @v_CurrentDepthLevel INT = 0
WHILE @v_CurrentDepthLevel < 100 -- Relationships depth level can be controlled with this value
BEGIN
SET @v_CurrentDepthLevel = @v_CurrentDepthLevel + 1
TRUNCATE TABLE #SecondWork
INSERT INTO #SecondWork (
Component1,
Component2)
SELECT DISTINCT
Component1 = t1.ComponentID,
Component2 = t2.ComponentID
FROM
#FirstWork t1
INNER JOIN #FirstWork t2 on
t1.child = t2.parent OR
t1.parent = t2.parent
WHERE
t1.ComponentID <> t2.ComponentID
IF (SELECT COUNT(*) FROM #SecondWork) = 0
BREAK
UPDATE #FirstWork SET
ComponentID = CASE WHEN items.ComponentID < target THEN items.ComponentID ELSE target END
FROM
#FirstWork items
INNER JOIN (
SELECT
Source = Component1,
Target = MIN(Component2)
FROM
#SecondWork
GROUP BY
Component1
) new_components on new_components.source = ComponentID
UPDATE #FirstWork SET
ComponentID = target
FROM #FirstWork items
INNER JOIN(
SELECT
source = component1,
target = MIN(component2)
FROM
#SecondWork
GROUP BY
component1
) new_components ON new_components.source = ComponentID
END
;WITH Groupings AS
(
SELECT
parent,
child,
group_id = DENSE_RANK() OVER (ORDER BY ComponentID DESC)
FROM
#FirstWork
)
UPDATE FG SET
GroupID = IT.group_id
FROM
#TreeRelationship FG
INNER JOIN Groupings IT ON
IT.parent = FG.parent AND
IT.child = FG.child
-- Select the proper result
;WITH IdentifiedGroup AS
(
SELECT TOP 1
T.GroupID
FROM
#TreeRelationship AS T
WHERE
T.Parent = @thingID
)
SELECT DISTINCT
Result = T.Parent
FROM
#TreeRelationship AS T
INNER JOIN IdentifiedGroup AS I ON T.GroupID = I.GroupID
END
You will see that for @thingID values of 100, 101, 102 and 103 the result is those four IDs, and for values 200, 201 and 202 the result is those three.
I'm pretty sure this isn't an optimal solution, but it gives the correct output, and I never needed to tune it since it runs fast enough for my requirements.
Here is a cut-down version of the query from a more generic question, How to find all connected subgraphs of an undirected graph.
The main idea is to treat (Parent, Child) pairs as edges in a graph and traverse all connected edges starting from a given node.
Since the graph is undirected, we first build a list of pairs in both directions in CTE_Pairs.
CTE_Recursive follows the edges of the graph and stops when it detects a loop. It builds the path of visited nodes as a string in IDPath and stops the recursion if the new node is already in the path (has been visited before).
The final CTE_CleanResult puts all found nodes in one simple list.
CREATE PROCEDURE GetAncestors(@thingID varchar(8000))
AS
BEGIN
SET NOCOUNT ON;
WITH
CTE_Pairs
AS
(
SELECT
CAST(Parent AS varchar(8000)) AS ID1
,CAST(Child AS varchar(8000)) AS ID2
FROM Example
WHERE Parent <> Child
UNION
SELECT
CAST(Child AS varchar(8000)) AS ID1
,CAST(Parent AS varchar(8000)) AS ID2
FROM Example
WHERE Parent <> Child
)
,CTE_Recursive
AS
(
SELECT
ID1 AS AnchorID
,ID1
,ID2
,CAST(',' + ID1 + ',' + ID2 + ',' AS varchar(8000)) AS IDPath
,1 AS Lvl
FROM
CTE_Pairs
WHERE ID1 = @thingID
UNION ALL
SELECT
CTE_Recursive.AnchorID
,CTE_Pairs.ID1
,CTE_Pairs.ID2
,CAST(CTE_Recursive.IDPath + CTE_Pairs.ID2 + ',' AS varchar(8000)) AS IDPath
,CTE_Recursive.Lvl + 1 AS Lvl
FROM
CTE_Pairs
INNER JOIN CTE_Recursive ON CTE_Recursive.ID2 = CTE_Pairs.ID1
WHERE
CTE_Recursive.IDPath NOT LIKE '%,' + CTE_Pairs.ID2 + ',%'
)
,CTE_RecursionResult
AS
(
SELECT AnchorID, ID1, ID2
FROM CTE_Recursive
)
,CTE_CleanResult
AS
(
SELECT AnchorID, ID1 AS ID
FROM CTE_RecursionResult
UNION
SELECT AnchorID, ID2 AS ID
FROM CTE_RecursionResult
)
SELECT ID
FROM CTE_CleanResult
ORDER BY ID
OPTION(MAXRECURSION 0);
END;
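The traversal idea (store every edge in both directions, then follow edges from the given node) can be sketched in a smaller form as follows. This is an illustration under different assumptions from the procedure above: it runs on SQLite through Python's sqlite3 module, where the recursive member may use UNION, so duplicate nodes are discarded automatically and no IDPath string is needed. SQL Server only allows UNION ALL in a recursive CTE, which is why the procedure tracks the path by hand. The helper names `make_db` and `connected_ids` are made up for this example.

```python
import sqlite3

# (Parent, Child) rows from the question's Example table
EDGES = [(100, 101), (101, 102), (200, 201), (103, 102), (202, 201)]

QUERY = """
WITH RECURSIVE
pairs(id1, id2) AS (
    -- every edge in both directions, so the graph is treated as undirected
    SELECT Parent, Child FROM Example
    UNION
    SELECT Child, Parent FROM Example
),
reachable(id) AS (
    SELECT :start
    UNION  -- UNION (not UNION ALL) drops nodes that were already visited
    SELECT pairs.id2
    FROM pairs
    JOIN reachable ON pairs.id1 = reachable.id
)
SELECT id FROM reachable ORDER BY id
"""


def make_db():
    """Create an in-memory Example table with the sample edges (hypothetical helper)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE Example (Parent INTEGER, Child INTEGER)")
    conn.executemany("INSERT INTO Example VALUES (?, ?)", EDGES)
    return conn


def connected_ids(conn, start):
    """Return every id connected to start, directly or through other ids."""
    return [row[0] for row in conn.execute(QUERY, {"start": start})]


if __name__ == "__main__":
    conn = make_db()
    print(connected_ids(conn, 100))  # [100, 101, 102, 103]
    print(connected_ids(conn, 201))  # [200, 201, 202]
```

Note that the anchor row is the start value itself, so an id that does not appear in the table still comes back as a one-element result.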
You can simply use the graph processing features introduced in SQL Server 2017.
Here is an example:
https://www.red-gate.com/simple-talk/sql/t-sql-programming/sql-graph-objects-sql-server-2017-good-bad/

postgresql procedure for finding the parent chain

I am new to PostgreSQL and need a little help. I have a table named products:
ID | Product  | Parent_ID
---+----------+----------
1  | laptop   | Null
2  | Camera   | 1
3  | Iphone   | 1
4  | Mouse    | 2
5  | Printer  | 2
6  | Scanner  | 3
7  | HardDisk | 3
I want to create a function in Postgres that returns the parent chain of any id I pass in. For example, if I pass 4, the output should be:
id | parent_id
---+----------
1  | Null
2  | 1
4  | 2
I would suggest using a recursive WITH clause. Check the query below.
WITH RECURSIVE recursiveTable AS (
SELECT id, parent_id
FROM table_name
WHERE id = 4 -- you can pass an id here to get the output
UNION ALL
SELECT t.id, t.parent_id
FROM table_name t
JOIN recursiveTable rt ON t.id = rt.parent_id
)
SELECT * FROM recursiveTable
You can read more about the recursive WITH clause in the official PostgreSQL documentation, or look at a couple of examples.
Here is one link:
http://technobytz.com/recursive-query-evaluation-postgresql.html
Function: Try this
CREATE OR REPLACE FUNCTION function_name(param_id INT)
RETURNS TABLE (
id INT,
parent_id INT
)
AS $$
BEGIN
RETURN QUERY WITH RECURSIVE recursiveTable AS (
SELECT t.id, t.parent_id
FROM table_name t
WHERE t.id = param_id -- you can pass an id here to get the output
UNION ALL
SELECT t.id, t.parent_id
FROM table_name t
JOIN recursiveTable rt ON t.id = rt.parent_id
)
SELECT * FROM recursiveTable ;
END; $$
LANGUAGE 'plpgsql';
Function Execution:
select * from function_name(4)
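A recursive CTE does not promise any particular row order, so getting the root-first output shown in the question needs an explicit ORDER BY. The sketch below is one way to do that, assuming SQLite (via Python's sqlite3) rather than PostgreSQL and the products table from the question. The extra `distance` column and the helper names `make_db` and `parent_chain` are additions for this illustration only.

```python
import sqlite3

# Rows from the question's products table: (id, product, parent_id)
PRODUCTS = [
    (1, "laptop", None),
    (2, "Camera", 1),
    (3, "Iphone", 1),
    (4, "Mouse", 2),
    (5, "Printer", 2),
    (6, "Scanner", 3),
    (7, "HardDisk", 3),
]

QUERY = """
WITH RECURSIVE chain(id, parent_id, distance) AS (
    -- anchor: the row we start from
    SELECT id, parent_id, 0 FROM products WHERE id = :start
    UNION ALL
    -- step: move from the current row to its parent
    SELECT p.id, p.parent_id, c.distance + 1
    FROM products AS p
    JOIN chain AS c ON p.id = c.parent_id
)
SELECT id, parent_id FROM chain
ORDER BY distance DESC  -- farthest ancestor (the root) first
"""


def make_db():
    """Create an in-memory products table (hypothetical helper)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE products (id INTEGER PRIMARY KEY, product TEXT, parent_id INTEGER)"
    )
    conn.executemany("INSERT INTO products VALUES (?, ?, ?)", PRODUCTS)
    return conn


def parent_chain(conn, start):
    """Return (id, parent_id) rows from the root down to start."""
    return conn.execute(QUERY, {"start": start}).fetchall()


if __name__ == "__main__":
    print(parent_chain(make_db(), 4))  # [(1, None), (2, 1), (4, 2)]
```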

postgresql with recursive grabs whole table

I have the following postgresql structure
\d brand_categories;
Table "public.brand_categories"
Column | Type | Modifiers
----------------------+---------+---------------------------------------------------------------
id | integer | not null default nextval('brand_categories_id_seq'::regclass)
category_code | text | not null
correlation_id | uuid | not null default uuid_generate_v4()
created_by_id | integer | not null
updated_by_id | integer | not null
parent_category_code | text |
I am trying to get all the parents and children of a category via WITH RECURSIVE, without taking the category's siblings. I tried the following (inside Ruby code):
WITH RECURSIVE included_categories(category_code) AS (
SELECT category_code FROM brand_categories
WHERE category_code = 'beer'
UNION ALL
SELECT children.category_code FROM brand_categories AS parents, brand_categories AS children
WHERE parents.category_code = children.parent_category_code AND parents.category_code != 'alcohol'
UNION SELECT parents.category_code FROM brand_categories AS children, brand_categories AS parents
WHERE parents.category_code = children.parent_category_code
)
SELECT * from included_categories
The problem is that it takes the whole set of categories even though most are completely unrelated. Is there something wrong in this query?
Note that this is a simple categorization with a depth of 2 or 3.
My boss helped me solve the problem. It made more sense to do it in two parts:
Find all parents
Find all children
Here is the sql:
WITH RECURSIVE children_of(category_code) AS (
SELECT category_code FROM brand_categories WHERE parent_category_code = 'alcohol'
UNION ALL
SELECT brand_categories.category_code FROM brand_categories
JOIN children_of ON brand_categories.parent_category_code = children_of.category_code
),
parents_of(parent_category_code) AS (
SELECT parent_category_code FROM brand_categories WHERE category_code = 'alcohol'
UNION
SELECT brand_categories.parent_category_code FROM parents_of
JOIN brand_categories ON brand_categories.category_code = parents_of.parent_category_code
)
SELECT category_code FROM (SELECT * FROM children_of UNION SELECT parent_category_code FROM parents_of) t0(category_code)
WHERE category_code IS NOT NULL
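To see that the two-part approach leaves siblings out, here is a small sketch run on SQLite through Python's sqlite3 module. The category rows are invented for the example (the question does not list its data), the table is cut down to the two columns the query uses, and the helper names `make_db` and `related_categories` are hypothetical.

```python
import sqlite3

# Invented sample data: (category_code, parent_category_code)
CATEGORIES = [
    ("drinks", None),
    ("alcohol", "drinks"),
    ("soda", "drinks"),
    ("beer", "alcohol"),
    ("wine", "alcohol"),
    ("lager", "beer"),
]

QUERY = """
WITH RECURSIVE
children_of(category_code) AS (
    SELECT category_code FROM brand_categories
    WHERE parent_category_code = :code
    UNION
    SELECT bc.category_code
    FROM brand_categories AS bc
    JOIN children_of ON bc.parent_category_code = children_of.category_code
),
parents_of(category_code) AS (
    SELECT parent_category_code FROM brand_categories
    WHERE category_code = :code
    UNION
    SELECT bc.parent_category_code
    FROM brand_categories AS bc
    JOIN parents_of ON bc.category_code = parents_of.category_code
)
SELECT category_code FROM children_of
UNION
SELECT category_code FROM parents_of WHERE category_code IS NOT NULL
ORDER BY category_code
"""


def make_db():
    """Create a two-column stand-in for brand_categories (hypothetical helper)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE brand_categories (category_code TEXT, parent_category_code TEXT)"
    )
    conn.executemany("INSERT INTO brand_categories VALUES (?, ?)", CATEGORIES)
    return conn


def related_categories(conn, code):
    """Return all ancestors and descendants of code, excluding code and its siblings."""
    return [row[0] for row in conn.execute(QUERY, {"code": code})]


if __name__ == "__main__":
    print(related_categories(make_db(), "beer"))  # ['alcohol', 'drinks', 'lager']
```

For 'beer' the sibling 'wine' and the unrelated 'soda' are not returned, because each recursive part only ever walks in one direction from the starting category.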

Hierarchical SQL Queries: Best SQL query to obtain the whole branch of a tree from a [nodeid, parentid] pairs table given the end node id

Is there any way to send a recursive query in SQL?
Given the end node id, I need all the rows up to the root node (which has parentid = NULL) ordered by level. E.g. if I have something like:
nodeid | parentid
a | NULL
b | a
c | b
after querying for end_node_id = c, I'd get something like:
nodeid | parentid | depth
a | NULL | 0
b | a | 1
c | b | 2
(Instead of the depth I can also work with the distance to the given end node)
The only (and obvious) way I could come up with is doing a single query per row until I reach the parent node.
Is there a more efficient way of doing it?
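If you are using SQL Server 2005 or later, you can do this with a recursive CTE (the multi-row VALUES in the test data needs SQL Server 2008 or later):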
Test data:
DECLARE @tbl TABLE(nodeId VARCHAR(10),parentid VARCHAR(10))
INSERT INTO @tbl
VALUES ('a',null),('b','a'),('c','b')
Query
;WITH CTE
AS
(
SELECT
tbl.nodeId,
tbl.parentid,
0 AS Depth
FROM
@tbl as tbl
WHERE
tbl.parentid IS NULL
UNION ALL
SELECT
tbl.nodeId,
tbl.parentid,
CTE.Depth+1 AS Depth
FROM
@tbl AS tbl
JOIN CTE
ON tbl.parentid=CTE.nodeId
)
SELECT
*
FROM
CTE
I ended up with the following solutions (where level is the distance to the end node).
Oracle, using hierarchical queries (thanks to the info provided by @Mureinik):
SELECT IDCATEGORY, IDPARENTCATEGORY, LEVEL
FROM TNODES
START WITH IDCATEGORY=122
CONNECT BY IDCATEGORY = PRIOR IDPARENTCATEGORY;
Example using a view so it boils down to a single standard SQL query (requires >= 10g):
CREATE OR REPLACE VIEW VNODES AS
SELECT CONNECT_BY_ROOT IDCATEGORY "IDBRANCH", IDCATEGORY, IDPARENTCATEGORY, LEVEL AS LVL
FROM TNODES
CONNECT BY IDCATEGORY = PRIOR IDPARENTCATEGORY;
SELECT * FROM VNODES WHERE IDBRANCH = 122 ORDER BY LVL ASC;
http://sqlfiddle.com/#!4/18ba80/3
Postgres >= 8.4, using a WITH RECURSIVE Common Table Expression query:
WITH RECURSIVE BRANCH(IDPARENTCATEGORY, IDCATEGORY, LEVEL) AS (
SELECT IDPARENTCATEGORY, IDCATEGORY, 1 AS LEVEL FROM TNODES WHERE IDCATEGORY = 122
UNION ALL
SELECT p.IDPARENTCATEGORY, p.IDCATEGORY, LEVEL+1
FROM BRANCH pr, TNODES p
WHERE p.IDCATEGORY = pr.IDPARENTCATEGORY
)
SELECT IDCATEGORY,IDPARENTCATEGORY, LEVEL
FROM BRANCH
ORDER BY LEVEL ASC
Example using a view so it boils down to a single standard SQL query:
CREATE OR REPLACE VIEW VNODES AS
WITH RECURSIVE BRANCH(IDBRANCH,IDPARENTCATEGORY,IDCATEGORY,LVL) AS (
SELECT IDCATEGORY AS IDBRANCH, IDPARENTCATEGORY, IDCATEGORY, 1 AS LVL FROM TNODES
UNION ALL
SELECT pr.IDBRANCH, p.IDPARENTCATEGORY, p.IDCATEGORY, LVL+1
FROM BRANCH pr, TNODES p
WHERE p.IDCATEGORY = pr.IDPARENTCATEGORY
)
SELECT IDBRANCH, IDCATEGORY, IDPARENTCATEGORY, LVL
FROM BRANCH;
SELECT * FROM VNODES WHERE IDBRANCH = 122 ORDER BY LVL ASC;
http://sqlfiddle.com/#!11/42870/2
For Oracle, as requested in the comments, you can use the connect by operator to produce the hierarchy, and the level pseudocolumn to get the depth:
SELECT nodeid, parentid, LEVEL
FROM t
START WITH parentid IS NULL
CONNECT BY parentid = PRIOR nodeid;
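The question asks for depth counted from the root, while walking up from the end node naturally gives the distance to the end node. One way to convert between the two is to subtract each distance from the largest one. The sketch below shows this on SQLite via Python's sqlite3, using the table name `t` and the a/b/c rows from the question plus one extra node `d` that is not on the branch. The helper names `make_db` and `branch_to_root` are made up for the example.

```python
import sqlite3

# (nodeid, parentid); 'd' is an extra node that must not show up for end node 'c'
NODES = [("a", None), ("b", "a"), ("c", "b"), ("d", "a")]

QUERY = """
WITH RECURSIVE branch(nodeid, parentid, distance) AS (
    -- anchor: the end node, at distance 0 from itself
    SELECT nodeid, parentid, 0 FROM t WHERE nodeid = :end_node
    UNION ALL
    -- step: climb to the parent of the current row
    SELECT p.nodeid, p.parentid, b.distance + 1
    FROM t AS p
    JOIN branch AS b ON p.nodeid = b.parentid
)
SELECT nodeid,
       parentid,
       (SELECT MAX(distance) FROM branch) - distance AS depth
FROM branch
ORDER BY depth
"""


def make_db():
    """Create an in-memory table with the sample nodes (hypothetical helper)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (nodeid TEXT PRIMARY KEY, parentid TEXT)")
    conn.executemany("INSERT INTO t VALUES (?, ?)", NODES)
    return conn


def branch_to_root(conn, end_node):
    """Return (nodeid, parentid, depth) rows from the root down to end_node."""
    return conn.execute(QUERY, {"end_node": end_node}).fetchall()


if __name__ == "__main__":
    print(branch_to_root(make_db(), "c"))
    # [('a', None, 0), ('b', 'a', 1), ('c', 'b', 2)]
```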

How to select parent ids

I have table with such structure.
ElementId | ParentId
-------------------
1 | NULL
2 | 1
3 | 2
4 | 3
Let say current element has Id 4. I want to select all parent ids.
Result should be: 3, 2, 1
How I can do it? DB is MSSQL
You can use recursive queries for this: http://msdn.microsoft.com/en-us/library/aa175801(SQL.80).aspx
You can use it like this:
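with Hierarchy(ElementID, ParentID, Level) as (
select ElementID, ParentID, 0 as Level
from myTable t -- myTable is a placeholder for your table name
where t.ElementID = X -- insert parameter here
union all
-- step from each row to its parent (joining on t.ParentId = th.ElementID would walk down to the children instead)
select t.ElementID, t.ParentID, th.Level + 1
from myTable t
inner join Hierarchy th
on t.ElementID = th.ParentID
)
select ElementID, ParentID
from Hierarchy
where Level > 0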
I think it might be easiest to do the following:
while parent != NULL
get parent of current element
I can't think of any way of doing this in plain SQL that wouldn't cause issues on larger databases.
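The loop above could look like the following in application code. This is only a sketch: it uses Python's sqlite3 with a hypothetical table name `elements`, issues one query per level, and adds a guard against cycles that the pseudocode does not mention. The helper names `make_db` and `parent_ids` are invented for the illustration.

```python
import sqlite3

# (ElementId, ParentId) rows from the question
ELEMENTS = [(1, None), (2, 1), (3, 2), (4, 3)]


def make_db():
    """Create an in-memory table with the sample rows (hypothetical helper)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE elements (ElementId INTEGER PRIMARY KEY, ParentId INTEGER)")
    conn.executemany("INSERT INTO elements VALUES (?, ?)", ELEMENTS)
    return conn


def parent_ids(conn, element_id):
    """Walk up the tree one query at a time and collect the ancestor ids."""
    parents = []
    seen = {element_id}
    current = element_id
    while True:
        row = conn.execute(
            "SELECT ParentId FROM elements WHERE ElementId = ?", (current,)
        ).fetchone()
        # Stop at an unknown element or at the root (ParentId IS NULL).
        if row is None or row[0] is None:
            break
        current = row[0]
        if current in seen:  # defensive: stop if the data contains a cycle
            break
        seen.add(current)
        parents.append(current)
    return parents


if __name__ == "__main__":
    print(parent_ids(make_db(), 4))  # [3, 2, 1]
```

Each level costs one round trip to the database, so this stays cheap for shallow trees but grows with the depth of the chain.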
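If you want pure SQL, try:
select ParentId from myTable order by ParentId desc
That works in MySQL. Note that it lists every ParentId in the table (including the root's NULL) in descending order, so it matches the expected 3, 2, 1 here only because the sample table is a single chain ending at element 4.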