Recursive calculation to form a tree using sql - sql

I am working on a simple problem and want to solve it in SQL. I have three tables: Category, Item, and a relational table, CategoryItem. I need to return the count of items per category, but the twist is that categories are arranged in parent-child relationships, and the count of items in child categories should be added to the count of their parent category. Please consider the sample data below and the expected result set.
Id Name ParentCategoryId
1 Category1 Null
2 Category1.1 1
3 Category2.1 2
4 Category1.2 1
5 Category3.1 3
ID CategoryId ItemId
1 5 1
2 4 2
3 5 2
4 3 1
5 2 3
6 1 1
7 3 2
Result:
CategoryNAme Count
Category1 7
Category1.1 5
Category2.1 4
Category1.2 1
Category3.1 2
I can do it in my business layer, but performance is not optimal because of the size of the data. I am hoping that doing it in the data layer will improve performance greatly.
Thanks in advance for your reply.

Your tables and sample data:
create table #Category(Id int identity(1,1),Name Varchar(255),parentId int)
INSERT INTO #Category(Name,parentId) values
('Category1',null),('Category1.1',1),('Category2.1',2),
('Category1.2',1),('Category3.1',3)
create table #CategoryItem(Id int identity(1,1),categoryId int,itemId int)
INSERT INTO #CategoryItem(categoryId,itemId) values
(5,1),(4,2),(5,2),(3,1),(2,3),(1,1),(3,2)
create table #Item(Id int identity(1,1),Name varchar(255))
INSERT INTO #Item(Name) values('item1'),('item2'),('item3')
Finding all children of each parent with a recursive common table expression:
;WITH CategorySearch(ID, parentId) AS
(
SELECT ID, ID AS ParentId FROM #Category
UNION ALL
SELECT CT.Id,CS.parentId FROM #Category CT
INNER JOIN CategorySearch CS ON CT.ParentId = CS.ID
)
select * from CategorySearch order by parentId, ID
Output: every category paired with itself and with each of its ancestors
ID parentId
1 1
2 1
3 1
4 1
5 1
2 2
3 2
5 2
3 3
5 3
4 4
5 5
Final query for your result: count all items in each category and its descendant categories.
;WITH CategorySearch(ID, parentId) AS
(
SELECT ID, ID AS ParentId FROM #Category
UNION ALL
SELECT CT.Id,CS.parentId FROM #Category CT
INNER JOIN CategorySearch CS ON CT.ParentId = CS.ID
)
SELECT CA.Name AS CategoryName,count(itemId) CountItem
FROM #Category CA
INNER JOIN CategorySearch CS ON CS.ParentId = CA.id
INNER JOIN #CategoryItem MI ON MI.CategoryId =CS.ID
GROUP BY CA.Name
Output:
CategoryName CountItem
Category1 7
Category1.1 5
Category1.2 1
Category2.1 4
Category3.1 2
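For anyone who wants to try the idea from this answer without a SQL Server instance, here is a minimal sketch using Python's built-in sqlite3 module. The in-memory database, the lowercase table names, and the `ancestor_id` column name are assumptions made for illustration; the recursive CTE has the same shape as the one above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE category (id INTEGER PRIMARY KEY, name TEXT, parent_id INTEGER);
INSERT INTO category VALUES
  (1, 'Category1', NULL), (2, 'Category1.1', 1), (3, 'Category2.1', 2),
  (4, 'Category1.2', 1), (5, 'Category3.1', 3);
CREATE TABLE category_item (id INTEGER PRIMARY KEY, category_id INTEGER, item_id INTEGER);
INSERT INTO category_item (category_id, item_id) VALUES
  (5, 1), (4, 2), (5, 2), (3, 1), (2, 3), (1, 1), (3, 2);
""")

# Pair every category with itself and each ancestor, then count the
# category_item rows that fall under each ancestor.
rows = conn.execute("""
WITH RECURSIVE category_search(id, ancestor_id) AS (
  SELECT id, id FROM category
  UNION ALL
  SELECT c.id, cs.ancestor_id
  FROM category c
  JOIN category_search cs ON c.parent_id = cs.id
)
SELECT ca.name, COUNT(ci.item_id)
FROM category ca
JOIN category_search cs ON cs.ancestor_id = ca.id
JOIN category_item ci ON ci.category_id = cs.id
GROUP BY ca.id, ca.name
ORDER BY ca.id
""").fetchall()

for name, count in rows:
    print(name, count)
```

Note that the count is over CategoryItem rows, not distinct items, which is what the expected result in the question implies (Category1 is 7 even though there are only three items).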

With the help of a recursive CTE (common table expression), you can achieve what you are looking for.
See the Microsoft documentation for more details related to recursive CTEs: CTE MS SQL 2008 +
Here is a complete example with your sample data:
-- tables definition
SELECT 1 as id, 'cat1' as [name],NULL as id_parent
into cat
union
select 2, 'cat1.1', 1
union
select 3, 'cat2.1', 2
union
select 4, 'cat1.2', 1
union
select 5, 'cat3.1', 3
select 1 as id , 5 as id_cat, 1 as id_item
INTO item
UNION
select 2, 4, 2
UNION
select 3, 5, 2
UNION
select 4, 3, 1
UNION
select 5, 2, 3
UNION
select 6, 1, 1
UNION
select 7, 3, 2
-- CTE to get desired result
with childs
as
(
select c.id, c.id_parent
from cat c
UNION ALL
select s.id, p.id_parent
from cat s JOIN childs p
ON (s.id_parent=p.id)
),
category_count
AS
(
SELECT c.id, c.name, count(i.id) as items
from cat c left outer join item i
on (c.id=i.id_cat)
GROUP BY c.id,c.name
),
pairs
AS
(
SELECT id, ISNULL(id_parent,id) as id_parent
FROM childs
)
select p.id_parent, n.name, sum(items)
from pairs p JOIN category_count cc
ON (p.id=cc.id)
join cat n ON (p.id_parent=n.id)
GROUP by p.id_parent ,n.name
ORDER by 1;
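One practical difference in this variant is the LEFT OUTER JOIN in category_count: a category with no items of its own still gets a row. The sketch below replays the approach in SQLite through Python's sqlite3 module. The sixth category, 'cat1.3', is a hypothetical addition that is not in the question's data, and the `pairs` CTE is simplified to start from (id, id) instead of using the ISNULL step.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE cat (id INTEGER PRIMARY KEY, name TEXT, id_parent INTEGER);
INSERT INTO cat VALUES
  (1, 'cat1', NULL), (2, 'cat1.1', 1), (3, 'cat2.1', 2),
  (4, 'cat1.2', 1), (5, 'cat3.1', 3),
  (6, 'cat1.3', 1);  -- hypothetical extra category with no items
CREATE TABLE item (id INTEGER PRIMARY KEY, id_cat INTEGER, id_item INTEGER);
INSERT INTO item (id_cat, id_item) VALUES
  (5, 1), (4, 2), (5, 2), (3, 1), (2, 3), (1, 1), (3, 2);
""")

# Count items per category first, then sum those counts over each
# (descendant, ancestor) pair.
totals = conn.execute("""
WITH RECURSIVE
pairs(id, id_ancestor) AS (
  SELECT id, id FROM cat
  UNION ALL
  SELECT s.id, p.id_ancestor
  FROM cat s JOIN pairs p ON s.id_parent = p.id
),
category_count(id, items) AS (
  SELECT c.id, COUNT(i.id)
  FROM cat c LEFT JOIN item i ON c.id = i.id_cat
  GROUP BY c.id
)
SELECT n.name, SUM(cc.items)
FROM pairs p
JOIN category_count cc ON cc.id = p.id
JOIN cat n ON n.id = p.id_ancestor
GROUP BY n.id, n.name
ORDER BY n.id
""").fetchall()

for name, items in totals:
    print(name, items)
```

With the inner joins of the first answer, the empty category would be missing from the output; here it is reported with a count of 0.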

Related

get the sum of children nodes right below for every parent in a hierarchical query

I have a table objective with columns id, label, cost, parent_id
and I have this query:
select
b.id,
b.parent_id,
b.label,
b.cost
from objective b
start with b.parent_id is null
connect by
prior b.id=b.parent_id
ORDER SIBLINGS BY b.id
Only a leaf (and sometimes a parent of a leaf) has a non-null cost; the others are almost always null. Is it possible to get the cost of every node (summing the cost of the nodes right below it)?
I know it can be done with a join and sum, but I thought there might be a simpler way.
If this were the table content:
ID PARENT_ID LABEL COST
1 null A 0
2 1 B 1
3 2 C 3
4 1 D 0
5 4 E 3
6 5 F 7
7 4 G 5
The result should be something like this:
ID PARENT_ID LABEL COST CALCULATED_COST
1 null A 0 15 (meaning 0 is a wrong value)
2 1 B 1 3 (meaning 1 is a wrong value)
3 2 C 3 3
4 1 D 0 12
5 4 E 3 7 (meaning 3 is a wrong value)
6 5 F 7 7
7 4 G 5 5
only a leaf (and sometimes a parent of a leaf) has the column cost not null the others are almost always null
If the rule is that "only a leaf or a parent of a leaf have a non-null cost and the others are always null" then you can use:
SELECT id,
parent_id,
label,
COALESCE(PRIOR cost, 0) + COALESCE(cost, 0) AS cost
FROM objective
START WITH
parent_id IS NULL
CONNECT BY
PRIOR id = parent_id
ORDER SIBLINGS BY
id
If the rule is that "any row can have a non-null cost" then you will need to parse the data structure in both directions:
SELECT id,
parent_id,
label,
( SELECT COALESCE(SUM(cost), 0)
FROM objective c
START WITH c.id = o.id
CONNECT BY PRIOR parent_id = id -- Reverse the direction
) AS cost
FROM objective o
START WITH
parent_id IS NULL
CONNECT BY
PRIOR id = parent_id
ORDER SIBLINGS BY
id
or use a recursive sub-query factoring clause:
WITH rsqfc (id, parent_id, label, cost) AS (
SELECT id, parent_id, label, COALESCE(cost, 0)
FROM objective
WHERE parent_id IS NULL
UNION ALL
SELECT o.id, o.parent_id, o.label, COALESCE(o.cost, 0) + r.cost
FROM objective o
INNER JOIN rsqfc r
ON (r.id = o.parent_id)
) SEARCH DEPTH FIRST BY id SET order_id
SELECT *
FROM rsqfc
ORDER BY order_id;
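To make clear what this recursive form accumulates, here is a small sketch in SQLite via Python's sqlite3 module (an assumption; the answer targets Oracle, and the CTE name `walk` is made up for the example). It carries the parent's running cost down to each child, so every row ends up with the cost along the path from the root to that row, not the total of its subtree.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE objective (id INTEGER PRIMARY KEY, parent_id INTEGER, label TEXT, cost INTEGER);
INSERT INTO objective VALUES
  (1, NULL, 'A', 0), (2, 1, 'B', 1), (3, 2, 'C', 3), (4, 1, 'D', 0),
  (5, 4, 'E', 3), (6, 5, 'F', 7), (7, 4, 'G', 5);
""")

# Start at the root and add each row's cost to the running cost of its parent.
path_costs = conn.execute("""
WITH RECURSIVE walk(id, label, cost) AS (
  SELECT id, label, COALESCE(cost, 0) FROM objective WHERE parent_id IS NULL
  UNION ALL
  SELECT o.id, o.label, COALESCE(o.cost, 0) + w.cost
  FROM objective o JOIN walk w ON w.id = o.parent_id
)
SELECT label, cost FROM walk ORDER BY id
""").fetchall()

for label, cost in path_costs:
    print(label, cost)
```

For the sample data, F comes out as 10 (D 0 + E 3 + F 7), which shows the top-down direction of the sum.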
Update
If you are looking to sum the cost for a row and all its descendants then you can use:
SELECT id,
parent_id,
label,
cost,
( SELECT COALESCE(SUM(cost), 0)
FROM objective c
START WITH c.id = o.id
CONNECT BY PRIOR id = parent_id
) AS total_cost
FROM objective o
START WITH
parent_id IS NULL
CONNECT BY
PRIOR id = parent_id
ORDER SIBLINGS BY
id;
Which, for the sample data:
CREATE TABLE objective ( id, parent_id, label, cost ) AS
SELECT 1, NULL, 'A', 0 FROM DUAL UNION ALL
SELECT 2, 1, 'B', 1 FROM DUAL UNION ALL
SELECT 3, 2, 'C', 3 FROM DUAL UNION ALL
SELECT 4, 1, 'D', 0 FROM DUAL UNION ALL
SELECT 5, 4, 'E', 3 FROM DUAL UNION ALL
SELECT 6, 5, 'F', 7 FROM DUAL UNION ALL
SELECT 7, 4, 'G', 5 FROM DUAL;
Outputs:
ID PARENT_ID LABEL COST TOTAL_COST
1 null A 0 19
2 1 B 1 4
3 2 C 3 3
4 1 D 0 15
5 4 E 3 10
6 5 F 7 7
7 4 G 5 5
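On databases without CONNECT BY, the same per-row subtree total can be sketched with a recursive CTE that pairs each row with all of its descendants. The version below uses SQLite through Python's sqlite3 module; the CTE name `subtree` and its `root_id` column are assumptions for the example, not part of the answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE objective (id INTEGER PRIMARY KEY, parent_id INTEGER, label TEXT, cost INTEGER);
INSERT INTO objective VALUES
  (1, NULL, 'A', 0), (2, 1, 'B', 1), (3, 2, 'C', 3), (4, 1, 'D', 0),
  (5, 4, 'E', 3), (6, 5, 'F', 7), (7, 4, 'G', 5);
""")

# subtree holds (root_id, id) for every row id that sits at or below root_id.
total_costs = conn.execute("""
WITH RECURSIVE subtree(root_id, id) AS (
  SELECT id, id FROM objective
  UNION ALL
  SELECT s.root_id, o.id
  FROM objective o JOIN subtree s ON o.parent_id = s.id
)
SELECT r.label, COALESCE(SUM(o.cost), 0)
FROM subtree s
JOIN objective r ON r.id = s.root_id
JOIN objective o ON o.id = s.id
GROUP BY r.id, r.label
ORDER BY r.id
""").fetchall()

for label, total in total_costs:
    print(label, total)
```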
Update 2:
If you only want to total the descendants that are leaves then:
SELECT id,
parent_id,
label,
cost,
( SELECT COALESCE(SUM(cost), 0)
FROM objective c
WHERE CONNECT_BY_ISLEAF = 1
START WITH c.id = o.id
CONNECT BY PRIOR id = parent_id
) AS total_cost
FROM objective o
START WITH
parent_id IS NULL
CONNECT BY
PRIOR id = parent_id
ORDER SIBLINGS BY
id;
Which, for the sample data, outputs:
ID PARENT_ID LABEL COST TOTAL_COST
1 null A 0 15
2 1 B 1 3
3 2 C 3 3
4 1 D 0 12
5 4 E 3 7
6 5 F 7 7
7 4 G 5 5
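Where CONNECT_BY_ISLEAF is not available, a leaf can be identified as a row that no other row names as its parent. The sketch below applies that filter in SQLite via Python's sqlite3 module; the NOT EXISTS test stands in for CONNECT_BY_ISLEAF, and the names `subtree` and `root_id` are assumptions for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE objective (id INTEGER PRIMARY KEY, parent_id INTEGER, label TEXT, cost INTEGER);
INSERT INTO objective VALUES
  (1, NULL, 'A', 0), (2, 1, 'B', 1), (3, 2, 'C', 3), (4, 1, 'D', 0),
  (5, 4, 'E', 3), (6, 5, 'F', 7), (7, 4, 'G', 5);
""")

# Sum only those descendants (or the row itself) that have no children.
leaf_totals = conn.execute("""
WITH RECURSIVE subtree(root_id, id) AS (
  SELECT id, id FROM objective
  UNION ALL
  SELECT s.root_id, o.id
  FROM objective o JOIN subtree s ON o.parent_id = s.id
)
SELECT r.label, COALESCE(SUM(o.cost), 0)
FROM subtree s
JOIN objective r ON r.id = s.root_id
JOIN objective o ON o.id = s.id
WHERE NOT EXISTS (SELECT 1 FROM objective k WHERE k.parent_id = o.id)
GROUP BY r.id, r.label
ORDER BY r.id
""").fetchall()

for label, total in leaf_totals:
    print(label, total)
```

These totals match the CALCULATED_COST column the question asked for.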

Recursive query to retrieve all child from given id

I am migrating from Oracle to Postgres and I don't know how to write a recursive query.
Here is an example data set:
id id_parent name
1 0 aa
2 0 aa
3 1 aa
4 3 aa
5 3 aa
6 2 aa
7 6 aa
For id = 3, I want to get
id
3
4
5
For id = 1, I want to get
id
1
3
4
5
With this I get everything, but I don't know how to filter by id:
SELECT id FROM (
with recursive cat as (
select * from table
union all
select table.*
from table
join cat on cat.id_parent = table.id
)
select * from cat order by id
)
as listado where id != '0' group by id
There is no need for a sub-query. You just need to fix your recursive query, like this:
WITH RECURSIVE cat (id, id_parent) as
(
SELECT id, id_parent
FROM table
WHERE id = 1
UNION ALL
SELECT d.id, d.id_parent
FROM table d
JOIN cat ON cat.id = d.id_parent
)
SELECT id
FROM cat
ORDER BY id;
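The key change is that the anchor member selects only the starting row, and the recursive member walks from parent to child. A minimal sketch of the same query with the starting id passed as a parameter, using SQLite through Python's sqlite3 module (the table name `node` and the helper `descendants` are hypothetical; `table` itself is a reserved word and would need quoting):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE node (id INTEGER PRIMARY KEY, id_parent INTEGER, name TEXT);
INSERT INTO node VALUES
  (1, 0, 'aa'), (2, 0, 'aa'), (3, 1, 'aa'), (4, 3, 'aa'),
  (5, 3, 'aa'), (6, 2, 'aa'), (7, 6, 'aa');
""")


def descendants(root_id):
    """Return root_id and every id below it, in ascending order."""
    rows = conn.execute("""
        WITH RECURSIVE cat(id) AS (
          SELECT id FROM node WHERE id = ?      -- anchor: the starting row only
          UNION ALL
          SELECT d.id FROM node d JOIN cat ON cat.id = d.id_parent
        )
        SELECT id FROM cat ORDER BY id
    """, (root_id,)).fetchall()
    return [r[0] for r in rows]


print(descendants(3))
print(descendants(1))
```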

Using the results of a STRING_AGG function with the IN operator in a WHERE clause

I have a column children_ids which contains PKs produced by a STRING_AGG function. I am trying to use this column in a WHERE clause with the IN operator to return total_pets, but it doesn't work. If I copy and paste the values directly into the IN operator, the query returns the correct result; otherwise no results are found.
Here are my data sets:
Parents
=======
id parent_name
----------------
1 Bob and Mary
2 Mick and Jo
Children
========
id child_name parent_id
-------------------------
1 Eddie 1
2 Frankie 1
3 Robbie 1
4 Duncan 2
5 Rick 2
6 Jen 2
Childrens Pets
===============
id pet_name child_id
-------------------------
1 Puppy 1
2 Piggy 2
3 Monkey 3
4 Lamb 4
5 Tiger 5
6 Bear 6
7 Zebra 6
Expected Output
===============
parent_id children_ids total_pets
-----------------------------------
1 1,2,3 3
2 4,5,6 4
Current [undesired] Output
==========================
parent_id children_ids total_pets
-----------------------------------
1 1,2,3 0
2 4,5,6 0
Here is the standard SQL to test for yourself:
# setup data with standardSQL
WITH `parents` AS (
SELECT 1 id, 'Bob and Mary' parent_names UNION ALL
SELECT 2, 'Mick and Jo'
),
`children` AS (
SELECT 1 id, 'Eddie' child_name, 1 parent_id UNION ALL
SELECT 2, 'Frankie', 1 UNION ALL
SELECT 3, 'Robbie', 1 UNION ALL
SELECT 4, 'Duncan', 2 UNION ALL
SELECT 5, 'Rick', 2 UNION ALL
SELECT 6, 'Jen', 2
),
`childrens_pets` AS (
SELECT 1 id, 'Puppy' pet_name, 1 child_id UNION ALL
SELECT 2, 'Piggy', 2 UNION ALL
SELECT 3, 'Monkey', 3 UNION ALL
SELECT 4, 'Lamb', 4 UNION ALL
SELECT 5, 'Tiger', 5 UNION ALL
SELECT 6, 'Bear', 6 UNION ALL
SELECT 7, 'Zebra', 6
)
And the query:
#standardSQL
select
parent_id
, children_ids
-- !!! This keeps returning 0 instead of the total pets for each parent based on their children
, (
select count(p1.id)
from childrens_pets p1
where cast(p1.child_id as string) in (children_ids)
) as total_pets
from
(
SELECT
p.id as parent_id
, (
select string_agg(cast(c1.id as string))
from children as c1
where c1.parent_id = p.id
) as children_ids
FROM parents as p
join children as c
on p.id = c.parent_id
join childrens_pets as cp
on cp.child_id = c.id
)
GROUP BY
parent_id
, children_ids
... but is there a way to do it using the IN operator as my query ...
Just fix one line and it will work for you!
Replace
WHERE CAST(p1.child_id AS STRING) IN (children_ids)
with
WHERE CAST(p1.child_id AS STRING) IN (SELECT * FROM UNNEST(SPLIT(children_ids)))
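The reason the original line returns 0 is that IN (children_ids) compares each child_id against a single string such as '1,2,3', not against three separate values. The sketch below reproduces that behaviour in SQLite through Python's sqlite3 module. SQLite is an assumption here and has no SPLIT or UNNEST, so the second query shows a join-based count instead of the BigQuery fix.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE children (id INTEGER PRIMARY KEY, child_name TEXT, parent_id INTEGER);
INSERT INTO children VALUES
  (1, 'Eddie', 1), (2, 'Frankie', 1), (3, 'Robbie', 1),
  (4, 'Duncan', 2), (5, 'Rick', 2), (6, 'Jen', 2);
CREATE TABLE childrens_pets (id INTEGER PRIMARY KEY, pet_name TEXT, child_id INTEGER);
INSERT INTO childrens_pets VALUES
  (1, 'Puppy', 1), (2, 'Piggy', 2), (3, 'Monkey', 3), (4, 'Lamb', 4),
  (5, 'Tiger', 5), (6, 'Bear', 6), (7, 'Zebra', 6);
""")

# IN with one string operand is just an equality test against '1,2,3',
# so no child_id can ever match.
broken = conn.execute(
    "SELECT COUNT(*) FROM childrens_pets "
    "WHERE CAST(child_id AS TEXT) IN ('1,2,3')"
).fetchone()[0]

# Joining on the ids themselves avoids the string round trip entirely.
per_parent = conn.execute("""
SELECT c.parent_id, COUNT(cp.id)
FROM children c JOIN childrens_pets cp ON cp.child_id = c.id
GROUP BY c.parent_id
ORDER BY c.parent_id
""").fetchall()

print(broken)
print(per_parent)
```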
Huh? This would seem to do what you want:
SELECT p.id as parent_id,
string_agg(distinct cast(c.id as string)) as children_ids,
count(distinct cp.id) as num_pets
FROM parents p JOIN
children c
ON p.id = c.parent_id JOIN
childrens_pets cp
ON cp.child_id = c.id
GROUP BY p.id;

SQL Database Parent/Child recursion

Here is my table:
parent_id | child_id
--------------
1 | 2
1 | 3
1 | 4
2 | 5
2 | 6
5 | 8
8 | 9
9 | 5
I need to get all of the items under parent 2. I've found a few things similar to this, but couldn't figure out how to make it work for my case. I keep getting a "maximum recursion limit reached" error. Here's what I have:
WITH CTE AS
(
SELECT gt.[child_id]
FROM [CHSPortal].[dbo].[company_adgroupstoadgroups] gt
WHERE gt.parent_id='2'
UNION ALL
SELECT g.[child_id]
FROM [CHSPortal].[dbo].[company_adgroupstoadgroups] g
INNER JOIN CTE g2 on g.parent_id=g2.child_id
)
select distinct child_id from CTE
The desired result is going to be: 2,3,4,5,6,8,9.
What modification do I need to make to get a list of all the items under child 2? I would also prefer 2 (the parent node) to be in the list. Any help would be appreciated.
First of all, there is a loop in your example (5|8, 8|9, 9|5), that is why you reach the maximum recursion limit.
Regarding the filtering question, below is an example of filtering by root node:
;WITH MTree (parent_id, child_id, LEVEL) AS (
SELECT t.parent_id , t.child_id, 0 AS LEVEL
FROM table_1 t
WHERE child_id = 2 --here you can filter the root node
UNION ALL
SELECT m.parent_id , m.child_id, LEVEL + 1
FROM Table_1 m
INNER JOIN MTree t ON t.child_id = m.parent_id
)
SELECT * FROM Mtree;
Not sure what's wrong with your query, aside from not relating to the sample data you provided, but this works just fine:
;WITH src AS (SELECT 1 AS parent_id, 2 AS child_id
UNION SELECT 1, 3
UNION SELECT 1, 4
UNION SELECT 2, 5
UNION SELECT 2, 6
UNION SELECT 5, 8
UNION SELECT 8, 9
UNION SELECT 9, 5)
,cte AS (SELECT *
FROM src
WHERE child_id = 2
UNION ALL
SELECT a.*
FROM src a
JOIN cte b
ON a.parent_id = b.child_id
)
SELECT TOP 100 *
FROM cte
--Limited to top 100 because of infinite recursion problem with sample data.
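One way to see the effect of the 5 -> 8 -> 9 -> 5 loop is to run the traversal on an engine that can deduplicate rows during recursion. The sketch below uses SQLite through Python's sqlite3 module, where a recursive CTE written with UNION (instead of UNION ALL) drops rows it has already produced and therefore terminates. This is an assumption about a different engine: SQL Server requires UNION ALL in recursive CTEs, so there you would need a depth limit or a visited-path column instead. The table name `edges` is made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE edges (parent_id INTEGER, child_id INTEGER);
INSERT INTO edges VALUES
  (1, 2), (1, 3), (1, 4), (2, 5), (2, 6), (5, 8), (8, 9), (9, 5);
""")

# UNION discards ids that were already visited, so the cycle stops the
# recursion instead of looping forever.
rows = conn.execute("""
WITH RECURSIVE under(id) AS (
  SELECT 2
  UNION
  SELECT e.child_id FROM edges e JOIN under u ON e.parent_id = u.id
)
SELECT id FROM under ORDER BY id
""").fetchall()
ids = [r[0] for r in rows]

print(ids)
```

With the sample data, 3 and 4 do not appear because they are children of 1, not of 2.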

Need sql query for matching with three values

I have a table like below
CAccountID CID NetworkID
1 1 1
2 1 2
3 2 1
4 2 2
5 2 3
6 3 1
7 3 2
8 3 3
9 4 1
10 4 2
I need a query to select all CIDs that have all 3 NetworkIDs (1, 2, 3); CIDs that have only NetworkIDs 1 and 2 should not be displayed.
The output should be like below:
CAccountID CID NetworkID
3 2 1
4 2 2
5 2 3
6 3 1
7 3 2
8 3 3
You can use GROUP BY with a JOIN:
select t.*
from table t inner join
( select cid
from table
where NetworkID in (1,2,3)
group by cid
having count(distinct NetworkID) = 3
) tt
on tt.cid = t.cid;
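A runnable sketch of this GROUP BY / HAVING approach against the question's sample data, using SQLite through Python's sqlite3 module (the table name `accounts` is an assumption, since `table` is only a placeholder in the query above):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE accounts (CAccountID INTEGER PRIMARY KEY, CID INTEGER, NetworkID INTEGER);
INSERT INTO accounts VALUES
  (1, 1, 1), (2, 1, 2), (3, 2, 1), (4, 2, 2), (5, 2, 3),
  (6, 3, 1), (7, 3, 2), (8, 3, 3), (9, 4, 1), (10, 4, 2);
""")

# Keep only the CIDs that cover all three distinct NetworkIDs, then join
# back to return their full rows.
rows = conn.execute("""
SELECT t.*
FROM accounts t
JOIN (
  SELECT CID FROM accounts
  WHERE NetworkID IN (1, 2, 3)
  GROUP BY CID
  HAVING COUNT(DISTINCT NetworkID) = 3
) tt ON tt.CID = t.CID
ORDER BY t.CAccountID
""").fetchall()

for row in rows:
    print(row)
```

Counting DISTINCT NetworkID values keeps the check correct even if a CID has duplicate rows for the same network, which a plain COUNT(*) = 3 would not.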
Try this:
select * from my_table t
where exists(select 1 from my_table
where CID = t.CID and NetworkID in (1,2,3)
group by CID
having count(*) = 3)
Try this:
select * from <<tablename>> where cid in(select cid from <<tablename>> group by cid having count(*)=3)
Here the subquery returns all those cid values that have exactly 3 rows in your table.
Or, if you have more network ids, the INTERSECT operator can be helpful:
select * from <<tablename>> where cid in (
select cid from <<tablename>> where NetworkID=1
INTERSECT
select cid from <<tablename>> where NetworkID=2
INTERSECT
select cid from <<tablename>> where NetworkID=3
);
The INTERSECT operator returns only the rows common to all of the queries, so unpredictability in your data can be handled this way.
Try FOR XML PATH:
SELECT *
FROM Table_Name B
WHERE (SELECT [text()] = A.Network FROM Table_Name A WHERE A.CID = B.CID
ORDER BY CID, CAAccount FOR XML PATH('')) = 123
CTE Demo:
; WITH CTE(CAAccount, CID, Network) AS
(
SELECT 1 , 1, 1 UNION ALL
SELECT 2 , 1, 2 UNION ALL
SELECT 3 , 2, 1 UNION ALL
SELECT 4 , 2, 2 UNION ALL
SELECT 5 , 2, 3 UNION ALL
SELECT 6 , 3, 1 UNION ALL
SELECT 7 , 3, 2 UNION ALL
SELECT 8 , 3, 3 UNION ALL
SELECT 9 , 4, 1 UNION ALL
SELECT 10, 4, 2
) SELECT *
FROM CTE B
WHERE (SELECT [text()] = A.Network FROM CTE A WHERE A.CID = B.CID ORDER BY CID, CAAccount FOR XML PATH('')) = 123
Output:
CAAccount CID Network
3 2 1
4 2 2
5 2 3
6 3 1
7 3 2
8 3 3