Recursively sum the nodes of a tree using PostgreSQL WITH clause - sql

(Using PostgreSQL 9.1)
I have a tree structure in the database and I need to sum the nodes' values. There are two caveats:
Not all nodes have a value.
If a parent node has a value, ignore the child values.
While recursing the tree is easy with the powerful recursive WITH clause, it's enforcing these two caveats that is breaking my code. Here's my setup:
CREATE TABLE node (
id VARCHAR(1) PRIMARY KEY
);
INSERT INTO node VALUES ('A');
INSERT INTO node VALUES ('B');
INSERT INTO node VALUES ('C');
INSERT INTO node VALUES ('D');
INSERT INTO node VALUES ('E');
INSERT INTO node VALUES ('F');
INSERT INTO node VALUES ('G');
CREATE TABLE node_value (
id VARCHAR(1) PRIMARY KEY,
value INTEGER
);
INSERT INTO node_value VALUES ('B', 5);
INSERT INTO node_value VALUES ('D', 2);
INSERT INTO node_value VALUES ('E', 0);
INSERT INTO node_value VALUES ('F', 3);
INSERT INTO node_value VALUES ('G', 4);
CREATE TABLE tree (
parent VARCHAR(1),
child VARCHAR(1)
);
INSERT INTO tree VALUES ('A', 'B');
INSERT INTO tree VALUES ('B', 'D');
INSERT INTO tree VALUES ('B', 'E');
INSERT INTO tree VALUES ('A', 'C');
INSERT INTO tree VALUES ('C', 'F');
INSERT INTO tree VALUES ('C', 'G');
This gives me the following tree (nodes and values):
A
|--B(5)
| |--D(2)
| |--E(0)
|
|--C
|--F(3)
|--G(4)
Given the rules above, here are the expected sum values:
A = (5 + 3 + 4) = 12
B = 5
D = 2
E = 0
C = (3 + 4) = 7
F = 3
G = 4
I have written the following SQL, but I can't integrate the recursive UNION and JOIN logic to enforce rule #1 and #2:
WITH recursive treeSum(root, parent, child, total_value) AS (
SELECT tree.parent root, tree.parent, tree.child, node_value.value total_value
FROM tree
LEFT JOIN node_value ON node_value.id = tree.parent
UNION
SELECT treeSum.root, tree.parent, tree.child, node_value.value total_value
FROM tree
INNER JOIN treeSum ON treeSum.child = tree.parent
LEFT JOIN node_value ON node_value.id = tree.parent
)
SELECT root, sum(total_value) FROM treeSum WHERE root = 'A' GROUP BY root
The query returns 10 for root A, but it should be 12. I know the UNION and/or JOIN logic is what's throwing this off. Any help would be appreciated.
EDIT: To clarify, the sum for A is 12, not 14. Given the rules, if a node has a value, grab that value and ignore its children. Because B has a value of 5 we ignore D and E. C has no value, so we grab its children, thus the sum of A = 5(B) + 3(F) + 4(G) = 12. I know it's odd but that's the requirement. Thanks.
EDIT 2: These results will be joined with external datasets so I can't hardcode the root in the WITH clause. For example, I might need something like this:
SELECT root, SUM(total_value) FROM treeSum WHERE root = 'A' GROUP BY root
This tree is one of many so that means there's multiple roots, specified by calling code--not within the recursive clause itself. Thanks.
EDIT 3: An example of how this will be used in production is the roots will be specified by another table, so I can't hardcode the root into the recursive clause. There might be many roots from many trees.
SELECT rts.id, SUM(COALESCE(value,0)) FROM treeSUM
INNER JOIN roots_to_select rts ON rts.id = treeSUM.id GROUP BY rts.id
SOLUTION (cleaned up from koriander's answer below): the following allows roots to be specified by outside sources, either through roots_to_select or through WHERE criteria:
WITH recursive roots_to_select AS (
SELECT 'A'::varchar as id
),
treeSum(root, id, value) AS (
select node.id as root, node.id, node_value.value
from node
inner join roots_to_select rts on (node.id = rts.id)
left join node_value on (node.id = node_value.id)
union
select treeSum.root, node.id, node_value.value
from treeSum
inner join tree on (treeSum.id = tree.parent)
inner join node on (tree.child = node.id)
left join node_value on (node.id = node_value.id)
where treeSum.value is null
)
select root, sum(coalesce(value, 0))
from treeSum
group by root
OUTPUT: 12

tested here:
with recursive treeSum(id, value) AS (
select node.id, node_value.value
from node
left join node_value on (node.id = node_value.id)
where node.id = 'A'
union
select node.id, node_value.value
from treeSum
inner join tree on (treeSum.id = tree.parent)
inner join node on (tree.child = node.id)
left join node_value on (node.id = node_value.id)
where treeSum.value is null
)
select sum(coalesce(value, 0)) from treeSum
Edit 1: to combine the result with other table, you can do:
select id, (select sum(coalesce(value, 0)) from treeSum) as nodesum
from node
inner join some_table on (...)
where node.id = 'A'
Edit 2: to support multiple roots based on your Edit 3, you can do (untested):
with recursive treeSum(root, id, value) AS (
select node.id as root, node.id, node_value.value
from node
inner join roots_to_select rts on (node.id = rts.id)
left join node_value on (node.id = node_value.id)
union
select treeSum.root, node.id, node_value.value
from treeSum
inner join tree on (treeSum.id = tree.parent)
inner join node on (tree.child = node.id)
left join node_value on (node.id = node_value.id)
where treeSum.value is null
)
select root, sum(coalesce(value, 0))
from treeSum
group by root
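To check the query against every expected value listed in the question rather than only A, one option is to seed the recursion with every node instead of roots_to_select. This is only a sketch: it assumes the node, node_value and tree tables from the question's setup, it swaps UNION for UNION ALL (safe here because a tree cannot reach the same node twice from one root), and the total alias is my own naming.

```sql
-- Sketch: every node acts as a root, so one run yields all per-node sums.
-- Assumes the node, node_value and tree tables from the question.
WITH RECURSIVE treeSum(root, id, value) AS (
    -- Anchor: each node starts its own subtree walk.
    SELECT node.id, node.id, node_value.value
    FROM node
    LEFT JOIN node_value ON node_value.id = node.id
  UNION ALL
    -- Descend only below rows that have no value of their own (rule 2).
    SELECT treeSum.root, tree.child, node_value.value
    FROM treeSum
    INNER JOIN tree ON tree.parent = treeSum.id
    LEFT JOIN node_value ON node_value.id = tree.child
    WHERE treeSum.value IS NULL
)
SELECT root, SUM(COALESCE(value, 0)) AS total
FROM treeSum
GROUP BY root
ORDER BY root;
-- With the question's data: A 12, B 5, C 7, D 2, E 0, F 3, G 4
```

Filtering or joining on root afterwards (for example against roots_to_select) then picks out only the trees that are needed.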

Related

UPDATE Table based on the same table with relations to other tables

I have two tables:
Product (Id, RefKey, ParentId)
example data: (1, 'SX1234', NULL), (2, 'SX4321', NULL)
and
ProductSTAGE (Id, RefKeyCode, ParentCode)
example data: (1, 'SX1234', 'SX4321')
where Product.RefKey = ProductSTAGE.RefKeyCode
How can I update the Product table based on these relations to get this result?
Product (Id, RefKey, ParentId)
result data: (1, 'SX1234', 2)
I used
WITH CTE
AS (
SELECT P.ParentId FROM Product AS P
)
UPDATE CTE SET ParentId = P2.Id
FROM Product AS P2
INNER JOIN ProductSTAGE AS PS ON PS.RefKeyCode = P2.RefKey
WHERE PS.ParentCode IS NOT NULL
but with this, Product.ParentId always ends up equal to Product.Id.
I found the problem with my CTE, but a CTE is not necessary here. This query works and is enough for me:
UPDATE P
SET P.ParentId = P2.Id
FROM Product AS P
INNER JOIN ProductSTAGE AS PS ON P.RefKey = PS.RefKeyCode
INNER JOIN Product AS P2 ON P2.RefKey = PS.ParentCode
WHERE PS.ParentCode IS NOT NULL
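For anyone who wants to try the working UPDATE without touching real tables, here is a small self-contained sketch in T-SQL. The table variables @Product and @ProductSTAGE and the column types are assumptions standing in for the real tables, which the question only describes by column name.

```sql
-- Sketch: table variables stand in for the real Product / ProductSTAGE tables.
DECLARE @Product TABLE (Id int PRIMARY KEY, RefKey varchar(20), ParentId int NULL);
DECLARE @ProductSTAGE TABLE (Id int PRIMARY KEY, RefKeyCode varchar(20), ParentCode varchar(20) NULL);

INSERT INTO @Product (Id, RefKey, ParentId)
VALUES (1, 'SX1234', NULL), (2, 'SX4321', NULL);
INSERT INTO @ProductSTAGE (Id, RefKeyCode, ParentCode)
VALUES (1, 'SX1234', 'SX4321');

-- P is the row being updated; P2 is the row whose RefKey matches the staged ParentCode.
UPDATE P
SET ParentId = P2.Id
FROM @Product AS P
INNER JOIN @ProductSTAGE AS PS ON PS.RefKeyCode = P.RefKey
INNER JOIN @Product AS P2 ON P2.RefKey = PS.ParentCode;

SELECT Id, RefKey, ParentId FROM @Product ORDER BY Id;
-- Returns (1, 'SX1234', 2) and (2, 'SX4321', NULL)
```

The WHERE PS.ParentCode IS NOT NULL filter is left out here because the inner join on P2.RefKey = PS.ParentCode already discards staged rows whose ParentCode is NULL; keeping it does no harm.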

Transpose data in SQL Server Select

I am wondering if there is a better way to write this query. It achieves the target result, but my colleague would prefer it be written without the subselects into temp tables t1-t3. The main "challenge" here is transposing the data from dbo.ReviewsData into a single row along with the rest of the data joined from dbo.Products and dbo.Reviews.
CREATE TABLE dbo.Products (
idProduct int NOT NULL,
product_title varchar(100),
PRIMARY KEY (idProduct)
);
INSERT INTO dbo.Products VALUES
(1001, 'poptart'),
(1002, 'coat hanger'),
(1003, 'sunglasses');
CREATE TABLE dbo.Reviews (
Rev_IDReview int NOT NULL,
Rev_IDProduct int,
PRIMARY KEY (Rev_IDReview),
FOREIGN KEY (Rev_IDProduct) REFERENCES dbo.Products(idProduct)
);
INSERT INTO dbo.Reviews VALUES
(456, 1001),
(457, 1002),
(458, 1003);
CREATE TABLE dbo.ReviewFields (
RF_IDField int NOT NULL,
RF_FieldName varchar(32),
PRIMARY KEY (RF_IDField)
);
INSERT INTO dbo.ReviewFields VALUES
(1, 'Customer Name'),
(2, 'Review Title'),
(3, 'Review Message');
CREATE TABLE dbo.ReviewsData (
RD_idData int NOT NULL,
RD_IDReview int,
RD_IDField int,
RD_FieldContent varchar(100),
PRIMARY KEY (RD_idData),
FOREIGN KEY (RD_IDReview) REFERENCES dbo.Reviews(Rev_IDReview)
);
INSERT INTO dbo.ReviewsData VALUES
(79, 456, 1, 'Daniel'),
(80, 456, 2, 'Love this item!'),
(81, 456, 3, 'Works well...blah blah'),
(82, 457, 1, 'Joe!'),
(84, 457, 2, 'Pure Trash'),
(85, 457, 3, 'It was literally a used banana peel'),
(86, 458, 1, 'Karen'),
(87, 458, 2, 'Could be better'),
(88, 458, 3, 'I can always find something wrong');
SELECT P.product_title as "item", t1.ReviewedBy, t2.ReviewTitle, t3.ReviewContent
FROM dbo.Reviews R
INNER JOIN dbo.Products P
ON P.idProduct = R.Rev_IDProduct
INNER JOIN (
SELECT D.RD_FieldContent AS "ReviewedBy", D.RD_IDReview
FROM dbo.ReviewsData D
WHERE D.RD_IDField = 1
) t1
ON t1.RD_IDReview = R.Rev_IDReview
INNER JOIN (
SELECT D.RD_FieldContent AS "ReviewTitle", D.RD_IDReview
FROM dbo.ReviewsData D
WHERE D.RD_IDField = 2
) t2
ON t2.RD_IDReview = R.Rev_IDReview
INNER JOIN (
SELECT D.RD_FieldContent AS "ReviewContent", D.RD_IDReview
FROM dbo.ReviewsData D
WHERE D.RD_IDField = 3
) t3
ON t3.RD_IDReview = R.Rev_IDReview
EDIT: I have updated this post with the create statements for the tables as opposed to an image of the data (shame on me) and a more specific description of what exactly needed to be improved. Thanks to all for the comments and patience.
As others have said in comments, there is nothing objectively wrong with the query. However, you could argue that it's verbose and hard to read.
One way to shorten it is to replace INNER JOIN with CROSS APPLY:
INNER JOIN (
SELECT D.RD_FieldContent AS 'ReviewedBy', D.RD_IDReview
FROM dbo.ReviewsData D
WHERE D.RD_IDField = 1
) t1
ON t1.RD_IDReview = R.Rev_IDReview
APPLY lets you refer to values from the outer query, like in a subquery:
CROSS APPLY (
SELECT D.RD_FieldContent AS 'ReviewedBy'
FROM dbo.ReviewsData D
WHERE D.RD_IDField = 1 AND D.RD_IDReview = R.Rev_IDReview
) t1
I think of APPLY like a subquery that brings in new columns. It's like a cross between a subquery and a join. Benefits:
The query can be shorter, because you don't have to repeat the ID column twice.
You don't have to expose columns that you don't need.
Disadvantages:
If the query in the APPLY references outer values, then you can't extract it and run it all by itself without modifications.
APPLY is not standard SQL (it is supported by SQL Server and by Oracle 12c and later), and it's not that widely used.
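Put together, the whole query with all three joins replaced might look like the sketch below. It assumes the question's tables, with the data table named dbo.ReviewsData as the queries spell it, and it keeps the question's t1-t3 aliases.

```sql
-- Sketch: each CROSS APPLY pulls one field of the current review into its own column.
SELECT P.product_title AS item,
       t1.ReviewedBy,
       t2.ReviewTitle,
       t3.ReviewContent
FROM dbo.Reviews R
INNER JOIN dbo.Products P
    ON P.idProduct = R.Rev_IDProduct
CROSS APPLY (
    SELECT D.RD_FieldContent AS ReviewedBy
    FROM dbo.ReviewsData D
    WHERE D.RD_IDField = 1 AND D.RD_IDReview = R.Rev_IDReview
) t1
CROSS APPLY (
    SELECT D.RD_FieldContent AS ReviewTitle
    FROM dbo.ReviewsData D
    WHERE D.RD_IDField = 2 AND D.RD_IDReview = R.Rev_IDReview
) t2
CROSS APPLY (
    SELECT D.RD_FieldContent AS ReviewContent
    FROM dbo.ReviewsData D
    WHERE D.RD_IDField = 3 AND D.RD_IDReview = R.Rev_IDReview
) t3;
```

Like the inner joins it replaces, CROSS APPLY drops a review that lacks any of the three fields; OUTER APPLY would keep such a review and return NULL for the missing field.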
Another thing to consider is using subqueries instead of joins for values that you only need in one place. Benefits:
The queries can be made shorter, because you don't have to repeat the ID column twice, and you don't have to give the output columns unique aliases.
You only have to look in one place to see the whole subquery.
Scalar subqueries can return at most 1 row (SQL Server raises an error otherwise), so you can't accidentally create extra rows if only 1 row is desired.
SELECT
P.product_title as 'item',
(SELECT D.RD_FieldContent
FROM dbo.ReviewsData D
WHERE D.RD_IDField = 1 AND
D.RD_IDReview = R.Rev_IDReview) as ReviewedBy,
(SELECT D.RD_FieldContent
FROM dbo.ReviewsData D
WHERE D.RD_IDField = 2 AND
D.RD_IDReview = R.Rev_IDReview) as ReviewTitle,
(SELECT D.RD_FieldContent
FROM dbo.ReviewsData D
WHERE D.RD_IDField = 3 AND
D.RD_IDReview = R.Rev_IDReview) as ReviewContent
FROM dbo.Reviews R
INNER JOIN dbo.Products P ON P.idProduct = R.Rev_IDProduct
Edit:
It just occurred to me that you have made the joins themselves unnecessarily verbose (@Dale K actually already pointed this out in the comments):
INNER JOIN (
SELECT D.RD_FieldContent AS 'ReviewedBy', D.RD_IDReview
FROM dbo.ReviewsData D
WHERE D.RD_IDField = 1
) t1
ON t1.RD_IDReview = R.Rev_IDReview
Shorter:
SELECT RevBy.RD_FieldContent AS 'ReviewedBy'
...
INNER JOIN dbo.ReviewsData RevBy
ON RevBy.RD_IDReview = R.Rev_IDReview AND
RevBy.RD_IDField = 1
The originally submitted query is undoubtedly and unnecessarily verbose. Having digested various feedback from the community it has been revised to the following, working splendidly. In retrospect I feel very silly for having done this with subselects originally. I am clearly intermediate at best when it comes to SQL - I had not realized an "AND" clause could be included in the "ON" clause in a "JOIN" statement. Not sure why I would have made such a poor assumption.
SELECT
P.product_title as 'item',
D1.RD_FieldContent as 'ReviewedBy',
D2.RD_FieldContent as 'ReviewTitle',
D3.RD_FieldContent as 'ReviewContent'
FROM dbo.Reviews R
INNER JOIN dbo.Products P
ON P.idProduct = R.Rev_IDProduct
INNER JOIN dbo.ReviewsData D1
ON D1.RD_IDReview = R.Rev_IDReview AND D1.RD_IDField = 1
INNER JOIN dbo.ReviewsData D2
ON D2.RD_IDReview = R.Rev_IDReview AND D2.RD_IDField = 2
INNER JOIN dbo.ReviewsData D3
ON D3.RD_IDReview = R.Rev_IDReview AND D3.RD_IDField = 3
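Another common way to transpose rows like these is conditional aggregation: join dbo.ReviewsData once and pick each field out with a CASE inside MAX. The sketch below is an alternative, not what the thread settled on; it assumes the question's tables, with the data table named dbo.ReviewsData as the queries spell it.

```sql
-- Sketch: one join to the data table, one output column per field id.
SELECT P.product_title AS item,
       MAX(CASE WHEN D.RD_IDField = 1 THEN D.RD_FieldContent END) AS ReviewedBy,
       MAX(CASE WHEN D.RD_IDField = 2 THEN D.RD_FieldContent END) AS ReviewTitle,
       MAX(CASE WHEN D.RD_IDField = 3 THEN D.RD_FieldContent END) AS ReviewContent
FROM dbo.Reviews R
INNER JOIN dbo.Products P
    ON P.idProduct = R.Rev_IDProduct
INNER JOIN dbo.ReviewsData D
    ON D.RD_IDReview = R.Rev_IDReview
-- Group per review so two reviews of the same product stay on separate rows.
GROUP BY R.Rev_IDReview, P.product_title;
```

One behavioural difference to be aware of: a review that is missing one of the three fields still appears here, with NULL in that column, whereas the three inner joins would drop it.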

SQL Server query for merging 2 rows in 1

There is a SQL Server table (see the screenshot below) that I cannot change:
Products have identifiers and process parameters. There are 2 processes, A and B. Every process stores its data in its own row.
I would like to get a table result without useless NULL values. One product on one row, no more. Desired cells are highlighted. See the 2nd screenshot for the desired output:
select a.id,
isnull(b.state, a.state) as state,
isnull(b.process, a.process) as process,
isnull(b.length, a.length) as length,
isnull(b.force, a.force) as force,
isnull(b.angle, a.angle) as angle
from table as a
left join table as b
on a.id = b.id
and b.process = 'B'
where a.process = 'A'
DECLARE @t AS TABLE (id int, state varchar(10), process varchar(10), length int, angle int,
primary key (id, process));
insert into @t (id, state, process, length, angle) values
(111, 'OK', 'A', 77, null)
,(111, 'OK', 'B', null, 30)
,(159, 'NOK', 'A', 89, null)
,(147, 'OK', 'A', 78, null)
,(147, 'NOK', 'B', null, 36);
select ta.id, --ta.*, tb.*
isnull(tb.state, ta.state) as state,
isnull(tb.process, ta.process) as process,
isnull(tb.length, ta.length) as length,
isnull(tb.angle, ta.angle) as angle
from @t ta
left join @t tb
on ta.id = tb.id
and tb.process = 'B'
where ta.process = 'A'
order by ta.id
Self join? (edited)
so,
select
a.id,
a.process,
isnull(a.length, b.length) as length,
isnull(a.force, b.force) as force,
isnull(a.angle, b.angle) as angle
from
(select * from tmpdel where process = 'A') as a left join
(select * from tmpdel where process = 'B') as b on
a.id = b.id
I've given this a quick test and I think it looks right.

How to handle recursion in a flat table?

I have two tables that keep track of permissions for groups of users. The first table is just two columns, an identifier and a name, used solely for the names of the permissions. The second table is where the permissions are applied and parent permissions are assigned to create a hierarchy. My problem is that I'm using joins to create a permission hierarchy "string" based on parent permissions and, without knowing how deep that parent recursion might go, I have no way of knowing how many joins to make. My question is: is there a more correct way to solve this problem?
I've included a complete working script, but I stripped unnecessary columns:
CREATE TABLE #TempPermissions
(
Permission_ID INT IDENTITY,
Permission VARCHAR(50)
)
CREATE TABLE #TempAppPermissions
(
AppPermission_ID INT IDENTITY,
Permission_ID INT,
Parent_ID INT
)
INSERT INTO #TempPermissions VALUES ('Users')
INSERT INTO #TempPermissions VALUES ('Add')
INSERT INTO #TempPermissions VALUES ('Edit')
INSERT INTO #TempPermissions VALUES ('Remove')
INSERT INTO #TempPermissions VALUES ('Permissions')
INSERT INTO #TempPermissions VALUES ('Configure')
INSERT INTO #TempAppPermissions VALUES (1, -1)
INSERT INTO #TempAppPermissions VALUES (2, 1)
INSERT INTO #TempAppPermissions VALUES (3, 1)
INSERT INTO #TempAppPermissions VALUES (4, 1)
INSERT INTO #TempAppPermissions VALUES (5, 1)
INSERT INTO #TempAppPermissions VALUES (6, 5)
SELECT app.AppPermission_ID,
(CASE WHEN NOT child3.Permission IS NULL THEN '/' + child3.Permission ELSE '' END)+
(CASE WHEN NOT child2.Permission IS NULL THEN '/' + child2.Permission ELSE '' END)+
'/' + child1.Permission AS PermissionString
FROM #TempAppPermissions app
INNER JOIN #TempPermissions child1
ON child1.Permission_ID = app.Permission_ID
LEFT JOIN #TempAppPermissions parent1
ON parent1.AppPermission_ID = app.Parent_ID
LEFT JOIN #TempPermissions child2
ON child2.Permission_ID = parent1.Permission_ID
LEFT JOIN #TempAppPermissions parent2
ON parent2.AppPermission_ID = parent1.Parent_ID
LEFT JOIN #TempPermissions child3
ON child3.Permission_ID = parent2.Permission_ID
DROP TABLE #TempPermissions, #TempAppPermissions
This provides me with the results:
AppPermission_ID PermissionString
1 /Users
2 /Users/Add
3 /Users/Edit
4 /Users/Remove
5 /Users/Permissions
6 /Users/Permissions/Configure
This works fine as is, but if I were to go another parent deep with:
INSERT INTO #TempPermissions VALUES ('Reports')
INSERT INTO #TempAppPermissions VALUES (7, 6)
I would have to compensate for it with another set of joins and another case expression in the select statement:
(CASE WHEN NOT child4.Permission IS NULL THEN '/' + child4.Permission ELSE '' END)+
...
LEFT JOIN #TempAppPermissions parent3
ON parent3.AppPermission_ID = parent2.Parent_ID
LEFT JOIN #TempPermissions child4
ON child4.Permission_ID = parent3.Permission_ID
If I do not, I will end up losing the topmost parent on the last result:
1 /Users
2 /Users/Add
3 /Users/Edit
4 /Users/Remove
5 /Users/Permissions
6 /Users/Permissions/Configure
7 /Permissions/Configure/Reports
Technically, I could repeat this any number of times to compensate for how deep that structure may go, but I have the feeling there is probably a better approach to this problem. Thanks in advance.
I would use a CTE (common table expression).
;WITH t AS (
SELECT 1 AS iteration, p.Permission_ID AS PermissionID, p.Permission_ID, CAST(N'/' + p.Permission AS NVARCHAR(MAX)) AS Permission
FROM #TempPermissions AS p
UNION ALL
SELECT iteration + 1, t.PermissionID, p.Parent_ID, COALESCE(N'/' + (SELECT s.Permission FROM #TempPermissions AS s WHERE s.Permission_ID = p.Parent_ID), N'') + t.Permission
FROM t INNER JOIN #TempAppPermissions AS p ON t.Permission_ID = p.Permission_ID
)
SELECT PermissionID, Permission FROM t
WHERE Permission_ID = -1
ORDER BY PermissionID, Iteration
Let me know if this helps!
Supplementing the top answer here since I can't post the code in a comment. After playing with JoeFletch's code a bit, I realized the actual recursion should only be happening with the table #TempAppPermissions. With JoeFletch's code, I would hit the maximum recursion of 100 on a larger table. Also the iteration is unimportant.
One thing to note is "SELECT r.AppPermission_ID," on line 13 in the recursion, because I need that child's ID (not the parent's) from #TempAppPermissions to reference back to the user to see if they have that permission. The "Permission_ID" from #TempPermissions is omitted from the select because it is only necessary to get the actual permission name and "Parent_ID" is only used to filter out single instances of dependent permissions.
Thanks again, JoeFletch.
;WITH r AS
(
SELECT p.AppPermission_ID,
p.Parent_ID,
CAST('/' + (SELECT s.Permission
FROM #TempPermissions AS s
WHERE s.Permission_ID = p.Permission_ID)
AS NVARCHAR(MAX)) AS Permission
FROM #TempAppPermissions p
UNION ALL
SELECT r.AppPermission_ID,
p.Parent_ID,
COALESCE(N'/' + (SELECT s.Permission
FROM #TempPermissions AS s
WHERE s.Permission_ID = p.Permission_ID), N'')
+ r.Permission
FROM r
INNER JOIN #TempAppPermissions p
ON p.AppPermission_ID = r.Parent_ID
)
SELECT r.AppPermission_ID, r.Permission
FROM r
WHERE r.Parent_ID = -1
ORDER BY r.AppPermission_ID ASC
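The same result can also be built top-down: start from the rows whose Parent_ID is -1 and append each child's name while walking down, so every row is visited once and no final filter is needed. This is a sketch of that alternative, not the code either poster used; the CTE name paths is my own, and it assumes the question's temp tables still exist (run it before the DROP TABLE line).

```sql
-- Sketch: walk from the root permissions down, extending the path at each level.
;WITH paths AS (
    -- Anchor: top-level permissions (Parent_ID = -1 marks a root in the question's data).
    SELECT a.AppPermission_ID,
           CAST('/' + p.Permission AS NVARCHAR(MAX)) AS PermissionString
    FROM #TempAppPermissions a
    INNER JOIN #TempPermissions p
        ON p.Permission_ID = a.Permission_ID
    WHERE a.Parent_ID = -1
    UNION ALL
    -- Recursive step: attach each child to the path already built for its parent.
    SELECT a.AppPermission_ID,
           paths.PermissionString + '/' + p.Permission
    FROM paths
    INNER JOIN #TempAppPermissions a
        ON a.Parent_ID = paths.AppPermission_ID
    INNER JOIN #TempPermissions p
        ON p.Permission_ID = a.Permission_ID
)
SELECT AppPermission_ID, PermissionString
FROM paths
ORDER BY AppPermission_ID;
-- With the question's six rows this returns /Users through /Users/Permissions/Configure
```

The recursion depth here equals the depth of the hierarchy, so the default limit of 100 is only reached by a hierarchy more than 100 levels deep (or by a cycle in Parent_ID).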

SQL Parent-Child in same Table in Oracle

I need help understanding how to work with the following example.
Let's say I have these two tables:
**Table Name: Children**
Child   Shirt_Color_ID   Parent
Bob     1                Kate
Kate    2                Jack
Jack    3                Jill
.       .                .
.       .                .
**Table Name: Shirt_Colors**
Shirt_Color_ID   Shirt_Color
1                Red
2                Blue
3                White
And I want to return the following table:
Child   Child_Shirt_Color   Parent   Parent_Shirt_Color
Bob     Red                 Kate     Blue
How would I get the Parent_Shirt_Color in?
I got how to show Child, Child_Shirt_Color, Parent:
select
Children.Child,
Shirt_Colors.Shirt_Color,
Children.Parent
from
Children,
Shirt_Colors
where
Shirt_Colors.Shirt_Color_ID = Children.Shirt_Color_ID and
Children.Child = 'Bob';
Other examples I have looked at for this talk about using "WITH", but every time I try it I get an error saying it is unsupported. Also, the chain of parents and children is very long, so I do not want the entire list returned, only 2-3 generations.
Using Oracle
Any help would be appreciated. Thanks!
You need a CTE; CTEs are used for recursive queries.
http://technet.microsoft.com/en-us/library/ms186243(v=sql.105).aspx
Try following code:
DROP TABLE Children
DROP TABLE Shirt_Colors
CREATE TABLE Children(
Child varchar(20),
Shirt_Color_ID int,
Parent varchar(20)
)
CREATE TABLE Shirt_Colors
(
Shirt_Color_ID int,
Shirt_Color varchar(20)
)
INSERT INTO Shirt_Colors (Shirt_Color_ID, Shirt_Color)
VALUES (1, 'Red'),
(2, 'Blue'),
(3, 'White'),
(4, 'Yellow')
INSERT INTO Children (Child, Shirt_Color_ID, Parent)
VALUES ('Bob', 1, 'Kate'),
('Kate', 2, 'Jack'),
('Jack', 3, 'Jill'),
('Jill', 4, NULL)
select * from Children
;
WITH CTE (Child, Shirt_Color, Parent)
AS
(
SELECT
C.Child,
SC.Shirt_Color,
C.Parent
FROM Children C
INNER JOIN Shirt_Colors SC
ON C.Shirt_Color_ID = SC.Shirt_Color_ID
WHERE C.Parent IS NULL
UNION ALL
SELECT
C.Child,
SC.Shirt_Color,
C.Parent
FROM CTE
INNER JOIN Children C
ON CTE.Child = C.Parent
INNER JOIN Shirt_Colors SC
ON C.Shirt_Color_ID = SC.Shirt_Color_ID
)
SELECT
Child,
Shirt_Color,
Parent
FROM CTE
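For the single parent level the question actually asks for, a plain self-join may be enough and avoids WITH altogether. The sketch below is written for Oracle and assumes the Children and Shirt_Colors tables exactly as described in the question; the aliases c, p, csc and psc are my own. The second query is a hedged illustration of Oracle's CONNECT BY syntax for limiting the walk to three generations.

```sql
-- Sketch 1: join Children to itself to reach the parent's row,
-- and join Shirt_Colors twice (once for the child, once for the parent).
SELECT c.Child,
       csc.Shirt_Color AS Child_Shirt_Color,
       c.Parent,
       psc.Shirt_Color AS Parent_Shirt_Color
FROM Children c
INNER JOIN Shirt_Colors csc ON csc.Shirt_Color_ID = c.Shirt_Color_ID
LEFT JOIN Children p ON p.Child = c.Parent
LEFT JOIN Shirt_Colors psc ON psc.Shirt_Color_ID = p.Shirt_Color_ID
WHERE c.Child = 'Bob';
-- With the question's data: Bob, Red, Kate, Blue

-- Sketch 2: walk upward from Bob, stopping after three generations.
SELECT LEVEL AS generation, c.Child, c.Parent
FROM Children c
START WITH c.Child = 'Bob'
CONNECT BY PRIOR c.Parent = c.Child AND LEVEL <= 3;
```

The LEFT JOINs keep a child in the result even when its parent has no row of its own in Children; in that case Parent_Shirt_Color is NULL.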