Recursive query for multiple tables - sql

In a SQL Server table OBJECTS, some objects are derived from another object, and the derivation can be arbitrarily many levels deep. Another table contains ATTRIBUTES for objects, but it lists attributes only for the main (parent) object, not for its derived objects. I am looking for a way to get all the objects that have a specific attribute, whether the attribute is their own or inherited from a parent.
I think a Common Table Expression (recursive query) is the way to go, but I can't work out how to use it.
DDL:
CREATE TABLE OBJECTS
(
[ID] INT,
[PARENTID] INT,
[ObjectName] VARCHAR(32)
);
INSERT INTO OBJECTS ([ID], [PARENTID], [ObjectName])
VALUES
(1, 0, 'Parent1'),
(2, 1, 'Parent2'),
(3, 1, 'Item1'),
(4, 1, 'Item2'),
(5, 2, 'Item3'),
(6, 0, 'Item4'),
(7, 0, 'Item5');
CREATE TABLE ATTRIBUTES
(
[ID] INT,
[AttributeName] VARCHAR(1)
);
INSERT INTO ATTRIBUTES ([ID], [AttributeName])
VALUES
(1, 'A'),
(1, 'B'),
(2, 'C'),
(2, 'D'),
(3, 'F'),
(6, 'C'),
(7, 'A');
Example question: how do I list all objects (both 'native' and derived from parent objects) that have an attribute of 'A'?
Desired output:
ID OBJECTNAME
---------------
1 Parent1
2 Parent2
3 Item1
4 Item2
5 Item3
7 Item5

WITH cte AS (
SELECT
o.id,
o.parentid
FROM objects o
UNION ALL
SELECT
c.id,
o.parentid
FROM objects o
JOIN cte c ON c.parentid = o.id
)
SELECT DISTINCT
a.attributename,
c.id,
o.objectname
FROM cte c
JOIN attributes a ON (a.id = c.id OR a.id = c.parentid)
AND a.attributename = 'A'
LEFT JOIN objects o ON o.id = c.id
ORDER BY 1, 2, 3
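To try the idea outside SQL Server, here is a minimal sketch that runs essentially the same query against an in-memory SQLite database from Python. SQLite is a stand-in, not the asker's engine: it needs the RECURSIVE keyword, and the helper name objects_with_attribute is made up for illustration. The CTE builds (object, ancestor) pairs, and the join keeps an object if either it or one of its ancestors carries the attribute.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE objects (id INT, parentid INT, objectname VARCHAR(32));
INSERT INTO objects VALUES
  (1, 0, 'Parent1'), (2, 1, 'Parent2'), (3, 1, 'Item1'), (4, 1, 'Item2'),
  (5, 2, 'Item3'), (6, 0, 'Item4'), (7, 0, 'Item5');
CREATE TABLE attributes (id INT, attributename VARCHAR(1));
INSERT INTO attributes VALUES
  (1, 'A'), (1, 'B'), (2, 'C'), (2, 'D'), (3, 'F'), (6, 'C'), (7, 'A');
""")

# Anchor: every object with its direct parent.
# Recursive step: keep the object's id, climb one level to the next ancestor.
QUERY = """
WITH RECURSIVE cte AS (
    SELECT o.id, o.parentid FROM objects o
    UNION ALL
    SELECT c.id, o.parentid
    FROM objects o
    JOIN cte c ON c.parentid = o.id
)
SELECT DISTINCT c.id, o.objectname
FROM cte c
JOIN attributes a ON (a.id = c.id OR a.id = c.parentid)
                 AND a.attributename = ?
LEFT JOIN objects o ON o.id = c.id
ORDER BY c.id
"""

def objects_with_attribute(name):
    """Hypothetical helper: objects owning or inheriting the attribute."""
    return conn.execute(QUERY, (name,)).fetchall()

rows = objects_with_attribute("A")
for row in rows:
    print(row)
```

On the sample data this yields IDs 1, 2, 3, 4, 5 and 7, matching the desired output in the question. The recursion stops by itself because no object has ID 0, the value used for "no parent".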

Related

SQL Server recursive self join

I have a simple categories table as with the following columns:
Id
Name
ParentId
So a category can have any number of child categories, nested to any depth. Take for example the following hierarchy:
I want a simple query that returns the category "Business Laptops" along with a column listing all its parents, comma-separated or similar:
Or take the following example:
Recursive cte to the rescue....
Create and populate sample table (Please save us this step in your future questions):
DECLARE @T as table
(
id int,
name varchar(100),
parent_id int
)
INSERT INTO @T VALUES
(1, 'A', NULL),
(2, 'A.1', 1),
(3, 'A.2', 1),
(4, 'A.1.1', 2),
(5, 'B', NULL),
(6, 'B.1', 5),
(7, 'B.1.1', 6),
(8, 'B.2', 5),
(9, 'A.1.1.1', 4),
(10, 'A.1.1.2', 4)
The cte:
;WITH CTE AS
(
SELECT id, name, name as path, parent_id
FROM @T
WHERE parent_id IS NULL
UNION ALL
SELECT t.id, t.name, cast(cte.path +','+ t.name as varchar(100)), t.parent_id
FROM @T t
INNER JOIN CTE ON t.parent_id = CTE.id
)
The query:
SELECT id, name, path
FROM CTE
Results:
id name path
1 A A
5 B B
6 B.1 B,B.1
8 B.2 B,B.2
7 B.1.1 B,B.1,B.1.1
2 A.1 A,A.1
3 A.2 A,A.2
4 A.1.1 A,A.1,A.1.1
9 A.1.1.1 A,A.1,A.1.1,A.1.1.1
10 A.1.1.2 A,A.1,A.1.1,A.1.1.2
See online demo on rextester
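For anyone who wants to run the path-building CTE locally, here is a rough sketch using Python's built-in sqlite3 module. It is an adaptation under stated assumptions, not the original T-SQL: the table is called t instead of a table variable, SQLite requires WITH RECURSIVE, and string concatenation uses || rather than +.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE t (id INT, name VARCHAR(100), parent_id INT);
INSERT INTO t VALUES
  (1, 'A', NULL), (2, 'A.1', 1), (3, 'A.2', 1), (4, 'A.1.1', 2),
  (5, 'B', NULL), (6, 'B.1', 5), (7, 'B.1.1', 6), (8, 'B.2', 5),
  (9, 'A.1.1.1', 4), (10, 'A.1.1.2', 4);
""")

# Anchor: root rows, whose path is just their own name.
# Recursive step: append each child's name to its parent's path.
PATH_QUERY = """
WITH RECURSIVE cte AS (
    SELECT id, name, name AS path FROM t WHERE parent_id IS NULL
    UNION ALL
    SELECT t.id, t.name, cte.path || ',' || t.name
    FROM t
    JOIN cte ON t.parent_id = cte.id
)
SELECT id, path FROM cte ORDER BY id
"""

paths = dict(conn.execute(PATH_QUERY).fetchall())
for node_id, path in paths.items():
    print(node_id, path)
```

The walk goes top-down from the roots, so each row's path is complete by the time the row is produced; no second pass is needed.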

SQL recursive logic

I have a situation where I need to configure existing client data to address a problem where our application was not correctly updating IDs in a table when it should have been.
Here's the scenario. We have a parent table, where rows can be inserted that effectively replace existing rows; the replacement can be recursive. We also have a child table, which has a field that points to the parent table. In existing data, the child table could be pointing at rows that have been replaced, and I need to correct that. I can't simply update each row to the replacing row, however, because that row could have been replaced as well, and I need the latest row to be reflected.
I was trying to find a way to write a CTE that would accomplish this for me, but I'm struggling to find a query that finds what I'm actually looking for. Here's a sample of the tables that I'm working with; the 'ShouldBe' column is what I'd like my update query to end up with, taking into account the recursive replacement of some of the rows.
DECLARE @parent TABLE (SampleID int,
SampleIDReplace int,
GroupID char(1))
INSERT INTO @parent (SampleID, SampleIDReplace, GroupID)
VALUES (1, -1, 'A'), (2, 1, 'A'), (3, -1, 'A'),
(4, -1, 'A'), (5, 4, 'A'), (6, 5, 'A'),
(7, -1, 'B'), (8, 7, 'B'), (9, 8, 'B')
DECLARE @child TABLE (ChildID int, ParentID int)
INSERT INTO @child (ChildID, ParentID)
VALUES (1, 4), (2, 7), (3, 1), (4, 3)
Desired results in child table, after the update script has been applied:
ChildID ParentID ParentID_ShouldBe
1 4 6 (4 replaced by 5, 5 replaced by 6)
2 7 9 (7 replaced by 8, 8 replaced by 9)
3 1 2 (1 replaced by 2)
4 3 3 (unchanged, never replaced)
The following returns what you are looking for:
with cte as (
select sampleid, sampleidreplace, 1 as num
from @parent
where sampleidreplace <> -1
union all
select p.sampleid, cte.sampleidreplace, cte.num+1
from @parent p join
cte
on p.sampleidreplace = cte.sampleId
)
select c.*, coalesce(p.sampleid, c.parentid) as ParentID_ShouldBe
from @child c left outer join
(select ROW_NUMBER() over (partition by sampleidreplace order by num desc) as seqnum, *
from cte
) p
on c.ParentID = p.SampleIDReplace and p.seqnum = 1
The recursive part keeps track of every correspondence (4-->5, 4-->6). The additional number is a "generation" count. We want the last generation for each original ID, which is identified by using the row_number() function ordered by num in decreasing order; hence the p.seqnum = 1.
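The "last generation" idea can be checked with a small sketch in Python and SQLite. This is a variation rather than the query above: it picks the highest generation per original ID with MAX(num) and a join back to the chain, a swapped-in substitute for ROW_NUMBER() so that it also runs on SQLite builds without window functions. The names chain and last_gen are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE parent (SampleID INT, SampleIDReplace INT, GroupID CHAR(1));
INSERT INTO parent VALUES
  (1, -1, 'A'), (2, 1, 'A'), (3, -1, 'A'),
  (4, -1, 'A'), (5, 4, 'A'), (6, 5, 'A'),
  (7, -1, 'B'), (8, 7, 'B'), (9, 8, 'B');
CREATE TABLE child (ChildID INT, ParentID INT);
INSERT INTO child VALUES (1, 4), (2, 7), (3, 1), (4, 3);
""")

# chain: every (replacement, original) pair plus its generation number.
# last_gen: the deepest generation reached for each original ID.
RESOLVE = """
WITH RECURSIVE chain AS (
    SELECT SampleID, SampleIDReplace, 1 AS num
    FROM parent
    WHERE SampleIDReplace <> -1
    UNION ALL
    SELECT p.SampleID, chain.SampleIDReplace, chain.num + 1
    FROM parent p
    JOIN chain ON p.SampleIDReplace = chain.SampleID
),
last_gen AS (
    SELECT SampleIDReplace, MAX(num) AS num
    FROM chain
    GROUP BY SampleIDReplace
)
SELECT c.ChildID, c.ParentID, COALESCE(ch.SampleID, c.ParentID) AS ShouldBe
FROM child c
LEFT JOIN last_gen g ON g.SampleIDReplace = c.ParentID
LEFT JOIN chain ch ON ch.SampleIDReplace = g.SampleIDReplace
                  AND ch.num = g.num
ORDER BY c.ChildID
"""

resolved = conn.execute(RESOLVE).fetchall()
for row in resolved:
    print(row)
```

Rows that were never replaced find no match in last_gen, so COALESCE falls back to the existing ParentID, which is the same role the left outer join plays in the answer.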
Ok, so it took me a while and there are probably better ways to do it, but here is one option.
DECLARE @parent TABLE (SampleID int,
SampleIDReplace int,
GroupID char(1))
INSERT INTO @parent (SampleID, SampleIDReplace, GroupID)
VALUES (1, -1, 'A'), (2, 1, 'A'), (3, -1, 'A'),
(4, -1, 'A'), (5, 4, 'A'), (6, 5, 'A'),
(7, -1, 'B'), (8, 7, 'B'), (9, 8, 'B')
DECLARE @child TABLE (ChildID int, ParentID int)
INSERT INTO @child (ChildID, ParentID)
VALUES (1, 4), (2, 7), (3, 1), (4, 3)
;WITH RecursiveParent1 AS
(
SELECT SampleIDReplace, SampleID, 1 RecursionLevel
FROM @parent
WHERE SampleIDReplace != -1
UNION ALL
SELECT A.SampleIDReplace, B.SampleID, RecursionLevel + 1
FROM RecursiveParent1 A
INNER JOIN @parent B
ON A.SampleId = B.SampleIDReplace
),RecursiveParent2 AS
(
SELECT *,
ROW_NUMBER() OVER(PARTITION BY SampleIdReplace ORDER BY RecursionLevel DESC) RN
FROM RecursiveParent1
)
SELECT A.ChildID, ISNULL(B.ParentID,A.ParentID) ParentID
FROM @child A
LEFT JOIN ( SELECT SampleIDReplace, SampleID ParentID
FROM RecursiveParent2
WHERE RN = 1) B
ON A.ParentID = B.SampleIDReplace
OPTION(MAXRECURSION 500)
I've got an iterative SQL loop that I think sorts this out as follows:
WHILE EXISTS (SELECT * FROM #child C INNER JOIN #parent P ON C.ParentID = P.SampleIDReplace WHERE P.SampleIDReplace > -1)
BEGIN
UPDATE #child
SET ParentID = SampleID
FROM #parent
WHERE #child.ParentID = SampleIDReplace
END
Basically, the while condition compares the contents of the parent ID column in the child table and sees if there is a matching value in the SampleIDReplace column of the parent table. If there is, it goes and gets the SampleID of that record. It only stops when the join results in every SampleIDReplace being -1, meaning we have nothing else to do.
On your sample data, the above results in the expected output.
Note that I had to use temp tables rather than table variables here in order for the table to be accessible within the loop. If you have to use table variables then there would need to be a bit more surgery done.
Clearly if you have deep replacement hierarchies then you'll do quite a few updates, which may be a consideration when looking to perform the query against a production database.
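The loop can be sketched in Python against SQLite to see how many passes it takes. This is an illustration under assumptions, not the T-SQL above: the loop lives in Python, the UPDATE uses a correlated subquery because SQLite's UPDATE ... FROM is not available in older versions, and it presumes each row is replaced by at most one other row and that the data contains no replacement cycles.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE parent (SampleID INT, SampleIDReplace INT, GroupID CHAR(1));
INSERT INTO parent VALUES
  (1, -1, 'A'), (2, 1, 'A'), (3, -1, 'A'),
  (4, -1, 'A'), (5, 4, 'A'), (6, 5, 'A'),
  (7, -1, 'B'), (8, 7, 'B'), (9, 8, 'B');
CREATE TABLE child (ChildID INT, ParentID INT);
INSERT INTO child VALUES (1, 4), (2, 7), (3, 1), (4, 3);
""")

# One pass: move every child that points at a replaced row one step forward.
STEP = """
UPDATE child
SET ParentID = (SELECT p.SampleID
                FROM parent p
                WHERE p.SampleIDReplace = child.ParentID)
WHERE ParentID IN (SELECT SampleIDReplace
                   FROM parent
                   WHERE SampleIDReplace <> -1)
"""

# Repeat until a pass changes nothing; rowcount is the number of updated rows.
passes = 0
while conn.execute(STEP).rowcount > 0:
    passes += 1

final = conn.execute(
    "SELECT ChildID, ParentID FROM child ORDER BY ChildID"
).fetchall()
print(passes, final)
```

On the sample data two passes change rows and a third finds nothing left to do, so the number of passes tracks the depth of the longest replacement chain. A cycle in the replacement data would make this loop run forever, so a pass limit would be a sensible guard on real data.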

PostgreSQL - Calculation count the elements in the hierarchy

There is a table:
CREATE TABLE product_categories (
id INT NOT NULL PRIMARY KEY,
parent INT,
name varchar(50) NOT NULL,
isProduct boolean NOT NULL
);
Is there any way to count the products in each category?
That is:
INSERT INTO product_categories VALUES (1, NULL, 'Main', 'no');
INSERT INTO product_categories VALUES (2, 1, 'Plant', 'no');
INSERT INTO product_categories VALUES (3, 2, 'Cactus', 'yes');
INSERT INTO product_categories VALUES (4, 2, 'Spruce', 'yes');
INSERT INTO product_categories VALUES (5, 2, 'Birch', 'yes');
INSERT INTO product_categories VALUES (6, 2, 'Pine', 'yes');
INSERT INTO product_categories VALUES (7, 1, 'Stock', 'no');
INSERT INTO product_categories VALUES (8, 7, 'Spade', 'yes');
INSERT INTO product_categories VALUES (9, 7, 'Watering can', 'yes');
and I should receive:
Category | Count
Main | 6
Plant | 4
Stock | 2
You need to use a recursive Common Table Expression:
WITH RECURSIVE Parents AS
( SELECT ID, Parent, Name, IsProduct
FROM product_categories
WHERE Parent IS NOT NULL
UNION ALL
SELECT c.ID, p.Parent, c.Name, c.IsProduct
FROM product_categories c
INNER JOIN Parents p
ON p.ID = c.Parent
)
SELECT pc.Name,
COUNT(*) AS Products,
ARRAY_AGG(p.Name) AS ProductList
FROM product_categories pc
INNER JOIN Parents p
ON p.Parent = pc.ID
WHERE p.IsProduct = 'yes'
GROUP BY pc.Name
Working Example
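As a way to check the counts without a PostgreSQL server, the same recursion can be sketched in Python with SQLite. The differences are assumptions of this sketch: isProduct is stored as 0/1 because SQLite has no boolean type, the parent column is nullable so the root row can hold NULL, and ARRAY_AGG is left out because SQLite does not provide it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE product_categories (
    id INT NOT NULL PRIMARY KEY,
    parent INT,
    name VARCHAR(50) NOT NULL,
    isProduct INT NOT NULL
);
INSERT INTO product_categories VALUES
  (1, NULL, 'Main', 0), (2, 1, 'Plant', 0), (3, 2, 'Cactus', 1),
  (4, 2, 'Spruce', 1), (5, 2, 'Birch', 1), (6, 2, 'Pine', 1),
  (7, 1, 'Stock', 0), (8, 7, 'Spade', 1), (9, 7, 'Watering can', 1);
""")

# parents holds one (id, ancestor) row for every ancestor of every item:
# the anchor supplies direct parents, the recursive step climbs one level.
COUNT_QUERY = """
WITH RECURSIVE parents AS (
    SELECT id, parent, isProduct
    FROM product_categories
    WHERE parent IS NOT NULL
    UNION ALL
    SELECT c.id, p.parent, c.isProduct
    FROM product_categories c
    JOIN parents p ON p.id = c.parent
)
SELECT pc.name, COUNT(*) AS products
FROM product_categories pc
JOIN parents p ON p.parent = pc.id
WHERE p.isProduct = 1
GROUP BY pc.id, pc.name
ORDER BY pc.id
"""

counts = dict(conn.execute(COUNT_QUERY).fetchall())
print(counts)
```

Because every product appears once per ancestor, grouping by the ancestor gives the total number of products anywhere beneath each category, not only its direct children.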

Check if matching child records exist before saving parent records

We have a set of child records that we will use to create new parent/child records, but we first need to verify that a parent record containing the same child records doesn't already exist. Here are the details:
We have 3 tables; one is a linking table between the parent and child records.
Table A (parent table)
Id
Name
Desc
Table B (linking table between tables A and C)
Id
TableAId
TableCId
Table C (child table)
Id
StartPosition
EndPosition
Percentage
So with that structure, here is an example of a complete record; the parent table has a one-to-many relation with the child table:
Table A
(1, 'Sample', 'N/A')
Table B
(1, 1, 1)
(2, 1, 2)
(3, 1, 3)
Table C
(1, 1, 3, 0.50)
(2, 4, 5, 0.30)
(3, 6, 9, 0.20)
We then pass in an XML string, which we parse and load into a temp table. The contents of the temp table are those of Table C, without the specific Id.
Before we save any new records, we need to check whether there is an existing Table A record that has the same number of child records and whose child records match the 3 columns in our temp table (no ID match is possible).
Hopefully this is explained well enough. I have done many searches and can't find anything specific to this issue.
What you're looking for is called a relational division. The article "Divided We Stand: The SQL of Relational Division" provides a nice summary of various techniques for using SQL to perform a relational division. For your case, you want the technique listed under "Exact Division":
CREATE TABLE tableA (
Id int PRIMARY KEY,
Name varchar(25),
[Desc] varchar(255)
);
INSERT INTO tableA
(Id, Name, [Desc])
VALUES
(1, 'Sample 1', 'Should match the XML'),
(2, 'Sample 2', 'Partial match (should be excluded)'),
(3, 'Sample 3', 'Has extra matches (should be excluded)');
GO
CREATE TABLE tableB (
Id int PRIMARY KEY,
TableAId int,
TableCId int
);
INSERT INTO tableB
(Id, TableAId, TableCId)
VALUES
(1, 1, 1),
(2, 1, 2),
(3, 1, 3),
(4, 2, 1),
(5, 2, 2),
(6, 3, 1),
(7, 3, 2),
(8, 3, 3),
(9, 3, 4);
GO
CREATE TABLE tableC (
Id int PRIMARY KEY,
StartPosition int,
EndPosition int,
Percentage decimal(3,2)
);
INSERT INTO tableC
(Id, StartPosition, EndPosition, Percentage)
VALUES
(1, 1, 3, 0.50),
(2, 4, 5, 0.30),
(3, 6, 9, 0.20),
(4, 10, 12, 0.10);
GO
-- this represents the temp table holding the XML data
-- we want to match Sample 1
CREATE TABLE xmlData (
StartPosition int,
EndPosition int,
Percentage decimal(3,2)
);
INSERT INTO xmlData
(StartPosition, EndPosition, Percentage)
VALUES
(1, 3, 0.50),
(4, 5, 0.30),
(6, 9, 0.20);
GO
SELECT
b.TableAId
FROM
tableB AS b
INNER JOIN
tableC AS c
ON
b.TableCId = c.Id
LEFT OUTER JOIN
xmlData AS x
ON
c.StartPosition = x.StartPosition AND
c.EndPosition = x.EndPosition AND
c.Percentage = x.Percentage
GROUP BY
b.TableAId
HAVING
COUNT(c.Id) = (SELECT COUNT(*) FROM xmlData) AND
COUNT(x.StartPosition) = (SELECT COUNT(*) FROM xmlData);
GO
DROP TABLE xmlData;
DROP TABLE tableC;
DROP TABLE tableB;
DROP TABLE tableA;
GO
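The exact-division query can be exercised in a few lines of Python with SQLite. This is a sketch under assumptions: tableA is omitted because the query never reads it, and Percentage is stored as a whole-number percent (50 instead of 0.50) to sidestep floating-point comparison in SQLite.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tableB (Id INT PRIMARY KEY, TableAId INT, TableCId INT);
INSERT INTO tableB VALUES
  (1, 1, 1), (2, 1, 2), (3, 1, 3),             -- parent 1: exact match
  (4, 2, 1), (5, 2, 2),                        -- parent 2: partial match
  (6, 3, 1), (7, 3, 2), (8, 3, 3), (9, 3, 4);  -- parent 3: one extra child
CREATE TABLE tableC (Id INT PRIMARY KEY, StartPosition INT,
                     EndPosition INT, Percentage INT);
INSERT INTO tableC VALUES
  (1, 1, 3, 50), (2, 4, 5, 30), (3, 6, 9, 20), (4, 10, 12, 10);
CREATE TABLE xmlData (StartPosition INT, EndPosition INT, Percentage INT);
INSERT INTO xmlData VALUES (1, 3, 50), (4, 5, 30), (6, 9, 20);
""")

# COUNT(c.Id) counts all children of a parent; COUNT(x.StartPosition) counts
# only those that matched an incoming row. Both must equal the incoming total.
MATCH = """
SELECT b.TableAId
FROM tableB b
JOIN tableC c ON b.TableCId = c.Id
LEFT JOIN xmlData x
       ON c.StartPosition = x.StartPosition
      AND c.EndPosition = x.EndPosition
      AND c.Percentage = x.Percentage
GROUP BY b.TableAId
HAVING COUNT(c.Id) = (SELECT COUNT(*) FROM xmlData)
   AND COUNT(x.StartPosition) = (SELECT COUNT(*) FROM xmlData)
"""

matches = [row[0] for row in conn.execute(MATCH)]
print(matches)
```

The first HAVING condition rejects parents with too few or too many children, and the second rejects parents whose children are the right number but the wrong rows. An empty result means no existing parent matches, so the new records can be saved.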

Filtering a bottom up recursive CTE with Sql Server 2005

I'm trying to query a hierarchy of data in a single database table from the bottom up (because of access permissions, I don't want to include parents that don't have a particular type of child). The schema and sample data are as follows:
create table Users(
id int,
name varchar(100));
insert into Users values (1, 'Jill');
create table nodes(
id int,
name varchar(100),
parent int,
nodetype int);
insert into nodes values (1, 'A', 0, 1);
insert into nodes values (2, 'B', 0, 1);
insert into nodes values (3, 'C', 1, 1);
insert into nodes values (4, 'D', 3, 2);
insert into nodes values (5, 'E', 1, 1);
insert into nodes values (6, 'F', 5, 2);
insert into nodes values (7, 'G', 5, 2);
create table nodeAccess(
userid int,
nodeid int,
access int);
insert into nodeAccess values (1, 1, 1);
insert into nodeAccess values (1, 2, 1);
insert into nodeAccess values (1, 3, 1);
insert into nodeAccess values (1, 4, 1);
insert into nodeAccess values (1, 5, 1);
insert into nodeAccess values (1, 6, 0);
insert into nodeAccess values (1, 7, 1);
with Tree(id, name, nodetype, parent)
as
(
select n.id, n.name, n.nodetype, n.parent
from nodes as n
inner join nodeAccess as na on na.nodeid = n.id
where na.access =1 and na.userid=1 and n.nodetype=2
union all
select n.id, n.name, n.nodetype, n.parent
from nodes as n
inner join Tree as t on t.parent = n.id
inner join nodeAccess as na on na.nodeid = n.id
where na.access =1 and na.userid=1 and n.nodetype=1
)
select * from Tree
Yields:
id name nodetype parent
4 D 2 3
7 G 2 5
5 E 1 1
1 A 1 0
3 C 1 1
1 A 1 0
How can I not include the duplicates in the result set? The queries against the real tables have many more nodes at the lowest levels and hence many more duplicates of the parent nodes. The solution needs to work with at least SQL Server 2005.
Thanks in advance!
The simplest (not necessarily the most efficient) solution:
...
)
SELECT DISTINCT id,name,nodetype,parent FROM Tree;
This changes the order from your sample output because the DISTINCT operator is typically implemented with a sort. If there is some intentional ordering there, I cannot detect it, but you can add an ORDER BY if you know the order you want.
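To see the duplicates and the effect of DISTINCT side by side, here is a small sketch in Python with SQLite as a stand-in for SQL Server 2005. SQLite needs WITH RECURSIVE, and the {distinct} placeholder is only a device of this example for running the same query twice.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE nodes (id INT, name VARCHAR(100), parent INT, nodetype INT);
INSERT INTO nodes VALUES
  (1, 'A', 0, 1), (2, 'B', 0, 1), (3, 'C', 1, 1), (4, 'D', 3, 2),
  (5, 'E', 1, 1), (6, 'F', 5, 2), (7, 'G', 5, 2);
CREATE TABLE nodeAccess (userid INT, nodeid INT, access INT);
INSERT INTO nodeAccess VALUES
  (1, 1, 1), (1, 2, 1), (1, 3, 1), (1, 4, 1),
  (1, 5, 1), (1, 6, 0), (1, 7, 1);
""")

# Anchor: accessible leaves (nodetype 2). Recursive step: climb to each
# accessible parent (nodetype 1). A shared ancestor is reached once per path.
TREE = """
WITH RECURSIVE Tree(id, name, nodetype, parent) AS (
    SELECT n.id, n.name, n.nodetype, n.parent
    FROM nodes n
    JOIN nodeAccess na ON na.nodeid = n.id
    WHERE na.access = 1 AND na.userid = 1 AND n.nodetype = 2
    UNION ALL
    SELECT n.id, n.name, n.nodetype, n.parent
    FROM nodes n
    JOIN Tree t ON t.parent = n.id
    JOIN nodeAccess na ON na.nodeid = n.id
    WHERE na.access = 1 AND na.userid = 1 AND n.nodetype = 1
)
SELECT {distinct} id, name, nodetype, parent FROM Tree ORDER BY id
"""

raw = conn.execute(TREE.format(distinct="")).fetchall()
deduped = conn.execute(TREE.format(distinct="DISTINCT")).fetchall()
print(len(raw), len(deduped))
```

Node A appears twice in the raw result because it is reached through both C and E; DISTINCT in the outer SELECT collapses those rows after the recursion has finished, and the explicit ORDER BY makes the output order deterministic.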