Is it possible to craft an ORDER BY clause that satisfies the following criteria for two fields (both of type INT), called child and parent for this example?
parent references child, but can be null.
A parent can have multiple children; a child only one parent.
A child cannot be a parent of itself.
There must exist at least one child without a parent.
Each value of child must appear before it appears in parent in the ordered result set.
I'm having difficulty with point 5.
Sample unordered data:
child parent
------------
1 NULL
3 5
4 2
2 5
5 NULL
Obviously neither ORDER BY child, parent nor ORDER BY parent, child works. In fact, the more I think about it, the less sure I am that it can be done at all. Given the restrictions, obvious cases such as:
child parent
------------
1 2
2 1
aren't allowed because they violate rules 3 and 4 (and obviously 5).
So, is what I am trying to achieve possible, and if so how? Platform is SQL Server 2005.
Update: Desired sort order for the sample data:
child parent
------------
1 NULL
5 NULL
2 5
3 5
4 2
For each row that defines a non-null value in the parent column, that value has already appeared in the child column.
You could use a recursive CTE to find the "depth" of each node, and order by that:
create table node (id int, parent int)
insert into node values (1, null)
insert into node values (2, 5)
insert into node values (3, 5)
insert into node values (4, 2)
insert into node values (5, null)
insert into node values (6, 4);
with node_cte (id, depth) as (
select id, 0 from node where parent is null
union all
select node.id, node_cte.depth + 1
from node join node_cte on node.parent = node_cte.id
)
select node.*
from node
join node_cte on node_cte.id=node.id
order by node_cte.depth asc
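To try the depth-ordering idea without a SQL Server instance, here is a minimal sketch that runs the same CTE through Python's built-in sqlite3 module. SQLite stands in for SQL Server 2005 here, which is an assumption of the sketch; the `RECURSIVE` keyword and the extra `node.id` tie-breaker are additions for that setup, not part of the query above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE node (id INTEGER, parent INTEGER)")
conn.executemany(
    "INSERT INTO node VALUES (?, ?)",
    [(1, None), (3, 5), (4, 2), (2, 5), (5, None), (6, 4)],
)

rows = conn.execute("""
    WITH RECURSIVE node_cte (id, depth) AS (
        SELECT id, 0 FROM node WHERE parent IS NULL        -- roots have depth 0
        UNION ALL
        SELECT node.id, node_cte.depth + 1                 -- children are one level deeper
        FROM node JOIN node_cte ON node.parent = node_cte.id
    )
    SELECT node.id, node.parent
    FROM node JOIN node_cte ON node_cte.id = node.id
    ORDER BY node_cte.depth, node.id                       -- id only breaks ties
""").fetchall()

for child, parent in rows:
    print(child, parent)
```

Because a parent always has a smaller depth than its children, every value shows up in the child column before it is used in the parent column.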
You won't be able to do it with an 'ORDER BY' clause because requirement 5 specifies that the order is sensitive to the data hierarchy. In SQL Server 2005 hierarchy data is usually dealt with using recursive CTEs; maybe someone here will provide appropriate code for this case.
I'm searching for a way to do a recursive delete on a table.
The table has three foreign keys, one on itself and two others, and I want to delete rows depending on the date of the occurrence.
Table1 --> Id1, dateOCC, ParentID
1, 13-12-26, null
2, 13-07-18, null
3, 14-12-31, 1
4, 13-06-26, 1
5, 14-07-23, null
6, 13-07-22, 2
Table2--> ID, stuff
Table3 --> ID, stuff
The IDs of Table2 and Table3 are linked directly to the ID of Table1.
Table1 holds approximately 20,000,000 rows, and the other tables hold approximately the same amount.
Here is one of the queries I tried (it runs inside a cursor that deletes the rows returned):
SELECT EO.ID,
EO.DATEOCC,
EO.PARENTID
FROM TABLE1 EO
WHERE EO.DATEOCC <= TO_DATE ('2013-12-31','YYYY-MM-DD')
AND NOT EXISTS(SELECT 1 FROM TABLE2 WHERE ID = EO.ID)
AND NOT EXISTS( SELECT 1 FROM TABLE3 WHERE ID = EO.ID)
START WITH EO.PARENTID IS NULL
CONNECT BY PRIOR EO.ID = EO.PARENTID;
This query is really, really slow to output the data that I want.
And it seems that it does not return the data that I need to delete.
Edit #1
OK, so here's an example of what I need to do (in this example I assume that Table2 and Table3 have no matching IDs in Table1):
Table1 --> Id1, dateOCC, ParentID
1, 13-12-26, null
2, 13-07-18, null
3, 14-12-31, 1
4, 13-06-26, 1
5, 14-07-23, null
6, 13-07-22, 2
After the delete sequence, the table has to look like this if the cutoff date is 13-12-31:
Table1 --> Id1, dateOCC, ParentID
1, 13-12-26, null
3, 14-12-31, 1
5, 14-07-23, null
So as you can see, I delete each child that I can delete, together with its parent if possible. If I can't delete the parent because another child exists that I can't delete, I don't delete the parent (I delete only the children that I can).
In a hierarchical query, the WHERE clause is applied after the START WITH and CONNECT BY are used to build the hierarchy. But syntactically it comes first, which makes it intuitively seem that it will be applied first.
If what you really want is to apply the WHERE clause first, then build the hierarchy, you can use a subquery like this:
SELECT EO.ID,
EO.DATEOCC,
EO.PARENTID
FROM (
SELECT * FROM TABLE1 EO
WHERE EO.DATEOCC <= TO_DATE ('2013-12-31','YYYY-MM-DD')
AND NOT EXISTS(SELECT 1 FROM TABLE2 WHERE ID = EO.ID)
AND NOT EXISTS( SELECT 1 FROM TABLE3 WHERE ID = EO.ID)
) EO
START WITH EO.PARENTID IS NULL
CONNECT BY PRIOR EO.ID = EO.PARENTID;
But it is not clear whether that is what you want. This would give you the top-level parents within the desired date range, and without children in the other tables, then build the entire hierarchy for those parents. It's possible that lower nodes in the hierarchy would have children in the other tables, which would cause the delete to fail.
If that's not what you want, I think you need to describe your requirements more clearly.
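One possible reading of Edit #1 is: a row has to stay if it is newer than the cutoff or is referenced from Table2/Table3, every ancestor of such a row has to stay too, and everything else can go. The sketch below tries that reading on the sample data. It uses SQLite through Python's sqlite3 module rather than Oracle, a recursive CTE instead of CONNECT BY, and ISO-formatted date strings; all of these are assumptions of the sketch, not the original setup.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id INTEGER PRIMARY KEY, dateocc TEXT, parentid INTEGER);
    CREATE TABLE table2 (id INTEGER);   -- left empty, as assumed in Edit #1
    CREATE TABLE table3 (id INTEGER);   -- left empty, as assumed in Edit #1
""")
conn.executemany(
    "INSERT INTO table1 VALUES (?, ?, ?)",
    [(1, "2013-12-26", None), (2, "2013-07-18", None), (3, "2014-12-31", 1),
     (4, "2013-06-26", 1), (5, "2014-07-23", None), (6, "2013-07-22", 2)],
)

conn.execute("""
    WITH RECURSIVE keep (id, parentid) AS (
        -- rows that must stay: newer than the cutoff, or referenced elsewhere
        SELECT t.id, t.parentid
        FROM table1 AS t
        WHERE t.dateocc > :cutoff
           OR EXISTS (SELECT 1 FROM table2 WHERE table2.id = t.id)
           OR EXISTS (SELECT 1 FROM table3 WHERE table3.id = t.id)
        UNION
        -- every ancestor of a kept row must stay as well
        SELECT p.id, p.parentid
        FROM table1 AS p JOIN keep AS k ON p.id = k.parentid
    )
    DELETE FROM table1 WHERE id NOT IN (SELECT id FROM keep)
""", {"cutoff": "2013-12-31"})

remaining = [row[0] for row in conn.execute("SELECT id FROM table1 ORDER BY id")]
print(remaining)
```

On the sample data this leaves rows 1, 3 and 5, which matches the table shown in Edit #1. Whether a single set-based statement like this is fast enough on 20 million rows would have to be measured.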
Background: sqlite should be used to store information that can be queried using SNMP. SNMP organizes the information in a hierarchical structure of OID and supports 3 types of queries:
value for a single OID
OID and value that is lexicographically next to a given OID.
All OID and values of a subtree
I started with the following table:
PRAGMA foreign_keys = ON;
CREATE TABLE data (
oid TEXT NOT NULL PRIMARY KEY,
parent TEXT REFERENCES data(oid) ON DELETE CASCADE CHECK(oid = '1' OR parent IS NOT NULL),
leaf_id INTEGER NOT NULL,
value BLOB DEFAULT NULL,
CHECK(parent IS NULL OR oid = parent || '.' || leaf_id)
);
INSERT INTO data VALUES ('1', NULL, 1, NULL);
INSERT INTO data VALUES ('1.5', '1', 5, NULL);
INSERT INTO data VALUES ('1.5.2', '1.5', 2, 'foo');
INSERT INTO data VALUES ('1.5.11', '1.5', 11, 'foo');
INSERT INTO data VALUES ('1.5.1', '1.5', 1, 'foo');
INSERT INTO data VALUES ('1.3', '1', 3, NULL);
INSERT INTO data VALUES ('1.3.4', '1.3', 4, 'foo');
INSERT INTO data VALUES ('1.3.7', '1.3', 7, 'foo');
INSERT INTO data VALUES ('1.3.5', '1.3', 5, 'foo');
INSERT INTO data VALUES ('1.3.6', '1.3', 6, 'foo');
The idea of storing the last part of the OID as an INT is to be able to order all the items with the same parent lexicographically.
The first query type is trivial to write. However, due to my limited experience with SQL, I struggled in writing queries for the second and third cases.
I think it should be possible with the right WITH RECURSIVE ... SELECT construct. So far I could not find a way to combine the 3 situations (first child, next sibling, next of the parent) and correct sorting did not work either. An additional complexity is that all OIDs with a value of NULL must be ignored.
I would really appreciate if someone could provide the two queries or assist me in writing them.
If the queries would get too complex or impossible to write, another idea would be to add another column 'next' with a 'pointer' to the next item and fill the next values with a trigger.
I don't like to use nested sets however - too complex and slow for inserts/deletes.
Retrieving a subtree is trivial:
WITH RECURSIVE subtree(oid, value, depth, leaf_id) AS (
SELECT oid,
value,
0 AS depth,
leaf_id
FROM data
WHERE oid = '1' -- start of subtree
UNION ALL
SELECT child.oid,
child.value,
parent.depth + 1,
child.leaf_id
FROM data AS child
JOIN subtree AS parent ON child.parent = parent.oid
ORDER BY depth DESC, leaf_id ASC
)
SELECT oid, value
FROM subtree
WHERE value IS NOT NULL
The depth and leaf_id values are needed only for sorting the results lexicographically.
You should have an index on the parent column.
As for the lexicographical next item, consider first the following CTE, which goes simply up through the tree, but remembers the leaf value of the last level below:
WITH RECURSIVE parents(oid, parent, previous_leaf, step, leaf_id) AS (
SELECT oid,
parent,
-1,
0 AS step,
leaf_id
FROM data
WHERE oid = '1.3.4' -- start point
UNION ALL
SELECT parent.oid,
parent.parent,
child.leaf_id,
child.step + 1,
parent.leaf_id
FROM data AS parent
JOIN parents AS child ON parent.oid = child.parent
ORDER BY step
)
SELECT oid, previous_leaf FROM parents
oid previous_leaf
---------- -------------
1.3.4 -1 (1.)
1.3 4 (2.)
1 3 (3.)
What happens when, for each result row, we search the subtree below the oid value, with the additional restriction that the top-level leaf in that subtree must be larger than the previous_leaf value?
1. We search for children of 1.3.4. (previous_leaf has no effect.)
2. We search for larger siblings of 1.3.4, e.g., 1.3.5.
3. We search for larger siblings of 1.3, e.g., 1.5.
So now we just have to do this subtree search:
WITH RECURSIVE parents(oid, parent, previous_leaf, step, leaf_id) AS (
... see above ...
),
subtree(oid, value, depth, leaf_id, previous_leaf, step) AS (
SELECT oid,
NULL, -- interesting items are only *below* top of subtree
0 AS depth,
leaf_id,
previous_leaf,
step
FROM parents
UNION ALL
SELECT child.oid,
child.value,
parent.depth + 1,
child.leaf_id,
-1, -- previous_leaf mattered only at the top
parent.step
FROM data AS child
JOIN subtree AS parent ON child.parent = parent.oid
WHERE child.leaf_id > parent.previous_leaf
ORDER BY step, depth DESC, leaf_id
)
SELECT oid, value
FROM subtree
WHERE value IS NOT NULL
LIMIT 1 -- only the first item
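To check the combined query end to end, here is a minimal sketch that runs it against the question's sample rows with Python's sqlite3 module. The table is simplified (no constraints), and the `next_oid` and `walk` helpers are hypothetical names introduced only for this example. Like the query above, it relies on SQLite returning the CTE rows in the order they were generated.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE data (oid TEXT PRIMARY KEY, parent TEXT, leaf_id INTEGER, value BLOB)")
conn.executemany("INSERT INTO data VALUES (?, ?, ?, ?)", [
    ("1", None, 1, None), ("1.5", "1", 5, None), ("1.5.2", "1.5", 2, "foo"),
    ("1.5.11", "1.5", 11, "foo"), ("1.5.1", "1.5", 1, "foo"), ("1.3", "1", 3, None),
    ("1.3.4", "1.3", 4, "foo"), ("1.3.7", "1.3", 7, "foo"),
    ("1.3.5", "1.3", 5, "foo"), ("1.3.6", "1.3", 6, "foo"),
])

NEXT_SQL = """
WITH RECURSIVE parents(oid, parent, previous_leaf, step, leaf_id) AS (
    SELECT oid, parent, -1, 0 AS step, leaf_id
    FROM data WHERE oid = :start
    UNION ALL
    SELECT parent.oid, parent.parent, child.leaf_id, child.step + 1, parent.leaf_id
    FROM data AS parent JOIN parents AS child ON parent.oid = child.parent
    ORDER BY step
),
subtree(oid, value, depth, leaf_id, previous_leaf, step) AS (
    SELECT oid, NULL, 0 AS depth, leaf_id, previous_leaf, step
    FROM parents
    UNION ALL
    SELECT child.oid, child.value, parent.depth + 1, child.leaf_id, -1, parent.step
    FROM data AS child JOIN subtree AS parent ON child.parent = parent.oid
    WHERE child.leaf_id > parent.previous_leaf
    ORDER BY step, depth DESC, leaf_id
)
SELECT oid, value FROM subtree WHERE value IS NOT NULL LIMIT 1
"""

def next_oid(start):
    """Return (oid, value) of the next item after `start`, or None at the end."""
    return conn.execute(NEXT_SQL, {"start": start}).fetchone()

def walk(start):
    """Repeat next_oid until the tree is exhausted, similar to an SNMP walk."""
    found = []
    row = next_oid(start)
    while row is not None:
        found.append(row[0])
        row = next_oid(row[0])
    return found

print(next_oid("1.3.4"))
print(walk("1"))
```

Note that the start OID has to exist in the table; for an OID that is not stored, the parents CTE is empty and nothing is returned.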
I have the following data structure already in the system.
ItemDetails:
ID Name
--------
1 XXX
2 YYY
3 ZZZ
4 TTT
5 UUU
6 WWW
And the hierarchies are in separate table (with many to many relationships)
ItemHierarchy:
ParentCode ChildCode
--------------------
1 2
1 3
3 4
4 5
5 3
5 6
As you can see, 3 is a child node of both 1 and 5. I want to traverse the records starting, for example, from node 3.
I need to write a stored procedure that gets all the ancestors of 3 and all the child nodes of 3.
Could you please let me know whether it is possible to pull this data? If so, which data structure is suitable for it?
Please note that my table contains 1 million records, and 40% of them belong to multiple hierarchies.
I tried a CTE with a level column, incrementing it based on the hierarchy, but I get the maximum recursion error when traversing from the root to the leaf-level nodes. I have also tried HierarchyID, but I am unable to get all the details when a node has multiple parents.
Update: I can set the recursion limit to the maximum and run the query, but since the table has millions of rows, I'm unable to get any output at all.
I want to create a data structure that is capable of giving information from top to bottom or from bottom to top (at any node level).
Could someone please help me with that?
Using an RDBMS for hierarchical data structures is not recommended; that is why graph databases were created.
That said, the Closure Table pattern will help you.
The Closure Table solution is a simple and elegant way of storing hierarchies. It involves storing all paths through the tree, not just those with a direct parent-child relationship.
The key point in using the pattern is how you fill the ItemHierarchy table.
Store one row in this table for each pair of nodes in the tree that shares an ancestor/descendant relationship, even if they are separated by multiple levels in the tree. Also add a row for each node to reference itself.
Suppose we have a simple graph like the one below:
The dotted arrows show the rows in the ItemHierarchy table:
To retrieve descendants of #3:
SELECT ID.*
FROM ItemDetails AS ID
JOIN ItemHierarchy AS IH ON ID.ID = IH.ChildCode
WHERE IH.ParentCode = 3;
To retrieve ancestors of #3:
SELECT ID.*
FROM ItemDetails AS ID
JOIN ItemHierarchy AS IH ON ID.ID = IH.ParentCode
WHERE IH.ChildCode = 3;
To insert a new leaf node, for instance a new child of #5, first
insert the self-referencing row. Then add a copy of the set of rows in
ItemHierarchy that reference item #5 as a descendant (including the row
in which #5 references itself), replacing the descendant with
the number of the new item:
INSERT INTO ItemHierarchy (parentCode, childCode)
SELECT IH.parentCode, 8
FROM ItemHierarchy AS IH
WHERE IH.childCode = 5
UNION ALL
SELECT 8, 8;
To delete a complete sub-tree, for instance #4 and its descendants, delete all rows in ItemHierarchy that reference #4 as a
descendant, as well as all rows that reference any of #4’s
descendants as descendants:
DELETE FROM ItemHierarchy
WHERE childCode IN (SELECT childCode
FROM ItemHierarchy
WHERE parentCode = 4);
UPDATE
Since the sample data you have shown us leads to recursive loops (not hierarchies) like:
1 -> 3 -> 4 -> 5 -> 3 -> 4 -> 5
the following Path Enumeration pattern will help you.
A UNIX path like /usr/local/lib/ is a path enumeration of the file system,
where usr is the parent of local, which in turn is the parent of lib.
You can create a Table or View from ItemHierarchy table, calling it EnumPath:
Table EnumPath(NodeCode, Path) For the sample data we will have:
To find ancestors of node #4:
select distinct E1.NodeCode from EnumPath E1
inner join EnumPath E2
On E2.path like E1.path || '%'
where E2.NodeCode = 4 and E1.NodeCode != 4;
To find descendants of node #4:
select distinct E1.NodeCode from EnumPath E1
inner join EnumPath E2
On E1.path like E2.path || '%'
where E2.NodeCode = 4 and E1.NodeCode != 4;
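The contents of EnumPath did not survive above, so the following sketch invents its own rows to show how the two prefix queries behave. The '/'-terminated path format and the acyclic sample paths are assumptions of this example, and it runs on SQLite through Python's sqlite3 module.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE EnumPath (NodeCode INTEGER, Path TEXT)")
# Assumed format: ancestors from the root down to the node, each followed by '/'.
conn.executemany("INSERT INTO EnumPath VALUES (?, ?)", [
    (1, "1/"), (2, "1/2/"), (3, "1/3/"),
    (4, "1/3/4/"), (5, "1/3/4/5/"), (6, "1/3/4/5/6/"),
])

def ancestors(code):
    # E1 is an ancestor of E2 when E1's path is a prefix of E2's path.
    sql = """
        SELECT DISTINCT E1.NodeCode
        FROM EnumPath AS E1
        JOIN EnumPath AS E2 ON E2.Path LIKE E1.Path || '%'
        WHERE E2.NodeCode = :code AND E1.NodeCode != :code
        ORDER BY E1.NodeCode
    """
    return [row[0] for row in conn.execute(sql, {"code": code})]

def descendants(code):
    # E1 is a descendant of E2 when E1's path starts with E2's path.
    sql = """
        SELECT DISTINCT E1.NodeCode
        FROM EnumPath AS E1
        JOIN EnumPath AS E2 ON E1.Path LIKE E2.Path || '%'
        WHERE E2.NodeCode = :code AND E1.NodeCode != :code
        ORDER BY E1.NodeCode
    """
    return [row[0] for row in conn.execute(sql, {"code": code})]

print(ancestors(4), descendants(4))
```

The trailing delimiter matters in this format: without it, the path of node 1 would also be a prefix of the path of a node such as 11.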
Sqlfiddle demo
Using PostgreSQL, and given the following sample table, how do I select all parents that have at least a child 10 and a child 20?
parent | child
--------+-------
1 | 10
1 | 20
1 | 30
2 | 10
2 | 20
3 | 10
In other words, this is the expected result:
parent
--------
1
2
In general, how do I select all parents that have at least all of the given children x1, x2, ..., xn? What is the most efficient way to do this?
Thanks!
SELECT parent FROM table1 WHERE child IN (10, 20)
GROUP BY parent
HAVING COUNT(DISTINCT child)>=2
Fiddle
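For the general case with children x1, x2, ..., xn, the same GROUP BY / HAVING idea can be parameterised: filter on the wanted children and require that the number of distinct matches equals the number of wanted values. Below is a minimal sketch using SQLite through Python's sqlite3 module instead of PostgreSQL; the table name `family` and the helper `parents_with_all` are hypothetical names for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE family (parent INTEGER, child INTEGER)")  # hypothetical name
conn.executemany("INSERT INTO family VALUES (?, ?)",
                 [(1, 10), (1, 20), (1, 30), (2, 10), (2, 20), (3, 10)])

def parents_with_all(children):
    """Return the parents that have every one of the given children."""
    wanted = sorted(set(children))                 # duplicates would skew the count
    marks = ", ".join("?" for _ in wanted)         # one placeholder per wanted child
    sql = (f"SELECT parent FROM family WHERE child IN ({marks}) "
           "GROUP BY parent HAVING COUNT(DISTINCT child) = ? ORDER BY parent")
    return [row[0] for row in conn.execute(sql, (*wanted, len(wanted)))]

print(parents_with_all([10, 20]))
print(parents_with_all([10, 20, 30]))
```

An index on (child, parent) would probably help this kind of query, but that would need to be checked against the real data.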
It's not completely clear what you're asking. However, I shall give it a crack.
If you're going to manually define the children you can do a simple select statement:
SELECT DISTINCT parent
FROM table1
WHERE child IN ('10', '20')
This would select all parents that have 10 or 20 as their child. To add more, just add the number to the IN() part.
If however you want to do this for a large number of children or perhaps an unknown number of children then you can create a temp table to store the children search values and join it to your main table. Something like:
CREATE TABLE #SearchChildren
(
Child int
)
Then insert your search values into #SearchChildren. I'd need to know more about what you're doing to write this bit.
SELECT DISTINCT a.parent
FROM table1 as a
JOIN #SearchChildren as s
ON a.child = s.Child
Without knowing more about what you're trying to do, it's difficult to give a full answer, but hopefully this helps.
Take this table:
id name sub_id
---------------------------
1 A (null)
2 B (null)
3 A2 1
4 A3 1
The sub_id column is a relation to its own table, to the id column.
subid --- 0:1 --- id
Now my problem is writing a correct SELECT query that shows the child rows (those whose sub_id is not null) directly under their parent row. So this would be the correct order:
1 A (null)
3 A2 1
4 A3 1
2 B (null)
A normal SELECT orders by id. But how, or with which keyword, can I order this correctly?
I don't think a JOIN is possible, because I want to get all the rows separately: the rows will be displayed in a GridView (ASP.NET) with an EntityDataSource, and the child rows must be displayed directly under their parent.
Thank you.
Look at Managing Hierarchical Data in MySQL.
Since recursion is an expensive operation, because basically you're firing multiple queries at your database, you could consider using the Nested Set Model. In short, you're assigning number ranges to the rows in your table. It's a long article, but it's worth reading. I used it during my internship to bring 1000+ queries down to 1 query.
The overhead now moves to modifying the table: when you add, update, or delete records, you have to update all the records with a bigger 'right' value. But when you're retrieving the data, it all happens in 1 query :)
select * from table1 order by name, sub_id will return your desired result in this case, but only because the parents' names and the children's names are similar. If you're using SQL Server 2005, a recursive CTE will work:
WITH recurse (id, Name, sub_id, Depth)
AS
(
-- top-level rows: use their own id as the grouping key
SELECT id, Name, ISNULL(sub_id, id), 0 AS Depth
FROM table1 WHERE sub_id IS NULL
UNION ALL
-- child rows: keep the parent's id as the grouping key
SELECT table1.id, table1.Name, table1.sub_id, recurse.Depth + 1 AS Depth FROM table1
JOIN recurse ON table1.sub_id = recurse.id
)
SELECT * FROM recurse ORDER BY sub_id, Depth
SELECT
*
FROM
table
ORDER BY
COALESCE(sub_id, id), id
By the way, this works only for one level; anything deeper requires a recursive query (CTE).
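For the one-level case, the idea is to sort every row by its parent's id (or by its own id when it has no parent) and then by its own id. A minimal sketch on the question's sample rows, using SQLite through Python's sqlite3 module; the table name `table1` is taken from the other answer and is otherwise an assumption:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (id INTEGER PRIMARY KEY, name TEXT, sub_id INTEGER)")
conn.executemany("INSERT INTO table1 VALUES (?, ?, ?)",
                 [(1, "A", None), (2, "B", None), (3, "A2", 1), (4, "A3", 1)])

ordered = conn.execute("""
    SELECT id, name, sub_id
    FROM table1
    ORDER BY COALESCE(sub_id, id),   -- group key: parent id, or own id for parents
             id                      -- the parent has the smallest id in its group
""").fetchall()

for row in ordered:
    print(row)
# (1, 'A', None)
# (3, 'A2', 1)
# (4, 'A3', 1)
# (2, 'B', None)
```

This relies on a parent always having a smaller id than its children; if that is not guaranteed, sorting on `sub_id IS NOT NULL` before `id` inside each group would put the parent first.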