Oracle Recursive Query Connect By Loop in data - sql

I have a table that looks essentially like this (the first row pk1=1 is the parent row)
pk1  event_id  parent_event_id
1    123       123
2    456       123
3    789       456
Given any particular row in the above table, I need a query that returns all the related rows (up and down the hierarchy). I was trying to do this via an initial CTE that grabs all the parent rows, then using that as my base table and joining back to the above table with a recursive query to navigate down (this seems wildly inefficient, and I assume there is a better way).
However, even the first step (populating my CTE) fails: a query like the one below, which navigates up, raises the CONNECT BY loop error (ORA-01436).
select event_id, level
from myTable
start with pk1 = 2
connect by prior parent_event_id = event_id
I assume this is because the parent row is self-referencing (event_id = parent_event_id)? If I add the NOCYCLE keyword, the recursion stops at the row just before the actual parent.
Two questions:
1.) Is there a better way to do this in one query?
2.) Any clue how to tweak the above to get the parent row returned?
Thanks

I'm not super clear on what you mean by "all the related rows (up and down the tree)", but it might be possible.
Here, I'm adding more logic to the connect clause to go up OR down the tree. This includes direct parents and descendants, but also includes siblings/cousins to the starting node. That might or might not be what you want.
with mytable as (select 1 as pk1, 123 as event_id, 123 as parent_event_id from dual
union select 2, 456, 123 from dual
union select 3, 789, 456 from dual
union select 4, 837, 123 from dual)
select pk1, event_id, level, SYS_CONNECT_BY_PATH(event_id, '/') as path
from myTable
start with pk1 = 2
connect by nocycle (prior parent_event_id = event_id and prior event_id <> event_id)
or (prior event_id = parent_event_id)
The tweak that gets the root parent to show up is just the extra condition "and prior event_id <> event_id", i.e. don't go further up the tree if the parent node is the same as the current node.
I added an example row (pk1=4) to show a sibling row (not direct parent or descendant) being returned.
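If you only want direct ancestors and descendants (no siblings), one option is to walk up and walk down separately and union the results. Below is a minimal sketch of that idea, not a tested Oracle solution: it uses Python's sqlite3 module and a recursive CTE as a stand-in for CONNECT BY, and the function name related_rows is made up for illustration. The guard on self-referencing rows plays the same role as the "prior event_id <> event_id" tweak. Oracle 11gR2 and later accept a similar recursive WITH (without the RECURSIVE keyword).

```python
import sqlite3

# Sample rows from the question plus the sibling row (pk1 = 4) from the answer.
ROWS = [(1, 123, 123), (2, 456, 123), (3, 789, 456), (4, 837, 123)]

QUERY = """
WITH RECURSIVE
ancestors(pk1, event_id, parent_event_id) AS (
    SELECT pk1, event_id, parent_event_id FROM mytable WHERE pk1 = :start
    UNION ALL
    SELECT p.pk1, p.event_id, p.parent_event_id
    FROM mytable p
    JOIN ancestors a ON p.event_id = a.parent_event_id
    -- do not climb any further from a self-referencing (root) row
    WHERE a.parent_event_id <> a.event_id
),
descendants(pk1, event_id, parent_event_id) AS (
    SELECT pk1, event_id, parent_event_id FROM mytable WHERE pk1 = :start
    UNION ALL
    SELECT c.pk1, c.event_id, c.parent_event_id
    FROM mytable c
    JOIN descendants d ON c.parent_event_id = d.event_id
    -- skip the root row, which lists itself as its own parent
    WHERE c.event_id <> c.parent_event_id
)
SELECT pk1 FROM ancestors
UNION
SELECT pk1 FROM descendants
ORDER BY pk1
"""


def related_rows(start_pk):
    """Return the pk1 values of the start row, its ancestors and its descendants."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE mytable (pk1 INTEGER, event_id INTEGER, parent_event_id INTEGER)"
    )
    conn.executemany("INSERT INTO mytable VALUES (?, ?, ?)", ROWS)
    result = [row[0] for row in conn.execute(QUERY, {"start": start_pk})]
    conn.close()
    return result


print(related_rows(2))
```

Starting from pk1 = 2, the sibling row pk1 = 4 is not returned because it is neither above nor below the start row.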

Related

Simple recursive query in Oracle

I'm currently having some trouble understanding and writing recursive queries. I understand that recursive queries are used to search through hierarchies of information, but I haven't found a simple solution online that can travel up a hierarchy. For example, let's say that I have a relation that models a family tree:
create table family_tree (
child varchar(10),
parent varchar(10)
);
If I wanted to write a recursive query that travelled up this family tree, collecting all parents until origin, how should I go about this?
Thanks in advance.
You can use connect by clause.
In your case, SQL might look like:
select child, parent, level
from family_tree
connect by prior parent = child
If I wanted to write a recursive query that travelled up this family tree, collecting all parents until origin, how should I go about this?
Use a hierarchical query and the SYS_CONNECT_BY_PATH( column_name, delimiter ) function:
Oracle 18 Setup:
create table family_tree (
child varchar(10),
parent varchar(10)
);
INSERT INTO family_tree ( child, parent )
SELECT 'B', 'A' FROM DUAL UNION ALL
SELECT 'C', 'B' FROM DUAL UNION ALL
SELECT 'D', 'C' FROM DUAL UNION ALL
SELECT 'E', 'D' FROM DUAL UNION ALL
SELECT 'F', 'C' FROM DUAL;
Query 1:
SELECT SYS_CONNECT_BY_PATH( parent, ' -> ' ) || ' -> ' || child AS path
FROM family_tree
START WITH parent = 'A'
CONNECT BY PRIOR child = parent;
Results:
PATH
-------------------------
-> A -> B
-> A -> B -> C
-> A -> B -> C -> D
-> A -> B -> C -> D -> E
-> A -> B -> C -> F
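Query 1 walks down from the root. To go the other way (up from a given child to the origin, as the question asks), the join direction is reversed. Here is a rough sketch using the same sample rows; it runs on SQLite through Python's sqlite3 module with a recursive CTE instead of CONNECT BY, and the helper name ancestors_of is only for illustration.

```python
import sqlite3

# Same (child, parent) pairs as in the setup above.
PAIRS = [("B", "A"), ("C", "B"), ("D", "C"), ("E", "D"), ("F", "C")]

QUERY = """
WITH RECURSIVE up(person, depth) AS (
    -- anchor: the direct parent of the starting child
    SELECT parent, 1 FROM family_tree WHERE child = :start
    UNION ALL
    -- recursive step: the parent of the person found in the previous step
    SELECT f.parent, up.depth + 1
    FROM family_tree f
    JOIN up ON f.child = up.person
)
SELECT person FROM up ORDER BY depth
"""


def ancestors_of(start):
    """Return the ancestors of `start`, nearest first."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE family_tree (child TEXT, parent TEXT)")
    conn.executemany("INSERT INTO family_tree VALUES (?, ?)", PAIRS)
    result = [row[0] for row in conn.execute(QUERY, {"start": start})]
    conn.close()
    return result


print(ancestors_of("E"))
```

The walk stops on its own at 'A' because no row lists 'A' as a child.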
There is an ANSI syntax that I'm not really familiar with and there is an Oracle syntax that I usually use. The Oracle syntax uses a CONNECT BY ... PRIOR clause to build the tree and a START WITH clause that tells the database where to start walking the tree. It will look like this:
SELECT child, parent, level
FROM family_tree
CONNECT BY ...
START WITH ...
The START WITH clause is easier. You're looking "up" the tree, so you'd pick a child where you want to start walking the tree. So this would look like START WITH parent = 'John'. This is our level 1 row. I'm assuming John's row will have him as the parent and no children, since it's the bottom of the tree.
Now, think about how rows in the tree relate to each other. If we're looking at a level 2 row, how do we know whether it is the correct row to follow the "John" row? In this case, it will have John in the child column. So we want a clause of: CONNECT BY PRIOR parent = child. That means "the prior row's parent equals this row's child".
So the query looks like:
SELECT child, parent, level
FROM family_tree
CONNECT BY PRIOR parent = child
START WITH parent = 'John'
SQL Fiddle example
(This is a bit of a strange example since actual children have two parents, but that would make it more complicated.)
Are you familiar with the SCOTT.EMP table? It's in the "standard" SCOTT schema (which, unfortunately, is no longer pre-packaged with every copy of Oracle database, since version 12.1 or so). Check your database: you may find it there. Or ask your DBA about it.
Anyway: the table shows the 14 employees of a small business, and it includes the employee's ID as well as his or her manager's employee ID. So, suppose you start with a given employee and you want to find his or her highest-level boss. (Similar to your test problem.) In this particular hierarchy, the highest-level "ancestor" is unique, but that is irrelevant; the recursive query would work the same way if each department had a "head of department" and there was no CEO above the heads of department.
In this arrangement, it's easy to identify the "boss of all bosses" - he does not have a boss. In his row, the manager ID is null. This is a very common arrangement for the "root" (or "roots") of tree-like hierarchies.
Here is how you would find the boss, starting with a specific employee id, and using a recursive query - which is what I understand is what you are looking to practice on. (That is: if I understand correctly, you are not interested in solving the problem "by any means"; rather, you want to see how recursive queries work, in a small example so you can understand EVERYTHING that goes on.)
with
r ( empno, mgr ) as (
select empno, mgr -- ANCHOR leg of recursive query
from scott.emp
where empno = 7499
union all
select e.empno, e.mgr -- RECURSIVE leg of recursive query
from scott.emp e inner join r on e.empno = r.mgr
)
select empno
from r
where mgr is null
;
I will not try to guess where you may have difficulty understanding this example. Instead, I will wait for you to ask.

Performance issues in SQL query with a hierarchical relationship

I have an Oracle table that represents parent-child relationships, and I want to improve the performance of a query that searches the hierarchy for an ancestor record. I'm testing with the small data set here, though the real table is much larger:
id  name   parent_id  tagged
==  =====  =========  ======
1   One    null       null
2   Two    1          1
3   Three  2          null
4   Four   3          null
5   Five   null       null
6   Six    5          1
7   Seven  6          null
8   Eight  null       null
9   Nine   8          null
parent_id refers back to id in this same table in a foreign key relationship.
I want to write a query that returns each leaf record (those records that have no descendants... id 4 and id 7 in this example) which has an ancestor record that has tagged = 1 (walking back through the parent_id relationship).
So, for the above source data, I want my query to return:
id  name   tagged_ancestor_id
==  =====  ==================
4   Four   2
7   Seven  6
My current query to retrieve these records is:
select * from (
select id,
name,
connect_by_root id tagged_ancestor_id
from mytree
connect by prior id = parent_id
start with tagged is not null
) m1
where not exists (
select * from mytree m2 where m2.parent_id = m1.id
)
This query works fine on this simple little example table, but its performance is terrible on my real table which has about 11,000,000 records. The query takes over a minute to run.
There are indexes on both fields in the connect by clause.
The "tagged" field in the start with clause also has an index on it, and there are about 1,500,000 records in my table with non-null values in this field.
The where clause doesn't seem to be the problem, because when I modify it to return a specific name (also indexed) with where name = 'somename' instead of where not exists ..., the query still takes about the same amount of time.
So, what are some strategies I can use to make these types of queries on this hierarchy run faster?
Here is what I would check first:
Make sure your table has a primary key.
Make sure the statistics on the table are current. Use DBMS_STATS.GATHER_TABLE_STATS to collect the statistics. See this URL: (for ORACLE version 11.1):
http://docs.oracle.com/cd/B28359_01/appdev.111/b28419/d_stats.htm
Even if you have indexes on both fields individually, you still need an index on the two fields combined. Create an index on ID and PARENT_ID (INDEX_NAME and TABLE_NAME are placeholders):
CREATE INDEX INDEX_NAME ON TABLE_NAME (ID, PARENT_ID);
See this URL:
Optimizing Oracle CONNECT BY when used with WHERE clause
Make sure the underlying table does not have row chaining or other problems (E.G. corruption).
Make sure the table and all indexes are in the same tablespace.
I'm not sure whether this is any faster without the data volume to test with, but it is something to consider. My hope is that by starting with only the rows that are leafs and only the rows that are tagged, we deal with a smaller volume, which may result in a performance gain. The string manipulation does seem hackish, though.
with cte(id, name, parent_id, tagged) as (
SELECT 1, 'ONE', null, null from dual union all
SELECT 2, 'TWO', 1, 1 from dual union all
SELECT 3, 'THREE', 2, null from dual union all
SELECT 4, 'FOUR', 3, null from dual union all
select 5, 'FIVE', null, null from dual union all
select 6, 'SIX', 5, 1 from dual union all
select 7, 'SEVEN', 6, null from dual union all
select 8, 'EIGHT', null, null from dual union all
select 9, 'NINE', 8, null from dual),
Leafs(id, name) as (select id, Name
from cte
where connect_by_isleaf = 1
Start with parent_Id is null
connect by nocycle prior id =parent_id),
Tagged as (SELECT id, name, SYS_CONNECT_BY_PATH(ID, '/') Path, substr(SYS_CONNECT_BY_PATH(ID, '/'),2,instr(SYS_CONNECT_BY_PATH(ID, '/'),'/',2)-2) as Leaf
from cte
where tagged=1
start with id in (select id from leafs)
connect by nocycle prior parent_id = id)
select l.*, T.ID as Tagged_ancestor from leafs L
inner join tagged t
on l.id = t.leaf
In essence I created three CTEs: one for the data (cte), one for the leafs (leafs) and one for the tagged records (tagged).
We traverse the hierarchy twice: once to get all the leafs, and once to get all the tagged records. We then parse the first leaf value out of the tagged hierarchy path and join it back to leafs to get the leafs related to tagged records.
As to whether this is faster than what you're doing... *shrug*. I didn't spend the time testing, since I have neither your indexes nor your data volume.
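For comparison, here is a sketch of the same "start from the leafs and walk up" idea without the path-string parsing. It carries the leaf id along in the recursion and stops climbing at the first tagged row. This is only an illustration on the small sample data, run on SQLite via Python's sqlite3 rather than on Oracle, and it says nothing about performance on 11,000,000 rows. Note one difference from the original query: it reports only the nearest tagged ancestor of each leaf.

```python
import sqlite3

# Sample data from the question: (id, name, parent_id, tagged).
ROWS = [
    (1, "One", None, None), (2, "Two", 1, 1), (3, "Three", 2, None),
    (4, "Four", 3, None), (5, "Five", None, None), (6, "Six", 5, 1),
    (7, "Seven", 6, None), (8, "Eight", None, None), (9, "Nine", 8, None),
]

QUERY = """
WITH RECURSIVE up(leaf_id, leaf_name, cur_id, cur_parent, cur_tagged) AS (
    -- anchor: every leaf, i.e. every row that no other row names as its parent
    SELECT l.id, l.name, l.id, l.parent_id, l.tagged
    FROM mytree l
    WHERE NOT EXISTS (SELECT 1 FROM mytree c WHERE c.parent_id = l.id)
    UNION ALL
    -- recursive step: climb one level, but stop once a tagged ancestor is reached
    SELECT up.leaf_id, up.leaf_name, p.id, p.parent_id, p.tagged
    FROM mytree p
    JOIN up ON p.id = up.cur_parent
    WHERE up.cur_tagged IS NULL OR up.cur_id = up.leaf_id
)
SELECT leaf_id, leaf_name, cur_id AS tagged_ancestor_id
FROM up
WHERE cur_tagged IS NOT NULL AND cur_id <> leaf_id
ORDER BY leaf_id
"""


def tagged_leaves():
    """Return (leaf id, leaf name, nearest tagged ancestor id) tuples."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE mytree (id INTEGER, name TEXT, parent_id INTEGER, tagged INTEGER)"
    )
    conn.executemany("INSERT INTO mytree VALUES (?, ?, ?, ?)", ROWS)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


print(tagged_leaves())
```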

Recursive Delete SQL Oracle

I'm looking for a way to do a recursive delete on a table.
The table has 3 foreign keys (1 on itself and 2 others), and I want to delete rows depending on the date of the occurrence.
Table1 --> Id1, dateOCC, ParentID
1, 13-12-26, null
2, 13-07-18, null
3, 14-12-31, 1
4, 13-06-26, 1
5, 14-07-23, null
6, 13-07-22, 2
Table2--> ID, stuff
Table3 --> ID, stuff
The IDs of Table2 and Table3 are linked directly to the ID of Table1.
Table1 holds approximately 20,000,000 rows, and the other tables hold approximately the same amount.
Here is one of the queries I tried (it's inside a cursor that deletes the rows returned):
SELECT EO.ID,
EO.DATEOCC,
EO.PARENTID
FROM TABLE1 EO
WHERE EO.DATEOCC <= TO_DATE ('2013-12-31','YYYY-MM-DD')
AND NOT EXISTS(SELECT 1 FROM TABLE2 WHERE ID = EO.ID)
AND NOT EXISTS( SELECT 1 FROM TABLE3 WHERE ID = EO.ID)
START WITH EO.PARENTID IS NULL
CONNECT BY PRIOR EO.ID = EO.PARENTID;
This query is really slow to return the data that I want.
It also does not seem to return the rows that I need to delete.
Edit #1
OK, so here's an example of what I need to do (in this example I assume that Table2 and Table3 have no IDs matching Table1):
Table1 --> Id1, dateOCC, ParentID
1, 13-12-26, null
2, 13-07-18, null
3, 14-12-31, 1
4, 13-06-26, 1
5, 14-07-23, null
6, 13-07-22, 2
After the delete sequence, the table has to look like this if the cutoff date is 13-12-31:
Table1 --> Id1, dateOCC, ParentID
1, 13-12-26, null
3, 14-12-31, 1
5, 14-07-23, null
So as you can see, I delete a child together with its parent where possible. If I can't delete the parent because it has another child that I can't delete, I don't delete the parent (I delete only the children that I can).
In a hierarchical query, the WHERE clause is applied after the START WITH and CONNECT BY are used to build the hierarchy. But syntactically it comes first, which makes it intuitively seem that it will be applied first.
If what you really want is to apply the WHERE clause first, then build the hierarchy, you can use a subquery like this:
SELECT EO.ID,
EO.DATEOCC,
EO.PARENTID
FROM (
SELECT * FROM TABLE1 EO
WHERE EO.DATEOCC <= TO_DATE ('2013-12-31','YYYY-MM-DD')
AND NOT EXISTS(SELECT 1 FROM TABLE2 WHERE ID = EO.ID)
AND NOT EXISTS( SELECT 1 FROM TABLE3 WHERE ID = EO.ID)
) EO
START WITH EO.PARENTID IS NULL
CONNECT BY PRIOR EO.ID = EO.PARENTID;
But it is not clear whether that is what you want. This would give you the top-level parents within the desired date range, and without children in the other tables, then build the entire hierarchy for those parents. It's possible that lower nodes in the hierarchy would have children in the other tables, which would cause the delete to fail.
If that's not what you want, I think you need to describe your requirements more clearly.
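Going by the expected result in Edit #1, one possible reading of the requirement is: keep every row newer than the cutoff plus all of its ancestors, and delete the rest. The sketch below illustrates that reading only. It assumes the dates are YY-MM-DD (stored here as ISO strings), ignores Table2 and Table3 as Edit #1 does, and runs on SQLite via Python's sqlite3 with a recursive CTE rather than on Oracle; the function name is hypothetical.

```python
import sqlite3

# Rows from Edit #1: (id, dateocc, parentid), dates read as YY-MM-DD.
ROWS = [
    (1, "2013-12-26", None),
    (2, "2013-07-18", None),
    (3, "2014-12-31", 1),
    (4, "2013-06-26", 1),
    (5, "2014-07-23", None),
    (6, "2013-07-22", 2),
]

DELETE_SQL = """
WITH RECURSIVE keep(id) AS (
    -- rows that are too recent to delete
    SELECT id FROM table1 WHERE dateocc > :cutoff
    UNION
    -- plus every ancestor of a row that must be kept
    SELECT t.parentid
    FROM table1 t
    JOIN keep k ON t.id = k.id
    WHERE t.parentid IS NOT NULL
)
DELETE FROM table1 WHERE id NOT IN (SELECT id FROM keep)
"""


def remaining_after_delete(cutoff):
    """Run the delete for the given ISO cutoff date and return the surviving ids."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE table1 (id INTEGER, dateocc TEXT, parentid INTEGER)")
    conn.executemany("INSERT INTO table1 VALUES (?, ?, ?)", ROWS)
    conn.execute(DELETE_SQL, {"cutoff": cutoff})
    result = [row[0] for row in conn.execute("SELECT id FROM table1 ORDER BY id")]
    conn.close()
    return result


print(remaining_after_delete("2013-12-31"))
```

Row 1 survives even though it is older than the cutoff, because its child (row 3) has to stay.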

how to query with child relations to same table and order this correctly

Take this table:
id  name  sub_id
---------------------------
1   A     (null)
2   B     (null)
3   A2    1
4   A3    1
The sub_id column references the id column of the same table.
subid --- 0:1 --- id
Now I need a SELECT query that shows each child row (where sub_id is not null) directly under its parent row. So this would be the correct order:
1   A     (null)
3   A2    1
4   A3    1
2   B     (null)
A normal SELECT orders by id. Which keyword or technique would help me order this correctly?
I don't think a JOIN is possible, because I want to get all the rows separately: the rows will be displayed in a GridView (ASP.NET) with an EntityDataSource, and each child row must be displayed directly under its parent.
Thank you.
Look at Managing Hierarchical Data in MySQL.
Since recursion is an expensive operation (basically you're firing multiple queries at your database), you could consider using the Nested Set Model. In short, you assign numbers to ranges in your table. It's a long article, but it's worth reading. I used it during my internship to bring 1000+ queries down to 1 query.
Your handling 'overhead' now lies in updating the table when adding, updating or deleting records, since you then have to update all the records with a bigger 'right-value'. But when you're retrieving the data, it all goes with 1 query :)
select * from table1 order by name, sub_id will in this case return your desired result, but only because the parents' names and the children's names are similar. If you're using SQL Server 2005 or later, a recursive CTE will work:
WITH recurse (id, Name, childID, Depth)
AS
(
-- anchor: top-level rows sort under their own id
SELECT id, Name, ISNULL(sub_id, id), 0 AS Depth
FROM table1 WHERE sub_id IS NULL
UNION ALL
-- recursive step: rows whose sub_id points at a row already found
SELECT table1.id, table1.Name, table1.sub_id, recurse.Depth + 1 AS Depth FROM table1
JOIN recurse ON table1.sub_id = recurse.id
)
SELECT * FROM recurse ORDER BY childID, Depth
SELECT
*
FROM
table1
ORDER BY
COALESCE(sub_id, id), id
By the way, this works for only one level; anything deeper than that requires a recursive query/CTE.
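For more than one level, a common trick is to build a sortable path in the recursive CTE and order by it. The following is a sketch only: it runs on SQLite through Python's sqlite3, the table name items and the function ordered_rows are made up for the example, and the ids are zero-padded to 8 digits so that string ordering matches numeric ordering (an assumption that ids stay below 100,000,000).

```python
import sqlite3

# Rows from the question: (id, name, sub_id).
ROWS = [(1, "A", None), (2, "B", None), (3, "A2", 1), (4, "A3", 1)]

QUERY = """
WITH RECURSIVE tree(id, name, sub_id, sort_path) AS (
    -- anchor: top-level rows, path is just their own padded id
    SELECT id, name, sub_id, printf('%08d', id)
    FROM items
    WHERE sub_id IS NULL
    UNION ALL
    -- recursive step: append the child's padded id to the parent's path
    SELECT i.id, i.name, i.sub_id, tree.sort_path || '/' || printf('%08d', i.id)
    FROM items i
    JOIN tree ON i.sub_id = tree.id
)
SELECT id, name, sub_id FROM tree ORDER BY sort_path
"""


def ordered_rows(rows):
    """Return the rows with every child listed directly under its parent."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE items (id INTEGER, name TEXT, sub_id INTEGER)")
    conn.executemany("INSERT INTO items VALUES (?, ?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


for row in ordered_rows(ROWS):
    print(row)
```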

Using CONNECT BY to get all parents and one child in Hierarchy through SQL query in Oracle

I was going through some previous posts on CONNECT BY usage. What I need to find out is how to get all the parents (i.e., up to the root) and just one level of children for a node, say 4.
It seems like I will have to use a union of the following two queries:
SELECT *
FROM hierarchy
START WITH id = 4
CONNECT BY id = PRIOR parent
union
SELECT *
FROM hierarchy
WHERE LEVEL <= 2
START WITH
id = 4
CONNECT BY
parent = PRIOR id
Is there a better way to do this, some workaround that is more optimized?
You should be able to do this using a sub-select (and DISTINCT) to find all children of 4:
Select Distinct *
From hierarchy
Start With id In ( Select id
From hierarchy
Where parent = 4 )
Connect By id = Prior parent
Using UNION you could at least remove the CONNECT BY from your second query:
Select *
From hierarchy
Start With id = 4
Connect By id = Prior parent
Union
Select *
From hierarchy
Where parent = 4
Never use SELECT *, always name the columns you actually need. This makes your query easier to read, to maintain and to optimize.
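To make the UNION variant concrete, here is a small sketch that walks up from node 4 and then adds only its direct children. The hierarchy rows are invented for the demonstration (the question does not show its data), the columns are named explicitly, and the query runs on SQLite via Python's sqlite3 with a recursive CTE in place of CONNECT BY.

```python
import sqlite3

# Hypothetical sample data: (id, parent). Node 4 has parent 2 and children 5 and 6.
ROWS = [(1, None), (2, 1), (3, 1), (4, 2), (5, 4), (6, 4), (7, 5)]

QUERY = """
WITH RECURSIVE up(id, parent) AS (
    -- anchor: the node itself
    SELECT id, parent FROM hierarchy WHERE id = :node
    UNION ALL
    -- recursive step: the parent of the row found in the previous step
    SELECT h.id, h.parent
    FROM hierarchy h
    JOIN up ON h.id = up.parent
)
SELECT id, parent FROM up
UNION
-- direct children only: no recursion needed for a single level
SELECT id, parent FROM hierarchy WHERE parent = :node
ORDER BY id
"""


def parents_and_children(node):
    """Return the node, all of its ancestors and its direct children."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE hierarchy (id INTEGER, parent INTEGER)")
    conn.executemany("INSERT INTO hierarchy VALUES (?, ?)", ROWS)
    result = conn.execute(QUERY, {"node": node}).fetchall()
    conn.close()
    return result


print(parents_and_children(4))
```

The grandchild (id 7) and the unrelated branch (id 3) are left out.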