Oracle SQL hierarchical query from bottom to top element

In my table I store the successor of each entry.
+----+-----------+--+
| ID | SUCCESSOR | |
+----+-----------+--+
| 1 | 2 | |
| 2 | 3 | |
| 3 | | |
+----+-----------+--+
I need to get from ID 3 to ID 1.
I have tried to achieve this with the following query, but it does not work. :-(
SELECT NVL (id, 3)
FROM my_table
WHERE LEVEL = 1
CONNECT BY id = PRIOR successor
START WITH id = 3;
Can somebody please give me some advice how to get this working?

The following version should also provide the correct answer:
with my_table as
(select 1 id, 2 successor from dual union
select 2 id, 3 successor from dual union
select 3 id, null successor from dual )
SELECT id FROM my_table
WHERE level = 3
CONNECT BY successor = PRIOR id
START WITH successor is null
;
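The query above hard-codes level = 3, which only fits a chain of exactly three rows. As a rough sketch of the same bottom-to-top walk without a fixed depth, here is a version that runs against SQLite from Python. Using SQLite and a recursive CTE in place of Oracle's CONNECT BY is an assumption made only so the example is runnable; the `walk` CTE and the `depth` column are illustrative names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (id INTEGER PRIMARY KEY, successor INTEGER)")
conn.executemany("INSERT INTO my_table VALUES (?, ?)", [(1, 2), (2, 3), (3, None)])

# Start at the row that has no successor (ID 3) and repeatedly step to the
# row whose successor is the current row, counting the depth on the way.
chain = conn.execute("""
    WITH RECURSIVE walk(id, depth) AS (
        SELECT id, 1 FROM my_table WHERE successor IS NULL
        UNION ALL
        SELECT t.id, w.depth + 1
        FROM walk w JOIN my_table t ON t.successor = w.id
    )
    SELECT id, depth FROM walk ORDER BY depth
""").fetchall()

# The deepest row of the walk is the top element, whatever the chain length.
top_id = chain[-1][0]
print(chain)   # [(3, 1), (2, 2), (1, 3)]
print(top_id)  # 1
```

In Oracle itself, filtering on CONNECT_BY_ISLEAF = 1 instead of a fixed LEVEL should give the same effect.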

Related

Find top parent of child, multiple levels

ENTRY TABLE
__________________
| ID | PARENT_ID |
| 1 | null |
| 2 | 1 |
| 3 | 2 |
| 4 | null |
| 5 | 4 |
| 6 | 5 |
...
I make copies of the entries in some cases and they are connected by parent ID.
Each entry can have one copy:
THIS WONT HAPPEN
__________________
| ID | PARENT_ID |
| 1 | null |
| 2 | 1 |
| 3 | 1 |
...
Sometimes I need to take a copy and query for its top-level parent. I need to find the top parent entries for all the entries I search for.
For example, if I query for the parents of ID 6 and 3, I would get ID 4 and 1.
If I query for the parents of ID 5 and 2, I would get ID 4 and 1.
But if I query for ID 5 and 1, it should also return ID 4 and 1, because entry ID 1 is already a top parent itself.
I don't know where to begin since I don't know how to recursively query in such case.
Can anyone point me in the right direction?
I know that the query below will just return the child elements (ID 6 and 3), but I honestly don't know where to go from here.
I am using Oracle SQL, by the way.
SELECT * FROM entry WHERE id IN (6, 3);
You can use a hierarchical query and CONNECT_BY_ROOT.
Either starting at the root of the hierarchy and working down:
SELECT id,
CONNECT_BY_ROOT(id) AS root_id
FROM entry
WHERE id IN (6, 3)
START WITH parent_id IS NULL
CONNECT BY PRIOR id = parent_id;
Or, from the entry back up to the root:
SELECT CONNECT_BY_ROOT(id) AS id,
id AS root_id
FROM entry
WHERE parent_id IS NULL
START WITH id IN (6, 3)
CONNECT BY PRIOR parent_id = id;
Which, for the sample data:
CREATE TABLE entry( id, parent_id ) AS
SELECT 1, NULL FROM DUAL UNION ALL
SELECT 2, 1 FROM DUAL UNION ALL
SELECT 3, 2 FROM DUAL UNION ALL
SELECT 4, NULL FROM DUAL UNION ALL
SELECT 5, 4 FROM DUAL UNION ALL
SELECT 6, 5 FROM DUAL UNION ALL
SELECT 7, 6 FROM DUAL
Both output:
ID ROOT_ID
-- -------
3  1
6  4
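To sanity-check the expected results, including the case where a searched entry is already a top parent, the bottom-up walk can be modelled outside the database. This is only a sketch in plain Python over the question's sample rows; `PARENT_OF`, `top_parent` and `top_parents` are hypothetical names, and the data is assumed to contain no cycles.

```python
# Sample ENTRY rows from the question: id -> parent_id (None marks a top-level entry).
PARENT_OF = {1: None, 2: 1, 3: 2, 4: None, 5: 4, 6: 5}


def top_parent(entry_id, parent_of=PARENT_OF):
    """Follow parent_id links upwards until an entry without a parent is reached."""
    current = entry_id
    while parent_of[current] is not None:
        current = parent_of[current]
    return current


def top_parents(entry_ids, parent_of=PARENT_OF):
    """Map every requested id to its top-level parent."""
    return {entry_id: top_parent(entry_id, parent_of) for entry_id in entry_ids}


print(top_parents([6, 3]))  # {6: 4, 3: 1}
print(top_parents([5, 1]))  # {5: 4, 1: 1}
```

An entry with no parent maps to itself, which is the behaviour asked for when querying ID 1.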
You can use a recursive CTE to walk the graph and find the initial parent. For example:
with
n (starting_id, current_id, parent_id, v) as (
select id, id, parent_id, 0 from entry where id in (6, 3)
union all
select n.starting_id, e.id, e.parent_id, n.v - 1
from n
join entry e on e.id = n.parent_id
)
select starting_id, current_id as initial_id
from (
select n.*, row_number() over(partition by starting_id order by v) as rn
from n
) x
where rn = 1
Result:
STARTING_ID INITIAL_ID
------------ ----------
3 1
6 4

Oracle SQL hierarchical query from bottom to top

I have a table where I want to go from bottom to top using hierarchical queries.
The problem is that I need to get the value of one column from the root (top) using CONNECT_BY_ROOT, but since I reverse the direction of the hierarchical query (swapping PRIOR in CONNECT BY and changing START WITH), CONNECT_BY_ROOT considers my START WITH row to be level 1 (the root) and returns its value instead.
In other words, I want a way to reverse CONNECT_BY_ROOT so that it gives me the value of a column from the last possible level and not from the root.
+----+-----------+-------+
| ID | ID_PARENT | VALUE |
+----+-----------+-------+
| 1 | null | 5 |
| 2 | 1 | 9 |
| 3 | 2 | null |
+----+-----------+-------+
I want to get the value of ID = 1 (5) to the ID = 3 like this:
+----+-------+------------+
| ID | VALUE | VALUE_root |
+----+-------+------------+
| 1 | 5 | 5 |
| 2 | 9 | 5 |
| 3 | null | 5 |
+----+-------+------------+
I tried this but all I get is null as value_root:
SELECT id,
CONNECT_BY_ROOT VALUE as VALUE_root
FROM my_table
START WITH ID = 3
CONNECT BY ID = PRIOR ID_PARENT
EDIT: I forgot to mention that in my real system I'm dealing with millions of rows of data. The reason I'm reversing the hierarchical query in the first place is to improve performance!
You can retrieve the root (which is the bottom node in your case) for every row as you walk up the tree, and then apply an analytic function partitioned by that root to propagate the top parent's value to all the nodes of the tree. This also works with multiple nodes in START WITH.
with src (id, parentid, val) as (
select 1, cast(null as int), 5 from dual union all
select 2, 1, 9 from dual union all
select 3, 2, null from dual union all
select 4, 2, null from dual union all
select 5, null, 10 from dual union all
select 6, 5, 7 from dual
)
select
connect_by_root id as tree_id
, id
, parentid
, val
, max(decode(connect_by_isleaf, 1, val))
over(partition by connect_by_root id) as val_root
from src
start with id in (3, 4, 6)
connect by id = prior parentid
order by 1, 2, 3
TREE_ID ID PARENTID VAL VAL_ROOT
------- -- -------- --- --------
3       1  -        5   5
3       2  1        9   5
3       3  2        -   5
4       1  -        5   5
4       2  1        9   5
4       4  2        -   5
6       5  -        10  10
6       6  5        7   10
You can try the query below. I have just changed the START WITH condition and the CONNECT BY clause:
SELECT id,
CONNECT_BY_ROOT VALUE as VALUE_root
FROM my_table
START WITH ID = 1
CONNECT BY PRIOR ID = ID_PARENT;
You were almost there:
SELECT id,
value,
CONNECT_BY_ROOT VALUE as VALUE_root
FROM your_table
START WITH ID = 1
CONNECT BY prior ID = ID_PARENT
One possibility is to first perform a hierarchical query starting from the root, to get the root node for each row.
In the second step you perform the bottom-up query (starting from all leaf nodes) and use the pre-calculated root node.
Below is the solution using recursive subquery factoring:
with hir (id, id_parent, value, value_root) as
(select id, id_parent, value, value value_root
from tab
where id_parent is null
union all
select tab.id, tab.id_parent, tab.value, hir.value_root
from hir
join tab on tab.id_parent = hir.id
),
hir2 (id, id_parent, value, value_root) as
(select id, id_parent, value, value_root from hir
where ID in (select id from tab /* id of leaves */
minus
select id_parent from tab)
union all
select hir.id, hir.id_parent, hir.value, hir.value_root
from hir2
join hir on hir2.id_parent = hir.id
)
select id,value, value_root
from hir2
;
        ID      VALUE VALUE_ROOT
---------- ---------- ----------
         3                     5
         2          9          5
         1          5          5
Note that the row order 3, 2, 1 is the bottom-up order that you want, but which your example output does not show.
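For comparison, here is a small sketch that produces the VALUE_root column from the question by walking upwards from every row and keeping the row where the walk ends. It runs against SQLite from Python purely so that it can be executed; SQLite and the `up` CTE with its `start_id` column are assumptions for illustration, and nothing here has been measured on millions of rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE my_table (id INTEGER PRIMARY KEY, id_parent INTEGER, value INTEGER)"
)
conn.executemany(
    "INSERT INTO my_table VALUES (?, ?, ?)",
    [(1, None, 5), (2, 1, 9), (3, 2, None)],
)

# For every starting row, climb through id_parent. The climb ends at the row
# whose id_parent is NULL, and that row's value is the root value.
rows = conn.execute("""
    WITH RECURSIVE up(start_id, id, id_parent, value) AS (
        SELECT id, id, id_parent, value FROM my_table
        UNION ALL
        SELECT u.start_id, t.id, t.id_parent, t.value
        FROM up u JOIN my_table t ON t.id = u.id_parent
    )
    SELECT s.id, s.value, u.value AS value_root
    FROM up u JOIN my_table s ON s.id = u.start_id
    WHERE u.id_parent IS NULL
    ORDER BY s.id
""").fetchall()

print(rows)  # [(1, 5, 5), (2, 9, 5), (3, None, 5)]
```

Restricting the anchor member (the first SELECT inside the CTE) to the rows of interest would keep the walk from starting at every row of a large table.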

Creating a SQL view to query whether a node is a descendant of a specific node

I have 2 SQL tables with the following data structure:
Table 1 : FAVORITES
Columns:
pk
fk_user (the user table is irrelevant for now)
fk_tree_node
Table 2 : TREE
Columns:
pk
fk_parent_node
I want to create a view so that I can query whether a node is favorited or is a descendant of a favorited node. So for every entry in FAVORITES, the view would have several entries in which the user is associated with either the favorited node or one of its descendants.
View: FAV_OR_DESCENDANT
Columns:
fk_tree_node
fk_user
pk
Queries would work like this
SELECT *
FROM FAV_OR_DESCENDANT
WHERE fk_user = 0
The results I'd expect to get for a given tree would look like this:
TREE:
+--------+----+----------------+
| rownum | pk | pk_parent_node |
+--------+----+----------------+
| 1 | 1 | null |
| 2 | 2 | 1 |
| 3 | 3 | 2 |
| 4 | 4 | 1 |
+--------+----+----------------+
FAVORITES:
+--------+----+---------+--------------+
| rownum | pk | fk_user | fk_tree_node |
+--------+----+---------+--------------+
| 1 | 0 | 0 | 1 |
+--------+----+---------+--------------+
Tree representation:
1 <-- User 0 has only favorited this single node
/ \
2 4
/
3
Result data in FAV_OR_DESCENDANT:
+--------+---------+--------------+
| rownum | fk_user | fk_tree_node |
+--------+---------+--------------+
| 1 | 0 | 1 |
| 2 | 0 | 2 |
| 3 | 0 | 3 |
| 4 | 0 | 4 |
+--------+---------+--------------+
I know how to write this query if I'm asking for all favorited nodes/descendant nodes for a specific user. However, I'm struggling in translating that into an SQL query that would create a view:
SELECT DISTINCT *
FROM tree
START WITH tree.pk IN (
SELECT fk_tree_node
FROM favorites
WHERE fk_user = 0
)
CONNECT BY PRIOR tree.pk = tree.fk_parent_node
Other questions I found were more centered around making queries or were not limited to SQL. I'd be thankful for every hint in the right direction.
Since you need to get all descendant nodes, you need to aggregate all parent nodes, for example using sys_connect_by_path (or you can use hierarchical recursive subquery factoring):
with
TREE(pk, pk_parent_node) as (
select 1 , null from dual union all
select 2 , 1 from dual union all
select 3 , 2 from dual union all
select 4 , 1 from dual
)
,FAVORITES( pk, fk_user, fk_tree_node) as (
select 0, 0, 1 from dual
)
,v_tree as (
select pk,pk_parent_node,sys_connect_by_path(pk,'/') p_path
from TREE
start with pk_parent_node is null
connect by prior pk = pk_parent_node
)
select *
from v_tree;
Results:
PK PK_PARENT_NODE P_PATH
---------- -------------- ------------------------------
1 NULL /1
2 1 /1/2
3 2 /1/2/3
4 1 /1/4
In fact you can already check if P_PATH contains a favorite node, but since you want a view with users, we can aggregate them into a new column:
with
TREE(pk, pk_parent_node) as (
select 1 , null from dual union all
select 2 , 1 from dual union all
select 3 , 2 from dual union all
select 4 , 1 from dual
)
,FAVORITES( pk, fk_user, fk_tree_node) as (
select 0, 0, 1 from dual union all
select 1, 1, 3 from dual
)
,v_tree as (
select pk,pk_parent_node,sys_connect_by_path(pk,'/')||'/' p_path
from TREE
start with pk_parent_node is null
connect by prior pk = pk_parent_node
)
select
v.*, v2.*
from v_tree v
outer apply(
select
xmlelement(USERS, xmlagg(xmlelement(ID, f.fk_user) order by f.fk_user)) as users -- aggregate user ids, not favorite pks
from FAVORITES f
where p_path like '%/'||f.fk_tree_node||'/%'
) v2;
Results:
PK PK_PARENT_NODE P_PATH USERS
---------- -------------- ------------------------------ ------------------------------------------------------------
1 NULL /1/ <USERS><ID>0</ID></USERS>
2 1 /1/2/ <USERS><ID>0</ID></USERS>
3 2 /1/2/3/ <USERS><ID>0</ID><ID>1</ID></USERS>
4 1 /1/4/ <USERS><ID>0</ID></USERS>
I've added one more user 1 to make it more clear.
So now you just need to add a predicate to filter users:
with
TREE(pk, pk_parent_node) as (
select 1 , null from dual union all
select 2 , 1 from dual union all
select 3 , 2 from dual union all
select 4 , 1 from dual
)
,FAVORITES( pk, fk_user, fk_tree_node) as (
select 0, 0, 1 from dual union all
select 1, 1, 3 from dual
)
,v_tree as (
select pk,pk_parent_node,sys_connect_by_path(pk,'/')||'/' p_path
from TREE
start with pk_parent_node is null
connect by prior pk = pk_parent_node
)
,v_final_view as (
select
v.*, v2.*
from v_tree v
outer apply(
select
xmlelement(USERS, xmlagg(xmlelement(ID, f.fk_user) order by f.fk_user)) as users -- aggregate user ids, not favorite pks
from FAVORITES f
where p_path like '%/'||f.fk_tree_node||'/%'
) v2
)
select *
from v_final_view
where
xmlexists(
'$USERS/USERS[ID=$USER_ID]'
passing
users as USERS,
1 as USER_ID -- your input param - user id
)
;
Results:
PK PK_PARENT_NODE P_PATH USERS
---- -------------- ------------ ------------------------------------
3 2 /1/2/3/ <USERS><ID>0</ID><ID>1</ID></USERS>
Of course, this is just an example of the approach, so you can use other functions for aggregation or even create a materialized view for performance.
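If the view only needs the (fk_user, fk_tree_node) pairs from the question, a recursive definition that starts at each favorite and descends may be enough. The following is a sketch run against SQLite from Python so that it can be executed. SQLite instead of Oracle, the `reach` CTE and the `nodes_for` helper are assumptions for illustration; the second favorite (user 1, node 3) is the extra row from the sample above, and the parent column is named fk_parent_node as in the question's table description.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tree (pk INTEGER PRIMARY KEY, fk_parent_node INTEGER);
    CREATE TABLE favorites (pk INTEGER PRIMARY KEY, fk_user INTEGER, fk_tree_node INTEGER);
    INSERT INTO tree VALUES (1, NULL), (2, 1), (3, 2), (4, 1);
    INSERT INTO favorites VALUES (0, 0, 1), (1, 1, 3);

    -- Start at every favorited node and descend to its children, carrying
    -- the user along. UNION drops duplicate (user, node) pairs.
    CREATE VIEW fav_or_descendant AS
    WITH RECURSIVE reach(fk_user, fk_tree_node) AS (
        SELECT fk_user, fk_tree_node FROM favorites
        UNION
        SELECT r.fk_user, t.pk
        FROM reach r JOIN tree t ON t.fk_parent_node = r.fk_tree_node
    )
    SELECT fk_user, fk_tree_node FROM reach;
""")


def nodes_for(user_id):
    """Return the favorited-or-descendant node ids for one user, sorted."""
    cursor = conn.execute(
        "SELECT fk_tree_node FROM fav_or_descendant "
        "WHERE fk_user = ? ORDER BY fk_tree_node",
        (user_id,),
    )
    return [row[0] for row in cursor]


print(nodes_for(0))  # [1, 2, 3, 4]
print(nodes_for(1))  # [3]
```

In Oracle the same shape could probably be written with recursive subquery factoring, or with CONNECT_BY_ROOT fk_user after joining FAVORITES to TREE, but that variant is not tested here.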

Select unique combinations (unique on both sides)

EDIT: added a link to Fiddle for a more comprehensive sample (actual dataset)
I wonder if the below is possible in SQL, in BigQuery in particular, and in one SELECT statement.
Consider following input:
Key | Value
-----|-------
a | 2
a | 3
b | 2
b | 3
b | 5
c | 2
c | 5
c | 7
Logic: select the lowest value "available" for each key. Available meaning not yet assigned/used. See below.
Key | Value | Rule
-----|-------|--------------------------------------------
a | 2 | keep
a | 3 | ignore because key "a" has a value already
b | 2 | ignore because value "2" was already used
b | 3 | keep
b | 5 | ignore because key "b" has a value already
c | 2 | ignore because value "2" was already used
c | 5 | keep
c | 7 | ignore because key "c" has a value already
Hence expected outcome:
Key | Value
-----|-------
a | 2
b | 3
c | 5
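The rule above can be pinned down as a short procedural sketch in Python. It is not a single SELECT and not BigQuery-specific; it only serves as a reference for checking the output of any SQL attempt. `ROWS` and `assign_lowest_available` are hypothetical names, and keys are assumed to be processed in ascending order.

```python
# Input rows from the example: (key, value) pairs.
ROWS = [
    ("a", 2), ("a", 3),
    ("b", 2), ("b", 3), ("b", 5),
    ("c", 2), ("c", 5), ("c", 7),
]


def assign_lowest_available(rows):
    """Give each key the lowest value that no earlier key has taken."""
    assigned = {}        # key -> chosen value
    used_values = set()  # values already taken by some key
    for key, value in sorted(rows):
        if key in assigned:
            continue     # the key already has a value
        if value in used_values:
            continue     # the value was already used by another key
        assigned[key] = value
        used_values.add(value)
    return assigned


print(assign_lowest_available(ROWS))  # {'a': 2, 'b': 3, 'c': 5}
```

A key whose values are all taken by earlier keys gets no entry in the result.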
Here is the SQL to create the dummy table:
with t as ( select
'a' key, 2 value UNION ALL select 'a', 3
UNION ALL select 'b', 2 UNION ALL select 'b', 3 UNION ALL select 'b', 5
UNION ALL select 'c', 2 UNION ALL select 'c', 5 UNION ALL select 'c', 7
)
select * from t
EDIT: here another dataset
Not sure what combination of FULL JOIN, DISTINCT, ARRAY or WINDOW functions I can use.
Any guidance is appreciated.
EDIT: This is an incorrect answer that worked with the original example dataset, but has issues (as seen with comprehensive sample). I'm leaving it here for now to maintain comment history.
I don't have a specific BigQuery answer, but here is one SQL solution using a Common Table Expression and recursion.
WITH MyCTE AS
(
/* ANCHOR SUBQUERY */
SELECT MyKey, MyValue
FROM MyTable t
WHERE t.MyKey = (SELECT MIN(MyKey) FROM MyTable)
UNION ALL
/* RECURSIVE SUBQUERY */
SELECT t.MyKey, t.MyValue
FROM MyTable t
INNER JOIN MyCTE c
ON c.MyKey < t.MyKey
AND c.MyValue < t.MyValue
)
SELECT MyKey, MIN(MyValue)
FROM MyCTE
GROUP BY MyKey
;
Results:
Key | Value
-----|-------
a | 2
b | 3
c | 5

SQL query update by grouping

I'm dealing with some legacy data in an Oracle table and have the following
--------------------------------------------
| RefNo | ID |
--------------------------------------------
| FOO/BAR/BAZ/AAAAAAAAAA | 1 |
| FOO/BAR/BAZ/BBBBBBBBBB | 1 |
| FOO/BAR/BAZ/CCCCCCCCCC | 1 |
| FOO/BAR/BAZ/DDDDDDDDDD | 1 |
--------------------------------------------
For each of the /FOO/BAR/BAZ/% records I want to make the ID a unique incrementing number.
Is there a method to do this in SQL?
Thanks in advance
EDIT
Sorry for not being specific. I have several groups of records, such as /FOO/BAR/BAZ/ and /FOO/ZZZ/YYY/. The same transformation needs to occur for each of these (example) groups. rownum can't be used, because I want ID to start from 1 and increment within each group of records I have to change.
Sorry for making a mess of my first post. Output should be
--------------------------------------------
| RefNo | ID |
--------------------------------------------
| FOO/BAR/BAZ/AAAAAAAAAA | 1 |
| FOO/BAR/BAZ/BBBBBBBBBB | 2 |
| FOO/BAR/BAZ/CCCCCCCCCC | 3 |
| FOO/BAR/BAZ/DDDDDDDDDD | 4 |
| FOO/ZZZ/YYY/AAAAAAAAAA | 1 |
| FOO/ZZZ/YYY/BBBBBBBBBB | 2 |
--------------------------------------------
Let's try something like this (Oracle version 10g and higher):
with t1 as(
select 'FOO/BAR/BAZ/AAAAAAAAAA' as RefNo, 1 as ID from dual union all
select 'FOO/BAR/BAZ/BBBBBBBBBB', 1 from dual union all
select 'FOO/BAR/BAZ/CCCCCCCCCC', 1 from dual union all
select 'FOO/BAR/BAZ/DDDDDDDDDD', 1 from dual union all
select 'FOO/ZZZ/YYY/AAAAAAAAAA', 1 from dual union all
select 'FOO/ZZZ/YYY/BBBBBBBBBB', 1 from dual union all
select 'FOO/ZZZ/YYY/CCCCCCCCCC', 1 from dual union all
select 'FOO/ZZZ/YYY/DDDDDDDDDD', 1 from dual
)
select row_number() over(partition by ComPart order by DifPart) as id
, RefNo
from (select regexp_substr(RefNo, '[[:alpha:]]+$') as DifPart
, regexp_substr(RefNo, '([[:alpha:]]+/)+') as ComPart
, RefNo
, Id
from t1
) q
;
ID REFNO
---------- -----------------------
1 FOO/BAR/BAZ/AAAAAAAAAA
2 FOO/BAR/BAZ/BBBBBBBBBB
3 FOO/BAR/BAZ/CCCCCCCCCC
4 FOO/BAR/BAZ/DDDDDDDDDD
1 FOO/ZZZ/YYY/AAAAAAAAAA
2 FOO/ZZZ/YYY/BBBBBBBBBB
3 FOO/ZZZ/YYY/CCCCCCCCCC
4 FOO/ZZZ/YYY/DDDDDDDDDD
I think that actually updating the ID column wouldn't be a good idea. Every time you add new groups of data you would have to run the update statement again. A better way would be to create a view, so that you see the desired output every time you query it.
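To make the partitioning logic explicit, here is a plain Python sketch of what row_number() over (partition by ... order by ...) does with these RefNo values: group by everything before the last slash, then count upwards inside each group. `REF_NOS` and `number_within_groups` are hypothetical names, and treating the text before the last "/" as the group is an assumption based on the sample data.

```python
from itertools import groupby

REF_NOS = [
    "FOO/ZZZ/YYY/BBBBBBBBBB",
    "FOO/BAR/BAZ/CCCCCCCCCC",
    "FOO/BAR/BAZ/AAAAAAAAAA",
    "FOO/ZZZ/YYY/AAAAAAAAAA",
    "FOO/BAR/BAZ/DDDDDDDDDD",
    "FOO/BAR/BAZ/BBBBBBBBBB",
]


def group_prefix(ref_no):
    """Everything before the last '/' identifies the group."""
    return ref_no.rsplit("/", 1)[0]


def number_within_groups(ref_nos):
    """Return (ref_no, id) pairs with ids restarting at 1 in every group."""
    # Sort by group first so that groupby sees each group as one run.
    ordered = sorted(ref_nos, key=lambda r: (group_prefix(r), r))
    numbered = []
    for _, members in groupby(ordered, key=group_prefix):
        for new_id, ref_no in enumerate(members, start=1):
            numbered.append((ref_no, new_id))
    return numbered


for ref_no, new_id in number_within_groups(REF_NOS):
    print(ref_no, new_id)
```

For the six sample values this numbers the FOO/BAR/BAZ group 1 to 4 and the FOO/ZZZ/YYY group 1 to 2.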
Could rownum be used as an incrementing ID?
UPDATE legacy_table
SET id = ROWNUM;
This will assign unique values to all records in the table. See the Oracle documentation on pseudocolumns for details.
You can run the following:
update <table_name> set id = rownum where descr like 'FOO/BAR/BAZ/%'
This is pretty rough and I'm not sure if your RefNo is a single value column or you just made it like that for simplicity.
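select
sub.RefNo,
row_number() over (order by sub.RefNo) + (select max(id) from legacy_table) as id
from (
-- Oracle concatenates strings with ||, not +
select FOO || '/' || BAR || '/' || BAZ || '/' || OTHER as RefNo
from legacy_table
group by FOO || '/' || BAR || '/' || BAZ || '/' || OTHER
) sub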