DB2 : search depth first : (syntax) error - sql

A few days ago I installed DB2 LUW (11.5) on a server to play around with.
Now I would like to do some recursive SQL (Recursive Common Table Expression):
Here is my setup:
drop table relations;
create table relations (id int, parent int);
insert into relations values(0,NULL);
insert into relations values(1,0);
insert into relations values(2,1);
insert into relations values(3,1);
insert into relations values(4,3);
insert into relations values(5,0);
insert into relations values(6,5);
insert into relations values(7,5);
insert into relations values(8,6);
insert into relations values(9,7);
insert into relations values(10,0);
insert into relations values(11,1);
commit;
Now I would like to see the hierarchy in the table. So I tried the following:
with recur(id, parent, level) as
(
select rel.id id, rel.parent parent, 0 level from relations rel where rel.id=0
union all
select rel.id, rel.parent, rec.level+1 from recur rec, relations rel where rec.id=rel.parent
and rec.level<10
)
select id, lpad(parent, level*2, ' ') from recur;
This gives me:
ID PARENT
----------- ------------------
0 -
1 0
5 0
10 0
2 1
3 1
11 1
6 5
7 5
4 3
8 6
9 7
This is (to me) : "Search Breadth First"
What I would like to see is "Search Depth First"
So I did this:
with recur(id, parent, level) as
(
select rel.id id, rel.parent parent, 0 level from relations rel where rel.id=0
union all
select rel.id, rel.parent, rec.level+1 from recur rec, relations rel where rec.id=rel.parent
and rec.level<10
)
search depth first by parent set ord
select id, lpad(parent, level*2, ' ') parent from recur order by ord;
But this delivers to me:
SQL0104N An unexpected token "search depth first by parent set ord sel" was
found following "t and rec.level<10 )". Expected tokens may include:
"<values>". SQLSTATE=42601
No clue how to solve it now. I (think I) have tried a lot of possible solutions. But none worked.
I'm starting to believe that DB2 LUW (11.5) doesn't know about Search Depth First. Or some setting must be made to make DB2 aware of the "SDF" possibility.
My question to you all:
How to solve this problem? How do I get Search Depth First to work?
On the positive side, the following works like a charm, but that is not what I want to know :-)
select id, lpad(parent, level*2, ' ') parent, level
from relations
start with id=0
connect by prior id=parent;
ID PARENT LEVEL
----------- ---------- -----------
0 - 1
1 0 2
2 1 3
3 1 3
4 3 4
11 1 3
5 0 2
6 5 3
8 6 4
7 5 3
9 7 4
10 0 2
This works like a charm, but I had to flip a switch in the database (and restart it):
db2set DB2_COMPATIBILITY_VECTOR=08

Your question is about displaying rows in a specific ordering, not about searching in a specific ordering.
You can display the rows in the ordering you want by assembling an ordering column that fits your needs.
For example:
with
n (id, parent, lvl, ordering) as (
select id, parent, 1, lpad(id, 3, '0') || lpad('', 30, ' ')
from relations
where parent is null
union all
select r.id, r.parent, n.lvl + 1, trim(n.ordering) || '/' || lpad(r.id, 3, '0')
from n, relations r where r.parent = n.id
)
select id, lpad(parent, lvl * 2, ' ') as parent, lvl
from n
order by ordering;
Result:
ID PARENT LVL
--- --------- ---
0 1
1 0 2
2 1 3
3 1 3
4 3 4
11 1 3
5 0 2
6 5 3
8 6 4
7 5 3
9 7 4
10 0 2
See running example at db<>fiddle.
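The path-based ordering above can be tried outside DB2 as well. The following is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in engine (an assumption made only so the example runs anywhere; SQLite's printf replaces the lpad call, and the trailing padding is dropped). It loads the same rows as the question and sorts by the zero-padded path.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for DB2 (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("create table relations (id int, parent int)")
conn.executemany(
    "insert into relations values (?, ?)",
    [(0, None), (1, 0), (2, 1), (3, 1), (4, 3), (5, 0),
     (6, 5), (7, 5), (8, 6), (9, 7), (10, 0), (11, 1)],
)

# Each row carries the zero-padded path from the root to itself.
# Sorting by that path as text yields a depth-first ordering.
query = """
with recursive n (id, parent, lvl, ordering) as (
    select id, parent, 1, printf('%03d', id)
    from relations
    where parent is null
    union all
    select r.id, r.parent, n.lvl + 1,
           n.ordering || '/' || printf('%03d', r.id)
    from n join relations r on r.parent = n.id
)
select id, parent, lvl, ordering from n order by ordering
"""
rows = conn.execute(query).fetchall()
depth_first_ids = [row[0] for row in rows]

for row_id, parent, lvl, ordering in rows:
    print(row_id, parent, lvl, ordering)
```

The padding width of 3 only works while every id has at most three digits; wider ids would need a wider pad for the text sort to stay correct.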

Related

SQL Server (terminal result) hierarchy map

In SQL Server 2016, I have a table with the following chaining structure:
dbo.Item
OriginalItem ItemID
------------ ------
NULL         7
1            2
NULL         1
5            6
3            4
NULL         8
NULL         5
9            11
2            3
EDIT NOTE: Bold numbers were added as a response to @lemon's comments below.
Importantly, this example is a trivialized version of the real data. The neatly ascending entries are not something that is present in the actual data; I'm just doing that to simplify the understanding.
I've constructed a query to get what I'm calling the TerminalItemID, which in this example case is ItemID 4, 6, and 7, and populated that into a temporary table #TerminalItems, the resultset of which would look like:
#TerminalItems
TerminalItemID
--------------
4
6
7
8
11
What I need, is a final mapping table that would look something like this (using the above example -- note that it also contains for 4, 6, and 7 mapping to themselves, this is needed by the business logic):
#Mapping
ItemID TerminalItemID
------ --------------
1      4
2      4
3      4
4      4
5      6
6      6
7      7
8      8
9      11
11     11
What I need help with is how to build this last #Mapping table. Any assistance in this direction is greatly appreciated!
This should do:
with MyTbl as (
select *
from (values
(NULL, 1 )
,(1, 2 )
,(2, 3 )
,(3, 4 )
,(NULL, 5 )
,(5, 6 )
,(NULL, 7 )
) T(OriginalItem, ItemID)
)
, TerminalItems as (
/* Find all leaf level items: those not appearing under OriginalItem column */
select LeafItem=ItemId, ImmediateOriginalItem=M.OriginalItem
from MyTbl M
where M.ItemId not in
(select distinct OriginalItem
from MyTbl AllParn
where OriginalItem is not null
)
), AllLevels as (
/* Use a recursive CTE to find and report all parents */
select ThisItem=LeafItem, ParentItem=ImmediateOriginalItem
from TerminalItems
union all
select ThisItem=AL.ThisItem, M.OriginalItem
from AllLevels AL
inner join
MyTbl M
on M.ItemId=AL.ParentItem
)
select ItemId=coalesce(ParentItem,ThisItem), TerminalItemId=ThisItem
from AllLevels
order by 1,2
Beware of the MAXRECURSION setting: by default SQL Server allows 100 levels of recursion, which means the depth of your tree (the number of nodes between a terminal item and its ultimate original item) can be at most 100. This can be increased with OPTION (MAXRECURSION nnn), where nnn can be adjusted as needed. The limit can also be removed entirely by using 0, but this is not recommended because cyclic data would then cause an infinite loop.
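To make the idea concrete outside SQL Server, here is a minimal Python sketch of the same leaf-to-ancestors walk, run on the sample rows from the question. All names (rows, parent_of, terminals, mapping) are hypothetical, and the `seen` set is an assumed guard that plays the role of the recursion limit if the data ever contains a cycle.

```python
# (OriginalItem, ItemID) pairs from the question's sample table.
rows = [(None, 7), (1, 2), (None, 1), (5, 6), (3, 4),
        (None, 8), (None, 5), (9, 11), (2, 3)]

# Look up each item's predecessor; None marks the start of a chain.
parent_of = {item: original for original, item in rows}

# Terminal items never appear in the OriginalItem column.
originals = {original for original, _ in rows if original is not None}
terminals = sorted(item for _, item in rows if item not in originals)

mapping = {}
for terminal in terminals:
    current = terminal
    seen = set()  # guards against cyclic data
    while current is not None and current not in seen:
        mapping[current] = terminal
        seen.add(current)
        current = parent_of.get(current)

for item in sorted(mapping):
    print(item, mapping[item])
```

In this sketch the orphaned predecessor 9 (which has no row of its own) still gets mapped to 11, because the walk records every node it visits before looking up the next predecessor.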
This is a typical gaps-and-islands problem and can also be carried out without recursion in three steps:
1. assign 1 at the beginning of each partition
2. compute a running sum over your flag value (generated at step 1)
3. extract the max "ItemID" on your partition (generated at step 2)
WITH cte1 AS (
SELECT *, CASE WHEN OriginalItem IS NULL THEN 1 ELSE 0 END AS changepartition
FROM Item
), cte2 AS (
SELECT *, SUM(changepartition) OVER(ORDER BY ItemID) AS parts
FROM cte1
)
SELECT ItemID, MAX(ItemID) OVER(PARTITION BY parts) AS TerminalItemID
FROM cte2
Check the demo here.
Assumption: Your terminal id items correspond to the "ItemID" value preceding a NULL "OriginalItem" value.
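The three steps can be traced by hand with a small Python sketch. It uses the simplified, neatly ascending rows (ItemID 1 to 7), since the approach depends on that ordering; the variable names are hypothetical.

```python
from itertools import accumulate

# (OriginalItem, ItemID) pairs, simplified ascending sample.
rows = [(None, 1), (1, 2), (2, 3), (3, 4), (None, 5), (5, 6), (None, 7)]
ordered = sorted(rows, key=lambda row: row[1])

# Step 1: flag the first row of each chain.
flags = [1 if original is None else 0 for original, _ in ordered]

# Step 2: a running sum of the flags numbers the partitions.
parts = list(accumulate(flags))

# Step 3: the largest ItemID in each partition is the terminal item.
max_per_part = {}
for (_, item), part in zip(ordered, parts):
    max_per_part[part] = max(item, max_per_part.get(part, item))

mapping = [(item, max_per_part[part]) for (_, item), part in zip(ordered, parts)]
print(mapping)
```

If the real ItemID values are not ascending along each chain, the running sum no longer groups a chain together, which is why the assumption above matters.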
EDIT: "Fixing orphaned records."
The query works correctly when records are not orphaned. The only way to deal with orphaned records is to add the missing records back, so that the query can work correctly on the full data.
This is carried out by an extra subquery (done at the beginning), that will apply a UNION ALL between:
the available records of the original table
the missing records
WITH fix_orphaned_records AS(
SELECT * FROM Item
UNION ALL
SELECT NULL AS OriginalItem,
i1.OriginalItem AS ItemID
FROM Item i1
LEFT JOIN Item i2 ON i1.OriginalItem = i2.ItemID
WHERE i1.OriginalItem IS NOT NULL AND i2.ItemID IS NULL
), cte AS (
...
Missing records correspond to "OriginalItem" values that are never found within the "ItemID" field. A self left join will uncover these missing records.
Check the demo here.
You can use a recursive CTE to compute the last item in the sequence. For example:
with
n (orig_id, curr_id, lvl) as (
select itemid, itemid, 1 from item
union all
select n.orig_id, i.itemid, n.lvl + 1
from n
join item i on i.originalitem = n.curr_id
)
select *
from (
select *, row_number() over(partition by orig_id order by lvl desc) as rn from n
) x
where rn = 1
Result:
orig_id curr_id lvl rn
-------- -------- ---- --
1 4 4 1
2 4 3 1
3 4 2 1
4 4 1 1
5 6 2 1
6 6 1 1
7 7 1 1
See running example at db<>fiddle.

How to process a column that holds a comma-separated or range string values in Oracle

Using Oracle 12c DB, I have the following table data example that I need assistance with using SQL and PL/SQL.
Table data is as follows:
Table Name: my_data
ID ITEM ITEM_LOC
------- ----------- ----------------
1 Item-1 0,1
2 Item-2 0,1,2,3,4,7
3 Item-3 0-48
4 Item-4 0,1,2,3,4,5,6,7,8
5 Item-5 1-33
6 Item-6 0,1
7 Item-7 0,1,5,8
Using the data above within the my_data table, what is the best way to process this ITEM_LOC as I need to use the values in this column as an individual value, i.e:
0,1 means the SQL needs to return either 0 or 1 or
range values, i.e:
0-48 means the SQL needs to return a value between 0 and 48.
The returned values for both scenarios should commence from lowest to highest and can't be re-used once processed.
Based on the above, it would be great to have a function that takes the ID and returns an individual value from ITEM_LOC that hasn't been used, based on my description above. This could be a comma-separated string value or a range string value.
Desired result for ID = 2 could be 7. For this ID = 2, ITEM_LOC = 7 could not be used again.
Desired result for ID = 5 could be 31. For this ID = 5, ITEM_LOC = 31 could not be used again.
For the ITEM_LOC data that could not be used again, against that ID, I am looking at holding another table to hold this or perhaps separate all data into separate rows with a new column called VALUE_USED.
This query shows how to extract list of ITEM_LOC values based on whether they are comma-separated (which means "take exactly those values") or dash-separated (which means "find all values between starting and end point"). I modified your sample data a little bit (didn't feel like displaying ~50 values if 5 of them do the job).
lines #1 - 6 represent sample data.
the first select (lines #7 - 15) splits comma-separated values into rows
the second select (lines #17 - 26) uses a hierarchical query which adds 1 to the starting value, up to item's end value.
SQL> with my_data (id, item, item_loc) as
2 (select 2, 'Item-2', '0,2,4,7' from dual union all
3 select 7, 'Item-7', '0,1,5' from dual union all
4 select 3, 'Item-3', '0-4' from dual union all
5 select 8, 'Item-8', '5-8' from dual
6 )
7 select id,
8 item,
9 regexp_substr(item_loc, '[^,]+', 1, column_value) loc
10 from my_data
11 cross join table(cast(multiset
12 (select level from dual
13 connect by level <= regexp_count(item_loc, ',') + 1
14 ) as sys.odcinumberlist))
15 where instr(item_loc, '-') = 0
16 union all
17 select id,
18 item,
19 to_char(to_number(regexp_substr(item_loc, '^\d+')) + column_value - 1) loc
20 from my_data
21 cross join table(cast(multiset
22 (select level from dual
23 connect by level <= to_number(regexp_substr(item_loc, '\d+$')) -
24 to_number(regexp_substr(item_loc, '^\d+')) + 1
25 ) as sys.odcinumberlist))
26 where instr(item_loc, '-') > 0
27 order by id, item, loc;
ID ITEM LOC
---------- ------ ----------------------------------------
2 Item-2 0
2 Item-2 2
2 Item-2 4
2 Item-2 7
3 Item-3 0
3 Item-3 1
3 Item-3 2
3 Item-3 3
3 Item-3 4
7 Item-7 0
7 Item-7 1
7 Item-7 5
8 Item-8 5
8 Item-8 6
8 Item-8 7
8 Item-8 8
16 rows selected.
SQL>
I don't know what you meant by saying that "item_loc could not be used again". Used where? If you use the above query in, for example, cursor FOR loop, then yes - those values would be used only once as every loop iteration fetches next item_loc value.
As others have said, it's a bad idea to store data in this way. You very likely could have input like this, and you likely could need to display the data like this, but you don't have to store the data the way it is input or displayed.
I'm going to store the data as individual LOC elements based on the input. I assume the data contains only integers separated by commas, or pairs of integers separated by a hyphen. Whitespace is ignored. The comma-separated list does not have to be in any order. In pairs, if the left integer is greater than the right integer I return no LOC element.
create table t as
with input(id, item, item_loc) as (
select 1, 'Item-1', ' 0,1' from dual union all
select 2, 'Item-2', '0,1,2,3,4,7' from dual union all
select 3, 'Item-3', '0-48' from dual union all
select 4, 'Item-4', '0,1,2,3,4,5,6,7,8' from dual union all
select 5, 'Item-5', '1-33' from dual union all
select 6, 'Item-6', '0,1' from dual union all
select 7, 'Item-7', '0,1,5,8,7 - 11' from dual
)
select distinct id, item, loc from input, xmltable(
'let $item := if (contains($X,",")) then ora:tokenize($X,"\,") else $X
for $i in $item
let $j := if (contains($i,"-")) then ora:tokenize($i,"\-") else $i
for $k in xs:int($j[1]) to xs:int($j[count($j)])
return $k'
passing item_loc as X
columns loc number path '.'
);
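The parsing rules stated above (integers separated by commas, pairs separated by a hyphen, whitespace ignored, an empty result when the left value of a pair exceeds the right) can be sketched in Python for quick experimentation. The function name expand_item_loc is hypothetical, and the sketch assumes non-negative integers only.

```python
def expand_item_loc(item_loc):
    """Expand a string such as '0,1,5,8,7 - 11' into sorted distinct integers."""
    values = set()
    for token in item_loc.replace(" ", "").split(","):
        if not token:
            continue  # tolerate stray commas
        if "-" in token:
            low, high = (int(part) for part in token.split("-"))
            # range() is empty when low > high, so such a pair adds nothing.
            values.update(range(low, high + 1))
        else:
            values.add(int(token))
    return sorted(values)


print(expand_item_loc(" 0,1"))
print(expand_item_loc("0,1,5,8,7 - 11"))
```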
Now to "use" an element I just delete it from the table:
delete from t where rowid = (
select min(rowid) keep (dense_rank first order by loc)
from t
where id = 7
);
To return the data in the same format it was input, use MATCH_RECOGNIZE:
select id, item, listagg(item_loc, ',') within group(order by first_loc) item_loc
from t
match_recognize(
partition by id, item order by loc
measures a.loc first_loc,
a.loc || case count(*) when 1 then null else '-'||b.loc end item_loc
pattern (a b*)
define b as loc = prev(loc) + 1
)
group by id, item;
ID ITEM ITEM_LOC
1 Item-1 0-1
2 Item-2 0-4,7
3 Item-3 0-48
4 Item-4 0-8
5 Item-5 1-33
6 Item-6 0-1
7 Item-7 1,5,7-11
Note that the output here will not be exactly like the input, because any consecutive integers will be compressed into a pair.
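The compression of consecutive integers into pairs can also be sketched in Python. This is only an illustration of the grouping rule, not a replacement for the MATCH_RECOGNIZE query; the function name compress_to_ranges is hypothetical.

```python
def compress_to_ranges(values):
    """Render integers as a list where consecutive runs become 'low-high'."""
    parts = []
    run_start = run_end = None
    for value in sorted(set(values)):
        if run_start is None:
            run_start = run_end = value
        elif value == run_end + 1:
            run_end = value  # extend the current run
        else:
            parts.append(_format_run(run_start, run_end))
            run_start = run_end = value
    if run_start is not None:
        parts.append(_format_run(run_start, run_end))
    return ",".join(parts)


def _format_run(low, high):
    # A run of one element is printed as a single number.
    return str(low) if low == high else f"{low}-{high}"


print(compress_to_ranges([0, 1, 2, 3, 4, 7]))
print(compress_to_ranges([1, 5, 7, 8, 9, 10, 11]))
```

As in the query output, a run of just two values such as 0 and 1 is shown as 0-1.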

Oracle , select relation

I have 2 tables in oracle DB: Items and Relationship.
items:
ID
---
1
2
3
4
5
6
7
relationship:
ID parent child
--------------------
1 1 2
2 1 3
3 1 4
4 2 5
5 2 6
6 3 7
In the relationship table, I'm storing the hierarchical structure of the "items" (do not ask why it's stored in different tables).
The question:
When I execute this query:
SELECT PARENT_ID, CHILD_ID, CONNECT_BY_ISLEAF, MAX(LEVEL) OVER () + 1 - LEVEL as rev_level
FROM relationship
CONNECT BY PRIOR PARENT_ID = CHILD_ID
START WITH CHILD_ID = 7;
I do not see the root parent because it doesn't exist in this table as a child.
The question is how I can add the root parent (ID = 1) to the query result, or join it with the "items" table, and keep the result columns (level and isleaf).
CONNECT_BY_ISLEAF works the other way around when using bottom up search (see this link http://technology.amis.nl/2009/11/14/oracle-11gr2-alternative-for-connect_by_isleaf-function-for-recursive-subquery-factoring-dedicated-to-anton/)
I assume that you want to show more data about the item (like its name). If that is the case, just left join the items table.
SELECT PARENT_ID AS PARENT_ID,CHILD_ID, i.name AS CHILD_NAME,
CONNECT_BY_ISLEAF,
MAX(LEVEL) OVER () + 1 - LEVEL AS rev_level
FROM items i
LEFT JOIN relationship r ON (i.id = r.CHILD_ID)
CONNECT BY PRIOR PARENT_ID = CHILD_ID
START WITH CHILD_ID = 7
ORDER BY REV_LEVEL;
check this SQLfiddle: http://sqlfiddle.com/#!4/5c9fa/17
In addition check this post about bottom up searches (http://bitbach.wordpress.com/2010/10/18/implementing-bottom-up-path-traversal-for-hierarchical-tables/)
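For intuition, a bottom-up walk over the parent/child pairs can be sketched in Python. Because it follows each child to its parent until no parent is recorded, the root shows up even though it never appears as a child. The names are hypothetical, and the sketch assumes each child has exactly one parent and the data has no cycles.

```python
# (ID, parent, child) rows from the relationship table in the question.
relationship = [(1, 1, 2), (2, 1, 3), (3, 1, 4),
                (4, 2, 5), (5, 2, 6), (6, 3, 7)]

# Map every child to its parent; the root has no entry.
parent_of = {child: parent for _, parent, child in relationship}


def path_to_root(item):
    """Return the items visited from `item` up to and including the root."""
    path = [item]
    while path[-1] in parent_of:
        path.append(parent_of[path[-1]])
    return path


print(path_to_root(7))
```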
Notice that you have both directions: parent and child.
Pick one and don't mix the two.
with x as (
select 1 as id, 1 as parent, 2 as child from dual union all
select 2, 1, 3 from dual union all
select 3, 1, 4 from dual union all
select 4, 2, 5 from dual union all
select 5, 2, 6 from dual union all
select 6, 3, 7 from dual)
select *
from x
start with child = 7
connect by prior id = child;
ID PARENT CHILD
---------- ---------- ----------
6 3 7
5 2 6
4 2 5
3 1 4
2 1 3
1 1 2
The connection is made by prior id = child, not by prior parent = child.

SQL: Hierarchical query with multiple roots / parents

I have a table describing elements organized in a tree-like structure:
ID, PARENT_ID, NAME
0 null TOP
1 0 A
2 0 B
3 0 C
4 1 AA
5 2 BA
6 3 CA
7 6 CAA
...
There can be many levels in this hierarchy.
Suppose there is a list of elements (say IDs 2 and 3) for which I would like to get all child records from the table.
Something like this:
select *
from MY_TABLE
start with PARENT_ID in (2,3)
connect by PARENT_ID = prior ID
will return:
ID, PARENT_ID, NAME
5 2 BA
6 3 CA
7 6 CAA
However, I want the each output record to be mapped to the original parent from my list (2,3) so that the output would look like this:
ORIGINAL_PARENT_ID, ID, PARENT_ID, NAME
2 5 2 BA
3 6 3 CA
3 7 6 CAA
How can it be done?
connect_by_root may be what you're after?
select t.*, connect_by_root parent_id as ORIGINAL_PARENT_ID
from MY_TABLE t
start with PARENT_ID in (2,3)
connect by PARENT_ID = prior ID;
ID PARENT_ID NAM ORIGINAL_PARENT_ID
---------- ---------- --- ------------------
5 2 BA 2
6 3 CA 3
7 6 CAA 3
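On engines without connect_by_root, the same effect can be had by carrying the starting parent down through a recursive CTE. The sketch below uses Python's sqlite3 module purely as a convenient stand-in (an assumption, not Oracle syntax); the CTE name tree is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table my_table (id int, parent_id int, name text)")
conn.executemany(
    "insert into my_table values (?, ?, ?)",
    [(0, None, "TOP"), (1, 0, "A"), (2, 0, "B"), (3, 0, "C"),
     (4, 1, "AA"), (5, 2, "BA"), (6, 3, "CA"), (7, 6, "CAA")],
)

# The anchor rows remember their own parent_id as original_parent_id;
# every recursive step copies that value down unchanged.
query = """
with recursive tree (original_parent_id, id, parent_id, name) as (
    select parent_id, id, parent_id, name
    from my_table
    where parent_id in (2, 3)
    union all
    select tree.original_parent_id, t.id, t.parent_id, t.name
    from tree join my_table t on t.parent_id = tree.id
)
select original_parent_id, id, parent_id, name from tree order by id
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```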
Assuming your names are really as you have them, then the problem can be done without connect by. You can use simple string manipulation.
with ToFind as (
select 'C' as parent from dual union all
select 'B' as parent from dual
)
select t.*
from t join
ToFind tf
on t.name like tf.parent || '%' and t.name <> tf.parent

Why does CONNECT BY LEVEL on a table return extra rows?

Using CONNECT BY LEVEL seems to return too many rows when performed on a table. What is the logic behind what's happening?
Assuming the following table:
create table a ( id number );
insert into a values (1);
insert into a values (2);
insert into a values (3);
This query returns 12 rows (SQL Fiddle).
select id, level as lvl
from a
connect by level <= 2
order by id, level
That is one row for each row in table A where the column LVL is 1, and three for each row in table A where the column LVL is 2, i.e.:
ID | LVL
---+-----
1 | 1
1 | 2
1 | 2
1 | 2
2 | 1
2 | 2
2 | 2
2 | 2
3 | 1
3 | 2
3 | 2
3 | 2
It is equivalent to this query, which returns the same results.
select id, level as lvl
from dual
cross join a
connect by level <= 2
order by id, level
I don't understand why these queries return 12 rows or why there are three rows where LVL is 2 and only one where LVL is 1 for each value of the ID column.
Increasing the number of levels that are "connected" to 3 returns 13 rows for each value of ID. 1 where LVL is 1, 3 where LVL is 2 and 9 where LVL is 3. This seems to suggest that the rows returned are the number of rows in table A to the power of the value of LVL minus 1.
I would have thought that these queries would be the same as the following, which returns 6 rows:
select id, lvl
from ( select level as lvl
from dual
connect by level <= 2
)
cross join a
order by id, lvl
The documentation isn't particularly clear, to me, in explaining what should occur. What's happening with these powers and why aren't the first two queries the same as the third?
When connect by is used without a start with clause and without the prior operator, there is no restriction on joining a child row to a parent row. In this situation Oracle returns all possible hierarchy permutations, connecting every row to every row of the next level.
select b
, level as lvl
, sys_connect_by_path(b, '->') as ph
from a
connect by level <= 2;
B LVL PH
---------- ---------- ----------
1 1 ->1
1 2 ->1->1
2 2 ->1->2
3 2 ->1->3
2 1 ->2
1 2 ->2->1
2 2 ->2->2
3 2 ->2->3
3 1 ->3
1 2 ->3->1
2 2 ->3->2
3 2 ->3->3
12 rows selected
In the first query, you connect by just the level.
So if level <= 1, you get each of the records 1 time. If level <= 2, then you get each record 1 time (for level 1) plus N times for level 2 (where N is the number of records in the table). It is like you are cross joining, because you're just picking all records from the table until the level is reached, without having other conditions to limit the result. For level <= 3, this is done again for each of those results.
So for 3 records:
Lvl 1: 3 records (all having level 1)
Lvl 2: 3 records having level 1 + 3*3 records having level 2 = 12
Lvl 3: 3 + 3*3 + 3*3*3 = 39 (indeed, 13 records each).
Lvl 4: starting to see a pattern? :)
It's not really a cross join. A cross join would only return those records that have level 2 in this query result, while with this connect by, you get the records having level 1 as well as the records having level 2, thus resulting in 3 + 3*3 instead of just 3*3 records.
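The counting argument can be checked with a short Python sketch that enumerates every path of each length up to the level limit, which mirrors what the unrestricted connect by produces. The function name connect_by_level_rows is hypothetical.

```python
from itertools import product


def connect_by_level_rows(ids, max_level):
    """Enumerate (id, level, path) rows like an unrestricted CONNECT BY LEVEL."""
    rows = []
    for level in range(1, max_level + 1):
        # With no PRIOR condition, every row may follow every row,
        # so each level contributes len(ids) ** level paths.
        for path in product(ids, repeat=level):
            rows.append((path[-1], level, "->" + "->".join(map(str, path))))
    return rows


rows_level_2 = connect_by_level_rows([1, 2, 3], 2)
rows_level_3 = connect_by_level_rows([1, 2, 3], 3)
print(len(rows_level_2), len(rows_level_3))
```

For N rows and a limit of L levels the total is N + N^2 + ... + N^L, which gives 12 and 39 for the three-row table.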
You're comparing apples to oranges when comparing the final query to the others, as the LEVEL in that query is isolated to the 1-row dual table.
Let's consider this query:
select id, level as lvl
from a
connect by level <= 2
order by id, level
What that is saying is: start with the table set (select * from a). Then, for each row returned, connect this row to the prior row. As you have not defined a join in the connect by, this is in effect a Cartesian join, so with the 3 rows (1,2,3) you get 1->2, 1->3, 2->1, 2->3, 3->1 and 3->2, and the rows also join to themselves: 1->1, 2->2 and 3->3. These joins are level=2. So we have 9 joins there, which is why you get 12 rows (3 original "level 1" rows plus the Cartesian set).
So for level <= 2, the number of rows output = rowcount + (rowcount^2).
In the last query you are isolating level to this:
select level as lvl
from dual
connect by level <= 2
This of course returns 2 rows, which are then cross joined to the original 3 rows, giving 6 rows as output.
You can use the technique below to overcome this issue:
select id, level as lvl
from a
left outer join (select level l from dual connect by level <= 2) lev on 1 = 1
order by id