Teradata SQL Reverse Parent Child Hierarchy

I know how to build a hierarchy starting with the root node (i.e. where parent_id is null or something like that), but I can't find anything on how to build a hierarchy upward from the final child/edge node. I'd like to start with a child and build all the way back up to the top. Assume I don't know how many levels, or who the parent is, and we'll have to use SQL to figure it out.
Here is my base table:
old_entity_key,new_entity_key
1,2
2,3
3,4
4,5
5,6
Desired output:
new_entity_key,path
2,1/2
3,1/2/3
4,1/2/3/4
5,1/2/3/4/5
6,1/2/3/4/5/6
This is also acceptable:
new_entity_key,path
2,2/1
3,3/2/1
4,4/3/2/1
5,5/4/3/2/1
6,6/5/4/3/2/1
Here is the CTE I've started with:
with recursive history as (
select
old_entity_key,
new_entity_key,
cast(old_entity_key||'/'||new_entity_key as varchar(1000)) as path
from table
where new_entity_key not in (select old_entity_key from table)
and cast(start_time as date) between current_date - interval '3' day and current_date
union all
select
c.old_entity_key,
c.new_entity_key,
p.new_entity_key||'/'||c.path
from history c
join table p on p.new_entity_key = c.old_entity_key
)
select new_entity_key, old_entity_key, substr(path, 1, instr(path, '/') - 1) as original_entity_key, path
from history s;
The problem with the above query is that it runs forever. I think I've created an infinite loop. I've also tried using the below where filter in the bottom query of the union to try to find the root node, but Teradata gives me an error:
where p.new_entity_key in (select old_entity_key from table)
Any help would be greatly appreciated.

You'll need some sort of counter, and I think your join logic in your CTE doesn't make sense. I threw together a very simple volatile table example:
create volatile table tb
(old_entity_key char(1),
new_entity_key char(1),
rn integer)
on commit preserve rows;
insert into tb values ('1','2',1);
insert into tb values ('2','3',2);
insert into tb values ('3','4',3);
Now we can put together a recursive CTE:
with recursive history as (
select
old_entity_key,
new_entity_key,
cast(old_entity_key||'/'||new_entity_key as varchar(1000)) as path,
rn
from tb t
where
rn = 1
union all
select
t.old_entity_key,
t.new_entity_key,
h.path || '/' || t.new_entity_key,
t.rn
from
tb t
join history h
on t.rn = h.rn + 1
)
select * from history order by rn
The important things here are:
Limit your first pass (accomplished here by rn=1).
The second pass needs to pick up the "next" row, based on the previous row (t.rn = h.rn + 1)
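If no rn counter is available, the same two rules can be applied by joining on the keys themselves. The following is only a minimal sketch, run against SQLite through Python's sqlite3 module rather than Teradata; the table name links is hypothetical, and it assumes the chain contains no cycles (otherwise the recursion would not terminate).

```python
import sqlite3

# Hypothetical in-memory copy of the question's base table, named "links" here.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE links (old_entity_key TEXT, new_entity_key TEXT)")
conn.executemany(
    "INSERT INTO links VALUES (?, ?)",
    [("1", "2"), ("2", "3"), ("3", "4"), ("4", "5"), ("5", "6")],
)

QUERY = """
WITH RECURSIVE history (new_entity_key, path) AS (
    -- First pass: only rows whose old key never appears as a new key,
    -- i.e. the start of a chain.
    SELECT new_entity_key, old_entity_key || '/' || new_entity_key
    FROM links
    WHERE old_entity_key NOT IN (SELECT new_entity_key FROM links)
    UNION ALL
    -- Second pass: the "next" row is the one whose old key equals
    -- the key reached by the previous row.
    SELECT t.new_entity_key, h.path || '/' || t.new_entity_key
    FROM links t
    JOIN history h ON t.old_entity_key = h.new_entity_key
)
SELECT new_entity_key, path FROM history ORDER BY length(path)
"""

paths = conn.execute(QUERY).fetchall()
for key, path in paths:
    print(key, path)
```

With the sample rows from the question this yields one row per new_entity_key, with the path growing from 1/2 up to 1/2/3/4/5/6.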

Related

Substring in a column

I have a column that has several items in which I need to count the times it is called, my column table looks something like this:
Table Example
Id_TR Triggered
-------------- ------------------
A1_6547 R1:23;R2:0;R4:9000
A2_1235 R2:0;R2:100;R3:-100
A3_5436 R1:23;R2:100;R4:9000
A4_1245 R2:0;R5:150
And I would like the result to be like this:
Expected Results
Triggered Count(1)
--------------- --------
R1:23 2
R2:0 3
R2:100 2
R3:-100 1
R4:9000 2
R5:150 1
I've tried using substring functions, but I can't seem to find how to solve this problem. Can anyone help?
This solution is about 3 times faster than the CONNECT BY solution
performance: 15K records per second
with cte (token,suffix)
as
(
select substr(triggered||';',1,instr(triggered||';',';')-1) as token
,substr(triggered||';',instr(triggered||';',';')+1) as suffix
from t
union all
select substr(suffix,1,instr(suffix,';')-1) as token
,substr(suffix,instr(suffix,';')+1) as suffix
from cte
where suffix is not null
)
select token,count(*)
from cte
group by token
;
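The token/suffix recursion above can be tried outside Oracle as well. This is a hedged sketch using SQLite through Python's sqlite3 module; the table t and its columns follow the question, and the only intentional difference is that SQLite returns an empty string (not NULL) when the suffix is exhausted, so the stop condition is written as suffix <> ''.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id_tr TEXT, triggered TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [
        ("A1_6547", "R1:23;R2:0;R4:9000"),
        ("A2_1235", "R2:0;R2:100;R3:-100"),
        ("A3_5436", "R1:23;R2:100;R4:9000"),
        ("A4_1245", "R2:0;R5:150"),
    ],
)

QUERY = """
WITH RECURSIVE cte (token, suffix) AS (
    -- Seed: append a trailing ';' so that every token, including the last
    -- one, is followed by a delimiter.
    SELECT substr(triggered || ';', 1, instr(triggered || ';', ';') - 1),
           substr(triggered || ';', instr(triggered || ';', ';') + 1)
    FROM t
    UNION ALL
    -- Step: peel the next token off the front of the remaining suffix.
    SELECT substr(suffix, 1, instr(suffix, ';') - 1),
           substr(suffix, instr(suffix, ';') + 1)
    FROM cte
    WHERE suffix <> ''
)
SELECT token, count(*) FROM cte GROUP BY token ORDER BY token
"""

counts = conn.execute(QUERY).fetchall()
for token, n in counts:
    print(token, n)
```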
with x as (
select listagg(Triggered, ';') within group (order by Id_TR) str from table
)
select regexp_substr(str,'[^;]+',1,level) element, count(*)
from x
connect by level <= length(regexp_replace(str,'[^;]+')) + 1
group by regexp_substr(str,'[^;]+',1,level);
First concatenate all values of triggered into one list using listagg, then parse the list and group by the parsed elements.
Other methods of parsing a list can be found here or here
This is a fair solution.
performance: 5K records per second
select triggered
,count(*) as cnt
from (select id_tr
,regexp_substr(triggered,'[^;]+',1,level) as triggered
from t
connect by id_tr = prior id_tr
and level <= regexp_count(triggered,';')+1
and prior sys_guid() is not null
) t
group by triggered
;
This is just for learning purposes.
Check my other solutions.
performance: 1K records per second
select x.triggered
,count(*)
from t
,xmltable
(
'/r/x'
passing xmltype('<r><x>' || replace(triggered,';', '</x><x>') || '</x></r>')
columns triggered varchar(100) path '.'
) x
group by x.triggered
;

Get Row's Sequence (Linked-List) in PostgreSQL

I have a submissions table which is essentially a singly linked list. Given the id of a row, I want to return the entire list that row is a part of, in the proper order. For example, in the table below, if I had id 2 I would want to get back rows 1,2,3,4 in that order.
(4,3) -> (3,2) -> (2,1) -> (1,null)
I expect 1,2,3,4 here because 4 is essentially the head of the list that 2 belongs to and I want to traverse all the way through the list.
http://sqlfiddle.com/#!15/c352e/1
Is there a way to do this using postgresql's RECURSIVE CTE? So far I have the following but this will only give me the parents and not the descendants
WITH RECURSIVE "sequence" AS (
SELECT * FROM submissions WHERE "submissions"."id" = 2
UNION ALL SELECT "recursive".* FROM "submissions" "recursive"
INNER JOIN "sequence" ON "recursive"."id" = "sequence"."link_id"
)
SELECT "sequence"."id" FROM "sequence"
This approach uses what you have already come up with.
It adds another block to calculate the rest of the list and then combines both doing a custom reverse ordering.
WITH RECURSIVE pathtobottom AS (
-- Get the path from element to bottom list following next element id that matches current link_id
SELECT 1 i, -- add fake order column to reverse retrieved records
* FROM submissions WHERE submissions.id = 2
UNION ALL
SELECT pathtobottom.i + 1 i, -- add fake order column to reverse retrieved records
recursive.* FROM submissions recursive
INNER JOIN pathtobottom ON recursive.id = pathtobottom.link_id
)
, pathtotop AS (
-- Get the path from element to top list following previous element link_id that matches current id
SELECT 1 i, -- add fake order column to reverse retrieved records
* FROM submissions WHERE submissions.id = 2
UNION ALL
SELECT pathtotop.i + 1 i, -- add fake order column to reverse retrieved records
recursive2.* FROM submissions recursive2
INNER JOIN pathtotop ON recursive2.link_id = pathtotop.id
), pathtotoprev as (
-- Reverse path to top using fake 'i' column
SELECT pathtotop.id FROM pathtotop order by i desc
), pathtobottomrev as (
-- Reverse path to bottom using fake 'i' column
SELECT pathtobottom.id FROM pathtobottom order by i desc
)
-- Elements ordered from bottom to top
SELECT pathtobottomrev.id FROM pathtobottomrev where id != 2 -- remove element to avoid duplicate
UNION ALL
SELECT pathtotop.id FROM pathtotop;
/*
-- Elements ordered from top to bottom
SELECT pathtotoprev.id FROM pathtotoprev
UNION ALL
SELECT pathtobottom.id FROM pathtobottom where id != 2; -- remove element to avoid duplicate
*/
It was yet another quest for my brain. Thanks.
with recursive r as (
select *, array[id] as lst from submissions s where id = 6
union all
select s.*, r.lst || s.id
from
submissions s inner join
r on (s.link_id=r.id or s.id=r.link_id)
where (not array[s.id] <@ r.lst)
)
select * from r;
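For comparison, here is a hedged sketch of the same two-direction walk using SQLite through Python's sqlite3 module instead of PostgreSQL. SQLite has no array type, so instead of a visited list it walks down and up separately and orders by a signed depth; the sample rows are the (id, link_id) pairs from the question, and the list is assumed to be free of cycles.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE submissions (id INTEGER, link_id INTEGER)")
conn.executemany(
    "INSERT INTO submissions VALUES (?, ?)",
    [(4, 3), (3, 2), (2, 1), (1, None)],
)

QUERY = """
WITH RECURSIVE
down (id, link_id, depth) AS (
    -- Follow link_id from the start row towards the tail (link_id IS NULL).
    SELECT id, link_id, 0 FROM submissions WHERE id = :start
    UNION ALL
    SELECT s.id, s.link_id, d.depth - 1
    FROM submissions s JOIN down d ON s.id = d.link_id
),
up (id, depth) AS (
    -- Follow the rows that point at the current row, towards the head.
    SELECT id, 0 FROM submissions WHERE id = :start
    UNION ALL
    SELECT s.id, u.depth + 1
    FROM submissions s JOIN up u ON s.link_id = u.id
)
SELECT id FROM (
    SELECT id, depth FROM down
    UNION  -- plain UNION drops the duplicated start row (depth 0)
    SELECT id, depth FROM up
)
ORDER BY depth
"""

sequence = [row[0] for row in conn.execute(QUERY, {"start": 2})]
print(sequence)
```

Negative depths are rows reached by following link_id and positive depths are rows that link to the start row, so a single ORDER BY depth gives the whole list in order.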

Teradata SQL stack rows per user

Is there a way to stack/group strings per user?
data I have
USER STATES
1 CA
1 AR
1 IN
2 CA
3 CA
3 NY
4 CA
4 AL
4 SD
4 TX
What I need is
USER STATES
1 CA / AR / IN
2 CA
3 CA / NY
4 CA / AL / SD / TX
I tried a cross join and then another cross join, but the query runs out of spool. Thanks!
If Teradata's XML-services are installed there's a function named XMLAGG, which returns a similar result: CA, AR, IN
SELECT user,
TRIM(TRAILING ',' FROM (XMLAGG(TRIM(states)|| ',' /* optionally ORDER BY ...*/) (VARCHAR(10000))))
FROM tab
GROUP BY 1
Btw, using recursion will result in huge spool usage, because you keep all the intermediate rows in spool before returning the final row.
I am not an expert but this should work. You may need to modify it a bit per your exact requirement. Hope this helps!
CREATE VOLATILE TABLE temp AS (
SELECT
USER
,STATES
,ROW_NUMBER() OVER (PARTITION BY USER ORDER BY STATES) AS rn
FROM yourtable
) WITH DATA PRIMARY INDEX(USER) ON COMMIT PRESERVE ROWS;
WITH RECURSIVE rec_test(US,ST, LVL)
AS
(
SELECT USER,STATES (VARCHAR(1000)),1
FROM temp
WHERE rn = 1
UNION ALL
SELECT USER, TRIM(STATES) || ', ' || ST,LVL+1
FROM temp INNER JOIN rec_test
ON USER = US
AND temp.rn = rec_test.lvl+1
)
SELECT US,ST, LVL
FROM rec_test
QUALIFY RANK() OVER(PARTITION BY US ORDER BY LVL DESC) = 1;
Unfortunately there is no GROUP_CONCAT or any string aggregate functions in Teradata (at least none that I'm aware of) so one way to achieve your result would be to use recursion, since you don't know the maximum values of states per user.
For recursion you should use a Volatile Table, as OLAP functions are not allowed in the recursive part. This code is untested (unfortunately I have no way of testing it), so there may be bugs, but it should convey the concept and, with some troubleshooting if needed, give you the expected result.
Replace yourtable in definition of Volatile Table with your real table name.
CREATE VOLATILE TABLE vt AS (
SELECT
user
, states
, ROW_NUMBER() OVER (PARTITION BY user ORDER BY states) AS rn
, COUNT(*) OVER (PARTITION BY user) AS cnt
FROM yourtable
) WITH DATA
UNIQUE PRIMARY INDEX(user, rn)
ON COMMIT PRESERVE ROWS;
WITH RECURSIVE cte (user, list, rn) AS (
SELECT
user
, CAST(states AS VARCHAR(1000)) -- maximum size based on maximum number of rows * length of states
, rn
FROM vt
WHERE rn = cnt -- start with last states row
UNION ALL
SELECT
vt.user
, cte.list || ',' || vt.states
, vt.rn
FROM vt
JOIN cte ON vt.user = cte.user AND vt.rn = cte.rn - 1 -- append a row that is rn-1 of your rows for a given user
)
SELECT user, list
FROM cte
WHERE rn = 1; -- going from last to first, in this condition there should be entire list
This solution isn't perfect: it forces the engine to store intermediate results in a temporary area during query processing. You may encounter a No more spool space error.
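The numbered-rows-plus-recursion idea can be checked on a small scale with the following sketch. It is only an illustration under assumptions: it runs on SQLite (3.25 or later, for window functions) through Python's sqlite3 module rather than Teradata, a temporary table stands in for the volatile table, and the names user_states, user_id, state and vt are hypothetical. It appends from the first row to the last so the list keeps the input order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE user_states (user_id INTEGER, state TEXT)")
conn.executemany(
    "INSERT INTO user_states VALUES (?, ?)",
    [(1, "CA"), (1, "AR"), (1, "IN"), (2, "CA"), (3, "CA"),
     (3, "NY"), (4, "CA"), (4, "AL"), (4, "SD"), (4, "TX")],
)

# Stand-in for the volatile table: number the rows per user up front,
# because window functions cannot be used inside the recursive step.
conn.execute("""
CREATE TEMP TABLE vt AS
SELECT user_id, state,
       ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY rowid) AS rn,
       COUNT(*) OVER (PARTITION BY user_id) AS cnt
FROM user_states
""")

QUERY = """
WITH RECURSIVE agg (user_id, list, rn, cnt) AS (
    -- Seed: the first state of every user.
    SELECT user_id, state, rn, cnt FROM vt WHERE rn = 1
    UNION ALL
    -- Step: append the state whose row number is exactly one higher.
    SELECT vt.user_id, agg.list || ' / ' || vt.state, vt.rn, vt.cnt
    FROM vt JOIN agg ON vt.user_id = agg.user_id AND vt.rn = agg.rn + 1
)
-- Keep only the row that has consumed all states of the user.
SELECT user_id, list FROM agg WHERE rn = cnt ORDER BY user_id
"""

stacked = conn.execute(QUERY).fetchall()
for user_id, states in stacked:
    print(user_id, states)
```

Every intermediate list is materialized before the final filter, which is the same cost the answers above warn about in Teradata.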

How to split and display distinct letters from a word in SQL?

Yesterday in a job interview session I was asked this question and I had no clue about it. Suppose I have the word "Manhattan" and I want to display only the letters 'M','A','N','H','T' in SQL. How can I do it?
Any help is appreciated.
Well, here is my solution (sqlfiddle). It aims to use "relational SQL" operations, which may have been what the interviewer was going for conceptually.
Most of the work done is simply to turn the string into a set of (pos, letter) records as the relevant final applied DQL is a mere SELECT with a grouping and ordering applied.
select letter
from (
-- All of this just to get a set of (pos, letter)
select ns.n as pos, substring(ss.s, ns.n, 1) as letter
from (select 'MANHATTAN' as s) as ss
cross join (
-- Or use another form to create a "numbers table"
select n from (values (1),(2),(3),(4),(5),(6),(7),(8),(9)) as X(n)
) as ns
) as pairs
group by letter -- guarantees distinctness
order by min(pos) -- ensure output is ordered MANHT
The above query works in SQL Server 2008, but the "Numbers Table" may have to be altered for other vendors. Otherwise, there is nothing used that is vendor specific: no CTE, no cross application of a function, and no procedural language code.
That being said, the above is to show a conceptual approach - SQL is designed for use with sets and relations and multiplicity across records; the above example is, in some sense, merely a perversion of such.
Examining the intermediate relation,
select ns.n as pos, substring(ss.s, ns.n, 1) as letter
from (select 'MANHATTAN' as s) as ss
cross join (
select n from (values (1),(2),(3),(4),(5),(6),(7),(8),(9)) as X(n)
) as ns
uses a cross join to generate the Cartesian product of the string (1 row) with the numbers (9 rows); the substring function is then applied with the string and each number to obtain each character in accordance with its position. The resulting set contains the records-
POS LETTER
1 M
2 A
3 N
..
9 N
Then the outer select groups the records by letter, and the resulting records are ordered by the minimum (first) occurrence position of the letter that establishes each group. (Without the order by, the letters would still be distinct, but the final order would not be guaranteed.)
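The (pos, letter) approach can be reproduced with a short sketch on SQLite through Python's sqlite3 module. This is an adaptation, not the SQL Server query itself: the numbers table is generated recursively from the string length instead of a fixed VALUES list, and the CTE names src and nums are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

QUERY = """
WITH RECURSIVE
src (s) AS (SELECT 'MANHATTAN'),
nums (n) AS (
    -- Numbers table 1..length(s), generated instead of hard-coded.
    SELECT 1
    UNION ALL
    SELECT n + 1 FROM nums, src WHERE n < length(s)
)
SELECT substr(s, n, 1) AS letter   -- one (pos, letter) pair per number
FROM src CROSS JOIN nums
GROUP BY letter                    -- guarantees distinctness
ORDER BY min(n)                    -- first occurrence decides the order
"""

letters = [row[0] for row in conn.execute(QUERY)]
print(letters)
```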
One way (if using SQL Server) is with a recursive CTE (Common Table Expression).
DECLARE @source nvarchar(100) = 'MANHATTAN'
;
WITH cte AS (
SELECT SUBSTRING(@source, 1, 1) AS c1, 1 as Pos
WHERE LEN(@source) > 0
UNION ALL
SELECT SUBSTRING(@source, Pos + 1, 1) AS c1, Pos + 1 as Pos
FROM cte
WHERE Pos < LEN(@source)
)
SELECT DISTINCT c1 from cte
SqlFiddle for this is here. I had to inline @source for SqlFiddle, but the code above works fine in SQL Server.
The first SELECT generates the initial row (in this case 'M', 1). The second SELECT is the recursive part that generates the subsequent rows, with the Pos column incremented each time until the condition WHERE Pos < LEN(@source) no longer holds. The final select removes the duplicates. Internally, SELECT DISTINCT may sort the rows in order to remove duplicates, which is why the final output happens to be in alphabetical order; without an ORDER BY that order is not guaranteed. Since you didn't specify order as a requirement, I left it as-is. But you could modify it to use a GROUP BY instead, ordered on MIN(Pos), if you needed the output in the characters' original order.
This same technique can be used for things like generating all the Bigrams for a string, with just a small change to the general structure above.
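As a rough illustration of that small change, the sketch below takes two characters per step instead of one. It runs on SQLite through Python's sqlite3 module rather than SQL Server, and the named parameter :s plays the role of the source variable; treat it as one possible shape of the idea, not the author's code.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

QUERY = """
WITH RECURSIVE cte (gram, pos) AS (
    -- Seed: the first two characters, if the string is long enough.
    SELECT substr(:s, 1, 2), 1 WHERE length(:s) >= 2
    UNION ALL
    -- Step: slide the two-character window one position to the right.
    SELECT substr(:s, pos + 1, 2), pos + 1
    FROM cte
    WHERE pos + 1 < length(:s)
)
SELECT gram FROM cte ORDER BY pos
"""

bigrams = [row[0] for row in conn.execute(QUERY, {"s": "MANHATTAN"})]
print(bigrams)
```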
declare @charr varchar(99)
declare @lp int
set @charr='Manhattan'
set @lp=1
DECLARE @T1 TABLE (
FLD VARCHAR(max)
)
while(@lp<=LEN(@charr))
begin
if(not exists(select * from @T1 where FLD=(select SUBSTRING(@charr,@lp,1))))
begin
insert into @T1
select SUBSTRING(@charr,@lp,1)
end
set @lp=@lp+1
end
select * from @T1
Check this; it may help you.
Here's an Oracle version of @user2864740's answer. The only difference is how you construct the "numbers table" (plus slight differences in aliasing).
select letter
from (
select ns.n as pos, substr(ss.s, ns.n, 1) as letter
from (select 'MANHATTAN' as s from dual) ss
cross join (
SELECT LEVEL as n
FROM DUAL
CONNECT BY LEVEL <= 9
ORDER BY LEVEL) ns
) pairs
group by letter
order by min(pos)

Ordering a SQL query based on the value in a column determining the value of another column in the next row

My table looks like this:
Value Previous Next
37 NULL 42
42 37 3
3 42 79
79 3 NULL
Except that the table is all out of order. (There are no duplicates, so that is not an issue.) I was wondering if there is any way to write a query that would order the output as shown above, basically saying "next row's 'value' = this row's 'next'".
I have no control over the database or how this data is stored. I am just trying to retrieve it and organize it. It is SQL Server, 2008 I believe.
I realize that this wouldn't be difficult to reorganize afterwards, but I was just curious if I could write a query that just did that out of the box so I wouldn't have to worry about it.
This should do what you need:
WITH CTE AS (
SELECT YourTable.*, 0 Depth
FROM YourTable
WHERE Previous IS NULL
UNION ALL
SELECT YourTable.*, Depth + 1
FROM YourTable JOIN CTE
ON YourTable.Value = CTE.Next
)
SELECT * FROM CTE
ORDER BY Depth;
[SQL Fiddle] (Referential integrity and indexes omitted for brevity.)
We use a recursive common table expression (CTE) to travel from the head of the list (WHERE Previous IS NULL) to the trailing nodes (ON YourTable.Value = CTE.Next) and at the same time memorize the depth of the recursion that was needed to reach the current node (in Depth).
In the end, we simply sort by the depth of recursion that was needed to reach each of the nodes (ORDER BY Depth).
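The depth-based ordering can be checked with a small sketch on SQLite through Python's sqlite3 module instead of SQL Server. The table name nodes and the column names val, prev_val and next_val are stand-ins chosen for the example, and the rows are inserted out of order on purpose.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE nodes (val INTEGER, prev_val INTEGER, next_val INTEGER)"
)
# Deliberately shuffled copy of the rows from the question.
conn.executemany(
    "INSERT INTO nodes VALUES (?, ?, ?)",
    [(3, 42, 79), (79, 3, None), (37, None, 42), (42, 37, 3)],
)

QUERY = """
WITH RECURSIVE chain (val, next_val, depth) AS (
    -- Head of the list: the row without a predecessor.
    SELECT val, next_val, 0 FROM nodes WHERE prev_val IS NULL
    UNION ALL
    -- Each step moves to the row named by the previous row's next_val
    -- and records how many steps were needed to get there.
    SELECT n.val, n.next_val, c.depth + 1
    FROM nodes n JOIN chain c ON n.val = c.next_val
)
SELECT val FROM chain ORDER BY depth
"""

ordered = [row[0] for row in conn.execute(QUERY)]
print(ordered)
```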
Use a recursive query. With the one I list here you can have multiple paths along your linked list:
with cte (Value, Previous, Next, Level)
as
(
select Value, Previous, Next, 0 as Level
from data
where Previous is null
union all
select d.Value, d.Previous, d.Next, Level + 1
from data d
inner join cte c on d.Previous = c.Value
)
select * from cte
fiddle here
If you are using Oracle, try START WITH ... CONNECT BY:
select ... start with initial-condition connect by
nocycle recursive-condition;
EDIT: For SQL-Server, use WITH syntax as below:
WITH rec(value, previous, next) AS
(SELECT value, previous, next
FROM table1
WHERE previous is null
UNION ALL
SELECT nextRec.value, nextRec.previous, nextRec.next
FROM table1 as nextRec, rec
WHERE rec.next = nextRec.value)
SELECT value, previous, next FROM rec;
One way to do this is with a join:
select t.*
from t left outer join
t tnext
on t.next = tnext.value
order by tnext.value
However, won't this do?
select t.*
from t
order by t.next
Something like this should work:
With Parent As (
Select
Value,
Previous,
Next
From
table
Where
Previous Is Null
Union All
Select
t.Value,
t.Previous,
t.Next
From
table t
Inner Join
Parent
On Parent.Next = t.Value
)
Select
*
From
Parent
Example