Recursion - chaining data in sql - stuck - sql

I have different tables and the goal is to obtain the approval workflow for every customer
Customers have different approval workflows, take a look at this:
In my table "entities" i have this
(12, 'Math Andrew', 308, 'CHAIN1-MathAndrew').
It means that when the row was created the number 12 was assigned to Math Andrew. 308 is the number that says that Math Andrew is a CLIENT.
Table type_entities
(308,'CLIENT'),
(309,'APPROVER1'),
(310,'APPROVER2'),
(311,'APPROVER3'),
(312,'J3 APPROVER4'),
(313,'J4 APPROVER4'),
(314,'J5 APPROVER4'),
(315, 'J6 APPROVER4'),
(316,'J7 APPROVER4');
Because Math Andrew is a CLIENT (also known as CUSTOMER) he must be linked to one or more APPROVERS
A client could have 1 APPROVER, OR 2 APPROVERS OR 3 APPROVERS OR 4 APPROVERS, there exist different approvers inside entities table:
(18, 'ZATCH', 309, null),
(19, 'MAX', 309, null),
(20, 'Ger',310, null),
(21, 'Mar',310, null),
(22, 'Maxwell',311, null),
(23, 'Ryan',312, null),
(24, 'Juy',313, null),
(25, 'Angel',314, null),
(26, 'John',315, null);
Types of relations between entities:
(444,'J6 CLIENT-APPROVER4'),
(445,'J3 CLIENT-APPROVER4'),
(446,'J4 CLIENT-APPROVER4'),
(447,'J10 CLIENT-APPROVER4'),
(448,'J4 CLIENT-APPROVER4'),
(449,'J5 CLIENT-APPROVER4'),
(450,'J10 CLIENT-APPROVER4'),
(451,'J3 CLIENT-APPROVER4'),
(452,'J8 CLIENT-APPROVER4'),
(453,'J5 CLIENT-APPROVER4'),
(454,'J6 CLIENT-APPROVER4'),
(455,'J7 CLIENT-APPROVER4'),
(456,'J7 CLIENT-APPROVER4'),
(457,'J8 CLIENT-APPROVER4'),
(458,'CLIENT-APPROVER3'),
(459,'CLIENT-APPROVER1'),
(460,'APPROVER1-APPROVER2'),
(461,'APPROVER1-APPROVER3'),
(462,'J3 APPROVER1-APPROVER4'),
(463,'APPROVER2-APPROVER3'),
(464,'J3 APPROVER3-APPROVER4'),
(465,'J4 APPROVER3-APPROVER4'),
(466,'J5 APPROVER3-APPROVER4'),
(467,'J6 APPROVER3-APPROVER4'),
(468,'J7 APPROVER3-APPROVER4'),
(469,'J8 APPROVER3-APPROVER4'),
(470,'J10 APPROVER3-APPROVER4'),
(471,'CLIENT-APPROVER2');
This is the important part: when a client is linked to one approver, a relation is created inside relationships table.
In this case MathAndrew was linked to approver #18 (ZATCH). This row was created after the assignment:
(787,459,'CHAIN1-MathAndrew',18)
787 is the number that was assigned when that row was created, 459 represents the relation CLIENT - APPROVER, 'CHAIN1-MathAndrew' is the client and 18 is the approver.
Also, in this case APPROVER1 was linked to APPROVER2
(788,460,18,20)
Then, APPROVER2 was linked to APPROVER3
(789,463,20,21)
Finally, APPROVER3 was linked to APPROVER4
(790,467,21,26)
I WANT TO OBTAIN THE COMPLETE APPROVAL WORKFLOW CHAIN, I mean this:
CHAIN1-MathAndrew-ZATCH-Ger-Mar-John
I did this but i am not getting what i want:
WITH relationships_CTE as
select description_entity_1,description_entitiy_2
from relationships
where description_entitiy_1 like 'CHAIN1-MathAndrew'
UNION ALL
select description_entity_1,description_entitiy_2
from relationships
where relationships.description_entitiy_2 = relationships_CTE.description_entitiy_2
select *
from relationships_CTE ma
left join relationships_CTE na
This is my SQL FIDDLE:
http://sqlfiddle.com/#!9/51bb39/4
Could you please help me?

So you have a couple of major issues with your demo. Firstly, you are trying to use a CTE on a version of MySQL that doesn't support it (CTE support was introduced in MySQL version 8). Secondly, you are trying to insert a string into a column in the relationships table (which should have been left as a reference to the entities table). Having corrected those issues, we can look at the CTE. There you have a syntax error because you have not enclosed your CTE query in (), and you have also failed to declare the CTE as recursive (it refers to itself).
Now, based on your question, you want to get names out of the entities table to correspond to the values in the relationships table. So we start the CTE by finding the appropriate entities.id value for CHAIN1-MathAndrew, and then in the recursive part of the CTE we loop through all the entities that are related to that entity, grabbing the names as we go. This gives us this query:
WITH recursive relationships_CTE as (
select e.id, e.description AS name
from entities e
where e.description like 'CHAIN1-MathAndrew'
UNION ALL
select r.description_entitiy_2, e.name
from relationships_CTE cte
left join relationships r
on r.description_entitiy_1 = cte.id
join entities e ON r.description_entitiy_2 = e.id
)
If we now
select *
from relationships_CTE
we get
id name
12 CHAIN1-MathAndrew
18 ZATCH
20 Ger
21 Mar
26 John
or we can use GROUP_CONCAT to string those names together:
select group_concat(name separator '-')
from relationships_CTE
Output:
CHAIN1-MathAndrew-ZATCH-Ger-Mar-John
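For readers who want to try the idea without a MySQL 8 server, here is a minimal sketch using Python's built-in sqlite3 module. The schema is a simplified assumption (the column names entity_1, entity_2 and type_id are hypothetical, not the fiddle's exact names), and the depth column is an addition so that the names can be ordered explicitly, since concatenation order is otherwise not guaranteed.

```python
import sqlite3

# Hypothetical, simplified schema: column names are assumptions for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE entities (id INTEGER PRIMARY KEY, name TEXT, type_id INTEGER, description TEXT);
CREATE TABLE relationships (id INTEGER PRIMARY KEY, type_id INTEGER, entity_1 INTEGER, entity_2 INTEGER);
INSERT INTO entities VALUES
  (12, 'Math Andrew', 308, 'CHAIN1-MathAndrew'),
  (18, 'ZATCH', 309, NULL), (19, 'MAX', 309, NULL),
  (20, 'Ger', 310, NULL), (21, 'Mar', 310, NULL), (26, 'John', 315, NULL);
INSERT INTO relationships VALUES
  (787, 459, 12, 18), (788, 460, 18, 20), (789, 463, 20, 21), (790, 467, 21, 26);
""")

def approval_chain(conn, chain_description):
    """Follow entity_1 -> entity_2 links starting from the client row."""
    rows = conn.execute("""
        WITH RECURSIVE chain(id, name, depth) AS (
            -- anchor: the client row, identified by its chain description
            SELECT e.id, e.description, 0
            FROM entities e
            WHERE e.description = ?
            UNION ALL
            -- recursive step: move to the entity the current one is linked to
            SELECT e.id, e.name, c.depth + 1
            FROM chain c
            JOIN relationships r ON r.entity_1 = c.id
            JOIN entities e ON e.id = r.entity_2
        )
        SELECT name FROM chain ORDER BY depth
    """, (chain_description,)).fetchall()
    return "-".join(name for (name,) in rows)

chain = approval_chain(conn, "CHAIN1-MathAndrew")
print(chain)
```

Joining the names in Python after an explicit ORDER BY depth is a design choice of this sketch; it avoids relying on the order in which a group-concatenation function happens to see the rows.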

How to use LIKE with ORDER BY CASE for partial text search in WHEN statement?

How do you use LIKE with ORDER BY CASE in SQL? Or in other words, how do you do a partial text search for the following:
ORDER BY CASE [Column Name] WHEN [value partial text%]
Problem: I'm referencing a table (named "Personnel") with column (titled "Rank"), which lists each employee's job title followed by their level of certification (many variables). I would like to order the SQL query results by job title, ignoring the certification level that follows title name.
Example values in Personnel.Rank Column:
Captain Paramedic
Captain Intermediate
Captain EMT
Lieutenant Paramedic
Lieutenant Intermediate
Lieutenant EMT
Apparatus Operator Paramedic
Firefighter EMT
Firefighter AEMT
This works, but I don't want to list every variable as a WHEN clause:
SELECT
p.Rank
FROM Personnel p
ORDER BY
CASE p.Rank
WHEN 'Captain Paramedic' THEN 1
WHEN 'Captain EMT' THEN 1
WHEN 'Lieutenant Paramedic' THEN 2
WHEN 'Lieutenant EMT' THEN 2
ELSE 3
END
I would like to know how to do something like this instead:
SELECT
p.Rank
FROM Personnel p
ORDER BY
CASE p.Rank
WHEN LIKE 'Captain%' THEN 1
WHEN LIKE 'Lieutenant%' THEN 2
ELSE 3
END
Thoughts?
LIKE operator is not permitted with ORDER BY CASE [column name] WHEN statement
It is possible you can just fix up the syntax and still use your case statement. Untested but the syntax I would expect is:
SELECT
p.Rank
FROM Personnel p
ORDER BY
CASE
WHEN p.Rank LIKE 'Captain%' THEN 1
WHEN p.Rank LIKE 'Lieutenant%' THEN 2
ELSE 3
END
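Since the query above is marked as untested, here is a small sketch that exercises the same searched CASE form against an in-memory SQLite database through Python's sqlite3 module. The sample rows and the secondary sort on the rank text are assumptions added to make the output deterministic; other database engines may differ in LIKE case sensitivity.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE Personnel ("Rank" TEXT)')
conn.executemany(
    "INSERT INTO Personnel VALUES (?)",
    [(r,) for r in [
        "Firefighter EMT", "Lieutenant EMT", "Captain Paramedic",
        "Apparatus Operator Paramedic", "Captain EMT", "Lieutenant Paramedic",
    ]],
)

# Searched CASE: each WHEN carries its own boolean condition, so LIKE is allowed.
ordered = [rank for (rank,) in conn.execute("""
    SELECT p."Rank"
    FROM Personnel p
    ORDER BY
        CASE
            WHEN p."Rank" LIKE 'Captain%' THEN 1
            WHEN p."Rank" LIKE 'Lieutenant%' THEN 2
            ELSE 3
        END,
        p."Rank"  -- tie-breaker (an addition in this sketch)
""")]
print(ordered)
```

The difference from the failing attempt is that `CASE p.Rank WHEN ...` compares the column for equality with each WHEN value, whereas `CASE WHEN <condition>` evaluates an arbitrary condition per branch.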
A more general purpose solution would be to use a reference or lookup table to get your order by values. Here is an example written in PSEUDOSQL just to show the idea. In real life, you can create your table or use a temp table or a CTE. The pro here is you can maintain this a little more cleanly (in your ref table - or SortOrder can be a field right in your Personnel table). The con here is you have "promoted" a simple ordering problem into something more permanent and if this is only an ad hoc need for an ad hoc query then it might be overkill.
create temporary table SortOrder (Rank, SortOrder)
insert into SortOrder
values
('Captain Paramedic', 10),
('Captain Intermediate', 10),
('Captain EMT', 10),
('Lieutenant Paramedic', 20),
('Lieutenant Intermediate', 20),
('Lieutenant EMT', 20),
('Apparatus Operator Paramedic', 30),
('Firefighter EMT', 40),
('Firefighter AEMT', 40)
SELECT
p.Rank
FROM
Personnel p
LEFT JOIN SortOrder s
ON p.Rank = s.Rank
ORDER BY
COALESCE(s.SortOrder, 100)
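The lookup-table idea can be tried end to end with the following sketch, again using Python's sqlite3 module. The 'Chief' row is a hypothetical rank added to show what happens when a value is missing from the lookup table, and the trailing sort on the rank text is only there to make ties deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Personnel ("Rank" TEXT);
CREATE TABLE SortOrder ("Rank" TEXT PRIMARY KEY, SortOrder INTEGER);
INSERT INTO Personnel VALUES
  ('Firefighter EMT'), ('Chief'), ('Captain EMT'),
  ('Lieutenant Paramedic'), ('Apparatus Operator Paramedic');
INSERT INTO SortOrder VALUES
  ('Captain Paramedic', 10), ('Captain EMT', 10),
  ('Lieutenant Paramedic', 20), ('Lieutenant EMT', 20),
  ('Apparatus Operator Paramedic', 30),
  ('Firefighter EMT', 40), ('Firefighter AEMT', 40);
""")

# LEFT JOIN keeps personnel whose rank has no lookup row;
# COALESCE pushes those rows to the end with a default weight of 100.
ordered = [rank for (rank,) in conn.execute("""
    SELECT p."Rank"
    FROM Personnel p
    LEFT JOIN SortOrder s ON p."Rank" = s."Rank"
    ORDER BY COALESCE(s.SortOrder, 100), p."Rank"
""")]
print(ordered)
```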

Create a hardcoded "mapping table" in Trino SQL

I have a query (several CTEs) that gets data from different sources. The output has a column name, but I would like to map this name to a more user-friendly name.
Id name
1 buy
2 send
3 paid
I would like to hard code somewhere in the query (in another CTE?) a mapping table. Don't want to create a separate table for it, just plain text.
name_map=[('buy', 'Item purchased'),('send', 'Parcel in transit'), ('paid', 'Charge processed')]
So output table would be:
Id name
1 Item purchased
2 Parcel in transit
3 Charge processed
In Trino I see the function map_from_entries and element_at, but don't know if they could work in this case.
I know "case when" might work, but if possible, a mapping table would be more convenient.
Thanks
As a simpler alternative to the other answer, you don't actually need to create an intermediate map using map_from_entries and look up values using element_at. You can just create an inline mapping table with VALUES and use a regular JOIN to do the lookups:
WITH mapping(name, description) AS (
VALUES
('buy', 'Item purchased'),
('send', 'Parcel in transit'),
('paid', 'Charge processed')
)
SELECT description
FROM t JOIN mapping ON t.name = mapping.name
(The query assumes your data is in a table named t that contains a column named name to use for the lookup)
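The same inline-mapping pattern can be tried locally. The sketch below runs it on SQLite through Python's sqlite3 module rather than on Trino, so it only illustrates the shape of the query; the table t is replaced by a second inline VALUES CTE, which is an assumption made to keep the example self-contained.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

rows = conn.execute("""
    WITH t(id, name) AS (
        -- stand-in for the real source data (assumption)
        VALUES (1, 'buy'), (2, 'send'), (3, 'paid')
    ),
    mapping(name, description) AS (
        -- the hard-coded mapping table
        VALUES
            ('buy',  'Item purchased'),
            ('send', 'Parcel in transit'),
            ('paid', 'Charge processed')
    )
    SELECT t.id, mapping.description
    FROM t
    JOIN mapping ON t.name = mapping.name
    ORDER BY t.id
""").fetchall()
print(rows)
```

An inner join drops source rows whose name has no mapping entry; a LEFT JOIN with COALESCE(mapping.description, t.name) would keep them with their original name instead.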
Super interesting idea, and I think I got it working:
with tmp as (
SELECT *
FROM (VALUES ('1', 'buy'),
('2', 'send'),
('3', 'paid')) as t(id, name)
)
SELECT element_at(name_map, name) as name
FROM tmp
JOIN (VALUES map_from_entries(
ARRAY[('buy', 'Item purchased'),
('send', 'Parcel in transit'),
('paid', 'Charge processed')])) as t(name_map) ON TRUE
Output:
name
Item purchased
Parcel in transit
Charge processed
To see a bit more of what's happening, we can look at:
SELECT *, element_at(name_map, name) as name
id | name | name_map | name
1 | buy | {buy=Item purchased, paid=Charge processed, send=Parcel in transit} | Item purchased
2 | send | {buy=Item purchased, paid=Charge processed, send=Parcel in transit} | Parcel in transit
3 | paid | {buy=Item purchased, paid=Charge processed, send=Parcel in transit} | Charge processed
I'm not sure how efficient this is, but it's certainly an interesting idea.

Recursive query to find previous related row meeting criteria

I have a database full of messages from various chatbots. The chatbots all
follow decision tree format and ultimately are questions presented with choices
to which the user responds.
The bot may send a message (Hello would you like A or B?) which has options
attached, A and B for example. The user responds B. Both of these messages are
recorded and the previous message id attached.
id | message | options | previous_id
1 | Hello would you like A or B? | A,B |
2 | A | | 1
The structure of these conversations is not fixed. There may be various forms
of message flow. The above is a simplistic example of how the messages are
chained together. For example
// text question on same message as options, with preceding unrelated messages
Hello -> My name is Ben the bot. -> How are you today? (good, bad) -> [good]
// text question not on same message as options
Pick your favourite colour -> {picture of blue and red} (blue, red) -> [blue]
// no question just option prompt - here preceding text wasn't a question
[red] -> (ferrari, lamborghini) -> [ferrari]
-> denotes separation of messages
[] denotes reply to bot from user
() denotes options attached to messages
{} denotes attachments
What I am trying to get from this data is a row for every question with its
corresponding answer. The problem I'm facing is the (presumable) recursion I'd
have to use to retrieve the previous message each time until it met criteria
indicating it's gone back far enough for that particular answer in the chain of
messages.
In theory what I am trying to achieve is
1. Find all answers to questions
2. From those results look at the previous message
2a. If previous message has text and is not an answer itself then use said text and stop recursing
2b. Else move onto the next previous message until the criteria is met.
3. Return rows containing answer/response, with question and other columns from question row (id, timestamp for example)
This would leave me with lots of rows containing a message and a response
in the following dataset for example,
id | message | other | previous_id
1 | Hello would you like A or B? | |
2 | B | | 1
3 | Hello would you like A or B? | |
4 | A | | 3
5 | Hello would you like A or B? | |
6 | B | | 5
7 | A is a great answer. C or D? | | 4
8 | D | | 7
9 | Green or red? | |
10 | | image | 9
11 | Red | | 10
I'd hope to end up with
id | message | response
1 | Hello would you like A or B? | B
3 | Hello would you like A or B? | A
5 | Hello would you like A or B? | B
7 | A is a great answer. C or D? | D
9 | Green or red? | Red
I have made a (somewhat) simplified version of some sample data which is at the bottom of this question for reference/use.
It uses the following structure
WITH data ( id, message, node, options, previous, attachment) AS ()
Answers can be found with select where node is null so I assumed that is the
best starting point and I can work backwards towards the question. previous
and options are json columns because that's how they are in the real data so
I left them as they were.
I have tried various means by which to get the data as I wanted but I haven't managed the recursion/unknown number of levels bit.
For example, this attempt can dig two levels deep but I couldn't coalesce the
id of the message i found because obviously both have non null values.
select COALESCE(d2.message, d3.message) as question, d.message as answer
-- select COALESCE(d2.message, d2.attachment, d3.message, d3.attachment) as question, d.message as answer
from data as d
left join data as d2 on (d.previous->>'id')::int = d2.id
left join data as d3 on (d2.previous->>'id')::int = d3.id
where d.previous->>'node' in (
SELECT node from data where options is not null group by node
)
I believe this answer https://dba.stackexchange.com/a/215125/4660 may be the
path to what I need but I've thus far been unable to get it to run as I'd like.
I think this would allow me to replace the two left joins in my above example
with say a recursive union which i can use conditions on the on clause to stop
it at the right point. Hopefully this sounds like it might be along the right
lines and someone can point me in the right direction. Something like the below
perhaps?
WITH data (
id,
message,
node,
options,
previous,
attachment
) AS (
VALUES ...
), RecursiveTable as (
select * from data d where node is null # all answers?
union all
select * from RecursiveTable where ??
)
select * from RecursiveTable
--
Basic sample dataset
WITH data (
id,
message,
node,
options,
previous,
attachment
) AS (
VALUES
-- QUESTION TYPE 1
-- pineapple questions
(1, 'Pineapple on pizza?', 'pineapple', '["Yes","No"]'::json, null::json, null),
(2, 'Pineapple on pizza?', 'pineapple', '["Yes","No"]'::json, null::json, null),
(3, 'Pineapple on pizza?', 'pineapple', '["Yes","No"]'::json, null::json, null),
(4, 'Pineapple on pizza?', 'pineapple', '["Yes","No"]'::json, null::json, null),
(5, 'Pineapple on pizza?', 'pineapple', '["Yes","No"]'::json, null::json, null),
-- pineapple answers
(6, 'No', null, null, '{"id": 1, "node": "pineapple"}'::json, null),
(7, 'Yes', null, null, '{"id": 2, "node": "pineapple"}'::json, null),
(8, 'No', null, null, '{"id": 3, "node": "pineapple"}'::json, null),
(9, 'Yes', null, null, '{"id": 4, "node": "pineapple"}'::json, null),
(10, 'No', null, null, '{"id": 5, "node": "pineapple"}'::json, null),
-- ----------------------------
-- QUESTION TYPE 2 - Previous message, then question with text + options followed by answer
--- previous messages to stuffed crust questions (we don't care about
--these but they're here to ensure we aren't accidentally getting them
--as the question in results)
(11, 'Hello', 'hello_pre_stuffed_crust', null, null::json, null),
(12, 'Hello', 'hello_pre_stuffed_crust', null, null::json, null),
(13, 'Hello', 'hello_pre_stuffed_crust', null, null::json, null),
-- stuffed crust questions
(14, 'Stuffed crust?', 'stuffed_crust', '["Crunchy crust","More cheese!"]'::json, '{"id": 11, "node": "hello_pre_stuffed_crust"}'::json, null),
(15, 'Stuffed crust?', 'stuffed_crust', '["Crunchy crust","More cheese!"]'::json, '{"id": 12, "node": "hello_pre_stuffed_crust"}'::json, null),
(16, 'Stuffed crust?', 'stuffed_crust', '["Crunchy crust","More cheese!"]'::json, '{"id": 13, "node": "hello_pre_stuffed_crust"}'::json, null),
-- stuffed crust answers
(17, 'More cheese!', null, null, '{"id": 14, "node": "stuffed_crust"}'::json, null),
(18, 'Crunchy crust', null, null, '{"id": 15, "node": "stuffed_crust"}'::json, null),
(19, 'Crunchy crust', null, null, '{"id": 16, "node": "stuffed_crust"}'::json, null),
-- ----------------------------
-- QUESTION TYPE 3
-- two part question, no text with options only image, should get text from previous
-- part 1
(20, 'What do you think of this pizza?', 'check_this_image', null, null::json, null),
(21, 'What do you think of this pizza?', 'check_this_image', null, null::json, null),
(22, 'What do you think of this pizza?', 'check_this_image', null, null::json, null),
-- part two
(23, null, 'image', '["Looks amazing!","Not my cup of tea"]'::json, '{"id": 20, "node": "check_this_image"}'::json, 'https://images.unsplash.com/photo-1544982503-9f984c14501a'),
(24, null, 'image', '["Looks amazing!","Not my cup of tea"]'::json, '{"id": 21, "node": "check_this_image"}'::json, 'https://images.unsplash.com/photo-1544982503-9f984c14501a'),
(25, null, 'image', '["Looks amazing!","Not my cup of tea"]'::json, '{"id": 22, "node": "check_this_image"}'::json, 'https://images.unsplash.com/photo-1544982503-9f984c14501a'),
-- two part answers
(26, 'Looks amazing!', null, null, '{"id": 23, "node": "image"}'::json, null),
(27, 'Not my cup of tea', null, null, '{"id": 24, "node": "image"}'::json, null),
(28, 'Looks amazing!', null, null, '{"id": 25, "node": "image"}'::json, null),
-- ----------------------------
-- QUESTION TYPE 4
-- no text, just options straight after responding to something else - options for text value would be options, or image
-- directly after question 3 was answered, previous message was user message - but we don't have text here - just an image and options
(29, null, 'which_brand', '["Dominos","Papa Johns"]'::json, '{"id": 27}'::json, 'https://peakstudentmediadotcom.files.wordpress.com/2018/11/vs.jpg'),
(30, null, 'which_brand', '["Dominos","Papa Johns"]'::json, '{"id": 28}'::json, 'https://peakstudentmediadotcom.files.wordpress.com/2018/11/vs.jpg'),
(31, null, 'which_brand', '["Dominos","Papa Johns"]'::json, '{"id": 29}'::json, 'https://peakstudentmediadotcom.files.wordpress.com/2018/11/vs.jpg')
)
SELECT * from data
You can use WITH RECURSIVE to achieve your goal. You just need to specify when to stop the recursion and find a way to select only those records for which the recursion did not produce any additional rows.
Have a look here:
WITH RECURSIVE comp (
id, message, node, options, previous, attachment,
id2, message2, node2, options2, previous2, attachment2,
rec_depth
) AS (
SELECT
t.id, t.message, t.node, t.options, t.previous, t.attachment,
null::integer AS id2, null::text AS message2, null::text AS node2, null::json AS options2, null::json AS previous2, null::text AS attachment2,
0
FROM data t
WHERE t.node IS NULL
UNION ALL
SELECT
c.id, c.message, c.node, c.options, c.previous, c.attachment,
prev.id, prev.message, prev.node, prev.options, prev.previous, prev.attachment,
c.rec_depth + 1
FROM comp c
INNER JOIN data prev ON prev.id = ((COALESCE(c.previous2, c.previous))->>'id')::int
WHERE prev.node IS NOT NULL -- do not reach back to the next answer
AND c.message2 IS NULL -- do not reach back beyond a message with text (the question text)
), data (id, message, node, options, previous, attachment) AS (
VALUES [...]
) SELECT
c.id2 AS question_id, c.id AS answer_id
FROM comp c
WHERE
NOT EXISTS(
SELECT 1
FROM comp c2
WHERE c2.id = c.id
AND c2.rec_depth > c.rec_depth
)
Before the recursion, comp holds only the "answers" (this is the part above UNION ALL). In the first recursion step, they are joined with their predecessors. In the second step, another new record is created per answer-predecessor pair, in which the predecessor is replaced by its own predecessor. This continues until the "base condition" is reached (the joined partner is a record with a message, i.e. the question text, or the next partner is a record without a node, i.e. an answer), which means that no new records get created.
As we also compute the recursion depth (rec_depth) of each row, we can finally check that we use only those records generated per answer with the maximal recursion depth.
The second WITH statement can and should of course be removed and you should reference your real table in the WITH RECURSIVE part.
I chose to only select the ids of the answer and the corresponding question, but the WITH RECURSIVE is already built in a way, that you can use all of the columns.
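To make the walk-back concrete, here is a reduced sketch on SQLite via Python's sqlite3 module. It is a variant of the approach, not the query above: it assumes a flattened, hypothetical messages table with a plain integer previous_id instead of the JSON previous column, and it picks the final row per answer by filtering on "question text found" rather than by maximal recursion depth.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical flattened table: previous_id replaces the JSON 'previous' column.
CREATE TABLE messages (id INTEGER PRIMARY KEY, message TEXT, node TEXT, previous_id INTEGER);
INSERT INTO messages VALUES
  (1,  'Pineapple on pizza?', 'pineapple', NULL),
  (6,  'No', NULL, 1),
  (11, 'Hello', 'hello_pre_stuffed_crust', NULL),
  (14, 'Stuffed crust?', 'stuffed_crust', 11),
  (17, 'More cheese!', NULL, 14),
  (20, 'What do you think of this pizza?', 'check_this_image', NULL),
  (23, NULL, 'image', 20),
  (26, 'Looks amazing!', NULL, 23);
""")

pairs = conn.execute("""
    WITH RECURSIVE walk(answer_id, answer, cur_id, cur_message, cur_previous) AS (
        -- anchor: every answer (node IS NULL) paired with its direct predecessor
        SELECT a.id, a.message, p.id, p.message, p.previous_id
        FROM messages a
        JOIN messages p ON p.id = a.previous_id
        WHERE a.node IS NULL AND p.node IS NOT NULL
        UNION ALL
        -- step back only while no question text has been found yet,
        -- and never step into another answer
        SELECT w.answer_id, w.answer, p.id, p.message, p.previous_id
        FROM walk w
        JOIN messages p ON p.id = w.cur_previous
        WHERE w.cur_message IS NULL AND p.node IS NOT NULL
    )
    SELECT cur_id AS question_id, cur_message AS question, answer
    FROM walk
    WHERE cur_message IS NOT NULL
    ORDER BY answer_id
""").fetchall()
for row in pairs:
    print(row)
```

Because the recursion stops as soon as a row with text is reached, each answer contributes at most one row with a non-null question, so no depth comparison is needed in this reduced form.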
Further reading in the docs:
https://www.postgresql.org/docs/13/sql-select.html#SQL-WITH
https://www.postgresql.org/docs/13/queries-with.html

returning entries based on a is contained predicate with sql [duplicate]

I stumbled upon a very hideous problem, here is my table
filesystem (id, name, parentid);
and some entries for the example
(1, 'root', null)
(2, 'folder1', 1)
(3, 'subfolder1.1', 2)
(4, 'subfolder1.2', 2)
(5, 'folder2', 1)
(6, 'subfolder2.1', 5)
(7, 'subfolder2.2', 5)
(8, 'megaSubfolder', 6)
that leaves us with the following paths :
root
root/folder1
root/folder2
root/folder1/subfolder1.1
root/folder1/subfolder1.2
root/folder2/subfolder2.1
root/folder2/subfolder2.2
root/folder2/subfolder2.1/megaSubfolder
What I want is to select all the folders that are contained in another one,
for example megaSubfolder, subfolder2.1, subfolder2.2 are contained in folder2 (id 5)
How should I write the request as to return these 3 entries (id 8, 7, 6) where the predicate is 5 for instance ?
You can do it like this:
WITH RECURSIVE search_path(id, name) AS (
SELECT f.id, f.name
FROM filesystem f
WHERE id=5
UNION ALL
SELECT f.id, f.name
FROM filesystem f
JOIN search_path sf ON f.parentid=sf.id
)
SELECT * FROM search_path;
The top part of UNION ALL selects the starting rows of your query. The bottom part "connects" additional rows to the rows that have been selected previously.
The result includes the row with id of 5. If you do not want it, add WHERE id <> 5 after SELECT * FROM search_path.
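The query can be checked against the sample rows from the question with a short sketch using Python's sqlite3 module (SQLite also supports WITH RECURSIVE). The descendants helper is a hypothetical wrapper, and it applies the id <> filter so that the starting folder itself is excluded.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE filesystem (id INTEGER PRIMARY KEY, name TEXT, parentid INTEGER);
INSERT INTO filesystem VALUES
  (1, 'root', NULL), (2, 'folder1', 1), (3, 'subfolder1.1', 2),
  (4, 'subfolder1.2', 2), (5, 'folder2', 1), (6, 'subfolder2.1', 5),
  (7, 'subfolder2.2', 5), (8, 'megaSubfolder', 6);
""")

def descendants(conn, folder_id):
    """Return (id, name) for every folder nested anywhere below folder_id."""
    return conn.execute("""
        WITH RECURSIVE search_path(id, name) AS (
            SELECT f.id, f.name FROM filesystem f WHERE f.id = ?
            UNION ALL
            SELECT f.id, f.name
            FROM filesystem f
            JOIN search_path sp ON f.parentid = sp.id
        )
        SELECT id, name FROM search_path WHERE id <> ? ORDER BY id
    """, (folder_id, folder_id)).fetchall()

inside_folder2 = descendants(conn, 5)
print(inside_folder2)
```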

How to check between range of values of same column

This is the table I have:
Now, I want to check if the input is between 95-91 or 80-90 or 70-79...and so on.
How can I do that ?
Here we join the table to itself to get the min and max values for each grade.
select
g1.Courseid,
g1.GradeValue MinGradeValue,
isnull(min(g2.GradeValue)-1,100) MaxGradeValue,
g1.Description
from YourTable g1
left join YourTable g2
ON g2.CourseId = g1.CourseId
and g2.GradeValue > g1.GradeValue
group by
g1.Courseid,
g1.GradeValue,
g1.Description
You can join this as a CTE or something to a Student's grade with Student.Grade between MinGradeValue and MaxGradeValue. Let me know if I can help you further.
First off, stop thinking in inclusive upper-bound ranges; read this post about BETWEEN (which is an inclusive range) - this applies to anything that is conceptually not an integral count (ie, pretty much everything). What happens when somebody gets a grade of 79.5?
Fortunately, your table is perfectly setup for constructing a bounding-range table (which can be done as a CTE here, or as a materialized view if strictly necessary). I tend to prefer OLAP functions for this sort of work (and 2012 has a nice one for this):
SELECT courseId, description,
gradeValue as minimumValue,
LEAD(gradeValue) OVER(PARTITION BY courseId ORDER BY gradeValue) as nextGradeMinimumValue
FROM Grade
... Which you can then query against similar to this:
SELECT StudentGrade.studentId, StudentGrade.courseId, StudentGrade.grade,
Grade.description
FROM (VALUES(1, 1, 38),
(2, 1, 99),
(3, 2, 74.5),
(4, 2, 120)) StudentGrade(studentId, courseId, grade)
JOIN (SELECT courseId, description,
gradeValue as minimumValue,
LEAD(gradeValue) OVER(PARTITION BY courseId ORDER BY gradeValue) as nextGradeMinimumValue
FROM Grade) Grade
ON Grade.courseId = StudentGrade.courseId
AND Grade.minimumValue <= StudentGrade.grade
AND (Grade.nextGradeMinimumValue IS NULL OR Grade.nextGradeMinimumValue > StudentGrade.grade)
(ordinarily I'd have an SQL Fiddle example, but I can't access it at the moment, so this is untested).
This should work for all (positive) grade ranges, including an unlimited amount of "extra credit" (any score higher than the top boundary is assigned that description).
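As the query is described as untested, the following sketch exercises the half-open range idea (minimum inclusive, next minimum exclusive) on SQLite through Python's sqlite3 module. The Grade rows are invented for illustration because the original table is not shown, and LEAD requires SQLite 3.25 or newer, which is assumed here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical grade boundaries: each row stores the lower bound of a band.
CREATE TABLE Grade (courseId INTEGER, gradeValue REAL, description TEXT);
INSERT INTO Grade VALUES (1, 0, 'F'), (1, 70, 'C'), (1, 80, 'B'), (1, 90, 'A');
CREATE TABLE StudentGrade (studentId INTEGER, courseId INTEGER, grade REAL);
INSERT INTO StudentGrade VALUES (1, 1, 38), (2, 1, 99), (3, 1, 79.5), (4, 1, 120);
""")

results = conn.execute("""
    WITH bounds AS (
        SELECT courseId, description,
               gradeValue AS minimumValue,
               LEAD(gradeValue) OVER (PARTITION BY courseId
                                      ORDER BY gradeValue) AS nextGradeMinimumValue
        FROM Grade
    )
    SELECT s.studentId, b.description
    FROM StudentGrade s
    JOIN bounds b
      ON b.courseId = s.courseId
     AND b.minimumValue <= s.grade              -- at or above this band's minimum
     AND (b.nextGradeMinimumValue IS NULL       -- top band has no upper bound
          OR b.nextGradeMinimumValue > s.grade) -- strictly below the next band
    ORDER BY s.studentId
""").fetchall()
print(results)
```

A grade of 79.5 lands in the band that starts at 70 because the upper bound is exclusive, which is the case an inclusive 70-79 range would miss.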