Recursive tree search using postgresql join table - sql

I have a table stories and a table blockings which has the columns story_id (referencing a story), and a blocked_story_id (also referencing a story, which is blocked by the story_id)
I'm trying to construct a query to return all the stories in order of precedence based on their blockers - so blockers first, traversing down the tree.
One story can be blocked by many stories, and can itself be a blocker for many stories.
I've been reading and re-reading the PostgreSQL docs on WITH RECURSIVE but I'm a little lost on where I should be going with this, and how to construct the relevant query.
I have got as far as:
select s.id, b.story_id as blocker_id
from stories s
left outer join blockings b on s.id = b.blocked_story_id
where s.deleted_at is null
as for getting a list of stories and their blockers, but some pointers as to what I need to join/union to get the desired result would be helpful.
Context
I want to know which stories I can work on first. So I want an output that contains all stories in an order that allows me to work top down and never hit a blocked story.
The content of the blockings table gives me a simple join table between stories that block one another. The story_id being the blocker, the blocked_story_id being the one being blocked.
Sample Data
Stories
id | title
------------------
1 | Story title 1
2 | Story title 2
3 | Story title 3
4 | Story title 4
5 | Story title 5
Blockings
story_id | blocked_story_id
---------------------------
4 | 2
4 | 3
3 | 1
3 | 5
I would expect to see the following result:
id | title
------------------
4 | Story title 4
2 | Story title 2
3 | Story title 3
1 | Story title 1
5 | Story title 5

Disclaimer: It is not clear to me why you need recursion to find the blocked stories (which can be done simply with SELECT blocked_story_id FROM blockings), so I would ask you for further information. A real use of recursion would be something like "all blockings that are reachable from story 4".
Here's what I've done so far as I understood your problem:
Your blockings table says: story 4 blocks stories 2 and 3. Story 3 blocks stories 1 and 5. So the blocked stories are 1, 2, 3 and 5. Because of the recursion, story 4 also blocks 1 and 5 via 3. So there are two ways of blocking them (directly, with starting point 3, and from starting point 4 via 3). This query outputs all possible paths:
WITH RECURSIVE blocks AS (
SELECT blocked_story_id, ARRAY[story_id]::int[] as path FROM blockings
UNION
SELECT bk.blocked_story_id, b.path || bk.story_id
FROM blockings bk INNER JOIN blocks b ON b.blocked_story_id = bk.story_id
)
SELECT b.blocked_story_id, s.title, b.path
FROM blocks b INNER JOIN stories s ON s.id = b.blocked_story_id;
Result:
blocked_story_id title path
2 Title 2 {4}
3 Title 3 {4}
1 Title 1 {3}
5 Title 5 {3}
1 Title 1 {4,3}
5 Title 5 {4,3}
demo: db<>fiddle

@S-Man I figured it out thanks to your help pointing me in the right direction.
WITH recursive blockings_tree(id, title, path) AS (
SELECT stories.id, title, ARRAY[blockings.blocked_story_id, blockings.story_id]
FROM stories
LEFT OUTER JOIN blockings ON blockings.story_id = stories.id
UNION ALL
SELECT stories.id, stories.title, path || stories.id
FROM blockings_tree
JOIN blockings ON blockings.story_id = blockings_tree.id
JOIN stories ON blockings.blocked_story_id = stories.id
WHERE NOT blockings.blocked_story_id = any(path)
)
SELECT stories.*
FROM stories
JOIN (SELECT id, MAX(path) AS path FROM blockings_tree GROUP BY id) bt ON bt.id = stories.id
ORDER BY path
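For the sample data in the question, another way to get a blockers-first order is to compute each story's depth (the longest chain of blockers above it) and sort by that. The following is only a minimal sketch, not the method from either answer. It runs the SQL through Python's built-in sqlite3 module so it can be tried standalone, it assumes the blockings contain no cycles, and the names depths and ordered_story_ids are made up for illustration. The CTE uses only standard syntax, so it should carry over to PostgreSQL.

```python
import sqlite3

DEPTH_ORDER_SQL = """
WITH RECURSIVE depths(id, depth) AS (
    -- Anchor: stories that nothing blocks start at depth 0.
    SELECT s.id, 0
    FROM stories s
    WHERE NOT EXISTS (
        SELECT 1 FROM blockings b WHERE b.blocked_story_id = s.id
    )
    UNION ALL
    -- Step: a blocked story sits one level below its blocker.
    SELECT b.blocked_story_id, d.depth + 1
    FROM depths d
    JOIN blockings b ON b.story_id = d.id
)
SELECT s.id, s.title
FROM stories s
JOIN (SELECT id, MAX(depth) AS depth FROM depths GROUP BY id) m
  ON m.id = s.id
ORDER BY m.depth, s.id
"""


def ordered_story_ids():
    """Return story ids ordered so that every blocker precedes what it blocks."""
    conn = sqlite3.connect(":memory:")
    # Sample data copied from the question.
    conn.executescript("""
        CREATE TABLE stories (id INTEGER PRIMARY KEY, title TEXT);
        CREATE TABLE blockings (story_id INTEGER, blocked_story_id INTEGER);
        INSERT INTO stories VALUES
            (1, 'Story title 1'), (2, 'Story title 2'), (3, 'Story title 3'),
            (4, 'Story title 4'), (5, 'Story title 5');
        INSERT INTO blockings VALUES (4, 2), (4, 3), (3, 1), (3, 5);
    """)
    rows = conn.execute(DEPTH_ORDER_SQL).fetchall()
    conn.close()
    return [story_id for story_id, _title in rows]


if __name__ == "__main__":
    print(ordered_story_ids())
```

Taking MAX(depth) per story matters when a story is reachable through several blockers: it is placed below the deepest of them, so no blocker can appear after the story it blocks.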

Related

Knex.js Getting values from comma-separated

I have two SQlite3 tables task and tags
task is my master table and tags is storing tag names
I store comma-separated values in task
Now I want to get Tag names with use of a knex.js
table task
id task tags
---------------------
1 abc 1,2,3
2 xyz 3,1
3 apple 2
table tags
id tag
------------
1 cold
2 hot
3 normal
Now i want output as below
OUTPUT:
id task tags
---------------------
1 abc cold,hot,normal
2 xyz normal,cold
3 apple hot
I know i will have to use joins but not sure how to actually use it in knex.js. Please do help me.
Part of the problem is that your database is not properly normalised. Instead of having two tables, task and tags, with task holding multiple tag IDs in its 'tags' column, you should have three tables: 'tasks', 'tags' and the joining table 'task_tags'. They would store the following data:
Tasks
id task
----------
1 abc
2 xyz
3 apple
Tags
id tag
------------
1 cold
2 hot
3 normal
task_tags
task_id tag_id
1 1
1 2
1 3
2 1
2 3
3 2
Now you can have as many tags as you like (whether or not any tasks use them) and as many tasks as you like (whether or not they use any tags), and you associate a task with its tags via the task_tags table.
Then to get the result you want you would use the select
SELECT
tasks.id,
tasks.task,
GROUP_CONCAT(tags.tag) -- this gives you the csv line eg cold,hot,normal
from tasks
left join task_tags
ON tasks.id = task_tags.task_id
left join tags
on tags.id = task_tags.tag_id
GROUP BY tasks.id, tasks.task
see https://www.sqlite.org/lang_aggfunc.html for explanation of GROUP_CONCAT
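As a runnable sketch of the three-table design, the snippet below builds the tables in an in-memory SQLite database and groups on the task columns only, so that each task yields a single row. It uses Python's sqlite3 module rather than knex.js purely so that it is self-contained. The function name tags_per_task is made up for illustration, and the tag names are sorted in Python because GROUP_CONCAT does not promise any particular order.

```python
import sqlite3


def tags_per_task():
    """Return (id, task, [tag names]) for every task in the sample data."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE tasks (id INTEGER PRIMARY KEY, task TEXT);
        CREATE TABLE tags (id INTEGER PRIMARY KEY, tag TEXT);
        CREATE TABLE task_tags (task_id INTEGER, tag_id INTEGER);
        INSERT INTO tasks VALUES (1, 'abc'), (2, 'xyz'), (3, 'apple');
        INSERT INTO tags VALUES (1, 'cold'), (2, 'hot'), (3, 'normal');
        INSERT INTO task_tags VALUES
            (1, 1), (1, 2), (1, 3), (2, 1), (2, 3), (3, 2);
    """)
    rows = conn.execute("""
        SELECT tasks.id, tasks.task, GROUP_CONCAT(tags.tag) AS tags
        FROM tasks
        LEFT JOIN task_tags ON tasks.id = task_tags.task_id
        LEFT JOIN tags ON tags.id = task_tags.tag_id
        GROUP BY tasks.id, tasks.task
        ORDER BY tasks.id
    """).fetchall()
    conn.close()
    # A task without tags gets NULL from GROUP_CONCAT, hence the guard.
    return [
        (task_id, task, sorted(tags.split(",")) if tags else [])
        for task_id, task, tags in rows
    ]


if __name__ == "__main__":
    for row in tags_per_task():
        print(row)
```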
Your task table should be redesigned to hold one tag per row, not multiple tags in a single row:
id task tag
---------- ---------- ----------
1 abc 1
1 abc 2
1 abc 3
2 xyz 3
2 xyz 1
3 apple 2
Then it's easy:
SELECT task.id, task.task, group_concat(tags.tag, ',') AS tags
FROM task
JOIN tags ON task.tag = tags.id
GROUP BY task.id, task.task
ORDER BY task.id;
which gives
id task tags
---------- ---------- ---------------
1 abc cold,hot,normal
2 xyz normal,cold
3 apple hot
A design that follows the rules of relational databases makes life much easier (And the above can be normalized further; see the other answer); while some databases do support array types, sqlite is not one of them. If you insist on keeping your current design, though, there's an ugly hack involving the JSON1 extension and turning your CSV list of numbers into a JSON array:
SELECT task.id, task.task, group_concat(tags.tag, ',') AS tags
FROM task
JOIN json_each('[' || task.tags || ']') AS j
JOIN tags ON tags.id = j.value
GROUP BY task.id, task.task
ORDER BY task.id;

SQL Spatial Subquery Issue

Greetings Benevolent Gods of Stackoverflow,
I am presently struggling to get a spatially enabled query to work for a SQL assignment I am working on. The wording is as follows:
SELECT PURCHASES.TotalPrice, STORES.GeoLocation, STORES.StoreName
FROM MuffinShop
join (SELECT SUM(PURCHASES.TotalPrice) AS StoreProfit, STORES.StoreName
FROM PURCHASES INNER JOIN STORES ON PURCHASES.StoreID = STORES.StoreID
GROUP BY STORES.StoreName
HAVING (SUM(PURCHASES.TotalPrice) > 600))
What I am trying to do with this query is perform a function query (like avg, sum etc) and get the spatial information back as well. Another example of this would be:
SELECT STORES.StoreName, AVG(REVIEWS.Rating),Stores.Shape
FROM REVIEWS CROSS JOIN
STORES
GROUP BY STORES.StoreName;
This returns a Column 'STORES.Shape' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause. error message.
I know I require a sub query to perform this task, I am just having endless trouble getting it to work. Any help at all would be wildly appreciated.
There are two parts to this question. I would tackle the first problem with the following logic:
List all the store names and their respective geolocations
Get the profit for each store
With that in mind, you need to use the STORES table as your base, then bolt the profit onto it through a sub query or an apply:
SELECT s.StoreName
,s.GeoLocation
,p.StoreProfit
FROM STORES s
INNER JOIN (
SELECT pu.StoreId
,StoreProfit = SUM(pu.TotalPrice)
FROM PURCHASES pu
GROUP BY pu.StoreID
) p
ON p.StoreID = s.StoreID;
An equivalent form using CROSS APPLY, which can be a little more efficient depending on the data and indexes:
SELECT s.StoreName
,s.GeoLocation
,profit.StoreProfit
FROM STORES s
CROSS APPLY (
SELECT StoreProfit = SUM(p.TotalPrice)
FROM PURCHASES p
WHERE p.StoreID = s.StoreID
GROUP BY p.StoreID
) profit;
Now for the second part, the error that you are receiving tells you that you need to GROUP BY all columns in your select statement with the exception of your aggregate function(s).
In your second example, you are asking SQL to take an average rating for each store based on an ID, but you are also trying to return another column without including that inside the grouping. I will try to show you what you are asking SQL to do and where the issue lies with the following examples:
-- Data
Id | Rating | Shape
1 | 1 | Triangle
1 | 4 | Triangle
1 | 1 | Square
2 | 1 | Triangle
2 | 5 | Triangle
2 | 3 | Square
SQL Server, please give me the average rating for each store:
SELECT Id, AVG(Rating)
FROM Store
GROUP BY Id;
-- Result
Id | Avg(Rating)
1 | 2
2 | 3
SQL Server, please give me the average rating for each store and show its shape in the result (but don't group by it):
SELECT Id, AVG(Rating), Shape
FROM Store
GROUP BY Id;
-- Result
Id | Avg(Rating) | Shape
1 | 2 | Do I show Triangle or Square ...... ERROR!!!!
2 | 3 |
It needs to be told to get the average for each store and shape:
SELECT Id, AVG(Rating), Shape
FROM Store
GROUP BY Id, Shape;
-- Result
Id | Avg(Rating) | Shape
1 | 2.5 | Triangle
1 | 1 | Square
2 | 3 | Triangle
2 | 3 | Square
As in any spatial query, you need an idea of what your final geometry will be. It looks like you are attempting to group by individual stores while delivering an average rating from the subquery. So, if I'm reading it right, you are just looking to get each store's shape info alongside its average rating?
Query the stores table for the shape field and join it to the query you use to get the average rating:
select a.shape,
b.*
from stores a inner join (your Average rating query with group by here) b
on a.StoreID = b.Storeid
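The "aggregate in a subquery, then join back to stores" pattern can be sketched as follows. This is an illustration under several assumptions, not the asker's actual schema: it assumes REVIEWS carries a StoreID column, it invents two store rows and six ratings, it stores Shape as plain text because SQLite has no geometry type, and it runs through Python's sqlite3 module so that it is self-contained. The function name average_rating_with_shape is hypothetical.

```python
import sqlite3


def average_rating_with_shape():
    """Return (StoreName, Shape, average rating), one row per store."""
    conn = sqlite3.connect(":memory:")
    # Made-up sample rows; Shape is text here only as a stand-in for geometry.
    conn.executescript("""
        CREATE TABLE stores (
            StoreID INTEGER PRIMARY KEY, StoreName TEXT, Shape TEXT
        );
        CREATE TABLE reviews (StoreID INTEGER, Rating INTEGER);
        INSERT INTO stores VALUES
            (1, 'North', 'POINT(1 1)'), (2, 'South', 'POINT(2 2)');
        INSERT INTO reviews VALUES
            (1, 1), (1, 4), (1, 1), (2, 1), (2, 5), (2, 3);
    """)
    rows = conn.execute("""
        SELECT a.StoreName, a.Shape, b.AvgRating
        FROM stores a
        INNER JOIN (
            -- The aggregate is grouped by StoreID alone, so Shape never
            -- has to appear in a GROUP BY clause.
            SELECT StoreID, AVG(Rating) AS AvgRating
            FROM reviews
            GROUP BY StoreID
        ) b ON a.StoreID = b.StoreID
        ORDER BY a.StoreID
    """).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for row in average_rating_with_shape():
        print(row)
```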

Find a value based on a table result

First of all, sorry for the title. Couldn't think of any better title.
This is what I got:
SELECT study FROM old_employee;
study
---------
STUDY1
STUDY2
STUDY3
STUDY1
STUDY2
SELECT id,name_string FROM studies;
id | name_string
----+-------------------
1 | STUDY1
2 | STUDY2
3 | STUDY3
Now I would like to find the ids based on the first output. This is what I've attempted, but obviously it's not working.
SELECT id FROM studies WHERE name_string LIKE (SELECT study FROM old_employee);
My desired output:
id
----
1
2
3
1
2
Edit: I'm saving old_employee as a view, and I'm wondering if there's a smarter way of including it in the answers below instead of creating this view first.
CREATE VIEW old_employee AS
SELECT *
FROM dblink('dbname=mydb', 'select study from personnel')
AS t1(study char(10));
This can be accomplished without the SQL LIKE operator. Here is the query:
SELECT s.id
FROM studies s,
old_employee o
WHERE s.name_string = o.study;
Second query (according to what @a_horse_with_no_name said):
SELECT studies.id
FROM studies
INNER JOIN old_employee
ON studies.name_string = old_employee.study
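To check that a plain equality join returns one id per old_employee row, duplicates included, here is a small sketch using Python's sqlite3 module. It makes two assumptions for the sake of a runnable example: old_employee is created as an ordinary table rather than a dblink view, and the rows are ordered by SQLite's implicit rowid to reproduce the order shown in the question (without an ORDER BY the row order is not guaranteed). The function name study_ids is made up.

```python
import sqlite3


def study_ids():
    """Return the studies.id matching each old_employee row, in insert order."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE studies (id INTEGER PRIMARY KEY, name_string TEXT);
        CREATE TABLE old_employee (study TEXT);
        INSERT INTO studies VALUES (1, 'STUDY1'), (2, 'STUDY2'), (3, 'STUDY3');
        INSERT INTO old_employee VALUES
            ('STUDY1'), ('STUDY2'), ('STUDY3'), ('STUDY1'), ('STUDY2');
    """)
    rows = conn.execute("""
        SELECT s.id
        FROM old_employee o
        INNER JOIN studies s ON s.name_string = o.study
        ORDER BY o.rowid
    """).fetchall()
    conn.close()
    return [row[0] for row in rows]


if __name__ == "__main__":
    print(study_ids())
```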

How to query the name of each section and the number of threads at it using sql query?

I have 2 tables in a MySQL database. The first one contains parts, like this:
parts
primary part_name part_id
0 web 1
0 graphic 2
1 php 3
1 asp 4
2 photoshop 5
2 illustrator 6
1 html 7
Some of the parts are primary, like web and graphic; the others are subsections.
For example, the web section contains the php, asp and html parts,
so the primary field holds the id of the parent part (0 for a main part).
The graphic part contains the photoshop and illustrator parts.
The other table is for posts:
posts
post_content post_title part_id post_id
anything any title 3 1
anything any title 6 2
anything any title 3 3
anything any title 3 4
anything any title 7 5
anything any title 6 6
anything any title 4 7
anything any title 4 8
anything any title 3 9
I want to get a table containing the main parts (primary = 0)
and the number of posts in each of them.
The result should look like this:
query result
count_posts part_name part_id
7 web 1
2 graphic 2
I tried this:
SELECT p.*,count(s.post_id)
FROM part p,post s
where s.part_id = 1
and p.belong = 1
but it gets results for only one part.
I can't yet comment on posts other than my own, so I'm submitting a new answer.
@StuartLC's answer is missing the join between primary parts and subsections.
SELECT COUNT(po.post_id) AS count_posts, pa.part_name, pa.part_id
FROM parts pa
INNER JOIN parts pa2 on pa2.`primary` = pa.part_id
INNER JOIN posts po on po.part_id = pa2.part_id
WHERE pa.`primary` = 0
GROUP BY pa.part_name, pa.part_id
http://sqlfiddle.com/#!6/1dd90/3/0
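The self-join can be checked against the sample data from the question with the sketch below. It runs in an in-memory SQLite database through Python's sqlite3 module, so the reserved word primary is quoted with double quotes here; the quoting character differs between database engines. The function name posts_per_main_part is made up for illustration.

```python
import sqlite3


def posts_per_main_part():
    """Return (count_posts, part_name, part_id) for every main part."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE parts (
            "primary" INTEGER, part_name TEXT, part_id INTEGER PRIMARY KEY
        );
        CREATE TABLE posts (
            post_content TEXT, post_title TEXT,
            part_id INTEGER, post_id INTEGER PRIMARY KEY
        );
        INSERT INTO parts VALUES
            (0, 'web', 1), (0, 'graphic', 2), (1, 'php', 3), (1, 'asp', 4),
            (2, 'photoshop', 5), (2, 'illustrator', 6), (1, 'html', 7);
        INSERT INTO posts VALUES
            ('anything', 'any title', 3, 1), ('anything', 'any title', 6, 2),
            ('anything', 'any title', 3, 3), ('anything', 'any title', 3, 4),
            ('anything', 'any title', 7, 5), ('anything', 'any title', 6, 6),
            ('anything', 'any title', 4, 7), ('anything', 'any title', 4, 8),
            ('anything', 'any title', 3, 9);
    """)
    rows = conn.execute("""
        SELECT COUNT(po.post_id) AS count_posts, pa.part_name, pa.part_id
        FROM parts pa
        -- pa2 is the subsection whose parent is the main part pa.
        INNER JOIN parts pa2 ON pa2."primary" = pa.part_id
        INNER JOIN posts po ON po.part_id = pa2.part_id
        WHERE pa."primary" = 0
        GROUP BY pa.part_name, pa.part_id
        ORDER BY pa.part_id
    """).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for row in posts_per_main_part():
        print(row)
```

Because both joins are inner joins, a main part whose subsections have no posts would not appear in the result at all; switching them to left joins would list it with a count of 0.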

SQL group related rows in a list

I'm a bit stuck with this...
I have items table:
id | name
1 | item 1
2 | item 2
3 | item 3
4 | item 4
and related items table:
id | item_id | related_item_id
2 | 1 | 2
3 | 1 | 4
so this means that item 1 is related to items 2 and 4.
Now I'm trying to display these in a list where related items follow always the main item they are related to:
item 1
item 2
item 4
item 3
Then I can visually show that these items 2 and 4 are related to item one and draw something like:
item 1
-- item 2
-- item 4
item 3
To be honest, I haven't got any ideas myself. I guess I could query for items which are not related to any other item to get a list of "parent items", and then query the relations separately in a script loop. This is definitely not the sexiest solution...
I am assuming that this question is about ordering the items list, without duplicates. That is, a given item does not have more than one parent (which I ask in a comment).
If so, you can do this with a left outer join and cleverness in the order by.
select coalesce(r.related_item_id, i.id) as item_id
from items i left join
related r
on i.id = r.related_item_id
order by coalesce(r.item_id, i.id),
(r.related_item_id is null) desc;
The left outer join identifies parents because they will not have any rows that match. If so, the coalesce() finds them and uses the item id.
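The ordering trick can be tried on the sample data with the sketch below, run through Python's sqlite3 module. It keeps the table name related from the query, adds i.id as a final tie-breaker so that siblings come out in a stable order, and also selects the parent id so the indented list from the question can be rendered. The function name grouped_item_lines is made up, and the sketch assumes each item has at most one parent.

```python
import sqlite3


def grouped_item_lines():
    """Return display lines with related items listed under their parent."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE related (
            id INTEGER PRIMARY KEY, item_id INTEGER, related_item_id INTEGER
        );
        INSERT INTO items VALUES
            (1, 'item 1'), (2, 'item 2'), (3, 'item 3'), (4, 'item 4');
        INSERT INTO related VALUES (2, 1, 2), (3, 1, 4);
    """)
    rows = conn.execute("""
        SELECT i.name, r.item_id AS parent_id
        FROM items i
        LEFT JOIN related r ON i.id = r.related_item_id
        ORDER BY COALESCE(r.item_id, i.id),        -- group by parent id
                 (r.related_item_id IS NULL) DESC, -- parent row first
                 i.id                              -- stable order for children
    """).fetchall()
    conn.close()
    # Children (rows that matched a parent) get the "-- " prefix.
    return [("-- " if parent_id is not None else "") + name
            for name, parent_id in rows]


if __name__ == "__main__":
    print("\n".join(grouped_item_lines()))
```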
In my opinion, rather than implementing this logic in a query, you should move it into your application code.
Assuming that the item_ids are sequential, you can find the largest item_id and then, in a loop,
look up the related_item_ids for each item_id and build a convenient data structure out of them.
This functionality falls under the category of hierarchical queries. In Oracle it's handled by the CONNECT BY clause; I'm not sure about MySQL, but you can search for "hierarchical queries mysql" to find the answer.