PostgreSQL hierarchical (tree) query - sql

I found a few topics about this, but none of them produces the result I expect.
I have levels of categories stored in a table and just want to display them as a tree structure.
All the answers suggest something like the following query (DB Fiddle):
WITH RECURSIVE cte AS (
SELECT category_id, category_name, parent_category, 1 AS level
FROM category
WHERE level = 1
UNION ALL
SELECT c.category_id, c.category_name, c.parent_category, ct.level + 1
FROM cte ct
JOIN category c ON c.parent_category = ct.category_id
)
SELECT *
FROM cte;
But the results come back grouped by level:
level
1
1
2
2
2
3
3
3
3
3
What I want to achieve is tree order, with each parent followed by its children:
level
1
2
3
3
2
3
3
1
2
3
3
2
3
3

You would typically keep track of the path to each node and use that for ordering. In Postgres, arrays come in handy for this:
with recursive cte as (
select category_id, category_name, parent_category, 1 as level, array[category_id] path
from category
where parent_category is null
union all
select c.category_id, c.category_name, c.parent_category, ct.level + 1, ct.path || c.category_id
from cte ct
join category c on c.parent_category = ct.category_id
)
select *
from cte
order by path
Note that there is no need to store the level in the table; you can compute the information on the fly as you iterate. To identify the root nodes, you can filter on rows whose parent is null.
In your db fiddle, the query returns:
category_id | category_name | parent_category | level | path
----------: | :------------ | --------------: | ----: | :-------
1 | cat1 | null | 1 | {1}
3 | cat3 | 1 | 2 | {1,3}
8 | cat8 | 3 | 3 | {1,3,8}
9 | cat9 | 3 | 3 | {1,3,9}
4 | cat4 | 1 | 2 | {1,4}
6 | cat6 | 4 | 3 | {1,4,6}
7 | cat7 | 4 | 3 | {1,4,7}
5 | cat5 | 1 | 2 | {1,5}
10 | cat10 | 5 | 3 | {1,5,10}
11 | cat11 | 5 | 3 | {1,5,11}
2 | cat2 | null | 1 | {2}
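To see why ordering by the path yields tree order, here is a minimal Python sketch (Python is used only for illustration; the parent links are copied from the result table above, and the helper name path_to is hypothetical). It relies on Python comparing lists element by element and then by length, which matches how PostgreSQL orders one-dimensional arrays:

```python
# category_id -> parent_category, copied from the result table above
# (None marks a root category).
PARENT = {1: None, 2: None, 3: 1, 4: 1, 5: 1, 6: 4, 7: 4, 8: 3, 9: 3, 10: 5, 11: 5}


def path_to(category_id):
    """Return the list of ids from the root down to category_id."""
    path = []
    while category_id is not None:
        path.append(category_id)
        category_id = PARENT[category_id]
    return path[::-1]


# Lists compare element by element, then by length, so a parent's path
# always sorts immediately before the paths of its descendants.
paths = sorted(path_to(cid) for cid in PARENT)
tree_order = [(path[-1], len(path)) for path in paths]  # (category_id, level)

for category_id, level in tree_order:
    print("  " * (level - 1) + f"cat{category_id}")
```

The level is simply the length of the path, which is why it does not need to be stored in the table.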

You can keep track of the hierarchy as an array and use that for ordering:
WITH RECURSIVE cte AS (
SELECT category_id, category_name, parent_category, 1 AS level, array[category_id] as categories
FROM category
WHERE level = 1
UNION ALL
SELECT c.category_id, c.category_name, c.parent_category, ct.level + 1, ct.categories || c.category_id
FROM cte ct JOIN
category c
ON c.parent_category = ct.category_id
)
SELECT *
FROM cte
ORDER BY categories;
Here is a db<>fiddle.

Related

Postgres - Unique values for id column using CTE, Joins alongside GROUP BY

I have a table referrals:
id | user_id_owner | firstname | is_active | user_type | referred_at
----+---------------+-----------+-----------+-----------+-------------
3 | 2 | c | t | agent | 3
5 | 3 | e | f | customer | 5
4 | 1 | d | t | agent | 4
2 | 1 | b | f | agent | 2
1 | 1 | a | t | agent | 1
And another table activations
id | user_id_owner | referral_id | amount_earned | activated_at | app_id
----+---------------+-------------+---------------+--------------+--------
2 | 2 | 3 | 3.0 | 3 | a
4 | 1 | 1 | 6.0 | 5 | b
5 | 4 | 4 | 3.0 | 6 | c
1 | 1 | 2 | 2.0 | 2 | b
3 | 1 | 2 | 5.0 | 4 | b
6 | 1 | 2 | 7.0 | 8 | a
I am trying to generate another table from the two tables that has only unique values for referrals.id and returns, as one of its columns, the count for each app as best_selling_app_count.
Here is the query I ran:
with agents
as
(select
referrals.id,
referral_id,
amount_earned,
referred_at,
activated_at,
activations.app_id
from referrals
left outer join activations
on (referrals.id = activations.referral_id)
where referrals.user_id_owner = 1),
distinct_referrals_by_id
as
(select
id,
count(referral_id) as activations_count,
sum(coalesce(amount_earned, 0)) as amount_earned,
referred_at,
max(activated_at) as last_activated_at
from
agents
group by id, referred_at),
distinct_referrals_by_app_id
as
(select id, app_id as best_selling_app,
count(app_id) as best_selling_app_count
from agents
group by id, app_id )
select *, dense_rank() over (order by best_selling_app_count desc) best_selling_app_rank
from distinct_referrals_by_id
inner join distinct_referrals_by_app_id
on (distinct_referrals_by_id.id = distinct_referrals_by_app_id.id);
Here is the result I got:
id | activations_count | amount_earned | referred_at | last_activated_at | id | best_selling_app | best_selling_app_count | best_selling_app_rank
----+-------------------+---------------+-------------+-------------------+----+------------------+------------------------+-----------------------
2 | 3 | 14.0 | 2 | 8 | 2 | b | 2 | 1
1 | 1 | 6.0 | 1 | 5 | 1 | b | 1 | 2
2 | 3 | 14.0 | 2 | 8 | 2 | a | 1 | 2
4 | 1 | 3.0 | 4 | 6 | 4 | c | 1 | 2
The problem with this result is that id 2 appears twice. I only need unique values in the id column.
I tried a workaround using distinct on, which gave the desired result, but I fear the query results may not be reliable and consistent.
Here is the workaround query:
with agents
as
(select
referrals.id,
referral_id,
amount_earned,
referred_at,
activated_at,
activations.app_id
from referrals
left outer join activations
on (referrals.id = activations.referral_id)
where referrals.user_id_owner = 1),
distinct_referrals_by_id
as
(select
id,
count(referral_id) as activations_count,
sum(coalesce(amount_earned, 0)) as amount_earned,
referred_at,
max(activated_at) as last_activated_at
from
agents
group by id, referred_at),
distinct_referrals_by_app_id
as
(select
distinct on (id) id, app_id as best_selling_app,
count(app_id) as best_selling_app_count
from agents
group by id, app_id
order by id, best_selling_app_count desc)
select *, dense_rank() over (order by best_selling_app_count desc) best_selling_app_rank
from distinct_referrals_by_id
inner join distinct_referrals_by_app_id
on (distinct_referrals_by_id.id = distinct_referrals_by_app_id.id);
I need a recommendation on how best to achieve this.
I am trying to generate another table from the two tables that has only unique values for referrals.id and returns, as one of its columns, the count for each app as best_selling_app_count.
Your question is really complicated, with a very complicated SQL query. However, the sentence above looks like the actual question. If so, you can use:
select r.*,
a.app_id as most_common_app_id,
a.cnt as most_common_app_id_count
from referrals r left join
(select distinct on (a.referral_id) a.referral_id, a.app_id, count(*) as cnt
from activations a
group by a.referral_id, a.app_id
order by a.referral_id, count(*) desc
) a
on a.referral_id = r.id;
You have not explained the other columns that are in your result set.
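The logic behind the distinct on subquery (count per referral and app, sort by count descending, keep the first row per referral) can be sketched in Python. This is only an illustration: the pairs are copied from the activations table above, the function name most_common_app is hypothetical, and the app_id tie-breaker is an added assumption, since distinct on picks an arbitrary row among ties:

```python
from collections import Counter

# (referral_id, app_id) pairs copied from the activations table above.
ACTIVATIONS = [(3, "a"), (1, "b"), (4, "c"), (2, "b"), (2, "b"), (2, "a")]


def most_common_app(activations):
    """Map each referral_id to its (app_id, count) with the highest count."""
    counts = Counter(activations)  # (referral_id, app_id) -> count
    # Mirror ORDER BY referral_id, count(*) DESC; app_id breaks ties
    # deterministically (an assumption not present in the SQL above).
    ordered = sorted(counts.items(), key=lambda kv: (kv[0][0], -kv[1], kv[0][1]))
    best = {}
    for (referral_id, app_id), cnt in ordered:
        best.setdefault(referral_id, (app_id, cnt))  # keep first row per referral
    return best


print(most_common_app(ACTIVATIONS))
```

Referral 2 has two activations for app b and one for app a, so it appears once with b and a count of 2.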

How can I generate sequence number for sql select that gives sub numbers for descendant items?

I would like to generate sequence numbers in a select, with sub-numbers for descendant items.
I want the numbers to be in the following format:
root: 1...n
children of root: 1.1 -> 1.n
sub children: 1.1.1 -> 1.1.n
and so on...
I have an Item table which has an owner_ref foreign key.
The table (the item names are just an example; they can be anything):
id | item_name | parent_id | owner_ref_id
----|------------|-----------|--------------
1 | item_1 | null | 1
2 | item_1.1 | 1 | 1
3 | item_1.1.1 | 2 | 1
4 | item_2 | null | 1
5 | item_2.1 | 4 | 1
6 | item_2.2 | 4 | 1
--------------------------------------------
The outcome should look like:
seq_num | item_name | parent_id | owner_ref_id
---------|------------|-----------|--------------
1 | item_1 | null | 1
1.1 | item_1.1 | 1 | 1
1.1.1 | item_1.1.1 | 2 | 1
2 | item_2 | null | 1
2.1 | item_2.1 | 4 | 1
2.2 | item_2.2 | 4 | 1
--------------------------------------------
Use a recursive CTE to build the tree-like structure:
with recursive nodes(id, item_name, parent_id, lvl, path) as (
-- roots: number them 1..n in id order
select id, item_name, parent_id, 1
, row_number() over (order by id)::text as path
from items where parent_id is null
union all
-- children: number siblings 1..n under each parent, in id order
select o.id, o.item_name, o.parent_id, n.lvl + 1, n.path || '.' ||
row_number() over (partition by o.parent_id order by o.id)::text
from items o
join nodes n on n.id = o.parent_id
)
select *
from nodes
order by id
View on DBFiddle
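The numbering rule (each item gets its parent's number plus its position among its siblings) can also be sketched outside SQL. The following Python is only an illustration: the rows are copied from the question's table, the function name sequence_numbers is hypothetical, and ordering siblings by id is an assumption:

```python
# (id, item_name, parent_id) rows copied from the question's table.
ITEMS = [
    (1, "item_1", None),
    (2, "item_1.1", 1),
    (3, "item_1.1.1", 2),
    (4, "item_2", None),
    (5, "item_2.1", 4),
    (6, "item_2.2", 4),
]


def sequence_numbers(items):
    """Map each item id to a dotted sequence number such as '2.1'."""
    children = {}
    for item_id, _name, parent_id in sorted(items):  # siblings in id order
        children.setdefault(parent_id, []).append(item_id)

    numbers = {}

    def visit(parent_id, prefix):
        # Position among siblings plays the role of row_number().
        for position, item_id in enumerate(children.get(parent_id, []), start=1):
            numbers[item_id] = f"{prefix}{position}"
            visit(item_id, f"{prefix}{position}.")

    visit(None, "")
    return numbers


print(sequence_numbers(ITEMS))
```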

Recursive query for hierarchical data based on adjacency list

I am learning SQL and have a bit of a problem. I have 2 tables, level and level_hierarchy:
|name | id | |parent_id | child_id|
------------------- ---------------------
| Level1_a | 1 | | NULL | 1 |
| Level2_a | 19 | | 1 | 19 |
| Level2_b | 3 | | 1 | 3 |
| Level3_a | 4 | | 3 | 4 |
| Level3_b | 5 | | 3 | 5 |
| Level4_a | 6 | | 5 | 6 |
| Level4_b | 7 | | 5 | 7 |
What I need is a query that returns all entries from table level at a given hierarchy level, based on a parameter that says which hierarchy level I want entries from.
Getting Level1 entries is quite easy.
SELECT name FROM level INNER JOIN level_hierarchy ON level.id =
level_hierarchy.child_id WHERE level_hierarchy.parent_id=NULL
Level2 entries:
Level2_a
Level2_b
are just the ones that have a parent and whose parent's parent is NULL, and so on. This is where I suspect that recursion comes in.
Is there anyone who can guide me through it?
Your query for the first level (here depth to distinguish from the table) should look like this:
select l.name, h.child_id, 1 as depth
from level l
join level_hierarchy h on l.id = h.child_id
where h.parent_id is null;
name | child_id | depth
----------+----------+-------
Level1_a | 1 | 1
(1 row)
Note the proper use of is null (do not use = to compare with null as it always gives null).
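The difference is easy to check. This small sketch uses Python's built-in sqlite3 module rather than PostgreSQL (an assumption made so it runs without a server; NULL comparison follows the same standard rule in both), with a few rows from the question's level_hierarchy table:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE level_hierarchy (parent_id INTEGER, child_id INTEGER)")
conn.executemany(
    "INSERT INTO level_hierarchy VALUES (?, ?)",
    [(None, 1), (1, 19), (1, 3)],
)

# "= NULL" evaluates to NULL for every row, so no row passes the filter.
with_equals = conn.execute(
    "SELECT child_id FROM level_hierarchy WHERE parent_id = NULL"
).fetchall()

# "IS NULL" is the test that actually matches the root row.
with_is_null = conn.execute(
    "SELECT child_id FROM level_hierarchy WHERE parent_id IS NULL"
).fetchall()

print(with_equals, with_is_null)
```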
You can use the above as an initial query in a recursive cte:
with recursive recursive_query as (
select l.name, h.child_id, 1 as depth
from level l
join level_hierarchy h on l.id = h.child_id
where h.parent_id is null
union all
select l.name, h.child_id, depth + 1
from level l
join level_hierarchy h on l.id = h.child_id
join recursive_query r on h.parent_id = r.child_id
)
select *
from recursive_query
-- where depth = 2
name | child_id | depth
----------+----------+-------
Level1_a | 1 | 1
Level2_b | 3 | 2
Level2_a | 19 | 2
Level3_a | 4 | 3
Level3_b | 5 | 3
Level4_a | 6 | 4
Level4_b | 7 | 4
(7 rows)
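To turn the commented-out depth filter into the parameter the question asks for, the query can be run with a bound value. The sketch below is an illustration using Python's sqlite3 module instead of PostgreSQL (an assumption made so it is self-contained; the recursive CTE is the same apart from the explicit column list), and the helper name names_at_depth is hypothetical:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE level (name TEXT, id INTEGER PRIMARY KEY);
    CREATE TABLE level_hierarchy (parent_id INTEGER, child_id INTEGER);
    INSERT INTO level VALUES
        ('Level1_a', 1), ('Level2_a', 19), ('Level2_b', 3), ('Level3_a', 4),
        ('Level3_b', 5), ('Level4_a', 6), ('Level4_b', 7);
    INSERT INTO level_hierarchy VALUES
        (NULL, 1), (1, 19), (1, 3), (3, 4), (3, 5), (5, 6), (5, 7);
""")

QUERY = """
    WITH RECURSIVE recursive_query(name, child_id, depth) AS (
        SELECT l.name, h.child_id, 1
        FROM level l
        JOIN level_hierarchy h ON l.id = h.child_id
        WHERE h.parent_id IS NULL
        UNION ALL
        SELECT l.name, h.child_id, r.depth + 1
        FROM level l
        JOIN level_hierarchy h ON l.id = h.child_id
        JOIN recursive_query r ON h.parent_id = r.child_id
    )
    SELECT name FROM recursive_query WHERE depth = ? ORDER BY name
"""


def names_at_depth(depth):
    """Return the level names found at the given depth (1 = root)."""
    return [name for (name,) in conn.execute(QUERY, (depth,))]


print(names_at_depth(2))
```

A depth with no entries simply returns an empty list.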
Good question, recursion is a difficult topic in SQL and its implementation varies by engine. Thanks for tagging your post with PostgreSQL. PostgreSQL has some excellent documentation on the topic.
WITH RECURSIVE rec_lh(child_id, parent_id) AS (
SELECT child_id, parent_id FROM level_hierarchy
UNION ALL
SELECT lh.child_id, lh.parent_id
FROM rec_lh rlh INNER JOIN level_hierarchy lh
ON lh.parent_id = rlh.child_id
)
SELECT DISTINCT level.name, child_id
FROM rec_lh INNER JOIN level
ON rec_lh.parent_id = level.id
ORDER BY level.name ASC;
See Also:
Recursive query in PostgreSQL. SELECT *

SQL - Database Hierarchy Structure

I have a table with value like this
id | parent | folder_name
-------------------------
1 | 0 | Root
2 | 1 | NSW
3 | 1 | QLD
4 | 2 | Sydney
5 | 3 | Brisbane
From this table I want to get a folder together with all of its parents, up to the top level. Example: folder_name = Brisbane
id | parent | folder_name
-------------------------
5 | 3 | Brisbane
3 | 1 | QLD
1 | 0 | Root
I want to use JOIN in SQL, not a CTE.
Any help would be great.
Recursive CTE is what you are looking for
;WITH cte
AS (SELECT *
FROM Yourtable
WHERE folder_name = 'Brisbane'
UNION ALL
SELECT b.*
FROM cte a
INNER JOIN Yourtable b
ON a.parent = b.id)
SELECT *
FROM cte
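What the recursive CTE does is follow the parent pointer repeatedly until no matching row is left. A minimal Python sketch of that walk, using the rows from the question's table (the function name ancestors is hypothetical, and the sketch assumes the data contains no cycles):

```python
# (id, parent, folder_name) rows copied from the question's table.
FOLDERS = [
    (1, 0, "Root"),
    (2, 1, "NSW"),
    (3, 1, "QLD"),
    (4, 2, "Sydney"),
    (5, 3, "Brisbane"),
]


def ancestors(folder_name):
    """Return the folder row followed by each of its parents, up to the root."""
    by_id = {row[0]: row for row in FOLDERS}
    current = next((row for row in FOLDERS if row[2] == folder_name), None)
    chain = []
    while current is not None:
        chain.append(current)
        # Root has parent 0, which matches no row, so the walk stops there.
        current = by_id.get(current[1])
    return chain


for row in ancestors("Brisbane"):
    print(row)
```

Because the number of steps depends on the data, a fixed number of plain self-joins can only cover a fixed maximum depth, which is why recursion is the usual tool here.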

How to select hierarchy collection? (mixed with non hierarchy data, etc)

Having the table:
I need to show the following:
| ID | PERSONID | MASTERID | CHILDID | VALUE | DEPTHLEVEL |
---------------------------------------------------------------
| 1 | 3 | 78452 | 21456 | 100 | 1 |
| 2 | 3 | 21456 | | 0 | 2 |
| 3 | 3 | 652314 | 417859 | 115 | 1 |
| 4 | 3 | 417859 | | 0 | 2 |
| 5 | 4 | 998654 | 223655 | 300 | 1 |
| 6 | 4 | 223655 | | 0 | 2 |
| 7 | 4 | 201302 |789654,441592| 200 | 1 |
| 8 | 4 | 789654 | | 0 | 2 |
| 9 | 4 | 441592 | | 0 | 2 |
| 10 | 5 | 999852 | | 123 | 1 |
Look at the row with id 10: it has no relations (children). The row with id 7 has two children.
I need to zero out the value (set it to 0) for every child/leaf.
For rows 1-9 I tried the following query:
select v.* from
(
select v.id, v.personid,
case when level > 1
then 0
else
v.value
end thevalue,
v.masterid, v.childid, level depthlevel
from tmpsimpleexample v
start with v.childid is not null
connect by v.masterid = prior v.childid
) v
order by v.id
Results:
Look at the rows with id 7 and 8: this is the master with two children, and I need it in one row.
This is the first problem.
I also need to show the data with no hierarchy relation (id 10 in the expected result table, id 11 in the image of the table data).
I think I can query all rows whose masterid is not referenced by any childid and then union that with the first query (above).
The query for rows whose masterid is not referenced by a childid returns the rows without relations and the level 1 master rows.
select id, personid, value thevalue, masterid, childid, 1 depthlevel
from TMPSIMPLEEXAMPLE
where masterid not in
(select childid from TMPSIMPLEEXAMPLE where childid is not null)
Here I can do a union, and the result fits my requirements (except for concatenating the childid values on the master row).
select v.* from
(
select v.id, v.personid,
case when level > 1
then 0
else
v.value
end thevalue,
v.masterid, v.childid, level depthlevel
from tmpsimpleexample v
start with v.childid is not null
connect by v.masterid = prior v.childid
union
select id, personid, value thevalue, masterid, childid, 1 depthlevel
from TMPSIMPLEEXAMPLE
where masterid not in
(select childid from TMPSIMPLEEXAMPLE where childid is not null)
) v
order by v.id
Almost final result:
But given that my real table has hundreds of thousands of records, is a union like that a good approach?
I've taken a stab at what I think your source data looks like:
| ID | PERSONID | MASTERID | CHILDID | VALUE |
-----------------------------------------------
| 1 | 3 | 78452 | 21456 | 100 |
| 2 | 3 | 21456 | | -1 |
| 3 | 3 | 652314 | 417859 | 115 |
| 4 | 3 | 417859 | | -1 |
| 5 | 4 | 998654 | 223655 | 300 |
| 6 | 4 | 223655 | | -1 |
| 7 | 4 | 201302 | 441592 | 200 |
| 7 | 4 | 201302 | 789654 | 200 |
| 9 | 4 | 441592 | | -1 |
| 8 | 4 | 789654 | | -1 |
| 10 | 4 | 999852 | | 123 |
-----------------------------------------------
The following query gets you your desired results:
select id,
personid,
masterid,
listagg(childid, ',') within group (order by childid) childid,
-- Took a guess that all values for a personid were the same and didn't need to be aggregated...
min(decode(depthlevel, 1, value, null)) value,
min(depthlevel) depthlevel
from (select v.*, level depthlevel
from tmpsimpleexample v
connect by v.masterid = prior v.childid
-- Trick here is to start with all of the desired starting conditions...
start with not exists ( select 'X' from tmpsimpleexample v2 where v2.childid = v.masterid ))
group by id, personid, masterid;
If ordering of your CHILDID is important, you would need to re-join the nested view with TMPSIMPLEEXAMPLE:
select v1.id,
v1.personid,
v1.masterid,
listagg(v1.childid, ',') within group (order by v2.id) childid,
min(decode(depthlevel, 1, v1.value, null)) value,
min(depthlevel) depthlevel
from (select v.*, level depthlevel
from tmpsimpleexample v
connect by v.masterid = prior v.childid
start with not exists ( select 'X' from tmpsimpleexample v2 where v2.childid = v.masterid )) v1,
tmpsimpleexample v2
-- Outer Join is important!
where v1.childid = v2.masterid (+)
group by v1.id, v1.personid, v1.masterid;
The real magic here is the LISTAGG function. If you are not yet on 11g or better (why not?!?), then the following article can guide you in building your own aggregate function:
http://www.oracle-base.com/articles/misc/string-aggregation-techniques.php
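For readers without Oracle at hand, the shape of the first query (compute each row's depth, then group and concatenate the child ids) can be sketched in Python. This is only an illustration under stated assumptions: the rows are the guessed source data above, the function name summarize is hypothetical, and child rows get value 0 as in the question's expected output, whereas the decode in the Oracle query yields NULL for them:

```python
# (id, personid, masterid, childid, value) rows from the guessed source data above.
ROWS = [
    (1, 3, 78452, 21456, 100), (2, 3, 21456, None, -1),
    (3, 3, 652314, 417859, 115), (4, 3, 417859, None, -1),
    (5, 4, 998654, 223655, 300), (6, 4, 223655, None, -1),
    (7, 4, 201302, 441592, 200), (7, 4, 201302, 789654, 200),
    (9, 4, 441592, None, -1), (8, 4, 789654, None, -1),
    (10, 4, 999852, None, 123),
]


def summarize(rows):
    """Return one tuple per (id, personid, masterid) with child ids joined."""
    # childid -> masterid of the row that references it
    parent_of = {child: master for _, _, master, child, _ in rows if child is not None}

    def depth(masterid):
        level = 1
        while masterid in parent_of:  # climb until nobody references this id
            masterid = parent_of[masterid]
            level += 1
        return level

    grouped = {}
    for row_id, personid, masterid, childid, value in rows:
        level = depth(masterid)
        entry = grouped.setdefault(
            (row_id, personid, masterid),
            {"children": [], "value": value if level == 1 else 0, "depthlevel": level},
        )
        if childid is not None:
            entry["children"].append(childid)

    return [
        key + (",".join(str(c) for c in sorted(e["children"])), e["value"], e["depthlevel"])
        for key, e in sorted(grouped.items())
    ]


for line in summarize(ROWS):
    print(line)
```

The two source rows for id 7 collapse into a single row with both child ids, and the unrelated row with id 10 is kept at depth 1 without needing a separate union.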