SQL - How to check if users are in the same hierarchy? - sql

I want to find out if users are directly in a parent child relation.
Given my user table schema
User_id | Parent_ID | Name
For example, I have a list of user_ids and I want to know if they are all in the same hierarchical tree.
I have tried using a recursive CTE.
Sample data
User_id | Parent_ID | Name
1 | | A
2 | 1 | B
3 | 2 | C
4 | 3 | D
5 | 2 | E
6 | | F
7 | 6 | G
user_id varchar(100)
parent_id varchar(100)
Desired result: Input [2,3,4] => Same Team
Input [2,3,7] => Not same team

Use the top-level parent's user_id as the hierarchy identifier:
with recursive hierarchies as (
select user_id, user_id as hierarchy_id
from ttable
where parent_id is null
union all
select c.user_id, p.hierarchy_id
from hierarchies p
join ttable c on c.parent_id = p.user_id
)
select * from hierarchies;
With that mapping of each user_id to a single hierarchy_id, you can join to your list of users.
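As a rough sketch of that join, here is the same recursive query run against the sample data with Python's built-in sqlite3 module. SQLite stands in for PostgreSQL here, the ids are stored as integers rather than varchar, and the helper names (make_db, same_hierarchy) are made up for illustration. The list counts as one hierarchy when every id is found and they all share a single hierarchy_id.

```python
import sqlite3

# Sample rows from the question: (user_id, parent_id, name)
ROWS = [(1, None, "A"), (2, 1, "B"), (3, 2, "C"), (4, 3, "D"),
        (5, 2, "E"), (6, None, "F"), (7, 6, "G")]

HIERARCHY_SQL = """
with recursive hierarchies(user_id, hierarchy_id) as (
  select user_id, user_id
    from ttable
   where parent_id is null
  union all
  select c.user_id, p.hierarchy_id
    from hierarchies p
    join ttable c on c.parent_id = p.user_id
)
select count(distinct hierarchy_id), count(*)
  from hierarchies
 where user_id in ({placeholders})
"""

def make_db():
    """Build an in-memory table holding the sample data (hypothetical helper)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "create table ttable (user_id integer primary key, parent_id integer, name text)")
    conn.executemany("insert into ttable values (?, ?, ?)", ROWS)
    return conn

def same_hierarchy(conn, user_ids):
    """True if every id exists and all of them map to one top-level parent."""
    ids = list(set(user_ids))
    placeholders = ", ".join("?" for _ in ids)
    trees, found = conn.execute(
        HIERARCHY_SQL.format(placeholders=placeholders), ids).fetchone()
    return found == len(ids) and trees == 1

conn = make_db()
print(same_hierarchy(conn, [2, 3, 4]))  # True
print(same_hierarchy(conn, [2, 3, 7]))  # False
```

Note that this treats siblings such as 3 and 5 as the same hierarchy, because they share the root 1.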
EDIT BEGINS
Since you added sample data and example results that do not match your original question, here is how the query above can be tweaked to match the newly added examples:
with recursive subhierarchies as (
select user_id, array[user_id] as path
from ttable
where parent_id is null
union all
select c.user_id, p.path||c.user_id as path
from subhierarchies p
join ttable c on c.parent_id = p.user_id
)
select d.user_ids, count(s.path) > 0 as same_team
from (values (array[2, 3, 4]), (array[2, 3, 6])) as d(user_ids)
left join subhierarchies s
on s.path @> d.user_ids
group by d.user_ids
;
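If you want to try the path idea without PostgreSQL arrays, here is a minimal sketch using Python's sqlite3 module. It is an adaptation, not the query above: SQLite has no array type, so the path is built as a slash-separated string and the containment check (the job of @> above) is done in Python with a set comparison. The helper name same_team and the integer ids are assumptions.

```python
import sqlite3

# Sample rows from the question: (user_id, parent_id, name)
ROWS = [(1, None, "A"), (2, 1, "B"), (3, 2, "C"), (4, 3, "D"),
        (5, 2, "E"), (6, None, "F"), (7, 6, "G")]

PATHS_SQL = """
with recursive subhierarchies(user_id, path) as (
  select user_id, '/' || user_id || '/'
    from ttable
   where parent_id is null
  union all
  select c.user_id, p.path || c.user_id || '/'
    from subhierarchies p
    join ttable c on c.parent_id = p.user_id
)
select path from subhierarchies
"""

def same_team(conn, user_ids):
    """True if some root-to-node path contains every requested id."""
    wanted = {str(u) for u in user_ids}
    for (path,) in conn.execute(PATHS_SQL):
        if wanted <= set(path.strip("/").split("/")):
            return True
    return False

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table ttable (user_id integer primary key, parent_id integer, name text)")
conn.executemany("insert into ttable values (?, ?, ?)", ROWS)

print(same_team(conn, [2, 3, 4]))  # True: all on the path 1/2/3/4
print(same_team(conn, [2, 3, 7]))  # False: 7 is in another tree
```

Unlike the hierarchy_id approach, siblings such as 3 and 5 are not the same team here, because no single path contains both.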

Related

Recursive delete of multiple rows in one table

I have a problem with my postgres database. I have a table Tasks with 3 columns: ID, Name and Parent_ID (which refers to another task id in this table):
id | name | parent_id
---+------+-----------
1 | A | 0
2 | B | 1
3 | C | 2
4 | D | 1
5 | E | 0
6 | F | 0
So basically it's like this:
1. A
2. B
3. C
4. D
5. E
6. F
What I'm trying to do is delete task A along with all of its children, all children of those children, and so on (in this case B and D, plus C, since it is a child of B, which gets deleted). It is something like a cascade delete, but I can't get it to work. Maybe a function would do it?
The result after delete should be
id | name | parent_id
---+------+-----------
5 | E | 0
6 | F | 0
Hope you guys can help me.
You only need a cascading FK-constraint:
\i tmp.sql
CREATE TABLE employees (
id serial PRIMARY KEY
, name VARCHAR NOT NULL
, parent_id INT REFERENCES employees(id) ON DELETE CASCADE
);
INSERT INTO employees
VALUES (1,'A', NULL),(2,'B', 1),(3,'C', 2),
(4,'D', 1),(5,'E', NULL),(6,'F', NULL);
DELETE FROM employees
WHERE name = 'A'
;
SELECT * FROM employees
;
----------
Result:
----------
DROP SCHEMA
CREATE SCHEMA
SET
CREATE TABLE
INSERT 0 6
DELETE 1
id | name | parent_id
----+------+-----------
5 | E |
6 | F |
(2 rows)
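The same idea can be tried quickly with Python's sqlite3 module. This is only a sketch under the assumption that SQLite stands in for PostgreSQL. SQLite needs PRAGMA foreign_keys = ON before it enforces the constraint, and the roots use NULL as parent_id (not 0), since 0 would not reference an existing row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite ignores foreign keys unless this is switched on per connection.
conn.execute("PRAGMA foreign_keys = ON")
conn.execute("""
    CREATE TABLE employees (
        id        INTEGER PRIMARY KEY,
        name      TEXT NOT NULL,
        parent_id INTEGER REFERENCES employees(id) ON DELETE CASCADE
    )""")
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?)",
    [(1, "A", None), (2, "B", 1), (3, "C", 2),
     (4, "D", 1), (5, "E", None), (6, "F", None)])

# Deleting A removes B and D through the constraint, and C through B.
conn.execute("DELETE FROM employees WHERE name = 'A'")

remaining = conn.execute(
    "SELECT id, name, parent_id FROM employees ORDER BY id").fetchall()
print(remaining)  # [(5, 'E', None), (6, 'F', None)]
```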
You can build the list of all descendant rows from the starting name with a recursive query, and then use it as a filter for the deletions.
with recursive cte(id, parent_id) as (
select id, parent_id from mytable where name = 'A'
union all
select t.id, t.parent_id from mytable t inner join cte c on c.id = t.parent_id
)
delete from mytable where id in (select id from cte)
Demo on DB Fiddle - table content after executing the query:
id | name | parent_id
-: | :--- | --------:
5 | E | 0
6 | F | 0
Use a recursive CTE to get the row with name = 'A' and all of its subordinates.
Then delete those rows from the table employees.
Here are the steps to create the sample table:
CREATE TABLE employees (
id serial PRIMARY KEY,
name VARCHAR NOT NULL,
parent_id INT
);
INSERT INTO employees
VALUES (1,'A', 0),(2,'B', 1),(3,'C', 2),
(4,'D', 1),(5,'E', 0),(6,'F', 0);
Query:
WITH RECURSIVE subordinates AS (
SELECT
id,
parent_id,
name
FROM
employees
WHERE
name = 'A'
UNION
SELECT
e.id,
e.parent_id,
e.name
FROM
employees e
INNER JOIN subordinates s ON s.id = e.parent_id
)
DELETE
FROM employees
WHERE id in (
SELECT
id
FROM
subordinates);
SELECT * FROM employees;
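For a quick local check of the recursive-delete approach, here is a small sketch using Python's sqlite3 module. SQLite is an assumption standing in for PostgreSQL (it also accepts a WITH RECURSIVE clause in front of DELETE), and delete_subtree is a hypothetical helper name. The sample data keeps 0 as the parent_id of root rows, as in the question.

```python
import sqlite3

DELETE_SQL = """
WITH RECURSIVE subordinates(id) AS (
    SELECT id FROM employees WHERE name = ?
    UNION
    SELECT e.id
      FROM employees e
      JOIN subordinates s ON s.id = e.parent_id
)
DELETE FROM employees WHERE id IN (SELECT id FROM subordinates)
"""

def delete_subtree(conn, name):
    """Delete the named row and all of its descendants, return what is left."""
    conn.execute(DELETE_SQL, (name,))
    conn.commit()
    return conn.execute(
        "SELECT id, name, parent_id FROM employees ORDER BY id").fetchall()

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employees (id INTEGER PRIMARY KEY, name TEXT NOT NULL, parent_id INTEGER)")
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?)",
    [(1, "A", 0), (2, "B", 1), (3, "C", 2), (4, "D", 1), (5, "E", 0), (6, "F", 0)])

remaining = delete_subtree(conn, "A")
print(remaining)  # [(5, 'E', 0), (6, 'F', 0)]
```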

Select many to many with hierarchical table

dbo.Tags
---------------
[TagsId]
dbo.TagsDetail
----------------
[TagsDetailId]
[TagsId]
[TagsGroupId]
dbo.TagsGroup (hierarchical table with 2 level)
----------------
[TagsGroupId]
[ParentId]
Tags
+--------+
| TagsId |
+--------+
| 1 |
| 2 |
+--------+
TagsDetails
+-------------+-----------+
| TagsId |TagsGroupId|
+-------------+-----------+
| 1 | 1 |
| 2 | 2 |
+-------------+-----------+
TagsGroup
+-------------+-----------+
| TagsGroupId | ParentId |
+-------------+-----------+
| 1 | null |
| 2 | null |
| 3 | 1 |
+-------------+-----------+
Input TagGroupsId = 2 => all taggroup(1, 2, 3)
How can I select all related TagsGroupIds given one input @TagsGroupId?
I tried to solve it by selecting all TagsDetailIds for @TagsGroupId, which gives me all the TagsIds. From those TagsIds I found all TagsDetailIds again, then got all related TagsGroupIds and their descendants, and for each TagsGroupId I started the loop again.
I don't know where I can stop to make sure I have selected all TagsGroupId.
Note: Using plurals is not typically how this is done -- so the column should be called TagId not TagsId and the table should be called TagGroup not TagsGroup. Just easier this way. It has no effect but that is the style convention everyone uses.
So as I understand it, a tag group can have up to two parents and can have children and grand children.
You can do it all in one query (with sub queries and joins) but I think a CTE will make the logic easier.
WITH Parent AS
(
SELECT TD.TagId
FROM TagGroup TG
JOIN TagDetail TD ON TD.TagGroupId = TG.ParentId
WHERE TG.TagGroupId = @TagGroupId
), GrandParent AS
(
SELECT TD.TagId
FROM TagGroup TG
JOIN TagDetail TD ON TD.TagGroupId = TG.ParentId
WHERE TG.TagGroupId IN (SELECT TagId FROM Parent)
), Child AS
(
SELECT TD.TagId
FROM TagGroup TG
JOIN TagDetail TD ON TD.TagGroupId = TG.TagGroupId
WHERE TG.ParentId = @TagGroupId
), GrandChild AS
(
SELECT TD.TagId
FROM TagGroup TG
JOIN TagDetail TD ON TD.TagGroupId = TG.TagGroupId
WHERE TG.ParentId IN (SELECT TagId FROM Child)
)
SELECT TagId
FROM Parent
UNION
SELECT TagId
FROM GrandParent
UNION
SELECT TagId
FROM Child
UNION
SELECT TagId
FROM GrandChild

SQL server matching two table on a column

I have two tables, one storing user skills and another storing the skills required for a job. I want to count how many of each user's skills match each job.
The table structure is
Table1: User_Skills
| ID | User_ID | Skill |
---------------------------
| 1 | 1 | .Net |
---------------------------
| 2 | 1 | Software|
---------------------------
| 3 | 1 | Engineer|
---------------------------
| 4 | 2 | .Net |
---------------------------
| 5 | 2 | Software|
---------------------------
Table2: Job_Skills_Requirement
| ID | Job_ID | Skill |
--------------------------
| 1 | 1 | .Net |
---------------------------
| 2 | 1 | Engineer|
---------------------------
| 3 | 1 | HTML |
---------------------------
| 4 | 2 | Software|
---------------------------
| 5 | 2 | HTML |
---------------------------
I was trying to have comma separated skills and compare but these can be in different order.
Edit
All the answers here are excellent. The result I am looking for is matching all jobs with all users as later on I will match other properties as well.
You could join the tables by the skill columns and count the matches:
SELECT user_id, job_id, COUNT(*) AS matching_skills
FROM user_skills u
JOIN job_skills_requirement j ON u.skill = j.skill
GROUP BY user_id, job_id
EDIT:
If you also want to show users and jobs that have no matching skills, you can use a full outer join instead.
SELECT user_id, job_id, COUNT(*) AS matching_skills
FROM user_skills u
FULL OUTER JOIN job_skills_requirement j ON u.skill = j.skill
GROUP BY user_id, job_id
EDIT 2:
As Jiri Tousek commented, the above query will produce nulls where there's no match between a user and a job. If you want a full Cartesian product between them, you could use (abuse?) the cross join syntax and count how many skills actually match between each user and each job:
SELECT user_id,
job_id,
COUNT(CASE WHEN u.skill = j.skill THEN 1 END) AS matching_skills
FROM user_skills u
CROSS JOIN job_skills_requirement j
GROUP BY user_id, job_id
If you want to match all users and all jobs, then Mureinik's otherwise excellent answer is not correct.
You need to generate all the rows first, which I would do using a cross join and then count the matching ones:
select u.user_id, j.job_id, count(jsr.job_id) as skills_in_common
from users u cross join
jobs j left join
user_skills us
on us.user_id = u.user_id left join
Job_Skills_Requirement jsr
on jsr.job_id = j.job_id and
jsr.skill = us.skill
group by u.user_id, j.job_id;
Note: This assumes the existence of a users and a jobs table. You can of course generate these using subqueries.
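Here is a minimal sketch of the subquery variant, run through Python's sqlite3 module (SQLite is an assumption, the question is about SQL Server). The users and jobs lists are derived with SELECT DISTINCT instead of separate tables. User 3 with the skill 'Cooking' is not in the question; it is a hypothetical row added only to show a pair with zero matches.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE user_skills (id INTEGER PRIMARY KEY, user_id INTEGER, skill TEXT);
CREATE TABLE job_skills_requirement (id INTEGER PRIMARY KEY, job_id INTEGER, skill TEXT);
INSERT INTO user_skills VALUES
  (1, 1, '.Net'), (2, 1, 'Software'), (3, 1, 'Engineer'),
  (4, 2, '.Net'), (5, 2, 'Software'),
  (6, 3, 'Cooking');  -- hypothetical extra user, not in the question
INSERT INTO job_skills_requirement VALUES
  (1, 1, '.Net'), (2, 1, 'Engineer'), (3, 1, 'HTML'),
  (4, 2, 'Software'), (5, 2, 'HTML');
""")

QUERY = """
SELECT u.user_id, j.job_id, COUNT(jsr.job_id) AS skills_in_common
FROM (SELECT DISTINCT user_id FROM user_skills) u
CROSS JOIN (SELECT DISTINCT job_id FROM job_skills_requirement) j
LEFT JOIN user_skills us
       ON us.user_id = u.user_id
LEFT JOIN job_skills_requirement jsr
       ON jsr.job_id = j.job_id AND jsr.skill = us.skill
GROUP BY u.user_id, j.job_id
ORDER BY u.user_id, j.job_id
"""

matches = conn.execute(QUERY).fetchall()
for row in matches:
    print(row)
# (1, 1, 2), (1, 2, 1), (2, 1, 1), (2, 2, 1), (3, 1, 0), (3, 2, 0)
```

COUNT(jsr.job_id) only counts rows where the second LEFT JOIN found a matching skill, which is why user 3 shows up with 0 instead of disappearing.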
WITH User_Skills(ID,User_ID,Skill)AS(
SELECT 1,1,'.Net' UNION ALL
SELECT 2,1,'Software' UNION ALL
SELECT 3,1,'Engineer' UNION ALL
SELECT 4,2,'.Net' UNION ALL
SELECT 5,2 ,'Software'
),Job_Skills_Requirement(ID,Job_ID,Skill)AS(
SELECT 1,1,'.Net' UNION ALL
SELECT 2,1,'Engineer' UNION ALL
SELECT 3,1,'HTML' UNION ALL
SELECT 4,2,'Software' UNION ALL
SELECT 5,2 ,'HTML'
),Job_User_Skill AS (
SELECT j.Job_ID,u.User_ID,u.Skill
FROM Job_Skills_Requirement AS j INNER JOIN User_Skills AS u ON u.Skill=j.Skill
)
SELECT jus.Job_ID,jus.User_ID,COUNT(jus.Skill),STUFF(c.Skills,1,1,'') AS Skill
FROM Job_User_Skill AS jus
CROSS APPLY(SELECT ','+j.Skill FROM Job_User_Skill AS j WHERE j.Job_ID=jus.Job_ID AND j.User_ID=jus.User_ID FOR XML PATH('')) c(Skills)
GROUP BY jus.Job_ID,jus.User_ID,c.Skills
ORDER BY jus.Job_ID
Job_ID User_ID Skill
----------- ----------- ----------- -------------
1 1 2 .Net,Engineer
1 2 1 .Net
2 1 1 Software
2 2 1 Software

SQL To Find All Descendents

I have a data table like this
Entities:
ID | Parent_ID
1 | null
2 | 1
3 | 1
4 | 3
5 | 4
6 | 4
I'd like a sql expression that will return a row for every entity and a linear descendant, plus a row for null if the entity has no descendants. So given the above data my result set would be:
Entity | Descendant
1 | 2
1 | 3
1 | 4
1 | 5
1 | 6
2 | null
3 | 4
3 | 5
3 | 6
4 | 5
5 | null
6 | null
I tried using a common table expression to achieve this, and think it's the way to do it, given its ability to recurse, but I couldn't get my head wrapped around the spawning of many rows for a single parent.
with all_my_children (my_father,my_id,my_descendant,level)
as
(
select parent_id,id,null,0
from Entities
where id not in (select parent_id from entities)
union all
select e.parent_id,e.id,amc.my_id,amc.level+1
from Entities e
inner join all_my_children amc
on e.id = amc.my_father
WHERE ????? --How do I know when I'm done? and How do I keep repeating parents for each descendant?
)
select my_id, my_descendant from all_my_children
Thanks for your time.
Here's what you asked for
WITH TEMP AS
(
SELECT ID AS ENTITY, PID AS DESCENDANTS
FROM YPC_BI_TEMP.DBO.SV7104
WHERE PID IS NULL
UNION ALL
SELECT PID AS ENTITY, ID AS DESCENDANTS
FROM YPC_BI_TEMP.DBO.SV7104
WHERE PID IS NOT NULL
UNION ALL
SELECT PRNT.ENTITY, CHILD.ID AS DESCENDANTS
FROM YPC_BI_TEMP.DBO.SV7104 AS CHILD
INNER JOIN TEMP AS PRNT
on PRNT.DESCENDANTS = CHILD.PID
--AND PRNT.ENTITY IS NOT NULL
)
SELECT DISTINCT ENTITY, DESCENDANTS FROM TEMP
UNION
SELECT ID AS ENTITY, NULL AS DESCENDANTS FROM YPC_BI_TEMP.DBO.SV7104
WHERE ID NOT IN (SELECT ENTITY FROM TEMP)
Deleted my previous answer, but I think this might do the trick...
WITH all_my_children (my_father,my_id,my_descendant,level) AS
(
SELECT parent_id, id, null, 0
FROM Entities
WHERE parent_id IS NULL -- the roots of your tree
UNION ALL
SELECT COALESCE(e2.parent_id, e.parent_id), e.id, amc.my_id, amc.level+1
FROM Entities e
JOIN all_my_children amc
ON e.parent_id = amc.my_id
LEFT JOIN Entities e2
ON e.id = e2.parent_id
)
SELECT * FROM all_my_children
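Another way to look at it, as a sketch only: start from every direct (parent, child) pair, keep extending each pair downward, and then LEFT JOIN the entities to the result so that leaves get a NULL row. The example below runs that against the sample data with Python's sqlite3 module. SQLite (and its WITH RECURSIVE keyword) is an assumption, SQL Server would use plain WITH, and the CTE name pairs is made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE entities (id INTEGER PRIMARY KEY, parent_id INTEGER)")
conn.executemany(
    "INSERT INTO entities VALUES (?, ?)",
    [(1, None), (2, 1), (3, 1), (4, 3), (5, 4), (6, 4)])

QUERY = """
WITH RECURSIVE pairs(entity, descendant) AS (
    -- anchor: every direct parent/child link
    SELECT parent_id, id FROM entities WHERE parent_id IS NOT NULL
    UNION ALL
    -- step: children of a known descendant belong to the same entity
    SELECT p.entity, e.id
      FROM pairs p
      JOIN entities e ON e.parent_id = p.descendant
)
SELECT e.id AS entity, p.descendant
  FROM entities e
  LEFT JOIN pairs p ON p.entity = e.id
 ORDER BY e.id, p.descendant
"""

result = conn.execute(QUERY).fetchall()
for row in result:
    print(row)
```

The recursion stops by itself once the rows added in the last step have no children, so no extra WHERE condition is needed. Entity 4 gets both 5 and 6 here, since both are its children in the sample data.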

Select record, and if it has children, select the newest child instead

I have a table:
element_id, element_parent_id
Records are:
1, null
2, 1
3, 1
4, null
So, visualization might look like:
1
2
3
4
The question is, how to select for:
form_id
3
4
...in other words: how to select parent if there is no children, or the newest child if those children exist. So far I managed to select for:
1 and 4
2 and 3
1, 2, 3 and 4
Just to be a bit more useful I'll explain my reasoning to arrive at the final query (at the bottom).
You are selecting two different entities from the same table, so you need a JOIN of the table against itself; this will give you parents and children.
But children may not be there and so it will have to be a LEFT JOIN.
SELECT p.id, c.id AS cid
FROM yourtable AS p
LEFT JOIN yourtable AS c ON (c.parent_id = p.id);
+------+------+
| id | cid |
+------+------+
| 1 | 3 | # This one...
| 1 | 2 | # ...and this one must be grouped, and 3 taken.
| 3 | NULL | <-- this must be ignored because it's a child
| 4 | NULL |
| 2 | NULL | <-- this must be ignored because it's a child
+------+------+
Now to refine, we see that we need to ignore children in the "parent" role and to this purpose we add WHERE p.parent_id IS NULL. "Real" children have a parent, and will then be skipped.
Note: this is the point where we would need to do something more complicated if we had a multi-level hierarchy. If we only wanted the
bottom level, i.e. the "true" children (ignore those parents that have
themselves a parent), we could for example run a second LEFT JOIN
to get the *grand*parents, and impose that the grandparent's id be not
NULL. This is only true for third level and greater grandchildren. Or
we could get the children of the children and impose that they have
NULL id, i.e. they don't exist; this is only true for the bottom or
last-level children. Other requirements could call for yet another set
of bounds.
Then you want the "top" child in relation to a parent id, and this calls for a GROUP BY, which will yield the desired result when children are there, or NULL when there aren't.
SELECT p.id, MAX(c.id) AS cid
FROM yourtable AS p
LEFT JOIN yourtable AS c ON (c.parent_id = p.id)
WHERE p.parent_id IS NULL
GROUP BY p.id;
+------+------+
| id | cid |
+------+------+
| 1 | 3 |
| 4 | NULL |
+------+------+
When cid is NULL (no children), we see that the value we want is in the parent-side column (id) instead.
And to choose which, I would resort to a SUBSELECT.
SELECT CASE WHEN cid IS NULL THEN id ELSE cid END AS wanted
FROM (
SELECT p.id, MAX(c.id) AS cid
FROM yourtable AS p
LEFT JOIN yourtable AS c ON (c.parent_id = p.id)
WHERE p.parent_id IS NULL
GROUP BY p.id
) AS x;
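For what it's worth, the CASE can also be written with COALESCE, which skips the subselect. Below is a small sketch against the question's data using Python's sqlite3 module. The table name elements is an assumption (the question only gives the column names), and "newest" is taken to mean the highest element_id, as in the queries above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE elements (element_id INTEGER PRIMARY KEY, element_parent_id INTEGER)")
conn.executemany(
    "INSERT INTO elements VALUES (?, ?)",
    [(1, None), (2, 1), (3, 1), (4, None)])

QUERY = """
SELECT COALESCE(MAX(c.element_id), p.element_id) AS wanted
  FROM elements AS p
  LEFT JOIN elements AS c ON c.element_parent_id = p.element_id
 WHERE p.element_parent_id IS NULL   -- only top-level rows act as parents
 GROUP BY p.element_id
 ORDER BY wanted
"""

wanted = [row[0] for row in conn.execute(QUERY)]
print(wanted)  # [3, 4]
```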