Using CTE to determine a specific Hierarchical ID for Family Members - sql

I'm trying to figure out how to attach an incrementing ID to my resultset while using CTE.
My table has data like so:
PersonId ParentLinkId Relation Name
1 NULL F John Doe
2 1 S Jane Doe
3 1 C Jack Doe
4 1 C Jill Doe
I want to add a column called RelationId. The "F" person always gets 1, the "S" relation always gets 2, and each subsequent "C" relation gets 3, 4, 5, and so on.
The rows are linked by ParentLinkId: a member's ParentLinkId equals the PersonId of the "F" person.
I tried to use a CTE to recursively increment this value, but I keep getting stuck in an infinite loop.
I tried:
WITH FinalData( ParentId, ParentLinkId, Name, Relationship, RelationshipId) AS
(
SELECT ParentId
,ParentLinkId
,Name
,Relationship
,1
FROM FamTable
WHERE ParentLinkId IS NULL
UNION ALL
SELECT FT.ParentId
,ParentLinkId
,Name
,Relationship
,RelationshipId + 1
FROM FamTable FT
INNER JOIN FinalData ON FT.ParentLinkId = FinalData.ParentId
)
SELECT * FROM
FinalData
This is the result I keep on getting:
PersonId ParentLinkId Relation Name RelationshipId
1 NULL F John Doe 1
2 1 S Jane Doe 2
3 1 C Jack Doe 2
4 1 C Jill Doe 2
It should be
PersonId ParentLinkId Relation Name RelationshipId
1 NULL F John Doe 1
2 1 S Jane Doe 2
3 1 C Jack Doe 3
4 1 C Jill Doe 4
I think I'm getting close using CTE but any help or prod in the right direction would be greatly appreciated!

This sounds like a simple row_number():
select f.*,
row_number() over (partition by coalesce(ParentLinkId, PersonId)
order by (case when relation = 'F' then 1
when relation = 'S' then 2
when relation = 'C' then 3
end), PersonId
) as relationId
from famtable f;
Here is a db<>fiddle.
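To check the row_number() approach against the sample data, here is a minimal sketch in Python using the standard-library sqlite3 module. SQLite is only a stand-in for the asker's database (an assumption), and window functions need SQLite 3.25 or newer. The table and column names follow the question's data listing.

```python
import sqlite3

# In-memory SQLite database as a stand-in for the real server (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE FamTable ("
    "PersonId INTEGER, ParentLinkId INTEGER, Relation TEXT, Name TEXT)"
)
conn.executemany(
    "INSERT INTO FamTable VALUES (?, ?, ?, ?)",
    [
        (1, None, "F", "John Doe"),
        (2, 1, "S", "Jane Doe"),
        (3, 1, "C", "Jack Doe"),
        (4, 1, "C", "Jill Doe"),
    ],
)

# One partition per family: the "F" row has no ParentLinkId, so COALESCE
# falls back to its own PersonId. Inside a family, F sorts first, then S,
# then the C rows in PersonId order.
family_rows = conn.execute(
    """
    SELECT PersonId, ParentLinkId, Relation, Name,
           ROW_NUMBER() OVER (
               PARTITION BY COALESCE(ParentLinkId, PersonId)
               ORDER BY CASE Relation
                            WHEN 'F' THEN 1
                            WHEN 'S' THEN 2
                            WHEN 'C' THEN 3
                        END,
                        PersonId
           ) AS RelationId
    FROM FamTable
    ORDER BY PersonId
    """
).fetchall()

for row in family_rows:
    print(row)
```

On the four sample rows this numbers John, Jane, Jack, and Jill as 1, 2, 3, and 4, with no recursion involved.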

Related

How to delete rows after the item which equals to exact value?

I have the following dataframe
Block_id step name
1 1 Marie
1 2 Bob
1 3 John
1 4 Lola
2 1 Alex
2 2 John
2 3 Kate
2 4 Herald
3 1 Alec
3 2 Paul
3 3 Rex
As you can see, the data frame is sorted by block_id and then by step. Within each block_id, I want to delete the row where name is John and every row after it. So the desired output would be
Block_id step name
1 1 Marie
1 2 Bob
2 1 Alex
3 1 Alec
3 2 Paul
3 3 Rex
An updatable CTE with a cumulative conditional COUNT seems to be what you are after:
CREATE TABLE dbo.YourTable (BlockID int,
Step int,
[Name] varchar(10));
GO
INSERT INTO dbo.YourTable
VALUES(1,1,'Marie'),
(1,2,'Bob'),
(1,3,'John'),
(1,4,'Lola'),
(2,1,'Alex'),
(2,2,'John'),
(2,3,'Kate'),
(2,4,'Herald'),
(3,1,'Alec'),
(3,2,'Paul'),
(3,3,'Rex');
GO
WITH CTE AS(
SELECT COUNT(CASE [Name] WHEN 'John' THEN 1 END) OVER (PARTITION BY BlockID ORDER BY Step) AS Johns
FROM dbo.YourTable)
DELETE FROM CTE
WHERE Johns >= 1;
GO
SELECT *
FROM dbo.YourTable;
GO
DROP TABLE dbo.YourTable;
One method uses an updatable CTE:
with todelete as (
select t.*,
min(case when name = 'John' then step end) over (partition by block_id) as john_step
from t
)
delete from todelete
where step >= john_step;
Or, if you prefer, a correlated subquery:
delete from t
where step >= (select min(t2.step)
from t t2
where t2.block_id = t.block_id and t2.name = 'John'
);
For performance, both of these can take advantage of an index on (block_id, name, step).
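The correlated-subquery form can be tried on the sample data with a short Python sketch using the standard-library sqlite3 module. SQLite is a stand-in for the real database here (an assumption), and the table name t with columns block_id, step, and name is assumed from the question's listing.

```python
import sqlite3

# In-memory SQLite database as a stand-in for the real server (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (block_id INTEGER, step INTEGER, name TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [
        (1, 1, "Marie"), (1, 2, "Bob"), (1, 3, "John"), (1, 4, "Lola"),
        (2, 1, "Alex"), (2, 2, "John"), (2, 3, "Kate"), (2, 4, "Herald"),
        (3, 1, "Alec"), (3, 2, "Paul"), (3, 3, "Rex"),
    ],
)

# For each row, look up the first step in the same block where name is
# 'John'. Blocks without a John yield NULL, and "step >= NULL" is never
# true, so those blocks are left untouched.
conn.execute(
    """
    DELETE FROM t
    WHERE step >= (SELECT MIN(t2.step)
                   FROM t AS t2
                   WHERE t2.block_id = t.block_id
                     AND t2.name = 'John')
    """
)

remaining = conn.execute(
    "SELECT block_id, step, name FROM t ORDER BY block_id, step"
).fetchall()
for row in remaining:
    print(row)
```

Block 3 has no John, so all three of its rows survive, which matches the desired output in the question.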

Why does the cte return the error that it does not exist?

Here is my code
WITH CTE AS(
SELECT COUNT(CASE name WHEN 'John' THEN 1 END) OVER (PARTITION BY BlockID ORDER BY Step) AS Johns
FROM dbo.YourTable)
DELETE FROM CTE
WHERE Johns >= 1;
SELECT *
FROM dbo.YourTable;
It returns the following error when I run the code in the notebook:
ERROR: syntax error at or near "DELETE"
But I can't seem to find any mistake in the query.
When I try it in an online compiler, it returns the error that relation "cte" does not exist.
Maybe these errors are related?
Here is what I'm trying to do with the CTE:
My first table:
Block_id step name
1 1 Marie
1 2 Bob
1 3 John
1 4 Lola
2 1 Alex
2 2 John
2 3 Kate
2 4 Herald
3 1 Alec
3 2 Paul
3 3 Rex
As you can see, the data frame is sorted by block_id and then by step. Within each block_id, I want to delete the row where name is John and every row after it. So the desired output would be
Block_id step name
1 1 Marie
1 2 Bob
2 1 Alex
3 1 Alec
3 2 Paul
3 3 Rex
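PostgreSQL does not support deleting through a CTE the way SQL Server does, which is why it reports that relation "cte" does not exist.
Instead, create a CTE that returns, for each Block_id, the step of the first John.
Then join the table to the CTE: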
WITH cte AS (
SELECT Block_id, MIN(step) step
FROM tablename
WHERE name = 'John'
GROUP BY Block_id
)
DELETE FROM tablename t
USING cte c
WHERE c.Block_id = t.Block_id AND c.step <= t.step;
See the demo.

How to select all duplicate rows except original one?

Let's say I have a table
CREATE TABLE names (
id SERIAL PRIMARY KEY,
name CHARACTER VARYING
);
with data
id name
-------------
1 John
2 John
3 John
4 Jane
5 Jane
6 Jane
I need to select all duplicate rows by name except the original one. So in this case I need the result to be this:
id name
-------------
2 John
3 John
5 Jane
6 Jane
How do I do that in Postgresql?
You can use ROW_NUMBER() to identify the 'original' records and filter them out. Here is a method using a cte:
with Nums AS (SELECT id,
name,
ROW_NUMBER() over (PARTITION BY name ORDER BY ID ASC) RN
FROM names)
SELECT *
FROM Nums
WHERE RN <> 1; --Filter out rows numbered 1, 'originals'
Alternatively, exclude the lowest id for each name:
select * from names
where id not in (select min(id) from names
group by name);
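Both suggestions can be compared on the sample rows with a small Python sketch using the standard-library sqlite3 module. SQLite replaces PostgreSQL here purely for convenience (an assumption), so SERIAL becomes INTEGER PRIMARY KEY, and window functions need SQLite 3.25 or newer.

```python
import sqlite3

# In-memory SQLite database as a stand-in for PostgreSQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE names (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO names (id, name) VALUES (?, ?)",
    [(1, "John"), (2, "John"), (3, "John"),
     (4, "Jane"), (5, "Jane"), (6, "Jane")],
)

# Approach 1: number the rows per name and drop row 1 (the "original").
via_row_number = conn.execute(
    """
    WITH nums AS (
        SELECT id, name,
               ROW_NUMBER() OVER (PARTITION BY name ORDER BY id) AS rn
        FROM names
    )
    SELECT id, name FROM nums WHERE rn <> 1 ORDER BY id
    """
).fetchall()

# Approach 2: exclude the lowest id of every name.
via_not_in = conn.execute(
    """
    SELECT id, name FROM names
    WHERE id NOT IN (SELECT MIN(id) FROM names GROUP BY name)
    ORDER BY id
    """
).fetchall()

print(via_row_number)
print(via_not_in)
```

On this data both queries return ids 2, 3, 5, and 6. The NOT IN form is safe here because id is a primary key and therefore never NULL.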

postgresql statement prob

I have two tables, A and B, as follows.
A:
key type
0 t
1 f
2 t
3 f
4 t
5 t
.......
B:
key name
0 Mary
0 Tony
0 Krolik
1 Tom
2 Tony
3 Tony
3 Mary
3 Tom
4 Tony
4 Tim
5 Tim
5 Mary
5 Wuli
.....
I want to find the top n most frequent names among the keys whose type is 'f'.
For example, in A, keys 1 and 3 have type 'f'. Looking up keys 1 and 3 in table B gives 2 'Tom', 1 'Mary', and 1 'Tony':
1 Tom
3 Tony
3 Mary
3 Tom
If n = 1 and the tables are as shown above, the result should be 'Tom', because it occurs most often.
How can I write an SQL statement that satisfies this requirement?
I wrote something like the query below, but it is wrong. Can anyone help me? I assume n = 20.
SELECT DISTINCT TOP 20 name
FROM B
WHERE key IN (
SELECT key
FROM A
WHERE "type" = 'f'
)
GROUP BY name
ORDER BY DESC;
You don't seem to need aggregation. And the equivalent of TOP in Postgres is LIMIT or FETCH FIRST <n> ROWS ONLY:
SELECT name, key
FROM B
WHERE B.key IN (SELECT A.key
FROM A
WHERE "type" = 'f'
)
ORDER BY key;
This corresponds to the results presented in the question. Your description doesn't quite match those results.
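If the written description is what is actually wanted (the n most frequent names among keys of type 'f'), a grouped count with LIMIT is one way to get it. The following is a minimal sketch in Python using the standard-library sqlite3 module, with SQLite standing in for PostgreSQL (an assumption). The tie-break on name and the variable n are illustrative choices, not part of the question.

```python
import sqlite3

# In-memory SQLite database as a stand-in for PostgreSQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE A ("key" INTEGER, "type" TEXT)')
conn.execute('CREATE TABLE B ("key" INTEGER, name TEXT)')
conn.executemany(
    "INSERT INTO A VALUES (?, ?)",
    [(0, "t"), (1, "f"), (2, "t"), (3, "f"), (4, "t"), (5, "t")],
)
conn.executemany(
    "INSERT INTO B VALUES (?, ?)",
    [
        (0, "Mary"), (0, "Tony"), (0, "Krolik"), (1, "Tom"), (2, "Tony"),
        (3, "Tony"), (3, "Mary"), (3, "Tom"), (4, "Tony"), (4, "Tim"),
        (5, "Tim"), (5, "Mary"), (5, "Wuli"),
    ],
)

n = 1  # how many names to return (hypothetical parameter)

# Keep only B rows whose key has type 'f', count rows per name, and take
# the n most frequent names. Ties are broken alphabetically by name.
top_names = conn.execute(
    """
    SELECT name, COUNT(*) AS occurrences
    FROM B
    WHERE "key" IN (SELECT "key" FROM A WHERE "type" = 'f')
    GROUP BY name
    ORDER BY occurrences DESC, name
    LIMIT ?
    """,
    (n,),
).fetchall()

print(top_names)
```

With n = 1 this returns Tom with a count of 2, matching the example in the question. The same SELECT should carry over to PostgreSQL with LIMIT 20 in place of the parameter.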

SQL: Select criteria for two tables, compare 1 field, return using condition

These are the tables in the query. I want to compare ID_Skills between tblEmployeeCurrentSkills and tblSkillsRequired and, in the result of the select query, display each ID_Skills with a TrainingRequired column (Yes/No).
tblEmployeeCurrentSkills
ID_EmployeeCurrentSkills ID_Employee ID_Skills
1 1 1
2 1 2
3 2 1
tblSkillsRequired
ID_SkillsRequired ID_Employee ID_Skills ID_Position
1 1 1 1
2 1 2 1
3 1 3 1
4 2 3 2
tblSkills
ID_Skills Skill
1 Reading
2 Wiring
3 Stapling
tblPosition
ID_Position Position
1 Tech1
2 Stapler
tblEmployee
ID_Employee EmployeeName
1 Hannah
2 Bob
SQL for qrySkillsGap table - determines whether training is necessary
SELECT tblEmployee.[Employee Name],
tblSkillsRequired.ID_Skills,
tblSkills.Skill,
IIf([tblEmployeeCurrentSkills].[ID_Skills]
Like [tblSkillsRequired].[ID_Skills],"No","Yes") AS TrainingRequired
FROM (tblSkills
INNER JOIN tblSkillsRequired
ON tblSkills.ID_Skills = tblSkillsRequired.ID_Skills)
INNER JOIN (tblEmployee INNER JOIN tblEmployeeCurrentSkills
ON tblEmployee.ID_Employee = tblEmployeeCurrentSkills.ID_Employee)
ON tblSkills.ID_Skills = tblEmployeeCurrentSkills.ID_Skills;
This is the current output:
EmployeeName ID_Skill TrainingRequired
Hannah 1 No
Hannah 1 No
Hannah 2 No
Bob 1 No
Bob 1 No
I want it to display this:
EmployeeName ID_Skill TrainingRequired
Hannah 1 No
Hannah 2 No
Hannah 3 Yes
Bob 1 No
Bob 3 Yes
Thanks for any help!
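I was able to create the tables you provided and used a union to bring together the employee skills and required skills. Current skills are tagged 0 and required skills 1, so a minimum of 0 for an employee and skill means the employee already has that skill.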
SELECT te.EmployeeName
, emp.ID_Skills
, CASE WHEN MIN(emp.TrainingRequired) = 0 THEN 'No'
ELSE 'Yes'
END AS TrainingRequired
FROM dbo.tblEmployee AS te
JOIN (SELECT tecs.ID_Employee
, tecs.ID_Skills
, 0 AS TrainingRequired
FROM dbo.tblEmployeeCurrentSkills AS tecs
UNION
SELECT tsr.ID_Employee
, tsr.ID_Skills
, 1 AS TrainingRequired
FROM dbo.tblSkillsRequired AS tsr
) emp
ON te.ID_Employee = emp.ID_Employee
GROUP BY te.ID_Employee
, te.EmployeeName
, emp.ID_Skills
ORDER BY te.ID_Employee
, emp.ID_Skills
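The union query can be checked against the sample tables with a short Python sketch using the standard-library sqlite3 module. SQLite stands in for the answer's SQL Server here (an assumption), so the dbo. schema prefix is dropped; only the three tables the query touches are created.

```python
import sqlite3

# In-memory SQLite database as a stand-in for SQL Server (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE tblEmployee (ID_Employee INTEGER, EmployeeName TEXT);
    CREATE TABLE tblEmployeeCurrentSkills (
        ID_EmployeeCurrentSkills INTEGER, ID_Employee INTEGER,
        ID_Skills INTEGER);
    CREATE TABLE tblSkillsRequired (
        ID_SkillsRequired INTEGER, ID_Employee INTEGER,
        ID_Skills INTEGER, ID_Position INTEGER);
    INSERT INTO tblEmployee VALUES (1, 'Hannah'), (2, 'Bob');
    INSERT INTO tblEmployeeCurrentSkills VALUES
        (1, 1, 1), (2, 1, 2), (3, 2, 1);
    INSERT INTO tblSkillsRequired VALUES
        (1, 1, 1, 1), (2, 1, 2, 1), (3, 1, 3, 1), (4, 2, 3, 2);
    """
)

# Current skills carry flag 0 and required skills flag 1. After the UNION,
# MIN(flag) = 0 means the employee already holds that skill.
skills_gap = conn.execute(
    """
    SELECT te.EmployeeName,
           emp.ID_Skills,
           CASE WHEN MIN(emp.TrainingRequired) = 0 THEN 'No'
                ELSE 'Yes'
           END AS TrainingRequired
    FROM tblEmployee AS te
    JOIN (SELECT ID_Employee, ID_Skills, 0 AS TrainingRequired
          FROM tblEmployeeCurrentSkills
          UNION
          SELECT ID_Employee, ID_Skills, 1 AS TrainingRequired
          FROM tblSkillsRequired) AS emp
      ON te.ID_Employee = emp.ID_Employee
    GROUP BY te.ID_Employee, te.EmployeeName, emp.ID_Skills
    ORDER BY te.ID_Employee, emp.ID_Skills
    """
).fetchall()

for row in skills_gap:
    print(row)
```

On the sample data this yields the five rows the question asks for: Hannah needs training only for skill 3, and Bob only for skill 3.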