SQL statement : average - sql

My question: what is the average age at which a man becomes a grandpa for the first time? The result should be returned as average_age. A person becomes a grandpa on the day his first grandchild is born.
Relations:
human (name, gender, age)
parent (ParentName, ChildName) -> is subset of human(name).
Table:
I know that a grandpa is a person who appears in parentname and whose child (in childname) also appears in parentname (as the father) with children of his own in childname (the grandchildren). The problem is how to get the average age at which someone becomes a grandpa.
What I got so far:
SELECT AVG(age) as average_age
FROM human h JOIN
parent p
ON h.name = p.parentname
WHERE h.gender = 'm' AND p.parentname = p.childname AND h.name = p.parentname
Expected outcome:
average_age : 52

It is extremely unusual to store the AGE of people in a table, because that changes every day. The data should be stored as a date of birth.
This is an aggregation query, but you have to join the tables multiple times. To get grandparents, you need a self-join on the parent table. Then you need to bring in human for filtering and for the ages:
select avg(min_age * 1.0) as average_age
from (select min(h_grandparent.age - h_grandchild.age) as min_age
from parent p join -- p.parentname is the grandparent
parent pchild
on p.childname = pchild.parentname join
human h_grandparent
on p.parentname = h_grandparent.name join
human h_grandchild
on pchild.childname = h_grandchild.name
where h_grandparent.gender = 'm'
group by h_grandparent.name
) a

I would address this with an exists condition that filters on humans who have grandchildren:
select avg(age) avg_age_of_grandpas
from human h
where
gender = 'm'
and exists (
select 1
from parent p1
inner join parent p2 on p2.parentName = p1.childName
where p1.parentName = h.name
)
The exists condition ensures that the person has at least one child and one grandchild. The outer query then computes the average age of such humans. Given the information available in your table structure, this seems to me like the most logical approach. Unlike joins, using exists avoids duplicating the records (and getting wrong results in the average) when a person has more than one line of descendants.
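To see the duplication effect in practice, here is a small sketch using Python's sqlite3 module. The sample rows (names and ages) are made up for illustration and are not data from the question. Al has two lines of descendants, so a plain join counts his age twice, while exists counts him once:

```python
import sqlite3

# Hypothetical sample data: Al has two lines of descendants, Hal has one.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE human (name TEXT PRIMARY KEY, gender TEXT, age INTEGER);
CREATE TABLE parent (parentname TEXT, childname TEXT);
INSERT INTO human VALUES
  ('Al', 'm', 80), ('Bob', 'm', 55), ('Cy', 'm', 50), ('Dan', 'm', 30),
  ('Eve', 'f', 25), ('Hal', 'm', 60), ('Ian', 'm', 35), ('Jo', 'f', 10);
INSERT INTO parent VALUES
  ('Al', 'Bob'), ('Al', 'Cy'), ('Bob', 'Dan'), ('Cy', 'Eve'),
  ('Hal', 'Ian'), ('Ian', 'Jo');
""")

# Plain joins: one output row per (grandfather, grandchild) pair.
join_avg = conn.execute("""
SELECT AVG(h.age)
FROM human h
JOIN parent p1 ON p1.parentname = h.name
JOIN parent p2 ON p2.parentname = p1.childname
WHERE h.gender = 'm'
""").fetchone()[0]

# EXISTS: one output row per grandfather, however many grandchildren he has.
exists_avg = conn.execute("""
SELECT AVG(h.age)
FROM human h
WHERE h.gender = 'm'
  AND EXISTS (SELECT 1
              FROM parent p1
              JOIN parent p2 ON p2.parentname = p1.childname
              WHERE p1.parentname = h.name)
""").fetchone()[0]

print(join_avg, exists_avg)
```

With these rows the join averages 80, 80 and 60, while exists averages 80 and 60.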
If you want the age of the grandparent on the date their first grandchild was born, it is a bit more complicated. This should get you close to what you expect:
select avg(h.age - g.maxGrandChildAge) avg_age_of_grandpas
from human h
inner join (
select
p1.parentName grandParentName,
max(h1.age) maxGrandChildAge
from parent p1
inner join parent p2 on p2.parentName = p1.childName
inner join human h1 on h1.name = p2.childName
group by p1.parentName
) g
on g.grandParentName = h.name
where h.gender = 'm'

You can do this with two more joins, subtracting the age of the oldest grandchild:
SELECT AVG(p_age) average_age
FROM
(
SELECT h.name, h.age-MAX(h2.age) as p_age
FROM parent p1
LEFT JOIN parent p2 ON P1.childname = P2.parentname
INNER JOIN human h ON P1.parentname = h.name
INNER JOIN human h2 ON P2.ChildName = h2.name
WHERE h.gender = 'm' AND p2.childname IS NOT NULL
GROUP BY h.name, h.age
)pAges
Please note that a name is not a suitable key for this task, because names are not guaranteed to be unique.
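As a rough check of the "age minus oldest grandchild" idea, here is a sketch with Python's sqlite3 module. The rows are invented for illustration (they are not the asker's data) and were chosen so that the two grandfathers became grandpas at 50 and 54. Ann is included only to show that the gender filter drops grandmothers:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE human (name TEXT PRIMARY KEY, gender TEXT, age INTEGER);
CREATE TABLE parent (parentname TEXT, childname TEXT);
-- Hypothetical sample rows
INSERT INTO human VALUES
  ('Al', 'm', 80), ('Ann', 'f', 78), ('Bob', 'm', 55), ('Cy', 'm', 50),
  ('Dan', 'm', 30), ('Eve', 'f', 25),
  ('Hal', 'm', 64), ('Ian', 'm', 35), ('Jo', 'f', 10);
INSERT INTO parent VALUES
  ('Al', 'Bob'), ('Ann', 'Bob'), ('Al', 'Cy'), ('Bob', 'Dan'), ('Cy', 'Eve'),
  ('Hal', 'Ian'), ('Ian', 'Jo');
""")

average_age = conn.execute("""
SELECT AVG(first_age) AS average_age
FROM (
  -- one row per grandfather: his age minus the age of his oldest grandchild
  SELECT gp.age - MAX(gc.age) AS first_age
  FROM parent p1
  JOIN parent p2 ON p2.parentname = p1.childname
  JOIN human gp ON gp.name = p1.parentname
  JOIN human gc ON gc.name = p2.childname
  WHERE gp.gender = 'm'
  GROUP BY gp.name, gp.age
) t
""").fetchone()[0]

print(average_age)
```

Al's oldest grandchild is 30 (80 - 30 = 50) and Hal's is 10 (64 - 10 = 54), so the inner query yields exactly one row per grandfather before averaging.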

Related

"Column "parent_id" for "cte" is specified more than once" in SQL recursive query

I have 5 SQL tables with columns as follows:
tbl_request_listEmpB
listEmpB_id request_id
tbl_request_listEmpD
listEmpD_id request_id
tbl_employee
id, parent_id (this one refers to id in tbl_department)
tbl_department
id, parent_id (this one refers to the id of the parent department)
tbl_department_manager
department_id, manager_employee_id
As input data I have employee_id and request_id.
I need to figure out whether the employee has access to the request (whether he's a manager or not)
Here's the query which is supposed to return 1 if the current user is a manager, 0 otherwise
with reqEmployees as (
select listEmpB_id as employee_id
from tbl_request_listEmpB
where request_id = ${request_id}
union all --concatenate the two tables
select listEmpD_id
from tbl_request_listEmpD
where request_id = ${request_id}
),
cte as (
select e.parent_id, null as parent_id
from reqEmployees r
join tbl_employee e on e.id = r.employee_id -- get these employees' departments
union all
select d.id, d.parent_id
from cte
join tbl_department d on d.id = cte.parent_id -- and get parent departments
)
select case when exists (select 1
from cte
join tbl_department_manager dm on dm.department_id = cte.id
where dm.manager_employee_id = ${employee_id})
then 1 else 0 end;
Finally, there's the logic that I believe is implemented in the query above:
1. First we need to identify whether employee_id is a manager. If he is, find out of which departments. So we query tbl_department_manager by manager_employee_id (= employee_id from the input data) to get a list of corresponding department_id values and store them in a variable. If the query returns 0 departments, terminate and return false.
2. Based on request_id, we collect the ids of employees from both tbl_request_listEmpB and tbl_request_listEmpD. Later we refer to them as employee_id from reqEmployees.
3. Query tbl_employee by the ids retrieved in p.2 to get parent_id (the list of unique departments the employees belong to).
4. If there's a match between at least one department from p.1 and one from p.3, return true.
5. If not, we need to query tbl_department and recursively search for a match between at least one element from p.1 and one element in p.3.parent_id.
Here's what I mean
Consider the following chart
And here's the corresponding SQL table:
tbl_department (id, parent_id)
dep0 null
dep1 dep0
dep2 dep1
dep3 dep1
dep4 dep2
dep5 dep0
So, if the department list returned from p.1 is ['dep1'] (there might be more than one element, and we have to iterate through each of them), we need to return true ONLY if p.3 gives us dep1|dep2|dep3|dep4 (dep1 and its descendants). If it is ['dep2'], return true for dep2|dep4. So there should be at least one match between an element from p.1 and the recursive result from p.5. I hope I illustrated it in the clearest way possible.
Almost forgot - the query above gives me
"Column "parent_id" for "cte" is specified more than once"
But I don't think that it does what it's supposed to do, I need to rewrite it
Any help would be greatly appreciated
Without some sample data (and parameter values) and expected output for that data (with those parameter values), it's difficult to verify this solution.
I have assumed that your tbl_ou and tbl_department are in fact the same table.
Other than the CTE, it looks like the rest of your code should work. The CTE below now travels "both" directions through the hierarchy (upwards and downwards), finding both parents and children. Note that it only finds parents of parents and children of children, it doesn't find children of parents, for example, so no "siblings", "uncles" or whatever these records should be called!
You may need to cast both fields in the CTE seed record as the relevant data type. Based on the supplied data I have assumed that the datatype for department id (and therefore also for parent_id) is varchar(10).
cte as (
select
cast(e.parent_id as varchar(10)) as id,
cast(o.parent_id as varchar(10)) as parent_id,
0 as iteration
from
reqEmployees r
join tbl_employee e on e.id = r.employee_id
join tbl_department o on e.parent_id = o.id
--extra table here compared to earlier versions to allow us
--to traverse hierarchy in both directions
union all
select --This one finds "parent" departments (walks up the hierarchy)
o.id,
o.parent_id,
cte.iteration + 1
from
cte
join tbl_department o on o.id = cte.parent_id
where
cte.iteration >=0 --prevents siblings/uncles etc
union all
select --This one finds "child" departments (walks down the hierarchy)
o.id,
o.parent_id,
cte.iteration - 1
from
cte
join tbl_department o on o.parent_id = cte.id
where
cte.iteration <=0 --prevents siblings/uncles etc
)
You can test my script using this SQL Fiddle (updated).
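For comparison, here is a minimal sketch of the logic described in the question (walk up from the departments of the request's employees and check whether the given employee manages any department on the way). It runs on SQLite through Python's sqlite3 module, so the syntax differs slightly from other engines (SQLite uses the keyword WITH RECURSIVE). The employee ids, manager ids and request ids are hypothetical; only the department rows come from the question. The function name has_access is also made up for this sketch.

```python
import sqlite3

def has_access(conn, employee_id, request_id):
    """Return True if employee_id manages a department at or above
    the department of any employee attached to the request."""
    sql = """
    WITH RECURSIVE req_employees(employee_id) AS (
        SELECT listEmpB_id FROM tbl_request_listEmpB WHERE request_id = :req
        UNION ALL
        SELECT listEmpD_id FROM tbl_request_listEmpD WHERE request_id = :req
    ),
    ancestors(id, parent_id) AS (
        -- seed: the departments the request's employees belong to
        SELECT d.id, d.parent_id
        FROM req_employees r
        JOIN tbl_employee e ON e.id = r.employee_id
        JOIN tbl_department d ON d.id = e.parent_id
        UNION
        -- step: move up to the parent department
        SELECT d.id, d.parent_id
        FROM ancestors a
        JOIN tbl_department d ON d.id = a.parent_id
    )
    SELECT EXISTS (
        SELECT 1
        FROM ancestors a
        JOIN tbl_department_manager dm ON dm.department_id = a.id
        WHERE dm.manager_employee_id = :emp)
    """
    row = conn.execute(sql, {"req": request_id, "emp": employee_id}).fetchone()
    return bool(row[0])

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tbl_department (id TEXT PRIMARY KEY, parent_id TEXT);
CREATE TABLE tbl_employee (id INTEGER PRIMARY KEY, parent_id TEXT);
CREATE TABLE tbl_department_manager (department_id TEXT, manager_employee_id INTEGER);
CREATE TABLE tbl_request_listEmpB (listEmpB_id INTEGER, request_id INTEGER);
CREATE TABLE tbl_request_listEmpD (listEmpD_id INTEGER, request_id INTEGER);
INSERT INTO tbl_department VALUES
  ('dep0', NULL), ('dep1', 'dep0'), ('dep2', 'dep1'),
  ('dep3', 'dep1'), ('dep4', 'dep2'), ('dep5', 'dep0');
-- Hypothetical employees, managers and requests
INSERT INTO tbl_employee VALUES (10, 'dep4'), (11, 'dep5');
INSERT INTO tbl_department_manager VALUES ('dep1', 100), ('dep3', 102);
INSERT INTO tbl_request_listEmpB VALUES (10, 1);
INSERT INTO tbl_request_listEmpD VALUES (11, 2);
""")

print(has_access(conn, 100, 1))  # employee 10 sits in dep4, below dep1
print(has_access(conn, 100, 2))  # employee 11 sits in dep5, outside dep1
print(has_access(conn, 102, 1))  # dep3 is a sibling branch of dep2
```

Walking only upwards from the employees' departments is equivalent to asking whether those departments are descendants of a managed department, and it avoids the sibling problem entirely.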

how can I write a postgresql query to find someone's cousin in database?

Person(ID, name, gender, fatherID, motherID, spouseID);
These are my table's columns.
For example, if id = 5, how can I find this person's cousins?
I have to use only the person table. A cousin means a child of a sibling of someone's mother or father.
I tried to use nested queries, but there were too many queries to follow to get the result.
For example, this query finds someone's siblings:
SELECT name
FROM person
WHERE motherid = (SELECT motherid
FROM person
WHERE id = x)
AND fatherid = (SELECT fatherid
FROM person
WHERE id = x)
EXCEPT
(SELECT name FROM person WHERE id = x);
Maybe joining to the parents of the parents, then back to their children's children.
(untested notepad scribble)
SELECT DISTINCT
kid.name as kid,
cousin.name as cousin
FROM person kid
LEFT JOIN person AS parent
ON parent.id IN (kid.fatherid, kid.motherid)
LEFT JOIN person AS grandparent
ON grandparent.id IN (parent.fatherid, parent.motherid)
LEFT JOIN person AS auntcle
ON grandparent.id IN (auntcle.fatherid, auntcle.motherid)
AND auntcle.id != parent.id
LEFT JOIN person AS cousin
ON auntcle.id IN (cousin.fatherid, cousin.motherid)
WHERE cousin.fatherid != kid.fatherid AND cousin.motherid != kid.motherid -- redneck check
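Here is a sketch of the same join chain, restricted to one person and run on SQLite through Python's sqlite3 module. The family rows and the function name cousins_of are invented for illustration. This variant uses inner joins and leaves out the final WHERE check, because a comparison such as cousin.fatherid != kid.fatherid is not true when one side is NULL, which would discard cousins with an unknown parent:

```python
import sqlite3

def cousins_of(conn, person_id):
    """Children of the siblings of person_id's parents."""
    sql = """
    SELECT DISTINCT cousin.id, cousin.name
    FROM person kid
    JOIN person parent
      ON parent.id IN (kid.fatherid, kid.motherid)
    JOIN person grandparent
      ON grandparent.id IN (parent.fatherid, parent.motherid)
    JOIN person auntcle
      ON grandparent.id IN (auntcle.fatherid, auntcle.motherid)
     AND auntcle.id <> parent.id
    JOIN person cousin
      ON auntcle.id IN (cousin.fatherid, cousin.motherid)
    WHERE kid.id = ?
    ORDER BY cousin.id
    """
    return conn.execute(sql, (person_id,)).fetchall()

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE person (
    id INTEGER PRIMARY KEY, name TEXT, gender TEXT,
    fatherid INTEGER, motherid INTEGER, spouseid INTEGER)""")
# Hypothetical family: Dad and Aunt are siblings; Aunt has two children.
conn.executemany("INSERT INTO person VALUES (?, ?, ?, ?, ?, ?)", [
    (1, "Grandpa", "m", None, None, 2),
    (2, "Grandma", "f", None, None, 1),
    (3, "Dad", "m", 1, 2, 6),
    (4, "Aunt", "f", 1, 2, None),
    (5, "Kid", "m", 3, 6, None),
    (6, "Mom", "f", None, None, 3),
    (7, "Cousin A", "f", None, 4, None),
    (8, "Cousin B", "m", None, 4, None),
    (9, "Sibling", "f", 3, 6, None),
])

print(cousins_of(conn, 5))
```

The sibling (id 9) is not returned, because the aunt/uncle step explicitly excludes the kid's own parent.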

PostgreSQL: How do I get data from table `A` filtered by a column in table `B`

I want to fetch all parents in a school who have kids in a specific grade.
Below is a trimmed-down version of the tables.
TABLE students
id,
last_name,
grade_id,
school_id
TABLE parents_students
parent_id,
student_id
TABLE parents
id,
last_name,
school_id
I tried the below query but it doesn't really work as expected. It rather fetches all parents in a school disregarding the grade. Any help is appreciated. Thank you.
SELECT DISTINCT
p.id,
p.last_name,
p.school_id,
st.school_id,
st.grade_id,
FROM parents p
INNER JOIN students st ON st.school_id = p.school_id
WHERE st.grade_id = 118
AND st.school_id = 6
GROUP BY p.id,st.grade_id,st.school_id;
I would think:
select p.*
from parents p
where exists (select 1
from parents_students ps join
students s
on ps.student_id = s.id
where ps.parent_id = p.id and
s.grade_id = 118 and
s.school_id = 6
);
Your question says that you want information about the parents. If so, I don't see why you are including redundant information about the school and grade (it is redundant because the where clause specifies exactly what those values are).
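A small sketch on SQLite (via Python's sqlite3 module) shows why joining parents to students on school_id returns every parent in the school, while going through parents_students returns only the parents of matching kids. All rows below are made-up sample data:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE students (id INTEGER PRIMARY KEY, last_name TEXT,
                       grade_id INTEGER, school_id INTEGER);
CREATE TABLE parents (id INTEGER PRIMARY KEY, last_name TEXT, school_id INTEGER);
CREATE TABLE parents_students (parent_id INTEGER, student_id INTEGER);
-- Hypothetical rows
INSERT INTO parents VALUES (1, 'Smith', 6), (2, 'Jones', 6), (3, 'Lee', 6);
INSERT INTO students VALUES
  (10, 'Smith', 118, 6), (11, 'Smith', 118, 6),
  (12, 'Jones', 200, 6), (13, 'Lee', 118, 7);
INSERT INTO parents_students VALUES (1, 10), (1, 11), (2, 12), (3, 13);
""")

# Joining on school_id pairs every parent in school 6 with every
# grade-118 student in school 6, related or not.
school_join = [row[0] for row in conn.execute("""
SELECT DISTINCT p.id
FROM parents p
JOIN students st ON st.school_id = p.school_id
WHERE st.grade_id = 118 AND st.school_id = 6
ORDER BY p.id
""")]

# EXISTS through the link table keeps only parents of matching students.
via_link = conn.execute("""
SELECT p.id, p.last_name, p.school_id
FROM parents p
WHERE EXISTS (SELECT 1
              FROM parents_students ps
              JOIN students s ON ps.student_id = s.id
              WHERE ps.parent_id = p.id
                AND s.grade_id = 118
                AND s.school_id = 6)
ORDER BY p.id
""").fetchall()

print(school_join, via_link)
```

Parent 1 has two kids in grade 118 but still appears once, since exists never multiplies rows.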

SQL Server: querying hierarchical and referenced data

I'm working on an asset database that has a hierarchy. Also, there is a "ReferenceAsset" table, that effectively points back to an asset. The Reference Asset basically functions as an override, but it is selected as if it were a unique, new asset. One of the overrides that gets set, is the parent_id.
Columns that are relevant to selecting the hierarchy:
Asset: id (primary), parent_id
Asset Reference: id (primary), asset_id (foreignkey->Asset), parent_id (always an Asset)
---EDITED 5/27----
Sample relevant table data (after joins):
id | asset_id | name        | parent_id | milestone | type
 3 |        3 | suit        | null      | march     | shape
 4 |        4 | suit_banker | 3         | april     | texture
 5 |        5 | tie         | null      | march     | shape
 6 |        6 | tie_red     | 5         | march     | texture
 7 |        7 | tie_diamond | 5         | june      | texture
-5 |        6 | tie_red     | 4         | march     | texture
Rows with id < 0 (like the last row) are referenced assets. Referenced assets have a few columns that are overridden (in this case, only parent_id is important).
The expectation is that if I select all assets from april, I should do a secondary select to get the entire tree branches of the matching rows.
So initially the query match would result in:
 4 |        4 | suit_banker | 3         | april     | texture
Then, after the CTE, we get the complete hierarchy, and the result should be this (so far this is working):
 3 |        3 | suit        | null      | march     | shape
 4 |        4 | suit_banker | 3         | april     | texture
-5 |        6 | tie_red     | 4         | march     | texture
As you can see, the parent of id -5 is there, but what is missing and needed is the referenced asset and the parent of the referenced asset:
 5 |        5 | tie         | null      | march     | shape
 6 |        6 | tie_red     | 5         | march     | texture
Currently my solution works for this, but it is limited to a single depth of references (and I feel the implementation is quite ugly).
---Edited----
Here is my primary Selection Function. This should better demonstrate where the real complication lies: the AssetReference.
Select A.id as id, A.id as asset_id, A.name,A.parent_id as parent_id, A.subPath, T.name as typeName, A2.name as parent_name, B.name as batchName,
L.name as locationName,AO.owner_name as ownerName, T.id as typeID,
M.name as milestoneName, A.deleted as bDeleted, 0 as reference, W.phase_name, W.status_name
FROM Asset as A Inner Join Type as T on A.type_id = T.id
Inner Join Batch as B on A.batch_id = B.id
Left Join Location L on A.location_id = L.id
Left Join Asset A2 on A.parent_id = A2.id
Left Join AssetOwner AO on A.owner_id = AO.owner_id
Left Join Milestone M on A.milestone_id = M.milestone_id
Left Join Workflow as W on W.asset_id = A.id
where A.deleted <= @showDeleted
UNION
Select -1*AR.id as id, AR.asset_id as asset_id, A.name, AR.parent_id as parent_id, A.subPath, T.name as typeName, A2.name as parent_name, B.name as batchName,
L.name as locationName,AO.owner_name as ownerName, T.id as typeID,
M.name as milestoneName, A.deleted as bDeleted, 1 as reference, NULL as phase_name, NULL as status_name
FROM Asset as A Inner Join Type as T on A.type_id = T.id
Inner Join Batch as B on A.batch_id = B.id
Left Join Location L on A.location_id = L.id
Left Join Asset A2 on AR.parent_id = A2.id
Left Join AssetOwner AO on A.owner_id = AO.owner_id
Left Join Milestone M on A.milestone_id = M.milestone_id
Inner Join AssetReference AR on AR.asset_id = A.id
where A.deleted <= @showDeleted
I have a stored procedure that takes a temp table (#temp) and finds all the elements of the hierarchy. The strategy I employed was this:
1. Select the entire system hierarchy into a temp table (#treeIDs), represented by a comma-separated list of each entire tree branch
2. Get the entire hierarchy of assets matching the query (from #temp)
3. Get all reference assets pointed to by assets from the hierarchy
4. Parse the hierarchy of all reference assets
This works for now because reference assets are always the last item on a branch, but if they weren't, I think I would be in trouble. I feel like I need some better form of recursion.
Here is my current code, which is working, but I am not proud of it, and I know it is not robust (because it only works if the references are at the bottom):
Step 1. build the entire hierarchy
;WITH Recursive_CTE AS (
SELECT Cast(id as varchar(100)) as Hierarchy, parent_id, id
FROM #assetIDs
Where parent_id is Null
UNION ALL
SELECT
CAST(parent.Hierarchy + ',' + CAST(t.id as varchar(100)) as varchar(100)) as Hierarchy, t.parent_id, t.id
FROM Recursive_CTE parent
INNER JOIN #assetIDs t ON t.parent_id = parent.id
)
Select Distinct h.id, Hierarchy as idList into #treeIDs
FROM ( Select Hierarchy, id FROM Recursive_CTE ) parent
CROSS APPLY dbo.SplitIDs(Hierarchy) as h
Step 2. Select the branches of all assets that match the query
Select DISTINCT L.id into #RelativeIDs FROM #treeIDs
CROSS APPLY dbo.SplitIDs(idList) as L
WHERE #treeIDs.id in (Select id FROM #temp)
Step 3. Get all Reference Assets in the branches
(Reference assets have negative id values, hence the id < 0 part)
Select asset_id INTO #REFLinks FROM #AllAssets WHERE id in
(Select #AllAssets.asset_id FROM #AllAssets Inner Join #RelativeIDs
on #AllAssets.id = #RelativeIDs.id Where #RelativeIDs.id < 0)
Step 4. Get the branches of anything found in step 3
Select DISTINCT L.id into #extraRelativeIDs FROM #treeIDs
CROSS APPLY dbo.SplitIDs(idList) as L
WHERE
exists (Select #REFLinks.asset_id FROM #REFLinks WHERE #REFLinks.asset_id = #treeIDs.id)
and Not Exists (select id FROM #RelativeIDs Where id = #treeIDs.id)
I've tried to just show the relevant code. I am super grateful to anyone who can help me find a better solution!
--Gets all of the descendants of one root node. There could be more than one root,
--which would require revising the query a bit.
DECLARE @AssetID int = (select top 1 id from Asset where parent_id is null);
--The algorithm is relational recursion.
--The anchor gets the top level of the hierarchy we want. The Hierarchy column
--shows the row's place in the hierarchy from this query only,
--not the row's overall place in the table.
WITH Asset_Hierarchy (id, parent_id, LevelCode, Hierarchy)
AS
(
SELECT id, parent_id,
1 as LevelCode, CAST(id as varchar(max)) as Hierarchy
FROM Asset
WHERE id = @AssetID
UNION ALL
--joins back to the CTE to recursively retrieve the rows
--note that LevelCode is incremented on each iteration
SELECT a.id, a.parent_id,
h.LevelCode + 1 as LevelCode,
h.Hierarchy + '\' + CAST(a.id as varchar(20)) as Hierarchy
FROM Asset AS a
INNER JOIN Asset_Hierarchy AS h
--this gets children, since the parent_id of the child is the id
--of the current CTE row
ON a.parent_id = h.id
--to get parents instead, join ON a.id = h.parent_id
)
--return results from the CTE, joining to the Asset data to get the asset name
SELECT a.id, a.name,
h.LevelCode, h.Hierarchy
FROM Asset AS a
INNER JOIN Asset_Hierarchy AS h
ON a.id = h.id
ORDER BY h.Hierarchy;
--That is the structure you will want. I would need a little more clarification of your table structure.
It would help to know your underlying table structure. There are two approaches that should work, depending on your environment: SQL Server understands XML, so you could hold the hierarchy as an XML structure, or you could simply have a single table in which each row has a unique primary key id and a parentid. The parentid is a foreign key referencing id. The data for the node are just standard columns. You can use a CTE, or a function powering a computed column, to determine the degree of nesting for each node. The limitation is that a node can only have one parent.
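As an illustration of the single-table id/parentid approach, here is a sketch that computes the nesting level and a path for every node with a recursive CTE. It runs on SQLite through Python's sqlite3 module (so it uses WITH RECURSIVE and || for concatenation rather than SQL Server syntax), and the table is a simplified, assumed version of the question's Asset table holding only the non-reference rows:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Simplified, assumed schema: one row per asset, at most one parent each
CREATE TABLE asset (id INTEGER PRIMARY KEY, parent_id INTEGER, name TEXT);
INSERT INTO asset VALUES
  (3, NULL, 'suit'), (4, 3, 'suit_banker'), (5, NULL, 'tie'),
  (6, 5, 'tie_red'), (7, 5, 'tie_diamond');
""")

rows = conn.execute("""
WITH RECURSIVE tree(id, name, level, path) AS (
    -- anchor: every root node starts its own branch at level 1
    SELECT id, name, 1, CAST(id AS TEXT)
    FROM asset
    WHERE parent_id IS NULL
    UNION ALL
    -- step: attach each child to the row of its parent
    SELECT a.id, a.name, t.level + 1, t.path || '/' || a.id
    FROM asset a
    JOIN tree t ON a.parent_id = t.id
)
SELECT id, name, level, path
FROM tree
ORDER BY path
""").fetchall()

for row in rows:
    print(row)
```

Starting from all rows with a NULL parent, instead of a single root id, handles the case of more than one root without further changes.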

Better way to demand, in SQL, that a column contains every specified value

Imagine you have two tables, with a one to many relationship.
For this example, I will suggest that there are two tables: Person, and Homes.
The Person table holds a person's name and gives them an ID. The Homes table holds the association of homes to a person. PID joins to "Person.ID".
And, in this tiny DB, a person can have no homes, or many homes.
I hope I drew that right.
How do I write a select, that returns everyone with every specified house type?
Let's say these are valid "Types" in the homes table:
Cottage, Main, Mansion, Spaceport.
I want to return everyone, in the Person table, who has a spaceport and a Cottage.
The best I could come up with was this:
SELECT DISTINCT( p.name ) AS name
FROM person p
INNER JOIN homes h ON h.pid = p.id
WHERE 'spaceport' in (
SELECT DISTINCT( type ) AS type
FROM homes
WHERE pid = p.id
)
AND 'cottage' in (
SELECT DISTINCT( type ) AS type
FROM homes
WHERE pid = p.id
)
That works, but I'm pretty sure there has to be a better way.
The HAVING clause here will guarantee that the persons returned have both types, not just one or the other.
SELECT p.name
FROM person p
INNER JOIN homes h
ON p.id = h.pid
AND h.type IN ('spaceport', 'cottage')
GROUP BY p.name
HAVING COUNT(DISTINCT h.type) = 2
select * from homes;
home_id person_id type
--
1 1 cottage
2 1 mansion
3 2 cottage
4 3 mansion
5 4 cottage
6 4 cottage
To find the id numbers of every person who has both a cottage and a mansion, group by the id number, restrict the output to cottages and mansions, and count the distinct types.
select person_id
from homes
where type in ('cottage','mansion')
group by person_id
having count(distinct type) = 2;
person_id
--
1
You can use this query in a join to get all the columns from the person table.
select person.*
from person
inner join (select person_id
from homes
where type in ('cottage','mansion')
group by person_id
having count(distinct type) = 2) T
on person.person_id = T.person_id;
Thanks to Joe for pointing out an error in my count().
Not sure about the performance on this one, but here goes:
SELECT PID FROM (
SELECT PID, COUNT(PID) cnt FROM (
SELECT DISTINCT PID, Type FROM Homes
WHERE Type IN ('Type1', 'Type2', 'Type3')
) a
GROUP BY PID
) b
WHERE b.cnt = 3
You'd have to dynamically generate your IN clause as well as the WHERE b.CNT clause.
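One way to generate both the IN list and the count from the same input is sketched below, using Python's sqlite3 module with bound parameters. The function name people_with_all_types and the sample rows are assumptions made for this example; the table and column names follow the question (person.id, homes.pid, homes.type):

```python
import sqlite3

def people_with_all_types(conn, types):
    """Return (id, name) of every person who owns a home of each given type."""
    wanted = sorted(set(types))  # duplicates must not inflate the count
    if not wanted:
        return []
    placeholders = ", ".join("?" for _ in wanted)
    sql = f"""
        SELECT p.id, p.name
        FROM person p
        JOIN homes h ON h.pid = p.id
        WHERE h.type IN ({placeholders})
        GROUP BY p.id, p.name
        HAVING COUNT(DISTINCT h.type) = ?
        ORDER BY p.id
    """
    # Only placeholders are interpolated; the values themselves are bound.
    return conn.execute(sql, (*wanted, len(wanted))).fetchall()

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE homes (pid INTEGER, type TEXT);
-- Hypothetical rows; Ann owns two cottages and a spaceport
INSERT INTO person VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
INSERT INTO homes VALUES
  (1, 'cottage'), (1, 'spaceport'), (1, 'cottage'),
  (2, 'cottage'), (3, 'spaceport'), (3, 'mansion');
""")

print(people_with_all_types(conn, ["spaceport", "cottage"]))
print(people_with_all_types(conn, ["cottage"]))
```

Grouping by p.id as well as p.name keeps two different people with the same name apart, and COUNT(DISTINCT ...) means Ann's second cottage does not count twice.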