SQL Query - Count number of descendants and summarize with condition - sql

Consider a table with columns:
Manager   Manager ID   Headcount   Superior   Superior ID
=========================================================
A         1            3           C          123
B         2            4           D          345
The table describes a hierarchy within a company: each manager has a superior.
The task is to calculate the headcount at each level of the hierarchy under one condition:
a person is assigned the headcount from a lower branch ONLY if that person has at least 3 direct reports and the person on the lower branch does not meet this criterion.
This effectively assigns all the direct reports to the first person in the hierarchy who meets the criterion.
Visual aid to make things clearer:
I am blocked. I can create a hierarchy with a summary that tells me all the lower levels within each person's hierarchy and their respective headcounts. However, I fail to see how I can move further from there with SQL.
Code to create hierarchy:
cteHeadcount AS
(
    -- anchor: every manager starts its own chain
    SELECT
        m.manager_id as step_id,
        m.manager_id as id,
        m.superior_id,
        m.headcount,
        m.eligible,
        1 AS step
    FROM LineMngrs as m
    UNION ALL
    -- recursive step: add the managers reporting to the current node
    SELECT
        c.step_id,
        m.manager_id as id,
        c.superior_id,
        m.headcount,
        m.eligible,
        c.step + 1 as step
    FROM LineMngrs as m
    INNER JOIN cteHeadcount as c ON c.id = m.superior_id
)
Thanks for all the help and suggestions!
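One possible way forward, sketched here with Python's sqlite3 module so it can be run as is: instead of walking down from each manager, walk up from each manager until the first eligible person is reached, then sum the headcounts per "owner". This assumes that eligible means headcount >= 3, that the top of the hierarchy has a NULL superior_id, and that the sample rows below (which are made up) resemble the real data. The CTE name climb and its owner_* columns are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE LineMngrs (
    manager_id  INTEGER PRIMARY KEY,
    superior_id INTEGER,   -- NULL for the top of the hierarchy (assumption)
    headcount   INTEGER    -- people reporting directly to this manager
);
-- Made-up hierarchy: 2 and 3 report to 1, 4 reports to 2, 5 reports to 3.
INSERT INTO LineMngrs VALUES
    (1, NULL, 5), (2, 1, 2), (3, 1, 4), (4, 2, 1), (5, 3, 2);
""")

rollup_sql = """
WITH RECURSIVE climb (manager_id, headcount,
                      owner_id, owner_headcount, owner_superior) AS (
    -- anchor: every manager is provisionally the owner of its own headcount
    SELECT manager_id, headcount, manager_id, headcount, superior_id
    FROM LineMngrs
    UNION ALL
    -- step: while the current owner is not eligible, hand over to its superior
    SELECT c.manager_id, c.headcount,
           p.manager_id, p.headcount, p.superior_id
    FROM climb AS c
    JOIN LineMngrs AS p ON p.manager_id = c.owner_superior
    WHERE c.owner_headcount < 3
)
-- keep only the rows where the climb stopped at an eligible owner
SELECT owner_id, SUM(headcount) AS assigned_headcount
FROM climb
WHERE owner_headcount >= 3
GROUP BY owner_id
ORDER BY owner_id
"""
rollup = conn.execute(rollup_sql).fetchall()
print(rollup)  # [(1, 8), (3, 6)]
```

Manager 1 collects its own 5 plus the 2 and 1 from managers 2 and 4, neither of whom is eligible; manager 3 keeps its own 4 and collects the 2 from manager 5. Note that in this sketch a chain that never reaches an eligible person is dropped from the result, which may or may not be what is wanted.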

Related

Count the number of occurences in each bucket Redshift SQL

This might be difficult to explain, but I'm trying to write a Redshift SQL query that returns the count of organizations that fall into different market buckets. There are 50 markets. For example, company x can be found in only 1 market and company y can be found in 3 markets. I should mention that I have over 10,000 companies to fit into these buckets, so the real result would look more like, hypothetically, 500 companies found in 3 markets or 7 companies found in 50 markets.
The table would look like:
Market Bucket   Org Count
=========================
1 Markets       3
2 Markets       1
3 Markets       0
select count(distinct case when enterprise_account = true and (market_name then organization_id end) as "1 Market" from organization_facts
I was trying to formulate the query from above but I got confused on how to effectively formulate the query
Organization Facts
Market Name   Org ID   Org Name
===============================
New York      15683    Company x
Orlando       38478    Company y
Twin Cities   2738     Company z
Twin Cities   15683    Company x
Detroit       99       Company xy
You would need a sub-query that retrieves the number of markets per company, and an outer query that summarises into a count of markets.
Something like:
with markets as (
    select
        org_name,
        count(distinct market_name) as market_count
    from organization_facts
    group by org_name  -- one row per company
)
select
    market_count,
    count(*) as org_count
from markets
group by market_count
order by market_count
If I follow you correctly, you can do this with two levels of aggregation. Assuming that org_id represents a company in your dataset:
select cnt_markets, count(*) cnt_org_id
from (select count(*) cnt_markets from organization_facts group by org_id) t
group by cnt_markets
The subquery counts the number of markets per company. I assumed there are no duplicate (org_id, market_name) tuples in the table; if that's not the case, then you need count(distinct market_name) instead of count(*) in that spot.
Then, the outer query just counts how many times each market count occurs in the subquery, which yields the result that you want.
Note that I left out the enterprise_account column, which appears in your query but not in your data.
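To check the two-level aggregation against the sample rows from the question, here is a small sketch using Python's sqlite3 module rather than Redshift (the column names org_id and org_name are assumed, as in the answers above):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE organization_facts (market_name TEXT, org_id INTEGER, org_name TEXT)"
)
# Sample rows copied from the question.
conn.executemany(
    "INSERT INTO organization_facts VALUES (?, ?, ?)",
    [
        ("New York", 15683, "Company x"),
        ("Orlando", 38478, "Company y"),
        ("Twin Cities", 2738, "Company z"),
        ("Twin Cities", 15683, "Company x"),
        ("Detroit", 99, "Company xy"),
    ],
)

buckets = conn.execute("""
SELECT cnt_markets, COUNT(*) AS cnt_org_id
FROM (
    -- inner level: number of distinct markets per company
    SELECT COUNT(DISTINCT market_name) AS cnt_markets
    FROM organization_facts
    GROUP BY org_id
) t
-- outer level: number of companies per market count
GROUP BY cnt_markets
ORDER BY cnt_markets
""").fetchall()
print(buckets)  # [(1, 3), (2, 1)]
```

Three companies appear in one market and one company appears in two, matching the first two rows of the expected table. A bucket with no companies (such as "3 Markets") simply does not appear; producing zero rows for empty buckets would need a separate list of bucket values to join against.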

Error in getting the value from the database table

I've picked out the people who got a full score on challenges of different difficulty levels. However, the question asks for the hacker_id and name of the people who got a full score more than once. I'm running into a problem with COUNT: I tried to count how often each name appears in the table, but it wouldn't let me. I suspect there is something wrong with the GROUP BY syntax. Could anybody help me?
Previous Code
Select s.challenge_id,s.hacker_id,h.name,s.submission_id,c.difficulty_level,s.score
FROM (((Hackers AS h JOIN Submission AS s ON h.hacker_id=s.hacker_id)JOIN Challenges AS c ON c.challenge_id=s.challenge_id)JOIN Difficulty AS d ON d.difficulty_level=c.difficulty_level)
WHERE c.difficulty_level=d.difficulty_level and s.score=d.score
Result
challenge_id | hacker_id | name |submission_id |difficulty_level |score
71055 86870 Todd 94613 2 30
66730 90411 Joe 97397 6 100
71055 90411 Joe 97431 2 30
Problem
Select g.hacker_id,g.name,COUNT(g.name)
FROM (Select s.challenge_id,s.hacker_id,h.name,s.submission_id,c.difficulty_level,s.score
FROM (((Hackers AS h JOIN Submission AS s ON h.hacker_id=s.hacker_id)JOIN Challenges AS c ON c.challenge_id=s.challenge_id)JOIN Difficulty AS d ON d.difficulty_level=c.difficulty_level)
WHERE c.difficulty_level=d.difficulty_level and s.score=d.score) AS g
WHERE COUNT(g.name)>1
GROUBY g.hacker_id,g.name;
If you need COUNT of each name that appeared more than once in your first query then you can use the following
SELECT hacker_id, name, COUNT(name)
FROM
(
Select s.challenge_id,s.hacker_id,h.name,s.submission_id,c.difficulty_level,s.score
FROM (((Hackers AS h JOIN Submission AS s ON h.hacker_id=s.hacker_id)JOIN Challenges AS c ON c.challenge_id=s.challenge_id)JOIN Difficulty AS d ON d.difficulty_level=c.difficulty_level)
WHERE c.difficulty_level=d.difficulty_level and s.score=d.score
) AS T
GROUP BY hacker_id, name
HAVING COUNT(name) > 1
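WHERE filters individual rows before they are grouped, so an aggregate such as COUNT cannot appear there. HAVING is used to filter the aggregated result after GROUP BY.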
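A minimal sketch of the GROUP BY / HAVING pattern, run with Python's sqlite3 module. To keep it short, the table full_scores is a hypothetical stand-in for the result of the inner join query, loaded with the three rows shown in the question:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table standing in for the derived table of full-score submissions.
conn.execute(
    "CREATE TABLE full_scores (challenge_id INTEGER, hacker_id INTEGER, name TEXT)"
)
conn.executemany(
    "INSERT INTO full_scores VALUES (?, ?, ?)",
    [(71055, 86870, "Todd"), (66730, 90411, "Joe"), (71055, 90411, "Joe")],
)

repeat_winners = conn.execute("""
SELECT hacker_id, name, COUNT(*) AS full_score_count
FROM full_scores
GROUP BY hacker_id, name      -- one group per hacker
HAVING COUNT(*) > 1           -- keep only groups with more than one row
""").fetchall()
print(repeat_winners)  # [(90411, 'Joe', 2)]
```

Todd has a single full score and is filtered out by HAVING; Joe has two and is kept.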

Multiple Joins to get data and counts from multiple tables

This is probably really simple, but I have been struggling. Basically I need to combine 2 different queries:
Get a list of accounts plus some info for each
Based on each of those accounts, get the count of users and forms associated with each.
So given the following table structure:
I want to get back:
Name Users Forms Active
====================================
Child 1 3 4 T
Child 2 4 3 F
So the problem is that I want to query first based on the Master id:
Select * from ACCOUNT where MasterId = 1026
AccntId Name Master Id Active
====================================
2 Child 1 1026 T
3 Child 2 1026 F
Then for each of those returned I would like to get the counts of users and forms.
Select Count(AccntId) as Users from Form Where AccntId=2
And of course all in one query. I have messed around with joins and left joins, and the stumbling block is the initial query.
Ok, the final query for anyone who cares turned out to be:
SELECT
    A.Id as AccountId, A.Name, A.Active,
    -- the column alias goes outside each scalar subquery
    (select count(*) FROM UserProfile UP where A.Id = UP.AccountId) as Users,
    (select count(*) FROM Form F where A.Id = F.AccountId) as Forms
FROM
    Account A
WHERE
    A.MasterId = 1026
Which gave me ultimately the numbers I was looking for:
AccountId Name Active Users Forms
1 Child T 3 4
5 Child2 F 4 3
Not sure if that is the most efficient or proper approach, but it does work! Thanks for the hints from the commenters.
SELECT acc.MasterId, count(up.AccntId) as Users, count(f.AccntId) as Forms
from Account acc
left join UserProfile up
    on up.AccntId = acc.AccntId
left join Form f
    on f.AccntId = acc.AccntId
-- Where /* Your conditions */
group by acc.MasterId;
This should work.
Edit: joins changed to left join
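One thing worth checking with any join-based version: joining both child tables to the same account multiplies the rows (3 users x 4 forms gives 12 joined rows), so plain COUNT over the join overcounts. The sketch below, using Python's sqlite3 module, compares the correlated subqueries with a two-join query. The Id primary keys on UserProfile and Form are an assumption (the original table structure is not shown), and the row counts are made up to match the expected output.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Account (Id INTEGER PRIMARY KEY, Name TEXT, MasterId INTEGER, Active TEXT);
CREATE TABLE UserProfile (Id INTEGER PRIMARY KEY, AccountId INTEGER);  -- Id is assumed
CREATE TABLE Form (Id INTEGER PRIMARY KEY, AccountId INTEGER);         -- Id is assumed
INSERT INTO Account VALUES (2, 'Child 1', 1026, 'T'), (3, 'Child 2', 1026, 'F');
INSERT INTO UserProfile (AccountId) VALUES (2), (2), (2), (3), (3), (3), (3);
INSERT INTO Form (AccountId) VALUES (2), (2), (2), (2), (3), (3), (3);
""")

# Correlated scalar subqueries: each count is computed independently.
by_subquery = conn.execute("""
SELECT A.Name,
       (SELECT COUNT(*) FROM UserProfile UP WHERE UP.AccountId = A.Id) AS Users,
       (SELECT COUNT(*) FROM Form F WHERE F.AccountId = A.Id) AS Forms,
       A.Active
FROM Account A
WHERE A.MasterId = 1026
ORDER BY A.Id
""").fetchall()

# Two joins at once: every user row pairs with every form row of the account.
join_sql = """
SELECT A.Name, COUNT({d} UP.Id) AS Users, COUNT({d} F.Id) AS Forms
FROM Account A
LEFT JOIN UserProfile UP ON UP.AccountId = A.Id
LEFT JOIN Form F ON F.AccountId = A.Id
WHERE A.MasterId = 1026
GROUP BY A.Id, A.Name
ORDER BY A.Id
"""
by_join = conn.execute(join_sql.format(d="")).fetchall()
by_join_distinct = conn.execute(join_sql.format(d="DISTINCT")).fetchall()

print(by_subquery)       # [('Child 1', 3, 4, 'T'), ('Child 2', 4, 3, 'F')]
print(by_join)           # [('Child 1', 12, 12), ('Child 2', 12, 12)]
print(by_join_distinct)  # [('Child 1', 3, 4), ('Child 2', 4, 3)]
```

So if the join form is preferred, counting a distinct key from each child table avoids the inflated numbers; otherwise the subquery form is the simpler choice.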

Limiting records and putting restrictions

What I want to do is display all the rooms that still have capacity to accommodate students. The Rooms table has a "Capacity" field that specifies the number of students the room can accommodate.
My idea was to select all the student records and check, for each room, whether the number of students has already reached the limit in the "Capacity" column; if it has, the app shouldn't allow the user to insert the record. But I don't know how to do it. I have to assign students to rooms in such a way that no student is placed in a room whose occupants already fill the available seats.
What I tried:
select Student.StudentName,Student.RoomNumber,Rooms.RoomID
From Student
INNER JOIN Rooms
ON Student.RoomNumber=Rooms.RoomId
That's what I get, and it's not what I need.
What I need is this: jawad, hamid and asim are residents of room one, which has the capacity to accommodate only 3 students. I want to display the rooms that do have capacity to accommodate new students, and if a room already has as many student records as its capacity, the user must not be allowed to assign that room to a student.
You can group by the room:
select r.RoomNumber,
r.Capacity,
r.Capacity - count(s.Name) as RemainingCapacity
from Students s
join Rooms r
on r.RoomNumber = s.RoomNumber
group by r.RoomNumber, r.Capacity
This shows:
RoomNumber Capacity RemainingCapacity
1 2 1
2 3 -1
With these values:
Students:
Name RoomNumber
B 1
C 2
D 2
E 2
F 2
Rooms:
RoomNumber Capacity
1 2
2 3
select Student.StudentName,Student.RoomNumber
From Student
where Student.RoomNumber IN
( select Student.RoomNumber
From Student
INNER JOIN Rooms
ON Student.RoomNumber=Rooms.RoomId
Group by Student.RoomNumber, Rooms.Capacity
having COUNT(Student.RoomNumber) <= Rooms.Capacity)
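A sketch of a "rooms with free seats" query, run with Python's sqlite3 module on the sample values from the first answer. It starts from Rooms with a LEFT JOIN so that rooms with no students yet are still listed, which an inner join would miss. Room 3 is a made-up empty room added to show that case.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Rooms (RoomNumber INTEGER PRIMARY KEY, Capacity INTEGER);
CREATE TABLE Students (Name TEXT, RoomNumber INTEGER);
-- Rooms 1 and 2 come from the answer above; room 3 is a hypothetical empty room.
INSERT INTO Rooms VALUES (1, 2), (2, 3), (3, 2);
INSERT INTO Students VALUES ('B', 1), ('C', 2), ('D', 2), ('E', 2), ('F', 2);
""")

free_rooms = conn.execute("""
SELECT r.RoomNumber,
       r.Capacity,
       r.Capacity - COUNT(s.Name) AS RemainingCapacity
FROM Rooms r
LEFT JOIN Students s ON s.RoomNumber = r.RoomNumber  -- keeps empty rooms
GROUP BY r.RoomNumber, r.Capacity
HAVING COUNT(s.Name) < r.Capacity                    -- at least one seat left
ORDER BY r.RoomNumber
""").fetchall()
print(free_rooms)  # [(1, 2, 1), (3, 2, 2)]
```

The application could run the same check for a single room before inserting a student, and refuse the insert when that room is not in the result.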

Finding Least Common Ancestor from a Transitive Closure Table

I have a table representing the transitive closure of an organizational hierarchy (i.e., it's a tree with a single root):
create table ancestry (
ancestor integer,
descendant integer,
distance integer
);
I have another table that contains the organizations that each user is allowed to access:
create table accessible (
user integer,
organization integer
);
The system shows the user a roll-up of expenditures associated with each organization the user can access. I could always start by showing the user a view of the company (i.e., the root) showing the user a list of immediate child organizations and how much his organizations contribute to the total. In most cases, there would be a single child and the user would be required to drill-down several levels before seeing multiple children. I would prefer to start the presentation with the first organization that shows multiple children (i.e., the LCA).
For a given user, I can find the set of paths to the root easy enough but am having trouble finding the least common ancestor. I am using postgresql 9.1 but would prefer a solution that is database agnostic. In the worst case, I can pull the paths to root back into the application's code and calculate the LCA there.
I took a fresh look at this and developed the following solution. I used a common-table-expression to make it easier to understand how it operates but it could easily be written using a sub-query.
with
hit (id, count) as (
select
ancestry.ancestor
,count(ancestry.descendant)
from
accessible
inner join ancestry
on accessible.organization = ancestry.descendant
where
accessible.user = #user_id
group by
ancestry.ancestor
)
select
ancestry.descendant as lca
from
hit
inner join ancestry
on ancestry.descendant = hit.id
and ancestry.ancestor = #company_id
order by
hit.count desc
,ancestry.distance desc
limit 1
;
The hit CTE counts, for each organization in the hierarchy, the number of paths from a child to the root that traverse the organization. The LCA is then the organization with the most traversals. In the event of a tie, the organization farthest from the root (i.e., max(distance)) is the actual LCA. This is best illustrated with an example.
  A
  |
  B
 / \
C   D
Assume we wish to find the LCA of nodes C and D from the tree above. The hit CTE produces the following counts:
Node Count
A 2
B 2
C 1
D 1
The main query adds the distance:
Node Count Distance
A 2 0
B 2 1
C 1 2
D 1 2
The main query then orders the results by descending count and distance:
Node Count Distance
B 2 1
A 2 0
C 1 2
D 1 2
The LCA is the first item in the list.
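The worked example can be reproduced with a short sketch using Python's sqlite3 module. A few assumptions: the closure table contains a zero-distance self row for every node (the join on the root relies on the (root, root, 0) row), the nodes A to D are numbered 1 to 4, the user column is renamed user_id and the count column paths to stay clear of reserved or function names, and ? placeholders stand in for #user_id and #company_id. User 7 is made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE ancestry (ancestor INTEGER, descendant INTEGER, distance INTEGER);
CREATE TABLE accessible (user_id INTEGER, organization INTEGER);
-- Tree A(1) -> B(2) -> {C(3), D(4)}, stored as a closure with self rows.
INSERT INTO ancestry VALUES
    (1, 1, 0), (2, 2, 0), (3, 3, 0), (4, 4, 0),
    (1, 2, 1), (2, 3, 1), (2, 4, 1),
    (1, 3, 2), (1, 4, 2);
-- Hypothetical user 7 may access organizations C and D.
INSERT INTO accessible VALUES (7, 3), (7, 4);
""")

lca_sql = """
WITH hit (id, paths) AS (
    -- for each organization, count the accessible nodes at or below it
    SELECT ancestry.ancestor, COUNT(ancestry.descendant)
    FROM accessible
    JOIN ancestry ON accessible.organization = ancestry.descendant
    WHERE accessible.user_id = ?
    GROUP BY ancestry.ancestor
)
SELECT ancestry.descendant AS lca
FROM hit
JOIN ancestry
  ON ancestry.descendant = hit.id
 AND ancestry.ancestor = ?          -- distance is measured from the root
ORDER BY hit.paths DESC, ancestry.distance DESC
LIMIT 1
"""
lca = conn.execute(lca_sql, (7, 1)).fetchone()[0]
print(lca)  # 2, i.e. node B
```

A and B both lie on two paths, and B wins the tie because it is farther from the root.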
Just a hunch and not db agnostic (SQL Server) but adaptable
SELECT TOP 1
a1.ancestor
FROM ancestry a1
INNER JOIN
ancestry a2 ON a1.ancestor=a2.ancestor
WHERE a1.descendant = #Dec1
AND
a2.descendant = #Dec2
-- the nearest shared ancestor has the smallest distance
ORDER BY a1.distance ASC
If you want to put some data in SQLFiddle, I can have a play with it.