SQL Determine the teams having same members - sql

I have an interesting problem.
I have some teams, each with a team leader, stored in one table; the members of each team are stored in a child table. I want to determine which teams have the same members.
TEAMS
TEAM_ID LEADER_ID
1 1
2 1
3 2
4 2
MEMBERS
TEAM_ID MEMBER_ID
1 2
1 3
1 4
2 3
2 4
2 5
3 1
3 3
3 4
4 5
4 6
4 7
I was able to write this query to determine the formations and now I am clueless how to proceed.
SELECT
TEAM_ID,
(
SELECT
CONVERT (VARCHAR, MEMBER_ID) + ', '
FROM
(
SELECT
TEAM_ID,
LEADER_ID AS MEMBER_ID
FROM
TEAMS
UNION ALL
SELECT
TEAM_ID,
MEMBER_ID
FROM
MEMBERS
) FORMATIONS
WHERE
TEAM_ID = MT.TEAM_ID
ORDER BY
MEMBER_ID FOR XML PATH ('')
) AS MEMBERS
FROM
TEAMS MT
Since teams 1 and 3 clearly have the same members, how can the lowest ID of each group of duplicate teams be obtained?
i.e. the query should return the list of TEAM_IDs that are smallest per duplicate group (and only if they are duplicate)
In this scenario id 1 should be returned.
http://sqlfiddle.com/#!18/c845a/5

There are worse ways to approach this than stuffing the members into a string and comparing them. So, I'll follow the route you have started.
All you need to do is to combine the members from the two tables, and then use that for the logic:
with m as (
select team_id, member_id
from members
union -- on purpose to remove duplicates
select team_id, leader_id
from teams
)
select *
from (select team_id, members, count(*) over (partition by members) as num_teams
from (select t.team_id,
stuff( (select concat(',', m.member_id)
from m
where m.team_id = t.team_id
order by m.member_id
for xml path ('')
), 1, 1, ''
) as members
from teams t
) t
) t
where num_teams > 1
order by members;
Here is your SQL Fiddle.
Note that string comparison works fine for this case, which is an exact match of members. For superset relationships it doesn't work so well.
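The idea behind the query above (build one canonical, ordered representation of each team's members, then look for representations shared by more than one team) can be sketched outside SQL. This is only an illustration in Python using the sample data from the question; the function name `duplicate_team_groups` is made up for this sketch.

```python
from collections import defaultdict

# Sample data from the question: team -> leader, and (team, member) pairs.
teams = {1: 1, 2: 1, 3: 2, 4: 2}
members = [(1, 2), (1, 3), (1, 4), (2, 3), (2, 4), (2, 5),
           (3, 1), (3, 3), (3, 4), (4, 5), (4, 6), (4, 7)]


def duplicate_team_groups(teams, members):
    """Group team IDs by their full member set (leader included)."""
    roster = {team_id: {leader_id} for team_id, leader_id in teams.items()}
    for team_id, member_id in members:
        roster[team_id].add(member_id)

    groups = defaultdict(list)
    for team_id, people in roster.items():
        # A sorted tuple is a canonical key, like the ordered string in SQL.
        groups[tuple(sorted(people))].append(team_id)

    # Keep only member sets shared by more than one team.
    return {key: sorted(ids) for key, ids in groups.items() if len(ids) > 1}


groups = duplicate_team_groups(teams, members)
lowest_ids = sorted(min(ids) for ids in groups.values())
print(groups)
print(lowest_ids)
```

With the sample data, the only shared member set is the one belonging to teams 1 and 3, so the lowest duplicate ID is 1.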

Try the query below. It uses a CTE to get each team with its members as a comma-separated list.
That list is then used with GROUP BY to determine the lowest ID among teams with the same members. To ensure that only duplicated teams are returned, I used a HAVING clause.
;with cte as (
select team_id,
(select cast(member_id as varchar(5)) + ',' from #members innerMembers
where team_id = m.team_id
and not exists(select 1 from #TEAMS
where leader_id = innerMembers.member_id)
order by member_id
for xml path('')) members
from #members m
group by team_id
)
select min(team_id), members from cte
group by members
having count(*) > 1

Using pure SQL.
The main idea is that two sets A and B being equal is defined by A being a subset of B and B being a subset of A.
And we can check whether B is a subset of A by getting the members of B which are in A, counting them, and checking whether this equals the count in B.
As this is a somewhat complicated step I simply did it by cross applying a subquery filtered to teams A and B. There may be a more elegant way.
WITH MembersAll AS
(
SELECT Team_Id, Member_Id FROM Members
UNION
-- Consider leaders as members.
SELECT Team_Id, Leader_Id AS Member_Id FROM Teams
),
-- Teams and any teams which are a subset of that team:
TeamSubsetTeam AS (
SELECT
ThisTeam.Team_Id,
OtherTeam.Team_Id AS SubsetTeam_Id
FROM Teams AS ThisTeam
CROSS JOIN Teams AS OtherTeam -- Considering all pairs of teams.
CROSS APPLY (
-- Get the members in both teams,
-- left join so that we have all members from a given team
-- and all of the members in the other team that are in the given team
-- then filter on the counts of these being the same.
SELECT
COUNT(MembersThisTeam.Member_Id) AS MemberCountThisTeam,
COUNT(MembersOtherTeamInThisTeam.Member_Id) AS MemberCountOtherTeamInThisTeam
FROM MembersAll AS MembersThisTeam
LEFT JOIN MembersAll AS MembersOtherTeamInThisTeam
ON MembersThisTeam.Member_Id = MembersOtherTeamInThisTeam.Member_Id
AND MembersOtherTeamInThisTeam.Team_Id = OtherTeam.Team_Id
WHERE MembersThisTeam.Team_Id = ThisTeam.Team_Id
) MemberCounts
WHERE MemberCounts.MemberCountThisTeam = MemberCounts.MemberCountOtherTeamInThisTeam
),
-- Teams and any teams which are equivalent to that team (including itself):
TeamEquivalentTeam AS (
-- From set theory, team A is equivalent to team B if
-- team A is a subset of team B and
-- team B is a subset of team A.
SELECT
Team_Id,
SubsetTeam_Id AS EquivalentTeamId
FROM TeamSubsetTeam
WHERE Team_Id IN (
SELECT SubsetTeam_Id FROM TeamSubsetTeam AS SubsetTeamSubsetTeam
WHERE SubsetTeamSubsetTeam.Team_Id = TeamSubsetTeam.SubsetTeam_Id
)
)
-- The specified post-processing step.
-- Doesn't seem particularly useful but you can do whatever you like
-- now you have the information in TeamEquivalentTeam.
SELECT DISTINCT MIN(EquivalentTeamId) AS FirstEquivalentTeam
FROM TeamEquivalentTeam
GROUP BY Team_Id
Returns:
FirstEquivalentTeam
1
2
4
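The counting test for subsets and the "subset in both directions" definition of equality can be illustrated with a short Python sketch. It is not a translation of the SQL above, only the same reasoning applied to the question's sample data; `is_subset_by_count` and `equivalent_teams` are names invented for the example.

```python
# Sample data from the question: team -> leader, and (team, member) pairs.
teams = {1: 1, 2: 1, 3: 2, 4: 2}
members = [(1, 2), (1, 3), (1, 4), (2, 3), (2, 4), (2, 5),
           (3, 1), (3, 3), (3, 4), (4, 5), (4, 6), (4, 7)]

# Consider leaders as members, as the MembersAll CTE does.
roster = {team_id: {leader_id} for team_id, leader_id in teams.items()}
for team_id, member_id in members:
    roster[team_id].add(member_id)


def is_subset_by_count(a, b):
    """True if every member of a is also in b, using the counting test."""
    shared = sum(1 for person in a if person in b)
    return shared == len(a)


def equivalent_teams(roster):
    """All (team, other_team) pairs whose member sets are equal."""
    pairs = []
    for this_id, this_team in roster.items():
        for other_id, other_team in roster.items():
            if (is_subset_by_count(this_team, other_team)
                    and is_subset_by_count(other_team, this_team)):
                pairs.append((this_id, other_id))
    return pairs


pairs = equivalent_teams(roster)
# Lowest equivalent team per team, de-duplicated like SELECT DISTINCT MIN(...).
first_equivalent = sorted(
    {min(other for team, other in pairs if team == team_id) for team_id in roster}
)
print(first_equivalent)
```

As in the SQL result, teams without a duplicate are their own first equivalent, so 2 and 4 appear alongside 1.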

Related

Can you explain the difference between these two SQL queries? Codility SQL Exercise 3

I came up with a solution to the below scenario that generated the correct results with the test data, but when it was graded it only got 36% correct when using different data. Someone else asked for the solution to this problem here (How do i crack this SQL Soccer Matches assignment?) and I found Strange Coder's solution to be similar to mine. This solution got a 100%. What is the difference between them?
Set Up
You are given two tables, teams and matches, with the following structures:
create table teams (
team_id integer not null,
team_name varchar(30) not null,
unique(team_id)
);
create table matches (
match_id integer not null,
host_team integer not null,
guest_team integer not null,
host_goals integer not null,
guest_goals integer not null,
unique(match_id)
);
Each record in the table teams represents a single soccer team. Each record in the table matches represents a finished match between two teams. Teams (host_team, guest_team) are represented by their IDs in the teams table (team_id). No team plays a match against itself. You know the result of each match (that is, the number of goals scored by each team).
You would like to compute the total number of points each team has scored after all the matches described in the table. The scoring rules are as follows:
If a team wins a match (scores strictly more goals than the other team), it receives three points.
If a team draws a match (scores exactly the same number of goals as the opponent), it receives one point.
If a team loses a match (scores fewer goals than the opponent), it receives no points.
Write an SQL query that returns a ranking of all teams (team_id) described in the table teams. For each team you should provide its name and the number of points it received after all described matches (num_points). The table should be ordered by num_points (in decreasing order). In case of a tie, order the rows by team_id (in increasing order).
For example, for:
teams:
team_id team_name
10 Give
20 Never
30 You
40 Up
50 Gonna
matches:
match_id host_team guest_team host_goals guest_goals
1 30 20 1 0
2 10 20 1 2
3 20 50 2 2
4 10 30 1 0
5 30 50 0 1
your query should return:
team_id team_name num_points
20 Never 4
50 Gonna 4
10 Give 3
30 You 3
40 Up 0
My Solution
SELECT t.team_id, t.team_name, COALESCE(SUM(num_points), 0) AS num_points
FROM(
SELECT t.team_id, t.team_name,
(CASE WHEN m.host_goals > m.guest_goals THEN 3
WHEN m.host_goals = m.guest_goals THEN 1
WHEN m.host_goals < m.guest_goals THEN 0
END) AS num_points
FROM teams t
JOIN matches m
ON t.team_id = m.host_team
UNION
SELECT t.team_id, t.team_name,
(CASE WHEN m.guest_goals > m.host_goals THEN 3
WHEN m.guest_goals = m.host_goals THEN 1
WHEN m.guest_goals < m.host_goals THEN 0
END) AS num_points
FROM teams t
JOIN matches m
ON t.team_id = m.guest_team
) AS c
RIGHT JOIN teams t
ON t.team_id = c.team_id
GROUP BY t.team_id, t.team_name
ORDER BY COALESCE(SUM(num_points), 0) DESC, t.team_id
Strange Coder's Solution
How do i crack this SQL Soccer Matches assignment?
From Strange Coder
select team_id, team_name,
coalesce(sum(case when team_id = host_team then
(
case when host_goals > guest_goals then 3
when host_goals = guest_goals then 1
when host_goals < guest_goals then 0
end
)
when team_id = guest_team then
(
case when guest_goals > host_goals then 3
when guest_goals = host_goals then 1
when guest_goals < host_goals then 0
end
)
end), 0) as num_points
from Teams
left join Matches
on
Teams.team_id = Matches.host_team
or Teams.team_id = Matches.guest_team
group by team_id, team_name
order by num_points desc, team_id;
I have figured it out. I should have used UNION ALL instead of UNION.
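Why UNION loses points is easy to miss with the sample data: UNION removes duplicate rows, so two matches that give the same team the same number of points collapse into one row before the SUM. The sketch below reproduces this with Python's built-in sqlite3 module. The two matches are hypothetical data chosen to trigger the problem (team 10 wins twice at home), and the query is a simplified variant of the one in the question, not the graded solution.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table teams (
        team_id integer not null,
        team_name varchar(30) not null,
        unique(team_id)
    );
    create table matches (
        match_id integer not null,
        host_team integer not null,
        guest_team integer not null,
        host_goals integer not null,
        guest_goals integer not null,
        unique(match_id)
    );
    -- Hypothetical data: team 10 wins twice at home against team 20.
    insert into teams values (10, 'Give'), (20, 'Never');
    insert into matches values (1, 10, 20, 2, 0), (2, 10, 20, 1, 0);
""")

QUERY = """
    select t.team_id, coalesce(sum(c.num_points), 0) as num_points
    from teams t
    left join (
        select host_team as team_id,
               case when host_goals > guest_goals then 3
                    when host_goals = guest_goals then 1
                    else 0 end as num_points
        from matches
        {set_op}
        select guest_team,
               case when guest_goals > host_goals then 3
                    when guest_goals = host_goals then 1
                    else 0 end
        from matches
    ) c on c.team_id = t.team_id
    group by t.team_id
    order by num_points desc, t.team_id
"""


def ranking(set_op):
    """Run the ranking query with either 'union' or 'union all'."""
    return conn.execute(QUERY.format(set_op=set_op)).fetchall()


with_union = ranking("union")          # duplicate (10, 3) rows collapse
with_union_all = ranking("union all")  # every match is counted
print(with_union)
print(with_union_all)
```

With UNION, team 10 ends up with 3 points instead of 6, because its two identical win rows are merged into one.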
Alternative solution: you can simply unpivot your results with CROSS APPLY instead of using UNION. There is also no need to handle losses in your CASE expression, as you're simply going to SUM() the results and a loss adds nothing to the total either way.
Calculate Total Points per Team
DROP TABLE IF EXISTS #Team
DROP TABLE IF EXISTS #Match
CREATE TABLE #Team (team_id INT, team_name VARCHAR(100))
INSERT INTO #Team VALUES (10,'Give'),(20,'Never'),(30,'You'),(40,'Up'),(50,'Gonna')
CREATE TABLE #Match (match_id INT,host_team INT,guest_team INT,host_goals INT,guest_goals INT)
INSERT INTO #Match VALUES
(1,30,20,1,0)
,(2,10,20,1,2)
,(3,20,50,2,2)
,(4,10,30,1,0)
,(5,30,50,0,1)
;WITH cte_TotalPoints AS
(
SELECT C.team_id,SUM(C.Points) AS TotalPoints
FROM #Match AS A
CROSS APPLY (
SELECT host_points = CASE
WHEN A.host_goals > A.guest_goals THEN 3
WHEN A.host_goals = A.guest_goals THEN 1
END
,guest_points = CASE
WHEN A.guest_goals > A.host_goals THEN 3
WHEN A.host_goals = A.guest_goals THEN 1
END
) AS B
CROSS APPLY (
VALUES
(host_team,host_points)
,(guest_team,guest_points)
) AS C(team_id,points)
GROUP BY c.team_id
)
SELECT A.team_id
,A.team_name
,TotalPoints = ISNULL(TotalPoints,0)
FROM #Team AS A
LEFT JOIN cte_TotalPoints AS B
ON A.team_id = B.team_id
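The unpivot step can be pictured as turning each match row into two (team, points) rows and then summing per team. Here is a rough Python sketch of that idea with the sample data from the question. The helper names are invented, and unlike the query above it also applies the ordering the exercise asks for.

```python
teams = {10: "Give", 20: "Never", 30: "You", 40: "Up", 50: "Gonna"}
# (match_id, host_team, guest_team, host_goals, guest_goals)
matches = [
    (1, 30, 20, 1, 0),
    (2, 10, 20, 1, 2),
    (3, 20, 50, 2, 2),
    (4, 10, 30, 1, 0),
    (5, 30, 50, 0, 1),
]


def points(own_goals, other_goals):
    """Three points for a win, one for a draw, none for a loss."""
    if own_goals > other_goals:
        return 3
    if own_goals == other_goals:
        return 1
    return 0


def ranking(teams, matches):
    # Start every team at 0 so teams without matches still appear.
    totals = {team_id: 0 for team_id in teams}
    for _match_id, host, guest, host_goals, guest_goals in matches:
        # Unpivot: one match becomes two (team, points) rows.
        unpivoted = (
            (host, points(host_goals, guest_goals)),
            (guest, points(guest_goals, host_goals)),
        )
        for team_id, pts in unpivoted:
            totals[team_id] += pts
    # Order by points descending, then team_id ascending.
    return sorted(
        ((team_id, teams[team_id], pts) for team_id, pts in totals.items()),
        key=lambda row: (-row[2], row[0]),
    )


result = ranking(teams, matches)
for row in result:
    print(row)
```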

In SQL, how can i segment users by number of items they have? (redshift)

I'm not a SQL expert so apologies if this is actually really simple.
I have a table that lists users and the different questionnaires they have taken. Users can take questionnaires in any order and take as many as they like. There are a total of 7 available and I want to get a view of how many have taken 1 out of 7, 2 of 7, 3 of 7 etc etc
So a really rough example is the table might look like this:
And I want a query that will show me:
count Users with 1 Q: 1
count Users with 2 Q: 2
count Users with 3 Q: 0
count Users with 4 Q: 0
count Users with 5 Q: 1
count Users with 6 Q: 0
count Users with 7 Q: 0
You can do this with two levels of aggregation:
select cnt_questionnaires, count(*) cnt_users
from (
select count(*) cnt_questionnaires from mytable group by userID
) t
group by cnt_questionnaires
IF OBJECT_ID('tempdb..#t') IS NOT NULL DROP TABLE #t ;
create table #t (userid INT, q nvarchar(32));
insert into #t
values
(1,'Q1'),
(1,'Q3'),
(2,'Q2'),
(3,'Q1'),
(3,'Q2'),
(3,'Q3'),
(3,'Q4'),
(3,'Q5'),
(4,'Q2'),
(4,'Q3')
-- select * from #t
SELECT
v.qCount,
Count(c.userid) uCount
FROM
(VALUES (1),(2),(3),(4),(5),(6),(7)) v(qCount)
LEFT JOIN (
select
userid, count(q) qCount
from
#t
group by userid
) c ON c.qCount = v.qCount
GROUP BY
v.qCount
Assuming you have user_id on each row, the challenge is getting the zero values. Redshift is not very flexible when it comes to creating tables. Assuming your source data has enough rows, you can use:
select n.n, count(u.user_id) as num_users
from (select row_number() over () as n
from t
limit 7
) n left join
(select user_id, count(*) as cnt
from t
group by user_id
) u
on n.n = u.cnt
group by n.n
order by n.n;
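The two levels of aggregation, plus the zero-filling for counts that no user has, can be sketched in a few lines of Python. This is only an illustration of the logic, using the sample rows from the #t example above and assuming 7 questionnaires in total, as stated in the question.

```python
from collections import Counter

# (userid, questionnaire) rows, as in the #t sample above.
rows = [
    (1, "Q1"), (1, "Q3"),
    (2, "Q2"),
    (3, "Q1"), (3, "Q2"), (3, "Q3"), (3, "Q4"), (3, "Q5"),
    (4, "Q2"), (4, "Q3"),
]

# Level 1: questionnaires per user.
per_user = Counter(user_id for user_id, _questionnaire in rows)

# Level 2: users per questionnaire count.
users_per_count = Counter(per_user.values())

# Zero-fill: list every possible count from 1 to 7, even if no user has it.
report = [(n, users_per_count.get(n, 0)) for n in range(1, 8)]
for n, num_users in report:
    print(f"count Users with {n} Q: {num_users}")
```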

Query to select distinct values from different tables and not have them repeat (show them as a flat file)

I'm trying to get all phones, emails, and organizations for a person and show it in a flat file format. There should be n number of rows, where n is the max count of organizations, emails, or phones. NULL values will be shown once all values have been shown in the rows, with NULL being the last values. The emails and phones can only have 1 PreferredInd per person. I want these to be on the same row (1 of them can be NULL). I've tried to do this on a more complex query, but couldn't get it to work, so I've started over using this simpler example.
Example tables and values:
#ContactPerson
Id Name
1 John Doe
#ContactEmail
Id PersonId Email PreferredInd
1 1 johndoe@us.gov 0
2 1 jdoe@us.gov 1
3 1 johndoe@gmail.com 0
#ContactPhone
Id PersonId Phone PreferredInd
1 1 888-867-5309 0
2 1 305-476-5234 1
#ContactOrganization
Id PersonId Organization
1 1 US Government
2 1 US Army
I want a resulting set to look like:
Name Organization PreferredInd Email Phone
John Doe US Government 1 jdoe@us.gov 888-867-5309
John Doe US Army 0 johndoe@us.gov 305-467-5234
John Doe NULL 0 johndoe@gmail.com NULL
The complete sql code that I have for this example is here on pastebin. It also includes code to create the sample tables. It works when the count of emails exceeds the count of organizations or phones, but that won't always be true. I can't seem to figure out how to get the result that I'm looking for. The actual tables I'm working with can have 0 or infinity emails, phones, or organizations per person. There will also be many more values, but I can fix that myself.
Can you help me fix my query or show me a simpler way to do it? If you have any questions, just let me know and I can try to answer them.
something like this?
with cte_e as (
select
*,
row_number() over(partition by PersonId order by PreferredInd desc, Id) as rn
from ContactEmail
), cte_p as (
select
*,
row_number() over(partition by PersonId order by PreferredInd desc, Id) as rn
from ContactPhone
), cte_o as (
select
*,
row_number() over(partition by PersonId order by Organization) as rn
from ContactOrganization
), cte_d as (
select distinct rn, PersonId from cte_e union
select distinct rn, PersonId from cte_p union
select distinct rn, PersonId from cte_o
)
select
pr.Name, o.Organization, e.Email, p.Phone
from cte_d as d
left outer join ContactPerson as pr on pr.Id = d.PersonId
left outer join cte_e as e on e.PersonId = d.PersonId and e.rn = d.rn
left outer join cte_p as p on p.PersonId = d.PersonId and p.rn = d.rn
left outer join cte_o as o on o.PersonId = d.PersonId and o.rn = d.rn
sql fiddle demo
it's a bit clumsy, I can think of couple of other possible ways to do this, but I think this one is most readable one
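The core of this approach is to number each person's emails, phones and organizations independently and then line the lists up by that number, padding the shorter ones with NULL. A small Python sketch of the same alignment follows, using the sample data from the question. The `flat_rows` helper is hypothetical, and the ordering rules are the ones used in the query above (preferred first for emails and phones, alphabetical for organizations).

```python
from itertools import zip_longest

people = {1: "John Doe"}
# (Id, PersonId, Email, PreferredInd)
emails = [(1, 1, "johndoe@us.gov", 0), (2, 1, "jdoe@us.gov", 1),
          (3, 1, "johndoe@gmail.com", 0)]
# (Id, PersonId, Phone, PreferredInd)
phones = [(1, 1, "888-867-5309", 0), (2, 1, "305-476-5234", 1)]
# (Id, PersonId, Organization)
organizations = [(1, 1, "US Government"), (2, 1, "US Army")]


def flat_rows(person_id):
    """One row per rank; shorter lists are padded with None (NULL)."""
    def preferred_first(row):
        return (-row[3], row[0])

    my_emails = sorted((r for r in emails if r[1] == person_id),
                       key=preferred_first)
    my_phones = sorted((r for r in phones if r[1] == person_id),
                       key=preferred_first)
    my_orgs = sorted((r for r in organizations if r[1] == person_id),
                     key=lambda r: r[2])

    rows = []
    # zip_longest plays the role of joining on the row number.
    for org, email, phone in zip_longest(my_orgs, my_emails, my_phones):
        rows.append((
            people[person_id],
            org[2] if org else None,
            email[2] if email else None,
            phone[2] if phone else None,
        ))
    return rows


result = flat_rows(1)
for row in result:
    print(row)
```

The number of output rows is the length of the longest of the three lists, whichever one that happens to be.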
Step 1
Write a query that does the full join of all the tables, which will end up with lots of duplicate rows for each person (for each email or phone number)
Step 2
Write a second query that uses GroupBy to group the rows, and that uses the Case or Decode keywords (like a c# switch statement) to find the preferred row value and select it as the value to display

Count number of records with specific values

I have a table:
Table Teams
Id_team member_1 member_2 member_3
1 Alice Ben
2 Ben
3 Charles Alice Ben
4 Ben Alice
I will need to know in how many different teams Alice is a member (doesn't count if she is the first member, second or third). In my sample, the right answer is 2 (with Ben in Id_team 1 and 4, with Ben and Charles in Id_team = 3). Thank you!
You have to count the "Alice" values in each column separately to ensure they are distinct per column.
What you appear to be checking is:
SELECT
COUNT(DISTINCT CASE WHEN member_1 = 'Alice' THEN member_1 END) +
COUNT(DISTINCT CASE WHEN member_2 = 'Alice' THEN member_2 END) +
COUNT(DISTINCT CASE WHEN member_3 = 'Alice' THEN member_3 END)
FROM tablename
WHERE 'Alice' IN(member_1, member_2, member_3);
Update: fixed COUNT
Okay, so you want the same teams with different positions (e.g. Alice&Ben, Ben&Alice) to count as one.
To do this, for each position Alice can occupy, put the other two members in ascending order, then count the distinct results (this returns 2 for your example):
SELECT COUNT(*) FROM
(
SELECT
least( member_2, member_3) AS l,
greatest(member_2, member_3) AS g
FROM teams
WHERE
member_1 = 'Alice'
UNION
SELECT
least( member_1, member_3) AS l,
greatest(member_1, member_3) AS g
FROM teams
WHERE
member_2 = 'Alice'
UNION
SELECT
least( member_1, member_2) AS l,
greatest(member_1, member_2) AS g
FROM teams
WHERE
member_3 = 'Alice'
) q
;
Note that this only works for the special case of 3-member teams, because least and greatest can pick out the two other members; for a member count of 4 or more, a more complex solution is needed.
You can try to concatenate the fields (sorted alphabetically) in order to turn them into a list of strings.
Then run a distinct on this list (so it will list all separate teams)
Then search how many strings contains Alice
From this the hardest is the "concat alphabetically", as I couldn't really find any good function to do it, but a GROUP_CONCAT with a separate SELECTs and UNIONs to convert the fields into rows should do it:
SELECT COUNT(*)
FROM (
SELECT DISTINCT team_as_string
FROM (
SELECT id tid, GROUP_CONCAT(q ORDER BY q ASC SEPARATOR ',') team_as_string
FROM (
SELECT id, member_1 q FROM teams
UNION SELECT id, member_2 q FROM teams
UNION SELECT id, member_3 q FROM teams
/* add more fields if needed */
) c
GROUP BY tid
) b
) a
WHERE team_as_string LIKE '%Alice%'
I haven't checked it for syntax errors, but it should be fine logically. Tested and gives the correct answer (2)
This can be enhanced for more members, if needed.
Of course if the members are in a separate join table, then the whole group_concat part can be simplified.
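The "sort the members, de-duplicate, then look for Alice" idea works for any number of member columns. Below is a minimal Python sketch with the sample table from the question, assuming empty cells are NULL. It uses a set per team rather than a concatenated string, which also avoids false matches that a LIKE '%Alice%' test could produce for names such as "Alicent". The function name is made up for the example.

```python
# (Id_team, member_1, member_2, member_3); None marks an empty cell.
teams = [
    (1, "Alice", "Ben", None),
    (2, "Ben", None, None),
    (3, "Charles", "Alice", "Ben"),
    (4, "Ben", "Alice", None),
]


def distinct_teams_with(name, teams):
    """Count distinct member sets that contain the given name."""
    # A frozenset ignores member order, so Alice&Ben equals Ben&Alice.
    rosters = {frozenset(m for m in row[1:] if m) for row in teams}
    return sum(1 for roster in rosters if name in roster)


alice_teams = distinct_teams_with("Alice", teams)
print(alice_teams)
```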
I will need to know in how many different teams Alice is a member
Try this:
SELECT 'Alice', COUNT(id_team)
FROM tablename
WHERE 'Alice' IN(member_1, member_2, member_3);
The result:
| ALICE | THECOUNT |
--------------------
| Alice | 3 |
Fiddle Demo.
If id_team is not unique, use COUNT(DISTINCT id_team).

SQL - Assignment in CASE inside a SELECT in a Recursive Query

I'm creating a SQL Server 2008 query that would output the list of employees in a company along with the team they are on with an additional column.
Example of the org tree:
Level 0: CEO
Level 1: A, B, and C
Level 2:
For A:1,2,3
For B:4,5,6
For C:7,8,9
In my resulting set, I should see three columns -- name, level (of the tree), and team. For 1,2, and 3, I'd see 'A' as their team and 2 as the level. For 4,5, and 6, 'B' and 2 for the level and so on.
I'm using a recursive query to navigate the tree (no problems there), but since I need to "carry" the team name down the query (in case there's a level 8 -- it should still show the person in level 1 they report to), I'm doing this:
(...)
UNION ALL
-- Recursive Member Definition
-- in here level increments one each time, and the team should output the child
-- of the top manager
SELECT A.treenodeid, A.parentnodeid, A.email, LEVEL+1, team =
CASE LEVEL
When 1 then SET @salead = A.Email
Else @salead
END
FROM XX as A
INNER JOIN TeamsTable as B on A.parentnodeid = b.treenodeID
Since I'm trying to use a CASE to check if the level is 1 (to update the team name to whatever the team lead's email name is), SQL keeps saying that in the case I have "Incorrect syntax near SET".
Is it possible to do this sort of assignment in a CASE? I've looked around and haven't found if this can work with my recursive case.
Here's all the query (assuming that the root is 'JohnSmith'):
WITH TeamsTable (treenodeid, parentnodeid, email, Level, team)
AS
(
-- Anchor - Level starts with 0, and the team is empty for the top manager
SELECT treenodeid,parentnodeid,email,0,''
FROM XX WHERE email = 'JohnSmith'
UNION ALL
-- Recursive Member Definition - in here level increments one each time, and the team should output the child of the top manager
SELECT
A.treenodeid, A.parentnodeid, A.email, LEVEL+1, team =
CASE LEVEL
When 1 then SET @salead = A.Email
Else @salead
END
FROM XX as A
INNER JOIN TeamsTable as B on A.parentnodeid = b.treenodeID
)
-- Statement that executes the CTE
SELECT *
FROM TeamsTable
Thanks a lot, guys!
Why do you need the variable at all? I don't see it being used subsequently
I would approach this problem from the bottom up, not from the top down.
You've not said what teams A, B, C or CEO should be in, so I've made that up.
I've also included the sample data in a usable form:
create table Org (
ID int not null,
Name varchar(10) not null,
ParentID int null
)
go
insert into Org (ID,Name,ParentID) values
(1,'CEO',null),
(2,'A',1),
(3,'B',1),
(4,'C',1),
(5,'1',2),
(6,'2',2),
(7,'3',2),
(8,'4',3),
(9,'5',3),
(10,'6',3),
(11,'7',4),
(12,'8',4),
(13,'9',4)
Query:
;With AllPeople as (
select ID,ID as LastParentID,ParentID as NextParentID, CASE WHEN ParentID is null THEN 0 ELSE 1 END as Level
from Org
union all
select ap.ID,ap.NextParentID,o.ParentID,Level + 1
from
AllPeople ap
inner join
Org o
on
ap.NextParentID = o.ID and
o.ParentID is not null
), Roots as (
select ID from Org where ParentID is null
), RootedPeople as (
select * from AllPeople where NextParentID is null or NextParentID in (select ID from Roots)
), Names as (
select
oself.Name,
oteam.Name as Team,
Level
from
RootedPeople rp
inner join
Org oself on rp.ID = oself.ID
left join
Org oteam on rp.LastParentID = oteam.ID
)
select * from Names
Result:
Name Team Level
---------- ---------- -----------
CEO CEO 0
A A 1
B B 1
C C 1
9 C 2
8 C 2
7 C 2
6 B 2
5 B 2
4 B 2
3 A 2
2 A 2
1 A 2
Explanation of the CTEs:
AllPeople is a recursive query that climbs up the organisation tree until it reaches the root. It uses two columns (LastParentID and NextParentID) to track two levels of the hierarchy because, once we reach the root, we want the level just before it.
Roots finds all of the people who don't have a parent. It is how we identify, in RootedPeople, the rows from AllPeople that are complete.
RootedPeople keeps the rows that never found any parent, or where the recursion reached the top of the tree.
Names joins back to the Org table to assign names to individuals and teams. This one isn't strictly necessary; it could be the final query by itself.
Note also that, due to the recursive way the AllPeople CTE is built, we calculate the levels as we go - every time we recurse, we add one to the Level that this row represents.
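The bottom-up walk can also be written as a plain loop: start at a person, climb parent by parent, remember the last node seen before the root, and count the steps. The following Python sketch uses the Org sample rows above to show the same bookkeeping that LastParentID, NextParentID and Level do in the CTE. The function name `team_and_level` is made up for the illustration.

```python
# ID -> (Name, ParentID), same rows as the Org table above.
org = {
    1: ("CEO", None),
    2: ("A", 1), 3: ("B", 1), 4: ("C", 1),
    5: ("1", 2), 6: ("2", 2), 7: ("3", 2),
    8: ("4", 3), 9: ("5", 3), 10: ("6", 3),
    11: ("7", 4), 12: ("8", 4), 13: ("9", 4),
}


def team_and_level(person_id, org):
    """Climb towards the root; return (team_id, level)."""
    level = 0
    current = person_id
    team = person_id  # plays the role of LastParentID
    while True:
        parent = org[current][1]  # plays the role of NextParentID
        if parent is None:
            break
        level += 1       # one more step up the tree
        team = current   # last node seen before the root
        current = parent
    return team, level


result = {}
for person_id, (name, _parent) in org.items():
    team_id, level = team_and_level(person_id, org)
    result[name] = (org[team_id][0], level)

for name, (team, level) in result.items():
    print(name, team, level)
```

As in the query result, the CEO and the level-1 managers are reported as their own team, a choice the answer made because the question did not specify one.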