Find exactly matched groups in a many to many relationship table - sql

The table below maps the many-to-many relationship between courses and students.
CREATE Table CourseStudents
(
CourseId INT NOT NULL,
StudentId INT NOT NULL,
PRIMARY KEY (CourseId, StudentId)
);
INSERT INTO CourseStudents VALUES (1, 1), (1, 2), (2, 1), (2, 2), (3, 3), (3, 2),
(4, 3), (4, 2), (5, 1)
Example data
| CourseId | StudentId |
|----------|-----------|
| 1 | 1 |
| 1 | 2 |
| 2 | 1 |
| 2 | 2 |
| 3 | 2 |
| 3 | 3 |
| 4 | 2 |
| 4 | 3 |
| 5 | 1 |
I'm looking for a query that returns all courses that have the exact same students. I was able to come up with the query shown below.
WITH CourseGroups AS
(
SELECT c.CourseId,
STUFF ((
SELECT ',' + CAST(c2.StudentId AS VARCHAR)
FROM CourseStudents c2
WHERE c2.CourseId = c.CourseId
ORDER BY c2.StudentId
FOR XML PATH ('')), 1, 1, '') AS StudentList
FROM CourseStudents c
GROUP BY c.CourseId)
SELECT cg.StudentList,
STUFF ((
SELECT ',' + CAST(cg2.CourseId AS VARCHAR(10))
FROM CourseGroups cg2
WHERE cg2.StudentList = cg.StudentList
FOR XML PATH ('')), 1, 1, '') AS ExactMatchCourseList
FROM CourseGroups cg
GROUP BY cg.StudentList
HAVING COUNT(*) > 1
Query returns
| StudentList | ExactMatchCourseList |
|-------------|----------------------|
| 1,2 | 1,2 |
| 2,3 | 3,4 |
The result above is fine, but I only need the ExactMatchCourseList.
The table I'm dealing with has more than a billion rows, so I need an efficient query that can find any matched courses within a few minutes of run time. I'd appreciate any help.

This only does 2 runs over your CourseStudents table, instead of the 4 you're currently doing. And if you add an index on CourseId on the CourseStudents table, the first run will only be an index scan. It also only runs the original STUFF once for each course, instead of once for each student, then grouping by course. I left out the final STUFF that I wasn't sure if you wanted or if it was just a byproduct of how you were calculating it.
CREATE TABLE #Course
(
CourseId INT NOT NULL PRIMARY KEY
);
INSERT INTO #Course
SELECT CourseId
FROM
CourseStudents s
GROUP BY
CourseId
ORDER BY
CourseId;
CREATE TABLE #CourseStudentList
(
CourseId INT NOT NULL PRIMARY KEY,
StudentList VARCHAR(MAX) NOT NULL
);
INSERT INTO #CourseStudentList
SELECT
c.CourseId,
STUFF ((
SELECT ',' + CAST(c2.StudentId AS VARCHAR)
FROM CourseStudents c2
WHERE c2.CourseId = c.CourseId
ORDER BY c2.StudentId
FOR XML PATH ('')), 1, 1, '') AS StudentList
FROM
#Course c
ORDER BY
c.CourseId;
SELECT *
FROM
(
SELECT
l.CourseId,
l.StudentList,
COUNT(*) OVER (PARTITION BY l.StudentList) AS [Count]
FROM
#CourseStudentList l
) l
WHERE
l.[Count] > 1
ORDER BY
l.StudentList;

This will give you a list of course pairs, although if you are going to get triplicates (or more) then you'll end up with some extra results. I don't have time to toy with this further to correct that issue, but maybe it points you in the right direction:
WITH CTE_CourseMatches AS (
SELECT
CS1.CourseId AS CourseId_1,
CS2.CourseId AS CourseId_2,
COUNT(*) AS cnt
FROM
CourseStudents CS1
INNER JOIN CourseStudents CS2 ON CS2.StudentId = CS1.StudentId AND CS2.CourseId > CS1.CourseId
GROUP BY
CS1.CourseId,
CS2.CourseId
),
CTE_CourseCounts AS (SELECT CourseId, COUNT(*) AS cnt FROM CourseStudents GROUP BY CourseID)
SELECT
CM.CourseId_1,
CM.CourseId_2
FROM
CTE_CourseMatches CM
INNER JOIN CTE_CourseCounts CC1 ON CC1.CourseId = CM.CourseId_1 AND CC1.cnt = CM.cnt
INNER JOIN CTE_CourseCounts CC2 ON CC2.CourseId = CM.CourseId_2 AND CC2.cnt = CM.cnt
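To check the pair query above against the sample data from the question, here is a minimal sketch that runs the same join-and-count idea in an in-memory SQLite database through Python's sqlite3 module. The harness and the names conn and pairs are assumptions for illustration only; on SQL Server the plan and the performance on a billion rows will differ.

```python
import sqlite3

# In-memory database loaded with the sample data from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE CourseStudents (
    CourseId  INT NOT NULL,
    StudentId INT NOT NULL,
    PRIMARY KEY (CourseId, StudentId)
);
INSERT INTO CourseStudents VALUES
    (1, 1), (1, 2), (2, 1), (2, 2), (3, 3), (3, 2), (4, 3), (4, 2), (5, 1);
""")

# Two courses match exactly when the number of shared students equals
# the enrolment count of both courses.
pairs = conn.execute("""
WITH CourseMatches AS (
    SELECT cs1.CourseId AS CourseId_1,
           cs2.CourseId AS CourseId_2,
           COUNT(*)     AS cnt
    FROM CourseStudents cs1
    JOIN CourseStudents cs2
      ON cs2.StudentId = cs1.StudentId
     AND cs2.CourseId  > cs1.CourseId
    GROUP BY cs1.CourseId, cs2.CourseId
),
CourseCounts AS (
    SELECT CourseId, COUNT(*) AS cnt
    FROM CourseStudents
    GROUP BY CourseId
)
SELECT cm.CourseId_1, cm.CourseId_2
FROM CourseMatches cm
JOIN CourseCounts cc1 ON cc1.CourseId = cm.CourseId_1 AND cc1.cnt = cm.cnt
JOIN CourseCounts cc2 ON cc2.CourseId = cm.CourseId_2 AND cc2.cnt = cm.cnt
ORDER BY cm.CourseId_1, cm.CourseId_2
""").fetchall()

print(pairs)  # [(1, 2), (3, 4)]
```

Course 5 shares student 1 with courses 1 and 2, but it is not reported because the shared count (1) does not equal the enrolment count of those courses (2).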

Related

Select all records of one table that contain two records in another with certain id

I have two tables in a 1:m relationship. I need to select the People records that have both ActionId 1 and ActionId 2 in the Action table.
People
+----+------+--------------+
| id | name | phone_number |
+----+------+--------------+
| 1 | John | 111111111111 |
+----+------+--------------+
| 3 | Jane | 222222222222 |
+----+------+--------------+
| 4 | Jack | 333333333333 |
+----+------+--------------+
Action
+----+------+------------+
| id | PplId| ActionId |
+----+------+------------+
| 1 | 1 | 1 |
+----+------+------------+
| 2 | 1 | 2 |
+----+------+------------+
| 3 | 2 | 1 |
+----+------+------------+
| 4 | 4 | 2 |
+----+------+------------+
Output
+-------+------+--------------+----------+
| PplId | name | Phone | ActionId |
+-------+------+--------------+----------+
| 1 | John | 111111111111 | 1 |
+-------+------+--------------+----------+
| 1 | John | 111111111111 | 2 |
+-------+------+--------------+----------+
Return the People records that have both ActionId 1 and ActionId 2 in the Action table.
Window functions are one method. Assuming actions are not duplicated for a person:
select pa.*
from (select p.*, a.ActionId, count(*) over (partition by p.id) as num_actions
from people p join
action a
on p.id = a.pplid
where a.ActionId in (1, 2)
) pa
where num_actions = 2;
In my opinion, getting two rows with the action detail seems superfluous -- you already know the actions. If you only want the people, then exists comes to mind:
select p.*
from people p
where exists (select 1 from action a where a.pplid = p.id and a.ActionId = 1) and
exists (select 1 from action a where a.pplid = p.id and a.ActionId = 2);
With the right index (action(pplid, ActionId)), I would expect two exists to be faster than group by.
Try this below query using subquery and join
select a.Pplid, name, phone, actionid from (
select a.pplid as Pplid, name, phone_number as phone
from People P
join Action A on a.pplid= p.id
group by a.pplid, name, phone_number
having count(*)>1 )P
join Action A on a.Pplid= p.Pplid
Try something like this
IF OBJECT_ID('tempdb..#People') IS NOT NULL DROP TABLE #People
CREATE TABLE #People (id INT, name VARCHAR(255), phone_number VARCHAR(50))
INSERT #People
SELECT 1, 'John', '111111111111' UNION ALL
SELECT 3, 'Jane', '222222222222' UNION ALL
SELECT 4, 'Jack', '333333333333'
IF OBJECT_ID('tempdb..#Action') IS NOT NULL DROP TABLE #Action
CREATE TABLE #Action (id INT, PplId INT, ActionId INT)
INSERT #Action
SELECT 1, 1, 1 UNION ALL
SELECT 2, 1, 2 UNION ALL
SELECT 3, 2, 1 UNION ALL
SELECT 4, 4, 2
GO
SELECT p.ID AS PplId
, p.name
, p.phone_number AS Phone
, a.ActionId
FROM #People p
JOIN #Action a
ON p.ID = a.PplId
WHERE p.ID IN ( SELECT PplId
FROM #Action
WHERE ActionId IN (1, 2)
GROUP BY PplId
HAVING COUNT(*) = 2 )
AND a.ActionId IN (1, 2)
GO
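The "has both actions" requirement discussed above can also be sketched with GROUP BY and HAVING COUNT(DISTINCT ...), which tolerates duplicated action rows. The following is a minimal sketch under stated assumptions: it runs on SQLite through Python's sqlite3 module rather than SQL Server, the table name Action is quoted only as a precaution, and it returns one row per person instead of one row per action.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE People (id INT PRIMARY KEY, name TEXT, phone_number TEXT);
CREATE TABLE "Action" (id INT PRIMARY KEY, PplId INT, ActionId INT);
INSERT INTO People VALUES
    (1, 'John', '111111111111'),
    (3, 'Jane', '222222222222'),
    (4, 'Jack', '333333333333');
INSERT INTO "Action" VALUES (1, 1, 1), (2, 1, 2), (3, 2, 1), (4, 4, 2);
""")

# Keep only people whose rows cover both wanted action ids.
people = conn.execute("""
SELECT p.id, p.name, p.phone_number
FROM People p
JOIN "Action" a ON a.PplId = p.id
WHERE a.ActionId IN (1, 2)
GROUP BY p.id, p.name, p.phone_number
HAVING COUNT(DISTINCT a.ActionId) = 2
""").fetchall()

print(people)  # [(1, 'John', '111111111111')]
```

Jack has only action 2 and PplId 2 has no People row, so only John qualifies.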

SQL LEFT JOIN: difference between WHERE and condition inside AND [duplicate]

What's the difference between
select t.*,
a.age
from t
left join a
on t.ID = a.ID and a.column > 10
and
select t.*,
a.age
from t
left join a
on t.ID = a.ID
where a.column > 10
?
Specifically, what's the difference when I put the condition on the table I am joining to the main table inside AND versus inside WHERE condition?
With a LEFT JOIN there is a difference.
With the condition in the ON clause, every row of t is kept; rows whose match in a does not satisfy column > 10 are returned with NULLs in the columns from a.
With the condition in the WHERE clause, those rows are filtered out.
With an INNER JOIN there is no difference.
example:
declare @t table (id int, dummy varchar(20))
declare @a table (id int, age int, col int)
insert into @t
select * from (
values
(1, 'pippo' ),
(2, 'pluto' ),
(3, 'paperino' ),
(4, 'ciccio' ),
(5, 'caio' ),
(5, 'sempronio')
) x (c1,c2)
insert into @a
select * from (
values
(1, 38, 2 ),
(2, 26, 5 ),
(3, 41, 12),
(4, 15, 11),
(5, 39, 7 )
) x (c1,c2,c3)
select t.*, a.age
from @t t
left join @a a on t.ID = a.ID and a.col > 10
Outputs:
id dummy age
1 pippo NULL
2 pluto NULL
3 paperino 41
4 ciccio 15
5 caio NULL
5 sempronio NULL
While
select t.*, a.age
from @t t
left join @a a on t.ID = a.ID
where a.col > 10
Outputs:
id dummy age
3 paperino 41
4 ciccio 15
So with a LEFT JOIN you will always get all the rows from the first table.
If the join condition is true, the columns from the joined table are filled with their values; if it is false, those columns are NULL.
With the WHERE condition you get only the rows that match the condition.
So what's the difference between them?
An explanation through examples:
CREATE TABLE Students
(
StudentId INT PRIMARY KEY,
Name VARCHAR(100)
);
CREATE TABLE Scores
(
ScoreId INT PRIMARY KEY,
ExamId INT NOT NULL,
StudentId INT NOT NULL,
Score DECIMAL(4,1) NOT NULL DEFAULT 0,
FOREIGN KEY (StudentId)
REFERENCES Students(StudentId)
);
INSERT INTO Students
(StudentId, Name) VALUES
(11,'Joe Shmoe'),
(12,'Jane Doe'),
(47,'Norma Nelson');
INSERT INTO Scores
(ScoreId, ExamId, StudentId, Score) VALUES
(1, 101, 11, 65.2),
(2, 101, 12, 72.6),
(3, 102, 11, 69.6);
--
-- Using an INNER JOIN
--
-- Only Students that have scores
-- So only when there's a match between the 2 tables
--
SELECT stu.Name, sco.Score
FROM Students AS stu
INNER JOIN Scores AS sco
ON sco.StudentId = stu.StudentId
ORDER BY stu.Name
Name | Score
:-------- | :----
Jane Doe | 72.6
Joe Shmoe | 65.2
Joe Shmoe | 69.6
--
-- Using a LEFT JOIN
--
-- All Students, even those without scores
-- Those that couldn't be matched will show NULLs
-- for the fields from the joined table
--
SELECT stu.Name, sco.Score, sco.ScoreId
FROM Students AS stu
LEFT JOIN Scores AS sco
ON sco.StudentId = stu.StudentId
ORDER BY stu.Name
Name | Score | ScoreId
:----------- | :---- | :------
Jane Doe | 72.6 | 2
Joe Shmoe | 65.2 | 1
Joe Shmoe | 69.6 | 3
Norma Nelson | null | null
--
-- Using a LEFT JOIN
-- But with an extra criterion in the ON clause
--
-- All Students again.
-- That have scores >= 66
-- But also the unmatched without scores
--
SELECT stu.Name, sco.Score, sco.ScoreId
FROM Students AS stu
LEFT JOIN Scores AS sco
ON sco.StudentId = stu.StudentId
AND sco.Score >= 66.0
ORDER BY stu.Name
Name | Score | ScoreId
:----------- | :---- | :------
Jane Doe | 72.6 | 2
Joe Shmoe | 69.6 | 3
Norma Nelson | null | null
--
-- Using a LEFT JOIN
-- But with an extra criterion in the WHERE clause
--
-- Only students with scores >= 66
-- The WHERE filters out the unmatched.
--
SELECT stu.Name, sco.Score
FROM Students AS stu
LEFT JOIN Scores AS sco
ON sco.StudentId = stu.StudentId
WHERE sco.Score >= 66.0
ORDER BY stu.Name
Name | Score
:-------- | :----
Jane Doe | 72.6
Joe Shmoe | 69.6
--
-- Using an INNER JOIN
-- And with an extra criterion in the WHERE clause
--
-- Only Students that have scores >= 66
--
SELECT stu.Name, sco.Score
FROM Students AS stu
INNER JOIN Scores AS sco
ON sco.StudentId = stu.StudentId
WHERE sco.Score >= 66
ORDER BY stu.Name
Name | Score
:-------- | :----
Jane Doe | 72.6
Joe Shmoe | 69.6
Did you notice how the criteria in the WHERE clause can make a LEFT JOIN behave like an INNER JOIN?
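The ON versus WHERE contrast from the examples above can be reproduced in a few lines. This is a minimal sketch that assumes SQLite through Python's sqlite3 module instead of SQL Server, with a simplified Scores table (no DEFAULT or foreign key); the names on_rows and where_rows are illustrative only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Students (StudentId INT PRIMARY KEY, Name TEXT);
CREATE TABLE Scores (ScoreId INT PRIMARY KEY, ExamId INT, StudentId INT, Score REAL);
INSERT INTO Students VALUES (11, 'Joe Shmoe'), (12, 'Jane Doe'), (47, 'Norma Nelson');
INSERT INTO Scores VALUES (1, 101, 11, 65.2), (2, 101, 12, 72.6), (3, 102, 11, 69.6);
""")

# Filter inside ON: unmatched students survive with NULL scores.
on_rows = conn.execute("""
SELECT stu.Name, sco.Score
FROM Students AS stu
LEFT JOIN Scores AS sco
  ON sco.StudentId = stu.StudentId
 AND sco.Score >= 66.0
ORDER BY stu.Name
""").fetchall()

# Filter inside WHERE: NULL >= 66.0 is not true, so unmatched rows are dropped.
where_rows = conn.execute("""
SELECT stu.Name, sco.Score
FROM Students AS stu
LEFT JOIN Scores AS sco
  ON sco.StudentId = stu.StudentId
WHERE sco.Score >= 66.0
ORDER BY stu.Name
""").fetchall()

print(on_rows)     # [('Jane Doe', 72.6), ('Joe Shmoe', 69.6), ('Norma Nelson', None)]
print(where_rows)  # [('Jane Doe', 72.6), ('Joe Shmoe', 69.6)]
```

The only difference between the two result sets is Norma Nelson, who has no score at all.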

Creating natural hierarchical order using recursive SQL

I have a table holding categories with an inner parent child relationship.
The table looks like this:
ID | ParentID | OrderID
---+----------+---------
1 | Null | 1
2 | Null | 2
3 | 2 | 1
4 | 1 | 1
OrderID is the order inside the current level.
I want to create a recursive SQL query to create the natural order of the table.
Meaning the output will be something like:
ID | Order
-----+-------
1 | 100
4 | 101
2 | 200
3 | 201
Appreciate any help.
Thanks
I am not really sure what you mean by "natural order", but the following query generates the results you want for this data:
with t as (
select v.*
from (values (1, NULL, 1), (2, NULL, 2), (3, 2, 1), (4, 1, 1)) v(ID, ParentID, OrderID)
)
select t.*,
(100 * coalesce(tp.orderid, t.orderid) + (case when t.parentid is null then 0 else 1 end)) as natural_order
from t left join
t tp
on t.parentid = tp.id
order by natural_order;
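The query above covers two levels. For deeper trees, one option is a recursive CTE that builds a sortable path per row. The sketch below is not the answerer's method: it assumes SQLite through Python's sqlite3 module, a hypothetical table name Categories (the question does not name the table), and OrderID values below 1000 so that three-digit zero padding sorts correctly. It yields the requested row order rather than the 100/101 numbers.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Categories (ID INT PRIMARY KEY, ParentID INT, OrderID INT);
INSERT INTO Categories VALUES (1, NULL, 1), (2, NULL, 2), (3, 2, 1), (4, 1, 1);
""")

# Each row's SortPath is its parent's path plus its own zero-padded OrderID,
# so sorting by the path walks the tree depth-first.
ordered = conn.execute("""
WITH RECURSIVE tree(ID, SortPath) AS (
    SELECT ID, printf('%03d', OrderID)
    FROM Categories
    WHERE ParentID IS NULL
    UNION ALL
    SELECT c.ID, tree.SortPath || '.' || printf('%03d', c.OrderID)
    FROM Categories c
    JOIN tree ON c.ParentID = tree.ID
)
SELECT ID, SortPath
FROM tree
ORDER BY SortPath
""").fetchall()

print(ordered)  # [(1, '001'), (4, '001.001'), (2, '002'), (3, '002.001')]
```

The IDs come out as 1, 4, 2, 3, which is the order asked for in the question.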

Displaying whole table after stripping characters in SQL Server

This question has 2 parts.
Part 1
I have a table "Groups":
group_ID person
-----------------------
1 Person 10
2 Person 11
3 Jack
4 Person 12
Note that not all data in the "person" column have the same format.
In SQL Server, I have used the following query to strip the "Person " characters out of the person column:
SELECT
REPLACE([person],'Person ','')
AS [person]
FROM Groups
I did not use UPDATE in the query above as I do not want to alter the data in the table.
The query returned this result:
person
------
10
11
12
However, I would like this result instead:
group_ID person
-------------------
1 10
2 11
3 Jack
4 12
What should be my query to achieve this result?
Part 2
I have another table "Details":
detail_ID group1 group2
-------------------------------
100 1 2
101 3 4
From the intended result in Part 1, where the numbers in the "person" column correspond to those in "group1" and "group2" of table "Details", how do I selectively convert the numbers in "person" to integers and join them with "Details"?
Note that all data under "person" in Part 1 are strings (nvarchar(100)).
Here is the intended query output:
detail_ID group1 group2
-------------------------------
100 10 11
101 Jack 12
Note that I do not wish to permanently alter anything in both tables and the intended output above is just a result of a SELECT query.
I don't think the first part is a problem: your query already works once group_ID is added to the SELECT list.
Schema:
CREATE TABLE #Groups (group_ID INT, person VARCHAR(50));
INSERT INTO #Groups
SELECT 1,'Person 10'
UNION ALL
SELECT 2,'Person 11'
UNION ALL
SELECT 3,'Jack'
UNION ALL
SELECT 4,'Person 12';
CREATE TABLE #Details(detail_ID INT,group1 INT, group2 INT);
INSERT INTO #Details
SELECT 100, 1, 2
UNION ALL
SELECT 101, 3, 4 ;
Part 1:
With group_ID added to the SELECT list, your query gives exactly what you expect:
SELECT group_ID,REPLACE([person],'Person ','') AS person
FROM #Groups
+----------+--------+
| group_ID | person |
+----------+--------+
| 1 | 10 |
| 2 | 11 |
| 3 | Jack |
| 4 | 12 |
+----------+--------+
Part 2:
;WITH CTE AS(
SELECT group_ID
,REPLACE([person],'Person ','') AS person
FROM #Groups
)
SELECT D.detail_ID, G1.person, G2.person
FROM #Details D
INNER JOIN CTE G1 ON D.group1 = G1.group_ID
INNER JOIN CTE G2 ON D.group2 = G2.group_ID
Result:
+-----------+--------+--------+
| detail_ID | person | person |
+-----------+--------+--------+
| 100 | 10 | 11 |
| 101 | Jack | 12 |
+-----------+--------+--------+
Try the following query; it should give you the desired output.
;WITH MT AS
(
SELECT
group_ID, REPLACE([person],'Person ','') AS Person
FROM Groups
)
SELECT Detail_Id , MT1.Person AS group1 , MT2.Person AS group2
FROM
Details D
INNER JOIN MT MT1 ON MT1.group_ID = D.group1
INNER JOIN MT MT2 ON MT2.group_ID = D.group2
The first query works
declare @T table (id int primary key, name varchar(10));
insert into @T values
(1, 'Person 10')
, (2, 'Person 11')
, (3, 'Jack')
, (4, 'Person 12');
declare @G table (id int primary key, grp1 int, grp2 int);
insert into @G values
(100, 1, 2)
, (101, 3, 4);
with cte as
( select t.id, t.name, ltrim(rtrim(replace(t.name, 'person', ''))) as sp
from @T t
)
-- select * from cte order by cte.id;
select g.id, c1.sp as grp1, c2.sp as grp2
from @G g
join cte c1
on c1.id = g.grp1
join cte c2
on c2.id = g.grp2
order
by g.id;
id grp1 grp2
----------- ----------- -----------
100 10 11
101 Jack 12
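The strip-then-join idea from the answers above can be checked end to end with a small script. This is a minimal sketch that assumes SQLite through Python's sqlite3 module rather than SQL Server; the CTE name Cleaned is hypothetical, and the table name Groups is quoted only as a precaution. Note that SQLite's REPLACE is case-sensitive, so the search string is spelled 'Person ' exactly as in the data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE "Groups" (group_ID INT PRIMARY KEY, person TEXT);
CREATE TABLE Details (detail_ID INT PRIMARY KEY, group1 INT, group2 INT);
INSERT INTO "Groups" VALUES
    (1, 'Person 10'), (2, 'Person 11'), (3, 'Jack'), (4, 'Person 12');
INSERT INTO Details VALUES (100, 1, 2), (101, 3, 4);
""")

# Strip the prefix once in a CTE, then join it twice: once per group column.
rows = conn.execute("""
WITH Cleaned AS (
    SELECT group_ID, REPLACE(person, 'Person ', '') AS person
    FROM "Groups"
)
SELECT d.detail_ID, g1.person AS group1, g2.person AS group2
FROM Details d
JOIN Cleaned g1 ON g1.group_ID = d.group1
JOIN Cleaned g2 ON g2.group_ID = d.group2
ORDER BY d.detail_ID
""").fetchall()

print(rows)  # [(100, '10', '11'), (101, 'Jack', '12')]
```

Nothing is updated in either table; the stripped values exist only in the query result, and they stay strings because 'Jack' cannot be converted to an integer.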

Pivoting 2 columns from 3 Tables and creating pivot-column-names to avoid conflict - SQL-Server 2008R2

Intro and Problem
In my example I have teachers, students, and courses. I would like an overview showing which course is taught by whom, in which room, and all the students in that course. I have the basic setup running (with some hand-coded statements), but so far I have had no luck preparing the correct STUFF statements:
Prepare @colsStudents so that I can put the name in the column header and no longer need to offset the ids (adding 100) to avoid a conflict between rooms.id and students.id
Prepare @colsRooms so that I do not have to hardcode the room names
Put it all together using EXEC sp_executesql @sql;
You can find all SQL statements to create this schema and the data at the end.
Wanted Result Overview Courses
I would like to pivot the columns RoomName and StudentName and use the column values as the new column names.
Id | Course | Teacher | A3 | E7 | Penny | Cooper | Koothrap. | Amy
---+--------+---------+----+----+-------+--------+-----------+-----+
1 | C# 1 | Marc G. | | 1 | 1 | | |
2 | C# 2 | Sam S. | | 1 | 1 | | 1 |
3 | C# 3 | John S. | 1 | | | 1 | |
4 | C# 3 | Reed C. | | 1 | | | 1 |
5 | SQL 1 | Marc G. | 1 | | | | |
6 | SQL 2 | Marc G. | 1 | | | | |
7 | SQL 3 | Marc G. | | 1 | | 1 | | 1
8 | SQL 3 | Gbn | 1 | | | | 1 |
What I have so far
With PivotData as (
Select cd.Id, c.CourseName as Course, t.TeacherName as Teacher
,r.Id as RoomId, r.RoomName as RoomName
,100 + s.Id as StudentId, s.StudentName as Student
FROM CourseDetails cd
Left JOIN Courses c ON cd.CourseId = c.Id
Left JOIN Teachers t ON cd.TeacherId = t.Id
Left JOIN CourseMember cm ON cd.Id = cm.CourseDetailsId
Left JOIN Students s ON cm.StudentId = s.Id
Left JOIN Rooms r ON cd.RoomId = r.Id
)
Select Course, Teacher
, [1] as A3, [2] as E7 -- RoomColumns
, [101] as Koothrappali, [102] as Cooper, [103] as Penny, [104] as Amy -- StudentColumns
FROM (
Select Course, Teacher, RoomName, RoomId,Student, StudentId
From PivotData) src
PIVOT( Max(RoomName) FOR RoomId IN ([1],[2])) as P1
PIVOT( Count(Student) FOR StudentId IN ([101],[102],[103],[104]) ) as P2
What is missing
The above statement is prepared by hand. Since I do not know the rooms or students in advance, I need to create the PIVOT statement for the room and student columns dynamically. There are plenty of examples on SO showing how to do it. The usual way is to use STUFF:
DECLARE @colsStudents AS NVARCHAR(MAX);
SET @colsStudents = STUFF(
(SELECT N',' + QUOTENAME(y) AS [text()] FROM
(SELECT DISTINCT 100 + Id AS y FROM dbo.Students) AS Y
ORDER BY y
FOR XML PATH('')
),1
,1
,N'');
Select @colsStudents
This returns [101],[102],[103],[104] for the student ids. I added 100 to each id to avoid conflicts between the students.id and the rooms.id columns.
As mentioned in the intro, I need to dynamically create something like this:
[1] as RoomName_1, [2] as RoomName_1 -- RoomColumns
[1] as StudentName1, [2] as StudentName2, ... ,[4] as Amy -- StudentColumns
But all my attempts with the STUFF statement failed.
All SQL Statements to create the tables and data
CREATE TABLE [dbo].[Teachers](
[Id] [int] IDENTITY(1,1) NOT NULL,
[TeacherName] [nvarchar](120) NULL,
CONSTRAINT PK_Teachers PRIMARY KEY CLUSTERED (Id))
CREATE TABLE [dbo].[Students](
[Id] [int] IDENTITY(1,1) NOT NULL,
[StudentName] [nvarchar](120) NULL,
CONSTRAINT PK_Students PRIMARY KEY CLUSTERED (Id))
CREATE TABLE [dbo].[Courses](
[Id] [int] IDENTITY(1,1) NOT NULL,
[CourseName] [nvarchar](120) NULL,
CONSTRAINT PK_Courses PRIMARY KEY CLUSTERED (Id))
CREATE TABLE [dbo].[Rooms](
[Id] [int] IDENTITY(1,1) NOT NULL,
[RoomName] [nchar](120) NULL,
CONSTRAINT PK_Rooms PRIMARY KEY CLUSTERED (Id))
CREATE TABLE [dbo].[CourseDetails](
[Id] [int] IDENTITY(1,1) NOT NULL,
[CourseId] [int] NOT NULL,
[TeacherId] [int] NOT NULL,
[RoomId] [int] NOT NULL,
CONSTRAINT PK_CourseDetails PRIMARY KEY CLUSTERED (Id),
CONSTRAINT FK_CourseDetails_Teachers_Id FOREIGN Key (TeacherId)
REFERENCES dbo.Teachers (Id),
CONSTRAINT FK_CourseDetails_Courses_Id FOREIGN Key (CourseId)
REFERENCES dbo.Courses (Id),
CONSTRAINT FK_CourseDetails_Rooms_Id FOREIGN Key (RoomId)
REFERENCES dbo.Rooms (Id)
)
CREATE TABLE [dbo].[CourseMember](
[Id] [int] IDENTITY(1,1) NOT NULL,
[CourseDetailsId] [int] NOT NULL,
[StudentId] [int] NOT NULL,
CONSTRAINT PK_CourseMember PRIMARY KEY CLUSTERED (Id),
CONSTRAINT FK_CourseMember_CourseDetails_Id FOREIGN Key (CourseDetailsId)
REFERENCES dbo.CourseDetails (Id),
CONSTRAINT FK_CourseMember_Students_Id FOREIGN Key (StudentId)
REFERENCES dbo.Students (Id)
)
INSERT INTO dbo.Courses (CourseName)
VALUES ('SQL 1 - Basics'),
('SQL 2 - Intermediate'),
('SQL 3 - Advanced'),
('C# 1 - Basics'),
('C# 2 - Intermediate'),
('C# 3 - Advanced')
INSERT INTO dbo.Students (StudentName)
VALUES
('Koothrappali'),
('Cooper'),
('Penny'),
('Amy')
INSERT INTO dbo.Teachers (TeacherName)
VALUES
('gbn '),
('Sam S.'),
('Marc G.'),
('Reed C.'),
('John S.')
INSERT INTO dbo.Rooms (RoomName)
VALUES ('A3'), ('E7')
INSERT [dbo].[CourseDetails] (CourseId, TeacherId, RoomId)
VALUES (4, 3, 2),(5, 2, 2),
(6, 5, 1),(6, 4, 2),
(1,3,1),(2,3,1),(3,3,2),
(3,1,1)
INSERT [dbo].[CourseMember] (CourseDetailsId, StudentId)
VALUES (1,3),(2,3),(2,1),(3,2),(4,1),(7,2),(7,4),(8,1)
I personally would do this a bit differently. Since you are trying to pivot two separate columns, this is a natural fit for unpivoting first.
The unpivot will convert your multiple columns into rows to then pivot.
Since you have SQL Server 2008, you can use CROSS APPLY and values:
select id, course, teacher, col, flag
from
(
Select cd.Id, c.CourseName as Course, t.TeacherName as Teacher
,cast(r.Id as varchar(10))as RoomId
, r.RoomName as RoomName
,cast(100 + s.Id as varchar(10)) as StudentId
, s.StudentName as Student
, '1' flag
FROM CourseDetails cd
Left JOIN Courses c
ON cd.CourseId = c.Id
Left JOIN Teachers t
ON cd.TeacherId = t.Id
Left JOIN CourseMember cm
ON cd.Id = cm.CourseDetailsId
Left JOIN Students s
ON cm.StudentId = s.Id
Left JOIN Rooms r
ON cd.RoomId = r.Id
) d
cross apply
(
values ('roomname', roomname),('student',student)
) c (value, col)
The unpivot generates a result similar to this:
| ID | COURSE | TEACHER | COL | FLAG |
-------------------------------------------------------------
| 1 | C# 1 - Basics | Marc G. | E7 | 1 |
| 1 | C# 1 - Basics | Marc G. | Penny | 1 |
| 2 | C# 2 - Intermediate | Sam S. | E7 | 1 |
| 2 | C# 2 - Intermediate | Sam S. | Penny | 1 |
| 2 | C# 2 - Intermediate | Sam S. | E7 | 1 |
| 2 | C# 2 - Intermediate | Sam S. | Koothrappali | 1 |
| 3 | C# 3 - Advanced | John S. | A3 | 1 |
| 3 | C# 3 - Advanced | John S. | Cooper | 1 |
You will see that the col data contains all the values that you want to pivot. Once the data is in rows, it will be easy to apply one pivot:
select id, course, teacher,
coalesce(A3, '') A3,
coalesce(E7, '') E7,
coalesce(Koothrappali, '') Koothrappali,
coalesce(Cooper, '') Cooper,
coalesce(Penny, '') Penny,
coalesce(Amy, '') Amy
from
(
select id, course, teacher, col, flag
from
(
Select cd.Id, c.CourseName as Course, t.TeacherName as Teacher
,cast(r.Id as varchar(10))as RoomId
, r.RoomName as RoomName
,cast(100 + s.Id as varchar(10)) as StudentId
, s.StudentName as Student
, '1' flag
FROM CourseDetails cd
Left JOIN Courses c
ON cd.CourseId = c.Id
Left JOIN Teachers t
ON cd.TeacherId = t.Id
Left JOIN CourseMember cm
ON cd.Id = cm.CourseDetailsId
Left JOIN Students s
ON cm.StudentId = s.Id
Left JOIN Rooms r
ON cd.RoomId = r.Id
) d
cross apply
(
values ('roomname', roomname),('student',student)
) c (value, col)
) d
pivot
(
max(flag)
for col in (A3, E7, Koothrappali, Cooper, Penny, Amy)
) piv
Then to convert this to dynamic SQL, you are only pivoting one column, so you will use the following to get the list of columns:
select @cols = STUFF((SELECT ',' + QUOTENAME(col)
from
(
select id, roomname col, 1 SortOrder
from rooms
union all
select id, StudentName, 2
from Students
) d
group by id, col, sortorder
order by sortorder, id
FOR XML PATH(''), TYPE
).value('.', 'NVARCHAR(MAX)')
,1,1,'')
This will get the list of distinct rooms and students that are then used in the pivot. So the final code will be:
DECLARE @cols AS NVARCHAR(MAX),
@colsNull AS NVARCHAR(MAX),
@query AS NVARCHAR(MAX)
select @cols = STUFF((SELECT ',' + QUOTENAME(col)
from
(
select id, roomname col, 1 SortOrder
from rooms
union all
select id, StudentName, 2
from Students
) d
group by id, col, sortorder
order by sortorder, id
FOR XML PATH(''), TYPE
).value('.', 'NVARCHAR(MAX)')
,1,1,'')
select @colsNull = STUFF((SELECT ', coalesce(' + QUOTENAME(col)+', '''') as '+QUOTENAME(col)
from
(
select id, roomname col, 1 SortOrder
from rooms
union all
select id, StudentName, 2
from Students
) d
group by id, col, sortorder
order by sortorder, id
FOR XML PATH(''), TYPE
).value('.', 'NVARCHAR(MAX)')
,1,1,'')
set @query
= 'SELECT
id, course, teacher,' + @colsNull + '
from
(
select id, course, teacher, col, flag
from
(
Select cd.Id, c.CourseName as Course, t.TeacherName as Teacher
,cast(r.Id as varchar(10))as RoomId
, r.RoomName as RoomName
,cast(100 + s.Id as varchar(10)) as StudentId
, s.StudentName as Student
, ''1'' flag
FROM CourseDetails cd
Left JOIN Courses c
ON cd.CourseId = c.Id
Left JOIN Teachers t
ON cd.TeacherId = t.Id
Left JOIN CourseMember cm
ON cd.Id = cm.CourseDetailsId
Left JOIN Students s
ON cm.StudentId = s.Id
Left JOIN Rooms r
ON cd.RoomId = r.Id
) d
cross apply
(
values (''roomname'', roomname),(''student'',student)
) c (value, col)
) d
pivot
(
max(flag)
for col in (' + @cols + ')
) p '
execute(@query)
Note I implemented a flag to be used in the pivot, this basically generates a Y/N if there is a value for the room or student.
This gives a final result:
| ID | COURSE | TEACHER | A3 | E7 | KOOTHRAPPALI | COOPER | PENNY | AMY |
---------------------------------------------------------------------------------------
| 1 | C# 1 - Basics | Marc G. | | 1 | | | 1 | |
| 2 | C# 2 - Intermediate | Sam S. | | 1 | 1 | | 1 | |
| 3 | C# 3 - Advanced | John S. | 1 | | | 1 | | |
| 4 | C# 3 - Advanced | Reed C. | | 1 | 1 | | | |
| 5 | SQL 1 - Basics | Marc G. | 1 | | | | | |
| 6 | SQL 2 - Intermediate | Marc G. | 1 | | | | | |
| 7 | SQL 3 - Advanced | Marc G. | | 1 | | 1 | | 1 |
| 8 | SQL 3 - Advanced | gbn | 1 | | 1 | | | |
As a side note, this data can also be unpivoted using the UNPIVOT function in SQL Server.
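The column-list step above (one list built from Rooms and Students, ordered by SortOrder and then Id) can be tried outside SQL Server. The sketch below is an assumption-heavy stand-in, not the answer's T-SQL: SQLite has no PIVOT or QUOTENAME, so it swaps in conditional aggregation (MAX(CASE ...)) and a hypothetical quote_ident helper, and it trims the schema to the room and student columns. It only illustrates generating the pivot columns from data.

```python
import sqlite3

def quote_ident(name):
    """Hypothetical stand-in for QUOTENAME: double-quote an identifier."""
    return '"' + name.replace('"', '""') + '"'

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Rooms (Id INTEGER PRIMARY KEY, RoomName TEXT);
CREATE TABLE Students (Id INTEGER PRIMARY KEY, StudentName TEXT);
CREATE TABLE CourseDetails (Id INTEGER PRIMARY KEY, RoomId INTEGER);
CREATE TABLE CourseMember (CourseDetailsId INTEGER, StudentId INTEGER);
INSERT INTO Rooms VALUES (1, 'A3'), (2, 'E7');
INSERT INTO Students VALUES (1, 'Koothrappali'), (2, 'Cooper'), (3, 'Penny'), (4, 'Amy');
INSERT INTO CourseDetails VALUES (1,2),(2,2),(3,1),(4,2),(5,1),(6,1),(7,2),(8,1);
INSERT INTO CourseMember VALUES (1,3),(2,3),(2,1),(3,2),(4,1),(7,2),(7,4),(8,1);
""")

# Step 1: collect the future column names, rooms first, then students.
labels = [row[0] for row in conn.execute("""
    SELECT col FROM (
        SELECT Id, RoomName AS col, 1 AS SortOrder FROM Rooms
        UNION ALL
        SELECT Id, StudentName, 2 FROM Students
    )
    ORDER BY SortOrder, Id
""")]

# Step 2: one conditional aggregate per label; values are bound as parameters,
# only the quoted column aliases are spliced into the SQL text.
select_list = ", ".join(
    "MAX(CASE WHEN u.col = ? THEN 1 END) AS " + quote_ident(label)
    for label in labels
)
sql = f"""
SELECT u.Id, {select_list}
FROM (
    SELECT cd.Id, r.RoomName AS col
    FROM CourseDetails cd JOIN Rooms r ON r.Id = cd.RoomId
    UNION ALL
    SELECT cm.CourseDetailsId, s.StudentName
    FROM CourseMember cm JOIN Students s ON s.Id = cm.StudentId
) u
GROUP BY u.Id
ORDER BY u.Id
"""
rows = conn.execute(sql, labels).fetchall()

print(labels)
for row in rows:
    print(row)
```

Each output row holds the course-detail id followed by a 1 or NULL per generated column, in the same column order as the final result table above.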
You can build alias strings for both pivot columns with dynamic SQL.
For example:
DECLARE @colsStudents AS NVARCHAR(MAX),
@colsstudentalias AS NVARCHAR(MAX),
@colsRooms AS NVARCHAR(MAX),
@colsRoomsalias AS NVARCHAR(MAX)
SELECT @colsStudents = STUFF
(
(
SELECT DISTINCT ',' + QUOTENAME(100 + Id)
FROM dbo.Students
FOR XML PATH('')
), 1, 1, ''
)
SELECT @colsstudentalias = STUFF
(
(
SELECT DISTINCT ',' + QUOTENAME(100 + Id)
+ ' as ' + QUOTENAME(ltrim(rtrim(StudentName)))
FROM dbo.Students
FOR XML PATH('')
), 1, 1, ''
)
SELECT @colsRooms = STUFF
(
(
SELECT DISTINCT ',' + QUOTENAME(Id)
FROM dbo.Rooms
FOR XML PATH('')
), 1, 1, ''
)
SELECT @colsRoomsalias = STUFF
(
(
SELECT DISTINCT ',' + QUOTENAME(Id)
+ ' as ' + QUOTENAME(ltrim(rtrim(RoomName)))
FROM dbo.Rooms
FOR XML PATH('')
), 1, 1, ''
)
--SELECT @colsStudents, @colsstudentalias, @colsRooms, @colsRoomsalias
DECLARE @sql varchar(max)
set @sql = ';With PivotData as (
Select cd.Id, c.CourseName as Course, t.TeacherName as Teacher
,r.Id as RoomId, r.RoomName as RoomName
,100 + s.Id as StudentId, s.StudentName as Student
FROM CourseDetails cd
Left JOIN Courses c ON cd.CourseId = c.Id
Left JOIN Teachers t ON cd.TeacherId = t.Id
Left JOIN CourseMember cm ON cd.Id = cm.CourseDetailsId
Left JOIN Students s ON cm.StudentId = s.Id
Left JOIN Rooms r ON cd.RoomId = r.Id
)
Select Course, Teacher
, ' + @colsRoomsalias + '
, ' + @colsstudentalias + '
FROM (
Select Course, Teacher, RoomName, RoomId,Student, StudentId
From PivotData) src
PIVOT( Max(RoomName) FOR RoomId IN (' + @colsRooms + ')) as P1
PIVOT( Count(Student) FOR StudentId IN (' + @colsStudents + ') ) as P2'
exec (@sql)
I am going to take a deeper look at both answers above and compare them with the one below.
My problem was in filling the local variables @RoomNames and @StudentNames with the STUFF() function. One reason was that I had chosen the datatype nchar(120) instead of nvarchar(120) for the columns StudentName and RoomName.
Another problem was that the new column names (Student instead of StudentName) were not recognized; therefore I replaced them with * in this statement: Select * From (' + @PivotSrc + N') src
Philip Kelley suggested using SELECT @RoomIds = isnull(@RoomIds + ',', '') + '[' + Cast(Id as nvarchar(20))+ ']' FROM Rooms instead of STUFF(), and since I find it shorter and easier to read I am using it now.
Working Solution
DECLARE @StudentNames NVARCHAR(2000),
@RoomIds NVARCHAR(2000),
@RoomNames NVARCHAR(2000),
@PivotSrc NVARCHAR(MAX),
@PivotBase NVARCHAR(MAX);
SELECT @StudentNames = isnull(@StudentNames + ',', '') + '[' + StudentName + ']' FROM Students
SELECT @RoomIds = isnull(@RoomIds + ',', '') + '[' + Cast(Id as nvarchar(20))+ ']' FROM Rooms
SELECT @RoomNames = isnull(@RoomNames + ',', '') + '[' + RoomName + ']' FROM Rooms
SET @PivotSrc = N'Select cd.Id, c.CourseName as Course, t.TeacherName as Teacher
,r.Id as RoomId, r.RoomName as RoomName
,100 + s.Id as StudentId, s.StudentName as Student
FROM CourseDetails cd
Left JOIN Courses c ON cd.CourseId = c.Id
Left JOIN Teachers t ON cd.TeacherId = t.Id
Left JOIN CourseMember cm ON cd.Id = cm.CourseDetailsId
Left JOIN Students s ON cm.StudentId = s.Id
Left JOIN Rooms r ON cd.RoomId = r.Id'
SET @PivotBase = N' Select Course, Teacher, '
+ @RoomNames + N', '
+ @StudentNames + N' FROM (
Select * From (' + @PivotSrc + N') src
PIVOT( Max(RoomName) FOR RoomName IN ('+@RoomNames+ N')) as P1
PIVOT( Count(Student) FOR Student IN ('+@StudentNames+N') ) as P2) as T'
execute(@PivotBase)