Populating A Many-To-Many Table - sql

I have two tables, the Instructor table and the Department table. An instructor can be involved in many departments, and a department can contain many instructors. I'm trying to populate the DepartmentInstructor table to create a many-to-many relationship. The tables are populated like so:
Department Table
DepartmentID DepartmentName
1 Aaron Copland School of Music
2 American Studies
3 Art
4 Classical, Middle Eastern, and Asian Languages and Cultures
5 Comparative Literature
6 Drama, Theatre & Dance
7 English
8 European Languages and Literatures
Instructor Table
InstructorID InstructorFullName
1 Abrams, Brian
2 Ciavarella, Peter
3 Franklin, Arnold
4 Shur, Mitchell
5 Reich, Toby
6 Meyers, Allison
7 Dana, Kathryn
8 Rhindress, Mindy
What I'm trying to do is,
DepartmentInstructor Table
DepartmentID InstructorID
1 3
3 7
2 7
6 4
Edit:
Responding to @GeorgeJoseph: we were also given a table that contains all of the data besides the IDs. This table is shown below,
Table X
Semester Sec Code Course(HR,CRD) Description Day Time Instructor Location Enrolled Limit ModeOfInstruction
Spring 2019 02 37366 ACCT 100 (3, 3) Fin & Mgr Acct T, TH 3:10 PM - 4:25 PM Milo, Michael KY 419 20 22 In-Person
Spring 2019 03 37823 ACCT 100 (3, 3) Fin & Mgr Acct M 3:10 PM - 6:00 PM Ho, Vivian HH 17 21 22 In-Person
Spring 2019 01 37365 ACCT 100 (3, 3) Fin & Mgr Acct T, TH 10:45 AM - 12:00 PM Milo, Michael KY 419 22 22 In-Person
Spring 2019 06 7351 ACCT 101 (4, 3) Int Theo & Prac Acct 1 T, TH 12:10 PM - 2:00 PM Feisullin, Anita RA 201 30 30 In-Person
Spring 2019 12 7357 ACCT 101 (4, 3) Int Theo & Prac Acct 1 SU 8:20 AM - 12:00 PM Mintz, Chana PH 204 39 55 In-Person
Spring 2019 11 7356 ACCT 101 (4, 3) Int Theo & Prac Acct 1 S 8:20 AM - 12:00 PM Chan, Joseph PH 110 54 55 In-Person
Spring 2019 10 7355 ACCT 101 (4, 3) Int Theo & Prac Acct 1 F 6:30 PM - 10:30 PM Solarsh, Eva PH 212 30 30 Hybrid
Spring 2019 09 7354 ACCT 101 (4, 3) Int Theo & Prac Acct 1 T, TH 8:50 PM - 10:30 PM Zapf, Michael PH 110 29 55 In-Person
I added the data to the Instructor Table and the Department table through this table. Let's call this table X. The DepartmentName was created by using a case statement over the Course(HR,CRD) column.
Now to answer your question, Table X should help us in forming that many-to-many relationship between the Instructor and the Department Table. I'm currently not sure how to map the relationship. What I tried doing was this,
SELECT DISTINCT [Description], Instructor
FROM Schema.X AS x
INNER JOIN [College].[Instructor] AS I
ON x.Instructor = I.InstructorFullName
This will then give me the corresponding course taught by a professor but I'm unsure of how to go from here.
Edit 2:
Here's how my DB design looks,

As George and yourself have mentioned, you are almost there. I am using SQL Server / T-SQL
In my example you have a course table, an instructor table and a department table.
The course table must have instructorID and departmentID as columns. This is how you bridge the gap between all the tables: you end up with a distinct list of departments, a list of courses (carrying the linking department and tutor IDs), and a distinct list of tutors. A course may have more than one tutor (it could happen, I suppose), so test what suits your setup; one option is to add a new row to courses with the same departmentID and the second tutorID.
I have added some extra columns in the output.
You can also see that not all departments have courses assigned to them. Lack of funding! Also note that I have used a left join where an inner join might work better, depending on your situation or where clause, e.g. where courseID is not null.
http://sqlfiddle.com/#!18/cf48b/1/0

OK, so through some trial and error and a thorough read of the data, I've come to a solution that I believe to be correct:
INSERT INTO [College].[DepartmentInstructor]
(DepartmentInstructorID, DepartmentKey, InstructorKey)
SELECT
NEXT VALUE FOR [Project3].[SequenceObjectForDepartmentInstructorId],
DepartmentID,
InstructorID
FROM (
SELECT DISTINCT InstructorID, DepartmentID
FROM Uploadfile.CoursesSpring2019 AS CS
INNER JOIN [College].[Instructor] AS I
ON CS.Instructor = I.InstructorFullName
INNER JOIN [College].[Department] AS D
ON CS.[Course (hr, crd)] LIKE CONCAT('%', D.DepartmentName, '%')
) AS Result
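The pattern in the INSERT above (join the staging rows to both lookup tables, then keep each distinct ID pair once) can be tried on a toy dataset. This is only a minimal sketch using Python's built-in sqlite3 module: the Staging table with a ready-made DepartmentName column and its sample rows are assumptions made for illustration, and SQLite's autoincrementing key stands in for the SQL Server sequence object.

```python
import sqlite3


def build_department_instructor():
    """Populate a junction table from a staging table and return its ID pairs."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE Department (DepartmentID INTEGER PRIMARY KEY, DepartmentName TEXT);
        CREATE TABLE Instructor (InstructorID INTEGER PRIMARY KEY, InstructorFullName TEXT);
        -- Hypothetical staging table: one row per course section
        CREATE TABLE Staging (Instructor TEXT, DepartmentName TEXT);
        CREATE TABLE DepartmentInstructor (
            DepartmentInstructorID INTEGER PRIMARY KEY AUTOINCREMENT,
            DepartmentID INTEGER REFERENCES Department(DepartmentID),
            InstructorID INTEGER REFERENCES Instructor(InstructorID),
            UNIQUE (DepartmentID, InstructorID)
        );
        INSERT INTO Department VALUES
            (1, 'Aaron Copland School of Music'), (2, 'American Studies'), (3, 'Art');
        INSERT INTO Instructor VALUES (3, 'Franklin, Arnold'), (7, 'Dana, Kathryn');
        -- Made-up sample rows; the duplicate row mimics several sections per instructor
        INSERT INTO Staging VALUES
            ('Franklin, Arnold', 'Aaron Copland School of Music'),
            ('Franklin, Arnold', 'Aaron Copland School of Music'),
            ('Dana, Kathryn', 'Art'),
            ('Dana, Kathryn', 'American Studies');
    """)
    # DISTINCT collapses repeated (department, instructor) pairs before inserting
    conn.execute("""
        INSERT INTO DepartmentInstructor (DepartmentID, InstructorID)
        SELECT DISTINCT D.DepartmentID, I.InstructorID
        FROM Staging AS S
        JOIN Instructor AS I ON S.Instructor = I.InstructorFullName
        JOIN Department AS D ON S.DepartmentName = D.DepartmentName
    """)
    rows = conn.execute("""
        SELECT DepartmentID, InstructorID
        FROM DepartmentInstructor
        ORDER BY DepartmentID, InstructorID
    """).fetchall()
    conn.close()
    return rows


print(build_department_instructor())
```

The UNIQUE constraint on the ID pair is a design choice rather than something from the original post; it guards against inserting the same relationship twice if the load is re-run.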
I've been able to progress further in my project and I'm about 95% done. I've actually stumbled onto a somewhat similar problem. If you refer back to the database design that I posted, the courses table will need the IDs from multiple tables. This is what I've come up with,
SELECT DISTINCT
TS.TimeSlotID,
I.InstructorID,
BL.BuildingLocationID,
C.CourseID
FROM Uploadfile.CoursesSpring2019 AS CS
INNER JOIN [College].[TimeSlot] AS TS
ON CS.[Time] = TS.[ClassHours]
INNER JOIN [College].[Instructor] AS I
ON CS.[Instructor] = I.[InstructorFullName]
INNER JOIN [College].[BuildingLocation] AS BL
ON CS.[Location] LIKE CONCAT( BL.[BuildingName], '%')
INNER JOIN [College].[Course] AS C
ON CS.[Course (hr, crd)] LIKE CONCAT(C.CourseName, '%')
The problem here is that this query returns approximately 1 million rows. Table X has approximately 4,700 rows, so the result is nowhere near the number of rows I should have.

Related

Find the Age and Name of the Youngest Player for Each Race

Table "participant":
ptcpt_id  ptcpt_name  brt_dt
1         Ana Perez   2001-10-10
2         John Sy     1999-04-03
3         Judy Ann    2001-10-10
Table "race":
race_id  race_name       race_date
1        Vroom Vroom     2023-01-01
2        Fast & Furious  2022-01-01
Table "individual_race_record":
irr_id  ptcpt_id  race_id  run_time
1       1         1        00:59:13
2       1         2        01:19:14
3       2         1        00:48:05
4       2         2        01:01:17
5       3         2        01:31:18
I want to select the name and age of the youngest participant for each race event, as well as the name and year of each race event.
This is what I have so far:
SELECT
r.race_name,
EXTRACT(YEAR FROM r.race_date) AS year,
COALESCE(CAST(min.age AS varchar), 'N/A')
FROM(
SELECT
race_id,
EXTRACT(YEAR FROM MIN(AGE(brt_dt))) AS age
FROM(
SELECT p.ptcpt_id, p.brt_dt, irr.race_id
FROM participant p
INNER JOIN individual_race_record irr
ON p.ptcpt_id = irr.ptcpt_id
) sub
GROUP BY race_id
) min
RIGHT JOIN race r ON r.race_id=min.race_id
ORDER BY year DESC
which resulted in the following table:
race_name       year  age
Vroom Vroom     2023  21
Fast & Furious  2022  21
But what I want is this:
race_name       year  age  ptcpt_name
Vroom Vroom     2023  21   Ana Perez
Fast & Furious  2022  21   Ana Perez
Fast & Furious  2022  21   Judy Ann
The problem is that I can't join it with the participant table. I still need another column for the name of the youngest participant, and if a race has several youngest participants, I'd like to show all of them. When I try to select ptcpt_id in the 'min' subquery, I get an error saying that ptcpt_id must also be included in the GROUP BY clause. But I don't need the result grouped by participant.
I'd appreciate any help and leads on this issue. Thank you.
You can use FETCH FIRST ROWS WITH TIES to gather all records that tie on the first ORDER BY field. If we use DENSE_RANK to rank each participant within each race by age, then ordering by that rank and fetching the first row with ties returns every participant whose rank is 1, that is, everyone sharing the minimum age in their race, even when there is more than one.
SELECT r.race_name,
EXTRACT(YEAR FROM r.race_date) AS "year",
DATE_PART('year', r.race_date) - DATE_PART('year', p.brt_dt) AS age,
p.ptcpt_name
FROM participant p
INNER JOIN individual_race_record irr ON p.ptcpt_id = irr.ptcpt_id
INNER JOIN race r ON r.race_id = irr.race_id
ORDER BY DENSE_RANK() OVER(
PARTITION BY race_name
ORDER BY DATE_PART('year', r.race_date) - DATE_PART('year', p.brt_dt))
FETCH FIRST 1 ROWS WITH TIES
Output:
race_name       year  age  ptcpt_name
Fast & Furious  2022  21   Ana Perez
Fast & Furious  2022  21   Judy Ann
Vroom Vroom     2023  22   Ana Perez
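For engines without FETCH FIRST ... WITH TIES, the same ranking idea can be written with DENSE_RANK in a subquery and a filter on rank 1. The following is a minimal sketch rather than the answer's exact query: it runs on Python's built-in sqlite3 module (assuming a bundled SQLite of 3.25 or newer for window functions), stores dates as ISO text, and computes age as a plain difference of years, as the answer does. The function name and the age_rank and race_year aliases are made up for illustration.

```python
import sqlite3


def youngest_per_race():
    """Return (race_name, year, age, ptcpt_name) for the youngest runner(s) of each race."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE participant (ptcpt_id INTEGER PRIMARY KEY, ptcpt_name TEXT, brt_dt TEXT);
        CREATE TABLE race (race_id INTEGER PRIMARY KEY, race_name TEXT, race_date TEXT);
        CREATE TABLE individual_race_record (
            irr_id INTEGER PRIMARY KEY, ptcpt_id INTEGER, race_id INTEGER, run_time TEXT);
        INSERT INTO participant VALUES
            (1, 'Ana Perez', '2001-10-10'), (2, 'John Sy', '1999-04-03'), (3, 'Judy Ann', '2001-10-10');
        INSERT INTO race VALUES
            (1, 'Vroom Vroom', '2023-01-01'), (2, 'Fast & Furious', '2022-01-01');
        INSERT INTO individual_race_record VALUES
            (1, 1, 1, '00:59:13'), (2, 1, 2, '01:19:14'), (3, 2, 1, '00:48:05'),
            (4, 2, 2, '01:01:17'), (5, 3, 2, '01:31:18');
    """)
    rows = conn.execute("""
        SELECT race_name, race_year, age, ptcpt_name
        FROM (
            SELECT r.race_name,
                   CAST(strftime('%Y', r.race_date) AS INTEGER) AS race_year,
                   CAST(strftime('%Y', r.race_date) AS INTEGER)
                     - CAST(strftime('%Y', p.brt_dt) AS INTEGER) AS age,
                   p.ptcpt_name,
                   -- rank 1 = smallest age within each race; ties share the rank
                   DENSE_RANK() OVER (
                       PARTITION BY r.race_id
                       ORDER BY CAST(strftime('%Y', r.race_date) AS INTEGER)
                              - CAST(strftime('%Y', p.brt_dt) AS INTEGER)
                   ) AS age_rank
            FROM participant p
            JOIN individual_race_record irr ON p.ptcpt_id = irr.ptcpt_id
            JOIN race r ON r.race_id = irr.race_id
        ) AS ranked
        WHERE age_rank = 1
        ORDER BY race_name, ptcpt_name
    """).fetchall()
    conn.close()
    return rows


for row in youngest_per_race():
    print(row)
```

Filtering on the rank in an outer query keeps the tie handling explicit, at the cost of one extra level of nesting.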

Writing query in SQL in a different way

I am trying to write a query for this question.
Find the IDs of all students who were taught by an instructor named
Einstein; make sure there are no duplicates in the result.
One suitable answer to this would be
select distinct student.ID
from (student join takes using(ID))
join (instructor join teaches using(ID))
using (course_id, sec_id, semester, year)
where instructor.name = 'Einstein'
but I don't want to use the join ... using syntax; I want to write the same query without it. I was able to write some parts of the query, but I don't understand how to write the whole query without it returning errors. As an alternative to join ... using, I am trying to enumerate the relations in the from clause and add the corresponding join predicates on ID, course id, section id, semester, and year to the where clause. But when I do that, it gives me errors saying that "no such column exists". Below is what I have so far:
select distinct student.id
from student, takes
where student.id = takes.id
and student.sec_id = takes.sec_id
and student.semester = takes.semester
and student.year = takes.year
How can I fix the code?
Table setup:
Student
ID NAME DEPT_NAME TOT_CRED
-------------------------------------
00128 Zhang Comp. Sci. 102
12345 Shankar Comp. Sci. 32
19991 Brandt History 80
44553 Peltier Physics 56
Instructor
ID NAME DEPT_NAME SALARY
---------------------------------------
10101 Srinivasan Comp. Sci. 65000
12121 Wu Finance 90000
15151 Mozart Music 40000
22222 Einstein Physics 95000
Teaches
ID COURSE_ID SEC_ID SEMESTER YEAR
------------------------------------------
10101 CS-101 1 Fall 2009
10101 CS-315 1 Spring 2010
10101 CS-347 1 Fall 2009
22222 PHY-101 1 Fall 2017
Takes
ID COURSE_ID SEC_ID SEMESTER YEAR GRADE
------------------------------------------------
00128 CS-101 1 Fall 2009 A
00128 CS-347 1 Fall 2009 A-
12345 CS-101 1 Fall 2009 C
44553 PHY-101 1 Fall 2017 B-
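One way to read the "no such column" error: in the table setup above, sec_id, semester, and year are columns of takes and teaches, not of student, so the predicates have to compare takes with teaches. A minimal sketch of the approach described in the question (all four relations listed in the from clause, join predicates in the where clause), run through Python's built-in sqlite3 module on the sample rows; only the columns needed for the query are created, and the function name is made up for illustration.

```python
import sqlite3


def students_taught_by(instructor_name):
    """IDs of students taught by the named instructor, without JOIN ... USING."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE student (ID TEXT, name TEXT);
        CREATE TABLE instructor (ID TEXT, name TEXT);
        CREATE TABLE teaches (ID TEXT, course_id TEXT, sec_id INTEGER, semester TEXT, year INTEGER);
        CREATE TABLE takes (ID TEXT, course_id TEXT, sec_id INTEGER, semester TEXT, year INTEGER);
        INSERT INTO student VALUES
            ('00128', 'Zhang'), ('12345', 'Shankar'), ('19991', 'Brandt'), ('44553', 'Peltier');
        INSERT INTO instructor VALUES
            ('10101', 'Srinivasan'), ('12121', 'Wu'), ('15151', 'Mozart'), ('22222', 'Einstein');
        INSERT INTO teaches VALUES
            ('10101', 'CS-101', 1, 'Fall', 2009), ('10101', 'CS-315', 1, 'Spring', 2010),
            ('10101', 'CS-347', 1, 'Fall', 2009), ('22222', 'PHY-101', 1, 'Fall', 2017);
        INSERT INTO takes VALUES
            ('00128', 'CS-101', 1, 'Fall', 2009), ('00128', 'CS-347', 1, 'Fall', 2009),
            ('12345', 'CS-101', 1, 'Fall', 2009), ('44553', 'PHY-101', 1, 'Fall', 2017);
    """)
    rows = conn.execute("""
        SELECT DISTINCT student.ID
        FROM student, takes, instructor, teaches
        WHERE student.ID = takes.ID                 -- student <-> takes
          AND instructor.ID = teaches.ID            -- instructor <-> teaches
          AND takes.course_id = teaches.course_id   -- same section of the same course
          AND takes.sec_id = teaches.sec_id
          AND takes.semester = teaches.semester
          AND takes.year = teaches.year
          AND instructor.name = ?
        ORDER BY student.ID
    """, (instructor_name,)).fetchall()
    conn.close()
    return [row[0] for row in rows]


print(students_taught_by("Einstein"))
```

Every column is qualified with its table name here because ID appears in all four tables and would otherwise be ambiguous.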

SQL query to find the faculty which have taught every subject

I need to write a SQL query to find the faculty which has taught every subject (ie Sam)
With nested queries
Without using aggregate functions (no count, avg, min etc).
I can't seem to figure this out, would really appreciate some help =)
Faculty
fid  fname  fqualifications  fexperience  salary  deptname
100  Sam    ME CS            10           100000  IT
101  John   ME IT            8            80000   IT
102  Max    ME CS            9            90000   CS
103  Jenny  ME CS            5            50000   CS
Course
cid  cname  semester
1    SE     4
2    WT     4
3    CG     5
4    DBMS   5
Teaches
fid  cid  year
100  1    2019
100  2    2018
100  3    2020
100  4    2021
101  1    2017
101  2    2018
102  2    2018
102  3    2019
103  3    2020
103  4    2021
I used this query to get the output, but according to the question I can't use aggregate functions:
select * from faculty f
inner join teaches t
on f.fid=t.fid
inner join course c
on t.cid=c.cid
group by f.fid,f.fname
having count(*)=4;
OUTPUT:
fid  fname  fqualifications  fexperience  salary  deptname  fid  cid  year  cid  cname  semester
100  Sam    ME CS            10           100000  IT        100  1    2019  1    SE     4
Not the most efficient way to proceed, but with the requirements given to you, I would try and rephrase the query like this:
"A faculty that has taught every subject is a faculty that has not skipped even one subject".
Now, faculties that have skipped a subject will produce a NULL when every faculty/subject pair is LEFT JOINed with the teaching records. Chaining plain LEFT JOINs is not enough, because this only finds faculties that have taught nothing at all. Pseudo-SQL:
SELECT DISTINCT faculties.id FROM faculties
LEFT JOIN has_taught ON (has_taught.faculty_id = faculties.id)
LEFT JOIN subjects ON (has_taught.subject_id = subjects.id)
WHERE has_taught.faculty_id IS NULL;
so you need to pair every faculty with every subject first:
SELECT DISTINCT faculties.id FROM faculties
CROSS JOIN subjects
LEFT JOIN has_taught ON
(has_taught.faculty_id = faculties.id AND has_taught.subject_id = subjects.id)
WHERE has_taught.faculty_id IS NULL;
So, faculties that are NOT IN this list would naturally be
SELECT * FROM faculties
WHERE faculties.id NOT IN (
SELECT DISTINCT faculties.id ...
);
and this should only use nested queries, as requested.
Or with a further join
SELECT faculties.* FROM faculties
LEFT JOIN (
SELECT DISTINCT faculties.id ...
) AS they_skipped_some
ON (they_skipped_some.id = faculties.id)
WHERE they_skipped_some.id IS NULL
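The "has not skipped even one subject" idea can be checked against the question's own tables. This is a minimal sketch on Python's built-in sqlite3 module, using the question's table names (Faculty, Course, Teaches) instead of the pseudo-SQL names; only the fid and fname columns of Faculty are created, and the function name is made up for illustration. No aggregate function is involved.

```python
import sqlite3


def faculty_teaching_every_course():
    """Faculty who appear in no (faculty, course) pair that lacks a Teaches row."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE Faculty (fid INTEGER PRIMARY KEY, fname TEXT);
        CREATE TABLE Course (cid INTEGER PRIMARY KEY, cname TEXT, semester INTEGER);
        CREATE TABLE Teaches (fid INTEGER, cid INTEGER, year INTEGER);
        INSERT INTO Faculty VALUES (100, 'Sam'), (101, 'John'), (102, 'Max'), (103, 'Jenny');
        INSERT INTO Course VALUES (1, 'SE', 4), (2, 'WT', 4), (3, 'CG', 5), (4, 'DBMS', 5);
        INSERT INTO Teaches VALUES
            (100, 1, 2019), (100, 2, 2018), (100, 3, 2020), (100, 4, 2021),
            (101, 1, 2017), (101, 2, 2018),
            (102, 2, 2018), (102, 3, 2019),
            (103, 3, 2020), (103, 4, 2021);
    """)
    rows = conn.execute("""
        SELECT f.fid, f.fname
        FROM Faculty f
        WHERE f.fid NOT IN (
            -- every faculty/course pair with no matching Teaches row = a skipped course
            SELECT f2.fid
            FROM Faculty f2
            CROSS JOIN Course c
            LEFT JOIN Teaches t ON t.fid = f2.fid AND t.cid = c.cid
            WHERE t.fid IS NULL
        )
        ORDER BY f.fid
    """).fetchall()
    conn.close()
    return rows


print(faculty_teaching_every_course())
```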

SQL Complex Filter/Join Issue

I'm a novice at SQL and I think this is a relatively basic query but I can't seem to get it to work.
I have two tables. One has group membership and the other details about the group. The key field between the two is Group.
Membership looks like this.
Person EffectiveDate Group
Mary 8/10/2017 A
Joe 8/05/2017 A
Peter 9/01/2017 B
Mike 9/2/2017 B
Alice 9/2/2017 B
Joe 9/10/2017 B
Pam 9/3/2017 C
Note that there are two entries for Joe because he changed groups.
GroupInformation Looks like this:
Group FullName Location Color
A Panthers New York Blue
B Steelers London Orange
C Archers Moscow Yellow
I want to run a query that, on any given day, will give me the individual's group membership along with team details.
So, I want to find the line with the MAX(EffectiveDate) in Membership for each individual person on the date run and left join the GroupInformation table on key Group
If I ran the query on 9/4 I'd get this:
Person EffectiveDate Group FullName Location Color
Mary 8/10/2017 A Panthers New York Blue
Joe 8/05/2017 A Panthers New York Blue
Peter 9/01/2017 B Steelers London Orange
Mike 9/2/2017 B Steelers London Orange
Alice 9/2/2017 B Steelers London Orange
Pam 9/3/2017 C Archers Moscow Yellow
If I ran the query on 9/13 I'd get this:
Person EffectiveDate Group FullName Location Color
Mary 8/10/2017 A Panthers New York Blue
Peter 9/01/2017 B Steelers London Orange
Mike 9/2/2017 B Steelers London Orange
Alice 9/2/2017 B Steelers London Orange
Joe 9/10/2017 B Steelers London Orange
Pam 9/3/2017 C Archers Moscow Yellow
Note that the difference between the two query results is Joe. The 9/4 run has him in Group A joining on 8/5 where the 9/13 run has him in Group B which he joined on 9/10.
My query code is as follows:
Select s.Person,
s.Group,
s.EffectiveDate,
g.FullName,
g.Location,
g.Color
From Membership s
Join GroupInformation g
on s.Group = g.Group
and s.EffectiveDate = (
Select Max(s1.EffectiveDate)
From Membership s1
where s1.Group = g.Group
and s1.EffectiveDate <= '2017-09-14')
However when I run this code I find in my actual data that it omits records. So if I have 150 records in membership the resulting query join and subquery operations will result in an answer with maybe 80 records.
Can't figure out what I'm doing wrong. Guidance please.
Thanks.
You are on the right track, but using the wrong correlation clause:
Select s.Person, s.Group, s.EffectiveDate, g.FullName, g.Location, g.Color
From Membership s Join
GroupInformation g
on s.Group = g.Group
WHERE s.EffectiveDate = (Select Max(s1.EffectiveDate)
From Membership s1
where s1.Person = s.Person and
s1.EffectiveDate <= '2017-09-14'
);
Note that group is a very poor name for a column name in SQL, because it is a SQL key word.
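The effect of correlating on Person rather than Group can be checked against the sample data. This is a minimal sketch on Python's built-in sqlite3 module, with the following assumptions: dates are stored as ISO text so that string comparison orders them correctly, the Group column is double-quoted because it is a keyword, only FullName is kept from GroupInformation, and the function name is made up for illustration.

```python
import sqlite3


def membership_as_of(as_of):
    """Return (Person, Group, FullName) for each person's latest membership on or before as_of."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE Membership (Person TEXT, EffectiveDate TEXT, "Group" TEXT);
        CREATE TABLE GroupInformation ("Group" TEXT, FullName TEXT);
        INSERT INTO Membership VALUES
            ('Mary', '2017-08-10', 'A'), ('Joe', '2017-08-05', 'A'),
            ('Peter', '2017-09-01', 'B'), ('Mike', '2017-09-02', 'B'),
            ('Alice', '2017-09-02', 'B'), ('Joe', '2017-09-10', 'B'),
            ('Pam', '2017-09-03', 'C');
        INSERT INTO GroupInformation VALUES
            ('A', 'Panthers'), ('B', 'Steelers'), ('C', 'Archers');
    """)
    rows = conn.execute("""
        SELECT s.Person, s."Group", g.FullName
        FROM Membership s
        JOIN GroupInformation g ON s."Group" = g."Group"
        WHERE s.EffectiveDate = (
            -- correlate on the person, not the group
            SELECT MAX(s1.EffectiveDate)
            FROM Membership s1
            WHERE s1.Person = s.Person
              AND s1.EffectiveDate <= ?
        )
        ORDER BY s.Person
    """, (as_of,)).fetchall()
    conn.close()
    return rows


print(membership_as_of("2017-09-04"))
print(membership_as_of("2017-09-13"))
```

Running it for both dates should show Joe moving from group A to group B while everyone else stays put.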
What you need is to recharacterize the membership data to group member names as well as dates, then use it as a subquery and join to it in this vein. You're basically saying "give me the max membership date of each person prior to a given date of interest." Caveat: if the EffectiveDate field is strictly 'Date' (rather than a DateTime), it could theoretically still fail if someone changed memberships twice on the same day (no date resolution beyond the day).
Suggest this as a possible alternative (warning this is very hastily thrown together and not tested):
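select m.person, m.group, m.EffectiveDate, g.FullName, g.location, g.color
from (select person, max(effectivedate) effectivedate -- latest date per person
from Membership
where EffectiveDate <= '2017-09-14'
group by person) s
join Membership m -- join back to pick up the group of that latest row
on m.person = s.person and m.EffectiveDate = s.effectivedate
join GroupInformation g
on m.group = g.group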

SQL remove and filter similar results

I have the below query, which gives me a list of orders where an address comes up twice, so we can double pack orders going to the same address:
select * from Orders
where
address in (select address from orders group by address having count(*) = 2)
CustID StockID Address Company
-----------------------------------------------------------------
1217 23185 1 Some Road Stockton
58458 23185 1 Some Road
58459 23185 4 John St
58457 23185 4 John St
299576 23185 9 Roadway PDE Graceland
59470 23185 9 Roadway PDE Cahill Tow
97656 23185 24 Kent St
97677 23185 24 Kent St
212732 23185 23 Best Rd
226583 23185 23 Best Rd c/o John
191718 23185 98 King St
156363 23185 98 King St
121106 23185 19 Broadway
156362 23185 19 Broadway
I want the result to look like this, excluding any address where either of its two rows has a company name. Some rows have nothing in the Company column; however, I want to exclude them as well if the other row for the same address contains a company name.
CustID StockID Address Company
-----------------------------------------------------------------
58459 23185 4 John St
58457 23185 4 John St
97656 23185 24 Kent St
97677 23185 24 Kent St
191718 23185 98 King St
156363 23185 98 King St
121106 23185 19 Broadway
156362 23185 19 Broadway
Hope this all makes sense and appreciate any help
Thank you!
select * from Orders o1
where
0 = (select count(*) from orders o2
where o1.address = o2.address
and company is not null)
This is a correlated sub-query which ties the main query and subquery together over address. I have assumed an "empty" company is NULL; if you store empty strings instead of NULLs, replace the condition with company <> ''.
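A minimal sketch of that correlated sub-query, combined with the question's own count(*) = 2 filter, on Python's built-in sqlite3 module. It uses a subset of the sample rows, stores an empty Company as NULL (the same assumption as above), and the function name is made up for illustration.

```python
import sqlite3


def double_pack_candidates():
    """Orders at addresses that occur exactly twice and carry no company name at all."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE Orders (CustID INTEGER, StockID INTEGER, Address TEXT, Company TEXT)")
    conn.executemany(
        "INSERT INTO Orders VALUES (?, ?, ?, ?)",
        [
            (1217, 23185, "1 Some Road", "Stockton"),
            (58458, 23185, "1 Some Road", None),
            (58459, 23185, "4 John St", None),
            (58457, 23185, "4 John St", None),
            (299576, 23185, "9 Roadway PDE", "Graceland"),
            (59470, 23185, "9 Roadway PDE", "Cahill Tow"),
            (97656, 23185, "24 Kent St", None),
            (97677, 23185, "24 Kent St", None),
            (212732, 23185, "23 Best Rd", None),
            (226583, 23185, "23 Best Rd", "c/o John"),
        ],
    )
    rows = conn.execute("""
        SELECT o1.CustID, o1.Address
        FROM Orders o1
        WHERE o1.Address IN (SELECT Address FROM Orders GROUP BY Address HAVING COUNT(*) = 2)
          -- no row at this address may have a company name
          AND 0 = (SELECT COUNT(*)
                   FROM Orders o2
                   WHERE o2.Address = o1.Address
                     AND o2.Company IS NOT NULL)
        ORDER BY o1.CustID
    """).fetchall()
    conn.close()
    return rows


print(double_pack_candidates())
```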
As @ClockWork-Muse mentions in the comments, there are a lot of things you need to consider in this scenario, pertaining to data cleanliness as well as business logic. Having said that, you can try this for your issue:
;with filtered as
(
select * from Orders
where
address in (select address from orders group by address having count(*) = 2)
)
,cte as
(select address, max(company) as max
from filtered
group by address
having max(company) = '')
select f.* from filtered f
inner join cte c on f.address = c.address
Basically, you create a CTE to identify those addresses which have only blank values for Company, and then join back to the results of your query.