Combining window functions and conditions - sql

Consider the classic Student and Classes many-many relationship, where a student can attend multiple classes and a class contains multiple students.
CREATE TABLE students(
id serial PRIMARY KEY,
name text,
gender text NOT NULL
);
CREATE TABLE schools(
id serial PRIMARY KEY,
name text
);
CREATE TABLE classes(
id serial PRIMARY KEY,
name text,
school_id integer NOT NULL REFERENCES schools (id)
);
CREATE TABLE students_classes(
id serial PRIMARY KEY,
class_id integer NOT NULL REFERENCES classes (id),
student_id integer NOT NULL REFERENCES students (id)
);
The overall query is much bigger - consider that there are schools and other things that add to the complexity of the problem. So I need to use window functions to get things like total_students.
I want a query that gets me all the classes, the total number of students enrolled in that class, the number of guys enrolled and the number of girls.
class_id | n_students | n_guys | n_girls
____________________________________________
| | |
I have the following so far. Can I get some help with the number of guys and girls?
SELECT
school_id,
w.class_id,
w.n_students,
w.n_guys,
w.n_girls
FROM schools
JOIN classes ON classes.school_id = schools.id
JOIN (
SELECT
c.id AS class_id,
COUNT(*) OVER (PARTITION BY sc.class_id) AS n_students,
{Something} AS n_guys,
{Something} AS n_girls
FROM students_classes AS sc
JOIN classes AS c ON sc.class_id = c.id
) as w ON w.class_id = classes.id
WHERE school_id = 81;

You could use this; there is no need for a window/analytic function.
Change 'male' and 'female' to the values your students.gender column actually uses.
SELECT
s.id AS school_id,
c.id AS class_id,
COUNT(*) AS n_students,
SUM(CASE WHEN st.gender = 'male' THEN 1 ELSE 0 END) AS n_guys,
SUM(CASE WHEN st.gender = 'female' THEN 1 ELSE 0 END) AS n_girls
FROM schools s
INNER JOIN classes c
ON c.school_id = s.id
INNER JOIN students_classes sc
ON sc.class_id = c.id
INNER JOIN students st
ON st.id = sc.student_id
WHERE s.id = 81
GROUP BY s.id, c.id
ORDER BY s.id, c.id;

Because you are just using the id, you do not need the schools table. So, this query is basically joins with conditional aggregation:
select c.school_id, c.id as class_id,
count(*) AS n_students,
sum( (st.gender = 'male')::int ) AS n_guys,
sum( (st.gender = 'female')::int ) AS n_girls
from classes c join
students_classes sc
on sc.class_id = c.id join
students st
on st.id = sc.student_id
where c.school_id = 81
group by c.school_id, c.id
order by c.school_id, c.id;
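For anyone who wants to try the conditional aggregation without a PostgreSQL server, here is a minimal sketch using Python's built-in sqlite3 module. The sample rows and the gender values 'male' and 'female' are made up for illustration, and the portable CASE form is used because the ::int cast is PostgreSQL-specific.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE students (id INTEGER PRIMARY KEY, name TEXT, gender TEXT NOT NULL);
CREATE TABLE classes (id INTEGER PRIMARY KEY, name TEXT, school_id INTEGER NOT NULL);
CREATE TABLE students_classes (
    id INTEGER PRIMARY KEY,
    class_id INTEGER NOT NULL REFERENCES classes (id),
    student_id INTEGER NOT NULL REFERENCES students (id)
);
-- Made-up sample rows (assumption, not from the question).
INSERT INTO students (id, name, gender) VALUES
    (1, 'Ann', 'female'), (2, 'Bob', 'male'), (3, 'Cid', 'male');
INSERT INTO classes (id, name, school_id) VALUES
    (10, 'Math', 81), (11, 'Art', 81), (12, 'Law', 99);
INSERT INTO students_classes (class_id, student_id) VALUES
    (10, 1), (10, 2), (10, 3), (11, 1), (12, 2);
""")

# One row per class of school 81; each SUM counts only the rows
# whose CASE expression yields 1.
class_counts = conn.execute("""
SELECT c.school_id,
       c.id AS class_id,
       COUNT(*) AS n_students,
       SUM(CASE WHEN st.gender = 'male' THEN 1 ELSE 0 END) AS n_guys,
       SUM(CASE WHEN st.gender = 'female' THEN 1 ELSE 0 END) AS n_girls
FROM classes c
JOIN students_classes sc ON sc.class_id = c.id
JOIN students st ON st.id = sc.student_id
WHERE c.school_id = 81
GROUP BY c.school_id, c.id
ORDER BY c.id
""").fetchall()

for row in class_counts:
    print(row)
```

Note that the inner joins drop classes with no enrolled students; a LEFT JOIN from classes would be needed to list those with zero counts.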

Related

Find missing records in ONE-TO-MANY relationship query

I have 3 tables:
Class
Id | ClassName
guid | 1A
guid | 2A
Subject
Id | SubjectName
guid | Math
guid | Biography
SubjectOfEachClass
Id | ClassId | SubjectId | TeacherName
guid | guid | guid | James Bond
The way I wanted these tables to work is:
There will be 10 classes in table Class.
There will be 10 subjects in table Subject.
Each class will have 10 subjects, so for 10 classes there will be 100 records.
I ran into a problem: when I queried the SubjectOfEachClass table, there were only 95 records.
The query command I use to find the missing subjects is:
SELECT *
FROM Subject s
JOIN (
SELECT *
FROM SubjectOfEachClass
WHERE ClassId = 'guid'
) AS sc ON s.Id = sc.SubjectId
I replaced the ClassId several times until I found the class that was missing some of the subjects.
I reckon this way of querying is not efficient at all. If I had 100 subjects and 100 classes, there would be no chance of finding the missing subjects this way.
To find every missing subject in all classes:
select c.id, c.classname , s.id , s.SubjectName
from class c
cross apply Subject s
where not exists (
select 1 from SubjectOfEachClass sc
where sc.classid = c.id and sc.subjectid = s.id
)
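The idea of pairing every class with every subject and keeping only the pairs that have no link row can be sketched with Python's sqlite3 module. This is an illustration under assumptions: the ids 'c1', 's1' and so on are invented stand-ins for the guids, and CROSS JOIN is used because SQLite has no CROSS APPLY.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Class (Id TEXT PRIMARY KEY, ClassName TEXT);
CREATE TABLE Subject (Id TEXT PRIMARY KEY, SubjectName TEXT);
CREATE TABLE SubjectOfEachClass (
    Id INTEGER PRIMARY KEY, ClassId TEXT, SubjectId TEXT, TeacherName TEXT
);
-- Invented ids standing in for the guids.
INSERT INTO Class VALUES ('c1', '1A'), ('c2', '2A');
INSERT INTO Subject VALUES ('s1', 'Math'), ('s2', 'Biography');
-- Class 2A deliberately has no Biography row.
INSERT INTO SubjectOfEachClass (ClassId, SubjectId, TeacherName) VALUES
    ('c1', 's1', 'James Bond'),
    ('c1', 's2', 'James Bond'),
    ('c2', 's1', 'James Bond');
""")

# Build every (class, subject) pair, then keep the pairs without a link row.
missing = conn.execute("""
SELECT c.ClassName, s.SubjectName
FROM Class c
CROSS JOIN Subject s
WHERE NOT EXISTS (
    SELECT 1 FROM SubjectOfEachClass sc
    WHERE sc.ClassId = c.Id AND sc.SubjectId = s.Id
)
""").fetchall()

print(missing)
```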
Try this:
SELECT c.id AS classId,
count(sc.id) AS countOfSubjects
FROM SubjectOfEachClass AS sc
INNER JOIN Class AS c ON c.id = sc.classId
GROUP BY c.id
ORDER BY countOfSubjects
Classes with fewer subjects than expected will sort to the top.
Your primary table should be SubjectOfEachClass; then join the referenced tables Subject and Class to it.
select *
from SubjectOfEachClass sc
inner join Subject s on s.Id = sc.SubjectId
inner join Class c on c.Id = sc.ClassId
where sc.ClassId = 'guid'

SQL: Find common rows in different record

I have 3 tables:
Teacher Table (t_id, email, ...)
Student Table (s_id, email, ...)
Teaching Table (t_id, s_id, class_time, ...)
My task is: given two t_id values, find the students that both of these teachers have taught.
Is it possible to accomplish this purely in SQL? If not, I might retrieve the student records for each teacher separately and then search for the students they have in common. That seems like overkill for something that should be possible with a single SQL query.
You can self join to get students for both teachers.
DECLARE @TeacherID1 INT = 1
DECLARE @TeacherID2 INT = 2
SELECT
StudentID = T1.s_id,
Teacher1 = T1.t_id,
Teacher1ClassTime = T1.class_time,
Teacher2 = T2.t_id,
Teacher2ClassTime = T2.class_time
FROM
TeachingTable T1
INNER JOIN TeachingTable T2 ON T2.s_id = T1.s_id AND T2.t_id = @TeacherID2
WHERE
T1.t_id = @TeacherID1
ORDER BY
T1.class_time
select a.s_id
from student a
inner join teaching b on a.s_id = b.s_id
where b.t_id = 'First give t_id'
INTERSECT
select a.s_id
from student a
inner join teaching b on a.s_id = b.s_id
where b.t_id = 'Second give t_id'
This works with SQL Server, but INTERSECT is not available in every database (older MySQL versions, for example, lack it).
select a.s_id
from student a
inner join teaching b on a.s_id = b.s_id
where b.t_id = 'First give t_id'
and a.s_id in (
select c.s_id
from student c
inner join teaching d on c.s_id = d.s_id
where d.t_id = 'Second give t_id'
)
The second one should work with any database.
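Both approaches can be compared side by side in a small sketch using Python's sqlite3 module. The helper names common_students_intersect and common_students_in and the sample rows are hypothetical. Only the teaching table is queried, on the assumption that every s_id in it refers to an existing student.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE teaching (t_id INTEGER, s_id INTEGER, class_time TEXT);
-- Made-up rows: students 101 and 103 were taught by both teachers.
INSERT INTO teaching VALUES
    (1, 100, '09:00'), (1, 101, '10:00'), (1, 103, '13:00'),
    (2, 101, '11:00'), (2, 102, '12:00'), (2, 103, '14:00');
""")


def common_students_intersect(conn, t1, t2):
    """Students taught by both teachers, using INTERSECT."""
    rows = conn.execute("""
        SELECT s_id FROM teaching WHERE t_id = ?
        INTERSECT
        SELECT s_id FROM teaching WHERE t_id = ?
        ORDER BY s_id
    """, (t1, t2)).fetchall()
    return [r[0] for r in rows]


def common_students_in(conn, t1, t2):
    """Same result using an IN subquery instead of INTERSECT."""
    rows = conn.execute("""
        SELECT DISTINCT s_id FROM teaching
        WHERE t_id = ?
          AND s_id IN (SELECT s_id FROM teaching WHERE t_id = ?)
        ORDER BY s_id
    """, (t1, t2)).fetchall()
    return [r[0] for r in rows]


print(common_students_intersect(conn, 1, 2))
print(common_students_in(conn, 1, 2))
```

The DISTINCT in the second helper matters when a teacher has taught the same student more than once; INTERSECT removes duplicates by itself.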

SQL - Selecting highest scores for different categories

Let's say I've got a database with 3 tables:
Players (PK id_player, name...),
Tournaments (PK id_tournament, name...),
Game (PK id_turn, FK id_tournament, FK id_player and score)
Players participate in tournaments. The table called Game keeps track of each player's score in the different tournaments.
I want to create a view that looks like this:
tournament_name  Winner  highest_score
Tournament_1     Jones   300
Tournament_2     White   250
I tried different approaches, but I'm fairly new to SQL (and also to this forum).
I tried using a union all clause like:
select * from (select "Id_player", avg("score") as "Score" from
"Game" where "Id_tournament" = '1' group by "Id_player" order by
"Score" desc) where rownum <= 1
union all
select * from (select "Id_player", avg("score") as "Score" from
"Game" where "Id_tournament" = '2' group by "Id_player" order by
"Score" desc) where rownum <= 1;
Of course it works, but whenever a new tournament happens I would have to manually add another select statement with Id_tournament set to the next value.
EDIT:
So let's say that player 1 scored 50 points in tournament A and player 2 scored 40 points. Player 1 wins, so the view should show only player 1 as the winner of this tournament (or, if possible, two or more players in case of a tie). The next row shows the winner of the second tournament. I don't think I'm going to record multiple games for one player in the same tournament, but if I did, it would probably take the average of all of that player's scores.
EDIT2:
Create table scripts:
create table players
(id_player numeric(5) constraint pk_id_player primary key, name
varchar2(50));
create table tournaments
(id_tournament numeric(5) constraint pk_id_tournament primary key,
name varchar2(50));
create table game
(id_game numeric(5) constraint pk_game primary key, id_player
numeric(5) constraint fk_id_player references players(id_player),
id_tournament numeric(5) constraint fk_id_tournament references
tournaments(id_tournament), score numeric(3));
FINAL EDIT:
OK, in case anyone is wondering, I used Jorge Campos's script, changed it a bit, and it works. Thank you all for helping. Unfortunately I cannot upvote comments yet, so I can only thank you by posting. Here's the final script:
select
t.name,
p.name as winner,
g.score
from
game g inner join tournaments t
on g.id_tournament = t.id_tournament
inner join players p
on g.id_player = p.id_player
inner join
(select g.id_tournament, g.id_player,
row_number() over (partition by t.name order by
score desc) as rd from game g join tournaments t on
g.id_tournament = t.id_tournament
) a
on g.id_player = a.id_player
and g.id_tournament = a.id_tournament
and a.rd=1
order by t.name, g.score desc;
This query could be simplified depending on the RDBMs you are using.
select
t.name,
p.name as winner,
g.score
from
game g inner join tournaments t
on g.id_tournament = t.id_tournament
inner join players p
on g.id_player = p.id_player
inner join
(select id_tournament,
id_player,
row_number() over (partition by id_tournament order by score desc) as rd
from game
) a
on g.id_player = a.id_player
and g.id_tournament = a.id_tournament
and a.rd=1
order by t.name, g.score desc
Assuming what you want is "display the high score of each player in each tournament",
your query in MS SQL Server would look like this:
select
t.name as tournament_name,
p.name as Winner,
Max(g.score) as [Highest_Score]
from Tournaments t
Inner join Game g on t.id_tournament=g.id_tournament
inner join Players p on p.id_player=g.id_player
group by
g.id_tournament,
g.id_player,
t.name,
p.name
Please check whether this works for you:
SELECT tournemntData.id_tournament ,
tournemntData.name ,
dbo.Players.name ,
tournemntData.Score
FROM dbo.Game
INNER JOIN ( SELECT dbo.Tournaments.id_tournament ,
dbo.Tournaments.name ,
MAX(dbo.Game.score) AS Score
FROM dbo.Game
INNER JOIN dbo.Tournaments ON Tournaments.id_tournament = Game.id_tournament
INNER JOIN dbo.Players ON Players.id_player = Game.id_player
GROUP BY dbo.Tournaments.id_tournament ,
dbo.Tournaments.name
) tournemntData ON tournemntData.id_tournament = Game.id_tournament
INNER JOIN dbo.Players ON Players.id_player = Game.id_player
WHERE tournemntData.Score = dbo.Game.score
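The "join back to the per-tournament maximum" idea from the last answer can be tried out with Python's sqlite3 module. This is a sketch with made-up players and scores; the alias names best and top_score are hypothetical. Tournament_2 contains a deliberate tie to show that both winners are returned.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE players (id_player INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE tournaments (id_tournament INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE game (
    id_game INTEGER PRIMARY KEY,
    id_player INTEGER REFERENCES players (id_player),
    id_tournament INTEGER REFERENCES tournaments (id_tournament),
    score INTEGER
);
-- Made-up sample data; Tournament_2 ends in a tie.
INSERT INTO players VALUES (1, 'Jones'), (2, 'White'), (3, 'Black');
INSERT INTO tournaments VALUES (1, 'Tournament_1'), (2, 'Tournament_2');
INSERT INTO game (id_player, id_tournament, score) VALUES
    (1, 1, 300), (2, 1, 200), (2, 2, 250), (3, 2, 250), (1, 2, 100);
""")

# "best" holds the top score per tournament; joining game rows back to it
# keeps every row that reaches that score, so ties yield several winners.
winners = conn.execute("""
SELECT t.name AS tournament_name, p.name AS winner, g.score AS highest_score
FROM game g
JOIN (
    SELECT id_tournament, MAX(score) AS top_score
    FROM game
    GROUP BY id_tournament
) best ON best.id_tournament = g.id_tournament AND best.top_score = g.score
JOIN tournaments t ON t.id_tournament = g.id_tournament
JOIN players p ON p.id_player = g.id_player
ORDER BY t.name, p.name
""").fetchall()

for row in winners:
    print(row)
```

A row_number() filter on rd = 1, by contrast, keeps exactly one row per tournament, so it picks one of the tied players arbitrarily.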

SQL query using NULL

I have two tables, student and school.
student
stid | stname | schid | status
school
schid | schname
Status can be many things for temporary students, but NULL for permanent students.
How do I list the names of schools that have no temporary students?
Using conditional aggregation, you can count the number of permanent students in each school.
If the total count for a school equals that conditional count, then the school does not have any temporary students.
Using JOIN
SELECT sc.schid,
sc.schname
FROM student s
JOIN school sc
ON s.schid = sc.schid
GROUP BY sc.schid,
sc.schname
HAVING Count(CASE WHEN status IS NULL THEN 1 END) = Count(*)
Another way using EXISTS
SELECT sc.schid,
sc.schname
FROM school sc
WHERE EXISTS (SELECT 1
FROM student s
WHERE s.schid = sc.schid
HAVING Count(CASE WHEN status IS NULL THEN 1 END) = Count(*))
You can use not exists to only select schools that do not have temporary students:
select * from school s
where not exists (
select 1 from student s2
where s2.schid = s.schid
and s2.status is not null
)
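You can use a regular join, but note that this lists schools that have at least one permanent student, whether or not they also have temporary students.
SELECT DISTINCT c.schname
FROM student s
INNER JOIN school c ON s.schid = c.schid
WHERE s.status IS NULL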
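The NOT EXISTS and conditional-count approaches differ in one edge case, which a small sketch with Python's sqlite3 module makes visible. The school names and status values are made up; 'Empty' is a school without any students.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE school (schid INTEGER PRIMARY KEY, schname TEXT);
CREATE TABLE student (
    stid INTEGER PRIMARY KEY, stname TEXT, schid INTEGER, status TEXT
);
-- Made-up data: South has one temporary student, Empty has no students.
INSERT INTO school VALUES (1, 'North'), (2, 'South'), (3, 'Empty');
INSERT INTO student VALUES
    (1, 'a', 1, NULL), (2, 'b', 1, NULL),
    (3, 'c', 2, NULL), (4, 'd', 2, 'exchange');
""")

# NOT EXISTS: schools for which no student row has a non-NULL status.
no_temps_not_exists = [r[0] for r in conn.execute("""
SELECT sc.schname
FROM school sc
WHERE NOT EXISTS (
    SELECT 1 FROM student s
    WHERE s.schid = sc.schid AND s.status IS NOT NULL
)
ORDER BY sc.schname
""")]

# JOIN + HAVING: COUNT(expr) skips NULLs, so it counts permanent students only.
no_temps_having = [r[0] for r in conn.execute("""
SELECT sc.schname
FROM student s
JOIN school sc ON s.schid = sc.schid
GROUP BY sc.schid, sc.schname
HAVING COUNT(CASE WHEN s.status IS NULL THEN 1 END) = COUNT(*)
ORDER BY sc.schname
""")]

print(no_temps_not_exists)
print(no_temps_having)
```

The inner join drops schools that have no students at all, so only the NOT EXISTS form reports 'Empty'. Which behaviour is wanted depends on the requirement.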

Complicated table join

I thought I had a good grasp on table joins but there is one problem here I can't figure out.
I am trying to track the progress of students on specifically required courses. Some students are required to complete an exact list of courses before further qualification.
Tables (simplified):
students
--------
id INT PRIMARY KEY
name VARCHAR(50)
student_courses
---------------
student_id INT PRIMARY KEY
course_id TINYINT PRIMARY KEY
course_status TINYINT (Not done, Started, Completed)
steps_done TINYINT
total_steps TINYINT
date_created DATETIME
date_modified DATETIME
courses
-------
id TINYINT PRIMARY KEY
name VARCHAR(50)
I want to insert a list of required courses, for example 5 different courses in the courses table and then select a specific student and get list of all the courses required, whether a row exists for that course in the student_courses table or not.
I guess I could insert all rows from the courses table into the student_courses table for each student, but I don't want that because not all students need to do these courses. And what if new courses are added later?
I just want a result which is something like this:
students table:
id   name
---  ------------------
1    George Smith
2    Dana Jones
3    Maria Cobblestone
SELECT * FROM students (JOIN bla bla bla - this is the point where I'm lost...)
WHERE students.id = 1
Result:
id   name          course_id  courses.name  course_status  steps_done
---  ------------  ---------  ------------  -------------  ----------
1    George Smith  1          Botany        Not started    0
1    George Smith  2          Biology       NULL           NULL
1    George Smith  3          Physics       NULL           NULL
1    George Smith  4          Algebra       Completed      34
1    George Smith  5          Sewing        Started        2
If the course_status or steps_done is NULL it means that no row exists for this student for this course in the student_courses table.
The idea is then using this in MS Access (or some other system) and have the row automatically inserted in the student_courses table once you enter a value in the NULL field.
You can't just use an outer join to do this, you need to create a list of all students/classes combinations that you're interested in first, then use that list in a LEFT JOIN. Can be done in a cte/subquery using CROSS JOIN:
;WITH cte AS (SELECT DISTINCT s.id Student_ID
,s.name
,c.id Course_ID
,c.name Class_Name
FROM Students s
CROSS JOIN Courses c)
SELECT cte.*,sc.course_status
FROM cte
LEFT JOIN student_courses sc
ON cte.Course_ID = sc.course_id
AND cte.Student_ID = sc.student_id
Can also use a subquery if needs to be done in Access (not 100% on syntax in Access):
SELECT sub.*,sc.course_status
FROM (SELECT DISTINCT s.id Student_ID
,s.name
,c.id Course_ID
,c.name Class_Name
FROM Students s
CROSS JOIN Courses c
) AS sub
LEFT JOIN student_courses sc
ON sub.Course_ID = sc.course_id
AND sub.Student_ID = sc.student_id
You want a left outer join. The first table is from the courses table and is used for the required courses (defined in the where clause).
select s.id, s.name, c.id, c.name, sc.course_status, sc.steps_done
from (courses as c left join
student_courses as sc
on sc.course_id = c.id and
sc.student_id = 1
) left join
students as s
on sc.student_id = s.id
where c.id in (<list of required courses>)
order by s.id, c.id;
I think I have all the "Access"isms in there.
Actually, the above will be missing the student's name when he or she has no row for a course. The following is more correct:
select s.id, s.name, c.id, c.name, sc.course_status, sc.steps_done
from (courses as c left join
student_courses as sc
on sc.course_id = c.id and
sc.student_id = 1
) inner join
students as s
on s.id = 1
where c.id in (<list of required courses>)
order by s.id, c.id;
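To see the expected NULLs for courses the student has not started, here is a minimal sketch with Python's sqlite3 module. The sample rows are abridged from the question, and the helper required_courses and its hard-coded list of required course ids (1, 2, 3) are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE students (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE courses (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE student_courses (
    student_id INTEGER,
    course_id INTEGER,
    course_status TEXT,
    steps_done INTEGER,
    PRIMARY KEY (student_id, course_id)
);
INSERT INTO students VALUES (1, 'George Smith'), (2, 'Dana Jones');
INSERT INTO courses VALUES (1, 'Botany'), (2, 'Biology'), (3, 'Physics');
-- George has no row for Biology.
INSERT INTO student_courses VALUES
    (1, 1, 'Not started', 0), (1, 3, 'Completed', 34), (2, 2, 'Started', 2);
""")


def required_courses(conn, student_id):
    """All required courses for one student; status columns are None
    when no student_courses row exists for that pair."""
    return conn.execute("""
        SELECT s.id, s.name, c.id, c.name, sc.course_status, sc.steps_done
        FROM students s
        CROSS JOIN courses c
        LEFT JOIN student_courses sc
               ON sc.student_id = s.id AND sc.course_id = c.id
        WHERE s.id = ? AND c.id IN (1, 2, 3)
        ORDER BY c.id
    """, (student_id,)).fetchall()


for row in required_courses(conn, 1):
    print(row)
```

The join condition matches on both student_id and course_id; matching on course_id alone would pull in other students' progress rows.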