Calculating weighted marks within weighted marks - sql

I have a maths / SQL problem that I've been grappling with.
I have two tables with the following structure:
CREATE TABLE Exams
(
ExamID INT PRIMARY KEY,
ExamName VARCHAR(100),
CourseID INT,
RelatedExamID INT NULL,
Weighting DECIMAL (5,3)
)
CREATE TABLE ExamMarks
(
ExamMarkID INT IDENTITY PRIMARY KEY,
StudentID VARCHAR(8),
ExamID INT FOREIGN KEY REFERENCES Exams(ExamID),
ExamMark DECIMAL (5,4)
)
The exams table contains the following data:
INSERT INTO Exams (ExamID, ExamName, CourseID, RelatedExamID, Weighting)
VALUES (1, 'English',1,NULL,1),
(2, 'French',2,NULL,1),
(3, 'Maths',3,NULL,0.6),
(4, 'Statistics',3,NULL,0.4),
(5, 'Physics Part 1',4,NULL,0.5),
(6, 'Physics Part 2',4,NULL,0.5),
(7, 'Heat and Mass',4,6,0.25)
The Exam Marks table contains the following data:
INSERT INTO ExamMarks (StudentID, ExamID, ExamMark)
VALUES ('00112233', 1, 0.75),
('00112233', 2, 0.52),
('00112233', 3, 0.68),
('00112233', 4, 0.8),
('00112233', 5, 0.50),
('00112233', 6, 0.66),
('00112233', 7, 0.45)
The idea here is that a given course may have
A single exam (such as with English and French)
Multiple exams (such as with course 3, which has 2 exams called "Maths" and "Statistics") which have independent weightings - in this case, the course is structured so that the Maths exam is 60% of the total, and the Statistics exam contributes 40%
Exams with sub-exams, such as with Course 4, more on which shortly.
If I want to get the weighted total marks for each candidate for each course - forgetting about course 4 for the moment - I do the following:
SELECT em.StudentID,e.CourseID, SUM(em.ExamMark * e.Weighting)/SUM(e.Weighting)
FROM Exams e
INNER JOIN ExamMarks em ON e.ExamID = em.ExamID
GROUP BY em.StudentID,e.CourseID
However, Course 4 is made up of 3 parts:
Physics Part 1 - 50% of the total, and
Physics Part 2 - also 50% of the total
Heat and Mass, which makes up 25% of Physics Part 2 (thus its ID in the 'RelatedExamID' column)
To be clear, Heat and Mass makes up 25% of Physics Part 2, which itself makes up 50% of the course.
I've put these figures into an Excel spreadsheet, and after a lot of head-scratching, was able to figure that our student should end up with a mark for Course 4 of 55.375%.
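The 55.375% figure can be reproduced by hand. The short sketch below works through the arithmetic, assuming that the Physics Part 2 paper itself counts for the remaining 75% of that component once Heat and Mass takes its 25%; the variable names are only illustrative.

```python
# Marks and weightings for Course 4, taken from the sample data above.
physics_part_1 = 0.50
physics_part_2 = 0.66
heat_and_mass = 0.45
sub_weight = 0.25  # Heat and Mass is 25% of Physics Part 2

# Assumption: the Physics Part 2 paper supplies the other 75% of its component.
part_2_overall = (1 - sub_weight) * physics_part_2 + sub_weight * heat_and_mass

# Both parts carry a weighting of 0.5 within the course.
course_4_mark = 0.5 * physics_part_1 + 0.5 * part_2_overall

print(round(part_2_overall, 5))
print(round(course_4_mark, 5))
```

Physics Part 2 comes out at 0.6075 and the course at 0.55375, which matches the spreadsheet result.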
However, unfortunately, my SQL (and maths / logic) skills are not good enough to get to this result in a SQL query.
The data above represents something of a simplification. There are, in fact, around 10000 marks to be considered (around 500 students involved), for about 200 different exams, of which perhaps 30 are "sub-exams". Each year, these have to be totalled up to give the student their mark per course, given these weightings.

Okay, so I have managed to find a solution. I'd still be grateful for any others which might be more efficient or robust from those who know more than me.
--Get SubComponent marks
WITH SubComponents
AS
(SELECT
em.StudentID
,em.ExamID
,e.RelatedExamID
,e.Weighting
,e.Weighting * em.ExamMark AS WeightedMark
FROM Exams e
INNER JOIN ExamMarks em
ON e.ExamID = em.ExamID
WHERE e.RelatedExamID IS NOT NULL),
--Get marks for those components which have subcomponents
ParentComponents
AS
(SELECT
em.StudentID
,e.CourseID
,em.ExamID
,e.RelatedExamID
,e.Weighting
,((1 - SubComponents.Weighting) * em.ExamMark)
+ SubComponents.WeightedMark AS OverallComponentMark
FROM Exams e
INNER JOIN ExamMarks em
ON e.ExamID = em.ExamID
INNER JOIN SubComponents
ON SubComponents.RelatedExamID = e.ExamID
AND SubComponents.StudentID=em.StudentID),
--Get marks for those components which are neither parent nor child components
StandaloneComponents
AS
(SELECT
em.StudentID
,e2.CourseID
,em.ExamID
,e2.RelatedExamID
,e2.Weighting
,em.ExamMark
FROM Exams e2
INNER JOIN ExamMarks em
ON e2.ExamID = em.ExamID
WHERE NOT EXISTS (SELECT
*
FROM Exams
WHERE RelatedExamID = e2.ExamID)
AND e2.RelatedExamID IS NULL),
-- Bring all the above together
ComponentMarks
AS
(SELECT
StudentID
,CourseID
,ExamID
,Weighting
,ExamMark
FROM StandaloneComponents
UNION
SELECT
StudentID
,CourseID
,ExamID
,Weighting
,OverallComponentMark
FROM ParentComponents)
-- Finally group and combine marks at course level
SELECT
StudentID
,CourseID
,SUM(ExamMark * Weighting) / SUM(Weighting)
FROM ComponentMarks
GROUP BY StudentID
,CourseID
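Since other approaches were asked for, here is one possible alternative, offered as a sketch rather than a definitive answer. It pre-aggregates the sub-exams per parent (the hypothetical `SubTotals` CTE) and LEFT JOINs that onto the top-level exams, so standalone and parent exams go through the same expression. It assumes sub-exams are only one level deep. The script runs it against the sample data in SQLite through Python's `sqlite3` module, so the DDL is simplified (no `IDENTITY`, `REAL` instead of `DECIMAL`).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Exams (
    ExamID INTEGER PRIMARY KEY,
    ExamName TEXT,
    CourseID INTEGER,
    RelatedExamID INTEGER,
    Weighting REAL
);
CREATE TABLE ExamMarks (
    StudentID TEXT,
    ExamID INTEGER REFERENCES Exams(ExamID),
    ExamMark REAL
);
INSERT INTO Exams VALUES
    (1, 'English', 1, NULL, 1), (2, 'French', 2, NULL, 1),
    (3, 'Maths', 3, NULL, 0.6), (4, 'Statistics', 3, NULL, 0.4),
    (5, 'Physics Part 1', 4, NULL, 0.5), (6, 'Physics Part 2', 4, NULL, 0.5),
    (7, 'Heat and Mass', 4, 6, 0.25);
INSERT INTO ExamMarks VALUES
    ('00112233', 1, 0.75), ('00112233', 2, 0.52), ('00112233', 3, 0.68),
    ('00112233', 4, 0.8), ('00112233', 5, 0.50), ('00112233', 6, 0.66),
    ('00112233', 7, 0.45);
""")

query = """
WITH SubTotals AS (
    -- Total weight and weighted mark of all sub-exams, per student and parent
    SELECT em.StudentID, e.RelatedExamID AS ParentExamID,
           SUM(e.Weighting) AS SubWeight,
           SUM(e.Weighting * em.ExamMark) AS SubMark
    FROM Exams e
    JOIN ExamMarks em ON em.ExamID = e.ExamID
    WHERE e.RelatedExamID IS NOT NULL
    GROUP BY em.StudentID, e.RelatedExamID
)
SELECT em.StudentID, e.CourseID,
       -- A parent keeps (1 - SubWeight) of its own mark; standalone exams keep all of it
       SUM(e.Weighting * ((1 - COALESCE(s.SubWeight, 0)) * em.ExamMark
                          + COALESCE(s.SubMark, 0)))
       / SUM(e.Weighting) AS CourseMark
FROM Exams e
JOIN ExamMarks em ON em.ExamID = e.ExamID
LEFT JOIN SubTotals s
       ON s.ParentExamID = e.ExamID AND s.StudentID = em.StudentID
WHERE e.RelatedExamID IS NULL
GROUP BY em.StudentID, e.CourseID
ORDER BY e.CourseID
"""
rows = conn.execute(query).fetchall()
for student_id, course_id, mark in rows:
    print(student_id, course_id, round(mark, 5))
```

For the sample student this gives 0.55375 for Course 4, the same as the spreadsheet. Unlike the join in ParentComponents, a parent with several sub-exams still produces a single row here, because the sub-exams are summed before the join.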


How to self join only a subset of rows in PostgreSQL?

Given the following table:
CREATE TABLE people (
name TEXT PRIMARY KEY,
age INT NOT NULL
);
INSERT INTO people VALUES
('Lisa', 30),
('Marta', 27),
('John', 32),
('Sam', 41),
('Alex', 12),
('Aristides',43),
('Cindi', 1)
;
I am using a self join to compare each value of a specific column with all the other values of the same column. My query looks something like this:
SELECT DISTINCT A.name as child
FROM people A, people B
WHERE A.age + 16 < B.age;
This query aims to spot potential sons/daughters based on age difference. More specifically, my goal is to identify the set of people that may have stayed in the same house as one of their parents (ordered by name), assuming that there must be an age difference of at least 16 years between a child and their parents.
Now I would like to combine this kind of logic with the information that is in another table.
The other table looks something like that:
CREATE TABLE houses (
house_name TEXT NOT NULL,
house_member TEXT NOT NULL REFERENCES people(name)
);
INSERT INTO houses VALUES
('house Smith', 'Lisa'),
('house Smith', 'Marta'),
('house Smith', 'John'),
('house Doe', 'Lisa'),
('house Doe', 'Marta'),
('house Doe', 'Alex'),
('house Doe', 'Sam'),
('house McKenny', 'Aristides'),
('house McKenny', 'John'),
('house McKenny', 'Cindi')
;
The two tables can be joined ON houses.house_member = people.name.
More specifically I would like to spot the children only within the same house. It does not make sense to compare the age of each person with the age of all the others, but instead it would be more efficient to compare the age of each person with all the other people in the same house.
My idea is to perform the self join from above but only within a PARTITION BY house_name. However, I don't think this is a good idea since I do not have an aggregate function. The same applies to GROUP BY statements as well. What could I do here?
The expected output should be the following, ordered by house_member:
house_member
Alex
Cindi
For simplicity I have created a fiddle.
First, join the two tables to build one table that has all three pieces of information: house_name, house_member, age.
Then join it with itself just as you did originally, and add one extra filter so that only people in the same house are compared.
WITH
CTE_All
AS
(
SELECT
houses.house_name
,houses.house_member
,people.age
FROM
houses
INNER JOIN people ON people.name = houses.house_member
)
SELECT DISTINCT
Children.house_name
,Children.house_member AS child_name
FROM
CTE_All AS Children
INNER JOIN CTE_All AS Parents
ON Children.age + 16 < Parents.age
-- this is our age difference
AND Children.house_name = Parents.house_name
-- within the same house
;
All this is one single query. You don't have to use a CTE; you can inline it as a subquery, but it is more readable with a CTE.
Result
house_name | child_name
:------------ | :---------
house Doe | Alex
house McKenny | Cindi
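As a sketch of the inlined form mentioned above, the same logic can be written as plain joins without a CTE. The aliases (`child_house`, `parent_house`, `child`, `parent`) are made up for illustration, and the script runs against SQLite via Python's `sqlite3` module rather than PostgreSQL, using the data from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE people (name TEXT PRIMARY KEY, age INT NOT NULL);
INSERT INTO people VALUES
    ('Lisa', 30), ('Marta', 27), ('John', 32), ('Sam', 41),
    ('Alex', 12), ('Aristides', 43), ('Cindi', 1);
CREATE TABLE houses (
    house_name TEXT NOT NULL,
    house_member TEXT NOT NULL REFERENCES people(name)
);
INSERT INTO houses VALUES
    ('house Smith', 'Lisa'), ('house Smith', 'Marta'), ('house Smith', 'John'),
    ('house Doe', 'Lisa'), ('house Doe', 'Marta'), ('house Doe', 'Alex'),
    ('house Doe', 'Sam'), ('house McKenny', 'Aristides'),
    ('house McKenny', 'John'), ('house McKenny', 'Cindi');
""")

query = """
SELECT DISTINCT child_house.house_member
FROM houses AS child_house
JOIN people AS child ON child.name = child_house.house_member
-- pair each member only with members of the same house
JOIN houses AS parent_house ON parent_house.house_name = child_house.house_name
JOIN people AS parent ON parent.name = parent_house.house_member
WHERE child.age + 16 < parent.age
ORDER BY child_house.house_member
"""
children = [row[0] for row in conn.execute(query)]
print(children)
```

This returns Alex and Cindi, the expected output from the question. DISTINCT is still needed because a child may qualify against more than one older housemate.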

I cannot execute this query SQL

Show how to define the view student grades (ID, GPA) giving the grade-point average of each student; recall that we used a relation grade_points (grade, points) to get the numeric points associated with a letter grade. Make sure your view definition correctly handles the case of null values for the grade attribute of the takes relation.
create view student_grades(ID, GPA) as
select ID, credit_ points / decode(credit sum, 0, NULL, credit_sum)
from ((select ID, sum(decode(grade, NULL, 0, credits)) as credit_sum,
sum(decode(grade, NULL, 0, credits*points)) as credit_points
from(takes natural join course) natural left outer join grade points group by ID)
union
select ID, NULL
from student
where ID not in (select ID from takes));
Can someone please correct this code?
I think your decode() and sum() parameters are mixed up. In MySQL, DECODE() takes 2 arguments (see the link below); the four-argument form used in your query is Oracle's DECODE.
https://www.w3resource.com/mysql/encryption-and-compression-functions/decode().php
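One way to avoid the DECODE question altogether is to use CASE and NULLIF, which are standard SQL. The sketch below assumes the DECODE calls in the question are meant in the Oracle sense, DECODE(expr, search, result, default). It also assumes identifiers such as `credit_ points`, `credit sum` and `grade points` are copy-paste damage for `credit_points`, `credit_sum` and `grade_points`, since the stray spaces alone would cause syntax errors. The table definitions are hypothetical, cut-down versions of the textbook relations, and the script runs in SQLite via Python's `sqlite3` module.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical, simplified versions of the textbook relations.
CREATE TABLE student (ID INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE course (course_id TEXT PRIMARY KEY, credits INTEGER);
CREATE TABLE takes (ID INTEGER, course_id TEXT, grade TEXT);
CREATE TABLE grade_points (grade TEXT PRIMARY KEY, points REAL);

INSERT INTO student VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
INSERT INTO course VALUES ('CS-101', 3), ('CS-190', 4);
INSERT INTO grade_points VALUES ('A', 4.0), ('B', 3.0);
INSERT INTO takes VALUES
    (1, 'CS-101', 'A'), (1, 'CS-190', 'B'),  -- two graded courses
    (2, 'CS-101', NULL);                     -- enrolled, no grade yet

CREATE VIEW student_grades (ID, GPA) AS
SELECT ID,
       -- CASE replaces DECODE(grade, NULL, 0, ...)
       SUM(CASE WHEN grade IS NULL THEN 0 ELSE credits * points END)
       -- NULLIF replaces DECODE(credit_sum, 0, NULL, credit_sum)
       / NULLIF(SUM(CASE WHEN grade IS NULL THEN 0 ELSE credits END), 0)
FROM takes NATURAL JOIN course NATURAL LEFT OUTER JOIN grade_points
GROUP BY ID
UNION
-- students with no rows in takes get a NULL GPA
SELECT ID, NULL FROM student WHERE ID NOT IN (SELECT ID FROM takes);
""")

gpa_rows = conn.execute("SELECT ID, GPA FROM student_grades ORDER BY ID").fetchall()
print(gpa_rows)
```

Both branches of the UNION return two columns here, which the set operation requires. Student 1 gets (3 * 4.0 + 4 * 3.0) / 7, while students 2 and 3 get NULL because they have no graded credits.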

Accessing to total number in each second level(Postgres Hierarchical Query Practice)

I was practicing with Postgres and got stuck on something I couldn't find a way to achieve. I have a simple database with the following table:
CREATE TABLE public.department
(
"deptId" integer NOT NULL PRIMARY KEY,
name character varying(30) COLLATE pg_catalog."default" NOT NULL,
"parentId" integer,
"numEmpl" integer NOT NULL,
CONSTRAINT "department_parentId_fkey" FOREIGN KEY ("parentId")
REFERENCES public.department ("deptId") MATCH SIMPLE
ON UPDATE NO ACTION
ON DELETE NO ACTION
)
and then I have some data in the table. Short example is
insert into department values (1, 'Headquarter', 1, 10);
insert into department values (2, 'Sales', 1, 15);
insert into department values (3, 'Logistics', 1, 25);
...
I was trying to get the total number of people who are employed in each second-level department.
I am able to get the total number of people employed in each department, but according to what I found online, this should be possible with "hierarchical queries". Currently, I am using
parentId=1
while querying.
Any solutions for this? Thank you.
Here is one option:
-- the table was created with quoted mixed-case column names, so they must be quoted here too
with recursive cte as (
select "deptId" as rootid, "deptId" from department where "parentId" = 1 and "deptId" <> 1
union all
select c.rootid, d."deptId"
from cte c
inner join department d on d."parentId" = c."deptId" and d."deptId" <> 1
)
select rootid, count(*) cnt from cte group by rootid
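Note that the recursive query above counts departments under each second-level root. If the goal is the head count, one option is to carry numEmpl through the recursion and sum it instead. The sketch below does that in SQLite via Python's `sqlite3` module; departments 4 to 7 are hypothetical rows added only to give the hierarchy some depth, and the foreign key is omitted for brevity.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE department (
    deptId INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    parentId INTEGER,
    numEmpl INTEGER NOT NULL
);
INSERT INTO department VALUES
    (1, 'Headquarter', 1, 10), (2, 'Sales', 1, 15), (3, 'Logistics', 1, 25),
    -- hypothetical sub-departments, not part of the question's data
    (4, 'Sales EU', 2, 5), (5, 'Sales US', 2, 7),
    (6, 'Warehouse', 3, 20), (7, 'Sales EU North', 4, 3);
""")

query = """
WITH RECURSIVE cte (rootid, deptid, numempl) AS (
    -- anchor: the second-level departments (children of the root, minus the root itself)
    SELECT deptId, deptId, numEmpl
    FROM department
    WHERE parentId = 1 AND deptId <> 1
    UNION ALL
    -- step: walk down one level, remembering which second-level root we started from
    SELECT c.rootid, d.deptId, d.numEmpl
    FROM cte c
    JOIN department d ON d.parentId = c.deptid AND d.deptId <> 1
)
SELECT rootid, SUM(numempl) AS total_empl
FROM cte
GROUP BY rootid
ORDER BY rootid
"""
totals = dict(conn.execute(query).fetchall())
print(totals)
```

With these rows, Sales (2) totals 15 + 5 + 7 + 3 = 30 and Logistics (3) totals 25 + 20 = 45. In PostgreSQL the column names would need double quotes, since the table was created with quoted mixed-case identifiers.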

PostgreSQL query not returning result as intended

I would like to generate a list of all days where every sailor booked a boat in that particular day.
The table scheme is as follows:
CREATE TABLE SAILOR(
SID INTEGER NOT NULL,
NAME VARCHAR(50) NOT NULL,
RATING INTEGER NOT NULL,
AGE FLOAT NOT NULL,
PRIMARY KEY(SID)
);
CREATE TABLE BOAT(
BID INTEGER NOT NULL,
NAME VARCHAR(50) NOT NULL,
COLOR VARCHAR(50) NOT NULL,
PRIMARY KEY(BID)
);
CREATE TABLE RESERVE (
SID INTEGER NOT NULL REFERENCES SAILOR(SID),
BID INTEGER NOT NULL REFERENCES BOAT(BID),
DAY DATE NOT NULL,
PRIMARY KEY(SID, BID, DAY));
The data is as follows:
INSERT INTO SAILOR(SID, NAME, RATING, AGE)
VALUES
(64, 'Horatio', 7, 35.0),
(74, 'Horatio', 9, 35.0);
INSERT INTO BOAT(BID, NAME, COLOR)
VALUES
(101, 'Interlake', 'blue'),
(102, 'Interlake', 'red'),
(103, 'Clipper', 'green'),
(104, 'Marine', 'red');
INSERT INTO RESERVE(SID, BID, DAY)
VALUES
(64, 101, '09/05/98'),
(64, 102, '09/08/98'),
(74, 103, '09/08/98');
I have tried using this code:
SELECT DAY
FROM RESERVE R
WHERE NOT EXISTS (
SELECT SID
FROM SAILOR S
EXCEPT
SELECT S.SID
FROM SAILOR S, RESERVE R
WHERE S.SID = R.SID)
GROUP BY DAY;
but it returns a list of all days, no exception. The only day that it should return is "09/08/98". How do I solve this?
I would phrase your query as:
SELECT r.DAY
FROM RESERVE r
GROUP BY r.DAY
HAVING COUNT(DISTINCT r.SID) = (SELECT COUNT(*) FROM SAILOR);
The above query says to return any day in the RESERVE table whose distinct SID sailor count matches the count of every sailor.
This assumes that SID sailor entries in the RESERVE table would only be made with sailors that actually appear in the SAILOR table. This seems reasonable, and can be enforced using primary/foreign key relationships between the two tables.
Taking a slightly different approach of just counting unique sailors per day:
SELECT day FROM (
SELECT COUNT(DISTINCT sid), day FROM reserve GROUP BY day
) AS sailors_per_day
WHERE count = (SELECT COUNT(*) FROM sailor);
+------------+
| day |
|------------|
| 1998-09-08 |
+------------+
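As for why the original attempt returns every day: its inner subquery declares its own alias R and never refers to the outer row's DAY, so it evaluates to the same thing for every row. Every sailor has at least one reservation, so the EXCEPT is empty and NOT EXISTS is always true. If you prefer to keep the NOT EXISTS style, a correlated double NOT EXISTS is one way to express "there is no sailor without a reservation on this day". The following is a sketch, run in SQLite via Python's `sqlite3` module with a trimmed-down schema; the alias `r2` is made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sailor (sid INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE reserve (
    sid INTEGER NOT NULL REFERENCES sailor(sid),
    bid INTEGER NOT NULL,
    day TEXT NOT NULL,
    PRIMARY KEY (sid, bid, day)
);
INSERT INTO sailor VALUES (64, 'Horatio'), (74, 'Horatio');
INSERT INTO reserve VALUES
    (64, 101, '09/05/98'), (64, 102, '09/08/98'), (74, 103, '09/08/98');
""")

query = """
SELECT DISTINCT r.day
FROM reserve r
WHERE NOT EXISTS (
    -- a sailor ...
    SELECT 1 FROM sailor s
    WHERE NOT EXISTS (
        -- ... who has no reservation on the outer row's day
        SELECT 1 FROM reserve r2
        WHERE r2.sid = s.sid AND r2.day = r.day
    )
)
"""
days = [row[0] for row in conn.execute(query)]
print(days)
```

Only 09/08/98 survives, because sailor 74 has no reservation on 09/05/98. The counting queries above are shorter; this form mainly shows what the correlation to the outer day has to look like.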

SQL - EXIST OR ALL?

I have two tables, student and grades.
The grades table has an attribute student_id which references student_id in the student table.
How do I find which student has every grade that exists?
In case this is not clear, here is some sample data:
Student
   ID  Name
1  1   John
2  2   Paul
3  3   George
4  4   Mike
5  5   Lisa
Grade
   Student_Id  Course     Grade
1  1           Math       A
2  1           English    B
3  1           Physics    C
4  2           Math       A
5  2           English    A
6  2           Physics    B
7  3           Economics  A
8  4           Art        C
9  5           Biology    A
Assume the only grades are A, B and C (no D, E or fail).
I want to find only John, because he has grades A, B and C, while
other students like Paul (2) should not be selected, because he does not have grade C. It does not matter which courses a student took; I just need to find out whether they have every available grade.
I feel like I should use something like EXISTS or ALL in SQL, but I'm not sure.
Please help. Thank you in advance.
I would use GROUP BY and HAVING, but like this:
SELECT s.Name
FROM Student s JOIN
Grade g
ON s.ID = g.Student_Id
GROUP BY s.id, s.Name
HAVING COUNT(DISTINCT g.Grade) = (SELECT COUNT(DISTINCT g2.grade) FROM grade g2);
You say "all the grades out there", so the query should not use a constant for that.
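To illustrate the point about not hard-coding the constant, here is a small sketch that runs the query above on the question's data in SQLite via Python's `sqlite3` module, then adds a hypothetical grade 'D' row and runs it again. The helper name `students_with_all_grades` and the extra Chemistry row are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Student (ID INTEGER PRIMARY KEY, Name TEXT);
CREATE TABLE Grade (
    Student_Id INTEGER REFERENCES Student(ID),
    Course TEXT,
    Grade TEXT
);
INSERT INTO Student VALUES
    (1, 'John'), (2, 'Paul'), (3, 'George'), (4, 'Mike'), (5, 'Lisa');
INSERT INTO Grade VALUES
    (1, 'Math', 'A'), (1, 'English', 'B'), (1, 'Physics', 'C'),
    (2, 'Math', 'A'), (2, 'English', 'A'), (2, 'Physics', 'B'),
    (3, 'Economics', 'A'), (4, 'Art', 'C'), (5, 'Biology', 'A');
""")

QUERY = """
SELECT s.Name
FROM Student s
JOIN Grade g ON s.ID = g.Student_Id
GROUP BY s.ID, s.Name
HAVING COUNT(DISTINCT g.Grade) = (SELECT COUNT(DISTINCT g2.Grade) FROM Grade g2)
"""

def students_with_all_grades(connection):
    """Return the names of students holding every grade present in the table."""
    return [row[0] for row in connection.execute(QUERY)]

before = students_with_all_grades(conn)

# Hypothetical extra row: once a 'D' exists, nobody holds all four grades.
conn.execute("INSERT INTO Grade VALUES (5, 'Chemistry', 'D')")
after = students_with_all_grades(conn)

print(before, after)
```

With the original data only John qualifies. After the 'D' is added the subquery counts four grades and the result is empty, whereas a hard-coded 3 would still return John.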
You can use HAVING COUNT(DISTINCT Grade) = 3 to check that the student has all 3 grades:
SELECT Name
FROM Student S
JOIN Grade G ON S.ID = G.Student_Id
GROUP BY Name
HAVING COUNT(DISTINCT Grade) = 3
Guessing at S.ID vs S.Student on the join. Not sure what the difference is there.
By using exists
select * from student s
where exists ( select 1
from grades g where g.Student_Id=s.ID
group by g.Student_Id
having count(distinct Grade)=3
)
Example
with Student as
(
select 1 as id,'John' as person
union all
select 2 as id,'Paul' as person
union all
select 3 as id,'jorge'
),
Grades as
(
select 1 as Graden, 1 as Student_Id, 'Math' as Course, 'A' as Grade
union all
select 2 as Graden, 1 as Student_Id, 'English' as Course, 'B' as Grade
union all
select 3 as Graden, 1 as Student_Id, 'Physics' as Course, 'C' as Grade
union all
select 4 as Graden, 2 as Student_Id, 'Math' as Course, 'A' as Grade
union all
select 5 as Graden, 2 as Student_Id, 'English' as Course, 'A' as Grade
union all
select 6 as Graden, 2 as Student_Id, 'Physics' as Course, 'B' as Grade
)
select * from Student s
where exists ( select 1
from Grades g where g.Student_Id=s.ID
group by g.Student_Id
having count(distinct Grade)=3
)
Note: I used having count(distinct Grade)=3 because your sample data contains 3 distinct grades.
Before delving into the answer, here's a working SQL Fiddle Example so you can see this in action.
As Gordon Linoff points out in his excellent answer, you should use GroupBy and Having Count(Distinct ... ) ... as an easy way to check.
However, I'd recommend changing your design to ensure that you have tables for each concern.
Currently your Grade table holds each student's grade per course. So it's more of a StudentCourse table (i.e. it's the combination of student and course that's unique / gives you that table's natural key). You should have an actual Grade table to give you the list of available grades; e.g.
create table Grade
(
Code char(1) not null constraint PK_Grade primary key clustered
)
insert Grade (Code) values ('A'),('B'),('C')
This ensures that your query still works if you later decide to include grades D and E, without having to amend any code. It also means you only have to query a small table to get the complete list of grades, rather than a potentially huge one, which should give better performance. Finally, it helps you maintain good data: because the validation/constraints exist in the database, you can't accidentally end up with a student with grade X due to a typo.
select Name from Student s
where s.Id in
(
select sc.StudentId
from StudentCourse sc
group by sc.StudentId
having count(distinct sc.Grade) = (select count(Code) from Grade)
)
order by s.Name
Likewise, it's sensible to create a Course table. In this case holding Ids for each course; since holding the full course name in your StudentCourse table (as we're now calling it) uses up a lot more space and again lacks validation / constraints. As such, I'd propose amending your database schema to look like this:
create table Grade
(
Code char(1) not null constraint PK_Grade primary key clustered
)
insert Grade (Code) values ('A'),('B'),('C')
create table Course
(
Id bigint not null identity(1,1) constraint PK_Course primary key clustered
, Name nvarchar(128) not null constraint UK_Course_Name unique
)
insert Course (Name) values ('Math'),('English'),('Physics'),('Economics'),('Art'),('Biology')
create table Student
(
Id bigint not null identity(1,1) constraint PK_Student primary key clustered
,Name nvarchar(128) not null constraint UK_Student_Name unique
)
set identity_insert Student on --inserting with IDs to ensure the ids of these students match data from your question
insert Student (Id, Name)
values (1, 'John')
, (2, 'Paul')
, (3, 'George')
, (4, 'Mike')
, (5, 'Lisa')
set identity_insert Student off
create table StudentCourse
(
Id bigint not null identity(1,1) constraint PK_StudentCourse primary key
, StudentId bigint not null constraint FK_StudentCourse_StudentId foreign key references Student(Id)
, CourseId bigint not null constraint FK_StudentCourse_CourseId foreign key references Course(Id)
, Grade char /* allow null in case we use this table for pre-results; otherwise make non-null */ constraint FK_StudentCourse_Grade foreign key references Grade(Code)
, Constraint UK_StudentCourse_StudentAndCourse unique clustered (StudentId, CourseId)
)
insert StudentCourse (StudentId, CourseId, Grade)
select s.Id, c.Id, x.Grade
from (values
('John', 'Math', 'A')
,('John', 'English', 'B')
,('John', 'Physics', 'C')
,('Paul', 'Math', 'A')
,('Paul', 'English', 'A')
,('Paul', 'Physics', 'B')
,('George', 'Economics','A')
,('Mike', 'Art', 'C')
,('Lisa', 'Biology', 'A')
) x(Student, Course, Grade)
inner join Student s on s.Name = x.Student
inner join Course c on c.Name = x.Course