I have four tables with the following structure:
I am trying to construct a query which will output offerings which have an attendance below the average attendance for offerings of the course to which they belong. I have constructed two queries so far:
This outputs the total number of attendees for each course
This outputs the total number of offerings for each course.
I think what I need to do is divide the results of the first query by the results of the second query (which will give me the average attendance per offering for each course) and then output only the offerings with attendance below that result. I am really struggling to build this query, so I am looking for some help.
Any help is much appreciated as always
One way to do it is to first find the number of attendees for each offering, then from this result find the average attendance for each course, join the average attendances to each related offering, and then select the ones where the actual attendance is lower than the average.
This can be done using a CTE:
WITH attendee_counts AS
(SELECT c.course_id, o.offering_id,
COUNT (Student_id) AS attendees -- find attendance
FROM course c
INNER JOIN offering o
ON o.course_id = c.course_id
LEFT JOIN attendance a
ON a.offering_id = o.offering_id
GROUP BY c.course_id, o.offering_id) -- for each offering
SELECT ac.course_id, ac.offering_id,
ac.attendees, avgs.avg_attendees
FROM attendee_counts AS ac
INNER JOIN
(SELECT course_id, AVG(attendees) AS avg_attendees -- then average
FROM attendee_counts
GROUP BY course_id) AS avgs -- by course
ON avgs.course_id = ac.course_id
WHERE ac.attendees < avgs.avg_attendees;
The query (that works in PostgreSQL) can be tested here: http://www.sqlfiddle.com/#!1/f5b60/20/0
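For anyone without access to the fiddle, here is a minimal sketch that runs the same CTE query against an in-memory SQLite database from Python. The table definitions and sample rows are assumptions made up for illustration (the question's real schema is not shown), and SQLite stands in for PostgreSQL here.

```python
import sqlite3

# Hypothetical minimal schema and data; the real table definitions are not shown.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE course     (course_id INTEGER PRIMARY KEY);
    CREATE TABLE offering   (offering_id INTEGER PRIMARY KEY, course_id INTEGER);
    CREATE TABLE attendance (offering_id INTEGER, student_id INTEGER);
    INSERT INTO course VALUES (1), (2);
    INSERT INTO offering VALUES (10, 1), (11, 1), (20, 2);
    -- offering 10 has 3 attendees, offering 11 has 1, offering 20 has 2
    INSERT INTO attendance VALUES (10, 1), (10, 2), (10, 3), (11, 1), (20, 4), (20, 5);
""")

below_average = conn.execute("""
    WITH attendee_counts AS (
        SELECT c.course_id, o.offering_id, COUNT(a.student_id) AS attendees
        FROM course c
        INNER JOIN offering o ON o.course_id = c.course_id
        LEFT JOIN attendance a ON a.offering_id = o.offering_id
        GROUP BY c.course_id, o.offering_id        -- attendance per offering
    )
    SELECT ac.course_id, ac.offering_id, ac.attendees, avgs.avg_attendees
    FROM attendee_counts AS ac
    INNER JOIN (SELECT course_id, AVG(attendees) AS avg_attendees
                FROM attendee_counts
                GROUP BY course_id) AS avgs        -- average per course
            ON avgs.course_id = ac.course_id
    WHERE ac.attendees < avgs.avg_attendees
    ORDER BY ac.offering_id
""").fetchall()

print(below_average)
```

With this sample data course 1 averages 2 attendees per offering, so only offering 11 (1 attendee) is reported; course 2 has a single offering, which can never be below its own average.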
Edit:
Oracle seems to require a slightly different solution:
WITH attendee_counts AS
(SELECT c.course_id, o.offering_id,
COUNT (Student_id) AS attendees
FROM course c
INNER JOIN offering o ON o.course_id = c.course_id
LEFT JOIN attendance a ON a.offering_id = o.offering_id
GROUP BY c.course_id, o.offering_id)
SELECT o.course_id, o.offering_id, o.attendees,
avg(c.attendees) AS avg_attendees
FROM attendee_counts o -- connect attendance by offering
LEFT JOIN attendee_counts c
ON c.course_id = o.course_id -- to each offering of the same course
GROUP BY o.course_id, o.offering_id, o.attendees
HAVING o.attendees < avg(c.attendees);
This can be tested here http://www.sqlfiddle.com/#!4/e50e4/4/0 (for Oracle 11g R2)
I am trying to get information for each student in a database. I know that there are exactly 4 students and that between all students, there are 6 enrollments (i.e. some students are enrolled in multiple courses). Therefore, the proper output would have 6 rows, all containing the necessary student info. There would be duplicate students in the returned query. I am able to join the students and the enrollments just fine and end up with the 6 total enrollments. However, once I join in the other tables to get data about the courses that the students are enrolled in, I end up getting more and more rows. Depending on how I format my query, I get between 7 and 11 rows. All that I want is the 6 rows that correspond to the enrollments and nothing more. Why does this happen and how do I fix it?
I have tried different kinds of joins, unions, intersections, and have been working at the question for well over an hour. This is what I have currently:
Select s.sid, e.term, c.cno, e.secno, ca.ctitle
from Students as s
join Enrolls as e
on s.sid = e.sid
join Courses as c
on e.secno = c.secno
join Catalogue as ca
on ca.cno = c.cno
It looks like the Courses and Enrolls tables have what we call a composite key. I suspect you must join the c and e tables on both the term and secno columns.
Your query should look like this:
SELECT s.sid, e.term, c.cno, e.secno, ca.ctitle
FROM Students AS s
JOIN Enrolls AS e ON s.sid = e.sid
JOIN Courses AS c ON e.secno = c.secno AND e.term = c.term
JOIN Catalogue AS ca ON ca.cno = c.cno
When you have a composite key and use only one of its columns in the join, each row also matches unwanted rows from the other table, producing a partial Cartesian product.
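The effect is easy to reproduce. Below is a small sketch using Python's sqlite3 module; the two tables are cut down to the columns that matter and the rows are invented, so treat the schema as an assumption rather than the asker's actual database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical reduced tables: Courses is keyed by (term, secno).
    CREATE TABLE Enrolls (sid INTEGER, term TEXT, secno INTEGER);
    CREATE TABLE Courses (term TEXT, secno INTEGER, cno TEXT,
                          PRIMARY KEY (term, secno));
    -- Section number 1 is reused in two different terms.
    INSERT INTO Courses VALUES ('f12', 1, 'csc226'), ('s13', 1, 'csc227');
    INSERT INTO Enrolls VALUES (100, 'f12', 1), (200, 's13', 1);
""")

# Joining on only half of the key: every enrollment matches both sections.
partial_key = conn.execute("""
    SELECT e.sid, e.term, c.cno
    FROM Enrolls e
    JOIN Courses c ON e.secno = c.secno
""").fetchall()

# Joining on the full key: one row per enrollment.
full_key = conn.execute("""
    SELECT e.sid, e.term, c.cno
    FROM Enrolls e
    JOIN Courses c ON e.secno = c.secno AND e.term = c.term
    ORDER BY e.sid
""").fetchall()

print(len(partial_key), len(full_key))
```

Two enrollments turn into four rows with the partial join and stay at two with the full key, which is the same kind of inflation as the 7 to 11 rows in the question.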
I am starting with SQL and I have a problem grouping 2 columns. I need the counting to be filtered by another column, but I obtain the count of all the rows as if they weren't grouped.
I have 4 different tables with countries, cities, matches and stadiums. I have to obtain names and codes of countries and cities as well as the number of referees that worked in each city and the total number of goals made in every city. I have done this, but I obtain the total number of referees that have worked in the tournament and the total number of goals made during the tournament:
SELECT ci.city_code,
ci.city_name,
co.country_code,
co.country_name,
COUNT(DISTINCT referee_code),
SUM(home_goals+visitor_goals)
FROM euro2021.tb_city AS ci
INNER JOIN euro2021.tb_country AS co ON ci.country_code=co.country_code
INNER JOIN euro2021.tb_match AS m ON ci.city_name=ci.city_name
INNER JOIN euro2021.tb_stadium AS st on st.stadium_code=m.stadium_code
GROUP BY ci.city_code, ci.city_name, co.country_code, co.country_name
Do you have any ideas or do you know if this was answered previously? Thanks in advance.
I think the problem is all in your JOIN clauses: the condition ci.city_name=ci.city_name is always true, so every match is joined to every city. I believe you want this:
SELECT ci.city_code,
ci.city_name,
co.country_code,
co.country_name,
COUNT(DISTINCT m.referee_code),
SUM(m.home_goals+m.visitor_goals)
FROM euro2021.tb_match AS m
INNER JOIN euro2021.tb_stadium AS st ON st.stadium_code=m.stadium_code
INNER JOIN euro2021.tb_city AS ci ON ci.city_code=st.city_code
INNER JOIN euro2021.tb_country AS co ON ci.country_code=co.country_code
GROUP BY ci.city_code, ci.city_name, co.country_code, co.country_name;
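To see why the join conditions matter, here is a rough sketch in Python with sqlite3 that compares a join condition that is always true with the match -> stadium -> city chain. The tables are reduced to a few columns, the euro2021 schema prefix is dropped and the rows are made up, so all of it is assumed sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical cut-down versions of the tournament tables.
    CREATE TABLE tb_city    (city_code TEXT, city_name TEXT);
    CREATE TABLE tb_stadium (stadium_code TEXT, city_code TEXT);
    CREATE TABLE tb_match   (stadium_code TEXT, referee_code TEXT,
                             home_goals INTEGER, visitor_goals INTEGER);
    INSERT INTO tb_city VALUES ('ROM', 'Rome'), ('LON', 'London');
    INSERT INTO tb_stadium VALUES ('S1', 'ROM'), ('S2', 'LON');
    INSERT INTO tb_match VALUES ('S1', 'R1', 3, 0), ('S2', 'R2', 1, 1), ('S2', 'R3', 2, 0);
""")

# A condition that compares a column with itself pairs every city with every match.
always_true = conn.execute("""
    SELECT ci.city_name, COUNT(DISTINCT m.referee_code),
           SUM(m.home_goals + m.visitor_goals)
    FROM tb_city ci
    JOIN tb_match m ON ci.city_name = ci.city_name
    GROUP BY ci.city_name
    ORDER BY ci.city_name
""").fetchall()

# Following the real relationships gives per-city figures.
chained = conn.execute("""
    SELECT ci.city_name, COUNT(DISTINCT m.referee_code),
           SUM(m.home_goals + m.visitor_goals)
    FROM tb_match m
    JOIN tb_stadium st ON st.stadium_code = m.stadium_code
    JOIN tb_city ci ON ci.city_code = st.city_code
    GROUP BY ci.city_name
    ORDER BY ci.city_name
""").fetchall()

print(always_true)
print(chained)
```

The first query reports the tournament-wide totals for every city, which is the symptom described in the question; the second reports each city's own referees and goals.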
I'm trying to find the students that have failed every subject in a set of subjects via PostgreSQL queries.
Students fail a subject if they have a not null mark < 50 for at least one course offering of the subject. And I want to find the students that have failed all subjects in the set of subjects Relevant_subjects.
NOTE: students can have several records per course.
SELECT People.name
FROM
Relevant_subjects
JOIN Courses on (Courses.subject = Relevant_subjects.id)
JOIN Course_enrolments on (Course_enrolments.course = Courses.id)
JOIN Students on (Students.id = Course_enrolments.student)
JOIN People on (People.id = Students.id)
WHERE
Course_enrolments.mark is not null AND
Course_enrolments.mark < 50
;
With the code above, I get the students that have failed any of the Relevant_subjects, but my desired result is to get the students that have failed all Relevant_subjects. How can I do that?
Students fail a subject if they have a not null mark < 50 for at least one course offering of the subject.
One of many possible ways:
SELECT id, p.name
FROM (
SELECT s.id
FROM students s
CROSS JOIN relevant_subjects rs
GROUP BY s.id
HAVING bool_and( EXISTS(
SELECT -- empty list
FROM course_enrolments ce
JOIN courses c ON c.id = ce.course
WHERE ce.mark < 50 -- also implies NOT NULL
AND ce.student = s.id
AND c.subject = rs.id
)
) -- all failed
) sub
JOIN people p USING (id);
Form a Cartesian product of students and relevant subjects.
Aggregate by student (s.id) and filter those who failed all subjects in the HAVING clause with bool_and() over a correlated EXISTS subquery testing for at least one such failed course for each student-subject combination.
Join to people as final cosmetic step to get student names. I added id to get unique results (as names probably are not guaranteed to be unique).
Depending on actual table definition, your version of Postgres, cardinalities and value distribution, there may be (much) more efficient queries.
It's a case of relational-division at its core. See:
How to filter SQL results in a has-many-through relation
And the most efficient strategy is to eliminate as many students as possible as early in the query as possible - like by checking the subject with the fewest failing students first. Then proceed with only the remaining students etc.
Your case adds the specific difficulty that the number and identities of subjects to be tested are unknown / dynamic. Typically, a recursive CTE or similar offers best performance for this kind of problem:
SQL query to find a row with a specific number of associations
I would use aggregation:
SELECT p.name
FROM Relevant_subjects rs JOIN
Courses c
ON c.subject = rs.id JOIN
Course_enrolments ce
ON ce.course = c.id JOIN
Students s
ON s.id = ce.student JOIN
People p
ON p.id = s.id
WHERE ce.mark < 50
GROUP BY p.id, p.name
HAVING COUNT(*) = (SELECT COUNT(*) FROM relevant_subjects);
Note: This version assumes that students only have one record per course and relevant_subjects has no duplicates. These can easily be handled using COUNT(DISTINCT) if necessary.
To handle duplicates, this would look like:
SELECT p.name
FROM Relevant_subjects rs JOIN
Courses c
ON c.subject = rs.id JOIN
Course_enrolments ce
ON ce.course = c.id JOIN
Students s
ON s.id = ce.student JOIN
People p
ON p.id = s.id
WHERE ce.mark < 50
GROUP BY p.id, p.name
HAVING COUNT(DISTINCT rs.id) = (SELECT COUNT(DISTINCT rs2.id) FROM relevant_subjects rs2);
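The difference between the two HAVING clauses shows up as soon as a student fails the same subject more than once. Here is a small sketch in Python with sqlite3 (SQLite standing in for Postgres); the tables are reduced, the Students table is skipped because it shares its id with People, and the rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical reduced schema and data.
    CREATE TABLE relevant_subjects (id INTEGER);
    CREATE TABLE courses (id INTEGER, subject INTEGER);
    CREATE TABLE course_enrolments (student INTEGER, course INTEGER, mark INTEGER);
    CREATE TABLE people (id INTEGER, name TEXT);
    INSERT INTO relevant_subjects VALUES (1), (2);
    INSERT INTO courses VALUES (10, 1), (11, 1), (20, 2);   -- two offerings of subject 1
    INSERT INTO people VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
    INSERT INTO course_enrolments VALUES
        (1, 10, 40), (1, 11, 30), (1, 20, 45),   -- Ann fails subject 1 twice and subject 2
        (2, 10, 20), (2, 20, 80),                -- Bob fails only subject 1
        (3, 10, 10), (3, 11, 20), (3, 20, NULL); -- Cy fails subject 1 twice, no mark in subject 2
""")

query = """
    SELECT p.name
    FROM relevant_subjects rs
    JOIN courses c ON c.subject = rs.id
    JOIN course_enrolments ce ON ce.course = c.id
    JOIN people p ON p.id = ce.student
    WHERE ce.mark < 50
    GROUP BY p.id, p.name
    HAVING {count_expr} = (SELECT COUNT(DISTINCT id) FROM relevant_subjects)
    ORDER BY p.name
"""

count_star = [r[0] for r in conn.execute(query.format(count_expr="COUNT(*)"))]
count_distinct = [r[0] for r in conn.execute(query.format(count_expr="COUNT(DISTINCT rs.id)"))]

print(count_star, count_distinct)
```

With COUNT(*) the query counts failed enrolment rows, so Cy (two fails in one subject) is wrongly returned and Ann (three fails across both subjects) is missed; counting distinct subjects returns only Ann.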
I am new to Microsoft Access. Please accept my apologies if my question seems trivial.
I am trying to write a query in Access that shows the total number of students enrolled in courses on a monthly basis. I have two tables named course and confirmed_enrollments.
The course table has only one field named course_name whereas the confirmed_enrolments table has three fields; student_code, course_name, and month_of_enrol.
I want to show every course_name (whether students enrolled in it or not) in my query together with its total enrollments for the particular month. The query I have written only shows the courses that have enrollments and leaves out the courses which do not have any.
Looking for your help. Here is my SQL code:
SELECT Course.Course_name,
Count(confirmed_enrolments.Student_code) AS CountOfStudent_code,
confirmed_enrolments.Month_of_enrol
FROM Course
LEFT JOIN confirmed_enrolments
ON Course.Course_name = confirmed_enrolments.Course_name
GROUP BY Course.Course_name, confirmed_enrolments.Month_of_enrol
HAVING confirmed_enrolments.Month_of_enrol="December 2016";
I think you intend this:
SELECT c.Course_name, Count(ce.Student_code) AS CountOfStudent_code,
ce.Month_of_enrol
FROM Course as c LEFT JOIN
(SELECT ce.*
FROM confirmed_enrolments as ce
WHERE ce.Month_of_enrol = "December 2016"
) as ce
ON c.Course_name = ce.Course_name
GROUP BY c.Course_name, ce.Month_of_enrol;
The use of the HAVING clause in your query is clever, but you are filtering by the second table -- and the value is NULL rather than the specified month -- when there are no matches.
The issue is that the condition is on the second table. In a normal SQL dialect, you would write:
SELECT c.Course_name, Count(ce.Student_code) AS CountOfStudent_code,
ce.Month_of_enrol
FROM Course as c LEFT JOIN
confirmed_enrolments as ce
ON c.Course_name = ce.Course_name AND
ce.Month_of_enrol = "December 2016"
GROUP BY c.Course_name, ce.Month_of_enrol;
But MS Access doesn't seem to allow the second comparison in a LEFT JOIN.
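To make the difference concrete, here is a minimal sketch in Python with sqlite3, so it shows the behaviour of a standard dialect rather than Access itself. The sample rows are invented, and single-quoted string literals are used because that is what SQLite expects.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical sample data for the two tables in the question.
    CREATE TABLE course (course_name TEXT);
    CREATE TABLE confirmed_enrolments (student_code TEXT, course_name TEXT,
                                       month_of_enrol TEXT);
    INSERT INTO course VALUES ('Art'), ('Math'), ('Music');
    INSERT INTO confirmed_enrolments VALUES
        ('s1', 'Math', 'December 2016'),
        ('s2', 'Math', 'December 2016'),
        ('s3', 'Art',  'November 2016');
""")

# Filtering after the join: rows with a NULL or different month are discarded.
filter_in_having = conn.execute("""
    SELECT c.course_name, COUNT(ce.student_code)
    FROM course c
    LEFT JOIN confirmed_enrolments ce ON c.course_name = ce.course_name
    GROUP BY c.course_name, ce.month_of_enrol
    HAVING ce.month_of_enrol = 'December 2016'
    ORDER BY c.course_name
""").fetchall()

# Filtering inside the join condition: unmatched courses survive with a count of 0.
filter_in_on = conn.execute("""
    SELECT c.course_name, COUNT(ce.student_code)
    FROM course c
    LEFT JOIN confirmed_enrolments ce
           ON c.course_name = ce.course_name
          AND ce.month_of_enrol = 'December 2016'
    GROUP BY c.course_name, ce.month_of_enrol
    ORDER BY c.course_name
""").fetchall()

print(filter_in_having)
print(filter_in_on)
```

The first result contains only Math, while the second lists all three courses, with Art and Music at zero enrollments for the month.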
By the way, another way to write the query uses a correlated subquery:
SELECT c.Course_name,
(SELECT Count(*)
FROM confirmed_enrolments as ce
WHERE ce.Course_name = c.Course_name AND
ce.Month_of_enrol = "December 2016"
) AS CountOfStudent_code,
"December 2016" as Month_of_enrol
FROM Course as c;
Suppose I have 3 tables.
Sales Rep: Rep Code, First Name, Last Name, Phone, Email, Sales Team
Orders: Order Number, Rep Code, Customer Number, Order Date, Order Status
Customer: Customer Number, Name, Address, Phone Number
I want to get a detailed report of Sales for 2010. I would be doing a join. I am interested in knowing which of the following is more efficient and why?
SELECT
O.OrderNum, R.Name, C.Name
FROM
Order O INNER JOIN Rep R ON O.RepCode = R.RepCode
INNER JOIN Customer C ON O.CustomerNumber = C.CustomerNumber
WHERE
O.OrderDate >= '01/01/2010'
OR
SELECT
O.OrderNum, R.Name, C.Name
FROM
Order O INNER JOIN Rep R ON (O.RepCode = R.RepCode AND O.OrderDate >= '01/01/2010')
INNER JOIN Customer C ON O.CustomerNumber = C.CustomerNumber
JOINs must reflect the relationships between your tables. The WHERE clause is the place where you filter records. I prefer the first one.
Make it readable first: table relationships should be obvious (by using JOINs). Then profile.
Efficiency-wise, the only way to know is to profile it; different databases have different planners for executing the query.
Some databases might apply the filter first and then do the join; others might join the tables first and apply the WHERE clause afterwards. Try to profile: on Postgres and MySQL use EXPLAIN SELECT ...; in SQL Server display the execution plan (Ctrl+L in Management Studio shows the estimated plan). With SQL Server you can see which of the two queries is faster relative to the other.
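As a rough illustration of that kind of profiling, the sketch below uses Python's sqlite3 module and SQLite's EXPLAIN QUERY PLAN to print the plan for both variants. The table is named Orders (ORDER is a reserved word), the dates are stored as ISO strings, and the rows are made up; all of that is an assumption for the sake of a runnable example, and other engines will print their plans differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical reduced tables with a couple of sample rows.
    CREATE TABLE Rep      (RepCode TEXT PRIMARY KEY, Name TEXT);
    CREATE TABLE Customer (CustomerNumber TEXT PRIMARY KEY, Name TEXT);
    CREATE TABLE Orders   (OrderNum INTEGER PRIMARY KEY, RepCode TEXT,
                           CustomerNumber TEXT, OrderDate TEXT);
    INSERT INTO Rep VALUES ('r1', 'Ann');
    INSERT INTO Customer VALUES ('c1', 'Acme');
    INSERT INTO Orders VALUES (1, 'r1', 'c1', '2009-12-31'),
                              (2, 'r1', 'c1', '2010-03-05');
""")

filter_in_where = """
    SELECT O.OrderNum, R.Name, C.Name
    FROM Orders O
    JOIN Rep R ON O.RepCode = R.RepCode
    JOIN Customer C ON O.CustomerNumber = C.CustomerNumber
    WHERE O.OrderDate >= '2010-01-01'
"""
filter_in_on = """
    SELECT O.OrderNum, R.Name, C.Name
    FROM Orders O
    JOIN Rep R ON O.RepCode = R.RepCode AND O.OrderDate >= '2010-01-01'
    JOIN Customer C ON O.CustomerNumber = C.CustomerNumber
"""

# For inner joins both forms return the same rows.
rows_where = conn.execute(filter_in_where).fetchall()
rows_on = conn.execute(filter_in_on).fetchall()

# Print the planner's steps for each variant so they can be compared by eye.
for label, sql in (("WHERE", filter_in_where), ("ON", filter_in_on)):
    print(label)
    for step in conn.execute("EXPLAIN QUERY PLAN " + sql):
        print("   ", step[-1])   # last column holds the human-readable detail
```

Both queries return the same single 2010 order here; whether the plans differ is exactly what the printed output lets you check on your own data and engine.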