How to join Views with aggregate functions? - sql

My problem:
In #4, I'm having trouble joining two Views because the other has an aggregate function. Same with #5
Question:
Create a view named studentDetails that shows the student name, enrollment date, total price per unit, and subject description of students who are enrolled in the subject Science or History.
Create a view named BiggestPrice that shows the subject id and highest total price per unit of all the subjects. The view should show only the highest total prices per unit that are greater than 1000.
--4.) Create a view name it as studentDetails, that would should show the student name,
-- enrollment date the total price per unit and subject description of students who are
-- enrolled on the subject Science or History.
CREATE VIEW StudentDetails AS
SELECT StudName, EnrollmentDate
--5.) Create a view, name it as BiggestPrice, that will show the subject id and highest total
-- price per unit of all the subjects. The view should show only the highest total price per unit
-- that are greater than 1000.
CREATE VIEW BiggestPrice AS
SELECT SubjId, SUM(Max(Priceperunit)) FROM Student, Subject
GROUP BY Priceperunit
Here is my table:
CREATE TABLE Student(
StudentId char(5) not null,
StudName varchar2(50) not null,
Age NUMBER(3,0),
CONSTRAINT Student_StudentId PRIMARY KEY (StudentId)
);
CREATE table Enrollment(
EnrollmentId varchar2(10) not null,
EnrollmentDate date not null,
StudentId char(5) not null,
SubjId Number(5) not null,
constraint Enrollment_EnrollmentId primary key (EnrollmentId),
constraint Enrollment_StudentId_FK foreign key (StudentId) references Student(StudentId),
constraint Enrollment_SubjId_Fk foreign key (SubjId) references Subject(SubjId)
);
Create table Subject(
SubjId number(5,0) not null,
SubjDescription varchar2(200) not null,
Units number(3,0) not null,
Priceperunit number(9,0) not null,
Constraint Subject_SubjId_PK primary key (SubjId)
);

Since this appears to be a homework question, here are some pointers rather than a complete solution.
You need to use JOINs. Your current query:
CREATE VIEW StudentDetails AS
SELECT StudName, EnrollmentDate
does not have a FROM clause. The query you have for question 5 uses the legacy comma join syntax with no WHERE filter; this is the same as a CROSS JOIN, so it connects every student to every subject, which is not what you want.
Instead of the legacy comma join syntax, use ANSI joins and explicitly state the join condition:
SELECT <expression list>
FROM student s
INNER JOIN enrollment e ON ...
INNER JOIN subject j ON ...
Then you can fill in the ... based on the relationships between the tables (typically the primary key of one table = the foreign key of another table).
Then for the <expression list> you need to include the columns asked for in the question: student name and enrolment date and subject name would just be those columns from the appropriate tables; and total price-per-unit (which I assume is actually total-price-per-subject) would be a calculation.
Then, for the last part of question 4, "who are enrolled on the subject Science or History", add a WHERE filter to only include rows for those subjects.
For question 5, you do not need any JOINs, as the question only asks about details in the SUBJECT table.
You need to add a WHERE filter to show "only the highest total price per unit that are greater than 1000". The total is a simple multiplication, and you can then filter by checking whether it is > 1000.
Then you need to limit the query to return only the row with the "highest total price per unit of all the subjects". From Oracle 12c onward, this can be done with an ORDER BY clause in descending order of total price followed by FETCH FIRST ROW ONLY or FETCH FIRST ROW WITH TIES.
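To make the pointers above concrete, here is a minimal sketch that runs the same ideas against an in-memory SQLite database from Python. It is not Oracle: the column types are simplified, the sample rows are invented, LIMIT 1 stands in for FETCH FIRST ROW ONLY, and "total price" is assumed to mean Units * Priceperunit, as guessed above.

```python
import sqlite3

# In-memory SQLite database; the sample rows are invented for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Student (StudentId TEXT PRIMARY KEY, StudName TEXT NOT NULL);
CREATE TABLE Subject (SubjId INTEGER PRIMARY KEY, SubjDescription TEXT NOT NULL,
                      Units INTEGER NOT NULL, Priceperunit INTEGER NOT NULL);
CREATE TABLE Enrollment (EnrollmentId TEXT PRIMARY KEY, EnrollmentDate TEXT NOT NULL,
                         StudentId TEXT NOT NULL REFERENCES Student(StudentId),
                         SubjId INTEGER NOT NULL REFERENCES Subject(SubjId));

INSERT INTO Student VALUES ('S0001', 'Ana'), ('S0002', 'Ben');
INSERT INTO Subject VALUES (1, 'Science', 3, 500), (2, 'History', 2, 400), (3, 'Math', 4, 600);
INSERT INTO Enrollment VALUES ('E1', '2021-01-10', 'S0001', 1),
                              ('E2', '2021-01-11', 'S0002', 3),
                              ('E3', '2021-01-12', 'S0002', 2);

-- Question 4: explicit joins plus a filter on the subject description.
CREATE VIEW StudentDetails AS
SELECT s.StudName, e.EnrollmentDate, j.Units * j.Priceperunit AS TotalPrice, j.SubjDescription
FROM Student s
INNER JOIN Enrollment e ON e.StudentId = s.StudentId
INNER JOIN Subject j ON j.SubjId = e.SubjId
WHERE j.SubjDescription IN ('Science', 'History');

-- Question 5: no joins; filter on the computed total, then keep the top row.
-- LIMIT 1 is used here in place of Oracle's FETCH FIRST ROW ONLY.
CREATE VIEW BiggestPrice AS
SELECT SubjId, Units * Priceperunit AS TotalPrice
FROM Subject
WHERE Units * Priceperunit > 1000
ORDER BY TotalPrice DESC
LIMIT 1;
""")

student_details = conn.execute("SELECT * FROM StudentDetails ORDER BY StudName").fetchall()
biggest_price = conn.execute("SELECT * FROM BiggestPrice").fetchall()
print(student_details)
print(biggest_price)
```

With this sample data, only the Science and History enrollments appear in StudentDetails, and BiggestPrice returns the single subject with the largest total above 1000.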

I'm not sure I fully understand the question, but I think this is what you need.
Notes:
Filter records by ID where you can (this assumes Science and History have SubjId 1 and 2):
where su.SubjId in (1,2)
You can find the maximum with max() in a subquery and correlate it with the main query like this:
where su2.SubjId = su.SubjId
You cannot reference a column alias in the WHERE clause, so repeat the expression instead:
( su.Units * su.Priceperunit ) > 1000
CREATE VIEW StudentDetails AS
select s.StudName,
e.EnrollmentDate,
su.SubjDescription,
su.Units * su.Priceperunit TotalPrice
from student s
inner join Enrollment e
on e.StudentId = s.StudentId
inner join Subject su
on su.SubjId = e.SubjId
where su.SubjId in (1,2);
CREATE VIEW BiggestPrice AS
select su.SubjId, ( su.Units * su.Priceperunit ) TotalPrice
from Subject su
where ( su.Units * su.Priceperunit ) =
(
select max(su2.Units * su2.Priceperunit)
from Subject su2
where su2.SubjId = su.SubjId
)
and ( su.Units * su.Priceperunit ) > 1000;

Related

Update and renew data based on data in other tables

There are 3 tables, student, takes, and course, defined as follows:
CREATE TABLE student
(
ID varchar(5),
name varchar(20) NOT NULL,
dept_name varchar(20),
tot_cred numeric(3,0) CHECK (tot_cred >= 0),
PRIMARY KEY (ID),
FOREIGN KEY (dept_name) REFERENCES department
ON DELETE SET NULL
)
CREATE TABLE takes
(
ID varchar(5),
course_id varchar(8),
sec_id varchar(8),
semester varchar(6),
year numeric(4,0),
grade varchar(2),
PRIMARY KEY (ID, course_id, sec_id, semester, year),
FOREIGN KEY (course_id, sec_id, semester, year) REFERENCES section
ON DELETE CASCADE,
FOREIGN KEY (ID) REFERENCES student
ON DELETE CASCADE
)
CREATE TABLE course
(
course_id varchar(8),
title varchar(50),
dept_name varchar(20),
credits numeric(2,0) CHECK (credits > 0),
PRIMARY KEY (course_id),
FOREIGN KEY (dept_name) REFERENCES department
ON DELETE SET NULL
)
The tot_cred column in the student table currently holds random (incorrect) values. I want a query that recalculates it from the credits of the courses each student has taken. Courses in which the student received an F grade should be excluded, and students who didn't take any course should be assigned 0 as tot_cred.
I came up with two approaches. The first is:
UPDATE student
SET tot_cred = (SELECT SUM(credits)
FROM takes, course
WHERE takes.course_id = course.course_id
AND student.ID = takes.ID
AND takes.grade <> 'F'
AND takes.grade IS NOT NULL)
This query meets all my needs, except that students who didn't take any course are assigned NULL instead of 0.
The second uses CASE WHEN:
UPDATE student
SET tot_cred = (select sum(credits)
case
when sum(credits) IS NOT NULL then sum(credits)
else 0 end
FROM takes as t, course as c
WHERE t.course_id = c.course_id
AND t.grade<>'F' and t.grade IS NOT NULL
)
But it assigned 0 to all students. Is there any way to achieve the above requirement?
If the first query meets your requirement and the only problem is that it returns NULL for students who did not take any course, the easiest solution is to replace the SUM() aggregate function with SQLite's TOTAL(), which returns 0.0 instead of NULL when there are no rows to sum:
UPDATE student AS s
SET tot_cred = (
SELECT TOTAL(c.credits)
FROM takes t INNER JOIN course c
ON t.course_id = c.course_id
WHERE t.ID = s.ID AND t.grade <> 'F' AND t.grade IS NOT NULL
);
The same could be done with COALESCE():
SELECT COALESCE(SUM(credits), 0)...
Also, use a proper join with an ON clause and aliases for the tables to improve readability.
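As a quick check of the COALESCE() variant, here is a small sketch using Python's sqlite3 module. The schema is trimmed to the columns the update needs and the rows are made up; student '3' takes no course and student '2' has one F grade.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE student (ID TEXT PRIMARY KEY, name TEXT NOT NULL, tot_cred NUMERIC);
CREATE TABLE course (course_id TEXT PRIMARY KEY, credits NUMERIC);
CREATE TABLE takes (ID TEXT, course_id TEXT, grade TEXT);

-- tot_cred starts out with a meaningless value (99) for everyone.
INSERT INTO student VALUES ('1', 'Ann', 99), ('2', 'Bob', 99), ('3', 'Cy', 99);
INSERT INTO course VALUES ('C1', 3), ('C2', 4);
INSERT INTO takes VALUES ('1', 'C1', 'A'), ('1', 'C2', 'B'),
                         ('2', 'C1', 'F'), ('2', 'C2', 'C');
""")

conn.execute("""
UPDATE student
SET tot_cred = (
    SELECT COALESCE(SUM(c.credits), 0)   -- 0 when no row qualifies
    FROM takes t
    INNER JOIN course c ON c.course_id = t.course_id
    WHERE t.ID = student.ID              -- correlate with the row being updated
      AND t.grade <> 'F'                 -- NULL grades are dropped too: NULL <> 'F' is not true
)
""")

credits = dict(conn.execute("SELECT ID, tot_cred FROM student"))
print(credits)
```

The correlation `t.ID = student.ID` is what the second attempt in the question was missing; without it the subquery is evaluated over all students at once.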

How to count missing rows in left table after right join?

There are two tables:
Table education_data (list of countries with values by year per measured indicator).
create table education_data
(country_id int,
indicator_id int,
year date,
value float
);
Table indicators (list of all indicators):
create table indicators
(id int PRIMARY KEY,
name varchar(200),
code varchar(25)
);
I want to find the indicators for which the highest number of countries lack information entirely
i.e. max (count of missing indicators by country)
I have solved the problem in Excel (by counting blanks in a pivot table by country):
pivot table with count for missing indicators by country
I haven't figured out yet how to write an SQL query that returns the same results.
I am able to return the number of missing indicators for a single given country (see the query below), but not for all countries.
SELECT COUNT(*)
FROM education_data AS edu
RIGHT JOIN indicators AS ind ON
edu.indicator_id = ind.id and country_id = 10
WHERE value IS NULL
GROUP BY country_id
I have tried with a cross join without success so far.
You will have to join on the countries as well; otherwise you cannot tell whether a country has no entry in education_data at all:
create table countries(id serial primary key, name varchar);
create table indicators
(id int PRIMARY KEY,
name varchar(200),
code varchar(25)
);
create table education_data
(country_id int references countries,
indicator_id int references indicators,
year date,
value float
);
insert into countries values (1,'USA');
insert into countries values (2,'Norway');
insert into countries values (3,'France');
insert into indicators values (1,'foo','xxx');
insert into indicators values (2,'bar', 'yyy');
insert into education_data values(1,1,'01-01-2020',1.1);
SELECT count(c.id), i.id, i.name
FROM countries c
JOIN indicators i ON (true)
LEFT JOIN education_data e ON (c.id = e.country_id AND i.id = e.indicator_id)
WHERE indicator_id IS NULL
GROUP BY i.id;
count | id | name
-------+----+------
3 | 2 | bar
2 | 1 | foo
(2 rows)
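The same approach can be reproduced outside PostgreSQL. The sketch below uses Python's sqlite3 module with the sample rows from above; CROSS JOIN replaces JOIN ... ON (true), the date is stored as text, and the column alias countries_missing is just an illustrative name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE countries (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE indicators (id INTEGER PRIMARY KEY, name TEXT, code TEXT);
CREATE TABLE education_data (country_id INTEGER, indicator_id INTEGER, year TEXT, value REAL);
INSERT INTO countries VALUES (1, 'USA'), (2, 'Norway'), (3, 'France');
INSERT INTO indicators VALUES (1, 'foo', 'xxx'), (2, 'bar', 'yyy');
INSERT INTO education_data VALUES (1, 1, '2020-01-01', 1.1);
""")

# Build every (country, indicator) pair, then keep the pairs with no data row.
missing = conn.execute("""
SELECT i.id, i.name, COUNT(*) AS countries_missing
FROM countries c
CROSS JOIN indicators i
LEFT JOIN education_data e
       ON e.country_id = c.id AND e.indicator_id = i.id
WHERE e.indicator_id IS NULL
GROUP BY i.id, i.name
ORDER BY countries_missing DESC
""").fetchall()
print(missing)
```

Ordering by the count in descending order puts the indicator that the most countries lack first.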
I want to find the indicators for which the highest number of countries lack information entirely i.e. max (count of missing indicators by country)
That's a logical contradiction. The "count of missing indicators by country" cannot be pinned on any specific indicators, since those countries don't have an indicator.
The counts per country with "missing indicator" (i.e. indicator_id IS NULL):
SELECT country_id, count(*) AS ct_indicator_null
FROM education_data
WHERE indicator_id IS NULL
GROUP BY country_id
ORDER BY count(*) DESC;
Or, more generally, without valid indicator, which also includes rows where indicator_id has no match in table indicators:
SELECT country_id, count(*) AS ct_no_valid_indicator
FROM education_data e
WHERE NOT EXISTS (
SELECT FROM indicators i
WHERE i.id = e.indicator_id
)
GROUP BY country_id
ORDER BY count(*) DESC;
NOT EXISTS is one of four basic techniques that apply here (LEFT / RIGHT JOIN, like you tried, being another one). See:
Select rows which are not present in other table
You mentioned a country table. Countries without any indicator entries in education_data are not included in the result above. To find those, too:
SELECT *
FROM country c
WHERE NOT EXISTS (
SELECT
FROM education_data e
JOIN indicators i ON i.id = e.indicator_id -- INNER JOIN this time!
WHERE e.country_id = c.id
);
Reports countries without valid indicator (none, or not valid).
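For illustration, here is the last query run against a tiny made-up data set with Python's sqlite3 module. SQLite does not accept PostgreSQL's empty SELECT list, so the subquery uses SELECT 1; the table name country and the sample rows are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE country (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE indicators (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE education_data (country_id INTEGER, indicator_id INTEGER, value REAL);
INSERT INTO country VALUES (1, 'USA'), (2, 'Norway'), (3, 'France');
INSERT INTO indicators VALUES (1, 'foo');
-- USA has a valid indicator, Norway only an unknown one (99), France has no rows.
INSERT INTO education_data VALUES (1, 1, 1.1), (2, 99, 2.2);
""")

no_valid_indicator = conn.execute("""
SELECT c.id, c.name
FROM country c
WHERE NOT EXISTS (
    SELECT 1
    FROM education_data e
    JOIN indicators i ON i.id = e.indicator_id  -- inner join drops invalid ids
    WHERE e.country_id = c.id
)
ORDER BY c.id
""").fetchall()
print(no_valid_indicator)
```

Both the country with only an unmatched indicator_id and the country with no rows at all are reported.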
If every country should have a valid indicator, after cleaning up existing data, consider:
1: adding a FOREIGN KEY constraint to disallow invalid entries in education_data.indicator_id.
2: setting education_data.indicator_id NOT NULL to also disallow NULL entries.
Or add a PRIMARY KEY on (country_id, indicator_id), which makes both columns NOT NULL automatically.
.. which brings you closer to a valid many-to-many implementation. See:
How to implement a many-to-many relationship in PostgreSQL?

SQLite, aggregation query as where clause

Given the schema:
CREATE TABLE Student (
studentID INT PRIMARY KEY NOT NULL,
studentName TEXT NOT NULL,
major TEXT,
class TEXT CHECK (class IN ("Freshman", "Sophomore", "Junior", "Senior")),
gpa FLOAT CHECK (gpa IS NULL OR (gpa >= 0 AND gpa <= 4)),
FOREIGN KEY (major) REFERENCES Dept(deptID) ON UPDATE CASCADE ON DELETE CASCADE
);
CREATE TABLE Dept (
deptID TEXT PRIMARY KEY NOT NULL CHECK (LENGTH(deptID) <= 4),
NAME TEXT NOT NULL UNIQUE,
building TEXT
);
CREATE TABLE Course (
courseNum INT NOT NULL,
deptID TEXT NOT NULL,
courseName TEXT NOT NULL,
location TEXT,
meetDay TEXT NOT NULL CHECK (meetDay IN ("MW", "TR", "F")),
meetTime INT NOT NULL CHECK (meetTime >= '07:00' AND meetTime <= '17:00'),
PRIMARY KEY (courseNum, deptID),
FOREIGN KEY (deptID) REFERENCES Dept(deptID) ON UPDATE CASCADE ON DELETE CASCADE
);
CREATE TABLE Enroll (
courseNum INT NOT NULL,
deptID TEXT NOT NULL,
studentID INT NOT NULL,
PRIMARY KEY (courseNum, deptID, studentID),
FOREIGN KEY (courseNum, deptID) REFERENCES Course ON UPDATE CASCADE ON DELETE CASCADE,
FOREIGN KEY (studentID) REFERENCES Student(studentID) ON UPDATE CASCADE ON DELETE CASCADE
);
I'm attempting to find the names, IDs, and number of courses for the students who are taking the highest number of courses. The SELECT to retrieve the names and IDs is simple enough, but I'm having trouble figuring out how to select the number of courses each student is taking, then find the maximum of that and use it as a WHERE clause.
This is what I have so far:
SELECT Student.studentName, Student.studentID, COUNT(*) AS count
FROM Enroll
INNER JOIN Student ON Enroll.studentID=Student.studentID
GROUP BY Enroll.studentID
First, get the count of enrolled classes per student:
SELECT COUNT(*) AS num
FROM Enroll
GROUP BY studentID
You can then compare your existing query against the highest of those counts using HAVING to get your final query. The subquery must return only the largest count, hence the ORDER BY ... LIMIT 1:
SELECT Student.studentName, Student.studentID, COUNT(*) AS count
FROM Enroll
INNER JOIN Student ON Enroll.studentID=Student.studentID
GROUP BY Enroll.studentID
HAVING COUNT(*) = (SELECT COUNT(*) AS num FROM Enroll GROUP BY studentID ORDER BY num DESC LIMIT 1);
To recap: the subquery gets the highest number of enrollments for any student, and the outer query then returns all students whose enrollment count equals that number, i.e. every student with the highest (or joint highest) number of enrollments.
We use HAVING because it is applied after the GROUP BY. WHERE is applied before grouping, so aggregate functions such as COUNT() cannot be used there.
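Here is a small runnable sketch of the HAVING approach using Python's sqlite3 module. The tables are cut down to the columns involved, the rows are invented, and the maximum is taken with MAX() over a derived table, which is one of several ways to get the largest per-student count.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Student (studentID INTEGER PRIMARY KEY, studentName TEXT NOT NULL);
CREATE TABLE Enroll (courseNum INTEGER, deptID TEXT, studentID INTEGER,
                     PRIMARY KEY (courseNum, deptID, studentID));
INSERT INTO Student VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
-- Ann and Bob take two courses each, Cy takes one.
INSERT INTO Enroll VALUES (101, 'CS', 1), (102, 'CS', 1),
                          (101, 'CS', 2), (201, 'MATH', 2),
                          (101, 'CS', 3);
""")

top_students = conn.execute("""
SELECT s.studentName, s.studentID, COUNT(*) AS num_courses
FROM Enroll e
INNER JOIN Student s ON s.studentID = e.studentID
GROUP BY s.studentID, s.studentName
HAVING COUNT(*) = (
    -- largest per-student enrollment count
    SELECT MAX(cnt)
    FROM (SELECT COUNT(*) AS cnt FROM Enroll GROUP BY studentID)
)
ORDER BY s.studentID
""").fetchall()
print(top_students)
```

Because the comparison is an equality against the maximum, ties are kept: both students with two courses are returned.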

SQL DML Query AVG and COUNT

I am a beginner at SQL and I am trying to write a query.
I have these tables:
CREATE TABLE Hospital (
hid INT PRIMARY KEY,
name VARCHAR(127) UNIQUE,
country VARCHAR(127),
area INT
);
CREATE TABLE Doctor (
ic INT PRIMARY KEY,
name VARCHAR(127),
date_of_birth INT
);
CREATE TABLE Work (
hid INT,
ic INT,
since INT,
FOREIGN KEY (hid) REFERENCES Hospital (hid),
FOREIGN KEY (ic) REFERENCES Doctor (ic),
PRIMARY KEY (hid,ic)
);
The query is: What is the average in each country of the number of doctors working in hospitals of that country (1st column: each country, 2nd column: average)? Thanks.
You first need to write a query that counts the doctors per hospital
select w.hid, count(w.ic)
from work w
group by w.hid;
Based on that query, you can retrieve the average number of doctors per country:
with doctor_count as (
select w.hid, count(w.ic) as cnt
from work w
group by w.hid
)
select h.country, avg(dc.cnt)
from hospital h
join doctor_count dc on h.hid = dc.hid
group by h.country;
If you have an old DBMS that does not support common table expressions the above can be rewritten as:
select h.country, avg(dc.cnt)
from hospital h
join (
select w.hid, count(w.ic) as cnt
from work w
group by w.hid
) dc on h.hid = dc.hid
group by h.country;
Here is an SQLFiddle demo: http://sqlfiddle.com/#!12/9ff79/1
Btw: storing date_of_birth as an integer is a bad choice. You should use a real DATE column.
And work is a reserved word in SQL. You shouldn't use that for a table name.
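The two-step aggregation can be tried out with Python's sqlite3 module as sketched below. The tables are reduced to the relevant columns and the rows are made up. Note that hospitals with no doctors do not appear in doctor_count at all, so the inner join leaves them out of the average.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Hospital (hid INTEGER PRIMARY KEY, name TEXT UNIQUE, country TEXT);
CREATE TABLE Work (hid INTEGER, ic INTEGER, PRIMARY KEY (hid, ic));
INSERT INTO Hospital VALUES (1, 'A', 'FR'), (2, 'B', 'FR'), (3, 'C', 'DE');
-- Hospital 1 has three doctors, hospital 2 has one, hospital 3 has one.
INSERT INTO Work VALUES (1, 10), (1, 11), (1, 12), (2, 13), (3, 14);
""")

avg_per_country = conn.execute("""
WITH doctor_count AS (
    -- step 1: doctors per hospital
    SELECT w.hid, COUNT(w.ic) AS cnt
    FROM Work w
    GROUP BY w.hid
)
-- step 2: average of those counts per country
SELECT h.country, AVG(dc.cnt)
FROM Hospital h
JOIN doctor_count dc ON dc.hid = h.hid
GROUP BY h.country
ORDER BY h.country
""").fetchall()
print(avg_per_country)
```

If hospitals without doctors should count as 0 in the average, a LEFT JOIN from Hospital with COALESCE(dc.cnt, 0) would be one way to include them.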

Join multiple tables, including one table twice, and sort by counting a group

I am an amateur trying to finish the last question of my assignment (it is past due at this point, so I'm just looking for understanding). I have spent almost 5 hours on this across two days without success.
I have tried all the different types of joins, could never get grouping to work, and have had little luck with the sorting as well. I can do each of these things one at a time, but the difficulty is getting them all to work together.
This is the question:
Write a SQL query to retrieve a list that has (source city, source code, destination city,
destination code, and number-of-flights) for all source-dest pairs with at least 2 flights. Order
by the number_of_flights. Note that the “dest”, and “source” attributes in the “flights” table
are both referenced to the “airportid” in the “airports” table.
Here are the tables I have to work with (also came with about 3000 lines of dummy entries)
create table airports (
airportid char(3) primary key,
city varchar(20)
);
create table airlines (
airlineid char(2) primary key,
name varchar(20),
hub char(3) references airports(airportid)
);
create table customers (
customerid char(10) primary key,
name varchar(25),
birthdate date,
frequentflieron char(2) references airlines(airlineid)
);
create table flights (
flightid char(6) primary key,
source char(3) references airports(airportid),
dest char(3) references airports(airportid),
airlineid char(2) references airlines(airlineid),
local_departing_time date,
local_arrival_time date
);
create table flown (
flightid char(6) references flights(flightid),
customerid char(10) references customers,
flightdate date
);
The first problem I ran into was outputting airports.city twice in the same query but with different results. On top of that, no matter what I tried when grouping, I would always get the same error:
Not a GROUP BY expression
Normally I have fun trying to piece these together, but this has been frustrating. Help!
select source.airportid as source_airportid,
source.city source_city,
dest.airportid as dest_airportid,
dest.city as dest_city,
count(*) as flights
from flights
inner join airports source on source.airportid = flights.source
inner join airports dest on dest.airportid = flights.dest
group by
source.airportid,
source.city,
dest.airportid,
dest.city
having count(*) >= 2
order by 5;
Have you tried a subquery?
SELECT source_airports.city,
source_airports.airportid,
dest_airports.city,
dest_airports.airportid,
x.number_of_flights
FROM
(
SELECT source, dest, COUNT(*) as number_of_flights
FROM flights
GROUP BY source, dest
HAVING COUNT(*) > 1
) x
INNER JOIN airports dest_airports
ON dest_airports.airportid = x.dest
INNER JOIN airports source_airports
ON source_airports.airportid = x.source
ORDER BY x.number_of_flights ASC;
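Both answers rely on joining airports twice under different aliases. A minimal sketch with Python's sqlite3 module, using only the two tables involved and a handful of invented flights, looks like this:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE airports (airportid TEXT PRIMARY KEY, city TEXT);
CREATE TABLE flights (flightid TEXT PRIMARY KEY,
                      source TEXT REFERENCES airports(airportid),
                      dest TEXT REFERENCES airports(airportid));
INSERT INTO airports VALUES ('AAA', 'Alpha'), ('BBB', 'Beta'), ('CCC', 'Gamma');
-- AAA->BBB twice, BBB->AAA once, AAA->CCC three times.
INSERT INTO flights VALUES ('F1', 'AAA', 'BBB'), ('F2', 'AAA', 'BBB'),
                           ('F3', 'BBB', 'AAA'),
                           ('F4', 'AAA', 'CCC'), ('F5', 'AAA', 'CCC'), ('F6', 'AAA', 'CCC');
""")

pairs = conn.execute("""
SELECT src.city, src.airportid, dst.city, dst.airportid,
       COUNT(*) AS number_of_flights
FROM flights f
INNER JOIN airports src ON src.airportid = f.source  -- first copy: source airport
INNER JOIN airports dst ON dst.airportid = f.dest    -- second copy: destination airport
GROUP BY src.airportid, src.city, dst.airportid, dst.city
HAVING COUNT(*) >= 2
ORDER BY number_of_flights
""").fetchall()
print(pairs)
```

Every non-aggregated column in the SELECT list also appears in the GROUP BY, which is what avoids the "Not a GROUP BY expression" error on stricter databases.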