SQL Query - count - max - sql

I can't manage to come up with a query for a problem. I have three tables:
CREATE TABLE institute (
iid INT PRIMARY KEY,
sign VARCHAR(127) UNIQUE,
city VARCHAR(127) NOT NULL,
area INT CHECK (area>0));
CREATE TABLE desease (
did INT PRIMARY KEY,
name VARCHAR(127) UNIQUE,
level INT CHECK (level>0));
CREATE TABLE studies (
did INT,
iid INT,
FOREIGN KEY (did) REFERENCES desease (did),
FOREIGN KEY (iid) REFERENCES institute (iid),
PRIMARY KEY (iid,did));
My question is: what are the names of the diseases studied by the largest number of institutes from Lisbon (Lisbon being the city of the institute)? This is what I came up with, but it doesn't give me the right answer.
SELECT DISTINCT D.name, MAX(I.iid)
FROM desease D, studies S
JOIN institute I ON (S.iid = I.iid)
WHERE I.city = 'Lisboa' AND D.did = S.did
GROUP BY D.nome
HAVING COUNT(I.iid) = MAX(I.city)
As an example: imagine 5 institutes, all with city = 'Lisbon' and with iid A, B, C, D, E respectively (just for demonstration purposes, I know the type is INT), and 5 diseases with name = Z, X, N, V, M respectively.
Now let's say diseases Z, X and M are each studied by institutes A, B and C, disease N is studied only by D, and disease V is studied only by E. The largest number of Lisbon institutes studying any one disease is then 3 (Z, X and M are each studied by 3 institutes), so the result would look like this:
Z - 3
X - 3
M - 3
Edit: I managed to find a way to do it. Here is the query that I came up with:
SELECT DISTINCT D.name, COUNT(*) AS C
FROM desease D, studies E, institute I
WHERE I.iid = E.iid AND D.did = E.did AND I.city = 'Lisboa'
GROUP BY D.name
HAVING C >= ALL (
SELECT COUNT(*)
FROM desease D, studies E, institute I
WHERE I.iid = E.iid AND D.did = E.did AND I.city = 'Lisboa'
GROUP BY D.name
);
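The example data described in the question can be checked with a small script. This is only a sketch: it uses Python's built-in sqlite3 module, maps institutes A to E onto iid 1 to 5 (an assumption made for illustration), fills `sign`, `area` and `level` with placeholder values, and replaces `>= ALL` with a comparison against `MAX`, because SQLite does not support `ALL`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE institute (iid INT PRIMARY KEY, sign VARCHAR(127) UNIQUE,
                        city VARCHAR(127) NOT NULL, area INT CHECK (area > 0));
CREATE TABLE desease (did INT PRIMARY KEY, name VARCHAR(127) UNIQUE,
                      level INT CHECK (level > 0));
CREATE TABLE studies (did INT, iid INT,
                      FOREIGN KEY (did) REFERENCES desease (did),
                      FOREIGN KEY (iid) REFERENCES institute (iid),
                      PRIMARY KEY (iid, did));
""")

# Institutes A..E become iid 1..5; sign and area are placeholder values.
conn.executemany("INSERT INTO institute VALUES (?, ?, 'Lisboa', 10)",
                 [(i, f"inst{i}") for i in range(1, 6)])
conn.executemany("INSERT INTO desease VALUES (?, ?, 1)",
                 [(1, "Z"), (2, "X"), (3, "N"), (4, "V"), (5, "M")])
# Z, X and M are studied by institutes 1..3; N by 4; V by 5.
studies = [(d, i) for d in (1, 2, 5) for i in (1, 2, 3)] + [(3, 4), (4, 5)]
conn.executemany("INSERT INTO studies (did, iid) VALUES (?, ?)", studies)

query = """
WITH per_disease AS (
    SELECT D.name AS name, COUNT(*) AS c
    FROM desease D
    JOIN studies S   ON S.did = D.did
    JOIN institute I ON I.iid = S.iid
    WHERE I.city = 'Lisboa'
    GROUP BY D.did, D.name
)
SELECT name, c
FROM per_disease
WHERE c = (SELECT MAX(c) FROM per_disease)  -- keeps every disease tied for first
ORDER BY name
"""
result = conn.execute(query).fetchall()
print(result)
```

Comparing against the maximum keeps all tied diseases, which a plain `ORDER BY ... LIMIT 1` would not.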

I don't understand the structure/problem well enough, but I did see that you were mixing join styles and had a cross join, which would inflate the number of records.
SELECT DISTINCT D.name, MAX(I.iid)
FROM desease D
INNER JOIN studies S ON D.did = S.did
INNER JOIN institute I ON (S.iid = I.iid)
WHERE I.city = 'Lisboa'
GROUP BY D.name
HAVING COUNT(I.iid) = MAX(I.city)

This would return a list of disease names that have an institute in Lisbon starting with the one with the greatest number of institutes in Lisbon and going down:
SELECT D.name, COUNT(*) as numberOfInstitutes
FROM desease D
INNER JOIN studies S ON D.did=S.did
INNER JOIN institute I ON (S.iid = I.iid)
WHERE I.city = 'Lisbon'
GROUP BY D.did, D.name
ORDER BY COUNT(*) desc
If you need only the one that has the most institutes and you need the rest of the columns from the desease table, you can do this (in SQL Server):
SELECT TOP 1 D.*
FROM desease D
INNER JOIN
(
SELECT D.did, COUNT(*) as numberOfInstitutes
FROM desease D
INNER JOIN studies S ON D.did=S.did
INNER JOIN institute I ON (S.iid = I.iid)
WHERE I.city = 'Lisbon'
GROUP BY D.did
) as tblCount on tblCount.did = D.did
ORDER BY numberOfInstitutes desc

Just a rough guess what you need:
SELECT stu.iid, COUNT(*) AS nstudies
FROM studies stu, institute ins
WHERE stu.iid=ins.iid
AND ins.city='Lisboa'
GROUP BY stu.iid
ORDER BY nstudies DESC;
This should give you a list of the institutes in Lisboa and the number of diseases each one studies.
SELECT stu.did, COUNT(*) AS ninst
FROM studies stu, institute ins, desease dis
WHERE stu.iid=ins.iid
AND stu.did=dis.did
AND ins.city='Lisboa'
GROUP BY stu.did
ORDER BY ninst DESC;
This gives you a list of diseases and the number of Lisboa institutes that study each one.
Unfortunately your question leaves a lot of room for speculation as to what you need -- maybe you should add some example data and the expected result.

Related

SQL aggregate functions, inner join

I am writing a SQL query to get the SID and SNAME. For this task, I need to find which team won the largest number of leagues and then find the corresponding SID.
Leagues(LID, CHAMPION_TID)
LID: League ID ; CHAMPION_TID: champion team ID
SUPPORT(SID, LID)
SPONSORS(SID, SNAME)
PRIMARY KEY: LID,SID
So far, I can find the maximum number of leagues won by a single team with the following SQL:
SELECT
MAX(y.cham)
FROM
(SELECT
CHAMPION_TID, COUNT(L.CHAMPION_TID) AS cham
FROM
LEAGUES L
GROUP BY
L.CHAMPION_TID) y, LEAGUES L
WHERE
y.CHAMPION_TID = L.CHAMPION_TID;
I am confused about the next step. My idea is to get the LID, then join the tables to display SID and SNAME. But I am stuck at this step.
SELECT L.LID, MAX(y.cham)
FROM
(SELECT CHAMPION_TID, COUNT(L.CHAMPION_TID) AS cham
FROM LEAGUES L
GROUP BY L.CHAMPION_TID) y, LEAGUES L
WHERE
y.CHAMPION_TID = L.CHAMPION_TID
You can use the following to find the Sponsor ID and Sponsor Name:
SELECT DISTINCT
sp.SID,
sp.SNAME
FROM
LEAGUES l3
INNER JOIN support s ON
l3.LID = s.LID
INNER JOIN SPONSORS sp ON
s.SID = sp.SID
WHERE
l3.CHAMPION_TID IN (
SELECT
l2.CHAMPION_TID
FROM
LEAGUES l2
GROUP BY
l2.CHAMPION_TID
HAVING
count(l2.CHAMPION_TID) = (
SELECT
count(l1.CHAMPION_TID)
FROM
LEAGUES l1
GROUP BY
l1.CHAMPION_TID
ORDER BY
count(l1.CHAMPION_TID) DESC
FETCH FIRST 1 ROW ONLY
)
);
It finds the count of CHAMPION_TID in LEAGUES, orders it by desc (such that the highest count is always on top), then uses it to find the associated CHAMPION_TID. It handles ties for max(count(CHAMPION_TID)) as well :)
If FETCH FIRST 1 ROW ONLY does not work, you can use SELECT TOP 1 count(l1.CHAMPION_TID)... instead.
Here is a working demo using Postgres.
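As a rough way to check the tie handling, here is a sketch using Python's sqlite3 module. The sample rows and sponsor names are made up for illustration, and because SQLite has no `FETCH FIRST`, the top count is taken with `MAX` over a CTE rather than with the ordering trick above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE leagues  (lid INTEGER PRIMARY KEY, champion_tid INTEGER);
CREATE TABLE sponsors (sid INTEGER PRIMARY KEY, sname TEXT);
CREATE TABLE support  (sid INTEGER, lid INTEGER, PRIMARY KEY (lid, sid));

-- Hypothetical data: teams 10 and 20 each won two leagues, team 30 won one.
INSERT INTO leagues  VALUES (1, 10), (2, 10), (3, 20), (4, 20), (5, 30);
INSERT INTO sponsors VALUES (1, 'Acme'), (2, 'Globex'), (3, 'Initech');
-- Acme supports league 1, Globex league 3, Initech league 5.
INSERT INTO support  VALUES (1, 1), (2, 3), (3, 5);
""")

query = """
WITH wins AS (
    SELECT champion_tid, COUNT(*) AS cham
    FROM leagues
    GROUP BY champion_tid
)
SELECT DISTINCT sp.sid, sp.sname
FROM leagues l
JOIN support  s  ON s.lid  = l.lid
JOIN sponsors sp ON sp.sid = s.sid
WHERE l.champion_tid IN (
    SELECT champion_tid
    FROM wins
    WHERE cham = (SELECT MAX(cham) FROM wins)  -- all teams tied for most wins
)
ORDER BY sp.sid
"""
sponsors = conn.execute(query).fetchall()
print(sponsors)
```

Both tied teams (10 and 20) contribute their sponsors, while the sponsor of team 30 is left out.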

Borrowers who borrowed ONLY all books of a specified author and editor

For a given author and editor, I need to list the names of borrowers who borrowed "ONLY" and "ALL" the books of that author and editor.
MEMBER (mid primary key, name, birthdate)
AUTHOR (authorcode primary key,name)
EDITOR(edcode primary key,name)
BORROW (bid primary key, copyid, mid)
COPIES (copyid primary key, bookid)
BOOK (bookid primary key, title, themecode)
THEME (themecode primary key, label)
I tried the following:
SELECT m.name
FROM MEMBER m,COPIES c,BORROW b1,BOOK b,AUTHOR a,EDITOR e,WRITE w
WHERE m.mid=b1.mid AND b1.copyid=c.copyid AND c.bookid=b.bookid AND b.editor=e.edcode AND a.authorcode=w.author AND w.book=b.bookid
MINUS
(SELECT m.name
FROM MEMBER m,COPIES c,BORROW b1,BOOK b,AUTHOR a,EDITOR e,WRITE w
WHERE m.mid=b1.mid AND b1.copyid=c.copyid AND c.bookid=b.bookid AND b.editor=e.edcode AND a.authorcode=w.author AND w.book=b.bookid AND (a.authorcode<>:P58_AUTHOR OR e.edcode<>:P58_EDITOR)
)
It gave me the members who borrowed at least one book and didn't borrow any book other than those of that author and editor, so I still need to make sure they borrowed all the books.
I think the following will give you what you need:
WITH book_ed_auth_dets AS (SELECT b.bookid,
                                  a.authorcode,
                                  e.edcode,
                                  COUNT(*) OVER (PARTITION BY a.authorcode, e.edcode) tot_num_bks_per_auth_ed
                           FROM book b
                           INNER JOIN WRITE w ON b.bookid = w.book
                           INNER JOIN author a ON w.author = a.authorcode
                           INNER JOIN editor e ON b.editor = e.edcode
                           WHERE a.authorcode = :p58_author
                           AND e.edcode = :p58_editor)
SELECT m.name
FROM MEMBER m
INNER JOIN borrow b1 ON m.mid = b1.mid
INNER JOIN copies c ON b1.copyid = c.copyid
INNER JOIN book_ed_auth_dets bead ON c.bookid = bead.bookid
GROUP BY m.mid,
         m.name,
         bead.authorcode,
         bead.edcode,
         bead.tot_num_bks_per_auth_ed
HAVING COUNT(DISTINCT bead.bookid) = bead.tot_num_bks_per_auth_ed;
N.B. Untested, since you didn't provide sample data to work with.
This finds the books for a given author and editor, and uses an analytic count to write out the total number of books for that author and editor for each row.
Then we inner join the member's borrowed books to that, find the count of distinct bookids (in case the member borrowed the same book more than once) and only report members if that count matches the count of books for the author and editor.
If I am not mistaken, the WRITE table is the most important table here, and you forgot to describe it.
I have taken an idea from your existing query and tried to create the solution as follows:
SELECT
NAME
FROM
(
SELECT
M.NAME,
-- NO OF BOOK BORROWED BY MEMBER OF THAT AUTHOR AND EDITOR
COUNT(DISTINCT B.BOOKID) NO_BOOKS_BORROWED
FROM
MEMBER M
JOIN BORROW B1 ON ( M.MID = B1.MID )
JOIN COPIES C ON ( B1.COPYID = C.COPYID )
JOIN BOOK B ON ( C.BOOKID = B.BOOKID )
JOIN EDITOR E ON ( B.EDITOR = E.EDCODE )
JOIN WRITE W ON ( W.BOOK = B.BOOKID )
JOIN AUTHOR A ON ( A.AUTHORCODE = W.AUTHOR )
GROUP BY
M.NAME
--CONDITION TO CHECK THAT BORROWER HAS NOT BORROWED ANY OTHER BOOK
HAVING
SUM(CASE
WHEN A.AUTHORCODE <> :P58_AUTHOR
OR E.EDCODE <> :P58_EDITOR THEN 1
ELSE 0
END) = 0
) BORRROWER
--
JOIN (
SELECT
-- NO OF ALL BOOKS WRITTEN BY AUTHOR AND EDITOR
COUNT(DISTINCT B.BOOKID) NO_BOOKS_WRITTEN
FROM
BOOK B
JOIN EDITOR E ON ( B.EDITOR = E.EDCODE )
JOIN WRITE W ON ( W.BOOK = B.BOOKID )
JOIN AUTHOR A ON ( A.AUTHORCODE = W.AUTHOR )
WHERE
A.AUTHORCODE = :P58_AUTHOR
AND E.EDCODE = :P58_EDITOR
) AUTHOR_EDITOR ON ( BORRROWER.NO_BOOKS_BORROWED = AUTHOR_EDITOR.NO_BOOKS_WRITTEN )
Cheers!!
The following returns all the books by a given author and editor:
select w.bookid
from book b join
writes w
on b.bookid = w.bookid
where b.edcode = :P58_EDITOR and
w.authorcode = :P58_AUTHOR;
The following finds members who have borrowed all these books:
with bae as (
select w.bookid
from book b join
writes w
on b.bookid = w.bookid
where b.edcode = :P58_EDITOR and
w.authorcode = :P58_AUTHOR
)
select b.mid
from borrow b join
copies c
on c.copyid = b.copyid join
bae
on c.bookid = bae.bookid
group by b.mid
having count(distinct c.bookid) = (select count(*) from bae);
Note that the count(distinct) is needed because a member could, presumably, borrow the same book twice.
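The "all" check above can be combined with the "only" requirement from the question in a single HAVING clause. The sketch below uses Python's sqlite3 module; the column names (`book.edcode`, `writes.bookid`, `writes.authorcode`) follow the queries above and are assumptions, since the question never shows the WRITE table, and the sample rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Assumed columns: book.edcode and writes(bookid, authorcode).
CREATE TABLE member (mid INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE book   (bookid INTEGER PRIMARY KEY, title TEXT, edcode INTEGER);
CREATE TABLE writes (bookid INTEGER, authorcode INTEGER);
CREATE TABLE copies (copyid INTEGER PRIMARY KEY, bookid INTEGER);
CREATE TABLE borrow (bid INTEGER PRIMARY KEY, copyid INTEGER, mid INTEGER);

INSERT INTO member VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
-- Books 1 and 2: author 1, editor 1. Book 3: author 1, editor 2.
INSERT INTO book   VALUES (1, 'B1', 1), (2, 'B2', 1), (3, 'B3', 2);
INSERT INTO writes VALUES (1, 1), (2, 1), (3, 1);
INSERT INTO copies VALUES (11, 1), (12, 1), (21, 2), (31, 3);
-- Ann: book 1 twice and book 2. Bob: books 1, 2 and 3. Cy: book 1 only.
INSERT INTO borrow (copyid, mid) VALUES
    (11, 1), (12, 1), (21, 1),
    (11, 2), (21, 2), (31, 2),
    (11, 3);
""")

query = """
WITH bae AS (
    SELECT w.bookid
    FROM book b
    JOIN writes w ON w.bookid = b.bookid
    WHERE b.edcode = :editor AND w.authorcode = :author
)
SELECT m.name
FROM member m
JOIN borrow br ON br.mid = m.mid
JOIN copies c  ON c.copyid = br.copyid
LEFT JOIN bae  ON bae.bookid = c.bookid
GROUP BY m.mid, m.name
HAVING COUNT(DISTINCT bae.bookid) = (SELECT COUNT(*) FROM bae)  -- borrowed ALL
   AND COUNT(DISTINCT c.bookid) = COUNT(DISTINCT bae.bookid)    -- borrowed ONLY
ORDER BY m.name
"""
names = [row[0] for row in conn.execute(query, {"author": 1, "editor": 1})]
print(names)
```

The LEFT JOIN keeps borrowed books outside the target set in the group, so the second condition can detect them: Bob is rejected because he also borrowed book 3, and Cy because he borrowed only one of the two target books.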

SQL - Selecting highest scores for different categories

Let's say I've got a db with 3 tables:
Players (PK id_player, name...),
Tournaments (PK id_tournament, name...),
Game (PK id_turn, FK id_tournament, FK id_player and score)
Players participate in tournaments. The table called Game keeps track of each player's score in the different tournaments.
I want to create a view that looks like this:
tournament_name  Winner  highest_score
Tournament_1     Jones   300
Tournament_2     White   250
I tried different approaches, but I'm fairly new to SQL (and also to this forum).
I tried using union all clause like:
select * from (select "Id_player", avg("score") as "Score" from
"Game" where "Id_tournament" = '1' group by "Id_player" order by
"Score" desc) where rownum <= 1
union all
select * from (select "Id_player", avg("score") as "Score" from
"Game" where "Id_tournament" = '2' group by "Id_player" order by
"Score" desc) where rownum <= 1;
Of course it works, but whenever a new tournament happens, I would have to manually add another select statement with Id_tournament = the next value.
EDIT:
So let's say that the player with id 1 scored 50 points in tournament A and player 2 scored 40 points. Player 1 wins, so the table should show only player 1 as the winner of this tournament (or, if possible, 2 or more players if it's a tie). The next row shows the winner of the second tournament. I don't think I'm going to store multiple games for one player in the same tournament, but if I did, the score would probably be the average of all that player's scores.
EDIT2:
Create table scripts:
create table players
(id_player numeric(5) constraint pk_id_player primary key, name
varchar2(50));
create table tournaments
(id_tournament numeric(5) constraint pk_id_tournament primary key,
name varchar2(50));
create table game
(id_game numeric(5) constraint pk_game primary key, id_player
numeric(5) constraint fk_id_player references players(id_player),
id_tournament numeric(5) constraint fk_id_tournament references
tournaments(id_tournament), score numeric(3));
FINAL EDIT:
OK, in case anyone is wondering: I used Jorge Campos's script, changed it a bit, and it works. Thank you all for helping. Unfortunately I cannot upvote comments yet, so I can only thank you by posting. Here's the final script:
select
t.name,
p.name as winner,
g.score
from
game g inner join tournaments t
on g.id_tournament = t.id_tournament
inner join players p
on g.id_player = p.id_player
inner join
(select g.id_tournament, g.id_player,
row_number() over (partition by t.name order by
score desc) as rd from game g join tournaments t on
g.id_tournament = t.id_tournament
) a
on g.id_player = a.id_player
and g.id_tournament = a.id_tournament
and a.rd=1
order by t.name, g.score desc;
This query could be simplified depending on the RDBMS you are using.
select
t.name,
p.name as winner,
g.score
from
game g inner join tournaments t
on g.id_tournament = t.id_tournament
inner join players p
on g.id_player = p.id_player
inner join
(select id_tournament,
id_player,
row_number() over (partition by id_tournament order by score desc) as rd
from game
) a
on g.id_player = a.id_player
and g.id_tournament = a.id_tournament
and a.rd=1
order by t.name, g.score desc
Assuming what you want is to display the highest score of each player in each tournament,
your query would look like this in MS SQL Server:
select
t.name as tournament_name,
p.name as Winner,
Max(g.score) as [Highest_Score]
from Tournaments t
Inner join Game g on t.id_tournament=g.id_tournament
inner join Players p on p.id_player=g.id_player
group by
g.id_tournament,
g.id_player,
t.name,
p.name
Please check whether this works for you:
SELECT tournemntData.id_tournament ,
tournemntData.name ,
dbo.Players.name ,
tournemntData.Score
FROM dbo.Game
INNER JOIN ( SELECT dbo.Tournaments.id_tournament ,
dbo.Tournaments.name ,
MAX(dbo.Game.score) AS Score
FROM dbo.Game
INNER JOIN dbo.Tournaments ON Tournaments.id_tournament = Game.id_tournament
INNER JOIN dbo.Players ON Players.id_player = Game.id_player
GROUP BY dbo.Tournaments.id_tournament ,
dbo.Tournaments.name
) tournemntData ON tournemntData.id_tournament = Game.id_tournament
INNER JOIN dbo.Players ON Players.id_player = Game.id_player
WHERE tournemntData.Score = dbo.Game.score
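The "join back to the per-tournament maximum" idea can be wrapped in a view, which is what the question asked for. Here is a minimal sketch with Python's sqlite3 module. The view name `tournament_winners` and the sample rows are hypothetical, and the table definitions are simplified from the CREATE TABLE scripts in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE players     (id_player INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE tournaments (id_tournament INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE game (id_game INTEGER PRIMARY KEY,
                   id_player INTEGER REFERENCES players(id_player),
                   id_tournament INTEGER REFERENCES tournaments(id_tournament),
                   score INTEGER);

INSERT INTO players     VALUES (1, 'Jones'), (2, 'White'), (3, 'Black');
INSERT INTO tournaments VALUES (1, 'Tournament_1'), (2, 'Tournament_2');
-- Tournament_2 ends in a tie between White and Black.
INSERT INTO game VALUES (1, 1, 1, 300), (2, 2, 1, 200),
                        (3, 2, 2, 250), (4, 3, 2, 250), (5, 1, 2, 100);

-- Hypothetical view name; one row per winner, so ties produce several rows.
CREATE VIEW tournament_winners AS
SELECT t.name AS tournament_name, p.name AS winner, g.score AS highest_score
FROM game g
JOIN tournaments t ON t.id_tournament = g.id_tournament
JOIN players p     ON p.id_player = g.id_player
JOIN (SELECT id_tournament, MAX(score) AS best
      FROM game
      GROUP BY id_tournament) m
  ON m.id_tournament = g.id_tournament AND m.best = g.score;
""")

winners = conn.execute(
    "SELECT * FROM tournament_winners ORDER BY tournament_name, winner"
).fetchall()
for row in winners:
    print(row)
```

Because the view is driven by the data in `game`, a new tournament shows up without editing the query, unlike the UNION ALL approach.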

Complicated table join

I thought I had a good grasp on table joins but there is one problem here I can't figure out.
I am trying to track the progress of students on specifically required courses. Some students are required to complete an exact list of courses before further qualification.
Tables (simplified):
students
--------
id INT PRIMARY KEY
name VARCHAR(50)
student_courses
---------------
student_id INT PRIMARY KEY
course_id TINYINT PRIMARY KEY
course_status TINYINT (Not done, Started, Completed)
steps_done TINYINT
total_steps TINYINT
date_created DATETIME
date_modified DATETIME
courses
-------
id TINYINT PRIMARY KEY
name VARCHAR(50)
I want to insert a list of required courses, for example 5 different courses in the courses table and then select a specific student and get list of all the courses required, whether a row exists for that course in the student_courses table or not.
I guess I could insert all rows from the courses table in the student_courses table for each student, but I don't want that because not all students need to do these courses. And what if new courses are added later.
I just want a result which is something like this:
students table:
id   name
---  ------------------
1    George Smith
2    Dana Jones
3    Maria Cobblestone
SELECT * FROM students (JOIN bla bla bla - this is the point where I'm lost...)
WHERE students.id = 1
Result:
id   name                course_id  courses.name  course_status  steps_done
---  ------------------  ---------  ------------  -------------  ----------
1    George Smith        1          Botany        Not started    0
1    George Smith        2          Biology       NULL           NULL
1    George Smith        3          Physics       NULL           NULL
1    George Smith        4          Algebra       Completed      34
1    George Smith        5          Sewing        Started        2
If the course_status or steps_done is NULL it means that no row exists for this student for this course in the student_courses table.
The idea is then using this in MS Access (or some other system) and have the row automatically inserted in the student_courses table once you enter a value in the NULL field.
You can't do this with an outer join alone; you first need to build the list of all student/course combinations you're interested in, then use that list in a LEFT JOIN. This can be done in a CTE or subquery using CROSS JOIN:
;WITH cte AS (SELECT DISTINCT s.id Student_ID
,s.name
,c.id Course_ID
,c.name Class_Name
FROM Students s
CROSS JOIN Courses c)
SELECT cte.*,sc.course_status
FROM cte
LEFT JOIN student_courses sc
ON cte.Course_ID = sc.course_id
AND cte.Student_ID = sc.student_id
Can also use a subquery if needs to be done in Access (not 100% on syntax in Access):
SELECT sub.*,sc.course_status
FROM (SELECT DISTINCT s.id Student_ID
,s.name
,c.id Course_ID
,c.name Class_Name
FROM Students s
CROSS JOIN Courses c
) AS sub
LEFT JOIN student_courses sc
ON sub.Course_ID = sc.course_id
AND sub.Student_ID = sc.student_id
Demo: SQL Fiddle
You want a left outer join. The first table is from the courses table and is used for the required courses (defined in the where clause).
select s.id, s.name, c.id, c.name, sc.course_status, sc.steps_done
from (courses as c left join
student_courses as sc
on sc.course_id = c.id and
sc.student_id = 1
) left join
students as s
on sc.student_id = s.id
where c.id in (<list of required courses>)
order by s.id, c.id;
I think I have all the "Access"isms in there.
Actually, the above will be missing the student name when s/he is missing a course. The following is more correct:
select s.id, s.name, c.id, c.name, sc.course_status, sc.steps_done
from (courses as c left join
student_courses as sc
on sc.course_id = c.id and
sc.student_id = 1
) cross join
students as s
where s.id = 1 and
c.id in (<list of required courses>)
order by s.id, c.id;
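To see the expected result from the question come out of a CROSS JOIN followed by a LEFT JOIN, here is a small sketch using Python's sqlite3 module. It is not Access syntax. For readability it stores `course_status` as text rather than TINYINT, and the row for student 2 is an invented extra that shows why the join must match on `student_id` as well as `course_id`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE students (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE courses  (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE student_courses (
    student_id INTEGER, course_id INTEGER,
    course_status TEXT,          -- text here for readability (assumption)
    steps_done INTEGER,
    PRIMARY KEY (student_id, course_id));

INSERT INTO students VALUES (1, 'George Smith'), (2, 'Dana Jones'),
                            (3, 'Maria Cobblestone');
INSERT INTO courses  VALUES (1, 'Botany'), (2, 'Biology'), (3, 'Physics'),
                            (4, 'Algebra'), (5, 'Sewing');
INSERT INTO student_courses VALUES
    (1, 1, 'Not started', 0), (1, 4, 'Completed', 34), (1, 5, 'Started', 2),
    (2, 2, 'Started', 1);        -- another student's row must not leak in
""")

query = """
SELECT s.id, s.name, c.id, c.name, sc.course_status, sc.steps_done
FROM students s
CROSS JOIN courses c                  -- every student/course combination
LEFT JOIN student_courses sc
       ON sc.student_id = s.id
      AND sc.course_id  = c.id        -- both keys, or rows get mixed up
WHERE s.id = ?
ORDER BY c.id
"""
rows = conn.execute(query, (1,)).fetchall()
for row in rows:
    print(row)
```

Courses without a matching `student_courses` row come back with NULL (None in Python) for `course_status` and `steps_done`, matching the Biology and Physics rows in the question.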

Better way to demand, in SQL, that a column contains every specified value

Imagine you have two tables, with a one to many relationship.
For this example, I will suggest that there are two tables: Person, and Homes.
The person table holds a person's name and gives them an ID. The homes table holds the association of homes to a person. PID joins to "Person.ID".
And, in this tiny DB, a person can have no homes, or many homes.
I hope I drew that right.
How do I write a select, that returns everyone with every specified house type?
Let's say these are valid "Types" in the homes table:
Cottage, Main, Mansion, Spaceport.
I want to return everyone, in the Person table, who has a spaceport and a Cottage.
The best I could come up with was this:
SELECT DISTINCT( p.name ) AS name
FROM person p
INNER JOIN homes h ON h.pid = p.id
WHERE 'spaceport' in (
SELECT DISTINCT( type ) AS type
FROM homes
WHERE pid = p.id
)
AND 'cottage' in (
SELECT DISTINCT( type ) AS type
FROM homes
WHERE pid = p.id
)
That works, but I'm pretty sure there has to be a better way.
The HAVING clause here will guarantee that the persons returned have both types, not just one or the other.
SELECT p.name
FROM person p
INNER JOIN homes h
ON p.id = h.pid
AND h.type IN ('spaceport', 'cottage')
GROUP BY p.name
HAVING COUNT(DISTINCT h.type) = 2
select * from homes;
home_id  person_id  type
-------  ---------  -------
1        1          cottage
2        1          mansion
3        2          cottage
4        3          mansion
5        4          cottage
6        4          cottage
To find the id numbers of every person who has both a cottage and a mansion, group by the id number, restrict the output to cottages and mansions, and count the distinct types.
select person_id
from homes
where type in ('cottage','mansion')
group by person_id
having count(distinct type) = 2;
person_id
--
1
You can use this query in a join to get all the columns from the person table.
select person.*
from person
inner join (select person_id
from homes
where type in ('cottage','mansion')
group by person_id
having count(distinct type) = 2) T
on person.person_id = T.person_id;
Thanks to Joe for pointing out an error in my count().
Not sure about the performance on this one, but here goes:
SELECT PID FROM (
SELECT PID, COUNT(PID) cnt FROM (
SELECT DISTINCT PID, Type FROM Homes
WHERE Type IN ('Type1', 'Type2', 'Type3')
) a
GROUP BY PID
) b
WHERE b.cnt = 3
You'd have to dynamically generate your IN clause as well as the WHERE b.CNT clause.
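One way to generate the IN list and the count together is to build the placeholders from the list of wanted types and bind the values as parameters. The sketch below uses Python's sqlite3 module and the simpler `COUNT(DISTINCT ...)` form; the helper name `people_with_all_types` and the sample rows are hypothetical.

```python
import sqlite3

def people_with_all_types(conn, types):
    """Return the pids that own at least one home of every requested type."""
    wanted = sorted(set(types))      # duplicates would break the count check
    if not wanted:
        return []
    # Only the "?" placeholders are interpolated; the values are bound.
    placeholders = ", ".join("?" for _ in wanted)
    sql = f"""
        SELECT pid
        FROM homes
        WHERE type IN ({placeholders})
        GROUP BY pid
        HAVING COUNT(DISTINCT type) = ?
        ORDER BY pid
    """
    return [row[0] for row in conn.execute(sql, (*wanted, len(wanted)))]

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE homes (pid INTEGER, type TEXT);
INSERT INTO homes VALUES
    (1, 'cottage'), (1, 'spaceport'),
    (2, 'cottage'),
    (3, 'spaceport'), (3, 'cottage'), (3, 'mansion'),
    (4, 'cottage'), (4, 'cottage');   -- two cottages still count as one type
""")

both = people_with_all_types(conn, ["spaceport", "cottage"])
all_three = people_with_all_types(conn, ["cottage", "mansion", "spaceport"])
print(both, all_three)
```

Deriving the count from the same list that builds the IN clause keeps the two in sync, so they cannot drift apart when the list changes.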