SQL inquiry, tried absolutely everything I know, sql, four tables - sql

I can't get this query to work; I've tried everything I can think of.
TABLE: ARTIST
JMBG NAME AGE ADRESA
--------------------------------------
J1 Ladygaga 35 HOLIVUDHILZ
J2 DUSKO 13 BB
J3 EMINEM 40 REVOLUCIJA 5
J4 BAGI 22 KURAC
J5 MARKO 33 ULICA
TABLE:HALL
DID CAPACITY CITY
---------------------------------
D1 500 PODGORICA
D2 300 NIS
D3 1000 BAR
D4 2000 NEWYORK
D5 750 BEOGRAD
TABLE: CITY
-----------------------------------------
BAR montenegro 5000
BEOGRAD Serbia 2000000
BUDVA montenegro 50000
NEWYORK AMERICA 7000000
NIS Serbia 1000000
PODGORICA montenegro 250000
TABLE: CONCERT
ID JMBG HALL
------------------------
K1 J3 D4
K2 J4 D1
K3 J1 D1
K4 J1 D5
K5 J1 D1
K6 J3 D1
K7 J5 D1
The query is: find the countries in which the artist who has held the most concerts has performed.
I have spent a lot of time and energy on this, and I would greatly appreciate help from someone with experience who doesn't find it too difficult.
I tried this:
SELECT DISTINCT COUNTRY FROM CITY G, HALL D, CONCERT K
WHERE K.DID = D.DID AND D.NAZIV = G.NAZIV AND EXISTS(
SELECT JMBG FROM CONCERT K1,HALL D1, CITY G1
WHERE K.KID=K1.KID
GROUP BY JMBG
HAVING COUNT (*) >= ALL(SELECT COUNT(*) FROM CONCERT
GROUP BY JMBG))

Break it down. The artist with the most held concerts... Which artist held the most concerts? (We're going to assume that we're interested in the total number of concerts held overall, in all countries, not the number of concerts held in a particular country.)
How many concerts did each artist hold?
SELECT c.jmbg
, COUNT(1) AS cnt
FROM concert c
GROUP BY c.jmbg
Which artist held the most concerts? MySQL and MS SQL Server both have some convenient shortcuts we can use here. A question we should ask here: what if two or more artists held the same number of concerts? Do we want to return both (or all) of those artists, or just return one of them? Which one? (We'd prefer the query to be deterministic... to return the same result given the same rows in the tables.)
Assuming that we want to return just one artist that held the most concerts...
For MySQL:
SELECT c.jmbg
FROM concert c
GROUP BY c.jmbg
ORDER BY COUNT(1) DESC, c.jmbg DESC
LIMIT 1
For SQL Server:
SELECT TOP 1 c.jmbg
FROM concert c
GROUP BY c.jmbg
ORDER BY COUNT(1) DESC, c.jmbg DESC
So that gets us the artist.
The other part of the "inquiry"... which countries did the artist hold concerts in.
Given a particular artist, we could write a query that performs join operations on the concert, hall and city tables. We'll just take a guess at the name of that first column in the city table (since it isn't provided in the question).
SELECT i.country
FROM city i
JOIN hall h
ON h.city = i.cid
JOIN concert o
ON o.hall = h.did
WHERE o.jmbg = 'J1' -- the jmbg of Ladygaga in the artist table
GROUP BY i.country
To combine the two queries, we could use the first as a subquery. My preference is to use an inline view.
SELECT g.country
FROM city g
JOIN hall h
ON h.city = g.cid
JOIN concert o
ON o.hall = h.did
JOIN (
SELECT c.jmbg
FROM concert c
GROUP BY c.jmbg
ORDER BY COUNT(1) DESC, c.jmbg DESC
LIMIT 1
) m
ON m.jmbg = o.jmbg
GROUP BY g.country
Obviously, there are other query patterns that will return an equivalent result.
As I noted in a comment on the question, the specification for this "inquiry" is a bit ambiguous, as to what is meant by "where the artist with the most held concerts has performed in".
There is another interpretation of that specification. If we're interested in getting and analyzing a count of "how many concerts were held in each country by each artist", that's a different query.
FOLLOWUP
"... not allowed to use TOP DESC"
Then just write the query differently. Here's a different way to get the "largest number of concerts held by any artist", and use that to get all the artists that held that number of concerts.
SELECT n.jmbg
FROM ( -- largest number of concerts by artist
SELECT MAX(p.cnt) AS maxcnt
FROM (
SELECT COUNT(1) AS cnt
FROM concert d
GROUP BY d.jmbg
) p
) o
JOIN ( -- count of concerts by artist
SELECT c.jmbg
, COUNT(1) AS cnt
FROM concert c
GROUP BY c.jmbg
) n
ON n.cnt = o.maxcnt
Since that has the potential to return more than one row (more than one artist), your outer query may want to return a list of countries for each of the returned artists. That is to say, rather than just GROUP BY g.country, you'll likely want to return the artist in the SELECT list, and
GROUP BY m.jmbg, g.country
ORDER BY m.jmbg, g.country
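To check the tie-aware approach against the sample data from the question, here is a minimal sketch using Python's built-in sqlite3 module. The city column names (name, country, population) are assumptions, since the question does not give them, and SQLite stands in for MySQL or SQL Server.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Column names of city are assumed; the question does not list them.
CREATE TABLE city    (name TEXT PRIMARY KEY, country TEXT, population INTEGER);
CREATE TABLE hall    (did TEXT PRIMARY KEY, capacity INTEGER, city TEXT);
CREATE TABLE concert (id TEXT PRIMARY KEY, jmbg TEXT, hall TEXT);

INSERT INTO city VALUES
  ('BAR','montenegro',5000), ('BEOGRAD','Serbia',2000000),
  ('BUDVA','montenegro',50000), ('NEWYORK','AMERICA',7000000),
  ('NIS','Serbia',1000000), ('PODGORICA','montenegro',250000);
INSERT INTO hall VALUES
  ('D1',500,'PODGORICA'), ('D2',300,'NIS'), ('D3',1000,'BAR'),
  ('D4',2000,'NEWYORK'), ('D5',750,'BEOGRAD');
INSERT INTO concert VALUES
  ('K1','J3','D4'), ('K2','J4','D1'), ('K3','J1','D1'), ('K4','J1','D5'),
  ('K5','J1','D1'), ('K6','J3','D1'), ('K7','J5','D1');
""")

query = """
SELECT n.jmbg, g.country
FROM ( -- count of concerts by artist
       SELECT c.jmbg, COUNT(*) AS cnt
       FROM concert c
       GROUP BY c.jmbg
     ) n
JOIN ( -- largest number of concerts by any artist
       SELECT MAX(p.cnt) AS maxcnt
       FROM (SELECT COUNT(*) AS cnt FROM concert d GROUP BY d.jmbg) p
     ) o
  ON n.cnt = o.maxcnt
JOIN concert k ON k.jmbg = n.jmbg
JOIN hall h    ON h.did  = k.hall
JOIN city g    ON g.name = h.city
GROUP BY n.jmbg, g.country
"""

rows = conn.execute(query).fetchall()
for jmbg, country in rows:
    print(jmbg, country)
```

With this data only J1 reaches the maximum of three concerts, so the rows returned are J1's countries. If two artists tied, both would appear, each with their own list of countries.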

This looks like a basic, school-type question. This answer will give you some hints, but you need to work it out for yourself.
JOIN is your friend; see the sources below:
JOIN - MySQL
JOIN - SQL Server
What you need to do:
join CONCERT table with HALL table by HALL ID
join HALL table to CITY table by CITY name
count the country appearances, or sum the hall capacity (whichever you need), grouped by artist
order descending by the sum of count if you need it
Good luck

Related

How to make sure result pairs are unique - without using distinct?

I have three tables I want to iterate over. The tables are pretty big so I will show a small snippet of the tables. First table is Students:
id  name         address
1   John Smith   New York
2   Rebeka Jens  Miami
3   Amira Sarty  Boston
Second one is TakingCourse. This is the course the students are taking, so student_id is the id of the one in Students.
id  student_id  course_id
20  1           26
19  2           27
18  3           28
Last table is Courses. The id is the same as the course_id in the previous table. These are the courses the students are following and looks like this:
id  type
26  History
27  Maths
28  Science
I want to return a table with the location (address) and the type of courses that are taken there. So the results table should look like this:
address  type
The pairs should be unique, and that is what's going wrong. I tried this:
select S.address, C.type
from Students S, Courses C, TakingCourse TC
where TC.course_id = C.id
and S.id = TC.student_id
And this does work, but the pairs are not all unique. I tried select distinct and it's still the same.
Multiple students can (and will) reside at the same address. So don't expect unique results from this query.
Only an overview is needed, so that's why I don't want duplicates.
So fold duplicates. Simple way with DISTINCT:
SELECT DISTINCT s.address, c.type
FROM students s
JOIN takingcourse t ON s.id = t.student_id
JOIN courses c ON t.course_id = c.id;
Or to avoid DISTINCT (why would you for this task?) and, optionally, get counts, too:
SELECT c.type, s.address, count(*) AS ct
FROM students s
JOIN takingcourse t ON s.id = t.student_id
JOIN courses c ON t.course_id = c.id
GROUP BY c.type, s.address
ORDER BY c.type, s.address;
A missing UNIQUE constraint on takingcourse(student_id, course_id) could be an additional source of duplicates. See:
How to implement a many-to-many relationship in PostgreSQL?
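To see the two forms side by side, here is a small sketch using Python's sqlite3 module as a stand-in database. The fourth student and the extra takingcourse row are hypothetical, added only so that one (address, type) pair occurs twice; the UNIQUE constraint mirrors the one suggested above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE students (id INTEGER PRIMARY KEY, name TEXT, address TEXT);
CREATE TABLE courses  (id INTEGER PRIMARY KEY, type TEXT);
CREATE TABLE takingcourse (
  id INTEGER PRIMARY KEY,
  student_id INTEGER REFERENCES students(id),
  course_id  INTEGER REFERENCES courses(id),
  UNIQUE (student_id, course_id)  -- one enrolment per student and course
);
INSERT INTO students VALUES
  (1,'John Smith','New York'), (2,'Rebeka Jens','Miami'),
  (3,'Amira Sarty','Boston'),
  (4,'Hypothetical Student','New York');  -- not in the question
INSERT INTO courses VALUES (26,'History'), (27,'Maths'), (28,'Science');
INSERT INTO takingcourse VALUES
  (20,1,26), (19,2,27), (18,3,28),
  (17,4,26);                              -- not in the question
""")

joins = """
FROM students s
JOIN takingcourse t ON s.id = t.student_id
JOIN courses c      ON t.course_id = c.id
"""

# Plain join: one row per enrolment, so New York/History appears twice.
plain = conn.execute("SELECT s.address, c.type " + joins).fetchall()

# DISTINCT folds identical (address, type) pairs.
distinct = conn.execute("SELECT DISTINCT s.address, c.type " + joins).fetchall()

# GROUP BY folds them too and can report how many rows were folded.
grouped = conn.execute(
    "SELECT c.type, s.address, COUNT(*) AS ct " + joins +
    "GROUP BY c.type, s.address ORDER BY c.type, s.address"
).fetchall()

print(len(plain), len(distinct))
print(grouped)
```

If SELECT DISTINCT still appears to return duplicates on real data, the values probably differ in ways that are hard to see, such as trailing spaces or letter case in address.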

SQL MS Access join help average not correct

I am trying to create a query that displays the average attendance by conference when at least one team was in a game.
Relationships
This is very close to what I'm looking for:
SELECT
Conference.ConferenceName,
AVG(Game.Attendance) AS AVG_ATT
FROM
(
Conference
INNER JOIN School ON Conference.[ConferenceID] = School.[ConferenceID]
)
INNER JOIN Game ON
(
School.[SchoolID] = Game.[Team1]
OR
School.[SchoolID] = Game.[Team2]
)
GROUP BY
Conference.ConferenceName;
The problem is that if a game has two teams from the same conference, it counts the attendance twice when it should count it only once.
Consider two games:
game1
Team1- Wisconsin
Conference - BIG10
Team2 - Michigan
Conference - BIG10
Attendance - 100,000
game2
Team1- Wisconsin
Conference - BIG10
Team2 - USC
Conference - PAC12
Attendance - 65,000
Results
BIG10-correct 82,500
PAC12 65,000
BIG10-Actual 88,333
Get a distinct list of the games by conference in a derived query, then do your average.
SELECT
ConferenceName,
AVG(Attendance) AS AVG_ATT
FROM
(
SELECT DISTINCT
GameID,
Conference.ConferenceName,
Game.Attendance
FROM
(
Conference
INNER JOIN School ON Conference.[ConferenceID] = School.[ConferenceID]
)
INNER JOIN Game ON
(
School.[SchoolID] = Game.[Team1]
OR
School.[SchoolID] = Game.[Team2]
)
) DerivedDistinctGamesAndConferences
GROUP BY
ConferenceName;
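As a quick check of the derived-table idea on the two games from the question, here is a sketch using Python's sqlite3 module rather than Access. The IDs and the SchoolName column are made up for illustration, and the Access-style bracketed join is written as plain JOINs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- IDs and SchoolName are illustrative; only the game facts come from the question.
CREATE TABLE conference (ConferenceID INTEGER PRIMARY KEY, ConferenceName TEXT);
CREATE TABLE school (SchoolID INTEGER PRIMARY KEY, SchoolName TEXT, ConferenceID INTEGER);
CREATE TABLE game (GameID INTEGER PRIMARY KEY, Team1 INTEGER, Team2 INTEGER, Attendance INTEGER);

INSERT INTO conference VALUES (1,'BIG10'), (2,'PAC12');
INSERT INTO school VALUES (1,'Wisconsin',1), (2,'Michigan',1), (3,'USC',2);
INSERT INTO game VALUES (1,1,2,100000), (2,1,3,65000);
""")

joined = """
FROM conference c
JOIN school s ON s.ConferenceID = c.ConferenceID
JOIN game g   ON s.SchoolID IN (g.Team1, g.Team2)
"""

# Naive version: game 1 matches both Wisconsin and Michigan, so it is counted twice.
naive = dict(conn.execute(
    "SELECT c.ConferenceName, AVG(g.Attendance) " + joined +
    "GROUP BY c.ConferenceName"
).fetchall())

# Derived-table version: reduce to one row per (game, conference) before averaging.
fixed = dict(conn.execute(
    "SELECT ConferenceName, AVG(Attendance) FROM ("
    " SELECT DISTINCT g.GameID, c.ConferenceName, g.Attendance " + joined +
    ") d GROUP BY ConferenceName"
).fetchall())

print(naive)
print(fixed)
```

GameID has to be part of the DISTINCT list; without it, two different games with the same attendance in the same conference would collapse into one row.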
Use UNION to discard the duplicates first, before doing the average:
SELECT
Game.GameID,
Game.Attendance,
Conference.ConferenceName
FROM
(
Conference
INNER JOIN School ON Conference.[ConferenceID] = School.[ConferenceID]
)
INNER JOIN Game ON
(
School.[SchoolID] = Game.[Team1] -- TEAM 1
)
UNION
SELECT
Game.GameID,
Game.Attendance,
Conference.ConferenceName
FROM
(
Conference
INNER JOIN School ON Conference.[ConferenceID] = School.[ConferenceID]
)
INNER JOIN Game ON
(
School.[SchoolID] = Game.[Team2] -- TEAM 2
)
So query1 UNION query2 gives you:
GameID Attendance Conference
1 100,000 BIG10 < from query1 (Team1)
2 65,000 BIG10 < from query1 (Team1)
1 100,000 BIG10 < from query2 (Team2); duplicate of the first row, removed by UNION
2 65,000 PAC12 < from query2 (Team2)
Then you calculate the average over this result.
I am not exactly sure in terms of coding, but conceptually, because you need to count a game only once when at least one team from a conference played in it, you may need some "if...then" logic in a VBA script to get the specific calculation you want. On its own, the SQL OR condition isn't enough.

SQL Group By Before Averaging

Currently I have a query that finds the average of points scored depending upon opponent.
Here is the query:
SELECT NBAGameLog.Opp, AVG(NBAGameLog.Points)
FROM Players INNER JOIN
NBAGameLog
ON Players.Player_ID = NBAGameLog.Player_ID
WHERE NBAGameLog.Date_Played Between Date()-15 And Date() AND
Players.Position = "C"
GROUP BY NBAGameLog.Opp;
The issue happens if I have something like this:
NBAGameLog table:
Player_ID Team Opp Points Position
1 MIA ATL 15 C
2 MIA ATL 25 C
3 BOS ATL 23 C
The result from this would be:
Position Opp Average
C ATL 21
But I'd like the query to first group together the Teams. So instead of (15+25+23)/3, it would see that the first two players were on the same team, so only count that as one and do (40+23)/2
Is this possible?
You can do this using a subquery, aggregating first in the subquery (summing the points per team) and then again in the outer query:
SELECT t.Opp, AVG(t.Points) AS AvgPoints
FROM (SELECT gl.Team, gl.Opp, SUM(gl.Points) AS Points
FROM Players p INNER JOIN
NBAGameLog gl
ON p.Player_ID = gl.Player_ID
WHERE gl.Date_Played Between Date()-15 And Date() AND
p.Position = "C"
GROUP BY gl.Team, gl.Opp
) t
GROUP BY t.Opp;
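The two-level aggregation can be checked against the three sample rows with a short sketch in Python's sqlite3 module. To keep it small it assumes a single table and leaves out the Players join and the date filter; the inner query sums points per team, which is what the (40+23)/2 in the question asks for.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Simplified: Position lives in the log table here, and there is no date column.
CREATE TABLE NBAGameLog (Player_ID INTEGER, Team TEXT, Opp TEXT, Points INTEGER, Position TEXT);
INSERT INTO NBAGameLog VALUES
  (1,'MIA','ATL',15,'C'),
  (2,'MIA','ATL',25,'C'),
  (3,'BOS','ATL',23,'C');
""")

# Single-level average: every player row counts once -> (15+25+23)/3.
flat = conn.execute(
    "SELECT Opp, AVG(Points) FROM NBAGameLog WHERE Position = 'C' GROUP BY Opp"
).fetchall()

# Two-level: total per team first, then average the team totals -> (40+23)/2.
two_level = conn.execute("""
SELECT t.Opp, AVG(t.Points)
FROM (SELECT Team, Opp, SUM(Points) AS Points
      FROM NBAGameLog
      WHERE Position = 'C'
      GROUP BY Team, Opp) t
GROUP BY t.Opp
""").fetchall()

print(flat)
print(two_level)
```

If a team meets the same opponent more than once in the date window, grouping the inner query by the game date as well would keep each game separate.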

sql select records with matching subsets

There are two sets of employees: managers and grunts.
For each manager, there's a table manager_meetings that holds a list of which meetings each manager attended. A similar table grunt_meetings holds a list of which meetings each grunt attended.
So:
manager_meetings grunt_meetings
managerID meetingID gruntID meetingID
1 a 4 a
1 b 4 b
1 c 4 c
2 a 4 d
2 b 5 a
3 c 5 b
3 d 5 c
3 e 6 a
6 c
7 b
7 a
The owner doesn't like it when a manager and a grunt know exactly the same information. It makes his head hurt. He wants to identify this situation, so he can demote the manager to a grunt, or promote the grunt to a manager, or take them both golfing. The owner likes to golf.
The task is to list every combination of manager and grunt where both attended exactly the same meetings. If the manager attended more meeting than the grunt, no match. If the grunt attended more meetings than the manager, no match.
The expected results here are:
ManagerID GruntID
2 7
1 5
...because manager 2 and grunt 7 both attended (a,b), while manager 1 and grunt 5 both attended (a,b,c).
I can solve it in a clunky way, by pivoting up the subset of meetings in a subquery into XML, and comparing each grunt's XML list to each manager's XML. But that's horrible, and also I have to explain to the owner what XML is. And I don't like golfing.
Is there some better way to do "WHERE {subset1} = {subset2}"? It feels like I'm missing some clever kind of join.
SQL Fiddle
Here is a version that works:
select m.mId, g.gId, count(*) --select m.mid, g.gid, mm.meetingid, gm.meetingid as gmm
from manager m cross join
grunt g left outer join
(select mm.*, count(*) over (partition by mm.mid) as cnt
from manager_meeting mm
) mm
on mm.mid = m.mId full outer join
(select gm.*, count(*) over (partition by gm.gid) as cnt
from grunt_meeting gm
) gm
on gm.gid = g.gid and gm.meetingid = mm.meetingid
group by m.mId, g.gId, mm.cnt, gm.cnt
having count(*) = mm.cnt and mm.cnt = gm.cnt;
The string comparison method is shorter, perhaps easier to understand, and probably faster.
EDIT:
For your particular case of getting exact matches, the query can be simplified:
select mm.mId, gm.gId
from (select mm.*, count(*) over (partition by mm.mid) as cnt
from manager_meeting mm
) mm join
(select gm.*, count(*) over (partition by gm.gid) as cnt
from grunt_meeting gm
) gm
on gm.meetingid = mm.meetingid and
mm.cnt = gm.cnt
group by mm.mId, gm.gId
having count(*) = max(mm.cnt);
This might be more competitive with the string version, both in terms of performance and clarity.
It counts the number of matches between a grunt and a manager. It then checks that this is all the meetings for each.
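The same counting idea can be tried on the sample data with Python's sqlite3 module. This sketch is a variant, not the query above: it replaces the window functions with correlated COUNT(*) subqueries so it runs on older SQLite versions too, and it assumes each (person, meeting) pair is stored only once.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE manager_meeting (mID INTEGER, meetingID TEXT, PRIMARY KEY (mID, meetingID));
CREATE TABLE grunt_meeting   (gID INTEGER, meetingID TEXT, PRIMARY KEY (gID, meetingID));
INSERT INTO manager_meeting VALUES
  (1,'a'),(1,'b'),(1,'c'),(2,'a'),(2,'b'),(3,'c'),(3,'d'),(3,'e');
INSERT INTO grunt_meeting VALUES
  (4,'a'),(4,'b'),(4,'c'),(4,'d'),(5,'a'),(5,'b'),(5,'c'),
  (6,'a'),(6,'c'),(7,'b'),(7,'a');
""")

query = """
SELECT mm.mID, gm.gID
FROM manager_meeting mm
JOIN grunt_meeting gm ON gm.meetingID = mm.meetingID
GROUP BY mm.mID, gm.gID
-- The number of shared meetings must equal each side's total.
HAVING COUNT(*) = (SELECT COUNT(*) FROM manager_meeting x WHERE x.mID = mm.mID)
   AND COUNT(*) = (SELECT COUNT(*) FROM grunt_meeting y WHERE y.gID = gm.gID)
ORDER BY mm.mID, gm.gID
"""

pairs = conn.execute(query).fetchall()
print(pairs)
```

Because the join is on meetingID, a manager and a grunt who attended no meetings at all never appear in the result.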
An attempt at avenging Aaron's defeat – a solution using EXCEPT:
SELECT
m.mID,
g.gID
FROM
manager AS m
INNER JOIN
grunt AS g
ON NOT EXISTS (
SELECT meetingID
FROM manager_meeting
WHERE mID = m.mID
EXCEPT
SELECT meetingID
FROM grunt_meeting
WHERE gID = g.gID
)
AND NOT EXISTS (
SELECT meetingID
FROM grunt_meeting
WHERE gID = g.gID
EXCEPT
SELECT meetingID
FROM manager_meeting
WHERE mID = m.mID
);
Basically, subtract a grunt's set of meetings from a manager's set of meetings, then the other way round. If neither result contains rows, the grunt and the manager attended the same set of meetings.
Please note that this query will match managers and grunts that never attended a single meeting.
An alternative version, which requires another table. Basically, we give each meeting a distinct power of two as its 'value', then sum every manager's meeting values and every grunt's meeting values. Where the sums are the same, we have a match.
It should be possible to make the meeting_values table a TVF, but this is a little bit simpler.
SQL Fiddle
Additional table:
CREATE TABLE meeting_values (value INT, meetingID CHAR(1));
INSERT INTO meeting_values VALUES
(1,'a'),(2,'b'),(4,'c'),(8,'d'),(16,'e');
And the query:
SELECT managemeets.mID, gruntmeets.gID
FROM
( SELECT gm.gID, sum(value) AS meeting_totals
FROM grunt_meeting gm
INNER JOIN
meeting_values mv ON gm.meetingID = mv.meetingID
GROUP BY gm.gID
) gruntmeets
INNER JOIN
( SELECT mm.mID, sum(value) AS meeting_totals
FROM manager_meeting mm
INNER JOIN
meeting_values mv ON mm.meetingID = mv.meetingID
GROUP BY mm.mID
) managemeets ON gruntmeets.meeting_totals = managemeets.meeting_totals

Trying to find all the cities that there is not a direct flight to from a city (PostgreSQL)

I'm trying to write a query that determines which cities I can't fly to directly from a city, say London. Given the schema:
cities:
| c_id | city_name |
flights:
| f_id | departure_city_id | destination_city_id |
currently my query returns the opposite, i.e. it returns the cities for which there is a direct flight from London
SELECT c2.city_name as "City"
FROM flights AS f
JOIN cities AS c2 ON f.destination_city_id != c2.c_id
JOIN cities AS c ON c.c_id = c.c_id
WHERE c.city_name = 'London'
AND c.c_id != c2.c_id
AND f.departure_city_id = c.c_id;
I would have thought it would be easy to change it to get what I want.
I thought changing the third line to
JOIN cities AS c2 ON f.destination_city_id = c2.c_id
Would have done the trick but it didn't. Any help?
cities I can't fly to directly from a city, say London.
Meaning one can fly there, just not directly from London. So JOIN (not LEFT JOIN) city to flight via destination_city_id:
SELECT DISTINCT c.city_name
FROM cities c
JOIN flights f ON f.destination_city_id = c.c_id
JOIN cities c2 ON c2.c_id = f.departure_city_id
WHERE c2.city_name <> 'London';
Then I only have to exclude flights originating from London, apply DISTINCT to get unique city names and we are done.
A more sophisticated interpretation of this question would be:
"Cities you can fly to from London, just not directly"
But since this looks like basic homework I don't assume they'd expect a recursive query from you.
Try something like:
SELECT *
FROM cities c
WHERE c.c_id NOT IN
(SELECT f.destination_city_id
FROM flights f
JOIN cities c2 ON f.departure_city_id = c2.c_id
WHERE c2.city_name = 'London')
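For comparison, here is a sketch of the same anti-join written with NOT EXISTS instead of NOT IN, run through Python's sqlite3 module as a stand-in for PostgreSQL. The cities and flights are hypothetical sample rows. Paris is reachable both from London and from Berlin, which is the case a plain filter on the departure city would get wrong.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE cities  (c_id INTEGER PRIMARY KEY, city_name TEXT);
CREATE TABLE flights (f_id INTEGER PRIMARY KEY,
                      departure_city_id INTEGER, destination_city_id INTEGER);
-- Hypothetical sample data, not from the question.
INSERT INTO cities VALUES (1,'London'), (2,'Paris'), (3,'Berlin'), (4,'Rome');
INSERT INTO flights VALUES
  (1,1,2),  -- London -> Paris
  (2,2,3),  -- Paris  -> Berlin
  (3,3,2),  -- Berlin -> Paris
  (4,2,4);  -- Paris  -> Rome
""")

query = """
SELECT c.city_name
FROM cities c
WHERE c.city_name <> 'London'            -- leave out the origin itself
  AND NOT EXISTS (                       -- no direct flight London -> c
        SELECT 1
        FROM flights f
        JOIN cities dep ON dep.c_id = f.departure_city_id
        WHERE dep.city_name = 'London'
          AND f.destination_city_id = c.c_id)
ORDER BY c.city_name
"""

no_direct = [name for (name,) in conn.execute(query)]
print(no_direct)
```

NOT EXISTS is used here because NOT IN returns no rows at all if the subquery yields a NULL destination_city_id; with a NOT NULL column the two forms behave the same.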