SQL Using MAX and Distinct simultaneously

I'm arranging a sort of Tennis Players database, and I'd like to show each country's top-scoring player. I have the table Players with a column called Country which is the Country the player is from, and a table Rating with a column called Points which is the total number of points the player scored.
Since there are multiple players from each country, I don't know how to show the player with the maximum score from each country.
I tried the following:
select
playerstbl.FirstName, playerstbl.Country, ratingtbl.Points
from
playerstbl
join
ratingtbl on playerstbl.PlayerId = ratingtbl.PlayerId
where
ratingtbl.Points = (select MAX(ratingtbl.Points)
from ratingtbl
group by playerstbl.Country);

The following query is a somewhat non-intuitive way to answer this question. It is standard SQL though:
select p.FirstName, p.Country, r.Points
from playerstbl p join
ratingtbl r
on p.PlayerId = r.PlayerId
where not exists (select 1
from playerstbl p2 join
ratingtbl r2
on p2.PlayerId = r2.PlayerId
where p2.Country = p.Country and
r2.Points > r.Points
);
And, this structure often performs best. It gets the answer to this question: "Get me all players where there is no player in the same country with more points." That is equivalent to getting the max.
For your query to work, you need to incorporate the country into the subquery:
select p.FirstName, p.Country, r.Points
from playerstbl p join
ratingtbl r
on p.PlayerId = r.PlayerId
where r.Points = (select MAX(r2.Points)
from playerstbl p2 join
ratingtbl r2
on p2.PlayerId = r2.PlayerId
where p2.Country = p.Country
);
The where clause in the subquery refers to the outer query. This is called a "correlated subquery" and is a very powerful construct in SQL. Your original query returned an error, no doubt, saying that the subquery returned more than one row: grouping by country yields one maximum per country, and "=" can only compare against a single value. This version fixes that problem.
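To see that the two queries above agree, here is a minimal sketch that runs both against an in-memory SQLite database from Python. The table and column names follow the question; the sample rows are made up for illustration, and SQLite stands in for whatever database the asker actually uses.

```python
import sqlite3

# Hypothetical sample data; only the table and column names come from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE playerstbl (PlayerId INTEGER PRIMARY KEY, FirstName TEXT, Country TEXT);
    CREATE TABLE ratingtbl  (PlayerId INTEGER, Points INTEGER);
    INSERT INTO playerstbl VALUES (1, 'Ana', 'Spain'), (2, 'Luis', 'Spain'),
                                  (3, 'Marie', 'France'), (4, 'Paul', 'France');
    INSERT INTO ratingtbl VALUES (1, 900), (2, 1200), (3, 1500), (4, 1500);
""")

# "No player in the same country has more points."
not_exists_sql = """
    SELECT p.FirstName, p.Country, r.Points
    FROM playerstbl p JOIN ratingtbl r ON p.PlayerId = r.PlayerId
    WHERE NOT EXISTS (SELECT 1
                      FROM playerstbl p2 JOIN ratingtbl r2 ON p2.PlayerId = r2.PlayerId
                      WHERE p2.Country = p.Country AND r2.Points > r.Points)
    ORDER BY p.Country, p.FirstName
"""

# "Points equal the maximum for the player's own country" (correlated subquery).
max_sql = """
    SELECT p.FirstName, p.Country, r.Points
    FROM playerstbl p JOIN ratingtbl r ON p.PlayerId = r.PlayerId
    WHERE r.Points = (SELECT MAX(r2.Points)
                      FROM playerstbl p2 JOIN ratingtbl r2 ON p2.PlayerId = r2.PlayerId
                      WHERE p2.Country = p.Country)
    ORDER BY p.Country, p.FirstName
"""

top_not_exists = conn.execute(not_exists_sql).fetchall()
top_max = conn.execute(max_sql).fetchall()
print(top_max)
```

Both forms return every player tied for the top score, so a country can appear more than once (France does in this sample data).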

Related

Counting subquery results SQL Oracle

So the code I have is trying to count the number of ratings given to a movie per state. That's all easy done. I also need to count the number of ratings given to award winning movies, per state.
SELECT DISTINCT ad.state "State",
COUNT(r.ratingid) OVER (PARTITION BY ad.state) "Number of Ratings",
COUNT(
SELECT DISTINCT r.ratingid
FROM netflix.ratings100 r JOIN netflix.movies_awards a
ON r.movieid = a.movieid
JOIN netflix.addresses ad
ON ad.custid = r.custid
WHERE a.awardid IS NOT NULL
) OVER (PARTITION BY ad.state) "Number of Award Winning Movies Rated"
FROM netflix.addresses ad JOIN netflix.ratings100 r
ON ad.custid = r.custid
JOIN netflix.movies_awards a
ON r.movieid = a.movieid
GROUP BY "State"
The second count statement should be counting the number of ratings made where the awardID is not null. That subquery alone works, and returns distinct ratingIDs, but the thing as a whole does not work. I get ORA-00936: missing expression. Solutions?
You haven't got brackets around the subquery - you have the brackets to indicate the count, but you need an extra set to indicate that it's a subquery.
E.g;
count( (select ....) ) over ...
Moreover, you're reusing the aliases from your outer query in your inner query, plus there's nothing to correlate the subquery to your outer query, so I don't think you're going to get the results you're after.
Additionally, you've labelled a column with an identifier that's over 30 characters, so unless you're on 12.2 with the extended identifiers set, you're going to get ORA-00972: identifier is too long.
Finally, I don't think you need that subquery at all; I think you can just use a conditional count, e.g.:
SELECT DISTINCT ad.state "State",
COUNT(r.ratingid) over(PARTITION BY ad.state) "Number of Ratings",
COUNT(DISTINCT CASE WHEN a.awardid IS NOT NULL THEN r.ratingid END) over(PARTITION BY ad.state) "Num Award Winning Movies Rated"
FROM netflix.addresses ad
JOIN netflix.ratings100 r
ON ad.custid = r.custid
JOIN netflix.movies_awards a
ON r.movieid = a.movieid
GROUP BY "State";
You may not even need that distinct; it depends on your data. Hopefully you can play around with that and get it to work for your requirements.
That seems like a complicated query. This should be a plain aggregation query with a conditional count, not a correlated subquery:
SELECT ad.state, COUNT(DISTINCT r.ratingId) as num_rated,
COUNT(DISTINCT CASE WHEN a.awardId IS NOT NULL THEN r.ratingid END) as num_rated_with_award
FROM netflix.addresses ad JOIN
netflix.ratings100 r
ON ad.custid = r.custid LEFT JOIN
netflix.movies_awards a
ON r.movieid = a.movieid
GROUP BY ad.state;
Notes:
There is no reason to give a column an alias equivalent to its original name. So, as "State" is unnecessary, unless you really care about capitalization.
A movie could have more than one award, so to get the number of ratings, use count(distinct).
SELECT DISTINCT is almost never appropriate with GROUP BY.
The query has no need of window functions.
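The effect of COUNT(DISTINCT ...) with the LEFT JOIN can be checked on a small data set. This is a sketch under assumptions: it uses Python's sqlite3 module rather than Oracle, drops the netflix schema prefix, and the rows are invented so that one movie has two awards.

```python
import sqlite3

# Hypothetical rows; movie 100 has two awards, movies 101 and 102 have none.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE addresses     (custid INTEGER, state TEXT);
    CREATE TABLE ratings100    (ratingid INTEGER, custid INTEGER, movieid INTEGER);
    CREATE TABLE movies_awards (movieid INTEGER, awardid INTEGER);
    INSERT INTO addresses VALUES (1, 'CA'), (2, 'CA'), (3, 'NY');
    INSERT INTO ratings100 VALUES (10, 1, 100), (11, 2, 101),
                                  (12, 3, 100), (13, 3, 102), (14, 3, 101);
    INSERT INTO movies_awards VALUES (100, 1), (100, 2);
""")

sql = """
    SELECT ad.state,
           COUNT(DISTINCT r.ratingid) AS num_rated,
           COUNT(DISTINCT CASE WHEN a.awardid IS NOT NULL THEN r.ratingid END)
               AS num_rated_with_award,
           COUNT(r.ratingid) AS naive_count  -- inflated by movies with several awards
    FROM addresses ad
    JOIN ratings100 r ON ad.custid = r.custid
    LEFT JOIN movies_awards a ON r.movieid = a.movieid
    GROUP BY ad.state
    ORDER BY ad.state
"""

rows = conn.execute(sql).fetchall()
for row in rows:
    print(row)
```

The last column shows what happens without DISTINCT: each rating of the two-award movie is counted twice.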

Sort teams by average vote for a given jury

I have the following schema :
teams(id, name)
jury(id, name)
criteria(id, name, coefficient, jury_id)
vote(id, team_id, jury_id, value, criterion_id)
I would like to get every team and order them by average vote for a given jury.
Here is my current SQL:
SELECT teams.*,
SUM(votes.value * criteria.coefficient) / SUM(criteria.coefficient) AS rating
FROM "teams"
LEFT JOIN "votes" ON "teams"."id" = "votes"."team_id"
LEFT JOIN "criteria" ON "votes"."criterion_id" = "criteria"."id"
WHERE (votes.jury_id = 3510 OR votes.jury_id IS NULL)
GROUP BY teams.id
ORDER BY rating DESC NULLS LAST, teams.id
This works well for the following cases:
The team has a vote for the selected jury
The team has votes for the selected jury and for another jury (the other jury's votes are not taken into account)
The team has no vote at all (the team appears at the end of the list)
It DOES NOT work for the following case:
The team has votes for another jury but not for the selected jury (in this case, the team does not appear in the list)
How could I make this work?
I finally came up with the following SQL:
SELECT "teams".*
FROM (
SELECT teams.*, SUM(votes.value * criteria.coefficient) / SUM(criteria.coefficient) AS rating, teams.id
FROM "teams"
LEFT JOIN "votes" ON "teams"."id" = "votes"."team_id"
LEFT JOIN "criteria" ON "votes"."criterion_id" = "criteria"."id"
WHERE (votes.jury_id = 3613 OR votes.jury_id IS NULL)
GROUP BY teams.id
UNION
SELECT teams.*, NULL AS rating, teams.id
FROM "teams"
INNER JOIN "votes" ON "votes"."team_id" = "teams"."id"
INNER JOIN "criteria" ON "criteria"."id" = "votes"."criterion_id"
GROUP BY teams.id HAVING EVERY(votes.jury_id != 3613)
) AS teams
ORDER BY rating DESC NULLS LAST, teams.created_at
However, I could not sort on teams.id because it says the column is ambiguous. I tried replacing 'AS teams' with another alias without success, so I used another column, created_at, instead.
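A common alternative to the UNION is to move the jury filter from the WHERE clause into the ON clause of the LEFT JOIN, so that teams with votes only for other juries keep their row and simply get a NULL rating. The sketch below is not from the thread; it assumes the schema in the question (with the table called votes, as in the queries), uses invented rows, and runs on SQLite via Python rather than PostgreSQL.

```python
import sqlite3

JURY_ID = 10  # hypothetical jury id

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE teams    (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE criteria (id INTEGER PRIMARY KEY, name TEXT, coefficient INTEGER, jury_id INTEGER);
    CREATE TABLE votes    (id INTEGER PRIMARY KEY, team_id INTEGER, jury_id INTEGER,
                           value INTEGER, criterion_id INTEGER);
    INSERT INTO teams VALUES (1, 'Alpha'), (2, 'Beta'), (3, 'Gamma'), (4, 'Delta');
    INSERT INTO criteria VALUES (1, 'design', 2, 10), (2, 'speed', 1, 10), (3, 'design', 1, 20);
    INSERT INTO votes VALUES (1, 1, 10, 8, 1), (2, 1, 10, 5, 2),  -- Alpha: jury 10 only
                             (3, 2, 10, 6, 1), (4, 2, 20, 10, 3), -- Beta: both juries
                             (5, 3, 20, 9, 3);                    -- Gamma: jury 20 only
""")

# Filter in WHERE: Gamma's only row belongs to jury 20, so the whole team is dropped.
where_sql = """
    SELECT teams.id
    FROM teams
    LEFT JOIN votes ON votes.team_id = teams.id
    WHERE votes.jury_id = ? OR votes.jury_id IS NULL
    GROUP BY teams.id
    ORDER BY teams.id
"""

# Filter in ON: non-matching votes are discarded before the outer join pads with NULLs.
on_sql = """
    SELECT teams.id, teams.name,
           SUM(votes.value * criteria.coefficient) * 1.0 / SUM(criteria.coefficient) AS rating
    FROM teams
    LEFT JOIN votes ON votes.team_id = teams.id AND votes.jury_id = ?
    LEFT JOIN criteria ON criteria.id = votes.criterion_id
    GROUP BY teams.id, teams.name
    ORDER BY rating DESC, teams.id  -- SQLite puts NULLs last under DESC; PostgreSQL needs NULLS LAST
"""

ids_with_where = [row[0] for row in conn.execute(where_sql, (JURY_ID,))]
ranked = conn.execute(on_sql, (JURY_ID,)).fetchall()
print(ids_with_where)
print(ranked)
```

Because there is a single SELECT and no derived table, teams.id is no longer ambiguous in the ORDER BY.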

SQL Query using multiple JOINS without sub query

Let's say I have three tables with these columns:
Players - id, name
Events - id, name
Games - first_player_id, second_player_id, event_id.
I need the details of the players who are playing in a game that is part of an event.
I could write a query like this:
SELECT players.id, events.id as event_id,
(SELECT name as player_one_name from players where id = games.first_player_id),
(SELECT name as player_two_name from players where id = games.second_player_id),
games.id as game_id
FROM events
INNER JOIN games on events.id = games.event_id
INNER JOIN players on games.first_player_id = players.id;
Here I am using two subqueries to fetch the players' names, and it gives correct results. Can this query be optimized? For example, can I remove a subquery or an inner join?
FYI, I use PostgreSQL database.
Thanks.
If you do not want sub queries in your select statement then you must provide a join for each subset. Since your database is set oriented the two INNER JOINS would prove more efficient.
SELECT players.id, events.id as event_id,
player_one_name=player_one.name,
player_two_name=player_two.name
FROM events
INNER JOIN games on events.id = games.event_id
INNER JOIN players player_one on games.first_player_id = player_one.id
INNER JOIN players player_two on games.second_player_id = player_two.id
You must do a join for each foreign key
SELECT players_a.id, events.id as event_id,
players_a.name as player_one_name,
players_b.name as player_two_name,
games.id as game_id
FROM events
INNER JOIN games on events.id = games.event_id
INNER JOIN players players_a on games.first_player_id = players.id
INNER JOIN players players_b on games.first_player_id = players.id
The currently accepted answer is right about joining the players table twice, but mostly wrong otherwise. This would work:
SELECT e.id AS event_id
,g.id AS game_id
,p1.name AS first_player
,p2.name AS second_player
FROM events e
LEFT JOIN games g ON g.event_id = e.id
LEFT JOIN players p1 ON p1.id = g.first_player_id
LEFT JOIN players p2 ON p2.id = g.second_player_id;
Use LEFT [OUTER] JOIN to cover the cases where an event does not have a game or a game does not have both players (yet).
Use table aliases to simplify your syntax. To join the same table twice you also need at least one table alias.
After attaching an alias to a table in the FROM list, only that alias is visible in your query, not the original name of the table.
Study the manual for details.
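The double LEFT JOIN can be tried out quickly. This is a sketch with invented rows, run on SQLite from Python rather than PostgreSQL; it assumes the games table also has an id column, as the queries in the thread do.

```python
import sqlite3

# Hypothetical rows: game 2 has no second player yet, and event 2 has no games.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE players (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE events  (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE games   (id INTEGER PRIMARY KEY, first_player_id INTEGER,
                          second_player_id INTEGER, event_id INTEGER);
    INSERT INTO players VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO events  VALUES (1, 'Open'), (2, 'Cup');
    INSERT INTO games   VALUES (1, 1, 2, 1), (2, 1, NULL, 1);
""")

sql = """
    SELECT e.id     AS event_id
          ,g.id     AS game_id
          ,p1.name  AS first_player
          ,p2.name  AS second_player
    FROM   events e
    LEFT   JOIN games g    ON g.event_id = e.id
    LEFT   JOIN players p1 ON p1.id = g.first_player_id   -- players joined once per foreign key
    LEFT   JOIN players p2 ON p2.id = g.second_player_id
    ORDER  BY e.id, g.id
"""

rows = conn.execute(sql).fetchall()
for row in rows:
    print(row)
```

With INNER JOINs instead, both the incomplete game and the event without games would disappear from the result.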

SQL aggregate query error

I have 3 tables like this
player(id,name,age,teamid)
team(id,name,sponsor,totalplayer,totalchampion,boss,joindate)
playerdetail(id,playerid,position,number,allstar,joindate)
I want to select teaminfo include name,sponsor,totalplayer,totalchampion,boss,
the average age of the players, the number of the allstar players
I write the t-sql as below
SELECT T.NAME,T.SPONSOR,T.TOTALPLAYER,T.TOTALCHAMPION,T.BOSS,T.JOINDATE,
AVG(P.AGE) AS AverageAge,COUNT(D.ALLSTAR) As AllStarPlayer
FROM Team T,Player P,PlayerDetail D
WHERE T.ID=P.TID AND P.ID=D.PID
but it doesn't work, the error message is
'Column 'Team.Name' is invalid in the select list because it is not
contained in either an aggregate function or the GROUP BY clause.'
Who can help me?
Thx in advance!
Add
GROUP BY
T.NAME,T.SPONSOR,T.TOTALPLAYER,T.TOTALCHAMPION,T.BOSS,T.JOINDATE
In most RDBMSs, every selected column must either be aggregated (COUNT, AVG) or appear in the GROUP BY. MySQL is the exception: with ONLY_FULL_GROUP_BY disabled, it will pick a value for you.
Also, you should use explicit JOINs.
This is clearer, less ambiguous, and makes it more difficult to bollix your code:
SELECT
T.NAME, T.SPONSOR, T.TOTALPLAYER, T.TOTALCHAMPION, T.BOSS, T.JOINDATE,
AVG(P.AGE) AS AverageAge,
COUNT(D.ALLSTAR) As AllStarPlayer
FROM
Team T
JOIN
Player P ON T.ID=P.TEAMID
JOIN
PlayerDetail D ON P.ID=D.PLAYERID
GROUP BY
T.NAME, T.SPONSOR, T.TOTALPLAYER, T.TOTALCHAMPION, T.BOSS, T.JOINDATE;
Given that you want this data per team, and team.ID uniquely identifies team, I suggest the following:
SELECT max(T.NAME) As TeamName,
max(T.SPONSOR) As Sponsor,
max(T.TOTALPLAYER) As TotalPlayers,
max(T.TOTALCHAMPION) As TotalChampions,
max(T.BOSS) As Boss,
max(T.JOINDATE) As JoinDate,
AVG(P.AGE) AS AverageAge,
COUNT(D.PLAYERID) As AllStarPlayer
FROM Team T
join Player P on T.ID=P.TEAMID
left join PlayerDetail D on P.ID=D.PLAYERID and D.ALLSTAR = 'Y'
group by T.ID
Use:
SELECT T.NAME,T.SPONSOR,T.TOTALPLAYER,T.TOTALCHAMPION,T.BOSS,T.JOINDATE,
AVG(P.AGE) AS AverageAge,COUNT(D.ALLSTAR) As AllStarPlayer
FROM Team T
JOIN Player P ON T.ID = P.TEAMID
JOIN PlayerDetail D ON P.ID = D.PLAYERID
GROUP BY T.NAME,T.SPONSOR,T.TOTALPLAYER,T.TOTALCHAMPION,T.BOSS,T.JOINDATE
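The answers above differ in what the all-star column ends up counting. The sketch below compares the two on invented rows. It assumes, as one answer does, that allstar holds 'Y' or 'N'; it trims the team table to a few columns and runs on SQLite from Python instead of SQL Server.

```python
import sqlite3

# Hypothetical rows; only one of the three players is an all-star.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE team         (id INTEGER PRIMARY KEY, name TEXT, sponsor TEXT);
    CREATE TABLE player       (id INTEGER PRIMARY KEY, name TEXT, age INTEGER, teamid INTEGER);
    CREATE TABLE playerdetail (id INTEGER PRIMARY KEY, playerid INTEGER, allstar TEXT);
    INSERT INTO team   VALUES (1, 'Hawks', 'Acme'), (2, 'Owls', 'Zen');
    INSERT INTO player VALUES (1, 'A', 20, 1), (2, 'B', 30, 1), (3, 'C', 22, 2);
    INSERT INTO playerdetail VALUES (1, 1, 'Y'), (2, 2, 'N'), (3, 3, 'N');
""")

# COUNT(D.allstar) counts every non-NULL value, so 'N' rows are counted too.
count_non_null_sql = """
    SELECT T.name, T.sponsor, AVG(P.age) AS AverageAge, COUNT(D.allstar) AS AllStarPlayer
    FROM team T
    JOIN player P       ON T.id = P.teamid
    JOIN playerdetail D ON P.id = D.playerid
    GROUP BY T.id, T.name, T.sponsor
    ORDER BY T.id
"""

# Filtering inside the LEFT JOIN leaves NULLs for non-all-stars, which COUNT skips.
count_all_stars_sql = """
    SELECT T.name, T.sponsor, AVG(P.age) AS AverageAge, COUNT(D.playerid) AS AllStarPlayer
    FROM team T
    JOIN player P            ON T.id = P.teamid
    LEFT JOIN playerdetail D ON P.id = D.playerid AND D.allstar = 'Y'
    GROUP BY T.id, T.name, T.sponsor
    ORDER BY T.id
"""

non_null_counts = conn.execute(count_non_null_sql).fetchall()
all_star_counts = conn.execute(count_all_stars_sql).fetchall()
print(non_null_counts)
print(all_star_counts)
```

Both queries list every non-aggregated column in the GROUP BY, which is what the original error message asks for.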

Getting individual counts of a tables column after joining other tables

I'm having problems getting an accurate count of a column after joining others. When a column is joined I would still like to have a DISTINCT count of the table that it is being joined on.
A restaurant has multiple meals, meals have multiple food groups, food groups have multiple ingredients.
Given the restaurant's id, I want to be able to calculate how many meals, food groups, and ingrediants the restaurant has.
When I join the food_groups the count for meals increases as well (I understand this is natural behavior I just don't understand how to get what I need due to it.) I have tried DISTINCT and other things I have found, but nothing seems to do the trick. I would like to keep this to one query rather than splitting it up into multiple ones.
SELECT
COUNT(meals.id) AS countMeals,
COUNT(food_groups.id) AS countGroups,
COUNT(ingrediants.id) AS countIngrediants
FROM
restaurants
INNER JOIN
meals ON restaurants.id = meals.restaurant_id
INNER JOIN
food_groups ON meals.id = food_groups.meal_id
INNER JOIN
ingrediants ON food_groups.id = ingrediants.food_group_id
WHERE
restaurants.id='43'
GROUP BY
restaurants.id
Thanks!
The DISTINCT goes inside the count
SELECT
COUNT(DISTINCT meals.id) AS countMeals,
COUNT(DISTINCT food_groups.id) AS countGroups,
COUNT(DISTINCT ingrediants.id) AS countIngrediants
FROM
restaurants
INNER JOIN
meals ON restaurants.id = meals.restaurant_id
INNER JOIN
food_groups ON meals.id = food_groups.meal_id
INNER JOIN
ingrediants ON food_groups.id = ingrediants.food_group_id
WHERE
restaurants.id='43'
GROUP BY
restaurants.id
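You're going to have to do subqueries, I think. Each one has to join back up to the restaurant, since an alias defined inside one subquery is not visible in the others. Something like:
SELECT
(SELECT COUNT(*) FROM meals m WHERE m.restaurant_id = r.id) AS countMeals,
(SELECT COUNT(*) FROM food_groups fg JOIN meals m ON fg.meal_id = m.id
WHERE m.restaurant_id = r.id) AS countGroups,
(SELECT COUNT(*) FROM ingrediants i JOIN food_groups fg ON i.food_group_id = fg.id
JOIN meals m ON fg.meal_id = m.id
WHERE m.restaurant_id = r.id) AS countIngrediants
FROM restaurants r
WHERE r.id = '43'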
Where were you putting your DISTINCT and on which columns? When using COUNT() you need to do the distinct inside the parentheses and you need to do it over a single column that is distinct for what you're trying to count. For example:
SELECT
COUNT(DISTINCT M.id) AS count_meals,
COUNT(DISTINCT FG.id) AS count_food_groups,
COUNT(DISTINCT I.id) AS count_ingredients
FROM
Restaurants R
INNER JOIN Meals M ON M.restaurant_id = R.id
INNER JOIN Food_Groups FG ON FG.meal_id = M.id
INNER JOIN Ingredients I ON I.food_group_id = FG.id
WHERE
R.id='43'
Since you're selecting for a single restaurant, you shouldn't need the GROUP BY. Also, unless this is in a non-English language, I think you misspelled ingredients.
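The inflation caused by the joins, and the way COUNT(DISTINCT ...) undoes it, shows up on even a tiny data set. The following is a sketch with invented rows on SQLite via Python; it keeps the question's table name ingrediants so it lines up with the original queries.

```python
import sqlite3

# Hypothetical rows: 2 meals, 3 food groups, 4 ingredients for restaurant 43.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE restaurants (id INTEGER PRIMARY KEY);
    CREATE TABLE meals       (id INTEGER PRIMARY KEY, restaurant_id INTEGER);
    CREATE TABLE food_groups (id INTEGER PRIMARY KEY, meal_id INTEGER);
    CREATE TABLE ingrediants (id INTEGER PRIMARY KEY, food_group_id INTEGER);
    INSERT INTO restaurants VALUES (43);
    INSERT INTO meals       VALUES (1, 43), (2, 43);
    INSERT INTO food_groups VALUES (1, 1), (2, 1), (3, 2);
    INSERT INTO ingrediants VALUES (1, 1), (2, 1), (3, 2), (4, 3);
""")

# {d} is filled with either "" or "DISTINCT " to compare the two behaviours.
template = """
    SELECT COUNT({d}meals.id), COUNT({d}food_groups.id), COUNT({d}ingrediants.id)
    FROM restaurants
    INNER JOIN meals       ON restaurants.id = meals.restaurant_id
    INNER JOIN food_groups ON meals.id = food_groups.meal_id
    INNER JOIN ingrediants ON food_groups.id = ingrediants.food_group_id
    WHERE restaurants.id = ?
"""

# Without DISTINCT every count equals the number of joined rows.
plain_counts = conn.execute(template.format(d=""), (43,)).fetchone()
# With DISTINCT each id is counted once, however often the joins repeat it.
distinct_counts = conn.execute(template.format(d="DISTINCT "), (43,)).fetchone()
print(plain_counts, distinct_counts)
```

Note that with INNER JOINs a meal without any food group, or a food group without any ingredient, drops out of the joined rows and is not counted at all.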