Shorten queries with multiple joins - sql

Is there a way to shorten this query?
This query returns the result I want, but I feel like it is too long. Are there any tips for writing an efficient query with multiple joins?
SELECT home.team_id, home.name, ((home.hwins+away.awins)*1.0/(home.hwins+away.awins+draw.nowin)) as winratio
FROM(
SELECT m.home_team_api_id AS team_id, t.team_long_name AS name, COUNT(m.id) as hwins
FROM match m
LEFT JOIN team t
ON m.home_team_api_id=t.team_api_id
WHERE m.home_team_goal > m.away_team_goal
GROUP BY m.home_team_api_id) AS home
LEFT JOIN(
SELECT m.away_team_api_id AS team_id, t.team_long_name AS name, COUNT(m.id) as awins
FROM match m
LEFT JOIN team t
ON m.away_team_api_id=t.team_api_id
WHERE m.away_team_goal > m.home_team_goal
GROUP BY m.away_team_api_id) AS away
ON home.team_id=away.team_id
LEFT JOIN(
SELECT m.away_team_api_id AS team_id, t.team_long_name AS name, COUNT(m.id) as nowin
FROM match m
LEFT JOIN team t
ON m.away_team_api_id=t.team_api_id
WHERE m.away_team_goal = m.home_team_goal
GROUP BY m.away_team_api_id) AS draw
ON home.team_id=away.team_id
GROUP BY home.team_id
ORDER BY winratio DESC
LIMIT 10;
The result:
team_id  name               winratio
8634     FC Barcelona       0.8897338403041825
8633     Real Madrid CF     0.8871595330739299
9925     Celtic             0.8825910931174089
9823     FC Bayern Munich   0.8693693693693694
10260    Manchester United  0.8687782805429864
9885     Juventus           0.8669724770642202
9772     SL Benfica         0.8644859813084113
9773     FC Porto           0.8632075471698113
8593     Ajax               0.861904761904762
9931     FC Basel           0.861244019138756

You can use conditional aggregation, which counts a row only when a condition is satisfied. This saves you from writing three subqueries.
SELECT team_id, name, ((hwins+awins)*1.0/(hwins+awins+nowin)) as winratio
FROM(SELECT m.home_team_api_id AS team_id,
t.team_long_name AS name,
COUNT(CASE WHEN m.home_team_goal > m.away_team_goal
THEN m.id END) AS hwins,
COUNT(CASE WHEN m.away_team_goal > m.home_team_goal
THEN m.id END) AS awins,
COUNT(CASE WHEN m.away_team_goal = m.home_team_goal
THEN m.id END) AS nowin
FROM match m
LEFT JOIN team t
ON m.home_team_api_id=t.team_api_id
GROUP BY m.home_team_api_id, t.team_long_name
) AS all_matches
ORDER BY winratio DESC
LIMIT 10;
Or if you wish to do it without subqueries:
SELECT m.home_team_api_id AS team_id,
t.team_long_name AS name,
(COUNT(CASE WHEN m.home_team_goal <> m.away_team_goal
THEN m.id END) * 1.0) / COUNT(*) AS winratio
FROM match m
LEFT JOIN team t
ON m.home_team_api_id=t.team_api_id
GROUP BY m.home_team_api_id, t.team_long_name
Note: The GROUP BY clauses in your subqueries are missing t.team_long_name, which you select without aggregating, and the final outer GROUP BY is not needed. In general, you want this clause only when you're using aggregate functions.
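As a minimal sketch of conditional aggregation, the snippet below runs a small query against an in-memory SQLite database from Python. The four matches are made-up sample rows, not the real data set. The per_team derived table is an assumption that goes beyond the queries above: it lists every match once from the home side and once from the away side, so that wins and draws are counted for a team regardless of where it played.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE "match" (
    id INTEGER PRIMARY KEY,
    home_team_api_id INTEGER,
    away_team_api_id INTEGER,
    home_team_goal INTEGER,
    away_team_goal INTEGER
);
-- Made-up sample rows for illustration only.
INSERT INTO "match" VALUES
    (1, 1, 2, 2, 0),
    (2, 2, 1, 1, 1),
    (3, 1, 3, 0, 1),
    (4, 3, 2, 2, 2);
""")

query = """
SELECT team_id,
       -- COUNT skips NULLs, so each CASE counts only the rows that match
       COUNT(CASE WHEN goals_for > goals_against THEN 1 END) AS wins,
       COUNT(CASE WHEN goals_for = goals_against THEN 1 END) AS draws,
       COUNT(*) AS games
FROM (
    SELECT home_team_api_id AS team_id,
           home_team_goal AS goals_for,
           away_team_goal AS goals_against
    FROM "match"
    UNION ALL
    SELECT away_team_api_id, away_team_goal, home_team_goal
    FROM "match"
) AS per_team
GROUP BY team_id
ORDER BY team_id
"""
rows = conn.execute(query).fetchall()
for team_id, wins, draws, games in rows:
    print(team_id, wins, draws, games, round(wins / games, 3))
```

One pass over the derived table yields all three counters, so no separate subquery per outcome is needed.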

Related

For each country, report the movie genre with the highest average rates

For each country, I need to report the movie genre with the highest average rating. I am missing only one step that I can't figure out.
Here's my current code:
SELECT c.code AS c_CODE, menres.genre AS GENRE, AVG(RATE) as AVERAGE_rate,MAX(RATE) AS MAXIMUM_rate, MIN(RATE) AS MINIMUM_rate from movirates
LEFT JOIN movgenres ON movgenres.movieid = movratings.movieid
left JOIN users ON users.userid = movrates.userid
left JOIN c ON c.code = users.city
LEFT JOIN menres ON movenres.genreid = menres.code
GROUP BY menres.genre , c.code
order by c.code asc, avg(rate) desc, menres.genre desc ;
You can use the ROW_NUMBER window function to assign a unique rank to each of your rows:
- partitioned by country code
- ordered by descending average rating
Once you have this ranking, select the rows with the highest average rating, which are the ones whose rank equals 1.
WITH cte AS (
SELECT c.code AS COUNTRY_CODE,
mg.genre AS GENRE,
AVG(rating) AS AVERAGE_RATING,
MAX(rating) AS MAXIMUM_RATING,
MIN(rating) AS MINIMUM_RATING
FROM moviesratings r
INNER JOIN moviesgenres g ON g.movieid = r.movieid
INNER JOIN users u ON u.userid = r.userid
INNER JOIN countries c ON c.code = u.country
LEFT JOIN mGenres mg ON mg.code = g.genreid
GROUP BY mg.genre,
c.code
)
SELECT *
FROM (SELECT *,
-- rank the genres within each country, best average first
ROW_NUMBER() OVER(
PARTITION BY COUNTRY_CODE
ORDER BY AVERAGE_RATING DESC) AS rn
FROM cte) ranked_averages
WHERE rn = 1
ORDER BY COUNTRY_CODE;
Note: The code inside the common table expression is equivalent to yours. If you're willing to share your input tables, I may even suggest an improved query.
You should use a window function in this case: rank the rows with rank(), then select only the first rank.
with mov_rates (c_code, genre, average_rating, maximum_rating, minimum_rating)
as
(
select d.code,
e.genre,
avg(a.rate),
max(a.rate),
min(a.rate)
from movrates a
left join movgenres b on a.movieid = b.movieid
left join users c on a.userid = c.userid
left join countries d on c.city = d.code
left join mGenres e on b.genreid = e.code
group by d.code, e.genre
),
rategenre (rnk, c_code, genre, average_rating, maximum_rating, minimum_rating)
as
(
select rank() over (partition by c_code order by average_rating desc) rnk,
c_code,
genre,
average_rating,
maximum_rating,
minimum_rating
from mov_rates
)
select *
from rategenre
where rnk = 1
Reference:
OVER Clause
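The two answers differ mainly in how ties are handled: ROW_NUMBER keeps exactly one row per country, while RANK keeps every genre that shares the top average. Here is a minimal sketch of that difference, assuming a simplified single table named ratings (hypothetical, not the asker's schema) with made-up rows, and an SQLite build with window function support (3.25 or later).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE ratings (country TEXT, genre TEXT, rating REAL);
-- Made-up rows: in 'US' both genres average 4.0, so they tie.
INSERT INTO ratings VALUES
    ('FR', 'Comedy', 5), ('FR', 'Drama', 2), ('FR', 'Drama', 3),
    ('US', 'Drama', 4), ('US', 'Comedy', 3), ('US', 'Comedy', 5);
""")

template = """
WITH averages AS (
    SELECT country, genre, AVG(rating) AS avg_rating
    FROM ratings
    GROUP BY country, genre
)
SELECT country, genre, avg_rating
FROM (
    SELECT averages.*,
           {fn}() OVER (PARTITION BY country ORDER BY avg_rating DESC) AS rnk
    FROM averages
) AS ranked
WHERE rnk = 1
ORDER BY country, genre
"""

def top_genres(ranking_function):
    """Return the top-ranked genre rows per country for the given window function."""
    return conn.execute(template.format(fn=ranking_function)).fetchall()

top_row_number = top_genres("ROW_NUMBER")  # one row per country, tie broken arbitrarily
top_rank = top_genres("RANK")              # all tied genres are kept

print(top_row_number)
print(top_rank)
```

Which one fits depends on whether a country with two equally rated genres should report one of them or both.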

How to query data that is omitted from the results

So I have the following information
Output
Diabetic Schools studentcount
false 9010 180
true 9010 3
false 9012 245
true 9012 4
Query
Select s.diabetic as diabetic, sch.buildingid as Schools,
count(distinct s.studentnmr) as Studentcount
from student s
inner join studentschool ss on ss.studentid = s.studentid
inner join school sch on sch.id = ss.schoolid
group by s.diabetic, sch.buildingid
order by sch.buildingid
I want
Diabetic addresse studentcount calculation
true 9010 3 1,64 %
true 9012 4 1,61 %
where calculation is
( sum(diabetic=true)/sum(total number of students of the school) )*100
Additional tip: there is another field called diabeticdate, which holds a date when diabetic is true.
My problem: when I select
select sum(Case when s.diabetic is null then 1 else 0 end) AS notD
I obviously get nothing next to the records with diabetic = true.
How do I work around this?
note: if you have a better title for the question, please suggest!
You can use conditional aggregation in order to show one row per school with their diabetic rates:
select
sch.buildingid as Schools,
count(distinct s.studentnmr) as Studentcount,
count(distinct case when s.diabetic then s.studentnmr end) as Diabeticcount,
-- multiply by 100.0 first to avoid integer division
100.0 * count(distinct case when s.diabetic then s.studentnmr end) /
count(distinct s.studentnmr) as rate
from student s
inner join studentschool ss on ss.studentid = s.studentid
inner join school sch on sch.id = ss.schoolid
group by sch.buildingid
having count(distinct case when s.diabetic then s.studentnmr end) > 0
order by sch.buildingid;
Remove the HAVING clause, if you also want to see schools without diabetics.
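To check the arithmetic, here is a minimal sketch that rebuilds the counts from the question (180 and 3 students for school 9010, 245 and 4 for school 9012) in an in-memory SQLite database. It assumes a simplified single student table with the school id stored directly on each row, and diabetic stored as 0 or 1, instead of the three joined tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student (studentnmr INTEGER, diabetic INTEGER, buildingid INTEGER)"
)

# Rebuild the counts from the question: (school, non-diabetic, diabetic).
rows_to_insert = []
studentnmr = 0
for buildingid, non_diabetic, diabetic in [(9010, 180, 3), (9012, 245, 4)]:
    for flag, how_many in ((0, non_diabetic), (1, diabetic)):
        for _ in range(how_many):
            studentnmr += 1
            rows_to_insert.append((studentnmr, flag, buildingid))
conn.executemany("INSERT INTO student VALUES (?, ?, ?)", rows_to_insert)

query = """
SELECT buildingid,
       COUNT(*) AS studentcount,
       COUNT(CASE WHEN diabetic = 1 THEN 1 END) AS diabeticcount,
       ROUND(100.0 * COUNT(CASE WHEN diabetic = 1 THEN 1 END) / COUNT(*), 2) AS rate
FROM student
GROUP BY buildingid
HAVING COUNT(CASE WHEN diabetic = 1 THEN 1 END) > 0
ORDER BY buildingid
"""
rates = conn.execute(query).fetchall()
for row in rates:
    print(row)
```

The rounded rates come out as 1.64 and 1.61, which matches the percentages the question asks for.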
You may try the query below, which uses over():
with t1 as
(
select s.diabetic as diabetic, sch.buildingid as Schools,
count(distinct s.studentnmr) as Studentcount
from student s
inner join studentschool ss on ss.studentid = s.studentid
inner join school sch on sch.id = ss.schoolid
group by s.diabetic, sch.buildingid
),
t2 as
(
select diabetic, Schools as addresse, studentcount,
-- share of each row within its own school
studentcount::decimal / sum(studentcount) over (partition by Schools) * 100 as calculation
from t1
)
select * from t2 where diabetic
You can use the window function SUM OVER to get the total number of students. Window functions run over the results you already have, a post aggregation so to say :-)
select
s.diabetic as diabetic,
sch.buildingid as Schools,
count(distinct s.studentnmr) as Studentcount,
count(distinct s.studentnmr)::decimal /
sum(count(distinct s.studentnmr)) over (partition by sch.buildingid) * 100 as rate
from student s
inner join studentschool ss on ss.studentid = s.studentid
inner join school sch on sch.id = ss.schoolid
group by sch.buildingid, s.diabetic
order by sch.buildingid, s.diabetic;
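The idea of running a window function over already aggregated rows can be sketched in two explicit steps. The example below assumes a simplified single student table with made-up rows, and an SQLite build with window function support (3.25 or later). It first groups in a CTE and then applies SUM ... OVER to the grouped counts, which is a more verbose spelling of the same idea as the nested form above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE student (studentnmr INTEGER, diabetic INTEGER, buildingid INTEGER);
-- Made-up rows: school 1 has four students, school 2 has two.
INSERT INTO student VALUES
    (1, 0, 1), (2, 0, 1), (3, 0, 1), (4, 1, 1),
    (5, 0, 2), (6, 1, 2);
""")

query = """
WITH counts AS (
    -- step 1: one row per school and diabetic flag
    SELECT buildingid, diabetic, COUNT(*) AS studentcount
    FROM student
    GROUP BY buildingid, diabetic
)
SELECT buildingid,
       diabetic,
       studentcount,
       -- step 2: the window sums the grouped counts per school
       100.0 * studentcount / SUM(studentcount) OVER (PARTITION BY buildingid) AS rate
FROM counts
ORDER BY buildingid, diabetic
"""
shares = conn.execute(query).fetchall()
for row in shares:
    print(row)
```

Unlike the conditional aggregation approach, this keeps one row per school and status, so the non-diabetic share is visible as well.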

SQL JOIN COUNT and GROUP BY

I have three tables
1. players(id, first_name, last_name, age, position, team_id)
2. teams(id, team_name, stadium, wins, draws,defeats,goal_for,goal_against)
3. goals_scored(id, player_id, goal_time)
SQL statement
SELECT
players.first_name,
players.last_name,
teams.name,
players.position,
players.age,
COUNT(*) AS goals
FROM
players
JOIN goals_scored
ON players.id = goals_scored.player_id
JOIN teams
ON players.team_id = teams.id
GROUP BY players.id;
teams table
id  team_name  stadium    wins  draws  defeats  goal_for  goal_against
1   APF Club   Dasharath  7     2      7        29        25
players table
id  first_name  last_name  position  age  team_id
4   Dipendra    Shrestha   forward   19   1
goals_scored table
id  player_id  goal_time
1   4          34
2   4          57
I want to group goals on players id so that I can get the count of goals of an individual player.
Like
first_name last_name team_name position age goals
Dipendra Shrestha APF Club forward 19 2
How can I do it?
Prefer to group on as few columns as possible, especially when multiple tables are involved, so that a good index can be used to handle the GROUP BY.
WITH GoalsPerPlayer (playerId, nrOfGoals)
AS
(
SELECT player_id, count(*)
FROM goals_scored
GROUP BY player_id
)
SELECT p.first_name, p.last_name, t.team_name, p.position, p.age, g.numberOfGoals as goals
FROM GoalsPerPlayer g
INNER JOIN players p ON p.id = g.player_id
INNER JOIN teams t ON t.id = p.team_id
Edit:
Fixed the typos in the query (the CTE column names nrOfGoals and playerId), as mentioned by the OP in a comment.
WITH GoalsPerPlayer (playerId, nrOfGoals)
AS
(
SELECT player_id, count(*)
FROM goals_scored
GROUP BY player_id
)
SELECT p.first_name, p.last_name, t.team_name, p.position, p.age, g.nrOfGoals as goals
FROM GoalsPerPlayer g
INNER JOIN players p ON p.id = g.playerId
INNER JOIN teams t ON t.id = p.team_id
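The pre-aggregation approach can be tried out with the sample rows from the question. The sketch below uses an in-memory SQLite database. The second player is made up, and the LEFT JOIN with COALESCE is a variation that is not part of the answer: it is only there to show how players without goals could be kept with a count of 0.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE teams (id INTEGER PRIMARY KEY, team_name TEXT);
CREATE TABLE players (id INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT,
                      position TEXT, age INTEGER, team_id INTEGER);
CREATE TABLE goals_scored (id INTEGER PRIMARY KEY, player_id INTEGER, goal_time INTEGER);

INSERT INTO teams VALUES (1, 'APF Club');
INSERT INTO players VALUES (4, 'Dipendra', 'Shrestha', 'forward', 19, 1);
-- Player 5 is a made-up player without goals, added to show the LEFT JOIN.
INSERT INTO players VALUES (5, 'Example', 'Player', 'defender', 22, 1);
INSERT INTO goals_scored VALUES (1, 4, 34), (2, 4, 57);
""")

query = """
WITH goals_per_player AS (
    -- aggregate on the single key column before joining anything else
    SELECT player_id, COUNT(*) AS goals
    FROM goals_scored
    GROUP BY player_id
)
SELECT p.first_name, p.last_name, t.team_name, p.position, p.age,
       COALESCE(g.goals, 0) AS goals
FROM players p
JOIN teams t ON t.id = p.team_id
LEFT JOIN goals_per_player g ON g.player_id = p.id
ORDER BY p.id
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

With an INNER JOIN instead of the LEFT JOIN, the second player would simply drop out of the result, as in the answer's query.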
Your query basically looks fine. I would adjust the GROUP BY to be more complete:
SELECT p.first_name, p.last_name, t.team_name, p.position, p.age,
COUNT(*) AS goals
FROM players p JOIN
goals_scored gs
ON p.id = gs.player_id JOIN
teams t
ON p.team_id = t.id
GROUP BY p.first_name, p.last_name, t.team_name, p.position, p.age;

How to have SQL query with 2 subqueries divided

I have a database which has these tables:
Users (id, email)
Trips (id, driver_id)
MatchedTrips (id, trip_id)
I need to get for each user the total number of trips he created divided by the total matches found.
I am stuck building the raw SQL query for this. Here is what I tried; I'm sure it's far from correct.
SELECT
users.email,
total_trips.count1 / total_matches.count2
FROM users CROSS JOIN (SELECT
users.email,
count(trips.driver_id) AS count1
FROM trips
INNER JOIN users ON trips.driver_id = users.id
GROUP BY users.email) total_trips
CROSS JOIN (SELECT users.email, count(matches.trip_id) AS count2
FROM matches
LEFT JOIN trips ON matches.trip_id = trips.id
LEFT JOIN users ON trips.driver_id = users.id
GROUP BY users.email) total_matches;
You can calculate the total trips and total matches for each driver like this:
-- distinct: a trip joined to several matches appears once per match
select driver_id, count(distinct t.id) as total_trips, count(m.id) as total_matches
from trips t
left join matches m on (t.id = trip_id)
group by 1
Use this query as a derived table in join with users:
select email, total_trips, total_matches, total_trips::dec/ nullif(total_matches, 0) result
from users u
left join (
select driver_id, count(distinct t.id) as total_trips, count(m.id) as total_matches
from trips t
left join matches m on (t.id = trip_id)
group by 1
) s on u.id = driver_id
order by 1;
The simplest way is probably to use count(distinct):
select u.email,
count(distinct t.id) as num_trips,
count(distinct m.id) as num_matches,
(count(distinct t.id) * 1.0 / nullif(count(distinct m.id), 0)) as ratio
from users u left join
trips t
on t.driver_id = u.id left join
matches m
on m.trip_id = t.id
group by u.email;
Note: If emails are unique, then the query can be simplified. count(distinct) can be expensive under some circumstances.
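A minimal sketch of why DISTINCT matters here, using an in-memory SQLite database with made-up rows. The table and column names follow the queries above, and the joined_rows column is added purely for illustration. One user has two trips, one of which has three matches, so the joined result holds four rows for that user.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT);
CREATE TABLE trips (id INTEGER PRIMARY KEY, driver_id INTEGER);
CREATE TABLE matches (id INTEGER PRIMARY KEY, trip_id INTEGER);

-- Made-up rows: user 1 has two trips, user 2 has none.
INSERT INTO users VALUES (1, 'a@example.com'), (2, 'b@example.com');
INSERT INTO trips VALUES (1, 1), (2, 1);
INSERT INTO matches VALUES (10, 1), (11, 1), (12, 1);
""")

query = """
SELECT u.email,
       COUNT(t.id) AS joined_rows,           -- inflated by the join to matches
       COUNT(DISTINCT t.id) AS num_trips,
       COUNT(DISTINCT m.id) AS num_matches,
       -- * 1.0 forces real division; NULLIF avoids dividing by zero
       COUNT(DISTINCT t.id) * 1.0 / NULLIF(COUNT(DISTINCT m.id), 0) AS ratio
FROM users u
LEFT JOIN trips t ON t.driver_id = u.id
LEFT JOIN matches m ON m.trip_id = t.id
GROUP BY u.email
ORDER BY u.email
"""
results = conn.execute(query).fetchall()
for row in results:
    print(row)
```

The user without trips gets a NULL ratio (None in Python) instead of a division error.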

How to count number of different items in SQL

Database structure:
Clubs: ID, ClubName
Teams: ID, TeamName, ClubID
Players: ID, Name
Registrations: PlayerID, TeamID, Start_date, End_date, SeasonID
Clubs own several teams. Players may get registered into several teams (inside same club or into different club) during one year.
I have to generate a query to list all players that have been registered into DIFFERENT CLUBS during one season. So if player swapped teams that were owned by the same club then it doesn't count.
My attempts so far:
SELECT
c.short_name,
p.surname,
r.start_date,
r.end_date,
(select count(r2.id) from ejl_registration as r2
where r2.player_id=r.player_id and r2.season=r.season) as counter
FROM
ejl_registration AS r
left Join ejl_players AS p ON p.id = r.player_id
left Join ejl_teams AS t ON r.team_id = t.id
left Join ejl_clubs AS c ON t.club_id = c.id
WHERE
r.season = '2008'
having counter >1
I can't figure out how to count and show only different clubs... (It's getting too late for clear thinking). I use MySQL.
Report should be like: Player name, Club name, Start_date, End_date
This is a second try at this answer, simplifying it to merely count the distinct clubs, not report a list of club names.
SELECT p.surname, r.start_date, r.end_date, COUNT(DISTINCT c.id) AS counter
FROM ejl_players p
JOIN ejl_registration r ON (r.player_id = p.id)
JOIN ejl_teams t ON (r.team_id = t.id)
JOIN ejl_clubs c ON (t.club_id = c.id)
WHERE r.season = '2008'
GROUP BY p.id
HAVING counter > 1;
Note that since you're using MySQL, you can be pretty flexible with respect to columns in the select-list not matching columns in the GROUP BY clause. Other brands of RDBMS are more strict about the Single-Value Rule.
There's no reason to use a LEFT JOIN as in your example.
Okay, here's the first version of the query:
You have a chain of relationships like the following:
club1 <-- team1 <-- reg1 --> player <-- reg2 --> team2 --> club2
Such that club1 must not be the same as club2.
SELECT p.surname,
CONCAT_WS(',', GROUP_CONCAT(DISTINCT t1.team_name),
GROUP_CONCAT(DISTINCT t2.team_name)) AS teams,
CONCAT_WS(',', GROUP_CONCAT(DISTINCT c1.short_name),
GROUP_CONCAT(DISTINCT c2.short_name)) AS clubs
FROM ejl_players p
-- Find a club where this player is registered
JOIN ejl_registration r1 ON (r1.player_id = p.id)
JOIN ejl_teams t1 ON (r1.team_id = t1.id)
JOIN ejl_clubs c1 ON (t1.club_id = c1.id)
-- Now find another club where this player is registered in the same season
JOIN ejl_registration r2 ON (r2.player_id = p.id AND r1.season = r2.season)
JOIN ejl_teams t2 ON (r2.team_id = t2.id)
JOIN ejl_clubs c2 ON (t2.club_id = c2.id)
-- But the two clubs must not be the same (use < to prevent duplicates)
WHERE c1.id < c2.id
GROUP BY p.id;
Here's a list of players for one season.
SELECT sub.PlayerId
FROM
(
SELECT
r.PlayerId,
(SELECT t.ClubID FROM Teams t WHERE r.TeamID = t.ID) as ClubID
FROM Registrations r
WHERE r.Season = '2008'
) as sub
GROUP BY PlayerId
HAVING COUNT(DISTINCT sub.ClubID) > 1
Here's a list of players and seasons, for all seasons.
SELECT PlayerId, Season
FROM
(
SELECT
r.PlayerId,
r.Season,
(SELECT t.ClubID FROM Teams t WHERE r.TeamID = t.ID) as ClubID
FROM Registrations r
) as sub
GROUP BY PlayerId, Season
HAVING COUNT(DISTINCT sub.ClubID) > 1
By the way, this works in MS SQL.
SELECT p.Name, x.PlayerID, x.SeasonID
FROM (SELECT DISTINCT r.PlayerID, r.SeasonID, t.ClubID
FROM Registrations r
JOIN Teams t ON t.ID = r.TeamID) x
JOIN Players p ON p.ID = x.PlayerID
GROUP BY p.Name, x.PlayerID, x.SeasonID
HAVING COUNT(*) > 1
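The grouping approaches above can be checked against a small data set. This is a sketch on an in-memory SQLite database that follows the schema from the question, with the date columns left out and all rows made up: one player switches teams inside the same club, the other moves to a different club.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Clubs (ID INTEGER PRIMARY KEY, ClubName TEXT);
CREATE TABLE Teams (ID INTEGER PRIMARY KEY, TeamName TEXT, ClubID INTEGER);
CREATE TABLE Players (ID INTEGER PRIMARY KEY, Name TEXT);
CREATE TABLE Registrations (PlayerID INTEGER, TeamID INTEGER, SeasonID INTEGER);

INSERT INTO Clubs VALUES (1, 'Club A'), (2, 'Club B');
INSERT INTO Teams VALUES (1, 'A first', 1), (2, 'A reserves', 1), (3, 'B first', 2);
INSERT INTO Players VALUES (1, 'Ann'), (2, 'Bob');
-- Ann moves between two teams of Club A; Bob moves from Club A to Club B.
INSERT INTO Registrations VALUES
    (1, 1, 2008), (1, 2, 2008),
    (2, 1, 2008), (2, 3, 2008);
""")

query = """
SELECT p.Name, r.SeasonID, COUNT(DISTINCT t.ClubID) AS clubs
FROM Registrations r
JOIN Teams t ON t.ID = r.TeamID
JOIN Players p ON p.ID = r.PlayerID
GROUP BY p.ID, p.Name, r.SeasonID
HAVING COUNT(DISTINCT t.ClubID) > 1
"""
movers = conn.execute(query).fetchall()
print(movers)
```

Only the player registered with two different clubs in the season is returned; counting distinct ClubID values rather than registrations is what filters out moves inside one club.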