Find entities sorted by aggregated property of their relations - sql

This question is quite similar to "Find entity with most relations filtered by criteria", but slightly different.
model Player {
  id   String @id
  name String @unique
  game Game[]
}
model Game {
  id       String  @id
  isWin    Boolean
  playerId String
  player   Player  @relation(fields: [playerId], references: [id])
}
I would like to find the 10 players with the best win rate (games with isWin=true divided by the total number of games played by that player).
A direct but slow way to do that is to find all the players who won at least once and count their wins (first query), then count the total number of games for each of them (second query), and then do the math and sorting on the application side while holding the results in memory.
Is there a simpler way to do that? How would I do it with Prisma? If there is no Prisma "native" way to do it, what is the most efficient way to do this with raw SQL?

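This is a simple aggregation:
SELECT p.id, p.name
     , COUNT(CASE WHEN g.isWin THEN 1 END) AS wins
     , COUNT(g.playerId) AS played
     , 100 * COUNT(CASE WHEN g.isWin THEN 1 END)
           / COUNT(g.playerId) AS rate
  FROM player AS p
  JOIN game AS g
    ON g.playerId = p.id
 GROUP BY p.id, p.name -- p.id is the primary key; p.name is listed for databases that require every selected column here
 ORDER BY rate DESC
 LIMIT 10
;
Adjust as needed for your database. COUNT(g.playerId) is used instead of COUNT(*) so that, if the join becomes a LEFT JOIN to include players with no games played, those players are counted as 0 played. In that case the division also needs a guard against zero, for example NULLIF(COUNT(g.playerId), 0).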
Executable example with PG, as indicated by the comments below.
Working test case - no data
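For anyone who wants to try the aggregation without a database server, here is a minimal sketch using Python's built-in sqlite3 module. The sample rows are invented for illustration, the table and column names simply mirror the Prisma model above, and 100.0 is used so SQLite does not truncate the rate to an integer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE player (id TEXT PRIMARY KEY, name TEXT UNIQUE);
    CREATE TABLE game (
        id TEXT PRIMARY KEY,
        isWin INTEGER,                      -- 1 = win, 0 = loss
        playerId TEXT REFERENCES player(id)
    );
    -- Hypothetical sample data
    INSERT INTO player VALUES ('p1', 'ann'), ('p2', 'bob'), ('p3', 'cid');
    INSERT INTO game VALUES
        ('g1', 1, 'p1'), ('g2', 1, 'p1'),
        ('g3', 1, 'p2'), ('g4', 0, 'p2'),
        ('g5', 0, 'p3');
""")

# One grouped query: wins, games played and win rate per player.
top_players = conn.execute("""
    SELECT p.name,
           COUNT(CASE WHEN g.isWin THEN 1 END) AS wins,
           COUNT(g.playerId) AS played,
           100.0 * COUNT(CASE WHEN g.isWin THEN 1 END)
                 / COUNT(g.playerId) AS rate
    FROM player AS p
    JOIN game AS g ON g.playerId = p.id
    GROUP BY p.id, p.name
    ORDER BY rate DESC
    LIMIT 10
""").fetchall()

for row in top_players:
    print(row)
```

Players without any games are dropped by the inner join, so the division never sees a zero count in this version.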

Related

Is count() being used correctly for my query?

Find all teams that only won 1 game in tourney #3 (1 column, 4 rows)
My thought process for this query is that I need to count the WonGame values for each team, and that number has to equal 1. But when I run my query I get no results (I should get 4 teams).
Experimenting with my query, I changed the equals to a greater-than, and that returned 8 results. So I don't understand why equals 1 returns no results.
I also checked my data, and there are indeed 4 teams that won only one game during Tournament #3.
select Teams.TeamName
from Teams
join Bowlers on Teams.TeamID = Bowlers.TeamID
join Bowler_Scores on Bowlers.BowlerID = Bowler_Scores.BowlerID
join Match_Games on Bowler_Scores.GameNumber = Match_Games.GameNumber
join Tourney_Matches on Match_Games.MatchID = Tourney_Matches.MatchID
where Tourney_Matches.TourneyID = 3
group by Teams.TeamName
having count(Bowler_Scores.WonGame) = 1;
Bowling League DB Structure
Bowling League Data
The diagram seems to indicate that the relationship between Match_Games and Bowler_Scores is on BOTH MatchID and GameNumber.
If you change your JOIN condition to use both columns:
join Match_Games on Bowler_Scores.GameNumber = Match_Games.GameNumber and Bowler_Scores.MatchID = Match_Games.MatchID
Then you might get the required answer.
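To see why joining on only part of a composite key inflates the counts, here is a small sketch with Python's sqlite3 module. The two tables are cut down to the key columns and the rows are made up; only the table and column names come from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Match_Games (MatchID INTEGER, GameNumber INTEGER);
    CREATE TABLE Bowler_Scores (MatchID INTEGER, GameNumber INTEGER, BowlerID INTEGER);
    -- Hypothetical data: two matches with two games each,
    -- and one bowler who played game 1 of each match.
    INSERT INTO Match_Games VALUES (1, 1), (1, 2), (2, 1), (2, 2);
    INSERT INTO Bowler_Scores VALUES (1, 1, 10), (2, 1, 10);
""")

# Joining on GameNumber alone pairs each score with game 1 of EVERY match.
partial_key = conn.execute("""
    SELECT COUNT(*)
    FROM Bowler_Scores bs
    JOIN Match_Games mg ON bs.GameNumber = mg.GameNumber
""").fetchone()[0]

# Joining on both columns pairs each score with exactly one game.
full_key = conn.execute("""
    SELECT COUNT(*)
    FROM Bowler_Scores bs
    JOIN Match_Games mg ON bs.GameNumber = mg.GameNumber
                       AND bs.MatchID = mg.MatchID
""").fetchone()[0]

print(partial_key, full_key)
```

With the partial key every score row is duplicated once per match, which is enough to push a per-team count past 1.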
I can only speculate on what your data really looks like. However, it is doubtful that this expression:
having count(Bowler_Scores.WonGame) = 1;
does what you want. This counts the number of non-NULL values. Presumably, WonGame has some value such as "1" or "W" for the winner. If the value were 1, then the correct expression would be:
having sum(Bowler_Scores.WonGame) = 1
This is just speculation though without a better description of your data.
EDIT:
Based on the comment:
having sum(convert(int, Bowler_Scores.WonGame)) = 1
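The difference between COUNT and SUM on a flag column can be checked with a tiny sketch in Python's sqlite3 module. The scores table and its rows are hypothetical, and the flag is stored as an integer here, so the convert() call from the SQL Server version is not needed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE scores (team TEXT, won_game INTEGER);
    -- Hypothetical data: team A won 1 of 3 games, team B won 2 and has one NULL.
    INSERT INTO scores VALUES
        ('A', 1), ('A', 0), ('A', 0),
        ('B', 1), ('B', 1), ('B', NULL);
""")

# COUNT(col) counts non-NULL values (zeros included); SUM(col) adds them up.
rows = conn.execute("""
    SELECT team, COUNT(won_game) AS non_null_rows, SUM(won_game) AS wins
    FROM scores
    GROUP BY team
    ORDER BY team
""").fetchall()

# Teams with exactly one win.
one_win = [r[0] for r in conn.execute("""
    SELECT team FROM scores GROUP BY team HAVING SUM(won_game) = 1
""")]

print(rows, one_win)
```

Team A has three non-NULL rows but only one win, so a HAVING on COUNT would miss it while a HAVING on SUM finds it.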

Sql Left or Right Join One To Many Pagination

I have one main table and join other tables to it via left outer or right outer joins. One row of the main table can produce over 30 rows in the join result. I am trying to paginate, but the problem is that I cannot know how many rows the query will return for each main table row.
Example :
The first main table row produces 40 rows in my query.
The second main table row produces 120 rows.
Problem(Question) UPDATE:
For pagination I need to pass a page size that matches the row count of my select result, but I cannot know the right count. For example, if I pass page number 1 and page size 50, I do not get the right result. I need the right page size for the top 10 rows of my main table: those 10 rows may produce 200 result rows while my page size is 50. That is the problem.
I am using SQL Server 2014. I need this for my ASP.NET project, but that is not important.
Sample UPDATE :
It is like searching for a hotel to book. The main table is the hotel table. The other things are images (media table), videos (media table), location (place table) and maybe comments (comment table); each has more than one row and a one-to-many relationship to the hotel. For one hotel the result may be 100, 50 or 10 rows for all of this information. I am trying to paginate these hotel results, and I always need to get 20, 30 or 50 hotels per page for performance in my project.
Sample Query UPDATE :
SELECT
*
FROM
KisiselCoach KC
JOIN WorkPlace WP
ON KC.KisiselCoachId = WP.WorkPlaceOwnerId
JOIN Album A
ON KC.KisiselCoachId = A.AlbumId
JOIN Media M
ON A.AlbumId = M.AlbumId
LEFT JOIN Rating R
ON KC.KisiselCoachId = R.OylananId
JOIN FrUser Fr
ON KC.CoachId = Fr.UserId
JOIN UserJob UJ
ON KC.KisiselCoachId = UJ.UserJobOwnerId
JOIN Job J
ON UJ.JobId = J.JobId
JOIN UserExpertise UserEx
ON KC.KisiselCoachId = UserEx.UserExpertiseOwnerId
JOIN Expertise Ex
ON UserEx.ExpertiseId = Ex.ExpertiseId
Hotel Table :
HotelId  HotelName
1        Barcelona
2        Berlin
Media Table :
MediaID  MediaUrl     HotelId
1        www.xxx.com  1
2        www.xxx.com  1
3        www.xxx.com  1
4        www.xxx.com  1
Location Table :
LocationId  Adress          HotelId
1           xyz, Berlin     1
2           xyz, Nice       1
3           xyz, Sevilla    1
4           xyz, Barcelona  1
Comment Table :
CommentId  Comment           HotelId
1          you are cool      1
2          you are great     1
3          you are bad       1
4          hmm you are okey  1
This is only a sample! I have 9999999 hotels in my database. A hotel may have 100 images or none; I cannot know this in advance. I need to get 20 hotels in my result (pagination), but 20 hotels may mean 1000 rows or 100 rows.
First, your query is poorly written as far as readability and showing the relationships between tables. I have updated and indented it to try to show how and where the tables relate hierarchically.
You also want to paginate; let's get back to that. Do you intend to show every record as a possible item, or a "parent" level set of data, so you have only one instance per Media, per User, or whatever, and once that entry is selected you show the details for that one entity? If so, I would do a DISTINCT query at the top level, or at least grab the few columns with a count(*) of the child records to show at the next level.
Also, mixing inner, left and right joins can be confusing. Typically a right join means you want all the records from the right table of the join. Could this be rewritten to have all required tables on the left, with the non-required ones LEFT JOINed to them?
Clarification of all these relationships would definitely help, along with the context of what you are trying to get out of the pagination. I'll check for comments, but if the details are lengthy, I would edit your original question rather than post a long comment.
Here is my SOMEWHAT clarified query, rewritten to what I THINK the relationships are within your database. Notice my indentation showing table A -> B -> C -> D for readability. All of these are (INNER) JOINs, indicating they all must have a match between the respective tables. If some things are NOT always there, they would be changed to LEFT JOINs.
SELECT
      *
   FROM
      KisiselCoach KC
         JOIN WorkPlace WP
            ON KC.KisiselCoachId = WP.WorkPlaceOwnerId
         JOIN Album A
            ON KC.KisiselCoachId = A.AlbumId
            JOIN Media M
               ON A.AlbumId = M.AlbumId
         LEFT JOIN Rating R
            ON KC.KisiselCoachId = R.OylananId
         JOIN FrUser Fr
            ON KC.CoachId = Fr.UserId
         JOIN UserJob UJ
            ON KC.KisiselCoachId = UJ.UserJobOwnerId
            JOIN Job J
               ON UJ.JobId = J.JobId
         JOIN UserExpertise UserEx
            ON KC.KisiselCoachId = UserEx.UserExpertiseOwnerId
            JOIN Expertise Ex
               ON UserEx.ExpertiseId = Ex.ExpertiseId
Readability of a query is a BIG help for yourself and for anyone assisting or following you. Not having the "on" clauses near the corresponding joins can make a query very confusing to follow.
Also, which is your PRIMARY table, with the rest being lookup/reference tables?
ADDITION PER COMMENT
Ok, so I updated a query which appears to have no context to the sample data and what you want in your post. That said, I would start with a list of hotels only and a count(*) of things per hotel so you can give SOME indication of how much stuff you have in detail. Something like
select
      H.HotelID,
      H.HotelName,
      coalesce( MedSum.recs, 0 ) as MediaItems,
      coalesce( LocSum.recs, 0 ) as NumberOfLocations,
      coalesce( ComSum.recs, 0 ) as NumberOfComments
   from
      Hotel H
         LEFT JOIN
         ( select M.HotelID,
                  count(*) recs
              from Media M
              group by M.HotelID ) MedSum
            on H.HotelID = MedSum.HotelID
         LEFT JOIN
         ( select L.HotelID,
                  count(*) recs
              from Location L
              group by L.HotelID ) LocSum
            on H.HotelID = LocSum.HotelID
         LEFT JOIN
         ( select C.HotelID,
                  count(*) recs
              from Comment C
              group by C.HotelID ) ComSum
            on H.HotelID = ComSum.HotelID
   order by
      H.HotelName
   --- apply any limit per pagination
This returns every hotel at the top level, with the total count of things per hotel. The individual counts may or may not exist, hence each sub-check is a LEFT JOIN. Expose a page of 20 different hotels. As soon as someone picks a single hotel, you can drill into the locations, media and comments for that one hotel.
Although this COULD work, running these counts on every query might get very time consuming. You might want to add counter columns to your main hotel table holding the counts computed here. Then, via some nightly process, you could update the counts ONCE to prime them across all history, and afterwards update the counts only for those hotels that have had new activity since the previous day. You are not going to have 1,000,000 hotels with new images, locations or comments in a day, but if 22,000 do, those are the only hotel records whose counts you would re-update. Each incremental cycle would be short because it only covers the newest entries. For the web, having some pre-aggregated counts, sums, etc. is a big time saver where practical.
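One way to keep the page size in terms of hotels rather than joined rows is to page the hotel table in a derived table first and only then join the children. A rough sketch with Python's sqlite3 module follows; the hotel_page helper and the sample rows are made up for illustration, and SQLite's LIMIT/OFFSET stands in for SQL Server's ORDER BY ... OFFSET n ROWS FETCH NEXT m ROWS ONLY.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Hotel (HotelId INTEGER PRIMARY KEY, HotelName TEXT);
    CREATE TABLE Media (MediaId INTEGER PRIMARY KEY, MediaUrl TEXT, HotelId INTEGER);
    -- Hypothetical data: hotels with 3, 1 and 0 media rows.
    INSERT INTO Hotel VALUES (1, 'Barcelona'), (2, 'Berlin'), (3, 'Nice');
    INSERT INTO Media (MediaUrl, HotelId) VALUES
        ('a', 1), ('b', 1), ('c', 1), ('d', 2);
""")

def hotel_page(page_no, page_size):
    """Pick one page of hotels first, then attach their child rows."""
    return conn.execute("""
        SELECT H.HotelId, H.HotelName, M.MediaUrl
        FROM (SELECT HotelId, HotelName
              FROM Hotel
              ORDER BY HotelName
              LIMIT ? OFFSET ?) AS H
        LEFT JOIN Media AS M ON M.HotelId = H.HotelId
        ORDER BY H.HotelName, M.MediaId
    """, (page_size, (page_no - 1) * page_size)).fetchall()

page1 = hotel_page(1, 2)   # two hotels, however many media rows they have
page2 = hotel_page(2, 2)   # the remaining hotel, with no media
print(page1)
print(page2)
```

The number of rows per page still varies, but each page always holds the requested number of hotels (or fewer on the last page).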

How can I select the highest counts attributes from different groups?

So I have a table with players data(name, team, etc..) and a table with goals (player who scored it, local team, etc...). What I need to do is, get from each team the highest scorer. So the result I'm getting is something like:
germany - whatever name - 1
germany - another dude - 5
spain - another name - 8
italy - one more name - 6
As you can see teams repeat, and I want them not to, just get the highest scorer of each team.
Right now I have this:
SELECT P.TEAM_PLAYER, G.PLAYER_GOAL, COUNT(*) AS "TOTAL GOALS" FROM PLAYER P, GOAL G
WHERE TO_CHAR(G.DATE_GOAL, 'YYYY')=2002
AND P.NAME = G.PLAYER_GOAL
GROUP BY G.PLAYER_GOAL, P.TEAM_PLAYER
HAVING COUNT(*)>=ALL (SELECT COUNT(*) FROM PLAYER P2 where P.TEAM_PLAYER = P2.TEAM_PLAYER GROUP BY P2.TEAM_PLAYER)
ORDER BY COUNT(*) DESC;
I am 100% sure I'm close, and I'm pretty sure I have to do this with the HAVING clause, but I can't get it right.
Without the HAVING it returns a list of all the players, their teams and how many goals they have scored; now I want to cut it down to only one player for each team.
PS: the teams in the GOAL table are the local and visiting teams, so I have to use the PLAYER table to get the team. Also, the GOAL table is not a list of players and how many goals they have scored, but a list of every individual goal and the player who scored it.
If I understand correctly, you can try this query.
Just take the MAX of the PLAYER_GOAL column and use SUM(G.PLAYER_GOAL) instead of COUNT(*):
SELECT P.TEAM_PLAYER,
MAX(G.PLAYER_GOAL) "PLAYER_GOAL",
SUM(G.PLAYER_GOAL) AS "TOTAL GOALS"
FROM PLAYER P
INNER JOIN GOAL G
ON P.NAME = G.PLAYER_NAME
WHERE TO_CHAR(G.DATE_GOAL, 'YYYY')=2002
GROUP BY P.TEAM_PLAYER
ORDER BY SUM(G.PLAYER_GOAL) DESC;
NOTE :
Avoid using commas to join tables; it's an old join style. You can use INNER JOIN instead.
Edit
I don't know your table schema, but this query might work.
Use a subquery to contain your current result set, then apply the MAX function and GROUP BY:
SELECT T.TEAM_PLAYER,
T.PLAYER_GOAL,
MAX(TOTAL_GOALS) AS "TOTAL GOALS"
FROM
(
SELECT P.TEAM_PLAYER, G.PLAYER_GOAL, COUNT(*) AS "TOTAL_GOALS" FROM
PLAYER P, GOAL G
WHERE TO_CHAR(G.DATE_GOAL, 'YYYY')=2002
AND P.NAME = G.PLAYER_GOAL
GROUP BY G.PLAYER_GOAL, P.TEAM_PLAYER
HAVING COUNT(*)>=ALL (SELECT COUNT(*) FROM PLAYER P2 where P.TEAM_PLAYER = P2.TEAM_PLAYER GROUP BY P2.TEAM_PLAYER)
) T
GROUP BY T.TEAM_PLAYER,
T.PLAYER_GOAL
ORDER BY MAX(TOTAL_GOALS) DESC
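As an alternative to the HAVING-based attempts above, a common pattern is to compute goals per player, take the maximum per team, and join the two. The sketch below uses Python's sqlite3 module with invented rows; the column names come from the question, strftime replaces Oracle's TO_CHAR because the demo runs on SQLite, and this is not the answerer's query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE PLAYER (NAME TEXT PRIMARY KEY, TEAM_PLAYER TEXT);
    CREATE TABLE GOAL (PLAYER_GOAL TEXT, DATE_GOAL TEXT);  -- one row per goal
    -- Hypothetical data
    INSERT INTO PLAYER VALUES
        ('hans', 'germany'), ('karl', 'germany'), ('raul', 'spain');
    INSERT INTO GOAL VALUES
        ('hans', '2002-06-01'),
        ('karl', '2002-06-01'), ('karl', '2002-06-05'),
        ('raul', '2002-06-02'), ('raul', '2001-05-01');
""")

top_scorers = conn.execute("""
    WITH totals AS (              -- goals per player in 2002
        SELECT P.TEAM_PLAYER, G.PLAYER_GOAL, COUNT(*) AS TOTAL_GOALS
        FROM PLAYER P
        JOIN GOAL G ON P.NAME = G.PLAYER_GOAL
        WHERE strftime('%Y', G.DATE_GOAL) = '2002'
        GROUP BY P.TEAM_PLAYER, G.PLAYER_GOAL
    ),
    best AS (                     -- highest total within each team
        SELECT TEAM_PLAYER, MAX(TOTAL_GOALS) AS TOTAL_GOALS
        FROM totals
        GROUP BY TEAM_PLAYER
    )
    SELECT t.TEAM_PLAYER, t.PLAYER_GOAL, t.TOTAL_GOALS
    FROM totals t
    JOIN best b ON b.TEAM_PLAYER = t.TEAM_PLAYER
               AND b.TOTAL_GOALS = t.TOTAL_GOALS
    ORDER BY t.TOTAL_GOALS DESC, t.TEAM_PLAYER
""").fetchall()

print(top_scorers)
```

If two players on a team tie for the highest total, both rows are returned; picking a single one would need an extra tie-breaking rule.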

Relational Algebra on 3 Tables

I am having trouble forming a relational algebra query for a question in an assignment. I have to find the name of all the teams that won a game on a specific date.
The database now has the following three schemas:
Team(teamid,teamname,stadium)
Player(playerid,name,teamid, height)
Game (gameid, hometeamid, guestteamid, date, home-score, guest-score)
I am a little confused about how to do this, since the tables that I seem to need (Game and Team) do not appear to have anything in common. I see that Game has an id for both the home and away teams, but how can you find out which team won?
The exact question that I have to answer is:
Find the name of all the teams that won on 6/1/15. (Assume a team plays only one game a day and that a tie is not possible)
Try this:
(select teamname from Team t, Game g
where t.teamid = g.hometeamid
and home-score > guest-score and date = '6/1/15')
UNION
(select teamname from Team t, Game g
where t.teamid = g.guestteamid
and guest-score > home-score and date = '6/1/15')
The first query returns the games that the home team won, while the second returns the games that the guest team won. The union of the two is the required answer.
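Since the assignment asks for relational algebra rather than SQL, the same idea can be written, as one possible sketch, with a theta join between Team and Game for each case, a selection on the date and the score comparison, a projection on teamname, and a union of the two results. Attribute names are taken from the schemas in the question.

```latex
\[
\begin{aligned}
&\pi_{\text{teamname}}\Big(
  \sigma_{\text{date} = \text{'6/1/15'} \,\land\, \text{home-score} > \text{guest-score}}
  \big(\text{Team} \bowtie_{\text{teamid} = \text{hometeamid}} \text{Game}\big)\Big) \\
&\quad\cup\;
\pi_{\text{teamname}}\Big(
  \sigma_{\text{date} = \text{'6/1/15'} \,\land\, \text{guest-score} > \text{home-score}}
  \big(\text{Team} \bowtie_{\text{teamid} = \text{guestteamid}} \text{Game}\big)\Big)
\end{aligned}
\]
```

The join condition is what links Game to Team: teamid matches hometeamid in the first term and guestteamid in the second.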

Access SQL Query: Find the most recent date entry for each player's test date with 2 joined tables

I have 1 table and 1 query that are joined by Player ID. I want to show only the latest test date result for height and weight columns in tblPlayerLogistics and Player Name and PlayerID from qryPlayersExtended
PlayerID is located in both table and query and they are joined.
I have playerID, height, weight,and testdate in tblplayerlogistics
I have PlayerID, Player Name in qryPlayersExtended
I would like a query that returns only one player record labeling the playerId and player name with the most current height and weight of each player determined by the testdate.
Query name: qryPlayersExtended
Table Name: tblPlayerLogistics
I have attached what I am trying to do in an image. This is an example of one player but I will have multiple players with multiple test dates
I have been struggling with this for weeks, so any help would be appreciated. I looked at this previous post but still couldn't figure it out:
similar post
Assuming that there is not more than one test for a player on any one date. Warning, air code. Create a new query qryLastTestDate:
SELECT PlayerID, Max(TestDate) as LastTestDate FROM tblplayerlogistics Group By PlayerID
Create a second query qryLastTest:
SELECT tblplayerlogistics.PlayerID, tblplayerlogistics.TestDate, tblplayerlogistics.height, tblplayerlogistics.weight FROM tblplayerlogistics INNER JOIN qryLastTestDate ON tblplayerlogistics.PlayerID = qryLastTestDate.PlayerID and tblplayerlogistics.TestDate = qryLastTestDate.LastTestDate
Your final query would be:
SELECT qryPlayersExtended.PlayerID, qryPlayersExtended.[Player Name], qryLastTest.TestDate, qryLastTest.Height, qryLastTest.Weight FROM qryPlayersExtended INNER JOIN qryLastTest ON qryPlayersExtended.PlayerID = qryLastTest.PlayerID;
Add in additional fields as needed.
You will need two queries for this.
First, you need a query which gets the latest date per player. It will have two columns: PlayerId and Max(TestDate), Grouped by PlayerId. I've named this qryMaxPlayerTestDates. It will look something like this:
SELECT PlayerId, Max(TestDate) AS MaxDate
FROM tblPlayerLogistics
GROUP BY PlayerId;
Second, you join the PlayerId and Dates (MaxDate/TestDate) to get results limited by Max(TestDate). It will look something like this:
SELECT tblPlayerLogistics.PlayerId, tblPlayerLogistics.Height,
tblPlayerLogistics.Weight, tblPlayerLogistics.TestDate,
qryPlayersExtended.PlayerName
FROM qryPlayersExtended INNER JOIN (qryMaxPlayerTestDates
INNER JOIN tblPlayerLogistics
ON (qryMaxPlayerTestDates.MaxDate = tblPlayerLogistics.TestDate)
AND (qryMaxPlayerTestDates.PlayerId = tblPlayerLogistics.PlayerId))
ON qryPlayersExtended.PlayerId = qryMaxPlayerTestDates.PlayerId;
If your dates are not duplicated per Player (No Players have more than one test per date, or two tests on same date have different times), you will get one row per Player in the result with the Height/Weight for the latest test date.
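The two-step approach (latest date per player, then join back on player and date) can be tried outside Access with a short sketch in Python's sqlite3 module. The players table here is a hypothetical stand-in for qryPlayersExtended, the sample rows are invented, and the saved query is inlined as a derived table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tblPlayerLogistics (
        PlayerID INTEGER, Height REAL, Weight REAL, TestDate TEXT
    );
    -- Stand-in for qryPlayersExtended (hypothetical table)
    CREATE TABLE players (PlayerID INTEGER PRIMARY KEY, PlayerName TEXT);
    INSERT INTO players VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO tblPlayerLogistics VALUES
        (1, 170, 60, '2015-01-10'),
        (1, 171, 62, '2015-06-10'),
        (2, 180, 80, '2015-03-01');
""")

latest = conn.execute("""
    SELECT p.PlayerID, p.PlayerName, l.Height, l.Weight, l.TestDate
    FROM players AS p
    JOIN (SELECT PlayerID, MAX(TestDate) AS MaxDate   -- step 1: latest date
          FROM tblPlayerLogistics
          GROUP BY PlayerID) AS m
      ON m.PlayerID = p.PlayerID
    JOIN tblPlayerLogistics AS l                      -- step 2: join back
      ON l.PlayerID = m.PlayerID AND l.TestDate = m.MaxDate
    ORDER BY p.PlayerID
""").fetchall()

print(latest)
```

Dates are stored as ISO strings so that MAX sorts them chronologically in SQLite; Access date columns compare correctly without that workaround.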