How can I select the highest counts attributes from different groups? - sql

I have a table with player data (name, team, etc.) and a table with goals (the player who scored, the local team, etc.). I need to get the highest scorer from each team. The result I'm currently getting looks like this:
germany - whatever name - 1
germany - another dude - 5
spain - another name - 8
italy - one more name - 6
As you can see, teams repeat. I don't want that; I only want the highest scorer of each team.
Right now I have this:
SELECT P.TEAM_PLAYER, G.PLAYER_GOAL, COUNT(*) AS "TOTAL GOALS" FROM PLAYER P, GOAL G
WHERE TO_CHAR(G.DATE_GOAL, 'YYYY')=2002
AND P.NAME = G.PLAYER_GOAL
GROUP BY G.PLAYER_GOAL, P.TEAM_PLAYER
HAVING COUNT(*)>=ALL (SELECT COUNT(*) FROM PLAYER P2 where P.TEAM_PLAYER = P2.TEAM_PLAYER GROUP BY P2.TEAM_PLAYER)
ORDER BY COUNT(*) DESC;
I am 100% sure I'm close, and I'm pretty sure I have to do this with the HAVING clause, but I can't get it right.
Without the HAVING it returns a list of all the players, their teams, and how many goals they have scored. Now I want to cut it down to only one player for each team.
PS: the teams in the GOAL table are the local and visiting teams, so I have to use the PLAYER table to get the player's team. Also, the GOAL table is not a list of players with their goal totals, but a list of every individual goal and the player who scored it.

If I understand correctly, you can try this query.
Take the MAX of the PLAYER_GOAL column and use SUM(G.PLAYER_GOAL) instead of COUNT(*):
SELECT P.TEAM_PLAYER,
MAX(G.PLAYER_GOAL) "PLAYER_GOAL",
SUM(G.PLAYER_GOAL) AS "TOTAL GOALS"
FROM PLAYER P
INNER JOIN GOAL G
ON P.NAME = G.PLAYER_NAME
WHERE TO_CHAR(G.DATE_GOAL, 'YYYY')=2002
GROUP BY P.TEAM_PLAYER
ORDER BY SUM(G.PLAYER_GOAL) DESC;
NOTE:
Avoid using commas to join tables; it's an old join style. Use an explicit INNER JOIN instead.
Edit
I don't know your table schema, but this query might work.
Use a subquery to hold your current result set, then apply MAX and GROUP BY:
SELECT T.TEAM_PLAYER,
T.PLAYER_GOAL,
MAX(TOTAL_GOALS) AS "TOTAL GOALS"
FROM
(
SELECT P.TEAM_PLAYER, G.PLAYER_GOAL, COUNT(*) AS "TOTAL_GOALS" FROM
PLAYER P, GOAL G
WHERE TO_CHAR(G.DATE_GOAL, 'YYYY')=2002
AND P.NAME = G.PLAYER_GOAL
GROUP BY G.PLAYER_GOAL, P.TEAM_PLAYER
HAVING COUNT(*)>=ALL (SELECT COUNT(*) FROM PLAYER P2 where P.TEAM_PLAYER = P2.TEAM_PLAYER GROUP BY P2.TEAM_PLAYER)
) T
GROUP BY T.TEAM_PLAYER,
T.PLAYER_GOAL
ORDER BY MAX(TOTAL_GOALS) DESC
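For comparison, here is a minimal sketch of one way to keep only the top scorer per team: count goals per player first, then keep the rows whose count equals the team's maximum. It runs against an in-memory SQLite database through Python's sqlite3 module. The sample rows are made up, and SQLite's strftime stands in for Oracle's TO_CHAR, so treat the schema details as assumptions rather than your real tables.

```python
import sqlite3

# Hypothetical sample data; table and column names follow the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE player (name TEXT PRIMARY KEY, team_player TEXT);
CREATE TABLE goal (player_goal TEXT, date_goal TEXT);
INSERT INTO player VALUES
  ('Anna', 'germany'), ('Bert', 'germany'), ('Carla', 'spain');
INSERT INTO goal VALUES
  ('Anna', '2002-06-01'), ('Bert', '2002-06-01'), ('Bert', '2002-06-05'),
  ('Carla', '2002-06-02'), ('Carla', '2001-05-01');
""")

top_scorers = conn.execute("""
WITH totals AS (
  -- one row per (team, player) with that player's 2002 goal count
  SELECT p.team_player, g.player_goal, COUNT(*) AS total_goals
  FROM player p
  INNER JOIN goal g ON p.name = g.player_goal
  WHERE strftime('%Y', g.date_goal) = '2002'
  GROUP BY p.team_player, g.player_goal
)
SELECT t.team_player, t.player_goal, t.total_goals
FROM totals t
-- keep only the players whose count equals their team's best count
WHERE t.total_goals = (SELECT MAX(t2.total_goals)
                       FROM totals t2
                       WHERE t2.team_player = t.team_player)
ORDER BY t.total_goals DESC, t.team_player
""").fetchall()

print(top_scorers)
```

If two players tie for a team's top count, both rows come back, so decide whether you want ties or a tie-breaker.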

Related

Find entities sorted by aggregated property of their relations

Question is quite similar to Find entity with most relations filtered by criteria, but slightly different.
model Player {
  id   String @id
  name String @unique
  game Game[]
}
model Game {
  id       String  @id
  isWin    Boolean
  playerId String
  player   Player  @relation(fields: [playerId], references: [id])
}
I would like to find the 10 players with the best win rate (games with isWin=true divided by the total number of games played by that player).
The direct but slow way is to find all the players who won at least once and count their wins (first query), then count the total number of games for each of them (second query), and then do the math and sorting on the application side while holding the results in memory.
Is there a simpler way to do that? How would I do it with Prisma? If there is no Prisma-native way, what is the most efficient way to do this with raw SQL?
This is simple aggregation:
SELECT p.id, p.name
, COUNT(CASE WHEN isWin THEN 1 END) AS wins
, COUNT(g.playerId) AS played
, 100 * COUNT(CASE WHEN isWin THEN 1 END)
/ COUNT(g.playerId) AS rate
FROM player AS p
JOIN game AS g
ON g.playerId = p.id
GROUP BY p.id, p.name -- p.id is the key; p.name is unique, so listing it adds no extra groups.
ORDER BY rate DESC
LIMIT 10
;
Adjust as needed for your database. I adjusted in case the join becomes a LEFT JOIN, to handle players with no games played.
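As a rough sketch of the LEFT JOIN variant mentioned above, the following runs the aggregation against an in-memory SQLite database from Python. The sample rows are invented. Multiplying by 100.0 (to avoid integer division) and using NULLIF (to avoid dividing by zero for players with no games) are additions of mine, not part of the query above.

```python
import sqlite3

# Hypothetical sample data shaped like the Prisma models in the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE player (id TEXT PRIMARY KEY, name TEXT UNIQUE);
CREATE TABLE game (id TEXT PRIMARY KEY, isWin INTEGER,
                   playerId TEXT REFERENCES player(id));
INSERT INTO player VALUES ('p1', 'ann'), ('p2', 'bob'), ('p3', 'cy');
INSERT INTO game VALUES
  ('g1', 1, 'p1'), ('g2', 0, 'p1'), ('g3', 1, 'p2'), ('g4', 1, 'p2');
""")

win_rates = conn.execute("""
SELECT p.id, p.name,
       COUNT(CASE WHEN g.isWin THEN 1 END) AS wins,
       COUNT(g.playerId) AS played,
       -- 100.0 forces a non-integer result; NULLIF turns 0 games into NULL
       100.0 * COUNT(CASE WHEN g.isWin THEN 1 END)
             / NULLIF(COUNT(g.playerId), 0) AS rate
FROM player AS p
LEFT JOIN game AS g ON g.playerId = p.id
GROUP BY p.id, p.name
ORDER BY rate DESC
LIMIT 10
""").fetchall()

for row in win_rates:
    print(row)
```

In SQLite a NULL rate sorts last under ORDER BY ... DESC. Other databases may put NULLs first, so add NULLS LAST or a WHERE filter there if players without games should not lead the list.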

finding people with possible incorrectly spelled cities where zip codes match

I am trying to create a report that will return a list of people whose cities most likely need to be corrected.
I was thinking of comparing the data against other data within the table to leverage the assumption that most of the cities are spelled correctly. Take Albuquerque, for example. We have records for many of the zip codes, but the city isn't always spelled correctly.
I can't figure out my next step.
Here's what I have started with:
SELECT city, zip_5_digits, COUNT(*) AS "COUNT"
FROM people
INNER JOIN addresses
ON addresses.people_id = people.id
AND city LIKE 'Albu%que'
GROUP BY city, zip_5_digits
Doing this results in
Albuqureque 87108 1
Albuquerque 87108 238
Albuqerque 87109 1
Albuquerque 87109 34
What I'd like to do is, for each row, find the maximum records where the zip code matches but the city does not match. If there is no match, I want to return that record, and I'll use this to return people's id and names, since I most likely need to correct the name of the city for those people who have it mis-spelled.
This is hard, because some "cities" have very few residents. And, some zip codes might just have a small part of a city.
I would recommend two rules:
Look at zip codes that have at least a certain number of people -- say 100.
Look at cities in the zip code that have less than some number -- say 5.
These are candidates for misspellings:
SELECT pa.*
FROM (SELECT city, zip_5_digits, COUNT(*) AS cnt,
             MAX(COUNT(*)) OVER (PARTITION BY zip_5_digits) AS max_cnt,
             SUM(COUNT(*)) OVER (PARTITION BY zip_5_digits) AS sum_cnt
      FROM people p INNER JOIN
           addresses a
           ON a.people_id = p.id
      GROUP BY city, zip_5_digits
     ) pa
WHERE sum_cnt >= 100 AND cnt <= 5;
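To make the two rules concrete, here is a small sketch run against an in-memory SQLite database from Python. The data is invented and the thresholds are scaled down (5 and 1 instead of 100 and 5) so that a handful of rows is enough. I also split the aggregation into a CTE before applying the window function, which needs SQLite 3.25 or later; that split is my choice for the sketch, not a requirement of the approach.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE people (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE addresses (people_id INTEGER, city TEXT, zip_5_digits TEXT);
""")

# Hypothetical rows: one busy zip with a single misspelling, one tiny zip.
sample = ([("Albuquerque", "87108")] * 6
          + [("Albuqureque", "87108")]
          + [("Tinytown", "87999")])
for person_id, (city, zip_code) in enumerate(sample, start=1):
    conn.execute("INSERT INTO people VALUES (?, ?)",
                 (person_id, f"person {person_id}"))
    conn.execute("INSERT INTO addresses VALUES (?, ?, ?)",
                 (person_id, city, zip_code))

MIN_ZIP_TOTAL = 5    # stands in for the "at least 100 people" rule
MAX_CITY_COUNT = 1   # stands in for the "fewer than about 5" rule

candidates = conn.execute("""
WITH counts AS (
  SELECT a.city, a.zip_5_digits, COUNT(*) AS cnt
  FROM people p
  INNER JOIN addresses a ON a.people_id = p.id
  GROUP BY a.city, a.zip_5_digits
),
pa AS (
  -- total number of people per zip code, repeated on every city row
  SELECT counts.*, SUM(cnt) OVER (PARTITION BY zip_5_digits) AS sum_cnt
  FROM counts
)
SELECT city, zip_5_digits, cnt, sum_cnt
FROM pa
WHERE sum_cnt >= ? AND cnt <= ?
""", (MIN_ZIP_TOTAL, MAX_CITY_COUNT)).fetchall()

print(candidates)
```

The tiny zip code is skipped because it has too few people overall, which is the point of the first rule.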

Access SQL Query: Find the most recent date entry for each player's test date with 2 joined tables

I have one table and one query that are joined by PlayerID. I want to show only the latest test date result for the height and weight columns in tblPlayerLogistics, along with Player Name and PlayerID from qryPlayersExtended.
PlayerID is present in both the table and the query, and they are joined on it.
I have PlayerID, height, weight, and testdate in tblPlayerLogistics.
I have PlayerID and Player Name in qryPlayersExtended.
I would like a query that returns only one record per player, showing the PlayerID and Player Name together with that player's most recent height and weight, as determined by the testdate.
Query name: qryPlayersExtended
Table Name: tblPlayerLogistics
I have attached what I am trying to do in an image. This is an example of one player but I will have multiple players with multiple test dates
I have been struggling with this for weeks; any help would be appreciated. I looked at a similar previous post but still couldn't figure it out.
Assuming that there is not more than one test for a player on any one date. Warning, air code. Create a new query qryLastTestDate:
SELECT PlayerID, Max(TestDate) AS LastTestDate
FROM tblplayerlogistics
GROUP BY PlayerID;
Create a second query qryLastTest:
SELECT tblplayerlogistics.PlayerID, tblplayerlogistics.TestDate,
       tblplayerlogistics.height, tblplayerlogistics.weight
FROM tblplayerlogistics INNER JOIN qryLastTestDate
  ON tblplayerlogistics.PlayerID = qryLastTestDate.PlayerID
  AND tblplayerlogistics.TestDate = qryLastTestDate.LastTestDate;
Your final query would be:
SELECT qryPlayersExtended.PlayerID, qryPlayersExtended.[Player Name],
       qryLastTest.TestDate, qryLastTest.Height, qryLastTest.Weight
FROM qryPlayersExtended INNER JOIN qryLastTest
  ON qryPlayersExtended.PlayerID = qryLastTest.PlayerID;
Add in additional fields as needed.
You will need two queries for this.
First, you need a query which gets the latest date per player. It will have two columns: PlayerId and Max(TestDate), Grouped by PlayerId. I've named this qryMaxPlayerTestDates. It will look something like this:
SELECT PlayerId, Max(TestDate) AS MaxDate
FROM tblPlayerLogistics
GROUP BY PlayerId;
Second, you join the PlayerId and Dates (MaxDate/TestDate) to get results limited by Max(TestDate). It will look something like this:
SELECT tblPlayerLogistics.PlayerId, tblPlayerLogistics.Height,
tblPlayerLogistics.Weight, tblPlayerLogistics.TestDate,
qryPlayersExtended.[Player Name]
FROM qryPlayersExtended INNER JOIN (qryMaxPlayerTestDates
INNER JOIN tblPlayerLogistics
ON (qryMaxPlayerTestDates.MaxDate = tblPlayerLogistics.TestDate)
AND (qryMaxPlayerTestDates.PlayerId = tblPlayerLogistics.PlayerId))
ON qryPlayersExtended.PlayerId = qryMaxPlayerTestDates.PlayerId;
If your dates are not duplicated per Player (No Players have more than one test per date, or two tests on same date have different times), you will get one row per Player in the result with the Height/Weight for the latest test date.
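The same "latest row per player" pattern can be tried outside Access. Below is a rough sketch using an in-memory SQLite database from Python, with the max-date query inlined as a derived table. The players table is a hypothetical stand-in for qryPlayersExtended, and the rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- 'players' is a hypothetical stand-in for qryPlayersExtended
CREATE TABLE players (PlayerID INTEGER PRIMARY KEY, PlayerName TEXT);
CREATE TABLE tblPlayerLogistics (
  PlayerID INTEGER, Height INTEGER, Weight INTEGER, TestDate TEXT);
INSERT INTO players VALUES (1, 'Ann'), (2, 'Bob');
INSERT INTO tblPlayerLogistics VALUES
  (1, 70, 150, '2020-01-10'),
  (1, 71, 155, '2020-06-10'),
  (2, 68, 140, '2020-03-01');
""")

latest_tests = conn.execute("""
SELECT pl.PlayerID, pl.PlayerName, l.TestDate, l.Height, l.Weight
FROM players pl
INNER JOIN tblPlayerLogistics l
  ON l.PlayerID = pl.PlayerID
INNER JOIN (
  -- latest test date per player
  SELECT PlayerID, MAX(TestDate) AS MaxDate
  FROM tblPlayerLogistics
  GROUP BY PlayerID
) m
  ON m.PlayerID = l.PlayerID AND m.MaxDate = l.TestDate
ORDER BY pl.PlayerID
""").fetchall()

print(latest_tests)
```

The dates here are ISO-formatted text, so MAX compares them correctly as strings. In Access you would use a real Date/Time column instead.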

Count rows from LEFT JOIN where column is value

I have teams, users, and points tables.
When I get the teams from the database, I also need to calculate the points as a generated column and the number of users in that team as another column.
I have this so far but it is returning all users in the member count column, not just the members in that team.
SELECT ar_teams.*, COALESCE(SUM(ar_points.amount),0) AS points, COUNT(ar_users.team_id) AS members
FROM ar_teams
LEFT JOIN ar_users ON ar_users.team_id = ar_teams.id
LEFT JOIN ar_points ON ar_points.user_id = ar_users.id
WHERE ar_teams.id = 1 AND ar_users.team_id = 1
GROUP BY ar_teams.id
The results I am getting:
id name name_alt points members
1 UK & Ireland NULL 3076 48
The members count should be 12, at the moment it seems to be counting all people without taking their team into account.
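I suspect your problem is the use of count() rather than count(distinct). Each user row is repeated once for every matching row in ar_points, so a plain count() counts those repeats. I think this is what you want: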
SELECT ar_teams.*,
COALESCE(SUM(ar_points.amount), 0) AS points,
COUNT(distinct ar_users.id) AS members
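A small sketch of the difference, run against an in-memory SQLite database from Python. The rows are invented, and only the id, name, team_id, user_id and amount columns from the question are modelled. The joined_rows column is included only to show what a plain COUNT sees.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE ar_teams (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE ar_users (id INTEGER PRIMARY KEY, team_id INTEGER);
CREATE TABLE ar_points (user_id INTEGER, amount INTEGER);
INSERT INTO ar_teams VALUES (1, 'UK & Ireland'), (2, 'Empty');
INSERT INTO ar_users VALUES (10, 1), (11, 1), (12, 1);
-- user 10 has two point rows, user 11 has one, user 12 has none
INSERT INTO ar_points VALUES (10, 5), (10, 7), (11, 3);
""")

team_stats = conn.execute("""
SELECT t.id, t.name,
       COALESCE(SUM(p.amount), 0) AS points,
       COUNT(u.id)                AS joined_rows,  -- inflated by point rows
       COUNT(DISTINCT u.id)       AS members       -- one per user
FROM ar_teams t
LEFT JOIN ar_users u ON u.team_id = t.id
LEFT JOIN ar_points p ON p.user_id = u.id
GROUP BY t.id, t.name
ORDER BY t.id
""").fetchall()

for row in team_stats:
    print(row)
```

SUM(p.amount) stays correct because each point row appears exactly once in the join; only the user rows are duplicated.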

Max of two columns from different tables

I need to get a max of the values from two columns from different tables.
eg the max of suburbs from schoolorder and platterorder. platterorder has clientnumbers that links to normalclient, and schoolorder has clientnumbers that links to school.
I have this:
SELECT MAX (NC.SUBURB) AS SUBURB
FROM normalClient NC
WHERE NC.CLIENTNO IN
(SELECT PO.CLIENTNO
FROM platterOrder PO
WHERE NC.CLIENTNO = PO.CLIENTNO)
GROUP BY NC.SUBURB
UNION
SELECT MAX (S.SUBURB) AS SCHOOLSUBURB
FROM school S
WHERE S.CLIENTNO IN
(SELECT S.CLIENTNO
FROM schoolOrder SO
WHERE S.CLIENTNO = SO.CLIENTNO)
GROUP BY S.SUBURB
However, that gets the max from platter order and combines it with the max from school. What I need is the max of both of them together.
=================================================
Sorry for making this so confusing!
The output should be only one row.
It should be the suburb that the maximum number of orders have come from, across both normal clients and school clients. The orders are listed in platterOrder for normal clients and in schoolOrder for school clients, so it's the maximum value across two tables that don't have a direct relation.
Hope that clears it up a bit!
If I'm understanding your question correctly, you don't need to use a GROUP BY since you're wanting the MAX of the field. I've also changed your syntax to use a JOIN instead of IN, but the IN should work just the same:
SELECT MAX (NC.SUBURB) AS SUBURB
FROM normalClient NC
JOIN platterOrder PO ON NC.ClientNo = PO.ClientNo
UNION
SELECT MAX (S.SUBURB) AS SCHOOLSUBURB
FROM school S
JOIN schoolOrder SO ON S.CLIENTNO = SO.CLIENTNO
Without knowing your table structures and seeing sample data, the best way I can recommend for getting the MAX of the results from the UNION is to use a subquery. There may be a better way with JOINs, but it's difficult to infer from your question:
SELECT MAX(Suburb)
FROM (
SELECT MAX (NC.SUBURB) AS SUBURB
FROM normalClient NC
JOIN platterOrder PO ON NC.ClientNo = PO.ClientNo
UNION
SELECT MAX (S.SUBURB)
FROM school S
JOIN schoolOrder SO ON S.CLIENTNO = SO.CLIENTNO
) T
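Note that MAX(Suburb) returns the alphabetically last suburb. If the goal is the suburb with the most orders across both tables, as the clarification in the question suggests, one possible sketch is to stack the order rows with UNION ALL and count per suburb. The example below runs on an in-memory SQLite database from Python. The orderNo columns and all rows are assumptions, and LIMIT 1 would need to become FETCH FIRST or ROWNUM on some databases.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE normalClient (clientNo INTEGER PRIMARY KEY, suburb TEXT);
CREATE TABLE school (clientNo INTEGER PRIMARY KEY, suburb TEXT);
CREATE TABLE platterOrder (orderNo INTEGER PRIMARY KEY, clientNo INTEGER);
CREATE TABLE schoolOrder (orderNo INTEGER PRIMARY KEY, clientNo INTEGER);
INSERT INTO normalClient VALUES (1, 'Northside'), (2, 'Westend');
INSERT INTO school VALUES (100, 'Westend'), (101, 'Zetland');
INSERT INTO platterOrder VALUES (1, 1), (2, 1), (3, 2);
INSERT INTO schoolOrder VALUES (1, 100), (2, 100), (3, 101);
""")

busiest_suburb = conn.execute("""
SELECT suburb, COUNT(*) AS orders
FROM (
  -- one row per order, tagged with the ordering client's suburb
  SELECT nc.suburb
  FROM normalClient nc
  JOIN platterOrder po ON po.clientNo = nc.clientNo
  UNION ALL
  SELECT s.suburb
  FROM school s
  JOIN schoolOrder so ON so.clientNo = s.clientNo
) t
GROUP BY suburb
ORDER BY orders DESC
LIMIT 1
""").fetchone()

print(busiest_suburb)
```

UNION ALL matters here: a plain UNION would collapse duplicate suburb rows and destroy the counts. If two suburbs tie, which one comes back is not defined without a tie-breaker in the ORDER BY.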