Cumulative Game Score SQL - sql

I have developed a game recently and the database is running on MSSQL.
Here is my database structure
Table : Player
PlayerID uniqueIdentifier (PK)
PlayerName nvarchar
Table : GameResult
ID bigint (PK - Auto Increment)
PlayerID uniqueIdentifier (FK)
DateCreated Datetime
Score int
TimeTaken bigint
PuzzleID int
I have written a query that lists the top 50 players, sorted by highest score (DESC) and then by time taken (ASC):
WITH ResultSet (PlayerID, Score, TimeTaken) AS(
SELECT DISTINCT(A.[PlayerID]), MAX(A.[Score]),MIN(A.[TimeTaken])
FROM GameResult A
WHERE A.[puzzleID] = #PuzzleID
GROUP BY A.[PlayerID])
SELECT TOP 50 RSP.[PlayerID], RSP.[PlayerName], RSA.[Score], RSA.[TimeTaken]
FROM ResultSet RSA
INNER JOIN Player RSP WITH(NOLOCK) ON RSA.PlayerID = RSP.PlayerID
ORDER By RSA.[Score] DESC, RSA.[timetaken] ASC
However, the query above only works for a single puzzle.
Question
1) I need to modify the SQL to produce a cumulative ranking across 3 puzzle IDs. For example, for puzzles 1, 2, and 3, it should be sorted by highest summed score (DESC) and then by summed time taken (ASC).
2) I also need an overall score ranking across all possible puzzles, 1 to 7.
3) Each player may appear on the list only once. The player who played first and was first to get the highest score is ranked 1st.
I tried using a CTE with UNION, but the SQL statement doesn't work.
I hope the gurus here can help me out with this. Much appreciated.
UPDATED WITH NEW SQL
The SQL below lets me get the result for each puzzle ID. I'm not sure it is 100% correct, but I believe it is.
;with ResultSet (PlayerID, maxScore, minTime, playedDate)
AS
(
SELECT TOP 50 PlayerID, MAX(score) as maxScore, MIN(timetaken) as minTime, MIN(datecreated) as playedDate
FROM gameresult
WHERE puzzleID = #PuzzleID
GROUP BY PlayerID
ORDER BY maxScore desc, minTime asc, playedDate asc
)
SELECT RSP.[PlayerID], RSP.[PlayerName], RSA.maxScore, RSA.minTime, RSA.PlayedDate
FROM ResultSet RSA
INNER JOIN Player RSP WITH(NOLOCK)
ON RSA.PlayerID = RSP.PlayerID
ORDER BY
maxScore DESC,
minTime ASC,
playedDate ASC

I would first like to point out that I do not believe your original query is correct. If you are looking for the best player for a particular puzzle, would that be the combination of the highest score plus the best time for that puzzle? If yes, using max and min does not guarantee that the max and min come from the same game (or row), which I believe should be a requirement. Instead you should have first determined the best game per player by using a row number windowing function. You can then do the top 50 sort off of that data.
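To illustrate the point about max and min coming from different rows, here is a minimal sketch. It uses Python's built-in sqlite3 module as a stand-in for SQL Server (an assumption; window functions need SQLite 3.25 or newer), and the sample rows and player names are made up for illustration.

```python
import sqlite3

# In-memory SQLite stands in for SQL Server here (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE GameResult (
    ID INTEGER PRIMARY KEY AUTOINCREMENT,
    PlayerID TEXT, DateCreated TEXT, Score INT, TimeTaken INT, PuzzleID INT
);
-- Hypothetical sample data
INSERT INTO GameResult (PlayerID, DateCreated, Score, TimeTaken, PuzzleID) VALUES
    ('alice', '2024-01-01', 100, 90, 1),  -- highest score, slow game
    ('alice', '2024-01-02',  60, 20, 1),  -- fastest game, low score
    ('bob',   '2024-01-03',  80, 30, 1);
""")

# MAX/MIN per player mixes values from two different games.
mixed = conn.execute("""
    SELECT PlayerID, MAX(Score), MIN(TimeTaken)
    FROM GameResult
    WHERE PuzzleID = 1
    GROUP BY PlayerID
    ORDER BY PlayerID
""").fetchall()

# ROW_NUMBER keeps one whole row (the best game) per player.
best = conn.execute("""
    WITH Ranked AS (
        SELECT PlayerID, Score, TimeTaken,
               ROW_NUMBER() OVER (
                   PARTITION BY PlayerID
                   ORDER BY Score DESC, TimeTaken ASC, DateCreated ASC
               ) AS seq
        FROM GameResult
        WHERE PuzzleID = 1
    )
    SELECT PlayerID, Score, TimeTaken
    FROM Ranked
    WHERE seq = 1
    ORDER BY Score DESC, TimeTaken ASC
""").fetchall()

print(mixed)  # alice is paired with a time she never achieved at that score
print(best)   # score and time come from the same game
```

In T-SQL the outer query would use TOP 50 to cut the list; the ranking logic itself should carry over unchanged.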
The cumulative metrics should be easier to calculate because you only have to aggregate the sum of their score and the sum of their time and then sort, which means the new query should most likely look something like this:
;with ResultSet (PlayerID, Score, TimeTaken)
AS
(
SELECT TOP 50
A.[PlayerID],
SUM(A.[Score]),
SUM(A.[TimeTaken])
FROM GameResult A
WHERE
A.[puzzleID] in(1,2,3)
GROUP BY
A.PlayerID
ORDER BY
SUM(A.[Score]) DESC,
SUM(A.[TimeTaken]) ASC
)
SELECT RSP.[PlayerID], RSP.[PlayerName], RSA.[Score], RSA.[TimeTaken]
FROM ResultSet RSA
INNER JOIN Player RSP WITH(NOLOCK)
ON RSA.PlayerID = RSP.PlayerID
ORDER BY
Score DESC,
TimeTaken ASC
UPDATE:
Based on the new criteria, you will have to do something like this.
;WITH ResultSet (PlayerID, PuzzleId, Score, TimeTaken, seq)
AS
(
SELECT
A.[PlayerID],
A.PuzzleID,
A.[Score],
A.[TimeTaken],
seq = ROW_NUMBER() over(PARTITION BY A.PlayerID, A.PuzzleID ORDER BY A.Score DESC, A.TimeTaken ASC, A.DateCreated ASC)
FROM GameResult A
WHERE
A.[puzzleID] in(1,2,3)
)
SELECT TOP 50
RSP.[PlayerID],
RSP.[PlayerName],
Score = SUM(RSA.[Score]), --total score
TimeTaken = SUM(RSA.[TimeTaken]) --total time taken
FROM ResultSet RSA
INNER JOIN Player RSP
ON RSA.PlayerID = RSP.PlayerID
WHERE
--this is used to filter the top score for each puzzle per player
seq = 1
GROUP BY
RSP.[PlayerID],
RSP.[PlayerName]
ORDER BY
SUM(RSA.Score) DESC,
SUM(RSA.TimeTaken) ASC

Related

Hive Script, DISTINCT with SUM

I am trying to find the distinct teams a player played for in any single season and count the number of teams he played for. This is tripping me up, and of course I have a sample down below (the second one). The first one is my failed attempt.
SELECT o.id,o.year,COUNT(DISTINCT(o.team)) b JOIN
(SELECT id, year, team FROM batting
GROUP BY id,year,team
ORDER BY id DESC
LIMIT 25) o
0.id =b.id;
SELECT id, year, team FROM batting
GROUP BY id,year,team
ORDER BY id DESC
LIMIT 25;
produces
Ignore the ^A; I think it represents either a space or a comma, just a column separator.
Get the count of teams for each player for each year, order by the count descending, and take the first row:
SELECT id, year, COUNT(DISTINCT(team)) FROM batting
GROUP BY id,year
ORDER BY COUNT(DISTINCT(team)) DESC
LIMIT 1;

TSQL - Sum of Top 3 records of multiple teams

I am trying to generate a TSQL query that will take the top 3 scores (out of about 50) for a group of teams, sum the total of just those 3 scores and give me a result set that has just the name of the team, and that total score ordered by the score descending. I'm pretty sure it is a nested query - but for the life of me can't get it to work!
Here are the specifics, there is only 1 table involved....
table = comp_lineup (this table holds a separate record for each athlete in a match)
* athlete
* team
* score
There are many athletes to a match - each one belongs to a team.
Example:
id  athlete  team  score
1   1        1     24
2   2        1     23
3   3        2     21
4   4        2     25
5   5        1     20
Thank You!
It is indeed a subquery, which I often put in a CTE instead just for clarity. The trick is the use of the rank() function.
;with RankedScores as (
select
id,
athlete,
team,
score,
rank() over (partition by team order by score desc) ScoreRank
from
#scores
)
select
Team,
sum(Score) TotalScore
from
RankedScores
where
ScoreRank <= 3
group by
team
order by
TotalScore desc
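One thing to watch with rank(): tied scores share a rank, so a team can contribute more than three rows when there is a tie for third place. The sketch below compares rank() with row_number() on the question's sample data plus one hypothetical extra row (id 6). It runs on SQLite through Python's sqlite3 module as a stand-in for SQL Server, which is an assumption on my part.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE comp_lineup (id INTEGER PRIMARY KEY, athlete INT, team INT, score INT);
INSERT INTO comp_lineup VALUES
    (1, 1, 1, 24), (2, 2, 1, 23), (3, 3, 2, 21), (4, 4, 2, 25), (5, 5, 1, 20),
    (6, 6, 1, 20);  -- hypothetical extra row: ties for third place on team 1
""")

def top3_totals(ranking_fn):
    """Sum each team's top 3 scores using the given ranking function name."""
    assert ranking_fn in ("RANK", "ROW_NUMBER")  # only these two are interpolated
    sql = f"""
        WITH RankedScores AS (
            SELECT team, score,
                   {ranking_fn}() OVER (PARTITION BY team ORDER BY score DESC) AS ScoreRank
            FROM comp_lineup
        )
        SELECT team, SUM(score) AS TotalScore
        FROM RankedScores
        WHERE ScoreRank <= 3
        GROUP BY team
        ORDER BY TotalScore DESC
    """
    return conn.execute(sql).fetchall()

with_rank = top3_totals("RANK")              # both 20s get rank 3, so 4 rows are summed
with_row_number = top3_totals("ROW_NUMBER")  # exactly 3 rows per team

print(with_rank)
print(with_row_number)
```

If exactly three scores per team are wanted, row_number() is probably the safer choice; rank() is fine when counting all tied scores is acceptable.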
To get the top n values for every group of data, a query template is
Select group_value, sum(value) total_value
From mytable ext
Where id in (Select top *n* id
From mytable sub
Where ext.group_value = sub.group_value
Order By value desc)
Group By group_value
The subquery retrieves only the IDs of the valid rows for the current group_value. The connection between the two datasets is the Where ext.group_value = sub.group_value part, and the WHERE in the main query masks every other ID, like a cursor.
For the specific question the template becomes
Select team, sum(score) total_score
From mytable ext
Where id in (Select top 3 id
From mytable sub
Where ext.team = sub.team
Order By score desc)
Group By team
Order By sum(score) Desc
with an Order By added to the main query to sort by descending total score.

Order by date different columns

I'm having a problem with a complex SELECT, so I hope some of you can help me out, because I'm really stuck with it... or maybe you can point me in a direction.
I have a table with the following columns:
score1, gamedate1, score2, gamedate2, score3, gamedate3
Basically I need to determine the ultimate winner of all the games, who got the SUMMED MAX score FIRST, based on the game times in ASCENDING order.
Assuming that the 1,2,3 are different players, something like this should work:
-- construct table as the_lotus suggests
WITH LotusTable AS
(
SELECT 'P1' AS Player, t.Score1 AS Score, t.GameDate1 as GameDate
FROM Tbl t
UNION ALL
SELECT 'P2' AS Player, t.Score2 AS Score, t.GameDate2 as GameDate
FROM Tbl t
UNION ALL
SELECT 'P3' AS Player, t.Score3 AS Score, t.GameDate3 as GameDate
FROM Tbl t
)
-- get running scores up through date for each player
, RunningScores AS
(
SELECT b.Player, b.GameDate, SUM(a.Score) AS Score
FROM LotusTable a
INNER JOIN LotusTable b -- self join
ON a.Player = b.Player
AND a.GameDate <= b.GameDate -- a is earlier dates
GROUP BY b.Player, b.GameDate
)
-- get max score for any player
, MaxScore AS
(
SELECT MAX(r.Score) AS Score
FROM RunningScores r
)
-- get min date for the max score
, MinGameDate AS
(
SELECT MIN(r.GameDate) AS GameDate
FROM RunningScores r
WHERE r.Score = (SELECT m.Score FROM MaxScore m)
)
-- get all players who got the max score on the min date
SELECT *
FROM RunningScores r
WHERE r.Score = (SELECT m.Score FROM MaxScore m)
AND r.GameDate = (SELECT d.GameDate FROM MinGameDate d)
;
There are more efficient ways of doing it; in particular, the self-join could be avoided.
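As one possible way to avoid the self-join, a windowed SUM can compute the running score directly. This is only a sketch: it assumes the data is already in the unpivoted (Player, Score, GameDate) shape, uses a hypothetical Games table with made-up rows, and runs on SQLite via Python's sqlite3 module rather than on SQL Server.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical table in the unpivoted shape: one row per player per game
CREATE TABLE Games (Player TEXT, Score INT, GameDate TEXT);
INSERT INTO Games VALUES
    ('P1', 10, '2024-01-01'), ('P1', 15, '2024-01-03'),
    ('P2', 20, '2024-01-02'), ('P2',  5, '2024-01-04'),
    ('P3',  8, '2024-01-01');
""")

winner = conn.execute("""
    WITH RunningScores AS (
        -- running total per player up to and including each game date
        SELECT Player, GameDate,
               SUM(Score) OVER (PARTITION BY Player ORDER BY GameDate) AS Score
        FROM Games
    ),
    MaxScore AS (
        SELECT MAX(Score) AS Score FROM RunningScores
    )
    SELECT r.Player, r.GameDate, r.Score
    FROM RunningScores r
    WHERE r.Score = (SELECT Score FROM MaxScore)
      AND r.GameDate = (SELECT MIN(r2.GameDate)
                        FROM RunningScores r2
                        WHERE r2.Score = (SELECT Score FROM MaxScore))
""").fetchall()

print(winner)  # P1 and P2 both reach 25, but P1 gets there first
```

The windowed SUM reads the table once per partition instead of joining every game to all earlier games, which is where the savings would come from.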
If each of your tables is set up with three columns (player_id, score1, time),
then you just need a simple query to sum the scores and group them by player_ID, as follows:
SELECT gamedata1.player_ID AS Player_ID,
SUM(gamedata1.score1 + COALESCE(gamedata2.score1, 0) + COALESCE(gamedata3.score1, 0)) AS Total_Score
FROM gamedata1
LEFT JOIN gamedata2 ON (gamedata1.player_ID = gamedata2.player_ID)
LEFT JOIN gamedata3 ON (gamedata1.player_ID = gamedata3.player_ID)
GROUP BY gamedata1.player_ID
ORDER BY MIN(gamedata1.time) ASC -- assumption: sort by each player's earliest time in gamedata1
Explanation:
You are essentially grouping by each player so that you get one distinct player per row, then summing their scores and ordering the data. The COALESCE calls keep a missing row in gamedata2 or gamedata3 from turning the whole sum into NULL. I treated "time" as a date type. This can of course be changed to any date type you prefer; the structure of the query stays the same.

Query optimization in Oracle SQL

Let's say I have an oracle database schema like so:
tournaments( id, name )
players( id, name )
gameinfo( id, pid (references players.id), tid (references tournaments.id), date)
So a row in the gameinfo table means that a certain player played a certain game in a tournament on a given date. Tournaments has about 20 records, players about 160 000 and game info about 2 million. I have to write a query which lists tournaments (with tid in the range of 1-4) and the number of players that played their first game ever in that tournament.
I came up with the following query:
select tid, count(pid)
from gameinfo g
where g.date = (select min(date) from gameinfo g1 where g1.pid = g.pid)
and g.tid in (1,2,3,4)
group by tid;
This is clearly suboptimal (it ran for about 58 minutes).
I had another idea, that I could make a view of:
select pid, tid, min(date)
from gameinfo
where tid in(1,2,3,4)
group by pid, tid;
And run my queries on this view, as it only had about 600 000 records, but this still seems less than optimal.
Can you give any advice on how this could be optimized ?
My first recommendation is to try analytic functions. The row_number() function will enumerate the games for each player in date order, so a player's first game has a seqnum of 1:
select gi.*
from (select gi.*,
row_number() over (partition by gi.pid order by gi.date) as seqnum
from gameinfo gi
) gi
where tid in(1,2,3,4) and seqnum = 1
My second suggestion is to put the date of the first tournament into the players table, since it seems like important information for using the database.
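To get the per-tournament counts the question asks for, the same row_number() idea can be wrapped in a GROUP BY. The sketch below is an illustration under assumptions: it uses SQLite through Python's sqlite3 module instead of Oracle, and the sample rows are made up. Note that the tid filter sits outside the subquery so that every game is considered when deciding which one was a player's first.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE gameinfo (id INTEGER PRIMARY KEY, pid INT, tid INT, date TEXT);
-- Hypothetical sample data
INSERT INTO gameinfo (pid, tid, date) VALUES
    (1, 1, '2024-01-01'), (1, 2, '2024-02-01'),  -- player 1 started in tournament 1
    (2, 2, '2024-01-05'), (2, 1, '2024-03-01'),  -- player 2 started in tournament 2
    (3, 5, '2024-01-01'), (3, 1, '2024-01-10'),  -- player 3 started in tournament 5
    (4, 1, '2024-01-02');                        -- player 4 started in tournament 1
""")

counts = conn.execute("""
    SELECT tid, COUNT(*) AS first_timers
    FROM (
        SELECT pid, tid,
               ROW_NUMBER() OVER (PARTITION BY pid ORDER BY date) AS seqnum
        FROM gameinfo
    ) AS first_games
    WHERE seqnum = 1          -- each player's first game ever
      AND tid IN (1, 2, 3, 4) -- filtered after ranking, not before
    GROUP BY tid
    ORDER BY tid
""").fetchall()

print(counts)  # player 3 is not counted for tournament 1
```

Moving the tid filter inside the subquery would change the meaning: player 3 would then be counted as a first-timer in tournament 1.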

Fetch one row per account id from list

I have a table with game scores, allowing multiple rows per account id: scores (id, score, accountid). I want a list of the top 10 scorer ids and their scores.
Can you provide an sql statement to select the top 10 scores, but only one score per account id?
Thanks!
select username, max(score) from usertable group by username order by max(score) desc limit 10;
First limit the selection to the highest score for each account id.
Then take the top ten scores.
SELECT TOP 10 AccountId, Score
FROM Scores s1
WHERE AccountId NOT IN
(SELECT s2.AccountId FROM Scores s2
WHERE s1.AccountId = s2.AccountId and s1.Score < s2.Score)
ORDER BY Score DESC
Try this:
select top 10 username,
max(score)
from usertable
group by username
order by max(score) desc
PostgreSQL has the DISTINCT ON clause, that works this way:
SELECT DISTINCT ON (accountid) id, score, accountid
FROM scoretable
ORDER BY score DESC
LIMIT 10;
I don't think it's standard SQL though, so expect other databases to do it differently.
SELECT accountid, MAX(score) as top_score
FROM Scores
GROUP BY accountid
ORDER BY top_score DESC
LIMIT 0, 10
That should work fine in mysql. It's possible you may need to use 'ORDER BY MAX(score) DESC' instead of that order by - I don't have my SQL reference on hand.
I believe that PostgreSQL (at least 8.3) requires the DISTINCT ON expressions to match the initial ORDER BY expressions, i.e. you can't use DISTINCT ON (accountid) when you have ORDER BY score DESC. To fix this, add it to the ORDER BY:
SELECT DISTINCT ON (accountid) *
FROM scoretable
ORDER BY accountid, score DESC
LIMIT 10;
Using this method allows you to select all the columns in a table. It will only return 1 row per accountid even if there are duplicate 'max' values for score.
This was useful for me, as I was not looking for the maximum score (which is easy to do with the max() function) but for the most recent time a score was entered for an accountid.