Order by date different columns - sql

I'm having a problem with a complex SELECT, so I hope some of you can help me out, because I'm really stuck with it... or maybe you can point me in a direction.
I have a table with the following columns:
score1, gamedate1, score2, gamedate2, score3, gamedate3
Basically I need to determine the ultimate winner of all the games, who got the SUMMED MAX score FIRST, based on the game times in ASCENDING order.

Assuming that the 1,2,3 are different players, something like this should work:
-- construct table as the_lotus suggests
WITH LotusTable AS
(
SELECT 'P1' AS Player, t.Score1 AS Score, t.GameDate1 as GameDate
FROM Tbl t
UNION ALL
SELECT 'P2' AS Player, t.Score2 AS Score, t.GameDate2 as GameDate
FROM Tbl t
UNION ALL
SELECT 'P3' AS Player, t.Score3 AS Score, t.GameDate3 as GameDate
FROM Tbl t
)
-- get running scores up through date for each player
, RunningScores AS
(
SELECT b.Player, b.GameDate, SUM(a.Score) AS Score
FROM LotusTable a
INNER JOIN LotusTable b -- self join
ON a.Player = b.Player
AND a.GameDate <= b.GameDate -- a is earlier dates
GROUP BY b.Player, b.GameDate
)
-- get max score for any player
, MaxScore AS
(
SELECT MAX(r.Score) AS Score
FROM RunningScores r
)
-- get min date for the max score
, MinGameDate AS
(
SELECT MIN(r.GameDate) AS GameDate
FROM RunningScores r
WHERE r.Score = (SELECT m.Score FROM MaxScore m)
)
-- get all players who got the max score on the min date
SELECT *
FROM RunningScores r
WHERE r.Score = (SELECT m.Score FROM MaxScore m)
AND r.GameDate = (SELECT d.GameDate FROM MinGameDate d)
;
There are more efficient ways of doing it; in particular, the self-join could be avoided.
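As a rough sketch of how the self-join could be avoided, a windowed SUM() can compute each player's running total directly. The example below runs the SQL through Python's built-in sqlite3 module purely so it is self-contained; the sample rows and the CTE name Unpivoted are assumptions for illustration, and it assumes an SQLite build with window-function support (3.25 or later). Unlike the query above, LIMIT 1 returns a single row even if two players reach the maximum on the same date.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Tbl (Score1 INT, GameDate1 TEXT,
                  Score2 INT, GameDate2 TEXT,
                  Score3 INT, GameDate3 TEXT);
-- made-up sample rows: one row per round, one score/date pair per player
INSERT INTO Tbl VALUES
  (10, '2020-01-01', 20, '2020-01-02', 5, '2020-01-03'),
  (20, '2020-01-04', 10, '2020-01-05', 5, '2020-01-06');
""")

winner = conn.execute("""
WITH Unpivoted AS (
    SELECT 'P1' AS Player, Score1 AS Score, GameDate1 AS GameDate FROM Tbl
    UNION ALL
    SELECT 'P2', Score2, GameDate2 FROM Tbl
    UNION ALL
    SELECT 'P3', Score3, GameDate3 FROM Tbl
),
RunningScores AS (
    -- running total per player in date order, no self-join needed
    SELECT Player, GameDate,
           SUM(Score) OVER (PARTITION BY Player ORDER BY GameDate) AS Score
    FROM Unpivoted
)
SELECT Player, GameDate, Score
FROM RunningScores
ORDER BY Score DESC, GameDate ASC   -- highest total first, earliest date wins
LIMIT 1
""").fetchone()

print(winner)
```

With this sample data, P1 and P2 both end on 30, but P1 gets there a day earlier.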

If each of your tables is set up with three columns (player_id, score1, time),
then you just need a simple query that sums the scores and groups them by player_ID, as follows:
SELECT gamedata1.player_ID as 'Player_ID',
sum(gamedata1.score1 + gamedata2.score1 + gamedata3.score1) as 'Total_Score'
FROM gamedata1
LEFT JOIN gamedata2 ON (gamedata1.player_ID = gamedata2.player_ID)
LEFT JOIN gamedata3 ON (gamedata1.player_ID = gamedata3.player_ID)
GROUP BY gamedata1.player_ID
ORDER BY MIN(gamedata1.time) ASC
Explanation:
You are essentially grouping by each player so you get one distinct player per row, then summing their scores and ordering the result. I used a date type for the "time" column. This can of course be changed to any date/time type you prefer; the structure of the query stays the same.

Related

Group By 2 Columns and Find Percentage of Group In SQL

I have a Game table with two columns TeamZeroScore and TeamOneScore. I would like to calculate the % of games that end with each score variance. The max score one team can have is 5.
I have got the following code which selects each team score with an additional 2 columns to have the max and min of these two values in order. I did this because I thought the next step is to group by these two columns
SELECT TOP (100000) [TeamOneScore],[TeamZeroScore],
(SELECT Max(v)
FROM (VALUES ([TeamOneScore]), ([TeamZeroScore])) AS value(v)) as [MaxScore],
(SELECT Min(v)
FROM (VALUES ([TeamOneScore]), ([TeamZeroScore])) AS value(v)) as [MinScore]
FROM [Database].[dbo].[Game]
Below is the sample data I have for the code above.
How do I produce something similar to this? I think I need to Group By MaxScore, MinScore and then use Count on each group to calculate the percentage based on the total.
Select
Count(*) as "number",
(100.0 * count(*)) / t
As "percentage",
TeamOneScore as MinScore,
TeamTwoScore as MaxScore
From
( Select
TeamOneScore,TeamTwoScore
From tablename
Where TeamOneScore <= TeamTwoScore
Union all
Select
TeamTwoScore,TeamOneScore
from tablename
Where TeamOneScore > TeamTwoScore
) a,
(Select count(*) as t
From tablename) b
Group by
TeamOneScore,
TeamTwoScore,
t
Order by
TeamOneScore,
TeamTwoScore;
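For what it's worth, the same idea (normalise each game to a min/max pair, group, then divide by the total) can be sketched without the UNION ALL by using CASE expressions. The snippet below runs through Python's sqlite3 module only to keep it self-contained; the four sample games are made up, and the column names follow the question's Game table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Game (TeamZeroScore INT, TeamOneScore INT);
-- made-up sample games
INSERT INTO Game VALUES (5, 3), (3, 5), (5, 0), (2, 5);
""")

rows = conn.execute("""
SELECT MinScore, MaxScore,
       COUNT(*) AS Games,
       100.0 * COUNT(*) / (SELECT COUNT(*) FROM Game) AS Pct
FROM (
    -- order each pair so that 5-3 and 3-5 land in the same group
    SELECT CASE WHEN TeamOneScore < TeamZeroScore
                THEN TeamOneScore ELSE TeamZeroScore END AS MinScore,
           CASE WHEN TeamOneScore < TeamZeroScore
                THEN TeamZeroScore ELSE TeamOneScore END AS MaxScore
    FROM Game
) AS pairs
GROUP BY MinScore, MaxScore
ORDER BY MinScore, MaxScore
""").fetchall()

for row in rows:
    print(row)
```

Multiplying by 100.0 rather than 100 keeps the division from truncating to an integer.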

Using MAX to compute MAX value in a subquery column

What I am trying to do: I have a table, "band_style" with schema (band_id, style).
One band_id may occur multiple times, listed with different styles.
I want ALL rows of band_id, NUM (where NUM is the number of different styles a band has) for the band ids with the SECOND MOST number of styles.
I have spent hours on this query- almost nothing seems to be working.
This is how far I got. The table (data) successfully computes all bands with styles less than the maximum value of band styles. Now, I need ALL rows that have the Max NUM for the resulting table. This will give me bands with the second most number of styles.
However, this final result seems to be ignoring the MAX function and just returning the table (data) as is. Can someone please provide some insight/working method? I have over 20 attempts of this query with this being the closest.
Using SQL*PLUS on Oracle
WITH data AS (
SELECT band_id, COUNT(*) AS NUM FROM band_style GROUP BY band_id HAVING COUNT(*) <
(SELECT MAX(c) FROM
(SELECT COUNT(band_id) AS c
FROM band_style
GROUP BY band_id)))
SELECT data.band_id, data.NUM FROM data
INNER JOIN ( SELECT band_id m, MAX(NUM) n
FROM data GROUP BY band_id
) t
ON t.m = data.band_id
AND t.n = data.NUM;
Something like this... based on a Comment under your post, you are looking for DENSE_RANK()
select band_id
from ( select band_id, dense_rank() over (order by count(style) desc) as drk
from band_style
group by band_id
)
where drk = 2;
I would use a windowing function (RANK() in this case), which is great for finding the nth-ranked item in a set.
SELECT DISTINCT bs.band_id
FROM band_style bs
WHERE EXISTS (
SELECT NULL
FROM (
SELECT
bs2.band_id,
bs2.num,
RANK() OVER (ORDER BY bs2.num DESC) AS numrank
FROM (
SELECT bs1.band_id, COUNT(*) as num
FROM band_style bs1
GROUP BY bs1.band_id ) bs2 ) bs3
WHERE bs.band_id = bs3.band_id
AND bs3.numrank = 2 )
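One thing worth checking when choosing between the two answers is how they behave when several bands tie for the most styles. The sketch below compares RANK() and DENSE_RANK() on made-up data; it is run through Python's sqlite3 module just to be self-contained (the question itself is on Oracle), and it assumes an SQLite build with window-function support.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE band_style (band_id INT, style TEXT);
-- made-up data: bands 1 and 2 tie on 3 styles, bands 3 and 5 have 2, band 4 has 1
INSERT INTO band_style VALUES
  (1, 'rock'), (1, 'pop'), (1, 'jazz'),
  (2, 'rock'), (2, 'folk'), (2, 'soul'),
  (3, 'rock'), (3, 'pop'),
  (4, 'punk'),
  (5, 'jazz'), (5, 'funk');
""")

ranked_sql = """
SELECT band_id, num,
       RANK()       OVER (ORDER BY num DESC) AS rk,
       DENSE_RANK() OVER (ORDER BY num DESC) AS drk
FROM (SELECT band_id, COUNT(*) AS num
      FROM band_style
      GROUP BY band_id) AS counts
"""

# DENSE_RANK leaves no gaps, so rank 2 is always the second-highest count
second_dense = conn.execute(
    "SELECT band_id, num FROM (" + ranked_sql + ") AS r WHERE drk = 2 ORDER BY band_id"
).fetchall()

# RANK skips a number after a tie, so here nothing has rank 2
second_rank = conn.execute(
    "SELECT band_id, num FROM (" + ranked_sql + ") AS r WHERE rk = 2 ORDER BY band_id"
).fetchall()

print(second_dense)
print(second_rank)
```

With a tie at the top, RANK() jumps from 1 straight to 3, so DENSE_RANK() is probably the safer choice for "second most".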

Pulling max values grouped by a variable with other columns in SQL

Say I have three columns in a very large table: a timestamp variable (last_time_started), a player name (Michael Jordan), and the team he was on the last time he started (Washington Wizards, Chicago Bulls), how do I pull the last time a player started, grouped by player, showing the team? For example:
if I did
select max(last_time_started), player, team
from table
group by 2
I would not know which team the player was on when he played his last game, which is important to me.
In Postgres the most efficient way is to use distinct on():
SELECT DISTINCT ON (player)
last_time_started,
player,
team
FROM the_table
ORDER BY player, last_time_started DESC;
Using a window function is usually the second fastest solution; a join with a derived table is usually the slowest alternative.
Here's a couple of ways to do this in Postgres:
With windowing functions:
SELECT last_time_started, player, team
FROM
(
SELECT
last_time_started,
player,
team,
CASE WHEN max(last_time_started) OVER (PARTITION BY PLAYER) = last_time_started then 'X' END as max_last_time_started
FROM table
) t
WHERE max_last_time_started = 'X';
Or with a correlated subquery:
SELECT last_time_started, player, team
FROM table t1
WHERE last_time_started = (SELECT max(last_time_started) FROM table WHERE table.player = t1.player);
Try this solution
select s.*
from table s
inner join (
select max(t.last_time_started) as last_time_started, t.player
from table t
group by t.player) v on s.player = v.player and s.last_time_started = v.last_time_started
This approach should also be faster, because it does not contain a join:
select v.last_time_started,
v.player,
v.team
from (
select t.last_time_started,
t.player,
t.team,
row_number() over (partition by t.player order by last_time_started desc) as n
from table t
) v
where v.n = 1
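Here is a small runnable sketch of the row_number() approach, in case it helps to see it against data. It uses Python's sqlite3 module rather than Postgres to stay self-contained; the table name starts, the second player, and all timestamps are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE starts (last_time_started TEXT, player TEXT, team TEXT);
-- illustrative rows only; the timestamps are invented
INSERT INTO starts VALUES
  ('1998-06-01', 'Michael Jordan', 'Chicago Bulls'),
  ('2003-04-01', 'Michael Jordan', 'Washington Wizards'),
  ('2001-03-10', 'Player B',       'Team X'),
  ('2000-11-05', 'Player B',       'Team Y');
""")

latest = conn.execute("""
SELECT v.last_time_started, v.player, v.team
FROM (
    SELECT t.last_time_started, t.player, t.team,
           -- number each player's rows from newest to oldest
           ROW_NUMBER() OVER (PARTITION BY t.player
                              ORDER BY t.last_time_started DESC) AS n
    FROM starts t
) v
WHERE v.n = 1
ORDER BY v.player
""").fetchall()

for row in latest:
    print(row)
```

Because the whole row is kept, the team shown is the one from the player's most recent start.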

Cumulative Game Score SQL

I have developed a game recently and the database is running on MSSQL.
Here is my database structure
Table : Player
PlayerID uniqueIdentifier (PK)
PlayerName nvarchar
Table : GameResult
ID bigint (PK - Auto Increment)
PlayerID uniqueIdentifier (FK)
DateCreated Datetime
Score int
TimeTaken bigint
PuzzleID int
I have written a query that lists the top 50 players, sorted by highest score (DESC) and time taken (ASC):
WITH ResultSet (PlayerID, Score, TimeTaken) AS(
SELECT DISTINCT(A.[PlayerID]), MAX(A.[Score]),MIN(A.[TimeTaken])
FROM GameResult A
WHERE A.[puzzleID] = @PuzzleID
GROUP BY A.[PlayerID])
SELECT TOP 50 RSP.[PlayerID], RSP.[PlayerName], RSA.[Score], RSA.[TimeTaken]
FROM ResultSet RSA
INNER JOIN Player RSP WITH(NOLOCK) ON RSA.PlayerID = RSP.PlayerID
ORDER By RSA.[Score] DESC, RSA.[timetaken] ASC
However, the query above only works for one puzzle.
Question
1) I need to modify the SQL to produce a cumulative ranking across 3 puzzle IDs, for example puzzles 1, 2 and 3, sorted by highest summed score (DESC) and summed time taken (ASC).
2) I also need an overall score ranking across all possible puzzles, 1 to 7.
3) Each player is only allowed to appear on the list once. The player who played first and was first to reach the highest score should be ranked 1st.
I tried using a CTE with UNION, but the SQL statement doesn't work.
I hope the gurus here can help me out with this. Much appreciated.
UPDATED WITH NEW SQL
Sql below allowed me to get the result for each puzzle id. I'm not sure if it is 100% but I believe it is correct.
;with ResultSet (PlayerID, maxScore, minTime, playedDate)
AS
(
SELECT TOP 50 PlayerID, MAX(score) as maxScore, MIN(timetaken) as minTime, MIN(datecreated) as playedDate
FROM gameresult
WHERE puzzleID = @PuzzleID
GROUP BY PlayerID
ORDER BY maxScore desc, minTime asc, playedDate asc
)
SELECT RSP.[PlayerID], RSP.[PlayerName], RSA.maxScore, RSA.minTime, RSA.PlayedDate
FROM ResultSet RSA
INNER JOIN Player RSP WITH(NOLOCK)
ON RSA.PlayerID = RSP.PlayerID
ORDER BY
maxScore DESC,
minTime ASC,
playedDate ASC
I would first like to point out that I do not believe your original query is correct. If you are looking for the best player for a particular puzzle, would that be the combination of the highest score plus the best time for that puzzle? If yes, using max and min does not guarantee that the max and min come from the same game (or row), which I believe should be a requirement. Instead you should have first determined the best game per player by using a row number windowing function. You can then do the top 50 sort off of that data.
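To make the MAX/MIN point concrete, here is a minimal sketch with one player and two made-up games, run through Python's sqlite3 module so it is self-contained (the question itself is on SQL Server). The cut-down GameResult table and its rows are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE GameResult (PlayerID TEXT, PuzzleID INT, Score INT, TimeTaken INT);
-- made-up games: a slow high score and a fast low score
INSERT INTO GameResult VALUES ('p1', 1, 100, 90), ('p1', 1, 60, 30);
""")

# MAX and MIN are computed independently, so the pair matches no real game
mixed = conn.execute("""
SELECT MAX(Score), MIN(TimeTaken)
FROM GameResult
WHERE PuzzleID = 1
GROUP BY PlayerID
""").fetchone()

# ROW_NUMBER picks one actual row: best score, fastest time as tie-breaker
best = conn.execute("""
SELECT Score, TimeTaken
FROM (
    SELECT Score, TimeTaken,
           ROW_NUMBER() OVER (PARTITION BY PlayerID
                              ORDER BY Score DESC, TimeTaken ASC) AS seq
    FROM GameResult
    WHERE PuzzleID = 1
) AS ranked
WHERE seq = 1
""").fetchone()

print(mixed)
print(best)
```

The aggregate version reports a score of 100 in 30 time units, a game that was never played; the windowed version returns the real best game.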
The cumulative metrics should be easier to calculate because you only have to aggregate the sum of their score and the sum of their time and then sort, which means the new query should most likely look something like this:
;with ResultSet (PlayerID, Score, TimeTaken)
AS
(
SELECT TOP 50
A.[PlayerID],
SUM(A.[Score]),
SUM(A.[TimeTaken])
FROM GameResult A
WHERE
A.[puzzleID] in(1,2,3)
GROUP BY
A.PlayerID
ORDER BY
SUM(A.[Score]) DESC,
SUM(A.[TimeTaken]) ASC
)
SELECT RSP.[PlayerID], RSP.[PlayerName], RSA.[Score], RSA.[TimeTaken]
FROM ResultSet RSA
INNER JOIN Player RSP WITH(NOLOCK)
ON RSA.PlayerID = RSP.PlayerID
ORDER BY
Score DESC,
TimeTaken ASC
UPDATE:
Based on the new criteria, you will have to do something like this.
;WITH ResultSet (PlayerID, PuzzleId, Score, TimeTaken, seq)
AS
(
SELECT
A.[PlayerID],
A.PuzzleID,
A.[Score],
A.[TimeTaken],
seq = ROW_NUMBER() over(PARTITION BY PlayerID, PuzzleId ORDER BY Score DESC)
FROM GameResult A
WHERE
A.[puzzleID] in(1,2,3)
)
SELECT TOP 50
RSP.[PlayerID],
RSP.[PlayerName],
Score = SUM(RSA.[Score]), --total score
TimeTaken = SUM(RSA.[TimeTaken]) --total time taken
FROM ResultSet RSA
INNER JOIN Player RSP
ON RSA.PlayerID = RSP.PlayerID
WHERE
--this is used to filter the top score for each puzzle per player
seq = 1
GROUP BY
RSP.[PlayerID],
RSP.[PlayerName]
ORDER BY
SUM(RSA.Score) DESC,
SUM(RSA.TimeTaken) ASC

SQL: max(count(x))

I have a table of baseball fielding statistics for a project. There are many fields on this table, but the ones I care about for this are playerID, pos (position), G (games).
This table is historical so it contains multiple rows per playerID (one for each year/pos). What I want to be able to do is return the position that a player played the most for his career.
First, what I imagine I have to do is count the games per position per playerID, then return the max of it. How can this be done in SQL? I am using SQL Server. On a side note, there may be a situation where there are ties; what would max do then?
If the player played in the same position over multiple teams over multiple games, I'd be more apt to use the sum() function, instead of count, in addition to using a group by statement, as a sub-query. See code for explanation.
SELECT playerID, pos, MAX( g_sum )
FROM (
SELECT DISTINCT playerID, pos, SUM( G ) as g_sum
FROM player_stats
GROUP BY playerID, pos
ORDER BY 3 DESC
) game_sums
GROUP BY playerID
It may not be the exact answer, but at least it's a decent starting point, and it worked on the lame testbed that I whipped up in 10 minutes.
As far as how max() acts with ties: It doesn't (as far as I can tell, at least). It's up to the actual GROUP BY statement itself, and where and how that max value shows up within the query or sub query.
If we were to include pos in the outer GROUP BY statement, in the event of a tie, it would show you both positions and the amount of games the player has played at said positions (which would be the same number). With it not in that GROUP BY statement, the query will go with the last given value for that column. So if position 2 showed up before position 3 in the sub query, the full query will show position 3 as the position that the player has played the most games in.
In SQL, I believe this will do it. Given that the same subquery is needed twice, I expect that doing this as a stored procedure would be more efficient.
SELECT MaxGamesInAnyPosition.playerID, GamesPerPosition.pos
FROM (
SELECT playerID, Max(totalGames) As maxGames
FROM (
SELECT playerID, pos, SUM(G) As totalGames
FROM tblStats
GROUP BY playerId, pos) Tallies
GROUP BY playerID) MaxGamesInAnyPosition
INNER JOIN (
SELECT playerID, pos, SUM(g) As totalGames
FROM tblStats
GROUP BY playerID, pos) GamesPerPosition
ON (MaxGamesInAnyPosition.playerID=GamesPerPosition.playerId
AND MaxGamesInAnyPosition.maxGames=GamesPerPosition.totalGames)
It does not look pretty, but it is a direct translation of what I built in LINQ to SQL. Give it a try and see if that's what you want:
SELECT [t2].[playerID], (
SELECT TOP (1) [t7].[pos]
FROM (
SELECT [t4].[playerID], [t4].[pos], (
SELECT COUNT(*)
FROM (
SELECT DISTINCT [t5].[G]
FROM [players] AS [t5]
WHERE ([t4].[playerID] = [t5].[playerID]) AND ([t4].[pos] = [t5].[pos])
) AS [t6]
) AS [value]
FROM (
SELECT [t3].[playerID], [t3].[pos]
FROM [players] AS [t3]
GROUP BY [t3].[playerID], [t3].[pos]
) AS [t4]
) AS [t7]
WHERE [t2].[playerID] = [t7].[playerID]
ORDER BY [t7].[value] DESC
) AS [pos]
FROM (
SELECT [t1].[playerID]
FROM (
SELECT [t0].[playerID]
FROM [players] AS [t0]
GROUP BY [t0].[playerID], [t0].[pos]
) AS [t1]
GROUP BY [t1].[playerID]
) AS [t2]
Here is a second answer, much better (I think) than my first kick at the can last night. Certainly much easier to read and understand.
SELECT playerID, pos
FROM (
SELECT playerID, pos, SUM(G) As totGames
FROM tblStats
GROUP BY playerID, pos) Totals
WHERE NOT (Totals.totGames < ANY(
SELECT SUM(G)
FROM tblStats
WHERE Totals.playerID=tblStats.playerID
GROUP BY playerID, pos))
The subquery ensures that a row is thrown out if its games total at that position is smaller than the number of games that player played at any other position.
In case of ties, the player in question will have all tied rows appear, as none of the tied records will be thrown out.
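As a rough way to check the tie behaviour, the sketch below gets the same result with RANK() partitioned by player instead of the `< ANY` comparison (a swapped-in technique, since the SQLite used here to keep the example self-contained does not support ANY). The sample rows are made up, and the table and column names follow the answer above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tblStats (playerID TEXT, pos TEXT, G INT);
-- made-up rows: p1 has a clear top position, p2 has a tie
INSERT INTO tblStats VALUES
  ('p1', 'SS', 60), ('p1', 'SS', 40), ('p1', '2B', 50),
  ('p2', '1B', 30), ('p2', 'LF', 30);
""")

most_played = conn.execute("""
SELECT playerID, pos
FROM (
    SELECT playerID, pos,
           -- tied totals share rank 1, so both tied positions are kept
           RANK() OVER (PARTITION BY playerID
                        ORDER BY totGames DESC) AS rk
    FROM (SELECT playerID, pos, SUM(G) AS totGames
          FROM tblStats
          GROUP BY playerID, pos) AS Totals
) AS Ranked
WHERE rk = 1
ORDER BY playerID, pos
""").fetchall()

for row in most_played:
    print(row)
```

Player p2 appears twice, once per tied position, which matches the behaviour described above.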