Finding entry with maximum date range between two columns in SQL - sql

Note: this is Oracle, not MySQL, so limit/top won't work.
I want to return the name of the person that has stayed the longest in a hotel. The longest stay can be found by subtracting the date in the checkout column with the checkin column.
So far I have:
select fans.name
from fans
where fans.checkout-fans.checkin is not null
order by fans.checkout-fans.checkin desc;
but this only orders each person's length of stay from highest to lowest. I want it to return only the name (or names, if they are tied) of the people who stayed the longest. Since more than one person could have stayed for the longest time, simply adding limit 1 to the end won't do.
Edit (for gbn): when I add a join to get checkin/checkout from another table, it won't work (no records returned).
Edit 2: solved now. The join below should have been players.team = teams.name
select
x.name
from
(
select
players.name,
dense_rank() over (order by teams.checkout-teams.checkin desc) as rnk
from
players
join teams
on players.name = teams.name
where
teams.checkout-teams.checkin is not null
) x
where
x.rnk = 1

Should be this using DENSE_RANK to get ties
select
x.name
from
(
select
fans.name,
dense_rank() over (order by fans.checkout-fans.checkin desc) as rnk
from
fans
where
fans.checkout-fans.checkin is not null
) x
where
x.rnk = 1;
SQL Server has TOP..WITH TIES for this, but this is a generic solution for any RDBMS that has DENSE_RANK.
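As a rough illustration of the DENSE_RANK approach, here is a minimal sketch run against an in-memory SQLite database through Python's sqlite3 module rather than Oracle. The sample rows are invented, and julianday() is an assumption standing in for Oracle's direct date subtraction, which SQLite does not have. Window functions need SQLite 3.25 or later.

```python
import sqlite3

# In-memory database with invented sample rows (not from the question).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE fans (name TEXT, checkin TEXT, checkout TEXT)")
conn.executemany(
    "INSERT INTO fans VALUES (?, ?, ?)",
    [
        ("Ann", "2024-01-01", "2024-01-08"),  # 7 nights
        ("Bob", "2024-02-01", "2024-02-08"),  # 7 nights, tied with Ann
        ("Cid", "2024-03-01", "2024-03-03"),  # 2 nights
        ("Dee", "2024-04-01", None),          # never checked out
    ],
)

# julianday() replaces Oracle's "checkout - checkin" date arithmetic.
longest_stays = [
    row[0]
    for row in conn.execute(
        """
        SELECT x.name
        FROM (
            SELECT name,
                   DENSE_RANK() OVER (
                       ORDER BY julianday(checkout) - julianday(checkin) DESC
                   ) AS rnk
            FROM fans
            WHERE checkout IS NOT NULL AND checkin IS NOT NULL
        ) x
        WHERE x.rnk = 1
        ORDER BY x.name
        """
    )
]
print(longest_stays)
```

Both tied guests come back because DENSE_RANK gives equal stays the same rank.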

Longest is a fuzzy word; you should first define what counts as long for you. Using limit may not be a solution for this case. Instead, you can define a threshold and filter your results, for instance where fans.checkout-fans.checkin > 10.

Try this:
select name, (checkout-checkin) AS stay
from fans
-- a column alias cannot be referenced in WHERE, so repeat the expression
where checkout-checkin is not null -- remove fans that never stayed at a hotel
order by stay desc;

For Oracle:
select * from
(
select fans.name
from fans
where fans.checkout-fans.checkin is not null
order by fans.checkout-fans.checkin desc)
where rownum=1

Another way that should work in all dbms (or almost all, at least those that support subqueries):
select fans.name
from fans
where fans.checkout-fans.checkin =
( select max(f.checkout-f.checkin)
from fans f
) ;

If both the columns are date fields you can use this query:
select fans.name from fans where fans.checkout-fans.checkin in (select max(fans.checkout-fans.checkin) from fans );
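The MAX subquery variant can be sketched the same way. This is a minimal example on an in-memory SQLite database via Python's sqlite3, with invented rows; julianday() is again an assumption replacing Oracle's date subtraction.

```python
import sqlite3

# Invented sample data; the table layout follows the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE fans (name TEXT, checkin TEXT, checkout TEXT)")
conn.executemany(
    "INSERT INTO fans VALUES (?, ?, ?)",
    [
        ("Ann", "2024-01-01", "2024-01-08"),  # 7 nights
        ("Bob", "2024-02-01", "2024-02-08"),  # 7 nights, tied with Ann
        ("Cid", "2024-03-01", "2024-03-03"),  # 2 nights
        ("Dee", "2024-04-01", None),          # NULL stay, never matches
    ],
)

# Keep every row whose stay equals the overall maximum stay.
# MAX ignores NULLs, and "NULL = value" is never true, so Dee drops out.
rows = conn.execute(
    """
    SELECT name
    FROM fans
    WHERE julianday(checkout) - julianday(checkin) = (
        SELECT MAX(julianday(f.checkout) - julianday(f.checkin))
        FROM fans f
    )
    ORDER BY name
    """
).fetchall()
names = [r[0] for r in rows]
print(names)
```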

Related

SQL: select max(A), B but don't want to group by or aggregate B

If I have a house with multiple rooms, but I want the color of the most recently created, I would say:
select house.house_id, house.street_name, max(room.create_date), room.color
from house, room
where house.house_id = room.house_id
and house.house_id = 5
group by house.house_id, house.street_name
But I get the error:
Column 'room.color' is invalid in the select list because it is not
contained in either an aggregate function or the GROUP BY clause.
If I say max(room.color), then sure, it will give me the max(color) along with the max(create_date), but I want the COLOR OF THE ROOM WITH THE MAX CREATE DATE.
just added the street_name because I do need to do the join, was just trying to simplify the query to clarify the question..
Expanding this to work for any number of houses (rather than only working for exactly one house)...
SELECT
house.*,
room.*
FROM
house
OUTER APPLY
(
SELECT TOP (1) room.create_date, room.color
FROM room
WHERE house.house_id = room.house_id
ORDER BY room.create_date DESC
)
AS room
One option is TOP 1 WITH TIES; it is also best to use an explicit JOIN.
If you want to see ties, change row_number() to dense_rank()
select top 1 with ties
house.house_id
,room.create_date
, room.color
from house
join room on house.house_id = room.house_id
Where house.house_id = 5
Order by row_number() over (partition by house.house_ID order by room.create_date desc)
In standard SQL, you would write this as:
select r.house_id, r.create_date, r.color
from room r
where r.house_id = 5
order by r.create_date desc
offset 0 row fetch first 1 row only;
Note that the house table is not needed. If you did need columns from it, then you would use join/on syntax.
Not all databases support the standard offset/fetch clause. You might need to use limit, select top or something else depending on your database.
The above works in SQL Server, but is probably more commonly written as:
select top (1) r.house_id, r.create_date, r.color
from room r
where r.house_id = 5
order by r.create_date desc;
You could order by create_date and then select top 1:
select top 1 house.house_id, room.create_date, room.color
from house, room
where house.house_id = room.house_id
and house.house_id = 5
order by room.create_date desc
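For databases that use LIMIT instead of TOP, the same "sort, then take the first row" idea can be sketched as follows. This is a minimal example on an in-memory SQLite database via Python's sqlite3; the street names, dates and colors are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE house (house_id INTEGER PRIMARY KEY, street_name TEXT);
    CREATE TABLE room (house_id INTEGER, create_date TEXT, color TEXT);
    INSERT INTO house VALUES (5, 'Elm Street'), (6, 'Oak Avenue');
    INSERT INTO room VALUES
        (5, '2020-01-01', 'red'),
        (5, '2021-06-15', 'blue'),
        (6, '2022-03-03', 'green');
    """
)

# Sort the rooms of house 5 newest first and keep only the first row,
# so the color comes from the same row as the latest create_date.
newest = conn.execute(
    """
    SELECT h.house_id, h.street_name, r.create_date, r.color
    FROM house h
    JOIN room r ON r.house_id = h.house_id
    WHERE h.house_id = 5
    ORDER BY r.create_date DESC
    LIMIT 1
    """
).fetchone()
print(newest)
```

No GROUP BY is involved, which is why the original error about room.color does not arise.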

Sql max trophy count

I created a database in SQL about basketball. My teacher gave me a task: I need to print out the basketball players from my database with the max trophy count. So I wrote this little bit of code:
select surname ,count(player_id) as trophy_count
from dbo.Players p
left join Trophies t on player_id=p.id
group by p.surname
and SQL gave me this:
but I want SQL to print only this:
I read about selects within selects (subqueries), but I don't know how they work. I tried, but it didn't work.
Use TOP:
SELECT TOP 1 surname, COUNT(player_id) AS trophy_count -- or TOP 1 WITH TIES
FROM dbo.Players p
LEFT JOIN Trophies t
ON t.player_id = p.id
GROUP BY p.surname
ORDER BY COUNT(player_id) DESC;
If you want to get all ties for the highest count, then use SELECT TOP 1 WITH TIES.
;WITH CTE AS
(
select p.surname, count(t.player_id) as trophy_count
from dbo.Players p
left join Trophies t on t.player_id = p.id
group by p.surname
)
select *
from CTE
where trophy_count = (select max(trophy_count) from CTE)
While select top with ties works (and is probably more efficient) I would say this code is probably more useful in the real world as it could be used to find the max, min or specific trophy count if needed with a very simple modification of the code.
This basically does your group by first, then lets you specify which results you want back. In this instance you can use
max(trophy_count) - get the maximum
min(trophy_count) - get the minimum
avg(trophy_count) - get the average trophy_count
or a literal, i.e. where trophy_count = 3, to get a specific trophy count.
There are many others. Google "SQL Aggregate functions"
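To make the "group first, then filter" idea above concrete, here is a minimal sketch on an in-memory SQLite database via Python's sqlite3. The surnames and trophy rows are invented, and surnames_with is a hypothetical helper name used only for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Players (id INTEGER PRIMARY KEY, surname TEXT);
    CREATE TABLE Trophies (id INTEGER PRIMARY KEY, player_id INTEGER);
    INSERT INTO Players VALUES (1, 'Jordan'), (2, 'Bird'), (3, 'Smith');
    INSERT INTO Trophies (player_id) VALUES (1), (1), (1), (2), (2), (2);
    """
)


def surnames_with(aggregate):
    """Return (surname, trophy_count) rows matching MAX or MIN of the counts."""
    # The aggregate name is interpolated into the SQL text, so whitelist it.
    if aggregate not in ("max", "min"):
        raise ValueError("aggregate must be 'max' or 'min'")
    sql = f"""
        WITH cte AS (
            SELECT p.surname, COUNT(t.player_id) AS trophy_count
            FROM Players p
            LEFT JOIN Trophies t ON t.player_id = p.id
            GROUP BY p.surname
        )
        SELECT surname, trophy_count
        FROM cte
        WHERE trophy_count = (SELECT {aggregate}(trophy_count) FROM cte)
        ORDER BY surname
    """
    return conn.execute(sql).fetchall()


top = surnames_with("max")
bottom = surnames_with("min")
print(top, bottom)
```

Swapping the aggregate is the only change needed to go from the most decorated players to the least decorated ones, and ties are kept in both cases.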
You will eventually go down the rabbit hole of needing to subsection this (for example by week or by league). Then you are going to want to use window functions with a CTE or subquery.
For your example:
;with cte_base as
(
-- Set your detail here (this step is only needed if you are looking at aggregates)
select p.surname, Count(t.player_id) Ct
from dbo.Players p
left join Trophies t on t.player_id = p.id
group by p.surname
)
, cte_ranked as
-- Dense_rank is chosen because of ties
-- Add a partition by (for example by league) to rank within each group
(
select *
, dr = DENSE_RANK() over (order by Ct desc)
from cte_base
)
select *
from cte_ranked
where dr = 1 -- Bring back only the #1 (of each partition, if one is added)
This is overkill by far, but it helps you lay the foundation for handling much more complicated queries. Tim Biegeleisen's answer is more than adequate to answer your question.

Unable to find Max Age of a Player

I am a newbie to SQL.
I want to find out which player is the oldest by age.
So here is my table..
Somehow my query gives an error.
Can you please tell me where I am going wrong?
Thanks.
select * from players
where age = (select max(age) as Oldest_Player from players);
limit 1
SQL Server has a SELECT TOP clause, which allows you to retrieve a set number of rows (MySQL uses LIMIT for the same purpose). You can do SELECT TOP 1 name AS 'Oldest Person' FROM players ORDER BY age DESC
What this will do is: first retrieve all the players, sort them by age descending (oldest first), then take the first one.
You can use row_number as below:
Select * from (
Select *, RowN = Row_Number() over(order by age desc) from Players
) a Where a.RowN = 1

Pulling max values grouped by a variable with other columns in SQL

Say I have three columns in a very large table: a timestamp variable (last_time_started), a player name (Michael Jordan), and the team he was on the last time he started (Washington Wizards, Chicago Bulls), how do I pull the last time a player started, grouped by player, showing the team? For example:
if I did
select max(last_time_started), player, team
from table
group by 2
I would not know which team the player was on when he played his last game, which is important to me.
In Postgres the most efficient way is to use distinct on():
SELECT DISTINCT ON (player)
last_time_started,
player,
team
FROM the_table
ORDER BY player, last_time_started DESC;
Using a window function is usually the second fastest solution; using a join with a derived table is usually the slowest alternative.
Here are a couple of ways to do this in Postgres:
With windowing functions:
SELECT last_time_started, player, team
FROM
(
SELECT
last_time_started,
player,
team,
CASE WHEN max(last_time_started) OVER (PARTITION BY PLAYER) = last_time_started then 'X' END as max_last_time_started
FROM table
) t
WHERE max_last_time_started = 'X';
Or with a correlated subquery:
SELECT last_time_started, player, team
FROM table t1
WHERE last_time_started = (SELECT max(last_time_started) FROM table WHERE table.player = t1.player);
Try this solution
select s.*
from table s
inner join (
select max(t.last_time_started) as last_time_started, t.player
from table t
group by t.player) v on s.player = v.player and s.last_time_started = v.last_time_started
This approach should also be faster, because it does not contain a join
select v.last_time_started,
v.player,
v.team
from (
select t.last_time_started,
t.player,
t.team,
row_number() over (partition by t.player order by last_time_started desc) as n
from table t
) v
where v.n = 1
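The ROW_NUMBER variant can be tried out with a minimal sketch on an in-memory SQLite database via Python's sqlite3 (window functions need SQLite 3.25 or later). The table name starts, the second player and all dates are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE starts (last_time_started TEXT, player TEXT, team TEXT);
    INSERT INTO starts VALUES
        ('1998-06-14', 'Michael Jordan', 'Chicago Bulls'),
        ('2003-04-16', 'Michael Jordan', 'Washington Wizards'),
        ('2001-05-01', 'Player B', 'Team B');
    """
)

# Number each player's rows newest first, then keep row 1 per player,
# so the team comes from the same row as the latest start.
latest = conn.execute(
    """
    SELECT v.last_time_started, v.player, v.team
    FROM (
        SELECT t.last_time_started, t.player, t.team,
               ROW_NUMBER() OVER (
                   PARTITION BY t.player
                   ORDER BY t.last_time_started DESC
               ) AS n
        FROM starts t
    ) v
    WHERE v.n = 1
    ORDER BY v.player
    """
).fetchall()
print(latest)
```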

SQL: max(count((x))

I have a table of baseball fielding statistics for a project. There are many fields on this table, but the ones I care about for this are playerID, pos (position), G (games).
This table is historical so it contains multiple rows per playerID (one for each year/pos). What I want to be able to do is return the position that a player played the most for his career.
First, what I imagine I have to do is count the games per position per playerID, then return the max of it. How can this be done in SQL? I am using SQL Server. On a side note, there may be a situation where there are ties; what would max do then?
If the player played in the same position over multiple teams over multiple games, I'd be more apt to use the sum() function, instead of count, in addition to using a group by statement, as a sub-query. See code for explanation.
SELECT playerID, pos, MAX( g_sum )
FROM (
SELECT DISTINCT playerID, pos, SUM( G ) as g_sum
FROM player_stats
GROUP BY playerID, pos
ORDER BY 3 DESC
) game_sums
GROUP BY playerID
It may not be the exact answer, at least it's a decent starting point and it worked on my lame testbed that I whipped up in 10 minutes.
As far as how max() acts with ties: It doesn't (as far as I can tell, at least). It's up to the actual GROUP BY statement itself, and where and how that max value shows up within the query or sub query.
If we were to include pos in the outer GROUP BY statement, in the event of a tie, it would show you both positions and the amount of games the player has played at said positions (which would be the same number). With it not in that GROUP BY statement, the query will go with the last given value for that column. So if position 2 showed up before position 3 in the sub query, the full query will show position 3 as the position that the player has played the most games in.
In SQL, I believe this will do it. Given that the same subquery is needed twice, I expect that doing this as a stored procedure would be more efficient.
SELECT MaxGamesInAnyPosition.playerID, GamesPerPosition.pos
FROM (
SELECT playerID, Max(totalGames) As maxGames
FROM (
SELECT playerID, pos, SUM(G) As totalGames
FROM tblStats
GROUP BY playerId, pos) Tallies
GROUP BY playerID) MaxGamesInAnyPosition
INNER JOIN (
SELECT playerID, pos, SUM(g) As totalGames
FROM tblStats
GROUP BY playerID, pos) GamesPerPosition
ON (MaxGamesInAnyPosition.playerID=GamesPerPosition.playerId
AND MaxGamesInAnyPosition.maxGames=GamesPerPosition.totalGames)
It does not look pretty, but it is a direct translation of what I built in LINQ to SQL. Give it a try and see if that's what you want:
SELECT [t2].[playerID], (
SELECT TOP (1) [t7].[pos]
FROM (
SELECT [t4].[playerID], [t4].[pos], (
SELECT COUNT(*)
FROM (
SELECT DISTINCT [t5].[G]
FROM [players] AS [t5]
WHERE ([t4].[playerID] = [t5].[playerID]) AND ([t4].[pos] = [t5].[pos])
) AS [t6]
) AS [value]
FROM (
SELECT [t3].[playerID], [t3].[pos]
FROM [players] AS [t3]
GROUP BY [t3].[playerID], [t3].[pos]
) AS [t4]
) AS [t7]
WHERE [t2].[playerID] = [t7].[playerID]
ORDER BY [t7].[value] DESC
) AS [pos]
FROM (
SELECT [t1].[playerID]
FROM (
SELECT [t0].[playerID]
FROM [players] AS [t0]
GROUP BY [t0].[playerID], [t0].[pos]
) AS [t1]
GROUP BY [t1].[playerID]
) AS [t2]
Here is a second answer, much better (I think) than my first kick at the can last night. Certainly much easier to read and understand.
SELECT playerID, pos
FROM (
SELECT playerID, pos, SUM(G) As totGames
FROM tblStats
GROUP BY playerID, pos) Totals
WHERE NOT (Totals.totGames < ANY(
SELECT SUM(G)
FROM tblStats
WHERE Totals.playerID=tblStats.playerID
GROUP BY playerID, pos))
The subquery ensures that a row is thrown out if its games total at that position is smaller than the number of games that player played at any other position.
In case of ties, the player in question will have all tied rows appear, as none of the tied records will be thrown out.
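A different way to get the same result, tied positions included, is to rank the per-position totals with DENSE_RANK. This is a swapped-in technique, not the ANY-based query above, shown as a minimal sketch on an in-memory SQLite database via Python's sqlite3 (SQLite 3.25 or later). The yearID column and all sample rows are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE tblStats (playerID TEXT, yearID INTEGER, pos TEXT, G INTEGER);
    INSERT INTO tblStats VALUES
        ('p1', 2001, 'SS', 100),
        ('p1', 2002, 'SS', 50),
        ('p1', 2002, '2B', 120),
        ('p2', 2001, 'C', 80),
        ('p2', 2002, '1B', 80);
    """
)

# Step 1: total games per player and position.
# Step 2: rank those totals within each player; rank 1 keeps all ties.
most_played = conn.execute(
    """
    SELECT playerID, pos, totGames
    FROM (
        SELECT playerID, pos, totGames,
               DENSE_RANK() OVER (
                   PARTITION BY playerID ORDER BY totGames DESC
               ) AS rnk
        FROM (
            SELECT playerID, pos, SUM(G) AS totGames
            FROM tblStats
            GROUP BY playerID, pos
        ) totals
    ) ranked
    WHERE rnk = 1
    ORDER BY playerID, pos
    """
).fetchall()
print(most_played)
```

Player p1 gets a single row, while p2 gets one row per tied position.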