SQL: select max(A), B but don't want to group by or aggregate B

If I have a house with multiple rooms, but I want the color of the most recently created room, I would say:
select house.house_id, house.street_name, max(room.create_date), room.color
from house, room
where house.house_id = room.house_id
and house.house_id = 5
group by house.house_id, house.street_name
But I get the error:
Column 'room.color' is invalid in the select list because it is not
contained in either an aggregate function or the GROUP BY clause.
If I say max(room.color), then sure, it will give me the max(color) along with the max(create_date), but I want the COLOR OF THE ROOM WITH THE MAX CREATE DATE.
I just added street_name because I do need the join; I was only trying to simplify the query to clarify the question.

Expanding this to work for any number of houses (rather than only working for exactly one house)...
SELECT
house.*,
room.*
FROM
house
OUTER APPLY
(
SELECT TOP (1) room.create_date, room.color
FROM room
WHERE house.house_id = room.house_id
ORDER BY room.create_date DESC
)
AS room
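If your engine has no OUTER APPLY, a similar "latest room per house" result can be had with ROW_NUMBER() over a LEFT JOIN. This is a different technique from the APPLY query above, not a translation of it. A minimal sketch, assuming SQLite through Python's sqlite3 module; the sample rows and street names are made up for illustration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE house (house_id INTEGER PRIMARY KEY, street_name TEXT);
    CREATE TABLE room  (house_id INTEGER, create_date TEXT, color TEXT);
    -- Hypothetical sample data; house 7 has no rooms at all.
    INSERT INTO house VALUES (5, 'Elm St'), (6, 'Oak Ave'), (7, 'Pine Rd');
    INSERT INTO room VALUES
        (5, '2020-01-01', 'red'),
        (5, '2021-06-15', 'blue'),
        (6, '2019-03-10', 'green');
""")

# Number each house's rooms from newest to oldest, then keep number 1.
# The LEFT JOIN keeps houses without rooms, like OUTER APPLY does.
latest_rooms = conn.execute("""
    SELECT house_id, street_name, create_date, color
    FROM (
        SELECT h.house_id, h.street_name, r.create_date, r.color,
               ROW_NUMBER() OVER (
                   PARTITION BY h.house_id
                   ORDER BY r.create_date DESC
               ) AS rn
        FROM house h
        LEFT JOIN room r ON r.house_id = h.house_id
    ) AS ranked
    WHERE rn = 1
    ORDER BY house_id
""").fetchall()

for row in latest_rooms:
    print(row)
```

House 7 comes back with NULL (None) for create_date and color, which is the same behaviour you get from OUTER APPLY as opposed to CROSS APPLY.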

One option is TOP 1 WITH TIES; it is also best to use an explicit JOIN.
If you want to see ties, change row_number() to dense_rank().
select top 1 with ties
house.house_id
,room.create_date
, room.color
from house
join room on house.house_id = room.house_id
Where house.house_id = 5
Order by row_number() over (partition by house.house_ID order by room.create_date desc)

In standard SQL, you would write this as:
select r.house_id, r.create_date, r.color
from room r
where r.house_id = 5
order by r.create_date desc
offset 0 row fetch first 1 row only;
Note that the house table is not needed. If you did need columns from it, then you would use join/on syntax.
Not all databases support the standard offset/fetch clause. You might need to use limit, select top or something else depending on your database.
The above works in SQL Server, but is probably more commonly written as:
select top (1) r.house_id, r.create_date, r.color
from room r
where r.house_id = 5
order by r.create_date desc;
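For databases that use LIMIT instead of TOP or FETCH FIRST, the same idea looks like this. A minimal sketch, assuming SQLite via Python's sqlite3 module; the sample rows are invented:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE room (house_id INTEGER, create_date TEXT, color TEXT);
    -- Hypothetical sample data.
    INSERT INTO room VALUES
        (5, '2020-01-01', 'red'),
        (5, '2021-06-15', 'blue'),
        (6, '2022-02-02', 'green');
""")

# Sort the house's rooms newest first and keep only the first row.
newest = conn.execute("""
    SELECT r.house_id, r.create_date, r.color
    FROM room r
    WHERE r.house_id = ?
    ORDER BY r.create_date DESC
    LIMIT 1
""", (5,)).fetchone()

print(newest)
```

Note that this returns exactly one row even if two rooms share the latest create_date.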

You could order by create_date and then select top 1:
select top 1 house.house_id, room.create_date, room.color
from house, room
where house.house_id = room.house_id
and house.house_id = 5
order by room.create_date desc

Related

How do i get only the max value of a column SQL

i'm using this code
SELECT S.id_usuario, C.cnt
FROM promos_usuarios S
INNER JOIN (SELECT id_usuario, count(id_usuario) as cnt
FROM promos_usuarios
GROUP BY id_usuario) C ON S.id_usuario = c.id_usuario
After running this code, I get this:
(table)
I only need the max value of the column cnt, not the entire count. What can I do to make this work? I'm sorry if this doesn't make any sense; English isn't my main language.
You can aggregate once, order by the count, and keep only the first row:
SELECT id_usuario, COUNT(*) as cnt
FROM promos_usuarios
GROUP BY id_usuario
ORDER BY COUNT(*) DESC
FETCH FIRST 1 ROW ONLY;
Not all databases support FETCH FIRST. Your database may use SELECT TOP or LIMIT or something else.
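As an illustration of the LIMIT spelling, here is a minimal sketch assuming SQLite via Python's sqlite3 module. The promo column and the sample rows are hypothetical; only id_usuario comes from the question:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE promos_usuarios (id_usuario INTEGER, promo TEXT);
    -- Hypothetical sample data: user 1 has 3 promos, user 3 has 2, user 2 has 1.
    INSERT INTO promos_usuarios VALUES
        (1, 'a'), (1, 'b'), (1, 'c'),
        (2, 'a'),
        (3, 'a'), (3, 'b');
""")

# Count rows per user, sort by that count, keep the largest.
top_user = conn.execute("""
    SELECT id_usuario, COUNT(*) AS cnt
    FROM promos_usuarios
    GROUP BY id_usuario
    ORDER BY cnt DESC
    LIMIT 1
""").fetchone()

print(top_user)
```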

Sql max trophy count

I created a database in SQL about basketball. My teacher gave me a task: I need to print the basketball players from my database with the max trophy count. So I wrote this bit of code:
select surname ,count(player_id) as trophy_count
from dbo.Players p
left join Trophies t on player_id=p.id
group by p.surname
and SQL gave me this:
but I want SQL to print only this:
I read about selects within selects, but I don't know how they work. I tried, but it didn't work.
Use TOP:
SELECT TOP 1 surname, COUNT(player_id) AS trophy_count -- or TOP 1 WITH TIES
FROM dbo.Players p
LEFT JOIN Trophies t
ON t.player_id = p.id
GROUP BY p.surname
ORDER BY COUNT(player_id) DESC;
If you want to get all ties for the highest count, then use SELECT TOP 1 WITH TIES.
;WITH CTE AS
(
select p.surname, count(t.player_id) as trophy_count
from dbo.Players p
left join Trophies t on t.player_id = p.id
group by p.surname
)
select *
from CTE
where trophy_count = (select max(trophy_count) from CTE)
While select top with ties works (and is probably more efficient), I would say this code is probably more useful in the real world, as it can be used to find the max, min or a specific trophy count with a very simple modification.
This basically does your group by first, then lets you specify which results you want back. In this instance you can use
max(trophy_count) - get the maximum
min(trophy_count) - get the minimum
avg(trophy_count) - get the average trophy_count
a literal, i.e. where trophy_count = 3 - get a specific trophy count
There are many others. Google "SQL Aggregate functions"
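To make the "group first, then pick what you want" idea concrete, here is a minimal sketch assuming SQLite via Python's sqlite3 module. The players, the title column and the helper name players_with are all made up for illustration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Players  (id INTEGER PRIMARY KEY, surname TEXT);
    CREATE TABLE Trophies (player_id INTEGER, title TEXT);
    -- Hypothetical sample data: two players tie on 3 trophies, Smith has none.
    INSERT INTO Players VALUES (1, 'Jordan'), (2, 'Bird'), (3, 'Johnson'), (4, 'Smith');
    INSERT INTO Trophies VALUES
        (1, 'MVP'), (1, 'Finals'), (1, 'AllStar'),
        (2, 'MVP'), (2, 'Finals'), (2, 'AllStar'),
        (3, 'MVP');
""")

def players_with(agg):
    """Return players whose trophy count equals MAX or MIN of all counts."""
    # Only names from this fixed whitelist are interpolated into the SQL text.
    if agg not in ("MAX", "MIN"):
        raise ValueError(agg)
    return conn.execute(f"""
        WITH cte AS (
            SELECT p.surname, COUNT(t.player_id) AS trophy_count
            FROM Players p
            LEFT JOIN Trophies t ON t.player_id = p.id
            GROUP BY p.surname
        )
        SELECT surname, trophy_count
        FROM cte
        WHERE trophy_count = (SELECT {agg}(trophy_count) FROM cte)
        ORDER BY surname
    """).fetchall()

top = players_with("MAX")     # both tied leaders come back
bottom = players_with("MIN")  # the player with no trophies
print(top, bottom)
```

Because the filter compares against a scalar, ties are returned without any extra work, and COUNT(t.player_id) gives 0 rather than 1 for a player with no trophies.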
You will eventually go down the rabbit hole of needing to break this into subsections (for example by week or by league). Then you will want to use window functions with a CTE or subquery.
For your example:
;with cte_base as
(
-- Set your detail here (this step is only needed if you are looking at aggregates)
select p.surname, count(t.player_id) as Ct
from dbo.Players p
left join Trophies t on t.player_id = p.id
group by p.surname
)
, cte_ranked as
-- Dense_rank is chosen because of ties
-- Add a PARTITION BY (for example by league) to rank within each subsection
(
select *
, dr = DENSE_RANK() over (order by Ct desc)
from cte_base
)
select *
from cte_ranked
where dr = 1 -- Bring back only the #1 (of each partition, if you add one)
This is by far overkill, but it helps lay the foundation for handling much more complicated queries. Tim Biegeleisen's answer is more than adequate to answer your question.

How to do order by and group by in this scenario?

I need to select TOP 3 Sports based on TotalUsers by DESC and group them by Individual Sports.
What I've done till now is
SELECT *
FROM (
SELECT R.Sports, R.RoomID ,R.Name,
COUNT(C.ChatUserLogId) AS TotalUsers,
ROW_NUMBER()
OVER (PARTITION BY R.SPORTS ORDER BY R.SPORTS DESC ) AS Rank
FROM Room R JOIN ChatUserLog C
ON R.RoomID = C.RoomId
GROUP BY
R.RoomID,
R.Name,
R.Sports
) rs WHERE Rank IN (1, 2, 3)
ORDER BY Sports, TotalUsers DESC
Below is the output of the SQL
Sports RoomID Name TotalUsers Rank
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Aerobics 6670 Aerobic vs. Anaerobic Exercise: Which is Best to Burn more Fat? 17 1
Aerobics 7922 Is it okay to be fat if you’re fit? 13 2
Aerobics 6669 What is the best time of the day to do an aerobic work out? 7 3
Archery 7924 Who were the best archers in history? 8 1
Archery 7925 Should I get into shooting or archery? 7 2
Archery 7926 What advantages, if any, do arrows have over bullets? 6 3
Badminton 6678 Which is more challenging, physically and mentally: badminton or tennis? 9 1
Badminton 6677 Who is your favorite - Lee chong wei or Lin dan? 8 2
Badminton 6794 Which single athlete most changed the sport? 7 3
Billiards  6691 How to get great at billiards? 34 1
Billiards  6692 Why is Efren Reyes the greatest billiards and pool player of all time? 31 2
Boxing 6697 Mike Tyson: The greatest heavyweight of all time? 13 1
Boxing 6700 Who is considered the greatest boxer of all time? Why? 13 2
Boxing 6699 What is the greatest, most exciting boxing fight of all-time? 12 3
But my query does not solve my requirement. I need the output something like below. The below output selects the TotalUsers and groups them by Sports.
Sports TotalUsers
-----------------------
Billiards 34
Billiards 31
Aerobics 17
Aerobics 13
Aerobics 7
Boxing 13
Boxing 13
Boxing 12
Any help is appreciated.
Your code looks very close, but there appear to be three issues.
Over clause
There looks to be an error in your OVER clause:
ROW_NUMBER() OVER(PARTITION BY R.SPORTS ORDER BY R.SPORTS DESC)
The PARTITION BY statement is correct in restarting the ranking for each partition. However, within each partition you are ordering by the partition criteria, which is nondeterministic (R.SPORTS will necessarily be equal for each value in the partition, so ORDER BY will have no effect). What you want, instead, is to order by the total users. The statement is then:
ROW_NUMBER() OVER(PARTITION BY R.SPORTS ORDER BY COUNT(C.CHATUSERLOGID) DESC)
(You can also use RANK() in place of ROW_NUMBER() if you want rooms with an equal number of users to have the same ranking.)
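To see the effect of ordering by the user count inside each partition, here is a minimal sketch assuming SQLite via Python's sqlite3 module. The counts are computed in a first CTE and ranked in a second step, which is a choice made for this sketch rather than the form used in the answer; the rooms and log rows are invented:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Room (RoomID INTEGER PRIMARY KEY, Name TEXT, Sports TEXT);
    CREATE TABLE ChatUserLog (ChatUserLogId INTEGER PRIMARY KEY, RoomId INTEGER);
    -- Hypothetical sample rooms.
    INSERT INTO Room VALUES
        (1, 'A1', 'Aerobics'), (2, 'A2', 'Aerobics'), (3, 'B1', 'Boxing');
""")
# Room 1 gets 2 log rows, room 2 gets 3, room 3 gets 1.
conn.executemany(
    "INSERT INTO ChatUserLog (RoomId) VALUES (?)",
    [(1,), (1,), (2,), (2,), (2,), (3,)],
)

ranked = conn.execute("""
    WITH counts AS (
        SELECT R.Sports, R.RoomID, COUNT(C.ChatUserLogId) AS TotalUsers
        FROM Room R
        JOIN ChatUserLog C ON R.RoomID = C.RoomId
        GROUP BY R.Sports, R.RoomID
    )
    SELECT Sports, RoomID, TotalUsers,
           -- Restart numbering per sport; busiest room gets 1.
           ROW_NUMBER() OVER (
               PARTITION BY Sports
               ORDER BY TotalUsers DESC
           ) AS rn
    FROM counts
    ORDER BY Sports, rn
""").fetchall()

for row in ranked:
    print(row)
```

Within Aerobics, room 2 (3 users) is now ranked ahead of room 1 (2 users), which the original ORDER BY R.SPORTS could not guarantee.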
Final query ordering
The question indicates you are seeking to order the result set as follows:
First, by sport; sports should be ordered by the largest room within that category
Second, the top 3 rooms for each sport in descending order of size
The first criterion requires a new column in your inner select statement: for each room, what was the highest number of users for any room for that sport? This can be written as:
MAX(COUNT(C.CHATUSERLOGID)) OVER (PARTITION BY R.SPORTS) MaxSportsUsers
With this column available, you can order by MaxSportsUsers descending followed by Rank ascending.
Limiting to top 3 sports: a problem arises
The question solution indicates you only want the top three sports, ranked by the number of users in its top room. Thus, you need to do a ranking of the form:
RANK() OVER (PARTITION BY CATEGORY ORDER BY MAX(COUNT(USERID)) OVER (PARTITION BY CATEGORY)) CategoryTop
But SQL Server does not support this, and attempting it will raise the error "Windowed functions cannot be used in the context of another windowed function or aggregate".
There are a few alternatives. As one, note that if we run SELECT TOP 3 SPORTS, MAX(TotalUsers) MaxUsers FROM RS GROUP BY SPORTS ORDER BY 2 DESC against the inner query (rs), the query will produce the top three sports and their highest user counts. Joining these records against RS on Sports will limit the final output to the top three sports.
This approach requires RS to be referenced from an inner join. To do so, it's necessary to convert the nested query (SELECT * FROM (SELECT...) rs) to Common Table Expression form (WITH RS AS (SELECT...) SELECT * FROM RS). This allows a query of the form WITH RS AS (SELECT...) SELECT * FROM RS JOIN (SELECT... FROM RS) R2...
Once the query is in CTE format, we can join on the CTE query, i.e., INNER JOIN (SELECT TOP 3 SPORTS, MAX(TOTALUSERS) MaxSportsUsers FROM RS GROUP BY SPORTS ORDER BY 2 DESC) RS2 ON RS2.SPORTS = RS.SPORTS, keeping the ORDER BY clause the same. The inner join will limit the final dataset to the top 3 sports.
With the MaxSportsUsers column moved to the inner join, it can be removed from RS (formerly the inner query).
Final query
Combining the above, we get the final query:
WITH RS AS
(
SELECT R.Sports, R.RoomID ,R.Name,
COUNT(C.ChatUserLogId) AS TotalUsers,
ROW_NUMBER() OVER (PARTITION BY R.SPORTS ORDER BY COUNT(C.ChatUserLogId) DESC ) AS Rank
FROM Room R
JOIN ChatUserLog C ON R.RoomID = C.RoomId
GROUP BY R.RoomID, R.Name, R.Sports
)
SELECT rs.Sports, rs.TotalUsers
FROM rs
INNER JOIN (
SELECT TOP 3 SPORTS, MAX(TOTALUSERS) MaxSportsUsers FROM RS GROUP BY SPORTS ORDER BY 2 DESC
) RS2 ON RS2.SPORTS = RS.SPORTS
WHERE Rank IN (1, 2, 3)
ORDER BY MaxSportsUsers DESC, RANK;
From the description of your desired data, you appear to only want to select two columns from the subquery:
SELECT rs.Sports, rs.TotalUsers
FROM (SELECT R.Sports, R.RoomID ,R.Name,
COUNT(C.ChatUserLogId) AS TotalUsers,
ROW_NUMBER() OVER (PARTITION BY R.SPORTS ORDER BY R.SPORTS DESC ) AS Rank
FROM Room R JOIN
ChatUserLog C
ON R.RoomID = C.RoomId
GROUP BY R.RoomID, R.Name, R.Sports
) rs
WHERE Rank IN (1, 2, 3)
ORDER BY Sports, TotalUsers DESC;
The only change is that the outer query selects the two columns you want.
If you want the top 3, start by getting the top 3. Something like this:
with top3Sports as (
select top 3 sports, count(chatUserLogId) users
from room r join chatUserLog c on r.roomId = c.roomId
group by sports
order by count(chatUserLogId) desc
)
select the fields you need
from top3Sports join other tables etc
It's a lot simpler than the approach you tried. Bear in mind, however, that no matter what approach you take, ties will mess you up.

Finding entry with maximum date range between two columns in SQL

Note: this is in Oracle not MySQL, limit/top won't work.
I want to return the name of the person that has stayed the longest in a hotel. The longest stay can be found by subtracting the date in the checkout column with the checkin column.
So far I have:
select fans.name
from fans
where fans.checkout-fans.checkin is not null
order by fans.checkout-fans.checkin desc;
but this only orders the length of stay of each person from highest to lowest. I want it to return only the name (or names, if they are tied) of the people who have stayed the longest. Since more than one person could have stayed for the longest time, simply adding limit 1 to the end won't do.
Edit (for gbn): when adding a join to get checkin/checkout from another table, it won't work (no records returned).
Edit 2: solved now; the join below should have been players.team = teams.name.
select
x.name
from
(
select
players.name,
dense_rank() over (order by teams.checkout-teams.checkin desc) as rnk
from
players
join teams
on players.name = teams.name
where
teams.checkout-teams.checkin is not null
) x
where
x.rnk = 1
Should be this using DENSE_RANK to get ties
select
x.name
from
(
select
fans.name,
dense_rank() over (order by fans.checkout-fans.checkin desc) as rnk
from
fans
where
fans.checkout-fans.checkin is not null
) x
where
x.rnk = 1;
SQL Server has TOP..WITH TIES for this, but this is a generic solution for any RDBMS that has DENSE_RANK.
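To check that DENSE_RANK really keeps tied stays, here is a minimal sketch assuming SQLite via Python's sqlite3 module. SQLite has no date subtraction operator, so the sketch uses julianday() differences in place of Oracle's checkout-checkin; the names and dates are invented:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE fans (name TEXT, checkin TEXT, checkout TEXT);
    -- Hypothetical sample data: Ann and Bob both stay 7 days, Dee never checks out.
    INSERT INTO fans VALUES
        ('Ann', '2024-01-01', '2024-01-08'),
        ('Bob', '2024-02-01', '2024-02-08'),
        ('Cy',  '2024-03-01', '2024-03-03'),
        ('Dee', '2024-04-01', NULL);
""")

# Rank stays from longest to shortest; equal stays share a rank.
rows = conn.execute("""
    SELECT x.name
    FROM (
        SELECT name,
               DENSE_RANK() OVER (
                   ORDER BY julianday(checkout) - julianday(checkin) DESC
               ) AS rnk
        FROM fans
        WHERE checkin IS NOT NULL AND checkout IS NOT NULL
    ) x
    WHERE x.rnk = 1
    ORDER BY x.name
""").fetchall()

longest = [name for (name,) in rows]
print(longest)
```

Both tied guests come back, which a plain "first row only" filter would not give you.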
Longest is a fuzzy word; you should first define what counts as long for you. Using limit may not be a solution in this case. Instead, you can define a threshold and filter your results, for instance where fans.checkout-fans.checkin > 10.
Try this:
select name, (checkout-checkin) AS stay
from fans
where checkout-checkin is not null -- remove fans that never stayed at a hotel
order by stay desc;
For Oracle:
select * from
(
select fans.name
from fans
where fans.checkout-fans.checkin is not null
order by fans.checkout-fans.checkin desc)
where rownum=1
Another way that should work in all dbms (or almost all, at least those that support subqueries):
select fans.name
from fans
where fans.checkout-fans.checkin =
( select max(f.checkout-f.checkin)
from fans f
) ;
If both the columns are date fields you can use this query:
select fans.name from fans where fans.checkout-fans.checkin in (select max(fans.checkout-fans.checkin) from fans );

SQL query to select distinct row with minimum value

I want an SQL statement to get the row with a minimum value.
Consider this table:
id game point
1 x 5
1 z 4
2 y 6
3 x 2
3 y 5
3 z 8
How do I select the ids that have the minimum value in the point column, grouped by game? Like the following:
id game point
1 z 4
2 y 5
3 x 2
Use:
SELECT tbl.*
FROM TableName tbl
INNER JOIN
(
SELECT Id, MIN(Point) MinPoint
FROM TableName
GROUP BY Id
) tbl1
ON tbl1.id = tbl.id
WHERE tbl1.MinPoint = tbl.Point
This is another way of doing the same thing, which would allow you to do interesting things like select the top 5 winning games, etc.
SELECT *
FROM
(
SELECT ROW_NUMBER() OVER (PARTITION BY ID ORDER BY Point) as RowNum, *
FROM Table
) X
WHERE RowNum = 1
You can now correctly get the actual row that was identified as the one with the lowest score and you can modify the ordering function to use multiple criteria, such as "Show me the earliest game which had the smallest score", etc.
This will work
select * from table
where (id,point) IN (select id,min(point) from table group by id);
As this is tagged with sql only, the following is using ANSI SQL and a window function:
select id, game, point
from (
select id, game, point,
row_number() over (partition by game order by point) as rn
from games
) t
where rn = 1;
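Running that window-function query against the sample table from the question gives one row per game. A minimal sketch, assuming SQLite via Python's sqlite3 module and the table name games used above:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE games (id INTEGER, game TEXT, point INTEGER);
    -- The sample rows from the question.
    INSERT INTO games VALUES
        (1, 'x', 5), (1, 'z', 4), (2, 'y', 6),
        (3, 'x', 2), (3, 'y', 5), (3, 'z', 8);
""")

# Number rows within each game by ascending point; row 1 is the minimum.
best = conn.execute("""
    SELECT id, game, point
    FROM (
        SELECT id, game, point,
               ROW_NUMBER() OVER (PARTITION BY game ORDER BY point) AS rn
        FROM games
    ) t
    WHERE rn = 1
    ORDER BY game
""").fetchall()

for row in best:
    print(row)
```

Swapping PARTITION BY game for PARTITION BY id gives the lowest row per id instead, which is what the GROUP BY Id answers compute.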
Ken Clark's answer didn't work in my case. It might not work in yours either. If not, try this:
SELECT *
from table T
INNER JOIN
(
select id, MIN(point) MinPoint
from table T
group by id
) NewT on T.id = NewT.id and T.point = NewT.MinPoint
ORDER BY game desc
SELECT DISTINCT
FIRST_VALUE(ID) OVER (Partition by Game ORDER BY Point) AS ID,
Game,
FIRST_VALUE(Point) OVER (Partition by Game ORDER BY Point) AS Point
FROM #T
SELECT * from room
INNER JOIN
(
select DISTINCT hotelNo, MIN(price) MinPrice
from room
Group by hotelNo
) NewT
on room.hotelNo = NewT.hotelNo and room.price = NewT.MinPrice;
This alternative approach uses SQL Server's OUTER APPLY clause. It first creates the distinct list of games, and then fetches and outputs the record with the lowest point number for each game.
The OUTER APPLY clause can be imagined as a LEFT JOIN, but with the advantage that you can use values of the main query as parameters in the subquery (here: game).
SELECT colMinPointID
FROM (
SELECT game
FROM table
GROUP BY game
) As rstOuter
OUTER APPLY (
SELECT TOP 1 id As colMinPointID
FROM table As rstInner
WHERE rstInner.game = rstOuter.game
ORDER BY point
) AS rstMinPoints
This is portable - at least between ORACLE and PostgreSQL:
select t.* from table t
where not exists(select 1 from table ti where ti.attr < t.attr);
Most of the answers use an inner query. I am wondering why the following isn't suggested.
select
*
from
table
order by
point
fetch next 1 row only -- ... or the appropriate syntax for the particular DB
This query is very simple to write with JPAQueryFactory (a Java Query DSL class).
return new JPAQueryFactory(manager).
selectFrom(QTable.table).
setLockMode(LockModeType.OPTIMISTIC).
orderBy(QTable.table.point.asc()).
fetchFirst();
Try:
select id, game, min(point) from t
group by id