Limiting rows in SQL doesn't work properly - sql

I want to get the first 5 rows of every season in my select. I have 4 seasons: SUM, SPR, AUT, WIN.
So there should be 20 rows in total.
My select looks like this:
select *
from (
select year, season, ROUND(avg(temperature),1) as avgTemp
from temperature join month on temperature.MONTH = month.MONTH
group by (season, year)
order by season, avgTemp asc
) where rownum <= 5;
It works for just one season. The output is:
1993 AUT 8,7
2007 AUT 9,9
1996 AUT 10
1998 AUT 10
2008 AUT 10,5
But it should look like that:
1996 SPR 9.6
1991 SPR 10.3
2006 SPR 10.3
2004 SPR 10.6
1995 SPR 10.6
1996 SUM 18.9
1993 SUM 19.1
2007 SUM 19.5
1998 SUM 19.5
2000 SUM 19.6
1993 AUT 8.7
2007 AUT 9.9
1998 AUT 10.0
1996 AUT 10.0
2008 AUT 10.5
1996 WIN .3
1991 WIN 1.2
2003 WIN 1.6
2006 WIN 1.9
2005 WIN 2.0
Do you know how to improve the select or do you have any other suggestions? Thanks in advance!

You need to do it in three steps:
Group by season and year, calculating the average temperature
Assign a row number that restarts with each season and increases with the average temperature
Select only the rows with a row number between 1 and 5
The SQL should look like this (untested):
select year, season, avg_temp
from (
select year, season, avg_temp,
row_number() over(partition by season order by avg_temp) rn
from (
select year, season, ROUND(avg(temperature),1) as avg_temp
from temperature
join month on temperature.MONTH = month.MONTH
group by season, year
)
)
where rn <= 5;
Update
For your special ordering by season, add this:
order by case season
when 'SPR' then 1
when 'SUM' then 2
when 'AUT' then 3
when 'WIN' then 4
end, avg_temp;
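The numbering and filtering steps above can be tried out with a small sketch like the following. It uses Python's built-in sqlite3 module as a stand-in for Oracle (assuming a bundled SQLite of 3.25 or later, which has window functions). The table name season_avg, the function name, and the extra year tie-breaker are assumptions for illustration. The sample rows are taken from the expected output in the question, and the aggregation step is skipped by loading already-averaged values.

```python
import sqlite3

# A few (year, season, avg_temp) rows copied from the question's expected output.
SAMPLE_ROWS = [
    (1996, "SPR", 9.6), (1991, "SPR", 10.3), (2006, "SPR", 10.3),
    (1993, "AUT", 8.7), (2007, "AUT", 9.9), (1996, "AUT", 10.0),
]


def first_n_per_season(rows, n):
    """Return the n lowest-average rows of each season, seasons in SPR..WIN order."""
    con = sqlite3.connect(":memory:")
    # Hypothetical table holding the already-aggregated averages (step 1 is skipped).
    con.execute("CREATE TABLE season_avg (year INTEGER, season TEXT, avg_temp REAL)")
    con.executemany("INSERT INTO season_avg VALUES (?, ?, ?)", rows)
    query = """
        SELECT year, season, avg_temp
        FROM (
            SELECT year, season, avg_temp,
                   -- numbering restarts at 1 for every season;
                   -- year is added only to make ties deterministic
                   ROW_NUMBER() OVER (PARTITION BY season
                                      ORDER BY avg_temp, year) AS rn
            FROM season_avg
        ) AS numbered
        WHERE rn <= ?
        ORDER BY CASE season
                     WHEN 'SPR' THEN 1 WHEN 'SUM' THEN 2
                     WHEN 'AUT' THEN 3 WHEN 'WIN' THEN 4
                 END, avg_temp, year
    """
    result = con.execute(query, (n,)).fetchall()
    con.close()
    return result


if __name__ == "__main__":
    for row in first_n_per_season(SAMPLE_ROWS, 2):
        print(row)
```

Unlike the rownum filter in the question, the row number here is computed per season, so the limit applies to each season separately.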

WITH cteAverageTempByYearBySeason AS (
SELECT
year
,season
,ROUND(AVG(temperature),1) as AvgTemp
FROM
Temperature t
INNER JOIN Month m
On t.MONTH = m.MONTH
GROUP BY
year
,season
)
, cteRowNumber AS (
SELECT
*
,ROW_NUMBER() OVER (PARTITION BY season ORDER BY AvgTemp ASC) as RowNumber
FROM
cteAverageTempByYearBySeason
)
SELECT *
FROM
cteRowNumber
WHERE
RowNumber <= 5
Here is an example. I broke the derived tables out into common table expressions to make the logic easier to follow. You need a partitioned ROW_NUMBER() rather than Oracle's ROWNUM pseudocolumn. Filtering on ROWNUM only behaves like TOP/LIMIT 5 over the whole result, whereas a partitioned ROW_NUMBER() lets you identify 5 rows per season.
Edit: I added a neat trick for your ORDER BY so you don't have to write a case expression. It uses your month number, which I assume is what the MONTH column holds.
WITH cteAverageTempByYearBySeason AS (
SELECT
year
,season
,ROUND(AVG(temperature),1) as AvgTemp
,MAX(m.MONTH) as SeasonOrderBy
FROM
Temperature t
INNER JOIN Month m
On t.MONTH = m.MONTH
GROUP BY
year
,season
)
, cteRowNumber AS (
SELECT
*
,ROW_NUMBER() OVER (PARTITION BY season ORDER BY AvgTemp ASC) as RowNumber
FROM
cteAverageTempByYearBySeason
)
SELECT
year
,season
,AvgTemp
FROM
cteRowNumber
WHERE
RowNumber <= 5
ORDER BY
SeasonOrderBy
,AvgTemp
,Year

You need to use row_number to get 5 rows for each season:
select
year,
season,
avgTemp
from (
select
year,
season,
round(avg(temperature), 1) as avgTemp,
row_number() over(partition by season order by round(avg(temperature), 1)) as rn
from temperature t
join month m
on m.MONTH = t.MONTH
group by year, season
) a
where
a.rn <= 5

Related

SQL group by highest occurrence

By executing this query
SELECT year, genre, COUNT(genre)
FROM Oscar
GROUP BY year, genre
I got the following output:
2016 Action 2
2016 Romance 1
2017 Action 1
2017 Romance 2
2018 Fantasy 1
2019 Action 1
2019 Fantasy 2
2020 Action 3
2020 Fantasy 1
2020 Romance 1
Now I want to display only the genre with the highest count per year. What is the best way to do this?
So I want the output to look like this:
2016 Action
2017 Romance
2018 Fantasy
2019 Fantasy
2020 Action
You can use window functions:
SELECT year, genre
FROM (
SELECT year, genre, RANK() OVER(PARTITION BY year ORDER BY COUNT(*) DESC) rn
FROM Oscar
GROUP BY year, genre
) t
WHERE rn = 1
If your database does not support window functions (e.g. MySQL < 8.0), another option is:
SELECT year, genre
FROM Oscar o
GROUP BY year, genre
HAVING COUNT(*) = (
SELECT COUNT(*)
FROM Oscar o1
WHERE o1.year = o.year
GROUP BY o1.genre
ORDER BY COUNT(*) DESC LIMIT 1
)
Use window functions:
SELECT year, genre
FROM (SELECT year, genre, COUNT(*) as cnt,
RANK() OVER (PARTITION BY year ORDER BY COUNT(*) DESC) as seqnum
FROM Oscar
GROUP BY year, genre
) yg
WHERE seqnum = 1;
If there are ties, RANK() returns all highest ranked values. Use ROW_NUMBER() if you specifically want one row, even when there are ties for first.
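The difference between the two functions on a tie can be seen in a small sketch. It runs on Python's built-in sqlite3 module (assuming SQLite 3.25 or later) rather than on the asker's database. The 2021 rows are hypothetical and were added only to create a tie, and the count is computed in an inner query before ranking.

```python
import sqlite3

# (year, genre) rows; 2016 mirrors the question, 2021 is a hypothetical tie.
OSCAR_ROWS = [
    (2016, "Action"), (2016, "Action"), (2016, "Romance"),
    (2021, "Comedy"), (2021, "Drama"),
]


def top_genres(rows, ranking="RANK"):
    """Return (year, genre) rows ranked first per year by the chosen function."""
    if ranking not in ("RANK", "ROW_NUMBER"):
        raise ValueError("ranking must be 'RANK' or 'ROW_NUMBER'")
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE Oscar (year INTEGER, genre TEXT)")
    con.executemany("INSERT INTO Oscar VALUES (?, ?)", rows)
    # The function name comes from the whitelist above, so the f-string is safe.
    query = f"""
        SELECT year, genre
        FROM (SELECT year, genre,
                     {ranking}() OVER (PARTITION BY year ORDER BY cnt DESC) AS seqnum
              FROM (SELECT year, genre, COUNT(*) AS cnt
                    FROM Oscar
                    GROUP BY year, genre) AS counted) AS yg
        WHERE seqnum = 1
        ORDER BY year, genre
    """
    result = con.execute(query).fetchall()
    con.close()
    return result


if __name__ == "__main__":
    print(top_genres(OSCAR_ROWS, "RANK"))        # both tied 2021 genres are kept
    print(top_genres(OSCAR_ROWS, "ROW_NUMBER"))  # exactly one row per year
```

With ROW_NUMBER(), which of the tied genres survives is arbitrary unless the ORDER BY includes a tie-breaker.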

Find out which people stay in a population/cohort from the base year to last year and the years in between

I have a table with columns: c
The table contains information about players/team relationships over multiple years (say, 2010 to 2020)
What I want to know is:
- For starting year, which players belonged to team Blueberry
- For year 2, who of the Blueberry players in the starting year still belong to the Blueberry team
-..and so on until the last year studied
A nagging feeling I have is that this is presentable as a single table using only one query.
Please help.
Year Player_id team_id
2012 kitliu Blueberry
2012 bobross Blueberry
2012 jacksnake Blueberry
2012 kittyjr Blueberry
2013 kitliu Blueberry
2013 bobross Blueberry
2013 narutol yellow
2014 kitliu Blueberry
2014 narutol Red
result:
2012 kitliu Blueberry
2012 bobross Blueberry
2012 jacksnake Blueberry
2012 kittyjr Blueberry
2013 kitliu Blueberry
2013 bobross Blueberry
2014 kitliu Blueberry
result, count retained player/team combos from base year:
Year Count
2012 4
2013 2
2014 1
Enumerate the players from the base year. Then use this to check that there are no gaps:
select team_id, year, count(*)
from (select t.*,
row_number() over (partition by team_id, player_id order by year) as seqnum
from t
where year >= 2012
) t
where year = 2012 + seqnum - 1
group by team_id, year;
Here is a db<>fiddle.
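To see how the gap check behaves, here is a rough sketch that loads the sample rows from the question into an in-memory SQLite database through Python's sqlite3 module (assuming SQLite 3.25 or later). The table name t follows the answer; the function name and the base-year parameter are assumptions for illustration.

```python
import sqlite3

# (year, player_id, team_id) rows copied from the question.
ROSTER = [
    (2012, "kitliu", "Blueberry"), (2012, "bobross", "Blueberry"),
    (2012, "jacksnake", "Blueberry"), (2012, "kittyjr", "Blueberry"),
    (2013, "kitliu", "Blueberry"), (2013, "bobross", "Blueberry"),
    (2013, "narutol", "yellow"),
    (2014, "kitliu", "Blueberry"), (2014, "narutol", "Red"),
]


def retained_counts(rows, base_year):
    """Count, per team and year, the players present in every year since base_year."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE t (year INTEGER, player_id TEXT, team_id TEXT)")
    con.executemany("INSERT INTO t VALUES (?, ?, ?)", rows)
    query = """
        SELECT team_id, year, COUNT(*)
        FROM (SELECT t.*,
                     ROW_NUMBER() OVER (PARTITION BY team_id, player_id
                                        ORDER BY year) AS seqnum
              FROM t
              WHERE year >= :base) AS s
        -- a row survives only if the player's n-th year on the team
        -- is exactly base + n - 1, i.e. no year was skipped
        WHERE year = :base + seqnum - 1
        GROUP BY team_id, year
        ORDER BY team_id, year
    """
    result = con.execute(query, {"base": base_year}).fetchall()
    con.close()
    return result


if __name__ == "__main__":
    for row in retained_counts(ROSTER, 2012):
        print(row)
```

A player who joins after the base year gets seqnum 1 in a later year, so the equality fails and the row is dropped.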
The query below might help as an alternative: for each year, it selects the players who haven't switched teams.
SELECT year, playerid,
count(distinct teamid)
from
table t group by year, playerid having
(playerid, count(distinct teamid)) IN (
Select
playerid, count(distinct teamid)
from table group
by
playerid)
;
You can use analytic functions.
SELECT YEAR, PLAYER_ID, TEAM_ID
FROM
(SELECT YEAR, PLAYER_ID, TEAM_ID,
ROW_NUMBER() OVER (PARTITION BY PLAYER_ID, TEAM_ID ORDER BY YEAR) AS RN,
DENSE_RANK() OVER (PARTITION BY TEAM_ID ORDER BY YEAR) AS RNK_YEAR
FROM YOUR_TABLE)
WHERE RN = RNK_YEAR
You can apply COUNT and GROUP BY on top of this query to get the count per year.
Cheers!!
I hope this works for you:
with teamyr(year, playerid, teamid) as
(
select min(year), playerid, teamid
from teams
group by playerid, teamid
)
select t1.year, t1.playerid, t1.teamid
from teamyr t1
where t1.year = (select min(year) from teamyr)
union all
select t2.year, t1.playerid, t1.teamid
from teamyr t1
inner join teams t2 on t2.playerid = t1.playerid and t2.teamid = t1.teamid
and t2.year > t1.year

Display max year and its max month with their corresponding value in oracle?

Year Month Value
2015 1 300
2015 2 400
2010 4 100
2016 7 200
2016 8 300
2017 2 100
2017 3 200
2017 6 400
You might try the following:
SELECT MAX(year)
, MAX(month) KEEP ( DENSE_RANK FIRST ORDER BY year DESC )
, MAX(value) KEEP ( DENSE_RANK FIRST ORDER BY year DESC, month DESC )
FROM mytable;
If you want the max month per year along with the corresponding value, then you can do this:
SELECT year, MAX(month)
, MAX(value) KEEP ( DENSE_RANK FIRST ORDER BY month DESC )
FROM mytable
GROUP BY year;
Hope this helps.
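KEEP ( DENSE_RANK FIRST ... ) is Oracle syntax. As a rough cross-check of the per-year variant, the sketch below gets the same shape of result with ROW_NUMBER(), run on Python's built-in sqlite3 module (assuming SQLite 3.25 or later). It swaps in ROW_NUMBER() for KEEP and is not the answer's own query; the function name is an assumption, and the rows are the sample data from the question.

```python
import sqlite3

# (year, month, value) rows copied from the question.
MONTHLY = [
    (2015, 1, 300), (2015, 2, 400), (2010, 4, 100),
    (2016, 7, 200), (2016, 8, 300),
    (2017, 2, 100), (2017, 3, 200), (2017, 6, 400),
]


def latest_month_per_year(rows):
    """Return (year, month, value) for the highest month of each year."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE mytable (year INTEGER, month INTEGER, value INTEGER)")
    con.executemany("INSERT INTO mytable VALUES (?, ?, ?)", rows)
    query = """
        SELECT year, month, value
        FROM (SELECT year, month, value,
                     -- rn = 1 marks the latest month within each year
                     ROW_NUMBER() OVER (PARTITION BY year
                                        ORDER BY month DESC) AS rn
              FROM mytable) AS ranked
        WHERE rn = 1
        ORDER BY year
    """
    result = con.execute(query).fetchall()
    con.close()
    return result


if __name__ == "__main__":
    per_year = latest_month_per_year(MONTHLY)
    print(per_year)
    print(per_year[-1])  # latest month of the latest year
```

Because the result is ordered by year, its last row is the max year with its max month and the corresponding value.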
You could use:
SELECT *
FROM (SELECT *
FROM tab t
ORDER BY Year DESC, Month DESC) s
WHERE rownum = 1;
Select * from table_name
where year = (select max(year) from table_name)
and month = (select max(month) from table_name
where year = (select max(year) from table_name));
This might be the answer you are looking for. I used nested queries to reach the desired result.

Aggregation per Date

I have thousands of companies listed, but for illustration I cite only 2. I need to produce the column TotalSales, whose values are the sum of sales per company over the year (four quarters) preceding each row's actual year and quarter.
Company Sales Quarter Year TotalSales QtrYr_Included
ABC Inc. 10,000 1 2010 null Q12009 - Q42009
ABC Inc. 50,000 2 2010 10,000 Q22009 - Q12010
ABC Inc. 35,000 3 2010 60,000 Q32009 - Q22010
ABC Inc. 15,000 4 2010 95,000 Q42009 - Q32010
ABC Inc. 5,000 1 2011 110,000 Q12010 - Q42010
ABC Inc. 10,000 2 2011 105,000 Q22010 - Q12011
SoKor Group 50,000 1 2009 null Q12008 - Q42008
SoKor Group 10,000 2 2009 50,000 Q22008 - Q12009
SoKor Group 10,000 3 2009 60,000 Q32008 - Q22009
SoKor Group 5,000 4 2009 70,000 Q42008 - Q32009
SoKor Group 15,000 1 2010 . Q12009 - Q42009
SoKor Group 20,000 3 2010 . Q22009 - Q12010
Thank you so much.
Here is one way to do it using Sum Over window aggregate
SELECT *,
Sum(sales)
OVER(
partition BY Company
ORDER BY [Year], [Quarter] ROWS BETWEEN 4 PRECEDING AND 1 PRECEDING)
FROM Yourtable
For older versions:
;WITH cte
AS (SELECT Row_number()OVER(partition BY Company ORDER BY [Year], [Quarter]) rn,*
FROM Yourtable a)
SELECT *
FROM cte a
CROSS apply (SELECT Sum (sales) Total_sales
FROM (SELECT TOP 4 sales
FROM cte b
WHERE a.Company = b.Company
AND b.rn < a.rn
ORDER BY [Year] DESC,
[Quarter] DESC)a) cs
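The window-frame version can be checked against the question's ABC Inc. rows with a short sketch. It uses Python's built-in sqlite3 module (assuming SQLite 3.25 or later) in place of SQL Server, so the bracketed identifiers are dropped. The lowercase column names and the function name are assumptions for illustration.

```python
import sqlite3

# (company, sales, quarter, year) rows for ABC Inc. copied from the question.
ABC_SALES = [
    ("ABC Inc.", 10000, 1, 2010), ("ABC Inc.", 50000, 2, 2010),
    ("ABC Inc.", 35000, 3, 2010), ("ABC Inc.", 15000, 4, 2010),
    ("ABC Inc.", 5000, 1, 2011), ("ABC Inc.", 10000, 2, 2011),
]


def previous_four_quarters(rows):
    """Return each row with the sum of the up-to-4 preceding rows' sales."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE Yourtable (company TEXT, sales INTEGER, quarter INTEGER, year INTEGER)"
    )
    con.executemany("INSERT INTO Yourtable VALUES (?, ?, ?, ?)", rows)
    query = """
        SELECT company, year, quarter, sales,
               -- frame covers the 4 rows before the current one, excluding it
               SUM(sales) OVER (PARTITION BY company
                                ORDER BY year, quarter
                                ROWS BETWEEN 4 PRECEDING AND 1 PRECEDING) AS total_sales
        FROM Yourtable
        ORDER BY company, year, quarter
    """
    result = con.execute(query).fetchall()
    con.close()
    return result


if __name__ == "__main__":
    for row in previous_four_quarters(ABC_SALES):
        print(row)
```

The frame counts rows, not calendar quarters, so a company with a missing quarter would silently reach further back in time.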
@Prdp's solution is valid. However, it shows incorrect results when quarters are missing for a given company, because the window then reaches back to whatever rows precede the gap. One way to avoid this is to use derived tables to generate all combinations of year, quarter and company. Left joining the original table onto this result yields 0 sales for the missing quarters. Then use the SUM window function to get the sum of sales for the previous 4 quarters for each row.
SELECT *
FROM
(SELECT C.COMPANY,
Y.[YEAR],
Q.[QUARTER],
T.SALES,
SUM(COALESCE(T.SALES,0)) OVER(PARTITION BY C.COMPANY
ORDER BY Y.[YEAR], Q.[QUARTER]
ROWS BETWEEN 4 PRECEDING AND 1 PRECEDING) AS PREV_4QTRS_TOTAL
FROM
(SELECT 2008 AS [YEAR]
UNION ALL SELECT 2009
UNION ALL SELECT 2010
UNION ALL SELECT 2011
UNION ALL SELECT 2012
UNION ALL SELECT 2013) Y --Add more years as required or generate them using a recursive cte or a tally table
CROSS JOIN
(SELECT 1 AS [QUARTER]
UNION ALL SELECT 2
UNION ALL SELECT 3
UNION ALL SELECT 4) Q
CROSS JOIN
(SELECT DISTINCT COMPANY
FROM T) C
LEFT JOIN T ON Y.[YEAR]=T.[YEAR]
AND Q.[QUARTER]=T.[QUARTER]
AND C.COMPANY=T.COMPANY
) X
WHERE SALES IS NOT NULL --to filter the result to include only rows from the original table
ORDER BY 1,2,3

select best attribute of a row SQL oracle

YEAR MONTH BALANCE SSN
2016 1 3175 34/1043/03T
2016 1 2984 93/1194/07T
2016 1 2269 39/3149/00T
2015 12 3172 36/1011/03T
2015 12 2984 22/1224/07T
2015 12 2169 12/3143/00T
For example, I have this table, but with rows for each month of each year, and I have to choose the best SSN and balance for each month of each year. For this sample, I would like my query to return:
YEAR MONTH BALANCE SSN
2016 1 3175 34/1043/03T
2015 12 3172 36/1011/03T
What can I do?
You can do this in several ways. A very Oracle'ish way is to use keep:
select year, month,
max(balance) as balance,
max(SSN) keep (dense_rank first order by balance desc) as ssn
from t
group by year, month;
Like most DBMSes, Oracle supports ROW_NUMBER/RANK:
select *
from
(
select year, month, balance, SSN,
row_number()
over (partition by year, month
order by balance desc) as rn
from tab
) dt
where rn = 1