Tom loves sports. Every month he rests for a while and then plays his favorite sports. He changes sport during the month, and the change history is shown in the table below.
Challenge: What are the first and last sports Tom played in each month?
Tom Favourite Sports
----------------------
Month PreviousSport CurrentSport
JAN REST CRICKET
JAN CRICKET RUGBY
JAN RUGBY VOLLEYBALL
JAN VOLLEYBALL FOOTBALL
JAN FOOTBALL TENNIS
JAN TENNIS RUGBY
FEB REST KAYAKING
FEB KAYAKING SNOWBOARDING
FEB SNOWBOARDING SKATING
FEB SKATING RAFTING
FEB RAFTING KAYAKING
MAR REST RACING
MAR RACING GLIDING
MAR GLIDING SKYDIVING
And the output should be
Month FirstSport LastSport
JAN CRICKET RUGBY
FEB KAYAKING KAYAKING
MAR RACING SKYDIVING
CHANGE : Slight modification to the source table
MTH PREVIOUS_SPORT CURRENT_SPORT
JAN VOLLEYBALL FOOTBALL
FEB REST KAYAKING
MAR REST RACING
JAN CRICKET RUGBY
FEB SNOWBOARDING SKATING
MAR RACING GLIDING
JAN RUGBY VOLLEYBALL
FEB SKATING RAFTING
MAR GLIDING SKYDIVING
JAN FOOTBALL TENNIS
FEB RAFTING KAYAKING
JAN TENNIS RUGBY
JAN REST CRICKET
Now how do I get the previous output?
Thanks in advance.
I have assumed an ID column on the table. You could avoid it by converting Jan/Feb/Mar to month numbers and going ahead without an ID column, but for a quick reply, this is it:
Select Main.[Month],F.CurrentSport as FirstSport,L.CurrentSport as LastSport from
(
Select a.[Month],min(r) fSport ,max(r) lSport from
(select * ,ROW_NUMBER() over (Partition by Month order by ID) as R
from Tom
)a
group by a.[Month]
)as Main
inner join
(
select *, ROW_NUMBER() over (Partition by Month order by ID) as R
from Tom
)as F on F.R = Main.fSport and f.[Month] = Main.[Month]
inner join
(
select *, ROW_NUMBER() over (Partition by Month order by ID) as R
from Tom
)as L on L.R = Main.lSport and L.[Month] = Main.[Month]
order by F.ID
One more variant, which avoids computing ROW_NUMBER() multiple times by using a CTE. I think this should be faster than the previous one.
;With CTE as
(
select ID,[Month] as M,CurrentSport as Sport
, ROW_NUMBER() over (Partition by Month order by ID) as R
from Tom
), CTE1 as
(
Select M, Min(R) as FS ,Max(R) as LS from CTE
group by M
)
Select CTE1.M,F.Sport as First,L.Sport as Last from CTE1
inner join CTE as F on F.R = CTE1.FS and F.M = CTE1.M
inner join CTE as L on L.R = CTE1.LS and L.M = CTE1.M
order by L.ID asc
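
To try the CTE variant without a SQL Server instance, here is a minimal sketch that runs the same logic in SQLite through Python's sqlite3 module. The auto-assigned ID column is the assumption both answers above rely on, and the bracketed [Month] syntax is replaced with a plain column name; treat it as an illustration rather than the answerer's exact script.

```python
import sqlite3

# Rows from the question, in the order given. The auto-assigned ID is an
# assumption: it is what defines "first" and "last" within a month.
rows = [
    ("JAN", "REST", "CRICKET"), ("JAN", "CRICKET", "RUGBY"),
    ("JAN", "RUGBY", "VOLLEYBALL"), ("JAN", "VOLLEYBALL", "FOOTBALL"),
    ("JAN", "FOOTBALL", "TENNIS"), ("JAN", "TENNIS", "RUGBY"),
    ("FEB", "REST", "KAYAKING"), ("FEB", "KAYAKING", "SNOWBOARDING"),
    ("FEB", "SNOWBOARDING", "SKATING"), ("FEB", "SKATING", "RAFTING"),
    ("FEB", "RAFTING", "KAYAKING"),
    ("MAR", "REST", "RACING"), ("MAR", "RACING", "GLIDING"),
    ("MAR", "GLIDING", "SKYDIVING"),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Tom (ID INTEGER PRIMARY KEY, Month TEXT, "
    "PreviousSport TEXT, CurrentSport TEXT)"
)
conn.executemany(
    "INSERT INTO Tom (Month, PreviousSport, CurrentSport) VALUES (?, ?, ?)", rows
)

query = """
WITH CTE AS (
    -- Number the rows of each month once, in ID order.
    SELECT ID, Month AS M, CurrentSport AS Sport,
           ROW_NUMBER() OVER (PARTITION BY Month ORDER BY ID) AS R
    FROM Tom
), CTE1 AS (
    -- Lowest and highest row number per month.
    SELECT M, MIN(R) AS FS, MAX(R) AS LS FROM CTE GROUP BY M
)
SELECT CTE1.M, F.Sport AS FirstSport, L.Sport AS LastSport
FROM CTE1
JOIN CTE AS F ON F.R = CTE1.FS AND F.M = CTE1.M
JOIN CTE AS L ON L.R = CTE1.LS AND L.M = CTE1.M
ORDER BY L.ID
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

Without some column that records the order of the changes, "first" and "last" are not well defined, which is why the shuffled table in the question cannot be handled by this query alone.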
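I don't think we need the ID column as such; we can work with ROW_NUMBER(). Be aware that ordering by mnth inside a partition on mnth does not define a row order, so this only works if the engine happens to return the rows in insertion order.
So it should come out like this: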
SELECT Mintmp.mnth,
Mintmp.currentsport AS FirstSport,
MaxTmp.currentsport AS LastSport
FROM (SELECT Row_number()
OVER(
partition BY mnth
ORDER BY mnth) AS RowNum,
*
FROM #T1) MinTmp
INNER JOIN (SELECT Min(rownum) AS MinRow,
mnth,
Max(rownum) AS MaxRow
FROM (SELECT Row_number()
OVER(
partition BY mnth
ORDER BY mnth) AS RowNum,
*
FROM #T1) tmp
GROUP BY mnth) grpTable
ON grpTable.minrow = MinTmp.rownum
AND grpTable.mnth = MinTmp.mnth
INNER JOIN (SELECT Row_number()
OVER(
partition BY mnth
ORDER BY mnth) AS RowNum,
*
FROM #T1) MaxTmp
ON Maxtmp.rownum = grpTable.maxrow
AND Maxtmp.mnth = grpTable.mnth
Related
I have table of employees salary details records with columns
Id Name Year Month Salary
1 ABC 2021 Jan 50000
2 PQR 2021 Jan 40000
3 KLM 2021 Feb 45000
4 LMN 2021 Jan 55000
5 LMN 2022 Jan 20000
6 ABC 2022 Feb 25000
7 ABC 2022 Jan 2500
8 ABC 2022 Dec 60000
9 LMN 2022 Nov 70000
Now I want to find which employees have earned a total salary greater than 100000 since joining, and display all of those employees' rows.
--find which employee gets more than 100000 salary till now
select name,sum(salary) as AnnualSalary from tblEmpsalary
group by Name
having sum(Salary)>100000 --this query works
--but the query below displays no data (I want to show all rows for employees whose total salary is more than 100000)
SELECT id, name,Month,Year, SUM(Salary) AS TotalSales
FROM tblEmpsalary
GROUP BY name,Id,Month,Year,Salary
having SUM(Salary)>100000;
SELECT T.ID,T.Name,T.Year,T.Month,T.Salary
FROM tblEmpsalary AS T
JOIN
(
select Name
from tblEmpsalary
group by Name
having sum(Salary)>100000
)AS X ON T.Name=X.Name
You can use a window function for this
SELECT
id,
name,
Month,
Year,
TotalSales
FROM (
SELECT *,
SUM(Salary) OVER (PARTITION BY name) AS TotalSales
FROM tblEmpsalary e
) e
WHERE e.TotalSales > 100000;
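
As a quick check of the window-function approach, the sketch below loads the sample rows from the question into an in-memory SQLite database and runs the same query through Python's sqlite3 module. The Salary column and the ORDER BY are added to the select list purely to make the output easier to read.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tblEmpsalary "
    "(Id INTEGER, Name TEXT, Year INTEGER, Month TEXT, Salary INTEGER)"
)
# Sample rows copied from the question.
conn.executemany(
    "INSERT INTO tblEmpsalary VALUES (?, ?, ?, ?, ?)",
    [
        (1, "ABC", 2021, "Jan", 50000),
        (2, "PQR", 2021, "Jan", 40000),
        (3, "KLM", 2021, "Feb", 45000),
        (4, "LMN", 2021, "Jan", 55000),
        (5, "LMN", 2022, "Jan", 20000),
        (6, "ABC", 2022, "Feb", 25000),
        (7, "ABC", 2022, "Jan", 2500),
        (8, "ABC", 2022, "Dec", 60000),
        (9, "LMN", 2022, "Nov", 70000),
    ],
)

query = """
SELECT Id, Name, Month, Year, Salary, TotalSales
FROM (
    -- The window SUM attaches each employee's overall total to every row,
    -- so no rows are collapsed the way GROUP BY would collapse them.
    SELECT *, SUM(Salary) OVER (PARTITION BY Name) AS TotalSales
    FROM tblEmpsalary
) e
WHERE TotalSales > 100000
ORDER BY Id
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

With this data only ABC (137500 in total) and LMN (145000 in total) pass the filter, and every one of their rows is returned.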
Please try the query below, where one query does the grouping per employee and the other is joined to it to fetch the employee details:
SELECT TS.Id, TS.Name, TS.Month, TS.Year, TS.Salary, ATS.TotalSales FROM
(SELECT Name, SUM(Salary) AS TotalSales
FROM tblEmpsalary
GROUP BY Name
HAVING SUM(Salary)>100000
) AS ATS
INNER JOIN tblEmpsalary TS on ATS.Name = TS.Name
ORDER BY TS.Name, TS.Year, TS.Month, TS.Id
I have obtained this table by using multiple joins
E_name s_date year h_value l_value update_date
a 01-08-2012 2012 25 70 01-01-2012
a 23-06-2012 2010 20 55 01-01-2009
a 19-03-2020 2020 210 540 29-04-2020
a 14-02-2020 2020 78 765 29-04-2020
b 27-12-2018 2018 14 29 31-01-2019
b 19-12-2018 2018 17 30 19-12-2018
I want to remove duplicates based on E_name and year.
If several rows share the same E_name and year, then:
the row with the most recent update_date is kept;
if the update_date values are also the same, the row with the most recent s_date is kept.
Required Output
E_name s_date year h_value l_value update_date
a 01-08-2012 2012 25 70 01-01-2012
a 23-06-2012 2010 20 55 01-01-2009
a 19-03-2020 2020 210 540 29-04-2020
b 27-12-2018 2018 14 29 31-01-2019
You need a GROUP BY with a ROW_NUMBER() on top:
Select * from
( Select e_name,"year",
maxdate,update_date,
row_number() over (partition by e_name,"year" order by
update_date desc) as rn
from
( Select e_name,"year",
update_date,max(s_date) as maxdate from
sample
group by
e_name,"year",update_date
)
)
where rn =1
Check the output in this fiddle: http://sqlfiddle.com/#!4/c1646/23
A query like the one below may do the trick. Put your joined data into a temp table and run the query against that temp table:
with MyCTE
as
(
select
E_Name
,S_Date
,year
,H_value
,L_value
,update_date
, RANK() over (partition by E_Name,year order by update_date desc,S_DATE desc) as ranking
from TempTable
)
select * from MyCTE where ranking=1
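
The ranking approach can be checked against the sample rows with a small sketch using Python's sqlite3 module. Two things here are assumptions rather than part of the answer: the dates are rewritten as ISO strings (YYYY-MM-DD) so that they sort chronologically as text, and ROW_NUMBER() is swapped in for RANK() so that exactly one row survives even if two rows tie on both dates.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE TempTable (E_name TEXT, s_date TEXT, year INTEGER, "
    "h_value INTEGER, l_value INTEGER, update_date TEXT)"
)
# Sample rows from the question; dates converted from DD-MM-YYYY to ISO
# format (an assumption made so that text ordering matches date ordering).
conn.executemany(
    "INSERT INTO TempTable VALUES (?, ?, ?, ?, ?, ?)",
    [
        ("a", "2012-08-01", 2012, 25, 70, "2012-01-01"),
        ("a", "2012-06-23", 2010, 20, 55, "2009-01-01"),
        ("a", "2020-03-19", 2020, 210, 540, "2020-04-29"),
        ("a", "2020-02-14", 2020, 78, 765, "2020-04-29"),
        ("b", "2018-12-27", 2018, 14, 29, "2019-01-31"),
        ("b", "2018-12-19", 2018, 17, 30, "2018-12-19"),
    ],
)

query = """
SELECT E_name, s_date, year, h_value, l_value, update_date
FROM (
    -- Newest update_date first; s_date breaks ties between equal update dates.
    SELECT *, ROW_NUMBER() OVER (
                  PARTITION BY E_name, year
                  ORDER BY update_date DESC, s_date DESC) AS rn
    FROM TempTable
) t
WHERE rn = 1
ORDER BY E_name, year
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

The four surviving rows match the required output in the question, apart from the date format and the row order.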
I have a table ADS in Snowflake like so (data is inserted each day). Note that rows 3 and 4 are duplicate entries:
ID  REPORT_DATE  CLICKS  IMPRESSIONS
1   Jan 01       20      400
1   Jan 02       25      600
1   Jan 03       80      900
1   Jan 03       80      900
2   Jan 01       30      500
2   Jan 02       55      650
2   Jan 03       90      950
I want to select all entries based on ID with the max REPORT_DATE - essentially I want to know the latest number of CLICKS and IMPRESSIONS for each ID:
ID  REPORT_DATE  CLICKS  IMPRESSIONS
1   Jan 03       80      900
2   Jan 03       90      950
This query successfully gives me the max DATE for each ID:
SELECT
MAX(REPORT_DATE),
ID
FROM ADS
GROUP BY
ID;
Result:
ID  MAX(REPORT_DATE)
1   Jan 03
2   Jan 03
However, when I try to conduct an inner join, duplicates arise:
SELECT
a.ID,
a.REPORT_DATE,
a.CLICKS,
a.IMPRESSIONS
FROM ADS a
INNER JOIN (
SELECT
MAX(REPORT_DATE) AS REPORT_DATE,
ID
FROM ADS
GROUP BY
ID
) b
ON a.ID = b.ID
AND a.REPORT_DATE = b.REPORT_DATE;
Result:
ID  REPORT_DATE  CLICKS  IMPRESSIONS
1   Jan 03       80      900
1   Jan 03       80      900
2   Jan 03       90      950
How can I construct my query to remove these duplicates?
You could use QUALIFY and ROW_NUMBER():
SELECT a.ID,a.REPORT_DATE,a.CLICKS,a.IMPRESSIONS
FROM ADS a
QUALIFY ROW_NUMBER() OVER(PARTITION BY ID ORDER BY REPORT_DATE DESC) = 1;
Please note that ORDER BY REPORT_DATE is not stable in case of a tie. I would suggest adding another column to the sort so that the sort key is always unique.
If the tied rows are identical, as they are here, this is not actually an issue.
You can use row_number() window function:
select id, report_date, clicks, impressions from
(
select id, report_date, clicks, impressions,
row_number() over (partition by id order by report_date desc) rn
from ADS
) t
where rn = 1
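
QUALIFY is specific to Snowflake and a few other engines, so the sketch below tests the subquery form of the same idea in SQLite through Python's sqlite3 module. The year 2024 in the dates is a made-up placeholder (the question only shows "Jan 01" and so on), chosen so that the ISO strings sort chronologically.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE ADS (ID INTEGER, REPORT_DATE TEXT, "
    "CLICKS INTEGER, IMPRESSIONS INTEGER)"
)
# Rows from the question, including the duplicated Jan 03 entry for ID 1.
# The year 2024 is a hypothetical placeholder.
conn.executemany(
    "INSERT INTO ADS VALUES (?, ?, ?, ?)",
    [
        (1, "2024-01-01", 20, 400),
        (1, "2024-01-02", 25, 600),
        (1, "2024-01-03", 80, 900),
        (1, "2024-01-03", 80, 900),
        (2, "2024-01-01", 30, 500),
        (2, "2024-01-02", 55, 650),
        (2, "2024-01-03", 90, 950),
    ],
)

query = """
SELECT ID, REPORT_DATE, CLICKS, IMPRESSIONS
FROM (
    -- ROW_NUMBER gives the tied duplicates the numbers 1 and 2, so only one
    -- of them passes the rn = 1 filter.
    SELECT *, ROW_NUMBER() OVER (
                  PARTITION BY ID ORDER BY REPORT_DATE DESC) AS rn
    FROM ADS
) t
WHERE rn = 1
ORDER BY ID
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

Because the two tied rows for ID 1 are identical, it does not matter which of them ROW_NUMBER() happens to place first.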
Following is my output:
MONTH STAF STAFFNAME TOTAL_ORDER_DELIVERED
===== ==== ==================== =====================
JAN S009 Theresina Ertelt 1
FEB S015 Lonna Charker 1
MAR S003 Suzi Maccari 2
MAR S010 Zacharie Witty 1
MAR S020 Abbie Gosnoll 1
MAR S017 Renee Alston 1
AUG S006 Falito Ollerton 1
AUG S017 Renee Alston 1
AUG S003 Suzi Maccari 1
OCT S003 Suzi Maccari 3
OCT S020 Abbie Gosnoll 2
What I want is:
MONTH STAF STAFFNAME TOTAL_ORDER_DELIVERED
===== ==== ==================== =====================
JAN S009 Theresina Ertelt 1
FEB S015 Lonna Charker 1
MAR S003 Suzi Maccari 2
AUG S006 Falito Ollerton 1
AUG S017 Renee Alston 1
AUG S003 Suzi Maccari 1
OCT S003 Suzi Maccari 3
I want to select the highest result for each month but can't figure out what to do. Here is my query in SQL:
SELECT TO_CHAR(TO_DATE(EXTRACT(MONTH FROM receivedDate),'mm'),'MON') AS Month,
d.staffID, staffname, count(deliveryID) AS Total_Order_Delivered
FROM delivery d, deliverystaff s
WHERE (d.staffid = s.staffid)
AND (EXTRACT(YEAR FROM receivedDate) = 2020)
GROUP BY EXTRACT(MONTH FROM d.receivedDate),d.staffid, staffname
ORDER BY EXTRACT(MONTH FROM d.receivedDate),count(deliveryID) desc;
I would suggest using RANK here:
WITH cte AS (
SELECT TO_CHAR(TO_DATE(EXTRACT(MONTH FROM receivedDate), 'mm'), 'MON') AS Month,
EXTRACT(MONTH FROM d.receivedDate) AS month_num,
d.staffID, staffname, COUNT(deliveryID) AS Total_Order_Delivered,
RANK() OVER (PARTITION BY EXTRACT(MONTH FROM d.receivedDate)
ORDER BY COUNT(deliveryID) DESC) rnk
FROM delivery d
INNER JOIN deliverystaff s ON d.staffid = s.staffid
WHERE EXTRACT(YEAR FROM receivedDate) = 2020
GROUP BY EXTRACT(MONTH FROM d.receivedDate), d.staffid, staffname
)
SELECT Month, staffID, staffname, Total_Order_Delivered
FROM cte
WHERE rnk = 1
ORDER BY month_num;
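
The effect of partitioning by month only can be seen in a small sketch that starts from the aggregated output shown in the question rather than from the delivery tables, which are not available here. The table name monthly_totals and the month_num column are hypothetical stand-ins for the grouped result, and the query runs in SQLite through Python's sqlite3 module.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table holding the per-month, per-staff counts from the question.
conn.execute(
    "CREATE TABLE monthly_totals (month_num INTEGER, month TEXT, "
    "staffID TEXT, staffname TEXT, total INTEGER)"
)
conn.executemany(
    "INSERT INTO monthly_totals VALUES (?, ?, ?, ?, ?)",
    [
        (1, "JAN", "S009", "Theresina Ertelt", 1),
        (2, "FEB", "S015", "Lonna Charker", 1),
        (3, "MAR", "S003", "Suzi Maccari", 2),
        (3, "MAR", "S010", "Zacharie Witty", 1),
        (3, "MAR", "S020", "Abbie Gosnoll", 1),
        (3, "MAR", "S017", "Renee Alston", 1),
        (8, "AUG", "S006", "Falito Ollerton", 1),
        (8, "AUG", "S017", "Renee Alston", 1),
        (8, "AUG", "S003", "Suzi Maccari", 1),
        (10, "OCT", "S003", "Suzi Maccari", 3),
        (10, "OCT", "S020", "Abbie Gosnoll", 2),
    ],
)

query = """
SELECT month, staffID, staffname, total
FROM (
    -- Partition by month only: staff compete within a month, and RANK gives
    -- every staff member tied for the top count the value 1.
    SELECT *, RANK() OVER (PARTITION BY month_num ORDER BY total DESC) AS rnk
    FROM monthly_totals
) t
WHERE rnk = 1
ORDER BY month_num, staffID
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

If the staff columns were also in the PARTITION BY, every row would sit alone in its partition and rank first, so nothing would be filtered out.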
Tim's answer is fine. However, I strongly encourage you to make some changes to the query.
First, for the where clause don't use extract(). Use direct date comparisons. Second, include the year and month in the aggregation. Then, be sure that you qualify all column references.
That allows you to do:
SELECT sd.*
FROM (SELECT TO_CHAR(d.receivedDate, 'YYYY-MON') AS Month_year,
s.staffID, s.staffname, COUNT(*) AS Total_Order_Delivered,
RANK() OVER (PARTITION BY TO_CHAR(d.receivedDate, 'YYYY-MON') ORDER BY COUNT(*) DESC) as seqnum,
MIN(d.receiveddate) as min_receiveddate
FROM deliverystaff s JOIN
delivery d
ON d.staffid = s.staffid
WHERE d.receivedDate >= DATE '2020-01-01' AND
d.receivedDate < DATE '2021-01-01'
GROUP BY TO_CHAR(d.receivedDate, 'YYYY-MON'),
s.staffID, s.staffname
) sd
WHERE seqnum = 1
ORDER BY min_receiveddate;
In addition to the above, this allows you to order the results chronologically and works if you extend the time frame to more than one year.
By executing this query
SELECT year, genre, COUNT(genre)
FROM Oscar
GROUP BY year, genre
I got the following output:
2016 Action 2
2016 Romance 1
2017 Action 1
2017 Romance 2
2018 Fantasy 1
2019 Action 1
2019 Fantasy 2
2020 Action 3
2020 Fantasy 1
2020 Romance 1
Now I want to display only the genre with the highest count for each year. What is the best way to do this?
So I want the output to look like this:
2016 Action
2017 Romance
2018 Fantasy
2019 Fantasy
2020 Action
You can use window functions:
SELECT year, genre
FROM (
SELECT year, genre, RANK() OVER(PARTITION BY year ORDER BY COUNT(*) DESC) rn
FROM Oscar
GROUP BY year, genre
) t
WHERE rn = 1
If your database does not support window functions (e.g. MySQL < 8.0), another option is:
SELECT year, genre
FROM Oscar o
GROUP BY year, genre
HAVING COUNT(*) = (
SELECT COUNT(*)
FROM Oscar o1
WHERE o1.year = o.year
GROUP BY o1.genre
ORDER BY COUNT(*) DESC LIMIT 1
)
Use window functions:
SELECT year, genre
FROM (SELECT year, genre, COUNT(*) as cnt,
RANK() OVER (PARTITION BY year ORDER BY COUNT(*) DESC) as seqnum
FROM Oscar
GROUP BY year, genre
) yg
WHERE seqnum = 1;
If there are ties, RANK() returns all highest ranked values. Use ROW_NUMBER() if you specifically want one row, even when there are ties for first.
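
To compare the two window functions on the question's data, here is a minimal sketch using Python's sqlite3 module. The raw Oscar rows are not shown in the question, so they are rebuilt from the counts in the GROUP BY output, and the aggregation is split into its own CTE for readability; both choices are assumptions of this sketch.

```python
import sqlite3

# (year, genre, count) triples copied from the GROUP BY output in the question.
counts = [
    (2016, "Action", 2), (2016, "Romance", 1),
    (2017, "Action", 1), (2017, "Romance", 2),
    (2018, "Fantasy", 1),
    (2019, "Action", 1), (2019, "Fantasy", 2),
    (2020, "Action", 3), (2020, "Fantasy", 1), (2020, "Romance", 1),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Oscar (year INTEGER, genre TEXT)")
# Expand each count back into that many raw rows.
for year, genre, n in counts:
    conn.executemany("INSERT INTO Oscar VALUES (?, ?)", [(year, genre)] * n)


def top_genres(window_fn):
    """Return the top genre(s) per year using RANK or ROW_NUMBER."""
    query = f"""
    WITH yg AS (
        SELECT year, genre, COUNT(*) AS cnt
        FROM Oscar
        GROUP BY year, genre
    )
    SELECT year, genre
    FROM (
        SELECT year, genre,
               {window_fn}() OVER (PARTITION BY year ORDER BY cnt DESC) AS seqnum
        FROM yg
    ) t
    WHERE seqnum = 1
    ORDER BY year
    """
    return conn.execute(query).fetchall()


ranked = top_genres("RANK")
numbered = top_genres("ROW_NUMBER")
print(ranked)
print(numbered)
```

On this data no year has a tie for first place, so both functions return the same five rows; they would only differ for a year in which two genres share the highest count.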