How to get min value at max date in SQL?

I have a table with snapshot data. It has productid, date, and quantity columns. I need to find the min value on the max date. Say we have product X: its last snapshot was on date Y, but it has two snapshots on Y, with quantities 9 and 8. I need to get
product_id | date | quantity
X Y 8
So far I came up with this.
select
productid
, max(snapshot_date) max_date
, min(quantity) min_quantity
from snapshot_table
group by 1
It works, but I don't know why. Why doesn't this return the min value for each date?

I would use RANK here along with a scalar subquery:
WITH cte AS (
SELECT *, RANK() OVER (ORDER BY quantity) rnk
FROM snapshot_table
WHERE snapshot_date = (SELECT MAX(snapshot_date) FROM snapshot_table)
)
SELECT productid, snapshot_date, quantity
FROM cte
WHERE rnk = 1;
Note that this solution caters to the possibility that two or more records are tied for the lowest quantity among those most recent records.
Edit: We could simplify by doing away with the CTE and instead using the QUALIFY clause for the restriction on the RANK:
SELECT productid, snapshot_date, quantity
FROM snapshot_table
WHERE snapshot_date = (SELECT MAX(snapshot_date) FROM snapshot_table)
QUALIFY RANK() OVER (ORDER BY quantity) = 1;
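Both queries above filter on the single latest snapshot_date in the whole table and rank without a partition, so they only consider rows from that one date. If the result is needed per product, a partitioned variant may be closer to what the question asks. Below is a minimal sketch using Python's built-in sqlite3 module (assuming SQLite 3.25+ for window functions); the sample rows are made up for illustration. It also shows why the question's query is not reliable: MAX and MIN are computed independently over each group, so the min quantity can come from an older date.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE snapshot_table (productid TEXT, snapshot_date TEXT, quantity INTEGER)"
)
# Hypothetical sample rows: X has an older, smaller quantity on 2024-01-01.
conn.executemany(
    "INSERT INTO snapshot_table VALUES (?, ?, ?)",
    [
        ("X", "2024-01-01", 3),
        ("X", "2024-01-02", 9),
        ("X", "2024-01-02", 8),
        ("Z", "2024-01-01", 5),
    ],
)

# The question's query: MAX and MIN are evaluated independently per group.
naive = conn.execute("""
    SELECT productid, MAX(snapshot_date), MIN(quantity)
    FROM snapshot_table
    GROUP BY productid
    ORDER BY productid
""").fetchall()

# Rank inside each product: latest date first, then lowest quantity.
per_product = conn.execute("""
    SELECT productid, snapshot_date, quantity
    FROM (
        SELECT *, RANK() OVER (PARTITION BY productid
                               ORDER BY snapshot_date DESC, quantity) AS rnk
        FROM snapshot_table
    )
    WHERE rnk = 1
    ORDER BY productid
""").fetchall()

print(naive)
print(per_product)
```

For product X the naive query pairs the latest date with quantity 3, which belongs to the older snapshot, while the ranked query returns quantity 8.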

Also consider the approach below:
select distinct product_id,
max(snapshot_date) over product as max_date,
first_value(quantity) over(product order by snapshot_date desc, quantity) as min_quantity
from your_table
window product as (partition by product_id)

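Use row_number(), ordering by date descending and then by quantity, so that the lowest quantity on the latest date comes first:
with cte as (
select *,
row_number() over(partition by product_id order by date desc, quantity) rn
from table_name
)
select * from cte where rn = 1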

Related

Calculating Top N items per dimension

I have the following query that shows total sales for each product on an hourly basis. However, the data set is very large and I don't want to see all products, so I would like to see the top 1000 product_id values by sales for each date, hour, and category_id.
SELECT date,
hour,
category_id,
product_id,
sum(sales) AS sales
FROM a
LEFT JOIN b
ON a.product_id = b.product_id
WHERE date(date) >= date('2021-01-01')
GROUP BY 1, 2, 3, 4
How can I do this in Athena?
Thanks in advance.
You can apply the rank function to your result and then filter on the rank:
SELECT date,
hour,
category_id,
product_id,
sales
FROM
(
SELECT *,
rank() OVER (PARTITION BY date, hour, category_id
ORDER BY sales DESC) AS rnk
FROM (your query)
)
WHERE rnk <= 1000
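As a small illustration of the top-N-per-group pattern above, here is a sketch that runs the same shape of query through Python's built-in sqlite3 module (assuming SQLite 3.25+). The table name hourly_sales, the sample rows, and N = 2 are assumptions chosen to keep the output short; the table stands in for the already aggregated result of the original query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table holding the aggregated hourly sales per product.
conn.execute(
    "CREATE TABLE hourly_sales "
    "(date TEXT, hour INTEGER, category_id INTEGER, product_id INTEGER, sales INTEGER)"
)
conn.executemany(
    "INSERT INTO hourly_sales VALUES (?, ?, ?, ?, ?)",
    [
        ("2021-01-01", 0, 1, 10, 50),
        ("2021-01-01", 0, 1, 11, 70),
        ("2021-01-01", 0, 1, 12, 20),
        ("2021-01-01", 0, 2, 13, 5),
    ],
)

top_n = conn.execute("""
    SELECT date, hour, category_id, product_id, sales
    FROM (
        SELECT *,
               RANK() OVER (PARTITION BY date, hour, category_id
                            ORDER BY sales DESC) AS rnk
        FROM hourly_sales
    )
    WHERE rnk <= ?
    ORDER BY category_id, sales DESC
""", (2,)).fetchall()

print(top_n)
```

Because RANK gives tied rows the same rank, a group can return more than N rows when sales are tied at the cutoff; ROW_NUMBER would cap each group at exactly N.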

Retrieve the most recent 5 days of forecasts for each city with the latest issue date

I need to retrieve the most recent 5 days of forecast info for each city.
My table looks like below
The real problem is with the issue date.
A city may have several forecast records for the same date, each with a distinct issue date.
I need to retrieve the 5 most recent records for each city, taking the latest issue date for each forecast date.
I have tried something like the query below, but it does not give the expected result:
SELECT * FROM(
SELECT
ROW_NUMBER () OVER (PARTITION BY CITY_ID ORDER BY FORECAST_DATE DESC, ISSUE_DATE DESC) AS rn,
CITY_ID, FORECAST_DATE, ISSUE_DATE
FROM
FORECAST
GROUP BY FORECAST_DATE
) WHERE rn <= 5
Any suggestion or advice will be helpful
This will get the latest issued forecast per day over the most recent 5 days for each city:
SELECT *
FROM (
SELECT f.*,
DENSE_RANK() OVER ( PARTITION BY city_id ORDER BY forecast_date DESC )
AS forecast_rank,
ROW_NUMBER() OVER ( PARTITION BY city_id, forecast_date ORDER BY issue_date DESC )
AS issue_rn
FROM Forecast f
)
WHERE forecast_rank <= 5
AND issue_rn = 1;
PARTITION BY works like GROUP BY, but it applies only to the window function and does not collapse rows.
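To see the two window functions working together, here is a minimal sketch with Python's built-in sqlite3 module (assuming SQLite 3.25+). The rows are invented: one city with six forecast dates, where the latest date was issued twice. DENSE_RANK numbers the distinct forecast dates per city, and ROW_NUMBER picks the newest issue within each date.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Forecast (city_id INTEGER, forecast_date TEXT, issue_date TEXT)"
)
# Hypothetical data: 2024-03-06 has two issues; 2024-03-01 is the sixth-newest date.
conn.executemany(
    "INSERT INTO Forecast VALUES (?, ?, ?)",
    [
        (1, "2024-03-06", "2024-03-01"),
        (1, "2024-03-06", "2024-03-02"),
        (1, "2024-03-05", "2024-03-01"),
        (1, "2024-03-04", "2024-03-01"),
        (1, "2024-03-03", "2024-03-01"),
        (1, "2024-03-02", "2024-03-01"),
        (1, "2024-03-01", "2024-02-28"),
    ],
)

latest = conn.execute("""
    SELECT city_id, forecast_date, issue_date
    FROM (
        SELECT f.*,
               DENSE_RANK() OVER (PARTITION BY city_id
                                  ORDER BY forecast_date DESC) AS forecast_rank,
               ROW_NUMBER() OVER (PARTITION BY city_id, forecast_date
                                  ORDER BY issue_date DESC) AS issue_rn
        FROM Forecast f
    )
    WHERE forecast_rank <= 5 AND issue_rn = 1
    ORDER BY city_id, forecast_date DESC
""").fetchall()

for row in latest:
    print(row)
```

With this data the result has one row per forecast date for the five newest dates, and the row for 2024-03-06 carries the later issue date.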
Try
with CTE as
(
select t1.*,
row_number() over (partition by city_id, forecast_date order by issue_date desc) as r_ord
from Forecast t1
)
select CTE.*
from CTE
where r_ord <= 5
Try this
SELECT * FROM(
SELECT
ROW_NUMBER () OVER (PARTITION BY CITY_ID, FORECAST_DATE order by ISSUE_DATE DESC) AS rn,
CITY_ID, FORECAST_DATE, ISSUE_DATE
FROM
FORECAST
) WHERE rn <= 5

Find max value for each year

I have a question that is asking:
-List the max sales for each year?
I think I have the starter query but I can't figure out how to get all the years in my answer:
SELECT TO_CHAR(stockdate,'YYYY') AS year, sales
FROM sample_newbooks
WHERE sales = (SELECT MAX(sales) FROM sample_newbooks);
This query gives me the year with the max sales. I need max sales for EACH year. Thanks for your help!
Use group by and max if all you need is year and max sales of the year.
select
to_char(stockdate, 'yyyy') year,
max(sales) sales
from sample_newbooks
group by to_char(stockdate, 'yyyy')
If you need rows with all the columns with max sales for the year, you can use window function row_number:
select
*
from (
select
t.*,
row_number() over (partition by to_char(stockdate, 'yyyy') order by sales desc) rn
from sample_newbooks t
) t where rn = 1;
If you want to get the rows with ties on sales, use rank:
select
*
from (
select
t.*,
rank() over (partition by to_char(stockdate, 'yyyy') order by sales desc) rn
from sample_newbooks t
) t where rn = 1;
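The difference between row_number and rank only shows up when sales are tied. The sketch below uses Python's built-in sqlite3 module (assuming SQLite 3.25+) with made-up rows; since SQLite has no to_char, strftime('%Y', ...) is swapped in to extract the year, and the title column is an assumption added for readability.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sample_newbooks (title TEXT, stockdate TEXT, sales INTEGER)"
)
# Hypothetical rows: books A and B tie for the top sales in 2020.
conn.executemany(
    "INSERT INTO sample_newbooks VALUES (?, ?, ?)",
    [
        ("A", "2020-03-01", 100),
        ("B", "2020-07-15", 100),
        ("C", "2021-02-10", 50),
    ],
)

def top_per_year(func):
    """Return the top-selling rows per year using ROW_NUMBER or RANK."""
    # Only these two fixed names are ever interpolated into the SQL text.
    assert func in ("ROW_NUMBER", "RANK")
    return conn.execute(f"""
        SELECT strftime('%Y', stockdate) AS year, title, sales
        FROM (
            SELECT t.*,
                   {func}() OVER (PARTITION BY strftime('%Y', stockdate)
                                  ORDER BY sales DESC) AS rn
            FROM sample_newbooks t
        )
        WHERE rn = 1
        ORDER BY year, title
    """).fetchall()

with_row_number = top_per_year("ROW_NUMBER")
with_rank = top_per_year("RANK")

print(with_row_number)
print(with_rank)
```

row_number returns exactly one row per year (which of the tied 2020 rows is kept is not defined), while rank returns both tied rows.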

Get the first and last record of each item for the month

Product ID Quantity DateAdded
1 100 4/1/14
2 200 4/2/14
3 300 4/2/14
1 80 4/3/14
3 40 4/5/14
2 5 4/6/14
1 10 4/7/14
I am using this SQL statement to display the first and last record of each item:
SELECT
ProductID, MIN(Quantity) AS Starting, MAX(Quantity) AS Ending
FROM
Records
WHERE
DateAdded BETWEEN '2014-04-01' AND '2014-04-30'
GROUP BY
ProductID, Quantity
but I am getting the same values for the Starting and Ending columns. I want to achieve something like this:
Product ID Starting Ending
1 100 10
2 200 5
3 300 40
Use the row_number() ranking function
select starting.*, ending.ending
from
(select ProductID, quantity as starting from
(select * , ROW_NUMBER() over (partition by productid order by dateadded) rn
from yourtable
where DateAdded BETWEEN '2014-04-01' AND '2014-04-30'
) first
where rn = 1) starting
inner join
(select ProductID, quantity as ending from
(select * , ROW_NUMBER() over (partition by productid order by dateadded desc) rn
from yourtable
where DateAdded BETWEEN '2014-04-01' AND '2014-04-30'
) last
where rn = 1) ending
on starting.productid=ending.productid
The first subquery gets the first entry for the time period; the second gets the last entry.
You are getting the same quantities because you are aggregating by quantity in the group by as well as product. Your version of the query, properly written, would be:
SELECT ProductID, MIN(Quantity) AS Starting, MAX(Quantity) AS Ending
FROM Records
WHERE DateAdded BETWEEN '2014-04-01' AND '2014-04-30'
GROUP BY ProductID;
However, this doesn't give you the first and last values. It only gives you the minimum and maximum ones. To get those values, use row_number() and conditional aggregation:
SELECT ProductID,
MAX(CASE WHEN seqnum_asc = 1 THEN Quantity END) as Starting,
MAX(CASE WHEN seqnum_desc = 1 THEN Quantity END) as Ending
FROM (SELECT r.*,
row_number() over (partition by ProductID order by dateadded asc) as seqnum_asc,
row_number() over (partition by ProductID order by dateadded desc) as seqnum_desc
FROM Records r
WHERE DateAdded BETWEEN '2014-04-01' AND '2014-04-30'
) r
GROUP BY ProductID;
If you are using SQL Server 2012 or later, you can also use FIRST_VALUE() and LAST_VALUE() instead of row_number().
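As a quick check of the row_number() plus conditional aggregation approach, the sketch below loads the sample rows from the question into an in-memory SQLite database through Python's built-in sqlite3 module (assuming SQLite 3.25+). The dates are rewritten in ISO format, and the date filter is applied inside the subquery so that the row numbers are computed only over the chosen month.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Records (ProductID INTEGER, Quantity INTEGER, DateAdded TEXT)"
)
# Sample data from the question, with dates in ISO format.
conn.executemany(
    "INSERT INTO Records VALUES (?, ?, ?)",
    [
        (1, 100, "2014-04-01"),
        (2, 200, "2014-04-02"),
        (3, 300, "2014-04-02"),
        (1, 80, "2014-04-03"),
        (3, 40, "2014-04-05"),
        (2, 5, "2014-04-06"),
        (1, 10, "2014-04-07"),
    ],
)

first_last = conn.execute("""
    SELECT ProductID,
           MAX(CASE WHEN seqnum_asc = 1 THEN Quantity END) AS Starting,
           MAX(CASE WHEN seqnum_desc = 1 THEN Quantity END) AS Ending
    FROM (
        SELECT r.*,
               ROW_NUMBER() OVER (PARTITION BY ProductID
                                  ORDER BY DateAdded ASC) AS seqnum_asc,
               ROW_NUMBER() OVER (PARTITION BY ProductID
                                  ORDER BY DateAdded DESC) AS seqnum_desc
        FROM Records r
        WHERE DateAdded BETWEEN '2014-04-01' AND '2014-04-30'
    ) r
    GROUP BY ProductID
    ORDER BY ProductID
""").fetchall()

print(first_last)  # [(1, 100, 10), (2, 200, 5), (3, 300, 40)]
```

The output matches the table the question asks for: the quantity on each product's earliest date as Starting and on its latest date as Ending.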
Change your query to this
SELECT
ProductID, MIN(Quantity) AS Starting, MAX(Quantity) AS Ending
FROM
Records
WHERE
DateAdded BETWEEN '2014-04-01' AND '2014-04-30'
GROUP BY
ProductID
You don't need to group by quantity.
DECLARE @test TABLE
(ID INT, Name INT)
INSERT INTO @test
VALUES
(1, 100),
(2, 200),
(3, 300),
(1, 5),
(2, 10),
(3, 15);
select ID, MIN(name), MAX(name) from @test
group by ID
This works in SQL Server 2012 and newer:
with x as (
select distinct productid
,first_value(quantity) over(partition by productid order by dateadded
range between unbounded preceding and current row) as starting
,last_value(quantity) over(partition by productid order by dateadded
range between current row and unbounded following) as ending
from #t
)
select productid, starting, ending
from x
Single pass through the table.
I did not try it on SQL Server, but this query works on MySQL:
SELECT
ProductID, MAX(Quantity) AS Starting, MIN(Quantity) AS Ending
FROM
Records
WHERE
DateAdded BETWEEN '2014-04-01' AND '2014-04-30'
GROUP BY
ProductID;
For DateAdded field:
SELECT
ProductID, MAX(Quantity) AS Starting, MIN(Quantity) AS Ending
FROM
Records
WHERE
convert(datetime, DateAdded) BETWEEN '2014-04-01' AND '2014-04-30'
GROUP BY
ProductID;
P.S. Missing sqlfiddle badly.. :(

Finding a date with the largest sum

I have a database of transactions, accounts, profit/loss, and date. I need to find the dates on which the largest profit occurs for each account. I have already found a way to get the actual max/min values, but I can't seem to pull the date from it. My code so far is like this:
Select accountnum, min(ammount)
from table
where date > '02-Jan-13'
group by accountnum
order by accountnum
Ideally I would like to see account num, the min or max, and then the date which this occurred on.
Try something like this to get the min and max amount for each customer and the date it happened.
WITH max_amount as (
SELECT accountnum, max(amount) amount, date
FROM TABLE
GROUP BY accountnum, date
),
min_amount as (
SELECT accountnum, min(amount) amount, date
FROM TABLE
GROUP BY accountnum, date
)
SELECT t.accountnum, ma.amount, ma.date, mi.amount, mi.date
FROM table t
JOIN max_amount ma
ON ma.accountnum = t.accountnum
JOIN min_amount mi
ON mi.accountnum = t.accountnum
If you want the data for just this year you could add a where clause to the end of the statement
WHERE t.date > '02-Jan-13'
The easiest way to do this is using window/analytic functions. These are ANSI standard and most databases support them (MySQL before 8.0 and Access being two notable exceptions).
Here is one way:
select t.accountnum, min_amount, max_amount,
min(case when amount = min_amount then date end) as min_amount_date,
min(case when amount = max_amount then date end) as max_amount_date
from (Select t.*,
min(amount) over (partition by accountnum) as min_amount,
max(amount) over (partition by accountnum) as max_amount
from table t
where date > '02-Jan-13'
) t
group by t.accountnum, min_amount, max_amount
order by t.accountnum;
The subquery calculates the minimum and maximum amount for each account, using min() and max() as window functions. The outer query selects these values. It then uses conditional aggregation to get the first date when each of those values occurred.
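The window function plus conditional aggregation idea can be tried out with the sketch below, which uses Python's built-in sqlite3 module (assuming SQLite 3.25+). The table name transactions and the sample rows are assumptions for illustration; account 1 reaches its maximum amount twice, so the earliest of those dates is reported.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table name; the thread uses the placeholder name "table".
conn.execute(
    "CREATE TABLE transactions (accountnum INTEGER, amount INTEGER, date TEXT)"
)
conn.executemany(
    "INSERT INTO transactions VALUES (?, ?, ?)",
    [
        (1, -20, "2013-02-01"),
        (1, 50, "2013-03-01"),
        (1, 50, "2013-04-01"),
        (2, 10, "2013-02-15"),
    ],
)

extremes = conn.execute("""
    SELECT accountnum, min_amount, max_amount,
           MIN(CASE WHEN amount = min_amount THEN date END) AS min_amount_date,
           MIN(CASE WHEN amount = max_amount THEN date END) AS max_amount_date
    FROM (
        SELECT t.*,
               MIN(amount) OVER (PARTITION BY accountnum) AS min_amount,
               MAX(amount) OVER (PARTITION BY accountnum) AS max_amount
        FROM transactions t
        WHERE date > '2013-01-02'
    ) t
    GROUP BY accountnum, min_amount, max_amount
    ORDER BY accountnum
""").fetchall()

for row in extremes:
    print(row)
```

Each output row holds the account number, its minimum and maximum amounts, and the first date on which each of those amounts occurred.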
;with cte as
(
select accountnum, ammount, date,
row_number() over (partition by accountnum order by ammount desc) rn,
max(ammount) over (partition by accountnum) maxamount,
min(ammount) over (partition by accountnum) minamount
from table
where date > '20130102'
)
select accountnum,
ammount as amount,
date as date_of_max_amount,
minamount,
maxamount
from cte where rn = 1