SQL Server Find Max and Min in Group

I'm using Microsoft SQL Server and have a table like this:
DATE        ITEM    BUYER  QTY_BUY
2022-01-01  ITEM A  TOMMY  5
2022-01-01  ITEM A  BENNY  3
2022-01-01  ITEM A  ANDY   1
2022-01-01  ITEM A  JOHN   8
2022-01-01  ITEM B  TOMMY  2
2022-01-01  ITEM B  BENNY  10
2022-01-01  ITEM B  ANDY   3
2022-01-01  ITEM B  JOHN   6
2022-01-02  ITEM A  TOMMY  3
2022-01-02  ITEM A  BENNY  0
2022-01-02  ITEM A  ANDY   5
2022-01-02  ITEM A  JOHN   6
I want to show the top buyer and the min buyer for each date and item, so the result looks like this:
DATE        ITEM    TOP_BUYER  TOP_QTY  MIN_BUYER  QTY_MIN
2022-01-01  ITEM A  JOHN       8        ANDY       1
2022-01-01  ITEM B  BENNY      10       TOMMY      2
2022-01-02  ITEM A  JOHN       6        BENNY      0
Please help me do that. I have tried many tricks but cannot get this result. Thanks in advance.

We can handle this requirement using ROW_NUMBER:
WITH cte AS (
SELECT *, ROW_NUMBER() OVER (PARTITION BY DATE, ITEM ORDER BY QTY_BUY) rn1,
ROW_NUMBER() OVER (PARTITION BY DATE, ITEM ORDER BY QTY_BUY DESC) rn2
FROM yourTable
)
SELECT DATE, ITEM,
MAX(CASE WHEN rn2 = 1 THEN BUYER END) AS TOP_BUYER,
MAX(CASE WHEN rn2 = 1 THEN QTY_BUY END) AS TOP_QTY,
MAX(CASE WHEN rn1 = 1 THEN BUYER END) AS MIN_BUYER,
MAX(CASE WHEN rn1 = 1 THEN QTY_BUY END) AS QTY_MIN
FROM cte
GROUP BY DATE, ITEM
ORDER BY DATE, ITEM;
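To check the ROW_NUMBER approach against the sample data from the question, here is a minimal sketch that runs the same query shape in Python. It assumes SQLite (3.25 or later, for window functions) as a stand-in for SQL Server; the table name `purchases` and the column name `buy_date` are made up for this illustration.

```python
import sqlite3

# Assumption: SQLite stands in for SQL Server; `purchases` and `buy_date`
# are hypothetical names used only for this sketch.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE purchases (buy_date TEXT, item TEXT, buyer TEXT, qty_buy INTEGER)"
)
rows = [
    ("2022-01-01", "ITEM A", "TOMMY", 5),
    ("2022-01-01", "ITEM A", "BENNY", 3),
    ("2022-01-01", "ITEM A", "ANDY", 1),
    ("2022-01-01", "ITEM A", "JOHN", 8),
    ("2022-01-01", "ITEM B", "TOMMY", 2),
    ("2022-01-01", "ITEM B", "BENNY", 10),
    ("2022-01-01", "ITEM B", "ANDY", 3),
    ("2022-01-01", "ITEM B", "JOHN", 6),
    ("2022-01-02", "ITEM A", "TOMMY", 3),
    ("2022-01-02", "ITEM A", "BENNY", 0),
    ("2022-01-02", "ITEM A", "ANDY", 5),
    ("2022-01-02", "ITEM A", "JOHN", 6),
]
conn.executemany("INSERT INTO purchases VALUES (?, ?, ?, ?)", rows)

# rn1 numbers each (date, item) group from the smallest quantity,
# rn2 from the largest; conditional aggregation then pivots both
# "first" rows into a single output row per group.
query = """
WITH cte AS (
    SELECT *,
           ROW_NUMBER() OVER (PARTITION BY buy_date, item ORDER BY qty_buy)      AS rn1,
           ROW_NUMBER() OVER (PARTITION BY buy_date, item ORDER BY qty_buy DESC) AS rn2
    FROM purchases
)
SELECT buy_date, item,
       MAX(CASE WHEN rn2 = 1 THEN buyer END)   AS top_buyer,
       MAX(CASE WHEN rn2 = 1 THEN qty_buy END) AS top_qty,
       MAX(CASE WHEN rn1 = 1 THEN buyer END)   AS min_buyer,
       MAX(CASE WHEN rn1 = 1 THEN qty_buy END) AS qty_min
FROM cte
GROUP BY buy_date, item
ORDER BY buy_date, item
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

Note that ROW_NUMBER breaks ties arbitrarily: if two buyers share the highest (or lowest) quantity in a group, only one of them is reported.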

Another solution is to use FIRST_VALUE with a PARTITION BY window.
This query was tested in SQL Server:
select distinct [date], Item
, FIRST_VALUE(buyer) OVER (partition by [date], item ORDER BY qty_buy desc) AS Top_Buyer
, FIRST_VALUE(qty_buy) OVER (partition by [date], item ORDER BY qty_buy desc) AS Top_Qty
, FIRST_VALUE(buyer) OVER (partition by [date], item ORDER BY [date], item, qty_buy asc) AS Min_Buyer
, FIRST_VALUE(qty_buy) OVER (partition by [date], item ORDER BY [date], item, qty_buy asc) AS Qty_Min
from testtable

It can also be done using a simple GROUP BY and OUTER APPLY (the query below uses a table named test with columns bdate, item, buyer and qty):
select t.bdate,
t.item,
max(tb.buyer) as top_buyer,
max(t.qty) as top_qty,
max(mb.buyer) as min_buyer,
min(t.qty) as qty_min
from test t
outer apply ( select top 1 t2.buyer
from test t2
where t2.bdate = t.bdate
and t2.item = t.item
order by t2.qty desc
) tb
outer apply ( select top 1 t2.buyer
from test t2
where t2.bdate = t.bdate
and t2.item = t.item
order by t2.qty
) mb
group by t.bdate,
t.item
order by t.bdate

Related

SQL: An alternative to Group By approach using Partition By

I have a table in a DW system (say AWS SnowFlake):
UPC_CODE A_PRICE A_QTY DATE COMPANY_CODE A_CAT
1001 100.25 2 2021-05-06 1 PB
1001 2122.75 10 2021-05-01 1 PB
1002 212.75 5 2021-05-07 2 PT
1002 3100.75 10 2021-05-01 2 PB
What I am looking for:
For each UPC_CODE and COMPANY_CODE, only the latest row should be picked up.
So the resulting table should look like this:
UPC_CODE A_PRICE A_QTY DATE COMPANY_CODE A_CAT
1001 100.25 2 2021-05-06 1 PB
1002 212.75 5 2021-05-07 2 PT
My current approach is the query below:
SELECT UPC_CODE,A_PRICE,A_QTY,MAX(DATE) AS F_DATE,COMPANY_CODE,A_CAT
FROM <table_name>
GROUP BY 1,2,3,5,6
Is there an alternative approach using PARTITION BY?
Your current GROUP BY query doesn't really do what you have in mind. One canonical approach here uses ROW_NUMBER:
WITH cte AS (
SELECT *, ROW_NUMBER() OVER (PARTITION BY UPC_CODE, COMPANY_CODE ORDER BY DATE DESC) rn
FROM yourTable
)
SELECT UPC_CODE, A_PRICE, A_QTY, DATE, COMPANY_CODE, A_CAT
FROM cte
WHERE rn = 1;
If you did want to use a GROUP BY approach, here is one way to do that:
SELECT t1.*
FROM yourTable t1
INNER JOIN
(
SELECT UPC_CODE, COMPANY_CODE, MAX(DATE) AS MAX_DATE
FROM yourTable
GROUP BY UPC_CODE, COMPANY_CODE
) t2
ON t2.UPC_CODE = t1.UPC_CODE AND
t2.COMPANY_CODE = t1.COMPANY_CODE AND
t2.MAX_DATE = t1.DATE;
In Snowflake (which your question mentions), you can use QUALIFY:
SELECT UPC_CODE, A_PRICE, A_QTY, DATE AS F_DATE, COMPANY_CODE, A_CAT
FROM <table_name>
QUALIFY ROW_NUMBER() OVER (PARTITION BY UPC_CODE, COMPANY_CODE
ORDER BY DATE DESC
) = 1;
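The difference between grouping by every column and numbering rows per (UPC_CODE, COMPANY_CODE) can be seen on the sample rows. The sketch below is an illustration only: it assumes SQLite (3.25+) rather than Snowflake, so it uses a subquery with `rn = 1` instead of QUALIFY, and the names `prices` and `sale_date` are hypothetical.

```python
import sqlite3

# Assumption: SQLite stands in for Snowflake; `prices` and `sale_date`
# are hypothetical names for this sketch.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE prices (upc_code INTEGER, a_price REAL, a_qty INTEGER, "
    "sale_date TEXT, company_code INTEGER, a_cat TEXT)"
)
conn.executemany(
    "INSERT INTO prices VALUES (?, ?, ?, ?, ?, ?)",
    [
        (1001, 100.25, 2, "2021-05-06", 1, "PB"),
        (1001, 2122.75, 10, "2021-05-01", 1, "PB"),
        (1002, 212.75, 5, "2021-05-07", 2, "PT"),
        (1002, 3100.75, 10, "2021-05-01", 2, "PB"),
    ],
)

# Grouping by every non-date column: each row is its own group here,
# so nothing is filtered out.
group_by_all = conn.execute("""
    SELECT upc_code, a_price, a_qty, MAX(sale_date), company_code, a_cat
    FROM prices
    GROUP BY upc_code, a_price, a_qty, company_code, a_cat
""").fetchall()

# Numbering rows per (upc_code, company_code) from the newest date
# and keeping only the first row of each partition.
latest = conn.execute("""
    SELECT upc_code, a_price, a_qty, sale_date, company_code, a_cat
    FROM (
        SELECT *,
               ROW_NUMBER() OVER (
                   PARTITION BY upc_code, company_code
                   ORDER BY sale_date DESC
               ) AS rn
        FROM prices
    )
    WHERE rn = 1
    ORDER BY upc_code
""").fetchall()

print(len(group_by_all))
print(latest)
```

On this data the all-columns GROUP BY keeps all four rows, while the partitioned ROW_NUMBER keeps one row per UPC_CODE and COMPANY_CODE.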

PostgreSQL Pivot by Last Date

I need to make a PIVOT table from Source like this table
FactID UserID Date Product QTY
1 11 01/01/2020 A 600
2 11 02/01/2020 A 400
3 11 03/01/2020 B 500
4 11 04/01/2020 B 200
6 22 06/01/2020 A 1000
7 22 07/01/2020 A 200
8 22 08/01/2020 B 300
9 22 09/01/2020 B 100
I need a pivot like this, where each product column holds the QTY from the last date:
UserID A B
11 400 200
22 200 100
My attempt in PostgreSQL:
Select
UserID,
MAX(CASE WHEN Product='A' THEN QTY END) AS "A",
MAX(CASE WHEN Product='B' THEN QTY END) AS "B"
FROM table
GROUP BY UserID
And Result
UserID A B
11 600 500
22 1000 300
That is, I get the result for the maximum QTY and not for the maximum date.
What do I need to add to get results by the maximum (last) date?
Postgres doesn't have "first" and "last" aggregation functions. One method for doing this (without a subquery) uses arrays:
select userid,
(array_agg(qty order by date desc) filter (where product = 'A'))[1] as a,
(array_agg(qty order by date desc) filter (where product = 'B'))[1] as b
from tab
group by userid;
Another method uses select distinct with first_value():
select distinct userid,
first_value(qty) over (partition by userid order by product = 'A' desc, date desc) as a,
first_value(qty) over (partition by userid order by product = 'B' desc, date desc) as b
from tab;
With the appropriate indexes, though, distinct on might be the fastest approach:
select userid,
max(qty) filter (where product = 'A') as a,
max(qty) filter (where product = 'B') as b
from (select distinct on (userid, product) t.*
from tab t
order by userid, product, date desc
) t
group by userid;
In particular, this can use an index on (userid, product, date desc). The improvement in performance will be most notable if there are many dates for a given user.
You can use the DENSE_RANK() window function to keep only the rows with the last date per Product and UserID before applying conditional aggregation:
SELECT UserID,
MAX(CASE WHEN Product='A' THEN QTY END) AS "A",
MAX(CASE WHEN Product='B' THEN QTY END) AS "B"
FROM
(
SELECT t.*, DENSE_RANK() OVER (PARTITION BY Product,UserID ORDER BY Date DESC) AS rn
FROM tab t
) q
WHERE rn = 1
GROUP BY UserID
This presumes that all date values are distinct (no ties occur for dates).
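The "filter to the last date, then pivot" idea from the answers above can be tried out on the sample data. This is a hedged sketch, not PostgreSQL itself: it assumes SQLite (3.25+), stores the dates as ISO strings (reading the question's dates as DD/MM/YYYY), and uses a hypothetical column name `fact_date`.

```python
import sqlite3

# Assumption: SQLite stands in for PostgreSQL; dates are stored as ISO
# strings so that text ordering matches date ordering.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tab (factid INTEGER, userid INTEGER, fact_date TEXT, "
    "product TEXT, qty INTEGER)"
)
conn.executemany(
    "INSERT INTO tab VALUES (?, ?, ?, ?, ?)",
    [
        (1, 11, "2020-01-01", "A", 600),
        (2, 11, "2020-01-02", "A", 400),
        (3, 11, "2020-01-03", "B", 500),
        (4, 11, "2020-01-04", "B", 200),
        (6, 22, "2020-01-06", "A", 1000),
        (7, 22, "2020-01-07", "A", 200),
        (8, 22, "2020-01-08", "B", 300),
        (9, 22, "2020-01-09", "B", 100),
    ],
)

# Plain conditional aggregation: picks the largest qty, not the latest one.
by_max_qty = conn.execute("""
    SELECT userid,
           MAX(CASE WHEN product = 'A' THEN qty END) AS a,
           MAX(CASE WHEN product = 'B' THEN qty END) AS b
    FROM tab
    GROUP BY userid
    ORDER BY userid
""").fetchall()

# Keep only the newest row per (userid, product) first, then pivot.
by_last_date = conn.execute("""
    SELECT userid,
           MAX(CASE WHEN product = 'A' THEN qty END) AS a,
           MAX(CASE WHEN product = 'B' THEN qty END) AS b
    FROM (
        SELECT t.*,
               ROW_NUMBER() OVER (
                   PARTITION BY userid, product
                   ORDER BY fact_date DESC
               ) AS rn
        FROM tab t
    ) q
    WHERE rn = 1
    GROUP BY userid
    ORDER BY userid
""").fetchall()

print(by_max_qty)
print(by_last_date)
```

Once only one row per user and product survives the filter, the MAX in the pivot no longer chooses between values; it merely collapses the NULLs.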

First value in DATE minus 30 days SQL

I have a bunch of data, out of which I'm showing the ID, the max date and its corresponding values (user id, type, ...). Then I need to take the MAX date for each ID, subtract 30 days, and show the first date within that period together with its corresponding values.
Example:
ID Date Name
1 01.05.2018 AAA
1 21.04.2018 CCC
1 05.04.2018 BBB
1 28.03.2018 AAA
expected:
ID max_date max_name previous_date previous_name
1 01.05.2018 AAA 05.04.2018 BBB
I have a working solution using subselects, but as I have quite a huge WHERE part, the refresh takes ages.
The subselect looks like this:
(SELECT MIN(N.name)
FROM t1 N
WHERE N.ID = T.ID
AND (N.date < MAX(T.date) AND N.date >= (MAX(T.date)-30))
AND (...)) AS PreviousName
How'd you write the select?
I'm using TSQL
Thanks
I can do this with 2 CTEs to build up the dates and names.
MS SQL Server 2017 Schema Setup:
CREATE TABLE t1 (ID int, theDate date, theName varchar(10)) ;
INSERT INTO t1 (ID, theDate, theName)
VALUES
( 1,'2018-05-01','AAA' )
, ( 1,'2018-04-21','CCC' )
, ( 1,'2018-04-05','BBB' )
, ( 1,'2018-03-27','AAA' )
, ( 2,'2018-05-02','AAA' )
, ( 2,'2018-05-21','CCC' )
, ( 2,'2018-03-03','BBB' )
, ( 2,'2018-01-20','AAA' )
;
Main Query:
;WITH cte1 AS (
SELECT t1.ID, t1.theDate, t1.theName
, DATEADD(day,-30,t1.theDate) AS dMinus30
, ROW_NUMBER() OVER (PARTITION BY t1.ID ORDER BY t1.theDate DESC) AS rn
FROM t1
)
, cte2 AS (
SELECT c2.ID, c2.theDate, c2.theName
, ROW_NUMBER() OVER (PARTITION BY c2.ID ORDER BY c2.theDate) AS rn
, COUNT(*) OVER (PARTITION BY c2.ID) AS theCount
FROM cte1
INNER JOIN cte1 c2 ON cte1.ID = c2.ID
AND c2.theDate >= cte1.dMinus30
WHERE cte1.rn = 1
GROUP BY c2.ID, c2.theDate, c2.theName
)
SELECT cte1.ID, cte1.theDate AS max_date, cte1.theName AS max_name
, cte2.theDate AS previous_date, cte2.theName AS previous_name
, cte2.theCount
FROM cte1
INNER JOIN cte2 ON cte1.ID = cte2.ID
AND cte2.rn=1
WHERE cte1.rn = 1
Results:
| ID | max_date | max_name | previous_date | previous_name |
|----|------------|----------|---------------|---------------|
| 1 | 2018-05-01 | AAA | 2018-04-05 | BBB |
| 2 | 2018-05-21 | CCC | 2018-05-02 | AAA |
cte1 computes, for every row, the date 30 days earlier (dMinus30) and uses a ROW_NUMBER() window function to number each ID's rows from the most recent date, so rn = 1 marks the max_date and max_name. cte2 joins back to cte1 to get all rows dated within the 30 days before that max date, then numbers them in ascending date order, so its rn = 1 marks the earliest date in the window. The outer query joins those two results, taking the most recent row from cte1 and the earliest in-window row from cte2.
I'm not sure how well it will scale with your data, but the CTEs should optimize reasonably well.
EDIT: For the additional requirement, I just added in another COUNT() window function to cte2.
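For experimenting with the 30-day window logic outside SQL Server, here is a compact variant of the same idea. It is a sketch under stated assumptions, not the answer's exact query: it uses SQLite (3.25+), replaces DATEADD with SQLite's `date(x, '-30 days')`, computes the per-ID maximum with a MAX() window instead of a self-join, and the column names `the_date` and `the_name` are hypothetical.

```python
import sqlite3

# Assumption: SQLite stands in for T-SQL; date(x, '-30 days') plays the
# role of DATEADD(day, -30, x). Data is taken from the schema setup above.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t1 (id INTEGER, the_date TEXT, the_name TEXT)")
conn.executemany(
    "INSERT INTO t1 VALUES (?, ?, ?)",
    [
        (1, "2018-05-01", "AAA"),
        (1, "2018-04-21", "CCC"),
        (1, "2018-04-05", "BBB"),
        (1, "2018-03-27", "AAA"),
        (2, "2018-05-02", "AAA"),
        (2, "2018-05-21", "CCC"),
        (2, "2018-03-03", "BBB"),
        (2, "2018-01-20", "AAA"),
    ],
)

query = """
WITH with_max AS (
    -- attach each ID's latest date to every one of its rows
    SELECT id, the_date, the_name,
           MAX(the_date) OVER (PARTITION BY id) AS max_date
    FROM t1
),
in_window AS (
    -- keep only rows within 30 days of the latest date, numbered both ways
    SELECT id, the_date, the_name,
           ROW_NUMBER() OVER (PARTITION BY id ORDER BY the_date DESC) AS rn_latest,
           ROW_NUMBER() OVER (PARTITION BY id ORDER BY the_date)      AS rn_earliest
    FROM with_max
    WHERE the_date >= date(max_date, '-30 days')
)
SELECT id,
       MAX(CASE WHEN rn_latest = 1 THEN the_date END)   AS max_date,
       MAX(CASE WHEN rn_latest = 1 THEN the_name END)   AS max_name,
       MAX(CASE WHEN rn_earliest = 1 THEN the_date END) AS previous_date,
       MAX(CASE WHEN rn_earliest = 1 THEN the_name END) AS previous_name
FROM in_window
GROUP BY id
ORDER BY id
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

One design caveat: if an ID has no other row inside the window, the "previous" columns fall back to the max row itself, whereas the subselect in the question excludes the max date.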
I would do:
select id,
max(case when seqnum = 1 then date end) as max_date,
max(case when seqnum = 1 then name end) as max_name,
max(case when seqnum = 2 then date end) as prev_date,
max(case when seqnum = 2 then name end) as prev_name
from (select e.*, row_number() over (partition by id order by date desc) as seqnum
from example e
) e
group by id;

SQL group by with NULL

I have a table something like this:
ID ProductID ProductName Price
== ========= =========== =====
1 XX1 TShirt 10
2 XX1 TShirt 10
3 NULL TShirt 10
4 XX2 Shirt 20
5 XX3 Shirt1 30
Now I want to group this by ProductName, summing the Price, so that the results are as follows:
ID ProductID ProductName Price
== ========= =========== =====
1 XX1 TShirt 30
4 XX2 Shirt 20
5 XX3 Shirt1 30
Thanks
ProductID seems to be irrelevant for the group, so don't use it. To get all columns you could use a CTE and a ranking function like ROW_NUMBER:
WITH CTE AS(
SELECT ID,
ProductID,
ProductName,
Price = SUM(Price) OVER (PARTITION BY ProductName),
RN = ROW_NUMBER() OVER (PARTITION BY ProductName ORDER BY ID)
FROM dbo.TableName
)
SELECT CTE.* FROM CTE
WHERE RN = 1
If you want to take the row which contains a ProductID (where it is not NULL), modify the ORDER BY:
WITH CTE AS(
SELECT ID,
ProductID,
ProductName,
Price = SUM(Price) OVER (PARTITION BY ProductName),
RN = ROW_NUMBER() OVER (PARTITION BY ProductName
ORDER BY CASE WHEN ProductID IS NOT NULL
THEN 0 ELSE 1 END, ID)
FROM dbo.TableName
)
SELECT CTE.* FROM CTE
WHERE RN = 1
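The combination of a windowed SUM and a ROW_NUMBER that prefers non-NULL ProductID values can be checked against the question's table. This is a minimal sketch assuming SQLite (3.25+) in place of SQL Server; the table name `products` and the snake_case column names are made up for the illustration.

```python
import sqlite3

# Assumption: SQLite stands in for SQL Server; names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE products (id INTEGER, product_id TEXT, "
    "product_name TEXT, price INTEGER)"
)
conn.executemany(
    "INSERT INTO products VALUES (?, ?, ?, ?)",
    [
        (1, "XX1", "TShirt", 10),
        (2, "XX1", "TShirt", 10),
        (3, None, "TShirt", 10),   # NULL ProductID
        (4, "XX2", "Shirt", 20),
        (5, "XX3", "Shirt1", 30),
    ],
)

# SUM() OVER totals the price per product name on every row;
# ROW_NUMBER() ranks rows with a non-NULL product_id first, then by id,
# so rn = 1 picks one representative row per product name.
query = """
WITH cte AS (
    SELECT id, product_id, product_name,
           SUM(price) OVER (PARTITION BY product_name) AS total_price,
           ROW_NUMBER() OVER (
               PARTITION BY product_name
               ORDER BY CASE WHEN product_id IS NOT NULL THEN 0 ELSE 1 END, id
           ) AS rn
    FROM products
)
SELECT id, product_id, product_name, total_price
FROM cte
WHERE rn = 1
ORDER BY id
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```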

TSQL Query to fetch row details from a table when the value of a column last changed

I have two tables.
Table 1
Item
1
2
Table 2
Item Date Amount
1 12/31 30
1 12/30 30
1 12/29 20
2 12/31 100
2 12/30 90
2 12/29 90
Now my result should have
Item Date Amount
1 12/29 20
2 12/30 90
3 12/31 12
Basically, I am trying to find the date when the price last changed. I will then use this information to calculate the number of days the item has been at its current price.
Thanks
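;WITH cte
AS (SELECT *,
-- The difference of the two row numbers is constant within a run of equal amounts
Row_number() OVER (PARTITION BY Item ORDER BY Date) -
Row_number() OVER (PARTITION BY Item, Amount ORDER BY Date) AS
grp
FROM table2)
SELECT Item,
MAX(Amount) AS Amount,
MIN(Date) AS startrange,
MAX(Date) AS endrange,
1+DATEDIFF(DAY,MIN(Date),MAX(Date)) AS numdays
FROM cte
-- Amount is part of the key because grp alone can repeat across different amounts
GROUP BY grp,
Item,
Amount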
Returns the following for your test data
Item Amount startrange endrange numdays
----------- ----------- ---------- ---------- -----------
1 20 2010-12-29 2010-12-29 1
1 30 2010-12-30 2010-12-31 2
2 90 2010-12-29 2010-12-30 2
2 100 2010-12-31 2010-12-31 1
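The row-number difference used for the grouping is easier to follow when the intermediate values are printed. The sketch below is an illustration under assumptions: SQLite (3.25+) instead of SQL Server, `julianday` in place of DATEDIFF, a hypothetical column name `price_date`, and the second row number partitioned by both item and amount.

```python
import sqlite3

# Assumption: SQLite stands in for SQL Server; `price_date` is a
# hypothetical column name and the year 2010 follows the result above.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table2 (item INTEGER, price_date TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO table2 VALUES (?, ?, ?)",
    [
        (1, "2010-12-31", 30),
        (1, "2010-12-30", 30),
        (1, "2010-12-29", 20),
        (2, "2010-12-31", 100),
        (2, "2010-12-30", 90),
        (2, "2010-12-29", 90),
    ],
)

# Within a run of consecutive dates with the same amount, both row
# numbers advance together, so their difference (grp) stays constant.
cte = """
WITH cte AS (
    SELECT item, price_date, amount,
           ROW_NUMBER() OVER (PARTITION BY item ORDER BY price_date)
         - ROW_NUMBER() OVER (PARTITION BY item, amount ORDER BY price_date) AS grp
    FROM table2
)
"""

per_row = conn.execute(
    cte + "SELECT item, price_date, amount, grp FROM cte ORDER BY item, price_date"
).fetchall()

# Collapse each run into one row with its start date, end date and length.
islands = conn.execute(cte + """
    SELECT item, amount,
           MIN(price_date) AS startrange,
           MAX(price_date) AS endrange,
           CAST(julianday(MAX(price_date)) - julianday(MIN(price_date)) AS INTEGER) + 1
               AS numdays
    FROM cte
    GROUP BY item, amount, grp
    ORDER BY item, startrange
""").fetchall()

for row in per_row:
    print(row)
for row in islands:
    print(row)
```

The last run per item (the one with the greatest startrange) is the current price, and its startrange is the date the price last changed.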
Try this:
SELECT Item, Date, Amount
FROM
(
SELECT
T2.Item,
T2.Date,
T2.Amount,
ROW_NUMBER() OVER (PARTITION BY T2.Item ORDER BY T2.Date DESC) rn
FROM table2 T2
JOIN
(
SELECT Item, Amount
FROM
(
SELECT
Item,
Amount,
ROW_NUMBER() OVER (PARTITION BY Item ORDER BY Date DESC) rn
FROM table2
) T1
WHERE rn = 1
) T3
ON T2.Item = T3.Item AND T2.Amount <> T3.Amount
) T4
WHERE rn = 1
Result for your example data:
Item Date Amount
1 2010-12-29 20
2 2010-12-30 90
Explanation
The subquery T3 finds the most recent price for each item by using ROW_NUMBER. This is then joined back to the original table and rows where the price of an item is equal to the most recent price are removed. The most recent price of the remaining data is then found for each item, again using the ROW_NUMBER technique. This is the second most recent price.