MAX values over partition by () - sql

I have monthly sales for specific car brands, and for every month I want the top 5 car brands by sales. Then, next to each of these top brands, I want the number (if any) of times that brand was in the top five during the previous 4 months.
For example, if the table data is:
Timestamp | Brand | Sales
1/1/2012 | A | 23
1/1/2012 | B | 45
1/1/2012 | C | 11
1/1/2012 | D | 3
1/1/2012 | E | 55
1/1/2012 | F | 1
1/1/2012 | G | 22
---------------------------
1/2/2012 | A | 93
1/2/2012 | B | 35
1/2/2012 | C | 01
1/2/2012 | D | 100
1/2/2012 | E | 45
1/2/2012 | F | 77
1/2/2012 | G | 12
For two months of data, the query output for February (examining only Feb and Jan) would be:
Max_Brand_Sales | Reappearance_Factor
--------------------------------------
E | 1
B | 1
D | 0
F | 0
A | 1

Select
c.Brand,
nvl(Count(p.Brand), 0) As Reappearance_Factor
From (
Select
Brand,
Rank () Over (Order By Sales Desc) as r
From
Sales
Where
Timestamp = Date '2012-02-01'
) c
left outer join (
Select
Brand,
Rank () Over (Partition By Timestamp Order By Sales Desc) as r
From
Sales
Where
Timestamp >= Date '2011-10-01' And
Timestamp < Date '2012-02-01'
) p
on c.Brand = p.Brand And p.r <= 5
Where
c.r <= 5
Group By
c.Brand
http://sqlfiddle.com/#!4/46770/21

Try this:
1) Query that calculates monthly rank for every brand:
SELECT
s.Brand,
trunc(s.Timestamp,'MONTH') month_start,
rank() OVER (PARTITION BY trunc(s.Timestamp,'MONTH')
ORDER BY s.Sales DESC) as monthly_rank
FROM Sales s;
2) Query that outputs the top 5 brands for current month:
SELECT
t.Brand
FROM
(
SELECT
s.Brand,
trunc(s.Timestamp,'MONTH') month_start,
rank() OVER (PARTITION BY trunc(s.Timestamp,'MONTH')
ORDER BY s.Sales DESC) as monthly_rank
FROM Sales s
) t
WHERE monthly_rank <= 5
AND month_start = trunc(sysdate,'MONTH');
3) Query to calculate "Reappearance" for the past 4 months:
SELECT
t.Brand,
count(*) as top
FROM
(
SELECT
s.Brand,
trunc(s.Timestamp,'MONTH') month_start,
rank() OVER (PARTITION BY trunc(s.Timestamp,'MONTH')
ORDER BY s.Sales DESC) as monthly_rank
FROM Sales s
) t
WHERE monthly_rank <= 5
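-- BETWEEN needs the lower bound first, and month_start holds month starts
AND t.month_start BETWEEN add_months(trunc(sysdate,'MONTH'), -4)
AND add_months(trunc(sysdate,'MONTH'), -1)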
GROUP BY t.Brand;
4) The last thing to do is LEFT JOIN queries 2 and 3, joining on Brand.
SQLFiddle here http://sqlfiddle.com/#!4/46770/65
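To make step 4 concrete, here is a minimal sketch of queries 2 and 3 joined together. It is not the Oracle query from the fiddle: as an assumption for testing, it runs on SQLite through Python's sqlite3 module (window functions need SQLite 3.25 or later), uses date(..., 'start of month') in place of trunc(..., 'MONTH'), and takes a fixed current_month parameter instead of sysdate so the sample data from the question can be used.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Sales (Timestamp TEXT, Brand TEXT, Sales INTEGER)")

# Sample data from the question, with dates rewritten as ISO strings.
january = {"A": 23, "B": 45, "C": 11, "D": 3, "E": 55, "F": 1, "G": 22}
february = {"A": 93, "B": 35, "C": 1, "D": 100, "E": 45, "F": 77, "G": 12}
rows = [("2012-01-01", b, s) for b, s in january.items()]
rows += [("2012-02-01", b, s) for b, s in february.items()]
conn.executemany("INSERT INTO Sales VALUES (?, ?, ?)", rows)

query = """
WITH ranked AS (
    -- step 1: monthly rank for every brand
    SELECT Brand,
           date(Timestamp, 'start of month') AS month_start,
           RANK() OVER (PARTITION BY date(Timestamp, 'start of month')
                        ORDER BY Sales DESC) AS monthly_rank
    FROM Sales
)
SELECT cur.Brand,
       COUNT(prev.Brand) AS Reappearance_Factor  -- counts matched rows only
FROM ranked AS cur
LEFT JOIN ranked AS prev                         -- step 3: previous 4 months
       ON prev.Brand = cur.Brand
      AND prev.monthly_rank <= 5
      AND prev.month_start BETWEEN date(:current_month, '-4 months')
                               AND date(:current_month, '-1 months')
WHERE cur.monthly_rank <= 5                      -- step 2: current top 5
  AND cur.month_start = :current_month
GROUP BY cur.Brand
ORDER BY cur.Brand
"""

result = conn.execute(query, {"current_month": "2012-02-01"}).fetchall()
for brand, factor in result:
    print(brand, factor)
```

Because the join is a LEFT JOIN and COUNT only counts non-NULL values, brands that were never in a previous top 5 show up with 0 instead of disappearing.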

Related

LEFT JOIN match. If no match, need to match on most recent date

My current SQL code:
SELECT
[Date], [Count]
FROM
Calendar_Table pdv
LEFT JOIN
(SELECT
COUNT([FILE NAME]) AS [Count], [CLOSE DT]
FROM
Production_Table
GROUP BY
[CLOSE DT]) [Group] ON [pdv].[Date] = [Group].[CLOSE DT]
ORDER BY
[Date]
Please see the code above. Calendar_Table is a simple table with 1 row for every date. Production_Table gives the products sold each day. If the left join produces a NULL, I want the most recent non-NULL value instead.
Current output:
Date | Count
-----------+--------
9/4/2019 | NULL
9/5/2019 | 1
9/6/2019 | 4
9/7/2019 | NULL
9/8/2019 | 7
9/9/2019 | 11
9/10/2019 | NULL
9/11/2019 | 14
9/12/2019 | NULL
9/13/2019 | 19
Desired output:
Date | Count
-----------+--------
9/4/2019 | 0
9/5/2019 | 1
9/6/2019 | 4
9/7/2019 | 4
9/8/2019 | 7
9/9/2019 | 11
9/10/2019 | 11
9/11/2019 | 14
9/12/2019 | 14
9/13/2019 | 19
One option is a lateral join:
select c.date, p.*
from calendar_table c
outer apply (
select top (1) count(file_name) as cnt, close_dt
from production_table p
where p.close_dt <= c.date
group by p.close_dt
order by p.close_dt desc
) p
As an alternative, we can use an equi-join to bring in the matching dates, as in your original query, and then fill the gaps with window functions. The basic idea is to build groups that reset every time a match is met.
select date, coalesce(max(cnt) over(partition by grp), 0) as cnt
from (
select c.date, p.cnt,
sum(case when p.close_dt is null then 0 else 1 end) over(order by c.date) as grp
from calendar_table c
left join (
select close_dt, count(file_name) as cnt
from production_table p
group by close_dt
) p on p.close_dt = c.date
) t
Depending on your data, one solution or the other may perform better.
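To see how the grouping trick fills the gaps, here is a small runnable sketch on the dates from the question. The assumptions are mine: it uses SQLite via Python's sqlite3 module rather than SQL Server, the calendar column is renamed to cal_date, and the file names are made up just to produce the right number of rows per day.

```python
import sqlite3
from datetime import date, timedelta

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE calendar_table (cal_date TEXT PRIMARY KEY)")
conn.execute("CREATE TABLE production_table (file_name TEXT, close_dt TEXT)")

# One calendar row per day, 2019-09-04 through 2019-09-13.
start = date(2019, 9, 4)
conn.executemany(
    "INSERT INTO calendar_table VALUES (?)",
    [((start + timedelta(days=i)).isoformat(),) for i in range(10)],
)

# Hypothetical production rows that reproduce the counts in the question.
daily_counts = {"2019-09-05": 1, "2019-09-06": 4, "2019-09-08": 7,
                "2019-09-09": 11, "2019-09-11": 14, "2019-09-13": 19}
for close_dt, n in daily_counts.items():
    conn.executemany(
        "INSERT INTO production_table VALUES (?, ?)",
        [(f"{close_dt}-file-{i}", close_dt) for i in range(n)],
    )

query = """
SELECT cal_date,
       COALESCE(MAX(cnt) OVER (PARTITION BY grp), 0) AS filled_cnt
FROM (
    SELECT c.cal_date, p.cnt,
           -- running count of matched days: stays flat on unmatched days,
           -- so each gap shares a group with the last matched day before it
           SUM(CASE WHEN p.close_dt IS NULL THEN 0 ELSE 1 END)
               OVER (ORDER BY c.cal_date) AS grp
    FROM calendar_table c
    LEFT JOIN (
        SELECT close_dt, COUNT(file_name) AS cnt
        FROM production_table
        GROUP BY close_dt
    ) p ON p.close_dt = c.cal_date
) t
ORDER BY cal_date
"""

filled = conn.execute(query).fetchall()
for cal_date, cnt in filled:
    print(cal_date, cnt)
```

Each group contains exactly one non-NULL count (the matched day), so MAX simply picks that value for the whole group, and COALESCE covers the leading days before any match.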

DB2 query to find average sale for each item 1 year previous

Having some trouble figuring out how to write this query.
In general I have a table with
sales_ID
Employee_ID
sale_date
sale_price
What I want is a view that shows, for each sale, how much the employee sold on average during the 1 year before the sale_date.
example: Suppose I have this in the sales table
sales_ID employee_id sale_date sale_price
1 Bob 2016/06/10 100
2 Bob 2016/01/01 75
3 Bob 2014/01/01 475
4 Bob 2015/12/01 100
5 Bob 2016/05/01 200
6 Fred 2016/01/01 30
7 Fred 2015/05/01 50
for sales_id 1 record I want to pull all sales from Bob by 1 year up to the month of the sale (so 2015-05-01 to 2016-05-31 which has 3 sales for 75, 100, 200) so the final output would be
sales_ID employee_id sale_date sale_price avg_sale
1 Bob 2016/06/10 100 125
2 Bob 2016/01/01 75 275
3 Bob 2014/01/01 475 null
4 Bob 2015/12/01 100 475
5 Bob 2016/05/01 200 87.5
6 Fred 2016/01/01 30 50
7 Fred 2015/05/01 50 null
What I tried doing is something like this
select a.sales_ID, a.sale_price, a.employee_ID, a.sale_date, b.avg_price
from sales a
left join (
select employee_id, avg(sale_price) as avg_price
from sales
where sale_date between Date(VARCHAR(YEAR(a.sale_date)-1) ||'-'|| VARCHAR(MONTH(a.sale_date)-1) || '-01')
and Date(VARCHAR(YEAR(a.sale_date)) ||'-'|| VARCHAR(MONTH(a.sale_date)) || '-01') -1 day
group by employee_id
) b on a.employee_id = b.employee_id
DB2 doesn't like the reference to the parent table a inside the subquery, but I can't think of how to write this query properly. Any thoughts?
OK, I think I figured it out. Please note 3 things:
1) I couldn't test it in DB2, so I used Oracle, but the syntax should be more or less the same.
2) I didn't use your 1-year logic exactly. I look back 365 days from each sale_date, but you can change the BETWEEN condition in the WHERE clause of the inner query to match the logic in your question.
3) The expected output you mentioned is incorrect. For every sales_id, I took the sale date, found the employee_id, took all of that employee's sales in the preceding year (excluding the sale date itself), and averaged them. If you want something different, change the WHERE clause in the subquery.
select t1.*,t2.avg_sale
from
sales t1
left join
(
select a.sales_id
,avg(b.sale_price) as avg_sale
from sales a
inner join
sales b
on a.employee_id=b.employee_id
where b.sale_date between a.sale_date - 365 and a.sale_date -1
group by a.sales_id
) t2
on t1.sales_id=t2.sales_id
order by t1.sales_id
Output
+----------+-------------+-------------+------------+----------+
| SALES_ID | EMPLOYEE_ID | SALE_DATE | SALE_PRICE | AVG_SALE |
+----------+-------------+-------------+------------+----------+
| 1 | Bob | 10-JUN-2016 | 100 | 125 |
| 2 | Bob | 01-JAN-2016 | 75 | 100 |
| 3 | Bob | 01-JAN-2014 | 475 | |
| 4 | Bob | 01-DEC-2015 | 100 | |
| 5 | Bob | 01-MAY-2016 | 200 | 87.5 |
| 6 | Fred | 01-JAN-2016 | 30 | 50 |
| 7 | Fred | 01-MAY-2015 | 50 | |
+----------+-------------+-------------+------------+----------+
You can almost fix your original query by doing a LATERAL join. Lateral allows you to reference previously declared tables as in:
select a.sales_ID, a.sale_price, a.employee_ID, a.sale_date, b.avg_price
from sales a
left join LATERAL (
select employee_id, avg(sale_price) as avg_price
from sales
where sale_date between Date(VARCHAR(YEAR(a.sale_date)-1) ||'-'|| VARCHAR(MONTH(a.sale_date)-1) || '-01')
and Date(VARCHAR(YEAR(a.sale_date)) ||'-'|| VARCHAR(MONTH(a.sale_date)) || '-01') -1 day
group by employee_id
) b on a.employee_id = b.employee_id
However, I get a syntax error from your date arithmetic, so using @Utsav's solution for this yields:
select a.sales_ID, a.sale_price, a.employee_ID, a.sale_date, b.avg_price
from sales a
left join lateral (
select employee_id, avg(sale_price) as avg_price
from sales b
where a.employee_id = b.employee_id
and b.sale_date between a.sale_date - 365 and a.sale_date -1
group by employee_id
) b on a.employee_id = b.employee_id
Since we already pushed the predicate inside the LATERAL join, it is strictly speaking not necessary to use the on clause:
select a.sales_ID, a.sale_price, a.employee_ID, a.sale_date, b.avg_price
from sales a
left join lateral (
select employee_id, avg(sale_price) as avg_price
from sales b
where a.employee_id = b.employee_id
and b.sale_date between a.sale_date - 365 and a.sale_date -1
group by employee_id
) b on 1=1
By using a LATERAL join we removed one access against the sales table. A comparison of the plans shows:
No LATERAL Join
Access Plan:
Total Cost: 20,4571
Query Degree: 1

                 Rows
                RETURN
                (   1)
                 Cost
                  I/O
                   |
                   7
                >MSJOIN
                (   2)
                20,4565
                   3
               /---+----\
            7          0,388889
         TBSCAN         FILTER
         (   3)         (   6)
         6,81572        13,6402
            1              2
            |              |
            7           2,72222
          SORT           GRPBY
         (   4)         (   7)
         6,81552        13,6397
            1              2
            |              |
            7           2,72222
         TBSCAN         TBSCAN
         (   5)         (   8)
         6,81488        13,6395
            1              2
            |              |
            7           2,72222
      TABLE: LELLE       SORT
          SALES         (   9)
           Q6           13,6391
                           2
                           |
                        2,72222
                        HSJOIN
                        (  10)
                        13,6385
                           2
                     /-----+------\
                   7                7
                TBSCAN           TBSCAN
                (  11)           (  12)
                6,81488          6,81488
                   1                1
                   |                |
                   7                7
             TABLE: LELLE     TABLE: LELLE
                 SALES            SALES
                  Q2               Q1

LATERAL Join
Access Plan:
Total Cost: 13,6565
Query Degree: 1

                 Rows
                RETURN
                (   1)
                 Cost
                  I/O
                   |
                   7
               >^NLJOIN
                (   2)
                13,6559
                   2
               /---+----\
            7            0,35
         TBSCAN          GRPBY
         (   3)         (   4)
         6,81488        6,81662
            1              1
            |              |
            7            0,35
      TABLE: LELLE      TBSCAN
          SALES         (   5)
           Q5           6,81656
                           1
                           |
                           7
                     TABLE: LELLE
                         SALES
                          Q1

Window functions with framing
DB2 does not yet support range frames over dates, but by using a clever trick by @mustaccio in:
https://dba.stackexchange.com/questions/141263/what-is-the-meaning-of-order-by-x-range-between-n-preceding-if-x-is-a-dat
we can actually use only one table access and solve the problem:
select a.sales_ID, a.sale_price, a.employee_ID, a.sale_date
, avg(sale_price) over (partition by employee_id
order by julian_day(a.sale_date)
range between 365 preceding
and 1 preceding
) as avg_price
from sales a
Access Plan:
Total Cost: 6.8197
Query Degree: 1
Rows
RETURN
( 1)
Cost
I/O
|
7
TBSCAN
( 2)
6.81753
1
|
7
SORT
( 3)
6.81703
1
|
7
TBSCAN
( 4)
6.81488
1
|
7
TABLE: LELLE
SALES
Q1
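For anyone who wants to try the framing idea without a DB2 instance, here is a minimal sketch on the sample rows from the question. It is an approximation under my own assumptions: it runs on SQLite through Python's sqlite3 module (numeric RANGE offsets need SQLite 3.28 or later) and uses SQLite's julianday() where the DB2 query uses julian_day().

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales (sales_id INTEGER, employee_id TEXT, "
    "sale_date TEXT, sale_price REAL)"
)
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?, ?)",
    [
        (1, "Bob", "2016-06-10", 100),
        (2, "Bob", "2016-01-01", 75),
        (3, "Bob", "2014-01-01", 475),
        (4, "Bob", "2015-12-01", 100),
        (5, "Bob", "2016-05-01", 200),
        (6, "Fred", "2016-01-01", 30),
        (7, "Fred", "2015-05-01", 50),
    ],
)

query = """
SELECT sales_id, employee_id, sale_date, sale_price,
       AVG(sale_price) OVER (
           PARTITION BY employee_id
           -- ordering by a day number turns "365 PRECEDING" into "365 days earlier"
           ORDER BY julianday(sale_date)
           RANGE BETWEEN 365 PRECEDING AND 1 PRECEDING
       ) AS avg_price
FROM sales
ORDER BY sales_id
"""

averages = conn.execute(query).fetchall()
for row in averages:
    print(row)
```

The frame ends at 1 PRECEDING, so the current sale and any other sale on the same day are excluded, which matches the "between sale_date - 365 and sale_date - 1" predicate in the join-based versions. Rows with no earlier sale in the window get NULL.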

Count occurrence of specific code per customer in 6 month period

I have a table that contains the following:
customerid | date (dmy) | productid
John | 1-3-14 | A
John | 7-5-14 | Y
John | 8-5-14 | Y
John | 1-10-15 | B
John | 1-11-15 | Y
Pete | 1-7-15 | Y
I need to find out how often customer X has bought Product Y in a six-month period.
The start of a period is defined as the first time a customer has bought one of the products A,B, C or Y. The endtime of a period is exactly six months after that.
The next period starts when the customer buys again one of the products A,B,C or Y.
So the output should be
customerid | period-start | period-end | countofY
John | 1-3-14 | 8-5-14 | 2
John | 1-10-15 | 1-11-15 | 1
Pete | 1-7-15 | 1-7-15 | 1
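Since each period's start depends on where the previous one ended, the rule is easiest to pin down procedurally before translating it to SQL. The following is only a sketch of the rule as I read it, written in Python: the function names are hypothetical, a purchase made exactly six months after the period start is assumed to still belong to that period, and the reported period end is the last purchase inside the window, as in the expected output above.

```python
import calendar
from datetime import date

PERIOD_PRODUCTS = {"A", "B", "C", "Y"}  # products that can open a period


def add_months(d, months):
    """Return d shifted by a number of months, clamping the day if needed."""
    month_index = d.month - 1 + months
    year = d.year + month_index // 12
    month = month_index % 12 + 1
    day = min(d.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


def count_y_per_period(purchases):
    """purchases: iterable of (customerid, purchase_date, productid) tuples."""
    periods = []       # each entry: [customerid, start, last_purchase, count_of_y]
    window_end = {}    # customerid -> last day of that customer's open period
    # Sorting groups rows by customer and orders them by date within a customer.
    for customer, day, product in sorted(purchases):
        if product not in PERIOD_PRODUCTS:
            continue
        if customer not in window_end or day > window_end[customer]:
            # No open period, or the six months have passed: start a new one.
            window_end[customer] = add_months(day, 6)
            periods.append([customer, day, day, 0])
        current = periods[-1]  # always this customer's most recent period
        current[2] = day
        if product == "Y":
            current[3] += 1
    return [tuple(p) for p in periods]


purchases = [
    ("John", date(2014, 3, 1), "A"),
    ("John", date(2014, 5, 7), "Y"),
    ("John", date(2014, 5, 8), "Y"),
    ("John", date(2015, 10, 1), "B"),
    ("John", date(2015, 11, 1), "Y"),
    ("Pete", date(2015, 7, 1), "Y"),
]

periods = count_y_per_period(purchases)
for period in periods:
    print(period)
```

Any SQL solution has to reproduce this sequential dependency, which is why a plain GROUP BY on the earliest purchase date only finds the first period per customer.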
SELECT c.Customerid, MIN(c.pdate) AS startperiod, c1.endperiod,
(
SELECT COUNT(temp.productid) FROM Customer temp
WHERE temp.Customerid = c.Customerid
AND temp.pdate >= MIN(c.pdate)
AND temp.pdate <= c1.endperiod
GROUP BY temp.productid HAVING temp.productid ='Y'
)AS countOfY
FROM Customer c
CROSS APPLY
(
SELECT TOP 1 c1.pdate AS endperiod
FROM Customer c1
WHERE c1.Customerid = c.Customerid
AND c1.pdate >= c.pdate
AND
(
DATEDIFF(MONTH, c.pdate, c1.pdate) < 6
OR
(
SELECT TOP 1 t.pdate FROM Customer t
WHERE t.Customerid = c.Customerid
AND t.pdate < c1.pdate
) IS NULL
)
ORDER BY c1.pdate DESC
)AS c1 GROUP BY c1.endperiod, c.Customerid
;WITH CTE_DateRanges AS (
SELECT
customerid,
productid,
MIN(purchase_date) AS period_start,
DATEADD(MM, 6, MIN(purchase_date)) AS period_end
FROM
My_Table
GROUP BY
customerid,
productid
)
SELECT
DR.customerid,
DR.productid,
DR.period_start,
DR.period_end,
COUNT(*)
FROM
CTE_DateRanges DR
INNER JOIN My_Table MT ON
MT.customerid = DR.customerid AND
MT.productid = DR.productid AND
MT.purchase_date BETWEEN DR.period_start AND DR.period_end
GROUP BY
DR.customerid,
DR.productid,
DR.period_start,
DR.period_end

Select row that has max total value SQL Server

I have the following scheme (2 tables):
Customer (Id, Name) and
Sale (Id, CustomerId, Date, Sum)
How to select the following data ?
1) Best customer of all time (Customer, which has Max Total value in the Sum column)
For example, I have 2 tables (Customers and Sales respectively):
id CustomerName
---|--------------
1 | First
2 | Second
3 | Third
id CustomerId datetime Sum
---|----------|------------|-----
1 | 1 | 04/06/2013 | 50
2 | 2 | 04/06/2013 | 60
3 | 3 | 04/07/2013 | 30
4 | 1 | 03/07/2013 | 50
5 | 1 | 03/08/2013 | 50
6 | 2 | 03/08/2013 | 30
7 | 3 | 24/09/2013 | 20
Desired result:
CustomerName TotalSum
------------|--------
First | 150
2) Best customer of each month in the current year (the same as previous but for each month in the current year)
Thanks.
Try this for the best customer of all time:
SELECT Top 1 WITH TIES c.CustomerName, SUM(s.SUM) AS TotalSum
FROM Customer c JOIN Sales s ON s.CustomerId = c.Id
GROUP BY c.Id, c.CustomerName
ORDER BY SUM(s.SUM) DESC
One option is to use RANK() combined with the SUM aggregate. This will get you the overall values.
select customername, sumtotal
from (
select c.customername,
sum(s.sum) sumtotal,
rank() over (order by sum(s.sum) desc) rnk
from customer c
join sales s on c.id = s.customerid
group by c.id, c.customername
) t
where rnk = 1
Grouping this by month and year should be trivial at that point.
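For the per-month part, a minimal sketch of the same RANK() idea partitioned by month might look like the following. The assumptions here are mine: it runs on SQLite via Python's sqlite3 module (SQLite 3.25 or later) rather than SQL Server, the columns are renamed to sale_date and amount to avoid clashing with function names, the dates are stored as ISO strings, and the year is passed in as a parameter instead of being taken from the current date.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, customername TEXT)")
conn.execute(
    "CREATE TABLE sales (id INTEGER PRIMARY KEY, customerid INTEGER, "
    "sale_date TEXT, amount INTEGER)"
)
conn.executemany(
    "INSERT INTO customer VALUES (?, ?)",
    [(1, "First"), (2, "Second"), (3, "Third")],
)
# Sample rows from the question (dd/mm/yyyy rewritten as yyyy-mm-dd).
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?, ?)",
    [
        (1, 1, "2013-06-04", 50),
        (2, 2, "2013-06-04", 60),
        (3, 3, "2013-07-04", 30),
        (4, 1, "2013-07-03", 50),
        (5, 1, "2013-08-03", 50),
        (6, 2, "2013-08-03", 30),
        (7, 3, "2013-09-24", 20),
    ],
)

query = """
WITH monthly AS (
    -- total per customer per month, restricted to the requested year
    SELECT strftime('%Y-%m', s.sale_date) AS sale_month,
           c.id,
           c.customername,
           SUM(s.amount) AS sumtotal
    FROM customer c
    JOIN sales s ON c.id = s.customerid
    WHERE strftime('%Y', s.sale_date) = :year
    GROUP BY strftime('%Y-%m', s.sale_date), c.id, c.customername
),
ranked AS (
    -- rank restarts for every month; ties share rank 1
    SELECT sale_month, customername, sumtotal,
           RANK() OVER (PARTITION BY sale_month ORDER BY sumtotal DESC) AS rnk
    FROM monthly
)
SELECT sale_month, customername, sumtotal
FROM ranked
WHERE rnk = 1
ORDER BY sale_month
"""

best_per_month = conn.execute(query, {"year": "2013"}).fetchall()
for row in best_per_month:
    print(row)
```

As with TOP 1 WITH TIES, RANK() returns every customer tied for first place in a month; ROW_NUMBER() would be the choice if exactly one row per month is required.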

Sql: Calc average times a customers ordered a product in a period

How would you calculate how many times a product is sold on average per week, month, or year?
I'm not interested in the Amount, but in how many times a customer has bought a given product.
OrderLine
OrderNo | ProductNo | Amount |
----------------------------------------
1 | 1 | 10 |
1 | 4 | 2 |
2 | 1 | 2 |
3 | 1 | 4 |
Order
OrderNo | OrderDate
----------------------------------------
1 | 2012-02-21
2 | 2012-02-22
3 | 2012-02-25
This is the output I'm looking for
ProductNo | Average Orders a Week | Average Orders a month |
------------------------------------------------------------
1 | 3 | 12 |
2 | 5 | 20 |
You would first have to pre-query the data, grouped and counted for each averaging period you want. To distinguish between year 1 and year 2, I would add year() of the transaction to the grouping qualifier for distinctness. That way sales in Jan 2010, Jan 2011 and Jan 2012 count as three separate months, and week 1 of 2010, 2011 and 2012 count as three separate weeks instead of all 3 years being counted as a single week.
The following could be done if you are using MySQL
select
PreCount.ProductNo,
PreCount.TotalCount / PreCount.CountOfYrWeeks as AvgPerWeek,
PreCount.TotalCount / PreCount.CountOfYrMonths as AvgPerMonth,
PreCount.TotalCount / PreCount.CountOfYears as AvgPerYear
from
( select
OL.ProductNo,
count(*) TotalCount,
count( distinct YEARWEEK( O.OrderDate ) ) as CountOfYrWeeks,
count( distinct Date_Format( O.OrderDate, "%Y%M" )) as CountOfYrMonths,
count( distinct Year( O.OrderDate )) as CountOfYears
from
OrderLine OL
JOIN `Order` O
on OL.OrderNo = O.OrderNo
group by
OL.ProductNo ) PreCount
This is a copy of DRapp's answer, but coded for SQL Server (it's too big for a comment!)
SELECT PreCount.ProductNo,
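1.0 * PreCount.TotalCount / PreCount.CountOfYrWeeks AS AvgPerWeek,
1.0 * PreCount.TotalCount / PreCount.CountOfYrMonths AS AvgPerMonth,
1.0 * PreCount.TotalCount / PreCount.CountOfYears AS AvgPerYear
FROM (SELECT OL.ProductNo,
Count(*) TotalCount,
-- combine the year with the week/month number so the same week or month
-- in different years is counted separately (1.0 * above avoids integer division)
Count(DISTINCT Year(O.OrderDate) * 100 + Datepart(wk, O.OrderDate)) AS CountOfYrWeeks,
Count(DISTINCT Year(O.OrderDate) * 100 + Datepart(mm, O.OrderDate)) AS CountOfYrMonths,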
Count(DISTINCT Year(O.OrderDate)) AS CountOfYears
FROM OrderLine OL JOIN [Order] O
ON OL.OrderNo = O.OrderNo
GROUP BY OL.ProductNo) PreCount