Summing and then getting min and max of top 70% in SQL - sql

I have data for the purchases of a product formatted like this:
Item | Price | Quantity Bought
ABC 10.10 4
DEF 8.30 12
DEF 7.75 8
ABC 10.50 20
GHI 15.4 1
GHI 15.2 12
ABC 10.25 8
... ... ...
Where each row represents an individual purchasing a certain amount at a certain price. I would like to aggregate this data and eliminate the prices below the 30th percentile for total quantity bought from my table.
For example, in the above data set the total amount of product ABC bought was (4+20+8) = 32 units, with average price = (4*10.10 + 8*10.25 + 20*10.50)/32 = 10.39.
I would like to organize the above data set like this:
Item | VWP | Total Vol | 70th %ile min | 70th %ile max
ABC 10.39 32 ??? ???
DEF ... 20 ??? ???
GHI ... 13 ??? ???
Where VWP is the volume weighted price, and the 70th %ile min/max represent the minimum and maximum prices within the top 70% of volume.
In other words, I want to eliminate the prices with the lowest volumes until I have 70% of the total volume for the day contained in the remaining prices. I would then like to publish the min and max price for the ones that are left in the 70th %ile min/max columns.
I tried to be as clear as possible, but if this is tough to follow along with please let me know which parts need clarification.
Note: These are not the only columns contained in my dataset, and I will be selecting and calculating other values as well. I only included the columns that are relevant to this specific calculation.
EDIT:
Here is my code so far, and I need to incorporate my calculation into this (the variables with the '#' symbol before them are inputs that are given by the user:
SELECT Item,
SUM(quantity) AS Total_Vol,
DATEADD(day, -#DateOffset, CONVERT(date, GETDATE())) AS buyDate,
MIN(Price) AS MinPrice,
MAX(Price) AS MaxPrice,
MAX(Price) - MIN(Price) AS PriceRange,
ROUND(SUM(Price * quantity)/SUM(quantity), 6) AS VWP,
FROM TransactTracker..CustData
-- #DateOffset (Number of days data is offset by)
-- #StartTime (Time to start data in hours)
-- #EndTime (Time to stop data in hours)
WHERE DATEDIFF(day, TradeDateTime, GETDATE()) = (#DateOffset+1)
AND DATEPART(hh, TradeDateTime) >= #StartTime
AND HitTake = ''
OR DATEDIFF(day, TradeDateTime, GETDATE()) = #DateOffset
AND DATEPART(hh, TradeDateTime) < #EndTime
AND HitTake = ''
GROUP BY Item
EDIT 2:
FROM (SELECT p.*,
(SELECT SUM(quantity) from TransactTracker..CustData p2
where p2.Series = p.Series and p2.Size >= p.Size) as volCum
FROM TransactTracker..CustData p
) p
EDIT 3:
(case when CAST(qcum AS FLOAT) / SUM(quantity) <= 0.7 THEN MIN(Price) END) AS min70px,
(case when CAST(qcum AS FLOAT) / SUM(quantity) <= 0.7 THEN MAX(Price) END) AS max70px
FROM (select p.*,
(select SUM(quantity) from TransactTracker..CustData p2
where p2.Item = p.Item and p2.quantity >= p.quantity)
as qcum from TransactTracker..CustData p) cd

There is some ambiguity on how you define 70 % when something goes over the threshold. However, the challenge is two fold. After identifying the cumulative proportion, the query also needs to choose the appropriate row. This suggests using row_number() for selection.
This solution using SQL Server 2012 syntax calculates the cumulative sum. It then takes assigns a sequential value based on how close the ratio is to 70%.
select item,
SUM(price * quantity) / SUM(quantity) as vwp,
SUM(quantity) as total_vol,
min(case when seqnum = 1 then price end) as min70price,
max(case when seqnum = 1 then price end) as max70price
from (select p.*,
ROW_NUMBER() over (partition by item order by abs(0.7 - qcum/qtot) as seqnum
from (select p.*,
SUM(quantity) over (partition by item order by vol desc) as qcum,
SUM(quantity) over (partition by item) as qtot
from purchases p
) p
) p
group by item;
To get the largest value less than 70%, then you would use:
max(case when qcum < qtot*0.7 then qcum end) over (partition by item) as lastqcum
And then the case statements in the outer select would be:
min(case when lastqcum = qcum then price end) . .
In earlier versions of SQL Server, you can get the same effect with the correlated subquery:
select item,
SUM(price * quantity) / SUM(quantity) as vwp,
SUM(quantity) as total_vol,
min(case when seqnum = 1 then price end) as min70price,
max(case when seqnum = 1 then price end) as max70price
from (select p.*,
ROW_NUMBER() over (partition by item order by abs(0.7 - qcum/qtot) as seqnum
from (select p.*,
(select SUM(quantity) from purchases p2 where p2.item = p.item and p2.quantity >= p.quantity
) as qsum,
SUM(quantity) over (partition by item) as qtot
from purchases p
) p
) p
group by item
Here is the example with your code:
SELECT Item,
SUM(quantity) AS Total_Vol,
DATEADD(day, -#DateOffset, CONVERT(date, GETDATE())) AS buyDate,
MIN(Price) AS MinPrice,
MAX(Price) AS MaxPrice,
MAX(Price) - MIN(Price) AS PriceRange,
ROUND(SUM(Price * quantity)/SUM(quantity), 6) AS VWP,
min(case when seqnum = 1 then price end) as min70price,
max(case when seqnum = 1 then price end) as max70price
from (select p.*,
ROW_NUMBER() over (partition by item order by abs(0.7 - qcum/qtot) as seqnum
from (select p.*,
(select SUM(quantity) from TransactTracker..CustData p2 where p2.item = p.item and p2.quantity >= p.quantity
) as qsum,
SUM(quantity) over (partition by item) as qtot
from purchases TransactTracker..CustData
) p
) cd
-- #DateOffset (Number of days data is offset by)
-- #StartTime (Time to start data in hours)
-- #EndTime (Time to stop data in hours)
WHERE DATEDIFF(day, TradeDateTime, GETDATE()) = (#DateOffset+1)
AND DATEPART(hh, TradeDateTime) >= #StartTime
AND HitTake = ''
OR DATEDIFF(day, TradeDateTime, GETDATE()) = #DateOffset
AND DATEPART(hh, TradeDateTime) < #EndTime
AND HitTake = ''
GROUP BY Item

Related

How to Nest query with different criteria

I have a Sales_details table where I like to get a report of the top 150 products and the top 10 customers of each product. The code I have below does just that and is working perfectly. However, it is using the same date range for both. How do I modify this so that the top 150 products is based on a 10 years history while the top 10 customers is based on 2 years history?
select pc.*
from (select pc.*,
dense_rank() over (order by product_sales desc, product_id) as product_rank
from (select sd.product_id, sd.custno, sum(sd.sales$) as total_sales,
row_number() over (partition by sd.product_id order by sum(sd.sales$) as cust_within_product_rank,
sum(sum(sd.sales$)) over (partition by sd.product_id) as product_sales
from salesdetails sd
group by sd.product_id, sd.custno
) pc
) pc
where product_rank <= 150 and cust_within_product_rank <= 10;
You can use conditional aggregation:
select pc.*
from (select pc.*,
dense_rank() over (order by product_sales desc, product_id) as product_rank
from (select sd.product_id, sd.custno, sum(sd.sales$) as total_sales,
row_number() over (partition by sd.product_id
order by sum(case when date > dateadd(year, -2, getdate()) then sd.sales$ else 0 end)
) as cust_within_product_rank,
sum(sum(case when date > dateadd(year, -10, getdate()) then sd.sales$ else 0 end)) over (partition by sd.product_id) as product_sales
from salesdetails sd
group by sd.product_id, sd.custno
) pc
) pc
where product_rank <= 150 and cust_within_product_rank <= 10;
I'm not sure what column you use for date, so I just called it date.

Query to get top product gainers by sales over previous week

I have a database table with three columns.
WeekNumber, ProductName, SalesCount
Sample data is shown in below table. I want top 10 gainers(by %) for week 26 over previous week i.e. week 25. The only condition is that the product should have sales count greater than 0 in both the weeks.
In the sample data B,C,D are the common products and C has the highest % gain.
Similarly, I will need top 10 losers also.
What I have tried till now is to make a inner join and get common products between two weeks. However, I am not able to get the top gainers logic.
The output should be like
Product PercentGain
C 400%
D 12.5%
B 10%
This will give you a generic answer, not just for any particular week:
select top 10 product , gain [gain%]
from
(
SELECT product, ((curr.salescount-prev.salescount)/prev.salescount)*100 gain
from
(select weeknumber, product, salescount from tbl) prev
JOIN
(select weeknumber, product, salescount from tbl) curr
on prev.weeknumber = curr.weeknumber - 1
AND prev.product = curr.product
where prev.salescount > 0 and curr.salescount > 0
)A
order by gain desc
If you are interested in weeks 25 and 26, then just add the condition below in the WHERE clause:
and prev.weeknumber = 25
If you are using SQL-Server 2012 (or newer), you could use the lag function to match "this" weeks sales with the previous week's. From there on, it's just some math:
SELECT TOP 10 product, sales/prev_sales - 1 AS gain
FROM (SELECT product,
sales,
LAG(sales) OVER (PARTITION BY product
ORDER BY weeknumber) AS prev_sales
FROM mytable) t
WHERE weeknumber = 26 AND
sales > 0 AND
prev_sales > 0 AND
sales > prev_sales
ORDER BY sales/prev_sales
this is the Query .
select top 10 product , gain [gain%]
from
(
SELECT curr.Product, ( (curr.Sales - prev.Sales ) *100)/prev.Sales gain
from
(select weeknumber, product, sales from ProductInfo where weeknumber = 25 ) prev
JOIN
(select weeknumber, product, sales from ProductInfo where weeknumber = 26 ) curr
on prev.product = curr.product
where prev.Sales > 0 and curr.Sales > 0
)A
order by gain desc

SQL Deduct value from multiple rows

I would like to apply total $10.00 discount for each customers.The discount should be applied to multiple transactions until all $10.00 used.
Example:
CustomerID Transaction Amount Discount TransactionID
1 $8.00 $8.00 1
1 $6.00 $2.00 2
1 $5.00 $0.00 3
1 $1.00 $0.00 4
2 $5.00 $5.00 5
2 $2.00 $2.00 6
2 $2.00 $2.00 7
3 $45.00 $10.00 8
3 $6.00 $0.00 9
The query below keeps track of the running sum and calculates the discount depending on whether the running sum is greater than or less than the discount amount.
select
customerid, transaction_amount, transactionid,
(case when 10 > (sum_amount - transaction_amount)
then (case when transaction_amount >= 10 - (sum_amount - transaction_amount)
then 10 - (sum_amount - transaction_amount)
else transaction_amount end)
else 0 end) discount
from (
select customerid, transaction_amount, transactionid,
sum(transaction_amount) over (partition by customerid order by transactionid) sum_amount
from Table1
) t1 order by customerid, transactionid
http://sqlfiddle.com/#!6/552c2/7
same query with a self join which should work on most db's including mssql 2008
select
customerid, transaction_amount, transactionid,
(case when 10 > (sum_amount - transaction_amount)
then (case when transaction_amount >= 10 - (sum_amount - transaction_amount)
then 10 - (sum_amount - transaction_amount)
else transaction_amount end)
else 0 end) discount
from (
select t1.customerid, t1.transaction_amount, t1.transactionid,
sum(t2.transaction_amount) sum_amount
from Table1 t1
join Table1 t2 on t1.customerid = t2.customerid
and t1.transactionid >= t2.transactionid
group by t1.customerid, t1.transaction_amount, t1.transactionid
) t1 order by customerid, transactionid
http://sqlfiddle.com/#!3/552c2/2
You can do this with recursive common table expressions, although it isn't particularly pretty. SQL Server stuggles to optimize these types of query. See Sum of minutes between multiple date ranges for some discussion.
If you wanted to go further with this approach, you'd probably need to make a temporary table of x, so you can index it on (customerid, rn)
;with x as (
select
tx.*,
row_number() over (
partition by customerid
order by transaction_amount desc, transactionid
) rn
from
tx
), y as (
select
x.transactionid,
x.customerid,
x.transaction_amount,
case
when 10 >= x.transaction_amount then x.transaction_amount
else 10
end as discount,
case
when 10 >= x.transaction_amount then 10 - x.transaction_amount
else 0
end as remainder,
x.rn as rn
from
x
where
rn = 1
union all
select
x.transactionid,
x.customerid,
x.transaction_amount,
case
when y.remainder >= x.transaction_amount then x.transaction_amount
else y.remainder
end,
case
when y.remainder >= x.transaction_amount then y.remainder - x.transaction_amount
else 0
end,
x.rn
from
y
inner join
x
on y.rn = x.rn - 1 and y.customerid = x.customerid
where
y.remainder > 0
)
update
tx
set
discount = y.discount
from
tx
inner join
y
on tx.transactionid = y.transactionid;
Example SQLFiddle
I usually like to setup a test environment for such questions. I will use a local temporary table. Please note, I made the data un-ordered since it is not guaranteed in a real life.
-- play table
if exists (select 1 from tempdb.sys.tables where name like '%transactions%')
drop table #transactions
go
-- play table
create table #transactions
(
trans_id int identity(1,1) primary key,
customer_id int,
trans_amt smallmoney
)
go
-- add data
insert into #transactions
values
(1,$8.00),
(2,$5.00),
(3,$45.00),
(1,$6.00),
(2,$2.00),
(1,$5.00),
(2,$2.00),
(1,$1.00),
(3,$6.00);
go
I am going to give you two answers.
First, in 2014 there are new windows functions for rows preceding. This allows us to get a running total (rt) and a rt adjusted by one entry. Give these two values, we can determine if the maximum discount has been exceeded or not.
-- Two running totals for 2014
;
with cte_running_total
as
(
select
*,
SUM(trans_amt)
OVER (PARTITION BY customer_id
ORDER BY trans_id
ROWS BETWEEN UNBOUNDED PRECEDING AND
0 PRECEDING) as running_tot_p0,
SUM(trans_amt)
OVER (PARTITION BY customer_id
ORDER BY trans_id
ROWS BETWEEN UNBOUNDED PRECEDING AND
1 PRECEDING) as running_tot_p1
from
#transactions
)
select
*
,
case
when coalesce(running_tot_p1, 0) <= 10 and running_tot_p0 <= 10 then
trans_amt
when coalesce(running_tot_p1, 0) <= 10 and running_tot_p0 > 10 then
10 - coalesce(running_tot_p1, 0)
else 0
end as discount_amt
from cte_running_total;
Again, the above version is using a common table expression and advanced windowing to get the totals.
Do not fret! The same can be done all the way down to SQL 2000.
Second solution, I am just going to use the order by, sub-queries, and a temporary table to store the information that is normally in the CTE. You can switch the temporary table for a CTE in SQL 2008 if you want.
-- w/o any fancy functions - save to temp table
select *,
(
select count(*) from #transactions i
where i.customer_id = o.customer_id
and i.trans_id <= o.trans_id
) as sys_rn,
(
select sum(trans_amt) from #transactions i
where i.customer_id = o.customer_id
and i.trans_id <= o.trans_id
) as sys_tot_p0,
(
select sum(trans_amt) from #transactions i
where i.customer_id = o.customer_id
and i.trans_id < o.trans_id
) as sys_tot_p1
into #results
from #transactions o
order by customer_id, trans_id
go
-- report off temp table
select
trans_id,
customer_id,
trans_amt,
case
when coalesce(sys_tot_p1, 0) <= 10 and sys_tot_p0 <= 10 then
trans_amt
when coalesce(sys_tot_p1, 0) <= 10 and sys_tot_p0 > 10 then
10 - coalesce(sys_tot_p1, 0)
else 0
end as discount_amt
from #results
order by customer_id, trans_id
go
In short, your answer is show in the following screen shot. Cut and paste the code into SSMS and have some fun.

SQL query for Pricing analysis

Facing issue to find the Min and Max pricing status on the column YearMonth,
Below is my table data
YearMonth STATE ProductGroup LocaProdname Price
201407 MH AIRTEL AIRTEL-3G 10,000
201208 GJ IDEA IDEA-3G 1,200
201406 WB AIRCEL AIRCEL PERPAID 5,866
201407 DL TATA DOCOMA TATA LANDLINE 8,955
201207 KAR VODAFONE VODAFONE-3G 7,899
201312 MH AIRTEL AIRTEL-3G 15,000
201408 GJ IDEA IDEA-3G 25,000
I require below output:
YearMonth STATE ProductGroup LocaProdname Price Indictor-YEAR
201407 MH AIRTEL AIRTEL-3G 10,000 MAX
201312 MH AIRTEL AIRTEL-3G 15,000 MIN
201408 GJ IDEA IDEA-3G 25,000 MAX
201208 GJ IDEA IDEA-3G 1,200 MIN
I need the Max yearmonth and min Year values values.
If I understand correctly, you can do this with row_number():
select YearMonth, STATE, ProductGroup, LocaProdname, Price,
(case when seqnum_asc = 1 then 'MIN' else 'MAX' end) as Indicator
from (select d.*,
row_number() over (partition by state, productgroup, localprodname
order by price asc) as seqnum_asc,
row_number() over (partition by state, productgroup, localprodname
order by pricedesc) as seqnum_desc
from data
) d
where seqnum_asc = 1 or seqnum_desc = 1;
EDIT:
Does this do what you want?
select YearMonth, STATE, ProductGroup, LocaProdname, Price,
(case when seqnum_asc = 1 then 'MIN' else 'MAX' end) as Indicator
from (select d.*,
row_number() over (partition by YearMonth
order by price asc) as seqnum_asc,
row_number() over (partition by YearMOnth
order by pricedesc) as seqnum_desc
from data
) d
where seqnum_asc = 1 or seqnum_desc = 1;
Please use Row_number with partition BY and remove unwanted code as per your need,
SELECT yearmonth,state,productgroup,locaprodname,price,operation
FROM (
SELECT * FROM (SELECT p.yearmonth,p.state,p.productgroup,p.locaprodname,p.price,'MAX' AS Operation,
Row_number() OVER( partition BY p.productgroup, p.locaprodname
ORDER BY p.price DESC) AS Row
FROM pricingtest p) AS Maxx
WHERE Maxx.row = 1
UNION ALL
SELECT * FROM (SELECT p.yearmonth,p.state,p.productgroup,p.locaprodname,p.price,'MIN' AS Operation,
Row_number() OVER( partition BY p.productgroup, p.locaprodname
ORDER BY p.price ASC) AS Row
FROM pricingtest p) AS Minn
WHERE Minn.row = 1
) AS whole
ORDER BY yearmonth,productgroup
This can be done by finding the MAX/MIN values associated with the LocaProdname,ProductGroup and State then joining in on the table where everything matches. See below, or view the fiddle at http://sqlfiddle.com/#!3/4d6bd/2
NOTE: I've added in HAVING COUNT(*) > 1 as you seem to only want ones which have changed price. (Ie. Have more than 1 entry)
SELECT p.YearMonth
,p.State
,p.ProductGroup
,p.LocaProdname
,p.Price
,CASE
WHEN p.Price = a.MaxPrice
THEN 'MAX'
WHEN p.Price = a.MinPrice
THEN 'MIN'
END AS [Indicator-YEAR]
FROM PricingTest p
INNER JOIN (
SELECT LocaProdname
,ProductGroup
,State
,MAX(Price) AS MaxPrice
,MIN(Price) AS MinPrice
FROM pricingTest
GROUP BY LocaProdname
,ProductGroup
,State
HAVING COUNT(*) > 1
) a ON a.LocaProdname = p.LocaProdname
AND a.ProductGroup = p.ProductGroup
AND a.State= p.State
AND (
a.MaxPrice = p.Price
OR a.MinPrice = p.Price
)
ORDER BY LocaProdname
EDIT: Or I just noticed it's the max/min YearMonth the user might be looking, if this is the case check out http://sqlfiddle.com/#!3/4d6bd/4 It is basically just replacing all references to Price to YearMonth.
Once you get the last and first record you can UNION results:
SELECT t.*, 'MIN' AS Indicator
FROM
myTable t LEFT JOIN
myTable t2 ON t.YearMonth = t2.YearMonth AND t2.price < t.price
WHERE t2.YearMonth IS NULL
UNION
SELECT t.*, 'MAX' AS Indicator
FROM
myTable t LEFT JOIN
myTable t2 ON t.YearMonth = t2.YearMonth AND t2.price > t.price
WHERE t2.YearMonth IS NULL
If you have several records with same highest price, above query will return all of them. Also if you only have one record in a month, it will be returned twice as both MIN and MAX.

Calculate the qty of items that were purchased more than once by a customer - but need to exclude the first order qty

SELECT TOP 100 PERCENT soheader.custid,SOHeader.OrdNbr, SOLine.InvtID, SOLine.Descr,SOLine.QtyOrd
FROM SOHeader INNER JOIN
SOLine ON SOHeader.OrdNbr = SOLine.OrdNbr
WHERE (SOHeader.OrdDate >= CONVERT(DATETIME, '2013-06-01 00:00:00', 102)) AND (SOHeader.OrdDate <= GETDATE()) AND (SOHeader.CustID = '69065')
ORDER BY SOLine.InvtID, SOHeader.OrdNbr
here is my sample data
69065 WO0175279 69407 Jazzy Laces White 3
69065 WO0175393 69407 Jazzy Laces White 6
69065 WO0175393 69407 Jazzy Laces White 9
Now I want to know how to get the total qty of this item ordered after the first order. I do not want to include the qty of 3 in the first record above. I just want to include the qty of 6 in the first reorder and the qty 9 of the second reorder which equals the qty of 15.
69065 is the customer ID
WO##### is the order ID
69407 is the inventory ID
SELECT invtId, SUM(QtyOrd)
FROM (
SELECT invtId,
qtyOrd,
ROW_NUMBER() OVER (PARTITION BY invtId ORDER BY h.ordDate, h.ordNbr) rn
FROM soLine l
JOIN soHeader h
ON h.ordNbr = l.ordNbr
WHERE l.custId = 69065
) q
WHERE rn > 1
GROUP BY
invtId
WITH cl
as
(select *, ROW_NUMBER() OVER (PARTITION BY InvtID ORDER BY QtyOrd) rn
from t)
select InvtID, QtyOrd,
(select SUM(QtyOrd) from cl oo
where o.InvtID = oo.InvtID and o.rn -1 > 0
and rn between o.rn -1 and o.rn) as 'sm'
from cl o
Here's a Demo on SqlFiddle.