How to Nest query with different criteria - sql

I have a Sales_details table from which I'd like to report the top 150 products and the top 10 customers of each product. The code below does just that and works perfectly. However, it uses the same date range for both. How do I modify it so that the top 150 products are based on 10 years of history while the top 10 customers are based on 2 years of history?
select pc.*
from (select pc.*,
dense_rank() over (order by product_sales desc, product_id) as product_rank
from (select sd.product_id, sd.custno, sum(sd.sales$) as total_sales,
row_number() over (partition by sd.product_id order by sum(sd.sales$) desc) as cust_within_product_rank,
sum(sum(sd.sales$)) over (partition by sd.product_id) as product_sales
from salesdetails sd
group by sd.product_id, sd.custno
) pc
) pc
where product_rank <= 150 and cust_within_product_rank <= 10;

You can use conditional aggregation:
select pc.*
from (select pc.*,
dense_rank() over (order by product_sales desc, product_id) as product_rank
from (select sd.product_id, sd.custno, sum(sd.sales$) as total_sales,
row_number() over (partition by sd.product_id
order by sum(case when date > dateadd(year, -2, getdate()) then sd.sales$ else 0 end) desc
) as cust_within_product_rank,
sum(sum(case when date > dateadd(year, -10, getdate()) then sd.sales$ else 0 end)) over (partition by sd.product_id) as product_sales
from salesdetails sd
group by sd.product_id, sd.custno
) pc
) pc
where product_rank <= 150 and cust_within_product_rank <= 10;
I'm not sure what column you use for date, so I just called it date.
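To see what the two conditional sums feed into the rankings, here is a minimal, self-contained sketch for SQL Server 2008 or later. The table variable @sales and the column names sale_date and sales are assumptions made up for the illustration; they are not from the question.

```sql
-- Hypothetical sample data; @sales, sale_date and sales are illustrative names.
declare @sales table (product_id int, custno int, sale_date date, sales money);

insert into @sales (product_id, custno, sale_date, sales)
values (1, 100, dateadd(year, -5, getdate()), 900),   -- inside the 10-year window only
       (1, 100, dateadd(month, -6, getdate()), 50),   -- inside both windows
       (1, 200, dateadd(month, -3, getdate()), 300),  -- inside both windows
       (2, 100, dateadd(year, -12, getdate()), 5000); -- outside both windows

-- One pass over the data, two different date ranges.
select product_id, custno,
       sum(case when sale_date > dateadd(year, -2, getdate()) then sales else 0 end) as sales_2y,
       sum(case when sale_date > dateadd(year, -10, getdate()) then sales else 0 end) as sales_10y
from @sales
group by product_id, custno;
```

With this data, customer 100 on product 1 should come out with 50 for the two-year sum and 950 for the ten-year sum. Note that a customer with no sales in the last two years still gets a row (with a two-year sum of 0), so such customers can still appear in the top 10 when a product has fewer than ten recent buyers.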

Related

SQL - get start & end balance for each member each year

For each member, I'd like to get the starting and ending balance for every year in which there is a record. For example, the query below gives me the latest balance for each member in each year, based on the date column:
SELECT
T.MemberID,
T.DateCol,
T.Amount
FROM
(SELECT T.MemberID,
T.DateCol,
Amount,
ROW_NUMBER() OVER (PARTITION BY MemberID,
YEAR(DateCol)
ORDER BY
DateCol desc) AS seqnum
FROM
Tablet T
GROUP BY DateCol, MemberID, Amount
) T
WHERE
seqnum = 1 AND
MemberID = '1000009'
And the query below gives me the earliest balance for each year:
SELECT
T.MemberID,
T.DateCol,
T.Amount
FROM
(SELECT T.MemberID,
T.DateCol,
Amount,
ROW_NUMBER() OVER (PARTITION BY MemberID,
YEAR(DateCol)
ORDER BY
DateCol) AS seqnum
FROM
Tablet T
GROUP BY DateCol, MemberID, Amount
) T
WHERE
seqnum = 1 AND
MemberID = '1000009'
Each query returns a result set with the columns MemberID, Date and Amount.
What I'm looking for is a single query that returns one row per year and member, with YEAR, MEMBERID, STARTBALANCE and ENDBALANCE as the columns.
What would be the best way to go about this?
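One possible approach, sketched under assumptions: number the rows of each member and year in both directions, then pick the first and last amounts with conditional aggregation. The table variable @Tablet and its sample rows are made up for the illustration; the column names follow the question. It assumes at most one row per member per date, otherwise the first and last picks are arbitrary among ties.

```sql
-- Hypothetical sample data standing in for the Tablet table from the question.
declare @Tablet table (MemberID varchar(10), DateCol date, Amount money);

insert into @Tablet (MemberID, DateCol, Amount)
values ('1000009', '2014-01-15', 100),
       ('1000009', '2014-06-30', 150),
       ('1000009', '2014-12-20', 175),
       ('1000009', '2015-03-01', 200);

select MemberID,
       Yr as [Year],
       max(case when seq_first = 1 then Amount end) as StartBalance, -- earliest row of the year
       max(case when seq_last = 1 then Amount end) as EndBalance     -- latest row of the year
from (select T.MemberID,
             year(T.DateCol) as Yr,
             T.Amount,
             row_number() over (partition by T.MemberID, year(T.DateCol)
                                order by T.DateCol) as seq_first,
             row_number() over (partition by T.MemberID, year(T.DateCol)
                                order by T.DateCol desc) as seq_last
      from @Tablet T
     ) T
group by MemberID, Yr
order by MemberID, Yr;
```

For the sample rows this should give 100 and 175 for 2014, and 200 for both columns in 2015, where there is only one record.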

getting difference between two invoices by ranking and subtracting one from the other

I'm trying to get the difference between two invoices.
I attempted to use CTEs for ranks 1 and 2, but each one contains a subquery and I couldn't get that to work.
The second query looks the same as the one below, but with rank = 2.
select *
from (
SELECT i.id, i.subtotal/100 as subtotal, i.created_at, i.paid_at
,RANK() OVER (PARTITION BY i.subscription_id ORDER BY i.created_at DESC) AS Rank
From Invoices i
) as r
where r.rank = 1
order by r.created_at desc;
Following the path that you are on (using row_number()/rank()), you can use conditional aggregation. Assuming you want the difference of the subtotal for each subscription, then:
select i.subscription_id,
sum(case when seqnum = 1 then subtotal
else - subtotal
end) as difference
from (select i.subscription_id, i.subtotal/100 as subtotal,
row_number() over (partition by i.subscription_id order by i.created_at desc) as seqnum
from Invoices i
) i
where seqnum in (1, 2)
group by i.subscription_id;

SQL Server 2008 calculating data difference when we have only one date column

I have a date column Order_date and I am looking for a way to calculate the difference between a customer's last order date and the order date just before it.
Example
Customer : 1, 2 , 1 , 1
Order_date: 01/02/2007, 02/01/2015, 06/02/2014, 04/02/2015
As you can see, customer 1 has three orders.
I want the date difference between the most recent order date (04/02/2015) and the one before it (06/02/2014).
For SQL Server 2012 & 2014 you could use LAG with a DATEDIFF to see the number of days between them.
For older versions, a CTE would probably be your best bet:
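;WITH CTE AS
(
SELECT CustomerID,
Order_Date,
rn = ROW_NUMBER() OVER (PARTITION BY CustomerID ORDER BY Order_Date DESC)
FROM Orders -- placeholder table name; the question does not give one
)
SELECT c1.CustomerID,
DATEDIFF(d, c2.Order_Date, c1.Order_Date) AS DaysBetween -- earlier date first, so the result is positive
FROM CTE c1
INNER JOIN CTE c2 ON c2.CustomerID = c1.CustomerID -- compare orders of the same customer only
AND c2.rn = c1.rn + 1
WHERE c1.rn = 1 -- most recent order vs. the one before it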
In SQL Server 2012+, you can use lag() to get the difference between any two dates:
select t.*,
datediff(day, lag(order_date) over (partition by customer order by order_date),
order_date) as days_diff
from table t;
If you have an older version, you can do something similar with correlated subqueries or outer apply.
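As a rough illustration of the outer apply route on versions without lag(), the sketch below looks up the previous order date per row. The table variable @orders and its dates are made-up sample values (the question's date format is ambiguous), so treat both as assumptions.

```sql
-- Hypothetical sample data; @orders and its contents are illustrative only.
declare @orders table (customer int, order_date date);

insert into @orders (customer, order_date)
values (1, '2007-02-01'), (2, '2015-01-02'), (1, '2014-02-06'), (1, '2015-02-04');

select o.customer,
       o.order_date,
       prev.order_date as prev_order_date,
       datediff(day, prev.order_date, o.order_date) as days_diff
from @orders o
outer apply (select top 1 o2.order_date
             from @orders o2
             where o2.customer = o.customer
               and o2.order_date < o.order_date
             order by o2.order_date desc   -- latest order strictly before this one
            ) prev
order by o.customer, o.order_date;
```

Rows without an earlier order keep a NULL prev_order_date and days_diff, which mirrors what lag() returns for the first row of each partition.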
EDIT:
If you just want the difference between the two most recent dates, use conditional aggregation instead:
select customer,
datediff(day, max(case when seqnum = 2 then order_date end),
max(case when seqnum = 1 then order_date end)
) as MostRecentDiff
from (select t.*,
row_number() over (partition by customer order by order_date desc) as seqnum
from table t
) t
group by customer;
If you're using SQL Server 2008 or later, you can try CROSS APPLY.
SELECT [customers].[customer_id], DATEDIFF(DAY, MIN([recent_orders].[order_date]), MAX([recent_orders].[order_date])) AS [elapsed]
FROM [customers]
CROSS APPLY (
SELECT TOP 2 [order_date]
FROM [orders]
WHERE ([orders].[customer_id] = [customers].[customer_id])
ORDER BY [order_date] DESC -- without this, TOP 2 returns two arbitrary orders
) [recent_orders]
GROUP BY [customers].[customer_id]
SELECT DATEDIFF(DAY, Y.PrevLastOrderDate, Y.LastOrderDate) AS PreviousDays
FROM
(
SELECT X.LastOrderDate
, (SELECT MAX(OrderDate) FROM dbo.Orders SO WHERE SO.CustomerID=1 AND SO.OrderDate < X.LastOrderDate) AS PrevLastOrderDate
FROM
(
select MAX(OrderDate) AS LastOrderDate
FROM dbo.Orders O
WHERE O.CustomerID=1
)X
)Y
IF OBJECT_ID('tempdb..#Invoices') IS NOT NULL DROP TABLE #Invoices
create table #Invoices ( OrderId int , OrderDate datetime )
insert into #Invoices (OrderId , OrderDate )
select 101, '01/01/2001' UNION ALL Select 202, '02/02/2002' UNION ALL Select 303, '03/03/2003'
UNION ALL Select 808, '08/08/2008' UNION ALL Select 909, '09/09/2009'
;
WITH
MyCTE /* http://technet.microsoft.com/en-us/library/ms175972.aspx */
( OrderId,OrderDate,ROWID) AS
(
SELECT
OrderId,OrderDate
, ROW_NUMBER() OVER ( ORDER BY OrderDate ) as ROWID
FROM
#Invoices inv
)
SELECT
OrderId,OrderDate
,(Select Max(OrderDate) from MyCTE innerAlias where innerAlias.ROWID = (outerAlias.ROWID-1) ) as PreviousOrderDate
,
[MyDiff] =
CASE
WHEN (Select Max(OrderDate) from MyCTE innerAlias where innerAlias.ROWID = (outerAlias.ROWID-1) ) iS NULL then 0
ELSE DATEDIFF (mm, OrderDate , (Select Max(OrderDate) from MyCTE innerAlias where innerAlias.ROWID = (outerAlias.ROWID-1) ) )
END
, ROWIDMINUSONE = (ROWID-1)
, ROWID as ROWID_SHOWN_FOR_KICKS , OrderDate as OrderDateASecondTimeForConvenience
FROM
MyCTE outerAlias
ORDER BY outerAlias.OrderDate Desc , OrderId

SQL query for Pricing analysis

I'm having trouble finding the min and max pricing status based on the YearMonth column.
Below is my table data:
YearMonth | STATE | ProductGroup | LocaProdname | Price
201407 | MH | AIRTEL | AIRTEL-3G | 10,000
201208 | GJ | IDEA | IDEA-3G | 1,200
201406 | WB | AIRCEL | AIRCEL PERPAID | 5,866
201407 | DL | TATA DOCOMA | TATA LANDLINE | 8,955
201207 | KAR | VODAFONE | VODAFONE-3G | 7,899
201312 | MH | AIRTEL | AIRTEL-3G | 15,000
201408 | GJ | IDEA | IDEA-3G | 25,000
I require the output below:
YearMonth | STATE | ProductGroup | LocaProdname | Price | Indicator-YEAR
201407 | MH | AIRTEL | AIRTEL-3G | 10,000 | MAX
201312 | MH | AIRTEL | AIRTEL-3G | 15,000 | MIN
201408 | GJ | IDEA | IDEA-3G | 25,000 | MAX
201208 | GJ | IDEA | IDEA-3G | 1,200 | MIN
I need the max and min YearMonth values.
If I understand correctly, you can do this with row_number():
select YearMonth, STATE, ProductGroup, LocaProdname, Price,
(case when seqnum_asc = 1 then 'MIN' else 'MAX' end) as Indicator
from (select d.*,
row_number() over (partition by state, productgroup, locaprodname
order by price asc) as seqnum_asc,
row_number() over (partition by state, productgroup, locaprodname
order by price desc) as seqnum_desc
from data
) d
where seqnum_asc = 1 or seqnum_desc = 1;
EDIT:
Does this do what you want?
select YearMonth, STATE, ProductGroup, LocaProdname, Price,
(case when seqnum_asc = 1 then 'MIN' else 'MAX' end) as Indicator
from (select d.*,
row_number() over (partition by YearMonth
order by price asc) as seqnum_asc,
row_number() over (partition by YearMonth
order by price desc) as seqnum_desc
from data
) d
where seqnum_asc = 1 or seqnum_desc = 1;
Use ROW_NUMBER() with PARTITION BY, and remove whatever you don't need:
SELECT yearmonth,state,productgroup,locaprodname,price,operation
FROM (
SELECT * FROM (SELECT p.yearmonth,p.state,p.productgroup,p.locaprodname,p.price,'MAX' AS Operation,
Row_number() OVER( partition BY p.productgroup, p.locaprodname
ORDER BY p.price DESC) AS Row
FROM pricingtest p) AS Maxx
WHERE Maxx.row = 1
UNION ALL
SELECT * FROM (SELECT p.yearmonth,p.state,p.productgroup,p.locaprodname,p.price,'MIN' AS Operation,
Row_number() OVER( partition BY p.productgroup, p.locaprodname
ORDER BY p.price ASC) AS Row
FROM pricingtest p) AS Minn
WHERE Minn.row = 1
) AS whole
ORDER BY yearmonth,productgroup
This can be done by finding the MAX/MIN values associated with the LocaProdname,ProductGroup and State then joining in on the table where everything matches. See below, or view the fiddle at http://sqlfiddle.com/#!3/4d6bd/2
NOTE: I've added HAVING COUNT(*) > 1 because you seem to want only the products whose price has changed (i.e. those with more than one entry).
SELECT p.YearMonth
,p.State
,p.ProductGroup
,p.LocaProdname
,p.Price
,CASE
WHEN p.Price = a.MaxPrice
THEN 'MAX'
WHEN p.Price = a.MinPrice
THEN 'MIN'
END AS [Indicator-YEAR]
FROM PricingTest p
INNER JOIN (
SELECT LocaProdname
,ProductGroup
,State
,MAX(Price) AS MaxPrice
,MIN(Price) AS MinPrice
FROM pricingTest
GROUP BY LocaProdname
,ProductGroup
,State
HAVING COUNT(*) > 1
) a ON a.LocaProdname = p.LocaProdname
AND a.ProductGroup = p.ProductGroup
AND a.State= p.State
AND (
a.MaxPrice = p.Price
OR a.MinPrice = p.Price
)
ORDER BY LocaProdname
EDIT: I just noticed that the asker might be looking for the max/min YearMonth instead. If so, see http://sqlfiddle.com/#!3/4d6bd/4, which simply replaces every reference to Price with YearMonth.
Once you get the last and first record you can UNION results:
SELECT t.*, 'MIN' AS Indicator
FROM
myTable t LEFT JOIN
myTable t2 ON t.YearMonth = t2.YearMonth AND t2.price < t.price
WHERE t2.YearMonth IS NULL
UNION
SELECT t.*, 'MAX' AS Indicator
FROM
myTable t LEFT JOIN
myTable t2 ON t.YearMonth = t2.YearMonth AND t2.price > t.price
WHERE t2.YearMonth IS NULL
If you have several records with same highest price, above query will return all of them. Also if you only have one record in a month, it will be returned twice as both MIN and MAX.

Summing and then getting min and max of top 70% in SQL

I have data for the purchases of a product formatted like this:
Item | Price | Quantity Bought
ABC 10.10 4
DEF 8.30 12
DEF 7.75 8
ABC 10.50 20
GHI 15.4 1
GHI 15.2 12
ABC 10.25 8
... ... ...
Where each row represents an individual purchasing a certain amount at a certain price. I would like to aggregate this data and eliminate the prices below the 30th percentile for total quantity bought from my table.
For example, in the above data set the total amount of product ABC bought was (4+20+8) = 32 units, with average price = (4*10.10 + 8*10.25 + 20*10.50)/32 = 10.39.
I would like to organize the above data set like this:
Item | VWP | Total Vol | 70th %ile min | 70th %ile max
ABC 10.39 32 ??? ???
DEF ... 20 ??? ???
GHI ... 13 ??? ???
Where VWP is the volume weighted price, and the 70th %ile min/max represent the minimum and maximum prices within the top 70% of volume.
In other words, I want to eliminate the prices with the lowest volumes until I have 70% of the total volume for the day contained in the remaining prices. I would then like to publish the min and max price for the ones that are left in the 70th %ile min/max columns.
I tried to be as clear as possible, but if this is tough to follow along with please let me know which parts need clarification.
Note: These are not the only columns contained in my dataset, and I will be selecting and calculating other values as well. I only included the columns that are relevant to this specific calculation.
EDIT:
Here is my code so far, and I need to incorporate my calculation into it (the variables with the '#' symbol before them are inputs given by the user):
SELECT Item,
SUM(quantity) AS Total_Vol,
DATEADD(day, -#DateOffset, CONVERT(date, GETDATE())) AS buyDate,
MIN(Price) AS MinPrice,
MAX(Price) AS MaxPrice,
MAX(Price) - MIN(Price) AS PriceRange,
ROUND(SUM(Price * quantity)/SUM(quantity), 6) AS VWP
FROM TransactTracker..CustData
-- #DateOffset (Number of days data is offset by)
-- #StartTime (Time to start data in hours)
-- #EndTime (Time to stop data in hours)
WHERE DATEDIFF(day, TradeDateTime, GETDATE()) = (#DateOffset+1)
AND DATEPART(hh, TradeDateTime) >= #StartTime
AND HitTake = ''
OR DATEDIFF(day, TradeDateTime, GETDATE()) = #DateOffset
AND DATEPART(hh, TradeDateTime) < #EndTime
AND HitTake = ''
GROUP BY Item
EDIT 2:
FROM (SELECT p.*,
(SELECT SUM(quantity) from TransactTracker..CustData p2
where p2.Series = p.Series and p2.Size >= p.Size) as volCum
FROM TransactTracker..CustData p
) p
EDIT 3:
(case when CAST(qcum AS FLOAT) / SUM(quantity) <= 0.7 THEN MIN(Price) END) AS min70px,
(case when CAST(qcum AS FLOAT) / SUM(quantity) <= 0.7 THEN MAX(Price) END) AS max70px
FROM (select p.*,
(select SUM(quantity) from TransactTracker..CustData p2
where p2.Item = p.Item and p2.quantity >= p.quantity)
as qcum from TransactTracker..CustData p) cd
There is some ambiguity in how you define 70% when a row crosses the threshold. However, the challenge is twofold: after computing the cumulative proportion, the query also needs to choose the appropriate row. This suggests using row_number() for the selection.
This solution, using SQL Server 2012 syntax, calculates the cumulative sum. It then assigns a sequential value based on how close the ratio is to 70%.
select item,
SUM(price * quantity) / SUM(quantity) as vwp,
SUM(quantity) as total_vol,
min(case when seqnum = 1 then price end) as min70price,
max(case when seqnum = 1 then price end) as max70price
from (select p.*,
ROW_NUMBER() over (partition by item order by abs(0.7 - qcum * 1.0 / qtot)) as seqnum
from (select p.*,
SUM(quantity) over (partition by item order by quantity desc) as qcum,
SUM(quantity) over (partition by item) as qtot
from purchases p
) p
) p
group by item;
To get the largest value less than 70%, then you would use:
max(case when qcum < qtot*0.7 then qcum end) over (partition by item) as lastqcum
And then the case statements in the outer select would be:
min(case when lastqcum = qcum then price end) . .
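If the goal is the minimum and maximum price across every row inside the top 70% of volume, rather than the single row nearest the threshold, one way to sketch it is shown below (SQL Server 2012 or later). It is an interpretation, not the method above: it assumes the row that crosses the 70% mark is still counted, and it breaks quantity ties by price. The table variable @purchases is a stand-in loaded with the sample rows from the question.

```sql
-- Sample rows from the question, loaded into a stand-in table variable.
declare @purchases table (item varchar(10), price decimal(10,2), quantity int);

insert into @purchases (item, price, quantity)
values ('ABC', 10.10, 4), ('DEF', 8.30, 12), ('DEF', 7.75, 8), ('ABC', 10.50, 20),
       ('GHI', 15.40, 1), ('GHI', 15.20, 12), ('ABC', 10.25, 8);

select item,
       sum(price * quantity) / sum(quantity) as vwp,
       sum(quantity) as total_vol,
       -- qcum - quantity is the volume accumulated before this row; the row is
       -- kept while that running volume is still below 70% of the item total.
       min(case when qcum - quantity < 0.7 * qtot then price end) as min70price,
       max(case when qcum - quantity < 0.7 * qtot then price end) as max70price
from (select p.*,
             sum(quantity) over (partition by item
                                 order by quantity desc, price desc
                                 rows unbounded preceding) as qcum,
             sum(quantity) over (partition by item) as qtot
      from @purchases p
     ) p
group by item;
```

For ABC the total is 32 units, so the threshold is 22.4: the 20-unit and 8-unit rows are kept and the 4-unit row is dropped, which should give 10.25 and 10.50 as the two prices. Changing the comparison to qcum <= 0.7 * qtot would instead exclude the row that crosses the threshold.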
In earlier versions of SQL Server, you can get the same effect with the correlated subquery:
select item,
SUM(price * quantity) / SUM(quantity) as vwp,
SUM(quantity) as total_vol,
min(case when seqnum = 1 then price end) as min70price,
max(case when seqnum = 1 then price end) as max70price
from (select p.*,
ROW_NUMBER() over (partition by item order by abs(0.7 - qcum * 1.0 / qtot)) as seqnum
from (select p.*,
(select SUM(quantity) from purchases p2 where p2.item = p.item and p2.quantity >= p.quantity
) as qcum,
SUM(quantity) over (partition by item) as qtot
from purchases p
) p
) p
group by item
Here is the example with your code:
SELECT Item,
SUM(quantity) AS Total_Vol,
DATEADD(day, -#DateOffset, CONVERT(date, GETDATE())) AS buyDate,
MIN(Price) AS MinPrice,
MAX(Price) AS MaxPrice,
MAX(Price) - MIN(Price) AS PriceRange,
ROUND(SUM(Price * quantity)/SUM(quantity), 6) AS VWP,
min(case when seqnum = 1 then price end) as min70price,
max(case when seqnum = 1 then price end) as max70price
from (select p.*,
ROW_NUMBER() over (partition by item order by abs(0.7 - qcum * 1.0 / qtot)) as seqnum
from (select p.*,
(select SUM(quantity) from TransactTracker..CustData p2 where p2.item = p.item and p2.quantity >= p.quantity
) as qcum,
SUM(quantity) over (partition by item) as qtot
from TransactTracker..CustData p
) p
) cd
-- #DateOffset (Number of days data is offset by)
-- #StartTime (Time to start data in hours)
-- #EndTime (Time to stop data in hours)
WHERE DATEDIFF(day, TradeDateTime, GETDATE()) = (#DateOffset+1)
AND DATEPART(hh, TradeDateTime) >= #StartTime
AND HitTake = ''
OR DATEDIFF(day, TradeDateTime, GETDATE()) = #DateOffset
AND DATEPART(hh, TradeDateTime) < #EndTime
AND HitTake = ''
GROUP BY Item