I am working in SQL Server 2008 R2 and having a hard time gathering new vs repeat customer orders.
I have data in this format:
OrderID OrderDate Customer OrderAmount
-----------------------------------------------
1 1/1/2017 A $10
2 1/2/2017 B $20
3 1/3/2017 C $30
4 4/1/2017 C $40
5 4/2/2017 D $50
6 4/3/2017 D $60
7 1/6/2018 B $70
Here's what we want:
New defined as: customer has not placed any orders in any prior months.
Repeat defined as: customer has placed an order in a prior month (even if many years ago).
This means that if a new customer places multiple orders in her first month, they would all be considered "new" customer orders. And orders placed in subsequent months would all be considered "repeat" customer orders.
We want to get New orders (count and sum) and Repeat orders (count and sum) per year, per month:
Year Month NewCount NewSum RepeatCount RepeatSum
-----------------------------------------------------------------------------
2017 1 3 (A,B,C) $60 (10+20+30) 0 $0
2017 4 2 (D,D) $110 (50+60) 1 (C) $40 (40)
2018 1 0 $0 1 (B) $70 (70)
(The info in () parenthesis is not part of the result; just putting it here for clarity)
The SQL is easy to write for any single given month, but I don't know how to do it when gathering years worth of months at a time...
If there is a month with no orders of any kind then NULL or 0 values for the year:month would be preferred.
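You can use dense_rank to tell new customers from repeat customers: ranking each customer's order months gives rank 1 to the customer's first month and a higher rank to every later month. This query returns the output you provided: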
declare @t table (OrderID int, OrderDate date, Customer char(1), OrderAmount int)
insert into @t
values (1, '20170101', 'A', 10)
, (2, '20170102', 'B', 20), (3, '20170103', 'C', 30)
, (4, '20170401', 'C', 40), (5, '20170402', 'D', 50)
, (6, '20170403', 'D', 60), (7, '20180106', 'B', 70)
select
[year], [month], NewCount = isnull(sum(case when dr = 1 then 1 end), 0)
, NewSum = isnull(sum(case when dr = 1 then OrderAmount end), 0)
, RepeatCount = isnull(sum(case when dr > 1 then 1 end), 0)
, RepeatSum = isnull(sum(case when dr > 1 then OrderAmount end), 0)
from (
select
*, [year] = year(OrderDate), [month] = month(OrderDate)
-- dr = 1 for every order in the customer's first order month, > 1 for later months
, dr = dense_rank() over (partition by Customer order by dateadd(month, datediff(month, 0, OrderDate), 0))
from
@t
) t
group by [year], [month]
Output
year month NewCount NewSum RepeatCount RepeatSum
----------------------------------------------------------
2017 1 3 60 0 0
2018 1 0 0 1 70
2017 4 2 110 1 40
To display months without orders, first build every combination of the years present in the table with all twelve months, then left join the query above to that result:
select
*
from
(select distinct y = year(OrderDate) from @t) t
cross join (values (1), (2), (3), (4), (5), (6), (7), (8), (9), (10), (11), (12)) q(m)
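The "join with the query above" step might look like the sketch below. It assumes the table variable from the sample query is declared and populated earlier in the same batch (written here as @t); the names monthly, g, and mo are only illustrative. The leading semicolon is there because a WITH clause must follow a terminated statement.

```sql
-- Sketch: assumes @t (OrderID, OrderDate, Customer, OrderAmount) from the
-- sample query is declared and populated earlier in the same batch.
;with monthly as (
    select [year], [month]
         , NewCount    = sum(case when dr = 1 then 1 else 0 end)
         , NewSum      = sum(case when dr = 1 then OrderAmount else 0 end)
         , RepeatCount = sum(case when dr > 1 then 1 else 0 end)
         , RepeatSum   = sum(case when dr > 1 then OrderAmount else 0 end)
    from (
        select OrderAmount
             , [year] = year(OrderDate), [month] = month(OrderDate)
             , dr = dense_rank() over (partition by Customer
                       order by dateadd(month, datediff(month, 0, OrderDate), 0))
        from @t
    ) t
    group by [year], [month]
)
select [year] = g.y, [month] = g.m
     , NewCount    = isnull(mo.NewCount, 0)
     , NewSum      = isnull(mo.NewSum, 0)
     , RepeatCount = isnull(mo.RepeatCount, 0)
     , RepeatSum   = isnull(mo.RepeatSum, 0)
from (
    -- every year present in the data, combined with all twelve months
    select y.y, q.m
    from (select distinct y = year(OrderDate) from @t) y
    cross join (values (1), (2), (3), (4), (5), (6), (7), (8), (9), (10), (11), (12)) q(m)
) g
left join monthly mo on mo.[year] = g.y and mo.[month] = g.m
order by g.y, g.m;
```

Months that have no orders come out of the left join with NULLs, which isnull turns into zeros.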
First, start by summarizing the data with one record per customer per month.
Then, you can use a self-join or similar construct to get the information you need:
with cm as (
select customer, dateadd(day, 1 - day(orderdate), orderdate) as yyyymm,
sum(orderamount) as monthamount, count(*) as numorders
from orders
group by customer, dateadd(day, 1 - day(orderdate), orderdate)
)
-- Each cm row is one customer-month, so the counts below count customers per month.
-- The join only looks at the immediately preceding month.
select year(cm.yyyymm) as yr, month(cm.yyyymm) as mon,
sum(case when cm.numorders > 0 and cm_prev.customer is null then 1 else 0 end) as new_count,
sum(case when cm.numorders > 0 and cm_prev.customer is null then cm.monthamount else 0 end) as new_amount,
sum(case when cm.numorders > 0 and cm_prev.customer is not null then 1 else 0 end) as repeat_count,
sum(case when cm.numorders > 0 and cm_prev.customer is not null then cm.monthamount else 0 end) as repeat_amount
from cm left join
cm cm_prev
on cm.customer = cm_prev.customer and
cm.yyyymm = dateadd(month, 1, cm_prev.yyyymm)
group by year(cm.yyyymm), month(cm.yyyymm)
order by year(cm.yyyymm), month(cm.yyyymm);
This would be a bit easier in SQL Server 2012, where you can use lag().
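As a possible alternative that needs no lag(), a windowed MIN can mark each customer's first order month directly; aggregates with OVER (PARTITION BY ...) are available before SQL Server 2012. The sketch below is not the answerer's query. It assumes the same orders(customer, orderdate, orderamount) table, counts individual orders, and treats any earlier month as "prior".

```sql
-- Sketch: assumes orders(customer, orderdate, orderamount) as in the query above.
select year(o.ordermonth) as yr, month(o.ordermonth) as mon,
       sum(case when o.ordermonth = o.firstmonth then 1 else 0 end) as new_count,
       sum(case when o.ordermonth = o.firstmonth then o.orderamount else 0 end) as new_amount,
       sum(case when o.ordermonth > o.firstmonth then 1 else 0 end) as repeat_count,
       sum(case when o.ordermonth > o.firstmonth then o.orderamount else 0 end) as repeat_amount
from (
    select customer, orderamount,
           -- first day of the month in which this order was placed
           dateadd(month, datediff(month, 0, orderdate), 0) as ordermonth,
           -- first day of the customer's earliest order month
           min(dateadd(month, datediff(month, 0, orderdate), 0))
               over (partition by customer) as firstmonth
    from orders
) o
group by o.ordermonth
order by o.ordermonth;
```

On the seven sample rows from the question this gives 3/60/0/0 for January 2017, 2/110/1/40 for April 2017, and 0/0/1/70 for January 2018.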
I am counting birthdays, sales, and orders for each of the 12 months from a customers table in SQL Server, like this.
The customers table has the columns birth_date, sale_date, and order_date.
select 1 as ranking,'Birthdays' as Type,[MONTH],TOTAL
from ( select DATENAME(month, birth_date) AS [MONTH],count(*) TOTAL
from customers
group by DATENAME(month, birth_date)
)x
union
select 2 as ranking,'sales' as Type,[MONTH],TOTAL
from ( select DATENAME(month, sale_date) AS [MONTH],count(*) TOTAL
from customers
group by DATENAME(month, sale_date)
)x
union
select 3 as ranking,'Orders' as Type,[MONTH],TOTAL
from ( select DATENAME(month, order_date) AS [MONTH],count(*) TOTAL
from customers
group by DATENAME(month, order_date)
)x
And the output looks like this (just dummy data):
ranking Type      MONTH    TOTAL
--------------------------------
1       Birthdays January  12
1       Birthdays April    6
1       Birthdays May      10
2       Sales     February 8
2       Sales     April    14
2       Sales     May      10
3       Orders    June     4
3       Orders    July     3
3       Orders    October  6
3       Orders    December 17
I want the counts for all three types without using UNION or UNION ALL; that is, I want this data from a single SELECT statement (or a more optimized version of this query).
Another approach is to create a CTE with all available ranking values and use CROSS APPLY for it, as shown below.
WITH ranks(ranking) AS (
SELECT * FROM (VALUES (1), (2), (3)) v(r)
)
SELECT
r.ranking,
CASE WHEN r.ranking = 1 THEN 'Birthdays'
WHEN r.ranking = 2 THEN 'Sales'
WHEN r.ranking = 3 THEN 'Orders'
END AS Type,
DATENAME(month, CASE WHEN r.ranking = 1 THEN c.birth_date
WHEN r.ranking = 2 THEN c.sale_date
WHEN r.ranking = 3 THEN c.order_date
END) AS MONTH,
COUNT(*) AS TOTAL
FROM customers c
CROSS APPLY ranks r
GROUP BY r.ranking,
DATENAME(month, CASE WHEN r.ranking = 1 THEN c.birth_date
WHEN r.ranking = 2 THEN c.sale_date
WHEN r.ranking = 3 THEN c.order_date
END)
ORDER BY r.ranking, MONTH
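A variant of the same idea, offered only as a sketch, unpivots the three date columns with CROSS APPLY (VALUES ...) so that the ranking, the label, and the date travel together and the CASE expressions disappear. It assumes the customers table from the question and SQL Server 2008 or later for the row constructor; the alias v and the column name d are illustrative.

```sql
-- Sketch: assumes customers(birth_date, sale_date, order_date) as in the question.
SELECT v.ranking,
       v.Type,
       DATENAME(month, v.d) AS [MONTH],
       COUNT(*) AS TOTAL
FROM customers c
-- turn each customer row into three (ranking, Type, date) rows
CROSS APPLY (VALUES (1, 'Birthdays', c.birth_date),
                    (2, 'Sales',     c.sale_date),
                    (3, 'Orders',    c.order_date)) v(ranking, Type, d)
GROUP BY v.ranking, v.Type, DATENAME(month, v.d)
ORDER BY v.ranking, [MONTH];
```

As in the queries above, rows whose date is NULL end up in a group with a NULL month; add WHERE v.d IS NOT NULL if those should be left out.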
I have the following table on SQL Server:
ID FROM       TO         OFFER NUMBER
-------------------------------------
1  2022.01.02 9999.12.31 1
1  2022.01.02 2022.02.10 2
2  2022.01.05 2022.02.15 1
3  2022.01.02 9999.12.31 1
3  2022.01.15 2022.02.20 2
3  2022.02.03 2022.02.25 3
4  2022.01.16 2022.02.05 1
5  2022.01.17 2022.02.13 1
5  2022.02.05 2022.02.13 2
The range includes the start date but excludes the end date.
The date 9999.12.31 is given (comes from another system), but we could use the last day of the current quarter instead.
I need to find a way to determine the number of days when the customer sees exactly one, two, or three offers. The following picture shows the method upon id 3:
The expected results (without using the last day of the quarter) should look like this, where each column is the number of days on which the customer sees exactly that many offers:
ID Only 1 offer 2 offers 3 offers
---------------------------------
1  2913863      39       0
2  41           0        0
3  2913861      24       17
4  20           0        0
5  19           8        0
I've found this article but it did not enlighten me.
Also, I have limited privileges (for example, I am not able to declare a variable), so I need to use "basic" T-SQL.
Please provide a detailed explanation besides the code.
Thanks in advance!
The following will (for each ID) extract all distinct dates, construct non-overlapping date ranges to test, and will count up the number of offers per range. The final step is to sum and format.
The fact that the start dates are inclusive and the end dates are exclusive, while sometimes non-intuitive for humans, actually works well in algorithms like this.
DECLARE @Data TABLE (Id INT, FromDate DATETIME, ToDate DATETIME, OfferNumber INT)
INSERT @Data
VALUES
(1, '2022-01-02', '9999-12-31', 1),
(1, '2022-01-02', '2022-02-10', 2),
(2, '2022-01-05', '2022-02-15', 1),
(3, '2022-01-02', '9999-12-31', 1),
(3, '2022-01-15', '2022-02-20', 2),
(3, '2022-02-03', '2022-02-25', 3),
(4, '2022-01-16', '2022-02-05', 1),
(5, '2022-01-17', '2022-02-13', 1),
(5, '2022-02-05', '2022-02-13', 2)
;
WITH Dates AS ( -- Gather distinct dates
SELECT Id, Date = FromDate FROM @Data
UNION --(distinct)
SELECT Id, Date = ToDate FROM @Data
),
Ranges AS ( --Construct non-overlapping ranges (The ToDate = NULL case will be ignored later)
SELECT ID, FromDate = Date, ToDate = LEAD(Date) OVER(PARTITION BY Id ORDER BY Date)
FROM Dates
),
Counts AS ( -- Calculate days and count offers per date range
SELECT R.Id, R.FromDate, R.ToDate,
Days = DATEDIFF(DAY, R.FromDate, R.ToDate),
Offers = COUNT(*)
FROM Ranges R
JOIN @Data D ON D.Id = R.Id
AND D.FromDate <= R.FromDate
AND D.ToDate >= R.ToDate
GROUP BY R.Id, R.FromDate, R.ToDate
)
SELECT Id
,[Days with 1 Offer] = SUM(CASE WHEN Offers = 1 THEN Days ELSE 0 END)
,[Days with 2 Offers] = SUM(CASE WHEN Offers = 2 THEN Days ELSE 0 END)
,[Days with 3 Offers] = SUM(CASE WHEN Offers = 3 THEN Days ELSE 0 END)
FROM Counts
GROUP BY Id
The WITH clause introduces Common Table Expressions (CTEs) which progressively build up intermediate results until a final select can be made.
Results:
Id Days with 1 Offer Days with 2 Offers Days with 3 Offers
----------------------------------------------------------
1  2913863           39                 0
2  41                0                  0
3  2913861           24                 17
4  20                0                  0
5  19                8                  0
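To illustrate why the inclusive-start, exclusive-end convention is convenient, here is a small sketch using the first two non-overlapping ranges for Id 3 from the sample data. The column names FirstPart, SecondPart, and Whole are illustrative only.

```sql
-- Half-open ranges [from, to): the day count is a plain DATEDIFF, and
-- adjacent ranges add up without double-counting the shared boundary date.
SELECT FirstPart  = DATEDIFF(DAY, '20220102', '20220115'),
       SecondPart = DATEDIFF(DAY, '20220115', '20220203'),
       Whole      = DATEDIFF(DAY, '20220102', '20220203');
-- FirstPart = 13, SecondPart = 19, Whole = 32 (13 + 19)
```

With inclusive end dates, each range would need a +1 and the boundary day would be counted in both neighbouring ranges.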
Alternately, the final select could use a pivot. Something like:
SELECT Id,
[Days with 1 Offer] = ISNULL([1], 0),
[Days with 2 Offers] = ISNULL([2], 0),
[Days with 3 Offers] = ISNULL([3], 0)
FROM (SELECT Id, Offers, Days FROM Counts) C
PIVOT (SUM(Days) FOR Offers IN ([1], [2], [3])) PVT
ORDER BY Id
See This db<>fiddle for a working example.
Find all the date points for each ID, then, for each date point, count the offers that overlap it.
Refer to the comments within the query.
with
dates as
(
-- get all date points
select ID, theDate = FromDate from offers
union -- union to exclude any duplicate
select ID, theDate = ToDate from offers
),
cte as
(
select ID = d.ID,
Date_Start = d.theDate,
Date_End = LEAD(d.theDate) OVER (PARTITION BY ID ORDER BY theDate),
TheCount = c.cnt
from dates d
cross apply
(
-- Count no of overlapping
select cnt = count(*)
from offers x
where x.ID = d.ID
and x.FromDate <= d.theDate
and x.ToDate > d.theDate
) c
)
select ID, TheCount, days = sum(datediff(day, Date_Start, Date_End))
from cte
where Date_End is not null
group by ID, TheCount
order by ID, TheCount
Result:
ID TheCount days
-------------------
1  1        2913863
1  2        39
2  1        41
3  1        2913861
3  2        29
3  3        12
4  1        20
5  1        19
5  2        8
To get to the required format, use PIVOT
dbfiddle demo
I have a table with historical stocks prices for hundreds of stocks. I need to extract only those stocks that reached $10 or greater for the first time.
Stock Price Date
----------------------
AAA   9     2021-10-01
AAA   10    2021-10-02
AAA   8     2021-10-03
AAA   10    2021-10-04
BBB   9     2021-10-01
BBB   11    2021-10-02
BBB   12    2021-10-03
Is there a way to count how many times each stock hit >= 10 in order to pull only those where count = 1 (in this case it would be stock BBB considering it never reached 10 in the past)?
Since I couldn't figure out how to create the count, I've tried the manipulations below with min/max dates, but this looks like an awkward approach. Any idea for a simpler solution?
with query1 as (
select Stock, min(date) as min_greater10_dt
from t
where Price >= 10
group by Stock
), query2 as (
select Stock, max(date) as max_greater10_dt
from t
where Price >= 10
group by Stock
)
select Stock
from t a
join query1 b on b.Stock = a.Stock
join query2 c on c.Stock = a.Stock
where not(a.Price < 10 and a.Date between b.min_greater10_dt and c.max_greater10_dt)
This is a type of gaps-and-islands problem, which can be solved as follows:
detect the change from < 10 to >= 10 using a lagged price
count the number of such changes
keep only the stocks where this has happened exactly once
take the first row, since you only want the stock (you could group by here, but a row number lets you select the entire row should you wish to).
declare @Table table (Stock varchar(3), Price money, [Date] date);
insert into @Table (Stock, Price, [Date])
values
('AAA', 9, '2021-10-01'),
('AAA', 10, '2021-10-02'),
('AAA', 8, '2021-10-03'),
('AAA', 10, '2021-10-04'),
('BBB', 9, '2021-10-01'),
('BBB', 11, '2021-10-02'),
('BBB', 12, '2021-10-03');
with cte1 as (
select Stock, Price, [Date]
, row_number() over (partition by Stock, case when Price >= 10 then 1 else 0 end order by [Date] asc) rn
, lag(Price,1,0) over (partition by Stock order by [Date] asc) LaggedStock
from @Table
), cte2 as (
select Stock, Price, [Date], rn, LaggedStock
, sum(case when Price >= 10 and LaggedStock < 10 then 1 else 0 end) over (partition by Stock) StockOver10
from cte1
)
select Stock
--, Price, [Date], rn, LaggedStock, StockOver10 -- debug
from cte2
where Price >= 10
and StockOver10 = 1 and rn = 1;
Returns:
Stock
BBB
Note: providing DDL+DML as shown above makes it much easier for people to assist.
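The "group by" variant mentioned above could be sketched as follows when only the stock name is needed. It assumes the question's table t(Stock, Price, [Date]) and, like the answer, needs lag(), so SQL Server 2012 or later; the names flagged and PrevPrice are illustrative.

```sql
-- Sketch: assumes the question's table t(Stock, Price, [Date]).
with flagged as (
    select Stock, Price
         -- previous price per stock; 0 for the first row, so an opening
         -- price of 10 or more also counts as one crossing
         , lag(Price, 1, 0) over (partition by Stock order by [Date]) as PrevPrice
    from t
)
select Stock
from flagged
group by Stock
-- keep stocks that moved from below 10 to 10 or above exactly once
having sum(case when Price >= 10 and PrevPrice < 10 then 1 else 0 end) = 1;
```

On the sample rows AAA crosses the threshold twice and BBB once, so only BBB is returned.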
I have a table sales with columns
Month SalesAmount
--------------------------
4 50000
5 60000
6 70000
7 50000
8 60000
9 40000
I want result like this
From Month To Month Result
-----------------------------------------------
4 6 Increasing
6 7 Decreasing
7 8 Increasing
8 9 Decreasing
without using a cursor
Try this. Basically, you need to join the table to itself by the month (+1), then pull the data you want/perform any calcs.
Select
M1.Month as [From],
M2.Month as [To],
Case
When M2.SalesAmount > M1.SalesAmount Then 'Increasing'
When M2.SalesAmount < M1.SalesAmount Then 'Decreasing'
Else 'Holding Steady'
End
From sales M1
Inner Join sales M2 on M2.Month = M1.Month + 1
This works if you want the breakdown month by month. However, your example data set compresses months 4-6. Without more details on how you determine what to compress, I'm going to make the following assumptions:
You want detailed data for the last 3 periods, and a compressed summary of all other periods.
You wish only the overall trend between the first month and the last month inside the compressed period. i.e. you want to know the difference between the first, and the last month values.
To do that, the query starts to get more complicated. I've done it with two Unioned queries:
With
compressed_range as
( select min([Month]) as min_month, max([Month]) - 3 as max_month from sales )
Select
M1.[Month] as [From],
M2.[Month] as [To],
Case
When M2.SalesAmount > M1.SalesAmount Then 'Increasing'
When M2.SalesAmount < M1.SalesAmount Then 'Decreasing'
Else 'Holding Steady'
End
From sales M1
Inner Join sales M2 on M2.[Month] = ( select max_month from compressed_range )
Where M1.Month = ( select min_month from compressed_range )
Union All
Select
M1.Month as [From],
M2.Month as [To],
Case
When M2.SalesAmount > M1.SalesAmount Then 'Increasing'
When M2.SalesAmount < M1.SalesAmount Then 'Decreasing'
Else 'Holding Steady'
End
From sales M1
Inner Join sales M2 on M2.Month = M1.Month + 1
Where M2.Month >= (Select max_month + 1 from compressed_range)
This gives your desired result:
DECLARE @T TABLE (Month INT, SalesAmount MONEY);
INSERT @T
VALUES (4, 50000), (5, 60000), (6, 70000), (7, 50000), (8, 60000), (9, 40000);
WITH CTE AS
( SELECT FromMonth = T2.Month,
ToMonth = T.Month,
Result = CASE T2.Result
WHEN -1 THEN 'Decreasing'
WHEN 0 THEN 'Static'
WHEN 1 THEN 'Increasing'
END,
GroupingSet = ROW_NUMBER() OVER(ORDER BY T.Month) - ROW_NUMBER() OVER(PARTITION BY T2.Result ORDER BY T.Month)
FROM @T T
CROSS APPLY
( SELECT TOP 1
T2.SalesAmount,
T2.Month,
Result = SIGN(T.SalesAmount - T2.SalesAmount)
FROM @T T2
WHERE T2.Month < T.Month
ORDER BY T2.Month DESC
) T2
)
SELECT FromMonth = MIN(FromMonth),
ToMonth = MAX(ToMonth),
Result
FROM CTE
GROUP BY Result, GroupingSet
ORDER BY FromMonth;
The first stage is to get the sales amount for the previous month each time:
SELECT *
FROM @T T
CROSS APPLY
( SELECT TOP 1
T2.SalesAmount,
T2.Month,
Result = SIGN(T.SalesAmount - T2.SalesAmount)
FROM @T T2
WHERE T2.Month < T.Month
ORDER BY T2.Month DESC
) T2
ORDER BY T.MONTH
Will Give:
Month SalesAmount SalesAmount Month Result
5 60000.00 50000.00 4 1.00
6 70000.00 60000.00 5 1.00
7 50000.00 70000.00 6 -1.00
8 60000.00 50000.00 7 1.00
9 40000.00 60000.00 8 -1.00
Where Result is just an indicator of whether the amount has increased or decreased. You then need to apply an ordering trick: for consecutive members of a sequence, a member's overall row number minus its position within its own group stays constant. So with the above data set, if we added:
RN1 = ROW_NUMBER() OVER(ORDER BY T.Month),
RN2 = ROW_NUMBER() OVER(PARTITION BY T2.Result ORDER BY T.Month)
Month SalesAmount SalesAmount Month Result RN1 RN2 | RN1 - RN2
5 60000.00 50000.00 4 1.00 1 1 | 0
6 70000.00 60000.00 5 1.00 2 2 | 0
7 50000.00 70000.00 6 -1.00 3 1 | 2
8 60000.00 50000.00 7 1.00 4 3 | 1
9 40000.00 60000.00 8 -1.00 5 2 | 3
So you can see that for the first 2 rows the final column RN1 - RN2 stays the same, as they are both increasing; then, when the result changes, the difference between the two row numbers changes, which creates a new group.
You can then group by this calculation (the GroupingSet column in the original query), to group your consecutive periods of increase and decrease together.
Example on SQL Fiddle
If you are using only month no in your table structure, you can try something like this
SELECT s1.month AS From_Month,
s2.month AS To_Month,
CASE
WHEN s2.salesamount > s1.salesamount THEN 'Increasing'
ELSE 'Decreasing'
END AS res
FROM sales AS s1,
sales AS s2
WHERE s1.month + 1 = s2.month
demo at http://sqlfiddle.com/#!6/0819d/11
I am looking to set up 7 groups into which customers would fall:
Non-purchaser (never bought from us)
New purchaser (purchased for the first time within the current financial year)
Reactivated purchaser (purchased in the current financial year, and also in the 2nd most recent year)
Lapsed purchaser (purchased in the prior financial year but not the current one)
2 yr Consecutive purchaser (has purchased in the current financial year and the most recent one)
3-4 yr consecutive purchaser (has purchased in every year for the last 3 or 4 financial years)
5+ year consecutive purchaser (has purchased in every financial year for a minimum of 5 years)
The financial year I would be using would be from 1st april to 31st march, and would use the following tables:
purchaser (including id (primary key))
purchases (date_purchased, purchases_purchaser_id)
Where the tables are joined on purchaser_id = purchases_purchaser_id and each purchaser can have multiple purchases within any financial year (so they could presumably be grouped by year as well).
It's been driving me mad so any help would be majorly appreciated!!!
Thanks,
Davin
Here is a dynamic version:
Declare @currentYear int
Declare @OlderThan5yrs datetime
Set @currentYear = Year(GetDate()) - Case When month(GetDate())<4 then 1 else 0 end
Set @OlderThan5yrs = cast(cast( @currentYear-5 as varchar(4))+'/04/01' as datetime)
Select p.pName,
p.purchaser_id,
isNull(a.[5+YrAgo],0) as [5+YrAgo],
isNull(a.[4YrAgo], 0) as [4YrAgo],
isNull(a.[3YrAgo], 0) as [3YrAgo],
isNull(a.[2YrAgo], 0) as [2YrAgo],
isNull(a.[1YrAgo], 0) as [1YrAgo],
isNull(a.[CurYr], 0) as [CurYr],
isNull(a.Category, 'Non-purchaser (ever)') as Category
From purchasers p
Left Join
(
Select purchases_purchaser_id,
[5] as [5+YrAgo],
[4] as [4YrAgo],
[3] as [3YrAgo],
[2] as [2YrAgo],
[1] as [1YrAgo],
[0] as [CurYr],
Case When [4]+[3]+[2]+[1]+[0] = 5 Then '5+ year consecutive'
When [2]+[1]+[0] = 3 Then '3-4 yr consecutive'
When [1]+[0] = 2 Then '2 yr Consecutive'
When [1]=1 and [0]=0 Then 'Lapsed'
When [2]=1 and [1]=0 and [0]=1 Then 'Reactivated'
When [4]+[3]+[2]+[1]=0 and [0]=1 Then 'New'
When [4]+[3]+[2]+[1]+[0] = 0 Then 'Non-purchaser (last 5 yrs)'
Else 'non categorized'
End as Category
From (
Select purchases_purchaser_id,
Case When date_purchased < @OlderThan5yrs Then 5
Else @currentYear - Year(date_purchased)+ Case When month(date_purchased)<4 Then 1 else 0 end
end as fiscalYear, count(*) as nPurchases
From purchases
Group by purchases_purchaser_id,
Case When date_purchased < @OlderThan5yrs Then 5
Else @currentYear - Year(date_purchased)+ Case When month(date_purchased)<4 Then 1 else 0 end
end
) as AggData
PIVOT ( count(nPurchases) for fiscalYear in ([5],[4],[3],[2],[1],[0]) ) pvt
) as a
on p.purchaser_id=a.purchases_purchaser_id
UPDATED:
Here is the result with the data I inserted in the previous query (you will have to add # to the table names in the query).
pName purchaser_id 5+YrAgo 4YrAgo 3YrAgo 2YrAgo 1YrAgo CurYr Category
-------------------- ------------ ------- ------ ------ ------ ------ ----- --------------------------
Non-purchaser 0 0 0 0 0 0 0 Non-purchaser (ever)
New purchaser 1 0 0 0 0 0 1 New
Reactivated 2 0 0 1 1 0 1 Reactivated
Lapsed 3 0 0 0 1 1 0 Lapsed
2 yr Consecutive 4 0 0 0 0 1 1 2 yr Consecutive
3 yr consecutive 5 0 0 0 1 1 1 3-4 yr consecutive
4 yr consecutive 6 0 0 1 1 1 1 3-4 yr consecutive
5+ year consecutive 7 1 1 1 1 1 1 5+ year consecutive
Uncategorized 8 0 0 1 0 0 0 non categorized
old one 9 1 0 0 0 0 0 Non-purchaser (last 5 yrs)
You also don't need the columns [5+YrAgo], [4YrAgo], [3YrAgo], [2YrAgo], [1YrAgo] and [CurYr].
I added them to make the query logic easier to check.
UPDATE 2
Below is the query you asked for in the comment.
Note
The table structures I've used in the query are:
Table purchasers ( purchaser_id int, pName varchar(20))
Table purchases (purchases_purchaser_id int, date_purchased datetime)
and there is a foreign key on purchases (purchases_purchaser_id) referencing purchasers (purchaser_id).
;With AggData as (
Select purchases_purchaser_id,
Case When [4]+[3]+[2]+[1]+[0] = 5 Then 1 end as [Consec5],
Case When [4]=0 and [2]+[1]+[0] = 3 Then 1 end as [Consec34],
Case When [2]=0 and [1]+[0] = 2 Then 1 end as [Consec2],
Case When [1]=1 and [0]=0 Then 1 end as [Lapsed],
Case When [2]=1 and [1]=0 and [0]=1 Then 1 end as [Reactivated],
Case When [4]+[3]+[2]+[1]=0 and [0]=1 Then 1 end as [New],
Case When [4]+[3]+[2]>0 and [1]+[0]=0 Then 1 end as [Uncateg]
From (
Select purchases_purchaser_id,
@currentYear - Year(date_purchased) + Case When month(date_purchased)<4 Then 1 else 0 end as fiscalYear,
count(*) as nPurchases
From purchases
Where date_purchased >= @OlderThan5yrs
Group by purchases_purchaser_id,
@currentYear - Year(date_purchased) + Case When month(date_purchased)<4 Then 1 else 0 end
) as AggData
PIVOT ( count(nPurchases) for fiscalYear in ([4],[3],[2],[1],[0]) ) pvt
)
Select count([Consec5]) as [Consec5],
count([Consec34]) as [Consec34],
count([Consec2]) as [Consec2],
count([Lapsed]) as [Lapsed],
count([Reactivated]) as [Reactivated],
count([New]) as [New],
count(*)-count(a.purchases_purchaser_id) as [Non],
count([Uncateg]) as [Uncateg]
From purchasers p
Left Join AggData as a
on p.purchaser_id=a.purchases_purchaser_id
Result (With test data from previous post)
Consec5 Consec34 Consec2 Lapsed Reactivated New Non Uncateg
------- -------- ------- ------ ----------- --- --- -------
1 2 1 1 1 1 2 1
MS SQL Server (works on 2005 and 2008; the PIVOT operator is not available in 2000)
SET NOCOUNT ON
CREATE TABLE #purchasers (purchaser_id int, pName varchar(20))
Insert Into #purchasers values (0, 'Non-purchaser')
Insert Into #purchasers values (1, 'New purchaser')
Insert Into #purchasers values (2, 'Reactivated')
Insert Into #purchasers values (3, 'Lapsed')
Insert Into #purchasers values (4, '2 yr Consecutive')
Insert Into #purchasers values (5, '3 yr consecutive')
Insert Into #purchasers values (6, '4 yr consecutive')
Insert Into #purchasers values (7, '5+ year consecutive')
Insert Into #purchasers values (8, 'Uncategorized')
Insert Into #purchasers values (9, 'old one')
CREATE TABLE #purchases (date_purchased datetime, purchases_purchaser_id int)
Insert Into #purchases values ('2010/05/03', 1)
Insert Into #purchases values ('2007/05/03', 2)
Insert Into #purchases values ('2008/05/03', 2)
Insert Into #purchases values ('2010/05/03', 2)
Insert Into #purchases values ('2008/05/03', 3)
Insert Into #purchases values ('2009/05/03', 3)
Insert Into #purchases values ('2009/05/03', 4)
Insert Into #purchases values ('2010/05/03', 4)
Insert Into #purchases values ('2008/05/03', 5)
Insert Into #purchases values ('2009/05/03', 5)
Insert Into #purchases values ('2010/05/03', 5)
Insert Into #purchases values ('2007/05/03', 6)
Insert Into #purchases values ('2008/05/03', 6)
Insert Into #purchases values ('2009/05/03', 6)
Insert Into #purchases values ('2010/05/03', 6)
Insert Into #purchases values ('2004/05/03', 7)
Insert Into #purchases values ('2005/05/03', 7)
Insert Into #purchases values ('2006/05/03', 7)
Insert Into #purchases values ('2007/05/03', 7)
Insert Into #purchases values ('2008/05/03', 7)
Insert Into #purchases values ('2009/05/03', 7)
Insert Into #purchases values ('2009/05/03', 7)
Insert Into #purchases values ('2009/05/03', 7)
Insert Into #purchases values ('2010/05/03', 7)
Insert Into #purchases values ('2007/05/03', 8)
Insert Into #purchases values ('2000/05/03', 9)
Select p.pName,
p.purchaser_id,
isNull(a.[2005],0) as [Bef.2006],
isNull(a.[2006],0) as [2006],
isNull(a.[2007],0) as [2007],
isNull(a.[2008],0) as [2008],
isNull(a.[2009],0) as [2009],
isNull(a.[2010],0) as [2010],
isNull(a.Category, 'Non-purchaser') as Category
From #purchasers p
Left Join
(
Select purchases_purchaser_id, [2005],[2006],[2007],[2008],[2009],[2010],
Case When [2006]+[2007]+[2008]+[2009]+[2010] = 5 Then '5+ year consecutive'
When [2008]+[2009]+[2010] = 3 Then '3-4 yr consecutive'
When [2009]+[2010] = 2 Then '2 yr Consecutive'
When [2009]=1 and [2010]=0 Then 'Lapsed'
When [2008]=1 and [2009]=0 and [2010]=1 Then 'Reactivated'
When [2006]+[2007]+[2008]+[2009]=0 and [2010]=1 Then 'New'
When [2006]+[2007]+[2008]+[2009]+[2010] = 0 Then 'Non-purchaser in last 5 yrs'
Else 'non categorized'
End as Category
From (
Select purchases_purchaser_id,
Case When date_purchased < '2006/04/01' Then 2005
-- January to March belong to the fiscal year that started the previous April
Else Year(date_purchased)- Case When month(date_purchased)<4 Then 1 else 0 end
end as fiscalYear, count(*) as nPurchases
From #purchases
Group by purchases_purchaser_id,
Case When date_purchased < '2006/04/01' Then 2005
Else Year(date_purchased)- Case When month(date_purchased)<4 Then 1 else 0 end
end
) as AggData
PIVOT ( count(nPurchases) for fiscalYear in ([2005],[2006],[2007],[2008],[2009],[2010]) ) pvt
) as a
on p.purchaser_id=a.purchases_purchaser_id
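For an April-to-March financial year, the fiscal-year expression used in these queries can also be written by shifting the date back three months and taking the calendar year. The sketch below only demonstrates that equivalence; the three sample dates are illustrative and not part of the test data above, and the VALUES row constructor needs SQL Server 2008 or later.

```sql
-- Sketch: the three dates are illustrative boundary cases, not the thread's test data.
select d.date_purchased
       -- April..December stay in the same year; January..March fall back to the previous one
     , fiscalYear = year(dateadd(month, -3, d.date_purchased))
from (values (cast('20100331' as datetime))
           , (cast('20100401' as datetime))
           , (cast('20110101' as datetime))) d(date_purchased);
-- 2010-03-31 -> 2009, 2010-04-01 -> 2010, 2011-01-01 -> 2010
```

The same expression can stand in for the Year(...) minus CASE form in both the SELECT list and the GROUP BY, as long as the two stay identical.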
Although it could be done a bit more easily with another table of date ranges for the 5 fiscal years, I have hard-coded the from/to date references in this query, and it appears to work.
The inner SELECT pre-gathers a flag for each fiscal year, set when there is at least one purchase within that date range (for example, Apr 1, 2010 = "20100401" through Mar 31, 2011 = "20110331"), cycling back through the last 5 years. An additional flag records whether there is any purchase at all in the purchases table, to tell "never purchased" apart from someone whose purchases are all 6, 7, or more years old.
That query basically creates a cross-tab of the individual years in which activity occurred. The outer query can then test from the most detailed criteria down to the least to pick a caption for the classification.
I converted this from another SQL dialect as best I could to comply with SQL Server syntax (mostly the date conversion), but otherwise the principle and queries do work. The final classification column is character, but you can supersede it with whatever you want.
SELECT
id,
CASE
WHEN year1 + year2 + year3 + year4 + year5 = 5 THEN '5+yrs '
WHEN year1 + year2 + year3 = 3 THEN '3-4yrs' -- purchases in each of the three most recent years
WHEN year1 + year2 = 2 THEN '2yrs '
WHEN year1 = 1 AND year2 = 0 AND year3 = 1 THEN 'Reacti'
WHEN year1 = 1 THEN 'New '
WHEN year1 = 0 AND year2 = 1 THEN 'Lapsed'
WHEN AnyPurchase = 1 THEN 'over5'
ELSE 'never'
END AS BuyerClassification
FROM
( SELECT
purchaser.id,
MAX( CASE WHEN date_purchased >= CONVERT( Date, '20100401', 112 )
AND date_purchased <= CONVERT( Date, '20110331', 112 )
THEN 1 ELSE 0 END ) Year1,
MAX( CASE WHEN date_purchased >= CONVERT( Date, '20090401', 112 )
AND date_purchased <= CONVERT( Date, '20100331', 112 )
THEN 1 ELSE 0 END ) Year2,
MAX( CASE WHEN date_purchased >= CONVERT( Date, '20080401', 112 )
AND date_purchased <= CONVERT( Date, '20090331', 112 )
THEN 1 ELSE 0 END ) Year3,
MAX( CASE WHEN date_purchased >= CONVERT( Date, '20070401', 112 )
AND date_purchased <= CONVERT( Date, '20080331', 112 )
THEN 1 ELSE 0 END ) Year4,
MAX( CASE WHEN date_purchased >= CONVERT( Date, '20060401', 112 )
AND date_purchased <= CONVERT( Date, '20070331', 112 )
THEN 1 ELSE 0 END ) Year5,
MAX( CASE WHEN date_purchased <= CONVERT( Date, '20100401', 112 )
THEN 1 ELSE 0 END ) AnyPurchase
FROM
purchaser LEFT OUTER JOIN purchases
ON purchaser.id = purchases.purchases_purchaser_id
GROUP BY
purchaser.id ) PreGroup1 -- SQL Server does not accept the positional form GROUP BY 1
EDIT --
fixed parens via syntax conversion and missed it...
The GROUP BY groups on the first column of the inner query, which is the purchaser's ID from the purchaser table (the dialect this was converted from allows the positional form "Group By 1"; SQL Server needs the column named). Doing a left outer join guarantees that every person in the purchaser table appears, whether or not they have any purchases. "PreGroup1" is the alias of the inner SELECT, in case you want to do other joins in the outermost SELECT, where the year values are tested for classification.
Although it will work, it may not be as efficient as the approaches others have chimed in with; still, analyzing the query may open your mind to some querying and aggregating techniques. The process basically creates a sort of cross-tab by using the CASE/WHEN construct in the inner SELECT, with the final classification in the outermost SELECT.