MERGE, UPDATE, INSERT T-SQL Cannot INSERT value NULL - sql

Goal: Keep a running table of student class ranks each month of the year
Haves: I have code that provides me with columns
StudentID; '+@DateTXT+'
@DateTXT is a dynamic variable that returns whichever month I'm running the code in.
Needs: I'm trying to use MERGE (with UPDATE and INSERT) so that I can run the code once and establish a table:
| StudentID | Jan |
| 56789 | 2 |
| 12345 | 7 |
Then each month I add a new month column to the permanent table:
EXEC('ALTER TABLE StudentRanking
ADD ' + @DateTXT + ' smallint NOT NULL DEFAULT(999)')
| StudentID | Jan | Feb |
| 56789 | 2 | 999 |
| 12345 | 7 | 999 |
I'll run the ranking code again for February and save it into a temporary table, which I will use to merge, update, insert with the StudentRanking table:
| StudentID | Feb |
| 56789 | 3 |
(note.. student 12345 doesn't come up)
So I'd like to end up with a running list:
EXEC('
MERGE StudentRanking AS TARGET
USING ##TEMPDB2 AS SOURCE ON (TARGET.StudentID = SOURCE.StudentID)
WHEN MATCHED AND TARGET.' + @DateTXT + ' <> SOURCE.' + @DateTXT + '
THEN UPDATE SET TARGET.' + @DateTXT + ' = SOURCE.' + @DateTXT + '
WHEN NOT MATCHED BY TARGET THEN
INSERT (StudentID, ' + @Rank_TXT + ')
VALUES (SOURCE.StudentID, SOURCE.' + @Rank_TXT + '); ')
| StudentID | Jan | Feb |
| 56789 | 2 | 3 |
| 12345 | 7 |null |
Problem: Some students leave the school, which leaves them without a ranking in subsequent months (e.g. 12345 has no rank in February), so when I try to INSERT the results from a temporary table, I get this error:
SQL Server Database Error: Cannot insert the value NULL into column 'Feb', table 'tempdb.dbo.##TEMPDB'; column does not allow nulls. UPDATE fails.
I could do an ISNULL(ranking,0) but I'd rather have nulls than 0's

Always open to a different, better approach, @GarethD!
I actually got it to work by doing:
EXEC('ALTER TABLE StudentRanking
ADD ' + @Date_TXT + ' smallint DEFAULT(null)')
WHEN MATCHED AND TARGET.' + @Date_TXT + ' IS NULL
THEN UPDATE SET TARGET.' + @Date_TXT + ' = SOURCE.' + @Date_TXT + '
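A side note on the nullable column: `<>` never evaluates to true when either side is NULL, so a WHEN MATCHED condition written with `<>` alone skips rows whose month column is still NULL, while an IS NULL test alone skips rows whose rank has changed. A minimal sketch of a condition that covers both cases follows. The value assigned to @DateTXT, the helper variable @col, and the use of QUOTENAME and sp_executesql are assumptions for illustration, not part of the code above.

```sql
-- Sketch only: 'Feb' is a hypothetical value for @DateTXT.
DECLARE @DateTXT sysname = N'Feb';
DECLARE @col nvarchar(258) = QUOTENAME(@DateTXT);  -- bracketed column name, e.g. [Feb]

DECLARE @sql nvarchar(max) = N'
MERGE StudentRanking AS TARGET
USING ##TEMPDB2 AS SOURCE ON (TARGET.StudentID = SOURCE.StudentID)
-- update when the stored rank is missing or different
WHEN MATCHED AND (TARGET.' + @col + N' IS NULL
               OR TARGET.' + @col + N' <> SOURCE.' + @col + N')
    THEN UPDATE SET TARGET.' + @col + N' = SOURCE.' + @col + N'
WHEN NOT MATCHED BY TARGET THEN
    INSERT (StudentID, ' + @col + N')
    VALUES (SOURCE.StudentID, SOURCE.' + @col + N');';

EXEC sys.sp_executesql @sql;
```

QUOTENAME is used here only so that an unexpected value in the variable cannot break out of the column name.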

The quick fix is to make the column nullable. It is worth pointing out, though, that this design does not scale well: every new month needs an ALTER TABLE and dynamic SQL. A more scalable approach would be to use a properly normalised table where StudentID and the month (stored as a date) make up your primary key.
You then have something like:
CREATE TABLE dbo.StudentRanking
(
Date DATE NOT NULL,
StudentID INT NOT NULL,
Score INT NOT NULL,
CONSTRAINT PK_StudentRanking__StudentID_Date PRIMARY KEY(Date, StudentID),
);
You can then create a view on top of this to get the table in the format you wanted:
CREATE VIEW dbo.StudentRankingByYear
WITH SCHEMABINDING
AS
SELECT StudentID,
Year = DATEPART(YEAR, Date),
Jan = SUM(CASE WHEN DATEPART(MONTH, Date) = 1 THEN Score END),
Feb = SUM(CASE WHEN DATEPART(MONTH, Date) = 2 THEN Score END),
Mar = SUM(CASE WHEN DATEPART(MONTH, Date) = 3 THEN Score END),
Apr = SUM(CASE WHEN DATEPART(MONTH, Date) = 4 THEN Score END),
May = SUM(CASE WHEN DATEPART(MONTH, Date) = 5 THEN Score END),
Jun = SUM(CASE WHEN DATEPART(MONTH, Date) = 6 THEN Score END),
Jul = SUM(CASE WHEN DATEPART(MONTH, Date) = 7 THEN Score END),
Aug = SUM(CASE WHEN DATEPART(MONTH, Date) = 8 THEN Score END),
Sep = SUM(CASE WHEN DATEPART(MONTH, Date) = 9 THEN Score END),
Oct = SUM(CASE WHEN DATEPART(MONTH, Date) = 10 THEN Score END),
Nov = SUM(CASE WHEN DATEPART(MONTH, Date) = 11 THEN Score END),
Dec = SUM(CASE WHEN DATEPART(MONTH, Date) = 12 THEN Score END)
FROM dbo.StudentRanking
GROUP BY StudentID, DATEPART(YEAR, Date);
GO
If you really need to, you can even turn this into an indexed view, although the plain view should perform well enough without the index. The only differences are that an indexed view cannot aggregate nullable expressions, so the missing months have to show as 0, and that it must include COUNT_BIG(*):
CREATE VIEW dbo.StudentRankingByYear
WITH SCHEMABINDING
AS
SELECT StudentID,
Year = DATEPART(YEAR, Date),
Jan = SUM(CASE WHEN DATEPART(MONTH, Date) = 1 THEN Score ELSE 0 END),
Feb = SUM(CASE WHEN DATEPART(MONTH, Date) = 2 THEN Score ELSE 0 END),
Mar = SUM(CASE WHEN DATEPART(MONTH, Date) = 3 THEN Score ELSE 0 END),
Apr = SUM(CASE WHEN DATEPART(MONTH, Date) = 4 THEN Score ELSE 0 END),
May = SUM(CASE WHEN DATEPART(MONTH, Date) = 5 THEN Score ELSE 0 END),
Jun = SUM(CASE WHEN DATEPART(MONTH, Date) = 6 THEN Score ELSE 0 END),
Jul = SUM(CASE WHEN DATEPART(MONTH, Date) = 7 THEN Score ELSE 0 END),
Aug = SUM(CASE WHEN DATEPART(MONTH, Date) = 8 THEN Score ELSE 0 END),
Sep = SUM(CASE WHEN DATEPART(MONTH, Date) = 9 THEN Score ELSE 0 END),
Oct = SUM(CASE WHEN DATEPART(MONTH, Date) = 10 THEN Score ELSE 0 END),
Nov = SUM(CASE WHEN DATEPART(MONTH, Date) = 11 THEN Score ELSE 0 END),
Dec = SUM(CASE WHEN DATEPART(MONTH, Date) = 12 THEN Score ELSE 0 END),
Records = COUNT_BIG(*)
FROM dbo.StudentRanking
GROUP BY StudentID, DATEPART(YEAR, Date);
GO
CREATE UNIQUE CLUSTERED INDEX UQ_StudentRankingByYear__StudentID_Year
ON dbo.StudentRankingByYear (StudentID, Year);
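With the normalised table, the monthly load no longer needs ALTER TABLE or dynamic SQL. A minimal sketch, assuming the month's ranks have been staged in a hypothetical temp table #MonthlyRanks (StudentID, Score) and that each month is stored as its first day:

```sql
-- Assumption: #MonthlyRanks(StudentID INT, Score INT) holds this month's ranks.
DECLARE @MonthStart DATE = DATEADD(MONTH, DATEDIFF(MONTH, 0, GETDATE()), 0);

MERGE dbo.StudentRanking AS tgt
USING #MonthlyRanks AS src
   ON tgt.StudentID = src.StudentID
  AND tgt.[Date] = @MonthStart
WHEN MATCHED AND tgt.Score <> src.Score THEN
    UPDATE SET Score = src.Score             -- re-run within the same month
WHEN NOT MATCHED BY TARGET THEN
    INSERT ([Date], StudentID, Score)
    VALUES (@MonthStart, src.StudentID, src.Score);
```

Students who have left simply get no row for the month, and the non-indexed view then shows NULL for them.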

Change your ALTER TABLE to:
EXEC('ALTER TABLE StudentRanking
ADD ' + @DateTXT + ' smallint NULL')
Assuming the down votes are because I didn't offer an alternative that was normalized, I'd recommend using PIVOT for this type of problem.
Setup:
CREATE TABLE dbo.StudentRanking
(
MonthID CHAR(3) NOT NULL,
StudentID INT NOT NULL,
Score INT NOT NULL,
CONSTRAINT PK_StudentRanking__StudentID_Date PRIMARY KEY(MonthID, StudentID),
);
INSERT INTO dbo.StudentRanking VALUES ('JAN', 56321, 2)
INSERT INTO dbo.StudentRanking VALUES ('FEB', 56321, 2)
INSERT INTO dbo.StudentRanking VALUES ('MAR', 56321, 2)
INSERT INTO dbo.StudentRanking VALUES ('APR', 56321, 2)
INSERT INTO dbo.StudentRanking VALUES ('MAY', 56321, 2)
INSERT INTO dbo.StudentRanking VALUES ('JUN', 56321, 2)
INSERT INTO dbo.StudentRanking VALUES ('JUL', 56321, 2)
INSERT INTO dbo.StudentRanking VALUES ('AUG', 56321, 3)
INSERT INTO dbo.StudentRanking VALUES ('SEP', 56321, 2)
INSERT INTO dbo.StudentRanking VALUES ('OCT', 56321, 3)
INSERT INTO dbo.StudentRanking VALUES ('NOV', 56321, 2)
INSERT INTO dbo.StudentRanking VALUES ('DEC', 56321, 2)
INSERT INTO dbo.StudentRanking VALUES ('JAN', 56821, 1)
INSERT INTO dbo.StudentRanking VALUES ('FEB', 56821, 1)
INSERT INTO dbo.StudentRanking VALUES ('MAR', 56821, 1)
INSERT INTO dbo.StudentRanking VALUES ('APR', 56821, 1)
INSERT INTO dbo.StudentRanking VALUES ('MAY', 56821, 1)
INSERT INTO dbo.StudentRanking VALUES ('JUN', 56821, 1)
INSERT INTO dbo.StudentRanking VALUES ('JUL', 56821, 1)
INSERT INTO dbo.StudentRanking VALUES ('AUG', 56821, 2)
INSERT INTO dbo.StudentRanking VALUES ('SEP', 56821, 1)
INSERT INTO dbo.StudentRanking VALUES ('OCT', 56821, 2)
INSERT INTO dbo.StudentRanking VALUES ('NOV', 56821, 1)
INSERT INTO dbo.StudentRanking VALUES ('DEC', 56821, 1)
INSERT INTO dbo.StudentRanking VALUES ('JAN', 56021, 3)
INSERT INTO dbo.StudentRanking VALUES ('FEB', 56021, 3)
INSERT INTO dbo.StudentRanking VALUES ('MAR', 56021, 3)
INSERT INTO dbo.StudentRanking VALUES ('APR', 56021, 3)
INSERT INTO dbo.StudentRanking VALUES ('MAY', 56021, 3)
INSERT INTO dbo.StudentRanking VALUES ('JUN', 56021, 4)
INSERT INTO dbo.StudentRanking VALUES ('JUL', 56021, 5)
Query
SELECT * FROM StudentRanking
PIVOT (SUM(Score) FOR MonthID IN (JAN, FEB, MAR, APR,
MAY, JUN, JUL, AUG, SEP, OCT, NOV, DEC)) AS PVT
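Results (row order is not guaranteed without an ORDER BY):
| StudentID | JAN | FEB | MAR | APR | MAY | JUN | JUL | AUG | SEP | OCT | NOV | DEC |
| 56021 | 3 | 3 | 3 | 3 | 3 | 4 | 5 | NULL | NULL | NULL | NULL | NULL |
| 56321 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 3 | 2 | 3 | 2 | 2 |
| 56821 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 2 | 1 | 2 | 1 | 1 |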
The SUM(Score) is harmless in this instance since there is never more than one record per student per month. PIVOT requires an aggregate function, so it is only there to satisfy the syntax.

Related

Pivoting unique users for each month, each year

I'm learning about the PIVOT function and I want to try it in my DB. In the table DDOT I have events (rows) made by users, with the month and year of each event stored in YYYYMM format.
id_ev iddate id_user ...
------------------------
1 201901 321
2 201902 654
3 201903 987
4 201901 321
5 201903 987
I'm basing my query on the MS documentation and I'm not getting errors, but I'm not able to fill it with the count of unique users. In simple words, I want to know how many unique users checked up each month (x axis) in each year (y axis). However, I'm getting NULL as the result:
YYYY jan feb mar
----------------------------
2019 NULL NULL NULL
I'm expecting a full table with what I mentioned before.
YYYY jan feb mar
----------------------------
2019 2 1 1
In the code I've tried with different aggregate functions but this block is the closest to a result from SQL.
CREATE TABLE ddot
(
id_ev int NOT NULL ,
iddate int NOT NULL ,
id_user int NOT NULL
);
INSERT INTO DDOT
(
[id_ev], [iddate], [id_user]
)
VALUES
(
1, 201901, 321
),
(
2, 201902, 654
),
(
3, 201903, 987
),
(
4, 201901, 321
),
(
5, 201903, 987
)
GO
SELECT *
FROM (
SELECT COUNT(DISTINCT id_user) [TOT],
DATENAME(YEAR, CAST(iddate+'01' AS DATETIME)) [YYYY], --concat iddate 01 to get full date
DATENAME(MONTH, CAST(iddate+'01' AS DATETIME)) [MMM]
FROM DDOT
GROUP BY DATENAME(YEAR, CAST(iddate+'01' AS DATETIME)),
DATENAME(MONTH, CAST(iddate+'01' AS DATETIME))
) AS DOT_COUNT
PIVOT(
SUM([TOT])
FOR MMM IN (jan, feb, mar)
) AS PVT
Ideally you should be storing an actual date in the iddate column, and not a string (or number?). We can work around this using string functions:
SELECT
CONVERT(varchar(4), LEFT(iddate, 4)) AS YYYY,
COUNT(CASE WHEN CONVERT(varchar(2), RIGHT(iddate, 2)) = '01' THEN 1 END) AS jan,
COUNT(CASE WHEN CONVERT(varchar(2), RIGHT(iddate, 2)) = '02' THEN 1 END) AS feb,
COUNT(CASE WHEN CONVERT(varchar(2), RIGHT(iddate, 2)) = '03' THEN 1 END) AS mar,
...
FROM DDOT
GROUP BY
CONVERT(varchar(4), LEFT(iddate, 4));
Note that if the iddate column is already text, then we can remove all the ugly calls to CONVERT above:
SELECT
LEFT(iddate, 4) AS YYYY,
COUNT(CASE WHEN RIGHT(iddate, 2) = '01' THEN 1 END) AS jan,
COUNT(CASE WHEN RIGHT(iddate, 2) = '02' THEN 1 END) AS feb,
COUNT(CASE WHEN RIGHT(iddate, 2) = '03' THEN 1 END) AS mar,
...
FROM DDOT
GROUP BY
LEFT(iddate, 4);
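The queries above count events. If the goal is the number of distinct users per month, as the question describes, a variation could look like the following. It is a sketch that assumes iddate is the int column from the question's CREATE TABLE, and it uses integer arithmetic instead of string functions; the remaining months follow the same pattern.

```sql
SELECT
    iddate / 100 AS YYYY,                                                 -- 201901 -> 2019
    COUNT(DISTINCT CASE WHEN iddate % 100 = 1 THEN id_user END) AS jan,   -- month part = 1
    COUNT(DISTINCT CASE WHEN iddate % 100 = 2 THEN id_user END) AS feb,
    COUNT(DISTINCT CASE WHEN iddate % 100 = 3 THEN id_user END) AS mar
FROM DDOT
GROUP BY iddate / 100;
```

COUNT(DISTINCT ...) ignores the NULLs produced by the CASE for non-matching months, so each column only counts users who have at least one event in that month.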

SQL Server query for new and repeat orders per month

I am working in SQL Server 2008 R2 and having a hard time gathering new vs repeat customer orders.
I have data in this format:
OrderID OrderDate Customer OrderAmount
-----------------------------------------------
1 1/1/2017 A $10
2 1/2/2017 B $20
3 1/3/2017 C $30
4 4/1/2017 C $40
5 4/2/2017 D $50
6 4/3/2017 D $60
7 1/6/2018 B $70
Here's what we want:
New defined as: customer has not placed any orders in any prior months.
Repeat defined as: customer has placed an order in a prior month (even if many years ago).
This means that if a new customer places multiple orders in her first month, they would all be considered "new" customer orders. And orders placed in subsequent months would all be considered "repeat" customer orders.
We want to get New orders (count and sum) and Repeat orders (count and sum) per year, per month:
Year Month NewCount NewSum RepeatCount RepeatSum
-----------------------------------------------------------------------------
2017 1 3 (A,B,C) $60 (10+20+30) 0 $0
2017 4 2 (D,D) $110 (50+60) 1 (C) $40 (40)
2018 1 0 $0 1 (B) $70 (70)
(The info in () parenthesis is not part of the result; just putting it here for clarity)
The SQL is easy to write for any single given month, but I don't know how to do it when gathering years' worth of months at a time...
If there is a month with no orders of any kind then NULL or 0 values for the year:month would be preferred.
You can use dense_rank to tell new customers from repeat ones. This query returns the output you asked for:
declare @t table (OrderID int, OrderDate date, Customer char(1), OrderAmount int)
insert into @t
values (1, '20170101', 'A', 10)
, (2, '20170102', 'B', 20), (3, '20170103', 'C', 30)
, (4, '20170401', 'C', 40), (5, '20170402', 'D', 50)
, (6, '20170403', 'D', 60), (7, '20180106', 'B', 70)
select
[year], [month], NewCount = isnull(sum(case when dr = 1 then 1 end), 0)
, NewSum = isnull(sum(case when dr = 1 then OrderAmount end), 0)
, RepeatCount = isnull(sum(case when dr > 1 then 1 end), 0)
, RepeatSum = isnull(sum(case when dr > 1 then OrderAmount end), 0)
from (
select
*, [year] = year(OrderDate), [month] = month(OrderDate)
-- dr = 1 for orders in the customer's first month, > 1 for any later month
, dr = dense_rank() over (partition by Customer order by dateadd(month, datediff(month, 0, OrderDate), 0))
from
@t
) t
group by [year], [month]
Output
year month NewCount NewSum RepeatCount RepeatSum
----------------------------------------------------------
2017 1 3 60 0 0
2018 1 0 0 1 70
2017 4 2 110 1 40
If you want to display months without orders, you must first build every combination of the years in the table with all twelve months, and then left join the query above to it:
select
*
from
(select distinct y = year(OrderDate) from @t) t
cross join (values (1), (2), (3), (4), (5), (6), (7), (8), (9), (10), (11), (12)) q(m)
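Putting the two pieces together could look like the sketch below. It redeclares the sample data so that it runs on its own; the derived-table aliases g and o are names chosen for this illustration.

```sql
declare @t table (OrderID int, OrderDate date, Customer char(1), OrderAmount int)
insert into @t values
  (1, '20170101', 'A', 10), (2, '20170102', 'B', 20), (3, '20170103', 'C', 30)
, (4, '20170401', 'C', 40), (5, '20170402', 'D', 50), (6, '20170403', 'D', 60)
, (7, '20180106', 'B', 70)

select
    [year] = g.y, [month] = g.m
    , NewCount = isnull(sum(case when o.dr = 1 then 1 end), 0)
    , NewSum = isnull(sum(case when o.dr = 1 then o.OrderAmount end), 0)
    , RepeatCount = isnull(sum(case when o.dr > 1 then 1 end), 0)
    , RepeatSum = isnull(sum(case when o.dr > 1 then o.OrderAmount end), 0)
from (
    -- every year present in the data, crossed with months 1..12
    select y.y, q.m
    from (select distinct y = year(OrderDate) from @t) y
    cross join (values (1), (2), (3), (4), (5), (6), (7), (8), (9), (10), (11), (12)) q(m)
) g
left join (
    -- one row per order, ranked by the customer's order months
    select OrderAmount, y = year(OrderDate), m = month(OrderDate)
         , dr = dense_rank() over (partition by Customer order by dateadd(month, datediff(month, 0, OrderDate), 0))
    from @t
) o on o.y = g.y and o.m = g.m
group by g.y, g.m
order by g.y, g.m
```

Months with no orders have no matching rows in o, so all four aggregates fall back to 0 through isnull.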
First, start by summarizing the data with one record per customer per month.
Then, you can use a self-join or similar construct to get the information you need:
with cm as (
select customer, dateadd(day, 1 - day(orderdate), orderdate) as yyyymm,
sum(orderamount) as monthamount, count(*) as numorders
from orders
group by customer, dateadd(day, 1 - day(orderdate), orderdate)
)
select year(cm.yyyymm) as yr, month(cm.yyyymm) as mon,
sum(case when cm.numorders > 0 and cm_prev.customer is null then 1 else 0 end) as new_count,
sum(case when cm.numorders > 0 and cm_prev.customer is null then cm.monthamount else 0 end) as new_amount,
sum(case when cm.numorders > 0 and cm_prev.customer is not null then 1 else 0 end) as repeat_count,
sum(case when cm.numorders > 0 and cm_prev.customer is not null then cm.monthamount else 0 end) as repeat_amount
from cm left join
cm cm_prev
on cm.customer = cm_prev.customer and
-- note: this join only matches the immediately preceding month
cm.yyyymm = dateadd(month, 1, cm_prev.yyyymm)
group by year(cm.yyyymm), month(cm.yyyymm)
order by year(cm.yyyymm), month(cm.yyyymm);
This would be a bit easier in SQL Server 2012, where you can use lag().
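A sketch of what the lag() variant might look like, assuming SQL Server 2012 or later and the same orders table with a date-typed orderdate. Because lag() is evaluated over the per-customer monthly summary, it is NULL only for a customer's very first month, which matches the "no orders in any prior month" definition; the CTE name flagged and the column prev_yyyymm are names chosen for this illustration.

```sql
with cm as (
    -- one row per customer per month
    select customer,
           dateadd(day, 1 - day(orderdate), orderdate) as yyyymm,
           sum(orderamount) as monthamount,
           count(*) as numorders
    from orders
    group by customer, dateadd(day, 1 - day(orderdate), orderdate)
),
flagged as (
    -- previous month in which the same customer ordered, if any
    select cm.*,
           lag(yyyymm) over (partition by customer order by yyyymm) as prev_yyyymm
    from cm
)
select year(yyyymm) as yr, month(yyyymm) as mon,
       sum(case when prev_yyyymm is null then numorders else 0 end) as new_count,
       sum(case when prev_yyyymm is null then monthamount else 0 end) as new_amount,
       sum(case when prev_yyyymm is not null then numorders else 0 end) as repeat_count,
       sum(case when prev_yyyymm is not null then monthamount else 0 end) as repeat_amount
from flagged
group by year(yyyymm), month(yyyymm)
order by yr, mon;
```

Here the counts are order counts (numorders) rather than customer counts, since the question asks for the number of orders.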

SQL PIVOT without aggregate columns

create table Product_Price
(
id int,
dt date,
SellerName varchar(20),
Product varchar(10),
ShippingTime varchar(20),
Price money
)
insert into Product_Price values (1, '2012-01-16','Sears','AA','2 days',32)
insert into Product_Price values (2, '2012-01-16','Amazon', 'AA','4 days', 40)
insert into Product_Price values (3, '2012-01-16','eBay','AA','1 days', 27)
insert into Product_Price values (4, '2012-01-16','Walmart','AA','Same day', 28)
insert into Product_Price values (5, '2012-01-16','Target', 'AA','3-4 days', 29)
insert into Product_Price values (6, '2012-01-16','Flipcart','AA',NULL, 30)
select *
from
(select dt, product, SellerName, sum(price) as price
from product_price group by dt, product, SellerName) t1
pivot (sum(price) for SellerName in ([amazon],[ebay])) as bob
I want two more columns in the output (one for AmazonShippingTime, another for eBayShippingTime). How can I get these? Fiddle: http://sqlfiddle.com/#!3/2210d/1
Since you need to pivot on two columns and use different aggregates on both columns, I would use aggregate functions with a CASE expression to get the result:
select
dt,
product,
sum(case when SellerName = 'amazon' then price else 0 end) AmazonPrice,
max(case when SellerName = 'amazon' then ShippingTime end) AmazonShippingTime,
sum(case when SellerName = 'ebay' then price else 0 end) ebayPrice,
max(case when SellerName = 'ebay' then ShippingTime end) ebayShippingTime
from product_price
group by dt, product;
See SQL Fiddle with Demo. This gives a result:
| DT | PRODUCT | AMAZONPRICE | AMAZONSHIPPINGTIME | EBAYPRICE | EBAYSHIPPINGTIME |
|------------|---------|-------------|--------------------|-----------|------------------|
| 2012-01-16 | AA | 40 | 4 days | 27 | 1 days |

Three column SQL PIVOT

How do I do a sql pivot of data that looks like this, USING the SQL PIVOT command ?
id | field | value
---------------------------------------
1 | year | 2011
1 | month | August
2 | year | 2009
1 | day | 21
2 | day | 31
2 | month | July
3 | year | 2010
3 | month | January
3 | day | NULL
Into something that looks like this:
id | year | month | day
-----------------------------
1 2011 August 21
2 2010 July 31
3 2009 January NULL
Try something like this:
DECLARE @myTable AS TABLE([ID] INT, [Field] VARCHAR(20), [Value] VARCHAR(20))
INSERT INTO @myTable VALUES ('1', 'year', '2011')
INSERT INTO @myTable VALUES ('1', 'month', 'August')
INSERT INTO @myTable VALUES ('2', 'year', '2009')
INSERT INTO @myTable VALUES ('1', 'day', '21')
INSERT INTO @myTable VALUES ('2', 'day', '31')
INSERT INTO @myTable VALUES ('2', 'month', 'July')
INSERT INTO @myTable VALUES ('3', 'year', '2010')
INSERT INTO @myTable VALUES ('3', 'month', 'January')
INSERT INTO @myTable VALUES ('3', 'day', NULL)
SELECT [ID], [year], [month], [day]
FROM
(
SELECT [ID], [Field], [Value] FROM @myTable
) t
PIVOT
(
MIN([Value]) FOR [Field] IN ([year], [month], [day])
) AS pvt
ORDER BY pvt.[year] DESC
Which will yield results of:
ID year month day
1 2011 August 21
3 2010 January NULL
2 2009 July 31
;WITH DATA(id,field,value) AS
(
SELECT 1,'year','2011' UNION ALL
SELECT 1,'month','August' UNION ALL
SELECT 2,'year','2009' UNION ALL
SELECT 1,'day ','21' UNION ALL
SELECT 2,'day ','31' UNION ALL
SELECT 2,'month','July' UNION ALL
SELECT 3,'year','2010' UNION ALL
SELECT 3,'month','January' UNION ALL
SELECT 3,'day ',NULL
)
SELECT id,
year,
month,
day
FROM DATA PIVOT (MAX(value) FOR field IN ([year], [month], [day])) AS Pvt
SELECT
id,
MAX(CASE WHEN RK=3 THEN VAL ELSE '' END) AS "YEAR",
MAX(CASE WHEN RK=2 THEN VAL ELSE '' END) AS "MONTH",
MAX(CASE WHEN RK=1 THEN VAL ELSE '' END) AS "DAY"
FROM
(
SELECT
ID,
ROW_NUMBER() OVER(PARTITION BY ID ORDER BY YEAR1 ASC) RK,
VAL
FROM TEST3) A
GROUP BY id
ORDER BY id;

Grouping by financial year and applying these groups as filters

I am looking to set up the following 7 groups into which customers would fall:
Non-purchaser (never bought from us)
New purchaser (purchased for the first time within the current financial year)
Reactivated purchaser (purchased in the current financial year, and also in the 2nd most recent year)
Lapsed purchaser (purchased in the prior financial year but not the current one)
2 yr Consecutive purchaser (has purchased in the current financial year and the most recent one)
3-4 yr consecutive purchaser (has purchased in every year for the last 3 or 4 financial years)
5+ year consecutive purchaser (has purchased in every financial year for a minimum of 5 years)
The financial year I would be using runs from 1st April to 31st March, and I would use the following tables:
purchaser (including id (primary key))
purchases (date_purchased, purchases_purchaser_id)
The tables are joined on purchaser_id = purchases_purchaser_id, and each purchaser can have multiple purchases within any financial year (so they could presumably be grouped by year as well).
It's been driving me mad so any help would be majorly appreciated!!!
Thanks,
Davin
Here is a dynamic version
Declare @currentYear int
Declare @OlderThan5yrs datetime
Set @currentYear = Year(GetDate()) - Case When month(GetDate())<4 then 1 else 0 end
Set @OlderThan5yrs = cast(cast( @currentYear-5 as varchar(4))+'/04/01' as datetime)
Select p.pName,
p.purchaser_id,
isNull(a.[5+YrAgo],0) as [5+YrAgo],
isNull(a.[4YrAgo], 0) as [4YrAgo],
isNull(a.[3YrAgo], 0) as [3YrAgo],
isNull(a.[2YrAgo], 0) as [2YrAgo],
isNull(a.[1YrAgo], 0) as [1YrAgo],
isNull(a.[CurYr], 0) as [CurYr],
isNull(a.Category, 'Non-purchaser (ever)') as Category
From purchasers p
Left Join
(
Select purchases_purchaser_id,
[5] as [5+YrAgo],
[4] as [4YrAgo],
[3] as [3YrAgo],
[2] as [2YrAgo],
[1] as [1YrAgo],
[0] as [CurYr],
Case When [4]+[3]+[2]+[1]+[0] = 5 Then '5+ year consecutive'
When [2]+[1]+[0] = 3 Then '3-4 yr consecutive'
When [1]+[0] = 2 Then '2 yr Consecutive'
When [1]=1 and [0]=0 Then 'Lapsed'
When [2]=1 and [1]=0 and [0]=1 Then 'Reactivated'
When [4]+[3]+[2]+[1]=0 and [0]=1 Then 'New'
When [4]+[3]+[2]+[1]+[0] = 0 Then 'Non-purchaser (last 5 yrs)'
Else 'non categorized'
End as Category
From (
Select purchases_purchaser_id,
Case When date_purchased < @OlderThan5yrs Then 5
Else @currentYear - Year(date_purchased)+ Case When month(date_purchased)<4 Then 1 else 0 end
end as fiscalYear, count(*) as nPurchases
From purchases
Group by purchases_purchaser_id,
Case When date_purchased < @OlderThan5yrs Then 5
Else @currentYear - Year(date_purchased)+ Case When month(date_purchased)<4 Then 1 else 0 end
end
) as AggData
PIVOT ( count(nPurchases) for fiscalYear in ([5],[4],[3],[2],[1],[0]) ) pvt
) as a
on p.purchaser_id=a.purchases_purchaser_id
UPDATED:
Here is result with data I inserted in previous query (You will have to add # to table names in the query).
pName purchaser_id 5+YrAgo 4YrAgo 3YrAgo 2YrAgo 1YrAgo CurYr Category
-------------------- ------------ ------- ------ ------ ------ ------ ----- --------------------------
Non-purchaser 0 0 0 0 0 0 0 Non-purchaser (ever)
New purchaser 1 0 0 0 0 0 1 New
Reactivated 2 0 0 1 1 0 1 Reactivated
Lapsed 3 0 0 0 1 1 0 Lapsed
2 yr Consecutive 4 0 0 0 0 1 1 2 yr Consecutive
3 yr consecutive 5 0 0 0 1 1 1 3-4 yr consecutive
4 yr consecutive 6 0 0 1 1 1 1 3-4 yr consecutive
5+ year consecutive 7 1 1 1 1 1 1 5+ year consecutive
Uncategorized 8 0 0 1 0 0 0 non categorized
old one 9 1 0 0 0 0 0 Non-purchaser (last 5 yrs)
You also don't need columns [5+YrAgo], [4YrAgo], [3YrAgo], [2YrAgo], [1YrAgo] and [CurYr].
I added them to be easier to check query logic.
UPDATE 2
Below is the query you asked for in the comments.
Note
The table structures I've used in the query are:
Table purchasers ( purchaser_id int, pName varchar(20))
Table purchases (purchases_purchaser_id int, date_purchased datetime)
and there is a foreign key on purchases (purchases_purchaser_id) referencing purchasers (purchaser_id).
;With AggData as (
Select purchases_purchaser_id,
Case When [4]+[3]+[2]+[1]+[0] = 5 Then 1 end as [Consec5],
Case When [4]=0 and [2]+[1]+[0] = 3 Then 1 end as [Consec34],
Case When [2]=0 and [1]+[0] = 2 Then 1 end as [Consec2],
Case When [1]=1 and [0]=0 Then 1 end as [Lapsed],
Case When [2]=1 and [1]=0 and [0]=1 Then 1 end as [Reactivated],
Case When [4]+[3]+[2]+[1]=0 and [0]=1 Then 1 end as [New],
Case When [4]+[3]+[2]>0 and [1]+[0]=0 Then 1 end as [Uncateg]
From (
Select purchases_purchaser_id,
@currentYear - Year(date_purchased) + Case When month(date_purchased)<4 Then 1 else 0 end as fiscalYear,
count(*) as nPurchases
From purchases
Where date_purchased >= @OlderThan5yrs
Group by purchases_purchaser_id,
@currentYear - Year(date_purchased) + Case When month(date_purchased)<4 Then 1 else 0 end
) as AggData
PIVOT ( count(nPurchases) for fiscalYear in ([4],[3],[2],[1],[0]) ) pvt
)
Select count([Consec5]) as [Consec5],
count([Consec34]) as [Consec34],
count([Consec2]) as [Consec2],
count([Lapsed]) as [Lapsed],
count([Reactivated]) as [Reactivated],
count([New]) as [New],
count(*)-count(a.purchases_purchaser_id) as [Non],
count([Uncateg]) as [Uncateg]
From purchasers p
Left Join AggData as a
on p.purchaser_id=a.purchases_purchaser_id
Result (With test data from previous post)
Consec5 Consec34 Consec2 Lapsed Reactivated New Non Uncateg
------- -------- ------- ------ ----------- --- --- -------
1 2 1 1 1 1 2 1
MS SQL Server (works on 2005 and 2008; PIVOT is not available in SQL Server 2000)
SET NOCOUNT ON
CREATE TABLE #purchasers (purchaser_id int, pName varchar(20))
Insert Into #purchasers values (0, 'Non-purchaser')
Insert Into #purchasers values (1, 'New purchaser')
Insert Into #purchasers values (2, 'Reactivated')
Insert Into #purchasers values (3, 'Lapsed')
Insert Into #purchasers values (4, '2 yr Consecutive')
Insert Into #purchasers values (5, '3 yr consecutive')
Insert Into #purchasers values (6, '4 yr consecutive')
Insert Into #purchasers values (7, '5+ year consecutive')
Insert Into #purchasers values (8, 'Uncategorized')
Insert Into #purchasers values (9, 'old one')
CREATE TABLE #purchases (date_purchased datetime, purchases_purchaser_id int)
Insert Into #purchases values ('2010/05/03', 1)
Insert Into #purchases values ('2007/05/03', 2)
Insert Into #purchases values ('2008/05/03', 2)
Insert Into #purchases values ('2010/05/03', 2)
Insert Into #purchases values ('2008/05/03', 3)
Insert Into #purchases values ('2009/05/03', 3)
Insert Into #purchases values ('2009/05/03', 4)
Insert Into #purchases values ('2010/05/03', 4)
Insert Into #purchases values ('2008/05/03', 5)
Insert Into #purchases values ('2009/05/03', 5)
Insert Into #purchases values ('2010/05/03', 5)
Insert Into #purchases values ('2007/05/03', 6)
Insert Into #purchases values ('2008/05/03', 6)
Insert Into #purchases values ('2009/05/03', 6)
Insert Into #purchases values ('2010/05/03', 6)
Insert Into #purchases values ('2004/05/03', 7)
Insert Into #purchases values ('2005/05/03', 7)
Insert Into #purchases values ('2006/05/03', 7)
Insert Into #purchases values ('2007/05/03', 7)
Insert Into #purchases values ('2008/05/03', 7)
Insert Into #purchases values ('2009/05/03', 7)
Insert Into #purchases values ('2009/05/03', 7)
Insert Into #purchases values ('2009/05/03', 7)
Insert Into #purchases values ('2010/05/03', 7)
Insert Into #purchases values ('2007/05/03', 8)
Insert Into #purchases values ('2000/05/03', 9)
Select p.pName,
p.purchaser_id,
isNull(a.[2005],0) as [Bef.2006],
isNull(a.[2006],0) as [2006],
isNull(a.[2007],0) as [2007],
isNull(a.[2008],0) as [2008],
isNull(a.[2009],0) as [2009],
isNull(a.[2010],0) as [2010],
isNull(a.Category, 'Non-purchaser') as Category
From #purchasers p
Left Join
(
Select purchases_purchaser_id, [2005],[2006],[2007],[2008],[2009],[2010],
Case When [2006]+[2007]+[2008]+[2009]+[2010] = 5 Then '5+ year consecutive'
When [2008]+[2009]+[2010] = 3 Then '3-4 yr consecutive'
When [2009]+[2010] = 2 Then '2 yr Consecutive'
When [2009]=1 and [2010]=0 Then 'Lapsed'
When [2008]=1 and [2009]=0 and [2010]=1 Then 'Reactivated'
When [2006]+[2007]+[2008]+[2009]=0 and [2010]=1 Then 'New'
When [2006]+[2007]+[2008]+[2009]+[2010] = 0 Then 'Non-purchaser in last 5 yrs'
Else 'non categorized'
End as Category
From (
Select purchases_purchaser_id,
Case When date_purchased < '2006/04/01' Then 2005
Else Year(date_purchased)- Case When month(date_purchased)<4 Then 1 else 0 end
end as fiscalYear, count(*) as nPurchases
From #purchases
Group by purchases_purchaser_id,
Case When date_purchased < '2006/04/01' Then 2005
Else Year(date_purchased)- Case When month(date_purchased)<4 Then 1 else 0 end
end
) as AggData
PIVOT ( count(nPurchases) for fiscalYear in ([2005],[2006],[2007],[2008],[2009],[2010]) ) pvt
) as a
on p.purchaser_id=a.purchases_purchaser_id
Although it COULD be done a bit more easily with another table of date ranges for the 5 fiscal years, I have hard-coded the from/to date references for your query, and it appears to be working...
The INNER select pre-gathers a "flag" for 1 or more purchases within each date range, e.g. Apr 1, 2010 ("20100401" for date conversion) to Mar 31, 2011 ("20110331"), cycling back through the last 5 years... Additionally, there is a flag for ANY purchase in the purchases table, to tell a "never purchased" customer from someone whose purchases are 6, 7 or more years old...
That query basically creates a cross-tab of the individual years where activity has occurred. I can then test from the most detailed criteria down to the least to get a caption for each classification...
I converted this from another SQL dialect as best I could to comply with SQL Server syntax (mostly the date conversion), but otherwise the principle and queries do work... The final classification column is character, but you can supersede it with whatever you want.
SELECT
id,
CASE
WHEN year1 + year2 + year3 + year4 + year5 = 5 THEN '5+yrs '
WHEN year1 + year2 + year3 + year4 >= 3 THEN '3-4yrs'
WHEN year1 + year2 = 2 THEN '2yrs '
WHEN year1 = 1 AND year2 = 0 AND year3 = 1 THEN 'Reacti'
WHEN year1 = 1 THEN 'New '
WHEN year1 = 0 AND year2 = 1 THEN 'Lapsed'
WHEN AnyPurchase = 1 THEN 'over5'
ELSE 'never'
END AS BuyerClassification
FROM
( SELECT
purchaser.id,
MAX( CASE WHEN date_purchased >= CONVERT( Date, '20100401', 112 )
AND date_purchased <= CONVERT( Date, '20110331', 112 )
THEN 1 ELSE 0 END ) Year1,
MAX( CASE WHEN date_purchased >= CONVERT( Date, '20090401', 112 )
AND date_purchased <= CONVERT( Date, '20100331', 112 )
THEN 1 ELSE 0 END ) Year2,
MAX( CASE WHEN date_purchased >= CONVERT( Date, '20080401', 112 )
AND date_purchased <= CONVERT( Date, '20090331', 112 )
THEN 1 ELSE 0 END ) Year3,
MAX( CASE WHEN date_purchased >= CONVERT( Date, '20070401', 112 )
AND date_purchased <= CONVERT( Date, '20080331', 112 )
THEN 1 ELSE 0 END ) Year4,
MAX( CASE WHEN date_purchased >= CONVERT( Date, '20060401', 112 )
AND date_purchased <= CONVERT( Date, '20070331', 112 )
THEN 1 ELSE 0 END ) Year5,
MAX( CASE WHEN date_purchased <= CONVERT( Date, '20100401', 112 )
THEN 1 ELSE 0 END ) AnyPurchase
FROM
purchaser LEFT OUTER JOIN purchases
ON purchaser.id = purchases.purchases_purchaser_id
GROUP BY
purchaser.id ) PreGroup1
EDIT --
fixed parens via syntax conversion and missed it...
The GROUP BY is on the first column in the query, which is the purchaser's ID from the purchaser table (written as purchaser.id because SQL Server does not accept the positional "GROUP BY 1" form). Doing a left outer join guarantees all possible people in the purchaser table are returned, regardless of whether they have any actual purchases. "PreGroup1" is the alias of the inner select statement, in case you want to do other joins in the outermost select, which is where the year values are tested for classification.
Although it will work, it may not be as efficient as what others have chimed in with after analyzing the query, but it may open your mind to some querying and aggregating techniques. This process basically creates a sort of cross-tab by using a CASE/WHEN construct in the inner select, with the final classification done in the outermost select.
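The "table of date ranges" alternative mentioned at the start of this answer could be sketched as follows, so that the five fiscal years do not need to be hard-coded. This assumes SQL Server 2008 or later and the purchases table from the question; the names @fyStart, fy, yearsAgo, fyFrom and fyNext are hypothetical and chosen for this illustration.

```sql
-- Start of the current fiscal year (1st April), derived from today's date.
DECLARE @fyStart date =
    DATEADD(YEAR,
            YEAR(GETDATE()) - CASE WHEN MONTH(GETDATE()) < 4 THEN 1 ELSE 0 END - 1900,
            '19000401');

WITH fy AS (
    -- one row per fiscal year, as a half-open range [fyFrom, fyNext)
    SELECT n AS yearsAgo,
           DATEADD(YEAR, -n, @fyStart)    AS fyFrom,
           DATEADD(YEAR, 1 - n, @fyStart) AS fyNext
    FROM (VALUES (0), (1), (2), (3), (4)) AS v(n)
)
SELECT pu.purchases_purchaser_id,
       fy.yearsAgo,
       COUNT(*) AS nPurchases
FROM purchases AS pu
JOIN fy
  ON pu.date_purchased >= fy.fyFrom
 AND pu.date_purchased <  fy.fyNext
GROUP BY pu.purchases_purchaser_id, fy.yearsAgo;
```

The result has one row per purchaser and fiscal year with activity, which can then be pivoted or aggregated with CASE expressions in the same way as the queries above.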