SQL- Cumulative sum based on condition - sql

I have a scenario in which I have to calculate a counter based on the data below. If the status is A, B, or C, the counter should be 0, which is working fine.
If the status is D, the counter should be a cumulative sum, with the exception that if the status changes in between (as in 201907), the counter should reset and start again with 1, 2, 3 and so on. Any help is appreciated.
Input - 3 columns - Customer_No, Date, Status
CUSTOMER_NO Date STATUS
1234 201901 A
1234 201902 B
1234 201903 C
1234 201904 D
1234 201905 D
1234 201906 D
1234 201907 C
1234 201908 D
1234 201910 D
1234 201911 D
1234 201912 D
Expected output - input columns + Counter column
CUSTOMER_NO Date STATUS COUNTER
----------------------------------------
1234 201901 A 0
1234 201902 B 0
1234 201903 C 0
1234 201904 D 1
1234 201905 D 2
1234 201906 D 3
1234 201907 C 0
1234 201908 D 1
1234 201910 D 2
1234 201911 D 3
1234 201912 D 4
Thanks

You can create a numbering like a serial number for the ordering purpose using the ROW_NUMBER() function as shown below.
create table SampleData(CUSTOMER_NO int
, STATUS char(1)
, COUNTER int)
insert into SampleData Values
(1234, 'A', 0),
(1234, 'B', 0),
(1234, 'C', 0),
(1234, 'D', 1),
(1234, 'D', 2),
(1234, 'D', 3),
(1234, 'C', 0),
(1234, 'D', 1),
(1234, 'D', 2),
(1234, 'D', 3),
(1234, 'D', 4)
;with cte as(
Select *
, ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS RN
from SampleData
)
select CUSTOMER_NO
, STATUS
, COUNTER
, (SELECT SUM(case STATUS when 'D' then Counter else 0 end) FROM cte t2 WHERE t2.RN <= cte.RN) AS Needed
from cte
Live db<>fiddle demo.

This is a similar approach to Gordon's; however, it uses CTEs and ROW_NUMBER to make the islands first, and then returns 0 if there is only 1 row in that island, using a windowed COUNT and a CASE expression:
WITH Grps AS(
SELECT ID,
CUSTOMER_NO,
[STATUS],
ROW_NUMBER() OVER (PARTITION BY CUSTOMER_NO ORDER BY ID) -
ROW_NUMBER() OVER (PARTITION BY CUSTOMER_NO, [STATUS] ORDER BY ID) AS Grp
FROM (VALUES(1,1234,'A'),
(2,1234,'B'),
(3,1234,'C'),
(4,1234,'D'),
(5,1234,'D'),
(6,1234,'D'),
(7,1234,'C'),
(8,1234,'D'),
(9,1234,'D'),
(10,1234,'D'),
(11,1234,'D'))V(ID,CUSTOMER_NO,[STATUS]))
SELECT ID,
CUSTOMER_NO,
[STATUS],
Grp,
CASE WHEN COUNT(ID) OVER (PARTITION BY CUSTOMER_NO, [STATUS], Grp) = 1 THEN 0
ELSE ROW_NUMBER() OVER (PARTITION BY CUSTOMER_NO, [STATUS], Grp ORDER BY ID) - 1
END AS [COUNTER]
FROM Grps;
As Gordon also mentioned, if you don't have some kind of sequential ID/key, you can't do this with your data. You will need to implement some kind of sequential ID, and hope that your data retains its "insert order".

This is a variant of a gaps-and-islands problem. For this particular incarnation, you can identify the islands by counting the number of non-D statuses before a given row.
After identifying the groups, use case and row_number():
select t.*,
(case when status = 'D'
then row_number() over (partition by customer_no, grp, status order by date)
else 0
end) as counter
from (select t.*,
sum(case when status <> 'D' then 1 else 0 end) over (partition by customer_no order by date) as grp
from t
) t
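As a rough check of the query above, here is a minimal sketch that runs it against the question's sample data. It assumes SQLite (3.25 or later, for window functions) driven from Python's `sqlite3` module rather than SQL Server; the table name `t` follows the answer, and the helper name `run_counter_query` is made up for illustration.

```python
import sqlite3

# Sample rows from the question: (customer_no, date, status).
ROWS = [
    (1234, 201901, "A"), (1234, 201902, "B"), (1234, 201903, "C"),
    (1234, 201904, "D"), (1234, 201905, "D"), (1234, 201906, "D"),
    (1234, 201907, "C"), (1234, 201908, "D"), (1234, 201910, "D"),
    (1234, 201911, "D"), (1234, 201912, "D"),
]

# grp counts the non-D rows seen so far, so every run of D rows between two
# non-D rows shares one grp value; ROW_NUMBER then restarts per run.
QUERY = """
SELECT customer_no, date, status,
       CASE WHEN status = 'D'
            THEN ROW_NUMBER() OVER (PARTITION BY customer_no, grp, status
                                    ORDER BY date)
            ELSE 0
       END AS counter
FROM (SELECT t.*,
             SUM(CASE WHEN status <> 'D' THEN 1 ELSE 0 END)
                 OVER (PARTITION BY customer_no ORDER BY date) AS grp
      FROM t) AS t
ORDER BY customer_no, date
"""


def run_counter_query():
    """Load the sample rows into an in-memory database and run the query."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (customer_no INTEGER, date INTEGER, status TEXT)")
    conn.executemany("INSERT INTO t VALUES (?, ?, ?)", ROWS)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


counters = [row[3] for row in run_counter_query()]
print(counters)
```

The printed list should line up with the COUNTER column in the question's expected output.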

Related

Query: How can I enumerate rows based on a condition?

Having the following table:
DROP TABLE IF EXISTS #Data
CREATE TABLE #Data
(
Code VARCHAR(10),
Fee VARCHAR(3),
AValue DECIMAL(10, 4)
)
-- DELETE FROM #Data
INSERT INTO #Data
VALUES
('A001', '001', 100), ('A001', '002', 200), ('A001', '003', -50), ('A001', '004', -250), ('A001', '005', 340), ('A001', '006', 500), ('A001', '007', 600)
I need to get the following result (an ordered sequence based on whether the value is still positive or negative):
Code Fee Value Row
A001 001 100.0000 P1
A001 002 200.0000 P1
A001 003 -50.0000 N1
A001 004 -250.0000 N1
A001 005 340.0000 P2
A001 006 500.0000 P2
A001 007 600.0000 P2
I tried this:
SELECT Code, Fee, AValue, ROW_NUMBER() OVER(PARTITION BY Code, (CASE WHEN AValue > 0 THEN 1 ELSE 2 END) ORDER BY Fee) 'nRow',
FORMAT(ROW_NUMBER() OVER(PARTITION BY Code, Fee, CASE WHEN AValue > 0 THEN 1 ELSE 2 END ORDER BY Fee), CASE WHEN AValue > 0 THEN 'POS00' ELSE 'NEG00' END)
FROM #Data
But it returns:
Code Fee Value Row
A001 001 100.0000 P1
A001 002 200.0000 P1
A001 003 -50.0000 N1
A001 004 -250.0000 N1
A001 005 340.0000 P1
This is a type of gaps-and-islands problem, but it is tricky. Assuming you have no 0 values, then sign() is your friend.
Here is an approach that uses the fact that the difference of row numbers is constant when values on adjacent rows should be combined:
SELECT Code, Fee, AValue,
DENSE_RANK() OVER (PARTITION BY sign(avalue) ORDER BY seqnum - seqnum_2) as num
FROM (SELECT d.*,
ROW_NUMBER() OVER (PARTITION BY code, sign(avalue) ORDER BY fee) as seqnum_2,
ROW_NUMBER() OVER (PARTITION BY code ORDER BY fee) as seqnum
FROM Data d
) d
ORDER BY Code, Fee;
You can incorporate this into your string using CONCAT() or whatever.
Here is a db<>fiddle.
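To illustrate how the rank could be folded into the P/N label, here is a small sketch under a few assumptions: it runs on SQLite (3.25 or later) through Python's `sqlite3`, uses `avalue > 0` in place of `sign()` (which is not available in every SQLite build), uses `||` instead of CONCAT(), and also partitions the outer DENSE_RANK by `code`, which the sample data does not strictly need. The table name `data` is a stand-in for the temp table.

```python
import sqlite3

# Sample rows from the question: (code, fee, avalue).
ROWS = [
    ("A001", "001", 100.0), ("A001", "002", 200.0), ("A001", "003", -50.0),
    ("A001", "004", -250.0), ("A001", "005", 340.0), ("A001", "006", 500.0),
    ("A001", "007", 600.0),
]

# seqnum - seqnum_2 stays constant within one run of same-signed values,
# so ranking that difference per sign numbers the runs 1, 2, 3, ...
QUERY = """
SELECT code, fee, avalue,
       CASE WHEN avalue > 0 THEN 'P' ELSE 'N' END
       || DENSE_RANK() OVER (PARTITION BY code, avalue > 0
                             ORDER BY seqnum - seqnum_2) AS label
FROM (SELECT d.*,
             ROW_NUMBER() OVER (PARTITION BY code, avalue > 0 ORDER BY fee) AS seqnum_2,
             ROW_NUMBER() OVER (PARTITION BY code ORDER BY fee) AS seqnum
      FROM data AS d) AS d
ORDER BY code, fee
"""

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE data (code TEXT, fee TEXT, avalue REAL)")
conn.executemany("INSERT INTO data VALUES (?, ?, ?)", ROWS)
labels = [row[3] for row in conn.execute(QUERY)]
conn.close()
print(labels)
```

Like the answer, this assumes there are no 0 values; a zero would fall into the 'N' branch here.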
This is somewhat of a blind guess, but perhaps this is what you are after:
WITH Grps AS(
SELECT Code,
Fee,
AValue,
ROW_NUMBER() OVER (ORDER BY Fee) -
ROW_NUMBER() OVER (PARTITION BY CASE WHEN Avalue > 0 THEN 1 WHEN Avalue < 0 THEN -1 END ORDER BY Fee) AS Grp
FROM #Data)
SELECT Code,
Fee,
AValue,
CONCAT(CASE WHEN Avalue > 0 THEN 'P' WHEN Avalue < 0 THEN 'N' ELSE 'Z' END,
DENSE_RANK() OVER (PARTITION BY CASE WHEN Avalue > 0 THEN 1 WHEN Avalue < 0 THEN -1 END ORDER BY Grp)) AS Row
FROM Grps
ORDER BY Fee;

Issue with Row_Number() Over Partition

I've been trying to reset the row_number when the value changes in the Value column, and I have no idea how I should do this.
This is my SQL snippet:
WITH Sch(SubjectID, VisitID, Scheduled,Actual,UserId,RLev,SubjectTransactionID,SubjectTransactionTypeID,TransactionDateUTC,MissedVisit,FieldId,Value) as
(
select
svs.*,
CASE WHEN stdp.FieldID = 'FrequencyRegime' and svs.SubjectTransactionTypeID in (2,3) THEN
stdp.FieldID
WHEN stdp.FieldID is NULL and svs.SubjectTransactionTypeID = 1
THEN NULL
WHEN stdp.FieldID is NULL
THEN 'FrequencyRegime'
ELSE stdp.FieldID
END AS [FieldID],
CASE WHEN stdp.Value is NULL and svs.SubjectTransactionTypeID = 1
THEN NULL
WHEN stdp.Value IS NULL THEN
(SELECT TOP 1 stdp.Value from SubjectTransaction st
JOIN SubjectTransactionDataPoint STDP on stdp.SubjectTransactionID = st.SubjectTransactionID and stdp.FieldID = 'FrequencyRegime'
where st.SubjectID = svs.SubjectID
order by st.ServerDateST desc)
ELSE stdp.Value END AS [Value]
from SubjectVisitSchedule svs
left join SubjectTransactionDataPoint stdp on svs.SubjectTransactionID = stdp.SubjectTransactionID and stdp.FieldID = 'FrequencyRegime'
)
select
Sch.*,
CASE WHEN sch.Value is not NULL THEN
ROW_NUMBER() over(partition by Sch.Value, Sch.SubjectID order by Sch.SubjectID, Sch.VisitID)
ELSE NULL
END as [FrequencyCounter],
CASE WHEN Sch.Value = 1 THEN 1--v.Quantity
WHEN Sch.Value = 2 and (ROW_NUMBER() over(partition by Sch.Value, Sch.SubjectID order by Sch.SubjectID, Sch.VisitID) % 2) <> 0
THEN 0
WHEN Sch.Value = 2 and (ROW_NUMBER() over(partition by Sch.Value, Sch.SubjectID order by Sch.SubjectID, Sch.VisitID) % 2) = 0
THEN 1
ELSE NULL
END AS [DispenseQuantity]
from Sch
--left join VisitDrugAssignment v on v.VisitID = Sch.VisitID
where SubjectID = '4E80718E-D0D8-4250-B5CF-02B7A259CAC4'
order by SubjectID, VisitID
This is my Dataset:
Based on the dataset, I am trying to reset the FrequencyCounter to 1 every time the value changes for each subject. Right now it does 50% of what I want: it counts when the value 1 or 2 is found, but when value 1 comes again after value 2, it continues the count from where it left off. I want the count to start from the beginning every time the value changes.
It's difficult to reproduce and test without sample data, but if you want to know how to number rows based on a change in a column value, the following approach may help. It's probably not the best one, but it will at least give you a good start. Of course, I hope I understand your question correctly.
Data:
CREATE TABLE #Data (
[Id] int,
[Subject] varchar(3),
[Value] int
)
INSERT INTO #Data
([Id], [Subject], [Value])
VALUES
(1, '801', 1),
(2, '801', 2),
(3, '801', 2),
(4, '801', 2),
(5, '801', 1),
(6, '801', 2),
(7, '801', 2),
(8, '801', 2)
Statement:
;WITH ChangesCTE AS (
SELECT
*,
CASE
WHEN LAG([Value]) OVER (PARTITION BY [Subject] ORDER BY [Id]) <> [Value] THEN 1
ELSE 0
END AS [Change]
FROM #Data
), GroupsCTE AS (
SELECT
*,
SUM([Change]) OVER (PARTITION BY [Subject] ORDER BY [Id]) AS [GroupID]
FROM ChangesCTE
)
SELECT
*,
ROW_NUMBER() OVER (PARTITION BY [Subject], [GroupID] ORDER BY [Id]) AS Rn
FROM GroupsCTE
Result:
--------------------------------------
Id Subject Value Change GroupID Rn
--------------------------------------
1 801 1 0 0 1
2 801 2 1 1 1
3 801 2 0 1 2
4 801 2 0 1 3
5 801 1 1 2 1
6 801 2 1 3 1
7 801 2 0 3 2
8 801 2 0 3 3
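The same LAG-then-running-SUM idea can be tried outside SQL Server. The sketch below is an illustration only: it assumes SQLite (3.25 or later) via Python's `sqlite3`, a plain table named `data`, and one extra subject '802' that is not in the answer's sample. The final ROW_NUMBER is partitioned by subject as well as group, so that group ids from different subjects cannot collide.

```python
import sqlite3

# Ids 1-8 come from the answer's sample; subject '802' is a hypothetical addition.
ROWS = [
    (1, "801", 1), (2, "801", 2), (3, "801", 2), (4, "801", 2),
    (5, "801", 1), (6, "801", 2), (7, "801", 2), (8, "801", 2),
    (9, "802", 2), (10, "802", 2),
]

QUERY = """
WITH changes AS (
    -- Flag a row with 1 when its value differs from the previous row's value.
    SELECT id, subject, value,
           CASE WHEN LAG(value) OVER (PARTITION BY subject ORDER BY id) <> value
                THEN 1 ELSE 0 END AS changed
    FROM data
), grouped AS (
    -- A running total of the flags gives each run of equal values its own id.
    SELECT *, SUM(changed) OVER (PARTITION BY subject ORDER BY id) AS group_id
    FROM changes
)
SELECT id, subject, value, group_id,
       ROW_NUMBER() OVER (PARTITION BY subject, group_id ORDER BY id) AS rn
FROM grouped
ORDER BY id
"""

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE data (id INTEGER, subject TEXT, value INTEGER)")
conn.executemany("INSERT INTO data VALUES (?, ?, ?)", ROWS)
row_numbers = [row[4] for row in conn.execute(QUERY)]
conn.close()
print(row_numbers)
```

For the first row of each subject LAG returns NULL, the comparison is NULL, and the CASE falls through to 0, which is why each subject starts at group 0.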
As I understand it, you need DENSE_RANK, since you want the row number to change only when the value changes. The syntax is as follows:
WITH your_table(your_column)
AS
(
SELECT 2 UNION ALL
SELECT 10 UNION ALL
SELECT 2 UNION ALL
SELECT 11
)
SELECT *,DENSE_RANK() OVER (ORDER BY your_column)
FROM your_table

SQL How do I transpose and group data into static columns? [duplicate]

I have a table with the following data:
UID LAST FIRST FUND AMOUNT STATUS
1 Smith John C 100 1
1 Smith John B 250 1
1 Smith John E 150 1
2 Jones Meg B 275 1
2 Jones Meg F 150 1
3 Carter Bill A 100 1
I would like to transpose the FUND, AMOUNT and STATUS values for each UID into a single row for each UID. The resulting table would have columns added for FUND_1, AMT_1, STATUS_1, FUND_2, AMT_2, STATUS_2, FUND_3, AMT_3, STATUS_3. Each UID may or may not have a total of 3 funds. If they do not, the remaining fund, amt, and status columns are left blank. The resulting table would appear as:
UID LAST FIRST FUND_1 AMT_1 STATUS_1 FUND_2 AMT_2 STATUS_2 FUND_3 AMT_3 STATUS_3
1 Smith John C 100 1 B 250 1 E 150 1
2 Jones Meg B 275 1 F 150 1
3 Carter Bill A 100 1
For clarification, this is how the data would move from the existing table to the resulting table for UID 1:
It seems I am unable to use PIVOT because FUND_1, FUND_2, FUND_3 will be different fund categories for each person. The question "TSQL Pivot without aggregate function" helps but doesn't answer my question, since I have multiple rows in what would be the DBColumnName in that question.
This is a pretty common conditional aggregation. Notice how I posted consumable data as a table and insert statements. To be honest it took longer to do that portion than the actual code to select the data. You should do this in the future. Also you should avoid using keywords as column names.
declare @Something table
(
UID int
, LAST varchar(10)
, FIRST varchar(10)
, FUND char(1)
, AMOUNT int
, STATUS int
)
insert @Something values
(1, 'Smith', 'John', 'C', 100, 1)
, (1, 'Smith', 'John', 'B', 250, 1)
, (1, 'Smith', 'John', 'E', 150, 1)
, (2, 'Jones', 'Meg', 'B', 275, 1)
, (2, 'Jones', 'Meg', 'F', 150, 1)
, (3, 'Carter', 'Bill', 'A', 100, 1)
;
with SortedValues as
(
select *
-- order by (select null) numbers the rows of each UID in arbitrary order
, RowNum = ROW_NUMBER() over(partition by UID order by (select null))
from @Something
)
select UID
, Last
, First
, Fund_1 = max(case when RowNum = 1 then Fund end)
, Amt_1 = max(case when RowNum = 1 then Amount end)
, Status_1 = max(case when RowNum = 1 then Status end)
, Fund_2 = max(case when RowNum = 2 then Fund end)
, Amt_2 = max(case when RowNum = 2 then Amount end)
, Status_2 = max(case when RowNum = 2 then Status end)
, Fund_3 = max(case when RowNum = 3 then Fund end)
, Amt_3 = max(case when RowNum = 3 then Amount end)
, Status_3 = max(case when RowNum = 3 then Status end)
from SortedValues
group by UID
, Last
, First
order by UID
, Last
, First
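For anyone who wants to try the conditional aggregation without a SQL Server instance, here is a hedged sketch using SQLite (3.25 or later) through Python's `sqlite3`. Several things are assumptions: the table name `something`, the renamed columns `last_name` and `first_name`, and ordering by SQLite's implicit `rowid` so that the slots follow insertion order (the answer itself leaves the order unspecified).

```python
import sqlite3

# Sample rows from the question: (uid, last_name, first_name, fund, amount, status).
ROWS = [
    (1, "Smith", "John", "C", 100, 1),
    (1, "Smith", "John", "B", 250, 1),
    (1, "Smith", "John", "E", 150, 1),
    (2, "Jones", "Meg", "B", 275, 1),
    (2, "Jones", "Meg", "F", 150, 1),
    (3, "Carter", "Bill", "A", 100, 1),
]

# One MAX(CASE ...) per output column: slot n only sees the row numbered n,
# so MAX simply picks that single value (or NULL when the slot is empty).
SLOT_COLUMNS = ", ".join(
    f"MAX(CASE WHEN row_num = {n} THEN {col} END) AS {col}_{n}"
    for n in (1, 2, 3)
    for col in ("fund", "amount", "status")
)

QUERY = f"""
WITH sorted_values AS (
    SELECT *,
           ROW_NUMBER() OVER (PARTITION BY uid ORDER BY rowid) AS row_num
    FROM something
)
SELECT uid, last_name, first_name, {SLOT_COLUMNS}
FROM sorted_values
GROUP BY uid, last_name, first_name
ORDER BY uid
"""

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE something (uid INTEGER, last_name TEXT, first_name TEXT, "
    "fund TEXT, amount INTEGER, status INTEGER)"
)
conn.executemany("INSERT INTO something VALUES (?, ?, ?, ?, ?, ?)", ROWS)
pivoted = conn.execute(QUERY).fetchall()
conn.close()

for row in pivoted:
    print(row)
```

Unused slots come back as `None` (NULL), which corresponds to the blank cells in the desired output.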

SQL Server: how to find the record where a field is X for the first time and there are no later records where it isn't

I tried for quite some time now but cannot figure out how to best do this without using cursors. What I want to do (in SQL Server) is:
Find the earliest (by Date) record where Criterion=1 AND NOT followed by Criterion=0 for each Name and Category.
Or expressed differently:
Find the Date when Criterion turned 1 and not turned 0 again afterwards (for each Name and Category).
Some sort of CTE would seem to make sense I guess but that's not my strong suit unfortunately. So I tried nesting queries to find the latest record where Criterion=0 and then select the next record if there is one but I'm getting incorrect results. Another challenge with this is returning a record where there are only records with Criterion=1 for a Name and Category.
Here's the sample data:
Name Category Criterion Date
------------------------------------------------
Bob Cat1 1 22.11.16 08:54 X
Bob Cat2 0 21.02.16 02:29
Bob Cat3 1 22.11.16 08:55
Bob Cat3 0 22.11.16 08:56
Bob Cat4 0 21.06.12 02:30
Bob Cat4 0 18.11.16 08:18
Bob Cat4 1 18.11.16 08:19
Bob Cat4 0 22.11.16 08:20
Bob Cat4 1 22.11.16 08:50 X
Bob Cat4 1 22.11.16 08:51
Hannah Cat1 1 22.11.16 08:54 X
Hannah Cat2 0 21.02.16 02:29
Hannah Cat3 1 22.11.16 08:55
Hannah Cat3 0 22.11.16 08:56
The rows with an X after the row are the ones I want to retrieve.
It's probably not all that complicated in the end...
If you just want the name, category, and date:
select name, category, min(date)
from t
where criterion = 1 and
not exists (select 1
from t t2
where t2.name = t.name and t2.category = t.category and
t2.criterion = 0 and t2.date >= t.date
)
group by name, category;
There are fancier ways to get this information, but this is a relatively simple method.
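As a quick way to check the NOT EXISTS version against the question's sample rows, the following sketch runs it on SQLite through Python's `sqlite3`. The dates are rewritten as ISO strings so that text comparison orders them correctly; that conversion and the in-memory table are assumptions made for this example.

```python
import sqlite3

# Sample rows from the question: (name, category, criterion, date as ISO text).
ROWS = [
    ("Bob", "Cat1", 1, "2016-11-22 08:54"),
    ("Bob", "Cat2", 0, "2016-02-21 02:29"),
    ("Bob", "Cat3", 1, "2016-11-22 08:55"),
    ("Bob", "Cat3", 0, "2016-11-22 08:56"),
    ("Bob", "Cat4", 0, "2012-06-21 02:30"),
    ("Bob", "Cat4", 0, "2016-11-18 08:18"),
    ("Bob", "Cat4", 1, "2016-11-18 08:19"),
    ("Bob", "Cat4", 0, "2016-11-22 08:20"),
    ("Bob", "Cat4", 1, "2016-11-22 08:50"),
    ("Bob", "Cat4", 1, "2016-11-22 08:51"),
    ("Hannah", "Cat1", 1, "2016-11-22 08:54"),
    ("Hannah", "Cat2", 0, "2016-02-21 02:29"),
    ("Hannah", "Cat3", 1, "2016-11-22 08:55"),
    ("Hannah", "Cat3", 0, "2016-11-22 08:56"),
]

# Keep only criterion = 1 rows with no criterion = 0 row at the same or a
# later date, then take the earliest such row per name and category.
QUERY = """
SELECT name, category, MIN(date)
FROM t
WHERE criterion = 1
  AND NOT EXISTS (SELECT 1
                  FROM t AS t2
                  WHERE t2.name = t.name
                    AND t2.category = t.category
                    AND t2.criterion = 0
                    AND t2.date >= t.date)
GROUP BY name, category
ORDER BY name, category
"""

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (name TEXT, category TEXT, criterion INTEGER, date TEXT)")
conn.executemany("INSERT INTO t VALUES (?, ?, ?, ?)", ROWS)
first_stable = conn.execute(QUERY).fetchall()
conn.close()

for row in first_stable:
    print(row)
```

The three rows returned should be the ones marked with an X in the question.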
Actually, the fancier ways aren't particularly complicated:
select t.*
from (select t.*,
min(case when date > maxdate_0 or maxdate_0 is NULL then date end) over (partition by name, category) as mindate_1
from (select t.*,
max(case when criterion = 0 then date end) over (partition by name, category) as maxdate_0
from t
) t
where criterion = 1
) t
where mindate_1 = date;
EDIT:
SQL Fiddle doesn't seem to be working these days. The following is working for me (using Postgres):
with t(name, category, criterion, date) as (
values ('Bob', 'Cat1', 1, '2016-11-16 08:54'),
('Bob', 'Cat2', 0, '2016-02-21 02:29'),
('Bob', 'Cat3', 1, '2016-11-16 08:55'),
('Bob', 'Cat3', 0, '2016-11-16 08:56'),
('Bob', 'Cat4', 0, '2012-06-21 02:30'),
('Bob', 'Cat4', 0, '2016-11-18 08:18'),
('Bob', 'Cat4', 1, '2016-11-18 08:19'),
('Bob', 'Cat4', 0, '2016-11-22 08:20'),
('Bob', 'Cat4', 1, '2016-11-22 08:50'),
('Bob', 'Cat4', 1, '2016-11-22 08:51'),
('Hannah', 'Cat1', 1, '2016-11-22 08:54'),
('Hannah', 'Cat2', 0, '2016-02-21 02:29'),
('Hannah', 'Cat3', 1, '2016-11-22 08:55'),
('Hannah', 'Cat3', 0, '2016-11-22 08:56')
)
select t.*
from (select t.*,
min(case when date > maxdate_0 or maxdate_0 is NULL then date end) over (partition by name, category) as mindate_1
from (select t.*,
max(case when criterion = 0 then date end) over (partition by name, category) as maxdate_0
from t
) t
where criterion = 1
) t
where mindate_1 = date;
How about a left join, and filter the NULLs?
SELECT yt.Name, yt.Category, yt.Criterion, MIN(yt.Date) AS Date
FROM YourTable yt
LEFT JOIN YourTable lj ON lj.Name = yt.Name AND lj.Category = yt.Category AND
lj.Criterion != yt.Criterion AND lj.Date > yt.Date
WHERE yt.Criterion = 1 AND lj.Name IS NULL
GROUP BY yt.Name, yt.Category, yt.Criterion
There are tons of ways of doing it, especially with window functions. NOT EXISTS and the anti join are two of the better methods, but just for fun, here is one of the fancier (to steal Gordon's term) ways of doing it with window functions:
;WITH cte AS (
SELECT
Name
,Category
,CASE WHEN Criterion = 1 THEN Date END as Criterion1Date
,MAX(CASE WHEN Criterion = 0 THEN Date END) OVER (PARTITION BY Name, Category) as MaxDateCriterion0
FROM
Table
)
SELECT
Name
,Category
,MIN(Criterion1Date) as Date
FROM
cte
WHERE
ISNULL(MaxDateCriterion0,'1/1/1900') < Criterion1Date
GROUP BY
Name
,Category
Or use a derived table if you don't like CTEs; the only difference is that the CTE is nested in the FROM clause.
SELECT
Name
,Category
,MIN(Criterion1Date) as Date
FROM
(
SELECT
Name
,Category
,CASE WHEN Criterion = 1 THEN Date END as Criterion1Date
,MAX(CASE WHEN Criterion = 0 THEN Date END) OVER (PARTITION BY Name, Category) as MaxDateCriterion0
FROM
Table
) t
WHERE
ISNULL(MaxDateCriterion0,'1/1/1900') < Criterion1Date
GROUP BY
Name
,Category
Modified answer
select name,category
,min (date) as date
from (select name,category,criterion,date
,min (criterion) over
(
partition by name,category
order by date
rows between current row and unbounded following
) as min_following_criterion
from t
) t
where criterion = 1
and ( min_following_criterion <> 0
or min_following_criterion is null
)
group by name,category

How to COUNT rows according to specific complicated rules?

I have the following table:
custid custname channelid channel dateViewed
--------------------------------------------------------------
1 A 1 ABSS 2016-01-09
2 B 2 STHHG 2016-01-19
3 C 4 XGGTS 2016-01-09
6 D 4 XGGTS 2016-01-09
2 B 2 STHHG 2016-01-26
2 B 2 STHHG 2016-01-28
1 A 3 SSJ 2016-01-28
1 A 1 ABSS 2016-01-28
2 B 2 STHHG 2016-02-02
2 B 7 UUJKS 2016-02-10
2 B 8 AKKDC 2016-02-10
2 B 9 GGSK 2016-02-10
2 B 9 GGSK 2016-02-11
2 B 7 UUJKS 2016-02-27
And I want the results to be:
custid custname month count
------------------------------
1 A 1 1
2 B 1 1
2 B 2 4
3 C 1 1
6 D 1 1
According to the following rules:
All channel views subscription is billed every 15 days. If the
customer viewed the same channel within the 15 days, he will only be
billed once for that channel. For instance, for custid 2 (custname B), the billing cycles are 19 Jan - 3 Feb (one billing cycle), 4 Feb - 20 Feb (one billing cycle) and so on. Therefore, he is billed only once in Jan since he watched the same channel throughout the billing cycle, and he is billed 4 times in Feb: for watching channelid 7, 8 and 9, and for channelid 7 watched on 27 Feb (since this falls in another billing cycle, customer B is charged again). Customer B is not charged on 2 Feb for watching channel 2 since he was already billed in the 19 Jan - 3 Feb billing cycle.
An invoice is generated every month for each customer, therefore, the
results should show the 'Month' and the 'Count' of the channels
viewed for each customer.
Can this be done in SQL server?
;WITH cte AS (
SELECT custid,
custname,
channelid,
channel,
dateViewed,
CAST(DATEADD(day,15,dateViewed) as date) as dateEnd,
ROW_NUMBER() OVER (PARTITION BY custid, channelid ORDER BY dateViewed) AS rn
FROM (VALUES
(1, 'A', 1, 'ABSS', '2016-01-09'),(2, 'B', 2, 'STHHG', '2016-01-19'),
(3, 'C', 4, 'XGGTS', '2016-01-09'),(6, 'D', 4, 'XGGTS', '2016-01-09'),
(2, 'B', 2, 'STHHG', '2016-01-26'),(2, 'B', 2, 'STHHG', '2016-01-28'),
(1, 'A', 3, 'SSJ', '2016-01-28'),(1, 'A', 1, 'ABSS', '2016-01-28'),
(2, 'B', 2, 'STHHG', '2016-02-02'),(2, 'B', 7, 'UUJKS', '2016-02-10'),
(2, 'B', 8, 'AKKDC', '2016-02-10'),(2, 'B', 9, 'GGSK', '2016-02-10'),
(2, 'B', 9, 'GGSK', '2016-02-11'),(2, 'B', 7, 'UUJKS', '2016-02-27')
) as t(custid, custname, channelid, channel, dateViewed)
), res AS (
SELECT custid, channelid, dateViewed, dateEnd, 1 as Lev
FROM cte
WHERE rn = 1
UNION ALL
SELECT c.custid, c.channelid, c.dateViewed, c.dateEnd, lev + 1
FROM res r
INNER JOIN cte c ON c.dateViewed > r.dateEnd and c.custid = r.custid and c.channelid = r.channelid
), final AS (
SELECT * ,
ROW_NUMBER() OVER (PARTITION BY custid, channelid, lev ORDER BY dateViewed) rn,
DENSE_RANK() OVER (ORDER BY custid, channelid, dateEnd) dr
FROM res
)
SELECT b.custid,
b.custname,
MONTH(f.dateViewed) as [month],
COUNT(distinct dr) as [count]
FROM cte b
LEFT JOIN final f
ON b.channelid = f.channelid and b.custid = f.custid and b.dateViewed between f.dateViewed and f.dateEnd
WHERE f.rn = 1
GROUP BY b.custid,
b.custname,
MONTH(f.dateViewed)
Output:
custid custname month count
----------- -------- ----------- -----------
1 A 1 3
2 B 1 1
2 B 2 4
3 C 1 1
6 D 1 1
(5 row(s) affected)
I don't know why you get 1 in the count field for customer A. He has:
ABSS 2016-01-09 +1 to count (+15 days = 2016-01-24)
SSJ 2016-01-28 +1 to count
ABSS 2016-01-28 +1 to count (28-01 > 24.01)
So in January there must be count = 3.
Whenever I am trying to count things with complex criteria, I use a sum and case statement. Something like below:
SELECT custid, custname,
SUM(CASE WHEN somecriteria
THEN 1
ELSE 0
END) As CriteriaCount
FROM whateverTable
GROUP BY custid, custname
You can make that somecriteria expression as complicated as you like, so long as it evaluates to true or false. If it passes, the row returns a 1. If it fails, the row returns a 0. We then sum up the values returned to get the count.
Generally, this is how you can get any number (10 in this example) of fixed 15-day intervals starting at a given date (@dd in this example).
DECLARE @dd date = CAST('2016-01-19 17:30' AS DATE);
WITH E1(N) AS (
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1),
E2(N) AS (SELECT 1 FROM E1 a, E1 b),
E4(N) AS (SELECT 1 FROM E2 a, E2 b), --10,000 rows max
tally(N) AS (SELECT TOP (10) ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) FROM E4)
SELECT
startd = DATEADD(D,(N-1)*15, @dd),
endd = DATEADD(D, N*15-1, @dd)
FROM tally
Adapt it to the rules defining how the start date must be calculated for the user (and probably the channel).
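The interval arithmetic in that query (start = first day + (N-1)*15, end = first day + N*15-1) can be sanity-checked outside the database. Below is a small Python sketch of the same calculation; the function name `fixed_intervals` and its parameters are invented for illustration and are not part of the answer.

```python
from datetime import date, timedelta


def fixed_intervals(start, count=10, length_days=15):
    """Return `count` back-to-back (start, end) date pairs of `length_days` days each."""
    return [
        (
            start + timedelta(days=(n - 1) * length_days),  # DATEADD(D, (N-1)*15, start)
            start + timedelta(days=n * length_days - 1),    # DATEADD(D, N*15-1, start)
        )
        for n in range(1, count + 1)
    ]


intervals = fixed_intervals(date(2016, 1, 19))
for startd, endd in intervals[:3]:
    print(startd, endd)
```

Each interval ends the day before the next one starts, so any viewing date falls into exactly one interval.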
@Sturgus what if I want to define it in the code? Any other
alternatives besides defining it in the table? How to write a query
that can be run every month to generate the monthly invoice. –
saturday
Well, one way or another, you will have to save each customer's billing start date (minimally). If you want to do this entirely in SQL without 'editing the database', something like the following should work. The drawback to this approach is that you would need to manually edit the "INSERT INTO" statement every month to suit your needs. If you were allowed to edit the already existing customers table or create a new one, then it would reduce this manual effort.
DECLARE @CustomerBillingPeriodsTVP AS Table(
custID int UNIQUE,
BillingCycleID int,
BillingStartDate Date,
BillingEndDate Date
);
INSERT INTO @CustomerBillingPeriodsTVP (custID, BillingCycleID, BillingStartDate, BillingEndDate) VALUES
(1, 1, '2016-01-03', '2016-01-18'), (2, 1, '2016-01-18', '2016-02-03'), (3, 1, '2016-01-15', '2016-01-30'), (6, 1, '2016-01-14', '2016-01-29');
SELECT A.custid, A.custname, B.BillingCycleID AS [month], COUNT(DISTINCT A.channelid) AS [count]
FROM dbo.tblCustomerChannelViews AS A INNER JOIN @CustomerBillingPeriodsTVP AS B ON A.custid = B.CustID
GROUP BY A.custid, A.custname, B.BillingCycleID;
GO
Where are you getting your customers' billing start dates as it is?
I'm not sure how this solution will scale, but with some good index candidates and decent data housekeeping, it'll work.
You're going to need some extra info for starters, and to normalize your data. You will need to know the first charging period start date for each customer. So store that in a customer table.
Here are the tables I used:
create table #channelViews
(
custId int, channelId int, viewDate datetime
)
create table #channel
(
channelId int, channelName varchar(max)
)
create table #customer
(
custId int, custname varchar(max), chargingStartDate datetime
)
I'll populate some data. I won't get the same results as your sample output, because I don't have the appropriate start dates for each customer. Customer 2 will be OK though.
insert into #channel (channelId, channelName)
select 1, 'ABSS'
union select 2, 'STHHG'
union select 4, 'XGGTS'
union select 3, 'SSJ'
union select 7, 'UUJKS'
union select 8, 'AKKDC'
union select 9, 'GGSK'
insert into #customer (custId, custname, chargingStartDate)
select 1, 'A', '4 Jan 2016'
union select 2, 'B', '19 Jan 2016'
union select 3, 'C', '5 Jan 2016'
union select 6, 'D', '5 Jan 2016'
insert into #channelViews (custId, channelId, viewDate)
select 1,1,'2016-01-09'
union select 2,2,'2016-01-19'
union select 3,4,'2016-01-09'
union select 6,4,'2016-01-09'
union select 2,2,'2016-01-26'
union select 2,2,'2016-01-28'
union select 1,3,'2016-01-28'
union select 1,1,'2016-01-28'
union select 2,2,'2016-02-02'
union select 2,7,'2016-02-10'
union select 2,8,'2016-02-10'
union select 2,9,'2016-02-10'
union select 2,9,'2016-02-11'
union select 2,7,'2016-02-27'
And here is the somewhat unwieldy query, in a single statement.
The two underlying sub-queries are actually the same data, so there may be more appropriate / efficient ways to generate these.
We need to exclude from billing any channel charged in the same charging period C for the previous Month. This is the essence of the join. I used a right-join so that I could exclude all such matches from the results (using old.custId is null).
select c.custId, c.[custname], [month], count(*) [count] from
(
select new.custId, new.channelId, new.month, new.chargingPeriod
from
(
select distinct cv.custId, cv.channelId, month(viewdate) [month], (convert(int, cv.viewDate) - convert(int, c.chargingStartDate))/15 chargingPeriod
from #channelViews cv join #customer c on cv.custId = c.custId
) old
right join
(
select distinct cv.custId, cv.channelId, month(viewdate) [month], (convert(int, cv.viewDate) - convert(int, c.chargingStartDate))/15 chargingPeriod
from #channelViews cv join #customer c on cv.custId = c.custId
) new
on old.custId = new.custId
and old.channelId = new.channelId
and old.month = new.Month -1
and old.chargingPeriod = new.chargingPeriod
where old.custId is null
group by new.custId, new.month, new.chargingPeriod, new.channelId
) filteredResults
join #customer c on c.custId = filteredResults.custId
group by c.custId, [month], c.custname
order by c.custId, [month], c.custname
And finally my results:
custId custname month count
1 A 1 3
2 B 1 1
2 B 2 4
3 C 1 1
6 D 1 1
This query does the same thing:
select c.custId, c.custname, [month], count(*) from
(
select cv.custId, min(month(viewdate)) [month], cv.channelId
from #channelViews cv join #customer c on cv.custId = c.custId
group by cv.custId, cv.channelId, (convert(int, cv.viewDate) - convert(int, c.chargingStartDate))/15
) x
join #customer c
on c.custId = x.custId
group by c.custId, c.custname, x.[month]
order by custId, [month]
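The shorter query groups views by customer, channel and charging period, where the period is the whole number of 15-day blocks since the customer's charging start date. Here is a hedged sketch of that idea on SQLite through Python's `sqlite3`; it assumes ISO date strings, uses `julianday()` and `strftime()` in place of the SQL Server `convert(int, ...)` and `month()` calls, and uses plain tables instead of temp tables.

```python
import sqlite3

# (custId, custname, chargingStartDate) as used in the answer, in ISO format.
CUSTOMERS = [
    (1, "A", "2016-01-04"), (2, "B", "2016-01-19"),
    (3, "C", "2016-01-05"), (6, "D", "2016-01-05"),
]

# (custId, channelId, viewDate) from the question's sample data.
VIEWS = [
    (1, 1, "2016-01-09"), (2, 2, "2016-01-19"), (3, 4, "2016-01-09"),
    (6, 4, "2016-01-09"), (2, 2, "2016-01-26"), (2, 2, "2016-01-28"),
    (1, 3, "2016-01-28"), (1, 1, "2016-01-28"), (2, 2, "2016-02-02"),
    (2, 7, "2016-02-10"), (2, 8, "2016-02-10"), (2, 9, "2016-02-10"),
    (2, 9, "2016-02-11"), (2, 7, "2016-02-27"),
]

# Inner query: one row per customer, channel and 15-day charging period,
# attributed to the earliest month in which that period was viewed.
# Outer query: count those rows per customer and month.
QUERY = """
SELECT c.custId, c.custname, x.month, COUNT(*) AS cnt
FROM (SELECT cv.custId, cv.channelId,
             MIN(CAST(strftime('%m', cv.viewDate) AS INTEGER)) AS month
      FROM channelViews AS cv
      JOIN customer AS c ON cv.custId = c.custId
      GROUP BY cv.custId, cv.channelId,
               CAST(julianday(cv.viewDate) - julianday(c.chargingStartDate)
                    AS INTEGER) / 15
     ) AS x
JOIN customer AS c ON c.custId = x.custId
GROUP BY c.custId, c.custname, x.month
ORDER BY c.custId, x.month
"""

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (custId INTEGER, custname TEXT, chargingStartDate TEXT)")
conn.execute("CREATE TABLE channelViews (custId INTEGER, channelId INTEGER, viewDate TEXT)")
conn.executemany("INSERT INTO customer VALUES (?, ?, ?)", CUSTOMERS)
conn.executemany("INSERT INTO channelViews VALUES (?, ?, ?)", VIEWS)
billing = conn.execute(QUERY).fetchall()
conn.close()

for row in billing:
    print(row)
```

With these start dates the counts should agree with the results table shown in the answer, including 3 for customer A in January.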