Summing Records within a Moving Date Range, Date Distances - sql

I have a complex calculation requirement for a user logging system. I need to find the most frequently active users based on their number of logins within a 180-day window. Two login dates that are 181 or more days apart do not count towards the same total, but each can still count towards a total when grouped with other dates.
For example here is Jim's login history:
Jim 2018-01-01
Jim 2018-04-01
Jim 2018-05-01
Jim 2018-06-01
Jim 2018-07-01
Jim 2018-08-01
Jim 2018-09-01
Jim 2018-12-01
Using 6 months, instead of 180 days, for simplicity, and only looking 6 months in one direction, Jim had the following totals:
Logins: 5 (2018-01-01 + 6 months)
Logins: 6 (2018-04-01 + 6 months)
Logins: 5 (2018-05-01 + 6 months)
Logins: 5 (2018-06-01 + 6 months)
Logins: 4 (2018-07-01 + 6 months)
Logins: 3 (2018-08-01 + 6 months)
Logins: 2 (2018-09-01 + 6 months)
Logins: 1 (2018-12-01 + 6 months)
So my system would report back 6 because it only wants the maximum total.
Other than brute-force calculation, I'm lost on how to construct this system. Yes, I can denormalize the data to any degree; speed is most important.

Try this:
declare @tbl table(name char(3), dt date);
insert into @tbl values
('Jim', '2018-01-01'),
('Jim', '2018-04-01'),
('Jim', '2018-05-01'),
('Jim', '2018-06-01'),
('Jim', '2018-07-01'),
('Jim', '2018-08-01'),
('Jim', '2018-09-01'),
('Jim', '2018-12-01');
;with cte as (
-- upperDt is the last day still inside the window; BETWEEN is inclusive, so use 180, not 181
select name, dt, DATEADD(day, 180, dt) upperDt from @tbl
), cte2 as (
select name,
(select COUNT(*) from cte where dt between c.dt and c.upperDt and name = c.name) cnt
from cte c
)
select name, MAX(cnt) [max]
from cte2
group by name

Try this. It uses a common table expression to calculate the end of each date window and CROSS APPLY to count the logins inside it.
DECLARE @t TABLE (UserName NVARCHAR(10), LoginDate DATETIME)
INSERT INTO @t
(UserName,LoginDate) VALUES
('Jim','2018-01-01'),
('Jim','2018-04-01'),
('Jim','2018-05-01'),
('Jim','2018-06-01'),
('Jim','2018-07-01'),
('Jim','2018-08-01'),
('Jim','2018-09-01'),
('Jim','2018-12-01')
; WITH CteDateRange
AS(
SELECT
T.UserName
,T.LoginDate
--,EndDateRange = DATEADD(DAY, 180, LoginDate) -- swap this in for an exact 180-day window
,EndDateRange = DATEADD(MONTH, 6, LoginDate)
FROM @t T
)
SELECT
DR.UserName
,DR.LoginDate
,DR.EndDateRange
,T.Total
FROM CteDateRange DR
CROSS APPLY ( SELECT Total = COUNT(D.LoginDate)
FROM CteDateRange D
WHERE D.LoginDate >= DR.LoginDate
AND D.LoginDate <= DR.EndDateRange
AND D.UserName = DR.UserName
) T
Output
UserName LoginDate EndDateRange Total
Jim 2018-01-01 00:00:00.000 2018-07-01 00:00:00.000 5
Jim 2018-04-01 00:00:00.000 2018-10-01 00:00:00.000 6
Jim 2018-05-01 00:00:00.000 2018-11-01 00:00:00.000 5
Jim 2018-06-01 00:00:00.000 2018-12-01 00:00:00.000 5
Jim 2018-07-01 00:00:00.000 2019-01-01 00:00:00.000 4
Jim 2018-08-01 00:00:00.000 2019-02-01 00:00:00.000 3
Jim 2018-09-01 00:00:00.000 2019-03-01 00:00:00.000 2
Jim 2018-12-01 00:00:00.000 2019-06-01 00:00:00.000 1

One basic solution uses a join:
select l.*
from (select l.name, l.date, count(*) as cnt,
row_number() over (partition by l.name order by count(*) desc) as seqnum
from logins l join
logins l2
on l.name = l2.name and
l2.date >= l.date and l2.date < dateadd(day, 181, l.date)
group by l.name, l.date -- one group per window start, not one per user
) l
where seqnum = 1;
This might have acceptable performance with an index on logins(name, date).
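The counting rule behind these queries can be checked outside the database. The following is a minimal sketch in Python rather than SQL; the function name `max_logins_in_window` is made up for illustration, and it assumes the window starts on a login date and includes the day exactly 180 days later. Sorting once and sliding two indexes avoids comparing every pair of dates.

```python
from datetime import date, timedelta

def max_logins_in_window(login_dates, window_days=180):
    """Largest number of logins in any window that starts on a login date
    and covers the following `window_days` days (inclusive)."""
    dates = sorted(login_dates)
    best = 0
    right = 0  # index of the first login that falls outside the current window
    for left, start in enumerate(dates):
        limit = start + timedelta(days=window_days)
        while right < len(dates) and dates[right] <= limit:
            right += 1
        best = max(best, right - left)
    return best

# Jim's login history from the question
jim = [date(2018, m, 1) for m in (1, 4, 5, 6, 7, 8, 9, 12)]
print(max_logins_in_window(jim))  # 6 for this sample
```

Because `right` never moves backwards, the scan over one user's logins is linear after the sort, which is the same property an index on (name, date) is meant to give the SQL versions.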

Related

How to select rows based on a rolling 30 day window SQL

My question involves how to identify an index discharge.
The index discharge is the earliest discharge. On that date, the 30 day window starts. Any admissions during that time period are considered readmissions, and they should be ignored. Once the 30 day window is over, then any subsequent discharge is considered an index and the 30 day window begins again.
I can't seem to work out the logic for this. I've tried different windowing functions, I've tried cross joins and cross applies. The issue I keep encountering is that a readmission cannot be an index admission. It must be excluded.
I have successfully written a while loop to solve this problem, but I'd really like to get this in a set based format, if it's possible. I haven't been successful so far.
Ultimate goal is this -
id  AdmitDate                DischargeDate            MedicalRecordNumber  IndexYN
1   2021-03-03 00:00:00.000  2021-03-09 13:20:00.000  X0090362             1
4   2021-03-05 00:00:00.000  2021-03-10 16:00:00.000  X0012614             1
6   2021-05-18 00:00:00.000  2021-05-21 22:20:00.000  X0012614             1
7   2021-06-21 00:00:00.000  2021-07-08 13:30:00.000  X0012614             1
8   2021-02-03 00:00:00.000  2021-02-09 17:00:00.000  X0019655             1
10  2021-03-23 00:00:00.000  2021-03-26 16:40:00.000  X0019655             1
11  2021-03-15 00:00:00.000  2021-03-18 15:53:00.000  X4135958             1
13  2021-05-17 00:00:00.000  2021-05-23 14:55:00.000  X4135958             1
15  2021-06-24 00:00:00.000  2021-07-13 15:06:00.000  X4135958             1
Sample code is below.
CREATE TABLE #Admissions
(
[id] INT,
[AdmitDate] DATETIME,
[DischargeDateTime] DATETIME,
[UnitNumber] VARCHAR(20),
[IndexYN] INT
)
INSERT INTO #Admissions
VALUES( 1 ,'2021-03-03' ,'2021-03-09 13:20:00.000' ,'X0090362', NULL)
,(2 ,'2021-03-27' ,'2021-03-30 19:59:00.000' ,'X0090362', NULL)
,(3 ,'2021-03-31' ,'2021-04-04 05:57:00.000' ,'X0090362', NULL)
,(4 ,'2021-03-05' ,'2021-03-10 16:00:00.000' ,'X0012614', NULL)
,(5 ,'2021-03-28' ,'2021-04-16 13:55:00.000' ,'X0012614', NULL)
,(6 ,'2021-05-18' ,'2021-05-21 22:20:00.000' ,'X0012614', NULL)
,(7 ,'2021-06-21' ,'2021-07-08 13:30:00.000' ,'X0012614', NULL)
,(8 ,'2021-02-03' ,'2021-02-09 17:00:00.000' ,'X0019655', NULL)
,(9 ,'2021-02-17' ,'2021-02-22 17:25:00.000' ,'X0019655', NULL)
,(10 ,'2021-03-23' ,'2021-03-26 16:40:00.000' ,'X0019655', NULL)
,(11 ,'2021-03-15' ,'2021-03-18 15:53:00.000' ,'X4135958', NULL)
,(12 ,'2021-04-08' ,'2021-04-13 19:42:00.000' ,'X4135958', NULL)
,(13 ,'2021-05-17' ,'2021-05-23 14:55:00.000' ,'X4135958', NULL)
,(14 ,'2021-06-09' ,'2021-06-14 12:45:00.000' ,'X4135958', NULL)
,(15 ,'2021-06-24' ,'2021-07-13 15:06:00.000' ,'X4135958', NULL)
You can use a recursive CTE to identify all rows associated with each "index" discharge:
with a as (
-- number each patient's stays separately so windows never span two patients
select a.*, row_number() over (partition by unitnumber order by dischargedatetime) as seqnum
from admissions a
),
cte as (
select id, admitdate, dischargedatetime, unitnumber, seqnum, dischargedatetime as index_dischargedatetime
from a
where seqnum = 1
union all
select a.id, a.admitdate, a.dischargedatetime, a.unitnumber, a.seqnum,
-- a stay admitted more than 30 days after the current index discharge starts a new window
(case when a.admitdate > dateadd(day, 30, cte.index_dischargedatetime)
then a.dischargedatetime else cte.index_dischargedatetime
end) as index_dischargedatetime
from cte join
a
on a.unitnumber = cte.unitnumber and a.seqnum = cte.seqnum + 1
)
select *
from cte;
You can then incorporate this into an update:
update admissions
set indexyn = (case when admissions.dischargedatetime = cte.index_dischargedatetime then 'Y' else 'N' end)
from cte
where cte.id = admissions.id;
Here is a db<>fiddle. Note that I changed the type of IndexYN to a character to assign 'Y'/'N', which makes sense given the column name.
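The rule described in the question can also be written as a single ordered pass per patient, which is what the recursion emulates. Below is a rough sketch in Python, not SQL; `flag_index_discharges` is a hypothetical helper, and it assumes a stay is a readmission when its admit date falls within 30 days after the current index discharge. The rows are the sample data from the question.

```python
from datetime import datetime, timedelta

def flag_index_discharges(admissions, window_days=30):
    """admissions: iterable of (id, admit, discharge, unit) tuples.
    Returns the set of ids that count as index discharges."""
    index_ids = set()
    last_index_discharge = {}  # unit number -> discharge time of its current index stay
    for adm_id, admit, discharge, unit in sorted(admissions, key=lambda r: (r[3], r[2])):
        prev = last_index_discharge.get(unit)
        # A stay opens a new window only if it was admitted after the old window closed.
        if prev is None or admit > prev + timedelta(days=window_days):
            index_ids.add(adm_id)
            last_index_discharge[unit] = discharge
    return index_ids

RAW = [
    (1, "2021-03-03", "2021-03-09 13:20:00", "X0090362"),
    (2, "2021-03-27", "2021-03-30 19:59:00", "X0090362"),
    (3, "2021-03-31", "2021-04-04 05:57:00", "X0090362"),
    (4, "2021-03-05", "2021-03-10 16:00:00", "X0012614"),
    (5, "2021-03-28", "2021-04-16 13:55:00", "X0012614"),
    (6, "2021-05-18", "2021-05-21 22:20:00", "X0012614"),
    (7, "2021-06-21", "2021-07-08 13:30:00", "X0012614"),
    (8, "2021-02-03", "2021-02-09 17:00:00", "X0019655"),
    (9, "2021-02-17", "2021-02-22 17:25:00", "X0019655"),
    (10, "2021-03-23", "2021-03-26 16:40:00", "X0019655"),
    (11, "2021-03-15", "2021-03-18 15:53:00", "X4135958"),
    (12, "2021-04-08", "2021-04-13 19:42:00", "X4135958"),
    (13, "2021-05-17", "2021-05-23 14:55:00", "X4135958"),
    (14, "2021-06-09", "2021-06-14 12:45:00", "X4135958"),
    (15, "2021-06-24", "2021-07-13 15:06:00", "X4135958"),
]
ADMISSIONS = [(i, datetime.fromisoformat(a), datetime.fromisoformat(d), u)
              for i, a, d, u in RAW]
print(sorted(flag_index_discharges(ADMISSIONS)))
```

Each decision depends on the previous index row, not merely the previous row, which is why plain window functions such as LAG are not enough and a recursive CTE or loop is needed.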

Oracle SQL - Select users between two date by month

I am learning SQL and I was wondering how to select active users by month, depending on their starting and ending date (both timestamp(6)). My table looks like this:
Cust_Num | Start_Date | End_Date
1 | 2018-01-01 | 2019-01-01
2 | 2018-01-01 | NULL
3 | 2019-01-01 | 2019-06-01
4 | 2017-01-01 | 2019-03-01
So, counting the active users by month, I should have an output like:
As of. | Count
2018-06-01 | 3
...
2019-02-01 | 3
2019-07-01 | 1
So far, I do a manual operation by entering each month:
Select
201906,
count(distinct a.cust_num)
From
active_users a
Where
to_date('20190630','yyyymmdd') between a.start_date and nvl(a.end_date, to_date('31-dec-9999','dd-mon-yyyy'))
union all
Select
201905,
count(distinct a.cust_num)
From
active_users a
Where
to_date('20190531','yyyymmdd') between a.start_date and nvl(a.end_date, to_date('31-dec-9999','dd-mon-yyyy'))
union all
...
This is neither efficient nor sustainable if I want to cover 10 years, i.e. 120 months.
Any help is welcome. Thanks a lot!
This query shows the active-user-count effective as-of the end of the month.
How it works:
Convert each input row (with StartDate and EndDate value) into two rows that represent a point-in-time when the active-user-count incremented (on StartDate) and decremented (on EndDate). We need to convert NULL to a far-off date value because NULL values are sorted before instead of after non-NULL values:
This makes your data look like this:
OnThisDate Change
2018-01-01 1
2019-01-01 -1
2018-01-01 1
9999-12-31 -1
2019-01-01 1
2019-06-01 -1
2017-01-01 1
2019-03-01 -1
Then we simply SUM OVER the Change values (after sorting) to get the active-user-count as of that specific date:
So first, sort by OnThisDate:
OnThisDate Change
2017-01-01 1
2018-01-01 1
2018-01-01 1
2019-01-01 1
2019-01-01 -1
2019-03-01 -1
2019-06-01 -1
9999-12-31 -1
Then SUM OVER:
OnThisDate ActiveCount
2017-01-01 1
2018-01-01 2
2018-01-01 3
2019-01-01 4
2019-01-01 3
2019-03-01 2
2019-06-01 1
9999-12-31 0
Then we PARTITION (not group!) the rows by month and sort them by their date so we can identify the last ActiveCount row for that month (this actually happens in the WHERE of the outermost query, using ROW_NUMBER() and COUNT() for each month PARTITION):
OnThisDate ActiveCount IsLastInMonth
2017-01-01 1 1
2018-01-01 2 0
2018-01-01 3 1
2019-01-01 4 0
2019-01-01 3 1
2019-03-01 2 1
2019-06-01 1 1
9999-12-31 0 1
Then filter on that where IsLastInMonth = 1 (actually, where ROW_NUMBER() = COUNT(*) inside each PARTITION) to give us the final output data:
At-end-of-month Active-count
2017-01 1
2018-01 3
2019-01 3
2019-03 2
2019-06 1
9999-12 0
This does result in "gaps" in the result-set because the At-end-of-month column only shows rows where the Active-count value actually changed rather than including all possible calendar months - but that's ideal (as far as I'm concerned) because it excludes redundant data. Filling in the gaps can be done inside your application code by simply repeating output rows for each additional month until it reaches the next At-end-of-month value.
Here's the query using T-SQL on SQL Server (I don't have access to Oracle right now). And here's the SQLFiddle I used to come to a solution: http://sqlfiddle.com/#!18/ad68b7/24
SELECT
OtdYear,
OtdMonth,
ActiveCount
FROM
(
-- This query adds columns to indicate which row is the last-row-in-month ( where RowInMonth == RowsInMonth )
SELECT
OnThisDate,
OtdYear,
OtdMonth,
ROW_NUMBER() OVER ( PARTITION BY OtdYear, OtdMonth ORDER BY OnThisDate ) AS RowInMonth,
COUNT(*) OVER ( PARTITION BY OtdYear, OtdMonth ) AS RowsInMonth,
ActiveCount
FROM
(
SELECT
OnThisDate,
YEAR( OnThisDate ) AS OtdYear,
MONTH( OnThisDate ) AS OtdMonth,
SUM( [Change] ) OVER ( ORDER BY OnThisDate ASC ) AS ActiveCount
FROM
(
SELECT
StartDate AS [OnThisDate],
1 AS [Change]
FROM
tbl
UNION ALL
SELECT
ISNULL( EndDate, DATEFROMPARTS( 9999, 12, 31 ) ) AS [OnThisDate],
-1 AS [Change]
FROM
tbl
) AS sq1
) AS sq2
) AS sq3
WHERE
RowInMonth = RowsInMonth
ORDER BY
OtdYear,
OtdMonth
This query can be flattened into fewer nested queries by using aggregate and window functions directly instead of using aliases (like OtdYear, ActiveCount, etc) but that would make the query much harder to understand.
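The increment/decrement steps walked through above can be sketched outside SQL as well. This is an illustrative Python version, not a translation of the query; `active_count_by_month` is a made-up name and the far-future date stands in for NULL end dates exactly as in the explanation.

```python
from datetime import date

def active_count_by_month(rows, far_future=date(9999, 12, 31)):
    """rows: iterable of (start_date, end_date_or_None).
    Returns {(year, month): active count after the last change in that month}."""
    events = []
    for start, end in rows:
        events.append((start, 1))               # user becomes active
        events.append((end or far_future, -1))  # user stops being active
    events.sort(key=lambda e: e[0])

    running = 0
    last_in_month = {}
    for on_date, change in events:
        running += change
        # Later events in the same month overwrite earlier ones,
        # so only the month's final running total survives.
        last_in_month[(on_date.year, on_date.month)] = running
    return last_in_month

users = [
    (date(2018, 1, 1), date(2019, 1, 1)),
    (date(2018, 1, 1), None),
    (date(2019, 1, 1), date(2019, 6, 1)),
    (date(2017, 1, 1), date(2019, 3, 1)),
]
print(active_count_by_month(users))
```

As in the SQL, months with no change are absent from the result; a caller that wants every calendar month would repeat the previous value until the next key.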
I have created a query that returns a row for every month from the minimum start date in the table to the maximum end date.
You can restrict the range by adding a condition to the WHERE clause.
-- table creation
CREATE TABLE ACTIVE_USERS (CUST_NUM NUMBER, START_DATE DATE, END_DATE DATE);
-- data creation
INSERT INTO ACTIVE_USERS
SELECT * FROM
(
SELECT 1, DATE '2018-01-01', DATE '2019-01-01' FROM DUAL UNION ALL
SELECT 2, DATE '2018-01-01', NULL FROM DUAL UNION ALL
SELECT 3, DATE '2019-01-01', DATE '2019-06-01' FROM DUAL UNION ALL
SELECT 4, DATE '2017-01-01', DATE '2019-03-01' FROM DUAL
);
-- data in the actual table
SELECT * FROM ACTIVE_USERS ORDER BY CUST_NUM;
CUST_NUM START_DATE END_DATE
---------- ---------- ----------
1 2018-01-01 2019-01-01
2 2018-01-01
3 2019-01-01 2019-06-01
4 2017-01-01 2019-03-01
Query to fetch desired result
WITH CTE ( START_DATE, END_DATE ) AS
(
SELECT
ADD_MONTHS( START_DATE, LEVEL - 1 ),
ADD_MONTHS( START_DATE, LEVEL ) - 1
FROM
(
SELECT
MIN( START_DATE ) AS START_DATE,
MAX( END_DATE ) AS END_DATE
FROM
ACTIVE_USERS
)
CONNECT BY LEVEL <= CEIL( MONTHS_BETWEEN( END_DATE, START_DATE ) ) + 1
)
--
--
SELECT
C.START_DATE,
COUNT(1) AS CNT
FROM
CTE C
JOIN ACTIVE_USERS D ON
(
C.END_DATE BETWEEN
D.START_DATE
AND
CASE
WHEN D.END_DATE IS NOT NULL THEN D.END_DATE
ELSE C.END_DATE
END
)
GROUP BY
C.START_DATE
ORDER BY
C.START_DATE;
-- output --
START_DATE CNT
---------- ----------
2017-01-01 1
2017-02-01 1
2017-03-01 1
2017-04-01 1
2017-05-01 1
2017-06-01 1
2017-07-01 1
2017-08-01 1
2017-09-01 1
2017-10-01 1
2017-11-01 1
2017-12-01 1
2018-01-01 3
2018-02-01 3
2018-03-01 3
2018-04-01 3
2018-05-01 3
2018-06-01 3
2018-07-01 3
2018-08-01 3
2018-09-01 3
2018-10-01 3
2018-11-01 3
2018-12-01 3
2019-01-01 3
2019-02-01 3
2019-03-01 2
2019-04-01 2
2019-05-01 2
2019-06-01 1
30 rows selected.
Cheers!!
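For comparison, the month-by-month approach can be sketched in Python (an illustration only, with the hypothetical names `next_month` and `monthly_active_counts`). It assumes, like the query, that a user counts for a month when the month's last day lies between the start date and the end date, and that an open end date never excludes anyone.

```python
from datetime import date, timedelta

def next_month(d):
    """First day of the month after d."""
    return date(d.year + (d.month == 12), d.month % 12 + 1, 1)

def monthly_active_counts(rows):
    """rows: list of (start_date, end_date_or_None).
    Returns {first_day_of_month: active users on that month's last day}."""
    first = min(start for start, _ in rows)
    last = max(end for _, end in rows if end is not None)
    counts = {}
    month = date(first.year, first.month, 1)
    while month <= last:
        month_end = next_month(month) - timedelta(days=1)
        counts[month] = sum(
            1 for start, end in rows
            if start <= month_end and (end is None or month_end <= end)
        )
        month = next_month(month)
    return counts

users = [
    (date(2018, 1, 1), date(2019, 1, 1)),
    (date(2018, 1, 1), None),
    (date(2019, 1, 1), date(2019, 6, 1)),
    (date(2017, 1, 1), date(2019, 3, 1)),
]
for month, cnt in monthly_active_counts(users).items():
    print(month, cnt)
```

This does one pass over all users per month, so it trades the compactness of the running-total approach for a result with no gaps.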

Append data to split rows

I want to know how many people were unavailable in each month historically. For that I have a historicTable which contains data from 2012 to 2018; each row records how long an employee was unavailable (vacation, sickness, etc.). Here is one example:
idUser startDate endDate daysUn reason nameEmp
--------------------------------------------------------
123 25/01/2018 09/02/2018 12 Sickness John Doe
This is what I need for every row
idUser startDate endDate daysUn reason nameEmp
--------------------------------------------------------
123 25/01/2018 31/01/2018 5 Sickness John Doe
123 01/02/2018 09/02/2018 7 Sickness John Doe
I know this has been asked hundreds of times here, but I'm having trouble doing it for an entire table. The answers I've tried only work for one specific pair of startdate and enddate values, whereas I need to carry ALL the other columns along with the split rows and save the result, so the analyst can study specific cases and specific employees. This is what I get with my current code:
original_INI original_FIN new_INI new_FIN
----------------------- ----------------------- ----------------------- -----------------------
2017-10-15 00:00:00.000 2018-01-06 00:00:00.000 2017-10-15 00:00:00.000 2017-10-31 00:00:00.000
2017-10-15 00:00:00.000 2018-01-06 00:00:00.000 2017-11-01 00:00:00.000 2017-11-30 00:00:00.000
2017-10-15 00:00:00.000 2018-01-06 00:00:00.000 2017-12-01 00:00:00.000 2017-12-31 00:00:00.000
2017-10-15 00:00:00.000 2018-01-06 00:00:00.000 2018-01-01 00:00:00.000 2018-01-06 00:00:00.000
This is the code. Keeping the original dates is fine, since they let me sort the data more globally, but I'd like it to also output and save the rest of the columns so the result is more readable:
;WITH n(n) AS
(
SELECT ROW_NUMBER() OVER (ORDER BY [object_id])-1 FROM sys.all_columns
),
d(n,f,t,md,bp,ep) AS
(
SELECT n.n, d.INI, d.FIN,
DATEDIFF(MONTH, d.INI, d.FIN),
DATEADD(MONTH, n.n, DATEADD(DAY, 1-DAY(INI), INI)),
DATEADD(DAY, -1, DATEADD(MONTH, 1, DATEADD(MONTH, n.n,
DATEADD(DAY, 1-DAY(INI), INI))))
FROM n INNER JOIN archivoFuente AS d
ON d.FIN >= DATEADD(MONTH, n.n-1, d.INI)
)
SELECT original_INI = f, original_FIN = t,
new_INI = CASE n WHEN 0 THEN f ELSE bp END,
new_FIN = CASE n WHEN md THEN t ELSE ep END
FROM d WHERE md >= n
ORDER BY original_INI, new_INI;
Any help with the query is appreciated.
It's pretty easy, actually; I used the same code for my own requirements. You need to list every extra column in each SELECT so that it still exists after the rows are split. Check this code:
;WITH n(n) AS
(
SELECT ROW_NUMBER() OVER (ORDER BY [object_id])-1 FROM sys.all_columns
),
d(n,f,t,md,bp,ep,
--LIST YOUR COLUMNS HERE, IN THE SAME ORDER AS IN THE SELECT BELOW
id_hr,Tipo,ID_tip,Nom_inc,RUT,Nombre,ID_emp,Nom_pos,Dias_durac,Num_lic,ID_usu_ap,ult_act
) AS
(
SELECT n.n,d.INI, d.FIN,
DATEDIFF(MONTH, d.INI, d.FIN),
DATEADD(MONTH, n.n, DATEADD(DAY, 1-DAY(INI), INI)),
DATEADD(DAY, -1, DATEADD(MONTH, 1, DATEADD(MONTH, n.n,
DATEADD(DAY, 1-DAY(INI), INI)))),
--LIST YOUR COLUMNS HERE AGAIN, PAY ATTENTION TO NAMES AND COMMAS
d.id_hr,d.Tipo,d.ID_tip,d.Nom_inc,d.RUT,d.Nombre,d.ID_emp,d.Nom_pos,d.Dias_durac,d.Num_lic,d.ID_usu_ap,d.ult_act
FROM n INNER JOIN archivoFuente AS d
ON d.FIN >= DATEADD(MONTH, n.n-1, d.INI)
)
SELECT --LIST YOUR COLUMNS ONCE MORE HERE SO THEY APPEAR IN THE DISPLAYED RESULT
id_hr,Tipo,ID_tip,Nom_inc,RUT,Nombre,ID_emp,Nom_pos,Dias_durac,Num_lic,ID_usu_ap,ult_act,
original_INI = f, original_FIN = t,
new_INI = CASE n WHEN 0 THEN f ELSE bp END,
new_FIN = CASE n WHEN md THEN t ELSE ep END
FROM d
WHERE md >= n
ORDER BY original_INI, new_INI;
Now, to save the result, I'd recommend using an INSERT statement into a new table. I don't know exactly how you would do that; I'm in the same spot as you. Hopefully someone else can answer that part.
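The splitting logic itself, with every other column carried along unchanged, can be illustrated in a few lines of Python. This is a sketch of the idea rather than the T-SQL above; `split_by_month` and `first_of_next_month` are hypothetical helpers, and the row is modelled as a dict whose INI and FIN keys mirror the column names in the query.

```python
from datetime import date, timedelta

def first_of_next_month(d):
    return date(d.year + (d.month == 12), d.month % 12 + 1, 1)

def split_by_month(row, start_key="INI", end_key="FIN"):
    """Return one copy of `row` per calendar month it touches,
    each with new_INI/new_FIN clipped to that month."""
    pieces = []
    cur = row[start_key]
    while cur <= row[end_key]:
        month_end = first_of_next_month(cur) - timedelta(days=1)
        piece_end = min(month_end, row[end_key])
        # {**row, ...} copies every original column into the split row
        pieces.append({**row, "new_INI": cur, "new_FIN": piece_end})
        cur = piece_end + timedelta(days=1)
    return pieces

row = {"INI": date(2017, 10, 15), "FIN": date(2018, 1, 6), "Nombre": "John Doe"}
for piece in split_by_month(row):
    print(piece["Nombre"], piece["new_INI"], piece["new_FIN"])
```

The point mirrored from the SQL is that the extra columns have to be named (here, copied) at the step that produces the split rows, otherwise they are simply not there afterwards.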

How to keep the leap year when subtracting 1 year

I have this query that gives me a given date for each of the past 15 years. When my starting date is February 29, it does not return the 29th for the years 2012, 2008 and 2004. How can I make this query return the 29th for those years?
DECLARE @TempDate1 TABLE (Entry_Date Date)
INSERT INTO @TempDate1 values ('2016-02-29')
;WITH
a AS(SELECT DATEADD(yy,-1,Entry_Date) d, DATEADD(yy,-1,Entry_Date) d2,0 i
FROM @TempDate1
UNION all
SELECT DATEADD(yy,-1,d),DATEADD(yy,-1,d2),i+1 FROM a WHERE i<14),
b AS(SELECT d,d2, DATEDIFF(dd,0,d)%7 dd,i FROM a)
SELECT
d AS Entry_Date
FROM b
It returns this:
Entry_Date
2015-02-28
2014-02-28
2013-02-28
2012-02-28
2011-02-28
2010-02-28
2009-02-28
2008-02-28
2007-02-28
2006-02-28
2005-02-28
2004-02-28
2003-02-28
2002-02-28
2001-02-28
While I would like to have this:
Entry_Date
2015-02-28
2014-02-28
2013-02-28
2012-02-29
2011-02-28
2010-02-28
2009-02-28
2008-02-29
2007-02-28
2006-02-28
2005-02-28
2004-02-29
2003-02-28
2002-02-28
2001-02-28
Perhaps DateAdd in concert with an ad-hoc tally table
Example
Declare @YourTable Table ([Entry_Date] date)
Insert Into @YourTable Values
('2016-02-29')
,('2015-07-22')
Select YearNr = N
,Anniv = dateadd(YEAR,N*-1,Entry_Date)
From @YourTable A
Cross Apply (
Select Top 15 N=Row_Number() Over (Order By (Select NULL)) From master..spt_values n1
) B
Simply use the EOMONTH function (SQL Server 2012 and above). Note that this only suits entry dates that fall on the last day of a month, because EOMONTH always moves the date to the month's end:
DECLARE @TempDate1 TABLE (Entry_Date Date)
INSERT INTO @TempDate1 values ('2016-02-29')
;WITH
a AS(SELECT DATEADD(yy,-1,Entry_Date) d, DATEADD(yy,-1,Entry_Date) d2,0 i
FROM @TempDate1
UNION all
SELECT DATEADD(yy,-1,d),DATEADD(yy,-1,d2),i+1 FROM a WHERE i<14),
b AS(SELECT d,d2, DATEDIFF(dd,0,d)%7 dd,i FROM a)
SELECT EOMONTH(d) AS Entry_Date
FROM b;
Rextester Demo
Rewrite your query like this. Not only will it handle leap years without jumping through hoops, it's also far more efficient than what you currently have, because every date is computed directly from the base date instead of from the previous result.
DECLARE @BaseDate DATE = '2016-02-29';
SELECT
Entry_Date = DATEADD(YEAR, t.n, @BaseDate)
FROM
(VALUES (-1),(-2),(-3),(-4),(-5),
(-6),(-7),(-8),(-9),(-10),
(-11),(-12),(-13),(-14),(-15) ) t (n);
Results...
Entry_Date
----------
2015-02-28
2014-02-28
2013-02-28
2012-02-29
2011-02-28
2010-02-28
2009-02-28
2008-02-29
2007-02-28
2006-02-28
2005-02-28
2004-02-29
2003-02-28
2002-02-28
2001-02-28
EDIT: Same functionality when used with a table of dates (I stole John's table)
DECLARE @YourTable TABLE (id INT, Entry_Date DATE);
INSERT INTO @YourTable VALUES (1, '2016-02-29'), (2, '2015-07-22');
SELECT
yt.id,
Entry_Date = DATEADD(YEAR, t.n, yt.Entry_Date)
FROM
@YourTable yt
CROSS APPLY (VALUES (-1),(-2),(-3),(-4),(-5),
(-6),(-7),(-8),(-9),(-10),
(-11),(-12),(-13),(-14),(-15) ) t (n);
GO
Results...
id Entry_Date
----------- ----------
1 2015-02-28
1 2014-02-28
1 2013-02-28
1 2012-02-29
1 2011-02-28
1 2010-02-28
1 2009-02-28
1 2008-02-29
1 2007-02-28
1 2006-02-28
1 2005-02-28
1 2004-02-29
1 2003-02-28
1 2002-02-28
1 2001-02-28
2 2014-07-22
2 2013-07-22
2 2012-07-22
2 2011-07-22
2 2010-07-22
2 2009-07-22
2 2008-07-22
2 2007-07-22
2 2006-07-22
2 2005-07-22
2 2004-07-22
2 2003-07-22
2 2002-07-22
2 2001-07-22
2 2000-07-22
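The reason the original query lost the 29th is that each year was derived from the previous result, so once the date was clamped to the 28th it stayed there. A small Python sketch of the "always offset from the base date" idea follows; `years_back` is a hypothetical helper, and clamping Feb 29 to Feb 28 in non-leap years is an assumption chosen to match what DATEADD returns in the outputs above.

```python
from datetime import date

def years_back(base, n):
    """The date n years before `base`, computed from `base` itself.
    Feb 29 is clamped to Feb 28 when the target year is not a leap year."""
    target_year = base.year - n
    try:
        return base.replace(year=target_year)
    except ValueError:  # only raised for Feb 29 in a non-leap target year
        return base.replace(year=target_year, day=28)

base = date(2016, 2, 29)
for n in range(1, 16):
    print(years_back(base, n))
```

Because every call starts again from 2016-02-29, the leap years 2012, 2008 and 2004 get their 29th back.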

Count distinct records but only 1 time every XX Days

EDIT: Start date as of Jan 1 XXXX
I need to create a count of distinct userID's based on a 7 day grouping. Basically if a User calls on day 1 and day 2 of the month, they are counted 1 time. However if they call on Day 1 and day 10, then they are counted 2 times.
Table layout:
userId CallId datetime
0 123 01/01/2016 xx:xx:xx
0 124 01/10/2016 xx:xx:xx
1 125 01/10/2016 xx:xx:xx
1 126 01/10/2016 xx:xx:xx
2 127 01/10/2016 xx:xx:xx
1 128 01/30/2016 xx:xx:xx
2 129 01/31/2016 xx:xx:xx
What I need the return to look like:
Count(UserID) Week#
1 1
3 2
2 4
Thank you for your time.
Based on Gurwinder's response, I have produced the following and included the year so that it is still usable in a year's time.
SELECT COUNT(UserID), CallYear, CallWeek
FROM (
SELECT DISTINCT UserID,
datepart(year,datetime) as CallYear,
datepart(week,datetime) as CallWeek
FROM my_table
) t -- SQL Server requires an alias on a derived table
Group By CallYear,CallWeek
This will produce a rolling distinct count beginning Jan 1
Declare @YourTable table (userId int,CallId int,datetime datetime)
Insert Into @YourTable values
(0,123,'2016-01-01'),
(0,124,'2016-01-10'),
(1,125,'2016-01-10'),
(1,126,'2016-01-10'),
(2,127,'2016-01-10'),
(1,128,'2016-01-30'),
(2,129,'2016-01-31')
Select D1
,D2 =DateAdd(DD,6,D1)
,Cnt=count(Distinct UserID)
From @YourTable A
Join (Select Top 500 D1=DateAdd(DD,(Row_Number() Over (Order By Number)-1)*7,'2016-01-01') From master..spt_values ) B
-- half-open range so calls with a time of day on the 7th day are still included
on datetime >= D1 and datetime < DateAdd(DD,7,D1)
Group By D1
Returns
D1 D2 Cnt
2016-01-01 2016-01-07 1
2016-01-08 2016-01-14 3
2016-01-29 2016-02-04 2
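The same 7-day bucketing can be expressed with integer division on the day offset from the start date, which avoids generating the list of week boundaries. A minimal Python sketch follows; `distinct_users_per_bucket` is a made-up name and the origin of 2016-01-01 is taken from the example above.

```python
from datetime import date, timedelta

def distinct_users_per_bucket(calls, origin, bucket_days=7):
    """calls: iterable of (user_id, call_date).
    Returns {bucket_start_date: number of distinct users who called in that bucket}."""
    buckets = {}
    for user_id, when in calls:
        idx = (when - origin).days // bucket_days  # 0 for days 0-6, 1 for days 7-13, ...
        buckets.setdefault(idx, set()).add(user_id)
    return {origin + timedelta(days=i * bucket_days): len(users)
            for i, users in sorted(buckets.items())}

calls = [
    (0, date(2016, 1, 1)),
    (0, date(2016, 1, 10)),
    (1, date(2016, 1, 10)),
    (1, date(2016, 1, 10)),
    (2, date(2016, 1, 10)),
    (1, date(2016, 1, 30)),
    (2, date(2016, 1, 31)),
]
print(distinct_users_per_bucket(calls, date(2016, 1, 1)))
```

Unlike `datepart(week, ...)`, these buckets are anchored to the chosen start date, so they always span exactly seven days regardless of which weekday Jan 1 falls on.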
you can use this:
select count(distinct userid), datepart(week, datetime) week, datepart(year, datetime) year
from my_table
group by datepart(week, datetime), datepart(year, datetime);
What is your starting date? Have you looked at the DateDiff() function?
Try this:
With ABC
As
(select datepart(week, datetime) as week#
from table)
Select count(week#) as Times,week#
From ABC