tsql grouping with duplication based on variable

I want to create some aggregations from a table but I am not able to figure out a solution.
Example table:
DECLARE @MyTable TABLE(person INT, the_date date, the_value int)
INSERT INTO @MyTable VALUES
(1,'2017-01-01', 10),
(1,'2017-02-01', 5),
(1,'2017-03-01', 5),
(1,'2017-04-01', 10),
(1,'2017-05-01', 2),
(2,'2017-04-01', 10),
(2,'2017-05-01', 10),
(2,'2017-05-01', 0),
(3,'2017-01-01', 2)
For each person existing at that time, I want to average the value for the last x (@months_back) months given some starting date (@start_date):
DECLARE @months_back int, @start_date date
set @months_back = 3
set @start_date = '2017-05-01'
SELECT person, avg(the_value) as avg_the_value
FROM @MyTable
where the_date <= @start_date and the_date >= dateadd(month, -@months_back, @start_date)
group by person
This works. I now want to do the same thing again but skip back some months (@month_skip) from the starting date. Then I want to union those two tables together. Then, I again want to skip back @month_skip months from this date and do the same thing. I want to continue doing this until I have skipped back to some specified date (@min_date).
DECLARE @months_back int, @month_skip int, @start_date date, @min_date date
set @months_back = 3
set @month_skip = 2
set @start_date = '2017-05-01'
set @min_date = '2017-03-01'
Using the above variables and the table @MyTable the result should be:
person | avg_the_value
1 | 5
2 | 6
1 | 6
3 | 2
Only one skip is made here since @min_date is 2 months back, but I would like to be able to do multiple skips based on what @min_date is.
This example table is simple, but the real one has many more automatically created columns, so it is not feasible to use a table variable where I would have to declare the schema of the resulting table.
I asked a related question earlier but did not manage to get any of the answers to work for this problem.

It sounds like what you're trying to do is the following:
Starting with a date (e.g. 2017-05-01), look back @months_back months and define a range of dates. For example, if we go 3 months back, we're defining a range from 2017-02-01 through 2017-05-01.
After we define this range, we go back to our starting date and define a new starting date, going back @month_skip months. For example, with an initial starting date of 2017-05-01, we might skip back 2 months, giving us a new starting date of 2017-03-01.
We take this new starting date, and define a range of corresponding dates (as we did above). This produces the range 2016-12-01 through 2017-03-01.
We repeat this as needed through the minimum date specified, to produce a list of date ranges we want to do calculations for:
2017-02-01 through 2017-05-01
2016-12-01 through 2017-03-01
... etc ...
For each of these periods, look at a person and calculate the average of their value.
The query below should do what is described above: rather than taking a value and iterating back to calculate previous values, we use a numbers table to calculate offsets on an interval, which are used to determine the ending and starting dates for each interval/period. This query was built using SQL Server 2008 R2 and should be compatible with later versions.
/* Table, data, variable declarations */
DECLARE @MyTable TABLE(person INT, the_date date, the_value int)
INSERT INTO @MyTable VALUES
(1,'2017-01-01', 10),
(1,'2017-02-01', 5),
(1,'2017-03-01', 5),
(1,'2017-04-01', 10),
(1,'2017-05-01', 2),
(2,'2017-04-01', 10),
(2,'2017-05-01', 10),
(2,'2017-05-01', 0),
(3,'2017-01-01', 2)
DECLARE @months_back int, @month_skip int, @start_date date, @min_date date
set @months_back = 3
set @month_skip = 2
set @start_date = '2017-05-01'
set @min_date = '2017-01-01'
/* Common table expression to build list of Integers */
/* reference http://www.itprotoday.com/software-development/build-numbers-table-you-need if you want more info */
declare @end_int bigint = 50
; WITH IntegersTableFill (ints) AS
(
SELECT
CAST(0 AS BIGINT) AS 'ints'
UNION ALL
SELECT (T.ints + 1) AS 'ints'
FROM IntegersTableFill T
WHERE ints <= (
CASE
WHEN (@end_int <= 32767) THEN @end_int
ELSE 32767
END
)
)
/* What we're going to do is define a series of periods.
These periods have a start date and an end date, and will simplify grouping
(in place of the calculate-and-union approach)
*/
/* Now, we start defining the periods
@start_date defines the end of the most recent period.
@month_skip defines the amount of time we have to jump back for each period.
@months_back defines how far each period reaches back from its end date.
*/
/* Using the number table we defined above and the data in our variables, calculate start and end dates */
,periodEndDates as
(
select ints as Period
,DATEADD(month, -(@month_skip*ints), @start_date) as endOfPeriod
from IntegersTableFill itf
)
,periodStartDates as
(
select *
,DATEADD(month, -(@months_back), endOfPeriod) as startOfPeriod
from periodEndDates
)
,finalPeriodData as
(
select (period) as period, startOfPeriod, endOfPeriod from periodStartDates
)
/* Link the entries in our original data to the periods they fall into */
/* NOTE: The join criteria originally specified allows values to fall into multiple periods.
You may want to fix this?
*/
,periodTableJoin as
(
select * from finalPeriodData fpd
inner join @MyTable mt
on mt.the_date >= fpd.startOfPeriod
and mt.the_date <= fpd.endOfPeriod
and mt.the_date >= @min_date
and mt.the_date <= @start_date
)
/* Calculate averages, grouping by period and person */
,periodValueAggregate as
(
select person, avg(the_value) as avg_the_value from
periodTableJoin
group by period, person
)
select * from periodValueAggregate
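
As a cross-check of the period logic described in this answer, here is a minimal sketch in Python rather than T-SQL, so it can be run without a database. The helper names add_months and windowed_averages are made up for this illustration. It assumes that each period ends at the start date minus a multiple of the skip, that periods stop once their end date falls before the minimum date (as in the question), and that the values are non-negative so floor division matches AVG over an INT column.

```python
import calendar
from datetime import date


def add_months(d, n):
    """Shift d by n months (n may be negative), clamping the day to the month length."""
    idx = d.year * 12 + (d.month - 1) + n
    year, month0 = divmod(idx, 12)
    month = month0 + 1
    day = min(d.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


def windowed_averages(rows, start_date, min_date, months_back, month_skip):
    """Return (period, person, average) tuples; rows are (person, the_date, the_value)."""
    if month_skip <= 0:
        raise ValueError("month_skip must be positive")
    results = []
    period = 0
    end = start_date
    while end >= min_date:
        begin = add_months(end, -months_back)
        totals = {}
        for person, the_date, value in rows:
            if begin <= the_date <= end:  # inclusive on both ends, like the question's WHERE
                s, c = totals.get(person, (0, 0))
                totals[person] = (s + value, c + 1)
        for person in sorted(totals):
            s, c = totals[person]
            results.append((period, person, s // c))  # integer average
        period += 1
        end = add_months(start_date, -month_skip * period)
    return results


rows = [
    (1, date(2017, 1, 1), 10), (1, date(2017, 2, 1), 5), (1, date(2017, 3, 1), 5),
    (1, date(2017, 4, 1), 10), (1, date(2017, 5, 1), 2), (2, date(2017, 4, 1), 10),
    (2, date(2017, 5, 1), 10), (2, date(2017, 5, 1), 0), (3, date(2017, 1, 1), 2),
]
for row in windowed_averages(rows, date(2017, 5, 1), date(2017, 3, 1), 3, 2):
    print(row)
```

With the question's variables this yields the four rows the question expects (persons 1 and 2 for the first period, persons 1 and 3 for the second).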

The method I propose is set-based, not iterative.
(I am not following your problem exactly, but please follow along and we can iron out any discrepancies)
Essentially, you are looking to divide a calendar up into periods of interest. The periods are all equal in width and are sequential.
For this, I propose you build a calendar table and mark the periods using integer division, as illustrated in the code:
DECLARE @CalStart DATE = '2017-01-01'
,@CalEnd DATE = '2018-01-01'
,@CalWindowSize INT = 2
;WITH Numbers AS
(
SELECT TOP (DATEDIFF(MONTH, @CalStart, @CalEnd)) N = CAST(ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS INT) - 1
FROM syscolumns
)
SELECT CalWindow = N / @CalWindowSize
,CalDate = DATEADD(MONTH, N, @CalStart)
FROM Numbers
Once you have correctly configured the variables, you should have a calendar that represents the windows of interest.
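
To make the window numbering concrete, here is a small Python sketch (not T-SQL) of the same integer division. The function name cal_window is hypothetical, and it assumes every date falls on or after the calendar start, as the generated calendar guarantees.

```python
from datetime import date


def cal_window(d, cal_start, window_size):
    """Window index of d: whole months since cal_start, integer-divided by the window size."""
    months_from_start = (d.year - cal_start.year) * 12 + (d.month - cal_start.month)
    if months_from_start < 0:
        raise ValueError("date is before the calendar start")
    return months_from_start // window_size


cal_start = date(2017, 1, 1)
for month in range(1, 6):
    d = date(2017, month, 1)
    print(d, cal_window(d, cal_start, 2))
```

With a window size of 2, January and February land in window 0, March and April in window 1, and May in window 2.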
It is then a matter of affixing this calendar to your dataset and grouping by not only the person but the CalWindow too:
DECLARE @MyTable TABLE(person INT, the_date date, the_value int)
INSERT INTO @MyTable VALUES
(1,'2017-01-01', 10),
(1,'2017-02-01', 5),
(1,'2017-03-01', 5),
(1,'2017-04-01', 10),
(1,'2017-05-01', 2),
(2,'2017-04-01', 10),
(2,'2017-05-01', 10),
(2,'2017-05-01', 0),
(3,'2017-01-01', 2)
----------------------------------
-- Build Calendar
----------------------------------
DECLARE @CalStart DATE = '2017-01-01'
,@CalEnd DATE = '2018-01-01'
,@CalWindowSize INT = 2
;WITH Numbers AS
(
SELECT TOP (DATEDIFF(MONTH, @CalStart, @CalEnd)) N = CAST(ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS INT) - 1
FROM syscolumns
)
,Calendar AS
(
SELECT CalWindow = N / @CalWindowSize
,CalDate = DATEADD(MONTH, N, @CalStart)
FROM Numbers
)
SELECT TB.Person
,AVG(TB.the_value)
FROM @MyTable TB
JOIN Calendar CL ON TB.the_date = CL.CalDate
GROUP BY CL.CalWindow, TB.person
Hope I have understood your problem.

Related

How to calculate average in date column

I don't know how to calculate the average age of a column of type date in SQL Server.

You can use datediff() and aggregation. Assuming that your date column is called dt in table mytable, and that you want the average age in years over the whole table, then you would do:
select avg(datediff(year, dt, getdate())) avg_age
from mytable
You can change the first argument to datediff() (which is called the date part), to any other supported value depending on what you actually mean by age; for example datediff(day, dt, getdate()) gives you the difference in days.

First, let's calculate the age in years correctly. See the comments in the code, with the understanding that DATEDIFF does NOT calculate age. It only counts the number of temporal boundaries crossed.
--===== Local obviously named variables defined and assigned
DECLARE @StartDT DATETIME = '2019-12-31 23:59:59.997'
,@EndDT DATETIME = '2020-01-01 00:00:00.000'
;
--===== Show the difference in milliseconds between the two date/times
-- Because DATETIME has a resolution of about 3.3ms, this returns only a few milliseconds,
-- which certainly does NOT depict an age of 1 year.
SELECT DATEDIFF(ms,@StartDT,@EndDT)
;
--===== This solution will mistakenly return an age of 1 year for the dates given,
-- which are only a few milliseconds apart according to the SELECT above.
SELECT IncorrectAgeInYears = DATEDIFF(YEAR, @StartDT, @EndDT)
;
--===== This calculates the age in years correctly in T-SQL.
-- If the anniversary date has not yet occurred, 1 year is subtracted.
SELECT CorrectAgeInYears = DATEDIFF(yy, @StartDT, @EndDT)
- IIF(DATEADD(yy, DATEDIFF(yy, @StartDT, @EndDT), @StartDT) > @EndDT, 1, 0)
;
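
The same boundary-versus-anniversary distinction can be sketched in Python (used here only because it runs without a database). The function name age_in_years is chosen for this illustration. It assumes the start is not later than the end, and it clamps a February 29 start to February 28 in non-leap years.

```python
from datetime import datetime


def age_in_years(start, end):
    """Completed years between start and end (assumes start <= end)."""
    years = end.year - start.year  # number of year boundaries crossed
    try:
        anniversary = start.replace(year=start.year + years)
    except ValueError:
        # start is Feb 29 and the target year is not a leap year: clamp to Feb 28
        anniversary = start.replace(year=start.year + years, day=28)
    # If this year's anniversary has not happened yet, one year too many was counted
    return years - (1 if anniversary > end else 0)


start = datetime(2019, 12, 31, 23, 59, 59, 997000)
end = datetime(2020, 1, 1)
print(end.year - start.year)     # boundary count: 1
print(age_in_years(start, end))  # completed years: 0
```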
Now, let's turn that correct calculation into a Table Valued Function that returns a single scalar value, producing a really high speed "Inline Scalar Function".
CREATE FUNCTION [dbo].[AgeInYears]
(
@StartDT DATETIME, --Date of birth or date of manufacture or start date.
@EndDT DATETIME --Usually, GETDATE() or CURRENT_TIMESTAMP but
--can be any date source like a column that has an end date.
)
RETURNS TABLE WITH SCHEMABINDING AS
RETURN
SELECT AgeInYears = DATEDIFF(yy, @StartDT, @EndDT)
- IIF(DATEADD(yy, DATEDIFF(yy, @StartDT, @EndDT), @StartDT) > @EndDT, 1, 0)
;
Then, to Dale's point, let's create a test table and populate it. This one is a little overkill for this problem but it's also useful for a lot of different examples. Don't let the million rows scare you... this runs in just over 2 seconds on my laptop including the Clustered Index creation.
--===== Create and populate a large test table on-the-fly.
-- "SomeInt" has a range of 1 to 50,000 numbers
-- "SomeLetters2" has a range of "AA" to "ZZ"
-- "SomeDecimal has a range of 10.00 to 100.00 numbers
-- "SomeDate" has a range of >=01/01/2000 & <01/01/2020 whole dates
-- "SomeDateTime" has a range of >=01/01/2000 & <01/01/2020 Date/Times
-- "SomeRand" contains the value of RAND just to show it can be done without a loop.
-- "SomeHex9" contains 9 hex digits from NEWID()
-- "SomeFluff" is a fixed width CHAR column just to give the table a little bulk.
SELECT TOP 1000000
SomeInt = ABS(CHECKSUM(NEWID())%50000) + 1
,SomeLetters2 = CHAR(ABS(CHECKSUM(NEWID())%26) + 65)
+ CHAR(ABS(CHECKSUM(NEWID())%26) + 65)
,SomeDecimal = CAST(RAND(CHECKSUM(NEWID())) * 90 + 10 AS DECIMAL(9,2))
,SomeDate = DATEADD(dd, ABS(CHECKSUM(NEWID())%DATEDIFF(dd,'2000','2020')), '2000')
,SomeDateTime = DATEADD(dd, DATEDIFF(dd,0,'2000'), RAND(CHECKSUM(NEWID())) * DATEDIFF(dd,'2000','2020'))
,SomeRand = RAND(CHECKSUM(NEWID())) --CHECKSUM produces an INT and is MUCH faster than conversion to VARBINARY.
,SomeHex9 = RIGHT(NEWID(),9)
,SomeFluff = CONVERT(CHAR(170),'170 CHARACTERS RESERVED') --Just to add a little bulk to the table.
INTO dbo.JBMTest
FROM sys.all_columns ac1 --Cross Join forms up to a 16 million rows
CROSS JOIN sys.all_columns ac2 --Pseudo Cursor
;
GO
--===== Add a non-unique Clustered Index to SomeDateTime for this demo.
CREATE CLUSTERED INDEX IXC_Test ON dbo.JBMTest (SomeDateTime ASC)
;
Now, let's find the average age of those million rows, as represented by the SomeDateTime column.
SELECT AvgAgeInYears = AVG(age.AgeInYears )
,RowsCounted = COUNT(*)
FROM dbo.JBMTest tst
CROSS APPLY dbo.AgeInYears(SomeDateTime,GETDATE()) age
;

How to get date difference in SQL Server and return value

How do I get the day difference between when the user registered and the current date? I have this scenario:
I have some fixed values in a master table, like [0, 6, 12, 18, 24, 30, 36, 42 .....]
and suppose:
If the day difference is greater than or equal to 1 and less than 6, it should return 1.
If the day difference is greater than 6 and less than 12, it should return 2.
If the day difference is greater than 12 and less than 18, it should return 3.
If the day difference is greater than 18 and less than 24, it should return 4.
.
.
.
And so on.
I don't want to use CASE statements because the values in the master table are not fixed, but the pattern of the values is fixed. The pattern is:
the common difference between two consecutive values is 6, i.e.
if n=0 then
n+1 = (0 + 6) => 6
Thanks
declare @day int;
declare @regdate datetime = '2019-12-09 19:24:19.623';
declare @currentDate datetime = GETDATE();
SET @day = (SELECT DATEDIFF(day, @regdate, @currentDate) % 6 FROM tblMembers WHERE Id = 1)
PRINT @day

I think that you are looking for integer division, not modulo. This is the default behavior in SQL Server when both arguments are integers, so, since DATEDIFF returns an integer, this should do it:
1 + DATEDIFF(day, @regdate, @currentDate) / 6
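
To see which day counts map to which result, here is a tiny Python sketch of the same integer-division idea. The name bucket is hypothetical, and the step of 6 is taken from the question. Negative day counts are rejected because Python's floor division and T-SQL's truncating division differ for negative numbers.

```python
def bucket(days_since_registration, step=6):
    """1-based bucket number: days 0-5 give 1, days 6-11 give 2, and so on."""
    if days_since_registration < 0:
        raise ValueError("day difference must not be negative")
    return 1 + days_since_registration // step


for days in (1, 5, 6, 11, 12, 18):
    print(days, bucket(days))
```

Note that a difference of exactly 6 falls into bucket 2 under this scheme; the question leaves the boundary values unspecified.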

Here's an approach you can build your solution on:
declare @masterTable table (id int, col int);
insert into @masterTable values
(1,0) ,
(2,6) ,
(3,12),
(4,18),
(5,24),
(6,30),
(7,36),
(8,42),
(9,48);
-- test data
declare @day int;
declare @regdate datetime = '2019-12-09 19:24:19.623';
declare @currentDate datetime = GETDATE();
select @day = datediff(day, @regdate, @currentDate)
;with cte as (
select id,
col lowerBound,
-- here we need to provide some fallback value for last record
coalesce(lead(col) over (order by id), 1000) upperBound
from @masterTable
)
-- half-open range, so a value equal to a boundary matches exactly one row
select id from (values (@day)) [day]([cnt])
join cte on [day].[cnt] >= cte.lowerBound and [day].[cnt] < cte.upperBound

SQL Server : 5 days moving average for last month

I have a view with two columns TOTAL and DATE, the latter one excludes Saturdays and Sundays, i.e.
TOTAL DATE
0 1-1-2014
33 2-1-2014
11 3-1-2014
55 5-1-2014
...
25 15-1-2014
35 16-1-2014
17 17-1-2014
40 20-1-2014
33 21-1-2014
...
The task that I'm trying to complete is counting 5 days TOTAL average for the whole month, i.e between 13th and 17th, 14th and 20th (we skip weekends), 15th and 21st etc. up to current date.
And YES, they ARE OVERLAPPING RANGES.
Any idea how to achieve it in SQL?
Example of the output (starting from the 6th and using fake numbers)
5daysAVG Start_day
22 1-01-2014 <-counted between 1st to 6th Jan excl 4 and 5 of Jan
25 2-01-2014 <- Counted between 2nd to 7th excluding 4 and 5
27 3-01-2014 <- 3rd to 8th excluding 4/5
24 6-01-2014 <-6th to 10th
...
33 today-5

Okay, I usually set up some test data to play with.
Here is some code to create a [work] table in tempdb. I am skipping weekends. The total is a random number from 0 to 39.
-- Just playing
use tempdb;
go
-- drop existing
if object_id ('work') > 0
drop table work
go
-- create new
create table work
(
my_date date,
my_total int
);
go
-- clear data
truncate table work;
go
-- Monday = 1
SET DATEFIRST 1;
GO
-- insert data
declare @dt date = '20131231';
declare @hr int;
while (@dt < '20140201')
begin
set @hr = floor(rand(checksum(newid())) * 40);
set @dt = dateadd(d, 1, @dt);
if (datepart(dw, @dt) < 6)
insert into work values (@dt, @hr);
end
go
This becomes really easy in SQL Server 2012 with the new LEAD() window function.
-- show data
with cte_summary as
(
select
row_number() over (order by my_date) as my_num,
my_date,
my_total,
LEAD(my_total, 0, 0) OVER (ORDER BY my_date) +
LEAD(my_total, 1, 0) OVER (ORDER BY my_date) +
LEAD(my_total, 2, 0) OVER (ORDER BY my_date) +
LEAD(my_total, 3, 0) OVER (ORDER BY my_date) +
LEAD(my_total, 4, 0) OVER (ORDER BY my_date) as my_sum,
(select count(*) from work) as my_cnt
from work
)
select * from cte_summary
where my_num <= my_cnt - 4
Basically, we give a row number to each row, calculate the sum for rows 0 (current) to row 4 (4 away) and a total count.
Since this is a running total for five periods, the last four dates do not have five rows of data available. Therefore, we toss them out: my_num <= my_cnt - 4
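
The windowing itself can be sketched in a few lines of Python, independent of SQL Server. The function name moving_averages is made up for this example, and the totals are assumed to be already ordered by date with weekends absent. Unlike the integer columns above, this sketch uses true division.

```python
def moving_averages(totals, window=5):
    """Average of each run of `window` consecutive rows; the last window-1 rows start no run."""
    return [
        sum(totals[i:i + window]) / window
        for i in range(len(totals) - window + 1)
    ]


totals = [0, 33, 11, 55, 25, 35]
print(moving_averages(totals))
```

Six rows produce two averages, one starting at the first row and one at the second, which mirrors the my_num <= my_cnt - 4 filter.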
I hope this solves your problem!
If you are only caring about one number for the month, change the select to the following. I left the other rows in for you to get an understanding of what is going on.
select avg(my_sum/5) as my_stat
from cte_summary
where my_num <= my_cnt - 4
FOR SQL SERVER < 2012 & >= 2005
Like anything in this world, there is always a way to do it. I used a small tally table to loop thru the data and collect sets of 5 data points for averages.
-- show data
with
cte_tally as
(
select
row_number() over (order by (select 1)) as n
from
sys.all_columns x
),
cte_data as
(
select
row_number() over (order by my_date) as my_num,
my_date,
my_total
from
work
)
select
(select my_date from cte_data where my_num = n) as the_date,
(
select sum(my_total) / 5
from cte_data
where my_num >= n and my_num < n+5
) as the_average
from cte_tally
where n <= (select count(*)-4 from work)
Here is an explanation of the common table expressions (CTE).
cte_data = order data by date and give row numbers
cte_tally = a set based counting algorithm
For groups of five calculate an average and show the date.
This solution does not depend on holidays or weekends. If data is there, it just partitions by groups of five order by date.
If you need to filter out holidays and weekends, create a holiday table. Add a where clause to cte_data that checks for NOT IN (SELECT DATE FROM HOLIDAY TABLE).
Good luck!

SQL Server offers the datepart(wk, ...) function to get the week of the year. Unfortunately, its week numbering restarts on the first day of each year, so a week that spans the new year is split in two.
Instead, you can find sequences of consecutive values and group them together:
select min(date), max(date), avg(total*1.0)
from (select v.*, row_number() over (order by date) as seqnum
from view
) v
group by dateadd(day, -seqnum, date);
The idea is that subtracting a sequence of numbers from a sequence of consecutive days yields a constant.
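
Here is a minimal Python sketch of that constant-difference trick, assuming plain calendar dates as input. The function name consecutive_runs is hypothetical.

```python
from datetime import date, timedelta
from itertools import groupby


def consecutive_runs(dates):
    """Group dates into runs of consecutive days; returns (first_day, last_day) per run."""
    ordered = sorted(set(dates))
    runs = []
    # date minus its position is constant within a run of consecutive days
    for _, grp in groupby(enumerate(ordered), key=lambda p: p[1] - timedelta(days=p[0])):
        days = [d for _, d in grp]
        runs.append((days[0], days[-1]))
    return runs


sample = [date(2014, 1, 1), date(2014, 1, 2), date(2014, 1, 3),
          date(2014, 1, 6), date(2014, 1, 7)]
print(consecutive_runs(sample))
```

The weekend gap between 3 and 6 January changes the difference, so the five dates split into two runs.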
You can also do this by using a canonical date and dividing by 7:
select min(date), max(date), avg(total*1.0)
from view v
group by datediff(day, '2000-01-03', date) / 7;
The date '2000-01-03' is an arbitrary Monday.
EDIT:
You seem to want a 5-day moving average. Because there is no data for the weekends, a 7-day calendar window contains exactly five working days, so avg() should just work:
select v1.date, avg(v2.total)
from view v1 join
view v2
on v2.date >= v1.date and v2.date < dateadd(day, 7, v1.date)
group by v1.date;

Here's a solution that works in SQL 2008;
The concept here is to use a table variable to normalize the data first; the rest is simple math to count and average the days.
By normalizing the data, I mean, get rid of weekend days, and assign ID's in a temporary table variable that can be used to identify the rows;
Check it out:
-- This represents your original source table
Declare @YourSourceTable Table
(
Total Int,
CreatedDate DateTime
)
-- This represents some test data in your table with 2 weekends
Insert Into @YourSourceTable Values (0, '1-1-2014')
Insert Into @YourSourceTable Values (33, '1-2-2014')
Insert Into @YourSourceTable Values (11, '1-3-2014')
Insert Into @YourSourceTable Values (55, '1-4-2014')
Insert Into @YourSourceTable Values (25, '1-5-2014')
Insert Into @YourSourceTable Values (35, '1-6-2014')
Insert Into @YourSourceTable Values (17, '1-7-2014')
Insert Into @YourSourceTable Values (40, '1-8-2014')
Insert Into @YourSourceTable Values (33, '1-9-2014')
Insert Into @YourSourceTable Values (43, '1-10-2014')
Insert Into @YourSourceTable Values (21, '1-11-2014')
Insert Into @YourSourceTable Values (5, '1-12-2014')
Insert Into @YourSourceTable Values (12, '1-13-2014')
Insert Into @YourSourceTable Values (16, '1-14-2014')
-- Just a quick test to see the source data
Select * From @YourSourceTable
/* Now we need to normalize the data;
Let's just remove the weekends and get some consistent ID's to use in a separate table variable
We will use DateName SQL Function to exclude weekend days while also giving
sequential ID's to the remaining data in our temporary table variable,
which are easier to query later
*/
Declare @WorkingTable Table
(
TempID Int Identity,
Total Int,
CreatedDate DateTime
)
-- Let's get the data normalized:
Insert Into
@WorkingTable
Select
Total,
CreatedDate
From @YourSourceTable
Where DateName(Weekday, CreatedDate) != 'Saturday'
And DateName(Weekday, CreatedDate) != 'Sunday'
-- Let's run a 2nd quick sanity check to see our normalized data
Select * From @WorkingTable
/* Now that data is normalized, we can just use the ID's to get each 5 day range and
perform simple average function on the columns; I chose to use a CTE here just to
be able to query it and drop the NULL ranges (where there wasn't 5 days of data)
without having to recalculate each average
*/
; With rangeCte (StartDate, TotalAverage)
As
(
Select
wt.createddate As StartDate,
(
wt.Total +
(Select Total From @WorkingTable Where TempID = wt.TempID + 1) +
(Select Total From @WorkingTable Where TempID = wt.TempID + 2) +
(Select Total From @WorkingTable Where TempID = wt.TempID + 3) +
(Select Total From @WorkingTable Where TempID = wt.TempID + 4)
) / 5
As TotalAverage
From
@WorkingTable wt
)
Select
StartDate,
TotalAverage
From rangeCte
Where TotalAverage
Is Not Null

recurring period in sql script

The situation:
The user creates a case record that includes a date field (DateOpened), and wants to send the client a follow up every 30 days until the case is closed.
The user will run the query periodically (probably weekly) and provide a 'From' and 'To' date range to specify the period within which a multiple of 30 days since DateOpened may fall.
The request:
I need a method to identify records for which the user-specified date range includes a date that is a multiple of 30 days after the DateOpened date.
UPDATE
This is what came to me all of a sudden while watching a third rate TV show last night!!!
SELECT
....
FROM
....
WHERE
(CAST((DATEDIFF(dd, Invoice.DateOpened, @EndDate)/30) AS INT) - CAST((DATEDIFF(dd, Invoice.DateOpened, @StartDate)/30) AS INT)) >=1
OR DATEDIFF(dd, Invoice.DateOpened, @StartDate) % 30 = 0 --this line to capture valid records but where From and To dates are the same
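
For comparison, here is a small Python sketch of the requirement as stated: does any follow-up date (DateOpened plus a positive multiple of 30 days) fall inside the given range? The function name follow_up_due and its parameters are hypothetical, and the range is treated as inclusive on both ends. It is a restatement of the request, not a translation of the WHERE clause above.

```python
from datetime import date


def follow_up_due(date_opened, range_start, range_end, interval=30):
    """True if date_opened + k*interval days lies in [range_start, range_end] for some k >= 1."""
    if range_end < range_start:
        return False
    days_to_start = (range_start - date_opened).days
    days_to_end = (range_end - date_opened).days
    # Smallest k >= 1 whose follow-up date is not before range_start (ceiling division)
    k = max(1, -(-days_to_start // interval))
    return k * interval <= days_to_end


opened = date(2012, 1, 1)
print(follow_up_due(opened, date(2012, 1, 30), date(2012, 2, 2)))   # 31 Jan is day 30
print(follow_up_due(opened, date(2012, 2, 1), date(2012, 2, 28)))   # days 31 to 58 only
```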

Is this Microsoft SQL? Is this Express edition? As long as it's not Express, you may want to look into using the SQL Agent service, which lets you schedule tasks that can run against the database. What do you want it to do with the record once it hits 30 days?

You can use the DATEDIFF function to calculate the difference between dates in days. You can use the modulus (%) operator to get the "remainder" of a division operation. Combining the two gives you:
SELECT
....
FROM
....
WHERE
--In MS T-SQL, BETWEEN is inclusive.
DateOpened BETWEEN @UserSuppliedFromDate AND @UserSuppliedToDate
AND DATEDIFF(dd, DateOpened, getdate()) % 30 = 0
which should give you the desired result.
Edit (Give this example a try in MSSQL):
DECLARE @Table TABLE
(
ID integer,
DateOpened datetime
)
DECLARE @FromDate as datetime = '1/1/2012'
DECLARE @ToDate as datetime = '12/31/2012'
INSERT INTO @Table VALUES (0, '1/1/1982')
INSERT INTO @Table values (1, '1/1/2012')
INSERT INTO @Table VALUES (2, '2/17/2012')
INSERT INTO @Table VALUES (3, '3/16/2012')
INSERT INTO @Table VALUES (4, '4/16/2012')
INSERT INTO @Table VALUES (5, '5/28/2012')
INSERT INTO @Table VALUES (6, '1/31/2012')
INSERT INTO @Table VALUES (7, '12/12/2013')
DECLARE @DateLoop as datetime
DECLARE @ResultIDs as table ( ID integer, DateLoopAtTheTime datetime, DaysDifference integer )
--Initialize to lowest possible value
SELECT @DateLoop = @FromDate
--Loop until we hit the maximum date to check
WHILE @DateLoop <= @ToDate
BEGIN
INSERT INTO @ResultIDs (ID,DateLoopAtTheTime, DaysDifference)
SELECT ID, @DateLoop, DATEDIFF(dd,@DateLoop, DateOpened)
FROM @Table
WHERE
DATEDIFF(dd,@DateLoop, DateOpened) % 30 = 0
AND DATEDIFF(dd,@DateLoop,DateOpened) > 0 -- Avoids false positives when @DateLoop and DateOpened are the same
AND DateOpened <= @ToDate
SELECT @DateLoop = DATEADD(dd, 1, @DateLoop) -- Increment the iterator
END
SELECT distinct * From @ResultIDs

Grouping by contiguous dates, ignoring weekends in SQL

I'm attempting to group contiguous date ranges to show the minimum and maximum date for each range. So far I've used a solution similar to this one: http://www.sqlservercentral.com/articles/T-SQL/71550/ however I'm on SQL 2000 so I had to make some changes. This is my procedure so far:
create table #tmp
(
date smalldatetime,
rownum int identity
)
insert into #tmp
select distinct date from testDates order by date
select
min(date) as dateRangeStart,
max(date) as dateRangeEnd,
count(*) as dates,
dateadd(dd,-1*rownum, date) as GroupID
from #tmp
group by dateadd(dd,-1*rownum, date)
drop table #tmp
It works exactly how I want except for one issue: weekends. My data sets have no records for weekend dates, which means any group found is at most 5 days. For instance, in the results below, I would like the last 3 groups to show up as a single record, with a dateRangeStart of 10/6 and a dateRangeEnd of 10/20:
Is there some way I can set this up to ignore a break in the date range if that break is just a weekend?
Thanks for the help.

EDITED
I didn't like my previous idea very much. Here's a better one, I think:
Based on the first and the last dates from the set of those to be grouped, prepare the list of all the intermediate weekend dates.
Insert the working dates together with weekend dates, ordered, so they would all be assigned rownum values according to their normal order.
Use your method of finding contiguous ranges with the following modifications:
1) when calculating dateRangeStart, if it's a weekend date, pick the nearest following weekday;
2) accordingly for dateRangeEnd, if it's a weekend date, pick the nearest preceding weekday;
3) when counting dates for the group, pick only weekdays.
Select from the resulting set only those rows where dates > 0, thus eliminating the groups formed only of the weekends.
And here's an implementation of the method, where it is assumed, that a week starts on Sunday (DATEPART returns 1) and weekend days are Sunday and Saturday:
DECLARE @tmp TABLE (date smalldatetime, rownum int IDENTITY);
DECLARE @weekends TABLE (date smalldatetime);
DECLARE @minDate smalldatetime, @maxDate smalldatetime, @date smalldatetime;
/* #1 */
SELECT @minDate = MIN(date), @maxDate = MAX(date)
FROM testDates;
SET @date = @minDate - DATEPART(dw, @minDate) + 7;
WHILE @date < @maxDate BEGIN
INSERT INTO @weekends
SELECT @date UNION ALL
SELECT @date + 1;
SET @date = @date + 7;
END;
/* #2 */
INSERT INTO @tmp
SELECT date FROM testDates
UNION
SELECT date FROM @weekends
ORDER BY date;
/* #3 & #4 */
SELECT *
FROM (
SELECT
MIN(date + CASE DATEPART(dw, date) WHEN 1 THEN 1 WHEN 7 THEN 2 ELSE 0 END)
AS dateRangeStart,
MAX(date - CASE DATEPART(dw, date) WHEN 1 THEN 2 WHEN 7 THEN 1 ELSE 0 END)
AS dateRangeEnd,
COUNT(CASE WHEN DATEPART(dw, date) NOT IN (1, 7) THEN date END) AS dates,
DATEADD(d, -rownum, date) AS GroupID
FROM @tmp
GROUP BY DATEADD(d, -rownum, date)
) s
WHERE dates > 0;
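
A different way to reach the same grouping, sketched here in Python rather than T-SQL, is to number only the working days so that Friday and the following Monday become neighbours, and then apply the usual constant-difference grouping. This is an alternative technique, not the method implemented above. The names business_day_index and weekday_runs are made up for this sketch, and weekends are assumed to be Saturday and Sunday.

```python
from datetime import date
from itertools import groupby


def business_day_index(d):
    """Sequential number over Monday-Friday only (0001-01-01, ordinal 1, is a Monday)."""
    n = d.toordinal() - 1
    return (n // 7) * 5 + d.weekday()


def weekday_runs(dates):
    """Runs of consecutive working days; returns (range_start, range_end, day_count)."""
    ordered = sorted({d for d in dates if d.weekday() < 5})
    runs = []
    # business-day index minus position is constant within an unbroken run
    for _, grp in groupby(enumerate(ordered),
                          key=lambda p: business_day_index(p[1]) - p[0]):
        days = [d for _, d in grp]
        runs.append((days[0], days[-1], len(days)))
    return runs


sample = ([date(2014, 10, day) for day in (6, 7, 8, 9, 10, 13, 14, 15, 16, 17, 20)]
          + [date(2014, 10, 23), date(2014, 10, 24)])
print(weekday_runs(sample))
```

In this sample the dates from 6 to 20 October form one range despite the two weekends, while 23 and 24 October form a second range because 21 and 22 October are missing.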
