In a DWH (SQL Server) I have two tables:
DWH.Days
DayStart | DayStop | DaysBetween
2022-04-21 | 2022-04-24 | null
2022-03-12 | 2022-04-27 | null
2022-04-21 | 2022-04-24 | null
2022-03-01 | 2022-04-22 | null
and
DWH.Calendar
Date | IsHoliday?
2022-05-11 | yes
2022-05-12 | no
2022-05-13 | yes
2022-05-15 | no
I need to update DWH.Days.DaysBetween with the number of days between DayStart and DayStop where DWH.DimCalendar.IsHoliday?='no'. I don't have permission to change the data model. I don't know how to do this; any ideas?
Your data (note, however, that there are several issues with your sample data):
declare @Days table (
DayStart DATE NOT NULL,
DayStop DATE NOT NULL,
DaysBetween VARCHAR(40)
);
INSERT INTO @Days (DayStart, DayStop, DaysBetween)
VALUES
('2022-04-21', '2022-04-24', NULL),
('2022-03-12', '2022-04-27', NULL),
('2022-04-21', '2022-04-24', NULL), -- repeated row; it should be deleted
('2022-03-01', '2022-04-22', NULL);
declare @Calendar table(
Date DATE NOT NULL,
IsHoliday VARCHAR(30) NOT NULL
);
INSERT INTO @Calendar (Date, IsHoliday)
VALUES
('2022-05-11', 'yes'), -- actual values
('2022-05-12', 'no'), -- actual values
('2022-05-13', 'yes'), -- actual values
('2022-05-15', 'no'), -- actual values
('2022-04-23', 'yes'), -- added for testing; can be deleted
('2022-03-10', 'no'), -- added for testing; can be deleted
('2022-03-14', 'yes'), -- added for testing; can be deleted
('2022-04-22', 'no'), -- added for testing; can be deleted
('2022-03-15', 'no');
First, use a cross join to satisfy the condition DayStart < Date < DayStop and IsHoliday = 'no'. Then use a subquery with DATEDIFF and GROUP BY, as follows:
SELECT Daystart,
Daystop,
Datediff(d, Daystart, Daystop) - Count(Isholiday) AS DaysBetween
FROM (SELECT Daystart,
Daystop,
Daysbetween,
Isholiday
FROM @Days d,
@Calendar c
WHERE d.Daystart < c.Date
AND d.Daystop > c.Date
AND c.Isholiday = 'no') a
GROUP BY Daystart,
Daystop
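To make the join condition easier to check by hand, here is a minimal Python sketch of the filtering step only (DayStart < Date < DayStop and IsHoliday = 'no'). It counts the matching calendar rows directly and does not reproduce the DATEDIFF subtraction; the function name `non_holiday_days` is hypothetical, and the data is the sample data from this answer.

```python
from datetime import date

# Calendar rows as (date, is_holiday) pairs, copied from the sample data.
calendar = [
    (date(2022, 5, 11), "yes"),
    (date(2022, 5, 12), "no"),
    (date(2022, 5, 13), "yes"),
    (date(2022, 5, 15), "no"),
    (date(2022, 4, 23), "yes"),
    (date(2022, 3, 10), "no"),
    (date(2022, 3, 14), "yes"),
    (date(2022, 4, 22), "no"),
    (date(2022, 3, 15), "no"),
]


def non_holiday_days(day_start, day_stop, calendar_rows):
    """Count calendar rows flagged 'no' that fall strictly between the two dates."""
    return sum(
        1
        for day, is_holiday in calendar_rows
        if day_start < day < day_stop and is_holiday == "no"
    )


# The three distinct ranges from the Days sample data.
ranges = [
    (date(2022, 4, 21), date(2022, 4, 24)),
    (date(2022, 3, 12), date(2022, 4, 27)),
    (date(2022, 3, 1), date(2022, 4, 22)),
]
counts = [non_holiday_days(start, stop, calendar) for start, stop in ranges]
```

Because the comparison is strict on both sides, a calendar row that falls exactly on DayStart or DayStop is not counted.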
I'm looking for help with a SQL query. Below are the details.
Database: Microsoft SQL Server 2016
Data Table:
It's a "version history" table with 3 columns: version number, effective date, and end date.
The version number with an end_dt of 12/31/9999 is considered the "active" version number.
Users can "restore" prior versions and make them active again.
version_number | eff_dt | end_dt
0 | 2021-04-13 18:03:26.483 | 2021-04-16 18:35:06.367
1 | 2021-04-16 18:35:06.370 | 2021-04-19 20:45:38.993
1 | 2021-04-19 20:45:38.997 | 2021-05-06 16:00:59.990
2 | 2021-05-06 16:00:59.990 | 2021-05-06 16:13:03.997
3 | 2021-05-06 16:13:04.000 | 2021-05-06 16:17:23.127
4 | 2021-05-06 16:17:23.130 | 2021-05-06 16:52:45.250
4 | 2021-05-06 16:52:45.253 | 2021-05-11 15:36:25.283
4 | 2021-05-11 15:36:25.283 | 2021-05-14 15:52:50.843
5 | 2021-05-14 15:52:50.847 | 2021-05-20 17:14:55.860
4 | 2021-05-20 17:14:55.863 | 2021-05-20 17:14:55.867
1 | 2021-05-20 17:14:55.870 | 9999-12-31 00:00:00.000
Desired Output:
A query to display a consolidated version history where consecutive entries in the version history table are displayed as a single row encompassing the entire date range the version was active.
version_number | eff_dt | end_dt
0 | 2021-04-13 18:03:26.483 | 2021-04-16 18:35:06.367
1 | 2021-04-16 18:35:06.370 | 2021-05-06 16:00:59.990
2 | 2021-05-06 16:00:59.990 | 2021-05-06 16:13:03.997
3 | 2021-05-06 16:13:04.000 | 2021-05-06 16:17:23.127
4 | 2021-05-06 16:17:23.130 | 2021-05-14 15:52:50.843
5 | 2021-05-14 15:52:50.847 | 2021-05-20 17:14:55.860
4 | 2021-05-20 17:14:55.863 | 2021-05-20 17:14:55.867
1 | 2021-05-20 17:14:55.870 | 9999-12-31 00:00:00.000
Question:
How would one write a SQL statement to generate the Desired Output based on the Data Table?
SQL Script to create sample data:
CREATE TABLE #t1(
[version_number] [int] NULL,
[eff_dt] [datetime] NOT NULL,
[end_dt] [datetime] NOT NULL
)
GO
INSERT #t1 ([version_number], [eff_dt], [end_dt]) VALUES (1, CAST(N'2021-05-20T17:14:55.870' AS DateTime), CAST(N'9999-12-31T00:00:00.000' AS DateTime))
GO
INSERT #t1 ([version_number], [eff_dt], [end_dt]) VALUES (5, CAST(N'2021-05-14T15:52:50.847' AS DateTime), CAST(N'2021-05-20T17:14:55.860' AS DateTime))
GO
INSERT #t1 ([version_number], [eff_dt], [end_dt]) VALUES (4, CAST(N'2021-05-20T17:14:55.863' AS DateTime), CAST(N'2021-05-20T17:14:55.867' AS DateTime))
GO
INSERT #t1 ([version_number], [eff_dt], [end_dt]) VALUES (4, CAST(N'2021-05-11T15:36:25.283' AS DateTime), CAST(N'2021-05-14T15:52:50.843' AS DateTime))
GO
INSERT #t1 ([version_number], [eff_dt], [end_dt]) VALUES (4, CAST(N'2021-05-06T16:52:45.253' AS DateTime), CAST(N'2021-05-11T15:36:25.283' AS DateTime))
GO
INSERT #t1 ([version_number], [eff_dt], [end_dt]) VALUES (4, CAST(N'2021-05-06T16:17:23.130' AS DateTime), CAST(N'2021-05-06T16:52:45.250' AS DateTime))
GO
INSERT #t1 ([version_number], [eff_dt], [end_dt]) VALUES (3, CAST(N'2021-05-06T16:13:04.000' AS DateTime), CAST(N'2021-05-06T16:17:23.127' AS DateTime))
GO
INSERT #t1 ([version_number], [eff_dt], [end_dt]) VALUES (2, CAST(N'2021-05-06T16:00:59.990' AS DateTime), CAST(N'2021-05-06T16:13:03.997' AS DateTime))
GO
INSERT #t1 ([version_number], [eff_dt], [end_dt]) VALUES (1, CAST(N'2021-04-19T20:45:38.997' AS DateTime), CAST(N'2021-05-06T16:00:59.990' AS DateTime))
GO
INSERT #t1 ([version_number], [eff_dt], [end_dt]) VALUES (1, CAST(N'2021-04-16T18:35:06.370' AS DateTime), CAST(N'2021-04-19T20:45:38.993' AS DateTime))
GO
INSERT #t1 ([version_number], [eff_dt], [end_dt]) VALUES (0, CAST(N'2021-04-13T18:03:26.483' AS DateTime), CAST(N'2021-04-16T18:35:06.367' AS DateTime))
GO
You can achieve this in two steps, using successive common table expressions (CTEs). First, you need a consecutive ranking number within your data. On the basis of this you can then write a recursive CTE that looks at the version_number of consecutive rows (necessarily one apart). This allows us to create a "batch" number: if the version_number is the same, we take the previous batch number; if it is different, we increment the previous batch number by one. Finally, we need a simple min and max on the dates, grouping by the batch number. The result looks like this:
declare @t1 TABLE (
[version_number] [int] NULL,
[eff_dt] [datetime] NOT NULL,
[end_dt] [datetime] NOT NULL
);
INSERT @t1 ([version_number], [eff_dt], [end_dt])
VALUES
(1, CAST(N'2021-05-20T17:14:55.870' AS DateTime), CAST(N'9999-12-31T00:00:00.000' AS DateTime)),
(5, CAST(N'2021-05-14T15:52:50.847' AS DateTime), CAST(N'2021-05-20T17:14:55.860' AS DateTime)),
(4, CAST(N'2021-05-20T17:14:55.863' AS DateTime), CAST(N'2021-05-20T17:14:55.867' AS DateTime)),
(4, CAST(N'2021-05-11T15:36:25.283' AS DateTime), CAST(N'2021-05-14T15:52:50.843' AS DateTime)),
(4, CAST(N'2021-05-06T16:52:45.253' AS DateTime), CAST(N'2021-05-11T15:36:25.283' AS DateTime)),
(4, CAST(N'2021-05-06T16:17:23.130' AS DateTime), CAST(N'2021-05-06T16:52:45.250' AS DateTime)),
(3, CAST(N'2021-05-06T16:13:04.000' AS DateTime), CAST(N'2021-05-06T16:17:23.127' AS DateTime)),
(2, CAST(N'2021-05-06T16:00:59.990' AS DateTime), CAST(N'2021-05-06T16:13:03.997' AS DateTime)),
(1, CAST(N'2021-04-19T20:45:38.997' AS DateTime), CAST(N'2021-05-06T16:00:59.990' AS DateTime)),
(1, CAST(N'2021-04-16T18:35:06.370' AS DateTime), CAST(N'2021-04-19T20:45:38.993' AS DateTime)),
(0, CAST(N'2021-04-13T18:03:26.483' AS DateTime), CAST(N'2021-04-16T18:35:06.367' AS DateTime));
with rowdata as
(
SELECT version_number, eff_dt, end_dt,
ROW_NUMBER() OVER(ORDER BY eff_dt) rn
FROM @t1
),
cte_recursive as
(
SELECT 1 as batchno, rn, version_number, eff_dt, end_dt
FROM rowdata
WHERE version_number = 0
UNION ALL
SELECT CASE WHEN rec.version_number = rd.version_number
THEN rec.batchno
ELSE rec.batchno + 1
END,
rd.rn, rd.version_number, rd.eff_dt, rd.end_dt
FROM cte_recursive rec
INNER JOIN rowdata rd on rec.rn = rd.rn - 1
)
SELECT
version_number, min(eff_dt) as eff_dt, max(end_dt) as end_dt
FROM cte_recursive
GROUP BY version_number, batchno
A couple of points to note. First, I prefer table variables to temporary tables (a slight advantage is that they don't need to be dropped). Second, you can insert multiple rows in one statement, separated by commas, as I have shown (no need for multiple inserts).
To help you understand how the recursive element works, we begin by a simple select which is the base case, in this case selecting where version_number is 0. We then build up from that by joining to the recursive part where rn (the value returned by ROW_NUMBER()) is one greater than the value we already have. We simply need to check for a difference in the version_number between our old value and the new row, to decide if the batch number needs incrementing or not.
You may find it helpful to run these queries one at a time, to help you understand what is happening (for example just run the sub-select that includes the row_number()).
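If it helps to see the batch logic outside SQL, here is a rough Python sketch of the same idea: walk the rows in eff_dt order and start a new batch whenever the version_number differs from the previous row. It is an iterative stand-in for the recursive CTE, not a translation of it; the function name `consolidate` is hypothetical, and the timestamps are kept as ISO-style strings so that plain string comparison orders them correctly.

```python
# (version_number, eff_dt, end_dt) rows from the sample data.
rows = [
    (1, "2021-05-20 17:14:55.870", "9999-12-31 00:00:00.000"),
    (5, "2021-05-14 15:52:50.847", "2021-05-20 17:14:55.860"),
    (4, "2021-05-20 17:14:55.863", "2021-05-20 17:14:55.867"),
    (4, "2021-05-11 15:36:25.283", "2021-05-14 15:52:50.843"),
    (4, "2021-05-06 16:52:45.253", "2021-05-11 15:36:25.283"),
    (4, "2021-05-06 16:17:23.130", "2021-05-06 16:52:45.250"),
    (3, "2021-05-06 16:13:04.000", "2021-05-06 16:17:23.127"),
    (2, "2021-05-06 16:00:59.990", "2021-05-06 16:13:03.997"),
    (1, "2021-04-19 20:45:38.997", "2021-05-06 16:00:59.990"),
    (1, "2021-04-16 18:35:06.370", "2021-04-19 20:45:38.993"),
    (0, "2021-04-13 18:03:26.483", "2021-04-16 18:35:06.367"),
]


def consolidate(version_rows):
    """Merge consecutive rows that share a version_number into one row."""
    # Sorting by eff_dt plays the role of ROW_NUMBER() OVER (ORDER BY eff_dt).
    ordered = sorted(version_rows, key=lambda r: r[1])
    batches = []  # each entry: [version_number, min eff_dt, max end_dt]
    for version, eff_dt, end_dt in ordered:
        if batches and batches[-1][0] == version:
            # Same version as the previous row: stay in the current batch.
            batches[-1][2] = max(batches[-1][2], end_dt)
        else:
            # Version changed: this row opens the next batch.
            batches.append([version, eff_dt, end_dt])
    return [tuple(batch) for batch in batches]


history = consolidate(rows)
```

On the sample data this yields eight rows, with version 4 and version 1 each appearing twice because their runs are not adjacent.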
BTW it was good of you to add the create statements.
This is easily accomplished using window functions. It's a variation on the gaps and islands problem.
The premise is to identify the islands of consecutive values of the version_number. The first CTE uses lag to compare the current row value to the previous row value and marks the start of the Next Group when the values are different. The second CTE uses sum as a window function to produce a running total of the groups. This provides each group of like version_numbers with its own sequential value.
The final select is then able to group by the version_number and its sequential group number, using the min and max dates for each.
Note also that using window functions and hitting the source table just once will be significantly more efficient than a recursive solution.
with ng as (
select *, case when Lag(version_number) over(order by end_dt) = version_number then 0 else 1 end as ng
from #t1
), grp as (
select *, Sum(ng) over(order by end_dt) as grp
from ng
)
select version_number, Min(eff_dt) eff_dt, Max(end_dt) end_dt
from grp
group by version_number, grp
order by eff_dt
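For anyone who wants to trace the two CTEs by hand, here is a small Python sketch of the flag and running-total steps, using only the version_number column in end_dt order. The list names are hypothetical; `accumulate` stands in for the windowed Sum.

```python
from itertools import accumulate

# version_number values from the sample data, listed in end_dt order.
versions = [0, 1, 1, 2, 3, 4, 4, 4, 5, 4, 1]

# Equivalent of Lag(): pair each value with its predecessor (None for the first row).
previous = [None] + versions[:-1]

# 1 marks the start of a new island, 0 means "same version as the row before".
new_group = [0 if prev == cur else 1 for prev, cur in zip(previous, versions)]

# Equivalent of Sum(ng) over(order by end_dt): a running total of the flags.
group_ids = list(accumulate(new_group))
```

Rows that share a value in `group_ids` belong to the same island, which is why the final select can group by version_number and grp.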
You can use this query:
DROP TABLE #t1
CREATE TABLE #t1(
[version_number] [int] NULL,
[eff_dt] [datetime] NOT NULL,
[end_dt] [datetime] NOT NULL
)
GO
INSERT #t1 ([version_number], [eff_dt], [end_dt]) VALUES (1, CAST(N'2021-05-20T17:14:55.870' AS DateTime), CAST(N'9999-12-31T00:00:00.000' AS DateTime))
GO
INSERT #t1 ([version_number], [eff_dt], [end_dt]) VALUES (5, CAST(N'2021-05-14T15:52:50.847' AS DateTime), CAST(N'2021-05-20T17:14:55.860' AS DateTime))
GO
INSERT #t1 ([version_number], [eff_dt], [end_dt]) VALUES (4, CAST(N'2021-05-20T17:14:55.863' AS DateTime), CAST(N'2021-05-20T17:14:55.867' AS DateTime))
GO
INSERT #t1 ([version_number], [eff_dt], [end_dt]) VALUES (4, CAST(N'2021-05-11T15:36:25.283' AS DateTime), CAST(N'2021-05-14T15:52:50.843' AS DateTime))
GO
INSERT #t1 ([version_number], [eff_dt], [end_dt]) VALUES (4, CAST(N'2021-05-06T16:52:45.253' AS DateTime), CAST(N'2021-05-11T15:36:25.283' AS DateTime))
GO
INSERT #t1 ([version_number], [eff_dt], [end_dt]) VALUES (4, CAST(N'2021-05-06T16:17:23.130' AS DateTime), CAST(N'2021-05-06T16:52:45.250' AS DateTime))
GO
INSERT #t1 ([version_number], [eff_dt], [end_dt]) VALUES (3, CAST(N'2021-05-06T16:13:04.000' AS DateTime), CAST(N'2021-05-06T16:17:23.127' AS DateTime))
GO
INSERT #t1 ([version_number], [eff_dt], [end_dt]) VALUES (2, CAST(N'2021-05-06T16:00:59.990' AS DateTime), CAST(N'2021-05-06T16:13:03.997' AS DateTime))
GO
INSERT #t1 ([version_number], [eff_dt], [end_dt]) VALUES (1, CAST(N'2021-04-19T20:45:38.997' AS DateTime), CAST(N'2021-05-06T16:00:59.990' AS DateTime))
GO
INSERT #t1 ([version_number], [eff_dt], [end_dt]) VALUES (1, CAST(N'2021-04-16T18:35:06.370' AS DateTime), CAST(N'2021-04-19T20:45:38.993' AS DateTime))
GO
INSERT #t1 ([version_number], [eff_dt], [end_dt]) VALUES (0, CAST(N'2021-04-13T18:03:26.483' AS DateTime), CAST(N'2021-04-16T18:35:06.367' AS DateTime))
GO
SELECT DISTINCT t.[version_number] , eff_Table.eff_dt , end_Table.end_dt FROM #t1 t INNER JOIN
(SELECT t.version_number, t.eff_dt
,ROW_NUMBER() OVER (PARTITION BY t.version_number ORDER BY t.eff_dt) AS FirstID
FROM #t1 t ) AS eff_Table ON t.version_number = eff_Table.version_number
INNER JOIN (
SELECT t.version_number,
t.end_dt
,ROW_NUMBER() OVER (PARTITION BY t.version_number ORDER BY t.end_dt desc) AS SecondID
FROM #t1 t ) AS end_Table ON end_Table.version_number = t.version_number
WHERE eff_Table.FirstID = 1 AND end_Table.SecondID = 1
I wanted to count the time gap between two rows for the same id if the second is less than an hour after the first, and partition the count for the week.
Suppose given date with time is 2020-07-01 08:00
create table #Temp (
Id integer not null,
Time datetime not null
);
insert into #Temp values (1, '2020-07-01 08:00');
insert into #Temp values (1, '2020-07-01 08:01');
insert into #Temp values (1, '2020-07-01 08:06');
insert into #Temp values (1, '2020-07-01 08:30');
insert into #Temp values (1, '2020-07-08 09:35');
insert into #Temp values (1, '2020-07-15 16:10');
insert into #Temp values (1, '2020-07-15 16:20');
insert into #Temp values (1, '2020-07-17 06:40');
insert into #Temp values (1, '2020-07-17 06:41');
insert into #Temp values (2, '2020-07-01 08:30');
insert into #Temp values (2, '2020-07-01 09:26');
insert into #Temp values (2, '2020-07-01 10:25');
insert into #Temp values (2, '2020-07-09 08:30');
insert into #Temp values (2, '2020-07-09 09:26');
insert into #Temp values (2, '2020-07-09 10:25');
insert into #Temp values (3, '2020-07-21 08:30');
insert into #Temp values (3, '2020-07-21 09:26');
insert into #Temp values (3, '2020-07-21 10:25');
The week should extend up to the last date in the record. Here, the last date is
2020-07-21 10:25
I have to transform the output of this query and split the duration by week.
select Id, sum(datediff(minute, Time, next_ts)) as duration_minutes
from (select t.*,
lead(Time) over (partition by id order by Time) as next_ts
from #Temp t
) t
where datediff(minute, Time, next_ts) < 60
group by Id;
Output:
id duration_minutes
1 41
2 230
3 115
The desired output should divide this duration on a weekly basis,
like Week 1, Week 2, Week 3, and so on.
Desired Output:
If the
start date is 2020-07-01 08:00
end date is 2020-07-21 10:25
id | Week 1 | Week 2 | Week 3
--------------------------------------
1 | 30 | 0 | 11
2 | 115 | 115 | 0
3 | 0 | 0 | 115
similarly, if the
start date is 2020-07-08 08:00
id | Week 1 | Week 2
---------------------------
1 | 11 | 0
2 | 115 | 0
3 | 0 | 115
Is this what you want?
select Id,
1 + datediff(second, '2020-07-01 06:00', time) / (24 * 60 * 60 * 7) as week_num,
sum(datediff(minute, Time, next_ts)) as duration_minutes
from (select t.*,
lead(Time) over (partition by id order by Time) as next_ts
from Temp t
) t
where datediff(minute, Time, next_ts) < 60
group by Id, datediff(second, '2020-07-01 06:00', time) / (24 * 60 * 60 * 7)
order by id, week_num;
Here is a db<>fiddle.
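To check the week arithmetic against the desired output, here is a minimal Python sketch of the same bucketing: each gap under 60 minutes is assigned to week 1 + (seconds since the start) / (seconds in a week), using the row's own timestamp and the 2020-07-01 06:00 start from the query above. The names `readings` and `weekly_minutes` are hypothetical.

```python
from collections import defaultdict
from datetime import datetime, timedelta

# Sample data from the question, grouped by Id.
readings = {
    1: ["2020-07-01 08:00", "2020-07-01 08:01", "2020-07-01 08:06",
        "2020-07-01 08:30", "2020-07-08 09:35", "2020-07-15 16:10",
        "2020-07-15 16:20", "2020-07-17 06:40", "2020-07-17 06:41"],
    2: ["2020-07-01 08:30", "2020-07-01 09:26", "2020-07-01 10:25",
        "2020-07-09 08:30", "2020-07-09 09:26", "2020-07-09 10:25"],
    3: ["2020-07-21 08:30", "2020-07-21 09:26", "2020-07-21 10:25"],
}


def weekly_minutes(data, start, max_gap_minutes=60):
    """Sum gaps shorter than max_gap_minutes per (id, week number)."""
    totals = defaultdict(int)
    for rec_id, stamps in data.items():
        times = sorted(datetime.strptime(s, "%Y-%m-%d %H:%M") for s in stamps)
        # zip(times, times[1:]) pairs each row with the next one, like lead().
        for current, nxt in zip(times, times[1:]):
            gap = int((nxt - current).total_seconds() // 60)
            if gap < max_gap_minutes:
                week_num = 1 + (current - start) // timedelta(weeks=1)
                totals[(rec_id, week_num)] += gap
    return dict(totals)


totals = weekly_minutes(readings, datetime(2020, 7, 1, 6, 0))
```

The result is keyed by (id, week number), so weeks with no qualifying gaps are simply absent rather than reported as 0.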
I am not able to understand the logic behind the week periods. Anyway, in the example below I am using the following code to set the week:
'Week ' + CAST(DENSE_RANK() OVER (ORDER BY DATEDIFF(DAY, @FirstDate, next_ts) / 7) AS VARCHAR(12))
You can adjust it to ignore the hours, be more precise, or do something else to match your real requirements.
Apart from that, you just need to perform a dynamic PIVOT. Here is the full working example:
DROP TABLE IF EXISTS #Temp;
create table #Temp (
Id integer not null,
Time datetime not null
);
insert into #Temp values (1, '2020-07-01 08:00');
insert into #Temp values (1, '2020-07-01 08:01');
insert into #Temp values (1, '2020-07-01 08:06');
insert into #Temp values (1, '2020-07-01 08:30');
insert into #Temp values (1, '2020-07-08 09:35');
insert into #Temp values (1, '2020-07-15 16:10');
insert into #Temp values (1, '2020-07-15 16:20');
insert into #Temp values (1, '2020-07-17 06:40');
insert into #Temp values (1, '2020-07-17 06:41');
insert into #Temp values (2, '2020-07-01 08:30');
insert into #Temp values (2, '2020-07-01 09:26');
insert into #Temp values (2, '2020-07-01 10:25');
insert into #Temp values (2, '2020-07-09 08:30');
insert into #Temp values (2, '2020-07-09 09:26');
insert into #Temp values (2, '2020-07-09 10:25');
insert into #Temp values (3, '2020-07-21 08:30');
insert into #Temp values (3, '2020-07-21 09:26');
insert into #Temp values (3, '2020-07-21 10:25');
DROP TABLE IF EXISTS #TEST
CREATE TABLE #TEST
(
[ID] INT
,[week_day] VARCHAR(12)
,[time_in_minutes] BIGINT
)
DECLARE @FirstDate DATE;
SELECT @FirstDate = MIN(Time)
FROM #Temp
INSERT INTO #TEST
select id
,'Week ' + CAST(DENSE_RANK() OVER (ORDER BY DATEDIFF(DAY, @FirstDate, next_ts) / 7) AS VARCHAR(12))
,datediff(minute, Time, next_ts)
from (select t.*,
lead(Time) over (partition by id order by Time) as next_ts
from #Temp t
) t
where datediff(minute, Time, next_ts) < 60
DECLARE @columns NVARCHAR(MAX);
SELECT @columns = STUFF
(
(
SELECT ',' + QUOTENAME([week_day])
FROM
(
SELECT DISTINCT CAST(REPLACE([week_day], 'Week ', '') AS INT)
,[week_day]
FROM #TEST
) DS ([rowID], [week_day])
ORDER BY [rowID]
FOR XML PATH(''), TYPE
).value('.', 'VARCHAR(MAX)')
,1
,1
,''
);
DECLARE @DynamicSQL NVARCHAR(MAX);
SET @DynamicSQL = N'
SELECT [ID], ' + @columns + '
FROM #TEST
PIVOT
(
SUM([time_in_minutes]) FOR [week_day] IN (' + @columns + ')
) PVT';
EXEC sp_executesql @DynamicSQL;
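The pivot step itself can be pictured with a short Python sketch: collect the distinct week labels in numeric order (the role of the column list), then sum the minutes per id and label. The input rows below are hypothetical long-format rows in the shape of #TEST, derived from the sample data; `pivot_weeks` is an illustrative name, and missing cells are left as None, in the way PIVOT returns NULL for them.

```python
# (ID, week_day, time_in_minutes) rows in the shape of the #TEST table.
long_rows = [
    (1, "Week 1", 1), (1, "Week 1", 5), (1, "Week 1", 24),
    (1, "Week 3", 10), (1, "Week 3", 1),
    (2, "Week 1", 56), (2, "Week 1", 59),
    (2, "Week 2", 56), (2, "Week 2", 59),
    (3, "Week 3", 56), (3, "Week 3", 59),
]


def pivot_weeks(rows):
    """Turn long (id, week, minutes) rows into one dict of week totals per id."""
    # Order labels by their numeric part, like ORDER BY [rowID] in the column query.
    columns = sorted({week for _, week, _ in rows}, key=lambda w: int(w.split()[1]))
    table = {}
    for rec_id, week, minutes in rows:
        row = table.setdefault(rec_id, dict.fromkeys(columns))
        row[week] = (row[week] or 0) + minutes
    return columns, table


columns, wide = pivot_weeks(long_rows)
```

Sorting on the numeric part matters once there are ten or more weeks, since a plain string sort would put "Week 10" before "Week 2".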
I want to transpose rows to columns against RuleID and DName. Following is sample data. Please help me achieve it.
CREATE TABLE [dbo].[Table1](
[RuleID] [nvarchar](10) NULL,
[DateTime1] [datetime] NULL,
[DName] [nvarchar](30) NULL
) ON [PRIMARY]
INSERT [dbo].Table1 ([RuleID], [DateTime1], [DName]) VALUES (N'DBRS', CAST(N'2017-03-28T12:22:04.000' AS DateTime), N'DB1')
INSERT [dbo].Table1 ([RuleID], [DateTime1], [DName]) VALUES (N'DBRK', CAST(N'2017-03-28T12:22:04.260' AS DateTime), N'DB1')
INSERT [dbo].Table1 ([RuleID], [DateTime1], [DName]) VALUES (N'DBRE', CAST(N'2017-03-28T12:22:09.000' AS DateTime), N'DB1')
INSERT [dbo].Table1 ([RuleID], [DateTime1], [DName]) VALUES (N'DBRK', CAST(N'2017-04-04T08:33:15.870' AS DateTime), N'DB2')
INSERT [dbo].Table1 ([RuleID], [DateTime1], [DName]) VALUES (N'DBRE', CAST(N'2017-04-04T08:33:31.000' AS DateTime), N'DB2')
INSERT [dbo].Table1 ([RuleID], [DateTime1], [DName]) VALUES (N'DBRK', CAST(N'2017-04-04T09:14:30.503' AS DateTime), N'DB2')
INSERT [dbo].Table1 ([RuleID], [DateTime1], [DName]) VALUES (N'DBRS', CAST(N'2017-04-04T09:14:31.000' AS DateTime), N'DB2')
INSERT [dbo].Table1 ([RuleID], [DateTime1], [DName]) VALUES (N'DBRE', CAST(N'2017-04-04T09:44:33.000' AS DateTime), N'DB2')
Desired Output:
SELECT 'DB1' As DName,'2017-03-28 12:22:04.260'AS BRKIndicated ,'2017-03-28 12:22:04.000' As BRKStart,'2017-03-28 12:22:09.000' AS BRKEnd
UNION
SELECT 'DB2'As DName,'2017-04-04 08:33:15.870'AS BRKIndicated,NULL As BRKStart,'2017-04-04 08:33:31.000'AS BRKEnd
UNION
SELECT 'DB2'As DName,'2017-04-04 09:14:30.503'AS BRKIndicated,'2017-04-04 09:14:31.000' As BRKStart,'2017-04-04 09:44:33.000'AS BRKEnd
Use PIVOT:
SELECT *
FROM
(
SELECT *
FROM Table1
) A
PIVOT
(
MAX(DateTime1) FOR RuleID IN ([DBRS],[DBRK],[DBRE])
)pvt
Use a self join
select D1.DName, D1.DateTime1 as DBRS, D2.DateTime1 as DBRK, D3.DateTime1 as DBRE
from Table1 D1
inner join Table1 D2
on D1.DName = D2.DName
and D2.RuleID = 'DBRK'
inner join Table1 D3
on D1.DName = D3.DName
and D3.RuleID = 'DBRE'
where D1.RuleID = 'DBRS'
If it's possible that one of the values might be missing, use LEFT JOIN instead
Database: MS SQL 2005
Table:
EmployeeNumber | EntryDate | Status
Sample Data:
200 | 3/1/2009 | P
200 | 3/2/2009 | A
200 | 3/3/2009 | A
201 | 3/1/2009 | A
201 | 3/2/2009 | P
Where P is present, A is absent.
I have tried ROW_NUMBER() OVER (PARTITION BY ...), but it does not generate the sequence I expect.
For the above data the sequence I expect is
1
1
2
1
1
SELECT EmployeeNumber, EntryDate, Status,
ROW_NUMBER() OVER (
PARTITION BY EmployeeNumber, Status
ORDER BY EmployeeNumber,EntryDate ) AS 'RowNumber'
FROM [Attendance]
I'm not sure I follow what you want with the 1 1 2 1 1 sequence, but simply adding an ORDER BY to your original query produces that sequence:
SELECT EmployeeNumber,
EntryDate,
Status,
ROW_NUMBER() OVER (PARTITION BY EmployeeNumber, Status ORDER BY EmployeeNumber, EntryDate) AS 'RowNumber'
FROM Attendance
ORDER BY EmployeeNumber, EntryDate
/*
EmployeeNumber EntryDate Status RowNumber
-------------- ----------------------- ------ --------------------
200 2009-03-01 00:00:00 P 1
200 2009-03-02 00:00:00 A 1
200 2009-03-03 00:00:00 A 2
201 2009-03-01 00:00:00 A 1
201 2009-03-02 00:00:00 P 1
(5 row(s) affected)
*/
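As a rough illustration of what ROW_NUMBER() with that PARTITION BY is doing, here is a small Python sketch that keeps one counter per (EmployeeNumber, Status) pair and walks the rows in EmployeeNumber, EntryDate order. The function name `row_numbers` is hypothetical; the data is the sample from the question.

```python
from collections import defaultdict

# (EmployeeNumber, EntryDate, Status) rows from the question.
attendance = [
    (200, "2009-03-01", "P"),
    (200, "2009-03-02", "A"),
    (200, "2009-03-03", "A"),
    (201, "2009-03-01", "A"),
    (201, "2009-03-02", "P"),
]


def row_numbers(rows):
    """Number rows within each (employee, status) partition, ordered by date."""
    counters = defaultdict(int)  # one counter per partition
    numbered = []
    for emp, entry_date, status in sorted(rows, key=lambda r: (r[0], r[1])):
        counters[(emp, status)] += 1
        numbered.append((emp, entry_date, status, counters[(emp, status)]))
    return numbered


sequence = [row[3] for row in row_numbers(attendance)]
```

Note that the counter never resets within a partition, so a second run of absences for the same employee would continue from the earlier count rather than restart at 1.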
You should be able to do this with a CTE in SQL 2005. Stealing Lievens data:
DECLARE @Attendance TABLE (EmployeeNumber INTEGER, EntryDate DATETIME, Status VARCHAR(1))
INSERT INTO @Attendance VALUES (200, '03/01/2009', 'P')
INSERT INTO @Attendance VALUES (200, '03/02/2009', 'A')
INSERT INTO @Attendance VALUES (200, '03/03/2009', 'A')
INSERT INTO @Attendance VALUES (200, '03/04/2009', 'A')
INSERT INTO @Attendance VALUES (200, '04/04/2009', 'A')
INSERT INTO @Attendance VALUES (200, '04/05/2009', 'A')
INSERT INTO @Attendance VALUES (201, '03/01/2009', 'A')
INSERT INTO @Attendance VALUES (201, '03/02/2009', 'A')
INSERT INTO @Attendance VALUES (201, '03/03/2009', 'P');
Then use this CTE to extract the sequence:
WITH Dates
(
EntryDate,
EmployeeNumber,
Status,
Days
)
AS
(
SELECT
a.EntryDate,
a.EmployeeNumber,
a.Status,
1
FROM
@Attendance a
WHERE
a.EntryDate = (SELECT MIN(EntryDate) FROM @Attendance)
-- RECURSIVE
UNION ALL
SELECT
a.EntryDate,
a.EmployeeNumber,
a.Status,
CASE WHEN (a.Status = Parent.Status) THEN Parent.Days + 1 ELSE 1 END
FROM
@Attendance a
INNER JOIN
Dates parent
ON
datediff(day, a.EntryDate, DateAdd(day, 1, parent.EntryDate)) = 0
AND
a.EmployeeNumber = parent.EmployeeNumber
)
SELECT * FROM Dates order by EmployeeNumber, EntryDate
Although, as a final note, the sequence does seem strange to me; depending on your requirements, there may be a better way of aggregating the data. Nevertheless, this will produce the sequence you require.
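For comparison, here is a minimal Python sketch of the counting rule in the CASE expression: the count grows while the same employee has the same status on the next calendar day, and otherwise goes back to 1. One assumption differs from the CTE: the recursive query only follows rows reachable day by day from the earliest date, whereas this sketch also numbers rows after a gap in dates, restarting at 1. The function name `consecutive_days` is hypothetical.

```python
from datetime import date, timedelta

# (EmployeeNumber, EntryDate, Status) rows from the sample data.
attendance = [
    (200, date(2009, 3, 1), "P"),
    (200, date(2009, 3, 2), "A"),
    (200, date(2009, 3, 3), "A"),
    (200, date(2009, 3, 4), "A"),
    (200, date(2009, 4, 4), "A"),
    (200, date(2009, 4, 5), "A"),
    (201, date(2009, 3, 1), "A"),
    (201, date(2009, 3, 2), "A"),
    (201, date(2009, 3, 3), "P"),
]


def consecutive_days(rows):
    """Count how many consecutive days each row extends the same status."""
    numbered = []
    previous = None
    streak = 0
    for emp, day, status in sorted(rows):
        continues = (
            previous is not None
            and previous[0] == emp
            and previous[2] == status
            and day - previous[1] == timedelta(days=1)
        )
        # Assumption: a gap in dates restarts the count at 1.
        streak = streak + 1 if continues else 1
        numbered.append((emp, day, status, streak))
        previous = (emp, day, status)
    return numbered


days = [row[3] for row in consecutive_days(attendance)]
```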
Does this help you?
It doesn't produce the sequence you asked for (no idea how to do that), but it does give you the number of consecutive days someone has been absent.
DECLARE @Attendance TABLE (EmployeeNumber INTEGER, EntryDate DATETIME, Status VARCHAR(1))
INSERT INTO @Attendance VALUES (200, '03/01/2009', 'P')
INSERT INTO @Attendance VALUES (200, '03/02/2009', 'A')
INSERT INTO @Attendance VALUES (200, '03/03/2009', 'A')
INSERT INTO @Attendance VALUES (200, '03/04/2009', 'A')
INSERT INTO @Attendance VALUES (200, '04/04/2009', 'A')
INSERT INTO @Attendance VALUES (200, '04/05/2009', 'A')
INSERT INTO @Attendance VALUES (201, '03/01/2009', 'A')
INSERT INTO @Attendance VALUES (201, '03/02/2009', 'A')
INSERT INTO @Attendance VALUES (201, '03/03/2009', 'P')
SELECT a1.EmployeeNumber, [Absent] = COUNT(*) + 1
FROM @Attendance a1
INNER JOIN @Attendance a2 ON a1.EntryDate = a2.EntryDate - 1
AND a1.EmployeeNumber = a2.EmployeeNumber
AND a1.Status = a2.Status
GROUP BY a1.EmployeeNumber
You could use recursion, similar to what I have done here. It seems though that your problem is a little simpler, and since SQL Server limits recursion to 100 levels by default (adjustable with OPTION (MAXRECURSION n)), this might not work for people who are absent a lot. Let me think about this for a few minutes.
If you have a row for every single day, go with Lieven's join.