I have this table:
Vacationtbl:
ID Start End
-------------------------
01 04/10/17 04/12/17
01 04/27/17 05/02/17
02 04/13/17 04/15/17
02 04/17/17 04/20/17
03 06/14/17 06/22/17
Employeetbl:
ID Fname Lname
------------------
01 John AAA
02 Jeny BBB
03 Jeby CCC
I would like to count the number of vacation days each employee takes in April.
My query:
SELECT
SUM(DATEDIFF(DAY, Start, End) + 1) AS Days
FROM
Vacationtbl
GROUP BY
ID
01 returns 9 (not correct)
02 returns 7 (correct)
How do I fix the query so that it stops counting at the end of the month? For example, April has 30 days, so for the second row employee 01 should be counted from 04/27/17 through 04/30/17 only; the days through 05/02/17 belong to May.
Thanks
The Tally/Calendar table is the way to go. However, you can use an ad-hoc tally table.
Example
Select Year = Year(D)
,Month = Month(D)
,ID
,Days = count(*)
From Vacationtbl A
Cross Apply (
Select Top (DateDiff(DAY,[Start],[End])+1) D=DateAdd(DAY,-1+Row_Number() Over (Order By (Select Null)),[Start])
From master..spt_values
) B
-- YOUR OPTIONAL WHERE STATEMENT HERE --
Group By ID,Year(D),Month(D)
Order By 1,2,3
Returns
Year Month ID Days
2017 4 01 7
2017 4 02 7
2017 5 01 2
EDIT - To Show All ID even if Zero Days
Select ID
,Year = Year(D)
,Month = Month(D)
,Days = sum(case when D between [Start] and [End] then 1 else 0 end)
From (
Select Top (DateDiff(DAY,'05/01/2017','05/31/2017')+1) D=DateAdd(DAY,-1+Row_Number() Over (Order By (Select Null)),'05/01/2017')
From master..spt_values
) D
Cross Join Vacationtbl B
Group By ID,Year(D),Month(D)
Order By 1,2,3
Returns
ID Year Month Days
1 2017 5 2
2 2017 5 0
dbFiddle if it Helps
EDIT - 2 Corrects for Overlaps (Gaps and Islands)
--Create Some Sample Data
----------------------------------------------------------------------
Declare @Vacationtbl Table ([ID] varchar(50),[Start] date,[End] date)
Insert Into @Vacationtbl Values
(01,'04/10/17','04/12/17')
,(01,'04/27/17','05/02/17')
,(02,'04/13/17','04/15/17')
,(02,'04/17/17','04/20/17')
,(02,'04/16/17','04/17/17') -- << Overlap
,(03,'05/16/17','05/17/17')
-- The Actual Query
----------------------------------------------------------------------
Select ID
,Year = Year(D)
,Month = Month(D)
,Days = sum(case when D between [Start] and [End] then 1 else 0 end)
From (Select Top (DateDiff(DAY,'04/01/2017','04/30/2017')+1) D=DateAdd(DAY,-1+Row_Number() Over (Order By (Select Null)),'04/01/2017') From master..spt_values ) D
Cross Join (
Select ID,[Start] = min(D),[End] = max(D)
From (
-- Grp is the day number minus the per-ID row number, so it stays constant within each run of consecutive dates.
-- (A Dense_Rank() over all IDs' dates only separates runs when another ID happens to fill the gap.)
Select E.*,Grp = DateDiff(DAY,'19000101',D) - Row_Number() over (Partition By ID Order By D)
From (
Select Distinct A.ID,D
From @Vacationtbl A
Cross Apply (Select Top (DateDiff(DAY,A.[Start],A.[End])+1) D=DateAdd(DAY,-1+Row_Number() Over (Order By (Select Null)),A.[Start]) From master..spt_values ) B
) E
) G
Group By ID,Grp
) B
Group By ID,Year(D),Month(D)
Order By 1,2,3
Returns
ID Year Month Days
1 2017 4 7
2 2017 4 8
3 2017 4 0
Without a dates table, you could use
select Id
,sum(case when [end]>'20170430' and [start]<'20170401' then datediff(day,'20170401','20170430')+1
when [end]>'20170430' then datediff(day,[start],'20170430')+1
when [start]<'20170401' then datediff(day,'20170401',[end])+1
else datediff(day,[start],[end])+1
end) as VacationDays
from Vacationtbl
where [start] <= '20170430' and [end] >= '20170401'
group by Id
The CASE expression handles three partial-overlap cases (the ELSE branch covers a vacation that lies entirely within the month):
Start is before the month and End is after it. In this case count the whole month, from the month's start date to its end date.
Start is within the month and End is after it. In this case count from Start to the month's end date.
Start is before the month and End is within it. In this case count from the month's start date to End.
Edit: Based on the OP's comments that the future dates have to be included,
/*This recursive cte generates the month start and end dates within a given time frame,
e.g. all the month start and end dates for 2017.
Change the start and end period as needed*/
with dates (month_start_date,month_end_date) as
(select cast('2017-01-01' as date),cast(eomonth('2017-01-01') as date)
union all
select dateadd(month,1,month_start_date),eomonth(dateadd(month,1,month_start_date)) from dates
where month_start_date < '2017-12-01'
)
--End recursive cte
--Query logic is the same as above
select v.Id
,year(d.month_start_date) as yr,month(d.month_start_date) as mth
,sum(case when v.[end]>d.month_end_date and v.[start]<d.month_start_date then datediff(day,d.month_start_date,d.month_end_date)+1
when v.[end]>d.month_end_date then datediff(day,v.[start],d.month_end_date)+1
when v.[start]<d.month_start_date then datediff(day,d.month_start_date,v.[end])+1
else datediff(day,v.[start],v.[end])+1
end) as VacationDays
from dates d
join Vacationtbl v on v.[start] <= d.month_end_date and v.[end] >= d.month_start_date
group by v.id,year(d.month_start_date),month(d.month_start_date)
Assuming you want only one month and you want to count all days, you can do this with arithmetic. A separate calendar table is not necessary. The advantage is performance.
I think this would be easier if SQL Server supported least() and greatest(), but case will do:
select id,
sum(1 + datediff(day, news, newe)) as vacation_days_april
from Vacationtbl v cross apply
(values (case when [start] < '2017-04-01' then cast('2017-04-01' as date) else [start] end,
case when [end] >= '2017-05-01' then cast('2017-04-30' as date) else [end] end)
) x(news, newe)
where news <= newe
group by id;
You can readily extend this to any month:
with m as (
select cast('2017-04-01' as date) as month_start,
cast('2017-04-30' as date) as month_end
)
select id,
sum(1 + datediff(day, news, newe)) as vacation_days_april
from m cross join
Vacationtbl v cross apply
(values (case when [start] < m.month_start then m.month_start else [start] end,
case when [end] >= m.month_end then m.month_end else [end] end)
) x(news, newe)
where news <= newe
group by id;
You can even use a similar idea to extend to multiple months, with a different row for each user and each month.
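As a rough sketch of that multi-month extension (not the answerer's own code): the month list in the CTE `m`, the table variable `@Vacationtbl`, and the alias `c` are assumptions made for illustration, and `eomonth` needs SQL Server 2012 or later. Each vacation row is joined to every month it overlaps and then clamped to that month's boundaries.

```sql
-- Hypothetical sample data mirroring the question's Vacationtbl.
declare @Vacationtbl table (ID varchar(2), [Start] date, [End] date);
insert into @Vacationtbl values
    ('01', '2017-04-10', '2017-04-12'),
    ('01', '2017-04-27', '2017-05-02'),
    ('02', '2017-04-13', '2017-04-15'),
    ('02', '2017-04-17', '2017-04-20'),
    ('03', '2017-06-14', '2017-06-22');

-- One row per month of interest; this list is an assumption for the sketch.
with m as (
    select month_start, eomonth(month_start) as month_end
    from (values (cast('2017-04-01' as date)),
                 (cast('2017-05-01' as date)),
                 (cast('2017-06-01' as date))) t(month_start)
)
select v.ID,
       m.month_start,
       sum(1 + datediff(day, c.news, c.newe)) as vacation_days
from m
join @Vacationtbl v
  on v.[Start] <= m.month_end
 and v.[End]   >= m.month_start          -- keep only vacation/month pairs that overlap
cross apply
     (values (case when v.[Start] < m.month_start then m.month_start else v.[Start] end,
              case when v.[End]   > m.month_end   then m.month_end   else v.[End]   end)
     ) c(news, newe)                     -- vacation clamped to the month
group by v.ID, m.month_start
order by v.ID, m.month_start;
```

With this sample data the result should be 7 days for 01 in April, 2 for 01 in May, 7 for 02 in April, and 9 for 03 in June. Months in which an employee has no vacation produce no row; an outer join against the employee table would be needed to show zeros.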
You can use a Calendar or dates table for this sort of thing.
For only 152kb in memory, you can have 30 years of dates in a table with this:
/* dates table */
declare @fromdate date = '20000101';
declare @years int = 30;
/* 30 years, 19 used data pages ~152kb in memory, ~264kb on disk */
;with n as (select n from (values(0),(1),(2),(3),(4),(5),(6),(7),(8),(9)) t(n))
select top (datediff(day, @fromdate,dateadd(year,@years,@fromdate)))
[Date]=convert(date,dateadd(day,row_number() over(order by (select 1))-1,@fromdate))
into dbo.Dates
from n as deka cross join n as hecto cross join n as kilo
cross join n as tenK cross join n as hundredK
order by [Date];
create unique clustered index ix_dbo_Dates_date
on dbo.Dates([Date]);
Without taking the actual step of creating a table, you can use it inside a common table expression with just this:
declare @fromdate date = '20170401';
declare @thrudate date = '20170430';
;with n as (select n from (values(0),(1),(2),(3),(4),(5),(6),(7),(8),(9)) t(n))
, dates as (
select top (datediff(day, @fromdate, @thrudate)+1)
[Date]=convert(date,dateadd(day,row_number() over(order by (select 1))-1,@fromdate))
from n as deka cross join n as hecto cross join n as kilo
cross join n as tenK cross join n as hundredK
order by [Date]
)
select [Date]
from dates;
Use either like so:
select
v.Id
, count(*) as VacationDays
from Vacationtbl v
inner join Dates d
on d.Date >= v.[Start]
and d.Date <= v.[End]
where d.Date >= '20170401'
and d.Date <= '20170430'
group by v.Id
rextester demo (table): http://rextester.com/PLW73242
rextester demo (cte): http://rextester.com/BCY62752
returns:
+----+--------------+
| Id | VacationDays |
+----+--------------+
| 01 | 7 |
| 02 | 7 |
+----+--------------+
Number and Calendar table reference:
Generate a set or sequence without loops - 2 - Aaron Bertrand
The "Numbers" or "Tally" Table: What it is and how it replaces a loop - Jeff Moden
Creating a Date Table/Dimension in sql Server 2008 - David Stein
Calendar Tables - Why You Need One - David Stein
Creating a date dimension or calendar table in sql Server - Aaron Bertrand
Try this,
declare @Vacationtbl table(ID int,Startdate date,Enddate date)
insert into @Vacationtbl VALUES
(1 ,'04/10/17','04/12/17')
,(1 ,'04/27/17','05/02/17')
,(2 ,'04/13/17','04/15/17')
,(2 ,'04/17/17','04/20/17')
-- somehow convert your input into first day of month
Declare @firstDayofGivenMonth date='2017-04-01'
Declare @LasttDayofGivenMonth date=dateadd(day,-1,dateadd(month,datediff(month,0,@firstDayofGivenMonth)+1,0))
;with CTE as
(
select *
,case when Startdate<@firstDayofGivenMonth then @firstDayofGivenMonth else Startdate end NewStDT
,case when Enddate>@LasttDayofGivenMonth then @LasttDayofGivenMonth else Enddate end NewEDT
from @Vacationtbl
)
SELECT
SUM(DATEDIFF(DAY, NewStDT, NewEDT) + 1) AS Days
FROM
CTE
GROUP BY
ID
I have a table with the following columns:
userid, datetime, type
Sample data:
userid datetime type
1 2013-08-01 08:10:00 I
1 2013-08-01 08:12:00 I
1 2013-08-01 08:12:56 I
I need to fetch only the two rows other than the row with min(datetime).
My query to fetch the row with min(datetime) is:
SELECT
USERID, MIN(CHECKTIME) as ChkTime, CHECKTYPE, COUNT(*) AS CountRows
FROM
T1
WHERE
MONTH(CONVERT(DATETIME, CHECKTIME)) = MONTH(DATEADD(MONTH, -1,
CONVERT(DATE, GETDATE())))
AND YEAR(CONVERT(DATETIME, CHECKTIME)) = YEAR(GETDATE()) AND USERID=35
AND CHECKTYPE='I'
GROUP BY
CONVERT(DATE, CHECKTIME), USERID, CHECKTYPE
HAVING
COUNT(*) > 1
Any help would be much appreciated. Thanks.
Maybe something like this will help you:
WITH CTE AS
(
SELECT *, ROW_NUMBER() OVER (PARTITION BY userid ORDER BY checktime) RN
FROM dbo.T1
WHERE CHECKTYPE = 'I'
--add your conditions here
)
SELECT * FROM CTE
WHERE RN > 1
Using a CTE and the ROW_NUMBER() function, this selects all rows except the one with min(date) for each user.
SQLFiddle DEMO
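If only the two rows immediately after the earliest one are wanted, as the question asks, the same row-numbering idea can be narrowed. This is a minimal sketch under assumptions: the table variable `@T1` and its sample rows are made up, and partitioning by calendar day is inferred from the question's `GROUP BY CONVERT(DATE, CHECKTIME)` rather than stated outright.

```sql
-- Hypothetical sample data using the column names from the question's query.
DECLARE @T1 TABLE (USERID int, CHECKTIME datetime, CHECKTYPE char(1));
INSERT INTO @T1 VALUES
    (1, '2013-08-01T08:10:00', 'I'),
    (1, '2013-08-01T08:12:00', 'I'),
    (1, '2013-08-01T08:12:56', 'I'),
    (1, '2013-08-01T08:15:00', 'I');

WITH CTE AS
(
    -- Number each user's rows per calendar day, earliest first.
    SELECT *,
           ROW_NUMBER() OVER (PARTITION BY USERID, CAST(CHECKTIME AS date)
                              ORDER BY CHECKTIME) AS RN
    FROM @T1
    WHERE CHECKTYPE = 'I'
)
SELECT USERID, CHECKTIME, CHECKTYPE
FROM CTE
WHERE RN BETWEEN 2 AND 3   -- skip the earliest row (RN = 1), keep the next two
ORDER BY USERID, CHECKTIME;
```

For the sample rows this should return the 08:12:00 and 08:12:56 entries and skip both the 08:10:00 minimum and the fourth row.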
SELECT * FROM YOURTABLE A
INNER JOIN
(SELECT USERID,TYPE,MIN(datetime) datetime FROM YOURTABLE GROUP BY USERID,TYPE )B
ON
A.USERID=B.USERID AND
A.TYPE=B.TYPE
WHERE A.DATETIME<>B.DATETIME
Imagine we have a table:
SELECT SUM(A) AS TOTALS,DATE,STUFF FROM TABLE WHERE DATE BETWEEN 'DATESTART' AND 'DATEEND'
GROUP BY DATE,STUFF
Normally this gets the totals as:
totals stuff date
23 x 01.01.1900
3 x 02.01.1900
44 x 06.01.1900
But what if there is data from before the start date, and I want to add that initial total to my start date value? For example, from the beginning of time up to the start date I already have a sum for x of, say, 100,
so I want my table to start from 123 and keep accumulating from there:
123
126
126+44 and so on...
totals stuff date
123 x 01.01.1900
126 x 02.01.1900
170 x 06.01.1900
How can I achieve that?
Source data:
WITH Stocks
AS (
SELECT
Dep.Dept_No ,
SUM(DSL.Metre) AS Metre ,
CONVERT(VARCHAR(10), Date, 112) AS Date
FROM
DS (NOLOCK) DSL
JOIN TBL_Depts (NOLOCK) Dep ON Dep.Dept_No = DSL.Dept
WHERE
1 = 1 AND
DSL.Sil = 0 AND
DSL.Depo IN ( 5000, 5001, 5002, 5003, 5004, 5014, 5018, 5021, 5101, 5109, 5303 ) AND
Dep.Dept_No NOT IN ( 6002 ) AND
Dep.Dept_No IN ( 6000, 6001, 6003, 6004, 6005, 6011, 6024, 6030 ) AND
DSL.Date BETWEEN '2013-06-19' AND '2013-06-20'
GROUP BY
Dep.Dept_No ,
CONVERT(VARCHAR(10), Date, 112)
)
SELECT
Stocks.Metre ,
Dep.Dept AS Dept ,
Stocks.Date
FROM
Stocks
LEFT JOIN TBL_Depts (NOLOCK) Dep ON Stocks.Dept = Dep.Dept
ORDER BY
Stocks.Metre DESC
Any RDBMS with window and analytic functions (SQL Server 2012, PostgreSQL but not MySQL)
SELECT
SumA + SUM(SumARange) OVER (ORDER BY aDate ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS TOTALS,
other, aDate
FROM
(
SELECT
SUM(a) AS SumARange,
other, aDate
FROM
SomeTable
WHERE
aDate BETWEEN '20130101' AND '20130106'
GROUP BY
other, aDate
) X
CROSS JOIN
(
SELECT
SUM(a) AS SumA
FROM
SomeTable
WHERE
aDate < '20130101'
) Y
ORDER BY
aDate;
or
SELECT
SUM(SumA) OVER () + SUM(SumARange) OVER (ORDER BY aDate ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS TOTALS,
other, aDate
FROM
(
SELECT
SUM(CASE WHEN aDate < '20130101' THEN a ELSE 0 END) AS SumA,
SUM(CASE WHEN aDate BETWEEN '20130101' AND '20130106' THEN a ELSE 0 END) AS SumARange,
other, aDate
FROM
SomeTable
WHERE
aDate <= '20130106'
GROUP BY
other, aDate
) X
ORDER BY
aDate;
SQLFiddle example and another
Use the option with the APPLY operator to calculate the totals. You also need to add an additional CASE expression in the GROUP BY clause.
;WITH cte AS
(
SELECT SUM(a) AS sumA, [stuff], MAX([Date]) AS [Date]
FROM SomeTable
WHERE [Date] <= '20130106'
GROUP BY [stuff], CASE WHEN [Date] <= '20130101' THEN 1 ELSE [Date] END
)
SELECT o.total, [stuff], [Date]
FROM cte c CROSS APPLY (
SELECT SUM(c2.sumA) AS total
FROM cte c2
WHERE c.[Date] >= c2.[Date]
) o
See example on SQLFiddle
I have a table of items that, for sake of simplicity, contains the ItemID, the StartDate, and the EndDate for a list of items.
ItemID StartDate EndDate
1 1/1/2011 1/15/2011
2 1/2/2011 1/14/2011
3 1/5/2011 1/17/2011
...
My goal is to be able to join this table to a table with a sequential list of dates,
and say both how many items are open on a particular date, and also how many items are cumulatively open.
Date ItemsOpened CumulativeItemsOpen
1/1/2011 1 1
1/2/2011 1 2
...
I can see how this would be done with a WHILE loop,
but that has performance implications. I'm wondering how
this could be done with a set-based approach?
SELECT COUNT(CASE WHEN d.CheckDate = i.StartDate THEN 1 ELSE NULL END)
AS ItemsOpened
, COUNT(i.StartDate)
AS ItemsOpenedCumulative
FROM Dates AS d
LEFT JOIN Items AS i
ON d.CheckDate BETWEEN i.StartDate AND i.EndDate
GROUP BY d.CheckDate
This may give you what you want
SELECT DATE,
SUM(ItemOpened) AS ItemsOpened,
COUNT(StartDate) AS ItemsOpenedCumulative
FROM
(
SELECT d.Date, i.startdate, i.enddate,
CASE WHEN i.StartDate = d.Date THEN 1 ELSE 0 END AS ItemOpened
FROM Dates d
LEFT OUTER JOIN Items i ON d.Date BETWEEN i.StartDate AND i.EndDate
) AS x
GROUP BY DATE
ORDER BY DATE
This assumes that your date values are DATE data type. Or, the dates are DATETIME with no time values.
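If the columns are DATETIME values that do carry a time part, one possible workaround is to compare on the date portion only. The sketch below assumes SQL Server 2008 or later (for the `date` type), and the table variables `@Dates` and `@Items` are hypothetical stand-ins for the real tables.

```sql
-- Hypothetical stand-ins for the Dates and Items tables; Items carries time parts.
DECLARE @Dates TABLE ([Date] date);
DECLARE @Items TABLE (ItemID int, StartDate datetime, EndDate datetime);
INSERT INTO @Dates VALUES ('20110101'), ('20110102'), ('20110103');
INSERT INTO @Items VALUES
    (1, '2011-01-01T09:30:00', '2011-01-15T17:00:00'),
    (2, '2011-01-02T14:45:00', '2011-01-14T08:00:00');

SELECT d.[Date],
       -- An item counts as opened on the calendar day of its StartDate.
       SUM(CASE WHEN CAST(i.StartDate AS date) = d.[Date] THEN 1 ELSE 0 END) AS ItemsOpened,
       COUNT(i.ItemID) AS ItemsOpenedCumulative
FROM @Dates d
LEFT OUTER JOIN @Items i
       ON d.[Date] BETWEEN CAST(i.StartDate AS date) AND CAST(i.EndDate AS date)
GROUP BY d.[Date]
ORDER BY d.[Date];
```

Without the casts, an item starting at 09:30 would not match a date row that means midnight, so it would be missed on its opening day. Casting the columns can hurt index use on large tables, so treat this as an illustration rather than a tuned query.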
You may find this useful. The recursive part can be replaced with a table. To demonstrate that it works I had to populate some sort of date table. As you can see, the actual SQL is short and simple.
DECLARE @i table (itemid INT, startdate DATE, enddate DATE)
INSERT @i VALUES (1,'1/1/2011', '1/15/2011')
INSERT @i VALUES (2,'1/2/2011', '1/14/2011')
INSERT @i VALUES (3,'1/5/2011', '1/17/2011')
DECLARE @from DATE
DECLARE @to DATE
SET @from = '1/1/2011'
SET @to = '1/18/2011'
-- the recursive sql is strictly to make a datelist between @from and @to
;WITH cte(Date)
AS (
SELECT @from DATE
UNION ALL
SELECT DATEADD(day, 1, DATE)
FROM cte ch
WHERE DATE < @to
)
SELECT cte.Date, sum(case when cte.Date=i.startdate then 1 else 0 end) ItemsOpened, count(i.itemid) ItemsOpenedCumulative
FROM cte
left join @i i on cte.Date between i.startdate and i.enddate
GROUP BY cte.Date
OPTION( MAXRECURSION 0)
If you are on SQL Server 2005+, you could use a recursive CTE to obtain running totals, with the additional help of the ranking function ROW_NUMBER(), like this:
WITH grouped AS (
SELECT
d.Date,
ItemsOpened = COUNT(i.ItemID),
rn = ROW_NUMBER() OVER (ORDER BY d.Date)
FROM Dates d
LEFT JOIN Items i ON d.Date BETWEEN i.StartDate AND i.EndDate
WHERE d.Date BETWEEN @FilterStartDate AND @FilterEndDate
GROUP BY d.Date
),
cumulative AS (
SELECT
Date,
ItemsOpened,
ItemsOpenedCumulative = ItemsOpened
FROM grouped
WHERE rn = 1
UNION ALL
SELECT
g.Date,
g.ItemsOpened,
ItemsOpenedCumulative = c.ItemsOpenedCumulative + g.ItemsOpened
FROM grouped g
INNER JOIN cumulative c ON g.Date = DATEADD(day, 1, c.Date)
)
SELECT *
FROM cumulative
Yesterday Thomas helped me a lot by providing exactly the query I wanted. Now I need a variant of it, and I hope someone can help me out.
I want it to output only one row, namely a max value - but it has to build on the algorithm in the following query:
WITH Calendar AS (SELECT CAST(@StartDate AS datetime) AS Date
UNION ALL
SELECT DATEADD(d, 1, Date) AS Expr1
FROM Calendar AS Calendar_1
WHERE (DATEADD(d, 1, Date) < @EndDate))
SELECT C.Date, C2.Country, COALESCE (SUM(R.[Amount of people per day needed]), 0) AS [Allocated testers]
FROM Calendar AS C CROSS JOIN
Country AS C2 LEFT OUTER JOIN
Requests AS R ON C.Date BETWEEN R.[Start date] AND R.[End date] AND R.CountryID = C2.CountryID
WHERE (C2.Country = @Country)
GROUP BY C.Date, C2.Country OPTION (MAXRECURSION 0)
The output from above will be like:
Date Country Allocated testers
06/01/2010 Chile 3
06/02/2010 Chile 4
06/03/2010 Chile 0
06/04/2010 Chile 0
06/05/2010 Chile 19
but what I need right now is
Allocated testers
19
That is, only one column and one row: the max value itself, for the period of dates and the country selected via the (already existing) parameters.
Use ORDER BY and LIMIT:
ORDER BY [Allocated testers] DESC LIMIT 1
EDITED
Since LIMIT does not exist in SQL Server,
use ORDER BY and TOP instead:
select TOP 1 .... ORDER BY [Allocated testers] DESC
WITH Calendar
AS (
SELECT
CAST(@StartDate AS datetime) AS Date
UNION ALL
SELECT
DATEADD(d, 1, Date) AS Expr1
FROM
Calendar AS Calendar_1
WHERE
( DATEADD(d, 1, Date) < @EndDate )
)
SELECT TOP 1 *
FROM
(
SELECT
C.Date
,C2.Country
,COALESCE(SUM(R.[Amount of people per day needed]), 0) AS [Allocated testers]
FROM
Calendar AS C
CROSS JOIN Country AS C2
LEFT OUTER JOIN Requests AS R
ON C.Date BETWEEN R.[Start date] AND R.[End date]
AND R.CountryID = C2.CountryID
WHERE
( C2.Country = @Country )
GROUP BY
C.Date
,C2.Country
) lst
ORDER BY lst.[Allocated testers] DESC
-- OPTION must come at the end of the whole statement, not inside the derived table
OPTION
( MAXRECURSION 0 )
Full example following the discussion in @Salil's answer:
WITH Calendar AS (SELECT CAST(@StartDate AS datetime) AS Date
UNION ALL
SELECT DATEADD(d, 1, Date) AS Expr1
FROM Calendar AS Calendar_1
WHERE (DATEADD(d, 1, Date) < @EndDate))
SELECT TOP 1 C.Date, C2.Country, COALESCE (SUM(R.[Amount of people per day needed]), 0) AS [Allocated testers]
FROM Calendar AS C CROSS JOIN
Country AS C2 LEFT OUTER JOIN
Requests AS R ON C.Date BETWEEN R.[Start date] AND R.[End date] AND R.CountryID = C2.CountryID
WHERE (C2.Country = @Country)
GROUP BY C.Date, C2.Country
ORDER BY 3 DESC
OPTION (MAXRECURSION 0)
The ORDER BY 3 means order by the 3rd column in the SELECT list, so if you remove the first two columns, change it accordingly.
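Another way to get the single value the question asks for is to wrap the per-day totals in a second CTE and take MAX over it, which avoids positional ordering altogether. This is only a sketch: the table variables `@Country_tbl` and `@Requests`, their rows, and the CTE name `Daily` are hypothetical stand-ins for the real schema.

```sql
-- Hypothetical stand-ins for the question's Country and Requests tables.
DECLARE @Country_tbl TABLE (CountryID int, Country varchar(50));
DECLARE @Requests TABLE (CountryID int, [Start date] datetime, [End date] datetime,
                         [Amount of people per day needed] int);
INSERT INTO @Country_tbl VALUES (1, 'Chile');
INSERT INTO @Requests VALUES
    (1, '20100601', '20100602', 3),
    (1, '20100602', '20100602', 1),
    (1, '20100605', '20100605', 19);

DECLARE @StartDate datetime = '20100601',
        @EndDate   datetime = '20100606',
        @Country   varchar(50) = 'Chile';

WITH Calendar AS (
    -- One row per day from @StartDate up to, but not including, @EndDate.
    SELECT @StartDate AS Date
    UNION ALL
    SELECT DATEADD(d, 1, Date) FROM Calendar WHERE DATEADD(d, 1, Date) < @EndDate
),
Daily AS (
    -- Same per-day aggregation as the original query.
    SELECT C.Date,
           COALESCE(SUM(R.[Amount of people per day needed]), 0) AS [Allocated testers]
    FROM Calendar AS C
    CROSS JOIN @Country_tbl AS C2
    LEFT OUTER JOIN @Requests AS R
        ON C.Date BETWEEN R.[Start date] AND R.[End date]
       AND R.CountryID = C2.CountryID
    WHERE C2.Country = @Country
    GROUP BY C.Date
)
SELECT MAX([Allocated testers]) AS [Allocated testers]
FROM Daily
OPTION (MAXRECURSION 0);   -- the hint belongs at the end of the whole statement
```

With these sample rows the per-day totals are 3, 4, 0, 0 and 19, so the query should return a single row containing 19.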