I have a table like this:
Id | From | To
---+------------+------------
1 | 2018-01-28 | 2018-02-01
2 | 2018-02-10 | 2018-02-12
3 | 2018-02-27 | 2018-03-01
How can I get all dates between the From and To dates, like this?
FromDate
----------
2018-01-28
2018-01-29
2018-01-30
2018-01-31
2018-02-01
2018-02-10
2018-02-11
2018-02-12
2018-02-27
2018-02-28
2018-03-01
Generate a calendar table containing all dates within, e.g. 2018, and then inner join that table to your current table:
DECLARE @todate datetime, @fromdate datetime
SELECT @fromdate='2018-01-01', @todate='2018-12-31'
;WITH calendar (FromDate) AS (
SELECT @fromdate AS FromDate
UNION ALL
SELECT DATEADD(day, 1, FromDate)
FROM calendar
WHERE FromDate < @todate
)
SELECT t1.FromDate
FROM calendar t1
INNER JOIN yourTable t2
ON t1.FromDate BETWEEN t2.[From] AND t2.[To]
OPTION (MAXRECURSION 366); -- the default limit of 100 is too low for a full year
If you don't have that many dates, a recursive CTE is a pretty easy approach:
with cte as (
select id, fromdate as dte, fromdate, todate
from t
union all
select id, dateadd(day, 1, dte), fromdate, todate
from cte
where dte < todate
)
select id, dte
from cte;
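In SQL Server a recursive CTE has a default maximum recursion depth of 100. That means the query works for spans of roughly 100 dates (for each id). You can raise the limit with the MAXRECURSION query hint, for example OPTION (MAXRECURSION 0) to remove the limit entirely.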
It is usually slightly more efficient to do this with some sort of numbers table. However, recursive CTEs are surprisingly efficient for this sort of calculation. And this is a good way to start learning about them.
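The numbers-table variant mentioned above could look roughly like the following sketch. It is written in Python against an in-memory SQLite database only so that it can be run anywhere; the numbers table, its 0..365 range, and the fromdate/todate column names are assumptions, and SQLite's date() stands in for DATEADD.

```python
import sqlite3

def expand_ranges(rows):
    """Return every (id, date) covered by the inclusive ranges in rows."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE t (id INTEGER, fromdate TEXT, todate TEXT)")
    con.executemany("INSERT INTO t VALUES (?, ?, ?)", rows)
    # Hypothetical numbers table holding 0..365; size it to the longest span.
    con.execute("CREATE TABLE numbers (n INTEGER PRIMARY KEY)")
    con.executemany("INSERT INTO numbers VALUES (?)",
                    [(n,) for n in range(366)])
    sql = """
        SELECT t.id, date(t.fromdate, '+' || numbers.n || ' day') AS dte
        FROM t
        JOIN numbers
          ON date(t.fromdate, '+' || numbers.n || ' day') <= t.todate
        ORDER BY t.id, dte
    """
    result = con.execute(sql).fetchall()
    con.close()
    return result

sample = [
    (1, "2018-01-28", "2018-02-01"),
    (2, "2018-02-10", "2018-02-12"),
    (3, "2018-02-27", "2018-03-01"),
]
for row in expand_ranges(sample):
    print(row)
```

Each range is joined to the offsets 0, 1, 2, ... and an offset is kept while the shifted date does not pass todate, so no recursion limit is involved.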
I have a requirement where I will have to split overlapping records on a given table with 2 date fields.
Consider this to be my input table TableT.
ID  | EFFECTIVE_DATE | END_DATE
----+----------------+-----------
JKL | 2016-01-01     | 2016-12-31
JKL | 2016-04-01     | 2016-12-31
JKL | 2016-01-01     | 2016-03-04
JKL | 2016-04-01     | 2016-12-31
JKL | 2016-01-01     | 2016-12-31
I would want my output to look like below. I need to achieve this in both SQL Server and Oracle\DB2 so I am looking for a generic solution.
ID  | EFFECTIVE_DATE | END_DATE
----+----------------+-----------
JKL | 2016-01-01     | 2016-03-04
JKL | 2016-03-05     | 2016-03-31
JKL | 2016-04-01     | 2016-12-31
This is what I have tried:
With EndDates as (
select END_DATE as END_DATE,TRIM(ID) as ID FROM TableT
union all
select ADD_DAYS(EFFECTIVE_DATE, -1) as END_DATE,TRIM(ID) as ID FROM TableT
), Periods as (
select ID as ID,MIN(EFFECTIVE_DATE) as EFFECTIVE_DATE,
(select MIN(END_DATE) from EndDates e
where e.ID = t.ID and
e.END_DATE >= MIN(EFFECTIVE_DATE)) as END_DATE
from
TableT t
group by ID),
EXTN_PERIOD as (select p.ID as ID, ADD_DAYS(p.END_DATE, 1) as EFFECTIVE_DATE,e.END_DATE as END_DATE
from
Periods p
inner join
EndDates e
on
p.ID = e.ID and
p.END_DATE < e.END_DATE
where
not exists (select * from EndDates e2 where
e2.ID = p.ID and
e2.END_DATE > p.END_DATE and
e2.END_DATE < e.END_DATE)
)
select * from EXTN_PERIOD
union
select * from PERIODS
It partially works but does not give me the desired output.
This is the output I get when I run the above query:
ID  | EFFECTIVE_DATE | END_DATE
----+----------------+-----------
JKL | 2016-01-01     | 2016-03-04
JKL | 2016-03-05     | 2016-03-31
Thanks in advance!
WITH
/*
MY_TAB (ID, EFFECTIVE_DATE, END_DATE) AS
(
VALUES
('JKL', DATE('2016-01-01'), DATE('2016-12-31'))
, ('JKL', DATE('2016-04-01'), DATE('2016-12-31'))
, ('JKL', DATE('2016-01-01'), DATE('2016-03-04'))
, ('JKL', DATE('2016-04-01'), DATE('2016-12-31'))
, ('JKL', DATE('2016-01-01'), DATE('2016-12-31'))
)
,
*/
A AS
(
SELECT DISTINCT T.ID, DECODE(V.I, 1, T.EFFECTIVE_DATE, 2, T.END_DATE + 1) DT
FROM MY_TAB T, (VALUES 1, 2) V(I)
)
, INTL AS
(
SELECT
ID
, LAG(DT) OVER (PARTITION BY ID ORDER BY DT) AS EFF_DT
, DT AS END_DT
FROM A
)
SELECT ID, EFF_DT, END_DT - 1 AS END_DT
FROM INTL
WHERE EFF_DT IS NOT NULL
ORDER BY 1, 2;
This is almost universal. The only part that needs customizing per database is how the two-row "virtual" table with the correlation name V (holding the INTEGERS 1 and 2) is generated.
The idea is to convert your data first to [inclusive, exclusive) form to simplify further calculations. Then we merge all effective and end dates and construct intervals using the OLAP LAG function. Finally we revert to your [inclusive, inclusive] form.
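The same idea can be traced outside the database. The sketch below is plain Python rather than SQL and is only meant to make the three steps visible; the function name split_intervals is made up for illustration.

```python
from datetime import date, timedelta

def split_intervals(rows):
    """rows: (id, effective_date, end_date) with inclusive end dates."""
    one_day = timedelta(days=1)
    boundaries = {}
    for key, start, end in rows:
        # Step 1: [inclusive, exclusive) form, so the end becomes end + 1.
        # A set removes duplicates, like SELECT DISTINCT.
        boundaries.setdefault(key, set()).update({start, end + one_day})
    result = []
    for key in sorted(boundaries):
        points = sorted(boundaries[key])
        # Step 2: pair each boundary with the next one (the LAG/LEAD step).
        for lo, hi in zip(points, points[1:]):
            # Step 3: back to [inclusive, inclusive].
            result.append((key, lo, hi - one_day))
    return result

rows = [
    ("JKL", date(2016, 1, 1), date(2016, 12, 31)),
    ("JKL", date(2016, 4, 1), date(2016, 12, 31)),
    ("JKL", date(2016, 1, 1), date(2016, 3, 4)),
    ("JKL", date(2016, 4, 1), date(2016, 12, 31)),
    ("JKL", date(2016, 1, 1), date(2016, 12, 31)),
]
for row in split_intervals(rows):
    print(row)
```

Note that consecutive boundaries are paired whether or not an input row covers the stretch between them, so data with real gaps would need an extra coverage check.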
In Oracle you could do something like this:
with
tablet (id, effective_date, end_date) as (
select 'JKL', date '2016-01-01', date '2016-12-31' from dual union all
select 'JKL', date '2016-04-01', date '2016-12-31' from dual union all
select 'JKL', date '2016-01-01', date '2016-03-04' from dual union all
select 'JKL', date '2016-04-01', date '2016-12-31' from dual union all
select 'JKL', date '2016-01-01', date '2016-12-31' from dual
)
, prep (id, dt) as (
select distinct id, case col when 'EFF' then val else val + 1 end
from tablet
unpivot (val for col in (effective_date as 'EFF', end_date as 'END'))
)
, almost_done (id, effective_date, end_date) as (
select id, dt, lead(dt) over (partition by id order by dt) - 1
from prep
)
select id, effective_date, end_date
from almost_done
where end_date is not null
;
ID EFFECTIVE_DATE END_DATE
--- -------------- ----------
JKL 2016-01-01 2016-03-04
JKL 2016-03-05 2016-03-31
JKL 2016-04-01 2016-12-31
Notice the first CTE (tablet), which only generates test data; you don't need it in your real-life case. The first step is to unpivot the data. I don't know how SQL Server supports unpivoting; in the worst case you can do it manually with a cross join (NOT with UNION ALL, which is inefficient). Then you remove duplicates, and the rest is easy with the LEAD analytic function, which SQL Server should support too.
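For databases without UNPIVOT, the manual cross-join unpivot could be sketched as follows. This runs the SQL through Python's sqlite3 module only so that it is executable as-is; the two-row derived table v and SQLite's date() are stand-ins, and the exact syntax on SQL Server or DB2 will differ.

```python
import sqlite3

def boundary_dates(rows):
    """Unpivot (effective_date, end_date + 1) into one column via a cross join."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE tablet (id TEXT, effective_date TEXT, end_date TEXT)")
    con.executemany("INSERT INTO tablet VALUES (?, ?, ?)", rows)
    sql = """
        SELECT DISTINCT t.id,
               CASE v.i WHEN 1 THEN t.effective_date
                        ELSE date(t.end_date, '+1 day') END AS dt
        FROM tablet t
        CROSS JOIN (SELECT 1 AS i UNION ALL SELECT 2) v
        ORDER BY 1, 2
    """
    result = con.execute(sql).fetchall()
    con.close()
    return result

rows = [
    ("JKL", "2016-01-01", "2016-12-31"),
    ("JKL", "2016-04-01", "2016-12-31"),
    ("JKL", "2016-01-01", "2016-03-04"),
    ("JKL", "2016-04-01", "2016-12-31"),
    ("JKL", "2016-01-01", "2016-12-31"),
]
for row in boundary_dates(rows):
    print(row)
```

The table is read once and each row is emitted twice, once per value of v.i, which is what the prep step in the query above produces.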
Notice the 2017-04-01, 2018-02-01, 2018-07-01, and 2019-01-01 months are missing in the output. I want to show only those months which are missing. Does anyone know how to go about this?
Query:
SELECT TO_DATE("Month", 'mon''yy') as dates FROM sample_sheet
group by dates
order by dates asc;
Output:
2017-01-01
2017-02-01
2017-03-01
2017-05-01
2017-06-01
2017-07-01
2017-08-01
2017-09-01
2017-10-01
2017-11-01
2017-12-01
2018-01-01
2018-03-01
2018-04-01
2018-05-01
2018-06-01
2018-08-01
2018-09-01
2018-10-01
2018-11-01
2018-12-01
2019-02-01
2019-03-01
2019-04-01
I don't know Vertica, so I wrote a working proof of concept in Microsoft SQL Server and tried to convert it to Vertica syntax based on the online documentation.
It should look like this:
with
months as (
select 2017 as date_year, 1 as date_month, to_date('2017-01-01', 'YYYY-MM-DD') as first_date, to_date('2017-01-31', 'yyyy-mm-dd') as last_date
union all
select
year(add_months(first_date, 1)) as date_year,
month(add_months(first_date, 1)) as date_month,
add_months(first_date, 1) as first_date,
last_day(add_months(first_date, 1)) as last_date
from months
where first_date < current_date
),
sample_dates (a_date) as (
select to_date('2017-01-15', 'YYYY-MM-DD') union all
select to_date('2017-01-22', 'YYYY-MM-DD') union all
select to_date('2017-02-01', 'YYYY-MM-DD') union all
select to_date('2017-04-15', 'YYYY-MM-DD') union all
select to_date('2017-06-15', 'YYYY-MM-DD')
)
select *
from sample_dates right join months on sample_dates.a_date between first_date and last_date
where sample_dates.a_date is null
months is a recursive CTE that holds all months since 2017-01, with the first and last day of each month. sample_dates is just a list of dates to test the logic; you should replace it with your own table.
Once you have built that monthly calendar, all you need to do is check your dates against it with an outer join to see which months have no date between their first_date and last_date.
You can build a TIMESERIES of all dates between the first and the last input date (TIMESERIES only accepts day-to-second intervals, so the coarsest step available is one day) and keep only the first day of each month. Then left join that sequence of month firsts with your input; the rows where the join fails, that is, where the input side is NULL, are the missing months:
WITH
-- your input
input(mth1st) AS (
SELECT DATE '2017-01-01'
UNION ALL SELECT DATE '2017-02-01'
UNION ALL SELECT DATE '2017-03-01'
UNION ALL SELECT DATE '2017-05-01'
UNION ALL SELECT DATE '2017-06-01'
UNION ALL SELECT DATE '2017-07-01'
UNION ALL SELECT DATE '2017-08-01'
UNION ALL SELECT DATE '2017-09-01'
UNION ALL SELECT DATE '2017-10-01'
UNION ALL SELECT DATE '2017-11-01'
UNION ALL SELECT DATE '2017-12-01'
UNION ALL SELECT DATE '2018-01-01'
UNION ALL SELECT DATE '2018-03-01'
UNION ALL SELECT DATE '2018-04-01'
UNION ALL SELECT DATE '2018-05-01'
UNION ALL SELECT DATE '2018-06-01'
UNION ALL SELECT DATE '2018-08-01'
UNION ALL SELECT DATE '2018-09-01'
UNION ALL SELECT DATE '2018-10-01'
UNION ALL SELECT DATE '2018-11-01'
UNION ALL SELECT DATE '2018-12-01'
UNION ALL SELECT DATE '2019-02-01'
UNION ALL SELECT DATE '2019-03-01'
UNION ALL SELECT DATE '2019-04-01'
)
,
-- need a series of month's firsts
-- TIMESERIES works for INTERVAL DAY TO SECOND
-- so build that timeseries, and filter out
-- the month's firsts
limits(mth1st) AS (
SELECT MIN(mth1st) FROM input
UNION ALL SELECT MAX(mth1st) FROM input
)
,
alldates AS (
SELECT dt::DATE FROM limits
TIMESERIES dt AS '1 day' OVER(ORDER BY mth1st::TIMESTAMP)
)
,
allfirsts(mth1st) AS (
SELECT dt FROM alldates WHERE DAY(dt)=1
)
SELECT
allfirsts.mth1st
FROM allfirsts
LEFT JOIN input USING(mth1st)
WHERE input.mth1st IS NULL;
-- out mth1st
-- out ------------
-- out 2017-04-01
-- out 2018-02-01
-- out 2018-07-01
-- out 2019-01-01
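Both answers above boil down to "generate every month, then anti-join against the data". A minimal runnable sketch of that pattern, using Python's sqlite3 module instead of Vertica (so the recursive CTE, the date() function and the table name observed are all assumptions made for the sake of the example):

```python
import sqlite3

def missing_months(month_firsts):
    """Return the month-start dates absent between the min and max input."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE observed (mth1st TEXT)")
    con.executemany("INSERT INTO observed VALUES (?)",
                    [(m,) for m in month_firsts])
    sql = """
        WITH RECURSIVE
        bounds(lo, hi) AS (
            SELECT MIN(mth1st), MAX(mth1st) FROM observed
        ),
        months(m) AS (
            SELECT lo FROM bounds
            UNION ALL
            SELECT date(m, '+1 month') FROM months, bounds WHERE m < hi
        )
        SELECT months.m
        FROM months
        LEFT JOIN observed ON observed.mth1st = months.m
        WHERE observed.mth1st IS NULL
        ORDER BY months.m
    """
    result = [row[0] for row in con.execute(sql)]
    con.close()
    return result

print(missing_months(["2018-12-01", "2019-02-01", "2019-03-01"]))
```

The dates are stored as ISO-8601 text, so comparing them as strings gives the same order as comparing them as dates.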
I have a SQL table that contains employeeid, StartDateTime and EndDatetime as follows:
CREATE TABLE Sample
(
SNO INT,
EmployeeID NVARCHAR(10),
StartDateTime DATE,
EndDateTime DATE
)
INSERT INTO Sample
VALUES
( 1, 'xyz', '2018-01-01', '2018-01-02' ),
( 2, 'xyz', '2018-01-03', '2018-01-05' ),
( 3, 'xyz', '2018-01-06', '2018-02-01' ),
( 4, 'xyz', '2018-02-15', '2018-03-15' ),
( 5, 'xyz', '2018-03-16', '2018-03-19' ),
( 6, 'abc', '2018-01-16', '2018-02-25' ),
( 7, 'abc', '2018-03-08', '2018-03-19' ),
( 8, 'abc', '2018-02-26', '2018-03-01' )
I want the result to be displayed as
EmployeeID | StartDateTime | EndDateTime
------------+-----------------+---------------
xyz | 2018-01-01 | 2018-02-01
xyz | 2018-02-15 | 2018-03-19
abc | 2018-01-16 | 2018-03-01
abc | 2018-03-08 | 2018-03-19
Basically, I want to recursively look at the records of each employee, determine the continuity of the start and end dates, and produce a set of continuous date records.
I wrote my query as follows:
SELECT *
FROM dbo.TestTable T1
LEFT JOIN dbo.TestTable t2 ON t2.EmpId = T1.EmpId
WHERE t1.EndDate = DATEADD(DAY, -1, T2.startdate)
to see if I could decipher a pattern from the output. I later realized that with the above approach I would need to join the same table multiple times to get the output I want.
Also, there can be multiple records per employee, so I need direction on an efficient way of getting the desired output.
Any help is greatly appreciated.
This will do it for you. Use a recursive CTE to get all the adjacent rows, then get the highest end date for each start date, then the first start date for each end date.
;with cte as (
select EmployeeID, StartDateTime, EndDateTime
from sample s
union all
select CTE.EmployeeID, CTE.StartDateTime, s.EndDateTime
from sample s
join cte on cte.EmployeeID=s.EmployeeID and s.StartDateTime=dateadd(d,1,CTE.EndDateTime)
)
select EmployeeID, Min(StartDateTime) as StartDateTime, EndDateTime from (
select EmployeeID, StartDateTime, Max(EndDateTime) as EndDateTime from cte
group by EmployeeID, StartDateTime
) q group by EmployeeID, EndDateTime
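To check what the merged output should be, the merge rule can be written as a simple sort-and-sweep. This is a plain Python sketch, not a translation of the query above; the name merge_contiguous is made up, and unlike the query it also merges overlapping ranges, not only ranges that are exactly one day apart.

```python
from datetime import date, timedelta

def merge_contiguous(rows):
    """rows: (employee_id, start, end) with inclusive dates."""
    one_day = timedelta(days=1)
    merged = []
    # Sorting by employee, then start date, puts adjacent ranges side by side.
    for emp, start, end in sorted(rows):
        if merged and merged[-1][0] == emp and start <= merged[-1][2] + one_day:
            prev = merged[-1]
            # Extend the current island instead of starting a new one.
            merged[-1] = (emp, prev[1], max(prev[2], end))
        else:
            merged.append((emp, start, end))
    return merged

rows = [
    ("xyz", date(2018, 1, 1), date(2018, 1, 2)),
    ("xyz", date(2018, 1, 3), date(2018, 1, 5)),
    ("xyz", date(2018, 1, 6), date(2018, 2, 1)),
    ("xyz", date(2018, 2, 15), date(2018, 3, 15)),
    ("xyz", date(2018, 3, 16), date(2018, 3, 19)),
    ("abc", date(2018, 1, 16), date(2018, 2, 25)),
    ("abc", date(2018, 3, 8), date(2018, 3, 19)),
    ("abc", date(2018, 2, 26), date(2018, 3, 1)),
]
for row in merge_contiguous(rows):
    print(row)
```

The rows come back ordered by employee id, so abc is listed before xyz.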
You can use this.
WITH T AS (
SELECT S1.SNO,
S1.EmployeeID,
S1.StartDateTime,
ISNULL(S2.EndDateTime, S1.EndDateTime) EndDateTime,
ROW_NUMBER() OVER(PARTITION BY S1.EmployeeId ORDER BY S1.StartDateTime)
- ROW_NUMBER() OVER(PARTITION BY S1.EmployeeId, CASE WHEN S2.StartDateTime IS NULL THEN 0 ELSE 1 END ORDER BY S1.StartDateTime ) RN,
ROW_NUMBER() OVER(PARTITION BY S1.EmployeeId, ISNULL(S2.EndDateTime, S1.EndDateTime) ORDER BY S1.EmployeeId, S1.StartDateTime) RN_END
FROM Sample S1
LEFT JOIN Sample S2 ON S1.EmployeeID = S2.EmployeeID AND DATEADD(DAY,1,S1.EndDateTime) = S2.StartDateTime
)
SELECT EmployeeID, MIN(StartDateTime) StartDateTime,MAX(EndDateTime) EndDateTime FROM T
WHERE RN_END = 1
GROUP BY EmployeeID, RN
ORDER BY EmployeeID DESC, StartDateTime
Result:
EmployeeID StartDateTime EndDateTime
---------- ------------- -----------
xyz 2018-01-01 2018-02-01
xyz 2018-02-15 2018-03-19
abc 2018-01-16 2018-03-01
abc 2018-03-08 2018-03-19
I have a dataset like this:
ID | IssueDate
194924 | 2013-07-31 00:00:00.000
194924 | 2010-06-15 00:00:00.000
194924 | 2012-07-30 00:00:00.000
194924 | 2012-12-11 00:00:00.000
194924 | 2014-08-04 00:00:00.000
194966 | 2012-06-02 00:00:00.000
194966 | 2011-02-03 00:00:00.000
194966 | 2011-02-01 00:00:00.000
194987 | 2013-04-25 00:00:00.000
194987 | 2010-12-03 00:00:00.000
I want to sort the data by ID and IssueDate first, then subtract the IssueDates of each pair of consecutive rows (to find the time between one row and the next), and then calculate the max, min and average of these times for each unique ID.
If you are on SQL Server 2012 or later (LAG is not available before that), the following might help you.
Schema for your case:
CREATE TABLE #TAB (
ID BIGINT
,IssuDate DATETIME
)
INSERT INTO #TAB
SELECT 194924
,'2013-07-31 00:00:00.000'
UNION ALL
SELECT 194924
,'2010-06-15 00:00:00.000'
UNION ALL
SELECT 194924
,'2012-07-30 00:00:00.000'
UNION ALL
SELECT 194924
,'2012-12-11 00:00:00.000'
UNION ALL
SELECT 194924
,'2014-08-04 00:00:00.000'
UNION ALL
SELECT 194966
,'2012-06-02 00:00:00.000'
UNION ALL
SELECT 194966
,'2011-02-03 00:00:00.000'
UNION ALL
SELECT 194966
,'2011-02-01 00:00:00.000'
UNION ALL
SELECT 194987
,'2013-04-25 00:00:00.000'
UNION ALL
SELECT 194987
,'2010-12-03 00:00:00.000'
Result after sorting and finding the Time difference:
SELECT *, DATEDIFF(DD, ISNULL(LAG(ISSUDATE) OVER(PARTITION BY ID ORDER BY ID,IssuDate ), IssuDate),IssuDate) AS TIME_DIFF_IN_DAYS
FROM #TAB
For aggregation with MIN, MAX and AVG:
SELECT ID, MIN(TIME_DIFF_IN_DAYS) AS MIN_TIME_TAKEN, MAX(TIME_DIFF_IN_DAYS) MAX_TIME_TAKEN, AVG(TIME_DIFF_IN_DAYS) AVG_TIME_TAKEN FROM (
SELECT *, DATEDIFF(DD, ISNULL(LAG(ISSUDATE) OVER(PARTITION BY ID ORDER BY ID,IssuDate ), IssuDate),IssuDate) AS TIME_DIFF_IN_DAYS FROM #TAB
)AS A
WHERE TIME_DIFF_IN_DAYS>0 --Comment this out if you want to include time differences of 0
GROUP BY ID
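To cross-check the numbers the query returns, the same calculation can be done by hand. This is a plain Python sketch with a made-up function name (gap_stats); it works in whole days, like DATEDIFF(DD, ...).

```python
from collections import defaultdict
from datetime import date

def gap_stats(rows):
    """rows: (id, issue_date). Returns {id: (min, max, avg)} of day gaps."""
    by_id = defaultdict(list)
    for key, issued in rows:
        by_id[key].append(issued)
    stats = {}
    for key, dates in by_id.items():
        dates.sort()
        # Difference between each date and the one before it (the LAG step).
        gaps = [(b - a).days for a, b in zip(dates, dates[1:])]
        if gaps:  # ids with a single row have no gap to report
            stats[key] = (min(gaps), max(gaps), sum(gaps) / len(gaps))
    return stats

rows = [
    (194924, date(2013, 7, 31)), (194924, date(2010, 6, 15)),
    (194924, date(2012, 7, 30)), (194924, date(2012, 12, 11)),
    (194924, date(2014, 8, 4)),
    (194966, date(2012, 6, 2)), (194966, date(2011, 2, 3)),
    (194966, date(2011, 2, 1)),
    (194987, date(2013, 4, 25)), (194987, date(2010, 12, 3)),
]
for key, values in sorted(gap_stats(rows).items()):
    print(key, values)
```

One difference to keep in mind: SQL Server's AVG over an integer column returns an integer, so the query truncates averages that this sketch reports as fractions.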
I am not sure about "and c1.id=c.id" in Cte1 because I am not sure about your exact requirement. Nevertheless, you can try something like:
declare @t table(ID int,IssuDate datetime)
insert into @t values
(194924,'2013-07-31 00:00:00.000')
,(194924,'2010-06-15 00:00:00.000')
,(194924,'2012-07-30 00:00:00.000')
,(194924,'2012-12-11 00:00:00.000')
,(194924,'2014-08-04 00:00:00.000')
,(194966,'2012-06-02 00:00:00.000')
,(194966,'2011-02-03 00:00:00.000')
,(194966,'2011-02-01 00:00:00.000')
,(194987,'2013-04-25 00:00:00.000')
,(194987,'2010-12-03 00:00:00.000')
;with CTE as
(select *,ROW_NUMBER()over(order by id,IssuDate)rn
from @t
)
,Cte1 as
(
select *
,(select datediff(second,c.IssuDate,c1.IssuDate) from CTE c1 where c1.rn=c.rn+1 and c1.id=c.id)Time_between
from CTE C
)
select sum(Time_between),min(Time_between),avg(Time_between),max(Time_between) from cte1
group by id
I have two attendance tables. One is employeelist and the other is attendence_info. Employeelist contains Emp_Id and Emp_name; Attendance_info contains Emp_Id and Date, as below:
Emp_ID Date
----------- -----------------------
1 2014-12-11 00:00:00.000
2 2014-12-11 00:00:00.000
4 2014-12-11 00:00:00.000
5 2014-12-11 00:00:00.000
2 2014-12-10 00:00:00.000
4 2014-12-10 00:00:00.000
5 2014-12-10 00:00:00.000
1 2014-12-09 00:00:00.000
2 2014-12-09 00:00:00.000
3 2014-12-09 00:00:00.000
On each date some ids are absent. I want to list every absent id together with its date. Please help me find it with a SQL Server query. My desired output is as below:
absentId Date
3 2014-12-11 00:00:00.000
1 2014-12-10 00:00:00.000
3 2014-12-10 00:00:00.000
4 2014-12-09 00:00:00.000
5 2014-12-09 00:00:00.000
You can do this by generating a list of all employees and dates and then removing the ones where the employee is present. This is basically a cross join and left join:
select el.emp_id, d.date
from (select distinct date from Attendance_info) d cross join
employeelist el left join
Attendance_info ai
on ai.date = d.date and ai.emp_id = el.emp_id
where ai.emp_id is null;
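As a quick way to try this cross join plus left join against the sample data, here is a sketch that runs essentially the same query through Python's sqlite3 module. The column name att_date (instead of Date), the ORDER BY, and the assumption that employeelist holds ids 1 to 5 are mine, not part of the question.

```python
import sqlite3

def absentees(employee_ids, attendance):
    """Return (emp_id, date) for every employee missing on a recorded date."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE employeelist (emp_id INTEGER)")
    con.execute("CREATE TABLE attendance_info (emp_id INTEGER, att_date TEXT)")
    con.executemany("INSERT INTO employeelist VALUES (?)",
                    [(e,) for e in employee_ids])
    con.executemany("INSERT INTO attendance_info VALUES (?, ?)", attendance)
    sql = """
        SELECT el.emp_id, d.att_date
        FROM (SELECT DISTINCT att_date FROM attendance_info) d
        CROSS JOIN employeelist el
        LEFT JOIN attendance_info ai
               ON ai.att_date = d.att_date AND ai.emp_id = el.emp_id
        WHERE ai.emp_id IS NULL
        ORDER BY d.att_date DESC, el.emp_id
    """
    result = con.execute(sql).fetchall()
    con.close()
    return result

attendance = [
    (1, "2014-12-11"), (2, "2014-12-11"), (4, "2014-12-11"), (5, "2014-12-11"),
    (2, "2014-12-10"), (4, "2014-12-10"), (5, "2014-12-10"),
    (1, "2014-12-09"), (2, "2014-12-09"), (3, "2014-12-09"),
]
for row in absentees([1, 2, 3, 4, 5], attendance):
    print(row)
```

Only dates that appear in attendance_info are considered, so a day on which nobody attended would not show up at all; a separate calendar of working days would be needed for that.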
select e.Emp_ID as absentId,a.Date as date
from employeelist e
join attendence_info a
on e.Emp_ID=a.Emp_ID
order by a.Date
DECLARE @StartDate DATETIME = '2014-12-09'
DECLARE @EndDate DATETIME = '2014-12-11'
;WITH MyCte ([Date])
AS
(
SELECT [Date] = @StartDate
UNION ALL
SELECT DATEADD(DAY, 1, [Date]) FROM MyCte WHERE [Date] < @EndDate
)
,EmployeeList ([EmpID], [Date])
AS
(
SELECT 1, '2014-12-11 00:00:00.000' UNION
SELECT 2, '2014-12-11 00:00:00.000' UNION
SELECT 4, '2014-12-11 00:00:00.000' UNION
SELECT 5, '2014-12-11 00:00:00.000' UNION
SELECT 2, '2014-12-10 00:00:00.000' UNION
SELECT 4, '2014-12-10 00:00:00.000' UNION
SELECT 5, '2014-12-10 00:00:00.000' UNION
SELECT 1, '2014-12-09 00:00:00.000' UNION
SELECT 2, '2014-12-09 00:00:00.000' UNION
SELECT 3, '2014-12-09 00:00:00.000'
)
SELECT DISTINCT E.[EmpID], M.[Date]
FROM EmployeeList E
CROSS APPLY (SELECT [Date] FROM MyCTE WHERE [Date] NOT IN (SELECT [Date] FROM EmployeeList WHERE EmpID = E.EmpID)) M