I have data in an Ingres table that looks something like this:
REF FROM_DATE TO_DATE
A 01.04.1997 01.04.1998
A 01.04.1998 27.05.1998
A 27.05.1998 01.04.1999
B 01.04.1997 01.04.1998
B 01.04.1998 26.07.1998
B 01.04.2012 01.04.2013
Some refs have continuous periods from the min(from_date) to the max(to_date), but some have gaps in the period.
I would like to know a way in Ingres SQL of identifying which refs have gaps in the date periods.
I am doing this as a Unix shell script calling the Ingres sql command.
Please advise.
I am not familiar with the date functions in Ingres. Let me assume that the - operator returns the difference between two dates in days.
If there are no overlaps in the data, then you can do what you want pretty easily. If there are no gaps, then the difference between the minimum and maximum date is the same as the sum of the differences on each line. If the overall difference exceeds that sum (total_gaps is greater than 0), then there are gaps.
So:
select ref,
((max(to_date) - min(from_date)) -
sum(to_date - from_date)
) as total_gaps
from t
group by ref;
I believe this will work in your case. In other cases, there might be an "off-by-1" problem, depending on whether or not the end date is included in the period.
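To check the arithmetic outside the database, here is a minimal Python sketch of the same idea, assuming the sample dates are in dd.mm.yyyy format and that the periods of a ref never overlap. The function name total_gaps is only illustrative.

```python
from datetime import date

# Sample rows from the question, read as dd.mm.yyyy: (ref, from_date, to_date).
rows = [
    ("A", date(1997, 4, 1), date(1998, 4, 1)),
    ("A", date(1998, 4, 1), date(1998, 5, 27)),
    ("A", date(1998, 5, 27), date(1999, 4, 1)),
    ("B", date(1997, 4, 1), date(1998, 4, 1)),
    ("B", date(1998, 4, 1), date(1998, 7, 26)),
    ("B", date(2012, 4, 1), date(2013, 4, 1)),
]


def total_gaps(rows):
    """Return {ref: days of the overall span not covered by any row}.

    Only meaningful when the periods of a ref do not overlap.
    """
    lo, hi, covered = {}, {}, {}
    for ref, start, end in rows:
        lo[ref] = min(lo.get(ref, start), start)
        hi[ref] = max(hi.get(ref, end), end)
        covered[ref] = covered.get(ref, 0) + (end - start).days
    return {ref: (hi[ref] - lo[ref]).days - covered[ref] for ref in lo}


gaps = total_gaps(rows)
refs_with_gaps = sorted(ref for ref, days in gaps.items() if days > 0)
print(refs_with_gaps)
```

A ref appears in refs_with_gaps exactly when its span is longer than the sum of its periods, which is the condition the query tests.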
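This query works in SQL Server. PARTITION BY is part of ANSI SQL, but I don't know whether Ingres supports it. If it does, it probably has an equivalent of DENSE_RANK() as well. Note that Enumerate, used in the CROSS APPLY further down, is not a built-in function; it is assumed to be a user-defined table function that returns a number column, and its definition is not shown here.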
select *
INTO #TEMP
from (
select 'A' as Ref, Cast('1997-01-04' as DateTime) as From_date, Cast('1998-01-04' as DateTime) as to_date
union
select 'A' as Ref, Cast('1998-01-04' as DateTime) as From_date, Cast('1998-05-27' as DateTime) as to_date
union
select 'A' as Ref, Cast('1998-05-27' as DateTime) as From_date, Cast('1999-01-04' as DateTime) as to_date
union
select 'B' as Ref, Cast('1997-01-04' as DateTime) as From_date, Cast('1998-01-04' as DateTime) as to_date
union
select 'B' as Ref, Cast('1998-01-04' as DateTime) as From_date, Cast('1998-07-26' as DateTime) as to_date
union
select 'B' as Ref, Cast('2012-01-04' as DateTime) as From_date, Cast('2013-01-04' as DateTime) as to_date
) X
SELECT *
FROM
(
SELECT Ref, Min(NewStartDate) From_Date, MAX(To_Date) To_Date, COUNT(1) OVER (PARTITION BY Ref ) As [CountRanges]
FROM
(
SELECT Ref, From_Date, To_Date,
NewStartDate = Range_UNTIL_NULL.From_Date + NUMBERS.number,
NewStartDateGroup = DATEADD(d,
1 - DENSE_RANK() OVER (PARTITION BY Ref ORDER BY Range_UNTIL_NULL.From_Date + NUMBERS.number),
Range_UNTIL_NULL.From_Date + NUMBERS.number)
FROM
(
--This subquery is needed to extend To_Date to the next day and to allow it to be NULL
SELECT
REF, From_date, DATEADD(d, 1, ISNULL(To_Date, From_Date)) AS to_date
FROM #Temp T1
WHERE
NOT EXISTS ( SELECT *
FROM #Temp t2
WHERE T1.Ref = T2.Ref and T1.From_Date > T2.From_Date AND T2.To_Date IS NULL
)
) AS Range_UNTIL_NULL
CROSS APPLY Enumerate ( ABS(DATEDIFF(d, From_Date, To_Date))) AS NUMBERS
) X
GROUP BY Ref, NewStartDateGroup
) OVERLAPED_RANGES_WITH_COUNT
-- WHERE OVERLAPED_RANGES_WITH_COUNT.CountRanges >= 2 --This filter is for identifying ranges that have at least one gap
ORDER BY Ref, From_Date
The result for the given example is:
Ref From_Date To_Date CountRanges
---- ----------------------- ----------------------- -----------
A 1997-01-04 00:00:00.000 1999-01-05 00:00:00.000 1
B 1997-01-04 00:00:00.000 1998-07-27 00:00:00.000 2
B 2012-01-04 00:00:00.000 2013-01-05 00:00:00.000 2
As you can see, the refs with "CountRanges" > 1 have at least one gap.
This answer goes well beyond the initial question, because:
Ranges can overlap; it is not clear from the question whether that can happen.
The question only asks which refs have gaps, but with this query you can also see where the gaps are.
This query allows To_Date to be NULL, representing a range that is open-ended towards infinity.
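The core of the approach above, merging overlapping or touching ranges into contiguous "islands" and counting them per ref, can be sketched in Python. This is a simplified illustration rather than a translation of the SQL: it ignores the NULL To_Date case, and the names merge_ranges and islands_per_ref are made up for the example.

```python
from datetime import date


def merge_ranges(ranges):
    """Merge overlapping or touching (start, end) pairs into islands."""
    merged = []
    for start, end in sorted(ranges):
        if merged and start <= merged[-1][1]:
            # Overlaps or touches the previous island: extend it.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def islands_per_ref(rows):
    """Group (ref, start, end) rows by ref and merge each group."""
    by_ref = {}
    for ref, start, end in rows:
        by_ref.setdefault(ref, []).append((start, end))
    return {ref: merge_ranges(ranges) for ref, ranges in by_ref.items()}


# Same test data as the #TEMP table in the answer.
rows = [
    ("A", date(1997, 1, 4), date(1998, 1, 4)),
    ("A", date(1998, 1, 4), date(1998, 5, 27)),
    ("A", date(1998, 5, 27), date(1999, 1, 4)),
    ("B", date(1997, 1, 4), date(1998, 1, 4)),
    ("B", date(1998, 1, 4), date(1998, 7, 26)),
    ("B", date(2012, 1, 4), date(2013, 1, 4)),
]
islands = islands_per_ref(rows)
refs_with_gaps = [ref for ref, found in islands.items() if len(found) > 1]
```

A ref with more than one island has at least one gap, and the gaps themselves lie between consecutive islands.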
I have three tables, each of which has a date column (the date column is an INT field and needs to stay that way). I need a UNION across all three tables so that I get the list of unique dates in ascending order, like this:
20040602
20051215
20060628
20100224
20100228
20100422
20100512
20100615
Then I need to add a column to the result of the query where I subtract one from each date and place it one row above as the end date. Basically I need to generate the end date from the start date somehow and this is what I got so far (not working):
With Query1 As (
Select date_one As StartDate
From table_one
Union
Select date_two As StartDate
From table_two
Union
Select date_three e As StartDate
From table_three
Order By Date Asc
)
Select Query1.StartDate - 1 As EndDate
From Query1
Thanks a lot for your help!
Building on your existing union CTE, we can use lead() in the outer query to get the start_date of the next record and subtract one day from it.
with q as (
select date_one start_date from table_one
union select date_two from table_two
union select date_three from table_three
)
select
start_date,
dateadd(day, -1, lead(start_date) over(order by start_date)) end_date
from q
order by start_date
If the datatype of the original columns is numeric, then you need to do some casting before applying date functions:
with q as (
select cast(cast(date_one as varchar(8)) as date) start_date from table_one
union select cast(cast(date_two as varchar(8)) as date) from table_two
union select cast(cast(date_three as varchar(8)) as date) from table_three
)
select
start_date,
dateadd(day, -1, lead(start_date) over(order by start_date)) end_date
from q
order by start_date
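For comparison, the same "end date is the next start date minus one day" logic can be sketched in Python, assuming the INT columns hold dates as YYYYMMDD. The three lists stand in for the three tables, and the helper names are hypothetical.

```python
from datetime import datetime, timedelta


def int_to_date(n):
    """Convert an integer such as 20040602 to a date."""
    return datetime.strptime(str(n), "%Y%m%d").date()


def with_end_dates(*columns):
    """Union the columns, sort the unique dates, and pair each start date
    with the day before the next start date (None for the last row)."""
    starts = sorted({int_to_date(n) for column in columns for n in column})
    ends = [nxt - timedelta(days=1) for nxt in starts[1:]] + [None]
    return list(zip(starts, ends))


# Stand-ins for table_one, table_two and table_three (note the duplicate).
table_one = [20040602, 20100224, 20100615]
table_two = [20051215, 20100228, 20100224]
table_three = [20060628, 20100422, 20100512]

periods = with_end_dates(table_one, table_two, table_three)
for start, end in periods:
    print(start, end)
```

The last row has no end date, just as lead() returns NULL for the final row in the query.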
I'm trying to find the Snowflake equivalent of generate_series() (the PostgreSQL syntax).
SELECT generate_series(timestamp '2017-11-01', CURRENT_DATE, '1 day')
Just wanted to expand on Marcin Zukowski's comment to say that these gaps started to show up almost immediately after using a date range generated this way in a JOIN.
We ultimately ended up doing this instead!
select
dateadd(
day,
'-' || row_number() over (order by null),
dateadd(day, '+1', current_date())
) as date
from table (generator(rowcount => 90))
I had a similar problem and found an approach, which avoids the issue of a generator requiring a constant value by using a session variable in addition to the already great answers here. This is closest to the requirement of the OP to my mind.
-- set parameter to be used as generator "constant" including the start day
set num_days = (Select datediff(day, TO_DATE('2017-11-01','YYYY-MM-DD'), current_date()+1));
-- use parameter in bcrowell's answer now
select
dateadd(
day,
'-' || row_number() over (order by null),
dateadd(day, '+1', current_date())
) as date
from table (generator(rowcount => ($num_days)));
-- clean up previously set variable
unset num_days;
WITH RECURSIVE rec_cte AS (
-- start date
SELECT '2017-11-01'::DATE as dt
UNION ALL
SELECT DATEADD('day',1,dt) as dt
FROM rec_cte
-- end date (inclusive)
WHERE dt < current_date()
)
SELECT * FROM rec_cte
Adding this answer for completeness, in case you have an initial and a last date:
select -1 + row_number() over(order by 0) i, start_date + i generated_date
from (select '2020-01-01'::date start_date, '2020-01-15'::date end_date)
join table(generator(rowcount => 10000 )) x
qualify i < 1 + end_date - start_date
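When validating any of these generators, it helps to have a reference series to compare against. The following Python sketch builds an inclusive date series and checks that a list of dates is gap-free; date_series and is_gap_free are illustrative names, not part of any Snowflake API.

```python
from datetime import date, timedelta


def date_series(start, end):
    """Return every date from start to end, both inclusive."""
    return [start + timedelta(days=i) for i in range((end - start).days + 1)]


def is_gap_free(dates):
    """True if the sorted dates are exactly one day apart."""
    ordered = sorted(dates)
    return all((b - a).days == 1 for a, b in zip(ordered, ordered[1:]))


series = date_series(date(2020, 1, 1), date(2020, 1, 15))
print(len(series), series[0], series[-1], is_gap_free(series))
```

Running is_gap_free over the output of a generator query is a quick way to spot the kind of missing dates described above.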
I found the generator function in Snowflake quite limiting for all but the simplest use cases. For example, it was not clear how to take a single row specification, explode it into a table of dates and join it back to the original spec table.
Here is an alternative that uses recursive CTEs.
-- A 2 row table that contains "specs" for a date range
create local temp table date_spec as
select 1 as id, '2022-04-01'::date as start_date, current_date() as end_date
union all
select 2, '2022-03-01', '2032-03-30'
;
with explode_date(id, date, next_date, end_date) as (
select
id
, start_date as date -- start_date is the first date
, date + 1 as next_date -- next_date is the date for the subsequent row in the recursive cte
, end_date
from date_spec
union all
select
ds.id
, ed.next_date -- the current_date is the value of next_date from above
, ed.next_date + 1
, ds.end_date
from date_spec ds
join explode_date ed
on ed.id = ds.id
where ed.next_date <= ed.end_date -- stop once end_date has been emitted (testing ed.date would overshoot by one day)
)
select * from explode_date
order by id, date desc
;
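The intended result of exploding each spec row into one row per day, with both start_date and end_date included, can be sketched in Python as follows. The spec values here are shortened, hypothetical examples so the output stays small.

```python
from datetime import date, timedelta


def explode(specs):
    """Yield (id, day) for every day from start_date to end_date inclusive."""
    for spec_id, start, end in specs:
        day = start
        while day <= end:
            yield spec_id, day
            day += timedelta(days=1)


# (id, start_date, end_date); short ranges chosen for illustration only.
specs = [
    (1, date(2022, 4, 1), date(2022, 4, 3)),
    (2, date(2022, 3, 1), date(2022, 3, 2)),
]
rows = list(explode(specs))
for spec_id, day in rows:
    print(spec_id, day)
```

Because each output row carries the spec id, it can be joined back to the original spec table.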
This is how I was able to generate a series of dates in Snowflake. I set the row count to 1095 to get 3 years' worth of dates; you can of course change that to whatever suits your use case.
select
dateadd(day, '-' || seq4(), current_date()) as dte
from
table
(generator(rowcount => 1095))
Originally found here
EDIT: This solution is not correct. seq4 does not guarantee a sequence without gaps. Please follow other answers, not this one. Thanks @Marcin Zukowski for pointing that out.
I am trying to figure out how to write a query that looks at certain records and finds missing date ranges between today and 9999-12-31.
My data looks like below:
ID |start_dt |end_dt |prc_or_disc_1
10412 |2018-07-17 00:00:00.000 |2018-07-20 00:00:00.000 |1050.000000
10413 |2018-07-23 00:00:00.000 |2018-07-26 00:00:00.000 |1040.000000
So for this data I would want my query to return:
2018-07-10 | 2018-07-16
2018-07-21 | 2018-07-22
2018-07-27 | 9999-12-31
I'm not really sure where to start. Is this possible?
You can do that using the lag() function in MS SQL Server (available from SQL Server 2012 onwards).
with myData as
(
select *,
lag(end_dt,1) over (order by start_dt) as lagEnd
from myTable),
myMax as
(
select Max(end_dt) as maxDate from myTable
)
select dateadd(d,1,lagEnd) as StartDate, dateadd(d, -1, start_dt) as EndDate
from myData
where lagEnd is not null and dateadd(d,1,lagEnd) < start_dt
union all
select dateAdd(d,1,maxDate) as StartDate, cast('99991231' as Datetime) as EndDate
from myMax
where maxDate < '99991231';
On SQL Server 2008, where lag() is not available, you can mimic it with row_number() and a self-join.
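The logic behind the lag() query can be sketched in Python, which makes the edge cases easier to test. This is a minimal sketch assuming inclusive start and end dates at day granularity. Unlike the SQL, it also takes a window_start standing in for "today" (2018-07-10 in the question's expected output); find_gaps is a hypothetical helper name.

```python
from datetime import date, timedelta

ONE_DAY = timedelta(days=1)


def find_gaps(ranges, window_start, window_end=date.max):
    """Return inclusive (start, end) gaps within [window_start, window_end].

    ranges: inclusive (start_dt, end_dt) pairs; they may overlap.
    """
    gaps = []
    cursor = window_start  # first day not yet known to be covered
    for start, end in sorted(ranges):
        if start > window_end:
            break
        if start > cursor:
            gaps.append((cursor, start - ONE_DAY))
        if end >= window_end:
            return gaps  # covered through the end of the window
        cursor = max(cursor, end + ONE_DAY)
    gaps.append((cursor, window_end))
    return gaps


ranges = [
    (date(2018, 7, 17), date(2018, 7, 20)),  # ID 10412
    (date(2018, 7, 23), date(2018, 7, 26)),  # ID 10413
]
gaps = find_gaps(ranges, window_start=date(2018, 7, 10))
for start, end in gaps:
    print(start, "|", end)
```

date.max is 9999-12-31, which matches the upper bound used in the question.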
select
CASE WHEN DATEDIFF(day, end_dt, ISNULL(LEAD(start_dt) over (order by ID), '99991231')) > 1 then end_dt +1 END as F1,
CASE WHEN DATEDIFF(day, end_dt, ISNULL(LEAD(start_dt) over (order by ID), '99991231')) > 1 then ISNULL(LEAD(start_dt) over (order by ID) - 1, '99991231') END as F2
from t
Working SQLFiddle example is -> Here
FOR 2008 VERSION
SELECT
X.end_dt + 1 as F1,
ISNULL(Y.start_dt-1, '99991231') as F2
FROM t X
LEFT JOIN (
SELECT
*
, (SELECT MAX(ID) FROM t WHERE ID < A.ID) as ID2
FROM t A) Y ON X.ID = Y.ID2
WHERE DATEDIFF(day, X.end_dt, ISNULL(Y.start_dt, '99991231')) > 1
Working SQLFiddle example is -> Here
This should work in 2008, it assumes that ranges in your table do not overlap. It will also eliminate rows where the end_date of the current row is a day before the start date of the next row.
with dtRanges as (
select start_dt, end_dt, row_number() over (order by start_dt) as rownum
from table1
)
select t2.end_dt + 1, coalesce(start_dt_next -1,'99991231')
FROM
( select dr1.start_dt, dr1.end_dt,dr2.start_dt as start_dt_next
from dtRanges dr1
left join dtRanges dr2 on dr2.rownum = dr1.rownum + 1
) t2
where
t2.end_dt + 1 <> coalesce(start_dt_next,'99991231')
http://sqlfiddle.com/#!18/65238/1
SELECT
*
FROM
(
SELECT
end_dt+1 AS start_dt,
LEAD(start_dt-1, 1, '9999-12-31')
OVER (ORDER BY start_dt)
AS end_dt
FROM
yourTable
)
gaps
WHERE
gaps.end_dt >= gaps.start_dt
I would, however, strongly urge you to use end dates that are "exclusive". That is, the range is everything up to but excluding the end_dt.
That way, a range of one day becomes '2018-07-09', '2018-07-10'.
It's really clear that my range is one day long, if you subtract one from the other you get a day.
Also, if you ever change to needing hour granularity or minute granularity you don't need to change your data. It just works. Always. Reliably. Intuitively.
If you search the web you'll find plenty of documentation on why inclusive-start and exclusive-end is a very good idea from a software perspective. (Then, in the query above, you can remove the wonky +1 and -1.)
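To illustrate the exclusive-end convention, here is a small Python sketch. It assumes sorted, non-overlapping half-open ranges [start, end), and the function names are only illustrative.

```python
from datetime import date


def days_in(start, end_exclusive):
    """Length of a half-open range [start, end_exclusive) in days."""
    return (end_exclusive - start).days


def gaps_half_open(ranges):
    """Gaps between non-overlapping half-open ranges, also half-open."""
    ordered = sorted(ranges)
    return [
        (prev_end, next_start)
        for (_, prev_end), (next_start, _) in zip(ordered, ordered[1:])
        if prev_end < next_start
    ]


# The question's two rows, rewritten with exclusive end dates.
ranges = [
    (date(2018, 7, 17), date(2018, 7, 21)),
    (date(2018, 7, 23), date(2018, 7, 27)),
]
gaps = gaps_half_open(ranges)
one_day = days_in(date(2018, 7, 9), date(2018, 7, 10))
```

The gap is simply the previous end paired with the next start, with no +1 or -1 adjustments, and its length is a plain subtraction.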
This solves your case, but provide some sample data if there will ever be overlaps, fringe cases, etc.
Take one day after your end date and 1 day before the next line's start date.
DECLARE @ TABLE (ID int, start_dt DATETIME, end_dt DATETIME, prc VARCHAR(100))
INSERT INTO @ (id, start_dt, end_dt, prc)
VALUES
(10410, '2018-07-09 00:00:00.00','2018-07-12 00:00:00.000','1025.000000'),
(10412, '2018-07-17 00:00:00.00','2018-07-20 00:00:00.000','1050.000000'),
(10413, '2018-07-23 00:00:00.00','2018-07-26 00:00:00.000','1040.000000')
SELECT DATEADD(DAY, 1, end_dt)
, DATEADD(DAY, -1, LEAD(start_dt, 1, '9999-12-31') OVER(ORDER BY id) )
FROM @
You may want to take a look at this:
http://sqlfiddle.com/#!18/3a224/1
You just have to edit the begin range to today and the end range to 9999-12-31.
I have the following table tbl in an Oracle database. Given a dynamic joining date such as 1-1-2012, I want to find out which two terms it falls between: (Fall and Spring), (Spring and Summer), or (Summer and Fall). I want a query to which I pass only the joining date and which returns the Semestertime and joiningDate of those terms.
Semestertime joiningDate
Fall 10-13-2011
Spring 2-1-2012
Summer 6-11-2012
Fall 10-1-2015
If I understand your question correctly:
SELECT *
FROM your_table
WHERE joiningDate between to_date (your_lower_limit_date_here, 'mm-dd-yyyy')
AND to_date (your_upper_limit_date_here, 'mm-dd-yyyy');
What about something like that:
select 'BEFORE' term,
t."Semestertime", to_char(t."joiningDate", 'MM-DD-YYYY')
from (
select tbl.*,
row_number() over (order by tbl."joiningDate" desc) rn -- rownum would be assigned before the ORDER BY is applied
from tbl
where tbl."joiningDate" < to_date('1-1-2012','MM-DD-YYYY') -- your reference date
) t
where rn = 1
union all
select 'AFTER' term,
t."Semestertime", to_char(t."joiningDate", 'MM-DD-YYYY')
from (
select tbl.*,
row_number() over (order by tbl."joiningDate" asc) rn -- rownum would be assigned before the ORDER BY is applied
from tbl
where tbl."joiningDate" > to_date('1-1-2012','MM-DD-YYYY') -- your reference date
) t
where rn = 1
This will return the "term" before and after a given date. You will probably have to adapt such query to your specific needs. But that might be a good starting point.
For example, given your business rules, you might consider using <= instead of <. You might also need the result displayed as columns instead of rows. But none of this should be too hard to change.
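The "nearest term before and after a reference date" lookup can also be sketched in Python with the standard bisect module. The data is the sample from the question, read as MM-DD-YYYY; neighbours is a hypothetical helper name, and strict comparisons are assumed (a term starting exactly on the reference date counts as neither before nor after).

```python
from bisect import bisect_left, bisect_right
from datetime import date

# (joiningDate, Semestertime), sorted by date.
terms = [
    (date(2011, 10, 13), "Fall"),
    (date(2012, 2, 1), "Spring"),
    (date(2012, 6, 11), "Summer"),
    (date(2015, 10, 1), "Fall"),
]


def neighbours(terms, ref):
    """Return (last term strictly before ref, first term strictly after ref).

    Either element is None when no such term exists.
    """
    dates = [d for d, _ in terms]
    i = bisect_left(dates, ref)   # index of the first date >= ref
    j = bisect_right(dates, ref)  # index of the first date > ref
    before = terms[i - 1] if i > 0 else None
    after = terms[j] if j < len(terms) else None
    return before, after


before, after = neighbours(terms, date(2012, 1, 1))
print("BEFORE", before[1], before[0])
print("AFTER", after[1], after[0])
```

Switching to <= semantics, as suggested above, would mean swapping which bisect function is used for each side.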
As an alternate solution using CTE and sub-queries:
with testdata as (select to_date('1-1-2012','MM-DD-YYYY') refdate from dual)
select v.what, tbl.* from tbl join
(
select 'BEFORE' what, max(t1."joiningDate") d
from tbl t1
where t1."joiningDate" < to_date('1-1-2012','MM-DD-YYYY')
union all
select 'AFTER' what, min(t1."joiningDate") d
from tbl t1
where t1."joiningDate" > to_date('1-1-2012','MM-DD-YYYY')
) v
on tbl."joiningDate" = v.d
See http://sqlfiddle.com/#!4/c7fa5/15 for a live demo comparing those solutions.
Edit: added an id, to make it more graspable
I stumbled over this problem a couple of times and always solved it per PL/SQL, but I am wondering, if there is a SQL-solution.
There is a table with a from_date and a to_date. The data in there is seamless: for every to_date, there is a new row with a from_date on the next day.
create table test_date
(
id number,
from_date date,
to_date date
)
/
insert into test_date values(1, to_date('01022003', 'ddmmyyyy'), to_date('28022003', 'ddmmyyyy'))
/
insert into test_date values(2, to_date('01032003', 'ddmmyyyy'), to_date('31032003', 'ddmmyyyy'))
/
There is another table, which breaks up these time periods.
create table test_date2
(
id number,
from_date date,
to_date date
)
/
insert into test_date2 values(3, to_date('05022003', 'ddmmyyyy'), to_date('10022003', 'ddmmyyyy'))
/
So I want a view that shows these time periods and the "breaks" in different columns, but it should also be seamless: after the "break" from test_date2, it should carry straight on with the data from test_date. I can't get that working:
select typ, id, from_date, decode(typ, 1, decode(to_date+1, lead_from_date, to_date, lead_from_date-1), to_date) to_date
from(
select typ, id, from_date, to_date, lead(from_date) over (order by from_date, typ) lead_from_date
from
(select 1 typ, id, from_date, to_date
from test_date t
union all
select 2 typ, id, from_date, to_date
from test_date2 t2
) a
)
What I get here is
1 1 01/02/2003 04/02/2003
2 3 05/02/2003 10/02/2003
1 2 01/03/2003
The period between 11/02/2003 and 28/02/2003 (for the row in test_date with id=1) is missing.
So, what I want, is this:
1 1 01/02/2003 04/02/2003
2 3 05/02/2003 10/02/2003
1 1 11/02/2003 28/02/2003
1 2 01/03/2003
I think this is what you're after; you're not getting the same answer because you're not generating the full list of dates. If you normalise your data in order to get a unique list of dates, you can then use LEAD() or LAG() to find the next/previous date and re-generate your list.
I use UNPIVOT here to transform the from_date and to_date into a single column but 4 unions will provide the same result:
with all_tables as (
select *
from test_date
union all
select *
from test_date2
)
, all_dates as (
select dt
from all_tables
unpivot ( dt for dates in ( from_date, to_date ))
)
select dt
, lead(dt) over (order by dt) as to_date
from all_dates;
DT TO_DATE
---------- ----------
01/02/2003 05/02/2003
05/02/2003 10/02/2003
10/02/2003 28/02/2003
28/02/2003 01/03/2003
01/03/2003 31/03/2003
31/03/2003
6 rows selected.
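To make the target output from the question concrete, here is a Python sketch that splits each base period around the breaks. It is a different technique from the UNPIVOT query above: it assumes inclusive dates, tags base pieces with typ 1 and breaks with typ 2 as in the question, and keeps the real to_date on the last row rather than leaving it blank. split_by_breaks is a hypothetical name.

```python
from datetime import date, timedelta

ONE_DAY = timedelta(days=1)


def split_by_breaks(periods, breaks):
    """periods, breaks: lists of (id, from_date, to_date), dates inclusive.

    Returns (typ, id, from_date, to_date) rows sorted by from_date, where
    typ 1 is a piece of a base period and typ 2 is a break.
    """
    rows = [(2, bid, bstart, bend) for bid, bstart, bend in breaks]
    for pid, pstart, pend in periods:
        cursor = pstart  # first day of the period not yet emitted
        for _, bstart, bend in sorted(breaks, key=lambda b: b[1]):
            if bend < cursor or bstart > pend:
                continue  # break does not touch the remaining part
            if bstart > cursor:
                rows.append((1, pid, cursor, bstart - ONE_DAY))
            cursor = max(cursor, bend + ONE_DAY)
        if cursor <= pend:
            rows.append((1, pid, cursor, pend))
    return sorted(rows, key=lambda r: (r[2], r[0]))


test_date = [
    (1, date(2003, 2, 1), date(2003, 2, 28)),
    (2, date(2003, 3, 1), date(2003, 3, 31)),
]
test_date2 = [(3, date(2003, 2, 5), date(2003, 2, 10))]

result = split_by_breaks(test_date, test_date2)
for typ, rid, start, end in result:
    print(typ, rid, start.strftime("%d/%m/%Y"), end.strftime("%d/%m/%Y"))
```

The row for id 1 is emitted twice, once before and once after the break, which is the piece the original view was missing.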