IBM DB2: Generate list of dates between two dates - sql

I need a query which will output a list of dates between two given dates.
For example, if my start date is 23/02/2016 and end date is 02/03/2016, I am expecting the following output:
Date
----
23/02/2016
24/02/2016
25/02/2016
26/02/2016
27/02/2016
28/02/2016
29/02/2016
01/03/2016
02/03/2016
Also, I need the above using SQL only (without the use of a 'WITH' statement or tables). Please help.

I mostly use DB2 for iSeries, so I will give you an SQL-only solution that works on it. I don't currently have access to the server, so the query is untested, but it should work. EDIT: The query has now been tested and works.
SELECT
    d.min + num.n DAYS
FROM
    -- create inline table with min/max date
    (VALUES(DATE('2015-02-28'), DATE('2016-03-01'))) AS d(min, max)
INNER JOIN
    -- create inline table with numbers from 0 to 999
    (
        SELECT
            n1.n + n10.n + n100.n AS n
        FROM
            (VALUES(0),(1),(2),(3),(4),(5),(6),(7),(8),(9)) AS n1(n)
        CROSS JOIN
            (VALUES(0),(10),(20),(30),(40),(50),(60),(70),(80),(90)) AS n10(n)
        CROSS JOIN
            (VALUES(0),(100),(200),(300),(400),(500),(600),(700),(800),(900)) AS n100(n)
    ) AS num
ON
    d.min + num.n DAYS <= d.max
ORDER BY
    num.n;
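To match the question's exact range, only the inline date row needs to change; for example:

    (VALUES(DATE('2016-02-23'), DATE('2016-03-02'))) AS d(min, max)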
If you plan to run the query more than once, you should consider creating a real table with the values for the loop:
CREATE TABLE dummy_loop AS (
    SELECT
        n1.n + n10.n + n100.n AS n
    FROM
        (VALUES(0),(1),(2),(3),(4),(5),(6),(7),(8),(9)) AS n1(n)
    CROSS JOIN
        (VALUES(0),(10),(20),(30),(40),(50),(60),(70),(80),(90)) AS n10(n)
    CROSS JOIN
        (VALUES(0),(100),(200),(300),(400),(500),(600),(700),(800),(900)) AS n100(n)
) WITH DATA;

ALTER TABLE dummy_loop ADD PRIMARY KEY (n);
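Once the table exists, the date-range query can use it instead of the inline generator; for example, with the question's dates:

SELECT
    DATE('2016-02-23') + n DAYS
FROM
    dummy_loop
WHERE
    DATE('2016-02-23') + n DAYS <= DATE('2016-03-02')
ORDER BY
    n;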
Depending on your use case, you could even create such a table for, say, 100 years. That is only about 100 * 365 = 36,500 rows with just a date field, so the table stays quite small and fast for joins.
CREATE TABLE dummy_dates AS (
    SELECT
        DATE('1970-01-01') + (n1.n + n10.n + n100.n + n1000.n + n10000.n) DAYS AS date
    FROM
        (VALUES(0),(1),(2),(3),(4),(5),(6),(7),(8),(9)) AS n1(n)
    CROSS JOIN
        (VALUES(0),(10),(20),(30),(40),(50),(60),(70),(80),(90)) AS n10(n)
    CROSS JOIN
        (VALUES(0),(100),(200),(300),(400),(500),(600),(700),(800),(900)) AS n100(n)
    CROSS JOIN
        -- the two extra digit tables below are an addition: three digits cover
        -- only 1,000 days, these extend the range to 0-39,999 days (~109 years)
        (VALUES(0),(1000),(2000),(3000),(4000),(5000),(6000),(7000),(8000),(9000)) AS n1000(n)
    CROSS JOIN
        (VALUES(0),(10000),(20000),(30000)) AS n10000(n)
) WITH DATA;

ALTER TABLE dummy_dates ADD PRIMARY KEY (date);
And the select query could look like:
SELECT
    *
FROM
    dummy_dates
WHERE
    date BETWEEN :startDate AND :endDate;
EDIT 2: Thanks to @Lennart's suggestion I have changed TABLE(VALUES(...)) to VALUES(...), because, as he pointed out, TABLE is a synonym for LATERAL. That was a real surprise to me.
EDIT 3: Thanks to @godric7gt I have removed TIMESTAMPDIFF and will remove it from all my scripts, because, as the documentation says:
These assumptions are used when converting the information in the second argument, which is a timestamp duration, to the interval type specified in the first argument. The returned estimate may vary by a number of days. For example, if the number of days (interval 16) is requested for the difference between '1997-03-01-00.00.00' and '1997-02-01-00.00.00', the result is 30. This is because the difference between the timestamps is 1 month, and the assumption of 30 days in a month applies.
This was a real surprise, because I had always trusted this function for day differences.
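If you need an exact day count in DB2, subtracting DAYS() values avoids the estimate; a small sketch using the dates from the quoted example:

SELECT DAYS(DATE('1997-03-01')) - DAYS(DATE('1997-02-01')) AS day_diff
FROM sysibm.sysdummy1;
-- returns 28, the true day count, where TIMESTAMPDIFF(16, ...) estimates 30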

Generating rows normally requires recursive SQL.
In DB2 this usually looks like:
with temp (date) as (
    select date('23.02.2016') as date from sysibm.sysdummy1
    union all
    select date + 1 day from temp
    where date < date('02.03.2016')
)
select * from temp
Since a CTE (using WITH) must be avoided for whatever reason, a possible workaround is setting
db2set DB2_COMPATIBILITY_VECTOR=8
which enables Oracle-style recursion with CONNECT BY:
SELECT date('22.02.2016') + level days AS dt
FROM sysibm.sysdummy1
CONNECT BY date('22.02.2016') + level days <= date('02.03.2016')
Please note: after setting DB2_COMPATIBILITY_VECTOR, an instance restart is necessary.
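A typical sequence, assuming the standard DB2 command-line tools:

db2set DB2_COMPATIBILITY_VECTOR=8
db2stop
db2start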

This solution doesn't use WITH, but it does use WHILE and a temp table... hopefully that still meets your needs?
EDIT -- I built this in SSMS 2014
DECLARE @Start DATE
DECLARE @End DATE

SET @Start = '2016-02-23'
SET @End = '2016-03-02'

CREATE TABLE #Dates ([Date] DATE)

WHILE @Start <= @End
BEGIN
    INSERT INTO #Dates
    SELECT @Start

    SET @Start = DATEADD(DAY, 1, @Start)
END

SELECT * FROM #Dates

DROP TABLE #Dates

I assume AS400 does not support recursive CTEs, and that's why you want a solution without them. I have no clue whether it supports any of the following constructions, but they might be worth a shot. First we will need a generator: any table with a sufficient number of rows will do. If you don't have a table large enough for the number of days you want, you can create a cartesian product. Example:
select row_number() over ()
from a_table
cross join a_table
Another way of extending the domain is to create the powerset of a table using group by cube, see below.
Assume that, one way or another, we can create a large enough set of rows. You can generate the dates like:
select date('23/02/2016') + n days
from (
select row_number() over () as n
from a_table
) as t
where n < 100
order by n
If for some reason you don't want to use an existing table, group by cube will produce a relation whose cardinality equals the size of the power set of the attributes. Here I use 4 attributes, which generates 16 rows.
select date('2016-01-01') + row_number() over () days
from sysibm.dual x
group by cube(x.dummy, x.dummy, x.dummy, x.dummy)
If you want to generate say 100 rows you need 7 (since 2^7=128) attributes in the group by cube clause and a fetch first 100 rows:
select date('2016-01-01') + row_number() over () days
from sysibm.dual x
group by cube(x.dummy, x.dummy, x.dummy, x.dummy, x.dummy, x.dummy, x.dummy)
order by 1
fetch first 100 rows only

Related

Identify missing hours - find the gaps in time

I have a table with hours, but there are gaps. I need to find which hours are missing.
select datehour
from stored_hours
order by 1;
The gaps in this timeline are easy to find:
select lag(datehour) over(order by datehour) since, datehour until
, timestampdiff(hour, lag(datehour) over(order by datehour), datehour) - 1 missing
from stored_hours
qualify missing > 0
How can I create a list of the missing hours during these days?
(with Snowflake and SQL)
To create a list/table of the missing hours:
Generate a list of all the hours between the min/max of the existing table.
To generate that list with Snowflake you will need to use session variables (as the generator only takes constants for the length).
Then find the missing hours with a left join, looking for nulls.
Use variables to find out the start and total number of hours:
set (min_hour, total_hours) = (
select min(datehour) min_hour
, timestampdiff('hour', min(datehour), max(datehour)) total_hours
from stored_hours
);
Then do the left join with a generated table of all hours, to find the missing ones:
select generated_hour missing_hour
from ( -- generated hours
select timestampadd('hour', row_number() over(order by 0), $min_hour) generated_hour
from table(generator(rowcount => $total_hours))
) a
left outer join stored_hours b
on generated_hour=b.datehour
where datehour is null;
The result is a list of the missing hours:
(You could apply a similar technique for missing days, if the inputs are dates.)
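For instance, a sketch of the missing-days variant, assuming a hypothetical table stored_days with a date column d:

set (min_day, total_days) = (
  select min(d) min_day
       , datediff('day', min(d), max(d)) total_days
  from stored_days
);

select generated_day missing_day
from ( -- generated days
  select dateadd('day', row_number() over(order by 0), $min_day) generated_day
  from table(generator(rowcount => $total_days))
) a
left outer join stored_days b
  on generated_day = b.d
where b.d is null;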

generate each minute string for a day within specified time limit

My aim is to generate a per-minute count of all records in a table, like this:
SELECT
COUNT(*) as RECORD_COUNT,
to_Char(MY_DATE,'HH24:MI') MINUTE_GAP
FROM
TABLE_A
WHERE
BLAH='Blah! Blah!!'
GROUP BY
to_Char(MY_DATE,'HH24:MI')
However, this query doesn't give me the minutes in which there were no records.
To get the desired result, I'm filling the gaps in the original query with the following query, and joining the two results:
SELECT
    *
FROM
    (
        SELECT
            TO_CHAR(TRUNC(SYSDATE) + ((ROWNUM - 1) / 1440), 'HH24:MI') as MINUTE_GAP,
            0 as COUNT
        FROM
            SOME_LARGE_TABLE_B
        WHERE
            rownum <= 1440
    )
WHERE
    minute_gap > '07:00' /* I want only the data starting from 7:00 AM */
This works for me, but I can't rely on SOME_LARGE_TABLE_B to generate the minutes (it might not have enough rows at some point in the future), and the query doesn't look like a professional solution. Is there any easier way to do this?
NOTE: I don't want any new tables created with static values for all the minutes just for one query.
Just generate your timestamps and left join your grouped data to it:
SELECT MINUTE, ....
FROM (
    SELECT TO_CHAR(TO_DATE((LEVEL + 419) * 60, 'SSSSS'), 'HH24:MI') MINUTE /* 07:00 - 23:59 */
    FROM DUAL
    CONNECT BY LEVEL <= 1020
)
LEFT JOIN (
    <your grouped subquery>
) ON MINUTE = MINUTE_GAP
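For completeness, here is one way the pieces could slot together with the question's TABLE_A and filter, using NVL to show zero counts (an untested sketch):

SELECT m.MINUTE, NVL(g.RECORD_COUNT, 0) AS RECORD_COUNT
FROM (
    SELECT TO_CHAR(TO_DATE((LEVEL + 419) * 60, 'SSSSS'), 'HH24:MI') MINUTE
    FROM DUAL
    CONNECT BY LEVEL <= 1020
) m
LEFT JOIN (
    SELECT COUNT(*) AS RECORD_COUNT,
           TO_CHAR(MY_DATE, 'HH24:MI') AS MINUTE_GAP
    FROM TABLE_A
    WHERE BLAH = 'Blah! Blah!!'
    GROUP BY TO_CHAR(MY_DATE, 'HH24:MI')
) g ON m.MINUTE = g.MINUTE_GAP
ORDER BY m.MINUTE;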

db2 suppress recursive warning

I have a recursive SQL statement which works, but gives me the following warning.
SQL0347W The recursive common table expression "DT_LAST_YEAR" may
contain an infinite loop. SQLSTATE=01605
How can I get rid of the warning?
INSERT INTO REP_MAN_TRAN_COUNTS (SITEDIRECTORYID, BUSINESSDATE, TRANCOUNT)
WITH dt_this_year (level, seqdate) AS
(
SELECT 1, date(current timestamp) -7 DAYS FROM sysibm.sysdummy1
UNION ALL
SELECT level, seqdate + level days FROM dt_this_year WHERE level < 1000 AND seqdate + 1 days < date(current timestamp)
)
,dt_last_year (level, seqdate) AS
(
SELECT 1, date(current timestamp) -7 DAYS - 1 year FROM sysibm.sysdummy1
UNION ALL
SELECT level, seqdate + level days FROM dt_last_year WHERE level < 1000 AND seqdate + 1 days < date(current timestamp) -1 year
)
select 10049, date(dts.calendarday), count(*) trancount
from (
SELECT seqdate AS calendarday FROM dt_this_year
UNION
SELECT seqdate AS calendarday FROM dt_last_year
) dts LEFT JOIN ccftrxheader ccf
ON date(dts.calendarday) = date(ccf.businessdate)
WHERE ccf.sitedirectoryid=10049
GROUP BY ccf.sitedirectoryid,dts.calendarday
How do you get rid of warnings?
By changing the code so that it no longer generates the warning in the first place. Hiding warnings is problematic, because it often disguises a potentially larger problem. I'm fairly certain it's complaining here because the termination clause you provide for level can never be reached (you never increment it).
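For reference, the smallest change that typically silences SQL0347W is to make level actually advance toward its bound in the recursive step; a sketch for the first CTE (untested, and it also fixes the likely bug of stepping by level days instead of 1 day):

SELECT 1, date(current timestamp) - 7 DAYS FROM sysibm.sysdummy1
UNION ALL
SELECT level + 1, seqdate + 1 DAY
FROM dt_this_year
WHERE level < 1000 AND seqdate + 1 DAY < date(current timestamp)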
Personally, I'd probably re-write your query into something like this:
INSERT INTO Rep_Man_Tran_Counts (siteDirectoryId, businessDate, tranCount)
WITH dt_Calendar_Data (level, calendarDay) AS
     (SELECT l, c
      FROM (VALUES (1, CURRENT_DATE - 7 DAYS),
                   (1, CURRENT_DATE - 7 DAYS - 1 YEAR)) t(l, c)
      UNION ALL
      SELECT level + 1, calendarDay + 1 DAYS
      FROM dt_Calendar_Data
      WHERE level < 7)
SELECT 10049, dtCal.calendarDay, COUNT(ccf.businessDate) as tranCount
FROM dt_Calendar_Data dtCal
LEFT JOIN ccftrxHeader ccf
       ON ccf.businessDate = dtCal.calendarDay
      AND ccf.siteDirectoryId = 10049
GROUP BY dtCal.calendarDay
(untested, as you've provided no sample data, and I don't have a DB2 instance)
I've assumed you actually wanted a LEFT JOIN, as opposed to the regular INNER JOIN you were actually getting (due to the condition in the WHERE clause, and probably the GROUP BY as well). To get 0 for days with no transactions, the count is taken over ccf.businessDate; a bare COUNT(*) would count the null-extended row and report 1 instead.
I've also assumed that businessDate is a DATE type, and not a timestamp. If it is a timestamp this query needs to be adjusted (note that the function you were using would force the optimizer to ignore indices).
Note that order of operations with dates matters! Thankfully, when dealing with year ranges you only have one day to worry about in the Gregorian calendar (February 29th). Your current ordering will compare identical calendar days at the start of the range (which one has the "gap" depends on whether this year or last year is a leap year).
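A quick illustration of why the ordering matters (DB2 syntax; dates chosen to straddle 2016-02-29):

SELECT DATE('2016-03-05') - 7 DAYS - 1 YEAR   -- 2015-02-27
     , DATE('2016-03-05') - 1 YEAR - 7 DAYS   -- 2015-02-26
FROM sysibm.sysdummy1;

The 7-day step crosses February 29th in 2016 but not in 2015, hence the one-day difference.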
EDIT:
Sure, let's look at that CTE:
FROM (VALUES (1, CURRENT_DATE - 7 DAYS),
             (1, CURRENT_DATE - 7 DAYS - 1 YEAR)) t(l, c)
This is just a standard VALUES clause used as a table reference. It is the SQL Standard way to construct a small temp table (rather than referencing the dummy tables, which tend to be vendor-specific). If the statement is run on 2014-02-26 then the resulting table will be:
t
l c
===============
1 "2014-02-19"
1 "2013-02-19"
These columns get renamed by the column listing of the CTE, which are then referenced in the join (and in the case of a recursive CTE, by the recursive portion).
This then forms the starting data for the rest of the recursive query:
UNION ALL
SELECT level + 1, calendarDay + 1 DAYS
FROM dt_Calendar_Data
WHERE level < 7
In DB2 (and some other RDBMSs), recursive CTEs essentially execute iteratively, acting off the results of the "previous" invocation. Every time around, we increment level, and add another day to calendarDay. The "next" rows are then:
level calendarDay
======================
2 "2014-02-20"
2 "2013-02-20"
This continues until the "previous" row has level = 7, which means a new row is not generated (check the WHERE clause). In general, it's best to only have one termination condition (and make progress every iteration), to make it easier for the optimizer to spot. The resulting data is then in the ranges:
level calendarDay
=====================
1 "2014-02-19"
. .....
7 "2014-02-26"
1 "2013-02-19"
. .....
7 "2013-02-26"
... as a side note, I generated the this-year/last-year data together to reduce the number of table references. If you only needed the one year, level is unnecessary.

Find closest date in SQL Server

I have a table dbo.X with a DateTime column Y which may have hundreds of records.
My stored procedure has a parameter @CurrentDate; I want to find the date in column Y of dbo.X which is less than, and closest to, @CurrentDate.
How do I find it?
The WHERE clause matches all rows with a date less than @CurrentDate and, since they are ordered descending, TOP 1 picks the closest date to @CurrentDate.
SELECT TOP 1 *
FROM x
WHERE x.date < @CurrentDate
ORDER BY x.date DESC
Use DATEDIFF and order your result by how many days or seconds lie between that date and the input.
Something like this:
select top 1 rowId, dateCol, datediff(second, dateCol, @CurrentDate) as SecondsBetweenDates
from myTable
where dateCol < @CurrentDate
order by datediff(second, dateCol, @CurrentDate)
I think I have a better solution for this problem.
I will walk through a few steps to support and explain the final solution.
Background
In my solution I have a table of FX rates. These represent market rates for different currencies. However, our service provider has had a problem with the rate feed, and as a result some rates have zero values. I want to fill the missing data with the rate for the same currency that is closest in time to the missing rate. Basically I want to get the RateId of the nearest non-zero rate, which I will then substitute. (The substitution itself is not shown here.)
1) To start off, let's identify the missing rate information:
[Screenshot: query listing the rates with a zero value; see cte_zero_rates below.]
2) Next, let's identify the rates that are not missing:
[Screenshot: query listing the non-zero rates; see cte_non_zero_rates below.]
3) This query is where the magic happens. I have made one assumption, which can be removed, to improve the efficiency of the query: the join on the calendar day assumes a substitute transaction exists on the same day as the missing/zero transaction.
The magic is the ROW_NUMBER() call: it numbers the candidates starting at 1 for the shortest time difference between the missing and non-missing transactions; the next closest transaction gets RowNum 2, and so on.
Please note that the join must also match on currency so that currency types are not mixed up. That is, I don't want to substitute an AUD rate with CHF values; I want the closest match within the same currency.
[Screenshot: the two data sets combined, with ROW_NUMBER identifying the nearest transaction; see cte_Nearest_Transaction below.]
4) Finally, let's get the rows where RowNum is 1:
[Screenshot: the final query.]
The full query is as follows:
; with cte_zero_rates as
(
    select *
    from fxrates
    where (spot_exp = 0 or spot_imp = 0)   -- assumed: the duplicated predicate was meant to cover both rate columns
),
cte_non_zero_rates as
(
    select *
    from fxrates
    where (spot_exp > 0 and spot_imp > 0)
)
, cte_Nearest_Transaction as
(
    select z.FXRatesID as Zero_FXRatesID
         , z.importDate as Zero_importDate
         , z.currency as Zero_Currency
         , nz.currency as NonZero_Currency
         , nz.FXRatesID as NonZero_FXRatesID
         , nz.spot_imp
         , nz.importDate as NonZero_importDate
         , DATEDIFF(ss, z.importDate, nz.importDate) as TimeDifference
         , ROW_NUMBER() over(partition by z.FXRatesID order by abs(DATEDIFF(ss, z.importDate, nz.importDate)) asc) as RowNum
    from cte_zero_rates z
    left join cte_non_zero_rates nz on nz.currency = z.currency
        and cast(nz.importDate as date) = cast(z.importDate as date)
    --order by z.currency desc, z.importDate desc
)
select n.Zero_FXRatesID
     , n.Zero_Currency
     , n.Zero_importDate
     , n.NonZero_importDate
     , DATEDIFF(s, n.NonZero_importDate, n.Zero_importDate) as Delay_In_Seconds
     , n.NonZero_Currency
     , n.NonZero_FXRatesID
from cte_Nearest_Transaction n
where n.RowNum = 1
  and n.NonZero_FXRatesID is not null
order by n.Zero_Currency, n.NonZero_importDate
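As a final, hypothetical step (the answer alludes to the substitution but does not show it), the RowNum = 1 pairing could drive an UPDATE; the CTE bodies are as above, and the target column is assumed:

; with cte_zero_rates as ( /* as above */ )
, cte_non_zero_rates as ( /* as above */ )
, cte_Nearest_Transaction as ( /* as above */ )
update z
set z.spot_imp = nz.spot_imp   -- assumed target column
from fxrates z
join cte_Nearest_Transaction n
    on n.Zero_FXRatesID = z.FXRatesID
   and n.RowNum = 1
join fxrates nz
    on nz.FXRatesID = n.NonZero_FXRatesID;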

SQL Average Inter-arrival Time, Time Between Dates

I have a table with sequential timestamps:
2011-03-17 10:31:19
2011-03-17 10:45:49
2011-03-17 10:47:49
...
I need to find the average time difference between each of these (there could be dozens) in seconds, or whatever is easiest; I can work with it from there. So, for example, the inter-arrival time for only the first two times above would be 870 (14m 30s). For all three times it would be (870 + 120)/2 = 495 (8m 15s).
A note: I am using PostgreSQL 8.1.22.
EDIT: The table I mention above is from a different query that is literally just a one-column list of timestamps
Not sure I understood your question completely, but this might be what you are looking for:
SELECT avg(difference)
FROM (
SELECT timestamp_col - lag(timestamp_col) over (order by timestamp_col) as difference
FROM your_table
) t
The inner query calculates the distance between each row and the preceding row. The result is an interval for each row in the table.
The outer query simply does an average over all differences.
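Incidentally, since consecutive differences telescope, the same average is simply (max - min) / (n - 1), which needs no window functions and therefore also works on 8.1 (same assumed names as above):

SELECT (max(timestamp_col) - min(timestamp_col)) / (count(*) - 1) AS avg_difference
FROM your_table;

For the three sample timestamps this gives 990/2 = 495 seconds, matching the hand calculation.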
I think you want to find the average timestamp itself (avg of a timestamptz).
My solution is avg(current - min value); but since the result is an interval, add it back to the min value:
SELECT avg(target_col - (select min(target_col) from your_table))
+ (select min(target_col) from your_table)
FROM your_table
If you cannot upgrade to a version of PG that supports window functions, you
may compute your table's sequential steps "the slow way."
Assuming your table is "tbl" and your timestamp column is "ts":
SELECT AVG(t1 - t0)
FROM (
-- All this silliness would be moot if we could use
-- `` lead(ts) over (order by ts) ''
SELECT tbl.ts AS t0,
next.ts AS t1
FROM tbl
CROSS JOIN
tbl next
WHERE next.ts = (
SELECT MIN(ts)
FROM tbl subquery
WHERE subquery.ts > tbl.ts
)
) derived;
But don't do that. Its performance will be terrible. Please do what
a_horse_with_no_name suggests, and use window functions.