ORACLE (11.2.0.1.0) - Recursive CTE with a date expression - sql

The right answer on the following question:
That is a bug that was fixed in 11.2.0.3 or later if I recall correctly. (11.2.0.1 is no longer supported anyway. 11.2.0.4 is the only 11.2 release that is still supported) – #a_horse_with_no_name
The bug number is 11840579 and it was fixed in 11.2.0.3 and 12.1.0.1
– #a_horse_with_no_name
Question
I have a table
CREATE TABLE test(
from_date date,
to_date date
);
INSERT INTO test(from_date,to_date)
--VALUES('20171101','20171115');
VALUES(TO_DATE('20171101','YYYYMMDD'),TO_DATE('20171115','YYYYMMDD'));
The following query in Oracle return only one row (expected 15 rows)
WITH dateCTE(from_date,to_date,d,i) AS(
SELECT from_date,to_date,from_date AS d,1 AS i
FROM test
UNION ALL
SELECT from_date,to_date,d+INTERVAL '1' DAY,i+1
FROM dateCTE
WHERE d<to_date
)
SELECT d,i
FROM dateCTE
SQL Fiddle - http://sqlfiddle.com/#!4/36907/8
For test I changed the condition to i<10
WITH dateCTE(from_date,to_date,d,i) AS(
SELECT from_date,to_date,from_date AS d,1 AS i
FROM test
UNION ALL
SELECT from_date,to_date,d+INTERVAL '1' DAY,i+1
FROM dateCTE
--WHERE d<to_date
WHERE i<10 -- exit condition
)
SELECT d,i
FROM dateCTE
And get the next result
| D | I |
|------------|----|
| 2017-11-01 | 1 |
| 2017-10-31 | 2 |
| 2017-10-30 | 3 |
| 2017-10-29 | 4 |
| 2017-10-28 | 5 |
| 2017-10-27 | 6 |
| 2017-10-26 | 7 |
| 2017-10-25 | 8 |
| 2017-10-24 | 9 |
| 2017-10-23 | 10 |
Why do this recursive query returned bad result in Oracle?
SQL Fiddle - http://sqlfiddle.com/#!4/36907/5
I ran a similar query in SQLServer and I get the right result
WITH dateCTE(from_date,to_date,d,i) AS(
SELECT from_date,to_date,from_date AS d,1 AS i
FROM test
UNION ALL
SELECT from_date,to_date,DATEADD(DAY,1,d),i+1
FROM dateCTE
WHERE d<to_date
)
SELECT d,i
FROM dateCTE
The right result
d i
2017-11-01 1
2017-11-02 2
2017-11-03 3
2017-11-04 4
2017-11-05 5
2017-11-06 6
2017-11-07 7
2017-11-08 8
2017-11-09 9
2017-11-10 10
2017-11-11 11
2017-11-12 12
2017-11-13 13
2017-11-14 14
2017-11-15 15
Why it doesn't work in Oracle? What alternative variants can you suggest? Thank you!
Screen shots from a real system:

If you want to have a sequential from-date to to-date, Use such this select:
SELECT DATE '2017-11-01' + LEVEL - 1 AS D, LEVEL AS I
FROM DUAL
CONNECT BY LEVEL <= DATE '2017-11-15' - DATE '2017-11-01' + 1;

It's working for me in Oracle 11.2 Enterprise. Your SQL fiddle, however, plainly shows that it doesn't in every Oracle version.
I remember I had similar problems with dates in recursive queries. So there is a bug, that has been fixed in newer versions. It is very likely, your Oracle version has this bug.
A workaround: Get only the integer offsets from the recursive query and add them to the from_date afterwards:
WITH dateCTE(from_date, i, iend) AS
(
SELECT from_date, 1 AS i, to_date - from_date as iend
FROM test
UNION ALL
SELECT from_date, i + 1, iend
FROM dateCTE
WHERE i <= iend
)
, dates as (select i, from_date + i - 1 as d from dateCTE)
SELECT d, i
FROM dates;

Related

Time and attendance

I have a table with below data
EMPID | DEVICE | EVENTTIME
-----------------------------------------
112 | READ_IN | 2018-11-02 07:00:00.000
112 | READ_IN | 2018-11-02 08:00:00.000
112 | READ_OUT | 2018-11-02 12:00:00.000
112 | READ_IN | 2018-11-02 13:00:00.000
112 | READ_OUT | 2018-11-02 16:00:00.000
I need a select query to achieve below data:
ID_Emp |Date |TimeIn |TimeOut|Hours
112 |02/11/2018 |8:00 |16:00 |7:00
In my table, the employee came at 7:00 but he didn't do his work then after one hour he came back and work. He took his lunch break at 12:00-13:00 and left his work at 16:00. So his total working hours will be 7 hours.
At first you need to eliminate time between 12 and 1, I wrote simple where clause for this. After that
I used PIVOT for transposing rows to columns by max EVENTTIME.
And finally, I wrote outermost SELECT query for converting columns to your intended format.
here is the fiddler link: http://sqlfiddle.com/#!4/f1189/10
here is the code:
SELECT
EMPID,
TO_CHAR(READ_IN, 'HH24:MI') READ_IN,
TO_CHAR(READ_OUT, 'HH24:MI') READ_OUT,
EXTRACT(HOUR FROM READ_OUT - READ_IN) HOUR
FROM (
select * from (
select * from Table1
WHERE
extract(hour from eventtime) not between '12' and '13'
)
PIVOT (
MAX(EVENTTIME)
for DEVICE in ( 'READ_IN' READ_IN, 'READ_OUT' READ_OUT )
)
)
please note that this example only works for oracle.

How to write a SQL statement to sum data using group by the same day of every two neighboring months

I have a data table like this:
datetime data
-----------------------
...
2017/8/24 6.0
2017/8/25 5.0
...
2017/9/24 6.0
2017/9/25 6.2
...
2017/10/24 8.1
2017/10/25 8.2
I want to write a SQL statement to sum the data using group by the 24th of every two neighboring months in certain range of time such as : from 2017/7/20 to 2017/10/25 as above.
How to write this SQL statement? I'm using SQL Server 2008 R2.
The expected results table is like this:
datetime_range data_sum
------------------------------------
...
2017/8/24~2017/9/24 100.9
2017/9/24~2017/10/24 120.2
...
One conceptual way to proceed here is to redefine a "month" as ending on the 24th of each normal month. Using the SQL Server month function, we will assign any date occurring after the 24th as belonging to the next month. Then we can aggregate by the year along with this shifted month to obtain the sum of data.
WITH cte AS (
SELECT
data,
YEAR(datetime) AS year,
CASE WHEN DAY(datetime) > 24
THEN MONTH(datetime) + 1 ELSE MONTH(datetime) END AS month
FROM yourTable
)
SELECT
CONVERT(varchar(4), year) + '/' + CONVERT(varchar(2), month) +
'/25~' +
CONVERT(varchar(4), year) + '/' + CONVERT(varchar(2), (month + 1)) +
'/24' AS datetime_range,
SUM(data) AS data_sum
FROM cte
GROUP BY
year, month;
Note that your suggested ranges seem to include the 24th on both ends, which does not make sense from an accounting point of view. I assume that the month includes and ends on the 24th (i.e. the 25th is the first day of the next accounting period.
Demo
I would suggest dynamically building some date range rows so that you can then join you data to those for aggregation, like this example:
+----+---------------------+---------------------+----------------+
| | period_start_dt | period_end_dt | your_data_here |
+----+---------------------+---------------------+----------------+
| 1 | 24.04.2017 00:00:00 | 24.05.2017 00:00:00 | 1 |
| 2 | 24.05.2017 00:00:00 | 24.06.2017 00:00:00 | 1 |
| 3 | 24.06.2017 00:00:00 | 24.07.2017 00:00:00 | 1 |
| 4 | 24.07.2017 00:00:00 | 24.08.2017 00:00:00 | 1 |
| 5 | 24.08.2017 00:00:00 | 24.09.2017 00:00:00 | 1 |
| 6 | 24.09.2017 00:00:00 | 24.10.2017 00:00:00 | 1 |
| 7 | 24.10.2017 00:00:00 | 24.11.2017 00:00:00 | 1 |
| 8 | 24.11.2017 00:00:00 | 24.12.2017 00:00:00 | 1 |
| 9 | 24.12.2017 00:00:00 | 24.01.2018 00:00:00 | 1 |
| 10 | 24.01.2018 00:00:00 | 24.02.2018 00:00:00 | 1 |
| 11 | 24.02.2018 00:00:00 | 24.03.2018 00:00:00 | 1 |
| 12 | 24.03.2018 00:00:00 | 24.04.2018 00:00:00 | 1 |
+----+---------------------+---------------------+----------------+
DEMO
declare #start_dt date;
set #start_dt = '20170424';
select
period_start_dt, period_end_dt, sum(1) as your_data_here
from (
select
dateadd(month,m.n,start_dt) period_start_dt
, dateadd(month,m.n+1,start_dt) period_end_dt
from (
select #start_dt start_dt ) seed
cross join (
select 0 n union all
select 1 union all
select 2 union all
select 3 union all
select 4 union all
select 5 union all
select 6 union all
select 7 union all
select 8 union all
select 9 union all
select 10 union all
select 11
) m
) r
-- LEFT JOIN YOUR DATA
-- ON yourdata.date >= r.period_start_dt and data.date < r.period_end_dt
group by
period_start_dt, period_end_dt
Please don't be tempted to use "between" when it comes to joining to your data. Follow the note above and use yourdata.date >= r.period_start_dt and data.date < r.period_end_dt otherwise you could double count information as between is inclusive of both lower and upper boundaries.
I think the simplest way is to subtract 25 days and aggregate by the month:
select year(dateadd(day, -25, datetime)) as yr,
month(dateadd(day, -25, datetime)) as mon,
sum(data)
from t
group by dateadd(day, -25, datetime);
You can format yr and mon to get the dates for the specific ranges, but this does the aggregation (and the yr/mon columns might be sufficient).
Step 0: Build a calendar table. Every database needs a calendar table eventually to simplify this sort of calculation.
In this table you may have columns such as:
Date (primary key)
Day
Month
Year
Quarter
Half-year (e.g. 1 or 2)
Day of year (1 to 366)
Day of week (numeric or text)
Is weekend (seems redundant now, but is a huge time saver later on)
Fiscal quarter/year (if your company's fiscal year doesn't start on Jan. 1)
Is Holiday
etc.
If your company starts its month on the 24th, then you can add a "Fiscal Month" column that represents that.
Step 1: Join on the calendar table
Step 2: Group by the columns in the calendar table.
Calendar tables sound weird at first, but once you realize that they are in fact tiny even if they span a couple hundred years they quickly become a major asset.
Don't try to cheap out on disk space by using computed columns. You want real columns because they are much faster and can be indexed if necessary. (Though honestly, usually just the PK index is enough for even wide calendar tables.)

Postgres query for calendar

I am trying to write a query to retrieve data from an events query for a simple calendar app.
The table structure is as followed:
table name: events
Column | Type
---------+-----------
id | integer
start | timestamp
end | timestamp
the data inside of the table
id| start | end
--+---------------------+--------------------
1 | 2017-09-01 12:00:00 | 2017-09-01 12:00:00
2 | 2017-09-03 10:00:00 | 2017-09-03 12:00:00
3 | 2017-09-08 12:00:00 | 2017-09-11 12:00:00
4 | 2017-09-11 12:00:00 | 2017-09-11 12:00:00
the expected result is
date | event.id
-----------+---------
2017-09-01 | 1
2017-09-03 | 2
2017-09-08 | 3
2017-09-09 | 3
2017-09-10 | 3
2017-09-11 | 3
2017-09-11 | 4
As you can see, only days with an event (not just start and end, but also the days in between) is retrieved, days without an event are not retrieved at all.
In the second step I would like to be able to limit the amount of distinct days, e.g. "get 4 days with events" what might be more than 4 rows.
Right now I am able to retrieve the events based on start date only using the following query:
SELECT start::date, id FROM events WHERE events.start::date >= '2017-09-01' LIMIT 3
Thinks I already though about are DENSE_RANK and generate_series, but up to now I didn't find a way to fill the gaps between start and end, but not on days where there are no data.
So in short:
What I want to get is: get the next X days where there is an event. A date with an event is a day where start <= date >= end
Any ideas ?
Edit
Thanks to Tim I have now the following query (modified to use generate_series instead of a table and added a limit using dense_rank):
select date, id FROM (
SELECT
DENSE_RANK() OVER (ORDER BY t1.date) as rank,
t1.date,
events.id
FROM
generate_series([DATE]::date, [DATE]::date + interval '365 day', '1 day') as t1
INNER JOIN
events
ON t1.date BETWEEN events.start::date AND events."end"::date
) as t
WHERE rank <= [LIMIT]
This is working really good, even though I am not 100% sure about the performance hit with this kind of limit
I think you really need a calendar table here to cover the full range of dates in which your data may appear. In the first CTE below, I generate a table covering the month of September 2017. Then all we need to do is inner join this calendar table with the events table on the criteria of a given day appearing within a given range.
WITH cte AS (
SELECT CAST('2017-09-01' AS DATE) + (n || ' day')::INTERVAL AS date
FROM generate_series(0, 29) n
)
SELECT
t1.date,
t2.id
FROM cte t1
INNER JOIN events t2
ON t1.date BETWEEN CAST(t2.start AS DATE) AND CAST(t2.end AS DATE);
Output:
date id
1 01.09.2017 00:00:00 1
2 03.09.2017 00:00:00 2
3 08.09.2017 00:00:00 3
4 09.09.2017 00:00:00 3
5 10.09.2017 00:00:00 3
6 11.09.2017 00:00:00 3
7 11.09.2017 00:00:00 4
Demo here:
Rextester

Joining series of dates and counting continous days

Let's say I have a table as below
date add_days
2015-01-01 5
2015-01-04 2
2015-01-11 7
2015-01-20 10
2015-01-30 1
what I want to do is to check the days_balance, i.e. if date is greater or smaller than previous date + N days (add_days) and take the cumulated sum of days count if they are a continuous series.
So the algorithm should work like
for i in 2:N_rows {
days_balance[i] := date[i-1] + add_days[i-1] - date[i]
if days_balance[i] >= 0 then
date[i] := date[i] + days_balance[i]
}
The expected result should be as follows
date days_balance
2015-01-01 0
2015-01-04 2
2015-01-11 -3
2015-01-20 -2
2015-01-30 0
Is it possible in pure SQL? I imagine it should be with some conditional joins, but cannot see how it could be implemented.
I'm posting another answer since it may be nice to compare them since they use different methods (this one just does a n^2 style join, other one used a recursive CTE). This one takes advantage of the fact that you don't have to calculate the days_balance for each previous row before calculating it for a particular row, you just need to sum things from previous days....
drop table junk
create table junk(date DATETIME, add_days int)
insert into junk values
('2015-01-01',5 ),
('2015-01-04',2 ),
('2015-01-11',7 ),
('2015-01-20',10 ),
('2015-01-30',1 )
;WITH cte as
(
select ROW_NUMBER() OVER (ORDER BY date) i, date, add_days, ISNULL(DATEDIFF(DAY, LAG(date) OVER (ORDER BY date), date), 0) days_since_prev
FROM Junk
)
, combinedWithAllPreviousDaysCte as
(
select i [curr_i], date [curr_date], add_days [curr_add_days], days_since_prev [curr_days_since_prev], 0 [prev_add_days], 0 [prev_days_since_prev] from cte where i = 1 --get first row explicitly since it has no preceding rows
UNION ALL
select curr.i [curr_i], curr.date [curr_date], curr.add_days [curr_add_days], curr.days_since_prev [curr_days_since_prev], prev.add_days [prev_add_days], prev.days_since_prev [prev_days_since_prev]
from cte curr
join cte prev on curr.i > prev.i --join to all previous days
)
select curr_i, curr_date, SUM(prev_add_days) - curr_days_since_prev - SUM(prev_days_since_prev) [days_balance]
from combinedWithAllPreviousDaysCte
group by curr_i, curr_date, curr_days_since_prev
order by curr_i
outputs:
+--------+-------------------------+--------------+
| curr_i | curr_date | days_balance |
+--------+-------------------------+--------------+
| 1 | 2015-01-01 00:00:00.000 | 0 |
| 2 | 2015-01-04 00:00:00.000 | 2 |
| 3 | 2015-01-11 00:00:00.000 | -3 |
| 4 | 2015-01-20 00:00:00.000 | -5 |
| 5 | 2015-01-30 00:00:00.000 | -5 |
+--------+-------------------------+--------------+
Well, I think I have it with a recursive CTE (sorry, I only have Microsoft SQL Server available to me at the moment, so it may not comply with PostgreSQL).
Also I think the expected results you had were off (see comment above). If not, this can probably be modified to conform to your math.
drop table junk
create table junk(date DATETIME, add_days int)
insert into junk values
('2015-01-01',5 ),
('2015-01-04',2 ),
('2015-01-11',7 ),
('2015-01-20',10 ),
('2015-01-30',1 )
;WITH cte as
(
select ROW_NUMBER() OVER (ORDER BY date) i, date, add_days, ISNULL(DATEDIFF(DAY, LAG(date) OVER (ORDER BY date), date), 0) days_since_prev
FROM Junk
)
,recursiveCte (i, date, add_days, days_since_prev, days_balance, math) as
(
select top 1
i,
date,
add_days,
days_since_prev,
0 [days_balance],
CAST('no math for initial one, just has zero balance' as varchar(max)) [math]
from cte where i = 1
UNION ALL --recursive step now
select
curr.i,
curr.date,
curr.add_days,
curr.days_since_prev,
prev.days_balance - curr.days_since_prev + prev.add_days [days_balance],
CAST(prev.days_balance as varchar(max)) + ' - ' + CAST(curr.days_since_prev as varchar(max)) + ' + ' + CAST(prev.add_days as varchar(max)) [math]
from cte curr
JOIN recursiveCte prev ON curr.i = prev.i + 1
)
select i, DATEPART(day,date) [day], add_days, days_since_prev, days_balance, math
from recursiveCTE
order by date
And the results are like so:
+---+-----+----------+-----------------+--------------+------------------------------------------------+
| i | day | add_days | days_since_prev | days_balance | math |
+---+-----+----------+-----------------+--------------+------------------------------------------------+
| 1 | 1 | 5 | 0 | 0 | no math for initial one, just has zero balance |
| 2 | 4 | 2 | 3 | 2 | 0 - 3 + 5 |
| 3 | 11 | 7 | 7 | -3 | 2 - 7 + 2 |
| 4 | 20 | 10 | 9 | -5 | -3 - 9 + 7 |
| 5 | 30 | 1 | 10 | -5 | -5 - 10 + 10 |
+---+-----+----------+-----------------+--------------+------------------------------------------------+
I don’t quite get how your algorithm returns your expected results? But let me share a technique I came up with that might help.
This will only work if the end result of your data is to be exported to Excel, and even then it won’t work in all scenarios depending on what format you export your dataset in, but here it is....
If you’ll familiar with Excel Formulas, what I discovered is that if you write an Excel formula in your SQL as another field, it will execute that formula for you as soon as you export to excel (best method that works for me is just coping and pasting it into Excel, so that it doesn’t format it as text)
So for your example, here’s what you could do (noting again I don’t understand your algorithm, so this is probably wrong, but it’s just to give you the concept)
SELECT
date
, add_days
, '=INDEX($1:$65536,ROW()-1,COLUMN()-2)'
||'+INDEX($1:$65536,ROW()-1,COLUMN()-1)'
||'-INDEX($1:$65536,ROW(),COLUMN()-2)'
AS "days_balance[i]"
,'=IF(INDEX($1:$65536,ROW(),COLUMN()-1)>=0'
||',INDEX($1:$65536,ROW(),COLUMN()-3)'
||'+INDEX($1:$65536,ROW(),COLUMN()-1))'
AS "date[i]"
FROM
myTable
ORDER BY /*Ensure to order by whatever you need for your formula to work*/
The key part to making this work is using the INDEX formula function to select a cell based on the position of the current cell. So ROW()-1 tells it get me the result of the previous record, and COLUMN()-2 means take the value from two columns to the left of the current. Because you can't use cell references like A2+B2-A3 because the row numbers won't change on export, and it assumes the position of the columns.
I used SQL string concatenation with || just so it's easier to read on screen.
I tried this one in excel; it didn’t match your expected results. But if this technique works for you then just correct the excel formula to suit.

Having issues joing a table with a recursive function in Sqlite

I'm building a complex query but I have a problem...
Pratically, I retrieve a dates range from recursive function in sqlite:
WITH RECURSIVE dates(d)
AS (VALUES('2014-05-01')
UNION ALL
SELECT date(d, '+1 day')
FROM dates
WHERE d < '2014-05-5')
SELECT d AS date FROM dates
This is the result:
2014-05-01
2014-05-02
2014-05-03
2014-05-04
2014-05-05
I would join this query on other query, about this:
select date_column, column1, column2 from table
This is the result:
2014-05-03 column_value1 column_value2
Finally, I would like to see a similar result in output (join first query and date_column of second query):
2014-05-01 | | |
2014-05-02 | | |
2014-05-03 | column_value1 | column_value2 |
2014-05-04 | | |
2014-05-05 | | |
How can I obtain this result?
Thanks!!!
Why don't you simply do something like this ?
WITH RECURSIVE dates(d)
AS (VALUES('2014-05-01')
UNION ALL
SELECT date(d, '+1 day')
FROM dates
WHERE d < '2014-05-5')
SELECT
dates.d AS date
,table.column1
,table.column2
FROM dates
left join table
ON strftime("%Y-%m-%d", table.date_column) = dates.date
Perhaps you will need to convert your date...