Duplicate row based on date difference - sql

Help needed. I have a set of data imported from Oracle into Tableau for calculation. To do that, I need to duplicate rows as shown in the table below. For example, if there is a date difference between start and end, I need to duplicate the row and assign codes 0, 1, and so on, depending on the number of days between them. I need this in Tableau for a time interval calculation. Thanks.

Pregenerate the codes up to the maximum possible value, then join the original table to that code series, so that the number of copies of each row is determined by the difference between the dates on that row:
with t (s,e) as (
select timestamp '2020-08-16 18:30:00', timestamp '2020-08-16 20:00:00' from dual union all
select timestamp '2020-08-17 08:00:00', timestamp '2020-08-18 08:00:00' from dual union all
select timestamp '2020-08-19 08:00:00', timestamp '2020-08-19 00:00:00' from dual union all
select timestamp '2020-08-20 10:00:00', timestamp '2020-08-22 03:00:00' from dual
), series (code) as (
-- codes 0 .. largest day difference found in t
select level - 1 from dual connect by level <= (select max(trunc(e) - trunc(s)) + 1 from t)
)
select t.*, series.code
from t
join series on trunc(e) - trunc(s) >= series.code
order by s,code;
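The same row-duplication logic can be sketched outside the database. This is a minimal Python illustration using the sample rows from the query above; the function name `duplicate_by_day_diff` is made up for the example, and `.date()` stands in for Oracle's `trunc()`.

```python
from datetime import datetime

# Sample rows copied from the query above: (start, end) pairs.
rows = [
    (datetime(2020, 8, 16, 18, 30), datetime(2020, 8, 16, 20, 0)),
    (datetime(2020, 8, 17, 8, 0), datetime(2020, 8, 18, 8, 0)),
    (datetime(2020, 8, 19, 8, 0), datetime(2020, 8, 19, 0, 0)),
    (datetime(2020, 8, 20, 10, 0), datetime(2020, 8, 22, 3, 0)),
]


def duplicate_by_day_diff(rows):
    """Emit each row once per code 0..n, where n is the calendar-day difference."""
    out = []
    for s, e in rows:
        # Counterpart of trunc(e) - trunc(s): whole calendar days, time of day ignored.
        day_diff = (e.date() - s.date()).days
        for code in range(day_diff + 1):
            out.append((s, e, code))
    return out


result = duplicate_by_day_diff(rows)
for s, e, code in result:
    print(s, e, code)
```

The first and third rows stay single (code 0), the second is emitted twice (0, 1) and the fourth three times (0, 1, 2).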

Related

Dynamic LAG function (Standard SQL, BigQuery). Is it possible?

I am trying hard to find a solution for this. I've attached an image with an overview of what I want, but I will describe it here too.
In LAG function, is it possible to have a dynamic number in the syntax?
LAG(sessions, 3)
Instead of using 3, I need the value of the column minutosdelift, which is 3 in this example but will be different in each situation.
I've tried LAG(sessions, minutosdelift), but it is not possible. I've tried LAG(sessions, COUNT(minutosdelift)), and it is not possible either.
The final goal is to calculate the relative difference between 52 and 6: (52/6)-1, which gives me 767%. To do that I need a dynamic offset in the LAG function (or another way to do it).
I've tried using ROWS PRECEDING AND ROWS UNBOUNDED PRECEDING, but again it needs a literal number.
Please, any idea about how to do it? Thanks!
[Screenshot: overview of the expected result]
My code: this is the last query I've tried, because I have 7 previous views
SELECT
DATE, HOUR, MINUTE, SESSIONS, PROGRAMA_2,
janela_lift_teste, soma_sessao_programa_2, minutosdelift,
CASE
WHEN minutosdelift != 0
THEN LAG(sessions, 3) OVER(ORDER BY DATE, HOUR, MINUTE ASC)
END AS lagtest,
CASE
WHEN programa_2 = "#N/A" OR programa_2 is null
THEN LAST_VALUE(sessions) OVER (PARTITION BY programa_2 ORDER BY DATE, HOUR, MINUTE ASC)
END AS firstvaluetest,
FROM
tbl8
GROUP BY
DATE, HOUR, MINUTE, SESSIONS, PROGRAMA_2,
janela_lift_teste, minutosdelift, soma_sessao_programa_2
ORDER BY
DATE, HOUR, MINUTE ASC
In BigQuery (as in some other databases), the argument to lag() has to be a constant.
One method to get around this uses a self join. I find it hard to follow your query, but the idea is:
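with tt as (
-- number the rows in the same order the lag() call would use
select row_number() over (order by date, hour, minute) as seqnum,
t.*
from t
)
-- left join keeps rows that have no earlier match, as lag() would (NULL)
select cur.*, prev.sessions as lagtest
from tt cur left join
tt prev
on prev.seqnum = cur.seqnum - cur.minutosdelift;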
Consider below example - hope you can apply this approach to your use case
#standardSQL
with `project.dataset.table` as (
select 1 session, timestamp '2021-01-01 00:01:00' ts, 10 minutosdelift union all
select 2, '2021-01-01 00:02:00', 1 union all
select 3, '2021-01-01 00:03:00', 2 union all
select 4, '2021-01-01 00:04:00', 3 union all
select 5, '2021-01-01 00:05:00', 4 union all
select 6, '2021-01-01 00:06:00', 5 union all
select 7, '2021-01-01 00:07:00', 3 union all
select 8, '2021-01-01 00:08:00', 1 union all
select 9, '2021-01-01 00:09:00', 2 union all
select 10, '2021-01-01 00:10:00', 8 union all
select 11, '2021-01-01 00:11:00', 6 union all
select 12, '2021-01-01 00:12:00', 4 union all
select 13, '2021-01-01 00:13:00', 2 union all
select 14, '2021-01-01 00:14:00', 1 union all
select 15, '2021-01-01 00:15:00', 11 union all
select 16, '2021-01-01 00:16:00', 1 union all
select 17, '2021-01-01 00:17:00', 8
)
select a.*, b.session as lagtest
from `project.dataset.table` a
left join `project.dataset.table` b
on b.ts = timestamp_sub(a.ts, interval a.minutosdelift minute)
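To see what the join on a computed timestamp produces for the sample data above, here is a small Python sketch of the same lookup. The dictionary `by_ts` and the function `dynamic_lag` are illustrative names only; the data is the 17-row sample from the query.

```python
from datetime import datetime, timedelta

# minutosdelift values from the sample data above; session i sits at minute i.
lifts = [10, 1, 2, 3, 4, 5, 3, 1, 2, 8, 6, 4, 2, 1, 11, 1, 8]
base = datetime(2021, 1, 1, 0, 0)
rows = [
    {"session": i, "ts": base + timedelta(minutes=i), "minutosdelift": lift}
    for i, lift in enumerate(lifts, start=1)
]


def dynamic_lag(rows):
    """For each row, fetch the session found minutosdelift minutes earlier."""
    # Indexing by timestamp plays the role of the left join condition.
    by_ts = {r["ts"]: r for r in rows}
    out = []
    for r in rows:
        target = r["ts"] - timedelta(minutes=r["minutosdelift"])
        prev = by_ts.get(target)  # None mirrors the NULL of an unmatched left join
        out.append(prev["session"] if prev is not None else None)
    return out


lagtest = dynamic_lag(rows)
for r, lag in zip(rows, lagtest):
    print(r["session"], r["minutosdelift"], lag)
```

Note that this looks back by elapsed minutes, not by row count, so it matches LAG only when there is exactly one row per minute.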

Problem with date matching between a variable and a date column Oracle SQL

The following query works fine:
CREATE TABLE PROCESGEN_TEST
(PROCESENDTIME TIMESTAMP);
INSERT INTO PROCESGEN_TEST
(SELECT DISTINCT PROCESENDTIME FROM dwh_procesgeneriek#xob10
WHERE PROCESENDTIME IS NOT NULL
AND PROCESENDTIME > '10-09-2020 01:00:00');
Def TIME2 = (SELECT MAX_EXEC_TIME FROM EXEC_TIME);
SELECT PROCESENDTIME
FROM PROCESGEN_TEST
WHERE PROCESENDTIME < &TIME2
AND PROCESENDTIME IS NOT NULL
In the situation above, we first put the data into a table created in the database management system we use (named xor01, not xob10). In the query below, we extract the data directly from xob10. This does not work when we want to select on a date (greater or less than a given value), and we don't know why.
CREATE TABLE EXEC_TIME
(MAX_EXEC_TIME DATE);
INSERT INTO EXEC_TIME
(
SELECT TO_DATE(
TO_CHAR(
MAX(EXEC_DATE),
'DD-MM-YYYY HH24:MI:SS'
),
'DD-MM-YYYY HH24:MI:SS'
) - 1.1666
from L3DD_MIN_ACTIVITIES_BRD_BAK_Will
);
Def TIME3 = (SELECT MAX_EXEC_TIME FROM EXEC_TIME);
SELECT PROCESENDTIME
FROM dwh_procesgeneriek#xob10
WHERE PROCESENDTIME IS NOT NULL
AND TO_DATE(PROCESENDTIME,'DD-MM-YY HH24:MI:SS')
> TO_DATE(&TIME3, 'DD-MM-YY HH24:MI:SS');
The problem is that the last query does not find a single date and keeps on executing. If we replace TO_DATE(&TIME3, 'DD-MM-YY HH24:MI:SS') with a specific date such as '10-08-2020 20:00:00', the query finds the right dates. We have tried all kinds of things, such as working with the TIMESTAMP format and TO_TIMESTAMP. Nothing works. It looks like a rather simple problem.
Does anyone know why the second query can't find any dates?
You don't need:
the EXEC_TIME table;
to convert a timestamp to a string and then back to a date;
to use a variable; or
to filter on PROCESENDTIME IS NOT NULL (since a > comparison is never true for NULL values).
Then you can use:
SELECT PROCESENDTIME
FROM dwh_procesgeneriek#xob10
WHERE PROCESENDTIME
> (
SELECT MAX(EXEC_DATE) - INTERVAL '1 4' DAY TO HOUR
FROM L3DD_MIN_ACTIVITIES_BRD_BAK_Will
);
If you do want the EXEC_TIME table then:
CREATE TABLE EXEC_TIME( MAX_EXEC_TIME DATE );
INSERT INTO EXEC_TIME
SELECT MAX(EXEC_DATE) - INTERVAL '1 4' DAY TO HOUR
FROM L3DD_MIN_ACTIVITIES_BRD_BAK_Will;
SELECT PROCESENDTIME
FROM dwh_procesgeneriek#xob10
WHERE PROCESENDTIME > ( SELECT MAX_EXEC_TIME FROM EXEC_TIME );
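The question subtracts 1.1666 days, while the queries above subtract an interval of 1 day and 4 hours. A quick Python check (purely illustrative, using `timedelta` as a stand-in for Oracle's date arithmetic) shows the two are close but not identical, which is one reason the interval literal reads more clearly:

```python
from datetime import timedelta

fractional = timedelta(days=1.1666)     # what "- 1.1666" subtracts
interval = timedelta(days=1, hours=4)   # INTERVAL '1 4' DAY TO HOUR

# How far the decimal approximation falls short of exactly 1 day 4 hours.
drift = interval - fractional

print(fractional)
print(drift.total_seconds())
```

The decimal form falls a little under six seconds short of 1 day 4 hours.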

SQL query to sum timestamp diff with null handling

I have an Oracle DB table with two columns, t1 and t2, both of datatype timestamp. Column t2 is nullable. I need a SQL query that gives me something like the pseudocode below.
sum ((t2 if t2 !=null else sys.currentTimestamp) - t1)
There are two issues here. Firstly, how to substitute a default value for a null. That's easy, we have nvl and coalesce. For example:
with demo (t1, t2) as
( select timestamp '2020-01-01 00:00:00'
, timestamp '2020-01-01 01:02:03'
from dual
union all
select timestamp '2020-01-01 00:00:00', null from dual )
select t1
, t2
, nvl(t2, current_timestamp)
from demo;
T1                     T2                     NVL(T2,CURRENT_TIMESTAMP)
---------------------- ---------------------- -----------------------------
2020-01-01 00:00:00.00 2020-01-01 01:02:03.00 2020-01-01 01:02:03.000000000
2020-01-01 00:00:00.00                        2020-08-28 11:25:22.989000000
The harder part is how to sum nvl(t2, current_timestamp) - t1. The difference between two timestamps is an interval day to second, and although you can do arithmetic with intervals (add, subtract, multiply, etc.), you can't currently sum them. (You can add your vote to this suggestion on the Oracle Database Ideas forum to get the functionality added.)
In the meantime, you can either write your own using the Oracle Data Cartridge Interface, or use a workaround such as this one from Stew Stryker:
with demo (t1, t2) as
( select timestamp '2020-01-01 00:00:00'
, timestamp '2020-01-01 01:02:03'
from dual
union all
select timestamp '2020-01-01 00:00:00', null from dual )
select numtodsinterval(
sum(
((sysdate + (nvl(t2,systimestamp) -t1)) - sysdate) * 86400
+ extract(second from (nvl(t2,systimestamp) -t1))
- trunc(extract(second from (nvl(t2,systimestamp) -t1)))
)
, 'second'
) as duration
from demo;
DURATION
--------------------------------------------------------------------------------
+000000240 12:37:23.646000000
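For comparison, the pseudocode from the question translates almost directly into a language whose duration type can be summed. The sketch below is Python, not Oracle; `total_duration` is a made-up name, and `now` is a fixed, assumed value so that the result is reproducible.

```python
from datetime import datetime, timedelta


def total_duration(rows, now):
    """Sum (t2 or now) - t1 over all rows; t2 may be None for open-ended rows."""
    total = timedelta()
    for t1, t2 in rows:
        end = t2 if t2 is not None else now  # same role as nvl(t2, systimestamp)
        total += end - t1
    return total


demo = [
    (datetime(2020, 1, 1, 0, 0, 0), datetime(2020, 1, 1, 1, 2, 3)),
    (datetime(2020, 1, 1, 0, 0, 0), None),
]
# Assumed "current time" for the example; a real run would use datetime.now().
now = datetime(2020, 1, 2, 0, 0, 0)

total = total_duration(demo, now)
print(total)
```

With the assumed `now`, the first row contributes 1:02:03 and the second exactly one day.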

Reason for using trunc function on dates in Oracle

I am currently working on a project that uses an Oracle database. I have observed in the application code that dates are almost never used directly. Instead, they are always used in conjunction with the trunc function (TRUNC(SYSDATE), TRUNC(event_date), etc.).
Can anyone explain the reason behind using the trunc function instead of using the date directly?
A DATE in Oracle has not only a date part, but also a time part. This can lead to surprising results when querying data, e.g. the query
with v_data(pk, dt) as (
select 1, to_date('2014-06-25 09:00:00', 'YYYY-MM-DD hh24:mi:ss') from dual union all
select 2, to_date('2014-06-26 09:00:00', 'YYYY-MM-DD hh24:mi:ss') from dual union all
select 3, to_date('2014-06-27 09:00:00', 'YYYY-MM-DD hh24:mi:ss') from dual)
select * from v_data where dt = date '2014-06-25'
will return no rows, since you're comparing to 2014-06-25 at midnight.
The usual workaround for this is to use TRUNC() to get rid of the time part:
with v_data(pk, dt) as (
select 1, to_date('2014-06-25 09:00:00', 'YYYY-MM-DD hh24:mi:ss') from dual union all
select 2, to_date('2014-06-26 09:00:00', 'YYYY-MM-DD hh24:mi:ss') from dual union all
select 3, to_date('2014-06-27 09:00:00', 'YYYY-MM-DD hh24:mi:ss') from dual)
select * from v_data where trunc(dt) = date '2014-06-25'
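The effect of the time part is easy to reproduce outside Oracle. In this Python sketch (illustrative only), `datetime` plays the role of Oracle's DATE and `.date()` plays the role of `TRUNC()`:

```python
from datetime import date, datetime

# Same three rows as v_data above: (pk, dt).
rows = [
    (1, datetime(2014, 6, 25, 9, 0, 0)),
    (2, datetime(2014, 6, 26, 9, 0, 0)),
    (3, datetime(2014, 6, 27, 9, 0, 0)),
]

midnight = datetime(2014, 6, 25)  # what date '2014-06-25' denotes: 00:00:00 that day
target = date(2014, 6, 25)

# Comparing the full value: 09:00 never equals midnight.
exact = [pk for pk, dt in rows if dt == midnight]
# Comparing after dropping the time part, as trunc(dt) does.
truncated = [pk for pk, dt in rows if dt.date() == target]

print(exact, truncated)
```

The first comparison matches nothing, while the truncated comparison finds row 1.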
Other, somewhat less frequently used approaches for this problem include:
convert both dates with to_char('YYYY-MM-DD') and check for equality
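use a between clause: WHERE dt between date '2014-06-25' and date '2014-06-26' (BETWEEN is inclusive, so this also matches a value of exactly midnight on 2014-06-26; the half-open range dt >= date '2014-06-25' and dt < date '2014-06-26' avoids that)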
You use the trunc() function to remove the time component of the date. By default, the date data type in Oracle stores both dates and times.
The trunc() function also takes a format argument, so you can remove other components of the date, not just the time. For instance, you can truncate to the hour. However, without the format, the purpose is to remove the time component.
If the column in your table, for example event_date, is indexed, then avoid using trunc on the column, because Oracle then can't use the index (alternatively, you can create a function-based index).
so do not do:
select *
from mytable
where trunc(event_date) <= date '2014-01-01'
but instead do
select *
from mytable
where event_date < date '2014-01-02'
In the second case, Oracle can do a range scan on the index on event_date; in the first case, it has to do a full table scan.
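When moving trunc off the column, the boundary has to be adjusted with care. The Python sketch below (a hypothetical `trunc` helper standing in for Oracle's TRUNC) checks the two equivalences on a few boundary values: `trunc(d) <= X` is the same as `d < X + 1 day`, while `trunc(d) < X` is simply `d < X`.

```python
from datetime import datetime, timedelta


def trunc(dt):
    """Drop the time part, like Oracle's TRUNC(date) without a format."""
    return dt.replace(hour=0, minute=0, second=0, microsecond=0)


cutoff = datetime(2014, 1, 1)
samples = [
    datetime(2013, 12, 31, 23, 59, 59),
    datetime(2014, 1, 1, 0, 0, 0),
    datetime(2014, 1, 1, 18, 30, 0),
    datetime(2014, 1, 2, 0, 0, 0),
]

# Each entry is True when the trunc() form and the bare-column form agree.
inclusive_ok = [
    (trunc(dt) <= cutoff) == (dt < cutoff + timedelta(days=1)) for dt in samples
]
strict_ok = [(trunc(dt) < cutoff) == (dt < cutoff) for dt in samples]

print(all(inclusive_ok), all(strict_ok))
```

Both rewrites leave the column bare on the left-hand side, which is what allows an ordinary index on it to be used.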

Oracle order by year, month

I have a query which I want to order by year and then month. I have tried order by to_date( depdate, 'mm' ) and TO_CHAR(depdate, 'YYYY/MM'). Here is an sqlfiddle with the table I am querying and the query itself.
You want to sort by the date value, not by the character string representation. That means that you also want to group by the date value. trunc(<<date column>>, 'mm') truncates a date to midnight on the first of the month. So something like this
SELECT to_char(trunc(DEPDATE,'MM'), 'Mon-YYYY') AS MONTH,
SUM(AMOUNTROOM) AS ROOMTOTAL,
SUM(AMOUNTEXTRAS) AS EXTRATOTAL,
SUM(AMOUNTEXTRAS + AMOUNTROOM) AS OATOTAL
FROM checkins
WHERE checkinstatus = 'D' AND depdate > TO_DATE('2013-12-01', 'yyyy/mm/dd')
AND depdate <= TO_DATE('2014-04-10', 'yyyy/mm/dd')
GROUP BY trunc(depdate,'mm')
ORDER BY trunc(depdate,'mm');
should be what you're looking for. See the updated fiddle
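Why sorting on the date value matters can be shown with a few month labels. This Python sketch is illustrative only; `MONTHS` and `label` are made-up helpers that imitate the 'Mon-YYYY' output format.

```python
from datetime import date

MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]


def label(d):
    """Format a date like to_char(d, 'Mon-YYYY')."""
    return f"{MONTHS[d.month - 1]}-{d.year}"


# First day of each month in the queried range, like trunc(depdate, 'mm').
month_starts = [
    date(2013, 12, 1), date(2014, 1, 1), date(2014, 2, 1),
    date(2014, 3, 1), date(2014, 4, 1),
]

by_label = [label(d) for d in sorted(month_starts, key=label)]  # string order
by_date = [label(d) for d in sorted(month_starts)]              # date order

print(by_label)
print(by_date)
```

Ordering by the label puts Apr-2014 first and Dec-2013 second, whereas ordering by the date keeps the months chronological.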
Check out this query. If it is a date field, a plain ORDER BY works. You do not need to use TO_CHAR to convert to a string and then sort:
WITH TAB AS
(
SELECT SYSDATE DATEVAL FROM DUAL
UNION
SELECT SYSDATE + 100 DATEVAL FROM DUAL
UNION
SELECT SYSDATE -500 DATEVAL FROM DUAL
UNION
SELECT SYSDATE + 30 DATEVAL FROM DUAL
UNION
SELECT SYSDATE -30 DATEVAL FROM DUAL
) SELECT * FROM TAB
ORDER BY DATEVAL DESC