Generate a table of monthly intervals for a year - Oracle - sql

I have to create a table in the following format:
TS_RANGE_BEGIN TS_RANGE_END
2019-01-01 17:00:00 2019-01-31 17:00:00
2019-02-01 17:00:00 2019-02-28 17:00:00
2019-03-01 17:00:00 2019-03-31 17:00:00
Could you please help with this?

Looks like a simple row generator:
SQL> alter session set nls_date_format = 'yyyy-mm-dd hh24:mi:ss';
Session altered.
SQL> with std (datum) as
  (select to_date('01.01.2019 17:00', 'dd.mm.yyyy hh24:mi') from dual)
select add_months(datum, level - 1) ts_range_begin,
       add_months(datum, level) - 1 ts_range_end
from std
connect by level <= 12;
TS_RANGE_BEGIN TS_RANGE_END
------------------- -------------------
2019-01-01 17:00:00 2019-01-31 17:00:00
2019-02-01 17:00:00 2019-02-28 17:00:00
2019-03-01 17:00:00 2019-03-31 17:00:00
2019-04-01 17:00:00 2019-04-30 17:00:00
2019-05-01 17:00:00 2019-05-31 17:00:00
2019-06-01 17:00:00 2019-06-30 17:00:00
2019-07-01 17:00:00 2019-07-31 17:00:00
2019-08-01 17:00:00 2019-08-31 17:00:00
2019-09-01 17:00:00 2019-09-30 17:00:00
2019-10-01 17:00:00 2019-10-31 17:00:00
2019-11-01 17:00:00 2019-11-30 17:00:00
2019-12-01 17:00:00 2019-12-31 17:00:00
12 rows selected.
SQL>
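The STD CTE sets the starting date. CONNECT BY LEVEL <= 12 generates one row per month, and subtracting 1 from the start of the next month gives the last day of the current month.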
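To check the date arithmetic outside the database, here is a minimal Python sketch of the same row-generator idea. The names add_months and month_ranges are made up for this illustration, and the sketch assumes the start date falls on the first of a month, as in the query above.

```python
from datetime import datetime, timedelta

def add_months(dt, n):
    """Shift dt by n months; assumes dt.day == 1 so no day clamping is needed."""
    month_index = dt.month - 1 + n
    return dt.replace(year=dt.year + month_index // 12, month=month_index % 12 + 1)

def month_ranges(start, months=12):
    """Yield (ts_range_begin, ts_range_end) pairs, one per month."""
    for level in range(1, months + 1):
        begin = add_months(start, level - 1)
        # Start of the next month minus one day = last day of this month.
        end = add_months(start, level) - timedelta(days=1)
        yield begin, end

rows = list(month_ranges(datetime(2019, 1, 1, 17, 0)))
for begin, end in rows[:3]:
    print(begin, end)
```

The loop variable is called level only to mirror the CONNECT BY LEVEL pseudo-column in the Oracle query.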


Analyze a Time Series

I am inserting data into a table with a date/time column.
I want to find the rate of inserts for each period, as follows:
Duration # of Records
1:00pm - 2:00PM 1000
2:00pm - 3:00PM 1400
.......................
11:00PM- 12:00 1100
I can get this by repeatedly running a query like the following:
select count(*) from table_A where insert_date between 1:00pm and 2:00pm
Is there an Oracle-supplied package or function that can produce the above report without having to execute separate statements?
Here are a couple of examples. To get "sparse" results, i.e. only the hours that actually have rows in the table, simply group by TRUNC(d,'HH24'):
SQL> create table data ( d date );
Table created.
SQL>
SQL> insert into data
select date '2022-02-10' + dbms_random.normal/10
from dual
connect by level <= 10000;
10000 rows created.
SQL>
SQL> select trunc(d,'HH24'), count(*)
from data
group by trunc(d,'HH24')
order by 1;
TRUNC(D,'HH24') COUNT(*)
------------------- ----------
09/02/2022 13:00:00 1
09/02/2022 15:00:00 4
09/02/2022 16:00:00 10
09/02/2022 17:00:00 40
09/02/2022 18:00:00 126
09/02/2022 19:00:00 282
09/02/2022 20:00:00 595
09/02/2022 21:00:00 948
09/02/2022 22:00:00 1389
09/02/2022 23:00:00 1577
10/02/2022 00:00:00 1609
10/02/2022 01:00:00 1362
10/02/2022 02:00:00 956
10/02/2022 03:00:00 624
10/02/2022 04:00:00 281
10/02/2022 05:00:00 134
10/02/2022 06:00:00 43
10/02/2022 07:00:00 16
10/02/2022 08:00:00 2
10/02/2022 10:00:00 1
20 rows selected.
If you need ALL hours, even those with no data, you can OUTER JOIN the raw data to a synthetic list of rows covering every hour in the desired range (here 48 hours, starting at 01:00 because rownum starts at 1), e.g.
SQL> with full_range as
  ( select date '2022-02-09' + rownum/24 hr
    from dual
    connect by level <= 48
  ),
raw_data as
  ( select trunc(d,'HH24') dhr, count(*) cnt
    from data
    group by trunc(d,'HH24')
  )
select full_range.hr, raw_data.cnt
from raw_data, full_range
where full_range.hr = raw_data.dhr(+)
order by 1;
HR CNT
------------------- ----------
09/02/2022 01:00:00
09/02/2022 02:00:00
09/02/2022 03:00:00
09/02/2022 04:00:00
09/02/2022 05:00:00
09/02/2022 06:00:00
09/02/2022 07:00:00
09/02/2022 08:00:00
09/02/2022 09:00:00
09/02/2022 10:00:00
09/02/2022 11:00:00
09/02/2022 12:00:00
09/02/2022 13:00:00 1
09/02/2022 14:00:00
09/02/2022 15:00:00 4
09/02/2022 16:00:00 10
09/02/2022 17:00:00 40
09/02/2022 18:00:00 126
09/02/2022 19:00:00 282
09/02/2022 20:00:00 595
09/02/2022 21:00:00 948
09/02/2022 22:00:00 1389
09/02/2022 23:00:00 1577
10/02/2022 00:00:00 1609
10/02/2022 01:00:00 1362
10/02/2022 02:00:00 956
10/02/2022 03:00:00 624
10/02/2022 04:00:00 281
10/02/2022 05:00:00 134
10/02/2022 06:00:00 43
10/02/2022 07:00:00 16
10/02/2022 08:00:00 2
10/02/2022 09:00:00
10/02/2022 10:00:00 1
10/02/2022 11:00:00
10/02/2022 12:00:00
10/02/2022 13:00:00
10/02/2022 14:00:00
10/02/2022 15:00:00
10/02/2022 16:00:00
10/02/2022 17:00:00
10/02/2022 18:00:00
10/02/2022 19:00:00
10/02/2022 20:00:00
10/02/2022 21:00:00
10/02/2022 22:00:00
10/02/2022 23:00:00
11/02/2022 00:00:00
48 rows selected.
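The sparse and dense variants can also be sketched in plain Python, which may help when checking the logic on a small sample. The function names hourly_counts and dense_hourly_counts are hypothetical, and the three sample timestamps are invented for the illustration.

```python
from collections import Counter
from datetime import datetime, timedelta

def hourly_counts(timestamps):
    """Sparse result: count rows per hour, like GROUP BY TRUNC(d,'HH24')."""
    return Counter(ts.replace(minute=0, second=0, microsecond=0) for ts in timestamps)

def dense_hourly_counts(timestamps, first_hour, hours):
    """Dense result: one entry per hour in the range, None where no rows exist."""
    counts = hourly_counts(timestamps)
    full_range = [first_hour + timedelta(hours=i) for i in range(hours)]
    # dict.get returns None for missing hours, mirroring the NULLs of the outer join.
    return [(hr, counts.get(hr)) for hr in full_range]

sample = [
    datetime(2022, 2, 9, 13, 5),
    datetime(2022, 2, 9, 15, 10),
    datetime(2022, 2, 9, 15, 40),
]
report = dense_hourly_counts(sample, datetime(2022, 2, 9, 13), 3)
for hr, cnt in report:
    print(hr, cnt)
```

The hour 14:00 appears in the report with no count, which is the behaviour the outer join to the synthetic hour list provides.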

Get max data for every day in BigQuery

I have a table with daily data by hour. I want to get a table with only one row per day. That row should have the max value for the column AforoTotal.
This is a part of the table, containing the records of three days.
FechaHora            Fecha      Hora     AforoTotal
2022-01-13T16:00:00Z 2022-01-13 16:00:00 4532
2022-01-13T15:00:00Z 2022-01-13 15:00:00 4419
2022-01-13T14:00:00Z 2022-01-13 14:00:00 4181
2022-01-13T13:00:00Z 2022-01-13 13:00:00 3914
2022-01-13T12:00:00Z 2022-01-13 12:00:00 3694
2022-01-13T11:00:00Z 2022-01-13 11:00:00 3268
2022-01-13T10:00:00Z 2022-01-13 10:00:00 2869
2022-01-13T09:00:00Z 2022-01-13 09:00:00 2065
2022-01-13T08:00:00Z 2022-01-13 08:00:00 1308
2022-01-13T07:00:00Z 2022-01-13 07:00:00 730
2022-01-13T06:00:00Z 2022-01-13 06:00:00 251
2022-01-13T05:00:00Z 2022-01-13 05:00:00 95
2022-01-13T04:00:00Z 2022-01-13 04:00:00 44
2022-01-13T03:00:00Z 2022-01-13 03:00:00 35
2022-01-13T02:00:00Z 2022-01-13 02:00:00 28
2022-01-13T01:00:00Z 2022-01-13 01:00:00 6
2022-01-13T00:00:00Z 2022-01-13 00:00:00 -18
2022-01-12T23:00:00Z 2022-01-12 23:00:00 1800
2022-01-12T22:00:00Z 2022-01-12 22:00:00 2042
2022-01-12T21:00:00Z 2022-01-12 21:00:00 2358
2022-01-12T20:00:00Z 2022-01-12 20:00:00 2827
2022-01-12T19:00:00Z 2022-01-12 19:00:00 3681
2022-01-12T18:00:00Z 2022-01-12 18:00:00 4306
2022-01-12T17:00:00Z 2022-01-12 17:00:00 4377
2022-01-12T16:00:00Z 2022-01-12 16:00:00 4428
2022-01-12T15:00:00Z 2022-01-12 15:00:00 4424
2022-01-12T14:00:00Z 2022-01-12 14:00:00 4010
2022-01-12T13:00:00Z 2022-01-12 13:00:00 3826
2022-01-12T12:00:00Z 2022-01-12 12:00:00 3582
2022-01-12T11:00:00Z 2022-01-12 11:00:00 3323
2022-01-12T10:00:00Z 2022-01-12 10:00:00 2805
2022-01-12T09:00:00Z 2022-01-12 09:00:00 2159
2022-01-12T08:00:00Z 2022-01-12 08:00:00 1378
2022-01-12T07:00:00Z 2022-01-12 07:00:00 790
2022-01-12T06:00:00Z 2022-01-12 06:00:00 317
2022-01-12T05:00:00Z 2022-01-12 05:00:00 160
2022-01-12T04:00:00Z 2022-01-12 04:00:00 106
2022-01-12T03:00:00Z 2022-01-12 03:00:00 95
2022-01-12T02:00:00Z 2022-01-12 02:00:00 86
2022-01-12T01:00:00Z 2022-01-12 01:00:00 39
2022-01-12T00:00:00Z 2022-01-12 00:00:00 0
2022-01-11T23:00:00Z 2022-01-11 23:00:00 2032
2022-01-11T22:00:00Z 2022-01-11 22:00:00 2109
2022-01-11T21:00:00Z 2022-01-11 21:00:00 2362
2022-01-11T20:00:00Z 2022-01-11 20:00:00 2866
2022-01-11T19:00:00Z 2022-01-11 19:00:00 3948
2022-01-11T18:00:00Z 2022-01-11 18:00:00 4532
2022-01-11T17:00:00Z 2022-01-11 17:00:00 4590
2022-01-11T16:00:00Z 2022-01-11 16:00:00 4821
2022-01-11T15:00:00Z 2022-01-11 15:00:00 4770
2022-01-11T14:00:00Z 2022-01-11 14:00:00 4405
2022-01-11T13:00:00Z 2022-01-11 13:00:00 4040
2022-01-11T12:00:00Z 2022-01-11 12:00:00 3847
2022-01-11T11:00:00Z 2022-01-11 11:00:00 3414
2022-01-11T10:00:00Z 2022-01-11 10:00:00 2940
2022-01-11T09:00:00Z 2022-01-11 09:00:00 2105
2022-01-11T08:00:00Z 2022-01-11 08:00:00 1353
2022-01-11T07:00:00Z 2022-01-11 07:00:00 739
2022-01-11T06:00:00Z 2022-01-11 06:00:00 248
2022-01-11T05:00:00Z 2022-01-11 05:00:00 91
2022-01-11T04:00:00Z 2022-01-11 04:00:00 63
2022-01-11T03:00:00Z 2022-01-11 03:00:00 46
2022-01-11T02:00:00Z 2022-01-11 02:00:00 42
2022-01-11T01:00:00Z 2022-01-11 01:00:00 18
2022-01-11T00:00:00Z 2022-01-11 00:00:00 5
My expected result is:
FechaHora            Fecha      Hora     AforoTotal
2022-01-13T16:00:00Z 2022-01-13 16:00:00 4532
2022-01-12T16:00:00Z 2022-01-12 16:00:00 4428
2022-01-11T17:00:00Z 2022-01-11 17:00:00 4590
Consider the approach below:
select as value
array_agg(t order by AforoTotal desc limit 1)[offset(0)]
from your_table t
group by Fecha
Applied to the sample data in your question, this returns one row per Fecha: the row with the highest AforoTotal.
Another way, which is a little more costly:
It works when the combination of Fecha and MAX(AforoTotal) is unique.
In the given example, it is unique.
SELECT * FROM your_table
WHERE Fecha||AforoTotal
IN
(SELECT Fecha||MAX( AforoTotal ) FROM your_table GROUP BY Fecha);
[Output]
https://i.stack.imgur.com/IFzWA.jpg
Thanks for your approach. This can be saved as a view in BigQuery and I can use it in DataStudio. I have not tested what happens when the combination is not unique; I will see how it behaves.
I think you can do something like this, though I haven't tested it:
-- DISTINCT collapses the result to one row per Fecha; without it every input row is returned.
SELECT DISTINCT
  LAST_VALUE(FechaHora) OVER (PARTITION BY Fecha ORDER BY AforoTotal ASC ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS FechaHora,
  Fecha,
  LAST_VALUE(Hora) OVER (PARTITION BY Fecha ORDER BY AforoTotal ASC ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS Hora,
  LAST_VALUE(AforoTotal) OVER (PARTITION BY Fecha ORDER BY AforoTotal ASC ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS AforoTotal
FROM your_table
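The "one row per day, keeping the row with the highest AforoTotal" logic can be sketched in plain Python to make the intended result concrete. This is only an illustration: max_row_per_day is a hypothetical helper, and the rows are a handful copied from the question rather than the full table.

```python
from datetime import date, time

rows = [
    # (Fecha, Hora, AforoTotal), a few rows copied from the question
    (date(2022, 1, 13), time(16), 4532),
    (date(2022, 1, 13), time(15), 4419),
    (date(2022, 1, 12), time(17), 4377),
    (date(2022, 1, 12), time(16), 4428),
    (date(2022, 1, 12), time(15), 4424),
]

def max_row_per_day(rows):
    """Return {Fecha: row}, keeping the row with the largest AforoTotal per day."""
    best = {}
    for row in rows:
        fecha, _, aforo = row
        # Strict '>' means the first row seen wins when two rows tie.
        if fecha not in best or aforo > best[fecha][2]:
            best[fecha] = row
    return best

result = max_row_per_day(rows)
for fecha in sorted(result, reverse=True):
    print(result[fecha])
```

Unlike the IN-subquery approach, this keeps exactly one row per day even when several rows share the maximum; which of the tied rows survives depends on input order.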

Add 10 to 40 minutes randomly to a datetime column in pandas

I have a data frame as shown below
start
2010-01-06 09:00:00
2018-01-07 08:00:00
2012-01-08 11:00:00
2016-01-07 08:00:00
2010-02-06 14:00:00
2018-01-07 16:00:00
To the above df, I would like to add a column called 'finish' by adding a random number of minutes between 10 and 40 (sampled with replacement) to the start column.
Expected output:
start finish
2010-01-06 09:00:00 2010-01-06 09:20:00
2018-01-07 08:00:00 2018-01-07 08:12:00
2012-01-08 11:00:00 2012-01-08 11:38:00
2016-01-07 08:00:00 2016-01-07 08:15:00
2010-02-06 14:00:00 2010-02-06 14:24:00
2018-01-07 16:00:00 2018-01-07 16:36:00
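Create timedeltas with to_timedelta and numpy.random.randint. Note that randint excludes the upper bound, so pass 41 if 40 minutes should be possible:
import numpy as np
import pandas as pd
arr = np.random.randint(10, 41, size=len(df))  # integers from 10 to 40; upper bound is exclusive
df['finish'] = df['start'] + pd.to_timedelta(arr, unit='min')
print(df)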
start finish
0 2010-01-06 09:00:00 2010-01-06 09:25:00
1 2018-01-07 08:00:00 2018-01-07 08:30:00
2 2012-01-08 11:00:00 2012-01-08 11:29:00
3 2016-01-07 08:00:00 2016-01-07 08:12:00
4 2010-02-06 14:00:00 2010-02-06 14:31:00
5 2018-01-07 16:00:00 2018-01-07 16:39:00
You can also achieve it with pandas.Series.apply() in combination with pandas.to_timedelta() and random.randint(), whose bounds are both inclusive:
from random import randint
df['finish'] = df.start.apply(lambda dt: dt + pd.to_timedelta(randint(10, 40), unit='m'))
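The difference between the two answers is mostly the bound handling: numpy.random.randint excludes the upper bound, while random.randint from the standard library includes it. The following sketch uses only the standard library to show the inclusive variant; the seed and the variable names are assumptions made for the example, not part of either answer.

```python
import random
from datetime import datetime, timedelta

starts = [
    datetime(2010, 1, 6, 9, 0),
    datetime(2018, 1, 7, 8, 0),
    datetime(2012, 1, 8, 11, 0),
]

rng = random.Random(0)  # seeded only so repeated runs give the same values
# randint includes both endpoints, so 10 and 40 are both possible.
finishes = [s + timedelta(minutes=rng.randint(10, 40)) for s in starts]

for s, f in zip(starts, finishes):
    print(s, f)
```

Each value is drawn independently, so the same number of minutes can appear more than once, which is what "with replacement" asks for.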

Using generate_series with partition by generates triple duplicate rows

I have a table which looks like this:
dt type
-----------------------------
2019-07-01 10:00:00 A
2019-07-01 10:15:00 A
2019-07-01 11:00:00 A
2019-07-01 08:30:00 B
2019-07-01 08:45:00 B
2019-07-01 09:30:00 B
Each type has its own dt values, and each type should have a consecutive series of dt values at 15-minute intervals, but some rows are missing. So I used generate_series() to add the missing datetimes, with partition by to do it per type, like this:
SELECT
generate_series(min(dt) over (partition by type),
max(dt) over (partition by type), interval '15 minute')
, type
FROM t
This generates datetimes for the dt column from the min to the max dt of each type in 15-minute steps.
This is what I expect to get:
dt type
-----------------------------
2019-07-01 10:00:00 A
2019-07-01 10:15:00 A
2019-07-01 10:30:00 A
2019-07-01 10:45:00 A
2019-07-01 11:00:00 A
2019-07-01 08:30:00 B
2019-07-01 08:45:00 B
2019-07-01 09:00:00 B
2019-07-01 09:15:00 B
2019-07-01 09:30:00 B
What I get looks like the expected result, but every type and datetime is returned three times.
E.g.
dt type
-----------------------------
2019-07-01 10:00:00 A
2019-07-01 10:15:00 A
2019-07-01 10:30:00 A
2019-07-01 10:45:00 A
2019-07-01 11:00:00 A
2019-07-01 10:00:00 A
2019-07-01 10:15:00 A
2019-07-01 10:30:00 A
2019-07-01 10:45:00 A
2019-07-01 11:00:00 A
2019-07-01 10:00:00 A
2019-07-01 10:15:00 A
2019-07-01 10:30:00 A
2019-07-01 10:45:00 A
2019-07-01 11:00:00 A
2019-07-01 08:30:00 B
. . .
The same happens for type B.
So, what do I need to change in my query to get the expected result?
You just want to run generate_series() over the aggregation:
SELECT type, generate_series(min_dt, max_dt, interval '15 minute')
FROM (SELECT type, MIN(dt) as min_dt, MAX(dt) as max_dt
FROM t
GROUP BY type
) t;
The window functions first attach the min and max values to every row, and then each row generates its own series. With three rows per type, you get three copies of the series.
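The row multiplication can be reproduced in a few lines of Python. This is a sketch with a hypothetical series helper standing in for generate_series, using only the three type A rows from the question.

```python
from datetime import datetime, timedelta

t = [
    (datetime(2019, 7, 1, 10, 0), "A"),
    (datetime(2019, 7, 1, 10, 15), "A"),
    (datetime(2019, 7, 1, 11, 0), "A"),
]
step = timedelta(minutes=15)

def series(start, stop, step):
    """Inclusive range of datetimes, like generate_series(start, stop, step)."""
    out = []
    while start <= stop:
        out.append(start)
        start += step
    return out

lo = min(dt for dt, _ in t)
hi = max(dt for dt, _ in t)

# Window version: every input row carries (lo, hi) and expands into a full series.
windowed = [(dt, typ) for _, typ in t for dt in series(lo, hi, step)]

# Aggregated version: one (lo, hi) pair per type, so one series per type.
aggregated = [(dt, "A") for dt in series(lo, hi, step)]

print(len(windowed), len(aggregated))  # prints: 15 5
```

Three input rows times five generated timestamps gives the fifteen rows seen in the question, while aggregating first yields the expected five.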

Translate help MS SQL => Oracle

I am trying to translate an MS SQL function to Oracle and I am running into trouble. The reason is that the function creates a temporary table and gradually adds rows to it. I can't seem to replace the temporary table with a cursor that I can gradually add to. Someone must have a good idea how to write this in Oracle:
ALTER FUNCTION [dbo].[F_GetDateIntervalTable]
(
  @OccurredFrom datetime,
  @OccurredTo datetime,
  @Interval decimal
)
RETURNS @Tbl table
(
  [Dts] datetime
)
AS
BEGIN
  DECLARE @Count int
  --DECLARE @Tbl table([Dts] datetime)
  DECLARE @Dts datetime
  DECLARE @SeedDts datetime
  SET @Count = 1
  SET @Dts = DATEADD(MINUTE, FLOOR(DATEDIFF(MINUTE,0,@OccurredFrom)/@Interval)*@Interval, 0);
  SET @SeedDts = DATEADD(MINUTE, FLOOR(DATEDIFF(MINUTE,0,@OccurredFrom)/@Interval)*@Interval, 0);
  SET @OccurredTo = DATEADD(MINUTE, -@Interval, @OccurredTo);
  WHILE (@SeedDts < @OccurredTo)
  BEGIN
    SET @SeedDts = DATEADD(MINUTE, @Interval*(@Count-1), @Dts)
    INSERT INTO @Tbl(Dts) VALUES(@SeedDts)
    SET @Count = (@Count + 1)
  END
  RETURN
END
The output should be this (given the parameters):
@OccurredFrom = '2013-01-01',
@OccurredTo = '2013-01-02',
@Interval = 60
2013-01-01 00:00:00.000
2013-01-01 01:00:00.000
2013-01-01 02:00:00.000
2013-01-01 03:00:00.000
2013-01-01 04:00:00.000
2013-01-01 05:00:00.000
2013-01-01 06:00:00.000
2013-01-01 07:00:00.000
2013-01-01 08:00:00.000
2013-01-01 09:00:00.000
2013-01-01 10:00:00.000
2013-01-01 11:00:00.000
2013-01-01 12:00:00.000
2013-01-01 13:00:00.000
2013-01-01 14:00:00.000
2013-01-01 15:00:00.000
2013-01-01 16:00:00.000
2013-01-01 17:00:00.000
2013-01-01 18:00:00.000
2013-01-01 19:00:00.000
2013-01-01 20:00:00.000
2013-01-01 21:00:00.000
2013-01-01 22:00:00.000
2013-01-01 23:00:00.000
Any ideas are greatly appreciated!
Assuming you need to use a function rather than simply writing a SQL query such as this one:
SQL> with x as (
  select date '2013-01-01' start_date,
         date '2013-01-02' end_date,
         60 interval
  from dual
)
select start_date + numtodsinterval( interval * (level-1), 'minute' )
from x
connect by level <= (end_date - start_date)*24*60/interval;
START_DATE+NUMTODSI
-------------------
2013-01-01 00:00:00
2013-01-01 01:00:00
2013-01-01 02:00:00
2013-01-01 03:00:00
2013-01-01 04:00:00
2013-01-01 05:00:00
2013-01-01 06:00:00
2013-01-01 07:00:00
2013-01-01 08:00:00
2013-01-01 09:00:00
2013-01-01 10:00:00
2013-01-01 11:00:00
2013-01-01 12:00:00
2013-01-01 13:00:00
2013-01-01 14:00:00
2013-01-01 15:00:00
2013-01-01 16:00:00
2013-01-01 17:00:00
2013-01-01 18:00:00
2013-01-01 19:00:00
2013-01-01 20:00:00
2013-01-01 21:00:00
2013-01-01 22:00:00
2013-01-01 23:00:00
24 rows selected.
you can create a pipelined table function:
SQL> create type tbl_date as table of date;
/
Type created.
SQL> create or replace function get_date_interval( p_start_date in date,
                                              p_end_date in date,
                                              p_interval in number )
  return tbl_date
  pipelined
is
  l_return_dt date := p_start_date;
begin
  while( l_return_dt < p_end_date )
  loop
    pipe row( l_return_dt );
    l_return_dt := l_return_dt + numtodsinterval( p_interval, 'minute' );
  end loop;
  return;
end;
/
Function created.
SQL> select *
from table( get_date_interval( date '2013-01-01',
                               date '2013-01-02',
                               60 ));
COLUMN_VALUE
-------------------
2013-01-01 00:00:00
2013-01-01 01:00:00
2013-01-01 02:00:00
2013-01-01 03:00:00
2013-01-01 04:00:00
2013-01-01 05:00:00
2013-01-01 06:00:00
2013-01-01 07:00:00
2013-01-01 08:00:00
2013-01-01 09:00:00
2013-01-01 10:00:00
2013-01-01 11:00:00
2013-01-01 12:00:00
2013-01-01 13:00:00
2013-01-01 14:00:00
2013-01-01 15:00:00
2013-01-01 16:00:00
2013-01-01 17:00:00
2013-01-01 18:00:00
2013-01-01 19:00:00
2013-01-01 20:00:00
2013-01-01 21:00:00
2013-01-01 22:00:00
2013-01-01 23:00:00
24 rows selected.
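The pipelined function behaves much like a generator: it hands back one row at a time instead of filling a temporary table first. As a rough analogy only, and not a translation of either database function, the same loop can be sketched as a Python generator. The name get_date_interval is reused here purely for readability.

```python
from datetime import datetime, timedelta

def get_date_interval(start, end, interval_minutes):
    """Yield datetimes from start (inclusive) to end (exclusive)."""
    current = start
    step = timedelta(minutes=interval_minutes)
    while current < end:
        yield current  # plays the role of PIPE ROW
        current += step

slots = list(get_date_interval(datetime(2013, 1, 1), datetime(2013, 1, 2), 60))
print(len(slots), slots[0], slots[-1])
```

Like the Oracle function above, this sketch starts exactly at the given start value. The original T-SQL function first rounds the start down to a multiple of the interval, so the results would differ for a start that is not already aligned.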