Date from Date to
2018-12-11 2019-01-08
2019-01-08 2019-02-09
2019-02-10 2019-03-14
2019-03-17 2019-04-11
2019-04-15 2019-05-16
2019-05-16 2019-06-13
The output should look like this:
Date from Date to Days
2018-12-11 2019-01-08 0
2019-01-08 2019-02-09 1
2019-02-10 2019-03-14 3
2019-03-17 2019-04-11 4
2019-04-15 2019-05-16 0
2019-05-16 2019-06-13 -
To return the difference between two date values in days you could use the DATEDIFF() Function, something like:
SELECT DATEDIFF(DAY, DayFrom, DayTo) AS 'DaysBetween'
FROM DateTable
You want lead() and a date diff function:
select
date_from,
date_to,
datediff(day, date_to, lead(date_from) over(order by date_from)) days
from mytable
datediff() is a SQL Server function. There are equivalents in other RDBMS.
Side note: I would recommend against using a string value (-) for records that do not have a next record, since the other values are numeric (the datatypes in a column must be consistent). null is good enough for this (and it is what the above query produces).
Demo on DB Fiddle:
date_from | date_to | days
:------------------ | :------------------ | ---:
11/12/2018 00:00:00 | 08/01/2019 00:00:00 | 0
08/01/2019 00:00:00 | 09/02/2019 00:00:00 | 1
10/02/2019 00:00:00 | 14/03/2019 00:00:00 | 3
17/03/2019 00:00:00 | 11/04/2019 00:00:00 | 4
15/04/2019 00:00:00 | 16/05/2019 00:00:00 | 0
16/05/2019 00:00:00 | 13/06/2019 00:00:00 | null
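For readers who want to check the lead() logic outside the database, here is a minimal Python sketch of the same idea. It assumes the rows are already sorted by date_from; the names periods and gap_days are made up for illustration and are not part of any answer above.

```python
from datetime import date

# Sample rows from the question: (date_from, date_to), sorted by date_from.
periods = [
    (date(2018, 12, 11), date(2019, 1, 8)),
    (date(2019, 1, 8), date(2019, 2, 9)),
    (date(2019, 2, 10), date(2019, 3, 14)),
    (date(2019, 3, 17), date(2019, 4, 11)),
    (date(2019, 4, 15), date(2019, 5, 16)),
    (date(2019, 5, 16), date(2019, 6, 13)),
]


def gap_days(rows):
    """Days between each row's date_to and the next row's date_from.

    The last row has no successor, so its gap is None (like SQL null).
    """
    gaps = []
    for i, (_, date_to) in enumerate(rows):
        if i + 1 < len(rows):
            next_from = rows[i + 1][0]  # what lead(date_from) would return
            gaps.append((next_from - date_to).days)
        else:
            gaps.append(None)
    return gaps


print(gap_days(periods))
```

As in the SQL version, the last row yields None rather than a "-" string, which keeps the column numeric.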
I have got the following table.
date2 Group number
2020-28-05 00:00:00 A 55
2020-28-05 00:00:00 B 1.09
2020-28-05 00:00:00 C 1.8
2020-29-05 00:00:00 A 68
2020-29-05 00:00:00 B 1.9
2020-29-05 00:00:00 C 1.19
2020-01-06 00:00:00 A 10
2020-01-06 00:00:00 B 15
2020-01-06 00:00:00 C 0.88
2020-02-06 00:00:00 A 22
2020-02-06 00:00:00 B 15
2020-02-06 00:00:00 C 13
2020-03-06 00:00:00 A 66
2020-03-06 00:00:00 B 88
2020-03-06 00:00:00 C 99
As you can see, the dates 2020-30-05 and 2020-31-05 are missing from this table. These dates need to be filled with the 2020-29-05 values for each Group. The final output should look like this:
date2 Group number
2020-28-05 00:00:00 A 55
2020-28-05 00:00:00 B 1.09
2020-28-05 00:00:00 C 1.8
2020-29-05 00:00:00 A 68
2020-29-05 00:00:00 B 1.9
2020-29-05 00:00:00 C 1.19
2020-30-05 00:00:00 A 68
2020-30-05 00:00:00 B 1.9
2020-30-05 00:00:00 C 1.19
2020-31-05 00:00:00 A 68
2020-31-05 00:00:00 B 1.9
2020-31-05 00:00:00 C 1.19
2020-01-06 00:00:00 A 10
2020-01-06 00:00:00 B 15
2020-01-06 00:00:00 C 0.88
2020-02-06 00:00:00 A 22
2020-02-06 00:00:00 B 15
2020-02-06 00:00:00 C 13
2020-03-06 00:00:00 A 66
2020-03-06 00:00:00 B 88
2020-03-06 00:00:00 C 99
I tried the following:
I created a temporary table (table B) containing only the dates from 2020-28-05 to 2020-03-06 and then used a left join, so that the new dates come out as null (the plan was to add a CASE for the nulls and fill in last_value). However, it does not work: after the join I get a null for only one row per missing date, where there should be three (one per group). This is only part of a larger dataset. How can I get the necessary output?
PS: I use Vertica.
Since you use Vertica: Vertica has the TIMESERIES clause, which seems to match exactly what you need.
Out of a time series with irregular intervals between the rows (like yours), or with longer gaps in an otherwise regular time series, it creates a regular time series whose interval between consecutive rows is the one you specify in the AS sub-clause of the TIMESERIES clause. TS_FIRST_VALUE() and TS_LAST_VALUE() are functions that rely on that clause and return the value deduced from the input rows at the generated timestamp. This value can be obtained as 'const', that is, carried over from the most recent original row at or before the generated timestamp, or as 'linear', that is, interpolated between the original row just before and the original row just after the generated timestamp. For your needs, you would use the constant value. See here:
WITH
-- your input ....
input(tmstmp,grp,nbr) AS (
SELECT TIMESTAMP '2020-05-28 00:00:00','A',55
UNION ALL SELECT TIMESTAMP '2020-05-28 00:00:00','B',1.09
UNION ALL SELECT TIMESTAMP '2020-05-28 00:00:00','C',1.8
UNION ALL SELECT TIMESTAMP '2020-05-29 00:00:00','A',68
UNION ALL SELECT TIMESTAMP '2020-05-29 00:00:00','B',1.9
UNION ALL SELECT TIMESTAMP '2020-05-29 00:00:00','C',1.19
UNION ALL SELECT TIMESTAMP '2020-06-01 00:00:00','A',10
UNION ALL SELECT TIMESTAMP '2020-06-01 00:00:00','B',15
UNION ALL SELECT TIMESTAMP '2020-06-01 00:00:00','C',0.88
UNION ALL SELECT TIMESTAMP '2020-06-02 00:00:00','A',22
UNION ALL SELECT TIMESTAMP '2020-06-02 00:00:00','B',15
UNION ALL SELECT TIMESTAMP '2020-06-02 00:00:00','C',13
UNION ALL SELECT TIMESTAMP '2020-06-03 00:00:00','A',66
UNION ALL SELECT TIMESTAMP '2020-06-03 00:00:00','B',88
UNION ALL SELECT TIMESTAMP '2020-06-03 00:00:00','C',99
)
-- real query here ...
SELECT
ts AS tmstmp
, grp
, TS_FIRST_VALUE(nbr,'const') AS nbr
FROM input
TIMESERIES ts AS '1 DAY' OVER(PARTITION BY grp ORDER BY tmstmp)
ORDER BY 1,2
;
-- out tmstmp | grp | nbr
-- out ---------------------+-----+-------
-- out 2020-05-28 00:00:00 | A | 55.00
-- out 2020-05-28 00:00:00 | B | 1.09
-- out 2020-05-28 00:00:00 | C | 1.80
-- out 2020-05-29 00:00:00 | A | 68.00
-- out 2020-05-29 00:00:00 | B | 1.90
-- out 2020-05-29 00:00:00 | C | 1.19
-- out 2020-05-30 00:00:00 | A | 68.00
-- out 2020-05-30 00:00:00 | B | 1.90
-- out 2020-05-30 00:00:00 | C | 1.19
-- out 2020-05-31 00:00:00 | A | 68.00
-- out 2020-05-31 00:00:00 | B | 1.90
-- out 2020-05-31 00:00:00 | C | 1.19
-- out 2020-06-01 00:00:00 | A | 10.00
-- out 2020-06-01 00:00:00 | B | 15.00
-- out 2020-06-01 00:00:00 | C | 0.88
-- out 2020-06-02 00:00:00 | A | 22.00
-- out 2020-06-02 00:00:00 | B | 15.00
-- out 2020-06-02 00:00:00 | C | 13.00
-- out 2020-06-03 00:00:00 | A | 66.00
-- out 2020-06-03 00:00:00 | B | 88.00
-- out 2020-06-03 00:00:00 | C | 99.00
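To make the 'const' gap filling concrete without a Vertica instance, here is a rough Python sketch of the same carry-forward idea. It is not how Vertica implements TIMESERIES; the function name fill_daily and the shortened sample data are assumptions made for this illustration.

```python
from datetime import date, timedelta

# A shortened version of the input: (day, group, number).
rows = [
    (date(2020, 5, 28), "A", 55), (date(2020, 5, 28), "B", 1.09), (date(2020, 5, 28), "C", 1.8),
    (date(2020, 5, 29), "A", 68), (date(2020, 5, 29), "B", 1.9), (date(2020, 5, 29), "C", 1.19),
    (date(2020, 6, 1), "A", 10), (date(2020, 6, 1), "B", 15), (date(2020, 6, 1), "C", 0.88),
]


def fill_daily(rows):
    """Return one row per day and group, carrying the last seen value forward."""
    by_group = {}
    for day, grp, nbr in rows:
        by_group.setdefault(grp, {})[day] = nbr

    filled = []
    for grp, values in by_group.items():
        day, last_day = min(values), max(values)
        current = None
        while day <= last_day:
            # Use the real value if the day exists, else keep the previous one.
            current = values.get(day, current)
            filled.append((day, grp, current))
            day += timedelta(days=1)
    return sorted(filled)


for row in fill_daily(rows):
    print(row)
```

Filling per group is what the asker's left join was missing: the calendar has to be generated once per group, not once overall.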
I want to GROUP BY a time range. I have a start_date and an end_date, and I want to split
the range between them into 25 buckets and get the sum of value for each bucket from 1 to 25.
Simple presentation of my table:
select * from t1
where time between start_date and end_date
Table t1 has:
time 2019-10-01 value 50
time 2019-10-01 value 50
time 2019-10-02 value 50
time 2019-10-02 value 50
time 2019-10-02 value 50
time 2019-10-02 value 50
time 2019-10-03 value 50
time 2019-10-04 value 50
time 2019-10-05 value 50
time 2019-10-05 value 50
time 2019-10-05 value 50
start_date 2019-10-01
end_date 2019-10-25
I use the generate_series function to split the range into:
2019-10-01
2019-10-02
2019-10-03
2019-10-04
2019-10-05
2019-10-06
2019-10-07
2019-10-08
2019-10-09
2019-10-10
2019-10-11
2019-10-12
2019-10-13
2019-10-14
2019-10-15
2019-10-16
2019-10-17
2019-10-18
2019-10-19
2019-10-20
2019-10-21
2019-10-22
2019-10-23
2019-10-24
2019-10-25
and then sum the values for each of these 25 days, e.g.:
2019-10-01 should have the value 100
2019-10-02 should have the value 200
I am going to recommend a lateral join:
select d.dt, t.total_value
from generate_series(date '2019-10-01', date '2019-10-25', interval '1' day
) d(dt) left join lateral
(select coalesce(sum(value), 0) as total_value
from t
where t.time >= d.dt and
t.time < d.dt + interval '1' day
) t
on true;
A lateral join can have better performance than overall aggregation, particularly with an index on (time, value).
I understand that you want to generate a list of days, and compute the sum of a column for each:
select d.dt, coalesce(sum(value), 0) total_value
from
generate_series(date'2019-10-01', date'2019-10-25', interval '1' day) as d(dt)
left join mytable t
on t.time >= d.dt
and t.time < d.dt + interval '1' day
group by d.dt
order by d.dt
On dates for which no record is available in your table, total_value will display 0.
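The same "generate every day, then sum with a zero default" logic can be sketched in Python to sanity-check the expected totals. The name daily_totals is invented for this example; the data mirrors table t1 from the question.

```python
from collections import defaultdict
from datetime import date, timedelta

# Rows of t1 from the question: (time, value).
t1 = (
    [(date(2019, 10, 1), 50)] * 2
    + [(date(2019, 10, 2), 50)] * 4
    + [(date(2019, 10, 3), 50)]
    + [(date(2019, 10, 4), 50)]
    + [(date(2019, 10, 5), 50)] * 3
)


def daily_totals(rows, start, end):
    """Sum value per day for every day in [start, end]; days without rows get 0."""
    totals = defaultdict(int)
    for day, value in rows:
        totals[day] += value
    n_days = (end - start).days + 1
    days = (start + timedelta(days=i) for i in range(n_days))
    return [(day, totals.get(day, 0)) for day in days]


result = daily_totals(t1, date(2019, 10, 1), date(2019, 10, 25))
for day, total in result:
    print(day, total)
```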
Assuming start_date and end_date are variables, you might want to try the following CTE. It sums value grouped by day. coalesce, as pointed out by @GMB in the other answer, replaces the null sums with 0.
WITH j AS (
SELECT generate_series(DATE '2019-10-01', DATE '2019-10-25', '1 day') AS day)
SELECT j.day, coalesce(sum(value), 0) FROM t1
RIGHT JOIN j ON j.day = time
GROUP BY j.day ORDER BY j.day;
day | coalesce
------------------------+----------
2019-10-01 00:00:00+02 | 100
2019-10-02 00:00:00+02 | 200
2019-10-03 00:00:00+02 | 50
2019-10-04 00:00:00+02 | 50
2019-10-05 00:00:00+02 | 150
2019-10-06 00:00:00+02 | 0
2019-10-07 00:00:00+02 | 0
2019-10-08 00:00:00+02 | 0
2019-10-09 00:00:00+02 | 0
2019-10-10 00:00:00+02 | 0
2019-10-11 00:00:00+02 | 0
2019-10-12 00:00:00+02 | 0
2019-10-13 00:00:00+02 | 0
2019-10-14 00:00:00+02 | 0
2019-10-15 00:00:00+02 | 0
2019-10-16 00:00:00+02 | 0
2019-10-17 00:00:00+02 | 0
2019-10-18 00:00:00+02 | 0
2019-10-19 00:00:00+02 | 0
2019-10-20 00:00:00+02 | 0
2019-10-21 00:00:00+02 | 0
2019-10-22 00:00:00+02 | 0
2019-10-23 00:00:00+02 | 0
2019-10-24 00:00:00+02 | 0
2019-10-25 00:00:00+02 | 0
(25 rows)
EDIT (see comments below):
Changing the series to a 12-hour interval between the generated elements. Note that the DATE literals discard the time part, so the series starts at midnight.
WITH j AS (
SELECT generate_series(DATE '2019-10-01 01:30:00',
DATE '2019-10-03 12:30:00', '12 hours') AS day)
SELECT j.day, coalesce(sum(value),0) FROM t1
RIGHT JOIN j ON j.day = time
GROUP BY j.day ORDER BY j.day;
day | coalesce
------------------------+----------
2019-10-01 00:00:00+02 | 100
2019-10-01 12:00:00+02 | 0
2019-10-02 00:00:00+02 | 200
2019-10-02 12:00:00+02 | 0
2019-10-03 00:00:00+02 | 50
(5 rows)
You can change the parameters of the generate_series function as you wish, e.g. 30 minutes, 1 hour, etc.
The same can be done with TIMESTAMP, but the generated timestamps must be identical to the time values in your table for the join to match!
WITH j AS (
SELECT generate_series(TIMESTAMP '2019-10-01 00:00:00',
TIMESTAMP '2019-10-05 12:30:00', '8 hours') AS day)
SELECT j.day, coalesce(sum(value),0) FROM t1
RIGHT JOIN j ON j.day = time
GROUP BY j.day ORDER BY j.day;
day | coalesce
---------------------+----------
2019-10-01 00:00:00 | 100
2019-10-01 08:00:00 | 0
2019-10-01 16:00:00 | 0
2019-10-02 00:00:00 | 200
2019-10-02 08:00:00 | 0
2019-10-02 16:00:00 | 0
2019-10-03 00:00:00 | 50
2019-10-03 08:00:00 | 0
2019-10-03 16:00:00 | 0
2019-10-04 00:00:00 | 50
2019-10-04 08:00:00 | 0
2019-10-04 16:00:00 | 0
2019-10-05 00:00:00 | 150
2019-10-05 08:00:00 | 0
(14 rows)
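Because the equality join only matches rows that fall exactly on a generated timestamp, rows at other times of day would be dropped. One possible way around this, sketched here in Python, is to floor each row's time to the start of its bucket before summing. The readings below are hypothetical and bucket_start is a name invented for this example.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def bucket_start(ts, origin, step):
    """Floor ts to the start of the step-wide bucket that contains it, counted from origin."""
    return origin + ((ts - origin) // step) * step


origin = datetime(2019, 10, 1)
step = timedelta(hours=8)

# Hypothetical readings that do not all sit on a bucket boundary.
readings = [
    (datetime(2019, 10, 1, 0, 0), 50),
    (datetime(2019, 10, 1, 9, 15), 50),
    (datetime(2019, 10, 1, 15, 59), 50),
    (datetime(2019, 10, 1, 16, 0), 50),
]

totals = defaultdict(int)
for ts, value in readings:
    totals[bucket_start(ts, origin, step)] += value

for start, total in sorted(totals.items()):
    print(start, total)
```

In SQL, the range condition used in the other answers (time >= dt and time < dt + interval) serves the same purpose.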
In Postgres, the query below works using the generate_series function:
SELECT dates
FROM generate_series(CAST('2019-03-01' as TIMESTAMP), CAST('2019-04-01' as TIMESTAMP), interval '30 mins') AS dates
The query below works in Oracle, but only for a whole-day interval:
select to_date('2019-03-01','YYYY-MM-DD') + rownum -1 as dates
from all_objects
where rownum <= to_date('2019-03-06','YYYY-MM-DD')-to_date('2019-03-01','YYYY-MM-DD')+1
I want the same result in Oracle as the Postgres query above, with 30-minute steps.
Use a hierarchical query:
SELECT DATE '2019-03-01' + ( LEVEL - 1 ) * INTERVAL '30' MINUTE AS dates
FROM DUAL
CONNECT BY DATE '2019-03-01' + ( LEVEL - 1 ) * INTERVAL '30' MINUTE <= DATE '2019-04-01';
Output:
| DATES |
| :------------------ |
| 2019-03-01 00:00:00 |
| 2019-03-01 00:30:00 |
| 2019-03-01 01:00:00 |
| 2019-03-01 01:30:00 |
| 2019-03-01 02:00:00 |
| 2019-03-01 02:30:00 |
| 2019-03-01 03:00:00 |
| 2019-03-01 03:30:00 |
| 2019-03-01 04:00:00 |
| 2019-03-01 04:30:00 |
| 2019-03-01 05:00:00 |
| 2019-03-01 05:30:00 |
...
| 2019-03-31 19:30:00 |
| 2019-03-31 20:00:00 |
| 2019-03-31 20:30:00 |
| 2019-03-31 21:00:00 |
| 2019-03-31 21:30:00 |
| 2019-03-31 22:00:00 |
| 2019-03-31 22:30:00 |
| 2019-03-31 23:00:00 |
| 2019-03-31 23:30:00 |
| 2019-04-01 00:00:00 |
If I want to select a date relative to today's date I can do something like:
DateAdd(month, -2, N'1-Jan-2019')
This will give me the 1st of November 2018.
How would I get the date of the previous 1st of September, which may fall in the previous year?
E.g.:
Say it's July 2019,
I want the 1st of September 2018, NOT 2019.
However,
Say it's November 2019,
I want the 1st of September 2019, NOT 2018.
How is this possible?
You can do this by subtracting 8 months from your date value and then using the resulting year to build up your September date:
declare @d table(d date);
insert into @d values ('20170101'),('20180101'),('20181101'),('20190101'),('20191001'),('20190901'),('20190921'),('20190808');
select d
,datefromparts(year(dateadd(month,-8,d)),9,1) as PrevSeptDate
,datetimefromparts(year(dateadd(month,-8,d)),9,1,0,0,0,0) as PrevSeptDateTime
from @d
order by d;
Output
+------------+--------------+-------------------------+
| d | PrevSeptDate | PrevSeptDateTime |
+------------+--------------+-------------------------+
| 2017-01-01 | 2016-09-01 | 2016-09-01 00:00:00.000 |
| 2018-01-01 | 2017-09-01 | 2017-09-01 00:00:00.000 |
| 2018-11-01 | 2018-09-01 | 2018-09-01 00:00:00.000 |
| 2019-01-01 | 2018-09-01 | 2018-09-01 00:00:00.000 |
| 2019-08-08 | 2018-09-01 | 2018-09-01 00:00:00.000 |
| 2019-09-01 | 2019-09-01 | 2019-09-01 00:00:00.000 |
| 2019-09-21 | 2019-09-01 | 2019-09-01 00:00:00.000 |
| 2019-10-01 | 2019-09-01 | 2019-09-01 00:00:00.000 |
+------------+--------------+-------------------------+
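The reason the 8-month shift works is that it moves September to January, so every date from September onwards keeps its year and every earlier date drops into the previous one. A small Python sketch of that arithmetic follows; previous_september is a name chosen for this example only.

```python
from datetime import date


def previous_september(d):
    """Return the most recent 1 September on or before d."""
    # Count months since year 0, shift back 8 months, and read off the year.
    months = d.year * 12 + (d.month - 1) - 8
    return date(months // 12, 9, 1)


for sample in [date(2019, 7, 15), date(2019, 8, 8), date(2019, 9, 1), date(2019, 11, 20)]:
    print(sample, "->", previous_september(sample))
```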
I have been searching the web for the proper PostgreSQL syntax for current_week. I searched through the linked Date/Time documentation but could not get anything useful out of it. My task is to get Sunday as the start of the week.
I tried the same as current_date, but it failed:
select current_week
There has to be a current week syntax for PostgreSQL.
Knowing that for extract('dow' from
The day of the week as Sunday (0) to Saturday (6)
and
By definition, ISO weeks start on Mondays
You can work around this by subtracting one day:
select date_trunc('week', current_date) - interval '1 day' as current_week
current_week
------------------------
2016-12-18 00:00:00+00
(1 row)
Here is sample:
t=# with d as (select generate_series('2016-12-11','2016-12-28','1 day'::interval) t)
select date_trunc('week', d.t)::date - interval '1 day' as current_week, extract('dow' from d.t), d.t from d
;
current_week | date_part | t
---------------------+-----------+------------------------
2016-12-04 00:00:00 | 0 | 2016-12-11 00:00:00+00
2016-12-11 00:00:00 | 1 | 2016-12-12 00:00:00+00
2016-12-11 00:00:00 | 2 | 2016-12-13 00:00:00+00
2016-12-11 00:00:00 | 3 | 2016-12-14 00:00:00+00
2016-12-11 00:00:00 | 4 | 2016-12-15 00:00:00+00
2016-12-11 00:00:00 | 5 | 2016-12-16 00:00:00+00
2016-12-11 00:00:00 | 6 | 2016-12-17 00:00:00+00
2016-12-11 00:00:00 | 0 | 2016-12-18 00:00:00+00
2016-12-18 00:00:00 | 1 | 2016-12-19 00:00:00+00
2016-12-18 00:00:00 | 2 | 2016-12-20 00:00:00+00
2016-12-18 00:00:00 | 3 | 2016-12-21 00:00:00+00
2016-12-18 00:00:00 | 4 | 2016-12-22 00:00:00+00
2016-12-18 00:00:00 | 5 | 2016-12-23 00:00:00+00
2016-12-18 00:00:00 | 6 | 2016-12-24 00:00:00+00
2016-12-18 00:00:00 | 0 | 2016-12-25 00:00:00+00
2016-12-25 00:00:00 | 1 | 2016-12-26 00:00:00+00
2016-12-25 00:00:00 | 2 | 2016-12-27 00:00:00+00
2016-12-25 00:00:00 | 3 | 2016-12-28 00:00:00+00
(18 rows)
Time: 0.483 ms
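Note that in the sample above a Sunday (dow 0) is mapped to the Sunday of the week before, for example 2016-12-11 gives 2016-12-04. If a Sunday should count as the start of its own week, the offset has to be computed from the day of the week instead. Here is a minimal Python sketch of that rule; week_start_sunday is a hypothetical helper, not a PostgreSQL function.

```python
from datetime import date, timedelta


def week_start_sunday(d):
    """Return the Sunday on or before d.

    date.weekday() is Monday=0 .. Sunday=6, so (weekday + 1) % 7 is the
    number of days since the most recent Sunday (0 when d is a Sunday).
    """
    return d - timedelta(days=(d.weekday() + 1) % 7)


for sample in [date(2016, 12, 11), date(2016, 12, 14), date(2016, 12, 17), date(2016, 12, 18)]:
    print(sample, "->", week_start_sunday(sample))
```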
One method would be date_trunc(), which returns the Monday that starts the ISO week:
select date_trunc('week', current_date) as current_week