Back-filling time-series data with previous time's values - sql

I have a table of time-series data that has some gaps in the series. An example of the data is below:
Date | Value
2022-11-17 | 1
2022-11-14 | 2
I want to insert rows for the dates between the existing rows (2022-11-15, 2022-11-16) that have the value of the latest date before the date being inserted (the 2022-11-14 row).
I started by using an imperative solution in my application programming language but I'm convinced there must be a way to do this in SQL.

INSERT INTO mytable -- 5
SELECT
    generate_series( -- 1
        mydate + 1, -- 2
        lead(mydate) OVER (ORDER BY mydate) - 1, -- 3
        interval '1 day'
    )::date AS gs,
    t.myvalue -- 4
FROM mytable t;
1. Use generate_series() to generate a series of dates.
2. The series starts on the day after the row's date.
3. The series ends on the day before the next row's date. The lead() window function gives access to the next row.
4. Use the generated dates together with the value of the current row for the newly generated rows.
5. Finally, insert them into your table.
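To sanity-check which rows the query should insert, the same logic can be sketched outside the database. This is only an illustrative Python sketch, not part of the answer: the helper name `gap_rows` is made up, and it mirrors the steps above on the two sample rows from the question.

```python
from datetime import date, timedelta

def gap_rows(rows):
    """Return the (date, value) rows missing between consecutive existing rows.

    Hypothetical helper: mirrors generate_series(mydate + 1, lead(mydate) - 1),
    with each generated date carrying the value of the earlier row.
    """
    ordered = sorted(rows)  # ORDER BY mydate
    filled = []
    # Pair every row with the row that follows it (the role of lead()).
    for (current, value), (following, _) in zip(ordered, ordered[1:]):
        day = current + timedelta(days=1)   # series starts on the next day
        while day < following:              # series ends the day before the next row
            filled.append((day, value))     # carry the earlier row's value forward
            day += timedelta(days=1)
    return filled

existing = [(date(2022, 11, 17), 1), (date(2022, 11, 14), 2)]
new_rows = gap_rows(existing)
for d, v in new_rows:
    print(d.isoformat(), v)
# 2022-11-15 2
# 2022-11-16 2
```

The last row has no following row, so it produces nothing, just as generate_series() returns no rows when lead() yields NULL.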

Related

Where Clause on two column sequentially

I want to write a SELECT query that fetches records from a table based on two columns, where the timestamp values in both columns match sequentially.
SELECT *
FROM commerce_order
WHERE CHANGED BETWEEN 1638342000 AND 1641020400
AND created BETWEEN 1638342000 AND 1641020400
For example, if the changed column was updated on the 31st at 10 AM and the created column has the value 31st at 11 AM, both should be shown in the result, first the 10 AM one and then the 11 AM one:
10 AM
11 AM
So the "created" timestamp should be 1 minute after the "changed" timestamp?
Then you could truncate them on the minute to compare them.
select *
from commerce_order
where changed between unix_timestamp('2021-12-01 07:00') and unix_timestamp('2022-01-01 07:00')
and cast(date_format(from_unixtime(created), '%Y-%m-%d %H:%i') AS datetime) = cast(date_format(from_unixtime(changed), '%Y-%m-%d %H:%i') AS datetime) + interval 1 minute
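The minute-truncation comparison can be tried out on plain Unix timestamps. The following is a rough Python sketch of the same idea, not the answer's code: the function names are invented for illustration, and it assumes UTC, whereas from_unixtime() in the query uses the session time zone.

```python
from datetime import datetime, timedelta, timezone

def to_minute(ts):
    """Convert a Unix timestamp to a UTC datetime truncated to the minute."""
    moment = datetime.fromtimestamp(ts, tz=timezone.utc)
    return moment.replace(second=0, microsecond=0)

def created_one_minute_after(changed, created):
    """True when `created` falls in the minute right after the minute of `changed`."""
    return to_minute(created) == to_minute(changed) + timedelta(minutes=1)

base = 1638342000  # lower bound from the question; it falls on a whole minute

print(created_one_minute_after(base + 59, base + 61))  # True: adjacent minutes
print(created_one_minute_after(base, base + 30))       # False: same minute
print(created_one_minute_after(base, base + 120))      # False: two minutes apart
```

Note that truncation compares minute buckets, so two timestamps only 2 seconds apart can still count as "one minute after" when they straddle a minute boundary.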

Generate date array with all dates in between week range

I am trying to generate a date array for joining purposes, where I have all dates relative to a week range in the second column.
i.e.
SELECT
generate_date_array(date(2021,1,1),date(2021,11,26),INTERVAL 6 DAY)
Produces
What I need is a second column, that shows all the dates between each date row.
i.e.
Since I am ultimately joining this to another table, to see what week a record creation date is in, for this interval, I'm wanting to make this a table with two columns, and a row for each entry.
There are many ways of doing this. Below is just one, which reuses the code you already started with:
select week, week + step as date
from (
select
generate_date_array(date(2021,1,1),date(2021,11,26), interval 6 DAY) weeks
), unnest (weeks) week, unnest([0, 1, 2, 3, 4, 5]) step
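The cross join of range starts with day offsets can be pictured with a short Python sketch. It is an illustration under assumptions, not the query itself: `week_dates` is a made-up name, and the loop stands in for generate_date_array() plus the two unnest() calls.

```python
from datetime import date, timedelta

def week_dates(start, end, step_days=6):
    """Pair each range start with every date inside that range."""
    pairs = []
    week = start
    while week <= end:                  # generate_date_array(start, end, INTERVAL 6 DAY)
        for step in range(step_days):   # unnest([0, 1, 2, 3, 4, 5])
            pairs.append((week, week + timedelta(days=step)))
        week += timedelta(days=step_days)
    return pairs

pairs = week_dates(date(2021, 1, 1), date(2021, 11, 26))
for week, day in pairs[:7]:
    print(week, day)
```

Each range start appears six times, once per date it covers, so joining a record's creation date against the second column yields the range it belongs to.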

Get the number of the weekday from each date in a date list

CREATE TABLE dates (
date_list DATE
);
INSERT INTO dates
(date_list)
VALUES
('2020-01-29'),
('2020-01-30'),
('2020-01-31'),
('2020-02-01'),
('2020-02-02');
Expected Results:
Weekday
2
3
4
5
6
I want to get the number of the weekday for each date in the table dates.
Therefore, I tried to go with the solution from this question but could not make it work:
SELECT
EXTRACT(DOW FROM DATE d.date_list))
FROM dates d
How do I need to modify the query to get the expected result?
Get rid of the DATE keyword; it is only needed to introduce a DATE constant. If you already have a DATE value (which your column is), it's not needed:
select extract(dow from d.date_list)
from dates d
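One thing worth checking is the numbering convention. Assuming the database is PostgreSQL, dow counts Sunday as 0 and Saturday as 6, so the query above would return 3, 4, 5, 6, 0 for the sample dates, while the expected results (2 to 6) correspond to a Monday = 0 numbering. The Python sketch below only illustrates the two conventions on the sample dates; in PostgreSQL, something like extract(isodow from d.date_list) - 1 would be one way to get the Monday-based numbers.

```python
from datetime import date

dates = [date(2020, 1, 29), date(2020, 1, 30), date(2020, 1, 31),
         date(2020, 2, 1), date(2020, 2, 2)]

# Sunday = 0 ... Saturday = 6, the convention PostgreSQL documents for DOW.
sunday_based = [d.isoweekday() % 7 for d in dates]

# Monday = 0 ... Sunday = 6, which matches the expected results in the question.
monday_based = [d.weekday() for d in dates]

print(sunday_based)  # [3, 4, 5, 6, 0]
print(monday_based)  # [2, 3, 4, 5, 6]
```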

Facing issue in Hive query in generating missing dates

I have a requirement to go back 1000 days from the current date for a column and use those 1000 previous dates in my next steps. However, not all of those 1000 dates are present in that column of the table, and I still need the missing dates in the output of the query.
When I run the query below, it does not display the 1000 previous date values from the current date.
Example: let's say only 2 dates are available in the date column:
date
2019-01-16
2019-01-19
I have come up with a query to go back 1000 dates, but it returns only the nearest date, because all the earlier dates are missing:
SELECT date FROM table1 t
WHERE
date >= date_sub(current_date,1000) and dt<current_date ORDER BY date LIMIT 1
If I run the query above, it displays 2019-01-16: since the dates from the previous 1000 days are not present, it returns the nearest date, which is 2019-01-16. But I need the missing dates, starting from 2016-04-23 (the 1000th date back from the current date) up to, but not including, the current date (2019-01-18), as the output of my query.
You can generate the dates for the required range in a subquery (see the date_range subquery in the example below) and left join it with your table. If your table has no record for some date, the value will be null, while the dates themselves are returned from the date_range subquery without gaps. Set the start_date and end_date parameters to the required range:
set hivevar:start_date=2016-04-23; --replace with your start_date
set hivevar:end_date=2019-01-18; --replace with your end_date (a literal date, because the variable is quoted in the query below)
set hive.exec.parallel=true;
set hive.auto.convert.join=true; --this enables map-join
set hive.mapjoin.smalltable.filesize=25000000; --size of table to fit in memory
with date_range as
(--this query generates the date range, check its output
select date_add ('${hivevar:start_date}',s.i) as dt
from ( select posexplode(split(space(datediff('${hivevar:end_date}','${hivevar:start_date}')),' ')) as (i,x) ) s
)
select d.dt as date,
t.your_col --some value from your table on date
from date_range d
left join table1 t on d.dt=t.date
order by d.dt --order by dates if necessary
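The space()/split()/posexplode() combination is just a way to produce the positions 0..n, where n is the number of days between the two dates. The Python sketch below imitates that trick step by step and then the left join; it is illustrative only, and the sample values "a" and "b" as well as the short date range are made up.

```python
from datetime import date, timedelta

def date_range(start, end):
    """Mimic the date_range subquery: one date per day from start to end inclusive."""
    n = (end - start).days                   # datediff(end_date, start_date)
    pieces = (" " * n).split(" ")            # split(space(n), ' ') gives n + 1 empty strings
    return [start + timedelta(days=i)        # date_add(start_date, i)
            for i, _ in enumerate(pieces)]   # posexplode yields (position, element)

# Stand-in for table1: only two dates have rows (values are invented).
table1 = {date(2019, 1, 16): "a", date(2019, 1, 19): "b"}

for dt in date_range(date(2019, 1, 15), date(2019, 1, 19)):
    print(dt, table1.get(dt))                # left join: None where table1 has no row
```

Every date in the range is printed, and dates without a row in the stand-in table show None, which corresponds to the NULLs produced by the left join.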

Oracle sql. finding last quarter's last date for given number of dates

I need to find the last date of the previous quarter for each date present in a column and insert it into another column, i.e. read from one column and write to another column of the same table.
Example:
column 1 | column 2
02-aug-16|30-jun-16
05-dec-16|30-sep-16
Assuming you know how to insert a value into a column, and assuming that, if your date is stored as a string rather than in the correct date datatype, you know how to convert it to a date with to_date(), the only remaining question is: given a date, how do you find the last date of the previous quarter?
trunc() can be used with a date parameter; the function truncates the input date. You can give it a second argument to say what to truncate to, and 'q' stands for quarter. So trunc(date_col, 'q') returns the first day of the "current" quarter (current relative to the value stored in date_col, that is). Then you can subtract 1 (which means one day) to get the last day of the previous quarter.
SQL> select sysdate as today, trunc(sysdate, 'q') - 1 as last_day_of_prev_qtr from dual;
TODAY      LAST_DAY_OF_PREV_QTR
---------- --------------------
2016-08-02 2016-06-30
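The arithmetic behind trunc(date_col, 'q') - 1 can be checked with a small Python sketch. The function name is made up for illustration; it simply reproduces "first day of the date's quarter, minus one day" on the two sample dates from the question.

```python
from datetime import date, timedelta

def last_day_of_previous_quarter(d):
    """Mirror trunc(d, 'q') - 1: first day of d's quarter, minus one day."""
    first_month_of_quarter = 3 * ((d.month - 1) // 3) + 1    # 1, 4, 7 or 10
    quarter_start = date(d.year, first_month_of_quarter, 1)  # trunc(d, 'q')
    return quarter_start - timedelta(days=1)                 # subtract one day

print(last_day_of_previous_quarter(date(2016, 8, 2)))   # 2016-06-30
print(last_day_of_previous_quarter(date(2016, 12, 5)))  # 2016-09-30
```

For a date in the first quarter the result rolls back into the previous year, for example 2017-01-15 gives 2016-12-31.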
If I got it right, truncate the source date to the start of its quarter and subtract one day:
select col1, TRUNC(col1,'Q') - interval '1' day col2
from (
select cast('02-aug-16' as date) col1 from dual
union all
select cast('05-dec-16' as date) col1 from dual
);