split single row into multiple rows in SQL - sql

My table has two fields, start and stop, which store the start time and stop time respectively.
For example, start = 2014-01-01 23:43:00 and stop = 2014-01-03 03:33:00. This period needs to be broken down into:
1=> 2014-01-01 23:43:00 - 2014-01-02 00:00:00, as date 2014-01-01
2=> 2014-01-02 00:00:01 - 2014-01-03 00:00:00, as date 2014-01-02
3=> 2014-01-03 00:00:01 - 2014-01-03 03:33:00, as date 2014-01-03
as three different rows.
The problem is that the difference between stop and start varies, say from 1 day to 10 days.
To make it more complicated: in the above example I split the period on the basis of date, but I actually need to split it on the basis of a time of day, say at 02:30:00, so the splitting should be as follows.
1=> 2014-01-01 23:43:00 - 2014-01-02 02:30:00, as date 2014-01-01
2=> 2014-01-02 02:30:01 - 2014-01-03 02:30:00, as date 2014-01-02
3=> 2014-01-03 02:30:01 - 2014-01-03 02:30:00, as date 2014-01-03
4=> 2014-01-03 02:30:01 - 2014-01-03 03:33:00, as date 2014-01-04
Once the above split is done, I need to count the rows grouped by date.
I'm using PostgreSQL.
Can anyone shed some light on this?
Thanks!

I think your "split by time" desired output sample is wrong and should instead be this:
1=> 2014-01-01 23:43:00 - 2014-01-02 02:30:00, as date 2014-01-01
2=> 2014-01-02 02:30:01 - 2014-01-03 02:30:00, as date 2014-01-02
3=> 2014-01-03 02:30:01 - 2014-01-03 03:33:00, as date 2014-01-03
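If that is the case, then this will do it:
select day, count(*)
from (
    select generate_series(
        -- shift both ends back by the 02:30 cut-off so each period maps onto calendar days;
        -- leaving stop_time unshifted would count an extra day when it falls before 02:30
        (start_time - interval '2 hours 30 minutes')::date,
        stop_time - interval '2 hours 30 minutes',
        interval '1 day'
    )::date as day
    from t
) s
group by day
order by day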
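To check the counting logic without a database, here is a rough Python sketch of the same idea. It is not the query above, just the arithmetic: shift both ends of each period back by the cut-off, then list the calendar dates in between. The function name business_days and the second sample row are made up for illustration, and shifting the stop time as well as the start time is an assumption of this sketch.

```python
from collections import Counter
from datetime import datetime, timedelta


def business_days(start, stop, cutoff=timedelta(hours=2, minutes=30)):
    """Return the dates a period touches when each day rolls over at `cutoff`."""
    # Shifting both ends back by the cut-off turns "days that start at 02:30"
    # into ordinary calendar days.
    first = (start - cutoff).date()
    last = (stop - cutoff).date()
    return [first + timedelta(days=i) for i in range((last - first).days + 1)]


# Hypothetical sample rows: the one from the question plus a short period
# that crosses midnight but ends before the 02:30 cut-off.
rows = [
    (datetime(2014, 1, 1, 23, 43), datetime(2014, 1, 3, 3, 33)),
    (datetime(2014, 1, 2, 23, 0), datetime(2014, 1, 3, 1, 0)),
]

# Equivalent of "group by day" with count(*).
counts = Counter(day for start, stop in rows for day in business_days(start, stop))
for day, n in sorted(counts.items()):
    print(day, n)
```

The first row contributes one entry each for 2014-01-01, 2014-01-02 and 2014-01-03, while the second row only counts towards 2014-01-02 because it ends before the cut-off.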

Create an auxiliary table, period, and insert all possible periods into it, say from now() - 10 years to now() + 10 years.
For day-long periods, that means one row for every day of those 20 years.
After that you can select from the period table and JOIN it to your table to extract the periods each row covers.

You probably will need an additional table containing every date to join it to your base table.
If you had a table like dates containing values like:
date_col
2014-01-01 00:00:00
2014-01-02 00:00:00
2014-01-03 00:00:00
...
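Then you can do a query like
SELECT GREATEST(start_time, date_col), LEAST(end_time, date_col + interval '1 day')
FROM base_table
-- overlap test; date_col BETWEEN start_time AND end_time would drop the first partial day
JOIN dates ON date_col < end_time AND date_col + interval '1 day' > start_time
SQL not tested, but it shows the idea.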
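The join-and-clip idea can be tried out in plain Python before writing it against a real dates table. This is only a sketch: the list dates stands in for the calendar table, clip_to_days is a made-up name, and it assumes an overlap test (rather than BETWEEN) so that the first partial day is kept.

```python
from datetime import datetime, timedelta

ONE_DAY = timedelta(days=1)


def clip_to_days(start_time, end_time, dates):
    """Pair one period with every day it overlaps and clip it to that day."""
    return [
        # max/min play the role of GREATEST/LEAST in the SQL version
        (max(start_time, day), min(end_time, day + ONE_DAY))
        for day in dates
        # keep only the days whose [day, day + 1 day) window overlaps the period
        if day < end_time and day + ONE_DAY > start_time
    ]


# Stand-in for the calendar table: five consecutive midnights.
dates = [datetime(2014, 1, 1) + i * ONE_DAY for i in range(5)]

pieces = clip_to_days(datetime(2014, 1, 1, 23, 43), datetime(2014, 1, 3, 3, 33), dates)
for piece_start, piece_end in pieces:
    print(piece_start, "->", piece_end)
```

For the period in the question this gives three pieces, one per calendar day, with the first and last clipped to the actual start and stop times.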

Related

Ability to count per hour per day?

I'm new to SSAS so be gentle!
I have (simplified):
A fact table that has an ID, start date, start datetime, end date, end datetime
A date dimension that has a granularity from Year to Calendar Date.
What I'd like to be able to do is get the count of IDs per hour per date member/current member. However, I'm not exactly sure how to get there.
Fact Table Example
ID | Start Date | End Date | Start DateTime | End DateTime
1 | 2022-01-01 | 2022-01-04 | 2022-01-01 23:00 | 2022-01-04 05:33
53 | 2022-01-01 | 2022-01-07 | 2022-01-01 04:00 | 2022-01-07 12:05
Wanted results:
Date | Hour | Count
2022-01-02 | 00:00 | 1
2022-01-02 | 01:00 | 1
2022-01-02 | 02:00 | 1
2022-01-02 | 03:00 | 1
2022-01-02 | 04:00 | 2
2022-01-02 | 05:00 | 2
I expect I need an hour dimension that somehow links to the date dimension, and then some sort of measure that does a between comparison, but I'm not exactly sure how to go about this.
Any help is appreciated!

Impala: Split single row into multiple rows based on Date and time

I want to split a single row into multiple rows based on time.
SrNo Employee StartDate EndDate
---------------------------------------------------------------------------
1 emp1 30/03/2020 09:00:00 31/03/2020 07:15:00
2 emp2 01/04/2020 09:00:00 02/04/2020 08:00:00
Expected output is below:
SrNo Employee StartDate EndDate
---------------------------------------------------------------------------
1 emp1 30/03/2020 09:00:00 30/03/2020 11:59:00
1 emp1 31/03/2020 00:00:00 31/03/2020 07:15:00
2 emp2 01/04/2020 09:00:00 01/04/2020 11:59:00
2 emp2 02/04/2020 00:00:00 02/04/2020 08:00:00
A day runs from 00:00 to 00:00 the next day. When the EndDate falls after midnight, the row should be split into two rows: the first row ends at 30/03/2020 11:59:00 and the next row starts at 31/03/2020 00:00:00.
Please help me get this solved.
This would be a good spot for a recursive CTE, but unfortunately Hive does not support those. Here is another approach, which uses a derived table of numbers to split the periods:
select
    t.SrNo,
    t.Employee,
    greatest(t.startDate, date_add(to_date(t.startDate), x.n)) startDate,
    least(t.endDate, date_add(to_date(t.startDate), x.n + 1)) endDate
from mytable t
inner join (select 0 n union all select 1 union all select 2) x
    on date_add(to_date(t.startDate), x.n) <= t.endDate
You can expand the subquery to handle more possible periods per row.
Also note that this generates half-open intervals, where the end of the preceding interval is equal to the start of the next one (while in your result set there is a one-minute lag). The logic is that each interval is inclusive on its lower bound and exclusive on its upper bound (that way, you make sure not to leave any gap).
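To see what half-open splitting looks like on the sample data, here is a small Python sketch. It swaps the numbers-table join for a plain loop that walks from midnight to midnight, so it is an illustration of the interval logic rather than a translation of the query; split_at_midnight is a made-up name.

```python
from datetime import datetime, timedelta


def split_at_midnight(start, end):
    """Split [start, end) into pieces that each stay within one calendar day."""
    pieces = []
    cursor = start
    while cursor < end:
        # Midnight at the start of the day after `cursor`.
        next_midnight = datetime.combine(
            cursor.date() + timedelta(days=1), datetime.min.time()
        )
        piece_end = min(end, next_midnight)
        pieces.append((cursor, piece_end))
        # The next piece starts exactly where this one ended, so there is no gap.
        cursor = piece_end
    return pieces


for piece_start, piece_end in split_at_midnight(
    datetime(2020, 3, 30, 9, 0), datetime(2020, 3, 31, 7, 15)
):
    print(piece_start, "->", piece_end)
```

For emp1 this prints one piece ending at 2020-03-31 00:00:00 and a second piece starting at that same instant, which is the "end equals next start" behaviour described above.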

BigQuery - A way to generate timestamps based on hour/minute/seconds?

Is there a way to generate sequential timestamps in BigQuery that is focused on hours, minutes, and seconds?
In BigQuery you can generate sequential dates by:
select *
FROM UNNEST(GENERATE_DATE_ARRAY('2016-10-18', '2016-10-19', INTERVAL 1 DAY)) as day
This will generate the dates from 2016-10-18 to 2016-10-19 at one-day intervals:
Row day
1 2016-10-18
2 2016-10-19
But let's say I want intervals in 15 minutes or 5 minutes, is there a way to do that?
First, I would recommend "starring" the feature request for GENERATE_TIMESTAMP_ARRAY to express interest in having a function like this. Given GENERATE_ARRAY, though, the best option currently is to use a query of this form:
SELECT TIMESTAMP_ADD('2018-04-01', INTERVAL 15 * x MINUTE)
FROM UNNEST(GENERATE_ARRAY(0, 13)) AS x;
If you want a minute-based GENERATE_TIMESTAMP_ARRAY equivalent, you can use a UDF like this:
CREATE TEMP FUNCTION GenerateMinuteTimestampArray(
  t0 TIMESTAMP, t1 TIMESTAMP, minutes INT64) AS (
  ARRAY(
    SELECT TIMESTAMP_ADD(t0, INTERVAL minutes * x MINUTE)
    -- x counts whole steps, so divide the span by the step size to stop at t1
    FROM UNNEST(GENERATE_ARRAY(0, DIV(TIMESTAMP_DIFF(t1, t0, MINUTE), minutes))) AS x
  )
);
SELECT ts
FROM UNNEST(GenerateMinuteTimestampArray('2018-04-01', '2018-04-01 12:00:00', 15)) AS ts;
This returns a timestamp for each 15-minute interval between midnight and 12 PM on April 1.
Update: You can now use the GENERATE_TIMESTAMP_ARRAY function in BigQuery. If you want to generate timestamps at intervals of 15 minutes, for example, you can use:
SELECT GENERATE_TIMESTAMP_ARRAY('2016-10-18', '2016-10-19', INTERVAL 15 MINUTE);
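If it helps to sanity-check how many timestamps a given range and step should produce, the arithmetic is easy to reproduce in Python. This is just a sketch of the stepping logic, not BigQuery code; minute_timestamps is a made-up name, and it assumes both end points are inclusive.

```python
from datetime import datetime, timedelta


def minute_timestamps(t0, t1, minutes):
    """Return timestamps from t0 through t1 (inclusive), `minutes` apart."""
    step = timedelta(minutes=minutes)
    # Number of whole steps that fit between t0 and t1.
    count = (t1 - t0) // step
    return [t0 + i * step for i in range(count + 1)]


stamps = minute_timestamps(datetime(2018, 4, 1), datetime(2018, 4, 1, 12, 0), 15)
print(len(stamps), stamps[0], stamps[-1])
```

Midnight to noon in 15-minute steps gives 49 timestamps, the last one being 12:00:00 itself.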
Epochs seem like the way to go, but this requires converting the date to an epoch first.
select TIMESTAMP_MICROS(CAST(day * 1000000 as INT64))
FROM UNNEST(GENERATE_ARRAY(1522540800, 1525132799, 900)) as day
Row f0_
1 2018-04-01 00:00:00.000 UTC
2 2018-04-01 00:15:00.000 UTC
3 2018-04-01 00:30:00.000 UTC
4 2018-04-01 00:45:00.000 UTC
5 2018-04-01 01:00:00.000 UTC
6 2018-04-01 01:15:00.000 UTC
7 2018-04-01 01:30:00.000 UTC
8 2018-04-01 01:45:00.000 UTC
9 2018-04-01 02:00:00.000 UTC
10 2018-04-01 02:15:00.000 UTC
11 2018-04-01 02:30:00.000 UTC
12 2018-04-01 02:45:00.000 UTC
13 2018-04-01 03:00:00.000 UTC

Is it possible to convert integer to days and hours in SQL?

I am using SQL Server 2014.
What I'm trying to do is add a new time to an old datetime.
I'm not even sure if it's possible but I thought I'd ask the experts.
So these are what my columns look like:
CurrentDate | Hours | NewDate
2017-03-10 08:00:00 | 25 | ??
2017-01-01 10:00:00 | 27 | ??
What I want is the Hours to be converted to days and hours so it can be added to the CurrentDate to create a NewDate.
So the NewDate would be: 2017-03-11 09:00:00 because 25 hours equates to 1 day and 1 hour. And the second NewDate would be: 2017-01-02 13:00:00 because 27 hours equates to 1 day and 3 hours.
I actually don't think this is possible and there's a chance I might have to put the hours already converted into days and times but if that's the case, how would I write 25 hours? Would it be 00-00-01 01:00:00? And would 27 hours be 00-00-01 03:00:00 and then just add those values into CurrentDate?
Thanks! Feel free to tell me this has been asked before (I looked, but couldn't find anything as unique as this or maybe I didn't look hard enough) or if this can't be done.
You can simply use DATEADD, no need to convert the hours to days first:
SELECT CurrentDate,
       Hours,
       DATEADD(HOUR, Hours, CurrentDate) AS NewDate
FROM dbo.YourTable;
You can try this:
select DATEADD(HOUR,25,'2017-03-10 08:00:00') -- 2017-03-11 09:00:00.000
select DATEADD(HOUR,27,'2017-01-01 10:00:00') -- 2017-01-02 13:00:00.000
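The same point, that hours can be added directly without first converting them to days, can be checked outside SQL Server. Here is a small Python sketch; add_hours is a made-up helper, and the divmod line is only there to show the "1 day and 3 hours" breakdown from the question.

```python
from datetime import datetime, timedelta


def add_hours(current, hours):
    """Add a whole number of hours to a datetime; day rollover is handled for us."""
    return current + timedelta(hours=hours)


print(add_hours(datetime(2017, 3, 10, 8, 0), 25))  # 2017-03-11 09:00:00
print(add_hours(datetime(2017, 1, 1, 10, 0), 27))  # 2017-01-02 13:00:00

# The split into days and hours is never needed for the addition itself,
# but if you want to display it:
days, hours = divmod(27, 24)
print(days, "day(s) and", hours, "hour(s)")  # 1 day(s) and 3 hour(s)
```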

Create a series of dates in PostgreSQL 9.3

I am trying to generate a series of DATE objects for each day since January 1st, 2014.
The following query works:
SELECT day::DATE
FROM (
    SELECT generate_series('2014-01-01'::DATE, now(), '1 day') AS day
) sq;
day
------------
2014-01-01
2014-01-02
2014-01-03
2014-01-04
2014-01-05
2014-01-06
2014-01-07
2014-01-08
2014-01-09
2014-01-10
2014-01-11
2014-01-12
...
2015-12-13
2015-12-14
2015-12-15
(714 rows)
However, the subquery seems inelegant to me. Is there a way to create the date objects directly from the generate_series query?
select day::date
from generate_series('2014-01-01'::date, now(), '1 day') sq (day)
Or directly in the select list:
select generate_series('2014-01-01'::date, now(), '1 day')::date as day
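For comparison, or to double-check the row count on the client side, the same series is a one-liner in Python. This is only a sketch; date_series is a made-up name, and the end date is fixed at 2015-12-15 (the last row shown in the question) instead of now().

```python
from datetime import date, timedelta


def date_series(first, last):
    """Return every date from `first` through `last`, inclusive."""
    return [first + timedelta(days=i) for i in range((last - first).days + 1)]


days = date_series(date(2014, 1, 1), date(2015, 12, 15))
print(len(days), days[0], days[-1])
```

With those end points the list has 714 entries, matching the row count in the question's output.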