Teradata SQL query for grouping records using intervals - sql

In Teradata SQL, how can I assign the same row number to each group of records created within an 8-second interval?
Example:
Customerid Customername Itembought dateandtime
(yyyy-mm-dd hh:mm:ss)
100 ALex Basketball 2017-02-10 10:10:01
100 ALex Circketball 2017-02-10 10:10:06
100 ALex Baseball 2017-02-10 10:10:08
100 ALex volleyball 2017-02-10 10:11:01
100 ALex footbball 2017-02-10 10:11:05
100 ALex ringball 2017-02-10 10:11:08
100 Alex football 2017-02-10 10:12:10
My expected result should have an additional Row_number column that assigns the same number to all of a customer's purchases made within 8 seconds. See the expected result below:
Customerid Customername Itembought dateandtime Row_number
(yyyy-mm-dd hh:mm:ss)
100 ALex Basketball 2017-02-10 10:10:01 1
100 ALex Circketball 2017-02-10 10:10:06 1
100 ALex Baseball 2017-02-10 10:10:08 1
100 ALex volleyball 2017-02-10 10:11:01 2
100 ALex footbball 2017-02-10 10:11:05 2
100 ALex ringball 2017-02-10 10:11:08 2
100 Alex football 2017-02-10 10:12:10 3

This is one way to do it with a recursive CTE. Keep a running total of the differences from the previous row's timestamp; when that total exceeds 8 seconds, reset it to 0 and start a new group.
WITH ROWNUMS AS
(SELECT T.*
,ROW_NUMBER() OVER(PARTITION BY ID ORDER BY TM) AS RNUM
/*Replace DATEDIFF with Teradata specific function*/
,DATEDIFF(SECOND,COALESCE(MIN(TM) OVER(PARTITION BY ID
ORDER BY TM ROWS BETWEEN 1 PRECEDING AND CURRENT ROW), TM),TM) AS DIFF
FROM T --replace this with your tablename and add columns as required
)
,RECURSIVE CTE(ID,TM,DIFF,SUM_DIFF,RNUM,GRP) AS
(SELECT ID,
TM,
DIFF,
DIFF,
RNUM,
CAST(1 AS int)
FROM ROWNUMS
WHERE RNUM=1
UNION ALL
SELECT T.ID,
T.TM,
T.DIFF,
CASE WHEN C.SUM_DIFF+T.DIFF > 8 THEN 0 ELSE C.SUM_DIFF+T.DIFF END,
T.RNUM,
CAST(CASE WHEN C.SUM_DIFF+T.DIFF > 8 THEN T.RNUM ELSE C.GRP END AS int)
FROM CTE C
JOIN ROWNUMS T ON T.RNUM=C.RNUM+1 AND T.ID=C.ID
)
SELECT ID,
TM,
DENSE_RANK() OVER(PARTITION BY ID ORDER BY GRP) AS row_num
FROM CTE
Demo in SQL Server
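To make the reset logic easier to follow outside SQL, here is a minimal Python sketch of the same idea. It is only an illustration, not the query itself: the function name running_total_groups and the helper parse are made up for this example, and the input is the sample data from the question.

```python
from datetime import datetime


def parse(times):
    """Hypothetical helper: turn 'YYYY-MM-DD HH:MM:SS' strings into datetimes."""
    return [datetime.strptime(t, "%Y-%m-%d %H:%M:%S") for t in times]


def running_total_groups(timestamps, limit_seconds=8):
    """Return one group number per timestamp, in sorted order.

    The gap to the previous row is added to a running total. When the
    total exceeds the limit, a new group starts and the total resets to 0.
    """
    groups = []
    group, running, prev = 1, 0.0, None
    for ts in sorted(timestamps):
        if prev is not None:
            running += (ts - prev).total_seconds()
            if running > limit_seconds:
                group += 1
                running = 0.0
        groups.append(group)
        prev = ts
    return groups


# Sample purchases of customer 100 from the question.
purchases = parse([
    "2017-02-10 10:10:01", "2017-02-10 10:10:06", "2017-02-10 10:10:08",
    "2017-02-10 10:11:01", "2017-02-10 10:11:05", "2017-02-10 10:11:08",
    "2017-02-10 10:12:10",
])
# A chain of rows 5 seconds apart: the total span is capped at 8 seconds.
chain = parse([
    "2017-02-10 10:10:00", "2017-02-10 10:10:05",
    "2017-02-10 10:10:10", "2017-02-10 10:10:15",
])
```

With this definition a group never spans more than 8 seconds, so a chain of rows 5 seconds apart is split into several groups.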

I am going to interpret the problem differently from vkp. Any row within 8 seconds of another row should be in the same group. Such values can chain together, so the overall span can be more than 8 seconds.
The advantage of this method is that recursive CTEs are not needed, so it should be faster. (Of course, this is not an advantage if the OP does not agree with the definition.)
The basic idea is to look at the previous date/time value; if it is more than 8 seconds away, then add a flag. The cumulative sum of the flag is the row number you are looking for.
select t.*,
sum(case when prev_dt >= dateandtime - interval '8' second
then 0 else 1
end) over (partition by customerid order by dateandtime
) as row_number
from (select t.*,
max(dateandtime) over (partition by customerid order by dateandtime rows between 1 preceding and 1 preceding) as prev_dt
from t
) t;
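The flag-and-cumulative-sum idea can be sketched in a few lines of Python. This is only an illustration under the chaining interpretation described above; gap_groups and parse are hypothetical names, and the input is the sample data from the question.

```python
from datetime import datetime, timedelta
from itertools import accumulate


def parse(times):
    """Hypothetical helper: turn 'YYYY-MM-DD HH:MM:SS' strings into datetimes."""
    return [datetime.strptime(t, "%Y-%m-%d %H:%M:%S") for t in times]


def gap_groups(timestamps, max_gap=timedelta(seconds=8)):
    """Return one group number per timestamp, in sorted order.

    A row gets flag 1 when it has no previous row or the previous row is
    more than max_gap away, otherwise flag 0. The running sum of the
    flags is the group number.
    """
    ordered = sorted(timestamps)
    if not ordered:
        return []
    flags = [1] + [
        0 if cur - prev <= max_gap else 1
        for prev, cur in zip(ordered, ordered[1:])
    ]
    return list(accumulate(flags))


# Sample purchases of customer 100 from the question.
purchases = parse([
    "2017-02-10 10:10:01", "2017-02-10 10:10:06", "2017-02-10 10:10:08",
    "2017-02-10 10:11:01", "2017-02-10 10:11:05", "2017-02-10 10:11:08",
    "2017-02-10 10:12:10",
])
# Rows 5 seconds apart chain into one group, although the span is 15 seconds.
chain = parse([
    "2017-02-10 10:10:00", "2017-02-10 10:10:05",
    "2017-02-10 10:10:10", "2017-02-10 10:10:15",
])
```

The chain example shows the difference in definition: every row is within 8 seconds of its neighbour, so all four rows land in the same group.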

Using Teradata's PERIOD data type and the awesome td_normalize_overlap_meet:
Consider table test32:
SELECT * FROM test32
+----+----+------------------------+
| f1 | f2 | f3 |
+----+----+------------------------+
| 1 | 2 | 2017-05-11 03:59:00 PM |
| 1 | 3 | 2017-05-11 03:59:01 PM |
| 1 | 4 | 2017-05-11 03:58:58 PM |
| 1 | 5 | 2017-05-11 03:59:26 PM |
| 1 | 2 | 2017-05-11 03:59:28 PM |
| 1 | 2 | 2017-05-11 03:59:46 PM |
+----+----+------------------------+
The following will group your records:
WITH
normalizedCTE AS
(
SELECT *
FROM TABLE
(
td_normalize_overlap_meet(NEW VARIANT_TYPE(periodCTE.f1), periodCTE.fper)
RETURNS (f1 integer, fper PERIOD(TIMESTAMP(0)), recordCount integer)
HASH BY f1
LOCAL ORDER BY f1, fper
) as output(f1, fper, recordcount)
),
periodCTE AS
(
SELECT f1, f2, f3, PERIOD(f3, f3 + INTERVAL '9' SECOND) as fper FROM test32
)
SELECT t2.f1, t2.f2, t2.f3, t1.fper, DENSE_RANK() OVER (PARTITION BY t2.f1 ORDER BY t1.fper) as fgroup
FROM normalizedCTE t1
INNER JOIN periodCTE t2 ON
t1.fper P_INTERSECT t2.fper IS NOT NULL
Results:
+----+----+------------------------+-------------+
| f1 | f2 | f3 | fgroup |
+----+----+------------------------+-------------+
| 1 | 2 | 2017-05-11 03:59:00 PM | 1 |
| 1 | 3 | 2017-05-11 03:59:01 PM | 1 |
| 1 | 4 | 2017-05-11 03:58:58 PM | 1 |
| 1 | 5 | 2017-05-11 03:59:26 PM | 2 |
| 1 | 2 | 2017-05-11 03:59:28 PM | 2 |
| 1 | 2 | 2017-05-11 03:59:46 PM | 3 |
+----+----+------------------------+-------------+
A Period in Teradata is a special data type that holds a date or datetime range. The first parameter is the start of the range and the second is the end (up to, but not including, which is why it is "+ 9 seconds"). The result is an 8-second "Period" for each record, which might "intersect" with another record's period.
We then use td_normalize_overlap_meet to merge records that intersect, sharing the f1 field's value as the key. In your case that would be customerid. The result is three records for this one customer since we have three groups that "overlap" or "meet" each other's time periods.
We then join the td_normalize_overlap_meet output with the output from when we determined the periods. We use the P_INTERSECT function to see which periods from the normalized CTE INTERSECT with the periods from the initial Period CTE. From the result of that P_INTERSECT join we grab the values we need from each CTE.
Lastly, Dense_Rank() gives us a rank based on the normalized period for each group.
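For readers without a Teradata system at hand, the merge step can be imitated in plain Python. This is a rough sketch of the idea only, not of how td_normalize_overlap_meet is implemented: normalize_groups is a made-up name, each timestamp is treated as the half-open range [t, t + 9 seconds), and ranges that overlap or meet are merged.

```python
from datetime import datetime, timedelta


def normalize_groups(timestamps, width=timedelta(seconds=9)):
    """Return one group number per timestamp, in the original input order.

    Each timestamp t stands for the range [t, t + width). Ranges that
    overlap or meet (next start <= current end) are merged into a group.
    """
    order = sorted(range(len(timestamps)), key=lambda i: timestamps[i])
    groups = [0] * len(timestamps)
    group, current_end = 0, None
    for i in order:
        start = timestamps[i]
        if current_end is None or start > current_end:
            # Neither overlaps nor meets the merged range: new group.
            group += 1
            current_end = start + width
        else:
            # Extend the merged range if this one ends later.
            current_end = max(current_end, start + width)
        groups[i] = group
    return groups


# The f3 values of test32, in the order shown in the table above.
f3 = [
    datetime(2017, 5, 11, 15, 59, 0),
    datetime(2017, 5, 11, 15, 59, 1),
    datetime(2017, 5, 11, 15, 58, 58),
    datetime(2017, 5, 11, 15, 59, 26),
    datetime(2017, 5, 11, 15, 59, 28),
    datetime(2017, 5, 11, 15, 59, 46),
]
```

Note that in this sketch two rows exactly 9 seconds apart "meet" and therefore end up in the same group; if that is not what you want, a shorter width is one option to try.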

Related

SQL query to find the visitor together with the date time

My visitor log table has id, visitor, Visittime and Department_id fields.
id | visitor | Visittime | Department_id
--------------------------------------------------------------
1 1 2019-05-07 13:53:50 1
2 2 2019-05-07 13:56:54 1
3 1 2019-05-07 14:54:10 3
4 2 2019-05-08 13:54:49 1
5 1 2019-05-08 13:58:15 1
6 2 2019-05-08 18:54:30 2
7 1 2019-05-08 18:54:37 2
I already have the following index:
CREATE INDEX Idx_VisitorLog_Visitor_VisitTime_Includes ON VisitorLog
(Visitor, VisitTime) INCLUDE (DepartmentId, ID)
Four filters are passed from the user interface: visitor 1, visitor 2, and the visit start and end times.
I need to find the departments that visitor 1 and visitor 2 both visited with a VisitTime difference within 5 minutes, and return those rows.
The output should be
id | visitor | Visittime | Department_id
--------------------------------------------------------------
1 1 2019-05-07 13:53:50 1
2 2 2019-05-07 13:56:54 1
4 2 2019-05-08 13:54:49 1
5 1 2019-05-08 13:58:15 1
For that I had used the following query,
;with CTE1 AS(
Select id,visitor,Visittime,department_id from visitorlog where visitor=1
)
,CTE2 AS(
Select id,visitor,Visittime,department_id from visitorlog where visitor=2
)
select * from CTE2 V2
Inner join CTE1 V1 on V2.department_id=V1.department_id and DATEDIFF(minute,V2.Visittime,V1.Visittime) between -5 and 5
The above query takes too long to respond, because my table holds almost 20 million records.
Could anyone suggest the correct way to meet my requirement?
Thanks in advance
This is a completely revised answer, based upon your additional information above.
After reviewing the data file above and the results you desire, this seems like the cleanest way to provide your results. First, we need a different index:
create index idx_POC_visitorlog on visitorlog
(visitor, Department_id, Visittime) include(id);
With this index, we can limit the queries to only the two passed in IDs. To simulate that, I created variables to hold their values. This query returns the data you are looking for.
DECLARE @Visitor1 int = 1,
@Visitor2 int = 2
;with t as (
select Department_id,
dateadd(minute, -5, visittime) as EarlyTime,
dateadd(minute, 5, Visittime) as LateTime,
id
from visitorlog
where visitor = @Visitor1
),
v as (
select v.id,
t.id as tid
from visitorlog v
INNER JOIN t
ON v.visitor = @Visitor2
AND v.Department_id = t.Department_id
and v.Visittime BETWEEN t.EarlyTime and t.LateTime
)
SELECT *
FROM visitorlog vl
WHERE ID IN (
SELECT v.id
FROM v
UNION
SELECT v.tid
FROM v
)
ORDER BY visittime;
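The matching rule behind this query can be checked on the sample rows with a small Python sketch. It is a brute-force illustration only (quadratic, so not something to run against 20 million rows); the names VISITS, ts and together are made up for the example.

```python
from datetime import datetime, timedelta


def ts(text):
    """Hypothetical helper: parse 'YYYY-MM-DD HH:MM:SS'."""
    return datetime.strptime(text, "%Y-%m-%d %H:%M:%S")


# (id, visitor, visit_time, department_id), copied from the question.
VISITS = [
    (1, 1, ts("2019-05-07 13:53:50"), 1),
    (2, 2, ts("2019-05-07 13:56:54"), 1),
    (3, 1, ts("2019-05-07 14:54:10"), 3),
    (4, 2, ts("2019-05-08 13:54:49"), 1),
    (5, 1, ts("2019-05-08 13:58:15"), 1),
    (6, 2, ts("2019-05-08 18:54:30"), 2),
    (7, 1, ts("2019-05-08 18:54:37"), 2),
]


def together(visits, visitor1, visitor2, window=timedelta(minutes=5)):
    """Return the sorted ids of rows where both visitors were in the same
    department within `window` of each other."""
    first = [v for v in visits if v[1] == visitor1]
    second = [v for v in visits if v[1] == visitor2]
    ids = set()
    for a in first:
        for b in second:
            if a[3] == b[3] and abs(a[2] - b[2]) <= window:
                ids.update((a[0], b[0]))
    return sorted(ids)
```

On the sample data this also returns ids 6 and 7 (department 2, seven seconds apart), which the expected output in the question leaves out; it may be worth checking whether that omission was intended.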
If your version of SQL Server supports the LAG and LEAD functions, try rewriting the query as follows:
with t as (
select
*,
dateadd(minute, 5,
lag(Visittime) over(partition by Department_id order by Visittime)) lag_visit_time,
dateadd(minute, -5,
lead(Visittime) over(partition by Department_id order by Visittime)) lead_visit_time
from visitorlog
where visitor in(1, 2)
)
select
id, visitor, visittime, department_id
from t
where lag_visit_time >= Visittime or lead_visit_time <= Visittime;
This index is called a POC.
Results:
+----+---------+----------------------+---------------+
| id | visitor | visittime | department_id |
+----+---------+----------------------+---------------+
| 1 | 1 | 2019-05-07T13:53:50Z | 1 |
| 2 | 2 | 2019-05-07T13:56:54Z | 1 |
| 4 | 2 | 2019-05-08T13:54:49Z | 1 |
| 5 | 1 | 2019-05-08T13:58:15Z | 1 |
| 6 | 2 | 2019-05-08T18:54:30Z | 2 |
| 7 | 1 | 2019-05-08T18:54:37Z | 2 |
+----+---------+----------------------+---------------+
Demo.

Is it possible to do projection in Google Big Query?

I have a query (due to restrictions, it is using Legacy SQL) that produces a column that is the rolling average of last 3 days of sale (excluding today)
SELECT
id, date, sales, AVG(sales) OVER (PARTITION BY id ORDER BY date RANGE BETWEEN 4 PRECEDING AND 1 PRECEDING) AS projected_sale
FROM tableA
tableA
+-------+---------+---------+
| id | date | sales |
+-------+---------+---------+
| 1 | 01-01-17| 5 |
| 1 | 01-02-17| 6 |
| 1 | 01-03-17| 7 |
| 1 | 01-04-17| 10 |
+-------+---------+---------+
The query produces
+-------+---------+---------+--------------+
| id | date | sales |projected_sale|
+-------+---------+---------+--------------+
| 1 | 01-01-17| 5 | . |
| 1 | 01-02-17| 6 | . |
| 1 | 01-03-17| 7 | . |
| 1 | 01-04-17| 10 | 6 |
+-------+---------+---------+--------------+
Since the average excludes the current row, theoretically I can project the sale for 01-05-17 using the sales from 01-02 to 01-04. However, since tableA doesn't actually have an entry with date 01-05-17, my query stops at 01-04-17 as the last row.
Is what I am trying to do possible in Big Query?
Thank you
First, I think using RANGE is incorrect here; it should be ROWS instead.
Anyway, below is an example for BigQuery Legacy SQL that demonstrates how to achieve the result you need.
#legacySQL
SELECT
id, dt, sales,
AVG(sales) OVER (
PARTITION BY id ORDER BY dt
ROWS BETWEEN 4 PRECEDING AND 1 PRECEDING
) AS projected_sale
FROM tableA, (SELECT 1 id, '01-05-17' dt, 0 sales)
As you can see, you simply add the missing day (a comma between tables means UNION ALL in Legacy SQL). Of course, you can extend this so that it adds the missing row for every id.
Nevertheless, I hope this is a good starting point for you.
You can test and play with it using the dummy data from your question:
#legacySQL
SELECT
id, dt, sales,
AVG(sales) OVER (
PARTITION BY id ORDER BY dt
ROWS BETWEEN 4 PRECEDING AND 1 PRECEDING
) AS projected_sale
FROM (
SELECT * FROM
(SELECT 1 id, '01-01-17' dt, 5 sales),
(SELECT 1 id, '01-02-17' dt, 6 sales),
(SELECT 1 id, '01-03-17' dt, 7 sales),
(SELECT 1 id, '01-04-17' dt, 10 sales)
) tableA, (SELECT 1 id, '01-05-17' dt, 0 sales)
with result as
Row id dt sales projected_sale
1 1 01-01-17 5 null
2 1 01-02-17 6 5.0
3 1 01-03-17 7 5.5
4 1 01-04-17 10 6.0
5 1 01-05-17 0 7.0
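The same trick can be tried outside BigQuery. Below is a small Python sketch, for illustration only, that mimics ROWS BETWEEN 4 PRECEDING AND 1 PRECEDING and appends a placeholder row for the day to be projected; projected_sales, table_a and with_placeholder are names invented for this example.

```python
def projected_sales(rows, window=4):
    """rows: list of (date_label, sales), already in date order.

    For every row, average the sales of up to `window` preceding rows,
    excluding the current row. The first row has nothing to average.
    """
    out = []
    for i, (label, sales) in enumerate(rows):
        previous = [s for _, s in rows[max(0, i - window):i]]
        avg = sum(previous) / len(previous) if previous else None
        out.append((label, sales, avg))
    return out


# Data from the question for id 1.
table_a = [("01-01-17", 5), ("01-02-17", 6), ("01-03-17", 7), ("01-04-17", 10)]
# The placeholder row: its own sales value (0) never enters its average.
with_placeholder = table_a + [("01-05-17", 0)]
```

Because the frame ends at 1 PRECEDING, the dummy sales value of the placeholder row does not affect its own projection.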

sql group by personalised condition

Hi, I have a table as below
+--------+--------+
| day | amount|
+--------+--------+
| 2 | 2 |
| 1 | 3 |
| 1 | 4 |
| 2 | 2 |
| 3 | 3 |
| 4 | 3 |
+--------+--------+
Now I want something like this: the sum of day 1 to day 2 as row one, the sum of day 1 to day 3 as row two, and so on.
+--------+--------+
| day | amount|
+--------+--------+
| 1-2 | 11 |
| 1-3 | 14 |
| 1-4 | 17 |
+--------+--------+
Could anyone offer some help? Thanks!
with data as(
select 2 day, 2 amount from dual union all
select 1 day, 3 amount from dual union all
select 1 day, 4 amount from dual union all
select 2 day, 2 amount from dual union all
select 3 day, 3 amount from dual union all
select 4 day, 3 amount from dual)
select distinct day, sum(amount) over (order by day range unbounded preceding) cume_amount
from data
order by 1;
DAY CUME_AMOUNT
---------- -----------
1 7
2 11
3 14
4 17
If you are using Oracle, you can do something like the above.
Assuming the day range in the left column always starts from "1-", what you need is a query that computes a cumulative sum over the grouped table (dayWiseSum below). Since it needs to be accessed twice, I'd put it into a temporary table.
CREATE TEMPORARY TABLE dayWiseSum AS
(SELECT day,SUM(amount) AS amount FROM table1 GROUP BY day ORDER BY day);
SELECT CONCAT("1-",t1.day) AS day, SUM(t2.amount) AS amount
FROM dayWiseSum t1 INNER JOIN dayWiseSum t2
ON t1.day >= t2.day -- >= so that each range includes its own last day
WHERE t1.day > 1 -- remove this filter if you want to include "1-1"
GROUP BY t1.day, t1.amount ORDER BY t1.day;
DROP TABLE dayWiseSum;
Here's a fiddle to test with:
http://sqlfiddle.com/#!9/c1656/1/0
Note: Since sqlfiddle doesn't allow CREATE statements, I've replaced dayWiseSum with its query there. Also, I've used the "Text to DDL" option to paste the exact text of the table from your question to generate the create table query :)
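If it helps to see the two steps (sum per day, then running total) without a database, here is a short Python sketch using the rows from the question. The function name cumulative_by_day is made up for this illustration.

```python
from collections import defaultdict
from itertools import accumulate


def cumulative_by_day(rows):
    """rows: iterable of (day, amount).

    Step 1: sum the amounts per day.
    Step 2: build a running total over the days in ascending order.
    The first day on its own (e.g. "1-1") is left out of the result.
    """
    totals = defaultdict(int)
    for day, amount in rows:
        totals[day] += amount
    days = sorted(totals)
    running = accumulate(totals[d] for d in days)
    return [
        (f"{days[0]}-{d}", total)
        for d, total in zip(days, running)
        if d != days[0]
    ]


# Rows from the question.
rows = [(2, 2), (1, 3), (1, 4), (2, 2), (3, 3), (4, 3)]
```

For the sample rows this gives 11, 14 and 17 for the ranges 1-2, 1-3 and 1-4, matching the result asked for in the question.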

Querying DAU/MAU over time (daily)

I have a daily sessions table with columns user_id and date. I'd like to graph out DAU/MAU (daily active users / monthly active users) on a daily basis. For example:
Date MAU DAU DAU/MAU
2014-06-01 20,000 5,000 20%
2014-06-02 21,000 4,000 19%
2014-06-03 20,050 3,050 17%
... ... ... ...
Calculating daily active users is straightforward, but calculating the monthly active users, i.e. the number of users that logged in during the 30 days up to and including today, is causing problems. How is this achieved without a left join for each day?
Edit: I'm using Postgres.
Assuming you have values for each day, you can get the total counts using a subquery and a rows between window frame:
with dau as (
select date, count(user_id) as dau
from dailysessions ds
group by date
)
select date, dau,
sum(dau) over (order by date rows between 29 preceding and current row) as mau
from dau;
Unfortunately, I think you want distinct users rather than just user counts. That makes the problem much more difficult, especially because Postgres doesn't support count(distinct) as a window function.
I think you have to do some sort of self join for this. Here is one method:
with dau as (
select date, count(distinct user_id) as dau
from dailysessions ds
group by date
)
select date, dau,
(select count(distinct user_id)
from dailysessions ds
where ds.date between dau.date - 29 * interval '1 day' and dau.date
) as mau
from dau;
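To check what the distinct-user version should return, the definition can be written out in Python on a handful of rows. This is a sketch under the assumption that MAU means distinct users seen in the 30 days ending on the given day; dau_mau and the sample sessions are invented for the illustration.

```python
from datetime import date, timedelta


def dau_mau(sessions, window_days=30):
    """sessions: iterable of (user_id, day).

    Returns [(day, dau, mau)] for every day that has activity, where
    dau = distinct users on that day and
    mau = distinct users in the window_days ending on that day.
    """
    users_by_day = {}
    for user_id, day in sessions:
        users_by_day.setdefault(day, set()).add(user_id)

    result = []
    for day in sorted(users_by_day):
        start = day - timedelta(days=window_days - 1)
        active = set()
        for other, users in users_by_day.items():
            if start <= other <= day:
                active |= users
        result.append((day, len(users_by_day[day]), len(active)))
    return result


# Hypothetical sample data: user 2 returns after more than 30 days.
sessions = [
    (1, date(2014, 6, 1)), (2, date(2014, 6, 1)),
    (1, date(2014, 6, 2)), (3, date(2014, 6, 2)),
    (2, date(2014, 7, 5)),
]
```

Summing daily counts would report 4 for 2014-06-02 here, while the distinct count is 3, which is the difference described above.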
This one uses COUNT DISTINCT to get the rolling 30 days DAU/MAU:
(calculating reddit's user engagement in BigQuery - but the SQL is standard enough to be used on other databases)
SELECT day, dau, mau, INTEGER(100*dau/mau) daumau
FROM (
SELECT day, EXACT_COUNT_DISTINCT(author) dau, FIRST(mau) mau
FROM (
SELECT DATE(SEC_TO_TIMESTAMP(created_utc)) day, author
FROM [fh-bigquery:reddit_comments.2015_09]
WHERE subreddit='AskReddit') a
JOIN (
SELECT stopday, EXACT_COUNT_DISTINCT(author) mau
FROM (SELECT created_utc, subreddit, author FROM [fh-bigquery:reddit_comments.2015_09], [fh-bigquery:reddit_comments.2015_08]) a
CROSS JOIN (
SELECT DATE(SEC_TO_TIMESTAMP(created_utc)) stopday
FROM [fh-bigquery:reddit_comments.2015_09]
GROUP BY 1
) b
WHERE subreddit='AskReddit'
AND SEC_TO_TIMESTAMP(created_utc) BETWEEN DATE_ADD(stopday, -30, 'day') AND TIMESTAMP(stopday)
GROUP BY 1
) b
ON a.day=b.stopday
GROUP BY 1
)
ORDER BY 1
I went further at How to calculate DAU/MAU with BigQuery (engagement)
I've written about this on my blog.
The DAU is easy, as you noticed. You can solve the MAU by first creating a view with boolean values for when a user activates and de-activates, like so:
CREATE OR REPLACE VIEW "vw_login" AS
SELECT *
, LEAST (LEAD("date") OVER w, "date" + 30) AS "activeExpiry"
, CASE WHEN LAG("date") OVER w IS NULL THEN true ELSE false END AS "activated"
, CASE
WHEN LEAD("date") OVER w IS NULL THEN true
WHEN LEAD("date") OVER w - "date" > 30 THEN true
ELSE false
END AS "churned"
, CASE
WHEN LAG("date") OVER w IS NULL THEN false
WHEN "date" - LAG("date") OVER w <= 30 THEN false
WHEN row_number() OVER w > 1 THEN true
ELSE false
END AS "resurrected"
FROM "login"
WINDOW w AS (PARTITION BY "user_id" ORDER BY "date")
This creates boolean values per user per day when they become active, when they churn and when they re-activate.
Then do a daily aggregate of the same:
CREATE OR REPLACE VIEW "vw_activity" AS
SELECT
SUM("activated"::int) "activated"
, SUM("churned"::int) "churned"
, SUM("resurrected"::int) "resurrected"
, "date"
FROM "vw_login"
GROUP BY "date"
;
And finally calculate running totals of active MAUs by calculating the cumulative sums over the columns. You need to join the vw_activity twice, since the second one is joined to the day when the user becomes inactive (i.e. 30 days since their last login).
I've included a date series in order to ensure that all days are present in your dataset. You can do without it too, but you might skip days in your dataset.
SELECT
d."date"
, SUM(COALESCE(a.activated::int,0)
- COALESCE(a2.churned::int,0)
+ COALESCE(a.resurrected::int,0)) OVER w
, d."date", a."activated", a2."churned", a."resurrected" FROM
generate_series('2010-01-01'::date, CURRENT_DATE, '1 day'::interval) AS d("date")
LEFT OUTER JOIN vw_activity a ON d."date" = a."date"
LEFT OUTER JOIN vw_activity a2 ON d."date" = (a2."date" + INTERVAL '30 days')::date
WINDOW w AS (ORDER BY d."date") ORDER BY d."date";
You can of course do this in a single query, but this helps understand the structure better.
You didn't show us your complete table definition, but maybe something like this:
select date,
count(*) over (partition by date_trunc('day', date) order by date) as dau,
count(*) over (partition by date_trunc('month', date) order by date) as mau
from sessions
order by date;
To get the percentage without repeating the window functions, just wrap this in a derived table:
select date,
dau,
mau,
dau::numeric / (case when mau = 0 then null else mau end) as pct
from (
select date,
count(*) over (partition by date_trunc('day', date) order by date) as dau,
count(*) over (partition by date_trunc('month', date) order by date) as mau
from sessions
) t
order by date;
Here is an example output:
postgres=> select * from sessions;
session_date | user_id
--------------+---------
2014-05-01 | 1
2014-05-01 | 2
2014-05-01 | 3
2014-05-02 | 1
2014-05-02 | 2
2014-05-02 | 3
2014-05-02 | 4
2014-05-02 | 5
2014-06-01 | 1
2014-06-01 | 2
2014-06-01 | 3
2014-06-02 | 1
2014-06-02 | 2
2014-06-02 | 3
2014-06-02 | 4
2014-06-03 | 1
2014-06-03 | 2
2014-06-03 | 3
2014-06-03 | 4
2014-06-03 | 5
(20 rows)
postgres=> select session_date,
postgres-> dau,
postgres-> mau,
postgres-> round(dau::numeric / (case when mau = 0 then null else mau end),2) as pct
postgres-> from (
postgres(> select session_date,
postgres(> count(*) over (partition by date_trunc('day', session_date) order by session_date) as dau,
postgres(> count(*) over (partition by date_trunc('month', session_date) order by session_date) as mau
postgres(> from sessions
postgres(> ) t
postgres-> order by session_date;
session_date | dau | mau | pct
--------------+-----+-----+------
2014-05-01 | 3 | 3 | 1.00
2014-05-01 | 3 | 3 | 1.00
2014-05-01 | 3 | 3 | 1.00
2014-05-02 | 5 | 8 | 0.63
2014-05-02 | 5 | 8 | 0.63
2014-05-02 | 5 | 8 | 0.63
2014-05-02 | 5 | 8 | 0.63
2014-05-02 | 5 | 8 | 0.63
2014-06-01 | 3 | 3 | 1.00
2014-06-01 | 3 | 3 | 1.00
2014-06-01 | 3 | 3 | 1.00
2014-06-02 | 4 | 7 | 0.57
2014-06-02 | 4 | 7 | 0.57
2014-06-02 | 4 | 7 | 0.57
2014-06-02 | 4 | 7 | 0.57
2014-06-03 | 5 | 12 | 0.42
2014-06-03 | 5 | 12 | 0.42
2014-06-03 | 5 | 12 | 0.42
2014-06-03 | 5 | 12 | 0.42
2014-06-03 | 5 | 12 | 0.42
(20 rows)
postgres=>

SQL duration between two dates in different rows

I would really appreciate some assistance if somebody could help me construct an MSSQL Server 2000 query that returns the duration between a customer's A entry and their B entry.
Not all customers are expected to have a B record, in which case no results would be returned for them.
Customers Audit
+---+---------------+---+----------------------+
| 1 | Peter Griffin | A | 2013-01-01 15:00:00 |
| 2 | Martin Biggs | A | 2013-01-02 15:00:00 |
| 3 | Peter Griffin | C | 2013-01-05 09:00:00 |
| 4 | Super Mario | A | 2013-01-01 15:00:00 |
| 5 | Martin Biggs | B | 2013-01-03 18:00:00 |
+---+---------------+---+----------------------+
I'm hoping for results similar to:
+--------------+----------------+
| Martin Biggs | 1 day, 3 hours |
+--------------+----------------+
Something like the below (don't know your schema, so you'll need to change names of objects) should suffice.
SELECT ABS(DATEDIFF(HOUR, CA.TheDate, CB.TheDate)) AS HoursBetween
FROM dbo.Customers CA
INNER JOIN dbo.Customers CB
ON CB.Name = CA.Name
AND CB.Code = 'B'
WHERE CA.Code = 'A'
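To turn the difference into the "1 day, 3 hours" text from the question, the pairing and formatting can be sketched in Python as below. This is an illustration only: a_to_b_durations and AUDIT are made-up names, and it assumes at most one A and one B entry per customer.

```python
from datetime import datetime

# (id, name, code, timestamp), copied from the question.
AUDIT = [
    (1, "Peter Griffin", "A", datetime(2013, 1, 1, 15, 0)),
    (2, "Martin Biggs", "A", datetime(2013, 1, 2, 15, 0)),
    (3, "Peter Griffin", "C", datetime(2013, 1, 5, 9, 0)),
    (4, "Super Mario", "A", datetime(2013, 1, 1, 15, 0)),
    (5, "Martin Biggs", "B", datetime(2013, 1, 3, 18, 0)),
]


def a_to_b_durations(audit):
    """Return {name: 'N day(s), M hour(s)'} for customers with an A and a B entry.

    Assumes at most one A and one B entry per customer.
    """
    a_times = {name: t for _, name, code, t in audit if code == "A"}
    out = {}
    for _, name, code, t in audit:
        if code == "B" and name in a_times:
            delta = abs(t - a_times[name])
            hours = delta.seconds // 3600
            day_word = "day" if delta.days == 1 else "days"
            hour_word = "hour" if hours == 1 else "hours"
            out[name] = f"{delta.days} {day_word}, {hours} {hour_word}"
    return out
```

Customers without a B entry, such as Peter Griffin and Super Mario in the sample, simply do not appear in the result.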
SELECT A.CUSTOMER, DATEDIFF(HOUR, A.ENTRY_DATE, B.ENTRY_DATE) DURATION
FROM CUSTOMERSAUDIT A, CUSTOMERSAUDIT B
WHERE B.CUSTOMER = A.CUSTOMER AND B.ENTRY_DATE > A.ENTRY_DATE
This is an Oracle query, but as far as I know all of its features are available in MS SQL Server. I'm sure I do not have to tell you how to concatenate the output to get the desired result. All values in the output will be in separate columns (days, hours, etc.), and it is not always easy to format the output here:
SELECT id, name, grade
, NVL(EXTRACT(DAY FROM day_time_diff), 0) days
, NVL(EXTRACT(HOUR FROM day_time_diff), 0) hours
, NVL(EXTRACT(MINUTE FROM day_time_diff), 0) minutes
, NVL(EXTRACT(SECOND FROM day_time_diff), 0) seconds
FROM
(
SELECT id, name, grade
, (begin_date-end_date) day_time_diff
FROM
(
SELECT id, name, grade
, CAST(start_date AS TIMESTAMP) begin_date
, CAST(end_date AS TIMESTAMP) end_date
FROM
(
SELECT id, name, grade, start_date
, LAG(start_date, 1, to_date(null)) OVER (ORDER BY id) end_date
FROM stack_test
)
)
)
/
Output:
ID NAME GRADE DAYS HOURS MINUTES SECONDS
------------------------------------------------------------
1 Peter Griffin A 0 0 0 0
2 Martin Biggs A 1 1 0 0
3 Peter Griffin C 2 17 0 0
4 Super Mario A -3 -18 0 0
5 Martin Biggs A 2 3 0 0
The table structure/columns I used - it would be great if you took care of this and data in advance:
CREATE TABLE stack_test
(
id NUMBER
,name VARCHAR2(50)
,grade VARCHAR2(3)
,start_date DATE
)
/