SQL query to find number of successes between failures

I have an Oracle database table that contains test result records. Each record contains the test START_TIME, the INSTRUMENT that the test was performed on, and an ERROR_CODE if an error occurred during the test, among other information.
For every record with an ERROR_CODE equal to '5900', '6000' or '5905', I need to determine the number of successful tests (ERROR_CODE is null) that have occurred on that INSTRUMENT before the datetime of the error record. In other words, I need to know the number of successful tests performed on the instrument before an error was generated.
The database contains over 500 instruments that can each have between 1 and 500,000 test records.
Notes: Only interested in number of successes before ERROR_CODES '5900', '6000' and '5905'. Some instruments may have zero of those errors. Some instruments may have multiple consecutive errors, with no success between them. An error may have occurred on that instrument's first or last test.
Example:
START_TIME INSTRUMENT ERROR_CODE
12/1/2015 22:15:03 A540 null
12/1/2015 22:17:14 A700 null
12/1/2015 22:17:53 A700 null
12/1/2015 22:19:24 A700 5905
12/1/2015 23:28:15 A700 null
12/1/2015 23:35:10 A540 6000
12/2/2015 02:15:13 A540 5900
12/2/2015 03:07:03 A540 null
12/2/2015 03:44:52 A540 null
12/2/2015 09:15:56 A700 null
12/2/2015 14:17:09 A700 5900
12/2/2015 17:15:42 A980 null
12/3/2015 08:17:53 A540 5900
12/3/2015 08:18:49 A540 5900
12/3/2015 11:17:57 A540 null
This should give the following results:
ERROR_TIME INSTRUMENT SUCCESSES_BEFORE_ERROR
12/1/2015 22:19:24 A700 2
12/1/2015 23:35:10 A540 1
12/2/2015 02:15:13 A540 1
12/2/2015 14:17:09 A700 4
12/3/2015 08:17:53 A540 3
12/3/2015 08:18:49 A540 3

Here's a way using analytic functions:
WITH test_results AS (SELECT to_date('12/01/2015 22:15:03', 'mm/dd/yyyy hh24:mi:ss') start_time, 'A540' instrument, NULL ERROR_CODE FROM dual UNION ALL
SELECT to_date('12/01/2015 22:17:14', 'mm/dd/yyyy hh24:mi:ss') start_time, 'A700' instrument, NULL ERROR_CODE FROM dual UNION ALL
SELECT to_date('12/01/2015 22:17:53', 'mm/dd/yyyy hh24:mi:ss') start_time, 'A700' instrument, NULL ERROR_CODE FROM dual UNION ALL
SELECT to_date('12/01/2015 22:19:24', 'mm/dd/yyyy hh24:mi:ss') start_time, 'A700' instrument, 5905 ERROR_CODE FROM dual UNION ALL
SELECT to_date('12/01/2015 23:28:15', 'mm/dd/yyyy hh24:mi:ss') start_time, 'A700' instrument, NULL ERROR_CODE FROM dual UNION ALL
SELECT to_date('12/01/2015 23:35:10', 'mm/dd/yyyy hh24:mi:ss') start_time, 'A540' instrument, 6000 ERROR_CODE FROM dual UNION ALL
SELECT to_date('12/02/2015 02:15:13', 'mm/dd/yyyy hh24:mi:ss') start_time, 'A540' instrument, 5900 ERROR_CODE FROM dual UNION ALL
SELECT to_date('12/02/2015 03:07:03', 'mm/dd/yyyy hh24:mi:ss') start_time, 'A540' instrument, NULL ERROR_CODE FROM dual UNION ALL
SELECT to_date('12/02/2015 03:44:52', 'mm/dd/yyyy hh24:mi:ss') start_time, 'A540' instrument, NULL ERROR_CODE FROM dual UNION ALL
SELECT to_date('12/02/2015 09:15:56', 'mm/dd/yyyy hh24:mi:ss') start_time, 'A700' instrument, NULL ERROR_CODE FROM dual UNION ALL
SELECT to_date('12/02/2015 14:17:09', 'mm/dd/yyyy hh24:mi:ss') start_time, 'A700' instrument, 5900 ERROR_CODE FROM dual UNION ALL
SELECT to_date('12/02/2015 17:15:42', 'mm/dd/yyyy hh24:mi:ss') start_time, 'A980' instrument, NULL ERROR_CODE FROM dual UNION ALL
SELECT to_date('12/03/2015 08:17:53', 'mm/dd/yyyy hh24:mi:ss') start_time, 'A540' instrument, 5900 ERROR_CODE FROM dual UNION ALL
SELECT to_date('12/03/2015 08:18:49', 'mm/dd/yyyy hh24:mi:ss') start_time, 'A540' instrument, 5900 ERROR_CODE FROM dual UNION ALL
SELECT to_date('12/03/2015 11:17:57', 'mm/dd/yyyy hh24:mi:ss') start_time, 'A540' instrument, NULL ERROR_CODE FROM dual)
-- end of mimicking a table with data in it called "test_results"
-- for use in the following select statement:
SELECT start_time,
instrument,
running_total success_before_error
FROM (SELECT start_time,
instrument,
ERROR_CODE,
sum(CASE WHEN ERROR_CODE IS NOT NULL THEN 0
ELSE 1
END) OVER (PARTITION BY instrument ORDER BY start_time) running_total
FROM test_results)
WHERE ERROR_CODE IS NOT NULL -- this may need to be "error_code in (5900, 6000, 5905)"
ORDER BY start_time;
START_TIME INSTRUMENT SUCCESS_BEFORE_ERROR
------------------- ---------- --------------------
12/01/2015 22:19:24 A700 2
12/01/2015 23:35:10 A540 1
12/02/2015 02:15:13 A540 1
12/02/2015 14:17:09 A700 4
12/03/2015 08:17:53 A540 3
12/03/2015 08:18:49 A540 3
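As a cross-check of the running-total logic, here is a minimal Python sketch (not part of the original answer) that walks the question's sample rows in START_TIME order and keeps one success counter per instrument. The function name `successes_before_errors` and the tuple layout are assumptions made for illustration; `None` stands in for a NULL ERROR_CODE.

```python
from collections import defaultdict

# (start_time, instrument, error_code), already sorted by start_time.
# None stands in for a NULL error_code, i.e. a successful test.
rows = [
    ("2015-12-01 22:15:03", "A540", None),
    ("2015-12-01 22:17:14", "A700", None),
    ("2015-12-01 22:17:53", "A700", None),
    ("2015-12-01 22:19:24", "A700", 5905),
    ("2015-12-01 23:28:15", "A700", None),
    ("2015-12-01 23:35:10", "A540", 6000),
    ("2015-12-02 02:15:13", "A540", 5900),
    ("2015-12-02 03:07:03", "A540", None),
    ("2015-12-02 03:44:52", "A540", None),
    ("2015-12-02 09:15:56", "A700", None),
    ("2015-12-02 14:17:09", "A700", 5900),
    ("2015-12-02 17:15:42", "A980", None),
    ("2015-12-03 08:17:53", "A540", 5900),
    ("2015-12-03 08:18:49", "A540", 5900),
    ("2015-12-03 11:17:57", "A540", None),
]


def successes_before_errors(rows, codes=(5900, 6000, 5905)):
    """Return (start_time, instrument, successes so far) for each listed error."""
    running = defaultdict(int)  # per-instrument running total of successes
    out = []
    for start_time, instrument, error_code in rows:
        if error_code is None:
            running[instrument] += 1  # same as the CASE ... ELSE 1 branch
        elif error_code in codes:
            out.append((start_time, instrument, running[instrument]))
    return out


for row in successes_before_errors(rows):
    print(row)
```

The counter plays the role of SUM(...) OVER (PARTITION BY instrument ORDER BY start_time), and the `codes` filter corresponds to the optional "error_code in (5900, 6000, 5905)" predicate mentioned in the query comment.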

I don't know the source table name, so I call it table_one.
EDIT: I now see that I made a mistake: this calculates the run of consecutive successful tests since the previous error, not the cumulative count. I'll leave it as is.
with ordered_tab as (
select START_TIME
,INSTRUMENT
,ERROR_CODE
,row_number() over (partition by INSTRUMENT order by START_TIME) rn
from table_one)
select START_TIME as ERROR_TIME
,INSTRUMENT
,SUCCESSES_BEFORE_ERROR
FROM (
select START_TIME
,INSTRUMENT
,ERROR_CODE
,rn -1
- nvl(last_value(nvl2(ERROR_CODE,rn,null) ignore nulls)
over (partition by INSTRUMENT order by START_TIME rows between unbounded preceding and 1 preceding),0) as SUCCESSES_BEFORE_ERROR
from ordered_tab
) where ERROR_CODE IN (5905, 5900, 6000)
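To make the difference from a cumulative count concrete, here is a small Python sketch of what this query appears to compute: the number of successes since the previous error of any kind on the same instrument. The function name, the integer timestamps and the single-instrument sample are assumptions for illustration only.

```python
def successes_since_last_error(rows, codes=(5900, 6000, 5905)):
    """rows: (start_time, instrument, error_code) sorted by start_time.

    Mirrors rn - 1 - (rn of the last preceding error row): the streak
    resets after every error, but only the listed codes are reported.
    """
    streak = {}  # successes since the last error, per instrument
    out = []
    for start_time, instrument, error_code in rows:
        if error_code is None:
            streak[instrument] = streak.get(instrument, 0) + 1
        else:
            if error_code in codes:
                out.append((start_time, instrument, streak.get(instrument, 0)))
            streak[instrument] = 0  # any error ends the current streak
    return out


# Instrument A540 from the question, with 1..8 as stand-in timestamps.
a540 = [
    (1, "A540", None), (2, "A540", 6000), (3, "A540", 5900),
    (4, "A540", None), (5, "A540", None), (6, "A540", 5900),
    (7, "A540", 5900), (8, "A540", None),
]
print(successes_since_last_error(a540))
```

For these rows the streak values are 1, 0, 2 and 0, whereas the cumulative counts expected in the question are 1, 1, 3 and 3.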

There may be a way to do this with analytic functions (no doubt there is). But the simplest way to express the logic -- in my opinion -- is to use a correlated subquery:
select t.*,
(select count(*)
from t t2
where t2.instrument = t.instrument and
t2.start_time < t.start_time and
t2.error_code is null
) as SUCCESSES_BEFORE_ERROR
from t
where t.error_code is not null;

Related

Squashing values in a column by another timestamp column

I have the following data, which shows the status of a support ticket:
Edit:
More concise and generic example:
STATUS SEQ_NO
New 1
Open 2
Open 3
Open 4
Queued 5
Open 6
Open 7
Open 8
Completed 9
Completed 10
Completed 11
Closed 12
From this, I would like to extract the records,
STATUS SEQ_NO
New 1
Open 2
Queued 5
Open 6
Completed 9
Closed 12
Original question:
-- SELECT status, start_time FROM events_tab ORDER BY start_time;
STATUS START_TIME
New 30/09/2014 3:48:10 PM -- I want this record,
Open 30/09/2014 3:48:10 PM -- and this,
Open 1/10/2014 10:41:57 AM
Open 4/03/2015 9:59:04 AM
Queued 18/06/2015 1:31:30 PM -- and this,
Open 20/06/2015 10:10:47 PM -- and this,
Open 20/06/2015 11:20:11 PM
Open 27/06/2015 1:18:50 PM
Completed 27/06/2015 1:22:08 PM -- and this,
Completed 28/09/2015 9:31:55 AM
Completed 5/10/2015 11:57:38 AM
Closed 11/01/2016 9:31:26 AM -- and this.
These are events that happened in each state. I want to make a timeline of state changes from it.
I want to squash these records such that only the very first row of each group is shown. However, notice that there are actually two groups of Open status, so I should get two records with Open status.
Basically I want the following result:
STATUS START_TIME
New 30/09/2014 3:48:10 PM
Open 30/09/2014 3:48:10 PM
Queued 18/06/2015 1:31:30 PM
Open 20/06/2015 10:10:47 PM
Completed 27/06/2015 1:22:08 PM
Closed 11/01/2016 9:31:26 AM
How can I achieve this with an SQL statement?
I have tried,
SELECT status, MIN(start_time)
FROM events_tab
GROUP BY status;
But this returns only one record for the Open status, not the two that I intend above.
You can use the Tabibitosan technique to achieve this goal:
WITH your_table AS (SELECT 'New' status, to_date('30/09/2014 03:48:10 PM', 'dd/mm/yyyy hh:mi:ss AM') start_time FROM dual UNION ALL
SELECT 'Open' status, to_date('30/09/2014 03:48:10 PM', 'dd/mm/yyyy hh:mi:ss AM') start_time FROM dual UNION ALL
SELECT 'Open' status, to_date('1/10/2014 10:41:57 AM', 'dd/mm/yyyy hh:mi:ss AM') start_time FROM dual UNION ALL
SELECT 'Open' status, to_date('4/03/2015 09:59:04 AM', 'dd/mm/yyyy hh:mi:ss AM') start_time FROM dual UNION ALL
SELECT 'Queued' status, to_date('18/06/2015 01:31:30 PM', 'dd/mm/yyyy hh:mi:ss AM') start_time FROM dual UNION ALL
SELECT 'Open' status, to_date('20/06/2015 10:10:47 PM', 'dd/mm/yyyy hh:mi:ss AM') start_time FROM dual UNION ALL
SELECT 'Open' status, to_date('20/06/2015 11:20:11 PM', 'dd/mm/yyyy hh:mi:ss AM') start_time FROM dual UNION ALL
SELECT 'Open' status, to_date('27/06/2015 01:18:50 PM', 'dd/mm/yyyy hh:mi:ss AM') start_time FROM dual UNION ALL
SELECT 'Completed' status, to_date('27/06/2015 01:22:08 PM', 'dd/mm/yyyy hh:mi:ss AM') start_time FROM dual UNION ALL
SELECT 'Completed' status, to_date('28/09/2015 09:31:55 AM', 'dd/mm/yyyy hh:mi:ss AM') start_time FROM dual UNION ALL
SELECT 'Completed' status, to_date('5/10/2015 11:57:38 AM', 'dd/mm/yyyy hh:mi:ss AM') start_time FROM dual UNION ALL
SELECT 'Closed' status, to_date('11/01/2016 09:31:26 AM', 'dd/mm/yyyy hh:mi:ss AM') start_time FROM dual)
SELECT status,
MIN(start_time) start_time
FROM (SELECT status,
start_time,
row_number() OVER (ORDER BY start_time, status) - row_number() OVER (PARTITION BY status ORDER BY start_time, status) grp
FROM your_table)
GROUP BY status, grp
ORDER BY start_time, status;
STATUS START_TIME
--------- -------------------
New 30/09/2014 15:48:10
Open 30/09/2014 15:48:10
Queued 18/06/2015 13:31:30
Open 20/06/2015 22:10:47
Completed 27/06/2015 13:22:08
Closed 11/01/2016 09:31:26
N.B. Since you have rows with different statuses having the same start_time, I have added status into the order by, in order to get the results you were after. I don't know if that was a typo, or whether multiple rows really can have the same date.
Also, I assume that the data in your example refers to one "thing", but in your real table, you can have multiple "things" each with their own set of statuses etc.
In that case, you would need to add the column(s) that differentiate the "things" (e.g. id or event_name or etc) into both row_number() analytic functions. (e.g. row_number() over (partition by <thing column(s)> order by start_time, status))
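The grouping trick is easier to see with the numbers written out. The following Python sketch (an illustration, not the answer's code) applies the same difference of two row numbers to the concise STATUS / SEQ_NO example from the question; the function name `first_of_each_run` is an assumption.

```python
from collections import defaultdict

# (status, seq_no) from the question's concise example, sorted by seq_no.
events = [
    ("New", 1), ("Open", 2), ("Open", 3), ("Open", 4), ("Queued", 5),
    ("Open", 6), ("Open", 7), ("Open", 8), ("Completed", 9),
    ("Completed", 10), ("Completed", 11), ("Closed", 12),
]


def first_of_each_run(events):
    """Keep the first row of every run of consecutive equal statuses."""
    per_status_rn = defaultdict(int)  # row_number() partitioned by status
    first_seen = {}                   # (status, grp) -> smallest seq_no
    for overall_rn, (status, seq_no) in enumerate(events, start=1):
        per_status_rn[status] += 1
        # Within one run both counters grow together, so the difference
        # stays constant; a gap between runs changes it.
        grp = overall_rn - per_status_rn[status]
        first_seen.setdefault((status, grp), seq_no)
    # Dicts keep insertion order, so the result is ordered by seq_no.
    return [(status, seq_no) for (status, _grp), seq_no in first_seen.items()]


print(first_of_each_run(events))
```

Here the first Open run gets grp = 1 (2-1, 3-2, 4-3) and the second gets grp = 2 (6-4, 7-5, 8-6), which is why GROUP BY status, grp keeps them apart.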
You can also try the SQL for Pattern Matching
WITH tickets(STATUS, START_TIME) AS (
SELECT 'New', TO_DATE('30/09/2014 3:48:10 PM', 'dd/mm/yyyy hh:mi:ss AM') FROM dual UNION ALL
SELECT 'Open', TO_DATE('30/09/2014 3:48:10 PM', 'dd/mm/yyyy hh:mi:ss AM') FROM dual UNION ALL
SELECT 'Open', TO_DATE('1/10/2014 10:41:57 AM', 'dd/mm/yyyy hh:mi:ss AM') FROM dual UNION ALL
SELECT 'Open', TO_DATE('4/03/2015 9:59:04 AM', 'dd/mm/yyyy hh:mi:ss AM') FROM dual UNION ALL
SELECT 'Queued', TO_DATE('18/06/2015 1:31:30 PM', 'dd/mm/yyyy hh:mi:ss AM') FROM dual UNION ALL
SELECT 'Open', TO_DATE('20/06/2015 10:10:47 PM', 'dd/mm/yyyy hh:mi:ss AM') FROM dual UNION ALL
SELECT 'Open', TO_DATE('20/06/2015 11:20:11 PM', 'dd/mm/yyyy hh:mi:ss AM') FROM dual UNION ALL
SELECT 'Open', TO_DATE('27/06/2015 1:18:50 PM', 'dd/mm/yyyy hh:mi:ss AM') FROM dual UNION ALL
SELECT 'Completed', TO_DATE('27/06/2015 1:22:08 PM', 'dd/mm/yyyy hh:mi:ss AM') FROM dual UNION ALL
SELECT 'Completed', TO_DATE('28/09/2015 9:31:55 AM', 'dd/mm/yyyy hh:mi:ss AM') FROM dual UNION ALL
SELECT 'Completed', TO_DATE('5/10/2015 11:57:38 AM', 'dd/mm/yyyy hh:mi:ss AM') FROM dual UNION ALL
SELECT 'Closed', TO_DATE('11/01/2016 9:31:26 AM', 'dd/mm/yyyy hh:mi:ss AM') FROM dual)
SELECT STATUS, START_TIME
FROM tickets
MATCH_RECOGNIZE (
ORDER BY START_TIME
MEASURES
START_TIME AS START_TIME,
STATUS as STATUS
PATTERN ( CHNG )
DEFINE
CHNG AS CHNG.STATUS <> PREV(CHNG.STATUS) OR PREV(CHNG.STATUS) IS NULL
)
STATUS START_TIME
========== ====================
New 30.09.2014 15:48:10
Open 30.09.2014 15:48:10
Queued 18.06.2015 13:31:30
Open 20.06.2015 22:10:47
Completed 27.06.2015 13:22:08
Closed 11.01.2016 09:31:26
CHNG.STATUS <> PREV(CHNG.STATUS) matches each row where STATUS is different to previous row. PREV(CHNG.STATUS) IS NULL is used to get also the very first row.
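The same "different from the previous row, or no previous row" rule can be sketched in a few lines of Python. This is only an illustration of the condition in the DEFINE clause, using the concise STATUS / SEQ_NO data from the question; the function name `state_changes` is an assumption.

```python
def state_changes(events):
    """events: (status, seq_no) pairs sorted by seq_no."""
    out = []
    prev_status = None  # plays the role of PREV(CHNG.STATUS)
    for status, seq_no in events:
        # First row (no previous status) or a status change: keep the row.
        if prev_status is None or status != prev_status:
            out.append((status, seq_no))
        prev_status = status
    return out


events = [
    ("New", 1), ("Open", 2), ("Open", 3), ("Open", 4), ("Queued", 5),
    ("Open", 6), ("Open", 7), ("Open", 8), ("Completed", 9),
    ("Completed", 10), ("Completed", 11), ("Closed", 12),
]
print(state_changes(events))
```

Because only the immediately preceding row is consulted, the result depends on a deterministic sort order, which matters when two rows share the same timestamp.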
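Use the row_number window function. Note that partitioning by status and year only separates two runs of the same status when they fall in different calendar years: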
select STATUS ,START_TIME from
(
select STATUS,START_TIME,
row_number() over (partition by STATUS,EXTRACT(YEAR FROM START_TIME) order by START_TIME) rn
from events_tab
) t where rn=1
Use LAG Function as you need to track the change in status:
https://dbfiddle.uk/?rdbms=oracle_11.2&fiddle=38a991b698c858f6f0417c7d4c0dc9d3
with cte1 (st, dt) as
(
select 'New', to_date('30/09/2014 3:48:10 PM', 'dd/mm/yyyy hh:mi:ss AM') from dual
union all
select 'Open', to_date('30/09/2014 3:48:10 PM', 'dd/mm/yyyy hh:mi:ss AM') from dual
union all
select 'Open', to_date('1/10/2014 10:41:57 AM', 'dd/mm/yyyy hh:mi:ss AM') from dual
union all
select 'Queued', to_date('18/06/2015 1:31:30 PM', 'dd/mm/yyyy hh:mi:ss AM') from dual
)
-- keep a row only when its status differs from the previous row's status
select st, dt
from
(
SELECT st, dt,
LAG (st) OVER (ORDER BY dt, st) AS prev_st
FROM cte1
) a
where prev_st is null or st <> prev_st
order by dt, st

Oracle SQL to find sum of difference of date by Group

I am trying to find the total duration consumed by each group by calculating the date difference in the following query:
with event AS (
SELECT 9000 AS ID, TO_DATE('2018-03-01 09:00:00','RRRR-MM-DD HH24:MI:SS') AS
TIMESTAMP, 'Start' AS EVENT FROM DUAL UNION ALL
SELECT 9000 AS ID, TO_DATE('2018-03/10 10:00:00','RRRR-MM-DD HH24:MI:SS') AS
TIMESTAMP, 'END' AS EVENT FROM DUAL UNION ALL
SELECT 9001 AS ID, TO_DATE('2018-03-10 11:00:00','RRRR-MM-DD HH24:MI:SS') AS
TIMESTAMP, 'Start' AS EVENT FROM DUAL UNION ALL
SELECT 9001 AS ID, TO_DATE('2018-03/20 10:00:00','RRRR-MM-DD HH24:MI:SS') AS
TIMESTAMP, 'END' AS EVENT FROM DUAL UNION ALL
SELECT 9000 AS ID, TO_DATE('2018-03-20 10:05:00','RRRR-MM-DD HH24:MI:SS') AS
TIMESTAMP, 'Start' AS EVENT FROM DUAL UNION ALL
SELECT 9000 AS ID, TO_DATE('2018-03/25 09:00:00','RRRR-MM-DD HH24:MI:SS') AS
TIMESTAMP, 'END' AS EVENT FROM DUAL UNION ALL
SELECT 9001 AS ID, TO_DATE('2018-03-25 10:15:00','RRRR-MM-DD HH24:MI:SS') AS
TIMESTAMP, 'Start' AS EVENT FROM DUAL UNION ALL
SELECT 9001 AS ID, TO_DATE('2018-03/26 12:00:00','RRRR-MM-DD HH24:MI:SS') AS
TIMESTAMP, 'END' AS EVENT FROM DUAL UNION ALL
SELECT 9002 AS ID, TO_DATE('2017-03-26 14:30:27','RRRR-MM-DD HH24:MI:SS') AS
TIMESTAMP, 'Start' AS EVENT FROM DUAL UNION ALL
SELECT 9002 AS ID, TO_DATE('2017-04-05 15:02:56','RRRR-MM-DD HH24:MI:SS') AS
TIMESTAMP, 'END' AS EVENT FROM DUAL
)
select id, min(timestamp) as call_start_ts, max(timestamp) as call_end_ts,
max(timestamp) - min(timestamp) as duration
from event t
group by id
order by 1;
I have also set up a SQLFiddle.
Please help me
EDIT
Expected Result will be like below
Use the LAG or LEAD analytic functions to get the next END event's time:
SQL Fiddle
Oracle 11g R2 Schema Setup:
CREATE TABLE event ( id, timestamp, event ) AS
SELECT 9000, TO_DATE('2018-03-01 09:00:00','RRRR-MM-DD HH24:MI:SS'), 'Start' FROM DUAL UNION ALL
SELECT 9000, TO_DATE('2018-03/10 10:00:00','RRRR-MM-DD HH24:MI:SS'), 'END' FROM DUAL UNION ALL
SELECT 9001, TO_DATE('2018-03-10 11:00:00','RRRR-MM-DD HH24:MI:SS'), 'Start' FROM DUAL UNION ALL
SELECT 9001, TO_DATE('2018-03/20 10:00:00','RRRR-MM-DD HH24:MI:SS'), 'END' FROM DUAL UNION ALL
SELECT 9000, TO_DATE('2018-03-20 10:05:00','RRRR-MM-DD HH24:MI:SS'), 'Start' FROM DUAL UNION ALL
SELECT 9000, TO_DATE('2018-03/25 09:00:00','RRRR-MM-DD HH24:MI:SS'), 'END' FROM DUAL UNION ALL
SELECT 9001, TO_DATE('2018-03-25 10:15:00','RRRR-MM-DD HH24:MI:SS'), 'Start' FROM DUAL UNION ALL
SELECT 9001, TO_DATE('2018-03/26 12:00:00','RRRR-MM-DD HH24:MI:SS'), 'END' FROM DUAL UNION ALL
SELECT 9002, TO_DATE('2017-03-26 14:30:27','RRRR-MM-DD HH24:MI:SS'), 'Start' FROM DUAL UNION ALL
SELECT 9002, TO_DATE('2017-04-05 15:02:56','RRRR-MM-DD HH24:MI:SS'), 'END' FROM DUAL;
Query 1:
SELECT id,
MIN( timestamp ) AS start_ts,
MAX( end_time ) AS end_ts,
SUM( end_time - timestamp ) AS duration
FROM (
SELECT id,
timestamp,
event,
LEAD( CASE event WHEN 'END' THEN timestamp END )
OVER ( PARTITION BY id ORDER BY timestamp ) AS end_time
FROM event
)
WHERE event = 'Start'
GROUP BY id
ORDER BY id
Results:
| ID | START_TS | END_TS | DURATION |
|------|----------------------|----------------------|--------------------|
| 9000 | 2018-03-01T09:00:00Z | 2018-03-25T09:00:00Z | 13.996527777777779 |
| 9001 | 2018-03-10T11:00:00Z | 2018-03-26T12:00:00Z | 11.03125 |
| 9002 | 2017-03-26T14:30:27Z | 2017-04-05T15:02:56Z | 10.02255787037037 |
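The pairing logic behind the DURATION column can be checked by hand with a short Python sketch. It pairs each Start row with the row that immediately follows it for the same id, counts the gap only when that next row is an END, and sums the gaps in days, as Oracle date subtraction does. The function name `total_durations` and the reduced sample (ids 9000 and 9002 only) are assumptions for illustration.

```python
from collections import defaultdict
from datetime import datetime


def total_durations(events):
    """events: (id, timestamp, event) tuples; returns {id: total days}."""
    by_id = defaultdict(list)
    for id_, ts, event in events:
        by_id[id_].append((ts, event))
    totals = {}
    for id_, rows in by_id.items():
        rows.sort()  # ORDER BY timestamp within each id
        total = 0.0
        # zip(rows, rows[1:]) gives each row together with the next one,
        # which is what LEAD(...) OVER (PARTITION BY id ...) provides.
        for (ts, event), (next_ts, next_event) in zip(rows, rows[1:]):
            if event == "Start" and next_event == "END":
                total += (next_ts - ts).total_seconds() / 86400
        totals[id_] = total
    return totals


events = [
    (9000, datetime(2018, 3, 1, 9, 0, 0), "Start"),
    (9000, datetime(2018, 3, 10, 10, 0, 0), "END"),
    (9000, datetime(2018, 3, 20, 10, 5, 0), "Start"),
    (9000, datetime(2018, 3, 25, 9, 0, 0), "END"),
    (9002, datetime(2017, 3, 26, 14, 30, 27), "Start"),
    (9002, datetime(2017, 4, 5, 15, 2, 56), "END"),
]
print(total_durations(events))
```

For id 9000 the two intervals are 9 days 1 hour and 4 days 22 hours 55 minutes, which add up to the 13.9965 days shown in the results table, rather than the 24 days that max(timestamp) - min(timestamp) would give.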
I solved the problem in two steps. First I match records in the same interval, then I sum up their durations.
http://sqlfiddle.com/#!4/73f48/83
SELECT
Id,
round(SUM(duration))
FROM
(
SELECT
t.id,
MIN (t2. TIMESTAMP) - t. TIMESTAMP AS duration
FROM
event t,
event t2
WHERE
t.Id = t2.Id
AND t2.Event = 'END'
AND t.Event = 'Start'
AND t2. TIMESTAMP > t. TIMESTAMP
GROUP BY
t. TIMESTAMP,
t.Id
)
GROUP BY
Id
select
id, round(sum(end_timestamp - start_timestamp),3) DURATION
from (
select
t.id,
t.timestamp START_TIMESTAMP,
case when LEAD(t.event,1) OVER (partition by id order by timestamp, event desc) = 'END'
then LEAD(t.timestamp,1) OVER (partition by id order by timestamp, event desc)
else null end as END_TIMESTAMP
from event t
)tt
where end_timestamp is not null
group by id
Solution to your problem:
WITH event AS (
SELECT 9000 AS ID, TO_DATE('2018-03-01 09:00:00','RRRR-MM-DD HH24:MI:SS') AS TIMESTAMP, 'Start' AS EVENT FROM DUAL UNION ALL
SELECT 9000 AS ID, TO_DATE('2018-03/10 10:00:00','RRRR-MM-DD HH24:MI:SS') AS TIMESTAMP, 'END' AS EVENT FROM DUAL UNION ALL
SELECT 9001 AS ID, TO_DATE('2018-03-10 11:00:00','RRRR-MM-DD HH24:MI:SS') AS TIMESTAMP, 'Start' AS EVENT FROM DUAL UNION ALL
SELECT 9001 AS ID, TO_DATE('2018-03/20 10:00:00','RRRR-MM-DD HH24:MI:SS') AS TIMESTAMP, 'END' AS EVENT FROM DUAL UNION ALL
SELECT 9000 AS ID, TO_DATE('2018-03-20 10:05:00','RRRR-MM-DD HH24:MI:SS') AS TIMESTAMP, 'Start' AS EVENT FROM DUAL UNION ALL
SELECT 9000 AS ID, TO_DATE('2018-03/25 09:00:00','RRRR-MM-DD HH24:MI:SS') AS TIMESTAMP, 'END' AS EVENT FROM DUAL UNION ALL
SELECT 9001 AS ID, TO_DATE('2018-03-25 10:15:00','RRRR-MM-DD HH24:MI:SS') AS TIMESTAMP, 'Start' AS EVENT FROM DUAL UNION ALL
SELECT 9001 AS ID, TO_DATE('2018-03/26 12:00:00','RRRR-MM-DD HH24:MI:SS') AS TIMESTAMP, 'END' AS EVENT FROM DUAL UNION ALL
SELECT 9002 AS ID, TO_DATE('2017-03-26 14:30:27','RRRR-MM-DD HH24:MI:SS') AS TIMESTAMP, 'Start' AS EVENT FROM DUAL UNION ALL
SELECT 9002 AS ID, TO_DATE('2017-04-05 15:02:56','RRRR-MM-DD HH24:MI:SS') AS TIMESTAMP, 'END' AS EVENT FROM DUAL
)
,rn_event AS
(
select event.*,ROW_NUMBER() OVER (Partition BY ID ORDER BY TimeSTAMP) AS rn from event
)
, diff_event AS
(
SELECT e.ID, f.TIMESTAMP AS Start_time, e.timestamp AS End_Time, e.TIMESTAMP - f.timestamp AS duration
FROM rn_event e
INNER JOIN rn_event f
ON f.id = e.id AND f.EVENT = 'Start' AND f.rn = e.rn - 1
)
SELECT ID,MIN(Start_Time) START_TS, MAX(END_TIME) END_TS, ROUND(SUM(Duration)) AS Duration
FROM diff_event
GROUP BY ID;
OUTPUT:
ID START_TS END_TS DURATION
9000 2018-03-01T09:00:00Z 2018-03-25T09:00:00Z 14
9001 2018-03-10T11:00:00Z 2018-03-26T12:00:00Z 11
9002 2017-03-26T14:30:27Z 2017-04-05T15:02:56Z 10
A demo for the above query:
http://sqlfiddle.com/#!4/73f48/87

SQL count consecutive rows

I have the following data in a table:
|event_id |starttime |person_id|attended|
|------------|-----------------|---------|--------|
| 11512997-1 | 01-SEP-16 08:00 | 10001 | N |
| 11512997-2 | 01-SEP-16 10:00 | 10001 | N |
| 11512997-3 | 01-SEP-16 12:00 | 10001 | N |
| 11512997-4 | 01-SEP-16 14:00 | 10001 | N |
| 11512997-5 | 01-SEP-16 16:00 | 10001 | N |
| 11512997-6 | 01-SEP-16 18:00 | 10001 | Y |
| 11512997-7 | 02-SEP-16 08:00 | 10001 | N |
| 11512997-1 | 01-SEP-16 08:00 | 10002 | N |
| 11512997-2 | 01-SEP-16 10:00 | 10002 | N |
| 11512997-3 | 01-SEP-16 12:00 | 10002 | N |
| 11512997-4 | 01-SEP-16 14:00 | 10002 | Y |
| 11512997-5 | 01-SEP-16 16:00 | 10002 | N |
| 11512997-6 | 01-SEP-16 18:00 | 10002 | Y |
| 11512997-7 | 02-SEP-16 08:00 | 10002 | Y |
I want to produce the following results, where the maximum number of consecutive occurrences where attended = 'N' is returned:
|person_id|consec_missed_max|
| 10001 | 5 |
| 10002 | 3 |
How could this be done in Oracle (or ANSI) SQL? Thanks!
Edit:
So far I have tried:
WITH t1 AS
(SELECT t.person_id,
row_number() over(PARTITION BY t.person_id ORDER BY t.starttime) AS idx
FROM the_table t
WHERE t.attended = 'N'),
t2 AS
(SELECT person_id, MAX(idx) max_idx FROM t1 GROUP BY person_id)
SELECT t1.person_id, COUNT(1) ct
FROM t1
JOIN t2
ON t1.person_id = t2.person_id
GROUP BY t1.person_id;
The main work is in the factored subquery "prep". You seem to be somewhat familiar with analytic functions, but that is not enough. This solution uses the so-called "tabibitosan" method to create groups of consecutive rows with the same characteristic in one or more dimensions; in this case, you want to group consecutive N rows with a different group for each sequence. This is done with a difference of two ROW_NUMBER() calls - one partitioned by person only, and the other by person and attended. Google "tabibitosan" to read more about the idea if needed.
with
inputs ( event_id, starttime, person_id, attended ) as (
select '11512997-1', to_date('01-SEP-16 08:00', 'dd-MON-yy hh24:mi'), 10001, 'N' from dual union all
select '11512997-2', to_date('01-SEP-16 10:00', 'dd-MON-yy hh24:mi'), 10001, 'N' from dual union all
select '11512997-3', to_date('01-SEP-16 12:00', 'dd-MON-yy hh24:mi'), 10001, 'N' from dual union all
select '11512997-4', to_date('01-SEP-16 14:00', 'dd-MON-yy hh24:mi'), 10001, 'N' from dual union all
select '11512997-5', to_date('01-SEP-16 16:00', 'dd-MON-yy hh24:mi'), 10001, 'N' from dual union all
select '11512997-6', to_date('01-SEP-16 18:00', 'dd-MON-yy hh24:mi'), 10001, 'Y' from dual union all
select '11512997-7', to_date('02-SEP-16 08:00', 'dd-MON-yy hh24:mi'), 10001, 'N' from dual union all
select '11512997-1', to_date('01-SEP-16 08:00', 'dd-MON-yy hh24:mi'), 10002, 'N' from dual union all
select '11512997-2', to_date('01-SEP-16 10:00', 'dd-MON-yy hh24:mi'), 10002, 'N' from dual union all
select '11512997-3', to_date('01-SEP-16 12:00', 'dd-MON-yy hh24:mi'), 10002, 'N' from dual union all
select '11512997-4', to_date('01-SEP-16 14:00', 'dd-MON-yy hh24:mi'), 10002, 'Y' from dual union all
select '11512997-5', to_date('01-SEP-16 16:00', 'dd-MON-yy hh24:mi'), 10002, 'N' from dual union all
select '11512997-6', to_date('01-SEP-16 18:00', 'dd-MON-yy hh24:mi'), 10002, 'Y' from dual union all
select '11512997-7', to_date('02-SEP-16 08:00', 'dd-MON-yy hh24:mi'), 10002, 'Y' from dual
),
prep ( starttime, person_id, attended, gp ) as (
select starttime, person_id, attended,
row_number() over (partition by person_id order by starttime) -
row_number() over (partition by person_id, attended
order by starttime)
from inputs
),
counts ( person_id, consecutive_absences ) as (
select person_id, count(*)
from prep
where attended = 'N'
group by person_id, gp
)
select person_id, max(consecutive_absences) as max_consecutive_absences
from counts
group by person_id
order by person_id;
OUTPUT:
PERSON_ID MAX_CONSECUTIVE_ABSENCES
---------- ---------------------------------------
10001 5
10002 3
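For comparison, the same "longest run of N per person" result can be sketched in Python with itertools.groupby, which groups consecutive equal values in much the same way the gp column does. The function name, the slot numbers 1 to 7 standing in for starttime, and the compact flag strings are assumptions made to keep the example short.

```python
from itertools import groupby


def max_consecutive_absences(rows):
    """rows: (person_id, starttime, attended) tuples in any order."""
    result = {}
    ordered = sorted(rows, key=lambda r: (r[0], r[1]))
    for person_id, person_rows in groupby(ordered, key=lambda r: r[0]):
        longest = 0
        # Inner groupby splits one person's rows into runs of equal flags.
        for attended, run in groupby(person_rows, key=lambda r: r[2]):
            if attended == "N":
                longest = max(longest, sum(1 for _ in run))
        result[person_id] = longest
    return result


# Attendance flags from the question, in starttime order per person.
attendance = {10001: "NNNNNYN", 10002: "NNNYNYY"}
rows = [
    (person_id, slot, flag)
    for person_id, flags in attendance.items()
    for slot, flag in enumerate(flags, start=1)
]
print(max_consecutive_absences(rows))
```

One difference from the SQL: a person with no 'N' rows gets 0 here, while the query above returns no row for that person because the counts subquery filters on attended = 'N'.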
If you are using Oracle 12c you could use MATCH_RECOGNIZE:
Data:
CREATE TABLE data AS
SELECT *
FROM (
with inputs ( event_id, starttime, person_id, attended ) as (
select '11512997-1', to_date('01-SEP-16 08:00', 'dd-MON-yy hh24:mi'), 10001, 'N' from dual union all
select '11512997-2', to_date('01-SEP-16 10:00', 'dd-MON-yy hh24:mi'), 10001, 'N' from dual union all
select '11512997-3', to_date('01-SEP-16 12:00', 'dd-MON-yy hh24:mi'), 10001, 'N' from dual union all
select '11512997-4', to_date('01-SEP-16 14:00', 'dd-MON-yy hh24:mi'), 10001, 'N' from dual union all
select '11512997-5', to_date('01-SEP-16 16:00', 'dd-MON-yy hh24:mi'), 10001, 'N' from dual union all
select '11512997-6', to_date('01-SEP-16 18:00', 'dd-MON-yy hh24:mi'), 10001, 'Y' from dual union all
select '11512997-7', to_date('02-SEP-16 08:00', 'dd-MON-yy hh24:mi'), 10001, 'N' from dual union all
select '11512997-1', to_date('01-SEP-16 08:00', 'dd-MON-yy hh24:mi'), 10002, 'N' from dual union all
select '11512997-2', to_date('01-SEP-16 10:00', 'dd-MON-yy hh24:mi'), 10002, 'N' from dual union all
select '11512997-3', to_date('01-SEP-16 12:00', 'dd-MON-yy hh24:mi'), 10002, 'N' from dual union all
select '11512997-4', to_date('01-SEP-16 14:00', 'dd-MON-yy hh24:mi'), 10002, 'Y' from dual union all
select '11512997-5', to_date('01-SEP-16 16:00', 'dd-MON-yy hh24:mi'), 10002, 'N' from dual union all
select '11512997-6', to_date('01-SEP-16 18:00', 'dd-MON-yy hh24:mi'), 10002, 'Y' from dual union all
select '11512997-7', to_date('02-SEP-16 08:00', 'dd-MON-yy hh24:mi'), 10002, 'Y' from dual
)
SELECT * FROM inputs
);
And query:
SELECT PERSON_ID, MAX(LEN) AS MAX_ABSENCES_IN_ROW
FROM data
MATCH_RECOGNIZE (
PARTITION BY PERSON_ID
ORDER BY STARTTIME
MEASURES FINAL COUNT(*) AS len
ALL ROWS PER MATCH
PATTERN(a b*)
DEFINE b AS attended = a.attended
)
WHERE attended = 'N'
GROUP BY PERSON_ID;
Output:
"PERSON_ID","MAX_ABSENCES_IN_ROW"
10001,5
10002,3
EDIT:
As @mathguy pointed out, it could be rewritten as:
SELECT PERSON_ID, MAX(LEN) AS MAX_ABSENCES_IN_ROW
FROM data
MATCH_RECOGNIZE (
PARTITION BY PERSON_ID
ORDER BY STARTTIME
MEASURES COUNT(*) AS len
PATTERN(a+)
DEFINE a AS attended = 'N'
)
GROUP BY PERSON_ID;
db<>fiddle demo

Group by issue using sql

I am trying to perform aggregation on a table. But it is not aggregating properly for some cases. Please find the below input.
Table t1.
CHANNEL;VALUE;STATUS;ERROR_CODE;RND_TIMESTAMP;SESSION_CD;NAR;
-------------------------------------------------------------
USD;4;12;;2-NOV-2015 11:00:00;;
USD;4;12;;2-NOV-2015 11:00:00;;
USD;2;12;;2-NOV-2015 11:00:00;;
USD;3;12;;2-NOV-2015 11:00:00;;
Output table t2
CHANNEL;VALUE;STATUS;ERROR_CODE;HOUR_TIMESTAMP;SESSION_CD;NAR;
--------------------------------------------------------------
USD;5;12;;2-NOV-2015 11:00:00;;
Query:
select
channel, sum(value),
status, error_code, rnd_timestamp, session_cd, nar
from
t1
where
rnd_timestamp > (select max(hour_timestamp) from t2)
group by
channel, status, error_code, rnd_timestamp, session_cd, nar
Why is it not considering the other rows for aggregation? Is it because some columns in the GROUP BY are null? How can I solve this issue?
Output must be :
USD;13;12;;2-NOV-2015 11:00:00;;
Why do you think your query has an issue?
By switching the hour_timestamp in t2 to be 10am not 11am, your query works as expected for me:
with t1 as (select 'USD' channel, 4 value, 12 status, null error_code, to_date('02/11/2015 11:00:00', 'dd/mm/yyyy hh24:mi:ss') rnd_timestamp, null session_cd, null nar from dual union all
select 'USD' channel, 4 value, 12 status, null error_code, to_date('02/11/2015 11:00:00', 'dd/mm/yyyy hh24:mi:ss') rnd_timestamp, null session_cd, null nar from dual union all
select 'USD' channel, 2 value, 12 status, null error_code, to_date('02/11/2015 11:00:00', 'dd/mm/yyyy hh24:mi:ss') rnd_timestamp, null session_cd, null nar from dual union all
select 'USD' channel, 3 value, 12 status, null error_code, to_date('02/11/2015 11:00:00', 'dd/mm/yyyy hh24:mi:ss') rnd_timestamp, null session_cd, null nar from dual),
t2 as (select 'USD' channel, 5 value, 12 status, null error_code, to_date('02/11/2015 10:00:00', 'dd/mm/yyyy hh24:mi:ss') hour_timestamp, null session_cd, null nar from dual)
--- end of mimicking your tables t1 and t2 with data in; see SQL below:
select channel,
sum(value),
status,
error_code,
rnd_timestamp,
session_cd,
nar
from t1
where rnd_timestamp > (select max(hour_timestamp) from t2)
group by channel,
status,
error_code,
rnd_timestamp,
session_cd,
nar;
CHANNEL SUM(VALUE) STATUS ERROR_CODE RND_TIMESTAMP SESSION_CD NAR
------- ---------- ---------- ---------- --------------------- ---------- ---
USD 13 12 02/11/2015 11:00:00
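A small Python sketch may help separate the two effects in play. Grouping puts rows whose key columns are all NULL into the same group, so NULLs are not the problem; the strict > comparison against max(hour_timestamp) is what removes rows. The function name `aggregate` and the tuple layout are assumptions for illustration, with `None` standing in for NULL.

```python
from collections import defaultdict
from datetime import datetime


def aggregate(t1_rows, last_loaded):
    """Sum value per group for rows strictly newer than last_loaded."""
    totals = defaultdict(int)
    for channel, value, status, error_code, rnd_ts, session_cd, nar in t1_rows:
        if rnd_ts > last_loaded:  # strict comparison, as in the WHERE clause
            key = (channel, status, error_code, rnd_ts, session_cd, nar)
            totals[key] += value  # None-valued columns still share one key
    return dict(totals)


eleven = datetime(2015, 11, 2, 11, 0, 0)
ten = datetime(2015, 11, 2, 10, 0, 0)
t1 = [("USD", v, 12, None, eleven, None, None) for v in (4, 4, 2, 3)]

print(aggregate(t1, ten))     # one group with a total of 13
print(aggregate(t1, eleven))  # empty: 11:00 is not later than 11:00
```

With max(hour_timestamp) at 10:00 all four rows fall into a single group that sums to 13; with it at 11:00, as in the question's t2, no row passes the filter.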

Count records per hour within a time span

I have a table with a userID, a startDate and an endDate.
I would like to count, hour by hour, the number of userIDs concerned.
For example, the user '4242' with startDate = '21/05/2014 01:15:00' and with endDate = '21/05/2014 05:22:00' should be counted once from 01 to 02, once from 02 to 03, once from 03 to 04, ...
It would give a result like that:
DATE AND TIME COUNT
-------------------------------------
20140930 18-19 198
20140930 19-20 220
20140930 20-21 236
20140930 21-22 257
20140930 22-23 257
20140930 23-00 257
20141001 00-01 259
20141001 01-02 259
20141001 02-03 258
20141001 03-04 259
20141001 04-05 258
20141001 05-06 258
How would you do that ?
Well, I tried a lot of things. Here's my latest attempt. If the code is too messy, don't even bother reading it, just tell me how you would handle this problem ;) Thanks !
WITH timespan AS (
SELECT lpad(rownum - 1,2,'00') ||'-'|| lpad(mod(rownum,24),2,'00') AS hours
FROM dual
connect BY level <= 24
),
UserID_min_max AS (
SELECT USERS.UserID,
min(USERS.date_startUT) AS min_date,
max(USERS.date_end) AS max_date,
code_etat
FROM USERS
WHERE (
(USERS.date_startUT >= to_date('01/10/2014 00:00:00','dd/MM/YYYY HH24:mi:ss')
AND USERS.date_end <= to_date('08/10/2014 23:59:00','dd/MM/YYYY HH24:mi:ss'))
OR ( USERS.date_startUT <= to_date('01/10/2014 00:00:00','dd/MM/YYYY HH24:mi:ss')
AND USERS.date_end >= to_date('01/10/2014 00:00:00','dd/MM/YYYY HH24:mi:ss')
AND USERS.date_end <= to_date('08/10/2014 23:59:00','dd/MM/YYYY HH24:mi:ss'))
OR (USERS.date_startUT BETWEEN to_date('01/10/2014 00:00:00','dd/MM/YYYY HH24:mi:ss') AND to_date('08/10/2014 23:59:00','dd/MM/YYYY HH24:mi:ss')))
GROUP BY USERS.UserID, code_etat
),
hours_list AS (
SELECT UserID, min_date, max_date, code_etat
, to_char(min_date + row_number() over (partition BY UserID ORDER BY 1)-1,'yyyymmdd') AS days
, to_char(min_date,'yyyymmdd') AS date_start
, to_char(min_date, 'hh24') || '-' || lpad(to_number(to_char(min_date, 'hh24')) + 1, 2, '00') AS timespan_date_start
, to_char(max_date,'yyyymmdd') AS date_end
, to_char(max_date, 'hh24') || '-' || lpad(to_number(to_char(max_date, 'hh24')) + 1, 2, '00') AS timespan_date_end
FROM UserID_min_max cmm
connect BY level <= trunc(max_date) - trunc(min_date)+1
AND PRIOR UserID = UserID
AND prior sys_guid() IS NOT NULL
),
all_timespan_hours_list AS (
SELECT lj.*, t.*, lj.days ||' '|| t.hours AS days_hours
FROM hours_list lj
JOIN timespan t
ON lj.days || t.hours >= lj.date_start || lj.timespan_date_start
AND lj.days || t.hours <= lj.date_end || lj.timespan_date_end
)
SELECT DISTINCT days_hours, COUNT(*)
FROM (
SELECT *
FROM all_timespan_hours_list ttlj
WHERE CODE_ETAT IN ('SOH','SOL')
)
GROUP BY days_hours
ORDER BY days_hours;
Here's how I would do something similar:
with dt_tab as (select trunc(:p_start_date, 'hh') + (level - 1)/24 hr
from dual
connect by level <= (trunc(:p_end_date, 'hh') - trunc(:p_start_date, 'hh'))*24 + 1),
sample_data as (select 4242 usr, to_date('21/05/2015 01:15:00', 'dd/mm/yyyy hh24:mi:ss') start_date, to_date('21/05/2015 05:22:00', 'dd/mm/yyyy hh24:mi:ss') end_date from dual union all
select 4243 usr, to_date('20/05/2015 18:32:42', 'dd/mm/yyyy hh24:mi:ss') start_date, to_date('21/05/2015 01:36:56', 'dd/mm/yyyy hh24:mi:ss') end_date from dual union all
select 4244 usr, to_date('21/05/2015 07:00:00', 'dd/mm/yyyy hh24:mi:ss') start_date, null end_date from dual)
select to_char(dt.hr, 'dd/mm/yyyy hh24-')||to_char(dt.hr + 1/24, 'hh24') date_and_time,
count(sd.usr) cnt
from dt_tab dt
left outer join sample_data sd on (dt.hr < nvl(sd.end_date, :p_end_date) and dt.hr >= sd.start_date)
group by to_char(dt.hr, 'dd/mm/yyyy hh24-')||to_char(dt.hr + 1/24, 'hh24')
order by date_and_time;
:p_start_date := 20/05/2015 08:00:00
:p_end_date := 21/05/2015 08:00:00
DATE_AND_TIME CNT
---------------- ---
20/05/2015 08-09 0
20/05/2015 09-10 0
20/05/2015 10-11 0
20/05/2015 11-12 0
20/05/2015 12-13 0
20/05/2015 13-14 0
20/05/2015 14-15 0
20/05/2015 15-16 0
20/05/2015 16-17 0
20/05/2015 17-18 0
20/05/2015 18-19 0
20/05/2015 19-20 1
20/05/2015 20-21 1
20/05/2015 21-22 1
20/05/2015 22-23 1
20/05/2015 23-00 1
21/05/2015 00-01 1
21/05/2015 01-02 1
21/05/2015 02-03 1
21/05/2015 03-04 1
21/05/2015 04-05 1
21/05/2015 05-06 1
21/05/2015 06-07 0
21/05/2015 07-08 1
21/05/2015 08-09 0
(depending on how your time period start and end dates are configured, you might want to change from using bind variables - eg. use the min/max dates in your table, etc)
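The join condition above can be restated as a short Python sketch: build one bucket per hour between the two period bounds, then count the sessions that are active at the start of each bucket, treating a missing end date as the period end. The function name `hourly_counts` and the tuple layout are assumptions; the data is the answer's sample_data.

```python
from datetime import datetime, timedelta


def hourly_counts(sessions, period_start, period_end):
    """sessions: (start, end_or_None) pairs; returns [(label, count), ...]."""
    hour = period_start.replace(minute=0, second=0, microsecond=0)
    last = period_end.replace(minute=0, second=0, microsecond=0)
    counts = []
    while hour <= last:
        # Same test as the join: hr >= start_date and hr < nvl(end_date, p_end)
        n = sum(
            1
            for start, end in sessions
            if start <= hour < (end if end is not None else period_end)
        )
        label = f"{hour:%d/%m/%Y %H}-{hour + timedelta(hours=1):%H}"
        counts.append((label, n))
        hour += timedelta(hours=1)
    return counts


sessions = [
    (datetime(2015, 5, 21, 1, 15, 0), datetime(2015, 5, 21, 5, 22, 0)),
    (datetime(2015, 5, 20, 18, 32, 42), datetime(2015, 5, 21, 1, 36, 56)),
    (datetime(2015, 5, 21, 7, 0, 0), None),
]
for label, n in hourly_counts(
    sessions, datetime(2015, 5, 20, 8, 0, 0), datetime(2015, 5, 21, 8, 0, 0)
):
    print(label, n)
```

As in the query, a session is counted for an hour only if it is already running at the top of that hour, so the hour in which a session starts partway through is not counted.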
The above works when I run it in Toad. For something that works in SQL*Plus, or when you run it as a script (e.g. in Toad), the below should work:
variable p_start_date varchar2(20)
variable p_end_date varchar2(20)
exec :p_start_date := '20/05/2015 08:00:00';
exec :p_end_date := '21/05/2015 08:00:00';
with dt_tab as (select trunc(to_date(:p_start_date, 'dd/mm/yyyy hh24:mi:ss'), 'hh') + (level - 1)/24 hr
from dual
connect by level <= (trunc(to_date(:p_end_date, 'dd/mm/yyyy hh24:mi:ss'), 'hh') - trunc(to_date(:p_start_date, 'dd/mm/yyyy hh24:mi:ss'), 'hh'))*24 + 1),
sample_data as (select 4242 usr, to_date('21/05/2015 01:15:00', 'dd/mm/yyyy hh24:mi:ss') start_date, to_date('21/05/2015 05:22:00', 'dd/mm/yyyy hh24:mi:ss') end_date from dual union all
select 4243 usr, to_date('20/05/2015 18:32:42', 'dd/mm/yyyy hh24:mi:ss') start_date, to_date('21/05/2015 01:36:56', 'dd/mm/yyyy hh24:mi:ss') end_date from dual union all
select 4244 usr, to_date('21/05/2015 07:00:00', 'dd/mm/yyyy hh24:mi:ss') start_date, null end_date from dual)
select to_char(dt.hr, 'dd/mm/yyyy hh24-')||to_char(dt.hr + 1/24, 'hh24') date_and_time,
count(sd.usr) cnt
from dt_tab dt
left outer join sample_data sd on (dt.hr < nvl(sd.end_date, to_date(:p_end_date, 'dd/mm/yyyy hh24:mi:ss')) and dt.hr >= sd.start_date)
group by to_char(dt.hr, 'dd/mm/yyyy hh24-')||to_char(dt.hr + 1/24, 'hh24')
order by date_and_time;
Try to use the function TRUNC(date,[fmt]) like this:
select trunc(some_date, 'HH24'), count(*)
from some_table
group by trunc(some_date, 'HH24');
Try this (MySQL syntax):
SELECT
DATE_FORMAT(your_datetime_column, '%Y%m%d %H') AS `hourly`,
COUNT(*) AS `count`
FROM your_table
GROUP BY DATE_FORMAT(your_datetime_column, '%Y%m%d %H')