Date difference when the grouped value changes - SQL

I have a table like this:

| FILE | UNIT | DATE       | ID SEQUENCE |
|------|------|------------|-------------|
| 10   | 34   | 01/02/2000 | 10          |
| 10   | 34   | 01/05/2000 | 11          |
| 10   | 40   | 01/05/2000 | 12          |
| 10   | 40   | 01/02/2000 | 13          |
| 10   | 40   | 01/02/2000 | 14          |
| 10   | 40   | 01/15/2000 | 15          |
| 10   | 34   | 01/16/2000 | 16          |
| 10   | 70   | 01/17/2000 | 17          |
| 10   | 70   | 01/28/2000 | 18          |
I need to build a view like this (the number of days a file stays in each unit):

| FILE | UNIT | DAYS IN UNITY |
|------|------|---------------|
| 10   | 34   | 3             |
| 10   | 40   | 10            |
| 10   | 34   | 1             |
| 10   | 70   | 11            |
Any advice? Thanks in advance.

This is a form of gaps-and-islands. For this purpose, I am thinking difference of row numbers:
select file, unit,
       (lead(min(date), 1, max(date)) over (partition by file order by min(id_sequence)) -
        min(date)
       ) as days_in_unity
from (select t.*,
             row_number() over (partition by file order by id_sequence) as seqnum,
             row_number() over (partition by file, unit order by id_sequence) as seqnum_2
      from t
     ) t
group by file, unit, (seqnum - seqnum_2)
Note that LEAD() requires an ORDER BY in its analytic clause, and it partitions by file only, so it picks up the start of the next block regardless of which unit that block belongs to.
Why this works is a little tricky to explain. If you look at the results of the subquery, you will see how the difference of the two row numbers is constant for the rows with the same unit value.
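The subquery's two row numbers can be mimicked in a few lines of pure Python, which makes the constant difference visible. This is purely illustrative (the question's rows are hard-coded; it is not the Oracle query itself):

```python
from itertools import groupby
from datetime import date

# (file, unit, date, id_sequence) rows from the question, ordered by id_sequence
rows = [
    (10, 34, date(2000, 1, 2), 10), (10, 34, date(2000, 1, 5), 11),
    (10, 40, date(2000, 1, 5), 12), (10, 40, date(2000, 1, 2), 13),
    (10, 40, date(2000, 1, 2), 14), (10, 40, date(2000, 1, 15), 15),
    (10, 34, date(2000, 1, 16), 16), (10, 70, date(2000, 1, 17), 17),
    (10, 70, date(2000, 1, 28), 18),
]

# seqnum: running position within the file; seqnum_2: running position
# within (file, unit). Their difference only changes when the unit changes,
# so (file, unit, seqnum - seqnum_2) identifies one "island" of rows.
seqnum, seqnum_2, keyed = {}, {}, []
for f, u, d, _ in rows:
    seqnum[f] = seqnum.get(f, 0) + 1
    seqnum_2[(f, u)] = seqnum_2.get((f, u), 0) + 1
    keyed.append((f, u, seqnum[f] - seqnum_2[(f, u)], d))

islands = [(f, u, [d for *_, d in grp])
           for (f, u, _), grp in groupby(keyed, key=lambda r: r[:3])]
for f, u, dates in islands:
    print(f, u, min(dates), max(dates))
```

The four groups correspond exactly to the four output rows the question asks for; the SQL then takes min(date) per island and uses LEAD to reach the next island's start.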

If you are using a newer version of Oracle (12c and above), it may be worth using MATCH_RECOGNIZE (pattern matching). In the following query, we define 2 patterns:
{1} multiple days, with a start date and an end date. This pattern may contain "iffy" dates, e.g. the 01/02/2000 rows in between 01/05/2000 and 01/15/2000 for unit 40.
{2} single day: this occurs when the "start" and "end" fall on one and the same day. In the MEASURES clause we pick up all the columns we need, and COALESCE them in the SELECT clause (see the MATCH_RECOGNIZE documentation).
Table
create table fileandunit( FILE_, UNIT, DATE_, ID_SEQUENCE )
as
select 10, 34, to_date( '02-JAN-2000', 'DD-MON-YYYY'), 10 from dual union all
select 10, 34, to_date( '05-JAN-2000', 'DD-MON-YYYY'), 11 from dual union all
select 10, 40, to_date( '05-JAN-2000', 'DD-MON-YYYY'), 12 from dual union all
select 10, 40, to_date( '02-JAN-2000', 'DD-MON-YYYY'), 13 from dual union all
select 10, 40, to_date( '02-JAN-2000', 'DD-MON-YYYY'), 14 from dual union all
select 10, 40, to_date( '15-JAN-2000', 'DD-MON-YYYY'), 15 from dual union all
select 10, 34, to_date( '16-JAN-2000', 'DD-MON-YYYY'), 16 from dual union all
select 10, 70, to_date( '17-JAN-2000', 'DD-MON-YYYY'), 17 from dual union all
select 10, 70, to_date( '28-JAN-2000', 'DD-MON-YYYY'), 18 from dual ;
Data and patterns
select * from fileandunit order by id_sequence ;
FILE_ UNIT DATE_ ID_SEQUENCE
10 34 02-JAN-00 10 -- start
10 34 05-JAN-00 11 -- end
10 40 05-JAN-00 12 -- start
10 40 02-JAN-00 13 -- iffy
10 40 02-JAN-00 14 -- iffy
10 40 15-JAN-00 15 -- end
10 34 16-JAN-00 16 -- single day
10 70 17-JAN-00 17 -- start
10 70 28-JAN-00 18 -- end
Query
select
coalesce( RP.m_file, RP.s_file ) file_
, coalesce( RP.m_unit, RP.s_unit ) unit_
, coalesce( ( RP.m_end - RP.m_start ), 1 ) days_
from fileandunit
match_recognize(
partition by file_ order by id_sequence
measures
enddt.file_ as m_file -- m_: for multiple days
, enddt.unit as m_unit
, startdt.date_ as m_start
, enddt.date_ as m_end
, singledt.file_ as s_file -- s_: single day
, singledt.unit as s_unit
, singledt.date_ as s_date
one row per match
pattern ( ( startdt iffydt* enddt ) | singledt ) -- multiple days (or) single day
define
startdt as ( prev( date_ ) <= date_ or prev( date_ ) is null )
and ( prev( unit ) <> unit or prev( unit ) is null )
--
, enddt as ( next( date_ ) >= date_ or next( date_ ) is null )
and ( next( unit ) <> unit or next( unit ) is null )
--
, iffydt as ( prev( date_ ) >= date_ ) -- detect incorrect dates inside a multiple day block
and ( prev( unit ) = unit )
--
, singledt as ( prev( date_ ) = date_ - 1 and next( date_ ) = date_ + 1 )
and ( prev( unit ) <> unit and next( unit ) <> unit )
) RP ;
Result
FILE_ UNIT_ DAYS_
10 34 3
10 40 10
10 34 1
10 70 11
Tested with Oracle 18c. DBfiddle here.
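As a sanity check of the measures, the same numbers can be reproduced in a short pure-Python sketch: take the first and last date of each consecutive block of one unit, and coalesce a zero-day (single-day) block to 1. Illustrative only; it sidesteps the iffy-date handling that MATCH_RECOGNIZE does explicitly:

```python
from itertools import groupby
from datetime import date

rows = [  # (file_, unit, date_, id_sequence), ordered by id_sequence
    (10, 34, date(2000, 1, 2), 10), (10, 34, date(2000, 1, 5), 11),
    (10, 40, date(2000, 1, 5), 12), (10, 40, date(2000, 1, 2), 13),
    (10, 40, date(2000, 1, 2), 14), (10, 40, date(2000, 1, 15), 15),
    (10, 34, date(2000, 1, 16), 16), (10, 70, date(2000, 1, 17), 17),
    (10, 70, date(2000, 1, 28), 18),
]

result = []
# one group per consecutive run of the same (file_, unit): one "match"
for (f, u), grp in groupby(rows, key=lambda r: (r[0], r[1])):
    block = list(grp)
    days = (block[-1][2] - block[0][2]).days  # end date minus start date
    result.append((f, u, days if days > 0 else 1))  # single day -> 1
print(result)
```

This prints the same four DAYS_ values as the result above: 3, 10, 1 and 11.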

Related

Pivot two columns and keep the values same in sql

I have created a query to get different time types and hours
SELECT calc_time.hours measure,
calc_time.payroll_time_type elements,
calc_time.person_id,
calc_time.start_time
FROM hwm_tm_rep_work_hours_sum_v calc_time,
per_all_people_f papf
WHERE grp_type_id = 200
AND payroll_time_type IN ( 'Afternoon shift',
'TL',
'Evening shift',
'Regular Pay ',
'OT' )
AND (To_date(To_char(calc_time.start_time, 'YYYY-MM-DD') , 'YYYY-MM-DD') BETWEEN To_date(To_char(:From_Date, 'YYYY-MM-DD'), 'YYYY-MM-DD')
AND To_date( To_char(:To_Date, 'YYYY-MM-DD'), 'YYYY-MM-DD' ))
AND papf.person_id = calc_time.person_id
I get the output like -
Start_time person_id elements measure
01-Jan-2021 198 Regular Pay 10
01-Jan-2021 198 OT 2
01-jAN-2021 198 Afternoon shift 2
16-JAN-2021 198 Regular Pay 10
17-JAN-2021 198 OT 3
20-JAN-2021 198 EVENING SHIFT 8
08-JAN-2021 11 Regular Pay 8
09-JAN-2021 11 OT 1
08-JAN-2021 11 tl 2
10-JAN-2021 12 Evening shift 9
11-JAN-2021 12 Evening shift 9
I want this output to be displayed as follows, within two dates that I pass as parameters (from-date and to-date, e.g. 01-JAN-2021 and 31-JAN-2021):
person_id Regular_pay OT OTHER_MEASURE OTHER_CODE
198 20 5 2 Afternoon shift
198 20 5 8 EVENING SHIFT
11 8 1 2 TL
12 18 Evening shift
So: the sums of Regular Pay and OT in separate columns, and all other types in other_measure and other_code.
How can I tweak the main query to achieve this?
You can use:
SELECT *
FROM (
SELECT c.person_id,
SUM(CASE c.payroll_time_type WHEN 'Regular Pay' THEN SUM(c.hours) END)
OVER (PARTITION BY c.person_id) AS regular_pay,
SUM(CASE c.payroll_time_type WHEN 'OT' THEN SUM(c.hours) END)
OVER (PARTITION BY c.person_id) AS OT,
SUM(c.hours) AS other_measure,
c.payroll_time_type AS Other_code
FROM hwm_tm_rep_work_hours_sum_v c
INNER JOIN per_all_people_f p
ON (p.person_id = c.person_id)
WHERE grp_type_id = 200
AND payroll_time_type IN (
'Afternoon shift',
'TL',
'Evening shift',
'Regular Pay',
'OT'
)
AND c.start_time >= TRUNC(:from_date)
AND c.start_time < TRUNC(:to_date) + INTERVAL '1' DAY
GROUP BY
c.person_id,
c.payroll_time_type
)
WHERE other_code NOT IN ('Regular Pay', 'OT');
Which, for the sample data:
CREATE TABLE hwm_tm_rep_work_hours_sum_v (start_time, person_id, payroll_time_type, hours) AS
SELECT DATE '2021-01-01', 198, 'Regular Pay', 10 FROM DUAL UNION ALL
SELECT DATE '2021-01-01', 198, 'OT', 2 FROM DUAL UNION ALL
SELECT DATE '2021-01-01', 198, 'Afternoon shift', 2 FROM DUAL UNION ALL
SELECT DATE '2021-01-16', 198, 'Regular Pay', 10 FROM DUAL UNION ALL
SELECT DATE '2021-01-17', 198, 'OT', 3 FROM DUAL UNION ALL
SELECT DATE '2021-01-20', 198, 'Evening shift', 8 FROM DUAL UNION ALL
SELECT DATE '2021-01-08', 11, 'Regular Pay', 8 FROM DUAL UNION ALL
SELECT DATE '2021-01-09', 11, 'OT', 1 FROM DUAL UNION ALL
SELECT DATE '2021-01-08', 11, 'TL', 2 FROM DUAL UNION ALL
SELECT DATE '2021-01-10', 12, 'Evening shift', 9 FROM DUAL UNION ALL
SELECT DATE '2021-01-11', 12, 'Evening shift', 9 FROM DUAL;
CREATE TABLE per_all_people_f (person_id, grp_type_id) AS
SELECT 198, 200 FROM DUAL UNION ALL
SELECT 11, 200 FROM DUAL UNION ALL
SELECT 12, 200 FROM DUAL;
Outputs:
PERSON_ID | REGULAR_PAY |   OT | OTHER_MEASURE | OTHER_CODE
--------: | ----------: | ---: | ------------: | :--------------
       11 |           8 |    1 |             2 | TL
       12 |        null | null |            18 | Evening shift
      198 |          20 |    5 |             2 | Afternoon shift
      198 |          20 |    5 |             8 | Evening shift
db<>fiddle here
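The conditional-aggregation shape of this query can be mimicked in plain Python if you want to verify the pivot by hand (illustrative only; the sample rows below are copied from the CREATE TABLE above):

```python
from collections import defaultdict

# (person_id, payroll_time_type, hours) rows from the sample data
data = [
    (198, 'Regular Pay', 10), (198, 'OT', 2), (198, 'Afternoon shift', 2),
    (198, 'Regular Pay', 10), (198, 'OT', 3), (198, 'Evening shift', 8),
    (11, 'Regular Pay', 8), (11, 'OT', 1), (11, 'TL', 2),
    (12, 'Evening shift', 9), (12, 'Evening shift', 9),
]

# step 1: aggregate hours per (person, time type), like the GROUP BY
totals = defaultdict(int)
for person, ttype, hours in data:
    totals[(person, ttype)] += hours

# step 2: one output row per "other" type, with that person's Regular Pay
# and OT totals repeated on every row (None plays the role of SQL NULL)
pivot = sorted(
    (person, totals.get((person, 'Regular Pay')), totals.get((person, 'OT')),
     hours, ttype)
    for (person, ttype), hours in totals.items()
    if ttype not in ('Regular Pay', 'OT')
)
for row in pivot:
    print(row)
```

The printed rows match the output table above, with None where the query shows null.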
You could try something like this. Unfortunately, it is not clear from your question which columns/values are available in which table.
SELECT
calc_time.person_id,
(select sum(ct.hours) FROM hwm_tm_rep_work_hours_sum_v ct where papf.person_id = ct.person_id and ct.payroll_time_type = 'Regular Pay') as Regular_Pay,
...
FROM hwm_tm_rep_work_hours_sum_v calc_time,
per_all_people_f papf
WHERE grp_type_id = 200
AND payroll_time_type IN ( 'Afternoon shift',
'TL',
'Evening shift',
'Regular Pay ',
'OT' )
AND (
To_date(To_char(calc_time.start_time, 'YYYY-MM-DD') , 'YYYY-MM-DD') BETWEEN To_date(To_char(:From_Date, 'YYYY-MM-DD'), 'YYYY-MM-DD')
AND To_date( To_char(:To_Date, 'YYYY-MM-DD'), 'YYYY-MM-DD' ) )
and papf.person_id = calc_time.person_id
-- use a group by
GROUP BY
calc_time.person_id
You may use aggregation and then apply the MODEL clause to calculate the required columns. Below is the code with comments, assuming you can manage the date filter yourself.
select *
from t
PERSON_ID | ELEMENTS | MEASURE
--------: | :-------------- | ------:
198 | Regular Pay | 1
198 | Regular Pay | 2
198 | Afternoon shift | 3
198 | Afternoon shift | 4
198 | OT | 5
198 | OT | 6
198 | EVENING SHIFT | 7
198 | EVENING SHIFT | 8
11 | Regular Pay | 11
11 | Regular Pay | 12
11 | TL | 13
11 | TL | 14
11 | EVENING SHIFT | 15
11 | EVENING SHIFT | 16
12 | TL | 21
12 | TL | 22
12 | EVENING SHIFT | 23
12 | EVENING SHIFT | 24
select
person_id,
ot,
regular_pay,
elements as other_code,
mes as other_measure
from (
/*First you need to aggregate all the measures by person_id and code*/
select
person_id,
elements,
sum(measure) as mes
from t
/*Date filter goes here*/
group by
person_id,
elements
)
model
/*RETURN UPDATED ROWS
will do the trick,
because we'll update only the "other"
measures, so OT and Regular Pay will not go
to the output*/
return updated rows
/*Where to break the calculation*/
partition by (person_id)
/*To be able to reference by code*/
dimension by (elements)
measures (
mes,
0 as ot,
0 as regular_pay
)
rules upsert (
ot[
elements not in ('OT', 'Regular Pay')
] = sum(mes)['OT'],
regular_pay[
elements not in ('OT', 'Regular Pay')
] = sum(mes)['Regular Pay']
)
PERSON_ID | OT | REGULAR_PAY | OTHER_CODE | OTHER_MEASURE
--------: | ---: | ----------: | :-------------- | ------------:
198 | 11 | 3 | EVENING SHIFT | 15
198 | 11 | 3 | Afternoon shift | 7
11 | null | 23 | TL | 27
11 | null | 23 | EVENING SHIFT | 31
12 | null | null | TL | 43
12 | null | null | EVENING SHIFT | 47
db<>fiddle here

Return True if Date in 1 table is within 24 hours of Date from table 2

I basically want to return a new column, 'First_24?', as shown in the desired output below, that returns TRUE iff the Folder_ID has a FOLDER_DATE_CREATED within the first 24 hours following the TEAM_CREATE_DATE. My attempt below for some reason yields TRUE for everything.
SELECT Folder_ID,
TEAM_ID,
FOLDER_DATE_CREATED,
NAME,
CASE WHEN t.TEAM_CREATE_DATE <= d.FOLDER_DATE_CREATED + interval '24 hours'
THEN TRUE
ELSE FALSE END AS First_24?
FROM DATA d
JOIN TEAM t on t.id = d.id
Current data Model
Folder_ID TEAM_ID FOLDER_DATE_CREATED NAME
11 100 1/21/2021 Sample 1
12 101 1/24/2021 Sample 2
13 102 4/21/2021 Sample 3
14 103 3/11/2021 Sample 4
15 104 5/31/2021 Sample 5
16 104 4/12/2021 Sample 6
TEAM_ID Team_Create_Date
100 1/21/2021
101 1/24/2021
102 2/20/2020
103 3/21/2020
104 4/12/2021
104 4/12/2021
Desired Output
Folder_ID TEAM_ID FOLDER_DATE_CREATED NAME First_24?
11 100 1/21/2021 Sample 1 TRUE
12 101 1/24/2021 Sample 2 TRUE
13 102 4/21/2021 Sample 3 FALSE
14 103 3/11/2021 Sample 4 FALSE
15 104 5/31/2021 Sample 5 FALSE
16 104 4/12/2021 Sample 6 TRUE
You can use a subquery with EXISTS
SELECT d.*,
(EXISTS (SELECT 1
FROM TEAM t
WHERE d.FOLDER_DATE_CREATED >= t.TEAM_CREATE_DATE AND
d.FOLDER_DATE_CREATED < t.TEAM_CREATE_DATE + interval '24 hours'
)
) AS "First_24?"
FROM DATA d;
You need to check if FOLDER_DATE_CREATED is less than TEAM_CREATE_DATE plus 24 hours. You've added the 24 hours to the wrong side:
with f(Folder_ID, TEAM_ID, FOLDER_DATE_CREATED, NAME) as (
select *
from (values
( 11, 100, date '2021-01-21', 'Sample 1' ),
( 12, 101, date '2021-01-24', 'Sample 2' ),
( 13, 102, date '2021-04-21', 'Sample 3' ),
( 14, 103, date '2021-03-11', 'Sample 4' ),
( 15, 104, date '2021-05-31', 'Sample 5' ),
( 16, 104, date '2021-04-12', 'Sample 6' )
) as t
)
, t(TEAM_ID, Team_Create_Date) as (
select *
from (values
( 100, date '2021-01-21' ),
( 101, date '2021-01-24' ),
( 102, date '2020-02-20' ),
( 103, date '2020-03-21' ),
( 104, date '2021-04-12' )
) as a
)
select f.*
, cast(
f.FOLDER_DATE_CREATED
between t.Team_Create_Date
and t.Team_Create_Date + interval '24 hours'
as varchar(5)) as "First_24?"
from f
join t
using(team_id)
folder_id | team_id | folder_date_created | name | First_24?
--------: | ------: | :------------------ | :------- | :--------
11 | 100 | 2021-01-21 | Sample 1 | true
12 | 101 | 2021-01-24 | Sample 2 | true
13 | 102 | 2021-04-21 | Sample 3 | false
14 | 103 | 2021-03-11 | Sample 4 | false
15 | 104 | 2021-05-31 | Sample 5 | false
16 | 104 | 2021-04-12 | Sample 6 | true
db<>fiddle here
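The boundary condition is the crux: the folder must be created on or after the team create date but strictly before that date plus 24 hours. A small Python check of the same rule against the sample rows (illustrative only):

```python
from datetime import datetime, timedelta

team_created = {
    100: datetime(2021, 1, 21), 101: datetime(2021, 1, 24),
    102: datetime(2020, 2, 20), 103: datetime(2020, 3, 21),
    104: datetime(2021, 4, 12),
}
folders = [  # (folder_id, team_id, folder_date_created)
    (11, 100, datetime(2021, 1, 21)), (12, 101, datetime(2021, 1, 24)),
    (13, 102, datetime(2021, 4, 21)), (14, 103, datetime(2021, 3, 11)),
    (15, 104, datetime(2021, 5, 31)), (16, 104, datetime(2021, 4, 12)),
]

def first_24(folder_dt, team_dt):
    # created within the first 24 hours following the team create date
    return team_dt <= folder_dt < team_dt + timedelta(hours=24)

flags = [(fid, first_24(dt, team_created[tid])) for fid, tid, dt in folders]
```

The flags reproduce the TRUE/FALSE column of the desired output.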

With Oracle SQL how can I find 3 days where total sum >= 150

I have a report that needs to list activity where total is >= 150 over 3 consecutive days.
Let's say I've created a temp table foo, to summarize daily totals.
| ID | Day | Total |
| -- | ---------- | ----- |
| 01 | 2020-01-01 | 10 |
| 01 | 2020-01-02 | 50 |
| 01 | 2020-01-03 | 50 |
| 01 | 2020-01-04 | 50 |
| 01 | 2020-01-05 | 20 |
| 02 | 2020-01-01 | 10 |
| 02 | 2020-01-02 | 10 |
| 02 | 2020-01-03 | 10 |
| 02 | 2020-01-04 | 10 |
| 02 | 2020-01-05 | 10 |
How Would I write SQL to return ID 01, but not 02?
Example Result:
| ID |
| -- |
| 01 |
I suspect that you want window functions here:
select distinct id
from (
select
t.*,
sum(total) over(partition by id order by day rows between 2 preceding and current row) sum_total,
count(*) over(partition by id order by day rows between 2 preceding and current row) cnt
from mytable t
) t
where cnt = 3 and sum_total >= 150
This gives you the ids that have a total greater than the given threshold over 3 consecutive days - which is how I understood your question.
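The windowed sum is just a 3-element sliding window; a pure-Python sketch of the same test (illustrative, with the daily totals hard-coded from the question) makes that concrete. Note that `rows between 2 preceding and current row` counts rows, so the SQL implicitly assumes one row per day with no date gaps:

```python
# daily totals per id, for consecutive days
totals = {
    '01': [10, 50, 50, 50, 20],
    '02': [10, 10, 10, 10, 10],
}

def qualifies(days, window=3, threshold=150):
    # any run of 3 consecutive daily totals summing to >= threshold
    return any(sum(days[i:i + window]) >= threshold
               for i in range(len(days) - window + 1))

ids = [i for i, days in totals.items() if qualifies(days)]
```

Here ID 01 qualifies via the 50 + 50 + 50 window, while ID 02 never exceeds 30.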
If you just want to output the rows that have 3 consecutive days with a sum >= 150, you can use an analytic function to determine the moving total across each 3 day period per id, and then find the aggregate max value of the moving total per id, returning the id where it's >= 150.
E.g.:
WITH your_table AS (SELECT 1 ID, to_date('01/01/2020', 'dd/mm/yyyy') dy, 10 total FROM dual UNION ALL
SELECT 1 ID, to_date('02/01/2020', 'dd/mm/yyyy') dy, 50 total FROM dual UNION ALL
SELECT 1 ID, to_date('03/01/2020', 'dd/mm/yyyy') dy, 50 total FROM dual UNION ALL
SELECT 1 ID, to_date('04/01/2020', 'dd/mm/yyyy') dy, 50 total FROM dual UNION ALL
SELECT 1 ID, to_date('05/01/2020', 'dd/mm/yyyy') dy, 20 total FROM dual UNION ALL
SELECT 2 ID, to_date('01/01/2020', 'dd/mm/yyyy') dy, 10 total FROM dual UNION ALL
SELECT 2 ID, to_date('02/01/2020', 'dd/mm/yyyy') dy, 10 total FROM dual UNION ALL
SELECT 2 ID, to_date('03/01/2020', 'dd/mm/yyyy') dy, 10 total FROM dual UNION ALL
SELECT 2 ID, to_date('04/01/2020', 'dd/mm/yyyy') dy, 10 total FROM dual UNION ALL
SELECT 2 ID, to_date('05/01/2020', 'dd/mm/yyyy') dy, 10 total FROM dual),
moving_sums AS (SELECT ID,
dy,
total,
SUM(total) OVER (PARTITION BY ID ORDER BY dy RANGE BETWEEN 2 PRECEDING AND CURRENT ROW) moving_sum
FROM your_table)
SELECT ID
FROM moving_sums
GROUP BY ID
HAVING MAX(moving_sum) >= 150;
ID
----------
1
You can use a HAVING clause with GROUP BY ID to list the desired ID values:
SELECT ID
FROM foo
GROUP BY ID
HAVING COUNT( distinct day )>=3 AND SUM( NVL(Total,0) ) >= 150
Demo
Use this if you want to restrict the check to specific dates:
WITH foo( ID, Day, Total ) AS
(
SELECT '01', date'2020-01-01' , 10 FROM dual
UNION ALL SELECT '01', date'2020-01-02' , 50 FROM dual
UNION ALL SELECT '01', date'2020-01-03' , 50 FROM dual
UNION ALL SELECT '01', date'2020-01-04' , 50 FROM dual
UNION ALL SELECT '01', date'2020-01-05' , 20 FROM dual
UNION ALL SELECT '02', date'2020-01-01' , 10 FROM dual
UNION ALL SELECT '02', date'2020-01-02' , 10 FROM dual
UNION ALL SELECT '02', date'2020-01-03' , 10 FROM dual
UNION ALL SELECT '02', date'2020-01-04' , 10 FROM dual
UNION ALL SELECT '02', date'2020-01-05' , 10 FROM dual
)SELECT
ID
FROM foo
WHERE day BETWEEN TO_DATE('2020-01-01', 'YYYY-MM-DD' ) AND TO_DATE('2020-01-04', 'YYYY-MM-DD' )
GROUP BY ID HAVING SUM(Total) >= 150;
RESULT:
ID|
--|
01|
Maybe you can try something like this (note that date literals must be written as DATE '2020-01-01', and the total has to be summed per ID rather than compared row by row):
SELECT ID
FROM foo
WHERE day BETWEEN DATE '2020-01-01' AND DATE '2020-01-04'
GROUP BY ID
HAVING SUM(total) >= 150

Getting wrong next date from a date column for all customer Oracle

This is my NM_CUST_APPLIANCE_HISTORY table (for customer_id = 96).
Customer_id | Last_effective_date | Present_quantity
--------------+---------------------+-----------------
96 | 2009-12-20 | 10
96 | 2014-11-18 | 12
96 | 2015-11-26 | 14
I execute my query to get start_date, with the date of the immediately following row as end_date, for a single customer (customer_id = 96).
SELECT NM.CUSTOMER_ID customer_id,
NM.LATEST_EFFECTIVE_DATE start_date,
NVL (
CASE
WHEN nm.LATEST_EFFECTIVE_DATE IS NULL
THEN
TO_DATE ('12/12/9999', 'dd/mm/yyyy')
ELSE
FIRST_VALUE (
nm.LATEST_EFFECTIVE_DATE)
OVER (ORDER BY nm.LATEST_EFFECTIVE_DATE
RANGE BETWEEN 1 FOLLOWING AND UNBOUNDED FOLLOWING)
END,
TO_DATE ('12/12/9999', 'dd/mm/yyyy'))
end_date,
NM.PRESENT_QUANTITY PRESENT_quantity
FROM nm_cust_appliance_history nm
WHERE NM.APPLIANCE_INFO_ID = 10484
AND NM.CUSTOMER_ID = 96
ORDER BY customer_id, start_date;
And the result comes out exactly as I want, like below:
Customer_id | START_DATE | END_DATE | PRESENT_QUANTITY
------------+------------+------------+-----------------
96 | 2009-12-20 | 2014-11-18 | 10
96 | 2014-11-18 | 2015-11-26 | 12
96 | 2015-11-26 | 9999-12-12 | 14
But when I execute this query for all customers (removing NM.CUSTOMER_ID = 96 from the query), it gives me the same START_DATE but the END_DATE comes out as just one day later, as below. I have also included a snapshot of my query output with that customer's rows marked in a red box.
SELECT NM.CUSTOMER_ID customer_id,
NM.LATEST_EFFECTIVE_DATE start_date,
NVL (
CASE
WHEN nm.LATEST_EFFECTIVE_DATE IS NULL
THEN
TO_DATE ('12/12/9999', 'dd/mm/yyyy')
ELSE
FIRST_VALUE (
nm.LATEST_EFFECTIVE_DATE)
OVER (ORDER BY nm.LATEST_EFFECTIVE_DATE
RANGE BETWEEN 1 FOLLOWING AND UNBOUNDED FOLLOWING)
END,
TO_DATE ('12/12/9999', 'dd/mm/yyyy'))
end_date,
NM.PRESENT_QUANTITY PRESENT_quantity
FROM nm_cust_appliance_history nm
WHERE NM.APPLIANCE_INFO_ID = 10484
--AND NM.CUSTOMER_ID = 96
ORDER BY customer_id, start_date;
Result is:
Customer_id | START_DATE | END_DATE | Present_quantity
--------------+-------------+------------+-----------------
74 | 2008-10-26 | 2008-10-27 | 5
> 96 | 2009-12-20 | 2009-12-21 | 10
> 96 | 2014-11-18 | 2014-11-19 | 12
> 96 | 2015-11-26 | 2015-11-27 | 14
100 | 2009-01-07 | 2009-01-09 | 7
Image of query Result
I want the result for all customers to look like the result for the single customer.
How can I solve this problem? Any help is appreciated.
Your window clause is looking at latest_effective_date values across all your data. You need to add a partition by clause to restrict it to the current customer:
OVER (PARTITION BY nm.CUSTOMER_ID
ORDER BY nm.LATEST_EFFECTIVE_DATE
RANGE BETWEEN 1 FOLLOWING AND UNBOUNDED FOLLOWING)
So:
SELECT NM.CUSTOMER_ID customer_id,
NM.LATEST_EFFECTIVE_DATE start_date,
NVL (
CASE
WHEN nm.LATEST_EFFECTIVE_DATE IS NULL
THEN
TO_DATE ('12/12/9999', 'dd/mm/yyyy')
ELSE
FIRST_VALUE (
nm.LATEST_EFFECTIVE_DATE)
OVER (PARTITION BY nm.CUSTOMER_ID
ORDER BY nm.LATEST_EFFECTIVE_DATE
RANGE BETWEEN 1 FOLLOWING AND UNBOUNDED FOLLOWING)
END,
TO_DATE ('12/12/9999', 'dd/mm/yyyy'))
end_date,
NM.PRESENT_QUANTITY PRESENT_quantity
FROM nm_cust_appliance_history nm
WHERE NM.APPLIANCE_INFO_ID = 10484
ORDER BY customer_id, start_date;
If you ever need to run it for more than one appliance_info_id then you'll need to add that to the partition by clause too.
Using a dummy extra record to kind of simulate what you're seeing, supplied via a CTE:
with nm_cust_appliance_history(appliance_info_id, customer_id, latest_effective_date, present_quantity) as (
select 10484, 96, date '2009-12-20', 10 from dual
union all select 10484, 96, date '2014-11-18', 12 from dual
union all select 10484, 96, date '2015-11-26', 14 from dual
union all select 10484, 42, date '2009-12-21', 15 from dual
)
your original query gets:
CUSTOMER_ID START_DATE END_DATE PRESENT_QUANTITY
----------- ---------- ---------- ----------------
42 2009-12-21 2014-11-18 15
96 2009-12-20 2009-12-21 10
96 2014-11-18 2015-11-26 12
96 2015-11-26 9999-12-12 14
and the partition-by query above gets:
CUSTOMER_ID START_DATE END_DATE PRESENT_QUANTITY
----------- ---------- ---------- ----------------
42 2009-12-21 9999-12-12 15
96 2009-12-20 2014-11-18 10
96 2014-11-18 2015-11-26 12
96 2015-11-26 9999-12-12 14
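The intended per-customer "next row's date, defaulting to 9999-12-12" logic can be sketched in a few lines of Python (illustrative; using customer 96 plus the dummy customer 42 from the CTE above):

```python
from datetime import date
from itertools import groupby

FAR_FUTURE = date(9999, 12, 12)

rows = [  # (customer_id, latest_effective_date, present_quantity), ordered
    (42, date(2009, 12, 21), 15),
    (96, date(2009, 12, 20), 10),
    (96, date(2014, 11, 18), 12),
    (96, date(2015, 11, 26), 14),
]

out = []
# partition by customer_id, then pair each row with the next row's date
for cust, grp in groupby(rows, key=lambda r: r[0]):
    block = list(grp)
    for i, (_, start, qty) in enumerate(block):
        end = block[i + 1][1] if i + 1 < len(block) else FAR_FUTURE
        out.append((cust, start, end, qty))
```

The rows in `out` match the partition-by query's result table above: customer 42's single row runs out to 9999-12-12 instead of borrowing customer 96's dates.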

Return a value when a different value changes

I have a query that returns the following, except for the last column, which is what I need to figure out how to create. For each given ObservationID I need to return the date on which the status changes; something like a LEAD() function that could take conditions and not just offsets. Can it be done?
I need to calculate the column Change Date; it should be the next date on which the status differs from the current status.
+---------------+--------+-----------+--------+-------------+
| ObservationID | Region | Date | Status | Change Date | <-This field
+---------------+--------+-----------+--------+-------------+
| 1 | 10 | 1/3/2012 | Ice | 1/4/2012 |
| 2 | 10 | 1/4/2012 | Water | 1/6/2012 |
| 3 | 10 | 1/5/2012 | Water | 1/6/2012 |
| 4 | 10 | 1/6/2012 | Gas | 1/7/2012 |
| 5 | 10 | 1/7/2012 | Ice | |
| 6 | 20 | 2/6/2012 | Water | 2/10/2012 |
| 7 | 20 | 2/7/2012 | Water | 2/10/2012 |
| 8 | 20 | 2/8/2012 | Water | 2/10/2012 |
| 9 | 20 | 2/9/2012 | Water | 2/10/2012 |
| 10 | 20 | 2/10/2012 | Ice | |
+---------------+--------+-----------+--------+-------------+
A MODEL clause (10g+) can do this in a compact way:
create table observation(ObservationID, Region, obs_date, Status)
as
select 1, 10, date '2012-03-01', 'Ice' from dual union all
select 2, 10, date '2012-04-01', 'Water' from dual union all
select 3, 10, date '2012-05-01', 'Water' from dual union all
select 4, 10, date '2012-06-01', 'Gas' from dual union all
select 5, 10, date '2012-07-01', 'Ice' from dual union all
select 6, 20, date '2012-06-02', 'Water' from dual union all
select 7, 20, date '2012-07-02', 'Water' from dual union all
select 8, 20, date '2012-08-02', 'Water' from dual union all
select 9, 20, date '2012-09-02', 'Water' from dual union all
select 10, 20, date '2012-10-02', 'Ice' from dual;

select ObservationID, obs_date, Status, status_change
from observation
model
  dimension by (Region, obs_date, Status)
  measures (ObservationID, obs_date obs_date2, cast(null as date) status_change)
  rules (
    status_change[any,any,any] = min(obs_date2)[cv(Region), obs_date > cv(obs_date), status != cv(status)]
  )
order by 1;
OBSERVATIONID OBS_DATE STATU STATUS_CH
------------- --------- ----- ---------
1 01-MAR-12 Ice 01-APR-12
2 01-APR-12 Water 01-JUN-12
3 01-MAY-12 Water 01-JUN-12
4 01-JUN-12 Gas 01-JUL-12
5 01-JUL-12 Ice
6 02-JUN-12 Water 02-OCT-12
7 02-JUL-12 Water 02-OCT-12
8 02-AUG-12 Water 02-OCT-12
9 02-SEP-12 Water 02-OCT-12
10 02-OCT-12 Ice
fiddle: http://sqlfiddle.com/#!4/f6687/1
We dimension on region, date, and status because we want to look at cells with the same region, but get the first date on which the status differs.
We also have to measure the date, so I created an alias obs_date2 for it, and we want a new column status_change to hold the date the status changed.
This is the line that does all the working out for us:
status_change[any,any,any] = min(obs_date2)[cv(Region), obs_date > cv(obs_date), status != cv(status)]
It says: for our three dimensions, only look at rows with the same region (cv(Region)), rows whose date follows the date of the current row (obs_date > cv(obs_date)), and rows whose status differs from the current row (status != cv(status)); finally, take the minimum date that satisfies this set of conditions (min(obs_date2)) and assign it to status_change. The any,any,any part on the left means this calculation applies to all rows.
I've tried many times to understand the MODEL clause and never really quite managed it, so I thought I would add another solution.
This solution takes some of what Ronnis has done, but instead uses the IGNORE NULLS clause of the LEAD function. I think this is only new with Oracle 11, but you could probably replace it with the FIRST_VALUE function on Oracle 10 if necessary.
select
observation_id,
region,
observation_date,
status,
lead(case when is_change = 'Y' then observation_date end) ignore nulls
over (partition by region order by observation_date) as change_observation_date
from (
select
a.observation_id,
a.region,
a.observation_date,
a.status,
case
when status = lag(status) over (partition by region order by observation_date)
then null
else 'Y' end as is_change
from observations a
)
order by 1
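Stated outside SQL, the rule these answers implement is simply: for each row, find the earliest later date in the same region whose status differs. A brute-force Python sketch (illustrative, using the sample data above):

```python
from datetime import date

rows = [  # (observation_id, region, obs_date, status), ordered by date
    (1, 10, date(2012, 3, 1), 'Ice'), (2, 10, date(2012, 4, 1), 'Water'),
    (3, 10, date(2012, 5, 1), 'Water'), (4, 10, date(2012, 6, 1), 'Gas'),
    (5, 10, date(2012, 7, 1), 'Ice'), (6, 20, date(2012, 6, 2), 'Water'),
    (7, 20, date(2012, 7, 2), 'Water'), (8, 20, date(2012, 8, 2), 'Water'),
    (9, 20, date(2012, 9, 2), 'Water'), (10, 20, date(2012, 10, 2), 'Ice'),
]

def change_date(row):
    # earliest later date in the same region with a different status
    _, region, d, status = row
    later = [r[2] for r in rows
             if r[1] == region and r[2] > d and r[3] != status]
    return min(later) if later else None

changes = {r[0]: change_date(r) for r in rows}
```

The dictionary reproduces the STATUS_CH column of the MODEL output, with None for the last observation of each region.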
I frequently do this when cleaning up overlapping from/to-dates and duplicate rows.
Your case is much simpler though, since you only have the "from-date" :)
Setting up the test data
create table observations(
observation_id number not null
,region number not null
,observation_date date not null
,status varchar2(10) not null
);
insert
into observations(observation_id, region, observation_date, status)
select 1, 10, date '2012-03-01', 'Ice' from dual union all
select 2, 10, date '2012-04-01', 'Water' from dual union all
select 3, 10, date '2012-05-01', 'Water' from dual union all
select 4, 10, date '2012-06-01', 'Gas' from dual union all
select 5, 10, date '2012-07-01', 'Ice' from dual union all
select 6, 20, date '2012-06-02', 'Water' from dual union all
select 7, 20, date '2012-07-02', 'Water' from dual union all
select 8, 20, date '2012-08-02', 'Water' from dual union all
select 9, 20, date '2012-09-02', 'Water' from dual union all
select 10, 20, date '2012-10-02', 'Ice' from dual;
commit;
The below query has three points of interest:
Identifying repeated information (the recording show the same as previous recording)
Ignoring the repeated recordings
Determining the date from the "next" change
with lagged as(
select a.*
,case when status = lag(status, 1) over(partition by region
order by observation_date)
then null
else rownum
end as change_flag -- 1
from observations a
)
select observation_id
,region
,observation_date
,status
,lead(observation_date, 1) over(
partition by region
order by observation_date
) as change_date --3
,lead(observation_date, 1, sysdate) over(
partition by region
order by observation_date
) - observation_date as duration
from lagged
where change_flag is not null -- 2
;