How to select the pair of rows associated with status change - sql

Date ID X or Y
-------------------------------
01.01.2016 1234 Y
01.01.2017 1234 X
01.01.2018 1234 Y
01.01.2019 1234 Y
01.01.2020 1234 Y
01.01.2021 1234 X
01.01.2016 4321 X
01.01.2017 4321 X
01.01.2018 4321 X
01.01.2019 4321 Y
01.01.2020 4321 Y
The above table shows the structure of the data I'm using. What I want to do is reduce it to another table where I only have the rows associated with a change in X/Y status; however, I need not only the first observation after X becomes Y (or vice versa), but also the last observation before the change. How can I achieve output that looks exactly like the table below with SQL running on an Oracle database?
Date ID X or Y
-------------------------------
01.01.2016 1234 Y
01.01.2017 1234 X
01.01.2018 1234 Y
01.01.2020 1234 Y
01.01.2021 1234 X
01.01.2018 4321 X
01.01.2019 4321 Y

Here's one option:
Sample data is in lines #1 - 14; the TEMP CTE fetches the previous (LAG) and next (LEAD) x/y values per ID, sorted by date; the final select retrieves the result.
SQL> with test (datum, id, xy) as
2 (select date '2016-01-01', 1234, 'y' from dual union all
3 select date '2017-01-01', 1234, 'x' from dual union all
4 select date '2018-01-01', 1234, 'y' from dual union all
5 select date '2019-01-01', 1234, 'y' from dual union all
6 select date '2020-01-01', 1234, 'y' from dual union all
7 select date '2021-01-01', 1234, 'x' from dual union all
8 --
9 select date '2016-01-01', 4321, 'x' from dual union all
10 select date '2017-01-01', 4321, 'x' from dual union all
11 select date '2018-01-01', 4321, 'x' from dual union all
12 select date '2019-01-01', 4321, 'y' from dual union all
13 select date '2020-01-01', 4321, 'y' from dual
14 ),
15 temp as
16 (select datum, id, xy,
17 lag(xy) over (partition by id order by datum) laxy,
18 lead(xy) over (partition by id order by datum) lexy
19 from test
20 )
21 --
22 select datum, id, xy
23 from temp
24 where xy <> laxy or xy <> lexy
25 order by id, datum;
DATUM ID X
---------- ---------- -
01.01.2016 1234 y
01.01.2017 1234 x
01.01.2018 1234 y
01.01.2020 1234 y
01.01.2021 1234 x
01.01.2018 4321 x
01.01.2019 4321 y
7 rows selected.
SQL>
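A note on the boundary rows: for the first and last row of each ID, LAG and LEAD return NULL, so only the comparison with the existing neighbour can evaluate to true in line #24. If you prefer to make that explicit, the final select could equivalently be written like this (a sketch that reuses the TEMP CTE above):
-- Equivalent, NULL-aware form of the filter in line #24:
-- boundary rows have a NULL laxy or lexy, so only the real neighbour counts.
select datum, id, xy
from temp
where (laxy is not null and xy <> laxy)
   or (lexy is not null and xy <> lexy)
order by id, datum;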

It seems you need to use the LEAD() and LAG() functions together to filter them out:
WITH t2 AS
(
SELECT t.*,
LAG(x_y,1,x_y) OVER (PARTITION BY id ORDER BY id, dt) AS lg_xy,
LEAD(x_y,1,x_y) OVER (PARTITION BY id ORDER BY id, dt) AS ld_xy
FROM t
ORDER BY id, dt
)
SELECT dt, id, x_y
FROM t2
WHERE NOT ( x_y = lg_xy AND x_y = ld_xy )
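For reference, this second query assumes a table named t with columns dt, id and x_y; a minimal test setup matching those names (an assumption, not part of the original answer) could look like this:
-- Hypothetical table and data matching the column names the query above assumes.
CREATE TABLE t (
  dt   DATE,
  id   NUMBER,
  x_y  VARCHAR2(1)
);

INSERT INTO t VALUES (DATE '2016-01-01', 1234, 'Y');
INSERT INTO t VALUES (DATE '2017-01-01', 1234, 'X');
INSERT INTO t VALUES (DATE '2018-01-01', 1234, 'Y');
INSERT INTO t VALUES (DATE '2019-01-01', 1234, 'Y');
INSERT INTO t VALUES (DATE '2020-01-01', 1234, 'Y');
INSERT INTO t VALUES (DATE '2021-01-01', 1234, 'X');
COMMIT;
The third argument of LAG and LEAD here supplies x_y itself as the default at the first and last row of each ID, so those boundary rows are only reported when their single existing neighbour differs.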

Related

Fetch record with max number in one column except if date in that column is > today

I have a problem with fetching a few exceptions from the DB.
Example, table b:
sn  v_num  start_date  end_date
-------------------------------------
1   001    01-01-2019  31-12-2099
1   002    01-01-2021  31-01-2022
1   003    01-02-2022  31-12-2099
2   001    01-01-2022  31-12-2099
2   002    01-07-2022  31-07-2022
2   003    01-08-2022  31-12-2099
Expected output:
sn  v_num  start_date  end_date
-------------------------------------
1   003    01-02-2022  31-12-2099
2   001    01-01-2022  31-12-2099
Currently I'm here:
SELECT * FROM table a, table b
WHERE a.sn = b.sn
AND b.v_num = (SELECT max (v_num) FROM b WHERE a.sn = b.sn)
but obviously that is not good because of a few cases like the one with sn = 2.
In conclusion, I need to get one record per sn where v_num is max (as it is for 95% of them in the DB), except when the start_date of that max v_num record is > today.
Filter using start_date <= TRUNC(SYSDATE), then use the ROW_NUMBER analytic function:
SELECT *
FROM (
SELECT a.*,
ROW_NUMBER() OVER (PARTITION BY sn ORDER BY v_num DESC) AS rn
FROM "TABLE" a
WHERE start_date <= TRUNC(SYSDATE)
)
WHERE rn = 1;
If the start_date has a time component then you can use start_date < TRUNC(SYSDATE) + INTERVAL '1' DAY to get all the values for today from 00:00:00 to 23:59:59.
If you can have ties for the maximum and want to return all the ties then you can use the RANK analytic function instead of ROW_NUMBER.
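For instance, a variant combining both of those suggestions might look like this (a sketch only, assuming the same "TABLE" and columns as above):
-- Sketch: RANK keeps all ties for the maximum v_num, and the filter
-- accepts rows whose start_date falls anywhere within today.
SELECT *
FROM (
  SELECT a.*,
         RANK() OVER (PARTITION BY sn ORDER BY v_num DESC) AS rnk
  FROM "TABLE" a
  WHERE start_date < TRUNC(SYSDATE) + INTERVAL '1' DAY
)
WHERE rnk = 1;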
The ROW_NUMBER query above, for the sample data:
CREATE TABLE "TABLE" (sn, v_num, start_date, end_date) AS
SELECT 1, '001', DATE '2022-01-01', DATE '2099-12-31' FROM DUAL UNION ALL
SELECT 1, '002', DATE '2022-01-01', DATE '2022-01-31' FROM DUAL UNION ALL
SELECT 1, '003', DATE '2022-02-01', DATE '2099-12-31' FROM DUAL UNION ALL
SELECT 2, '001', DATE '2022-01-01', DATE '2099-12-31' FROM DUAL UNION ALL
SELECT 2, '002', DATE '2022-07-01', DATE '2022-07-31' FROM DUAL UNION ALL
SELECT 2, '003', DATE '2022-08-01', DATE '2099-12-31' FROM DUAL;
Outputs:
SN  V_NUM  START_DATE           END_DATE             RN
--------------------------------------------------------
1   003    2022-02-01 00:00:00  2099-12-31 00:00:00  1
2   001    2022-01-01 00:00:00  2099-12-31 00:00:00  1

I need data for all 24 months of data even with missing months

I need data for all 24 months, even for the missing months.
Sample data:
id custname reportdate sales
1 xx 31-JAN-17 1256
1 xx 31-MAR-17 3456
1 xx 30-JUN-17 5678
1 xx 31-DEC-17 6785
2 xx 31-JAN-17 1223
2 xx 30-APR-17 3435
2 xx 30-JUN-17 6777
2 xx 31-DEC-17 9643
What I need as an output:
id custname reportdate sales
1 xx JAN-17 1256
1 xx FEB-17 <null>
1 xx MAR-17 3456
.....................................
.....................................
1 xx DEC-17 6785
And similarly for id 2 ....
I tried something like this without any luck:
select CUSTNAME, reportdate, sales from
(
select TRIM( LEADING '0' FROM TO_CHAR( statementdate, 'YYYY-MM') ) AS REPORTDATE mm, CUSTNAME
froM MYTABLE) SALES,
(
select to_char(date '2017-01-01' + numtoyminterval(level,'month'), 'mm') MonthName
--i actually need format as MON-Last 2 digit of year eg:JAN-17
from dual
connect by level <= 24) ALLMONTHS
where mm = MonthName(+)
I also tried with a CTE, but I can't use the my_year.year_month CTE with an outer join:
my_year as (
select date '2017-01-31' start_date,date '2018-12-31' end_date from dual
)
select (to_char(add_months(trunc(start_date,'mm'),level - 1),'yyyy')||'-'||(to_char(add_months(trunc(start_date,'mm'),level - 1),'mm'))) year_month
from my_year
connect by trunc(end_date,'mm') >= add_months(trunc(start_date,'mm'),level - 1);
select id, customername, reportdate, sales,
TRIM( LEADING '0' FROM TO_CHAR( reportdate, 'YYYY-MM') ) AS stmntdate
from my_oracle_tbl a
where a.stmntdate = my_year.year_month (+)
I also tried this, as recommended by #Littlefoot, which isn't working:
WITH mydates AS (
select LAST_DAY(add_months(date '2017-01-01', level - 1)) as mth, min_id,min_custname
from (
select min(id) as min_id, min(CUSTNAME) as min_custname
from my_oracle_tbl
)
connect by level <= 24)
select
nvl(t.id, a.min_id)id,
nvl(t.CUSTNAME,a.min_custname)CUSTNAME, a.mth, t.sales
from mydates a left join my_oracle_tbl t on a.mth= LAST_DAY(t.reporttdate)
where
t.id=2
;
You can use some old-school tricks (UNION ALL in combination with the ADD_MONTHS function and SUM):
select id, custname,month,
decode(sum(sales),0,null,sum(sales)) sales from
(select id, custname, to_char(reportdate, 'mon-rrrr')
month,sales from my_oracle_tbl
UNION ALL
select a.*,b.*,0 sales from
(select distinct id, custname from my_oracle_tbl) a,
(
select to_char(sysdate,'mon')||'-2017' month from dual
UNION ALL
select to_char(add_months(sysdate,1),'mon')||'-2017' month from dual
UNION ALL
select to_char(add_months(sysdate,2),'mon')||'-2017' from dual
.......
UNION ALL
select to_char(add_months(sysdate,11),'mon')||'-2017' from dual) b
)
group by id, custname,month;
This is what I came up with; do you see any concerns, or is there a better way to write this? I also need the results ordered from the earliest to the latest date. How can I achieve that? As of now it repeats the order like this: 12-2018, 12-2017, 11-2018, 11-2017. I want the 2017 dates first and then 2018.
select CUSTNAME, reportdate, sum(sales), mth
from ( select to_char(add_months(date '2017-01-01', level - 1), 'mmyyyy') mth
from dual
connect by level <= 24)mo
left outer join oracle_tbl dc on mo.mth = to_char(reportdate, 'mmyyyy')
group by CUSTNAME, reportdate,mth
order by mth
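One way to get chronological ordering, sketched under the assumption that the table and columns are those from your attempt (oracle_tbl, CUSTNAME, reportdate, sales), is to generate real month-start dates, order by that DATE value, and format it only for display:
-- Sketch: generate 24 month-start dates, join on the truncated report month,
-- and order by the real DATE so the months come out chronologically.
select dc.CUSTNAME,
       to_char(mo.mth_date, 'MON-YY') as mth,
       sum(dc.sales) as sales
from (select add_months(date '2017-01-01', level - 1) as mth_date
      from dual
      connect by level <= 24) mo
left outer join oracle_tbl dc
  on trunc(dc.reportdate, 'mm') = mo.mth_date
group by dc.CUSTNAME, mo.mth_date
order by mo.mth_date;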
Here's an example; see if it helps. It displays 12 months (you'd substitute it with 24 in line #21).
SQL> alter session set nls_date_format = 'dd.mm.yyyy';
Session altered.
SQL> with test (id, custname, reportdate, sales) as
2 (select 1, 'xx', date '2017-01-31', 1256 from dual union all
3 select 1, 'xx', date '2017-03-31', 3456 from dual union all
4 select 1, 'xx', date '2017-06-30', 5678 from dual union all
5 --
6 select 2, 'xx', date '2017-03-31', 1223 from dual union all
7 select 2, 'xx', date '2017-07-31', 3435 from dual union all
8 select 2, 'xx', date '2017-09-30', 6777 from dual
9 ),
10 all_dates as
11 (select add_months(min_repdate, column_value - 1) c_mon,
12 min_id,
13 min_custname
14 from (select min(reportdate) min_repdate,
15 id min_id,
16 min(custname) min_custname
17 from test
18 group by id
19 ),
20 table(cast(multiset(select level from dual
21 connect by level <= 12
22 ) as sys.odcinumberlist))
23 )
24 select nvl(t.id, a.min_id) id,
25 nvl(t.custname, a.min_custname) custname,
26 a.c_mon,
27 t.sales
28 from all_dates a left join test t on a.min_id = t.id and a.c_mon = t.reportdate
29 order by id, a.c_mon;
ID CU C_MON SALES
---------- -- ---------- ----------
1 xx 31.01.2017 1256
1 xx 28.02.2017
1 xx 31.03.2017 3456
1 xx 30.04.2017
1 xx 31.05.2017
1 xx 30.06.2017 5678
1 xx 31.07.2017
1 xx 31.08.2017
1 xx 30.09.2017
1 xx 31.10.2017
1 xx 30.11.2017
1 xx 31.12.2017
2 xx 31.03.2017 1223
2 xx 30.04.2017
2 xx 31.05.2017
2 xx 30.06.2017
2 xx 31.07.2017 3435
2 xx 31.08.2017
2 xx 30.09.2017 6777
2 xx 31.10.2017
2 xx 30.11.2017
2 xx 31.12.2017
2 xx 31.01.2018
2 xx 28.02.2018
24 rows selected.
SQL>

Only returning rows where date is greatest and one additional criterion

I have the following data
PET_REF XDATE TYPE
123 01/01/2017 OBJ
123 01/01/2017 OBJ
123 01/01/2017 OBJ
123 02/01/2017 LVE
456 01/01/2017 OBJ
456 01/01/2017 LVE
456 02/01/2017 OBJ
Is it possible to only return rows for PET_REF where the latest (by XDATE) TYPE is not LVE?
So, for the data above, the output should be
PET_REF XDATE TYPE
456 01/01/2017 OBJ
456 01/01/2017 LVE
456 02/01/2017 OBJ
Use the FIRST_VALUE analytic function:
Select * from
(
select PET_REF, XDATE, TYPE, First_Value(TYPE)over(Partition by PET_REF order by XDATE desc) as Latest_Type
from yourtable
)a
Where Latest_Type <> 'LVE'
One way of solving this is to put them in a subquery:
SELECT *
FROM t
WHERE c1 IN (
SELECT c1
FROM t
WHERE (c1,c2) IN (SELECT c1, MAX(c2)
FROM t
GROUP BY c1)
AND c3 <> 'LVE');
Here's one option:
SQL> with test (pet_ref, xdate, type) as
2 (select 123, date '2017-01-01', 'obj' from dual union all
3 select 123, date '2017-01-01', 'obj' from dual union all
4 select 123, date '2017-01-01', 'obj' from dual union all
5 select 123, date '2017-01-02', 'lve' from dual union all --
6 select 456, date '2017-01-01', 'obj' from dual union all
7 select 456, date '2017-01-01', 'lve' from dual union all --
8 select 456, date '2017-01-02', 'obj' from dual
9 ),
10 inter as
11 (select pet_ref, type,
12 rank() over (partition by pet_ref order by xdate desc) rnk
13 from test
14 )
15 select * from test t
16 where t.pet_ref not in (select i.pet_ref from inter i
17 where i.rnk = 1
18 and i.type = 'lve');
PET_REF XDATE TYP
---------- ---------- ---
456 02/01/2017 obj
456 01/01/2017 lve
456 01/01/2017 obj
SQL>
An easier way to do this is just by using ORDER BY:
SELECT *
FROM datatable
WHERE PET_REF LIKE (SELECT MAX(PET_REF) FROM datatable)
ORDER BY XDATE ASC, TYPE DESC;

Finding dates when accounts reach zero

Thanks for taking the time to examine my issue.
I'm trying to figure out a way to return dates when an account reaches 0
Sample data:
DATE ACCOUNT AMOUNT
11/01 001 100
11/02 002 50
11/03 001 -100
11/07 001 20
11/15 002 -50
11/20 001 -20
Wanted results:
Account ZeroDate
001 11/03
002 11/15
001 11/20
So far I haven't been able to figure out anything that works. Might you be able to point me in the right direction?
Thanks again in advance!
You can use analytic functions to compute the running balance:
SQL> ed
Wrote file afiedt.buf
1 with x as (
2 select date '2011-11-01' dt, 1 account, 100 amt from dual union all
3 select date '2011-11-02', 2, 50 from dual union all
4 select date '2011-11-03', 1, -100 from dual union all
5 select date '2011-11-07', 1, 20 from dual union all
6 select date '2011-11-15', 2, -50 from dual union all
7 select date '2011-11-20', 1, -20 from dual
8 )
9 select dt,
10 account,
11 amt,
12 sum(amt) over (partition by account order by dt) current_balance
13* from x
SQL> /
DT ACCOUNT AMT CURRENT_BALANCE
--------- ---------- ---------- ---------------
01-NOV-11 1 100 100
03-NOV-11 1 -100 0
07-NOV-11 1 20 20
20-NOV-11 1 -20 0
02-NOV-11 2 50 50
15-NOV-11 2 -50 0
6 rows selected.
and then use the running balance to find the zero dates.
SQL> ed
Wrote file afiedt.buf
1 with x as (
2 select date '2011-11-01' dt, 1 account, 100 amt from dual union all
3 select date '2011-11-02', 2, 50 from dual union all
4 select date '2011-11-03', 1, -100 from dual union all
5 select date '2011-11-07', 1, 20 from dual union all
6 select date '2011-11-15', 2, -50 from dual union all
7 select date '2011-11-20', 1, -20 from dual
8 )
9 select account,
10 dt zero_date
11 from (
12 select dt,
13 account,
14 amt,
15 sum(amt) over (partition by account order by dt) current_balance
16 from x
17 )
18* where current_balance = 0
SQL> /
ACCOUNT ZERO_DATE
---------- ---------
1 03-NOV-11
1 20-NOV-11
2 15-NOV-11
create table myacct (dt varchar2(5)
, account varchar2(3)
, amount number
)
;
insert into myacct values ('11/01', '001', 100);
insert into myacct values ('11/02', '002', 50);
insert into myacct values ('11/03', '001', -100);
insert into myacct values ('11/07', '001', 20);
insert into myacct values ('11/15', '002', -50);
insert into myacct values ('11/20', '001', -20);
commit;
/* results wanted:
Account ZeroDate
001 11/03
002 11/15
001 11/20 */
select account "Account", dt "ZeroDate"
from myacct
where amount <= 0
;
/* results from above query:
Account ZeroDate
001 11/03
002 11/15
001 11/20
*/

Select Distinct Rows Outside of Time Frame

I am trying to use SQL to select distinct data entries based on the time difference between one entry and the next. It's easier to explain with an example:
My data table has
Part DateTime
123 12:00:00
123 12:00:05
123 12:00:06
456 12:10:23
789 12:12:13
123 12:14:32
I would like to return all rows, with the limitation that if there are multiple entries with the same "Part" number I would like to retrieve only those that are at least 5 minutes apart.
The query should return:
Part DateTime
123 12:00:00
456 12:10:23
789 12:12:13
123 12:14:32
The code I'm using is the following:
SELECT data1.*, to_char(data1.scan_time, 'yyyymmdd hh24:mi:ss')
FROM data data1
where exists
(
select *
from data data2
where data1.part_serial_number = data2.part_serial_number AND
data2.scan_time + 5/1440 >= data1.scan_time
and data2.info is null
)
order by to_char(data1.scan_time, 'yyyymmdd hh24:mi:ss'), data1.part_serial_number
This is not working, unfortunately. Does anyone know what I'm doing wrong, or can suggest an alternate approach?
Thanks
Analytic functions to the rescue.
You can use the analytic function LEAD to get the data for the next row for the part.
SQL> ed
Wrote file afiedt.buf
1 with x as (
2 select 123 part, timestamp '2011-12-08 00:00:00' ts
3 from dual
4 union all
5 select 123, timestamp '2011-12-08 00:00:05'
6 from dual
7 union all
8 select 123, timestamp '2011-12-08 00:00:06'
9 from dual
10 union all
11 select 456, timestamp '2011-12-08 00:10:23'
12 from dual
13 union all
14 select 789, timestamp '2011-12-08 00:12:13'
15 from dual
16 union all
17 select 123, timestamp '2011-12-08 00:14:32'
18 from dual
19 )
20 select part,
21 ts,
22 lead(ts) over (partition by part order by ts) next_ts
23* from x
SQL> /
PART TS NEXT_TS
---------- ------------------------------- -------------------------------
123 08-DEC-11 12.00.00.000000000 AM 08-DEC-11 12.00.05.000000000 AM
123 08-DEC-11 12.00.05.000000000 AM 08-DEC-11 12.00.06.000000000 AM
123 08-DEC-11 12.00.06.000000000 AM 08-DEC-11 12.14.32.000000000 AM
123 08-DEC-11 12.14.32.000000000 AM
456 08-DEC-11 12.10.23.000000000 AM
789 08-DEC-11 12.12.13.000000000 AM
6 rows selected.
Once you've done that, you can create an inline view and simply select those rows where the next date is more than 5 minutes after the current date.
SQL> ed
Wrote file afiedt.buf
1 with x as (
2 select 123 part, timestamp '2011-12-08 00:00:00' ts
3 from dual
4 union all
5 select 123, timestamp '2011-12-08 00:00:05'
6 from dual
7 union all
8 select 123, timestamp '2011-12-08 00:00:06'
9 from dual
10 union all
11 select 456, timestamp '2011-12-08 00:10:23'
12 from dual
13 union all
14 select 789, timestamp '2011-12-08 00:12:13'
15 from dual
16 union all
17 select 123, timestamp '2011-12-08 00:14:32'
18 from dual
19 )
20 select part,
21 ts
22 from (
23 select part,
24 ts,
25 lead(ts) over (partition by part order by ts) next_ts
26 from x )
27 where next_ts is null
28* or next_ts > ts + interval '5' minute
SQL> /
PART TS
---------- -------------------------------
123 08-DEC-11 12.00.06.000000000 AM
123 08-DEC-11 12.14.32.000000000 AM
456 08-DEC-11 12.10.23.000000000 AM
789 08-DEC-11 12.12.13.000000000 AM
AFJ,
let's suppose that we have a new field that tells us whether a previous entry exists for this Part in the preceding 5 minutes; then, taking the rows where this field is set to 0, we have the result.
select
Part,
DateTime,
coalesce(
(select distinct 1
from data ds
where ds.Part = d.Part
and ds.DateTime >= d.DateTime - 5/1440
and ds.DateTime < d.DateTime
)
, 0) as exists_previous
from data d
The subquery checks whether there is a row with the same Part in the preceding 5-minute interval.
Result must be:
Part DateTime exists_previous
123 12:00:00 0
123 12:00:05 1
123 12:00:06 1
456 12:10:23 0
789 12:12:13 0
123 12:14:32 0
Now, filter to get only the rows with 0:
select Part, DateTime from
(select
Part,
DateTime,
coalesce(
(select distinct 1
from data ds
where ds.Part = d.Part
and ds.DateTime >= d.DateTime - 5/1440
and ds.DateTime < d.DateTime
)
, 0) as exists_previous
from data D
) T where T.exists_previous = 0
Disclaimer: not tested.
This has not been verified, but essentially, the trick is to group by part AND time divided by 5 minutes (floored).
select part, min(scan_time)
from data
group by part, floor((scan_time - date '1970-01-01') / (5/1440))
order by min(scan_time);