SQL - Create SCD2 table using Master and activity Log table

I have two tables: a master table and an activity log table that records every change (insert, update, delete) made to the master table. I need to build a third, SCD2-type table from these two. My query gets the dates right, but the column values are not updated correctly. I am looking for suggestions to improve the query's logic so that it produces the correct output. Table details and the expected output are given below.
Master Table - MASTER
BANK_ID | CCY_NUMB_CODE | CCY_ALPHA_CODE | ISIN_CLOSE_DATE | ISIS_CLOSE_PRICE | ISIN_STATUS | CCY_SHORT_NAME      | DEL_FLAG | LAST_MAN_MDFCN
1       | 9             | INE079A01016   | 28-01-2000      | 148.6            | A           | GLOBAL TELE 21/4/99 | Y        | 23-06-2015
Activity Log table - ALOG
BANK_ID | MODULE_ID | INSERT_DATE | REQUEST_DATE | USERID | TABLE_NAME | PRIMARY_KEY_BUFFER | FIELD_LABEL      | ORA_FIELD_NAME   | NEW_VALUE           | OLD_VALUE          | DESCRIPTION
1       | 6         | 18-10-1999  | 18-10-1999   | MM     | MASTER     | 1,9                | null             | null             | null                | null               | NEW RECORD INSERTED
1       | 6         | 18-10-1999  | 18-10-1999   | MM     | MASTER     | 1,9                | ISIN Status      | ISIN_STATUS      | A                   | I                  | UPDATED ISIN STATUS
1       | 6         | 20-10-1999  | 20-10-1999   | MM     | MASTER     | 1,9                | ISIN Description | CCY_SHORT_NAME   | GLOBAL TELE 21/4/99 | GLOBAL TELE EQ.NPP | UPDATED SHORT NAME
1       | 6         | 31-01-2000  | 31-01-2000   | MM     | MASTER     | 1,9                | Redemption Price | ISIS_CLOSE_PRICE | 1387.77             | 540                | UPDATED CLOSE PRICE
1       | 6         | 31-01-2000  | 31-01-2000   | MM     | MASTER     | 1,9                | Close Date       | ISIN_CLOSE_DATE  | 28-01-2000          | 15-10-1999         | UPDATED CLOSE DATE
1       | 6         | 23-06-2015  | 23-06-2015   | MM     | MASTER     | 1,9                | DEL_FLAG         | DEL_FLAG         | Y                   | N                  | UPDATED DEL FLAG
NOTE - PRIMARY_KEY_BUFFER is the comma-separated combination of BANK_ID and CCY_NUMB_CODE.
Expected Output -
BANK_ID | CCY_NUMB_CODE | CCY_ALPHA_CODE | ISIN_CLOSE_DATE | ISIS_CLOSE_PRICE | ISIN_STATUS | CCY_SHORT_NAME      | DEL_FLAG | LAST_MAN_MDFCN | START_DATE | END_DATE
1       | 9             | INE079A01016   | 15-10-1999      | 540              | I           | GLOBAL TELE EQ.NPP  | N        | 18-10-1999     | 18-10-1999 | 18-10-1999
1       | 9             | INE079A01016   | 15-10-1999      | 540              | A           | GLOBAL TELE EQ.NPP  | N        | 18-10-1999     | 18-10-1999 | 19-10-1999
1       | 9             | INE079A01016   | 15-10-1999      | 540              | A           | GLOBAL TELE 21/4/99 | N        | 20-10-1999     | 20-10-1999 | 30-01-2000
1       | 9             | INE079A01016   | 28-01-2000      | 1387.77          | A           | GLOBAL TELE 21/4/99 | N        | 31-01-2000     | 31-01-2000 | 22-06-2015
1       | 9             | INE079A01016   | 28-01-2000      | 1387.77          | A           | GLOBAL TELE 21/4/99 | Y        | 23-06-2015     | 23-06-2015 | 31-12-9999
Below are the queries I am trying in order to get the desired output, as I don't want to use the MERGE or UPDATE option.
SELECT
    p.BANK_ID,
    CCY_NUMB_CODE,
    CCY_ALPHA_CODE,
    (REQUEST_DATE - 1) AS ISIN_CLOSE_DATE,
    ISIN_CLOSE_PRICE,
    REQUEST_DATE AS LAST_MAN_MDFCN,
    REQUEST_DATE AS START_DATE,
    NEW_VALUE,
    OLD_VALUE,
    DATE_LAST_MDFCN,
    LEAD(REQUEST_DATE - 1) OVER (PARTITION BY p.BANK_ID, p.CCY_NUMB_CODE ORDER BY REQUEST_DATE) AS END_DATE
    --ROW_NUMBER() OVER (PARTITION BY p.BANK_ID, p.CCY_NUMB_CODE ORDER BY REQUEST_DATE) AS RN
FROM MASTER p
LEFT JOIN ALOG l
    ON l.PRIMARY_KEY_BUFFER = p.BANK_ID || ',' || p.CCY_NUMB_CODE
WHERE
    TABLE_NAME = 'MASTER' AND p.BANK_ID = 1 AND p.CCY_NUMB_CODE = 9
ORDER BY insert_date

SELECT * FROM
(SELECT
    p.BANK_ID,
    CCY_NUMB_CODE,
    CCY_ALPHA_CODE,
    (REQUEST_DATE - 1) AS ISIN_CLOSE_DATE,
    ISIN_CLOSE_PRICE,
    REQUEST_DATE AS LAST_MAN_MDFCN,
    REQUEST_DATE AS START_DATE,
    NEW_VALUE,
    OLD_VALUE,
    DATE_LAST_MDFCN,
    --LEAD(REQUEST_DATE - 1) OVER (PARTITION BY p.BANK_ID, p.CCY_NUMB_CODE ORDER BY REQUEST_DATE) AS END_DATE
    ROW_NUMBER() OVER (PARTITION BY p.BANK_ID, p.CCY_NUMB_CODE, REQUEST_DATE ORDER BY REQUEST_DATE) AS RN
FROM MASTER p
LEFT JOIN ALOG l
    ON l.PRIMARY_KEY_BUFFER = p.BANK_ID || ',' || p.CCY_NUMB_CODE
WHERE
    TABLE_NAME = 'MASTER' AND p.BANK_ID = 1 AND p.CCY_NUMB_CODE = 9
) x
WHERE x.rn = 1

Related

SQL derived attribute

I want to make Room_Status in Table 1 (ROOM) a derived attribute based on the check_in and check_out dates in Table 2 (Booking Records), but I don't know whether the room status can be determined dynamically from those dates. For example, room_no 103 is supposed to be unavailable from 19/02/2020 to 20/02/2020 because it is already booked by someone else, so its status should be displayed as unavailable ('N'); after 20/02/2020 the room becomes available again.
I also want to calculate the days available for each room based on the check_in and check_out dates in Table 2. For example, room 103 is available for only 1 day if it is booked from 19/02/2020 to 20/02/2020 and then booked by another customer one day later, on 22/02/2020. How should I calculate the days available?
Table 1 (ROOM)
ROOM_NO ROOM_STATUS ('Y' represent 'Available' , 'N' represent 'Unavailable')
======= ============
1 Y
2 Y
3 Y
4 Y
5 Y
6 Y
7 Y
8 Y
9 Y
10 Y
more rooms.....
Table 2 (Booking Records)
BOOKING_ID CHECK_IN   CHECK_OUT  SPECIAL_REQ                                   CANCEL_REASON DATE_BOOK  ROOM_NO GUEST
========== ========== ========== ============================================= ============= ========== ======= ============
1          19/02/2020 20/02/2020 Prepare hot bath tub when check in.                         17/02/2020 103     980315070652
2          20/05/2020 27/05/2020 Prepare scented candle and meal when check in               10/05/2020 10      C00001549
3          21/05/2020 23/05/2020 Prepare latest news paper in room                           10/05/2020 9       C00001894
4          20/05/2020 24/05/2020 Prepare hot bath tub when check in.                         17/05/2020 124     980315070652

Not sure I'm following, something like this? Strictness of the inequality (<,<=) will depend on how you want to define available on a given day.
SELECT DISTINCT
       Room_No,
       CASE WHEN EXISTS
            ( SELECT 1
              FROM Booking_Records BR2
              WHERE BR.room_no = BR2.room_no
                AND CURRENT_DATE >= BR2.Check_In
                AND CURRENT_DATE < BR2.Check_Out
            ) THEN 'N'
            ELSE 'Y'
       END AS Currently_Available
FROM Booking_Records BR

This will get the rooms from the booking_records table that are available on 2020-05-22:
SELECT room_no,
       CASE COUNT(
              CASE
                WHEN DATE '2020-05-22' BETWEEN CHECK_IN AND CHECK_OUT
                 AND cancel_reason IS NULL
                THEN 1
              END
            )
         WHEN 0
         THEN 'Y'
         ELSE 'N'
       END AS room_status_20200522
FROM booking_records
GROUP BY room_no
Which, for your sample data:
CREATE TABLE booking_records ( BOOKING_ID, CHECK_IN, CHECK_OUT, CANCEL_REASON, ROOM_NO ) AS
SELECT 1, DATE '2020-02-19', DATE '2020-02-20', CAST( NULL AS VARCHAR2(50) ), 103 FROM DUAL UNION ALL
SELECT 2, DATE '2020-05-20', DATE '2020-05-27', NULL, 10 FROM DUAL UNION ALL
SELECT 3, DATE '2020-05-21', DATE '2020-05-23', NULL, 9 FROM DUAL UNION ALL
SELECT 4, DATE '2020-05-20', DATE '2020-05-24', NULL, 124 FROM DUAL;
Outputs:
ROOM_NO | ROOM_STATUS_20200522
------: | :-------------------
9 | N
103 | Y
10 | N
124 | N

So, you need to fetch the room status and the number of days the room is available once it is free.
I am assuming that the booking_records table can contain more than one reservation for the same room_no.
The following query will give you the desired details as of sysdate.
Select r.room_no,
       Max(Case when b.booking_id is not null then 'N' else 'Y' end) as booking_status,
       Min(bn.check_in - coalesce(b.check_out, sysdate)) as available_days
From rooms r
Left Join booking_records b
       On b.room_no = r.room_no
      And sysdate between b.check_in and b.check_out
Left join booking_records bn
       On bn.room_no = r.room_no
      And bn.check_in > sysdate
Group by r.room_no

adjust date overlaps within a group

I have this table, and I want to set END_DATE to one day before the next ST_DATE whenever dates overlap within a group of ID.
TABLE HAVE
ID ST_DATE END_DATE
1 2020-01-01 2020-02-01
1 2020-05-10 2020-05-20
1 2020-05-18 2020-06-19
1 2020-11-11 2020-12-01
2 1999-03-09 1999-05-10
2 1999-04-09 2000-05-10
3 1999-04-09 2000-05-10
3 2000-06-09 2000-08-16
3 2000-08-17 2009-02-17
Below is what I'm looking for
TABLE WANT
ID ST_DATE END_DATE
1 2020-01-01 2020-02-01
1 2020-05-10 2020-05-17 =====changed to a day less than the next ST_DATE due to some sort of overlap
1 2020-05-18 2020-06-19
1 2020-11-11 2020-12-01
2 1999-03-09 1999-04-08 =====changed to a day less than the next ST_DATE due to some sort of overlap
2 1999-04-09 2000-05-10
3 1999-04-09 2000-05-10
3 2000-06-09 2000-08-16
3 2000-08-17 2009-02-17

Maybe you can use LEAD() for this. Initial idea:
select
id, st_date, end_date
, lead( st_date ) over ( partition by id order by st_date ) nextstart_
from overlap
;
-- result
ID ST_DATE END_DATE NEXTSTART
---------- --------- --------- ---------
1 01-JAN-20 01-FEB-20 10-MAY-20
1 10-MAY-20 20-MAY-20 18-MAY-20
1 18-MAY-20 19-JUN-20 11-NOV-20
1 11-NOV-20 01-DEC-20
2 09-MAR-99 10-MAY-99 09-APR-99
2 09-APR-99 10-MAY-00
3 09-APR-99 10-MAY-00 09-JUN-00
3 09-JUN-00 16-AUG-00 17-AUG-00
3 17-AUG-00 17-FEB-09
Once you have the next start date and the end_date side by side (as it were),
you can use CASE ... for adjusting the dates as you need them.
select ilv.id, ilv.st_date
     , case
         when ilv.end_date > ilv.nextstart_ then
           to_char( ilv.nextstart_ - 1 ) || ' <- modified end date'
         else
           to_char( ilv.end_date )
       end dt_modified
from (
  select
      id, st_date, end_date
    , lead( st_date ) over ( partition by id order by st_date ) nextstart_
  from overlap
) ilv
;
ID ST_DATE DT_MODIFIED
---------- --------- ---------------------------------------
1 01-JAN-20 01-FEB-20
1 10-MAY-20 17-MAY-20 <- modified end date
1 18-MAY-20 19-JUN-20
1 11-NOV-20 01-DEC-20
2 09-MAR-99 08-APR-99 <- modified end date
2 09-APR-99 10-MAY-00
3 09-APR-99 10-MAY-00
3 09-JUN-00 16-AUG-00
3 17-AUG-00 17-FEB-09

If two "windows" for the same id have the same start date, then the problem doesn't make sense. So, let's assume that the problem makes sense, that is, the combination (id, st_date) is unique in the inputs.
Then the problem can be formulated as follows: for each id, order rows by st_date ascending. Then, for each row, if its end_date is less than the following st_date, return the row as is. Otherwise replace end_date with the following st_date, minus 1. This last step can be achieved with the analytic lead() function.
A solution might look like this:
select id, st_date,
least(end_date, lead(st_date, 1, end_date + 1)
over (partition by id order by st_date) - 1) as end_date
from have
;
The bit about end_date + 1 in the lead function handles the last row for each id. For such rows there is no "next" row, so the default application of lead will return null. The default can be overridden by using the third parameter to the function.
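The same idea can be tried out locally. The following is only a minimal sketch, assuming SQLite (3.25 or later, for window functions) driven from Python's built-in sqlite3 module rather than Oracle. Because SQLite has no least() and no date arithmetic with + and -, the sketch substitutes the two-argument scalar min() and date(x, '-1 day'); the dates are stored as ISO text so that string comparison matches date order.

```python
import sqlite3

# Sample rows copied from the question's HAVE table (ISO-formatted text dates).
rows = [
    (1, "2020-01-01", "2020-02-01"),
    (1, "2020-05-10", "2020-05-20"),
    (1, "2020-05-18", "2020-06-19"),
    (1, "2020-11-11", "2020-12-01"),
    (2, "1999-03-09", "1999-05-10"),
    (2, "1999-04-09", "2000-05-10"),
    (3, "1999-04-09", "2000-05-10"),
    (3, "2000-06-09", "2000-08-16"),
    (3, "2000-08-17", "2009-02-17"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE have (id INTEGER, st_date TEXT, end_date TEXT)")
conn.executemany("INSERT INTO have VALUES (?, ?, ?)", rows)

# lead(st_date, 1, <default>) returns the next start date within the id,
# or end_date + 1 day when there is no next row, so the last row of each
# id keeps its own end_date after the "- 1 day" step.
query = """
SELECT id,
       st_date,
       min(end_date,
           date(lead(st_date, 1, date(end_date, '+1 day'))
                    OVER (PARTITION BY id ORDER BY st_date),
                '-1 day')) AS adjusted_end
FROM have
ORDER BY id, st_date
"""
adjusted = conn.execute(query).fetchall()
for row in adjusted:
    print(row)
```

Only the two overlapping rows, (1, 2020-05-10) and (2, 1999-03-09), should come back with a shortened end date.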

Case statement for HIVE platform

I have a table with the following columns:
ID
Scheduled Date
Status
Target Date
I need to extract the 'Status' corresponding to the minimum 'Scheduled Date' for each ID. If no Scheduled Date is available, I need to extract the status corresponding to the minimum 'Target Date' for that ID.
Sample data:
ID | Scheduled_Date | Status    | Target_Date
1  | 12/11/2017     | Completed | 12/11/2017
1  | 12/12/2017     | Completed | 12/12/2017
2  | 12/13/2017     | Completed | 12/13/2017
3  | 12/14/2017     | Pending   | 12/14/2017
3  | 12/15/2017     | Pending   | 12/15/2017
4  |                | Confirmed | 12/18/2017
4  |                | Confirmed | 12/19/2017
5  | 12/14/2017     | Completed | 12/14/2017
5  | 12/15/2017     | Pending   | 12/15/2017
Can you please correct the code that I am trying to write?
SELECT ID,
CASE WHEN ID IS NOT NULL THEN
CASE WHEN MIN(SCHEDULED_DATE) IS NOT NULL
THEN STATUS
ELSE
END
CASE WHEN MIN(TARGET_DATE) IS NOT NULL
THEN STATUS
ELSE ''
END
FROM FIRST_STATUS

Try this query.
SELECT id,
       status
FROM yourtable t
WHERE COALESCE(Scheduled_Date, Target_Date) IN
      (SELECT MIN(COALESCE(Scheduled_Date, Target_Date))
       FROM yourtable i
       WHERE i.ID = t.id
       GROUP BY i.ID);

Use row_number() analytic function:
select id,
       status
from
(
  select id,
         status,
         row_number() over (partition by id order by nvl(Scheduled_Date, Target_Date)) rn
  from yourtable t
) s
where rn = 1
;
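To check the row_number() approach against the sample data, here is a small sketch that runs it in SQLite through Python's built-in sqlite3 module. This is an assumption made for testability, not the Hive setup from the question: SQLite has no nvl(), so coalesce() is used instead, the table name first_status is taken from the question's attempt, and the dates are rewritten as ISO text so that they sort chronologically.

```python
import sqlite3

# Sample data from the question; a missing Scheduled_Date is stored as NULL.
rows = [
    (1, "2017-12-11", "Completed", "2017-12-11"),
    (1, "2017-12-12", "Completed", "2017-12-12"),
    (2, "2017-12-13", "Completed", "2017-12-13"),
    (3, "2017-12-14", "Pending", "2017-12-14"),
    (3, "2017-12-15", "Pending", "2017-12-15"),
    (4, None, "Confirmed", "2017-12-18"),
    (4, None, "Confirmed", "2017-12-19"),
    (5, "2017-12-14", "Completed", "2017-12-14"),
    (5, "2017-12-15", "Pending", "2017-12-15"),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE first_status "
    "(id INTEGER, scheduled_date TEXT, status TEXT, target_date TEXT)"
)
conn.executemany("INSERT INTO first_status VALUES (?, ?, ?, ?)", rows)

# Rank the rows of each id by scheduled_date, falling back to target_date
# when scheduled_date is NULL, and keep only the earliest row per id.
query = """
SELECT id, status
FROM (
    SELECT id,
           status,
           ROW_NUMBER() OVER (
               PARTITION BY id
               ORDER BY COALESCE(scheduled_date, target_date)
           ) AS rn
    FROM first_status
) s
WHERE rn = 1
ORDER BY id
"""
first_status = conn.execute(query).fetchall()
print(first_status)
```

Note that the fallback is applied row by row, so an id with a mix of NULL and non-NULL scheduled dates would be ranked on a blend of both columns; the sample data has no such id.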

SQL JOIN - retrieve MAX DateTime from second table and the first DateTime after previous MAX for other value

I have an issue with creating a proper SQL expression.
I have a table TICKET with a column TICKETID:
TICKETID
1000
1001
I then have a table STATUSHISTORY, from which I need to retrieve the last (maximum) time the ticket entered VENDOR status and the time it exited that VENDOR status. By exiting VENDOR status I mean the first INPROG status after that VENDOR status; the status following VENDOR is always INPROG. It is also possible that no VENDOR status exists for an ID in STATUSHISTORY at all (then nulls should be returned). INPROG always exists: it can occur before VENDOR and also after it, once the ID is no longer in VENDOR status.
Here is the example of STATUSHISTORY.
ID TICKETID STATUS DATETIME
1 1000 INPROG 01.01.2017 10:00
2 1000 VENDOR 02.01.2017 10:00
3 1000 INPROG 03.01.2017 10:00
4 1000 VENDOR 04.01.2017 10:00
5 1000 INPROG 05.01.2017 10:00
6 1000 HOLD 06.01.2017 10:00
7 1000 INPROG 07.01.2017 10:00
8 1001 INPROG 02.02.2017 10:00
9 1001 VENDOR 03.02.2017 10:00
10 1001 INPROG 04.02.2017 10:00
11 1001 VENDOR 05.02.2017 10:00
So the result when doing the query from TICKET table and doing the JOIN with table STATUSHISTORY should be:
ID VENDOR_ENTERED VENDOR_EXITED
1000 04.01.2017 10:00 05.01.2017 10:00
1001 05.02.2017 10:00 null
Because for ID 1000 last VENDOR status was at 04.01.2017 and the first INPROG status after the VENDOR status for that ID was at 05.01.2017 while for ID 1001 the last VENDOR status was at 05.02.2017 and after that INPROG status did not happen yet.
If VENDOR did not exist then both columns should be null in result.
I am really stuck with this, trying different JOINs but without any progress.
Thank you in advance if you can help me.

You can do this with window functions. First, assign a "vendor" group to the tickets. You can do this using a cumulative sum counting the number of "vendor" records on or before each record.
Then, aggregate the records to get one record per "vendor" group. And use row numbers to get the most recent records. So:
with vg as (
      select ticketid,
             min(datetime) as vendor_entered,
             min(case when status = 'INPROG' then datetime end) as vendor_exited
      from (select sh.*,
                   sum(case when status = 'VENDOR' then 1 else 0 end) over (partition by ticketid order by datetime) as grp
            from statushistory sh
           ) sh
      where grp > 0  -- skip rows that precede the first VENDOR status
      group by ticketid, grp
     )
select vg.ticketid, vg.vendor_entered, vg.vendor_exited
from (select vg.*,
             row_number() over (partition by ticketid order by vendor_entered desc) as seqnum
      from vg
     ) vg
where seqnum = 1;
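The cumulative-sum grouping can be exercised on the question's sample rows. What follows is a hedged sketch, assuming SQLite (3.25 or later) via Python's built-in sqlite3 module instead of DB2. Ticket 1002 is a hypothetical extra row added to show the "no VENDOR at all" case, the timestamps are rewritten as ISO text, and the left join from TICKET is one possible way to get the nulls the question asks for.

```python
import sqlite3

# (id, ticketid, status, datetime) rows from the question, ISO-formatted.
history = [
    (1, 1000, "INPROG", "2017-01-01 10:00"),
    (2, 1000, "VENDOR", "2017-01-02 10:00"),
    (3, 1000, "INPROG", "2017-01-03 10:00"),
    (4, 1000, "VENDOR", "2017-01-04 10:00"),
    (5, 1000, "INPROG", "2017-01-05 10:00"),
    (6, 1000, "HOLD", "2017-01-06 10:00"),
    (7, 1000, "INPROG", "2017-01-07 10:00"),
    (8, 1001, "INPROG", "2017-02-02 10:00"),
    (9, 1001, "VENDOR", "2017-02-03 10:00"),
    (10, 1001, "INPROG", "2017-02-04 10:00"),
    (11, 1001, "VENDOR", "2017-02-05 10:00"),
    (12, 1002, "INPROG", "2017-03-01 10:00"),  # hypothetical: never in VENDOR
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ticket (ticketid INTEGER)")
conn.execute(
    "CREATE TABLE statushistory "
    "(id INTEGER, ticketid INTEGER, status TEXT, datetime TEXT)"
)
conn.executemany("INSERT INTO ticket VALUES (?)", [(1000,), (1001,), (1002,)])
conn.executemany("INSERT INTO statushistory VALUES (?, ?, ?, ?)", history)

query = """
WITH marked AS (
    -- grp counts the VENDOR rows seen so far, so every VENDOR row opens a new group
    SELECT sh.*,
           SUM(CASE WHEN status = 'VENDOR' THEN 1 ELSE 0 END)
               OVER (PARTITION BY ticketid ORDER BY datetime) AS grp
    FROM statushistory sh
),
vg AS (
    -- one row per VENDOR group; grp = 0 holds rows before the first VENDOR
    SELECT ticketid,
           grp,
           MIN(CASE WHEN status = 'VENDOR' THEN datetime END) AS vendor_entered,
           MIN(CASE WHEN status = 'INPROG' THEN datetime END) AS vendor_exited
    FROM marked
    WHERE grp > 0
    GROUP BY ticketid, grp
),
latest AS (
    SELECT vg.*,
           ROW_NUMBER() OVER (PARTITION BY ticketid ORDER BY grp DESC) AS seqnum
    FROM vg
)
SELECT t.ticketid, l.vendor_entered, l.vendor_exited
FROM ticket t
LEFT JOIN latest l
       ON l.ticketid = t.ticketid AND l.seqnum = 1
ORDER BY t.ticketid
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

Ordering the final pick by grp rather than by vendor_entered avoids depending on how a given database sorts NULLs in descending order.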

You can aggregate to get max time, then join onto all of the date values higher than that time, and then re-aggregate:
select a.TicketID,
       a.VENDOR_ENTERED,
       min( EXIT_TIME ) as VENDOR_EXITED
from (
       select TicketID,
              max( DATETIME ) as VENDOR_ENTERED
       from StatusHistory
       where Status = 'VENDOR'
       group by TicketID
     ) as a
left join
     (
       select TicketID,
              DATETIME as EXIT_TIME
       from StatusHistory
       where Status = 'INPROG'
     ) as b
  on a.TicketID = b.TicketID
 and EXIT_TIME >= a.VENDOR_ENTERED
group by a.TicketID,
         a.VENDOR_ENTERED
DB2 is not supported in SQLfiddle, but a standard SQL example can be found here.

fill in a null cell with cell from previous record

Hi, I am using DB2 SQL to fill in some missing data in the following table:
Person House From To
------ ----- ---- --
1 586 2000-04-16 2010-12-03
2 123 2001-01-01 2012-09-27
2 NULL NULL NULL
2 104 2004-01-01 2012-11-24
3 987 1999-12-31 2009-08-01
3 NULL NULL NULL
Person 2 has lived in 3 houses, but for the middle address neither the house nor the dates are known. I can't do anything about which house they were in, but I would like to use the To date of the previous house to replace the NULL From date, and the From date of the next house to replace the NULL To date, i.e.:
Person House From To
------ ----- ---- --
1 586 2000-04-16 2010-12-03
2 123 2001-01-01 2012-09-27
2 NULL 2012-09-27 2004-01-01
2 104 2004-01-01 2012-11-24
3 987 1999-12-31 2009-08-01
3 NULL 2009-08-01 9999-01-01
I understand that if there is no previous address before a null address, the From date will have to stay null, but if a null address is the last known address I would like to change the To date to 9999-01-01, as for person 3.
This seems to me to be the type of problem where set theory is no longer a good fit; however, I am required to find a DB2 solution because that's what my boss uses!
Any pointers/suggestions welcome.
Thanks.

It might look something like this:
select
    person,
    house,
    coalesce(from_date, prev_to_date) from_date,
    case when rn = 1 then coalesce(to_date, '9999-01-01')
         else coalesce(to_date, next_from_date) end to_date
from
    (select person, house, from_date, to_date,
            lag(to_date) over (partition by person order by from_date nulls last) prev_to_date,
            lead(from_date) over (partition by person order by from_date nulls last) next_from_date,
            row_number() over (partition by person order by from_date desc nulls last) rn
     from temp
    ) t
The above is not tested but it might give you an idea.
I hope in your actual table you have a column other than to_date and from_date that allows you to order rows for each person, otherwise you'll have trouble sorting NULL dates, as you have no way of knowing the actual sequence.
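To illustrate why such an ordering column matters, here is a minimal sketch that assumes one exists. The seq column is hypothetical (it is not in the question's table), and the sketch runs on SQLite (3.25 or later) through Python's built-in sqlite3 module rather than DB2. With a reliable order, lag() and lead() can fill the gaps directly.

```python
import sqlite3

# (seq, person, house, from_date, to_date); seq is a hypothetical column
# recording the order in which each person's addresses occurred.
rows = [
    (1, 1, 586, "2000-04-16", "2010-12-03"),
    (2, 2, 123, "2001-01-01", "2012-09-27"),
    (3, 2, None, None, None),
    (4, 2, 104, "2004-01-01", "2012-11-24"),
    (5, 3, 987, "1999-12-31", "2009-08-01"),
    (6, 3, None, None, None),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE temp "
    "(seq INTEGER, person INTEGER, house INTEGER, from_date TEXT, to_date TEXT)"
)
conn.executemany("INSERT INTO temp VALUES (?, ?, ?, ?, ?)", rows)

# A missing from_date takes the previous row's to_date; a missing to_date
# takes the next row's from_date, or 9999-01-01 when there is no next row.
query = """
SELECT person,
       house,
       COALESCE(from_date, LAG(to_date) OVER w) AS filled_from,
       COALESCE(to_date, LEAD(from_date) OVER w, '9999-01-01') AS filled_to
FROM temp
WINDOW w AS (PARTITION BY person ORDER BY seq)
ORDER BY seq
"""
filled = conn.execute(query).fetchall()
for row in filled:
    print(row)
```

A null address with no earlier row for the same person keeps a NULL from date, as the question expects, because lag() has nothing to return.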

-- Note: isnull() and ORDER BY (SELECT 0) are SQL Server idioms; coalesce() is the portable equivalent of isnull().
-- The row numbers rely on the rows coming back in insertion order, which is not guaranteed.
create table Temp
(
    person varchar(2),
    house int,
    from_date date,
    to_date date
);
insert into temp values
    (1,586,'2000-04-16','2010-12-03'),
    (2,123,'2001-01-01','2012-09-27'),
    (2,NULL,NULL,NULL),
    (2,104,'2004-01-01','2012-11-24'),
    (3,987,'1999-12-31','2009-08-01'),
    (3,NULL,NULL,NULL);
select A.person,
       A.house,
       isnull(A.from_date, BF.to_date) From_date,
       isnull(A.to_date, isnull(CT.From_date, '9999-01-01')) To_date
from
    ((select *, ROW_NUMBER() over (order by (select 0)) rownum from Temp) A left join
     (select *, ROW_NUMBER() over (order by (select 0)) rownum from Temp) BF
        on A.person = BF.person and
           A.rownum = BF.rownum + 1) left join
     (select *, ROW_NUMBER() over (order by (select 0)) rownum from Temp) CT
        on A.person = CT.person and
           A.rownum = CT.rownum - 1
