How should I handle the start and end date for each address change in Oracle?

I have a request to generate a report from the following data in an Oracle table. This is just an example for one member.
MEMBER_ID  START_DATE  END_DATE    ADDRESS1    ADDRESS2  CITY  STATE  LAST_UPDATED
12345      1/1/2019    12/31/9999  1 Test Ave  Apt 111   City  AA     3/4/2020
12345      1/1/2019    12/31/9999  2 Test Dr   Apt 222   City  AA     9/5/2019
12345      1/1/2019    12/31/9999  1 Test Ave  APT 111   City  AA     6/3/2019
12345      1/1/2019    12/31/9999  3 Test TRL            City  AA     3/3/2019
I want this as my output on the report from the data above:
MEMBER_ID  START_DATE  END_DATE    ADDRESS1    ADDRESS2  CITY  STATE  LAST_UPDATED
12345      10/1/2019   12/31/9999  1 Test Ave  Apt 111   City  AA     3/4/2020
12345      7/1/2019    9/30/2019   2 Test Dr   Apt 222   City  AA     9/5/2019
12345      4/1/2019    6/30/2019   1 Test Ave  APT 111   City  AA     6/3/2019
12345      1/1/2019    3/31/2019   3 Test TRL            City  AA     3/3/2019
Would someone be able to help with this? I tried DENSE_RANK but couldn't work out logic that behaves correctly. If a member has another address change, I would need to pull the latest change into the report as well.

You seem to want each record to end on the last day of the month of its last_updated value, with the next record beginning on the following day.
This is easily handled using lag():
select t.*,
       ( lag(last_day(last_updated)) over (partition by member_id order by last_updated) +
         interval '1' day
       ) as new_start_date,
       last_day(last_updated) as new_end_date
from t;
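To make the lag() logic easier to follow, here is a rough Python sketch of the same idea applied to the sample dates. The helper names `last_day` and `month_end_ranges` are made up for illustration; the sketch only mimics what the query computes and is not Oracle code.

```python
import calendar
from datetime import date, timedelta

def last_day(d):
    """Mimic Oracle's LAST_DAY: the final day of d's month."""
    return d.replace(day=calendar.monthrange(d.year, d.month)[1])

def month_end_ranges(updates):
    """Return (new_start, new_end, last_updated) tuples, oldest update first."""
    rows = []
    prev_end = None  # like LAG(), there is no previous value for the first row
    for updated in sorted(updates):
        new_end = last_day(updated)
        new_start = prev_end + timedelta(days=1) if prev_end else None
        rows.append((new_start, new_end, updated))
        prev_end = new_end
    return rows

# LAST_UPDATED values for member 12345 from the question
updates = [date(2020, 3, 4), date(2019, 9, 5), date(2019, 6, 3), date(2019, 3, 3)]
ranges = month_end_ranges(updates)
for new_start, new_end, updated in ranges:
    print(new_start, new_end, updated)
```

Note that the oldest row gets no start date (lag() returns NULL there) and the newest row ends on its own month end rather than 12/31/9999, so the query would probably need a COALESCE or CASE on top to match the requested report exactly.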

I think you need the quarter start and end dates of last_updated as the start and end dates.
Select member_id,
       Trunc(last_updated,'Q') as start_date,
       case
         -- note: Trunc(..., 'Q') only returns months 1, 4, 7 or 10,
         -- so this branch never matches as written
         when extract(month from Trunc(last_updated,'Q')) = 12
         then end_date
         else Add_months(Trunc(last_updated,'Q'), 3) - 1
       end as end_date,
       .....
From your_table
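As a quick check of the quarter arithmetic, here is a small Python sketch that computes the same bounds as Trunc(last_updated,'Q') and Add_months(..., 3) - 1. The function name `quarter_bounds` is an assumption for illustration only.

```python
from datetime import date, timedelta

def quarter_bounds(d):
    """Return (first day, last day) of the calendar quarter containing d."""
    start_month = 3 * ((d.month - 1) // 3) + 1  # always 1, 4, 7 or 10
    start = date(d.year, start_month, 1)
    # First day of the following quarter, rolling over the year after Q4
    if start_month == 10:
        next_start = date(d.year + 1, 1, 1)
    else:
        next_start = date(d.year, start_month + 3, 1)
    return start, next_start - timedelta(days=1)

q2 = quarter_bounds(date(2019, 6, 3))    # a LAST_UPDATED value from the question
q4 = quarter_bounds(date(2019, 12, 15))  # hypothetical date in the last quarter
print(q2, q4)
```

Because the quarter start month is never 12, a check for the last quarter would have to compare against 10.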

Related

Add daily status value to records ID in bigquery

I have a table with records for each ID at certain dates. I want to add the missing dates for each ID so I can track the daily status of a value. If there is no change, the value from the previous day carries over.
*The date list should end at CURRENT_DATE(), so I always have the status as of the current date. In the example below, 2023-02-05 is the current date, so an ID might not have any change on the current date.
This is the table
ID   First_name  Last_name  Date        Value
aaa  Adam        Glen       2023-02-02  Green
aaa  Adam        Glen       2023-02-05  Red
bbb  Daniel      Blue       2023-02-02  Red
bbb  Daniel      Blue       2023-02-04  Green
This is the output I want to have from the query
ID   First_name  Last_name  Date        Value
aaa  Adam        Glen       2023-02-02  Green
aaa  Adam        Glen       2023-02-03  Green
aaa  Adam        Glen       2023-02-04  Green
aaa  Adam        Glen       2023-02-05  Red
bbb  Daniel      Blue       2023-02-02  Red
bbb  Daniel      Blue       2023-02-03  Red
bbb  Daniel      Blue       2023-02-04  Green
bbb  Daniel      Blue       2023-02-05  Green
A little bit verbose but you might consider below approach.
SELECT * EXCEPT(Min_date, Value),
       COALESCE(Value, LAST_VALUE(Value IGNORE NULLS) OVER w) AS Value
  FROM (
    SELECT ID, First_name, Last_name, MIN(Date) Min_date FROM sample_table GROUP BY 1, 2, 3
  ), UNNEST(GENERATE_DATE_ARRAY(Min_date, '2023-02-05')) Date
  LEFT JOIN sample_table USING (ID, First_name, Last_name, Date)
WINDOW w AS (PARTITION BY ID ORDER BY Date);
You can replace '2023-02-05' in GENERATE_DATE_ARRAY(Min_date, '2023-02-05') with CURRENT_DATE().
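For readers who want to see the gap-filling logic outside BigQuery, here is a minimal Python sketch of the same approach: generate every date from each ID's first date through the current date, then carry the last known value forward. The function name `fill_daily` and the tuple layout are assumptions for illustration.

```python
from datetime import date, timedelta

def fill_daily(rows, today):
    """rows: (id, first_name, last_name, date, value). One row per id per day through today."""
    by_id = {}
    for id_, first, last, day, value in rows:
        entry = by_id.setdefault(id_, {"name": (first, last), "values": {}})
        entry["values"][day] = value
    out = []
    for id_ in sorted(by_id):
        info = by_id[id_]
        day = min(info["values"])  # plays the role of Min_date in the query
        current = None
        while day <= today:
            # Use the recorded value if one exists, else carry the last one forward
            current = info["values"].get(day, current)
            out.append((id_, *info["name"], day, current))
            day += timedelta(days=1)
    return out

sample = [
    ("aaa", "Adam", "Glen", date(2023, 2, 2), "Green"),
    ("aaa", "Adam", "Glen", date(2023, 2, 5), "Red"),
    ("bbb", "Daniel", "Blue", date(2023, 2, 2), "Red"),
    ("bbb", "Daniel", "Blue", date(2023, 2, 4), "Green"),
]
filled = fill_daily(sample, today=date(2023, 2, 5))
for row in filled:
    print(row)
```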

Trying to count unique observations in SQL using Partition By

I have these two datasets:
Conditions: I would like to count the number of unique Discharge_ID values as Total_Discharges in my final dataset.
ICU_ID is a little bit more difficult. For PT_ID 001, what is happening is that PT 001 has 4 of the same discharge dates but 4 unique ICU_IDs. Since all of these ICU_IDs occur within 30 days of the Discharge_DT, I only want to count one of them. That is why total discharges for AZ is 1 and ICU_Admits = 1.
For PT_ID 002, I have 2 different Discharge_IDs but 1 ICU Admit that occurred within 30 days of both of the Discharge_IDs. I would like to count the Discharges as 2, and ICU_admits as 1.
DF1: Dataset of Discharges from hospital and admission to ICU within 30 days of Discharge_DT
City  PT_ID  Hospital_ID  Admit_Dt    Discharge_DT  Discharge_ID                   ICU_ID
AZ    001    ABC          01-01-2021  01-03-2021    001,ABC,01-01-2021,01-03-2021  001,XYZ,01-05-2021,01-06-2021
AZ    001    ABC          01-01-2021  01-03-2021    001,ABC,01-01-2021,01-03-2021  001,XYZ,01-08-2021,01-09-2021
AZ    001    ABC          01-01-2021  01-03-2021    001,ABC,01-01-2021,01-03-2021  001,XYZ,01-11-2021,01-11-2021
AZ    001    ABC          01-01-2021  01-03-2021    001,ABC,01-01-2021,01-03-2021  001,XYZ,01-15-2021,01-16-2021
CA    002    DEF          04-03-2021  04-07-2021    001,ABC,04-03-2021,04-07-2021  002,LMN,04-27-2021,04-27-2021
CA    002    DEF          04-20-2021  04-21-2021    001,ABC,04-20-2021,04-21-2021  002,LMN,04-27-2021,04-27-2021
DF desired:
City  TotalDischarges  ICU_Admit
AZ    1                1
CA    2                1
Current Code:
DROP TABLE IF EXISTS #edit1
;WITH CTE_df1 as (
    select * from df1
)
select
    City,
    PT_ID,
    Hospital_ID,
    Admit_Dt,
    Discharge_DT,
    Discharge_ID,
    count(ICU_ID) over (partition by ICU_ID) as ICU_Pts,
    count(distinct Discharge_ID) as Total_Discharges
into #edit1
from CTE_df1
group by City, Discharge_ID, ICU_ID, PT_ID
order by City
;with CTE_edit1 as (
    select * from #edit1
)
select City, sum(ICU_Pts), sum(Total_Discharges)
from CTE_edit1
group by City
order by City
Current Output: PT_ID 001 works great but PT_ID 002 shows up at 2 in ICU_Admit as it is counting both as unique ICU visits.
City  TotalDischarges  ICU_Admit
AZ    1                1
CA    2                2
Any help would be appreciated
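One way to read the conditions above is: keep at most one ICU_ID per Discharge_ID, then count distinct discharges and distinct kept ICU_IDs per city. The Python sketch below follows that reading. It is only an interpretation of the stated rules, and `summarize` with its row layout is a hypothetical helper, not the asker's code.

```python
from collections import defaultdict

def summarize(rows):
    """rows: (city, discharge_id, icu_id). Returns {city: (total_discharges, icu_admits)}."""
    # Step 1: keep a single representative ICU_ID per discharge (the smallest string)
    first_icu = {}
    for city, discharge_id, icu_id in rows:
        key = (city, discharge_id)
        if key not in first_icu or icu_id < first_icu[key]:
            first_icu[key] = icu_id
    # Step 2: count distinct discharges and distinct kept ICU_IDs per city
    discharges = defaultdict(set)
    icus = defaultdict(set)
    for (city, discharge_id), icu_id in first_icu.items():
        discharges[city].add(discharge_id)
        icus[city].add(icu_id)
    return {city: (len(discharges[city]), len(icus[city])) for city in discharges}

df1 = [
    ("AZ", "001,ABC,01-01-2021,01-03-2021", "001,XYZ,01-05-2021,01-06-2021"),
    ("AZ", "001,ABC,01-01-2021,01-03-2021", "001,XYZ,01-08-2021,01-09-2021"),
    ("AZ", "001,ABC,01-01-2021,01-03-2021", "001,XYZ,01-11-2021,01-11-2021"),
    ("AZ", "001,ABC,01-01-2021,01-03-2021", "001,XYZ,01-15-2021,01-16-2021"),
    ("CA", "001,ABC,04-03-2021,04-07-2021", "002,LMN,04-27-2021,04-27-2021"),
    ("CA", "001,ABC,04-20-2021,04-21-2021", "002,LMN,04-27-2021,04-27-2021"),
]
summary = summarize(df1)
print(summary)
```

In SQL terms this would roughly correspond to a MIN(ICU_ID) grouped by City and Discharge_ID, followed by COUNT(DISTINCT ...) per City, though that translation is untested here.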

Find the first two dated records of each member for every year (DB2 UDB)

Need help.
Scenario: I have a table testdata with columns
memberid (varchar), codetype (varchar), effectivedate (datetime).
The table has 20k records, from year 2015 to 2021.
I need to find the first two records by date for each member in every year (only memberid is unique).
E.g.
member id  codetype  effectivedate
123        ABC       1/2/2015
123        ABC       1/2/2015
123        ABC       8/15/2015
123        EFG       9/15/2015
123        EFG       2/15/2018
345        EFG       3/14/2018
345        EFG       3/17/2018
345        ABC       9/19/2020
456        EFG       12/20/2021
The result should look like this:
member id  codetype  effectivedate
123        ABC       1/2/2015
123        ABC       1/2/2015
123        ABC       2/15/2018
345        EFG       3/14/2018
345        EFG       3/17/2018
345        ABC       9/19/2020
456        EFG       12/20/2021
I have tried a lot of ways but no luck so far.
Try this
with u as (
    select memberid, codetype, effectivedate,
           -- order by effectivedate so rows 1 and 2 are the earliest dates in each year
           row_number() over (partition by memberid, year(effectivedate)
                              order by effectivedate) as rownum
    from testdata
)
select memberid, codetype, effectivedate
from u
where rownum <= 2
Basically, you number the records within each memberid and year of effectivedate, ordered by effectivedate, then keep only the records where rownum is 1 or 2.
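The same "number rows per group, keep the first two" idea can be sketched in plain Python, which may help when checking the query against the sample data. The function name `first_two_per_year` is made up for illustration.

```python
from collections import defaultdict
from datetime import date

def first_two_per_year(rows):
    """rows: (memberid, codetype, effectivedate). Keep the two earliest rows per member and year."""
    groups = defaultdict(list)
    for row in rows:
        groups[(row[0], row[2].year)].append(row)  # partition by memberid, year
    result = []
    for key in sorted(groups):
        ordered = sorted(groups[key], key=lambda r: r[2])  # order by effectivedate
        result.extend(ordered[:2])                          # rownum <= 2
    return result

testdata = [
    ("123", "ABC", date(2015, 1, 2)),
    ("123", "ABC", date(2015, 1, 2)),
    ("123", "ABC", date(2015, 8, 15)),
    ("123", "EFG", date(2015, 9, 15)),
    ("123", "EFG", date(2018, 2, 15)),
    ("345", "EFG", date(2018, 3, 14)),
    ("345", "EFG", date(2018, 3, 17)),
    ("345", "ABC", date(2020, 9, 19)),
    ("456", "EFG", date(2021, 12, 20)),
]
top_two = first_two_per_year(testdata)
for row in top_two:
    print(row)
```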

Getting a snapshot on the first day of the each month

My dataset looks like this:
emplid  region  location  sub_dept  dept  start_dt  end_dt      days
------  ------  --------  --------  ----  --------  ----------  ----
123456  East    NY        A         1     7/1/2005  9/30/2005   91
123456  East    NY        B         1     7/1/2012  11/9/2012   131
123456  West    San Jose  C         2     7/1/2013  12/31/2013  183
123457  East    NY        B         1     7/1/2017  9/7/2017    68
123457  East    NY        B         1     7/1/2005  12/31/2005  183
123458  East    NY        B         1     7/1/2017  9/7/2017    68
123458  West    San Jose  C         2     7/1/2010  7/31/2010   30
123459  East    NY        A         1     7/1/2017  9/7/2017    68
123460  East    Boston    F         3     7/1/2007  11/30/2007  152
I need to get a snapshot for the 1st of each month, starting from the minimum date. In the example the minimum date is 9/30/2005, so I need to know which dept/sub_dept/location/region each employee was in on 10/1/2005, 11/1/2005, 12/1/2005, and so on through the max date.
You didn't mention the name of the employee table, so I've called it employee_table. The following query (or something very close to it) should generate what you want:
With report_limits as (
    Select Trunc(min(start_dt), 'MONTH') as min_rpt_dt,
           Trunc(max(end_dt), 'MONTH') as max_rpt_dt
    From employee_table),
report_dates as (
    Select add_months(min_rpt_dt, level-1) as rpt_dt
    From report_limits
    Connect By add_months(min_rpt_dt, level-1) <= max_rpt_dt)
--
Select e.emplid, e.region, e.location, e.sub_dept, e.dept,
       e.start_dt, e.end_dt, e.days, r.rpt_dt
From report_dates r
Inner Join employee_table e on r.rpt_dt Between e.start_dt And e.end_dt
Order By r.rpt_dt, e.emplid;
The report_limits query determines the range of report dates, the report_dates query uses a Connect By clause to generate a set of dates within the range, and the main query joins the list of dates to the employee data.
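Those three steps can be sketched in Python to see what the join produces. The names `month_starts` and `snapshots`, and the reduced row layout (emplid, dept, start_dt, end_dt), are assumptions for illustration; only two rows from the question are used.

```python
from datetime import date

def month_starts(first, last):
    """Yield the 1st of every month from first's month through last's month."""
    year, month = first.year, first.month
    while (year, month) <= (last.year, last.month):
        yield date(year, month, 1)
        month += 1
        if month > 12:
            year, month = year + 1, 1

def snapshots(rows):
    """rows: (emplid, dept, start_dt, end_dt). Returns (rpt_dt, emplid, dept) per active row."""
    lo = min(r[2] for r in rows)  # report_limits: earliest start
    hi = max(r[3] for r in rows)  # report_limits: latest end
    return [
        (rpt, r[0], r[1])
        for rpt in month_starts(lo, hi)  # report_dates
        for r in rows
        if r[2] <= rpt <= r[3]           # rpt_dt Between start_dt And end_dt
    ]

employees = [
    (123456, 1, date(2005, 7, 1), date(2005, 9, 30)),
    (123457, 1, date(2005, 7, 1), date(2005, 12, 31)),
]
snap = snapshots(employees)
for row in snap:
    print(row)
```

Employee 123456 drops out of the snapshot from 10/1/2005 onward because that row ended on 9/30/2005.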
Try this query:
Declare @StartDate date = '2005-09-29',
        @EndDate date = '2017-04-01'
Select *,
       Dateadd(mm, Datediff(mm, 0, date), 0) AS FirstDateOfMonth
from TableName
where date >= @StartDate and date <= @EndDate

Records with overlapping dates

I have an addresses table, say:
address_id  person_id  start_date  stop_date  address
1           123        01-JAN-15   01-JUN-15  india
2           123        01-MAY-15   null       russia
3           321        01-JAN-15   01-JUN-15  us
4           321        10-MAY-15   null       india
I want to find all records (address_id values) which have overlapping dates for the same person_id. In this example that would find address_id 2 and 4, as May lies between Jan and Jun.
I then want to update the stop_date to start_date - 1 of the subsequent row belonging to the same person so that the overlap is removed. For instance, updating stop_date to 09-MAY-2015 at the row with address_id 3.
So I want to end up with:
address_id  person_id  start_date  stop_date  address
1           123        01-JAN-15   30-APR-15  india
2           123        01-MAY-15   null       russia
3           321        01-JAN-15   09-MAY-15  us
4           321        10-MAY-15   null       india
I have tried:
update (
    select * from addresses a1, addresses a2
    where a1.person_id = a2.person_id
    and a2.start_date > a1.start_date and a2.start_date < a1.stop_date
)
set a1.stop_date = a2.start_date - 1;
This worked fine in Microsoft Access, but in Oracle it raises an invalid identifier error for a2.start_date.
How can I perform this update?
You can use a correlated update:
update addresses a
set stop_date = (
    select min(start_date) - 1
    from addresses
    where person_id = a.person_id
    and start_date > a.start_date
    and start_date <= a.stop_date
)
where exists (
    select null
    from addresses
    where person_id = a.person_id
    and start_date > a.start_date
    and start_date <= a.stop_date
);
2 rows updated.
select * from addresses;
ADDRESS_ID  PERSON_ID START_DATE STOP_DATE ADDRESS
---------- ---------- ---------- --------- ----------
         1        123 01-JAN-15  30-APR-15 india
         2        123 01-MAY-15            russia
         3        321 01-JAN-15  09-MAY-15 us
         4        321 10-MAY-15            india
Both the set subquery and the exists subquery look for a row for the same person whose start date falls between the start and stop dates of the current row (which is the correlated part). The exists clause means only rows that have such a match are updated; without it, any row that has no overlap would have its stop_date set to null. (You wouldn't see any difference with the sample data, but you would if you had more data.)
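If you want to try the correlated-update pattern without an Oracle instance, the sketch below reproduces it with Python's built-in sqlite3 module. This is an adaptation, not the Oracle statement itself: it assumes dates stored as ISO text and uses SQLite's date(..., '-1 day') in place of Oracle's date arithmetic, and it qualifies the outer table by name because the inner copy is aliased as b.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE addresses (address_id INTEGER, person_id INTEGER, "
    "start_date TEXT, stop_date TEXT, address TEXT)"
)
conn.executemany(
    "INSERT INTO addresses VALUES (?, ?, ?, ?, ?)",
    [
        (1, 123, "2015-01-01", "2015-06-01", "india"),
        (2, 123, "2015-05-01", None, "russia"),
        (3, 321, "2015-01-01", "2015-06-01", "us"),
        (4, 321, "2015-05-10", None, "india"),
    ],
)

# Same shape as the Oracle update: a correlated SET subquery plus a matching EXISTS
cur = conn.execute(
    """
    UPDATE addresses
    SET stop_date = (
        SELECT date(min(b.start_date), '-1 day')
        FROM addresses b
        WHERE b.person_id = addresses.person_id
          AND b.start_date > addresses.start_date
          AND b.start_date <= addresses.stop_date
    )
    WHERE EXISTS (
        SELECT 1
        FROM addresses b
        WHERE b.person_id = addresses.person_id
          AND b.start_date > addresses.start_date
          AND b.start_date <= addresses.stop_date
    )
    """
)
updated = cur.rowcount
result = conn.execute(
    "SELECT address_id, stop_date FROM addresses ORDER BY address_id"
).fetchall()
print(updated, result)
```

Rows with a null stop_date never satisfy the comparison against stop_date, so the exists clause leaves them untouched.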