I was wondering if someone could help me out with the following logic. I'm trying to create a new case number every 30 days a unique patient has a gap in services. Currently the case numbers aren't going past 2.
select t1.*,
dense_rank() over(
partition by t1.D_UNIQUEPATIENTID
order by t1.d_ServiceStartDate) episode_rank
into #temp2
from #temp1 t1
-- select * from #temp2
--This is every episode and the days between episodes.
select *,
isnull(abs(datediff(day,t1.d_ServiceStartDate, (select top 1 t2.[d_ServiceEndDate] from #temp2 as t2 where t2.D_UNIQUEPATIENTID =t1.D_UNIQUEPATIENTID
and t2.episode_rank < t1.episode_rank order by t2.episode_rank desc))),0) as day_count, 1 AS e2
into #temp3
from #temp2 as t1
SELECT *
,(CASE WHEN t.day_count > 30 THEN t.e2 + 1 ELSE t.e2 end)AS Case_Num
into #temp4
FROM #temp3 AS t
-- select * from #temp4
-- This should return the $amt per case, per member
select D_UNIQUEPATIENTID, Case_Num,min(d_ServiceStartDate) as 'Start_Date',
max(d_ServiceEndDate) as End_Date,Sum(Visits)as Visits,sum(allowed) as 'Allowed Total'
from #temp4
where [d_IncurredPD] between '201801' and'201906'
group by D_UNIQUEPATIENTID, Case_Num
I'm trying to create a new case number every 30 days a unique patient has a gap in services.
It is unclear whether you want this per patient or overall. I assume per patient, although the solution can be adapted.
I have no idea how the code sample relates to this question, so I'll write this generically:
select t.*,
sum(case when prev_servicedate >= servicedate - interval '30 day'
then 0 else 1
end) over (partition by patientid order by servicedate) as patient_case
from (select t.*,
lag(servicedate) over (partition by patientid order by servicedate) as prev_servicedate
from t
) t;
Note that this uses standard date functions; these usually vary by database but you haven't specified the database (as I write this).
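The same idea (flag each row that starts a new case, then keep a running total of those flags per patient) can be sketched outside SQL. The Python below is only an illustration under assumptions: the function name `assign_cases` and the sample rows are hypothetical, and each row is assumed to be a (patient_id, service_date) pair. Because the flags are accumulated rather than added to a constant, the case number can grow past 2.

```python
from datetime import date, timedelta


def assign_cases(rows, gap_days=30):
    """Return (patient_id, service_date, case_num) for each input row.

    A new case starts on a patient's first row, or when more than
    `gap_days` days have passed since that patient's previous row.
    """
    result = []
    state = {}  # patient_id -> (previous service date, current case number)
    for patient_id, service_date in sorted(rows):
        prev = state.get(patient_id)
        if prev is None:
            case_num = 1
        elif service_date - prev[0] > timedelta(days=gap_days):
            case_num = prev[1] + 1  # running total of "new case" flags
        else:
            case_num = prev[1]
        state[patient_id] = (service_date, case_num)
        result.append((patient_id, service_date, case_num))
    return result


# Hypothetical sample data, not taken from the question.
sample = [
    ("A", date(2018, 1, 1)),
    ("A", date(2018, 1, 15)),
    ("A", date(2018, 3, 1)),
    ("A", date(2018, 5, 1)),
    ("B", date(2018, 2, 1)),
]
case_numbers = [case for _, _, case in assign_cases(sample)]
```

Here patient A gets cases 1, 1, 2, 3 and patient B gets case 1, since the gaps before 2018-03-01 and 2018-05-01 both exceed 30 days.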
I have a database table that records the status updates for each of my vehicles. I want to calculate how many days each vehicle spends between two specific statuses, 'Maintenance' and 'Ready'.
My table looks something like this
and I want the result to look like this, showing only the number of days a vehicle spends in maintenance before becoming ready on a specific day
The code I have written looks like this
drop table if exists #temps1
select
VehicleId,
json_value(VehiclesHistoryStatusID.text,'$.en') as VehiclesHistoryStatus,
VehiclesHistory.CreationTime,
datediff(day, VehiclesHistory.CreationTime ,
lead(VehiclesHistory.CreationTime ) over (order by VehiclesHistory.CreationTime ) ) as days,
lag(json_value(VehiclesHistoryStatusID.text,'$.en')) over (order by VehiclesHistory.CreationTime) as PrevStatus,
case
when (lag(json_value(VehiclesHistoryStatusID.text,'$.en')) over (order by VehiclesHistory.CreationTime) <> json_value(VehiclesHistoryStatusID.text,'$.en')) THEN datediff(day, VehiclesHistory.CreationTime , (lag(VehiclesHistory.CreationTime ) over (order by VehiclesHistory.CreationTime ))) else 0 end as testing
into #temps1
from fleet.VehicleHistory VehiclesHistory
left join Fleet.Lookups as VehiclesHistoryStatusID on VehiclesHistoryStatusID.Id = VehiclesHistory.StatusId
where (year(VehiclesHistory.CreationTime) > 2021 and (VehiclesHistory.StatusId = 140 Or VehiclesHistory.StatusId = 144) )
group by VehiclesHistory.VehicleId ,VehiclesHistory.CreationTime , VehiclesHistoryStatusID.text
order by VehicleId desc
drop table if exists #temps2
select * into #temps2 from #temps1 where testing <> 0
select * from #temps2
Try this
SELECT innerQ.VehichleID,innerQ.CreationDate,innerQ.Status
,SUM(DATEDIFF(DAY,innerQ.PrevMaintenance,innerQ.CreationDate)) AS DayDuration
FROM
(
SELECT t1.VehichleID,t1.CreationDate,t1.Status,
(SELECT top(1) t2.CreationDate FROM dbo.Test t2
WHERE t1.VehichleID=t2.VehichleID
AND t2.CreationDate<t1.CreationDate
AND t2.Status='Maintenance'
ORDER BY t2.CreationDate Desc) AS PrevMaintenance
FROM
dbo.Test t1 WHERE t1.Status='Ready'
) innerQ
WHERE innerQ.PrevMaintenance IS NOT NULL
GROUP BY innerQ.VehichleID,innerQ.CreationDate,innerQ.Status
In this query, the innermost subquery first finds the most recent 'Maintenance' date before each 'Ready' date, if one exists. The outer query then calculates each time span with DATEDIFF and sums the spans for each vehicle and 'Ready' date.
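As a rough cross-check of that logic, here is a minimal Python sketch of the same lookup. It assumes rows shaped like (vehicle_id, creation_date, status); the function name `maintenance_durations` and the sample data are hypothetical. Like the correlated subquery, it only considers 'Maintenance' rows strictly earlier than the 'Ready' row.

```python
from datetime import date


def maintenance_durations(history):
    """For each 'Ready' row, return (vehicle_id, ready_date, days) where
    days is counted from the most recent earlier 'Maintenance' row."""
    rows = list(history)
    durations = []
    for vehicle_id, created, status in rows:
        if status != "Ready":
            continue
        # Mirrors the TOP(1) ... ORDER BY CreationDate DESC subquery.
        earlier = [
            c for v, c, s in rows
            if v == vehicle_id and s == "Maintenance" and c < created
        ]
        if earlier:  # mirrors PrevMaintenance IS NOT NULL
            durations.append((vehicle_id, created, (created - max(earlier)).days))
    return durations


# Hypothetical sample data.
sample = [
    (1, date(2022, 1, 1), "Maintenance"),
    (1, date(2022, 1, 5), "Ready"),
    (1, date(2022, 2, 10), "Maintenance"),
    (1, date(2022, 2, 12), "Ready"),
    (2, date(2022, 3, 1), "Ready"),  # no earlier maintenance, so skipped
]
result = maintenance_durations(sample)
```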
Given two time-series tables tbl1(time, b_value) and tbl2(time, u_value).
https://www.db-fiddle.com/f/4qkFJZLkZ3BK2tgN4ycCsj/1
Suppose we want to find the last value of u_value in each day, the daily cumulative sum of b_value on that day, as well as their product, i.e. daily_u_value * b_value_cum_sum.
The following query calculates the desired output:
WITH cte AS (
SELECT
t1.time,
t1.b_value,
t2.u_value * t1.b_value AS bu_value,
last_value(t2.u_value)
OVER
(PARTITION BY DATE_TRUNC('DAY', t1.time) ORDER BY DATE_TRUNC('DAY', t2.time) ) AS daily_u_value
FROM stackoverflow.tbl1 t1
LEFT JOIN stackoverflow.tbl2 t2
ON
t1.time = t2.time
)
SELECT
DATE_TRUNC('DAY', c.time) AS time,
AVG(c.daily_u_value) AS daily_u_value,
SUM( SUM(c.b_value)) OVER (ORDER BY DATE_TRUNC('DAY', c.time) ) as b_value_cum_sum,
AVG(c.daily_u_value) * SUM( SUM(c.b_value) ) OVER (ORDER BY DATE_TRUNC('DAY', c.time) ) as daily_u_value_mul_b_value
FROM cte c
GROUP BY 1
ORDER BY 1 DESC
I was wondering what I can do to optimize this query? Is there any alternative solution that generates the same result?
db fiddle demo
Your query: Execution Time: 250.666 ms. My query: Execution Time: 205.103 ms.
So there is some progress. The gain comes mainly from avoiding repeated casts, since your query casts from timestamptz to timestamp many times. Why not just add a separate date column?
I executed my query first and then yours, so the comparison is fair to your query: a second execution is generally faster than the first.
alter table tbl1 add column t1_date date;
alter table tbl2 add column t2_date date;
update tbl1 set t1_date = time::date;
update tbl2 set t2_date = time::date;
WITH cte AS (
SELECT
t1.t1_date,
t1.b_value,
t2.u_value * t1.b_value AS bu_value,
last_value(t2.u_value)
OVER
(PARTITION BY t1_date ORDER BY t2_date ) AS daily_u_value
FROM stackoverflow.tbl1 t1
LEFT JOIN stackoverflow.tbl2 t2
ON
t1.time = t2.time
)
SELECT
t1_date,
AVG(c.daily_u_value) AS daily_u_value,
SUM( SUM(c.b_value)) OVER (ORDER BY t1_date ) as b_value_cum_sum,
AVG(c.daily_u_value) * SUM( SUM(c.b_value) ) OVER
(ORDER BY t1_date ) as daily_u_value_mul_b_value
FROM cte c
GROUP BY 1
ORDER BY 1 DESC
My tracking system does not generate session IDs.
I have user_id and event_date_time.
I need a new session_id for each user session that starts 30 minutes or more after that user's last event_date_time.
My final goal is to calculate the median session time.
I tried to generate session_id=1 and session_id=2 once event_date_time-next_event_time>30 and guid=guid, but I'm stuck from here
select a.*,
case when (a.next_event_date-a.event_date)*24*60<30 and userID=next_userID
then 1
when (a.next_event_date-a.event_date)*24*60>=30 and userID=next_userID then
2
end session_id
from
(select f.userID,
lead(f.userID) over (partition by f.guid order by f.event_date)
next_guid,
f.event_date,
lead(f.event_date) over (partition by f.guid order by f.event_date)
next_event_date
from event_table f
)a
where next_event_date is not null
If I understood correctly, you could generate IDs this way:
select id, guid, event_date,
sum(chg) over (partition by guid order by event_date) session_id
from (
select id, guid, event_date,
case when lag(guid) over (partition by guid order by event_date) = guid
and 24 * 60 * (event_date -lag(event_date)
over (partition by guid order by event_date) ) < 30
then 0 else 1
end chg
from event_table ) a
dbfiddle demo
Compare neighbouring rows: if the guids differ or the time difference is 30 minutes or more, assign 1; otherwise assign 0. Then sum these values analytically.
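To show how the flag-and-sum approach leads to the stated goal of a median session time, here is a minimal Python sketch. It is an illustration only: the names `sessionize` and `median_session_minutes` and the sample events are hypothetical, events are assumed to be (user_id, event_time) pairs, and a session's length is assumed to be the time between its first and last event.

```python
from datetime import datetime, timedelta
from statistics import median


def sessionize(events, timeout=timedelta(minutes=30)):
    """Return (user_id, event_time, session_id), numbered per user.

    A new session starts on a user's first event or when the gap to that
    user's previous event is `timeout` or more.
    """
    out = []
    prev_user, prev_time, session_id = None, None, 0
    for user_id, event_time in sorted(events):
        if user_id != prev_user:
            session_id = 1
        elif event_time - prev_time >= timeout:
            session_id += 1
        out.append((user_id, event_time, session_id))
        prev_user, prev_time = user_id, event_time
    return out


def median_session_minutes(sessions):
    """Median of (last event - first event) per (user, session), in minutes."""
    bounds = {}
    for user_id, event_time, session_id in sessions:
        start, end = bounds.get((user_id, session_id), (event_time, event_time))
        bounds[(user_id, session_id)] = (min(start, event_time), max(end, event_time))
    return median((end - start).total_seconds() / 60 for start, end in bounds.values())


# Hypothetical sample data.
sample = [
    ("u1", datetime(2020, 1, 1, 10, 0)),
    ("u1", datetime(2020, 1, 1, 10, 10)),
    ("u1", datetime(2020, 1, 1, 10, 50)),  # 40-minute gap: new session
    ("u1", datetime(2020, 1, 1, 11, 0)),
    ("u2", datetime(2020, 1, 1, 9, 0)),
    ("u2", datetime(2020, 1, 1, 9, 5)),
]
sessions = sessionize(sample)
session_ids = [s for _, _, s in sessions]
median_minutes = median_session_minutes(sessions)
```

With this sample the three sessions last 10, 10 and 5 minutes, so the median is 10 minutes.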
I think you're on the right track using lead or lag. My recommendation would be to break this into steps and create a temp table to work against:
With the first query, assign every record its own unique ID, either a sequence number or GUID. You could also capture some of the lagged data in this step.
With a second query, find the overlaps (< 30 minutes) and make the overlapping records all the same -- either the same as the earliest or latest in that grouping, doesn't matter as long as it's consistent.
Something like this:
create table events_temp as (
select f.*,
row_number() over (partition by f.userID order by f.event_date) as user_row,
lag(f.userID) over (partition by f.userID order by f.event_date) as prev_userID,
lag(f.event_date) over (partition by f.userID order by f.event_date) as prev_event_date
from event_table f
order by f.userId, f.event_date
);
select a.*,
case when prev_userID = userID
and 24 * 60 * (event_date - prev_event_date) < 30
then lag(user_row) over (partition by userID order by user_row)
else user_row
end as session_id
from events_temp a;
Help! We're trying to create a new column (Is Valid?) to reproduce the logic below.
It is a binary result that:
it is 1 if it is the first known value of an ID
it is 1 if it is 3 seconds or more after the previous "1" of that ID
Note 1: this is not the difference in seconds from the previous record
it is 0 if it is less than 3 seconds after the previous "1" of that ID
Note 2: there are many IDs in the data set
Note 3: original dataset has ID and Date
Attached a PoC of the data and the expected result.
You would have to do this using a recursive CTE, which is quite expensive:
with recursive tt as (
-- number the rows of each id in time order
select t.*, row_number() over (partition by id order by time) as seqnum
from t
),
cte as (
-- anchor: the first row of each id starts the first group
select tt.*, time as grp_start
from tt
where seqnum = 1
union all
-- keep the current group start while within 3 seconds of it,
-- otherwise this row starts a new group
select tt.*,
(case when tt.time < cte.grp_start + interval '3 second'
then cte.grp_start
else tt.time
end)
from cte join
tt
on tt.id = cte.id and tt.seqnum = cte.seqnum + 1
)
select cte.*,
(case when grp_start = lag(grp_start) over (partition by id order by time)
then 0 else 1
end) as isValid
from cte;
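The rule depends on the previous row flagged "1" rather than on the previous row, which is why a plain lag() is not enough. A minimal Python sketch of the rule, assuming rows are (id, time) pairs; the function name `flag_valid` and the sample data are hypothetical:

```python
from datetime import datetime, timedelta


def flag_valid(rows, min_gap=timedelta(seconds=3)):
    """Return (id, time, is_valid) for each row.

    is_valid is 1 for the first row of an id, or when the row is at least
    `min_gap` after the last row of that id that was flagged 1.
    """
    last_valid = {}  # id -> time of the most recent row flagged 1
    out = []
    for row_id, t in sorted(rows):
        anchor = last_valid.get(row_id)
        if anchor is None or t - anchor >= min_gap:
            last_valid[row_id] = t
            out.append((row_id, t, 1))
        else:
            out.append((row_id, t, 0))
    return out


# Hypothetical sample data: seconds 0, 2, 4, 5, 8 for id 1 and second 1 for id 2.
base = datetime(2021, 1, 1, 12, 0, 0)
sample = [(1, base + timedelta(seconds=s)) for s in (0, 2, 4, 5, 8)]
sample.append((2, base + timedelta(seconds=1)))
flags = [flag for _, _, flag in flag_valid(sample)]
```

In the sample, the row at second 4 is only 2 seconds after the previous record but 4 seconds after the previous "1", so it is flagged 1.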
Given this table:
How can I get the datediff in days between each status_date for each group of ID_Number? In other words I need to find the number of elapsed days for each status that the ID_Number has been given.
Some things to know:
All ID_Number will have a received_date which should be the earliest date for each ID_Number (but app doesn't enforce)
For each ID_Number there will be a status with a corresponding status_date which is the date that the ID_Number was given that particular status.
The status column doesn't always necessarily go in the same order every time (app doesn't enforce)
All ID_Number will have a closed_date which should be the latest date (but app doesn't enforce)
Sample output:
So for ID_Number 2001, the first date (received_date) is 2009-05-02 and the next date you encounter has a status of 'open' and is 2009-05-02 so elapsed days is 0. Moving on to the next date encountered is 2009-05-10 with a status of 'invest' and the elapsed days is 8 counting from the prior date. The next date encountered is 2009-07-11 and the elapsed days is 62 counting from the previous date.
Edited to add:
Is it possible to have the elapsed days end up as a column on this table/view?
I also forgot to add that this is SQL Server 2000.
What I understand is that you need the difference between the first status_date and the next status_date for the same id and so on up to the closed_date.
This will only work in SQL 2005 and up.
;with test as (
select
key,
id_number,
status,
received_date,
status_date,
closed_date,
row_number() over (partition by id_number order by status_date, key ) as rownum
from #test
)
select
t1.key,
t1.id_number,
t1.status,
t1.status_date,
t1.received_date,
t1.closed_date,
datediff(d, case when t1.rownum = 1
then t1.received_date
else
case when t2.status_date is null
then t1.closed_date
else t2.status_date
end
end,
t1.status_date
) as days
from test t1
left outer join test t2
on t1.id_number = t2.id_number
and t2.rownum = t1.rownum - 1
This solution will work with SQL 2000, but I am not sure how well it will perform:
select *,
datediff(d,
case when prev_date is null
then closed_date
else prev_date
end,
status_date )
from (
select *,
isnull( ( select top 1 t2.status_date
from #test t2
where t1.id_number = t2.id_number
and t2.status_date < t1.status_date
order by t2.status_date desc
),received_date) as prev_date
from #test t1
) a
order by id_number, status_date
Note: Replace the #Test table with the name of your table.
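The previous-date lookup in the queries above amounts to walking each ID_Number's statuses in date order and subtracting the prior date, starting from received_date. A minimal Python sketch for a single ID_Number, using the dates quoted in the question for 2001; the function name `elapsed_days` is hypothetical, and so is the label "third_status", since the question does not name the status on 2009-07-11:

```python
from datetime import date


def elapsed_days(received_date, statuses):
    """Return (status, status_date, days since the previous date).

    `statuses` is a list of (status, status_date) for one ID_Number; the
    first status is measured from `received_date`.
    """
    out = []
    prev = received_date
    # sorted() is stable, so rows sharing a status_date keep their input order.
    for status, status_date in sorted(statuses, key=lambda s: s[1]):
        out.append((status, status_date, (status_date - prev).days))
        prev = status_date
    return out


rows_2001 = [
    ("open", date(2009, 5, 2)),
    ("invest", date(2009, 5, 10)),
    ("third_status", date(2009, 7, 11)),  # placeholder status name
]
days = [d for _, _, d in elapsed_days(date(2009, 5, 2), rows_2001)]
```

For these dates the elapsed days come out as 0, 8 and 62, matching the walkthrough in the question.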
Some sample output would really help, but this is a guess at what you mean, assuming you want that information for each ID_Number/Status combination:
select ID_Number, Status, EndDate - StartDate as DaysElapsed
from (
select ID_Number, Status, min(coalesce(received_date, status_date)) as StartDate, max(coalesce(closed_date, status_date)) as EndDate
from Table1
group by ID_Number, Status
) a
The tricky bit is determining the previous status and putting it on the same row as the current status. It would be simplified a little if there were a correlation between Key and StatusDate (i.e. that Key(x) > Key(y) always implies StatusDate(x) >= StatusDate(y)). Unfortunately, that doesn't seem to be the case.
PS: I am assuming Key is a unique identifier on your table; you haven't said anything to indicate otherwise.
SELECT Key,
ID_Number,
(
SELECT TOP 1 Key
FROM StatusUpdates prev
WHERE (prev.ID_Number = cur.ID_Number)
AND ( (prev.StatusDate < cur.StatusDate)
OR ( prev.StatusDate = cur.StatusDate
AND prev.Key < cur.Key
)
)
ORDER BY StatusDate DESC, Key DESC /*Latest earlier row first. Consider index on (ID_Number, StatusDate, Key)*/
) PrevKey
FROM StatusUpdates cur
Once you have this as a basis, it's easy to extrapolate to any other info you need from the current or previous StatusUpdate. E.g.
SELECT c.*,
p.Status AS PrevStatus,
p.StatusDate AS PrevStatusDate,
DATEDIFF(d, p.StatusDate, c.StatusDate) AS DaysElapsed
FROM (
SELECT Key,
ID_Number,
Status,
StatusDate,
(
SELECT TOP 1 Key
FROM StatusUpdates prev
WHERE (prev.ID_Number = cur.ID_Number)
AND ( (prev.StatusDate < cur.StatusDate)
OR ( prev.StatusDate = cur.StatusDate
AND prev.Key < cur.Key
)
)
ORDER BY StatusDate DESC, Key DESC
) PrevKey
FROM StatusUpdates cur
) c
JOIN StatusUpdates p ON
p.Key = c.PrevKey