SQL Server 2008 query, time in each status - sql

I'm wondering if anybody can help with a query I am working on. I'm trying to gather information for 'Time in each status' from my call activity table.
I need to set up 3 time ranges in days: <3 days, 4-5 days, 6+ days, returning the number of days each CallID is spending in each status.
The trouble I'm having is that I need to identify from the table below when there was a status change. This table records any activity on the call (e.g. changed customer details), not just status changes.
Apologies if this is unclear, let me know if you need further details.
I'm using SQL Server 2008. Here is the table I'm using and related values:
CREATE TABLE Activity ( CallID varchar(30), Call_Date datetime, [User] varchar(30), Status varchar(10) );
INSERT INTO Activity VALUES (366,'2013/09/27 12:24:33',13,9);
INSERT INTO Activity VALUES (366,'2013/09/28 17:36:14',13,9);
INSERT INTO Activity VALUES (366,'2013/09/29 07:29:18',13,10);
INSERT INTO Activity VALUES (366,'2013/09/30 06:22:12',13,-1);
INSERT INTO Activity VALUES (367,'2013/09/27 12:13:16',9,6);
INSERT INTO Activity VALUES (367,'2013/09/27 12:25:03',9,6);
INSERT INTO Activity VALUES (367,'2013/09/29 12:25:29',9,6);
INSERT INTO Activity VALUES (367,'2013/09/30 12:45:55',9,7);
INSERT INTO Activity VALUES (367,'2013/10/01 12:46:04',9,8);
INSERT INTO Activity VALUES (367,'2013/10/02 15:12:27',9,-1);
INSERT INTO Activity VALUES (368,'2013/08/01 15:09:01',5,10);
INSERT INTO Activity VALUES (368,'2013/08/02 14:11:20',5,13);
INSERT INTO Activity VALUES (368,'2013/08/04 16:41:11',5,13);
INSERT INTO Activity VALUES (368,'2013/08/05 01:12:56',5,-1);
Desired Output 1: E.g. if CallID 35931 took 2 days to change from status 1 to status 2, 2 days would be added to the count in the <3 column
Status  <3 Days  4-5 days  6+ Days
------  -------  --------  -------
1       10       3         1
2       8        1         2
3       5        3         1
I'm stuck in the first stage trying to identify the rows where there are status changes and ignoring the rest. I'm working on a subquery which selects the top date for each change of status. It's bringing back negative values. See here:
select CallID, T2.[status], Call_Date,
sum(datediff(dd, nextDate, [Call_Date]) - (datediff(wk, nextDate, [Call_Date]) * 2) -
case when datepart(wk, nextDate) = 1 then 1 else 0 end +
case when datepart(wk, [Call_Date]) = 7 then 1 else 0 end) as TotalDays
from (select *,
(select MAX( T0.[Call_Date])
from [Activity] T0
where T0.[Call_Date] > T1.[Call_Date] and
T0.CallID = T1.CallID
) as nextDate
from [Activity] T1
) T2
where T2.[status] <> '-1'
group by Call_Date, T2.[status], CallID
Thanks for your help in advance.

First of all, I think you need only the rows with the minimum date for each CallID and status, as those mark a status change. This can be done with a CTE and ROW_NUMBER.
Then join the results so that each record carries both the old status date and the new status date. The first status of each call has no predecessor, so its old-status columns will be NULL.
;WITH CallsCTE AS
(
SELECT CallId,
Call_Date,
Status,
ROW_NUMBER() OVER(PARTITION BY CallId, Status ORDER BY Call_Date) AS rn
FROM Activity
),
StatusChangesCTE AS
(
SELECT CallID,
Call_Date,
Status
FROM CallsCTE
WHERE rn = 1
)
SELECT Sold.*,
Snew.*
FROM StatusChangesCTE Snew
LEFT JOIN StatusChangesCTE Sold
ON Snew.CallID = Sold.CallID
AND Sold.Call_Date = (SELECT MAX(Call_Date) FROM StatusChangesCTE WHERE CallID = Sold.CallID AND Call_Date < Snew.Call_Date)
I think that you can find your way using the above, as you could use DateDiff on Snew.Call_Date and Sold.Call_Date to find the time needed for a status change.
Let me know if you need any more assistance.
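As a rough sketch of how the DateDiff step could feed the three buckets from the question, the query below pairs each status start with the next one for the same call and counts calls per bucket. It assumes a status never recurs for the same call, that "<3 days" is meant to include exactly 3 days, and that the desired output counts calls rather than summing days; the `seq` column, the `Durations` CTE and the bucket aliases are names introduced here for illustration. It uses only ROW_NUMBER and DATEDIFF, both available in SQL Server 2008.

```sql
;WITH CallsCTE AS
(
    SELECT CallID, Call_Date, Status,
           ROW_NUMBER() OVER (PARTITION BY CallID, Status ORDER BY Call_Date) AS rn
    FROM Activity
),
StatusChangesCTE AS
(
    -- first row of each status per call, numbered in time order
    SELECT CallID, Call_Date, Status,
           ROW_NUMBER() OVER (PARTITION BY CallID ORDER BY Call_Date) AS seq
    FROM CallsCTE
    WHERE rn = 1
),
Durations AS
(
    -- days from the start of one status to the start of the next
    SELECT Sold.CallID, Sold.Status,
           DATEDIFF(DAY, Sold.Call_Date, Snew.Call_Date) AS DaysInStatus
    FROM StatusChangesCTE Sold
    JOIN StatusChangesCTE Snew
      ON Snew.CallID = Sold.CallID
     AND Snew.seq = Sold.seq + 1
)
SELECT Status,
       SUM(CASE WHEN DaysInStatus <= 3 THEN 1 ELSE 0 END)             AS [3 days or fewer],
       SUM(CASE WHEN DaysInStatus BETWEEN 4 AND 5 THEN 1 ELSE 0 END)  AS [4-5 days],
       SUM(CASE WHEN DaysInStatus >= 6 THEN 1 ELSE 0 END)             AS [6+ days]
FROM Durations
GROUP BY Status
ORDER BY Status;
```

DATEDIFF(DAY, ...) counts calendar-day boundaries crossed, not 24-hour periods. If total days per bucket are wanted instead of a call count, replace the `1` in each CASE with `DaysInStatus`. The closing status -1 never appears as an old status, so it drops out of the result without an explicit filter.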

Related

SQL Server iterating through time series data

I am using SQL Server and wondering if it is possible to iterate through time series data until specific condition is met and based on that label my data in other table?
For example, let's say I have a table like this:
Id Date Some_kind_of_event
+--+----------+------------------
1 |2018-01-01|dsdf...
1 |2018-01-06|sdfs...
1 |2018-01-29|fsdfs...
2 |2018-05-10|sdfs...
2 |2018-05-11|fgdf...
2 |2018-05-12|asda...
3 |2018-02-15|sgsd...
3 |2018-02-16|rgw...
3 |2018-02-17|sgs...
3 |2018-02-28|sgs...
What I want to get, is to calculate for each key the difference between two adjacent events and find out if there exists difference > 10 days between these two adjacent events. In case yes, I want to stop iterating for that specific key and put label 'inactive', otherwise 'active' in my other table. After we finish with one key, we start with another.
So, for example, id = 1 would get the label 'inactive' because there are two adjacent dates with a difference bigger than 10 days. The final result would look like this:
Id Label
+--+----------+
1 |inactive
2 |active
3 |inactive
Any ideas how to do that? Is it possible to do it with SQL?
When working with a DBMS you need to get away from the idea of thinking iteratively. Instead you need to try and think in sets. "Instead of thinking about what you want to do to a row, think about what you want to do to a column."
If I understand correctly, is this what you're after?
CREATE TABLE SomeEvent (ID int, EventDate date, EventName varchar(10));
INSERT INTO SomeEvent
VALUES (1,'20180101','dsdf...'),
(1,'20180106','sdfs...'),
(1,'20180129','fsdfs..'),
(2,'20180510','sdfs...'),
(2,'20180511','fgdf...'),
(2,'20180512','asda...'),
(3,'20180215','sgsd...'),
(3,'20180216','rgw....'),
(3,'20180217','sgs....'),
(3,'20180228','sgs....');
GO
WITH Gaps AS(
SELECT *,
DATEDIFF(DAY,LAG(EventDate) OVER (PARTITION BY ID ORDER BY EventDate),EventDate) AS EventGap
FROM SomeEvent)
SELECT ID,
CASE WHEN MAX(EventGap) > 10 THEN 'inactive' ELSE 'active' END AS Label
FROM Gaps
GROUP BY ID
ORDER BY ID;
GO
DROP TABLE SomeEvent;
GO
This assumes you are using SQL Server 2012+, as it uses the LAG function, and SQL Server 2008 has less than 12 months of any kind of support.
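For servers that predate LAG, the same gap calculation can be sketched with ROW_NUMBER and a self-join, both of which exist in SQL Server 2008. This is only an illustration against the SomeEvent table defined above (run it before the DROP TABLE); the `Numbered` CTE and the `cur`/`prev` aliases are names chosen here.

```sql
;WITH Numbered AS
(
    SELECT ID, EventDate,
           ROW_NUMBER() OVER (PARTITION BY ID ORDER BY EventDate) AS rn
    FROM SomeEvent
)
SELECT cur.ID,
       -- the first row of each ID has no previous row, so its gap is NULL and MAX ignores it
       CASE WHEN MAX(DATEDIFF(DAY, prev.EventDate, cur.EventDate)) > 10
            THEN 'inactive' ELSE 'active' END AS Label
FROM Numbered AS cur
LEFT JOIN Numbered AS prev
       ON prev.ID = cur.ID
      AND prev.rn = cur.rn - 1
GROUP BY cur.ID
ORDER BY cur.ID;
```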
Try this. Note, replace #MyTable with your actual table.
WITH Diffs AS (
SELECT
Id
-- PARTITION BY keeps LEAD from reading the first date of the next Id
,DATEDIFF(DAY,[Date],LEAD([Date]) OVER (PARTITION BY [Id] ORDER BY [Date])) Diff
FROM #MyTable)
SELECT
Id
,CASE WHEN MAX(Diff) > 10 THEN 'Inactive' ELSE 'Active' END AS Label
FROM Diffs
GROUP BY Id
Just to share another approach (without a CTE).
SELECT
ID
, CASE WHEN SUM(TotalDays) = (MAX(CNT) - 1) THEN 'Active' ELSE 'Inactive' END Label
FROM (
SELECT
ID
, EventDate
, CASE WHEN DATEDIFF(DAY, EventDate, LEAD(EventDate) OVER(PARTITION BY ID ORDER BY EventDate)) <= 10 THEN 1 ELSE 0 END TotalDays
, COUNT(ID) OVER(PARTITION BY ID) CNT
FROM EventsTable
) D
GROUP BY ID
The method counts how many records each ID has and, for each row, takes the difference in days between the current date and the next one: the row gets 1 if the gap is 10 days or fewer, otherwise 0. The last row of each ID has no next date, so it also gets 0.
If the sum of these flags equals the number of records for the ID minus one, every gap is within the limit and the ID is labelled Active; otherwise it is Inactive.

How to get data from sql server with condition?

I am working with SQL Server.
My question is: I have 15,000 records in my customer table and I want to process the first 5,000 on one day, the next 5,000 the following day, and so on, so that each day's run handles a limited number of records. The data in the customer table changes frequently. I also need the number of pending records that have not yet been processed. Please suggest how to do this. Thanks.
Further details:
The table has a datetime stamp.
Fields: [first_name] ,[middle_name] ,[last_name] ,[created] ,[created_by] ,[customer_number]
The simplest way is to add two columns (if they do not already exist): updated_at and processed_at. updated_at is set whenever a row is updated, and processed_at is set when your daily job starts processing that row. Your query would then be something like the following (SQL Server uses TOP rather than LIMIT, and rows that have never been processed have a NULL processed_at):
select top (5000) * from your_table where processed_at is null or updated_at > processed_at order by updated_at;
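The question also asks for the number of pending records, and the daily job needs some way to stamp the rows it picks up. A minimal sketch of both, assuming the updated_at and processed_at columns described above and the placeholder table name your_table:

```sql
-- Pending rows: never processed, or changed since they were last processed
SELECT COUNT(*) AS pending_records
FROM your_table
WHERE processed_at IS NULL
   OR updated_at > processed_at;

-- Stamp today's batch of up to 5000 rows, oldest changes first.
-- UPDATE TOP does not accept ORDER BY, so the batch is chosen in a CTE.
;WITH batch AS
(
    SELECT TOP (5000) *
    FROM your_table
    WHERE processed_at IS NULL
       OR updated_at > processed_at
    ORDER BY updated_at
)
UPDATE batch
SET processed_at = GETDATE();
```

If an update trigger maintains updated_at, it would have to skip changes that only touch processed_at; otherwise every stamped row would immediately look pending again.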
I'm going to assume you have some form of ID in your table...
So you set a start date in your procedure, and compare to that (I have used '2016-01-01'):
with CTE as
(
select t1.*, row_number() over(order by customer_id) as r_ord
from Mytable t1
)
select CTE.*
from CTE
where (datediff(day, '2016-01-01', getdate()) % 3 = 0 and r_ord <= 5000)
or (datediff(day, '2016-01-01', getdate()) % 3 = 1 and r_ord between 5001 and 10000)
or (datediff(day, '2016-01-01', getdate()) % 3 = 2 and r_ord > 10000)

Link subsequent patient visits from same table in SQL

I have a table containing records for patient admissions to a group of hospitals.
I would like to be able to link each record to the most recent previous record for each patient, if there is a previous record or return a null field if there is no previous record.
Further to this I would like to place some criteria of the linked records eg previous visit to the same hospital only, previous visit was less than 7 days before.
The data looks something like this (with a whole lots of other fields)
Record  PatientID  Hospital  AdmitDate  DischargeDate
1       1          A         1/2/12     3/2/12
2       2          A         1/2/12     4/2/12
3       1          B         4/3/12     4/3/12
My thinking was a self join but I can't figure out how to join to the record where the difference between the admit date and the patient's previous discharge date is the minimum.
Thanks!
You could use row_number() to assign increasing numbers to records for each patient. Then you can left join to the previous record:
; with numbered_records as
(
select row_number() over (partition by PatientID, Hospital
order by Record desc) as rn
, *
from YourTable
)
select *
from numbered_records cur
left join
numbered_records prev
on prev.PatientID = cur.PatientID
and prev.Hospital = cur.Hospital
and prev.DischargeDate >= dateadd(day, -7, cur.AdmitDate)
and prev.rn = cur.rn + 1
To select only the latest row per patient, add:
where cur.rn = 1
at the end of the query.
This returns the two most recent records (highest Record values) for each patient. If you want to restrict it to the same hospital, add a check on Hospital alongside the PatientID check; a date condition can be added in the same way.
SELECT * FROM T1 t
WHERE (2 >= (SELECT Count(*) FROM T1 tmp
WHERE t.PatientID = tmp.PatientID
AND t.Record <= tmp.Record))
It will only bring the one record if there is only one entry.
Note that:
I used DATE for data type. It might be possible that a patient visits one hospital before noon, and another in the afternoon. You would use DATETIME in that case. Sorting on the partitioning uses dt_admit before record_id, to allow for entry of data in any order.
CREATE TABLE #hdata(
record_id INT NOT NULL IDENTITY(1,1) PRIMARY KEY,
patient_id INT NOT NULL,
hospital_id INT NOT NULL,
dt_admit DATE NOT NULL,
dt_discharge DATE NULL
);
INSERT INTO #hdata(
patient_id,
hospital_id,
dt_admit,
dt_discharge
)
VALUES (
1,
1,
'2012-02-01',
'2012-02-03'
), (
2,
1,
'2012-02-01',
'2012-02-04'
), (
1,
2,
'2012-03-04',
'2012-03-04'
);
-- 1/ link each record to the previous record for each patient, NULL if none
SELECT
record_id,
patient_id,
ROW_NUMBER() OVER (PARTITION BY patient_id ORDER BY dt_admit,record_id) AS visit_seq_id
INTO
#visit_sequence
FROM
#hdata;
SELECT
v1.record_id,
v1.patient_id,
v2.record_id AS previous_record_id
FROM
#visit_sequence AS v1
LEFT JOIN #visit_sequence AS v2 ON
v2.patient_id=v1.patient_id AND
v2.visit_seq_id=v1.visit_seq_id-1
ORDER BY
v1.record_id;
DROP TABLE #visit_sequence;
-- 2/ criteria on linked records: same hospital, previous visit < 7 days
SELECT
record_id,
patient_id,
hospital_id,
dt_admit,
ROW_NUMBER() OVER (PARTITION BY patient_id,hospital_id ORDER BY dt_admit,record_id) AS visit_seq_id
INTO
#visit_sequence_elab
FROM
#hdata;
SELECT
v1.record_id,
v1.patient_id,
v2.record_id AS previous_record_id
FROM
#visit_sequence_elab AS v1
LEFT JOIN #visit_sequence_elab AS v2 ON
v2.patient_id=v1.patient_id AND
v2.hospital_id=v1.hospital_id AND
v2.visit_seq_id=v1.visit_seq_id-1 AND
DATEDIFF(DAY,v2.dt_admit,v1.dt_admit)<7
ORDER BY
v1.record_id;
DROP TABLE #visit_sequence_elab;
DROP TABLE #hdata;

how to insert subtract of each two subsequent rows and inserting it into a new column

I have a question about writing a query in SQL.
In picture 1, I want to subtract row 2 from row 1 (in the date column) and insert the result in row 1 of a new column titled Recency, then subtract row 3 from row 2 and insert the result in row 2 of the new column, and so on.
picture 1:
In fact, I want to calculate the recency of each user's activity. In the following picture I calculated this manually for one user; I want to do it for all users by writing a SQL query.
picture 2:
A second question:
I also want to calculate the frequency of each user's activity before the current date, for each row. For example, for user abkqz we have:
user name frequency
abkqz 4
abkqz 3
abkqz 2
abkqz 1
abkqz 0
Assuming the following table structure
CREATE TABLE [15853354] -- Stack Overflow question number
(
[user-name] VARCHAR(20),
[submissions] INT,
[date] DATE,
[score] NUMERIC(9,2),
[points] NUMERIC(9,1)
)
INSERT [15853354]
VALUES
('abkqz', 5, '12 JUL 2010', 83.91, 112.5),
('abkqz', 5, '9 JUN 2010', 77.27, 0),
('abkqz', 5, '17 MAY 2010', 91.87, 315)
Then you could write the following query
;WITH [cte15853354] AS
(
SELECT
[user-name],
[submissions],
[date],
[score],
[points],
ROW_NUMBER() OVER (ORDER BY [user-name], [date] DESC) AS [ROWNUMBER]
FROM [15853354]
)
SELECT
t.[user-name],
t.[submissions],
DATEDIFF(DAY, ISNULL([t-1].[date],t.[date]),t.[date]) AS [recency],
t.[score],
t.[points]
FROM [cte15853354] t
LEFT JOIN [cte15853354] [t-1]
ON [t-1].[user-name] = t.[user-name]
AND [t-1].[ROWNUMBER] = t.[ROWNUMBER] + 1
This uses a Common Table Expression to calculate a row number, and then does a self join to join each row with the next, and then calculates the date difference in days.
Try something like this (untested, since sample data was only posted in a picture). The query uses analytic function options that were introduced in SQL Server 2012, so it won't work on an earlier version.
select
[user-name],
submissions,
score,
datediff(day,
lag([date],1) over (
partition by [user-name]
order by [date]),
[date]) as recency,
count(*) over (
partition by [user-name]
order by [date]) -1 as frequency
from yourTable;

Finding the number of concurrent days two events happen over the course of time using a calendar table

I have a table with a structure
(rx)
clmID int
patid int
drugclass char(3)
drugName char(25)
fillDate date
scriptEndDate date
strength int
And a query
;with PatientDrugList(patid, filldate,scriptEndDate,drugClass,strength)
as
(
select rx.patid,rx.fillDate,rx.scriptEndDate,rx.drugClass,rx.strength
from rx
)
,
DrugList(drugName)
as
(
select x.drugClass
from (values('h3a'),('h6h'))
as x(drugClass)
where x.drugClass is not null
)
SELECT PD.patid, C.calendarDate AS overlap_date
FROM PatientDrugList AS PD, Calendar AS C
WHERE drugClass IN ('h3a','h6h')
AND calendardate BETWEEN filldate AND scriptenddate
GROUP BY PD.patid, C.CalendarDate
HAVING COUNT(DISTINCT drugClass) = 2
order by pd.patid,c.calendarDate
Calendar is simply a calendar table with all possible dates throughout the length of the study and no other columns.
My query returns data that looks like
The overlap_date represents every day that a person was prescribed a drug in the two classes listed after the PatientDrugList CTE.
I would like to find the number of consecutive days that each person was prescribed both families of drugs. I can't use a simple max and min aggregate because that wouldn't tell me if someone stopped this regimen and then started again. What is an efficient way to find this out?
EDIT: The row constructor in the DrugList CTE should be a parameter for a stored procedure and was amended for the purposes of this example.
You are looking for consecutive sequences of dates. The key observation is that if you subtract a sequence from the dates, you'll get a constant date. This defines a group of dates all in sequence, which can then be grouped.
select patid
,MIN(overlap_date) as start_overlap
,MAX(overlap_date) as end_overlap
,COUNT(*) as consecutive_days
from(select cte.*,(dateadd(day,-row_number() over(partition by patid order by overlap_Date),overlap_date)) as groupDate
from cte -- cte stands for the query in the question, which returns patid and overlap_date
)t
group by patid, groupDate
This code is untested, so it might have some typos.
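To make the "subtract a sequence" idea concrete, here is a small self-contained sketch. The table variable @overlap and its rows are invented for illustration and stand in for the overlap_date output of the query in the question.

```sql
DECLARE @overlap TABLE (patid int, overlap_date date);

INSERT INTO @overlap VALUES
    (1, '2013-01-01'), (1, '2013-01-02'), (1, '2013-01-03'),
    (1, '2013-01-10'), (1, '2013-01-11'),
    (2, '2013-02-05');

SELECT patid,
       MIN(overlap_date) AS start_overlap,
       MAX(overlap_date) AS end_overlap,
       COUNT(*)          AS consecutive_days
FROM (
    SELECT o.*,
           -- consecutive dates minus their row number collapse to the same date
           DATEADD(DAY,
                   -ROW_NUMBER() OVER (PARTITION BY patid ORDER BY overlap_date),
                   overlap_date) AS groupDate
    FROM @overlap AS o
) AS t
GROUP BY patid, groupDate
ORDER BY patid, start_overlap;
```

For patient 1 the first three dates all map to 2012-12-31 and the last two to 2013-01-06, so they form two runs of 3 and 2 days; patient 2 has a single one-day run. This relies on each patid having at most one row per date, which the GROUP BY in the question's query already guarantees.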
You need to pivot on something, and a MAX over a CASE expression can do that: for each person and date, flag whether each drug was present. The days on which both flags are set are the days the person had both drugs, so you would be limiting by date, if I understand your question correctly.
Example SQL:
declare @Temp table ( person varchar(8), dt date, drug varchar(8));
insert into @Temp values ('Brett','1-1-2013', 'h3a'),('Brett', '1-1-2013', 'h6h'),('Brett','1-2-2013', 'h3a'),('Brett', '1-2-2013', 'h6h'),('Joe', '1-1-2013', 'H3a'),('Joe', '1-2-2013', 'h6h');
with a as
(
select
person
, dt
, max(case when drug = 'h3a' then 1 else 0 end) as h3a
, max(case when drug = 'h6h' then 1 else 0 end) as h6h
from @Temp
group by person, dt
)
, b as
(
select *, case when h3a = 1 and h6h = 1 then 1 end as Logic
from a
)
select person, count(Logic) as DaysOnBothPrescriptions
from b
group by person