Need to join 3 tables and return results where everyone is over age 75 - sql

I have table employee with
id, name, dob, emplid
table documentation has
cdid, emplid, status, record
table appointment has
cvid, emplid, slotid
Everyone has a record in table Employee. Table Appointment stores everyone who schedules an appointment and table Documentation is where the record gets inserted when they complete their appointment. The problem is, they take walk-ins and will have a record in table Documentation, but no record in table Appointment. They want me to find everyone in table Employee who is over age 75, but does not currently have an appointment or has never come in as a walk-in.
I started with the below, but I am stuck on how to accurately get everyone counted.
SELECT COUNT(AgeYearsIntTrunc)
FROM (
SELECT DATEDIFF(hour,e.DOB,GETDATE())/8766.0 AS AgeYearsDecimal
,CONVERT(int,ROUND(DATEDIFF(hour,e.DOB,GETDATE())/8766.0,0)) AS AgeYearsIntRound
,DATEDIFF(hour,e.DOB,GETDATE())/8766 AS AgeYearsIntTrunc
FROM [dbo].Employee e
LEFT JOIN [dbo].Documentation d ON e.EmplID = d.EMPLID
WHERE d.Status IS NULL
) dt WHERE AgeYearsIntTrunc >='75'
Sample Data

It's normally best to not expect the compiler to do algebra for you, in other words: write predicates like x > y + 5 rather than x - y > 5.
Generally, the most efficient method for comparing dates is to keep the column you are checking without any function, do any necessary calculations on the GETDATE side.
NOT EXISTS is much easier for the compiler to reason about than LEFT JOIN / IS NULL
SELECT e.* --COUNT(*)
FROM [dbo].Employee e
WHERE e.DOB < DATEADD(year, -75, GETDATE())
AND (
NOT EXISTS (SELECT 1
FROM [dbo].Documentation d
WHERE e.EmplID = d.EMPLID
)
AND -- do you want OR maybe?
NOT EXISTS (SELECT 1
FROM dbo.Appointment a
WHERE a.EmplId = e.EmplId
)
)

Related

SQL COUNT By month

I have the following table that registers the days students went to school. I need to count the days they are PRESENT, but also need to count the total of school days for each month. (When ASISTENCIA is either 0 or 1)
This what I have so far, but it doesn't count the total.
BEGIN
SELECT
u.user_id user_id,
u.user_first_name as names,
u.user_last_name_01 as lastname1,
u.user_last_name_02 as lastname2,
MONTH(a.FECHA_ASISTENCIA) month,
COUNT(*) as absent_days,
p.PHONE as phone,
p.CITY as city,
#EDUCATION_LEVEL_ID
FROM
users u
inner join asistencia a ON u.user_id = a.USER_ID
inner join profile p ON u.rut_SF = p.RUT_SF
WHERE
a.ASISTENCIA = 0 -- NOT PRESENT
AND a.EDUCATION_LEVEL_ID = #EDUCATION_LEVEL_ID
AND YEAR(a.FECHA_ASISTENCIA) = #EDUCATION_LEVEL_YEAR
GROUP BY
u.user_id,
u.user_first_name,
u.user_last_name_01,
u.user_last_name_02,
MONTH(a.FECHA_ASISTENCIA),
p.TELEFONO,
p.CIUDAD_DOM
ORDER BY mes
END
ATTENDANCE
USER_ID
DATE
ATTENDANCE
EDUCATION_LEVEL_ID
123
2021-04-13
0
1
123
2021-04-14
1
1
DESIRED OUTPUT
names
lastname1
lastname2
month
absent_days
total_class_days
city
JOHN
SMITH
SMITH
3
10
24
CITY
JOHN
SMITH
SMITH
4
8
24
CITY
Without examples of what these tables look like, it is hard to give you a solid answer. However, it appears as though your biggest challenge is that ABSTENTIA does not exist.
This is a common problem for analysis - you need to create rows that do not exist (when the user was absent).
The general approach is to:
create a list of unique users
create a list of unique dates you care about
Cross Join (cartesian join) these to create every possible combination of user and date
Outer Join #3 to #4 so you can populate a PRESENT flag, and now you can see both who were PRESENT and which were not.
Filter out rows which don't apply (for instance if a user joined on 3/4/2021 then ignore the blank rows before this date)
You can accomplish this with some SQL that looks like this:
WITH GLOBAL_SPINE AS (
SELECT
ROW_NUMBER() OVER (ORDER BY NULL) as INTERVAL_ID,
DATEADD('DAY', (INTERVAL_ID - 1), '2021-01-01'::timestamp_ntz) as SPINE_START,
DATEADD('DAY', INTERVAL_ID, '2022-01-01'::timestamp_ntz) as SPINE_END
FROM TABLE (GENERATOR(ROWCOUNT => 365))
),
GROUPS AS (
SELECT
USERID,
MIN(DESIRED_INTERVAL) AS LOCAL_START,
MAX(DESIRED_INTERVAL) AS LOCAL_END
FROM RASGO.PUBLIC.RASGO_SDK__OP4__AGGREGATE_TRANSFORM__8AB1FEDF90
GROUP BY
USERID
),
GROUP_SPINE AS (
SELECT
USERID,
SPINE_START AS GROUP_START,
SPINE_END AS GROUP_END
FROM GROUPS G
CROSS JOIN LATERAL (
SELECT
SPINE_START, SPINE_END
FROM GLOBAL_SPINE S
WHERE S.SPINE_START BETWEEN G.LOCAL_START AND G.LOCAL_END
)
)
SELECT
G.USERID AS GROUP_BY_USERID,
GROUP_START,
GROUP_END,
T.*
FROM GROUP_SPINE G
LEFT JOIN {{ your_table }} T
ON DESIRED_INTERVAL >= G.GROUP_START
AND DESIRED_INTERVAL < G.GROUP_END
AND G.USERID = T.USERID;
The above script works on Snowflake, but the syntax might be slightly different depending on your RDBMS. There are also some other tweaks you can make regarding when you insert the blank rows, I chose 'local' which means that we begin inserting rows for each user on their very first day. You could change this to global if you wanted to populate data from every single day between 1/1/2021 and 1/1/2022.
I used Rasgo to generate this SQL

Select latest and 2nd latest date rows per user

I have the following query to select rows where the LAST_UPDATE_DATE field is getting records that have a date value greater than or equal to the last 7 days, which works great.
SELECT 'NEW ROW' AS 'ROW_TYPE', A.EMPLID, B.FIRST_NAME, B.LAST_NAME,
A.BANK_CD, A.ACCOUNT_NUM, ACCOUNT_TYPE, PRIORITY, A.LAST_UPDATE_DATE
FROM PS_DIRECT_DEPOSIT D
INNER JOIN PS_DIR_DEP_DISTRIB A ON A.EMPLID = D.EMPLID AND A.EFFDT = D.EFFDT
INNER JOIN PS_EMPLOYEES B ON B.EMPLID = A.EMPLID
WHERE
B.EMPL_STATUS NOT IN ('T','R','D')
AND ((A.DEPOSIT_TYPE = 'P' AND A.AMOUNT_PCT = 100)
OR A.PRIORITY = 999
OR A.DEPOSIT_TYPE = 'B')
AND A.EFFDT = (SELECT MAX(A1.EFFDT)
FROM PS_DIR_DEP_DISTRIB A1
WHERE A1.EMPLID = A.EMPLID
AND A1.EFFDT <= GETDATE())
AND D.EFF_STATUS = 'A'
AND D.EFFDT = (SELECT MAX(D1.EFFDT)
FROM PS_DIRECT_DEPOSIT D1
WHERE D1.EMPLID = D.EMPLID
AND D1.EFFDT <= GETDATE())
AND A.LAST_UPDATE_DATE >= GETDATE() - 7
What I would like to add onto this is to also add the previous (2nd MAX) row per EMPLID, so that I can output the 'old' row (that was prior to the last update the latest row meeting above criteria), along with the new row that I already am outputting in the query.
ROW_TYPE EMPLID FIRST_NAME LAST_NAME BANK_CD ACCOUNT_NUM ACCOUNT_TYPE PRIORITY LAST_UPDATE_DATE
NEW ROW 12345 JOHN SMITH 123548999 45234879 C 999 2019-03-06 00:00:00.000
OLD ROW 12345 JOHN SMITH 214080046 92178616 C 999 2018-10-24 00:00:00.000
NEW ROW 56399 CHARLES MASTER 785816167 84314314 C 999 2019-03-07 00:00:00.000
OLD ROW 56399 CHARLES MASTER 345761227 547352 C 999 2017-05-16 00:00:00.000
So the EMPLID would be ordered by NEW ROW, followed by OLD ROW as shown above. In this example the 'NEW ROW' is getting the record that is within the past 7 days, as indicated by the LAST_UPDATE_DATE.
I would like to get feedback on how to modify the query so I can also get the 'old' row (which is the max row that is less than the 'NEW' row retrieved above).
It was a slow day for crime in Gotham, so I gave this a whirl. Might work.
This is unlikely to work right out of the box, though, but it should get you started.
Your LAST_UPDATE_DATE column is on the table PS_DIR_DEP_DISTRIB, so we'll start there. First, you want to identify all of the records that were updated in the last 7 days because those are the only ones you're interested in. Throughout this, I'm assuming, and I'm probably wrong, that the natural key for the table consists of EMPLID, BANK_CD, and ACCOUNT_NUM. You'll want to sub in the actual natural key for those columns in a few places. That said, the date limiter looks something like this:
SELECT
EMPLID
,BANK_CD
,ACCOUNT_NUM
FROM
PS_DIR_DEP_DISTRIB AS limit
WHERE
limit.LAST_UPDATE_DATE >= DATEADD(DAY, -7, CAST(GETDATE() AS DATE))
AND
limit.LAST_UPDATE_DATE <= CAST(GETDATE() AS DATE)
Now we'll use that as a correlated sub-query in a WHERE EXISTS clause that we'll correlate back to the base table to limit ourselves to records with natural key values that were updated in the last week. I altered the SELECT list to just SELECT 1, which is typical verbiage for a correlated sub, since it stops looking for a match when it finds one (1), and doesn't actually return any values at all.
Additionally, since we're filtering this record set anyway, I moved all the other WHERE clause filters for this table into this (soon to be) sub-query.
Finally, in the SELECT portion, I added a DENSE_RANK to force order the records. We' use the DENSE_RANK value later to filter off only the first (N) records of interest.
So that leaves us with this:
SELECT
EMPLID
,BANK_CD
,ACCOUNT_NUM
--,ACCOUNT_TYPE --Might belong here. Can't tell without table alias in original SELECT
,PRIORITY
,EFFDT
,LAST_UPDATE_DATE
,DEPOSIT_TYPE
,AMOUNT_PCT
,DENSE_RANK() OVER (PARTITION BY --Add actual natural key columns here...
EMPLID
ORDER BY
LAST_UPDATE_DATE DESC
) AS RowNum
FROM
PS_DIR_DEP_DISTRIB AS sdist
WHERE
EXISTS
(
-- Get the set of records that were last updated in the last 7 days.
-- Correlate to the outer query so it only returns records related to this subset.
-- This uses a correlated subquery. A JOIN will work, too. Try both, pick the faster one.
-- Something like this, using the actual natural key columns in the WHERE
SELECT
1
FROM
PS_DIR_DEP_DISTRIB AS limit
WHERE
--The first two define the date range.
limit.LAST_UPDATE_DATE >= DATEADD(DAY, -7, CAST(GETDATE() AS DATE))
AND limit.LAST_UPDATE_DATE <= CAST(GETDATE() AS DATE)
AND
--And these are the correlations to the outer query.
limit.EMPLID = sdist.EMPLID
AND limit.BANK_CD = sdist.BANK_CD
AND limit.ACCOUNT_NUM = sdist.ACCOUNT_NUM
)
AND
(
dist.DEPOSIT_TYPE = 'P'
AND dist.AMOUNT_PCT = 100
)
OR dist.PRIORITY = 999
OR dist.DEPOSIT_TYPE = 'B'
Replace the original INNER JOIN to PS_DIR_DEP_DISTRIB with that query. In the SELECT list, the first hard-coded value is now dependent on the RowNum value, so that's a CASE expression now. In the WHERE clause, the dates are all driven by the subquery, so they're gone, several were folded into the subquery, and we're adding WHERE dist.RowNum <= 2 to bring back the top 2 records.
(I also replaced all the table aliases so I could keep track of what I was looking at.)
SELECT
CASE dist.RowNum
WHEN 1 THEN 'NEW ROW'
ELSE 'OLD ROW'
END AS ROW_TYPE
,dist.EMPLID
,emp.FIRST_NAME
,emp.LAST_NAME
,dist.BANK_CD
,dist.ACCOUNT_NUM
,ACCOUNT_TYPE
,dist.PRIORITY
,dist.LAST_UPDATE_DATE
FROM
PS_DIRECT_DEPOSIT AS dd
INNER JOIN
(
SELECT
EMPLID
,BANK_CD
,ACCOUNT_NUM
--,ACCOUNT_TYPE --Might belong here. Can't tell without table alias in original SELECT
,PRIORITY
,EFFDT
,LAST_UPDATE_DATE
,DEPOSIT_TYPE
,AMOUNT_PCT
,DENSE_RANK() OVER (PARTITION BY --Add actual natural key columns here...
EMPLID
ORDER BY
LAST_UPDATE_DATE DESC
) AS RowNum
FROM
PS_DIR_DEP_DISTRIB AS sdist
WHERE
EXISTS
(
-- Get the set of records that were last updated in the last 7 days.
-- Correlate to the outer query so it only returns records related to this subset.
-- This uses a correlated subquery. A JOIN will work, too. Try both, pick the faster one.
-- Something like this, using the actual natural key columns in the WHERE
SELECT
1
FROM
PS_DIR_DEP_DISTRIB AS limit
WHERE
--The first two define the date range.
limit.LAST_UPDATE_DATE >= DATEADD(DAY, -7, CAST(GETDATE() AS DATE))
AND limit.LAST_UPDATE_DATE <= CAST(GETDATE() AS DATE)
AND
--And these are the correlations to the outer query.
limit.EMPLID = sdist.EMPLID
AND limit.BANK_CD = sdist.BANK_CD
AND limit.ACCOUNT_NUM = sdist.ACCOUNT_NUM
)
AND
(
dist.DEPOSIT_TYPE = 'P'
AND dist.AMOUNT_PCT = 100
)
OR dist.PRIORITY = 999
OR dist.DEPOSIT_TYPE = 'B'
) AS dist
ON
dist.EMPLID = dd.EMPLID
AND dist.EFFDT = dd.EFFDT
INNER JOIN
PS_EMPLOYEES AS emp
ON
emp.EMPLID = dist.EMPLID
WHERE
dist.RowNum <= 2
AND
emp.EMPL_STATUS NOT IN ('T', 'R', 'D')
AND
dd.EFF_STATUS = 'A';

How do I select all id's that are present in one table but not in another

I am trying to get a list of department Ids are present in one table, (PS_Y_FORM_HIRE), but which don't exist in another table (PS_DEPARTMENT_VW).
Here is the basics of what I have which isn't working:
SELECT h.DEPTID FROM PS_Y_FORM_HIRE h, PS_DEPARTMENT_VW d WHERE NOT EXISTS (
SELECT d1.DEPTID FROM PS_DEPARTMENT_VW d1 WHERE d1.DEPTID = h.DEPTID
and d1.SETID_GL_DEPT = 'IDBYU'
);
I'm trying to form this query in SQL Developer, but it just returns a long list of blanks (after spinning/running the query for a very long time).
In addition, I need this to be effective dated, so that it only grabs the correct effective-dated row, but I was unsure how and where to incorporate this into the query.
EDIT I neglected to mention that only the department table is effective dated. The form hire table is not. I need to get the current effectively dated row from that in this query (to make sure the data is accurate).
Also note that DEPTID isn't a key on PS_Y_FORM_HIRE, but is on PS_DEPARTMENT_VW. (Along with SETID_GL_DEPT and EFFDT).
So again, ideally, I will have a list of all the department ids that appear in PS_Y_FORM_HIRE, but which are not in PS_DEPARTMENT_VW.
SELECT DEPTID
FROM PS_Y_FORM_HIRE
MINUS
SELECT DEPTID
FROM PS_DEPARTMENT_VW
WHERE SETID_GL_DEPT = 'IDBYU';
or
SELECT DEPTID
FROM PS_Y_FORM_HIRE
WHERE DEPTID NOT IN (
SELECT DEPTID
FROM PS_DEPARTMENT_VW
WHERE SETID_GL_DEPT = 'IDBYU'
)
or
SELECT DEPTID
FROM PS_Y_FORM_HIRE h
WHERE NOT EXISTS (
SELECT 1
FROM PS_DEPARTMENT_VW d
WHERE SETID_GL_DEPT = 'IDBYU'
AND d.DEPTID = h.DEPTID
)
This seems like a job for the MINUS operation. Something like
select deptid from ps_y_form_hire where eff_date = <whatever>
minus
select deptid from ps_department_vw <where eff_date = ...>
You didn't provide information to determine what exactly you want done with the effective dates; adapt as needed.
SELECT h.DEPTID
FROM PS_Y_FORM_HIRE h
WHERE h.DEPTID NOT IN (SELECT p.DEPTID
FROM PS_DEPARTMENT_VW p
WHERE p.SETID_GL_DEPT = 'IDBYU')
Your question is a bit unclear around why you want effective dated rows as you are not checking effective status or any other field that may have changed between effective rows. If your question is, You want to know all DEPTIDs in PS_Y_FORM_HIRE that either don't exist or are inactive as of a current effective date, then the SQL below should help
SELECT DEPTID
FROM PS_Y_FORM_HIRE h
WHERE
H.DEPTID NOT IN ( SELECT d.DEPTID
FROM PS_DEPARTMENT_VW d
WHERE d.EFF_STATUS = 'A'
AND d.EFFDT = (SELECT MAX(EFFDT)
FROM PS_DEPARTMENT_VW d2
WHERE d2.SETID_GL_DEPT = d.SETID_GL_DEPT
AND d2.DEPTID = d.DEPTID
AND d2.EFFDT <= CURRENT_DATE)
)

SQL: multiple counts from same table

I am having a real problem trying to get a query with the data I need. I have tried a few methods without success. I can get the data with 4 separate queries, just can't get hem into 1 query. All data comes from 1 table. I will list as much info as I can.
My data looks like this. I have a customerID and 3 columns that record who has worked on the record for that customer as well as the assigned acct manager
RecID_Customer___CreatedBy____LastUser____AcctMan
1-------1374----------Bob Jones--------Mary Willis------Bob Jones
2-------1375----------Mary Willis------Bob Jones--------Bob Jones
3-------1376----------Jay Scott--------Mary Willis-------Mary Willis
4-------1377----------Jay Scott--------Mary Willis------Jay Scott
5-------1378----------Bob Jones--------Jay Scott--------Jay Scott
I want the query to return the following data. See below for a description of how each is obtained.
Employee___Created__Modified__Mod Own__Created Own
Bob Jones--------2-----------1---------------1----------------1
Mary Willis------1-----------2---------------1----------------0
Jay Scott--------2-----------1---------------1----------------1
Created = Counts the number of records created by each Employee
Modified = Number of records where the Employee is listed as Last User
(except where they created the record)
Mod Own = Number of records for each where the LastUser = Acctman
(account manager)
Created Own = Number of Records created by the employee where they are
the account manager for that customer
I can get each of these from a query, just need to somehow combine them:
Select CreatedBy, COUNT(CreatedBy) as Created
FROM [dbo].[Cust_REc] GROUP By CreatedBy
Select LastUser, COUNT(LastUser) as Modified
FROM [dbo].[Cust_REc] Where LastUser != CreatedBy GROUP By LastUser
Select AcctMan, COUNT(AcctMan) as CreatePort
FROM [dbo].[Cust_REc] Where AcctMan = CreatedBy GROUP By AcctMan
Select AcctMan, COUNT(AcctMan) as ModPort
FROM [dbo].[Cust_REc] Where AcctMan = LastUser AND NOT AcctMan = CreatedBy GROUP By AcctMan
Can someone see a way to do this? I may have to join the table to itself, but my attempts have not given me the correct data.
The following will give you the results you're looking for.
select
e.employee,
create_count=(select count(*) from customers c where c.createdby=e.employee),
mod_count=(select count(*) from customers c where c.lastmodifiedby=e.employee),
create_own_count=(select count(*) from customers c where c.createdby=e.employee and c.acctman=e.employee),
mod_own_count=(select count(*) from customers c where c.lastmodifiedby=e.employee and c.acctman=e.employee)
from (
select employee=createdby from customers
union
select employee=lastmodifiedby from customers
union
select employee=acctman from customers
) e
Note: there are other approaches that are more efficient than this but potentially far more complex as well. Specifically, I would bet there is a master Employee table somewhere that would prevent you from having to do the inline view just to get the list of names.
this seems pretty straight forward. Try this:
select a.employee,b.created,c.modified ....
from (select distinct created_by from data) as a
inner join
(select created_by,count(*) as created from data group by created_by) as b
on a.employee = b.created_by)
inner join ....
This highly inefficient query may be a rough start to what you are looking for. Once you validate the data then there are things you can do to tidy it up and make it more efficient.
Also, I don't think you need the DISTINCT on the UNION part because the UNION will return DISTINCT values unless UNION ALL is specified.
SELECT
Employees.EmployeeID,
Created =(SELECT COUNT(*) FROM Cust_REc WHERE Cust_REc.CreatedBy=Employees.EmployeeID),
Mopdified =(SELECT COUNT(*) FROM Cust_REc WHERE Cust_REc.LastUser=Employees.EmployeeID AND Cust_REc.CreateBy<>Employees.EmployeeID),
ModOwn =
CASE WHEN NOT Empoyees.IsManager THEN NULL ELSE
(SELECT COUNT(*) FROM Cust_REc WHERE AcctMan=Employees.EmployeeID)
END,
CreatedOwn=(SELECT COUNT(*) FROM Cust_REc WHERE AcctMan=Employees.EmployeeID AND CReatedBy=Employees.EMployeeID)
FROM
(
SELECT
EmployeeID,
IsManager=CASE WHEN EXISTS(SELECT AcctMan FROM CustRec WHERE AcctMan=EmployeeID)
FROM
(
SELECT DISTINCT
EmployeeID
FROM
(
SELECT EmployeeID=CreatedBy FROM Cust_Rec
UNION
SELECT EmployeeID=LastUser FROM Cust_Rec
UNION
SELECT EmployeeID=AcctMan FROM Cust_Rec
)AS Z
)AS Y
)
AS Employees
I had the same issue with the Modified column. All the other columns worked okay. DCR example would work well with the join on an employees table if you have it.
SELECT CreatedBy AS [Employee],
COUNT(CreatedBy) AS [Created],
--Couldn't get modified to pull the right results
SUM(CASE WHEN LastUser = AcctMan THEN 1 ELSE 0 END) [Mod Own],
SUM(CASE WHEN CreatedBy = AcctMan THEN 1 ELSE 0 END) [Created Own]
FROM Cust_Rec
GROUP BY CreatedBy

SQL Server 2008: Using Multiple dts Ranges to Build a Set of Dates

I'm trying to build a query for a medical database that counts the number of patients that were on at least one medication from a class of medications (the medications listed below in the FAST_MEDS CTE) and had either:
1) A diagnosis of myopathy (the list of diagnoses in the FAST_DX CTE)
2) A CPK lab value above 1000 (the lab value in the FAST_LABS CTE)
and this diagnosis or lab happened AFTER a patient was on a statin.
The query I've included below does that under the assumption that once a patient is on a statin, they're on a statin forever. The first CTE collects the ids of patients that were on a statin along with the first date of their diagnosis, the second those with a diagnosis, and the third those with a high lab value. After this I count those that match the above criteria.
What I would like to do is drop the assumption that once a patient is on a statin, they're on it for life. The table edw_dm.patient_medications has a column called start_dts and end_dts. This table has one row for each prescription written, with start_dts and end_dts denoting the start and end date of the prescription. End_dts could be null, which I'll take to assume that the patient is currently on this medication (it could be a missing record, but I can't do anything about this). If a patient is on two different statins, the start and ends dates can overlap, and there may be multiple records of the same medication for a patient, as in a record showing 3-11-2000 to 4-5-2003 and another for the same patient showing 5-6-2007 to 7-8-2009.
I would like to use these two columns to build a query where I'm only counting the patients that had a lab value or diagnosis done during a time when they were already on a statin, or in the first n (say 3) months after they stopped taking a statin. I'm really not sure how to go about rewriting the first CTE to get this information and how to do the comparison after the CTEs are built. I know this is a vague question, but I'm really stumped. Any ideas?
As always, thank you in advance.
Here's the current query:
WITH FAST_MEDS AS
(
select distinct
statins.mrd_pt_id, min(year(statins.order_dts)) as statin_yr
from
edw_dm.patient_medications as statins
inner join mrd.medications as mrd
on statins.mrd_med_id = mrd.mrd_med_id
WHERE mrd.generic_nm in (
'Lovastatin (9664708500)',
'lovastatin-niacin',
'Lovastatin/Niacin',
'Lovastatin',
'Simvastatin (9678583966)',
'ezetimibe-simvastatin',
'niacin-simvastatin',
'ezetimibe/Simvastatin',
'Niacin/Simvastatin',
'Simvastatin',
'Aspirin Buffered-Pravastatin',
'aspirin-pravastatin',
'Aspirin/Pravastatin',
'Pravastatin',
'amlodipine-atorvastatin',
'Amlodipine/atorvastatin',
'atorvastatin',
'fluvastatin',
'rosuvastatin'
)
and YEAR(statins.order_dts) IS NOT NULL
and statins.mrd_pt_id IS NOT NULL
group by statins.mrd_pt_id
)
select *
into #meds
from FAST_MEDS
;
--return patients who had a diagnosis in the list and the year that
--diagnosis was given
with
FAST_DX AS
(
SELECT pd.mrd_pt_id, YEAR(pd.init_noted_dts) as init_yr
FROM edw_dm.patient_diagnoses as pd
inner join mrd.diagnoses as mrd
on pd.mrd_dx_id = mrd.mrd_dx_id
and mrd.icd9_cd in
('728.89','729.1','710.4','728.3','729.0','728.81','781.0','791.3')
)
select *
into #dx
from FAST_DX;
--return patients who had a high cpk value along with the year the cpk
--value was taken
with
FAST_LABS AS
(
SELECT
pl.mrd_pt_id, YEAR(pl.order_dts) as lab_yr
FROM
edw_dm.patient_labs as pl
inner join mrd.labs as mrd
on pl.mrd_lab_id = mrd.mrd_lab_id
and mrd.lab_nm = 'CK (CPK)'
WHERE
pl.lab_val between 1000 AND 999998
)
select *
into #labs
from FAST_LABS;
-- count the number of patients who had a lab value or a medication
-- value taken sometime AFTER their initial statin diagnosis
select
count(distinct p.mrd_pt_id) as ct
from
mrd.patient_demographics as p
join #meds as m
on p.mrd_pt_id = m.mrd_pt_id
AND
(
EXISTS (
SELECT 'A' FROM #labs l WHERE p.mrd_pt_id = l.mrd_pt_id
and l.lab_yr >= m.statin_yr
)
OR
EXISTS(
SELECT 'A' FROM #dx d WHERE p.mrd_pt_id = d.mrd_pt_id
AND d.init_yr >= m.statin_yr
)
)
You probably don't need to select all of your CTE defined queries into temp tables.
I think that the query you're after has the form:
WITH FAST_MEDS(PatientID, StartDate, EndDate) AS
(
--your query for patients on statins, projecting the patient ID and the start/end date for the medication
),
FAST_DX(PatientID, Date) AS
(
--your query for patients with certain diagnosis, projecting the patient ID and the date
),
FAST_LABS(PatientID, Date) AS
(
--your query for patients with certain labs, projecting the patient ID and the date
)
SELECT PatientID
FROM FAST_MEDS
WHERE PatientID IN (SELECT PatientID FROM FAST_DX WHERE Date BETWEEN StartDate AND EndDate OR EndDate IS NULL AND StartDate < Date)
OR PatientID IN (SELECT PatientID FROM FAST_LABS WHERE Date BETWEEN StartDate AND EndDate OR EndDate IS NULL AND StartDate < Date)