I need to pull unique patients that meet certain criteria - sql

I pulled person_nbrs that have never had an EventType1 before or after an EventType2. I need to pull person_nbrs that have not had an EventType1 prior to having an EventType2. If they had an EventType1 after an EventType2, than it is to be ignored. Here is my query that pulls person_nbrs that have never had an EventType1 before or after EventType2.
SELECT
person_nbr, enc_nbr, enc_timestamp
FROM
person p
JOIN
patient_encounter pe ON p.person_id = pe.person_id
JOIN
patient_procedure pp ON pe.enc_id = pp.enc_id
WHERE
enc_timestamp >= '20170101'
--EventType2
AND code_id LIKE '2'
-- EventType1
AND person_nbr NOT IN (SELECT person_nbr
FROM person p
JOIN patient_encounter pe ON p.person_id = pe.person_id
JOIN patient_procedure pp ON pe.enc_id = pp.enc_id
WHERE code_id LIKE '1')
GROUP BY
person_nbr, enc_nbr, enc_timestamp
ORDER BY
person_nbr ;

You can do this with aggregation and a HAVING clause:
SELECT p.person_nbr
FROM person p JOIN
patient_encounter pe
ON p.person_id = pe.person_id JOIN
patient_procedure pp
ON pe.enc_id = pp.enc_id
GROUP BY p.person_nbr
HAVING SUM(CASE WHEN pp.code_id = 2 THEN 1 ELSE 0 END) > 0 AND -- has code 2
(MAX(CASE WHEN pp.code_id = 1 THEN pe.timestamp END) IS NULL OR
MAX(CASE WHEN pp.code_id = 1 THEN pe.timestamp END) < MIN(CASE WHEN pp.code_id = 2 THEN pe.timestamp END)
) ;
The HAVING clause has two parts:
The first specifies that the person has a code = 2.
The second specifies one of two conditions. The first is that there is no code = 1. The second alternative is that the latest c = 1 timestamp is less than the earliest code = 2 timestamp.

Related

How to write queries where there are more than 2 conditions to extract info in postgresql?

I have tables as below
guides users offers reservations manager_crm_issues
id id id offer_id issuable_id
user_id username guide_id issuable_type issuable_type
What I Would like to extract is
guide.id,
guide.username,
manager_crm_issues.count(issuuable_id)
The issuable_type's distinct values are {Reservation, Offer, Guide}, and it corresponds to issuable_id.
i.e. if issuable_type = 'Reservation' then the issuable_id = reservation.id
Question is, I Would like to count all the issues happened on Guide, and Guide is linked to Offer, Offer is linked to Reservation.
SELECT
a.guideId,
a.guideName,
count(case when cr.issuable_type = 'Reservation' and cr.issuable_id = a.rID THEN cr.id else 0 END),
count(case when co.issuable_type = 'Offer' AND co.issuable_id = a.offerId THEN co.id else 0 END),
count(case when cg.issuable_type = 'Guide' AND cg.issuable_id = a.guideId THEN cg.id else 0 END)
FROM
(SELECT
g.id AS guideId,
u.username AS guideName,
o.id as offerId,
r.id as rId
FROM guides g
INNER JOIN users u on u.id = g.user_id
INNER JOIN offers AS o on o.guide_id = g.id
INNER JOIN reservations AS r on r.offer_id = o.id) a
INNER JOIN manager_crm_issues cg ON cg.id = a.guideId
INNER JOIN manager_crm_issues co ON co.id = a.offerId
INNER JOIN manager_crm_issues cr ON cr.id = a.rId
group by 1,2
I tried to join tables like above, but the outcome seems inaccurate.
Would really appreciate your help.
Don't know if this is related to your issue because you do not say what the issue is but this does not do what you think it does:
count(case
when cr.issuable_type = 'Reservation' and cr.issuable_id = a.rID THEN cr.id
else 0
END),
count counts not nulls so your query will count everything. What you want is
count(case
when cr.issuable_type = 'Reservation' and cr.issuable_id = a.rID THEN cr.id
else null
END),

SQl Error : Each GROUP BY expression must contain at least one column that is not an outer reference [duplicate]

This question already has answers here:
Each GROUP BY expression must contain at least one column that is not an outer reference
(8 answers)
Closed 6 years ago.
I get this error
Each GROUP BY expression must contain at least one column that is not an outer reference
while running this query:
SELECT TOP 1
SUM(mla.total_current_attribute_value)
FROM
partstrack_machine_location_attributes mla (NOLOCK)
INNER JOIN
#tmpInstallParts_Temp installpartdetails ON mla.machine_sequence_id = installpartdetails.InstallKitToMachineSequenceId
AND (CASE WHEN mla.machine_side_id IS NULL THEN 1
WHEN mla.machine_side_id = installpartdetails.InstallKitToMachineSideId THEN 1 END
) = 1
INNER JOIN
partstrack_mes_attribute_mapping mam (NOLOCK) ON mla.mes_attribute = mam.mes_attribute_name
INNER JOIN
partstrack_attribute_type at (NOLOCK) ON mam.pt_attribute_id = at.pt_attribute_id
INNER JOIN
partstrack_ipp_mes_attributes ima(NOLOCK) ON at.pt_attribute_id = ima.pt_attribute_id
WHERE
mla.active_ind = 'Y' AND
ima.ipp_ID IN (SELECT ipp.ipp_id
FROM partstrack_individual_physical_part ipp
INNER JOIN #tmpInstallParts_Temp tmp ON (ipp.ipp_id = tmp.InstallingPartIPPId OR
(CASE WHEN tmp.InstallingPartIPKId = '-1' THEN 1 END) = 1
)
GROUP BY
ima.ipp_id
Can someone help me?
This is the text of the query from the first revision of the question.
In later revisions you removed the last closing bracket ) and the query became syntactically incorrect. You'd better check and fix the text of the question and format the text of the query, so it is readable.
SELECT TOP 1
SUM(mla.total_current_attribute_value)
FROM
partstrack_machine_location_attributes mla (NOLOCK)
INNER JOIN #tmpInstallParts_Temp installpartdetails
ON mla.machine_sequence_id = installpartdetails.InstallKitToMachineSequenceId
AND (CASE WHEN mla.machine_side_id IS NULL THEN 1
WHEN mla.machine_side_id = installpartdetails.InstallKitToMachineSideId THEN 1 END) = 1
INNER JOIN partstrack_mes_attribute_mapping mam (NOLOCK) ON mla.mes_attribute = mam.mes_attribute_name
INNER JOIN partstrack_attribute_type at (NOLOCK) ON mam.pt_attribute_id = at.pt_attribute_id
INNER JOIN partstrack_ipp_mes_attributes ima(NOLOCK) ON at.pt_attribute_id = ima.pt_attribute_id
WHERE
mla.active_ind = 'Y'
AND ima.ipp_ID IN
(
Select
ipp.ipp_id
FROM
partstrack_individual_physical_part ipp
INNER JOIN #tmpInstallParts_Temp tmp
ON (ipp.ipp_id = tmp.InstallingPartIPPId
OR (CASE WHEN tmp.InstallingPartIPKId = '-1' THEN 1 END) = 1)
GROUP BY
ima.ipp_id
)
With this formatting it is clear now that there is a subquery with GROUP BY.
Most likely it is just a typo: you meant to write GROUP BY ipp.ipp_id instead of GROUP BY ima.ipp_id.
If you really wanted to have the GROUP BY not in a subquery, but in the main SELECT, then you misplaced the closing bracket ) and the query should look like this:
SELECT TOP 1
SUM(mla.total_current_attribute_value)
FROM
partstrack_machine_location_attributes mla (NOLOCK)
INNER JOIN #tmpInstallParts_Temp installpartdetails
ON mla.machine_sequence_id = installpartdetails.InstallKitToMachineSequenceId
AND (CASE WHEN mla.machine_side_id IS NULL THEN 1
WHEN mla.machine_side_id = installpartdetails.InstallKitToMachineSideId THEN 1 END) = 1
INNER JOIN partstrack_mes_attribute_mapping mam (NOLOCK) ON mla.mes_attribute = mam.mes_attribute_name
INNER JOIN partstrack_attribute_type at (NOLOCK) ON mam.pt_attribute_id = at.pt_attribute_id
INNER JOIN partstrack_ipp_mes_attributes ima(NOLOCK) ON at.pt_attribute_id = ima.pt_attribute_id
WHERE
mla.active_ind = 'Y'
AND ima.ipp_ID IN
(
Select
ipp.ipp_id
FROM
partstrack_individual_physical_part ipp
INNER JOIN #tmpInstallParts_Temp tmp
ON (ipp.ipp_id = tmp.InstallingPartIPPId
OR (CASE WHEN tmp.InstallingPartIPKId = '-1' THEN 1 END) = 1)
)
GROUP BY
ima.ipp_id
In any case, proper formatting of the source code can really help.
Group By ima.ipp_id
should be applicable to outer query. Because of incorrect placement of '(' it was applying to inner query.
Now after correcting the query,it's working fine without any issues.
Final Query is :
SELECT TOP 1
SUM(mla.total_current_attribute_value)
FROM
partstrack_machine_location_attributes mla (NOLOCK)
INNER JOIN #tmpInstallParts_Temp installpartdetails
ON mla.machine_sequence_id = installpartdetails.InstallKitToMachineSequenceId
AND (CASE WHEN mla.machine_side_id IS NULL THEN 1
WHEN mla.machine_side_id = installpartdetails.InstallKitToMachineSideId THEN 1 END ) = 1
INNER JOIN partstrack_mes_attribute_mapping mam (NOLOCK) ON mla.mes_attribute = mam.mes_attribute_name
INNER JOIN partstrack_attribute_type at (NOLOCK) ON mam.pt_attribute_id = at.pt_attribute_id
INNER JOIN partstrack_ipp_mes_attributes ima(NOLOCK) ON at.pt_attribute_id = ima.pt_attribute_id
WHERE
mla.active_ind = 'Y'
AND ima.ipp_ID IN
(
Select
ipp.ipp_id
FROM
partstrack_individual_physical_part ipp
INNER JOIN #tmpInstallParts_Temp tmp
ON (ipp.ipp_id = tmp.InstallingPartIPPId
OR (CASE WHEN tmp.InstallingPartIPKId = '-1' THEN 1 END ) =1)
)
GROUP BY ima.ipp_id
Thank you all.

Condensing duplicate data while maintaining multiple entries per client

I apologize if the title is not descriptive enough. I am having a hard time describing what I am looking for.
I understand this is normally done on the front end but a client is requesting to have this data displayed in this way. this should show a single column with client information (lastname, firstname) and any/all booked appointments. Since there may be more than one appointment per client duplicate data will be displayed along each line. That will look like this:
lastName | firstName | apptDate | myStartTime | myEndTime
smith..............| john................| 4/7/2016........| 7:00.........................| 8:00
smith..............| john................| 4/9/2016........| 6:00.........................| 7:00
smith..............| john................| 4/14/2016......| 10:00.......................| 11:00
arnold.............| williams..........| 4/10/2016......| 7:00.........................| 11:00
arnold.............| williams..........| 4/11/2016......| 8:00.........................| 12:00
but I would like that to be displayed as:
smith..............| john................| 4/7/2016........| 7:00.........................| 8:00
................................................| 4/9/2016........| 6:00.........................| 7:00
................................................| 4/14/2016......| 10:00.......................| 11:00
arnold..............| williams.........| 4/10/2016.......| 7:00.........................| 11:00
................................................| 4/11/2016.......| 8:00.........................| 12:00
This is what I am currently working with:
SELECT
CASE
WHEN cli.rownum = 1 THEN cli.clientID
END AS 'clientID'
,CASE
WHEN cli.rownum = 1 THEN cli.lastName
END AS 'lastName'
,CASE
WHEN cli.rownum = 1 THEN cli.firstName
END AS 'firstName'
,CASE
WHEN cli.rownum = 1 THEN cli.homePhone
END AS 'home'
,CASE
WHEN cli.rownum = 1 THEN cli.cellPhone
END AS 'cell'
,info.*
,CASE
WHEN cli.rownum = 1 THEN cli.staffAlertMsg
END AS 'alert'
,cli.notes AS 'notes'
,CONVERT(varchar(10),cli.classDate,101) AS 'date'
,cli.myStartTime AS 'start'
,cli.myEndTime AS 'end'
,cli.typeName AS 'appointment'
FROM
(SELECT
c.clientID
,c.lastName
,c.firstName
,c.homePhone
,c.cellPhone
,info.*
,r.notes
,c.staffAlertMsg
,r.ClassDate
,r.myStartTime
,r.myEndTime
,vt.TypeName
,ROW_NUMBER()
OVER(PARTITION BY c.clientID ORDER BY c.clientID) AS 'rownum'
FROM clients c
LEFT OUTER JOIN tblReservation r
ON c.clientID = r.clientID
LEFT OUTER JOIN tblVisitTypes vt
ON vt.TypeID = r.visitType
OUTER APPLY
(SELECT
max(CASE
WHEN civ.clientIndexID = 4
THEN civ.clientIndexValueName
END) AS 'priority'
,max(CASE
WHEN civ.clientIndexID = 5
THEN civ.clientIndexValueName
END) AS 'lang'
,max(CASE
WHEN civ.clientIndexID = 17
THEN civ.clientIndexValueName
END) AS 'inter'
,max(CASE
WHEN ccf.name LIKE 'prac'
THEN ccv.TextVal
END) AS 'prac'
,max(CASE
WHEN st.TypeID = 100000001 THEN st.typeName
END) AS 'ride?'
,max(CASE
WHEN st.typeID = 100000006 THEN st.typeName
END) AS 'l?'
FROM tblClientCustomValues ccv
INNER JOIN tblClientCustomFields ccf
ON ccv.ID = ccf.ID
INNER JOIN tblClientIndexData cid
ON ccv.clientID = cid.clientID
INNER JOIN tblClientIndexValue civ
ON cid.clientIndexValueID = civ.clientIndexValueID
LEFT JOIN [type details] td
ON ccv.clientID = td.clientID
LEFT JOIN [student types] st
ON td.typeID = st.typeID
WHERE c.clientID = ccv.clientID
) info
) cli
WHERE cli.clientID != '-2'
AND cli.clientID != '0'
AND cli.clientID != '1'
AND (cli.classDate >= '4/1/2016'
AND cli.classDate <= '4/30/2016')
Assuming you want to only return the name for the first row of each client then this should work (also assuming a primary key column of ID on the clients table):
SELECT CASE WHEN cli.rownum = 1 THEN cli.lastName END AS lastName
,CASE WHEN cli.rownum = 1 THEN cli.firstName END AS firstName
,CONVERT(varchar(10),cli.apptDate,101) AS 'apptDate'
,cli.myStartTime
,cli.myEndTime
FROM
(SELECT c.ID
,c.lastName
,c.firstName
,r.apptDate
,r.myStartTime
,r.myEndTime
,ROW_NUMBER() OVER (PARTITION BY c.ID ORDER BY c.ID, r.apptDate, r.myStartTime) as rownum
FROM clients c
LEFT OUTER JOIN tblReservation r
ON c.clientID = r.clientID
LEFT OUTER JOIN tblVisitTypes vt
ON vt.TypeID = r.VisitType)
cli
ORDER BY cli.ID, cli.rownum
I have made both the join LEFT OUTER JOINS as your second INNER JOIN would have made the first join an INNER, which I assume is not what you wanted.

Query for logistic regression, multiple where exists

A logistic regression is a composed of a uniquely identifying number, followed by multiple binary variables (always 1 or 0) based on whether or not a person meets certain criteria. Below I have a query that lists several of these binary conditions. With only four such criteria the query takes a little longer to run than what I would think. Is there a more efficient approach than below? Note. tblicd is a large table lookup table with text representations of 15k+ rows. The query makes no real sense, just a proof of concept. I have the proper indexes on my composite keys.
select patient.patientid
,case when exists
(
select c.patientid from tblclaims as c
inner join patient as p on p.patientid=c.patientid
and c.admissiondate = p.admissiondate
and c.dischargedate = p.dischargedate
where patient.patientid = p.patientid
group by c.patientid
having count(*) > 1000
)
then '1' else '0'
end as moreThan1000
,case when exists
(
select c.patientid from tblclaims as c
inner join patient as p on p.patientid=c.patientid
and c.admissiondate = p.admissiondate
and c.dischargedate = p.dischargedate
where patient.patientid = p.patientid
group by c.patientid
having count(*) > 1500
)
then '1' else '0'
end as moreThan1500
,case when exists
(
select distinct picd.patientid from patienticd as picd
inner join patient as p on p.patientid= picd.patientid
and picd.admissiondate = p.admissiondate
and picd.dischargedate = p.dischargedate
inner join tblicd as t on t.icd_id = picd.icd_id
where t.descrip like '%diabetes%' and patient.patientid = picd.patientid
)
then '1' else '0'
end as diabetes
,case when exists
(
select r.patientid, count(*) from patient as r
where r.patientid = patient.patientid
group by r.patientid
having count(*) >1
)
then '1' else '0'
end
from patient
order by moreThan1000 desc
I would start by using subqueries in the from clause:
select q.patientid, moreThan1000, moreThan1500,
(case when d.patientid is not null then 1 else 0 end),
(case when pc.patientid is not null then 1 else 0 end)
from patient p left outer join
(select c.patientid,
(case when count(*) > 1000 then 1 else 0 end) as moreThan1000,
(case when count(*) > 1500 then 1 else 0 end) as moreThan1500
from tblclaims as c inner join
patient as p
on p.patientid=c.patientid and
c.admissiondate = p.admissiondate and
c.dischargedate = p.dischargedate
group by c.patientid
) q
on p.patientid = q.patientid left outer join
(select distinct picd.patientid
from patienticd as picd inner join
patient as p
on p.patientid= picd.patientid and
picd.admissiondate = p.admissiondate and
picd.dischargedate = p.dischargedate inner join
tblicd as t
on t.icd_id = picd.icd_id
where t.descrip like '%diabetes%'
) d
on p.patientid = d.patientid left outer join
(select r.patientid, count(*) as cnt
from patient as r
group by r.patientid
having count(*) >1
) pc
on p.patientid = pc.patientid
order by 2 desc
You can then probably simplify these subqueries more by combining them (for instance "p" and "pc" on the outer query can be combined into one). However, without the correlated subqueries, SQL Server should find it easier to optimize the queries.
Example of left joins as requested...
SELECT
patientid,
ISNULL(CondA.ConditionA,0) as IsConditionA,
ISNULL(CondB.ConditionB,0) as IsConditionB,
....
FROM
patient
LEFT JOIN
(SELECT DISTINCT patientid, 1 as ConditionA from ... where ... ) CondA
ON patient.patientid = CondA.patientID
LEFT JOIN
(SELECT DISTINCT patientid, 1 as ConditionB from ... where ... ) CondB
ON patient.patientid = CondB.patientID
If your Condition queries only return a maximum one row, you can simplify them down to
(SELECT patientid, 1 as ConditionA from ... where ... ) CondA

SQL LEFT JOIN combined with regular joins

I have the following query that joins a bunch of tables.
I'd like to get every record from the INDUSTRY table that has consolidated_industry_id = 1 regardless of whether or not it matches the other tables. I believe this needs to be done with a LEFT JOIN?
SELECT attr.industry_id AS option_id,
attr.industry AS option_name,
uj.ft_job_industry_id,
Avg(CASE
WHEN s.salary > 0 THEN s.salary
END) AS average,
Count(CASE
WHEN s.salary > 0 THEN attr.industry
END) AS count_non_zero,
Count(attr.industry_id) AS count_total
FROM industry attr,
user_job_ft_job uj,
salary_ft_job s,
user_job_ft_job ut,
[user] u,
user_education_mba_school mba
WHERE u.user_id = uj.user_id
AND u.user_id = ut.user_id
AND u.user_id = mba.user_id
AND uj.ft_job_industry_id = attr.industry_id
AND uj.user_job_ft_job_id = s.user_job_id
AND u.include_in_student_site_results = 1
AND u.site_instance_id IN ( 1 )
AND uj.job_type_id = 1
AND attr.consolidated_industry_id = 1
AND mba.mba_graduation_year_id NOT IN ( 8, 9 )
AND uj.admin_approved = 1
GROUP BY attr.industry_id,
attr.industry,
uj.ft_job_industry_id
This returns only one row, but there are 8 matches in the industry table where consolidated_industry_id = 1.
--- EDIT: The real question here is, how do I combine the LEFT JOIN with the regular joins?
Use left join for tables that may miss a corresponding record. Put the conditions for each table in the on clause of the join, not in the where, as that would in effect make them inner joins anyway. Something like:
select
attr.industry_id AS option_id, attr.industry AS option_name,
uj.ft_job_industry_id, AVG(CASE WHEN s.salary > 0 THEN s.salary END) AS average,
COUNT(CASE WHEN s.salary > 0 THEN attr.industry END) as count_non_zero,
COUNT(attr.industry_id) as count_total
from
industry attr
left join user_job_ft_job uj on uj.ft_job_industry_id = attr.industry_id and uj.job_type_id = 1 and uj.admin_approved = 1
left join salary_ft_job s on uj.user_job_ft_job_id = s.user_job_id
left join [user] u on u.user_id = uj.user_id and u.include_in_student_site_results = 1 and u.site_instance_id IN (1)
left join user_job_ft_job ut on u.user_id = ut.user_id
left join user_education_mba_school mba on u.user_id = mba.user_id and mba.mba_graduation_year_id not in (8, 9)
where
attr.consolidated_industry_id = 1
group by
attr.industry_id, attr.industry, uj.ft_job_industry_id
If you have any tables that you know always have a corresponding record, just use innser join for that.