Select max from several columns - sql

I'm writing code to select patients who die within 30 days of a hospital discharge. My issue is that when a patient has multiple discharges within that 30-day window, the query returns multiple rows. I tried to solve this by taking the max discharge date, which worked, but when I add extra columns the query seems to pull certain values from other rows. Here is my code:
SELECT MAX(IPS.disch_dttm) [Discharge Datetime]
,MAX(IPS.IP_SPELL_ID) [Spell ID]
,pat.PAS_ID [K Number]
,MAX(IPS.DIS_WARD_ID) [Ward ID]
,DSSU.SU_DESCRIPTION [Discharging Ward]
FROM Pat_spell AS IPS
LEFT JOIN PATIENT PAT WITH (NOLOCK) ON PAT.DIM_PATIENT_ID = IPS.DIM_PATIENT_ID
LEFT JOIN SPECIALTY SPEC WITH (NOLOCK) ON SPEC.DIM_SPECIALTY_ID = IPS.DIM_DIS_SPECT_ID
LEFT JOIN SERVICE_UNIT DSSU WITH (NOLOCK) ON IPS.DIM_DIS_WARD_ID = DSSU.DIM_SSU_ID
WHERE (IPS.DISCH_DTTM <= PAT.DEATH_DTTM + 30)
AND IPS.DIM_DIS_SPECT_ID = '7195'
AND IPS.DISCH_DTTM BETWEEN '01/01/2014' AND '30/06/2014'
GROUP BY pat.PAS_ID
,pat.DEATH_DTTM
,IPS.DIM_PATIENT_ID
,DSSU.SSU_DESCRIPTION
ORDER BY pat.PAS_ID
Here is output from the above code for a single row that I've been using to debug:
Disch Date   Event_ID   Unique ID   Ward ID   Discharging Ward
==========   ========   =========   =======   ================
2014-06-14   8366113    A123456     77085     Ward A
The above gets the ward ID correct, but the "Discharging Ward" is wrong. Also the Event_ID corresponds with a previous attendance. What I'm trying to achieve is to pull only the most recent event within 30 days of a time of death, with 'Event ID' as my unique ID. This is what the output would look like if I wanted multiple rows:
Row   Disch Date   Event ID   Unique ID   Ward ID   Discharging Ward
===   ==========   ========   =========   =======   ================
1     2014-06-14   8208846    A123456     77085     Ward B
2     2014-05-16   8366113    A123456     77036     Ward A
This is what my output should look like:
Disch Date   Event_ID   Unique ID   Ward ID   Discharging Ward
==========   ========   =========   =======   ================
2014-06-14   8208846    A123456     77085     Ward B
So to sum up, my code pulls the correct "discharge date", the correct "Ward ID" but seems to pull the rest from other rows in the table. Apologies for the huge ask - any help would be appreciated, or if this has been explored to death, please point me in the right direction.

A simplified version of what you need looks like this...
SELECT [DETAIL INFO, no need to MAX or GROUP BY]
FROM Pat_spell AS IPS
LEFT JOIN PATIENT PAT WITH (NOLOCK) ON PAT.DIM_PATIENT_ID = IPS.DIM_PATIENT_ID
LEFT JOIN SPECIALTY SPEC WITH (NOLOCK) ON SPEC.DIM_SPECIALTY_ID = IPS.DIM_DIS_SPECT_ID
LEFT JOIN SERVICE_UNIT DSSU WITH (NOLOCK) ON IPS.DIM_DIS_WARD_ID = DSSU.DIM_SSU_ID
INNER JOIN (
-- one row per patient: the latest qualifying discharge
SELECT PatientID, MAX(IPS.disch_dttm) AS DischargeDt
FROM [AllMyTables]
WHERE (IPS.DISCH_DTTM <= PAT.DEATH_DTTM + 30)
AND IPS.DIM_DIS_SPECT_ID = '7195'
AND IPS.DISCH_DTTM BETWEEN '01/01/2014' AND '30/06/2014'
GROUP BY PatientID
) t1 ON PAT.PatientID = t1.PatientID AND IPS.disch_dttm = t1.DischargeDt
ORDER BY pat.PAS_ID
Because the inner query returns one row per patient, the outer query does not need a GROUP BY.
If I had more data to work with, I might be able to stitch together the full SQL, but maybe this points you in the right direction.

You are getting the max of each field independently. If you want the row that corresponds to the latest discharge date, number the rows with ROW_NUMBER() in descending date order, then select the rows where that number = 1.
We use the same type of query for 30-day readmissions.
So something like:
WITH
LastDischargeCTE
AS
(SELECT pat.PAS_ID [K Number], IPS.disch_dttm [Discharge Datetime], (IPS.IP_SPELL_ID) [Spell ID]
,(IPS.DIS_WARD_ID) [Ward ID]
,DSSU.SU_DESCRIPTION [Discharging Ward]
,ROW_NUMBER () OVER (PARTITION BY pat.PAS_ID ORDER BY IPS.disch_dttm DESC) AS DischargeSequence
FROM Pat_spell AS IPS
LEFT JOIN PATIENT PAT WITH (NOLOCK) ON PAT.DIM_PATIENT_ID = IPS.DIM_PATIENT_ID
LEFT JOIN SPECIALTY SPEC WITH (NOLOCK) ON SPEC.DIM_SPECIALTY_ID = IPS.DIM_DIS_SPECT_ID
LEFT JOIN SERVICE_UNIT DSSU WITH (NOLOCK) ON IPS.DIM_DIS_WARD_ID = DSSU.DIM_SSU_ID
WHERE (IPS.DISCH_DTTM <= PAT.DEATH_DTTM + 30)
AND IPS.DIM_DIS_SPECT_ID = '7195'
AND IPS.DISCH_DTTM BETWEEN '01/01/2014' AND '30/06/2014'
)
SELECT *
FROM LastDischargeCTE
WHERE DischargeSequence =1
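To see both behaviours side by side, here is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in for SQL Server (an assumption; T-SQL syntax differs slightly). The table name spell and its column names are hypothetical, and the two rows are the sample rows from the question. It assumes an SQLite build with window-function support (3.25 or later).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical flattened table holding the two sample rows from the question.
conn.execute(
    "CREATE TABLE spell (pas_id TEXT, disch_date TEXT, "
    "spell_id INTEGER, ward_id INTEGER, ward TEXT)"
)
conn.executemany(
    "INSERT INTO spell VALUES (?, ?, ?, ?, ?)",
    [
        ("A123456", "2014-06-14", 8208846, 77085, "Ward B"),
        ("A123456", "2014-05-16", 8366113, 77036, "Ward A"),
    ],
)

# Independent MAX() per column: each value may come from a different row.
mixed = conn.execute(
    "SELECT pas_id, MAX(disch_date), MAX(spell_id), MAX(ward_id) "
    "FROM spell GROUP BY pas_id"
).fetchall()

# ROW_NUMBER(): rank rows per patient by discharge date, keep the whole top row.
latest = conn.execute(
    """
    SELECT pas_id, disch_date, spell_id, ward_id, ward
    FROM (
        SELECT s.*,
               ROW_NUMBER() OVER (PARTITION BY pas_id
                                  ORDER BY disch_date DESC) AS rn
        FROM spell AS s
    )
    WHERE rn = 1
    """
).fetchall()

print(mixed)   # spell_id 8366113 belongs to the older discharge
print(latest)  # every column comes from the 2014-06-14 row
```

The first result pairs the latest date with the spell ID of the earlier attendance, which is the mismatch described in the question; the second keeps the row intact.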

Related

daily incremental results on table where transaction date <= @Date parameter

I'm specifically looking for General Ledger results. This means that I can't sum up transactions for specific dates or run a BETWEEN on dates.
To get the results for, say, today, I need to query the table for all transactions <= @Today.
That said, I am tasked with running this for every single day in 2020 so far. Is there a way to do this without manually running the query for each day myself?
Query example:
SELECT glo.GLValue
, Sum(UnitCR) AS 'Credit'
, Sum(UnitDR) AS 'Debit'
, sum(FirmCR) AS 'FirmCredit'
, sum(FirmDR) AS 'FirmDebit'
FROM glacct ga
inner join gldetail gd on gd.glacct = ga.AcctIndex
inner join glnatural gn on ga.glnatural = gn.GLNaturalID
inner join glunit glu on ga.glunit = glu.GLUnitID
inner join gloffice glo on ga.GLOffice = glo.GLOfficeID
WHERE gn.GLNat IN ('11001','11002','11003','11005','11007','11011','11016','11019','11020','11021','11022','11024','11025','11026','11027','11032','11033',
'11034','11035','11036','11037','11040','11041','11042','11043','11044','11050','11051','11052','11053','11190','11199','11201','11202','11203','11204',
'11205','11206','11207','11301','11603','11700','11705','11801','11802','11803','11804','11806','11807','11808','11809','11901')--,'22001')
AND gd.PostDate <= @Yesterday
GROUP BY
glo.GLValue
Create a sub-table that gives the sums for each PostDate and GLValue, similar to the query above but also grouped on PostDate, then join that to your select above, e.g.
inner join gloffice glo on ga.GLOffice = glo.GLOfficeID
inner join ( ... new grouped select here ...) gs on gs.GLValue = glo.GlValue and gs.PostDate < gd.PostDate
Now you should be able to sum the gs values:
, Sum(gs.Credit) as Credit
, Sum(gs.Debit) as Debit
etc.
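As a rough illustration of the grouped-subquery idea, the sketch below computes a cumulative total for every posting date in one pass. It uses Python's built-in sqlite3 module as a stand-in for the real database, and the single table gl with columns post_date, office, credit and debit is a hypothetical simplification of the joined GL tables. The date comparison here is <= so that each day includes its own postings, as the question requires.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical, simplified ledger: one row per posting.
conn.execute(
    "CREATE TABLE gl (post_date TEXT, office TEXT, credit INTEGER, debit INTEGER)"
)
conn.executemany(
    "INSERT INTO gl VALUES (?, ?, ?, ?)",
    [
        ("2020-01-01", "NY", 100, 0),
        ("2020-01-01", "NY", 50, 20),
        ("2020-01-02", "NY", 0, 30),
        ("2020-01-03", "NY", 25, 0),
    ],
)

rows = conn.execute(
    """
    WITH daily AS (              -- sums per posting date and office
        SELECT post_date, office,
               SUM(credit) AS credit, SUM(debit) AS debit
        FROM gl
        GROUP BY post_date, office
    )
    SELECT d.post_date, d.office,
           SUM(gs.credit) AS credit,   -- everything posted up to d.post_date
           SUM(gs.debit)  AS debit
    FROM daily AS d
    JOIN daily AS gs
      ON gs.office = d.office
     AND gs.post_date <= d.post_date
    GROUP BY d.post_date, d.office
    ORDER BY d.post_date
    """
).fetchall()

for row in rows:
    print(row)
```

This only produces a row for days that have at least one posting; reporting every calendar day would need a separate list of dates to join against.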

Can't order query correctly

A while ago I asked for help coding a LEFT JOIN that filters in a particular way, so that the result positions the desired value in the first row:
Need to retrieve table's last inserted/updated record with some exclusions
The problem now is that many cases are mixing data. In the same table we have 2 values that we need to place in different columns. The PO_ID is unique but can have 1 or more values in the other tables, and in this particular case 1 PO_ID has 3 SHIP_ID_CUS values. We only need 1 row per PO_ID (no duplicates), which is why we used MAX() and GROUP BY.
Here is the piece of the code that I think causes the issues.
select
z.po_id,
max(scdc.ship_id) as ship_id_cdc,
max(lscdc.ship_evnt_cd) as last_event_cdc,
max(lscdc.ship_evnt_tms) as event_tms_cdc,
max(scus.SHIP_ID) as ship_id_cus,
max(lscus.ship_evnt_cd) as last_event_cus,
max(lscus.ship_evnt_tms) as event_tms_cus
from TABLE.A z
left join (select distinct po_id, iltc.ship_id, s.ship_to_loc_code from TABLE.B iltc inner join TABLE.C s on iltc.ship_id=s.ship_id and iltc.ship_to_loc_code=s.ship_to_loc_code and s.ship_to_ctry<>' ') AS A ON z.po_id = a.po_id
left JOIN TABLE.C scus ON A.SHIP_ID = scus.SHIP_ID AND A.SHIP_TO_LOC_CODE = scus.SHIP_TO_LOC_CODE and scus.loc_type = 'CUS' AND DAYS(scus.shipment_tms)+10 >= DAYS(z.ship_tms)
left JOIN TABLE.C scdc ON A.SHIP_ID = scdc.SHIP_ID AND A.SHIP_TO_LOC_CODE = scdc.SHIP_TO_LOC_CODE and scdc.loc_type = 'CDC' AND DAYS(scdc.shipment_tms)+10 >= DAYS(z.ship_tms)
left join
( select ship_id_856, ship_to_loc_cd856, ship_evnt_cd, ship_evnt_tms, carr_tracking_num, event_srv_lvl
, row_number() over(partition by ship_id order by updt_job_tms desc) as RN
FROM TABLE.D
WHERE LEFT(ship_evnt_cd, 1) <> '9') lscus
ON lscus.ship_id_856=scus.ship_id and scus.ship_to_loc_code=lscus.ship_to_loc_cd856 and lscus.rn = 1
left join
( select ship_id_856, ship_to_loc_cd856, ship_evnt_cd, ship_evnt_tms, carr_tracking_num, event_srv_lvl
, row_number() over(partition by ship_id order by updt_job_tms desc) as RN
FROM TABLE.D
WHERE LEFT(ship_evnt_cd, 1) <> '9') lscdc
ON lscdc.ship_id_856=scdc.ship_id and lscdc.ship_to_loc_cd856=scdc.ship_to_loc_code and lscdc.rn = 1
WHERE
z.po_id = 'T1DLDC'
GROUP BY z.po_id
Searching with that condition, we get the following result.
The problem is that if we search TABLE.D directly, the last event that we need (the one with the latest update record tms) is a different one (X1), and somehow the date is incorrect.
Even stranger, if we search for the ship_id_cus in the original query, we get the correct code but still with a wrong date...
WHERE
--z.po_id = 'T1DLDC'
scus.ship_id = 'D30980'
GROUP BY z.po_id
I tried other logic changes like modifying the left joins to search on a subquery.
left JOIN ( select * from TABLE.C order by updt_job_tms desc) scus ON A.SHIP_ID = scus.SHIP_ID AND A.SHIP_TO_LOC_CODE = scus.SHIP_TO_LOC_CODE and scus.loc_type = 'CUS' AND DAYS(scus.shipment_tms)+10 >= DAYS(z.ship_tms)
But this gives exactly the same results, whether searching by po_id or by ship_id_cus.
Any ideas or comments will be much appreciated.
Thanks
------------------------------------UPDATE-----------------------------------
Adding the result of the LEFT JOIN with the row_partition() including all the ship_id_cus for that po_id, and all the codes with the tms. None match here.
Based on all these, it should be the last ship_id_cus with X1 event/tms. If we exclude also the ones starting with 9, we would get the following result.
(I am not ordering by ship_id_cus here; as described above, that did not work either, at least the way I implemented it.)
If you have a table: TBL1
ID APPROVED APPROVER DATE_APPROVED
====== ======== ======== =============
ABC Y JOE 2019-01-13
ABC N ZACK 2018-12-23
ABC N SUE 2019-02-23
And you do SQL:
SELECT ID, MAX(APPROVED) AS APPROVAL
,MAX(APPROVER) AS APPROVED_BY , MAX(DATE_APPROVED) AS APPROVED_ON
FROM TBL1 GROUP BY ID
you will get result:
ID APPROVAL APPROVED_BY APPROVED_ON
====== ======== =========== =============
ABC Y ZACK 2019-02-23
which is correct for that query but is NOT what you want: each MAX() is evaluated independently, so the values come from different rows.
Try the following:
SELECT T1.ID, T1.APPROVED, T1.APPROVER, T1.DATE_APPROVED
FROM TBL1 AS T1
INNER JOIN (SELECT ID, MAX(DATE_APPROVED) AS APPROVED_ON
FROM TBL1 GROUP BY ID
) AS T2
ON T1.ID =T2.ID
AND T1.DATE_APPROVED = T2.APPROVED_ON
Result:
ID APPROVED APPROVER DATE_APPROVED
====== ======== ======== =============
ABC N SUE 2019-02-23
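The two results above can be reproduced with a small script. This is a minimal sketch that runs the same two queries against the TBL1 sample rows, using Python's built-in sqlite3 module as a stand-in for the actual database (an assumption; the thread's queries use DB2-style functions elsewhere).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE TBL1 (ID TEXT, APPROVED TEXT, APPROVER TEXT, DATE_APPROVED TEXT)"
)
conn.executemany(
    "INSERT INTO TBL1 VALUES (?, ?, ?, ?)",
    [
        ("ABC", "Y", "JOE", "2019-01-13"),
        ("ABC", "N", "ZACK", "2018-12-23"),
        ("ABC", "N", "SUE", "2019-02-23"),
    ],
)

# MAX() on every column: each maximum is taken on its own.
per_column = conn.execute(
    "SELECT ID, MAX(APPROVED), MAX(APPROVER), MAX(DATE_APPROVED) "
    "FROM TBL1 GROUP BY ID"
).fetchall()

# Join back to the per-ID latest date to keep the whole matching row.
latest_row = conn.execute(
    """
    SELECT T1.ID, T1.APPROVED, T1.APPROVER, T1.DATE_APPROVED
    FROM TBL1 AS T1
    INNER JOIN (SELECT ID, MAX(DATE_APPROVED) AS APPROVED_ON
                FROM TBL1 GROUP BY ID) AS T2
            ON T1.ID = T2.ID
           AND T1.DATE_APPROVED = T2.APPROVED_ON
    """
).fetchall()

print(per_column)
print(latest_row)
```

One caveat with this pattern: if two rows for the same ID share the latest date, the join returns both of them.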

Sql Joining 3 Tables Query for Loan Total Result

I have to get each customer's total loan from 3 tables: two tables hold loans given, which are summed, and the third holds amounts paid, which are subtracted. All three tables have the Customer ID in common. So far I only get the right result if the Customer ID exists in all tables; if it is missing from one table, the customer does not appear in my result, or I get NULL customer IDs when I anchor the query to the Customer table.
SELECT
AS1.C_ID AS [Customer ID],
ISNULL(AS1.OldCustomerLoan, 0) AS [Old Loan],
ISNULL(AS2.NewGivenLoan, 0) AS [New Given Loan],
ISNULL(AS3.LoanPaid, 0) AS [PaidLoanAmount],
(ISNULL(AS1.OldCustomerLoan, 0) +
ISNULL(AS2.NewGivenLoan, 0) -
ISNULL(AS3.LoanPaid,0) ) AS Total
FROM
Customer C
LEFT OUTER JOIN
(SELECT
MOC.C_ID,
SUM(MOC.Quantity) AS OldCustomerLoan
FROM
Money_On_Customer MOC (NOLOCK)
GROUP BY
MOC.C_ID) AS1
ON c.C_Id = AS1.C_Id
LEFT OUTER JOIN
(SELECT
NGL.C_ID
,SUM(NGL.G_Take_Loan) AS NewGivenLoan
FROM
Given_Loan NGL
GROUP BY
NGL.C_ID) AS2
ON c.C_Id = AS2.C_Id
LEFT OUTER JOIN
(SELECT
GLP.C_ID, SUM(GLP.G_P_Loan) AS LoanPaid
FROM
Given_Loan_Paid GLP
GROUP BY
GLP.C_ID ) AS3
ON c.C_Id = AS3.C_Id
Here Is a picture of my two results:
When I get NULL Customer IDs
When I don't get All the Customers
You need to use c.c_id for the first column.
To get only the customers that exist in at least one of the tables, you can wrap your query as follows: put your current query in place of the dots and add the leftid column.
Select *
From
(Select c.c_id customerid,
Coalesce(as1.c_id,as2.c_id,as3.c_id) leftid,......
From ....
) ilv
Where leftid is not null
You might be able to just add
Where coalesce(as1.c_id,as2.c_id,as3.c_id) is not null
to the end of your query.
@Ab Bennett's answer is right: you should use the primary key column of your main (master) table.
If the Customer ID is not present in AS1 (Money_On_Customer), AS1.C_ID will be NULL.
Hope you understand my explanation.
UPDATE:
Use the following to get the customer ID:
COALESCE(c.C_Id, AS1.C_Id, AS2.C_Id)
It returns the first non-NULL Customer ID.
Here is my answer, with the help of @Ab Bennett:
Select *
From
(
SELECT
C.C_Name AS [Customer ID],
ISNULL(AS1.OldCustomerLoan, 0) AS [Old Loan],
ISNULL(AS2.NewGivenLoan, 0) AS [New Given Loan],
ISNULL(AS3.LoanPaid, 0) AS [PaidLoanAmount],
(ISNULL(AS1.OldCustomerLoan, 0) +
ISNULL(AS2.NewGivenLoan, 0) -
ISNULL(AS3.LoanPaid,0) ) AS Total
FROM
Customer C
LEFT OUTER JOIN
(SELECT
MOC.C_ID,
SUM(MOC.Quantity) AS OldCustomerLoan
FROM
Money_On_Customer MOC (NOLOCK)
GROUP BY
MOC.C_ID) AS1
ON c.C_Id = AS1.C_Id
LEFT OUTER JOIN
(SELECT
NGL.C_ID
,SUM(NGL.G_Take_Loan) AS NewGivenLoan
FROM
Given_Loan NGL
GROUP BY
NGL.C_ID) AS2
ON c.C_Id = AS2.C_Id
LEFT OUTER JOIN
(SELECT
GLP.C_ID, SUM(GLP.G_P_Loan) AS LoanPaid
FROM
Given_Loan_Paid GLP
GROUP BY
GLP.C_ID ) AS3
ON c.C_Id = AS3.C_Id) ilv
Where not ([Old Loan] = 0 and [New Given Loan]=0 and PaidLoanAmount =0 )
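The pattern discussed in this thread (anchor on the Customer table, left join each pre-aggregated subquery, then keep only customers found in at least one loan table) can be sketched as follows. This uses Python's built-in sqlite3 module as a stand-in for SQL Server, so COALESCE replaces ISNULL; the table and column names follow the question, while the sample rows and customer names are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Customer (C_Id INTEGER PRIMARY KEY, C_Name TEXT);
    CREATE TABLE Money_On_Customer (C_Id INTEGER, Quantity INTEGER);
    CREATE TABLE Given_Loan (C_Id INTEGER, G_Take_Loan INTEGER);
    CREATE TABLE Given_Loan_Paid (C_Id INTEGER, G_P_Loan INTEGER);
    -- Made-up sample data: customer 3 has no loan activity at all.
    INSERT INTO Customer VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
    INSERT INTO Money_On_Customer VALUES (1, 100);
    INSERT INTO Given_Loan VALUES (1, 50), (2, 200), (2, 100);
    INSERT INTO Given_Loan_Paid VALUES (2, 120);
    """
)

rows = conn.execute(
    """
    SELECT c.C_Id,                       -- always taken from the anchor table
           COALESCE(AS1.OldCustomerLoan, 0),
           COALESCE(AS2.NewGivenLoan, 0),
           COALESCE(AS3.LoanPaid, 0),
           COALESCE(AS1.OldCustomerLoan, 0)
             + COALESCE(AS2.NewGivenLoan, 0)
             - COALESCE(AS3.LoanPaid, 0) AS Total
    FROM Customer AS c
    LEFT JOIN (SELECT C_Id, SUM(Quantity) AS OldCustomerLoan
               FROM Money_On_Customer GROUP BY C_Id) AS AS1 ON c.C_Id = AS1.C_Id
    LEFT JOIN (SELECT C_Id, SUM(G_Take_Loan) AS NewGivenLoan
               FROM Given_Loan GROUP BY C_Id) AS AS2 ON c.C_Id = AS2.C_Id
    LEFT JOIN (SELECT C_Id, SUM(G_P_Loan) AS LoanPaid
               FROM Given_Loan_Paid GROUP BY C_Id) AS AS3 ON c.C_Id = AS3.C_Id
    -- keep only customers present in at least one loan table
    WHERE COALESCE(AS1.C_Id, AS2.C_Id, AS3.C_Id) IS NOT NULL
    ORDER BY c.C_Id
    """
).fetchall()

for row in rows:
    print(row)
```

Aggregating inside each subquery before joining keeps one row per customer, so the sums are not multiplied by rows from the other tables.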

Select Latest or most recent date in SQL query

I am running a query in SQL on our EHR/EMR database. I am primarily looking at an assessment that is done by a nurse during each patient encounter/visit and looking to return an answer for the most recent assessment date along with some other info. I have the query created and all the data is coming over, however, it is returning all assessment dates and the answers instead of just the latest date and answer. I'll attach the full code below.
SELECT DISTINCT
MAX(PTA.ASSESSMENT_DATE) AS Max_Date,
SAQ.QUESTION_TEXT, SAA.ANSWER_TEXT, dbo.PT_BASIC.PATIENT_CODE,
dbo.PT_BASIC.NAME_FULL
FROM
dbo.PTC_ASSESSMENT_ANSWER AS PAA
INNER JOIN
dbo.PTC_ASSESSMENT AS PTA ON PTA.ASSESSMENT_ID = PAA.ASSESSMENT_ID
AND PTA.PATIENT_ID = PAA.PATIENT_ID
INNER JOIN
dbo.SYS_ASSESSMENT_POINTER AS SAP ON SAP.POINTER_ID = PAA.POINTER_ID
INNER JOIN
dbo.SYS_ASSESSMENT_QUESTION AS SAQ ON SAQ.QUESTION_ID = SAP.QUESTION_ID
INNER JOIN
dbo.SYS_ASSESSMENT_ANSWER AS SAA ON SAA.ANSWER_ID = SAP.ANSWER_ID
INNER JOIN
dbo.PT_BASIC ON PTA.PATIENT_ID = dbo.PT_BASIC.PATIENT_ID
WHERE
(PTA.ASSESSMENT_DATE BETWEEN CONVERT(DATETIME, '2017-09-05 00:00:00', 102)
AND CONVERT(DATETIME, '2017-10-12 00:00:00', 102))
GROUP BY
dbo.PT_BASIC.PATIENT_CODE, dbo.PT_BASIC.NAME_FULL, SAQ.QUESTION_TEXT,
SAA.ANSWER_TEXT
HAVING
(SAA.ANSWER_TEXT LIKE '%LEVEL % -%')
The current output would be something similar to this:
9/5/2017 PATIENT ABC Answer1
9/6/2017 PATIENT ABC Answer2
9/7/2017 PATIENT ABC Answer3
9/6/2017 PATIENT XYZ Answer4
What I am expecting is:
9/7/2017 PATIENT ABC Answer3
9/6/2017 PATIENT XYZ Answer4
If your version of SQL Server supports it, ROW_NUMBER() OVER() is an efficient and simple way to get the "latest" (or "earliest") rows from a single table. Because we know so little about your data model, it isn't easy to guess how to reduce the rows to just the "latest answer", which probably requires a more complex subquery; you can still use ROW_NUMBER() OVER() on that subquery, though. I suspect that, given the nature of questions and answers, the tables aliased SAP, SAQ and SAA may all need to be involved in this subquery.
Note that instead of joining PTA directly, this is now a subquery, and the join condition to the outer query requires RN = 1, which is the row with the "latest" date.
SELECT
MAX(PTA.ASSESSMENT_DATE) AS Max_Date
, SAQ.QUESTION_TEXT
, SAA.ANSWER_TEXT
, dbo.PT_BASIC.PATIENT_CODE
, dbo.PT_BASIC.NAME_FULL
FROM dbo.PTC_ASSESSMENT_ANSWER AS PAA
INNER JOIN (
SELECT
*
, ROW_NUMBER() OVER (PARTITION BY PATIENT_ID
ORDER BY ASSESSMENT_DATE DESC) AS RN
FROM dbo.PTC_ASSESSMENT
WHERE ASSESSMENT_DATE BETWEEN '20170905' AND '20171012'
) AS PTA ON PTA.ASSESSMENT_ID = PAA.ASSESSMENT_ID
AND PTA.PATIENT_ID = PAA.PATIENT_ID
AND PTA.RN = 1
INNER JOIN dbo.SYS_ASSESSMENT_POINTER AS SAP ON SAP.POINTER_ID = PAA.POINTER_ID
INNER JOIN dbo.SYS_ASSESSMENT_QUESTION AS SAQ ON SAQ.QUESTION_ID = SAP.QUESTION_ID
INNER JOIN dbo.SYS_ASSESSMENT_ANSWER AS SAA ON SAA.ANSWER_ID = SAP.ANSWER_ID
INNER JOIN dbo.PT_BASIC ON PTA.PATIENT_ID = dbo.PT_BASIC.PATIENT_ID
WHERE SAA.ANSWER_TEXT LIKE '%LEVEL % -%'
GROUP BY
dbo.PT_BASIC.PATIENT_CODE
, dbo.PT_BASIC.NAME_FULL
, SAQ.QUESTION_TEXT
, SAA.ANSWER_TEXT
SELECT DISTINCT is not required on this query (or on any similar query using GROUP BY).
yyyymmdd is the safest date literal format in SQL Server; you don't need the CONVERTs with style 102.
Your HAVING clause should be moved to a WHERE clause, as it does not evaluate any aggregated value.
CROSS APPLY lets you use a correlated query and get the top n records, ordered by date descending, for each patient assessment. (After review, maybe you just need patient?)
Perhaps just change:
INNER JOIN
dbo.PTC_ASSESSMENT AS PTA ON PTA.ASSESSMENT_ID = PAA.ASSESSMENT_ID
AND PTA.PATIENT_ID = PAA.PATIENT_ID
TO:
CROSS APPLY (SELECT TOP 1 *
FROM dbo.PTC_ASSESSMENT PTA2
WHERE PTA2.ASSESSMENT_ID = PAA.ASSESSMENT_ID
/*AND PTA2.PATIENT_ID = PAA.PATIENT_ID*/
ORDER BY PTA2.Assessment_date desc) PTA
Giving you the following (I left /*AND PTA2.PATIENT_ID = PAA.PATIENT_ID*/ commented out; I think you can omit it):
SELECT MAX(PTA.ASSESSMENT_DATE) AS Max_Date
, SAQ.QUESTION_TEXT
, SAA.ANSWER_TEXT
, dbo.PT_BASIC.PATIENT_CODE
, dbo.PT_BASIC.NAME_FULL
FROM dbo.PTC_ASSESSMENT_ANSWER AS PAA
CROSS APPLY (SELECT TOP 1 *
FROM dbo.PTC_ASSESSMENT PTA2
WHERE PTA2.ASSESSMENT_ID = PAA.ASSESSMENT_ID
/*AND PTA2.PATIENT_ID = PAA.PATIENT_ID*/ --I think you can omit this.
ORDER BY PTA2.Assessment_date desc) PTA
INNER JOIN dbo.SYS_ASSESSMENT_POINTER AS SAP
ON SAP.POINTER_ID = PAA.POINTER_ID
INNER JOIN dbo.SYS_ASSESSMENT_QUESTION AS SAQ
ON SAQ.QUESTION_ID = SAP.QUESTION_ID
INNER JOIN dbo.SYS_ASSESSMENT_ANSWER AS SAA
ON SAA.ANSWER_ID = SAP.ANSWER_ID
INNER JOIN dbo.PT_BASIC
ON PTA.PATIENT_ID = dbo.PT_BASIC.PATIENT_ID
WHERE (PTA.ASSESSMENT_DATE BETWEEN CONVERT(DATETIME, '2017-09-05 00:00:00', 102) AND CONVERT(DATETIME, '2017-10-12 00:00:00', 102))
GROUP BY dbo.PT_BASIC.PATIENT_CODE
, dbo.PT_BASIC.NAME_FULL
, SAQ.QUESTION_TEXT
, SAA.ANSWER_TEXT
HAVING (SAA.ANSWER_TEXT LIKE '%LEVEL % -%')
It appears you're not concerned about patients without assessments, as all your joins are inner; otherwise we could use OUTER APPLY to be sure to keep all answers regardless of whether an assessment has been provided.
Alternatively, you could use row_number() logic (Tab Alleman's link has this covered) and a CTE; but if CROSS APPLY is available, you might as well use it here.
Please include order by PTA.ASSESSMENT_DATE DESC to see the latest records at the top.
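For readers without SQL Server at hand, the "top 1 per patient" idea can be approximated with a correlated subquery. The sketch below uses Python's built-in sqlite3 module, which has neither CROSS APPLY nor TOP, so LIMIT 1 inside a correlated subquery stands in for them. The single assessment table and its columns are a hypothetical flattening of the joined tables, the subquery correlates on patient rather than on assessment ID, and the rows mirror the sample output in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical flattened table: one row per assessment answer.
conn.execute(
    "CREATE TABLE assessment (assessment_id INTEGER PRIMARY KEY, "
    "patient TEXT, assessment_date TEXT, answer TEXT)"
)
conn.executemany(
    "INSERT INTO assessment (patient, assessment_date, answer) VALUES (?, ?, ?)",
    [
        ("PATIENT ABC", "2017-09-05", "Answer1"),
        ("PATIENT ABC", "2017-09-06", "Answer2"),
        ("PATIENT ABC", "2017-09-07", "Answer3"),
        ("PATIENT XYZ", "2017-09-06", "Answer4"),
    ],
)

latest = conn.execute(
    """
    SELECT a.patient, a.assessment_date, a.answer
    FROM assessment AS a
    WHERE a.assessment_id = (
        -- newest assessment for this patient (stand-in for TOP 1 ... DESC)
        SELECT a2.assessment_id
        FROM assessment AS a2
        WHERE a2.patient = a.patient
        ORDER BY a2.assessment_date DESC
        LIMIT 1
    )
    ORDER BY a.patient
    """
).fetchall()

for row in latest:
    print(row)
```

Because the subquery returns a single ID, this variant always yields one row per patient, even when two assessments share the same date.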

Get percentages of larger group

The query below is kind of an ugly one, so I hope I've spaced it well enough to make it readable. It finds the percentage of people who visit a given hospital if they are from a certain area. For instance, if 100 people live in county X and 20 go to hospital A and 80 go to hospital B, the query outputs:
hospital A 20
hospital B 80
The query below works exactly like I want it to, but it got me thinking: how could this be done for every county in my table? How the heck is this sort of thing done? Let me know if I need to document the query, or whatever else I can do to make it clearer.
select hospitalname, round(cast(counts as float)/cast(fayettestrokepop as float)*100,2)as percentSeen
from
(
SELECT tblHospitals.hospitalname, COUNT(tblHospitals.hospitalname) AS counts, tblStateCounties_1.countyName,
(SELECT COUNT(*) AS Expr1
FROM Patient INNER JOIN
tblStateCounties ON Patient.stateCode = tblStateCounties.stateCode AND Patient.countyCode = tblStateCounties.countyCode
WHERE (tblStateCounties.stateCode = '21') AND (tblStateCounties.countyName = 'fayette')) AS fayetteStrokePop
FROM Patient AS Patient_1 INNER JOIN
tblHospitals ON Patient_1.hospitalnpi = tblHospitals.hospitalnpi INNER JOIN
tblStateCounties AS tblStateCounties_1 ON Patient_1.stateCode = tblStateCounties_1.stateCode AND Patient_1.countyCode = tblStateCounties_1.countyCode
WHERE (tblStateCounties_1.stateCode = '21') AND (tblStateCounties_1.countyName = 'fayette')
GROUP BY tblHospitals.hospitalname, tblStateCounties_1.countyName
) as t
order by percentSeen desc
EDIT: sample data
The sample data below is without the outermost query (the as t order by part).
The countsInTheCounty column is the (select count(*)..) part after 'tblStateCounties_1.countyName'
hospitalName hospitalCounts countyName countsInTheCounty
st. james 23 X 300
st. jude 40 X 300
Now with the outer query we would get
st. james 0.076 (23/300)
st. jude 0.1333 (40/300)
Here is my guess. You'll have to test against your data or provide proper DDL + sample data.
;WITH totalCounts AS
(
SELECT StateCode, countyCode, COUNT(*) AS totalcount
FROM dbo.Patient GROUP BY StateCode, countyCode
)
SELECT
h.hospitalName,
hospitalCounts = COUNT(p.hospitalnpi),
c.countyName,
countsInTheCounty = tc.totalCount,
percentseen = CONVERT(DECIMAL(5,2), COUNT(p.hospitalnpi)*100.0/tc.totalCount)
FROM
dbo.Patient AS p
INNER JOIN
dbo.tblHospitals AS h
ON p.hospitalnpi = h.hospitalnpi
INNER JOIN
totalCounts AS tc
ON p.StateCode = tc.StateCode
AND p.countyCode = tc.countyCode
INNER JOIN
dbo.tblStateCounties AS c
ON tc.StateCode = c.stateCode
AND tc.countyCode = c.countyCode
GROUP BY
h.hospitalname,
c.countyName,
tc.totalcount
ORDER BY
c.countyName,
percentseen DESC;
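The per-county approach can be tried out on toy data. This is a minimal sketch using Python's built-in sqlite3 module instead of SQL Server, with a hypothetical single patient table (columns county and hospital) in place of the three joined tables. County X uses the 20/80 split from the question; county Y is invented to show a second county.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical simplified table: one row per patient visit.
conn.execute("CREATE TABLE patient (county TEXT, hospital TEXT)")
conn.executemany(
    "INSERT INTO patient VALUES (?, ?)",
    [("X", "hospital A")] * 20
    + [("X", "hospital B")] * 80
    + [("Y", "hospital A")] * 1
    + [("Y", "hospital B")] * 3,
)

rows = conn.execute(
    """
    WITH totals AS (             -- patients per county, computed once
        SELECT county, COUNT(*) AS total
        FROM patient
        GROUP BY county
    )
    SELECT p.county, p.hospital,
           COUNT(*) AS seen,
           ROUND(COUNT(*) * 100.0 / t.total, 2) AS percentseen
    FROM patient AS p
    JOIN totals AS t ON t.county = p.county
    GROUP BY p.county, p.hospital, t.total
    ORDER BY p.county, percentseen DESC
    """
).fetchall()

for row in rows:
    print(row)
```

Multiplying by 100.0 forces floating-point division, which plays the same role as the casts to float in the original query.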