Query extensibility with WHERE EXISTS with a large table - sql

The following query is designed to find the number of people who went to a hospital and have a claim with one of the listed HCPCS codes, the total number of people who went to that hospital, and then divide the two to get a percentage. The Claims table has more than two million rows and has a suitable non-clustered index on patientid, admissiondate, and dischargedate. The query runs quickly enough, but I'm interested in how I could make it more usable. I would like to be able to add another code in the line where (hcpcs.hcpcs ='97001') and have the change in percentRehabNotHomeHealth be reflected in another column. Is this possible without writing a big, fat join statement where I join the results of the two queries together? I know that by adding the extra column the math won't look right, but I'm not worried about that at the moment. Desired sample output: http://imgur.com/BCLrd
database schema
select h.hospitalname
,count(*) as visitCounts
,hospitalcounts
,round(count(*)/cast(hospitalcounts as float) *100,2) as percentRehabNotHomeHealth
from Patient p
inner join statecounties as sc on sc.countycode = p.countycode
and sc.statecode = p.statecode
inner join hospitals as h on h.npi=p.hospitalnpi
inner join
--this join adds the hospitalCounts column
(
select h.hospitalname, count(*) as hospitalCounts
from hospitals as h
inner join patient as p on p.hospitalnpi=h.npi
where p.statecode='21' and h.statecode='21'
group by h.hospitalname
) as t on t.hospitalname=h.hospitalname
--this where exists clause gives the visitCounts column
where h.stateCode='21' and p.statecode='21'
and exists
(
select distinct p2.patientid
from Patient as p2
inner join Claims as c on c.patientid = p2.patientid
and c.admissiondate = p2.admissiondate
and c.dischargedate = p2.dischargedate
inner join hcpcs on hcpcs.hcpcs=c.hcpcs
inner join hospitals as h on h.npi=p2.hospitalnpi
where (hcpcs.hcpcs ='97001' or hcpcs.hcpcs='9339' or hcpcs.hcpcs='97002')
and p2.patientid=p.patientid
)
and hospitalcounts > 10
group by h.hospitalname, t.hospitalcounts
having count(*)>10

You might look into CTEs (common table expressions) to get what you need. A CTE lets you compute summarized data once and join it back to the detail rows on a common key. As an example, I changed your joined subquery into a CTE.
;with hospitalCounts as (
select h.hospitalname, count(*) as hospitalCounts
from hospitals as h
inner join patient as p on p.hospitalnpi=h.npi
where p.statecode='21' and h.statecode='21'
group by h.hospitalname
)
select h.hospitalname
,count(*) as visitCounts
,hospitalcounts
,round(count(*)/cast(hospitalcounts as float) *100,2) as percentRehabNotHomeHealth
from Patient p
inner join statecounties as sc on sc.countycode = p.countycode
and sc.statecode = p.statecode
inner join hospitals as h on h.npi=p.hospitalnpi
inner join hospitalCounts as t on t.hospitalname=h.hospitalname
--this where exists clause gives the visitCounts column
where h.stateCode='21' and p.statecode='21'
and exists
(
select p2.patientid
from Patient as p2
inner join Claims as c on c.patientid = p2.patientid
and c.admissiondate = p2.admissiondate
and c.dischargedate = p2.dischargedate
inner join hcpcs on hcpcs.hcpcs=c.hcpcs
inner join hospitals as h on h.npi=p2.hospitalnpi
where (hcpcs.hcpcs ='97001' or hcpcs.hcpcs='9339' or hcpcs.hcpcs='97002')
and p2.patientid=p.patientid
)
and hospitalcounts > 10
group by h.hospitalname, t.hospitalcounts
having count(*)>10
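To get one column per code without joining two full queries together, one option is conditional aggregation: count distinct patients inside a CASE expression for each code. The sketch below runs the idea on an in-memory SQLite database from Python. The three-table schema and the sample rows are simplified assumptions for illustration, not the real Patient and Claims tables, and the column names visits97001 and visits97002 are hypothetical.

```python
import sqlite3

# Hypothetical, simplified schema: only the columns needed for the idea.
conn = sqlite3.connect(":memory:")
conn.executescript("""
create table hospitals (npi text primary key, hospitalname text);
create table patient (patientid integer primary key, hospitalnpi text);
create table claims (patientid integer, hcpcs text);
insert into hospitals values ('H1', 'General'), ('H2', 'Mercy');
insert into patient values (1, 'H1'), (2, 'H1'), (3, 'H1'), (4, 'H2'), (5, 'H2');
insert into claims values
    (1, '97001'), (1, '97001'), (2, '97002'), (3, '99999'),
    (4, '97001'), (4, '97002'), (5, '97001');
""")

query = """
with hospitalCounts as (
    -- total patients per hospital, computed once
    select hospitalnpi, count(*) as hospitalCounts
    from patient
    group by hospitalnpi
)
select h.hospitalname,
       t.hospitalCounts,
       -- one conditional count per code; distinct avoids counting a
       -- patient twice when they have several claims with the same code
       count(distinct case when c.hcpcs = '97001' then p.patientid end) as visits97001,
       count(distinct case when c.hcpcs = '97002' then p.patientid end) as visits97002
from hospitals as h
inner join hospitalCounts as t on t.hospitalnpi = h.npi
inner join patient as p on p.hospitalnpi = h.npi
left join claims as c on c.patientid = p.patientid
group by h.hospitalname, t.hospitalCounts
order by h.hospitalname
"""

rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Adding another code then means adding one more count(distinct case ...) line rather than another joined query. Each percentage can be derived from its count and hospitalCounts in the same select.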

Related

Multiple joins with group by (Sum)

When using multiple JOINs, I want to get the sum of a column from each of the joined tables.
SELECT
A.*,
SUM(C.purchase_price) AS purchase_total,
SUM(D.sales_price) AS sales_total,
B.user_name
FROM
PROJECT AS A
LEFT JOIN
USER AS B ON A.user_idx = B.user_idx
LEFT JOIN
PURCHASE AS C ON A.project_idx = C.project_idx
LEFT JOIN
SALES AS D ON A.project_idx = D.project_idx
GROUP BY
????
You need to aggregate in subqueries before joining, as follows:
SELECT A.project_idx,
A.project_name,
A.project_category,
sum(C.purchase_price) AS purchase_total,
sum(D.sales_price) AS sales_total,
B.user_name
FROM PROJECT AS A
LEFT JOIN USER AS B ON A.user_idx = B.user_idx
LEFT JOIN (select project_idx, sum(purchase_price) as purchase_price
from PURCHASE group by project_idx ) AS C ON A.project_idx = C.project_idx
LEFT JOIN (select project_idx, sum(sales_price) as sales_price
from SALES group by project_idx) AS D ON A.project_idx = D.project_idx
GROUP BY A.project_idx, A.project_name, A.project_category, B.user_name
I am not sure, but if every project has a user you can probably use an inner join between PROJECT and USER instead of a left join.
SELECT A.project_idx,
A.project_name,
A.project_category,
C.purchase_total,
D.sales_total,
B.user_name
FROM PROJECT AS A
LEFT JOIN USER AS B ON A.user_idx = B.user_idx
LEFT JOIN (select project_idx, sum(purchase_price) as purchase_total
from PURCHASE group by project_idx ) AS C ON A.project_idx = C.project_idx
LEFT JOIN (select project_idx, sum(sales_price) as sales_total
from SALES group by project_idx) AS D ON A.project_idx = D.project_idx
This works correctly on MS SQL Server.
Thanks to Popeye
You are attempting to aggregate over two unrelated dimensions, and that throws off all the calculations.
Correlated subqueries are an alternative:
SELECT p.*,
(SELECT SUM(pu.purchase_price)
FROM PURCHASE pu
WHERE p.project_idx = pu.project_idx
) as purchase_total,
(SELECT SUM(s.sales_price)
FROM SALES s
WHERE p.project_idx = s.project_idx
) as sales_total,
u.user_name
FROM PROJECT p LEFT JOIN
USER u
ON p.user_idx = u.user_idx ;
Note that this uses meaningful table aliases so the query is easier to read. Arbitrary letters are really no better (and perhaps worse) than using the entire table name.
Correlated subqueries avoid the outer aggregation as well -- and let you select all the columns from the first table, which is what you want. They also often have better performance with the right indexes.
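To see why aggregating over two unrelated dimensions goes wrong, here is a small sketch that compares the naive double join with the correlated-subquery form. It uses an in-memory SQLite database from Python; the tables are cut down to the columns used above and the sample rows are made up for illustration.

```python
import sqlite3

# Made-up sample data: one project with 2 purchases and 3 sales.
conn = sqlite3.connect(":memory:")
conn.executescript("""
create table project (project_idx integer primary key, project_name text);
create table purchase (project_idx integer, purchase_price integer);
create table sales (project_idx integer, sales_price integer);
insert into project values (1, 'demo');
insert into purchase values (1, 10), (1, 20);
insert into sales values (1, 5), (1, 5), (1, 5);
""")

# Naive version: joining both child tables yields 2 x 3 = 6 rows per project,
# so every purchase is summed 3 times and every sale 2 times.
naive = conn.execute("""
select p.project_idx, sum(pu.purchase_price), sum(s.sales_price)
from project p
left join purchase pu on pu.project_idx = p.project_idx
left join sales s on s.project_idx = p.project_idx
group by p.project_idx
""").fetchall()

# Correlated subqueries: each total is computed on its own table only.
correlated = conn.execute("""
select p.project_idx,
       (select sum(pu.purchase_price) from purchase pu
        where pu.project_idx = p.project_idx) as purchase_total,
       (select sum(s.sales_price) from sales s
        where s.project_idx = p.project_idx) as sales_total
from project p
""").fetchall()

print("naive:     ", naive)       # inflated totals
print("correlated:", correlated)  # true totals
```

With this data the true totals are 30 and 15, while the naive join reports 90 and 30.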

SQL - joining multiple tables

I have three tables that I'm trying to join:
sales
order
employee
For example, the tables have the following attributes.
Sales:
ID
price
Order:
ID
tag
Employee:
tag
yearsWorked
I would like to left join sales and order, and then left join that result with employee, so that only the records coming out of the first join are matched against employee.
SELECT *
FROM ( SELECT *
FROM SALES
LEFT JOIN ORDER
ON SALES.ID = ORDER.ID) AS SO
LEFT JOIN EMPLOYEE
on SO.TAG = EMPLOEYE.TAG;
The above query does not work.
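There is no need for a subquery. All you need is two LEFT JOINs, one for each table. Because the second join hangs off the result of the first, only rows produced by the first left join are matched against the third table. Note that ORDER is a reserved word in SQL, so the table name has to be quoted (double quotes in standard SQL, brackets in SQL Server, backticks in MySQL).
SELECT *
FROM SALES S
LEFT JOIN "ORDER" O ON S.ID = O.ID
LEFT JOIN EMPLOYEE E ON O.TAG = E.TAG;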
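As a quick check of how the chained left joins behave, the following sketch runs them on an in-memory SQLite database from Python. The sample rows are invented for illustration, and the order table is quoted because ORDER is a reserved word.

```python
import sqlite3

# Invented sample rows; column names follow the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
create table sales (id integer, price integer);
create table "order" (id integer, tag text);
create table employee (tag text, yearsWorked integer);
insert into sales values (1, 100), (2, 200), (3, 300);
insert into "order" values (1, 'a'), (2, 'b');
insert into employee values ('a', 5);
""")

rows = conn.execute("""
select s.id, s.price, o.tag, e.yearsWorked
from sales s
left join "order" o on s.id = o.id
left join employee e on o.tag = e.tag
order by s.id
""").fetchall()

for row in rows:
    print(row)
```

Every sales row is kept. A sale without an order gets NULL for both tag and yearsWorked, and a sale whose order has no matching employee gets NULL for yearsWorked only.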
I hope this works.
SELECT
*
FROM
Order
LEFT JOIN Sales
ON Order.ID = Sales.ID
LEFT JOIN Employee
ON Order.tag = Employee.tag

SQL Server Circular Query

I have 4 tables and want to fetch records from all four and aggregate the values.
I have these tables
I am expecting this output
but I am getting this output, which looks like a Cartesian product
The expenses and allocation values are being multiplied
Here is my query
select
a.NAME, b.P_NAME,
sum(a.DURATION) DURATION,
sum(b.[EXP]) EXPEN
from
(select
e.ID, a.P_ID, e.NAME, a.DURATION DURATION
from
EMPLOYEE e
inner join
ALLOCATION a ON e.ID = a.E_ID) a
inner join
(select
p.P_ID, e.E_ID, p.P_NAME, e.amt [EXP]
from
PROJECT p
inner join
EXPENSES e ON p.P_ID = e.P_ID) b ON a.ID = b.E_ID
and a.P_ID = b.P_ID
group by
a.NAME, b.P_NAME
Can anyone suggest how to fix this?
The following should work:
SELECT e.NAME,p.P_NAME,COALESCE(d.Duration,0) AS DURATION,COALESCE(exp.Expen,0) AS EXPEN
FROM
Employee e
CROSS JOIN
Project p
LEFT JOIN
(SELECT E_ID,P_ID,SUM(Duration) as Duration FROM Allocation
GROUP BY E_ID,P_ID) d
ON
e.ID = d.E_ID and
p.P_ID = d.P_ID
LEFT JOIN
(SELECT E_ID,P_ID,SUM(AMT) as Expen FROM Expenses
GROUP BY E_ID,P_ID) exp
ON
e.ID = exp.E_ID and
p.P_ID = exp.P_ID
WHERE
d.E_ID is not null or
exp.E_ID is not null
I've tried to write a query that will produce results where e.g. there are rows in Expenses but no rows in Allocations (or vice versa) for some particular E_ID,P_ID combination.
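A small sketch of that behaviour, run on an in-memory SQLite database from Python. The column names are taken from the question's query, the sample rows are invented, and the alias x stands in for exp to stay clear of the EXP function name.

```python
import sqlite3

# Invented sample data. Ann has allocations and expenses on Alpha,
# expenses only on Beta; Bob has an allocation only on Beta.
conn = sqlite3.connect(":memory:")
conn.executescript("""
create table employee (id integer primary key, name text);
create table project (p_id integer primary key, p_name text);
create table allocation (e_id integer, p_id integer, duration integer);
create table expenses (e_id integer, p_id integer, amt integer);
insert into employee values (1, 'Ann'), (2, 'Bob');
insert into project values (10, 'Alpha'), (20, 'Beta');
insert into allocation values (1, 10, 5), (1, 10, 3), (2, 20, 4);
insert into expenses values (1, 10, 100), (1, 10, 50), (1, 10, 25), (1, 20, 70);
""")

rows = conn.execute("""
select e.name, p.p_name,
       coalesce(d.duration, 0) as duration,
       coalesce(x.expen, 0) as expen
from employee e
cross join project p
-- each derived table has at most one row per (e_id, p_id),
-- so the joins cannot multiply rows
left join (select e_id, p_id, sum(duration) as duration
           from allocation group by e_id, p_id) d
       on d.e_id = e.id and d.p_id = p.p_id
left join (select e_id, p_id, sum(amt) as expen
           from expenses group by e_id, p_id) x
       on x.e_id = e.id and x.p_id = p.p_id
where d.e_id is not null or x.e_id is not null
order by e.name, p.p_name
""").fetchall()

for row in rows:
    print(row)
```

Combinations with data on only one side still appear, with 0 for the missing side, while the Bob and Alpha pair, which has neither, is filtered out by the WHERE clause.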
Use LEFT JOINs in the SELECT query, joining every table on the common id.
Hi, I got the result I wanted after modifying my query.
The query above also works like a charm, but I made a small change to my original query instead.
You just have to group by inside the inner queries and then join them; the result then no longer shows a Cartesian product.
Here is the updated query
select a.NAME,b.P_NAME,sum(a.DURATION) DURATION,sum(b.[EXP]) EXPEN from
(select e.ID,a.P_ID, e.NAME,sum(a.DURATION) DURATION from EMPLOYEE e inner join ALLOCATION a
ON e.ID=a.E_ID group by e.ID,e.NAME,a.P_ID) a
inner join
(select p.P_ID,e.E_ID, p.P_NAME,sum(e.amt) [EXP] from PROJECT p inner join EXPENSES e
ON p.P_ID=e.P_ID group by p.P_ID,p.P_NAME,e.E_ID) b
ON a.ID=b.e_ID and a.P_ID=b.P_ID group by a.NAME,b.P_NAME
This shows the correct output.

SQL Inner join division

I have an issue with my inner join division below. Oracle keeps reporting a missing right parenthesis even though I have already closed it. I need to get the names of the patients who have collected all items.
Select P.name
From ((((Select Patientid From Patient) As P
Inner Join (Select Accountno, Patientid From Account) As A1
on P.PatientID = A1.PatientID)
Inner Join (Select Accountno, Itemno From AccountType) As Al
On A1.Accountno = Al.Accountno)
Inner Join (Select Itemno From Item) As I
On Al.Itemno = I.Itemno)
Group By Al.Itemno
Having Count(*) >= (Select Count(*) FROM AccountType);
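Here's a simpler approach that I believe is essentially equivalent. Note that Oracle does not accept the AS keyword before a table alias, which is the likely cause of the missing right parenthesis error: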
select a.name
from Patient a
inner join Account b on a.PatientID = b.PatientID
inner join AccountType c on b.Accountno = c.Accountno
inner join Item d on c.Itemno = d.Itemno
group by c.Accountno, a.name
having Count(*) >= (Select Count(*) FROM AccountType);
This approach is a bit simpler. It also has the added benefit of being much more likely to use indexes on the tables: when you join derived tables built from inline subqueries, the optimizer may not be able to use the indexes that exist on the underlying physical tables.
I also usually alias table names using sequential letters -- 'a', 'b', 'c', 'd' as you can see. I find that when I'm writing complicated queries it makes it easier for me to follow. 'a' is the first table in the join, 'b' is the second, etc.
It sounds like you just want
SELECT p.name
FROM patient p
INNER JOIN account a ON (a.patientID = p.patientID)
INNER JOIN accountType accTyp ON (accTyp.accountNo = a.accountNo)
INNER JOIN item i ON (i.itemNo = accTyp.itemNo)
GROUP BY p.patientID, p.name
HAVING COUNT(DISTINCT i.itemNo) = (SELECT COUNT(*)
FROM item);
Note that having an alias of A1 and an alias of Al is quite confusing. You want to pick more meaningful and more distinguishing aliases.
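The "patients who have collected all items" requirement is a relational division. One common way to sketch it is to group by patient and compare the number of distinct items per patient with the total number of items. The example below runs on an in-memory SQLite database from Python; the table layout follows the question, but the sample rows are invented and the real column types are an assumption.

```python
import sqlite3

# Invented sample data: Ann has collected items 1 and 2 (item 1 twice,
# through two accounts); Bob has collected item 1 only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
create table patient (patientid integer primary key, name text);
create table account (accountno integer primary key, patientid integer);
create table accounttype (accountno integer, itemno integer);
create table item (itemno integer primary key);
insert into patient values (1, 'Ann'), (2, 'Bob');
insert into account values (100, 1), (101, 1), (200, 2);
insert into accounttype values (100, 1), (101, 1), (101, 2), (200, 1);
insert into item values (1), (2);
""")

rows = conn.execute("""
select p.name
from patient p
inner join account a on a.patientid = p.patientid
inner join accounttype t on t.accountno = a.accountno
group by p.patientid, p.name
-- distinct matters: the same item reached through two accounts counts once
having count(distinct t.itemno) = (select count(*) from item)
""").fetchall()

print(rows)
```

Only Ann is returned, because she is the only patient whose distinct item count equals the number of rows in item.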

SQL query on table that has 2 columns with a foreign id on the same table

I have a table let's say in the form of: match(id, hometeam_id, awayteam_id) and team(id, name). How do I build my SQL query in order to get a result table in the form of (match_id, hometeam_name, awayteam_name), since they both (hometeam_id, awayteam_id) reference the same table (team)?
Thank you
You would just join to the team table multiple times:
SELECT m.id AS match_id, home.name AS hometeam_name, away.name AS awayteam_name
FROM match m
INNER JOIN team away ON away.id = m.awayteam_id
INNER JOIN team home ON home.id = m.hometeam_id
You join to the team table twice.
select matchdate, t1.teamname, t2.teamname from
match m
join team t1 on m.hometeamId = t1.teamid
join team t2 on m.awayteamid = t2.teamid
Join to the team table twice, once for the Home Team, and again for the Away Team, using aliases after the table names in the query:
select m.id as match_id, homeTeam.name as HomeTeamName, awayTeam.name as AwayTeamName
from
team homeTeam join
match m on m.hometeam_id = homeTeam.id join
team awayTeam on awayTeam.id = m.awayteam_id
select m.id, h.name as hometeam_name, a.name as awayteam_name
from match m left join team h on m.hometeam_id = h.id
left join team a on m.awayteam_id = a.id
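The double join can be tried out with a minimal sketch on an in-memory SQLite database from Python. The schema follows the question, match(id, hometeam_id, awayteam_id) and team(id, name); the team names and matches are invented, and the match table is quoted because MATCH is a keyword in SQLite.

```python
import sqlite3

# Invented sample data for the schema described in the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
create table team (id integer primary key, name text);
create table "match" (id integer primary key, hometeam_id integer, awayteam_id integer);
insert into team values (1, 'Lions'), (2, 'Tigers'), (3, 'Bears');
insert into "match" values (1, 1, 2), (2, 3, 1);
""")

rows = conn.execute("""
select m.id as match_id, h.name as hometeam_name, a.name as awayteam_name
from "match" m
-- the same team table is joined twice under two different aliases
inner join team h on h.id = m.hometeam_id
inner join team a on a.id = m.awayteam_id
order by m.id
""").fetchall()

for row in rows:
    print(row)
```

Each alias acts as an independent copy of team, so one row of match can pick up two different team names.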