WHERE clause with JOIN SQL - sql

SELECT AVG(score) AS avg_score, st.name
FROM firstTable AS ft
LEFT JOIN secondTable AS st
ON ft.dog_id = st.dog_id
WHERE (SELECT COUNT(ft.dog_id) FROM firstTable) > 1
GROUP BY dog_id
The where clause doesnt seem to do anything. Why is that? - I'm essentially trying to output the average score only to the dogs that appear more than once in the first table

You should use an INNER join since you want only dogs that match in both tables and add the condition in the HAVING clause:
SELECT AVG(ft.score) AS avg_score, st.name
FROM secondTable AS st INNER JOIN firstTable AS ft
ON ft.dog_id = st.dog_id
GROUP BY st.dog_id
HAVING COUNT(*) > 1;

Related

ORACLE SQL check if the row number of the table is n, then perform a join

TABLE student: ID, ID2, NAME, AGE
TABLE class: ID, CLASS_NAME, some other columns
TABLE school: ID2, some other columns.
I am trying to perform below in Oracle SQL:
If the count of the records in TABLE "student" with age>5, is 1,
join the student table with "CLASS" table by "ID", else, join the student table with "school" table by "ID2".
I found I cannot put count in where clause. Can someone help?
I would use window functions:
select s.*, . . .
from (select s.*, sum(case when age > 5 then 1 else 0 end) over () as cnt5
from students s
) s left join
class c
on c.id = s.id and cnt5 = 1 left join
school sch
on sch.id2 = s.id2 and cnt5 <> 1
where c.id is not null or sch.id is not null
you can use left join with case
select s.*, case when age > 5 then COALESCE (c.ID,scl.ID2) as id
from student s
left join class c on s.ID=c.ID
left join school scl on s.ID2=scl.ID2
As Gordon said window functions are an option, but this is also a rare occasion where you can deliberately perform a cross join:
select s.*
,...
from students s
left join
(
select count(*) as RECORD_COUNT
from students
where age > 5
) sub
on 1 = 1
left join class c
on s.id = c.id
and sub.record_count = 1
left join school h
on s.id2 = h.id2
and c.id is null

Query extensibility with WHERE EXISTS with a large table

The following query is designed to find the number of people who went to a hospital, the total number of people who went to a hospital and the divide those two to find a percentage. The table Claims is two million plus rows and does have the correct non-clustered index of patientid, admissiondate, and dischargdate. The query runs quickly enough but I'm interested in how I could make it more usable. I would like to be able to add another code in the line where (hcpcs.hcpcs ='97001') and have the change in percentRehabNotHomeHealth be relfected in another column. Is there possible without writing a big, fat join statement where I join the results of the two queries together? I know that by adding the extra column the math won't look right, but I'm not worried about that at the moment. desired sample output: http://imgur.com/BCLrd
database schema
select h.hospitalname
,count(*) as visitCounts
,hospitalcounts
,round(count(*)/cast(hospitalcounts as float) *100,2) as percentRehabNotHomeHealth
from Patient p
inner join statecounties as sc on sc.countycode = p.countycode
and sc.statecode = p.statecode
inner join hospitals as h on h.npi=p.hospitalnpi
inner join
--this join adds the hospitalCounts column
(
select h.hospitalname, count(*) as hospitalCounts
from hospitals as h
inner join patient as p on p.hospitalnpi=h.npi
where p.statecode='21' and h.statecode='21'
group by h.hospitalname
) as t on t.hospitalname=h.hospitalname
--this where exists clause gives the visitCounts column
where h.stateCode='21' and p.statecode='21'
and exists
(
select distinct p2.patientid
from Patient as p2
inner join Claims as c on c.patientid = p2.patientid
and c.admissiondate = p2.admissiondate
and c.dischargedate = p2.dischargedate
inner join hcpcs on hcpcs.hcpcs=c.hcpcs
inner join hospitals as h on h.npi=p2.hospitalnpi
where (hcpcs.hcpcs ='97001' or hcpcs.hcpcs='9339' or hcpcs.hcpcs='97002')
and p2.patientid=p.patientid
)
and hospitalcounts > 10
group by h.hospitalname, t.hospitalcounts
having count(*)>10
You might look into CTE (Common Table Expressions) to get what you need. It would allow you to get summarized data and join that back to the detail on a common key. As an example I modified your join on the subquery to be a CTE.
;with hospitalCounts as (
select h.hospitalname, count(*) as hospitalCounts
from hospitals as h
inner join patient as p on p.hospitalnpi=h.npi
where p.statecode='21' and h.statecode='21'
group by h.hospitalname
)
select h.hospitalname
,count(*) as visitCounts
,hospitalcounts
,round(count(*)/cast(hospitalcounts as float) *100,2) as percentRehabNotHomeHealth
from Patient p
inner join statecounties as sc on sc.countycode = p.countycode
and sc.statecode = p.statecode
inner join hospitals as h on h.npi=p.hospitalnpi
inner join hospitalCounts on t.hospitalname=h.hospitalname
--this where exists clause gives the visitCounts column
where h.stateCode='21' and p.statecode='21'
and exists
(
select p2.patientid
from Patient as p2
inner join Claims as c on c.patientid = p2.patientid
and c.admissiondate = p2.admissiondate
and c.dischargedate = p2.dischargedate
inner join hcpcs on hcpcs.hcpcs=c.hcpcs
inner join hospitals as h on h.npi=p2.hospitalnpi
where (hcpcs.hcpcs ='97001' or hcpcs.hcpcs='9339' or hcpcs.hcpcs='97002')
and p2.patientid=p.patientid
)
and hospitalcounts > 10
group by h.hospitalname, t.hospitalcounts
having count(*)>10

How to have SQL INNER JOIN accept null results

I have the following query:
SELECT TOP 25 CLIENT_ID_MD5, COUNT(CLIENT_ID_MD5) TOTAL
FROM dbo.amazonlogs
GROUP BY CLIENT_ID_MD5
ORDER BY COUNT(*) DESC;
Which returns:
283fe255cbc25c804eb0c05f84ee5d52 864458
879100cf8aa8b993a8c53f0137a3a176 126122
06c181de7f35ee039fec84579e82883d 88719
69ffb6c6fd5f52de0d5535ce56286671 68863
703441aa63c0ac1f39fe9e4a4cc8239a 47434
3fd023e7b2047e78c6742e2fc5b66fce 45350
a8b72ca65ba2440e8e4028a832ec2160 39524
...
I want to retrieve the corresponding client name (FIRM) using the returned MD5 from this query, so a row might look like:
879100cf8aa8b993a8c53f0137a3a176 126122 Burger King
So I made this query:
SELECT a.CLIENT_ID_MD5, COUNT(a.CLIENT_ID_MD5) TOTAL, c.FIRM
FROM dbo.amazonlogs a
INNER JOIN dbo.customers c
ON c.CLIENT_ID_MD5 = a.CLIENT_ID_MD5
GROUP BY a.CLIENT_ID_MD5, c.FIRM
ORDER BY COUNT(*) DESC;
This returns something like:
879100cf8aa8b993a8c53f0137a3a176 126122 Burger King
06c181de7f35ee039fec84579e82883d 88719 McDonalds
703441aa63c0ac1f39fe9e4a4cc8239a 47434 Wendy's
3fd023e7b2047e78c6742e2fc5b66fce 45350 Tim Horton's
Which works, except I need to return an empty value for c.FIRM if there is no corresponding FIRM for a given MD5. For example:
879100cf8aa8b993a8c53f0137a3a176 126122 Burger King
06c181de7f35ee039fec84579e82883d 88719 McDonalds
69ffb6c6fd5f52de0d5535ce56286671 68863
703441aa63c0ac1f39fe9e4a4cc8239a 47434 Wendy's
3fd023e7b2047e78c6742e2fc5b66fce 45350 Tim Horton's
How should I modify the query to still return a row even if there is no corresponding c.FIRM?
Replace INNER JOIN with LEFT JOIN
use LEFT JOIN instead of INNER JOIN
Instead of doing an INNER join, you should do a LEFT OUTER join:
SELECT
a.CLIENT_ID_MD5,
COUNT(a.CLIENT_ID_MD5) TOTAL,
ISNULL(c.FIRM,'')
FROM
dbo.amazonlogs a LEFT OUTER JOIN
dbo.customers c ON c.CLIENT_ID_MD5 = a.CLIENT_ID_MD5
GROUP BY
a.CLIENT_ID_MD5,
c.FIRM
ORDER BY COUNT(0) DESC
http://www.w3schools.com/sql/sql_join.asp
An inner join excludes NULLs; you want a LEFT OUTER join.
SELECT a.CLIENT_ID_MD5, COUNT(a.CLIENT_ID_MD5) TOTAL, IsNull(c.FIRM, 'Unknown') as Firm
FROM dbo.amazonlogs a
LEFT JOIN dbo.customers c ON c.CLIENT_ID_MD5 = a.CLIENT_ID_MD5
GROUP BY a.CLIENT_ID_MD5, c.FIRM ORDER BY COUNT(*) DESC;
This will give you a value of "Unknown" when records in the customers table don't exist. You could obviously drop that part and just return c.FIRM if you want to have actual nulls instead.
Change your INNER JOIN to an OUTER JOIN...
SELECT a.CLIENT_ID_MD5, COUNT(a.CLIENT_ID_MD5) TOTAL, c.FIRM
FROM dbo.amazonlogs a
LEFT OUTER JOIN dbo.customers c
ON c.CLIENT_ID_MD5 = a.CLIENT_ID_MD5
GROUP BY a.CLIENT_ID_MD5, c.FIRM
ORDER BY COUNT(*) DESC;
WITH amazonlogs_Tallies
AS
(
SELECT a.CLIENT_ID_MD5, COUNT(a.CLIENT_ID_MD5) TOTAL
FROM dbo.amazonlogs a
GROUP
BY a.CLIENT_ID_MD5
),
amazonlogs_Tallies_Firms
AS
(
SELECT a.CLIENT_ID_MD5, a.TOTAL, c.FIRM
FROM amazonlogs_Tallies a
INNER JOIN dbo.customers c
ON c.CLIENT_ID_MD5 = a.CLIENT_ID_MD5
)
SELECT CLIENT_ID_MD5, TOTAL, FIRM
FROM amazonlogs_Tallies_Firms
UNION
SELECT CLIENT_ID_MD5, TOTAL, '{{NOT_KNOWN}}'
FROM amazonlogs_Tallies
EXCEPT
SELECT CLIENT_ID_MD5, TOTAL, '{{NOT_KNOWN}}'
FROM amazonlogs_Tallies_Firms;

SQL Inner join division

I have issue with my inner join division below. From my oracle, it keep prompt me missing right parenthesis when I have already close it. I'll need to get the names of the patient who have collected all items.
Select P.name
From ((((Select Patientid From Patient) As P
Inner Join (Select Accountno, Patientid From Account) As A1
on P.PatientID = A1.PatientID)
Inner Join (Select Accountno, Itemno From AccountType) As Al
On A1.Accountno = Al.Accountno)
Inner Join (Select Itemno From Item) As I
On Al.Itemno = I.Itemno)
Group By Al.Itemno
Having Count(*) >= (Select Count(*) FROM AccountType);
Here's a simpler approach that I believe is essentially equivalent:
select a.name
from Patient a
inner join Account b on a.PatientID = b.PatientID
inner join AccountType c on b.Accountno = c.Accountno
inner join Item d on c.Itemno = d.Itemno
group by c.Accountno, a.name
having Count(*) >= (Select Count(*) FROM AccountType);
This approach is a bit simpler. It has the added benefit of being much more likely to use indexes on the tables -- if you do joins between what are essentially 'join tables' in memory, you don't get the benefit of the indexes that exist for the physical tables in memory.
I also usually alias table names using sequential letters -- 'a', 'b', 'c', 'd' as you can see. I find that when I'm writing complicated queries it makes it easier for me to follow. 'a' is the first table in the join, 'b' is the second, etc.
It sounds like you just want
SELECT p.name
FROM patient p
INNER JOIN account a ON (a.patientID = p.patientID)
INNER JOIN accountType accTyp ON (accTyp.accountNo = a.accountNo)
INNER JOIN item i ON (i.itemNo = accTyp.itemNo)
GROUP BY accTyp.itemNo
HAVING COUNT(*) = (SELECT COUNT(*)
FROM accountType);
Note that having an alias of A1 and an alias of Al is quite confusing. You want to pick more meaningful and more distinguishing aliases.

SQL query on table that has 2 columns with a foreign id on the same table

I have a table let's say in the form of: match(id, hometeam_id, awayteam_id) and team(id, name). How do I build my SQL query in order to get a result table in the form of (match_id, hometeam_name, awayteam_name), since they both (hometeam_id, awayteam_id) reference the same table (team)?
Thank you
You would just join to the team table multiple times:
SELECT m.id, away.name, home.name
FROM match m
INNER JOIN team away ON away.id = m.awayteam_id
INNER JOIN team home ON home.id = m.hometeam_id
You join to the team table twice.
select matchdate, t1.teamname, t2,teamname from
match m
join team t1 on m.hometeamId = t1.teamid
join team t2 on m.awayteamid = t2.teamid
Join to the team table twice, once for the Home Team, and again for the Away Team, using aliases after the table names in the query:
select m.match_id, homeTeam.name as HomeTeamName, awayTeam.name as AwayTeamName
from
team homeTeam join
match m on m.hometeam_id = homeTeam.hometeam_id join
team awayTeam on awayTeam.hometeam_id = m.awayteam_id
select m.id, h.name as hometeam_name, a.name as awayteam_name
from match m left join team h on m.hometeam_id = h.id
left join team a on m.awayteam_id = a.id