SQL Inner join division - sql

I have issue with my inner join division below. From my oracle, it keep prompt me missing right parenthesis when I have already close it. I'll need to get the names of the patient who have collected all items.
Select P.name
From ((((Select Patientid From Patient) As P
Inner Join (Select Accountno, Patientid From Account) As A1
on P.PatientID = A1.PatientID)
Inner Join (Select Accountno, Itemno From AccountType) As Al
On A1.Accountno = Al.Accountno)
Inner Join (Select Itemno From Item) As I
On Al.Itemno = I.Itemno)
Group By Al.Itemno
Having Count(*) >= (Select Count(*) FROM AccountType);

Here's a simpler approach that I believe is essentially equivalent:
select a.name
from Patient a
inner join Account b on a.PatientID = b.PatientID
inner join AccountType c on b.Accountno = c.Accountno
inner join Item d on c.Itemno = d.Itemno
group by c.Accountno, a.name
having Count(*) >= (Select Count(*) FROM AccountType);
This approach is a bit simpler. It has the added benefit of being much more likely to use indexes on the tables -- if you do joins between what are essentially 'join tables' in memory, you don't get the benefit of the indexes that exist for the physical tables in memory.
I also usually alias table names using sequential letters -- 'a', 'b', 'c', 'd' as you can see. I find that when I'm writing complicated queries it makes it easier for me to follow. 'a' is the first table in the join, 'b' is the second, etc.

It sounds like you just want
SELECT p.name
FROM patient p
INNER JOIN account a ON (a.patientID = p.patientID)
INNER JOIN accountType accTyp ON (accTyp.accountNo = a.accountNo)
INNER JOIN item i ON (i.itemNo = accTyp.itemNo)
GROUP BY accTyp.itemNo
HAVING COUNT(*) = (SELECT COUNT(*)
FROM accountType);
Note that having an alias of A1 and an alias of Al is quite confusing. You want to pick more meaningful and more distinguishing aliases.

Related

Filter results through two values in the same column if one is multiple

I am trying to list customers who have purchased from ProductCategoryKey 1, and if those customers have purchased ProductCategoryKEY 3 more than once show only one row for that customer. My code so far only filters through PC Key 1. How can I say 'of those who have purchased PC Key 1, how many have purchased PC Key 3 at least two times and only show that person once?'
SELECT DISTINCT C.FirstName, C.LastName, PC.EnglishProductCategoryName
FROM dbo.FactInternetSales AS I
INNER JOIN dbo.DimCustomer AS C
ON I.CustomerKey = C.CustomerKey
INNER JOIN dbo.DimProduct AS P
ON I.ProductKey = P.ProductKey
INNER JOIN dbo.DimProductSubcategory AS SC
ON P.ProductSubcategoryKey = SC.ProductSubcategoryKey
INNER JOIN dbo.DimProductCategory as PC
ON SC.ProductCategoryKey = PC.ProductCategoryKey
WHERE PC.ProductCategoryKey = 1
ORDER BY C.LastName ASC, FirstName ASC;
Using Microsoft SQL Server 2019
ADventureWorksDW2017
I'm a big fan of using temp tables, or variable tables, as a way to further filter data. The first query gets all customers with a PC Key of 1 and 3, and inserts those records into a temp table. Then, the second query uses aggregation and a GROUP BY clause to count all rows where the PC Key is 3, followed by a HAVING clause where the count is > 2. The GROUP BY clause in the second query effectively creates a distinct record-set for your last requirement.
I break up the query into two queries for readability. You could also use nested sub queries to achieve the same result.
SELECT DISTINCT C.FirstName, C.LastName, PC.EnglishProductCategoryName,
PC.ProductCategoryKey
into #temp1
FROM dbo.FactInternetSales AS I
INNER JOIN dbo.DimCustomer AS C
ON I.CustomerKey = C.CustomerKey
INNER JOIN dbo.DimProduct AS P
ON I.ProductKey = P.ProductKey
INNER JOIN dbo.DimProductSubcategory AS SC
ON P.ProductSubcategoryKey = SC.ProductSubcategoryKey
INNER JOIN dbo.DimProductCategory as PC
ON SC.ProductCategoryKey = PC.ProductCategoryKey
WHERE PC.ProductCategoryKey = 1
AND PC.ProductCategoryKey = 3
ORDER BY C.LastName ASC, FirstName ASC
SELECT FirstName, LastName, EnglishProductCategoryName, COUNT(*)
FROM #temp1
WHERE ProductCategoryKey = 3
GROUP BY FirstName, LastName, EnglishProductCategoryName
HAVING COUNT(*) > 2
I would use a CTE over a temp table unless you have a performance issue. This way you get a single execution plan which is often better (except when its not) than splitting into two execution plans (as using a temp table does).
Also you shouldn't need a distinct unless you have your logic wrong. A correctly grouped query very rarely needs a distinct.
WITH cte AS (
SELECT C.FirstName, C.LastName, PC.EnglishProductCategoryName, PC.ProductCategoryKey
FROM dbo.FactInternetSales AS I
INNER JOIN dbo.DimCustomer AS C ON I.CustomerKey = C.CustomerKey
INNER JOIN dbo.DimProduct AS P ON I.ProductKey = P.ProductKey
INNER JOIN dbo.DimProductSubcategory AS SC ON P.ProductSubcategoryKey = SC.ProductSubcategoryKey
INNER JOIN dbo.DimProductCategory AS PC ON SC.ProductCategoryKey = PC.ProductCategoryKey
WHERE PC.ProductCategoryKey = 1 AND PC.ProductCategoryKey = 3
)
SELECT FirstName, LastName, EnglishProductCategoryName, COUNT(*)
FROM cte
WHERE ProductCategoryKey = 3
GROUP BY FirstName, LastName, EnglishProductCategoryName
HAVING COUNT(*) > 2
ORDER BY LastName ASC, FirstName ASC;
I am assuming the logic is correct, because without sample data and expected results its hard to know.

Returning multiple aggregated columns from Subquery

I am trying to extend an existing query by aggregating some rows from another table.
It works when I only return one column like this:
Select DISTINCT
Contracts.id,
Contracts.beginTime,
Contracts.endTime,
Suppliers.name
(SELECT COUNT(p.id) from production as p where p.id_contract = Contracts.id)
FROM Contracts
LEFT JOIN Suppliers on Contracts.id = Suppliers.id_contract
Then I tried to add another column for the aggregated volume:
Select DISTINCT
Contracts.id,
Contracts.beginTime,
Contracts.endTime,
Suppliers.name
(SELECT COUNT(p.id), SUM(p.volume) from production as p where p.id_contract = Contracts.id)
FROM Contracts
LEFT JOIN Suppliers on Contracts.id = Suppliers.id_contract
However, this returns the following error:
Only one expression can be specified in the select list when the subquery is not introduced with EXISTS.
I experimented a bit with the EXISTS keyword, but couldn't figure out how to make it work. Also I'm not sure whether this is the way to go in my case.
The desired output would be like so:
contract1Id, supplierInfoContract1, nrItemsContract1, sumVolumeContract1
contract2Id, supplierInfoContract2, nrItemsContract2, sumVolumeContract2
Instead of using DISTINCT and subqueries, use GROUP BY and normal joins to get the aggregates. And always use aliases, it will make your life easier:
SELECT
c.id,
c.beginTime,
c.endTime,
s.name,
COUNT(p.id) prod_count,
SUM(p.volume) prod_vol
FROM Contracts c
LEFT JOIN production p on p.id_contract = c.id
LEFT JOIN Suppliers s on c.id = s.id_contract
GROUP BY c.id, c.beginTime, c.endTime, s.name;
Another option is to APPLY the grouped up subquery:
SELECT DISTINCT
c.id,
c.beginTime,
c.endTime,
s.name,
p.prod_count,
p.prod_vol
FROM Contracts c
LEFT JOIN Suppliers s on c.id = s.id_contract
OUTER APPLY (
SELECT
COUNT(p.id) prod_count,
SUM(p.volume) prod_vol
FROM production p WHERE p.id_contract = c.id
GROUP BY ()
) p;
You can also use CROSS APPLY and leave out the GROUP BY (), this uses a scalar aggregate and returns 0 instead of null for no rows.
One last point: DISTINCT in a joined query is a bit of a code smell, it usually indicates the query writer wasn't thinking too hard about what the joined tables returned, and just wanted to get rid of duplicate rows.
You should use it like below:
Select DISTINCT Contracts.id, Contracts.beginTime, Contracts.endTime, Suppliers.name
(SELECT COUNT(p.id) from production as p where p.id_contract = Contracts.id) as CNT,
(SELECT SUM(p.volume) from production as p where p.id_contract = Contracts.id) as VOLUME
FROM Contracts
LEFT JOIN Suppliers on Contracts.id = Suppliers.id_contract
I guess you can try to rework your query as
SELECT X.ID,X.beginTime,X.endTime,X.name,CR.CNTT,CR.TOTAL_VOLUME
FROM
(
Select DISTINCT Contracts.id, Contracts.beginTime, Contracts.endTime, Suppliers.name
FROM Contracts
LEFT JOIN Suppliers on Contracts.id = Suppliers.id_contract
)X
CROSS APPLY
(
SELECT COUNT(p.id)AS CNTT,SUM(p.volume) AS TOTAL_VOLUME
from production as p where p.id_contract = X.id
)CR
I reworked you query slightly by separating the subqueries.
Select DISTINCT
Contracts.id,
Contracts.beginTime,
Contracts.endTime,
Suppliers.name,
(SELECT COUNT(p.id) from production as p where p.id_contract = Contracts.id) count_id,
(SELECT SUM(p.volume) from production as p where p.id_contract = Contracts.id) sum_volume
FROM Contracts
LEFT JOIN Suppliers on Contracts.id = Suppliers.id_contract

Multiple joins with group by (Sum)

When I using multiple JOIN, I hope to get the sum of some column in joined tables.
SELECT
A.*,
SUM(C.purchase_price) AS purcchase_total,
SUM(D.sales_price) AS sales_total,
B.user_name
FROM
PROJECT AS A
LEFT JOIN
USER AS B ON A.user_idx = B.user_idx
LEFT JOIN
PURCHASE AS C ON A.project_idx = C.project_idx
LEFT JOIN
SALES AS D ON A.project_idx = D.project_idx
GROUP BY
????
You need to use subquery as follows:
SELECT A.project_idx,
a.project_name,
A.project_category,
sum(C.purchase_price) AS purcchase_total,
sum(D.sales_price) as sales_total,
B.user_name
FROM PROJECT AS A
LEFT JOIN USER AS B ON A.user_idx = B.user_idx
LEFT JOIN (select project_idx, sum(purchase_price) as purchase_price
from PURCHASE group by project_idx ) AS C ON A.project_idx = C.project_idx
LEFT JOIN (select project_idx, sum(sale_price) as sale_price
from SALES group by project_idx) AS D ON A.project_idx = D.project_idx
I am not sure but you can use inner join of project with user instead of left join.
SELECT A.project_idx,
a.project_name,
A.project_category,
purcchase_total,
sales_total,
B.user_name
FROM PROJECT AS A
LEFT JOIN USER AS B ON A.user_idx = B.user_idx
LEFT JOIN (select project_idx, sum(purchase_price) as purchase_total
from PURCHASE group by project_idx ) AS C ON A.project_idx = C.project_idx
LEFT JOIN (select project_idx, sum(sale_price) as sale_total
from SALES group by project_idx) AS D ON A.project_idx = D.project_idx
This is working correctly on MS-SQL Server.
Thanks to Popeye
You are attempting to aggregate over two unrelated dimensions, and that throws off all the calculations.
Correlated subqueries are an alternative:
SELECT p.*,
(SELECT SUM(pu.purchase_price)
FROM PURCHASE pu
WHERE p.project_idx = pu.project_idx
) as purchase_total,
(SELECT SUM(s.sales_price)
FROM SALES s
WHERE p.project_idx = s.project_idx
) as sales_total,
u.user_name
FROM PROJECT p LEFT JOIN
USER u
ON p.user_idx = u.user_idx ;
Note that this uses meaningful table aliases so the query is easier to read. Arbitrary letters are really no better (and perhaps worse) than using the entire table name.
Correlated subqueries avoid the outer aggregation as well -- and let you select all the columns from the first table, which is what you want. They also often have better performance with the right indexes.

Query extensibility with WHERE EXISTS with a large table

The following query is designed to find the number of people who went to a hospital, the total number of people who went to a hospital and the divide those two to find a percentage. The table Claims is two million plus rows and does have the correct non-clustered index of patientid, admissiondate, and dischargdate. The query runs quickly enough but I'm interested in how I could make it more usable. I would like to be able to add another code in the line where (hcpcs.hcpcs ='97001') and have the change in percentRehabNotHomeHealth be relfected in another column. Is there possible without writing a big, fat join statement where I join the results of the two queries together? I know that by adding the extra column the math won't look right, but I'm not worried about that at the moment. desired sample output: http://imgur.com/BCLrd
database schema
select h.hospitalname
,count(*) as visitCounts
,hospitalcounts
,round(count(*)/cast(hospitalcounts as float) *100,2) as percentRehabNotHomeHealth
from Patient p
inner join statecounties as sc on sc.countycode = p.countycode
and sc.statecode = p.statecode
inner join hospitals as h on h.npi=p.hospitalnpi
inner join
--this join adds the hospitalCounts column
(
select h.hospitalname, count(*) as hospitalCounts
from hospitals as h
inner join patient as p on p.hospitalnpi=h.npi
where p.statecode='21' and h.statecode='21'
group by h.hospitalname
) as t on t.hospitalname=h.hospitalname
--this where exists clause gives the visitCounts column
where h.stateCode='21' and p.statecode='21'
and exists
(
select distinct p2.patientid
from Patient as p2
inner join Claims as c on c.patientid = p2.patientid
and c.admissiondate = p2.admissiondate
and c.dischargedate = p2.dischargedate
inner join hcpcs on hcpcs.hcpcs=c.hcpcs
inner join hospitals as h on h.npi=p2.hospitalnpi
where (hcpcs.hcpcs ='97001' or hcpcs.hcpcs='9339' or hcpcs.hcpcs='97002')
and p2.patientid=p.patientid
)
and hospitalcounts > 10
group by h.hospitalname, t.hospitalcounts
having count(*)>10
You might look into CTE (Common Table Expressions) to get what you need. It would allow you to get summarized data and join that back to the detail on a common key. As an example I modified your join on the subquery to be a CTE.
;with hospitalCounts as (
select h.hospitalname, count(*) as hospitalCounts
from hospitals as h
inner join patient as p on p.hospitalnpi=h.npi
where p.statecode='21' and h.statecode='21'
group by h.hospitalname
)
select h.hospitalname
,count(*) as visitCounts
,hospitalcounts
,round(count(*)/cast(hospitalcounts as float) *100,2) as percentRehabNotHomeHealth
from Patient p
inner join statecounties as sc on sc.countycode = p.countycode
and sc.statecode = p.statecode
inner join hospitals as h on h.npi=p.hospitalnpi
inner join hospitalCounts on t.hospitalname=h.hospitalname
--this where exists clause gives the visitCounts column
where h.stateCode='21' and p.statecode='21'
and exists
(
select p2.patientid
from Patient as p2
inner join Claims as c on c.patientid = p2.patientid
and c.admissiondate = p2.admissiondate
and c.dischargedate = p2.dischargedate
inner join hcpcs on hcpcs.hcpcs=c.hcpcs
inner join hospitals as h on h.npi=p2.hospitalnpi
where (hcpcs.hcpcs ='97001' or hcpcs.hcpcs='9339' or hcpcs.hcpcs='97002')
and p2.patientid=p.patientid
)
and hospitalcounts > 10
group by h.hospitalname, t.hospitalcounts
having count(*)>10

Combine Two Tables in Select (SQL Server 2008)

If I have two tables, like this for example:
Table 1 (products)
id
name
price
agentid
Table 2 (agent)
userid
name
email
How do I get a result set from products that include the agents name and email, meaning that products.agentid = agent.userid?
How do I join for example SELECT WHERE price < 100?
Edited to support price filter
You can use the INNER JOIN clause to join those tables. It is done this way:
select p.id, p.name as ProductName, a.userid, a.name as AgentName
from products p
inner join agents a on a.userid = p.agentid
where p.price < 100
Another way to do this is by a WHERE clause:
select p.id, p.name as ProductName, a.userid, a.name as AgentName
from products p, agents a
where a.userid = p.agentid and p.price < 100
Note in the second case you are making a natural product of all rows from both tables and then filtering the result. In the first case you are directly filtering the result while joining in the same step. The DBMS will understand your intentions (regardless of the way you choose to solve this) and handle it in the fastest way.
This is a very rudimentary INNER JOIN:
SELECT
products.name AS productname,
price,
agent.name AS agentname
email
FROM
products
INNER JOIN agent ON products.agentid = agent.userid
I recommend reviewing basic JOIN syntax and concepts. Here's a link to Microsoft's documentation, though what you have above is pretty universal as standard SQL.
Note that the INNER JOIN here assumes every product has an associated agentid that isn't NULL. If there are NULL agentid in products, use LEFT OUTER JOIN instead to return even the products with no agent.
select p.name productname, p.price, a.name as agent_name, a.email
from products p
inner join agent a on (a.userid = p.agentid)
This is my join for slightly larger tables in Prod.Hope it helps.
SELECT TOP 1000 p.[id]
,p.[attributeId]
,p.[name] as PropertyName
,p.[description]
,p.[active],
a.[appId],
a.[activityId],
a.[Name] as AttributeName
FROM [XYZ.Gamification.V2B13.Full].[dbo].[ADM_attributeProperty] p
Inner join [XYZ.Gamification.V2B13.Full].[dbo].[ADM_activityAttribute] a
on a.id=p.attributeId
where a.appId=23098;
select ProductName=p.[name]
, ProductPrice=p.price
, AgentName=a.[name]
, AgentEmail=a.email
from products p
inner join agent a on a.userid=p.agentid
If you don't want to use inner join (or don't have possibility to do it!) and would combine rows, you can use a cross join :
SELECT *
FROM table1
CROSS JOIN table2
or simply
SELECT *
FROM table1, table2