SQL - joining multiple tables - sql

I have three tables that I'm trying to join:
sales
order
employee
For example, the tables have the following attributes.
Sales:
ID
price
Order:
ID
tag
Employee:
tag
yearsWorked
I would like to keep only records that exist from the result of a left join in sales and order -> left join the result with employee
SELECT *
FROM ( SELECT *
FROM SALES
LEFT JOIN ORDER
ON SALES.ID = ORDER.ID) AS SO
LEFT JOIN EMPLOYEE
on SO.TAG = EMPLOEYE.TAG;
The above query does not work.

There is no need for a subquery. All you have to do is 2 LEFT JOIN, each for the respective table. This will make sure that only the results of the first left join are joined with the third table.
SELECT *
FROM SALES S
LEFT JOIN ORDER O ON S.ID = O.ID
LEFT JOIN EMPLOYEE E ON O.TAG = E.TAG;
I hope this works.

SELECT
*
FROM
Order
LEFT JOIN Sales
ON Order.ID = Sales.ID
LEFT JOIN Employee
ON Order.tag = Employee.tag

Related

Multiple joins with group by (Sum)

When I using multiple JOIN, I hope to get the sum of some column in joined tables.
SELECT
A.*,
SUM(C.purchase_price) AS purcchase_total,
SUM(D.sales_price) AS sales_total,
B.user_name
FROM
PROJECT AS A
LEFT JOIN
USER AS B ON A.user_idx = B.user_idx
LEFT JOIN
PURCHASE AS C ON A.project_idx = C.project_idx
LEFT JOIN
SALES AS D ON A.project_idx = D.project_idx
GROUP BY
????
You need to use subquery as follows:
SELECT A.project_idx,
a.project_name,
A.project_category,
sum(C.purchase_price) AS purcchase_total,
sum(D.sales_price) as sales_total,
B.user_name
FROM PROJECT AS A
LEFT JOIN USER AS B ON A.user_idx = B.user_idx
LEFT JOIN (select project_idx, sum(purchase_price) as purchase_price
from PURCHASE group by project_idx ) AS C ON A.project_idx = C.project_idx
LEFT JOIN (select project_idx, sum(sale_price) as sale_price
from SALES group by project_idx) AS D ON A.project_idx = D.project_idx
I am not sure but you can use inner join of project with user instead of left join.
SELECT A.project_idx,
a.project_name,
A.project_category,
purcchase_total,
sales_total,
B.user_name
FROM PROJECT AS A
LEFT JOIN USER AS B ON A.user_idx = B.user_idx
LEFT JOIN (select project_idx, sum(purchase_price) as purchase_total
from PURCHASE group by project_idx ) AS C ON A.project_idx = C.project_idx
LEFT JOIN (select project_idx, sum(sale_price) as sale_total
from SALES group by project_idx) AS D ON A.project_idx = D.project_idx
This is working correctly on MS-SQL Server.
Thanks to Popeye
You are attempting to aggregate over two unrelated dimensions, and that throws off all the calculations.
Correlated subqueries are an alternative:
SELECT p.*,
(SELECT SUM(pu.purchase_price)
FROM PURCHASE pu
WHERE p.project_idx = pu.project_idx
) as purchase_total,
(SELECT SUM(s.sales_price)
FROM SALES s
WHERE p.project_idx = s.project_idx
) as sales_total,
u.user_name
FROM PROJECT p LEFT JOIN
USER u
ON p.user_idx = u.user_idx ;
Note that this uses meaningful table aliases so the query is easier to read. Arbitrary letters are really no better (and perhaps worse) than using the entire table name.
Correlated subqueries avoid the outer aggregation as well -- and let you select all the columns from the first table, which is what you want. They also often have better performance with the right indexes.

Trying to understand NULL operator in Query

I'm looking to see a breakdown of the total dollar business that each vendor has done (indirectly via the distributor) with each customer, where I'm trying not to use the Inner Join Syntax. I basically don't understand the difference between the two outputs produced by the two queries shown below:
Query1
select customers.cust_id, vendors.vend_id, sum(OrderItems.item_price*OrderItems.quantity) as total_business from
(((Vendors left outer join products
on vendors.vend_id = products.prod_id)
left outer join OrderItems
on products.prod_id = OrderItems.prod_id)
left outer join Orders
on OrderItems.order_num = Orders.order_num)
left outer join Customers
on Orders.cust_id = Customers.cust_id
group by Customers.cust_id, vendors.vend_id
order by total_business
I get the following output:
Query2
select customers.cust_id, Vendors.vend_id, sum(quantity*item_price) as total_business from
(((Vendors left outer join Products
on Products.vend_id = Vendors.vend_id)
left outer join OrderItems --No inner joins allowed
on OrderItems.prod_id = Products.prod_id)
left outer join Orders
on Orders.order_num = OrderItems.order_num)
left outer join Customers
on Customers.cust_id = Orders.cust_id
where Customers.cust_id is not null -- THE ONLY DIFFERENCE BETWEEN QUERY1 AND QUERY2
group by Customers.cust_id, Vendors.vend_id
order by total_business
I don't understand how there are only NULL cust_id's associated with the 1st Output when in the 2nd Output we get some non-NULL cust_ids. Why doesn't the 1st Output include these non-NULL cust_id's
Thank You
Query One is joining Vendors and Products incorrectly:
on vendors.vend_id = products.prod_id -- Vend_ID = Prod_ID
Query Two is joining Vendors and Products correctly:
on Products.vend_id = Vendors.vend_id -- Vend_ID = Vend_ID
Once that is fixed, you'll get the same IDs in both queries. Then I suggest you read Dan's answer to understand why what you were trying to do in eliminating INNER JOIN from the query is cancelled out by adding a WHERE filter to a column from the last table in the chain.
When you left join to a table, then filter on that table in the where clause, the join effectively changes to an inner join. The workaround is to apply the filter as a joining condition.
In your second query, all you have to do is is change the word "where" to "and".

SQL Server Circular Query

I have 4 tables, in that I want to fetch records from all 4 and aggregate the values
I have these tables
I am expecting this output
but getting this output as a Cartesian product
It is multiplying the expenses and allocation
Here is my query
select
a.NAME, b.P_NAME,
sum(a.DURATION) DURATION,
sum(b.[EXP]) EXPEN
from
(select
e.ID, a.P_ID, e.NAME, a.DURATION DURATION
from
EMPLOYEE e
inner join
ALLOCATION a ON e.ID = a.E_ID) a
inner join
(select
p.P_ID, e.E_ID, p.P_NAME, e.amt [EXP]
from
PROJECT p
inner join
EXPENSES e ON p.P_ID = e.P_ID) b ON a.ID = b.E_ID
and a.P_ID = b.P_ID
group by
a.NAME, b.P_NAME
Can anyone suggest something about this.
The following should work:
SELECT e.Name,p.Name,COALESCE(d.Duration,0),COALESCE(exp.Expen,0)
FROM
Employee e
CROSS JOIN
Project p
LEFT JOIN
(SELECT E_ID,P_ID,SUM(Duration) as Duration FROM Allocation
GROUP BY E_ID,P_ID) d
ON
e.E_ID = d.E_ID and
p.P_ID = d.P_ID
LEFT JOIN
(SELECT E_ID,P_ID,SUM(AMT) as Expen FROM Expenses
GROUP BY E_ID,P_ID) exp
ON
e.E_ID = exp.E_ID and
p.P_ID = exp.P_ID
WHERE
d.E_ID is not null or
exp.E_ID is not null
I've tried to write a query that will produce results where e.g. there are rows in Expenses but no rows in Allocations (or vice versa) for some particular E_ID,P_ID combination.
Use left join in select query by passing common id for all table
Hi I got the answer what I want from some modification in the query
The above query is also working like a charm and have done some modification to the original query and got the answer
Just have to group by the inner queries and then join the queries it will then not showing Cartesian product
Here is the updated one
select a.NAME,b.P_NAME,sum(a.DURATION) DURATION,sum(b.[EXP]) EXPEN from
(select e.ID,a.P_ID, e.NAME,sum(a.DURATION) DURATION from EMPLOYEE e inner join ALLOCATION a
ON e.ID=a.E_ID group by e.ID,e.NAME,a.P_ID) a
inner join
(select p.P_ID,e.E_ID, p.P_NAME,sum(e.amt) [EXP] from PROJECT p inner join EXPENSES e
ON p.P_ID=e.P_ID group by p.P_ID,p.P_NAME,e.E_ID) b
ON a.ID=b.e_ID and a.P_ID=b.P_ID group by a.NAME,b.P_NAME
Showing the correct output

Query extensibility with WHERE EXISTS with a large table

The following query is designed to find the number of people who went to a hospital, the total number of people who went to a hospital and the divide those two to find a percentage. The table Claims is two million plus rows and does have the correct non-clustered index of patientid, admissiondate, and dischargdate. The query runs quickly enough but I'm interested in how I could make it more usable. I would like to be able to add another code in the line where (hcpcs.hcpcs ='97001') and have the change in percentRehabNotHomeHealth be relfected in another column. Is there possible without writing a big, fat join statement where I join the results of the two queries together? I know that by adding the extra column the math won't look right, but I'm not worried about that at the moment. desired sample output: http://imgur.com/BCLrd
database schema
select h.hospitalname
,count(*) as visitCounts
,hospitalcounts
,round(count(*)/cast(hospitalcounts as float) *100,2) as percentRehabNotHomeHealth
from Patient p
inner join statecounties as sc on sc.countycode = p.countycode
and sc.statecode = p.statecode
inner join hospitals as h on h.npi=p.hospitalnpi
inner join
--this join adds the hospitalCounts column
(
select h.hospitalname, count(*) as hospitalCounts
from hospitals as h
inner join patient as p on p.hospitalnpi=h.npi
where p.statecode='21' and h.statecode='21'
group by h.hospitalname
) as t on t.hospitalname=h.hospitalname
--this where exists clause gives the visitCounts column
where h.stateCode='21' and p.statecode='21'
and exists
(
select distinct p2.patientid
from Patient as p2
inner join Claims as c on c.patientid = p2.patientid
and c.admissiondate = p2.admissiondate
and c.dischargedate = p2.dischargedate
inner join hcpcs on hcpcs.hcpcs=c.hcpcs
inner join hospitals as h on h.npi=p2.hospitalnpi
where (hcpcs.hcpcs ='97001' or hcpcs.hcpcs='9339' or hcpcs.hcpcs='97002')
and p2.patientid=p.patientid
)
and hospitalcounts > 10
group by h.hospitalname, t.hospitalcounts
having count(*)>10
You might look into CTE (Common Table Expressions) to get what you need. It would allow you to get summarized data and join that back to the detail on a common key. As an example I modified your join on the subquery to be a CTE.
;with hospitalCounts as (
select h.hospitalname, count(*) as hospitalCounts
from hospitals as h
inner join patient as p on p.hospitalnpi=h.npi
where p.statecode='21' and h.statecode='21'
group by h.hospitalname
)
select h.hospitalname
,count(*) as visitCounts
,hospitalcounts
,round(count(*)/cast(hospitalcounts as float) *100,2) as percentRehabNotHomeHealth
from Patient p
inner join statecounties as sc on sc.countycode = p.countycode
and sc.statecode = p.statecode
inner join hospitals as h on h.npi=p.hospitalnpi
inner join hospitalCounts on t.hospitalname=h.hospitalname
--this where exists clause gives the visitCounts column
where h.stateCode='21' and p.statecode='21'
and exists
(
select p2.patientid
from Patient as p2
inner join Claims as c on c.patientid = p2.patientid
and c.admissiondate = p2.admissiondate
and c.dischargedate = p2.dischargedate
inner join hcpcs on hcpcs.hcpcs=c.hcpcs
inner join hospitals as h on h.npi=p2.hospitalnpi
where (hcpcs.hcpcs ='97001' or hcpcs.hcpcs='9339' or hcpcs.hcpcs='97002')
and p2.patientid=p.patientid
)
and hospitalcounts > 10
group by h.hospitalname, t.hospitalcounts
having count(*)>10

Using sum with a nested select

I'm using SQL Server. This statement lists my products per menu:
SELECT menuname, productname
FROM [web].[dbo].[tblMenus]
FULL OUTER JOIN [web].[dbo].[tblProductsRelMenus]
ON [tblMenus].Id = [tblProductsRelMenus].MenuId
FULL OUTER JOIN [web].[dbo].[tblProducts]
ON [tblProductsRelMenus].ProductId = [tblProducts].ProductId
LEFT JOIN [web].[dbo].[tblOrderDetails]
ON ([tblProducts].Id = [tblOrderDetails].ProductId)
GROUP BY [tblProducts].ProductName
Some products don't have menus and vice versa. I use the following to establish what has been sold of each product.
SELECT [tblProducts].ProductName, SUM([tblOrderDetails].Ammount) as amount
FROM [web].[dbo].[tblProducts]
LEFT JOIN [web].[dbo].[tblOrderDetails]
ON ([tblProducts].ProductId = [tblOrderDetails].ProductId)
GROUP BY [tblProducts].ProductName
What I want to do is complement the top table with an amount column. That is, I want a table with the same number of rows as in the first table above but with an amount value if it exists, otherwise null.
I can't figure out how to do this. Any suggestions?
If I am not missing anything, the second query could be simplified, then incorporated into the first query like this:
SELECT
m.menuname,
p.productname,
t.amount
FROM [web].[dbo].[tblMenus] m
FULL JOIN [web].[dbo].[tblProductsRelMenus] pm ON m.Id = pm.MenuId
FULL JOIN [web].[dbo].[tblProducts] p ON pm.ProductId = p.ProductId
LEFT JOIN (
SELECT ProductId, SUM(Amount) as amount
FROM [web].[dbo].[tblOrderDetails]
GROUP BY ProductId
) t ON p.ProducId = t.ProductId