How to Organize Multiple Joins in SQL

In SQL, how should I join tables together when I do multiple joins in one query? Should I join everything to only one table (in this case the Customers table), or is it okay to do what I have done here, joining on different tables as new keys are needed?
SELECT O.OrderID, O.OrderDate, C.City, C.Country, C.PostalCode, C.ContactName, O.CustomerID, O.ShipperID, D.ProductID, COUNT(D.ProductID) ProductCount, S.SupplierID
FROM Customers C
INNER JOIN Orders O
ON O.CustomerID = C.CustomerID
INNER JOIN OrderDetails D
ON O.OrderID = D.OrderID
INNER JOIN Products P
ON D.ProductID = P.ProductID
INNER JOIN Suppliers S
ON S.SupplierID = P.SupplierID
WHERE 1 = 1
GROUP BY O.OrderID
ORDER BY OrderDate DESC
I am using W3Schools SQL TryIt editor to test this, not sure what DB engine it is!
Thanks!

Of course you can join on multiple tables in a query. That is a big part of the power of SQL.
In your particular case, you don't need the join to the Suppliers table, because the column is already in Products.
Also, you need to be careful about your SELECT and GROUP BY clauses. In general, you should put all non-aggregated columns in the GROUP BY:
SELECT O.OrderID, O.OrderDate, C.City, C.Country, C.PostalCode, C.ContactName,
O.CustomerID, O.ShipperID, D.ProductID,
COUNT(D.ProductID) as ProductCount,
P.SupplierID
FROM Customers C INNER JOIN
Orders O
ON O.CustomerID = C.CustomerID INNER JOIN
OrderDetails D
ON O.OrderID = D.OrderID INNER JOIN
Products P
ON D.ProductID = P.ProductID
GROUP BY O.OrderID, O.OrderDate, C.City, C.Country, C.PostalCode, C.ContactName,
O.CustomerID, O.ShipperID, D.ProductID, P.SupplierId
ORDER BY OrderDate DESC;
The WHERE 1=1 is also unnecessary.
I wonder if this query really does what you want. However, you don't state what you actually want the query to do, so I'm merely speculating.

The way you have done it is fine. Don't forget that each inner join can shrink your record set: rows whose keys have no match in the newly joined table are dropped.
You could also just write JOIN; it is shorthand for INNER JOIN.
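To illustrate how each additional inner join can drop rows, here is a minimal sketch using Python's built-in sqlite3 module. The tiny tables and the sample rows (Alice, Bob, Carol) are made up for the illustration and are not the W3Schools data; the helper name customer_names is hypothetical as well.

```python
import sqlite3

# In-memory database with hypothetical sample rows (not the W3Schools data).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Customers (CustomerID INTEGER PRIMARY KEY, CustomerName TEXT);
CREATE TABLE Orders (OrderID INTEGER PRIMARY KEY, CustomerID INTEGER);
CREATE TABLE OrderDetails (OrderDetailID INTEGER PRIMARY KEY,
                           OrderID INTEGER, ProductID INTEGER);
INSERT INTO Customers VALUES (1, 'Alice'), (2, 'Bob'), (3, 'Carol');
-- Carol (3) has no orders at all.
INSERT INTO Orders VALUES (10, 1), (11, 2);
-- Order 11 (Bob) has no detail rows.
INSERT INTO OrderDetails VALUES (100, 10, 7);
""")

def customer_names(sql):
    """Run a query and return the distinct customer names it yields, sorted."""
    return sorted({row[0] for row in conn.execute(sql)})

# One inner join: Carol disappears because she has no matching order.
one_join = customer_names("""
    SELECT C.CustomerName
    FROM Customers C
    JOIN Orders O ON O.CustomerID = C.CustomerID
""")

# Two inner joins: Bob disappears too, because his order has no details.
two_joins = customer_names("""
    SELECT C.CustomerName
    FROM Customers C
    JOIN Orders O ON O.CustomerID = C.CustomerID
    JOIN OrderDetails D ON D.OrderID = O.OrderID
""")

print(one_join)
print(two_joins)
```

If the missing customers should stay in the result, that is the case where a LEFT JOIN would be used instead.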

Related

SQL - How can you use WHERE instead of LEFT/RIGHT JOIN?

Since I am a bit rusty, I was practicing SQL on this link and was trying to replace the LEFT JOIN completely with WHERE. How can I do this so that it does the same thing as the premade query on the website?
What I tried so far is:
SELECT Customers.CustomerName, Orders.OrderID
FROM Customers, Orders
WHERE Customers.CustomerID = Orders.CustomerID OR Customers.CustomerID != Orders.CustomerID
Order by Customers.CustomerName;
Thanks in advance for your help.
You are trying to replace
SELECT Customers.CustomerName, Orders.OrderID
FROM Customers
LEFT JOIN Orders ON Customers.CustomerID = Orders.CustomerID
with
SELECT Customers.CustomerName, Orders.OrderID
FROM Customers, Orders
WHERE ???
This is doomed to failure. Consider the case where Customers has two rows and Orders has zero. The outer join will return two rows.
The cross join (FROM Customers, Orders) will return zero rows.
In standard SQL, a WHERE clause can only reduce the rows produced by the FROM clause, not increase them, so there is nothing you can put for ??? that will give your desired results.
Before ANSI-92 joins were introduced, some systems had proprietary operators for this, such as *= in SQL Server, but that operator has since been removed from the product.
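The two-rows-versus-zero-rows case described above can be checked with a small sketch. It uses Python's sqlite3 module and two made-up customers; the table layout is a simplified assumption, not the real W3Schools schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Customers (CustomerID INTEGER PRIMARY KEY, CustomerName TEXT);
CREATE TABLE Orders (OrderID INTEGER PRIMARY KEY, CustomerID INTEGER);
-- Two customers, and deliberately no orders at all.
INSERT INTO Customers VALUES (1, 'Alice'), (2, 'Bob');
""")

# The outer join keeps both customers and fills OrderID with NULL.
left_join_rows = conn.execute("""
    SELECT c.CustomerName, o.OrderID
    FROM Customers c
    LEFT JOIN Orders o ON c.CustomerID = o.CustomerID
    ORDER BY c.CustomerName
""").fetchall()

# The comma (cross) join multiplies 2 rows by 0 rows, so nothing is left
# for any WHERE clause to filter.
comma_join_rows = conn.execute("""
    SELECT c.CustomerName, o.OrderID
    FROM Customers c, Orders o
""").fetchall()

print(left_join_rows)
print(comma_join_rows)
```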
This may work for you.
SELECT
c.CustomerName,
o.OrderID
FROM Customers c
LEFT JOIN Orders o
on c.CustomerID = o.CustomerID
Order by c.CustomerName;
If you are trying to replace this:
SELECT c.CustomerName, o.OrderID
FROM Customers c LEFT JOIN
Orders o
ON c.CustomerID = o.CustomerID
ORDER BY c.CustomerName;
Then you can use UNION ALL:
SELECT c.CustomerName, o.OrderID
FROM Customers c JOIN
Orders o
ON c.CustomerID = o.CustomerID
UNION ALL
SELECT c.CustomerName, NULL
FROM Customers c
WHERE NOT EXISTS (SELECT 1 FROM Orders o WHERE c.CustomerID = o.CustomerID)
ORDER BY CustomerName;
However, the LEFT JOIN is really a much better way to go.
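As a rough check that the UNION ALL form can return the same rows as the LEFT JOIN, here is a sketch with Python's sqlite3 module. The sample rows are invented, the second branch selects NULL for the order column because there is no matching order to read it from, and the column aliases and ORDER BY 1, 2 are choices made for this example only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Customers (CustomerID INTEGER PRIMARY KEY, CustomerName TEXT);
CREATE TABLE Orders (OrderID INTEGER PRIMARY KEY, CustomerID INTEGER);
INSERT INTO Customers VALUES (1, 'Alice'), (2, 'Bob');
-- Alice has two orders, Bob has none.
INSERT INTO Orders VALUES (10, 1), (12, 1);
""")

left_join_rows = conn.execute("""
    SELECT c.CustomerName, o.OrderID
    FROM Customers c
    LEFT JOIN Orders o ON c.CustomerID = o.CustomerID
    ORDER BY c.CustomerName, o.OrderID
""").fetchall()

# Matched rows from an inner join, plus the unmatched customers padded with NULL.
union_rows = conn.execute("""
    SELECT c.CustomerName AS CustomerName, o.OrderID AS OrderID
    FROM Customers c
    JOIN Orders o ON c.CustomerID = o.CustomerID
    UNION ALL
    SELECT c.CustomerName, NULL
    FROM Customers c
    WHERE NOT EXISTS (SELECT 1 FROM Orders o WHERE c.CustomerID = o.CustomerID)
    ORDER BY 1, 2
""").fetchall()

print(left_join_rows)
print(union_rows)
```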

LEFT JOIN vs Stacked Left Join

I wanted to ask what the difference is between these two queries:
SELECT
Customers.CustomerID, Customers.CustomerName, Orders.OrderID,
OrderDetails.Quantity, Products.ProductName
FROM
Customers
LEFT JOIN
(Orders
LEFT JOIN
(OrderDetails
LEFT JOIN
Products ON Products.ProductID = OrderDetails.ProductID
) ON OrderDetails.OrderID = Orders.OrderID
) ON Customers.CustomerID = Orders.CustomerID
GROUP BY
Customers.CustomerName;
Vs
SELECT
Customers.CustomerID, Customers.CustomerName, Orders.OrderID,
OrderDetails.Quantity, Products.ProductName
FROM
Customers
LEFT JOIN
Orders ON Orders.CustomerID = Customers.CustomerID
LEFT JOIN
OrderDetails ON OrderDetails.OrderID = Orders.OrderID
LEFT JOIN
Products ON Products.ProductID = OrderDetails.ProductID
GROUP BY
Customers.CustomerName;
Tested here
https://www.w3schools.com/sql/trysql.asp?filename=trysql_select_join
From what I can see, one selects the first of multiple entries and the other selects the last, but is that all?
From my point of view, the non-nested LEFT JOIN is way easier to read and understand. Is there any downside to using it?
Your problem is the incorrect use of GROUP BY. Every unaggregated column in the SELECT should also appear in the GROUP BY.
The rest of this answer addresses the point about joins.
Your second query is interpreted as:
FROM (((Customers c LEFT JOIN
Orders o
ON o.CustomerID = c.CustomerID
) LEFT JOIN
OrderDetails od
ON od.OrderID = o.OrderID
) LEFT JOIN
Products p
ON p.ProductID = od.ProductID
)
The parentheses can affect the interpretation. But what effect? Essentially, you have:
(((c left join o) left join od) left join p)
versus
(c left join (o left join (od left join p)))
Both keep all records in c, regardless of whether the other tables have matches. In this case, the two versions do the same thing, but only for a particular reason: the ON conditions are strictly chained (that is, c to o, o to od, od to p). If p were joined to o instead of od, then subtle differences could occur.
What are the subtle differences? Two things can differ:
Whether columns from a particular table are NULL or have values.
Whether rows get duplicated, due to multiple matches between two tables.
In practice, I don't find parentheses particularly useful. If I care about JOIN order, I use an explicit subquery or CTE.
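As a sketch of the CTE approach, the example below uses Python's sqlite3 module with invented rows. Note one assumption that goes beyond the question: an inner join is mixed in, because that is a simple case where the grouping of joins visibly changes the result. The CTE name OrderLines is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Customers (CustomerID INTEGER PRIMARY KEY, CustomerName TEXT);
CREATE TABLE Orders (OrderID INTEGER PRIMARY KEY, CustomerID INTEGER);
CREATE TABLE OrderDetails (OrderDetailID INTEGER PRIMARY KEY,
                           OrderID INTEGER, ProductID INTEGER);
INSERT INTO Customers VALUES (1, 'Alice'), (2, 'Bob');
INSERT INTO Orders VALUES (10, 1), (11, 2);
-- Only Alice's order has a detail row.
INSERT INTO OrderDetails VALUES (100, 10, 7);
""")

# Flat chain: the trailing inner join filters out Bob, whose order has no details.
flat_rows = conn.execute("""
    SELECT c.CustomerName, o.OrderID
    FROM Customers c
    LEFT JOIN Orders o ON o.CustomerID = c.CustomerID
    JOIN OrderDetails od ON od.OrderID = o.OrderID
    ORDER BY c.CustomerName
""").fetchall()

# CTE: orders are combined with their details first, then every customer
# is kept by the outer join against that intermediate result.
cte_rows = conn.execute("""
    WITH OrderLines AS (
        SELECT o.CustomerID, o.OrderID
        FROM Orders o
        JOIN OrderDetails od ON od.OrderID = o.OrderID
    )
    SELECT c.CustomerName, ol.OrderID
    FROM Customers c
    LEFT JOIN OrderLines ol ON ol.CustomerID = c.CustomerID
    ORDER BY c.CustomerName
""").fetchall()

print(flat_rows)
print(cte_rows)
```

The CTE spells out which joins belong together, so the reader does not have to work it out from the placement of parentheses and ON clauses.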

Query returns cartesian product when not expected

Task: Select all orders having products belonging to ‘Sea Food’ category.
Result: OrderNo, OrderDate, Product Name
I wrote this query, but it returns a Cartesian product.
select o.orderid, o.orderdate as "Order Date", p.productname , ct.categoryname from orders o,
order_details od , products p , customers c ,categories ct
where
od.orderid = o.orderid and p.productid = od.productid and ct.categoryid = p.categoryid
and ct.categoryname = 'Seafood';
Question: What is wrong with my query ?
You're doing a CROSS JOIN on the customers table because you forgot to specify how it connects to the other tables. This is why you should use explicit JOIN syntax rather than the old syntax with commas in the FROM clause and join conditions in the WHERE clause.
After translating your query into explicit syntax, you will see that there is no join condition involving the customers table:
select
o.orderid,
o.orderdate as "Order Date",
p.productname,
ct.categoryname
from
orders o
inner join order_details od on od.orderid = o.orderid
inner join products p on p.productid = od.productid
inner join categories ct on ct.categoryid = p.categoryid
cross join customers c -- either you don't need this table, or you need to specify conditions
where
ct.categoryname = 'Seafood'
Basically, you got the Cartesian product because your WHERE clause omitted any join condition involving the customers table, so you were left with:
from (...), customers -- cross join when joining condition not applied in where clause
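To see the multiplication in action, here is a small sketch using Python's sqlite3 module. The rows are invented (three customers, one order, one seafood product) and the schema is a simplified assumption of the one in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
create table customers (customerid integer primary key, customername text);
create table orders (orderid integer primary key, customerid integer, orderdate text);
create table order_details (orderid integer, productid integer);
create table categories (categoryid integer primary key, categoryname text);
create table products (productid integer primary key, productname text, categoryid integer);
insert into customers values (1, 'A'), (2, 'B'), (3, 'C');
insert into orders values (10, 1, '2024-01-05');
insert into categories values (8, 'Seafood');
insert into products values (7, 'Ikura', 8);
insert into order_details values (10, 7);
""")

# customers is listed in FROM but never tied to another table, so the single
# matching order line is repeated once per customer row.
with_stray_table = conn.execute("""
    select o.orderid, p.productname
    from orders o, order_details od, products p, customers c, categories ct
    where od.orderid = o.orderid
      and p.productid = od.productid
      and ct.categoryid = p.categoryid
      and ct.categoryname = 'Seafood'
""").fetchall()

# Explicit joins without the unused table return the order line once.
explicit_joins = conn.execute("""
    select o.orderid, p.productname
    from orders o
    inner join order_details od on od.orderid = o.orderid
    inner join products p on p.productid = od.productid
    inner join categories ct on ct.categoryid = p.categoryid
    where ct.categoryname = 'Seafood'
""").fetchall()

print(len(with_stray_table), len(explicit_joins))
```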

Is it possible to get one row by grouping more than one column

I have a query as below. DB from http://www.w3schools.com/sql/default.asp
SELECT count(distinct C.CustomerID),C.Country
FROM Customers C
inner join Orders O
on C.CustomerID = O.CustomerID
inner join OrderDetails D
on O.OrderID = D.OrderID
inner join Products P
on D.ProductID = P.ProductID
group by C.Country,P.CategoryID
order by C.Country
Here is the result from above.
But I want to get one row per country (as in the picture below) by counting CustomerIDs where the customers are in the same country and have the same CategoryID as well. So I have to group by 2 columns. Is there any way to do it? Could you please suggest something?
Thank you.
That's quite simple. Just remove P.CategoryID in your GROUP BY clause.
SELECT COUNT(DISTINCT C.CustomerID), C.Country
FROM Customers C
INNER JOIN Orders O
ON C.CustomerID = O.CustomerID
INNER JOIN OrderDetails D
ON O.OrderID = D.OrderID
INNER JOIN Products P
ON D.ProductID = P.ProductID
GROUP BY C.Country
ORDER BY C.Country;
Update
Following your comment, this should be the correct approach:
SELECT T.Country, SUM(T.Cnt)
FROM (
SELECT COUNT(DISTINCT C.CustomerID) AS Cnt, C.Country
FROM Customers C
INNER JOIN Orders O
ON C.CustomerID = O.CustomerID
INNER JOIN OrderDetails D
ON O.OrderID = D.OrderID
INNER JOIN Products P
ON D.ProductID = P.ProductID
GROUP BY C.Country, P.CategoryID
) AS T
GROUP BY T.Country
ORDER BY T.Country;
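The two queries in this answer count different things, which a small sketch may make clearer. It uses Python's sqlite3 module with invented rows; the result names per_country and summed_per_category are hypothetical, and Customers is trimmed to the columns the queries need.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Customers (CustomerID INTEGER PRIMARY KEY, Country TEXT);
CREATE TABLE Orders (OrderID INTEGER PRIMARY KEY, CustomerID INTEGER);
CREATE TABLE OrderDetails (OrderID INTEGER, ProductID INTEGER);
CREATE TABLE Products (ProductID INTEGER PRIMARY KEY, CategoryID INTEGER);
INSERT INTO Customers VALUES (1, 'Germany'), (2, 'Germany'), (3, 'France');
INSERT INTO Orders VALUES (10, 1), (11, 1), (12, 2), (13, 3);
INSERT INTO Products VALUES (1, 1), (2, 2);
-- Customer 1 buys from categories 1 and 2; customers 2 and 3 only from category 1.
INSERT INTO OrderDetails VALUES (10, 1), (11, 2), (12, 1), (13, 1);
""")

joins = """
    FROM Customers C
    INNER JOIN Orders O ON C.CustomerID = O.CustomerID
    INNER JOIN OrderDetails D ON O.OrderID = D.OrderID
    INNER JOIN Products P ON D.ProductID = P.ProductID
"""

# Distinct customers per country: each customer counts once.
per_country = conn.execute(
    "SELECT C.Country, COUNT(DISTINCT C.CustomerID)" + joins +
    "GROUP BY C.Country ORDER BY C.Country"
).fetchall()

# Distinct customers per (country, category), then summed per country:
# a customer counts once for every category they bought from.
summed_per_category = conn.execute(
    "SELECT T.Country, SUM(T.Cnt) FROM ("
    "SELECT COUNT(DISTINCT C.CustomerID) AS Cnt, C.Country" + joins +
    "GROUP BY C.Country, P.CategoryID) AS T "
    "GROUP BY T.Country ORDER BY T.Country"
).fetchall()

print(per_country)
print(summed_per_category)
```

Which of the two is wanted depends on whether a customer who bought from several categories should be counted once or once per category.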

SQL query with w3schools db

I should have asked multiple questions in my other post. Thanks to all who have helped, I am now stuck on another one..
Using the w3schools db, List SupplierID, SupplierName and ItemSupplied (count of number of items supplied by a supplier), sort the list first by number of items supplied (descending) and then by supplier name (ascending)
SELECT supplierid,
suppliername,
p.productname,
Count(s.supplierid) AS itemssupplied
FROM [Suppliers] AS s
INNER JOIN [Products] AS p
ON p.supplierid = s.supplierid
GROUP BY p.productid,
p.productname
ORDER BY Count (p.productid, p.productname) DESC
order BY s.suppliername
It's giving me an error, then again I am ordering by multiple ones. I think there's something I am not quite understanding here.
My other question is
List customers for each category and the total of orders placed by that customer in a given category. In the query show three columns: CategoryName, CustomerName, and TotalOrders (which is price * quantity for orders for a given customer in a given category). Sort this data in descending order by TotalOrders.
SELECT cg.CategoryName,
c.CustomerName,
Sum(p.Price * od.Quantity) AS TotalOrders
FROM [products] AS p
INNER JOIN [orderdetails] AS od
ON od.ProductID = p.ProductID
INNER JOIN [orders] AS o
ON o.OrderID = od.OrderID
INNER JOIN [customers] AS c
ON c.customerID = o.CustomerID
INNER JOIN [categories] AS cg
ON cg.CategoryID = p.CategoryID
GROUP BY c.CustomerName
ORDER BY TotalOrders DESC
Can someone please check if my query is correct? Thank you once again!
Question 1
You are really close but you only need to state ORDER BY once (also make sure to include all shown fields in your GROUP BY unless you are aggregating them):
SELECT s.SupplierID, s.SupplierName, p.ProductName, COUNT(s.SupplierID) AS ItemsSupplied
FROM [Suppliers] AS s
INNER JOIN [Products] AS p ON p.SupplierID = s.SupplierID
GROUP BY p.ProductID, p.ProductName, s.SupplierID, s.SupplierName -- Added SupplierID, SupplierName
ORDER BY COUNT(s.SupplierID) DESC, s.SupplierName
Notice that you just place multiple sorts on the same line with a comma separating them.
Question 2
You're almost there but you need to group by any field that is not being aggregated. So in order not to get a parsing error, I added the cg.CategoryName to the GROUP BY line.
SELECT cg.CategoryName, c.CustomerName, Sum(p.Price*od.Quantity) AS TotalOrders
FROM [Products] AS p
INNER JOIN [OrderDetails] AS od ON od.ProductID = p.ProductID
INNER JOIN [Orders] AS o ON o.OrderID = od.OrderID
INNER JOIN [Customers] AS c ON c.customerID = o.CustomerID
INNER JOIN [Categories] AS cg ON cg.CategoryID = p.CategoryID
GROUP BY c.CustomerName, cg.CategoryName --Added CategoryName
ORDER BY TotalOrders DESC
You have several problems with the first query:
You're grouping by ProductID and ProductName even though you want the number of items supplied by a supplier, which means that you want to group by SupplierID and SupplierName.
You're supplying too many arguments to the COUNT function, which takes a single column name or *.
You've included a ProductName column in your results, which is not called for.
You need to ORDER BY both the number of products supplied and the SupplierName.
With those points in mind:
SELECT
s.SupplierID,
s.SupplierName,
COUNT(p.ProductID) AS ItemsSupplied
FROM
[Suppliers] AS s
INNER JOIN [Products] AS p ON p.SupplierID = s.SupplierID
GROUP BY
s.SupplierID, s.SupplierName
ORDER BY
ItemsSupplied DESC,
s.SupplierName ASC
Your second query is quite close. The one thing missing is that you want the total of orders placed by each customer in a given category, so in addition to grouping by c.CustomerName, you need to group by the category:
SELECT
cg.CategoryName,
c.CustomerName,
SUM(p.Price*od.Quantity) AS TotalOrders
FROM
[Products] AS p
INNER JOIN [OrderDetails] AS od ON od.ProductID = p.ProductID
INNER JOIN [Orders] AS o ON o.OrderID = od.OrderID
INNER JOIN [Customers] AS c ON c.customerID = o.CustomerID
INNER JOIN [Categories] AS cg ON cg.CategoryID = p.CategoryID
GROUP BY
c.CustomerName, cg.CategoryID, cg.CategoryName
ORDER BY
TotalOrders DESC
The first one has two order by clauses
ORDER BY COUNT (p.productID, p.ProductName) DESC
and
ORDER BY s.SupplierName
Also, some databases will complain when the ORDER BY columns of a query using GROUP BY are not included in the selected columns.
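For reference, the supplier count with a two-level sort can be tried out with a small sketch. It uses Python's sqlite3 module; the supplier and product names are invented, and the square brackets are dropped since they are not needed here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Suppliers (SupplierID INTEGER PRIMARY KEY, SupplierName TEXT);
CREATE TABLE Products (ProductID INTEGER PRIMARY KEY, ProductName TEXT,
                       SupplierID INTEGER);
INSERT INTO Suppliers VALUES (1, 'Beta Foods'), (2, 'Alpha Goods'), (3, 'Gamma Ltd');
-- Suppliers 1 and 2 supply two products each, supplier 3 supplies one.
INSERT INTO Products VALUES
    (1, 'Tea', 1), (2, 'Jam', 1),
    (3, 'Rice', 2), (4, 'Salt', 2),
    (5, 'Oil', 3);
""")

# A single ORDER BY clause with two comma-separated sort keys:
# the count descending first, then the supplier name ascending as a tie-breaker.
supplier_counts = conn.execute("""
    SELECT s.SupplierID, s.SupplierName, COUNT(p.ProductID) AS ItemsSupplied
    FROM Suppliers AS s
    INNER JOIN Products AS p ON p.SupplierID = s.SupplierID
    GROUP BY s.SupplierID, s.SupplierName
    ORDER BY ItemsSupplied DESC, s.SupplierName ASC
""").fetchall()

for row in supplier_counts:
    print(row)
```

The two suppliers tied on two items come out in name order, which is what the second sort key is for.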