Trying to understand NULL operator in Query - sql

I'm looking to see a breakdown of the total dollar business that each vendor has done (indirectly via the distributor) with each customer, where I'm trying not to use the Inner Join Syntax. I basically don't understand the difference between the two outputs produced by the two queries shown below:
Query1
select customers.cust_id, vendors.vend_id, sum(OrderItems.item_price*OrderItems.quantity) as total_business from
(((Vendors left outer join products
on vendors.vend_id = products.prod_id)
left outer join OrderItems
on products.prod_id = OrderItems.prod_id)
left outer join Orders
on OrderItems.order_num = Orders.order_num)
left outer join Customers
on Orders.cust_id = Customers.cust_id
group by Customers.cust_id, vendors.vend_id
order by total_business
I get the following output:
Query2
select customers.cust_id, Vendors.vend_id, sum(quantity*item_price) as total_business from
(((Vendors left outer join Products
on Products.vend_id = Vendors.vend_id)
left outer join OrderItems --No inner joins allowed
on OrderItems.prod_id = Products.prod_id)
left outer join Orders
on Orders.order_num = OrderItems.order_num)
left outer join Customers
on Customers.cust_id = Orders.cust_id
where Customers.cust_id is not null -- THE ONLY DIFFERENCE BETWEEN QUERY1 AND QUERY2
group by Customers.cust_id, Vendors.vend_id
order by total_business
I don't understand how there are only NULL cust_id's associated with the 1st Output when in the 2nd Output we get some non-NULL cust_ids. Why doesn't the 1st Output include these non-NULL cust_id's
Thank You

Query One is joining Vendors and Products incorrectly:
on vendors.vend_id = products.prod_id -- Vend_ID = Prod_ID
Query Two is joining Vendors and Products correctly:
on Products.vend_id = Vendors.vend_id -- Vend_ID = Vend_ID
Once that is fixed, you'll get the same IDs in both queries. Then I suggest you read Dan's answer to understand why what you were trying to do in eliminating INNER JOIN from the query is cancelled out by adding a WHERE filter to a column from the last table in the chain.

When you left join to a table, then filter on that table in the where clause, the join effectively changes to an inner join. The workaround is to apply the filter as a joining condition.
In your second query, all you have to do is is change the word "where" to "and".

Related

Left Join gives me a lot of duplicate value

I have two tables.there are 3 common columns. When I use left join it gives me a lot duplicate values. How should I avoid the duplicate.
You need to use left join as below:
For Example:
SELECT
cu.Id
,cu.CustomerName,
_ord.Total
FROM Customers cu
LEFT JOIN
(
SELECT
customerId
,sum(amount) Total
FROM Orders
GROUP BY customerId
) _ord on _ord.customerId=cu.id

Understand Sub-Queries

I was initially looking to see a breakdown of the total dollar business that each vendor has done (indirectly via the distributor) with each customer, where I'm trying not to use the Inner Join Syntax and used the Query below for this purpose:
select customers.cust_id, Vendors.vend_id, sum(quantity*item_price) as total_business from
(((Vendors left outer join Products
on Products.vend_id = Vendors.vend_id)
left outer join OrderItems --No inner joins allowed
on OrderItems.prod_id = Products.prod_id)
left outer join Orders
on Orders.order_num = OrderItems.order_num)
left outer join Customers
on Customers.cust_id = Orders.cust_id
where Customers.cust_id is not null -- THE ONLY DIFFERENCE BETWEEN QUERY1 AND QUERY2
group by Customers.cust_id, Vendors.vend_id
order by total_business
Now, I am trying to see the query output results for all vendor-customer combinations, including those combinations where there was no business transacted and am trying to write this via a single SQL Query. My teacher provided this solution, but I honestly cannot understand the logic at all, as I've never come across Sub-queries.
select
customers.cust_id,
Vendors.vend_id,
sum(OrderItems.quantity*orderitems.item_price)
from
(
customers
inner join
Vendors on 1 = 1
)
left outer join --synthetic product using joins
(
orders
join
orderitems on orders.order_num = OrderItems.order_num
join
Products on orderitems.prod_id = products.prod_id
) on
Vendors.vend_id = Products.vend_id and
customers.cust_id = orders.cust_id
group by customers.cust_id, vendors.vend_id
order by customers.cust_id
Thanks a lot
I would write this query as:
select c.cust_id, v.vend_id, coalesce(cv.total, 0)
fro Customers c cross join
Vendors v left outer join
(select o.cust_id, v.vend_id, sum(oi.quantity * oi.item_price) as total
from orders o join
orderitems oi
on o.order_num = oi.order_num join
Products p
on oi.prod_id = p.prod_id
group by o.cust_id, v.vend_id
) cv
on cv.vend_id = v.vend_id and
cv.cust_id = c.cust_id
order by c.cust_id;
The structure is quite similar. Both version start by creating a cross product between all customers and vendors. This creates all the rows in the output result set. Next, the aggregation needs to be calculated at this level. In the above query, this is done explicitly as a subquery which aggregates the values to the customer/vendor level. (In the original query, this is done in the outer query.)
The final step is joining these together.
Your teacher should be encouraging you to use table aliases, particularly table abbreviations. You should also be encouraged to use the proper join. So, although you can express a cross join as an inner join with on 1=1, a cross join is part of the SQL language, not a hack.
Similarly, parentheses in the from clause can make the logic harder to follow. Explicit subqueries are more easily read.

Outer Join or NOT IN

Need a little help solving problem. Apologies if this is vague which is why I need help. Show The title ID, title, publisher id, publisher name, and terms for all books that have never been sold. This means that there are no records in the Sales Table. Use OUTER JOIN or NOT IN against SALES table. In this case, the terms are referring to the items on the titles table involving money. ( Price, advance, royalty ).
I've come up with the following code and there are no records returned. I've looked at the tables and no records should be returned. Not sure my code is correct based on using the OuterJoin or NOT IN. Thanks!
SELECT titles.title_id,
titles.title,
publishers.pub_id,
sales.payterms,
titles.price,
titles.advance,
titles.royalty
FROM publishers
INNER JOIN titles
ON publishers.pub_id = titles.pub_id
INNER JOIN sales
ON titles.title_id = sales.title_id
WHERE (sales.qty = 0)
I believe all you need it to change the INNER JOIN to a LEFT JOIN for your sales table. Then check for NULL on any of the columns from the sales table.
SELECT titles.title_id, titles.title, publishers.pub_id, sales.payterms,
titles.price, titles.advance, titles.royalty
FROM publishers INNER JOIN
titles ON publishers.pub_id = titles.pub_id LEFT JOIN
sales ON titles.title_id = sales.title_id
WHERE sales.qty IS NULL
Well, you are using INNER JOIN so it will only return records that match the join condition which is not what you want. You should use LEFT OUTER JOIN (or LEFT JOIN) and check that the sales side is null (meaning there's no record matching).
SELECT titles.title_id, titles.title, publishers.pub_id, sales.payterms,
titles.price, titles.advance, titles.royalty
FROM publishers INNER JOIN
titles ON publishers.pub_id = titles.pub_id LEFT OUTER JOIN
sales ON titles.title_id = sales.title_id
WHERE sales.title_id IS NULL
IS NULL Means that a data value does not exist in the database. And just change your INNER JOIN TO LEFT JOIN.
use this..
SELECT titles.title_id,
titles.title,
publishers.pub_id,
sales.payterms,
titles.price,
titles.advance,
titles.royalty
FROM publishers
INNER JOIN titles
ON publishers.pub_id = titles.pub_id
LEFT JOIN sales
ON titles.title_id = sales.title_id
WHERE (sales.qty IS NULL)

SQL Query Showing 4x Records

The following statement works properly but shows each record 4 times. Repeated; I know the relationship is wrong but no idea how to fix it? Apologies if this is simple and i've missed it.
SELECT Customers.First_Name, Customers.Last_Name, Plants.Common_Name, Plants.Flower_Colour, Plants.Flowering_Season, Staff.First_Name, Staff.Last_Name
FROM Customers, Plants, Orders, Staff
INNER JOIN Orders AS t2 ON t2.Order_ID = Staff.Order_ID
WHERE Orders.Order_Date
BETWEEN '2011/01/01'
AND '2013/03/01'
You are generating a Cartesian product between the tables since you have not provided join syntax between any of the tables:
SELECT c.First_Name, c.Last_Name,
p.Common_Name, p.Flower_Colour, p.Flowering_Season,
s.First_Name, s.Last_Name
FROM Customers c
INNER JOIN Orders o
on c.customerId = o.customer_id
INNER JOIN Plants p
on o.plant_id = p.plant_id
INNER JOIN Staff s
ON o.Order_ID = s.Order_ID
WHERE o.Order_Date BETWEEN '2011/01/01' AND '2013/03/01'
Note: I am guessing on column names for the joins
Here is a great visual explanation of joins that can help in learning the correct syntax
In the FROM... clause you are doing a cross join - combining every customer with every plant with every order with every staff.
You should only mention one table in the FROM clause and then connect the other ones with INNER JOINS to only get related records.
I don't know exactly how your database looks like, but something like this:
SELECT Customers.First_Name, Customers.Last_Name, Plants.Common_Name,
Plants.Flower_Colour, Plants.Flowering_Season, Staff.First_Name, Staff.Last_Name
FROM Customers
INNER JOIN Orders ON Orders.Customer_ID = Customers.Customer_ID
INNER JOIN Staff ON Staff.Staff_ID = Orders.Staff_ID
INNER JOIN Plants ON Plants.Plants_ID = Orders.Plants_ID
WHERE Orders.Order_Date
BETWEEN '2011/01/01'
AND '2013/03/01'
This is because you are selecting from four tables without any joins between them, and also because you are joining Orders twice. As the result, a Cartesian product is made.
Here is how you should fix it: re-write the theta join using the ANSI syntax, and provide proper join conditions:
SELECT Customers.First_Name, Customers.Last_Name, Plants.Common_Name, Plants.Flower_Colour, Plants.Flowering_Season, Staff.First_Name, Staff.Last_Name
FROM Customers
JOIN Plants ON ...
JOIN Orders ON ...
JOIN Staff ON ...
INNER JOIN Orders AS t2 ON t2.Order_ID = Staff.Order_ID
WHERE Orders.Order_Date BETWEEN '2011/01/01' AND '2013/03/01'
Replace ... with proper join conditions; this should make the results look as expected.

Using sum with a nested select

I'm using SQL Server. This statement lists my products per menu:
SELECT menuname, productname
FROM [web].[dbo].[tblMenus]
FULL OUTER JOIN [web].[dbo].[tblProductsRelMenus]
ON [tblMenus].Id = [tblProductsRelMenus].MenuId
FULL OUTER JOIN [web].[dbo].[tblProducts]
ON [tblProductsRelMenus].ProductId = [tblProducts].ProductId
LEFT JOIN [web].[dbo].[tblOrderDetails]
ON ([tblProducts].Id = [tblOrderDetails].ProductId)
GROUP BY [tblProducts].ProductName
Some products don't have menus and vice versa. I use the following to establish what has been sold of each product.
SELECT [tblProducts].ProductName, SUM([tblOrderDetails].Ammount) as amount
FROM [web].[dbo].[tblProducts]
LEFT JOIN [web].[dbo].[tblOrderDetails]
ON ([tblProducts].ProductId = [tblOrderDetails].ProductId)
GROUP BY [tblProducts].ProductName
What I want to do is complement the top table with an amount column. That is, I want a table with the same number of rows as in the first table above but with an amount value if it exists, otherwise null.
I can't figure out how to do this. Any suggestions?
If I am not missing anything, the second query could be simplified, then incorporated into the first query like this:
SELECT
m.menuname,
p.productname,
t.amount
FROM [web].[dbo].[tblMenus] m
FULL JOIN [web].[dbo].[tblProductsRelMenus] pm ON m.Id = pm.MenuId
FULL JOIN [web].[dbo].[tblProducts] p ON pm.ProductId = p.ProductId
LEFT JOIN (
SELECT ProductId, SUM(Amount) as amount
FROM [web].[dbo].[tblOrderDetails]
GROUP BY ProductId
) t ON p.ProducId = t.ProductId