SQL: Left Join with three tables - sql

I have three tables
Products (idProduct, name)
Invoices(typeinvoice, numberinvoice, date)
Item-invoices(typeinvoice, numberinvoice, idProduct)
My query has to select all the products not selled in the year 2019. I can use a function to obtain the year from the date, for example year(i.date). I know that the products that don't appear in the Item-invoice table are the not selled products. So I have tried with this two codes and obtain a good output.
SELECT p.name
FROM Products p
EXECPT
SELECT ii.idProduct
FROM Item-invoices ii, Invoices i
WHERE ii.typeinvoice=i.typeinvoice
AND ii.numberinvoice=i.numberinvocice
AND year(i.date)=2019
And the other code use a sub-query:
SELECT p.name
FROM Products p
WHERE p.idProduct NOT IN
(SELECT ii.idProduct
FROM Item-invoices ii, Invoices i
WHERE ii.typeinvoice=i.typeinvoice
AND ii.numberinvoice=i.numberinvocice
AND year(i.date)=2019)
The answer is how can i use the left join command to have the same output. I've tried with
SELECT p.name
FROM Products p
LEFT JOIN Item-invoices ii ON
p.IdProduct=ii.idProduct
LEFT JOIN Invoices i ON
ii.typeinvoice=i.typeinvoice
AND ii.numberinvoice=i.numberinvocice
WHERE year(i.date)=2019
AND ii.idProduct IS NULL
I know this is wrong but can't find the solution
Any help?

You are almost there. You just need to move the condition on the invoice date to from the from clause to the on clause of the join.
Conditions in the WHERE clause are mandatory, so what you did actually turned the LEFT JOI to an INNER JOIN, which can never be fulfilled (since both conditions in the WHERE clause cannot be true at the same time).
SELECT p.name
FROM Products p
LEFT JOIN Item-invoices ii ON
p.IdProduct=ii.idProduct
LEFT JOIN Invoices i ON
ii.typeinvoice=i.typeinvoice
AND ii.numberinvoice=i.numberinvocice
AND i.date >= '2019-01-01'
AND i.date < '2020-01-01'
WHERE ii.idProduct IS NULL
Note that I changed your date filter to a half-open filter that operates directly on the stored date, without using date functions; this is a more efficient way to proceed (since it allows the database to use an existing index).

Related

2 Tables - one customer, one transactions. How to handle a customer with no transaction?

I have 2 tables-one customers, one transactions. One customer does not have any transactions. How do I handle that? As I'm trying to join my tables, the customer with no transaction does not show up as shown in code below.
SELECT Orders.Customer_Id, Customers.AcctOpenDate, Customers.CustomerFirstName, Customers.CustomerLastName, Orders.TxnDate, Orders.Amount
FROM Orders
INNER JOIN Customers ON Orders.Customer_Id=Customers.Customer_Id;
I need to be able to account for the customer with no transaction such as querying for least transaction amount.
Use below updated query - Right Outer join is used instead of Inner join to show all customers regardless of the customer placed an order yet.
SELECT Orders.Customer_Id, Customers.AcctOpenDate,
Customers.CustomerFirstName, Customers.CustomerLastName,
Orders.TxnDate, Orders.Amount
FROM Orders
Right Outer JOIN Customers ON Orders.Customer_Id=Customers.Customer_Id;
INNER Joins show only those records that are present in BOTH tables
OUTER joins gets SQL to list all the records present in the designated table and shows NULLs for the fields in the other table that are not present
LEFT OUTER JOIN (the first table)
RIGHT OUTER JOIN (the second table)
FULL OUTER JOIN (all records for both tables)
Get up to speed on the join types and how to handle NULLS and that is 90% of writing SQL script.
Below is the same query with a left join and using ISNULL to turn the amount column into 0 if it has no records present
SELECT Orders.Customer_Id, Customers.AcctOpenDate, Customers.CustomerFirstName, Customers.CustomerLastName
, Orders.TxnDate, ISNULL(Orders.Amount,0)
FROM Customers
LEFT OUTER JOIN Orders ON Orders.Customer_Id=Customers.Customer_Id;
try this :
SELECT Orders.Customer_Id, Customers.AcctOpenDate, Customers.CustomerFirstName, Customers.CustomerLastName, Orders.TxnDate, Orders.Amount
FROM Orders
Right OUTER JOIN Customers ON Orders.Customer_Id=Customers.Customer_Id;
I strongly recommend LEFT JOIN. This keeps all rows in the first table, along with matching columns in the second. If there are no matching rows, these columns are NULL:
SELECT c.Customer_Id, c.AcctOpenDate, c.CustomerFirstName, c.CustomerLastName,
o.TxnDate, o.Amount
FROM Customers c LEFT JOIN
Orders o
ON o.Customer_Id = c.Customer_Id;
Although you could use RIGHT JOIN, I never use RIGHT JOINs, because I find them much harder to follow. The logic of "keep all rows in the first table I read" is relatively simple. The logic of "I don't know which rows I'm keeping until I read the last table" is harder to follow.
Also note that I included table aliases and change the CustomerId to come from customers -- the table where you are keeping all rows.
Using CASE will replace "null" with 0 then you can sum the values. This will count customers with no transactions.
SELECT c.Name,
SUM(CASE WHEN t.ID IS NULL THEN 0 ELSE 1 END) as TransactionsPerCustomer
FROM Customers c
LEFT JOIN Transactions t
ON c.Name = t.customerID
group by c.Name
SELECT c.Name,
SUM(CASE WHEN t.ID IS NULL THEN 0 ELSE 1 END) as numberoftransaction
FROM customers c
LEFT JOIN transactions t
ON c.Name = t.customerID
group by c.Name

LEFT JOIN help in sql

I have to make a list of customer who do not have any invoice but have paid an invoice … maybe twice.
But with my code (stated below) it contains everything from the left join. However I only need the lines highlighted with green.
How should I make a table with only the 2 highlights?
Select paymentsfrombank.invoicenumber,paymentsfrombank.customer,paymentsfrombank.value
FROM paymentsfrombank
LEFT OUTER JOIN debtors
ON debtors.value = paymentsfrombank.value
You only want to select columns from paymentsfrombank. So why do you even join?
select invoice_number, customer, value from paymentsfrombank
except
select invoice_number, customer, value from debtors;
(This requires exact matches as in your example, i.e. same amount for the invoice/customer).
There are two issues in your SQL. First, you need to join on Invoice number, not on value, as joining on value is pointless. Second, you need to only pick those payments where there are no corresponding debts, i.e. when you left-join, the table on the right has "null" in the joining column. The SQL would be something like this:
SELECT paymentsfrombank.invoicenumber,paymentsfrombank.customer,paymentsfrombank.value
FROM paymentsfrombank
LEFT OUTER JOIN debtors
ON debtors.InvoiceNumber = paymentsfrombank.InvoiceNumber
WHERE debtors.InvoiceNumber is NULL
in mysql we usually have this way to flip the relation and extract the rows that dosen't have relation.
Select paymentsfrombank.invoicenumber,paymentsfrombank.customer,paymentsfrombank.value
FROM paymentsfrombank
LEFT OUTER JOIN debtors
ON debtors.value = paymentsfrombank.value where debtors.value is null
You can use NOT EXISTS :
SELECT p.*
FROM paymentsfrombank p
WHERE NOT EXISTS (SELECT 1 FROM debtors d WHERE d.invoice_number = p.invoice_number);
However, the LEFT OUTER JOIN would also work if you add filtered with WHERE Clause to filtered out only missing customers that haven't any invoice information :
SELECT p.invoicenumber, p.customer, p.value
FROM paymentsfrombank P LEFT OUTER JOIN
debtors d
ON d.InvoiceNumber = p.InvoiceNumber
WHERE d.InvoiceNumber IS NULL;
Note : I have used table alias (p & d) that makes query to easier read & write.

SQL Get aggregate as 0 for non existing row using inner joins

I am using SQL Server to query these three tables that look like (there are some extra columns but not that relevant):
Customers -> Id, Name
Addresses -> Id, Street, StreetNo, CustomerId
Sales -> AddressId, Week, Total
And I would like to get the total sales per week and customer (showing at the same time the address details). I have come up with this query
SELECT a.Name, b.Street, b.StreetNo, c.Week, SUM (c.Total) as Total
FROM Customers a
INNER JOIN Addresses b ON a.Id = b.CustomerId
INNER JOIN Sales c ON b.Id = c.AddressId
GROUP BY a.Name, c.Week, b.Street, b.StreetNo
and even if my SQL skill are close to none it looks like it's doing its job. But now I would like to be able to show 0 whenever the one customer don't have sales for a particular week (weeks are just integers). And I wonder if somehow I should get distinct values of the weeks in the Sales table, and then loop through them (not sure how)
Any help?
Thanks
Use CROSS JOIN to generate the rows for all customers and weeks. Then use LEFT JOIN to bring in the data that is available:
SELECT c.Name, a.Street, a.StreetNo, w.Week,
COALESCE(SUM(s.Total), 0) as Total
FROM Customers c CROSS JOIN
(SELECT DISTINCT s.Week FROM sales s) w LEFT JOIN
Addresses a
ON c.CustomerId = a.CustomerId LEFT JOIN
Sales s
ON s.week = w.week AND s.AddressId = a.AddressId
GROUP BY c.Name, a.Street, a.StreetNo, w.Week;
Using table aliases is good, but the aliases should be abbreviations for the table names. So, a for Addresses not Customers.
You should generate a week numbers, rather than using DISTINCT. This is better in terms of performance and reliability. Then use a LEFT JOIN on the Sales table instead of an INNER JOIN:
SELECT a.Name
,b.Street
,b.StreetNo
,weeks.[Week]
,COALESCE(SUM(c.Total),0) as Total
FROM Customers a
INNER JOIN Addresses b ON a.Id = b.CustomerId
CROSS JOIN (
-- Generate a sequence of 52 integers (13 x 4)
SELECT ROW_NUMBER() OVER (ORDER BY a.x) AS [Week]
FROM (VALUES(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) a(x)
CROSS JOIN (SELECT x FROM (VALUES(1),(1),(1),(1)) b(x)) b
) weeks
LEFT JOIN Sales c ON b.Id = c.AddressId AND c.[Week] = weeek.[Week]
GROUP BY a.Name
,b.Street
,b.StreetNo
,weeks.[Week]
Please try the following...
SELECT Name,
Street,
StreetNo,
Week,
SUM( CASE
WHEN Total IS NULL THEN
0
ELSE
Total
END ) AS Total
FROM Customers a
JOIN Addresses b ON a.Id = b.CustomerId
RIGHT JOIN Sales c ON b.Id = c.AddressId
GROUP BY a.Name,
c.Week,
b.Street,
b.StreetNo;
I have modified your statement in three places. The first is I changed your join to Sales to a RIGHT JOIN. This will join as it would with an INNER JOIN, but it will also keep the records from the table on the right side of the JOIN that do not have a matching record or group of records on the left, placing NULL values in the resulting dataset's fields that would have come from the left of the JOIN. A LEFT JOIN works in the same way, but with any extra records in the table on the left being retained.
I have removed the word INNER from your surviving INNER JOIN. Where JOIN is not preceded by a join type, an INNER JOIN is performed. Both JOIN and INNER JOIN are considered correct, but the prevailing protocol seems to be to leave the INNER out, where the RDBMS allows it to be left out (which SQL-Server does). Which you go with is still entirely up to you - I have left it out here for illustrative purposes.
The third change is that I have added a CASE statement that tests to see if the Total field contains a NULL value, which it will if there were no sales for that Customer for that Week. If it does then SUM() would return a NULL, so the CASE statement returns a 0 instead. If Total does not contain a NULL value, then the SUM() of all values of Total for that grouping is performed.
Please note that I am assuming that Total will not have any NULL values other than from the RIGHT JOIN. Please advise me if this assumption is incorrect.
Please also note that I have assumed that either there will be no missing Weeks for a Customer in the Sales table or that you are not interested in listing them if there are. Again, please advise me if this assumption is incorrect.
If you have any questions or comments, then please feel free to post a Comment accordingly.

Using a where statement against left outer join

I have a query that returns ALL the appointments for an advisor in a week, and what items they sold at that appointment (if non have been sold it still shows the appointment) I have done this by using a left outer join on the appointments table and ithe items table.
In this scenario I only want to display sold items in stock ( in stock is a field in the items table)
If I use a 'where instock true' against items, I loose all the appointments where no items have been sold ?
Do I need to nest these queries somehow - if so how ? Thanks
The key to what you want to do is to move the condition on the stock to the on clause from the where clause. Presumably, your existing query looks something like this:
select <whatever>
from appointments a left join
sells s
on a.appointmentid = s.appointmentid
where s.instock = 1;
The where clause is turning the left join into an inner join. When there is no match, the value of instock is NULL, which does not match 1. The solution is to move the condition to the on clause:
select <whatever>
from appointments a left join
sells s
on a.appointmentid = s.appointmentid and
s.instock = 1;
You should use ON instead WHERE. That way, if the items aren't in stock the wont match and the right part of your join will be NULL.
SELECT A.*, S.* FROM appointments A LEFT JOIN sells S ON ... AND sells.instock = 1;
I could be more precise if I had the SQL you've tried to write.

Difference b/w putting condition in JOIN clause versus WHERE clause

Suppose I have 3 tables.
Sales Rep
Rep Code
First Name
Last Name
Phone
Email
Sales Team
Orders
Order Number
Rep Code
Customer Number
Order Date
Order Status
Customer
Customer Number
Name
Address
Phone Number
I want to get a detailed report of Sales for 2010. I would be doing a join. I am interested in knowing which of the following is more efficient and why ?
SELECT
O.OrderNum, R.Name, C.Name
FROM
Order O INNER JOIN Rep R ON O.RepCode = R.RepCode
INNER JOIN Customer C ON O.CustomerNumber = C.CustomerNumber
WHERE
O.OrderDate >= '01/01/2010'
OR
SELECT
O.OrderNum, R.Name, C.Name
FROM
Order O INNER JOIN Rep R ON (O.RepCode = R.RepCode AND O.OrderDate >= '01/01/2010')
INNER JOIN Customer C ON O.CustomerNumber = C.CustomerNumber
JOINs must reflect the relationship aspect of your tables. WHERE clause, is a place where you filter records. I prefer the first one.
Make it readable first, table relationships should be obvious (by using JOINs), then profile
Efficiency-wise, the only way to know is to profile it, different database have different planner on executing the query
Wherein some database might apply filter first, then do the join subsquently; some database might join tables blindly first, then execute where clause later. Try to profile, on Postgres and MySQL use EXPLAIN SELECT ..., in SQL Server use Ctrl+K, with SQL Server you can see which of the two queries is faster relative to each other