Why is this LEFT OUTER JOIN not including all the primary keys from the left table?

The Customers table has a total of 1500 customers, of which 1000 placed orders in FY 2016. We want to display all customers with their total number of orders in FY 2016, whether or not a customer placed an order in that FY. But the following query in SQL Server 2012 returns only 1490 rows.
What are we missing here?
SELECT c.CustomerID, count(*) AS TotalOrders
FROM Customers c
LEFT JOIN Orders o ON c.CustomerID = o.CustomerID
WHERE o.FiscalYear = '2016'
GROUP BY c.CustomerID
UPDATE:
The following query returns only 1 more record (1491) - still missing 9 more records.
SELECT c.CustomerID, count(*) AS TotalOrders
FROM Customers c
LEFT JOIN Orders o ON c.CustomerID = o.CustomerID
AND o.FiscalYear = '2016'
GROUP BY c.CustomerID

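Your WHERE clause is turning the LEFT OUTER JOIN into an INNER JOIN: for customers without a matching order, o.FiscalYear is NULL, so the test o.FiscalYear = '2016' rejects those rows.
Move the condition into the ON clause with AND, and count o.CustomerID rather than * so that unmatched customers get 0: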
SELECT c.CustomerID, count(o.CustomerID) AS TotalOrders
FROM Customers c
LEFT JOIN Orders o
ON c.CustomerID = o.CustomerID
AND o.FiscalYear = '2016' -- Here
GROUP BY c.CustomerID
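The effect can be reproduced on a toy data set. The sketch below is only an illustration under stated assumptions: it uses Python's built-in sqlite3 module instead of SQL Server, and three made-up customers rather than the asker's data. It runs the same query twice, once with the year filter in WHERE and once with it in ON.

```python
import sqlite3

# Hypothetical toy data (not the asker's tables): customer 1 ordered twice in
# FY 2016, customer 2 ordered only in FY 2015, customer 3 never ordered.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Customers (CustomerID INTEGER PRIMARY KEY);
    CREATE TABLE Orders (
        OrderID INTEGER PRIMARY KEY,
        CustomerID INTEGER,
        FiscalYear TEXT
    );
    INSERT INTO Customers VALUES (1), (2), (3);
    INSERT INTO Orders VALUES (10, 1, '2016'), (11, 1, '2016'), (12, 2, '2015');
""")

# Filter in WHERE: it runs after the join, so rows whose o.FiscalYear is NULL
# or '2015' are thrown away and customers 2 and 3 disappear.
where_rows = conn.execute("""
    SELECT c.CustomerID, COUNT(o.CustomerID) AS TotalOrders
    FROM Customers c
    LEFT JOIN Orders o ON c.CustomerID = o.CustomerID
    WHERE o.FiscalYear = '2016'
    GROUP BY c.CustomerID
    ORDER BY c.CustomerID
""").fetchall()

# Filter in ON: it only decides which orders match, so every customer is kept.
on_rows = conn.execute("""
    SELECT c.CustomerID, COUNT(o.CustomerID) AS TotalOrders
    FROM Customers c
    LEFT JOIN Orders o ON c.CustomerID = o.CustomerID
                      AND o.FiscalYear = '2016'
    GROUP BY c.CustomerID
    ORDER BY c.CustomerID
""").fetchall()

print(where_rows)  # [(1, 2)]
print(on_rows)     # [(1, 2), (2, 0), (3, 0)]
```

With the filter in ON, the number of result rows equals the number of rows in Customers, which is the behaviour the question asks for.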

The correct SQL is:
SELECT c.CustomerID, count(o.CustomerID) AS TotalOrders,
sum(count(o.CustomerID)) over () as TotalTotalOrders
FROM Customers c LEFT JOIN
Orders o
ON c.CustomerID = o.CustomerID AND o.FiscalYear = '2016'
GROUP BY c.CustomerID;
TotalTotalOrders should equal the total number of FY 2016 orders (or at least the ones with valid customer IDs).

This will list all customers, whether or not they have any orders and regardless of the year in which an order was placed. The SUM then counts only the orders placed in 2016, ignores the rest, and returns an integer (i.e. it will never be NULL).
SELECT
c.CustomerID
,sum(case when o.FiscalYear = '2016' then 1 else 0 end) AS TotalOrders
FROM Customers c
LEFT JOIN Orders o
ON c.CustomerID = o.CustomerID
GROUP BY c.CustomerID
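A small check of the conditional-aggregation approach, again as a sketch with assumptions: Python's sqlite3 stands in for SQL Server, and the rows are invented for illustration. Customer 1 has orders in both years, customer 2 only in 2015, and customer 3 has none.

```python
import sqlite3

# Hypothetical toy data, invented for this illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Customers (CustomerID INTEGER PRIMARY KEY);
    CREATE TABLE Orders (
        OrderID INTEGER PRIMARY KEY,
        CustomerID INTEGER,
        FiscalYear TEXT
    );
    INSERT INTO Customers VALUES (1), (2), (3);
    INSERT INTO Orders VALUES
        (10, 1, '2016'), (11, 1, '2016'), (12, 1, '2015'), (13, 2, '2015');
""")

# The join is unfiltered; the CASE expression contributes 1 for a 2016 order
# and 0 for anything else, including the NULL row of a customer with no orders.
totals = conn.execute("""
    SELECT c.CustomerID,
           SUM(CASE WHEN o.FiscalYear = '2016' THEN 1 ELSE 0 END) AS TotalOrders
    FROM Customers c
    LEFT JOIN Orders o ON c.CustomerID = o.CustomerID
    GROUP BY c.CustomerID
    ORDER BY c.CustomerID
""").fetchall()

print(totals)  # [(1, 2), (2, 0), (3, 0)]
```

The trade-off compared with filtering in the ON clause is that every order of every year is joined before being counted, which may matter on large tables.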

Related

Gross sales of top ten customers that made at least 5 purchases of some Category in a specified year

I need to get (sales.customers) | year | gross_sales of the top ten customers that made at least 5 purchases of category "Beverages" in the year 2014. I have already written these SELECT queries, but since I am new to SQL, I think my code is very inefficient. It does not work properly, and there is probably a simpler way of doing it. I have also attached a picture of an ER diagram.
SELECT
T6.COMPANYNAME, YEAR, GROSS_SALES
FROM
(SELECT T1.CUSTID
FROM
(SELECT R.CUSTID, COUNT(R.CUSTID) AS NUMBEROFSALES
FROM SALES.ORDERDETAILS O
RIGHT JOIN PRODUCTION.PRODUCTS P ON P.PRODUCTID = O.PRODUCTID
RIGHT JOIN SALES.ORDERS R ON R.ORDERID = O.ORDERID
INNER JOIN SALES.CUSTOMERS C1 ON R.CUSTID = C1.CUSTID
INNER JOIN PRODUCTION.CATEGORIES C2 ON P.CATEGORYID = C2.CATEGORYID
WHERE C2.CATEGORYNAME = 'Beverages' AND YEAR(R.ORDERDATE) = 2014
GROUP BY R.CUSTID
ORDER BY SUM(R.CUSTID) DESC) T1
--HAVING COUNT(R.CUSTID) > 5
RIGHT JOIN
(SELECT R.CUSTID, SUM(O.UNITPRICE) AS MONEYSPENT
FROM SALES.ORDERDETAILS O
RIGHT JOIN PRODUCTION.PRODUCTS P ON P.PRODUCTID = O.PRODUCTID
RIGHT JOIN SALES.ORDERS R ON R.ORDERID = O.ORDERID
INNER JOIN SALES.CUSTOMERS C1 ON R.CUSTID = C1.CUSTID
INNER JOIN PRODUCTION.CATEGORIES C2 ON P.CATEGORYID = C2.CATEGORYID
WHERE C2.CATEGORYNAME = 'Beverages' AND YEAR(R.ORDERDATE) = 2014
GROUP BY R.CUSTID
ORDER BY SUM(O.UNITPRICE) DESC) T2 ON T1.CUSTID = T2.CUSTID
ORDER BY T1.NUMBEROFSALES DESC
LIMIT 10) T5
INNER JOIN
(SELECT DISTINCT(T4.COMPANYNAME), T4.CUSTID, YEAR, GROSS_SALES
FROM
(SELECT R.CUSTID AS CUSTID, YEAR(R.ORDERDATE) AS YEAR, SUM(O.UNITPRICE * O.QTY * (1 - O.DISCOUNT)) AS GROSS_SALES
FROM SALES.ORDERDETAILS O
RIGHT JOIN SALES.ORDERS R ON R.ORDERID = O.ORDERID
INNER JOIN SALES.CUSTOMERS C1 ON R.CUSTID = C1.CUSTID
GROUP BY R.CUSTID, YEAR(R.ORDERDATE)
ORDER BY YEAR(R.ORDERDATE)) T3
INNER JOIN
(SELECT C.COMPANYNAME, C.CUSTID
FROM SALES.ORDERS R
INNER JOIN SALES.CUSTOMERS C ON R.CUSTID = C.CUSTID) T4 ON T3.CUSTID = T4.CUSTID) T6 ON T5.CUSTID = T6.CUSTID
I'm not going to try to fix your code or explain the mistakes in it, as there are many of them. Instead, based on your requirements, I wrote a query that solves the problem. I show it in steps below, which should make clear the process I used.
First, how do we find the customers that made at least 5 purchases of Beverages?
Take the customer table and join it to the orders containing beverages (the inner joins exclude customers that don't meet the criteria):
SELECT C.CUSTOMERID
FROM CUSTOMERS C
JOIN ORDERS O ON C.CUSTOMERID = O.CUSTOMERID
JOIN ORDER_DETAILS OD ON O.ORDERID = OD.ORDERID
JOIN PRODUCTS P ON OD.PRODUCTID = P.PRODUCTID
-- alias CAT, because the alias C is already taken by CUSTOMERS
JOIN CATEGORIES CAT ON P.CATEGORYID = CAT.CATEGORYID AND CAT.CATEGORY_NAME = 'Beverages'
WHERE YEAR(O.ORDER_DATE) = 2014
GROUP BY C.CUSTOMERID
-- count the matching Beverages order lines per customer
HAVING COUNT(*) >= 5
Now we need the sum of orders (for 2014) per customer, which looks like this:
SELECT C.CUSTOMERID, YEAR(O.ORDER_DATE) AS YEAR, SUM(OD.UNIT_PRICE*OD.QUANTITY) AS TOTAL_SPEND
FROM CUSTOMERS C
JOIN ORDERS O ON C.CUSTOMERID = O.CUSTOMERID
JOIN ORDER_DETAILS OD ON O.ORDERID = OD.ORDERID
WHERE YEAR(O.ORDER_DATE) = 2014
GROUP BY C.CUSTOMERID, YEAR(O.ORDER_DATE)
Now we just combine these two queries like this:
SELECT C.CUSTOMERID, YEAR(O.ORDER_DATE) AS YEAR, SUM(OD.UNIT_PRICE*OD.QUANTITY) AS TOTAL_SPEND
FROM CUSTOMERS C
JOIN ORDERS O ON C.CUSTOMERID = O.CUSTOMERID
JOIN ORDER_DETAILS OD ON O.ORDERID = OD.ORDERID
JOIN (
-- customers with at least 5 Beverages order lines in 2014
SELECT C.CUSTOMERID
FROM CUSTOMERS C
JOIN ORDERS O ON C.CUSTOMERID = O.CUSTOMERID
JOIN ORDER_DETAILS OD ON O.ORDERID = OD.ORDERID
JOIN PRODUCTS P ON OD.PRODUCTID = P.PRODUCTID
JOIN CATEGORIES CAT ON P.CATEGORYID = CAT.CATEGORYID AND CAT.CATEGORY_NAME = 'Beverages'
WHERE YEAR(O.ORDER_DATE) = 2014
GROUP BY C.CUSTOMERID
HAVING COUNT(*) >= 5
) AS SUB ON SUB.CUSTOMERID = C.CUSTOMERID
WHERE YEAR(O.ORDER_DATE) = 2014
GROUP BY C.CUSTOMERID, YEAR(O.ORDER_DATE)
-- highest spenders first, so that LIMIT keeps the top ten
ORDER BY SUM(OD.UNIT_PRICE*OD.QUANTITY) DESC
LIMIT 10
Note: I did not test this; I just wrote the SQL, since I don't have a database to test against, so there might be typos.
Also note: I expect it is possible to remove the subquery, as it repeats many of the joins in the outer query, but we want to make sure we get the correct result, and it is easier to see that it is correct this way. You can also test the subquery by itself to make sure it returns the expected results.
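Since the answer was written without a database, here is one way the two-step idea could be tried out. This is a sketch under several assumptions, not the asker's schema: it uses Python's sqlite3, the table and column names follow the names guessed in this answer, the rows are invented, and SQLite's strftime('%Y', ...) replaces YEAR(), which SQLite does not have.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE CUSTOMERS (CUSTOMERID INTEGER PRIMARY KEY);
    CREATE TABLE ORDERS (ORDERID INTEGER PRIMARY KEY, CUSTOMERID INTEGER,
                         ORDER_DATE TEXT);
    CREATE TABLE ORDER_DETAILS (ORDERID INTEGER, PRODUCTID INTEGER,
                                UNIT_PRICE REAL, QUANTITY INTEGER);
    CREATE TABLE PRODUCTS (PRODUCTID INTEGER PRIMARY KEY, CATEGORYID INTEGER);
    CREATE TABLE CATEGORIES (CATEGORYID INTEGER PRIMARY KEY, CATEGORY_NAME TEXT);
    INSERT INTO CATEGORIES VALUES (1, 'Beverages'), (2, 'Snacks');
    INSERT INTO PRODUCTS VALUES (100, 1), (200, 2);
    INSERT INTO CUSTOMERS VALUES (1), (2);
""")

# Customer 1 (hypothetical): five 2014 orders with one beverage line each,
# plus one snack line on the first order.
for order_id in range(1, 6):
    conn.execute("INSERT INTO ORDERS VALUES (?, 1, '2014-03-01')", (order_id,))
    conn.execute("INSERT INTO ORDER_DETAILS VALUES (?, 100, 2.0, 1)", (order_id,))
conn.execute("INSERT INTO ORDER_DETAILS VALUES (1, 200, 10.0, 1)")

# Customer 2 (hypothetical): spends more, but has only two beverage lines.
for order_id in (6, 7):
    conn.execute("INSERT INTO ORDERS VALUES (?, 2, '2014-05-01')", (order_id,))
    conn.execute("INSERT INTO ORDER_DETAILS VALUES (?, 100, 2.0, 50)", (order_id,))

rows = conn.execute("""
    SELECT C.CUSTOMERID,
           CAST(strftime('%Y', O.ORDER_DATE) AS INTEGER) AS ORDER_YEAR,
           SUM(OD.UNIT_PRICE * OD.QUANTITY) AS TOTAL_SPEND
    FROM CUSTOMERS C
    JOIN ORDERS O ON C.CUSTOMERID = O.CUSTOMERID
    JOIN ORDER_DETAILS OD ON O.ORDERID = OD.ORDERID
    JOIN (
        -- step 1: customers with at least 5 Beverages order lines in 2014
        SELECT O2.CUSTOMERID
        FROM ORDERS O2
        JOIN ORDER_DETAILS OD2 ON O2.ORDERID = OD2.ORDERID
        JOIN PRODUCTS P ON OD2.PRODUCTID = P.PRODUCTID
        JOIN CATEGORIES CAT ON P.CATEGORYID = CAT.CATEGORYID
                           AND CAT.CATEGORY_NAME = 'Beverages'
        WHERE strftime('%Y', O2.ORDER_DATE) = '2014'
        GROUP BY O2.CUSTOMERID
        HAVING COUNT(*) >= 5
    ) SUB ON SUB.CUSTOMERID = C.CUSTOMERID
    -- step 2: total 2014 spend (all categories) of those customers
    WHERE strftime('%Y', O.ORDER_DATE) = '2014'
    GROUP BY C.CUSTOMERID, CAST(strftime('%Y', O.ORDER_DATE) AS INTEGER)
    ORDER BY TOTAL_SPEND DESC
    LIMIT 10
""").fetchall()

print(rows)  # [(1, 2014, 20.0)]
```

Customer 2 is excluded even though it spends more, because the subquery only lets through customers with at least five Beverages lines; customer 1's total includes the snack line because the outer query sums all 2014 order lines.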

SQL - List all customers that we did not make a sale to in the year 1996

This is my functioning SQL query to return the customers that we sold to in 1996:
SELECT C.CustomerID, C.CompanyName
FROM Customers C, Orders O
WHERE C.CustomerID = O.CustomerID AND YEAR(O.OrderDate) = 1996
GROUP BY C.CustomerID, C.CompanyName
ORDER BY C.CustomerID
Now I'm trying to show the opposite: return all customers that we did not sell to in 1996 (even if we did sell to them in other years). This is what I have; however, it returns not only the customers we didn't sell to in 1996 but also the ones we did:
SELECT C.CustomerID, C.CompanyName FROM Orders O JOIN Customers C
ON O.CustomerID = C.CustomerID
WHERE YEAR(O.OrderDate) != 1996
GROUP BY C.CustomerID, C.CompanyName
ORDER BY C.CustomerID
You can use a correlated subquery that gets the orders from 1996 of a customer with NOT EXISTS.
SELECT c.customerid,
c.companyname
FROM customers c
WHERE NOT EXISTS (SELECT *
FROM orders o
WHERE o.customerid = c.customerid
AND o.orderdate >= '1996-01-01'
AND o.orderdate < '1997-01-01');
Note that it is better not to use year() on orderdate, as this can prevent indexes from being used and so slow down the query.
We can build on your existing query and use the LEFT JOIN anti-join pattern:
SELECT C.CustomerID, C.CompanyName
FROM Customers C
LEFT JOIN Orders O
ON C.CustomerID = O.CustomerID
AND O.OrderDate >= '1996-01-01'
AND O.OrderDate < '1997-01-01'
WHERE O.CustomerID IS NULL
ORDER BY C.CustomerID
This reads as: try to join each customer with the orders they placed in 1996, and keep only the customers without any such order.
Side notes:
always use explicit, standard joins (with the ON keyword); old-school, implicit joins should be avoided (no comma in the FROM clause)
as also commented by sticky bit (whose answer is valid and I upvoted it), using date comparisons is better for performance than relying on date functions
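To see that the NOT EXISTS form and the LEFT JOIN ... IS NULL form agree, and why filtering on "not 1996" in an inner join does not answer the question, here is a minimal sketch. It assumes Python's sqlite3 and three invented customers; the date-range test in the last query is my stand-in for YEAR(O.OrderDate) != 1996.

```python
import sqlite3

# Hypothetical data: Alpha ordered in 1996 and 1997, Beta only in 1997,
# Gamma never ordered.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Customers (CustomerID INTEGER PRIMARY KEY, CompanyName TEXT);
    CREATE TABLE Orders (OrderID INTEGER PRIMARY KEY, CustomerID INTEGER,
                         OrderDate TEXT);
    INSERT INTO Customers VALUES (1, 'Alpha'), (2, 'Beta'), (3, 'Gamma');
    INSERT INTO Orders VALUES
        (1, 1, '1996-05-01'), (2, 1, '1997-02-01'), (3, 2, '1997-03-01');
""")

not_exists = conn.execute("""
    SELECT c.CustomerID, c.CompanyName
    FROM Customers c
    WHERE NOT EXISTS (SELECT 1 FROM Orders o
                      WHERE o.CustomerID = c.CustomerID
                        AND o.OrderDate >= '1996-01-01'
                        AND o.OrderDate <  '1997-01-01')
    ORDER BY c.CustomerID
""").fetchall()

anti_join = conn.execute("""
    SELECT c.CustomerID, c.CompanyName
    FROM Customers c
    LEFT JOIN Orders o
           ON c.CustomerID = o.CustomerID
          AND o.OrderDate >= '1996-01-01'
          AND o.OrderDate <  '1997-01-01'
    WHERE o.CustomerID IS NULL
    ORDER BY c.CustomerID
""").fetchall()

# The asker's approach: an inner join keeping orders outside 1996. It answers
# "who ordered in some other year", not "who did not order in 1996".
other_years = conn.execute("""
    SELECT c.CustomerID, c.CompanyName
    FROM Orders o JOIN Customers c ON o.CustomerID = c.CustomerID
    WHERE o.OrderDate < '1996-01-01' OR o.OrderDate >= '1997-01-01'
    GROUP BY c.CustomerID, c.CompanyName
    ORDER BY c.CustomerID
""").fetchall()

print(not_exists)   # [(2, 'Beta'), (3, 'Gamma')]
print(anti_join)    # [(2, 'Beta'), (3, 'Gamma')]
print(other_years)  # [(1, 'Alpha'), (2, 'Beta')]
```

The last result wrongly includes Alpha (who did order in 1996) and misses Gamma (who has no orders at all).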
Try this:
SELECT C.CustomerID, C.CompanyName FROM Customers C
WHERE
not exists(select 1 FROM Orders O where O.CustomerID = C.CustomerID and YEAR(O.OrderDate) = 1996)
ORDER BY C.CustomerID

COUNT with LEFT JOIN

Two tables named Sales.Customers and Sales.Orders.
Sales.Customers has a foreign key relationship to a column
named CustomerID in Sales.Orders.
Requirement: A query that returns all the customers. The query must also return the number of orders that each customer placed.
Query 1:
SELECT cust.cutomername,
NumberofOrders= COUNT(ord.OrderID)
FROM Sales.Customers Cust
LEFT JOIN
Sales.Orders Ord
ON Cust.CustomerID=Ord.OrderID
GROUP BY
Cust.CutomerName;
But I'm thinking of below one also,
Query2:
SELECT cust.cutomername,
NumberofOrders= COUNT(Cust.cutomerID)
FROM Sales.Customers Cust
LEFT JOIN
Sales.Orders Ord
ON Cust.CustomerID=Ord.OrderID
GROUP BY
Cust.CutomerName;
Which of the two do you recommend, and why?
This query:
SELECT c.customername, COUNT(o.OrderID)
FROM Sales.Customers c LEFT JOIN
Sales.Orders o
ON c.CustomerID = o.OrderID
GROUP BY c.CustomerName;
probably returns all customers with meaningless counts: 0 in most cases, except where an OrderID happens to match a CustomerID.
You probably intend:
SELECT c.customername, COUNT(o.OrderID)
FROM Sales.Customers c LEFT JOIN
Sales.Orders o
ON c.CustomerID = o.CustomerId
GROUP BY c.CustomerName;
In this query, the COUNT() is counting the number of matching orders. It can take the value of 0 for customers with no orders.
For this query:
SELECT c.customername, COUNT(c.CustomerID)
FROM Sales.Customers c LEFT JOIN
Sales.Orders o
ON c.CustomerID = o.CustomerID
GROUP BY c.CustomerName;
Here COUNT() counts a column from the left table, which is never NULL, so it returns the number of joined rows. Every customer has at least one row, so the value is never 0. Normally, you want the previous query.
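The difference between the two counts shows up as soon as one customer has no orders. The following sketch makes some assumptions: it uses Python's sqlite3, drops the Sales schema prefix (SQLite has no such schema here), and uses two invented customers.

```python
import sqlite3

# Hypothetical data: Ann has two orders, Bob has none.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Customers (CustomerID INTEGER PRIMARY KEY, CustomerName TEXT);
    CREATE TABLE Orders (OrderID INTEGER PRIMARY KEY, CustomerID INTEGER);
    INSERT INTO Customers VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO Orders VALUES (10, 1), (11, 1);
""")

counts = conn.execute("""
    SELECT c.CustomerName,
           COUNT(o.OrderID)    AS MatchedOrders,  -- skips the NULL from an unmatched row
           COUNT(c.CustomerID) AS JoinedRows      -- never NULL, so counts every row
    FROM Customers c
    LEFT JOIN Orders o ON c.CustomerID = o.CustomerID
    GROUP BY c.CustomerName
    ORDER BY c.CustomerName
""").fetchall()

print(counts)  # [('Ann', 2, 2), ('Bob', 0, 1)]
```

For Bob, counting the right-hand column gives 0 while counting the left-hand column gives 1, because the outer join still produces one NULL-extended row for him.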
Maybe this is your solution:
select c.Customername,
NumberOfOrders = (select count(o.OrderID) from Sales.Orders o where o.CustomerID = c.CustomerID)
from Sales.Customers c
order by c.Customername

SQL to pick only one record in one-to-many relationship

In the following query of current/prospective customers, I need to display the CustomerID and the customer's LastName, along with a column that shows whether the customer has placed at least one order.
But, as expected, it displays multiple records for a customer if the customer placed multiple orders (one-to-many relationship). Question: How can we display only one record per customer here, since we only need to report whether or not a customer has placed at least one order?
SELECT c.customerID, o.OrderID, CASE When ISNULL(o.OrderID, 0) = 0 Then 0 Else
1 End as YesNO
FROM Customers c
LEFT JOIN Orders o
ON c.customerID = o.customerID
using outer apply()
select
c.customerID
, o.OrderID
, case when o.OrderID is null then 0 else 1 end as YesNO
from Customers c
outer apply (
select top 1 o.OrderID
from Orders o
where c.customerID = o.customerID
) o
You could also use o.OrderId is null instead of ISNULL(o.OrderID, 0) = 0.
using group by and min()
select
c.customerID
, min(o.OrderID) as OrderId
, case when min(o.OrderID) is null then 0 else 1 end as YesNO
from Customers c
left join Orders o
on c.customerID = o.customerID
group by c.CustomerID
Use GROUP BY. Every selected column must then be either grouped or aggregated, so the OrderID test is wrapped in MAX:
SELECT c.customerID,
CASE WHEN MAX(o.OrderID) IS NULL THEN 0 ELSE 1 END AS YesNO
FROM Customers c
LEFT JOIN Orders o ON c.customerID = o.customerID
GROUP BY c.customerID
If the description in your question is to be believed, i.e. you want to know whether a customer has placed an order but do not need a representative OrderId for each customer:
select C.CustomerId,
case when exists ( select 42 from Orders as O where O.CustomerId = C.CustomerId )
then 1 else 0 end as YesNo
from Customers as C;
Note that exists is typically more efficient than count when you don't need an exact number, since it can stop at the first matching row.
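As a quick illustration of why EXISTS avoids the duplicate rows, here is a sketch that assumes Python's sqlite3 and two invented customers. It compares the row count of the plain LEFT JOIN with the result of the EXISTS version.

```python
import sqlite3

# Hypothetical data: customer 1 placed three orders, customer 2 none.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Customers (CustomerID INTEGER PRIMARY KEY);
    CREATE TABLE Orders (OrderID INTEGER PRIMARY KEY, CustomerID INTEGER);
    INSERT INTO Customers VALUES (1), (2);
    INSERT INTO Orders VALUES (10, 1), (11, 1), (12, 1);
""")

# Plain LEFT JOIN: one output row per matching order.
join_rows = conn.execute("""
    SELECT c.CustomerID, o.OrderID
    FROM Customers c
    LEFT JOIN Orders o ON c.CustomerID = o.CustomerID
""").fetchall()

# EXISTS: the subquery is only a yes/no test, so Customers is never multiplied.
flags = conn.execute("""
    SELECT c.CustomerID,
           CASE WHEN EXISTS (SELECT 1 FROM Orders o
                             WHERE o.CustomerID = c.CustomerID)
                THEN 1 ELSE 0 END AS YesNo
    FROM Customers c
    ORDER BY c.CustomerID
""").fetchall()

print(len(join_rows))  # 4
print(flags)           # [(1, 1), (2, 0)]
```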

Left Join in Oracle SQL

I was going through an example of LEFT JOIN on w3schools.com.
http://www.w3schools.com/sql/sql_join_left.asp
SELECT Customers.CustomerName, Orders.OrderID
FROM Customers
LEFT JOIN Orders
ON Customers.CustomerID=Orders.CustomerID
ORDER BY Customers.CustomerName;
The above query returns all customers with no orders (with a NULL OrderID) plus all customers that have orders, with their OrderIDs.
How should I modify this query so that it returns all customers with no orders plus all customers having orders with the order date '1996-09-18'?
Thanks in advance.
If you want customers with no orders and those with a specific order date, then you want a WHERE clause:
SELECT c.CustomerName, o.OrderID
FROM Customers c LEFT JOIN
Orders o
ON c.CustomerID = o.CustomerID
WHERE (o.CustomerID is NULL) OR (o.OrderDate = DATE '1996-09-18')
ORDER BY c.CustomerName;
If you wanted all customers with their order on that date (if they have one), then you would move the condition to the ON clause:
SELECT c.CustomerName, o.OrderID
FROM Customers c LEFT JOIN
Orders o
ON c.CustomerID = o.CustomerID AND o.OrderDate = DATE '1996-09-18'
ORDER BY c.CustomerName;
Note the difference: the first filters the customers. The second only affects what order gets shown (and NULL will often be shown).
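The difference between the two queries can be checked on three invented customers. This sketch assumes Python's sqlite3 rather than Oracle, so the DATE '...' literal is replaced by a plain ISO date string, which is how the dates are stored in this toy table.

```python
import sqlite3

# Hypothetical data: Ann ordered on 1996-09-18, Bob ordered only on another
# date, Cid never ordered.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Customers (CustomerID INTEGER PRIMARY KEY, CustomerName TEXT);
    CREATE TABLE Orders (OrderID INTEGER PRIMARY KEY, CustomerID INTEGER,
                         OrderDate TEXT);
    INSERT INTO Customers VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cid');
    INSERT INTO Orders VALUES (1, 1, '1996-09-18'), (2, 2, '1996-10-01');
""")

# Condition in WHERE: Bob's only joined row has a non-NULL order on the wrong
# date, so it fails both branches of the OR and Bob is filtered out.
where_version = conn.execute("""
    SELECT c.CustomerName, o.OrderID
    FROM Customers c
    LEFT JOIN Orders o ON c.CustomerID = o.CustomerID
    WHERE o.CustomerID IS NULL OR o.OrderDate = '1996-09-18'
    ORDER BY c.CustomerName
""").fetchall()

# Condition in ON: Bob's order simply does not match, so he is kept with NULL.
on_version = conn.execute("""
    SELECT c.CustomerName, o.OrderID
    FROM Customers c
    LEFT JOIN Orders o ON c.CustomerID = o.CustomerID
                      AND o.OrderDate = '1996-09-18'
    ORDER BY c.CustomerName
""").fetchall()

print(where_version)  # [('Ann', 1), ('Cid', None)]
print(on_version)     # [('Ann', 1), ('Bob', None), ('Cid', None)]
```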