SQL Server WHERE NOT EXISTS not working - sql

I have the three statements below.
Selects the Order Numbers that don't exist
select Orders.OrderNumber
FROM Orders
inner join InvoiceControl on Orders.OrderNumber = InvoiceControl.OrderNumber
where not exists (select OrderNumber from Orders where InvoiceControl.OrderNumber = Orders.OrderNumber)
Selects a specific Order number that does not exist
select OrderNumber from Orders where OrderNumber = 987654
Selects the specific Order Number in the corresponding table that does not exist
select OrderNumber from InvoiceControl where OrderNumber = 987654
These 3 queries work in other scenarios with other tables but not this one. Have I made an obvious mistake anywhere? Below are the queries run and their outputs:
The idea behind this is to locate the OrderNumbers that do not exist in InvoiceControl, based on the OrderNumbers in the Orders table. So the top query should also return the value 987654, as this value has not yet been included in the InvoiceControl table; it could be a new order without an invoice.

Because your INNER JOIN already pairs up every row where Orders.OrderNumber = InvoiceControl.OrderNumber.
Only after this result set is built are rows filtered out by the condition in your WHERE:
where not exists (select OrderNumber from Orders where InvoiceControl.OrderNumber = Orders.OrderNumber)
Hypothetically, if you had just 987654 in your Orders table and a matching row in your InvoiceControl table, then the following query, without your WHERE clause
select Orders.OrderNumber
FROM Orders
inner join InvoiceControl on Orders.OrderNumber = InvoiceControl.OrderNumber
would return:
OrderNumber
987654
Then, by applying your where not exists (select OrderNumber from Orders where InvoiceControl.OrderNumber = Orders.OrderNumber) condition, you'd be looking for all records that do not have a match in Orders (but every row you have left already has a match between your two tables, because of your INNER JOIN).
Thus, your result will be:
OrderNumber

In the first query, you first asked for rows in both Orders and InvoiceControl (by way of the FROM and JOIN tables), and then you added in your WHERE clause a request to exclude all rows that exist in Orders. Since your starting set only includes rows that are in Orders, if you ask for all of those rows to be excluded, you will get no results back.

If what you're looking for is all the order numbers in the Orders table that are not in the InvoiceControl table, then I would try this instead:
Select O.OrderNumber from Orders O
Left Join InvoiceControl I
On O.OrderNumber = I.OrderNumber
Where I.OrderNumber is null
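To see both behaviours side by side, here is a minimal sketch. It assumes an in-memory SQLite database as a stand-in for SQL Server, and the sample order numbers other than 987654 are made up for illustration. The first query is the one from the question, the second is the LEFT JOIN version above.

```python
import sqlite3

# Assumption: SQLite stands in for SQL Server; both queries use only standard SQL.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Orders (OrderNumber INTEGER);
    CREATE TABLE InvoiceControl (OrderNumber INTEGER);
    -- Hypothetical sample data: order 987654 has no invoice yet.
    INSERT INTO Orders VALUES (111111), (222222), (987654);
    INSERT INTO InvoiceControl VALUES (111111), (222222);
""")

# Query from the question: the INNER JOIN keeps only matched orders,
# then NOT EXISTS rejects every one of them.
original = conn.execute("""
    SELECT Orders.OrderNumber
    FROM Orders
    INNER JOIN InvoiceControl ON Orders.OrderNumber = InvoiceControl.OrderNumber
    WHERE NOT EXISTS (SELECT OrderNumber FROM Orders
                      WHERE InvoiceControl.OrderNumber = Orders.OrderNumber)
""").fetchall()

# Anti-join: keep all orders, then keep only those with no invoice row.
missing = conn.execute("""
    SELECT O.OrderNumber
    FROM Orders O
    LEFT JOIN InvoiceControl I ON O.OrderNumber = I.OrderNumber
    WHERE I.OrderNumber IS NULL
""").fetchall()

print(original)  # []
print(missing)   # [(987654,)]
```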

Related

T-SQL "not in (select " not working (as expected)

I have an ordinary one-to-many relation:
customer.id = order.customerid
I want to find customers who have no associated orders.
I tried:
-- one record
select * from customers where id = 123
-- no records
select * from orders where customerid = 123
-- NO RECORDS
select *
from customers
where id not in (select customerid from orders)
-- many records, as expected.
select *
from customers
where not exists (select customerid from orders
where orders.customerid = customers.id)
Am I mistaken, or should it work?
NOT IN does not behave as expected when the in-list contains NULL values.
In fact, if any values are NULL, then no rows are returned at all. Remember: In SQL, NULL means "indeterminate" value, not "missing value". So, if the list contains any NULL value then it might be equal to a comparison value.
So, at least one row in the orders table must have a NULL customerid.
For this reason, I strongly recommend that you always use NOT EXISTS with a subquery rather than NOT IN.
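A small sketch of the effect, assuming an in-memory SQLite database in place of SQL Server (the NULL handling shown here is standard SQL three-valued logic). The customer id 456 and the NULL order row are invented sample data.

```python
import sqlite3

# Assumption: SQLite stands in for SQL Server.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER);
    CREATE TABLE orders (customerid INTEGER);
    INSERT INTO customers VALUES (123), (456);
    -- One real order plus one order row with a NULL customerid.
    INSERT INTO orders VALUES (456), (NULL);
""")

# NOT IN: "123 <> NULL" is unknown, so no row can ever qualify.
not_in = conn.execute(
    "SELECT id FROM customers WHERE id NOT IN (SELECT customerid FROM orders)"
).fetchall()

# NOT IN with the NULLs filtered out of the subquery behaves as intended.
not_in_filtered = conn.execute("""
    SELECT id FROM customers
    WHERE id NOT IN (SELECT customerid FROM orders WHERE customerid IS NOT NULL)
""").fetchall()

# NOT EXISTS is not affected by the NULL row.
not_exists = conn.execute("""
    SELECT id FROM customers c
    WHERE NOT EXISTS (SELECT 1 FROM orders o WHERE o.customerid = c.id)
""").fetchall()

print(not_in)           # []
print(not_in_filtered)  # [(123,)]
print(not_exists)       # [(123,)]
```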
I normally do this via a left join that looks for nulls created when the join fails:
SELECT c.*
FROM
customers c
LEFT JOIN orders o
ON c.id = o.customerid
WHERE
o.customerid IS NULL
Left join treats the customer table as "solid" and connects orders to it where there is an order with a given customer id and puts nulls where there isn't any matching order, hence the orders side of the relationship has "holes" in the data. By then saying we only want to see the holes (via the where clause), we get a list of "customers with no orders"
As per the comments I've always worked to the rule "do not use IN for lists longer than you'd be prepared to write by hand" but increasingly optimisers are rewriting IN, EXISTS and LEFT JOIN WHERE NULL queries to function identically as they're all recognised patterns of "data in A that has no matching data in B"

How to put conditions on left joins

I have two tables, CustomerCost and Products that look like the following:
I am joining the two tables using the following SQL query:
SELECT custCost.ProductId,
custCost.CustomerCost
FROM CUSTOMERCOST Cost
LEFT JOIN PRODUCTS prod ON Cost.productId =prod.productId
WHERE prod.productId=4
AND (Cost.Customer_Id =2717
OR Cost.Customer_Id IS NULL)
The result of the join is:
What I want is this: when I pass customerId 2717, it should return only that customer's specific cost, i.e. 258.93. Only when the customerId does not match should it fall back to the cost 312.50.
What am I doing wrong here?
You can get your expected output as follows:
SELECT Cost.ProductId,
Cost.CustomerCost
FROM CUSTOMERCOST Cost
INNER JOIN PRODUCTS prod ON Cost.productId = prod.productId
WHERE prod.productId=4
AND Cost.Customer_Id = 2717
However, if you want to allow customer ID to be passed as NULL, you will have to change the last line to AND Cost.Customer_Id IS NULL. To do so dynamically, you'll need to use variables and generate the query based on the input.
The problem in the original query that you have posted is that you have used an alias called custCost which is not present in the query.
EDIT: Actually, you don't even need a join. The CUSTOMERCOST table seems to have both Customer and Product IDs.
You can simply:
SELECT
Cost.ProductId, Cost.CustomerCost
FROM
CUSTOMERCOST Cost
WHERE
Cost.Customer_Id = 2717
AND Cost.productId = 4
You seem to want:
SELECT c.*
FROM CUSTOMERCOST c
WHERE c.productId = 4 AND c.Customer_Id = 2717
UNION ALL
SELECT c.*
FROM CUSTOMERCOST c
WHERE c.productId = 4 AND c.Customer_Id IS NULL AND
NOT EXISTS (SELECT 1 FROM CUSTOMERCOST c2 WHERE c2.productId = 4 AND c2.Customer_Id = 2717);
That is, take the matching cost, if it exists for the customer. Otherwise, take the default cost.
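As a rough check of that fallback logic, here is a sketch that assumes an in-memory SQLite database and the two cost rows mentioned in the question (258.93 for customer 2717, 312.50 as the default with a NULL customer). The helper name cost_for and the customer id 9999 are hypothetical.

```python
import sqlite3

# Assumption: SQLite stands in for the real database; schema is guessed from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE CUSTOMERCOST (productId INTEGER, Customer_Id INTEGER, CustomerCost REAL);
    INSERT INTO CUSTOMERCOST VALUES (4, 2717, 258.93), (4, NULL, 312.50);
""")

def cost_for(customer_id, product_id=4):
    """Return the customer-specific cost if present, else the default (NULL customer) cost."""
    rows = conn.execute("""
        SELECT c.CustomerCost
        FROM CUSTOMERCOST c
        WHERE c.productId = :pid AND c.Customer_Id = :cid
        UNION ALL
        SELECT c.CustomerCost
        FROM CUSTOMERCOST c
        WHERE c.productId = :pid AND c.Customer_Id IS NULL
          AND NOT EXISTS (SELECT 1 FROM CUSTOMERCOST c2
                          WHERE c2.productId = :pid AND c2.Customer_Id = :cid)
    """, {"pid": product_id, "cid": customer_id}).fetchall()
    return [cost for (cost,) in rows]

print(cost_for(2717))  # [258.93]
print(cost_for(9999))  # [312.5]
```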
SELECT Cost.ProductId,
Cost.CustomerCost
FROM CUSTOMERCOST Cost
LEFT JOIN PRODUCTS prod
ON Cost.productId =prod.productId
AND (Cost.Customer_Id =2717 OR Cost.Customer_Id IS NULL)
WHERE prod.productId=4
WHERE filters the rows produced by the join. ON controls the join condition itself.
Outer joins are why the explicit JOIN ... ON syntax was added in SQL-92. The old SQL-89 syntax had no support for them, and different vendors added different, incompatible syntax to support them.
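The ON versus WHERE difference is easiest to see when the filter is on the right-hand (optional) table of a LEFT JOIN. A minimal sketch, assuming an in-memory SQLite database; product 5 is a made-up row with no cost entry:

```python
import sqlite3

# Assumption: SQLite stands in for the real database; tables are simplified.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE PRODUCTS (productId INTEGER);
    CREATE TABLE CUSTOMERCOST (productId INTEGER, Customer_Id INTEGER, CustomerCost REAL);
    INSERT INTO PRODUCTS VALUES (4), (5);
    INSERT INTO CUSTOMERCOST VALUES (4, 2717, 258.93);
""")

# Filter in ON: it only decides which cost rows match; every product is kept.
filter_in_on = conn.execute("""
    SELECT prod.productId, cost.CustomerCost
    FROM PRODUCTS prod
    LEFT JOIN CUSTOMERCOST cost
      ON cost.productId = prod.productId AND cost.Customer_Id = 2717
    ORDER BY prod.productId
""").fetchall()

# Filter in WHERE: applied after the join, so the NULL-extended row is discarded.
filter_in_where = conn.execute("""
    SELECT prod.productId, cost.CustomerCost
    FROM PRODUCTS prod
    LEFT JOIN CUSTOMERCOST cost ON cost.productId = prod.productId
    WHERE cost.Customer_Id = 2717
    ORDER BY prod.productId
""").fetchall()

print(filter_in_on)     # [(4, 258.93), (5, None)]
print(filter_in_where)  # [(4, 258.93)]
```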

SQL query using EXIST operator - unexpected records in result

Using the AdventureWorks2014 database, I was experimenting with the EXISTS keyword. Please note the following query:
select p.color, p.productid, p.name, th.Quantity
from production.product p, production.TransactionHistory th
where p.ProductID=th.ProductID and EXISTS(
select *
from Production.TransactionHistory t
where t.Quantity = 1000
and t.ProductID=p.ProductID
)
I was expecting to see only products that were ordered 1000 at a time (there is only one transaction that meets this condition), but instead I get hundreds of rows where th.Quantity is < 1000.
Removing the joined TransactionHistory table from the outer query solves the problem, but I just want to know why the original query returns the rows I am seeing.
Thanks
Edit:
For clarification, I understand how to solve the question that I want. I just wanted to understand the behavior of EXISTS and why I'm not getting the results I expected.
The following subquery (which is part of the EXISTS subquery), only returns a single result.
select *
from Production.TransactionHistory t
where t.Quantity = 1000
Therefore, if this is inside EXISTS it will return true every time. The caveat is that I am linking t.ProductID with p.ProductID in the subquery. So, for every row in the outer query, the product ID should be matching the product ID in the inner query. EXISTS should only return true when the product ID matches and the quantity is exactly 1000. To be precise, EXISTS should only return true when the product ID is 994, because there is only one transaction in the entire table (with that product ID) that satisfies both the product ID requirement and the 1000 quantity requirement.
Notice the rest of the EXISTS subquery...
where t.Quantity = 1000 and t.ProductID=p.ProductID
The product ID has to match the outer record's product ID AND the quantity must be 1000.
To me, this query says "Give me the color, product id and name of all products, join in transactions, and then only include each row where there is at least one record in the transaction table whose product id matches the id of the CURRENT outer row, AND the order quantity is 1000". But this is not how it behaves. Just trying to understand why.
Your query reads like this:
Get all transaction history entries of a product if any of its history entries has a Quantity equal to 1000.
EXISTS return true or false, so
EXISTS(
select *
from Production.TransactionHistory t
where t.Quantity = 1000
and t.ProductID=p.ProductID
)
will return true for all TransactionHistory rows of a product that has a transaction with Quantity = 1000.
In addition:
The subquery above is evaluated for every row of the outer ("main") query, and it returns true for every row whose product has such a transaction. That's why you get all of that product's rows.
EXISTS returns true if the following query has even one record in it.
You are looking for a query something like below:
SELECT p.color, p.productid, p.name, th.Quantity
FROM production.product p, production.TransactionHistory th
WHERE p.ProductID = th.ProductID and th.Quantity = 1000
OR you can replace it with a better looking join query which looks like this:
SELECT p.color, p.productid, p.name, th.Quantity
FROM production.product p
INNER JOIN production.TransactionHistory th ON p.ProductID = th.ProductID
WHERE th.Quantity = 1000
It's because the EXISTS clause is correlated only on ProductID; it says nothing about the th row being displayed. When it finds at least one transaction with quantity 1000 for the product, every joined row for that product passes. So all transactions for a product that has a transaction with quantity equal to 1000 will be displayed.
Basically your query is saying:
Give me every product and its transaction history WHERE there is EVER a transaction with a quantity of 1000
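Here is a small reproduction, assuming an in-memory SQLite database instead of AdventureWorks2014 (so the tables have no Production schema prefix). Only product 994 and the quantity 1000 come from the question; the other rows are invented.

```python
import sqlite3

# Assumption: simplified stand-in tables, not the real AdventureWorks schema.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Product (ProductID INTEGER, Name TEXT);
    CREATE TABLE TransactionHistory (ProductID INTEGER, Quantity INTEGER);
    INSERT INTO Product VALUES (994, 'Widget'), (1, 'Gadget');
    INSERT INTO TransactionHistory VALUES (994, 1000), (994, 5), (994, 2), (1, 3);
""")

# EXISTS only tests the product, so every joined th row of product 994 survives.
with_exists = conn.execute("""
    SELECT p.ProductID, th.Quantity
    FROM Product p, TransactionHistory th
    WHERE p.ProductID = th.ProductID
      AND EXISTS (SELECT * FROM TransactionHistory t
                  WHERE t.Quantity = 1000 AND t.ProductID = p.ProductID)
    ORDER BY th.Quantity
""").fetchall()

# Filtering the joined row itself returns only the quantity-1000 transaction.
filtered_join = conn.execute("""
    SELECT p.ProductID, th.Quantity
    FROM Product p
    INNER JOIN TransactionHistory th ON p.ProductID = th.ProductID
    WHERE th.Quantity = 1000
""").fetchall()

print(with_exists)    # [(994, 2), (994, 5), (994, 1000)]
print(filtered_join)  # [(994, 1000)]
```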

WHERE using a temporary table

I have three tables: customers, orders and refunds. Some customers (not all) placed orders and for some orders (not all) there were refunds.
When I join the three tables like this (the details are not that important):
SELECT ...
FROM customers LEFT JOIN orders
ON customers.customer_id=orders.customer_id
LEFT JOIN refunds
ON orders.order_id=refunds.order_id
-- uncomment the next line to filter out customers that have no orders
-- WHERE orders.order_id IS NOT NULL
;
I get a big table in which all customers are listed (even the ones that have not placed any orders and they have NULL in the 'order_id' column), with all their orders and the orders' refunds (even if not all orders have refunds):
NAME      ORDER_ID   ORDER AMOUNT   REFUND
------------------------------------------
Natalie   2          12.50          NULL
Natalie   3          18.00          18.00
Brenda    4          20.00          NULL
Adam      NULL       NULL           NULL
Since I only want to see customers that have placed orders, i.e. in this case I want to filter Adam out of the table, I uncomment the 'WHERE' row in the SQL query above.
This yields the desired result.
My question is:
On which table is the WHERE executed - on the original 'orders' table (which has no order_id that is NULL) or on the table that is result of the JOINs?
Apparently it is the latter, but just want to make sure, since it is not very obvious from the SQL syntax and it is a very important point.
Thank you
In this case, you're making SQL work harder than it has to. Logically, the WHERE is applied to the result of the joins, not to the original orders table (physically this is likely a merge join, or something along those lines).
There's a chance the optimizer realizes what you're doing and changes the plan to an INNER JOIN for you. But I can't be certain, and how it optimizes can change over time.
In the case where you only want customers that have an order, use an INNER JOIN instead. It states the intent directly and is likely to be at least as efficient.
SELECT ...
FROM customers
INNER JOIN orders
ON customers.customer_id=orders.customer_id
LEFT JOIN refunds
ON orders.order_id=refunds.order_id;
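To check that the WHERE really runs against the joined result, and that the INNER JOIN rewrite gives the same rows, here is a sketch assuming an in-memory SQLite database. The rows mirror the sample output in the question; the customer_id values and column names amount and refund are guesses.

```python
import sqlite3

# Assumption: SQLite stands in for the real database; schema guessed from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (customer_id INTEGER, name TEXT);
    CREATE TABLE orders (order_id INTEGER, customer_id INTEGER, amount REAL);
    CREATE TABLE refunds (order_id INTEGER, refund REAL);
    INSERT INTO customers VALUES (1, 'Natalie'), (2, 'Brenda'), (3, 'Adam');
    INSERT INTO orders VALUES (2, 1, 12.50), (3, 1, 18.00), (4, 2, 20.00);
    INSERT INTO refunds VALUES (3, 18.00);
""")

# LEFT JOIN plus WHERE: Adam's NULL-extended row exists only in the join
# result, and the WHERE removes it there.
left_filtered = conn.execute("""
    SELECT customers.name, orders.order_id, orders.amount, refunds.refund
    FROM customers
    LEFT JOIN orders ON customers.customer_id = orders.customer_id
    LEFT JOIN refunds ON orders.order_id = refunds.order_id
    WHERE orders.order_id IS NOT NULL
    ORDER BY orders.order_id
""").fetchall()

# INNER JOIN to orders: customers without orders never enter the result.
inner = conn.execute("""
    SELECT customers.name, orders.order_id, orders.amount, refunds.refund
    FROM customers
    INNER JOIN orders ON customers.customer_id = orders.customer_id
    LEFT JOIN refunds ON orders.order_id = refunds.order_id
    ORDER BY orders.order_id
""").fetchall()

print(inner)
# [('Natalie', 2, 12.5, None), ('Natalie', 3, 18.0, 18.0), ('Brenda', 4, 20.0, None)]
print(left_filtered == inner)  # True
```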
You can change the LEFT JOIN as INNER JOIN to eliminate customers which don't have any order
SELECT ...
FROM customers INNER JOIN orders
ON customers.customer_id=orders.customer_id
LEFT JOIN refunds
ON orders.order_id=refunds.order_id;
It's because you're using LEFT JOIN, which will return all rows from the left hand table, in your case this is the Customer Table, and return NULL where no corresponding values appear in the right hand tables.
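Just rewrite it using inner joins, so only rows where matching data is found will be returned. Note that an inner join to refunds also drops orders that have no refund; keep that join as a LEFT JOIN if those orders should remain.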
SELECT ...
FROM customers
INNER JOIN orders
ON customers.customer_id=orders.customer_id
INNER JOIN refunds
ON orders.order_id=refunds.order_id;

Return customers with no sales

I'm a bit of a beginner with SQL so apologies if this seems trivial/basic. I'm struggling to get my head around it...
I am trying to generate results that show all customers that are in the customer table that have never placed an order and will therefore have no entry on the invoice table.
In other words, I want to select all customers from the customer table where there is no entry for their customer number in the invoice table.
Many thanks,
Mike
SELECT *
FROM customer c
WHERE NOT EXISTS (
SELECT 1
FROM invoice i
WHERE i.customerid = c.customerid
)
I would suggest you also read Oracle's documentation on different types of table joins here.
If customer_id is the column that identifies the customer, you should do something like this...
select * from Customer
where customer_id not in (select customer_id from invoice)
If you want to return all customer rows, then you will want to use a LEFT JOIN:
select *
from customer c
left join invoice i
on c.customerid = i.customerid
where i.customerid is null
See SQL Fiddle with Demo
If you need help learning JOIN syntax, then here is a great visual explanation of joins.
A LEFT JOIN will return all rows from the customer table even if there is not a matching row in the invoice table. If you wanted to return only the rows that matched in both tables, then you would use an INNER JOIN. By adding the where i.customerid is null to the query it will return only those rows with no match in invoice.