Best way of excluding customers with a specific order - sql

I'm running a pretty basic query. The issue is I'm trying to get results with all of our customers who never ordered product Y. The problem is, if I use a simple WHERE ProductColumn <> 'Product Y', it doesn't work because almost all of our customers have ordered other products.
Basically, I'm wondering how I could exclude on the customer level (instead of the order level) - if a customer has ordered Product Y, I don't want them showing up in my results at all.
Thanks.

You are probably looking for EXISTS().
If I want to find customers who have placed orders:
SELECT c.*
FROM customers c
WHERE EXISTS (
SELECT 1
FROM orders o
WHERE o.customerid = c.customerID
AND productID = 'Y'
)
If I want to find customers who have not placed orders:
SELECT c.*
FROM customers c
WHERE NOT EXISTS (
SELECT 1
FROM orders o
WHERE o.customerid = c.customerID
AND productID = 'Y'
)

Try this:
select * from customers c
where not exists(select 1 from customers
where cutomer_id = c.customer_id
and productcolumn = 'product y')
This assumes you have 'customer_id' column (or at least some id column in your table).

Simple LEFT JOIN should work:
SELECT c.*
FROM customers c
LEFT OUTER JOIN orders o
ON o.customerid = c.customerID
WHERE o.ProductColumn <> 'Product Y'

Related

Sql query to display records that appear more than once in a table

I have two tables, Customer with columns CustomerID, FirstName, Address and Purchases with columns PurchaseID, Qty, CustomersID.
I want to create a query that will display FirstName(s) that have bought more than two products, product quantity is represented by Qty.
I can't seem to figure this out - I've just started with T-SQL
You could sum the purchases and use a having clause to filter those you're interested in. You can then use the in operator to query only the customer names that fit these IDs:
SELECT FirstName
FROM Customer
WHERE CustomerID IN (SELECT CustomerID
FROM Purchases
GROUP BY CustomerID
HAVING SUM(Qty) > 2)
Please try this, it should work for you, according to your question.
Select MIN(C.FirstName) FirstName from Customer C INNER JOIN Purchases P ON C.CustomerID=P.CustomersID Group by P.CustomersID Having SUM(P.Qty) >2
Please try this:
select c.FirstName,p.Qty
from Customer as c
join Purchase as p
on c.CustomerID = p.CustomerID
where CustomerID in (select CustomerID from Purchases group by CustomerID having count(CustomerID)>2);
SELECT
c.FirstName
FROM
Customer c
INNER JOIN Purchases p
ON c.CustomerId = p.CustomerId
GROUP BY
c.FirstName
HAVING
SUM(p.Qty) > 2
While the IN suggestions would work they are kind of overkill and more than likely less performant than a straight up join with aggregation. The trick is the HAVING Clause by using it you can limit your result to the names you want. Here is a link to learn more about IN vs. Exists vs JOIN (NOT IN vs NOT EXISTS)
There are dozens of ways of doing this and to introduce you to Window Functions and common table expressions which are way over kill for this simplified example but are invaluable in your toolset as your queries continue to get more complex:
;WITH cte AS (
SELECT DISTINCT
c.FirstName
,SUM(p.Qty) OVER (PARTITION BY c.CustomerId) as SumOfQty
FROM
Customer c
INNER JOIN Purchases p
ON c.CustomerId = p.CustomerId
)
SELECT *
FROM
cte
WHERE
SumOfQty > 2

SQL EXISTS returns all rows, more than two tables

I know similar questions like this have been asked before, but I have not seen one for more than 2 tables. And there seems to be a difference.
I have three tables from which I need fields, customers where I need customerID and orderID from, orders from which I get customerID and orderID and lineitems from which I get orderID and quantity (= quantity ordered).
I want to find out how many customers bought more than 2 of the same item, so basically quantity > 2 with:
SELECT COUNT(DISTINCT custID)
FROM customers
WHERE EXISTS(
SELECT *
FROM customers C, orders O, lineitems L
WHERE C.custID = O.custID AND O.orderID = L.orderID AND L.quantity > 2
);
I do not understand why it is returning me the count of all rows. I am correlating the subqueries before checking the >2 condition, am I not?
I am a beginner at SQL, so I'd be thankful if you could explain it to me fundamentally, if necessary. Thanks.
You don't have to repeat customers table in the EXISTS subquery. This is the idea of correlation: use the table of the outer query in order to correlate.
SELECT COUNT(DISTINCT custID)
FROM customers c
WHERE EXISTS(
SELECT *
FROM orders O
JOIN lineitems L ON O.orderID = L.orderID
WHERE C.custID = O.custID AND L.quantity > 2
);
I would approach this as two aggregations:
select count(distinct customerid)
from (select o.customerid, l.itemid, count(*) as cnt
from lineitems li join
orders o
on o.orderID = l.orderId
group by o.customerid, l.itemid
) ol
where cnt >= 2;
The inner query counts the number of items that each customer has purchased. The outer counts the number of customers.
EDIT:
I may have misunderstood the question for the above answer. If you just want where quantity >= 2, then that is much easier:
select count(distinct o.customerid)
from lineitems li join
orders o
on o.orderID = l.orderId
where l.quantity >= 2;
This is probably the simplest way to express the query.
I suggest you to use "joins" ,
Try this
select
count(*)
From
orders o
inner join
lineitems l
on
l.orderID = o.orderID
where
l.quantity > 2

Trying to Optimize PostgreSQL Nested WHERE IN

I have a Postgres (9.1) customer database similar to:
customers.id
customers.lastname
customers.firstname
invoices.id
invoices.customerid
invoices.total
invoicelines.id
invoicelines.invoiceid
invoicelines.itemcode
invoicelines.price
I built a search which lists all customers who have purchased a certain item (say 'abc').
Select * from customers WHERE customers.id IN
(Select invoices.customerid FROM invoices WHERE invoices.id IN
(Select invoicelines.invoiceid FROM invoicelines WHERE
invoicelines.itemcode = 'abc')
)
The search works fine and brings up the correct customers but takes about 10 seconds or so on a database of 2 million invoices and 2 million line items.
I was wondering if there was another approach that could trim that down a bit.
An alternative is to use EXISTS:
Select *
from customers
WHERE EXISTS (
Select invoices.customerid
FROM invoices
JOIN invoicelines
ON invoicelines.invoiceid = invoices.id AND
invoicelines.itemcode = 'abc' AND
customers.id = invoices.customerid)
You might switch to using exists instead. I suspect that this might work well:
Select c.*
from customers c
where exists (Select 1
from invoices i join
invoicelines il
on i.id = il.invoiceid and il.itemcode = 'abc'
where c.id = i.customerid
);
For this, you want to be sure you have the right indexes: invoices(customerid, id) and invoicelines(invoiceid, itemcode).
Do you want all of the rows and columns in customer where the itemcode for that customer's item is 'abc'? If you join on the customerid then you can find all of the customer information for those items. If you have duplicates within that list you can use DISTINCT which will only give you one entry per customerID.
SELECT
DISTINCT [List of customer columns]
FROM
customers
INNER JOIN
invoicelines
ON
customers.customerid = invoicelines.customerid
AND
invoicelines.itemcode = 'abc'

Fetch data from more than one tables using Group By

I am using three tables in a PostgreSql database as:
Customer(Id, Name, City),
Product(Id, Name, Price),
Orders(Customer_Id, Product_Id, Date)
and I want to execute a query to get from them "the customers that have have ordered at least two different products alnong with the products". The query I write is:
select c.*, p.*
from customer c
join orders o on o.customer_id = c.id
join product p on p.id = o.product_id
group by (c.id)
having count(distinct o.product_id)>=2
It throws the error:
"column "p.id" must appear in the GROUP BY clause or be used in an aggregate function
LINE 1: select c.*, p.*".
However if I remove the the p.* from select statement (assuming that I one does not want the products, only the customers), it runs fine. How can I get the products as well?
Update: Having ordered two or more products, a customer must appear on the output as many times as its product he has ordered. I want as output a table with 5 columns:
Cust ID | Cust Name | Cust City | Prod ID | Prod Name | Prod Price
Is it possible in SQL given that group by should be used? Shoul it be used on more than one columns on different tables?
Try this out :
SELECT distinct c.* ,p.*
FROM Customer c
JOIN
(SELECT o.customer_id cid
FROM Product P
JOIN Orders o
ON p.id= o.product_id
GROUP BY o.customer_id
HAVING COUNT(distinct o.product_id)>=2) cp
ON c.id =cp.cid
JOIN Orders o
on c.id=o.customer_id
JOIN Product p
ON o.product_id =p.id
I hope it solves your problem.
I think you can use following query for this question -
SELECT C1.*, p1.*
FROM Customer C1
JOIN Orders O1 ON O1.Customer_Id = C1.Id
JOIN Product P1 ON P1.Id = O1.Product_Id
WHERE C1.Id IN (SELECT c.Id
FROM Customer c
JOIN Orders o ON o.Customer_Id = c.Id
GROUP BY (c.Id)
HAVING COUNT(DISTINCT o.Product_Id) >= 2)

How to do a join with multiple conditions in the second joined table?

I have 2 tables. The first table is a list of customers.
The second table is a list of equipment that those customers own with another field with some data on that customer (customer issue). The problem is that for each customer, there may be multiple issues.
I need to do a join on these tables but only return results of customers having two of these issues.
The trouble is, if I do a join with OR, I get results including customers with only one of these issues.
If I do AND, I don't get any results because each row only includes one condition.
How can I do this in T-SQL 2008?
Unless I've misunderstood, I think you want something like this (if you're only interested in customers that have 2 specific issues):
SELECT c.*
FROM Customer c
INNER JOIN CustomerEquipment e1 ON c.CustomerId = e1.CustomerId AND e1.Issue = 'Issue 1'
INNER JOIN CustomerEquipment e2 ON c.CustomerId = e2.CustomerId AND e2.Issue = 'Issue 2'
Or, to find any customers that have multiple issues regardless of type:
;WITH Issues AS
(
SELECT CustomerId, COUNT(*)
FROM CustomerEquipment
GROUP BY CustomerId
HAVING COUNT(*) > 1
)
SELECT c.*
FROM Customer c
JOIN Issues i ON c.CustomerId = i.CustomerId
SELECT *
FROM customers as c
LEFT JOIN equipment as e
ON c.customer_id = e.customer_id --> what you are joining on
WHERE (
SELECT COUNT(*)
FROM equipment as e2
WHERE e2.customer.id = c.customer_id
) > 1
You can do it with a sub query instead of Joins:
select * from Customer C where (select Count(*) from Issue I where I.CustomerID = C.CustomerID) < 2
or whatever value you want