Trying to Optimize PostgreSQL Nested WHERE IN - sql

I have a Postgres (9.1) customer database similar to:
customers.id
customers.lastname
customers.firstname
invoices.id
invoices.customerid
invoices.total
invoicelines.id
invoicelines.invoiceid
invoicelines.itemcode
invoicelines.price
I built a search which lists all customers who have purchased a certain item (say 'abc').
Select * from customers WHERE customers.id IN
(Select invoices.customerid FROM invoices WHERE invoices.id IN
(Select invoicelines.invoiceid FROM invoicelines WHERE
invoicelines.itemcode = 'abc')
)
The search works fine and brings up the correct customers but takes about 10 seconds or so on a database of 2 million invoices and 2 million line items.
I was wondering if there was another approach that could trim that down a bit.

An alternative is to use EXISTS:
Select *
from customers
WHERE EXISTS (
Select invoices.customerid
FROM invoices
JOIN invoicelines
ON invoicelines.invoiceid = invoices.id AND
invoicelines.itemcode = 'abc' AND
customers.id = invoices.customerid)

You might switch to using exists instead. I suspect that this might work well:
Select c.*
from customers c
where exists (Select 1
from invoices i join
invoicelines il
on i.id = il.invoiceid and il.itemcode = 'abc'
where c.id = i.customerid
);
For this, you want to be sure you have the right indexes: invoices(customerid, id) and invoicelines(invoiceid, itemcode).
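As a rough check that the EXISTS rewrite returns the same customers as the nested IN query, here is a minimal sketch. It uses Python's built-in sqlite3 with an in-memory database instead of PostgreSQL, so it says nothing about timing on the real data; the index names and the sample rows are assumptions made up for illustration.

```python
import sqlite3

# In-memory SQLite stands in for PostgreSQL; table and column names follow the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, lastname TEXT, firstname TEXT);
    CREATE TABLE invoices (id INTEGER PRIMARY KEY, customerid INTEGER, total REAL);
    CREATE TABLE invoicelines (id INTEGER PRIMARY KEY, invoiceid INTEGER, itemcode TEXT, price REAL);

    -- The two indexes suggested above (index names are hypothetical).
    CREATE INDEX idx_invoices_customer ON invoices (customerid, id);
    CREATE INDEX idx_invoicelines_invoice_item ON invoicelines (invoiceid, itemcode);

    -- Made-up sample rows: customers 1 and 3 bought 'abc', customer 2 did not.
    INSERT INTO customers VALUES (1, 'Smith', 'Ann'), (2, 'Jones', 'Bob'), (3, 'Lee', 'Cy');
    INSERT INTO invoices VALUES (10, 1, 50.0), (11, 1, 20.0), (12, 2, 30.0), (13, 3, 5.0);
    INSERT INTO invoicelines VALUES
        (100, 10, 'abc', 25.0), (101, 11, 'abc', 20.0),
        (102, 12, 'xyz', 30.0), (103, 13, 'abc', 5.0);
""")

nested_in = """
    SELECT id FROM customers WHERE id IN
        (SELECT customerid FROM invoices WHERE id IN
            (SELECT invoiceid FROM invoicelines WHERE itemcode = 'abc'))
    ORDER BY id
"""
with_exists = """
    SELECT c.id FROM customers c
    WHERE EXISTS (SELECT 1
                  FROM invoices i
                  JOIN invoicelines il ON i.id = il.invoiceid AND il.itemcode = 'abc'
                  WHERE c.id = i.customerid)
    ORDER BY c.id
"""
ids_in = [row[0] for row in conn.execute(nested_in)]
ids_exists = [row[0] for row in conn.execute(with_exists)]
print(ids_in, ids_exists)
```

On PostgreSQL itself, running EXPLAIN ANALYZE on both forms is the way to see whether the indexes are actually used.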

Do you want all of the rows and columns in customer where the itemcode for that customer's item is 'abc'? If you join on the customerid then you can find all of the customer information for those items. If you have duplicates within that list you can use DISTINCT which will only give you one entry per customerID.
SELECT
DISTINCT [List of customer columns]
FROM
customers
INNER JOIN
invoices
ON
customers.id = invoices.customerid
INNER JOIN
invoicelines
ON
invoices.id = invoicelines.invoiceid
AND
invoicelines.itemcode = 'abc'
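To see why DISTINCT matters with the join approach, here is a small sketch using Python's sqlite3 as a stand-in database. It assumes the join goes through invoices, since the posted schema has no customerid column on invoicelines; the sample rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, lastname TEXT);
    CREATE TABLE invoices (id INTEGER PRIMARY KEY, customerid INTEGER);
    CREATE TABLE invoicelines (id INTEGER PRIMARY KEY, invoiceid INTEGER, itemcode TEXT);
    INSERT INTO customers VALUES (1, 'Smith'), (2, 'Jones');
    -- Customer 1 bought 'abc' on two separate invoices (made-up data).
    INSERT INTO invoices VALUES (10, 1), (11, 1), (12, 2);
    INSERT INTO invoicelines VALUES (100, 10, 'abc'), (101, 11, 'abc'), (102, 12, 'xyz');
""")

# Same join twice: once without and once with DISTINCT.
join_sql = """
    SELECT {distinct} c.id, c.lastname
    FROM customers c
    INNER JOIN invoices i ON c.id = i.customerid
    INNER JOIN invoicelines il ON i.id = il.invoiceid AND il.itemcode = 'abc'
"""
plain_rows = conn.execute(join_sql.format(distinct="")).fetchall()
distinct_rows = conn.execute(join_sql.format(distinct="DISTINCT")).fetchall()
print(len(plain_rows), distinct_rows)
```

Without DISTINCT the customer appears once per matching line item, which is the duplication the answer warns about.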

Related

SQL SELECT FIELDS IN A DIFFERENT TABLE

I have two tables, one called Orders and the other Invoices. I want to know whether I have any closed orders without an invoice. The tables are joined by ID_Order, so I have this:
select I.ID_Order,O.ID_order from Invoices I
inner join Orders O on o.ID_Order = I.ID_Order
where o.Status='x'
If O.ID_Order is found in Invoices, the order is invoiced.
If O.ID_Order is not found in Invoices, the order is not invoiced.
I want the select statement to return all of the orders that are not invoiced.
Method1:
select *
From Orders
Where ID_Order not in (Select ID_Order from Invoices)
Method 2:
select O.*
from Orders O
Left join
Invoices I
on o.ID_Order = I.ID_Order
where I.ID_Order IS NULL
Method 3:
select *
From Orders as O
Where Not Exists (Select ID_Order from Invoices as I Where I.ID_Order = O.ID_Order)
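The first one is the shortest and often performs well, but NOT IN returns no rows at all if the subquery can produce a NULL ID_Order. Methods 2 and 3 do not have that problem, so compare the execution plans on your database before choosing.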
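The three methods can be compared side by side with a small sketch. It uses Python's sqlite3 rather than the asker's database, the ID_Invoice column and the sample rows are assumptions, and it only checks results, not speed. It also shows how a NULL ID_Order in Invoices affects NOT IN.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Orders (ID_Order INTEGER PRIMARY KEY, Status TEXT);
    -- ID_Invoice is a hypothetical key column; the question only names ID_Order.
    CREATE TABLE Invoices (ID_Invoice INTEGER PRIMARY KEY, ID_Order INTEGER);
    INSERT INTO Orders VALUES (1, 'x'), (2, 'x'), (3, 'x');
    INSERT INTO Invoices VALUES (100, 1);
""")

queries = {
    "not_in": "SELECT ID_Order FROM Orders WHERE ID_Order NOT IN (SELECT ID_Order FROM Invoices)",
    "left_join": """SELECT O.ID_Order FROM Orders O
                    LEFT JOIN Invoices I ON O.ID_Order = I.ID_Order
                    WHERE I.ID_Order IS NULL""",
    "not_exists": """SELECT O.ID_Order FROM Orders O
                     WHERE NOT EXISTS (SELECT 1 FROM Invoices I WHERE I.ID_Order = O.ID_Order)""",
}

def run_all():
    """Run each method and return its sorted list of un-invoiced order IDs."""
    return {name: sorted(r[0] for r in conn.execute(sql)) for name, sql in queries.items()}

before = run_all()
# An invoice row with a NULL ID_Order makes the NOT IN test unknown for every un-invoiced order.
conn.execute("INSERT INTO Invoices VALUES (101, NULL)")
after = run_all()
print(before)
print(after)
```

If Invoices.ID_Order is declared NOT NULL, the three forms return the same rows and the choice comes down to the execution plan.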

Join SQL Statements optimization

I have 2 Tables:
Customer
ID
Customer_ID
Name
Sir_Name
Phone
Email
and
Table Invoice
Manager_Name
Manaer_First_Name
Customer_ID1
Customer_ID2
Customer_ID3
Each customer has at most one Customer.Customer_ID; some customers have none.
The same Customer.Customer_ID can appear several times in Invoice.Customer_ID1.
I would like to get all records in the Customer table joined to the Invoice table: check whether Customer_ID = Customer_ID1; if not, check Customer_ID = Customer_ID2 or Customer_ID = Customer_ID3.
If Customer_ID is found in one of the rows, stop the search.
Probably the best way to write the query is:
select . . .
from customer c join
invoice i
on c.customer_id = coalesce(i.customer_id1, i.customer_id2, i.customer_id3);
This should be able to take advantage of an index on customer(customer_id). If this is not efficient, then another alternative is left join:
select . . ., coalesce(c1.col1, c2.col1, c3.col1) as col1, . . .
from invoice i left join
customer c1
on c1.customer_id = i.customer_id1 left join
customer c2
on c2.customer_id = i.customer_id2 left join
customer c3
on c3.customer_id = i.customer_id3;
The left join can take advantage of an index on customer(customer_id). You need to use coalesce() in the select to choose the field from the right table.
select
*
from [Table Invoice] A
JOIN [Customer] B
ON B.Customer_ID = A.Customer_ID1 OR (B.Customer_ID <> A.Customer_ID1 AND B.Customer_ID = A.Customer_ID2) OR (B.Customer_ID = A.Customer_ID3 AND B.Customer_ID <> A.Customer_ID2 AND B.Customer_ID <> A.Customer_ID1)
This returns all the invoices for all of the customers. If you need the invoices for just one customer, add a
WHERE B.Customer_ID = @YourCustomerID
clause. If you need only the first invoice, add 'TOP 1' to the select statement:
SELECT TOP 1
You could use an inner join with an OR clause:
select Customer.*, Invoice.*
from Customer
inner join Invoice on ( Customer.Customer_ID = Invoice.Customer_ID1
OR Customer.Customer_ID = Invoice.Customer_ID2
OR Customer.Customer_ID = Invoice.Customer_ID3)
This is how I understand your request: You want all customers that have at least one entry in the invoice table. But per customer you want the "best" invoice record only; with ID1 match better than ID2 match and ID2 match better than ID3 match.
So join the tables to get all matches and then rank your matches with row_number giving the best matching record #1. Then only keep those rows ranked #1.
select *
from
(
select
c.*,
i.*,
row_number() over
(
partition by c.customer_id order by
case c.customer_id
when i.customer_id1 then 1
when i.customer_id2 then 2
when i.customer_id3 then 3
end
) as rn
from customer c
join invoice i on c.customer_id in (i.customer_id1, i.customer_id2, i.customer_id3)
)
where rn = 1;
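The ranking idea can be tried out with a small sketch. It uses Python's sqlite3 (window functions need SQLite 3.25 or newer), the sample rows and the `ranked` alias are made up, and it selects only two columns to keep the output readable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # window functions need SQLite 3.25 or newer
conn.executescript("""
    CREATE TABLE customer (customer_id INTEGER, name TEXT);
    CREATE TABLE invoice (manager_name TEXT, customer_id1 INTEGER,
                          customer_id2 INTEGER, customer_id3 INTEGER);
    INSERT INTO customer VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
    INSERT INTO invoice VALUES
        ('m1', 9, 1, 2),     -- Ann matches on ID2, Bob on ID3
        ('m2', 1, 2, NULL),  -- Ann matches on ID1, Bob on ID2
        ('m3', 7, 8, 9);     -- nobody matches
""")

best_match_sql = """
    SELECT name, manager_name
    FROM (
        SELECT c.name, i.manager_name,
               ROW_NUMBER() OVER (
                   PARTITION BY c.customer_id
                   ORDER BY CASE c.customer_id
                                WHEN i.customer_id1 THEN 1
                                WHEN i.customer_id2 THEN 2
                                WHEN i.customer_id3 THEN 3
                            END
               ) AS rn
        FROM customer c
        JOIN invoice i ON c.customer_id IN (i.customer_id1, i.customer_id2, i.customer_id3)
    ) ranked
    WHERE rn = 1
    ORDER BY name
"""
# One row per customer that has any match, keeping the best-ranked invoice.
best = conn.execute(best_match_sql).fetchall()
print(best)
```

Customers with no matching invoice (Cy here) drop out because of the inner join; a left join would keep them with NULL invoice columns.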

SQL Oracle (using AND clause)

When I use the code below, I get the customers who ordered 'planned' or 'obsolete' products, but I want the customers who ordered both types. Changing OR to AND does not work. Please help.
SELECT DISTINCT customers.CUST_EMAIL
,ORDERS.ORDER_ID
,PRODUCT_INFORMATION.PRODUCT_NAME
,PRODUCT_INFORMATION.PRODUCT_STATUS
FROM PRODUCT_INFORMATION
INNER JOIN ORDER_ITEMS ON PRODUCT_INFORMATION.PRODUCT_ID = ORDER_ITEMS.PRODUCT_ID
INNER JOIN ORDERS ON ORDER_ITEMS.ORDER_ID = ORDERS.ORDER_ID
INNER JOIN CUSTOMERS ON CUSTOMERS.CUSTOMER_ID = ORDERS.CUSTOMER_ID
WHERE PRODUCT_INFORMATION.PRODUCT_STATUS = 'planned'
OR PRODUCT_INFORMATION.PRODUCT_STATUS = 'obsolete'
ORDER BY CUSTOMERS.CUST_EMAIL;
I'm guessing that you want the following. If you want a correct answer rather than guesses, please provide a representative set of sample data and the result you expect from it.
First part of the query returns Customers that ordered planned products, second part of the query returns Customers that ordered obsolete products. INTERSECT operator returns only those that have ordered both planned and obsolete products.
You don't need explicit DISTINCT any more, because INTERSECT would do it anyway.
I've removed PRODUCT_INFORMATION.PRODUCT_STATUS from the list of returned columns, because with it the result set would be always empty.
I removed ORDERS.ORDER_ID and PRODUCT_INFORMATION.PRODUCT_NAME from result as well. I don't know what should be the correct query, but it is likely that INTERSECT should be done just on CUSTOMER_ID and then, once you get the list of IDs, you can join other tables to it fetching other related details if needed.
The performance of this method is beyond the scope of the question.
SELECT
CUSTOMERS.CUSTOMER_ID
,customers.CUST_EMAIL
FROM
PRODUCT_INFORMATION
INNER JOIN ORDER_ITEMS ON PRODUCT_INFORMATION.PRODUCT_ID = ORDER_ITEMS.PRODUCT_ID
INNER JOIN ORDERS ON ORDER_ITEMS.ORDER_ID = ORDERS.ORDER_ID
INNER JOIN CUSTOMERS ON CUSTOMERS.CUSTOMER_ID = ORDERS.CUSTOMER_ID
WHERE PRODUCT_INFORMATION.PRODUCT_STATUS = 'planned'
INTERSECT
SELECT
CUSTOMERS.CUSTOMER_ID
,customers.CUST_EMAIL
FROM
PRODUCT_INFORMATION
INNER JOIN ORDER_ITEMS ON PRODUCT_INFORMATION.PRODUCT_ID = ORDER_ITEMS.PRODUCT_ID
INNER JOIN ORDERS ON ORDER_ITEMS.ORDER_ID = ORDERS.ORDER_ID
INNER JOIN CUSTOMERS ON CUSTOMERS.CUSTOMER_ID = ORDERS.CUSTOMER_ID
WHERE PRODUCT_INFORMATION.PRODUCT_STATUS = 'obsolete'
ORDER BY CUST_EMAIL
Without the script for your tables it's difficult to build a test case and a working query, but I'll try with this step:
select customer_id from (
SELECT customers.CUSTOMER_ID
,sum(decode(PRODUCT_INFORMATION.PRODUCT_STATUS, 'obsolete', 1, 0)) obsolete
,sum(decode(PRODUCT_INFORMATION.PRODUCT_STATUS, 'planned', 1, 0)) planned
FROM PRODUCT_INFORMATION
INNER JOIN ORDER_ITEMS ON PRODUCT_INFORMATION.PRODUCT_ID = ORDER_ITEMS.PRODUCT_ID
INNER JOIN ORDERS ON ORDER_ITEMS.ORDER_ID = ORDERS.ORDER_ID
INNER JOIN CUSTOMERS ON CUSTOMERS.CUSTOMER_ID = ORDERS.CUSTOMER_ID
WHERE PRODUCT_INFORMATION.PRODUCT_STATUS = 'planned'
OR PRODUCT_INFORMATION.PRODUCT_STATUS = 'obsolete'
group by customers.CUSTOMER_ID)
where obsolete>0 and planned>0
This query should return every customer ID that has ordered items with both product statuses (the two statuses may appear in different orders). If you want the orders that contain products with both statuses instead, replace customers.customer_id with orders.order_id in the select list and the group by. If you provide a script with sample data, we can give a better answer.
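Both approaches, INTERSECT and conditional counting per customer, can be checked against each other with a small sketch. It uses Python's sqlite3 rather than Oracle, so CASE replaces decode; the sample rows, e-mail addresses, and the 'orderable' status are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, cust_email TEXT);
    CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER);
    CREATE TABLE product_information (product_id INTEGER PRIMARY KEY, product_status TEXT);
    CREATE TABLE order_items (order_id INTEGER, product_id INTEGER);
    INSERT INTO customers VALUES (1, 'a@example.com'), (2, 'b@example.com'), (3, 'c@example.com');
    INSERT INTO product_information VALUES (1, 'planned'), (2, 'obsolete'), (3, 'orderable');
    INSERT INTO orders VALUES (10, 1), (11, 1), (12, 2), (13, 3);
    -- Customer 1: planned and obsolete across two orders; 2: planned only; 3: obsolete only.
    INSERT INTO order_items VALUES (10, 1), (11, 2), (12, 1), (12, 3), (13, 2);
""")

base = """
    SELECT c.customer_id, c.cust_email
    FROM product_information p
    JOIN order_items oi ON p.product_id = oi.product_id
    JOIN orders o ON oi.order_id = o.order_id
    JOIN customers c ON c.customer_id = o.customer_id
"""
# Approach 1: intersect the 'planned' buyers with the 'obsolete' buyers.
intersect_sql = (base + " WHERE p.product_status = 'planned' INTERSECT "
                 + base + " WHERE p.product_status = 'obsolete' ORDER BY cust_email")
# Approach 2: count each status per customer and require at least one of each.
grouped_sql = base + """
    WHERE p.product_status IN ('planned', 'obsolete')
    GROUP BY c.customer_id, c.cust_email
    HAVING SUM(CASE WHEN p.product_status = 'planned' THEN 1 ELSE 0 END) > 0
       AND SUM(CASE WHEN p.product_status = 'obsolete' THEN 1 ELSE 0 END) > 0
    ORDER BY c.cust_email
"""
via_intersect = conn.execute(intersect_sql).fetchall()
via_group = conn.execute(grouped_sql).fetchall()
print(via_intersect, via_group)
```

Note the threshold is "at least one of each status" (> 0); requiring more than one would drop customers who bought exactly one planned and one obsolete product.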

SQL inner join not returning records with blank values

I have 2 different types of records in my 'items' table (postgreSQL database). Some items have invoiceid associated, which has customer information associated. The other items in my items table do not have an invoice number associated.
I am trying to return a list of items with invoice date and customer names. The items that don't have an invoice or customer associated should also show, with those fields left blank. The problem is with my current SQL statement: it only shows the items that have invoice info associated.
select items.ItemID, items.qty, items.description, customers.firstname,
customers.lastname, invoices.InvoiceDate, items.status
from items
inner join Invoices on items.InvoiceID = Invoices.InvoiceID
inner join customers on Invoices.CustomerID = Customers.CustomerID
where Items.Status = 'ONTIME'
ORDER BY InvoiceDate asc
Any ideas how I can get all records to show, or is it even possible? The fields that don't have data are NULL; I'm not sure whether that is part of the problem.
You want to use left outer join instead of inner join:
select i.ItemID, i.qty, i.description, c.firstname,
c.lastname, inv.InvoiceDate, i.status
from items i left outer join
Invoices inv
on i.InvoiceID = inv.InvoiceID left outer join
customers c
on inv.CustomerID = c.CustomerID
where i.Status = 'ONTIME'
order by InvoiceDate asc;
I also introduced table aliases to make the query a bit easier to read.
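The difference between the two join types can be seen in a small sketch. It uses Python's sqlite3 instead of PostgreSQL, and the sample rows (including the invoice date) are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (CustomerID INTEGER PRIMARY KEY, firstname TEXT, lastname TEXT);
    CREATE TABLE invoices (InvoiceID INTEGER PRIMARY KEY, CustomerID INTEGER, InvoiceDate TEXT);
    CREATE TABLE items (ItemID INTEGER PRIMARY KEY, InvoiceID INTEGER, qty INTEGER,
                        description TEXT, status TEXT);
    INSERT INTO customers VALUES (1, 'Ann', 'Smith');
    INSERT INTO invoices VALUES (10, 1, '2020-01-15');
    INSERT INTO items VALUES (100, 10, 2, 'widget', 'ONTIME'),
                             (101, NULL, 1, 'gadget', 'ONTIME');  -- item without an invoice
""")

# Same query with the join keyword swapped in.
query = """
    SELECT i.ItemID, c.lastname, inv.InvoiceDate
    FROM items i
    {join} invoices inv ON i.InvoiceID = inv.InvoiceID
    {join} customers c ON inv.CustomerID = c.CustomerID
    WHERE i.status = 'ONTIME'
    ORDER BY i.ItemID
"""
inner_rows = conn.execute(query.format(join="INNER JOIN")).fetchall()
left_rows = conn.execute(query.format(join="LEFT OUTER JOIN")).fetchall()
print(inner_rows)
print(left_rows)
```

The inner join drops item 101 because its NULL InvoiceID never equals any invoice; the left join keeps it and fills the invoice and customer columns with NULL (None in Python).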

How to do a join with multiple conditions in the second joined table?

I have 2 tables. The first table is a list of customers.
The second table is a list of equipment that those customers own with another field with some data on that customer (customer issue). The problem is that for each customer, there may be multiple issues.
I need to do a join on these tables but only return results of customers having two of these issues.
The trouble is, if I do a join with OR, I get results including customers with only one of these issues.
If I do AND, I don't get any results because each row only includes one condition.
How can I do this in T-SQL 2008?
Unless I've misunderstood, I think you want something like this (if you're only interested in customers that have 2 specific issues):
SELECT c.*
FROM Customer c
INNER JOIN CustomerEquipment e1 ON c.CustomerId = e1.CustomerId AND e1.Issue = 'Issue 1'
INNER JOIN CustomerEquipment e2 ON c.CustomerId = e2.CustomerId AND e2.Issue = 'Issue 2'
Or, to find any customers that have multiple issues regardless of type:
;WITH Issues AS
(
SELECT CustomerId, COUNT(*) AS IssueCount
FROM CustomerEquipment
GROUP BY CustomerId
HAVING COUNT(*) > 1
)
SELECT c.*
FROM Customer c
JOIN Issues i ON c.CustomerId = i.CustomerId
SELECT *
FROM customers as c
LEFT JOIN equipment as e
ON c.customer_id = e.customer_id --> what you are joining on
WHERE (
SELECT COUNT(*)
FROM equipment as e2
WHERE e2.customer_id = c.customer_id
) > 1
You can do it with a subquery instead of joins:
select * from Customer C where (select Count(*) from Issue I where I.CustomerID = C.CustomerID) >= 2
Adjust the threshold to whatever value you want.
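The two readings of the question, "has two specific issues" versus "has more than one issue of any kind", give different results. A minimal sketch using Python's sqlite3 rather than SQL Server, with made-up sample rows and the table names from the first answer:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Customer (CustomerId INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE CustomerEquipment (CustomerId INTEGER, Issue TEXT);
    INSERT INTO Customer VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
    INSERT INTO CustomerEquipment VALUES
        (1, 'Issue 1'), (1, 'Issue 2'),   -- both target issues
        (2, 'Issue 1'), (2, 'Issue 3'),   -- two issues, but only one target
        (3, 'Issue 2');                   -- a single issue
""")

# Reading 1: customers that have both 'Issue 1' and 'Issue 2' (one self-join per issue).
both_specific = conn.execute("""
    SELECT c.CustomerId
    FROM Customer c
    INNER JOIN CustomerEquipment e1 ON c.CustomerId = e1.CustomerId AND e1.Issue = 'Issue 1'
    INNER JOIN CustomerEquipment e2 ON c.CustomerId = e2.CustomerId AND e2.Issue = 'Issue 2'
    ORDER BY c.CustomerId
""").fetchall()

# Reading 2: customers with more than one issue row, regardless of type.
any_multiple = conn.execute("""
    SELECT CustomerId
    FROM CustomerEquipment
    GROUP BY CustomerId
    HAVING COUNT(*) > 1
    ORDER BY CustomerId
""").fetchall()
print(both_specific, any_multiple)
```

If the same issue can be recorded more than once per customer, the self-join returns duplicate rows and the count approach needs COUNT(DISTINCT Issue).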