Join SQL Statements optimization

I have two tables.
Customer:
ID
Customer_ID
Name
Sir_Name
Phone
Email
and Invoice:
Manager_Name
Manaer_First_Name
Customer_ID1
Customer_ID2
Customer_ID3
Each customer has at most one Customer.Customer_ID; some customers have none.
The same Customer.Customer_ID can appear several times in Invoice.Customer_ID1.
I would like to get all records from the Customer table joined to the Invoice table: first check whether Customer_ID = Customer_ID1; if not, check Customer_ID = Customer_ID2, and then Customer_ID = Customer_ID3.
As soon as Customer_ID is found in one of these columns, stop the search.

Probably the best way to write the query is:
select . . .
from customer c join
invoice i
on c.customer_id = coalesce(i.customer_id1, i.customer_id2, i.customer_id3);
This should be able to take advantage of an index on customer(customer_id). If this is not efficient, then another alternative is left join:
select . . ., coalesce(c1.col1, c2.col1, c3.col1) as col1, . . .
from invoice i left join
customer c1
on c1.customer_id = i.customer_id1 left join
customer c2
on c2.customer_id = i.customer_id2 left join
customer c3
on c3.customer_id = i.customer_id3;
The left join can take advantage of an index on customer(customer_id). You need to use coalesce() in the select list to pick each field from whichever customer alias (c1, c2 or c3) actually matched.
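To make the two variants above easier to compare, here is a minimal sketch that runs both against an in-memory SQLite database through Python's sqlite3 module. The sample rows, the index name and the reduced column list are assumptions made up for illustration. Note that coalesce() returns the first non-NULL argument, so the single-join form relies on unused Customer_ID columns being NULL.

```python
import sqlite3

# Hypothetical sample data; only the columns needed for the join are modelled.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (customer_id INTEGER, name TEXT);
    CREATE TABLE invoice (manager_name TEXT, customer_id1 INTEGER,
                          customer_id2 INTEGER, customer_id3 INTEGER);
    CREATE INDEX idx_customer_id ON customer (customer_id);
    INSERT INTO customer VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
    INSERT INTO invoice VALUES
        ('m1', 1, NULL, NULL),
        ('m2', NULL, 2, NULL),
        ('m3', NULL, NULL, 3),
        ('m4', 1, 2, NULL);
""")

# Variant 1: a single join on the first non-NULL customer id of each invoice.
coalesce_rows = conn.execute("""
    SELECT i.manager_name, c.name
    FROM customer c
    JOIN invoice i
      ON c.customer_id = COALESCE(i.customer_id1, i.customer_id2, i.customer_id3)
    ORDER BY i.manager_name
""").fetchall()

# Variant 2: three left joins, then COALESCE picks the first alias that matched.
left_join_rows = conn.execute("""
    SELECT i.manager_name, COALESCE(c1.name, c2.name, c3.name) AS name
    FROM invoice i
    LEFT JOIN customer c1 ON c1.customer_id = i.customer_id1
    LEFT JOIN customer c2 ON c2.customer_id = i.customer_id2
    LEFT JOIN customer c3 ON c3.customer_id = i.customer_id3
    ORDER BY i.manager_name
""").fetchall()

print(coalesce_rows)
print(left_join_rows)
```

On this data both variants resolve invoice m4 to the customer in customer_id1. They can differ when customer_id1 holds a value that does not exist in customer: the coalesce join drops that invoice, while the left-join form falls through to customer_id2.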

select
*
from [Invoice] A
JOIN [Customer] B
ON B.Customer_ID = A.Customer_ID1 OR (B.Customer_ID <> A.Customer_ID1 AND B.Customer_ID = A.Customer_ID2) OR (B.Customer_ID = A.Customer_ID3 AND B.Customer_ID <> A.Customer_ID2 AND B.Customer_ID <> A.Customer_ID1)
This returns all of the invoices for all of the customers. If you need the invoices for just one customer, add a
WHERE B.Customer_ID = @YourCustomerID
clause. If you need only the first invoice, add 'TOP 1' to the select statement:
SELECT TOP 1

You could use an inner join with an OR clause:
select Customer.*, Invoice.*
from Customer
inner join Invoice on ( Customer.Customer_ID = Invoice.Customer_ID1
OR Customer.Customer_ID = Invoice.Customer_ID2
OR Customer.Customer_ID = Invoice.Customer_ID3)

This is how I understand your request: You want all customers that have at least one entry in the invoice table. But per customer you want the "best" invoice record only; with ID1 match better than ID2 match and ID2 match better than ID3 match.
So join the tables to get all matches and then rank your matches with row_number giving the best matching record #1. Then only keep those rows ranked #1.
select *
from
(
select
c.*,
i.*,
row_number() over
(
partition by c.customer_id order by
case c.customer_id
when i.customer_id1 then 1
when i.customer_id2 then 2
when i.customer_id3 then 3
end
) as rn
from customer c
join invoice i on c.customer_id in (i.customer_id1, i.customer_id2, i.customer_id3)
) ranked
where rn = 1;
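The ranking idea can be tried out with a small sketch like the following, which runs the same pattern on SQLite via Python's sqlite3 module. The rows are invented for illustration, the derived-table alias ranked is an assumption, and window functions require SQLite 3.25 or newer.

```python
import sqlite3

# Hypothetical data: customer 1 matches invoice m1 through customer_id2 and
# invoice m2 through customer_id1; customer 2 matches m3 through customer_id3.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (customer_id INTEGER, name TEXT);
    CREATE TABLE invoice (manager_name TEXT, customer_id1 INTEGER,
                          customer_id2 INTEGER, customer_id3 INTEGER);
    INSERT INTO customer VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO invoice VALUES
        ('m1', 9, 1, NULL),
        ('m2', 1, NULL, NULL),
        ('m3', NULL, NULL, 2);
""")

# Rank every match per customer (ID1 before ID2 before ID3), keep rank 1 only.
best_rows = conn.execute("""
    SELECT customer_id, manager_name
    FROM (
        SELECT c.customer_id, i.manager_name,
               ROW_NUMBER() OVER (
                   PARTITION BY c.customer_id
                   ORDER BY CASE c.customer_id
                                WHEN i.customer_id1 THEN 1
                                WHEN i.customer_id2 THEN 2
                                WHEN i.customer_id3 THEN 3
                            END
               ) AS rn
        FROM customer c
        JOIN invoice i
          ON c.customer_id IN (i.customer_id1, i.customer_id2, i.customer_id3)
    ) ranked
    WHERE rn = 1
    ORDER BY customer_id
""").fetchall()

print(best_rows)
```

Customer 1 ends up with m2 because a Customer_ID1 match outranks the Customer_ID2 match on m1. When two invoices tie on rank, which one gets row number 1 is not defined unless a tiebreaker is added to the ORDER BY.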

Related

SQL JOIN get all records that do not match certain criteria

I have two tables, Order and Invoice.
An order can have multiple invoices. Each invoice record has a state: paid or unpaid.
Order    Invoice
O-123    i1 (paid)
O-123    i2 (unpaid)
O-123    i3 (unpaid)
O-456    i4 (paid)
O-456    i4 (paid)
O-678    i5 (paid)
O-678    i6 (paid)
I need to get a list of all orders that have no unpaid invoice. In this case it should return O-456 and O-678.
Sample query
select * from core.order as o
inner join
invoices as inv
on o.id = inv.order_id
where inv.status is paid

You can use not exists for that. (I assumed the status column is a varchar.)
select * from core.order as o
where not exists
(
select 1 from invoices as inv where status='unpaid' and o.id=inv.order_id
)

One canonical approach uses aggregation:
SELECT o.id
FROM core.order o
LEFT JOIN invoices inv
ON inv.order_id = o.id
GROUP BY o.id
HAVING COUNT(CASE WHEN inv.status = 'unpaid' THEN 1 END) = 0;

One method is to use not exists:
select *
from core.order o
where not exists (
select 1
from invoices as inv
where o.id = inv.order_id and inv.status = 'unpaid'
)
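As a quick check of the answers above, this sketch loads the sample orders into an in-memory SQLite database with Python's sqlite3 module and runs both the NOT EXISTS form and the aggregation form. The table is named orders rather than core.order, and the column names are assumptions; the duplicate i4 row from the question is entered once.

```python
import sqlite3

# Sample data from the question; table and column names are assumed.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (id TEXT PRIMARY KEY);
    CREATE TABLE invoices (id TEXT, order_id TEXT, status TEXT);
    INSERT INTO orders VALUES ('O-123'), ('O-456'), ('O-678');
    INSERT INTO invoices VALUES
        ('i1', 'O-123', 'paid'), ('i2', 'O-123', 'unpaid'),
        ('i3', 'O-123', 'unpaid'), ('i4', 'O-456', 'paid'),
        ('i5', 'O-678', 'paid'), ('i6', 'O-678', 'paid');
""")

# Orders for which no unpaid invoice exists.
not_exists_ids = [row[0] for row in conn.execute("""
    SELECT o.id
    FROM orders o
    WHERE NOT EXISTS (
        SELECT 1 FROM invoices inv
        WHERE inv.order_id = o.id AND inv.status = 'unpaid'
    )
    ORDER BY o.id
""")]

# Same question answered by counting unpaid invoices per order.
aggregate_ids = [row[0] for row in conn.execute("""
    SELECT o.id
    FROM orders o
    LEFT JOIN invoices inv ON inv.order_id = o.id
    GROUP BY o.id
    HAVING COUNT(CASE WHEN inv.status = 'unpaid' THEN 1 END) = 0
    ORDER BY o.id
""")]

print(not_exists_ids, aggregate_ids)
```

Both forms also return orders that have no invoices at all, which may or may not be wanted; an inner join in the aggregation form would exclude them.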

Fetch most recent records as part of Joins

I am joining two tables, customer and profile, on the column cust_id. The profile table can hold more than one entry per customer. When joining, I want to select only the most recent profile entry by start_ts, so that the result has one row per customer: the customer row plus its most recent profile row. Is there a way to do this in Oracle SQL?

I would use window functions:
select . . .
from customer c join
(select p.*,
row_number() over (partition by cust_id order by start_ts desc) as seqnum
from profile p
) p
on c.cust_id = p.cust_id and p.seqnum = 1;
You can use a left join if you like to get customers that don't have profiles as well.

One way (which works for all DB engines) is to join the tables you want to select data from and then join against the max-record of profile per customer to filter the rows:
select c.*, p.*
from customer c
join profile p on c.cust_id = p.cust_id
join
(
select cust_id, max(start_ts) as maxts
from profile
group by cust_id
) p2 on p.cust_id = p2.cust_id and p.start_ts = p2.maxts

Here is another way (if there exists no newer entry then it's the newest):
select
c.*,
p.*
from
customer c inner join
profile p on p.cust_id = c.cust_id and not exists(
select *
from profile
where cust_id = c.cust_id and start_ts > p.start_ts
)
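The following sketch compares the window-function approach and the NOT EXISTS approach on a tiny data set, using SQLite through Python's sqlite3 module instead of Oracle. The name and label columns and all rows are hypothetical, and window functions require SQLite 3.25 or newer.

```python
import sqlite3

# Hypothetical rows; start_ts is stored as ISO text so it sorts chronologically.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (cust_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE profile (cust_id INTEGER, start_ts TEXT, label TEXT);
    INSERT INTO customer VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO profile VALUES
        (1, '2020-01-01', 'old'),
        (1, '2021-01-01', 'new'),
        (2, '2019-05-05', 'only');
""")

# Number each customer's profiles from newest to oldest and keep number 1.
window_rows = conn.execute("""
    SELECT c.name, p.label
    FROM customer c
    JOIN (
        SELECT pr.*,
               ROW_NUMBER() OVER (PARTITION BY pr.cust_id
                                  ORDER BY pr.start_ts DESC) AS seqnum
        FROM profile pr
    ) p ON c.cust_id = p.cust_id AND p.seqnum = 1
    ORDER BY c.cust_id
""").fetchall()

# Keep a profile row only if no newer row exists for the same customer.
not_exists_rows = conn.execute("""
    SELECT c.name, p.label
    FROM customer c
    JOIN profile p
      ON p.cust_id = c.cust_id
     AND NOT EXISTS (
         SELECT 1 FROM profile newer
         WHERE newer.cust_id = c.cust_id AND newer.start_ts > p.start_ts
     )
    ORDER BY c.cust_id
""").fetchall()

print(window_rows, not_exists_rows)
```

The two differ when a customer has several profiles sharing the same latest start_ts: row_number() keeps exactly one of them, while the NOT EXISTS and max() forms return all tied rows.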

left join two tables on a non-unique column in right table

I have two tables in SQL Server and I want to select and join some data from them. The first table holds customers:
---------------
customer  id
Dave      1
Tom       2
---------------
The second table is a table of purchases; it lists the latest purchases with cost and which customer bought each product:
------------------
product  date      customer id
PC       1-1-2000  1
phone    2-3-2000  2
laptop   3-1-2000  1
------------------
I want to select the first table (customer info) together with the date of each customer's last purchase.
I tried a left join, but that doesn't give me the last purchase because customer id is not unique in the second table. How can I do this with a SQL Server query? Regards

If you just want the max date, use aggregation. I would recommend a left join for customers who have made no purchases:
select c.customer, c.id, max(p.date)
from customers c left join
purchases p
on c.id = p.customer_id
group by c.customer, c.id;

Use the not exists clause for the win!
select c.customer, p.*
from Customer as c
inner join Purchase as p
on p.customer_id = c.id
where not exists (
select 1
from Purchase as p2
where p2.customer_id = p.customer_id
and p2.date > p.date
)

I think you can use an inner join with group by:
select table1.customer, table1.id, max(table2.date)
from table1
inner join table2 on table1.id = table2.customer_id
group by table1.customer, table1.id
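To see how the aggregation and NOT EXISTS answers behave on the sample data, here is a small sketch using SQLite through Python's sqlite3 module rather than SQL Server. The table names customers and purchases, the ISO rendering of the dates, and the extra customer Eve (who has no purchases) are assumptions added for illustration.

```python
import sqlite3

# Sample rows from the question, with dates rewritten as ISO text (an assumed
# reading of the original format) plus a hypothetical customer without purchases.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (customer TEXT, id INTEGER PRIMARY KEY);
    CREATE TABLE purchases (product TEXT, date TEXT, customer_id INTEGER);
    INSERT INTO customers VALUES ('Dave', 1), ('Tom', 2), ('Eve', 3);
    INSERT INTO purchases VALUES
        ('PC', '2000-01-01', 1),
        ('phone', '2000-02-03', 2),
        ('laptop', '2000-03-01', 1);
""")

# Latest purchase date per customer; the left join keeps Eve with a NULL date.
last_dates = conn.execute("""
    SELECT c.customer, MAX(p.date) AS last_date
    FROM customers c
    LEFT JOIN purchases p ON c.id = p.customer_id
    GROUP BY c.customer, c.id
    ORDER BY c.id
""").fetchall()

# Full latest purchase row per customer: no later purchase may exist.
last_products = conn.execute("""
    SELECT c.customer, p.product
    FROM customers c
    JOIN purchases p ON p.customer_id = c.id
    WHERE NOT EXISTS (
        SELECT 1 FROM purchases p2
        WHERE p2.customer_id = p.customer_id AND p2.date > p.date
    )
    ORDER BY c.id
""").fetchall()

print(last_dates)
print(last_products)
```

The aggregation form only yields the date, while the NOT EXISTS form gives access to the other purchase columns such as product, at the cost of dropping customers with no purchases when an inner join is used.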

Trying to Optimize PostgreSQL Nested WHERE IN

I have a Postgres (9.1) customer database similar to:
customers.id
customers.lastname
customers.firstname
invoices.id
invoices.customerid
invoices.total
invoicelines.id
invoicelines.invoiceid
invoicelines.itemcode
invoicelines.price
I built a search which lists all customers who have purchased a certain item (say 'abc').
Select * from customers WHERE customers.id IN
(Select invoices.customerid FROM invoices WHERE invoices.id IN
(Select invoicelines.invoiceid FROM invoicelines WHERE
invoicelines.itemcode = 'abc')
)
The search works fine and brings up the correct customers but takes about 10 seconds or so on a database of 2 million invoices and 2 million line items.
I was wondering if there was another approach that could trim that down a bit.

An alternative is to use EXISTS:
Select *
from customers
WHERE EXISTS (
Select invoices.customerid
FROM invoices
JOIN invoicelines
ON invoicelines.invoiceid = invoices.id AND
invoicelines.itemcode = 'abc' AND
customers.id = invoices.customerid)

You might switch to using exists instead. I suspect that this might work well:
Select c.*
from customers c
where exists (Select 1
from invoices i join
invoicelines il
on i.id = il.invoiceid and il.itemcode = 'abc'
where c.id = i.customerid
);
For this, you want to be sure you have the right indexes: invoices(customerid, id) and invoicelines(invoiceid, itemcode).
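A minimal sketch of this setup, run on SQLite through Python's sqlite3 module rather than Postgres, is shown below. It creates the two suggested indexes and checks that the EXISTS form returns the same customers as the original nested IN query. The index names and sample rows are made up, and the sketch says nothing about actual timings on a database with millions of rows.

```python
import sqlite3

# Schema from the question with hypothetical rows and index names.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, lastname TEXT, firstname TEXT);
    CREATE TABLE invoices (id INTEGER PRIMARY KEY, customerid INTEGER, total REAL);
    CREATE TABLE invoicelines (id INTEGER PRIMARY KEY, invoiceid INTEGER,
                               itemcode TEXT, price REAL);
    CREATE INDEX idx_invoices_customer ON invoices (customerid, id);
    CREATE INDEX idx_lines_invoice_item ON invoicelines (invoiceid, itemcode);
    INSERT INTO customers VALUES (1, 'Smith', 'Al'), (2, 'Jones', 'Bo'), (3, 'Lee', 'Cy');
    INSERT INTO invoices VALUES (10, 1, 5.0), (11, 2, 7.0), (12, 1, 3.0);
    INSERT INTO invoicelines VALUES
        (100, 10, 'abc', 5.0), (101, 11, 'xyz', 7.0), (102, 12, 'abc', 3.0);
""")

# Original nested IN query from the question.
nested_in_ids = [row[0] for row in conn.execute("""
    SELECT id FROM customers
    WHERE id IN (
        SELECT customerid FROM invoices
        WHERE id IN (SELECT invoiceid FROM invoicelines WHERE itemcode = 'abc')
    )
    ORDER BY id
""")]

# Correlated EXISTS rewrite.
exists_ids = [row[0] for row in conn.execute("""
    SELECT c.id
    FROM customers c
    WHERE EXISTS (
        SELECT 1
        FROM invoices i
        JOIN invoicelines il
          ON i.id = il.invoiceid AND il.itemcode = 'abc'
        WHERE c.id = i.customerid
    )
    ORDER BY c.id
""")]

print(nested_in_ids, exists_ids)
```

Customer 1 bought 'abc' on two invoices but appears once in both results, since neither IN nor EXISTS multiplies the outer rows.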

Do you want all of the rows and columns in customers where the itemcode of one of that customer's invoice lines is 'abc'? If you join customers to invoices on customerid, and invoices to invoicelines on invoiceid, you can find all of the customer information for those items. If you have duplicates within that list you can use DISTINCT, which will only give you one entry per customer.
SELECT
DISTINCT [List of customer columns]
FROM
customers
INNER JOIN
invoices
ON
customers.id = invoices.customerid
INNER JOIN
invoicelines
ON
invoices.id = invoicelines.invoiceid
AND
invoicelines.itemcode = 'abc'

How to do a join with multiple conditions in the second joined table?

I have 2 tables. The first table is a list of customers.
The second table is a list of equipment that those customers own with another field with some data on that customer (customer issue). The problem is that for each customer, there may be multiple issues.
I need to do a join on these tables but only return results of customers having two of these issues.
The trouble is, if I do a join with OR, I get results including customers with only one of these issues.
If I do AND, I don't get any results because each row only includes one condition.
How can I do this in T-SQL 2008?

Unless I've misunderstood, I think you want something like this (if you're only interested in customers that have 2 specific issues):
SELECT c.*
FROM Customer c
INNER JOIN CustomerEquipment e1 ON c.CustomerId = e1.CustomerId AND e1.Issue = 'Issue 1'
INNER JOIN CustomerEquipment e2 ON c.CustomerId = e2.CustomerId AND e2.Issue = 'Issue 2'
Or, to find any customers that have multiple issues regardless of type:
;WITH Issues AS
(
SELECT CustomerId, COUNT(*) AS IssueCount
FROM CustomerEquipment
GROUP BY CustomerId
HAVING COUNT(*) > 1
)
SELECT c.*
FROM Customer c
JOIN Issues i ON c.CustomerId = i.CustomerId
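The two queries above answer different questions, which the following sketch tries to make concrete using SQLite through Python's sqlite3 module instead of T-SQL. The sample rows are hypothetical, and the third query (GROUP BY with COUNT(DISTINCT ...)) is an alternative not taken from the answer, added for comparison.

```python
import sqlite3

# Hypothetical data: customer 1 has both issues, customer 2 has one issue,
# customer 3 has the same issue recorded twice.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Customer (CustomerId INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE CustomerEquipment (CustomerId INTEGER, Issue TEXT);
    INSERT INTO Customer VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
    INSERT INTO CustomerEquipment VALUES
        (1, 'Issue 1'), (1, 'Issue 2'),
        (2, 'Issue 1'),
        (3, 'Issue 1'), (3, 'Issue 1');
""")

# Join the equipment table once per required issue.
both_issues = [row[0] for row in conn.execute("""
    SELECT c.CustomerId
    FROM Customer c
    JOIN CustomerEquipment e1 ON c.CustomerId = e1.CustomerId AND e1.Issue = 'Issue 1'
    JOIN CustomerEquipment e2 ON c.CustomerId = e2.CustomerId AND e2.Issue = 'Issue 2'
    ORDER BY c.CustomerId
""")]

# Alternative (not from the answer): count the distinct required issues.
both_issues_grouped = [row[0] for row in conn.execute("""
    SELECT c.CustomerId
    FROM Customer c
    JOIN CustomerEquipment e ON e.CustomerId = c.CustomerId
    WHERE e.Issue IN ('Issue 1', 'Issue 2')
    GROUP BY c.CustomerId
    HAVING COUNT(DISTINCT e.Issue) = 2
    ORDER BY c.CustomerId
""")]

# Customers with more than one issue row, regardless of type.
any_multiple = [row[0] for row in conn.execute("""
    SELECT CustomerId
    FROM CustomerEquipment
    GROUP BY CustomerId
    HAVING COUNT(*) > 1
    ORDER BY CustomerId
""")]

print(both_issues, both_issues_grouped, any_multiple)
```

Customer 3 shows up only in the last result: two rows, but not two different issues. The double-join form can also return a customer more than once when the same issue is recorded on several equipment rows; the grouped form does not.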

SELECT *
FROM customers as c
LEFT JOIN equipment as e
ON c.customer_id = e.customer_id -- what you are joining on
WHERE (
SELECT COUNT(*)
FROM equipment as e2
WHERE e2.customer_id = c.customer_id
) > 1

You can do it with a subquery instead of joins:
select * from Customer C where (select Count(*) from Issue I where I.CustomerID = C.CustomerID) >= 2
or whatever threshold you want.

We Keep Coding
