SQL NOT IN optimization

I'm having trouble optimizing a query. Here are two example tables I am working with:
Table 1:
UID
A
B
Table 2:
UID Parent
A 2
B 2
C 3
D 2
E 3
F 2
Here is what I am doing now:
SELECT R.UID
FROM Table1 R
INNER JOIN Table2 T ON
R.UID = T.UID
INNER JOIN Table2 E ON
T.PARENT = E.PARENT
AND E.UID NOT IN (SELECT UID FROM Table1)
I'm trying to avoid using the NOT IN clause because of obvious hindrances in performance for large numbers of records.
I know the typical ways to avoid NOT IN clauses like the LEFT JOIN where the other table is null, but can't seem to get what I want with all of the other Joins going on.
I will continue working and post if I find a solution.
EDIT: Here is what I am trying to end up with
After the first Inner Join I would have
A
B
After the second inner join I would have:
A D
A F
B D
B F
The second column above is just to represent that it is matching to the other UIDs with the same parent, but I still need the As and Bs as the UID.
EDIT: RDBMS is SQL server 2005, 2008r2, 2012
Table1 is declared in the query with no index
DECLARE @Table1 TABLE ( [UNIQUE_ID] INT PRIMARY KEY )
Table2 has a clustered index on Unique ID

The general approach is to LEFT JOIN back to the table you want to exclude, with a WHERE clause that keeps only the rows that found no match:
SELECT R.UID
FROM Table1 R
JOIN Table2 T ON R.UID = T.UID
JOIN Table2 E ON T.PARENT = E.PARENT
LEFT JOIN Table1 E2 ON E.UID = E2.UID
WHERE E2.UID IS NULL

SELECT Table2.*
FROM Table2
INNER JOIN (
SELECT UID FROM Table2
EXCEPT
SELECT UID FROM Table1
) AS Filter ON (Table2.UID = Filter.UID)
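As a quick sanity check on the rewrites above, here is a minimal sketch that loads the sample rows from the question into an in-memory SQLite database (through Python's sqlite3, standing in for SQL Server, so query plans and performance will differ) and compares the NOT IN form with LEFT JOIN ... IS NULL and NOT EXISTS. The second selected column is only there to show which sibling UID was matched.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Table1 (UID TEXT PRIMARY KEY);
    CREATE TABLE Table2 (UID TEXT PRIMARY KEY, Parent INTEGER);
    INSERT INTO Table1 VALUES ('A'), ('B');
    INSERT INTO Table2 VALUES
        ('A', 2), ('B', 2), ('C', 3), ('D', 2), ('E', 3), ('F', 2);
""")

# Original form: exclude siblings that are themselves in Table1.
not_in_sql = """
    SELECT R.UID, E.UID
    FROM Table1 R
    JOIN Table2 T ON R.UID = T.UID
    JOIN Table2 E ON T.Parent = E.Parent
    WHERE E.UID NOT IN (SELECT UID FROM Table1)
    ORDER BY R.UID, E.UID
"""

# Anti-join: LEFT JOIN back to Table1 and keep rows with no match.
left_join_sql = """
    SELECT R.UID, E.UID
    FROM Table1 R
    JOIN Table2 T ON R.UID = T.UID
    JOIN Table2 E ON T.Parent = E.Parent
    LEFT JOIN Table1 E2 ON E.UID = E2.UID
    WHERE E2.UID IS NULL
    ORDER BY R.UID, E.UID
"""

# NOT EXISTS: same filter written as a correlated subquery.
not_exists_sql = """
    SELECT R.UID, E.UID
    FROM Table1 R
    JOIN Table2 T ON R.UID = T.UID
    JOIN Table2 E ON T.Parent = E.Parent
    WHERE NOT EXISTS (SELECT 1 FROM Table1 X WHERE X.UID = E.UID)
    ORDER BY R.UID, E.UID
"""

not_in_rows = conn.execute(not_in_sql).fetchall()
left_join_rows = conn.execute(left_join_sql).fetchall()
not_exists_rows = conn.execute(not_exists_sql).fetchall()
print(left_join_rows)
```

All three return the A/D, A/F, B/D, B/F pairs from the question. Note that NOT IN only behaves like the other two when the subquery returns no NULLs, which holds for this sample data.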

Related

Most efficient way to join two tables on multiple fields?

I'm working with an Oracle SQL DB and attempting to join 2 tables together. My issue is that there are 3 different dimensions (4 total fields) upon which the two tables may be joined and I'm looking to identify all records where any one of those methods delivers a match and then pull in a certain field from that 2nd table in those instances.
My current plan is as follows:
SELECT a.*,
CASE
WHEN b.field_1 IS NOT NULL THEN b.field_5
WHEN c.field_2 IS NOT NULL THEN c.field_5
WHEN d.field_3 IS NOT NULL THEN c.field_5
END AS match
FROM table_1 a
LEFT JOIN table_2 b ON a.field_1 = b.field_1
LEFT JOIN table_3 c ON a.field_2 = c.field_2
LEFT JOIN table_4 d ON a.field_3 = d.field_3 AND a.field_4 = d.field_4
I believe this will give me the results I'm looking for, but I imagine this isn't the most efficient way to accomplish that. Any thoughts on a better approach?
[TL;DR] Your query is fine.
You need to use JOINs to correlate the relationships between the four tables.
If you want to be able to include rows from the driving table when there are no rows in the related tables then the join wants to be an OUTER JOIN.
If you put the driving table first then it will be a LEFT OUTER JOIN (or just LEFT JOIN)
You do not have much option on this.
If you want to get the field_5 values then you either want:
SELECT a.*,
b.field_5 AS b_match,
c.field_5 AS c_match,
d.field_5 AS d_match
FROM table_1 a
LEFT JOIN table_2 b ON a.field_1 = b.field_1
LEFT JOIN table_3 c ON a.field_2 = c.field_2
LEFT JOIN table_4 d ON a.field_3 = d.field_3 AND a.field_4 = d.field_4
If you want all the matches.
Or, you want to use your query:
SELECT a.*,
CASE
WHEN b.field_1 IS NOT NULL THEN b.field_5
WHEN c.field_2 IS NOT NULL THEN c.field_5
WHEN d.field_3 IS NOT NULL THEN c.field_5 -- Should this be d.field_5?
END AS match
FROM table_1 a
LEFT JOIN table_2 b ON a.field_1 = b.field_1
LEFT JOIN table_3 c ON a.field_2 = c.field_2
LEFT JOIN table_4 d ON a.field_3 = d.field_3 AND a.field_4 = d.field_4
If you want to get a single match in preference order of tables b, c and then d.
If you are using Oracle 12 or later, a third alternative could be to use UNION ALL in a LATERAL join:
SELECT a.*, l.field_5
FROM table_1 a
LEFT OUTER JOIN LATERAL (
SELECT 1 AS priority, b.field_5
FROM table_2 b
WHERE a.field_1 = b.field_1
UNION ALL
SELECT 2 AS priority, c.field_5
FROM table_3 c
WHERE a.field_2 = c.field_2
UNION ALL
SELECT 3 AS priority, d.field_5
FROM table_4 d
WHERE a.field_3 = d.field_3
AND a.field_4 = d.field_4
ORDER BY priority ASC
FETCH FIRST ROW WITH TIES
) l
ON (1 = 1)
This may reduce the number of duplicate rows, because it avoids the multiple JOINs (whose extra rows you are potentially ignoring with your CASE expression), but you should test whether it returns your desired results and whether it is more or less performant.
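To make the "single match in preference order" behaviour concrete, here is a small sketch of the CASE version run against SQLite through Python's sqlite3 (not Oracle, and the LATERAL variant is not shown). The id column, the sample rows, and the matched_field_5 alias are all made up for illustration; the alias is renamed to stay clear of SQLite's MATCH keyword, and the third branch reads from d, on the assumption that this is what was intended.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table_1 (id INTEGER, field_1 TEXT, field_2 TEXT,
                          field_3 TEXT, field_4 TEXT);
    CREATE TABLE table_2 (field_1 TEXT, field_5 TEXT);
    CREATE TABLE table_3 (field_2 TEXT, field_5 TEXT);
    CREATE TABLE table_4 (field_3 TEXT, field_4 TEXT, field_5 TEXT);

    -- Hypothetical rows: each of the first three matches a different table.
    INSERT INTO table_1 VALUES
        (1, 'k1', 'zz', 'zz', 'zz'),
        (2, 'zz', 'k2', 'zz', 'zz'),
        (3, 'zz', 'zz', 'k3', 'k4'),
        (4, 'zz', 'zz', 'zz', 'zz');
    INSERT INTO table_2 VALUES ('k1', 'from_b');
    INSERT INTO table_3 VALUES ('k2', 'from_c');
    INSERT INTO table_4 VALUES ('k3', 'k4', 'from_d');
""")

# First non-NULL join wins, in the order b, c, d.
priority_sql = """
    SELECT a.id,
           CASE
               WHEN b.field_1 IS NOT NULL THEN b.field_5
               WHEN c.field_2 IS NOT NULL THEN c.field_5
               WHEN d.field_3 IS NOT NULL THEN d.field_5
           END AS matched_field_5
    FROM table_1 a
    LEFT JOIN table_2 b ON a.field_1 = b.field_1
    LEFT JOIN table_3 c ON a.field_2 = c.field_2
    LEFT JOIN table_4 d ON a.field_3 = d.field_3 AND a.field_4 = d.field_4
    ORDER BY a.id
"""
priority_rows = conn.execute(priority_sql).fetchall()
print(priority_rows)
```

Row 4 matches nothing, so its CASE falls through every branch and yields NULL (None in Python), which is the LEFT JOIN behaviour the answer describes.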

SQL Server: how to write a DELETE statement with a GROUP BY

I am using SQL Server 2008.
I have a SELECT query as follows:
SELECT
Apples.ID, COUNT(p.Apples_ID)
FROM
Apples
LEFT JOIN
Table_B tb ON Apples.ID = tb.Apples_ID
LEFT JOIN
Table_C tc ON tb.xID = tc.xID
LEFT JOIN
Pips p ON tb.Apples_ID = p.Apples_ID
WHERE
tc.X IS NULL
GROUP BY
Apples.ID
The tables are:
Apples which has a unique entry (ID) for each Apple.
Pips which can have dozens of pips belonging to 1 Apple
Table_B and Table_C are mapping tables to refine the search
I need to group the results because I do not want an Apples result for each and every Pip that apples can have. The SELECT statement works and returns a list of unique Apple IDs
I now want to DELETE these Apples. I changed my statement to:
DELETE Apples
FROM Apples
LEFT JOIN Table_B tb ON Apples.ID = tb.Apples_ID
LEFT JOIN Table_C tc ON tb.xID = tc.xID
LEFT JOIN Pips p ON tb.Apples_ID = p.Apples_ID
WHERE tc.X IS NULL
GROUP BY Apples.ID
but got a syntax error on the GROUP BY.
I tried:
DELETE x
FROM
(SELECT Apples.ID
FROM Apples
LEFT JOIN Table_B tb ON Apples.ID = tb.Apples_ID
LEFT JOIN Table_C tc ON tb.xID = tc.xID
LEFT JOIN Pips p ON tb.Apples_ID = p.Apples_ID
WHERE tc.X IS NULL
GROUP BY Apples.ID) x;
But I got an error:
View or function not updatable because the modification affects multiple base tables
How can I delete these rows I have identified in the SELECT, without using a temporary table or script?
As others have pointed out, the sub-query approach can be adapted to work by using an IN ( ... ) clause on a normal single-table delete. This is the simplest way of adapting any select statement to a delete:
DELETE FROM Apples
WHERE ID IN (
-- Sub-query selecting a single column of ID values
)
The sub-query can then be as complex as you like, using GROUP BY, HAVING, etc, as long as it only has one column in the SELECT list.
In your specific case, however, there is no need:
You have no HAVING clause, so the COUNT() doesn't change the rows to delete
The LEFT JOIN to the Pips table has no effect on the result other than the COUNT()
Mentioning the same row twice in a DELETE has no effect, so eliminating duplicates is unnecessary
You can therefore simplify this particular case without using the sub-query:
DELETE Apples
FROM Apples
LEFT JOIN Table_B tb ON Apples.ID = tb.Apples_ID
LEFT JOIN Table_C tc ON tb.xID = tc.xID
WHERE tc.X IS NULL
DELETE FROM Apples WHERE ID IN
(
SELECT a.ID FROM Apples a
LEFT JOIN Table_B tb ON a.ID = tb.Apples_ID
LEFT JOIN Table_C tc ON tb.xID = tc.xID
LEFT JOIN Pips p ON tb.Apples_ID = p.Apples_ID
WHERE tc.X IS NULL
GROUP BY a.ID
);
Are you trying to achieve this:
DELETE FROM APPLES WHERE ID IN
(
SELECT Apples.ID FROM Apples
LEFT JOIN Table_B tb ON Apples.ID = tb.Apples_ID
LEFT JOIN Table_C tc ON tb.xID = tc.xID
LEFT JOIN Pips p ON tb.Apples_ID = p.Apples_ID
WHERE tc.X IS NULL
GROUP BY Apples.ID
);
The only condition that affects which rows are selected is tc.X IS NULL. It can be NULL either because there is no matching row, or because there is a match but the field X itself is NULL:
DELETE FROM Apples
WHERE ID IN
(
SELECT Apples.ID FROM Apples
LEFT JOIN Table_B tb ON Apples.ID = tb.Apples_ID
LEFT JOIN Table_C tc ON tb.xID = tc.xID
WHERE tc.X IS NULL
);
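For anyone who wants to try the IN ( ... ) delete from the answers above, here is a minimal sketch using an in-memory SQLite database through Python's sqlite3. SQLite does not support the T-SQL DELETE ... FROM ... JOIN form, so only the sub-query form is shown. The column types and sample rows are invented for illustration: apple 1 maps to a Table_C row with a non-NULL X, apple 2 maps to one where X is NULL, and apple 3 has no mapping at all.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Apples  (ID INTEGER PRIMARY KEY);
    CREATE TABLE Table_B (Apples_ID INTEGER, xID INTEGER);
    CREATE TABLE Table_C (xID INTEGER, X TEXT);
    CREATE TABLE Pips    (Apples_ID INTEGER);

    -- Hypothetical sample data.
    INSERT INTO Apples  VALUES (1), (2), (3);
    INSERT INTO Table_B VALUES (1, 10), (2, 20);
    INSERT INTO Table_C VALUES (10, 'v'), (20, NULL);
    INSERT INTO Pips    VALUES (1), (1), (2);
""")

# Single-table DELETE driven by a one-column sub-query. The join to Pips and
# the GROUP BY are left out because they do not change which IDs qualify.
delete_sql = """
    DELETE FROM Apples
    WHERE ID IN (
        SELECT a.ID
        FROM Apples a
        LEFT JOIN Table_B tb ON a.ID = tb.Apples_ID
        LEFT JOIN Table_C tc ON tb.xID = tc.xID
        WHERE tc.X IS NULL
    )
"""
deleted_count = conn.execute(delete_sql).rowcount
remaining = conn.execute("SELECT ID FROM Apples ORDER BY ID").fetchall()
print(deleted_count, remaining)
```

Apples 2 and 3 are removed and apple 1 survives, which shows both ways tc.X can end up NULL: a matched row whose X is NULL, and no matched row at all.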

Joining two tables where id does not equal

I'm struggling getting this query to produce the results I want.
I have:
table1, columns=empid, alt_id
table2, columns=empid, alt_id
I want to get the empid, and alt_id from table 1 where the alt_id does not match the alt_id in table2. They will both have alt_id numbers I just want to get the ones that do not match.
Any ideas?
SELECT * FROM table1
INNER JOIN table2 ON table2.empid = table1.empid AND table2.alt_id <> table1.alt_id
What does that really mean though? Normally when this is asked, it is of the form "I want all rows from A that have no row matching in B and all in B that have no match in A"
Which looks like this:
SELECT * FROM
A
FULL OUTER JOIN
B
ON
a.id = b.id
You'll see a null for any row data where there isn't a matching row on the other side:
A.id
1
2
B.id
1
3
Result of full outer join:
A.id B.id
1 1
2 null
null 3
You, however have asked for A-B join where the IDs aren't equal, which would be the more useless query of:
SELECT * FROM
A
INNER JOIN
B
ON
a.id != b.id
And it would look like:
A.id B.id
1 3
2 1
2 3
You seem to want not exists:
select t1.*
from table1 t1
where not exists (select 1 from table2 t2 where t2.alt_id = t1.alt_id);
It is unclear whether or not you also want to join on empid, so you might really want:
select t1.*
from table1 t1
where not exists (select 1 from table2 t2 where t2.alt_id = t1.alt_id and t2.empid = t1.empid);
A left join returns every record in table A, with NULLs in the table B columns wherever there is no match. Add a WHERE filter that keeps only those NULLs and you get all the records in table A that do not have a matching ID in table B.
SELECT a.*
FROM table1 a
LEFT JOIN table2 b
ON a.alt_id = b.alt_id
WHERE b.alt_id IS NULL;
select *
from [Login] L
where L.EmployeeID not in (select EmployeeID from Employee)
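Since the answers above read the question in different ways, here is a small sketch that runs the main interpretations side by side on an in-memory SQLite database through Python's sqlite3. The sample rows are invented for illustration; the table and column names are the ones from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (empid INTEGER, alt_id INTEGER);
    CREATE TABLE table2 (empid INTEGER, alt_id INTEGER);
    -- Hypothetical sample data.
    INSERT INTO table1 VALUES (1, 100), (2, 200), (3, 300);
    INSERT INTO table2 VALUES (1, 100), (2, 999), (3, 100), (5, 300);
""")

# Interpretation 1: same empid in both tables, but a different alt_id.
same_emp_diff_alt = conn.execute("""
    SELECT t1.empid, t1.alt_id
    FROM table1 t1
    JOIN table2 t2 ON t2.empid = t1.empid AND t2.alt_id <> t1.alt_id
    ORDER BY t1.empid
""").fetchall()

# Interpretation 2: alt_id does not appear anywhere in table2.
alt_missing_not_exists = conn.execute("""
    SELECT t1.empid, t1.alt_id
    FROM table1 t1
    WHERE NOT EXISTS (SELECT 1 FROM table2 t2 WHERE t2.alt_id = t1.alt_id)
    ORDER BY t1.empid
""").fetchall()

# Same as interpretation 2, written as LEFT JOIN ... IS NULL.
alt_missing_left_join = conn.execute("""
    SELECT t1.empid, t1.alt_id
    FROM table1 t1
    LEFT JOIN table2 t2 ON t2.alt_id = t1.alt_id
    WHERE t2.alt_id IS NULL
    ORDER BY t1.empid
""").fetchall()

# Interpretation 3: no table2 row with the same (empid, alt_id) pair.
pair_missing = conn.execute("""
    SELECT t1.empid, t1.alt_id
    FROM table1 t1
    WHERE NOT EXISTS (SELECT 1 FROM table2 t2
                      WHERE t2.alt_id = t1.alt_id AND t2.empid = t1.empid)
    ORDER BY t1.empid
""").fetchall()

print(same_emp_diff_alt, alt_missing_not_exists, pair_missing)
```

Employee 3 is the interesting row: its alt_id 300 exists in table2, but under a different empid, so it is reported by interpretations 1 and 3 and not by interpretation 2.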

How to use SQL Joins and Selects to get results from a third table that matches the first two?

I have a Join statement on two tables(Table 1 and 2), which returns the City and State. I have another table(Table 3) which contains columns like Name, City, State, Country. I want to fetch all the rows from Table 3 whose City and State Columns matches with the rows of the Join result.
Select * from 3rdTable where City='' AND State='';
Result from Join is like
- City | State
- A | B
- C | D
- E | F
Example Result if only 1 row of the 3rd table matches
- C | D
How can this be done?
You can use the joined tables as a subquery in the WHERE clause on 3rdTable, as follows:
select *
from 3rdTable
where City+'|'+State in (select a.City+'|'+b.State
from a
inner join b
on a.x=b.y)
By concatenating the fields, you can compare both columns against the joined subquery in a single condition. Use IN rather than = because the subquery can return more than one row.
Be sure which join you need: there are inner, left, right, and full outer joins, and knowing the difference may help you answer your question.
Also, the code in the question is not clear :)
Just join in the 3rd table...
IF we assume table1 has both city and state in it...
SELECT A.City, A.State
FROM Table1 A
INNER JOIN table2 B
on A.PK = B.A_FK
INNER JOIN table3 C
on A.City = C.City
and A.State = C.State
This is the nature of an inner join: Include all rows from all tables where the joined data matches.
If you use an OUTER join instead, a LEFT or RIGHT join returns all records from one table plus only the matching records from the other, while a FULL OUTER join returns all records from both tables, aligned where they match.
SELECT *
FROM table3 t3
INNER JOIN (SELECT city,
state
FROM table1 T1
JOIN table2 t2
ON t1.id = t2.id) a1
ON t3.city = a1.city
AND t3.state = a1.state
I think this could help you:
SELECT T3.*
FROM
table_1_2_join T12 /* replace this placeholder table with the select statement that joins your 2 tables */
JOIN table_3 T3 ON T3.City = T12.city AND T3.state = T12.state
Let me know if you need more details.
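Here is a minimal sketch of the "join the third table to the join result" idea from the answers above, run on an in-memory SQLite database through Python's sqlite3. The table names, the id join key, and the sample rows are assumptions made for illustration; the City/State pairs are the ones from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id INTEGER, city TEXT, state TEXT);
    CREATE TABLE table2 (id INTEGER);
    CREATE TABLE table3 (name TEXT, city TEXT, state TEXT, country TEXT);

    -- Hypothetical data: the first join yields A|B, C|D and E|F.
    INSERT INTO table1 VALUES (1, 'A', 'B'), (2, 'C', 'D'), (3, 'E', 'F');
    INSERT INTO table2 VALUES (1), (2), (3);
    INSERT INTO table3 VALUES ('n1', 'C', 'D', 'X'), ('n2', 'G', 'H', 'Y');
""")

# Treat the two-table join as a derived table and match on both columns.
match_sql = """
    SELECT t3.name, t3.city, t3.state, t3.country
    FROM table3 t3
    JOIN (SELECT t1.city, t1.state
          FROM table1 t1
          JOIN table2 t2 ON t1.id = t2.id) j
      ON t3.city = j.city AND t3.state = j.state
    ORDER BY t3.name
"""
matches = conn.execute(match_sql).fetchall()
print(matches)
```

Only the C | D row of the third table comes back. Joining on the two columns separately avoids the string concatenation trick and lets the database use an index on (city, state) if one exists.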

MySQL: Multi-column join on several tables

I have several tables that I am joining that I need to add another table to and I can't seem to get the right query. Here is what I have now -
Table 1
carid, catid, makeid, modelid, caryear
Table 2
makeid, makename
Table 3
modelid, modelname
Table 4
catid, catname
The query I am using to join these is:
SELECT * FROM table1 a
JOIN table2 b on a.makeid=b.makeid
JOIN table3 c on a.modelid=c.modelid
JOIN table4 d on a.catid=d.catid
WHERE a.carid = $carid;
Now I need to add a 5th table that I am getting from a 3rd party that I am having a hard time adding to my existing query. The new table has these fields -
Table 5
id, year, make, model, citympg, hwympg
I need the citympg and hwympg based on caryear from table 1, makename from table 2, and modelname from table 3. I know I can do a second query with those values, but I would prefer to do a single query and have all of the data in a single row. Can this be done in a single query? If so, how?
It's possible to have more than one condition in a join.
Does this work?
SELECT a.*, e.citympg, e.hwympg
FROM table1 a
JOIN table2 b on a.makeid=b.makeid
JOIN table3 c on a.modelid=c.modelid
JOIN table4 d on a.catid=d.catid
Join table5 e on b.makename = e.make
and c.modelname = e.model
and a.caryear = e.year
WHERE a.carid = $carid;
...though your question is not clear. Did you only want to join table5 to the others, or was there something else you wanted to do with table5?
Without indexes it won't be efficient, but you can add this to your existing query:
LEFT JOIN table5 e ON (b.makename = e.make AND c.modelname = e.model AND a.caryear = e.year)
This also assumes that the make, model, and year values match exactly between the tables.
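To see the multi-column join return everything in a single row, here is a small sketch on an in-memory SQLite database (through Python's sqlite3 rather than MySQL). The sample car and the mpg figures are invented for illustration, and a bound parameter stands in for the $carid placeholder from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (carid INTEGER, catid INTEGER, makeid INTEGER,
                         modelid INTEGER, caryear INTEGER);
    CREATE TABLE table2 (makeid INTEGER, makename TEXT);
    CREATE TABLE table3 (modelid INTEGER, modelname TEXT);
    CREATE TABLE table4 (catid INTEGER, catname TEXT);
    CREATE TABLE table5 (id INTEGER, year INTEGER, make TEXT, model TEXT,
                         citympg INTEGER, hwympg INTEGER);

    -- Hypothetical sample data; the mpg numbers are made up.
    INSERT INTO table1 VALUES (1, 1, 1, 1, 2010);
    INSERT INTO table2 VALUES (1, 'Honda');
    INSERT INTO table3 VALUES (1, 'Civic');
    INSERT INTO table4 VALUES (1, 'Sedan');
    INSERT INTO table5 VALUES (1, 2010, 'Honda', 'Civic', 25, 36),
                              (2, 2011, 'Honda', 'Civic', 28, 39);
""")

# table5 is matched on three columns at once: make name, model name, year.
car_sql = """
    SELECT a.carid, b.makename, c.modelname, a.caryear, e.citympg, e.hwympg
    FROM table1 a
    JOIN table2 b ON a.makeid = b.makeid
    JOIN table3 c ON a.modelid = c.modelid
    JOIN table4 d ON a.catid = d.catid
    LEFT JOIN table5 e ON b.makename = e.make
                      AND c.modelname = e.model
                      AND a.caryear = e.year
    WHERE a.carid = ?
"""
car_rows = conn.execute(car_sql, (1,)).fetchall()
print(car_rows)
```

Only the 2010 row of table5 is picked up because the year is part of the join condition. With LEFT JOIN the car is still returned (with NULL mpg values) when the third-party table has no matching entry; with a plain JOIN it would be dropped.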