How can I replace `SELECT ... WHERE SELECT ...` with joins? - sql

I have read that SELECT ... WHERE SELECT ... is slow, and that I should use joins instead.
But I don't know how to replace this code
SELECT Id
FROM Table1
Where
(
Data1 IS NULL
OR
(
Data2=1
AND
(SELECT 1 FROM Table2 WHERE Table2.Id=Table1.Id) IS NULL
)
)
AND
(SELECT 1 FROM Table3 WHERE Table3.Id=Table1.Id) IS NULL
with joins.
The tables have the following structure:
Table1:
Id: INTEGER PRIMARY KEY
Data1: XML
Data2: INTEGER
Table2:
Id: INTEGER
Table3:
Id: INTEGER PRIMARY KEY

select Id from Table1 where
Id not in (select Id from Table3) and
(Data1 is null or
(Data2 = 1 and Id not in (select Id from Table2)));
or, if you really want joins:
select Table1.Id from Table1 left join Table2 on (Table1.Id = Table2.Id)
left join Table3 on (Table1.Id = Table3.Id)
where Table3.Id is null and
(Data1 is null or
(Data2 = 1 and Table2.Id is null));
I don't expect much difference in performance between these two. The query would likely benefit from an index on Table2.Id (you have one on Table3.Id by virtue of it being a primary key).

There are two key parts to moving a subquery out of the WHERE clause (for example, out of an IN) and into the FROM clause. The first is to use LEFT OUTER JOIN, so that no rows from the first table inadvertently drop out. The second is to use SELECT DISTINCT in each subquery, to avoid unwanted duplicates.
Applied to your query, the result is:
SELECT t1.Id
FROM Table1 t1 left outer join
(select distinct id
from Table2
) t2
on t1.id = t2.id left outer join
(select distinct id
from Table3
) t3
on t1.id = t3.id
WHERE (t1.Data1 IS NULL OR
(t1.Data2=1 and t2.id is null)
) and
t3.id is null;
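To see why the select distinct matters, here is a minimal sketch that runs both join variants against an in-memory SQLite database through Python's sqlite3 module. The engine, the sample rows, and the use of TEXT for Data1 are assumptions made for illustration; the question does not say which database is in use.

```python
import sqlite3

# Hypothetical sample data; the layout follows the question (Data1 is TEXT here, not XML).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Table1 (Id INTEGER PRIMARY KEY, Data1 TEXT, Data2 INTEGER);
    CREATE TABLE Table2 (Id INTEGER);               -- Id is not unique
    CREATE TABLE Table3 (Id INTEGER PRIMARY KEY);
    INSERT INTO Table1 VALUES (1, NULL, 0), (2, 'x', 1), (3, 'x', 1), (4, NULL, 1);
    INSERT INTO Table2 VALUES (1), (1), (3);        -- Id 1 appears twice
    INSERT INTO Table3 VALUES (4);
""")

# Shared filter: the same condition as in the answer above.
WHERE = """
    WHERE (t1.Data1 IS NULL OR (t1.Data2 = 1 AND t2.Id IS NULL))
      AND t3.Id IS NULL
    ORDER BY t1.Id
"""

# Joining the raw tables: duplicate Ids in Table2 multiply Table1 rows.
plain_join = """
    SELECT t1.Id FROM Table1 t1
    LEFT JOIN Table2 t2 ON t1.Id = t2.Id
    LEFT JOIN Table3 t3 ON t1.Id = t3.Id
""" + WHERE

# Joining de-duplicated derived tables: at most one match per Table1 row.
distinct_join = """
    SELECT t1.Id FROM Table1 t1
    LEFT JOIN (SELECT DISTINCT Id FROM Table2) t2 ON t1.Id = t2.Id
    LEFT JOIN (SELECT DISTINCT Id FROM Table3) t3 ON t1.Id = t3.Id
""" + WHERE


def ids(sql):
    """Run a query and return the first column as a list."""
    return [row[0] for row in conn.execute(sql)]


plain_ids = ids(plain_join)
distinct_ids = ids(distinct_join)
print(plain_ids, distinct_ids)
```

With these rows, the plain join lists Id 1 twice because Table2 holds it twice, while the select distinct version lists each qualifying Id once.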

Related

Update row in a table based on multiple rows in another table

I have two tables: table1 and table2:
table1 has columns id and integer
table2 has columns id and boolean
table2 can have multiple rows with the same id
I want to update the integer column of table1 by looking at all rows with the same id in table2 and seeing if any of the boolean values are true. If so I want table1.integer to be 1, else I want it to be 0.
I have tried something like this:
UPDATE table1,
(
SELECT table2.id, (Sum(table2.boolean) > 0) AS 'condition'
from table2
WHERE 1
GROUP BY table2.id) table3
SET table1.integer = IF(table3.condition, 1, 0) where table1.id = table3.id
And it seems to work, but I wanted to ask if there is a nicer/cleaner/more succinct way of updating the rows of table1 according to multiple rows of table2.
I would recommend EXISTS:
UPDATE table1 t1
SET t1.integer = (EXISTS (SELECT 1
FROM table2 t2
WHERE t2.id = t1.id AND
t2.boolean
)
);
This can take advantage of an index on table2(id, boolean). With such an index, it should be faster than an approach that uses JOIN and AGGREGATION.
The syntax of your query looks like MySQL, so you can do a join like this:
UPDATE table1 t1 INNER JOIN (
SELECT id, MAX(boolean) maxboolean
FROM table2
GROUP BY id
) t2 ON t2.id = t1.id
SET t1.integer = t2.maxboolean
If there are ids in table1 without a corresponding id in table2 and you want the integer column for them to be updated to 0 then use a LEFT join:
UPDATE table1 t1 LEFT JOIN (
SELECT id, MAX(boolean) maxboolean
FROM table2
GROUP BY id
) t2 ON t2.id = t1.id
SET t1.integer = COALESCE(t2.maxboolean, 0)
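As a rough check of the EXISTS approach, the sketch below runs it against an in-memory SQLite database through Python's sqlite3 module. SQLite is an assumption here (the thread is about MySQL), so the UPDATE ... JOIN forms above are not reproduced; the sample rows are hypothetical, and the column names are double-quoted only to keep them clearly separate from type names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id INTEGER PRIMARY KEY, "integer" INTEGER);
    CREATE TABLE table2 (id INTEGER, "boolean" INTEGER);
    INSERT INTO table1 VALUES (1, NULL), (2, NULL), (3, NULL);
    -- id 1 has one true row, id 2 has only false rows, id 3 has no rows at all
    INSERT INTO table2 VALUES (1, 0), (1, 1), (2, 0), (2, 0);
""")

# EXISTS evaluates to 1 or 0 in SQLite, so it can be assigned directly.
conn.execute("""
    UPDATE table1
    SET "integer" = EXISTS (
        SELECT 1
        FROM table2 t2
        WHERE t2.id = table1.id AND t2."boolean"
    )
""")

updated = conn.execute('SELECT id, "integer" FROM table1 ORDER BY id').fetchall()
print(updated)
```

In this sketch, id 3 ends up as 0 even though it has no rows in table2, which matches the LEFT JOIN with COALESCE variant rather than the INNER JOIN one.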

specifying count in WHERE clause

select *
from table1 t1,
table2 t2,
table3 t3
where t2.parent_id = t1.row_id
and t2.xyz is not null
and (
select count(*)
from table3
where xyz = t2.row_id
) = 0;
Will it work?
I am using the alias t2 within my subquery.
My requirement is to specify a condition in the WHERE clause such that no record exists in table3 whose xyz column holds the row_id of table2.
You can use NOT EXISTS to assert that there is no row returned from the subquery. Use modern explicit join syntax instead of comma based legacy syntax. No need to join table3 outside (you were making a cross join effectively).
select *
from table1 t1
join table2 t2 on t2.parent_id = t1.row_id
where t2.xyz is not null
and not exists (
select 1
from table3
where xyz = t2.row_id
);
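The effect of the stray table3 in the outer FROM list can be seen in a small sketch using an in-memory SQLite database through Python's sqlite3 module. The engine, the table layouts, and the sample rows are assumptions for illustration, and the queries select only t2.row_id instead of * to keep the output short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (row_id INTEGER PRIMARY KEY);
    CREATE TABLE table2 (row_id INTEGER PRIMARY KEY, parent_id INTEGER, xyz INTEGER);
    CREATE TABLE table3 (row_id INTEGER PRIMARY KEY, xyz INTEGER);
    INSERT INTO table1 VALUES (1);
    -- row 10 qualifies; row 11 is referenced by table3; row 12 has a NULL xyz
    INSERT INTO table2 VALUES (10, 1, 5), (11, 1, 6), (12, 1, NULL);
    INSERT INTO table3 VALUES (100, 11), (101, 99);
""")

# Original shape: table3 in the outer FROM has no join condition (a cross join).
legacy = """
    SELECT t2.row_id
    FROM table1 t1, table2 t2, table3 t3
    WHERE t2.parent_id = t1.row_id
      AND t2.xyz IS NOT NULL
      AND (SELECT count(*) FROM table3 WHERE xyz = t2.row_id) = 0
    ORDER BY t2.row_id
"""

# Rewritten shape: explicit join, NOT EXISTS, and no outer table3.
rewritten = """
    SELECT t2.row_id
    FROM table1 t1
    JOIN table2 t2 ON t2.parent_id = t1.row_id
    WHERE t2.xyz IS NOT NULL
      AND NOT EXISTS (SELECT 1 FROM table3 WHERE xyz = t2.row_id)
    ORDER BY t2.row_id
"""

legacy_ids = [row[0] for row in conn.execute(legacy)]
rewritten_ids = [row[0] for row in conn.execute(rewritten)]
print(legacy_ids, rewritten_ids)
```

The correlated subquery does work with the t2 alias, but the original shape returns each qualifying row once per row of table3 (and nothing at all if table3 is empty), while the rewritten shape returns it once.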

Many-To-Many - get all records from related record

This is my many-to-many table:
Table3:
ID_TABLE3
ID_TABLE1_FK
ID_TABLE2_FK
Some_Field
Now what I want is to select all records from TABLE2 where ID_TABLE1_FK in TABLE3 = 3. This is my query. It returns all the records, but it adds all fields of TABLE3 at the end, which is not desired:
SELECT * from TABLE2
JOIN TABLE3 ON TABLE3.ID_TABLE2_FK = TABLE2.ID_TABLE2
WHERE TABLE3.ID_TABLE1_FK= 3
So where am I wrong ?
Just use a regular JOIN and select the columns you really want:
SELECT t2.*
FROM TABLE2 t2 JOIN
TABLE3 t3
ON t3.ID_TABLE2_FK = t2.ID_TABLE2
WHERE t3.ID_TABLE1_FK = 3;
This could conceivably produce duplicates (if they are in TABLE3). So, you might be better off with:
SELECT t2.*
FROM TABLE2 t2
WHERE EXISTS (SELECT 1
FROM TABLE3 t3
WHERE t3.ID_TABLE2_FK = t2.ID_TABLE2 AND t3.ID_TABLE1_FK = 3
);

Applying joins conditionally in SQL Server

I have a set of records, but now I have to select only those records from this set which have their Id in either of two tables.
Suppose I have table1 which contains
Id Name
----------
1 Name1
2 Name2
Now I need to select only those records from table one
which have either their id in table2 or in table3
I was trying to apply the OR operator within an inner join, like:
select *
from table1
inner join table2 on table2.id = table1.id or
inner join table3 on table3.id = table1.id.
Is it possible? What is the best method to approach this? Actually I am also not able to use
if exist(select 1 from table2 where id=table1.id) then select from table1
Could someone help me to get over this?
Use left join and then check if at least one of the joins has found a relation
select t1.*
from table1 t1
left join table2 t2 on t2.id = t1.id
left join table3 t3 on t3.id = t1.id
where t2.id is not null
or t3.id is not null
I would be inclined to use exists:
select t1.*
from table1 t1
where exists (select 1 from table2 t2 where t2.id = t1.id) or
exists (select 1 from table3 t3 where t3.id = t1.id) ;
The advantage to using exists (or in) over a join involves duplicate rows. If table2 or table3 have multiple rows for a given id, then a version using join will produce multiple rows in the result set.
I think the most efficient way is to use UNION on table2 and table3 and join to it :
SELECT t1.*
FROM table1 t1
INNER JOIN(SELECT id FROM Table2
UNION
SELECT id FROM Table3) s
ON(t1.id = s.id)
Alternatively, you can use the SQL below as well:
SELECT *
FROM dbo.Table1
WHERE Table1.id IN ( SELECT table2.id
FROM dbo.table2 )
OR Table1.id IN ( SELECT table3.id
FROM Table3 )
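The duplicate-row point made above can be checked with a small sketch that runs the left join, EXISTS, UNION, and IN variants against an in-memory SQLite database through Python's sqlite3 module. SQLite is an assumption here (the question is about SQL Server, so the dbo. prefix is dropped), and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE table2 (id INTEGER);
    CREATE TABLE table3 (id INTEGER);
    INSERT INTO table1 VALUES (1, 'Name1'), (2, 'Name2'), (3, 'Name3');
    INSERT INTO table2 VALUES (1), (1);   -- id 1 appears twice
    INSERT INTO table3 VALUES (1), (2);
""")

queries = {
    "left_join": """
        SELECT t1.id FROM table1 t1
        LEFT JOIN table2 t2 ON t2.id = t1.id
        LEFT JOIN table3 t3 ON t3.id = t1.id
        WHERE t2.id IS NOT NULL OR t3.id IS NOT NULL
        ORDER BY t1.id
    """,
    "exists": """
        SELECT t1.id FROM table1 t1
        WHERE EXISTS (SELECT 1 FROM table2 t2 WHERE t2.id = t1.id)
           OR EXISTS (SELECT 1 FROM table3 t3 WHERE t3.id = t1.id)
        ORDER BY t1.id
    """,
    "union": """
        SELECT t1.id FROM table1 t1
        INNER JOIN (SELECT id FROM table2
                    UNION
                    SELECT id FROM table3) s ON t1.id = s.id
        ORDER BY t1.id
    """,
    "in": """
        SELECT id FROM table1
        WHERE table1.id IN (SELECT table2.id FROM table2)
           OR table1.id IN (SELECT table3.id FROM table3)
        ORDER BY id
    """,
}

# Map each variant name to the list of ids it returns.
results = {name: [row[0] for row in conn.execute(sql)] for name, sql in queries.items()}
print(results)
```

With these rows, only the left join variant repeats id 1; UNION (unlike UNION ALL) removes duplicates before the join, so that variant returns each id once, as do EXISTS and IN.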

Quicker way to insert non-matching ids to column

Is there a quicker way to get the ids that exist in table1 but not exist in table2 and insert them in table2?
insert into table2 (id)
select id
from table1
where table1.id not in (select id from table2)
In addition to your solution using the IN operator, try the EXISTS one:
select id
from table1 t1
where not exists (
select 1
from table2
where id = t1.id
)
If the subquery returns an empty set, NOT EXISTS evaluates to true.
Or the outer join:
select id
from
table1 t1
left join
table2 t2 on t1.id = t2.id
where t2.id is null
Use EXPLAIN ANALYZE to compare the plans.
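One difference between the three forms is not about speed: if table2.id can contain NULL, NOT IN returns no rows at all, because the comparison against NULL is never true. This point is an addition, not something stated in the thread. The sketch below shows it with an in-memory SQLite database through Python's sqlite3 module; the engine and the sample rows are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id INTEGER);
    CREATE TABLE table2 (id INTEGER);
    INSERT INTO table1 VALUES (1), (2), (3);
    INSERT INTO table2 VALUES (2), (NULL);   -- note the NULL id
""")


def ids(sql):
    """Run a query and return the first column as a list."""
    return [row[0] for row in conn.execute(sql)]


not_in = ids("""
    SELECT id FROM table1
    WHERE table1.id NOT IN (SELECT id FROM table2)
    ORDER BY id
""")

not_exists = ids("""
    SELECT id FROM table1 t1
    WHERE NOT EXISTS (SELECT 1 FROM table2 WHERE id = t1.id)
    ORDER BY id
""")

anti_join = ids("""
    SELECT t1.id FROM table1 t1
    LEFT JOIN table2 t2 ON t1.id = t2.id
    WHERE t2.id IS NULL
    ORDER BY t1.id
""")

# Insert the missing ids using the NULL-safe NOT EXISTS form.
conn.execute("""
    INSERT INTO table2 (id)
    SELECT id FROM table1 t1
    WHERE NOT EXISTS (SELECT 1 FROM table2 WHERE id = t1.id)
""")
table2_count = conn.execute("SELECT count(*) FROM table2").fetchone()[0]
print(not_in, not_exists, anti_join, table2_count)
```

With a NULL in table2, the NOT IN form finds nothing to insert, while NOT EXISTS and the outer join both find ids 1 and 3. If table2.id is declared NOT NULL, the three forms return the same rows.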