Converting a nested sql where-in pattern to joins

I have a query that is returning the correct data to me, but being a developer rather than a DBA I'm wondering if there is any reason to convert it to joins rather than nested selects and if so, what it would look like.
My code currently is
select * from adjustments where store_id in (
select id from stores where original_id = (
select original_id from stores where name ='abcd'))
Any references to the better use of joins would be appreciated too.

Besides any likely performance improvements, I find the following much easier to read.
SELECT *
FROM adjustments a
INNER JOIN stores s ON s.id = a.store_id
INNER JOIN stores s2 ON s2.original_id = s.original_id
WHERE s2.name = 'abcd'
Test script showing my original fault in omitting original_id
DECLARE @Adjustments TABLE (store_id INTEGER)
DECLARE @Stores TABLE (id INTEGER, name VARCHAR(32), original_id INTEGER)
INSERT INTO @Adjustments VALUES (1), (2), (3)
INSERT INTO @Stores VALUES (1, 'abcd', 1), (2, '2', 1), (3, '3', 1)
/*
OP's Original statement returns store_id's 1, 2 & 3
due to original_id being all the same
*/
SELECT * FROM @Adjustments WHERE store_id IN (
SELECT id FROM @Stores WHERE original_id = (
SELECT original_id FROM @Stores WHERE name ='abcd'))
/*
Faulty first attempt with removing original_id from the equation
only returns store_id 1
*/
SELECT a.store_id
FROM @Adjustments a
INNER JOIN @Stores s ON s.id = a.store_id
WHERE s.name = 'abcd'
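For readers without SQL Server at hand, here is a minimal sketch that replays the same test data in an in-memory SQLite database through Python's sqlite3 module. The permanent table names and the Python variable names are assumptions made for the illustration. The two-join version filters on the second alias (s2.name), which is what makes it match the nested query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE adjustments (store_id INTEGER);
    CREATE TABLE stores (id INTEGER, name TEXT, original_id INTEGER);
    INSERT INTO adjustments VALUES (1), (2), (3);
    INSERT INTO stores VALUES (1, 'abcd', 1), (2, '2', 1), (3, '3', 1);
""")

# The nested WHERE ... IN form from the question.
nested = """
    SELECT store_id FROM adjustments WHERE store_id IN (
        SELECT id FROM stores WHERE original_id = (
            SELECT original_id FROM stores WHERE name = 'abcd'))
    ORDER BY store_id
"""
# Join form: s is the adjustment's store, s2 is the store named 'abcd'.
joined = """
    SELECT a.store_id
    FROM adjustments a
    INNER JOIN stores s  ON s.id = a.store_id
    INNER JOIN stores s2 ON s2.original_id = s.original_id
    WHERE s2.name = 'abcd'
    ORDER BY a.store_id
"""
# Faulty form that ignores original_id altogether.
faulty = """
    SELECT a.store_id
    FROM adjustments a
    INNER JOIN stores s ON s.id = a.store_id
    WHERE s.name = 'abcd'
"""

nested_rows = [row[0] for row in conn.execute(nested)]
joined_rows = [row[0] for row in conn.execute(joined)]
faulty_rows = [row[0] for row in conn.execute(faulty)]
print(nested_rows, joined_rows, faulty_rows)
```

With this data the nested and the two-join queries agree, while the single-join attempt finds only the store named 'abcd'. The join form stays duplicate-free only as long as a single store carries that name.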

If you use joins, it would look like this:
select *
from adjustments
inner join stores on stores.id = adjustments.store_id
inner join stores as stores2 on stores2.original_id = stores.original_id
where stores2.name = 'abcd'
(I originally left the second join on the stores table out of my query, assuming that
select id from stores where original_id = (select original_id from stores where name ='abcd')
is the same as
select * from stores where name ='abcd'
That assumption only holds if no other store shares the same original_id, so I edited my query back to the original form. Thanks to Lieven for pointing out my mistake in his answer!)
I prefer using joins, but for simple queries like that, there is normally no performance difference. SQL Server treats both queries the same internally.
If you want to be sure, you can look at the execution plan.
If you run both queries together, SQL Server will also tell you which query took more resources than the other (in percent).

A slightly different approach:
select * from adjustments a where exists
(select null from stores s1, stores s2
where a.store_id = s1.id and s1.original_id = s2.original_id and s2.name ='abcd')

As Microsoft says here:
Many Transact-SQL statements that include subqueries can be
alternatively formulated as joins. Other questions can be posed only
with subqueries. In Transact-SQL, there is usually no performance
difference between a statement that includes a subquery and a
semantically equivalent version that does not. However, in some cases
where existence must be checked, a join yields better performance.
Otherwise, the nested query must be processed for each result of the
outer query to ensure elimination of duplicates. In such cases, a join
approach would yield better results.
Your case is exactly one where the join and the subquery give the same performance.
An example where a subquery cannot be converted to a "simple" JOIN:
select Country,TR_Country.Name as Country_Translated_Name,TR_Country.Language_Code
from Country
JOIN TR_Country ON Country.Country=Tr_Country.Country
where country =
(select top 1 country
from Northwind.dbo.Customers C
join
Northwind.dbo.Orders O
on C.CustomerId = O.CustomerID
group by country
order by count(*))
As you can see, each country can have several name translations, so we cannot simply join and count records (countries with more translations would get inflated record counts).
Of course, you can transform this example to use:
JOIN with a derived table
CTE
but that is another story.

Related

Last two joins cause duplicate rows

Ok, so I have a query that is returning more rows than expected with repeating data. Here is my query:
SELECT AP.RECEIPTNUMBER
,AP.FOLDERRSN
,ABS(AP.PAYMENTAMOUNT)
,ABS(AP.PAYMENTAMOUNT - AP.AMOUNTAPPLIED)
,TO_CHAR(AP.PAYMENTDATE,'MM/DD/YYYY')
,F.REFERENCEFILE
,F.FOLDERTYPE
,VS.SUBDESC
,P.NAMEFIRST||' '||P.NAMELAST
,P.ORGANIZATIONNAME
,VAF.FEEDESC
,VAF.GLACCOUNTNUMBER
FROM ACCOUNTPAYMENT AP
INNER JOIN FOLDER F ON AP.FOLDERRSN = F.FOLDERRSN
INNER JOIN VALIDSUB VS ON F.SUBCODE = VS.SUBCODE
INNER JOIN FOLDERPEOPLE FP ON FP.FOLDERRSN = F.FOLDERRSN
INNER JOIN PEOPLE P ON FP.PEOPLERSN = P.PEOPLERSN
INNER JOIN ACCOUNTBILLFEE ABF ON F.FOLDERRSN = ABF.FOLDERRSN
INNER JOIN VALIDACCOUNTFEE VAF ON ABF.FEECODE = VAF.FEECODE
WHERE AP.NSFFLAG = 'Y'
AND F.FOLDERTYPE IN ('405B','405O')
Everything works fine until I add the bottom two Inner Joins. I'm basically trying to get all payments that had NSF. When I run the simple query:
SELECT *
FROM ACCOUNTPAYMENT
WHERE NSFFLAG = 'Y'
I get only 3 rows pertaining to 405B and 405O folders. So I'm only expecting 3 rows to be returned in the above query but I get 9 with information repeating in some columns. I need the exact feedesc and gl account number based on the fee code that can be found in both the Valid Account Fee and Account Bill Fee tables.
I can't post a picture of my output.
Note: when I run the query without the two bottom joins I get the expected output.
Can someone help me make my query more efficient? Thanks!
As requested, below are the results that my query is returning for vaf.feedesc and vaf.glaccountnumber columns:
Boiler Operator License Fee 2423809
Boiler Certificate of Operation without Manway - Revolving 2423813
Installers (Boiler License)/API Exam 2423807
Boiler Public Inspection/Certification (State or Insurance) 2423816
Boiler Certificate of Operation with Manway 2423801
Boiler Certificate of Operation without Manway 2423801
Boiler Certificate of Operation with Manway - Revolving 2423813
BPV Owner/User Program Fee 2423801
Installers (Boiler License)/API Exam Renewal 2423807

The cause is that at least one of the connections ACCOUNTBILLFEE-FOLDER or VALIDACCOUNTFEE-ACCOUNTBILLFEE is not one-to-one. It allows for one Folder to have many AccountBillFees or for one ValidAccountFee to have many AccountBillFees.
To find the cause of such a problem this is what I usually do:
Change the SELECT A, B, C part of your query to SELECT *.
Reduce the results to one of the rows that is causing you trouble (by adding a WHERE ...). That is a single row without your last two joins and a few rows after you add those two joins.
Look at the result table from left to right. The first columns will probably show the same values for all rows. Once you see a difference between the values in a column, you know that the table of the column you are currently looking at is causing your "multiple row problem".
Now create a SELECT * statement that includes only the two tables joined together that cause multiple rows with the same WHERE ... you used above.
The result should give you a clear picture of the cause.
Once you know the reason for your problem you can think of a solution ;)
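The check described above can also be automated by counting rows per join key on the suspect table. The following is only a sketch against an in-memory SQLite database with made-up sample rows. The lower-case table names mirror FOLDER and ACCOUNTBILLFEE from the question, but the data is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE folder (folderrsn INTEGER PRIMARY KEY);
    CREATE TABLE accountbillfee (folderrsn INTEGER, feecode INTEGER);
    INSERT INTO folder VALUES (100), (200);
    -- Hypothetical data: folder 100 carries three fees, folder 200 one.
    INSERT INTO accountbillfee VALUES (100, 1), (100, 2), (100, 3), (200, 1);
""")

# Any key that occurs more than once on the joined side multiplies the
# rows coming from the other side of the join.
fan_out = conn.execute("""
    SELECT folderrsn, COUNT(*) AS fee_rows
    FROM accountbillfee
    GROUP BY folderrsn
    HAVING COUNT(*) > 1
""").fetchall()

# Two folders go in, four rows come out of the join.
joined_count = conn.execute("""
    SELECT COUNT(*)
    FROM folder f
    INNER JOIN accountbillfee abf ON abf.folderrsn = f.folderrsn
""").fetchone()[0]

print(fan_out, joined_count)
```

Every key listed in fan_out marks a one-to-many relationship that needs either an extra filter or an aggregate before the row count can drop.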

Try this. If it helps, then those tables have additional rows which are not relevant. If it doesn't, then look at the results of the subqueries I have below to see what additional filters are needed.
SELECT AP.RECEIPTNUMBER
,AP.FOLDERRSN
,ABS(AP.PAYMENTAMOUNT)
,ABS(AP.PAYMENTAMOUNT - AP.AMOUNTAPPLIED)
,TO_CHAR(AP.PAYMENTDATE,'MM/DD/YYYY')
,F.REFERENCEFILE
,F.FOLDERTYPE
,VS.SUBDESC
,P.NAMEFIRST||' '||P.NAMELAST
,P.ORGANIZATIONNAME
,VAF.FEEDESC
,VAF.GLACCOUNTNUMBER
FROM ACCOUNTPAYMENT AP
INNER JOIN FOLDER F ON AP.FOLDERRSN = F.FOLDERRSN
INNER JOIN VALIDSUB VS ON F.SUBCODE = VS.SUBCODE
INNER JOIN FOLDERPEOPLE FP ON FP.FOLDERRSN = F.FOLDERRSN
INNER JOIN PEOPLE P ON FP.PEOPLERSN = P.PEOPLERSN
INNER JOIN
(
SELECT DISTINCT ABF.FEECODE, ABF.FOLDERRSN
FROM ACCOUNTBILLFEE ABF
) ABF ON F.FOLDERRSN = ABF.FOLDERRSN
INNER JOIN
(
SELECT DISTINCT VAF.FEEDESC, VAF.GLACCOUNTNUMBER, VAF.FEECODE
FROM VALIDACCOUNTFEE VAF
) VAF ON ABF.FEECODE = VAF.FEECODE
WHERE AP.NSFFLAG = 'Y'
AND F.FOLDERTYPE IN ('405B','405O')

The data for those last two tables is different in different records in the one-to-many relationship. Since DISTINCT did not fix the problem, you either have to accept that 9 records is the correct return, because you are returning the fields that differ, or you have to determine which of the multiple records you don't want returned, based on business rules that must come from someone in your company, not us.
I don't think you fully understand how SQL works, as 9 records is exactly what I would have expected given the information you gave in the question. The following are some queries that show how joining in a one-to-many relationship can affect output, and ways that you can adjust the query to get rid of the duplicated output.
Note that in some of the cases, the query cannot be adjusted to get rid of the output because of the columns you want returned. So even if some of the columns are repeated, if even one of the columns you want returned has different values and you have no appropriate business rules for which of them you want to see, you can't reduce the record set. Which rules you need depends on the type of data you are querying and what the requirements are. This is not a question we can answer here; only your company knows whether a min or max value would be acceptable, or whether you need to add a where clause and, if so, which field to put it on and which values to exclude. Those are business rules, not SQL.
create table #temp (myid int , mydescription varchar(30))
insert into #temp(myid, mydescription)
values (1, 'test') , (2, 'test2')
create table #temp2 (myid int, myotherdescription varchar(30))
insert into #temp2(myid, myotherdescription)
values (1, 'othertest') , (1, 'othertest2'), (2, 'myothertest') , (1, 'othertest3')
select *
from #temp t
join #temp2 t2 on t.myid = t2.myid
select t2.myid, t.mydescription
from #temp t
join #temp2 t2 on t.myid = t2.myid
select distinct t2.myid, t.mydescription
from #temp t
join #temp2 t2 on t.myid = t2.myid
select t.myid, t.mydescription, t2.myotherdescription
from #temp t
join #temp2 t2 on t.myid = t2.myid
select distinct t.myid, t.mydescription, t2.myotherdescription
from #temp t
join #temp2 t2 on t.myid = t2.myid
select t.myid, min(t2.myotherdescription)
from #temp t
join #temp2 t2 on t.myid = t2.myid
group by t.myid
select t.myid, t2.myotherdescription
from #temp t
join #temp2 t2 on t.myid = t2.myid
where t2.myid = 2
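To see the effect without SQL Server, the same rows can be loaded into an in-memory SQLite database and the row count of each variant compared. This is a sketch only: the tables are named temp1 and temp2 here because SQLite has no #temp syntax, and the dictionary labels are made up for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE temp1 (myid INTEGER, mydescription TEXT);
    CREATE TABLE temp2 (myid INTEGER, myotherdescription TEXT);
    INSERT INTO temp1 VALUES (1, 'test'), (2, 'test2');
    INSERT INTO temp2 VALUES (1, 'othertest'), (1, 'othertest2'),
                             (2, 'myothertest'), (1, 'othertest3');
""")

# Shared FROM clause: one parent row can match several child rows.
join = "FROM temp1 t JOIN temp2 t2 ON t.myid = t2.myid"
queries = {
    "all columns": f"SELECT * {join}",
    "parent columns only": f"SELECT t2.myid, t.mydescription {join}",
    "parent columns, DISTINCT":
        f"SELECT DISTINCT t2.myid, t.mydescription {join}",
    "child column included, DISTINCT":
        f"SELECT DISTINCT t.myid, t.mydescription, t2.myotherdescription {join}",
    "MIN per parent":
        f"SELECT t.myid, MIN(t2.myotherdescription) {join} GROUP BY t.myid",
    "filtered": f"SELECT t.myid, t2.myotherdescription {join} WHERE t2.myid = 2",
}

row_counts = {label: len(conn.execute(sql).fetchall())
              for label, sql in queries.items()}
for label, count in row_counts.items():
    print(f"{label}: {count} rows")
```

DISTINCT only shrinks the result while every selected column repeats. As soon as a column from the many side is selected, only an aggregate or a filter reduces the rows.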

SQL Server Table-Value Function and Except Combination performance

I have a table (Resources, with about 18000 records) and a table-valued function with this body:
ALTER FUNCTION [dbo].[tfn_GetPackageResources]
(
@packageId int=null,
@resourceTypeId int=null,
@resourceCategoryId int=null,
@resourceGroupId int=null,
@resourceSubGroupId int=null
)
RETURNS TABLE
AS
RETURN
(
SELECT Resources.*
FROM Resources
INNER JOIN ResourceSubGroups ON Resources.ResourceSubGroupId=ResourceSubGroups.Id
INNER JOIN ResourceGroups ON ResourceSubGroups.ResourceGroupId=ResourceGroups.Id
INNER JOIN ResourceCategories ON ResourceGroups.ResourceCategoryId=ResourceCategories.Id
INNER JOIN ResourceTypes ON ResourceCategories.ResourceTypeId=ResourceTypes.Id
WHERE
(@resourceSubGroupId IS NULL OR ResourceSubGroupId=@resourceSubGroupId) AND
(@resourceGroupId IS NULL OR ResourceGroupId=@resourceGroupId) AND
(@resourceCategoryId IS NULL OR ResourceCategoryId=@resourceCategoryId) AND
(@resourceTypeId IS NULL OR ResourceTypeId=@resourceTypeId) AND
(@packageId IS NULL OR PackageId=@packageId)
)
Now I run a query like this:
SELECT id
FROM dbo.tfn_GetPackageResources(@sourcePackageId,null,null,null,null)
WHERE id not in(
SELECT a.Id
FROM dbo.tfn_GetPackageResources(@sourcePackageId,null,null,null,null) a INNER JOIN
dbo.tfn_GetPackageResources(@comparePackageId,null,null,null,null) b
ON a.No = b.No AND
a.UnitCode=b.UnitCode AND
a.IsCompound=b.IsCompound AND
a.Title=b.Title
)
This query takes about 10 seconds! (Each part of the query runs extremely fast on its own, but the whole query takes time.) I also tried LEFT JOIN and NOT EXISTS, but the result was the same.
But if I run the query on the Resources table directly, it takes one second or less! The fast query is:
select * from resources where id not in (select id from resources)
How can I solve it?

Your UDF is expanded like a macro.
So your complete query has
9 INNER JOINs in the IN clause
4 INNER JOINs in the main SELECT.
You apply (... IS NULL OR ...) 15 times in total for each of your WHERE clauses.
Your idea of clever code reuse fails because of this expansion. SQL does not usually lend itself to this kind of reuse.
Keep it simple:
SELECT
R.id
FROM
Resources R
WHERE
R.PackageId = @sourcePackageId
AND
R.id not in (
SELECT a.Id
FROM Resources a
INNER JOIN
Resources b
ON a.No = b.No AND
a.UnitCode=b.UnitCode AND
a.IsCompound=b.IsCompound AND
a.Title=b.Title
WHERE
a.PackageId = @sourcePackageId
AND
b.PackageId = @comparePackageId
)
For more, see my other answers here:
Why is a UDF so much slower than a subquery?
Profiling statements inside a User-Defined Function
Does query plan optimizer works well with joined/filtered table-valued functions?
Table Valued Function where did my query plan go?
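As a rough illustration of querying the base table directly, the sketch below runs the package comparison against an in-memory SQLite database. The sample rows and the helper name only_in_source are hypothetical, and the query is written with NOT EXISTS instead of NOT IN plus a self-join, which returns the same Ids here because Id is a non-null primary key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Resources (
        Id INTEGER PRIMARY KEY, PackageId INTEGER, "No" TEXT,
        UnitCode TEXT, IsCompound INTEGER, Title TEXT);
    -- Hypothetical sample data: packages 10 and 20 share one resource.
    INSERT INTO Resources VALUES
        (1, 10, 'A1', 'kg', 0, 'Cement'),
        (2, 10, 'A2', 'm',  0, 'Cable'),
        (3, 20, 'A1', 'kg', 0, 'Cement'),
        (4, 20, 'A3', 'm',  0, 'Pipe');
""")


def only_in_source(source_package_id, compare_package_id):
    """Return Ids of source-package resources with no match in the other package."""
    sql = """
        SELECT r.Id
        FROM Resources r
        WHERE r.PackageId = :source
          AND NOT EXISTS (
              SELECT 1
              FROM Resources b
              WHERE b.PackageId = :compare
                AND b."No" = r."No"
                AND b.UnitCode = r.UnitCode
                AND b.IsCompound = r.IsCompound
                AND b.Title = r.Title)
        ORDER BY r.Id
    """
    params = {"source": source_package_id, "compare": compare_package_id}
    return [row[0] for row in conn.execute(sql, params)]


print(only_in_source(10, 20))
print(only_in_source(20, 10))
```

The package filter is applied once per side, with none of the optional (... IS NULL OR ...) predicates that the expanded function drags into every reference.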

In your function, declare the type of the table it returns, and include a primary key. This way, the ID filter will be able to look up the IDs more efficiently.
See http://msdn.microsoft.com/en-us/library/ms191165(v=sql.105).aspx for the syntax.

One thing to try is to break the complicated query into several simple ones that store their results in temporary tables. This way, one complicated execution plan is replaced by several simple plans whose total execution time might be shorter than that of the complicated plan:
SELECT *
INTO #temp1
FROM dbo.tfn_GetPackageResources(@sourcePackageId,null,null,null,null)
SELECT *
INTO #temp2
FROM dbo.tfn_GetPackageResources(@comparePackageId,null,null,null,null)
SELECT a.Id
INTO #ids
FROM #temp1 a
INNER JOIN
#temp2 b ON
a.No = b.No
AND a.UnitCode=b.UnitCode
AND a.IsCompound=b.IsCompound
AND a.Title=b.Title
SELECT id
FROM #temp1
WHERE id not in(
SELECT Id
FROM #ids
)
-- you can also try replacing the above query with this one if it performs faster
SELECT id
FROM #temp1 t
WHERE NOT EXISTS
(
SELECT Id FROM #ids i WHERE i.Id = t.id
)

Equality of "select ... where in" and joins

Suppose I have a table1 like this:
id | itemcode
-------------
1 | c1
2 | c2
...
And a table2 like this:
item | name
-----------
c1 | acme
c2 | foo
...
Would the following two queries return the same result set under every condition?
SELECT id, itemcode
FROM table1
WHERE itemcode IN (SELECT DISTINCT item
FROM table2
WHERE name [some arbitrary test])
SELECT id, itemcode
FROM table1
JOIN (SELECT DISTINCT item
FROM table2
WHERE name [some arbitrary test]) items
ON table1.itemcode = items.item
Unless I'm really missing something stupid, I'd say yes. But I've done two queries which boil down to this form and I am getting different results. There are some nested queries using WHERE IN, but for the last step I've noticed a JOIN is much faster. The nested queries are all entirely isolated so I don't believe they are the problem, so I just want to eliminate the possibility that I've got a misconception regarding the above.
Thanks for any insights.
EDIT
The two original queries:
SELECT imitm, imlitm, imglpt
FROM jdedata.F4101
WHERE imitm IN
(SELECT DISTINCT ivitm AS itemno
FROM jdedata.F4104
WHERE ivcitm IN
(SELECT DISTINCT ivcitm AS legacycode
FROM jdedata.F4104
WHERE ivitm IN
(SELECT DISTINCT tritm
FROM trigdata.F4101_TRIG)
)
)
SELECT orig.imitm, orig.imlitm, orig.imglpt
FROM jdedata.F4101 orig
JOIN
(SELECT DISTINCT ivitm AS itemno
FROM jdedata.F4104
WHERE ivcitm IN
(SELECT DISTINCT ivcitm AS legacycode
FROM jdedata.F4104
WHERE ivitm IN
(SELECT DISTINCT tritm
FROM trigdata.F4101_TRIG))) itemns
ON orig.imitm = itemns.itemno
EDIT 2
Although I still don't understand why the queries returned different results, it would seem our logic was flawed from the beginning since we were using the wrong columns in some parts. Mind that I'm not saying I made a mistake interpreting the queries as written above or had some typo, we just needed to select on some different stuff.
Normally I don't rest until I get to the bottom of things like these, but I'm very tired and am entering my first vacation since January that spans more than one day, so I can't really be bothered searching further right now. I'm sure the tips given here will come in handy later. Upvotes have been distributed for all the help and I've accepted Ypercube's answer, mostly because his comments have led me the furthest. But thanks all round! If I do find out more later, I'll try to remember pinging back in.

Since table2.item is not nullable, the 2 versions are equivalent. You can remove the distinct from the IN version, it's not needed. You can check these 3 versions and their execution plans:
SELECT id, itemcode FROM table1 WHERE itemcode IN
( SELECT item FROM table2 WHERE name [some arbitrary test] )
SELECT id, itemcode FROM table1 JOIN
( SELECT DISTINCT item FROM table2 WHERE name [some arbitrary test] )
items ON table1.itemcode = items.item
SELECT id, itemcode FROM table1 WHERE EXISTS
( SELECT * FROM table2 WHERE table1.itemcode = table2.item
AND (name [some arbitrary test]) )
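A small sketch can confirm the equivalence on sample data. It uses an in-memory SQLite database, adds a deliberately duplicated row to table2, and substitutes name = 'acme' for [some arbitrary test]. All of that is assumed for the illustration, not taken from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id INTEGER, itemcode TEXT);
    CREATE TABLE table2 (item TEXT NOT NULL, name TEXT);
    INSERT INTO table1 VALUES (1, 'c1'), (2, 'c2'), (3, 'c3');
    -- 'c1' appears twice so that a plain join would duplicate it.
    INSERT INTO table2 VALUES ('c1', 'acme'), ('c1', 'acme'), ('c2', 'foo');
""")

test = "name = 'acme'"  # stand-in for [some arbitrary test]

in_rows = conn.execute(f"""
    SELECT id, itemcode FROM table1
    WHERE itemcode IN (SELECT item FROM table2 WHERE {test})
""").fetchall()

join_rows = conn.execute(f"""
    SELECT id, itemcode FROM table1
    JOIN (SELECT DISTINCT item FROM table2 WHERE {test}) items
      ON table1.itemcode = items.item
""").fetchall()

exists_rows = conn.execute(f"""
    SELECT id, itemcode FROM table1
    WHERE EXISTS (SELECT * FROM table2
                  WHERE table1.itemcode = table2.item AND {test})
""").fetchall()

# Without DISTINCT the join returns one row per matching table2 row.
plain_join_rows = conn.execute(f"""
    SELECT id, itemcode FROM table1
    JOIN table2 ON table1.itemcode = table2.item
    WHERE {test}
""").fetchall()

print(in_rows, join_rows, exists_rows, plain_join_rows)
```

The three forms agree, and the plain join at the end shows why the DISTINCT matters only in the join version.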

Ideally I would want to see the differences between the result sets.
- Are you getting duplication of records
- Is one set always a sub-set of the other
- Does one set have both 'additional' and 'missing' records in comparison to the other?
That said, the logic should be equivalent. My best guess would be that you have some empty string entries in there, because Oracle treats an empty CHAR/VARCHAR string as NULL. This can give very funky results if you're not prepared for it.

Both queries perform a semijoin i.e. no attributes from table2 appear in the topmost SELECT (the resultset).
To my eye, your first query is easiest to identify as a semijoin, EXISTS even more so. On the other hand, an optimizer would no doubt see it differently ;)

You can also try to do a direct join to the second table
SELECT DISTINCT id, itemcode
FROM table1
INNER JOIN table2 ON table1.itemcode = table2.item
WHERE name [some arbitrary test]
You don't need DISTINCT if item is a primary key or unique.
Exists and Inner Join should have the same execution speed, while IN is more expensive.

I'd look for some data type conversion in there.
create table t_vc (val varchar2(6));
create table t_c (val char(6));
insert into t_vc values ('12345');
insert into t_vc values ('12345 ');
insert into t_c values ('12345');
insert into t_c values ('12345');
select t_c.val||':'
from t_c
where val in (select distinct val from t_vc);
select c.val||':'
from t_vc v join (select distinct val from t_c) c on v.val=c.val;

INNER JOIN vs IN

SELECT C.* FROM StockToCategory STC
INNER JOIN Category C ON STC.CategoryID = C.CategoryID
WHERE STC.StockID = @StockID
VS
SELECT * FROM Category
WHERE CategoryID IN
(SELECT CategoryID FROM StockToCategory WHERE StockID = @StockID)
Which is considered the correct (syntactically) and most performant approach and why?
The syntax in the latter example seems more logical to me but my assumption is the JOIN will be faster.
I have looked at the query plans and havent been able to decipher anything from them.
Query Plan 1
Query Plan 2

The two syntaxes serve different purposes. Using the Join syntax presumes you want something from both the StockToCategory and Category table. If there are multiple entries in the StockToCategory table for each category, the Category table values will be repeated.
Using the IN function presumes that you want only items from the Category whose ID meets some criteria. If a given CategoryId (assuming it is the PK of the Category table) exists multiple times in the StockToCategory table, it will only be returned once.
In your exact example, they will produce the same output; however, IMO the latter syntax makes your intent (only wanting categories) clearer.
Btw, yet a third syntax which is similar to using the IN function:
Select ...
From Category
Where Exists (
Select 1
From StockToCategory
Where StockToCategory.CategoryId = Category.CategoryId
And StockToCategory.StockID = @StockID
)

Syntactically (semantically too) these are both correct. In terms of performance they are effectively equivalent, in fact I would expect SQL Server to generate the exact same physical plans for these two queries.

I think these are just two ways to specify the same desired result.

Timings for SQLite:
table device_group_folders contains 10 records
table device_groups contains ~100000 records
INNER JOIN: 31 ms
WITH RECURSIVE select_childs(uuid) AS (
SELECT uuid FROM device_group_folders WHERE uuid = '000B:653D1D5D:00000003'
UNION ALL
SELECT device_group_folders.uuid FROM device_group_folders INNER JOIN select_childs ON parent = select_childs.uuid
) SELECT device_groups.uuid FROM select_childs INNER JOIN device_groups ON device_groups.parent = select_childs.uuid;
WHERE: 31 ms
WITH RECURSIVE select_childs(uuid) AS (
SELECT uuid FROM device_group_folders WHERE uuid = '000B:653D1D5D:00000003'
UNION ALL
SELECT device_group_folders.uuid FROM device_group_folders INNER JOIN select_childs ON parent = select_childs.uuid
) SELECT device_groups.uuid FROM select_childs, device_groups WHERE device_groups.parent = select_childs.uuid;
IN: <1 ms
SELECT device_groups.uuid FROM device_groups WHERE device_groups.parent IN (WITH RECURSIVE select_childs(uuid) AS (
SELECT uuid FROM device_group_folders WHERE uuid = '000B:653D1D5D:00000003'
UNION ALL
SELECT device_group_folders.uuid FROM device_group_folders INNER JOIN select_childs ON parent = select_childs.uuid
) SELECT * FROM select_childs);

How to avoid large in clause?

I have 3 tables :
table_product (30 000 row)
---------
ID
label
_
table_period (225 000 row)
---------
ID
date_start
date_end
default_price
FK_ID_product
and
table_special_offer (10 000 row)
-----
ID
label
date_start,
date_end,
special_offer_price
FK_ID_period
So I need to load data from all these table, so here it's what I do :
1/ load data from "table_product" like this
select *
from table_product
where label like 'gun%'
2/ load data from "table_period" like this
select *
from table_period
where FK_ID_product IN(list of all the ids selected in the 1)
3/ load data from "table_special_offer" like this
select *
from table_special_offer
where FK_ID_period IN(list of all the ids selected in the 2)
As you may guess, the IN clause in step 3 can become very large (around 75 000 values), so there is a good chance of getting either a timeout or an error such as "An expression services limit has been reached".
Have you ever run into something like this, and how did you manage to avoid it?
PS :
the context : SQL server 2005, .net 2.0
(please don't tell me my design is bad, or I shouldn't do "select *", I just simplified my problem so it is a little bit simpler than 500 pages describing my business).
Thanks.

Switch to using joins:
SELECT <FieldList>
FROM Table_Product prod
JOIN Table_Period per ON prod.Id = per.FK_ID_Product
JOIN Table_Special_Offer spec ON per.ID = spec.FK_ID_Period
WHERE prod.label LIKE 'gun%'
Something you should be aware of is the difference of IN vs JOIN vs EXISTS - great article here.

I finally have my answer: table variables (a bit like @smirkingman's solution, but without a CTE), so:
declare @product table(id int primary key,label nvarchar(max))
declare @period table(id int primary key,date_start datetime,date_end datetime,defaultprice real)
declare @special_offer table(id int,date_start datetime,date_end datetime,special_offer_price real)
insert into @product
select ID, label
from table_product
where label like 'gun%'
insert into @period
select ID, date_start, date_end, default_price
from table_period
where exists(
select * from @product p where p.id = table_period.FK_id_product
)
insert into @special_offer
select ID, date_start, date_end, special_offer_price
from table_special_offer
where exists(
select * from @period p where p.id = table_special_offer.fk_id_period
)
select * from @product
select * from @period
select * from @special_offer
That covers the SQL. In C#, I use ExecuteReader, Read, and NextResult of the SqlDataReader class:
http://msdn.microsoft.com/en-us/library/system.data.sqlclient.sqldatareader.aspx
I got everything I wanted:
- my data
- I don't get too much data (unlike the solutions with joins)
- I don't execute the same query twice (unlike the solution with a subquery)
- I don't have to change my mapping code (1 row = 1 business object)

Don't use explicit list of values in IN clause. Instead, write your query like
... FK_ID_product IN (select ID
from table_product
where label like 'gun%')
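A minimal sketch of this suggestion, using an in-memory SQLite database and invented sample rows: each of the three result sets is loaded with a filter expressed as a subquery, so no ID list ever travels from the client to the server. The reduced column lists and the Python variable names are assumptions for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table_product (ID INTEGER PRIMARY KEY, label TEXT);
    CREATE TABLE table_period (ID INTEGER PRIMARY KEY, FK_ID_product INTEGER);
    CREATE TABLE table_special_offer (ID INTEGER PRIMARY KEY, FK_ID_period INTEGER);
    -- Hypothetical sample data.
    INSERT INTO table_product VALUES (1, 'gun oil'), (2, 'holster');
    INSERT INTO table_period VALUES (10, 1), (11, 1), (12, 2);
    INSERT INTO table_special_offer VALUES (100, 10), (101, 12), (102, 11);
""")

# Each filter is itself a query, so the database resolves the IDs.
product_filter = "SELECT ID FROM table_product WHERE label LIKE 'gun%'"
period_filter = (
    f"SELECT ID FROM table_period WHERE FK_ID_product IN ({product_filter})"
)

products = conn.execute(
    f"SELECT * FROM table_product WHERE ID IN ({product_filter}) ORDER BY ID"
).fetchall()
periods = conn.execute(
    f"SELECT * FROM table_period WHERE ID IN ({period_filter}) ORDER BY ID"
).fetchall()
offers = conn.execute(
    f"SELECT * FROM table_special_offer "
    f"WHERE FK_ID_period IN ({period_filter}) ORDER BY ID"
).fetchall()

print(products, periods, offers)
```

The mapping of one row to one business object is preserved, at the cost of evaluating the product filter again inside the later queries.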

SELECT *
FROM
table_product tp
INNER JOIN table_period tper
ON tp.ID = tper.FK_ID_product
INNER JOIN table_special_offer so
ON tper.ID = so.FK_ID_period
WHERE
tp.label like 'gun%'

First some code...
Using JOIN:
SELECT
table_product.* --'Explicit table calls just for organisation sake'
, table_period.*
, table_special_offer.*
FROM
table_product
INNER JOIN table_period
ON table_product.ID = table_period.FK_ID_product
INNER JOIN table_special_offer
ON table_period.ID = table_special_offer.FK_ID_period
WHERE
table_product.label like 'gun%'
Using IN:
SELECT
*
FROM
table_special_offer
WHERE FK_ID_period IN
(
SELECT
ID
FROM
table_period
WHERE FK_ID_product IN
(
SELECT
ID
FROM
table_product
WHERE label like 'gun%'
)
)
Depending on how well your tables are indexed, both can be used. Inner joins, as the others have suggested, are an efficient way to run your query and return all data for the three tables. If you only need the IDs from table_product and table_period, then nested IN clauses can work well for applying search criteria to indexed tables (IN can be fine when the values compared are integers, as I assume your FK_ID_product is).
An important thing to remember is that every database and table design behaves differently; a query optimised for one database will not necessarily perform the same in another. Try all the possibilities at hand and use the one that works best for you. The query analyser can be incredibly useful when you need to check performance.
I had this situation when we were trying to join up customer accounts to their appropriate addresses via an ID join and a linked table based condition (we had another table which showed customers with certain equipment which we had to do a string search on.) Strangely enough it was quicker for us to use both methods in the one query:
--The query with the WHERE Desc LIKE '%Equipment%' was "joined" to the client table using the IN clause and then this was joined onto the addresses table:
SELECT
Address.*
, Customers_Filtered.*
FROM
Address AS Address
INNER JOIN
(SELECT Customers.* FROM Customers WHERE ID IN (SELECT CustomerID FROM Equipment WHERE [Desc] LIKE '%Equipment search here%')) AS Customers_Filtered
ON Address.CustomerID = Customers_Filtered.ID
This style of query (I apologise if my syntax isn't exactly correct) ended up being more efficient and easier to organise after the overall query got more complicated.
Hope this has helped. Follow @AdaTheDev's article link; it is definitely a good resource.

A JOIN gives you the same results.
SELECT so.Col1
, so.Col2
FROM table_product pt
INNER JOIN table_period pd ON pd.FK_ID_product = pt.ID
INNER JOIN table_special_offer so ON so.FK_ID_Period = pd.ID
WHERE pt.label LIKE 'gun%'

I'd be interested to know if this might make an improvement:
WITH products(prdid) AS (
SELECT
ID
FROM
table_product
WHERE
label like 'gun%'
),
periods(perid) AS (
SELECT
ID
FROM
table_period
INNER JOIN products
ON FK_ID_product = prdid
),
offers(offid) AS (
SELECT
ID
FROM
table_special_offer
INNER JOIN periods
ON FK_ID_period = perid
)
... just a suggestion...
