Combining INNER JOIN and LEFT JOIN - sql

I'm not asking how to write a good query; I'm asking whether these three queries return the same results.
Query 1
SELECT v1.id
FROM (
SELECT DISTINCT t1.id
FROM t1 LEFT JOIN t2 ON t1.id = t2.id
WHERE t2.id IS NULL
) v1 INNER JOIN (
SELECT DISTINCT t3.id
FROM t3 LEFT JOIN t4 ON t3.id = t4.id
WHERE t4.id IS NULL
) v2 ON v1.id = v2.id;
Query 2
SELECT DISTINCT t1.id
FROM (t1 LEFT JOIN t2 ON t1.id = t2.id)
INNER JOIN (t3 LEFT JOIN t4 ON t3.id = t4.id) ON t1.id = t3.id
WHERE t2.id IS NULL AND t4.id IS NULL;
Query 3
SELECT DISTINCT t1.id
FROM t1 LEFT JOIN t2 ON t1.id = t2.id
INNER JOIN t3 ON t1.id = t3.id LEFT JOIN t4 ON t3.id = t4.id
WHERE t2.id IS NULL AND t4.id IS NULL;
The queries are not hard-coded by a programmer; they are generated dynamically from user input.
For example, when the user asks to find ids in t1 (but not in t2) and in t3 (but not in t4), their intention is Query 1. My program currently generates Query 3, and it looks OK. I'm wondering whether Query 3 gives wrong results in some cases, so that it should be changed to Query 2 or Query 1.
The user input shown above is just an example, and converting user input to JOIN statements is difficult, at least for me.
Thanks in advance.

This is the original query:
SELECT v1.id
FROM (SELECT DISTINCT t1.id
FROM t1 LEFT JOIN t2 ON t1.id = t2.id
WHERE t2.id IS NULL
) v1 INNER JOIN
(SELECT DISTINCT t3.id
FROM t3 LEFT JOIN t4 ON t3.id = t4.id
WHERE t4.id IS NULL
) v2
ON v1.id = v2.id;
If I understand correctly, you want ids that are in t1 and t3, but not in t2 and t4.
I would express the second query as:
SELECT distinct t1.id
FROM t1 INNER JOIN
t3
on t1.id = t3.id LEFT JOIN
t2
on t1.id = t2.id LEFT JOIN
t4
on t1.id = t4.id
WHERE t2.id IS NULL AND t4.id IS NULL;
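One way to gain some confidence that the nested form, the question's Query 3, and the flattened form above agree is to run all three against a small data set. The following is only a sketch: it uses Python's built-in sqlite3 module rather than the asker's (unstated) DBMS, the sample ids are invented, and it assumes the id columns contain no NULLs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t1 (id INTEGER); CREATE TABLE t2 (id INTEGER);
    CREATE TABLE t3 (id INTEGER); CREATE TABLE t4 (id INTEGER);
    -- invented sample data; id 1 appears twice in t1 on purpose
    INSERT INTO t1 VALUES (1), (1), (2), (3), (4), (5);
    INSERT INTO t2 VALUES (2), (5);
    INSERT INTO t3 VALUES (1), (2), (3), (6);
    INSERT INTO t4 VALUES (3);
""")

# Query 1 from the question: two anti-joins, then an inner join.
nested = """
    SELECT v1.id
    FROM (SELECT DISTINCT t1.id FROM t1 LEFT JOIN t2 ON t1.id = t2.id
          WHERE t2.id IS NULL) v1
    INNER JOIN
         (SELECT DISTINCT t3.id FROM t3 LEFT JOIN t4 ON t3.id = t4.id
          WHERE t4.id IS NULL) v2
    ON v1.id = v2.id
    ORDER BY v1.id
"""
# Query 3 from the question: joins evaluated left to right.
query3 = """
    SELECT DISTINCT t1.id
    FROM t1 LEFT JOIN t2 ON t1.id = t2.id
    INNER JOIN t3 ON t1.id = t3.id LEFT JOIN t4 ON t3.id = t4.id
    WHERE t2.id IS NULL AND t4.id IS NULL
    ORDER BY t1.id
"""
# Flattened form from this answer.
flat = """
    SELECT DISTINCT t1.id
    FROM t1
    INNER JOIN t3 ON t1.id = t3.id
    LEFT JOIN t2 ON t1.id = t2.id
    LEFT JOIN t4 ON t1.id = t4.id
    WHERE t2.id IS NULL AND t4.id IS NULL
    ORDER BY t1.id
"""
nested_ids = [row[0] for row in conn.execute(nested)]
query3_ids = [row[0] for row in conn.execute(query3)]
flat_ids = [row[0] for row in conn.execute(flat)]
print(nested_ids, query3_ids, flat_ids)
```

Agreement on one data set is not a proof of equivalence, but it is a cheap regression check for a query generator.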

I read your original query as asking for "all ids in T1 which are in both T1 and T3 but not in T2 or T4". Is that correct? If so, my query would be:
SELECT DISTINCT t1.id
FROM t1
WHERE EXISTS (SELECT 1 FROM t3 WHERE t1.id = t3.id)
AND NOT EXISTS (SELECT 1 FROM t2 WHERE t1.id = t2.id)
AND NOT EXISTS (SELECT 1 FROM t4 WHERE t1.id = t4.id)
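As a rough illustration of the EXISTS / NOT EXISTS form, here is a sketch run through Python's sqlite3 module. The sample ids are invented, and SQLite stands in for whatever DBMS is actually in use.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t1 (id INTEGER); CREATE TABLE t2 (id INTEGER);
    CREATE TABLE t3 (id INTEGER); CREATE TABLE t4 (id INTEGER);
    -- invented sample data; id 1 appears twice in t1 on purpose
    INSERT INTO t1 VALUES (1), (1), (2), (3), (4), (5);
    INSERT INTO t2 VALUES (2), (5);
    INSERT INTO t3 VALUES (1), (2), (3), (6);
    INSERT INTO t4 VALUES (3);
""")

body = """
    FROM t1
    WHERE EXISTS (SELECT 1 FROM t3 WHERE t1.id = t3.id)
      AND NOT EXISTS (SELECT 1 FROM t2 WHERE t1.id = t2.id)
      AND NOT EXISTS (SELECT 1 FROM t4 WHERE t1.id = t4.id)
    ORDER BY t1.id
"""
with_distinct = [r[0] for r in conn.execute("SELECT DISTINCT t1.id" + body)]
without_distinct = [r[0] for r in conn.execute("SELECT t1.id" + body)]
print(with_distinct, without_distinct)
```

The subqueries never multiply rows, so DISTINCT is only needed here because t1 itself contains a duplicate id.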

I'm not sure what the real objective is, or which DBMS is being targeted, so here are a few variants.
-- oracle
SELECT id FROM ( SELECT id FROM t1 MINUS SELECT id FROM t2 ) a
INTERSECT
SELECT id FROM ( SELECT id FROM t3 MINUS SELECT id FROM t4 ) b
;
-- sql server
SELECT id FROM ( SELECT id FROM t1 EXCEPT SELECT id FROM t2 ) a
INTERSECT
SELECT id FROM ( SELECT id FROM t3 EXCEPT SELECT id FROM t4 ) b
;
SELECT c.id from (
SELECT t1.id
FROM t1 LEFT JOIN t2 ON t1.id = t2.id
WHERE t2.id IS NULL
UNION ALL
SELECT t3.id
FROM t3 LEFT JOIN t4 ON t3.id = t4.id
WHERE t4.id IS NULL
) c GROUP BY c.id HAVING count(*) >= 2
;
The next one is potentially quite efficient, BUT it has a caveat:
-- conditions apply: assumes t2 and t4 cannot have ids not in t1 or t3 respectively,
-- and that ids are unique within each table
SELECT c.id from (
SELECT a.id from (
(SELECT ID FROM T1)
UNION ALL
(SELECT ID FROM T2)
) a GROUP BY a.id HAVING count(*) = 1
UNION ALL
SELECT b.id from (
(SELECT ID FROM T3)
UNION ALL
(SELECT ID FROM T4)
) b GROUP BY b.id HAVING count(*) = 1
) c GROUP BY c.id HAVING count(*) >= 2
;
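To make the caveat concrete, the sketch below (Python's sqlite3, invented data) runs the EXCEPT / INTERSECT form next to the UNION ALL counting shortcut on tables where t2 and t4 both contain an id (9) that is missing from t1 and t3. SQLite does not accept parentheses around the members of a compound SELECT, so they are dropped here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t1 (id INTEGER); CREATE TABLE t2 (id INTEGER);
    CREATE TABLE t3 (id INTEGER); CREATE TABLE t4 (id INTEGER);
    -- invented data: id 9 is in t2 and t4 but not in t1 or t3
    INSERT INTO t1 VALUES (1), (2);
    INSERT INTO t2 VALUES (2), (9);
    INSERT INTO t3 VALUES (1), (3);
    INSERT INTO t4 VALUES (3), (9);
""")

set_ops = """
    SELECT id FROM (SELECT id FROM t1 EXCEPT SELECT id FROM t2)
    INTERSECT
    SELECT id FROM (SELECT id FROM t3 EXCEPT SELECT id FROM t4)
    ORDER BY id
"""
shortcut = """
    SELECT c.id FROM (
        SELECT a.id FROM (SELECT id FROM t1 UNION ALL SELECT id FROM t2) a
        GROUP BY a.id HAVING count(*) = 1
        UNION ALL
        SELECT b.id FROM (SELECT id FROM t3 UNION ALL SELECT id FROM t4) b
        GROUP BY b.id HAVING count(*) = 1
    ) c GROUP BY c.id HAVING count(*) >= 2
    ORDER BY c.id
"""
set_ids = [r[0] for r in conn.execute(set_ops)]
shortcut_ids = [r[0] for r in conn.execute(shortcut)]
print(set_ids, shortcut_ids)
```

The shortcut also reports id 9 because it only counts occurrences and cannot tell which table an id came from, so it is safe only when the stated conditions hold.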

Related

Merging different conditional select queries into one - with structure

I have a query in SQL Server 2008 as below:
;with PassiveCases
AS
(
Select t1.id, COUNT(*) as PassiveCounts
from #Table1 t1
inner join Table2 t2 on t2.Id = t1.SurgicalRequest_Id
inner join Table3 t3 on t3.Id = t2.ReportingIndicator_Id
inner join Table3 t4 on t4.Id = t3.Id
where t4.Passive = 1
),
ActiveCases
AS
(
Select t1.id, COUNT(*) as ActiveCounts
from #Table1 t1
inner join Table2 t2 on t2.Id = t1.SurgicalRequest_Id
inner join Table3 t3 on t3.Id = t2.ReportingIndicator_Id
inner join Table3 t4 on t4.Id = t3.Id
where t4.Passive = 0
)
What I want is to combine these two CTEs into one, to improve efficiency and readability. I am looking for something like the following (it doesn't work):
:With Cases
AS
(
Select t1.id, case when t4.Passive = 1 then COUNT(*) as PassiveCounts else COUNT(*) as ActiveCounts
from #Table1 t1
inner join Table2 t2 on t2.Id = t1.SurgicalRequest_Id
inner join Table3 t3 on t3.Id = t2.ReportingIndicator_Id
inner join Table3 t4 on t4.Id = t3.Id
)
Any help would be appreciated.
Use conditional aggregation.
Select t1.id
,sum(case when t4.Passive = 1 then 1 else 0 end) as PassiveCounts
,sum(case when t4.Passive <> 1 then 1 else 0 end) as ActiveCounts
from #Table1 t1
inner join Table2 t2 on t2.Id = t1.SurgicalRequest_Id
inner join Table3 t3 on t3.Id = t2.ReportingIndicator_Id
inner join Table3 t4 on t4.Id = t3.Id
group by t1.id
You just want conditional aggregation. In your case, though, you can do this with arithmetic:
With Cases AS (
Select t1.id, sum(t4.Passive) as PassiveCount, sum(1 - t4.Passive) as ActiveCount
from #Table1 t1 inner join
Table2 t2
on t2.Id = t1.SurgicalRequest_Id inner join
Table3 t3
on t3.Id = t2.ReportingIndicator_Id inner join
Table3 t4
on t4.Id = t3.Id
group by t1.id
)
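According to your question, t4.Passive only takes on two values (0 and 1), so the arithmetic should do what you want. If Passive is declared as bit, cast it to int first, because SUM does not accept bit in SQL Server.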
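A minimal sketch of both approaches, using Python's sqlite3 module and a single hypothetical table `cases(id, passive)` that stands in for the result of the four-table join in the question:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- hypothetical stand-in for the joined rows; passive is 0 or 1
    CREATE TABLE cases (id INTEGER, passive INTEGER);
    INSERT INTO cases VALUES (1, 1), (1, 1), (1, 0), (2, 0);
""")

sql = """
    SELECT id,
           SUM(CASE WHEN passive = 1 THEN 1 ELSE 0 END) AS passive_case,
           SUM(CASE WHEN passive = 0 THEN 1 ELSE 0 END) AS active_case,
           SUM(passive)                                 AS passive_arith,
           SUM(1 - passive)                             AS active_arith
    FROM cases
    GROUP BY id
    ORDER BY id
"""
rows = conn.execute(sql).fetchall()
print(rows)
```

Both styles produce the same counts as long as `passive` is never NULL and only holds 0 or 1. The CASE form is the more defensive choice if that is not guaranteed.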

Multiple Full Outer Joins

I want to use the result of a FULL OUTER JOIN as a table and FULL OUTER JOIN it to another table. What syntax should I use?
For example, T1, T2 and T3 are my tables, each with columns id and name. I need something like:
T1 FULL OUTER JOIN T2 on T1.id = T2.id ==> Let this be named X
X FULL OUTER JOIN T3 on X.id = t3.id
In the final ON clause, I want T3.id to match either T1.id or T2.id. Any alternative way to achieve this is also OK.
-- list specific columns; SELECT * in the derived table would repeat the id column
SELECT COALESCE(X.id, T3.id) AS id, X.t1_name, X.t2_name, T3.name AS t3_name
FROM
(
SELECT COALESCE(T1.id, T2.id) AS id, T1.name AS t1_name, T2.name AS t2_name
FROM T1 FULL OUTER JOIN T2 ON T1.id = T2.id
) AS X
FULL OUTER JOIN T3 ON X.id = T3.id
Often, chains of full outer joins don't behave quite as expected. One replacement uses LEFT JOIN. This works best when one table already has all the ids you need, but you can also construct that set of ids:
from (select id from t1 union
select id from t2 union
select id from t3
) ids left join
t1
on ids.id = t1.id left join
t2
on ids.id = t2.id left join
t3
on ids.id = t3.id
Note that the first subquery can often be replaced by a table. If you have such a table, you can select the matching rows in the where clause:
from ids left join
t1
on ids.id = t1.id left join
t2
on ids.id = t2.id left join
t3
on ids.id = t3.id
where t1.id is not null or t2.id is not null or t3.id is not null
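The "spine of ids plus LEFT JOINs" idea can be tried out as follows. This is a sketch in Python's sqlite3 with invented rows; the `id` and `name` columns come from the question, while the values are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t1 (id INTEGER, name TEXT);
    CREATE TABLE t2 (id INTEGER, name TEXT);
    CREATE TABLE t3 (id INTEGER, name TEXT);
    -- invented rows: no single table contains every id
    INSERT INTO t1 VALUES (1, 'a'), (2, 'b');
    INSERT INTO t2 VALUES (2, 'B'), (3, 'C');
    INSERT INTO t3 VALUES (3, 'z'), (4, 'w');
""")

sql = """
    SELECT ids.id, t1.name, t2.name, t3.name
    FROM (SELECT id FROM t1 UNION
          SELECT id FROM t2 UNION
          SELECT id FROM t3) ids
    LEFT JOIN t1 ON ids.id = t1.id
    LEFT JOIN t2 ON ids.id = t2.id
    LEFT JOIN t3 ON ids.id = t3.id
    ORDER BY ids.id
"""
rows = conn.execute(sql).fetchall()
for row in rows:
    print(row)
```

Every id appears exactly once, with NULL (None in Python) wherever a table has no matching row, which is the result a chain of full outer joins is usually meant to produce.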
You can do it as you suggested, using IN():
FROM T1
FULL OUTER JOIN T2
ON(T1.id = T2.id)
FULL OUTER JOIN T3
ON(T3.ID IN(T2.id,T1.id))
Or, depending on what you need, I think you can do it with a UNION:
SELECT * FROM
(SELECT name,id from T1
UNION
SELECT name,id from T2) x
FULL OUTER JOIN T3
ON(t3.id = x.id)

Left Join; Only if there is at least one matching record

I need to join two pairs of tables. If there is an ID in Table1 that can also be found in Table3 I need to join the tables. If there is no matching ID from Table1 in Table3, I need to not join the tables.
For example, if at least one ID in Table1 also appears in Table3, do something that is effectively this:
SELECT *
FROM Table1 AS t1
INNER JOIN Table2 AS t2 ON t1.ID = t2.ID
LEFT JOIN Table3 AS t3 ON t1.ID = t3.ID
LEFT JOIN Table4 AS t4 ON t3.ID = t4.ID
If no IDs match between Table1 and Table3, do something that is effectively this:
SELECT *
FROM Table1 AS t1
INNER JOIN Table2 AS t2 ON t1.ID = t2.ID
Just translating your question into SQL, you can do this:
IF EXISTS(SELECT * FROM Table1 T1 INNER JOIN Table3 T3 ON T1.ID=T3.ID)
SELECT *
FROM Table1 AS t1
INNER JOIN Table2 AS t2 ON t1.ID = t2.ID
LEFT JOIN Table3 AS t3 ON t1.ID = t3.ID
LEFT JOIN Table4 AS t4 ON t3.ID = t4.ID
ELSE
SELECT *
FROM Table1 AS t1
INNER JOIN Table2 AS t2 ON t1.ID = t2.ID
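Where the DBMS has no procedural IF (SQLite, for instance), the same branching can be done in the calling code. The sketch below uses Python's sqlite3; the helper name `run` and the single-column tables are assumptions made purely for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Table1 (ID INTEGER); CREATE TABLE Table2 (ID INTEGER);
    CREATE TABLE Table3 (ID INTEGER); CREATE TABLE Table4 (ID INTEGER);
    INSERT INTO Table1 VALUES (1), (2);
    INSERT INTO Table2 VALUES (1), (2);
""")

WIDE = """
    SELECT * FROM Table1 AS t1
    INNER JOIN Table2 AS t2 ON t1.ID = t2.ID
    LEFT JOIN Table3 AS t3 ON t1.ID = t3.ID
    LEFT JOIN Table4 AS t4 ON t3.ID = t4.ID
    ORDER BY t1.ID
"""
NARROW = """
    SELECT * FROM Table1 AS t1
    INNER JOIN Table2 AS t2 ON t1.ID = t2.ID
    ORDER BY t1.ID
"""

def run(conn):
    """Hypothetical helper: pick the query based on an EXISTS probe."""
    (has_match,) = conn.execute(
        "SELECT EXISTS (SELECT 1 FROM Table1 t1 "
        "INNER JOIN Table3 t3 ON t1.ID = t3.ID)"
    ).fetchone()
    return conn.execute(WIDE if has_match else NARROW).fetchall()

before = run(conn)                      # Table3 is empty: narrow query
conn.execute("INSERT INTO Table3 VALUES (1)")
after = run(conn)                       # a match exists: wide query
print(before, after)
```

Note that the two branches return different column sets, so the caller has to cope with both shapes.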

Oracle SQL: JOIN by dblink

I use the following SQL query in an Oracle DB:
SELECT T1.*,
T3.*
FROM MyTable1 T1
INNER JOIN MyTable2 T2 ON T2.Id1 = T1.Id
LEFT JOIN MyTable3#dblink1 T3 ON T3.Id2 = T2.Id
This query is very simple and fast (about 1 minute; T1 contains about 1 million rows, T3 more than 10 million rows). Now I want to use MyTable4 from dblink1 to filter the selected rows. For this, I use a subquery:
SELECT T1.*,
T3.*
FROM MyTable1 T1
INNER JOIN MyTable2 T2 ON T2.Id1 = T1.Id
LEFT JOIN (SELECT Sub_T1.*
FROM MyTable3#dblink1 Sub_T1
INNER JOIN MyTable4#dblink1 Sub_T2 ON Sub_T2.Id3 = Sub_T1.Id
WHERE
Sub_T2.MyColumn1 = 'required value') T3 ON T3.Id2 = T2.Id
But this query is too slow (more than 20 minutes). If I rewrite the query as:
SELECT T1.*,
T3.*
FROM MyTable1 T1
INNER JOIN MyTable2 T2 ON T2.Id1 = T1.Id
LEFT JOIN MyTable3#dblink1 T3 ON T3.Id2 = T2.Id
LEFT JOIN MyTable4#dblink1 T4 ON T4.Id3 = T3.Id
WHERE
T4.MyColumn1 = 'required value'
then the query is fast again, but I don't like the result: I want the T3 columns to come back as NULL when the filter does not match, rather than losing the row.
How can I speed up my second query?
Does phrasing the query with parentheses solve the problem?
SELECT T1.*,
T3.*
FROM MyTable1 T1 INNER JOIN
MyTable2 T2
ON T2.Id1 = T1.Id LEFT JOIN
(MyTable3#dblink1 T3 JOIN
MyTable4#dblink1 T4
ON T4.Id3 = T3.Id AND
T4.MyColumn1 = 'required value'
)
ON T3.Id2 = T2.Id;
Or, also:
SELECT T1.*,
T3.*
FROM MyTable1 T1 INNER JOIN
MyTable2 T2
ON T2.Id1 = T1.Id LEFT JOIN
MyTable3#dblink1 T3
ON T3.Id2 = T2.Id AND
EXISTS (SELECT 1 FROM MyTable4#dblink1 T4 WHERE T4.Id3 = T3.Id AND T4.MyColumn1 = 'required value'
);
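The semantic difference the question describes (a WHERE filter on the outer-joined table drops rows, while a filter inside the joined branch keeps them with NULLs) can be reproduced locally. This sketch uses Python's sqlite3 with small invented tables standing in for the remote ones, and a derived table rather than a parenthesized join. It says nothing about performance over a database link.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- local stand-ins for MyTable1..MyTable4; all rows are invented
    CREATE TABLE t1 (id INTEGER);
    CREATE TABLE t2 (id INTEGER, id1 INTEGER);
    CREATE TABLE t3 (id INTEGER, id2 INTEGER, val TEXT);
    CREATE TABLE t4 (id3 INTEGER, col TEXT);
    INSERT INTO t1 VALUES (1), (2);
    INSERT INTO t2 VALUES (10, 1), (20, 2);
    INSERT INTO t3 VALUES (100, 10, 'x'), (200, 20, 'y');
    INSERT INTO t4 VALUES (100, 'required value'), (200, 'other');
""")

# Filter in WHERE: rows whose t4 match fails the filter disappear.
where_sql = """
    SELECT t1.id, t3.val
    FROM t1
    INNER JOIN t2 ON t2.id1 = t1.id
    LEFT JOIN t3 ON t3.id2 = t2.id
    LEFT JOIN t4 ON t4.id3 = t3.id
    WHERE t4.col = 'required value'
    ORDER BY t1.id
"""
# Filter inside the joined branch: unmatched rows survive with NULLs.
branch_sql = """
    SELECT t1.id, f.val
    FROM t1
    INNER JOIN t2 ON t2.id1 = t1.id
    LEFT JOIN (SELECT t3.*
               FROM t3 INNER JOIN t4 ON t4.id3 = t3.id
               WHERE t4.col = 'required value') f ON f.id2 = t2.id
    ORDER BY t1.id
"""
where_rows = conn.execute(where_sql).fetchall()
branch_rows = conn.execute(branch_sql).fetchall()
print(where_rows, branch_rows)
```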

How do I force join inside a left join branch?

I need to LEFT JOIN to the entire t2+t3 branch: a t2 row should only count as a match for t1 if it also has a matching t3 row, so I want to enforce the t2-to-t3 join inside the branch.
SELECT T1.name,T2.bob,T3.a
FROM T1
LEFT JOIN T2 ON t1.id = t2.t1_id
JOIN T3 ON t2.id = T3.t2_id
What is the syntax?
Sample data:
T1 [id,name]
1 aaa
2 bbb
3 ccc
T2 [id,t1_id,bob]
1,1,777
2,1,888
3,2,999
T3[id,t2_id,a]
1,2,'yeh'
EXPECTED RESULT:
[name] , [a] , [bob]
aaa , 'yeh' , 888
bbb , NULL , NULL
ccc , NULL , NULL
EDIT: This query returns the expected result for your sample data.
(Note that it was written with help from Treefrog's answer below.)
SELECT t1.[name], t3.a, t2.bob
FROM T2 as t2
JOIN T3 as t3 ON t3.t2_id = t2.id
RIGHT JOIN T1 as t1 ON t1.id = t2.t1_id
My older answer:
SELECT a
FROM T1 as t1
INNER JOIN T2 as t2 ON t1.id = t2.t1_id
LEFT JOIN T3 as t3 ON t2.id = t3.t2_id
SELECT T3.a
FROM T2
JOIN T3 ON T3.t2_id = T2.id
RIGHT JOIN T1 ON T1.id = T2.t1_id
Both of these will give you your result. Not sure which would perform better.
NESTED STATEMENT:
SELECT [name], NULL AS [a], NULL AS [bob]
FROM t1
LEFT OUTER JOIN t2 ON t1.id = t2.t1_id
LEFT OUTER JOIN t3 ON t2.id = t3.t2_id
WHERE t3.t2_id IS NULL
AND (SELECT COUNT(*) FROM t1 AS t1b LEFT OUTER JOIN t2 AS t2b ON t1b.id = t2b.t1_id LEFT OUTER JOIN t3 AS t3b ON t2b.id = t3b.t2_id WHERE t1.id = t1b.id AND t3b.a IS NOT NULL) = 0
UNION
SELECT [name], [a], [bob]
FROM t1
LEFT OUTER JOIN t2 ON t1.id = t2.t1_id
LEFT OUTER JOIN t3 ON t2.id = t3.t2_id
WHERE t3.t2_id IS NOT NULL
ORDER BY [name]
TEMPORARY TABLE:
CREATE TABLE #tmp_Rslt([name] varchar(50), [a] varchar(50), [bob] varchar(50))
--select matches
INSERT INTO #tmp_Rslt
SELECT [name], [a], [bob]
FROM t1
LEFT OUTER JOIN t2 ON t1.id = t2.t1_id
LEFT OUTER JOIN t3 ON t2.id = t3.t2_id
WHERE t3.t2_id IS NOT NULL
ORDER BY t1.name
--select t1's that didn't have matches
INSERT INTO #tmp_Rslt
SELECT [name], NULL AS [a], NULL AS [bob]
FROM t1
LEFT OUTER JOIN t2 ON t1.id = t2.t1_id
LEFT OUTER JOIN t3 ON t2.id = t3.t2_id
WHERE t3.t2_id IS NULL
AND t1.[name] NOT IN (SELECT DISTINCT [name] FROM #tmp_Rslt)
SELECT *
FROM #tmp_Rslt
--cleanup.
DROP TABLE #tmp_Rslt
This should work in MySQL
SELECT * FROM T1
LEFT JOIN (T2
INNER JOIN T3 ON T2.id=T3.t2_id
) ON T1.id= T2.t1_id
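To check the nested-branch idea against the sample data, here is a sketch in Python's sqlite3. It makes two assumptions: the T2 rows have distinct ids 1, 2 and 3, and the parenthesized join is written as an equivalent derived table (aliased `b`, a name chosen only for this example) to avoid relying on how a particular engine parses nested joins.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t1 (id INTEGER, name TEXT);
    CREATE TABLE t2 (id INTEGER, t1_id INTEGER, bob INTEGER);
    CREATE TABLE t3 (id INTEGER, t2_id INTEGER, a TEXT);
    INSERT INTO t1 VALUES (1, 'aaa'), (2, 'bbb'), (3, 'ccc');
    -- assumption: T2 ids are 1, 2, 3
    INSERT INTO t2 VALUES (1, 1, 777), (2, 1, 888), (3, 2, 999);
    INSERT INTO t3 VALUES (1, 2, 'yeh');
""")

# The query from the question: the trailing inner join discards
# every t1 row that has no t2 + t3 match.
naive = """
    SELECT t1.name, t3.a, t2.bob
    FROM t1
    LEFT JOIN t2 ON t1.id = t2.t1_id
    JOIN t3 ON t2.id = t3.t2_id
    ORDER BY t1.name
"""
# The t2 + t3 branch is joined first, then LEFT JOINed to t1 as a unit.
branch = """
    SELECT t1.name, b.a, b.bob
    FROM t1
    LEFT JOIN (SELECT t2.t1_id, t2.bob, t3.a
               FROM t2 INNER JOIN t3 ON t3.t2_id = t2.id) b
           ON b.t1_id = t1.id
    ORDER BY t1.name
"""
naive_rows = conn.execute(naive).fetchall()
branch_rows = conn.execute(branch).fetchall()
print(naive_rows)
print(branch_rows)
```

The second result keeps bbb and ccc with NULLs, matching the expected output in the question.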