Join query where table references itself - sql

I'm using Oracle 10, but the best way to ask this question is with an example.
select *
from t1, t2
where t1.id = t2.id
and t1.otherID = (select max(otherID)
from t2
where id = THE ID FROM THE OUTER QUERY T1
)
I think you see where I'm trying to go with this. I need to reference t1 in the subquery to join it to the max of t2.
I need to know how to create a query like this.
"THE ID FROM THE OUTER QUERY T1" is where my confusion is.
I tried using t1.id, but did not get results.

Try the following:
select t1.*, t2.*
from t1
join t2 on t1.id = t2.id
join (select id, max(otherID) as max_otherID
from t2
group by id
) a ON a.id = t1.id and a.max_otherID = t1.otherID
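Joining to a derived table like this can perform better than a correlated subquery in the WHERE clause, because the aggregate is computed once per id instead of once per outer row, although the actual difference depends on the optimizer and the data.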
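For comparison, here is a minimal sketch that runs both forms side by side. It uses Python's built-in sqlite3 module rather than Oracle, and the sample rows are made up; only the names t1, t2, id and otherID come from the question. The inner alias inner_t2 is also my own addition, there to make it obvious which id belongs to the subquery and which to the outer query.

```python
import sqlite3

# Hypothetical sample data; only the table and column names come from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t1 (id INTEGER, otherID INTEGER);
    CREATE TABLE t2 (id INTEGER, otherID INTEGER);
    INSERT INTO t1 VALUES (1, 10), (1, 20), (2, 5);
    INSERT INTO t2 VALUES (1, 10), (1, 20), (2, 5), (2, 7);
""")

# Correlated subquery: the inner copy of t2 gets its own alias, so
# inner_t2.id = t1.id compares the inner row against the outer t1 row.
correlated = conn.execute("""
    SELECT DISTINCT t1.id, t1.otherID
    FROM t1
    JOIN t2 ON t1.id = t2.id
    WHERE t1.otherID = (SELECT MAX(inner_t2.otherID)
                        FROM t2 AS inner_t2
                        WHERE inner_t2.id = t1.id)
    ORDER BY t1.id
""").fetchall()

# Join to a derived table that holds MAX(otherID) per id.
derived = conn.execute("""
    SELECT DISTINCT t1.id, t1.otherID
    FROM t1
    JOIN t2 ON t1.id = t2.id
    JOIN (SELECT id, MAX(otherID) AS max_otherID
          FROM t2
          GROUP BY id) a
      ON a.id = t1.id AND a.max_otherID = t1.otherID
    ORDER BY t1.id
""").fetchall()

print(correlated)  # [(1, 20)]
print(derived)     # [(1, 20)]
```

With this data only the t1 row (1, 20) survives, because 20 is the largest otherID in t2 for id 1, while the largest for id 2 is 7 and t1 has no such row.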

Related

How to fix "Expressions referencing the outer query..." error in Spark-SQL?

I have an SQL query with a subquery running on Spark. I get this error: "Expressions referencing the outer query are not supported outside of WHERE/HAVING clauses". Can you help me to find out the reason?
select distinct NAME from table1, table2 t
where t.ID = (select min(t.ID) from table1 a where a.WID = table1.WID) and
t.WID = table1.WID and
t.VID = table1.VID
The error message is as follows:
"org.apache.spark.sql.AnalysisException: Expressions referencing the outer query are not supported outside of WHERE/HAVING clauses:
Aggregate [min(outer(FAILURE_ID#3104)) AS min(outer())#3404]"
Learn to use proper, explicit, standard JOIN syntax!
You can write your query with all table references in the FROM clause:
select distinct NAME
from table1 t1 join
table2 t2
on t2.WID = t1.WID and
t2.VID = t1.VID join
(select tt1.WID, min(tt1.id) as min_id
from table1 tt1
group by tt1.WID
) tt1
on tt1.WID = t1.WID and tt1.min_id = t1.id;
Or use window functions:
select distinct NAME
from table2 t2 join
(select t1.*,
min(t1.id) over (partition by t1.WID) as min_id
from table1 t1
) t1
on t2.WID = t1.WID and
t2.VID = t1.VID and
t1.min_id = t1.id;
EDIT:
The above assumes a reasonable interpretation of your query. To mimic the logic as written, you can do:
select distinct NAME
from table1 t1 join
table2 t2
on t2.WID = t1.WID and
t2.VID = t1.VID
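where t2.ID is not null;
That is all the subquery is doing: t is the alias for table2 in your query, so min(t.ID) just returns the outer row's own ID, and the comparison only fails when that ID is NULL.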

Convert to join query

select t.* from table1 t where t.id NOT IN(
select Id from t2 where usrId in
(select usrId from t3 where sId=value));
The result I need is this: if there are matching ids in t1 and t2, those ids should be omitted and only the remaining rows should be returned. I tried converting the query into a join, but it is not giving me the result I wanted. Below is my join query.
SELECT t.* FROM table1 t JOIN table2 t2 ON t.Id <> t2.Id
JOIN table3 t3 ON t3.Id=t2.Id WHERE t3.sId= :value
This doesn't fetch the correct result. It returns all the rows, but I want to restrict the result based on the matching ids in table t1 and table t2. Matching ids should be omitted from the result. I will be passing the value for sId.
I believe this to be an accurate refactor of your query using joins. I don't know if we can do away with the subquery, but in any case the logic appears to be the same.
select t1.*
from table1 t1
left join
(
select t2.Id
from table2 t2
inner join table3 t3
on t2.usrId = t3.usrId
where t3.sId = <value>
) t2
on t1.Id = t2.Id
where t2.Id is null
Let's break the problem down and solve it step by step.
So your query
select t.* from table1 t where t.id NOT IN(
select Id from t2 where usrId in
(select usrId from t3 where sId=value));
on converting the inner query to JOIN will yield
select t.* from table1 t where t.id NOT IN
(SELECT T2.ID FROM T2 JOIN T3 on T2.UsrID =T3.UsrID and T3.sID=value)
which on further converting to JOIN with outer table will be
select t.* from table1 t LEFT JOIN
(SELECT T2.ID FROM T2 JOIN T3 on T2.UsrID =T3.UsrID and T3.sID=value)t4
ON t.id =T4.ID
WHERE t4.ID is NULL
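If you want to remove the subquery completely, you can try the following. Note that it is only equivalent when each table1 row matches at most one T2 row; with several T2 matches, a table1 row is returned as soon as any one of them has no matching T3 row.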
SELECT t.*
FROM table1 t
LEFT JOIN T2
ON T.ID=T2.ID
LEFT JOIN T3
ON T3.UsrId=T2.UsrID AND T3.sId=value
WHERE T3.UsrID IS NULL
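To check the rewrites above against each other, here is a small sketch using Python's built-in sqlite3 module. The sample rows and the value 7 are invented for illustration; the table and column names follow the question (using table2 and table3 for t2 and t3). Id 1 is deliberately linked to two users, only one of which has the requested sId, so the difference between the anti-join and the fully flattened version shows up.

```python
import sqlite3

# Hypothetical sample data; names follow the question, rows are made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (Id INTEGER);
    CREATE TABLE table2 (Id INTEGER, usrId INTEGER);
    CREATE TABLE table3 (usrId INTEGER, sId INTEGER);
    INSERT INTO table1 VALUES (1), (2), (3);
    -- Id 1 is linked to two users; only user 100 has sId = 7.
    INSERT INTO table2 VALUES (1, 100), (1, 200), (2, 300);
    INSERT INTO table3 VALUES (100, 7), (300, 8);
""")
value = 7  # assumed value for sId

# Original formulation with nested IN / NOT IN.
not_in = conn.execute("""
    SELECT t.Id FROM table1 t
    WHERE t.Id NOT IN (SELECT t2.Id FROM table2 t2
                       WHERE t2.usrId IN (SELECT t3.usrId FROM table3 t3
                                          WHERE t3.sId = ?))
    ORDER BY t.Id
""", (value,)).fetchall()

# Anti-join: LEFT JOIN to the derived table, keep rows without a match.
anti_join = conn.execute("""
    SELECT t.Id FROM table1 t
    LEFT JOIN (SELECT t2.Id FROM table2 t2
               JOIN table3 t3 ON t2.usrId = t3.usrId AND t3.sId = ?) t4
      ON t.Id = t4.Id
    WHERE t4.Id IS NULL
    ORDER BY t.Id
""", (value,)).fetchall()

# Fully flattened version without any subquery.
flattened = conn.execute("""
    SELECT t.Id FROM table1 t
    LEFT JOIN table2 t2 ON t.Id = t2.Id
    LEFT JOIN table3 t3 ON t3.usrId = t2.usrId AND t3.sId = ?
    WHERE t3.usrId IS NULL
    ORDER BY t.Id
""", (value,)).fetchall()

print(not_in)     # [(2,), (3,)]
print(anti_join)  # [(2,), (3,)]
print(flattened)  # [(1,), (2,), (3,)]
```

The first two queries agree. The flattened one also returns Id 1, because the (1, 200) row in table2 has no table3 match and slips through the IS NULL filter. Separately, NOT IN and the anti-join can also diverge when the subquery returns NULL ids, which this sample data avoids.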

Left join subquery gives invalid object name error

I have a SQL query that looks like this:
SELECT TOP 1000 FROM [Mydb].[dbo].[Table1] AS t1
LEFT JOIN (
SELECT fk_id, Email FROM dbo.Table2
) AS t2 ON t1.id = t2.fk_id
But this gives me the error:
Invalid object name 'dbo.Table2'.
Any idea why SQL Server does not recognize Table2 in my subquery?
PS.
I tried to rename dbo.Table2 to [Mydb].[dbo].[Table2]. But that gives me the same error.
First of all, your query isn't formed correctly, and there is no need for that subquery at all.
Your posted query
SELECT TOP 1000 FROM [Mydb].[dbo].[Table1] AS t1
LEFT JOIN (
SELECT fk_id, Email FROM dbo.Table2
) AS t2 ON t1.id = t2.fk_id
can be simplified as below. Give it a try:
SELECT TOP 1000 * FROM [Table1] t1
LEFT JOIN Table2 t2 ON t1.id = t2.fk_id
I'm not sure if this is the problem, but you have missed the asterisk (*) after TOP 1000. So maybe it should be something like the following:
SELECT TOP 1000 * FROM [Mydb].[dbo].[Table1] AS t1
LEFT JOIN (
SELECT fk_id, Email FROM dbo.Table2
) AS t2 ON t1.id = t2.fk_id
If this doesn't work, remove dbo from dbo.Table2 and then try again.
Hope this helps.

SQL - alternative to left outer join

Is there a standard way in SQL to count the number of rows joined to one table, including a count of 0?
Here is one example:
SELECT t1.id, COUNT(t2.*)
FROM t1 LEFT OUTER JOIN t2 ON ( t1.id = t2.id )
GROUP BY t1.id
I need an alternative because I use ODBC with different databases, and on some databases left joins aren't supported.
SELECT
t1.id,
(SELECT COUNT(*) FROM t2 WHERE t2.id = t1.id) as t2_count
FROM t1
Two options:
Option 1: use the (+) operator (Oracle-specific outer join syntax):
SELECT t1.id, COUNT(t2.id)
FROM t1, t2
WHERE t2.id(+) = t1.id
GROUP BY t1.id
I don't know if it works with all drivers. Option 2, which will work with all drivers, is to create a view and query the view instead.
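As a quick check of the correlated-subquery answer above, here is a minimal sketch using Python's built-in sqlite3 module with invented rows (only t1, t2 and id come from the question). It compares the scalar subquery with a LEFT OUTER JOIN; the join version counts t2.id rather than t2.* so that unmatched rows count as 0.

```python
import sqlite3

# Hypothetical sample data: id 2 has no rows in t2 and should count as 0.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t1 (id INTEGER);
    CREATE TABLE t2 (id INTEGER);
    INSERT INTO t1 VALUES (1), (2), (3);
    INSERT INTO t2 VALUES (1), (1), (3);
""")

# Correlated scalar subquery: no outer join needed, zero counts are kept.
subquery_counts = conn.execute("""
    SELECT t1.id,
           (SELECT COUNT(*) FROM t2 WHERE t2.id = t1.id) AS t2_count
    FROM t1
    ORDER BY t1.id
""").fetchall()

# Reference result using LEFT OUTER JOIN; COUNT(t2.id) ignores the NULLs
# produced for t1 rows without a match.
left_join_counts = conn.execute("""
    SELECT t1.id, COUNT(t2.id)
    FROM t1 LEFT OUTER JOIN t2 ON t1.id = t2.id
    GROUP BY t1.id
    ORDER BY t1.id
""").fetchall()

print(subquery_counts)   # [(1, 2), (2, 0), (3, 1)]
print(left_join_counts)  # [(1, 2), (2, 0), (3, 1)]
```

Both queries return the same counts here, including the 0 for id 2, so the subquery form is a reasonable fallback where outer joins are unavailable.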

SQL Joining three tables and using LEFT OUTER JOIN

I have three tables and two separate SQL queries that work correctly and return the correct results.
If I try to join these three tables, I get null as the result.
First query:
select T1.ID,T3.COMPANY
from T1,T3
where (T1.status!='CLOSED') and (T1.PRIORITY)>5 and T1.CLASSID=T3.CLASSID
Second query:
SELECT T1.ID, T2.DESCRIPTION
FROM T1
LEFT OUTER JOIN T2
ON T1.ID=T2.KEY
WHERE T1.status!='CLOSED'
AND (T2.CREATEDATE= (SELECT MAX(CREATEDATE)
FROM T2
WHERE T2.KEY=T1.ID))
I tried to join them, but I get null as the result:
select T1.ID,T3.COMPANY,T2.DESCRIPTION
from T1
INNER JOIN T3 ON T1.CLASSID=T3.CLASSID
LEFT OUTER JOIN T2
ON T1.ID=T2.KEY
where (T1.status!='CLOSED') AND (T1.PRIORITY)>5
AND (T2.CREATEDATE= (SELECT MAX(CREATEDATE)
FROM T2
WHERE T2.KEY=T1.ID))
It is as if it does not recognize the last part, which takes the MAX value from the T2 table.
What am I doing wrong? Thanks for your help.
Firstly, use an alias for the subquery on table T2.
T2.CREATEDATE =
(SELECT MAX(T2Alias.CREATEDATE)
FROM T2 AS T2Alias
WHERE T2Alias.KEY = T1.ID)
Secondly, consider moving this condition into the ON clause of the LEFT JOIN to table T2.
The first thing that jumps out at me is the new dependency on both T1.PRIORITY > 5 and the T2.CREATEDATE value being equal to the result of the inline query:
AND (T1.PRIORITY) > 5
AND (T2.CREATEDATE =
(SELECT MAX(CREATEDATE) FROM T2 WHERE T2.KEY = T1.ID) )
Without the data it's difficult to check, but this may be the issue.
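The suggestion above to move the MAX condition into the ON clause can be sketched as follows, using Python's built-in sqlite3 module. The rows are invented and may not match the asker's actual situation; only the table and column names come from the question. KEY is double-quoted in this sketch because it is a keyword in SQLite.

```python
import sqlite3

# Hypothetical sample data: T1 row 2 has no rows in T2 at all.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE T1 (ID INTEGER, STATUS TEXT, PRIORITY INTEGER, CLASSID INTEGER);
    CREATE TABLE T2 ("KEY" INTEGER, DESCRIPTION TEXT, CREATEDATE TEXT);
    CREATE TABLE T3 (CLASSID INTEGER, COMPANY TEXT);
    INSERT INTO T1 VALUES (1, 'OPEN', 8, 10), (2, 'OPEN', 9, 10), (3, 'CLOSED', 9, 10);
    INSERT INTO T2 VALUES (1, 'old note', '2020-01-01'), (1, 'new note', '2020-02-01');
    INSERT INTO T3 VALUES (10, 'Acme');
""")

# MAX condition in WHERE: rows without a T2 match have a NULL CREATEDATE,
# fail the comparison, and are dropped, so the LEFT JOIN acts like an INNER JOIN.
in_where = conn.execute("""
    SELECT T1.ID, T3.COMPANY, T2.DESCRIPTION
    FROM T1
    INNER JOIN T3 ON T1.CLASSID = T3.CLASSID
    LEFT OUTER JOIN T2 ON T1.ID = T2."KEY"
    WHERE T1.STATUS != 'CLOSED' AND T1.PRIORITY > 5
      AND T2.CREATEDATE = (SELECT MAX(T2Alias.CREATEDATE)
                           FROM T2 AS T2Alias
                           WHERE T2Alias."KEY" = T1.ID)
    ORDER BY T1.ID
""").fetchall()

# MAX condition in ON: it only restricts which T2 row is joined,
# so T1 rows without any T2 row are still returned.
in_on = conn.execute("""
    SELECT T1.ID, T3.COMPANY, T2.DESCRIPTION
    FROM T1
    INNER JOIN T3 ON T1.CLASSID = T3.CLASSID
    LEFT OUTER JOIN T2
      ON T1.ID = T2."KEY"
     AND T2.CREATEDATE = (SELECT MAX(T2Alias.CREATEDATE)
                          FROM T2 AS T2Alias
                          WHERE T2Alias."KEY" = T1.ID)
    WHERE T1.STATUS != 'CLOSED' AND T1.PRIORITY > 5
    ORDER BY T1.ID
""").fetchall()

print(in_where)  # [(1, 'Acme', 'new note')]
print(in_on)     # [(1, 'Acme', 'new note'), (2, 'Acme', None)]
```

With the condition in the ON clause, ID 2 comes back with a NULL description instead of disappearing, which is usually what a LEFT OUTER JOIN is meant to provide.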