Order of joins in Hive

I have a query that involves a left join followed by a join. I want to make sure the left join is done first. The left join comes before the join in my query; is this enough? This is what the query looks like:
select * from
(select *....) A
left join
(select *...) B
on A.a = B.a
left join
C
on A.f = C.f

I cannot see the JOIN in your code, only two LEFT JOIN statements.
However, if you have something like this:
select * from
(select *....) A
left join
(select *...) B
on A.a = B.a
join
C
on A.f = C.f
and you want to make sure the LEFT JOIN is executed first, you can move this LEFT JOIN to a sub-query:
select *
from (
select * from
(select *....) A
left join
(select *...) B
on A.a = B.a
) D
join
C
on D.f = C.f
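To see what the sub-query rewrite does to the result, here is a minimal sketch that runs both forms against SQLite through Python's sqlite3 module. It is not Hive, and the tables, columns (b_val, c_val) and rows are made up for illustration. In this example the inner join condition only references columns of A, and both forms return the same rows; an engine's optimizer may still pick a different physical order.

```python
import sqlite3

# Hypothetical tables standing in for A, B and C from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE A (a INTEGER, f INTEGER);
    CREATE TABLE B (a INTEGER, b_val TEXT);
    CREATE TABLE C (f INTEGER, c_val TEXT);
    INSERT INTO A VALUES (1, 10), (2, 20), (3, 30);
    INSERT INTO B VALUES (1, 'b1');
    INSERT INTO C VALUES (10, 'c10'), (20, 'c20');
""")

# Flat form: LEFT JOIN followed by JOIN in one FROM clause.
flat_sql = """
    SELECT A.a, A.f, B.b_val, C.c_val
    FROM A
    LEFT JOIN B ON A.a = B.a
    JOIN C ON A.f = C.f
    ORDER BY A.a
"""

# Nested form: the LEFT JOIN is wrapped in a sub-query D, then joined to C.
nested_sql = """
    SELECT D.a, D.f, D.b_val, C.c_val
    FROM (
        SELECT A.a, A.f, B.b_val
        FROM A
        LEFT JOIN B ON A.a = B.a
    ) D
    JOIN C ON D.f = C.f
    ORDER BY D.a
"""

flat_rows = conn.execute(flat_sql).fetchall()
nested_rows = conn.execute(nested_sql).fetchall()
print(flat_rows)
print(nested_rows)
```

The row with f = 30 disappears in both forms because it has no match in C, while the row with no match in B is kept with a NULL b_val.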

Related

SQL Left Join Efficiency in Presto/Spark SQL

I would like to ask which is the better option, and why, if I intend to LEFT JOIN a few tables. An example is provided below.
Option 1:
SELECT * FROM TABLEA a
LEFT JOIN TABLEB b
ON a.id=b.id
LEFT JOIN TABLEC c
ON a.id=c.id
LEFT JOIN TABLED d
ON a.id=d.id
Option 2 (CTE):
WITH tablea_b as (
SELECT * FROM TABLEA a
LEFT JOIN TABLEB b
ON a.id=b.id)
, tablea_b_c as (
SELECT * FROM tablea_b a
LEFT JOIN TABLEC c
ON a.id=c.id)
, tablea_b_c_d as (
SELECT * FROM tablea_b_c a
LEFT JOIN TABLED d
ON a.id = d.id) SELECT * FROM tablea_b_c_d
Basically, the difference is that in option 2 I left join part by part, whereas in option 1 I do it in one go. Are there any differences in terms of efficiency?
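As a starting point, the two options can at least be checked for returning the same rows. The sketch below does that in SQLite via Python's sqlite3 module, not in Presto or Spark SQL, using made-up columns (va, vb, vc, vd) and explicit column lists instead of SELECT *. It says nothing about speed: whether the CTE form costs more depends on the engine's optimizer, so comparing the EXPLAIN output of both forms on the actual engine is the more reliable check.

```python
import sqlite3

# Hypothetical data; only the table names follow the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tablea (id INTEGER, va TEXT);
    CREATE TABLE tableb (id INTEGER, vb TEXT);
    CREATE TABLE tablec (id INTEGER, vc TEXT);
    CREATE TABLE tabled (id INTEGER, vd TEXT);
    INSERT INTO tablea VALUES (1, 'a1'), (2, 'a2'), (3, 'a3');
    INSERT INTO tableb VALUES (1, 'b1'), (2, 'b2');
    INSERT INTO tablec VALUES (2, 'c2');
    INSERT INTO tabled VALUES (3, 'd3');
""")

# Option 1: all LEFT JOINs in a single query.
option1_sql = """
    SELECT a.id, a.va, b.vb, c.vc, d.vd
    FROM tablea a
    LEFT JOIN tableb b ON a.id = b.id
    LEFT JOIN tablec c ON a.id = c.id
    LEFT JOIN tabled d ON a.id = d.id
    ORDER BY a.id
"""

# Option 2: one LEFT JOIN per CTE, each building on the previous one.
option2_sql = """
    WITH tablea_b AS (
        SELECT a.id, a.va, b.vb
        FROM tablea a
        LEFT JOIN tableb b ON a.id = b.id
    ),
    tablea_b_c AS (
        SELECT ab.id, ab.va, ab.vb, c.vc
        FROM tablea_b ab
        LEFT JOIN tablec c ON ab.id = c.id
    ),
    tablea_b_c_d AS (
        SELECT abc.id, abc.va, abc.vb, abc.vc, d.vd
        FROM tablea_b_c abc
        LEFT JOIN tabled d ON abc.id = d.id
    )
    SELECT * FROM tablea_b_c_d ORDER BY id
"""

option1_rows = conn.execute(option1_sql).fetchall()
option2_rows = conn.execute(option2_sql).fetchall()
print(option1_rows == option2_rows)
```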

SQL: how to self-reference the output of an INNER JOIN?

I would like to be able to run a self-inner join on the output of a query.
Performing a self INNER JOIN, in the simplest case, is easy:
SELECT *
FROM A a1 INNER JOIN
A a2 ON
a1.key = a2.key
The problem is that I need to do this self-inner join on the output of another inner join. Something like
SELECT *
FROM DATA.A A INNER JOIN
DATA.B B
ON A.key = B.key output /* output is the dataset I am interested in */
INNER JOIN
(FROM DATA.A A INNER JOIN
DATA.B B
ON A.key = B.key output2) /* same code to get output, so that I can self reference */
ON
OUTPUT.key_alt = OUTPUT2.key_alt
Is it possible to do so? I cannot store output in my database.
In SQL Server:
I prefer to use a common table expression for this sort of thing. It keeps things more readable, in my opinion.
with cte as (
select *
from data.A as A
inner join data.B as B
on A.key = B.key
)
select ...
from cte as o
inner join cte as i
on o.key = i.key
You can achieve this with standard subqueries, though.
select o.*
from (
select *
from data.A as A
inner join data.B as B
on A.key = B.key
) as o
inner join (
select *
from data.A as A
inner join data.B as B
on A.key = B.key
) as i
on o.key = i.key
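The CTE approach can be tried end to end with a small sketch. It runs in SQLite via Python's sqlite3 module rather than SQL Server, and the tables, the column names (k, label) and the CTE name joined are hypothetical. The self join here is on key_alt, as in the question, and the CTE lists its columns explicitly so that no column name is duplicated.

```python
import sqlite3

# Hypothetical stand-ins for DATA.A and DATA.B.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (k INTEGER, key_alt TEXT);
    CREATE TABLE b (k INTEGER, label TEXT);
    INSERT INTO a VALUES (1, 'x'), (2, 'x'), (3, 'y'), (4, 'z');
    INSERT INTO b VALUES (1, 'one'), (2, 'two'), (3, 'three');
""")

# The CTE holds the output of the first inner join; it is then
# referenced twice (as o and i) to self join on key_alt.
sql = """
    WITH joined AS (
        SELECT a.k, a.key_alt, b.label
        FROM a
        INNER JOIN b ON a.k = b.k
    )
    SELECT o.k, i.k, o.key_alt
    FROM joined AS o
    INNER JOIN joined AS i ON o.key_alt = i.key_alt
    ORDER BY o.k, i.k
"""

pairs = conn.execute(sql).fetchall()
for row in pairs:
    print(row)
```

Nothing is stored in the database: the CTE exists only for the duration of the query.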

ACCESS SQL query: adding a new field?

I would like to add a field to an existing query that is not affected by the WHERE clause.
For example,
This is the original code....
SELECT SHELL_Payables.PoolNum,
A.[Code], B.[Program] AS Program, A.PayableAmt, C.ReceivableAmt INTO [New Data]
FROM A INNER JOIN B ON A.ID=B.ID
INNER JOIN C ON A.Num=B.Num
WHERE (((A.AccountingPeriod)<=[AccountingYearMonth]));
I would like to add A.PayableAmt again, but this time the WHERE clause (AccountingPeriod <= AccountingYearMonth) should not be applied to this field.
Any ideas? It would be much appreciated.
To use union and select into, you would need to write your query something like this:
SELECT *
INTO [New Data]
FROM (
SELECT PoolNum
,A.[Code]
,B.[Program] AS Program
,A.PayableAmt
,C.ReceivableAmt
FROM A
INNER JOIN B ON A.ID = B.ID
INNER JOIN C ON A.Num = B.Num
WHERE A.AccountingPeriod <= AccountingYearMonth
UNION
SELECT PoolNum
,A.[Code]
,B.[Program] AS Program
,A.PayableAmt
,C.ReceivableAmt
FROM A
INNER JOIN B ON A.ID = B.ID
INNER JOIN C ON A.Num = B.Num
)
UPDATE
If you want to add another PayableAmt column to the same row, you could join table A back to the filtered query, something like this:
SELECT t.PoolNum
,a.[Code]
,B.[Program] AS Program
,t.PayableAmt
,a.PayableAmt AS NewPayableAmt
,C.ReceivableAmt
INTO [New Data]
FROM A
LEFT JOIN
(
SELECT
PoolNum
,A.[Code]
,B.[Program] AS Program
,A.PayableAmt
,C.ReceivableAmt
FROM A
INNER JOIN B ON A.ID = B.ID
INNER JOIN C ON A.Num = B.Num
WHERE A.AccountingPeriod <= AccountingYearMonth
) t
ON t.Code = A.Code --assuming this is unique
INNER JOIN B ON A.ID = B.ID
INNER JOIN C ON A.Num = B.Num
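The join-back idea can be illustrated with a reduced sketch. It uses SQLite through Python's sqlite3 module, not Access, keeps only table A, and assumes made-up rows and a hypothetical cutoff value in place of AccountingYearMonth. The filtered sub-query supplies the restricted PayableAmt, while the outer table supplies the unrestricted one.

```python
import sqlite3

# Reduced, hypothetical version of table A from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE A (Code TEXT, PayableAmt REAL, AccountingPeriod INTEGER);
    INSERT INTO A VALUES ('P1', 100.0, 202001),
                         ('P2', 250.0, 202002),
                         ('P3', 75.0, 202006);
""")

cutoff = 202003  # assumed value for AccountingYearMonth

# t applies the period filter; the outer A does not, so NewPayableAmt
# is present for every row and PayableAmt is NULL when filtered out.
sql = """
    SELECT A.Code,
           t.PayableAmt AS PayableAmt,
           A.PayableAmt AS NewPayableAmt
    FROM A
    LEFT JOIN (
        SELECT Code, PayableAmt
        FROM A
        WHERE AccountingPeriod <= ?
    ) t ON t.Code = A.Code
    ORDER BY A.Code
"""

rows = conn.execute(sql, (cutoff,)).fetchall()
for row in rows:
    print(row)
```

As in the answer, this relies on Code being unique in A; otherwise the join back would multiply rows.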

Syntax for multiple joins in sql

Working on Oracle: I am attempting to do an inner self join, with a where clause, then take that result and do a left outer join on it:
(select * from table1 A
inner join
select * from table1 B
on A.id = B.id
where
A.id is not null and B.id is not null) C
left outer join
select * from table2 D
on C.id = D.id
Somehow I am syntactically challenged and can't make this work. Can't seem to find the right syntax anywhere.
Just put the where clause at the end. The database will get it right:
select *
from table1 A
inner join table1 B on A.id = B.id
left join table2 D on D.id = A.id
where A.id is not null
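In this case, we can take advantage of transitivity: the inner join condition A.id = B.id only matches rows where both ids are equal and non-null, so filtering on A.id alone is enough, and table2 can be joined directly on A.id.
Your second join needs to be applied to a query, so add a select * from at the beginning: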
select * from (select A.* from table1 A
inner join
table1 B
on A.id = B.id
where
A.id is not null and B.id is not null) C
left outer join
table2 D
on C.id = D.id
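The flat form and the nested form from the two answers can be compared with a short sketch. It runs in SQLite via Python's sqlite3 module, not Oracle, and the columns name and info as well as the rows are assumptions made for illustration. One row of table1 has a NULL id to show that the inner join already drops it.

```python
import sqlite3

# Hypothetical contents for table1 and table2.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id INTEGER, name TEXT);
    CREATE TABLE table2 (id INTEGER, info TEXT);
    INSERT INTO table1 VALUES (1, 'one'), (2, 'two'), (NULL, 'no id');
    INSERT INTO table2 VALUES (1, 'info1');
""")

# Flat form: both joins in one FROM clause, filter at the end.
flat_sql = """
    SELECT A.id, A.name, D.info
    FROM table1 A
    INNER JOIN table1 B ON A.id = B.id
    LEFT JOIN table2 D ON D.id = A.id
    WHERE A.id IS NOT NULL
    ORDER BY A.id
"""

# Nested form: the self join and filter live in a derived table C.
nested_sql = """
    SELECT C.id, C.name, D.info
    FROM (
        SELECT A.id, A.name
        FROM table1 A
        INNER JOIN table1 B ON A.id = B.id
        WHERE A.id IS NOT NULL AND B.id IS NOT NULL
    ) C
    LEFT OUTER JOIN table2 D ON C.id = D.id
    ORDER BY C.id
"""

flat_rows = conn.execute(flat_sql).fetchall()
nested_rows = conn.execute(nested_sql).fetchall()
print(flat_rows)
print(nested_rows)
```

The NULL-id row never appears, because NULL = NULL does not evaluate to true in the inner join condition.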

SQL Server Multiple LEFT JOIN, one-to-many

I am looking for a way to perform multiple joins from one source table to more than one table. Similar to the following:
SELECT a.NAME, b.address, c.phone
FROM tblname a
LEFT JOIN tbladdress b ON a.nid = b.nid
I also want to perform a left join on the Telephone table tblPhone at the same time:
tblname a left join tblPhone c on a.PID = c.PID
Try as I might I can't see how to put this into one query.
You can simply repeat your JOIN clauses as many times as is needed, e.g.:
SELECT a.NAME
,b.address
,c.phone
FROM tblname a
LEFT JOIN tbladdress b ON a.nid = b.nid
LEFT JOIN tblPhone c ON a.PID = c.PID
SELECT a.name, b.address, c.phone
FROM tblname a
left join tbladdress b on a.nid = b.nid
left join tblPhone c on a.PID = c.PID;
SELECT a.name, b.address, d.phone
FROM (tblname a
left join tbladdress b on a.nid = b.nid)
left join tblPhone d on a.PID = d.PID
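One thing worth checking with one-to-many data is how many rows the repeated LEFT JOIN returns. The sketch below uses SQLite via Python's sqlite3 module, not SQL Server, with made-up people, addresses and phone numbers. When a person has several addresses and several phones, every address is paired with every phone.

```python
import sqlite3

# Hypothetical rows; table and column names follow the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tblname (nid INTEGER, PID INTEGER, name TEXT);
    CREATE TABLE tbladdress (nid INTEGER, address TEXT);
    CREATE TABLE tblPhone (PID INTEGER, phone TEXT);
    INSERT INTO tblname VALUES (1, 100, 'Ann'), (2, 200, 'Bob');
    INSERT INTO tbladdress VALUES (1, '1 Main St'), (1, '2 Oak Ave');
    INSERT INTO tblPhone VALUES (100, '555-0001'), (100, '555-0002');
""")

# Two independent LEFT JOINs from the same source table.
sql = """
    SELECT a.name, b.address, c.phone
    FROM tblname a
    LEFT JOIN tbladdress b ON a.nid = b.nid
    LEFT JOIN tblPhone c ON a.PID = c.PID
    ORDER BY a.name, b.address, c.phone
"""

rows = conn.execute(sql).fetchall()
for row in rows:
    print(row)
```

Ann has two addresses and two phones, so she appears in four rows, while Bob has neither and appears once with NULLs. If one row per person is needed, the child tables would have to be aggregated or filtered before joining.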