How to do spatial left joins with a multiple join query - sql

I have some data I'm trying to query that's a mix of spatial and non-spatial data. Some of these tables contain an MDSYS.SDO_GEOMETRY column, some don't.
My query so far is shown below:
select table1.col1, table1.col2, table2.col5, table2.col3, table3.col1, table3.col2, table4.col9, table5.col1 -- this is the data from spatial join
from table1
left join table2 on
table2.col1 = table1.col1
left join table3 on
table3.col5 = table1.col6
left join table4 on
table4.col2 = table1.col1
left join table5 on
-- spatial geometry where spatial geometry in column 3 of table1 is within spatial geometry of table5 column 2
group by table1.col1, table1.col2, table2.col5, table2.col3, table3.col1, table3.col2, table4.col9
My question is, how do I do this? Everything I've found on spatial left joins doesn't involve joining other non-spatial tables.
This is for Oracle, using SQL Developer.

Related

Join evaluation order in HIVE

I was trying to run a query that would make use of multiple joins inside HIVE.
example:
SELECT *
FROM table1
LEFT JOIN table2 -- the table resulted from the inner join should be left joined to table1
INNER JOIN table3 -- this inner join should happen first between table2 and table3
ON table3.id = table2.id
ON table2.id = table1.id
I think this is perfectly valid in other SQL DBMSs, but HIVE gives me an error. Is this kind of join (I really don't know what to call it, so I can't google it) illegal in HIVE?
Workarounds would be some subquery unions, but I am more interested in getting more information on this kind of syntax.
Thanks!
This is valid SQL syntax and should be parsed as:
FROM table1 LEFT JOIN
(table2 INNER JOIN
table3
ON table3.id = table2.id
)
ON table2.id = table1.id
By convention, ON clauses are interleaved with JOINs, so the conditions are where the JOIN is specified. However, the syntax allows for this construct as well.
I don't use such syntax -- and I strongly discourage using it without parentheses -- but I thought pretty much all databases supported it.
If parentheses don't work, you have two options. One is a subquery:
FROM table1 LEFT JOIN
(SELECT table2.id, . . . -- other columns you want
FROM table2 INNER JOIN
table3
ON table3.id = table2.id
) t23
ON t23.id = table1.id
Or using a RIGHT JOIN:
FROM table2 INNER JOIN
table3
ON table3.id = table2.id RIGHT JOIN
table1
ON table2.id = table1.id
In this case, the RIGHT JOIN should be equivalent. But it can be complicated getting exactly the same semantics when multiple joins are involved (and without using parentheses).
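To see why the grouping matters, here is a minimal sketch that runs the subquery form next to the plain left-to-right form. It assumes SQLite through Python's `sqlite3` module rather than Hive, and the three tiny tables and their rows are made up purely for illustration.

```python
import sqlite3

# Illustrative data only: table1 has ids 1-3, table2 has 1-2, table3 has 1.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id INTEGER);
    CREATE TABLE table2 (id INTEGER);
    CREATE TABLE table3 (id INTEGER);
    INSERT INTO table1 VALUES (1), (2), (3);
    INSERT INTO table2 VALUES (1), (2);
    INSERT INTO table3 VALUES (1);
""")

# Nested form: table2 and table3 are inner-joined first, and that result
# is left-joined to table1, so every table1 row is kept.
nested = conn.execute("""
    SELECT table1.id, t23.id
    FROM table1
    LEFT JOIN (SELECT table2.id
               FROM table2
               INNER JOIN table3 ON table3.id = table2.id) t23
      ON t23.id = table1.id
    ORDER BY table1.id
""").fetchall()

# Sequential form: the inner join is applied to the output of the left join,
# so table1 rows without a table3 match are filtered out.
sequential = conn.execute("""
    SELECT table1.id, table2.id
    FROM table1
    LEFT JOIN table2 ON table2.id = table1.id
    INNER JOIN table3 ON table3.id = table2.id
    ORDER BY table1.id
""").fetchall()

print(nested)      # all three table1 rows, unmatched ones padded with None
print(sequential)  # only the row that matches in all three tables
```

The nested form keeps all rows of table1, while the sequential form silently behaves like an inner join, which is the difference the parentheses (or the subquery) are there to express.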

When to use right join or full outer join

I work with databases and SQL almost daily, and the more I work with SQL, the more I'm of the opinion that there is no reason to use a right join or a full outer join.
Let's assume we have two tables: table1 and table2. If I want additional information for the rows in table1, I can use an inner join on table2, and if I want to keep the original rows when there is no match, I use a left join.
If I have to add additional information to table 2, I can do the same and left join table 2 to table 1. So I do not see a reason why I should ever use a right join. Is there any use case where you cannot use a left join in place of a right join?
I also wondered if I would ever need a full outer join. Why would you join two tables and keep the rows that do not match of BOTH tables? You could also achieve this by using two left joins.
Why would you join two tables and keep the rows that do not match of BOTH tables?
The full join has cases where it is useful. One of them is comparing two tables for differences, like an XOR between tables:
SELECT *
FROM t1
FULL JOIN t2
ON t1.id = t2.id
WHERE t1.id IS NULL
OR t2.id IS NULL;
Example:
t1.id ... t2.id
1 NULL
NULL 2
you could also achieve this by using two left joins.
Yes you could:
SELECT t1.*, t2.*
FROM t1
LEFT JOIN t2
ON t1.id = t2.id
WHERE t2.id IS NULL
UNION ALL
SELECT t1.*, t2.*
FROM t2
LEFT JOIN t1
ON t1.id = t2.id
WHERE t1.id IS NULL;
Some SQL dialects do not support FULL OUTER JOIN, and we emulate it that way.
Related: How to do a FULL OUTER JOIN in MySQL?
On the other hand RIGHT JOIN is useful when you have to join more than 2 tables:
SELECT *
FROM t1
JOIN t2
...
RIGHT JOIN t3
...
Of course, you could argue that you could rewrite it to the corresponding form either by changing the join order or by using subqueries (inline views). From a developer's perspective, it is always good to have tools (even if you don't have to use them).
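As a quick check of the two-left-joins emulation, here is a small sketch. It assumes SQLite through Python's `sqlite3` module, and the ids in `t1` and `t2` are invented for the example. It avoids FULL JOIN itself, since older SQLite builds do not support it.

```python
import sqlite3

# Illustrative data only: id 3 exists in both tables, 1 and 2 in just one.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t1 (id INTEGER);
    CREATE TABLE t2 (id INTEGER);
    INSERT INTO t1 VALUES (1), (3);
    INSERT INTO t2 VALUES (2), (3);
""")

# Rows present in only one of the two tables ("XOR"), emulated with
# two LEFT JOINs glued together by UNION ALL.
xor_rows = conn.execute("""
    SELECT t1.id, t2.id
    FROM t1
    LEFT JOIN t2 ON t1.id = t2.id
    WHERE t2.id IS NULL
    UNION ALL
    SELECT t1.id, t2.id
    FROM t2
    LEFT JOIN t1 ON t1.id = t2.id
    WHERE t1.id IS NULL
""").fetchall()

print(xor_rows)  # one row for id 1 (only in t1) and one for id 2 (only in t2)
```

The shared id 3 does not appear, because each half keeps only the rows that found no partner on the other side.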

SQL join between 2 tables with OR condition

I am just trying to understand the concept behind joining of 2 tables with an OR condition.
My requirement is: I need to join 2 tables Table1 [colA, colB] and Table2 [colX, colY] on columns Table1.colA = Table2.colX, but if colA is NULL the condition should be Table1.colB = Table2.colY.
Do I need to join them separately and then do a union? Or is there a way I can do it in one join? Note that I have millions of records in both tables, it's a left join, and the tables reside in HIVE. I don't have a reproducible example, just trying to understand the concept.
While I'm not familiar with HiveQL, in SQL Server this could be accomplished as follows:
SELECT *
FROM table1 t1
JOIN table2 t2
ON COALESCE(t1.cola, t1.colb) = CASE
WHEN t1.cola IS NULL THEN t2.coly
ELSE t2.colx
END
The logic should be fairly readable.
Translate your conditions directly:
SELECT *
FROM table1 t1 JOIN
table2 t2
ON (t1.cola = t2.colx) or
(t1.cola is null and t1.colb = t2.coly)
Usually, OR is a performance killer in joins. This would often be expressed using two separate left joins:
SELECT . . . , COALESCE(t2a.col, t2b.col) as col
FROM table1 t1 LEFT JOIN
table2 t2a
ON (t1.cola = t2a.colx) LEFT JOIN
table2 t2b
ON t1.cola is null and t1.colb = t2b.coly;
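Here is a rough sketch comparing the single OR join with the two-left-joins version. It assumes SQLite through Python's `sqlite3` module rather than Hive, uses LEFT JOIN in both queries because the question asks for one, and adds a hypothetical `val` column to table2 so there is something to fetch. The rows are invented so that each table1 row has at most one match.

```python
import sqlite3

# Illustrative data only; `val` is a made-up payload column.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (cola INTEGER, colb INTEGER);
    CREATE TABLE table2 (colx INTEGER, coly INTEGER, val TEXT);
    INSERT INTO table1 VALUES (1, 10), (NULL, 20), (3, 30);
    INSERT INTO table2 VALUES (1, 100, 'x-match'), (50, 20, 'y-match');
""")

# One join with an OR condition: match on cola, or on colb when cola is NULL.
or_join = conn.execute("""
    SELECT t1.cola, t1.colb, t2.val
    FROM table1 t1
    LEFT JOIN table2 t2
      ON t1.cola = t2.colx
      OR (t1.cola IS NULL AND t1.colb = t2.coly)
    ORDER BY t1.colb
""").fetchall()

# Two left joins against the same table, merged with COALESCE.
two_joins = conn.execute("""
    SELECT t1.cola, t1.colb, COALESCE(t2a.val, t2b.val) AS val
    FROM table1 t1
    LEFT JOIN table2 t2a
      ON t1.cola = t2a.colx
    LEFT JOIN table2 t2b
      ON t1.cola IS NULL AND t1.colb = t2b.coly
    ORDER BY t1.colb
""").fetchall()

print(or_join)
print(two_joins)
```

With this data both queries return the same rows. If a table1 row could match several table2 rows, the two forms can produce different row counts, so it is worth checking against your own data.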

Does INNER JOIN performance depend on order of tables?

A question suddenly came to my mind while I was tuning one stored procedure. Let me ask it -
I have two tables, table1 and table2. table1 contains a huge amount of data and table2 contains much less. Is there any performance difference between these two queries (I am only changing the order of the tables)?
Query1:
SELECT t1.col1, t2.col2
FROM table1 t1
INNER JOIN table2 t2
ON t1.col1=t2.col2
Query2:
SELECT t1.col1, t2.col2
FROM table2 t2
INNER JOIN table1 t1
ON t1.col1=t2.col2
We are using Microsoft SQL server 2005.
Aliases and the order of the tables in the join (assuming it's an INNER JOIN) don't affect the final outcome, and thus don't affect performance, since the query optimizer reorders the tables (if needed) when the query is executed.
You can read some more basic concepts about relational algebra here:
http://en.wikipedia.org/wiki/Relational_algebra#Joins_and_join-like_operators
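A small sketch of the "same outcome" part, assuming SQLite through Python's `sqlite3` module rather than SQL Server 2005, with invented row counts. It only checks that both table orders return identical rows and says nothing about the plan a particular optimizer picks.

```python
import sqlite3

# Illustrative data only: table1 is the "huge" table, table2 the small one.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (col1 INTEGER);
    CREATE TABLE table2 (col2 INTEGER);
""")
conn.executemany("INSERT INTO table1 VALUES (?)", [(i,) for i in range(1, 1001)])
conn.executemany("INSERT INTO table2 VALUES (?)", [(5,), (500,), (2000,)])

# Query1 from the question: big table first.
query1 = """
    SELECT t1.col1, t2.col2
    FROM table1 t1
    INNER JOIN table2 t2
      ON t1.col1 = t2.col2
    ORDER BY t1.col1
"""

# Query2 from the question: small table first.
query2 = """
    SELECT t1.col1, t2.col2
    FROM table2 t2
    INNER JOIN table1 t1
      ON t1.col1 = t2.col2
    ORDER BY t1.col1
"""

rows1 = conn.execute(query1).fetchall()
rows2 = conn.execute(query2).fetchall()
print(rows1 == rows2)
```

To compare the actual plans on SQL Server, you would look at the execution plan for each query instead of the result set.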

Join Unlike Tables

I have 2 unlike tables and a large set of subqueries that have a key for each of those two tables. I need to join the two tables to each subquery.
Table 1
Table1ID
Table 2
Table2ID
Subqueries
Table1ID
Table2ID
Is there any way to join everything together?
I have tried something similar to
SELECT Table1.Table1ID, Table2.Table2ID
FROM Table1, Table2
LEFT JOIN (SELECT Table1ID, Table2ID FROM ....) q1 ON Table1.Table1ID = q1.Table1ID AND Table2.Table2ID = q1.Table2ID
...
The following query will select all fields from a join of all three tables on the respective table IDs:
SELECT *
FROM Table1 t1
INNER JOIN Subqueries s
ON t1.Table1ID = s.Table1ID
INNER JOIN Table2 t2
ON s.Table2ID = t2.Table2ID
If you need absolutely all records from both Table1 and Table2, whether or not they are joined via the Subqueries table, then you can change the joins to FULL OUTER:
SELECT *
FROM Table1 t1
FULL OUTER JOIN Subqueries s
ON t1.Table1ID = s.Table1ID
FULL OUTER JOIN Table2 t2
ON s.Table2ID = t2.Table2ID