Are the following select statements SQL92 compliant?
SELECT table1.id, table2.id,*
FROM table1, table2
WHERE table1.id = table2.id
SELECT table1.Num, table2.id,*
FROM table1, table2
WHERE table1.Num = table2.id
Following on from StingyJack...
SELECT
table1.id,
table2.id,
*
FROM
table1
INNER JOIN
table2 ON table1.id = table2.id
WHERE
table1.column = 'bob'
SELECT table1.id, table2.id,* FROM table1, table2 WHERE table1.id = table2.id and table1.column = 'bob'
Where's the JOIN? Where's the filter?
JOIN also forces some discipline and basic checking: easier to avoid cross join or partial cross joins
Yes, the queries you show use SQL92 compliant syntax. My copy of "Understanding the New SQL: A Complete Guide" by Jim Melton & Alan R. Simon confirms it.
SQL92 still supports joins using the comma syntax, for backward compatibility with SQL89.
As far as I know, all SQL implementations support both comma syntax and JOIN syntax joins.
In most cases, the SQL implementation knows how to optimize them so that they are identical in semantics (that is, they produce the same result) and performance.
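The equivalence claimed above can be checked on a small scale. The following is a minimal sketch using Python's built-in sqlite3 module; the sample rows and the name and city columns are invented for illustration, and SQLite stands in for whatever DBMS you actually use. It only shows that the two forms return the same rows, not how any optimizer treats them.

```python
import sqlite3

# Hypothetical sample data; only the table names and the id column
# come from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id INTEGER, name TEXT);
    CREATE TABLE table2 (id INTEGER, city TEXT);
    INSERT INTO table1 VALUES (1, 'bob'), (2, 'alice'), (3, 'carol');
    INSERT INTO table2 VALUES (1, 'Oslo'), (3, 'Lima'), (4, 'Rome');
""")

# SQL-89 style: comma-separated tables, join condition in WHERE.
comma_join = conn.execute("""
    SELECT table1.id, table1.name, table2.city
    FROM table1, table2
    WHERE table1.id = table2.id
    ORDER BY table1.id
""").fetchall()

# SQL-92 style: explicit INNER JOIN with the condition in ON.
explicit_join = conn.execute("""
    SELECT table1.id, table1.name, table2.city
    FROM table1
    INNER JOIN table2 ON table1.id = table2.id
    ORDER BY table1.id
""").fetchall()

print(comma_join)
print(comma_join == explicit_join)
```

Only ids 1 and 3 exist in both tables, so both queries return those two rows.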
I might be wrong, but my understanding is that the SQL92 convention is to join tables using the JOIN keyword (e.g. FROM table1 INNER JOIN table2).
Unfortunately, I believe they are, but that join syntax is more difficult to read and maintain.
I know that with MSSQL, there is no performance difference between these two join methods, but which one is easier to understand?
SELECT table1.id, table2.id,*
FROM table1, table2
WHERE table1.id = table2.id
SELECT
table1.id,
table2.id,
*
FROM table1
INNER JOIN table2
ON table1.id = table2.id
I was trying to run a query that would make use of multiple joins inside HIVE.
example:
SELECT *
FROM table1
LEFT JOIN table2 -- the table resulted from the inner join should be left joined to table1
INNER JOIN table3 -- this inner join should happen first between table2 and table3
ON table3.id = table2.id
ON table2.id = table1.id
I think this is perfectly valid in other SQL DBMSs, but HIVE gives me an error. Are joins of this kind (I really don't know what to call them, so I can't google them) illegal in HIVE?
Workarounds would be some subquery unions, but I am more interested in getting more information on this kind of syntax.
Thanks!
This is valid SQL syntax and should be parsed as:
FROM table1 LEFT JOIN
(table2 INNER JOIN
table3
ON table3.id = table2.id
)
ON table2.id = table1.id
By convention, ON clauses are interleaved with JOINs, so the conditions are where the JOIN is specified. However, the syntax allows for this construct as well.
I don't use such syntax -- and I strongly discourage using it without parentheses -- but I thought pretty much all databases supported it.
If parentheses don't work, you have two options. One is a subquery:
FROM table1 LEFT JOIN
(SELECT table2.id, . . . -- other columns you want
FROM table2 INNER JOIN
table3
ON table3.id = table2.id
) t23
ON t23.id = table1.id
Or using a RIGHT JOIN:
FROM table2 INNER JOIN
table3
ON table3.id = table2.id RIGHT JOIN
table1
ON table2.id = table1.id
In this case, the RIGHT JOIN should be equivalent. But it can be complicated getting exactly the same semantics when multiple joins are involved (and without using parentheses).
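To see why the join order matters here, the sketch below compares the subquery form with a naive flat rewrite that applies the LEFT JOIN first. It uses Python's sqlite3 module rather than HIVE, and the sample ids are made up, so treat it as an illustration of the semantics only.

```python
import sqlite3

# Hypothetical data: table1 has ids 1-3, table2 has 1-2, table3 has only 1.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id INTEGER);
    CREATE TABLE table2 (id INTEGER);
    CREATE TABLE table3 (id INTEGER);
    INSERT INTO table1 VALUES (1), (2), (3);
    INSERT INTO table2 VALUES (1), (2);
    INSERT INTO table3 VALUES (1);
""")

# Intended meaning: inner join table2 and table3 first, then left join
# that result to table1 (the subquery workaround from the answer).
nested = conn.execute("""
    SELECT table1.id, t23.id
    FROM table1
    LEFT JOIN (SELECT table2.id AS id
               FROM table2
               INNER JOIN table3 ON table3.id = table2.id) t23
           ON t23.id = table1.id
    ORDER BY table1.id
""").fetchall()

# Naive flat rewrite: the LEFT JOIN runs first, and the later INNER JOIN
# then discards every table1 row that has no match in table3.
flat = conn.execute("""
    SELECT table1.id, table2.id
    FROM table1
    LEFT JOIN table2 ON table2.id = table1.id
    INNER JOIN table3 ON table3.id = table2.id
    ORDER BY table1.id
""").fetchall()

print(nested)
print(flat)
```

The nested form keeps all three rows of table1, while the flat rewrite keeps only the row that matches in all three tables.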
I noticed today that this query
select * from table1 table2 where column_from_table1 = ?;
works. It works the same as (same columns return)
select * from table1 where column_from_table1 = ?;
Shouldn't the former be a syntax error? What is it interpreting table2 as?
Appears it's interpreting it as renaming the table, even though table2 exists it happily allows the rename, this also works:
select * from table1 asdf where asdf.column_from_table1 = ?;
select * from table1 table2 where column_from_table1 = ?;
table2 is working as a table alias for table1. It's not being used as the name of an object in the database at all. The fact that a table named table2 exists is wholly irrelevant to this query. Usually you'd see something like this:
select t.id, t.name from table1 t where t.column_from_table1 = ?;
Some RDBMSs require the as keyword, so you'll also see this:
SELECT t.id, t.name FROM table1 AS t WHERE t.column_from_table1 = ?;
Table aliases are useful for making queries with multiple tables easier to write, especially if they have shared column names which need to be qualified. They're also essential for self-joins where a table is joined to itself.
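A quick way to convince yourself that the second name is only an alias is to run the query against a database where a real table2 also exists. This is a small sketch with Python's sqlite3 module; the column named unrelated and the sample values are assumptions made up for the demonstration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (column_from_table1 TEXT);
    CREATE TABLE table2 (unrelated TEXT);  -- exists, but is never read below
    INSERT INTO table1 VALUES ('x'), ('y');
    INSERT INTO table2 VALUES ('never read');
""")

# "table2" here is just an alias for table1, so table2.column_from_table1
# resolves to table1's column even though the real table2 has no such column.
aliased = conn.execute(
    "SELECT * FROM table1 table2 WHERE table2.column_from_table1 = ?",
    ("x",),
).fetchall()

plain = conn.execute(
    "SELECT * FROM table1 WHERE column_from_table1 = ?",
    ("x",),
).fetchall()

print(aliased, plain)
```

Both queries return the same single row from table1.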
Example of a join using aliases:
SELECT t1.Id,
t1.Name as t1_Name,
t2.Name as t2_Name
FROM table1 t1
JOIN table2 t2
ON t1.id = t2.id
WHERE t1.column_from_table1 = ?;
Or, for a self-join to look for duplicate Name values, for example:
SELECT t1.Name,
t1.Id,
t2.Id as Dupe_Id
FROM table1 t1
JOIN table1 t2
ON t1.Name = t2.Name
WHERE t1.Id < t2.Id;
Notice that this query is referring to table1 twice and uses the aliases of t1 and t2 to differentiate which it's referring to.
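Here is the self-join run against a few made-up rows, as a sketch using Python's sqlite3 module. The second query drops the t1.Id < t2.Id filter to show what that condition is for: without it, every row also pairs with itself and each duplicate pair appears twice.

```python
import sqlite3

# Hypothetical rows: 'bob' appears three times, 'alice' once.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (Id INTEGER, Name TEXT);
    INSERT INTO table1 VALUES (1, 'bob'), (2, 'alice'), (3, 'bob'), (4, 'bob');
""")

# The self-join from the answer: t1 and t2 are two views of the same table.
dupes = conn.execute("""
    SELECT t1.Name, t1.Id, t2.Id AS Dupe_Id
    FROM table1 t1
    JOIN table1 t2 ON t1.Name = t2.Name
    WHERE t1.Id < t2.Id
    ORDER BY t1.Id, t2.Id
""").fetchall()

# Same join without the Id filter: includes self-pairs and mirrored pairs.
unfiltered_count = conn.execute("""
    SELECT COUNT(*)
    FROM table1 t1
    JOIN table1 t2 ON t1.Name = t2.Name
""").fetchone()[0]

print(dupes)
print(unfiltered_count)
```

With this data, the filtered query reports each pair of duplicate 'bob' rows exactly once.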
Note that a comma join, such as FROM table1, table2 WHERE table1.id = table2.id, is very old syntax that should be explicitly avoided when writing queries. The older syntax is difficult to read and maintain and doesn't support outer joins except by vendor-specific extensions. The newer syntax with the JOIN keyword was introduced in standard SQL in 1992. There's no reason to still be using comma joins.
SELECT .... FROM TABLE1 T1, TABLE2 T2, TABLE3 T3
WHERE T1.NAME = 'ABC' AND T1.ID = T2.COL_ID AND T2.COL1 = T3.COL2
vs
SELECT .... FROM TABLE1 T1
WHERE T1.NAME = 'ABC'
INNER JOIN TABLE2 T2 ON T1.ID = T2.COL_ID
INNER JOIN TABLE3 T3 ON T2.COL1 = T3.COL2
Two questions
In terms of performance, which will perform better and why?
If Option 2 has the better performance, when should we be using Option 1? (And vice versa if Option 1 has the better performance.)
The second query is not correct. It should be:
SELECT .... FROM TABLE1 T1
INNER JOIN TABLE2 T2 ON T1.ID = T2.COL_ID
INNER JOIN TABLE3 T3 ON T2.COL1 = T3.COL2
WHERE T1.NAME = 'ABC'
This is the right way to write your join condition. The first one is accepted, but technically creates a cartesian product that the WHERE clause then filters. All modern databases deal perfectly well with both queries and interpret them the same way; therefore, performance should be the same. But still, you should use the second one because it is more readable and gives you a single way to write a join, whether it is an inner, left, or full outer join.
The answer is easy: Don't use comma-separated joins (first query). We used these in the 1980s for the lack of something better, but then in 1992 the new syntax (second query) was introduced [1], because the old syntax was error-prone (it was easier to forget to apply join criteria) and harder to maintain (was missing join criteria intended or not in a query?) and there was no standard syntax for outer joins.
[1] Oracle was a little late though featuring the new syntax. They introduced the new ANSI joins in Oracle 9i in 2001.
In terms of performance: There should be no difference in speed, because DBMS optimizers see that this is essentially the same query.
Your second query is syntactically incorrect by the way. The query's WHERE clause belongs after the complete FROM clause, i.e. after all the joins:
SELECT ....
FROM table1 t1
INNER JOIN table2 t2 ON t1.id = t2.col_id
INNER JOIN table3 t3 ON t2.col1 = t3.col2
WHERE t1.name = 'ABC';
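The error-proneness mentioned above is easy to reproduce. The sketch below uses Python's sqlite3 module with invented rows; the table and column names follow the question. It counts the rows returned by the comma-join query, by the same query with one join condition accidentally left out, and by the explicit JOIN version.

```python
import sqlite3

# Hypothetical sample rows for the three tables in the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id INTEGER, name TEXT);
    CREATE TABLE table2 (col_id INTEGER, col1 INTEGER);
    CREATE TABLE table3 (col2 INTEGER);
    INSERT INTO table1 VALUES (1, 'ABC'), (2, 'ABC'), (3, 'XYZ');
    INSERT INTO table2 VALUES (1, 10), (2, 20), (3, 30);
    INSERT INTO table3 VALUES (10), (20), (30);
""")

def count(sql):
    """Return the single COUNT(*) value produced by the given query."""
    return conn.execute(sql).fetchone()[0]

# Comma join with both join conditions present.
complete = count("""
    SELECT COUNT(*) FROM table1 t1, table2 t2, table3 t3
    WHERE t1.name = 'ABC' AND t1.id = t2.col_id AND t2.col1 = t3.col2
""")

# Same query with the t2/t3 condition forgotten: still valid SQL, but every
# matching row is now paired with every row of table3.
forgotten = count("""
    SELECT COUNT(*) FROM table1 t1, table2 t2, table3 t3
    WHERE t1.name = 'ABC' AND t1.id = t2.col_id
""")

# Explicit JOIN syntax: each condition sits next to the join it belongs to.
explicit = count("""
    SELECT COUNT(*) FROM table1 t1
    INNER JOIN table2 t2 ON t1.id = t2.col_id
    INNER JOIN table3 t3 ON t2.col1 = t3.col2
    WHERE t1.name = 'ABC'
""")

print(complete, forgotten, explicit)
```

The query with the forgotten condition runs without complaint and silently returns three times as many rows, which is the kind of partial cross join that is hard to spot in a long WHERE clause.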
I appreciate this might be very simple for you guys but sometimes the logic behind JOIN can be difficult for beginners. I want to select "ID" from table1 but only those "ID"s which do NOT appear in table2."ID". I tested LEFT and RIGHT but cannot get it to work the way I need to. I am using dashDB.
You can use NOT IN and subquery
Select * from table1 where id NOT IN (select id from table2);
try this...
SELECT *
FROM table1
LEFT JOIN table2 ON table1.ID = table2.ID
WHERE table2.ID IS NULL
I always prefer NOT EXISTS to do this
Select * from table1 a
where NOT EXISTS (select 1 from table2 b where a.id = b.id);
Here is an excellent article by Aaron Bertrand that compares the performance of all the methods:
Should I use NOT IN, OUTER APPLY, LEFT OUTER JOIN, EXCEPT, or NOT EXISTS?
Use the below script.
SELECT t1.ID
FROM table1 t1
LEFT JOIN table2 t2 ON t1.ID = t2.ID
WHERE t2.ID IS NULL
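The three approaches suggested in this thread can be compared side by side. This is a sketch using Python's sqlite3 module instead of dashDB, with made-up ids. The second run adds a NULL to table2.id, which is not part of the original question but is a well-known case where NOT IN behaves differently from the other two.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id INTEGER);
    CREATE TABLE table2 (id INTEGER);
    INSERT INTO table1 VALUES (1), (2), (3), (4);
    INSERT INTO table2 VALUES (2), (4);
""")

# The three anti-join variants from the answers above.
QUERIES = {
    "not_in": """
        SELECT id FROM table1
        WHERE id NOT IN (SELECT id FROM table2)
        ORDER BY id""",
    "left_join": """
        SELECT t1.id FROM table1 t1
        LEFT JOIN table2 t2 ON t1.id = t2.id
        WHERE t2.id IS NULL
        ORDER BY t1.id""",
    "not_exists": """
        SELECT a.id FROM table1 a
        WHERE NOT EXISTS (SELECT 1 FROM table2 b WHERE a.id = b.id)
        ORDER BY a.id""",
}

def run_all():
    """Run every variant and return {name: list of ids}."""
    return {name: [row[0] for row in conn.execute(sql)]
            for name, sql in QUERIES.items()}

before = run_all()

# Assumption for illustration: table2.id is nullable and contains a NULL.
conn.execute("INSERT INTO table2 VALUES (NULL)")
after = run_all()

print(before)
print(after)
```

Without NULLs all three variants agree. Once table2.id contains a NULL, the NOT IN comparison evaluates to unknown for every row and returns nothing, while LEFT JOIN ... IS NULL and NOT EXISTS still return the unmatched ids.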
What kind of join does the following SQL statement actually perform?
select *
from table1 tbl1, table2 tbl2
where tbl1.id = tbl2.id
Does it only return a result if both ids match?
This is an inner join.
Yes, only records that have matching IDs will be returned.
This is the same as:
select *
from table1 tbl1
inner join table2 tbl2
on tbl1.id = tbl2.id
Personally, I prefer the explicit notation of INNER JOIN.
Yes, that is ANSI-89 syntax for an inner join. ANSI-92 defines the [INNER,LEFT, etc...] JOIN keywords.