I was wondering, how does an inner join work when no WHERE clause is specified? For example,
SELECT table1.letter, table2.letter, table1.number, table2.number
FROM tbl AS table1, tbl AS table2;
tbl:
letter (text), number (integer)
a, 1
b, 2
c, 3
Tried finding some examples online but I couldn't seem to find any :-/
Thanks!
The current implicit join syntax you are using:
FROM tbl AS table1, tbl AS table2;
will result in a cross join if no restrictions are present in the WHERE clause. But really you should use modern ANSI-92 syntax when writing your queries, e.g.
SELECT
table1.letter,
table2.letter,
table1.number,
table2.number
FROM tbl AS table1
INNER JOIN tbl AS table2
-- ON <some conditions>
One obvious reason to use this syntax is that it makes the logic of your query much easier to see. In this case, if your updated query were missing an ON clause, we would know right away that it is doing a cross join, which is usually not what you want.
The comma operator generates a Cartesian product -- every row in the first table combined with every row of the second.
This is more properly written using the explicit cross join:
SELECT table1.letter, table2.letter, table1.number, table2.number
FROM tbl table1 CROSS JOIN
tbl table2;
If you have conditions for combining the two tables, then you would normally use JOIN with an ON clause.
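A minimal sketch of the point above, using Python's built-in sqlite3 module only as a convenient way to run the SQL (SQLite is an assumption here; the question does not name a database). It loads the three-row tbl from the question and checks that the comma form and the explicit CROSS JOIN return the same 3 x 3 = 9 combinations.

```python
import sqlite3

# In-memory database holding the three-row table from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl (letter TEXT, number INTEGER)")
conn.executemany("INSERT INTO tbl VALUES (?, ?)", [("a", 1), ("b", 2), ("c", 3)])

# Comma-separated tables with no WHERE clause (implicit join syntax).
implicit_sql = """
    SELECT table1.letter, table2.letter, table1.number, table2.number
    FROM tbl AS table1, tbl AS table2
"""
# The same query written with an explicit CROSS JOIN.
explicit_sql = """
    SELECT table1.letter, table2.letter, table1.number, table2.number
    FROM tbl AS table1 CROSS JOIN tbl AS table2
"""

implicit_rows = sorted(conn.execute(implicit_sql).fetchall())
explicit_rows = sorted(conn.execute(explicit_sql).fetchall())

print(len(implicit_rows))              # number of row combinations
print(implicit_rows == explicit_rows)  # whether both forms agree
```

Every row of table1 is paired with every row of table2, including itself, which is why a self cross join of 3 rows yields 9.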
You can use cross join
select * from table1 cross join table2
Here is a link to understand more about the use of cross join.
https://www.w3resource.com/sql/joins/cross-join.php
I know different joins, but I wanted to know which of them is being used when we run queries like this:
select * from table1 t1, table2 t2
is it full outer join or natural join for example?
Also does it have a unique meaning among different databases or all do the same?
UPDATE: What if we add a WHERE clause? Will it always be an inner join?
The comma in the from clause -- by itself -- is equivalent to cross join in almost all databases. So:
from table1 t1, table2 t2
is functionally equivalent to:
from table1 t1 cross join table2 t2
They are not exactly equivalent, because the scoping rules within the from clause are slightly different. So:
from table1 t1, table2 t2 join
table3 t3
on t1.x = t3.x
generates an error, whereas the equivalent query with cross join works.
In general, conditions in the WHERE clause that connect the two tables turn the cross join into an INNER JOIN. However, some databases have extended the syntax to support outer joins in the WHERE clause.
I can think of one exception where the comma does not mean CROSS JOIN. Google's BigQuery originally used the comma for UNION ALL. However, that is only in Legacy SQL and they have removed that in Standard SQL.
Commas in the FROM clause have been out of fashion since the 1900s. They are the "original" form of joining tables in SQL, but explicit JOIN syntax is much better.
To me, they also mean someone who learned SQL decades ago and refused to learn about outer joins, or someone who has learned SQL from ancient materials -- and doesn't know a lot of other things that SQL does.
demo: db<>fiddle
This is a CROSS JOIN (Cartesian product). So both of the following queries are equivalent:
SELECT * FROM table1, table2 -- implicit CROSS JOIN
SELECT * FROM table1 CROSS JOIN table2 -- explicit CROSS JOIN
Concerning the UPDATE:
A WHERE clause that relates the two tables turns the general CROSS JOIN into an INNER JOIN. An INNER JOIN can be written in three ways:
SELECT * FROM table1, table2 WHERE table1.id = table2.id -- implicit CROSS JOIN notation
SELECT * FROM table1 CROSS JOIN table2 WHERE table1.id = table2.id -- really unusual!: explicit CROSS JOIN notation
SELECT * FROM table1 INNER JOIN table2 ON (table1.id = table2.id) -- explicit INNER JOIN NOTATION
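As a rough check of the three notations above, here is a small sketch that runs them through Python's built-in sqlite3 module. The choice of SQLite, the columns a and b, and the sample rows are all assumptions made for illustration; only the three query shapes come from the answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id INTEGER, a TEXT);   -- hypothetical columns
    CREATE TABLE table2 (id INTEGER, b TEXT);
    INSERT INTO table1 VALUES (1, 'a1'), (2, 'a2'), (3, 'a3');
    INSERT INTO table2 VALUES (2, 'b2'), (3, 'b3'), (4, 'b4');
""")

queries = {
    "comma + WHERE": "SELECT * FROM table1, table2 WHERE table1.id = table2.id",
    "CROSS JOIN + WHERE": "SELECT * FROM table1 CROSS JOIN table2 WHERE table1.id = table2.id",
    "INNER JOIN ... ON": "SELECT * FROM table1 INNER JOIN table2 ON (table1.id = table2.id)",
}

# Run each notation and sort the rows so the results can be compared directly.
results = {name: sorted(conn.execute(sql).fetchall()) for name, sql in queries.items()}

# Without any condition, the comma form returns every combination.
cross_count = conn.execute("SELECT COUNT(*) FROM table1, table2").fetchone()[0]

for name, rows in results.items():
    print(name, rows)
print("unfiltered combinations:", cross_count)
```

Only ids 2 and 3 exist in both tables, so each notation keeps the same two matching pairs out of the nine unfiltered combinations.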
Further reading (wikipedia)
In a DB2-400 SQL join, can the USING() clause be used with one or more AND ON clauses for a single join..? This is for a situation where some field names are the same, but not all, so USING() would only apply to part of the join.
I could have sworn I've done this before and it worked, but now it eludes me.
I've tried various combinations as shown below, but none of them work. Perhaps I'm simply mistaken and it's not possible:
SELECT * FROM T1 INNER JOIN T2 USING (COL1,COL2) AND ON (T1.COL3=T2.COL4)
SELECT * FROM T1 INNER JOIN T2 ON (T1.COL3=T2.COL4) AND USING (COL1,COL2)
SELECT * FROM T1 INNER JOIN T2 ON (T1.COL3=T2.COL4), USING (COL1,COL2)
SELECT * FROM T1 INNER JOIN T2 USING (COL1,COL2,(T1.COL3=T2.COL4))
Checking the syntax diagram here https://www.ibm.com/support/knowledgecenter/en/ssw_ibm_i_72/db2/rbafzjoinedt.htm
I suggest that the only options for JOIN USING are a comma separated list of columns
JOIN table-reference USING ( column-name [, column-name] ... )
and you can't mix USING with ON
You can use where:
SELECT *
FROM T1 INNER JOIN
T2 USING (COL1, COL2)
WHERE T1.COL3 = T2.COL4;
Another alternative would be to use a subquery to rename the column in one of the tables.
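The USING plus WHERE form above can be compared against a plain ON join that spells out all three conditions. This is only a sketch run on SQLite through Python's sqlite3 module, not on DB2 for i, and the sample rows are invented for illustration; SQLite also happens to accept the qualified names T1.COL3 and T2.COL4 alongside USING.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE T1 (COL1 INTEGER, COL2 INTEGER, COL3 INTEGER);
    CREATE TABLE T2 (COL1 INTEGER, COL2 INTEGER, COL4 INTEGER);
    -- Hypothetical sample data.
    INSERT INTO T1 VALUES (1, 1, 10), (1, 2, 20), (2, 2, 30);
    INSERT INTO T2 VALUES (1, 1, 10), (1, 2, 99), (2, 2, 30), (3, 3, 40);
""")

# Shared column names go in USING; the differently named pair goes in WHERE.
using_rows = conn.execute("""
    SELECT COL1, COL2, T1.COL3, T2.COL4
    FROM T1 INNER JOIN T2 USING (COL1, COL2)
    WHERE T1.COL3 = T2.COL4
    ORDER BY COL1, COL2
""").fetchall()

# Reference query: every condition written out in a single ON clause.
on_rows = conn.execute("""
    SELECT T1.COL1, T1.COL2, T1.COL3, T2.COL4
    FROM T1 INNER JOIN T2
      ON T1.COL1 = T2.COL1 AND T1.COL2 = T2.COL2 AND T1.COL3 = T2.COL4
    ORDER BY T1.COL1, T1.COL2
""").fetchall()

print(using_rows)
print(using_rows == on_rows)
```

For an inner join the extra condition filters the same rows whether it sits in ON or in WHERE. With an outer join the two placements are not interchangeable, because a WHERE filter is applied after the unmatched rows have been added.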
Back in the old days, I used to write select statements like this:
SELECT
table1.columnA, table2.columnA
FROM
table1, table2
WHERE
table1.columnA = 'Some value'
However I was told that having comma separated table names in the "FROM" clause is not ANSI92 compatible. There should always be a JOIN statement.
This leads to my problem.... I want to do a comparison of data between two tables but there is no common field in both tables with which to create a join. If I use the 'legacy' method of comma separated table names in the FROM clause (see code example), then it works perfectly fine. I feel uncomfortable using this method if it is considered wrong or bad practice.
Anyone know what to do in this situation?
Extra Info:
Table1 contains a list of locations in 'geography' data type
Table2 contains a different list of 'geography' locations
I am writing a select statement to compare the distances between the locations. As far as I know, you can't do a JOIN on a geography column?
You can (and should) use CROSS JOIN. The following query is equivalent to yours:
SELECT
table1.columnA
, table2.columnA
FROM table1
CROSS JOIN table2
WHERE table1.columnA = 'Some value'
or you can even use INNER JOIN with some always-true condition:
FROM table1
INNER JOIN table2 ON 1=1
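To see that these spellings behave the same, here is a small sketch using Python's built-in sqlite3 module. SQLite and the sample values are assumptions for illustration (the question is about 'geography' columns, which SQLite does not have, so plain text columns stand in for them).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (columnA TEXT);
    CREATE TABLE table2 (columnA TEXT);
    -- Hypothetical sample data.
    INSERT INTO table1 VALUES ('Some value'), ('Other value');
    INSERT INTO table2 VALUES ('x'), ('y'), ('z');
""")

select_list = "SELECT table1.columnA, table2.columnA "
where_clause = " WHERE table1.columnA = 'Some value'"

# Three ways to combine tables that share no join column.
from_clauses = {
    "comma": "FROM table1, table2",
    "cross join": "FROM table1 CROSS JOIN table2",
    "inner join on 1=1": "FROM table1 INNER JOIN table2 ON 1=1",
}

results = {
    name: sorted(conn.execute(select_list + from_sql + where_clause).fetchall())
    for name, from_sql in from_clauses.items()
}

for name, rows in results.items():
    print(name, rows)
```

All three pair the single matching table1 row with every table2 row; CROSS JOIN simply states that intent explicitly.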
A cross join will help to join multiple tables with no common fields. But be careful when joining, as this join returns the Cartesian product of the two tables.
QUERY:
SELECT
table1.columnA
, table2.columnA
FROM table1
CROSS JOIN table2
An alternative is to join on some condition that is always true, like:
SELECT
table1.columnA
, table2.columnA
FROM table1
INNER JOIN table2 ON 1=1
But this type of query should be avoided for performance as well as coding standards.
A suggestion: when using a cross join, watch out for duplicate rows. For example, in your case:
Table 1 may have more than one column in its primary key (say table1_id, id2, id3, table2_id)
Table 2 may have more than one column in its primary key (say table2_id, id3, id4)
Since the two tables share key columns (i.e. foreign keys in one or the other), we will end up with duplicate results. Hence a form like the following is useful:
WITH data_mined_table (col1, col2, col3 /* , etc. */) AS (
SELECT DISTINCT col1, col2, col3 /* , etc. */
FROM table_1 (NOLOCK), table_2 (NOLOCK)
)
SELECT * FROM data_mined_table WHERE data_mined_table.col1 = :my_param_value
I have a select statement
select someFields from table1,table2
which is a Cartesian join, right? What's the explicit syntax for such a join? Searching the web for "," isn't yielding much luck.
Cartesian join aka cross-join:
SELECT somefields FROM table1 CROSS JOIN table2
is a part of ANSI standard SQL:1992.
I am building an application which dynamically generates sql to search for rows of a particular Table (this is the main domain class, like an Employee).
There are three tables Table1, Table2 and Table1Table2Map.
Table1 has a many to many relationship with Table2, and is mapped through Table1Table2Map table. But since Table1 is my main table the relationship is virtually like a one to many.
My app generates SQL that basically gives a result set containing rows from all these tables. The select clause and joins don't change, whereas the where clause is generated based on user interaction. In any case I don't want duplicate rows of Table1 in my result set, as it is the main table for result display. Right now the generated query looks like this:
select distinct Table1.Id as Id, Table1.Name, Table2.Description from Table1
left outer join Table1Table2Map on (Table1Table2Map.Table1Id = Table1.Id)
left outer join Table2 on (Table2.Id = Table1Table2Map.Table2Id)
For simplicity I have excluded the where clause. The problem is when there are multiple rows in Table2 for Table1 even though I have said distinct of Table1.Id the result set has duplicate rows of Table1 as it has to select all the matching rows in Table2.
To elaborate more, consider that for a row in Table1 with Id = 1 there are two rows in Table1Table2Map (1, 1) and (1, 2) mapping Table1 to two rows in Table2 with ids 1, 2. The above mentioned query returns duplicate rows for this case. Now I want the query to return Table1 row with Id 1 only once. This is because there is only one row in Table2 that is like an active value for the corresponding entry in Table1 (this information is in Mapping table).
Is there a way I can avoid getting duplicate rows of Table1?
I think there is some basic problem in the way I am trying to solve the problem, but I am not able to find out what it is. Thanks in advance.
Try:
left outer join (select distinct YOUR_COLUMNS_HERE ...) SUBQUERY_ALIAS on ...
In other words, don't join directly against the table, join against a sub-query that limits the rows you join against.
You can use GROUP BY on Table1.Id, and that will get rid of the extra rows. You wouldn't need to worry about any mechanics on the join side.
I came up with this solution in a huge query, and it didn't affect the query time much.
NOTE: I'm answering this question 3 years after it was asked, but I believe this may help someone.
You can re-write your left joins to be outer applies, so that you can use a top 1 and an order by as follows:
select Table1.Id as Id, Table1.Name, t2.Description
from Table1
outer apply (
select top 1 *
from Table1Table2Map
where (Table1Table2Map.Table1Id = Table1.Id) and Table1Table2Map.IsActive = 1
order by somethingCol
) t1t2
outer apply (
select top 1 *
from Table2
where (Table2.Id = t1t2.Table2Id)
) t2;
Note that an outer apply without a "top" or an "order by" is exactly equivalent to a left outer join, it just gives you a little more control. (cross apply is equivalent to an inner join).
You can also do something similar using the row_number() function:
select * from (
select distinct Table1.Id as Id, Table1.Name, Table2.Description,
rowNum = row_number() over ( partition by table1.id order by something )
from Table1
left outer join Table1Table2Map on (Table1Table2Map.Table1Id = Table1.Id)
left outer join Table2 on (Table2.Id = Table1Table2Map.Table2Id)
) x
where rowNum = 1;
Most of this doesn't apply if the IsActive flag can narrow down your other tables to one row, but they might come in useful for you.
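The row_number() idea can be tried out with a small sketch on SQLite through Python's sqlite3 module. Several things here are assumptions: SQLite needs version 3.25 or later for window functions, the T-SQL "rowNum = ..." alias is rewritten as "AS row_num", the sample rows are invented, and ordering by Table2.Id stands in for the unspecified "something" column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Table1 (Id INTEGER, Name TEXT);
    CREATE TABLE Table2 (Id INTEGER, Description TEXT);
    CREATE TABLE Table1Table2Map (Table1Id INTEGER, Table2Id INTEGER);
    -- Hypothetical data: Table1 row 1 maps to two Table2 rows.
    INSERT INTO Table1 VALUES (1, 'Alice'), (2, 'Bob');
    INSERT INTO Table2 VALUES (1, 'first'), (2, 'second'), (3, 'third');
    INSERT INTO Table1Table2Map VALUES (1, 1), (1, 2), (2, 3);
""")

# Number the joined rows within each Table1.Id and keep only the first one.
rows = conn.execute("""
    SELECT Id, Name, Description
    FROM (
        SELECT Table1.Id AS Id, Table1.Name AS Name,
               Table2.Description AS Description,
               row_number() OVER (PARTITION BY Table1.Id
                                  ORDER BY Table2.Id) AS row_num
        FROM Table1
        LEFT OUTER JOIN Table1Table2Map ON Table1Table2Map.Table1Id = Table1.Id
        LEFT OUTER JOIN Table2 ON Table2.Id = Table1Table2Map.Table2Id
    ) x
    WHERE row_num = 1
    ORDER BY Id
""").fetchall()

print(rows)
```

Which Table2 row survives depends entirely on the ORDER BY inside the window, so that column should encode whatever makes a row the preferred one.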
To elaborate on one point: you said that there is only one "active" row in Table2 per row in Table1. Is that row not marked as active such that you could put it in the where clause? Or is there some magic in the dynamic conditions supplied by the user that determines what's active and what isn't.
If you don't need to select anything from Table2, the solution is relatively simple in that you can use EXISTS, but since you've put Table2.Description in the select clause I'll assume that's not the case.
Basically what separates the relevant rows in Table2 from the irrelevant ones? Is it an active flag or a dynamic condition? The first row? That's really how you should be removing duplicates.
DISTINCT clauses tend to be overused. That may not be the case here but it sounds like it's possible that you're trying to hack out the results you want with DISTINCT rather than solving the real problem, which is a fairly common problem.
You have to include the activity condition in your join (and there is no need for distinct):
select Table1.Id as Id, Table1.Name, Table2.Description from Table1
left outer join Table1Table2Map on (Table1Table2Map.Table1Id = Table1.Id) and Table1Table2Map.IsActive = 1
left outer join Table2 on (Table2.Id = Table1Table2Map.Table2Id)
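A small sketch of the effect, run on SQLite through Python's sqlite3 module. The IsActive column comes from the answer, while the choice of SQLite and the sample rows are assumptions for illustration. It compares the original DISTINCT query with the version that filters on IsActive inside the join.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Table1 (Id INTEGER, Name TEXT);
    CREATE TABLE Table2 (Id INTEGER, Description TEXT);
    CREATE TABLE Table1Table2Map (Table1Id INTEGER, Table2Id INTEGER, IsActive INTEGER);
    -- Hypothetical data: Alice has one inactive and one active mapping,
    -- Bob has a single active mapping, Carol has no mapping at all.
    INSERT INTO Table1 VALUES (1, 'Alice'), (2, 'Bob'), (3, 'Carol');
    INSERT INTO Table2 VALUES (1, 'old'), (2, 'current'), (3, 'only');
    INSERT INTO Table1Table2Map VALUES (1, 1, 0), (1, 2, 1), (2, 3, 1);
""")

# Original shape of the query: DISTINCT cannot collapse rows whose
# Description differs, so Alice appears twice.
distinct_rows = conn.execute("""
    SELECT DISTINCT Table1.Id AS Id, Table1.Name, Table2.Description
    FROM Table1
    LEFT OUTER JOIN Table1Table2Map ON (Table1Table2Map.Table1Id = Table1.Id)
    LEFT OUTER JOIN Table2 ON (Table2.Id = Table1Table2Map.Table2Id)
""").fetchall()

# Activity condition placed in the join: one row per Table1 row.
active_rows = conn.execute("""
    SELECT Table1.Id AS Id, Table1.Name, Table2.Description
    FROM Table1
    LEFT OUTER JOIN Table1Table2Map
      ON (Table1Table2Map.Table1Id = Table1.Id) AND Table1Table2Map.IsActive = 1
    LEFT OUTER JOIN Table2 ON (Table2.Id = Table1Table2Map.Table2Id)
    ORDER BY Table1.Id
""").fetchall()

print(len(distinct_rows))
print(active_rows)
```

Because the condition sits in the ON clause rather than in WHERE, Table1 rows without an active mapping are still returned, with a NULL Description.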
If you want to display multiple rows from Table2, you will have duplicate data from Table1 displayed. If you wanted to, you could use an aggregate function (e.g. MAX, MIN) on Table2; this would eliminate the duplicate rows from Table1, but would also hide some of the data from Table2.
See also my answer on question #70161 for additional explanation
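The aggregate approach mentioned above might look like the following sketch, run on SQLite through Python's sqlite3 module. The choice of MIN, the GROUP BY columns, and the sample rows are assumptions for illustration rather than anything the question specifies.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Table1 (Id INTEGER, Name TEXT);
    CREATE TABLE Table2 (Id INTEGER, Description TEXT);
    CREATE TABLE Table1Table2Map (Table1Id INTEGER, Table2Id INTEGER);
    -- Hypothetical data: Table1 row 1 maps to two Table2 rows.
    INSERT INTO Table1 VALUES (1, 'Alice'), (2, 'Bob');
    INSERT INTO Table2 VALUES (1, 'alpha'), (2, 'beta'), (3, 'gamma');
    INSERT INTO Table1Table2Map VALUES (1, 1), (1, 2), (2, 3);
""")

joins = """
    FROM Table1
    LEFT OUTER JOIN Table1Table2Map ON (Table1Table2Map.Table1Id = Table1.Id)
    LEFT OUTER JOIN Table2 ON (Table2.Id = Table1Table2Map.Table2Id)
"""

# Plain join: one output row per mapping, so Table1 row 1 is repeated.
plain_count = conn.execute("SELECT COUNT(*) " + joins).fetchone()[0]

# Aggregate over Table2: one row per Table1 row, keeping a single Description.
grouped_rows = conn.execute(
    "SELECT Table1.Id, Table1.Name, MIN(Table2.Description) "
    + joins
    + " GROUP BY Table1.Id, Table1.Name ORDER BY Table1.Id"
).fetchall()

print(plain_count)
print(grouped_rows)
```

The 'beta' description for the first row is dropped, which is the hidden data the answer warns about.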