SQL: JOIN vs LEFT OUTER JOIN? - sql

I have multiple SQL queries that look similar, where one uses JOIN and another uses LEFT OUTER JOIN. I played around with SQL and found that the same results are returned. The codebase uses JOIN and LEFT OUTER JOIN interchangeably. While LEFT JOIN seems to be interchangeable with LEFT OUTER JOIN, I cannot seem to find any information about plain JOIN. Is this good practice?
Ex Query1 using JOIN
SELECT
id,
name
FROM
u_users customers
JOIN
t_orders orders
ON orders.status = 'PAYMENT PENDING'
Ex. Query2 using LEFT OUTER JOIN
SELECT
id,
name
FROM
u_users customers
LEFT OUTER JOIN
t_orders orders
ON orders.status = 'PAYMENT PENDING'

As previously noted above:
JOIN is a synonym of INNER JOIN. It's definitely different from all
types of OUTER JOIN.
So the question is "When should I use an outer join?"
Here's a good article, with several great diagrams:
https://www.sqlshack.com/sql-outer-join-overview-and-examples/
The short answer to your question is:
Prefer JOIN (aka "INNER JOIN") to link two related tables. In practice, you'll use INNER JOIN most of the time.
INNER JOIN is the intersection of the two tables. It's represented by the "green" section in the middle of the Venn diagram above.
Use an "Outer Join" when you want the left, right or both outer regions.
In your example, the result set happens to be the same: the two expressions happen to be equivalent.
ALSO: be sure to familiarize yourself with "Show Plan" (or equivalent) for your RDBMS: https://www.sqlshack.com/execution-plans-in-sql-server/
'Hope that helps...

First the theory:
A join is a subset of the left join (all other things equal). Under some circumstances they are identical
The difference is that the left join will include all the tuples in the left hand side relation (even if they don't match the join predicate), while the join will only include the tuples of the left hand side that match the predicate.
For instance, assume we have two relations R and S.
Say we have to do R JOIN S (and R LEFT JOIN S) on some predicate p
J = R JOIN S on (p)
Now, identify the tuples of R that are not in J.
Finally, add those tuples to J (padding any attribute in J not in R with null)
This result is the left join:
R LEFT JOIN S (p)
So when all the tuples of the left hand side of the relation are in the JOIN, this result will be identical to the Left Join.
Back to your problem:
Your JOIN is very likely to include all the tuples from Users. So the query is the same if you use JOIN or LEFT JOIN.
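The construction described above (take the inner join, find the tuples of R missing from it, pad them with NULL, and add them back) can be checked with a small sketch. It uses Python's built-in sqlite3 module; the tables `r` and `s` and their rows are hypothetical and only serve as an illustration.

```python
import sqlite3

# Hypothetical relations R and S; table names and rows are made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE r (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE s (order_id INTEGER PRIMARY KEY, r_id INTEGER, status TEXT);
    INSERT INTO r VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
    INSERT INTO s VALUES (10, 1, 'PAID'), (11, 1, 'PAYMENT PENDING'), (12, 2, 'PAID');
""")

# J = R JOIN S ON (p)
inner = conn.execute(
    "SELECT r.id, r.name, s.order_id FROM r JOIN s ON s.r_id = r.id"
).fetchall()

# Tuples of R that are not in J, padded with NULL for the attributes of S.
padded = conn.execute("""
    SELECT r.id, r.name, NULL
    FROM r
    WHERE NOT EXISTS (SELECT 1 FROM s WHERE s.r_id = r.id)
""").fetchall()

# The engine's own LEFT JOIN, for comparison.
left = conn.execute(
    "SELECT r.id, r.name, s.order_id FROM r LEFT JOIN s ON s.r_id = r.id"
).fetchall()

# Sort by repr so rows containing None can be ordered safely.
built_by_hand = sorted(inner + padded, key=repr)
print(built_by_hand == sorted(left, key=repr))
```

If every row of `r` had at least one match in `s`, `padded` would be empty and the JOIN and LEFT JOIN results would coincide.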

The two are exactly equivalent, because the WHERE clause turns the LEFT JOIN into an INNER JOIN.
When filtering on all but the first table in a LEFT JOIN, the condition should usually be in the ON clause. Presumably, you also have a valid join condition, connecting the two tables:
SELECT id, name
FROM u_users u LEFT JOIN
t_orders o
ON o.user_id = u.user_id AND o.status = 'PAYMENT PENDING';
This version differs from the INNER JOIN version, because this version returns all users even those with no pending payments.
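To see the effect of where the filter is placed, here is a minimal sketch using Python's built-in sqlite3 module. The column layout (`user_id`, `order_id`, `status`) and the sample rows are assumptions made for illustration; only the table names come from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Assumed schema and data, for illustration only.
    CREATE TABLE u_users (user_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE t_orders (order_id INTEGER PRIMARY KEY, user_id INTEGER, status TEXT);
    INSERT INTO u_users VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
    INSERT INTO t_orders VALUES (10, 1, 'PAYMENT PENDING'), (11, 2, 'SHIPPED');
""")

# Filter in the ON clause: every user is kept, unmatched users get NULL.
filter_in_on = conn.execute("""
    SELECT u.name, o.order_id
    FROM u_users u
    LEFT JOIN t_orders o
      ON o.user_id = u.user_id AND o.status = 'PAYMENT PENDING'
    ORDER BY u.user_id
""").fetchall()

# Filter in the WHERE clause: NULL-padded rows fail the test and are dropped,
# so the LEFT JOIN behaves like an INNER JOIN.
filter_in_where = conn.execute("""
    SELECT u.name, o.order_id
    FROM u_users u
    LEFT JOIN t_orders o ON o.user_id = u.user_id
    WHERE o.status = 'PAYMENT PENDING'
    ORDER BY u.user_id
""").fetchall()

print(filter_in_on)
print(filter_in_where)
```

With this data the first query returns one row per user, while the second returns only the user who has a pending order.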

Both are the same, there is no difference here.
You need a proper ON clause when using a join. Without a condition that relates the two tables, every row of one table can be matched with every row of the other.
This can cause performance issues and return unwanted rows.
If you want to see the differences, you can use execution plans.
For example, I used the Microsoft AdventureWorks database.
LEFT OUTER JOIN :
LEFT JOIN :
If you use the ON clause as you wrote, there is a possibility of looping.
Example "execution plans" is below.
You can get the correct mapping and data by completing the ON clause:
select
id,
name
from
u_users customers
left outer join
t_orders orders on customers.id = orders.userid
where orders.status = 'PAYMENT PENDING'

Related

When talking about ANSI SQL, is there a difference between JOIN and LEFT JOIN?

I had an interview today where they asked the question, "Tell me the difference between a JOIN and INNER JOIN." I proceeded to explain what INNER JOIN is, and started to talk about LEFT JOIN. The interviewer interrupted me and said, "No I didn't say LEFT JOIN, just JOIN." I was honestly stuck because I never used "JOIN" I always specified LEFT JOIN.
He told me that LEFT JOIN and JOIN act the same. When looking up "JOIN" I can't find any information saying that it works just like left join.
Does JOIN work the same as LEFT JOIN?
Is it normal, for tech/IT jobs, to have trick questions like this?
Plain old JOIN is a synonym for INNER JOIN. That is quite different from a LEFT JOIN.
SQL implements through explicit syntax five JOIN operations:
CROSS JOIN
INNER JOIN
LEFT JOIN
RIGHT JOIN
FULL JOIN
In addition, "join" can colloquially mean "combining rows from two tables" and there are other join types -- such as semi-joins.
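As a quick check of the claim that plain JOIN is INNER JOIN, the following sketch runs the same query with different join keywords using Python's built-in sqlite3 module. The tables `a` and `b` are hypothetical. RIGHT JOIN and FULL JOIN are left out because older SQLite versions do not support them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical single-column tables with partially overlapping keys.
    CREATE TABLE a (k INTEGER);
    CREATE TABLE b (k INTEGER);
    INSERT INTO a VALUES (1), (2), (3);
    INSERT INTO b VALUES (2), (3), (4);
""")

def rows(join_keyword, on_clause="ON a.k = b.k"):
    """Run the same query with a different join keyword."""
    sql = f"SELECT a.k, b.k FROM a {join_keyword} b {on_clause} ORDER BY a.k, b.k"
    return conn.execute(sql).fetchall()

plain = rows("JOIN")
inner = rows("INNER JOIN")
left = rows("LEFT JOIN")
cross = rows("CROSS JOIN", on_clause="")  # no predicate: every pairing

print(plain == inner)   # same rows for JOIN and INNER JOIN
print(plain == left)    # LEFT JOIN also keeps the unmatched row from a
print(len(cross))       # 3 x 3 pairings
```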

Duplicates in Oracle Full Join [duplicate]

I read this article about the working of Full Joins.
The article says that a Full Join, which combines the results of Left Join and Right Join, "retains duplicate rows". Therefore, in order to simulate a FULL JOIN, we use UNION ALL instead of UNION.
But, when I perform a FULL JOIN on two tables in Oracle, I do not find duplicates at all. (I believe, Oracle internally uses the 'UNION ALL' operation on left and right joins to perform a FULL JOIN.) The left join and right join contain some common rows, but when I run a full join, those common rows don't appear twice.
Results of Left Join :
SELECT * FROM ORDERS LEFT JOIN CUSTOMER ON ORDERS.CUSTOMERID = CUSTOMER.CUSTOMER_ID;
Results of Right Join :
SELECT * FROM ORDERS RIGHT JOIN CUSTOMER ON ORDERS.CUSTOMERID = CUSTOMER.CUSTOMER_ID;
Results of Full Join :
SELECT * FROM CUSTOMER FULL OUTER JOIN ORDERS ON CUSTOMER.CUSTOMER_ID = ORDERS.CUSTOMERID;
As you can see from the results of FULL JOIN, it "Does not contain duplicate rows" , even though left join and right join have some common rows.
So, why is it believed that a full join contains duplicate rows ? Am I missing something ?
I'm really a bit confused by what you mean by "duplicated rows". Joins don't generate "duplicates". What they do is include all combinations of rows with matching join keys.
In your case, customerid = 3 is repeated, so it would seem to be repeated in one or both tables. This is true in all your join queries.
If your data is properly structured, then orders.customerid should always match customers.customerid, unless the former is NULL. In a properly structured database, full join is very, very rarely needed. For instance, assuming that the customerid has a properly declared foreign key relationship, then you would want left join, with customers as the first table:
SELECT *
FROM CUSTOMERS c LEFT JOIN
ORDERS o
USING (CUSTOMERID);
With USING, CUSTOMERID only appears once in the result set.
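The point about repeated keys versus duplicates can be illustrated with a small sketch using Python's built-in sqlite3 module. The rows are made up, and the full join is emulated as a left join plus the unmatched customers (combined with UNION ALL), since older SQLite versions lack FULL JOIN. This is one common way to emulate it, not necessarily what Oracle does internally.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Made-up data: customer 3 has two orders, customer 2 has none,
    -- and order 103 has no customer.
    CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customerid INTEGER);
    INSERT INTO customer VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
    INSERT INTO orders VALUES (100, 3), (101, 3), (102, 1), (103, NULL);
""")

# Left join keeps every order; the second branch adds only the customers
# that have no order at all, so matched rows are never emitted twice.
full_join_emulated = conn.execute("""
    SELECT o.order_id, c.customer_id
    FROM orders o
    LEFT JOIN customer c ON o.customerid = c.customer_id
    UNION ALL
    SELECT NULL, c.customer_id
    FROM customer c
    WHERE NOT EXISTS (SELECT 1 FROM orders o WHERE o.customerid = c.customer_id)
""").fetchall()

rows_for_customer_3 = [row for row in full_join_emulated if row[1] == 3]
print(len(full_join_emulated), len(rows_for_customer_3))
```

Customer 3 shows up twice because it matches two different orders, yet no row of the result is an exact copy of another.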

SQL Query Join queries

I have 2 queries in BigQuery where I want to join 2 tables on some condition.
First query
The second query is the same, but I'm using JOIN instead of LEFT JOIN.
Can anyone explain to me why LEFT JOIN with a WHERE condition returns a different result count than INNER JOIN?
why LEFT JOIN with a WHERE condition returns a different result count than INNER JOIN?
They are considering different starting sets to work with. Here are some nice illustrations on difference between joins:
https://www.diffen.com/difference/Inner_Join_vs_Outer_Join
https://www.codeproject.com/Articles/33052/Visual-Representation-of-SQL-Joins
Note that OUTER is optional, so LEFT OUTER JOIN is equivalent to LEFT JOIN.

Condition on main table in inner join vs in where clause

Is there any reason on an INNER JOIN to have a condition on the main table vs in the WHERE clause?
Example in INNER JOIN:
SELECT
(various columns here from each table)
FROM dbo.MainTable AS m
INNER JOIN dbo.JohnDataRecord AS jdr
ON m.ibID = jdr.ibID
AND m.MainID = @MainId -- question here
AND jdr.SentDate IS NULL
LEFT JOIN dbo.PTable AS p1
ON jdr.RecordID = p1.RecordID
LEFT JOIN dbo.DataRecipient AS dr
ON jdr.RecipientID = dr.RecipientID
(more left joins here)
WHERE
dr.lastRecordID IS NOT NULL;
Query with condition in WHERE clause:
SELECT
(various columns here from each table)
FROM dbo.MainTable AS m
INNER JOIN dbo.JohnDataRecord AS jdr
ON m.ibID = jdr.ibID
AND jdr.SentDate IS NULL
LEFT JOIN dbo.PTable AS p1
ON jdr.RecordID = p1.RecordID
LEFT JOIN dbo.DataRecipient AS dr
ON jdr.RecipientID = dr.RecipientID
(more left joins here)
WHERE
m.MainID = @MainId -- question here
AND dr.lastRecordID IS NOT NULL;
Other similar questions are more general, whereas this one is specific to SQL Server.
In the scope of the question,
Is there any reason on an INNER JOIN to have a condition on the main
table vs in the WHERE clause?
This is a STYLE choice for the INNER JOIN.
From a pure style point of view:
While there is no hard and fast rule for style, this is generally the less common choice. It can make maintenance harder: for example, if someone were to remove the INNER JOIN and all of its ON conditions, the primary table's result set would be affected. It can also make the query more difficult to debug or understand when it contains a very complex set of joins.
It might also be noted that this line might be placed on many INNER JOINS further adding to the confusion.
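For an INNER JOIN, the two placements return the same rows, which is why this comes down to style. A minimal sketch with Python's built-in sqlite3 module, using simplified, hypothetical stand-ins for the tables in the question (`main_table`, `data_record`) and a bound parameter in place of the SQL Server variable:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Simplified, hypothetical versions of MainTable and JohnDataRecord.
    CREATE TABLE main_table (main_id INTEGER, ib_id INTEGER);
    CREATE TABLE data_record (record_id INTEGER, ib_id INTEGER, sent_date TEXT);
    INSERT INTO main_table VALUES (1, 10), (2, 20);
    INSERT INTO data_record VALUES
        (100, 10, NULL), (101, 10, '2020-01-01'), (102, 20, NULL);
""")

main_id = 1  # stands in for the query parameter

# Condition on the main table inside the ON clause.
in_on = conn.execute("""
    SELECT m.main_id, jdr.record_id
    FROM main_table m
    INNER JOIN data_record jdr
      ON m.ib_id = jdr.ib_id AND m.main_id = ? AND jdr.sent_date IS NULL
""", (main_id,)).fetchall()

# Same condition moved to the WHERE clause.
in_where = conn.execute("""
    SELECT m.main_id, jdr.record_id
    FROM main_table m
    INNER JOIN data_record jdr
      ON m.ib_id = jdr.ib_id AND jdr.sent_date IS NULL
    WHERE m.main_id = ?
""", (main_id,)).fetchall()

print(in_on == in_where)
```

This equivalence holds for inner joins only. Moving a condition between ON and WHERE on an outer join can change the result.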

SQL DB Question

Question about a SQL view. I'm trying to develop a view from two tables. The two tables have the same primary keys, except the 1st table has all of them and the 2nd has some, but not all. When I INNER JOIN them, I get a recordset, but it's not complete, because the 2nd table doesn't have all the records in it. Is there a way in my view to write logic stating that if the key isn't in table #2, a zero is inserted so the entire record set is shown in the view? I want to show ALL the records in the view even if there's nothing to inner join.
My example below:
SELECT dbo.Baan_view1b.Number, dbo.Baan_view1b.description, dbo.Baan_view1b.system, dbo.Baan_view1b.Analyst, dbo.Baan_view1b.[User],
dbo.Baan_view1b.[Date Submitted], dbo.Baan_view1b.category, dbo.Baan_view1b.stage, MAX(dbo.notes.percent_developed) AS Expr1
FROM dbo.Baan_view1b INNER JOIN
dbo.notes ON dbo.Baan_view1b.Number = dbo.notes.note_number
GROUP BY dbo.Baan_view1b.Number, dbo.Baan_view1b.description, dbo.Baan_view1b.system, dbo.Baan_view1b.Analyst, dbo.Baan_view1b.[User],
dbo.Baan_view1b.[Date Submitted], dbo.Baan_view1b.category, dbo.Baan_view1b.stage
HAVING (NOT (dbo.Baan_view1b.stage LIKE 'Closed'))
What you are looking for is a LEFT JOIN (left outer join), not an inner join:
SELECT dbo.Baan_view1b.Number, dbo.Baan_view1b.description, dbo.Baan_view1b.system, dbo.Baan_view1b.Analyst,
dbo.Baan_view1b.[User], dbo.Baan_view1b.[Date Submitted], dbo.Baan_view1b.category, dbo.Baan_view1b.stage,
MAX(dbo.notes.percent_developed) AS Expr1
FROM dbo.Baan_view1b
LEFT OUTER JOIN dbo.notes
ON dbo.Baan_view1b.Number = dbo.notes.note_number
WHERE NOT dbo.Baan_view1b.stage LIKE 'Closed'
GROUP BY dbo.Baan_view1b.Number, dbo.Baan_view1b.description, dbo.Baan_view1b.system, dbo.Baan_view1b.Analyst,
dbo.Baan_view1b.[User], dbo.Baan_view1b.[Date Submitted], dbo.Baan_view1b.category, dbo.Baan_view1b.stage
Also, changing the HAVING Clause to a WHERE clause makes the query more efficient.
Yes, you can do this. Assuming that baan_view1b has all the records and notes has only some, change
FROM dbo.Baan_view1b INNER JOIN dbo.notes
to say
FROM dbo.Baan_view1b LEFT OUTER JOIN dbo.notes
INNER JOIN (or just plain JOIN) tells the database engine to take records from Baan_view1b, match them up with records in notes, and include a row in the output for every pair of records that match. As you have seen, it excludes records from Baan_view1b that don't have matches in the notes table.
LEFT OUTER JOIN instead tells the engine to take ALL the records from Baan_view1b (because it's on the left side of the JOIN keywords). Then, it will match up records from notes wherever it can. However, you are guaranteed a row in the output for every row in the left-hand table regardless of whether it can be matched.
If, as is usual, you asked for column values from both tables, the columns from the table on the right-hand side of the JOIN will have NULL values in the missing rows.
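Since the question asks for a zero rather than NULL when there is no matching note, one possible approach is to wrap the aggregate in COALESCE. The sketch below uses Python's built-in sqlite3 module with a cut-down, hypothetical `requests` table standing in for Baan_view1b; only `notes`, `note_number` and `percent_developed` are taken from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- 'requests' is a hypothetical, cut-down stand-in for Baan_view1b.
    CREATE TABLE requests (number INTEGER PRIMARY KEY, stage TEXT);
    CREATE TABLE notes (note_number INTEGER, percent_developed INTEGER);
    INSERT INTO requests VALUES (1, 'Open'), (2, 'Open'), (3, 'Closed');
    INSERT INTO notes VALUES (1, 40), (1, 75);
""")

# LEFT OUTER JOIN keeps request 2 even though it has no notes;
# COALESCE turns the resulting NULL maximum into 0.
view_rows = conn.execute("""
    SELECT r.number, COALESCE(MAX(n.percent_developed), 0) AS percent_developed
    FROM requests r
    LEFT OUTER JOIN notes n ON r.number = n.note_number
    WHERE r.stage NOT LIKE 'Closed'
    GROUP BY r.number
    ORDER BY r.number
""").fetchall()

print(view_rows)
```

COALESCE is standard SQL and is also available in SQL Server, so the same expression should carry over to the original view.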
Change the inner join to a left outer join.
(Or a right outer join or a full outer join if you feel fancy.)
You need an outer join. This shows all records that have a matching key as well as the ones that don't. The inner join only shows records that have matching join keys.
Enjoy!
You need to do a Left Outer Join as other posters have already mentioned. More information can be found here.