Duplicates in Oracle Full Join [duplicate] - sql

This question already has answers here:
What is the difference between "INNER JOIN" and "OUTER JOIN"?
(28 answers)
Closed 2 years ago.
I read this article about the working of Full Joins.
The article says that a Full Join, which combines the results of Left Join and Right Join, "retains duplicate rows". Therefore, in order to simulate a FULL JOIN, we use UNION ALL instead of UNION.
But, when I perform a FULL JOIN on two tables in Oracle, I do not find duplicates at all. (I believe, Oracle internally uses the 'UNION ALL' operation on left and right joins to perform a FULL JOIN.) The left join and right join contain some common rows, but when I run a full join, those common rows don't appear twice.
Results of Left Join :
SELECT * FROM ORDERS LEFT JOIN CUSTOMER ON ORDERS.CUSTOMERID = CUSTOMER.CUSTOMER_ID;
Results of Right Join :
SELECT * FROM ORDERS RIGHT JOIN CUSTOMER ON ORDERS.CUSTOMERID = CUSTOMER.CUSTOMER_ID;
Results of Full Join :
SELECT * FROM CUSTOMER FULL OUTER JOIN ORDERS ON CUSTOMER.CUSTOMER_ID = ORDERS.CUSTOMERID;
As you can see from the results of FULL JOIN, it "Does not contain duplicate rows" , even though left join and right join have some common rows.
So, why is it believed that a full join contains duplicate rows ? Am I missing something ?

I'm really a bit confused by what you mean by "duplicated rows". Joins don't generated "duplicates". What they do is include all combinations of rows with matching join keys.
In your case, customerid = 3 is repeated, so it would seem to be repeated in one or both tables. This is true in all your join queries.
If your data is properly structured, then orders.customerid should always match customers.customerid, unless the former is NULL. In a properly structured database, full join is very, very rarely needed. For instance, assuming that the customerid has a properly declared foreign key relationship, then you would want left join, with customers as the first table:
SELECT *
FROM CUSTOMERS c LEFT JOIN
ORDERS o
USING (CUSTOMERID);
With USING, CUSTOMERID only appears once in the result set.

Related

SQL: JOIN vs LEFT OUTER JOIN?

I have multiple SQL queries that look similar where one uses JOIN and another LEFT OUTER JOIN. I played around with SQL and found that it the same results are returned. The codebase uses JOIN and LEFT OUTER JOIN interchangeably. While LEFT JOIN seems to be interchangeable with LEFT OUTER JOIN, I cannot I cannot seem to find any information about only JOIN. Is this good practice?
Ex Query1 using JOIN
SQL
SELECT
id,
name
FROM
u_users customers
JOIN
t_orders orders
ON orders.status=='PAYMENT PENDING'
Ex. Query2 using LEFT OUTER JOIN
SQL
SELECT
id,
name
FROM
u_users customers
LEFT OUTER JOIN
t_orders orders
ON orders.status=='PAYMENT PENDING'
As previously noted above:
JOIN is synonym of INNER JOIN. It's definitively different from all
types of OUTER JOIN
So the question is "When should I use an outer join?"
Here's a good article, with several great diagrams:
https://www.sqlshack.com/sql-outer-join-overview-and-examples/
The short answer your your question is:
Prefer JOIN (aka "INNER JOIN") to link two related tables. In practice, you'll use INNER JOIN most of the time.
INNER JOIN is the intersection of the two tables. It's represented by the "green" section in the middle of the Venn diagram above.
Use an "Outer Join" when you want the left, right or both outer regions.
In your example, the result set happens to be the same: the two expressions happen to be equivalent.
ALSO: be sure to familiarize yourself with "Show Plan" (or equivalent) for your RDBMS: https://www.sqlshack.com/execution-plans-in-sql-server/
'Hope that helps...
First the theory:
A join is a subset of the left join (all other things equal). Under some circumstances they are identical
The difference is that the left join will include all the tuples in the left hand side relation (even if they don't match the join predicate), while the join will only include the tuples of the left hand side that match the predicate.
For instance assume we have to relations R and S.
Say we have to do R JOIN S (and R LEFT JOIN S) on some predicate p
J = R JOIN S on (p)
Now, identify the tuples of R that are not in J.
Finally, add those tuples to J (padding any attribute in J not in R with null)
This result is the left join:
R LEFT JOIN S (p)
So when all the tuples of the left hand side of the relation are in the JOIN, this result will be identical to the Left Join.
back to you problem:
Your JOIN is very likely to include all the tuples from Users. So the query is the same if you use JOIN or LEFT JOIN.
The two are exactly equivalent, because the WHERE clause turns the LEFT JOIN into an INNER JOIN.
When filtering on all but the first table in a LEFT JOIN, the condition should usually be in the ON clause. Presumably, you also have a valid join condition, connecting the two tables:
SELEC id, name
FROM u_users u LEFT JOIN
t_orders o
ON o.user_id = u.user_id AND o.status = 'PAYMENT PENDING';
This version differs from the INNER JOIN version, because this version returns all users even those with no pending payments.
Both are the same, there is no difference here.
You need to use the ON clause when using Join. It can match any data between two tables when you don't use the ON clause.
This can cause performance issue as well as map unwanted data.
If you want to see the differences you can use "execution plans".
for example, I used the Microsoft AdventureWorks database for the example.
LEFT OUTER JOIN :
LEFT JOIN :
If you use the ON clause as you wrote, there is a possibility of looping.
Example "execution plans" is below.
You can access the correct mapping and data using the ON clause complement.
select
id,
name
from
u_users customers
left outer join
t_orders orders on customers.id = orders.userid
where orders.status=='payment pending'

SQL Server JOINS: Are 'JOIN' Statements 'LEFT OUTER' Associated by Default in SQL Server? [duplicate]

This question already has answers here:
What is the difference between "INNER JOIN" and "OUTER JOIN"?
(28 answers)
Closed 8 years ago.
I have about 6 months novice experience in SQL, TSQL, SSIS, ETL. As I find myself using JOIN statements more and more in my intern project I have been experimenting with the different JOIN statements. I wanted to confirm my findings. Are the following statements accurate pertaining to the conclusion of JOIN statements in SQL Server?:
1)I did a LEFT OUTER JOIN query and did the same query using JOIN which yielded the same results; are all JOIN statements LEFT OUTER associated in SQL Server?
2)I did a LEFT OUTER JOIN WHERE 2nd table PK (joined to) IS NOT NULL and did the same query using an INNER JOIN which yielded the same results; is it safe to say the the INNER JOIN statement will yield only matched records? and is the same as LEFT OUTER JOIN where joined records IS NOT NULL ?
The reason I'm asking is because I have been only using LEFT OUTER JOINS because that is what I was comfortable with. However, I want to eliminate as much code as possible when writing queries to be more efficient. I just wanted to make sure my observations are correct.
Also, are there any tips that you could provide on easily figuring out which JOIN statement is appropriate for specific queries? For instance, what JOIN would you use if you wanted to yield non-matching records?
Thanks.
A join or inner join (same thing) between table A and table B on, for instance, field1, would narrow in on all rows of table A and B sharing the same field1 value.
A left outer join between A and B, on field1, would show all rows of table A, and only those rows of table B that have a field1 existing in table A.
Where the rows of field1 on table A have a field1 value that doesn't exist in table B, the table B value would show null for field1, but the row of table A would be retained because it is an outer join. These are rows that wouldn't show up in a join which is an implied inner join.
If you get the same results doing a join between table A and table B as you do a left outer join between table A and B, then whatever fields you're joining on have values that exist in both tables. No value for any of the joined fields in A or B exist exclusively in A or B, they all exist in both A and B.
It is also possible you're putting criteria into the where clause that belongs in the on clause of the outer join, which may be causing your confusion. In my example above of tables A and B, where A is being left outer joined with B, you would put any criteria related to table B in the on clause, not the where clause, otherwise you would essentially be turning the outer join into an inner join. For example if you had b.field4 = 12 in the WHERE clause, and table B didn't have a match with A, it would be null and that criteria would fail, and it'd no longer come back even though you used a left outer join. That may be what you are referring to.
JOIN's are mapped to 'INNER JOIN' by default

Is it better to use where instead of adding conditions into join clause in SQL queries? [duplicate]

This question already has answers here:
INNER JOIN ON vs WHERE clause
(12 answers)
Closed 8 years ago.
Hello :) I've got a question on MySQL queries.
Which one's faster and why?
Is there any difference at all?
select tab1.something, tab2.smthelse
from tab1 inner join tab2 on tab1.pk=tab2.fk
WHERE tab2.somevalue = 'value'
Or this one:
select tab1.something, tab2.smthelse
from tab1 inner join tab2 on tab1.pk=tab2.fk
AND tab2.somevalue = 'value'
As Simon noted, the difference in performance should be negligible. The main concern would be ensuring your query correctly expresses your intent, and (especially) you get the expected results.
Generally, you want to add filters to the JOIN clause only if the filter is a condition of the join. In most (not all) cases, a filter should be applied to the WHERE clause, as it is a filter of the overall query, not of the join itself.
AFAIK, the only instance where this really affects the outcome of the query is when using an OUTER JOIN.
Consider the following queries:
SELECT *
FROM Customer c
LEFT JOIN Orders o ON c.CustomerId = o.CustomerId
WHERE o.OrderType = "InternetOrder"
vs.
SELECT *
FROM Customer c
LEFT JOIN Orders o ON c.CustomerId = o.CustomerId AND o.OrderType = "InternetOrder"
The first will return one row for each customer order that has an order type of "Internet Order". In effect, your left join has become an inner join because of the filter that was applied to the whole query (i.e. customers who do not have an "InternetOrder" will not be returned at all).
The second will return at least one row for each customer. If the customer has no orders of order type "Internet Order", it will return null values for all order table fields. Otherwise it will return one row for each customer order of type "Internet Order".
If the constraint is based off the joined table (as yours is) then it makes sense to specify the constraint when you join.
This way MySQL is able to reduce the rows from the joined table at the time it joins, as otherwise it will need to be able to select all data that fulfills the basic JOIN criteria prior to applying the WHERE logic.
In reality you'll see little difference in performance until you get to more complex queries or larger datasets, however limiting the data at each JOIN will be more efficient overall if done well especially if there are good indexes on the joined table.

SQL DB Question

Question about SQL View. Trying to develop a view from two tables. The two tables have same Primary Keys, execpt the 1st table has all of them, the 2nd has some, but not all. When I INNER Join them, I get a recordset but its not complete, because the 2nd table doesnt have all the records in it. Is there a way in my view to write logic stating that if the key isnt in there int he table #2 to insert a zero so the entire record set is shown in the view? I wan tto show ALL the records in the view even if theres nothing to inner join.
My example below:
SELECT dbo.Baan_view1b.Number, dbo.Baan_view1b.description, dbo.Baan_view1b.system, dbo.Baan_view1b.Analyst, dbo.Baan_view1b.[User],
dbo.Baan_view1b.[Date Submitted], dbo.Baan_view1b.category, dbo.Baan_view1b.stage, MAX(dbo.notes.percent_developed) AS Expr1
FROM dbo.Baan_view1b INNER JOIN
dbo.notes ON dbo.Baan_view1b.Number = dbo.notes.note_number
GROUP BY dbo.Baan_view1b.Number, dbo.Baan_view1b.description, dbo.Baan_view1b.system, dbo.Baan_view1b.Analyst, dbo.Baan_view1b.[User],
dbo.Baan_view1b.[Date Submitted], dbo.Baan_view1b.category, dbo.Baan_view1b.stage
HAVING (NOT (dbo.Baan_view1b.stage LIKE 'Closed'))
what you are looking for is the Left Join (left outer join) and not the inner join
SELECT dbo.Baan_view1b.Number, dbo.Baan_view1b.description, dbo.Baan_view1b.system, dbo.Baan_view1b.Analyst,
dbo.Baan_view1b.[User], dbo.Baan_view1b.[Date Submitted], dbo.Baan_view1b.category, dbo.Baan_view1b.stage,
MAX(dbo.notes.percent_developed) AS Expr1
FROM dbo.Baan_view1b
LEFT OUTER JOIN dbo.notes
ON dbo.Baan_view1b.Number = dbo.notes.note_number
WHERE NOT dbo.Baan_view1b.stage LIKE 'Closed'
GROUP BY dbo.Baan_view1b.Number, dbo.Baan_view1b.description, dbo.Baan_view1b.system, dbo.Baan_view1b.Analyst,
dbo.Baan_view1b.[User], dbo.Baan_view1b.[Date Submitted], dbo.Baan_view1b.category, dbo.Baan_view1b.stage
Also, changing the HAVING Clause to a WHERE clause makes the query more efficient.
Yes, you can do this. Assuming that baan_view1b has all the records and notes has only some, change
FROM dbo.Baan_view1b INNER JOIN dbo.notes
to say
FROM dbo.Baan_view1b LEFT OUTER JOIN dbo.notes
INNER JOIN (or just plain JOIN) tells the database engine to take records from Baan_view1b, match them up with records in notes, and include a row in the output for every pair of records that match. As you have seen, it excludes records from Baan_view1b that don't have matches in the notes table.
LEFT OUTER JOIN instead tells the engine to take ALL the records from Bann_view1b (because it's on the left side of the JOIN keywords). Then, it will match up records from notes wherever it can. However, you are guaranteed a row in the output for every row in the left-hand table regardless of whether it can be matched.
If, as is usual, you asked for column values from both tables, the columns from the table on the right-hand side of the JOIN will have NULL values in the missing rows.
Change the inner join to a left outer join.
(Or a right outer join or a full outer join if you feel fancy.)
You need a outer join. This shows all records that have a matching key as well as the ones that don't. The inner join only shows records that have matching join keys.
Enjoy!
You need to do a Left Outer Join as other posters have already mentioned. More information can be found here.

Compare inner join and outer join SQL statements

What is the difference between an inner join and outer join? What's the precise meaning of these two kinds of joins?
Check out Jeff Atwood's excellent:
A Visual Explanation of SQL Joins
Marc
Wikipedia has a nice long article on the topic [here](http://en.wikipedia.org/wiki/Join_(SQL))
But basically :
Inner joins return results where there are rows that satisfy the where clause in ALL tables
Outer joins return results where there are rows that satisfy the where clause in at least one of the tables
You use INNER JOIN to return all rows from both tables where there is a match. ie. in the resulting table all the rows and columns will have values.
In OUTER JOIN the resulting table may have empty columns. Outer join may be either LEFT or RIGHT
LEFT OUTER JOIN returns all the rows from the first table, even if there are no matches in the second table.
RIGHT OUTER JOIN returns all the rows from the second table, even if there are no matches in the first table.
INNER JOIN returns rows that exist in both tables
OUTER JOIN returns all rows that exist in either table
Inner join only returns a joined row if the record appears in both table.
Outer join depending on direction will show all records from one table, joined to the data from them joined table where a corresponding row exists
Using mathematical Set,
Inner Join is A ^ B;
Outer Join is A - B.
So it is (+) is your A side in the query.
Assume an example schema with customers and order:
INNER JOIN: Retrieves customers with orders only.
LEFT OUTER JOIN: Retrieves all customers with or without orders.
RIGHT OUTER JOIN: Retrieves all orders with or without matching customer records.
For a slightly more detailed infos, see Inner and Outer Join SQL Statements