LEFT JOIN THREE tables - sql

How do I create an SQL query that selects the data found only in table A,
as in the image?
Thanks

One method is minus:
select . . .
from a
minus
select . . .
from b
minus
select . . .
from c;
Or, not exists:
select a.*
from a
where not exists (select 1 from b where . . . ) and
not exists (select 1 from c where . . . );
You don't clarify what the matching conditions are, so I've used . . . for generality.
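These two versions are not the same. The first returns the unique combinations of the selected columns from a for which the same combination does not appear in b or c. The second returns all columns of every row from a (duplicates included) for which the conditions find no matching row in b or c, and those conditions may compare a different set of columns.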
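To make the difference concrete, here is a small sketch using Python's built-in sqlite3 module. The tables, the single id column, and the matching condition are assumptions for illustration, since the question does not give them. SQLite spells MINUS as EXCEPT.

```python
import sqlite3

# Hypothetical tables: a contains a duplicate row (id = 1).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (id INTEGER);
    CREATE TABLE b (id INTEGER);
    CREATE TABLE c (id INTEGER);
    INSERT INTO a VALUES (1), (1), (2), (3);
    INSERT INTO b VALUES (2);
    INSERT INTO c VALUES (3);
""")

# Set difference (MINUS / EXCEPT): duplicates in a collapse to one row.
except_rows = conn.execute("""
    SELECT id FROM a
    EXCEPT SELECT id FROM b
    EXCEPT SELECT id FROM c
""").fetchall()

# NOT EXISTS: every qualifying row of a is kept, duplicates included.
not_exists_rows = conn.execute("""
    SELECT a.id FROM a
    WHERE NOT EXISTS (SELECT 1 FROM b WHERE b.id = a.id)
      AND NOT EXISTS (SELECT 1 FROM c WHERE c.id = a.id)
    ORDER BY a.id
""").fetchall()

print(except_rows)      # [(1,)]
print(not_exists_rows)  # [(1,), (1,)]
```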

If you must use LEFT JOIN to implement what is really an anti join, then do this:
SELECT *
FROM a
LEFT JOIN b ON b.a_id = a.a_id
LEFT JOIN c ON c.a_id = a.a_id
WHERE b.a_id IS NULL
AND c.a_id IS NULL
This reads:
FROM: Get all rows from a
LEFT JOIN: Optionally get the matching rows from b and c as well
WHERE: In fact, no. Keep only those rows from a, for which there was no match in b and c
Using NOT EXISTS() is a more elegant way to run an anti-join, though. I tend to not recommend NOT IN() because of the delicate implications around three valued logic - which can lead to not getting any results at all.
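The three-valued-logic problem with NOT IN() can be reproduced with a minimal sketch. It uses Python's sqlite3 module and made-up data; the only assumption that matters is that b.a_id contains a NULL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (a_id INTEGER);
    CREATE TABLE b (a_id INTEGER);
    INSERT INTO a VALUES (1), (2);
    INSERT INTO b VALUES (2), (NULL);   -- note the NULL
""")

# "1 NOT IN (2, NULL)" evaluates to NULL (unknown), not TRUE,
# so the row with a_id = 1 is filtered out as well.
not_in_rows = conn.execute(
    "SELECT a_id FROM a WHERE a_id NOT IN (SELECT a_id FROM b)"
).fetchall()

# NOT EXISTS only asks whether a matching row exists, so the NULL is harmless.
not_exists_rows = conn.execute(
    "SELECT a_id FROM a "
    "WHERE NOT EXISTS (SELECT 1 FROM b WHERE b.a_id = a.a_id)"
).fetchall()

print(not_in_rows)      # []
print(not_exists_rows)  # [(1,)]
```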
Side note on using Venn diagrams for joins
A lot of people like using Venn diagrams to illustrate joins. I think this is a bad habit. Venn diagrams model set operations (like UNION, INTERSECT, or in your case EXCEPT / MINUS) very well. Joins are filtered cross products, which is an entirely different kind of operation. I've blogged about it here.
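A short sketch of the "filtered cross product" point, again with Python's sqlite3 module and hypothetical data: an inner join returns exactly what a cross join plus a filter returns, and it can multiply rows, which no set operation does.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (id INTEGER);
    CREATE TABLE b (a_id INTEGER);
    INSERT INTO a VALUES (1), (2);
    INSERT INTO b VALUES (1), (1);   -- two rows of b point at a.id = 1
""")

# An inner join ...
join_rows = conn.execute(
    "SELECT a.id, b.a_id FROM a JOIN b ON b.a_id = a.id ORDER BY 1, 2"
).fetchall()

# ... is a cross product with a filter applied.
cross_rows = conn.execute(
    "SELECT a.id, b.a_id FROM a CROSS JOIN b WHERE b.a_id = a.id ORDER BY 1, 2"
).fetchall()

print(join_rows)                # [(1, 1), (1, 1)]
print(join_rows == cross_rows)  # True
```

The single row a.id = 1 appears twice in the result, which a Venn diagram has no way to express.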

Select what is in A but not in B, not in C, and not in any overlap between them:
Select * from A
where A.id not in ( select coalesce(b.id,c.id) AS ID
from b full outer join c on (b.id=c.id) )
or also, since you don't need a join, you can avoid it with a union (a cross join of B and C would only ever yield the ids from B):
select * from A
where A.id not in (select B.ID from B union all select C.ID from C)

I would do like this:
SELECT t1.name
FROM table1 t1
LEFT JOIN table2 t2 ON t2.name = t1.name
WHERE t2.name IS NULL
Someone has already asked something related to your question; you should see it
here

Related

get all rows from A plus missing rows from B

This seems so obvious but I am failing.
In Teradata SQL, how to get all rows from table A, plus those from table B, that do not occur in table A, based on key field key?
This must have been asked a thousand times. But honestly I do not find the answer.
Full outer join seems to give me duplicate "inner join" results.
--Edit , based on first comment (thanks) --
so if I would do
select * from A
union all
select * from B
left join A
on A.key = B.key
where A.key IS NULL
I guess that would work (untested) but is that the most performant way?
Sometimes EXISTS or NOT EXISTS performs better than joins:
select * from A
union all
select * from B
where not exists (
select 1 from A
where A.key = B.key
)
I assume the key columns are already indexed.
Your version is fine . . . if you select the right columns:
select A.* from A
union all
select B.*
from B left join
A
on A.key = B.key
where A.key IS NULL;
I think Teradata does a good job optimizing joins. That said, EXISTS is also a very reasonable option.
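Both variants can be checked side by side on a toy data set. This is only a sketch: it runs on SQLite through Python's sqlite3 module rather than Teradata, the val column and the sample rows are made up, and it says nothing about performance.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE A ("key" INTEGER, val TEXT);
    CREATE TABLE B ("key" INTEGER, val TEXT);
    INSERT INTO A VALUES (1, 'a1'), (2, 'a2');
    INSERT INTO B VALUES (2, 'b2'), (3, 'b3');
""")

# All rows of A, plus the rows of B whose key does not occur in A.
not_exists_rows = conn.execute("""
    SELECT A.* FROM A
    UNION ALL
    SELECT B.* FROM B
    WHERE NOT EXISTS (SELECT 1 FROM A WHERE A."key" = B."key")
    ORDER BY 1
""").fetchall()

# Same result with the LEFT JOIN ... IS NULL form, selecting only B's columns.
left_join_rows = conn.execute("""
    SELECT A.* FROM A
    UNION ALL
    SELECT B.* FROM B LEFT JOIN A ON A."key" = B."key"
    WHERE A."key" IS NULL
    ORDER BY 1
""").fetchall()

print(not_exists_rows)  # [(1, 'a1'), (2, 'a2'), (3, 'b3')]
print(left_join_rows)   # [(1, 'a1'), (2, 'a2'), (3, 'b3')]
```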

FULL OUTER JOIN with an OR condition

I have 2 tables with 2 different ids. I want to join based on the 2 different ids and a few other parameters, but the problem is that the ids don't always both match. Sometimes only id number 1 will match, sometimes id number 2 will not match anything, and sometimes both will match.
Using full outer join with an OR condition in the JOIN clause really slows down my query. Is there a more efficient way of doing this?
I know you can use unions instead in case of inner joins but am not sure how to optimize using outer joins.
SELECT A.*, B.*
FROM A
FULL OUTER JOIN B
ON (A.id_1 = B.id_1 or A.id_2 = B.id_2)
AND A.pay_month = B.pay_month
AND A.plan = B.plan
Hmmm . . . This might be sufficient:
select A.*, B.*
from A full outer join
B
on A.id_1 = B.id_1 and A.pay_month = B.pay_month and A.plan = B.plan
union -- intentionally to remove duplicates
select A.*, B.*
from A full outer join
B
on A.id_2 = B.id_2 and A.id_1 <> B.id_1 and A.pay_month = B.pay_month and A.plan = B.plan;
This is not 100% equivalent -- for instance, this removes duplicates even within a table. Also, the union adds the overhead of removing duplicates. But the results may be good for your purposes.
Also, is a full outer join really necessary? I rarely use it in my code.

Select non duplicate records from a hive join query

I have the following Hive query:
select *
from A
left outer join B
on A.ID = B.ID
where B.ID IS NULL
The result produces duplicate data but I need only non-duplicate records.
After some research, I tried the below query:
select *
from (
select *
from A
left outer join B
on A.ID = B.ID where B.ID IS NULL ) join_result
group by join_result.ID
It's showing an ambiguous column reference ID error.
I do not know the column names of table A.
Please help me find a solution to this.
Thank you.
Hmmm . . . How about select:
Select A.*
from A left outer join
B
on A.ID = B.ID
where B.ID IS NULL;
I removed the B columns because they are not needed.
One of your join columns may contain NULL values. NULL never compares equal to anything, including another NULL, so rows whose join key is NULL never find a match. Try replacing NULL with a default value while joining, using NVL or COALESCE, and pick a default that does not occur as a real ID. I was looking for the same answer and found your post without a solution, so now that I have found one I am posting it here in case it helps someone.
select *
from A
left outer join B
on coalesce(A.ID,000) = coalesce(B.ID,000)
where B.ID IS NULL
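The behavior of NULL join keys, and one possible source of the duplicates, can be seen in a small sketch. It uses Python's sqlite3 module instead of Hive, and the name column and sample rows are assumptions, since the question shows no data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE A (ID INTEGER, name TEXT);
    CREATE TABLE B (ID INTEGER);
    INSERT INTO A VALUES (1, 'x'), (1, 'x'), (NULL, 'y'), (2, 'z');
    INSERT INTO B VALUES (2), (NULL);
""")

# NULL = NULL is not TRUE; it evaluates to NULL (None in Python).
null_equals_null = conn.execute("SELECT NULL = NULL").fetchone()[0]

# Anti-join: the NULL-keyed row of A never matches, so it is returned,
# and the duplicated row of A comes back twice.
anti_rows = conn.execute("""
    SELECT A.* FROM A LEFT OUTER JOIN B ON A.ID = B.ID
    WHERE B.ID IS NULL ORDER BY A.name
""").fetchall()

# DISTINCT over A's columns removes duplicates without naming the columns.
distinct_rows = conn.execute("""
    SELECT DISTINCT A.* FROM A LEFT OUTER JOIN B ON A.ID = B.ID
    WHERE B.ID IS NULL ORDER BY A.name
""").fetchall()

print(null_equals_null)  # None
print(anti_rows)         # [(1, 'x'), (1, 'x'), (None, 'y')]
print(distinct_rows)     # [(1, 'x'), (None, 'y')]
```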

How do I find records that are not joined?

I have two tables that are joined together.
A has many B
Normally you would do:
select * from a,b where b.a_id = a.id
To get all of the records from a that have a record in b.
How do I get just the records in a that do not have anything in b?
select * from a where id not in (select a_id from b)
Or, as some other people on this thread say:
select a.* from a
left outer join b on a.id = b.a_id
where b.a_id is null
select * from a
left outer join b on a.id = b.a_id
where b.a_id is null
The following image will help to understand SQL LEFT JOIN:
Another approach:
select * from a where not exists (select * from b where b.a_id = a.id)
The "exists" approach is useful if there is some other "where" clause you need to attach to the inner query.
SELECT id FROM a
EXCEPT
SELECT a_id FROM b;
You will probably get a lot better performance (than using 'not in') if you use an outer join:
select * from a left outer join b on a.id = b.a_id where b.a_id is null;
SELECT <columns>
FROM a WHERE id NOT IN (SELECT a_id FROM b)
In the case of one join it is pretty fast, but when we are removing records from a database which has about 50 million records and 4 or more joins due to foreign keys, it takes a few minutes.
Much faster to use WHERE NOT IN condition like this:
select a.* from a
where a.id NOT IN(SELECT DISTINCT a_id FROM b where a_id IS NOT NULL)
-- And for more joins
AND a.id NOT IN(SELECT DISTINCT a_id FROM c where a_id IS NOT NULL)
I can also recommend this approach for deleting when cascade delete is not configured.
This query takes only a few seconds.
The first approach is
select a.* from a where a.id not in (select b.ida from b)
the second approach is
select a.*
from a left outer join b on a.id = b.ida
where b.ida is null
The first approach is very expensive. The second approach is better.
With PostgreSQL 9.4, I ran EXPLAIN on both queries, and the first query has a cost of cost=0.00..1982043603.32,
whereas the join query has a cost of cost=45946.77..45946.78.
For example, I search for all products that are not compatible with any vehicle. I have 100k products and more than 1m compatibilities.
select count(*) from product a left outer join compatible c on a.id=c.idprod where c.idprod is null
The join query took about 5 seconds, whereas the subquery version had still not finished after 3 minutes.
Another way of writing it
select a.*
from a
left outer join b
on a.id = b.id
where b.id is null
Ouch, beaten by Nathan :)
This will protect you from nulls in the IN clause, which can cause unexpected behavior.
select * from a where id not in (select [a id] from b where [a id] is not null)
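As a quick sanity check, the approaches from this thread can be run against the same toy data. This sketch uses Python's sqlite3 module with invented rows (including a NULL in b.a_id); it only compares results and makes no claim about speed on any particular database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (id INTEGER);
    CREATE TABLE b (a_id INTEGER);
    INSERT INTO a VALUES (1), (2), (3);
    INSERT INTO b VALUES (2), (NULL);
""")

queries = {
    # NOT IN, guarded against NULLs in the subquery
    "not_in": "SELECT id FROM a WHERE id NOT IN "
              "(SELECT a_id FROM b WHERE a_id IS NOT NULL) ORDER BY id",
    # outer join, keeping only the unmatched rows
    "left_join": "SELECT a.id FROM a LEFT OUTER JOIN b ON a.id = b.a_id "
                 "WHERE b.a_id IS NULL ORDER BY a.id",
    # correlated NOT EXISTS
    "not_exists": "SELECT id FROM a WHERE NOT EXISTS "
                  "(SELECT * FROM b WHERE b.a_id = a.id) ORDER BY id",
    # set difference
    "except": "SELECT id FROM a EXCEPT SELECT a_id FROM b ORDER BY id",
}

results = {name: conn.execute(sql).fetchall() for name, sql in queries.items()}
for name, rows in results.items():
    print(name, rows)  # each prints [(1,), (3,)]
```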

How can I implement SQL INTERSECT and MINUS operations in MS Access

I have researched and haven't found a way to run INTERSECT and MINUS operations in MS Access. Does any way exist?
INTERSECT is an inner join. MINUS is an outer join, where you choose only the records that don't exist in the other table.
INTERSECT
select distinct
a.*
from
a
inner join b on a.id = b.id
MINUS
select distinct
a.*
from
a
left outer join b on a.id = b.id
where
b.id is null
If you edit your original question and post some sample data then an example can be given.
EDIT: Forgot to add in the distinct to the queries.
INTERSECT is NOT an INNER JOIN. They're different. An INNER JOIN will give you duplicate rows in cases where INTERSECT will not. You can get equivalent results by:
SELECT DISTINCT a.*
FROM a
INNER JOIN b
on a.PK = b.PK
Note that PK must be the primary key column or columns. If there is no PK on the table (BAD!), you must write it like so:
SELECT DISTINCT a.*
FROM a
INNER JOIN b
ON a.Col1 = b.Col1
AND a.Col2 = b.Col2
AND a.Col3 = b.Col3 ...
With MINUS, you can do the same thing, but with a LEFT JOIN, and a WHERE condition checking for null on one of table b's non-nullable columns (preferably the primary key).
SELECT DISTINCT a.*
FROM a
LEFT JOIN b
on a.PK = b.PK
WHERE b.PK IS NULL
That should do it.
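The claim can be verified on an engine that does have both operators. The sketch below uses Python's sqlite3 module (SQLite calls MINUS "EXCEPT") with hypothetical single-column tables that contain duplicates, and compares the native set operations with the join-based rewrites.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (id INTEGER);
    CREATE TABLE b (id INTEGER);
    INSERT INTO a VALUES (1), (2), (2), (3);
    INSERT INTO b VALUES (2), (2), (4);
""")

def run(sql):
    return conn.execute(sql).fetchall()

# Native set operations, for reference.
intersect_rows = run("SELECT id FROM a INTERSECT SELECT id FROM b ORDER BY id")
except_rows = run("SELECT id FROM a EXCEPT SELECT id FROM b ORDER BY id")

# A plain INNER JOIN multiplies the duplicates: 2 rows x 2 rows = 4 rows.
plain_join_rows = run("SELECT a.id FROM a INNER JOIN b ON a.id = b.id")

# Join-based rewrites that work without INTERSECT / MINUS.
distinct_join_rows = run(
    "SELECT DISTINCT a.id FROM a INNER JOIN b ON a.id = b.id ORDER BY a.id")
anti_join_rows = run(
    "SELECT DISTINCT a.id FROM a LEFT JOIN b ON a.id = b.id "
    "WHERE b.id IS NULL ORDER BY a.id")

print(intersect_rows, distinct_join_rows)  # [(2,)] [(2,)]
print(except_rows, anti_join_rows)         # [(1,), (3,)] [(1,), (3,)]
print(len(plain_join_rows))                # 4
```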
They're done through JOINs. The old fashioned way :)
For INTERSECT, you can use an INNER JOIN. Pretty straightforward. You just need to use a GROUP BY or DISTINCT if you don't have a pure one-to-one relationship going on. Otherwise, as others have mentioned, you can get more results than you'd expect.
For MINUS, you can use a LEFT JOIN and use the WHERE to limit it so you're only getting back rows from your main table that don't have a match with the LEFT JOINed table.
Easy peasy.
Unfortunately MINUS is not supported in MS Access - one workaround would be to create three queries, one with the full dataset, one that pulls the rows you want to filter out, and a third that left joins the two tables and only pulls records that only exist in your full dataset.
Same thing goes for INTERSECT, except you would be doing it via an inner join and only returning records that exist in both.
No MINUS in Access, but you can use a subquery.
SELECT DISTINCT a.*
FROM a
WHERE a.PK NOT IN (SELECT DISTINCT b.pk FROM b)
I believe this one does the MINUS
SELECT DISTINCT
a.CustomerID,
b.CustomerID
FROM
tblCustomers a
LEFT JOIN
[Copy Of tblCustomers] b
ON
a.CustomerID = b.CustomerID
WHERE
b.CustomerID IS NULL