Proper left join of three tables in SQL

Update: OK, all three answers make sense. I'm going to try each of them, as I am curious whether there is a performance difference, but I'm not sure I have enough test data in my tables to determine that.
I am trying to look at Table A and check whether a user exists in Table B or Table C, so as to find anyone from Table A who does not exist in at least one of the other two tables (they do not need to exist in both, just B or C).
Something like this, but without having to write two almost identical statements like the ones below:
SELECT emp_id
FROM
tableA
LEFT JOIN
TableB
ON
tableA.emp_id = tableB.emp_id
WHERE
TableA.emp_id IS NULL
SELECT emp_id
FROM
tableA
LEFT JOIN
TableC
ON
tableA.emp_id = tableC.emp_id
WHERE
TableA.emp_id IS NULL
Table A
+---------+--------+-----------+
| Emp_ID | Status | hire_date |
+---------+--------+-----------+
| 12345 | happy | 10/10/2005|
| 54321 | sad | 12/01/2009|
+---------+--------+-----------+
Table B
+---------+--------+
| Emp_ID | Weight |
+---------+--------+
| 12345 | 185 |
| 54321 | 150 |
+---------+--------+
Table C
+---------+--------+
| Emp_ID | City |
+---------+--------+
| 12345 | Chicago|
| 54321 | Atlanta|
+---------+--------+
Thanks for any suggestions!

You can join all tables in a single query.
SELECT a.Emp_ID -- a.* <<== if you want to include all columns
FROM tbA a
LEFT JOIN tbB b
ON a.Emp_ID = b.Emp_ID
LEFT JOIN tbC c
ON a.Emp_ID = c.Emp_ID
WHERE b.Emp_ID IS NULL
AND c.Emp_ID IS NULL -- <<== AND should be used here

Why not just express the query using not in?
SELECT emp_id
FROM tableA
WHERE emp_id not in (select emp_id from TableB) and
emp_id not in (select emp_id from TableC);
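One caveat worth checking before relying on NOT IN: if the subquery can return a NULL emp_id, NOT IN yields no rows at all, whereas NOT EXISTS is unaffected. The following is a minimal sketch using Python's built-in sqlite3 module. The table names follow the question, but the rows (including the NULL in tableB) are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tableA (emp_id INTEGER);
    CREATE TABLE tableB (emp_id INTEGER);
    INSERT INTO tableA VALUES (12345), (54321), (99999);
    -- Hypothetical data: tableB contains one NULL emp_id.
    INSERT INTO tableB VALUES (12345), (NULL);
""")

# NOT IN: "x NOT IN (12345, NULL)" is never true, so no rows come back.
not_in_rows = conn.execute("""
    SELECT emp_id FROM tableA
    WHERE emp_id NOT IN (SELECT emp_id FROM tableB)
    ORDER BY emp_id
""").fetchall()

# NOT EXISTS: the NULL row never matches, so missing employees are still found.
not_exists_rows = conn.execute("""
    SELECT a.emp_id FROM tableA a
    WHERE NOT EXISTS (SELECT 1 FROM tableB b WHERE b.emp_id = a.emp_id)
    ORDER BY a.emp_id
""").fetchall()

print(not_in_rows)      # []
print(not_exists_rows)  # [(54321,), (99999,)]
```

If emp_id is declared NOT NULL in both lookup tables, the two forms return the same rows.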

You can join the three tables simply, as below:
SELECT a.emp_id FROM table1 a LEFT JOIN table2 b ON a.emp_id = b.emp_id
LEFT JOIN table3 c ON c.emp_id = a.emp_id

Your query can't work because you have WHERE tableA.emp_id IS NULL, and tableA is the table whose emp_id you want to test. You should have tested TableB.emp_id IS NULL in the first query and TableC.emp_id IS NULL in the second.
Since you want rows that do not exist in at least one of tableB or tableC, you can LEFT JOIN both tableB and tableC and test whether at least one of the emp_id values from those tables IS NULL, using OR:
SELECT tableA.emp_id
FROM tableA
LEFT JOIN TableB ON tableA.emp_id = tableB.emp_id
LEFT JOIN TableC ON tableA.emp_id = tableC.emp_id
WHERE
TableB.emp_id IS NULL
OR TableC.emp_id IS NULL
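The answers above differ only in whether the two IS NULL tests are combined with AND or OR, and the two give different results. Here is a small sketch with Python's sqlite3 module and invented rows (employees 1 to 4 are assumptions, not the asker's data) that shows the difference:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tableA (emp_id INTEGER PRIMARY KEY);
    CREATE TABLE tableB (emp_id INTEGER);
    CREATE TABLE tableC (emp_id INTEGER);
    -- Hypothetical data: 1 is in both, 2 only in B, 3 only in C, 4 in neither.
    INSERT INTO tableA VALUES (1), (2), (3), (4);
    INSERT INTO tableB VALUES (1), (2);
    INSERT INTO tableC VALUES (1), (3);
""")

query = """
    SELECT a.emp_id
    FROM tableA a
    LEFT JOIN tableB b ON a.emp_id = b.emp_id
    LEFT JOIN tableC c ON a.emp_id = c.emp_id
    WHERE b.emp_id IS NULL {op} c.emp_id IS NULL
    ORDER BY a.emp_id
"""

# AND: employees missing from both B and C.
missing_from_both = [row[0] for row in conn.execute(query.format(op="AND"))]
# OR: employees missing from at least one of B and C.
missing_from_either = [row[0] for row in conn.execute(query.format(op="OR"))]

print(missing_from_both)    # [4]
print(missing_from_either)  # [2, 3, 4]
```

Which operator is right depends on how "does not exist in at least one of the other two tables" is meant to be read.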

Related

Trying to write an inner join to filter out some conditions

I'm currently struggling with carrying out some joins and hoping someone can shed some light on this.
I have three tables: A,B,C
Table C lists names of individuals
Table A lists the food they like to eat
Table B is the link to show what food in A a person likes from C (Our system was built without foreign keys! I know, it's a pain!)
What I'm trying to write is a query that will return a list of values from Table C which shows the individuals that don't like a specific food...say PFC
I have the following:
select * from table_c c
inner join table_b b
on c.name = b.bValue
inner join table_a a
on b.aValue = a.number
where a.value not in('PFC')
I'm assuming the joins are working but as table A has multiple values, the two extra rows are being returned. Is it possible to not show this client if one of the joins shows a food I don't want to see?
Table A
|---------------------|------------------|
| Number | Value |
|---------------------|------------------|
| 1 | McDs |
|---------------------|------------------|
| 1 | KFC |
|---------------------|------------------|
| 1 | PFC |
|---------------------|------------------|
Table B
|---------------------|------------------|
| bValue | aValue |
|---------------------|------------------|
| John | 1 |
|---------------------|------------------|
Table C
|---------------------|
| Name |
|---------------------|
| John |
|---------------------|
I'm also using SQL Server 2013 if that makes a difference!
With NOT EXISTS:
select * from table_c c
where not exists (
select 1 from table_b b inner join table_a a
on b.aValue = a.number
where b.bValue = c.name and a.value = 'PFC'
)
One option is to aggregate by name:
SELECT
c.Name
FROM table_c c
INNER JOIN table_b b
ON c.Name = b.bValue
INNER JOIN table_a a
ON b.aValue = a.Number
GROUP BY
c.Name
HAVING
COUNT(CASE WHEN a.Value = 'PFC' THEN 1 END) = 0;
We could also try expressing this using an exists query:
SELECT
c.Name
FROM table_c c
WHERE NOT EXISTS (SELECT 1 FROM table_b b
INNER JOIN table_a a
ON b.aValue = a.Number
WHERE c.Name = b.bValue AND
a.Value = 'PFC');
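To see why filtering the joined rows is not enough, here is a hedged sketch with Python's sqlite3 module. It uses the tables from the question plus one extra, hypothetical person (Mary, who only likes McDs) so that the result is not empty:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table_a (number INTEGER, value TEXT);
    CREATE TABLE table_b (bValue TEXT, aValue INTEGER);
    CREATE TABLE table_c (name TEXT);
    INSERT INTO table_a VALUES (1, 'McDs'), (1, 'KFC'), (1, 'PFC');
    INSERT INTO table_b VALUES ('John', 1);
    INSERT INTO table_c VALUES ('John');
    -- Hypothetical extra person who does not like PFC.
    INSERT INTO table_a VALUES (2, 'McDs');
    INSERT INTO table_b VALUES ('Mary', 2);
    INSERT INTO table_c VALUES ('Mary');
""")

# Filtering joined rows only drops John's PFC row; his other rows remain.
join_names = [row[0] for row in conn.execute("""
    SELECT c.name
    FROM table_c c
    INNER JOIN table_b b ON c.name = b.bValue
    INNER JOIN table_a a ON b.aValue = a.number
    WHERE a.value NOT IN ('PFC')
    ORDER BY c.name
""")]

# NOT EXISTS removes a person entirely if any of their foods is PFC.
not_exists_names = [row[0] for row in conn.execute("""
    SELECT c.name
    FROM table_c c
    WHERE NOT EXISTS (
        SELECT 1
        FROM table_b b
        INNER JOIN table_a a ON b.aValue = a.number
        WHERE b.bValue = c.name AND a.value = 'PFC'
    )
    ORDER BY c.name
""")]

print(join_names)        # ['John', 'John', 'Mary']
print(not_exists_names)  # ['Mary']
```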

Oracle - Conditional Join eliminating additional joins where unnecessary

Let’s say I have a simplified table structure as follows:
Table A - ID (PK)
Table B - ID (PK), AID = FK to Table A
Table C - ID (PK), BID = FK to Table B
Table D - ID (PK), CID = FK to Table C
Query something like so:
SELECT * FROM TABLE_A TBLA
LEFT JOIN TABLE_B TBLB ON TBLA.ID = TBLB.AID
LEFT JOIN TABLE_C TBLC ON TBLB.ID = TBLC.BID
LEFT JOIN TABLE_D TBLD ON TBLC.ID = TBLD.CID
It’s relatively straightforward, but what I want is a kind of conditional join: I want all records from TABLE A, but I only want to join TABLE B -> TABLE C -> TABLE D if the first join between TABLE A and TABLE B is satisfied. Bear in mind that I could change the TABLE B -> TABLE C -> TABLE D joins to INNER, since those rows will exist whenever the initial join between TABLE A and TABLE B is satisfied.
I’d also need a WHERE condition on TABLE D.
So essentially I want to eliminate the LEFT JOINs between TABLE_B, TABLE_C and TABLE_D where the join between TABLE_A and TABLE_B isn’t satisfied.
Very simplified data so apologies!
| Table A |
| ID |
| 1 |
| 2 |
| 3 |
| 4 |
| 5 |
| Table B |
| ID | AID |
| 1 | 5 |
| Table C |
| ID | BID |
| 1 | 1 |
| Table D |
| ID | CID | Value |
| 1 | 1 | ABC |
The reason I want to eliminate the join is that for 4 of the 5 rows in Table A, I’m doing unnecessary joins across three tables to get the value in Table D.
You can use brackets (parentheses) around the joins, though I'm not sure whether that would help you:
SELECT * FROM
TABLE_A TBLA
LEFT JOIN (TABLE_B TBLB
INNER JOIN TABLE_C TBLC ON TBLB.ID = TBLC.BID
INNER JOIN TABLE_D TBLD ON TBLC.ID = TBLD.CID) ON TBLA.ID = TBLB.AID
This way, when a row in A matches an entry in B but that B row has no match further down the chain in C and D, the B data is not retrieved either, and the same goes for C.
You mentioned that you need a WHERE condition on table D. Does that mean you always have to link to D? Note that a condition on D in the WHERE clause has to be satisfied by every returned row. Hence, when no records are retrieved from D, no records will be retrieved by the query (even with your initial outer join, unless you add OR ... IS NULL).
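The point about the condition on D can be checked with a small sketch in Python's sqlite3 module, using the sample rows from the question. To keep the example portable, it swaps the parenthesized join for an equivalent derived table; the alias chain and the filter value 'ABC' are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table_a (id INTEGER PRIMARY KEY);
    CREATE TABLE table_b (id INTEGER PRIMARY KEY, aid INTEGER);
    CREATE TABLE table_c (id INTEGER PRIMARY KEY, bid INTEGER);
    CREATE TABLE table_d (id INTEGER PRIMARY KEY, cid INTEGER, value TEXT);
    INSERT INTO table_a VALUES (1), (2), (3), (4), (5);
    INSERT INTO table_b VALUES (1, 5);
    INSERT INTO table_c VALUES (1, 1);
    INSERT INTO table_d VALUES (1, 1, 'ABC');
""")

# B -> C -> D are inner-joined and filtered inside the derived table,
# then the whole chain is left-joined to A.
chain_join = """
    SELECT a.id, chain.value
    FROM table_a a
    LEFT JOIN (
        SELECT b.aid, d.value
        FROM table_b b
        INNER JOIN table_c c ON c.bid = b.id
        INNER JOIN table_d d ON d.cid = c.id
        WHERE d.value = 'ABC'
    ) AS chain ON chain.aid = a.id
"""

# Condition on D kept inside the chain: every row of A survives.
all_a_rows = conn.execute(chain_join + " ORDER BY a.id").fetchall()
# Same condition in the outer WHERE: rows of A without a D match are dropped.
filtered_rows = conn.execute(
    chain_join + " WHERE chain.value = 'ABC' ORDER BY a.id"
).fetchall()

print(all_a_rows)     # [(1, None), (2, None), (3, None), (4, None), (5, 'ABC')]
print(filtered_rows)  # [(5, 'ABC')]
```

Whether the optimizer actually skips the chain for unmatched rows of A depends on the database and its plan, so this only illustrates the result, not the cost.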

How do I join these tables when the first join needs to account for a NULL?

TableA
matl
-------
itemA
itemB

TableB
matl  | job  | suffix
----------------------------
itemA | jobA | suffixA
NULL  | NULL | NULL

TableC
job  | suffix
--------------
jobA | suffixA

After joining, I need to return:
Query
matl  | job  | suffix
---------------------------
itemA | jobA | suffixA
itemB | NULL | NULL
How would I write my query to use the information in TableA to return information from TableC, and return NULL if there is no information in TableC after joining?
SELECT A.matl,
B.job,
B.suffix
FROM TableA A
LEFT JOIN
( SELECT *
FROM TableB
WHERE ( job,suffix ) IN ( SELECT job,suffix FROM TableC )
) B
ON A.matl = B.matl;
Use LEFT JOIN
SELECT *
FROM TableA A
LEFT JOIN TableB B
ON A.matl = B.matl
LEFT JOIN TableC C
ON B.job = C.job
AND B.suffix = C.suffix
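As a quick check of the chained LEFT JOIN, here is a minimal sketch with Python's sqlite3 module. It assumes TableB simply has no row for itemB (one reading of the sample data) and selects only the three columns the asker wants:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE TableA (matl TEXT);
    CREATE TABLE TableB (matl TEXT, job TEXT, suffix TEXT);
    CREATE TABLE TableC (job TEXT, suffix TEXT);
    INSERT INTO TableA VALUES ('itemA'), ('itemB');
    -- Assumption: itemB has no row in TableB at all.
    INSERT INTO TableB VALUES ('itemA', 'jobA', 'suffixA');
    INSERT INTO TableC VALUES ('jobA', 'suffixA');
""")

rows = conn.execute("""
    SELECT A.matl, C.job, C.suffix
    FROM TableA A
    LEFT JOIN TableB B
        ON A.matl = B.matl
    LEFT JOIN TableC C
        ON B.job = C.job
        AND B.suffix = C.suffix
    ORDER BY A.matl
""").fetchall()

print(rows)  # [('itemA', 'jobA', 'suffixA'), ('itemB', None, None)]
```

Because both joins are LEFT JOINs, a miss at either step leaves the TableC columns NULL instead of dropping the TableA row.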

left join not giving correct output

I have two tables:
tableA
Accountid
-----------
10
11
12
tableB
Accountid | Date |
--------------------------------------------
10 | 2016-02-02 |
11 | 2016-02-02 |
11 | 2016-02-02 |
15 | 2016-02-03 |
I am expecting output like this:
Accountid | ID |
------------------------------------
10 | 10 |
11 | 11 |
12 | NULL |
I am running this query
select A.accountid,b.accountid as ID
from tableA as A
left join tableB as B on a.accountid=b.accountid
where b.date between '2016-02-02' and '2016-02-02'
but it gives me the output below, and I am not sure where I am going wrong:
Accountid | ID |
-----------------------------------
10 | 10 |
11 | 11 |
I am using MSSQL database.
When a column of the right-hand table of a LEFT JOIN is filtered in the WHERE clause with a condition that NULL cannot satisfy, the join behaves like an INNER JOIN.
To get the expected result, your query should look like this:
select A.accountid,b.accountid as ID
from tableA as A
left join tableB as B on a.accountid=b.accountid and
b.date between '2016-02-02' and '2016-02-02'
Try this:
select A.accountid,b.accountid as ID
from tableA as A left join tableB as B on a.accountid=b.accountid
where b.date is null or b.date between '2016-02-02' and '2016-02-02'
The reason is, for AccountID 12 b.date is effectively null (because there's no row in tableB). Therefore you'll only get a result for that row if you allow date to be null in the query.
If no row in b exists (id = 12), b.date is NULL and your WHERE condition is not satisfied.
If you want to see the row with id = 12, you must move your test (b.date between '2016-02-02' and '2016-02-02') into the ON clause:
select A.accountid,b.accountid as ID from tableA as A left join tableB as B
on a.accountid=b.accountid and b.date between '2016-02-02' and '2016-02-02'
The reason is that the WHERE clause of your SELECT is applied after the LEFT JOIN.
So SQL Server first produces the joined rows as you expect, including the row 12-NULL,
and then that row is filtered out of the output by your WHERE clause.
You can move the date filter into the JOIN condition, as suggested by @JaydipJ and @RémyBaron,
or filter tableB before the JOIN, this way:
select A.accountid,b.accountid as ID
from tableA as A
left join (
select *
from tableB
where tableB.date between '2016-02-02' and '2016-02-02'
) as B on a.accountid=b.accountid
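The difference between filtering in WHERE and filtering in ON can be reproduced with the asker's data. This is a sketch using Python's sqlite3 module rather than SQL Server, with dates stored as ISO text (an assumption made for portability):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tableA (accountid INTEGER);
    CREATE TABLE tableB (accountid INTEGER, date TEXT);
    INSERT INTO tableA VALUES (10), (11), (12);
    INSERT INTO tableB VALUES
        (10, '2016-02-02'), (11, '2016-02-02'),
        (11, '2016-02-02'), (15, '2016-02-03');
""")

# Filter in WHERE: the 12/NULL row is produced by the join, then removed.
where_rows = conn.execute("""
    SELECT a.accountid, b.accountid AS id
    FROM tableA AS a
    LEFT JOIN tableB AS b ON a.accountid = b.accountid
    WHERE b.date BETWEEN '2016-02-02' AND '2016-02-02'
    ORDER BY a.accountid
""").fetchall()

# Filter in ON: unmatched rows of tableA are kept with NULLs.
on_rows = conn.execute("""
    SELECT a.accountid, b.accountid AS id
    FROM tableA AS a
    LEFT JOIN tableB AS b
        ON a.accountid = b.accountid
        AND b.date BETWEEN '2016-02-02' AND '2016-02-02'
    ORDER BY a.accountid
""").fetchall()

# DISTINCT collapses the duplicate match for account 11.
distinct_rows = conn.execute("""
    SELECT DISTINCT a.accountid, b.accountid AS id
    FROM tableA AS a
    LEFT JOIN tableB AS b
        ON a.accountid = b.accountid
        AND b.date BETWEEN '2016-02-02' AND '2016-02-02'
    ORDER BY a.accountid
""").fetchall()

print(where_rows)     # [(10, 10), (11, 11), (11, 11)]
print(on_rows)        # [(10, 10), (11, 11), (11, 11), (12, None)]
print(distinct_rows)  # [(10, 10), (11, 11), (12, None)]
```

Note that account 11 appears twice in tableB, so the plain join returns it twice; DISTINCT is one way to get the single row per account shown in the expected output.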

Select first record in a One-to-Many relation using left join

I'm trying to join two tables using a left join, and the result set has to include only the first record from the "right" joined table.
Let's say I have two tables, A and B, as below:
Table "A"
code | emp_no
101 | 12222
102 | 23333
103 | 34444
104 | 45555
105 | 56666
Table "B"
code | city | county
101 | Glen Oaks | Queens
101 | Astoria | Queens
101 | Flushing | Queens
102 | Ridgewood | Brooklyn
103 | Bayside | New York
Expected Output:
code | emp_no | city | county
101 | 12222 | Glen Oaks | Queens
102 | 23333 | Ridgewood | Brooklyn
103 | 34444 | Bayside | New York
104 | 45555 | NULL | NULL
105 | 56666 | NULL | NULL
As you can see, my result has only one matched record from table "B" per code (it doesn't matter which record is matched) after the left join, even though it is a one-to-many mapping.
I need to pick the first matched record from table B and ignore all other rows.
Please help!
Thanks
After playing around a bit, this turns out to be trickier than I'd expected! Assuming that table_b has some single column that is unique (say, a single-field primary key), it looks like you can do this:
SELECT table_a.code,
table_a.emp_no,
table_b.city,
table_b.county
FROM table_a
LEFT
JOIN table_b
ON table_b.code = table_a.code
AND table_b.field_that_is_unique =
( SELECT TOP 1
field_that_is_unique
FROM table_b
WHERE table_b.code = table_a.code
)
;
Another option: OUTER APPLY
If supported by the database, OUTER APPLY is an efficient and terse option.
SELECT *
FROM
Table_A a
OUTER APPLY
(SELECT TOP 1 *
FROM Table_B b_1
WHERE b_1.code = a.code
) b
;
This results in a left join to the indeterminate first matched record. My tests show it to be quicker than any other posted solution (on MS SQL Server 2012).
The highest voted answer does not seem correct to me, and seems overcomplicated.
Just group by the code field on table B in your subquery and select the maximum Id per grouping.
SELECT
table_a.code,
table_a.emp_no,
table_b.city,
table_b.county
FROM
table_a
LEFT JOIN
table_b
ON table_b.code = table_a.code
AND table_b.field_that_is_unique IN
(SELECT MAX(field_that_is_unique)
FROM table_b
GROUP BY table_b.code)
If you are on SQL Server 2005 or later version, you could use ranking to achieve what you want. In particular, ROW_NUMBER() seems to suit your needs nicely:
WITH B_ranked AS (
SELECT
*,
rnk = ROW_NUMBER() OVER (PARTITION BY code ORDER BY city)
FROM B
)
SELECT
A.code,
A.emp_no,
B.city,
B.county
FROM A
LEFT JOIN B_ranked AS B ON A.code = B.code AND b.rnk = 1
Or, alternatively:
WITH B_unique_code AS (
select * from(
SELECT
*,
rnk = ROW_NUMBER() OVER (PARTITION BY code ORDER BY city)
FROM B
) AS s
where rnk = 1
)
SELECT
A.code,
A.emp_no,
B.city,
B.county
FROM A
LEFT JOIN B_unique_code AS B ON A.code = B.code
I modified the answer from ruakh and this seems to work perfectly with MySQL.
SELECT
a.code,
a.emp_no,
b.city,
b.county
FROM table_a a
LEFT JOIN table_b b
ON b.code = a.code
AND b.id = ( SELECT b2.id FROM table_b b2
WHERE b2.code = a.code
LIMIT 1
)
;
This is how:
Select * From TableA a
Left Join TableB b
On b.Code = a.Code
And [Here put criteria predicate that 'defines' what the first record is]
If the combination of city and county is unique, you can use that:
Select * From TableA a
Left Join TableB b
On b.Code = a.Code
And b.City + b.county =
(Select Min(city + county)
From TableB
Where Code = b.Code)
But the point is you have to put some expression in there to tell the query processor what it means to be first.
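One way to make "first" explicit is to pick the smallest value of a unique column per code. The sketch below uses Python's sqlite3 module with the question's rows; the id column on table_b is hypothetical (the question does not show one) and stands in for any unique key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table_a (code INTEGER, emp_no INTEGER);
    -- The id column is an assumption; the question shows no unique key.
    CREATE TABLE table_b (id INTEGER PRIMARY KEY, code INTEGER,
                          city TEXT, county TEXT);
    INSERT INTO table_a VALUES
        (101, 12222), (102, 23333), (103, 34444), (104, 45555), (105, 56666);
    INSERT INTO table_b VALUES
        (1, 101, 'Glen Oaks', 'Queens'),
        (2, 101, 'Astoria', 'Queens'),
        (3, 101, 'Flushing', 'Queens'),
        (4, 102, 'Ridgewood', 'Brooklyn'),
        (5, 103, 'Bayside', 'New York');
""")

# "First" is defined here as the row with the lowest id for each code.
first_rows = conn.execute("""
    SELECT a.code, a.emp_no, b.city, b.county
    FROM table_a a
    LEFT JOIN table_b b
        ON b.code = a.code
        AND b.id = (SELECT MIN(b2.id)
                    FROM table_b b2
                    WHERE b2.code = a.code)
    ORDER BY a.code
""").fetchall()

for row in first_rows:
    print(row)
# (101, 12222, 'Glen Oaks', 'Queens')
# (102, 23333, 'Ridgewood', 'Brooklyn')
# (103, 34444, 'Bayside', 'New York')
# (104, 45555, None, None)
# (105, 56666, None, None)
```

Swapping MIN for MAX, or ordering by another column, changes which row counts as first without changing the shape of the query.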
In Oracle you can do:
WITH first_b AS (SELECT code, min(rowid) AS rid FROM b GROUP BY code)
SELECT a.code, a.emp_no, b.city, b.county
FROM a
LEFT JOIN first_b
ON first_b.code = a.code
LEFT JOIN b
ON b.rowid = first_b.rid