SQL: trying to do a JOIN to include results from multiple tables

I'm a complete novice teaching myself SQL by writing and modifying a few queries and reports at work.
I've got something of a handle on the various types of JOINs and I've used INNER JOIN a few times with decent success.
What I'm stuck on should be a simple task, but my Google-Fu must be weak. Here's what I'm trying to do.
Say I have 3 tables, Table_A, Table_B, and Table_C, and each table has a column called [Serial_Number].
What I'm wanting to select is 3 of the other columns if A.Serial_Number = B.Serial_Number OR C.Serial_Number.
I've tried doing:
SELECT
*
FROM
Table_A AS A
INNER JOIN Table_B AS B ON A.Serial_Number = B.Serial_Number
INNER JOIN Table_C AS C ON A.Serial_Number = C.Serial_Number
But this always yields 0 results as the nature of the data dictates that if A matches B, it will never match C and vice versa. I also tried a LEFT OUTER JOIN as the second clause, but this just includes NULLs from Table_C that have already matched on Table_B.
All the searches I have done relating to JOINs on multiple tables seem to be about using JOINS to further exclude records, where I'm actually wanting to INCLUDE more records.
Like I said, I'm sure this is really simple; I just need a nudge in the right direction.
Thanks!

The use of two inner joins here is akin to saying
If A.Serial_Number = B.Serial_Number AND
A.Serial_Number = C.Serial_Number
Using a left outer join for the second clause (by which I presume you mean the second join) performs the left join on a result set already filtered to A.Serial_Number = B.Serial_Number by the first inner join. Given that B.Serial_Number never matches C.Serial_Number, you wouldn't expect that equijoin to return anything from tablec.
What you want is a left outer join like you tried but for both tableb and tablec.
Select *
From tablea
Left join tableb on tableb.Serial_Number = tablea.Serial_Number
Left join tablec on tablec.Serial_Number = tablea.Serial_Number
This way, regardless of whether tablea.Serial_Number is in tableb, the row is still returned and thus available to be joined to tablec.
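A minimal sketch of the difference, run against an in-memory SQLite database through Python's sqlite3 module. The sample rows and the columns Note, B_Info and C_Info are made up for illustration. The WHERE clause and COALESCE are additions that are not in the answer above: two LEFT JOINs alone also return rows of Table_A that match neither table, so the filter keeps only rows that matched B or C.

```python
import sqlite3

# Hypothetical sample data: 'S1' matches only B, 'S2' matches only C,
# and 'S3' matches neither table.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Table_A (Serial_Number TEXT, Note TEXT);
    CREATE TABLE Table_B (Serial_Number TEXT, B_Info TEXT);
    CREATE TABLE Table_C (Serial_Number TEXT, C_Info TEXT);
    INSERT INTO Table_A VALUES ('S1', 'a1'), ('S2', 'a2'), ('S3', 'a3');
    INSERT INTO Table_B VALUES ('S1', 'from B');
    INSERT INTO Table_C VALUES ('S2', 'from C');
""")

# Two inner joins: a row must match B AND C, which never happens here.
inner_rows = conn.execute("""
    SELECT A.Serial_Number
    FROM Table_A AS A
    INNER JOIN Table_B AS B ON A.Serial_Number = B.Serial_Number
    INNER JOIN Table_C AS C ON A.Serial_Number = C.Serial_Number
""").fetchall()

# Two left joins: every A row survives; the WHERE keeps rows matching B OR C.
left_rows = conn.execute("""
    SELECT A.Serial_Number, COALESCE(B.B_Info, C.C_Info) AS Info
    FROM Table_A AS A
    LEFT JOIN Table_B AS B ON A.Serial_Number = B.Serial_Number
    LEFT JOIN Table_C AS C ON A.Serial_Number = C.Serial_Number
    WHERE B.Serial_Number IS NOT NULL OR C.Serial_Number IS NOT NULL
    ORDER BY A.Serial_Number
""").fetchall()

print(inner_rows)  # []
print(left_rows)   # [('S1', 'from B'), ('S2', 'from C')]
```

Dropping the WHERE clause would also return 'S3' with a NULL Info value.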

Agreed. The two inner joins require each row in A to match both B and C, which never happens in your data, so you get 0 rows. I would suggest replacing the INNER JOINs with LEFT JOINs.

Related

Filter on the column on which two tables are joined

Are the next two queries going to return the same result set?
SELECT * FROM tableA a
JOIN tableB b
ON a.id = b.id
WHERE a.id = '5'
--------------------------------
SELECT * FROM tableA a
JOIN tableb b
ON a.id = b.id
WHERE b.id = '5'
Also, will the answer be different if LEFT JOIN is used instead of JOIN?
As written, they will return the same result.
The two will not necessarily return the same result with a left join.
Yes, the result will be the same.
With a left join you get every row from the left table, plus the matching rows from the right table (or NULLs where there is no match).
With a join (inner join) you get only the rows where a.id = b.id.
This site explains how joins work: https://www.w3schools.com/sql/sql_join.asp
Yes, they will. A plain JOIN is an inner join by default. It keeps only rows where the value you're joining on exists in both tables. Since you're joining on a.id = b.id, the results will be the same.
If you change the join to a LEFT JOIN, the first query still returns the a.id = '5' row when tableB has no match (with NULLs in b's columns), while the second query's WHERE b.id = '5' discards that unmatched row.
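To check the inner versus left join behaviour concretely, here is a small sketch using Python's sqlite3 module. The sample rows are hypothetical (id '5' exists only in tableA, id '7' in both), and the helper run is just a convenience for this illustration.

```python
import sqlite3

# Hypothetical data: id '5' exists only in tableA, id '7' exists in both tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tableA (id TEXT);
    CREATE TABLE tableB (id TEXT);
    INSERT INTO tableA VALUES ('5'), ('7');
    INSERT INTO tableB VALUES ('7');
""")

def run(join_type, filter_column, value):
    """Run the query from the question with the given join type and filter column."""
    sql = f"""
        SELECT a.id, b.id
        FROM tableA a
        {join_type} JOIN tableB b ON a.id = b.id
        WHERE {filter_column} = ?
    """
    return conn.execute(sql, (value,)).fetchall()

# Inner join: filtering on a.id or b.id gives the same rows.
print(run("INNER", "a.id", "5"), run("INNER", "b.id", "5"))  # [] []
print(run("INNER", "a.id", "7"), run("INNER", "b.id", "7"))  # [('7', '7')] [('7', '7')]

# Left join: the unmatched row survives only when the filter is on a.id.
print(run("LEFT", "a.id", "5"))  # [('5', None)]
print(run("LEFT", "b.id", "5"))  # []
```

With the left join, the WHERE on b.id compares against NULL for the unmatched row, so that row is filtered out.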

Want to understand a query for a view I'm trying to dissect

I'm confused by a bit of a query I'm working with.
select *
from Table1
inner join Table2
on Table1.id1 = Table2.id1
right outer join Table3
right outer join Table4
inner join Table5
on Table4.id1 = Table5.id1
on Table3.id1 = Table5.id2
on Table1.id2 = Table5.id3
I tried to keep the query as close to what I'm working with as I could.
I don't understand the joins without the ON and then the join with multiple ONs.
Are tables 3 and 4 not actually being joined until after table 5 is joined?
The following doesn't work: Table5.id1 and Table5.id2 produce the error 'The multi-part identifier "Table5.id_" could not be bound'.
select *
from Table1
inner join Table2
on Table1.id1 = Table2.id1
right outer join Table3
on Table3.id1 = Table5.id2
right outer join Table4
on Table4.id1 = Table5.id1
inner join Table5
on Table1.id2 = Table5.id3
Additionally, this version does run, because Table5 is joined first and that resolves the binding error, but I receive about 27k more records than wanted.
select *
from Table1
inner join Table2
on Table1.id1 = Table2.id1
inner join Table5
on Table1.id2 = Table5.id3
right outer join Table3
on Table3.id1 = Table5.id2
right outer join Table4
on Table4.id1 = Table5.id1
So at this point it's obvious the original query is built the way it is for a reason, but I still don't understand the logic behind it or what is actually happening.
Any help would be much appreciated.
What you have here are multiple nested joins. Before I explain, I'm going to reformat the query a bit to make it easier to see what's going on.
select *
from Table1
inner join Table2 on Table1.id1 = Table2.id1
right outer join Table3 -- Join B
    right outer join Table4 -- Join A
        inner join Table5 on Table4.id1 = Table5.id1
    on Table3.id1 = Table5.id2 -- ON clause for Join A
on Table1.id2 = Table5.id3 -- ON clause for Join B
Nested joins let you join two tables together then join that result to another set of records. Initially that doesn't sound terribly useful. That's just a regular join, right? Kinda. The difference is that only if the inner-most join succeeds does it attempt to join that row to the outer table. This isn't really useful at all if all you are doing is using inner joins. It becomes a lot more interesting if you are mixing inner and outer joins (more on that shortly).
I'll attempt to explain what's going on with this query, both in prose and in comments so hopefully between the two it will make sense.
First, the inner-most join here is an inner join between tables 4 and 5. Those tables are joined together first. That will give you a result set where each row in Table4 has at least one matching row in Table5 (according to whatever criteria exists in the on clause, in this case that Table4.id1 = Table5.id1). This implicitly filters out any rows from both Table4 and Table5 that don't have a match in the other table.
That result is then right joined to Table3 (on Table3.id1 = Table5.id2). Because it is a right join and the Table4/5 set is on the right-hand side, every row of the Table4/5 set is kept, along with its matching Table3 row (if present).
Then the Table1/2 combo is right joined to that whole result set (on Table1.id2 = Table5.id3). Again the right-hand side is preserved, so we end up with everything in the Table4/5 set, joined to Table3 where it matches and to a Table1/2 combo where it matches.
The ultimate result set is every row of the Table4/5 join, with 0 or more matching rows from Table3 and from Table1 and Table2 (if a Table1 row doesn't have a matching Table2 record, neither will be joined). I believe this is correct (too much staring at this without the ability to run the query means I may have confused myself, but the basic idea is correct).
So why this crazy syntax? Alternatives are kind of a pain too. You could use CTEs or apply statements, both of which are their own kind of fun (not necessarily hard, just not your vanilla SQL. I tried converting your query using those and I think I got reasonably close, then I confused myself into a corner because of poor naming of things then I gave up). So why do this? Well it means you can ensure that you can outer join two tables to a third table only if there are matches in the first two tables. Maybe a more concrete example would help?
Say you have 4 tables: Person, Order, OrderItem and OrderItemDiscount. You are tasked with getting back a result set that shows every order and highlights orders that contain a Figlewubbit on which a discount code was used. So you write this:
select *
from Person p
left join Order o on o.PersonId = p.PersonId
left join OrderItem oi on oi.OrderId = o.OrderId
and oi.ItemName = 'Figlewubbit'
left join OrderItemDiscount oid on oid.OrderItemId = oi.OrderItemId
Another way to write it would be this:
select *
from Person p
left join Order o on o.PersonId = p.PersonId
left join OrderItem oi
inner join OrderItemDiscount oid on oid.OrderItemId = oi.OrderItemId
on oi.OrderId = o.OrderId
and oi.ItemName = 'Figlewubbit'
The execution plan here will change. OrderItem and OrderItemDiscount will get joined together, then that set will get fed into the left join to Order. Each joined OrderItem and OrderItemDiscount row is effectively treated as a combined entity for the other joins. You won't get one without the other.
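A rough sketch of the two versions side by side, run through Python's sqlite3 module. Several things here are assumptions made for illustration: the table is named Orders because ORDER is a reserved word, the discount alias is d, the column DiscountCode and the sample rows are invented, and the nested join is written with parentheses (which SQLite accepts) rather than with the deferred ON clause used above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Person (PersonId INTEGER);
    CREATE TABLE Orders (OrderId INTEGER, PersonId INTEGER);
    CREATE TABLE OrderItem (OrderItemId INTEGER, OrderId INTEGER, ItemName TEXT);
    CREATE TABLE OrderItemDiscount (OrderItemId INTEGER, DiscountCode TEXT);
    INSERT INTO Person VALUES (1);
    INSERT INTO Orders VALUES (10, 1);
    INSERT INTO OrderItem VALUES (100, 10, 'Figlewubbit');  -- no discount row exists
""")

# Flat left joins: the order item appears even though it has no discount.
flat_rows = conn.execute("""
    SELECT p.PersonId, o.OrderId, oi.OrderItemId, d.DiscountCode
    FROM Person p
    LEFT JOIN Orders o ON o.PersonId = p.PersonId
    LEFT JOIN OrderItem oi ON oi.OrderId = o.OrderId
                          AND oi.ItemName = 'Figlewubbit'
    LEFT JOIN OrderItemDiscount d ON d.OrderItemId = oi.OrderItemId
""").fetchall()

# Nested join: OrderItem and OrderItemDiscount are inner joined first, so the
# item is attached to the order only when it also has a discount.
nested_rows = conn.execute("""
    SELECT p.PersonId, o.OrderId, oi.OrderItemId, d.DiscountCode
    FROM Person p
    LEFT JOIN Orders o ON o.PersonId = p.PersonId
    LEFT JOIN (OrderItem oi
               INNER JOIN OrderItemDiscount d ON d.OrderItemId = oi.OrderItemId)
           ON oi.OrderId = o.OrderId
          AND oi.ItemName = 'Figlewubbit'
""").fetchall()

print(flat_rows)    # [(1, 10, 100, None)]
print(nested_rows)  # [(1, 10, None, None)]
```

In both cases the order is still listed; only the nested form hides the undiscounted item.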
(I apologize if this example seems contrived. Nested joins are a weird beast. They have their uses (I've needed them once or twice). But coming up with a simple example that requires their use is quite hard. They are a very specialized tool that usually requires an equally specialized (and complicated) requirement to warrant their use. I highly recommend researching this some more and using simple versions of them first. Combining right joins and multiple nested joins even gives me a headache trying to parse it.)
Actually, tables 4 and 5 are joined first, then table 3 is joined. Here's execution plan:

SQL: difference between LEFT JOIN ... ON and LEFT JOIN ... ON ... WHERE

I have two SQL queries that join two tables together:
select top 100 a.XXX
,a.YYY
,a.ZZZ
,b.GGG
,b.JJJ
from table_01 a
left join table_02 b
on a.XXX = b.GGG
and b.JJJ = 'abc'
and a.YYY between '01/08/2009 13:18:00' and '12/08/2009 13:18:00'
select top 100 a.XXX
,a.YYY
,a.ZZZ
,b.GGG
,b.JJJ
from table_01 a
left join table_02 b
on a.XXX = b.GGG
where b.JJJ = 'abc'
and a.YYY between '01/08/2009 13:18:00' and '12/08/2009 13:18:00'
The outcome of them is different but I don't understand the reason why.
I would be grateful if I can get some help here.
Whenever you are using LEFT JOIN, all the conditions about the content of the right table should be in the ON clause, otherwise you are effectively converting your LEFT JOIN to an INNER JOIN.
The reason for that is that when a LEFT JOIN is used, all the rows from the left table will be returned. If they are matched by the right table, the values of the matching row(s) will be returned as well, but if they are not matched with any row on the right table, then the right table will return a row of null values.
Since comparing anything to NULL (even another NULL) never evaluates to true, a WHERE condition on a right-table column discards every unmatched row, so you are basically telling your database to return only the rows that are matched in both tables.
However, when the condition is in the ON clause, your database knows to treat it as part of the join condition.
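The effect can be seen with a stripped-down version of the two queries. This is a sketch using Python's sqlite3 module with made-up rows; the date filter and TOP 100 from the question are left out to keep it short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table_01 (XXX INTEGER);
    CREATE TABLE table_02 (GGG INTEGER, JJJ TEXT);
    INSERT INTO table_01 VALUES (1), (2), (3);
    INSERT INTO table_02 VALUES (1, 'abc'), (2, 'xyz');
""")

# Condition on the right table inside ON: all left rows are kept.
filter_in_on = conn.execute("""
    SELECT a.XXX, b.JJJ
    FROM table_01 a
    LEFT JOIN table_02 b ON a.XXX = b.GGG AND b.JJJ = 'abc'
    ORDER BY a.XXX
""").fetchall()

# Same condition in WHERE: unmatched rows have b.JJJ = NULL and are dropped.
filter_in_where = conn.execute("""
    SELECT a.XXX, b.JJJ
    FROM table_01 a
    LEFT JOIN table_02 b ON a.XXX = b.GGG
    WHERE b.JJJ = 'abc'
    ORDER BY a.XXX
""").fetchall()

print(filter_in_on)     # [(1, 'abc'), (2, None), (3, None)]
print(filter_in_where)  # [(1, 'abc')]
```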

Having problems with SQL Joins

Table A
Table B
I tried to use LEFT OUTER JOIN but it doesn't seem to work.
I want the query to extract all data from Table A with 0 as average score if there is no data yet for the specified parameter. Meaning, in Figure 3, it should have shown ID 2 with 0 on s. Can anyone help me figure out the solution?
You have the table names switched in the join. To keep all of Table A then it needs to be the table listed on the left side of the left join. Also anything that you want to only affect the output of table B, and not filter the entire results, should be moved to the left join on clause. Should be:
SELECT a.id,
Avg(Isnull(b.score, 0)) AS s
FROM a
LEFT OUTER JOIN b
ON a.id = b.id
AND b.kind = 'X'
GROUP BY a.id
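A small sketch of that query against made-up data, using Python's sqlite3 module. The sample rows are hypothetical, and COALESCE stands in for the two-argument ISNULL, which SQLite does not provide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (id INTEGER);
    CREATE TABLE b (id INTEGER, kind TEXT, score REAL);
    INSERT INTO a VALUES (1), (2);
    INSERT INTO b VALUES (1, 'X', 80), (1, 'X', 90), (2, 'Y', 70);
""")

# Filter on b.kind inside ON: id 2 is kept and averages to 0.
on_rows = conn.execute("""
    SELECT a.id, AVG(COALESCE(b.score, 0)) AS s
    FROM a
    LEFT OUTER JOIN b ON a.id = b.id AND b.kind = 'X'
    GROUP BY a.id
    ORDER BY a.id
""").fetchall()

# Filter on b.kind in WHERE: id 2 disappears from the result.
where_rows = conn.execute("""
    SELECT a.id, AVG(COALESCE(b.score, 0)) AS s
    FROM a
    LEFT OUTER JOIN b ON a.id = b.id
    WHERE b.kind = 'X'
    GROUP BY a.id
    ORDER BY a.id
""").fetchall()

print(on_rows)     # [(1, 85.0), (2, 0.0)]
print(where_rows)  # [(1, 85.0)]
```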

Need help with a sql query that has an inner and outer join

I really need help getting this query right. I can't share actual table and column names, but will try my best to lay out the problem simply.
Assume the following tables. The tables and keys CANNOT be changed. Period. I don't care if you think it's a bad design, this question isn't a design question, it's on SQL syntax.
Table A - Primary key named id1
Table B - Contains two foreign keys, TableA.id1 and Foo.id2(ignore Foo, it doesn't matter for this)
Table C - Contains two foreign keys, TableA.id1 and Foo.id2, plus additional interesting columns.
Constraints:
The SQL gets a set of id1s passed in as an argument.
It must return a list of Table C rows.
It must only return Table C rows where a Table B row exists with a matching TableA.id1 and Foo.id2 - There ARE rows in Table C that don't match Table B
A row MUST be returned for every id1 passed in, even if no Table C row exists.
At first I tried a Left Outer Join from Table A to Table B then an Inner Join to Table C. That violates the 4th rule above, as the Inner Join drops out those rows.
Next I tried two Left Outer joins. This is closer, but has the side effect of including rows that match the Table A join to Table B, but don't have a corresponding Table C entry, which isn't what I want.
So, here's what I came up with.
SELECT
a.id1,
c.*
FROM
TableB b
INNER JOIN
TableC c USING (id1,id2)
RIGHT OUTER JOIN
TableA a USING (id1)
WHERE
a.id1 in (x,y,z)
I'm a bit wary of a Right Outer Join, as the documentation I've read says it can be replaced with a Left Outer, but it doesn't appear so for this case. It also seems a bit rare, which is making other devs nervous, so I'm being cautious.
So, three questions in one.
Is this correct?
Did I use the Right Outer Join correctly?
Is there a cleaner way to achieve the same thing?
EDIT: DB is MySQL
You can rewrite it as a LEFT OUTER JOIN by using parentheses. In pseudo-SQL change this:
SELECT ...
FROM b
INNER JOIN c ON ...
RIGHT OUTER JOIN a ON ...
to this:
SELECT ...
FROM a
LEFT OUTER JOIN (
b INNER JOIN c ON ...
) ON ...
You can use an EXISTS clause, which sometimes works better
SELECT
a.id1,
c.*
FROM TableA a
LEFT JOIN TableC c
ON c.id1 = a.id1 AND EXISTS (
select *
from TableB b
where b.id1=c.id1 and b.id2=c.id2)
WHERE
a.id1 in (x,y,z)
As you have written it, it works because ANSI JOINs are logically processed in the order they are written. Since you need to test B against C before joining to A, it is about the only way to write it without introducing a subquery [(B x C) RIGHT JOIN A]. However, a bad query plan could join all records in B and C (B x C) before right joining to A.
The EXISTS method efficiently uses the filter on A, then LEFT JOINs to C and for each C found, validates that it also exists in B (or discards).
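Here is a sketch of the EXISTS version run against hypothetical rows with Python's sqlite3 module. The info column and all sample values are invented, and the literal ids 1, 2, 3 stand in for the x, y, z parameters.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE TableA (id1 INTEGER);
    CREATE TABLE TableB (id1 INTEGER, id2 INTEGER);
    CREATE TABLE TableC (id1 INTEGER, id2 INTEGER, info TEXT);
    INSERT INTO TableA VALUES (1), (2), (3);
    INSERT INTO TableB VALUES (1, 10), (1, 12), (2, 20);
    INSERT INTO TableC VALUES (1, 10, 'keep'), (1, 11, 'no B match'), (3, 30, 'no B match');
""")

exists_rows = conn.execute("""
    SELECT a.id1, c.id2, c.info
    FROM TableA a
    LEFT JOIN TableC c
           ON c.id1 = a.id1
          AND EXISTS (SELECT *
                      FROM TableB b
                      WHERE b.id1 = c.id1 AND b.id2 = c.id2)
    WHERE a.id1 IN (1, 2, 3)
    ORDER BY a.id1
""").fetchall()

# One row per requested id1; C columns are NULL when no C row is backed by B.
print(exists_rows)  # [(1, 10, 'keep'), (2, None, None), (3, None, None)]
```

The B rows (1, 12) and (2, 20) have no Table C counterpart and add no extra rows, and the C rows without a B match are left out.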
Answers to your three questions:
1. Yes, your query is correct.
2. Yes.
3. EXISTS should work better.
Yeah, you need to start with TableA and then add tables B and C using joins. The only reason you even need TableA is to make sure you have a row for each parameter.
Select a.id1,c.*
From
TableA a
Left Join TableB b on a.id1=b.id1
Left Join TableC c on b.id1=c.id1 and b.id2=c.id2
Where a.id1 in (x,y,z)
You need to do OUTER joins all the way across, or rows that are missing in B will also cause data from A to be filtered out of the result set. By joining C to B (instead of directly to A) you are using B to filter. You could do it with a complicated EXISTS clause, but this is cleaner.