Tables join using mapping table - sql

I have 3 tables:
table1: col1(id), col2(segment), col3(sector), col4(year)
mapping table2:
col1(segment1) => values are the same as in table1.col2,
col2(segment2) => values are the same as in table3.col2
table3: col1(id), col2(segment), col3(sector), col4(year)
Now, I'm doing a FULL OUTER JOIN:
select t1.id, t3.id
from table1 t1
full outer join table3 t3 on
t1.year = t3.year and....
But I also need to join on col2 (segment), using the mapping table.
How do I do this correctly?

If I understood you correctly, you just need to add another full outer join:
select t1.id, t3.id
from table1 t1
full outer join mapping t2 on (t1.col2 = t2.col1)
full outer join table3 t3 on (t1.year = t3.year and t2.col2 = t3.col2)
Just to be clear: a full outer join keeps all the records from both joined tables, whether or not there is a match. I used a full outer join for the mapping table as well; change it to whichever kind of join you actually need.
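As a rough, runnable sketch of joining through the mapping table, the snippet below uses Python's built-in sqlite3 module with made-up sample rows. The column names (segment, segment1, segment2) and the data are assumptions for illustration only. LEFT JOIN is used instead of FULL OUTER JOIN because older SQLite builds do not support the latter; the way the mapping table bridges the two segment columns is the same.

```python
import sqlite3

# In-memory database with hypothetical sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id INTEGER, segment TEXT, sector TEXT, year INTEGER);
    CREATE TABLE mapping (segment1 TEXT, segment2 TEXT);
    CREATE TABLE table3 (id INTEGER, segment TEXT, sector TEXT, year INTEGER);

    INSERT INTO table1 VALUES (1, 'retail', 's', 2020), (2, 'corp', 's', 2021);
    INSERT INTO mapping VALUES ('retail', 'RTL'), ('corp', 'CRP');
    INSERT INTO table3 VALUES (10, 'RTL', 's', 2020), (20, 'CRP', 's', 1999);
""")

# table1.segment is translated through mapping before it is compared
# with table3.segment; the year condition is applied in the same ON clause.
rows = conn.execute("""
    SELECT t1.id, t3.id
    FROM table1 t1
    LEFT JOIN mapping t2 ON t1.segment = t2.segment1
    LEFT JOIN table3 t3 ON t1.year = t3.year AND t2.segment2 = t3.segment
    ORDER BY t1.id
""").fetchall()

print(rows)
```

With this sample data, row 1 finds its partner in table3 through the mapping, while row 2 maps to a segment whose year does not match and therefore gets NULL for t3.id.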

Related

sql query joins multiple tables - too slow

I am currently trying to join a few tables together (and possibly 2 more later), but with how my query is written right now, I can't even get results back with 3 tables:
select t1.x,
t1.y,
t1.z,
t4.a,
t4.b,
t4.c,
t4.d
from t1
left join t2 on t1.id=t2.id
left join t3 on t2.id=t3.id
left join t4 on t1.id2=t4.id
where t1.date between 'x' and 'x'
and t1.city not in ('x')
and t3.column = x;
Is there a way to optimize this code to run faster and perhaps make it able to add more tables to it?
Thank you in advance!
Your query has some logic issues; fixing them might help with the speed.
t2 is joined to t1 when they have the same id value.
t3 is then pulled in if, and only if, there was a matching row in t2, so its id is the same as in both t1 and t2.
Finally, in your where clause, t3.column has to be x, otherwise the row is filtered out.
This means a row in t3 has to exist. Every t1 record that doesn't have both a t2 record and a t3 record will be filtered out by that where clause. Thus you don't need a left join, you need an INNER join.
select t1.x,
t1.y,
t1.z,
t4.a,
t4.b,
t4.c,
t4.d
from t1
inner join t2 on t1.id=t2.id
inner join t3 on t2.id=t3.id
left join t4 on t1.id2=t4.id
where t1.date between 'x' and 'x'
and t1.city not in ('x')
and t3.column = x;
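To check the claim that the WHERE condition on t3 makes the LEFT joins behave like INNER joins, here is a small sketch using Python's built-in sqlite3 module. The tables are cut down to the join keys, and the `flag` column stands in for `t3.column`; both the data and that column name are assumptions.

```python
import sqlite3

# In-memory database with hypothetical sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t1 (id INTEGER);
    CREATE TABLE t2 (id INTEGER);
    CREATE TABLE t3 (id INTEGER, flag TEXT);
    INSERT INTO t1 VALUES (1), (2), (3);
    INSERT INTO t2 VALUES (1), (2);
    INSERT INTO t3 VALUES (1, 'x'), (2, 'y');
""")

# Same query, run once with LEFT joins and once with INNER joins.
query = """
    SELECT t1.id
    FROM t1
    {join} JOIN t2 ON t1.id = t2.id
    {join} JOIN t3 ON t2.id = t3.id
    WHERE t3.flag = 'x'
    ORDER BY t1.id
"""
left_rows = conn.execute(query.format(join="LEFT")).fetchall()
inner_rows = conn.execute(query.format(join="INNER")).fetchall()

# The WHERE clause discards every row where t3.flag is NULL,
# so the rows that only a LEFT join would have kept are gone anyway.
print(left_rows, inner_rows)
```

Both variants return only the row with id 1 here, which is why the INNER join form states the intent more directly.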
In some DBMSs you can move the t3.column condition into the join clause, which can help filter out rows earlier in the plan.
select t1.x,
t1.y,
t1.z,
t4.a,
t4.b,
t4.c,
t4.d
from t1
inner join t2 on t1.id=t2.id
inner join t3 on t2.id=t3.id and t3.column = x
left join t4 on t1.id2=t4.id
where t1.date between 'x' and 'x'
and t1.city not in ('x');
My final advice is to take a close look at t2 to see if you really need it. Ask yourself: is there a reason a row has to exist in t2 in order for me to get the right results? If not, then because t1.id = t2.id and t2.id = t3.id imply t1.id = t3.id, you can join t3 directly to t1 and eliminate the t2 table completely.

How do I get the results of one JOIN and THEN feed those into a separate join in T-SQL?

I'm trying to JOIN 2 tables ON a key like
SELECT column1,column2
FROM Table1 t1
INNER JOIN Table2 t2 ON t1.t2id = t2.id
Now, I have a 3rd table that has a Foreign Key with t2's id that I want to join... When I do
LEFT JOIN
Table3 t3 ON t3.t2id = --<-------------- This is where I'm lost
I don't know if I should do ON t3.t2id = t1.t2id OR ON t3.t2id = t2.id
What I need is the list of t2ids which are still in the picture after the first join. However, it seems as though if I specify either of the above, it will just pull ids from the original table before the first join?
To clarify one more time: I'm essentially trying to do an INNER JOIN of Table1 and Table2, get the resulting table, then take the t2ids of those results and feed them into a final join, such that the final result contains all of Table3's rows as well as the data from the first join.
You said: "final result contains all of Table3's rows as well as the data from the first join".
It means that you need
Table3 LEFT JOIN <previous results>
instead of
<previous results> LEFT JOIN Table3
The easiest way to write it is to use Common-Table Expressions:
WITH
CTE_InnerJoin
AS
(
SELECT column1, column2, t1.t2id
FROM
Table1 t1
INNER JOIN Table2 t2 ON t1.t2id = t2.id
)
SELECT
CTE_InnerJoin.column1
,CTE_InnerJoin.column2
,Table3....
FROM
Table3
LEFT JOIN CTE_InnerJoin ON CTE_InnerJoin.t2id = Table3.t2id
;
It doesn't matter what column you include in the CTE: t1.t2id or t2.id, the values in them are the same, because they are inner-joined together.
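The CTE approach can be tried out with a small sketch like the following, which uses Python's built-in sqlite3 module (SQLite also supports WITH). The sample rows and the `Table3.id` column are assumptions made up for the illustration.

```python
import sqlite3

# In-memory database with hypothetical sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Table1 (column1 TEXT, t2id INTEGER);
    CREATE TABLE Table2 (id INTEGER, column2 TEXT);
    CREATE TABLE Table3 (id INTEGER, t2id INTEGER);
    INSERT INTO Table1 VALUES ('a', 1), ('b', 99);
    INSERT INTO Table2 VALUES (1, 'p'), (2, 'q');
    INSERT INTO Table3 VALUES (100, 1), (200, 2), (300, 3);
""")

# The CTE holds the result of the inner join; Table3 is on the left
# of the LEFT JOIN, so every Table3 row survives.
rows = conn.execute("""
    WITH CTE_InnerJoin AS (
        SELECT t1.column1, t2.column2, t1.t2id
        FROM Table1 t1
        INNER JOIN Table2 t2 ON t1.t2id = t2.id
    )
    SELECT Table3.id, CTE_InnerJoin.column1, CTE_InnerJoin.column2
    FROM Table3
    LEFT JOIN CTE_InnerJoin ON CTE_InnerJoin.t2id = Table3.t2id
    ORDER BY Table3.id
""").fetchall()

print(rows)
```

Only t2id 1 survives the inner join in this data, so the Table3 rows for t2id 2 and 3 come back with NULLs in the columns taken from the CTE.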
JOINs already do exactly what you want. A JOIN isn't always between two tables. Frequently, it's between the results of previous joins.
SELECT column1,column2
FROM Table1 t1
INNER JOIN Table2 t2 ON t1.t2id = t2.id
LEFT JOIN
Table3 t3 ON t3.t2id = t2.id
At the point at which you're writing the final ON clause here, what you're joining is precisely the results of the previous INNER JOIN on the left and the table Table3 on the right. All of t1, t2 and t3 are in scope within the ON clause, but note that t1 and t2 are now both used as aliases into the same source of rows - the previous INNER JOIN.
As a further example, consider the "diamond join":
SELECT
*
FROM
t1
left join
t2
on
t1.a = t2.b
left join
t3
on
t1.c = t3.d
inner join
t4
on
t2.e = t4.f OR
t3.g = t4.h
This is a way of joining two tables (t1 and t4) based on two alternative joins. Note that in the final inner join, what is on the "left" is the result of already joining tables t1, t2 and t3.
Join the table having foreign key with
Try this....
SELECT column1,column2
FROM Table1 t1
INNER JOIN Table2 t2 ON t1.t2id = t2.id
LEFT join Table3 t3 ON t2.id=t3.t2id
Or Like this.
SELECT t12.column1 ,
t12.column2 ,
t3.*
FROM (
--- INNER JOIN of Table1 and Table2, get the resulting table,
SELECT t1.column1 ,
t2.column2 ,
t1.t2id --- or t2.id doesn't matter because its inner join
FROM Table1 t1
INNER JOIN Table2 t2 ON t1.t2id = t2.id
) T12
LEFT JOIN Table3 T3 ON T3.t2id = T12.t2id --- then get the t2ids of those results
--- and feed them into a final join
--- if you want to get all rows from Table3, change LEFT JOIN Table3 T3 ON T3.t2id = T12.t2id
--- into RIGHT JOIN Table3 T3 ON T3.t2id = T12.t2id
try this
select t3.*, column1, column2
from
table1 t1 inner join table2 t2 on t1.t2id = t2.id
right outer join table3 t3 on t3.t2id = t2.id
which is equivalent to
select t3.*, column1, column2
from
table1 t1 inner join table2 t2 on t1.t2id = t2.id
right outer join table3 t3 on t3.t2id = t1.t2id
If you want all rows from table3 and only those matching rows from table1 inner joined to table2, then you can use this nested syntax:
select t3.*,
column1, column2
from table3 t3
left join table2 t2
inner join table1 t1
on t1.t2id = t2.id
on t3.t2id = t2.id

Multiple joins in a sql query - which is best option

I want to use a sql query with multiple joins similar to the example below.
SELECT t1.column1, t1.column2, t1.column3
FROM
table1 t1
LEFT JOIN table2 t2 ON (t1.id1 = t2.id)
LEFT JOIN table3 t3 ON (t1.id1 = t3.id)
JOIN table4 t4 ON t1.id2 = t4.id
WHERE
...
Would this give different results than the following query:
SELECT t1.column1, t1.column2, t1.column3
FROM
table1 t1
LEFT JOIN table2 t2 ON (t1.id1 = t2.id)
LEFT JOIN table3 t3 ON (t2.id = t3.id)
JOIN table4 t4 ON t1.id2 = t4.id
WHERE
...
If they are both 'correct' is the second more efficient than the first?
Thanks
The queries are different, so this isn't a performance issue. The difference is in these lines:
LEFT JOIN table3 t3 ON (t1.id1 = t3.id)
and
LEFT JOIN table3 t3 ON (t2.id = t3.id)
For the first, t3.id only needs to match t1.id1. For the second, it needs to match t2.id, which in turn must also match t1.id1. In other words, the second version requires that the id be in both t1 and t2.
This is because of the LEFT JOIN. The queries would be equivalent if they used INNER JOIN.
The second one is more efficient because it will always return the same amount of data or less.
In the first query you are asking for all the records that are in both table1 and table4 + records from table2 if they exist in table1 + records from table3 if they exist in table1.
In the second query you are asking for all the records that are in both table1 and table4 + records from table2 if they exist in table1 + records from table3 if they exist in BOTH table1 and table2.
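The difference between the two ON conditions can be made concrete with a small sketch using Python's built-in sqlite3 module. The sample rows are assumptions chosen so that one table1 row has no partner in table2.

```python
import sqlite3

# In-memory database with hypothetical sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id1 INTEGER, id2 INTEGER);
    CREATE TABLE table2 (id INTEGER);
    CREATE TABLE table3 (id INTEGER);
    CREATE TABLE table4 (id INTEGER);
    INSERT INTO table1 VALUES (1, 10), (2, 10);
    INSERT INTO table2 VALUES (1);
    INSERT INTO table3 VALUES (1), (2);
    INSERT INTO table4 VALUES (10);
""")

# Only the ON condition of the table3 join differs between the two runs.
query = """
    SELECT t1.id1, t2.id, t3.id
    FROM table1 t1
    LEFT JOIN table2 t2 ON t1.id1 = t2.id
    LEFT JOIN table3 t3 ON {t3_condition}
    JOIN table4 t4 ON t1.id2 = t4.id
    ORDER BY t1.id1
"""
via_t1 = conn.execute(query.format(t3_condition="t1.id1 = t3.id")).fetchall()
via_t2 = conn.execute(query.format(t3_condition="t2.id = t3.id")).fetchall()

print(via_t1)
print(via_t2)
```

For id1 = 2 there is no table2 row, so t2.id is NULL. Joining table3 on t1.id1 still finds table3's row 2, while joining on t2.id cannot match anything and leaves t3.id NULL.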

SQL select column when there's more than one of the same name

I have this query
select *
from alldistros t1
LEFT join origin t2 on t1.name=t2.name
LEFT join desktop t3 on t2.name=t3.name
LEFT join beginnerdistributions t4 on t3.name=t4.name
It joins all my tables. But now when I want to select the name field (which is in all of them), I can't show it; it's just blank when I reference it. I suspect that's because there is more than one column with the same name.
What can I do to fix this?
Just a plain join won't work, since it drops the rows that have no matching entry in the other tables.
You can use the 'AS' keyword to name a column. For instance:
select t1.name AS DistroName, t2.name AS OriginName, t3.name AS DesktopName
from alldistros t1
LEFT join origin t2 on t1.name=t2.name
LEFT join desktop t3 on t2.name=t3.name
LEFT join beginnerdistributions t4 on t3.name=t4.name
select
t1.name as t1_name,
t2.name as t2_name,
t3.name as t3_name
from alldistros t1
LEFT join origin t2 on t1.name=t2.name
LEFT join desktop t3 on t2.name=t3.name
LEFT join beginnerdistributions t4 on t3.name=t4.name
USING can do this for you for ad-hoc queries. It is not Oracle-only: it is standard SQL and also works in PostgreSQL, MySQL and SQLite, though SQL Server does not support it:
SELECT *
FROM TABLEA
JOIN TABLEB USING (NAME)
This will only return one NAME column from the SELECT *.
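A quick way to see the effect of USING on SELECT * is to compare the result columns of both join styles. The sketch below does this with Python's built-in sqlite3 module; the table contents and the extra columns (origin, desktop) are assumptions for illustration, and other engines may label duplicate columns differently.

```python
import sqlite3

# In-memory database with hypothetical sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tablea (name TEXT, origin TEXT);
    CREATE TABLE tableb (name TEXT, desktop TEXT);
    INSERT INTO tablea VALUES ('debian', 'global');
    INSERT INTO tableb VALUES ('debian', 'gnome');
""")

# Join with ON: SELECT * returns the name column from both tables.
on_cur = conn.execute(
    "SELECT * FROM tablea a JOIN tableb b ON a.name = b.name"
)
on_columns = [d[0] for d in on_cur.description]

# Join with USING: the shared column appears only once.
using_cur = conn.execute("SELECT * FROM tablea JOIN tableb USING (name)")
using_columns = [d[0] for d in using_cur.description]

print(on_columns)
print(using_columns)
```

When the duplicate columns must all be kept, the AS aliases shown in the other answers remain the way to tell them apart.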

SQL - Difference between these Joins?

I should probably know this by now, but what, if any is the difference between the two statements below?
The nested join:
SELECT
t1.*
FROM
table1 t1
INNER JOIN table2 t2
LEFT JOIN table3 t3 ON t3.table3_ID = t2.table2_ID
ON t2.table2_ID = t1.table1_ID
The more traditional join:
SELECT
t1.*
FROM
table1 t1
INNER JOIN table2 t2 ON t2.table2_ID = t1.table1_ID
LEFT JOIN table3 t3 ON t3.table3_ID = t2.table2_ID
Well, it's the order of operations.
SELECT
t1.*
FROM
table1 t1
INNER JOIN table2 t2
LEFT JOIN table3 t3 ON t3.table3_ID = t2.table2_ID
ON t2.table2_ID = t1.table1_ID
could be rewritten as:
SELECT
t1.*
FROM
table1 t1 -- inner join t1
INNER JOIN
(table2 t2 LEFT JOIN table3 t3 ON t3.table3_ID = t2.table2_ID) -- with this
ON t2.table2_ID = t1.table1_ID -- on this condition
So basically, first you LEFT JOIN t2 with t3, based on the join condition: table3_ID = table2_ID, then you INNER JOIN t1 with t2 on table2_ID = table1_ID.
In your second example you first INNER JOIN t1 with t2, and then LEFT JOIN the result of that inner join with table t3 on the condition table3_ID = table2_ID.
SELECT
t1.*
FROM
table1 t1
INNER JOIN table2 t2 ON t2.table2_ID = t1.table1_ID
LEFT JOIN table3 t3 ON t3.table3_ID = t2.table2_ID
could be rewritten as:
SELECT
t1.*
FROM
(table1 t1 INNER JOIN table2 t2 ON t2.table2_ID = t1.table1_ID) -- first inner join
LEFT JOIN -- then left join
table3 t3 ON t3.table3_ID = t2.table2_ID -- the result with this
EDIT
I apologize. My first remark was wrong. The two queries will produce the same results, but there may be a difference in performance: the first query may perform slower than the second in some instances (when table1 contains only a subset of the elements in table2), as the LEFT JOIN will be executed first and only then intersected with table1. The second query, by contrast, allows the query optimizer to do its job.
For your specific example, I don't think there should be any difference in the query plans generated, but there's definitely a difference in readability. Your 2nd example is MUCH easier to follow.
If you were to reverse the types of joins in the example, you could end up with much different results.
SELECT t1.*
FROM table1 t1
LEFT JOIN table2 t2 ON t2.table2_ID = t1.table1_ID
INNER JOIN table3 t3 ON t3.table3_ID = t2.table2_ID
-- may not produce the same results as...
SELECT t1.*
FROM table1 t1
LEFT JOIN table2 t2
INNER JOIN table3 t3 ON t3.table3_ID = t2.table2_ID
ON t2.table2_ID = t1.table1_ID
Because the order of the joins DOES matter in many cases, careful thought should go into how you write your join syntax. If you find that the 2nd example is what you're really trying to accomplish, I'd consider rewriting the query so that you can put more emphasis on the order of your joins...
SELECT t1.*
FROM table2 t2
INNER JOIN table3 t3 ON t3.table3_ID = t2.table2_ID
RIGHT JOIN table1 t1 ON t2.table2_ID = t1.table1_ID
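The reversed-join case above (LEFT JOIN followed by INNER JOIN, flat versus nested) can be checked with a small sketch using Python's built-in sqlite3 module. The sample rows are assumptions, and the nested form is written as a derived table aliased `j` (a hypothetical name), which groups the joins the same way while avoiding engine-specific nested-join syntax.

```python
import sqlite3

# In-memory database with hypothetical sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (table1_ID INTEGER);
    CREATE TABLE table2 (table2_ID INTEGER);
    CREATE TABLE table3 (table3_ID INTEGER);
    INSERT INTO table1 VALUES (1), (2);
    INSERT INTO table2 VALUES (1), (2);
    INSERT INTO table3 VALUES (1);
""")

# Flat form: the trailing INNER JOIN removes table1 rows without a table3 match.
flat = conn.execute("""
    SELECT t1.table1_ID, t3.table3_ID
    FROM table1 t1
    LEFT JOIN table2 t2 ON t2.table2_ID = t1.table1_ID
    INNER JOIN table3 t3 ON t3.table3_ID = t2.table2_ID
    ORDER BY t1.table1_ID
""").fetchall()

# Nested form: table2 and table3 are inner-joined first, and that result
# is then LEFT JOINed to table1, so every table1 row is kept.
nested = conn.execute("""
    SELECT t1.table1_ID, j.table3_ID
    FROM table1 t1
    LEFT JOIN (
        SELECT t2.table2_ID, t3.table3_ID
        FROM table2 t2
        INNER JOIN table3 t3 ON t3.table3_ID = t2.table2_ID
    ) j ON j.table2_ID = t1.table1_ID
    ORDER BY t1.table1_ID
""").fetchall()

print(flat)
print(nested)
```

With this data the flat form loses the table1 row with ID 2, while the nested form keeps it with a NULL for table3_ID.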
The best way to see what is different in these two queries is to compare the Query Plan for both these queries.
There is no difference in the result sets for these IF there are always rows in table3 for a given row in table2.
I tried it on my database and the difference in the query plans was that
1. For the first query, the optimizer chose to do the join on table2 and table 3 first.
2. For the second query, the optimizer chose to join table1 and table2 first.
You should see no difference at all between the two queries, provided your DBMS' optimizer is up to scratch. That, however, even for big-iron, high-cost platforms, is not an assumption I'd be confident in making, so I'd be fairly unsurprised to discover that query plans (and consequently execution times) varied.