Im tryin to join 6 tables in bigquery named
T0, T1, T2, T3, T4, T5
The tables result im interested are T0 and T1
after query this tables I got 43 matches
SELECT
T1.F1,
T0.F2,
T0.F3,
T0.F4,
T1.F5,
T1.F6,
T1.F7,
T1.F8
T0.F9
FROM `TABLE0` T0
INNER JOIN `TABLE1` T1 on T1.F1= T0.F1
WHERE T0.F1 = "010001476713"
AND T0.F2 = T1.F2
ORDER BY T0.F4
But when I run this with multiple INNER JOIN I got 800 results not the 43, results are duplicated
SELECT
T2.F11,
T3.F15,
T2.F12,
T3.F16,
T3.F17,
T1.F1,
T2.F13,
T3.F17,
T5.F18,
T5.F19,
T5.F20,
T2.F14,
T0.F9,
T1.F10,
T4.F3,
T4.F21,
T4.F22,
T0.F2,
T3.F23,
T0.F3,
T0.F4,
T1.F5,
T1.F6,
T1.F7,
T1.F8
FROM `TABLE0` T0
INNER JOIN `TABLE1` T1 ON T1.F1= T0.F1
INNER JOIN `TABLE3` T3 ON T3.F1=T1.F1
INNER JOIN `TABLE2` T2 ON T2.F24 = T3.F24
INNER JOIN `TABLE4` T4 ON T4.F3 = T0.F3
INNER JOIN `TABLE5` as T5 ON T5.F1=T0.F1
WHERE T0.F1 = "010001476713"
AND T0.F2 = T1.F2
ORDER BY T0.F4
When I get duplicate rows, I solve it like this:
You get 43 results on your inner join of table T0 & T1. So far so good.
Now comment out everything related to table T2, T4, & T5 (I've placed the commas at the beginning of the row for easier commenting out) like this
SELECT
--T2.F11,
T3.F15
--,T2.F12
,T3.F16
,T3.F17
,T1.F1
--,T2.F13
,T3.F17
--,T5.F18
--,T5.F19
--,T5.F20
--,T2.F14
,T0.F9
,T1.F10
--,T4.F3
--,T4.F21
--,T4.F22
,T0.F2
,T3.F23
,T0.F3
,T0.F4
,T1.F5
,T1.F6
,T1.F7
,T1.F8
FROM `TABLE0` T0
INNER JOIN `TABLE1` T1 ON T1.F1= T0.F1 and T0.F2 = T1.F2
INNER JOIN `TABLE3` T3 ON T3.F1=T1.F1
--INNER JOIN `TABLE2` T2 ON T2.F24 = T3.F24
--INNER JOIN `TABLE4` T4 ON T4.F3 = T0.F3
--INNER JOIN `TABLE5` as T5 ON T5.F1=T0.F1
WHERE T0.F1 = "010001476713"
ORDER BY T0.F4
I've moved the and T0.F2 = T1.F2 from the where to on in the inner join. When you run this query, do you still get 43 rows, or more? If more, you need to figure out what it is double matching on, and add that to your on statement of it really is a 1-1 relationship, or perhaps group the results if you don't want multiple matches. You may need to comment out your select statement and select all to really figure it out, like this:
SELECT *
/*
--T2.F11,
T3.F15
--,T2.F12
,T3.F16
,T3.F17
,T1.F1
--,T2.F13
,T3.F17
--,T5.F18
--,T5.F19
--,T5.F20
--,T2.F14
,T0.F9
,T1.F10
--,T4.F3
--,T4.F21
--,T4.F22
,T0.F2
,T3.F23
,T0.F3
,T0.F4
,T1.F5
,T1.F6
,T1.F7
,T1.F8
*/
FROM `TABLE0` T0
INNER JOIN `TABLE1` T1 ON T1.F1= T0.F1 and T0.F2 = T1.F2
INNER JOIN `TABLE3` T3 ON T3.F1=T1.F1
--INNER JOIN `TABLE2` T2 ON T2.F24 = T3.F24
--INNER JOIN `TABLE4` T4 ON T4.F3 = T0.F3
--INNER JOIN `TABLE5` as T5 ON T5.F1=T0.F1
WHERE T0.F1 = "010001476713"
ORDER BY T0.F4
Once you figure out what rows are causing the duplication, you either group the results or add an 'and' statement to the on clause to make it a 1-1, and then move on. You then uncomment the parts of the query related to T2 and do the same thing, then T4 and then T5. If you send me the results of the query above, I can help you figure out what your on clause needs to be to keep it from duplicating.
thank you #jenstretman, I find table 4 to be duplicating matches by using a foreign Key with non-primary Key creating duplicates, the solution was to use a DISTINCT to only select specifically matched rows.
SELECT DISTINCT
T2.F11,
T3.F15,
T2.F12,
T3.F16,
T3.F17,
T1.F1,
T2.F13,
T3.F17,
T5.F18,
T5.F19,
T5.F20,
T2.F14,
T0.F9,
T1.F10,
T4.F3,
T4.F21,
T4.F22,
T0.F2,
T3.F23,
T0.F3,
T0.F4,
T1.F5,
T1.F6,
T1.F7,
T1.F8
FROM `TABLE0` T0
INNER JOIN `TABLE1` T1 ON T1.F1= T0.F1
INNER JOIN `TABLE3` T3 ON T3.F1=T1.F1
INNER JOIN `TABLE2` T2 ON T2.F24 = T3.F24
INNER JOIN (SELECT DISTINCT T4.F3, T4.F21, T4.F22, FROM `TABLE4` T4)T4 ON T4.F3 = T0.F3
INNER JOIN `TABLE5` as T5 ON T5.F1=T0.F1
WHERE T0.F1 = "010001476713"
AND T0.F2 = T1.F2
ORDER BY T0.F4
bit of a novice question, I am running a query and left joining and wanted to know whether there was a difference when you specify a filter in terms of performance, in e.g below, top I filter straight after first join and below I do all joins and then filter:
Select t1.*,t2.* from t1 t1
left join t2 t2
on t1.key = t2.key
and t1.date < today
left join t3 t3
on t2.key2 = t3.key
vs
Select t1.*,t2.* from t1 t1
left join t2 t2
on t1.key = t2.key
left join t3 t3
on t2.key2 = t3.key
and t1.date < today
Learn what LEFT JOIN ON returns: INNER JOIN ON rows UNION ALL unmatched left table rows extended by NULLs. Always know what INNER JOIN you want as part of an OUTER JOIN.
In general your queries have different inner join & null-extended rows for the 1st left joins & then further differences due to more joining. Unless certain constraints hold, the 2 queries return different functions of their inputs. So comparing their performance seems moot.
I'm using SQL server manger.
I have 3 tables
I need a query that pulls t1 ands add an Origin Basin and a Destination Basin.
So far I have the following:
select T1.[Country (destination)], T3.AreaName
From T1
left outer join T2 on
T1.[Country (destination)] = T2.CountryName
inner join T3 on
T2.AreaID = T3.AreaID
inner join T3 on
T2.AreaID = T3.AreaID
Which returns:
Country | Area
However, I'm having trouble doing this for the second country column. I believe you use aliases. I've tried:
select (select AreaName
FROM T3
where T3.AreaID = T2.AreaID) as 'Area Imp',
(select AreaID
From T2
where T2.CountryName = T1.[Country (origin)]) as 'x',
(select AreaID
From T2
where T2.CountryName = T1.[Country (destination)]) as 'y'
FROM T1
But I can't get it to work.
This is what you need to do:
select t1.date, t1.country_destination, t1.country_origin, destination_area.AreaName as area_destination, origin_area.AreaName as area_origin
from t1 as t1 join t2 as destination on t1.country_destination = destination.countryname
join t2 as origin on t1.country_origin = origin.countryname
join t3 as destination_area on t2.areaid = destination_area.areaid
join t3 as origin_area on t2.areaid = origin_area.areaid
You will need to join with the same table twice, both for t2 and t3 so that you get the matching records for your needs.
It helps usually to put aliases that match the purpose of the join (in this case, destination and origin) when writing the query.
I think what you're trying to do is something like this:
select T1.*, T3dest.AreaName, T3orig.AreaName
From
T1
inner join
T2 T2dest on
T1.[Country (destination)] = T2dest.CountryName
inner join
T3 T3dest on
T2dest.AreaID = T3dest.AreaID
inner join
T2 T2orig on
T1.[Country (origin)] = T2orig.CountryName
inner join
T3 T3orig on
T2orig.AreaID = T3orig.AreaID
Note that I've switched to inner joins throughout, at the moment. If you do want left join semantics, you either need to use those for all of the joins to the T2 and T3 tables or you need to change the join order (so that the relevant T3 joins to the T2 tables occur before the attempted join with T1). It's not clear from the sample data if that's required, however.
Try this, You would still want to join on area id's
select T1.Date,T1.[Country (destination)], null [Country (origin)], T3.AreaName [AreaName(Destination)], null [AreaName(Origin)]
From T1
left outer join T2 on
T1.[Country (destination)] = T2.CountryName
inner join T3 on
T2.AreaID = T3.AreaID
union all
select T1.Date,null [Country (destination)], t1.[Country (origin)], Null [AreaName(Destination)], t3. [AreaName(Origin)]
From T1
left outer join T2 on
T1.[Country (Origin)] = T2.CountryName
inner join T3 on
T2.AreaID = T3.AreaID
Query1: t1 left join t2 on some condition cross join t3 right join t4 on some condition
Query2: t1 left join t2 on some condition cross join t4 left join t3 on some condition
Will the result of these queries be the same or not?
They are not the same.
Ex. if we consider all the tables have the same number of rows, the first query returns less rows than the second one.
I was given a query that uses some very weird syntax for a join and I need to understand how this join is being used:
SELECT
T1.Acct#
, T2.New_Acct#
, T3.Pool#
FROM DB.dbo.ACCT_TABLE T1
LEFT JOIN DB.dbo.CROSSREF_TABLE T2
INNER JOIN DB.dbo.POOL_TABLE T3
ON T2.Pool# = T3.Pool#
ON T1.Acct# = T2.Prev_Acct#
T1 is a distinct account list
T2 is a distinct account list for each Pool #
T3 is a distinct pool list (group of accounts)
I need to return the previous account number held in T2 for each record in T1. I also need the T3 Pool# returned for each pool.
What I'm trying to understand is why someone would write the code this way. It doesn't make sense to me.
A little indenting will show you better what was intended
SELECT
T1.Acct#
, T2.New_Acct#
, T3.Pool#
FROM DB.dbo.ACCT_TABLE T1
LEFT JOIN DB.dbo.CROSSREF_TABLE T2
INNER JOIN DB.dbo.POOL_TABLE T3
ON T2.Pool# = T3.Pool#
ON T1.Acct# = T2.Prev_Acct#
This is a valid syntax that forces the join order a bit. Basically it is asksing for only the records in table T2 that are also in table T3 and then left joining them to T1. I don't like it personally as it is confusing for maintenance. I would prefer a derived table as I find those much clearer and much easier to change when I need to do maintenance six months later:
SELECT
T1.Acct#
, T2.New_Acct#
, T3.Pool#
FROM DB.dbo.ACCT_TABLE T1
LEFT JOIN (select T2.New_Acct#, T3.Pool#
FROM DB.dbo.CROSSREF_TABLE T2
INNER JOIN DB.dbo.POOL_TABLE T3
ON T2.Pool# = T3.Pool#) T4
ON T1.Acct# = T4.Prev_Acct#
An OUTER APPLY would be clearer here:
SELECT
T1.Acct#,
T4.New_Acct#,
T4.Pool#
FROM DB.dbo.ACCT_TABLE T1
OUTER APPLY
(
SELECT
T2.New_Acct#,
T3.Pool#
FROM DB.dbo.CROSSREF_TABLE T2
INNER JOIN DB.dbo.POOL_TABLE T3 ON T2.Pool# = T3.Pool#
WHERE T1.Acct# = T4.Prev_Acct#
) T4