How to join two tables on multiple columns using OR condition in bigquery SQL - sql

Lets say I have two tables.
First table shown below:
tableA
Second table
tableB
Now I want to write a query That will join the two tables above on either name OR email OR phone.
Something like:
SELECT * FROM tableA
LEFT JOIN tableB
ON
(tableA.name_A = tableB.name_B OR tableA.email_A = tableB.email_B OR tableA.phone_A = tableB.phone_B)
And it should produce a table something like this
If you notice,
John matches rows between tableA and tableB on name.
Ally/allie matches rows between tableA and tableB on email.
Sam/Samual matches rows between tableA and tableB on phone
When I try to do this same query though I receive an
error that says LEFT OUTER JOIN cannot be used without a condition that is an equality of fields from both sides of the join.
I am using BigQuery.
Please help, cheers

Try INNER JOIN
SELECT * FROM tableA
INNER JOIN tableB
ON
(tableA.name_A = tableB.name_B OR tableA.email_A = tableB.email_B OR tableA.phone_A = tableB.phone_B)
or CROSS JOIN:
SELECT * FROM tableA
CROSS JOIN tableB
WHERE
tableA.name_A = tableB.name_B
OR tableA.email_A = tableB.email_B
OR tableA.phone_A = tableB.phone_B
or UNION DISTINCT:
SELECT * FROM tableA
LEFT JOIN tableB
ON tableA.name_A = tableB.name_B
UNION DISTINCT
SELECT * FROM tableA
LEFT JOIN tableB
ON tableA.email_A = tableB.email_B
UNION DISTINCT
SELECT * FROM tableA
LEFT JOIN tableB
ON tableA.phone_A = tableB.phone_B

could you try by using parenthesis
SELECT * FROM tableA
LEFT JOIN tableB
ON
(tableA.name_A = tableB.name_B) OR
(tableA.email_A = tableB.email_B) OR
(tableA.phone_A = tableB.phone_B)

Related

Full Outer Join failing to return all records from both tables

I have a pair of tables I need to join, I want to return any record that's in tableA, tableB or both. I think I need a FULL OUTER JOIN
This query return 1164 records
SELECT name FROM tableA
WHERE reportDay = '2022-Apr-05'
And this one return 3339 records
SELECT name FROM tableB
WHERE reportDay = '2022-Apr-05'
And this one returns 3369 records (so there must be 30 records in tableA that aren't in tableB)
select distinct name FROM tableA where reportDay = '2022-Apr-05'
union distinct
select distinct name FROM tableB where reportDay = '2022-Apr-05'
I want to obtain a list of all matching records in either table. The query above returns 3369 records, so a FULL OUTER JOIN should also return 3369 rows (I think). My best effort so far is shown below. It returns 1164 rows and returns what looks to me to be a left join between tableA and tableB.
SELECT tableA.name.*, tableB.name.*
FROM tableA
FULL OUTER JOIN tableB
ON (tableA.name = tableB.name and tableB.reportDay = '2022-Apr-05')
WHERE tableA.reportDay = '2022-Apr-05'
Help appreciated. (if this looks question looks familiar, it's a follow-on question to this one )
UPDATE - Sorry (#forpas) to keep moving the goalposts - I'm trying to match test data to real-data scenario's.
DROP TABLE tableA;
DROP TABLE tableB;
CREATE TABLE tableA (name VARCHAR(10),
reportDay DATE,
val1 INTEGER,
val2 INTEGER);
CREATE TABLE tableB (name VARCHAR(10),
reportDay DATE,
test1 INTEGER,
test2 INTEGER);
INSERT INTO tableA values ('A','2022-Apr-05',1,2),
('B','2022-Apr-05',3,4), ('C','2022-Apr-05',5,6),
('A','2022-Apr-06',1,2), ('B','2022-Apr-06',3,4),
('C','2022-Apr-06',5,6), ('Z','2022-Apr-04',5,6),
('Z','2022-Apr-06',5,6) ;
INSERT INTO tableB values ('A','2022-Apr-03',5,6),
('B','2022-Apr-04',11,22), ('B','2022-Apr-05',11,22),
('C','2022-Apr-05',33,44), ('D','2022-Apr-05',55,66),
('B','2022-Apr-06',11,22), ('C','2022-Apr-06',33,44),
('D','2022-Apr-06',55,66), ('Q','2022-Apr-06',5,6);
SELECT tableA.*, tableB.*
FROM tableA
FULL OUTER JOIN tableB
ON (tableA.name = tableB.name and tableB.reportDay = '2022-Apr-05'
AND tableA.reportDay = '2022-Apr-05' )
For this data, I'd hope to see 4 rows of data 'A' from tableA only, 'B' and 'C' from both tables, and 'D' from table B only. I'm after the 5th April records only! The query (shown above) suggested by #forpas works except that the 'A' record in tableA doesn't get returned.
UPDATE - FINAL EDIT AND ANSWER!
Ok, the solution seem to be to concetenate the two fields together before joining....
SELECT a.*, b.*
FROM tableA a FULL OUTER JOIN tableB b
ON (b.name || b.reportDay) = (a.name || a.reportDay)
WHERE (a.reportDay = '2022-Apr-05' OR a.reportDay IS NULL)
AND (b.reportDay = '2022-Apr-05' OR b.reportDay IS NULL);
The condition for the date should be placed in a WHERE clause:
SELECT a.*, b.*
FROM tableA a FULL OUTER JOIN tableB b
ON b.name = a.name AND a.reportDay = b.reportDay
WHERE '2022-Apr-05' IN (a.reportDay, b.reportDay);
or:
SELECT a.*, b.*
FROM tableA a FULL OUTER JOIN tableB b
ON b.name = a.name
WHERE (a.reportDay = '2022-Apr-05' OR a.reportDay IS NULL)
AND (b.reportDay = '2022-Apr-05' OR b.reportDay IS NULL);
See the demo.

How to find differences between 2 tables with different key names

So say I have 2 tables. TableA has keys FirstName, LastName, Age, Location. TableB has keys 1FirstName, 1LastName, 1Age, 1Location. Table A is a working table, and TableB is a reference table. How could I go about finding what records exist in TableB that DO NOT exist in TableA
Using an outer join will join all rows of left table to either right table or NULL
Then including only NULL from right hand side will give you what you need.
TableB as left side
SELECT
*
FROM (
SELECT
b.*,
a.*
FROM TableB b
LEFT OUTER JOIN TableA a
ON
b.1FirstName = a.Firstname
AND
b.1LastName = a.Lastname
AND
b.1Age = a.Age
AND
b.1Location = a.Location
) q
WHERE q.FirstName IS NULL

SQL antijoin with multiple keys

I'd like to implement an antijoin on two table but using two keys so that the result is all rows in Table A that do not contain the combinations of [key_1, key_2] found in Table B. How can I write this query in SQL?
If you want an anti-left join, the logic is:
select a.*
from tablea a
left join tableb b on b.key_1 = a.key_1 and b.key_2 = a.key_2
where b.key_1 is null
As for me, I like to implement such logic with not exists, because I find that it is more expressive about the intent:
select a.*
from tablea a
where not exists (
select 1 from tableb b where b.key_1 = a.key_1 and b.key_2 = a.key_2
)
The not exists query would take advantage of an index on tableb(key_1, key_2).
select a.*
from table_a a
left anti join table_b b on a.key_1 = b.key_1 and a.key_2 = b.key_2;

MS Acces Jet SQL Error: Join Expression not supported with multiple Join conditions

I'm trying to run this SQL Expression in Access:
Select *
From ((TableA
Left Join TableB
On TableB.FK = TableA.PK)
Left Join TableC
On TableC.FK = TableB.PK)
Left Join (SELECT a,b,c FROM TableD WHERE b > 1) AS TableD
On (TableD.FK = TableC.PK AND TableA.a = TableD.a)
but it keeps getting error: Join-Expression not supported.
Whats the problem?
Sorry, im just starting with Jet-SQL and in T-SQL its all fine.
Thanks
The issue is that the final outer join condition TableA.a = TableD.a will cause the query to contain ambiguous outer joins, since the records to which TableA is joined to TableD will depend upon the results of the joins between TableA->TableB, TableB->TableC and TableC->TableD.
To avoid this, you'll likely need to structure your query with the joins between tables TableA, TableB & TableC existing within a subquery, the result of which is then outer joined to TableD. This unambiguously defines the order in which the joins are evaluated.
For example:
select * from
(
select TableA.a, TableC.PK from
(
TableA left join TableB on TableA.PK = TableB.FK
)
left join TableC on TableB.PK = TableC.FK
) q1
left join
(
select TableD.a, TableD.b, TableD.c, TableD.FK from TableD
where TableD.b > 1
) q2
on q1.a = q2.a and q1.PK = q2.FK
Consider relating every join to the FROM table to avoid having to nest relations.
SELECT *
FROM ((TableA
LEFT JOIN TableB
ON TableB.FK = TableA.PK)
LEFT JOIN TableC
ON TableC.FK = TableA.PK)
LEFT JOIN
(SELECT FK,a,b,c
FROM TableD WHERE b > 1
) AS TableD
ON (TableD.FK = TableA.PK)
AND (TableD.a = TableA.a)

Optimizing SQL Query - Joining 4 tables

I am trying to join 4 tables. Currently I've achieved it by doing this.
SELECT columns
FROM tableA
LEFT OUTER JOIN tableB ON tableB.address_id = tableA.address_id
INNER JOIN tableC ON tableC.company_id = tableA.company_id AND tableC.client_id = ?
UNION
SELECT columns
FROM tableA
LEFT OUTER JOIN tableB ON tableB.address_id = tableA.gaddress_id
INNER JOIN tableD ON tableD.company_id = tableA.company_id AND tableD.branch_id = ?
The structure of tableC and tableD is very similar. Let's say that tableC contains data for clients. And tableD contains data for client's branch. tableA are companies and tableB are addresses My goal is to get data from tableA that are joined to table B (All companies that has addresses) and all the data from tableD and also from tableC.
This wroks nice, but I am afraid that is would be very slow.
I think you can trick it like this:
First UNION between C,D and only the join to the rest of the query, it should improve the query significantly :
SELECT columns
FROM TableA
LEFT OUTER JOIN tableB ON tableB.address_id = tableA.address_id
INNER JOIN(SELECT Columns,'1' as ind_where FROM tableC
UNION ALL
SELECT Columns,'2' FROM TableD) joined_Table
ON (joined_Table.company_id = tableA.company_id AND joined_Table.New_Col_ID= ?)
The New_Col_ID -> just select both branch_id and client_id in the same column and alias it as New_Col_ID or what ever
In addition you can index the tables(if not exists yet) :
TableA(address_id,company_id)
TableB(address_id)
TableC(company_id,client_id)
TableD(company_id,branch_id)
Why should that be slow? You select client adresses and branch addresses and show the complete result. That seems straight-forward.
You join on IDs and this should be fast (as there should be indexes available accordingly). You may want to introduce composite indexes on
create index idx_c on tableC(client_id, company_id)
and
create index idx_d on tableD(branch_id, company_id)
However: UNION is a lot of work for the DBMS, because it has to look for and eliminate duplicates. Can there even be any? Otherwise use UNION ALL.
Try CTE so that you don't have to go through TableA and TableB twice for the union.
; WITH TempTable (Column1, Column2, ...)
AS ( SELECT columns
FROM tableA
LEFT OUTER JOIN tableB
ON tableB.address_id = tableA.gaddress_id
)
SELECT Columns
FROM TempTable
INNER JOIN tableC
ON tableC.company_id = tableA.company_id AND tableC.client_id = ?
UNION
SELECT Columns
FROM TempTable
INNER JOIN tableD ON tableD.company_id = tableA.company_id AND tableD.branch_id = ?