Join on any of several columns - sql

I have a few tables, each containing different columns, which contain different subsets of all the products.
I want to get a list of all the information about all the products, so I want to do a full outer join on the product_id.
I tried
select * from table1
full outer join table2 b on b.product_id = table1.product_id
...
full outer join tableN c on c.product_id = table1.product_id
but this results in multiple rows where a product_id does not exist in table1, but might exist in table2 and tableN.
Is there a way to "coalesce" the join column?

If you use full outer join with more than two tables, then you will end up with something like this:
select *
from table1 a full outer join
table2 b
on b.product_id = a.product_id
... full outer join
tableN c
on c.product_id = coalesce(b.product_id, a.product_id)
The using clause -- supported by Postgres but not SQL Server -- simplifies this. It does assume that the columns all have the same name.
An alternative is a driving table. If you don't have a table of all products handy, you can create one:
select *
from (select product_id from table1 union
select product_id from table2 union
. . .
select product_id from tableN
) driving left join
table1 a
on a.product_id = driving.product_id left join
table2 b
on b.product_id = driving.product_id
... left join
tableN c
on c.product_id = driving.product_id;
This should be easier to interpret and the on clauses are simplified.
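To make the driving-table idea concrete, here is a minimal sketch that runs it against an in-memory SQLite database from Python. The three tables, their extra columns (name, price, color) and the sample rows are made up for illustration; only the shape of the query comes from the answer above.

```python
import sqlite3

# Hypothetical sample data: three tables holding different subsets of products.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (product_id INTEGER, name  TEXT);
    CREATE TABLE table2 (product_id INTEGER, price REAL);
    CREATE TABLE table3 (product_id INTEGER, color TEXT);
    INSERT INTO table1 VALUES (1, 'apple'), (2, 'pear');
    INSERT INTO table2 VALUES (2, 0.5), (3, 1.25);
    INSERT INTO table3 VALUES (3, 'red'), (4, 'blue');
""")

# The UNION builds the list of all product ids once; every real table is then
# LEFT JOINed to that list, so each ON clause only references the driving table.
rows = conn.execute("""
    SELECT driving.product_id, a.name, b.price, c.color
    FROM (SELECT product_id FROM table1 UNION
          SELECT product_id FROM table2 UNION
          SELECT product_id FROM table3) driving
    LEFT JOIN table1 a ON a.product_id = driving.product_id
    LEFT JOIN table2 b ON b.product_id = driving.product_id
    LEFT JOIN table3 c ON c.product_id = driving.product_id
    ORDER BY driving.product_id
""").fetchall()

for row in rows:
    print(row)
```

With this sample data each product id appears on exactly one row, with NULL (None in Python) for the tables that do not contain it.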
Finally, perhaps you just want common columns. If so, just use union all:
select product_id, col1, col2, col3 from table1 union all
select product_id, col1, col2, col3 from table2 union all
. . .
select product_id, col1, col2, col3 from tableN;
This prevents the proliferation of columns with NULL values.

select * from table1
inner join
(
SELECT * FROM table2
UNION ALL
SELECT * FROM table3
UNION ALL ........
SELECT * FROM tableN
) N
on N.product_id = table1.product_id

Related

SQL statement to check if data exists in one table and not in two other tables

I'm creating a temp table (temp1) from table1,
and I want to check whether the data in the temp table is present in table1 and table2.
table1 and table2 have the same columns.
It's difficult to assess exactly what you need without further detail, but you could try a LEFT JOIN and a COUNT to indicate whether there are any matching rows (anything over 0 means there are matches):
SELECT
COUNT(*) AS matching_rows
FROM
(
SELECT
1 AS 'ColumnA'
) AS T1
LEFT OUTER JOIN
(
SELECT
2 AS 'ColumnA'
) AS T2
ON T1.ColumnA = T2.ColumnA
WHERE
T2.ColumnA IS NOT NULL
You can also use an INNER JOIN for this:
SELECT
COUNT(*) AS matching_rows
FROM
(
SELECT
1 AS 'ColumnA'
) AS T1
INNER JOIN
(
SELECT
2 AS 'ColumnA'
) AS T2
ON T1.ColumnA = T2.ColumnA

BigQuery Issue - LEFT OUTER JOIN cannot be used without a condition that is an equality of fields from both sides of the join

I am executing following query and getting "LEFT OUTER JOIN cannot be used without a condition that is an equality of fields from both sides of the join".
select p.*, q.* from (
select a.* from table1 a
left join
(select distinct * from table2) b
on a.name1=b.name2
where a.name1 is not null) p
left join
(SELECT distinct name3, amount FROM table3) q
on q.name3=p.name1 or q.name3=p.name2
How should I resolve it?
Maybe a FULL JOIN with an additional IS NOT NULL condition will fit your case:
select p.*, q.*
from (
select a.* from table1 a
left join
(select distinct * from table2) b
on a.name1=b.name2
where a.name1 is not null
) p
full join
(SELECT distinct name3, amount FROM table3) q
on q.name3=p.name1 or q.name3=p.name2
where p.name1 is not null or p.name2 is not null

Best way to do multiple left outer excluding joins

I have one table that I need to bump against multiple tables with left outer joins excluding the right(s). Is there a best practice for this? Union all the other tables first? Something else?
Here's the first thought that comes to mind for handling this, but I want to know if there is a better, more efficient way.
select
master_table.*
from
master_table
left outer join
(
select customer_id from table_1
union
select customer_id from table_2
union
select customer_id from table_3
union
select customer_id from table_4
) bump_table
on
master_table.customer_id = bump_table.customer_id
where
bump_table.customer_id is null
I should think a NOT EXISTS would be better. It certainly better communicates the intent of the query.
select * from master_table m
where not exists( select 1 from table_1 where m.customer_id=table_1.customer_id)
and not exists( select 1 from table_2 where m.customer_id=table_2.customer_id)
and not exists( select 1 from table_3 where m.customer_id=table_3.customer_id)
and not exists( select 1 from table_4 where m.customer_id=table_4.customer_id)
The basic form is surely faster - similar to the NOT EXISTS that @dbenham already supplied.
SELECT m.*
FROM master_table m
LEFT JOIN table_1 t1 ON t1.customer_id = m.customer_id
LEFT JOIN table_2 t2 ON t2.customer_id = m.customer_id
LEFT JOIN table_3 t3 ON t3.customer_id = m.customer_id
LEFT JOIN table_4 t4 ON t4.customer_id = m.customer_id
WHERE t1.customer_id IS NULL
AND t2.customer_id IS NULL
AND t3.customer_id IS NULL
AND t4.customer_id IS NULL;
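If you want to convince yourself that the NOT EXISTS form and the LEFT JOIN ... IS NULL form return the same rows, a quick check along these lines may help. It is only a sketch: it uses an in-memory SQLite database from Python, two lookup tables instead of four, and invented customer ids. It says nothing about which form is faster on your engine.

```python
import sqlite3

# Hypothetical sample data; table_1 deliberately contains a duplicate id.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE master_table (customer_id INTEGER);
    CREATE TABLE table_1 (customer_id INTEGER);
    CREATE TABLE table_2 (customer_id INTEGER);
    INSERT INTO master_table VALUES (1), (2), (3), (4);
    INSERT INTO table_1 VALUES (1), (1);
    INSERT INTO table_2 VALUES (3);
""")

# Anti-join written with NOT EXISTS.
not_exists_ids = [r[0] for r in conn.execute("""
    SELECT m.customer_id
    FROM master_table m
    WHERE NOT EXISTS (SELECT 1 FROM table_1 t WHERE t.customer_id = m.customer_id)
      AND NOT EXISTS (SELECT 1 FROM table_2 t WHERE t.customer_id = m.customer_id)
    ORDER BY m.customer_id
""")]

# The same anti-join written as LEFT JOINs filtered on IS NULL.
left_join_ids = [r[0] for r in conn.execute("""
    SELECT m.customer_id
    FROM master_table m
    LEFT JOIN table_1 t1 ON t1.customer_id = m.customer_id
    LEFT JOIN table_2 t2 ON t2.customer_id = m.customer_id
    WHERE t1.customer_id IS NULL
      AND t2.customer_id IS NULL
    ORDER BY m.customer_id
""")]

print(not_exists_ids, left_join_ids)
```

Both lists contain only customers 2 and 4, the ones that appear in neither lookup table. The duplicate row in table_1 does not leak into the result because rows with a match are filtered out entirely.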

sql, outer join

I have two tables, linked with an outer join. The relationship between the primary and secondary table is a 1 to [0..n]. The secondary table includes a timestamp column indicating when the record was added. I only want to retrieve the most recent record of the secondary table for each row in the primary. I have to use a group by on the primary table due to other tables also part of the SELECT. There's no way to use a 'having' clause though since this secondary table is not part of the group.
How can I do this without doing multiple queries?
For performance, try to touch each table as few times as possible.
Option 1, OUTER APPLY
SELECT *
FROM
table1 a
OUTER APPLY
(SELECT TOP 1 TimeStamp FROM table2 b
WHERE a.somekey = b.somekey ORDER BY TimeStamp DESC) x
Option 2, Aggregate
SELECT *
FROM
table1 a
LEFT JOIN
(SELECT MAX(TimeStamp) AS maxTs, somekey FROM table2
GROUP BY somekey) x ON a.somekey = x.somekey
Note: each table is mentioned once, no correlated subqueries
Something like:
SELECT a.id, b.*
FROM table1 a
INNER JOIN table2 b ON b.parentid = a.id
WHERE b.timestamp = (SELECT MAX(timestamp) FROM table2 c WHERE c.parentid = a.id)
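Use LEFT JOIN instead of INNER JOIN if you want to show rows for IDs in table1 without any matches in table2; in that case, move the timestamp condition from the WHERE clause into the ON clause, otherwise the WHERE filter discards the unmatched rows again.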
select *
from table1 left outer join table2 a on
table1.id = a.table1_id
where
not exists (select 1 from table2 b where a.table1_id = b.table1_id and b.timestamp > a.timestamp)
The quickest way I know of is this:
SELECT
A.*,
B.SomeField
FROM
Table1 A
INNER JOIN (
SELECT
B1.A_ID,
B1.SomeField
FROM
Table2 B1
LEFT JOIN Table2 B2 ON (B1.A_ID=B2.A_ID) AND (B1.TimeStmp < B2.TimeStmp)
WHERE
B2.A_ID IS NULL
) B ON B.A_ID = A.ID
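Here is a small runnable sketch of that self-join pattern, using SQLite from Python. One deliberate change: the outer join to Table1 is a LEFT JOIN rather than the INNER JOIN above, so that parents with no child rows are kept, which is what the question asks for. The column names (id, a_id, some_field, added_at) and the data are assumptions for the demo.

```python
import sqlite3

# Hypothetical schema: table2.a_id points at table1.id, added_at is an ISO date string.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id INTEGER);
    CREATE TABLE table2 (a_id INTEGER, some_field TEXT, added_at TEXT);
    INSERT INTO table1 VALUES (1), (2), (3);
    INSERT INTO table2 VALUES
        (1, 'old',  '2024-01-01'),
        (1, 'new',  '2024-02-01'),
        (2, 'only', '2024-01-15');
""")

# Inner subquery: keep a table2 row only if no later row exists for the same parent
# (the self LEFT JOIN finds later rows; IS NULL keeps rows that have none).
latest = conn.execute("""
    SELECT a.id, b.some_field
    FROM table1 a
    LEFT JOIN (
        SELECT b1.a_id, b1.some_field
        FROM table2 b1
        LEFT JOIN table2 b2
               ON b1.a_id = b2.a_id AND b1.added_at < b2.added_at
        WHERE b2.a_id IS NULL
    ) b ON b.a_id = a.id
    ORDER BY a.id
""").fetchall()

for row in latest:
    print(row)
```

Note that if two child rows share the same latest timestamp, this pattern returns both of them, so you may need a tie-breaker column in the comparison.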

sql query without outer join key word

Is it possible to write a SQL query where you know you have to use a left outer join, but cannot or are not allowed to use the "outer join" keyword?
I have two tables and want to get rows with null values from the left table. This is pretty simple, but I am not supposed to use the keyword "outer join". I need to write the logic for the outer join myself.
SELECT Field1
FROM table1
WHERE id NOT IN (SELECT id FROM table2)
SELECT Field1
FROM table1
WHERE NOT EXISTS (SELECT * FROM table2 where table2.id = table1.id)
This is something people do, but it is deprecated and does not work correctly (it sometimes returns a cross join instead of a left join), so it should NOT be used. I mention it only so that you avoid this solution:
SELECT Field1
FROM table1, table2 where table1.id *= table2.id
;WITH t1(c,d) AS
(
SELECT 1,'A' UNION ALL
SELECT 2,'B'
),t2(c,e) AS
(
SELECT 1,'C' UNION ALL
SELECT 1,'D' UNION ALL
SELECT 3,'E'
)
SELECT t1.c, t1.d, t2.c, t2.e
FROM t1, t2
WHERE t1.c = t2.c
UNION ALL
SELECT t1.c, t1.d, NULL, NULL
FROM t1
WHERE c NOT IN (SELECT c
FROM t2
WHERE c IS NOT NULL)
Returns
c d c e
----------- ---- ----------- ----
1 A 1 C
1 A 1 D
2 B NULL NULL
(Equivalent to)
SELECT t1.c, t1.d, t2.c, t2.e
FROM t1
LEFT JOIN t2
ON t1.c = t2.c
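A quick way to check the equivalence claimed above is to run both queries on the same data and compare the results. The sketch below does that with SQLite from Python, reusing the t1/t2 sample rows from this answer; the ORDER BY and the Python variable names are additions for the demo.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Same sample rows as the CTEs in the answer above.
cte = """
    WITH t1(c, d) AS (SELECT 1, 'A' UNION ALL SELECT 2, 'B'),
         t2(c, e) AS (SELECT 1, 'C' UNION ALL SELECT 1, 'D' UNION ALL SELECT 3, 'E')
"""

# Left join emulated without the JOIN keyword:
# matched rows first, then the unmatched t1 rows padded with NULLs.
emulated = conn.execute(cte + """
    SELECT t1.c, t1.d, t2.c, t2.e
    FROM t1, t2
    WHERE t1.c = t2.c
    UNION ALL
    SELECT t1.c, t1.d, NULL, NULL
    FROM t1
    WHERE c NOT IN (SELECT c FROM t2 WHERE c IS NOT NULL)
    ORDER BY 1, 4
""").fetchall()

# The real LEFT JOIN for comparison.
left_join = conn.execute(cte + """
    SELECT t1.c, t1.d, t2.c, t2.e
    FROM t1
    LEFT JOIN t2 ON t1.c = t2.c
    ORDER BY 1, 4
""").fetchall()

print(emulated == left_join)
for row in emulated:
    print(row)
```

The inner `WHERE c IS NOT NULL` matters: if the subquery could return a NULL, `NOT IN` would never evaluate to true and the padded rows would disappear.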
For SQL Server, you can just use LEFT JOIN - the OUTER is optional, just like INTO in an INSERT statement.
This is the same for all OUTER JOINs.
For an INNER JOIN you can just specify JOIN with no qualifiers and it is interpreted as an INNER JOIN.
This will give you all the rows in table A that don't have a matching row in table B:
SELECT *
FROM A
WHERE NOT EXISTS (
SELECT 1
FROM B
WHERE A.id = B.id
);
Returns all the matching rows from both tables:
SELECT a.*,b.* FROM table_a a, table_b b
WHERE a.key_field = b.key_field
A potential drawback is that non-matching rows are skipped.