Select full outer join from many-to-many relationships - sql

I am trying to do something in MSSQL which I suppose is a fairly simple and common thing in any database with many-to-many relationships. However, I always seem to end up with a quite complicated select query in which I repeat the same conditions several times to get the desired output.
The scenario is like this. I have 2 tables (table A and B) and a cross table with foreign keys to the ID columns of A and B. Each pair of A and B can appear only once in the cross table (I guess the 2 foreign keys make up a primary key in the cross table?). Data in the three tables could look like this:
TABLE A        TABLE B        TABLE AB
ID  Type       ID  Type       AID  BID
------------------------------------------
R   Up         1   IN         R    3
S   DOWN       2   IN         T    3
T   UP         3   OUT        T    5
X   UP         4   OUT        Z    6
Y   DOWN       5   IN
Z   UP         6   OUT
Now let's say I select all rows in A of type UP and all rows in B of type OUT:
SELECT ID FROM A AS A1
WHERE Type = 'UP'
(Result: R, T, X, Z)
SELECT ID FROM B AS B1
WHERE Type = 'OUT'
(Result: 3, 4, 6)
What I want now is to fully outer join these 2 sub queries based on the relations listed in AB. Hence I want all IDs in A1 and B1 to be listed at least once:
A.ID  B.ID
R     3
T     3
null  4
X     null
Z     6
From this results set I want to be able to see:
- Which rows in A1 do not relate to any rows in B1
- Which rows in B1 do not relate to any rows in A1
- Relations between rows in A1 and B1
I have tried a couple of things such as:
SELECT A1.ID, B1.ID
FROM (
    SELECT * FROM A
    WHERE Type = 'UP') AS A1
FULL OUTER JOIN AB ON
    A1.ID = AB.AID
FULL OUTER JOIN (
    SELECT * FROM B
    WHERE Type = 'OUT') AS B1
    ON AB.BID = B1.ID
This doesn't work, since some of the relations listed in AB are between rows in A1 and rows NOT IN B1, or between rows in B1 and rows NOT IN A1.
In other words - I seem to be forced to create a subquery for the AB table also:
SELECT A1.ID, B1.ID
FROM (
    SELECT * FROM A
    WHERE Type = 'UP') AS A1
FULL OUTER JOIN (
    SELECT * FROM AB AS AB1
    WHERE
        AID IN (SELECT ID FROM A WHERE type = 'UP') AND
        BID IN (SELECT ID FROM B WHERE type = 'OUT')
    ) AS AB1 ON
    A1.ID = AB1.AID
FULL OUTER JOIN (
    SELECT * FROM B
    WHERE Type = 'OUT') AS B1
    ON AB1.BID = B1.ID
That just seems like a rather complicated solution for a seemingly simple problem, especially when you consider that A1 and B1 subqueries with more (complex) conditions, possibly involving joins to other tables (one-to-many), would require the same temporary joins and conditions to be repeated in the AB1 subquery.
I am thinking that there must be an obvious way to rewrite the above select statements in order to avoid having to repeat the same conditions several times. The solution is probably right there in front of me, but I just can't see it.
Any help would be appreciated.

I think you could employ a CTE in this case, like this:
;WITH cte AS (
SELECT A.ID AS AID, A.Type AS AType, B.ID AS BID, B.Type AS BType
FROM A FULL OUTER JOIN AB ON A.ID = AB.AID
FULL OUTER JOIN B ON B.ID = AB.BID)
SELECT AID, BID FROM CTE WHERE AType = 'UP' OR BType = 'OUT'
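The advantage of using a CTE is that the three-way join is written only once; you can then add additional criteria to the WHERE clause outside the CTE. Note that because the filter uses OR, a related pair is returned as soon as one side matches, so on the sample data this query also returns the row T, 5.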
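For what it's worth, here is a minimal sketch of one way to state each filter only once and still get exactly the five rows asked for. It is not the CTE from the answer above: it defines A1 and B1 as CTEs and derives AB1 from them. It runs on Python's sqlite3 only to keep the example self-contained, and since older SQLite builds lack FULL OUTER JOIN, the full join is emulated with a LEFT JOIN plus a UNION ALL of the unmatched B1 rows. On SQL Server the final SELECT could presumably be written as A1 FULL OUTER JOIN AB1 FULL OUTER JOIN B1 instead. R's type is written as 'UP' because SQLite compares strings case-sensitively by default.

```python
import sqlite3

# Sample data from the question (R's type normalized to 'UP' for SQLite).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE A  (ID TEXT PRIMARY KEY, Type TEXT);
    CREATE TABLE B  (ID INTEGER PRIMARY KEY, Type TEXT);
    CREATE TABLE AB (AID TEXT, BID INTEGER, PRIMARY KEY (AID, BID));
    INSERT INTO A  VALUES ('R','UP'),('S','DOWN'),('T','UP'),
                          ('X','UP'),('Y','DOWN'),('Z','UP');
    INSERT INTO B  VALUES (1,'IN'),(2,'IN'),(3,'OUT'),
                          (4,'OUT'),(5,'IN'),(6,'OUT');
    INSERT INTO AB VALUES ('R',3),('T',3),('T',5),('Z',6);
""")

query = """
WITH A1  AS (SELECT ID FROM A WHERE Type = 'UP'),    -- filter on A, stated once
     B1  AS (SELECT ID FROM B WHERE Type = 'OUT'),   -- filter on B, stated once
     AB1 AS (SELECT AB.AID, AB.BID                   -- relations with both ends kept
             FROM AB
             JOIN A1 ON A1.ID = AB.AID
             JOIN B1 ON B1.ID = AB.BID)
-- every A1 row, with its related B1 row if there is one
SELECT A1.ID AS AID, AB1.BID AS BID
FROM A1 LEFT JOIN AB1 ON AB1.AID = A1.ID
UNION ALL
-- plus every B1 row that has no related A1 row
SELECT NULL, B1.ID
FROM B1
WHERE NOT EXISTS (SELECT 1 FROM AB1 WHERE AB1.BID = B1.ID)
"""

rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

More complex conditions on A or B would then only need to change in the A1 or B1 CTE; AB1 picks them up through its joins.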

Related

select join result as columns to same row

I have a one-to-many relation in a pg database. I have table A and table B, where rows of B have a foreign key to A.
I want to select certain rows from A and attach certain columns from matching rows of B to same row from A.
E.g.
A
id | created_at |
B
id | created_at | a_id | type |
I tried to do multiple subqueries, e.g.
select A.id,
(select created_at from B where b.a_id = a.id and B.type = 'some_type' limit 1) as some_type_created_at,
(select created_at from B where b.a_id = a.id and B.type = 'another_type' limit 1) as another_type_created_at
from A
But this feels ugly and wrong. What is a better way of achieving it in Postgres?
Of course I can do a join and get the full Cartesian product, but I want the result from the db to come back directly in this shape.
There's nothing wrong about using scalar subqueries the way you are doing it. That will work well and will give you the result you want.
Alternatively, you could use lateral table expressions. They give the same result but are more complex, and in this case I don't see any particular benefit in using them. Lateral queries take the form:
select
    a.id,
    b1.created_at as some_type_created_at,
    b2.created_at as another_type_created_at
from a
left join lateral (
    select created_at from B where b.a_id = a.id and B.type = 'some_type' limit 1
) b1 on true
left join lateral (
    select created_at from B where b.a_id = a.id and B.type = 'another_type' limit 1
) b2 on true
In sum, you are good as you are.
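As a quick sanity check of the scalar subquery form, here is a small sketch that runs the question's query on a few made-up rows. It uses Python's sqlite3 as a stand-in for Postgres (SQLite has no LATERAL, so only the scalar subquery version is shown); the row values are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (id INTEGER PRIMARY KEY, created_at TEXT);
    CREATE TABLE b (id INTEGER PRIMARY KEY, created_at TEXT,
                    a_id INTEGER REFERENCES a(id), type TEXT);
    -- hypothetical sample rows
    INSERT INTO a VALUES (1, '2020-01-01'), (2, '2020-01-02');
    INSERT INTO b VALUES (1, '2020-02-01', 1, 'some_type'),
                         (2, '2020-03-01', 1, 'another_type'),
                         (3, '2020-04-01', 2, 'some_type');
""")

# One scalar subquery per wanted column; a missing match simply yields NULL.
rows = conn.execute("""
    SELECT a.id,
           (SELECT created_at FROM b
             WHERE b.a_id = a.id AND b.type = 'some_type' LIMIT 1)
               AS some_type_created_at,
           (SELECT created_at FROM b
             WHERE b.a_id = a.id AND b.type = 'another_type' LIMIT 1)
               AS another_type_created_at
    FROM a
    ORDER BY a.id
""").fetchall()

for row in rows:
    print(row)
```

Each row of a comes back exactly once, with None where no b row of that type exists. If several b rows can share a type, adding an ORDER BY inside each subquery would make the LIMIT 1 pick deterministic.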

Join two tables, select column based on value in first table

Imagine a table A with two columns "Type" and "Severity", and a table B with columns "Type", "Severity_1", "Severity_2", "Severity_3", "Severity_4".
A.Severity is an integer, and all the B.Severity_* fields contain a description.
I want to query table A for Type and Severity, and also return a third column with the corresponding description from table B.
Currently, I'm using LINQ and have a set of nested IF statements in the select clause. Is there a way to project table B or select out each {Type, Severity, Severity_*} record and union the results?
You should make a view, if possible, like this
select a.Type, a.Severity,
case a.Severity
when 1 then b1.Severity_1
when 2 then b2.Severity_2
when 3 then b3.Severity_3
when 4 then b4.Severity_4
end as Description
from TableA a
left join tableb b1 on a.Severity = 1 and a.Type = b1.Type
left join tableb b2 on a.Severity = 2 and a.Type = b2.Type
left join tableb b3 on a.Severity = 3 and a.Type = b3.Type
left join tableb b4 on a.Severity = 4 and a.Type = b4.Type
Then just query the view in Linq.
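A possible simplification, assuming Type is unique in table B: since all four joins use the same Type match, a single LEFT JOIN with the same CASE expression should return the same rows. This is a variation on the query above, not the answer's exact SQL, and the sample rows are invented for illustration. It runs on Python's sqlite3 just to be self-contained.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE TableA (Type TEXT, Severity INTEGER);
    CREATE TABLE TableB (Type TEXT PRIMARY KEY,
                         Severity_1 TEXT, Severity_2 TEXT,
                         Severity_3 TEXT, Severity_4 TEXT);
    -- hypothetical sample rows
    INSERT INTO TableB VALUES ('disk', 'low', 'medium', 'high', 'critical'),
                              ('net', 'minor', 'moderate', 'major', 'fatal');
    INSERT INTO TableA VALUES ('disk', 2), ('net', 4), ('cpu', 1);
""")

# One join on Type; CASE picks the description column matching the severity.
rows = conn.execute("""
    SELECT a.Type, a.Severity,
           CASE a.Severity
               WHEN 1 THEN b.Severity_1
               WHEN 2 THEN b.Severity_2
               WHEN 3 THEN b.Severity_3
               WHEN 4 THEN b.Severity_4
           END AS Description
    FROM TableA a
    LEFT JOIN TableB b ON a.Type = b.Type
    ORDER BY a.Type
""").fetchall()

for row in rows:
    print(row)
```

The 'cpu' row has no match in TableB, so its Description is None, which is what the LEFT JOIN version in the answer would give as well.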

SQL Logic: When joining Child table B to Parent Table A on A.FID = B.ID

I have been wondering if the results would change in queries that join multiple tables.
If you have parent Table A
A            B
ID | FID     FID
1  | 2       1
2  | 4       2
3  | 5       3
4  | 7       4
5  | 8       5
6  | NULL    6
7  | NULL    7
8  | NULL    8
does it matter which table column you specified in the WHERE clause?
For example, what is the difference between the two:
Select *
From A
Left Join B on A.FID = B.FID
WHERE A.FID IN (2,5,8)
Select *
From A
Left Join B on A.FID = B.FID
WHERE B.ID IN (2,5,8)
Thank you for the help!
EDIT:
Michael has solved my question and I have tested it out:
'Actually, while your answer is a good one (and probably the one he's looking for), since both of his queries are essentially filtering on the primary key of B (A.FID, B.ID), they actually are logically identical (assuming that A.FID is a true foreign key constraint on B). That is, both queries filter out rows in which B.ID is not 2, 5 or 8.' – Michael L.
It is only different if Table B is the main table and you query based on B.ID, as in:
SELECT *
FROM B
LEFT JOIN A ON A.FID = B.FID
WHERE B.FID IN (2,5,8)
While this will be the same as having A as the main table:
SELECT *
FROM B
LEFT JOIN A ON A.FID = B.FID
WHERE A.FID IN (2,5,8)
Yes, it does.
When you use an OUTER JOIN, values from one of the tables may be NULL. So, the second query is equivalent to:
Select *
From A Inner Join
B
on A.FID = B.ID
WHERE B.ID IN (2, 5, 8);
because the NULL values are filtered out.
As a general rule with LEFT JOIN:
Filters on the first table belong in the WHERE clause.
Filters on the second and subsequent tables should go in the ON clause.
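To make the rule concrete, here is a small sketch using the question's sample data, run through Python's sqlite3 for convenience. It assumes B's key column is named ID, as in the query above. The same filter on B is placed first in WHERE and then in ON.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE B (ID INTEGER PRIMARY KEY);
    CREATE TABLE A (ID INTEGER PRIMARY KEY, FID INTEGER REFERENCES B(ID));
    INSERT INTO B VALUES (1),(2),(3),(4),(5),(6),(7),(8);
    INSERT INTO A VALUES (1,2),(2,4),(3,5),(4,7),(5,8),
                         (6,NULL),(7,NULL),(8,NULL);
""")

# Filter on the second table in WHERE: rows where B.ID is NULL or not in
# the list are discarded, so the LEFT JOIN behaves like an INNER JOIN.
where_rows = conn.execute("""
    SELECT A.ID, A.FID, B.ID
    FROM A LEFT JOIN B ON A.FID = B.ID
    WHERE B.ID IN (2, 5, 8)
    ORDER BY A.ID
""").fetchall()

# Same filter in ON: every row of A is kept; B is NULL where the
# combined join condition does not hold.
on_rows = conn.execute("""
    SELECT A.ID, A.FID, B.ID
    FROM A LEFT JOIN B ON A.FID = B.ID AND B.ID IN (2, 5, 8)
    ORDER BY A.ID
""").fetchall()

print(len(where_rows), "rows with the filter in WHERE")
print(len(on_rows), "rows with the filter in ON")
```

With the filter in WHERE only the three matching rows survive; with the filter in ON all eight rows of A are returned.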

Is there alternative way to write this query?

I have tables A, B, C, where A represents items which can have zero or more sub-items stored in C. B table only has 2 foreign keys to connect A and C.
I have this sql query:
select * from A
where not exists (select * from B natural join C where B.id = A.id and C.value > 10);
Which says: "Give me every item from table A where no sub-item has a value greater than 10."
Is there a way to optimize this? And is there a way to write this not using exists operator?
There are three commonly used ways to test if a value is in one table but not another:
NOT EXISTS
NOT IN
LEFT JOIN ... WHERE ... IS NULL
You have already shown code for the first. Here is the second:
SELECT *
FROM A
WHERE id NOT IN (
SELECT b.id
FROM B
NATURAL JOIN C
WHERE C.value > 10
)
And with a left join:
SELECT *
FROM A
LEFT JOIN (
SELECT b.id
FROM B
NATURAL JOIN C
WHERE C.value > 10
) BC
ON A.id = BC.id
WHERE BC.id IS NULL
Depending on the database type and version, the three different methods can result in different query plans with different performance characteristics.
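Here is a rough sketch that runs all three forms side by side on made-up data, using Python's sqlite3. The schema is an assumption: B(id, c_id) links items to sub-items and C(c_id, value) holds the values, so NATURAL JOIN matches on c_id only. One caveat worth knowing: if the NOT IN subquery can return NULL, NOT IN yields no rows at all, whereas the other two forms are not affected in that way.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- hypothetical schema: B links items (id) to sub-items (c_id)
    CREATE TABLE A (id INTEGER PRIMARY KEY);
    CREATE TABLE C (c_id INTEGER PRIMARY KEY, value INTEGER);
    CREATE TABLE B (id INTEGER NOT NULL, c_id INTEGER NOT NULL);
    INSERT INTO A VALUES (1), (2), (3);
    INSERT INTO C VALUES (10, 5), (11, 20), (12, 7);
    INSERT INTO B VALUES (1, 10), (1, 11), (2, 12);  -- item 3 has no sub-items
""")

not_exists = conn.execute("""
    SELECT id FROM A
    WHERE NOT EXISTS (SELECT * FROM B NATURAL JOIN C
                      WHERE B.id = A.id AND C.value > 10)
    ORDER BY id
""").fetchall()

not_in = conn.execute("""
    SELECT id FROM A
    WHERE id NOT IN (SELECT B.id FROM B NATURAL JOIN C
                     WHERE C.value > 10)
    ORDER BY id
""").fetchall()

left_join = conn.execute("""
    SELECT A.id FROM A
    LEFT JOIN (SELECT B.id AS id FROM B NATURAL JOIN C
               WHERE C.value > 10) BC
           ON A.id = BC.id
    WHERE BC.id IS NULL
    ORDER BY A.id
""").fetchall()

print(not_exists, not_in, left_join)
```

Item 1 is excluded because one of its sub-items has value 20; items 2 and 3 are returned by all three queries.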

filter duplicates in SQL join

When using a SQL join, is it possible to keep only rows that have a single row for the left table?
For example:
select * from A, B where A.id = B.a_id;
a1 b1
a2 b1
a2 b2
In this case, I want to remove all except the first row, where a single row from A matched exactly 1 row from B.
I'm using MySQL.
This should work in MySQL:
select * from A, B where A.id = B.a_id GROUP BY A.id HAVING COUNT(*) = 1;
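This relies on MySQL accepting nonaggregated columns alongside GROUP BY; since MySQL 5.7.5 the ONLY_FULL_GROUP_BY mode is enabled by default and will typically reject such a query. On other databases, or with that mode enabled, you will need to use aggregate functions (like min() or max()) on all the columns (except A.id) so your database engine doesn't complain.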
It helps if you specify the keys of your tables when asking a question such as this. It isn't obvious from your example what the key of B might be (assuming it has one).
Here's a possible solution assuming that ID is a candidate key of table B.
SELECT *
FROM A, B
WHERE B.id =
    (SELECT MIN(B2.id)
     FROM B B2
     WHERE A.id = B2.a_id);
First, I would recommend using the JOIN syntax instead of the outdated syntax of separating tables by commas. Second, if A.id is the primary key of the table A, then you need only inspect table B for duplicates:
Select ...
From A
Join B
On B.a_id = A.id
Where Exists (
Select 1
From B B2
Where B2.a_id = A.id
Having Count(*) = 1
)
The following query avoids the cost of counting matching rows, which can be expensive for large tables.
As usual, when comparing possible solutions, benchmarking and comparing the execution plans is suggested.
select
*
from
A
join B on A.id = B.a_id
where
not exists (
select
1
from
B B2
where
A.id = b2.a_id
and b2.id != b.id
)
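A small sketch of the NOT EXISTS variant on invented rows, run with Python's sqlite3 rather than MySQL just to keep it self-contained. The name columns and the sample data are assumptions; a1 has exactly one match in B and a2 has two.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE A (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE B (id INTEGER PRIMARY KEY,
                    a_id INTEGER REFERENCES A(id), name TEXT);
    -- hypothetical sample rows
    INSERT INTO A VALUES (1, 'a1'), (2, 'a2');
    INSERT INTO B VALUES (1, 1, 'b1'), (2, 2, 'b2'), (3, 2, 'b3');
""")

# Keep a joined row only if no *other* B row points at the same A row.
rows = conn.execute("""
    SELECT A.name, B.name
    FROM A
    JOIN B ON A.id = B.a_id
    WHERE NOT EXISTS (
        SELECT 1
        FROM B B2
        WHERE B2.a_id = A.id
          AND B2.id != B.id
    )
""").fetchall()

print(rows)
```

Only the a1/b1 pair is kept; both rows for a2 are dropped because each has a sibling in B.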