SQL summations with multiple outer joins - sql

I have tables a, b, c, and d whereby:
There are 0 or more b rows for each a row
There are 0 or more c rows for each a row
There are 0 or more d rows for each a row
If I try a query like the following:
SELECT a.id, SUM(b.debit), SUM(c.credit), SUM(d.other)
FROM a
LEFT JOIN b on a.id = b.a_id
LEFT JOIN c on a.id = c.a_id
LEFT JOIN d on a.id = d.a_id
GROUP BY a.id
I notice that I have created a cartesian product and therefore my sums are incorrect (much too large).
I see that there are other SO questions and answers, however I'm still not grasping how I can accomplish what I want to do in a single query. Is it possible in SQL to write a query which aggregates all of the following data:
SELECT a.id, SUM(b.debit)
FROM a
LEFT JOIN b on a.id = b.a_id
GROUP BY a.id
SELECT a.id, SUM(c.credit)
FROM a
LEFT JOIN c on a.id = c.a_id
GROUP BY a.id
SELECT a.id, SUM(d.other)
FROM a
LEFT JOIN d on a.id = d.a_id
GROUP BY a.id
in a single query?

Your analysis is correct. Unrelated JOIN create cartesian products.
You have to do the sums separately and then do a final addition. This is doable in one query and you have several options for that:
Sub-requests in your SELECT: SELECT a.id, (SELECT SUM(b.debit) FROM b WHERE b.a_id = a.id) + ...
CROSS APPLY with a similar query as the first bullet then SELECT a.id, b_sum + c_sum + d_sum
UNION ALL as you suggested with an outer SUM and GROUP BY on top of that.
LEFT JOIN to similar subqueries as above.
And probably more... The performance of the various solutions might be slightly different depending on how many rows in A you want to select.

SELECT a.ID, debit, credit, other
FROM a
LEFT JOIN (SELECT a_id, SUM(b.debit) as debit
FROM b
GROUP BY a_id) b ON a.ID = b.a_id
LEFT JOIN (SELECT a_id, SUM(b.credit) as credit
FROM c
GROUP BY a_id) c ON a.ID = c.a_id
LEFT JOIN (SELECT a_id, SUM(b.other) as other
FROM d
GROUP BY a_id) d ON a.ID = d.a_id

Can also be done with correlated subqueries:
SELECT a.id
, (SELECT SUM(debit) FROM b WHERE a.id = b.a_id)
, (SELECT SUM(credit) FROM c WHERE a.id = c.a_id)
, (SELECT SUM(other) FROM d WHERE a.id = d.a_id)
FROM a

Related

Simplifying code with nested JOINs in WHERE clause

I'm writing an SQL statement in PostgreSQL where I'm JOINing data from different tables that are each connected by foreign keys on their ids. Table b has a field a_id which relates to the id of table a and so on.
My problem is that I want to reuse a value from the joined table in a WHERE clause without having to do all the JOINs again, like this:
SELECT *
FROM a
INNER JOIN b ON b.a_id = a.id
INNER JOIN c ON c.b_id = b.id
WHERE a.id = 3
AND a.x =
(SELECT c.y
FROM a
INNER JOIN b ON b.a_id = a.id
INNER JOIN c ON c.b_id = b.id
WHERE a.id = 3
AND c.id = 5)
I bet there's a simpler solution for this snippet that I'm just not realising. I'll be glad if anybody can help me out.
I don't have a silver bullet answer which simplifies your query, but CTEs certainly could make it a bit easier on the eyes:
WITH cte AS (
SELECT *
FROM a
INNER JOIN b ON b.a_id = a.id
INNER JOIN c ON c.b_id = b.id
WHERE a.id = 3
)
SELECT *
FROM cte
WHERE x IN (SELECT y FROM cte WHERE c_id = 5);
My aliases or column names may be off, and you may need to tidy up the CTE a bit before it would actually work for you.
You can use window functions for this:
SELECT . . .
FROM (SELECT a.*, b.*, c.*, -- should list the columns explicitly
MAX(c.y) FILTER (WHERE c.id = 5) OVER () as y_5
FROM a INNER JOIN
b
ON b.a_id = a.id INNER JOIN
c
ON c.b_id = b.id
WHERE a.id = 3
) abc
WHERE abc.x = abc.y_5;
I hope the below-mentioned query will help you.
SELECT *
FROM a
INNER JOIN b ON b.a_id = a.id
INNER JOIN c ON c.b_id = b.id
WHERE a.id = 3 and c.id =5 and a.x = c.y

issue in sql join query formation

I have two tables Say A and B. A is master table and B is child table, from which I need values as below.
select A.Id, A.Name, B.Path from A,B where A.Id=B.Id
Now, I want to add column of 3rd table which is child of table 'B', say C i.e. C.File.
The value of C.File will be null if C.SubId=B.SubId is false else will return value when condition becomes true.
This is the exact definition of a left join:
SELECT a.id, b.name, b.path, c.file
FROM a
JOIN b ON a.id = b.id
LEFT JOIN c ON b.subid = c.subid
You need to LEFT JOIN your third table from what I can gather.
SELECT A.Id, A.Name, B.Path, C.file
FROM tableA a
INNER JOIN tableB b ON a.id = b.id
LEFT JOIN tableC c ON b.subid = c.subid
Simply Join all the three tables using INNER JOIN
select A.Id, A.Name, B.Path ,C.File
FROM A
INNER JOIN B
ON A.Id=B.Id
INNER JOIN C
ON C.SubId=B.SubId

Left Join Multiple Tables and Avoid Duplicates

I have two tables with a 1:n relationship to my base table, both of which I want to LEFT JOIN.
-------------------------------
Table A Table B Table C
-------------------------------
|ID|DATA| |ID|DATA| |ID|DATA|
-------------------------------
1 A1 1 B1 1 C1
- - 1 C2
I'm using:
SELECT * FROM TableA a
LEFT JOIN TableB b
ON a.Id = b.Id
LEFT JOIN TableC c
ON a.Id = c.Id
But this is showing duplicates for TableB:
1 A1 B1 C1
1 A1 B1 C2
How can I write this join to ignore the duplicates? Such as:
1 A1 B1 C1
1 A1 null C2
I think you need to do logic to get what you want. You want for any multiple b.ids to eliminate them. You can identify them using row_number() and then use case logic to make subsequent values NULL:
select a.id, a.val,
(case when row_number() over (partition by b.id, b.seqnum order by b.id) = 1 then val
end) as bval
c.val as cval
from TableA a left join
(select b.*, row_number() over (partition by b.id order by b.id) as seqnum
from tableB b
) b
on a.id = b.id left join
tableC c
on a.id = c.id
I don't think you want a full join between B and C, because you will get multiple rows. If B has 2 rows for an id and C has 3, then you will get 6. I suspect that you just want 3. To achieve this, you want to do something like:
select *
from (select b.*, row_number() over (partition by b.id order by b.id) as seqnum
from TableB b
) b
on a.id = b.id full outer join
(select c.*, row_number() over (partition by c.id order by c.id) as seqnum
from TableC c
) c
on b.id = c.id and
b.seqnum = c.seqnum join
TableA a
on a.id = b.id and a.id = c.id
This is enumerating the "B" and "C" lists, and then joining them by position on the list. It uses a full outer join to get the full length of the longer list.
The last join references both tables so TableA can be used as a filter. Extra ids in B and C won't appear in the results.
Do you want to use distinct
SELECT distinct * FROM TableA a
LEFT JOIN TableB b
ON a.Id = b.Id
LEFT JOIN TableC c
ON a.Id = c.Id
Do it as a UNION, i.e.
SELECT TableA.ID, TableB.ID, TableC.Id
FROM TableA a
INNER JOIN TableB b ON a.Id = b.Id
LEFT JOIN TableC c ON a.Id = c.Id
UNION
SELECT TableA.ID, Null, TableC.Id
FROM TableA a
LEFT JOIN TableC c ON a.Id = c.Id
i.e. one SELECT to being back the first row and another to bring back the second row. It's a bit rough because I don't know anything about the data you are trying to read but the principle is sound. You may need to rework it a bit.

How many B and C has A?

I have this tables:
A:
id
1
2
B:
id a_id
1 1
2 1
3 1
C:
id a_id
1 1
2 1
3 2
I need this result:
A, CountB, CountC
1, 3, 2
2, 0, 1
This try doesnt work fine:
SELECT
A.id, COUNT(B.id), COUNT(C.id)
FROM
A
LEFT JOIN
B ON A.id = B.a_id
LEFT JOIN
C ON A.id = C.a_id
GROUP BY A.id
How must be the sql sentence without using correlative queries?
The following variation on yours should work:
SELECT A.id, COUNT(distinct B.id), COUNT(distinct C.id)
FROM A LEFT JOIN
B
ON A.id = B.a_id LEFT JOIN
C
ON A.id = C.a_id
GROUP BY A.id
However, there are those (such as myself) who feel that using count distinct is a cop-out. The problem is that the rows from B and from C are interfering with each other, multiplying in the join. So, you can also do each join independently, and then put the results together:
select ab.id, cntB, cntC
from (select a.id, count(*) as cntB
from A left outer join
B
on A.id = B.a_id
group by a.id
) ab join
(select a.id, count(*) as cntC
from A left outer join
C
on A.id = C.a_id
group by a.id
) ac
on ab.id = ac.id
For just counting, the first form is fine. If you need to do other summarizations (say, summing a value), then you generally need to split into the component queries.

how to eliminate duplicate results from joined sql statement

i have four tables which are related like this:
TABLE A: 1 to many with TABLE B
TABLE B: 1 to many with TABLE C
TABLE D: many to 1 with TABLE B
I would like to create a result set which contains no duplicates.
SELECT A.f1
A.f2
B.f1
C.f1
D.f1
from A LEFT JOIN (B INNER JOIN D on D.fk_b = B.id) on A.id = B.fk_a
LEFT JOIN C on C.fk_b = B.id
Some but not all records are duplicated. Only the records from TABLE D which have the same FK to TABLE B.
If there are 2 the same D.fk_B fields then i would like to select only the first one. My INNER JOIN would become something like:
B INNER JOIN TOP 1 D on.....
But that's not possible?
Thank you!
Assuming that the structure in question is near the one shown bellow,
The following SELECT statement will show you just the primary keys, briefly giving an idea about what data is being returned.
SELECT
A.id AS `A_id`,
B.id AS `B_id`,
COALESCE(C.id, 0) AS `C_id`,
COALESCE(
(SELECT D.id FROM D WHERE D.fk_b = B.id LIMIT 1), 0
) AS `D_id`
FROM B
INNER JOIN A ON (B.fk_a = A.id)
LEFT JOIN C ON (B.id = C.fk_b);
Now the select you've asked for:
SELECT
A.f1 AS `A_f1`,
A.f2 AS `A_f2`,
B.f1 AS `B_f1`,
COALESCE(C.f1, '-') AS `C_f1`,
COALESCE(
(SELECT D.f1 FROM D WHERE D.fk_b = B.id LIMIT 1), '-'
) AS `D_f1`
FROM B
INNER JOIN A ON (B.fk_a = A.id)
LEFT JOIN C ON (B.id = C.fk_b);
Code above was tested and worked fine on MySQL 5.6
It appears that you have cartesian join.
You could try just putting DISTINCT infront of you column definition or you could so subqueries to link the tables together.
Try
SELECT DISTINCT A.f1
A.f2
B.f1
C.f1
D.f1
from A LEFT JOIN (B INNER JOIN D on D.fk_b = B.id) on A.id = B.fk_a
LEFT JOIN C on C.fk_b = B.id
You can do this:
B INNER JOIN TOP 1 D on.....
Like this
B INNER JOIN (SELECT TOP 1 fields from table) D ON ...
Try this:
SELECT DISTINCT A.f1
A.f2
B.f1
C.f1
D.f1
from A
LEFT JOIN (B INNER JOIN (SELECT DISTINCT fk_b FROM D) D on D.fk_b = B.id)
on A.id = B.fk_a
LEFT JOIN C on C.fk_b = B.id