select join result as columns to same row - sql

I have a one-to-many relation in a pg database. I have table A and table B, where rows of B have a foreign key to A.
I want to select certain rows from A and attach certain columns from matching rows of B to same row from A.
E.g.
A
id | created_at |
B
id | created_at | a_id | type |
I tried to do multiple subqueries, e.g.
select A.id,
(select created_at from B where b.a_id = a.id and B.type = 'some_type' limit 1) as some_type_created_at,
(select created_at from B where b.a_id = a.id and B.type = 'another_type' limit 1) as another_type_created_at
from A
But this is obviously ugly and wrong, feels like that. What is the better way of achieving it in Postgres?
Ofcourse I can do join and get the full cartesian product, but I want the result from the db to be directly like this.

There's nothing wrong about using scalar subqueries the way you are doing it. That will work well and will give you the result you want.
Alternatively, you could use lateral table expressions; that will also give you the same result, it's more complex, and in this case I don't see any particular benefit to use them. Lateral queries will take the form:
select
a.id,
b1.created_at as some_type_created_at,
b2.created_at as another_type_created_at
from a
left join lateral (
select created_at from B where b.a_id = a.id and B.type = 'some_type' limit 1
) b1 on true,
left join lateral (
select created_at from B where b.a_id = a.id and B.type = 'another_type' limit 1
) b2 on true
In sum, you are good as you are.

Related

Subqueries vs Multi Table Join

I've 3 tables A, B, C. I want to list the intersection count.
Way 1:-
select count(id) from A a join B b on a.id = b.id join C c on B.id = C.id;
Result Count - X
Way 2:-
SELECT count(id) FROM A WHERE id IN (SELECT id FROM B WHERE id IN (SELECT id FROM C));
Result Count - Y
The result count in each of the query is different. What exactly is wrong?
A JOIN can multiply the number of rows as well as filtering out rows.
In this case, the second count should be the correct one because nothing is double counted -- assuming id is unique in a. If not, it needs count(distinct a.id).
The equivalent using JOIN would use COUNT(DISTINCT):
select count(distinct a.id)
from A a join
B b
on a.id = b.id join
C c
on B.id = C.id;
I mention this for completeness but do not recommend this approach. Multiplying the number of rows just to remove them using distinct is inefficient.
In many databases, the most efficient method might be:
select count(*)
from a
where exists (select 1 from b where b.id = a.id) and
exists (select 1 from c where c.id = a.id);
Note: This assumes there are indexes on the id columns and that id is unique in a.

Query left join without all the right rows from B table

I have 2 tables, A and B.
I need all columns from A + 1 column from B in my select.
Unfortunately, B has multiples rows(all identicals) for 1 row in A
on the join condition.
I tried but I can't isolate one row in A for one row in B with left join for example while keeping my select.
How can I do this query ? Query in ORACLE SQL
Thanks in advance.
This is a good use for outer apply. The structure of the query looks like this:
select a.*, b.col
from a outer apply
(select top 1 b.col
from b
where b.? = a.?
) b;
Normally, you would only use top 1 with order by. In this case, it doesn't seem to make a difference which row you choose.
You can group by on all columns from A, and then use an aggregate (like max or min) to pick any of the identical B values:
select a.*
, b.min_col1
from TableA a
left join
(
select a_id
, min(col1) as min_col1
from TableB
group by
a_id
) b
on b.a_id = a.id

Select table as json array

I have three tables in PostgreSQL: A, B, C.
I want to get a row from table A with a specific id, plus all records from tables B and C with matching id as aggregated JSON.
For example:
Table A Table B Table C
---------------------------------------------------------------
id / colum1 / colum2 id/ colum 1 id / column1
1 someValue, somValue 1 someVal1 1 someVal1
1 someVal2 1 someVal2
The expected output for id = 1 would be:
a.column1 a.column2 ARRAY_JSON_B ARRAY_JSON_C
------------------------------------------------------------------------------
someValue someValue [{colum1:'someVal1'}, [{colum1:'someVal1'},
{colum1:'someVal2'}] {colum1:'someVal2'}]
This requires Postgres 9.3 or later.
Simple case
I suggest to use the simpler json_agg() that's meant for this purpose, in LATERAL joins:
SELECT *
FROM a
LEFT JOIN LATERAL (SELECT json_agg(b) AS array_json_b FROM b WHERE id = a.id) b ON true
LEFT JOIN LATERAL (SELECT json_agg(c) AS array_json_c FROM c WHERE id = a.id) c ON true
WHERE id = 1;
LEFT JOIN LATERAL ... ON true keeps rows in the result that have no match on the left side of the join. Details:
What is the difference between LATERAL JOIN and a subquery in PostgreSQL?
Subtle difference: This query returns NULL where no match is found in b or c, #stas' query with correlated subqueries returns an empty array instead. May or may not be important.
Actual answer
Your example in the question excludes the redundant id column in b and c from the result - which makes sense. To achieve this, you can't use #stas' simple correlated subquery. While it would still work for a single column instead of the whole row, it would lose the column name and produce a simple array. Also, it would not work for more than one column.
Use json_object_agg() for a single selected column (which also allows to chose the tag name freely):
SELECT *
FROM a
LEFT JOIN LATERAL (
SELECT json_object_agg('colum1', colum1) AS array_json_b
FROM b WHERE id = a.id
) b ON true
LEFT JOIN LATERAL (
SELECT json_object_agg('colum1', colum1) AS array_json_c
FROM c WHERE id = a.id
) c ON true
WHERE id = 1;
Or use a subselect for any selection (col1 and col2 in this example):
SELECT *
FROM a
LEFT JOIN LATERAL (
SELECT json_agg(x) AS array_json_b
FROM (SELECT col1, col2 FROM b WHERE id = a.id) x
) b ON true
LEFT JOIN LATERAL (
SELECT json_agg(x) AS array_json_c
FROM (SELECT col1, col2 FROM c WHERE id = a.id) x
) c ON true
WHERE id = 1;
Related:
Return multiple columns of the same row as JSON array of objects
How do I return a jsonb array and array of objects from my data?
select
a.*,
to_json(array(select b from b where b.id = a.id)) array_json_b,
to_json(array(select c from c where c.id = a.id)) array_json_c
from a
where
a.id = 1;
I hope your Postgresql version is 9.3 or higher. There is a clever function to_json which can convert anything to json. So we take an array of all related rows from b and convert it. Same with c.

Efficient way to check if row exists for multiple records in postgres

I saw answers to a related question, but couldn't really apply what they are doing to my specific case.
I have a large table (300k rows) that I need to join with another even larger (1-2M rows) table efficiently. For my purposes, I only need to know whether a matching row exists in the second table. I came up with a nested query like so:
SELECT
id,
CASE cnt WHEN 0 then 'NO_MATCH' else 'YES_MATCH' end as match_exists
FROM
(
SELECT
A.id as id, count(*) as cnt
FROM
A, B
WHERE
A.id = B.foreing_id
GROUP BY A.id
) AS id_and_matches_count
Is there a better and/or more efficient way to do it?
Thanks!
You just want a left outer join:
SELECT
A.id as id, count(B.foreing_id) as cnt
FROM A
LEFT OUTER JOIN B ON
A.id = B.foreing_id
GROUP BY A.id

filter duplicates in SQL join

When using a SQL join, is it possible to keep only rows that have a single row for the left table?
For example:
select * from A, B where A.id = B.a_id;
a1 b1
a2 b1
a2 b2
In this case, I want to remove all except the first row, where a single row from A matched exactly 1 row from B.
I'm using MySQL.
This should work in MySQL:
select * from A, B where A.id = B.a_id GROUP BY A.id HAVING COUNT(*) = 1;
For those of you not using MySQL, you will need to use aggregate functions (like min() or max()) on all the columns (except A.id) so your database engine doesn't complain.
It helps if you specify the keys of your tables when asking a question such as this. It isn't obvious from your example what the key of B might be (assuming it has one).
Here's a possible solution assuming that ID is a candidate key of table B.
SELECT *
FROM A, B
WHERE B.id =
(SELECT MIN(B.id)
FROM B
WHERE A.id = B.a_id);
First, I would recommend using the JOIN syntax instead of the outdated syntax of separating tables by commas. Second, if A.id is the primary key of the table A, then you need only inspect table B for duplicates:
Select ...
From A
Join B
On B.a_id = A.id
Where Exists (
Select 1
From B B2
Where B2.a_id = A.id
Having Count(*) = 1
)
This avoids the cost of counting matching rows, which can be expensive for large tables.
As usual, when comparing various possible solutions, benchmarking / comparing the execution plans is suggested.
select
*
from
A
join B on A.id = B.a_id
where
not exists (
select
1
from
B B2
where
A.id = b2.a_id
and b2.id != b.id
)