How to write an SQL SELECT query for the following - sql

I have the following tables:
Table 1: person
columns: id, name, address, code
Table 2: carDetails
columns: id, person_id, car_brand
constraints: FK ==> carDetails(person_id) references person(id)
Note: carDetails can have multiple rows for a single person
Table 3: mobileDetails
columns: id, person_id, mobile_brand
constraints: FK ==> mobileDetails(person_id) references person(id)
Note: mobileDetails can have multiple rows for a single person
Similarly, I have many more detail tables like car and mobile for a person.
What I want to select is:
person(id),
person(name),
Array of carDetails(car_brand) belonging to that particular person(id)
Array of mobileDetails(mobile_brand) belonging to that particular person(id)

You should write this query using subqueries for the aggregation:
select p.*, c.car_brands, m.mobile_brands
from person p left join
(select c.person_id, array_agg(c.car_brand) as car_brands
from carDetails c
group by c.person_id
) c
on c.person_id = p.id left join
(select m.person_id, array_agg(m.mobile_brand) as mobile_brands
from mobileDetails m
group by m.person_id
) m
on m.person_id = p.id;
Two notes:
You want to use left join, in case some people have no rows in one of the detail tables.
You want to aggregate before joining to avoid duplicates. Although you could add distinct to array_agg(), that incurs a performance penalty and also removes brands that are legitimately repeated.
If you are filtering the people, it is often more efficient to do this using a correlated subquery or (equivalently) a lateral join:
select p.*,
(select array_agg(c.car_brand)
from carDetails c
where c.person_id = p.id
) as car_brands,
(select array_agg(m.mobile_brand)
from mobileDetails m
where m.person_id = p.id
) as mobile_brands
from person p;
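To see why the aggregation has to happen before the join, here is a minimal sketch using Python's built-in sqlite3 module. The sample rows are made up, and because SQLite has no array_agg(), plain counts stand in for the arrays so the duplication is easy to see. It is an illustration under those assumptions, not the Postgres query itself.

```python
import sqlite3

# Hypothetical sample data: Ann has 2 cars and 3 mobiles, Bob has neither.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE carDetails (id INTEGER PRIMARY KEY, person_id INTEGER, car_brand TEXT);
CREATE TABLE mobileDetails (id INTEGER PRIMARY KEY, person_id INTEGER, mobile_brand TEXT);
INSERT INTO person VALUES (1, 'Ann'), (2, 'Bob');
INSERT INTO carDetails VALUES (1, 1, 'Audi'), (2, 1, 'BMW');
INSERT INTO mobileDetails VALUES (1, 1, 'Nokia'), (2, 1, 'Sony'), (3, 1, 'LG');
""")

# Joining both detail tables first pairs every car with every mobile,
# so each brand is repeated in the joined rows.
naive_rows = conn.execute("""
SELECT p.id, COUNT(c.car_brand), COUNT(m.mobile_brand)
FROM person p
LEFT JOIN carDetails c ON c.person_id = p.id
LEFT JOIN mobileDetails m ON m.person_id = p.id
GROUP BY p.id
ORDER BY p.id
""").fetchall()

# Aggregating each detail table on its own yields one row per person,
# so the outer joins cannot multiply anything.
pre_agg_rows = conn.execute("""
SELECT p.id, c.n_cars, m.n_mobiles
FROM person p
LEFT JOIN (SELECT person_id, COUNT(*) AS n_cars
           FROM carDetails GROUP BY person_id) c ON c.person_id = p.id
LEFT JOIN (SELECT person_id, COUNT(*) AS n_mobiles
           FROM mobileDetails GROUP BY person_id) m ON m.person_id = p.id
ORDER BY p.id
""").fetchall()

print(naive_rows)    # counts inflated by the 2 x 3 pairing
print(pre_agg_rows)  # true counts; NULL (None) for a person with no details
```

With array_agg() in place of COUNT(), the naive form would return each brand several times, which is the duplication the notes above warn about.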

Related

Subtracting values of columns from two different tables

I would like to take values from one table column and subtract those values from another column from another table.
I was able to achieve this by joining those tables and then subtracting both columns from each other.
Data from first table:
SELECT max_participants FROM courses ORDER BY id;
Data from second table:
SELECT COUNT(id) FROM participations GROUP BY course_id ORDER BY course_id;
Here is some code:
SELECT max_participants - participations AS free_places FROM
(
SELECT max_participants, COUNT(participations.id) AS participations
FROM courses
INNER JOIN participations ON participations.course_id = courses.id
GROUP BY courses.max_participants, participations.course_id
ORDER BY participations.course_id
) AS course_places;
In general it works, but I was wondering whether there is a way to make it simpler, or whether my approach is incorrect and this code will fail under some conditions. Maybe it needs to be optimized.
I have read that you should not rely on the natural order of a result set in a database, and that made me doubt my approach.
If you want the values per course, I would recommend:
SELECT c.id, (c.max_participants - COUNT(p.id)) AS free_places
FROM courses c LEFT JOIN
participations p
ON p.course_id = c.id
GROUP BY c.id, c.max_participants
ORDER BY 1;
Note the LEFT JOIN to be sure all courses are included, even those with no participants.
The overall number is a little trickier. One method is to use the above as a subquery. Alternatively, you can pre-aggregate each table:
select c.max_participants - p.num_participants as free_places
from (select sum(max_participants) as max_participants from courses) c cross join
(select count(*) as num_participants from participations) p;
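As a rough check of the points above, the following sketch runs the per-course query with both join types and then the pre-aggregated total, using Python's sqlite3 module and made-up rows (course 2 has no participations). The data is hypothetical; only the table and column names come from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE courses (id INTEGER PRIMARY KEY, max_participants INTEGER);
CREATE TABLE participations (id INTEGER PRIMARY KEY, course_id INTEGER);
INSERT INTO courses VALUES (1, 10), (2, 5);          -- hypothetical data
INSERT INTO participations VALUES (1, 1), (2, 1), (3, 1);
""")

PER_COURSE = """
SELECT c.id, c.max_participants - COUNT(p.id) AS free_places
FROM courses c {join} participations p ON p.course_id = c.id
GROUP BY c.id, c.max_participants
ORDER BY c.id
"""

# INNER JOIN silently drops the course that has no participations.
inner_rows = conn.execute(PER_COURSE.format(join="INNER JOIN")).fetchall()
# LEFT JOIN keeps it; COUNT(p.id) ignores the NULL and returns 0.
left_rows = conn.execute(PER_COURSE.format(join="LEFT JOIN")).fetchall()

# Overall number: aggregate each table to a single row, then combine.
total_free = conn.execute("""
SELECT c.max_participants - p.num_participants
FROM (SELECT SUM(max_participants) AS max_participants FROM courses) c
CROSS JOIN (SELECT COUNT(*) AS num_participants FROM participations) p
""").fetchone()[0]

print(inner_rows, left_rows, total_free)
```

Selecting c.id alongside the result and ordering by it also addresses the worry about natural ordering: each value is tied to its course explicitly instead of by position.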

SQL - count of number of times a foreign key appears in a table

Schema Info:
3 tables are concerned: SIGHTING, SPOTTER, AND ORG
SIGHTING references SPOTTER through FK SpotterId.
SPOTTER references ORG through FK OrgId.
I would like a query to return two columns; one for a list of ORG.OrgName, and another for the respective total count of Spotter_ID appearances in SIGHTINGS for the corresponding ORG.OrgID.
What I have done below returns the incorrect counts for each row returned:
SELECT ORG.ORG_NAME AS ORG_NAME,
(SELECT COUNT(SIGHTINGS.SPOTTER_ID)
FROM SIGHTINGS
, SPOTTERS
WHERE SIGHTINGS.SPOTTER_ID = SPOTTERS.SPOTTER_ID
AND SPOTTERS.ORG_ID=ORG.ORG_ID) AS ORG_COUNT
FROM ORG;
It seems that you only need aggregation:
SELECT COUNT(1), orgName
FROM SIGHTING
INNER JOIN SPOTTER USING (spotterId)
INNER JOIN ORG USING (orgId)
GROUP BY orgName
This is simple aggregation, but you only need one JOIN:
select o.orgname, count(*) as numSpotters
from org o join
spotters s
on o.orgId = s.orgId
group by o.orgname;
Note: If you want all organizations, even those with no spotters, then use left join instead of join.

LEFT JOIN across three tables (with junction table)

In Postgres, is there a way to perform a left join between tables linked by a junction table, with some filtering on the linked table?
Say, I have two tables, humans and pets, and I want to perform a query where I have the human ID, and the pet name. If the human ID exists, but they don't have a pet with that name, I still want the human's row to be returned.
If I had a FK relationship from pets to humans, this would work:
select h.*, p.*
from humans as h
left join pets as p on p.human_id = h.id and p.name = 'fluffy'
where h.id = 13
and I'd get a row with human 13's details, and fluffy's values. In addition, if human 13 didn't have a pet named 'fluffy', I'd get a row with human 13's values, and empty values for the pet's columns.
BUT, I don't have a direct FK relationship, I have a junction table between humans and pets, so I'm trying a query like:
select h.*, p.*
from humans as h
left join humans_pets_junction as j on j.human_id = h.id
left join pets as p on j.pet_id = p.id and p.name = 'fluffy'
where h.id = 13
Which returns rows for all of human 13's pets, with empty columns except for fluffy's row.
If I add p.name = 'fluffy' to the WHERE clause, that filters out all the empty rows, but also means I get 0 rows if human 13 doesn't have a pet named fluffy at all.
Is there a way to replicate the behavior of the FK-style left join, but when used with a junction table?
One method is to do the comparison in the where clause:
select h.*, p.*
from humans as h left join
humans_pets_junction as j
on j.human_id = h.id left join
pets as p
on j.pet_id = p.id and p.name = 'fluffy'
where h.id = 13 and (p.name = 'fluffy' or p.id is null);
Alternatively, join the junction table and the pets table as a subquery or CTE:
select h.*, pj.*
from humans h left join
(select j.human_id, p.id as pet_id, p.name as pet_name
from humans_pets_junction j join
pets p
on j.pet_id = p.id and p.name = 'fluffy'
) pj
on pj.human_id = h.id
where h.id = 13;
In Postgres you can use parentheses to prioritize JOIN order. You do not need a subquery:
SELECT h.*, p.id AS p_id, p.name AS pet_name
FROM humans h
LEFT JOIN (pets p
JOIN humans_pets_junction j ON p.name = 'fluffy'
AND j.pet_id = p.id
AND j.human_id = 13) ON TRUE
WHERE h.id = 13;
Per documentation:
Parentheses can be used around JOIN clauses to control the join order.
In the absence of parentheses, JOIN clauses nest left-to-right.
I added the predicate j.human_id = 13 to the join between your junction table and the pets to eliminate irrelevant rows at the earliest opportunity. The outer LEFT JOIN only needs the dummy condition ON TRUE.
Aside 1: I assume you are aware that you have a textbook implementation of an n:m (many-to-many) relationship? See:
How to implement a many-to-many relationship in PostgreSQL?
Aside 2: The unfortunate naming convention in the example makes it necessary to deal out column aliases. Don't use "id" and "name" as column names in your actual tables to avoid such conflicts. Use proper names like "pet_id", "human_id" etc.
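The idea of joining the junction table to pets first, and only then left-joining the result to humans, can be tried out with the sketch below. It uses Python's sqlite3 module with hypothetical rows (human 13 owns fluffy and rex, human 14 owns only tom) and the subquery form rather than the parenthesized join, since that form is the most portable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE humans (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE pets (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE humans_pets_junction (human_id INTEGER, pet_id INTEGER);
INSERT INTO humans VALUES (13, 'Hal'), (14, 'Ida');          -- hypothetical data
INSERT INTO pets VALUES (1, 'fluffy'), (2, 'rex'), (3, 'tom');
INSERT INTO humans_pets_junction VALUES (13, 1), (13, 2), (14, 3);
""")

# The inner join keeps only junction rows that point at a pet named fluffy,
# so the outer LEFT JOIN sees at most the matching pet per human.
QUERY = """
SELECT h.id, pj.pet_name
FROM humans h
LEFT JOIN (
    SELECT j.human_id, p.name AS pet_name
    FROM humans_pets_junction j
    JOIN pets p ON p.id = j.pet_id AND p.name = 'fluffy'
) pj ON pj.human_id = h.id
WHERE h.id = ?
"""

with_fluffy = conn.execute(QUERY, (13,)).fetchall()     # human owns fluffy and rex
without_fluffy = conn.execute(QUERY, (14,)).fetchall()  # human owns only tom

print(with_fluffy)     # one row, no extra row for rex
print(without_fluffy)  # one row with NULL (None) in the pet column
```

Both cases return exactly one row per human, which matches the behavior of the direct FK-style left join in the question.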

Left outer join two levels deep in Postgres results in cartesian product

Given the following 4 tables:
CREATE TABLE events ( id, name )
CREATE TABLE profiles ( id, event_id )
CREATE TABLE donations ( amount, profile_id )
CREATE TABLE event_members( id, event_id, user_id )
I'm attempting to get a list of all events, along with a count of any members, and a sum of any donations. The issue is the sum of donations is coming back wrong (appears to be a cartesian result of donations * # of event_members).
Here is the SQL query (Postgres)
SELECT events.name, COUNT(DISTINCT event_members.id), SUM(donations.amount)
FROM events
LEFT OUTER JOIN profiles ON events.id = profiles.event_id
LEFT OUTER JOIN donations ON donations.profile_id = profiles.id
LEFT OUTER JOIN event_members ON event_members.event_id = events.id
GROUP BY events.name
The sum(donations.amount) is coming back equal to the actual sum of donations multiplied by the number of rows in event_members. If I comment out the count(distinct event_members.id) and the event_members left outer join, the sum is correct.
As I explained in an answer to the referenced question you need to aggregate before joining to avoid a proxy CROSS JOIN. Like:
SELECT e.name, e.sum_donations, m.ct_members
FROM (
SELECT e.id AS event_id, e.name, SUM(d.amount) AS sum_donations
FROM events e
LEFT JOIN profiles p ON p.event_id = e.id
LEFT JOIN donations d ON d.profile_id = p.id
GROUP BY 1, 2
) e
LEFT JOIN (
SELECT m.event_id, count(DISTINCT m.id) AS ct_members
FROM event_members m
GROUP BY 1
) m USING (event_id);
If event_members.id is the primary key, then id is guaranteed to be UNIQUE in the table and you can drop DISTINCT from the count:
count(*) AS ct_members
You seem to have these two independent structures (-[ means 1-N association):
events -[ profiles -[ donations
events -[ event members
I wrapped the second one into a subquery:
SELECT events.name,
member_count.the_member_count,
SUM(donations.amount)
FROM events
LEFT OUTER JOIN profiles ON events.id = profiles.event_id
LEFT OUTER JOIN donations ON donations.profile_id = profiles.id
LEFT OUTER JOIN (
SELECT
event_id,
COUNT(*) AS the_member_count
FROM event_members
GROUP BY event_id
) AS member_count
ON member_count.event_id = events.id
GROUP BY events.name, member_count.the_member_count
Of course you get a cartesian product between donations and event_members for every event: both are bound only to the event, and there is no join relation between donations and event_members other than the event id, which means that every member matches every donation.
When you run your query, you ask for all events (say there are two, event Alpha and event Beta) and then JOIN with the members. Let's say that there is a member Alice who participates in both events.
SELECT events.name, COUNT(DISTINCT event_members.id), SUM(donations.amount)
FROM events
LEFT OUTER JOIN profiles ON events.id = profiles.event_id
LEFT OUTER JOIN donations ON donations.profile_id = profiles.id
LEFT OUTER JOIN event_members ON event_members.event_id = events.id
GROUP BY events.name
On each row you asked for the total of Alice's donations. If Alice donated 100 USD, then you asked for:
Alpha Alice 100USD
Beta Alice 100USD
So it's not surprising that, when you ask for the sum total, Alice comes out as having donated 200 USD.
If you want the sum of all donations, you would be better off using two separate queries. Trying to do everything in a single query, while possible, would be a classic SQL antipattern (actually the one in chapter 18 of "SQL Antipatterns", "Spaghetti Query"):
Unintended Products
One common consequence of producing all your
results in one query is a Cartesian product. This happens when two of
the tables in the query have no condition restricting their
relationship. Without such a restriction, the join of two tables pairs
each row in the first table to every row in the other table. Each such
pairing becomes a row of the result set, and you end up with many more
rows than you expect.
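The inflated sum is easy to reproduce. The sketch below uses Python's sqlite3 module with hypothetical rows (one event, two donations totalling 150, three members) and compares the original query with a version that aggregates event_members before joining.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE events (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE profiles (id INTEGER PRIMARY KEY, event_id INTEGER);
CREATE TABLE donations (amount INTEGER, profile_id INTEGER);
CREATE TABLE event_members (id INTEGER PRIMARY KEY, event_id INTEGER, user_id INTEGER);
INSERT INTO events VALUES (1, 'Alpha');                        -- hypothetical data
INSERT INTO profiles VALUES (1, 1);
INSERT INTO donations VALUES (100, 1), (50, 1);
INSERT INTO event_members VALUES (1, 1, 7), (2, 1, 8), (3, 1, 9);
""")

# Original shape: every donation row is repeated once per member.
naive = conn.execute("""
SELECT e.name, COUNT(DISTINCT m.id), SUM(d.amount)
FROM events e
LEFT JOIN profiles p ON p.event_id = e.id
LEFT JOIN donations d ON d.profile_id = p.id
LEFT JOIN event_members m ON m.event_id = e.id
GROUP BY e.name
""").fetchall()

# Members are collapsed to one row per event before the join,
# so the donation rows are no longer multiplied.
fixed = conn.execute("""
SELECT e.name, m.ct_members, SUM(d.amount)
FROM events e
LEFT JOIN profiles p ON p.event_id = e.id
LEFT JOIN donations d ON d.profile_id = p.id
LEFT JOIN (
    SELECT event_id, COUNT(*) AS ct_members
    FROM event_members
    GROUP BY event_id
) m ON m.event_id = e.id
GROUP BY e.id, e.name, m.ct_members
""").fetchall()

print(naive)  # sum is 150 * 3 members
print(fixed)  # sum is the real 150
```

Note that COUNT(DISTINCT ...) hides the duplication for the member count, which is why only the sum looks wrong in the original query.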

SQL JOIN using a mapping table

I have three tables:
COLLECTION
PERSON
PERSON_COLLECTION
where PERSON_COLLECTION is a mapping table id|person_id|collection_id
I now want to select all entries in collection and order them by person.name.
Do I have to join the separate tables with the mapping table first and then do a join again on the results?
SELECT
c.*,
p.Name
FROM
Collection c
JOIN Person_Collection pc ON pc.collection_id = c.id
JOIN Person p ON p.id = pc.person_id
ORDER BY p.Name
Not sure without the table schema, but my take is:
SELECT
c.*,
p.*
FROM
Person_Collection pc
LEFT JOIN Collection c
ON pc.collection_id = c.id
LEFT JOIN Person p
ON pc.person_id = p.id
ORDER BY p.name
The order in which you join the tables won't break the query, but depending on which SQL product you're using it may affect performance.
You need to decide whether you want ALL records from both/either table or only records that have a matching mapping entry; this determines the type of join you need to use.
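The difference between the join types can be seen in a small sketch with Python's sqlite3 module. The rows and the title column are hypothetical; one collection deliberately has no entry in the mapping table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE collection (id INTEGER PRIMARY KEY, title TEXT);   -- title is a made-up column
CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE person_collection (id INTEGER PRIMARY KEY, person_id INTEGER, collection_id INTEGER);
INSERT INTO collection VALUES (1, 'Stamps'), (2, 'Coins'), (3, 'Orphaned');
INSERT INTO person VALUES (1, 'Zoe'), (2, 'Adam');
INSERT INTO person_collection VALUES (1, 1, 1), (2, 2, 2);
""")

QUERY = """
SELECT c.title, p.name
FROM collection c
{join} person_collection pc ON pc.collection_id = c.id
{join} person p ON p.id = pc.person_id
ORDER BY p.name
"""

# Inner joins return only collections that have a mapping entry.
inner_rows = conn.execute(QUERY.format(join="JOIN")).fetchall()
# Left joins starting from collection keep every collection; unmapped ones
# get NULL for the person. Where NULLs sort differs between databases.
left_rows = conn.execute(QUERY.format(join="LEFT JOIN")).fetchall()

print(inner_rows)
print(left_rows)
```

If "all entries in collection" should include collections nobody owns, the LEFT JOIN form starting from collection is the one to use.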