LEFT JOIN across three tables (with junction table)

In Postgres, is there a way to perform a left join between tables linked by a junction table, with some filtering on the linked table?
Say, I have two tables, humans and pets, and I want to perform a query where I have the human ID, and the pet name. If the human ID exists, but they don't have a pet with that name, I still want the human's row to be returned.
If I had a FK relationship from pets to humans, this would work:
select h.*, p.*
from humans as h
left join pets as p on p.human_id = h.id and p.name = 'fluffy'
where h.id = 13
and I'd get a row with human 13's details, and fluffy's values. In addition, if human 13 didn't have a pet named 'fluffy', I'd get a row with human 13's values, and empty values for the pet's columns.
BUT, I don't have a direct FK relationship, I have a junction table between humans and pets, so I'm trying a query like:
select h.*, p.*
from humans as h
left join humans_pets_junction as j on j.human_id = h.id
left join pets as p on j.pet_id = p.id and p.name = 'fluffy'
where h.id = 13
Which returns rows for all of human 13's pets, with empty columns except for fluffy's row.
If I add p.name = 'fluffy' to the WHERE clause, that filters out all the empty rows, but also means I get 0 rows if human 13 doesn't have a pet named fluffy at all.
Is there a way to replicate the behavior of the FK-style left join, but when used with a junction table?

One method is to do the comparison in the where clause:
select h.*, p.*
from humans as h left join
humans_pets_junction as j
on j.human_id = h.id left join
pets as p
on j.pet_id = p.id and p.name = 'fluffy'
where h.id = 13 and (p.name = 'fluffy' or p.id is null);
Alternatively, join the junction table and the pets table as a subquery or CTE:
select h.*, pj.pet_id, pj.pet_name
from humans h left join
(select j.human_id, p.id as pet_id, p.name as pet_name
from humans_pets_junction j join
pets p
on j.pet_id = p.id and p.name = 'fluffy'
) pj
on pj.human_id = h.id
where h.id = 13;
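The derived-table approach above can be checked with a small sketch. It uses Python's built-in sqlite3 module instead of Postgres, and the sample rows (humans 13 and 14, three pets) are made up for illustration; only the table and column names come from the question.

```python
import sqlite3

# In-memory database with hypothetical sample data (not from the question).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE humans (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE pets (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE humans_pets_junction (human_id INTEGER, pet_id INTEGER);
INSERT INTO humans VALUES (13, 'alice'), (14, 'bob');
INSERT INTO pets VALUES (1, 'fluffy'), (2, 'rex'), (3, 'tweety');
INSERT INTO humans_pets_junction VALUES (13, 1), (13, 2), (14, 2), (14, 3);
""")

# Filter the pets inside the derived table, then LEFT JOIN it to humans.
QUERY = """
SELECT h.id, h.name, pj.pet_id, pj.pet_name
FROM humans AS h
LEFT JOIN (
    SELECT j.human_id, p.id AS pet_id, p.name AS pet_name
    FROM humans_pets_junction AS j
    JOIN pets AS p ON p.id = j.pet_id AND p.name = 'fluffy'
) AS pj ON pj.human_id = h.id
WHERE h.id = ?
"""

def fluffy_rows(human_id):
    """Return the rows for one human, with fluffy's columns if present."""
    return conn.execute(QUERY, (human_id,)).fetchall()

rows_13 = fluffy_rows(13)  # human 13 owns fluffy and rex
rows_14 = fluffy_rows(14)  # human 14 owns rex and tweety, but no fluffy
print(rows_13)
print(rows_14)
```

Human 13 comes back with fluffy's values, and human 14 comes back as a single row with NULL pet columns, which is the behavior of the FK-style left join.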

In Postgres you can use parentheses to prioritize JOIN order. You do not need a subquery:
SELECT h.*, p.id AS p_id, p.name AS pet_name
FROM humans h
LEFT JOIN (pets p
JOIN humans_pets_junction j ON p.name = 'fluffy'
AND j.pet_id = p.id
AND j.human_id = 13) ON TRUE
WHERE h.id = 13;
Per documentation:
Parentheses can be used around JOIN clauses to control the join order.
In the absence of parentheses, JOIN clauses nest left-to-right.
I added the predicate j.human_id = 13 to the join between your junction table and the pets to eliminate irrelevant rows at the earliest opportunity. The outer LEFT JOIN only needs the dummy condition ON TRUE.
Aside 1: I assume you are aware that you have a textbook implementation of an n:m (many-to-many) relationship?
How to implement a many-to-many relationship in PostgreSQL?
Aside 2: The unfortunate naming convention in the example makes it necessary to deal out column aliases. Don't use "id" and "name" as column names in your actual tables to avoid such conflicts. Use proper names like "pet_id", "human_id" etc.

Related

How to fix query with multiple joined tables?

I have a main table M (Movies) and other tables L (Location), G (Genre), and S (Sub Genre). Each of the "other" tables are in a one to many relationship to table M, using.
I want to list all the Blu Ray titles and pull in their Location, Length (Time), Comments, Genre, and Sub Genre.
My query is:
SELECT L.Location, M.Title, M.Length, M.Comments, G.Genre, S.SubGenre
FROM ((L
INNER JOIN M ON M.Location = L.ID)
INNER JOIN G ON M.Genre = G.ID)
INNER JOIN SubGenre ON M.SubGenre = SubGenre.ID
ORDER BY M.ID
WHERE M.Type is "BluRay"
ORDER BY M.ID;
It gives me only a subset (26) of the total number of records there should be (447).
1. Do I have the proper table relationships?
2. Do I really need the parentheses? (error without them)
3. How do I change my query to give me all the Location records, with the appropriate movie-related information?
4. What if I want to add additional tables?
The DB schema:
-- Note that Type and Length are in between square brackets, because those are reserved words.
-- Avoid use of reserved words with MovieType and MovieLength
SELECT
L.LocationName
, M.Title
, M.[Length]
, M.Comments
, G.GenreName
, S.SubGenreName
FROM Movies M
INNER JOIN Location L ON L.LocationID = M.LocationID
INNER JOIN Genre G ON G.GenreID = M.GenreID
INNER JOIN SubGenre S ON S.SubGenreID = M.SubGenreID
WHERE M.[Type] = 'BluRay'
ORDER BY M.MovieID
You need to JOIN on shared table columns.
For "How to change your query to give all Location records, with appropriate movie-related information" that depends on what you think is appropriate.
You should not need the parentheses in most SQL databases. Some, such as MS Access, do require nested parentheses when joining more than two tables.
You can usually omit INNER, because a plain JOIN is an INNER JOIN in most SQL databases (MS Access again is an exception and requires the keyword). You also have two ORDER BY M.ID clauses; keep only the one after the WHERE.
I am not sure what you mean by additional tables: do you mean adding more tables to the JOIN, or creating more tables in the schema?

How to write sql select query for following

I have following tables
Table 1: person
columns: id,name,address,code
Table 2: carDetails
columns: id,person_id,car_brand
constraints: FK==>carDetails(person_id) reference person(id)
Note: carDetails is having multiple details for single person
Table 3: mobileDetails
columns: id,person_id,mobile_brand
constraints: FK==>mobileDetails(person_id) reference person(id)
Note: mobileDetails is having multiple details for single person
Similarly, I have a lot of other detail tables like car and mobile for each person.
What I want to select is:
person(id),
person(name),
Array of carDetails(brand) belonging to that particular person(id)
Array of mobileDetails(brand) belonging to that particular person(id)
You should write this query using subqueries for the aggregation:
select p.*, c.car_brands, m.mobile_brands
from person p left join
(select c.person_id, array_agg(c.car_brand) as car_brands
from carDetails c
group by c.person_id
) c
on c.person_id = p.id left join
(select m.person_id, array_agg(m.mobile_brand) as mobile_brands
from mobileDetails m
group by m.person_id
) m
on m.person_id = p.id;
Two notes:
You want to use left join, in case you have no data in one of the tables for some people.
You want to aggregate before joining to avoid duplicates. Although you could add distinct to array_agg() that incurs a performance penalty.
If you are filtering the people, it is often more efficient to do this using correlated subqueries or (equivalently) lateral joins:
select p.*,
(select array_agg(c.car_brand)
from carDetails c
where c.person_id = p.id
) as car_brands,
(select array_agg(m.mobile_brand)
from mobileDetails m
where m.person_id = p.id
) as mobile_brands
from person p;
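The note about aggregating before joining can be illustrated with a minimal sketch. It runs on Python's built-in sqlite3 module, so group_concat stands in for Postgres's array_agg (an assumption made only to keep the example self-contained), and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE carDetails (id INTEGER PRIMARY KEY, person_id INTEGER, car_brand TEXT);
CREATE TABLE mobileDetails (id INTEGER PRIMARY KEY, person_id INTEGER, mobile_brand TEXT);
INSERT INTO person VALUES (1, 'ann'), (2, 'ben');
INSERT INTO carDetails (person_id, car_brand) VALUES (1, 'audi'), (1, 'bmw');
INSERT INTO mobileDetails (person_id, mobile_brand)
    VALUES (1, 'nokia'), (1, 'sony'), (1, 'htc');
""")

# Joining both detail tables directly multiplies the rows: 2 cars x 3 mobiles.
naive_count = conn.execute("""
    SELECT COUNT(c.id)
    FROM person p
    LEFT JOIN carDetails c ON c.person_id = p.id
    LEFT JOIN mobileDetails m ON m.person_id = p.id
    WHERE p.id = 1
""").fetchone()[0]

# Aggregating each detail table on its own keeps one row per person.
rows = conn.execute("""
    SELECT p.id, p.name, c.car_brands, m.mobile_brands
    FROM person p
    LEFT JOIN (SELECT person_id, group_concat(car_brand) AS car_brands
               FROM carDetails GROUP BY person_id) c ON c.person_id = p.id
    LEFT JOIN (SELECT person_id, group_concat(mobile_brand) AS mobile_brands
               FROM mobileDetails GROUP BY person_id) m ON m.person_id = p.id
    ORDER BY p.id
""").fetchall()

def as_list(joined):
    """Split a group_concat string into a sorted list; NULL becomes []."""
    return sorted(joined.split(",")) if joined is not None else []

result = [(pid, name, as_list(cars), as_list(mobiles))
          for pid, name, cars, mobiles in rows]
print(naive_count)
print(result)
```

The direct join counts each car once per mobile, while the pre-aggregated version returns exactly one row per person, including the person with no details at all.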

Fetching distinct rows from multiple joins SQL

I have one main table called deliveries, which has a one-to-many relationship with deliveries_languages (dl), deliveries_markets (dm) and deliveries_tags (dt), each having delivery_id as a foreign key. These 3 tables have a one-to-one relation with languages, markets and tags respectively. Additionally, the deliveries table has a one-to-one relation with companies and has company_id as a foreign key. Following is the query that I have written:
SELECT deliveries.*, languages.display_name, markets.default_name, tags.default_name, companies.name
FROM deliveries
JOIN deliveries_languages dl ON dl.delivery_id = deliveries.id
JOIN deliveries_markets dm ON dm.delivery_id = deliveries.id
JOIN deliveries_tags dt ON dt.delivery_id = deliveries.id
JOIN languages ON languages.id = dl.language_id
JOIN markets ON markets.id = dm.market_id
JOIN tags ON tags.id = dt.tag_id
JOIN companies ON companies.id = deliveries.company_id
WHERE
deliveries.name ILIKE '%new%' AND
deliveries.created_by = '5f331347-fb58-4f63-bcf0-702f132f97c5' AND
deliveries.deleted_at IS NULL
LIMIT 10
Here I am getting redundant delivery_ids because for each delivery_id there are multiple languages, markets and tags. I want the limit to apply to distinct delivery_ids. So, LIMIT 10 should not give me 10 records from the above join query, but 10 records with distinct delivery_id (deliveries.id). I can probably use a derived table here, but I am not sure how to do that. Can someone please help me resolve this issue?
In Postgres, you can use distinct on:
SELECT DISTINCT ON (d.id) d.*, l.display_name, m.default_name, t.default_name, c.name
FROM deliveries d JOIN
deliveries_languages dl
ON dl.delivery_id = d.id JOIN
deliveries_markets dm
ON dm.delivery_id = d.id JOIN
deliveries_tags dt
ON dt.delivery_id = d.id JOIN
languages l
ON l.id = dl.language_id JOIN
markets m
ON m.id = dm.market_id JOIN
tags t
ON t.id = dt.tag_id JOIN
companies c
ON c.id = d.company_id
WHERE d.name ILIKE '%new%' AND
d.created_by = '5f331347-fb58-4f63-bcf0-702f132f97c5' AND
d.deleted_at IS NULL
ORDER BY d.id
LIMIT 10;
For larger amounts of data, this can be very inefficient. Multiple junction tables result in Cartesian products when used like this. However, for a smallish amount of data, this should solve your problem.
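The derived-table idea mentioned in the question can be sketched as follows: limit the deliveries first, then join the junction tables only for those rows. This is a simplified illustration, not the answer's method. It uses Python's built-in sqlite3 module, keeps only the languages junction, uses LIKE where Postgres would use ILIKE, and invents its own sample rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE deliveries (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE languages (id INTEGER PRIMARY KEY, display_name TEXT);
CREATE TABLE deliveries_languages (delivery_id INTEGER, language_id INTEGER);
INSERT INTO deliveries VALUES (1, 'new a'), (2, 'new b'), (3, 'new c'), (4, 'old d');
INSERT INTO languages VALUES (1, 'English'), (2, 'German');
INSERT INTO deliveries_languages VALUES (1, 1), (1, 2), (2, 1), (2, 2), (3, 1);
""")

# The LIMIT is applied to deliveries inside the derived table, before the joins
# multiply the rows. LIMIT 2 plays the role of LIMIT 10 in the question.
rows = conn.execute("""
    SELECT d.id, d.name, l.display_name
    FROM (SELECT id, name
          FROM deliveries
          WHERE name LIKE '%new%'
          ORDER BY id
          LIMIT 2) AS d
    JOIN deliveries_languages dl ON dl.delivery_id = d.id
    JOIN languages l ON l.id = dl.language_id
    ORDER BY d.id, l.id
""").fetchall()

delivery_ids = sorted({row[0] for row in rows})
print(rows)
print(delivery_ids)
```

Note the difference from DISTINCT ON: here the limit counts deliveries, so each selected delivery still appears once per language. With inner joins, a delivery that has no languages would be picked by the limit and then dropped by the join.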

SQL JOIN using a mapping table

I have three tables:
COLLECTION
PERSON
PERSON_COLLECTION
where PERSON_COLLECTION is a mapping table id|person_id|collection_id
I now want to select all entries in collection and order them by person.name.
Do I have to join the separate tables with the mapping table first and then do a join again on the results?
SELECT
c.*,
p.Name
FROM
Collection c
JOIN Person_Collection pc ON pc.collection_id = c.id
JOIN Person p ON p.id = pc.person_id
ORDER BY p.Name
Not sure without the table schema but, my take is:
SELECT
c.*,
p.*
FROM
Person_Collection pc
LEFT JOIN Collection c
ON pc.collection_id = c.id
LEFT JOIN Person p
ON pc.person_id = p.id
ORDER BY p.name
The order in which you join won't break it, but depending on which SQL product you're using it may affect performance.
You need to decide if you want ALL records from both/either table or only records which have a matching mapping entry, this will change the type of join you need to use.
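How the join type changes the result can be seen in a small sketch. It uses Python's built-in sqlite3 module; the title column and the sample rows are assumptions added for illustration, since the question does not show the full schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE collection (id INTEGER PRIMARY KEY, title TEXT);
CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE person_collection (
    id INTEGER PRIMARY KEY, person_id INTEGER, collection_id INTEGER);
INSERT INTO collection VALUES (1, 'stamps'), (2, 'coins'), (3, 'cards');
INSERT INTO person VALUES (1, 'zoe'), (2, 'adam');
-- 'cards' has no mapping entry
INSERT INTO person_collection VALUES (1, 1, 1), (2, 2, 2);
""")

# Inner joins: only collections that have a mapping entry survive.
inner_rows = conn.execute("""
    SELECT c.title, p.name
    FROM collection c
    JOIN person_collection pc ON pc.collection_id = c.id
    JOIN person p ON p.id = pc.person_id
    ORDER BY p.name
""").fetchall()

# Left joins starting from collection: every collection is kept,
# with a NULL name where no person is mapped.
left_rows = conn.execute("""
    SELECT c.title, p.name
    FROM collection c
    LEFT JOIN person_collection pc ON pc.collection_id = c.id
    LEFT JOIN person p ON p.id = pc.person_id
    ORDER BY p.name
""").fetchall()

print(inner_rows)
print(left_rows)
```

If "all entries in collection" really means every collection, including those nobody owns, the left-join form starting from collection is the one to use.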

Getting individual counts of a tables column after joining other tables

I'm having problems getting an accurate count of a column after joining others. When a column is joined I would still like to have a DISTINCT count of the table that it is being joined on.
A restaurant has multiple meals, meals have multiple food groups, food groups have multiple ingredients.
Through the restaurants id I want to be able to calculate how many of meals, food groups, and ingrediants the restaurant has.
When I join the food_groups the count for meals increases as well (I understand this is natural behavior I just don't understand how to get what I need due to it.) I have tried DISTINCT and other things I have found, but nothing seems to do the trick. I would like to keep this to one query rather than splitting it up into multiple ones.
SELECT
COUNT(meals.id) AS countMeals,
COUNT(food_groups.id) AS countGroups,
COUNT(ingrediants.id) AS countIngrediants
FROM
restaurants
INNER JOIN
meals ON restaurants.id = meals.restaurant_id
INNER JOIN
food_groups ON meals.id = food_groups.meal_id
INNER JOIN
ingrediants ON food_groups.id = ingrediants.food_group_id
WHERE
restaurants.id='43'
GROUP BY
restaurants.id
Thanks!
The DISTINCT goes inside the count
SELECT
COUNT(DISTINCT meals.id) AS countMeals,
COUNT(DISTINCT food_groups.id) AS countGroups,
COUNT(DISTINCT ingrediants.id) AS countIngrediants
FROM
restaurants
INNER JOIN
meals ON restaurants.id = meals.restaurant_id
INNER JOIN
food_groups ON meals.id = food_groups.meal_id
INNER JOIN
ingrediants ON food_groups.id = ingrediants.food_group_id
WHERE
restaurants.id='43'
GROUP BY
restaurants.id
You're going to have to do subqueries, I think. Something like:
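SELECT
(SELECT COUNT(*) FROM meals m WHERE m.restaurant_id = r.id) AS countMeals,
(SELECT COUNT(*) FROM food_groups fg
JOIN meals m ON m.id = fg.meal_id WHERE m.restaurant_id = r.id) AS countGroups,
(SELECT COUNT(*) FROM ingrediants i
JOIN food_groups fg ON fg.id = i.food_group_id
JOIN meals m ON m.id = fg.meal_id WHERE m.restaurant_id = r.id) AS countIngrediants
FROM restaurants r
WHERE r.id = '43'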
Where were you putting your DISTINCT and on which columns? When using COUNT() you need to do the distinct inside the parentheses and you need to do it over a single column that is distinct for what you're trying to count. For example:
SELECT
COUNT(DISTINCT M.id) AS count_meals,
COUNT(DISTINCT FG.id) AS count_food_groups,
COUNT(DISTINCT I.id) AS count_ingredients
FROM
Restaurants R
INNER JOIN Meals M ON M.restaurant_id = R.id
INNER JOIN Food_Groups FG ON FG.meal_id = M.id
INNER JOIN Ingredients I ON I.food_group_id = FG.id
WHERE
R.id='43'
Since you're selecting for a single restaurant, you shouldn't need the GROUP BY. Also, unless this is in a non-English language, I think you misspelled ingredients.
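To see why the DISTINCT has to go inside COUNT(), here is a minimal sketch on Python's built-in sqlite3 module. The sample rows are hypothetical, and the table is named ingredients (corrected spelling) as in the last query above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE restaurants (id INTEGER PRIMARY KEY);
CREATE TABLE meals (id INTEGER PRIMARY KEY, restaurant_id INTEGER);
CREATE TABLE food_groups (id INTEGER PRIMARY KEY, meal_id INTEGER);
CREATE TABLE ingredients (id INTEGER PRIMARY KEY, food_group_id INTEGER);
INSERT INTO restaurants VALUES (43);
INSERT INTO meals VALUES (1, 43), (2, 43);
INSERT INTO food_groups VALUES (1, 1), (2, 1), (3, 2);
INSERT INTO ingredients VALUES (1, 1), (2, 1), (3, 2), (4, 3);
""")

# Shared FROM/WHERE part: the joins produce one row per ingredient,
# so meal and food group ids repeat across rows.
JOINS = """
FROM restaurants r
JOIN meals m ON m.restaurant_id = r.id
JOIN food_groups fg ON fg.meal_id = m.id
JOIN ingredients i ON i.food_group_id = fg.id
WHERE r.id = 43
"""

# Plain COUNT counts joined rows, so every column reports the same number.
plain = conn.execute(
    "SELECT COUNT(m.id), COUNT(fg.id), COUNT(i.id)" + JOINS
).fetchone()

# COUNT(DISTINCT ...) counts each id once per table.
distinct = conn.execute(
    "SELECT COUNT(DISTINCT m.id), COUNT(DISTINCT fg.id), COUNT(DISTINCT i.id)" + JOINS
).fetchone()

print(plain)
print(distinct)
```

With 2 meals, 3 food groups and 4 ingredients, the plain counts all equal the number of joined rows (4), while the distinct counts recover the per-table totals.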