Fetching distinct rows from multiple joins SQL - sql

I have one main table called deliveries and it has one to many relationship with deliveries_languages as dl, deliveries_markets dm and deliveries_tags dt having delivery_id as foreign key. These 3 tables have one to one relation with languages , markets and tags respectively. Additionaly, deliveries, table have one to one relation with companies and have company_is as foreign key. Following is a query that I have written:
SELECT deliveries.*, languages.display_name, markets.default_name, tags.default_name, companies.name
FROM deliveries
JOIN deliveries_languages dl ON dl.delivery_id = deliveries.id
JOIN deliveries_markets dm ON dm.delivery_id = deliveries.id
JOIN deliveries_tags dt ON dt.delivery_id = deliveries.id
JOIN languages ON languages.id = dl.language_id
JOIN markets ON markets.id = dm.market_id
JOIN tags ON tags.id = dt.tag_id
JOIN companies ON companies.id = deliveries.company_id
WHERE
deliveries.name ILIKE '%new%' AND
deliveries.created_by = '5f331347-fb58-4f63-bcf0-702f132f97c5' AND
deliveries.deleted_at IS NULL
LIMIT 10
Here I am getting redundant delivery_ids because for each delivery_id there are multiple languages, markets and tags. I want to use limit on distinct delivery_ids. So, limit 10 should not give me 10 records from above join query but 10 records where there is distinct delivery_id (deliveries.id). I probably can use derived table concept here but I am not sure how can I do that. Can someone please help me to resolve this issue.

In Postgres, you can use distinct on:
SELECT DISTINCT ON (d.id) d.*, l.display_name, m.default_name, t.default_name, c.name
FROM deliveries d JOIN
deliveries_languages dl
ON dl.delivery_id = d.id JOIN
deliveries_markets dm
ON dm.delivery_id = d.id JOIN
deliveries_tags dt
ON dt.delivery_id = d.id JOIN
languages l
ON l.id = dl.language_id JOIN
markets m
ON m.id = dm.market_id JOIN
tags
ON t.id = dt.tag_id JOIN
companies c
ON c.id = d.company_id
WHERE d.name ILIKE '%new%' AND
d.created_by = '5f331347-fb58-4f63-bcf0-702f132f97c5' AND
d.deleted_at IS NULL
ORDER BY d.id
LIMIT 10;
For larger amounts of data, this can be very inefficient. Multiple junction tables result in Cartesian products when used like this. However, for a smallish amount of data, this should solve your problem.

Related

How to Query across multiple tables

I have the following DB. Is this a proper query to get the first name of the authors loaned by Members ? or is there a more efficient way ?
select M.MemberNo, A.FirstName
from Person.Members as M
INNER JOIN Booking.Loans as L
on M.MemberNo = L.MemberNo
INNER JOIN Article.Copies as C
on L.ISBN = C.ISBN and L.CopyNo = C.CopyNo
INNER JOIN Article.Items as I
on C.ISBN = I.ISBN
INNER JOIN Article.Titles as T
on I.TitleID = T.TitleID
INNER JOIN Article.TitleAuthors as TA
on T.TitleID = TA.TitleID
INNER JOIN Article.Authors as A
on TA.AuthorID = A.AuthorID
;
go
The most direct way to get the names of authors of books that are lent out to members would be like this:
Get member loans (Booking.Loans)
Get books for member loans (Article.Items)
Get authors for the books for member loans (Article.TitleAuthors)
Get first names for authors for the books for member loans (Article.Authors)
select L.MemberNo, A.FirstName
FROM Booking.Loans as L
INNER JOIN Article.Items as I
on L.ISBN = I.ISBN
INNER JOIN Article.TitleAuthors as TA
on I.TitleID = TA.TitleID
INNER JOIN Article.Authors as A
on TA.AuthorID = A.AuthorID
The advantage to joining all the tables like you have your original query is to ensure that each intermediary table was INSERTed into or DELETEd from correctly.
For example, your original query wouldn't return any rows for a member if they had a loan (i.e. a row in the Loans table) but for some reason didn't have an entry in the Members table. If you can assume that (or don't care if) there will always be a row in the Members table if there is a row in the Loans table, then you can exclude the join to Person.Members.
You can apply this same logic to the other intermediary tables (Member, Copies, Title) too, if needed. If you need a specific piece of data from one of the these tables, though, then you would need to include them in the joins.

LEFT JOIN across three tables (with junction table)

In Postgres, is there a way to perform a left join between tables linked by a junction table, with some filtering on the linked table?
Say, I have two tables, humans and pets, and I want to perform a query where I have the human ID, and the pet name. If the human ID exists, but they don't have a pet with that name, I still want the human's row to be returned.
If I had a FK relationship from pets to humans, this would work:
select h.*, p.*
from humans as h
left join pets as p on p.human_id = h.id and p.name = 'fluffy'
where h.id = 13
and I'd get a row with human 13's details, and fluffy's values. In addition, if human 13 didn't have a pet named 'fluffy', I'd get a row with human 13's values, and empty values for the pet's columns.
BUT, I don't have a direct FK relationship, I have a junction table between humans and pets, so I'm trying a query like:
select h.*, p.*
from humans as h
left join humans_pets_junction as j on j.human_id = h.id
left join pets as p on j.pet_id = p.id and p.name = 'fluffy'
where h.id = 13
Which returns rows for all of human 13's pets, with empty columns except for fluffy's row.
If I add p.name = 'fluffy' to the WHERE clause, that filters out all the empty rows, but also means I get 0 rows if human 13 doesn't have a pet named fluffy at all.
Is there a way to replicate the behavior of the FK-style left join, but when used with a junction table?
One method is to do the comparison in the where clause:
select h.*, p.*
from humans as h left join
humans_pets_junction as j
on j.human_id = h.id left join
pets as p
on j.pet_id = p.id and p.name = 'fluffy'
where h.id = 13 and (p.name = 'fluffy' or p.id is null);
Alternatively, join the junction table and the pets table as a subquery or CTE:
select h.*, p.*
from humans h left join
(select j.*
from humans_pets_junction j join
pets p
on j.pet_id = p.id and p.name = 'fluffy'
) pj
on pj.human_id = h.id
where h.id = 13;
In Postgres you can use parentheses to prioritize JOIN order. You do not need a subquery:
SELECT h.*, p.id AS p_id, p.name AS pet_name
FROM humans h
LEFT JOIN (pets p
JOIN humans_pets_junction j ON p.name = 'fluffy'
AND j.pet_id = p.id
AND j.human_id = 13) ON TRUE
WHERE h.id = 13;
Per documentation:
Parentheses can be used around JOIN clauses to control the join order.
In the absence of parentheses, JOIN clauses nest left-to-right.
I added the predicate j.human_id = 13 to the join between your junction table and the pets to eliminate irrelevant rows at the earliest opportunity. The outer LEFT JOIN only needs the dummy condition ON TRUE.
SQL Fiddle.
Aside 1: I assume you are aware that you have a textbook implementation of a n:m (many-to-many) relationship?
How to implement a many-to-many relationship in PostgreSQL?
Aside 2: The unfortunate naming convention in the example makes it necessary to deal out column aliases. Don't use "id" and "name" as column names in your actual tables to avoid such conflicts. Use proper names like "pet_id", "human_id" etc.

Get correct result from mysql query

I have the following tables:
**products** which has these fields: id,product,price,added_date
**products_to_categories** which has these fields: id,product_id,category_id
**adverts_to_categories** -> id,advert_id,category_id
**adverts** which has these fields: id,advert_name,added_date,location
I can not execute sql that will return to me all products that are from category 14 and that are owned by advert located in London. So I have 4 tables and 2 conditions - to be from category 14 and the owner of the product to be from London. I tried many variants to execute sql but none of the results were correct.. Do I need to use Join and which Join - left, right, full? How the correct sql will look like? thank you in advance for your help and sorry for boring you :)
This is what I have tried so far:
SELECT p.id, product, price, category_id,
p.added_date, adverts.location, adverts.id
FROM products p,
products_to_categories ptc,
adverts,
adverts_to_categories ac
WHERE ptc.category_id = "14"
AND ptc.product_id=p.id
AND ac.advert_id=adverts.id
AND adverts.location= "London"
pretty basic logic
Select * from Products P
INNER JOIN Products_To_Categories PTC ON P.ID = PTC.Product_ID
INNER JOIN Adverts_to_Categories ATC ON ATC.Category_Id = PTC.Category_ID
INNER JOIN Adverts AD on AD.ID = ATC.Advert_ID
WHERE PTC.Category_ID = 14 and AD.Location = 'LONDON'
you would only need a LEFT or right join IF you wanted records from a table which didn't exist in other tables.
so for example, if you wanted all products even if a records even those without a category, then you would use a LEFT Join instead of inner.
The following statement should return all columns from the product table in category with id 14 and all adverts located in London:
select p.* from products p
inner join products_to_categories pc on p.id = pc.product_id
inner join adverts_to_categories ac on pc.category_id = ac.category_id
inner join adverts a on a.id = ac.advert_id
where pc.category_id = 14
and ac.location = 'London';
You should remember to add an index to the column location if you are doing these string-based queries very often.

Getting individual counts of a tables column after joining other tables

I'm having problems getting an accurate count of a column after joining others. When a column is joined I would still like to have a DISTINCT count of the table that it is being joined on.
A restaurant has multiple meals, meals have multiple food groups, food groups have multiple ingredients.
Through the restaurants id I want to be able to calculate how many of meals, food groups, and ingrediants the restaurant has.
When I join the food_groups the count for meals increases as well (I understand this is natural behavior I just don't understand how to get what I need due to it.) I have tried DISTINCT and other things I have found, but nothing seems to do the trick. I would like to keep this to one query rather than splitting it up into multiple ones.
SELECT
COUNT(meals.id) AS countMeals,
COUNT(food_groups.id) AS countGroups,
COUNT(ingrediants.id) AS countIngrediants
FROM
restaurants
INNER JOIN
meals ON restaurants.id = meals.restaurant_id
INNER JOIN
food_groups ON meals.id = food_groups.meal_id
INNER JOIN
ingrediants ON food_groups.id = ingrediants.food_group_id
WHERE
restaurants.id='43'
GROUP BY
restaurants.id
Thanks!
The DISTINCT goes inside the count
SELECT
COUNT(DISTINCT meals.id) AS countMeals,
COUNT(DISTINCT food_groups.id) AS countGroups,
COUNT(DISTINCT ingrediants.id) AS countIngrediants
FROM
restaurants
INNER JOIN
meals ON restaurants.id = meals.restaurant_id
INNER JOIN
food_groups ON meals.id = food_groups.meal_id
INNER JOIN
ingrediants ON food_groups.id = ingrediants.food_group_id
WHERE
restaurants.id='43'
GROUP BY
restaurants.id
You're going to have to do subqueries, I think. Something like:
SELECT
(SELECT COUNT(1) FROM meals m WHERE m.restaurant_id = r.id) AS countMeals,
(SELECT COUNT(1) FROM food_groups fg WHERE fg.meal_id = m.id) AS countGroups,
(SELECT COUNT(1) FROM ingrediants i WHERE i.food_group_id = fg.id) AS countGroups
FROM restaurants r
Where were you putting your DISTINCT and on which columns? When using COUNT() you need to do the distinct inside the parentheses and you need to do it over a single column that is distinct for what you're trying to count. For example:
SELECT
COUNT(DISTINCT M.id) AS count_meals,
COUNT(DISTINCT FG.id) AS count_food_groups,
COUNT(DISTINCT I.id) AS count_ingredients
FROM
Restaurants R
INNER JOIN Meals M ON M.restaurant_id = R.id
INNER JOIN Food_Groups FG ON FG.meal_id = M.id
INNER JOIN Ingredients I ON I.food_group_id = FG.id
WHERE
R.id='43'
Since you're selecting for a single restaurant, you shouldn't need the GROUP BY. Also, unless this is in a non-English language, I think you misspelled ingredients.

SQL join over five tables

I have five tables:
models: id, name, specification
models_networks: id, model_id, network_id
networks: id, name, description
countries_networks: id, country_id, network_id
countries: id, countryName, etc, etc
the models table is connected to the networks table via models_networks with a many to many relation.
the networks table is connected to the countries table via countries_networks with a many to many relation
I need to do the following query, but I'm stuck:
Select all the models that will work in a specific country.
e.g.: say France has two networks. PigNetwork and CowNetwork. I want to get all the models that work on PigNetwork or CowNetwork, basically any that work in that country one way or the other.
If I've made myself clear, can someone help with the JOIN query please? I've only ever gone as far as joining two tables before. Thanks.
SELECT
m.name AS model_name,
c.countryName,
COUNT(*) AS network_count
FROM
models AS m
INNER JOIN models_networks AS mn ON mn.model_id = m.id
INNER JOIN networks AS n ON n.id = mn.network_id
INNER JOIN countries_networks AS cn ON cn.network_id = n.id
INNER JOIN countries AS c ON c.id = cn.country_id
WHERE
c.countryName = 'France'
GROUP BY
m.name,
c.countryName
Something along the lines of this should work...
SELECT M.Name As ModelName FROM Countries C
INNER JOIN Countries_Networks CN
ON C.CountryId = CN.CountryId
INNER JOIN Networks N
ON CN.NetworkId = N.NetworkId
INNER JOIN ModelNetworks MN
ON MN.NetworkId = N.NetworkId
INNER JOIN Model M
ON M.ModelId = MN.ModelId
WHERE C.CountryName = 'FRANCE'
SELECT m.id
FROM model m
WHERE EXISTS
(
SELECT NULL
FROM model_networks mn
JOIN countries_networks cn
ON cn.network_id = mn.network_id
AND cn.country_id = #code_of_france
WHERE mn.model_id = m.id
)
This is efficient since it returns a model right that moment it finds the first suitable network.
Make sure you have the following UNIQUE indexes:
model_networks (model_id, network_id)
country_network (country_id, network_id)
SELECT models.id, models.name, models.specifications JOIN models_networks ON models.id=models_networks.model_id WHERE models_networks.networks_id IN (1,2)
(replace 1,2 with your network ids )
Doesn't look like you need two joins if you just want the models. You only need more joins if you don't know the network ids or if you need columns out of the other tables. The syntax for that is the same though. You just start with JOIN <table> ON <field>=<field> before the WHERE statement
select models.id, models.name, models.specification from models inner join models_networks on models.id = models_network.network_id inner join countries_networks on models_network.network_id = countries_networks.network_id where countries_networks.countryName = 'France'