SQL NOT EXISTS vs count distinct

I have three tables; Apartment (apartmentnr, floor, apartmenttype), Floor (floornr, house), House (housenr, adress)
Now, I want to show housenr and adresses of houses that have apartments of all apartmenttypes (1-4), using NOT EXISTS.

As @onedaywhen hints, NOT EXISTS is rather cumbersome for this task, and COUNT(DISTINCT ...) offers leaner syntax (performance issues are mentioned in the article he points to):
SELECT House.housenr, House.adress
FROM House
JOIN Floor ON (House.housenr=Floor.house)
JOIN Apartment ON (Floor.floornr=Apartment.floor)
GROUP BY House.housenr, House.adress
HAVING COUNT(DISTINCT Apartment.apartmenttype)=4
which essentially says "show the housenr and adress of houses with 4 different types of apartments". The only good reason to force the use of NOT EXISTS would be, as others hinted, homework...

Ah, then you want to use Chris Date's approach, rather than the one made popular by Joe Celko? If you are not sure, both approaches are discussed here:
Relational Division by Joe Celko
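
For reference, here is a minimal sketch of the double NOT EXISTS form of relational division that the question asks for, next to the COUNT(DISTINCT ...) form. It runs against an in-memory SQLite database through Python's sqlite3 module. The sample rows are hypothetical, and the sketch assumes that the set of "all apartment types" is whatever distinct types occur in Apartment.

```python
import sqlite3

# Hypothetical sample data; table and column names follow the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE House (housenr INTEGER PRIMARY KEY, adress TEXT);
CREATE TABLE Floor (floornr INTEGER PRIMARY KEY, house INTEGER);
CREATE TABLE Apartment (apartmentnr INTEGER PRIMARY KEY, floor INTEGER, apartmenttype INTEGER);
INSERT INTO House VALUES (1, 'Main St 1'), (2, 'Side St 2');
INSERT INTO Floor VALUES (10, 1), (11, 1), (20, 2);
-- House 1 has types 1-4 spread over two floors; house 2 only has types 1 and 2.
INSERT INTO Apartment VALUES
  (100, 10, 1), (101, 10, 2), (102, 11, 3), (103, 11, 4), (104, 11, 4),
  (200, 20, 1), (201, 20, 2);
""")

# Reads as: "there is no apartment type for which this house has no apartment".
# Assumption: the full list of types is taken from the Apartment table itself.
not_exists_sql = """
SELECT h.housenr, h.adress
FROM House h
WHERE NOT EXISTS (
    SELECT 1
    FROM (SELECT DISTINCT apartmenttype FROM Apartment) t
    WHERE NOT EXISTS (
        SELECT 1
        FROM Floor f
        JOIN Apartment a ON a.floor = f.floornr
        WHERE f.house = h.housenr
          AND a.apartmenttype = t.apartmenttype
    )
)
ORDER BY h.housenr
"""

# The counting form, hard-coding the number of types (4) as in the answer above.
count_sql = """
SELECT h.housenr, h.adress
FROM House h
JOIN Floor f ON f.house = h.housenr
JOIN Apartment a ON a.floor = f.floornr
GROUP BY h.housenr, h.adress
HAVING COUNT(DISTINCT a.apartmenttype) = 4
ORDER BY h.housenr
"""

not_exists_rows = conn.execute(not_exists_sql).fetchall()
count_rows = conn.execute(count_sql).fetchall()
print(not_exists_rows, count_rows)
```

With this sample data both queries return only house 1. They can diverge on other data: the NOT EXISTS form adapts to however many types exist, while the counting form depends on the literal 4.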

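If I understand your relationships properly, you could do something like this. Remember you could use aliases for the tables to make it slightly easier to read; I've kept the full table names to make it clear what I'm doing. Note that the IN filter on its own matches houses with an apartment of any of the four types, so it still needs grouping (for example HAVING COUNT(DISTINCT Apartment.apartmenttype) = 4) to require all of them.
Select House.housenr, House.adress
from House
inner join Floor on House.housenr = Floor.house
inner join Apartment on Apartment.floor = Floor.floornr
where Apartment.apartmenttype in (1,2,3,4)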
I would argue that some of your entity relationships are the wrong way round in your table definitions.

Given a structure like this:
Houses
ID
Address
Floors
ID
HouseID
Apartments
Id
FloorId
ApartmentType
I think this will do the trick:
SELECT H.ID FROM Houses H
JOIN Floors F ON H.Id = F.HouseId
JOIN Apartments A ON F.Id = A.FloorId AND A.ApartmentType = 1
AND H.Id IN (SELECT H.ID FROM Houses H JOIN Floors F ON H.Id = F.HouseId JOIN Apartments A ON F.Id = A.FloorId AND A.ApartmentType = 2)
AND H.Id IN (SELECT H.ID FROM Houses H JOIN Floors F ON H.Id = F.HouseId JOIN Apartments A ON F.Id = A.FloorId AND A.ApartmentType = 3)
AND H.Id IN (SELECT H.ID FROM Houses H JOIN Floors F ON H.Id = F.HouseId JOIN Apartments A ON F.Id = A.FloorId AND A.ApartmentType = 4)
In English, it means, give me the set of all houses with an apartment of type 1 that are also in the set of houses with an apartment of type 2...and so on.

Related

How to fix query with multiple joined tables?

I have a main table M (Movies) and other tables L (Location), G (Genre), and S (Sub Genre). Each of the "other" tables are in a one to many relationship to table M, using.
I want to list all the Blu Ray titles and pull in their Location, Length (Time), Comments, Genre, and Sub Genre.
My query is:
SELECT L.Location, M.Title, M.Length, M.Comments, G.Genre, S.SubGenre
FROM ((L
INNER JOIN M ON M.Location = L.ID)
INNER JOIN G ON M.Genre = G.ID)
INNER JOIN SubGenre ON M.SubGenre = SubGenre.ID
ORDER BY M.ID
WHERE M.Type is "BluRay"
ORDER BY M.ID;
It gives me only a subset (26) of the total number of records there should be (447).
1. Do I have the proper table relationships?
2. Do I really need the parentheses? (error without them)
3. How do I change my query to give me all the Location records, with the appropriate movie-related information?
4. What if I want to add additional tables?
The DB schema:

-- Note that Type and Length are in square brackets, because those are reserved words.
-- Better: avoid reserved words by renaming the columns, e.g. MovieType and MovieLength.
SELECT
L.LocationName
, M.Title
, M.[Length]
, M.Comments
, G.GenreName
, S.SubGenreName
FROM Movies M
INNER JOIN Location L ON L.LocationID = M.LocationID
INNER JOIN Genre G ON G.GenreID = M.GenreID
INNER JOIN SubGenre S ON S.SubGenreID = M.SubGenreID
WHERE M.[Type] = 'BluRay'
ORDER BY M.MovieID

You need to JOIN on shared table columns.
For "How to change your query to give all Location records, with appropriate movie-related information" that depends on what you think is appropriate.
You should not need the parentheses, unless you are using a SQL database I am not familiar with (Microsoft Access, for one, does require them when joining more than two tables).
You do not need to write INNER, because a plain JOIN is an INNER JOIN in most SQL databases (Access is again an exception and requires the keyword). You also have two ORDER BY M.ID clauses; you only want the one after the WHERE.
I am not sure what you mean by additional tables: do you mean adding more tables to the JOIN, or creating more tables?
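
One plausible reason for getting fewer rows than expected is that an INNER JOIN drops every movie whose foreign key is NULL or has no match in the joined table. The sketch below illustrates this with SQLite through Python's sqlite3 module. The rows are hypothetical, the column names follow the answer above, and MovieType is the renamed column suggested there.

```python
import sqlite3

# Hypothetical data: 'Film Two' has no sub genre assigned (NULL foreign key).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Location (LocationID INTEGER PRIMARY KEY, LocationName TEXT);
CREATE TABLE SubGenre (SubGenreID INTEGER PRIMARY KEY, SubGenreName TEXT);
CREATE TABLE Movies (MovieID INTEGER PRIMARY KEY, Title TEXT, MovieType TEXT,
                     LocationID INTEGER, SubGenreID INTEGER);
INSERT INTO Location VALUES (1, 'Shelf A');
INSERT INTO SubGenre VALUES (1, 'Noir');
INSERT INTO Movies VALUES (1, 'Film One',   'BluRay', 1, 1),
                          (2, 'Film Two',   'BluRay', 1, NULL),
                          (3, 'Film Three', 'DVD',    1, 1);
""")

# INNER JOIN: a movie without a matching SubGenre row disappears from the result.
inner_sql = """
SELECT M.Title, S.SubGenreName
FROM Movies M
INNER JOIN Location L ON L.LocationID = M.LocationID
INNER JOIN SubGenre S ON S.SubGenreID = M.SubGenreID
WHERE M.MovieType = 'BluRay'
ORDER BY M.MovieID
"""

# LEFT JOIN: the movie is kept and the missing sub genre shows up as NULL (None).
left_sql = """
SELECT M.Title, S.SubGenreName
FROM Movies M
INNER JOIN Location L ON L.LocationID = M.LocationID
LEFT JOIN SubGenre S ON S.SubGenreID = M.SubGenreID
WHERE M.MovieType = 'BluRay'
ORDER BY M.MovieID
"""

inner_rows = conn.execute(inner_sql).fetchall()
left_rows = conn.execute(left_sql).fetchall()
print(len(inner_rows), len(left_rows))
```

If the real data has movies without a Genre, SubGenre, or Location, switching the affected joins to LEFT JOIN should bring those rows back.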

Postgres do array agg for each row

I have a query which will take jobs_locum_hospital_ids from my doctor table, it will then join this to the hospital table on id and fetch the name, then placing all of these into an array.
so [187,123] --> ("George Eliot Hospital - Acute Services"),("Good Hope Hospital")
select array_agg(t)
from (
select h.name from (select jsonb_array_elements_text(d.jobs_locum_hospital_ids)::int as id from doctor d
where d.id = 11720) as q1
left join hospital h on h.id = q1.id
)t
But this is only performing this for where d.id = 11720
What I'd like to do is do this for each row. So in a way joining to
select * from doctor
left join that thing above

It is a bit hard to figure out your data structure or why you are using json functions for this. From what I can tell, doctors have an array of hospital ids and you want the names:
select d.*,
(select array_agg(h.name)
from unnest(d.jobs_locum_hospital_ids) dh join
hospital h
on dh = h.id
) as hospital_names
from doctor d;
Just the fact that you want to do this suggests that you really want a junction table, doctorHospitals with one row per doctor and per hospital.
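
A minimal sketch of what such a junction table could look like, run against SQLite through Python's sqlite3 module. The table name doctor_hospital, its columns, and the sample doctors are assumptions made for illustration. SQLite's group_concat stands in here for Postgres's array_agg.

```python
import sqlite3

# Hypothetical schema: one row in doctor_hospital per (doctor, hospital) pair.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE doctor (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE hospital (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE doctor_hospital (
    doctor_id INTEGER REFERENCES doctor(id),
    hospital_id INTEGER REFERENCES hospital(id),
    PRIMARY KEY (doctor_id, hospital_id)
);
INSERT INTO doctor VALUES (11720, 'Dr A'), (11721, 'Dr B');
INSERT INTO hospital VALUES (187, 'George Eliot Hospital - Acute Services'),
                            (123, 'Good Hope Hospital');
INSERT INTO doctor_hospital VALUES (11720, 187), (11720, 123);
""")

# One output row per doctor; LEFT JOIN keeps doctors without any hospital.
rows = conn.execute("""
SELECT d.id, group_concat(h.name, '|') AS hospital_names
FROM doctor d
LEFT JOIN doctor_hospital dh ON dh.doctor_id = d.id
LEFT JOIN hospital h ON h.id = dh.hospital_id
GROUP BY d.id
ORDER BY d.id
""").fetchall()

# group_concat does not guarantee an order, so sort the names on the Python side.
names_by_doctor = {
    doctor_id: sorted(names.split("|")) if names else []
    for doctor_id, names in rows
}
print(names_by_doctor)
```

In Postgres the same shape would use array_agg(h.name) with the same GROUP BY, and no JSON unpacking would be needed.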

LEFT JOIN across three tables (with junction table)

In Postgres, is there a way to perform a left join between tables linked by a junction table, with some filtering on the linked table?
Say, I have two tables, humans and pets, and I want to perform a query where I have the human ID, and the pet name. If the human ID exists, but they don't have a pet with that name, I still want the human's row to be returned.
If I had a FK relationship from pets to humans, this would work:
select h.*, p.*
from humans as h
left join pets as p on p.human_id = h.id and p.name = 'fluffy'
where h.id = 13
and I'd get a row with human 13's details, and fluffy's values. In addition, if human 13 didn't have a pet named 'fluffy', I'd get a row with human 13's values, and empty values for the pet's columns.
BUT, I don't have a direct FK relationship, I have a junction table between humans and pets, so I'm trying a query like:
select h.*, p.*
from humans as h
left join humans_pets_junction as j on j.human_id = h.id
left join pets as p on j.pet_id = p.id and p.name = 'fluffy'
where h.id = 13
Which returns rows for all of human 13's pets, with empty columns except for fluffy's row.
If I add p.name = 'fluffy' to the WHERE clause, that filters out all the empty rows, but also means I get 0 rows if human 13 doesn't have a pet named fluffy at all.
Is there a way to replicate the behavior of the FK-style left join, but when used with a junction table?

One method is to do the comparison in the where clause:
select h.*, p.*
from humans as h left join
humans_pets_junction as j
on j.human_id = h.id left join
pets as p
on j.pet_id = p.id and p.name = 'fluffy'
where h.id = 13 and (p.name = 'fluffy' or p.id is null);
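Alternatively, join the junction table and the pets table as a subquery or CTE, so that only the matching pet (if any) reaches the outer left join. This is the more robust option, because the WHERE filter above still lets through one empty row for every pet of human 13 that is not named fluffy:
select h.*, pj.*
from humans h left join
(select j.human_id, p.id as pet_id, p.name as pet_name
from humans_pets_junction j join
pets p
on j.pet_id = p.id and p.name = 'fluffy'
) pj
on pj.human_id = h.id
where h.id = 13;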
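
The difference between chaining two left joins and left-joining a pre-filtered subquery can be checked with a small sketch. It uses SQLite through Python's sqlite3 module, and the humans and pets in it are hypothetical sample data.

```python
import sqlite3

# Hypothetical data: human 13 owns fluffy and rex, human 14 owns only spot.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE humans (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE pets (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE humans_pets_junction (human_id INTEGER, pet_id INTEGER);
INSERT INTO humans VALUES (13, 'Ann'), (14, 'Bob');
INSERT INTO pets VALUES (1, 'fluffy'), (2, 'rex'), (3, 'spot');
INSERT INTO humans_pets_junction VALUES (13, 1), (13, 2), (14, 3);
""")

# Chained left joins: one row per junction row, NULL pet columns for non-matches.
chained_sql = """
SELECT h.id, p.name
FROM humans h
LEFT JOIN humans_pets_junction j ON j.human_id = h.id
LEFT JOIN pets p ON p.id = j.pet_id AND p.name = 'fluffy'
WHERE h.id = ?
"""

# Pre-filtered subquery: the inner join keeps only fluffy before the left join.
subquery_sql = """
SELECT h.id, h.name, pj.pet_id, pj.pet_name
FROM humans h
LEFT JOIN (
    SELECT j.human_id, p.id AS pet_id, p.name AS pet_name
    FROM humans_pets_junction j
    JOIN pets p ON p.id = j.pet_id AND p.name = 'fluffy'
) pj ON pj.human_id = h.id
WHERE h.id = ?
"""

chained_13 = conn.execute(chained_sql, (13,)).fetchall()
rows_13 = conn.execute(subquery_sql, (13,)).fetchall()
rows_14 = conn.execute(subquery_sql, (14,)).fetchall()
print(chained_13, rows_13, rows_14)
```

The chained version returns two rows for human 13 (one per pet), while the subquery version returns exactly one row per human, with NULL pet columns when there is no pet named fluffy.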

In Postgres you can use parentheses to prioritize JOIN order. You do not need a subquery:
SELECT h.*, p.id AS p_id, p.name AS pet_name
FROM humans h
LEFT JOIN (pets p
JOIN humans_pets_junction j ON p.name = 'fluffy'
AND j.pet_id = p.id
AND j.human_id = 13) ON TRUE
WHERE h.id = 13;
Per documentation:
Parentheses can be used around JOIN clauses to control the join order.
In the absence of parentheses, JOIN clauses nest left-to-right.
I added the predicate j.human_id = 13 to the join between your junction table and the pets to eliminate irrelevant rows at the earliest opportunity. The outer LEFT JOIN only needs the dummy condition ON TRUE.
SQL Fiddle.
Aside 1: I assume you are aware that you have a textbook implementation of an n:m (many-to-many) relationship?
How to implement a many-to-many relationship in PostgreSQL?
Aside 2: The unfortunate naming convention in the example makes it necessary to deal out column aliases. Don't use "id" and "name" as column names in your actual tables to avoid such conflicts. Use proper names like "pet_id", "human_id" etc.

SQL Query - max(count()) - possible without CTE?

I'm studying for my Database Systems exam at the moment, having some trouble with an exercise creating a query.
I have four tables:
A referent-table with personal data of referents,
A course-table with course data (with the responsible referent as foreign key),
A workshop-table with workshop data (with the corresponding course as foreign key),
A booking-table which manages bookings (with the corresponding workshop which has been booked as a foreign key)
My exercise is to find out how much money a referent earns (there's a price-column in workshop)
It's not very difficult to list how much money he earns per workshop; I created this query to show me:
SELECT r.referentid,
r.name,
(SELECT COUNT(*) FROM g22_courses c WHERE c.responsiblerefid = r.referentid)*c.price AS income
FROM referent r
LEFT JOIN g22_courses c ON (c.responsiblerefid = r.referentid)
LEFT JOIN g22_workshop w ON (w.courseid = c.id)
LEFT JOIN g22_booking b ON (b.workshopid = w.id)
GROUP BY r.referentid, c.responsiblerefid
This returns this:
2;"Anna";0.60
4;"Ahmed";3.5
1;"Hans";
2;"Anna";13.20
3;"Wolfgang";
As you can see, it works fine.
I now have two rows for Anna (because she is responsible for two courses) and want to have one row with the sum of both rows.
Unfortunately, the only way to do this (as I found out) is with a Common Table Expression (CTE) - with a CTE it works:
WITH incomepercourse AS (
SELECT r.referentid,
r.name,
(SELECT COUNT(*) FROM g22_courses c WHERE c.responsiblerefid = r.referentid)*c.price AS income
FROM referent r
LEFT JOIN g22_courses c ON (c.responsiblerefid = r.referentid)
LEFT JOIN g22_workshop w ON (w.courseid = c.id)
LEFT JOIN g22_booking b ON (b.workshopid = w.id)
GROUP BY r.referentid, c.responsiblerefid
)
SELECT referentid, name, SUM(income) FROM incomepercourse GROUP BY referentid, name
this returns:
3;"Wolfgang";
4;"Ahmed";3.50
2;"Anna";13.80
1;"Hans";
Is there any way to avoid a CTE?
My professor didn't talk about CTE, and it also isn't in his lecture notes - so there has to be some other, simpler way.
Is there anyone out there who knows a better way to achieve this?
Thanks in Advance!

You can wrap the body of the CTE in a derived table (a subquery in the FROM clause) and query it as shown below.
SELECT tbl1.referentid, tbl1.name, SUM(tbl1.income) FROM (
SELECT r.referentid,
r.name,
(SELECT COUNT(*) FROM g22_courses c WHERE c.responsiblerefid = r.referentid)*c.price AS income
FROM referent r
LEFT JOIN g22_courses c ON (c.responsiblerefid = r.referentid)
LEFT JOIN g22_workshop w ON (w.courseid = c.id)
LEFT JOIN g22_booking b ON (b.workshopid = w.id)
GROUP BY r.referentid, c.responsiblerefid)tbl1
GROUP BY tbl1.referentid, tbl1.name

You do not need to join the workshop and booking table if you do not use it.
If the price is in the course table, as shown in your first query, I think what you are asking for is answered by this query:
SELECT referent.referentid, referent.name, sum(price)
FROM referent LEFT JOIN g22_courses ON g22_courses.responsiblerefid=referent.referentid
GROUP BY referent.referentid, referent.name
If the price is in the workshop table, just add a join to this table.
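
Dropping unused joins is not only tidier: an extra one-to-many join can repeat rows and inflate a SUM. The sketch below shows this with SQLite through Python's sqlite3 module. It assumes the price lives in the course table, and the rows and integer prices are hypothetical.

```python
import sqlite3

# Hypothetical data: Anna runs two courses; course 1 has two workshops.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE referent (referentid INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE g22_courses (id INTEGER PRIMARY KEY, responsiblerefid INTEGER, price INTEGER);
CREATE TABLE g22_workshop (id INTEGER PRIMARY KEY, courseid INTEGER);
INSERT INTO referent VALUES (1, 'Hans'), (2, 'Anna');
INSERT INTO g22_courses VALUES (1, 2, 10), (2, 2, 20);
INSERT INTO g22_workshop VALUES (1, 1), (2, 1), (3, 2);
""")

# Only the tables that are needed: each course price is counted once.
courses_only_sql = """
SELECT r.referentid, r.name, SUM(c.price) AS income
FROM referent r
LEFT JOIN g22_courses c ON c.responsiblerefid = r.referentid
GROUP BY r.referentid, r.name
ORDER BY r.referentid
"""

# Extra join: course 1 appears once per workshop, so its price is summed twice.
with_workshops_sql = """
SELECT r.referentid, r.name, SUM(c.price) AS income
FROM referent r
LEFT JOIN g22_courses c ON c.responsiblerefid = r.referentid
LEFT JOIN g22_workshop w ON w.courseid = c.id
GROUP BY r.referentid, r.name
ORDER BY r.referentid
"""

courses_only = conn.execute(courses_only_sql).fetchall()
with_workshops = conn.execute(with_workshops_sql).fetchall()
print(courses_only, with_workshops)
```

Whether the inflated figure is wrong depends on the intended meaning: if income should be price per workshop or per booking, the extra joins are exactly what is needed, which is worth deciding before choosing the query.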

Okay, I solved it.
It was a problem with my understanding of the problem after all... ._.
I now managed to write a query like the one a_horse_with_no_name suggested in his comment:
SELECT r.referentid,
r.name,
SUM(c.price) AS income
FROM referent r
LEFT JOIN g22_courses c ON (c.responsiblerefid = r.referentid)
LEFT JOIN g22_workshop w ON (w.courseid = c.id)
LEFT JOIN g22_booking b ON (b.workshopid = w.id)
GROUP BY r.referentid
ORDER BY r.referentid
This solves my problem perfectly, returning the right values.
Thank you!

How would I join this query in one single SQL statement?

Let's say I have these tables, with these fields:
companies: company_id | name | num_employees
companies_countries: company_id | country_id
countries: country_id | country_iso_code
Assuming this is a 1:1 relationship:
How can I join the country_iso_code directly into the company recordset, when I fetch all the companies? I think I would need two joins here?

A simple example:
select c.name, n.country_iso_code
from companies c,
companies_countries x,
countries n
where x.company_id = c.company_id
and n.country_id = x.country_id
Edit
For a good intro to JOIN, have a look at SQL JOIN.

SELECT
companies.company_id,
companies.name,
companies.num_employees,
countries.country_iso_code
FROM
companies
LEFT JOIN
companies_countries ON (companies_countries.company_id = companies.company_id)
LEFT JOIN
countries ON (countries.country_id = companies_countries.country_id);
A query that uses a LEFT JOIN instead of an INNER JOIN or an implicit join, will return companies even when they have no country assigned. On the other hand, a query with an INNER JOIN would skip companies that do not have an assigned country in companies_countries.
Note that your design is implying that each company can be assigned more than one country. If you want to enforce only one country for each company, simply put a country_id column in your companies table. You would not need the companies_countries table.

select name, country_iso_code
from companies left join companies_countries
on (companies.company_id = companies_countries.company_id)
left join countries
on (companies_countries.country_id = countries.country_id)

SELECT co.*, cu.country_iso_code
FROM companies co
LEFT JOIN companies_countries cc ON cc.company_id = co.company_id
LEFT JOIN countries cu ON cu.country_id = cc.country_id
Why LEFT JOIN and not the WHERE condition like in other examples? A LEFT JOIN also returns companies that have no country assigned. Note that a join table (companies_countries) is not typically used in 1:1 relationships - it is overnormalization.
When your relationship ceases to be 1:1:
SELECT co.*, GROUP_CONCAT(cu.country_iso_code)
FROM companies co
LEFT JOIN companies_countries cc ON cc.company_id = co.company_id
LEFT JOIN countries cu ON cu.country_id = cc.country_id
GROUP BY co.company_id
This will return results like
CompanyA | Canada,USA,Mexico
CompanyB | Ireland,UK,Japan
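
A runnable sketch of the aggregated form, using SQLite through Python's sqlite3 module (SQLite also provides group_concat). The companies, countries, and ISO codes are hypothetical sample data.

```python
import sqlite3

# Hypothetical data: CompanyA is in two countries, CompanyC in none.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE companies (company_id INTEGER PRIMARY KEY, name TEXT, num_employees INTEGER);
CREATE TABLE countries (country_id INTEGER PRIMARY KEY, country_iso_code TEXT);
CREATE TABLE companies_countries (company_id INTEGER, country_id INTEGER);
INSERT INTO companies VALUES (1, 'CompanyA', 10), (2, 'CompanyB', 5), (3, 'CompanyC', 1);
INSERT INTO countries VALUES (1, 'CA'), (2, 'US'), (3, 'IE');
INSERT INTO companies_countries VALUES (1, 1), (1, 2), (2, 3);
""")

# GROUP BY collapses the joined rows back to one row per company.
rows = conn.execute("""
SELECT co.name, group_concat(cu.country_iso_code) AS iso_codes
FROM companies co
LEFT JOIN companies_countries cc ON cc.company_id = co.company_id
LEFT JOIN countries cu ON cu.country_id = cc.country_id
GROUP BY co.company_id, co.name
ORDER BY co.company_id
""").fetchall()

# The concatenation order is not guaranteed, so sort the codes in Python.
codes_by_company = {
    name: sorted(codes.split(",")) if codes else []
    for name, codes in rows
}
print(codes_by_company)
```

Because of the LEFT JOINs, CompanyC still appears, with no codes. With inner joins it would be missing from the result.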

This is not a 1:1 relationship. A country can have more than one company.
This is either a 1:N relationship (for some reason implemented using two relational tables), or the M:N relationship which describes multinational companies.
If this is a 1:N relationship, you could just put the country_code field into the companies table, in which case one join would be enough:
SELECT *
FROM companies co
LEFT JOIN
countries cn
ON cn.country_code = co.country_code
Your design is viable for both 1:N and M:N relationships, in which case two joins are required:
SELECT co.*, cn.*
FROM companies co
LEFT JOIN
companies_countries cc
ON cc.company_id = co.company_id
LEFT JOIN
countries cn
ON cn.country_id = cc.country_id
If this is a 1:N relationship, you should make company_id a PRIMARY KEY in the companies_countries table.
If this is an M:N relationship, you should make a composite PRIMARY KEY on companies_countries (company_id, country_id)
You may want to read this article in my blog about the difference between entity-relationship model and its relational implementation:
What is entity-relationship model?

Yes, you would need two joins.
You can create a view and fake a single join though...
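
A minimal sketch of the view idea, using SQLite through Python's sqlite3 module. The view name company_with_country and the sample rows are assumptions made for illustration.

```python
import sqlite3

# Hypothetical data: Acme has a country assigned, Globex does not.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE companies (company_id INTEGER PRIMARY KEY, name TEXT, num_employees INTEGER);
CREATE TABLE countries (country_id INTEGER PRIMARY KEY, country_iso_code TEXT);
CREATE TABLE companies_countries (company_id INTEGER, country_id INTEGER);
INSERT INTO companies VALUES (1, 'Acme', 50), (2, 'Globex', 20);
INSERT INTO countries VALUES (1, 'DE');
INSERT INTO companies_countries VALUES (1, 1);

-- The view hides both joins behind a single name.
CREATE VIEW company_with_country AS
SELECT c.company_id, c.name, c.num_employees, n.country_iso_code
FROM companies c
LEFT JOIN companies_countries x ON x.company_id = c.company_id
LEFT JOIN countries n ON n.country_id = x.country_id;
""")

# Callers now query the view as if it were one table.
view_rows = conn.execute("""
SELECT name, country_iso_code
FROM company_with_country
ORDER BY company_id
""").fetchall()
print(view_rows)
```

The two joins still run each time the view is queried; the view only saves retyping them.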

select companies.*
from countries, companies_countries, companies
where countries.country_id = companies_countries.country_id
and companies_countries.company_id = companies.company_id
and countries.country_iso_code = 'xxxx'
where xxxx is the iso code you want to match

select c.company_id, c.name, c.num_employees, co.country_iso_code from
companies c, companies_countries cc, countries co
where c.company_id = cc.company_id
and cc.country_id = co.country_id
