Postgres Question: Aren't both a and b correct? - sql

For questions below, use the following schema definition.
restaurant(rid, name, phone, street, city, state, zip)
customer(cid, fname, lname, phone, street, city, state, zip)
carrier(crid, fname, lname, lp)
delivery(did, rid, cid, tim, size, weight)
pickup(did, tim, crid)
dropoff(did, tim, crid)
It's a schema for a food delivery business that employs food carriers (carrier table).
Customers (customer table) order food from restaurants (restaurant table).
The restaurants order a delivery (delivery table); to deliver food from restaurant to customer.
The pickup table records when carrier picks up food at restaurant.
The dropoff table records when carrier drops off food at customer.
1.Find customers who have less than 5 deliveries.
a. select cid,count()
from delivery
group by cid
having count() < 5;
b. select a.cid,count()
from customer a
inner join delivery b
using(cid)
group by a.cid
having count() < 5;
c. select a.cid,count()
from customer a
left outer join delivery b
on a.cid=b.cid
group by a.cid
having count() < 5;
d. select cid,sum(case when b.cid is not null then 1 else 0 end)
from customer a
left outer join delivery b
using (cid)
group by cid
having sum(case when b.cid is not null then 1 else 0 end) < 5;
e. (write your own answer)

No, they are not correct. They miss customers who have had no deliveries.
The last is the best of a bunch of not so good queries. A better version would be:
select c.cid, count(d.cid)
from customer c left outer join
delivery d
on c.cid = d.cid
group by c.cid
having count(d.cid) < 5;
The sum(case) is over kill. And Postgres even offers a better solution than that!
count(*) filter (where d.cid is not null)
But count(d.cid) is still more concise.
Also note the use of meaningful table aliases. Don't get into the habit of using arbitrary letters for tables. That just makes queries hard to understand.

Related

improving the sql query for a dbms problem

I want to write a SQL query for the problem as defined below, my answer for the first part is as below, but I am not sure about the answer, can anyone help me? Is the answer correct, or if not, how can I improve it?
For the second part can anyone help me?
Let us consider the following relational schema about physicians and departments:
PHYSICIAN (PhysicianId, Name, Surname, Specialization, Gender, BirthDate, Department);
Let every physician be univocally identified by a code and characterized by a name, a surname, a specialization (we assume to record exactly one specializa- tion for each physician), a gender, a birth date, a department (each physician is assigned to one and only one department), and a home city.
Let every city be univocally identified by its name and characterized by the region it belongs to.
DEPARTMENT (Name, Building, Floor, Chief)
Let every department be univocally identified by a name and characterized by its location (building and floor) and chief. Let us assume that a physician can be the chief of at most one department (the department he/she belongs to). We do not exclude the possibility for two distinct departments to be located at the same floor of the same building.
BELONGS TO(City,Region)
Let us assume that a physician can be the chief of at most one department (the department he/she belongs to). We do not exclude the possibility for two distinct departments to be located at the same floor of the same building.
I want to formulate an SQL query to compute the following data (exploiting aggregate functions only if they are strictly necessary):
• the departments such that (i) all their physicians reside in the region Piemonte and (ii) at least one of them resides in the city Torino.
My answer for the fist part is as below:
for the second part, I don't know how to solve it .
create view X{
select b.city, b.region, b.department
from physician p inner join belong-to b on b.city= p.homecity
where b.region="Piemonte"
select name
from department d
where exists( select*
from X
where p.department =d.name )
I think this should work:
Select Distinct b.department
From physician p
Join belong-to b on b.city = p.homecity
Where Exists (Select 1 From belong-to b2 Where b2.city = p.homecity AND b2.region = "Piemonte")
Exists (Select 1 From belong-to b2 Where b2.city = p.homecity AND b2.city = "Torino") AND
NOT Exists (Select 1 From belong-to b2 Where b2.city = p.homecity AND b2.region <> "Piemonte") AND
I would use aggregation:
select p.department
from physician p join
belongsto b
on p.homecity = b.city
group by p.department
having sum(case when b.region <> 'Piemonte' then 1 else 0 end) = 0 and -- no Piemonte
sum(case when p.homecity = 'Torino' then 1 else 0 end) >= 1 -- at least one Torino

SELECT * FROM table in addition of aggregation function

Short context:
I would like to show a list of all companies except if they are in the sector 'defense' or 'government' and their individual total spent on training classes. Only the companies that have this total amount above 1000 must be shown.
So I wrote the following query:
SELECT NAME, ADDRESS, ZIP_CODE, CITY, SUM(FEE-PROMOTION) AS "Total spent on training at REX"
FROM COMPANY INNER JOIN PERSON ON (COMPANY_NUMBER = EMPLOYER) INNER JOIN ENROLLMENT ON (PERSON_ID = STUDENT)
WHERE SECTOR_CODE NOT IN (SELECT CODE
FROM SECTOR
WHERE DESCRIPTION = 'Government' OR DESCRIPTION = 'Defense')
GROUP BY NAME, ADDRESS, ZIP_CODE, CITY
HAVING SUM(FEE-PROMOTION) > 1000
ORDER BY SUM(FEE-PROMOTION) DESC
Now what I actually need is, instead of defining every single column in the COMPANY table, I would like to show ALL columns of the COMPANY table using *.
SELECT * (all tables from COMPANY here), SUM(FEE-PROMOTION) AS "Total spent on training at REX"
FROM COMPANY INNER JOIN PERSON ON (COMPANY_NUMBER = EMPLOYER) INNER JOIN ENROLLMENT ON (PERSON_ID = STUDENT)
WHERE SECTOR_CODE NOT IN (SELECT CODE
FROM SECTOR
WHERE DESCRIPTION = 'Government' OR DESCRIPTION = 'Defense')
GROUP BY * (How to fix it here?)
HAVING SUM(FEE-PROMOTION) > 1000
ORDER BY SUM(FEE-PROMOTION) DESC
I could define every single column from COMPANY in the SELECT and that solution will do the job (as in the first example), but how can I make the query shorter using "SELECT * from the table COMPANY"?
The key idea is to summarize in the subquery to get the total spend for the company. This allows you to remove the aggregation from the outer query:
select c.*, pe.total_spend
from company c join
sector s
on c.sector_code = s.code left join
(select p.employer, sum(e.fee - e.promotion) as training_spend
from person p join
enrollment e
on p.person_id = e.student
group by p.employer
) pe
on pe.employer = c.company_number
where s.sector not in ('Government', 'Defense') and
pe.total_spend > 1000

Selecting cities that have 10 or more students and instructors combined in SQL

I need to show the city, state, number of student residents, number of instructor residents, and total student/instructor residents in that city. The information is contained in 3 tables: ZIPCODE, STUDENT, and INSTRUCTOR.
The ZIPCODE table has the columns ZIP, CITY, and STATE.
The STUDENT table has STUDENT_ID and ZIP.
The INSTRUCTOR table has INSTRUCTOR_ID and ZIP.
I've tried a couple of inner joins, and intersects, but I keep getting a wide variety of errors. I'm still very new with SQL, and am not sure how to actually make this work, any help or advice would be greatly appreciated.
You probably want a mix of union and join for this. I doubt you want intersect. Plenty of ways to do this, here's one
SELECT
Z.city,
Z.state,
SUM(case when d.typ = 's' then 1 ELSE 0 END) as count_students,
SUM(case when d.typ = 'i' then 1 ELSE 0 END) as count_instructors,
Count(*) as count_all
FROM
(SELECT * FROM
(SELECT 's' as typ, zip FROM student)
UNION ALL
(SELECT 'I ' as typ, zip FROM Instructor)
) d
INNER JOIN
zipcode z
ON d.zip on z.zip
GROUP BY
z.city, z.state
I pull all the records out of each student and instructor table and union them to make one big list, make a column to keep track of the type, the sum does the counting, when the type is s, the case when returns a 1. The sum will sum the 1s up as a count. You thus end up with a city/state/typ combination for each row and when grouped on city and state and summed on the typ, it gives a count
Here's another way to do this:
SELECT
Z.city,
Z.state,
SUM(s.ct) as count_students,
SUM(i.ct) as count_instructors,
SUM(s.ct) + SUM(I.ct) as count_all
FROM
zipcode z
LEFT OUTER JOIN
(SELECT zip, count(*) ct FROM student GROUP BY zip) s
ON s.zip = z.zip
LEFT OUTER JOIN
(SELECT zip, count(*) as ct FROM Instructor GROUP BY zip) i
ON i.zip = z.zip
GROUP BY z.city, z.state
We group and count the students and the instructors in their own subqueries producing just a single count per zip and join these (left join) to all the zip codes. We group in a sub query to ensure that there is only ever a 1:1 relationship between zipcode and s/i. If it were 1:many the sums would beome distorted. Because multiple zips can refer to one city there is another round of grouping and summing to aggregate all the zips from one city

Select all customers loyal to one company?

I've got tables:
TABLE | COLUMNS
----------+----------------------------------
CUSTOMER | C_ID, C_NAME, C_ADDRESS
SHOP | S_ID, S_NAME, S_ADDRESS, S_COMPANY
ORDER | S_ID, C_ID, O_DATE
I want to select id of all customers who made order only from shops of one company - 'Samsung' ('LG', 'HP', ... doesn't really matter, it's dynamic).
I've come only with one solution, but I consider it ugly:
( SELECT DISTINCT c_id FROM order JOIN shop USING(s_id) WHERE s_company = "Samsung" )
EXCEPT
( SELECT DISTINCT c_id FROM order JOIN shop USING(s_id) WHERE s_company != "Samsung" );
Same SQL queries, but reversed operator. Isn't there any aggregate method which solves such query better?
I mean, there could be millions of orders(I don't really have orders, I've got something that occurs more often).
Is it efficient to select thousands of orders and then compare them to hundreds of thousands orders which have different company? I know, that it compares sorted things, so it's O( m + n + sort(n) + sort(m) ). But that's still large for millions of records, or isn't?
And one more question. How could I select all customer values (name, address). How can I join them, can I do just
SELECT CUSTOMER.* FROM CUSTOMER JOIN ( (SELECT...) EXCEPT (SELECT...) ) USING (C_ID);
Disclaimer: This question ain't homework. It's preparation for the exam and desire to things more effective. My solution would be accepted at exam, but I like effective programming.
I like to approach this type of question using group by and a having clause. You can get the list of customers using:
select o.c_id
from orders o join
shops s
on o.s_id = o.s_id
group by c_id
having min(s.s_company) = max(s.s_company);
If you care about the particular company, then:
having min(s.s_company) = max(s.s_company) and
max(s.s_company) = 'Samsung'
If you want full customer information, you can join the customers table back in.
Whether this works better than the except version is something that would have to be tested on your system.
How about a query that uses no aggregate functions like Min and Max?
select C_ID, S_ID
from shop
group by C_ID, S_ID;
Now we have a distinct list of customers and all the companies they shopped at. The loyal customers will be the ones who only appear once in the list.
select C_ID
from Q1
group by C_ID
having count(*) = 1;
Join back to the first query to get the company id:
with
Q1 as(
select C_ID, S_ID
from shop
group by C_ID, S_ID
),
Q2 as(
select C_ID
from Q1
group by C_ID
having count(*) = 1
)
select Q1.C_ID, Q1.S_ID
from Q1
join Q2
on Q2.C_ID = Q1.C_ID;
Now you have a list of loyal customers and the one company each is loyal to.

Better way to demand, in SQL, that a column contains every specified value

Imagine you have two tables, with a one to many relationship.
For this example, I will suggest that there are two tables: Person, and Homes.
The person table holds a persons name, and gives them an ID. The homes table, holds the association of homes to a person. PID joins to "Person.ID"
And, in this tiny DB, a person can have no homes, or many homes.
I hope I drew that right.
How do I write a select, that returns everyone with every specified house type?
Let's say these are valid "Types" in the homes table:
Cottage, Main, Mansion, Spaceport.
I want to return everyone, in the Person table, who has a spaceport and a Cottage.
The best I could come up with was this:
SELECT DISTINCT( p.name ) AS name
FROM person p
INNER JOIN homes h ON h.pid = p.id
WHERE 'spaceport' in (
SELECT DISTINCT( type ) AS type
FROM homes
WHERE pid = p.id
)
AND 'cottage' in (
SELECT DISTINCT( type ) AS type
FROM homes
WHERE pid = p.id
)
When I wrote that, it works, but I'm pretty sure there has to be a better way.
The HAVING clause here will guarantee that the persons returned have both types, not just one or the other.
SELECT p.name
FROM person p
INNER JOIN homes h
ON p.id = h.pid
AND h.type IN ('spaceport', 'cottage')
GROUP BY p.name
HAVING COUNT(DISTINCT h.type) = 2
select * from homes;
home_id person_id type
--
1 1 cottage
2 1 mansion
3 2 cottage
4 3 mansion
5 4 cottage
6 4 cottage
To find the id numbers of every person who has both a cottage and a mansion, group by the id number, restrict the output to cottages and mansions, and count the distinct types.
select person_id
from homes
where type in ('cottage','mansion')
group by person_id
having count(distinct type) = 2;
person_id
--
1
You can use this query in a join to get all the columns from the person table.
select person.*
from person
inner join (select person_id
from homes
where type in ('cottage','mansion')
group by person_id
having count(distinct type) = 2) T
on person.person_id = T.person_id;
Thanks to Joe for pointing out an error in my count().
Not sure about the performance on this one, but here goes:
SELECT PID FROM (
SELECT PID, COUNT(PID) cnt FROM (
SELECT DISTINCT PID, Type FROM Homes
WHERE Type IN ('Type1', 'Type2', 'Type3')
) a
GROUP BY PID
) b
WHERE b.cnt = 3
You'd have to dynamically generate your IN clause as well as the WHERE b.CNT clause.