Writing sql without aggregation function - sql

I want to write a SQL query for the problem defined below. I am not sure about my answer; can anyone tell me whether it is correct and, if not, how I can improve it?
I have used an aggregate function. How can I rewrite the query without one?
Let us consider the following relational schema about physicians and departments:
PHYSICIAN (PhysicianId, Name, Surname, Specialization, Gender, BirthDate, Department);
Let every physician be univocally identified by a code and characterized by a name, a surname, a specialization (we assume to record exactly one specialization for each physician), a gender, a birth date, and the relative department (each physician is assigned to one and only one department).
DEPARTMENT (Name, Building, Floor, Chief)
Let every department be univocally identified by a name and characterized by its location (building and floor) and a chief.
Let us assume that a physician can be the chief of at most one department (the department he/she belongs to). We do not exclude the possibility for two distinct departments to be located at the same floor of the same building.
I want to formulate an SQL query to compute the following data (exploiting aggregate functions only if they are strictly necessary):
the departments which have no male physicians and with at least two physicians whose home city is Venice.
My answer is as below:
select d.name
from department d
where d.name in (select p.department from physician p where p.gender =! 'Male')
and d.name in (select p.department from physician p
where HomeCity = 'Venice'
group by p.PhysicianId
having count > 2)
Or:
select d.*
from department d
inner join physician p on d.name=p.department and
p.gender=!"Male"
left join physician o where d.name=o.department and o.birthdate='venice'
groupby birthdate
having sum(o. physicianID) >2

Assuming that both conditions should apply at the same time, I'd say your first answer is on the right track. I would modify it just a bit:
select d.name
from department d
where not exists (select 1 from physician p where p.gender = 'Male' and p.Department = d.name)
and d.name in (select p.department from physician p
where p.HomeCity = 'Venice'
group by p.department, p.HomeCity
having count(*) >= 2)
If you go with the inner join solution, you will have to apply a distinct to your select, which can make your query slower.
The not exists clause I propose will usually produce an execution plan similar to an inner join. That is, you will not need to compare each name with all the names of the departments without male physicians.
The most direct way to count something is an aggregate function. There is a workaround with OVER in T-SQL, but I would not recommend it.

Although the conditions are different, the main idea is the same as in improving the written sql query.
You need to use the same approach but substitute your own join and where conditions.
Update
Here is a version for your case:
select distinct d.*
from department d
left join physician m on d.name = m.department and m.gender = 'Male'
inner join physician v1 on d.name = v1.department and v1.homecity = 'Venice'
inner join physician v2 on d.name = v2.department and v2.homecity = 'Venice' and v2.physicianid <> v1.physicianid
where m.physicianid is null
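For what it's worth, here is a minimal sketch that runs the self-join idea against an in-memory SQLite database from Python. The sample rows are made up, and the homecity column is an assumption (the question filters on it, but the stated schema does not list it). The anti-join on m excludes departments with a male physician, and the two joins on v1 and v2 require two distinct physicians from Venice, so no aggregate function is needed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE department (name TEXT PRIMARY KEY, building TEXT, floor INTEGER, chief INTEGER);
CREATE TABLE physician (
    physicianid INTEGER PRIMARY KEY,
    name TEXT,
    gender TEXT,
    homecity TEXT,      -- assumed column, not part of the stated schema
    department TEXT REFERENCES department(name)
);
INSERT INTO department (name) VALUES ('Cardiology'), ('Neurology'), ('Oncology');
-- hypothetical sample data
INSERT INTO physician VALUES
    (1, 'Anna',  'Female', 'Venice', 'Cardiology'),
    (2, 'Bea',   'Female', 'Venice', 'Cardiology'),
    (3, 'Carla', 'Female', 'Venice', 'Neurology'),
    (4, 'Dario', 'Male',   'Venice', 'Neurology'),
    (5, 'Elena', 'Female', 'Venice', 'Oncology'),
    (6, 'Fede',  'Female', 'Padua',  'Oncology');
""")

query = """
SELECT DISTINCT d.name
FROM department d
LEFT JOIN physician m   ON d.name = m.department AND m.gender = 'Male'
INNER JOIN physician v1 ON d.name = v1.department AND v1.homecity = 'Venice'
INNER JOIN physician v2 ON d.name = v2.department AND v2.homecity = 'Venice'
                       AND v2.physicianid <> v1.physicianid
WHERE m.physicianid IS NULL   -- keep only departments with no male physician
"""
result = [row[0] for row in conn.execute(query)]
print(result)
```

Only Cardiology survives here: Neurology has a male physician and Oncology has a single physician from Venice.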

Related

Count the number of employees for every country

I have this task:
Count the number of employees for every country. Show only those countries where more than 20 employees work.
employee_id is a column of the Employees table.
The country belongs to a different table, Countries, and we need country_name from that table.
I have no idea how to solve this task. Below is what I was able to come up with. I think we should use an inner join.
SELECT a.employee_id
, b.country_name
, COUNT(a.employee_id) AS count
FROM employees a
INNER JOIN countries b ON a.employee_id = b.country_name
GROUP BY b.country_name
WHERE employee_id >20;
I think I need help from the beginning.
Thanks
Your join doesn't seem correct, but as I don't know the table structure, I can't say what the right column is (I'm going to assume that it should be country_name). Even so, try this:
SELECT b.country_name
, COUNT(a.employee_id) AS count
FROM employees a
INNER JOIN countries b ON a.country_name = b.country_name
GROUP BY b.country_name
HAVING COUNT(employee_id) >20;
When grouping, you need to use the HAVING clause to filter on aggregate values; WHERE is applied before the groups are formed.
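A minimal sketch of the same query, run from Python against an in-memory SQLite database. The table layout (a country_name column in employees) is the same assumption as above, the rows are invented, and the threshold is passed as a parameter and lowered to 2 so the sample stays small.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- assumed layout: employees carries the country_name it joins on
CREATE TABLE countries (country_name TEXT PRIMARY KEY);
CREATE TABLE employees (employee_id INTEGER PRIMARY KEY, country_name TEXT);
INSERT INTO countries VALUES ('Italy'), ('Spain');
INSERT INTO employees VALUES (1, 'Italy'), (2, 'Italy'), (3, 'Italy'), (4, 'Spain');
""")

query = """
SELECT b.country_name, COUNT(a.employee_id) AS employee_count
FROM employees a
INNER JOIN countries b ON a.country_name = b.country_name
GROUP BY b.country_name
HAVING COUNT(a.employee_id) > ?   -- filter on the aggregate, after grouping
"""
rows = conn.execute(query, (2,)).fetchall()
print(rows)
```

With the task's real threshold you would pass 20 instead of 2.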

Struggling with SQL subquery selection

I'm trying to answer a SQL question for revision purposes but can't seem to work out how to get it to work. The tables in question are:
The question is asking me to write an SQL command to display for each employee who has a total distance from all journeys of more than 100, the employee's name and the total number of litres used by the employee on all journeys (the number of litres for a journey is distanceInKm / kmPerLitre).
So far I've tried several variations of code beginning with:
SELECT
name, TravelCost.distanceInKm / Car.kmPerLitre AS "Cost in Litres"
FROM
Employee, Car, TravelCost
WHERE
Employee.id = TravelCost.employeeID
AND Car.regNo = TravelCost.carRegNo
It's at this point I get a bit stuck, any help would be greatly appreciated, thanks!
Never use commas in the FROM clause. Always use proper, standard, explicit JOIN syntax.
You are missing a GROUP BY and a HAVING:
SELECT e.name, SUM(tc.distanceInKm / c.kmPerLitre) AS "Cost in Litres"
FROM Employee e JOIN
TravelCost tc
ON e.id = tc.employeeID JOIN
Car c
ON c.regNo = tc.carRegNo
GROUP BY e.name
HAVING SUM(tc.distanceInKm) > 100;
Use Group By and Having Clause
SELECT NAME,
Sum(TravelCost.distanceInKm/ Car.kmPerLitre) AS "Cost in Litres"
FROM Employee
INNER JOIN TravelCost
ON Employee.id = TravelCost.employeeID
INNER JOIN Car
ON Car.regNo = TravelCost.carRegNo
GROUP BY NAME
HAVING Sum(distanceInKm) > 100
You need to JOIN all the tables and find sum of litres like this:
select
e.*,
sum(distanceInKm/c.kmPerLitre) litres
from employee e
inner join travelcost t
on e.id = t.employeeId
inner join car c
on t.carRegNo = c.regNo
group by e.id, e.name
having sum(t.distanceInKm) > 100;
Also, you need to group by id instead of just names as the other answers suggest. There can be multiple employees with same name.
Also, use explicit JOIN syntax instead of older comma based syntax. It's modern and clearer.
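To illustrate the point about grouping by id, here is a small sketch using Python's sqlite3 module. The rows are hypothetical: two different employees are both called Sam and each has travelled 60 km, so grouping by name alone merges them into one group that passes the 100 km filter, while grouping by id and name keeps them apart.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Employee (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE Car (regNo TEXT PRIMARY KEY, kmPerLitre REAL);
CREATE TABLE TravelCost (employeeID INTEGER, carRegNo TEXT, distanceInKm REAL);
-- hypothetical sample data: two distinct employees share the name 'Sam'
INSERT INTO Employee VALUES (1, 'Sam'), (2, 'Sam'), (3, 'Kim');
INSERT INTO Car VALUES ('A1', 10.0), ('B2', 20.0);
INSERT INTO TravelCost VALUES
    (1, 'A1', 60.0), (2, 'A1', 60.0), (3, 'A1', 50.0), (3, 'B2', 100.0);
""")

template = """
SELECT e.name, SUM(t.distanceInKm / c.kmPerLitre) AS litres
FROM Employee e
INNER JOIN TravelCost t ON e.id = t.employeeID
INNER JOIN Car c ON c.regNo = t.carRegNo
GROUP BY {cols}
HAVING SUM(t.distanceInKm) > 100
ORDER BY e.name
"""
# Grouping by name only: the two Sams are summed together (120 km in total).
by_name = conn.execute(template.format(cols="e.name")).fetchall()
# Grouping by id and name: each Sam has only 60 km and is filtered out.
by_id = conn.execute(template.format(cols="e.id, e.name")).fetchall()
print(by_name)
print(by_id)
```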
-- How foolish and arrogant of me! I thought `sum(tc.distanceInKm/c.kmPerLitre)`
-- might have a problem, since an employee may use multiple cars and each car's kmPerLitre is different.
-- However, there is no problem; it is simple and right!
-- The following is what I wrote. What a bloated statement it is!
-- calculate the total number of litres used by the employee on all journeys
select e.name, sum(Cost_in_Litres) as "Cost in Litres"
from (
select t.employeeID
-- calculate the litres used by the employee on all journeys, grouped by carRegNo
, sum(t.distanceInKm)/avg(c.kmPerLitre) as Cost_in_Litres
from TravelCost t
inner join Car c
on c.regNo = t.carRegNo
where t.employeeID in
( -- find the employees who have a total distance from all journeys of more than 100
select employeeID
from TravelCost
group by employeeID
having sum(distanceInKm)> 100
)
group by t.carRegNo, t.employeeID
) a
inner join Employee e
on e.id = a.employeeID
group by e.id,e.name;

How to retrieve data from multiple tables using Subquery?

Suppose we have two tables
student(studentID, name, departmentID)
department(departmentID, name).
Our aim is to retrieve the data from both tables using a subquery. I'm trying this:
select * from department, student
where department.departmentID
IN (select student.departmentID from student, department
where student.departmentID = department.departmentID)
but it returns the cross product of the two tables.
It is possible to get the correct result using JOIN like this
select * from department
Inner join student
on student.departmentID = department.departmentID
and using WHERE clause like this
select * from department, student
where department.departmentID = student.departmentID
I'm wondering if someone can tell me how this can be done using a subquery in SQL.
Hope this helps:
select *, (select name from department d where s.departmentID = d.departmentID) as dname
from student s
where (select name from department d where s.departmentID = d.departmentID) is not null
This problem is, however, meant to be solved using joins. To learn subqueries, use examples that actually need them.
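As a rough check, the sketch below runs a correlated-subquery version next to the plain join on an in-memory SQLite database from Python and compares the results. The rows are invented, and I use EXISTS in the WHERE clause instead of the "is not null" test, which is a variant of my own rather than the query above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE department (departmentID INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE student (studentID INTEGER PRIMARY KEY, name TEXT, departmentID INTEGER);
-- hypothetical sample data; Cid has no department
INSERT INTO department VALUES (1, 'Math'), (2, 'Physics');
INSERT INTO student VALUES (10, 'Ann', 1), (11, 'Bob', 2), (12, 'Cid', NULL);
""")

subquery_sql = """
SELECT s.studentID, s.name,
       (SELECT d.name FROM department d
        WHERE d.departmentID = s.departmentID) AS dname
FROM student s
WHERE EXISTS (SELECT 1 FROM department d
              WHERE d.departmentID = s.departmentID)
ORDER BY s.studentID
"""
join_sql = """
SELECT s.studentID, s.name, d.name
FROM department d
INNER JOIN student s ON s.departmentID = d.departmentID
ORDER BY s.studentID
"""
via_subquery = conn.execute(subquery_sql).fetchall()
via_join = conn.execute(join_sql).fetchall()
print(via_subquery)
print(via_join)
```

Each extra department column needs its own scalar subquery this way, which is one reason the join reads better.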

Using group by and having clause

Using the following schema:
Supplier (sid, name, status, city)
Part (pid, name, color, weight, city)
Project (jid, name, city)
Supplies (sid, pid, jid, quantity)
1. Get supplier numbers and names for suppliers of parts supplied to at least two different projects.
2. Get supplier numbers and names for suppliers of the same part to at least two different projects.
These were my answers:
1.
SELECT s.sid, s.name
FROM Supplier s, Supplies su, Project pr
WHERE s.sid = su.sid AND su.jid = pr.jid
GROUP BY s.sid, s.name
HAVING COUNT (DISTINCT pr.jid) >= 2
2.
SELECT s.sid, s.name
FROM Suppliers s, Supplies su, Project pr, Part p
WHERE s.sid = su.sid AND su.pid = p.pid AND su.jid = pr.jid
GROUP BY s.sid, s.name
HAVING COUNT (DISTINCT pr.jid)>=2
Can anyone confirm if I wrote this correctly? I'm a little confused as to how the Group By and Having clauses work.
The semantics of Having
To better understand having, you need to see it from a theoretical point of view.
A group by is a query that takes a table and summarizes it into another table. You summarize the original table by grouping the original table into subsets (based upon the attributes that you specify in the group by). Each of these groups will yield one tuple.
The Having is simply equivalent to a WHERE clause after the group by has executed and before the select part of the query is computed.
Let's say your query is:
select a, b, count(*)
from Table
where c > 100
group by a, b
having count(*) > 10;
The evaluation of this query can be seen as the following steps:
1. Perform the WHERE, eliminating rows that do not satisfy it.
2. Group the table into subsets based upon the values of a and b (each tuple in each subset has the same values of a and b).
3. Eliminate subsets that do not satisfy the HAVING condition.
4. Process each subset, outputting the values as indicated in the SELECT part of the query. This creates one output tuple per subset left after step 3.
You can extend this to any complex query, where Table can be any complex query that returns a table (a cross product, a join, a UNION, etc.).
In fact, having is syntactic sugar and does not extend the power of SQL. Any given query:
SELECT list
FROM table
GROUP BY attrList
HAVING condition;
can be rewritten as:
SELECT list from (
SELECT listatt
FROM table
GROUP BY attrList) as Name
WHERE condition;
The listatt is a list that includes the GROUP BY attributes and the expressions used in list and condition. It might be necessary to name some expressions in this list (with AS). For instance, the example query above can be rewritten as:
select a, b, count
from (select a, b, count(*) as count
from Table
where c > 100
group by a, b) as someName
where count > 10;
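The equivalence can be checked on a toy table. This is only a sketch using Python's sqlite3 module: the table name t, the alias cnt, the rows, and the lowered threshold (count greater than 1 instead of 10) are all made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (a TEXT, b TEXT, c INTEGER)")
# hypothetical sample rows
rows = (
    [("x", "p", 150)] * 3
    + [("x", "q", 200), ("x", "q", 50), ("x", "q", 60)]
    + [("y", "p", 101), ("y", "p", 300)]
)
conn.executemany("INSERT INTO t VALUES (?, ?, ?)", rows)

with_having = conn.execute("""
    SELECT a, b, COUNT(*)
    FROM t
    WHERE c > 100
    GROUP BY a, b
    HAVING COUNT(*) > 1
    ORDER BY a, b
""").fetchall()

# Same query with the HAVING moved into a WHERE over a derived table.
with_subquery = conn.execute("""
    SELECT a, b, cnt
    FROM (SELECT a, b, COUNT(*) AS cnt
          FROM t
          WHERE c > 100
          GROUP BY a, b) AS grouped
    WHERE cnt > 1
    ORDER BY a, b
""").fetchall()

print(with_having)
print(with_subquery)
```

The group (x, q) drops out in both forms because only one of its rows survives the WHERE.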
The solution you need
Your solution seems to be correct:
SELECT s.sid, s.name
FROM Supplier s, Supplies su, Project pr
WHERE s.sid = su.sid AND su.jid = pr.jid
GROUP BY s.sid, s.name
HAVING COUNT (DISTINCT pr.jid) >= 2
You join the three tables, then use sid as the grouping attribute (name is functionally dependent on it, so it does not affect the number of groups, but you must include it, otherwise it cannot be part of the select part of the statement). Then you remove the groups that do not satisfy your condition: the number of distinct pr.jid values must be >= 2, which is what you wanted originally.
Best solution to your problem
I personally prefer a simpler cleaner solution:
You only need to group Supplies (sid, pid, jid, quantity) by sid to
find the sid of those that supply at least two projects.
Then join it to the Supplier table to get the supplier name.
SELECT sid, name from
(SELECT sid from Supplies
GROUP BY sid
HAVING count(DISTINCT jid) >= 2
) AS T1
NATURAL JOIN
Supplier;
It will probably also be faster to execute, because the join is only done for the suppliers that qualify, not for every row.
--dmg
Because we cannot use a WHERE clause with aggregate functions like count(), min(), sum(), etc., the HAVING clause came into existence to overcome this problem in SQL. For an example of the HAVING clause, go through this link:
http://www.sqlfundamental.com/having-clause.php
First of all, you should use the JOIN syntax rather than FROM table1, table2, and you should always limit the grouping to as few fields as you need.
Although I haven't tested it, your first query seems fine to me, but could be rewritten as:
SELECT s.sid, s.name
FROM
Supplier s
INNER JOIN (
SELECT su.sid
FROM Supplies su
GROUP BY su.sid
HAVING COUNT(DISTINCT su.jid) > 1
) g
ON g.sid = s.sid
Or simplified as:
SELECT sid, name
FROM Supplier s
WHERE (
SELECT COUNT(DISTINCT su.jid)
FROM Supplies su
WHERE su.sid = s.sid
) > 1
However, your second query seems wrong to me, because you should also GROUP BY pid.
SELECT s.sid, s.name
FROM
Supplier s
INNER JOIN (
SELECT DISTINCT su.sid
FROM Supplies su
GROUP BY su.sid, su.pid
HAVING COUNT(DISTINCT su.jid) > 1
) g
ON g.sid = s.sid
As you may have noticed in the query above, I used the INNER JOIN syntax to perform the filtering, however it can be also written as:
SELECT s.sid, s.name
FROM Supplier s
WHERE EXISTS (
SELECT 1
FROM Supplies su
WHERE su.sid = s.sid
GROUP BY su.sid, su.pid
HAVING COUNT(DISTINCT su.jid) > 1
)
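The difference between the two questions shows up on a tiny data set. Below is a sketch using Python's sqlite3 module with invented rows: supplier 2 supplies two different projects, but never the same part to both. The second query uses EXISTS over a subquery grouped by part, which is just one way to express the per-part check.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Supplier (sid INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE Supplies (sid INTEGER, pid TEXT, jid TEXT, quantity INTEGER);
-- hypothetical sample data
INSERT INTO Supplier VALUES (1, 'Acme'), (2, 'Bolt'), (3, 'Cogs');
INSERT INTO Supplies VALUES
    (1, 'P1', 'J1', 5), (1, 'P1', 'J2', 5),   -- same part, two projects
    (2, 'P1', 'J1', 5), (2, 'P2', 'J2', 5),   -- two projects, different parts
    (3, 'P1', 'J1', 5);                       -- one project only
""")

# Question 1: any parts, at least two different projects.
q1 = conn.execute("""
    SELECT s.sid, s.name
    FROM Supplier s
    WHERE (SELECT COUNT(DISTINCT su.jid)
           FROM Supplies su
           WHERE su.sid = s.sid) > 1
    ORDER BY s.sid
""").fetchall()

# Question 2: the same part must reach at least two different projects.
q2 = conn.execute("""
    SELECT s.sid, s.name
    FROM Supplier s
    WHERE EXISTS (SELECT 1
                  FROM Supplies su
                  WHERE su.sid = s.sid
                  GROUP BY su.pid
                  HAVING COUNT(DISTINCT su.jid) > 1)
    ORDER BY s.sid
""").fetchall()

print(q1)
print(q2)
```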
What type of SQL database are you using (MSSQL, Oracle, etc.)?
I believe what you have written is correct.
You could also write the first query like this:
SELECT s.sid, s.name
FROM Supplier s
WHERE (SELECT COUNT(DISTINCT pr.jid)
FROM Supplies su, Project pr
WHERE su.sid = s.sid
AND pr.jid = su.jid) >= 2
It's a little more readable, and less mind-bending than trying to do it with GROUP BY. Performance may differ though.
1. Get supplier numbers and names for suppliers of parts supplied to at least two different projects.
SELECT S.SID, S.NAME
FROM SUPPLIES SP
JOIN SUPPLIER S
ON SP.SID = S.SID
WHERE PID IN
(SELECT PID FROM SUPPLIES GROUP BY PID HAVING COUNT(DISTINCT JID) >= 2)
I am not clear about your second question.

JOIN multiple tables in SQL SERVER

I have the logical ERD shown above. I'm fine with it, BUT I can't understand how to display the correct information.
For example:
I need to list the groups and the members belonging to each group. For each group, show the ID and its name. For each member, show the unique identifier, the name, gender, date of birth and identifier of their group leader.
Ok, we have group table and group member table.
SELECT group ID, group name
FROM group;
SELECT member ID, name, gender, D.O.B, Leader ID
From group member;
I understand that this is wrong; I just don't understand how to display the right information. I can imagine it but can't write it down O_o....stuck a bit
One more question: how about the supervisor? I understand that it goes through (Activity Participant), BUT how should I create the activity table with this supervisor as a foreign key?
This is what you can do:
SELECT
P.Name,
P.DOB,
P.Gender,
G.GroupName,
GL.PersonId
FROM Person P
INNER JOIN GroupMember GM ON GM.PersonId = P.PersonId
INNER JOIN [Group] G ON G.GroupId = GM.GroupId
INNER JOIN GroupLeader GL ON GL.GroupId = G.GroupId
You can JOIN more tables and build your query as shown above.