How do I find the person who has taught the most classes - sql

I want to try and find the employee who has taught the most classes as the position Teacher. So in this I want to print out Nick, as he has taught the most classes as a Teacher.
However, I am getting the error:
ERROR: column "e.name" must appear in the GROUP BY clause or be used in an aggregate function Position: 24
CREATE TABLE employees (
id integer primary key,
name text
);
CREATE TABLE positions (
id integer primary key,
name text
);
CREATE TABLE teaches (
id integer primary key,
class text,
employee integer,
position integer,
foreign key (employee) references employees(id),
foreign key (position) references positions(id)
);
INSERT INTO employees (id, name) VALUES
(1, 'Clive'), (2, 'Johnny'), (3, 'Sam'), (4, 'Nick');
INSERT INTO positions (id, name) VALUES
(1, 'Assistant'), (2, 'Teacher'), (3, 'CEO'), (4, 'Manager');
INSERT INTO teaches (id, class, employee, position) VALUES
(1, 'Dancing', 1, 1), (2, 'Gardening', 1, 2),
(3, 'Dancing', 1, 2), (4, 'Baking', 4, 2),
(5, 'Gardening', 4, 2), (6, 'Gardening', 4, 2),
(7, 'Baseball', 4, 1), (8, 'Baseball', 2, 1),
(9, 'Baseball', 4, 2);
The SQL statement I am trying to use:
SELECT count(t.class), e.name
FROM positions p
JOIN teaches t
ON p.id = t.position
JOIN employees e
ON e.id = t.employee
WHERE p.name = 'Teacher'
GROUP BY t.employee;
I've been working on this on a sql fiddle:
http://www.sqlfiddle.com/#!17/a8e19c/3

Your query looks pretty good. You just need to fix the GROUP BY clause, so it is consistent with the columns in the SELECT clause. Then ORDER BY and LIMIT:
SELECT count(*) cnt_classes, e.name
FROM positions p
INNER JOIN teaches t ON p.id = t.position
INNER JOIN employees e ON e.id = t.employee
WHERE p.name = 'Teacher'
GROUP BY e.id --> primary key of "employees"
ORDER BY cnt_classes DESC --> order by descending count of classes
LIMIT 1 --> keep the first row only

In your select you are using the aggregate COUNT, which counts all rows in each group (GROUP BY t.employee), but you don't aggregate e.name.
So for Nick you basically select 4 rows, one per class, each with two columns: class name and teacher name. Then you ask the server to count the class names in Nick's group (by his employee id). That aggregates the 4 rows into one with the value 4, but you don't do anything about the teacher name, so you are left with an invalid structure: 1 row for the class count column and 4 rows for the teacher name. The same goes for the other teachers, and that's what the server is complaining about. The easiest way to fix it is to add e.name to the GROUP BY, which squeezes those 4 rows with the same value into one.
To get the teacher who teaches the most classes, you then only need to sort the results by class count in descending order and limit the result to 1 row. That gives you the row with the highest class count.
Updated fiddle: http://www.sqlfiddle.com/#!17/a8e19c/7
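To check the corrected query outside the fiddle, here is a minimal sketch that loads the question's sample data into an in-memory SQLite database via Python's sqlite3 module. SQLite is only a stand-in (an assumption, the question uses PostgreSQL) and it is more lenient about GROUP BY, so it would not reproduce the original error; the point is just to confirm the result of the fixed query. The helper names build_db and top_teacher are hypothetical.

```python
import sqlite3


def build_db():
    """Create an in-memory database holding the sample data from the question."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE employees (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE positions (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE teaches (
            id INTEGER PRIMARY KEY,
            class TEXT,
            employee INTEGER REFERENCES employees(id),
            position INTEGER REFERENCES positions(id)
        );
        INSERT INTO employees (id, name) VALUES
            (1, 'Clive'), (2, 'Johnny'), (3, 'Sam'), (4, 'Nick');
        INSERT INTO positions (id, name) VALUES
            (1, 'Assistant'), (2, 'Teacher'), (3, 'CEO'), (4, 'Manager');
        INSERT INTO teaches (id, class, employee, position) VALUES
            (1, 'Dancing', 1, 1), (2, 'Gardening', 1, 2),
            (3, 'Dancing', 1, 2), (4, 'Baking', 4, 2),
            (5, 'Gardening', 4, 2), (6, 'Gardening', 4, 2),
            (7, 'Baseball', 4, 1), (8, 'Baseball', 2, 1),
            (9, 'Baseball', 4, 2);
        """
    )
    return conn


def top_teacher(conn):
    """Return (name, class_count) for the employee with the most 'Teacher' rows."""
    return conn.execute(
        """
        SELECT e.name, COUNT(*) AS cnt_classes
        FROM teaches t
        JOIN positions p ON p.id = t.position
        JOIN employees e ON e.id = t.employee
        WHERE p.name = 'Teacher'
        GROUP BY e.id, e.name          -- every non-aggregated column is grouped
        ORDER BY cnt_classes DESC      -- highest count first
        LIMIT 1                        -- keep only the top row
        """
    ).fetchone()


if __name__ == "__main__":
    print(top_teacher(build_db()))
```

If two employees tie for the highest count, LIMIT 1 returns only one of them, so a tie-breaking ORDER BY column may be worth adding.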

You're getting the error because every non-aggregated column you're selecting (e.name in this example) needs to be in the GROUP BY clause, otherwise SQL doesn't know how to group and return a count for that column. You'll also want an ORDER BY and a row limit if you want to return only the person with the most. In SQL Server that is TOP(1); in PostgreSQL, which your error message comes from, it is LIMIT 1:
SELECT count(*), e.name
FROM teaches t
INNER JOIN positions p ON t.position = p.id
INNER JOIN employees e ON e.id = t.employee
WHERE p.name = 'Teacher'
GROUP BY e.name
ORDER BY count(*) DESC
LIMIT 1;

Related

How many elements in one column are linked to an element other column?

Consider I have two tables
Courses          Program
---------------------------
course_ID        program_id
course_title     program_name
program_ID
Now, I want to check the number of courses (by course_id) offered by each program (program_id).
select c.program_id ,p.program_name, count(course_id)
from courses c
join Program p on c.Program_id =p.Program_id
group by program_id,program_name
If I understood you correctly, you're searching for a GROUP BY and a corresponding aggregate.
--Creating sample tables and data
SELECT course_ID, course_title, program_ID
INTO #courses
FROM (
VALUES (0, 'course_0', 0),
(1, 'course_1', 0),
(2, 'course_2', 0),
(3, 'course_3', 0),
(4, 'course_4', 1),
(5, 'course_5', 1),
(NULL, 'course_6', 1)
) AS C (course_ID, course_title, program_ID)
SELECT program_ID, program_title
INTO #programs
FROM (
VALUES (0, 'program_0'),
(1, 'program_1')
) AS P (program_ID, program_title)
and after that execute the query
SELECT P.program_title, COUNT(C.course_ID) AS courses_amount
FROM #courses C
INNER JOIN #programs P ON C.program_ID = P.program_ID
GROUP BY P.program_ID, P.program_title
So you basically GROUP BY the value you want to aggregate to and COUNT the 'course_id'.
COUNT(C.course_ID) only counts actual values and will ignore NULLs.
If you want to count the NULLs as well, just use COUNT(*).
EDIT: Forgot the result...
So it'll look like this:
program_title    courses_amount
program_0        4
program_1        2
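The difference between COUNT(C.course_ID) and COUNT(*) can be checked with a small sketch. It uses Python's sqlite3 module with ordinary tables instead of the SQL Server temp tables above (an assumption made only so the example is self-contained), and the helper names build_db and course_counts are hypothetical.

```python
import sqlite3


def build_db():
    """Recreate the sample courses/programs data, including the NULL course_id."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE courses (course_id INTEGER, course_title TEXT, program_id INTEGER);
        CREATE TABLE programs (program_id INTEGER, program_title TEXT);
        INSERT INTO courses VALUES
            (0, 'course_0', 0), (1, 'course_1', 0), (2, 'course_2', 0),
            (3, 'course_3', 0), (4, 'course_4', 1), (5, 'course_5', 1),
            (NULL, 'course_6', 1);
        INSERT INTO programs VALUES (0, 'program_0'), (1, 'program_1');
        """
    )
    return conn


def course_counts(conn):
    """Return (program_title, non-NULL course ids, all rows) per program."""
    return conn.execute(
        """
        SELECT p.program_title,
               COUNT(c.course_id) AS non_null_ids,  -- skips NULL course_id
               COUNT(*)           AS all_rows       -- counts every joined row
        FROM courses c
        JOIN programs p ON p.program_id = c.program_id
        GROUP BY p.program_id, p.program_title
        ORDER BY p.program_id
        """
    ).fetchall()


if __name__ == "__main__":
    for row in course_counts(build_db()):
        print(row)
```

For program_1 the two counts differ by one, because 'course_6' has a NULL course_id.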

Figure out which employee has done more training

I have 2 tables, employees and classes_taken. I am trying to see which employees have taken more than 2 classes. However, I am getting this error: ERROR: aggregate functions are not allowed in WHERE Position: 87.
CREATE TABLE employees (
id integer primary key,
name text
);
CREATE TABLE classes_taken (
employee integer,
class text,
foreign key (employee) references employees(id)
);
INSERT INTO employees (id, name) VALUES
(1, 'bob'), (2, 'sam'), (3, 'mike');
INSERT INTO classes_taken (employee, class) VALUES
(1, 'swimming'), (1, 'dancing'), (2, 'swimming'), (2, 'tennis'), (3, 'golf'), (3, 'dancing');
My select statement.
select e.id, e.name
FROM employees e
JOIN classes_taken c
ON e.id = c.employee
WHERE count(c.class) > 2
GROUP BY c.class;
SQLFiddle: http://sqlfiddle.com/#!15/5296cb/4
You need to keep the count filter in a HAVING clause, because it is on an aggregated value.
select e.id, e.name
FROM employees e
INNER JOIN classes_taken c ON e.id = c.employee
GROUP BY e.id, e.name
HAVING count(c.class) > 2
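As a rough illustration of the WHERE versus HAVING distinction, the sketch below runs both forms against the sample data using Python's sqlite3 module. SQLite stands in for PostgreSQL here (an assumption), and its error text differs, but it also rejects an aggregate in WHERE. The helper names are hypothetical.

```python
import sqlite3


def build_db():
    """Load the employees / classes_taken sample data from the question."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE employees (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE classes_taken (
            employee INTEGER REFERENCES employees(id),
            class TEXT
        );
        INSERT INTO employees (id, name) VALUES (1, 'bob'), (2, 'sam'), (3, 'mike');
        INSERT INTO classes_taken (employee, class) VALUES
            (1, 'swimming'), (1, 'dancing'), (2, 'swimming'),
            (2, 'tennis'), (3, 'golf'), (3, 'dancing');
        """
    )
    return conn


def where_count_is_rejected(conn):
    """Return True if filtering on count() in WHERE raises a database error."""
    try:
        conn.execute(
            """
            SELECT e.id, e.name
            FROM employees e
            JOIN classes_taken c ON e.id = c.employee
            WHERE count(c.class) > 2
            GROUP BY e.id, e.name
            """
        )
    except sqlite3.Error:
        return True
    return False


def employees_with_more_than(conn, n):
    """Names of employees with more than n classes; the filter lives in HAVING."""
    rows = conn.execute(
        """
        SELECT e.name
        FROM employees e
        JOIN classes_taken c ON e.id = c.employee
        GROUP BY e.id, e.name
        HAVING count(c.class) > ?
        ORDER BY e.id
        """,
        (n,),
    )
    return [name for (name,) in rows]


if __name__ == "__main__":
    conn = build_db()
    print(where_count_is_rejected(conn))
    print(employees_with_more_than(conn, 1))
    print(employees_with_more_than(conn, 2))
```

Note that with the sample data every employee has exactly 2 classes, so a threshold of "more than 2" returns no rows even though the query is valid.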

HAVING clause with subquery -- Checking if group has at least one row matching conditions

Suppose I have the following table
DROP TABLE IF EXISTS #toy_example
CREATE TABLE #toy_example
(
Id int,
Pet varchar(10)
);
INSERT INTO #toy_example
VALUES (1, 'dog'),
(1, 'cat'),
(1, 'emu'),
(2, 'cat'),
(2, 'turtle'),
(2, 'lizard'),
(3, 'dog'),
(4, 'elephant'),
(5, 'cat'),
(5, 'emu')
and I want to fetch all Ids that have certain pets (for example either cat or emu, so Ids 1, 2 and 5).
DROP TABLE IF EXISTS #Pets
CREATE TABLE #Pets
(
Animal varchar(10)
);
INSERT INTO #Pets
VALUES ('cat'),
('emu')
SELECT Id
FROM #toy_example
GROUP BY Id
HAVING COUNT(
CASE
WHEN Pet IN (SELECT Animal FROM #Pets)
THEN 1
END
) > 0
The above gives me the error Cannot perform an aggregate function on an expression containing an aggregate or a subquery. I have two questions:
Why is this an error? If I instead hard code the subquery in the HAVING clause, i.e. WHEN Pet IN ('cat','emu') then this works. Is there a reason why SQL server (I've checked with SQL server 2017 and 2008) does not allow this?
What would be a nice way to do this? Note that the above is just a toy example. The real problem has many possible "Pets", which I do not want to hard code. It would be nice if the suggested method could check for multiple other similar conditions too in a single query.
If I followed you correctly, you can just join and aggregate:
select t.id, count(*) nb_of_matches
from #toy_example t
inner join #pets p on p.animal = t.pet
group by t.id
The inner join eliminates records from #toy_example that have no match in #pets. Then, we aggregate by id and count how many records remain in each group.
If you want to retain records that have no match in #pets and display them with a count of 0, then you can left join instead:
select t.id, count(*) nb_of_records, count(p.animal) nb_of_matches
from #toy_example t
left join #pets p on p.animal = t.pet
group by t.id
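How about this approach? Note that it returns only the Ids that have every animal listed in #pets (here 1 and 5), not the Ids that have at least one of them.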
SELECT e.Id
FROM #toy_example e JOIN
#pets p
ON e.pet = p.animal
GROUP BY e.Id
HAVING COUNT(DISTINCT e.pet) = (SELECT COUNT(*) FROM #pets);
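To compare the "at least one match" and "all animals matched" variants side by side, here is a small sketch using Python's sqlite3 module. Ordinary tables named toy_example and pets replace the SQL Server temp tables (an assumption for portability), and the helper names are hypothetical.

```python
import sqlite3


def build_db():
    """Load the toy pets data into ordinary tables (no '#' temp-table prefix)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE toy_example (id INTEGER, pet TEXT);
        CREATE TABLE pets (animal TEXT);
        INSERT INTO toy_example VALUES
            (1, 'dog'), (1, 'cat'), (1, 'emu'),
            (2, 'cat'), (2, 'turtle'), (2, 'lizard'),
            (3, 'dog'), (4, 'elephant'),
            (5, 'cat'), (5, 'emu');
        INSERT INTO pets VALUES ('cat'), ('emu');
        """
    )
    return conn


def ids_with_any_pet(conn):
    """Ids owning at least one animal listed in pets (inner join + group)."""
    rows = conn.execute(
        """
        SELECT t.id
        FROM toy_example t
        JOIN pets p ON p.animal = t.pet
        GROUP BY t.id
        ORDER BY t.id
        """
    )
    return [i for (i,) in rows]


def ids_with_all_pets(conn):
    """Ids owning every animal listed in pets (matched count equals list size)."""
    rows = conn.execute(
        """
        SELECT t.id
        FROM toy_example t
        JOIN pets p ON p.animal = t.pet
        GROUP BY t.id
        HAVING COUNT(DISTINCT t.pet) = (SELECT COUNT(*) FROM pets)
        ORDER BY t.id
        """
    )
    return [i for (i,) in rows]


if __name__ == "__main__":
    conn = build_db()
    print(ids_with_any_pet(conn))
    print(ids_with_all_pets(conn))
```

Because the list of animals lives in a table, neither variant needs the values hard-coded in the query.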

SQL Query to satisfy 2 conditions

I'm new to SQL and, after designing the database, I'm having trouble with some queries. The query I'm currently struggling with states:
"A list of the customers who have ordered at least one project with a higher than average expected duration."
SELECT Customer.name
FROM Project, Customer
WHERE Project.c_id = Customer.c_id AND Project.exp_duration > AVG(Project.exp_duration)
I tried to implement this code but I keep getting the following error message: "An aggregate may not appear in the WHERE clause unless it is in a subquery contained in a HAVING clause or a select list, and the column being aggregated is an outer reference."
Can someone help me with this? I've thought about using joins but I can't get it to work either.
Thanks in advance!
Replace the table variables (@Project & @Customer) with your real tables (Project & Customer).
DECLARE @Project TABLE
(
p_id INT,
exp_duration DECIMAL(18,2),
c_id INT
)
DECLARE @Customer TABLE
(
c_id INT,
name VARCHAR(20)
)
INSERT @Project VALUES (1, 10, 1), (2, 5, 1), (3, 20, 1), (4, 10, 2), (5, 15, 2), (6, 20, 1)
INSERT @Customer VALUES (1, 'C1'), (2, 'C2')
-- average duration
-- SELECT AVG(exp_duration) FROM @Project
SELECT DISTINCT C.name
FROM @Customer C INNER JOIN @Project P ON C.c_id = P.c_id
WHERE P.exp_duration > (SELECT AVG(exp_duration) FROM @Project)
The following query lists the customers who have ordered at least one project whose ExpectedDuration is greater than the average ExpectedDuration. Note that the subquery is correlated on CustomerID, so the average is computed per customer; drop the subquery's Where clause to compare against the overall average instead.
I have used left outer join, group by, count and avg aggregate functions.
Select
C.CustomerID,
C.Name
From SampleCustomer C
Left Join SampleProject P
On C.CustomerID = P.CustomerID
Where P.ExpectedDuration > (Select Avg(ExpectedDuration) From SampleProject Where CustomerID = C.CustomerID)
Group By C.CustomerID, C.Name
Having Count(P.ProjectID) >= 1
Order By C.CustomerID;
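The subquery approach (comparing each project against the overall average) can be tried with the sample rows from the first answer. This is a minimal sketch using Python's sqlite3 module with ordinary tables named project and customer (an assumption, replacing the SQL Server table variables); the helper names are hypothetical.

```python
import sqlite3


def build_db():
    """Load the sample projects and customers from the first answer."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE project (p_id INTEGER, exp_duration REAL, c_id INTEGER);
        CREATE TABLE customer (c_id INTEGER, name TEXT);
        INSERT INTO project VALUES
            (1, 10, 1), (2, 5, 1), (3, 20, 1),
            (4, 10, 2), (5, 15, 2), (6, 20, 1);
        INSERT INTO customer VALUES (1, 'C1'), (2, 'C2');
        """
    )
    return conn


def customers_above_average(conn):
    """Customers with at least one project longer than the overall average."""
    rows = conn.execute(
        """
        SELECT DISTINCT c.name
        FROM customer c
        JOIN project p ON c.c_id = p.c_id
        -- The aggregate sits in a scalar subquery, so it is allowed in WHERE.
        WHERE p.exp_duration > (SELECT AVG(exp_duration) FROM project)
        ORDER BY c.name
        """
    )
    return [name for (name,) in rows]


if __name__ == "__main__":
    print(customers_above_average(build_db()))
```

The overall average of the six sample durations is 80 / 6, about 13.33, so the projects with durations 15 and 20 qualify.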

Aggregation Sum in SQL (Joins)

I'm having some trouble with summing some column in my query with aggregation.
It's a bit difficult to describe what is happening but I'll try my best:
I have 3 tables - details, extra details and places.
Places is a table that contains places in the world. Details contains details about events that happened, and extra details provides some more data on the events. Each place has an ID and a ParentID (for example, New York has an ID and its parent ID is that of the US). The ID of the event (details) appears a number of times as a column in the extra details table. The extra details table also holds the ID of the place where that event occurred.
OK after all of that, what I'm trying to achieve is, for each place, the sum of the events that happened there. I know it sounds very specific, but it's what the client asked. Anyhow, example of what I'm trying to get to:
NewYork 60, Chicago 20, Houston 10 Then the US will have 90. And it has several levels.
So this is what I was trying to do:
With C(ID, NAME, COUNTT, ROOT_ID) as
(
SELECT d.ID, d.NAME,
(SELECT COUNT(LX.ID) as COUNTT
FROM EXTRA LX
RIGHT JOIN d ON LX.PLACE_ID = d.ID -- ****
GROUP BY d.ID, d.NAME),
d.ID as ROOT_ID
FROM PLACES d
UNION ALL
SELECT d.ID, d.NAME,
(SELECT COUNT(LX.ID) as COUNTT
FROM EXTRA LX
RIGHT JOIN d ON LX.PLACE_ID = d.ID
GROUP BY d.ID, d.NAME),
C.ROOT_ID
FROM PLACES dx
INNER JOIN C ON dx.PARENT_ID = C.ID
)
SELECT p.ID, p.NAME, S.SumIncludingChildren
FROM places p
INNER JOIN (
SELECT ROOT_ID, SUM(COUNTT) as SumIncludingChildren
FROM C
GROUP BY ROOT_ID
) S
ON p.ID = S.ROOT_ID
ORDER BY p.ID;
The details table is only for showing their data. I'll add that later. It's only comparing the respective columns. To make it work I don't need that. Only for the site data.
It doesn't work because it doesn't recognize the 'd' where the '****' is. If I put a 'new instance' of that table there, it doesn't work either. So I tried to replicate what the right join does by doing 'NOT EXISTS IN' on a query that gets all the places instead of the right join...on. Same problem.
Maybe I don't get something. But I'm really seeking a solution and some explanation. I know my code isn't perfect.
Thanks in advance.
EDIT: I'm using OracleSQL on Toad 10.6
create table p(id number, up number, name varchar2(100));
create table e(id number, pid number, dsc varchar2(100));
insert into p values (1, null, 'country');
insert into p values (2, 1, 'center');
insert into p values (3, 1, 'province');
insert into p values (4, 2, 'capital');
insert into p values (5, 2, 'suburb');
insert into p values (6, 3, 'forest');
insert into p values (7, 3, 'village');
insert into p values (8, 7, 'shed');
insert into p values (9, 2, 'gov');
insert into e values (1, 8, 'moo');
insert into e values (2, 8, 'clank');
insert into e values (3, 7, 'sowing');
insert into e values (4, 6, 'shot');
insert into e values (5, 6, 'felling');
insert into e values (6, 5, 'train');
insert into e values (7, 5, 'cottage');
insert into e values (8, 5, 'rest');
insert into e values (9, 4, 'president');
insert into e values (10,1, 'GNP');
commit;
with
places as
(select id,
up,
connect_by_root id as root,
level lvl
from p
connect by prior id = up),
ev_stats as
(select root as place, max(lvl) as max_lvl, count(e.id) as ev_count
from places left outer join e
on places.id = e.pid
group by root)
select max_lvl, p.name, ev_count
from ev_stats inner join p on p.id = ev_stats.place
order by max_lvl desc;
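The CONNECT BY query above is Oracle-specific. As a rough cross-check of the roll-up idea (pair every place with each of its descendants, then count the events attached to those descendants), here is a sketch that uses a standard recursive CTE in SQLite through Python's sqlite3 module. It is an alternative formulation, not the answer's query: it drops the max_lvl column, and the helper names are hypothetical.

```python
import sqlite3


def build_db():
    """Load the answer's sample places (p) and events (e)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE p (id INTEGER, up INTEGER, name TEXT);
        CREATE TABLE e (id INTEGER, pid INTEGER, dsc TEXT);
        INSERT INTO p VALUES
            (1, NULL, 'country'), (2, 1, 'center'), (3, 1, 'province'),
            (4, 2, 'capital'), (5, 2, 'suburb'), (6, 3, 'forest'),
            (7, 3, 'village'), (8, 7, 'shed'), (9, 2, 'gov');
        INSERT INTO e VALUES
            (1, 8, 'moo'), (2, 8, 'clank'), (3, 7, 'sowing'), (4, 6, 'shot'),
            (5, 6, 'felling'), (6, 5, 'train'), (7, 5, 'cottage'),
            (8, 5, 'rest'), (9, 4, 'president'), (10, 1, 'GNP');
        """
    )
    return conn


def events_including_children(conn):
    """Map each place name to the event count of the place plus all descendants."""
    rows = conn.execute(
        """
        WITH RECURSIVE places(root, id) AS (
            SELECT id, id FROM p              -- every place is its own root
            UNION ALL
            SELECT places.root, p.id          -- walk down to the children
            FROM p
            JOIN places ON p.up = places.id
        )
        SELECT p.name, COUNT(e.id) AS ev_count
        FROM places
        JOIN p ON p.id = places.root
        LEFT JOIN e ON e.pid = places.id      -- keep places without events
        GROUP BY places.root, p.name
        ORDER BY places.root
        """
    ).fetchall()
    return dict(rows)


if __name__ == "__main__":
    for name, count in events_including_children(build_db()).items():
        print(name, count)
```

The LEFT JOIN together with COUNT(e.id) is what lets a place with no events in its subtree, such as 'gov', show up with a count of 0.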