"Group by" expression - sql

I'm trying to print department_id, department_name, and the sum of salaries for each department, but the query fails and I can't figure out why. I get the error: ORA-00979 - "not a GROUP BY expression"
SELECT d.department_id, d.department_name, SUM(e.salary)
FROM departments d, employees e
WHERE d.department_id = e.department_id
GROUP BY d.department_id;

Just add the name to the group by expression. Along the way, also fix the query to use explicit join syntax:
SELECT d.department_id, d.department_name, SUM(e.salary)
FROM departments d JOIN
employees e
ON d.department_id = e.department_id
GROUP BY d.department_id, d.department_name;
Although you didn't ask, I will point out that your version of the query does make sense and is ANSI-compliant. You are aggregating by a primary key, so the other columns of that table are functionally dependent on it and may be selected without being grouped. Most databases don't support this feature, though, and Oracle is one of them.
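For anyone who wants to try the corrected query without an Oracle instance, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in. The rows are made-up toy data; only the table and column names come from the question. Note that SQLite is more permissive than Oracle and would not raise ORA-00979 for the original query, so this only shows what the fixed query returns.

```python
import sqlite3

# Hypothetical toy data; table and column names follow the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE departments (department_id INTEGER PRIMARY KEY, department_name TEXT);
    CREATE TABLE employees (employee_id INTEGER PRIMARY KEY, department_id INTEGER, salary INTEGER);
    INSERT INTO departments VALUES (10, 'Administration'), (20, 'Marketing');
    INSERT INTO employees VALUES (1, 10, 4400), (2, 20, 13000), (3, 20, 6000);
""")

# Every non-aggregated column in the SELECT list also appears in GROUP BY.
rows = conn.execute("""
    SELECT d.department_id, d.department_name, SUM(e.salary)
    FROM departments d
    JOIN employees e ON d.department_id = e.department_id
    GROUP BY d.department_id, d.department_name
    ORDER BY d.department_id
""").fetchall()

print(rows)  # [(10, 'Administration', 4400), (20, 'Marketing', 19000)]
```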

SELECT d.department_id, d.department_name, SUM(e.salary)
FROM departments d, employees e
WHERE d.department_id = e.department_id
GROUP BY d.department_id, d.department_name;
You just needed to add the department_name column to your GROUP BY clause.

Alternately, you can aggregate the department name as well using MIN() or MAX():
SELECT d.department_id, MAX(d.department_name) AS department_name
, SUM(e.salary) AS department_salary
FROM departments d INNER JOIN employees e
ON d.department_id = e.department_id
GROUP BY d.department_id;
Note that I updated your syntax from the older ANSI-89 comma-join style to the newer ANSI-92 explicit JOIN syntax. If you prefer the older syntax (as I do), then just use this:
SELECT d.department_id, MAX(d.department_name) AS department_name
, SUM(e.salary) AS department_salary
FROM departments d, employees e
WHERE d.department_id = e.department_id
GROUP BY d.department_id;

Related

SQL in Oracle HR Schema

I have made a query in Oracle HR schema to see the following information:
The city where the department is located
The total number of employees in the department
However, the query fails with the error "not a GROUP BY expression".
Does anyone know what the problem is? Thanks in advance.
SELECT department_name, city, COUNT(employees.department_id)
FROM departments
JOIN employees on (departments.department_id=employees.department_id)
JOIN locations USING (location_id)
GROUP BY department_name;
You are grouping by department and want to show the department's city. You expect this to work, because each department is in exactly one city. (SQL people call this functional dependency.)
For this to work, two things would have to be true:
1. There would have to be a unique constraint on the department name, or you'd have to group by department_id instead.
2. The DBMS would have to detect and support functional dependency in aggregation queries.
Unfortunately, Oracle doesn't support functional dependency in aggregation queries. It forces us to put every such column in the GROUP BY clause or into an aggregation function.
So either extend the GROUP BY clause:
SELECT d.department_name, l.city, COUNT(e.department_id)
FROM departments d
JOIN employees e ON e.department_id = d.department_id
JOIN locations l USING (location_id)
GROUP BY d.department_name, l.city
ORDER BY d.department_name;
or use an aggregate function such as MIN or MAX on that single value:
SELECT d.department_name, MAX(l.city) AS city, COUNT(e.department_id)
FROM departments d
JOIN employees e ON e.department_id = d.department_id
JOIN locations l USING (location_id)
GROUP BY d.department_name
ORDER BY d.department_name;
What I prefer though, is to aggregate first and only then join. You want to join the departments with their employee count, so do just that:
SELECT d.department_name, l.city, COALESCE(e.cnt, 0) AS employee_count
FROM departments d
JOIN locations l USING (location_id)
LEFT JOIN
(
SELECT department_id, COUNT(*) as cnt
FROM employees
GROUP BY department_id
) e ON e.department_id = d.department_id
ORDER BY d.department_name;
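To see what "aggregate first, then join" buys you, here is a rough sketch run against SQLite through Python's sqlite3 module rather than Oracle. The three departments and their rows are invented for illustration (the 'Payroll' department deliberately has no employees); the query itself is the one above.

```python
import sqlite3

# Hypothetical toy data: 'Payroll' has no employees on purpose.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE locations (location_id INTEGER PRIMARY KEY, city TEXT);
    CREATE TABLE departments (department_id INTEGER PRIMARY KEY, department_name TEXT, location_id INTEGER);
    CREATE TABLE employees (employee_id INTEGER PRIMARY KEY, department_id INTEGER);
    INSERT INTO locations VALUES (1700, 'Seattle'), (1800, 'Toronto');
    INSERT INTO departments VALUES (10, 'Administration', 1700), (20, 'Marketing', 1800), (30, 'Payroll', 1700);
    INSERT INTO employees VALUES (1, 10), (2, 20), (3, 20);
""")

# Count employees per department first, then outer-join the counts
# to the departments so that empty departments are kept with a count of 0.
rows = conn.execute("""
    SELECT d.department_name, l.city, COALESCE(e.cnt, 0) AS employee_count
    FROM departments d
    JOIN locations l USING (location_id)
    LEFT JOIN (
        SELECT department_id, COUNT(*) AS cnt
        FROM employees
        GROUP BY department_id
    ) e ON e.department_id = d.department_id
    ORDER BY d.department_name
""").fetchall()

for row in rows:
    print(row)
```

With this data the empty 'Payroll' department still shows up, with an employee_count of 0. The two GROUP BY variants above use inner joins to employees, so they would leave that department out.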
The problem is that you have both aggregated columns and a non-aggregated column (in your case, city) in the select list, and city is not in the GROUP BY clause.
I don't know the structure of the locations table, but assuming a department has only one location defined, you can use MAX(city):
SELECT department_name, max(city) city, COUNT(employees.department_id) no_of_employees
FROM departments
JOIN employees on (departments.department_id=employees.department_id)
JOIN locations USING (location_id)
GROUP BY department_name;
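As Thorsten has excellently explained, you could also compute the count with an analytic function using OVER and PARTITION BY, which eliminates the GROUP BY clause. Because an analytic function does not collapse rows (you get one row per employee rather than one per department), DISTINCT is needed to remove the duplicates:
SELECT DISTINCT d.department_name, l.city, COUNT(e.department_id) OVER (PARTITION BY e.department_id) as emp_count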
FROM departments d
JOIN employees e ON e.department_id = d.department_id
JOIN locations l USING (location_id)
ORDER BY d.department_name;
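One thing to watch with the analytic version: a window function keeps every input row, so the result has one row per employee unless duplicates are removed. The sketch below shows this on invented toy data, using Python's sqlite3 module as a stand-in for Oracle (it assumes SQLite 3.25 or later, which is when window functions were added). The locations join is left out to keep it short.

```python
import sqlite3

# Hypothetical toy data: one employee in department 10, two in department 20.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE departments (department_id INTEGER PRIMARY KEY, department_name TEXT);
    CREATE TABLE employees (employee_id INTEGER PRIMARY KEY, department_id INTEGER);
    INSERT INTO departments VALUES (10, 'Administration'), (20, 'Marketing');
    INSERT INTO employees VALUES (1, 10), (2, 20), (3, 20);
""")

# Same query text, with and without DISTINCT.
query = """
    SELECT {distinct} d.department_name,
           COUNT(e.department_id) OVER (PARTITION BY e.department_id) AS emp_count
    FROM departments d
    JOIN employees e ON e.department_id = d.department_id
    ORDER BY d.department_name
"""

per_employee = conn.execute(query.format(distinct="")).fetchall()
per_department = conn.execute(query.format(distinct="DISTINCT")).fetchall()

print(per_employee)    # [('Administration', 1), ('Marketing', 2), ('Marketing', 2)]
print(per_department)  # [('Administration', 1), ('Marketing', 2)]
```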

Invalid Number Error when Using Non-Numerical Values

I have 2 tables, Departments and Employees. I want to display the department_id, department_name, and the number of employees in any department that has fewer than 4 employees.
Here's the code that I'm using (I use SQL developer, btw):
select d.department_id, d.department_name, count(e.last_name)
from departments d, employees e
where e.last_name < 4
group by d.department_id, d.department_name;
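However, I'm getting an invalid number error. What is the correct way to do this?
The condition e.last_name < 4 compares a name with a number, so Oracle tries to convert each last name to a number and fails with "invalid number". The query also has no join condition between the two tables. To filter on the number of employees, put the condition on the aggregate in a HAVING clause. Something like this would make more sense: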
SELECT d.department_id,
d.department_name,
COUNT(*) AS numEmployees
FROM departments d
INNER JOIN employees e
ON d.department_id = e.department_id
GROUP BY d.department_id,
d.department_name
HAVING COUNT(*) < 4
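Here is a small sketch of the HAVING approach on made-up data, run through Python's sqlite3 module instead of Oracle. The rows are hypothetical: department 10 gets 2 employees and department 20 gets 4, so only department 10 should come back.

```python
import sqlite3

# Hypothetical toy data: department 10 has 2 employees, department 20 has 4.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE departments (department_id INTEGER PRIMARY KEY, department_name TEXT);
    CREATE TABLE employees (employee_id INTEGER PRIMARY KEY, department_id INTEGER);
    INSERT INTO departments VALUES (10, 'Administration'), (20, 'Marketing');
    INSERT INTO employees VALUES (1, 10), (2, 10), (3, 20), (4, 20), (5, 20), (6, 20);
""")

# WHERE filters rows before grouping; HAVING filters the groups afterwards,
# so a condition on COUNT(*) has to go in HAVING.
small_departments = conn.execute("""
    SELECT d.department_id, d.department_name, COUNT(*) AS numEmployees
    FROM departments d
    INNER JOIN employees e ON d.department_id = e.department_id
    GROUP BY d.department_id, d.department_name
    HAVING COUNT(*) < 4
""").fetchall()

print(small_departments)  # [(10, 'Administration', 2)]
```

Keep in mind that the inner join drops departments with no employees at all, so they will not be listed even though zero is fewer than 4.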

Two aggregation functions group by

I'm trying to print the department names that have the sum of all salaries bigger than the average sum on departments.
SELECT d.department_name, SUM(e.salary)
FROM departments d, employees e
WHERE d.department_id = e.department_id
GROUP BY d.department_name
HAVING SUM(e.salary) > (SELECT AVG(SUM(salary)) from employees);
In the subquery, what do I have to group by to compute AVG(SUM(salary))?
You need to reuse the result of the first query in the condition. This can be done with a WITH clause:
WITH dept_sums AS (SELECT d.department_name, SUM(e.salary) sum_salary
FROM departments d, employees e
WHERE d.department_id = e.department_id
GROUP BY d.department_name)
SELECT * FROM dept_sums d_s_1 WHERE d_s_1.sum_salary > (SELECT AVG(sum_salary) FROM dept_sums d_s_2);
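As a quick check of the WITH approach, here is a sketch on invented data using Python's sqlite3 module as a stand-in for Oracle. The salaries are hypothetical and chosen so the department totals are 10000, 20000 and 60000, which gives an average total of 30000.

```python
import sqlite3

# Hypothetical toy data: department totals are 10000, 20000 and 60000.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE departments (department_id INTEGER PRIMARY KEY, department_name TEXT);
    CREATE TABLE employees (employee_id INTEGER PRIMARY KEY, department_id INTEGER, salary INTEGER);
    INSERT INTO departments VALUES (10, 'Administration'), (20, 'Marketing'), (80, 'Sales');
    INSERT INTO employees VALUES (1, 10, 10000), (2, 20, 12000), (3, 20, 8000),
                                 (4, 80, 35000), (5, 80, 25000);
""")

# The CTE computes each department's total once; the outer query then
# compares every total with the average of all totals.
above_average = conn.execute("""
    WITH dept_sums AS (
        SELECT d.department_name, SUM(e.salary) AS sum_salary
        FROM departments d
        JOIN employees e ON d.department_id = e.department_id
        GROUP BY d.department_name
    )
    SELECT department_name, sum_salary
    FROM dept_sums
    WHERE sum_salary > (SELECT AVG(sum_salary) FROM dept_sums)
""").fetchall()

print(above_average)  # [('Sales', 60000)]
```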
This is where window (analytic) functions come in handy. Below I am using AVG() as an analytic function to calculate the average total salary across all departments.
SELECT department_name, dept_salary FROM (
SELECT d.department_name, SUM(e.salary) AS dept_salary
, AVG(SUM(e.salary)) OVER ( ) AS avg_dept_salary
FROM departments d INNER JOIN employees e
ON d.department_id = e.department_id
GROUP BY d.department_name
) WHERE dept_salary > avg_dept_salary;

Order by subquery

I have the following Oracle SQL code, but I can't understand the purpose of ordering by a subquery. Can anyone explain it clearly to me?
SELECT employee_id, last_name
FROM employees e
ORDER BY (
SELECT department_name
FROM departments d
WHERE e.department_id = d.department_id
);
The ordering is done by a value from another table. In this case the query returns only columns from the employees table, but the rows are ordered by department_name, which is stored in the departments table.
You could achieve the same result by using a join, selecting only values from the employees table, and ordering by department_name from the departments table:
SELECT e.employee_id, e.last_name
FROM employees e INNER JOIN departments d
ON e.department_id = d.department_id
ORDER BY d.department_name
This query is valid if every employee must have a department. If there can be employees without a department, then you should use a LEFT JOIN instead, because the inner join would drop them from the result.
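To illustrate the difference between the two joins, here is a minimal sketch on invented data, run with Python's sqlite3 module rather than Oracle. The employee 'Grant' is given no department on purpose. Where the NULL department name sorts differs between databases (Oracle puts NULLs last in ascending order by default, SQLite puts them first), so the comparison below only looks at which employees are returned.

```python
import sqlite3

# Hypothetical toy data: employee 3 ('Grant') has no department.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE departments (department_id INTEGER PRIMARY KEY, department_name TEXT);
    CREATE TABLE employees (employee_id INTEGER PRIMARY KEY, last_name TEXT, department_id INTEGER);
    INSERT INTO departments VALUES (10, 'Administration'), (20, 'Marketing');
    INSERT INTO employees VALUES (1, 'King', 20), (2, 'Kochhar', 10), (3, 'Grant', NULL);
""")

# Same query text with the join type swapped in.
query = """
    SELECT e.employee_id, e.last_name
    FROM employees e {join_type} JOIN departments d
      ON e.department_id = d.department_id
    ORDER BY d.department_name, e.employee_id
"""

inner_rows = conn.execute(query.format(join_type="INNER")).fetchall()
left_rows = conn.execute(query.format(join_type="LEFT")).fetchall()

print(inner_rows)                       # [(2, 'Kochhar'), (1, 'King')]
print(sorted(r[0] for r in left_rows))  # [1, 2, 3]
```

The inner join silently loses 'Grant', while the left join keeps all three employees, which matches what the original ORDER BY subquery does.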
The clear intention of that query is that employee_id and last_name from employees should be ordered by department_name from departments.
If you don't want a subquery, go for a join instead:
select e.employee_id,e.last_name from employees e join departments d on
e.department_id = d.department_id order by d.department_name;

I have an issue in executing the below database query

I have an issue in executing the below database query.
I am using Oracle 11g Enterprise Edition
Query 1:
SELECT d.department_id, max(salary), min(salary), avg(salary), count(*) no_of_employees
FROM departments d, employees e
WHERE e.department_id = d.department_id
GROUP BY d.department_id
Result: successful output
Query 2:
SELECT d.department_id, d.department_name, max(salary), min(salary), avg(salary), count(*) no_of_employees
FROM departments d, employees e
WHERE e.department_id = d.department_id
GROUP BY d.department_id
Result:
ORA-00979: not a GROUP BY expression
Can anybody help me out with this issue?
Please let me know what is wrong with this expression.
You must also group by department_name.
You need to GROUP BY every column that is not wrapped in an aggregate function (MAX, COUNT, etc.). Therefore:
select d.department_id
, d.department_name
, max(salary), min(salary), avg(salary) , count(*) no_of_employees
from departments d, employees e
where e.department_id = d.department_id
group by d.department_id, d.department_name;
But you should also consider doing an ANSI join instead:
select department_id
, department_name
, max(salary), min(salary), avg(salary) , count(*) no_of_employees
from departments
join employees USING (department_id)
group by department_id, department_name;