avg operation on repeated values - sql

I have two tables, employee and certified.
The certified table contains the list of employees certified to drive a plane. One employee may be certified for many planes and vice versa. Not all employees are certified.
Each employee draws a salary. Only one salary, no matter how many certifications.
How do I find the average salary of those employees who are certified for at least one plane?
My problem is,
SELECT AVG(SALARY) FROM EMPLOYEE E, CERTIFIED C WHERE E.EID=C.EID;
This includes a salary twice if the employee is certified for two planes. So AVG(salary) gives a wrong value.
I'm a newbie, so my apologies if my question seems too basic. Help?

You want a semi-join here. You can implement it with an IN predicate:
SELECT AVG(SALARY)
FROM EMPLOYEE
WHERE EID IN (
SELECT EID
FROM CERTIFIED
)
;
or with an EXISTS predicate:
SELECT AVG(e.SALARY)
FROM EMPLOYEE AS e
WHERE EXISTS (
SELECT *
FROM CERTIFIED AS c
WHERE c.EID = e.EID
)
;
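Nulls in CERTIFIED.EID would not affect the IN predicate here, because a comparison with a null is never true and so never adds a match. It is the negated form, NOT IN, that behaves unexpectedly when the subquery returns nulls (it then returns no rows at all), so prefer NOT EXISTS if you ever need the opposite query. In any case, I would assume it would be unusual to store a certification in that table not associated with any employee.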
Alternatively you could use a proper join (and I would recommend you seriously consider switching to the proper join syntax too), only you would need to join to a set of distinct EID values derived from CERTIFIED, rather than directly to CERTIFIED. For the derived table you can use DISTINCT:
SELECT AVG(e.SALARY)
FROM EMPLOYEE AS e
INNER JOIN (
SELECT DISTINCT EID
FROM CERTIFIED
) AS c
ON e.EID = c.EID
;
or GROUP BY:
SELECT AVG(e.SALARY)
FROM EMPLOYEE AS e
INNER JOIN (
SELECT EID
FROM CERTIFIED
GROUP BY EID
) AS c
ON e.EID = c.EID
;
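To see the difference on concrete numbers, here is a minimal sketch using Python's built-in sqlite3 module. The sample rows and the AID column are assumptions made up for illustration; only EID and SALARY come from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE EMPLOYEE (EID INTEGER PRIMARY KEY, SALARY REAL);
    -- AID (aircraft id) is a hypothetical column name
    CREATE TABLE CERTIFIED (EID INTEGER, AID INTEGER);
    INSERT INTO EMPLOYEE VALUES (1, 100), (2, 200), (3, 900);
    -- employee 1 has two certifications, employee 2 has one, employee 3 has none
    INSERT INTO CERTIFIED VALUES (1, 10), (1, 11), (2, 10);
""")

# Plain join from the question: employee 1 is counted once per certification.
naive = conn.execute(
    "SELECT AVG(SALARY) FROM EMPLOYEE E, CERTIFIED C WHERE E.EID = C.EID"
).fetchone()[0]

# Semi-join with IN: each certified employee is counted once.
semi_join = conn.execute(
    "SELECT AVG(SALARY) FROM EMPLOYEE WHERE EID IN (SELECT EID FROM CERTIFIED)"
).fetchone()[0]

# Join against the distinct EID values of CERTIFIED.
distinct_join = conn.execute("""
    SELECT AVG(e.SALARY)
    FROM EMPLOYEE AS e
    INNER JOIN (SELECT DISTINCT EID FROM CERTIFIED) AS c ON e.EID = c.EID
""").fetchone()[0]

print(naive, semi_join, distinct_join)
```

With this data the plain join averages 100, 100 and 200, while the other two queries average 100 and 200.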

If an employee can have only one salary, I suggest
SELECT AVG(e.salary)
FROM employee e
WHERE e.eid IN (SELECT c.eid FROM certified c)
This makes sure that each eid is taken once. I still do not understand what the certified table is doing here. Do you need only the salaries of those employees who are certified for flights?

Related

practice sql explanation

http://studybyyourself.com/seminar/sql/exercises/8-3/?lang=en
Please provide data about all employees whose salary is higher or equal to the average salary of employees working in the same department (regardless if employees have left the company or not). Required attributes are last name, first name, salary and department name.
SELECT emp.Last_name, emp.First_name, emp.Salary, D.Name
FROM Employee AS emp
INNER JOIN Department AS D ON emp.Department_id = D.ID
WHERE emp.Salary >=
(
SELECT AVG(e.Salary)
FROM Employee AS e
GROUP BY e.Department_id
HAVING e.Department_id = emp.Department_id
)
Can anyone please help explain the solution? Specifically, what does the 'having' clause do in this case that allows the sub-query to work? I get stuck up until that point without the having clause and I expectedly get the 'subquery returns more than 1 row' error but I am not sure how the having clause is fixing the problem.

subquery and join not giving the same result

1
select *
from employees
where salary > (select max(salary) from employees where department_id=50)
2
select *
from employees e left join
employees d
on e.DEPARTMENT_ID =d.DEPARTMENT_ID
where d.salary > (select max(salary) from employees where department_id=50)
Why is the second query returning multiple records?
I want to achieve the same result as the 1st query using a join.
Thanks in advance.
Rocky, the first select is correct. Why do you want to do any join? Without further information the objective of the second select is not clear (nonsense).
I can't see the point of joining the table to itself on DEPARTMENT_ID. Anyway, the duplicates appear because you are joining the table to itself on a key that is not the primary key: basically you are multiplying each employee by all the employees of the same department. This version eliminates the duplicates but is still no improvement over the first one.
select *
from employees e left join
employees d
on e.employee_ID = d.employee_ID
where d.salary > (select max(salary) from employees where department_id=50)
You are probably looking for an anti join. This is a pattern mainly used in a young DBMS where IN and EXISTS clauses are slow compared to joins, because the developers focused on joins only.
You are looking for all employees whose salaries are greater than all salaries in department 50. In other words: WHERE NOT EXISTS a salary greater or equal in department 50.
Your query can hence be written as:
select *
from employees e
where not exists
(
select null
from employees e50
where e50.department_id = 50
and e50.salary >= e.salary
);
As an anti join (an outer join where you dismiss all matches):
select *
from employees e
left join employees e50 on e50.department_id = 50 and e50.salary >= e.salary
where e50.salary is null;
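As a quick check that the three formulations agree, here is a small sketch with Python's sqlite3 module. The sample rows are invented for illustration, and the comparison assumes department 50 is not empty and salaries are not null.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (employee_id INTEGER PRIMARY KEY,
                            department_id INTEGER, salary REAL);
    -- made-up sample data; the highest salary in department 50 is 5000
    INSERT INTO employees VALUES
        (1, 50, 3000), (2, 50, 5000),
        (3, 60, 4000), (4, 60, 6000), (5, 70, 5000);
""")

def ids(sql):
    """Run a query and return the sorted employee ids it yields."""
    return sorted(row[0] for row in conn.execute(sql))

# Original query 1: scalar subquery with MAX.
max_subquery = ids("""
    SELECT employee_id FROM employees
    WHERE salary > (SELECT MAX(salary) FROM employees WHERE department_id = 50)
""")

# NOT EXISTS: no salary in department 50 is greater than or equal to mine.
not_exists = ids("""
    SELECT e.employee_id FROM employees e
    WHERE NOT EXISTS (SELECT NULL FROM employees e50
                      WHERE e50.department_id = 50 AND e50.salary >= e.salary)
""")

# Anti join: outer join, then keep only the rows that found no match.
anti_join = ids("""
    SELECT e.employee_id FROM employees e
    LEFT JOIN employees e50
           ON e50.department_id = 50 AND e50.salary >= e.salary
    WHERE e50.salary IS NULL
""")

print(max_subquery, not_exists, anti_join)
```

Only employee 4 earns more than everyone in department 50, and all three queries return just that row.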

SQL select with multiple different conditions

I am still learning SQL and can't find a proper way to find the following information:
I have created a table "employees" with the following columns:
'department', 'age', 'salary', 'bonus';
I am trying to design a query that will give me all employees that have someone the same age as them in another department and with a bonus superior to their salary.
(to be more precise, if someone in department 'SALES' has the same age as someone in department 'RESEARCH' and have a bonus that is superior to that guy in research's salary, then I would like to display both of them)
Is this possible to do in sql?
Thank you for your time,
-Tom
You can do this using exists. Because you care about the relationship in both directions, look for someone of the same age in another department where either person's bonus exceeds the other's salary:
select e.*
from employees e
where exists (select 1
from employees e2
where e2.department <> e.department and e2.age = e.age and
(e2.bonus > e.salary or e.bonus > e2.salary)
);
To get the pairs on the same row, use a self-join:
select e1.*, e2.*
from employees e1 join
employees e2
on e1.age = e2.age and e1.department <> e2.department and
e1.bonus > e2.salary;
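The requirement in the question (same age, different department, one person's bonus greater than the other's salary) can be tried out with a small sketch using Python's sqlite3 module. The name column and all sample rows are assumptions added only so that the output is readable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- "name" is a hypothetical column, not part of the original table
    CREATE TABLE employees (name TEXT, department TEXT, age INTEGER,
                            salary INTEGER, bonus INTEGER);
    INSERT INTO employees VALUES
        ('ann', 'SALES',    30, 1000, 1500),
        ('bob', 'RESEARCH', 30, 1200,  100),
        ('cid', 'RESEARCH', 40, 2000,   50),
        ('dee', 'SALES',    40, 1800,  300),
        ('eve', 'SALES',    30,  900,  100);
""")

# Pairs on one row: e1's bonus is greater than e2's salary.
pairs = conn.execute("""
    SELECT e1.name, e2.name
    FROM employees e1
    JOIN employees e2
      ON e1.age = e2.age
     AND e1.department <> e2.department
     AND e1.bonus > e2.salary
    ORDER BY e1.name, e2.name
""").fetchall()

# Everyone who takes part in such a pair, in either role.
involved = [row[0] for row in conn.execute("""
    SELECT e.name
    FROM employees e
    WHERE EXISTS (SELECT 1
                  FROM employees e2
                  WHERE e2.department <> e.department
                    AND e2.age = e.age
                    AND (e2.bonus > e.salary OR e.bonus > e2.salary))
    ORDER BY e.name
""")]

print(pairs, involved)
```

Here only ann's bonus (1500) beats the salary of a same-aged colleague in another department (bob, 1200), so the join yields one pair and the exists query lists both of them.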

Count on a database using Count function SQL

I have the database schema like this
Flights(flno,from,to,distance,departs,arrives,price)
Aircraft(aid,aname,cruisingRange)
Certified(employee,aircraft)
Employees(eid,ename,salary)
where flno is the primary key of Flights and each route corresponds to a "flno".
So I have this question to answer for the Schema
For each pilot, list their employee ID, name, and the number of routes
he can pilot.
I have this SQL, is this correct? (I can't test, as I don't have data for the database.)
select eid, ename, count(flno)
from employees, flights
groupby flno
This is a simple question, but as everyone is mentioning, you don't have any link between employees and flights. The relationships stop at certified.
You obviously have or will create some relationship. I have written a query that will give you the count, assuming a many-to-many relationship between employees and flights, meaning an employee can have many flights and a single flight can be made by many employees.
Flights(flno,from,to,distance,departs,arrives,price)
Aircraft(aid,aname,cruisingRange)
Certified(employee,aircraft)
Employees(eid,ename,salary)
select
e.eid employee_id,
e.ename employee_name,
count(*)
from
employees e
inner join certified c on
c.employee = e.eid
inner join aircraft a on
a.aid = c.aircraft
inner join aircraft_flights af on -- new table that you would need to create
af.aircraft = a.aid
inner join flights f on
f.flno = af.flno -- note: I made up a relationship here, which needs to exist in some form or another
group by
e.eid,
e.ename
I hope this at least shows you how to write a count statement correctly, but you should probably brush up on your understanding of joins.
Hope that helps.
EDIT
Without the relationships and working in your comments you could get the count as below.
select
e.eid employee_id,
e.ename employee_name,
count(distinct f.flno) -- count each route once, even if several of the pilot's aircraft can fly it
from
employees e
inner join certified c on
c.employee = e.eid
inner join aircraft a on
a.aid = c.aircraft
inner join flights f on
f.distance <= a.cruisingRange
group by
e.eid,
e.ename
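One thing to watch with the range-based join is that a pilot certified for several aircraft matches the same flight once per aircraft. The sketch below, using Python's sqlite3 module, compares count(*) with count(distinct f.flno) on made-up data; the flights table is reduced to flno and distance for brevity.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (eid INTEGER PRIMARY KEY, ename TEXT, salary INTEGER);
    CREATE TABLE aircraft  (aid INTEGER PRIMARY KEY, aname TEXT, cruisingRange INTEGER);
    CREATE TABLE certified (employee INTEGER, aircraft INTEGER);
    CREATE TABLE flights   (flno INTEGER PRIMARY KEY, distance INTEGER);

    -- invented sample data
    INSERT INTO employees VALUES (1, 'Ann', 100), (2, 'Bob', 200);
    INSERT INTO aircraft  VALUES (1, 'Short hop', 1000), (2, 'Long haul', 5000);
    INSERT INTO certified VALUES (1, 1), (1, 2), (2, 1);
    INSERT INTO flights   VALUES (100, 500), (101, 3000), (102, 9000);
""")

rows = conn.execute("""
    SELECT e.eid, e.ename,
           COUNT(*)               AS matches,  -- one per (aircraft, flight) pair
           COUNT(DISTINCT f.flno) AS routes    -- each flight counted once
    FROM employees e
    INNER JOIN certified c ON c.employee = e.eid
    INNER JOIN aircraft  a ON a.aid = c.aircraft
    INNER JOIN flights   f ON f.distance <= a.cruisingRange
    GROUP BY e.eid, e.ename
    ORDER BY e.eid
""").fetchall()

print(rows)
```

Ann can fly flight 100 with either aircraft, so count(*) reports 3 matches for her while there are only 2 distinct routes.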

Group by clause always produces error "Not a group by expression"

I have two tables
EMPLOYEES(employee_id,first_name,last_name,salary,manager_id,department_id)
and
DEPARTMENTS(department_id,department_name,manager_id)
When I try to create a new table "EMP_DEPT" which contains
department_id ,department_name, dcount(count of employees in each department),
dtotal(total salary of employees in each department),
dmaxsal(maximum salary in a department), dminsal(minimum salary in a department)
it shows ORA-00979: not a GROUP BY expression
I did this in Oracle
create table emp_dept as(select e.department_id,d.department_name,count(*),sum(salary),max(salary),min(salary)
from employees e,departments d where e.department_id= d.department_id
group by e.department_id);
You seem to be a bit confused in writing your query.
First of all, you are qualifying the columns of table departments with an alias, but not the salary column of table employees.
select e.department_id,d.department_name,count(*),sum(salary),max(salary),min(salary)
from employees e,departments d
Secondly, you're using a where clause, but shouldn't a left join be a better choice?
where e.department_id= d.department_id
Thirdly, and most importantly, every column in the select list that is not inside an aggregate function has to appear in the GROUP BY clause. You select d.department_name but group only by e.department_id, and that is what raises ORA-00979. Since departments is the parent table (logically, department_id is supposed to be the primary key in departments and a foreign key in employees), I would also group by the columns of d rather than e.
Therefore,
group by e.department_id
should instead be
group by d.department_id, d.department_name
In addition, CREATE TABLE ... AS SELECT needs a column alias for each expression, and counting a column of e (rather than all rows) gives 0 for departments without employees. I think you just need to modify your query in the following way:
create table emp_dept as
(select d.department_id, d.department_name, count(e.employee_id) as dcount,
sum(e.salary) as dtotal, max(e.salary) as dmaxsal, min(e.salary) as dminsal
from departments d
left join employees e
on e.department_id = d.department_id
group by d.department_id, d.department_name);
And hopefully it will fix your problem.
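For a rough idea of what such a summary table ends up containing, here is a sketch with Python's sqlite3 module. Note the assumptions: SQLite is not Oracle and does not raise ORA-00979 for an ungrouped column, so this only illustrates the shape of a query that lists every non-aggregated column in GROUP BY; the sample rows are invented and the tables are trimmed to the columns used.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE departments (department_id INTEGER PRIMARY KEY,
                              department_name TEXT);
    CREATE TABLE employees (employee_id INTEGER PRIMARY KEY,
                            department_id INTEGER, salary INTEGER);
    -- invented sample data; department 30 has no employees
    INSERT INTO departments VALUES (10, 'Sales'), (20, 'IT'), (30, 'Empty');
    INSERT INTO employees VALUES (1, 10, 100), (2, 10, 300), (3, 20, 500);

    CREATE TABLE emp_dept AS
    SELECT d.department_id,
           d.department_name,
           COUNT(e.employee_id) AS dcount,   -- counts matched employees only
           SUM(e.salary)        AS dtotal,
           MAX(e.salary)        AS dmaxsal,
           MIN(e.salary)        AS dminsal
    FROM departments d
    LEFT JOIN employees e ON e.department_id = d.department_id
    GROUP BY d.department_id, d.department_name;
""")

summary = conn.execute(
    "SELECT * FROM emp_dept ORDER BY department_id"
).fetchall()

for row in summary:
    print(row)
```

Because of the left join, the department without employees still gets a row, with a count of 0 and nulls for the salary aggregates.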