How to find the AVG given some instructions SQL - sql

I'm trying to create a query that returns me the Market.name and the average of the employees who worked for a particular Market, for example the average of the employees who worked for "IKEA" is 10% compared to all of the total employees among all of the markets.

Try this
select m.name as "Market Name",
avg(w.employee_id) as "Avg.NumberofEmployeesPerMarket"
from workson w
left join employee e on ( e.employee_id = w.employee_id )
left join market m on ( m.market_id = w.market_id )
group by m.name;

You need such a SQL statement with joins
select m.name market_name,
100*round(count(w.employee_id)::decimal/
(select count(employee_id) from employee ),2)
as "Avg Percent of Emp. Per Market"
from workson w
left join market m on ( m.market_id = w.market_id )
where m.name = 'IKEA'
group by m.name;
I don't know about your DBMS, I tried if it is Postgres, remove ::decimal if MySQL
Rextester Demo 1
Rextester Demo 2

Related

List the surnames of bosses who manage at least two employees from the query below

I need to get the surnames of bosses who manage at least two employees from the query below that earn no more than twice the average earnings of ALL people they direct.
I'm stuck here:
SELECT surname from emp k INNER JOIN
(SELECT surname, base_salary
from emp p LEFT JOIN
(select id_team, avg(base_salary) as s, count(*) as c from emp group by id_team)
as o ON(p.id_team = o.id_team)
where p.base_salary between o.s*0.7 and o.s*1.3 and o.c >=2) l ON (k.id_boss = o.id_boss)
having count(k.id_boss) >2 ??? AND k.base_salary < ????
I hope you get my point. Any advices how could I do that?
Here's what the full table looks like:
Based on your query you should just add the proper having clause
having count(k.id_boss) >2 AND k.base_salary < 2*l.s
SELECT k.surname
from emp k
INNER JOIN (
SELECT surname, base_salary, id_boss
from emp p
LEFT JOIN (
select id_team, avg(base_salary) as s, count(*) as c
from emp
group by id_team
) o ON p.id_team = o.id_team
where p.base_salary between o.s*0.7 and o.s*1.3 and o.c >=2
) l ON k.id_boss = o.id_boss
group by k.surname
having count(l.id_boss) >2
AND k.base_salary < 2*l.s

Having hard time understanding SQL Query

I ran across this query and having hard time understanding what it does
SELECT DISTINCT
EMPLOYEE.EMPLOYEE_ID,
EMPLOYEE.LAST_NAME,
EMPLOYEE.FIRST_NAME,
COUNT(*)
FROM EMPLOYEE
JOIN ENTRY ON EMPLOYEE.EMPLOYEE_ID = ENTRY.EMPLOYEE_ID
JOIN TICKET ON ENTRY.TICKET_ID = TICKET.TICKET_ID
WHERE ENTRY.ACTIVITY_ID = 'ADVTS' AND EMPLOYEE.DEPARTMENT_ID ='SLS'
GROUP BY EMPLOYEE.EMPLOYEE_ID, EMPLOYEE.LAST_NAME,
EMPLOYEE.FIRST_NAME,ENTRY.ENTRY_ID
HAVING COUNT(ENTRY.ENTRY_ID) >=
(SELECT CAST(1.25 * COUNT(ENTRY.ACTIVITY_ID)/COUNT(DISTINCT EMPLOYEE.EMPLOYEE_ID) AS float)
FROM
EMPLOYEE
JOIN
ENTRY ON EMPLOYEE.EMPLOYEE_ID = ENTRY.EMPLOYEE_ID
WHERE
ENTRY.ACTIVITY_ID = 'ADVTS' AND EMPLOYEE.DEPARTMENT_ID = 'SLS')
As far as I understand it gives a list of EMPLOYEEs who done ADVTS ACTIVITY and from DEPARTMENT SLS which made ENTRYs at least as much as Average entries made in the DEPARTMENT SLS for ADVTS purposes
Thanks to anyone who take their time to help
Edit succesful result after comments:
SELECT
EMPLOYEE.EMPLOYEE_ID,
EMPLOYEE.LAST_NAME,
EMPLOYEE.FIRST_NAME
FROM EMPLOYEE
JOIN
ENTRY ON ENTRY.EMPLOYEE_ID = EMPLOYEE.EMPLOYEE_ID
GROUP
BY EMPLOYEE.EMPLOYEE_ID, EMPLOYEE.LAST_NAME,
EMPLOYEE.FIRST_NAME
HAVING
COUNT(ENTRY.ENTRY_ID) >=
(SELECT
CAST(1.25 *
COUNT(ENTRY.ACTIVITY_ID)/COUNT(DISTINCT EMPLOYEE.EMPLOYEE_ID)AS float)
FROM
EMPLOYEE JOIN ENTRY ON EMPLOYEE.EMPLOYEE_ID = ENTRY.EMPLOYEE_ID
WHERE
ENTRY.ACTIVITY_ID = 'ADVTS' AND EMPLOYEE.DEPARTMENT_ID = 'SLS')
OUTPUT:
EMPLOYEE_ID| LAST_NAME| FIRST_NAME
7 | Salesman | Efficient
Assuming that TICKET_ID is unique in Ticket and is never NULL in ENTRY, then you can get rid of that JOIN.
Then I assume the purpose of the query is to return employees whose count is more than 1.25 the overall average. That requires a few more (reasonable) assumptions, but this is more simply written as:
SELECT e.*
FROM (SELECT EM.EMPLOYEE_ID, EM.LAST_NAME, EM.FIRST_NAME, COUNT(*) AS CNT,
SUM(COUNT(*)) OVER () * 1.0 / COUNT(*) OVER () as AVG_CNT
FROM EMPLOYEE EM JOIN
ENTRY EM
ON EM.EMPLOYEE_ID = EN.EMPLOYEE_ID
WHERE EN.ACTIVITY_ID = 'ADVTS' AND EM.DEPARTMENT_ID = 'SLS'
GROUP BY EM.EMPLOYEE_ID, EM.LAST_NAME, EM.FIRST_NAME
) e
WHERE cnt >= 1.25 * avg_cnt

SQL Joins Clarification

I wish to display the hospitalid,hosp name and hosp type for the hospital which have/has the highest no of doctors associated with them.
I have two tables:
Doctor: doctorid, hospitalid
Hospital: hospitalid, hname, htype
SELECT d.hospitalid,h.hname,h.htype
FROM doctor d
INNER JOIN hospital h ON d.hospitalid = h.hospitalid
GROUP BY d.hospitalid,h.hname,h.htype
HAVING MAX(count(d.doctorid));
I tried the above code, but i get an error "group func is nested too deeply". How should i modify d code?
This is a common error when learning SQL, thinking that having Max(col) says "keep only the row with the max". It simply means having <some function on the column> without any condition. For instance, you could say having count(d.doctorid) = 1 to get hospitals with only one doctor.
The way to do this is to order the columns and then take the first row. However, the syntax for "take the first row" varies by database. The following works in many SQL dialects:
SELECT d.hospitalid,h.hname,h.htype
FROM doctor d INNER JOIN
hospital h
ON d.hospitalid = h.hospitalid
GROUP BY d.hospitalid,h.hname,h.htype
order by count(d.doctorid) desc
limit 1;
In SQL Server and Sybase, the syntax is:
SELECT top 1 d.hospitalid,h.hname,h.htype
FROM doctor d INNER JOIN
hospital h
ON d.hospitalid = h.hospitalid
GROUP BY d.hospitalid,h.hname,h.htype
order by count(d.doctorid) desc;
In Oracle:
select t.*
from (SELECT d.hospitalid,h.hname,h.htype
FROM doctor d INNER JOIN
hospital h
ON d.hospitalid = h.hospitalid
GROUP BY d.hospitalid,h.hname,h.htype
order by count(d.doctorid) desc
) t
where rownum = 1;
EDIT (based on comment):
To get all rows with the maximum, then you can do something similar to your original query. It is just more complicated. You can calculate the maximum number using a subquery and do the comparison in the having clause:
SELECT d.hospitalid, h.hname, h.htype
FROM doctor d INNER JOIN
hospital h
ON d.hospitalid = h.hospitalid join
GROUP BY d.hospitalid,h.hname,h.htype
having count(d.doctorid) = (select max(NumDoctors)
from (select hospitalid, count(*) as NumDoctors
from hospitalId
group by hospitalid
) hd
)
As a note, there are easier mechanisms in other databases.
This is how I would write it for SQL Server. THe specific details might vary depending teh database backend you are using.
SELECT TOP 1 a.hospitalid,a.hname,a.htype
FROM
(SELECT d.hospitalid,h.hname,h.htype, count(d.doctorid) as doctorcount FROM doctor d INNER JOIN hospital h ON d.hospitalid = h.hospitalid
GROUP BY d.hospitalid,h.hname,h.htype) a
ORDER BY doctorcount DESC;

SQL Server Query using GROUP BY

I am having trouble writing a query that will select all Skills, joining the Employee and Competency records, but only return one skill per employee, their newest Skill. Using this sample dataset
Skills
======
id employee_id competency_id created
1 1 1 Jan 1
2 2 2 Jan 1
3 1 2 Jan 3
Employees
===========
id first_name last_name
1 Mike Jones
2 Steve Smith
Competencies
============
id title
1 Problem Solving
2 Compassion
I would like to retrieve the following data
Skill.id Skill.employee_id Skill.competency_id Skill.created Employee.id Employee.first_name Employee.last_name Competency.id Competency.title
2 2 2 Jan 1 2 Steve Smith 2 Compassion
3 1 2 Jan 3 1 Mike Jones 2 Compassion
I was able to select the employee_id and max created using
SELECT MAX(created) as created, employee_id FROM skills GROUP BY employee_id
But when I start to add more fields in the select statement or add in a join I get the 'Column 'xyz' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause.' error.
Any help is appreciated and I don't have to use GROUP BY, it's just what I'm familiar with.
The error that you were getting is because SQL Server requires any item in the SELECT list to be included in the GROUP BY if there is an aggregate function being used.
The problem with that is you might have unique values in some columns which can throw off the result. So you will want to rewrite the query to use one of the following:
You can use a subquery to get this result. This gets the max(created) in a subquery and then you use that result to get the correct employee record:
select s.id SkillId,
s.employee_id,
s.competency_id,
s.created,
e.id employee,
e.first_name,
e.last_name,
c.id competency,
c.title
from Employees e
left join Skills s
on e.id = s.employee_id
inner join
(
SELECT MAX(created) as created, employee_id
FROM skills
GROUP BY employee_id
) s1
on s.employee_id = s1.employee_id
and s.created = s1.created
left join Competencies c
on s.competency_id = c.id
See SQL Fiddle with Demo
Or another way to do this is to use row_number():
select *
from
(
select s.id SkillId,
s.employee_id,
s.competency_id,
s.created,
e.id employee,
e.first_name,
e.last_name,
c.id competency,
c.title,
row_number() over(partition by s.employee_id
order by s.created desc) rn
from Employees e
left join Skills s
on e.id = s.employee_id
left join Competencies c
on s.competency_id = c.id
) src
where rn = 1
See SQL Fiddle with Demo
For every non-aggregated column you add to your SELECT statement you need to update your GROUP BY to include it.
This article may help you understand why.
;WITH
MAX_SKILL_created AS
(
SELECT
MAX(skills.created) as created,
skills.employee_id
FROM
skills
GROUP BY
skills.employee_id
),
MAX_SKILL_id AS
(
SELECT
MAX(skills.id) as id,
skills.employee_id
FROM
skills
INNER JOIN MAX_SKILL_created
ON MAX_SKILL_created.employee_id = skills.employee_id
AND MAX_SKILL_created.created = skills.created
GROUP BY
skills.employee_id
)
SELECT
* -- type all your columns here
FROM
employees
INNER JOIN MAX_SKILL_id
ON MAX_SKILL_id.employee_id = employees.employee_id
INNER JOIN skills
ON skills.id = MAX_SKILL_id.id
INNER JOIN competencies
ON competencies.id = skills.competency_id
If you are using SQL Server than you can use OUTER APPLY
SELECT *
FROM employees E
OUTER APPLY (
SELECT TOP 1 *
FROM skills
WHERE employee_id = E.id
ORDER BY created DESC
) S
INNER JOIN competencies C
ON C.id = S.competency_id

Sub-Query Problem

Department(DepartID,DepName)
Employees(Name,DepartID)
What i need is the Count of Employees in the Department with DepName.
If you are using SQL Server version 2005 or above, here is another possible way of getting employees count by department.
.
SELECT DPT.DepName
, EMP.EmpCount
FROM dbo.Department DPT
CROSS APPLY (
SELECT COUNT(EMP.DepartId) AS EmpCount
FROM dbo.Employees EMP
WHERE EMP.DepartId = DPT.DepartId
) EMP
ORDER BY DPT.DepName
Hope that helps.
Sample test query output:
I'd use an outer join rather than a subquery.
SELECT d.DepName, COUNT(e.Name)
FROM Department d
LEFT JOIN Employees e ON e.DepartID = d.DepartID
GROUP BY d.DepartID, d.DepName
SELECT d.DepName, COUNT(e.Name)
FROM Department d
LEFT JOIN Employees e
ON d.DepartID = e.DepartID
GROUP BY d.DepName
No need for a subquery.
SELECT dep.DepName, COUNT(emp.Name)
FROM DepName dep
LEFT OUTER JOIN Employees emp ON dep.DepartID = emp.DepartID
GROUP BY dep.DepName
SELECT COUNT(DISTINCT Name) FROM
Department AS d, Employees AS e
WHERE d.DepartID=e.DepartID AND d.DepName = '$thename'
And to avoid using a group by and save you a Sort operation in the queryplan:
SELECT
Department.DepName,
(SELECT COUNT(*)
FROM Employees
WHERE Employees.DepartID = Department.DepartID)
FROM
Department