Include Groups not having values for count - sql

I have the following query:
SELECT DepartmentID
, Count(EmployeeID) AS CountEmployee
FROM HR.EmployeeTransfer
GROUP BY DepartmentID
This is my current output:
DepartmentID CountEmployee
1 15
2 20
I want to include departments that don't have any employees, like below:
DepartmentID CountEmployee
1 15
2 20
3 NULL
4 NULL

Presumably, you have a "departments" table of some sort. For this query, you want a LEFT JOIN from this table:
SELECT d.DepartmentID, Count(et.EmployeeID) AS CountEmployee
FROM HR.Departments d LEFT JOIN
HR.EmployeeTransfer et
ON d.DepartmentID = et.DepartmentID
GROUP BY d.DepartmentID;
Note that this returns 0 instead of NULL. If you really want NULL, you can use:
SELECT d.DepartmentID, NULLIF(Count(et.EmployeeID), 0) AS CountEmployee
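To see the difference between the two variants, here is a minimal sketch using Python's built-in sqlite3 module. The table contents are hypothetical, the HR schema prefix is dropped, and SQLite stands in for whatever database you are actually using, so treat it as an illustration rather than a drop-in script.

```python
import sqlite3

# Hypothetical sample data: four departments, employees only in 1 and 2.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Departments (DepartmentID INTEGER PRIMARY KEY);
    CREATE TABLE EmployeeTransfer (EmployeeID INTEGER, DepartmentID INTEGER);
    INSERT INTO Departments VALUES (1), (2), (3), (4);
    INSERT INTO EmployeeTransfer VALUES (101, 1), (102, 1), (103, 2);
""")

# COUNT(et.EmployeeID) skips the NULLs produced by the LEFT JOIN, so empty
# departments get 0; NULLIF(..., 0) turns that 0 into NULL.
query = """
    SELECT d.DepartmentID,
           COUNT(et.EmployeeID)            AS CountEmployee,
           NULLIF(COUNT(et.EmployeeID), 0) AS CountOrNull
    FROM Departments d
    LEFT JOIN EmployeeTransfer et
           ON d.DepartmentID = et.DepartmentID
    GROUP BY d.DepartmentID
    ORDER BY d.DepartmentID
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Departments 3 and 4 come back with 0 in the plain count column and None (NULL) in the NULLIF column.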

SELECT ET.DepartmentID
,Count(E.EmployeeID) AS CountEmployee
FROM HR.EmployeeTransfer AS ET
LEFT JOIN EmployeesTable AS E ON E.DepartmentID=ET.DepartmentID
GROUP BY ET.DepartmentID

Related

SQL count occurrences of an id from another table in multiple rows

TABLE 1 employee:
employee_id, first_name, last_name
2 John Appleseed
TABLE 2 performance_review:
employee_id, reviewer_id
2 1
2 3
2 4
1 2
3 2
QUESTION: print the first_name and last_name in a single row, then how many times that id is found in the employee_id column, then how many times that same id is found in the reviewer_id column.
Example output:
Name Employee_id count Received_review count
-------------------------------------------------------------
John Appleseed 3 2
What I got so far (it doesn't work)
SELECT
CONCAT([employee_first_name], ' ' , [employee_last_name]) AS employee_full_name,
(SELECT COUNT(employee.employee_id)
FROM performance_review AS received_review
LEFT JOIN performance_review ON employee.employee_id = performance_review.employee_id) AS received_reviews
FROM
employee
Since this involves separate aggregation over two different columns you need two subqueries, one for each.
Here is an example. [Edit] Left joins should be used here, because inner joins would drop an employee who has no matching row in one of the subqueries, for example when every performance_review row for that employee has a null reviewer.
with
emp as (select employee_id,count(*) employee_count
from performance_review
group by employee_id),
rev as (select reviewer_id,count(*) reviewer_count
from performance_review
group by reviewer_id)
select
first_name,
last_name,
employee_count,
reviewer_count
from
employee
left join emp on employee.employee_id=emp.employee_id
left join rev on employee.employee_id=rev.reviewer_id;
The result:
first_name last_name employee_count reviewer_count
John Appleseed 3 2
Robert's answer is the clearest way to do it, but I thought I would show another way using a single join. The trick is to test a condition inside SUM so that only matching rows are counted. The join matches on either column:
SELECT e.first_name, e.last_name,
SUM(CASE WHEN e.employee_id = p.employee_id THEN 1 ELSE 0 END) as employee_count,
SUM(CASE WHEN e.employee_id = p.reviewer_id THEN 1 ELSE 0 END) as reviewer_count
FROM employee e
LEFT JOIN performance_review p on e.employee_id = p.reviewer_id
or e.employee_id = p.employee_id
GROUP BY e.first_name, e.last_name
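The conditional-sum query can be checked against the sample data with a short sketch using Python's built-in sqlite3 module. Jane Doe is a hypothetical extra employee with no reviews, added only to show what the LEFT JOIN does for her. Grouping by employee_id as well is my own addition here, so that two employees sharing a name would not be merged.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employee (employee_id INTEGER, first_name TEXT, last_name TEXT);
    CREATE TABLE performance_review (employee_id INTEGER, reviewer_id INTEGER);
    -- John Appleseed is from the question; Jane Doe is hypothetical.
    INSERT INTO employee VALUES (2, 'John', 'Appleseed'), (5, 'Jane', 'Doe');
    INSERT INTO performance_review VALUES (2, 1), (2, 3), (2, 4), (1, 2), (3, 2);
""")

# Each joined row is counted in whichever column(s) it matched on.
query = """
    SELECT e.first_name, e.last_name,
           SUM(CASE WHEN e.employee_id = p.employee_id THEN 1 ELSE 0 END) AS employee_count,
           SUM(CASE WHEN e.employee_id = p.reviewer_id THEN 1 ELSE 0 END) AS reviewer_count
    FROM employee e
    LEFT JOIN performance_review p
           ON e.employee_id = p.reviewer_id
           OR e.employee_id = p.employee_id
    GROUP BY e.employee_id, e.first_name, e.last_name
    ORDER BY e.employee_id
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

John comes back with 3 and 2, matching the expected output in the question. Jane gets 0 and 0, because her single NULL-extended row fails both CASE tests.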

Selecting the Id's that have the same EmailAddress column value

What I need:
I am looking for a query that returns all the Employee Ids that share the same EmailAddress value (the filter needs to be by EmailAddress).
I want to know which Ids correspond to the duplicated email addresses and retrieve that information.
Table Employee:
Id | PlNumber | EmailAddress | EmployeeBeginingDate | EmployedEndDate | Name UserId(FK) | CreatedBy | CreatedOn
SELECT a.Id,a.EmailAddress
FROM Employee a
INNER JOIN (SELECT
Employee.Id as EmployeeId,
Employee.EmailAddress as EmailAddress,
FROM Employee
GROUP BY Employee.Id,Employee.EmailAddress
HAVING count(Employee.EmailAddress) > 1
) b
ON a.Id= b.EmployeeId
ORDER BY a.Id
I am always getting an error:
the multi-part identifier could not be bound.
I know why the error is happening but I couldn't solve this.
UPDATE: After a few changes the query returns 0 rows, but I know it should return at least 3 rows because I have duplicate values.
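Try the query below. Because your subquery grouped by Id as well as EmailAddress, every group contained a single row, so HAVING count(...) > 1 never matched. Group by EmailAddress alone and join back to the aliased table a on EmailAddress: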
SELECT a.Id, a.EmailAddress
FROM Employee a
INNER JOIN (SELECT
Employee.EmailAddress as EmailAddress
FROM Employee
GROUP BY Employee.EmailAddress
HAVING count(Employee.EmailAddress) > 1
) b
ON a.EmailAddress = b.EmailAddress
ORDER BY a.Id
Assuming the ids are different on each row, I would go for exists:
SELECT e.Id, e.EmailAddress
FROM Employee e
WHERE EXISTS (SELECT 1
FROM Employee e2
WHERE e2.EmailAddress = e.EmailAddress AND
e2.Id <> e.Id
)
ORDER BY e.EmailAddress;
Or, if you want to know the number of matches, use window functions:
SELECT e.Id, e.EmailAddress, cnt
FROM (SELECT e.*, COUNT(*) OVER (PARTITION BY e.EmailAddress) as cnt
FROM Employee e
) e
WHERE cnt >= 2;
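As a quick sanity check, the sketch below runs the GROUP BY/HAVING join and the EXISTS version side by side using Python's built-in sqlite3 module. The email addresses are made up for illustration, and only the two columns that matter are created.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical data: one address is shared by three employees.
    CREATE TABLE Employee (Id INTEGER PRIMARY KEY, EmailAddress TEXT);
    INSERT INTO Employee VALUES
        (1, 'ann@example.com'),
        (2, 'bob@example.com'),
        (3, 'ann@example.com'),
        (4, 'ann@example.com'),
        (5, 'cat@example.com');
""")

# Find duplicated addresses first, then join back to get the Ids.
join_rows = conn.execute("""
    SELECT a.Id, a.EmailAddress
    FROM Employee a
    INNER JOIN (SELECT EmailAddress
                FROM Employee
                GROUP BY EmailAddress
                HAVING COUNT(*) > 1) b
            ON a.EmailAddress = b.EmailAddress
    ORDER BY a.Id
""").fetchall()

# Keep a row if another row with a different Id has the same address.
exists_rows = conn.execute("""
    SELECT e.Id, e.EmailAddress
    FROM Employee e
    WHERE EXISTS (SELECT 1
                  FROM Employee e2
                  WHERE e2.EmailAddress = e.EmailAddress
                    AND e2.Id <> e.Id)
    ORDER BY e.Id
""").fetchall()

print(join_rows)
print(exists_rows)
```

Both queries return Ids 1, 3 and 4 on this data.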

Is it possible to write right outer join where it would always be empty left part?

Here is an example of what I want.
We have 2 tables - users and departments
users:
id name d_id
1 Alice 1
2 Bob 1
departments:
id name
1 Sales
2 Support
Next, I write the usual right join:
SELECT u.id, u.name, d.name AS d_name
FROM users u
RIGHT OUTER JOIN departments d ON u.d_id = d.id
It returns:
id name d_name
-- -------- ---------
1 Alice Sales
2 Bob Sales
NULL NULL Support
Is it possible to write a query that returns the following result?
id name d_name
-- -------- ---------
1 Alice Sales
2 Bob Sales
NULL NULL Sales
NULL NULL Support
You would appear to want UNION ALL:
SELECT u.id, u.name, d.name AS d_name
FROM users u JOIN
departments d
ON u.d_id = d.id
UNION ALL
SELECT NULL, NULL, d.name
FROM departments d
ORDER BY id NULLS LAST;
There is no need for an outer join for the first subquery.
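Here is a minimal sketch of the UNION ALL approach on the sample tables, using Python's built-in sqlite3 module. Since NULLS LAST is not available in every database or version, this sketch wraps the union in a subquery and sorts on "id IS NULL" instead. That ordering trick is my substitution, not part of the answer above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER, name TEXT, d_id INTEGER);
    CREATE TABLE departments (id INTEGER, name TEXT);
    INSERT INTO users VALUES (1, 'Alice', 1), (2, 'Bob', 1);
    INSERT INTO departments VALUES (1, 'Sales'), (2, 'Support');
""")

# First branch: matched users. Second branch: one empty-user row per department.
rows = conn.execute("""
    SELECT id, name, d_name
    FROM (
        SELECT u.id, u.name, d.name AS d_name
        FROM users u
        JOIN departments d ON u.d_id = d.id
        UNION ALL
        SELECT NULL, NULL, d.name
        FROM departments d
    ) AS combined
    ORDER BY id IS NULL, id, d_name
""").fetchall()
for row in rows:
    print(row)
```

This prints the two user rows first, then one NULL row each for Sales and Support, which is the result asked for.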
Use a UNION to add departments that haven't been listed yet:
SELECT u.id, u.name, d.name AS d_name
FROM users u
RIGHT OUTER JOIN departments d ON u.d_id = d.id
UNION
SELECT NULL, NULL, d.name
FROM departments d
Select
T.id,
T.name,
CASE WHEN T.d_name IS NULL THEN d.name ELSE T.d_name END d_name
from (
SELECT
u.id,
u.name,
d.name AS d_name,
u.d_id AS did
FROM Users u
RIGHT OUTER JOIN Departments d
ON u.d_id = d.id )T
FULL OUTER JOIN Departments d
ON T.did <> d.id

SQL Server - For each distinct company count the number of employees

I am trying to create some SQL that will count the number of employees within each company and return only those companies with greater than or equal to n employees.
I have the following tables (simplified):
CompanyEmployee Table
ID Name IsCompany
1 John Joe 0
2 Company Y 1
3 Company X 1
4 Sally Jeff 0
5 James Peach 0
Employment Table
ID EmployeeID CompanyID
1 1 2
2 4 3
3 5 3
My desired result for n=2:
ID Name IsCompany
3 Company X 1
I have the following SQL:
SELECT t.* FROM CompanyEmployee AS t
WHERE t.ID IN (
SELECT DISTINCT (t.ID)
FROM CompanyEmployee AS t
INNER JOIN Employment AS t0 ON t.ID = t0.CompanyID
WHERE t.IsCompany = 1
GROUP BY t0.CompanyID
HAVING COUNT(t0.EmployeeID) >= n)
But it generates the following error:
Column 'CompanyEmployee.ID' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause.
Any help or advice would be greatly appreciated!
Mixing companies and employees this way is a bad idea. Using a straight inner join may work as long as the id values between the two different types of rows never intersect. I guess if it's an identity column then that's not supposed to happen.
with Companies as (
select ID as CompanyID, Name
from CompanyEmployee
where IsCompany = 1
), Employees as (
select ID as EmployeeID, Name
from CompanyEmployee
where IsCompany = 0
)
select c.CompanyID, c.Name, 1 as IsCompany
from
Companies as c
inner join Employment as ec on ec.CompanyID = c.CompanyID
inner join Employees as e on e.EmployeeID = ec.EmployeeID
group by
c.CompanyID, c.Name
having count(*) >= n
There is still a straightforward way to do it:
select *
from CompanyEmployee
where ID in (
select CompanyID
from Employment
group by CompanyID
having count(*) >= n
)
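To try the IN-subquery version on the sample data, here is a small sketch using Python's built-in sqlite3 module, with n passed as a bound parameter. SQLite stands in for SQL Server here, so take it as an illustration of the query shape only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE CompanyEmployee (ID INTEGER PRIMARY KEY, Name TEXT, IsCompany INTEGER);
    CREATE TABLE Employment (ID INTEGER PRIMARY KEY, EmployeeID INTEGER, CompanyID INTEGER);
    INSERT INTO CompanyEmployee VALUES
        (1, 'John Joe', 0), (2, 'Company Y', 1), (3, 'Company X', 1),
        (4, 'Sally Jeff', 0), (5, 'James Peach', 0);
    INSERT INTO Employment VALUES (1, 1, 2), (2, 4, 3), (3, 5, 3);
""")

n = 2  # minimum number of employees a company must have

# The subquery counts Employment rows per company; the outer query fetches
# the matching company rows.
rows = conn.execute("""
    SELECT *
    FROM CompanyEmployee
    WHERE ID IN (SELECT CompanyID
                 FROM Employment
                 GROUP BY CompanyID
                 HAVING COUNT(*) >= ?)
""", (n,)).fetchall()
print(rows)
```

With n = 2 only Company X is returned, as in the desired result.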
Try this:
SELECT t1.CompanyId, t2.Name, COUNT(t1.CompanyId)
FROM Employment AS t1 INNER JOIN CompanyEmployee AS t2 ON t1.CompanyId = t2.Id
GROUP BY t1.CompanyId, t2.Name
HAVING COUNT(t1.CompanyId)>=n
where n is the minimum number of employees.

SQL Server Query using GROUP BY

I am having trouble writing a query that will select all Skills, joining the Employee and Competency records, but only return one skill per employee, their newest Skill. Using this sample dataset
Skills
======
id employee_id competency_id created
1 1 1 Jan 1
2 2 2 Jan 1
3 1 2 Jan 3
Employees
===========
id first_name last_name
1 Mike Jones
2 Steve Smith
Competencies
============
id title
1 Problem Solving
2 Compassion
I would like to retrieve the following data
Skill.id Skill.employee_id Skill.competency_id Skill.created Employee.id Employee.first_name Employee.last_name Competency.id Competency.title
2 2 2 Jan 1 2 Steve Smith 2 Compassion
3 1 2 Jan 3 1 Mike Jones 2 Compassion
I was able to select the employee_id and max created using
SELECT MAX(created) as created, employee_id FROM skills GROUP BY employee_id
But when I start to add more fields in the select statement or add in a join I get the 'Column 'xyz' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause.' error.
Any help is appreciated and I don't have to use GROUP BY, it's just what I'm familiar with.
You get that error because, when a query uses an aggregate function, SQL Server requires every non-aggregated item in the SELECT list to appear in the GROUP BY.
The problem with simply adding those columns to the GROUP BY is that columns with a different value on each row (such as the skill id) split each employee into several groups, which throws off the result. So you will want to rewrite the query in one of the following ways:
You can use a subquery to get this result. This gets the max(created) in a subquery and then you use that result to get the correct employee record:
select s.id SkillId,
s.employee_id,
s.competency_id,
s.created,
e.id employee,
e.first_name,
e.last_name,
c.id competency,
c.title
from Employees e
left join Skills s
on e.id = s.employee_id
inner join
(
SELECT MAX(created) as created, employee_id
FROM skills
GROUP BY employee_id
) s1
on s.employee_id = s1.employee_id
and s.created = s1.created
left join Competencies c
on s.competency_id = c.id
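The core of the subquery approach can be sketched against the sample data with Python's built-in sqlite3 module. The year in the dates is an assumption (the question only gives "Jan 1" and "Jan 3"), and the sketch drives the query from Skills with inner joins to keep it short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Employees (id INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT);
    CREATE TABLE Competencies (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE Skills (id INTEGER PRIMARY KEY, employee_id INTEGER,
                         competency_id INTEGER, created TEXT);
    INSERT INTO Employees VALUES (1, 'Mike', 'Jones'), (2, 'Steve', 'Smith');
    INSERT INTO Competencies VALUES (1, 'Problem Solving'), (2, 'Compassion');
    -- The year is assumed; ISO dates make MAX() compare correctly as text.
    INSERT INTO Skills VALUES
        (1, 1, 1, '2013-01-01'),
        (2, 2, 2, '2013-01-01'),
        (3, 1, 2, '2013-01-03');
""")

# "latest" holds one row per employee with their newest created date;
# joining it back to Skills keeps only the newest skill row.
rows = conn.execute("""
    SELECT s.id, s.employee_id, s.created, e.first_name, e.last_name, c.title
    FROM Skills s
    INNER JOIN (SELECT employee_id, MAX(created) AS created
                FROM Skills
                GROUP BY employee_id) latest
            ON s.employee_id = latest.employee_id
           AND s.created = latest.created
    INNER JOIN Employees e ON e.id = s.employee_id
    INNER JOIN Competencies c ON c.id = s.competency_id
    ORDER BY s.id
""").fetchall()
for row in rows:
    print(row)
```

Skills 2 and 3 are returned, one per employee. Note that an employee with two skills sharing the same newest created value would get two rows with this technique.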
Or another way to do this is to use row_number():
select *
from
(
select s.id SkillId,
s.employee_id,
s.competency_id,
s.created,
e.id employee,
e.first_name,
e.last_name,
c.id competency,
c.title,
row_number() over(partition by s.employee_id
order by s.created desc) rn
from Employees e
left join Skills s
on e.id = s.employee_id
left join Competencies c
on s.competency_id = c.id
) src
where rn = 1
For every non-aggregated column you add to your SELECT statement you need to update your GROUP BY to include it.
This article may help you understand why.
;WITH
MAX_SKILL_created AS
(
SELECT
MAX(skills.created) as created,
skills.employee_id
FROM
skills
GROUP BY
skills.employee_id
),
MAX_SKILL_id AS
(
SELECT
MAX(skills.id) as id,
skills.employee_id
FROM
skills
INNER JOIN MAX_SKILL_created
ON MAX_SKILL_created.employee_id = skills.employee_id
AND MAX_SKILL_created.created = skills.created
GROUP BY
skills.employee_id
)
SELECT
* -- type all your columns here
FROM
employees
INNER JOIN MAX_SKILL_id
ON MAX_SKILL_id.employee_id = employees.id
INNER JOIN skills
ON skills.id = MAX_SKILL_id.id
INNER JOIN competencies
ON competencies.id = skills.competency_id
If you are using SQL Server, then you can use OUTER APPLY:
SELECT *
FROM employees E
OUTER APPLY (
SELECT TOP 1 *
FROM skills
WHERE employee_id = E.id
ORDER BY created DESC
) S
INNER JOIN competencies C
ON C.id = S.competency_id