I have a table:
CREATE TABLE employee (
ID bigint,
name varchar,
department bigint
);
I would like to find the department that has the fewest employees (that is, the fewest rows in this table).
I believe this requires a HAVING clause with a nested sub-query. Any help would be much appreciated.
I am using H2 database.
You could group by department to get the number of employees in each department, order by that count, and select the top 1:
SELECT TOP 1
department,
COUNT(*) AS NoOfEmployees
FROM employee
GROUP BY department
ORDER BY COUNT(*) ASC
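The HAVING-with-sub-query form mentioned in the question can be sketched as follows. This is only an illustration: it runs on SQLite through Python's sqlite3 module rather than on H2, and the sample rows are made up. Unlike a top-1 query, it returns every department tied for the smallest count.

```python
import sqlite3

# Hypothetical sample data; SQLite stands in for H2 here.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (id INTEGER, name TEXT, department INTEGER)")
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [
        (1, "Ann", 10), (2, "Bob", 10),
        (3, "Cy", 20),
        (4, "Di", 30), (5, "Ed", 30), (6, "Flo", 30),
        (7, "Gus", 40),
    ],
)

# Keep each department whose row count equals the smallest per-department count.
smallest = conn.execute(
    """
    SELECT department, COUNT(*) AS no_of_employees
    FROM employee
    GROUP BY department
    HAVING COUNT(*) = (
        SELECT MIN(cnt)
        FROM (SELECT COUNT(*) AS cnt FROM employee GROUP BY department) AS per_dept
    )
    ORDER BY department
    """
).fetchall()
print(smallest)  # departments 20 and 40 tie with one employee each
```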
Related
I have an employee table with duplicate instances of employees. For instance the last name Baba may show up 2 times with the same employee ID. I have to count last names from the table, but do not want to count the same one twice.
I am writing SQL in Postgres. Here is the table from which I draw my query:
CREATE TABLE Employee (
emp_no int NOT NULL,
birth_date date NOT NULL,
first_name varchar(100) NOT NULL,
last_name varchar(100) NOT NULL,
gender varchar(100) NOT NULL,
hire_date date NOT NULL,
CONSTRAINT pk_Salaries PRIMARY KEY (
emp_no
)
);
The data was given and contained duplicates. I cannot remove the duplicates but do not want to count them. Here is my query statement:
SELECT Employee.last_name, COUNT(Employee.last_name) AS "Last Name Count"
FROM Employee
GROUP BY Employee.last_name
ORDER BY "Last Name Count" DESC;
The output works well but I am sure it is counting some last names more than once.
I have tried adding a WHERE clause to get a count of last names where the emp_no is distinct, but it does not work.
You want to count last names from the table without counting the same employee twice.
Note that COUNT(DISTINCT last_name) grouped by last_name always returns 1, so count the distinct employee IDs instead:
SELECT Employee.last_name, COUNT(DISTINCT Employee.emp_no) AS "Last Name Count" FROM Employee GROUP BY Employee.last_name
The emp_no is a primary key, so it has to be unique and a where clause with distinct would have no impact. The query seems to be accurate, I'd be surprised if it's counting last names more than once.
Just use the DISTINCT keyword inside the COUNT() aggregate, applied to the employee ID:
SELECT e.last_name, COUNT(DISTINCT e.emp_no) AS "Last Name Count"
FROM Employee e
GROUP BY e.last_name
ORDER BY "Last Name Count" DESC;
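To see how the choice of column inside COUNT(DISTINCT ...) changes the result, here is a small sketch. It uses SQLite through Python's sqlite3 module instead of Postgres, and a made-up table without a primary key so that a duplicated emp_no can exist; both are assumptions for illustration only.

```python
import sqlite3

# Hypothetical data: employee 1 (Baba) appears twice, as described in the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Employee (emp_no INTEGER, last_name TEXT)")
conn.executemany(
    "INSERT INTO Employee VALUES (?, ?)",
    [(1, "Baba"), (1, "Baba"), (2, "Baba"), (3, "Smith")],
)

rows = conn.execute(
    """
    SELECT last_name,
           COUNT(last_name)          AS all_rows,        -- counts the duplicate row too
           COUNT(DISTINCT last_name) AS distinct_names,  -- always 1 within its own group
           COUNT(DISTINCT emp_no)    AS distinct_people  -- one per employee ID
    FROM Employee
    GROUP BY last_name
    ORDER BY last_name
    """
).fetchall()
print(rows)
```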
You should check whether the same person is being counted more than once, for example by counting the distinct first names within each last name,
something like this:
SELECT Employee.last_name, COUNT(distinct Employee.first_name) AS "Last Name Count"
FROM Employee
GROUP BY Employee.last_name
ORDER BY "Last Name Count" DESC;
see fiddle
https://dbfiddle.uk/?rdbms=sqlserver_2019&fiddle=f0a9568e6cb5fb5e0247d2f2c5e95114
Or, if necessary, check whether other columns are also repeated across rows, doing something like:
select distinct * from (
SELECT Employee.last_name,
COUNT(*) over (partition by first_name, birth_date, last_name, gender) AS n
FROM Employee
) V
where n > 1
see the fiddle
https://dbfiddle.uk/?rdbms=sqlserver_2019&fiddle=223143f0d603abf30d99ad87fa07781e
Thank you all for your quick responses. They were all very good and helpful!
I ran the following code to find that I was wrong and each individual had only one instance in the table and had only one unique employee ID (emp_no).
SELECT Employee.emp_no, COUNT(Employee.emp_no) AS "Employee ID Count"
FROM Employee
GROUP BY Employee.emp_no
ORDER BY "Employee ID Count" ASC;
Again, thank you all very much!
I have an employee table with employee ID (emp_id) and department (dep_id) fields. An employee can work in more than one department. I want to write a SQL query that displays the unique emp_ids of employees who work in more than one department.
Please help me write the SQL query.
Thanks.
Answered here: SQL query for finding records where count > 1
You need to use count, group by and having like this.
select emp_id, count(dep_id)
from employee_department
group by emp_id
having count(dep_id)>1
Query
SELECT COUNT(*)
FROM
(
SELECT id_employee, COUNT(*) AS CNT
FROM Department_Employee
GROUP BY id_employee
) AS T
WHERE CNT > 1
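Both answers can be tried side by side with a short sketch. It runs on SQLite via Python's sqlite3 module, and the table contents are invented for the example. COUNT(DISTINCT dep_id) is used here as a small variation so that a repeated (emp_id, dep_id) row would not count as a second department.

```python
import sqlite3

# Hypothetical assignments: employees 1 and 3 work in several departments.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee_department (emp_id INTEGER, dep_id INTEGER)")
conn.executemany(
    "INSERT INTO employee_department VALUES (?, ?)",
    [(1, 10), (1, 20), (2, 10), (3, 10), (3, 20), (3, 30)],
)

# Employees with more than one department, and how many departments each has.
multi = conn.execute(
    """
    SELECT emp_id, COUNT(DISTINCT dep_id) AS dep_count
    FROM employee_department
    GROUP BY emp_id
    HAVING COUNT(DISTINCT dep_id) > 1
    ORDER BY emp_id
    """
).fetchall()

# How many such employees there are, using the derived-table form.
how_many = conn.execute(
    """
    SELECT COUNT(*)
    FROM (
        SELECT emp_id, COUNT(*) AS cnt
        FROM employee_department
        GROUP BY emp_id
    ) AS t
    WHERE cnt > 1
    """
).fetchone()[0]
print(multi, how_many)
```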
My SQL table design is:
id,
name,
department,
status,
value
The ask is to fetch an equal number of records from each department. The maximum number of rows allowed per fetch is 100. Assume that there are 4 distinct departments (A, B, C and D) in the table, each with a few hundred records. The query should then fetch only 25 records for each department. If there are three distinct departments, the split should be 100/3.
I used the query below, but it does not calculate the number of rows for each department dynamically. Currently I use 25 as a constant value.
SELECT *
FROM (
SELECT dept.*,
ROW_NUMBER() OVER( PARTITION BY dept.department
ORDER BY dept.department, dept.id ) deptEntry
FROM TABLE_NAME dept
) dept
WHERE dept.deptEntry <=25
AND dept.status='ACTIVE'
AND rownum <=100;
Count the number of distinct departments and then divide your number of rows by that value:
SELECT *
FROM (
SELECT dept.*,
ROW_NUMBER() OVER( PARTITION BY department
ORDER BY department, id ) AS deptEntry,
COUNT( DISTINCT department ) OVER () AS num_dept
FROM TABLE_NAME dept
) dept
WHERE dept.deptEntry <= 100 / num_dept
AND dept.status='ACTIVE'
AND rownum <=100;
Updated:
SELECT *
FROM (
SELECT dept.*,
ROW_NUMBER() OVER( PARTITION BY department
ORDER BY id ) AS deptEntry
FROM TABLE_NAME dept
WHERE status = 'ACTIVE'
ORDER BY deptEntry, department
) dept
WHERE ROWNUM <=100;
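The idea of dividing the row budget by the number of departments can be sketched outside Oracle as well. The version below is an assumption-laden illustration: it runs on SQLite (3.25 or later, for window functions) through Python's sqlite3 module, the table name records and its rows are made up, and a scalar sub-query replaces COUNT(DISTINCT ...) OVER (), which SQLite does not accept. The status filter is applied before numbering so that inactive rows do not use up a department's share.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE records (id INTEGER, department TEXT, status TEXT)")

# Hypothetical data: 3 departments, 6 rows each, the first row of each inactive.
rows = []
next_id = 1
for dept in ("A", "B", "C"):
    for i in range(6):
        status = "INACTIVE" if i == 0 else "ACTIVE"
        rows.append((next_id, dept, status))
        next_id += 1
conn.executemany("INSERT INTO records VALUES (?, ?, ?)", rows)

# Per-department share = row budget / number of active departments
# (integer division in SQLite: 10 / 3 = 3).
picked = conn.execute(
    """
    SELECT id, department
    FROM (
        SELECT r.*,
               ROW_NUMBER() OVER (PARTITION BY department ORDER BY id) AS dept_entry
        FROM records r
        WHERE status = 'ACTIVE'
    ) AS numbered
    WHERE dept_entry <= :max_rows / (
        SELECT COUNT(DISTINCT department) FROM records WHERE status = 'ACTIVE'
    )
    ORDER BY department, id
    """,
    {"max_rows": 10},
).fetchall()
print(picked)
```

With a budget of 10 rows and three departments, each department contributes three rows, so nine rows come back in total.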
In my Employee table, I wanted to find the 3rd highest salary. Someone provided me with the following query to do this:
SELECT *
FROM employee C1
WHERE 3 = (SELECT Count(DISTINCT( C2.salary ))
FROM employee C2
WHERE C2.salary >= C1.salary)
This query works, but I don't know how it works. What kind of query is this?
As others have said, this type of query is called a correlated sub-query. It's a sub-query because there is a query within a query and it's correlated because the inner query references the outer query in its definition.
Consider the inner query:
SELECT Count(DISTINCT( C2.salary ))
FROM employee C2
WHERE C2.salary >= C1.salary
Conceptually, this inner query is evaluated once for every row of the outer query, that is, once for every row in employee, and the outer WHERE clause then tests its result. It produces a single value: the number of distinct salary values in employee that are greater than or equal to the salary of the outer row.
The outer query returns only the rows for which that value is exactly 3. A row qualifies when exactly three distinct salary values are greater than or equal to its own salary (its own salary being one of them), so its salary is necessarily the third-highest. Because the count uses DISTINCT, ties are handled as well: every employee earning the third-highest salary is returned.
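To make the per-row evaluation visible, the sketch below prints the value the inner query produces for each employee and then applies the "3 =" filter. It uses SQLite through Python's sqlite3 module and invented salaries (including a tie at the top); these are assumptions for illustration, not part of the original question.

```python
import sqlite3

# Hypothetical salaries; Ann and Bob tie for the highest.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (name TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO employee VALUES (?, ?)",
    [("Ann", 100), ("Bob", 100), ("Cy", 90), ("Di", 80), ("Ed", 70)],
)

# The correlated sub-query, exposed as a column: for each outer row c1,
# count the distinct salaries that are >= c1.salary.
per_row = conn.execute(
    """
    SELECT c1.name, c1.salary,
           (SELECT COUNT(DISTINCT c2.salary)
            FROM employee c2
            WHERE c2.salary >= c1.salary) AS distinct_salaries_at_or_above
    FROM employee c1
    ORDER BY c1.salary DESC, c1.name
    """
).fetchall()

# The original query: keep only the rows where that count is 3.
third = conn.execute(
    """
    SELECT c1.name, c1.salary
    FROM employee c1
    WHERE 3 = (SELECT COUNT(DISTINCT c2.salary)
               FROM employee c2
               WHERE c2.salary >= c1.salary)
    """
).fetchall()
print(per_row)
print(third)
```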
It's clever, but unnecessarily weird and probably not as optimal as something more straightforward.
Maybe a better solution would have been
SELECT TOP 1 *
FROM (
SELECT TOP 3 * FROM employee ORDER BY Salary DESC
) t
ORDER BY Salary ASC
Easier to read and more efficient than a correlated sub-query.
You can also use DENSE_RANK to rank the salaries from greatest to least and then select the rows ranked 3rd. Unlike the TOP-based answers above, this does not return the wrong salary when the top two salaries are identical. It also has a better-looking execution plan than the DISTINCT count query.
SELECT *
FROM (
SELECT *,
DENSE_RANK() OVER (ORDER BY Salary DESC) salary_rank
FROM employee e
) t
WHERE salary_rank = 3
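The difference between the two approaches when salaries tie can be checked with a small sketch. It runs on SQLite (3.25 or later, for window functions) via Python's sqlite3 module, uses LIMIT in place of TOP, and relies on made-up data in which two employees share the highest salary.

```python
import sqlite3

# Hypothetical salaries; Ann and Bob tie for the highest.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (name TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO employee VALUES (?, ?)",
    [("Ann", 100), ("Bob", 100), ("Cy", 90), ("Di", 80), ("Ed", 70)],
)

# "Top 3, then lowest of those" (LIMIT is SQLite's counterpart of TOP).
top_based = conn.execute(
    """
    SELECT name, salary
    FROM (SELECT * FROM employee ORDER BY salary DESC LIMIT 3) AS t
    ORDER BY salary ASC
    LIMIT 1
    """
).fetchall()

# DENSE_RANK gives tied salaries the same rank and leaves no gaps.
dense = conn.execute(
    """
    SELECT name, salary
    FROM (
        SELECT e.*, DENSE_RANK() OVER (ORDER BY salary DESC) AS salary_rank
        FROM employee e
    ) AS t
    WHERE salary_rank = 3
    """
).fetchall()
print(top_based)  # the third row, which holds the second-highest salary here
print(dense)      # the third-highest distinct salary
```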
Could also rewrite this with a common table expression.
WITH top_three
AS
(
SELECT TOP 3 * FROM employee ORDER BY Salary DESC
)
SELECT TOP 1 *
FROM top_three
ORDER BY Salary ASC;
Or, if you need to look for other ranks in this you can use row_number().
WITH ranked
AS
(
SELECT salary_rank = ROW_NUMBER() OVER (ORDER BY Salary DESC), *
FROM employee
)
SELECT *
FROM ranked
WHERE salary_rank = @whatever_rank_you_want;
I've retrieved a count of the duplicates and their occurrences using the below code
select empID, count(empID) AS DUPLICATEempID
from employees
group by empID
having count (empID) > 1
I now want the result to include the number of rows returned (i.e. add that count as a column of the returned table).
Thanks in advance.
In SAS, you can do this with a subquery:
select empId, DUPLICateempID, count(*) as NumDuplicates
from (select empID, count(empID) AS DUPLICATEempID
from employees
group by empID
having count (empID) > 1
) t
When you have an aggregation function without a group by, it applies the function to the whole table and re-merges the results.
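Outside SAS, where that automatic re-merge is not available, a similar result can be approximated with a window function over the grouped rows. The sketch below is an assumption-based illustration: it runs on SQLite (3.25 or later) through Python's sqlite3 module, with invented empID values, and is not the SAS behaviour itself.

```python
import sqlite3

# Hypothetical data: empID 1 appears twice and empID 3 three times.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (empID INTEGER)")
conn.executemany(
    "INSERT INTO employees VALUES (?)",
    [(1,), (1,), (2,), (3,), (3,), (3,), (4,)],
)

# COUNT(*) OVER () is evaluated after GROUP BY and HAVING, so it counts
# the duplicated empIDs and repeats that number on every returned row.
dupes = conn.execute(
    """
    SELECT empID,
           COUNT(empID)     AS duplicate_emp_id,
           COUNT(*) OVER () AS num_duplicates
    FROM employees
    GROUP BY empID
    HAVING COUNT(empID) > 1
    ORDER BY empID
    """
).fetchall()
print(dupes)
```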