project to which maximum number of employees have been allocated - sql

I have these tables with the following columns :
Employee24 (EMPLOYEEID, FIRSTNAME, LASTNAME, GENDER);
PROJECT24 (PROJECTID PROJECTNAME EMPLOYEEID);
I want to write a query to find project to which maximum number of employees are alocated.
SELECT FIRSTNAME, LASTNAME
FROM EMPLOYEE24 E
WHERE E.EMPLOYEEID IN ( SELECT L2.EMPLOYEEID
FROM PROJECT24 L2 group by l2.employeeid)\\

What do you want to do if there are ties? This is an important question and why row_number()/rank() might be a better choice:
select p.*
from (select p.projectid, p.projectname, count(*) as num_employees,
rank() over (order by count(*) desc) as seqnum
from project25 p
group by p.projectid, p.projectname
) p
where seqnum = 1;
Notes:
The above query returns all rows if there are ties. If you want only one (arbitrary) project when there is a tie, then use row_number().
I see no reason to join to employee24.
Your data structure is strange. The relationship between projects and employees should be in a separate table, say project_employees. That should have projectid, but not the name. The name should be in project24.

You might try something like this (though I'm quite sure it can be done in other ways):
SELECT *
FROM (SELECT prj.projectid,
prj.projectname,
COUNT(*) AS number_employees
FROM project24 prj
JOIN employee24 emp
ON prj.employeeid = emp.employeeid
GROUP BY prj.projectid,
prj.projectname
ORDER BY number_employees DESC)
WHERE ROWNUM = 1;

Related

How to find columns that only have one value - Postgresql

I have 2 tables, person(email, first_name, last_name, postcode, place_name) and location(postcode, place_name). I am trying to find people that live in places where only one person lives. I tried using SELECT COUNT() but failed because I couldn't figure out what to count in this situation.
SELECT DISTINCT email,
first_name,
last_name
FROM person
INNER JOIN location USING(postcode,
place_name)
WHERE 1 <=
(SELECT COUNT(?))
Aggregate functions always go with having:
SELECT DISTINCT first_value(email) over (partition by place_name),
first_value(first_name) over (partition by place_name),
first_value(last_name) over (partition by place_name),
count(*)
FROM person
INNER JOIN location USING(postcode,
place_name)
GROUP BY place_name
HAVING count(*) = 1
For more about the window functions (like first_value) check out this tutorial.
I would do this as follows. I find it plain and simple.
select p1.* from
person p1
join
(
select p.postcode, p.place_name, count(*) cnt from
person p
group by p.postcode, p.place_name
) t on p1.postcode = t.postcode and p1.place_name = t.place_name and t.cnt = 1
How does it work?
In the inner query (aliased t) we just count how many people live in each location.
Then we join the result of it (t) with the table person (aliased p1) and in the join we require t.cnt = 1. This is probably the most natural way of doing it, I think.
Thanks to the help of people here, I found this answer:
SELECT first_name,
last_name,
email
FROM person
WHERE postcode IN
(SELECT postcode
FROM person
GROUP BY postcode,
place_name
HAVING COUNT(place_name)=1
ORDER BY postcode)
AND place_name IN
(SELECT place_name
FROM person
GROUP BY postcode,
place_name
HAVING COUNT(postcode)=1
ORDER BY place_name)

Missing expression problem in SQL using Oracle

I want to get the number of employees by department and I wrote this script using Oracle but it always says that there is a missing expression
The columns used in my tables :
department :name (the name of the department) -
depnum (the id of the department"primary key"),
employee : empnum (the id of the employee) -
depnum (the id of the department in which the employee in question is working "foreign key")
Query:
select
s.name
from
department s
inner join
employee p on s.depnum = p.depnum
group by
s.name
having
count(p.empnum) = max(select count(p.empnum)
from employee p, department s
where s.depnum = p.depnum
group by s.name) ;
If you want the number of employees by department, I would expect something like this:
select s.name, count(*) as num_employees
from department s inner join
employe p
on s.depnum = p.depnum
group by s.name ;
If you want the department names with the maximum number of names, you can use a having clause:
select s.name, count(*) as num_employees
from department s inner join
employe p
on s.depnum = p.depnum
group by s.name
having count(*) = (select max(cnt)
from (select count(*) as cnt
from employee e2
group by e2.depnum
) e2
);
The problem with your query is that you are attempting to take the max() of a subquery. That syntax is not allowed -- and not necessary.
you sql statement is not correct that's why it thrown that error. I think you tried something like below
select s.name
from department s
inner join employe p on s.depnum=p.depnum
group by s.name
having count(p.empnum)=
select max(cnt) from
(
select count(p.empnum) as cnt
from employe p join department s
on s.depnum=p.depnum
group by s.name
) t;

The best way to find the Department with the maximum total salary Postgresql

Lets we have 2 standard tables Employees and Departments
CREATE TABLE departments (
id SERIAL PRIMARY KEY,
name VARCHAR
);
CREATE TABLE employees (
id SERIAL PRIMARY KEY,
department_id INTEGER,
name VARCHAR,
salary NUMERIC(13,2)
);
What is the best way to find the name of the department with the maximum employees' total salary.
I've found two solutions and they looks too complicated for such simple task.
Using rank()
SELECT name FROM (
SELECT name, rank() OVER ( ORDER BY salary DESC ) AS rank
FROM (
SELECT
departments.name,
sum(salary) AS salary
FROM employees
JOIN departments ON department_id = departments.id
GROUP BY departments.name
) AS t1
) AS t2
WHERE rank = 1;
Using subquery
WITH t1 AS (SELECT
departments.name,
sum(salary) AS salary
FROM employees
JOIN departments ON departments.id = employees.department_id
GROUP BY departments.name
)
SELECT name FROM t1
WHERE t1.salary = (SELECT max(salary) FROM t1);
At first glance using rank should be less efficient as it performs unnecessary sorting. Though EXPLAIN shows that the first option is more efficient.
Or maybe someone suggests another solution.
So, what is the best way to find the Department with the maximum total salary using postgres?
I would write the rank() as:
SELECT *
FROM (SELECT d.name, SUM(e.salary) AS salary,
RANK() OVER (ORDER BY SUM(e.salary)) as rnk
FROM employees e JOIn
departments d
ON e.department_id = d.id
GROUP BY d.name
) d
WHERE rnk = 1;
(The additional subquery should not affect performance, but it adds nothing to clarify the query either.)
Because window functions are built-in to the database, the database has methods for making them more efficient. And there is overhead for getting the MAX() as well. But, to be honest, I would expect both methods to have similar performance.
I should note that if you want only one department returned -- even when there are ties -- then the simplest method is:
SELECT d.name, SUM(e.salary) AS salary
FROM employees e JOIn
departments d
ON e.department_id = d.id
GROUP BY d.name
ORDER BY SUM(e.salary) DESC
FETCH FIRST 1 ROW ONLY

SUM(SALARY) when ID is distinct

I am having trouble trying to solve this problem, I would like to only add a salary up if the
employee's id is distinct. I thought I could do this using the decode() function but I am having trouble defining an expression suitable. I was aiming for something like
SUM(DECODE(S.ID,IS DISTINCT,S.SALARY))
But this isn't going to work!
So the full query looks like
SELECT B.ID, SUM(S.SALARY), COUNT(DISTINCT S.ID), COUNT(DISTINCT RM.MEMBER_ID)
FROM BRANCH B
INNER JOIN STAFF S ON S.BRANCH_ID = B.ID
INNER JOIN RECRUIT_MEMBER RM ON RM.BRANCH_ID = B.ID
GROUP BY B.ID;
But the problem is with SUM(S.SALARY) it's adding up salaries from duplicate ID's
I don't know about DECODE, but this should work:
SELECT
SUM(S.SALARY)
FROM <table> S
WHERE NOT EXISTS (
SELECT ID FROM <table> WHERE ID=S.ID GROUP BY ID HAVING COUNT(*)>1
)
Perhaps something like this...
SELECT E.ID, SUM(E.Salary)
FROM Employers E
WHERE E.ID IN (SELECT DISTINCT E2.ID FROM Employers E2)
GROUP BY E.ID
If not, perhaps you could post some sample data so that I can understand better
The joins are introducing duplicate rows. One way to fix this is by adding a row number to sequentially identify different ids. The real way would be to fix the joins so this doesn't happen, but here is the first way:
SELECT B.ID, SUM(CASE WHEN SEQNUM = 1 THEN S.SALARY END),
COUNT(DISTINCT S.ID), COUNT(DISTINCT RM.MEMBER_ID)
FROM (SELECT B.ID, S.ID, RM.MEMBER_ID,
ROW_NUMBER() OVER (PARTITION BY S.ID ORDER BY S.ID) as seqnum
FROM BRANCH B
INNER JOIN STAFF S ON S.BRANCH_ID = B.ID
INNER JOIN RECRUIT_MEMBER RM ON RM.BRANCH_ID = B.ID
) t
GROUP BY B.ID
You can create a virtual table with only one salary per ID like this...
SELECT
...whatever fields you've already got...
s.Salary
FROM
...whatever tables and joins you've already got...
LEFT JOIN (SELECT ID, MAX(SALARY) as "Salary" FROM SALARY_TABLE GROUP BY ID) s
ON whatevertable.ID = s.ID

Can I use more than one column in subquery?

I want to show the names of all employees from the EMPLOYEES table who are working on more than three projects from the PROJECT table.
PROJECTS.PersonID is a a foreign key referencing EMPLOYEES.ID:
SELECT NAME, ID
FROM EMPLOYEES
WHERE ID IN
(
SELECT PersonID, COUNT(*)
FROM PROJECTS
GROUP BY PersonID
HAVING COUNT(*) > 3
)
Can I have both PersonID, COUNT(*) in that subquery, or there must be only one column?
Not in an IN clause (or at least not the way you are trying to use it. Some RDBMSs allow tuples with more than one column in the IN clause but it wouldn't help your case here)
You just need to remove the COUNT(*) from the SELECT list to achieve your desired result.
SELECT NAME, ID
FROM EMPLOYEES
WHERE ID IN
(
SELECT PersonID
FROM PROJECTS
GROUP BY PersonID
HAVING COUNT(*) > 3
)
If you wanted to also return the count you could join onto a derived table or common table expression with more than one column though.
SELECT E.NAME,
E.ID,
P.Cnt
FROM EMPLOYEES E
JOIN (SELECT PersonID,
Count(*) AS Cnt
FROM PROJECTS
GROUP BY PersonID
HAVING Count(*) > 3) P
ON E.ID = P.PersonID
To answer your question, you can only have 1 column for the IN subquery. You could get your results using the query below:
SELECT e.ID
,e.Name
FROM dbo.Projects p
LEFT OUTER JOIN dbo.Employees e
ON p.PersonID = e.ID
GROUP BY e.ID
,e.Name
HAVING COUNT(*) > 3