Can I use more than one column in subquery? - sql

I want to show the names of all employees from the EMPLOYEES table who are working on more than three projects from the PROJECTS table.
PROJECTS.PersonID is a foreign key referencing EMPLOYEES.ID:
SELECT NAME, ID
FROM EMPLOYEES
WHERE ID IN
(
SELECT PersonID, COUNT(*)
FROM PROJECTS
GROUP BY PersonID
HAVING COUNT(*) > 3
)
Can I have both PersonID and COUNT(*) in that subquery, or must there be only one column?

Not in an IN clause, or at least not the way you are trying to use it. Some RDBMSs allow tuples with more than one column in the IN clause, but that wouldn't help in your case.
You just need to remove COUNT(*) from the SELECT list to get the result you want:
SELECT NAME, ID
FROM EMPLOYEES
WHERE ID IN
(
SELECT PersonID
FROM PROJECTS
GROUP BY PersonID
HAVING COUNT(*) > 3
)
If you also wanted to return the count, you could join onto a derived table or common table expression with more than one column:
SELECT E.NAME,
E.ID,
P.Cnt
FROM EMPLOYEES E
JOIN (SELECT PersonID,
Count(*) AS Cnt
FROM PROJECTS
GROUP BY PersonID
HAVING Count(*) > 3) P
ON E.ID = P.PersonID
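
To check both queries above against a small data set, here is a minimal sketch using Python's built-in sqlite3 module. The sample rows and the ProjectID column are assumptions made up for illustration; only the EMPLOYEES and PROJECTS column names come from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE EMPLOYEES (ID INTEGER PRIMARY KEY, NAME TEXT);
    -- ProjectID is a hypothetical surrogate key, not part of the question
    CREATE TABLE PROJECTS (ProjectID INTEGER PRIMARY KEY,
                           PersonID INTEGER REFERENCES EMPLOYEES(ID));
    INSERT INTO EMPLOYEES VALUES (1, 'Ann'), (2, 'Bob');
    -- Ann has four projects, Bob has two
    INSERT INTO PROJECTS (PersonID) VALUES (1), (1), (1), (1), (2), (2);
""")

# IN with a single-column subquery: the count is filtered in HAVING, not selected
in_rows = conn.execute("""
    SELECT NAME, ID
    FROM EMPLOYEES
    WHERE ID IN (SELECT PersonID
                 FROM PROJECTS
                 GROUP BY PersonID
                 HAVING COUNT(*) > 3)
""").fetchall()

# Join to a derived table when the count itself is needed in the output
join_rows = conn.execute("""
    SELECT E.NAME, E.ID, P.Cnt
    FROM EMPLOYEES E
    JOIN (SELECT PersonID, COUNT(*) AS Cnt
          FROM PROJECTS
          GROUP BY PersonID
          HAVING COUNT(*) > 3) P
      ON E.ID = P.PersonID
""").fetchall()

print(in_rows)    # [('Ann', 1)]
print(join_rows)  # [('Ann', 1, 4)]
```

Both queries pick the same employees; the join version only adds the count column.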

To answer your question, you can only have one column in the IN subquery. You could get your results using the query below:
SELECT e.ID
,e.Name
FROM dbo.Projects p
LEFT OUTER JOIN dbo.Employees e
ON p.PersonID = e.ID
GROUP BY e.ID
,e.Name
HAVING COUNT(*) > 3

Related

Selecting the Id's that have the same EmailAddress column value

What I need:
I am looking for a solution that gives me all the Employee Ids that share the same EmailAddress value (the filter needs to be by EmailAddress).
I want to find the Ids corresponding to the duplicated email addresses and retrieve that information.
Table Employee:
Id | PlNumber | EmailAddress | EmployeeBeginingDate | EmployedEndDate | Name | UserId(FK) | CreatedBy | CreatedOn
SELECT a.Id,a.EmailAddress
FROM Employee a
INNER JOIN (SELECT
Employee.Id as EmployeeId,
Employee.EmailAddress as EmailAddress,
FROM Employee
GROUP BY Employee.Id,Employee.EmailAddress
HAVING count(Employee.EmailAddress) > 1
) b
ON a.Id= b.EmployeeId
ORDER BY a.Id
I am always getting an error:
the multi-part identifier could not be bound.
I know why the error is happening but I couldn't solve this.
UPDATE: After a few changes the query returns 0 rows, but I know it should return at least 3 rows because I have duplicate values.
Try the query below. Group the subquery by EmailAddress only: if Id is unique per row, grouping by Id as well leaves one row in every group, so the HAVING filter never matches. Then join back to Employee on EmailAddress:
SELECT a.Id, a.EmailAddress
FROM Employee a
INNER JOIN (SELECT
Employee.EmailAddress as EmailAddress
FROM Employee
GROUP BY Employee.EmailAddress
HAVING count(Employee.EmailAddress) > 1
) b
ON a.EmailAddress = b.EmailAddress
ORDER BY a.Id
Assuming the ids are different on each row, I would go for exists:
SELECT e.Id, e.EmailAddress
FROM Employee e
WHERE EXISTS (SELECT 1
FROM Employee e2
WHERE e2.EmailAddress = e.EmailAddress AND
e2.Id <> e.Id
)
ORDER BY e.EmailAddress;
Or, if you want to know the number of matches, use window functions:
SELECT e.Id, e.EmailAddress, cnt
FROM (SELECT e.*, COUNT(*) OVER (PARTITION BY e.EmailAddress) as cnt
FROM Employee e
) e
WHERE cnt >= 2;
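
The zero-row result mentioned in the update can be reproduced with a small sqlite3 sketch. The sample rows are made up for illustration, and the table is reduced to the two columns that matter here; this assumes Id is unique per row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Employee (Id INTEGER PRIMARY KEY, EmailAddress TEXT);
    INSERT INTO Employee VALUES
        (1, 'a@x.com'), (2, 'a@x.com'), (3, 'b@x.com'), (4, 'a@x.com');
""")

# Grouping by Id AND EmailAddress: every group has exactly one row,
# so HAVING COUNT(...) > 1 filters everything out.
by_id_and_email = conn.execute("""
    SELECT a.Id, a.EmailAddress
    FROM Employee a
    INNER JOIN (SELECT Id AS EmployeeId, EmailAddress
                FROM Employee
                GROUP BY Id, EmailAddress
                HAVING COUNT(EmailAddress) > 1) b
      ON a.Id = b.EmployeeId
    ORDER BY a.Id
""").fetchall()

# Grouping by EmailAddress only, then joining back on the address
by_email = conn.execute("""
    SELECT a.Id, a.EmailAddress
    FROM Employee a
    INNER JOIN (SELECT EmailAddress
                FROM Employee
                GROUP BY EmailAddress
                HAVING COUNT(EmailAddress) > 1) b
      ON a.EmailAddress = b.EmailAddress
    ORDER BY a.Id
""").fetchall()

print(by_id_and_email)  # []
print(by_email)         # [(1, 'a@x.com'), (2, 'a@x.com'), (4, 'a@x.com')]
```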

Missing expression problem in SQL using Oracle

I want to get the number of employees by department. I wrote this query in Oracle, but it always reports a missing expression.
The columns used in my tables:
department: name (the name of the department), depnum (the id of the department, primary key)
employee: empnum (the id of the employee), depnum (the id of the department in which the employee works, foreign key)
Query:
select
s.name
from
department s
inner join
employee p on s.depnum = p.depnum
group by
s.name
having
count(p.empnum) = max(select count(p.empnum)
from employee p, department s
where s.depnum = p.depnum
group by s.name) ;
If you want the number of employees by department, I would expect something like this:
select s.name, count(*) as num_employees
from department s inner join
employee p
on s.depnum = p.depnum
group by s.name ;
If you want the department names with the maximum number of employees, you can use a having clause:
select s.name, count(*) as num_employees
from department s inner join
employee p
on s.depnum = p.depnum
group by s.name
having count(*) = (select max(cnt)
from (select count(*) as cnt
from employee e2
group by e2.depnum
) e2
);
The problem with your query is that you are attempting to take the max() of a subquery. That syntax is not allowed -- and not necessary.
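
As a rough check of the HAVING approach, here is a sketch that runs it on SQLite through Python's sqlite3 module rather than on Oracle. The sample departments and employees are invented, and the ORDER BY is added only to make the output order predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE department (depnum INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE employee (empnum INTEGER PRIMARY KEY,
                           depnum INTEGER REFERENCES department(depnum));
    INSERT INTO department VALUES (1, 'Sales'), (2, 'IT'), (3, 'HR');
    -- Sales and IT both have three employees, HR has one
    INSERT INTO employee VALUES
        (1, 1), (2, 1), (3, 1),
        (4, 2), (5, 2), (6, 2),
        (7, 3);
""")

# MAX is applied to the rows of a derived table of per-department counts,
# and the resulting scalar subquery is compared in HAVING.
largest = conn.execute("""
    SELECT s.name, COUNT(*) AS num_employees
    FROM department s
    INNER JOIN employee p ON s.depnum = p.depnum
    GROUP BY s.name
    HAVING COUNT(*) = (SELECT MAX(cnt)
                       FROM (SELECT COUNT(*) AS cnt
                             FROM employee e2
                             GROUP BY e2.depnum) counts)
    ORDER BY s.name
""").fetchall()

print(largest)  # [('IT', 3), ('Sales', 3)]
```

Note that tied departments are all returned, since every department whose count equals the maximum passes the HAVING filter.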
Your SQL statement is not valid, which is why it throws that error. I think you were trying to write something like this:
select s.name
from department s
inner join employee p on s.depnum=p.depnum
group by s.name
having count(p.empnum)=
(
select max(cnt) from
(
select count(p.empnum) as cnt
from employee p join department s
on s.depnum=p.depnum
group by s.name
) t
);

project to which maximum number of employees have been allocated

I have these tables with the following columns :
Employee24 (EMPLOYEEID, FIRSTNAME, LASTNAME, GENDER);
PROJECT24 (PROJECTID, PROJECTNAME, EMPLOYEEID);
I want to write a query to find the project to which the maximum number of employees are allocated.
SELECT FIRSTNAME, LASTNAME
FROM EMPLOYEE24 E
WHERE E.EMPLOYEEID IN ( SELECT L2.EMPLOYEEID
FROM PROJECT24 L2 group by l2.employeeid)
What do you want to do if there are ties? This is an important question and why row_number()/rank() might be a better choice:
select p.*
from (select p.projectid, p.projectname, count(*) as num_employees,
rank() over (order by count(*) desc) as seqnum
from project24 p
group by p.projectid, p.projectname
) p
where seqnum = 1;
Notes:
The above query returns all rows if there are ties. If you want only one (arbitrary) project when there is a tie, then use row_number().
I see no reason to join to employee24.
Your data structure is strange. The relationship between projects and employees should be in a separate table, say project_employees. That should have projectid, but not the name. The name should be in project24.
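
To see the difference between rank() and row_number() on a tie, here is a small sketch run on SQLite through Python's sqlite3 module. It assumes SQLite 3.25 or newer for window function support; the sample rows are invented, and the counts are computed in a derived table first, which is a slight restructuring of the query above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE project24 (projectid INTEGER, projectname TEXT, employeeid INTEGER);
    -- Alpha and Beta are tied with two employees each
    INSERT INTO project24 VALUES
        (1, 'Alpha', 10), (1, 'Alpha', 11),
        (2, 'Beta', 12), (2, 'Beta', 13),
        (3, 'Gamma', 14);
""")

# Per-project counts, then both ranking functions over those counts.
# projectid is used as a tie-breaker so row_number() is deterministic here.
ranked = """
    SELECT projectid, projectname, num_employees,
           RANK() OVER (ORDER BY num_employees DESC) AS rnk,
           ROW_NUMBER() OVER (ORDER BY num_employees DESC, projectid) AS rn
    FROM (SELECT projectid, projectname, COUNT(*) AS num_employees
          FROM project24
          GROUP BY projectid, projectname) counts
"""

with_ties = conn.execute(
    f"SELECT projectname FROM ({ranked}) WHERE rnk = 1 ORDER BY projectname"
).fetchall()
single = conn.execute(
    f"SELECT projectname FROM ({ranked}) WHERE rn = 1"
).fetchall()

print(with_ties)  # [('Alpha',), ('Beta',)]
print(single)     # [('Alpha',)]
```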
You might try something like this (though I'm quite sure it can be done in other ways):
SELECT *
FROM (SELECT prj.projectid,
prj.projectname,
COUNT(*) AS number_employees
FROM project24 prj
JOIN employee24 emp
ON prj.employeeid = emp.employeeid
GROUP BY prj.projectid,
prj.projectname
ORDER BY number_employees DESC)
WHERE ROWNUM = 1;

SQL: group by table

Suppose we use PostgreSQL and have 2 tables, department and employee, where each employee belongs to a department and has a FK to it.
We now want to do an aggregate select, where we want to put all the information from department and then some aggregate values from employee:
SELECT d.id, d.name, d.budget, count(*), avg(e.salary), max(e.age), sum(e.children)
FROM department d LEFT JOIN employee e ON e.dept = d.id
GROUP BY d.id, d.name, d.budget
I don't like that I need to specify all the columns from department in the GROUP BY - is there a way to "group by the whole table"?
And a bit more philosophical question, suppose I do GROUP BY d.id. Assuming d.id is the primary key of department, why do I need to group by all the other columns as well?
If employee is pre-aggregated, there is no need to list the department columns in a GROUP BY:
select *
from
department d
left join (
select
dept as id,
count(*) as count_employee,
avg(salary) as avg_salary,
max(age) as max_age,
sum(children) as sum_children
from employee
group by dept
) e using (id)
The USING clause keeps the join column from appearing twice in the output.
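
Here is a small sketch of the pre-aggregation approach, run on SQLite through Python's sqlite3 module instead of PostgreSQL. The sample data and the empid column are assumptions for illustration. It shows that id appears once in the output, and that a department without employees gets NULL aggregates.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE department (id INTEGER PRIMARY KEY, name TEXT, budget INTEGER);
    -- empid is a hypothetical key column, not named in the question
    CREATE TABLE employee (empid INTEGER PRIMARY KEY,
                           dept INTEGER REFERENCES department(id),
                           salary INTEGER, age INTEGER, children INTEGER);
    INSERT INTO department VALUES (1, 'R&D', 100), (2, 'Ops', 50);
    -- Only R&D has employees
    INSERT INTO employee VALUES (1, 1, 10, 30, 1), (2, 1, 20, 40, 2);
""")

cur = conn.execute("""
    SELECT *
    FROM department d
    LEFT JOIN (SELECT dept AS id,
                      COUNT(*) AS count_employee,
                      AVG(salary) AS avg_salary,
                      MAX(age) AS max_age,
                      SUM(children) AS sum_children
               FROM employee
               GROUP BY dept) e USING (id)
    ORDER BY d.id
""")
columns = [c[0] for c in cur.description]  # output column names
rows = cur.fetchall()

print(columns)
for row in rows:
    print(row)
```

One difference from the original single-level query: there, count(*) over the LEFT JOIN reports 1 for a department with no employees, while here the count for such a department is NULL.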

SUM(SALARY) when ID is distinct

I am having trouble solving this problem: I would like to add up a salary only if the employee's id is distinct. I thought I could do this using the decode() function, but I am having trouble defining a suitable expression. I was aiming for something like
SUM(DECODE(S.ID,IS DISTINCT,S.SALARY))
But this isn't going to work!
So the full query looks like
SELECT B.ID, SUM(S.SALARY), COUNT(DISTINCT S.ID), COUNT(DISTINCT RM.MEMBER_ID)
FROM BRANCH B
INNER JOIN STAFF S ON S.BRANCH_ID = B.ID
INNER JOIN RECRUIT_MEMBER RM ON RM.BRANCH_ID = B.ID
GROUP BY B.ID;
But the problem is with SUM(S.SALARY): it's adding up salaries from duplicate IDs.
I don't know about DECODE, but this should work:
SELECT
SUM(S.SALARY)
FROM <table> S
WHERE NOT EXISTS (
SELECT ID FROM <table> WHERE ID=S.ID GROUP BY ID HAVING COUNT(*)>1
)
Perhaps something like this...
SELECT E.ID, SUM(E.Salary)
FROM Employers E
WHERE E.ID IN (SELECT DISTINCT E2.ID FROM Employers E2)
GROUP BY E.ID
If not, perhaps you could post some sample data so that I can understand better
The joins are introducing duplicate rows. One way to fix this is by adding a row number to sequentially identify different ids. The real way would be to fix the joins so this doesn't happen, but here is the first way:
SELECT BRANCH_ID, SUM(CASE WHEN SEQNUM = 1 THEN SALARY END),
COUNT(DISTINCT STAFF_ID), COUNT(DISTINCT MEMBER_ID)
FROM (SELECT B.ID AS BRANCH_ID, S.ID AS STAFF_ID, S.SALARY, RM.MEMBER_ID,
ROW_NUMBER() OVER (PARTITION BY S.ID ORDER BY S.ID) as SEQNUM
FROM BRANCH B
INNER JOIN STAFF S ON S.BRANCH_ID = B.ID
INNER JOIN RECRUIT_MEMBER RM ON RM.BRANCH_ID = B.ID
) t
GROUP BY BRANCH_ID
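
As for fixing the joins themselves, one possible approach (a sketch, not necessarily what the answer above had in mind) is to aggregate STAFF and RECRUIT_MEMBER per branch before joining, so neither table can multiply the other's rows. The example below uses Python's sqlite3 module with invented sample data; the aliases TOTAL_SALARY, STAFF_COUNT and MEMBER_COUNT are hypothetical names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE BRANCH (ID INTEGER PRIMARY KEY);
    CREATE TABLE STAFF (ID INTEGER PRIMARY KEY, BRANCH_ID INTEGER, SALARY INTEGER);
    CREATE TABLE RECRUIT_MEMBER (MEMBER_ID INTEGER PRIMARY KEY, BRANCH_ID INTEGER);
    INSERT INTO BRANCH VALUES (1);
    -- Two staff (salaries 100 and 200) and three members in the same branch
    INSERT INTO STAFF VALUES (1, 1, 100), (2, 1, 200);
    INSERT INTO RECRUIT_MEMBER VALUES (10, 1), (11, 1), (12, 1);
""")

# Original shape: each staff row is repeated once per member of the branch,
# so every salary is summed three times.
naive = conn.execute("""
    SELECT B.ID, SUM(S.SALARY), COUNT(DISTINCT S.ID), COUNT(DISTINCT RM.MEMBER_ID)
    FROM BRANCH B
    INNER JOIN STAFF S ON S.BRANCH_ID = B.ID
    INNER JOIN RECRUIT_MEMBER RM ON RM.BRANCH_ID = B.ID
    GROUP BY B.ID
""").fetchall()

# Pre-aggregated shape: one row per branch on each side of the join
fixed = conn.execute("""
    SELECT B.ID, S.TOTAL_SALARY, S.STAFF_COUNT, RM.MEMBER_COUNT
    FROM BRANCH B
    INNER JOIN (SELECT BRANCH_ID, SUM(SALARY) AS TOTAL_SALARY, COUNT(*) AS STAFF_COUNT
                FROM STAFF
                GROUP BY BRANCH_ID) S ON S.BRANCH_ID = B.ID
    INNER JOIN (SELECT BRANCH_ID, COUNT(DISTINCT MEMBER_ID) AS MEMBER_COUNT
                FROM RECRUIT_MEMBER
                GROUP BY BRANCH_ID) RM ON RM.BRANCH_ID = B.ID
""").fetchall()

print(naive)  # [(1, 900, 2, 3)]
print(fixed)  # [(1, 300, 2, 3)]
```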
You can create a derived table with only one salary per ID like this:
SELECT
...whatever fields you've already got...
s.Salary
FROM
...whatever tables and joins you've already got...
LEFT JOIN (SELECT ID, MAX(SALARY) as "Salary" FROM SALARY_TABLE GROUP BY ID) s
ON whatevertable.ID = s.ID