SQL Conditional JOIN - sql

I have three tables: 'Employees', 'Departments' and 'EmployeesInDepartments'.
The 'Employees' table references the 'Departments' table (each employee must have a DepartmentId). However, an employee can belong to multiple departments by adding entries (EmployeeId and DepartmentId) to the 'EmployeesInDepartments' table.
I currently have the following stored procedure to retrieve employees by DepartmentId:
CREATE PROCEDURE dbo.CollectEmployeesByDepartmentId
(
    @DepartmentId int,
    @IsDeleted bit
)
AS
BEGIN
    SELECT Employees.*
    FROM Employees
    WHERE Employees.IsDeleted = @IsDeleted
      AND (Employees.DepartmentId = @DepartmentId
           OR Employees.EmployeeId IN (SELECT EmployeesInDepartments.EmployeeId
                                       FROM EmployeesInDepartments
                                       WHERE EmployeesInDepartments.DepartmentId = @DepartmentId))
END
How can I optimize this stored procedure and possibly use JOINS?

My first recommendation is to remove DepartmentId from the Employees table and store every employee-department assignment in the EmployeesInDepartments table.
Then it's a simple inner join.
And of course, never use SELECT * in production code.
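To make the suggested redesign concrete, here is a minimal sketch using Python's built-in sqlite3 module rather than SQL Server. The trimmed-down schema, the Name column and the sample rows are assumptions for illustration only; the point is that once every assignment lives in EmployeesInDepartments, the lookup becomes a single inner join with an explicit column list.

```python
import sqlite3

# Hypothetical miniature schema: department membership lives only in
# EmployeesInDepartments; Employees no longer carries a DepartmentId.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Employees (EmployeeId INTEGER PRIMARY KEY, Name TEXT, IsDeleted INTEGER);
CREATE TABLE EmployeesInDepartments (
    EmployeeId INTEGER,
    DepartmentId INTEGER,
    PRIMARY KEY (EmployeeId, DepartmentId)
);
INSERT INTO Employees VALUES (1, 'Ann', 0), (2, 'Bob', 0), (3, 'Cy', 1);
INSERT INTO EmployeesInDepartments VALUES (1, 10), (1, 20), (2, 20), (3, 10);
""")

def employees_in_department(department_id, is_deleted):
    """Return (EmployeeId, Name) rows for one department via a plain inner join."""
    return conn.execute(
        """
        SELECT e.EmployeeId, e.Name
        FROM Employees e
        JOIN EmployeesInDepartments ed ON ed.EmployeeId = e.EmployeeId
        WHERE ed.DepartmentId = ? AND e.IsDeleted = ?
        ORDER BY e.EmployeeId
        """,
        (department_id, is_deleted),
    ).fetchall()

print(employees_in_department(20, 0))
```

The composite primary key on the junction table also rules out duplicate assignments, so no DISTINCT is needed.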

Here's my re-write of your query:
WITH summary AS (
    SELECT e.*
    FROM EMPLOYEES e
    WHERE e.isdeleted = @IsDeleted
      AND e.parentid = 0)
SELECT a.*
FROM summary a
WHERE a.departmentid = @DepartmentId
UNION
SELECT b.*
FROM summary b
JOIN EMPLOYEESINDEPARTMENTS ed ON ed.employeeid = b.employeeid
                              AND ed.departmentid = @DepartmentId
The UNION is necessary to remove duplicates - if you know there'll never be duplicates, change UNION to UNION ALL.
The CTE called "summary" doesn't provide any performance benefit, it's just shorthand.
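A quick way to see the UNION versus UNION ALL difference is to run both against a tiny dataset. This is only a sketch in Python with sqlite3, not SQL Server; the two-column tables and the sample rows are made up for the demonstration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Employees (EmployeeId INTEGER PRIMARY KEY, DepartmentId INTEGER);
CREATE TABLE EmployeesInDepartments (EmployeeId INTEGER, DepartmentId INTEGER);
-- Employee 1 has home department 10 AND a redundant extra row for 10.
INSERT INTO Employees VALUES (1, 10), (2, 20);
INSERT INTO EmployeesInDepartments VALUES (1, 10), (2, 10);
""")

QUERY = """
SELECT e.EmployeeId FROM Employees e WHERE e.DepartmentId = :dept
{op}
SELECT e.EmployeeId FROM Employees e
JOIN EmployeesInDepartments ed ON ed.EmployeeId = e.EmployeeId
WHERE ed.DepartmentId = :dept
"""

def ids(op):
    """Run the two-branch query with the given set operator and return sorted ids."""
    rows = conn.execute(QUERY.format(op=op), {"dept": 10}).fetchall()
    return sorted(row[0] for row in rows)

union_ids = ids("UNION")          # duplicates removed
union_all_ids = ids("UNION ALL")  # employee 1 appears once per branch
print(union_ids, union_all_ids)
```

Employee 1 matches both branches, so UNION ALL returns it twice while UNION collapses it to one row.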

SELECT E.*
FROM Employees E
LEFT JOIN EmployeesInDepartments EID ON E.EmployeeId = EID.EmployeeId
                                    AND E.DepartmentId <> @DepartmentId
WHERE E.IsDeleted = @IsDeleted
  AND
  (
      E.DepartmentId = @DepartmentId
      OR (EID.DepartmentId = @DepartmentId)
  )
Edit to include IsDeleted logic.
I do agree with some of the other answers: your design should probably be changed. But this query should do it. If you have duplicates in EmployeesInDepartments, you can change that to a SELECT DISTINCT.
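The effect of duplicate rows on the LEFT JOIN version can be checked with a small experiment. The sketch below uses Python's sqlite3 instead of SQL Server, leaves out the IsDeleted filter for brevity, and relies on invented sample data in which employee 2 is listed twice for department 10.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Employees (EmployeeId INTEGER PRIMARY KEY, DepartmentId INTEGER);
CREATE TABLE EmployeesInDepartments (EmployeeId INTEGER, DepartmentId INTEGER);
INSERT INTO Employees VALUES (1, 10), (2, 20), (3, 30);
-- Employee 2 is listed twice for department 10 (a duplicate row).
INSERT INTO EmployeesInDepartments VALUES (2, 10), (2, 10), (3, 40);
""")

LEFT_JOIN = """
SELECT {distinct} E.EmployeeId
FROM Employees E
LEFT JOIN EmployeesInDepartments EID
       ON E.EmployeeId = EID.EmployeeId AND E.DepartmentId <> :dept
WHERE E.DepartmentId = :dept OR EID.DepartmentId = :dept
ORDER BY E.EmployeeId
"""

def run(distinct):
    """Run the LEFT JOIN query for department 10, optionally with DISTINCT."""
    sql = LEFT_JOIN.format(distinct="DISTINCT" if distinct else "")
    return [row[0] for row in conn.execute(sql, {"dept": 10})]

plain_ids = run(distinct=False)    # employee 2 repeated by the duplicate row
distinct_ids = run(distinct=True)  # one row per employee
print(plain_ids, distinct_ids)
```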

I would suggest changing the IN clause in your query to WHERE EXISTS.
Depending on the database engine and version, IN with a subquery may not take advantage of the indexes defined on the table, and the query can fall back to a full table scan, which hurts performance. Compare the execution plans of both forms to confirm.
You can check this thread to convert an IN to WHERE EXISTS:
Changing IN to EXISTS in SQL
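For reference, the rewrite from IN to EXISTS looks roughly like the sketch below. It runs on Python's sqlite3 rather than SQL Server, with a cut-down schema and sample rows that are assumptions; it only shows that the two forms return the same rows, not which one is faster on a given engine.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Employees (EmployeeId INTEGER PRIMARY KEY, DepartmentId INTEGER);
CREATE TABLE EmployeesInDepartments (EmployeeId INTEGER, DepartmentId INTEGER);
INSERT INTO Employees VALUES (1, 10), (2, 20), (3, 30);
INSERT INTO EmployeesInDepartments VALUES (2, 10), (3, 40);
""")

IN_QUERY = """
SELECT e.EmployeeId
FROM Employees e
WHERE e.DepartmentId = :dept
   OR e.EmployeeId IN (SELECT ed.EmployeeId
                       FROM EmployeesInDepartments ed
                       WHERE ed.DepartmentId = :dept)
ORDER BY e.EmployeeId
"""

# Same filter, expressed as a correlated EXISTS subquery.
EXISTS_QUERY = """
SELECT e.EmployeeId
FROM Employees e
WHERE e.DepartmentId = :dept
   OR EXISTS (SELECT 1
              FROM EmployeesInDepartments ed
              WHERE ed.EmployeeId = e.EmployeeId
                AND ed.DepartmentId = :dept)
ORDER BY e.EmployeeId
"""

in_ids = [row[0] for row in conn.execute(IN_QUERY, {"dept": 10})]
exists_ids = [row[0] for row in conn.execute(EXISTS_QUERY, {"dept": 10})]
print(in_ids, exists_ids)
```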

Related

How to tune query to fetch result faster | Oracle 19c |

I have tables with a huge number of records.
My tables: employee and customer
The issue is that I have 2 billion records in the employee table and 1 billion records in the customer table.
Employee columns
empid
empname
empage
empdcourse
Customer columns
custid
custdesc
custmessage
My query:
select empid from employee where empid not in (select custid from customer);
Error: it fails with a tablespace error, and I am not allowed to increase the tablespace.
Is there any way I can tune my query, or run it batch by batch, so that I get the output?
Any solution is much appreciated!
I need it on high priority.
NOT EXISTS may be more efficient and consume less memory in such a case.
(The query suggests Customer and Employee share the same PK. Does that mean you have a "super" table Person?)
Try this:
with tmp as
 (select /*+ full(c) */
         custid
    from customer c)
select /*+ full(e) */
       e.empid
  from employee e, tmp t
 where e.empid = t.custid(+)
   and t.custid is null;
The full hint forces full table scans, which should help avoid the tablespace issue.
The outer join is usually faster than NOT IN.
You can improve it further by adding the parallel hint, starting with a degree of 2 or 4, like this:
with tmp as
 (select /*+ full(c) parallel(c,2) */
         custid
    from customer c)
select /*+ full(e) parallel(e,2) */
       e.empid
  from employee e, tmp t
 where e.empid = t.custid(+)
   and t.custid is null;
You can add indexes for columns, for example, if they aren’t primary keys:
CREATE INDEX empid_index
ON employee(empid);
Also, you can update the query:
select e.empid from employee e where not exists (select 1 from customer c where c.custid = e.empid);
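One thing to keep in mind when swapping NOT IN for NOT EXISTS or an outer join: they are only interchangeable when the subquery column has no NULLs. The sketch below uses Python's sqlite3 (not Oracle) and a made-up three-row dataset in which customer contains one NULL custid; under standard SQL NULL semantics NOT IN then returns nothing, while the other two forms still return the unmatched employees.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employee (empid INTEGER);
CREATE TABLE customer (custid INTEGER);
INSERT INTO employee VALUES (1), (2), (3);
-- One NULL custid is enough to change what NOT IN returns.
INSERT INTO customer VALUES (2), (NULL);
""")

def ids(sql):
    """Run a query that returns a single id column and return the sorted ids."""
    return sorted(row[0] for row in conn.execute(sql))

not_in = ids(
    "SELECT empid FROM employee WHERE empid NOT IN (SELECT custid FROM customer)"
)
not_exists = ids(
    "SELECT e.empid FROM employee e "
    "WHERE NOT EXISTS (SELECT 1 FROM customer c WHERE c.custid = e.empid)"
)
anti_join = ids(
    "SELECT e.empid FROM employee e "
    "LEFT JOIN customer c ON c.custid = e.empid WHERE c.custid IS NULL"
)
print(not_in, not_exists, anti_join)
```

If custid is a primary key, as the question implies, it cannot be NULL and all three forms agree.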

SELECT and JOIN to return only one row for each employee

I have a user table that stores the employeeId, lastName, firstName, department, hire date and mostRecentLogin.
I have another table that stores the employeeId, emailAddress.
The emailAddress table can have multiple rows for an employee if they have multiple email addresses.
I'm trying to return results that only show one row for each employee. I don't care which email address, just as long as it only picks one.
But all the queries I've tried always return all possible rows.
Here is my most recent attempt:
select *
from EmployeeInfo i
left join EmployeeEmail e ON i.employeeId = e.employeeId
where i.hireDate = 2015
and employeeId IN (
SELECT MIN(employeeId)
FROM EmployeeInfo
GROUP BY employeeId
)
But then again, this returns all possible rows.
Is there a way to get this to work?
Use a sub-query instead of a join:
select *
     , (select top 1 E.emailAddress from EmployeeEmail E where E.employeeId = I.employeeId) as emailAddress
from EmployeeInfo I
where I.hireDate = 2015;
Note: If you change your mind and decide you do have a preference as to which email address is returned then just add an order by to the sub-query - otherwise it is truly unknown which one you will get.
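This should work on engines that allow non-aggregated columns alongside GROUP BY (for example MySQL with ONLY_FULL_GROUP_BY disabled); SQL Server rejects SELECT * combined with GROUP BY.
SELECT *
FROM EmployeeInfo
LEFT JOIN EmployeeEmail
  ON EmployeeInfo.employeeId = EmployeeEmail.employeeId
WHERE EmployeeInfo.hireDate = '2015'
GROUP BY EmployeeInfo.employeeId;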
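A variant that does not depend on engine-specific GROUP BY behaviour is to list the columns explicitly and wrap the email in an aggregate such as MIN. The sketch below runs on Python's sqlite3; the reduced column set, the integer hireDate (storing just the year, as the question's comparison suggests) and the sample rows are all assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE EmployeeInfo (employeeId INTEGER PRIMARY KEY, lastName TEXT, hireDate INTEGER);
CREATE TABLE EmployeeEmail (employeeId INTEGER, emailAddress TEXT);
INSERT INTO EmployeeInfo VALUES (1, 'Smith', 2015), (2, 'Jones', 2015), (3, 'Lee', 2014);
INSERT INTO EmployeeEmail VALUES
    (1, 'b@example.com'), (1, 'a@example.com'), (3, 'c@example.com');
""")

# Every selected column is either grouped or aggregated, so strict engines accept it.
one_row_per_employee = conn.execute("""
SELECT i.employeeId, i.lastName, MIN(e.emailAddress) AS emailAddress
FROM EmployeeInfo i
LEFT JOIN EmployeeEmail e ON e.employeeId = i.employeeId
WHERE i.hireDate = 2015
GROUP BY i.employeeId, i.lastName
ORDER BY i.employeeId
""").fetchall()
print(one_row_per_employee)
```

Employees without any email address are still returned, with a NULL email, because of the LEFT JOIN.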

How to check if there exist only one record for a certain Id

I have two tables called Tbl_Company and Tbl_Employee
I am fetching employees as follows-
SELECT DISTINCT emp.employee_id
FROM Tbl_Company comp
, Tbl_Employee emp
WHERE
emp.company_id = comp.company_id
AND emp.company_id = 1234;
This query returns exactly one value.
How can I make sure that the above query returns exactly one value for any company_id I enter?
I tried the solutions given in this post, with no success.
Is there a simpler way to do this?
This would return one row per company:
SELECT comp.company_id, max(emp.employee_id) lastEmployeeID
FROM Tbl_Company comp
   , Tbl_Employee emp
WHERE
    emp.company_id = comp.company_id
    AND emp.company_id = 1234
GROUP BY comp.company_id
The following query is simple and flexible. It returns all employees who are the only employee in their company (with full employee information).
You can check whether a given company has a single employee by enabling the company condition, or whether a given employee is the only one in their company by enabling the employee condition.
SELECT emp.*
FROM Tbl_Employee emp
WHERE (emp.company_id, 1) IN (SELECT t.company_id, count(t.employee_id)
                              FROM Tbl_Employee t
                              GROUP BY t.company_id)
--AND emp.company_id = 1111 /*filter conditions on company_id*/
--AND emp.employee_id = 1234 /*filter conditions on employee_id*/
;
I solved this using the answer by @davegreen100 in the comments:
SELECT comp.company_id, count(distinct emp.employee_id)
FROM Tbl_Company comp
   , Tbl_Employee emp
WHERE
    emp.company_id = comp.company_id
    AND emp.company_id = 1234
GROUP BY comp.company_id
This gives me the count of employees per company.
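Building on the count, the "exactly one" check can be pushed into the query with HAVING, or done on the count in application code. This is a sketch on Python's sqlite3 with an invented Tbl_Employee dataset; the helper name has_exactly_one_employee is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Tbl_Employee (employee_id INTEGER PRIMARY KEY, company_id INTEGER);
INSERT INTO Tbl_Employee VALUES (1, 1234), (2, 5678), (3, 5678);
""")

def has_exactly_one_employee(company_id):
    """True when the company has exactly one distinct employee."""
    (count,) = conn.execute(
        "SELECT COUNT(DISTINCT employee_id) FROM Tbl_Employee WHERE company_id = ?",
        (company_id,),
    ).fetchone()
    return count == 1

# All companies that have exactly one employee, in a single statement.
single_companies = [
    row[0]
    for row in conn.execute("""
        SELECT company_id
        FROM Tbl_Employee
        GROUP BY company_id
        HAVING COUNT(DISTINCT employee_id) = 1
        ORDER BY company_id
    """)
]
print(has_exactly_one_employee(1234), single_companies)
```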

SQL Loop/Crawler

I am trying to figure out some ways to accomplish this script. I import an excel sheet and then I need to populate 5 different tables based on this excel sheet. However for this example I just need help with the initial loop then I think I can work through the rest.
select distinct Department from IPACS_New_MasterList
where Department is not null
This provides me a list of 7 different departments.
Dep1, Dep2, Dep3, Dep4, Dep5, Dep6, Dep7
For each of these departments I need to perform some code.
Step #1:
Insert the department into table_one
I then need to keep the SCOPE_IDENTITY() for the rest of the code.
Step #2:
Perform the second loop (inserting all functions in that department into table2).
I'm not sure how to really do a foreach row in this select statement loop, or if I need to do something completely different. I've looked at several answers but can't seem to find exactly what I'm looking for.
Sample Data:
Source Table
Dep1, func1, process1, procedure1
dep1, func1, process1, procedure2
dep1, func1, process2, procedure3
dep1, func1, process2, procedure4
dep1, func1, process2, procedure5
dep1, func2, process3, procedure6
dep2, func3, process4, procedure7
My Tables:
My first table is a list of every department from the above query. With a key on the departmentID. Each department can have many functions.
My second table is a list of all functions with a key on functionID and a foreign key on departmentID. Each function must have 1 department and can have many processes
My third table is a list of all processes with a key on processID and a foreign key on functionID. Each process must have 1 function and can have many procedures.
There are two approaches you can use without a loop.
1) If you have candidate keys in your source (department name), just join your source table back to the table you inserted into
e.g.
INSERT INTO Department
(Name)
SELECT DISTINCT Dep1
FROM SOURCE;
INSERT INTO Functions
(
Name,
DepartmentID)
SELECT DISTINCT
s.Func1,
d.DepartmentID
FROM
source s
INNER JOIN Department d
on s.dep1 = d.name;
INSERT INTO
processes
(
name,
FunctionID,
[Procedure]
)
SELECT
s.process1,
f.FunctionID,
s.procedure1
FROM
source s
INNER JOIN Department d
on s.dep1 = d.name
INNER JOIN Functions f
on d.DepartmentID = f.departmentID
and s.func1 = f.name;
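The join-back approach can be tried end to end on a small copy of the sample data. The sketch below uses Python's sqlite3 rather than SQL Server; the Source column names (Dep, Func, Process, Proc) are placeholders, and INTEGER PRIMARY KEY stands in for an IDENTITY column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Source (Dep TEXT, Func TEXT, Process TEXT, Proc TEXT);
CREATE TABLE Department (DepartmentID INTEGER PRIMARY KEY, Name TEXT);
CREATE TABLE Functions (
    FunctionID INTEGER PRIMARY KEY,
    Name TEXT,
    DepartmentID INTEGER REFERENCES Department(DepartmentID)
);
INSERT INTO Source VALUES
    ('dep1', 'func1', 'process1', 'procedure1'),
    ('dep1', 'func1', 'process1', 'procedure2'),
    ('dep1', 'func2', 'process3', 'procedure6'),
    ('dep2', 'func3', 'process4', 'procedure7');

-- Step 1: one row per department; the key is generated automatically.
INSERT INTO Department (Name)
SELECT DISTINCT Dep FROM Source;

-- Step 2: join the source back to Department on the name to pick up the new keys.
INSERT INTO Functions (Name, DepartmentID)
SELECT DISTINCT s.Func, d.DepartmentID
FROM Source s
JOIN Department d ON d.Name = s.Dep;
""")

functions = conn.execute("""
SELECT f.Name, d.Name
FROM Functions f
JOIN Department d ON d.DepartmentID = f.DepartmentID
ORDER BY f.Name
""").fetchall()
print(functions)
```

No loop is needed: each level is populated by one set-based INSERT ... SELECT.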
2) If you don't have candidate keys in your source, then you can use the OUTPUT clause.
For example, if a department name weren't guaranteed to be unique, this would correctly pick up only the newly added rows:
DECLARE @Department TABLE
(
    DepartmentID INT
);
DECLARE @Functions TABLE
(
    FunctionID INT
);
-- Capture the identity values of the departments inserted by this statement
INSERT INTO Department
    (Name)
OUTPUT INSERTED.DepartmentID INTO @Department
SELECT DISTINCT Dep1
FROM SOURCE;
-- Only join to departments captured above, and capture the new function IDs
INSERT INTO Functions
    (Name, DepartmentID)
OUTPUT INSERTED.FunctionID INTO @Functions
SELECT DISTINCT
    s.Func1,
    d.DepartmentID
FROM source s
INNER JOIN Department d
    ON s.dep1 = d.name
INNER JOIN @Department d2
    ON d.DepartmentID = d2.DepartmentID;
INSERT INTO processes
    (name, FunctionID, [Procedure])
SELECT
    s.process1,
    f.FunctionID,
    s.procedure1
FROM source s
INNER JOIN Department d
    ON s.dep1 = d.name
INNER JOIN Functions f
    ON d.DepartmentID = f.DepartmentID
    AND s.func1 = f.name
INNER JOIN @Functions f2
    ON f.FunctionID = f2.FunctionID;
SELECT * FROM Department;
SELECT * FROM Functions;
SELECT * FROM processes;
If I understand what you are trying to do: yes, you can use a loop. It's not often discussed, and I bet I'll get feedback from other SQL developers that it's not a best practice. But if you really need a loop:
DECLARE @rowcount AS int
DECLARE @numberOfRows AS int
SET @rowcount = 0
SET @numberOfRows = (SELECT COUNT(*) FROM tablename) --put in anything to get the number of times to loop.
WHILE @rowcount < @numberOfRows
BEGIN
    --Put whatever process you need to repeat here
    SET @rowcount = @rowcount + 1
END
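If the loop is driven from application code instead, the same "insert the department, keep its new key, then insert its functions" pattern can be sketched as follows. This uses Python's sqlite3, where cursor.lastrowid plays the role that SCOPE_IDENTITY() has in SQL Server; the table layout and sample rows are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Source (Dep TEXT, Func TEXT);
CREATE TABLE Department (DepartmentID INTEGER PRIMARY KEY, Name TEXT);
CREATE TABLE Functions (FunctionID INTEGER PRIMARY KEY, Name TEXT, DepartmentID INTEGER);
INSERT INTO Source VALUES
    ('dep1', 'func1'), ('dep1', 'func1'), ('dep1', 'func2'), ('dep2', 'func3');
""")

# Materialize the department list first so the loop does not read and write at once.
departments = [
    row[0]
    for row in conn.execute(
        "SELECT DISTINCT Dep FROM Source WHERE Dep IS NOT NULL ORDER BY Dep"
    )
]

for dep in departments:
    # Step 1: insert the department and keep its generated key.
    cur = conn.execute("INSERT INTO Department (Name) VALUES (?)", (dep,))
    department_id = cur.lastrowid  # analogous to SCOPE_IDENTITY()

    # Step 2: insert that department's distinct functions with the saved key.
    conn.execute(
        "INSERT INTO Functions (Name, DepartmentID) "
        "SELECT DISTINCT Func, ? FROM Source WHERE Dep = ?",
        (department_id, dep),
    )

loaded = conn.execute(
    "SELECT Name, DepartmentID FROM Functions ORDER BY Name"
).fetchall()
print(loaded)
```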
Assuming your tables have an IDENTITY column as the primary key, you can populate each successive table's foreign key by joining the previous table to the source table, something like:
INSERT INTO Table1
SELECT DISTINCT Department
FROM SourceTable
GO
INSERT INTO Table2
SELECT DISTINCT b.Department_ID, a.[Function]
FROM SourceTable a
JOIN Table1 b
  ON a.Department = b.Department
GO
INSERT INTO Table3
SELECT DISTINCT b.Function_ID, a.Process
FROM SourceTable a
JOIN Table2 b
  ON a.[Function] = b.[Function]
GO
INSERT INTO Table4
SELECT DISTINCT b.Process_ID, a.[Procedure]
FROM SourceTable a
JOIN Table3 b
  ON a.Process = b.Process
GO

Trace a Hierarchy in a Table

I have an "Employee" table with an "EmployeeID" column and a column representing the employee's boss (BossID), who is in turn an employee in the "Employee" table. How can I trace the hierarchy from a given "EmployeeID" to the topmost boss? I do not want a self-join approach, and I am using SQL Server 2005.
Thank you
Manu
With the table structure you describe, you basically have to use some sort of self join, but you can use a recursive CTE to handle arbitrary depths of hierarchy, if that was the concern.
WITH cte AS
(
    SELECT EmployeeID, BossId
    FROM Employee WHERE EmployeeID = @EmployeeID
    UNION ALL
    SELECT e.EmployeeID, e.BossId
    FROM Employee e JOIN cte ON cte.BossId = e.EmployeeID
)
SELECT EmployeeID
FROM cte
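To see the recursion walk up the chain, here is a small runnable sketch on Python's sqlite3. Note that SQLite spells it WITH RECURSIVE, whereas SQL Server uses plain WITH; the Depth column is an addition (not in the query above) so the result can be ordered from the starting employee to the top boss, and the four-row hierarchy is invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Employee (EmployeeID INTEGER PRIMARY KEY, BossID INTEGER);
-- 1 is the top boss; 2 reports to 1; 3 and 4 report to 2.
INSERT INTO Employee VALUES (1, NULL), (2, 1), (3, 2), (4, 2);
""")

def chain_to_top(employee_id):
    """Return the EmployeeIDs from the given employee up to the topmost boss."""
    rows = conn.execute(
        """
        WITH RECURSIVE cte(EmployeeID, BossID, Depth) AS (
            -- Anchor member: the starting employee.
            SELECT EmployeeID, BossID, 0 FROM Employee WHERE EmployeeID = ?
            UNION ALL
            -- Recursive member: the boss of the row found in the previous step.
            SELECT e.EmployeeID, e.BossID, cte.Depth + 1
            FROM Employee e JOIN cte ON cte.BossID = e.EmployeeID
        )
        SELECT EmployeeID FROM cte ORDER BY Depth
        """,
        (employee_id,),
    ).fetchall()
    return [row[0] for row in rows]

print(chain_to_top(3))
```

The recursion stops by itself at the top boss, whose BossID is NULL and therefore matches no further row.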