SQL Server: finding where the hierarchy chain broke

I have more than 70k employee records in a table. It looks like this:
+------------+-----------------+-----------+
| EmployeeID | Name            | ManagerID |
+------------+-----------------+-----------+
| 1          | Iron Man        | 2         |
| 2          | Batman          | 4         |
| 3          | Superman        | 2000      |
| 4          | Captain America | 3         |
+------------+-----------------+-----------+
Here, Superman has an invalid ManagerID because ManagerID = 2000 doesn't exist in the EmployeeID column. In order to assign a new ManagerID for Superman, I need to find out at what level of hierarchy he is located. I know it should be some recursive query, but I am having much difficulty. Could anybody help? Thank you so much!

This might help you get started.
CREATE TABLE Employees ( EmployeeID INT, Name VARCHAR(200), ManagerID INT );
INSERT INTO Employees ( EmployeeID, Name, ManagerID ) VALUES ( 1, 'Iron Man', 2 ), ( 2, 'Batman', 4 ), ( 3, 'Superman', 2000 ), ( 4, 'Captain America', 3 );
-- The statement before WITH must be terminated with a semicolon
WITH Relationships ( ManagerID, Name, EmployeeID ) AS
(
    -- Anchor: every employee whose manager exists in the table
    SELECT
        ManagerID, Name, EmployeeID
    FROM
        Employees
    WHERE
        ManagerID IN ( SELECT EmployeeID FROM Employees )
    UNION ALL
    -- Recursive step: move up to the manager's own manager
    SELECT
        Employees.ManagerID, Relationships.Name, Relationships.EmployeeID
    FROM
        Employees
        INNER JOIN Relationships
            ON Employees.EmployeeID = Relationships.ManagerID
)
SELECT
    EmployeeID, Name, ManagerID
FROM
    Relationships
WHERE
    EmployeeID = 1 -- Iron Man
OPTION ( MAXRECURSION 25000 );
Replace "EmployeeID = 1" with the ID of the employee you want to target. The number of rows returned is the level. You can probably add ROW_NUMBER() to the outermost query to get that value.
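As a rough illustration of the same idea with an explicit depth counter, here is a minimal sketch. It uses Python's built-in sqlite3 module only so that it is self-contained and runnable; SQLite wants WITH RECURSIVE where SQL Server uses plain WITH. The CTE name Chain and the Depth column are assumptions for the example, not part of the original answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Employees (EmployeeID INT, Name VARCHAR(200), ManagerID INT);
    INSERT INTO Employees VALUES
        (1, 'Iron Man', 2), (2, 'Batman', 4),
        (3, 'Superman', 2000), (4, 'Captain America', 3);
""")

# Start at one employee and climb to each manager in turn,
# counting the steps. The walk stops when ManagerID matches no row.
chain = conn.execute("""
    WITH RECURSIVE Chain AS (
        SELECT EmployeeID, Name, ManagerID, 0 AS Depth
        FROM Employees
        WHERE EmployeeID = ?
        UNION ALL
        SELECT e.EmployeeID, e.Name, e.ManagerID, c.Depth + 1
        FROM Employees e
        JOIN Chain c ON e.EmployeeID = c.ManagerID
    )
    SELECT EmployeeID, Name, ManagerID, Depth
    FROM Chain
    ORDER BY Depth
""", (1,)).fetchall()

for row in chain:
    print(row)
```

The last row printed is the employee whose ManagerID points nowhere, and its Depth is how many levels the starting employee sits below that record. The sketch assumes the data contains no cycles; a cyclic chain would recurse until the engine's limit is hit.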

To find the broken records, you can use a subquery:
select *
from Employees t
where not exists (select 1 from Employees where EmployeeID = t.ManagerID);
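One thing to watch for: NOT EXISTS is also true for rows whose ManagerID is NULL, so a legitimate top-level employee would be reported as broken. The sketch below adds an IS NOT NULL filter to show the difference. It runs through Python's sqlite3 purely to be self-contained, and the 'Wonder Woman' row is a hypothetical addition that is not in the question's data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Employees (EmployeeID INT, Name VARCHAR(200), ManagerID INT);
    INSERT INTO Employees VALUES
        (1, 'Iron Man', 2), (2, 'Batman', 4),
        (3, 'Superman', 2000), (4, 'Captain America', 3),
        (5, 'Wonder Woman', NULL);  -- hypothetical top-level row
""")

# Rows whose ManagerID is set but matches no EmployeeID
broken = conn.execute("""
    SELECT t.EmployeeID, t.Name, t.ManagerID
    FROM Employees t
    WHERE t.ManagerID IS NOT NULL
      AND NOT EXISTS (SELECT 1 FROM Employees m
                      WHERE m.EmployeeID = t.ManagerID)
""").fetchall()

print(broken)
```

With this data only Superman is returned; without the IS NOT NULL condition the hypothetical NULL-manager row would be returned as well.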

Related

Trying to avoid duplication in Oracle query with table alias causing "missing right parenthesis" error

I am trying to write a query that assembles a list of ID numbers from two tables.
The first part of the query selects employee ID numbers that start with 'B' and were created after January 1, 2020 from a table containing the master list of IDs. The second part of the query looks at another table that holds team IDs (which are composed of two or more employee IDs) - basically I want to select any team ID that contains an employee ID that meets the criteria in the first query.
I have the below code working and it returns the proper data:
SELECT
emp_id AS id
FROM
emp_master_view
WHERE
emp_id IN
(
SELECT DISTINCT t.team_id AS id
FROM team_master_view t
WHERE t.emp_id IN
(
SELECT emp_id
FROM emp_master_view
WHERE SUBSTR(emp_id, 1, 1) IN ('B')
AND crt_date >= TO_DATE('2020-01-01', 'yyyy-mm-dd')
)
)
UNION SELECT emp_id AS id
FROM emp_master_view
WHERE SUBSTR(emp_id, 1, 1) IN ('B')
AND crt_date >= TO_DATE('2020-01-01', 'yyyy-mm-dd')
However, I hate the duplication I need to use with the union select statement...I tried using a table alias like this:
SELECT
emp_id
FROM
emp_master_view
WHERE
emp_id IN
(
SELECT DISTINCT t.team_id AS emp_id
FROM team_master_view t
WHERE team_id IN
(
SELECT emp_id
FROM emp_master_view
WHERE SUBSTR(emp_id, 1, 1) IN ('B')
AND crt_date >= TO_DATE('2020-01-01', 'yyyy-mm-dd')
) emp
)
UNION SELECT emp_id
FROM emp
...but this causes a "missing right parenthesis" error. I know I'm making some dumb mistake but I've tried a number of variations and can't get it working without duplicating the select entirely, so I'm waving the white flag. Can anyone point out the proper way to use a table alias in this case?
Edit
Sample data would be as follows:
emp_master_view contains the master list for every ID number, including teams:
EMP_MASTER_VIEW
+--------+------------+
| emp_id | crt_date |
+--------+------------+
| B56 | 2019-11-02 |
| B99 | 2020-03-02 |
| S34 | 2020-03-02 |
| RZF | 2020-04-01 |
| RQR | 2020-04-01 |
+--------+------------+
team_master_view contains only team IDs and there is a row for every individual employee in the team:
TEAM_MASTER_VIEW
+---------+--------+
| team_id | emp_id |
+---------+--------+
| RZF | B99 |
| RZF | B56 |
| RQR | B56 |
| RQR | S34 |
+---------+--------+
Desired results - pull out all of the IDs in the emp_master_view that meet the criteria and ALSO select all the team IDs from team_master_view where that team ID contains one of the employee IDs selected in the first part of the query.
Given the above tables, it would select B99 from emp_master_view as it's the only code that meets both criteria (starts with 'B' and created after Jan 1 2020). It would also select RZF from the team_master_view because that team ID contains B99. The end result set would be two rows:
B99
RZF
Hope that makes sense...
Since you want to use the same query more than once, you could use the WITH clause for subquery factoring:
with data as
(
    SELECT emp_id
    FROM emp_master_view
    WHERE SUBSTR(emp_id, 1, 1) IN ('B')
    AND crt_date >= TO_DATE('2020-01-01', 'yyyy-mm-dd')
)
SELECT
    emp_id AS id
FROM
    emp_master_view
WHERE
    emp_id IN
    (
        SELECT DISTINCT t.team_id AS id
        FROM team_master_view t
        WHERE t.emp_id IN
        (
            SELECT emp_id
            FROM data
        )
    )
UNION
SELECT emp_id AS id FROM data;
Update:
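After you posted your table data and requirement, it looks like a simple JOIN between the two tables. Note that it returns each matching employee ID and its team ID as two columns of the same row, not as separate rows: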
SELECT e.emp_id,
t.team_id
FROM
emp_master_view e
JOIN team_master_view t
ON e.emp_id = t.emp_id
WHERE
e.emp_id LIKE 'B%'
AND e.crt_date >= TO_DATE('2020-01-01', 'yyyy-mm-dd');
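If the two IDs are wanted as separate rows (B99 and RZF), the factored subquery can be combined with UNION. This is a minimal sketch run against the sample data through Python's sqlite3 so it is self-contained. SQLite has no TO_DATE, so the dates are stored as ISO-8601 text and compared as strings; that storage choice is an assumption of the example, not how the Oracle views work.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE emp_master_view (emp_id TEXT, crt_date TEXT);
    CREATE TABLE team_master_view (team_id TEXT, emp_id TEXT);
    INSERT INTO emp_master_view VALUES
        ('B56', '2019-11-02'), ('B99', '2020-03-02'), ('S34', '2020-03-02'),
        ('RZF', '2020-04-01'), ('RQR', '2020-04-01');
    INSERT INTO team_master_view VALUES
        ('RZF', 'B99'), ('RZF', 'B56'), ('RQR', 'B56'), ('RQR', 'S34');
""")

# The filter is written once in the CTE and reused by both branches
ids = [row[0] for row in conn.execute("""
    WITH data AS (
        SELECT emp_id
        FROM emp_master_view
        WHERE emp_id LIKE 'B%'
          AND crt_date >= '2020-01-01'
    )
    SELECT emp_id AS id FROM data
    UNION
    SELECT t.team_id
    FROM team_master_view t
    WHERE t.emp_id IN (SELECT emp_id FROM data)
    ORDER BY id
""")]

print(ids)
```

The first branch returns the qualifying employee IDs and the second returns every team that contains one of them, so the filter criteria appear only once.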

Recursive CTE with three tables

I'm using SQL Server 2008 R2 SP1.
I would like to recursively find the first non-null manager for a certain organizational unit by "walking up the tree".
I have one table containing organizational units, "ORG"; one table containing the parent of each org. unit in "ORG", let's call it "ORG_PARENTS"; and one table containing the manager of each organizational unit, let's call it "ORG_MANAGERS".
ORG has a column ORG_ID:
ORG_ID
1
2
3
ORG_PARENTS has two columns.
ORG_ID, ORG_PARENT
1, NULL
2, 1
3, 2
MANAGERS has two columns.
ORG_ID, MANAGER
1, John Doe
2, Jane Doe
3, NULL
I'm trying to create a recursive query that will find the first non-null manager for a certain organizational unit.
Basically if I do a query today for the manager for ORG_ID=3 I will get NULL.
SELECT MANAGER FROM ORG_MANAGERS WHERE ORG_ID = '3'
I want the query to use the ORG_PARENTS table to get the parent for ORG_ID=3, in this case get "2" and repeat the query against the ORG_MANAGERS table with ORG_ID=2 and return in this example "Jane Doe".
In case the query also returns NULL I want to repeat the process with the parent of ORG_ID=2, i.e. ORG_ID=1 and so on.
My CTE attempts so far have failed, one example is this:
WITH BOSS (MANAGER, ORG_ID, ORG_PARENT)
AS
( SELECT m.MANAGER, m.ORG_ID, p.ORG_PARENT
FROM dbo.MANAGERS m INNER JOIN
dbo.ORG_PARENTS p ON p.ORG_ID = m.ORG_ID
UNION ALL
SELECT m1.MANAGER, m1.ORG_ID, b.ORG_PARENT
FROM BOSS b
INNER JOIN dbo.MANAGERS m1 ON m1.ORG_ID = b.ORG_PARENT
)
SELECT * FROM BOSS WHERE ORG_ID = 3
It returns:
Msg 530, Level 16, State 1, Line 4
The statement terminated. The maximum recursion 100 has been exhausted before statement completion.
MANAGER ORG_ID ORG_PARENT
NULL 3 2
You need to keep track of the original ID you start with. Try this:
DECLARE @ORG_PARENTS TABLE (ORG_ID INT, ORG_PARENT INT )
DECLARE @MANAGERS TABLE (ORG_ID INT, MANAGER VARCHAR(100))
INSERT @ORG_PARENTS (ORG_ID, ORG_PARENT)
VALUES (1, NULL)
     , (2, 1)
     , (3, 2)
INSERT @MANAGERS (ORG_ID, MANAGER)
VALUES (1, 'John Doe')
     , (2, 'Jane Doe')
     , (3, NULL)
;
WITH BOSS
AS
(
    -- Anchor: each org. unit, remembering its own ID as ORI
    SELECT m.MANAGER, m.ORG_ID AS ORI, m.ORG_ID, p.ORG_PARENT, 1 cnt
    FROM @MANAGERS m
    INNER JOIN @ORG_PARENTS p
        ON p.ORG_ID = m.ORG_ID
    UNION ALL
    -- Recursive step: move to the parent, carrying ORI along
    SELECT m1.MANAGER, b.ORI, m1.ORG_ID, OP.ORG_PARENT, cnt +1
    FROM BOSS b
    INNER JOIN @ORG_PARENTS AS OP
        ON OP.ORG_ID = b.ORG_PARENT
    INNER JOIN @MANAGERS m1
        ON m1.ORG_ID = OP.ORG_ID
)
SELECT *
FROM BOSS
WHERE ORI = 3
Results in:
+----------+-----+--------+------------+-----+
| MANAGER | ORI | ORG_ID | ORG_PARENT | cnt |
+----------+-----+--------+------------+-----+
| NULL | 3 | 3 | 2 | 1 |
| Jane Doe | 3 | 2 | 1 | 2 |
| John Doe | 3 | 1 | NULL | 3 |
+----------+-----+--------+------------+-----+
General tips:
Don't predefine the columns of a CTE; it's not necessary, and makes maintenance annoying.
With a recursive CTE, always keep a counter so you can limit the recursion and keep track of how deep you are.
edit:
By the way, if you want the first non-null manager, one way (there are many) is:
SELECT BOSS.*
FROM BOSS
INNER JOIN (
SELECT BOSS.ORI
, MIN(BOSS.cnt) cnt
FROM BOSS
WHERE BOSS.MANAGER IS NOT NULL
GROUP BY BOSS.ORI
) X
ON X.ORI = BOSS.ORI
AND X.cnt = BOSS.cnt
WHERE BOSS.ORI IN (3)
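Another way, sketched here under some assumptions, is to start the CTE at a single org. unit and stop climbing as soon as a manager is found. The example runs through Python's sqlite3 so it is self-contained (SQLite uses WITH RECURSIVE and LIMIT where SQL Server would use WITH and TOP). The function name first_manager is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ORG_PARENTS (ORG_ID INT, ORG_PARENT INT);
    CREATE TABLE MANAGERS (ORG_ID INT, MANAGER VARCHAR(100));
    INSERT INTO ORG_PARENTS VALUES (1, NULL), (2, 1), (3, 2);
    INSERT INTO MANAGERS VALUES (1, 'John Doe'), (2, 'Jane Doe'), (3, NULL);
""")

def first_manager(org_id):
    """Return the nearest non-null manager at or above org_id, or None."""
    row = conn.execute("""
        WITH RECURSIVE BOSS AS (
            SELECT m.MANAGER, m.ORG_ID, p.ORG_PARENT, 1 AS cnt
            FROM MANAGERS m
            JOIN ORG_PARENTS p ON p.ORG_ID = m.ORG_ID
            WHERE m.ORG_ID = ?
            UNION ALL
            SELECT m.MANAGER, m.ORG_ID, p.ORG_PARENT, b.cnt + 1
            FROM BOSS b
            JOIN ORG_PARENTS p ON p.ORG_ID = b.ORG_PARENT
            JOIN MANAGERS m ON m.ORG_ID = p.ORG_ID
            WHERE b.MANAGER IS NULL   -- keep climbing only while no manager
        )
        SELECT MANAGER FROM BOSS
        WHERE MANAGER IS NOT NULL
        ORDER BY cnt
        LIMIT 1
    """, (org_id,)).fetchone()
    return row[0] if row else None

print(first_manager(3))
print(first_manager(1))
```

Filtering in the anchor means the recursion only follows one chain instead of building the full tree for every org. unit, which may matter on larger tables.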

Get list of unique records

I have the following table which lists the employees and their corresponding managers:
id | employeeid | managerid
1 | 34256 | 12789
2 | 21222 | 34256
3 | 12435 | 34256
.....
.....
What is the recommended way to list all distinct employee IDs in a single list?
Note that not every manager is listed in the employeeid column (a manager may not have a manager in turn).
If I understand this correctly:
This will combine all distinct employee IDs from the two columns, avoiding duplicates (UNION):
SELECT employeeid AS Employee
FROM tableA
UNION
SELECT managerid AS Employee
FROM tableA
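To see the effect on the three sample rows, here is a small sketch run through Python's sqlite3 so it is self-contained. The ORDER BY is added only to make the output order predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tableA (id INT, employeeid INT, managerid INT);
    INSERT INTO tableA VALUES
        (1, 34256, 12789), (2, 21222, 34256), (3, 12435, 34256);
""")

# UNION (unlike UNION ALL) removes duplicates across both columns
everyone = [row[0] for row in conn.execute("""
    SELECT employeeid AS Employee FROM tableA
    UNION
    SELECT managerid FROM tableA
    ORDER BY Employee
""")]

print(everyone)
```

12789 appears only in the managerid column and is still listed, while 34256 appears in both columns and is listed once.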
This should do it:
SELECT DISTINCT employeeid FROM yourtablename
But seriously, googling the keyword "distinct" would have turned this up very easily! Or did I miss something?
SELECT id, employeeid, managerid
FROM
(SELECT yourtablename.*,
ROW_NUMBER() OVER (PARTITION BY managerid ORDER BY employeeid DESC) AS RN
FROM yourtablename) AS t
WHERE RN = 1
ORDER BY ID

SQL QUERY Select employee who under a specific boss

A challenging question: I have a table like the one below.
EmployeeID BossID
pic http://img16.imageshack.us/img16/7659/20130430113245.jpg
Any idea how to write the query so that, given a certain employee, it returns all employees under him and who his boss is?
Try this query.
You have to use a CTE, as Sohail has mentioned.
WITH DirectReports (bossId, EmpID, Level)
AS
(
-- Anchor member definition
SELECT bossId, empId,
0 AS Level
FROM tbl
WHERE empId = 2
UNION ALL
-- Recursive member definition
SELECT e.bossId, e.empId,
Level + 1 AS Level
FROM tbl e
INNER JOIN DirectReports AS d
ON e.bossId = d.empId
)
-- Statement that executes the CTE
SELECT *
FROM DirectReports;
SQL FIDDLE
| BOSSID | EMPID | LEVEL |
--------------------------
| 1 | 2 | 0 |
| 2 | 4 | 1 |
| 4 | 5 | 2 |
| 4 | 6 | 2 |
You should write the query using a CTE (common table expression). For help, you can read Recursive Queries.
Below is a similar example; you can modify it to fit your needs.
This is the table structure.
CREATE TABLE [dbo].[Categories](
[Id] [bigint] IDENTITY(1,1) NOT NULL,
[CategoryName] varchar NULL,
[ParentId] [bigint] NULL)
This is the hierarchical query. I just created a view for it because I need to filter it more in my scenario.
CREATE VIEW [dbo].[CategoriesWithNameHierarchy] AS WITH Categories_Tree AS
(SELECT c.id AS 'Id' ,
0 AS 'Level',
c.CategoryName AS 'CategoryName',
cast(c.CategoryName AS varchar(30)) AS 'CNameHierarchy'
FROM Categories c
WHERE ParentId IS NULL
UNION ALL SELECT ChildCategories.Id AS 'Id',
(1 + ct.[Level]) AS 'Level',
ChildCategories.CategoryName AS 'CategoryName',
cast(ct.CNameHierarchy + '>' + ChildCategories.CategoryName AS varchar(30)) AS 'CNameHierarchy'
FROM Categories ChildCategories,
Categories_Tree ct
WHERE ChildCategories.ParentId = ct.Id)
SELECT *
FROM Categories_Tree
Hope this would help.

Data Matching with SQL and assigning Identity ID's

How do I write a query that will match data and produce an identity for it?
For Example:
RecordID | Name
1 | John
2 | John
3 | Smith
4 | Smith
5 | Smith
6 | Carl
I want a query which will assign an identity after matching exactly on Name.
Expected Output:
RecordID | Name | ID
1 | John | 1X
2 | John | 1X
3 | Smith | 1Y
4 | Smith | 1Y
5 | Smith | 1Y
6 | Carl | 1Z
Note: The ID should be unique for every match. Also, it can be numbers or varchar.
Can somebody help me with this? The main thing is to assign the ID's.
Thanks.
How about this:
with temp as
(
select 1 as id,'John' as name
union
select 2,'John'
union
select 3,'Smith'
union
select 4,'Smith'
union
select 5,'Smith'
union
select 6,'Carl'
)
SELECT *, DENSE_RANK() OVER
(ORDER BY Name) as NewId
FROM TEMP
Order by id
The first part is for testing purposes only.
Please try:
SELECT *,
Rank() over (order by Name ASC)
FROM table
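Both RANK() and DENSE_RANK() give every row with the same Name the same number, so either satisfies "unique for every match"; they differ in whether numbers are skipped after a group of ties. A small sketch on the question's data, run through Python's sqlite3 so it is self-contained (window functions need SQLite 3.25 or later; the table name Records is an assumption):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Records (RecordID INT, Name TEXT);
    INSERT INTO Records VALUES
        (1, 'John'), (2, 'John'), (3, 'Smith'),
        (4, 'Smith'), (5, 'Smith'), (6, 'Carl');
""")

# Compare the two ranking functions side by side
rows = conn.execute("""
    SELECT RecordID, Name,
           RANK()       OVER (ORDER BY Name) AS rank_id,
           DENSE_RANK() OVER (ORDER BY Name) AS dense_id
    FROM Records
    ORDER BY RecordID
""").fetchall()

for row in rows:
    print(row)
```

Here RANK() assigns 1, 2 and 4 to Carl, John and Smith, while DENSE_RANK() assigns 1, 2 and 3. If consecutive IDs matter, DENSE_RANK() is the safer choice.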
This structure seems to work:
CREATE TABLE #Table
(
Department VARCHAR(100),
Name VARCHAR(100)
);
INSERT INTO #Table VALUES
('Sales','michaeljackson'),
('Sales','michaeljackson'),
('Sales','jim'),
('Sales','jim'),
('Sales','jill'),
('Sales','jill'),
('Sales','jill'),
('Sales','j');
WITH Cte_Rank AS
(
SELECT [Name],
rw = ROW_NUMBER() OVER (ORDER BY [Name])
FROM #Table
GROUP BY [Name]
)
SELECT a.Department,
a.Name,
b.rw
FROM #Table a
INNER JOIN Cte_Rank b
ON a.Name = b.Name;