I am trying to understand the concepts of using CTE in my SQL code. I have gone through a number of online posts explaining the concept but I cannot grasp how it iterates to present the hierarchical data. One of the widely used examples to explain the R-CTE is the Employee and ManagerID Example as below:
USE AdventureWorks
GO
WITH Emp_CTE AS (
SELECT EmployeeID, ContactID, LoginID, ManagerID, Title, BirthDate
FROM HumanResources.Employee
WHERE ManagerID IS NULL
UNION ALL
SELECT e.EmployeeID, e.ContactID, e.LoginID, e.ManagerID, e.Title, e.BirthDate
FROM HumanResources.Employee e
INNER JOIN Emp_CTE ecte ON ecte.EmployeeID = e.ManagerID
)
SELECT *
FROM Emp_CTE
GO
The anchor query will grab the manager. After that I can't understand how it would bring the other employees if the recursive query is calling the anchor query again and again and the anchor query just has a single record which is the manager.
So you want to understand a recursive CTE.
It's simple really.
First there's the seed query which gets the original records.
SELECT EmployeeID, ContactID, LoginID, ManagerID, Title, BirthDate
FROM HumanResources.Employee
WHERE ManagerID IS NULL
In your case it's the employees without a manager.
Which would be the boss(es)
To demonstrate with a simplified example:
EmployeeID LoginID ManagerID Title
---------- ------- --------- ------------
101 boss NULL The Boss
The second query looks for employees that have the previous record as a manager.
SELECT e.EmployeeID, e.ContactID, e.LoginID, e.ManagerID, e.Title, e.BirthDate
FROM HumanResources.Employee e
INNER JOIN Emp_CTE ecte ON ecte.EmployeeID = e.ManagerID
Since it's a recursive CTE, the CTE uses itself in the second query.
You could see it as a loop, where it uses the previous records to get the next.
For the first iteration of that recursive loop you could get something like this:
EmployeeID LoginID ManagerID Title
---------- ------- --------- ------------
102 head1 101 Top Manager 1
103 head2 101 Top Manager 2
For the second iteration it would use the records from that first iteration to find the next.
EmployeeID LoginID ManagerID Title
---------- ------- --------- ------------
104 bob 102 Department Manager 1
105 hilda 102 Department Manager 2
108 john 103 Department Manager 4
109 jane 103 Department Manager 5
For the 3th iteration it would use the records from the 2nd iteration.
...
And this continues till there are no more employees to join on the ManagerID
Then after all the looping, the CTE will return all the records that were found through all those iterations.
Well, a short introduction to recursive CTEs:
A recursive CTE is rather something iterativ, than really recursive. The anchor query is taken to get some initial result set. With this set we can dive deeper. Try these simple cases:
Just a counter, not even a JOIN needed...
The 1 of the anchor will lead to a 2 in the UNION ALL. This 2 is passed into the UNION ALL again and will be returned as a 3 and so on...
WITH recCTE AS
(
SELECT 1 AS Mycounter
UNION ALL
SELECT recCTE.MyCounter+1
FROM recCTE
WHERE recCTE.MyCounter<10
)
SELECT * FROM recCTE;
A counter of 2 columns
This is exactly the same as above. But we have two columns and deal with them separately.
WITH recCTE AS
(
SELECT 1 AS Mycounter1, 10 AS MyCounter2
UNION ALL
SELECT recCTE.MyCounter1+1,recCTE.MyCounter2+1
FROM recCTE
WHERE recCTE.MyCounter1<10
)
SELECT * FROM recCTE;
Now we have two rows in the initial query
Running alone, the initial query will return two rows. Both with the counter==1 and two different values for the Nmbr-column
WITH recCTE AS
(
SELECT MyCounter=1, Nmbr FROM(VALUES(1),(10)) A(Nmbr)
UNION ALL
SELECT recCTE.MyCounter+1, recCTE.Nmbr+1
FROM recCTE
WHERE recCTE.MyCounter<10
)
SELECT * FROM recCTE ORDER BY MyCounter,Nmbr;
Now we get 20 rows back, not 10 as in the examples before. This is, because both rows of the anchor are used independently.
We can use the recursive CTE in a JOIN
In this example we will create a derived set first, then we will join this to the recursive CTE. Guess why the first row carries "X" instead of "A"?
WITH SomeSet AS (SELECT * FROM (VALUES(1,'A'),(2,'B'),(3,'C'),(4,'D'),(5,'E'),(6,'F'),(7,'G'),(8,'H'),(9,'I'),(10,'J')) A(id,Letter))
,recCTE AS
(
SELECT MyCounter=1, Nmbr,'X' AS Letter FROM(VALUES(1),(10)) A(Nmbr)
UNION ALL
SELECT recCTE.MyCounter+1, recCTE.Nmbr+1, SomeSet.Letter
FROM SomeSet
INNER JOIN recCTE ON SomeSet.id=recCTE.MyCounter+1
WHERE recCTE.MyCounter<10
)
SELECT * FROM recCTE ORDER BY MyCounter,Nmbr;
This will use a self-referring join to simulate your hierarchy, but with one gap-less chain
WITH SomeSet AS (SELECT * FROM (VALUES(1,'A',NULL),(2,'B',1),(3,'C',2),(4,'D',3),(5,'E',4),(6,'F',5),(7,'G',6),(8,'H',7),(9,'I',8),(10,'J',9)) A(id,Letter,Previous))
,recCTE AS
(
SELECT id,Letter,Previous,' ' PreviousLetter FROM SomeSet WHERE Previous IS NULL
UNION ALL
SELECT SomeSet.id,SomeSet.Letter,SomeSet.Previous,recCTE.Letter
FROM SomeSet
INNER JOIN recCTE ON SomeSet.Previous=recCTE.id
)
SELECT * FROM recCTE:
And now almost the same as before, but with several elements with the same "previous".
This is - in principles - your hierarchy
WITH SomeSet AS (SELECT * FROM (VALUES(1,'A',NULL),(2,'B',1),(3,'C',2),(4,'D',2),(5,'E',2),(6,'F',3),(7,'G',3),(8,'H',4),(9,'I',1),(10,'J',9)) A(id,Letter,Previous))
,recCTE AS
(
SELECT id,Letter,Previous,' ' PreviousLetter FROM SomeSet WHERE Previous IS NULL
UNION ALL
SELECT SomeSet.id,SomeSet.Letter,SomeSet.Previous,recCTE.Letter
FROM SomeSet
INNER JOIN recCTE ON SomeSet.Previous=recCTE.id
)
SELECT * FROM recCTE
Conclusio
The key points
The anchor query must return at least one row, but may return many
The second part must match the column list (as any UNION ALL query)
The second part must refer to the cte in its FROM-clause
either directly, or
through a JOIN
The second part will be called over and over using the result of the call before
Each row is handled separately (a hidden RBAR)
You can start with a Manager (top-most-node) and walk down by querying for employees with this manager id, or
You can start with the lowest in hierarchy (the ones, where no other row exists, using their id as manager id) and move up the list
As it is a hidden RBAR you can use this for row-by-row actions like string cummulation.
An example for the last statement
See how the column LetterPath is built.
WITH SomeSet AS (SELECT * FROM (VALUES(1,'A',NULL),(2,'B',1),(3,'C',2),(4,'D',2),(5,'E',2),(6,'F',3),(7,'G',3),(8,'H',4),(9,'I',1),(10,'J',9)) A(id,Letter,Previous))
,recCTE AS
(
SELECT id,Letter,Previous,' ' PreviousLetter,CAST(Letter AS VARCHAR(MAX)) AS LetterPath FROM SomeSet WHERE Previous IS NULL
UNION ALL
SELECT SomeSet.id,SomeSet.Letter,SomeSet.Previous,recCTE.Letter,recCTE.LetterPath + SomeSet.Letter
FROM SomeSet
INNER JOIN recCTE ON SomeSet.Previous=recCTE.id
)
SELECT * FROM recCTE
It's all about recursive step: firstly, root is used to proceed first step of recursion, so:
SELECT EmployeeID, ContactID, LoginID, ManagerID, Title, BirthDate
FROM HumanResources.Employee
WHERE ManagerID IS NULL
This provides first set of records.
Second set of records will be queried based on first set (anchor), so it will query all employees, which have manager in first set.
Second step of recursion will be based on second result set, not the anchor.
Third step will be based on third result set, etc.
For a simple SQL like,
SELECT top 3 MyId FROM MyTable ORDER BY NEWID()
how to add row numbers to them so that the row numbers become 1,2, and 3?
UPDATE:
I thought I can simplify my question as above, but it turns out to be more complicated. So here is a fuller version -- I need to give three random picks (from MyTable) for each person, with pick/row number of 1, 2, and 3, and there is no logical joining between person and picks.
SELECT * FROM Person
LEFT JOIN (
SELECT top 3 MyId FROM MyTable ORDER BY NEWID()
) D ON 1=1
The problem with above SQL are,
Obviously, pick/row number of 1, 2, and 3 should be added
and what is not obvious is that, the above SQL will give each person the same picks, whereas I need to give different person different picks
Here is a working SQL to test it out:
SELECT TOP 15 database_id, create_date, cs.name FROM sys.databases
CROSS apply (
SELECT top 3 Row_number()OVER(ORDER BY (SELECT NULL)) AS RowNo,*
FROM (SELECT top 3 name from sys.all_views ORDER BY NEWID()) T
) cs
So, Please help.
NOTE: This is NOT about MySQL byt T-SQL as their syntax are different, Thus the solution is different as well.
Add Row_number to outer query. Try this
SELECT Row_number()OVER(ORDER BY (SELECT NULL)),*
FROM (SELECT TOP 3 MyId
FROM MyTable
ORDER BY Newid()) a
Logically TOP keyword is processed after Select. After Row Number is generated random 3 records will be pulled. So you should not generate Row Number in original query
Update
It can be achieved through CROSS APPLY. Replace the column names inside cross apply where clause with valid column name from Person table
SELECT *
FROM Person p
CROSS apply (SELECT Row_number()OVER(ORDER BY (SELECT NULL)) rn,*
FROM (SELECT TOP 3 MyId
FROM MyTable
WHERE p.some_col = p.some_col -- Replace it with some column from person table
ORDER BY Newid())a) cs