Recursive Query on a self referential table (not hierarchical) - sql

I am creating a state chart of sorts, with the data stored in a simple self-referencing table (JobPath):
JobId - ParentJobId
I was using a standard SQL CTE to get the data out, which worked perfectly until I ended up with the following data:
JobId - ParentJobId
1 2
2 3
3 4
4 2
As you can see, Job 4 links to Job 2, which goes to Job 3 and then back to Job 4, and so on.
Is there any way I can tell my query not to pull out data it already has?
Here is my current query:
WITH JobPathTemp (JobId, ParentId, Level)
AS
(
-- Anchor member definition
SELECT j.JobId, jp.ParentJobId, 1 AS Level
FROM Job AS j
LEFT OUTER JOIN dbo.JobPath AS jp
ON j.JobId = jp.JobId
where j.JobId=1516
UNION ALL
-- Recursive member definition
SELECT j.JobId, jp.ParentJobId, Level + 1
FROM dbo.Job as j
INNER JOIN dbo.JobPath AS jp
ON j.JobId = jp.JobId
INNER JOIN JobPathTemp AS jpt
ON jpt.ParentId = jp.JobId
WHERE jp.ParentJobId <> jpt.JobId
)
-- Statement that executes the CTE
SELECT * FROM JobPathTemp

If you are not dealing with a large number of entries, the following solution might be suitable. The idea is to build the complete "id path" for each row and make sure the "current id" (in the recursive part) is not already in the path being processed:
(I removed the join to jobpath for testing purposes but the basic pattern should be the same)
WITH JobPathTemp (JobId, ParentId, Level, id_path)
AS
(
SELECT jobid,
parentid,
1 as level,
-- delimit both sides of each id so that id 1 is not mistaken for id 12
'|' + cast(jobid as varchar(max)) + '|' as id_path
FROM job
WHERE jobid = 1
UNION ALL
SELECT j.JobId,
j.parentid,
Level + 1,
jpt.id_path + cast(j.jobid as varchar(max)) + '|'
FROM Job as j
INNER JOIN JobPathTemp AS jpt ON j.jobid = jpt.parentid
AND charindex('|' + cast(j.jobid as varchar(max)) + '|', jpt.id_path) = 0
)
SELECT *
FROM JobPathTemp
;
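To make the id-path idea easy to try, here is a minimal sketch that runs the same pattern against the question's four sample rows. It uses Python's built-in sqlite3 module as a stand-in for SQL Server, which is an assumption made purely for runnability: SQLite spells concatenation as || and uses instr() where T-SQL uses CHARINDEX, and the in-memory JobPath table is a hypothetical copy of the sample data.

```python
import sqlite3

# Hypothetical in-memory copy of the question's sample data (JobId -> ParentJobId).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE JobPath (JobId INTEGER, ParentJobId INTEGER)")
conn.executemany("INSERT INTO JobPath VALUES (?, ?)", [(1, 2), (2, 3), (3, 4), (4, 2)])

# SQLite dialect (assumption): || concatenates, instr() stands in for CHARINDEX.
sql = """
WITH RECURSIVE JobPathTemp (JobId, ParentId, Level, id_path) AS (
    -- Anchor: start at job 1 and open the path with delimiters on both sides
    SELECT JobId, ParentJobId, 1, '|' || JobId || '|'
    FROM JobPath
    WHERE JobId = 1
    UNION ALL
    -- Recursive member: follow the parent link unless that id is already on the path
    SELECT jp.JobId, jp.ParentJobId, jpt.Level + 1, jpt.id_path || jp.JobId || '|'
    FROM JobPath AS jp
    JOIN JobPathTemp AS jpt ON jp.JobId = jpt.ParentId
    WHERE instr(jpt.id_path, '|' || jp.JobId || '|') = 0
)
SELECT JobId, ParentId, Level, id_path FROM JobPathTemp ORDER BY Level
"""
rows = conn.execute(sql).fetchall()
for row in rows:
    print(row)
```

Each job is visited once and the walk stops when it would re-enter job 2. The path string grows with depth, which is why this approach suits small data sets.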

Note: this solution does not work on SQL Server, which requires UNION ALL between the anchor and the recursive member of a recursive CTE. Since you can only refer to the recursive CTE in the join, I don't see any alternative there to using a stored function.
You didn't post your query, but I tried this in Postgres, which works in much the same way: if you use UNION (not UNION ALL) in the recursive term, duplicate rows are removed automatically:
with /*recursive*/ jobs as
(select jobpath.jobid, jobpath.parentjobid from jobpath where jobid = 1
union
select jobpath.jobid, jobpath.parentjobid
from jobpath
join jobs on jobs.parentjobid = jobpath.jobid
)
select jobpath.* from jobpath join jobs on jobpath.jobid = jobs.jobid;
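For engines that do accept UNION in the recursive term, the de-duplication can be seen on the question's sample data. The sketch below uses Python's sqlite3 module (SQLite allows UNION here; SQL Server, as noted, does not), and the in-memory jobpath table is a hypothetical copy of the four sample rows.

```python
import sqlite3

# Hypothetical in-memory copy of the cyclic sample data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE jobpath (jobid INTEGER, parentjobid INTEGER)")
conn.executemany("INSERT INTO jobpath VALUES (?, ?)", [(1, 2), (2, 3), (3, 4), (4, 2)])

sql = """
WITH RECURSIVE jobs (jobid, parentjobid) AS (
    SELECT jobid, parentjobid FROM jobpath WHERE jobid = 1
    UNION  -- not UNION ALL: a row that was already produced is discarded
    SELECT jp.jobid, jp.parentjobid
    FROM jobpath AS jp
    JOIN jobs ON jobs.parentjobid = jp.jobid
)
SELECT jobid, parentjobid FROM jobs ORDER BY jobid
"""
union_rows = conn.execute(sql).fetchall()
print(union_rows)
```

The recursion ends because the second visit to job 2 produces a row identical to one already emitted. This only holds while every column repeats: adding a Level counter would make each pass distinct and bring the infinite loop back.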

Related

Recursive Query CTE Father - Son - Grandson error

I have a table that stores an ID and an IDFATHER for each project. A project can have any number of children, so the structure looks like this:
ID IDFATHER REV
1 1 0
2 1 1
5 2 2
Starting at ID 5, I need to walk up to ID 1, so I wrote a CTE query:
WITH lb (ID, IDFATHER) AS (
SELECT ID, IDFATHER
FROM PROJECTS
WHERE ID = 5
UNION ALL
SELECT I.ID, I.IDFATHER
FROM PROJECTS I
JOIN lb LBI ON I.ID = LBI.IDFATHER
--WHERE I.ID = LBI.IDFATHER -- Recursive Subquery
)
SELECT *
FROM lb
WHERE LB.ID = LB.IDFATHER
When this code runs it gives me:
The statement terminated. The maximum recursion 100 has been exhausted
before statement completion.
So I work around it by just adding:
SELECT TOP 1 * FROM LB WHERE LB.ID = LB.IDFATHER
But I really want to know where my error is. Can anyone give me a hand with this?
The first row points to itself so the recursion never stops. You need to add this condition inside the recursive cte:
WHERE LBI.ID <> LBI.IDFATHER
I would rather set IDFather of the first row to NULL.
The recursion didn't stop because your top row refers to itself endlessly.
If the top row has a null parent, that would have stopped the recursion.
Another approach is to use the case id = idfather itself as the termination condition:
WITH LB (ID, IDFATHER, idstart) AS (
SELECT ID, IDFATHER, id
FROM PROJECTS WHERE ID = 5
UNION ALL
SELECT I.ID, I.IDFATHER, lbi.idstart
FROM PROJECTS I
JOIN LB LBI
ON I.ID = lbi.IDFATHER
AND lbi.id <> lbi.idfather
)
SELECT id AS idtop, idstart
FROM LB
WHERE LB.ID = LB.IDFATHER
;
With the three sample rows above, the result is a single row: idtop = 1, idstart = 5.
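As a check on the termination logic, here is a small sketch that runs the same CTE against the three sample rows from the question. It uses Python's sqlite3 module as a stand-in for SQL Server (an assumption for runnability; SQLite needs the RECURSIVE keyword and the table is a hypothetical in-memory copy).

```python
import sqlite3

# Hypothetical in-memory copy of the PROJECTS sample rows (ID, IDFATHER, REV).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE PROJECTS (ID INTEGER, IDFATHER INTEGER, REV INTEGER)")
conn.executemany("INSERT INTO PROJECTS VALUES (?, ?, ?)", [(1, 1, 0), (2, 1, 1), (5, 2, 2)])

sql = """
WITH RECURSIVE LB (ID, IDFATHER, idstart) AS (
    SELECT ID, IDFATHER, ID FROM PROJECTS WHERE ID = 5
    UNION ALL
    SELECT I.ID, I.IDFATHER, LBI.idstart
    FROM PROJECTS AS I
    JOIN LB AS LBI
      ON I.ID = LBI.IDFATHER
     AND LBI.ID <> LBI.IDFATHER  -- stop once the self-referencing top row is reached
)
SELECT ID AS idtop, idstart FROM LB WHERE ID = IDFATHER
"""
top_rows = conn.execute(sql).fetchall()
print(top_rows)  # [(1, 5)]
```

The walk goes 5, 2, 1 and then stops, because the row for ID 1 fails the LBI.ID <> LBI.IDFATHER test and is never expanded again.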

How to detect cyclical reference in SQL Server Query - SQL Server 2017

I have a recursive WITH query in SQL Server 2017:
;WITH rec AS (
SELECT
col1 AS root_order
,col1
,col2
,col3
,col4
,col5
,col6
,col7
,col8
,col9
FROM
TableA
UNION ALL
SELECT
rec.root_order,
TableA.col2,
TableA.col3,
TableA.col4,
TableA.col5,
TableA.col6,
TableA.col7,
TableA.col8,
TableA.col9,
rec.the_level
FROM
rec
INNER JOIN TableA on rec.Details = TableA.Orders
)
SELECT DISTINCT * FROM rec
This yields a: The statement terminated. The maximum recursion 100 has been exhausted before statement completion. error.
I have tried:
OPTION (maxrecursion 0) to let it continue. But when I do that, the query loops forever, so that doesn't work.
In Oracle, I can use CONNECT_BY_ROOT, CONNECT BY PRIOR and NOCYCLE, but I know things like that aren't available in SQL Server. So I found this MSDN link, which suggests something of the form:
with hierarchy
as
(
select
child,
parent,
0 as cycle,
CAST('.' as varchar(max)) + LTRIM(child) + '.' as [path]
from
#hier
where
parent is null
union all
select
c.child,
c.parent,
case when p.[path] like '%.' + LTRIM(c.child) + '.%' then 1 else 0 end as cycle,
p.[path] + LTRIM(c.child) + '.' as [path]
from
hierarchy as p
inner join
#hier as c
on p.child = c.parent
and p.cycle = 0
)
select
child,
parent,
[path]
from
hierarchy
where
cycle = 1;
go
That pattern is for finding the cycles (or avoiding them). I cannot seem to adapt my current query in that fashion. How can I edit my current SQL to perform cyclic reference detection like in the MSDN article?
Some sample data as requested here in SQL FIDDLE.
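One possible way to graft the path-and-flag pattern onto a query shaped like the one above is sketched here. Everything in it is an assumption for illustration: the two columns Orders and Details are inferred from the join condition rec.Details = TableA.Orders, the sample rows are made up, the column name is_cycle is hypothetical, and Python's sqlite3 module stands in for SQL Server (|| for +, instr() for LIKE).

```python
import sqlite3

# Hypothetical data: A -> B -> C -> A forms a cycle, X -> Y does not.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE TableA (Orders TEXT, Details TEXT)")
conn.executemany(
    "INSERT INTO TableA VALUES (?, ?)",
    [("A", "B"), ("B", "C"), ("C", "A"), ("X", "Y")],
)

sql = """
WITH RECURSIVE rec (root_order, Orders, Details, the_level, path, is_cycle) AS (
    -- Anchor: every row is a possible root; the path starts with its own key
    SELECT Orders, Orders, Details, 1, '.' || Orders || '.', 0
    FROM TableA
    UNION ALL
    -- Recursive member: extend the path and flag a key that is already on it
    SELECT rec.root_order, t.Orders, t.Details, rec.the_level + 1,
           rec.path || t.Orders || '.',
           CASE WHEN instr(rec.path, '.' || t.Orders || '.') > 0 THEN 1 ELSE 0 END
    FROM rec
    JOIN TableA AS t ON rec.Details = t.Orders
    WHERE rec.is_cycle = 0  -- never expand a row that closed a cycle
)
SELECT root_order, path FROM rec WHERE is_cycle = 1 ORDER BY root_order
"""
cycles = conn.execute(sql).fetchall()
for root, path in cycles:
    print(root, path)
```

Filtering on is_cycle = 0 in the final SELECT instead would return the hierarchy with the looping rows left out.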
What I normally do is pretty simple. In the anchor query (the first part of the CTE), I include a value "1 AS Level" in the select list. Then in the recursive query, I select Level + 1 as the Level, so I know what depth I'm up to. Then I can put a sanity clause into the recursive query to limit the depth, i.e. WHERE Level <= 10 or whatever depth you want. But yes, you still need to raise MAXRECURSION (any value up to 32767, or 0 for no limit) if you want to go above 100 levels.
Here's an example based on AdventureWorks:
WITH Materials (BillOfMaterialsID, ProductName, ProductAssemblyID, ComponentID, [Level])
AS
(
SELECT bom.BillOfMaterialsID,
p.[Name],
bom.ProductAssemblyID,
bom.ComponentID,
1
FROM Production.BillOfMaterials AS bom
INNER JOIN Production.Product AS p
ON bom.ComponentID = p.ProductID
AND bom.EndDate IS NULL
WHERE bom.ProductAssemblyID IS NULL
UNION ALL
SELECT bom.BillOfMaterialsID,
p.[Name],
bom.ProductAssemblyID,
bom.ComponentID,
m.[Level] + 1
FROM Production.BillOfMaterials AS bom
INNER JOIN Production.Product AS p
ON bom.ComponentID = p.ProductID
INNER JOIN Materials AS m
ON bom.ProductAssemblyID = m.ComponentID
WHERE m.[Level] <= 5
)
SELECT m.BillOfMaterialsID,
m.ProductName,
m.ProductAssemblyID,
m.ComponentID,
m.[Level]
FROM Materials AS m
ORDER BY m.[Level], m.BillOfMaterialsID;
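The effect of a depth cap on cyclic data can be seen in a small sketch. It reuses the four-row JobPath sample from the first question as hypothetical data and runs on Python's sqlite3 module rather than SQL Server (an assumption for runnability). MAX_DEPTH is a made-up name, and the filter uses < so that the deepest level returned equals the cap.

```python
import sqlite3

MAX_DEPTH = 5  # hypothetical cap on recursion depth

# Hypothetical in-memory copy of the cyclic sample data (1 -> 2 -> 3 -> 4 -> 2 ...).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE JobPath (JobId INTEGER, ParentJobId INTEGER)")
conn.executemany("INSERT INTO JobPath VALUES (?, ?)", [(1, 2), (2, 3), (3, 4), (4, 2)])

sql = """
WITH RECURSIVE walk (JobId, ParentId, Level) AS (
    SELECT JobId, ParentJobId, 1 FROM JobPath WHERE JobId = 1
    UNION ALL
    SELECT jp.JobId, jp.ParentJobId, w.Level + 1
    FROM JobPath AS jp
    JOIN walk AS w ON jp.JobId = w.ParentId
    WHERE w.Level < ?  -- the depth check that ends the recursion
)
SELECT JobId, Level FROM walk ORDER BY Level
"""
capped = conn.execute(sql, (MAX_DEPTH,)).fetchall()
print(capped)  # [(1, 1), (2, 2), (3, 3), (4, 4), (2, 5)]
```

The cap guarantees termination, but it does not detect the cycle: job 2 shows up a second time at level 5. If repeated rows matter, a path check is still needed.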

Query runs fine in SQL Server but Errors in trying to read into a Pandas DF

I have a query that runs fine in SQL Server but raises an error when I try to read it into a DataFrame using pyodbc.
I've copied and pasted the exact query from SQL Server into a variable named query in my Python script (this works for all my other queries).
When I run the query in SQL Server, this part:
;WITH DeDupe AS (
-- Trace 105704 was received, returned to vendor, and re-received under same PO. DeDupe handles this
SELECT DISTINCT A.PurchaseOrderID, A.POLineNumber, D.TraceID
FROM #UniquePOs A
LEFT JOIN Trace.ReceivingReport B WITH (NOLOCK)
ON A.PurchaseOrderID = B.PurchaseOrderID AND A.POLineNumber = B.POLineNumber
LEFT JOIN Trace.EventInstance C WITH (NOLOCK)
ON B.EventInstanceID = C.InstanceID AND C.EventID = 'RCVD' -- Keep only receive events, not project transfers to avoid double-counting
LEFT JOIN Trace.Trace D
ON C.TraceID = D.TraceID
)
SELECT * INTO #DeDupe FROM DeDupe
SELECT * FROM #DeDupe; -- I take this part out when I run the query to get the second table.
;WITH TraceQtys AS (
-- Use this to solve Case 3 below
SELECT A.PurchaseOrderID, A.POLineNumber, SUM(B.Quantity) AS 'SumOfTraceQty'
FROM #DeDupe A
LEFT JOIN Trace.Trace B
ON A.TraceID = B.TraceID
GROUP BY A.PurchaseOrderID, A.POLineNumber
)
SELECT * INTO #TraceQtys FROM TraceQtys
SELECT * FROM #TraceQtys;
Returns these as results:
PurchaseOrderID POLineNumber TraceID
-------------------------------------------------- ------------ -----------
007004 1 NULL
007004 1 41963
(2 rows affected)
Warning: Null value is eliminated by an aggregate or other SET operation.
(1 row affected)
PurchaseOrderID POLineNumber SumOfTraceQty
-------------------------------------------------- ------------ ---------------------------------------
007004 1 8.00000
Ran in my Python script:
query = 'SET NOCOUNT ON; ' + query
query = query.replace('GO', '')
conn = pyodbc.connect('Driver={ODBC Driver 17 for SQL Server};'
'Server=SL-SQL;'
'Database=TRACE DB;'
'Trusted_Connection=yes;')
df = pd.read_sql(query, conn)
df.head()
The above returns the same result for the first table but an error for the second:
TypeError: 'NoneType' object is not iterable
This similar query works in Python:
query = '''
SET NOCOUNT ON;
WITH Temp as (
SELECT TraceID, ClassID, SUM(Quantity) as Q
FROM Trace.Trace
GROUP BY TraceID, ClassID
)
SELECT * INTO #Temp FROM Temp
SELECT * FROM #Temp'''
It also runs if I take out this part in the Python query:
SUM(B.Quantity) AS 'SumOfTraceQty'
I have a similar piece of code that runs successfully (further up):
WITH POQtys AS (
SELECT A.PurchaseOrderID, SUM(B.QtyVouched) AS POTotalQtyVouched
FROM #orders A
LEFT JOIN [MSS-ISD-DYN01].App.dbo.PurOrdDet B WITH (NOLOCK)
ON A.PurchaseOrderID = B.PONbr AND B.InvtID IN ('DIR-MATERIALS', 'ASSET-INVENTORY')
GROUP BY A.PurchaseOrderID
)
SELECT * INTO #POQtys FROM POQtys
Python cursor calls can usually handle only one SQL statement, and one result set, at a time. That is why your other queries worked but the combined one, which contains multiple SQL statements, did not. However, you can chain multiple CTEs in one statement, even ones that depend on each other, and avoid the SELECT...INTO steps.
The version below also uses more informative table aliases, to kick the A-B-C habit, and drops the NOLOCK-everywhere habit. It can be read as one statement by pandas' read_sql.
WITH DeDupe AS (
SELECT DISTINCT u.PurchaseOrderID, u.POLineNumber, t.TraceID
FROM #UniquePOs u
LEFT JOIN Trace.ReceivingReport r
ON u.PurchaseOrderID = r.PurchaseOrderID
AND u.POLineNumber = r.POLineNumber
LEFT JOIN Trace.EventInstance e
ON r.EventInstanceID = e.InstanceID
AND e.EventID = 'RCVD'
LEFT JOIN Trace.Trace t
ON e.TraceID = t.TraceID
),
TraceQtys AS (
SELECT d.PurchaseOrderID, d.POLineNumber, SUM(t.Quantity) AS 'SumOfTraceQty'
FROM DeDupe d
LEFT JOIN Trace.Trace t
ON d.TraceID = t.TraceID
GROUP BY d.PurchaseOrderID, d.POLineNumber
)
SELECT * FROM TraceQtys
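To illustrate why the chained form is easier for a Python driver to consume, here is a minimal sketch using the standard-library sqlite3 module instead of pyodbc and SQL Server (an assumption for runnability). The two tables and their rows are hypothetical, loosely mirroring the output shown in the question. The point is that a single statement yields a single result set whose column metadata is available on the cursor.

```python
import sqlite3

# Hypothetical stand-ins for #UniquePOs joined to trace ids, and for Trace.Trace.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE UniquePOs (PurchaseOrderID TEXT, POLineNumber INTEGER, TraceID INTEGER);
CREATE TABLE Trace (TraceID INTEGER, Quantity REAL);
INSERT INTO UniquePOs VALUES ('007004', 1, NULL), ('007004', 1, 41963);
INSERT INTO Trace VALUES (41963, 8.0);
""")

# One statement: the second CTE reads from the first, so no temp tables are needed.
query = """
WITH DeDupe AS (
    SELECT DISTINCT PurchaseOrderID, POLineNumber, TraceID FROM UniquePOs
),
TraceQtys AS (
    SELECT d.PurchaseOrderID, d.POLineNumber, SUM(t.Quantity) AS SumOfTraceQty
    FROM DeDupe AS d
    LEFT JOIN Trace AS t ON d.TraceID = t.TraceID
    GROUP BY d.PurchaseOrderID, d.POLineNumber
)
SELECT * FROM TraceQtys
"""
cur = conn.execute(query)
columns = [col[0] for col in cur.description]  # metadata exists for a row-returning statement
result = cur.fetchall()
print(columns, result)
```

The same query string could be handed to pandas.read_sql with this connection if pandas is installed; that call is left out here to keep the sketch dependency-free.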

Can you help me to write recursive query for two tables?

I need to write a recursive query that finds all child nodes in these tables.
For example, I need to find all child ExecutorTasks for ParentManagerTaskId = 6.
ManagerTasks
{
Id,
ParentExecutorTaskId
}
ExecutorTasks
{
Id,
ParentManagerTaskId
}
;WITH query AS
(
SELECT et.Id,et.ParentManagerTaskId,mt.ParentExecutorTaskId
FROM [Planning.ExecutorTasks] et
left outer join [Planning.ManagerTasks] mt on et.ParentManagerTaskId=mt.Id
WHERE mt.Id = 6
UNION ALL
SELECT q.Id, q.ParentManagerTaskId,et.Id
FROM [Planning.ExecutorTasks] et
JOIN query q ON et.Id = q.Id
)
SELECT *
FROM query
Your model is a little misleading, since you have 2 tables that reference each other and you only want to display records from one of them. This means you have to do 2 joins in the recursive part to get the children through the related entity.
Try with the following:
;WITH Recursion AS
(
-- Anchor
SELECT
ExecutorTaskId = E.Id,
RecursionLevel = 0
FROM
[Planning.ExecutorTasks] AS E
WHERE
E.ParentManagerTaskId = 6
UNION ALL
-- Further children
SELECT
ExecutorTaskId = E.Id,
RecursionLevel = R.RecursionLevel + 1
FROM
Recursion AS R
INNER JOIN [Planning.ManagerTasks] AS M ON R.ExecutorTaskId = M.ParentExecutorTaskId
INNER JOIN [Planning.ExecutorTasks] AS E ON M.Id = E.ParentManagerTaskId
)
SELECT
R.RecursionLevel,
R.ExecutorTaskId
FROM
Recursion AS R
ORDER BY
R.RecursionLevel,
R.ExecutorTaskId
I can't test this unless you supply example values, the expected outcome and the tables' DDL.
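Since no sample data was supplied, the following sketch makes some up in order to exercise the two-join recursion. The rows are entirely hypothetical, the table names drop the Planning prefix, and Python's sqlite3 module stands in for SQL Server.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE ManagerTasks (Id INTEGER, ParentExecutorTaskId INTEGER);
CREATE TABLE ExecutorTasks (Id INTEGER, ParentManagerTaskId INTEGER);
-- Hypothetical tree: manager 6 -> executors 10, 11; executor 10 -> manager 7 -> executor 12;
-- executor 12 -> manager 8 -> executor 13. Manager 9 / executor 14 are unrelated.
INSERT INTO ManagerTasks VALUES (6, NULL), (7, 10), (8, 12), (9, 99);
INSERT INTO ExecutorTasks VALUES (10, 6), (11, 6), (12, 7), (13, 8), (14, 9);
""")

sql = """
WITH RECURSIVE Recursion (ExecutorTaskId, RecursionLevel) AS (
    -- Anchor: executor tasks directly under manager task 6
    SELECT E.Id, 0
    FROM ExecutorTasks AS E
    WHERE E.ParentManagerTaskId = 6
    UNION ALL
    -- Two joins: executor -> its child manager tasks -> their executor tasks
    SELECT E.Id, R.RecursionLevel + 1
    FROM Recursion AS R
    JOIN ManagerTasks AS M ON R.ExecutorTaskId = M.ParentExecutorTaskId
    JOIN ExecutorTasks AS E ON M.Id = E.ParentManagerTaskId
)
SELECT RecursionLevel, ExecutorTaskId
FROM Recursion
ORDER BY RecursionLevel, ExecutorTaskId
"""
descendants = conn.execute(sql).fetchall()
print(descendants)  # [(0, 10), (0, 11), (1, 12), (2, 13)]
```

Executor task 14 is not returned because its manager task hangs off an executor that is not reachable from manager task 6.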

pass an outer selects row variable to inner select in oracle

How do you pass an outer select's row value to an inner select in Oracle? Here is a sample query (other outer joins have been removed; this query will be run once in the lifetime of the application). This query works:
select l5.HIERARCHY_ID,
(select wm_concat(isbn) isbns from (
select op.isbn from oproduct op
LEFT JOIN assignment ha on op.r.reference = ha.reference
where ha.hierarchy_id = '100589'))isbns
from level l5 where l5.gid = '1007500000078694'
but when I change the inner select's where clause
where ha.hierarchy_id = '100589'))isbns
to
where ha.hierarchy_id = l5.HIERARCHY_ID))isbns
I get the following error
ORA-00904: "L5"."HIERARCHY_ID": invalid identifier
You cannot pass a value down through more than one level of nested SELECT.
For example -
SELECT value1 -- 1st level select
FROM (
SELECT value2 -- 2nd level select
FROM (
SELECT value3 -- 3rd level select.
Values from the 1st level SELECT are available only to the 2nd level SELECT.
Similarly, values from the 2nd level SELECT are available only to the 1st level SELECT and the 3rd level SELECT, not beyond that.
I did something like this to fix the problem. There was one unnecessary select, which I removed:
select
l5.HIERARCHY_ID,
(
select
wm_concat(op.isbn)
from
oproduct op
LEFT JOIN assignment ha on op.r.reference = ha.reference
where ha.hierarchy_id = l5.HIERARCHY_ID
) ISBNS
from
level l5
where
l5.gid = '1007500000078694'
I think I am reading your SQL correctly - you want an outer join when the hierarchy ids match?
SELECT
l5.hierarchy_id,
op.isbns
FROM
LEVEL l5
LEFT OUTER JOIN
(SELECT
wm_concat (op.isbn) isbns,
ha.hierarchy_id
FROM
oproduct op
LEFT JOIN
assignment ha
ON op.reference = ha.reference
GROUP BY
ha.hierarchy_id) op
ON l5.hierarchy_id = op.hierarchy_id