Using SQL recursive query as AND statement - sql

I have the following flowchart. Hopefully, it's self-explanatory
At the top of the hierarchy is a request that is the root parent of all the requests below it. The requests below it have 'id', 'parent_id', and 'state' fields.
My final goal is to find all parent ids that satisfy all the AND conditions, including the last one (the hierarchical query). However, I don't know how to use the hierarchical query as an AND condition.
The hierarchical query looks like this:
with cte
as (select id, state
from tbl_request as rH
WHERE id = /* each id from the very first select */
UNION ALL
select rH.id, rH.state
from tbl_request as rH
join cte
on rH.parent_id = cte.id
and (cte.state is null or cte.state NOT IN('not-legit'))
)
select case when exists(select 1 from cte where cte.state IN('not-legit'))
then 1 else 0 end
As expected, it does what it's supposed to.
The solution was suggested in the question
Return true/false in recursive SQL query based on condition
For your convenience, here's a SQL Fiddle

I think I've worked out what you want.
You need to recurse through all the nodes and their children, returning each node's state along with the id of its ultimate root parent.
Then aggregate by that id and exclude any group that contains a row with state = 'not-legit'. In other words, flip the logic to a double negative.
WITH cte AS (
SELECT rH.id, rH.state, rH.id AS top_parent
FROM tbl_request as rH
WHERE (rH.state is null or rH.state <> 'not-legit')
AND rH.parent_id IS NULL
UNION ALL
SELECT rH.id, rH.state, cte.top_parent
FROM tbl_request as rH
JOIN cte
ON rH.parent_id = cte.id
)
SELECT top_parent
FROM cte
GROUP BY
cte.top_parent
HAVING COUNT(CASE WHEN cte.state = 'not-legit' THEN 1 END) = 0;
You could also change the logic back to a positive, but it would need to look like this:
HAVING COUNT(CASE WHEN cte.state is null or cte.state <> 'not-legit' THEN 1 END) = COUNT(*)
In other words, there are the same number of these filtered rows as there are all rows.
This feels more complex than what I have put above.
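To see the aggregate approach run end to end, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in for the real database. The sample rows are hypothetical, and SQLite is only an assumption for illustration (it accepts WITH RECURSIVE where SQL Server writes plain WITH); the query itself mirrors the one above.

```python
import sqlite3

# Hypothetical sample data: (id, parent_id, state).
# Tree 1 is fully legit; tree 4 has a 'not-legit' descendant.
ROWS = [
    (1, None, None),
    (2, 1, "legit"),
    (3, 2, "legit"),
    (4, None, None),
    (5, 4, "legit"),
    (6, 5, "not-legit"),
]

QUERY = """
WITH RECURSIVE cte AS (
    SELECT rH.id, rH.state, rH.id AS top_parent
    FROM tbl_request AS rH
    WHERE (rH.state IS NULL OR rH.state <> 'not-legit')
      AND rH.parent_id IS NULL
    UNION ALL
    SELECT rH.id, rH.state, cte.top_parent
    FROM tbl_request AS rH
    JOIN cte ON rH.parent_id = cte.id
)
SELECT top_parent
FROM cte
GROUP BY top_parent
HAVING COUNT(CASE WHEN state = 'not-legit' THEN 1 END) = 0
ORDER BY top_parent
"""


def fully_legit_roots(rows):
    """Return the ids of root requests whose whole tree has no 'not-legit' row."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE tbl_request (id INTEGER PRIMARY KEY, parent_id INTEGER, state TEXT)"
    )
    conn.executemany("INSERT INTO tbl_request VALUES (?, ?, ?)", rows)
    result = [row[0] for row in conn.execute(QUERY)]
    conn.close()
    return result


print(fully_legit_roots(ROWS))
```

Any extra AND conditions on the root rows would go into the anchor member's WHERE clause, next to the parent_id IS NULL test.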
SQL Fiddle

Replace your
WHERE id = /* each id from the very first select */
by
WHERE id in (
SELECT r.id FROM tbl_request AS r
/* there's also an INNER JOIN (hopefully, it won't be an obstacle) */
WHERE r.parent_id is null
/* a lot of AND statements */
)
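Also, consider using UNION instead of UNION ALL, since there is no point in keeping duplicated (id, state) tuples in this case. Note that this only works on engines that allow UNION inside a recursive CTE (PostgreSQL does); SQL Server requires UNION ALL between the anchor and the recursive member.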
To summarize, your query should look like this one
with cte
as (select id, state
from tbl_request as rH
WHERE id in (
SELECT r.id
FROM tbl_request AS r
/* there's also an INNER JOIN (hopefully, it won't be an obstacle) */
WHERE r.parent_id is null
/* a lot of AND statements */
) UNION
select rH.id, rH.state
from tbl_request as rH
join cte
on rH.parent_id = cte.id
and (cte.state is null or cte.state NOT IN('not-legit'))
)
Your subquery can contain any inner joins or any number of AND operators you need, as long as it returns one column (select r.id) it will work fine.
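As an illustration of feeding the anchor from a subquery, here is a small sketch run through Python's sqlite3 module. The priority column and the sample rows are hypothetical stand-ins for the "lot of AND statements"; SQLite is used only because it is easy to run and, like PostgreSQL, accepts UNION in a recursive CTE.

```python
import sqlite3

# Hypothetical rows: (id, parent_id, state, priority).
ROWS = [
    (1, None, None, "high"),
    (2, 1, "legit", "high"),
    (3, 2, "not-legit", "high"),
    (4, 3, "legit", "high"),   # below a 'not-legit' node, so never reached
    (5, None, None, "low"),    # root filtered out by the extra AND condition
    (6, 5, "legit", "low"),
]

QUERY = """
WITH RECURSIVE cte AS (
    SELECT id, state
    FROM tbl_request AS rH
    WHERE id IN (
        SELECT r.id
        FROM tbl_request AS r
        WHERE r.parent_id IS NULL
          AND r.priority = 'high'   -- stands in for the other AND conditions
    )
    UNION
    SELECT rH.id, rH.state
    FROM tbl_request AS rH
    JOIN cte ON rH.parent_id = cte.id
     AND (cte.state IS NULL OR cte.state NOT IN ('not-legit'))
)
SELECT id, state FROM cte ORDER BY id
"""


def reachable_requests(rows):
    """Walk down from every selected root, stopping below 'not-legit' nodes."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE tbl_request "
        "(id INTEGER PRIMARY KEY, parent_id INTEGER, state TEXT, priority TEXT)"
    )
    conn.executemany("INSERT INTO tbl_request VALUES (?, ?, ?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


print(reachable_requests(ROWS))
```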

Related

Return true/false in recursive SQL query based on condition

I have the following flowchart. Hopefully, it's self-explanatory
At the top of the hierarchy is a request that is the root parent of all the requests below it. The requests below it have 'id', 'parent_id', and 'state' fields.
My final goal is to find out whether at least one of the parent's sub-requests has an illegal/not-legit state. There are a few 'not-legit' states, which is why I'm using NOT IN. So I simply need to return true/false depending on whether at least one sub-request has the wrong state.
I use the query below to build the hierarchy
DECLARE @main_parent_id bigint = 1
; with cte
as (select id
from tbl_request as rH
WHERE id = @main_parent_id
UNION ALL
select rH.id
from tbl_request as rH
join cte
on rH.parent_id = cte.id
WHERE rH.state NOT IN('not-legit'))
select *
from cte
order by id;
But I don't know how to return true/false instead of just returning ids. In addition, @main_parent_id is a dynamic variable that comes from another SELECT, which returns all requests that are at the top of the hierarchy.
In a sense, the query above should return true if all sub-requests are in LEGIT states, false if there's at least one sub-request in NOT-LEGIT state.
For your convenience, here's a SQL Fiddle
Stop searching a branch when a non-legit is already found.
with cte
as (select id, state
from tbl_request as rH
WHERE id = @main_parent_id
UNION ALL
select rH.id, rH.state
from tbl_request as rH
join cte
on rH.parent_id = cte.id
and (cte.state is null or cte.state NOT IN('not-legit'))
)
select case when exists(select 1 from cte where cte.state IN('not-legit'))
then 1 else 0 end
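A minimal sketch of this early-stopping check, run through Python's sqlite3 module as a stand-in database. The sample rows and the has_not_legit helper are hypothetical; the root id is passed as a bound parameter in place of the T-SQL variable.

```python
import sqlite3

# Hypothetical rows: (id, parent_id, state).
ROWS = [
    (1, None, None),
    (2, 1, "legit"),
    (3, 2, "not-legit"),
    (4, 3, "legit"),
    (5, None, None),
    (6, 5, "legit"),
]

FLAG_QUERY = """
WITH RECURSIVE cte AS (
    SELECT id, state
    FROM tbl_request
    WHERE id = ?
    UNION ALL
    SELECT rH.id, rH.state
    FROM tbl_request AS rH
    JOIN cte ON rH.parent_id = cte.id
     -- do not descend below a node that is already 'not-legit'
     AND (cte.state IS NULL OR cte.state NOT IN ('not-legit'))
)
SELECT CASE WHEN EXISTS (SELECT 1 FROM cte WHERE state IN ('not-legit'))
            THEN 1 ELSE 0 END
"""


def has_not_legit(rows, root_id):
    """Return 1 if the tree under root_id contains a 'not-legit' row, else 0."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE tbl_request (id INTEGER PRIMARY KEY, parent_id INTEGER, state TEXT)"
    )
    conn.executemany("INSERT INTO tbl_request VALUES (?, ?, ?)", rows)
    flag = conn.execute(FLAG_QUERY, (root_id,)).fetchone()[0]
    conn.close()
    return flag


print(has_not_legit(ROWS, 1), has_not_legit(ROWS, 5))
```

The 'not-legit' row itself is still returned by the CTE (the filter is on the parent's state), which is what lets the EXISTS test find it.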
You mean something like this?
DECLARE @main_parent_id bigint = 1
;with cte as (
select id, state
from tbl_request as rH
WHERE id = @main_parent_id
UNION ALL
select rH.id, rh.state
from tbl_request as rH
join cte
on rH.parent_id = cte.id
)
select
case when (select top 1 1 from cte where state = 'not-legit') = 1 then 'true' else 'false' end as NonLegit
Here, try this. I just tested it and it works.
Part of the problem was that you were filtering out the 'not-legit' rows. Second, you can use CASE WHEN to test for the presence of 'not-legit'.
; WITH cte as
(
SELECT id, state
FROM tbl_request AS rH
WHERE id = @main_parent_id
UNION ALL
SELECT rH.id, rh.state
FROM tbl_request AS rH
INNER JOIN cte ON rH.parent_id = cte.id
)
SELECT
CASE WHEN SUM
(
CASE WHEN state='not-legit' THEN 1 ELSE 0 END
)>0 THEN 'false' ELSE 'true' END
FROM cte
;

sql query to format the parent child in a hierarchical list

I want the data displayed in the parent-child layout below, where the first row is the parent row and every subsequent row is a child, with NULL in all parent-specific columns.
[Image: Here is the example]
In the example above, consumer 99999999 has 2 dependents, 22222222 and 33333333.
I tried doing this with LEAD and LAG, but that needs an ORDER BY in the CTE.
I was thinking of looking at the previous ApplicationroductID and flagging whenever it changes. That way I can flag the parent and the child.
I could potentially get around the ORDER BY problem by using a temp table, but that doesn't sound like a very good solution.
Is there a better way of doing this that you can think of? I want the data to be in the above format.
select
de.ID as ConsumerID,
dd.ID as DependentID,
f.ApplicationroductID,
LEAD(f.ApplicationroductID) OVER (Order by f.ApplicationroductID) Lead,
LAG(f.ApplicationroductID) OVER (Order by f.ApplicationroductID) Lag
from Fact f
INNER JOIN DimEmp de ON f.DimEmp_FK = de.ID
INNER JOIN DimDep dd ON f.DimDep_FK = dd.ID
Try something like this:
;with cte as
(
select
de.ID as ConsumerID,
null as DependentID,
de.ConsumerName,
null as ParticipantName,
f.ApplicationroductID
from Fact f
INNER JOIN DimEmp de ON f.DimEmp_FK = de.ID
union
select
de.ID as ConsumerID,
dd.ID as DependentID,
null as ConsumerName,
dd.ParticipantName,
f.ApplicationroductID
from Fact f
INNER JOIN DimEmp de ON f.DimEmp_FK = de.ID
INNER JOIN DimDep dd ON f.DimDep_FK = dd.ID
)
select case
when DependentID is null then ConsumerID
else null
end as ConsumerID,
DependentID,
ConsumerName,
ParticipantName,
ApplicationroductID
from cte
order by cte.ConsumerID,
cte.DependentID
The two SELECTS + UNION will get you all rows, with NULL or real value for each row type. We get the ConsumerId in both SELECTs in order to use it in the outer SELECT's ORDER BY expression. The outer SELECT switches NULL or ConsumerId, depending on the row type.
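Here is a small runnable sketch of the same idea, using Python's sqlite3 module as a stand-in database. The table contents and names such as 'Ann' or 'Kid A' are made up for illustration. The blanked-out id is given a separate alias (ShownConsumerID, a name introduced here) and the sort uses the qualified cte.ConsumerID, so that the ordering is done on the real consumer id rather than on the column that is NULL for dependents.

```python
import sqlite3

SETUP = """
CREATE TABLE DimEmp (ID INTEGER PRIMARY KEY, ConsumerName TEXT);
CREATE TABLE DimDep (ID INTEGER PRIMARY KEY, ParticipantName TEXT);
CREATE TABLE Fact (DimEmp_FK INTEGER, DimDep_FK INTEGER, ApplicationroductID INTEGER);
INSERT INTO DimEmp VALUES (99999999, 'Ann'), (88888888, 'Bob');
INSERT INTO DimDep VALUES (22222222, 'Kid A'), (33333333, 'Kid B'), (11111111, 'Kid C');
INSERT INTO Fact VALUES (99999999, 22222222, 1), (99999999, 33333333, 1), (88888888, 11111111, 2);
"""

QUERY = """
WITH cte AS (
    -- one parent row per consumer (UNION removes the duplicates)
    SELECT de.ID AS ConsumerID, NULL AS DependentID,
           de.ConsumerName, NULL AS ParticipantName, f.ApplicationroductID
    FROM Fact f
    JOIN DimEmp de ON f.DimEmp_FK = de.ID
    UNION
    -- one child row per dependent
    SELECT de.ID, dd.ID, NULL, dd.ParticipantName, f.ApplicationroductID
    FROM Fact f
    JOIN DimEmp de ON f.DimEmp_FK = de.ID
    JOIN DimDep dd ON f.DimDep_FK = dd.ID
)
SELECT CASE WHEN DependentID IS NULL THEN ConsumerID END AS ShownConsumerID,
       DependentID, ConsumerName, ParticipantName
FROM cte
ORDER BY cte.ConsumerID, cte.DependentID   -- NULL DependentID (the parent) sorts first
"""


def parent_child_list():
    """Return parent rows followed by their child rows, parent columns blanked on children."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SETUP)
    rows = conn.execute(QUERY).fetchall()
    conn.close()
    return rows


for row in parent_child_list():
    print(row)
```

This relies on NULLs sorting before non-NULL values in ascending order, which holds for SQLite and SQL Server; on an engine that sorts NULLs last, an explicit NULLS FIRST or an extra sort key would be needed.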

Confusing use of cte ,inner join and union all

I am confused about the use of the inner join on the CTE here. What does a refer to in the inner join at the end, and what does cte1 c refer to?
WITH cte1 AS
(SELECT id,geographyname,
OriginalGoals,
ParentGeographyname,
0 AS HierarchyLevel,
paradigm
FROM businessobject_RefinementMaster
WHERE Id = @Geo
UNION ALL
SELECT a.id,
a.geographyname,
a.OriginalGoals,
a.ParentGeographyName,
HierarchyLevel-1 AS HierarchyLevel,
a.paradigm
FROM businessobject_RefinementMaster a
INNER JOIN cte1 c ON c.ParentGeographyname = a.geographyname
AND c.paradigm=a.paradigm )
what will be the result of this query?
This is a recursive CTE (hidden RBAR). I'll try to comment it in a way that lets you understand what is going on:
WITH cte1 AS
(
/*
This is called "anchor" and reads the "head" lines of a hierarchy
*/
SELECT id,
geographyname,
OriginalGoals,
ParentGeographyname,
0 AS HierarchyLevel, --obviously this starts with a "0"
paradigm
FROM businessobject_RefinementMaster --The source-table
WHERE Id = @Geo --You read elements with Id=@Geo. This is - probably - one single element
--The next SELECT will be "added" to the result-set
UNION ALL
/*
The column-list must be absolutely the same (count and type) of the anchor
*/
SELECT a.id,
a.geographyname,
a.OriginalGoals,
a.ParentGeographyName,
HierarchyLevel-1 AS HierarchyLevel, --this is simple counting. Started with 0 this will lead to -1, -2, -3...
a.paradigm
FROM businessobject_RefinementMaster a --same source-table as above
INNER JOIN cte1 c ON c.ParentGeographyname = a.geographyname --Find rows where the name of the element is the parent-name of the former element
AND c.paradigm=a.paradigm
)
/*
Return the result-set
*/
SELECT * FROM cte1
The result should be a full recursive list of parents to a given element.
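To make the "list of parents" result concrete, here is a sketch that runs a trimmed version of the query through Python's sqlite3 module. The rows are invented, the OriginalGoals column is left out for brevity, and the start id is passed as a bound parameter; none of this comes from the original table.

```python
import sqlite3

# Hypothetical rows: (id, geographyname, ParentGeographyname, paradigm).
ROWS = [
    (1, "World", None, "p1"),
    (2, "Europe", "World", "p1"),
    (3, "Spain", "Europe", "p1"),
    (4, "Madrid", "Spain", "p1"),
    (5, "Spain", "Americas", "p2"),  # same name, other paradigm: must not match
]

QUERY = """
WITH RECURSIVE cte1 AS (
    -- anchor: the starting element, at level 0
    SELECT id, geographyname, ParentGeographyname, 0 AS HierarchyLevel, paradigm
    FROM businessobject_RefinementMaster
    WHERE id = ?
    UNION ALL
    -- recursive member: "a" is the parent row of the previously found row "c"
    SELECT a.id, a.geographyname, a.ParentGeographyname,
           c.HierarchyLevel - 1, a.paradigm
    FROM businessobject_RefinementMaster a
    JOIN cte1 c ON c.ParentGeographyname = a.geographyname
               AND c.paradigm = a.paradigm
)
SELECT geographyname, HierarchyLevel
FROM cte1
ORDER BY HierarchyLevel DESC
"""


def ancestors(rows, start_id):
    """Return the start element and all of its parents with their (negative) levels."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE businessobject_RefinementMaster "
        "(id INTEGER PRIMARY KEY, geographyname TEXT, ParentGeographyname TEXT, paradigm TEXT)"
    )
    conn.executemany("INSERT INTO businessobject_RefinementMaster VALUES (?, ?, ?, ?)", rows)
    result = conn.execute(QUERY, (start_id,)).fetchall()
    conn.close()
    return result


print(ancestors(ROWS, 4))
```

In each recursion step c is the row found in the previous step and a is the table row whose name equals c's parent name, so the walk goes upwards until a row has no parent.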

Using CTE with hierarchical data and 'cumulative' values

I'm experimenting with SQL Common Table Expressions using a sample hierarchy of cities, countries and continents and which have been visited and which haven't.
The table t_hierarchy looks like this:
(NOTE: The visited column is deliberately NULL for non-cities because I want this to be a dynamically calculated percentage.)
I have then used the following SQL to create a recursive result set based on the data in t_hierarchy:
WITH myCTE (ID, name, type, parentID, visited, Depth)
AS
(
Select ID, name, type, parentID, visited, 0 as Depth From t_hierarchy where parentID IS NULL
UNION ALL
Select t_hierarchy.ID, t_hierarchy.name, t_hierarchy.type, t_hierarchy.parentID, t_hierarchy.visited, Depth + 1
From t_hierarchy
inner join myCte on t_hierarchy.parentID = myCte.ID
)
Select ID, name, type, parentID, Depth, cnt.numDirectChildren, visited
FROM myCTE
LEFT JOIN (
SELECT theID = parentID, numDirectChildren = COUNT(*)
FROM myCTE
GROUP BY parentID
) cnt ON cnt.theID = myCTE.ID
order by ID
The result looks like this:
What I would like to do now, which I am struggling with, is to create a column, e.g. visitedPercentage to show the percentage of cities visited for each 'level' of the hierarchy (treating cities differently to countries and continents). To explain, working our way up the 'tree':
Madrid would be 100% because it has been visited (visited = 1)
Barcelona would be 0% because it has not been visited (visited = 0)
Spain would therefore be 50% because it has 2 direct children and one is 100% and the other is 0%
Europe would therefore be 50% because Spain is 50%, France is 100% (Paris has been visited), and Germany is 0% (Berlin has not been visited)
I hope this makes sense. I kind of want to say "if it's not a city, work out the visitedPercentage of THIS level based on the visitedPercentage of all direct children, otherwise just show 100% or 0%. Any guidance is much appreciated.
UPDATE:
I've managed to progress it a bit further using Daniel Gimenez's suggestion to the point where I've got France 100, Spain 50 etc. But the top level items (e.g. Europe) are still 0, like this:
I think this is because the calculation is being done after the recursive part of the query, rather than within it. I.e. this line:
SELECT ... , visitPercent = SUM(CAST(visited AS int)) / COUNT(*) FROM myCTE GROUP BY parentID
is saying "look at the visited column for child objects, calculate the SUM of the values, and show the result as visitPercent", when it should be saying "look at the existing visitPercent value from the previous calculation", if that makes sense. I've no idea where to go from here! :)
I think I've done it, using 2 CTEs. In the end it was easier to get the total number of descendants for each level (children, grandchildren etc.) and use that to calculate the overall percentage.
That was painful. At one point typing 'CATS' instead of 'CAST' had me puzzled for about 10 minutes.
with cte1 (ID,parentID,type,name,visited,Lvl) as (
select t.ID, t.parentID, t.type, t.name, t.visited, 0 as [Lvl]
from t_hierarchy t
where t.parentID is not null
union all
select c.ID, t.parentID, c.type, c.name, c.visited, c.Lvl + 1
from t_hierarchy t
inner join cte1 c on c.parentID = t.ID
where t.parentID is not null
),
cte2 (ID,name,type,parentID,parentName_for_reference,visited,Lvl) as (
Select t_hierarchy.ID, t_hierarchy.name, t_hierarchy.type, t_hierarchy.parentID, p.name as parentName_for_reference, t_hierarchy.visited, 0 as Lvl
From t_hierarchy
left join t_hierarchy p ON p.ID = t_hierarchy.parentID
where t_hierarchy.parentID IS NULL
UNION ALL
Select t_hierarchy.ID, t_hierarchy.name, t_hierarchy.type, t_hierarchy.parentID,p.name as parentName_for_reference, t_hierarchy.visited, Lvl + 1
From t_hierarchy
inner join cte2 on t_hierarchy.parentID = cte2.ID
inner join t_hierarchy p ON p.ID = t_hierarchy.parentID
)
select cte2.ID,cte2.name,cte2.type,cte2.parentID,cte2.parentName_for_reference,cte2.visited,cte2.Lvl
,CASE WHEN type = 'city' THEN 'N/A' ELSE CAST(cnt.totalDescendents as varchar) END AS totalDescendents
,CASE WHEN type = 'city' THEN 'N/A' ELSE CAST(COALESCE(cnt2.totalDescendentsVisited,0) as varchar) END AS totalDescendentsVisited
,CASE WHEN type = 'city' THEN 'N/A' ELSE CAST((CAST(ROUND(CAST(COALESCE(cnt2.totalDescendentsVisited,0) as float)/CAST(cnt.totalDescendents as float),2) AS numeric(36,2))*100) as varchar) END as asPercentage
from cte2
left JOIN (
SELECT theID = parentID, COUNT(*) as totalDescendents
FROM cte1
WHERE type = 'city'
GROUP BY parentID
) cnt ON cnt.theID = cte2.ID
left JOIN (
SELECT theID = parentID, COUNT(*) as totalDescendentsVisited
FROM cte1
WHERE type = 'city' AND visited = 1
GROUP BY parentID
) cnt2 ON cnt2.theID = cte2.ID
ORDER BY ID
These posts were helpful:
Keeping it simple and how to do multiple CTE in a query
CTE to get all children (descendants) of a parent
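The core of the approach above (pair every city with each of its ancestors, then count per ancestor) can be sketched in a condensed form. This is not the query above, just a smaller variant of the same idea run through Python's sqlite3 module; the anc CTE name and the sample rows are assumptions that follow the example in the question.

```python
import sqlite3

# Hypothetical rows following the question: (ID, name, type, parentID, visited).
ROWS = [
    (1, "Europe", "continent", None, None),
    (2, "Spain", "country", 1, None),
    (3, "France", "country", 1, None),
    (4, "Germany", "country", 1, None),
    (5, "Madrid", "city", 2, 1),
    (6, "Barcelona", "city", 2, 0),
    (7, "Paris", "city", 3, 1),
    (8, "Berlin", "city", 4, 0),
]

QUERY = """
WITH RECURSIVE anc (cityID, ancestorID, visited) AS (
    -- each city paired with its direct parent
    SELECT ID, parentID, visited
    FROM t_hierarchy
    WHERE type = 'city' AND parentID IS NOT NULL
    UNION ALL
    -- then with the parent's parent, and so on up the tree
    SELECT a.cityID, t.parentID, a.visited
    FROM t_hierarchy t
    JOIN anc a ON a.ancestorID = t.ID
    WHERE t.parentID IS NOT NULL
)
SELECT t.name,
       COUNT(*)       AS totalCities,
       SUM(a.visited) AS citiesVisited,
       ROUND(100.0 * SUM(a.visited) / COUNT(*), 2) AS visitedPercentage
FROM anc a
JOIN t_hierarchy t ON t.ID = a.ancestorID
GROUP BY t.ID, t.name
ORDER BY t.ID
"""


def visited_percentages(rows):
    """Return (name, city count, visited count, percentage) for every non-city node."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE t_hierarchy "
        "(ID INTEGER PRIMARY KEY, name TEXT, type TEXT, parentID INTEGER, visited INTEGER)"
    )
    conn.executemany("INSERT INTO t_hierarchy VALUES (?, ?, ?, ?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


for row in visited_percentages(ROWS):
    print(row)
```

Note that this counts cities per ancestor, as the answer does. That equals the "average of the children's percentages" from the question only when the children have equal numbers of cities, which happens to be the case for Europe in this sample.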

Limit join to one row

I have the following query:
SELECT sum((select count(*) as itemCount) * "SalesOrderItems"."price") as amount, 'rma' as
"creditType", "Clients"."company" as "client", "Clients".id as "ClientId", "Rmas".*
FROM "Rmas" JOIN "EsnsRmas" on("EsnsRmas"."RmaId" = "Rmas"."id")
JOIN "Esns" on ("Esns".id = "EsnsRmas"."EsnId")
JOIN "EsnsSalesOrderItems" on("EsnsSalesOrderItems"."EsnId" = "Esns"."id" )
JOIN "SalesOrderItems" on("SalesOrderItems"."id" = "EsnsSalesOrderItems"."SalesOrderItemId")
JOIN "Clients" on("Clients"."id" = "Rmas"."ClientId" )
WHERE "Rmas"."credited"=false AND "Rmas"."verifyStatus" IS NOT null
GROUP BY "Clients".id, "Rmas".id;
The problem is that the table "EsnsSalesOrderItems" can have the same EsnId in different entries. I want to restrict the query to only pull the last entry in "EsnsSalesOrderItems" that has the same "EsnId".
By "last" entry I mean the following:
The one that appears last in the table "EsnsSalesOrderItems". So for example if "EsnsSalesOrderItems" has two entries with "EsnId" = 6 and "createdAt" = '2012-06-19' and '2012-07-19' respectively it should only give me the entry from '2012-07-19'.
SELECT (count(*) * sum(s."price")) AS amount
, 'rma' AS "creditType"
, c."company" AS "client"
, c.id AS "ClientId"
, r.*
FROM "Rmas" r
JOIN "EsnsRmas" er ON er."RmaId" = r."id"
JOIN "Esns" e ON e.id = er."EsnId"
JOIN (
SELECT DISTINCT ON ("EsnId") *
FROM "EsnsSalesOrderItems"
ORDER BY "EsnId", "createdAt" DESC
) es ON es."EsnId" = e."id"
JOIN "SalesOrderItems" s ON s."id" = es."SalesOrderItemId"
JOIN "Clients" c ON c."id" = r."ClientId"
WHERE r."credited" = FALSE
AND r."verifyStatus" IS NOT NULL
GROUP BY c.id, r.id;
Your query in the question has an illegal aggregate over another aggregate:
sum((select count(*) as itemCount) * "SalesOrderItems"."price") as amount
Simplified and converted to legal syntax:
(count(*) * sum(s."price")) AS amount
But do you really want to multiply with the count per group?
I retrieve the single row per group in "EsnsSalesOrderItems" with DISTINCT ON. Detailed explanation:
Select first row in each GROUP BY group?
I also added table aliases and formatting to make the query easier to parse for human eyes. If you could avoid camel case you could get rid of all the double quotes clouding the view.
Something like:
join (
select "EsnId",
row_number() over (partition by "EsnId" order by "createdAt" desc) as rn
from "EsnsSalesOrderItems"
) t ON t."EsnId" = "Esns"."id" and rn = 1
This will select the latest row per "EsnId" from "EsnsSalesOrderItems" based on the column "createdAt". You can use any column that lets you define an order on the rows that suits you.
But remember that the concept of the "last row" is only valid if you specify an order for the rows. A table as such is not ordered, nor is the result of a query unless you specify an ORDER BY.
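The row_number() technique can be tried in isolation with a small sketch using Python's sqlite3 module. This assumes an SQLite build with window functions (3.25 or later); the sample rows are hypothetical and reuse the dates from the question.

```python
import sqlite3

# Hypothetical rows: ("EsnId", "SalesOrderItemId", "createdAt").
ROWS = [
    (6, 100, "2012-06-19"),
    (6, 101, "2012-07-19"),
    (7, 102, "2012-05-01"),
]

QUERY = """
SELECT "EsnId", "SalesOrderItemId", "createdAt"
FROM (
    SELECT "EsnId", "SalesOrderItemId", "createdAt",
           row_number() OVER (PARTITION BY "EsnId"
                              ORDER BY "createdAt" DESC) AS rn
    FROM "EsnsSalesOrderItems"
) t
WHERE rn = 1          -- keep only the newest row of each "EsnId"
ORDER BY "EsnId"
"""


def latest_per_esn(rows):
    """Return the most recently created row for every EsnId."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        'CREATE TABLE "EsnsSalesOrderItems" '
        '("EsnId" INTEGER, "SalesOrderItemId" INTEGER, "createdAt" TEXT)'
    )
    conn.executemany('INSERT INTO "EsnsSalesOrderItems" VALUES (?, ?, ?)', rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


print(latest_per_esn(ROWS))
```

The derived table has to carry every column the outer query needs (here "SalesOrderItemId"), since the later joins can only see what the subquery selects.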
Necromancing because the answers are outdated.
Take advantage of the LATERAL keyword introduced in PG 9.3
left | right | inner JOIN LATERAL
I'll explain with an example:
Assuming you have a table "Contacts".
Now contacts have organisational units.
They can have one OU at a point in time, but N OUs at N points in time.
Now, if you have to query contacts and OU in a time period (not a reporting date, but a date range), you could N-fold increase the record count if you just did a left join.
So, to display the OU, you need to just join the first OU for each contact (where what shall be first is an arbitrary criterion - when taking the last value, for example, that is just another way of saying the first value when sorted by descending date order).
In SQL Server, you would use CROSS APPLY (or rather OUTER APPLY, since we need a left join), which will invoke a table-valued function (or evaluate a subquery) for each row it has to join.
SELECT * FROM T_Contacts
--LEFT JOIN T_MAP_Contacts_Ref_OrganisationalUnit ON MAP_CTCOU_CT_UID = T_Contacts.CT_UID AND MAP_CTCOU_SoftDeleteStatus = 1
--WHERE T_MAP_Contacts_Ref_OrganisationalUnit.MAP_CTCOU_UID IS NULL -- 989
-- CROSS APPLY -- = INNER JOIN
OUTER APPLY -- = LEFT JOIN
(
SELECT TOP 1
--MAP_CTCOU_UID
MAP_CTCOU_CT_UID
,MAP_CTCOU_COU_UID
,MAP_CTCOU_DateFrom
,MAP_CTCOU_DateTo
FROM T_MAP_Contacts_Ref_OrganisationalUnit
WHERE MAP_CTCOU_SoftDeleteStatus = 1
AND MAP_CTCOU_CT_UID = T_Contacts.CT_UID
/*
AND
(
(@in_DateFrom <= T_MAP_Contacts_Ref_OrganisationalUnit.MAP_CTCOU_DateTo)
AND
(@in_DateTo >= T_MAP_Contacts_Ref_OrganisationalUnit.MAP_CTCOU_DateFrom)
)
*/
ORDER BY MAP_CTCOU_DateFrom
) AS FirstOE
In PostgreSQL, starting from version 9.3, you can do that, too - just use the LATERAL keyword to achieve the same:
SELECT * FROM T_Contacts
--LEFT JOIN T_MAP_Contacts_Ref_OrganisationalUnit ON MAP_CTCOU_CT_UID = T_Contacts.CT_UID AND MAP_CTCOU_SoftDeleteStatus = 1
--WHERE T_MAP_Contacts_Ref_OrganisationalUnit.MAP_CTCOU_UID IS NULL -- 989
LEFT JOIN LATERAL
(
SELECT
--MAP_CTCOU_UID
MAP_CTCOU_CT_UID
,MAP_CTCOU_COU_UID
,MAP_CTCOU_DateFrom
,MAP_CTCOU_DateTo
FROM T_MAP_Contacts_Ref_OrganisationalUnit
WHERE MAP_CTCOU_SoftDeleteStatus = 1
AND MAP_CTCOU_CT_UID = T_Contacts.CT_UID
/*
AND
(
(__in_DateFrom <= T_MAP_Contacts_Ref_OrganisationalUnit.MAP_CTCOU_DateTo)
AND
(__in_DateTo >= T_MAP_Contacts_Ref_OrganisationalUnit.MAP_CTCOU_DateFrom)
)
*/
ORDER BY MAP_CTCOU_DateFrom
LIMIT 1
) AS FirstOE
Try using a subquery in your ON clause. An abstract example:
SELECT
*
FROM table1
JOIN table2 ON table2.id = (
SELECT id FROM table2 WHERE table2.table1_id = table1.id LIMIT 1
)
WHERE
...
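A concrete version of the abstract example, run through Python's sqlite3 module as a stand-in. The columns name and created_at and the sample rows are hypothetical; an ORDER BY is added inside the subquery because LIMIT 1 without an order picks an arbitrary row.

```python
import sqlite3

SETUP = """
CREATE TABLE table1 (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE table2 (id INTEGER PRIMARY KEY, table1_id INTEGER, created_at TEXT);
INSERT INTO table1 VALUES (1, 'a'), (2, 'b'), (3, 'c');
INSERT INTO table2 VALUES
    (10, 1, '2020-01-01'),
    (11, 1, '2020-02-01'),
    (12, 2, '2020-01-15');
"""

QUERY = """
SELECT table1.name, table2.id
FROM table1
JOIN table2 ON table2.id = (
    -- correlated subquery: the newest table2 row for this table1 row
    SELECT t2.id
    FROM table2 AS t2
    WHERE t2.table1_id = table1.id
    ORDER BY t2.created_at DESC
    LIMIT 1
)
ORDER BY table1.id
"""


def one_match_per_row():
    """Join each table1 row to at most one table2 row (the newest one)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SETUP)
    rows = conn.execute(QUERY).fetchall()
    conn.close()
    return rows


print(one_match_per_row())
```

Because this is an inner join, table1 rows without any table2 row (here 'c') drop out; a LEFT JOIN would keep them with NULL in the table2 columns.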