Recursive query rows to single column? - sql

1) What are recursive queries ?
2) Are they dangerous ?
3) How can I make a recursive query to give me results from
ID Date
1 10/10/2010
1 20/10/2010
1 20/10/2010
2 11/10/2010
2 22/10/2010
to
ID Dates
1 10/10/2010,20/10/2010,20/10/2010
2 11/10/2010,22/10/2010
4) Can you explain how recursion operates inside the query? I googled but can't get how the recursion works actually. My database is DB2 ISeries V5R4.

A recursive query is a SQL query that can perform a recursive computation. In other words, it can use its own results to continue the query. Here is an abstract description:
1/ ancestor (x, y) = parent (x, y)
2/ ancestor (x, y) = parent (x, z) && ancestor (z, y).
Put briefly: to list all ancestors of a node, you list all of its parents, then all parents of those parents, and so on.
For example, suppose you have a family table (named parents in the query below) with two columns, a parent key and a child key, like this:
pkey char 1 not null primary key
ckey char 1 not null primary key
('A','B')
('A','C')
('A','D')
('C','E')
('D','A')
('D','E')
('D','F')
('F','G')
The left-hand side is the parent and the right-hand side is the child. Now, to find all descendants of A, here is some code:
with parent_ctl (ckey) as
(
select ckey
from parents
where pkey='A'
UNION ALL
select c.ckey
from parents C, parent_ctl P
where P.ckey = C.Pkey
)
select ckey from parent_ctl;
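The query above can be tried out with a small script. This is only a sketch: it uses SQLite through Python's sqlite3 module as a stand-in for DB2, and the function name descendants is made up for illustration. Note that the sample row ('D','A') makes A a descendant of itself, so on this data a UNION ALL recursion keeps producing rows; the sketch uses UNION instead, which drops rows already seen and lets the recursion stop.

```python
import sqlite3

# Sample rows from the answer above: (parent, child).
ROWS = [("A", "B"), ("A", "C"), ("A", "D"), ("C", "E"),
        ("D", "A"), ("D", "E"), ("D", "F"), ("F", "G")]


def descendants(root):
    """Return the sorted descendants of `root` (hypothetical helper)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE parents ("
        "pkey TEXT NOT NULL, ckey TEXT NOT NULL, PRIMARY KEY (pkey, ckey))"
    )
    conn.executemany("INSERT INTO parents VALUES (?, ?)", ROWS)
    sql = """
        WITH RECURSIVE parent_ctl (ckey) AS (
            -- seed: direct children of the root
            SELECT ckey FROM parents WHERE pkey = ?
            UNION            -- UNION (not UNION ALL) so the D -> A cycle ends
            -- recursive step: children of rows found so far
            SELECT c.ckey
            FROM parents c
            JOIN parent_ctl p ON p.ckey = c.pkey
        )
        SELECT ckey FROM parent_ctl
    """
    result = sorted(row[0] for row in conn.execute(sql, (root,)))
    conn.close()
    return result


print(descendants("A"))
```

Each pass of the recursive step joins the rows found in the previous pass back to parents, which is the "parents of those parents" idea read in the child direction.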

Related

Solution of hackerrank Binary Tree Nodes question

You are given a table, BST, containing two columns: N and P, where N represents the value of a node in Binary Tree, and P is the parent of N.
Write a query to find the node type of Binary Tree ordered by the value of the node. Output one of the following for each node:
Root: If node is root node.
Leaf: If node is leaf node.
Inner: If node is neither root nor leaf node.
Sample Input
Sample Output
1 Leaf
2 Inner
3 Leaf
5 Root
6 Leaf
8 Inner
9 Leaf
Explanation
The Binary Tree below illustrates the sample:
Why does the solution below not work:
select n,
CASE when P is null then 'Root'
when (select count(*) from BST where n = p)>0 then 'Inner'
else 'Leaf'
end as nodetype from BST
order by n
while the solution below works:
select n,
CASE when P is null then 'Root'
when (select count(*) from BST where b.n = p)>0 then 'Inner'
else 'Leaf'
end as nodetype from BST b
order by n
In your first query, both n and p inside the subquery refer to the subquery's own BST rows, so the condition only matches a node that is its own parent. The count is therefore 0 and no node is ever reported as 'Inner'.
In the second query, b.n refers to the row of the outer query and p to the rows of the subquery, so the count is greater than 0 when at least one node has b.n as its parent, and 0 otherwise.
Using a case statement is a way to go. However, to determine if a node is 'Inner' one needs to see if it is a parent to another node i.e. is its value N in the set of all P values.
SELECT N, CASE
WHEN P IS NULL THEN 'Root'
WHEN N IN (SELECT P FROM BST) THEN 'Inner'
ELSE 'Leaf' END as node_type FROM BST ORDER BY N
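To see the difference between the two queries from the question, here is a small sketch that runs both in SQLite via Python's sqlite3 module. The sample input is not reproduced in this thread, so the rows below are an assumption, chosen only to be consistent with the sample output; the variable names are illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE BST (N INTEGER, P INTEGER)")
# Assumed rows (node, parent), consistent with the sample output above.
conn.executemany(
    "INSERT INTO BST VALUES (?, ?)",
    [(1, 2), (3, 2), (6, 8), (9, 8), (2, 5), (8, 5), (5, None)],
)

# Inside the subquery, n and p both belong to the inner BST rows.
UNCORRELATED = """
    SELECT n,
           CASE WHEN p IS NULL THEN 'Root'
                WHEN (SELECT COUNT(*) FROM BST WHERE n = p) > 0 THEN 'Inner'
                ELSE 'Leaf'
           END AS nodetype
    FROM BST
    ORDER BY n
"""

# Here b.n comes from the outer row, p from the inner rows.
CORRELATED = """
    SELECT n,
           CASE WHEN p IS NULL THEN 'Root'
                WHEN (SELECT COUNT(*) FROM BST WHERE b.n = p) > 0 THEN 'Inner'
                ELSE 'Leaf'
           END AS nodetype
    FROM BST b
    ORDER BY n
"""

broken = conn.execute(UNCORRELATED).fetchall()
working = conn.execute(CORRELATED).fetchall()
print(broken)
print(working)
```

With these rows the first query never labels a node 'Inner', while the second one reproduces the sample output.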
The code below does not query the BST table: it reads an integer, converts it to binary, and prints the length of the longest run of consecutive 1s. The foreach loop tracks the maximum run length, which is then written to the console.
int n = Convert.ToInt32(Console.ReadLine().Trim());
string[] groupings = Convert.ToString(n, 2).Split("0");
int max = 0;
foreach(string s in groupings){
if(max < s.Length){
max = s.Length;
}
}
Console.WriteLine(max);

Query "is valid" but produces an error: Scalar subquery produced more than one element

This query “is valid” according to the BigQuery SQL editor. However, when it is run, it produces an error: Scalar subquery produced more than one element
Input:
SELECT
(Select
pcr.repdte
from
usa_fdic_call_reports_1992.All_Reports_19921231_Performance_and_Condition_Ratios as PCR) as Quarter,
(SELECT
Round(PCR.lnlsdepr)
FROM
usa_fdic_call_reports_1992.All_Reports_19921231_Performance_and_Condition_Ratios as PCR) as NetLoansAndLeasesToDeposits,
(SELECT LD.IDdepsam
FROM usa_fdic_call_reports_1992.All_Reports_19921231_Deposits_Based_on_the_dollar250_000_Reporting_Threshold AS LD) as DepositAccountsWithMoreThan250k
Output
Scalar subquery produced more than one element
The output of the queries when they are run independently is below:
SELECT
PCR.repdte as quarter
FROM
usa_fdic_call_reports_1992.All_Reports_19921231_Performance_and_Condition_Ratios as PCR
Output:
SELECT
Round(PCR.lnlsdepr) as NetLoansAndLeasesToDeposits
FROM
usa_fdic_call_reports_1992.All_Reports_19921231_Performance_and_Condition_Ratios as PCR
Output:
SELECT LD.IDdepsam as DepositAccountsWithMoreThan250k
FROM
usa_fdic_call_reports_1992.All_Reports_19921231_Deposits_Based_on_the_dollar250_000_Reporting_Threshold AS LD
Output:
Scalar subqueries cannot produce more than a single row. Each of your scalar subqueries returns a single column but multiple rows. That -- by definition -- won't work.
I solved the problem by not using a subquery, and instead using JOIN
SELECT
pcr.cert as cert,
pcr.name as NameOfBank,
pcr.repdte as Quarter,
Round(PCR.lnlsdepr) as NetLoansAndLeasesToDeposits,
LD.IDdepsam as DepositAccountsWithMoreThan250k
FROM
usa_fdic_call_reports_1992.All_Reports_19921231_Performance_and_Condition_Ratios as PCR
JOIN
usa_fdic_call_reports_1992.All_Reports_19921231_Deposits_Based_on_the_dollar250_000_Reporting_Threshold AS LD
ON PCR.cert = LD.cert
Output:
To fix, use ARRAY.
For example, this query works:
SELECT 1 x, (SELECT y FROM UNNEST(SPLIT("1")) y) y
But this one will give you the stated error:
SELECT 1 x, (SELECT y FROM UNNEST(SPLIT("1,2")) y) y
"Scalar subquery produced more than one element"
And I can fix it with ARRAY(), that will produce nested repeated results:
SELECT 1 x, ARRAY(SELECT y FROM UNNEST(SPLIT("1,2")) y) y
Or make sure to emit only one row, with LIMIT:
SELECT 1 x, (SELECT y FROM UNNEST(SPLIT("1,2")) y LIMIT 1) y
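For the JOIN approach mentioned earlier in this thread, here is a minimal sketch using SQLite through Python's sqlite3 module rather than BigQuery. The short table names ratios and deposits and all data values are made up for illustration; only the column names come from the queries above. SQLite does not raise the BigQuery error (it takes the first row of a scalar subquery), so the sketch only shows how the join lines up one row per cert.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical stand-ins for the two FDIC tables.
    CREATE TABLE ratios (cert INTEGER, name TEXT, repdte TEXT, lnlsdepr REAL);
    CREATE TABLE deposits (cert INTEGER, IDdepsam INTEGER);
    INSERT INTO ratios VALUES
        (1, 'Bank A', '1992-12-31', 71.4),
        (2, 'Bank B', '1992-12-31', 58.6);
    INSERT INTO deposits VALUES (1, 10), (2, 25);
""")

# One output row per bank: columns from both tables matched on cert.
rows = conn.execute("""
    SELECT pcr.cert,
           pcr.name           AS NameOfBank,
           pcr.repdte         AS Quarter,
           ROUND(pcr.lnlsdepr) AS NetLoansAndLeasesToDeposits,
           ld.IDdepsam        AS DepositAccountsWithMoreThan250k
    FROM ratios AS pcr
    JOIN deposits AS ld ON pcr.cert = ld.cert
    ORDER BY pcr.cert
""").fetchall()

for row in rows:
    print(row)
```

The join produces a row set, which is what the original query was trying to build out of three independent multi-row subqueries.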

Recursive CTE...dealing with nested parent/children records

I have the following records:
My goal is to check the SUM of the children for each parent and make sure it is 1 (or 100%).
In the example above, you have a first parent:
12043
It has 2 children:
12484 & 12485
Child (now parent) 12484 has child 12486. The child here (12486) has a percentage of 0.6 (which is NOT 100%). This is NOT OK.
Child (now parent) 12485 has child 12487. The child here (12487) has a percentage of 1 (or 100%). This is OK.
I need to sum the percentages of each nested children and get that value because it doesn't sum up to 100%, then I have to display a message. I'm having a hard time coming up with a query for this. Can someone give me a hand?
This is what I tried and I'm getting the "The statement terminated. The maximum recursion 100 has been exhausted before statement completion." error message.
with cte
as (select cp.parent_payee_id,
cp.payee_id,
cp.payee_pct,
0 as level
from dbo.tp_contract_payee cp
where cp.participant_id = 12067
and cp.payee_id = cp.parent_payee_id
union all
select cp.parent_payee_id,
cp.payee_id,
cp.payee_pct,
c.level + 1 as level
from dbo.tp_contract_payee cp
inner join cte c
on cp.parent_payee_id = c.payee_id
where cp.participant_id = 12067
)
select *
from cte
I believe something like the following should work:
WITH recCTE AS --SQL Server syntax: no RECURSIVE keyword
(
SELECT
parent_payee_id as parent,
payee_id as child,
payee_pct,
1 as depth,
CAST(parent_payee_id AS varchar(max)) + '>' + CAST(payee_id AS varchar(max)) as path
FROM
dbo.tp_contract_payee
WHERE
--top most node
parent_payee_id = 12043
AND payee_id <> parent_payee_id --prevent endless recursion
UNION ALL
SELECT
t.parent_payee_id as parent,
t.payee_id as child,
t.payee_pct,
recCTE.depth + 1 as depth,
recCTE.path + '>' + CAST(t.payee_id AS varchar(max)) as path
FROM
recCTE
INNER JOIN dbo.tp_contract_payee t ON
recCTE.child = t.parent_payee_id AND
recCTE.child <> t.payee_id --again prevent records where parent is child
WHERE recCTE.depth < 15 --prevent endless cycles
)
SELECT parent
FROM recCTE
GROUP BY parent
HAVING sum(payee_pct) <> 1;
This differs from yours mostly in the WHERE clauses of both the recursive seed (the query before UNION ALL) and the recursive term (the query after UNION ALL). I believe yours is too restrictive, especially in the recursive term: you want to let the children of 12067 through, but you then only allow 12067 as the parent id to pull in.
Here, though, we pull every descendant of 12043 (from your example table) and its payee_pct. Then, in the final SELECT, we group by parent and sum the payee_pct of that parent's direct children. If the sum for any parent is not 1, we display that parent in the output.
At any rate, between your query and mine, I would imagine this is pretty close to the requirements, so it should be tweaks to get you exactly where you need to be if this doesn't do the trick.
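As a rough check of the idea, the same recursion can be run in SQLite through Python's sqlite3 module. This is a sketch under assumptions: the example records are not reproduced in this thread, so the rows below are reconstructed from the description, the 0.5 percentages for 12484 and 12485 and the self-referencing root row are guesses, and the path column is left out for brevity.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tp_contract_payee ("
    "parent_payee_id INTEGER, payee_id INTEGER, payee_pct REAL)"
)
conn.executemany(
    "INSERT INTO tp_contract_payee VALUES (?, ?, ?)",
    [
        (12043, 12043, 1.0),  # assumed self-referencing root row
        (12043, 12484, 0.5),  # assumed percentage
        (12043, 12485, 0.5),  # assumed percentage
        (12484, 12486, 0.6),  # from the question: does not reach 100%
        (12485, 12487, 1.0),  # from the question: OK
    ],
)

SQL = """
    WITH RECURSIVE rec (parent, child, payee_pct, depth) AS (
        -- seed: direct children of the top node, skipping its self-row
        SELECT parent_payee_id, payee_id, payee_pct, 1
        FROM tp_contract_payee
        WHERE parent_payee_id = 12043
          AND payee_id <> parent_payee_id
        UNION ALL
        -- recursive term: children of the rows found so far
        SELECT t.parent_payee_id, t.payee_id, t.payee_pct, r.depth + 1
        FROM rec r
        JOIN tp_contract_payee t
          ON r.child = t.parent_payee_id
         AND r.child <> t.payee_id
        WHERE r.depth < 15
    )
    SELECT parent, SUM(payee_pct)
    FROM rec
    GROUP BY parent
    -- tolerance instead of <> 1 to avoid floating-point surprises
    HAVING ABS(SUM(payee_pct) - 1) > 1e-9
    ORDER BY parent
"""

bad_parents = conn.execute(SQL).fetchall()
print(bad_parents)
```

Only parents whose direct children do not add up to 1 come back, which is the set you would display a message for.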

How to group by more than one row value?

I am working with PostgreSQL and I can't figure out how to solve a problem. I have a model called Foobar. Some of its attributes are:
FOOBAR
check_in:datetime
qr_code:string
city_id:integer
In this table there is a lot of redundancy (qr_code is not unique), but that is not my problem right now. What I am trying to get are the foobars that share the same qr_code, have been in each of a known group of cities, and have checked in at different moments.
I got this by querying:
SELECT * FROM foobar AS a
WHERE a.city_id = 1
AND EXISTS (
SELECT * FROM foobar AS b
WHERE a.check_in < b.check_in
AND a.qr_code = b.qr_code
AND b.city_id = 2
AND EXISTS (
SELECT * FROM foobar as c
WHERE b.check_in < c.check_in
AND c.qr_code = b.qr_code
AND c.city_id = 3
AND EXISTS(...)
)
)
where '...' represents more queries to get more persons with the same qr_code, different check_in date and those well known cities.
My problem is that I want to group this by qr_code, and I want to show the check_in fields of each qr_code like this:
2015-11-11 14:14:14 => [2015-11-11 14:14:14, 2015-11-11 16:16:16, 2015-11-11 17:18:20] (this for each different qr_code)
where the data at the left is the 'smaller' date for that qr_code, and the right part are all the other dates for that qr_code, including the first one.
Is this possible to do with a sql query only? I am asking this because I am actually doing this app with rails, and I know that I can make a different approach with array methods of ruby (a solution with this would be well received too)
You could solve that with a recursive CTE - if I interpret your question correctly:
Assuming you have a given list of cities that must be visited in order by the same qr_code. Your text doesn't say so, but your query indicates as much.
WITH RECURSIVE
c AS (SELECT '{1,2,3}'::int[] AS cities) -- your list of city_id's here
, route AS (
SELECT f.check_in, f.qr_code, 2 AS idx
FROM foobar f
JOIN c ON f.city_id = c.cities[1]
UNION ALL
SELECT f.check_in, f.qr_code, r.idx + 1
FROM route r
JOIN foobar f USING (qr_code)
JOIN c ON f.city_id = c.cities[r.idx]
WHERE r.check_in < f.check_in
)
SELECT qr_code, array_agg(check_in) AS check_in_list
FROM (
SELECT *
FROM route
ORDER BY qr_code, idx -- or check_in
) sub
GROUP BY 1
HAVING count(*) = (SELECT array_length(cities, 1) FROM c);
Provide the list as array in the first (non-recursive) CTE c.
In the recursive part start with any rows in the first city and travel along your array until the last element.
In the final SELECT aggregate your check_in column in order. Only return qr_code that have visited all cities of the array.
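The three steps above can be sketched outside PostgreSQL as well. The version below uses SQLite through Python's sqlite3 module; since SQLite has no array type, the city list lives in a hypothetical helper table route_city(idx, city_id) and the final aggregation is done in Python instead of array_agg. The qr_code values and the rows for 'B' are made up for illustration.

```python
import sqlite3
from collections import defaultdict

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE foobar (check_in TEXT, qr_code TEXT, city_id INTEGER);
    -- Hypothetical replacement for the int[] list: position -> city_id.
    CREATE TABLE route_city (idx INTEGER, city_id INTEGER);
    INSERT INTO route_city VALUES (1, 1), (2, 2), (3, 3);
    INSERT INTO foobar VALUES
        ('2015-11-11 14:14:14', 'A', 1),
        ('2015-11-11 16:16:16', 'A', 2),
        ('2015-11-11 17:18:20', 'A', 3),
        ('2015-11-11 09:00:00', 'B', 1),
        ('2015-11-11 10:00:00', 'B', 3);   -- B never visits city 2
""")

SQL = """
    WITH RECURSIVE route (check_in, qr_code, idx) AS (
        -- start with any row in the first city of the list
        SELECT f.check_in, f.qr_code, 1
        FROM foobar f
        JOIN route_city c ON c.idx = 1 AND f.city_id = c.city_id
        UNION ALL
        -- move on to the next city, but only to a later check_in
        SELECT f.check_in, f.qr_code, r.idx + 1
        FROM route r
        JOIN foobar f ON f.qr_code = r.qr_code
        JOIN route_city c ON c.idx = r.idx + 1 AND f.city_id = c.city_id
        WHERE r.check_in < f.check_in
    )
    SELECT qr_code, check_in FROM route ORDER BY qr_code, idx
"""

# Collect the ordered check_in values per qr_code.
visits = defaultdict(list)
for qr_code, check_in in conn.execute(SQL):
    visits[qr_code].append(check_in)

# Keep only qr_codes that reached every city in the list.
n_cities = conn.execute("SELECT COUNT(*) FROM route_city").fetchone()[0]
complete = {qr: times for qr, times in visits.items() if len(times) == n_cities}
print(complete)
```

As in the SQL answer, comparing the row count with the number of cities assumes each qr_code has a single matching check_in per city.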
Similar:
Recursive query used for transitive closure

SQL - WHERE (X, Y) IN (A, B)

I have some kind of blockage currently.
My theoretic query looks something like this:
SELECT * FROM Table WHERE X in (a, b, c) AND Y IN (d, e, f)
So basically, I want all rows having multiple columns match, meaning:
X, Y
1, 2
3, 4
5, 6
7, 8
9, 10
If I want to get all rows where (X=1, Y=2) or (X=5, Y=6), so X and Y are correlated, how would I do that?
(MS SQL2005+)
Why not something simple like the following?
WHERE (X = 1 AND Y = 2) OR (X = 5 AND Y = 6) ...
Or, if you're looking for rows (based on your example) where Y should be X + 1, then:
WHERE Y = X + 1
If you have thousands of OR clauses like the above, then I would suggest you populate a criterion table ahead of time, and rewrite your query as a join. Suppose you have such a table Criteria(X, Y) then your query becomes much simpler:
SELECT Table.*
FROM Table
INNER JOIN Criteria ON Table.X = Criteria.X AND Table.Y = Criteria.Y
Don't forget to add an index / foreign keys as necessary to the new table.
If for some reason you prefer to not create a table ahead of time, you can use a temporary table or table variable and populate it within your procedure.
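A quick way to convince yourself that the criteria-table join keeps X and Y correlated is to run it on the sample rows from the question. This sketch uses SQLite through Python's sqlite3 module instead of MS SQL, and the table name data is a placeholder because Table is a reserved word.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE data (X INTEGER, Y INTEGER);      -- placeholder table name
    CREATE TABLE Criteria (X INTEGER, Y INTEGER);  -- the wanted (X, Y) pairs
    INSERT INTO data VALUES (1, 2), (3, 4), (5, 6), (7, 8), (9, 10);
    INSERT INTO Criteria VALUES (1, 2), (5, 6);
""")

# A row matches only if both columns agree with the same Criteria row.
matches = conn.execute("""
    SELECT d.X, d.Y
    FROM data d
    INNER JOIN Criteria c ON d.X = c.X AND d.Y = c.Y
    ORDER BY d.X
""").fetchall()
print(matches)
```

Two separate IN lists would also accept mixed pairs such as (1, 6) if such a row existed; the join does not.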
If X and Y are in a table then a JOIN would be cleanest:
SELECT * FROM Table t
INNER JOIN XandY xy
ON t.X = xy.X AND t.Y = xy.Y
If they're not in a table, I would strongly suggest putting them in one. IN only works with single-value sets, and there's no way to line up results using multiple IN clauses.