Possible explanation of a WITH RECURSIVE query in Postgres

I have been reading about WITH queries in Postgres, and this is the one that puzzles me:
WITH RECURSIVE t(n) AS (
VALUES (1)
UNION ALL
SELECT n+1 FROM t WHERE n < 100
)
SELECT sum(n) FROM t;
I can't understand how this query is evaluated.
t(n) looks like a function with a parameter. How is the value of n passed?
Any insight into how the recursive statement breaks down in SQL would be appreciated.

This is called a common table expression and is a way of expressing a recursive query in SQL:
t(n) defines the name of the CTE as t, with a single column named n. It's similar to an alias for a derived table:
select ...
from (
...
) as t(n);
The recursion starts with the value 1 (that's the values (1) part) and then repeatedly adds one to it until 100 is reached: the last row that passes the condition n < 100 is 99, and it produces 99 + 1 = 100. So it generates the numbers from 1 to 100. The final query then sums up all those numbers, giving 5050.
n is a column name, not a "variable" and the "assignment" happens in the same way as any data retrieval.
WITH RECURSIVE t(n) AS (
VALUES (1) --<< this is the recursion "root"
UNION ALL
SELECT n+1 FROM t WHERE n < 100 --<< this is the "recursive part"
)
SELECT sum(n) FROM t;
If you "unroll" the recursion (which in fact is an iteration) then you'd wind up with something like this:
select x.n + 1
from (
select x.n + 1
from (
select x.n + 1
from (
select x.n + 1
from (
values (1)
) as x(n)
) as x(n)
) as x(n)
) as x(n)
More details in the manual:
https://www.postgresql.org/docs/current/static/queries-with.html
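One way to check what the CTE produces is to run it. The sketch below uses Python's built-in sqlite3 module rather than Postgres (an assumption made only to keep the example self-contained; SQLite accepts the same WITH RECURSIVE syntax for this particular query). It reports the smallest and largest n, the row count, and the sum.

```python
import sqlite3

QUERY = """
WITH RECURSIVE t(n) AS (
    VALUES (1)                          -- the non-recursive "root"
    UNION ALL
    SELECT n + 1 FROM t WHERE n < 100   -- the recursive part
)
SELECT min(n), max(n), count(*), sum(n) FROM t
"""

# In-memory SQLite database, used here as a stand-in for Postgres.
conn = sqlite3.connect(":memory:")
lowest, highest, row_count, total = conn.execute(QUERY).fetchone()
conn.close()

print(lowest, highest, row_count, total)  # 1 100 100 5050
```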

If you are looking for how it is evaluated, the recursion occurs in two phases.
The root is executed once.
The recursive part is then executed repeatedly, each time against the rows produced by the previous iteration, until it returns no rows. The documentation is a little vague on that point.
Now, normally in databases, we think of "function" in a different way than we think of them when we do imperative programming. In database terms, the best way to think of a function is "a correspondence where for every domain value you have exactly one corresponding value." So one of the immediate challenges is to stop thinking in terms of programming functions. Even user-defined functions are best thought about in this other way since it avoids a lot of potential nastiness regarding the intersection of running the query and the query planner... So it may look like a function but that is not correct.
Instead the WITH clause uses a different, almost inverse notation. Here you have the set name t, followed (optionally in this case) by the tuple structure (n). So this is not a function with a parameter, but a relation with a structure.
So how this breaks down:
SELECT 1 as n where n < 100
UNION ALL
SELECT n + 1 FROM (SELECT 1 as n) where n < 100
UNION ALL
SELECT n + 1 FROM (SELECT n + 1 FROM (SELECT 1 as n)) where n < 100
Of course that is a simplification because internally we keep track of the cte state and keep joining against the last iteration, so in practice these get folded back to near linear complexity (while the above diagram would suggest much worse performance than that).
So in reality you get something more like:
SELECT 1 as n where 1 < 100
UNION ALL
SELECT 1 + 1 as n where 1 + 1 < 100
UNION ALL
SELECT 2 + 1 AS n WHERE 2 + 1 < 100
...
In essence the previous values carry over.
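The two phases can be modelled in a few lines of Python. This is only a sketch of the evaluation strategy described in this answer, not of how Postgres is implemented internally; the names evaluate_recursive_cte and recursive_part are made up for illustration.

```python
def evaluate_recursive_cte(root_rows, recursive_part):
    """Model of UNION ALL recursion: run the root once, then keep applying
    the recursive part to the rows produced by the previous iteration."""
    result = list(root_rows)    # phase 1: the root is executed once
    working = list(root_rows)   # rows visible to the next iteration
    while working:              # phase 2: stop when no rows come back
        working = recursive_part(working)
        result.extend(working)
    return result


def recursive_part(previous_rows):
    # Equivalent of: SELECT n + 1 FROM t WHERE n < 100
    return [n + 1 for n in previous_rows if n < 100]


numbers = evaluate_recursive_cte([1], recursive_part)
print(numbers[0], numbers[-1], sum(numbers))  # 1 100 5050
```

Each iteration only sees the rows from the iteration before it, which is why the work stays roughly linear instead of re-running the whole nested query every time.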

Related

WHILE Window Operation with Different Starting Point Values From Column - SQL Server [duplicate]

In SQL there are aggregation operators, like AVG, SUM, COUNT. Why doesn't it have an operator for multiplication? "MUL" or something.
I was wondering, does it exist for Oracle, MSSQL, MySQL ? If not is there a workaround that would give this behaviour?
By MUL do you mean progressive multiplication of values?
Even with 100 rows of some small size (say 10s), your MUL(column) is going to overflow any data type! With such a high probability of mis/ab-use, and very limited scope for use, it does not need to be a SQL Standard. As others have shown there are mathematical ways of working it out, just as there are many many ways to do tricky calculations in SQL just using standard (and common-use) methods.
Sample data:
Column
1
2
4
8
COUNT : 4 items (1 for each non-null)
SUM : 1 + 2 + 4 + 8 = 15
AVG : 3.75 (SUM/COUNT)
MUL : 1 x 2 x 4 x 8 ? ( =64 )
For completeness, the Oracle, MSSQL, MySQL core implementations *
Oracle : EXP(SUM(LN(column))) or POWER(N,SUM(LOG(N, column)))
MSSQL : EXP(SUM(LOG(column))) or POWER(N,SUM(LOG(column)/LOG(N)))
MySQL : EXP(SUM(LOG(column))) or POW(N,SUM(LOG(N,column)))
Take care when using EXP/LOG in SQL Server and watch the return type: http://msdn.microsoft.com/en-us/library/ms187592.aspx
The POWER form allows for larger numbers (using bases larger than Euler's number). In cases where the result grows too large to turn back into a number using POWER, you can return just the logarithmic value and calculate the actual number outside of the SQL query.
* LOG(0) and LOG(-ve) are undefined. The below shows only how to handle this in SQL Server. Equivalents can be found for the other SQL flavours, using the same concept
create table MUL(data int)
insert MUL select 1 yourColumn union all
select 2 union all
select 4 union all
select 8 union all
select -2 union all
select 0
select CASE WHEN MIN(abs(data)) = 0 then 0 ELSE
EXP(SUM(Log(abs(nullif(data,0))))) -- the base mathematics
* round(0.5-count(nullif(sign(sign(data)+0.5),1))%2,0) -- pairs up negatives
END
from MUL
Ingredients:
We take the abs() of data; if the min is 0, multiplying by anything else is futile, because the result is 0.
When data is 0, NULLIF converts it to null. abs() and log() then both return null, so the row is excluded from sum().
If data is not 0, abs() allows us to multiply a negative number using the LOG method - we keep track of the sign elsewhere.
Working out the final sign
sign(data) returns 1 for >0, 0 for 0 and -1 for <0.
We add another 0.5 and take the sign() again, so we have now classified 0 and 1 both as 1, and only -1 as -1.
again use NULLIF to remove from COUNT() the 1's, since we only need to count up the negatives.
% 2 against the count() of negative numbers returns either
--> 1 if there is an odd number of negative numbers
--> 0 if there is an even number of negative numbers
more mathematical tricks: we take 1 or 0 off 0.5, so that the above becomes
--> (0.5-1=-0.5=>round to -1) if there is an odd number of negative numbers
--> (0.5-0= 0.5=>round to 1) if there is an even number of negative numbers
we multiply this final 1/-1 against the SUM-PRODUCT value for the real result
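The same ingredients can be written out step by step outside SQL. The sketch below is a Python model of the logic, not a translation of the exact T-SQL expression; the function name product_via_logs is hypothetical, and None stands in for SQL NULL.

```python
import math


def product_via_logs(values):
    """Product of a column using exp(sum(log(abs(x)))), with zero and
    sign handled separately, as in the SQL Server query above."""
    non_null = [v for v in values if v is not None]  # aggregates skip NULLs
    if not non_null:
        return None
    if any(v == 0 for v in non_null):
        return 0.0  # a single zero makes the whole product zero
    # The "base mathematics": logs are only defined for positive numbers.
    magnitude = math.exp(sum(math.log(abs(v)) for v in non_null))
    # An odd number of negative values makes the product negative.
    negatives = sum(1 for v in non_null if v < 0)
    sign = -1 if negatives % 2 else 1
    return sign * magnitude


print(product_via_logs([1, 2, 4, 8]))
print(product_via_logs([1, 2, 4, 8, -2]))
print(product_via_logs([1, 2, 4, 8, -2, 0]))
```

Because the result goes through floating-point logarithms, it is only approximately equal to the exact product (64, -128 and 0 for the three calls), so round it if an integer is expected.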
No, but you can use Mathematics :)
if yourColumn is always bigger than zero:
select EXP(SUM(LOG(yourColumn))) As ColumnProduct from yourTable
I see an Oracle answer is still missing, so here it is:
with yourTable as
( select 1 yourColumn from dual union all
select 2 from dual union all
select 4 from dual union all
select 8 from dual
)
select EXP(SUM(LN(yourColumn))) As ColumnProduct from yourTable
/
COLUMNPRODUCT
-------------
64
1 row selected.
Regards,
Rob.
With PostgreSQL, you can create your own aggregate functions, see http://www.postgresql.org/docs/8.2/interactive/sql-createaggregate.html
To create an aggregate function on MySQL, you'll need to build an .so (linux) or .dll (windows) file. An example is shown here: http://www.codeproject.com/KB/database/mygroupconcat.aspx
I'm not sure about MSSQL and Oracle, but I bet they have options to create custom aggregates as well.
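To illustrate what a user-defined aggregate looks like in general, here is a minimal sketch using SQLite through Python's sqlite3 module, which is not one of the databases discussed above and is used only because it ships with Python. The aggregate name mul and the class name Product are made up for this example.

```python
import sqlite3


class Product:
    """Aggregate that multiplies all non-NULL inputs together."""

    def __init__(self):
        self.result = None  # stays NULL if no non-NULL rows are seen

    def step(self, value):
        if value is None:
            return  # ignore NULLs, like the built-in aggregates do
        self.result = value if self.result is None else self.result * value

    def finalize(self):
        return self.result


conn = sqlite3.connect(":memory:")
conn.create_aggregate("mul", 1, Product)  # register mul(x) with one argument
conn.execute("CREATE TABLE yourTable (yourColumn INTEGER)")
conn.executemany("INSERT INTO yourTable VALUES (?)", [(1,), (2,), (4,), (8,)])
(column_product,) = conn.execute(
    "SELECT mul(yourColumn) FROM yourTable"
).fetchone()
conn.close()

print(column_product)  # 64
```

This multiplies exactly instead of going through logarithms, but the overflow concern raised elsewhere in this thread still applies: SQLite cannot return an integer that does not fit in 64 bits.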
You'll break any datatype fairly quickly as numbers mount up.
Using LOG/EXP is tricky because of numbers <= 0 that will fail when using LOG. I wrote a solution in this question that deals with this
Using CTE in MS SQL:
CREATE TABLE Foo(Id int, Val int)
INSERT INTO Foo VALUES(1, 2), (2, 3), (3, 4), (4, 5), (5, 6)
;WITH cte AS
(
SELECT Id, Val AS Multiply, row_number() over (order by Id) as rn
FROM Foo
WHERE Id=1
UNION ALL
SELECT ff.Id, cte.multiply*ff.Val as multiply, ff.rn FROM
(SELECT f.Id, f.Val, (row_number() over (order by f.Id)) as rn
FROM Foo f) ff
INNER JOIN cte
ON ff.rn -1= cte.rn
)
SELECT * FROM cte
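The recursive CTE above builds a running product, one row per step. A simplified sketch of the same idea, run against SQLite through Python's sqlite3 module instead of MS SQL, is shown below. It assumes the Id values are consecutive, so it joins on Id + 1 directly rather than on row_number() as the original does.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Foo (Id INTEGER, Val INTEGER)")
conn.executemany(
    "INSERT INTO Foo VALUES (?, ?)",
    [(1, 2), (2, 3), (3, 4), (4, 5), (5, 6)],
)

running = conn.execute("""
    WITH RECURSIVE cte(Id, Multiply) AS (
        SELECT Id, Val FROM Foo WHERE Id = 1      -- anchor: first row
        UNION ALL
        SELECT f.Id, cte.Multiply * f.Val         -- previous product * next value
        FROM Foo AS f
        JOIN cte ON f.Id = cte.Id + 1             -- assumes consecutive Ids
    )
    SELECT Id, Multiply FROM cte ORDER BY Id
""").fetchall()
conn.close()

print(running)  # [(1, 2), (2, 6), (3, 24), (4, 120), (5, 720)]
```

The last row holds the product of the whole column; the earlier rows are the intermediate products.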
Not sure about Oracle or sql-server, but in MySQL you can just use * like you normally would.
mysql> select count(id), count(id)*10 from tablename;
+-----------+--------------+
| count(id) | count(id)*10 |
+-----------+--------------+
| 961 | 9610 |
+-----------+--------------+
1 row in set (0.00 sec)

most efficient way to search SQL table using result from same table until final value is found

In the image above, which represents an SQL table, I would like to search for 1111 and retrieve its last replacement number, which should be 4444; 1111 is a single number that is ultimately replaced by the single number 4444. Then I would like to search for 5555, which should return (6666, 9999, 8888). NB: 9999 replaced 7777.
So 1111 was a single part number replaced multiple times, and 5555 was a group number that breaks down into multiple parts, one of which was replaced (7777 >> 9999).
What would be the fastest and most efficient method?
If possible, I would like a solution in SQL for efficiency.
If that is not possible within SQL, then from within PHP.
What I have tried:
1) while loop. but need to access database 1000 times for 1000 replaced numbers. ##too inefficient.
2)
SELECT C.RPLPART
FROM TABLE A
left join TABLE B on A.RPLPART=B.PART#
left join TABLE C on B.RPLPART=C.PART#
WHERE A.PART#='1111' ##Unable to know when last number is reached.
A recursive Common Table Expression (CTE) would seem to be the ticket.
Something like so...
with rcte (lvl, topPart, part#, rplpart) as
(select 1 as lvl, part#, part#, rplpart
from MYTABLE
union all
select p.lvl + 1, p.topPart, c.part#, c.rplpart
from rcte p, MYTABLE c
where p.rplpart = c.part#
)
select topPart, rplpart
from rcte
where toppart = 1111
order by lvl desc
fetch first row only;
You can do this using a recursive CTE that generates a complete replacement chain for a given starting part id, and then limit the result to just those that don't exist in the part# column:
WITH cte(part) AS
(SELECT replpart FROM parts WHERE part# = 1111
UNION ALL
SELECT parts.replpart FROM parts, cte WHERE parts.part# = cte.part)
SELECT DISTINCT part
FROM cte
WHERE part NOT IN (SELECT part# FROM parts);
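The chain-following query can be tried out on a small data set. The sketch below uses SQLite through Python's sqlite3 module. The table contents are hypothetical, reconstructed from the description in the question (the intermediate numbers 2222 and 3333 are invented), and the columns are named part and rplpart because # is not allowed in an unquoted SQLite identifier.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE parts (part INTEGER, rplpart INTEGER)")
conn.executemany("INSERT INTO parts VALUES (?, ?)", [
    (1111, 2222), (2222, 3333), (3333, 4444),   # single part, replaced 3 times
    (5555, 6666), (5555, 7777), (5555, 8888),   # group broken into 3 parts
    (7777, 9999),                               # one of them replaced again
])

FINAL_PARTS = """
WITH RECURSIVE chain(part) AS (
    SELECT rplpart FROM parts WHERE part = ?
    UNION ALL
    SELECT p.rplpart FROM parts AS p JOIN chain AS c ON p.part = c.part
)
SELECT DISTINCT part
FROM chain
WHERE part NOT IN (SELECT part FROM parts)   -- keep only parts never replaced
ORDER BY part
"""


def final_replacements(start_part):
    """Return the parts at the end of every replacement chain."""
    rows = conn.execute(FINAL_PARTS, (start_part,)).fetchall()
    return [row[0] for row in rows]


print(final_replacements(1111))  # [4444]
print(final_replacements(5555))  # [6666, 8888, 9999]
```

The whole chain is resolved in one round trip to the database. If the data could contain a cycle (a part that eventually replaces itself), UNION ALL would recurse forever, so such data would need an extra guard.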

How to generate consecutive integers in Azure SQL?

I am interested in generating a series of consecutive integers from 1 to 1000 (for example) and storing these numbers in each row of some table. I would like to do this in Microsoft Azure SQL but I am not sure if arrays are even supported.
One relatively simple method is a recursive CTE:
with n as (
select 1 as n
union all
select n + 1
from n
where n < 1000
)
select n.n
from n
option (maxrecursion 0);
Another mechanism to solve something like this could be to use a SEQUENCE on the table. It's similar to an IDENTITY column (they actually have the same behavior under the covers) without some of the restrictions. Just reset it to a new seed value as you add data to the table.
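Since the question asks about storing the numbers in a table, here is a sketch of the recursive CTE feeding an INSERT. It runs against SQLite through Python's sqlite3 module rather than Azure SQL, so there is no maxrecursion option to set; the table name numbers and the CTE name seq are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE numbers (n INTEGER PRIMARY KEY)")

# Generate 1..1000 with a recursive CTE and insert one row per number.
conn.execute("""
    WITH RECURSIVE seq(n) AS (
        SELECT 1
        UNION ALL
        SELECT n + 1 FROM seq WHERE n < 1000
    )
    INSERT INTO numbers (n)
    SELECT n FROM seq
""")

row_count, lowest, highest = conn.execute(
    "SELECT count(*), min(n), max(n) FROM numbers"
).fetchone()
conn.close()

print(row_count, lowest, highest)  # 1000 1 1000
```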

Browse subcolumns, but discard some

I have a table (or view) in my PostgreSQL database and want to do the following:
Query the table and feed a function in my application subsequent n-tuples of rows from the query, but only those that satisfy some condition. I can do the n-tuple listing using a cursor, but I don't know how to do the condition checking on database level.
For example, the query returns:
3
2
4
2
0
1
4
6
2
And I want triples of even numbers. Here, they would be:
(2,4,2) (4,2,0) (4,6,2)
Obviously, I cannot discard the odd numbers from the query result. Instead of using a cursor, a query returning arrays in a similar manner would also be an acceptable solution, but I don't have a good idea of how to use them for this.
Of course, I could check it at application level, but I think it'd be cleaner to do it on database level. Is it possible?
With the window function lead() (as mentioned by @wildplasser):
SELECT *
FROM (
SELECT tbl_id, i AS i1
, lead(i) OVER (ORDER BY tbl_id) AS i2
, lead(i, 2) OVER (ORDER BY tbl_id) AS i3
FROM tbl
) sub
WHERE i1%2 = 0
AND i2%2 = 0
AND i3%2 = 0;
There is no natural order of rows - assuming you want to order by tbl_id in the example.
% .. modulo operator
You can also use an array aggregate for this instead of using lead():
SELECT
a[1] a1, a[2] a2, a[3] a3
FROM (
SELECT
array_agg(i) OVER (ORDER BY tbl_id ROWS BETWEEN 2 PRECEDING AND CURRENT ROW)
FROM
tbl
) x(a)
WHERE a[1] % 2 = 0 AND a[2] % 2 = 0 AND a[3] % 2 = 0;
No idea if this'll be better, worse, or the same as Erwin's answer, just putting it in for completeness.
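Both answers rely on the same sliding-window idea: pair every row with the two rows next to it, then keep only the windows in which all three values are even. A small Python model of that logic (not of the SQL itself; the function name even_triples is made up) reproduces the result expected in the question:

```python
def even_triples(values):
    """Return every run of three consecutive values that are all even."""
    # zip over three shifted views of the list yields the sliding windows,
    # much like i, lead(i) and lead(i, 2) over an ordered result set.
    windows = zip(values, values[1:], values[2:])
    return [w for w in windows if all(v % 2 == 0 for v in w)]


rows = [3, 2, 4, 2, 0, 1, 4, 6, 2]
print(even_triples(rows))  # [(2, 4, 2), (4, 2, 0), (4, 6, 2)]
```

As in the SQL versions, the result depends on the row order, which is why the queries need an explicit ORDER BY inside the window definition.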

select in sql server 2005

I have the following table:
ID | first | end
--------------------
a | 1 | 3
b | 3 | 8
c | 8 | 10
I want to select the following:
ID | first | end
---------------------
a-c | 1 | 10
But I can't do it. Please help me. Thanks!
This works for me:
SELECT MIN(t.id)+'-'+MAX(t.id) AS ID,
MIN(t.[first]) AS first,
MAX(t.[end]) AS [end]
FROM dbo.YOUR_TABLE t
But please, do not use reserved words like "end" for column names.
I believe you can do this using a recursive Common Table Expression as follows, especially if you're not expecting very long chains of records:
WITH Ancestors AS
(
SELECT
InitRow.[ID] AS [Ancestor],
InitRow.[ID],
InitRow.[first],
InitRow.[end],
0 AS [level],
'00000' + InitRow.[ID] AS [hacky_level_plus_ID]
FROM
YOUR_TABLE AS InitRow
WHERE
NOT EXISTS
(
SELECT * FROM YOUR_TABLE AS PrevRow
WHERE PrevRow.[end] = InitRow.[first]
)
UNION ALL
SELECT
ParentRow.Ancestor,
ChildRow.[ID],
ChildRow.[first],
ChildRow.[end],
ParentRow.level + 1 AS [level],
-- Avoids having to build the recursive structure more than once.
-- We know we will not be over 5 digits since CTEs have a recursion
-- limit of 32767.
RIGHT('00000' + CAST(ParentRow.level + 1 AS varchar(5)), 5)
+ ChildRow.[ID] AS [hacky_level_plus_ID]
FROM
Ancestors AS ParentRow
INNER JOIN YOUR_TABLE AS ChildRow
ON ChildRow.[first] = ParentRow.[end]
)
SELECT
Ancestors.Ancestor + '-' + SUBSTRING(MAX([hacky_level_plus_ID]),6,10) AS [IDs],
-- Without the [hacky_level_plus_ID] column, you need to do it this way:
-- Ancestors.Ancestor + '-' +
-- (SELECT TOP 1 Children.ID FROM Ancestors AS Children
-- WHERE Children.[Ancestor] = Ancestors.[Ancestor]
-- ORDER BY Children.[level] DESC) AS [IDs],
MIN(Ancestors.[first]) AS [first],
MAX(Ancestors.[end]) AS [end]
FROM
Ancestors
GROUP BY
Ancestors.Ancestor
-- If needed, add OPTION (MAXRECURSION 32767)
A quick explanation of what each part does:
The WITH Ancestors AS (...) clause creates a Common Table Expression (basically a subquery) with the name Ancestors. The first SELECT in that expression establishes a baseline: all the rows that have no matching entry prior to it.
Then, the second SELECT is where the recursion kicks in. Since it references Ancestors as part of the query, it uses the rows it has already added to the table and then performs a join with new ones from YOUR_TABLE. This will recursively find more and more rows to add to the end of each chain.
The last clause is the SELECT that uses this recursive table we've built up. It does a simple GROUP BY since we've saved off the original ID in the Ancestor column, so the start and end are a simple MIN and MAX.
The tricky part is figuring out the ID of the last row in the chain. There are two ways to do it, both illustrated in the query. You can either join back with the recursive table, in which case it will build the recursive table all over again, or you can attempt to keep track of the last item as you go. (If building the recursive list of chained records is expensive, you definitely want to minimize the number of times you need to do that.)
The way it keeps track as it goes is to keep track of its position in the chain (the level column -- notice how we add 1 each time we recurse), zero-pad it, and then stick the ID at the end. Then, getting the item with the max level is simply a MAX followed by stripping the level data out.
If the CTE has to recurse too much, it will generate an error, but I believe you can tweak that using the MAXRECURSION option. The default is 100. If you have to set it higher than that, you may want to consider not using a recursive CTE to do this.
This also doesn't handle malformed data very well. If you have two records with the same first or a record where first == end, then this won't work right and you may have to tweak the join conditions inside the CTE or go with another approach.
This isn't the only way to do it. I believe it would be easier to follow if you built a custom procedure and did all the steps manually. But this has the advantage of operating in a single statement.
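For comparison, the "do all the steps manually" approach might look like the following Python sketch. It is a model of the chaining logic, not a stored procedure, and the function name merge_chains is hypothetical. Like the CTE, it assumes well-formed data: no two rows share the same first value.

```python
def merge_chains(rows):
    """rows: list of (id, first, end). Rows are chained when one row's
    'end' equals another row's 'first'. Returns (ids, first, end) per chain."""
    ends = {end for _, _, end in rows}
    by_first = {first: (row_id, end) for row_id, first, end in rows}
    merged = []
    for row_id, first, end in rows:
        if first in ends:
            continue  # some other row leads into this one: not a chain start
        last_id, chain_end = row_id, end
        visited = {first}  # guard against loops such as first == end
        while chain_end in by_first and chain_end not in visited:
            visited.add(chain_end)
            last_id, chain_end = by_first[chain_end]
        merged.append((f"{row_id}-{last_id}", first, chain_end))
    return merged


table = [("a", 1, 3), ("b", 3, 8), ("c", 8, 10)]
print(merge_chains(table))  # [('a-c', 1, 10)]
```

The chain-start test plays the role of the NOT EXISTS anchor in the CTE, and the while loop corresponds to the recursive join on first = end.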