How to restrict recursive CTE row count

I have a function that generates a record id, and I want to use a CTE to get a batch of record ids.
The recursive CTE currently looks like this:
with T as (
select
dbo.Ufn_GetRecordId() AS recordId
union all
SELECT
dbo.Ufn_GetRecordId() AS recordId
FROM T
)select * from T
OPTION (MaxRecursion 0);
However, this query never terminates. How can I restrict the number of rows the CTE produces (e.g. if I only need 3 rows in T)?

You can try something like below. Idea taken from SQL Server: How to limit CTE recursion to rows just recursivly added?
with T as (
select
dbo.Ufn_GetRecordId() AS recordId, 1 as testnum
union all
SELECT
dbo.Ufn_GetRecordId() AS recordId, testnum + 1
FROM T
WHERE testnum < 3
)select * from T
OPTION (MaxRecursion 0);
This will restrict to 3 returned rows.

This is a fairly standard way of generating N rows with a recursive CTE.
WITH T
AS (SELECT 1 AS Dummy
UNION ALL
SELECT Dummy + 1
FROM T
WHERE Dummy < 3)
SELECT dbo.Ufn_GetRecordId() AS RecordId
FROM T;
If you need to generate more than 100 numbers then you'll need OPTION (MAXRECURSION 0) (or some suitable value instead of 0).
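The counter-bounded pattern above can be tried outside SQL Server. The following is a minimal sketch, assuming SQLite through Python's sqlite3 module; get_record_id is a hypothetical stand-in for dbo.Ufn_GetRecordId(), and SQLite has no default recursion cap, so no MAXRECURSION hint appears.

```python
import itertools
import sqlite3


def generate_record_ids(n):
    """Return one id per row of a recursive CTE that stops after n rows."""
    conn = sqlite3.connect(":memory:")
    counter = itertools.count(1)
    # get_record_id is a hypothetical stand-in for dbo.Ufn_GetRecordId().
    conn.create_function("get_record_id", 0, lambda: next(counter))
    rows = conn.execute(
        """
        WITH RECURSIVE t(i) AS (
            SELECT 1                          -- anchor row
            UNION ALL
            SELECT i + 1 FROM t WHERE i < ?   -- termination condition
        )
        SELECT get_record_id() AS record_id FROM t
        """,
        (n,),
    ).fetchall()
    conn.close()
    return [row[0] for row in rows]


if __name__ == "__main__":
    print(generate_record_ids(3))
```

The WHERE clause in the recursive member is what stops the recursion; the anchor row is always emitted, so the sketch returns at least one row even for n below 1.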

Related

Want multiple Rows from 1 Row

I have a table in a SQL Server like this (example):
ID  ARTICLE  ARTTEXT  COUNT
1   123456   Test1    5
2   324644   blabla   1
3   765456   nanana   12
These items need to be labelled, i.e. each copy needs its own label. I then print the labels via SSRS.
So I need 5 labels for ID 1, 1 label for ID 2, and 12 labels for ID 3.
The question is what the SELECT should look like to get 5 rows from ID 1, 1 row from ID 2 and 12 rows from ID 3.
I guess a CTE would work, but it's not clear to me how to repeat each record x times.
I look forward to your ideas.
I would use a Tally over the (far slower) rCTE solution. I use an inline tally here. If you need more than 100 rows, simply add more cross joins to N in the CTE defined as Tally (each cross join increases the maximum number of rows by a factor of 10).
CREATE TABLE dbo.YourTable (ID int,
Article int,
Arttext varchar(15),
[Count] int);
INSERT INTO dbo.YourTable
VALUES(1,123456,'Test1',5),
(2,324644,'blabla',1),
(3,765456,'nanana',12);
GO
WITH N AS(
SELECT N
FROM (VALUES(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL))N(N)),
Tally AS(
SELECT TOP (SELECT MAX([Count]) FROM dbo.YourTable)
ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS I
FROM N N1, N N2) --100 rows, add more cross joins for more rows
SELECT YT.ID,
YT.Article,
YT.Arttext,
T.I AS [Count]
FROM dbo.YourTable YT
JOIN Tally T ON YT.[Count] >= T.I
ORDER BY YT.ID,
T.I;
GO
DROP TABLE dbo.YourTable;
If I understand correctly, you want to multiply the number of rows based on count. One method uses a recursive CTE:
with cte as (
select id, article, arttext, count
from t
union all
select id, article, arttext, count - 1
from cte
where count > 1
)
select id, article, arttext
from cte;
If the count exceeds 100, then you want option (maxrecursion 0).
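The row-multiplication idea from this thread can be checked end to end. This is a minimal sketch, assuming SQLite through Python's sqlite3 module rather than SQL Server; the table name your_table, the column cnt (renamed from COUNT to avoid the keyword) and the alias label_no are illustrative choices, and a small recursive CTE stands in for the inline tally.

```python
import sqlite3

# Sample data from the question: (id, article, arttext, count).
SAMPLE = [
    (1, 123456, "Test1", 5),
    (2, 324644, "blabla", 1),
    (3, 765456, "nanana", 12),
]


def expand_by_count(rows):
    """Repeat each input row as many times as its count column says."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE your_table "
        "(id INTEGER, article INTEGER, arttext TEXT, cnt INTEGER)"
    )
    conn.executemany("INSERT INTO your_table VALUES (?, ?, ?, ?)", rows)
    result = conn.execute(
        """
        WITH RECURSIVE tally(i) AS (
            SELECT 1
            UNION ALL
            SELECT i + 1 FROM tally
            WHERE i < (SELECT MAX(cnt) FROM your_table)
        )
        SELECT yt.id, yt.article, yt.arttext, tally.i AS label_no
        FROM your_table AS yt
        JOIN tally ON yt.cnt >= tally.i   -- one output row per copy
        ORDER BY yt.id, tally.i
        """
    ).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in expand_by_count(SAMPLE):
        print(row)
```

The join condition does the multiplication: a row with count 5 matches tally values 1 through 5, so it appears five times.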

How to copy a row n times in sqlite?

For a load test I have to copy a row in my table in a sqlite database 1000 (5000, 10000) times.
With the statement
INSERT INTO MYTABLE (
created,
modified,
anotherfield,
etc
)
SELECT created,
modified,
anotherfield,
etc FROM MYTABLE WHERE id = 1;
I can copy it one time. But it would be great to be able to put this into a loop to execute this statement n times.
It seems like SQLite does not support for-loops. I found something called WITH RECURSIVE which could be something like the SQLite way to handle loops. But if I execute
WITH RECURSIVE
cnt(x) AS (VALUES(1) UNION ALL SELECT x+1 FROM cnt WHERE x<1000)
<insert_statement_from_above>
the insert statement gets executed only once.
What am I doing wrong? How can I get to insert 1000 (5000, 10000) rows without having to add them all one by one? Thanks!
You must CROSS join the table to the recursive cte to produce 1000 rows:
WITH RECURSIVE cte(x) AS (SELECT 1 UNION ALL SELECT x + 1 FROM cte WHERE x < 1000)
INSERT INTO MYTABLE (created, modified, anotherfield)
SELECT m.created, m.modified, m.anotherfield
FROM MYTABLE m CROSS JOIN cte c
WHERE m.id = 1;
See the demo (for 3 rows).
Another way to use the recursive cte:
WITH RECURSIVE cte AS (
SELECT created, modified, anotherfield, 1 x
FROM MYTABLE
WHERE id = 1
UNION ALL
SELECT created, modified, anotherfield, x + 1
FROM cte
WHERE x < 1000
)
INSERT INTO MYTABLE (created, modified, anotherfield)
SELECT created, modified, anotherfield
FROM cte;
See the demo.
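For a quick local check of the cross-join variant, here is a minimal sketch using Python's sqlite3 module. The column types and the seed row are assumptions made for illustration; only the table and column names come from the question.

```python
import sqlite3


def copy_row(conn, source_id, n):
    """Insert n copies of the row with id = source_id into mytable."""
    conn.execute(
        """
        WITH RECURSIVE cte(x) AS (
            SELECT 1 UNION ALL SELECT x + 1 FROM cte WHERE x < ?
        )
        INSERT INTO mytable (created, modified, anotherfield)
        SELECT m.created, m.modified, m.anotherfield
        FROM mytable AS m CROSS JOIN cte AS c
        WHERE m.id = ?
        """,
        (n, source_id),
    )
    conn.commit()


def demo(n=1000):
    """Seed one row, copy it n times, and return the final row count."""
    conn = sqlite3.connect(":memory:")
    # Column types are assumed; the question does not state them.
    conn.execute(
        "CREATE TABLE mytable (id INTEGER PRIMARY KEY, "
        "created TEXT, modified TEXT, anotherfield TEXT)"
    )
    conn.execute(
        "INSERT INTO mytable (created, modified, anotherfield) "
        "VALUES ('2020-01-01', '2020-01-02', 'x')"
    )
    copy_row(conn, 1, n)
    total = conn.execute("SELECT COUNT(*) FROM mytable").fetchone()[0]
    conn.close()
    return total


if __name__ == "__main__":
    print(demo(1000))
```

The CTE on its own only defines a row source; the INSERT produces multiple rows because its SELECT joins the source row against every value of x.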

Using INSERT with CTE

For a somewhat complex SQL script I need the following mapping:
WITH days_mapping AS (SELECT 1 AS day
UNION ALL
SELECT 2 AS day
UNION ALL
...
SELECT 31 AS day)
Is there any way to create the same mapping but without manually writing a SELECT and UNION ALL for every single number/day that should be in this mapping? I was thinking of doing an INSERT in a WHILE loop instead of the SELECT but I don't know how or if it is even possible to do that with common table expressions.
You can use a recursive CTE:
with days_mapping as (
select 1 as day
union all
select day + 1
from days_mapping
where day < 31
)
select *
from days_mapping;
Here is a db<>fiddle.
Note: If more than 100 rows are being generated, you need to add option (maxrecursion 0) at the end of the query.

Selecting a sequence in SQL

There seem to be a few blog posts on this topic, but the solutions really are not so intuitive. Surely there's a "Canonical" way?
I'm using Teradata SQL.
How would I select
A range of numbers
A date range
E.g.
SELECT 1:10 AS Nums
SELECT 1-1-2010:5-1-2014 AS Dates1
The result would be 10 rows (1 - 10) in the first SELECT query and ~(365 * 3.5) rows in the second?
The "canonical" way to do this in SQL is using recursive CTEs, which the more recent versions of Teradata support.
For your first example:
with recursive nums(n) as (
select 1 as n
union all
select n + 1
from nums
where n < 10
)
select *
from nums;
You can do something similar for dates.
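As an illustration of the date variant, here is a minimal sketch that runs the same recursive shape in SQLite through Python's sqlite3 module. The date() function and the '+1 day' modifier are SQLite features, not Teradata syntax, so treat this as a sketch of the idea rather than a Teradata query.

```python
import sqlite3


def date_range(start, end):
    """Return every ISO date from start to end inclusive, one per row."""
    conn = sqlite3.connect(":memory:")
    rows = conn.execute(
        """
        WITH RECURSIVE dates(d) AS (
            SELECT date(?)
            UNION ALL
            SELECT date(d, '+1 day') FROM dates WHERE d < date(?)
        )
        SELECT d FROM dates
        """,
        (start, end),
    ).fetchall()
    conn.close()
    return [row[0] for row in rows]


if __name__ == "__main__":
    days = date_range("2010-01-01", "2014-05-01")
    print(days[0], days[-1], len(days))
```

ISO-formatted date strings sort in calendar order, which is why the plain string comparison in the WHERE clause is enough to stop the recursion at the end date.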
EDIT:
You can also do this by using row_number() and an existing table:
with nums(n) as (
select n
from (select row_number() over (order by col) as n
from ExistingTable t
) t
where n <= 10
)
select *
from nums;
ExistingTable is just any table with enough rows. The best choice of col is the primary key.
Alternatively, you can list the values explicitly:
with digits(n) as (
select 1 as n union all select 2 union all select 3 union all select 4 union all select 5 union all
select 6 union all select 7 union all select 8 union all select 9 union all select 10
)
select *
from digits;
If your version of Teradata supports multiple CTEs, you can build on the above:
with digits(n) as (
select 1 as n union all select 2 union all select 3 union all select 4 union all select 5 union all
select 6 union all select 7 union all select 8 union all select 9 union all select 10
),
nums(n) as (
select d1.n*100 + d2.n*10 + d3.n
from digits d1 cross join digits d2 cross join digits d3
)
select *
from nums;
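One detail worth checking in the cross-join query above: with digits running from 1 to 10, the expression d1.n*100 + d2.n*10 + d3.n yields 1000 distinct values from 111 to 1110. If a contiguous range starting at zero is wanted, digits 0 to 9 give 0 to 999. The following is a minimal sketch of that variant, assuming SQLite through Python's sqlite3 module; the VALUES form of the digits CTE is a SQLite convenience and not taken from the answer.

```python
import sqlite3

DIGITS_SQL = """
WITH digits(n) AS (
    VALUES (0), (1), (2), (3), (4), (5), (6), (7), (8), (9)
),
nums(n) AS (
    -- each digits copy supplies one decimal place
    SELECT d1.n * 100 + d2.n * 10 + d3.n
    FROM digits AS d1 CROSS JOIN digits AS d2 CROSS JOIN digits AS d3
)
SELECT n FROM nums ORDER BY n
"""


def numbers_0_to_999():
    """Build 0..999 from three cross-joined copies of the digits 0..9."""
    conn = sqlite3.connect(":memory:")
    rows = conn.execute(DIGITS_SQL).fetchall()
    conn.close()
    return [row[0] for row in rows]


if __name__ == "__main__":
    nums = numbers_0_to_999()
    print(nums[0], nums[-1], len(nums))
```

Adding 1 to the expression would shift the range to 1 through 1000 if a one-based sequence is preferred.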
In Teradata you can use the existing sys_calendar to get those dates:
SELECT calendar_date
FROM sys_calendar.CALENDAR
WHERE calendar_date BETWEEN DATE '2010-01-01' AND DATE '2014-05-01';
Note:
DATE '2010-01-01' is the only recommended way to write a date in Teradata
There's probably another custom calendar for the specific business needs of your company, too. Everyone will have access rights to it.
You might also use this for the range of numbers:
SELECT day_of_calendar
FROM sys_calendar.CALENDAR
WHERE day_of_calendar BETWEEN 1 AND 10;
But you should check Explain to see if the estimated number of rows is correct. sys_calendar is a kind of template and day_of_calendar is a calculated column, so no statistics exist on it and Explain will return an estimate of 14683 rows (20 percent of the number of rows in that table) instead of 10. If you use it in additional joins, the optimizer might produce a bad plan based on that totally wrong number.
Note:
If you use sys_calendar you are limited to a maximum of 73414 rows: dates between 1900-01-01 and 2100-12-31, and numbers between 1 and 73414. Your business calendar might vary.
Gordon Linoff's recursive query is not really efficient in Teradata, as it's a sequential row-by-row processing in a parallel database (each loop is an "all-AMPs step" in Explain) and the optimizer doesn't know how many rows will be returned.
If you need those ranges regularly, you might consider creating a numbers table. I usually have one with a million rows, or I use my calendar with the full range of 10000 years :-)
--DROP TABLE nums;
CREATE TABLE nums(n INT NOT NULL PRIMARY KEY CHECK (n BETWEEN 0 AND 999999));
INSERT INTO Nums
WITH cte(n) AS
(
SELECT day_of_calendar - 1
FROM sys_calendar.CALENDAR
WHERE day_of_calendar BETWEEN 1 AND 1000
)
SELECT
t1.n +
t2.n * 1000
FROM cte t1 CROSS JOIN cte t2;
COLLECT STATISTICS COLUMN(n) ON Nums;
The COLLECT STATS is the most important step to get correct estimates.
Now it's a simple
SELECT n FROM nums WHERE n BETWEEN 1 AND 10;
There's also a nice UDF on GitHub for creating sequences which is easy to use:
SELECT DATE '2010-01-01' + SEQUENCE
FROM TABLE(gen_sequence(0,DATE '2014-05-01' - DATE '2010-01-01')) AS t;
SELECT SEQUENCE
FROM TABLE(gen_sequence(1,10)) AS t;
But it's usually hard to convince your DBA to install any C-UDFs and the number of rows returned is unknown again.
A sequence from 1 to 10:
sel sum (1) over (ROWS UNBOUNDED PRECEDING) as seq_val
from sys_calendar.CALENDAR
qualify row_number () over (order by 1)<=10

SQL Server Top 1

In Microsoft SQL Server 2005 or above, I would like to get the first row, and if there is no matching row, then return a row with default values.
SELECT TOP 1 ID,Name
FROM TableName
UNION ALL
SELECT 0,''
ORDER BY ID DESC
This works, except that it returns two rows if there is data in the table, and 1 row if not.
I'd like it to always return 1 row.
I think it has something to do with EXISTS, but I'm not sure.
It would be something like:
SELECT TOP 1 * FROM Contact
WHERE EXISTS(select * from contact)
But if not EXISTS, then SELECT 0,''
What happens when the table is very full and you might want to specify which row of your top 1 to get, such as the first name? OMG Ponies' query will return the wrong answer in that case if you just change the ORDER BY clause. His query also costs about 8% more CPU than this modification (though it has equal reads)
SELECT TOP 1 *
FROM (
SELECT * FROM (
SELECT TOP 1 ID,Name
FROM TableName
ORDER BY Name
) T -- derived table, because ORDER BY is not allowed directly before UNION ALL
UNION ALL
SELECT 0,''
) X
ORDER BY ID DESC
The difference is that the inner query has a TOP 1 also, and which TOP 1 can be specified there (as shown).
Just for fun, this is another way to do it which performs very closely to the above query (-15ms to +30ms). While it's more complicated than necessary for such a simple query, it demonstrates a technique that I don't see other SQL folks using very often.
SELECT
ID = Coalesce(T.ID, 0),
Name = Coalesce(T.Name, '')
FROM
(SELECT 1) X (Num)
LEFT JOIN (
SELECT TOP 1 ID, Name
FROM TableName
ORDER BY ID DESC
) T ON 1 = 1 -- effective cross join but does not limit rows in the first table
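The one-row LEFT JOIN technique can be exercised against both an empty and a populated table. This is a minimal sketch, assuming SQLite through Python's sqlite3 module, so LIMIT 1 replaces TOP 1 and the column aliases use AS instead of the T-SQL "ID = ..." form; the helper name top_or_default is hypothetical.

```python
import sqlite3

QUERY = """
SELECT COALESCE(t.id, 0)    AS id,
       COALESCE(t.name, '') AS name
FROM (SELECT 1 AS num) AS x          -- always exactly one row
LEFT JOIN (
    SELECT id, name FROM tablename ORDER BY id DESC LIMIT 1
) AS t ON 1 = 1                      -- keeps x's row even if t is empty
"""


def top_or_default(rows):
    """Return the row with the highest id, or (0, '') for an empty table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE tablename (id INTEGER, name TEXT)")
    conn.executemany("INSERT INTO tablename VALUES (?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    print(top_or_default([]))
    print(top_or_default([(1, "a"), (7, "b")]))
```

One caveat of this approach: because the defaults come from COALESCE, a real row whose name is NULL is reported with an empty string, which may or may not be what is wanted.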
Use:
SELECT TOP 1
x.id,
x.name
FROM (SELECT t.id,
t.name
FROM TABLENAME t
UNION ALL
SELECT 0,
'') x
ORDER BY id DESC
Using a CTE equivalent:
WITH query AS (
SELECT t.id,
t.name
FROM TABLENAME t
UNION ALL
SELECT 0,
'')
SELECT TOP 1
x.id,
x.name
FROM query x
ORDER BY x.id DESC
CREATE TABLE #sample(id INT, data VARCHAR(10))
SELECT TOP 1 id, data INTO #temp FROM #sample
IF @@ROWCOUNT = 0 INSERT INTO #temp VALUES (null, null)
SELECT * FROM #temp
Put the TOP outside of the UNION query:
SELECT TOP 1 * FROM(
SELECT ID,Name
FROM TableName
UNION ALL
SELECT 0,''
) z
ORDER BY ID DESC
IF EXISTS ( SELECT TOP 1 ID, Name FROM TableName )
BEGIN
SELECT TOP 1 ID, Name FROM TableName
END
ELSE
BEGIN
--exists returned no rows
--send a default row
SELECT 0, ''
END