Using a recursive SQL query for non-parent-child data

I'm not new to SQL and T-SQL, but I have never used a recursive query before; I solved every problem with WHILE or a CURSOR. I have one question: how do I organize a recursive query for the following problem? I want to work with the last row of data in a given partition, and I can't figure out how to stop the recursion at the last level of the partition.
CREATE TABLE #temp
(i int
, s int
, v int);
INSERT INTO #temp
SELECT 1, 1, 10
UNION
SELECT 1, 2, 20
UNION
SELECT 2, 1, 5
UNION
SELECT 2, 2, 5
UNION
SELECT 2, 3, 2;
WITH CTE AS
(
SELECT i
, s
, v
FROM #temp
WHERE s=1
UNION ALL
SELECT t.i
, t.s
, t.v + cte.v as new_v
FROM #temp t
INNER JOIN cte
ON (cte.i=t.i)
WHERE t.s>1
)
SELECT *
FROM cte
OPTION(MAXRECURSION 0)
I want to get 5 rows as the result.
I know that it could be solved with OUTER APPLY, JOINs, WHILE or a CURSOR. Could you please share anything that would help me understand how to get the same result with a recursive CTE query? The SUM here is just an example. For my real problem a recursive query is the best fit, because I will use many scalar functions in a big CASE that uses the value from the previous row in the partition and the value of the current row.
Thanks.
Sorry for my bad English.
Would it be clearer if I restated the problem with the following example? I guess I need to state explicitly in which order the recursive query should process the data. The code below should help you understand what I want to solve:
CREATE TABLE #temp
(i_key int
, step int
, step_h int
, value int);
INSERT INTO #temp
SELECT 1, 1, NULL, 20
UNION
SELECT 1, 2, 1, 20
UNION
SELECT 2, 1, NULL, 10
UNION
SELECT 2, 2, 1, 10
UNION
SELECT 2, 3, 2, 5;
WITH CTE AS
(
SELECT i_key
, step
, value
FROM #temp
WHERE step=1
--AND i_key=2
UNION ALL
SELECT t.i_key
, t.step
, CASE
WHEN cte.value - t.value <=0 THEN 0
ELSE cte.value - t.value
END as value
FROM #temp t
INNER JOIN cte
ON (cte.i_key=t.i_key
AND cte.step=t.step_h)
--WHERE t.step>1
)
SELECT *
FROM CTE
OPTION(MAXRECURSION 0)
Is a parent-child structure always needed to solve this kind of problem?
I guess it could also be done with a different join condition (without a parent-child column):
AND cte.step=t.step-1

For your particular example, recursion is unnecessary. All you need is SQL Server 2012 or later:
select t.*,
sum(t.v) over(partition by t.i order by t.s) as [RT]
from #temp t
order by t.i, t.s;
If you need to access the previous / next row, there are the lag() / lead() analytic functions, which were introduced in the same version of SQL Server.
EDIT: Ah, I see. You simply want to know how to write recursive CTEs properly. Here is a (seemingly) correct code for your second example:
with cte as (
select t.i_key, t.step, t.value
from #temp t
where t.step_h is null
union all
select c.i_key, t.step, case
when c.value < t.value then 0
else c.value - t.value
end as [Value]
from #temp t
inner join cte c on c.step = t.step_h
and c.i_key = t.i_key
)
select *
from cte c
order by c.i_key, c.step;
In the end, it stops by itself when an iteration does not produce any new rows.
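To see the termination behaviour outside SQL Server, here is a minimal sketch that runs the same recursive CTE through Python's built-in sqlite3 module. It assumes SQLite as a stand-in engine (so the keyword is WITH RECURSIVE) and uses a hypothetical table name, temp_steps, because SQLite has no #temp tables. The data is the second example from the question.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for SQL Server (assumption).
conn = sqlite3.connect(":memory:")
# temp_steps is a hypothetical name replacing #temp from the question.
conn.execute("CREATE TABLE temp_steps (i_key INT, step INT, step_h INT, value INT)")
conn.executemany(
    "INSERT INTO temp_steps VALUES (?, ?, ?, ?)",
    [(1, 1, None, 20), (1, 2, 1, 20), (2, 1, None, 10), (2, 2, 1, 10), (2, 3, 2, 5)],
)

query = """
WITH RECURSIVE cte(i_key, step, value) AS (
    -- anchor member: the first step of every partition
    SELECT i_key, step, value
    FROM temp_steps
    WHERE step_h IS NULL
    UNION ALL
    -- recursive member: attach the next step to the row produced before it
    SELECT t.i_key, t.step,
           CASE WHEN c.value < t.value THEN 0 ELSE c.value - t.value END
    FROM temp_steps t
    JOIN cte c ON c.i_key = t.i_key AND c.step = t.step_h
)
SELECT i_key, step, value FROM cte ORDER BY i_key, step
"""

rows = conn.execute(query).fetchall()
print(rows)
```

Each pass joins only the rows produced by the previous pass, so once the last step of every partition has been emitted the join returns nothing and the recursion ends without any explicit stop condition.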

Related

Get every combination of sort order and value of a csv

If I have a string with numbers separated by commas, like this:
Declare @string varchar(20) = '123,456,789'
And would like to return every possible combination + sort order of the values by doing this:
Select Combination FROM dbo.GetAllCombinations(@string)
Which would in result return this:
123
456
789
123,456
456,123
123,789
789,123
456,789
789,456
123,456,789
123,789,456
456,789,123
456,123,789
789,456,123
789,123,456
As you can see, not only is every combination returned, but also every sort order of each combination. The example shows only 3 comma-separated values, but the solution should handle any number of values (recursively).
The logic needed would be somewhere in the realm of using a WITH CUBE statement, but the problem with using WITH CUBE (in a table structure instead of CSV of course), is that it won't shuffle the order of the values 123,456 456,123 etc., and will only provide each combination, which is only half of the battle.
Currently I have no idea what to try. If someone can provide some assistance it would be appreciated.
I use a user-defined table-valued function called split_delimiter that takes 2 values: the @delimited_string and the @delimiter_type.
CREATE FUNCTION [dbo].[split_delimiter](@delimited_string VARCHAR(8000), @delimiter_type CHAR(1))
RETURNS TABLE AS
RETURN
WITH cte10(num) AS
(
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1
)
,cte100(num) AS
(
SELECT 1
FROM cte10 t1, cte10 t2
)
,cte10000(num) AS
(
SELECT 1
FROM cte100 t1, cte100 t2
)
,cte1(num) AS
(
SELECT TOP (ISNULL(DATALENGTH(@delimited_string),0)) ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
FROM cte10000
)
,cte2(num) AS
(
SELECT 1
UNION ALL
SELECT t.num+1
FROM cte1 t
WHERE SUBSTRING(@delimited_string,t.num,1) = @delimiter_type
)
,cte3(num,[len]) AS
(
SELECT t.num
,ISNULL(NULLIF(CHARINDEX(@delimiter_type,@delimited_string,t.num),0)-t.num,8000)
FROM cte2 t
)
SELECT delimited_item_num = ROW_NUMBER() OVER(ORDER BY t.num)
,delimited_value = SUBSTRING(@delimited_string, t.num, t.[len])
FROM cte3 t;
Using that I was able to parse the CSV to a table and join it back to itself multiple times and use WITH ROLLUP to get the permutations you are looking for.
WITH Numbers as
(
SELECT delimited_value
FROM dbo.split_delimiter('123,456,789',',')
)
SELECT CAST(Nums1.delimited_value AS VARCHAR)
,ISNULL(CAST(Nums2.delimited_value AS VARCHAR),'')
,ISNULL(CAST(Nums3.delimited_value AS VARCHAR),'')
,CAST(Nums4.delimited_value AS VARCHAR)
FROM Numbers as Nums1
LEFT JOIN Numbers as Nums2
ON Nums2.delimited_value not in (Nums1.delimited_value)
LEFT JOIN Numbers as Nums3
ON Nums3.delimited_value not in (Nums1.delimited_value, Nums2.delimited_value)
LEFT JOIN Numbers as Nums4
ON Nums4.delimited_value not in (Nums1.delimited_value, Nums2.delimited_value, Nums3.delimited_value)
GROUP BY CAST(Nums1.delimited_value AS VARCHAR)
,ISNULL(CAST(Nums2.delimited_value AS VARCHAR),'')
,ISNULL(CAST(Nums3.delimited_value AS VARCHAR),'')
,CAST(Nums4.delimited_value AS VARCHAR) WITH ROLLUP
If you will potentially have more than 3 or 4, you'll want to expand your code accordingly.
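If the number of values is not known in advance, a recursive CTE can avoid the fixed number of self-joins. The following is only a sketch of that idea, run through Python's sqlite3 module with SQLite as an assumed stand-in engine; the table name items and the helper all_ordered_combinations are hypothetical, and the query assumes the values are distinct and contain no commas.

```python
import sqlite3
from itertools import permutations


def all_ordered_combinations(csv):
    """Return every ordered selection of 1..n distinct values from a CSV string."""
    items = csv.split(",")
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE items (item TEXT)")  # hypothetical helper table
    conn.executemany("INSERT INTO items VALUES (?)", [(x,) for x in items])
    query = """
    WITH RECURSIVE combos(combination, used) AS (
        -- anchor member: every single value on its own
        SELECT item, ',' || item || ',' FROM items
        UNION ALL
        -- recursive member: append any value not yet present in this combination
        SELECT c.combination || ',' || i.item, c.used || i.item || ','
        FROM combos c
        JOIN items i ON instr(c.used, ',' || i.item || ',') = 0
    )
    SELECT combination FROM combos ORDER BY length(combination), combination
    """
    result = [row[0] for row in conn.execute(query)]
    conn.close()
    return result


combos = all_ordered_combinations("123,456,789")

# Cross-check against itertools: permutations of every length from 1 to n.
values = "123,456,789".split(",")
expected = {",".join(p) for k in range(1, len(values) + 1) for p in permutations(values, k)}
print(len(combos), set(combos) == expected)
```

The recursion stops on its own once a combination contains every value, because the join then finds no unused value to append. Note that the row count grows factorially with the number of values.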

Find overlapping sets of data in a table

I need to identify duplicate sets of data and give those sets whose data is identical a group id.
id threshold cost
-- ---------- ----------
1 0 9
1 100 7
1 500 6
2 0 9
2 100 7
2 500 6
I have thousands of these sets; most are the same but have different ids. I need to find all the like sets that have the same thresholds and cost amounts and give them a group id. I'm just not sure where to begin. Is the best way to iterate and insert each set into a table, and then iterate through each set in the table to find what already exists?
This is one of those cases where you can try to do something with relational operators. Or, you can just say: "let's put all the information in a string and use that as the group id". SQL Server seems to discourage this approach, but it is possible. So, let's characterize the groups using:
select d.id,
(select cast(threshold as varchar(8000)) + '-' + cast(cost as varchar(8000)) + ';'
from data d2
where d2.id = d.id
order by threshold
for xml path ('')
) as groupname
from data d
group by d.id;
Oh, I think that solves your problem. The groupname can serve as the group id. If you want a numeric id (which is probably a good idea), use dense_rank():
select d.id, dense_rank() over (order by groupname) as groupid
from (select d.id,
(select cast(threshold as varchar(8000)) + '-' + cast(cost as varchar(8000)) + ';'
from data d2
where d2.id = d.id
order by threshold
for xml path ('')
) as groupname
from data d
group by d.id
) d;
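The same "signature string" idea can be checked quickly outside the database. This is a minimal Python sketch, not the SQL Server implementation: the helper name group_ids is hypothetical, and the sample rows are the three sets used in the other answer's test data (ids 1 and 2 identical, id 3 differing in one threshold).

```python
from collections import defaultdict

# (id, threshold, cost) rows; ids 1 and 2 are identical sets, id 3 differs.
rows = [
    (1, 0, 9), (1, 100, 7), (1, 500, 6),
    (2, 0, 9), (2, 100, 7), (2, 500, 6),
    (3, 1, 9), (3, 100, 7), (3, 500, 6),
]


def group_ids(rows):
    """Map each id to a dense group number based on its threshold/cost signature."""
    parts = defaultdict(list)
    # Sorting the tuples orders rows by id, then threshold, then cost,
    # which plays the role of ORDER BY threshold inside the subquery.
    for id_, threshold, cost in sorted(rows):
        parts[id_].append(f"{threshold}-{cost};")
    signatures = {id_: "".join(p) for id_, p in parts.items()}
    # Dense rank over the distinct signatures, like dense_rank() over (order by groupname).
    rank = {sig: n for n, sig in enumerate(sorted(set(signatures.values())), start=1)}
    return {id_: rank[sig] for id_, sig in signatures.items()}


groups = group_ids(rows)
print(groups)
```

Two ids get the same group number exactly when their ordered threshold/cost lists are identical, which is the property the concatenated groupname relies on.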
Here's the solution to my interpretation of the question:
IF OBJECT_ID('tempdb..#tempGrouping') IS NOT NULL DROP Table #tempGrouping;
;
WITH BaseTable AS
(
SELECT 1 id, 0 as threshold, 9 as cost
UNION SELECT 1, 100, 7
UNION SELECT 1, 500, 6
UNION SELECT 2, 0, 9
UNION SELECT 2, 100, 7
UNION SELECT 2, 500, 6
UNION SELECT 3, 1, 9
UNION SELECT 3, 100, 7
UNION SELECT 3, 500, 6
)
, BaseCTE AS
(
SELECT
id
--,dense_rank() over (order by threshold, cost ) as GroupId
,
(
SELECT CAST(TblGrouping.threshold AS varchar(8000)) + '/' + CAST(TblGrouping.cost AS varchar(8000)) + ';'
FROM BaseTable AS TblGrouping
WHERE TblGrouping.id = BaseTable.id
ORDER BY TblGrouping.threshold, TblGrouping.cost
FOR XML PATH ('')
) AS MultiGroup
FROM BaseTable
GROUP BY id
)
,
CTE AS
(
SELECT
*
,DENSE_RANK() OVER (ORDER BY MultiGroup) AS GroupId
FROM BaseCTE
)
SELECT *
INTO #tempGrouping
FROM CTE
-- SELECT * FROM #tempGrouping;
UPDATE BaseTable
SET BaseTable.GroupId = #tempGrouping.GroupId
FROM BaseTable
INNER JOIN #tempGrouping
ON BaseTable.Id = #tempGrouping.Id
IF OBJECT_ID('tempdb..#tempGrouping') IS NOT NULL DROP Table #tempGrouping;
Here BaseTable stands for your table; you don't need the CTE "BaseTable", because you already have a data table.
You may need to take extra precautions if your threshold and cost fields can be NULL.

Joining a list of values with table rows in SQL

Suppose I have a list of values, such as 1, 2, 3, 4, 5 and a table where some of those values exist in some column. Here is an example:
id name
1 Alice
3 Cindy
5 Elmore
6 Felix
I want to create a SELECT statement that will include all of the values from my list as well as the information from those rows that match the values, i.e., perform a LEFT OUTER JOIN between my list and the table, so the result would be like follows:
id name
1 Alice
2 (null)
3 Cindy
4 (null)
5 Elmore
How do I do that without creating a temp table or using multiple UNION operators?
In Microsoft SQL Server 2008 or later, you can use a table value constructor:
Select v.valueId, m.name
From (values (1), (2), (3), (4), (5)) v(valueId)
left Join otherTable m
on m.id = v.valueId
Postgres also has this construct, VALUES lists:
SELECT * FROM (VALUES (1, 'one'), (2, 'two'), (3, 'three')) AS t (num,letter)
Also note the possible Common Table Expression syntax which can be handy to make joins:
WITH my_values(num, str) AS (
VALUES (1, 'one'), (2, 'two'), (3, 'three')
)
SELECT num, str FROM my_values
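As a runnable illustration of the VALUES-in-a-CTE form applied to the question's data, here is a small sketch using Python's sqlite3 module. SQLite is an assumed stand-in engine, and the table name people and the CTE name wanted are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# people is a hypothetical name for the table shown in the question.
conn.execute("CREATE TABLE people (id INT, name TEXT)")
conn.executemany(
    "INSERT INTO people VALUES (?, ?)",
    [(1, "Alice"), (3, "Cindy"), (5, "Elmore"), (6, "Felix")],
)

query = """
WITH wanted(id) AS (VALUES (1), (2), (3), (4), (5))
-- the inline list drives the join, so every listed value appears in the output
SELECT w.id, p.name
FROM wanted w
LEFT JOIN people p ON p.id = w.id
ORDER BY w.id
"""

rows = conn.execute(query).fetchall()
print(rows)
```

Values from the list with no matching row come back with a NULL name (None in Python), and rows in the table that are not in the list, such as Felix, are left out.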
With Oracle it's possible, though heavier. From Ask Tom:
with id_list as (
select 10 id from dual union all
select 20 id from dual union all
select 25 id from dual union all
select 70 id from dual union all
select 90 id from dual
)
select * from id_list;
The following solution for Oracle is adapted from this source. The basic idea is to exploit Oracle's hierarchical queries. You have to specify a maximum length for the list (100 in the sample query below).
select d.lstid
, t.name
from (
select substr(
csv
, instr(csv,',',1,lev) + 1
, instr(csv,',',1,lev+1 )-instr(csv,',',1,lev)-1
) lstid
from (select ','||'1,2,3,4,5'||',' csv from dual)
, (select level lev from dual connect by level <= 100)
where lev <= length(csv)-length(replace(csv,','))-1
) d
left join test t on ( d.lstid = t.id )
;
Check out this SQL Fiddle to see it work.
Bit late on this, but for Oracle you could do something like this to get a table of values:
SELECT rownum + 5 /*start*/ - 1 as myval
FROM dual
CONNECT BY LEVEL <= 100 /*end*/ - 5 /*start*/ + 1
... And then join that to your table:
SELECT *
FROM
(SELECT rownum + 1 /*start*/ - 1 myval
FROM dual
CONNECT BY LEVEL <= 5 /*end*/ - 1 /*start*/ + 1) mypseudotable
left outer join myothertable
on mypseudotable.myval = myothertable.correspondingval
Assuming myTable is the name of your table, the following code should work.
;with x as
(
select top (select max(id) from [myTable]) number from [master]..spt_values
),
y as
(select row_number() over (order by x.number) as id
from x)
select y.id, t.name
from y left join myTable as t
on y.id = t.id;
Caution: This is SQL Server implementation.
To generate the sequential numbers needed for part of the output (this method avoids typing out n values by hand):
-- assumes the global temp table ##table already exists with a single int column
declare @site as int
set @site = 1
while @site<=200
begin
insert into ##table
values (@site)
set @site=@site+1
end
Final output (after the step above):
select * from ##table
select v.id,m.name from ##table as v
left outer join [source_table] m
on m.id=v.id
Suppose the table that holds the values 1, 2, 3, 4, 5 is named list_of_values, and the table that contains some of those values along with the name column is named some_values. Then you can do:
SELECT B.id,A.name
FROM [list_of_values] AS B
LEFT JOIN [some_values] AS A
ON B.ID = A.ID

How to optimize a SQL query with a CROSS JOIN

How can I make this SQL query more efficient? The CteFinal code shown below is the portion of my query that adds up to 6 minutes to its run time. cteMonths is shown below. cteDetail is another CTE that pulls information directly from the database, and it takes less than a second to run.
What CteFinal is doing is creating missing fiscal period rows while including some of the column data from the row where f.FiscalPeriod=0.
I cannot add, delete, or change any of the indexes on the tables, as this is an ERP database and I'm not allowed to make those types of changes.
CteFinal:
SELECT Account,Month, CONVERT(DATETIME, CAST(@Year as varchar(4)) + '-' + CAST(Month as VARCHAR(2)) + '-' + '01', 102) JEDate
,accountdesc,'' Description,'' JournalCode,NULL JournalNum,NULL JournalLine
,'' LegalNumber,'' CurrencyCode,0.00 DebitAmount,0.00 CreditAmount,fiscalcalendarid,company,bookid,SegValue2,SegValue1,SegValue3,SegValue4
FROM cteDetail f
CROSS JOIN cteMonths m
WHERE f.FiscalPeriod=0 and not exists(select * from cteDetailADDCreatedZero x where x.Account=f.Account and x.FiscalPeriod=Month)
cteMonths:
cteMonths (Month) AS(
select 0 as Month
UNION select 1 as Month
UNION select 2 as Month
UNION select 3 as Month
UNION select 4 as Month
UNION select 5 as Month
UNION select 6 as Month
UNION select 7 as Month
UNION select 8 as Month
UNION select 9 as Month
UNION select 10 as Month
UNION select 11 as Month
UNION select 12 as Month)
Thank you!
Here's a slightly more efficient way to generate the 12 months of a given year (even more efficient if you have your own Numbers table):
DECLARE @year INT = 2013;
;WITH cteMonths([Month],AsDate) AS
(
SELECT n-1,DATEADD(YEAR, @Year-1900, DATEADD(MONTH,n-1,0)) FROM (
SELECT TOP (13) RANK() OVER (ORDER BY [object_id]) FROM sys.all_objects
) AS c(n)
)
SELECT [Month], AsDate FROM cteMonths;
So now, you can say:
;WITH cteMonths([Month],AsDate) AS
(
SELECT n,DATEADD(YEAR, @Year-1900, DATEADD(MONTH,n-1,0)) FROM (
SELECT TOP (13) RANK() OVER (ORDER BY [object_id]) FROM sys.all_objects
) AS c(n)
),
cteDetail AS
(
...no idea what is here...
),
cteDetailADDCreatedZero AS
(
...no idea what is here...
)
SELECT f.Account, m.[Month], JEDate = m.AsDate, f.accountdesc, Description = '',
JournalCode = '', JournalNum = NULL, JournalLine = NULL, LegalNumber = '',
CurrencyCode = '', DebitAmount = 0.00, CreditAmount = 0.00, f.fiscalcalendarid,
f.company, f.bookid, f.SegValue2, f.SegValue1, f.SegValue3, f.SegValue4
FROM cteMonths AS m
LEFT OUTER JOIN cteDetail AS f
ON ... some clause I am not clear on ...
WHERE f.FiscalPeriod = 0
AND NOT EXISTS
(
SELECT 1 FROM cteDetailADDCreatedZero AS x
WHERE x.Account = f.Account
AND x.FiscalPeriod = m.[Month]
);
I suspect this won't solve your problem though: it is likely that this is forcing an entire table scan on either whatever tables are mentioned in cteDetail or cteDetailADDCreatedZero or both. You should inspect the actual execution plan for this query and see if there are any scans or other expensive operations that could guide you towards better indexing. It also might just be that you have a bunch of inefficient CTEs stacked up together - we can't really help with that unless you show everything. CTEs are like views - if you start stacking them up on top of each other, you really limit the optimizer's ability to generate an efficient plan for you. At some point it will just throw its hands in the air...
One possibility is to materialize the SQL view (if the query is a view). Sometimes views with complex queries are slow.

How to get the deepest levels of a hierarchical sql query

I'm using SQL Server 2008.
Say I have a recursive hierarchy table, SalesRegion, with SalesRegionId and ParentSalesRegionId. What I need is, given a specific SalesRegion (anywhere in the hierarchy), to retrieve ALL the records at the BOTTOM level.
I.E.:
SalesRegion, ParentSalesRegionId
1, null
1-1, 1
1-2, 1
1-1-1, 1-1
1-1-2, 1-1
1-2-1, 1-2
1-2-2, 1-2
1-1-1-1, 1-1-1
1-1-1-2, 1-1-1
1-1-2-1, 1-1-2
1-2-1-1, 1-2-1
(In my table I have sequential numbers; these dashed numbers are only for clarity.)
So, if the user enters 1-1, I need to retrieve all records with SalesRegion 1-1-1-1, 1-1-1-2, or 1-1-2-1 (and NOT 1-2-2). Similarly, if the user enters 1-1-2-1, I need to retrieve just 1-1-2-1.
I have a CTE query that retrieves everything below 1-1, but that includes rows that I don't want:
WITH SaleLocale_CTE AS (
SELECT SL.SaleLocaleId, SL.SaleLocaleName, SL.AccountingLocationID, SL.LocaleTypeId, SL.ParentSaleLocaleId, 1 AS Level /*Added as a workaround*/
FROM SaleLocale SL
WHERE SL.Deleted = 0
AND (@SaleLocaleId IS NULL OR SaleLocaleId = @SaleLocaleId)
UNION ALL
SELECT SL.SaleLocaleId, SL.SaleLocaleName, SL.AccountingLocationID, SL.LocaleTypeId, SL.ParentSaleLocaleId, Level + 1 AS Level
FROM SaleLocale SL
INNER JOIN SaleLocale_CTE SLCTE ON SLCTE.SaleLocaleId = SL.ParentSaleLocaleId
WHERE SL.Deleted = 0
)
SELECT *
FROM SaleLocale_CTE
Thanks in advance!
Alejandro.
I found a quick way to do this, but I'd rather the answer to be in a single query. So if you can think of one, please share! If I like it better, I'll vote for it as the best answer.
I added a "Level" column in my previous query (I'll edit the question so this answer is clear), and used it to get the last level and then delete the ones I don't need.
INSERT INTO #SaleLocales
SELECT *
FROM SaleLocale_GetChilds(@SaleLocaleId)
SELECT @LowestLevel = MAX(Level)
FROM #SaleLocales
DELETE #SaleLocales
WHERE Level <> @LowestLevel
Building off your post:
; WITH CTE AS
(
SELECT *
FROM SaleLocale_GetChilds(@SaleLocaleId)
)
SELECT a.*
FROM CTE a
JOIN
(
SELECT MAX(level) AS level
FROM CTE
) b
ON a.level = b.level
There were a few edits in there. Kept hitting post...
Are you looking for something like this:
declare @SalesRegion as table ( SalesRegion int, ParentSalesRegionId int )
insert into @SalesRegion ( SalesRegion, ParentSalesRegionId ) values
( 1, NULL ), ( 2, 1 ), ( 3, 1 ),
( 4, 3 ), ( 5, 3 ),
( 6, 5 )
; with CTE as (
-- Get the root(s).
select SalesRegion, CAST( SalesRegion as varchar(1024) ) as Path
from @SalesRegion
where ParentSalesRegionId is NULL
union all
-- Add the children one level at a time.
select SR.SalesRegion, CAST( CTE.Path + '-' + cast( SR.SalesRegion as varchar(10) ) as varchar(1024) )
from CTE inner join
@SalesRegion as SR on SR.ParentSalesRegionId = CTE.SalesRegion
)
select *
from CTE
where Path like '1-3%'
I haven't tried this on a serious dataset, so I'm not sure how it'll perform, but I believe it solves your problem:
WITH SaleLocale_CTE AS (
SELECT SL.SaleLocaleId, SL.SaleLocaleName, SL.AccountingLocationID, SL.LocaleTypeId, SL.ParentSaleLocaleId, CASE WHEN EXISTS (SELECT 1 FROM SaleLocale SL2 WHERE SL2.ParentSaleLocaleId = SL.SaleLocaleID) THEN 1 ELSE 0 END as HasChildren
FROM SaleLocale SL
WHERE SL.Deleted = 0
AND (@SaleLocaleId IS NULL OR SaleLocaleId = @SaleLocaleId)
UNION ALL
SELECT SL.SaleLocaleId, SL.SaleLocaleName, SL.AccountingLocationID, SL.LocaleTypeId, SL.ParentSaleLocaleId, CASE WHEN EXISTS (SELECT 1 FROM SaleLocale SL2 WHERE SL2.ParentSaleLocaleId = SL.SaleLocaleID) THEN 1 ELSE 0 END as HasChildren
FROM SaleLocale SL
INNER JOIN SaleLocale_CTE SLCTE ON SLCTE.SaleLocaleId = SL.ParentSaleLocaleId
WHERE SL.Deleted = 0
)
SELECT *
FROM SaleLocale_CTE
WHERE HasChildren = 0
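Here is a compact, runnable sketch of the "no children" approach, using Python's sqlite3 module with SQLite as an assumed stand-in for SQL Server. The table name sales_region, its column names, and the helper leaves are hypothetical; the data is the dashed sample hierarchy from the question, and "bottom level" is interpreted as "has no children".

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# sales_region, id and parent_id are hypothetical names for illustration.
conn.execute("CREATE TABLE sales_region (id TEXT PRIMARY KEY, parent_id TEXT)")
conn.executemany(
    "INSERT INTO sales_region VALUES (?, ?)",
    [
        ("1", None), ("1-1", "1"), ("1-2", "1"),
        ("1-1-1", "1-1"), ("1-1-2", "1-1"), ("1-2-1", "1-2"), ("1-2-2", "1-2"),
        ("1-1-1-1", "1-1-1"), ("1-1-1-2", "1-1-1"),
        ("1-1-2-1", "1-1-2"), ("1-2-1-1", "1-2-1"),
    ],
)

LEAVES_SQL = """
WITH RECURSIVE subtree(id) AS (
    -- anchor member: the region the user entered
    SELECT id FROM sales_region WHERE id = ?
    UNION ALL
    -- recursive member: children of rows found in the previous pass
    SELECT sr.id
    FROM sales_region sr
    JOIN subtree s ON sr.parent_id = s.id
)
-- keep only the rows of the subtree that have no children of their own
SELECT s.id
FROM subtree s
WHERE NOT EXISTS (SELECT 1 FROM sales_region c WHERE c.parent_id = s.id)
ORDER BY s.id
"""


def leaves(region_id):
    """Return the ids of all childless regions at or below region_id."""
    return [row[0] for row in conn.execute(LEAVES_SQL, (region_id,))]


print(leaves("1-1"))
print(leaves("1-1-2-1"))
```

This differs from the MAX(Level) approach in one respect: a childless node that sits higher than the deepest level (for example 1-2-2 when starting from 1) is also returned, so which variant is right depends on what "bottom" should mean for uneven branches.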