stored proc recursion in SQL Server - sql

I have a situation where I want to have a stored proc returning a table that calls itself recursively as part of its calculation.
Unfortunately SQL Server is having none of this and gives me errors along the lines of being unable to declare a cursor that already exists and not being able to nest an INSERT EXEC statement.
Could I get around some of these issues by using a function? Is there another better way to do this?
The calculation is inherently recursive in nature, so there isn't any getting around this using joins as far as I can tell.
EDIT: to clarify the actual calculation since the code is complicated by other stuff and might complicate the matter-
suppose table A has columns (containerID, objID, objType, weight) and table B has columns (itemID, value).
objType in table A tells you whether objID in table A is a containerID (again in table A) or an itemID from table B.
(containerID, objID) is a primary key on table A as is itemID on table B.
Generally a container will have tens to hundreds of items or other containers in it. Hopefully (I'm guessing) the recursion depth isn't more than a dozen levels. The calculation is to get a weighted average.

You provide very little information, so here is a guess: try recursive queries using common table expressions, try set-based operations instead of a cursor, or try dynamic SQL.
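As a rough sketch of the recursive CTE suggestion applied to the tables described in the question: the snippet below uses Python's built-in sqlite3 module as a stand-in for SQL Server so that it can be run anywhere. The objType values ('item' and 'container'), the sample rows, and the reading of "weighted average" (each container's value is the weight-normalised average of its children, applied recursively) are assumptions, not something the question states. In T-SQL the CTE is introduced with WITH rather than WITH RECURSIVE, and the ? placeholder would be a parameter of the procedure.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE A (containerID INTEGER, objID INTEGER, objType TEXT, weight REAL,
                PRIMARY KEY (containerID, objID));
CREATE TABLE B (itemID INTEGER PRIMARY KEY, value REAL);

-- Hypothetical sample data.
INSERT INTO B VALUES (10, 100.0), (11, 200.0), (12, 50.0);
-- Container 1 holds item 10 (weight 1) and container 2 (weight 3);
-- container 2 holds items 11 and 12 with equal weights.
INSERT INTO A VALUES (1, 10, 'item', 1.0), (1, 2, 'container', 3.0),
                     (2, 11, 'item', 1.0), (2, 12, 'item', 1.0);
""")

WEIGHTED_AVG_SQL = """
WITH RECURSIVE
norm AS (
    -- Each row's weight as a fraction of its container's total weight.
    SELECT r.containerID, r.objID, r.objType, r.weight / t.total AS frac
    FROM A r
    JOIN (SELECT containerID, SUM(weight) AS total
          FROM A GROUP BY containerID) t ON t.containerID = r.containerID
),
tree AS (
    -- Anchor: direct contents of the root container.
    SELECT objID, objType, frac FROM norm WHERE containerID = ?
    UNION ALL
    -- Recursive step: descend into child containers, multiplying fractions.
    SELECT n.objID, n.objType, tree.frac * n.frac
    FROM tree JOIN norm n ON n.containerID = tree.objID
    WHERE tree.objType = 'container'
)
SELECT SUM(tree.frac * B.value)
FROM tree JOIN B ON B.itemID = tree.objID
WHERE tree.objType = 'item'
"""

# Weighted average for root container 1.
result = conn.execute(WEIGHTED_AVG_SQL, (1,)).fetchone()[0]
print(result)
```

For the sample data this gives 118.75: container 2 averages to 125, and (1 * 100 + 3 * 125) / 4 = 118.75. The sketch assumes the containment graph has no cycles; a container that contains itself would recurse without end (SQL Server stops a recursive CTE after 100 levels by default unless MAXRECURSION is changed).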

This article gives 7 different ways to do what you're trying to do.
Recursive CTE methods
The blackbox XML methods
Using Common Language Runtime.
Scalar UDF with recursion
Table valued UDF with a WHILE loop.
Dynamic SQL
The Cursor approach.
http://www.simple-talk.com/sql/t-sql-programming/concatenating-row-values-in-transact-sql/#_Toc205129484

I think you get the error because every recursive call uses the same cursor name, and the nested call can't open a cursor of that name until the parent call closes its own. If possible, make the cursor name dynamic, something as simple as SOME_CURSOR_{$RECURSION_DEPTH}; you might have to add the recursion depth as a parameter to the procedure. I've never done anything like this in SQL Server, so I'm not 100% sure.
Not sure about the nested INSERT EXEC problem, though it might be tied to the cursor.

Declaring the cursor with LOCAL scope may resolve the issue, although I'm not sure how the cursor would act in a recursive context.
Check out this article: http://msdn.microsoft.com/en-us/library/ms189238.aspx

DECLARE StudentdIDCursor CURSOR LOCAL FOR SELECT ...blahblah
The key is the LOCAL term. It will generate a separate cursor definition behind the scenes every time.

Related

SQL Server - While Loop vs "LOCAL STATIC READ_ONLY FORWARD_ONLY" Cursor

I have created many cursors in my application to do row-by-row operations. Each cursor run selects only 500 or 1000 records, so that a single run completes as quickly as possible.
To make the cursors faster and keep load off the server, I have used the following two ways of declaring a cursor.
Declaration 1:
DECLARE DB_CURSOR_01 CURSOR LOCAL STATIC READ_ONLY FORWARD_ONLY FOR
Declaration 2:
DECLARE DB_CURSOR_02 CURSOR FAST_FORWARD FOR
Note: I am not using the default cursor declaration; I am using other cursor types to make it work faster. To my knowledge, declaration 1 above is faster than declaration 2. Correct me if I am wrong.
Question:
The other way of doing row-by-row operations is a WHILE loop over a temporary table. So my question is: if I convert all of my cursors to WHILE loops over temporary tables, will it improve server performance?
Our DBA pointed out that server performance is suffering because of the cursors. If I put in the effort to convert all of those cursors into WHILE loops, will I see a performance benefit? Or will a cursor declared as in declaration 1 above perform the same as a WHILE loop?
Cursors in SQL Server are very slow. On other RDBMSes, e.g. Sybase, they are OK.
Below is a practical approach to dealing with them:
In my experience of "optimising" old dodgy code, the main problem with cursors is when they are based on a complex query. By complex query I mean a query that has more than a few joins and/or complex join conditions.
What the cursor does is, for every iteration, it has to run this join operation, which can take more time than operations inside the body of the loop.
In cases like these it is far more efficient to run a single select into a temp table and then use the temp table in the cursor. An alternative is to use the STATIC or INSENSITIVE keyword (MSDN). One important aspect to consider is concurrency: by saving the results of the main cursor query into a temp table, you prevent changes to the underlying tables from being visible to your cursor.
The second aspect to consider is select queries inside a cursor. This is important because each query is run for each cursor iteration, so a select on a large table will consume a lot of resources.
I have seen some especially "dodgy" code where:
A table is queried to return a single value using one of the cursor's fetch variables as filter. - This table should be JOINed to the main cursor query. This way this table will be queried only once and results saved to temp table.
A table is queried to return some data based on some conditions and then later on queried again to return more data (different columns) based on the same conditions. - These two selects should be combined into one so that all data (all columns) can be returned at once.
If you have nested cursors (one inside the other), it is killer. Try removing nesting.
If you have many places with cursors prioritise fixing of the ones that match one of the cases above.
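A minimal sketch of the first case above, using Python's built-in sqlite3 module in place of SQL Server, with a plain loop playing the role of the cursor. The orders and customers tables and their columns are hypothetical. The first loop issues one lookup query per fetched row; the second version joins once, saves the result to a temp table, and iterates over that.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, region TEXT);
CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
INSERT INTO customers VALUES (1, 'north'), (2, 'south');
INSERT INTO orders VALUES (100, 1, 20.0), (101, 2, 35.0), (102, 1, 5.0);
""")

# Pattern to avoid: one extra lookup query for every row the loop fetches.
slow_rows = []
for order_id, customer_id, amount in conn.execute(
        "SELECT order_id, customer_id, amount FROM orders ORDER BY order_id").fetchall():
    region = conn.execute(
        "SELECT region FROM customers WHERE customer_id = ?",
        (customer_id,)).fetchone()[0]
    slow_rows.append((order_id, region, amount))

# Preferred: join once, store the result in a temp table, iterate over that.
conn.execute("""
CREATE TEMP TABLE work AS
SELECT o.order_id, c.region, o.amount
FROM orders o JOIN customers c ON c.customer_id = o.customer_id
""")
fast_rows = conn.execute(
    "SELECT order_id, region, amount FROM work ORDER BY order_id").fetchall()

# Both approaches yield the same rows; only the number of queries differs.
print(fast_rows == slow_rows)
```

With three orders the first version runs four queries and the second runs two; the gap grows with the row count.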
P.S. A WHILE loop on its own will not save you. You still need to use temp tables and have proper indexes on them. See: https://dba.stackexchange.com/questions/84365/why-choose-a-top-query-and-temporary-table-instead-of-a-cursor-for-a-loop
The answer linked above points to Aaron Bertrand's blog, which discusses performance along with recommendations for cursor options.

SQL performance for a returning stored procedure

I have been asked the following question: what would you look into when you want to improve a stored procedure's performance? The stored procedure returns some value and has three joins in it.
Other than making sure the joins are well written, what can one do to make it perform better? This was a general question and no code was provided.
Any ideas?
Check the indexes on the tables used in the joins. Particularly, are the columns used in the joins indexed?
Example -
SELECT *
FROM SomeTable a
JOIN SomeOtherTable b on a.ItemId = b.ItemId
If these tables are large, indexing ItemId in both tables will typically help performance a lot.
You should do the same thing for any columns that are used in the WHERE clause, if your query has one.
WHERE a.ProductId = @SomeVariableYouPassedToTheStoredProc
Indexing ProductId may help in this case.
Query performance is something you could go into a rabbit hole on, but this is a logical (and quick) place to start.
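One way to check whether an index is actually picked up is to look at the query plan. The sketch below does this with Python's built-in sqlite3 module and SQLite's EXPLAIN QUERY PLAN, standing in for SQL Server's execution plan view. The table definitions and index names are assumptions made for illustration, and the plan text SQLite prints differs between versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE SomeTable (Id INTEGER PRIMARY KEY, ItemId INTEGER, ProductId INTEGER);
CREATE TABLE SomeOtherTable (Id INTEGER PRIMARY KEY, ItemId INTEGER, Note TEXT);
-- Index the join column in both tables and the filter column.
CREATE INDEX ix_some_item ON SomeTable (ItemId);
CREATE INDEX ix_some_product ON SomeTable (ProductId);
CREATE INDEX ix_other_item ON SomeOtherTable (ItemId);
""")

query = """
SELECT *
FROM SomeTable a
JOIN SomeOtherTable b ON a.ItemId = b.ItemId
WHERE a.ProductId = ?
"""

# The last column of each EXPLAIN QUERY PLAN row is a human-readable step.
plan = [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + query, (42,))]
for step in plan:
    print(step)
```

Steps that mention an index name mean the planner is seeking through that index rather than scanning the whole table. Dropping the indexes and rerunning shows how the plan changes.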
There are a lot of things you can do to optimize procedures, but it sounds like your SQL statement is pretty simple. Some things to watch out for:
Functions called inline in the query. These can cause SQL to do a row-by-row evaluation and slow things down.
Data conversions on join statements. These can prevent indexes from being used.
Make sure columns being joined on/in the where clause are indexed (for large data sets)
You can check out this website for more performance tips, but I think I covered most of what you need for simple statements:
SQL Optimizations School
The fact that it's a stored procedure has little to nothing to do with it. Optimise the sql inside.
As to how, all the usual suspects, including written by the sort of eejit who thinks you can guess what's wrong.
Copy the sql from the proc into a suitable tool, prefix it with Explain to see what's going on.
I presume there are other options. For example:
1. Each of those joins could use a restricting condition that looks like 'and permited_used_name = (select user_name from user_list where )'. The value could be derived once at the start of the procedure (I mean in its first statement) so that the DB is not overloaded by many similar queries.
2. Starting with Oracle 11 you can declare a function with cached results (i.e. the function is calculated once and isn't recalculated each time it is invoked), specifying a set of tables whose changes invalidate the cache.
In any case, the question is mostly DB-specific.
Run the Query Analyser on the SQL statement

Cursor in Stored Procedure Performance Issues

I found a cursor being used in the SQL below, which is dynamic SQL. Profiler brings up quite a few execution plans, and I think it has to do with this cursor. Is this a bad choice of SQL?
SET @SelectStmtSubHeader = 'SELECT DISTINCT
dbo.dsb_testID(sh.GPCustomerID) AS cursor -- RIGHT HERE
PONumber,
sh.GPCustomerID,
.....
That's not an example of a cursor.
A cursor needs to be...
DECLARE this_is_a_cursor CURSOR
FOR
SELECT
stuff
FROM
a_query
The snippet you've shown appears to use a scalar function to derive a value, which it aliases to the word cursor. But having a field called cursor doesn't make it a cursor.
Cursors are nearly always a bad choice, to be avoided if alternatives exist in set logic.
SQL is based around set logic. Result sets aren't meant to be iterated through like a collection.
The SQL Optimizers are usually pretty good at finding clever ways to retrieve your data. A cursor is a relatively unsophisticated tool. ANSI SQL does require it though, so it's usually present.
Here is a good example from Sybase
Cursor Performance Example

Cursors vs Procedures in SQL

So, I just learned about CURSORS but still don't exactly grasp them. What is the difference between a cursor and a procedure, or even a function?
So far, from the various examples (DECLARE CURSOR ... SELECT ... FROM ...), it seems at most it's a variable to hold a query. Is the data real time, or a snapshot from when the cursor was declared?
i.e.
I have a table with one row and one col with a value of 2.
I do DECLARE CURSOR ... SELECT * FROM table1
I then insert a new row with a value of 3.
When I run the cursor, would I Just get the one row from before the cursor was declared, or both rows?
Thanks
I would recommend researching a bit on "Set based" versus "Row Based". This article does a decent job.
Most database systems are geared toward performing set based operations. Because of this you will often see performance problems when you perform row based operations (like using a cursor). In my experience A LOT of sql that uses cursors can be rewritten without cursors.
In the example you asked about your cursor would only have one record in it.
Also, keep in mind that a stored procedure can make use of a cursor.
I believe the documentation will answer some of your questions. Read through the different options like "Insensitive" to see what they mean.
Also, as a general rule, it is frowned upon to use cursors. You should always try to find a "set based" solution before going the cursor route. There is much debate and documentation on this subject as well that is easily accessible.
My advice would be to forget you learned the syntax for cursors. Cursors are a last-resort technique and should only be used by an expert who understands what impact the cursor will have on performance and why the set-based alternatives won't work in the specific case. Most things done in cursors are far more easily done in a set-based fashion if you understand set operations. This link will help you learn what you need to know so that you will only rarely have to write a cursor:
http://wiki.lessthandot.com/index.php/Cursors_and_How_to_Avoid_Them
CURSOR:
Does something that has to be done row by row and is not possible with a simple query.
For example, if you have a temporary table to store a hierarchy of categories, a cursor can be useful to populate it.
Procedures, or stored procedures, can hold a collection of SQL statements (including cursors). When executed, with any parameters passed in, a procedure runs the statements it contains.
A function or procedure is a set of instructions to perform some task.
A cursor is an array that can store the result set of a query.
Stored procedures are pre-compiled objects and execute as a batch of statements, whereas cursors are used to process rows one at a time.
For example: think of a cursor as a bag (a cursor is similar to a pointer that points at a single row out of many). You put all the results of a query run from within a procedure into it, and whenever you need the results, you open the bag and go through them row by row.

What would be a better way to handle this sql logic?

Inside of a stored procedure, I populate a table of items (#Items). Just basic information about them. However for each item, I need to make sure I can sell them, and in order to do that I need to perform a whole lot of validation. In order to keep the stored procedure somewhat maintainable, I moved the logic into another stored procedure.
What would be the best way to call the stored procedure for each item in the temp table?
The way I have it now, I apply an identity column and then just do a while loop, executing the stored procedure for each row and inserting the validation result into a temporary table. (#Validation)
However now that logic has changed, and in between the creation of #Items and the execution of the loop, some records are deleted which screws up the while loop, since the Identity no longer equals the counter.
I could handle that by dropping the identity column and reapplying it before the while loop, but I was just wondering if there was a better way. Is there a way to get a specific row at an index if I apply an order by clause?
I know I could do a cursor, but those are a pain in the ass to me. Also performance is somewhat of a concern, would a fastforward readonly cursor be a better option than a while loop? The number of rows in the #Items table isn't that large, maybe 50 at most, but the stored procedure is going to be called quite frequently.
Turn your validation stored procedure into a user defined function that accepts an item id or the data columns needed to validate an item record
Create the temp table
Insert all your items
Write a delete query for the temp table that calls your new UDF in the WHERE clause.
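The steps above can be sketched as follows, using Python's built-in sqlite3 module in place of SQL Server. The is_sellable function stands in for the validation UDF, and its rule (positive price, non-zero stock), the Items columns, and the sample rows are all hypothetical. In T-SQL the function would be a scalar UDF called the same way in the WHERE clause.

```python
import sqlite3


def is_sellable(price, stock):
    """Hypothetical validation rule: positive price and something in stock."""
    if price is None or stock is None:
        return 0
    return 1 if price > 0 and stock > 0 else 0


conn = sqlite3.connect(":memory:")
# Register the Python function so SQL statements can call it by name.
conn.create_function("is_sellable", 2, is_sellable)

# Steps 2 and 3: create the temp table and insert all the items.
conn.executescript("""
CREATE TEMP TABLE Items (item_id INTEGER PRIMARY KEY, price REAL, stock INTEGER);
INSERT INTO Items VALUES (1, 9.99, 3), (2, 0.0, 10), (3, 4.50, 0), (4, 12.00, 1);
""")

# Step 4: one set-based DELETE replaces the per-row loop.
conn.execute("DELETE FROM Items WHERE is_sellable(price, stock) = 0")

remaining = [row[0] for row in conn.execute(
    "SELECT item_id FROM Items ORDER BY item_id")]
print(remaining)
```

Items 2 and 3 fail the hypothetical rule, so only items 1 and 4 remain. The function is still evaluated once per row, which should be acceptable for the roughly 50 rows mentioned in the question.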
I agree that if you can do it set-based then do it that way. Perhaps put the validation into a user-defined function instead of a sproc, which may pave the way for a set-based solution.
e.g.
SELECT * FROM SomeTable WHERE dbo.fnIsValid(dataitem1, dataitem2....) = 1
However, I know this may not be possible depending on your exact scenario, so...
Correction edit based on now understanding the IDENTITY/loop issue:
You can use ROW_NUMBER() in SQL Server 2005 to get the next row. It doesn't matter if there are gaps in the IDENTITY field, as this assigns a row number to each record in the order you specify:
-- Gets next record
SELECT * FROM
(
SELECT ROW_NUMBER() OVER(ORDER BY IDField ASC) AS RowNo, *
FROM #temptable
) s
WHERE s.RowNo = @Counter
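An alternative to renumbering with ROW_NUMBER() is to walk the key values themselves, asking each time for the smallest key greater than the last one processed. This is a different technique from the query above, shown only as a sketch with Python's built-in sqlite3 module standing in for SQL Server; the Items table and its contents are hypothetical. In T-SQL the lookup would use SELECT TOP (1) ... ORDER BY instead of LIMIT 1.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TEMP TABLE Items (id INTEGER PRIMARY KEY, name TEXT);
INSERT INTO Items VALUES (1, 'a'), (2, 'b'), (3, 'c'), (4, 'd'), (5, 'e');
DELETE FROM Items WHERE id IN (2, 4);  -- leave gaps in the key sequence
""")

visited = []
last_id = 0  # assumes all ids are positive
while True:
    # Fetch the row with the smallest id above the last one handled.
    row = conn.execute(
        "SELECT id, name FROM Items WHERE id > ? ORDER BY id LIMIT 1",
        (last_id,)).fetchone()
    if row is None:
        break  # no rows left
    last_id, name = row
    visited.append((last_id, name))  # the per-row procedure call would go here

print(visited)
```

The loop visits ids 1, 3 and 5 in order. Because it never relies on a counter matching the identity values, rows deleted before the loop starts do not disturb it.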
Does this kind of business logic really have to be in the database?
I don't know much about your scenario, but maybe it would be best to move the decision you're trying to model with SPs into the application?
So you might try to use a function instead of a stored procedure for that logic, and include the result of this function as a column in your temporary table. Would that work for you? Or, if you need the data in real time every time you use it later, a function returning 0/1 values, included in the select list, could be a good bet anyway.
Is it possible to rewrite your stored procedure logic as a query, i.e. a set-based approach?
You should try this first.