Cursor in Stored Procedure Performance Issues

I found a cursor being used in the SQL and dynamic SQL below. Profiler brings up quite a few execution plans, and I think it has to do with this cursor. Is this a bad choice of SQL?
SET @SelectStmtSubHeader = 'SELECT DISTINCT
dbo.dsb_testID(sh.GPCustomerID) AS cursor -- RIGHT HERE
PONumber,
sh.GPCustomerID,
.....

That's not an example of a cursor.
A cursor needs to be...
DECLARE this_is_a_cursor CURSOR
FOR
SELECT
stuff
FROM
a_query
The code snippet you've shown appears to use a scalar function to derive a value, which it aliases to the word cursor. But having a field called cursor doesn't make it a cursor.

Cursors are nearly always a bad choice and should be avoided if an alternative exists in set logic.
SQL is based around set logic. Sets aren't meant to be iterated through like a collection.
The SQL optimizers are usually pretty good at finding clever ways to retrieve your data. A cursor is a relatively unsophisticated tool. ANSI SQL does require it, though, so it's usually present.
Here is a good example from Sybase
Cursor Performance Example
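To make the set-logic point concrete, here is a rough T-SQL sketch of the same change written both ways. The table dbo.Employees(EmployeeID, Salary) and the 10% raise are made up for illustration; they are not from the question.

```sql
-- Hypothetical table: dbo.Employees(EmployeeID INT PRIMARY KEY, Salary NUMERIC(18,2))
-- Run ONE of the two variants, not both, or the raise is applied twice.

-- Variant 1: row by row with a cursor (one UPDATE statement per row)
DECLARE @EmployeeID INT;
DECLARE emp_cursor CURSOR LOCAL STATIC READ_ONLY FORWARD_ONLY FOR
    SELECT EmployeeID FROM dbo.Employees WHERE Salary < 3000;

OPEN emp_cursor;
FETCH NEXT FROM emp_cursor INTO @EmployeeID;
WHILE @@FETCH_STATUS = 0
BEGIN
    UPDATE dbo.Employees
    SET Salary = Salary * 1.1
    WHERE EmployeeID = @EmployeeID;

    FETCH NEXT FROM emp_cursor INTO @EmployeeID;
END
CLOSE emp_cursor;
DEALLOCATE emp_cursor;

-- Variant 2: set based (one statement, the optimizer picks the access path)
UPDATE dbo.Employees
SET Salary = Salary * 1.1
WHERE Salary < 3000;
```

Both variants should leave the table in the same state; the second simply hands the whole job to the optimizer in one go.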

Related

SQL Server - While Loop vs "LOCAL STATIC READ_ONLY FORWARD_ONLY" Cursor

I have created many cursors in my application to do row-by-row operations. In each cursor run I select only 500 or 1000 records so that the cursor can complete as quickly as possible; in other words, I select a limited number of records per cursor run.
To make the cursors run faster and not put load on the server, I have used the following two ways of declaring a cursor.
Declaration 1:
DECLARE DB_CURSOR_01 CURSOR LOCAL STATIC READ_ONLY FORWARD_ONLY FOR
Declaration 2:
DECLARE DB_CURSOR_02 CURSOR FAST_FORWARD FOR
Note: I am not using the default cursor declaration; I am using other cursor types to make it work faster, and to my knowledge declaration 1 above is faster than declaration 2. Correct me if I am wrong.
Question:
The other way of doing row-by-row operations is a "while loop using a temporary table". So my question is: if I convert all of my cursors to while loops using temporary tables, will it help to improve server performance?
Our DBA pointed out that server performance is being affected by the cursors. If I put in the effort to convert all of those cursors into while loops, will it give me a performance benefit? Or will a cursor declared as in declaration 1 above perform the same as a while loop?
Cursors in SQL Server are very slow. On other RDBMSes, e.g. Sybase, they are OK.
Below is a practical approach to dealing with them:
In my experience of "optimising" old dodgy code, the main problem with cursors is when they are based on a complex query. By complex query I mean a query that has more than a few joins and/or complex join conditions.
What the cursor does is, for every iteration, it has to run this join operation, which can take more time than operations inside the body of the loop.
In cases like these it is far more efficient to run a single select into a temp table and then use the temp table in the cursor. An alternative is to use the STATIC or INSENSITIVE keyword (MSDN). One important aspect to consider is concurrency: by saving the results of the main cursor query into a temp table, you prevent changes to the underlying tables from being visible to your cursor.
The second aspect to consider is select queries inside a cursor. This is important because each query is run on every cursor iteration, and therefore a select on a large table will consume a lot of resources.
I have seen some especially "dodgy" code where:
A table is queried to return a single value using one of the cursor's fetch variables as filter. - This table should be JOINed to the main cursor query. This way this table will be queried only once and results saved to temp table.
A table is queried to return some data based on some conditions and then later on queried again to return more data (different columns) based on the same conditions. - These two selects should be combined into one so that all data (all columns) can be returned at once.
If you have nested cursors (one inside the other), it is a killer. Try removing the nesting.
If you have many places with cursors, prioritise fixing the ones that match one of the cases above.
P.S. A while loop on its own will not save you. You still need to use temp tables and have proper indexes on the temp tables. See: https://dba.stackexchange.com/questions/84365/why-choose-a-top-query-and-temporary-table-instead-of-a-cursor-for-a-loop
The above links to an Aaron Bertrand blog post which discusses performance along with recommendations for cursor options.
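As a rough sketch of the "select into a temp table first" advice, assuming hypothetical tables dbo.Orders(OrderID, CustomerID, Status) and dbo.Customers(CustomerID, Region) that are not part of the question:

```sql
-- Run the expensive join exactly once and keep only the columns the loop needs.
-- The PRIMARY KEY gives the temp table a clustered index to read from.
CREATE TABLE #work (OrderID INT PRIMARY KEY, Region VARCHAR(50));

INSERT INTO #work (OrderID, Region)
SELECT o.OrderID, c.Region
FROM dbo.Orders AS o
JOIN dbo.Customers AS c ON c.CustomerID = o.CustomerID
WHERE o.Status = 'PENDING';          -- hypothetical filter

DECLARE @OrderID INT, @Region VARCHAR(50);

-- The cursor now iterates over the small temp table, not the join.
DECLARE work_cursor CURSOR LOCAL STATIC READ_ONLY FORWARD_ONLY FOR
    SELECT OrderID, Region FROM #work ORDER BY OrderID;

OPEN work_cursor;
FETCH NEXT FROM work_cursor INTO @OrderID, @Region;
WHILE @@FETCH_STATUS = 0
BEGIN
    -- Placeholder for the real per-row work.
    PRINT 'Processing order ' + CAST(@OrderID AS VARCHAR(12)) + ' in ' + @Region;

    FETCH NEXT FROM work_cursor INTO @OrderID, @Region;
END
CLOSE work_cursor;
DEALLOCATE work_cursor;

DROP TABLE #work;
```

Joining dbo.Customers up front is the same move as the first "dodgy code" case above: the lookup table is read once instead of once per iteration.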

Difference between FETCH/FOR to loop a CURSOR in PL/SQL

I know that fetching a cursor will give me access to variables like %ROWCOUNT, %ROWTYPE, %FOUND, %NOTFOUND, %ISOPEN
...but I was wondering if there are any other reasons to use
Open - Fetch - Close instructions to loop a cursor
rather than
Loop the cursor with a FOR cycle... (In my opinion this is better because it is simple)
What do you think?
From a performance standpoint, the difference is a lot more complicated than the Tim Hall tip that OMG Ponies linked to would imply. I believe that this tip is an introduction to a larger section that has been excerpted for the web-- I expect that Tim went on to make most if not all of these points in the book. Additionally, this entire discussion depends on the Oracle version you're using. I believe this is correct for 10.2, 11.1, and 11.2 but there are definitely differences if you start going back to older releases.
The particular example in the tip, first of all, is rather unrealistic. I've never seen anyone code a single-row fetch using an explicit cursor rather than a SELECT INTO. So the fact that SELECT INTO is more efficient is of very limited practical importance. If we're discussing loops, the performance we're interested in is how expensive it is to fetch many rows. And that's where the complexity starts to come in.
Oracle introduced the ability to do a BULK COLLECT of data from a cursor into a PL/SQL collection in 10.1. This is a much more efficient way to get data from the SQL engine to the PL/SQL collection because it allows you to minimize context shifts by fetching many rows at once. And subsequent operations on those collections are more efficient because your code can stay within the PL/SQL engine.
In order to take maximum advantage of the BULK COLLECT syntax, though, you generally have to use explicit cursors because that way you can populate a PL/SQL collection and then subsequently use the FORALL syntax to write the data back to the database (on the reasonable assumption that if you are fetching a bunch of data in a cursor, there is a strong probability that you are doing some sort of manipulation and saving the manipulated data somewhere). If you use an implicit cursor in a FOR loop, as OMG Ponies correctly points out, Oracle will be doing a BULK COLLECT behind the scenes to make the fetching of the data less expensive. But your code will be doing slower row-by-row inserts and updates because the data is not in a collection. Explicit cursors also offer the opportunity to set the LIMIT explicitly which can improve performance over the default of 100 for an implicit cursor in a FOR loop.
In general, assuming that you're on 10.2 or greater and that your code is fetching data and writing it back to the database,
Fastest
Explicit cursors doing a BULK COLLECT into a local collection (with an appropriate LIMIT) and using FORALL to write back to the database.
Implicit cursors doing a BULK COLLECT for you behind the scenes along with single-row writes back to the database.
Explicit cursors that are not doing a BULK COLLECT and not taking advantage of PL/SQL collections.
Slowest
On the other hand, using implicit cursors gets you quite a bit of the benefit of using bulk operations for very little of the upfront cost in refactoring old code or learning the new feature. If most of your PL/SQL development is done by developers whose primary language is something else or who don't necessarily keep up with new language features, FOR loops are going to be easier to understand and maintain than explicit cursor code that used all the new BULK COLLECT functionality. And when Oracle introduces new optimizations in the future, it's far more likely that the implicit cursor code would get the benefit automatically while the explicit code may require some manual rework.
Of course, by the time you're troubleshooting performance to the point where you really care about how much faster different variants of your looping code might be, you're often at the point where you would want to consider moving more logic into pure SQL and ditching the looping code entirely.
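For reference, the "fastest" pattern from the list above might look roughly like this in PL/SQL. The tables src_orders(order_id, amount) and order_totals(order_id, amount_with_tax), the 1.2 multiplier, and the LIMIT of 500 are all made up for the sketch; tune the LIMIT for your own workload.

```sql
DECLARE
    -- Explicit cursor over a hypothetical source table
    CURSOR c_orders IS
        SELECT order_id, amount FROM src_orders;

    -- One collection per column keeps the FORALL binds simple
    TYPE t_id_tab  IS TABLE OF src_orders.order_id%TYPE;
    TYPE t_amt_tab IS TABLE OF src_orders.amount%TYPE;
    l_ids  t_id_tab;
    l_amts t_amt_tab;
BEGIN
    OPEN c_orders;
    LOOP
        -- Fetch up to 500 rows per round trip to the SQL engine
        FETCH c_orders BULK COLLECT INTO l_ids, l_amts LIMIT 500;
        EXIT WHEN l_ids.COUNT = 0;

        -- Manipulate the batch entirely inside the PL/SQL engine
        FOR i IN 1 .. l_ids.COUNT LOOP
            l_amts(i) := l_amts(i) * 1.2;
        END LOOP;

        -- Write the whole batch back with a single context switch
        FORALL i IN 1 .. l_ids.COUNT
            INSERT INTO order_totals (order_id, amount_with_tax)
            VALUES (l_ids(i), l_amts(i));
    END LOOP;
    CLOSE c_orders;
    COMMIT;
END;
/
```

Note the exit test on l_ids.COUNT rather than c_orders%NOTFOUND: with LIMIT, %NOTFOUND becomes true on the last partial batch, so testing it right after the fetch would skip those rows.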
The OPEN / FETCH / CLOSE is called explicit cursor syntax; the latter is called implicit cursor syntax.
One key difference you've already noticed is that you can't use %FOUND/%NOTFOUND/etc in implicit cursors... Another thing to be aware of is that implicit cursors are faster than explicit ones--they read ahead (~100 records?) besides not supporting the explicit logic.
Additional info:
Implicit vs. Explicit Cursors
I don't know of any crucial differences between these two implementations besides one: a for ... loop implicitly closes the cursor after the loop finishes, whereas with the open ... fetch ... close syntax you should close the cursor yourself (it's just good manners), though this is not a necessity: Oracle will close the cursor automatically once it goes out of scope. Also, you can't use %FOUND and %NOTFOUND in for ... loop cursors.
As for me, I find the for ... loop implementation much easier to read and maintain.
Correct me if I'm wrong, but I think each has one nice feature that the other doesn't.
With for loop you can do like this:
begin
for i in (select * from dual) loop
dbms_output.put_line('ffffuuu');
end loop;
end;
And with open .. fetch you can do like this:
declare
cur sys_refcursor;
tmp dual.dummy%type;
begin
open cur for 'select dummy from dual';
loop
fetch cur into tmp;
exit when cur%notfound;
dbms_output.put_line('ffffuuu');
end loop;
close cur;
end;
So with open ... fetch you can use dynamic cursors, but with a for loop you can use a normal cursor without declaring it.

stored proc recursion in SQL Server

I have a situation where I want to have a stored proc returning a table that calls itself recursively as part of its calculation.
Unfortunately SQL Server is having none of this and gives me errors along the lines of being unable to declare a cursor that already exists and not being able to nest an INSERT EXEC statement.
Could I get around some of these issues by using a function? Is there another better way to do this?
The calculation is inherently recursive in nature, so there isn't any getting around this using joins as far as I can tell.
EDIT: to clarify the actual calculation since the code is complicated by other stuff and might complicate the matter-
suppose table A has columns (containerID, objID, objType, weight) and table B has columns (itemID, value).
objType in table A tells you whether objID in table A is a containerID (again in table A) or is an itemID from table B.
(containerID, objID) is a primary key on table A as is itemID on table B.
Generally a container will have tens to hundreds of items or other containers in it. Hopefully the recursion depth isn't more than a dozen levels. (guessing) The calculation is to get a weighted average.
You provide very little information, so here is a guess: try using Recursive Queries Using Common Table Expressions, try set-based operations rather than a cursor, or try using dynamic SQL.
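Here is a rough sketch of what the recursive CTE route could look like for the tables described in the question. It makes several assumptions that are not in the question: objType is 'C' for a nested container and 'I' for an item, a container's value is the weighted average of its children's values, every container has a non-zero total weight, and @root is the container you want to evaluate.

```sql
DECLARE @root INT = 1;   -- hypothetical container to evaluate

WITH totals AS (
    -- Total weight per container, used to normalise each child's weight
    SELECT containerID, CAST(SUM(weight) AS FLOAT) AS totalWeight
    FROM A
    GROUP BY containerID
),
tree AS (
    -- Anchor: direct children of the root, each with its share of the root
    SELECT a.objID, a.objType,
           CAST(a.weight AS FLOAT) / t.totalWeight AS share
    FROM A AS a
    JOIN totals AS t ON t.containerID = a.containerID
    WHERE a.containerID = @root

    UNION ALL

    -- Recursive step: descend into children that are containers,
    -- multiplying the parent's share by the child's normalised weight
    SELECT a.objID, a.objType,
           p.share * CAST(a.weight AS FLOAT) / t.totalWeight
    FROM tree AS p
    JOIN A AS a ON a.containerID = p.objID
    JOIN totals AS t ON t.containerID = a.containerID
    WHERE p.objType = 'C'
)
-- Only items carry values; their shares add up to 1 when no container is empty
SELECT SUM(tree.share * b.value) AS weightedAverage
FROM tree
JOIN B AS b ON b.itemID = tree.objID
WHERE tree.objType = 'I';
```

The aggregate lives in its own CTE because SQL Server does not allow GROUP BY inside the recursive member. The default MAXRECURSION of 100 should be plenty for a dozen levels.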
This article gives 7 different ways to do what you're trying to do.
Recursive CTE methods
The blackbox XML methods
Using Common Language Runtime.
Scalar UDF with recursion
Table valued UDF with a WHILE loop.
Dynamic SQL
The Cursor approach.
http://www.simple-talk.com/sql/t-sql-programming/concatenating-row-values-in-transact-sql/#_Toc205129484
I think you get an error because the same cursor name is probably used by every recursive call, and the nested call can't open a cursor of the same name until the parent call closes its cursor. If possible, make the cursor name dynamic, maybe something as simple as SOME_CURSOR_{$RECURSION_DEPTH}; you might have to add the recursion depth as a parameter to the procedure. I've never done anything like this in SQL Server, so I'm not 100% sure.
Not sure about the nested INSERT EXEC problem, though it might be tied to the cursor.
Declaring the cursor with LOCAL scope may resolve the issue. Although I'm not sure how the cursor would act in a recursive context.
Check out this article: http://msdn.microsoft.com/en-us/library/ms189238.aspx
DECLARE StudentdIDCursor CURSOR LOCAL FOR SELECT ...blahblah
The key is the LOCAL term. It will generate a separate cursor definition behind the scenes every time.
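A minimal sketch of a recursive procedure using a LOCAL cursor, reusing table A from the question. The procedure name dbo.WalkContainer, the @depth parameter, and the 'C' container marker are invented for the example.

```sql
CREATE PROCEDURE dbo.WalkContainer      -- hypothetical name
    @containerID INT,
    @depth       INT = 0
AS
BEGIN
    SET NOCOUNT ON;
    DECLARE @objID INT, @objType CHAR(1), @nextDepth INT;

    -- LOCAL scopes the cursor to this invocation, so each recursive
    -- call gets its own child_cursor instead of colliding on the name
    DECLARE child_cursor CURSOR LOCAL STATIC READ_ONLY FORWARD_ONLY FOR
        SELECT objID, objType FROM A WHERE containerID = @containerID;

    OPEN child_cursor;
    FETCH NEXT FROM child_cursor INTO @objID, @objType;
    WHILE @@FETCH_STATUS = 0
    BEGIN
        PRINT REPLICATE('  ', @depth) + CAST(@objID AS VARCHAR(12));

        IF @objType = 'C'               -- assumed marker for a nested container
        BEGIN
            SET @nextDepth = @depth + 1;
            EXEC dbo.WalkContainer @containerID = @objID, @depth = @nextDepth;
        END

        -- Fetching again here also resets @@FETCH_STATUS after the nested call
        FETCH NEXT FROM child_cursor INTO @objID, @objType;
    END
    CLOSE child_cursor;
    DEALLOCATE child_cursor;
END;
```

Keep in mind that SQL Server caps stored procedure nesting at 32 levels, which is fine for the dozen or so levels mentioned in the question.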

Cursors vs Procedures in SQL

So, I just learned about CURSORS but still don't exactly grasp them. What is the difference between a cursor and procedure or even a function?
So far, from the various examples (DECLARE CURSOR ... SELECT ... FROM ...), it seems that at most it's a variable to hold a query. Is the data real time, or a snapshot from when the cursor was declared?
i.e.
I have a table with one row and one col with a value of 2.
I do DECLARE CURSOR ... SELECT * FROM table1
I then insert a new row with a value of 3.
When I run the cursor, would I just get the one row from before the cursor was declared, or both rows?
Thanks
I would recommend researching a bit on "Set based" versus "Row Based". This article does a decent job.
Most database systems are geared toward performing set based operations. Because of this you will often see performance problems when you perform row based operations (like using a cursor). In my experience A LOT of sql that uses cursors can be rewritten without cursors.
In the example you asked about your cursor would only have one record in it.
Also, keep in mind that a stored procedure can make use of a cursor.
I believe the documentation will answer some of your questions. Read through the different options like "Insensitive" to see what they mean.
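As an illustration of what the STATIC (INSENSITIVE) option does in SQL Server, here is a small sketch built on the one-row table from the question. The temp table and cursor names are made up.

```sql
CREATE TABLE #table1 (val INT);
INSERT INTO #table1 (val) VALUES (2);

DECLARE @val INT;
DECLARE snap_cursor CURSOR LOCAL STATIC READ_ONLY FORWARD_ONLY FOR
    SELECT val FROM #table1;

OPEN snap_cursor;                        -- STATIC: the result set is copied when the cursor is opened
INSERT INTO #table1 (val) VALUES (3);    -- inserted after OPEN, so it is not part of the copy

FETCH NEXT FROM snap_cursor INTO @val;
WHILE @@FETCH_STATUS = 0
BEGIN
    PRINT @val;                          -- rows seen by the cursor
    FETCH NEXT FROM snap_cursor INTO @val;
END

CLOSE snap_cursor;
DEALLOCATE snap_cursor;
DROP TABLE #table1;
```

With STATIC the loop should only see the row that existed when the cursor was opened. Swapping STATIC for DYNAMIC (and dropping FORWARD_ONLY if you like) is an easy way to compare the behaviour on your own system.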
Also, as a general rule, it is frowned upon to use cursors. You should always try to find a "set based" solution before going the cursor route. There is much debate and documentation on this subject as well that is easily accessible.
My advice would be to forget you learned the syntax for cursors. Cursors are a last-resort technique and should only be used by an expert who understands what impact the cursor will have on performance and why the set-based alternatives won't work in their specific case. Most things done in cursors are far more easily done in a set-based fashion if you understand set operations. This link will help you learn what you need to know so that you will only rarely have to write a cursor:
http://wiki.lessthandot.com/index.php/Cursors_and_How_to_Avoid_Them
CURSOR:
Use one to do something that has to be done row by row and is not possible with a simple query.
For example, if you have a temporary table to store a hierarchy of categories, a cursor can be useful to populate it.
Procedures, or stored procedures, can hold a collection of SQL statements (including cursors). When executed, with params passed if there are any, a procedure runs the statements in it.
A function or procedure is a set of instructions to perform some task.
A cursor is like an array that can store the result set of a query.
Stored procedures are pre-compiled objects that execute as a batch of statements, whereas cursors are used to process a result set row by row.
For example: you can think of a cursor as a bag (a cursor is similar to a pointer that points at a single row out of many). You put all the results of a query run from within a procedure into it, and whenever you need the results, you just open the bag and go through them row by row.

SQL Server Fast Forward Cursors

It is generally accepted that the use of cursors in stored procedures should be avoided where possible (replaced with set-based logic etc). If you take the cases where you need to iterate over some data, and can do so in a read-only manner, are fast forward (read-only, forward-only) cursors more or less inefficient than, say, while loops? From my investigations it looks as though the cursor option is generally faster and uses fewer reads and less CPU time. I haven't done any extensive testing, but is this what others find? Do cursors of this type (fast forward) carry additional overhead or resource costs that I don't know about?
Is all the talk about not using cursors really about avoiding the use of cursors when set-based approaches are available, and the use of updatable cursors etc.?
While a fast forward cursor does have some optimizations in Sql Server 2005, it is not true that they are anywhere close to a set based query in terms of performance. There are very few situations where cursor logic cannot be replaced by a set-based query. Cursors will always be inherently slower, due in part to the fact that you have to keep interrupting the execution in order to fill your local variables.
Here are few references, which would only be the tip of the iceberg if you research this issue:
http://www.code-magazine.com/Article.aspx?quickid=060113
http://dataeducation.com/re-inventing-the-recursive-cte/
This answer hopes to consolidate the replies given to date.
1) If at all possible, use set-based logic for your queries, i.e. try to use just SELECT, INSERT, UPDATE or DELETE with the appropriate FROM clauses or nested queries. These will almost always be faster.
2) If the above is not possible, then in SQL Server 2005+ FAST FORWARD cursors are efficient, perform well, and should be used in preference to while loops.
You can avoid cursors most of the time, but sometimes it's necessary.
Just keep in mind that FAST_FORWARD is DYNAMIC ... FORWARD_ONLY you can use with a STATIC cursor.
Try using it on the Halloween problem to see what happens !!!
IF OBJECT_ID('Funcionarios') IS NOT NULL
DROP TABLE Funcionarios
GO
CREATE TABLE Funcionarios(ID Int IDENTITY(1,1) PRIMARY KEY,
ContactName Char(7000),
Salario Numeric(18,2));
GO
INSERT INTO Funcionarios(ContactName, Salario) VALUES('Fabiano', 1900)
INSERT INTO Funcionarios(ContactName, Salario) VALUES('Luciano',2050)
INSERT INTO Funcionarios(ContactName, Salario) VALUES('Gilberto', 2070)
INSERT INTO Funcionarios(ContactName, Salario) VALUES('Ivan', 2090)
GO
CREATE NONCLUSTERED INDEX ix_Salario ON Funcionarios(Salario)
GO
-- Halloween problem, will update all rows until they reach 3000 !!!
UPDATE Funcionarios SET Salario = Salario * 1.1
FROM Funcionarios WITH(index=ix_Salario)
WHERE Salario < 3000
GO
-- Simulate here with all different CURSOR declarations
-- DYNAMIC update the rows until all of them reach 3000
-- FAST_FORWARD update the rows until all of them reach 3000
-- STATIC update the rows only one time.
BEGIN TRAN
DECLARE @ID INT
DECLARE TMP_Cursor CURSOR DYNAMIC
--DECLARE TMP_Cursor CURSOR FAST_FORWARD
--DECLARE TMP_Cursor CURSOR STATIC READ_ONLY FORWARD_ONLY
FOR SELECT ID
FROM Funcionarios WITH(index=ix_Salario)
WHERE Salario < 3000
OPEN TMP_Cursor
FETCH NEXT FROM TMP_Cursor INTO @ID
WHILE @@FETCH_STATUS = 0
BEGIN
SELECT * FROM Funcionarios WITH(index=ix_Salario)
UPDATE Funcionarios SET Salario = Salario * 1.1
WHERE ID = @ID
FETCH NEXT FROM TMP_Cursor INTO @ID
END
CLOSE TMP_Cursor
DEALLOCATE TMP_Cursor
SELECT * FROM Funcionarios
ROLLBACK TRAN
GO
Some alternatives to using a cursor:
WHILE loops
Temp tables
Derived tables
Correlated subqueries
CASE statements
Multiple queries
Often, cursor operations can also be achieved with non-cursor techniques.
If you are sure that a cursor is needed, reduce the number of records it processes as much as possible. One way to do this is to first load the records to be processed into a temp table and then declare the cursor over the temp table rather than the original table. This assumes the temp table holds far fewer records than the original table. With fewer records, the cursor completes faster.
Some cursor properties that affect performance include:
FORWARD_ONLY: The cursor can only move forward from the first row to the last with FETCH NEXT. Unless it is declared as KEYSET or STATIC, the SELECT is re-evaluated on each fetch.
STATIC: Creates a temporary copy of the result set for the cursor to use. This avoids re-evaluating the query on each fetch, which improves performance. The cursor does not allow modifications, and changes to the table are not reflected when a fetch is called.
KEYSET: The keys of the cursor's rows are stored in a table in tempdb, and changes to nonkey columns are reflected when a fetch is called. However, new records added to the table are not reflected. With a keyset cursor, the SELECT statement is not evaluated again.
DYNAMIC: All changes to the table are reflected in the cursor. The cursor is re-evaluated on each fetch. It uses a lot of resources and adversely affects performance.
FAST_FORWARD: The cursor is forward-only, like FORWARD_ONLY, but is also read-only. It is a performance-optimized FORWARD_ONLY cursor and is not re-evaluated on every fetch. It gives the best performance where it suits the code.
OPTIMISTIC: This option can be used to update rows through the cursor. If a row is fetched and then changed by someone else between the fetch and the update, the cursor update fails. If you use an OPTIMISTIC cursor to update rows, those rows should not be updated by another process at the same time.
NOTE: If the cursor type is not specified, the default is FORWARD_ONLY.
"If You want a even faster cursor than FAST FORWARD then use a STATIC cursor. They are faster than FAST FORWARD. Not extremely faster but can make a difference."
Not so fast! According to Microsoft:
"Typically, when these conversions occurred, the cursor type degraded to a ‘more expensive’ cursor type. Generally, a (FAST) FORWARD-ONLY cursor is the most performant, followed by DYNAMIC, KEYSET, and finally STATIC which is generally the least performant."
from: Link
People avoid cursors because they are generally more difficult to write than a simple while loop. However, a while loop can be expensive because you're constantly selecting data from a table, temporary or otherwise.
With a cursor that is read-only and fast forward, the data is kept in memory, and the cursor has been specifically designed for looping.
This article highlights that an average cursor runs 50 times faster than a while loop.
To answer Mile's original questions...
Fast Forward, Read Only, Static cursors (affectionately known as a "Fire Hose Cursor") are typically as fast as or faster than an equivalent Temp Table and a While loop because such a cursor is nothing more than a Temp Table and a While loop that has been optimized a bit behind the scenes.
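For anyone who hasn't seen the "Temp Table and a While loop" pattern being compared here, a minimal sketch follows. The temp table #work and its contents are invented for the example.

```sql
-- Hypothetical work list with a clustered key to seek on
CREATE TABLE #work (OrderID INT PRIMARY KEY, Region VARCHAR(50));
INSERT INTO #work (OrderID, Region)
VALUES (10, 'North'), (20, 'South'), (30, 'East');

DECLARE @OrderID INT, @Region VARCHAR(50);

-- Start at the lowest key
SELECT @OrderID = MIN(OrderID) FROM #work;

WHILE @OrderID IS NOT NULL
BEGIN
    -- One seek to read the current row
    SELECT @Region = Region FROM #work WHERE OrderID = @OrderID;

    -- Placeholder for the real per-row work
    PRINT 'Processing order ' + CAST(@OrderID AS VARCHAR(12)) + ' in ' + @Region;

    -- Another seek to find the next key; MIN returns NULL when none is left
    SELECT @OrderID = MIN(OrderID) FROM #work WHERE OrderID > @OrderID;
END

DROP TABLE #work;
```

Each iteration costs two separate queries against #work, which is why the index matters and why a cursor declared LOCAL STATIC READ_ONLY FORWARD_ONLY over the same table can come out about even or ahead.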
To add to what Eric Z. Beard posted on this thread and to further answer the question of...
"Is all the talk about not using cursors really about avoiding the use
of cursors when set-based approaches are available, and the use of
updatable cursors etc."
Yes. With very few exceptions, it takes less time and less code to write proper set-based code to do the same thing as most cursors and has the added benefit of using much fewer resources and usually runs MUCH faster than a cursor or While loop. Generally speaking and with the exception of certain administrative tasks, they really should be avoided in favor of properly written set-based code. There are, of course, exceptions to every "rule" but, in the case of Cursors, While loops, and other forms of RBAR, most people can count the exceptions on one hand without using all of the fingers. ;-)
There's also the notion of "Hidden RBAR". This is code that looks set-based but actually isn't. This type of "set-based" code is the reason why certain people have embraced RBAR methods and say they're "OK". For example, solving the running total problem using an aggregated (SUM) correlated sub-query with an inequality in it to build the running total isn't really set-based in my book. Instead, it's RBAR on steroids because, for each row calculated, it has to repeatedly "touch" many other rows at a rate of N*(N+1)/2. That's known as a "Triangular Join" and is at least half as bad as a full Cartesian Join (Cross Join or "Square Join").
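To show what that triangular join looks like in practice, here is a sketch against a made-up table dbo.Ledger(EntryID, Amount), next to a windowed SUM. The windowed form with ORDER BY needs SQL Server 2012 or later, so it is not an option on the older versions discussed in this thread.

```sql
-- Hypothetical table: dbo.Ledger(EntryID INT PRIMARY KEY, Amount NUMERIC(18,2))

-- "Hidden RBAR": for every outer row the subquery re-reads all earlier rows,
-- so N rows cause roughly N*(N+1)/2 row touches
SELECT l.EntryID,
       l.Amount,
       (SELECT SUM(p.Amount)
        FROM dbo.Ledger AS p
        WHERE p.EntryID <= l.EntryID) AS RunningTotal
FROM dbo.Ledger AS l
ORDER BY l.EntryID;

-- Window aggregate: one ordered pass over the rows
SELECT EntryID,
       Amount,
       SUM(Amount) OVER (ORDER BY EntryID
                         ROWS UNBOUNDED PRECEDING) AS RunningTotal
FROM dbo.Ledger
ORDER BY EntryID;
```

Both queries should return the same running totals; comparing their logical reads with SET STATISTICS IO ON is a quick way to see the difference on a table of any real size.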
Although MS has made some improvements in how Cursors work since SQL Server 2005, the term "Fast Cursor" is still an oxymoron compared to properly written set-based code. That also holds true even in Oracle. I worked with Oracle for a short 3 years in the past but my job was to make performance improvements in existing code. Most of the really substantial improvements were realized when I converted Cursors to set-based code. Many jobs that previously took 4 to 8 hours to execute were reduced to minutes and, sometimes, seconds.
The 'Best Practice' of avoiding cursors in SQL Server dates back to SQL Server 2000 and earlier versions. The rewrite of the engine in SQL 2005 addressed most of the issues related to the problems of cursors, particularly with the introduction of the fast forward option. Cursors are not necessarily worse than set-based and are used extensively and successfully in Oracle PL/SQL (LOOP).
The 'generally accepted' view that you refer to was valid, but is now outdated and incorrect. Go on the assumption that fast forward cursors behave as advertised and perform. Do some tests and research, basing your findings on SQL2005 and later.
If you want an even faster cursor than FAST FORWARD, then use a STATIC cursor. They are faster than FAST FORWARD. Not extremely faster, but it can make a difference.