I'm aware that SQL Server has issues with running user-defined functions over large rowsets. But my problem is a little bit different.
I have test code like:
SELECT TESTNAME = dbo.TESTFUNC()
INTO #TEMPTABLE
FROM SOMETABLE
TESTFUNC has no real logic in it; it just returns 1 for testing purposes.
When I run this, it takes >= 24 seconds for 1.5 million rows.
If I put the same code inside a stored procedure and execute it with the same user and everything, it takes ~ 6 seconds.
The query time stats for the two plans linked below, ad-hoc statement vs. stored procedure, show the same gap. The stored procedure version is:
CREATE PROCEDURE TESTPROC
AS
BEGIN
    SELECT TESTNAME = dbo.TESTFUNC()
    INTO #TEMPTABLE
    FROM SOMETABLE
END
The estimated and actual execution plans are the same for both, but the "Compute Scalar" step takes a lot longer when the statement is called directly.
What makes the difference?
Edit: Actual execution plans
Edit 2: Function definition
CREATE FUNCTION TESTFUNC ()
RETURNS BIGINT
AS
BEGIN
    RETURN 1
END
Plans:
https://www.brentozar.com/pastetheplan/?id=Sy0gFh53F
https://www.brentozar.com/pastetheplan/?id=r1Dfi352F
(@PaymentId int)
returns int
as
begin
    declare @viewedCount int

    select @viewedCount = Count(OtSrno)
    from OtTendersViewDetail
    where OTPaymentId = @PaymentId
      and OTPaymentId is not null

    return (@viewedCount)
end
Optimizing is normally something you do for complex operations; as pure SQL goes, this one is as good as it gets.
What you can do is use WITH RECOMPILE or OPTIMIZE FOR UNKNOWN to avoid parameter sniffing, which means reusing a query plan compiled for different parameter values.
What you can also do (which is better) is check for OTPaymentId being null and IF/ELSE between two different SELECT statements. Again, this is about query plan reuse and not getting stuck with a bad query plan.
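A minimal sketch of both suggestions, recast as a stored procedure so the hints apply cleanly; the procedure name is hypothetical, the table and columns come from the fragment above:

-- dbo.GetViewedCount is a hypothetical name; logic taken from the function above.
CREATE PROCEDURE dbo.GetViewedCount
    @PaymentId int
AS
BEGIN
    IF @PaymentId IS NULL
        -- Equality with NULL never matches, so skip the query entirely.
        SELECT 0 AS ViewedCount;
    ELSE
        SELECT COUNT(OtSrno) AS ViewedCount
        FROM OtTendersViewDetail
        WHERE OTPaymentId = @PaymentId
        OPTION (OPTIMIZE FOR UNKNOWN); -- or create the proc WITH RECOMPILE
END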
If I have this statement:
DECLARE @i INT;
DECLARE @d NUMERIC(9,3);
SET @i = 123;
SET @d = @i;
SELECT @d;
and I include the actual execution plan and run this query, I don't get an execution plan. Does a query produce an execution plan only when there is a FROM clause in the batch?
The simple answer is you don't get execution plans without table access.
Execution plans are what the optimiser produces: it works out the best way to satisfy the query based on indexes, statistics, etc.
What you have above is trivial and has no table access. Why do you need a plan?
Edit:
A derived table counts as table access, as per Lucero's example in the comments.
Edit 2:
"Trivial" table access gives constant scans, not real scans or seeks:
SELECT * FROM sys.tables WHERE 1=0
Lucero's examples in comments
What do you mean by "what will trigger an execution plan"? I also didn't understand "I include actual execution plan and run this query, I don't get an execution plan". Hope this link may help you:
SQL Tuning Tutorial - Understanding a Database Execution Plan (1)
I would assume that a query plan is generated whenever a set-based operation needs to be performed.
Yes, you need a FROM clause. You can do it like this:
declare @i int
declare @d numeric(9,3)
set @i = 123
select @d = @i
from (select 1) as x(x)
select @d
And in the execution plan you see this:
<ScalarOperator ScalarString="CONVERT_IMPLICIT(numeric(9,3),[@i],0)">
I want to check whether SQL Server (2000/05/08) has the ability to write a nested stored procedure. What I mean is WRITING a sub-function/procedure inside another stored procedure, NOT calling another SP.
Why I was thinking about it: one of my SPs has repeated lines of code that are specific to only this SP. If we had this nested-SP feature, I could declare another sub/local procedure inside my main SP, put all the repeating lines in it, and call that local procedure from my main SP. I remember such a feature being available in Oracle SPs.
If SQL Server also has this feature, can someone please explain some more details about it or provide a link to the documentation?
Thanks in advance
Sai
I don't recommend doing this, as a new execution plan must be calculated each time it is created, but YES, it definitely can be done (everything is possible, but not always recommended).
Here is an example:
CREATE PROC [dbo].[sp_helloworld]
AS
BEGIN
    SELECT 'Hello World'

    DECLARE @sSQL VARCHAR(1000)
    SET @sSQL = 'CREATE PROC [dbo].[sp_helloworld2]
    AS
    BEGIN
        SELECT ''Hello World 2''
    END'

    EXEC (@sSQL)
    EXEC [sp_helloworld2];

    DROP PROC [sp_helloworld2];
END
You will get the warning
The module 'sp_helloworld' depends on the missing object 'sp_helloworld2'.
The module will still be created; however, it cannot run successfully until
the object exists.
You can bypass this warning by using EXEC('sp_helloworld2') above.
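That is, inside sp_helloworld the direct call is what triggers the dependency warning at CREATE time; a deferred, dynamic call avoids it:

-- Direct call: the missing object is referenced at CREATE time,
-- so the warning is raised.
EXEC [sp_helloworld2];

-- Dynamic call: the name is only resolved at execution time,
-- so no warning.
EXEC ('sp_helloworld2');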
But if you call EXEC [sp_helloworld] you will get the results
Hello World
Hello World 2
It does not have that feature. It is hard to see what real benefit such a feature would provide, apart from stopping the code in the nested SPROC from being called from elsewhere.
Oracle's PL/SQL is something of a special case, being a language heavily based on Ada, rather than simple DML with some procedural constructs bolted on. Whether or not you think this is a good idea probably depends on your appetite for procedural code in your DBMS and your liking for learning complex new languages.
The idea of a subroutine, to reduce duplication or otherwise, is largely foreign to other database platforms in my experience (Oracle, MS SQL, Sybase, MySQL, SQLite in the main).
While the SQL-building proc would work, I think John's right in suggesting that you don't use his otherwise-correct answer!
You don't say what form your repeated lines take, so I'll offer three potential alternatives, starting with the simplest:
1. Do nothing. Accept that procedural SQL is a primitive language lacking so many "essential" constructs that you wouldn't use it at all if it weren't the only option.
2. Move your procedural operations outside of the DBMS and execute them in code written in a more sophisticated language. Consider ways in which your architecture could be adjusted to extract business logic from your data storage platform (hey, why not redesign the whole thing!).
3. If the repetition is happening in DML, SELECTs in particular, consider introducing views to slim down the queries (see the sketch below).
4. Write code to generate, as part of your build process, the stored procedures. That way if the repeated lines ever need to change, you can change them in one place and automatically regenerate the repetition.
That's four. I thought of another one as I was typing; consider it a bonus.
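As a sketch of the view suggestion, assume a join that several procedures repeat verbatim (all names here are hypothetical):

-- Capture the repeated join once as a view.
CREATE VIEW dbo.vwOrderSummary
AS
SELECT o.OrderId, o.OrderDate, c.CustomerId, c.Name AS CustomerName
FROM dbo.Orders o
JOIN dbo.Customers c
    ON c.CustomerId = o.CustomerId
GO

-- Each procedure then queries the view instead of repeating the join.
SELECT OrderId, CustomerName
FROM dbo.vwOrderSummary
WHERE OrderDate >= '20240101'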
CREATE TABLE #t1 (digit INT, name NVARCHAR(10));
GO
CREATE PROCEDURE #insert_to_t1
(
    @digit INT
  , @name NVARCHAR(10)
)
AS
BEGIN
    MERGE #t1 AS tgt
    USING (SELECT @digit, @name) AS src (digit, name)
    ON (tgt.digit = src.digit)
    WHEN MATCHED THEN
        UPDATE SET name = src.name
    WHEN NOT MATCHED THEN
        INSERT (digit, name) VALUES (src.digit, src.name);
END;
GO

EXEC #insert_to_t1 1, 'One';
EXEC #insert_to_t1 2, 'Two';
EXEC #insert_to_t1 3, 'Three';
EXEC #insert_to_t1 4, 'Not Four';
EXEC #insert_to_t1 4, 'Four'; -- updates the previous record!

SELECT * FROM #t1;
What we're doing here is creating a procedure that lives for the life of the connection and which is then later used to insert some data into a table.
John's sp_helloworld does work, but here's the reason why you don't see this done more often.
There is a very large performance impact when a stored procedure is compiled. There's a Microsoft article on troubleshooting performance problems caused by a large number of recompiles, because this really slows your system down quite a bit:
http://support.microsoft.com/kb/243586
Instead of creating the stored procedure, you're better off just creating a string variable with the T-SQL you want to call, and then repeatedly executing that string variable.
Don't get me wrong - that's a pretty bad performance idea too, but it's better than creating stored procedures on the fly. If you can persist this code in a permanent stored procedure or function and eliminate the recompile delays, SQL Server can build a single execution plan for the code once and then reuse that plan very quickly.
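A minimal sketch of that string-variable approach, using sp_executesql so the same string (and its cached plan) is reused across calls; the table and parameter names here are hypothetical:

-- dbo.SomeTable, @digit, and @name are hypothetical stand-ins.
DECLARE @sql NVARCHAR(500)
SET @sql = N'UPDATE dbo.SomeTable SET name = @name WHERE digit = @digit;'

-- Execute the same string repeatedly; parameterizing it via
-- sp_executesql lets SQL Server cache and reuse one plan.
EXEC sp_executesql @sql, N'@digit INT, @name NVARCHAR(10)', @digit = 1, @name = N'One'
EXEC sp_executesql @sql, N'@digit INT, @name NVARCHAR(10)', @digit = 2, @name = N'Two'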
I just had a similar situation in a SQL trigger (similar to a SQL procedure) where I basically had the same insert statement to be executed a maximum of 13 times, for the 13 possible key values resulting from one event. I used a counter, looped 13 times with a WHILE loop, used CASE to process each of the key values, and kept a flag to figure out when I needed to insert and when to skip.
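A rough sketch of that counter/WHILE/CASE pattern; the key-value mapping and the target table are hypothetical stand-ins:

DECLARE @i INT, @keyValue INT
SET @i = 1

WHILE @i <= 13
BEGIN
    -- Map the loop counter to a key value; NULL means "skip this one".
    SET @keyValue = CASE @i
                        WHEN 1 THEN 100  -- hypothetical mappings
                        WHEN 2 THEN 200
                        ELSE NULL
                    END

    IF @keyValue IS NOT NULL
        INSERT INTO dbo.EventKeys (KeyValue)  -- hypothetical table
        VALUES (@keyValue)

    SET @i = @i + 1
END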
It would be very nice if MS developed GOSUB besides GOTO; it would be an easy thing to do! Creating procedures or functions for "internal routines" pollutes the object structure.
I "implement" it like this:
BODY1:
    goto HEADER
HEADER_RET1:
    insert into body ...
    goto BODY1_RET

BODY2:
    goto HEADER
HEADER_RET2:
    insert into body ...
    goto BODY2_RET

HEADER:
    insert into header ...
    if @fork = 1 goto HEADER_RET1
    if @fork = 2 goto HEADER_RET2
    select 1/0 -- flow check!
I too had need of this. I had two functions that brought back case counts to a stored procedure, which was pulling a list of all users and their case counts.
Along the lines of
select name, userID, dbo.fnCaseCount(userID), dbo.fnRefCount(UserID)
from table1 t1
left join table2 t2
    on t1.userID = t2.UserID
For a relatively tiny set (400 users), it was calling each of the two functions once per row. In total, that's 800 calls out from the stored procedure. Not pretty, but one wouldn't expect SQL Server to have a problem with that few calls.
This was taking over 4 minutes to complete.
Individually, each function call was nearly instantaneous. Even 800 near-instantaneous calls should be nearly instantaneous.
All indexes were in place, and SSMS suggested no new indexes when the execution plan was analyzed for both the stored procedure and the functions.
I copied the code from the function and put it directly into the SQL query in the stored procedure. It appears the transition between the SP and the function is what ate up the time.
Execution time is still too high at 18 seconds, but allows the query to complete within our 30 second time out window.
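The inlined version was presumably something along these lines, assuming each function was a simple per-user count against hypothetical Cases and Refs tables:

select t2.name, t1.userID,
       (select count(*) from Cases c where c.UserID = t1.userID) as CaseCount,
       (select count(*) from Refs r where r.UserID = t1.userID) as RefCount
from table1 t1
left join table2 t2
    on t1.userID = t2.UserID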
If I could have had a sub procedure it would have made the code prettier, but still may have added overhead.
I may next try to move the same functionality into a view I can use in a join.
select t1.UserID, t2.name, v1.CaseCount, v2.RefCount
from table1 t1
left join table2 t2
on t1.userID = t2.UserID
left join vwCaseCount v1
on v1.UserID = t1.UserID
left join vwRefCount v2
on v2.UserID = t1.UserID
Okay, I just created views from the functions, so my execution time went from over 4 minutes, to 18 seconds, to 8 seconds. I'll keep playing with it.
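For reference, vwCaseCount might look something like this, assuming fnCaseCount was a simple per-user count (the table and column names are hypothetical):

CREATE VIEW vwCaseCount
AS
SELECT UserID, COUNT(*) AS CaseCount
FROM dbo.Cases   -- hypothetical table
GROUP BY UserID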
I agree with andynormancx that there doesn't seem to be much point in doing this.
If you really want the shared code to be contained inside the SP then you could probably cobble something together with GOTO or dynamic SQL, but doing it properly with a separate SP or UDF would be better in almost every way.
For whatever it is worth, here is a working example of a GOTO-based internal subroutine. I went that way in order to have a reusable script without side effects, external dependencies, or duplicated code:
DECLARE @n INT

-- Subroutine input parameters:
DECLARE @m_mi INT             -- return selector

-- Subroutine output parameters:
DECLARE @m_use INT            -- instance memory usage
DECLARE @m_low BIT            -- low memory flag
DECLARE @r_msg NVARCHAR(max)  -- low memory description

-- Subroutine internal variables:
DECLARE @v_low BIT,           -- low virtual memory
        @p_low BIT            -- low physical memory
DECLARE @m_gra INT

/* ---------------------- Main script ----------------------- */

-- 1. First subroutine invocation:
SET @m_mi = 1
GOTO MemInfo
MI1: -- return here after invocation
IF @m_low = 1 PRINT '1:Low memory'
ELSE PRINT '1:Memory OK'

SET @n = 2
WHILE @n > 0
BEGIN
    -- 2. Second subroutine invocation:
    SET @m_mi = 2
    GOTO MemInfo
    MI2: -- return here after invocation
    IF @m_low = 1 PRINT '2:Low memory'
    ELSE PRINT '2:Memory OK'
    SET @n = @n - 1
END
GOTO EndOfScript

MemInfo:
/* ------------------- Subroutine MemInfo ------------------- */
-- IN : @m_mi: return point: 1 for label MI1 and 2 for label MI2
-- OUT: @m_use: RAM used by instance,
--      @m_low: low memory condition,
--      @r_msg: low memory message
SET @m_low = 1

SELECT @m_use = physical_memory_in_use_kb / 1024,
       @p_low = process_physical_memory_low,
       @v_low = process_virtual_memory_low
FROM sys.dm_os_process_memory

IF @p_low = 1 OR @v_low = 1
BEGIN
    SET @r_msg = 'Low memory.'
    GOTO LOWMEM
END

SELECT @m_gra = cntr_value
FROM sys.dm_os_performance_counters
WHERE counter_name = N'Memory Grants Pending'

IF @m_gra > 0
BEGIN
    SET @r_msg = 'Memory grants pending'
    GOTO LOWMEM
END

SET @m_low = 0

LOWMEM:
IF @m_mi = 1 GOTO MI1
IF @m_mi = 2 GOTO MI2

EndOfScript:
Thank you all for your replies!
I'm better off, then, creating one more SP with the repeating code and calling that, which is the best way in terms of both performance and looks.
Is there a difference in performance between TOP and SET ROWCOUNT or do they just get executed in the same manner?
Functionally they are the same thing; as far as I know there are no significant performance differences between the two.
Just one thing to note: once you have set ROWCOUNT, it will persist for the life of the connection, so make sure you reset it to 0 once you are done with it.
EDIT (post Martin's comment)
The scope of SET ROWCOUNT is for the current procedure only. This includes procedures called by the current procedure. It also includes dynamic SQL executed via EXEC or SP_EXECUTESQL since they are considered "child" scopes.
Notice that SET ROWCOUNT is in a BEGIN/END scope, but it extends beyond that.
create proc test1
as
begin
    begin
        set rowcount 100
    end
    exec ('select top 101 * from master..spt_values')
end
GO
exec test1
select top 102 * from master..spt_values
Result = 100 rows, then 102 rows
One more note about performance, according to BOL:
As a part of a SELECT statement, the query optimizer can consider the value of expression in the TOP or FETCH clauses during query optimization. Because SET ROWCOUNT is used outside a statement that executes a query, its value cannot be considered in a query plan.
Article on BOL
Meaning there might actually be a performance difference between the two.
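A small illustration of the BOL point, reusing spt_values from the example above; the TOP value is visible to the optimizer at compile time, while SET ROWCOUNT only takes effect while the query runs:

-- The optimizer knows only 10 rows are needed and can plan accordingly.
SELECT TOP (10) *
FROM master..spt_values
ORDER BY number

-- The plan is built as if all rows were needed; the limit is applied
-- only at execution time.
SET ROWCOUNT 10
SELECT *
FROM master..spt_values
ORDER BY number
SET ROWCOUNT 0  -- reset when done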