Comparing The Performance Of Indexed Views And Stored Procedures In SQL Server

I've just recently become aware of the fact that you can now index your views in SQL Server (see http://technet.microsoft.com/en-us/library/cc917715.aspx). I'm now trying to figure out when I'd get better performance from a query against an indexed view versus the same query inside a stored procedure that's had its execution plan cached.
Take for example the following:
SELECT colA, colB, sum(colC), sum(colD), colE
FROM myTable
WHERE colFDate < '9/30/2011'
GROUP BY colA, colB, colE
The date will be different every time it's run, so if this were a view, I wouldn't include the WHERE in the view and would instead have that as part of my SELECT against the view. If it were a stored procedure, the date would be a parameter. Note: there are about 300,000 rows in the table; 200,000 of them would meet the WHERE clause on the date, and 10,000 would be returned after the GROUP BY.
If this were an indexed view, should I expect better performance from it than from a stored procedure that's had an opportunity to cache its execution plan? Or would the proc be faster? Or would the difference be negligible? I know we could say "just try both out", but there are too many factors that could falsely bias the results and lead me to a false conclusion, so I'd like to hear more of the theory behind it and what the expected outcomes are instead.
Thanks!

An indexed view can be regarded as a normal table - it's a materialized collection of rows.
So the question really boils down to whether or not a "normal" query is faster than a stored procedure.
If you look at the steps SQL Server goes through to execute any query (stored procedure call or ad-hoc SQL statement), you'll find roughly these steps:
syntactically check the query
if it's okay - it checks the plan cache to see if it already has an execution plan for that query
if there is an execution plan - that plan is (re-)used and the query executed
if there is no plan yet, an execution plan is determined
that plan is stored into the plan cache for later reuse
the query is executed
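You can actually watch this reuse happening. As a rough illustration (the DMVs are real; the LIKE filter is just a placeholder to find your own statement), usecounts climbs each time a cached plan gets reused:

select cp.usecounts, cp.objtype, st.text
from sys.dm_exec_cached_plans cp
cross apply sys.dm_exec_sql_text(cp.plan_handle) st
where st.text like '%myTable%' -- placeholder: filter for your query text
order by cp.usecounts desc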
The point is: ad-hoc SQL and stored procedures are treated no differently.
If an ad-hoc SQL query properly uses parameters - as it should anyway, to prevent SQL injection attacks - its performance characteristics are no different from, and most definitely no worse than, those of a stored procedure.
Stored procedures have other benefits (no need to grant users direct table access, for instance), but in terms of performance, properly parametrized ad-hoc SQL queries are just as efficient as stored procedures.
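For illustration, here's what a properly parametrized ad-hoc version of the question's query could look like via sp_executesql (table and column names are taken from the question; treat this as a sketch, not a drop-in):

exec sp_executesql
    N'SELECT colA, colB, SUM(colC), SUM(colD), colE
      FROM dbo.myTable
      WHERE colFDate < @cutoff
      GROUP BY colA, colB, colE',
    N'@cutoff datetime',   -- parameter declaration
    @cutoff = '20110930'   -- the changing date, passed as a parameter

Because only the parameter value changes between calls, every execution hits the same cached plan.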
Using stored procedures over non-parametrized queries is better for two main reasons:
since each non-parametrized query is a new, different query to SQL Server, it has to go through all the steps of determining the execution plan for every single one - wasting time, and also wasting plan cache space, since the stored execution plan doesn't really help when that particular query will probably never be executed again
non-parametrized queries are at risk of SQL injection attack and should be avoided at all costs
Now of course, if your indexed view reduces the number of rows significantly (through its GROUP BY clause), then that indexed view will be significantly faster than a stored procedure running against the whole dataset. But that's not because of the different approaches taken - it's simply a matter of scale: querying a few dozen or a few hundred rows will be faster than querying 200,000 or more rows, no matter how you query.
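For what it's worth, here's a sketch of what such an indexed view could look like for the question's query - assuming colC and colD are declared NOT NULL (indexed views disallow SUM over nullable expressions), and noting that colFDate has to be part of the view's grain if you want to filter on it afterwards:

CREATE VIEW dbo.vMyTableSummary
WITH SCHEMABINDING               -- mandatory for indexed views
AS
SELECT colA, colB, colE, colFDate,
       SUM(colC) AS sumC,
       SUM(colD) AS sumD,
       COUNT_BIG(*) AS cnt       -- mandatory when the view uses GROUP BY
FROM dbo.myTable
GROUP BY colA, colB, colE, colFDate;
GO
CREATE UNIQUE CLUSTERED INDEX IX_vMyTableSummary
    ON dbo.vMyTableSummary (colA, colB, colE, colFDate);

A query with the date filter then reads the pre-aggregated rows instead of scanning the 200,000 matching base rows.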

Related

Why does exec sp_recompile sometimes not help parameter sniffing?

We have a complex stored procedure that is sometimes subject to parameter sniffing. It is a large, "all-in-one" procedure that is called by many different parts of the system and so it stands to reason that one query plan would not fit all use cases.
This works fine except periodically ONE particular report goes from seconds to minutes. In the past, a quick exec sp_recompile would speed it back up immediately. Now that never works. The report just eventually "fixes itself" in a day or two, meaning it goes back to taking seconds.
Refactoring the stored procedure is currently not an option and I don't want to do the other recommended approaches (saving parameters to local variables, WITH RECOMPILE, OPTIMIZE FOR UNKNOWN) as those are said to have other side effects.
So I have these questions:
Why wouldn't exec sp_recompile speed it up like before?
How can I tell if exec sp_recompile actually cleared the query plan cache? What should be run before, and after, the exec? I've tried some queries from the web but can't clearly tell if something changed, so a specific recipe would be great to have.
Would it be reasonable to clone the procedure with a different name, and call that clone just for this one report? The goal would be to get SQL Server to cache a separate plan just for the report. But I'm not sure if SQL Server caches plans by procedure name, or if it caches the various queries inside the stored procedures. (If it's the latter, this approach is useless, as any clones of the procedure would have the same queries.)
Using several CTEs, especially in complex queries (much as when joining with views), can cause the query optimiser problems producing an optimal execution plan.
If you have a lot of CTE definitions, SQL Server will attempt to construct a single monolithic execution plan, and you could hit a plan compilation timeout, resulting in a sub-optimal plan being used.
You could instead replace the CTEs with temp tables - using intermediate results often performs better, as each query executes in isolation with a dedicated, optimal (or at least better) plan. This can also help the optimizer make better choices for joins and index usage.
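A sketch of that pattern, with made-up table names:

-- instead of one monolithic plan with the CTE inlined into every reference:
;WITH custTotals AS (
    SELECT CustomerId, SUM(Amount) AS Total
    FROM dbo.Orders
    GROUP BY CustomerId
)
SELECT c.CustomerName, t.Total
FROM dbo.Customers c
JOIN custTotals t ON t.CustomerId = c.CustomerId;

-- ...materialize the intermediate result so each step is optimized in isolation:
SELECT CustomerId, SUM(Amount) AS Total
INTO #custTotals
FROM dbo.Orders
GROUP BY CustomerId;

CREATE CLUSTERED INDEX IX_custTotals ON #custTotals (CustomerId); -- real statistics + an access path

SELECT c.CustomerName, t.Total
FROM dbo.Customers c
JOIN #custTotals t ON t.CustomerId = c.CustomerId;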
If you have two distinctly different parameter profiles that each ideally need their own optimal plan, then an option would be, as you suggest, to duplicate the procedure, one copy per use-case.
You can confirm that this results in a separate execution plan by querying for your procedure name via sys.dm_exec_sql_text:
select s.plan_handle, t.text
from sys.dm_exec_query_stats s
cross apply sys.dm_exec_sql_text(s.plan_handle) t
where t.text like '%proc name%'
You will note you have a different plan_handle for each procedure.

How can a stored proc have multiple execution plans?

I am working with MS SQL Server 2008 R2. I have a stored procedure named rpt_getWeeklyScheduleData. This is the query I used to look up its execution plan in a specific database:
select *
from sys.dm_exec_cached_plans cp
CROSS APPLY sys.dm_exec_sql_text(cp.plan_handle) st
where OBJECT_NAME(st.objectid, st.dbid) = 'rpt_getWeeklyScheduleData'
  and st.dbid = DB_ID()
The above query returns me 9 rows. I was expecting 1 row.
This stored procedure has been modified multiple times, so I believe SQL Server has been building a new execution plan for it whenever it was modified and run. Is that the correct explanation? If not, how can you explain this?
Also, is it possible to see when each plan was created? If yes, then how?
UPDATE:
This is the stored proc's signature:
CREATE procedure [dbo].[rpt_getWeeklyScheduleData]
(
    @a_paaipk int,
    @a_location_code int,
    @a_department_code int,
    @a_week_start_date varchar(12),
    @a_week_end_date varchar(12),
    @a_language_code int,
    @a_flag int
)
as
begin
...
end
The stored proc is long; it has only two IF conditions, both on the @a_flag parameter.
if @a_flag = 0
begin
...
end
if @a_flag = 1
begin
...
end
Depending on the nature of the stored procedure (which wasn't provided), this is very possible for any number of reasons (most likely not limited to those below):
Does the proc use a lot of branching - if this, then this select; else this other select/update?
Does the proc contain dynamic sql?
Are you executing the SP from both the web and SSMS? Then you're likely executing it with different connection SET options, and each distinct combination of SET options gets its own cached plan (see the query after this list).
Does the stored proc have parameters? Sometimes a difference in parameters can cause one execution plan to be terrible for a specific set, so a different plan is used.
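Regarding the SET options point above: the set_options attribute is part of the plan cache key, so you can check whether your web and SSMS sessions really compiled separate plans:

select cp.plan_handle, pa.attribute, pa.value
from sys.dm_exec_cached_plans cp
cross apply sys.dm_exec_plan_attributes(cp.plan_handle) pa
cross apply sys.dm_exec_sql_text(cp.plan_handle) st
where pa.attribute = 'set_options'
  and st.text like '%rpt_getWeeklyScheduleData%'

Two rows with the same text but different set_options values means the two environments got their own plans.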
Going to try an analogy which might help... maybe...
Say you have a stored procedure for your weekend shopping.
You typically need to get groceries, sometimes an air filter, and even less often a big pack of something that needs replacing 4 times a year.
The grocery store can handle groceries, and is the closest to your house (5 minutes).
Target can handle the air filter and groceries, but adds 25 minutes of travel time.
"Big place of everything" has everything you'd possibly need, but is an hour's drive away.
So here, depending on your parameters @needsAirFilter and @needsBigPackOfSomething, the "execution plan" of your "shopping" stored procedure could change vastly.
If @needsAirFilter and @needsBigPackOfSomething are false, there's no reason to make the 30-minute or hour drive, as everything you need is at the grocery store.
Once a month, @needsAirFilter is true; in that case we need to go to Target, as the grocery store's execution plan is insufficient.
4 times a year @needsBigPackOfSomething is true, and we need to make the hour drive to get the big pack of something, grabbing groceries and the air filter since we're there.
Sure... we could make the hour drive every time to get groceries, and the other things when needed (imagine a single execution plan). But that is in no way the most efficient way to do it. In instances like this, we have different execution plans for what information/goods are actually needed.
No idea if that helps... but I had fun :D
Typically SQL Server will generate a new query plan depending on the values of the parameters being passed in (these can determine what indexes, if any, it will use), and if indexes are added, changed or updated (on the tables/views used in the proc), SQL Server may decide that it is more effective to use one or more indexes it previously ignored. The more involved the SQL in the proc, the more work SQL Server does as it attempts to optimize the query. If the data changes (suddenly you have many more customers in NJ, and there is a query and index for states), it may decide it's going to use that index, and the query plan changes. And if any of the tables or views involved in the query change (a schema change), that will also invalidate an existing plan and result in a new plan being generated.
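On the side question of when each plan was created: sys.dm_exec_query_stats exposes a creation_time column, so something along these lines (a sketch) shows the compile time of each cached plan for the proc:

select qs.plan_handle, qs.creation_time, qs.execution_count
from sys.dm_exec_query_stats qs
cross apply sys.dm_exec_sql_text(qs.sql_handle) st
where OBJECT_NAME(st.objectid, st.dbid) = 'rpt_getWeeklyScheduleData'

(Note this is per statement, so a multi-statement proc will return several rows per plan.)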

Is there any performance difference between views and stored procedures

I have a large amount of data. I have written SQL queries for all of it to retrieve the data. My point is: should I write these queries as views or as SPs?
i.e. I need to know whether there is any major difference between
INSERT INTO TABLE_NAME EXEC SP
OR
INSERT INTO TABLE_NAME SELECT * FROM VIEW
Is there any major performance difference? No - but only if the query inside the stored proc is exactly the same as the query inside the view. If there is a difference, you won't notice it. If you start adding extra code to the proc (parameters, logic, etc.), then all bets are off.
It's a matter of taste, or of reusability (maintenance).
Personally, I prefer dropping and creating tables rather than building a complex view. The reason is simple: anyone can then understand the logic from one single screen, rather than opening multiple tables, views, GUI code, and maybe report processes.
If the query inside the stored procedure is exactly the same as the query in the view, the query optimizer should produce the same execution plan.
I ran some tests here, with the same query in an SP and a view. Both had similar execution times at around 6 seconds each, with the SP being a bit slower. But this was just a test with a simple SQL statement.
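If you want to reproduce that kind of test, a simple approach (table, proc and view names are placeholders) is to compare both inserts with timing statistics switched on:

set statistics time on;

insert into dbo.table_name
exec dbo.sp_name;

insert into dbo.table_name
select * from dbo.view_name;

set statistics time off;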

Option Recompile makes query fast - good or bad?

I have two SQL queries with about 2-3 INNER JOINS each. I need to do an INTERSECT between them.
The problem is that individually the queries run fast, but after the INTERSECT the whole thing takes about 4 seconds in total.
Now, if I put an OPTION (RECOMPILE) at the end of this whole query, it works great again, returning almost instantly!
I understand that OPTION (RECOMPILE) forces a rebuild of the execution plan, so I am confused now: is my earlier query taking 4 seconds better, or is the one with the recompile, taking 0 seconds, better?
Rather than answer the question you asked, here's what you should do:
Update your statistics:
EXEC sp_updatestats
If that doesn't work, rebuild indexes.
If that doesn't work, look at OPTIMIZE FOR
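In T-SQL terms those three steps look roughly like this (table, column and index names are placeholders borrowed from the first question):

-- 1. refresh statistics across the database
EXEC sp_updatestats;

-- 2. if that doesn't help, rebuild the indexes on the tables involved
ALTER INDEX ALL ON dbo.myTable REBUILD;

-- 3. as a last resort, pin the plan to a representative value
DECLARE @cutoff datetime = '20110930';
SELECT colA, colB, SUM(colC) AS sumC
FROM dbo.myTable
WHERE colFDate < @cutoff
GROUP BY colA, colB
OPTION (OPTIMIZE FOR (@cutoff = '20110930'));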
When WITH RECOMPILE is specified, SQL Server does not cache a plan for the stored procedure; the stored procedure is recompiled each time it is executed.
Whenever a stored procedure is run in SQL Server for the first time, it is optimized and a query plan is compiled and cached in SQL Server's memory. Each time the same stored procedure is run after that, it uses the cached query plan, eliminating the need to optimize and compile it on every run. So if you need to run the same stored procedure 1,000 times a day, a lot of time and hardware resources can be saved, and SQL Server doesn't have to work as hard.
In general you should not use this option, because it throws away most of the advantage you gain by substituting stored procedures for plain SQL queries.
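For reference, this is where the option goes (a minimal sketch with made-up names):

CREATE PROCEDURE dbo.usp_Report
    @cutoff datetime
WITH RECOMPILE        -- no plan is cached; compiled fresh on every call
AS
BEGIN
    SELECT colA, SUM(colC) AS sumC
    FROM dbo.myTable
    WHERE colFDate < @cutoff
    GROUP BY colA;
END;

The statement-level OPTION (RECOMPILE) hint from the question is the finer-grained variant: only the hinted statement is recompiled, while the rest of the procedure's plan stays cached.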

Different Execution Plan for the same Stored Procedure

We have a query that is taking around 5 sec on our production system, but on our mirror system (as identical as possible to production) and dev systems it takes under 1 second.
We have checked the query plans and we can see that they differ; from these plans we can also see why one takes longer than the other. The data, schema and servers are similar and the stored procedures identical.
We know how to fix it by re-arranging the joins and adding hints. However, at the moment it would be easier if we didn't have to make any changes to the SProc (paperwork). We have also tried sp_recompile.
What could cause the difference between the two query plans?
System: SQL 2005 SP2 Enterprise on Win2k3 Enterprise
Update: Thanks for your responses, it turns out that it was statistics. See summary below.
Your statistics are most likely out of date. If your data is the same, recompute the statistics on both servers and recompile. You should then see identical query plans.
Also, double-check that your indexes are identical.
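A sketch of both steps, with placeholder table and proc names (sp_updatestats works too if you'd rather hit the whole database):

-- recompute statistics with a full scan on the tables the query touches
UPDATE STATISTICS dbo.myTable WITH FULLSCAN;

-- then mark the proc so its next execution compiles a fresh plan
EXEC sp_recompile N'dbo.mySproc';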
Most likely statistics.
Some thoughts:
Do you do maintenance on your non-prod systems (e.g. rebuild indexes, which also rebuilds statistics)?
If so, do you use the same fillfactor and statistics sample ratio?
Do you restore the database regularly onto test so it's 100% like production?
is the data & data size between your mirror and production as close to the same as possible?
You say you know why one query takes longer than the other - can you post some more details?
Execution plans can differ in such cases because of the data in the tables and/or the statistics. Even where auto update statistics is turned on, the statistics can get out of date (especially on very large tables).
You may find that the optimizer has estimated a table is not that large and opted for a table scan or something like that.
Provided there is no WITH RECOMPILE option on your proc, the execution plan will get cached after the first execution.
Here is a trivial example of how you can get the wrong query plan cached:
create proc spTest
    @id int
as
select * from sysobjects where @id is null or id = @id
go
exec spTest null
-- As expected, it's a clustered index scan
go
exec spTest 1
-- Oh no, it's still a clustered index scan: the plan cached for the null call is reused
Try running your SQL in Query Analyzer on the production server, outside of the stored proc, to determine whether you have an issue with out-of-date statistics or mysterious indexes missing from production.
Tying in to the first answer, the problem may lie with SQL Server's Parameter Sniffing feature. It uses the first value that caused compilation to help create the execution plan. Usually this is good but if the value is not normal (or somehow strange), it can contribute to a bad plan. This would also explain the difference between production and testing.
Turning off parameter sniffing would require modifying the SProc which I understand is undesirable. However, after using sp_recompile, pass in parameters that you'd consider "normal" and it should recompile based off of these new parameters.
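In practice that looks something like this (proc and parameter names are placeholders):

EXEC sp_recompile N'dbo.mySproc';          -- marks the proc for recompilation

EXEC dbo.mySproc @reportDate = '20110930'; -- the next call compiles the new plan, so make it a "normal" one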
I think the parameter sniffing behavior is different between 2005 and 2008 so this may not work.
The solution was to recompute the statistics. I had overlooked that, as usually we have scheduled tasks to do all of this, but for some reason the admins didn't set one up on this server. Doh.
To summarize all the posts:
Check the setup is the same
Indexes
Table sizes
Restore Database
Execution Plan Caching
If the query runs the same outside the SProc, it's not the Execution Plan
sp_recompile if it is different
Parameter sniffing
Recompute Statistics