How to reliably compare query run times? - sql

I am trying to compare the run times of different queries on the same database, but a problem I am facing is that results are cached, and the cache is stored in different places, so the run times do not reflect the efficiency of the queries.
How can I avoid this problem so that I can compare these queries?

There are two caches that you have control over. The query result cache can be disabled with the following:
ALTER SESSION SET USE_CACHED_RESULT = FALSE;
The other cache that you can control is the data cache - sometimes also called the warehouse cache. This can be cleared by suspending and resuming the warehouse:
ALTER WAREHOUSE MY_WAREHOUSE SUSPEND;
ALTER WAREHOUSE MY_WAREHOUSE RESUME;
Finally there is the metadata cache, however, this is held in the global services layer and to the best of my knowledge there is no way to clear this or to instruct Snowflake not to use it.
For comparing queries that are supposed to do the same thing, it is often a good idea to check the query execution plan: databases in general are quite smart, and you might find that the queries compile to exactly the same plan.
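Putting the above together, a minimal sketch of a fair timing run (the warehouse name is a placeholder): disable the result cache, bounce the warehouse to clear the data cache, run the candidate query, then read its elapsed time from the QUERY_HISTORY table function:

```sql
-- Disable the query result cache for this session
ALTER SESSION SET USE_CACHED_RESULT = FALSE;

-- Clear the warehouse (data) cache; MY_WAREHOUSE is a placeholder
ALTER WAREHOUSE MY_WAREHOUSE SUSPEND;
ALTER WAREHOUSE MY_WAREHOUSE RESUME;

-- Run the candidate query here, then read its wall-clock time:
SELECT query_text, total_elapsed_time
FROM TABLE(INFORMATION_SCHEMA.QUERY_HISTORY())
ORDER BY start_time DESC
LIMIT 5;
```

Repeating this sequence before each candidate query gives every query the same cold-cache starting point.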

Related

Why does exec sp_recompile sometimes not help parameter sniffing?

We have a complex stored procedure that is sometimes subject to parameter sniffing. It is a large, "all-in-one" procedure that is called by many different parts of the system and so it stands to reason that one query plan would not fit all use cases.
This works fine except periodically ONE particular report goes from seconds to minutes. In the past, a quick exec sp_recompile would speed it back up immediately. Now that never works. The report just eventually "fixes itself" in a day or two, meaning it goes back to taking seconds.
Refactoring the stored procedure is currently not an option and I don't want to do the other recommended approaches (saving parameters to local variables, WITH RECOMPILE, OPTIMIZE FOR UNKNOWN) as those are said to have other side effects.
So I have these questions:
Why wouldn't exec sp_recompile speed it up like before?
How can I tell if exec sp_recompile actually cleared the query plan cache? What should be run before, and after, the exec? I've tried some queries from the web but can't clearly tell if something changed, so a specific recipe would be great to have.
Would it be reasonable to clone the procedure with a different name, and call that clone just for this one report? The goal would be to get SQL Server to cache a separate plan just for the report. But I'm not sure if SQL Server caches plans by procedure name, or if it caches the various queries inside the stored procedures. (If it's the latter, there's no use to this approach, as any clones of the procedure would contain the same queries.)
Using several CTEs, especially in complex queries (much as when joining against views), can cause the query optimiser problems in producing an optimal execution plan.
If you use a lot of CTE definitions, SQL Server will attempt to construct a single monolithic execution plan, and you could hit a plan-compilation timeout, resulting in a sub-optimal plan being used.
You could instead replace the CTEs with temp tables. Using intermediate results often performs better, because each query executes in isolation with its own dedicated, optimal (or at least better) plan. This can help the optimizer make a better choice for joins and index usage.
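As a sketch (table and column names are made up), replacing a CTE with a temp table looks like this; the temp-table version materialises the intermediate result, which gets its own statistics and can be indexed:

```sql
-- CTE version: one monolithic plan for the whole statement
WITH recent_orders AS (
    SELECT customer_id, SUM(amount) AS total
    FROM dbo.Orders
    WHERE order_date >= '20230101'
    GROUP BY customer_id
)
SELECT c.name, r.total
FROM dbo.Customers c
JOIN recent_orders r ON r.customer_id = c.customer_id;

-- Temp-table version: each statement is optimised in isolation
SELECT customer_id, SUM(amount) AS total
INTO #recent_orders
FROM dbo.Orders
WHERE order_date >= '20230101'
GROUP BY customer_id;

CREATE CLUSTERED INDEX ix_recent ON #recent_orders (customer_id);

SELECT c.name, r.total
FROM dbo.Customers c
JOIN #recent_orders r ON r.customer_id = c.customer_id;
```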
If you can benefit from two key different types of parameters that ideally require their own optimal plan then an option would be, as you suggest, to duplicate the procedure specific to each use-case.
You can confirm that this results in a separate execution plan by querying for your procedure name using dm_exec_sql_text:
select s.plan_handle, t.text
from sys.dm_exec_query_stats s
cross apply sys.dm_exec_sql_text(s.plan_handle) t
where t.text like '%proc name%'
You will note you have a different plan_handle for each procedure.

Execution plan cost estimation

There is a problem in which I have to update a table with millions of records based on certain conditions. I have written a long shell script and SQL statements to do it. To check the performance, I plan on using EXPLAIN PLAN, studying it from http://docs.oracle.com/cd/B10500_01/server.920/a96533/ex_plan.htm#19259
Here it is written that "Execution plans can differ due to the following:"
Different Costs->
Data volume and statistics
Bind variable types
Initialization parameters - set globally or at session level
Here I don't understand how initialization parameters, set globally or at session level, affect the execution plan.
Can anybody explain this?
Also, is there any other way I can check the SQL statements for performance, other than EXPLAIN PLAN or autotrace?
There are several (initialization) parameters that can influence the execution plan for your statement. The one that immediately comes to mind is OPTIMIZER_MODE.
Other, less obvious, session settings are things like the NLS settings, which might influence the usability of indexes.
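For example (hypothetical table and index names), the NLS effect is easy to reproduce: with linguistic comparison semantics, a plain b-tree index may no longer be usable for an equality predicate:

```sql
CREATE INDEX emp_name_ix ON employees (last_name);

ALTER SESSION SET NLS_COMP = LINGUISTIC;
ALTER SESSION SET NLS_SORT = BINARY_CI;

-- This predicate can no longer use emp_name_ix directly; a function-based
-- index on NLSSORT(last_name, 'NLS_SORT = BINARY_CI') would be needed.
SELECT * FROM employees WHERE last_name = 'Smith';
```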
An alternative approach to get the real execution plan (apart from tracing the session and using tkprof) is to use the /*+ gather_plan_statistics */ hint together with 'dbms_xplan.display_cursor()'.
This is done by actually running the statement using the above hint first (so this does take longer than a "normal" explain):
select /*+ gather_plan_statistics */ *
from some_table
join ...
where ...
Then after that statement is finished, you can retrieve the used plan using dbms_xplan:
SELECT *
FROM table(dbms_xplan.display_cursor(format => 'ALLSTATS LAST'));
GLOBAL OR SESSION parameters
Oracle is set up with a set of initialisation parameters. Those are used by default if nothing is specified to override them. They can be overridden at session or system level using ALTER SESSION (affects just a single session) or ALTER SYSTEM (affects all sessions) commands, or by using optimiser hints in the code. These can have an effect on the plans you see.
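A sketch of the session-level versus system-level distinction, using OPTIMIZER_MODE as the example parameter:

```sql
-- Session level: affects only the current session
ALTER SESSION SET OPTIMIZER_MODE = FIRST_ROWS_10;  -- favour fast initial rows

-- System level: affects all sessions; SCOPE = MEMORY lasts until restart,
-- SCOPE = BOTH would also persist in the spfile across restarts
ALTER SYSTEM SET OPTIMIZER_MODE = ALL_ROWS SCOPE = MEMORY;
```

Either of these can change the plan the optimiser picks for the same statement.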
In relation to explain plan, a different Oracle database may have different initialisation parameters or have some session/system parameters set which could mean the SAME code behaves differently (by getting a different execution plan on one Oracle database compared to another Oracle database).
Also, as the execution plan is affected by the data chosen, it's possible that a query or package that runs fine in TEST never finishes in PRODUCTION where the volume of data is much larger. This is a common issue when code is not tested with accurate volumes of data (or at least with the table statistics imported from a production database if test cannot hold a full volume of production like data).
TRACING
The suggestions so far tell you how to trace an individual statement assuming you know which statement has a problem but you mention you have a shell script with several SQL statements.
If you are using a here document with a single call to SQL*Plus containing several SQL statements like this ...
#!/bin/ksh
sqlplus -S user/pass <<EOF
set heading off
BEGIN
-- DO YOUR FIRST SQL HERE
-- DO YOUR SECOND SQL HERE
END;
/
EOF
... you can create a single oracle trace file like this
#!/bin/ksh
sqlplus -S user/pass <<EOF
set heading off
ALTER SESSION SET EVENTS '10046 trace name context forever, level 8';
BEGIN
-- DO YOUR FIRST SQL HERE
-- DO YOUR SECOND SQL HERE
END;
/
EOF
Note that level 8 is for tracing with waits. You can also use level 4 (bind variables) or level 12 (binds and waits), but I've always found the problem with just level 8. Level 12 can also take a lot longer to execute in full-size environments.
Now run the shell script to do a single execution, then check where your trace file is created in SQL PLUS using
SQL> show parameter user_dump_dest
/app/oracle/admin/rms/udump
Go to that directory and if no other tracing has been enabled, there will be a .trc file that contains the trace of the entire run of the SQL in your script.
You need to convert this to readable format with the unix tkprof command like this
unix> tkprof your.trc ~/your.prf sys=no sort=fchela,exeela
Now change to your home directory and there will be a .prf file with the SQL statements listed in order of those that take the most execution time or fetch time to execute along with the explain plans. This set of parameters to tkprof allow you to focus on fixing the statements that take the longest and therefore have the biggest return for tuning.
Note that if your shell script runs several sqlplus commands, each one will create a separate session and therefore adding an ALTER SESSION statement to each one will create separate trace files.
Also, don't forget that it's easy to get lost in the detail. Sometimes tuning is about looking at the overall picture and doing the same thing another way, rather than starting with a single SQL statement that may gain a few seconds but doesn't reduce the overall run time.
If you can, try to minimise the number of UPDATE statements: if you have one big table, it is more efficient to do the updates in one pass of the table (with indexes supporting the updates) than to do lots of small updates.
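As a sketch with a hypothetical table, several conditional updates can often be collapsed into a single pass with CASE:

```sql
-- Instead of several passes over the table:
--   UPDATE big_table SET status = 'HIGH' WHERE amount > 1000;
--   UPDATE big_table SET status = 'MED'  WHERE amount BETWEEN 100 AND 1000;
-- do one pass over the qualifying rows:
UPDATE big_table
SET    status = CASE WHEN amount > 1000 THEN 'HIGH' ELSE 'MED' END
WHERE  amount >= 100;
```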
Maybe you can post the relevant parts of your script, the overall time it takes to run and the explain plan if you need more assistance.
Personally, I only trust the rowsource operations, because they give the exact plan as executed. There are a few parameters that affect how the plan is constructed. Most parameters are set at instance level but can be overridden at session level, which means every session can have its own set of effective parameters.
The problem is that you need to know the exact settings of the session that will run your script. There are a few ways to change session-level settings: in a logon trigger, in a stored procedure, or in the script itself.
If your script is not influenced by a logon trigger and does not call any code that issues alter session statements, you will be using the settings that your instance has.

SP taking 15 minutes, but the same query when executed returns results in 1-2 minutes

So basically I have this relatively long stored procedure. The basic execution flow is that it SELECTs data into temp tables declared with the # sign, then runs a cursor through these tables to generate a 'running total' into a third temp table, which is created using CREATE TABLE. This resulting temp table is then joined with other tables in the DB to generate the result after some grouping etc. The problem is, this SP had been running fine until now, returning results in 1-2 minutes. Now, suddenly, it's taking 12-15 minutes. If I extract the query from the SP and execute it in Management Studio, manually setting the same parameters, it returns results in 1-2 minutes, but the SP takes very long. Any idea what could be happening? I tried to generate the actual execution plans of both the query and the SP, but they couldn't be generated because of the cursor. Any idea why the SP takes so long while the query doesn't?
This is the footprint of parameter-sniffing. See here for another discussion about it; SQL poor stored procedure execution plan performance - parameter sniffing
There are several possible fixes, including adding WITH RECOMPILE to your stored procedure which works about half the time.
The recommended fix for most situations (though it depends on the structure of your query and sproc) is to NOT use your parameters directly in your queries, but rather store them into local variables and then use those variables in your queries.
It's due to parameter sniffing. First declare a local variable, assign the incoming parameter value to it, and use that local variable throughout the procedure. Here is an example:
ALTER PROCEDURE [dbo].[Sp_GetAllCustomerRecords]
    @customerId INT
AS
BEGIN
    DECLARE @customerIdTemp INT
    SET @customerIdTemp = @customerId

    SELECT *
    FROM Customers e
    WHERE e.CustomerId = @customerIdTemp
END
Try this approach.
Try recompiling the sproc to ditch any stored query plan
exec sp_recompile 'YourSproc'
Then run your sproc, taking care to use sensible parameters.
Also compare the actual execution plans between the two methods of executing the query.
It might also be worth recomputing any statistics.
I'd also look into parameter sniffing. It could be that the proc needs to handle the parameters slightly differently.
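A minimal sketch of the statistics step (object names are placeholders):

```sql
-- Refresh statistics for one table with a full scan
UPDATE STATISTICS dbo.YourTable WITH FULLSCAN;

-- Or refresh out-of-date statistics database-wide
EXEC sp_updatestats;
```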
I usually start troubleshooting issues like that with PRINT statements, e.g. PRINT convert(varchar(23), getdate(), 121) + ' - step 1'. This helps me narrow down what's taking the most time. You can compare against running it from Query Analyzer and narrow down where the problem is.
I would guess it could possibly be down to caching. If you run the stored procedure twice, is it faster the second time?
To investigate further, you could run both the stored procedure and the query version from Management Studio with the show-query-plan option turned on, then compare which area takes longer in the stored procedure than when run as a query.
Alternatively, you could post the stored procedure here for people to suggest optimizations.
For a start, it doesn't sound like the SQL is going to perform too well anyway, given the use of a number of temp tables (which could be held in memory or persisted to tempdb, whatever SQL Server decides is best) and the use of cursors.
My suggestion would be to see if you can rewrite the sproc as a set-based query instead of a cursor-based approach, which will give better performance and be a lot easier to tune and optimise. Obviously I don't know exactly what your sproc does, so I can't judge how easy or viable this is for you.
As to why the SP is taking longer than the query - difficult to say. Is there the same load on the system when you try each approach? If you run the query itself when there's a light load, it will be better than when you run the SP during a heavy load.
Also, to ensure the query truly is quicker than the SP, you need to rule out data/execution plan caching which makes a query faster for subsequent runs. You can clear the cache out using:
DBCC FREEPROCCACHE
DBCC DROPCLEANBUFFERS
But only do this on a dev/test db server, not on production.
Then run the query, record the stats (e.g. from profiler). Clear the cache again. Run the SP and compare stats.
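A repeatable version of that recipe (dev/test only; the sproc name and parameter are placeholders):

```sql
SET STATISTICS TIME ON;   -- report CPU and elapsed time per statement
SET STATISTICS IO ON;     -- report logical/physical reads per table

DBCC FREEPROCCACHE;       -- drop all cached plans
DBCC DROPCLEANBUFFERS;    -- drop cached data pages
-- run the ad-hoc query here and note the figures from the Messages tab

DBCC FREEPROCCACHE;
DBCC DROPCLEANBUFFERS;
EXEC dbo.YourSproc @param = 1;  -- then compare the figures

SET STATISTICS TIME OFF;
SET STATISTICS IO OFF;
```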
1) When you run the query for the first time it may take more time. Another point: if you are using a correlated subquery and hardcode the values, it is executed only once; when you don't hardcode them and run it through the procedure, deriving the value from the input parameter, it might take more time.
2) In rare cases it can be due to network traffic, in which case the query execution time will not be consistent for the same input data.
I too faced a problem where we had to create some temp tables, manipulate them to calculate some values based on rules, and finally insert the calculated values into a third table. All of this in a single SP took around 20-25 minutes. To optimize it, we broke the SP into three different SPs, and the total time came down to around 6-8 minutes. Identify the steps involved in the whole process and how to break them up into different SPs; this approach should reduce the overall time taken by the entire process.
This is because of parameter sniffing. But how can you confirm it?
Whenever we want to optimize an SP, we look at its execution plan. But in your case you will see an optimized plan from SSMS, because the SP only takes more time when called through code.
For every SP and function, SQL Server can cache two separate estimated plans because of the ARITHABORT option: one for SSMS and one for external clients (ADO.NET).
ARITHABORT is ON by default in SSMS but OFF by default for most client libraries. So if you want to see the exact query plan your SP uses when called from code, just turn the option off in SSMS and execute your SP; you will see that the SP also takes 12-13 minutes from SSMS:
SET ARITHABORT OFF
EXEC YourSpName
SET ARITHABORT ON
To solve this problem you just need to invalidate the cached plan so a new one is estimated.
There are a couple of ways to do that:
1. Update table statistics.
2. Recompile the SP.
3. SET ARITHABORT ON inside the SP so it always uses the plan created for SSMS (this option is not recommended).
For more options please refer to this awesome article -
http://www.sommarskog.se/query-plan-mysteries.html
I would suggest the issue is related to the type of temp table (the # prefix). This kind of temp table holds its data for the database session. When you run the procedure through your app, the temp table is dropped and recreated each time.
You might find that when running in SSMS, the session is kept open, so the table data is reused rather than recreated.
Hope that helps :)

Different Execution Plan for the same Stored Procedure

We have a query that is taking around 5 sec on our production system, but on our mirror system (as identical to production as possible) and dev systems it takes under 1 second.
We have checked the query plans and can see that they differ. From these plans we can also see why one takes longer than the other. The data, schema and servers are similar and the stored procedures identical.
We know how to fix it by rearranging the joins and adding hints. However, at the moment it would be easier if we didn't have to make any changes to the SProc (paperwork). We have also tried sp_recompile.
What could cause the difference between the two query plans?
System: SQL 2005 SP2 Enterprise on Win2k3 Enterprise
Update: Thanks for your responses, it turns out that it was statistics. See summary below.
Your statistics are most likely out of date. If your data is the same, recompute the statistics on both servers and recompile. You should then see identical query plans.
Also, double-check that your indexes are identical.
Most likely statistics.
Some thoughts:
Do you do maintenance on your non-prod systems (e.g. rebuild indexes, which also rebuilds statistics)?
If so, do you use the same fillfactor and statistics sample ratio?
Do you restore the database regularly onto test so it's 100% like production?
Is the data and data size between your mirror and production as close to the same as possible?
Do you know why one query is taking longer than the other? Can you post some more details?
Execution plans can be different in such cases because of the data in the tables and/or the statistics. Even in cases where auto update statistics is turned on, the statistics can get out of date (especially in very large tables)
You may find that the optimizer has estimated a table is not that large and opted for a table scan or something like that.
Provided there is no WITH RECOMPILE option on your proc, the execution plan will get cached after the first execution.
Here is a trivial example on how you can get the wrong query plan cached:
create proc spTest
    @id int
as
select * from sysobjects where @id is null or id = @id
go
exec spTest null
-- As expected, it's a clustered index scan
go
exec spTest 1
-- Oh no, it's still a clustered index scan
Try running your SQL in QA on the production server, outside of the stored proc, to determine whether your statistics are out of date or indexes are mysteriously missing from production.
Tying in to the first answer, the problem may lie with SQL Server's Parameter Sniffing feature. It uses the first value that caused compilation to help create the execution plan. Usually this is good but if the value is not normal (or somehow strange), it can contribute to a bad plan. This would also explain the difference between production and testing.
Turning off parameter sniffing would require modifying the SProc, which I understand is undesirable. However, after using sp_recompile, pass in parameters that you'd consider "normal", and it should recompile based on those new parameters.
I think the parameter sniffing behavior is different between 2005 and 2008 so this may not work.
The solution was to recalculate the statistics. I overlooked that, as usually we have scheduled tasks to do all of that, but for some reason the admins didn't put one on this server. Doh.
To summarize all the posts:
Check the setup is the same
Indexes
Table sizes
Restore Database
Execution Plan Caching
If the query runs the same outside the SProc, it's not the Execution Plan
sp_recompile if it is different
Parameter sniffing
Recompute Statistics

Stored Procedure Timing out.. Drop, then Create and it's up again?

I have a web-service that calls a stored procedure from a MS-SQL2005 DB. My Web-Service was timing out on a call to one of the stored procedures I have (this has been in production for a couple of months with no timeouts), so I tried running the query in Query Analyzer which also timed out. I decided to drop and recreate the stored procedure with no changes to the code and it started performing again..
Questions:
Would this typically be an error in the TSQL of my Stored Procedure?
-Or-
Has anyone seen this and found that it is caused by some problem with the compilation of the Stored Procedure?
Also, of course, any other insights on this are welcome as well.
Similar:
SQL poor stored procedure execution plan performance - parameter sniffing
Parameter Sniffing (or Spoofing) in SQL Server
Have you been updating your statistics on the database? This sounds like the original SP was using an out-of-date query plan. sp_recompile might have helped rather than dropping/recreating it.
There are a few things you can do to fix/diagnose this.
1) Update your statistics on a regular/daily basis. SQL Server generates query plans (think: optimises) based on your statistics. If they get "stale", your stored procedure might not perform as well as it used to, especially as your database changes and grows.
2) Look at your stored procedure. Are you using temp tables? Do those temp tables have indexes on them? Most of the time you can find the culprit by looking at the stored procedure (or the tables it uses).
3) Analyse your procedure while it is "hanging" and take a look at the query plan. Are there any missing indexes that would keep your procedure's query plan from going nuts? (Look for things like table scans and your other most expensive operations.)
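One way to look for those missing indexes on SQL Server 2005+ is the missing-index DMVs; treat the output as hints to investigate, not as indexes to create blindly:

```sql
SELECT d.statement AS table_name,
       d.equality_columns,
       d.inequality_columns,
       d.included_columns,
       s.user_seeks,
       s.avg_user_impact
FROM sys.dm_db_missing_index_details d
JOIN sys.dm_db_missing_index_groups g
  ON g.index_handle = d.index_handle
JOIN sys.dm_db_missing_index_group_stats s
  ON s.group_handle = g.index_group_handle
ORDER BY s.avg_user_impact * s.user_seeks DESC;
```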
It is like finding a name in a phone book: reading every name is quick if the book only has 20 or 30 names. Try doing that with a million names; it is not so fast.
This happened to me after moving a few stored procs from development into production. It didn't happen right away; it happened after the production data grew over a couple of months. We had been using functions to create columns, in some cases several function calls per row. When the data grew, so did the function call time. The original approach was fine in a testing environment but failed under a heavy load. Check whether there are any function calls in the proc.
I think the table the SP is trying to use is locked by some process. Use exec sp_who and exec sp_lock to find out what is going on with your tables.
If it was working quickly, is (with the passage of several months) no longer working quickly, and the code has not been changed, then it seems likely that the underlying data has changed.
My first guess would be data growth--so much data has been added over the past few months (or past few hours, ya never know) that the query is now bogging down.
Alternatively, as CodeByMoonlight implies, the data may have changed so much over time that the original query plan built for the procedure is no longer good (though this assumes the plan has not been cleared and recompiled at all over a long period).
Similarly, index/database statistics may also be out of date. Do you have Auto Update Statistics on or off for the database?
Again, this might only help if nothing but the data has changed over time.
Parameter sniffing.
Answered a day or 3 ago: "strange SQL server report performance problem related with update statistics"