Why/how do my SQL queries get faster with use?

I ran a query on about 1,300,000 records in one table. It takes certain records that meet some WHERE conditions and inserts them into another table. It first clears out the target table entirely.
The time taken to complete the query gets drastically better with each Execute:
1st: 5 minutes, 3 seconds
2nd: 2 minutes, 43 seconds
3rd: 12 seconds
4th: 3 seconds
I'm not doing anything other than just hitting Execute. My query looks like this (somewhat abbreviated for length purposes):
DELETE FROM dbo.ConsolidatedLogs --clear target table

DECLARE @ClientID int

DECLARE c CURSOR FOR SELECT ClientID FROM dbo.Clients
OPEN c
FETCH NEXT FROM c INTO @ClientID --for each ClientID
WHILE @@FETCH_STATUS = 0
BEGIN
    INSERT INTO dbo.ConsolidatedLogs
        (col1, col2)
    SELECT col1, col2
    FROM dbo.CompleteLogsRaw
    WHERE col3 = 1 AND
        ClientID = @ClientID

    FETCH NEXT FROM c INTO @ClientID
END

CLOSE c
DEALLOCATE c
How/why does this happen? What is SQL Server doing exactly to make this possible?
This query is going to be run as an SQL Server Agent job, once every 3 hours. Will it take the full 5 minutes every time, or will it be shorter because the job is only running this one query, even though it's got a 3 hour delay?

If identical queries get faster with each run, there is an outstanding chance that things are being cached. So, where can things be cached?
SQL Server Query Plan Cache
SQL Server's Data Cache (buffer pool)
Operating System IO Buffers
Hard Disk Controller Cache
Hard Disk On-Disk Cache
You can clear the SQL Server query cache between runs to see what the impact of that cache is.
How can I clear the SQL Server query cache?
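A minimal sketch of clearing those caches between runs (dev/test only, never on a production server):

DBCC FREEPROCCACHE        -- empty the plan cache
CHECKPOINT                -- flush dirty pages so they become "clean"
DBCC DROPCLEANBUFFERS     -- empty the buffer pool (cached data pages)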
SQL Server will use whatever RAM is dedicated to it to keep things that it accesses frequently in RAM rather than on disk. The more you run the same query, the more the data that would normally be found on disk is likely to reside in RAM instead.
The OS-level and hardware-level caches are easiest to reset with a reboot, if you want to see whether they are contributing to the improving results.
If you publish the query plan that SQL Server uses for each execution of your query, a more detailed diagnosis would be possible.

When SQL Server executes a query, it creates an "execution plan" (or "query plan") on the first execution and caches that plan. A query plan, in a nutshell, describes how SQL Server will attack the tables, fields, indexes, and data necessary to satisfy the result. Each time you re-run the query, that plan is fetched from the plan cache, and the "heavy lifting" that the query optimizer would ordinarily have to do is skipped. That allows the query to run more rapidly on the second and subsequent executions.
Mind you, that's an oversimplification of a much more detailed (and, thus, inherently cooler) process, but that's Query Plan 101 :)
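If you want to watch the reuse happen, here is a hedged probe against the plan cache DMVs (assumes VIEW SERVER STATE permission; the LIKE filter is just the table name from the question). The usecounts column climbs each time the cached plan is reused:

SELECT cp.usecounts,      -- number of times this cached plan has been reused
       cp.objtype,        -- Adhoc, Prepared, Proc, ...
       st.text
FROM sys.dm_exec_cached_plans cp
CROSS APPLY sys.dm_exec_sql_text(cp.plan_handle) st
WHERE st.text LIKE '%ConsolidatedLogs%'          -- the query from the question
  AND st.text NOT LIKE '%dm_exec_cached_plans%'  -- exclude this probe itself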

Related

Stored procedure execution plan WITH RECOMPILE

I have a stored procedure that runs every 5 minutes in a job on SQL Server. About 80% of the time the job runs with no results, which is expected, but when it does have data to process it is a very long process.
The code looks like this, simplified:
IF EXISTS (SELECT TOP 1 col1 FROM tbl1 WHERE processed = '0')
BEGIN
    -- HUGE PROCESS with multiple SELECTs, JOINs and UPDATEs
END
How will the execution plan evaluate this SP? Is this a rare case where using WITH RECOMPILE is the best option?
If you use the Include Actual Execution Plan option in SQL Server Management Studio, you will see that when the IF expression evaluates to false, the operators in its body are not included in the execution plan.
So there is no need to worry: the SQL engine will use the correct execution plan and will not touch the data.
The RECOMPILE option can be helpful in particular queries, but I believe you can skip it for now.
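If it ever does become necessary, a minimal sketch of the statement-level alternative, using the question's tbl1 (the UPDATE body is a hypothetical stand-in for the huge process):

IF EXISTS (SELECT 1 FROM tbl1 WHERE processed = '0')
BEGIN
    -- OPTION (RECOMPILE) gives only this statement a fresh plan
    -- based on the data actually present at each execution
    UPDATE tbl1
    SET processed = '1'
    WHERE processed = '0'
    OPTION (RECOMPILE)
END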

Query Plan Recompiled suddenly and degrades performance

Scenario: We have a simple select query
DECLARE @P ...
SELECT TOP(1) USERID
FROM table
WHERE non_clusteredindex_column = @P ORDER BY PK_column DESC
It had usually executed in about 0.12 seconds for the past year. But yesterday, suddenly, just after midnight, it started consuming all my CPU and taking 150 seconds to execute. I checked sp_who2 and found no deadlocks and nothing except this one query consuming all the CPU. I decided to reboot the server to get rid of any parameter sniffing issue or to kill any stale connections. Before restarting the server I took a SQL Profiler trace for 1 minute, for future root cause analysis. After the reboot, everything was back to normal. I was surprised, and curiously started comparing the execution plan in the trace I took with the current execution plan of the SAME query. I found the two are different.
The execution plan before the problematic night is the same as the execution plan after the reboot (doing perfect index seeks).
But the execution plan from the problematic night's SQL Profiler trace is doing a full index scan, which takes all the CPU and 150 seconds to execute.
Question:
I can say the execution plan was suddenly recompiled, or the query started using a new execution plan (full scan), after yesterday midnight; after I rebooted, it went back to using the old, good execution plan (index SEEK).
Q1. What made SQL Server use a new execution plan all of a sudden?
Q2. What made SQL Server use the old, good execution plan after the reboot?
Q3. Is anything related to parameter sniffing, since I am passing a parameter? Technically it shouldn't be, as the parameter column is well organized with evenly distributed data.
It sounds like you are having a parameter sniffing issue. I can't see your data, but we often found these crop up even in simple query scenarios: either many rows match the parameter value and the plan flipped to a scan when it shouldn't have, or there was some other problem with the data, such as a column whose values are mostly unique but which, under some scenario, ended up with 0 in a large portion of the table, throwing everything for a loop. If the query from code is running slow but a test execution of the procedure from SSMS is fast, that is a pretty big red flag that something along these lines is your issue.
You are correct that a SQL Server restart flushes the entire plan cache, and you can also manually flush all plans out, but you absolutely do not want to use that method to fix the plan of a single procedure. A quick fix is to execute EXEC sp_recompile 'dbo.procname'; to flush just that one procedure's execution plan and force a new one. Redoing all your plans, especially in a busy database, can cause significant performance problems for other procs, and a restart of course means downtime. This only temporarily fixes the problem when it crops up, though; if you have identified a parameter causing issues, I would consider adding an OPTIMIZE FOR UNKNOWN hint, which is specifically designed for identified parameter sniffing issues. Also make sure some good index maintenance is going on regularly in your environment, in case bad statistics, not the SQL engine, are causing the bad plans.
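A sketch of both of those fixes; the procedure name is a placeholder, the query shape is borrowed from the question, and @P's type is assumed:

-- One-off: flush just this procedure's cached plan
EXEC sp_recompile 'dbo.YourProcName';

-- Longer-term: compile against average density instead of the sniffed value
DECLARE @P int;   -- type assumed for illustration
SELECT TOP (1) USERID
FROM [table]
WHERE non_clusteredindex_column = @P
ORDER BY PK_column DESC
OPTION (OPTIMIZE FOR (@P UNKNOWN));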
In your case, you can do the following:
-- Activate the Query Store option in your database settings: set Operation Mode to On.
-- This will start capturing the query plan for each request.
-- You can then track the queries that consume a lot of resources.
-- Finally, you can force an execution plan to be used for this query (sketched below).
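A hedged sketch of those Query Store steps (requires SQL Server 2016 or later; the forced IDs must come from the catalog views first, not from thin air):

-- Turn Query Store on for the current database
ALTER DATABASE CURRENT SET QUERY_STORE = ON;
ALTER DATABASE CURRENT SET QUERY_STORE (OPERATION_MODE = READ_WRITE);

-- Find the regressed query and its captured plans
SELECT qsq.query_id, qsp.plan_id, qsrs.avg_duration
FROM sys.query_store_query qsq
JOIN sys.query_store_plan qsp ON qsp.query_id = qsq.query_id
JOIN sys.query_store_runtime_stats qsrs ON qsrs.plan_id = qsp.plan_id;

-- Force the known-good plan (substitute the IDs found above)
EXEC sp_query_store_force_plan @query_id = 42, @plan_id = 43;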

How can a stored proc have multiple execution plans?

I am working with MS SQL Server 2008 R2. I have a stored procedure named rpt_getWeeklyScheduleData. This is the query I used to look up its execution plan in a specific database:
select *
from sys.dm_exec_cached_plans cp
CROSS APPLY sys.dm_exec_sql_text(cp.plan_handle) st
where OBJECT_NAME(st.objectid, st.dbid) = 'rpt_getWeeklyScheduleData' and
    st.dbid = DB_ID()
The above query returns me 9 rows. I was expecting 1 row.
This stored procedure has been modified multiple times, so I believe SQL Server has been building a new execution plan for it whenever it was modified and run. Is that a correct explanation? If not, how can you explain this?
Also, is it possible to see when each plan was created? If yes, then how?
UPDATE:
This is the stored proc's signature:
CREATE procedure [dbo].[rpt_getWeeklyScheduleData]
(
    @a_paaipk int,
    @a_location_code int,
    @a_department_code int,
    @a_week_start_date varchar(12),
    @a_week_end_date varchar(12),
    @a_language_code int,
    @a_flag int
)
as
begin
    ...
end
The stored proc is long; it has only 2 if conditions, both on the @a_flag parameter.
if @a_flag = 0
begin
    ...
end
if @a_flag = 1
begin
    ...
end
Depending on the nature of the stored procedure (which wasn't provided), this is very possible for any number of reasons (most likely not limited to those below):
Does the proc use a lot of "if this, then this SELECT, else this other SELECT/UPDATE" branching?
Does the proc contain dynamic SQL?
Are you executing the SP from both web and SSMS? Then you're likely executing the SP with different connection settings (the DMV sketch after this list shows how to spot that).
Does the stored proc have parameters? Sometimes a difference in parameters can cause one execution plan to be terrible for a specific set, so a different plan is used.
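One way to see which cache-key attributes differ across the nine cached plans, run in the proc's database (a sketch against the plan-attribute DMVs):

SELECT cp.usecounts,
       pa.attribute,     -- e.g. set_options: different values force separate plans
       pa.value
FROM sys.dm_exec_cached_plans cp
CROSS APPLY sys.dm_exec_sql_text(cp.plan_handle) st
CROSS APPLY sys.dm_exec_plan_attributes(cp.plan_handle) pa
WHERE OBJECT_NAME(st.objectid, st.dbid) = 'rpt_getWeeklyScheduleData'
  AND st.dbid = DB_ID()
  AND pa.is_cache_key = 1   -- only attributes that distinguish cached plans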
Going to try an analogy which might help... maybe...
Say you have a stored procedure for your weekend shopping.
You typically need to get groceries, sometimes an air filter, and even less often a big pack of something that needs replacing 4 times a year.
The grocery store can handle groceries, and is the closest to your house (5 minutes).
Target can handle the air filter and groceries, but adds 25 minutes of travel time.
"Big place of everything" has everything you'd possibly need, but is an hour's drive away.
So here, depending on your parameters @needsAirFilter and @needsBigPackOfSomething, the "execution plan" of your "shopping" stored procedure could change vastly.
If @needsAirFilter and @needsBigPackOfSomething are false, there's no reason to make the 30-minute or hour-long drive, as everything you need is at the grocery store.
Once a month, @needsAirFilter is true; in that case we need to go to Target, as the grocery store's execution plan is insufficient.
Four times a year @needsBigPackOfSomething is true, and we need to make the hour-long drive to get the big pack of something, while grabbing groceries and the air filter since we're there.
Sure, we could make the hour-long drive every time to get groceries, and the other things when needed (imagine a single execution plan). But that is in no way the most efficient way to do it. In instances like this, we get different execution plans for what information/goods are actually needed.
No idea if that helps... but I had fun :D
Typically SQL Server will generate a new query plan depending on the values of the parameters being passed in (this can determine which indexes, if any, it will use). If indexes are added, changed, or updated on the tables/views used in the proc, SQL Server may decide that it is more effective to use one or more indexes that it previously ignored. More involved SQL in the proc also kicks off more work on SQL Server's side as it attempts to optimize the query. If the data changes (suddenly you have many more customers in NJ, and there is a query and index for states), it may decide to use that index, and the query plan changes. And any schema change to the tables or views involved in the query will invalidate an existing plan and result in a new plan being generated. As for seeing when each plan was created: yes, that is possible, as sketched below.
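The plan cache records a compile timestamp per cached statement, readable with a sketch like this:

SELECT qs.creation_time,         -- when the plan was compiled and cached
       qs.last_execution_time,
       qs.execution_count
FROM sys.dm_exec_query_stats qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) st
WHERE OBJECT_NAME(st.objectid, st.dbid) = 'rpt_getWeeklyScheduleData'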

SP taking 15 minutes, but the same query when executed returns results in 1-2 minutes

So basically I have this relatively long stored procedure. The basic execution flow is that it SELECTs data INTO some temp tables declared with the # sign, then runs a cursor through these tables to generate a 'running total' into a third temp table, which is created using CREATE TABLE. This resulting temp table is then joined with other tables in the DB to generate the result after some grouping, etc. The problem is, this SP had been running fine until now, returning results in 1-2 minutes. Now, suddenly, it's taking 12-15 minutes. If I extract the query from the SP and execute it in Management Studio, manually setting the same parameters, it returns results in 1-2 minutes, but the SP takes very long. Any idea what could be happening? I tried to generate the actual execution plans of both the query and the SP, but they couldn't be generated because of the cursor. Any idea why the SP takes so long while the query doesn't?
This is the footprint of parameter sniffing. See here for another discussion about it: SQL poor stored procedure execution plan performance - parameter sniffing
There are several possible fixes, including adding WITH RECOMPILE to your stored procedure which works about half the time.
The recommended fix for most situations (though it depends on the structure of your query and sproc) is to NOT use your parameters directly in your queries, but rather store them into local variables and then use those variables in your queries.
It's due to parameter sniffing. First of all, declare a local variable, set the incoming parameter's value into that variable, and use the local variable throughout the procedure. Here is an example:
ALTER PROCEDURE [dbo].[Sp_GetAllCustomerRecords]
    @customerId INT
AS
BEGIN
    DECLARE @customerIdTemp INT
    SET @customerIdTemp = @customerId

    SELECT *
    FROM Customers e
    WHERE CustomerId = @customerIdTemp
END
Try this approach.
Try recompiling the sproc to ditch any stored query plan:
exec sp_recompile 'YourSproc'
Then run your sproc, taking care to use sensible parameters.
Also compare the actual execution plans between the two methods of executing the query.
It might also be worth recomputing any statistics, as sketched below.
I'd also look into parameter sniffing. Could be the proc needs to handle the parameters slightly differently.
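A sketch of the statistics refresh mentioned above; the table name is a placeholder:

-- One table, full scan for maximum accuracy
UPDATE STATISTICS dbo.YourTable WITH FULLSCAN;

-- Or refresh statistics on every table in the current database
EXEC sp_updatestats;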
I usually start troubleshooting issues like that by using
"print getdate() + ' - step '" statements (note that GETDATE() must be converted to a string before concatenating; see the sketch below). This helps me narrow down what's taking the most time. You can compare against a run from Query Analyzer and narrow down where the problem is.
I would guess it could possibly be down to caching. If you run the stored procedure twice, is it faster the second time?
To investigate further, you could run both the stored procedure and the query version from Management Studio with the show query plan option turned on, then compare which area takes longer in the stored procedure than when run as a query.
Alternatively, you could post the stored procedure here for people to suggest optimizations.
For a start, it doesn't sound like the SQL is going to perform too well anyway, based on the use of a number of temp tables (which could be held in memory or persisted to tempdb, whatever SQL Server decides is best) and the use of cursors.
My suggestion would be to see if you can rewrite the sproc as a set-based query instead of a cursor approach, which will give better performance and be a lot easier to tune and optimise. Obviously I don't know exactly what your sproc does, so I can't say how easy/viable this is for you.
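As an illustration only (not the asker's actual sproc), the cursor loop from the first question above collapses to a single set-based INSERT ... SELECT, assuming ClientID is unique in dbo.Clients and the per-client work really is just that filtered insert:

INSERT INTO dbo.ConsolidatedLogs (col1, col2)
SELECT r.col1, r.col2
FROM dbo.CompleteLogsRaw r
JOIN dbo.Clients c ON c.ClientID = r.ClientID   -- replaces the FETCH loop
WHERE r.col3 = 1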
As to why the SP is taking longer than the query: difficult to say. Is there the same load on the system when you try each approach? If you run the query itself under a light load, it will do better than the SP run under a heavy load.
Also, to ensure the query truly is quicker than the SP, you need to rule out data/execution plan caching, which makes a query faster on subsequent runs. You can clear the caches out using:
DBCC FREEPROCCACHE
DBCC DROPCLEANBUFFERS
But only do this on a dev/test db server, not on production.
Then run the query and record the stats (e.g. from Profiler, or with SET STATISTICS TIME/IO as sketched below). Clear the cache again. Run the SP and compare the stats.
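If Profiler isn't handy, a sketch of capturing comparable stats in Management Studio (the procedure call is a hypothetical placeholder):

SET STATISTICS TIME ON;   -- CPU and elapsed time per statement
SET STATISTICS IO ON;     -- logical/physical reads per table

EXEC dbo.YourSproc @param1 = 42;  -- hypothetical call

SET STATISTICS TIME OFF;
SET STATISTICS IO OFF;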
1) When you run the query for the first time it may take more time. One more point: if you are using a correlated subquery and you hardcode the values, it is executed only once. When you do not hardcode the value but derive it from the input parameter and run it through the procedure, it might take more time.
2) In rare cases it can be due to network traffic, where we will not see consistent query execution times even for the same input data.
I too faced a problem where we had to create some temp tables, manipulate them to calculate some values based on rules, and finally insert the calculated values into a third table. Doing all of this in a single SP took around 20-25 minutes. To optimize it, we broke the SP into 3 different SPs, and the total time came down to around 6-8 minutes. Just identify the steps involved in the whole process and how to break them up into different SPs. With this approach, the overall time taken by the entire process will surely drop.
This is because of parameter sniffing. But how can you confirm it?
Whenever we set out to optimize an SP, we look at the execution plan. But in your case you will see an optimized plan from SSMS, because it only takes more time when it is called through code.
For every SP and function, SQL Server can cache two separate plans because of the ARITHABORT option: one for SSMS and a second for external entities (ADO.NET).
ARITHABORT is ON by default in SSMS but OFF by default on ADO.NET connections. So if you want to see the exact query plan your SP uses when it is called from code, disable the option in SSMS and execute your SP; you will see that the SP also takes 12-13 minutes from SSMS:
SET ARITHABORT OFF
EXEC YourSpName
SET ARITHABORT ON
To solve this problem you just need to invalidate the bad cached plan.
There are a couple of ways to do that:
1. Update the table statistics.
2. Recompile the SP.
3. Set ARITHABORT ON on the application's connection so it shares the plan compiled for SSMS (this option is not recommended).
For more options please refer to this awesome article: http://www.sommarskog.se/query-plan-mysteries.html
I would suggest the issue is related to the type of temp table (the # prefix). That temp table holds the data for the current database session. When you run it through your app, the temp table is deleted and recreated.
You might find that when running in SSMS, it keeps the session data and updates the table instead of recreating it.
Hope that helps :)

Strange SQL Server report performance problem related to update statistics

I have a complex report using Reporting Services. The report connects to a SQL 2005 database and calls a number of stored procedures and functions. It worked OK initially, but after a few months (as the data grew) it ran into timeout errors.
I created a few indexes to improve the performance, but the strange thing is that it works right after an index is created and then throws the same error the next day. I then tried updating statistics on the database; it works again (the running time of the query improves 10 times), but once more it stops working the next day.
Now the temporary solution is that I run the statistics update every hour. But I can't find a reasonable explanation for this behaviour. The database is not very busy; there isn't a lot of data being updated in one day. How can updating statistics make so much difference?
I suspect you have parameter sniffing. Updating statistics merely forces all query plans to be discarded, so it appears to work for a time. The workaround is to mask the parameter:
CREATE PROC dbo.MyReport
    @SignatureParam varchar(10),
    ...
AS
    ...
    DECLARE @MaskedParam varchar(10), ...
    SELECT @MaskedParam = @SignatureParam, ...
    SELECT ... WHERE column = @MaskedParam AND ...
    ...
GO
I've seen this problem when the indexes on the underlying tables need to be adjusted or the SQL needs work.
Rebuilding the index and updating the statistics read the table into the cache, which improves performance. The next day the table has been flushed out of the cache and the performance problems return.
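A sketch of that maintenance; the table name is a placeholder:

ALTER INDEX ALL ON dbo.ReportTable REBUILD       -- rebuilds the indexes (and reads the table into cache)
UPDATE STATISTICS dbo.ReportTable WITH FULLSCAN  -- refreshes statistics with a full scan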
SQL Profiler is very useful in these situations to identify what changes from run to run.