I have the following issue: when a stored proc is called from my application, every now and then (about 1 time out of 1000 calls) it takes 10-30 seconds to finish. Typically, the sproc runs in under a second. It's a fairly simple proc with a single select that ties together a couple of tables. Every table in it carries a (NOLOCK) hint, so it probably isn't locking. The indexes are all in place too, otherwise it would be slow all the time.
The problem is that I can't replicate this issue in SSMS (it always runs subsecond there) no matter how many times I run the sproc, yet I see the problem when I point Profiler at the user who's running my app. The query plan in SSMS looks correct, yet the problem persists.
Where do I go from here? How do I debug this issue?
Some options:
What does Profiler or SET STATISTICS IO/TIME ON say? Is there simply resource starvation, say on CPU?
The engine decides statistics are out of date. Are the tables changing by around 10% of their row count (the usual rule of thumb)? To test:
SELECT
name AS stats_name,
STATS_DATE(object_id, stats_id) AS statistics_update_date
FROM
sys.stats
WHERE
object_id IN (OBJECT_ID('relevanttable1'), OBJECT_ID('relevanttable2'))
What else is happening on the server? For example, an index rebuild: not blocking, just resource intensive.
Usually I'd suggest parameter sniffing but you say the parameters are the same for every call. I'd also expect it to happen more often.
Autogrows on the database? Check for messages in the SQL error logs.
Page splits due to inserted records? Check table fragmentation with DBCC SHOWCONTIG (see the sketch after this list).
Antivirus scans? Don't.
Out of date statistics? Don't rely on auto-update statistics on tables that change a lot.
Don't rule out a problem on the client end, or the networking between them.
Run Profiler with a filter on duration, capturing only events with duration > 10 seconds, and look for patterns in parameters, clients, and time of day.
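For the fragmentation check above, a minimal sketch (the table name is just a placeholder; replace it with one of the tables the proc touches):

-- Classic fragmentation report for one table
DBCC SHOWCONTIG ('relevanttable1') WITH ALL_INDEXES;

-- On SQL Server 2005 and later, the DMV equivalent
SELECT OBJECT_NAME(ips.[object_id]) AS table_name,
       ips.index_id,
       ips.avg_fragmentation_in_percent,
       ips.page_count
FROM sys.dm_db_index_physical_stats(DB_ID(), OBJECT_ID('relevanttable1'), NULL, NULL, 'LIMITED') AS ips;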
I would set up a trace in SQL Server Profiler to see which SET options your application is using for its connection, and which settings are being used in SSMS. By SET options, I mean
ARITHABORT
ANSI_NULLS
CONCAT_NULL_YIELDS_NULL
etc.
Take a look at MSDN for a table of options.
I have seen the problem before where the set options used between SSMS and an application were different (in that particular case, it was ARITHABORT) and the performance difference was huge (in fact, the application would time out for certain queries, depending on the parameter values).
This would be where I would recommend starting an investigation. By setting up a trace, you'll be able to see which particular calls are taking longer and the parameters that are being used.
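As a starting point, here's a minimal sketch for comparing the session-level SET options of the application's connection against your SSMS session (this assumes the application's connection is still open; the column list is just a sample):

SELECT session_id,
       program_name,
       arithabort,
       ansi_nulls,
       concat_null_yields_null,
       quoted_identifier
FROM sys.dm_exec_sessions
WHERE is_user_process = 1;

-- To reproduce the application's behaviour in SSMS, flip whichever option differs, e.g.:
-- SET ARITHABORT OFF;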
On the runs that are slow is there anything different about the parameters passed to the proc?
Are you absolutely sure it's the database query, and not some other adjacent logic in your code? (i.e. have you put timestamped "trace" statements immediately before and after?)
Russ' suggestion makes the most sense to me so far as it sounds like you've looked into profiler to verify that the plan is optimized and so on.
I'd also watch for data-type coercion. I.e., I've seen similar problems when a varchar(60) parameter is being compared against an index with varchar(80) data. In some cases like that, SQL Server loses its mind and forces scans instead of seeks, though I believe you'd usually see that kind of thing show up in the execution plan.
Sadly, another potential culprit (and I'm a bit leery of throwing it out because it might be a red herring) is hyper-threading. I've seen it do VERY similar things in the past [1].
1 http://sqladvice.com/blogs/repeatableread/archive/2007/02/13/Burned-again-by-HyperThreading-on-SQL-Server-2000.aspx
Recompile the Stored Proc, then see what happens. This actually helps.
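A minimal sketch of the usual ways to force that (the proc name and parameter are placeholders):

-- Flag the proc so its cached plan is discarded and rebuilt on the next call
EXEC sp_recompile N'dbo.usp_MyProc';

-- Or recompile on every call (heavier, but takes plan reuse out of the picture)
ALTER PROCEDURE dbo.usp_MyProc
    @SomeParam int
WITH RECOMPILE
AS
BEGIN
    -- existing body of the proc goes here unchanged
    SELECT @SomeParam AS EchoedParam;
END;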
I also had a similar performance problem.
Adding WITH RECOMPILE to the SP helped.
This is not the solution I was looking for, but I haven't found anything better so far...
See:
Slow performance of SqlDataReader
Related
I have a complex query that joins on tables that have large amounts of data. The query times out after the application runs it a few times. The only ways I can get it working again are by restarting SQL Server or running:
DBCC DROPCLEANBUFFERS;
Can someone give me an idea of what things I should be looking into? I am trying to narrow down what needs to be done to fix this. Is there a way to completely disable caching for the query? It seems like the caching is what is making it time out eventually.
TheGameiswar's suggestion of updating statistics is a very good idea. Though, you may want to investigate further even if updating statistics alleviates the timeout issue.
It sounds like you are getting a query plan that is only good for some of the parameters being sent to it, which could be caused by parameter sniffing, particularly with heavily skewed data.
Have you tried adding OPTION (RECOMPILE) to the query, or WITH RECOMPILE if it is a procedure?
Have you checked the execution plan?
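A minimal sketch of the statement-level form, using placeholder table and column names:

DECLARE @FilterValue int;
SET @FilterValue = 42;

SELECT t1.SomeColumn, t2.OtherColumn
FROM dbo.LargeTable1 AS t1
JOIN dbo.LargeTable2 AS t2
    ON t2.ParentId = t1.Id
WHERE t1.FilterColumn = @FilterValue
OPTION (RECOMPILE);   -- the plan is rebuilt for the actual values on every execution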
Reference:
Parameter Sniffing, Embedding, and the RECOMPILE Options - Paul White
It seems your query might have out-of-date statistics. Try updating the statistics for all tables involved in the query; this gives SQL Server a good chance of getting the estimates right, which also helps keep memory grants reasonable.
If this happens even after updating statistics, try fine-tuning the query.
To update statistics, use the query below (shown WITH FULLSCAN; a full scan may not be needed in every case):
UPDATE STATISTICS tablename WITH FULLSCAN;
I have a job that runs daily and executes dozens of stored procedures.
Most of them run just fine, but several of them recently started taking a while to run (4-5 minutes).
When I come in in the morning and try to troubleshoot them, they only take 10-20 seconds, just as they're supposed to.
This has been happening for the last 10 days or so. No changes had been made to the server (we are running SQL 2012).
How do I even troubleshoot it and what can I do to fix this??
Thanks!!
You can use some DMVs (Dynamic Management Views) that SQL provides to investigate the plan cache. However, the output can be a little intimidating, and without some background it may be hard to dig through. I would recommend looking into some DMVs like sys.dm_exec_query_stats and sys.dm_exec_cached_plans. Kimberly Tripp from SQLSkills.com does some GREAT courses on Pluralsight on how to use these and get some awesome results by building more advanced queries off of those DMVs.
As well, these DMVs will return a plan_handle column which you can pass to another DMV, sys.dm_exec_query_plan(plan_handle), to return the Execution Plan for a specific statement. The hard part is going to be digging through the results of dm_exec_cached_plans to find the specific job/stored procs that are causing issues. sys.dm_exec_sql_text(qs.[sql_handle]) can help by providing a snapshot of the SQL that was run for that job but you'll get the most benefit out of it (in my opinion) by CROSS APPLYing it with some of the other DMVs I mentioned. If you can identify the Job/Proc/Statement and look at the plan, it will likely show you some indication of the parameter sniffing problem that Sean Lange mentioned.
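As a rough starting point, here's a sketch of how those DMVs can be tied together to surface the slowest cached statements and their plans (elapsed times in the DMV are in microseconds, hence the division):

SELECT TOP (20)
       qs.execution_count,
       qs.total_elapsed_time / 1000 AS total_elapsed_ms,
       qs.total_elapsed_time / qs.execution_count / 1000 AS avg_elapsed_ms,
       st.[text] AS statement_text,
       qp.query_plan
FROM sys.dm_exec_query_stats AS qs
CROSS APPLY sys.dm_exec_sql_text(qs.[sql_handle]) AS st
CROSS APPLY sys.dm_exec_query_plan(qs.plan_handle) AS qp
ORDER BY qs.total_elapsed_time DESC;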
Just in case: parameter sniffing is when, on the first run of a query/stored proc, SQL looks at the parameter that you passed in and builds a plan based on it. The plan that gets generated from that initial compilation of the query/proc will be ideal for the specific parameter that you passed in but might not be ideal for other parameters. Imagine a highly skewed table where all of the dates are '01-01-2000', except one which is '10-10-2015'.
Passing those two parameters in would generate vastly different plans due to data selectivity (read: how unique is the data?). If one of those plans gets saved to cache and called for each subsequent execution, it's possible (and in some cases, likely) that it's not going to be ideal for other parameters.
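To make that concrete, here's a small hypothetical demo of one cached plan being reused for a value it doesn't suit (the table and data are made up):

CREATE TABLE dbo.SkewDemo (EventDate date NOT NULL, Payload varchar(100));
CREATE INDEX IX_SkewDemo_EventDate ON dbo.SkewDemo (EventDate);

-- Thousands of rows share the common date...
INSERT INTO dbo.SkewDemo (EventDate, Payload)
SELECT '2000-01-01', 'filler' FROM sys.all_objects;

-- ...and exactly one row has the rare date
INSERT INTO dbo.SkewDemo (EventDate, Payload) VALUES ('2015-10-10', 'rare');

-- The first call compiles a plan for the rare value (a seek works well) and caches it...
EXEC sp_executesql N'SELECT Payload FROM dbo.SkewDemo WHERE EventDate = @d',
                   N'@d date', @d = '2015-10-10';

-- ...and that same plan is reused for the common value, where a scan would have fit better
EXEC sp_executesql N'SELECT Payload FROM dbo.SkewDemo WHERE EventDate = @d',
                   N'@d date', @d = '2000-01-01';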
The likely reason why you're seeing a difference in speed between the Job and when you run the command yourself is that when you run it, you're running it Ad Hoc. The Job isn't; it's running them as Stored Procs, which means they're going to use different execution plans.
TL;DR:
The Execution Plan that you have saved for the Job is not optimized. However, when you run it manually, you're likely creating an Ad Hoc plan that is optimized for that SPECIFIC run. It's a bit of a hard road to dig into the plan cache and see what's going on but it's 100% worth it. I highly recommend looking up Kimberly Tripp's blog as she has some great posts about this and also some fantastic courses on Pluralsight regarding this.
I have a strange inconsistency in query speed with LINQ to SQL functions. I have a function which I call from an MVC application. This is invariably extremely slow, in the order of 7 seconds. When I call the same function from SQL Server Management Studio, it is sometimes slow and sometimes fast (a fraction of a second). I'm not sure exactly when it becomes slow and when it becomes fast, but I have found one cycle (apart from the MVC application always being slow) that gives consistent results.
The query runs in the application. This is slow.
I try the query exactly as LINQ performs it. This is in the form sp_execute N'select [some] [select] [clauses] from functionname(@p0)', 'declare @p0 decimal(9,0)', @p0=123456789. This is also slow, on a first run, and on consecutive runs.
I try the query "unwrapped" in the form select [some] [select] [clauses] from functionname(123456789). This is still slow, also on consecutive runs.
I re-define the function with alter function [...].
running the original sp_execute query is still slow, also on consecutive runs.
running the unwrapped function is fast. Really fast.
running the original sp_execute query is now really fast too. Also with different @p0 parameters.
the query runs in the application. we're back to being slow.
I'm completely and utterly puzzled as to why this happens, and how I can remedy it. It feels like it has something to do with cached execution plans or something of the kind, but I don't know enough about that to know exactly what is going on - or how to remedy it. Does anyone know what is happening?
This sounds like Parameter Sniffing:
https://www.simple-talk.com/sql/t-sql-programming/parameter-sniffing/
http://blogs.technet.com/b/mdegre/archive/2012/03/19/what-is-parameter-sniffing.aspx
The articles describe it very well, but in a nutshell: the engine makes some assumptions about optimizing a query based on the parameters that are passed in on the first call, which can cause slower-than-optimal performance for other parameter values.
The issue can be explained by parameter sniffing and also by different execution settings between your MVC application and SQL Server Management Studio. The difference can be dramatic.
Additional information about execution settings can be found here.
Beyond this, there are some other tips in the article http://www.sommarskog.se/query-plan-mysteries.html
And, to be sure you're checking correctly, you need to clear the cached plans:
DBCC FREEPROCCACHE; DBCC DROPCLEANBUFFERS
The reason the query takes time the first time it is executed in SQL Server Management Studio, and less time subsequently, is that when the query is executed, the data gets stored in the SQL Server cache.
When the same query is executed again, the data is fetched from the cache.
To see the exact time taken by the query, you need to clear the cache before running it,
and that can be done with:
DBCC FREEPROCCACHE
DBCC DROPCLEANBUFFERS
My first post here, please be gentle. =)
I work for a company that inherited the maintenance of a bespoke system used by one of our customers. The previous developer (no longer with us) encrypted all the database objects (WITH ENCRYPTION).
The system has been plagued with various timeout issues well before we took ownership of it, and we want to get to the bottom of these.
The database is on SQL Express 2005 in production. We want to run the profiler on it but because the various objects are encrypted, most stored procedure calls etc.. show up as '-- Encrypted Text'.
Not very useful. I've written a little console app in C# to decrypt all the database objects, which works perfectly as far as I can tell.
It finds all encrypted objects in the database and for each one, decrypts it, removes the with encryption clause, drops the original and recreates it using the new 'without encryption' text.
There are some computed columns that get dropped before trying to decrypt the functions that are used in their definitions, then get recreated.
What I'm finding is that once everything is decrypted, I can't get into the system because the stored procedures etc.. take far too long to run on their first call. Execution plans are being compiled for the first time, so some delay is understandable, but we're talking 1 minute plus.. after 30 seconds the command timeout is hit, so the plans never get compiled.
I also have the same issue if I drop and recreate the database objects using their original scripts (keeping the WITH ENCRYPTION clause in).
So there's some consistency there. However, what absolutely mystifies me is that if I drop the execution plans from the original copy of the database (which was created from a backup of the production database), the same stored procedures are much faster. 10 seconds for first call. As far as I can tell, the stored procedures, functions etc.. are the same.
From my testing, I don't think it's a particular procedure or function that is causing the problem. It seems like the delay is cumulative: the more objects I drop and recreate, the slower things get.
I've taken a few random stabs in the dark, rebuilding indexes and updating stats - this has had no effect at all.
We could write something to execute all 540 functions, triggers, sprocs etc.. to pre-empt the first real call from a user; however, once SQL Server is restarted (and our client does restart their server from time to time) the execution plans will be dropped and we'd need to run the same tool again. To me that doesn't seem a viable option (neither does increasing the CommandTimeout property); I want to know why I'm seeing this behaviour.
I've been using sys.dm_exec_query_plan and sys.dm_exec_sql_text to look at the execution plans, and using DBCC DROPCLEANBUFFERS and DBCC FREEPROCCACHE as part of my testing.
I'm totally stumped, please help me before I jump out the office window.
Thanks in advance,
Andy.
--EDIT--
I don't quite know how I missed it, but the Activity Monitor is showing a session being blocked by a recompile of a table valued function. It takes far too long to compile and the blocked query hits the timeout.
I don't understand why, in the original version of the database (restored from a backup taken from the customer site), the compilation takes around 10 seconds, but after dropping and recreating these objects in the same database, the table valued function takes almost a minute to compile.
I've tried truncating the log, which didn't have any effect. I still need to have a look at the file sizes.
-- Another edit --
The TVF returns a temporary table, and has 12 outer joins in the query, all on either sys.server_principals or sys.database_role_members.
I seem to remember reading something about recompiles and temporary tables, which I'll have to check again..
You said yourself that (computed) columns were dropped. Is it possible that other stuff was manipulated in the tables? If so, you will probably want to reindex your tables (which will update the tables' statistics as well) using a command such as:
EXEC sp_MSforeachtable @command1 = 'DBCC DBREINDEX (''?'')'
...though it sounds like you've done something like this. Either way, I recommend doing it once you make such a big change to all of those objects.
Third recommendation:
While you are waiting for your procs to execute, run an sp_who2 on the database to make sure nothing is blocking your queries. It's quite possible that you might have some sort of long-lived transaction happening that you haven't accounted for.
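For example, while one of the slow calls is running (session ids and details will vary):

EXEC sp_who2;   -- check the BlkBy column for anything blocking your session

-- A more targeted view of blocked requests
SELECT session_id, blocking_session_id, wait_type, wait_time, command
FROM sys.dm_exec_requests
WHERE blocking_session_id <> 0;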
Fourth recommendation:
Make sure your server has enough memory. Make sure your transaction log files and datafiles aren't auto-growing after all of those big index and object updates. That can take FOREVER to happen, especially on under-spec'ed hardware like you may have running SQL Express.
Fifth recommendation:
Run a SQL Server Profiler trace against the database and look at what statements are starting specifically, and which are timing out. "Zoom in" on those and analyze them piece by piece and see what's up. This will likely just take a lot of good ol' hard work to fully understand.
In summary, the act of dropping and recreating procs itself shouldn't cause this slowdown if the statistics and indexes they were initially built against are sufficiently similar to what they are now. It's likely that you will find that there's Something Else happening which isn't necessarily directly related to changing the proc definitions themselves.
Another shot in the dark: Were the computed columns which you had to drop originally persisted (and not persisted after recreation) or vice versa?
If the functions called in the computation are complex or expensive, persisted columns are very advantageous and might be responsible for the behavior you are seeing.
Turns out that if I assign the parameter of the TVF to a local variable, then use the variable where the original parameter was used, normal service is resumed (the query takes less than a second, instead of a minute!)
Some kind of parameter sniffing shenanigans going on; I don't really understand why though - at the point I'm trying to call the function, no query plans exist, good or bad.
I'm in contact with Microsoft on this one (first time I've ever used my MSDN support entitlement) so hopefully we'll find out more and I'll post what I've discovered.
Thanks all for your help, we're getting there!
Today again, I have a MAJOR issue with what appears to be parameter sniffing in SQL Server 2005.
I have a query comparing some results with known good results. I added a column to the results and the known good results, so that each month, I can load a new months results in both sides and compare only the current month. The new column is first in the clustered index, so new months will add to the end.
I add a criteria to my WHERE clause - this is code-generated, so it's a literal constant:
WHERE DATA_DT_ID = 20081231 -- Which is redundant because all DATA_DT_ID are 20081231 right now.
Performance goes to pot. From 7 seconds to compare about 1.5m rows to 2 hours and nothing completing. Running the generated SQL right in SSMS - no SPs.
I've been using SQL Server for going on 12 years now and I have never had so many problems with parameter sniffing as I have had on this production server since October (build 9.00.3068.00). And in every case, it's not because it was run the first time with a different parameter or the table changed. This is a new table and it's only run with this parameter or no WHERE clause at all.
And, no, I don't have DBA access, and they haven't given me enough rights to see the execution plans.
It's to the point where I'm not sure I'm going to be able to hand this system off to SQL Server users with only a couple of years' experience.
UPDATE Turns out that although statistics claim to be up to date, running UPDATE STATISTICS WITH FULLSCAN clears up the problem.
FINAL UPDATE Even with recreating the SP, using WITH RECOMPILE and UPDATE STATISTICS, it turned out the query had to be rewritten in a different way to use a NOT IN instead of a LEFT JOIN with NULL check.
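For what it's worth, a sketch of the two shapes described in that final update, using made-up table and key names (only DATA_DT_ID comes from the original query):

-- Original shape: anti-join written as a LEFT JOIN with a NULL check
SELECT r.*
FROM dbo.Results AS r
LEFT JOIN dbo.KnownGood AS kg
       ON kg.DATA_DT_ID = r.DATA_DT_ID
      AND kg.RowKey     = r.RowKey
WHERE r.DATA_DT_ID = 20081231
  AND kg.RowKey IS NULL;

-- Rewritten shape that ended up performing acceptably here
-- (note: NOT IN behaves differently if the subquery can return NULLs)
SELECT r.*
FROM dbo.Results AS r
WHERE r.DATA_DT_ID = 20081231
  AND r.RowKey NOT IN (SELECT kg.RowKey
                       FROM dbo.KnownGood AS kg
                       WHERE kg.DATA_DT_ID = 20081231);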
Not quite an answer, but I'll share my experience.
Parameter sniffing took a few years of SQL Server work to come and bite me, when I went back to development DBA work after moving away to mostly prod DBA work. By then I understood more about the engine, how SQL works, what was best left to the client, etc., and I was a better SQL coder.
For example, dynamic SQL, cursors, or just plain bad SQL code probably won't ever suffer from parameter sniffing. But better set-based programming, avoiding dynamic SQL, and more elegant SQL more likely will.
I noticed it with complex search code (plenty of conditionals) and complex reports where parameter defaults affected the plan. When I see how less experienced developers write this kind of code, it doesn't suffer from parameter sniffing.
In any event, I prefer parameter masking to WITH RECOMPILE. Updating stats or indexes forces a recompile anyway, so why recompile all the time? I've answered one of your other questions with a link that mentions parameters are sniffed during compilation, so I don't have full faith in WITH RECOMPILE either.
Parameter masking is an overhead, yes, but it allows the optimiser to evaluate the query case by case rather than blanket recompiling, especially with the statement-level recompilation of SQL Server 2005.
OPTIMIZE FOR UNKNOWN in SQL Server 2008 also appears to do exactly the same thing as masking. My SQL Server MVP colleague and I spent some time investigating and came to this conclusion.
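A minimal sketch of the masking pattern, with made-up object names (SQL Server 2005 syntax, hence the separate DECLARE and SET):

CREATE PROCEDURE dbo.usp_GetOrdersByCustomer
    @CustomerId int
AS
BEGIN
    -- Copy the parameter into a local variable; the optimizer cannot sniff the
    -- variable's value, so it falls back to average-density estimates
    DECLARE @LocalCustomerId int;
    SET @LocalCustomerId = @CustomerId;

    SELECT o.OrderId, o.OrderDate
    FROM dbo.Orders AS o
    WHERE o.CustomerId = @LocalCustomerId;
END;

-- On SQL Server 2008+, a hint on the statement gives much the same effect:
--   WHERE o.CustomerId = @CustomerId
--   OPTION (OPTIMIZE FOR UNKNOWN);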
I suspect your problem is caused by out-of-date statistics. Since you do not have DBA access to the server, I would encourage you to ask the DBA when statistics were last updated. This can have a huge impact on performance. It also sounds like your tables are not indexed very well.
Basically, this does not "feel" like a parameter sniffing issue, but more like a general database-health issue.
This article describes how you can determine the last time statistics were updated:
Statistics Update Time
I second the comment about checking the statistics - I have seen several instances where a query's performance has fallen off a cliff specifically because the statistics are out of date.
Specifically, if you have a date in your PK, and SQL Server thinks there are only 10 or 100 records after a specific date when in fact there are thousands, it may choose terribly inefficient query plans because it thinks the dataset is much smaller than it really is.
HTH,
Andrew
I had a production issue exactly like this. A tab in the application which called a stored proc would not show. I ran a trace for the specific proc and saw the call. The application timed out in 30 secs and the proc took close to 40-50 secs to complete (I ran the proc exactly as called from the trace).
The next step was to figure out which statement was causing the scans I noticed in the execution of the procedure. So I scripted out the proc, removed the procedure syntax, declared the parameters as variables, and ran it in Query Analyzer. It RAN in 3 secs!!!
I'm writing this to let anyone out there looking for answers know that this can happen in SQL. It stems from the parameter sniffing issue. I was able to find this thread because I pin-pointed the cause as a faulty cached query plan! I've read posts where they said it happens to one specific user/value. But it can happen to any value and once it starts, it can be a continuous thing.
The solution for me was to script out the proc and run it again. Yeah, that simple. An ALTER works fine; no need to drop and re-create. This causes SQL to refresh the cached plan, and things were fine. I have not figured out how to disable this at a server level, and it is too cumbersome to clean up all the procs. Hope this helps.