I have a few questions regarding Microsoft SQL Server 2008 performance, mainly about execution plans.
According to MSDN, stored procedures have better performance compared to direct SQL queries, because:
The database can prepare, optimize, and cache the execution plan so that the execution plan can be reused at a later time.
My first question is why this is the case. I have previously read that when using parameterized queries (prepared statements), the execution plan is cached for subsequent executions with potentially different values (execution context). Would a stored procedure still be more efficient? If so, is a stored procedure's execution plan only recreated on demand, or is it just less likely to be cleared from the cache? Is a parameterized query treated as an ad-hoc query, meaning that the execution plan is more likely to be cleared from the cache?
Also, since I am still a novice in this field, I am wondering if there are certain commands that only work in T-SQL. I have a query that takes ~12 seconds to complete on the first run and then ~3 seconds after that, in both Microsoft SQL Server Management Studio and ADO.NET. The query is intentionally inefficient, as it is part of my presentation. The thing is that in my query I use both CHECKPOINT and DBCC DROPCLEANBUFFERS, as per this article, and also OPTION (RECOMPILE). However, at least the first two do not seem to make a difference, as the query still takes 3 seconds. My guess would be that the data cache is not being cleared. Any ideas why the cache does not seem to be cleared, or why my query is significantly faster after the first execution?
Those are the questions I could think of for now.
"Would a stored procedure still be more efficient?": Essentially no. It saves very little. From a performance standpoint, you can pretty much use SQL literals in your app (except if they are HUGE). SQL Server will match the string you send to it to a cached plan just fine.
" I have a query that takes ~12 seconds to complete on the first run and then ~3 seconds after " Considering that you cleared all caches, this is probably a statistics issue. SQL Server is auto-creating statistics the first time you access a column. I guess this is what happened once to you. Try running sp_updatestats (before you clear the caches).
Related
The database runs OLTP constantly. About once a week (sometimes more, sometimes less), one stored procedure brings the whole database server to a halt via CPU consumption. I alleviate the issue by recompiling the stored procedure, but this is no longer a viable solution, and I need assistance in identifying the cause and a solution. Any guidance would be appreciated. Our assumption is that a “good” execution plan is being lost and replaced with a “bad” execution plan.
"Our assumption is that a “good” execution plan is being lost and replaced with a “bad” execution plan":
More likely, a plan that was “good” at one time is now a “bad” plan for the current set of data.
Some things that may help:
Make sure statistics are up to date (daily, if not more frequently); see the sketch after this list
Perform routine (daily) maintenance to reduce fragmentation
You don't say why recompiling is not a viable option, but that could be helpful as well.
All of this can be automated so you don't have to babysit the system.
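For example, a minimal nightly maintenance sketch along these lines (the table and index names are hypothetical, and the right scope and thresholds depend on your system):
-- refresh statistics on a heavily changed table
UPDATE STATISTICS dbo.Orders WITH FULLSCAN;               -- hypothetical table
-- reduce fragmentation on one of its indexes (REBUILD instead if fragmentation is severe)
ALTER INDEX IX_Orders_CustomerId ON dbo.Orders REORGANIZE;   -- hypothetical index
GO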
I am using SQL Server 2008 R2 and am trying to test the performance of some SQL queries written in different styles. These involve UDFs, views, inline subqueries, joins, pivots, etc.
After quite a bit of testing, I can say that UDFs are super slow, so I have now expanded them inline into my queries. But as the performance testing went on, it became much harder to tell the difference, because I kept getting inconsistent results.
For example, I tested a query full of joins against one that moves some of the joins into inline selects. The inline style performed better and better, until I moved all the joins to inline selects, at which point it performed much worse than the original.
Another thing is the execution time, which is highly unreliable. For the same query, the execution time varies and is inconsistent as well. For instance, when I opened up SQL Profiler, the execution time could be quicker even when I saw higher CPU time, Reads, and Writes!
I really need a method that is as fair as possible for testing the SQL queries. The execution plan doesn't help. I need numbers like CPU cycles (not elapsed time), Reads, Writes, and Memory: something I can calculate and that is consistent. And I need a way to force the same testing environment. I tried DBCC DROPCLEANBUFFERS, but that's not helping.
Thank you!
I'm working to improve the efficiency of some SQL Queries on SQL-Server-2008. There are different ways of performing each query and I want to find the fastest of them.
However, I am having trouble determining which one actually executes faster. Ideally I could just run each query one after the other and see which runs fastest. Ideally...
The problem is that SQL Server is too smart for my liking. When constructing these queries I run them multiple times, and when I do, the queries' efficiency improves on its own. I would imagine this is because of some behind-the-scenes work that SQL Server does. What is this? How can I avoid it?
For example, I run the query once and it takes 30s. I run it again and it takes 10s. The more I run the query the faster it seems to run.
So... is there any way of "clearing the cache", or whatever the equivalent would be, in SQL Server? I want to get an accurate indication of which query is actually going to run faster. Alternatively, what would be the best way to do the type of testing that I want?
Any information in regard to this topic would be accepted as valid input.
When the query is run for the first time, the data is most likely still on disk and SQL Server has to fetch it. When you run the same query again, the data is already in RAM, and so it will be much faster than going to disk.
Run DBCC DROPCLEANBUFFERS and DBCC FREEPROCCACHE to clear the caches without doing a restart.
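A minimal sketch of that (run it only on a test server, never in production, since it throws away the whole instance's data cache and plan cache):
CHECKPOINT;               -- write dirty pages to disk so they become clean
GO
DBCC DROPCLEANBUFFERS;    -- empty the buffer pool (data cache)
GO
DBCC FREEPROCCACHE;       -- empty the plan cache
GO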
You need to look at execution plans, STATISTICS IO and STATISTICS TIME to really see what is going on. In the plan, look for conversions and also for scans (you want seeks if possible).
See also Client Statistics in SSMS, and check the execution times.
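A rough way to compare two candidate forms of a query under those settings might look like this (the tables and both query shapes below are hypothetical placeholders for your own variants):
SET STATISTICS IO ON;
SET STATISTICS TIME ON;
GO
-- variant 1: join-based form (dbo.Customers / dbo.Orders are hypothetical tables)
SELECT c.CustomerId, COUNT(o.OrderId) AS OrderCount
FROM dbo.Customers AS c
LEFT JOIN dbo.Orders AS o ON o.CustomerId = c.CustomerId
GROUP BY c.CustomerId;
GO
-- variant 2: equivalent correlated-subquery form
SELECT c.CustomerId,
       (SELECT COUNT(*) FROM dbo.Orders AS o WHERE o.CustomerId = c.CustomerId) AS OrderCount
FROM dbo.Customers AS c;
GO
SET STATISTICS IO OFF;
SET STATISTICS TIME OFF;
GO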
The improvement in speed that you see is a result of caching. SQL Server keeps recently read data pages in memory (the buffer pool) and reuses compiled execution plans from the plan cache, so repeated runs avoid much of the disk I/O and compilation work of the first run.
This post gives good pointers on how to work around this for performance tuning. You should also look into Execution Plans, which show you how the database would run the query, without actually running it. The benefit of this is that you can see if full table scans are being done where an index could be used instead.
Include the Actual Execution Plan and execute the following commands:
CHECKPOINT;               -- write dirty pages to disk so they count as clean
GO
DBCC DROPCLEANBUFFERS;    -- remove all clean pages from the buffer pool (data cache)
GO
I am facing a problem where running a stored procedure takes too many resources, which sometimes causes a timeout on the server (especially when CPU usage is above 90%).
Can anyone suggest the best and quickest way to spot the block that consumes the most resources, and also suggest a good way to solve it, please?
I am using SQL Server 2005.
You want to use the query profiler, explained here. It will show you a graphical representation of your query's execution path, as well as which parts of it are taking the most time.
If you want to know which block is slowest, use the following:
SET STATISTICS PROFILE ON   -- returns the plan and per-operator row counts for each statement
SET STATISTICS IO ON        -- reports logical and physical reads per table
SET STATISTICS TIME ON      -- reports parse/compile and execution CPU and elapsed times
When you run the SP, this will display statistics for each statement.
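For example (the procedure name and parameter are hypothetical):
SET STATISTICS PROFILE ON;
SET STATISTICS IO ON;
SET STATISTICS TIME ON;
GO
EXEC dbo.usp_GetCustomerOrders @CustomerId = 42;   -- hypothetical procedure
GO
SET STATISTICS PROFILE OFF;
SET STATISTICS IO OFF;
SET STATISTICS TIME OFF;
GO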
If you are using SQL Server Management Studio, you can turn on the execution plan to display information about how the query will be executed by SQL Server, including what percentage of the entire process is taken up by each sub-process.
Often when doing this, there will be a part of the query that is obviously using most of the resources.
Using this information, you can then make an informed decision about how to tune the database (like adding an index to the offending table(s)).
You don't need to use SQL Profiler to view an execution plan - just:
SET SHOWPLAN_XML ON
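For example, a minimal sketch (the procedure name is hypothetical; the SET statement has to be alone in its own batch, and the statements that follow are compiled but not executed):
SET SHOWPLAN_XML ON;
GO
EXEC dbo.usp_GetCustomerOrders @CustomerId = 42;   -- returns the XML plan(s) instead of running the procedure
GO
SET SHOWPLAN_XML OFF;
GO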
If there are a bunch of statements in the sproc, it can be a bit convoluted to turn on the SET STATISTICS options, since you have many chunks of output to associate with the input.
The graphical representation of a query plan in SSMS is pretty useful, since it shows you the % cost of each statement relative to the cost of the entire batch/sproc. But this is a single value, so it can be more helpful at times just to run Profiler and turn on statement-level output. Profiler will give you separate IO and CPU costs for each statement if you add the event SQL:StmtCompleted and the columns CPU and Reads.
On Oracle 10gR2, I have several SQL queries whose performance I am comparing. But after their first run, v$sql has the execution plan stored for caching, so for one of the queries I go from 28 seconds on the first run to 0.5 seconds afterwards.
I've tried
ALTER SYSTEM FLUSH BUFFER_CACHE;
After running this, the query consistently runs in 5 seconds, which I do not believe is accurate.
I thought maybe I could delete the entry itself from the cache:
delete from v$sql where sql_text like 'select * from....
but I get an error about not being able to delete from a view.
Peter gave you the answer to the question you asked.
alter system flush shared_pool;
That is the statement you would use to "delete prepared statements from the cache".
(Prepared statements aren't the only objects flushed from the shared pool; the statement does more than that.)
As I indicated in my earlier comment (on your question), v$sql is not a table. It's a dynamic performance view, a convenient table-like representation of Oracle's internal memory structures. You only have SELECT privilege on the dynamic performance views; you can't delete rows from them.
flush the shared pool and buffer cache?
The following doesn't answer your question directly. Instead, it answers a fundamentally different (and maybe more important) question:
Should we normally flush the shared pool and/or the buffer cache to measure the performance of a query?
In short, the answer is no.
I think Tom Kyte addresses this pretty well:
http://www.oracle.com/technology/oramag/oracle/03-jul/o43asktom.html
http://www.oracle.com/technetwork/issue-archive/o43asktom-094944.html
<excerpt>
Actually, it is important that a tuning tool not do that. It is important to run the test, ignore the results, and then run it two or three times and average out those results. In the real world, the buffer cache will never be devoid of results. Never. When you tune, your goal is to reduce the logical I/O (LIO), because then the physical I/O (PIO) will take care of itself.
Consider this: Flushing the shared pool and buffer cache is even more artificial than not flushing them. Most people seem skeptical of this, I suspect, because it flies in the face of conventional wisdom. I'll show you how to do this, but not so you can use it for testing. Rather, I'll use it to demonstrate why it is an exercise in futility and totally artificial (and therefore leads to wrong assumptions). I've just started my PC, and I've run this query against a big table. I "flush" the buffer cache and run it again:
</excerpt>
I think Tom Kyte is exactly right. In terms of addressing the performance issue, I don't think that "clearing the Oracle execution plan cache" is normally a step for reliable benchmarking.
Let's address the concern about performance.
You tell us that you've observed that the first execution of a query takes significantly longer (~28 seconds) compared to subsequent executions (~5 seconds), even when flushing (all of the index and data blocks from) the buffer cache.
To me, that suggests that the hard parse is doing some heavy lifting. It's either a lot of work, or it's encountering a lot of waits. This can be investigated and tuned.
I'm wondering if perhaps statistics are non-existent, and the optimizer is spending a lot of time gathering statistics before it prepares a query plan. That's one of the first things I would check, that statistics are collected on all of the referenced tables, indexes and indexed columns.
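For instance, a quick check along those lines might look like this (the schema and table names are hypothetical):
-- see whether the referenced tables have ever been analyzed
select table_name, num_rows, last_analyzed
from   all_tables
where  owner = 'APP_OWNER'            -- hypothetical schema
and    table_name = 'BIG_TABLE';      -- hypothetical table
-- gather statistics on the table, its columns and its indexes
exec dbms_stats.gather_table_stats(ownname => 'APP_OWNER', tabname => 'BIG_TABLE', cascade => TRUE);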
If your query joins a large number of tables, the CBO may be considering a huge number of permutations for join order.
A discussion of Oracle tracing is beyond the scope of this answer, but it's the next step.
I'm thinking you are probably going to want to trace events 10053 and 10046.
Here's a link to an "event 10053" discussion by Tom Kyte you may find useful:
http://asktom.oracle.com/pls/asktom/f?p=100:11:0::::P11_QUESTION_ID:63445044804318
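If you do go down that path, a minimal session-level sketch looks like this (run the problem query between enabling and disabling the events, then pick up the trace file from the server's trace directory):
alter session set events '10053 trace name context forever, level 1';   -- optimizer decision trace
alter session set events '10046 trace name context forever, level 12';  -- SQL trace including waits and binds
-- run the problem query here
alter session set events '10046 trace name context off';
alter session set events '10053 trace name context off';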
tangentially related anecdotal story re: hard parse performance
A few years back, I did see one query that had elapsed times measured in minutes on first execution and in seconds for subsequent executions. What we found was that the vast majority of the time for the first execution was spent on the hard parse.
This problem query was written by a CrystalReports developer who innocently (naively?) joined two humongous reporting views.
One of the views was a join of 62 tables, the other view was a join of 42 tables.
The query used the Cost Based Optimizer. Tracing revealed that it wasn't wait time; it was all CPU time spent evaluating possible join paths.
Each of the vendor-supplied "reporting" views wasn't too bad by itself, but when two of them were joined, the result was agonizingly slow. I believe the problem was the vast number of join permutations that the optimizer was considering. There is an instance parameter that limits the number of permutations considered by the optimizer, but our fix was to rewrite the query. The improved query joined only the dozen or so tables that were actually needed.
The initial short-term "band aid" fix was to schedule a run of the query earlier in the morning, before the report generation task ran. That made report generation "faster", because the report generation run made use of the already prepared statement in the shared pool, avoiding the hard parse.
The band-aid fix wasn't a real solution; it just moved the problem to a preliminary execution of the query, where the long execution time wasn't noticed.
Our next step would have probably been to implement a "stored outline" for the query, to get a stable query plan.
Of course, statement reuse (avoiding the hard parse by using bind variables) is the normative pattern in Oracle. It improves performance, scalability, yada, yada, yada.
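As a tiny illustration of that pattern in SQL*Plus (the table and column are hypothetical), the second SELECT reuses the cursor parsed for the first one because only the bind value changes:
variable dept_id number
exec :dept_id := 10;
select count(*) from employees where department_id = :dept_id;   -- hard parse on first execution
exec :dept_id := 20;
select count(*) from employees where department_id = :dept_id;   -- identical SQL text, so the parsed cursor is reused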
This anecdotal incident may be entirely different than the problem you are observing.
HTH
It's been a while since I worked with Oracle, but I believe execution plans are cached in the shared pool. Try this:
alter system flush shared_pool;
The buffer cache is where Oracle stores recently used data in order to minimize disk I/O.
We've been doing a lot of work lately with performance tuning queries, and one culprit for inconsistent query performance is the file system cache that Oracle is sitting on.
It's possible that, while you're flushing the Oracle caches, the file system still holds the data your query is asking for, meaning that the query will still return quickly.
Unfortunately I don't know how to clear the file system cache - I just use a very helpful script from our very helpful sysadmins.
FIND ADDRESS AND HASH_VALUE OF SQL_ID
select address,hash_value,inst_id,users_executing,sql_text from gv$sqlarea where sql_id ='7hu3x8buhhn18';
PURGE THE PLAN FROM SHARED POOL
exec sys.dbms_shared_pool.purge('0000002E052A6990,4110962728','c');   -- first argument is the ADDRESS,HASH_VALUE pair from the previous query; 'c' purges a cursor
VERIFY
select address,hash_value,inst_id,users_executing,sql_text from gv$sqlarea where sql_id ='7hu3x8buhhn18';