Are there any hints in Oracle that work the same way as these SQL Server hints?
Recompile: the query is recompiled every time it is run (useful when execution plans should vary greatly depending on the parameters). Would cursor_sharing in Oracle be the closest equivalent?
Optimize for: used when you want the plan optimized for a certain parameter value, even if a different value is used the first time the SQL is run. Could this also be approximated with cursor_sharing?
Since you're using 11g, Oracle should use adaptive cursor sharing by default. If a query uses bind variables and the histogram on a column with skewed data indicates that different bind variable values should use different query plans, Oracle will automatically maintain multiple query plans for the same SQL statement. There is no need to hint the queries to get this behavior; it's already baked into the optimizer.
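To see adaptive cursor sharing at work, a minimal sketch (assuming you can query V$SQL and substitute your statement's real SQL_ID for the made-up one below) is to check the bind-sensitivity flags and child cursors:

-- Hypothetical SQL_ID; replace with the one for your statement.
SELECT sql_id,
       child_number,
       is_bind_sensitive,  -- 'Y' once the optimizer has noticed that peeked binds matter
       is_bind_aware,      -- 'Y' once multiple plans are being maintained
       plan_hash_value,
       executions
FROM   v$sql
WHERE  sql_id = '7h35uxf5uhmm1';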
I didn't know, but found a discussion with some solutions here on forums.oracle.com
According to this Oracle documentation, I can assume that the Optimizer postpones the hard parse and it doesn't generate an execution plan until the first time a prepared statement is executed:
"The answer is a phenomenon called bind peeking. Earlier, when you ran that query with the bind variable value set to 'NY', the optimizer had to do a hard parse for the first time and while doing so it peeked at the bind variable to see what value had been assigned to it."
But when executing an EXPLAIN PLAN for a prepared statement with bind parameters, we still get an execution plan. On his site, Markus Winand says that:
"When using bind parameters, the optimizer has no concrete values available to determine their frequency. It then just assumes an equal distribution and always gets the same row count estimates and cost values. In the end, it will always select the same execution plan."
Which one is true? Is an execution plan generated when the statement is prepared, using an even-distribution model for the bind values, or is the hard parse postponed until the first execution?
This discussion misses a very important point about bind variables, parsing and bind peeking: histograms! Bind variables only become an issue when the column in question has histograms. Without histograms there is no need to peek at the value. Oracle then has no information about the distribution of the data, and will only use pure math (distinct values, number of null values, number of rows, etc.) to find the selectivity of the filter in question.
Binds and histograms are logical opposites. You use bind variables to get one execution plan for all your queries. You use histograms to get different execution plans for different search values. Bind peeking tries to overcome this conflict, but it does not do a very good job of it; many people have characterized the bind peeking feature as "a bug". Adaptive Cursor Sharing, introduced in Oracle 11g, does a better job of solving this.
Actually, I see too many histograms around. I usually disable histograms (method_opt => 'for all columns size 1') and only create them when I truly need them.
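As a minimal sketch (the schema and table names are made up), gathering statistics without histograms and later adding one back for a single skewed column could look like this:

BEGIN
  -- No histograms: one bucket per column.
  DBMS_STATS.GATHER_TABLE_STATS(
    ownname    => 'APP',        -- hypothetical schema
    tabname    => 'PROJECTS',   -- hypothetical table
    method_opt => 'FOR ALL COLUMNS SIZE 1');

  -- Later, create a histogram only on the one skewed column that needs it.
  DBMS_STATS.GATHER_TABLE_STATS(
    ownname    => 'APP',
    tabname    => 'PROJECTS',
    method_opt => 'FOR COLUMNS BUDGET SIZE 254');
END;
/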
And then to the original question: "Does Oracle choose a default execution plan when parsing a prepared statement?"
Parsing is not one activity. Parsing involves syntax checking, semantic analysis (do the tables and columns exist, do you have access to the tables), query rewrite (Oracle might rewrite the query in a better way; for instance, if we use the filters a=b and b=c, then Oracle can add the filter a=c), and of course finding an execution plan. We actually distinguish between different types of parsing: soft parse and hard parse. A hard parse is where Oracle also has to create the execution plan for the query. This is a very costly activity.
Back to the question. Parsing doesn't really care whether you are using bind variables or not. The difference is that if you use binds, you probably only have to do a soft parse. Using bind variables, your query will look the same every time you run it (and therefore get the same hash_value). When you run a query, Oracle checks the library cache to see if there already is an execution plan for your query. This is not a default plan, but a plan that already exists because someone else has executed the same query (making Oracle do a hard parse and generate an execution plan for it) and the plan hasn't aged out of the cache yet. It's simply the plan the optimizer considered the best choice for your query at parse time.
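A minimal way to see that reuse (assuming access to V$SQL; the SQL text is made up) is to look at how often the cached cursor has been parsed and executed:

SELECT sql_id,
       plan_hash_value,
       parse_calls,  -- how many times the statement was parsed (mostly soft parses)
       executions,   -- how many times the cached cursor was reused
       loads         -- how many times the cursor was loaded/reloaded (hard parses)
FROM   v$sql
WHERE  sql_text LIKE 'SELECT * FROM projects WHERE budget > :budget%';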
When you come to Oracle 12c it actually gets even more complicated. In 12c Oracle has adaptive execution plans, which means the execution plan has an alternative: it can start out with a nested loop, and if it realizes it got the cardinality estimates wrong it can switch to a hash join in the middle of the execution of the query. It also has adaptive statistics and SQL plan directives. All to help the optimizer make better choices when running your SQL :-)
The first bind peek actually happens at the first execution; the plan optimization is deferred and does not happen at the prepare phase. Later on, another bind peek might happen. Typically for VARCHAR2, when you bind two radically different values (e.g. a first value of 1 byte and a later one of 10 bytes) the optimizer peeks again and might produce a new plan. In Oracle 12c this is extended even further with adaptive join methods: the optimizer suggests NESTED LOOPS, but when many more rows than estimated arrive during execution it switches to a HASH join immediately. It's not like adaptive cursor sharing, where you need to make a mistake first to produce a new execution plan.
Also, one very important thing about prepared statements: since they just re-execute the same cursor that was created with the first execution, they will always execute the same plan; there cannot be any adaptation. For adaptation and alternative execution plans at least a soft parse must occur, so the plan must first have aged out of the shared pool or been invalidated for some reason.
EXPLAIN PLAN does not create a cursor, so it will never respect bind variables. It is only DBMS_XPLAN.DISPLAY_CURSOR, which shows the plan of a real cursor, where you can see bind variable information.
You can find actual information about captured bind values in V$SQL_BIND_CAPTURE.
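As a sketch (the SQL_ID is made up), you can display the plan of the real cursor together with the binds that were peeked at hard parse, and then look at the captured bind values:

-- Plan of the actual cursor, including the peeked binds (format option assumed available in 11g+).
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR('7h35uxf5uhmm1', NULL, 'TYPICAL +PEEKED_BINDS'));

-- Bind values periodically captured for the same statement.
SELECT name, datatype_string, value_string
FROM   v$sql_bind_capture
WHERE  sql_id = '7h35uxf5uhmm1';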
According to Tom Kyte bind peeking takes place at the hard-parse stage, which chimes with the first quote in your post. In 11g the optimizer is even able to come up with different plans for different bind ranges, which directly contradicts the second quote (although to be fair he is talking about bind variables and not peeking specifically).
The query in the application uses bind values that drive it to one plan or the other consistently. It is only when the plan flip-flops between two radically different execution paths, and for some segment of users, that you have a really bad plan. In such cases, Oracle Database 11g might be the right answer for you, because it accommodates multiple plans.
In general, Oracle behavior starting from 11g is best described by adaptive cursor sharing (see http://docs.oracle.com/database/121/TGSQL/tgsql_cursor.htm#BGBJGDJE)
For JDBC (Thin Driver) specifically: When using PreparedStatements, no plan is generated before the execution step.
See the following example:
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import oracle.jdbc.OracleConnection;

String metrics[] = new String[OracleConnection.END_TO_END_STATE_INDEX_MAX];
metrics[OracleConnection.END_TO_END_MODULE_INDEX] = "adaptiveCSTest";
((OracleConnection) conn).setEndToEndMetrics(metrics, (short) 0);
String getObjectNames = "select object_name from key.objects where object_type=?";
PreparedStatement objectNamesStmt = conn.prepareStatement(getObjectNames);
// module set, but statement not parsed yet
objectNamesStmt.setString(1, "CLUSTER");
// still not parsed; binding a value does not trigger a parse
ResultSet rset1 = objectNamesStmt.executeQuery();
// statement parsed and executed
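Because the code tags the session with a module name, one way to verify when the cursor actually appears (a sketch, assuming you can query V$SQL) is:

-- The cursor only shows up here after executeQuery(), i.e. after the parse.
SELECT sql_id, child_number, executions, plan_hash_value
FROM   v$sql
WHERE  module = 'adaptiveCSTest';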
I have a query that is using the wrong indexes, and I can see that with that index there is no easy way for Oracle to fetch the data. The query is generated by vendor software and cannot be changed. Is there a way to force Oracle to change the execution plan without hints?
Any help would be much appreciated.
There are at least 11 ways to control a plan without modifying the query. They are listed below roughly in the order of usefulness:
SQL Plan Baseline - Replace one plan with another plan (a minimal sketch follows this list).
SQL Profiles - Add "corrective" hints to the plans. For example, a profile might say "this join returns 100 times more rows than expected", which indirectly changes the plan.
Stored Outline - Similar in idea to a SQL Plan Baseline, but with fewer features. This option is simpler to use, but it is less powerful and no longer supported.
DBMS_STATS.SET_X_STATS - Manually modifying table, column, and index stats can significantly change plans by making objects artificially look more or less expensive.
Session Control - For example, alter session set optimizer_features_enable='11.2.0.3';. There isn't always a helpful parameter, but one of the OPTIMIZER_* parameters may help, or you may be able to change the plan with an undocumented hint or by disabling a feature, like this: alter session set "_fix_control"='XYZ:OFF';
System Control - Similar to above but applies to the whole system.
DBMS_SPD - A SQL Plan Directive is similar to a profile in that it provides some corrective information to the optimizer. But this works at a lower level, across all plans, and is new to 12c.
DBMS_ADVANCED_REWRITE - Change a query into another query.
Virtual Private Database - Change a query into another query, by adding predicates. It's not intended for performance, but you can probably abuse it to change index access paths.
SQL Translation Framework - Change a query into another query, before it even gets parsed. This can enable totally "wrong" SQL to run.
SQL Patch (dbms_sqldiag internal.i_create_patch) - Change a query into another query. Similar to DBMS_ADVANCED_REWRITE but it's undocumented and perhaps a bit more powerful.
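For the first option, a minimal sketch of capturing a known-good plan from the cursor cache as a fixed baseline (the SQL_ID and plan hash value are made up) looks roughly like this:

DECLARE
  n PLS_INTEGER;
BEGIN
  -- Capture the good plan of the vendor statement as a fixed SQL plan baseline.
  n := DBMS_SPM.LOAD_PLANS_FROM_CURSOR_CACHE(
         sql_id          => 'abcd1234efgh5',  -- hypothetical SQL_ID of the vendor query
         plan_hash_value => 987654321,        -- hypothetical hash of the good plan
         fixed           => 'YES');
  DBMS_OUTPUT.PUT_LINE(n || ' plan(s) loaded');
END;
/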
I have a query that is being called by a DataContext object that is creating an extremely inefficient execution plan. I would like to add an "OPTION(RECOMPILE)" query hint to the query, but I do not know how to add this query hint to a DataContext object's query.
I ran a SQL trace in order to capture the query. I ran it manually as-is and it took almost four minutes; adding "OPTION(RECOMPILE)" to the query reduced the run time to a second. The query contains many variables, a couple of table-valued functions and a view with an embedded table-valued function. All the input variables are numbers. The query plans between the two executions were very different.
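For reference, the hint simply goes at the end of the statement; a trivial sketch (table and parameter names are made up, not the actual traced query) would be:

SELECT p.ProjectId, p.Budget
FROM   dbo.Projects AS p
WHERE  p.Budget > @budget
OPTION (RECOMPILE);  -- forces a fresh plan for the current parameter values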
I do not need help optimizing the code to avoid the poor execution plan; I can do this myself if I need to go that route. All I need to know is whether there is a way to add the OPTION(RECOMPILE) query hint to my LINQ query. I'm not going to post the code; it is irrelevant to my question.
If it is possible to add the RECOMPILE query hint, please let me know how. If it is not possible, a link to documentation indicating that would be appreciated.
I'm using SQL Server 2012 as my RDBMS.
There is an issue against EF requesting that hints be added in the future - http://entityframework.codeplex.com/workitem/261.
If you're lucky it will make it into EF 6.
Suppose I have a table called Projects with a column called Budget with a standard B-Tree index. The table has 50,000 projects, and only 1% of them have a Budget of over one million. If I ran the SQL Query:
SELECT * From Projects WHERE Budget > 1000000;
The planner will use an index range scan on Budget to get the rows off the heap table. However, if I use the query:
SELECT * From Projects WHERE Budget > 50;
The planner will most likely do a sequential scan on the table, as it will know this query will end up returning most or all rows anyway and there's no reason to load all the pages of the index into memory.
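For instance, in Oracle you could check which access path the optimizer picks for a given literal with something like this (a sketch, using the table from the example above):

EXPLAIN PLAN FOR SELECT * FROM Projects WHERE Budget > 1000000;
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);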
Now, let's say I run the query:
SELECT * From Projects WHERE Budget > :budget;
Where :budget is a bind parameter passed into my database. From what I've read, the query as above will be cached, and no data on cardinality can be inferred. In fact, most databases will just assume an even distribution and the cached query plan will reflect that. This surprised me, as usually when you read about the benefits of bind parameters it's on the subject of preventing SQL injection attacks.
Obviously, this could improve performance if the resulting query plan would be the same, as a new plan wouldn't have to be compiled, but could also hurt performance if the values of :budget greatly varied.
My Question: Why are bind parameters not resolved before the query plan is generated and cached? Shouldn't modern databases strive to generate the best plan for the query, which should mean looking at the value for each parameter and getting accurate index stats?
Note: This question probably doesn't apply to MySQL, as MySQL doesn't cache SQL plans. However, I'm interested in why this is the case on Postgres, Oracle and MS SQL.
For Oracle specifically, it depends.
For quite some time (at least 9i), Oracle has supported bind variable peeking. That means that the first time a query is executed, the optimizer peeks at the value of the bind variable and bases its cardinality estimates on the value of that first bind variable. That makes sense in cases where most of the executions of a query are going to have bind variable values that return similarly sized results. If 99% of the queries are using small budget values, it is highly likely that the first execution will use a small value and thus the cached query plan will be appropriate for small bind variable values. Of course, that means that when you do specify a large bind variable value (or, worse, if you get lucky and the first execution is with a large value) you'll get less than optimal query plans.
If you are using 11g, Oracle can use adaptive cursor sharing. This allows the optimizer to maintain multiple query plans for a single query and to pick the appropriate plan based on the bind variable values. That can get rather complicated over time, though. If you have a query with N bind variables, the optimizer has to figure out how to partition that N-dimensional space into different query plans for different bind variable values in order to figure out when and whether to re-optimize a query for a new set of bind variable values and when to simply reuse an earlier plan. A lot of that work ends up being done at night during the nightly maintenance window in order to avoid incurring those costs during the productive day. But that also brings up issues about how much freedom the DBA wants to give the database to evolve plans over time vs how much the DBA wants to control plans so that the database doesn't suddenly start picking a poor plan that causes some major system to slow to a crawl on a random day.
This surprised me, as usually when you read about the benefits of bind parameters it's on the subject of preventing SQL injection attacks.
Don't confuse parameterized queries with prepared statements. Both offer parameterization, but prepared statements offer the additional caching of the query plan.
Why are bind parameters not resolved before the query plan is generated and cached?
Because sometimes generating the query plan is an expensive step. Prepared statements allow you to amortize the cost of query planning.
However, if all you're looking for is SQL injection protection, don't use prepared statements. Use parameterized queries.
For example, in PHP, you can use http://php.net/pg_query_params to execute a parameterized query WITHOUT caching the query plan; meanwhile http://php.net/pg_prepare and http://php.net/pg_execute are used to cache a plan for a prepared statement and later execute it.
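At the SQL level the distinction looks roughly like this (a sketch using the Projects table from the question; exact planning behavior depends on the PostgreSQL version, as noted below):

-- Prepared statement: the plan may be built once and reused across executions.
PREPARE budget_query(numeric) AS
  SELECT * FROM Projects WHERE Budget > $1;

EXECUTE budget_query(1000000);  -- reuses the prepared statement
EXECUTE budget_query(50);       -- same statement, possibly the same generic plan

DEALLOCATE budget_query;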
Edit: PostgreSQL 9.2 apparently changes the way prepared statements are planned
My primary concern is with SQL Server 2005... I went through many websites and each tells me something different.
What are the scenarios that are good / OK to use? For example, does it hurt to even set variable values inside an IF, or only if I run a query? Supposing my SP builds dynamic SQL based on several conditions in input parameters, do I need to rethink the query? What about an SP that runs a different query based on whether some record exists in a table? etc. My question is not limited to these scenarios; I'm looking for a slightly more generalised answer so that I can improve my future SPs.
In essence: which statements are good to use in branching conditions / loops, which are bad, and which are okay?
Generally... Avoid procedural code in your database, and stick to queries. That gives the Query Optimizer the chance to do its job much better.
The exceptions would be code that is designed to do many things rather than produce a result set, and cases where a query would need to join rows exponentially to get a result.
It is very hard to answer this question if you don't provide any code. No language construct is good/bad/okay by itself; it's what you want to achieve and how well that can be expressed with those constructs.
There's no definitive answer as it really depends on the situation.
In general, I think it's best to keep the logic within a sproc as simple and set-based as possible. Making it too complicated, with multiple nested IF conditions for example, may complicate things for the query optimiser, meaning it can't create a good execution plan suitable for all paths through the sproc. For example, the first time the sproc is run, it takes path A through the logic and the execution plan reflects this. The next time it runs with different parameters, it takes path B but reuses the original execution plan, which is not optimal for this second path. One solution is to break the logic into separate stored procedures to call depending on the path being followed; this allows each sub sproc to be optimised and its execution plan cached independently.
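A minimal sketch of that pattern (procedure, table and column names are made up) might look like this:

-- The parent proc only branches; each path gets its own separately cached plan.
CREATE PROCEDURE dbo.GetOrders @CustomerId INT, @IncludeArchived BIT
AS
BEGIN
    IF @IncludeArchived = 1
        EXEC dbo.GetOrders_All @CustomerId;        -- hypothetical sub-proc, own plan
    ELSE
        EXEC dbo.GetOrders_ActiveOnly @CustomerId; -- hypothetical sub-proc, own plan
END;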
Loops can sometimes be the only viable option, but in general I'd try not to use them; always do things in a set-based fashion if possible.