SQL Server 2014 performance - parameterized SQL vs. literals

I have a simple query like
select count(distinct [key])
from [table]
where [date] between '2014-01-01' and '2014-12-31'
It's fast (about 1 second), but becomes much slower (about 4 seconds) when I try to parameterize it within sp_executesql:
exec sp_executesql
    N'select count(distinct [key]) from [table] where [date] between @start and @end',
    N'@start date, @end date',
    @start = '2014-01-01', @end = '2014-12-31'
Why the difference in performance?
UPDATE The plan difference appears to be because of type conversion. When I change the parameter types to N'@start datetime, @end datetime', to match the columns exactly, the discrepancy disappears, and the plans for parameters vs. constants are practically identical (same costs, etc.). (Facepalm.)
I'll accept an answer that explains why the type conversion results in such a dramatic plan difference rather than just converting the parameters up front and proceeding as usual.
END UPDATE
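For reference, here is a minimal sketch of the corrected call described in the update (assuming the date column really is datetime; the table and column names are the question's placeholders, bracketed since they are reserved words):

exec sp_executesql
    N'select count(distinct [key]) from [table] where [date] between @start and @end',
    N'@start datetime, @end datetime',  -- declared to match the column type exactly
    @start = '2014-01-01', @end = '2014-12-31';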
The plans are very similar - same index, same estimated cardinalities, same row counts, and same I/O - although the CPU cost estimates in the parameterized version are higher.
Specific discrepancies:
The same Index Seek appears in both plans, but it is parallel in the (fast) version with literals, and NOT parallel in the (slow) version with parameters. The non-parallel one has a higher CPU estimate.
The version with parameters has a Nested Loops join connecting the index seek with some logic around the parameters. This appears to add some CPU overhead of its own (CPU estimate = 7.6 on the nested loop).
Both versions have "Parallelism" as a child of the Hash Match; in the parameterized version it is Distribute Streams (CPU estimate = 16.5) but in the literal version it is Repartition Streams (CPU estimate = 8.3)
How can I get the parameterized version to perform similarly to the version with literals?
All I could find in my research about performance of parameterized queries has to do with estimates -- either plans cached with different parameter values, or local variables being treated as unknowns. Neither is a factor here; my estimates are correct.

Why the difference in performance?
Because you have two different queries and they have two different execution plans (even though they are similar).
Why do you have two different plans?
There is a great, detailed article by Erland Sommarskog, Slow in the Application, Fast in SSMS? Understanding Performance Mysteries, where he explains how the query optimizer works. The key points applicable to your example are:
A constant is a constant, and when a query includes a constant, SQL Server can use the value of the constant with full trust, and even take such shortcuts to not access a table at all, if it can infer from constraints that no rows will be returned.
For a parameter, SQL Server does not know the run-time value, but it "sniffs" the input value when compiling the query.
For a local variable, SQL Server has no idea at all of the run-time value, and applies standard assumptions. (Which the assumptions are depends on the operator and what can be deduced from the presence of unique indexes.)
And there is a corollary of this: if you take out a query from a stored procedure and replace variables and parameters with constants, you now have quite a different query.
As shown in the article, the quick answer to:
How can I get the parameterized version to perform similarly to the version with literals?
is to use OPTION (RECOMPILE) or OPTIMIZE FOR.
RECOMPILE
Instructs the SQL Server Database Engine to discard the plan generated for the query after it executes, forcing the query optimizer to recompile a query plan the next time the same query is executed. Without specifying RECOMPILE, the Database Engine caches query plans and reuses them. When compiling query plans, the RECOMPILE query hint uses the current values of any local variables in the query and, if the query is inside a stored procedure, the current values passed to any parameters.
So, OPTION (RECOMPILE) behaves as if the query had literal values instead of parameters.
OPTIMIZE FOR
Instructs the query optimizer to use a particular value for a local variable when the query is compiled and optimized. The value is used only during query optimization, and not during query execution.
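As a rough sketch, this is how those hints would look applied to the question's query, assuming @start and @end are parameters in scope (e.g. inside sp_executesql or a stored procedure); the OPTIMIZE FOR values are just illustrative:

-- recompile with the actual parameter values on every execution
select count(distinct [key]) from [table]
where [date] between @start and @end
option (recompile);

-- or: always optimize as if a typical year-long range were passed
select count(distinct [key]) from [table]
where [date] between @start and @end
option (optimize for (@start = '2014-01-01', @end = '2014-12-31'));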

Related

Impact of variables in a parametrized SQL query

I have a parametrized query which looks like this (with ? being the application's parameter):
SELECT * FROM tbl WHERE tbl_id = ?
What are the performance implications of adding a variable like so:
DECLARE @id INT = ?;
SELECT * FROM tbl WHERE tbl_id = @id
I have attempted to investigate this myself but have had no luck, other than noticing that query plans take slightly longer to compile when the query is first run.
If tbl_id is unique, there is no difference at all. I'll try to explain why.
SQL Server can usually solve a query with many different execution plans, and it has to choose one. It tries to find the most efficient one without too much effort. Once SQL Server chooses a plan, it usually caches it for later reuse. Cardinality plays a key role in the efficiency of an execution plan, i.e. how many rows are in tbl for a given value of tbl_id. SQL Server stores column value frequency statistics to estimate cardinality.
First, let's assume tbl_id is not unique and has a non-uniform distribution.
In the first case we have tbl_id = ?. Let's figure out its cardinality. The first thing we need is the value of the parameter ?. Is it unknown? Not really: we have a value the first time the query is executed. SQL Server takes this value, goes to the stored statistics, estimates the cardinality for this specific value, estimates the cost of a number of possible execution plans taking the estimated cardinality into account, chooses the most efficient one, and caches it for later reuse. This approach works most of the time. However, if you later execute the query with another parameter value that has a very different cardinality, the cached execution plan might be very inefficient.
In the second case we have tbl_id = @id, where @id is a variable declared in the batch; it isn't a query parameter. What is the value of @id? SQL Server treats it as an unknown value and picks the mean frequency from the stored statistics as the estimated cardinality for unknown values. Then SQL Server does the same as before: it estimates the cost of a number of possible execution plans taking the estimated cardinality into account, chooses the most efficient one, and caches it for later reuse. Again, this approach works most of the time. However, if you execute the query with a value whose cardinality is very different from the mean, the execution plan might be very inefficient.
When all values have the same cardinality, they all have the mean cardinality, so there is no difference between a parameter and a variable. This is the case with unique values, therefore there is no difference when values are unique.
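A minimal sketch of the two cases side by side, using the question's tbl/tbl_id names (42 is an arbitrary value):

-- parameter: the value 42 is sniffed at compile time, so the histogram is used
exec sp_executesql
    N'select * from tbl where tbl_id = @id',
    N'@id int',
    @id = 42;

-- local variable: the value is unknown at compile time, so the mean frequency is used
declare @id int = 42;
select * from tbl where tbl_id = @id;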
One advantage of the second approach is that it reduces the number of plans SQL Server will store.
In the first version it will create a different plan for every datatype (tinyint, smallint, int and bigint), assuming it's an ad-hoc statement.
If it's in a stored procedure, you might run into parameter sniffing as mentioned above.
You could try adding
OPTION (OPTIMIZE FOR (@id = [some good value]))
to the select to see if that helps - but it is usually not considered good practice to couple your queries to specific values.
I'm not sure if this helps, but I have to account for parameter sniffing for a lot of the stored procedures I write. I do this by creating local variables, setting those to the parameter values, and then using the local variables in the stored procedure. If you look at the stored execution plan, you can see that this prevents the parameter values from being used in the plan.
This is what I do:
CREATE PROCEDURE dbo.Test ( @var int )
AS
    DECLARE @_var int
    SELECT @_var = @var

    SELECT *
    FROM dbo.SomeTable
    WHERE Id = @_var
I do this mostly for SSRS. For example, I've had a query/stored procedure return in under a second while the report took several minutes. The trick above fixed that.
There are also hints for optimizing for specific values or for unknown values (e.g. OPTION (OPTIMIZE FOR (@var UNKNOWN))), but I've found this usually does not help me and does not have the same effect as the trick above. I haven't been able to investigate the specifics of why they are different, but in my experience OPTIMIZE FOR UNKNOWN did not help, whereas using local variables in place of the parameters did.
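For completeness, a sketch of the hint-based alternative (dbo.SomeTable is the table used above; dbo.Test2 is a hypothetical name to avoid clashing with the procedure above):

CREATE PROCEDURE dbo.Test2 ( @var int )
AS
    SELECT *
    FROM dbo.SomeTable
    WHERE Id = @var
    OPTION (OPTIMIZE FOR (@var UNKNOWN));  -- or OPTIMIZE FOR UNKNOWN to cover all parameters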

SQL - any performance difference using constant values vs parameters?

Is there any difference, with regards to performance, when there are many queries running with (different) constant values inside a where clause, as opposed to having a query with declared parameters on top, where instead the parameter value is changing?
Sample query with a constant value in the where clause:
select
*
from [table]
where [guid_field] = '00000000-0000-0000-0000-000000000000' --value changes
Proposed (improved?) query with declared parameters:
declare @var uniqueidentifier = '00000000-0000-0000-0000-000000000000' --value changes
select
*
from [table]
where [guid_field] = @var
Is there any difference? I'm looking at the execution plans of something similar to the two above queries and I don't see any difference. However, I seem to recall that if you use constant values in SQL statements that SQL server won't reuse the same query execution plans, or something to that effect that causes worse performance -- but is that actually true?
It is important to distinguish between parameters and variables here. Parameters are passed to procedures and functions; variables are declared.
Addressing variables, which is what the SQL in the question uses: when compiling an ad-hoc batch, SQL Server compiles each statement in its own right.
So when compiling the query with a variable, it does not go back to check any assignment; it compiles an execution plan optimised for an unknown value.
On first run, this execution plan is added to the plan cache, and future executions can, and will, reuse this cached plan for all variable values.
When you pass a constant, the query is compiled based on that specific value, so it can create a more optimal plan, but with the added cost of recompilation.
So to specifically answer your question:
However, I seem to recall that if you use constant values in SQL statements that SQL server won't reuse the same query execution plans, or something to that effect that causes worse performance -- but is that actually true?
Yes, it is true that the same plan cannot be reused for different constant values, but that does not necessarily cause worse performance. It is possible that a more appropriate plan can be used for that particular constant (e.g. choosing a bookmark lookup over an index scan for sparse data), and this query plan change may outweigh the cost of recompilation. So, as is almost always the case with SQL performance questions, the answer is: it depends.
For parameters, the default behaviour is that the execution plan is compiled based on the parameter value(s) used when the procedure or function is first executed.
I have answered similar questions before in much more detail with examples, that cover a lot of the above, so rather than repeat various aspects of it I will just link the questions:
Does assigning stored procedure input parameters to local variables help optimize the query?
Ensure cold cache when running query
Why is SQL Server using index scan instead of index seek when WHERE clause contains parameterized values
There are many things involved in your question, and they all have to do with statistics.
SQL Server compiles execution plans even for ad-hoc queries and stores them in the plan cache for reuse, if they are deemed safe.
select * into test from sys.objects

select schema_id, count(*)
from test
group by schema_id
--schema_id 1 has 15 rows
--schema_id 4 has 44 rows
First ask:
We are trying a different literal every time, and SQL Server saves the plan if it deems it safe. You can see that the second query's estimates are the same as for literal 4, since SQL Server saved the plan compiled for 4.
--let's clear the cache first--not for prod
dbcc freeproccache

select * from test
where schema_id = 4
output: (screenshot: estimated number of rows = 44)

select * from test
where schema_id = 1
output: (screenshot: the estimates match the plan compiled for schema_id = 4)
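If you want to verify what got cached for these two statements, one way (a sketch, not part of the original demo) is to query the plan cache:

-- list cached plans whose text touches the test table
select cp.objtype, cp.usecounts, st.text
from sys.dm_exec_cached_plans cp
cross apply sys.dm_exec_sql_text(cp.plan_handle) st
where st.text like '%from test%'
  and st.text not like '%dm_exec_cached_plans%';  -- exclude this query itself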
Second ask:
Passing a local variable instead, let's use the same value of 4:
--let's pass 4, which we know has 44 rows; the estimates were 44 when we used literals
declare @id int
set @id = 4

select * from test
where schema_id = @id
As you can see in the screenshot below, using a local variable the estimate is roughly 29.5 rows, which comes from the statistics (the mean frequency) rather than the specific value.
output: (screenshot: estimated number of rows ≈ 29.5)
So, in summary, statistics are crucial in choosing a query plan (nested loops vs. a scan or a seek), and from the examples you can see how the estimates differ for each method.
Further, from a plan cache bloat perspective, you might also wonder what happens if you pass many ad-hoc queries, since SQL Server generates a new plan for the same query even if there is only a change in whitespace. The links below cover this.
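One common mitigation for single-use ad-hoc plan bloat is the server-wide 'optimize for ad hoc workloads' option; a sketch (evaluate carefully before enabling in production):

exec sp_configure 'show advanced options', 1;
reconfigure;
exec sp_configure 'optimize for ad hoc workloads', 1;  -- cache only a small stub until a plan is reused
reconfigure;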
Further readings:
http://www.sqlskills.com/blogs/kimberly/plan-cache-adhoc-workloads-and-clearing-the-single-use-plan-cache-bloat/
http://sqlperformance.com/2012/11/t-sql-queries/ten-common-threats-to-execution-plan-quality
First, note that a local variable is not the same as a parameter.
Assuming the column is indexed or has statistics, SQL Server uses the statistics histogram to estimate the qualifying row count based on the constant value supplied. The query will also be auto-parameterized and cached if it is trivial (yields the same plan regardless of values), so that subsequent executions avoid query compilation costs.
A parameterized query also generates a plan using the stats histogram with the initially supplied parameter value. The plan is cached and reused for subsequent executions regardless of whether or not it is trivial.
With a local variable, SQL Server uses the overall statistics cardinality to generate the plan because the actual value is unknown at compile time. This plan may be good for some values but suboptimal for others when the query is not trivial.

Does Oracle chose a default execution plan when parsing a prepared statement?

According to this Oracle documentation, I can assume that the Optimizer postpones the hard parse and it doesn't generate an execution plan until the first time a prepared statement is executed:
"The answer is a phenomenon called bind peeking. Earlier, when you ran that query with the bind variable value set to 'NY', the optimizer had to do a hard parse for the first time and while doing so it peeked at the bind variable to see what value had been assigned to it."
But when executing an EXPLAIN PLAN for a prepared statement with bind parameters, we get an execution plan. On his site, Markus Winand says that:
"When using bind parameters, the optimizer has no concrete values available to determine their frequency. It then just assumes an equal distribution and always gets the same row count estimates and cost values. In the end, it will always select the same execution plan."
Which one is true? Does an execution plan get generated when the statement is prepared, using an even-distribution value model, or is the hard parsing postponed until the first execution?
This discussion misses a very important point about bind variables, parsing and bind peeking, and that is histograms! Bind variables only become an issue when the column in question has a histogram. Without a histogram there is no need to peek at the value: Oracle then has no information about the distribution of the data, and will only use pure math (distinct values, number of null values, number of rows etc.) to find the selectivity of the filter in question.
Binds and histograms are logical opposites. You use bind variables to get one execution plan for all your queries. You use histograms to get different execution plans for different search values. Bind peeking tried to overcome this issue, but it does not do a very good job of it. Many people have actually characterized the bind peeking feature as "a bug". Adaptive Cursor Sharing, which came around in Oracle 11g, does a better job of solving this.
Actually, I see too many histograms around. I usually disable histograms (method_opt => 'for all columns size 1') and only create them when I truly need them.
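A sketch of what that looks like with DBMS_STATS (the table name is hypothetical):

-- gather statistics with no histograms ('size 1' = a single bucket per column)
exec dbms_stats.gather_table_stats(
       ownname    => user,
       tabname    => 'SOME_TABLE',
       method_opt => 'for all columns size 1');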
And then to the original question: "Does Oracle choose a default execution plan when parsing a prepared statement?"
Parsing is not one activity. Parsing involves syntax checking, semantic analysis (do the tables and columns exist, do you have access to the tables), query rewrite (Oracle might rewrite the query in a better way - for instance, if we use the filters a=b and b=c, then Oracle can add the filter a=c), and of course finding an execution plan. We actually distinguish between different types of parsing - soft parse and hard parse. A hard parse is where Oracle also has to create the execution plan for the query. This is a very costly activity.
Back to the question. The parsing doesn't really care whether you are using bind variables or not. The difference is that if you use binds, you probably only have to do a soft parse. Using bind variables, your query will look the same every time you run it (therefore getting the same hash_value). When you run a query, Oracle will check (in the library cache) to see if there already is an execution plan for your query. This is not a default plan, but a plan that already exists because someone else has executed the same query (and made Oracle do a hard parse, generating an execution plan for it) and the execution plan hasn't aged out of the cache yet. It's just the plan the optimizer considered the best choice for your query at parse time.
When you come to Oracle 12c it actually gets even more complicated. In 12c Oracle has adaptive execution plans - this means that the execution plan has an alternative. It can start out with a nested loop, and if it realizes that it got the cardinality estimates wrong, it can switch to a hash join in the middle of the execution of the query. It also has something called adaptive statistics and SQL plan directives. All to make the optimizer and Oracle make better choices when running your SQLs :-)
The first bind peek actually happens at the first execution; the plan optimization is deferred, it doesn't happen at the prepare phase. Later on another bind peek might happen. Typically for VARCHAR2, when you bind two radically different values (e.g. the first value is 1 byte long and a later one is 10 bytes), the optimizer peeks again and might produce a new plan. In Oracle 12c this is extended even more with adaptive join methods: the optimizer suggests NESTED LOOPS, but when many more rows than estimated actually arrive during execution, it switches to a HASH join immediately. That is unlike adaptive cursor sharing, where you need to make a mistake first to produce a new execution plan.
Also, one very important thing about prepared statements: since they just re-execute the same cursor that was created with the first execution, they will always execute the same plan and there cannot be any adaptation. For adaptation and alternative execution plans at least a soft parse must occur, which only happens if the statement is parsed again - for example because the plan has aged out of the shared pool or been invalidated for some reason.
EXPLAIN PLAN is not a cursor, so it will never respect bind variables. It is only DBMS_XPLAN.DISPLAY_CURSOR that can show you bind variable information.
You can find actual information about captured bind values in V$SQL_BIND_CAPTURE.
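For example (a sketch; bind in the SQL_ID of the statement you are interested in):

select name, position, datatype_string, value_string, last_captured
from   v$sql_bind_capture
where  sql_id = :sql_id;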
According to Tom Kyte bind peeking takes place at the hard-parse stage, which chimes with the first quote in your post. In 11g the optimizer is even able to come up with different plans for different bind ranges, which directly contradicts the second quote (although to be fair he is talking about bind variables and not peeking specifically).
The query in the application uses bind values that drive it to one plan or the other consistently. It is only when the plan flip-flops between two radically different execution paths, and for some segment of users, that you have a really bad plan. In such cases, Oracle Database 11g might be the right answer for you, because it accommodates multiple plans.
In general, Oracle behavior starting from 11g is best described by adaptive cursor sharing (see http://docs.oracle.com/database/121/TGSQL/tgsql_cursor.htm#BGBJGDJE)
For JDBC (Thin Driver) specifically: When using PreparedStatements, no plan is generated before the execution step.
See the following example:
String metrics[] = new String[OracleConnection.END_TO_END_STATE_INDEX_MAX];
metrics[OracleConnection.END_TO_END_MODULE_INDEX] = "adaptiveCSTest";
((OracleConnection) conn).setEndToEndMetrics(metrics, (short) 0);
String getObjectNames = "select object_name from key.objects where object_type=?";
PreparedStatement objectNamesStmt = conn.prepareStatement(getObjectNames);
// module set, but statement not parsed
objectNamesStmt.setString(1, "CLUSTER");
// same state
ResultSet rset1 = objectNamesStmt.executeQuery();
// statement parsed and executed

Do SQL bind parameters affect performance?

Suppose I have a table called Projects with a column called Budget with a standard B-Tree index. The table has 50,000 projects, and only 1% of them have a Budget of over one million. If I ran the SQL Query:
SELECT * From Projects WHERE Budget > 1000000;
The planner will use an index range scan on Budget to get the rows off the heap table. However, if I use the query:
SELECT * From Projects WHERE Budget > 50;
The planner will most likely do a sequential scan on the table, as it will know this query will end up returning most or all rows anyway and there's no reason to load all the pages of the index into memory.
Now, let's say I run the query:
SELECT * From Projects WHERE Budget > :budget;
Where :budget is a bind parameter passed into my database. From what I've read, the query as above will be cached, and no data on cardinality can be inferred. In fact, most databases will just assume an even distribution and the cached query plan will reflect that. This surprised me, as usually when you read about the benefits of bind parameters it's on the subject of preventing SQL injection attacks.
Obviously, this could improve performance if the resulting query plan would be the same, as a new plan wouldn't have to be compiled, but could also hurt performance if the values of :budget greatly varied.
My Question: Why are bind parameters not resolved before the query plan is generated and cached? Shouldn't modern databases strive to generate the best plan for the query, which should mean looking at the value for each parameter and getting accurate index stats?
Note: This question probably doesn't apply to MySQL, as MySQL doesn't cache SQL plans. However, I'm interested in why this is the case on Postgres, Oracle and MS SQL.
For Oracle specifically, it depends.
For quite some time (at least 9i), Oracle has supported bind variable peeking. That means that the first time a query is executed, the optimizer peeks at the value of the bind variable and bases its cardinality estimates on the value of that first bind variable. That makes sense in cases where most of the executions of a query are going to have bind variable values that return similarly sized results. If 99% of the queries are using small budget values, it is highly likely that the first execution will use a small value and thus the cached query plan will be appropriate for small bind variable values. Of course, that means that when you do specify a large bind variable value (or, worse, if you get lucky and the first execution is with a large value) you'll get less than optimal query plans.
If you are using 11g, Oracle can use adaptive cursor sharing. This allows the optimizer to maintain multiple query plans for a single query and to pick the appropriate plan based on the bind variable values. That can get rather complicated over time, though. If you have a query with N bind variables, the optimizer has to figure out how to partition that N-dimensional space into different query plans for different bind variable values in order to figure out when and whether to re-optimize a query for a new set of bind variable values and when to simply reuse an earlier plan. A lot of that work ends up being done at night during the nightly maintenance window in order to avoid incurring those costs during the productive day. But that also brings up issues about how much freedom the DBA wants to give the database to evolve plans over time vs how much the DBA wants to control plans so that the database doesn't suddenly start picking a poor plan that causes some major system to slow to a crawl on a random day.
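If you want to see adaptive cursor sharing at work, a sketch against V$SQL using the question's Projects query as the filter (the bind-awareness columns exist in 11g and later):

select sql_id, child_number, is_bind_sensitive, is_bind_aware, executions
from   v$sql
where  sql_text like 'SELECT * From Projects WHERE Budget%';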
This surprised me, as usually when you read about the benefits of bind parameters it's on the subject of preventing SQL injection attacks.
Don't confuse parameterized queries with prepared statements. Both offer parameterization, but prepared statements offer the additional caching of the query plan.
Why are bind parameters not resolved before the query plan is generated and cached?
Because sometimes generating the query plan is an expensive step. Prepared statements allow you to amortize the cost of query planning.
However, if all you're looking for is SQL injection protection, don't use prepared statements. Use parameterized queries.
For example, in PHP, you can use http://php.net/pg_query_params to execute a parameterized query WITHOUT caching the query plan; meanwhile http://php.net/pg_prepare and http://php.net/pg_execute are used to cache a plan for a prepared statement and later execute it.
Edit: 9.2 apparently changes the way prepared statements are planned
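The same distinction can be seen in plain SQL on the PostgreSQL side; a sketch using the question's Projects example (numeric is assumed for the Budget column). PREPARE/EXECUTE corresponds to the prepared-statement path, while a one-off parameterized query corresponds to pg_query_params:

-- prepared statement: parsed once, then executed repeatedly with different values
PREPARE projects_by_budget (numeric) AS
    SELECT * FROM Projects WHERE Budget > $1;

EXECUTE projects_by_budget(1000000);
EXECUTE projects_by_budget(50);

DEALLOCATE projects_by_budget;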

Is a dynamic sql stored procedure a bad thing for lots of records?

I have a table with almost 800,000 records and I am currently using dynamic sql to generate the query on the back end. The front end is a search page which takes about 20 parameters and, depending on whether a parameter was chosen, adds an " AND ..." to the base query. I'm curious whether dynamic sql is the right way to go (it doesn't seem like it, because it runs slowly). I am contemplating just creating a denormalized table with all my data. Is this a good idea, or should I build the query all at once instead of building it piece by piece using dynamic sql? Lastly, is there a way to speed up dynamic sql?
It is more likely that your indexing (or lack thereof) is causing the slowness than the dynamic SQL.
What does the execution plan look like? Is the same query slow when executed in SSMS? What about when it's in a stored procedure?
If your table is an unindexed heap, it will perform poorly as the number of records grows, regardless of the query. A dynamic query can actually perform better as the nature of the table changes, because a dynamic query is more likely to have its query plan re-evaluated when it's not in the cache. This is not normally an issue (and I would not classify it as a design advantage of dynamic queries) except in the early stages of a system, when SPs have not been recompiled, statistics and query plans are out of date, but the volume of data has just drastically changed.
Not the static one yet. I have with the dynamic query, but it does not give any optimizations. If I ran it with the static query and it gave suggestions, would applying them affect the dynamic query? – Xaisoft
Yes, the dynamic query (EXEC (@sql)) is probably not going to be analyzed unless you analyzed a workload file. – Cade Roux
When you have a search query across multiple tables that are joined, the columns with indexes need to be the search columns as well as the primary key/foreign key columns - but it depends on the cardinality of the various tables. The tuning analyzer should show this. – Cade Roux
I'd just like to point out that if you use this style of optional parameters:
AND (@EarliestDate is Null OR PublishedDate < @EarliestDate)
The query optimizer will have no idea whether the parameter is there or not when it produces the query plan. I have seen cases where the optimizer makes bad choices in this situation. A better solution is to build SQL that uses only the parameters you need; the optimizer will then make the most efficient execution plan. Be sure to use parameterized queries so that the plans are reusable in the plan cache.
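A minimal sketch of that approach, reusing the Articles example from the other answers (SomeColumn is illustrative, and @AuthorId/@EarliestDate are assumed to be parameters in scope):

declare @sql nvarchar(max) = N'select SomeColumn from Articles where 1 = 1';

if @AuthorId is not null
    set @sql += N' and AuthorId = @AuthorId';
if @EarliestDate is not null
    set @sql += N' and PublishedDate < @EarliestDate';

-- still parameterized, so the plan for each distinct SQL string can be reused
exec sp_executesql @sql,
     N'@AuthorId int, @EarliestDate datetime',
     @AuthorId = @AuthorId, @EarliestDate = @EarliestDate;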
As in the previous answer, check your indexes and plan.
The question is whether you are using a stored procedure. It's not obvious from the way you worded it. A stored procedure creates a query plan when run, and keeps that plan until it is recompiled. With varying SQL, you may be stuck with a bad query plan. You could do several things:
1) Add WITH RECOMPILE to the SP definition, which will cause a new plan to be generated with every execution (see the sketch after this list). This incurs some overhead, which may be acceptable.
2) Use separate SP's, depending on the parameters provided. This will allow better query plan caching
3) Use client generated SQL. This will create a query plan each time. If you use parameterized queries, this may allow you to use cached query plans.
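A sketch of option 1, reusing the Articles example from the other answers (dbo.SearchArticles is a hypothetical name):

CREATE PROCEDURE dbo.SearchArticles ( @AuthorId int, @EarliestDate datetime = NULL )
WITH RECOMPILE  -- compile a fresh plan on every execution
AS
SELECT SomeColumn
FROM Articles
WHERE AuthorId = @AuthorId
  AND (@EarliestDate IS NULL OR PublishedDate < @EarliestDate);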
The only difference between "dynamic" and "static" SQL is the parsing/optimization phase. Once those are done, the query will run identically.
For simple queries, this parsing phase plus the network traffic turns out to be a significant percentage of the total transaction time, so it's good practice to try and reduce these times.
But for large, complicated queries, this processing is overall insignificant compared to the actual path chosen by the optimizer.
I would focus on optimizing the query itself, including perhaps denormalization if you feel that it's appropriate, though I wouldn't do that on a first go around myself.
Sometimes the denormalization can be done at "run time" in the application using cached lookup tables, for example, rather than maintaining this on the database side.
I'm not a fan of dynamic SQL, but if you are stuck with it, you should probably read this article:
http://www.sommarskog.se/dynamic_sql.html
He really goes in depth on the best ways to use dynamic SQL and the issues using it can create.
As others have said, indexing is the most likely culprit. In indexing, one thing people often forget to do is put an index on the FK fields. Since a PK creates an index automatically, many assume an FK will as well. Unfortunately, creating an FK does not create an index. So make sure that any fields you join on are indexed.
There may be better ways to create your dynamic SQL, but without seeing the code it is hard to say. I would at least look to see if it is using subqueries and replace them with derived table joins instead. Also, any dynamic SQL that uses a cursor is bound to be slow.
If the parameters are optional, a trick that's often used is to create a procedure like this:
CREATE PROCEDURE GetArticlesByAuthor (
    @AuthorId int,
    @EarliestDate datetime = Null )
AS
SELECT * --not in production code!
FROM Articles
WHERE AuthorId = @AuthorId
    AND (@EarliestDate is Null OR PublishedDate < @EarliestDate)
There are some good examples of queries with optional search criteria here: How do I create a stored procedure that will optionally search columns?
As noted, if you are doing a massive query, Indexes are the first bottleneck to look at. Make sure that heavily queried columns are indexed. Also, make sure that your query checks all indexed parameters before it checks un-indexed parameters. This makes sure that the results are filtered down using indexes first and then does the slow linear search only if it has to. So if col2 is indexed but col1 is not, it should look as follows:
WHERE col2 = @col2 AND col1 = @col1
You may be tempted to go overboard with indexes as well, but keep in mind that too many indexes can cause slow writes and massive disk usage, so don't go too too crazy.
I avoid dynamic queries if I can for two reasons. One, they do not save the query plan, so the statement gets compiled each time. The other is that they are hard to manipulate, test, and troubleshoot. (They just look ugly).
I like Dave Kemp's answer above.
I've had some success (in a limited number of instances) with the following logic:
CREATE PROCEDURE GetArticlesByAuthor (
    @AuthorId int,
    @EarliestDate datetime = Null
) AS
SELECT SomeColumn
FROM Articles
WHERE AuthorId = @AuthorId
    AND @EarliestDate is Null

UNION

SELECT SomeColumn
FROM Articles
WHERE AuthorId = @AuthorId
    AND PublishedDate < @EarliestDate
If you are trying to optimize to below the 1s range, it may be important to gauge approximately how long it takes to parse and compile the dynamic sql relative to the actual query execution time:
SET STATISTICS TIME ON;
and then execute the dynamic SQL string "statically" and check the "Messages" tab. I was surprised by these results for a ~10 line dynamic sql query that returns two rows from a 1M row table:
SQL Server parse and compile time:
CPU time = 199 ms, elapsed time = 199 ms.
(2 row(s) affected)
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 4 ms.
Index optimization will doubtfully move the 199 ms barrier much (except perhaps due to some analysis/optimization included within the compile time).
However, if the dynamic SQL uses parameters or is repeated, then the compile results may be cached (see Caching Query Plans), which would eliminate the compile time. It would be interesting to know how long cache entries live, their size, whether they are shared between sessions, etc.