Calling a function from within a SELECT statement

I have the following statement:
SELECT CASE WHEN (1 = 1) THEN 10 ELSE dbo.at_Test_Function(5) END AS Result
I just want to confirm that in this case the function won't be executed?
My reason for asking is that the function is particularly slow, and if the criterion is true I want to avoid calling the function...
Cheers
Anthony

Your assumption is correct - it won't be executed. I understand your concern, but the CASE construct is "smart" in that way - it doesn't evaluate any conditions after the first one that is true. Here's an example to prove it. If both branches of this CASE expression were to execute, you would get a "divide by zero" error:
SELECT CASE
WHEN 1=1 THEN 1
WHEN 2=2 THEN 1/0
END AS ProofOfConcept
Does this make sense?

Do not make this assumption, it is WRONG. The Query Optimizer is completely free to choose the evaluation order it pleases and SQL as a language does NOT offer operator short-circuit. Even if you may find in testing that the function is never evaluated, in production you may hit every now and then conditions that cause the server to choose a different execution plan and first evaluate the function, then the rest of the expression. A typical example would be when the server notices that the function return is deterministic and not depending on the row data, in which case it would first evaluate the function to get the value, and after that start scanning the table and evaluate the WHERE inclusion criteria using the function value determined beforehand.
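If the goal is simply to make sure the slow function is not called for a single scalar result, one option is to move the decision out of the expression and into procedural control flow. The following is a minimal T-SQL sketch, not a definitive fix: @Criteria is a hypothetical stand-in for the real condition, and it only helps when the choice is made once per statement rather than once per row.

```sql
-- Sketch only: @Criteria stands in for whatever the real condition is.
DECLARE @Criteria BIT;
DECLARE @Result INT;
SET @Criteria = 1;

IF @Criteria = 1
    SET @Result = 10;                        -- cheap branch, the function is not called
ELSE
    SET @Result = dbo.at_Test_Function(5);   -- slow function runs only on this branch

SELECT @Result AS Result;
```

Both branches are still compiled, but only the statement in the branch that is taken gets executed.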

Assuming you are doing some kind of testing... If you are trying to avoid the at_Test_Function, why not just comment it out and do
SELECT 10 AS Result

Put a WAITFOR DELAY '00:00:05' in the function. If the statement returns immediately, it didn't execute; if it takes 5 seconds to return, then it was executed.

Related

Confused on what to use, If Then or Case Else Statement

I am trying to determine if I should use a CASE Statement or an IF THEN statement to get my results.
I want a SQL statement to run when a certain condition exists, but am not certain on how to check for the condition. Here is what I am working on
IF EXISTS(SELECT source FROM map WHERE rev_num =(SELECT MAX(rev_num) from MAP <-- at this point it would return either an A or B -->
Whatever the answer is, I then need to run a set of SQL statements. So for A it would run one set of statements, and for B it would run another.
CASE is used within a SQL statement. IF/THEN can be used to choose which query to execute.
Based on your somewhat vague example it seems like you want to execute different queries based on some condition. In that case, an IF/THEN seems more appropriate.
If, however, the majority of each query is identical and you're just changing part of the query then you may be able to use CASE to reduce the amount of duplicate code.
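As a rough illustration of the IF approach, here is a sketch that assumes T-SQL and assumes that map.source holds either 'A' or 'B'. The PRINT statements are placeholders for the real sets of statements.

```sql
-- Sketch only: assumes T-SQL and that map.source holds 'A' or 'B'.
DECLARE @source CHAR(1);

-- Fetch the source value of the row with the highest revision number.
SELECT @source = source
FROM map
WHERE rev_num = (SELECT MAX(rev_num) FROM map);

IF @source = 'A'
BEGIN
    PRINT 'run the statements for A here';
END
ELSE IF @source = 'B'
BEGIN
    PRINT 'run the statements for B here';
END
```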

Performance of OR? [duplicate]

Possible Duplicate:
SQL Server - Query Short-Circuiting?
Is the SQL WHERE clause short-circuit evaluated?
I have a question regarding performance of logical OR operators in T-SQL (SQL Server 2005).
I have searched around a little but I couldn't find anything on the subject.
If you have the following query:
SELECT * FROM Table WHERE (randomboolean OR HeavyToEvaluateCondition)
Wouldn't the procedure interpreter go as far as the randomboolean and skip evaluation of the heavy condition in order to save performance given that the first condition is true?
Since one of the values in an OR statement is true it would be unnecessary to evaluate the second condition since we already know that the first condition is met!
I know it works like this in C# but I want to know if I can count on it in T-SQL too.
You can't count on short circuit evaluation in TSQL.
The optimiser is free to evaluate the conditions in whichever order it sees fit and may in some circumstances evaluate both parts of an expression even when the second evaluation cannot change the result of the expression (Example).
That is not to say it never does short circuit evaluation however. You may well get a start up predicate on the expensive condition so it is only executed when required.
Additionally, the presence of the OR in your query can convert a sargable search condition into an unsargable one, meaning that indexes are not used optimally. This is especially true in SQL Server 2005 (in 2008, OPTION (RECOMPILE) can help here).
For example compare the plans for the following. The version with OR ends up doing a full index scan rather than an index seek to the specific values.
DECLARE @number INT;
SET @number = 0;
SELECT COUNT(*)
FROM master..spt_values
WHERE @number IS NULL OR number = 0;
SELECT COUNT(*)
FROM master..spt_values
WHERE number = 0;
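For comparison, here is a sketch of the first query with the OPTION (RECOMPILE) hint mentioned above. It assumes SQL Server 2008 or later; whether the plan really changes from a scan to a seek depends on the exact version and patch level, so treat it as something to check in the execution plan rather than a guarantee.

```sql
DECLARE @number INT;
SET @number = 0;

-- With the hint, the statement is compiled for the current value of @number,
-- which gives the optimiser the chance to simplify the "@number IS NULL" branch away.
SELECT COUNT(*)
FROM master..spt_values
WHERE @number IS NULL OR number = 0
OPTION (RECOMPILE);
```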
It's called short-circuiting. And yes, SQL Server does do it in certain cases. In what order depends on many factors and forms part of the execution plan optimisation.
However, there are details online suggesting that this is limited to JOIN conditions, CASE statements, etc.
See this SO post... SQL Server - Query Short-Circuiting?
First the WHERE condition is evaluated, and the OR operator is evaluated when control reaches the first condition. If the first condition is true, the second condition is not checked. If you supply 100 conditions and the first one is false, the next condition is checked, and so on.

How to cope with null results in SQL Tasks that return single rows in SSIS 2005?

In a dataflow task, I can slip a rowcount into the processing flow and place the count into a variable. I can later use that variable to conditionally perform some other work if the rowcount was > 0. This works well for me, but I have no corresponding strategy for sql tasks expected to return a single row. In that event, I'm returning those values into variables. If the lookup produces no rows, the sql task fails when assigning values into those variables. I can branch on that component failing, but there's a side effect of that - if I'm running the job as a SQL server agent job step, the step returns DTSER_FAILURE, causing the step to fail. I can tell the sql agent to disregard the step failure, but then I won't know if I have a legitimate error in that step. This seems harder than it should be.
The only strategy I can think of is to run the same query with a count(*) aggregate and test if that returns a number > 0 and if so running the query again without the count. That's ugly because I have the same query in two places that I need to keep in sync.
Is there a better way?
In that same condition you can have additional logic (&& or ||). I would take one of the variables for your single statement and say something to the effect:
If @User::rowcount > 0 || @User::single_record_var != Default
That should help.
What kind of SQL statement? Can you change it to still return a single row with all NULLs instead of no rows?
What stops it from returning more than one row? The package would fail if it ended up returning more than one row, right?
You could also change it to call a stored procedure, and then call the stored procedure in two places without code duplication. You could also change it to be a view or user-defined function (if parameters are needed), SELECT COUNT(*) FROM udf() to check if there is data, SELECT * FROM udf() to get the row.
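One way to make a lookup always return exactly one row is to wrap the selected columns in aggregates. The sketch below is only an illustration: dbo.Customer, CustomerId, Name and Email are made-up names, and the empty-string defaults are an assumption made because some SSIS variable types may not accept NULL.

```sql
-- Sketch only: dbo.Customer, CustomerId, Name and Email are made-up names.
DECLARE @CustomerId INT;
SET @CustomerId = 42;

-- An aggregate query with no GROUP BY always returns exactly one row,
-- so a lookup that matches nothing yields defaults and a count of 0.
SELECT ISNULL(MAX(Name), '')  AS Name,
       ISNULL(MAX(Email), '') AS Email,
       COUNT(*)               AS RowsFound
FROM dbo.Customer
WHERE CustomerId = @CustomerId;
```

The RowsFound column can then be mapped to a variable and tested in a precedence constraint, much like the rowcount in a data flow. If more than one row matches, the values come back mixed from different rows, so this only suits lookups that are unique by design.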

Why does NVL always evaluate 2nd parameter

Does anyone know, why Oracle's NVL (and NVL2) function always evaluate the second parameter, even if the first parameter is not NULL?
Simple test:
CREATE FUNCTION nvl_test RETURN NUMBER AS
BEGIN
dbms_output.put_line('Called');
RETURN 1;
END nvl_test;
SELECT NVL( 0, nvl_test ) FROM dual
returns 0, but also prints Called.
nvl_test has been called, even though the result is ignored since first parameter is not NULL.
It's always been that way, so Oracle has to keep it that way to remain backwards compatible.
Use COALESCE instead to get the short-circuit behaviour.
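A quick way to compare the three forms, assuming the nvl_test function from the question has been created and server output is enabled in the client (for example with SET SERVEROUTPUT ON in SQL*Plus):

```sql
-- Assumes the nvl_test function from the question already exists.
-- NVL evaluates both arguments, so "Called" is printed:
SELECT NVL(0, nvl_test) FROM dual;

-- COALESCE stops at the first non-NULL argument, so nvl_test is not expected to run:
SELECT COALESCE(0, nvl_test) FROM dual;

-- The equivalent CASE expression is expected to behave the same way:
SELECT CASE WHEN 0 IS NOT NULL THEN 0 ELSE nvl_test END FROM dual;
```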
Here is a post where Tom Kyte confirms that decode and case short circuit but not nvl but he doesn't give justification or documentation for why. Just states it to be:
http://asktom.oracle.com/pls/asktom/f?p=100:11:0::::P11_QUESTION_ID:926029357278#14932880517348
So in your case you should use decode or case instead of nvl if an expensive function will be called in your query.
In general, it would make sense that the second parameter is evaluated before calling the function, because in general that is how functions are called: all arguments to the function are evaluated and the evaluated values are sent to the function.
However, in the case of a very common system function like NVL, I would have thought PL/SQL could optimise, treating the function call as a special case. But perhaps that is more difficult than it sounds (to me), as I'm sure this optimisation would have occurred to the developers of Oracle.
They are obviously not short-circuiting, but I can't find any references in Oracle documentation.
Check out this discussion: http://forums.oracle.com/forums/thread.jspa?messageID=3478040

Is the SQL WHERE clause short-circuit evaluated?

Are boolean expressions in SQL WHERE clauses short-circuit evaluated?
For example:
SELECT *
FROM Table t
WHERE @key IS NULL OR (@key IS NOT NULL AND @key = t.Key)
If @key IS NULL evaluates to true, is @key IS NOT NULL AND @key = t.Key evaluated?
If no, why not?
If yes, is it guaranteed? Is it part of ANSI SQL or is it database specific?
If database specific, SQLServer? Oracle? MySQL?
ANSI SQL Draft 2003 5WD-01-Framework-2003-09.pdf
6.3.3.3 Rule evaluation order
[...]
Where the precedence is not determined by the Formats or by
parentheses, effective evaluation of expressions is generally
performed from left to right. However, it is
implementation-dependent whether expressions are actually evaluated left to right, particularly when operands or operators might
cause conditions to be raised or if the results of the expressions
can be determined without completely evaluating all parts of the
expression.
From the above, short circuiting is not really available.
If you need it, I suggest a Case statement:
Where Case when Expr1 then Expr2 else Expr3 end = desiredResult
Expr1 is always evaluated, but only one of Expr2 and Expr3 will be evaluated per row.
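Applied to the optional-parameter filter from the question, the CASE pattern might look like the sketch below. It assumes T-SQL; dbo.MyTable and its [Key] column are placeholder names. Two caveats: SQL Server documents sequential evaluation of WHEN branches but with exceptions (for example when aggregates are involved), and wrapping the column in a CASE usually rules out an index seek on it.

```sql
-- Sketch only: dbo.MyTable and its [Key] column are placeholder names.
DECLARE @key INT;
SET @key = NULL;

SELECT *
FROM dbo.MyTable AS t
WHERE CASE
          WHEN @key IS NULL THEN 1     -- no filter requested: keep every row
          WHEN t.[Key] = @key THEN 1   -- only reached when @key is not NULL
          ELSE 0
      END = 1;
```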
I think this is one of the cases where I'd write it as if it didn't short-circuit, for three reasons.
Because for MSSQL, it's not resolved by looking at BOL in the obvious place, so for me, that makes it canonically ambiguous.
Because at least then I know my code will work. And more importantly, so will those who come after me, so I'm not setting them up to worry through the same question over and over again.
I write often enough for several DBMS products, and I don't want to have to remember the differences if I can work around them easily.
I don't believe that short circuiting in SQL Server (2005) is guaranteed. SQL Server runs your query through its optimization algorithm that takes into account a lot of things (indexes, statistics, table size, resources, etc) to come up with an effective execution plan. After this evaluation, you can't say for sure that your short circuit logic is guaranteed.
I ran into the same question myself sometime ago and my research really did not give me a definitive answer. You may write a small query to give you a sense of proof that it works but can you be sure that as the load on your database increases, the tables grow to be bigger, and things get optimized and changed in the database, that conclusion will hold. I could not and therefore erred on the side of caution and used CASE in WHERE clause to ensure short circuit.
You have to keep in mind how databases work. Given a parameterized query, the db builds an execution plan based on that query without the values for the parameters. This plan is used every time the query is run, regardless of what the actual supplied values are. Whether the query short-circuits with certain values will not matter to the execution plan.
I typically use this for optional parameters. Is this the same as short circuiting?
SELECT [blah]
FROM Emp
WHERE ((@EmpID = -1) OR (@EmpID = EmpID))
This gives me the option to pass in -1 or whatever to account for optional checking of an attribute. Sometimes this involves joining on multiple tables, or preferably a view.
Very handy, not entirely sure of the extra work that it gives to the db engine.
Just stumbled over this question, and had already found this blog-entry: http://rusanu.com/2009/09/13/on-sql-server-boolean-operator-short-circuit/
SQL Server is free to optimize a query however it sees fit, so in the example given in the blog post, you cannot rely on short-circuiting.
However, a CASE is apparently documented to evaluate in the written order - check the comments of that blog post.
For SQL Server, I think it depends on the version but my experience with SQL Server 2000 is that it still evaluates @key = t.Key even when @key is null. In other words, it does not do efficient short circuiting when evaluating the WHERE clause.
I've seen people recommending a structure like your example as a way of doing a flexible query where the user can enter or not enter various criteria. My observation is that Key is still involved in the query plan when @key is null and if Key is indexed then it does not use the index efficiently.
This sort of flexible query with varying criteria is probably one case where dynamically created SQL is really the best way to go. If @key is null then you simply don't include it in the query at all.
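A minimal sketch of that dynamic SQL approach, assuming T-SQL with sp_executesql; dbo.MyTable and its [Key] column are placeholder names rather than anything from the question:

```sql
-- Sketch only: dbo.MyTable and its [Key] column are placeholder names.
DECLARE @key INT;
DECLARE @sql NVARCHAR(4000);
SET @key = NULL;

SET @sql = N'SELECT * FROM dbo.MyTable AS t';

-- Append the predicate only when a value was actually supplied.
IF @key IS NOT NULL
    SET @sql = @sql + N' WHERE t.[Key] = @key';

-- The value is still passed as a parameter, never concatenated into the string.
EXEC sp_executesql @sql, N'@key INT', @key = @key;
```

Each distinct statement text gets its own cached plan, so the version without the predicate never has to consider the Key column at all.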
The main characteristic of short-circuit evaluation is that it stops evaluating the expression as soon as the result can be determined. That means that the rest of the expression can be ignored, because the result will be the same regardless of whether it is evaluated or not.
Binary boolean operators are commutative, meaning:
a AND b == b AND a
a OR b == b OR a
a XOR b == b XOR a
so there is no guarantee on order of evaluation. Order of evaluation will be determined by query optimizer.
In languages with objects there can be situations where you can write boolean expressions that can be evaluated only with short circuit evaluation. Your sample code construction is often used in such languages (C#, Delphi, VB). For example:
if (someString == null | someString.Length == 0)
    Console.WriteLine("no text in someString");
This C# example will cause exception if someString == null because it will be fully evaluated. In short circuit evaluation, it will work every time.
SQL operates only on scalar variables (no objects) that cannot be uninitialized, so there is no way to write a boolean expression that cannot be evaluated. If you have some NULL value, any comparison with it yields UNKNOWN, which a WHERE clause treats the same as false.
That means that in SQL you cannot write an expression that is evaluated differently depending on whether short-circuit or full evaluation is used.
If a SQL implementation uses short-circuit evaluation, the most it can do is speed up query execution.
I don't know about short-circuiting, but I'd write it as an if-else statement:
if (@key is null)
begin
SELECT *
FROM Table t
end
else
begin
SELECT *
FROM Table t
WHERE t.Key=@key
end
Also, variables should always be on the right side of the equation. This makes it sargable.
http://en.wikipedia.org/wiki/Sargable
Below a quick and dirty test on SQL Server 2008 R2:
SELECT *
FROM table
WHERE 1=0
AND (function call to complex operation)
This returns immediately with no records. Kind of short circuit behavior was present.
Then tried this:
SELECT *
FROM table
WHERE (a field from table) < 0
AND (function call to complex operation)
knowing no record would satisfy this condition:
(a field from table) < 0
This took several seconds, indicating the short circuit behavior was not there any more and the complex operation was being evaluated for every record.
Hope this helps guys.
Here is a demo to prove that MySQL does perform WHERE clause short-circuiting:
http://rextester.com/GVE4880
This runs the following queries:
SELECT myint FROM mytable WHERE myint >= 3 OR myslowfunction('query #1', myint) = 1;
SELECT myint FROM mytable WHERE myslowfunction('query #2', myint) = 1 OR myint >= 3;
The only difference between these is the order of operands in the OR condition.
myslowfunction deliberately sleeps for a second and has the side effect of adding an entry to a log table each time it is run. Here are the results of what is logged when running the above two queries:
myslowfunction called for query #1 with value 1
myslowfunction called for query #1 with value 2
myslowfunction called for query #2 with value 1
myslowfunction called for query #2 with value 2
myslowfunction called for query #2 with value 3
myslowfunction called for query #2 with value 4
The above shows that a slow function is executed more times when it appears on the left side of an OR condition when the other operand isn't always true (due to short-circuiting).
This takes an extra 4 seconds in query analyzer, so from what I can see IF is not even shorted...
SET @ADate = NULL
IF (@ADate IS NOT NULL)
BEGIN
INSERT INTO #ABla VALUES (1)
(SELECT bla from a huge view)
END
It would be nice to have a guaranteed way!
The quick answer is: the "short-circuit" behavior is an undocumented implementation detail.
Here's an excellent article that explains this very topic.
Understanding T-SQL Expression Short-Circuiting
It is obvious that MS SQL Server supports short-circuit evaluation, which improves performance by avoiding unnecessary checks.
Supporting Example:
SELECT 'TEST'
WHERE 1 = 'A'
SELECT 'TEST'
WHERE 1 = 1 OR 1 = 'A'
Here, the first example results in the error 'Conversion failed when converting the varchar value 'A' to data type int.'
The second runs without error, because the condition 1 = 1 evaluates to TRUE and so the second condition is not run at all.
Furthermore:
SELECT 'TEST'
WHERE 1 = 0 OR 1 = 'A'
Here the first condition evaluates to FALSE, so the DBMS goes on to the second condition and you again get the conversion error from the example above.
NOTE: I wrote the erroneous condition only to find out whether the condition is executed or short-circuited.
If the query results in an error, the condition was executed; otherwise it was short-circuited.
SIMPLE EXPLANATION
Consider,
WHERE 1 = 1 OR 2 = 2
As the first condition evaluates to TRUE, it is meaningless to evaluate the second condition, because whatever value it produces would not affect the result at all. This is a good opportunity for SQL Server to save query execution time by skipping unnecessary condition checking or evaluation.
In the case of "OR", if the first condition evaluates to TRUE, the entire chain connected by "OR" is considered TRUE without evaluating the others.
condition1 OR condition2 OR ..... OR conditionN
If condition1 evaluates to TRUE, all the remaining conditions up to conditionN are skipped.
In general terms, once the first TRUE is determined, all other conditions linked by OR are skipped.
Consider the second condition
WHERE 1 = 0 AND 1 = 1
As the first condition evaluates to FALSE, it is meaningless to evaluate the second condition, because whatever value it produces would not affect the result at all. Again, this is a good opportunity for SQL Server to save query execution time by skipping unnecessary condition checking or evaluation.
In the case of "AND", if the first condition evaluates to FALSE, the entire chain connected by "AND" is considered FALSE without evaluating the others.
condition1 AND condition2 AND ..... AND conditionN
If condition1 evaluates to FALSE, all the remaining conditions up to conditionN are skipped.
In general terms, once the first FALSE is determined, all other conditions linked by AND are skipped.
Therefore, a wise programmer should always arrange the chain of conditions so that the least expensive or most eliminating condition is evaluated first, or otherwise arrange the conditions to take maximum benefit of short-circuiting.