SQL Server - Avoid calling function twice [closed] - sql

Closed 1 year ago as not reproducible or caused by a typo. It is not currently accepting answers.
In the SQL toy example below, SUM() is called twice. Is there any trick to perform the computation only once? The concern is that calling the function twice may be less efficient.
CREATE TABLE #Tmp(Id int);
INSERT INTO #Tmp VALUES (1), (2), (-10);
SELECT IIF(SUM(Id) < 0, 0, SUM(Id)) FROM #Tmp;
In Python, we have so-called Assignment Expressions:
if (list_length := len(some_list)) < 0:
    return 0
else:
    return list_length
In C#, we have the null-coalescing operator ??:
var x = callFunctionThatCouldReturnNull() ?? valueIfNull;
...which at least helps in the case of checking for null.
Do we have something similar in SQL? One can work around it by assigning to a variable, but that means more code to write...
-- Updated --
I need something similar in SQL because of:
Syntactic sugar: repeating code is tedious.
Potential performance issue: as noted in the comments, the SQL compiler may be intelligent enough to recognize that SUM() is called twice. But if IIF() contains a very complex or long query instead of a simple SUM(), is it always guaranteed that the compiler still detects that those snippets are identical, saves the result of the first evaluation, and reuses it?

There are some misconceptions that you have about SQL queries.
In general, the expensive part of the query is reading the data, not doing the sum. This is especially true on trivially small amounts of data.
Second, SQL queries describe the result set. SQL is a declarative language, not a procedural language. The optimizer is free to see that there are two SUM()s that can be calculated only once. I'm not so sure that SQL Server does this optimization.
There are some cases where calling functions can be expensive (I don't think SUM() on integers is one of those cases). If this is a concern, you can use a subquery:
SELECT (CASE WHEN sum_id < 0 THEN 0 ELSE sum_id END)
FROM (SELECT SUM(Id) as sum_id
FROM #Tmp t
) t;
Also note that I replaced the IIF() with the standard SQL CASE expression.
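Two other common spellings of the same idea, sketched against the same #Tmp table (the aliases agg, v, and doubled are names invented for this sketch):

```sql
-- A CTE names the aggregate once; the outer query reuses it:
WITH agg AS (
    SELECT SUM(Id) AS sum_id FROM #Tmp
)
SELECT IIF(sum_id < 0, 0, sum_id) AS total
FROM agg;

-- For row-level (non-aggregate) expressions, CROSS APPLY (VALUES ...)
-- names a computed value once so it can be reused in SELECT and WHERE:
SELECT t.Id, IIF(v.doubled < 0, 0, v.doubled) AS clamped
FROM #Tmp AS t
CROSS APPLY (VALUES (t.Id * 2)) AS v(doubled);
```

The CTE is logically identical to the subquery form; CROSS APPLY (VALUES ...) only helps for per-row expressions, since it is evaluated before aggregation.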

Related

Declaration of variables improve or decrease performance [closed]

Closed last month as needing details or clarity. It is not currently accepting answers.
I am trying to improve my SQL, and one topic that came up is whether declaring a variable helps performance or not. For instance, say you want to write an IF/ELSE but first need to know whether a count is higher than 0:
SELECT @COUNTER = COUNT(ID) FROM tblSome WHERE NUMBER > TOTAL
IF (@COUNTER > 0)
Or would it be better to write something like this?
IF (SELECT COUNT(ID) FROM tblSome WHERE NUMBER > TOTAL) > 0
I am just trying to minimize the time it takes, but it would also be nice to know.
For now I cannot really measure a difference with the small amounts of data I am using, and I am not sure how to test it further.
Use of variables can help or hinder performance dependent on the exact circumstances. There isn't a single rule.
In this case the use of the separate variable assignment step can be harmful.
It gives the optimiser no choice but to count all the rows, as it doesn't look ahead to how you use that variable in later statements.
Using IF (SELECT COUNT(*) ...) > 0 lets the optimiser see that the test succeeds as soon as the count reaches 1, so it can sometimes be optimised into an IF EXISTS (semi join) that stops reading after a single row.
But you are better off just writing it as EXISTS anyway rather than relying on that.
(The discussion on this internet archived blog post has more about the circumstances where this optimisation might happen)
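A sketch of the EXISTS form, reusing the table and column names from the question:

```sql
-- EXISTS lets the engine stop at the first matching row
-- instead of counting them all:
IF EXISTS (SELECT 1 FROM tblSome WHERE NUMBER > TOTAL)
    PRINT 'at least one matching row'
ELSE
    PRINT 'no matching rows'
```

Unlike the COUNT-into-a-variable version, this never forces a full count when one matching row is enough.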

Why am I getting "Cannot insert the value NULL into column 'X'" errors, on a hard-coded value? [closed]

Closed 7 years ago as not reproducible or caused by a typo. It is not currently accepting answers.
INSERT INTO DDA_MatchStrings
(Unit, FamilyID, AccountNumber, MatchType, MatchString, SnapshotDate, IgnoreRecord, Merged)
SELECT
Unit, FamilyID, AccountNumber, 'EMAIL', MatchString, @SnapshotDate, 0, 0
FROM DDA_MatchStrings_temp AS tms
WHERE tms.MatchString IN
(SELECT tms.MatchString FROM DDA_MatchStrings_temp AS tms
GROUP BY tms.MatchString
HAVING COUNT(DISTINCT unit) > 1)
ORDER BY tms.MatchString
I added indentation to the field references for visual observation.
This statement is in a stored procedure, but when the SP gets to this point, it is giving the following error:
Cannot insert the value NULL into column 'IgnoreRecord', table
'Db.dbo.TableX'; column does not allow nulls. INSERT
fails.
I know what this error means. I'm not new to T-SQL. However, I'm completely baffled why it's giving me this error. The value isn't null; it's a hardcoded 0. FYI, the field in question is a tinyint, so it should accept this value. Also, this statement appears in other stored procedures (with only a different value in the MatchType field).
What might cause this error?
UPDATE
I've chosen to answer my own question rather than delete it. This is a good case where not paying attention to db schemas caused problems. It's worth keeping as a reminder to always pay attention to your SQL object definitions and which schemas are in play when executing code. I hate being the one who made the mistake, but hopefully others will learn from my error.
Ok... I feel a bit stupid but I might as well answer this for posterity.
The issue is not with the insert statement, or my script... it relates to schemas. The definition of the SP is:
CREATE PROCEDURE dbo.DDA_Generate_PotentialDuplicate_Emails
Apparently, at some point I defined it as:
CREATE PROCEDURE DDA_Generate_PotentialDuplicate_Emails
This is the first time where I've worked with a corporate db where, by default, the associated schema was the user's login and not dbo.
So... when I changed my execution code from:
EXEC DDA_Generate_PotentialDuplicate_Emails
to
EXEC dbo.DDA_Generate_PotentialDuplicate_Emails
Everything started to work properly because I was running the proper version of the stored procedure. :P
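If you suspect the same procedure name exists under more than one schema, a quick check (a sketch using the standard catalog views) is:

```sql
-- List every procedure with this name, along with its schema;
-- more than one row means competing definitions exist:
SELECT s.name AS schema_name, p.name AS procedure_name, p.modify_date
FROM sys.procedures AS p
JOIN sys.schemas AS s ON s.schema_id = p.schema_id
WHERE p.name = 'DDA_Generate_PotentialDuplicate_Emails';
```

An unqualified EXEC resolves against the caller's default schema first, which is how the stale copy got picked up here.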

Why does comparing integers with Equal (=) take 800ms more than using GreaterThan (>) and LessThan (<)? [closed]

Closed 8 years ago as needing debugging details. It is not currently accepting answers.
Using LINQ and Entity Framework, we recently noticed while testing our query performance that comparing two integers with the Equal (=) operator takes around 800ms longer than using a combination of GreaterThan (>) and LessThan (<) operators.
So basically, replacing itemID == paramID (both being integers) with !(itemID > paramID || itemID < paramID) in our LINQ query consistently makes the query faster by about 800ms.
Anyone with a deep knowledge of SQL could explain this result to me?
If this was always faster SQL Server would do the rewrite for you. It does not so you can conclude that it is not always faster to do this. In fact it is a bad idea to do this rewrite in 99.999% of the cases.
The information given in the question (almost none) does not allow for further analysis.
Often people ask "why did this random change make my query faster". The answer is always: You accidentally triggered a better query plan. There is no system to it. Query plans can be unstable.
Psychic guess: the complex predicate forces a table scan (or makes one appear cheaper) instead of using an index. That can sometimes be a good thing.
The first step would be to examine the generated SQL. My guess is that itemID is nullable, and Entity Framework's default behaviour with nullable properties isn't the greatest: it will translate your query into something like prop = value OR prop IS NULL.
If that is the case and you are using EF6, you can override that behaviour with:
context.UseDatabaseNullSemantics = true;
Msdn
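Roughly, the shape of the difference between the two settings (a sketch; the exact SQL EF6 emits is more verbose than this):

```sql
-- Default (C# null semantics): extra null checks that can defeat an index seek
WHERE (itemID = @paramID) OR (itemID IS NULL AND @paramID IS NULL)

-- With context.UseDatabaseNullSemantics = true: a plain sargable predicate
WHERE itemID = @paramID
```

Comparing the two actual query plans would confirm whether this is the source of the 800ms gap.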

Performance of OR? [duplicate]

This question already has answers here:
Closed 11 years ago.
Possible Duplicate:
SQL Server - Query Short-Circuiting?
Is the SQL WHERE clause short-circuit evaluated?
I have a question regarding performance of logical OR operators in T-SQL (SQL Server 2005).
I have searched around a little but I couldn't find anything on the subject.
If you have the following query:
SELECT * FROM Table WHERE (randomboolean OR HeavyToEvaluateCondition)
Wouldn't the interpreter stop at randomboolean and skip evaluating the heavy condition, to save work, whenever the first condition is true?
Since one of the values in an OR statement is true it would be unnecessary to evaluate the second condition since we already know that the first condition is met!
I know it works like this in C# but I want to know if I can count on it in T-SQL too.
You can't count on short circuit evaluation in TSQL.
The optimiser is free to evaluate the conditions in whichever order it sees fit, and may in some circumstances evaluate both parts of an expression even when the second evaluation cannot change the result of the expression (example).
That is not to say it never does short circuit evaluation however. You may well get a start up predicate on the expensive condition so it is only executed when required.
Additionally the presence of the OR in your query can convert a sargable search condition into an unsargable one meaning that indexes are not used optimally. Especially in SQL Server 2005 (In 2008 OPTION (RECOMPILE) can help here).
For example compare the plans for the following. The version with OR ends up doing a full index scan rather than an index seek to the specific values.
DECLARE @number INT;
SET @number = 0;

SELECT COUNT(*)
FROM master..spt_values
WHERE @number IS NULL OR number = 0;

SELECT COUNT(*)
FROM master..spt_values
WHERE number = 0;
It's called short-circuiting, and yes, SQL Server does do it in certain cases. The order depends on many factors and forms part of the execution plan optimisation.
However, there are reports online that this is limited to JOIN conditions, CASE expressions, etc.
See this SO post: SQL Server - Query Short-Circuiting?
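When evaluation order actually matters, the usual workaround is a CASE expression, whose WHEN branches are documented to be evaluated in order (with some caveats around aggregates and scalar subqueries). A sketch using the question's column names, assumed here to be bit columns:

```sql
-- The heavy predicate is only reached when the cheap one is false:
SELECT *
FROM [Table]
WHERE 1 = CASE WHEN randomboolean = 1 THEN 1
               WHEN HeavyToEvaluateCondition = 1 THEN 1
               ELSE 0 END;
```

This trades a possibly sargable predicate for a guaranteed order, so check the plan before committing to it.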
The WHERE clause is evaluated condition by condition: if the first condition of an OR is true, the second condition is not checked. If you have 100 conditions and the first one is false, evaluation moves on to the next condition.

For what reason doesn't SQL allow a variable to be assigned a value using the syntax @i = 100;? [closed]

Closed 7 years ago as opinion-based. It is not currently accepting answers.
Unlike programming languages, SQL doesn't allow a variable to be assigned a value using the following syntax (instead we must use SET or SELECT):
@i = 100; -- error
Is there a particular reason why SQL doesn't allow this?
Thank you.
Why does any language use the syntax it uses?
To maintain conventions, prevent ambiguity, and simplify parsing.
Notice that the syntax of SET statements (properly called assignment statements) is similar to the syntax in the SET clause of an UPDATE statement:
UPDATE mytable SET somecolumn = 100;
I believe they did this deliberately just to annoy you.
You may as well say why can't I do this in c#?
SELECT * FROM MyArray[]
Why must I iterate and mess around with Console.WriteLine? It's just a set collection of data that I want to see in one go.
Those variables are part of Transact-SQL, not the SQL standard.
SQL was not designed to write procedural code, so its support for anything other than set operations is largely bolted on in proprietary extensions.
Why do you need a @ at the beginning of a variable name? To help parse the commands.
The SET or SELECT helps differentiate assignment from comparison; it is just the way T-SQL was designed.
Use:
SET @i = 100 -- for single assignments
and
SELECT @i = 100, @a = 200, @c = 300 -- for multiple assignments, which is
                                    -- faster than the equivalent multiple SET commands
The first reason that comes to mind is that SQL uses the '=' sign both for comparing values and for setting them, so it needs different syntax to distinguish assignment from comparison. Some other programming languages make that distinction by using '=' for assignment and '==' for comparison.
For example in SQL:
SET @a = 100 -- To set the value of @a.
IF (@a = 100) -- To compare the value of @a.
For example in C#:
a = 100; //To set the value of a.
if (a == 100) //To compare the value of a.
Well, SQL isn't a programming language.
I can't say that I know the reason. However, it likely has something to do with the intended use of SQL. Namely, working with data. Who works with a lot of data? Business people. The language, as such, uses more words and fewer symbols. Compare the style of VisualBasic to C++.
After a variable is declared, it is initialized to NULL. Use the SET statement to assign a value that is not NULL to a declared variable.
So says the T-SQL SET documentation. I don't remember having to use the SET keyword in Oracle's PL/SQL.