Make column value less than or equal to current date - sql

I need to make sure that the value of one of my columns, let's call it CreationDate, is less than or equal to the current date.
This failed when I tried to use:
CHECK ( CreationDate =< GETDATE())
For the reasons mentioned in this post:
CHECK constraint on date of birth?
I'm wondering how I would write a SQL Server trigger that checks whether the date I'm trying to insert/update is less than or equal to the current date.
I have never used triggers before. I came up with this, but I'm not sure whether it works or how it will be called.
I'm trying to make the trigger return an error so that the programmer is forced to check or fix this.
CREATE TRIGGER CheckValidDateTrigger
ON Reports
INSTEAD OF UPDATE
AS
DECLARE @ReportCloseDate DateTime;
IF (@ReportCloseDate > GETDATE())
BEGIN
    RAISERROR ('Error, the date you are trying to save cannot be later than the current date. The date must be in the past or be the current date.', -- Message text.
               16, -- Severity.
               1   -- State.
              );
END;
Can you guys help me out a little?

The documentation says any expression that evaluates to TRUE or FALSE:
CHECK constraints enforce domain integrity by limiting the values that
are accepted by one or more columns. You can create a CHECK constraint
with any logical (Boolean) expression that returns TRUE or FALSE based
on the logical operators.
There is not a restriction (as far as I've found) on non-deterministic functions (that is, on functions where the same call may return different values at different times).
The example that you point to is tagged Oracle. It also gives an alternative solution which is to add a column whose default value is getdate() and to check against that.
So, your check constraint as intended should work.
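For reference, a sketch of how that constraint would normally be written, using the table and column names from the question (note the comparison operator is `<=`; the `=<` in the question is not valid T-SQL and may have been the actual cause of the failure). The constraint name here is an assumption; any name will do:

```sql
-- Assumes the Reports table and CreationDate column from the question.
ALTER TABLE Reports
ADD CONSTRAINT CK_Reports_CreationDate CHECK (CreationDate <= GETDATE());
```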

Related

How to cache return value of a function for a single query

I want to use the getdate() function 3-4 times in a single query for a validation check, and I want every one of those calls to return the same datetime within a single query execution. Technically, computers are fast enough that 99.9% of the time I would get the same datetime at all the places in the query anyway, but in theory it could lead to a bug. So how can I cache the getdate() return value by calling it once and use that cached value throughout the query?
To add to that: I want to use such a statement in a check constraint, so I can't declare local variables or anything like that.
SQL Server has the concept of run-time constant functions. The best way to describe these is that the first thing the execution engine does is pull the function references out from the query plan and execute each such function once per query.
Note that the function references appear to be column-based. So different columns can have different values, but different rows should have the same value within a column.
The two most common functions in this category are getdate() and rand(). Ironically, I find that this is a good thing for getdate(), but a bad thing for rand() (what kind of random number generator always returns the same value?).
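A small demonstration of the per-row behavior described above (a sketch; the derived-table names are arbitrary): every row in the result shows the identical value in the getdate() column, because the function reference is pulled out and evaluated once per query, not once per row:

```sql
-- All three rows show the same captured value for the single
-- GETDATE() reference in this column.
SELECT x, GETDATE() AS captured
FROM (VALUES (1), (2), (3)) AS v(x);
```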
For some reason, I can't find the actual documentation on run-time constant functions. But here are some respected blog posts that explain the matter:
https://sqlperformance.com/2014/06/t-sql-queries/dirty-secrets-of-the-case-expression
http://sqlblog.com/blogs/andrew_kelly/archive/2008/03/01/when-a-function-is-indeed-a-constant.aspx
https://blogs.msdn.microsoft.com/conor_cunningham_msft/2010/04/23/conor-vs-runtime-constant-functions/

Why would replacing a parameter with a local variable speed up a query

I have a query that has two DATE parameters, as such:
@startDate DATE,
@endDate DATE
While developing the sproc, performance was great (< 1 second). I moved it into a stored procedure as a child, and when I ran it again it took minutes to run (2, to be exact).
I ran into this before (which I thought was some anomaly that I didn't pursue back then) so I tried the last "hack" that worked:
DECLARE @sDate DATE = CAST(@startDate AS DATE);
DECLARE @eDate DATE = CAST(@endDate AS DATE);
And sure enough, back to < 1s return times.
I have tried everything to figure this out and nothing seems to work. I can't find differences anywhere that change anything. The values are exactly the same, no matter how many different ways I try to slice it.
I have also tried:
SET @startDate = CAST(@startDate AS DATE);
SET @startDate = CONVERT(date, @startDate, 101)
And I have tried re-declaring them (using any method) in the parent sproc.
It only works if I re-declare the variables in the child sproc.
So, why would re-declaring a variable of the same type, result in such an extreme difference in performance?
UPDATE - It Is Parameter Sniffing
I didn't originally think so, but all the evidence points to parameter sniffing, even though I haven't been able to fix it with the usual methods that typically either work or help identify it. The exception is replacing the parameter with a local variable, which, with the help of all the posters below, indicates it has to be parameter sniffing.
First Update
I don't think this is parameter sniffing - which was my first thought. This is what I have done to test this:
Changed parameters (add/remove)
Added additional criteria to the query
Added OPTION (RECOMPILE)
SET ARITHABORT ON
Drop/Created old and new indexes
The changes above had no impact on the query.
It is a "parameter sniffing" workaround. I recommend reading: Slow in the Application, Fast in SSMS?
Parameters and Variables
Consider the Orders table in the Northwind database, and these three procedures:
CREATE PROCEDURE List_orders_1 AS
SELECT * FROM Orders WHERE OrderDate > '20000101'
go
CREATE PROCEDURE List_orders_2 @fromdate datetime AS
SELECT * FROM Orders WHERE OrderDate > @fromdate
go
CREATE PROCEDURE List_orders_3 @fromdate datetime AS
DECLARE @fromdate_copy datetime
SELECT @fromdate_copy = @fromdate
SELECT * FROM Orders WHERE OrderDate > @fromdate_copy
go
In the first procedure, the date is a constant, which means that SQL Server only needs to consider exactly this case. It interrogates the statistics for the Orders table, which indicate that there are no rows with an OrderDate in the third millennium. (All orders in the Northwind database are from 1996 to 1998.) Since statistics are statistics, SQL Server cannot be sure that the query will return no rows at all, which is why it makes an estimate of one single row.
In the case of List_orders_2, the query is against a variable, or more precisely a parameter. When performing the optimisation, SQL Server knows that the procedure was invoked with the value 2000-01-01. Since it does not perform any flow analysis, it can't say for sure whether the parameter will have this value when the query is executed. Nevertheless, it uses the input value to come up with an estimate, which is the same as for List_orders_1: one single row. This strategy of looking at the values of the input parameters when optimising a stored procedure is known as parameter sniffing.
In the last procedure, it's all different. The input value is copied to a local variable, but when SQL Server builds the plan, it has no understanding of this and says to itself "I don't know what the value of this variable will be."
...
Key Points
In this section, we have learned three very important things:
- A constant is a constant, and when a query includes a constant, SQL Server can use the value of the constant with full trust, and even take shortcuts such as not accessing a table at all, if it can infer from constraints that no rows will be returned.
- For a parameter, SQL Server does not know the run-time value, but it "sniffs" the input value when compiling the query.
- For a local variable, SQL Server has no idea at all of the run-time value, and applies standard assumptions. (Which assumptions depend on the operator and what can be deduced from the presence of unique indexes.)
And a second great article: Parameter Sniffing Problem and Possible Workarounds
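As a taste of what those articles cover, one commonly used workaround (sketched here against the Northwind example above) is the OPTION (RECOMPILE) query hint, which makes SQL Server compile the statement with the actual run-time values on every execution, at the cost of a compilation per call:

```sql
CREATE PROCEDURE List_orders_4 @fromdate datetime AS
SELECT * FROM Orders WHERE OrderDate > @fromdate
OPTION (RECOMPILE)
go
```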

When using GETDATE() in many places, is it better to use a variable?

By better, I mean does it improve performance by some non-marginal amount?
That is to say, each time I call GETDATE(), what amount of work does the server do to return that value?
If I'm using GETDATE() in many places in a stored procedure, should I instead be creating a variable to store the date of the transaction?
declare @transDate datetime = GETDATE()
Bench-marking data would be fantastic.
EDIT I want to clarify: I'm interested mainly in the actual performance differences between these two possibilities, and whether or not it is significant.
[NOTE: If you are going to downvote this answer, please leave a comment explaining why. It has already been downvoted many times, and finally ypercube (thank you) explained at least one reason why. I can't remove the answer because it is accepted, so you might as well help to improve it.]
According to this exchange on Microsoft, GETDATE() switched from being constant within a query to non-deterministic in SQL Server 2005. In retrospect, I don't think that is accurate. I think it was completely non-deterministic prior to SQL Server 2005 and was then hacked into something called a "non-deterministic runtime constant" as of SQL Server 2005. The latter phrase really seems to mean "constant within a query".
(And GETDATE() is defined as unambiguously and proudly non-deterministic, with no qualifiers.)
Alas, in SQL Server, non-deterministic does not mean that a function is evaluated for every row. SQL Server really does make this needlessly complicated and ambiguous, with very little documentation on the subject.
In general, a nondeterministic function call is evaluated when the query is running rather than once when the query is compiled, and its value changes each time it is called. In practice, though, GETDATE() is only evaluated once for each expression where it is used -- at execution time rather than compile time. Microsoft puts rand() and getdate() into a special category called non-deterministic runtime constant functions. By contrast, Postgres doesn't jump through such hoops; it just labels functions that have a constant value within a statement as "stable".
Despite Martin Smith's comment, SQL Server documentation is simply not explicit on this matter -- GETDATE() is described as both "nondeterministic" and "non-deterministic runtime constant", but the latter term isn't really explained. In the one place where I have found the term, for instance, the very next lines in the documentation say not to use nondeterministic functions in subqueries. That would be silly advice for a "nondeterministic runtime constant".
I would suggest using a variable to hold the value even within a query, so you have a consistent value. This also makes the intention quite clear: you want a single value inside the query. Within a single query, you can do something like:
select . . .
from (select getdate() as now) params cross join
. . .
Actually, this is a suggestion that should evaluate only once in the query, but there might be exceptions. Confusion arises because getdate() returns the same value on all different rows -- but it can return different values in different columns. Each expression with getdate() is evaluated independently.
This is obvious if you run:
select rand(), rand()
from (values (1), (2), (3)) v(x);
Within a stored procedure, you would want to have a single value in a variable. What happens if the stored procedure is run as midnight passes by, and the date changes? What impact does that have on the results?
As for performance, my guess is that the date/time lookup is minimal, and for a query it occurs once per expression as the query starts to run. This should not really be a performance issue, but more of a code-consistency issue.
My suggestion would be to use a variable, mainly because with a long-running process the GetDate() value might differ between calls. Unless you are only using the date part of GetDate(), this ensures you are always working with the same value.
One reason to use a variable with getdate() or functions like suser_sname() is the huge performance difference when you are inserting rows or doing a GROUP BY. You will notice this if you insert a large amount of rows.
I suffered this myself migrating 300GB of data to several tables.
I was testing a couple of stored procedures that used the GETDATE() function within the SP, and I was seeing increased IO reads and execution time because the query optimizer does not know what value it will operate on (read Stored Procedure Execution with Parameters, Variables, and Literals). That said, you can use the GETDATE() function in every single part of the SP -- as @Gordon Linoff mentioned, its value does not change during execution. But to avoid/remove the worry that the value might change, I created a parameter this way:
CREATE PROC TestGetdate
(
    @CurrentDate DATETIME = NULL
)
AS
SET @CurrentDate = GETDATE()
.....
and then use the parameter as you see fit; you'll see good results.
Any comments or suggestions are welcome.
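Note that, as written above, the procedure overwrites the parameter unconditionally. A variant (an assumption on my part, not from the answer above) makes the assignment conditional, so a caller can pin the date for repeatable testing:

```sql
CREATE PROC TestGetdate2
(
    @CurrentDate DATETIME = NULL
)
AS
-- Only capture the current date when the caller did not supply one.
IF @CurrentDate IS NULL
    SET @CurrentDate = GETDATE()
-- ... the rest of the procedure uses @CurrentDate throughout
```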
I used
WHERE ActualDateShipped+30 > dbo.Today()
in combination with the function below. This brought my query time from 13 seconds to 2 seconds. None of the prior answers in this post helped with this problem on SQL 2008/R2.
CREATE FUNCTION [dbo].[Today]()
RETURNS date
AS
BEGIN
    DECLARE @today date = getdate()
    RETURN @today
END
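As an aside, the predicate above applies arithmetic to the column, which generally prevents an index seek on ActualDateShipped. Assuming the intent is "shipped within the last 30 days" (an assumption; only the predicate is shown in the answer), an equivalent sargable form moves the arithmetic to the other side of the comparison:

```sql
-- Equivalent predicate, but an index on ActualDateShipped can now be used.
WHERE ActualDateShipped > DATEADD(day, -30, dbo.Today())
```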

Is it possible to raise an error if a variable assignment in a select returns multiple values?

I just found a bug in one of my programs where I had forgotten a where clause. The code was something like this:
declare @foo bigint
declare @bar bigint
select @foo = foo, @bar = bar from tbFooBar
where (....a long list of conditions goes there)
(... and an extra condition should have gone there, but I forgot it)
Unfortunately, the where clause I forgot was only relevant in very specific corner cases, and the code went through testing successfully.
Eventually, the query returned two values instead of one, and the resulting bug was a nightmare to track down (as it was very difficult to reproduce, and it wasn't obvious at all that this specific stored procedure was causing the issue we spotted)
Debugging would have been a lot easier if the @foo = foo had raised an exception instead of silently assigning the first value out of multiple rows.
Why is it this way? I can't think of a situation where one would actually want this to happen without raising an error (bearing in mind that the distinct and top clauses are there for a reason).
And is there a way to make SQL Server 2008 raise an error if this situation occurs?
Try this:
declare @d datetime
select @d = arrived from attendance;
if @@ROWCOUNT > 1 begin
    RAISERROR('More than the expected 1 row.', 16, 1)
end
Why is that this way? People can do quite a lot based on the fact the variable is assigned on each row. For instance, some use it to perform string concatenation.
@bernd_k demonstrates one way to cause an error, assuming that you're only assigning to a single variable. At the moment, there's no way to generalise that technique if you need to assign multiple variables - you still need to ensure that your query only returns one row.
If you're concerned that a particular query is large/complex/might be edited later, and somebody might accidentally cause it to return additional rows, you can introduce a new variable, and make the start of your select look like this:
declare @dummy bit
select @dummy = CASE WHEN @dummy is null then 1 ELSE 10/0 END
This will then cause an error if multiple rows are returned.
You can formulate your query like this, and then you get an error when there are multiple results:
declare @foo bigint
select @foo = (
    select foo
    from tbFoo
    where (....a long list of conditions goes there)
    (... and an extra condition should have gone there, but was forgotten)
)
The other syntax is designed not to throw errors.
EDIT:
When you need more than one column, you can use a table variable, assign the result to it, check its row count, and proceed accordingly.
I would assign the values to a table variable and check whether the table had multiple records after assignment. In fact, I almost never rely on a query returning one record, as that is short-sighted. It may return one record in testing, but once real data arrives it often will not, and maybe even should not. If you think in terms of sets instead of one record, you will have more reliable code.
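A sketch of that table-variable approach for the question's two-variable case (table and column names taken from the question; the conditions are elided as in the original):

```sql
DECLARE @foo bigint, @bar bigint
DECLARE @result TABLE (foo bigint, bar bigint)

INSERT INTO @result (foo, bar)
SELECT foo, bar
FROM tbFooBar
WHERE (....a long list of conditions goes there)

-- Raise an error when the query matched more than one row;
-- otherwise do the assignment safely.
IF (SELECT COUNT(*) FROM @result) > 1
    RAISERROR('Expected at most one row.', 16, 1)
ELSE
    SELECT @foo = foo, @bar = bar FROM @result
```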

Will using and updating the same field in one UPDATE statement cause undefined behaviour?

Here is an example
UPDATE SomeTable
SET duration = datediff(ss, statustime, getdate()),
    statustime = getdate()
WHERE id = 2009
Is the field duration going to be assigned an undefined value, since statustime is both used and assigned in the same statement? (i.e. a positive value if datediff is processed first, or negative if statustime is processed first)
I can definitely update it in two separate statements, but I am curious whether it is possible to update it in one statement.
No. Both values are calculated before either assignment is made.
Update:
I tracked down the ANSI-92 spec, and section 13.10 on the UPDATE statement says this:
The <value expression>s are effectively evaluated for each row
of T before updating any row of T.
The only other applicable rules refer to section 9.2, but that only deals with one assignment in isolation.
There is some room for ambiguity here: it could calculate and update all statustime rows first and all duration rows afterward and still technically follow the spec, but that would be a very ... odd ... way to implement it.
My gut instinct says 'no', but this will vary depending on the SQL implementation, query parser, and so on. Your best bet in situations like these is to run a quick experiment on your server (wrap it in a transaction to keep it from modifying data), and see how your particular implementation behaves.
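A sketch of that experiment, using the statement from the question (the table name is a placeholder): run the update inside a transaction, inspect the outcome, and roll back so no data is actually modified:

```sql
BEGIN TRANSACTION

UPDATE SomeTable
SET duration = DATEDIFF(ss, statustime, GETDATE()),
    statustime = GETDATE()
WHERE id = 2009

-- Inspect the result: a positive duration suggests datediff used the
-- old statustime; a zero duration suggests statustime was assigned first.
SELECT duration, statustime FROM SomeTable WHERE id = 2009

ROLLBACK TRANSACTION
```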