Is this date comparison condition SARG-able in SQL?

Is this condition sargable?
AND DATEDIFF(month,p.PlayerStatusLastTransitionDate,@now) BETWEEN 1 AND 7)
My rule of thumb is that a function on the left makes a condition non-sargable, but in some places I have read that a BETWEEN clause is sargable.
So does anyone know for sure?
For reference:
What makes a SQL statement sargable?
http://en.wikipedia.org/wiki/Sargable
NOTE: If any guru ends here, please do update Sargable Wikipedia page. I updated it a little bit but I am sure it can be improved more :)

Using AdventureWorks, if we look at these two near-equivalent queries (DATEDIFF counts month boundaries crossed, so the row sets can differ slightly at the edges):
SELECT OrderDate FROM Sales.SalesOrderHeader
WHERE DATEDIFF(month,OrderDate,GETDATE()) BETWEEN 1 AND 7;
SELECT OrderDate FROM Sales.SalesOrderHeader
WHERE OrderDate >= DATEADD(MONTH, -7, GETDATE())
AND OrderDate <= DATEADD(MONTH, -1, GETDATE());
In both cases we see a clustered index scan:
But notice the recommended/missing index only on the latter query, since it's the only one that could benefit from it:
If we add an index to the OrderDate column, then run the queries again:
CREATE INDEX dt ON Sales.SalesOrderHeader(OrderDate);
GO
SELECT OrderDate FROM Sales.SalesOrderHeader
WHERE DATEDIFF(month,OrderDate,GETDATE()) BETWEEN 1 AND 7;
SELECT OrderDate FROM Sales.SalesOrderHeader
WHERE OrderDate >= DATEADD(MONTH, -7, GETDATE())
AND OrderDate <= DATEADD(MONTH, -1, GETDATE());
We see a big difference; the latter uses a seek:
Notice too how the estimates are way off for your version of the query. This can be absolutely disastrous on a large data set.
There are very few cases where a function or other expression applied to the column will be sargable. One case I know of is CONVERT(DATE, datetime_column) - but that particular optimization is undocumented, and I recommend staying away from it anyway. Not only because you'd be implicitly suggesting that using functions/expressions against columns is okay (it's not in every other scenario), but also because it can lead to wasted reads and disastrous estimates.

I would be very surprised if that was sargable. One option might be to rewrite it as:
WHERE p.PlayerStatusLastTransitionDate >= DATEADD(month,-7,CAST(@now AS DATE))
AND p.PlayerStatusLastTransitionDate <= DATEADD(month,-1,CAST(@now AS DATE))
Which I believe will be sargable (even though it's not quite as pretty).
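Note that DATEDIFF(month, ...) counts calendar-month boundaries crossed rather than elapsed time, so a DATEADD(MONTH, -7, ...) range is close to, but not exactly, the original condition. Here is a minimal sketch of a sargable form that should match the DATEDIFF version exactly, assuming the AdventureWorks table from the first answer. The variable names are illustrative and not from the original posts.

```sql
-- Illustrative variable names; not part of the original question.
DECLARE @now datetime = GETDATE();
-- First instant of the current month.
DECLARE @monthStart datetime = DATEADD(MONTH, DATEDIFF(MONTH, 0, @now), 0);
-- First instant of the month seven months earlier.
DECLARE @rangeStart datetime = DATEADD(MONTH, -7, @monthStart);

-- DATEDIFF(month, OrderDate, @now) BETWEEN 1 AND 7 is true exactly when
-- OrderDate falls in one of the seven calendar months before the current one.
SELECT OrderDate
FROM Sales.SalesOrderHeader
WHERE OrderDate >= @rangeStart
  AND OrderDate <  @monthStart;
```

The half-open range (>= start, < end) avoids having to work out the last representable instant of the previous month. With local variables the optimizer may fall back to generic estimates; adding OPTION (RECOMPILE) is one way to let it see the actual values.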

Related

Is there a way to optimize this SQL query?

I have this query I have to automate with AWS Lambda but first I want to optimize it.
It seems legit to me but I have this feeling I can do something to improve it.
SELECT q_name, count(*)
FROM myTable
WHERE status = 2
AND DATEDIFF(mi, create_stamp, getdate()) > 1
GROUP BY q_name
The only improvement I can see is not to apply a function to your column, because that makes the query unsargable (unable to use indexes). Instead leave the column as it is and calculate the correct cutoff.
SELECT q_name, count(*)
FROM myTable
WHERE [status] = 2
--AND DATEDIFF(mi, create_stamp, getdate()) > 1
-- Adjust the logic to meet your requirements, because this is slightly different to what you had
AND create_stamp < DATEADD(minute, -1, getdate())
GROUP BY q_name;
Note: while DATEADD does accept abbreviations for the unit to add, it's much clearer to type it in full.
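To make the "slightly different" remark concrete: DATEDIFF(minute, ...) counts minute boundaries crossed, not elapsed seconds. Here is a sketch of a sargable cutoff that should reproduce the original > 1 condition exactly, assuming create_stamp is a datetime column. The @cutoff name and the item_count alias are illustrative.

```sql
-- Start of the current minute, minus one minute.
-- DATEDIFF(minute, create_stamp, GETDATE()) > 1 holds exactly when
-- create_stamp is earlier than this value.
DECLARE @cutoff datetime =
    DATEADD(minute, DATEDIFF(minute, 0, GETDATE()) - 1, 0);

SELECT q_name, COUNT(*) AS item_count
FROM myTable
WHERE [status] = 2
  AND create_stamp < @cutoff
GROUP BY q_name;
```

An index on (status, create_stamp) would be a natural candidate to support this predicate, though whether it helps depends on the data distribution.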

Different SQL query to compare date

I am trying to grab records from the email table that are less than 14 days old. Is there a performance difference between the following two queries?
select *
from email e
where DATEDIFF(day, e.received_date, cast(GETDATE() as date)) < 14
select *
from email e
where (GETDATE() - e.received_date) < 14
A sargable way of writing this predicate, i.e. one that lets SQL Server use an index on the e.received_date column, would be:
where e.received_date > dateadd(day, -14, getdate())
With this construction there is no expression that needs to be evaluated for the data in the received_date column, and the right-hand side evaluates to a constant expression.
If you put the column into an expression, like datediff(day, e.received_date, getdate()) then SQL server has to evaluate every row in the table before being able to tell whether or not it is less than 14. This precludes the use of an index.
So, there should be virtually no significant difference between the two constructions you currently have, in the sense that both will be much slower than the sargable predicate, especially if there is an index on the received_date column.
The two expressions are not equivalent.
The first counts the number of midnights between the two dates, so complete days are returned.
The second incorporates the time.
If you want complete days since so many days ago, the best method is:
where e.received_date >= convert(date, dateadd(day, -14, getdate()))
The >= is to include midnight.
This is better because the only operations are on getdate() -- and these can actually be handled prior to the main execution phase. This allows the query to take advantage of indexes, partitions, and statistics on e.received_date.
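The "counts midnights" behaviour is easy to see with two fixed values. A small sketch (the literal timestamps are made up for illustration):

```sql
-- Two timestamps only two minutes apart, on either side of midnight.
DECLARE @a datetime = '2024-01-01T23:59:00';
DECLARE @b datetime = '2024-01-02T00:01:00';

SELECT DATEDIFF(day, @a, @b)    AS midnights_crossed,  -- 1, although only two minutes elapsed
       DATEDIFF(minute, @a, @b) AS minutes_elapsed;    -- 2
```

This is why a DATEDIFF(day, ...) filter and a filter on elapsed time can return different rows near the boundary.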

DATEPART function assigning

I have an order table and need to find the customers who placed an order today or yesterday (the 18th and 17th).
I need to find those results.
select A.C_Name
from Customer_Table A
inner join
Order_Table O
On A.C_ID=O.C_ID
where DATEPART(DAY,Order_Date)=GetDATE() and
DATEPART(DAY,Order_Date)=GETDATE()-1
I didn't get any results for the above query.
If you want orders today and yesterday, then this should be sufficient:
where Order_Date >= dateadd(day, -1, cast(getdate() as date))
(This assumes no future order dates, which seems reasonable).
Your query is a mess for several reasons. datepart() returns an integer and you are comparing it to a date. Looking at just the "day" part of a date will not work on the first of the month. And, getdate() -- despite its name -- has a time component, so direct equality is inappropriate.
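If future order dates are possible, an explicit upper bound keeps the range tight. This is a sketch using the table and column names from the question. The @today variable and the DISTINCT (which assumes each customer should be listed once) are additions for illustration.

```sql
-- Midnight at the start of today.
DECLARE @today date = CAST(GETDATE() AS date);

SELECT DISTINCT A.C_Name
FROM Customer_Table A
INNER JOIN Order_Table O
    ON A.C_ID = O.C_ID
WHERE O.Order_Date >= DATEADD(day, -1, @today)  -- from yesterday 00:00
  AND O.Order_Date <  DATEADD(day,  1, @today); -- up to, not including, tomorrow 00:00
```

Because Order_Date is left bare on the left-hand side, an index on that column remains usable.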

How to get records between 24 and 48 hours in SQL Server

I am trying to get records between 24 and 36 hours.
So far I have :
select * from tablename where DATEDIFF(DAY, dateColumn, GETDATE())>0
This returns me all records older than 24 hours. I am looking to get records older than 24 but no older than 36.
Thanks
So far all the answers here do something like WHERE a.function(date_field) > 0; they place a function around your search field.
Unfortunately this means that the RDBMS's optimiser cannot use any index on that field.
Instead, I recommend moving the calculations "to the right hand side".
SELECT
*
FROM
tablename
WHERE
dateColumn >= DATEADD(HOUR, -36, GETDATE())
AND dateColumn < DATEADD(HOUR, -24, GETDATE())
This format calculates the two values once and can then do a range seek on an index, rather than scanning the whole table and repeating the same calculations again and again.
Note: While these are the first solutions to come to mind, they are suboptimal as pointed out in the comments. See @MatBailie's answer for a solution that would be preferable.
While these are natural and might be okay in some limited use, you really should prefer a solution that is Search ARGument ABLE.
Sargable
In relational databases, a condition (or predicate) in a
query is said to be sargable if the DBMS engine can take advantage of
an index to speed up the execution of the query. The term is derived
from a contraction of Search ARGument ABLE.
Original answers:
Just add another condition:
select *
from tablename
where DATEDIFF(DAY, dateColumn, GETDATE())>0
and DATEDIFF(HOUR, dateColumn, GETDATE()) <= 36
or
select *
from tablename
where DATEDIFF(HOUR, dateColumn, GETDATE()) BETWEEN 24 AND 36
Note: In addition to being non-sargable, this BETWEEN also includes records exactly 24 hours old, when in fact the OP asks for records older than 24. [The OP uses "between" a couple of times, but clarifies that it isn't an inclusive SQL BETWEEN, but rather a semi-inclusive range that must be implemented with > and <=.]
You must also specify the 36 hours:
select * from tablename where DATEDIFF(DAY, dateColumn, GETDATE())>0 AND DATEDIFF(hh, dateColumn, GETDATE())<=36
Use between:
select * from tablename where DATEDIFF(hour, dateColumn, GETDATE()) between 24 and 36
SELECT * FROM tablename WHERE DATEDIFF(hh,dateColumn,GETDATE()) between 24 and 36

Query runs slow with date expression, but fast with string literal

I am running a query with below condition in SQL Server 2008.
Where FK.DT = CAST(DATEADD(m, DATEDIFF(m, 0, getdate()), 0) as DATE)
The query takes forever to run with the above condition, but if I just say
Where FK.DT = '2013-05-01'
it runs great, in 2 minutes. The FK.DT key contains only first-of-month dates.
Any help is appreciated; I am clueless as to why this is happening.
This could work better:
Where FK.DT = cast(getdate() + 1 - datepart(day, getdate()) as date)
Unless you are running with trace flag 4199 on, there is a bug that affects the cardinality estimates. At the time of writing
SELECT DATEADD(m, DATEDIFF(m, getdate(), 0), 0),
DATEADD(m, DATEDIFF(m, 0, getdate()), 0)
Returns
+-------------------------+-------------------------+
| 1786-06-01 00:00:00.000 | 2013-08-01 00:00:00.000 |
+-------------------------+-------------------------+
The bug is that the predicate in the question uses the first date rather than the second when deriving the cardinality estimates. So for the following setup.
CREATE TABLE FK
(
ID INT IDENTITY PRIMARY KEY,
DT DATE,
Filler CHAR(1000) NULL,
UNIQUE (DT,ID)
)
INSERT INTO FK (DT)
SELECT TOP (1000000) DATEADD(m, DATEDIFF(m, getdate(), 0), 0)
FROM master..spt_values o1, master..spt_values o2
UNION ALL
SELECT DATEADD(m, DATEDIFF(m, 0, getdate()), 0)
Query 1
SELECT COUNT(Filler)
FROM FK
WHERE FK.DT = CAST(DATEADD(m, DATEDIFF(m, 0, getdate()), 0) AS DATE)
Estimates that the number of matching rows will be 100,000. This is the number that match the date '1786-06-01'.
But both of the following queries
SELECT COUNT(Filler)
FROM FK
WHERE FK.DT = CAST(GETDATE() + 1 - DATEPART(DAY, GETDATE()) AS DATE)
SELECT COUNT(Filler)
FROM FK
WHERE FK.DT = CAST(DATEADD(m, DATEDIFF(m, 0, getdate()), 0) AS DATE)
OPTION (QUERYTRACEON 4199)
Give this plan
Due to the much more accurate cardinality estimates the plan now just does a single index seek rather than a full scan.
In most cases, the below probably applies. In this specific case, this is an optimizer bug involving DATEDIFF. Details here and here. Sorry for doubting t-clausen.dk, but his answer simply wasn't an intuitive and logical solution without knowing about the existence of the bug.
So assuming DT is actually DATE and not something silly like VARCHAR or - worse still - NVARCHAR - this is probably because you have a plan cached that used a very different date value when first executed, therefore chose a plan catering to a very different typical data distribution. There are ways you can overcome this:
Force a recompile of the plan by adding OPTION (RECOMPILE). You might only have to do this once, but then the plan you get might not be optimal for other parameters. The downside to leaving the option there all the time is that you then pay the compile cost every time the query runs. In a lot of cases this is not substantial, and I'll often choose to pay a known small cost rather than sometimes have a query that runs slightly faster and other times it runs extremely slow.
...
WHERE FK.DT = CAST(... AS DATE) OPTION (RECOMPILE);
Use a variable first (no need for an explicit CONVERT to DATE here, and please use MONTH instead of shorthand like m - that habit can lead to real funny behavior if you haven't memorized what all of the abbreviations do, for example I bet y and w don't produce the results you'd expect):
DECLARE @dt DATE = DATEADD(MONTH, DATEDIFF(MONTH, 0, GETDATE()), 0);
...
WHERE FK.DT = @dt;
However in this case the same thing could happen - parameter sniffing could coerce a sub-optimal plan to be used for different parameters representing different data skew.
You could also experiment with OPTION (OPTIMIZE FOR (@dt = '2013-08-01')), which would coerce SQL Server into considering this value instead of the one that was used to compile the cached plan, but this would require a hard-coded string literal, which will only help you for the rest of August, at which point you'd need to update the value. You could also consider OPTION (OPTIMIZE FOR UNKNOWN).
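The variable and recompile suggestions can also be combined. This is a minimal sketch against the FK test table from the earlier answer. Whether the estimate actually improves on a given build is an assumption worth checking against the actual plan.

```sql
-- First day of the current month, computed once outside the predicate.
DECLARE @dt date = DATEADD(MONTH, DATEDIFF(MONTH, 0, GETDATE()), 0);

SELECT COUNT(Filler)
FROM FK
WHERE FK.DT = @dt
OPTION (RECOMPILE);  -- recompile on each run so the current value of @dt can be used for estimates
```

The trade-off is the one described above: a compile on every execution in exchange for a plan built for the value actually being searched.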