Most optimal way to get all records IN previous month - sql

In the past I have always used:
WHERE DATEDIFF(m, [DATE_COL], GETDATE()) = 1
which gets me all the records that occurred in the previous month. For example, if I ran this query in February, it would return all records from January.
However, I am currently working with a significantly bigger table, and the above query takes almost 30 minutes to run. If I use something like
WHERE [SettlementDate] >= DateAdd(DAY, -31, GETDATE())
it will usually run in under 10 seconds.
My question is:
How can I get the same result as WHERE DATEDIFF(m, [DATE_COL], GETDATE()) = 1 without the crazy increase in processing time?
Thank you!

Your query is slow because wrapping the column in DATEDIFF(m, [DATE_COL], GETDATE()) means it cannot use any index on [DATE_COL].
Instead, you can use the following WHERE clause. It can use an index on [SettlementDate] and should perform a lot better than the DATEDIFF() version.
WHERE [SettlementDate] >= DATEADD(MONTH, DATEDIFF(MONTH, 0, GETDATE())-1, 0)
AND [SettlementDate] < DATEADD(DAY,1,DATEADD(MONTH, DATEDIFF(MONTH, -1, GETDATE())-1, -1))
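The upper bound can also be written with the same month arithmetic as the lower bound. The following is a minimal sketch rather than the answerer's code: the table variable @Settlements and its sample rows are hypothetical, a fixed @Now stands in for GETDATE() so the result is reproducible, and the inline DECLARE initialisation assumes SQL Server 2008 or later.

```sql
-- Hypothetical sample data; @Now stands in for GETDATE().
DECLARE @Now datetime = '2024-02-15T10:30:00';

DECLARE @Settlements TABLE (SettlementDate datetime NOT NULL);
INSERT INTO @Settlements (SettlementDate) VALUES
    ('2023-12-31T23:59:59'),  -- two months back: excluded
    ('2024-01-01T00:00:00'),  -- first instant of the previous month: included
    ('2024-01-31T23:59:59'),  -- last second of the previous month: included
    ('2024-02-01T00:00:00');  -- current month: excluded

-- First instant of the previous month and of the current month.
DECLARE @From datetime = DATEADD(MONTH, DATEDIFF(MONTH, 0, @Now) - 1, 0);
DECLARE @To   datetime = DATEADD(MONTH, DATEDIFF(MONTH, 0, @Now), 0);

SELECT SettlementDate
FROM @Settlements
WHERE SettlementDate >= @From   -- inclusive lower bound
  AND SettlementDate <  @To;    -- exclusive upper bound
```

This returns the two January rows, the same rows that DATEDIFF(m, SettlementDate, @Now) = 1 would select. A table variable has no index, so the sketch only checks the boundary logic; the performance benefit depends on an index on the real column.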

The problem is that the column is wrapped in a function call, and the query optimizer cannot see inside functions. That means it cannot decide whether to use an index, so it reads the whole table, which can take a very long time.
I suggest using a variable; I believe your query will perform better:
declare @From datetime -- choose the same type as your SettlementDate column
set @From = DateAdd(DAY, -31, GETDATE()) -- compute the starting date
select * from yourTable where SettlementDate >= @From
In that case SQL Server knows that you are comparing SettlementDate with a single precomputed date and that nothing else has to be computed per row. If there is an index on that column, it can use it.
Additional information about SARGable queries: https://www.codeproject.com/Articles/827764/Sargable-query-in-SQL-server

Related

Recover data only from the previous day in a table in SQL Server

I need to retrieve the rows of a table whose date falls on the previous day only. I am trying the query below, but I am not getting the expected result:
SELECT
Log.ValorEntrada, Log.DataHoraEvento, Log.NumeroEntrada
FROM
Log
WHERE
Log.DataHoraEvento = (GETDATE()-1)
How can I get this result?
In SQL Server, GETDATE() has a time component. I would recommend:
WHERE Log.DataHoraEvento >= CAST(GETDATE()-1 as DATE) AND
Log.DataHoraEvento < CAST(GETDATE() as DATE)
This condition is "sargable", meaning that an index can be used. The following also is:
WHERE CONVERT(DATE, Log.DataHoraEvento) = CONVERT(DATE, GETDATE()-1)
Almost all functions prevent the use of indexes, but conversion/casting to a date is an exception.
Finally, if you don't care about indexes, you can also write this as:
WHERE DATEDIFF(day, Log.DataHoraEvento, GETDATE()) = 1
DATEDIFF() with day as the first argument counts the number of "day" boundaries between the two date/times. Everything that happened yesterday has exactly one date boundary.
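The boundary-counting behaviour can be checked with two literal pairs. This is a small sketch with made-up timestamps, not part of the original answer:

```sql
-- Two minutes apart, but one midnight lies between them.
SELECT DATEDIFF(day, '2024-03-09T23:59:00', '2024-03-10T00:01:00') AS AcrossMidnight;  -- 1

-- Almost 24 hours apart, but on the same calendar day.
SELECT DATEDIFF(day, '2024-03-10T00:00:00', '2024-03-10T23:59:00') AS SameDay;         -- 0
```

So DATEDIFF(day, ...) = 1 means "on the previous calendar day", not "between 24 and 48 hours ago".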
If DataHoraEvento is a DATETIME, it's likely that it holds the full time, hence GETDATE()-1 isn't getting any matches. You should search for a range like this:
SELECT L.ValorEntrada, L.DataHoraEvento, L.NumeroEntrada
FROM dbo.[Log] L
WHERE L.DataHoraEvento >= CONVERT(DATE,DATEADD(DAY,-1,GETDATE()))
AND L.DataHoraEvento < CONVERT(DATE,GETDATE());
SELECT Log.ValorEntrada, Log.DataHoraEvento, Log.NumeroEntrada
FROM Log
WHERE Log.DataHoraEvento >= DATEADD(dd,DATEDIFF(dd,1,GETDATE()),0)
AND Log.DataHoraEvento < DATEADD(dd,DATEDIFF(dd,0,GETDATE()),0)
You should also use SYSDATETIME() (if you are on SQL Server 2008+) instead of GETDATE(), as this gives you datetime2(7) precision.
You can try this :
MEMBER BETWEEN DATEADD(day, -2, GETDATE()) AND DATEADD(day, -1, GETDATE())

SQL Returning todays data only

I need to return today's data from my table. I'm using this query, which does the job but not as fast as I would like.
Current query
WHERE (CallDetail.DNIS='456456') AND CallDetail.ConnectedDateTimeGmt > CAST(FLOOR(CAST(GETDATE() AS FLOAT))AS DATETIME)
Another query that I use returns the past weeks worth of data in a matter of seconds.
WHERE (CallDetail.LocalName='Name') AND (CallDetail.ConnectedDate Between DATEADD(wk,-1,GetDate()) And GetDate())
Is there a more effective query I can use to return only data for today?
Instead of casting twice to strip the time part from GETDATE(), which slows down the query
WHERE (CallDetail.DNIS='456456')
AND CallDetail.ConnectedDateTimeGmt > CAST(FLOOR(CAST(GETDATE() AS FLOAT))AS DATETIME)
use a faster way to return only the date part from GETDATE()
WHERE (CallDetail.DNIS='456456')
AND CallDetail.ConnectedDateTimeGmt > DATEADD(dd, 0, DATEDIFF(dd, 0, GETDATE()))
If you want to speed up the query, then think about indexes. For this query, the most effective index would be on CallDetail(DNIS, ConnectedDateTimeGmt).
Your second query is probably running faster because you have an index on either CallDetail(ConnectedDate) or CallDetail(LocalName).
Kudos for only doing the date arithmetic on getdate() rather than on the field name (this can impede the use of indexes). If you are using a more recent version of SQL Server, then the most readable approach is cast(getdate() as date) for the conversion. However, the way to speed up the query is by judicious use of indexes.
You could cross join to a one-row derived table so the date is calculated only once. I'm not sure how this would affect your performance, but I've used this in the past and it always seemed to hit the indexes.
SELECT *
FROM CallDetail
CROSS JOIN (SELECT DATEADD(dd, 0, DATEDIFF(dd, 0, GETDATE())) AS CURRDATE) XJOIN
WHERE (CallDetail.DNIS='456456')
AND CallDetail.ConnectedDateTimeGmt > XJOIN.CURRDATE

SQL - Proper "where" clause for current month

Which is the proper way of checking events from current month on SQL Server and why?
1) WHERE (DATEDIFF(month, EventTime, GETDATE())=0)
2) WHERE (YEAR(EventTime) = YEAR(GETDATE()) AND MONTH(EventTime) = MONTH(GETDATE()))
Date format in table is i.e. EventTime: 2011-11-30 15:68:25.000
I don't have access to a SQL Server with a profiler, so I can't actually give as detailed an answer as I'd like.
The question is basically about which one allows the least calculation and most effective use of indexes.
Variants that use string manipulation have the highest calculation load, and don't use indexes at all. So I'll just skip those. That leaves four common expressions...
SELECT * FROM date_sargable
WHERE YEAR(value) = YEAR (getDate())
AND MONTH(value) = MONTH(getDate())
;
SELECT * FROM date_sargable
WHERE DATEDIFF(MONTH, value, getDate()) = 0
;
SELECT * FROM date_sargable
WHERE DATEDIFF(MONTH, 0, value) = DATEDIFF(MONTH, 0, getDate())
;
SELECT * FROM date_sargable
WHERE value >= DATEADD(MONTH, DATEDIFF(MONTH, 0, getDate()) , 0)
AND value < DATEADD(MONTH, DATEDIFF(MONTH, 0, getDate()) + 1, 0)
;
The first three use INDEX SCANs, but the last one uses an INDEX SEEK. The difference is that the format of the query allows the optimiser to know you want a specific range of the data, that it's all next to each other in one block of the index, and that it's very easy to find that block.
If, when looking at execution plans, you see a SEEK in one version, and a SCAN in another, you're much more likely to benefit from the SEEK.
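One way to compare the plans without a profiler is SET SHOWPLAN_TEXT, which prints the estimated plan instead of executing the statement. The sketch below is an assumption-laden illustration: the temp table #date_sargable and the index name ix_value are hypothetical stand-ins for the table above, and the GO batch separators assume a client such as SSMS or sqlcmd.

```sql
-- Hypothetical stand-in for the date_sargable table used above.
CREATE TABLE #date_sargable (
    id    int IDENTITY PRIMARY KEY,
    value datetime NOT NULL
);
CREATE INDEX ix_value ON #date_sargable (value);
GO
SET SHOWPLAN_TEXT ON;   -- must be the only statement in its batch
GO
-- Function wrapped around the column.
SELECT value FROM #date_sargable
WHERE DATEDIFF(MONTH, value, GETDATE()) = 0;

-- Explicit half-open range on the bare column.
SELECT value FROM #date_sargable
WHERE value >= DATEADD(MONTH, DATEDIFF(MONTH, 0, GETDATE()), 0)
  AND value <  DATEADD(MONTH, DATEDIFF(MONTH, 0, GETDATE()) + 1, 0);
GO
SET SHOWPLAN_TEXT OFF;
GO
DROP TABLE #date_sargable;
```

The plan text names the operator chosen for each statement, so the scan and the seek can be compared directly. The optimizer's choice also depends on table size and statistics, so it is worth repeating the check against the real table.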
Write queries for continuous periods as explicit range condition.
http://use-the-index-luke.com/sql/where-clause/obfuscation/dates
Thus, use something like this:
WHERE EventTime between <begin-of-month> and <end-of-month>
Example code is available at the same page, although doing quarterly filtering:
http://use-the-index-luke.com/sql/where-clause/obfuscation/dates?dbtype=sqlserver#sample_quarter_begin_end
Why? To make indexing easy.
Logically, they both are.
However, I think the first one is simpler (one condition instead of two), easier to understand and more likely to be sargable - so I suggest using that one.
If you have an index on that column, then both expressions bypass the index, because the predicate is evaluated on a value computed from the column rather than on the indexed column itself:
WHERE (DATEDIFF(month, EventTime, GETDATE())=0)
WHERE (YEAR(EventTime) = YEAR(GETDATE()) AND MONTH(EventTime) = MONTH(GETDATE()))
You are much better off using something like this
WHERE EventTime BETWEEN Cast (DATEADD(dd,-(DAY(GetDate())-1),GetDate()) as Date)
AND Cast (DATEADD(dd,-(DAY(DATEADD(mm,1,GetDate()))),DATEADD(mm,1,GetDate())) as Date)
You can also use the same concept with a >= and <= for the dates
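One thing to watch with date-only BETWEEN bounds: the upper bound is midnight at the start of the last day, so rows later on that day fall outside it. The sketch below compares the BETWEEN form with a half-open range; the table variable @Events, its rows and the fixed @Now are hypothetical, and the date type assumes SQL Server 2008 or later.

```sql
DECLARE @Now datetime = '2011-11-15T09:00:00';  -- stands in for GETDATE()

DECLARE @Events TABLE (EventTime datetime NOT NULL);
INSERT INTO @Events (EventTime) VALUES
    ('2011-10-31T23:59:59'),  -- previous month
    ('2011-11-01T00:00:00'),  -- first instant of the month
    ('2011-11-30T15:08:25'),  -- during the last day of the month
    ('2011-12-01T00:00:00');  -- next month

-- BETWEEN with date-only bounds: the upper bound is 2011-11-30 00:00:00.
SELECT COUNT(*) AS BetweenCount        -- 1: the 15:08:25 row is missed
FROM @Events
WHERE EventTime BETWEEN CAST(DATEADD(dd, -(DAY(@Now) - 1), @Now) AS date)
                    AND CAST(DATEADD(dd, -DAY(DATEADD(mm, 1, @Now)), DATEADD(mm, 1, @Now)) AS date);

-- Half-open range: from the first of this month up to, not including, the first of next month.
SELECT COUNT(*) AS HalfOpenCount       -- 2: both November rows
FROM @Events
WHERE EventTime >= DATEADD(MONTH, DATEDIFF(MONTH, 0, @Now), 0)
  AND EventTime <  DATEADD(MONTH, DATEDIFF(MONTH, 0, @Now) + 1, 0);
```

If the column never carries a time part, the two forms agree; with real timestamps the half-open form is the safer default.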
One very simple tip, using the to_char function:
For the example date:
2011-11-30 15:68:25.000
you can do this:
to_char(your_field , 'MM') = 11
or, for the current month of the system date:
to_char(sysdate , 'MM') = 11
One detail: to_char and sysdate are Oracle functions. SQL Server does not provide to_char, so this only applies on a database that supports it.

Best way to check for current date in where clause of sql query

I'm trying to find out the most efficient (best performance) way to check date field for current date. Currently we are using:
SELECT COUNT(Job) AS Jobs
FROM dbo.Job
WHERE (Received BETWEEN DATEADD(d, DATEDIFF(d, 0, GETDATE()), 0)
AND DATEADD(d, DATEDIFF(d, 0, GETDATE()), 1))
WHERE
DateDiff(d, Received, GETDATE()) = 0
Edit: As pointed out in the comments to this answer, that's not an ideal solution. Check the other answers in this thread, too.
If you just want to find all the records where the Received Date is today, and there are records with future Received dates, then what you're doing is (very very slightly) wrong... Because the Between operator allows values that are equal to the ending boundary, so you could get records with Received date = to midnight tomorrow...
If there is no need to use an index on Received, then all you need to do is check that the date diff with the current datetime is 0...
Where DateDiff(day, received, getdate()) = 0
This predicate is of course not SARGable so it cannot use an index...
If this is an issue for this query then, assuming you cannot have Received dates in the future, I would use this instead...
Where Received >= DateAdd(day, DateDiff(Day, 0, getDate()), 0)
If Received dates can be in the future, then you are probably as close to the most efficient as you can be... (Except change the Between to a >= AND < )
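The boundary difference between BETWEEN and the >= AND < form can be seen with three sample rows. This is a sketch only: the table variable @Job, its rows and the fixed @Now are made up for illustration.

```sql
DECLARE @Now      datetime = '2024-03-10T14:00:00';            -- stands in for GETDATE()
DECLARE @DayStart datetime = DATEADD(d, DATEDIFF(d, 0, @Now), 0);
DECLARE @DayEnd   datetime = DATEADD(d, 1, @DayStart);          -- midnight tomorrow

DECLARE @Job TABLE (Job int NOT NULL, Received datetime NOT NULL);
INSERT INTO @Job (Job, Received) VALUES
    (1, '2024-03-10T00:00:00'),  -- first instant of today
    (2, '2024-03-10T23:59:59'),  -- last second of today
    (3, '2024-03-11T00:00:00');  -- exactly midnight tomorrow

-- BETWEEN includes both boundaries, so job 3 is counted.
SELECT COUNT(Job) AS Jobs FROM @Job
WHERE Received BETWEEN @DayStart AND @DayEnd;           -- 3

-- The half-open form stops just before midnight tomorrow.
SELECT COUNT(Job) AS Jobs FROM @Job
WHERE Received >= @DayStart AND Received < @DayEnd;     -- 2
```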
If you want performance, you want a direct hit on the index, without any CPU etc per row; as such, I would calculate the range first, and then use a simple WHERE query. I don't know what db you are using, but in SQL Server, the following works:
-- ... where @When is the date-and-time we have (perhaps from GETDATE())
DECLARE @DayStart datetime, @DayEnd datetime
SET @DayStart = CAST(FLOOR(CAST(@When as float)) as datetime) -- get day only
SET @DayEnd = DATEADD(d, 1, @DayStart)
SELECT COUNT(Job) AS Jobs
FROM dbo.Job
WHERE (Received >= @DayStart AND Received < @DayEnd)
that's pretty much the best way to do it.
you could put the DATEADD(d, DATEDIFF(d, 0, GETDATE()), 0) and DATEADD(d, DATEDIFF(d, 0, GETDATE()), 1) into variables and use those instead but i don't think that this will improve performance.
I'm not sure how you're defining "best" but that will work fine.
However, if this query is something you're going to run repeatedly you should get rid of the get_date() function and just stick a literal date value in there via whatever programming language you're running this in. Despite their output changing only once every 24 hours, get_date(), current_date(), etc. are non-deterministic functions, which means that your RDBMS will probably invalidate the query as a candidate for storing in its query cache if it has one.
How 'bout
WHERE
DATEDIFF(d, Received, GETDATE()) = 0
I would normally use the solution suggested by Tomalak, but if you are really desperate for performance the best option could be to add an extra indexed field received_DatePartOnly, which would store the date without the time part, and then use the query
declare @today as datetime
set @today = datediff(d, 0, getdate())
select
count(job) as jobs
from
dbo.job
where
received_DatePartOnly = @today
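One possible way to maintain such a field automatically is a persisted computed column. The sketch below is an assumption, not something the answer prescribes: the temp table #Job, the index name and the sample rows are hypothetical, and indexing a computed column relies on the session SET options that SSMS enables by default.

```sql
CREATE TABLE #Job (
    Job      int      NOT NULL,
    Received datetime NOT NULL,
    -- Midnight of the Received day. DATEADD/DATEDIFF against the integer base 0
    -- are deterministic, so the column can be persisted and indexed.
    received_DatePartOnly AS DATEADD(day, DATEDIFF(day, 0, Received), 0) PERSISTED
);
CREATE INDEX IX_Job_received_DatePartOnly ON #Job (received_DatePartOnly);

INSERT INTO #Job (Job, Received) VALUES
    (1, '2024-03-10T08:15:00'),
    (2, '2024-03-10T17:40:00'),
    (3, '2024-03-11T09:00:00');

DECLARE @today datetime = '2024-03-10T00:00:00';  -- fixed value standing in for today

SELECT COUNT(Job) AS Jobs                          -- 2
FROM #Job
WHERE received_DatePartOnly = @today;

DROP TABLE #Job;
```

The equality predicate is on a stored, indexed value, so no per-row date arithmetic is needed at query time. The cost is extra storage and index maintenance on every insert or update of Received.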
Compare two dates after converting into same format like below.
where CONVERT(varchar, createddate, 1) = CONVERT(varchar, getdate(), 1);

MS SQL Date Only Without Time

Question
Hello All,
I've had some confusion for quite some time with essentially flooring a DateTime SQL type using T-SQL. Essentially, I want to take a DateTime value of say 2008-12-1 14:30:12 and make it 2008-12-1 00:00:00. A lot of the queries we run for reports use a date value in the WHERE clause, but I either have a start and end date value of a day and use a BETWEEN, or I find some other method.
Currently I'm using the following:
WHERE CAST(CONVERT(VARCHAR, [tstamp], 102) AS DATETIME) = @dateParam
However, this seems kinda clunky. I was hoping there would be something more simple like
CAST([tstamp] AS DATE)
Some places online recommend using DATEPART() function, but then I end up with something like this:
WHERE DATEPART(year, [tstamp]) = DATEPART(year, @dateParam)
AND DATEPART(month, [tstamp]) = DATEPART(month, @dateParam)
AND DATEPART(day, [tstamp]) = DATEPART(day, @dateParam)
Maybe I'm being overly concerned with something small and if so please let me know. I just want to make sure the stuff I'm writing is as efficient as possible. I want to eliminate any weak links.
Any suggestions?
Thanks,
C
Solution
Thanks everyone for the great feedback. A lot of useful information. I'm going to change around our functions to eliminate the function on the left hand side of the operator. Although most of our date columns don't use indexes, it is probably still a better practice.
If you're using SQL Server 2008 it has this built in now, see this in books online
CAST(GETDATE() AS date)
What you have now is very bad for performance; take a look at "Only In A Database Can You Get 1000% + Improvement By Changing A Few Lines Of Code"
functions on the left side of the operator are bad
here is what you need to do
declare @d datetime
select @d = '2008-12-1 14:30:12'
where tstamp >= dateadd(dd, datediff(dd, 0, @d)+0, 0)
and tstamp < dateadd(dd, datediff(dd, 0, @d)+1, 0)
Run this to see what it does
select dateadd(dd, datediff(dd, 0, getdate())+1, 0)
select dateadd(dd, datediff(dd, 0, getdate())+0, 0)
The Date functions posted by others are the most correct way to handle this.
However, it's funny you mention the term "floor", because there's a little hack that will run somewhat faster:
CAST(FLOOR(CAST(@dateParam AS float)) AS DateTime)
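As a quick check that the float hack and the date-function form agree, the following sketch strips the time from one sample value both ways. The variable @d and its value are illustrative only.

```sql
-- @d is a sample value; both expressions strip the time part of a datetime.
DECLARE @d datetime = '2008-12-01T14:30:12';

SELECT CAST(FLOOR(CAST(@d AS float)) AS datetime) AS FloorHack,      -- 2008-12-01 00:00:00.000
       DATEADD(dd, DATEDIFF(dd, 0, @d), 0)        AS DateFunctions;  -- 2008-12-01 00:00:00.000
```

The float cast relies on how datetime is represented and is not allowed for the date or datetime2 types, so the DATEADD/DATEDIFF form is the more portable of the two.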
CONVERT(date, GETDATE()) and CONVERT(time, GETDATE()) works in SQL Server 2008. I'm uncertain about 2005.
How about this?
SELECT DATEADD(dd, DATEDIFF(dd,0,GETDATE()), 0)
Yes, T-SQL can feel extremely primitive at times, and it is things like these that often times push me to doing a lot of my logic in my language of choice (such as C#).
However, when you absolutely need to do some of these things in SQL for performance reasons, then your best bet is to create functions to house these "algorithms."
Take a look at this article. He offers up quite a few handy SQL functions along these lines that I think will help you.
http://weblogs.sqlteam.com/jeffs/archive/2007/01/02/56079.aspx
Careful here: if you use anything along the lines of WHERE CAST(CONVERT(VARCHAR, [tstamp], 102) AS DATETIME) = @dateParam it will force a scan on the table and no indexes will be used for that portion.
A much cleaner way of doing this is defining a calculated column
create table #t (
d datetime,
d2 as
cast (datepart(year,d) as varchar(4)) + '-' +
right('0' + cast (datepart(month,d) as varchar(2)),2) + '-' +
right('0' + cast (datepart(day,d) as varchar(2)),2)
)
-- notice a lot of care needs to be taken to ensure the format is comparable. (zero padding)
insert #t
values (getdate())
create index idx on #t(d2)
select d2, count(d2) from #t
where d2 between '2008-01-01' and '2009-01-22'
group by d2
-- index seek is used
This way you can directly check the d2 column, an index will be used, and you don't have to muck around with conversions.
DATEADD(d, 0, DATEDIFF(d, 0, [tstamp]))
Edit: While this will remove the time portion of your datetime, it will also make the condition non SARGable. If that's important for this query, an indexed view or a between clause is more appropriate.
Alternatively you could use
declare @d datetime
select @d = '2008-12-1 14:30:12'
where tstamp
BETWEEN dateadd(dd, datediff(dd, 0, @d)+0, 0)
AND dateadd(dd, datediff(dd, 0, @d)+1, 0)
Here's a query that will return all results within a range of days.
DECLARE @startDate DATETIME
DECLARE @endDate DATETIME
SET @startDate = DATEADD(day, -30, GETDATE())
SET @endDate = GETDATE()
SELECT *
FROM table
WHERE dateColumn >= DATEADD(day, DATEDIFF(day, 0, @startDate), 0)
AND dateColumn < DATEADD(day, 1, DATEDIFF(day, 0, @endDate))
FWIW, I've been doing the same thing as you for years
CAST(CONVERT(VARCHAR, [tstamp], 102) AS DATETIME) = @dateParam
Seems to me like this is one of the better ways to strip off time in terms of flexibility, speed and readability. (sorry). Some UDF functions as suggested can be useful, but UDFs can be slow with larger result sets.
WHERE DATEDIFF(day, tstamp, @dateParam) = 0
This should get you there if you don't care about time.
This is to answer the meta question of comparing the dates of two values when you don't care about the time.