SQL - Proper "where" clause for current month - sql

Which is the proper way of checking events from current month on SQL Server and why?
1) WHERE (DATEDIFF(month, EventTime, GETDATE()) = 0)
2) WHERE (YEAR(EventTime) = YEAR(GETDATE()) AND MONTH(EventTime) = MONTH(GETDATE()))
The date format in the table is, e.g., EventTime: 2011-11-30 15:68:25.000

I don't have access to a SQL Server with a profiler, so I can't actually give as detailed an answer as I'd like.
The question is basically about which one allows the least calculation and most effective use of indexes.
Variants that use string manipulation have the highest calculation load and don't use indexes at all, so I'll just skip those. That leaves four common expressions...
SELECT * FROM date_sargable
WHERE YEAR(value) = YEAR (getDate())
AND MONTH(value) = MONTH(getDate())
;
SELECT * FROM date_sargable
WHERE DATEDIFF(MONTH, value, getDate()) = 0
;
SELECT * FROM date_sargable
WHERE DATEDIFF(MONTH, 0, value) = DATEDIFF(MONTH, 0, getDate())
;
SELECT * FROM date_sargable
WHERE value >= DATEADD(MONTH, DATEDIFF(MONTH, 0, getDate()) , 0)
AND value < DATEADD(MONTH, DATEDIFF(MONTH, 0, getDate()) + 1, 0)
;
The first three use INDEX SCANs, but the last one uses an INDEX SEEK. The difference is that the format of the query allows the optimiser to know you want a specific range of the data, that it's all next to each other in one block of the index, and that it's very easy to find that block.
If, when looking at execution plans, you see a SEEK in one version, and a SCAN in another, you're much more likely to benefit from the SEEK.
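As a rough sketch of the fourth form, the month boundaries can be computed once into variables and then compared directly against the column. The variable names are illustrative, and the value column is assumed to be a datetime with an index on it.

```sql
-- First instant of the current month (midnight on day 1).
DECLARE @MonthStart datetime = DATEADD(MONTH, DATEDIFF(MONTH, 0, GETDATE()), 0);
-- First instant of the next month; used as an exclusive upper bound.
DECLARE @NextMonthStart datetime = DATEADD(MONTH, 1, @MonthStart);

-- The column itself is not wrapped in any function, so the predicate
-- describes one contiguous range of the index.
SELECT *
FROM date_sargable
WHERE value >= @MonthStart
  AND value <  @NextMonthStart;
```

Comparing the execution plan of this query with the DATEDIFF variants is a reasonable way to check whether the SEEK actually appears on your own data.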

Write queries for continuous periods as explicit range conditions.
http://use-the-index-luke.com/sql/where-clause/obfuscation/dates
Thus, use something like this:
WHERE EventTime between <begin-of-month> and <end-of-month>
Example code is available at the same page, although doing quarterly filtering:
http://use-the-index-luke.com/sql/where-clause/obfuscation/dates?dbtype=sqlserver#sample_quarter_begin_end
Why? To make indexing easy.
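One caveat with BETWEEN is that it includes both endpoints, so the end-of-month value has to be the last representable instant of the month, which depends on the column's type. A half-open range sidesteps that. The sketch below assumes SQL Server 2012 or later (for DATEFROMPARTS) and a hypothetical table Events with an EventTime column; the variable names are made up for illustration.

```sql
-- Midnight on the first day of the current month.
DECLARE @BeginOfMonth date = DATEFROMPARTS(YEAR(GETDATE()), MONTH(GETDATE()), 1);
-- Midnight on the first day of the next month (exclusive upper bound).
DECLARE @BeginOfNextMonth date = DATEADD(MONTH, 1, @BeginOfMonth);

SELECT *
FROM Events                          -- hypothetical table name
WHERE EventTime >= @BeginOfMonth
  AND EventTime <  @BeginOfNextMonth;
```

With a datetime column, an inclusive upper bound written as 23:59:59.999 is not safe either, because datetime is only accurate to about 3 ms and that literal rounds up to midnight of the next day.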

Logically, they both are.
However, I think the first one is simpler (one condition instead of two), easier to understand and more likely to be sargable - so I suggest using that one.

If you have an index on that column, then both calculations bypass the index, because you are filtering on a value computed from the column rather than on the column itself:
WHERE (DATEDIFF(month, EventTime, GETDATE()) = 0)
WHERE (YEAR(EventTime) = YEAR(GETDATE()) AND MONTH(EventTime) = MONTH(GETDATE()))
You are much better off using something like this:
WHERE EventTime >= Cast (DATEADD(dd,-(DAY(GetDate())-1),GetDate()) as Date)
AND EventTime < DATEADD(dd, 1, Cast (DATEADD(dd,-(DAY(DATEADD(mm,1,GetDate()))),DATEADD(mm,1,GetDate())) as Date))
The upper bound is the day after the last day of the month, compared with <, so that rows with a time of day on the last day are included. A BETWEEN (or <=) against the last day cast to Date would stop at midnight at the start of that day.

One very simple tip is to use the to_char function.
For the example date:
2011-11-30 15:68:25.000
you can do this:
to_char(your_field, 'MM') = '11'
or, for the current month of the system date:
to_char(sysdate, 'MM') = '11'
Note that to_char is not part of standard SQL: it is available in Oracle and PostgreSQL, for example, but not in SQL Server. This check also compares the month only and ignores the year.

Related

Query data from previous day when month/year changes

In a SQL Server query, I am currently using the clause
WHERE
DAY(trade_date) = DAY(GETDATE()) - 1
AND MONTH(trade_date) = MONTH(GETDATE())
AND YEAR(trade_date) = YEAR(GETDATE())
to query my data from the previous day.
It is working fine right now but my question is if, for example, on 8/1/2021, SQL Server will try to get data from 8/0/2021 or if it will know to get data from 7/31/2021.
If this query won't work what could I use instead? Thanks!
I would recommend using proper date comparison logic - instead of breaking it down to day, month and year. Also, it is recommended to use proper date arithmetic functions like DATEADD instead of just - 1 on your date values (never sure what that -1 stands for: minus one day? Week? Month? Hour?).
And lastly - I would also recommend using SYSDATETIME() instead of GETDATE(), since the latter always returns a DATETIME datatype - which should be on its way out. You should use DATE (if you don't need the time portion) or DATETIME2(n) (if you do need the time portion), since those are more efficient and have fewer limitations compared to DATETIME.
If your trade_date is a DATE column (as it probably should be), just use:
WHERE
trade_date = DATEADD(DAY, -1, CAST(SYSDATETIME() AS DATE))
and if it's not a DATE - just cast it to a date as needed:
WHERE
CAST(trade_date AS DATE) = DATEADD(DAY, -1, CAST(SYSDATETIME() AS DATE))
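To make the month-boundary case from the question concrete, here is a small sketch. The first two statements use a fixed date in place of GETDATE(); the final query assumes a hypothetical table named trades whose trade_date may carry a time of day, and filters with a half-open range so the column is not wrapped in a function.

```sql
-- DAY(...) - 1 is plain integer arithmetic: on the first of a month it yields 0,
-- and no row has DAY(trade_date) = 0, so the original filter returns nothing.
SELECT DAY(CAST('20210801' AS date)) - 1 AS day_minus_one;            -- 0

-- DATEADD does calendar arithmetic and rolls back into the previous month.
SELECT DATEADD(DAY, -1, CAST('20210801' AS date)) AS previous_day;    -- 2021-07-31

-- Half-open range covering all of yesterday.
DECLARE @Today date = CAST(SYSDATETIME() AS date);
DECLARE @Yesterday date = DATEADD(DAY, -1, @Today);

SELECT *
FROM trades                      -- hypothetical table name
WHERE trade_date >= @Yesterday
  AND trade_date <  @Today;
```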

get the transactions of previous month regardless the day and time

I am wondering if there is a more efficient way to handle the following scenario.
Check the datasets:
https://dbfiddle.uk/?rdbms=sqlserver_2016&fiddle=e18a8c1200c8eac1f3bdca184075358e
I need to get the data of the previous month, using the getdate function as a baseline, regardless of the day.
Any suggestions, please?
The most efficient method is direct date comparisons:
where trandate < datefromparts(year(getdate()), month(getdate()), 1) and
trandate >= dateadd(month, -1, datefromparts(year(getdate()), month(getdate()), 1))
This allows the optimizer to use indexes and partitions that are based on trandate. Also, the statistics are more accurate for trandate directly -- and functions impede the use of statistics.
-- using datediff
select * from aaa
where datediff(MONTH,aaa.trandate,getdate()) = 1
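As an illustration of how the range version behaves across a year boundary, the sketch below substitutes a fixed value for getdate(). The variable names are made up; aaa and trandate are taken from the query above, and DATEFROMPARTS requires SQL Server 2012 or later.

```sql
-- Fixed stand-in for getdate(), chosen so the previous month falls in the prior year.
DECLARE @Now datetime = '2022-01-15T10:30:00';

DECLARE @ThisMonthStart date = DATEFROMPARTS(YEAR(@Now), MONTH(@Now), 1);
DECLARE @PrevMonthStart date = DATEADD(MONTH, -1, @ThisMonthStart);

SELECT @PrevMonthStart AS prev_month_start,   -- 2021-12-01
       @ThisMonthStart AS this_month_start;   -- 2022-01-01

-- Selects the same calendar month as datediff(MONTH, trandate, @Now) = 1,
-- but without applying a function to the column.
SELECT *
FROM aaa
WHERE trandate >= @PrevMonthStart
  AND trandate <  @ThisMonthStart;
```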

Most optimal way to get all records IN previous month

In the past I have always used:
WHERE DATEDIFF(m, [DATE_COL], GETDATE()) = 1
which gets me ALL the records that occurred in the PREVIOUS month. For example, if I ran this query now, it would get me all records which occurred in January.
However I am currently working with a significantly bigger table and if I use the above query, it takes almost 30 minutes for it to load. However, if I use something like
WHERE [SettlementDate] >= DateAdd(DAY, -31, GETDATE())
it will usually run in under 10 seconds.
My question is:
How can I get the same result as WHERE DATEDIFF(m, [DATE_COL], GETDATE()) = 1 without the crazy increase in processing time?
Thank you!
Your query is slow because when you do DATEDIFF(m, [DATE_COL], GETDATE()), it cannot use any indexes on [DATE_COL].
Instead, you can use the following where clause; it can use indexes on [SettlementDate] and should hopefully perform a lot better than the DATEDIFF() function.
WHERE [SettlementDate] >= DATEADD(MONTH, DATEDIFF(MONTH, 0, GETDATE())-1, 0)
AND [SettlementDate] < DATEADD(DAY,1,DATEADD(MONTH, DATEDIFF(MONTH, -1, GETDATE())-1, -1))
The problem is that you have a function call on the column, and the query optimizer cannot see inside functions. That means it cannot decide whether to use an index, so it reads the whole table, which can take a very long time.
I suggest using variables; I believe your query will perform better:
declare @From datetime -- choose the same type as your SettlementDate column
set @From = DateAdd(DAY, -31, GETDATE()) -- compute the starting date
select * from yourTable where SettlementDate >= @From
In that case SQL Server knows that you want to compare your SettlementDate value with a date and that there is nothing else to compute. If you have an index on that column, it will use it.
Additional information about SARGable queries: https://www.codeproject.com/Articles/827764/Sargable-query-in-SQL-server
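Note that a rolling 31-day window is not the same set of rows as DATEDIFF(m, [DATE_COL], GETDATE()) = 1, which selects the previous calendar month. A minimal sketch that keeps the variable approach but targets the calendar month could look like this; @From and @To are illustrative names, and yourTable is the placeholder used above.

```sql
-- Midnight on the first day of the previous month.
DECLARE @From datetime = DATEADD(MONTH, DATEDIFF(MONTH, 0, GETDATE()) - 1, 0);
-- Midnight on the first day of the current month (exclusive upper bound).
DECLARE @To datetime = DATEADD(MONTH, DATEDIFF(MONTH, 0, GETDATE()), 0);

SELECT *
FROM yourTable
WHERE SettlementDate >= @From
  AND SettlementDate <  @To;
```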

SQL Returning todays data only

I need to return today's data from my table. I'm using this query, which does the job but not as fast as I would like.
Current query
WHERE (CallDetail.DNIS='456456') AND CallDetail.ConnectedDateTimeGmt > CAST(FLOOR(CAST(GETDATE() AS FLOAT))AS DATETIME)
Another query that I use returns the past weeks worth of data in a matter of seconds.
WHERE (CallDetail.LocalName='Name') AND (CallDetail.ConnectedDate Between DATEADD(wk,-1,GetDate()) And GetDate())
Is there a more effective query I can use to return only data for today?
Instead of casting two times to return date part from GETDATE() which slows down query
WHERE (CallDetail.DNIS='456456')
AND CallDetail.ConnectedDateTimeGmt > CAST(FLOOR(CAST(GETDATE() AS FLOAT))AS DATETIME)
use a faster way to return only the date part from GETDATE()
WHERE (CallDetail.DNIS='456456')
AND CallDetail.ConnectedDateTimeGmt > DATEADD(dd, 0, DATEDIFF(dd, 0, GETDATE()))
If you want to speed up the query, then think about indexes. For this query, the most effective index would be on CallDetail(DNIS, ConnectedDateTimeGmt).
Your second query is probably running faster because you have an index on either CallDetail(ConnectedDate) or CallDetail(LocalName).
Kudos for only doing the date arithmetic on getdate() rather than on the field name (this can impede the use of indexes). If you are using a more recent version of SQL Server, then the most readable approach is cast(getdate() as date) for the conversion. However, the way to speed up the query is by judicious use of indexes.
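As a sketch of that advice, the composite index and the matching query might look like the following. The index name is made up, and CAST(GETDATE() AS date) assumes SQL Server 2008 or later.

```sql
-- Hypothetical index name. The equality column (DNIS) comes first and the
-- range column (ConnectedDateTimeGmt) second, so one seek can cover both predicates.
CREATE NONCLUSTERED INDEX IX_CallDetail_DNIS_ConnectedDateTimeGmt
    ON CallDetail (DNIS, ConnectedDateTimeGmt);

-- The date arithmetic stays on GETDATE(); the column is compared as-is.
SELECT *
FROM CallDetail
WHERE DNIS = '456456'
  AND ConnectedDateTimeGmt >= CAST(GETDATE() AS date);
```

This version uses >= rather than > so that a row stamped exactly at midnight is included; switch it back if that is not wanted.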
You can use a cross join to a derived table so the date is calculated only once. I'm not sure how this would affect your performance, but I've used this in the past and it always seemed to hit the indexes.
SELECT *
FROM CallDetail
CROSS JOIN (SELECT DATEADD(dd, 0, DATEDIFF(dd, 0, GETDATE())) AS CURRDATE) XJOIN
WHERE (CallDetail.DNIS='456456')
AND CallDetail.ConnectedDateTimeGmt > XJOIN.CURRDATE

Best way to check for current date in where clause of sql query

I'm trying to find out the most efficient (best performance) way to check date field for current date. Currently we are using:
SELECT COUNT(Job) AS Jobs
FROM dbo.Job
WHERE (Received BETWEEN DATEADD(d, DATEDIFF(d, 0, GETDATE()), 0)
AND DATEADD(d, DATEDIFF(d, 0, GETDATE()), 1))
WHERE
DateDiff(d, Received, GETDATE()) = 0
Edit: As lined out in the comments to this answer, that's not an ideal solution. Check the other answers in this thread, too.
If you just want to find all the records where the Received Date is today, and there are records with future Received dates, then what you're doing is (very very slightly) wrong... Because the Between operator allows values that are equal to the ending boundary, so you could get records with Received date = to midnight tomorrow...
If there is no need to use an index on Received, then all you need to do is check that the date diff with the current datetime is 0...
Where DateDiff(day, received, getdate()) = 0
This predicate is of course not SARGable so it cannot use an index...
If this is an issue for this query then, assuming you cannot have Received dates in the future, I would use this instead...
Where Received >= DateAdd(day, DateDiff(Day, 0, getDate()), 0)
If Received dates can be in the future, then you are probably as close to the most efficient as you can be... (Except change the Between to a >= AND < )
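The boundary issue is easy to see with a couple of made-up rows. The sketch below uses a table variable and fixed dates purely for illustration; it does not touch dbo.Job.

```sql
-- Two sample rows: one during the day, one exactly at midnight of the next day.
DECLARE @Job TABLE (Job int, Received datetime);
INSERT INTO @Job (Job, Received)
VALUES (1, '2024-05-10T09:00:00'),
       (2, '2024-05-11T00:00:00');

DECLARE @DayStart datetime = '2024-05-10T00:00:00';
DECLARE @DayEnd   datetime = DATEADD(DAY, 1, @DayStart);

-- BETWEEN is inclusive at both ends, so the midnight row is counted too.
SELECT COUNT(Job) AS Jobs
FROM @Job
WHERE Received BETWEEN @DayStart AND @DayEnd;          -- 2

-- The half-open range counts only rows that fall on the day itself.
SELECT COUNT(Job) AS Jobs
FROM @Job
WHERE Received >= @DayStart AND Received < @DayEnd;    -- 1
```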
If you want performance, you want a direct hit on the index, without any CPU etc per row; as such, I would calculate the range first, and then use a simple WHERE query. I don't know what db you are using, but in SQL Server, the following works:
-- ... where @When is the date-and-time we have (perhaps from GETDATE())
DECLARE @DayStart datetime, @DayEnd datetime
SET @DayStart = CAST(FLOOR(CAST(@When as float)) as datetime) -- get day only
SET @DayEnd = DATEADD(d, 1, @DayStart)
SELECT COUNT(Job) AS Jobs
FROM dbo.Job
WHERE (Received >= @DayStart AND Received < @DayEnd)
That's pretty much the best way to do it.
You could put the DATEADD(d, DATEDIFF(d, 0, GETDATE()), 0) and DATEADD(d, DATEDIFF(d, 0, GETDATE()), 1) into variables and use those instead, but I don't think that this will improve performance.
I'm not sure how you're defining "best" but that will work fine.
However, if this query is something you're going to run repeatedly, you should get rid of the GETDATE() function and just stick a literal date value in there via whatever programming language you're running this in. Despite their output changing only once every 24 hours, GETDATE(), CURRENT_DATE(), etc. are non-deterministic functions, which means that your RDBMS will probably invalidate the query as a candidate for storing in its query cache if it has one.
How 'bout
WHERE
DATEDIFF(d, Received, GETDATE()) = 0
I would normally use the solution suggested by Tomalak, but if you are really desperate for performance, the best option could be to add an extra indexed field, ReceivedDatePartOnly, which would store the date without the time part, and then use the query
declare @today as datetime
set @today = datediff(d, 0, getdate())
select
count(job) as jobs
from
dbo.job
where
ReceivedDatePartOnly = @today
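One way such a date-only field could be maintained automatically is a persisted computed column. This is only a sketch under assumptions: Received is taken to be a datetime column on dbo.Job, the column and index names are made up, and the GO lines are batch separators understood by SSMS and sqlcmd rather than T-SQL statements.

```sql
-- Add a date-only column derived from Received and store it physically.
ALTER TABLE dbo.Job
    ADD ReceivedDateOnly AS CAST(Received AS date) PERSISTED;
GO

-- Index the new column so an equality filter on it can seek.
CREATE NONCLUSTERED INDEX IX_Job_ReceivedDateOnly
    ON dbo.Job (ReceivedDateOnly);
GO

SELECT COUNT(Job) AS Jobs
FROM dbo.Job
WHERE ReceivedDateOnly = CAST(GETDATE() AS date);
```

The trade-off is extra storage and write cost for the new column and index, so it is probably only worth it when the range predicate on Received is not fast enough.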
Compare two dates after converting into same format like below.
where CONVERT(varchar, createddate, 1) = CONVERT(varchar, getdate(), 1);