SQL View: Optimizing real-time data - sql

I am having an issue with a query that takes a while to run.
The scenario is this: I have an efficiency metric being populated within a view that takes its inputs from another view. The calculation uses GETUTCDATE(), and I am just adjusting for my time zone. I calculate efficiency by comparing a "BuildTime" column value with how much time has passed since 7:00 AM of the current day (e.g. if 120 min have passed since 7 AM and "BuildTime" equals 120 min, the efficiency is 100%). I am also using a CASE expression so that the elapsed time is only calculated during operating hours (7 AM - 3:30 PM).
Attached below is the code:
SELECT
md.Operator,
CASE
WHEN DATEADD(HOUR, -6, GETUTCDATE()) > CONVERT(DATETIME, CONVERT(DATE, DATEADD(HOUR, -6, GETUTCDATE()))) + '7:00' AND GETDATE() < CONVERT(DATETIME, CONVERT(DATE, DATEADD(HOUR, -6, GETUTCDATE()))) + '15:30' THEN
(SUM(isNull(md.TotalTime, 0)) + SUM(isNull(md.DelTime, 0))) * 1.0 / DATEDIFF(MINUTE, CONVERT(DATETIME, CONVERT(DATE, DATEADD(HOUR, -6, GETUTCDATE()))) + '7:00' , DATEADD(HOUR, -6, GETUTCDATE())) * 100.0
ELSE (SUM(isNull(md.TotalTime, 0)) + SUM(isNull(md.DelTime, 0))) / 435 * 100.0
END
AS OpEfficiency
FROM [Booms MES Master Data] as md
WHERE md.[Date] = CONVERT(varchar(50), DATEADD(HOUR, -6, GETUTCDATE()), 101)
GROUP BY md.Operator
As of now, this code takes several seconds to run. I'm wondering where the problem lies within the code. Am I converting too many values, or is it an issue with the nested CASE expression?

If you have a very large dataset, as a rule of thumb, put the fixed value first and limit the conversion to the minimum number of characters: in this case varchar(10), since a style 101 date has only 10 characters ('12/12/1234').
WHERE CONVERT(varchar(10), DATEADD(HOUR, -6, GETUTCDATE()), 101) = md.[Date]
You can use Microsoft's Analyze an Actual Execution Plan guide to detect further issues.
https://learn.microsoft.com/en-us/sql/relational-databases/performance/analyze-an-actual-execution-plan
Also consider converting to dates instead of varchar, if your code is running on SQL Server 2012+. A datetime is stored as two integers in SQL Server. It is far easier for SQL Server to find a specific number in sorted data than to search for a string sorted as mm/dd/yyyy if you have multiple years of data.
WHERE CONVERT(date, DATEADD(HOUR, -6, GETUTCDATE())) = CONVERT(date,md.[Date])
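One way to cut down on the repeated DATEADD(HOUR, -6, GETUTCDATE()) expressions in the question is to compute the local time and the shift boundaries once. The following is a minimal sketch, not a drop-in fix: the variable names are made up for illustration, the fixed -6 hour offset is taken from the question, and because a view cannot declare variables, this form would only apply to a stored procedure or an ad hoc query.

```sql
-- Hypothetical variable names; the -6 hour offset comes from the question.
DECLARE @LocalNow datetime, @DayStart datetime;
SET @LocalNow = DATEADD(HOUR, -6, GETUTCDATE());                -- local "now"
SET @DayStart = DATEADD(DAY, DATEDIFF(DAY, 0, @LocalNow), 0);   -- local midnight

SELECT @LocalNow                                AS LocalNow,
       DATEADD(MINUTE, 7 * 60, @DayStart)       AS ShiftStart,  -- 7:00 AM
       DATEADD(MINUTE, 15 * 60 + 30, @DayStart) AS ShiftEnd;    -- 3:30 PM
```

Running it on its own shows the three values the CASE expression compares, which makes it easier to check the boundaries before touching the view.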

Related

Recover data only from the previous day in a table in SQL Server

I need to retrieve only the rows of a table whose date falls on the previous day. I am trying the query below, but I am not getting any results:
SELECT
Log.ValorEntrada, Log.DataHoraEvento, Log.NumeroEntrada
FROM
Log
WHERE
Log.DataHoraEvento = (GETDATE()-1)
How can I get this result?
In SQL Server, GETDATE() has a time component. I would recommend:
WHERE Log.DataHoraEvento >= CAST(GETDATE()-1 as DATE) AND
Log.DataHoraEvento < CAST(GETDATE() as DATE)
This condition is "sargable", meaning that an index can be used. The following also is:
WHERE CONVERT(DATE, Log.DataHoraEvento) = CONVERT(DATE, GETDATE()-1)
Almost all functions prevent the use of indexes, but conversion/casting to a date is an exception.
Finally, if you don't care about indexes, you can also write this as:
WHERE DATEDIFF(day, Log.DataHoraEvento, GETDATE()) = 1
DATEDIFF() with day as the first argument counts the number of "day" boundaries between the two date/times. Everything that happened yesterday has exactly one date boundary.
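The boundary-counting behaviour is easy to check with two fixed values. This is only an illustrative sketch; the variable names and the sample timestamps are made up.

```sql
-- Two timestamps only two minutes apart, on either side of midnight.
DECLARE @a datetime, @b datetime;
SET @a = '2024-01-01T23:59:00';
SET @b = '2024-01-02T00:01:00';

SELECT DATEDIFF(day, @a, @b)    AS DayBoundaries,  -- 1: one midnight was crossed
       DATEDIFF(minute, @a, @b) AS MinutesApart;   -- 2
```

So DATEDIFF(day, ...) = 1 means "on the previous calendar day", not "24 hours ago".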
If DataHoraEvento is a DATETIME, it's likely that it holds the full time, hence GETDATE()-1 isn't getting any matches. You should search for a range like this:
SELECT L.ValorEntrada, L.DataHoraEvento, L.NumeroEntrada
FROM dbo.[Log] L
WHERE L.DataHoraEvento >= CONVERT(DATE,DATEADD(DAY,-1,GETDATE()))
AND L.DataHoraEvento < CONVERT(DATE,GETDATE());
SELECT Log.ValorEntrada, Log.DataHoraEvento, Log.NumeroEntrada
FROM Log
WHERE Log.DataHoraEvento >= DATEADD(dd,DATEDIFF(dd,1,GETDATE()),0)
AND Log.DataHoraEvento < DATEADD(dd,DATEDIFF(dd,0,GETDATE()),0)
You should also use SYSDATETIME() (if you are on SQL Server 2008+) instead of GETDATE(), as this gives you datetime2(7) precision.
You can try this :
MEMBER BETWEEN DATEADD(day, -2, GETDATE()) AND DATEADD(day, -1, GETDATE())

Most optimal way to get all records IN previous month

In the past I have always used:
WHERE DATEDIFF(m, [DATE_COL], GETDATE()) = 1
which gets me ALL the records that occurred in the PREVIOUS month. For example, if I ran this query in February, it would get me all records which occurred in January.
However I am currently working with a significantly bigger table and if I use the above query, it takes almost 30 minutes for it to load. However, if I use something like
WHERE [SettlementDate] >= DateAdd(DAY, -31, GETDATE())
it will usually run in under 10 seconds.
My question is:
How can I get the same result as WHERE DATEDIFF(m, [DATE_COL], GETDATE()) = 1 without the crazy increase in processing time?
Thank you!
Your query is slow because when you do DATEDIFF(m, [DATE_COL], GETDATE()) it can not use any indexes on the [Date_Col].
Anyway you can use the following where clause, this will use indexes on the [SettlementDate] and hopefully it should perform a lot better than the DATEDIFF() function.
WHERE [SettlementDate] >= DATEADD(MONTH, DATEDIFF(MONTH, 0, GETDATE())-1, 0)
AND [SettlementDate] < DATEADD(DAY,1,DATEADD(MONTH, DATEDIFF(MONTH, -1, GETDATE())-1, -1))
The problem is that you have a function call on the column, and the query optimizer cannot see inside functions. That means it cannot decide whether to use an index or not, so it reads the whole table, which can take a very long time.
I suggest you use variables; I believe your query will perform better:
declare @From datetime -- choose the same type as your SettlementDate column
set @From = DateAdd(DAY, -31, GETDATE()) -- compute the starting date
select * from yourTable where SettlementDate >= @From
In that case SQL Server knows that you want to compare your SettlementDate value with a date and that there is nothing else to compute. If you have an index on that column, it will use it.
Additional information about SARGable queries: https://www.codeproject.com/Articles/827764/Sargable-query-in-SQL-server
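Combining the two answers above, the previous-month boundaries can be computed once into variables and then compared against the bare column. This is a sketch under assumptions: the variable names are hypothetical, and the column is assumed to be a datetime.

```sql
-- Hypothetical variable names; computes a half-open range covering last month.
DECLARE @MonthStart datetime, @PrevMonthStart datetime;
SET @MonthStart     = DATEADD(MONTH, DATEDIFF(MONTH, 0, GETDATE()), 0);  -- first day of this month
SET @PrevMonthStart = DATEADD(MONTH, -1, @MonthStart);                   -- first day of last month

SELECT @PrevMonthStart AS RangeStart, @MonthStart AS RangeEnd;

-- The filter would then read:
-- WHERE [SettlementDate] >= @PrevMonthStart AND [SettlementDate] < @MonthStart
```

Because the column is left untouched on the left-hand side, an index on it remains usable, and the exclusive upper bound avoids picking up rows stamped at midnight on the first of the current month.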

Extract time from datetime efficiently (as decimal or datetime)

I have been able to find a lot of information for getting a string representation of just the time from a datetime column like this one.
I need to get the time part out of a datetime in a way that lets me do some math on it, like adding it to another datetime. So a string representation of the time won't help me.
However I've only found one example that will extract the time as a numeric type value. I.e:
SELECT CAST(GETDATE() AS FLOAT) - FLOOR(CAST(GETDATE() AS FLOAT))
This method requires two casts though and I have to run this on over 10,000 rows. Is there anything similar to the dateadd method for extracting the date part from a datetime column i.e.:
select DATEADD(dd, DATEDIFF(dd, 0, getdate()), 0)
that I can use to get just the time out of a datetime column and return it as a decimal or datetime? Perhaps a solution that uses less casting?
I am using SQL Server 2000.
To get a datetime:
SELECT GetDate() - DateDiff(day, 0, GetDate());
-- returns the time with zero as the date part (1900-01-01).
And to get a number representing the time:
SELECT DateDiff(millisecond, DateDiff(day, 0, GetDate()), GetDate());
-- time since midnight in milliseconds, use as you wish
If you really want a string, then:
SELECT Convert(varchar(8), GetDate(), 108); -- 'hh:mm:ss'
SELECT Convert(varchar(12), GetDate(), 114); -- 'hh:mm:ss.nnn' where nnn is milliseconds
One way you can get the time from its value in seconds is with:
select cast(datediff(second, DATEADD(dd, DATEDIFF(dd, 0, getdate()), 0), getdate())/(60*60*24.0) as datetime)
This calculates the time in seconds and then converts back to a datetime.
To get it as a decimal:
select datediff(second, DATEADD(dd, DATEDIFF(dd, 0, getdate()), 0), getdate())/(60*60*24.0)
Or use "ms" if you prefer millisecond precision.
Or, you can use the more readable:
select datepart(hh, getdate())/24.0+datepart(mi, getdate())/(24*60.0)+
datepart(ss, getdate())/(24*60*60.0)
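To see what these expressions produce, and how the result can be added to another datetime, here is a small sketch with fixed sample values. The variable names and dates are made up for illustration; separate DECLARE and SET statements are used so that it also runs on SQL Server 2000.

```sql
DECLARE @dt datetime, @other datetime, @midnight datetime;
SET @dt       = '2024-03-15T06:00:00';
SET @other    = '2024-01-01T00:00:00';
SET @midnight = DATEADD(dd, DATEDIFF(dd, 0, @dt), 0);  -- @dt with the time removed

SELECT DATEDIFF(second, @midnight, @dt)                   AS SecondsSinceMidnight, -- 21600
       DATEDIFF(second, @midnight, @dt) / (60*60*24.0)    AS DayFraction,          -- 0.25 of a day
       DATEADD(second, DATEDIFF(second, @midnight, @dt), @other)
                                                          AS OtherPlusTime;        -- 2024-01-01 06:00:00
```

Keeping the time as an integer number of seconds and applying it with DATEADD avoids float casts entirely.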

Best way to check for current date in where clause of sql query

I'm trying to find out the most efficient (best performance) way to check date field for current date. Currently we are using:
SELECT COUNT(Job) AS Jobs
FROM dbo.Job
WHERE (Received BETWEEN DATEADD(d, DATEDIFF(d, 0, GETDATE()), 0)
AND DATEADD(d, DATEDIFF(d, 0, GETDATE()), 1))
WHERE
DateDiff(d, Received, GETDATE()) = 0
Edit: As pointed out in the comments to this answer, that's not an ideal solution. Check the other answers in this thread, too.
If you just want to find all the records where the Received Date is today, and there are records with future Received dates, then what you're doing is (very very slightly) wrong... Because the Between operator allows values that are equal to the ending boundary, so you could get records with Received date = to midnight tomorrow...
If there is no need to use an index on Received, then all you need to do is check that the date diff with the current datetime is 0...
Where DateDiff(day, received, getdate()) = 0
This predicate is of course not SARGable so it cannot use an index...
If this is an issue for this query then, assuming you cannot have Received dates in the future, I would use this instead...
Where Received >= DateAdd(day, DateDiff(Day, 0, getDate()), 0)
If Received dates can be in the future, then you are probably as close to the most efficient as you can be... (Except change the Between to a >= AND < )
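As a sketch, the query from the question rewritten with >= and < in place of BETWEEN would look like this (same table and column names as the question; untested against the asker's schema):

```sql
SELECT COUNT(Job) AS Jobs
FROM dbo.Job
WHERE Received >= DATEADD(d, DATEDIFF(d, 0, GETDATE()), 0)   -- today at midnight, inclusive
  AND Received <  DATEADD(d, DATEDIFF(d, 0, GETDATE()), 1);  -- tomorrow at midnight, exclusive
```

The exclusive upper bound leaves out a row stamped exactly at midnight tomorrow, which BETWEEN would include.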
If you want performance, you want a direct hit on the index, without any CPU etc per row; as such, I would calculate the range first, and then use a simple WHERE query. I don't know what db you are using, but in SQL Server, the following works:
-- ... where @When is the date-and-time we have (perhaps from GETDATE())
DECLARE @DayStart datetime, @DayEnd datetime
SET @DayStart = CAST(FLOOR(CAST(@When as float)) as datetime) -- get day only
SET @DayEnd = DATEADD(d, 1, @DayStart)
SELECT COUNT(Job) AS Jobs
FROM dbo.Job
WHERE (Received >= @DayStart AND Received < @DayEnd)
that's pretty much the best way to do it.
you could put the DATEADD(d, DATEDIFF(d, 0, GETDATE()), 0) and DATEADD(d, DATEDIFF(d, 0, GETDATE()), 1) into variables and use those instead but i don't think that this will improve performance.
I'm not sure how you're defining "best" but that will work fine.
However, if this query is something you're going to run repeatedly, you should get rid of the get_date() function and just stick a literal date value in there via whatever programming language you're running this in. Despite their output changing only once every 24 hours, get_date(), current_date(), etc. are non-deterministic functions, which means that your RDBMS will probably invalidate the query as a candidate for storing in its query cache, if it has one.
How 'bout
WHERE
DATEDIFF(d, Received, GETDATE()) = 0
I would normally use the solution suggested by Tomalak, but if you are really desperate for performance, the best option could be to add an extra indexed field received_DatePartOnly, which would store the date without the time part, and then use the query
declare @today as datetime
set @today = datediff(d, 0, getdate())
select
count(job) as jobs
from
dbo.job
where
received_DatePartOnly = @today
Compare two dates after converting into same format like below.
where CONVERT(varchar, createddate, 1) = CONVERT(varchar, getdate(), 1);

MS SQL Date Only Without Time

Question
Hello All,
I've had some confusion for quite some time with essentially flooring a DateTime SQL type using T-SQL. Essentially, I want to take a DateTime value of say 2008-12-1 14:30:12 and make it 2008-12-1 00:00:00. A lot of the queries we run for reports use a date value in the WHERE clause, but I either have a start and end date value of a day and use a BETWEEN, or I find some other method.
Currently I'm using the following:
WHERE CAST(CONVERT(VARCHAR, [tstamp], 102) AS DATETIME) = @dateParam
However, this seems kinda clunky. I was hoping there would be something more simple like
CAST([tstamp] AS DATE)
Some places online recommend using DATEPART() function, but then I end up with something like this:
WHERE DATEPART(year, [tstamp]) = DATEPART(year, @dateParam)
AND DATEPART(month, [tstamp]) = DATEPART(month, @dateParam)
AND DATEPART(day, [tstamp]) = DATEPART(day, @dateParam)
Maybe I'm being overly concerned with something small and if so please let me know. I just want to make sure the stuff I'm writing is as efficient as possible. I want to eliminate any weak links.
Any suggestions?
Thanks,
C
Solution
Thanks everyone for the great feedback. A lot of useful information. I'm going to change around our functions to eliminate the function on the left hand side of the operator. Although most of our date columns don't use indexes, it is probably still a better practice.
If you're using SQL Server 2008 it has this built in now, see this in books online
CAST(GETDATE() AS date)
that is very bad for performance, take a look at Only In A Database Can You Get 1000% + Improvement By Changing A Few Lines Of Code
functions on the left side of the operator are bad
here is what you need to do
declare @d datetime
select @d = '2008-12-1 14:30:12'
where tstamp >= dateadd(dd, datediff(dd, 0, @d)+0, 0)
and tstamp < dateadd(dd, datediff(dd, 0, @d)+1, 0)
Run this to see what it does
select dateadd(dd, datediff(dd, 0, getdate())+1, 0)
select dateadd(dd, datediff(dd, 0, getdate())+0, 0)
The Date functions posted by others are the most correct way to handle this.
However, it's funny you mention the term "floor", because there's a little hack that will run somewhat faster:
CAST(FLOOR(CAST(@dateParam AS float)) AS DateTime)
CONVERT(date, GETDATE()) and CONVERT(time, GETDATE()) works in SQL Server 2008. I'm uncertain about 2005.
How about this?
SELECT DATEADD(dd, DATEDIFF(dd,0,GETDATE()), 0)
Yes, T-SQL can feel extremely primitive at times, and it is things like these that often times push me to doing a lot of my logic in my language of choice (such as C#).
However, when you absolutely need to do some of these things in SQL for performance reasons, then your best bet is to create functions to house these "algorithms."
Take a look at this article. He offers up quite a few handy SQL functions along these lines that I think will help you.
http://weblogs.sqlteam.com/jeffs/archive/2007/01/02/56079.aspx
Careful here: if you use anything along the lines of WHERE CAST(CONVERT(VARCHAR, [tstamp], 102) AS DATETIME) = @dateParam, it will force a scan on the table and no indexes will be used for that portion.
A much cleaner way of doing this is defining a calculated column
create table #t (
d datetime,
d2 as
cast (datepart(year,d) as varchar(4)) + '-' +
right('0' + cast (datepart(month,d) as varchar(2)),2) + '-' +
right('0' + cast (datepart(day,d) as varchar(2)),2)
)
-- notice that a lot of care needs to be taken to ensure the format is comparable (zero padding)
insert #t
values (getdate())
create index idx on #t(d2)
select d2, count(d2) from #t
where d2 between '2008-01-01' and '2009-01-22'
group by d2
-- index seek is used
This way you can directly check the d2 column, an index will be used, and you don't have to muck around with conversions.
DATEADD(d, 0, DATEDIFF(d, 0, [tstamp]))
Edit: While this will remove the time portion of your datetime, it will also make the condition non SARGable. If that's important for this query, an indexed view or a between clause is more appropriate.
Alternatively you could use
declare @d datetime
select @d = '2008-12-1 14:30:12'
where tstamp
BETWEEN dateadd(dd, datediff(dd, 0, @d)+0, 0)
AND dateadd(dd, datediff(dd, 0, @d)+1, 0)
Here's a query that will return all results within a range of days.
DECLARE @startDate DATETIME
DECLARE @endDate DATETIME
SET @startDate = DATEADD(day, -30, GETDATE())
SET @endDate = GETDATE()
SELECT *
FROM table
WHERE dateColumn >= DATEADD(day, DATEDIFF(day, 0, @startDate), 0)
AND dateColumn < DATEADD(day, 1, DATEDIFF(day, 0, @endDate))
FWIW, I've been doing the same thing as you for years
CAST(CONVERT(VARCHAR, [tstamp], 102) AS DATETIME) = @dateParam
Seems to me like this is one of the better ways to strip off time in terms of flexibility, speed and readability. (sorry). Some UDFs as suggested can be useful, but UDFs can be slow with larger result sets.
WHERE DATEDIFF(day, tstamp, @dateParam) = 0
This should get you there if you don't care about time.
This is to answer the meta question of comparing the dates of two values when you don't care about the time.