Analysing Implicit CAST - sql

I have an academic scenario, which I would like to know how to analyse.
DECLARE @date DATETIME
SET @date = getDate()
SET @date = DATEADD(DAY, DATEDIFF(DAY, 0, @date-3), 3)
This will round the date down to a Thursday.
What I have been challenged on is to evidence where there are implicit CASTs.
There are three places where I presume that this must be occurring...
DATEADD(
DAY,
DATEDIFF(
DAY,
0, -- Implicitly CAST to a DATETIME?
@date-3 -- I presume the `3` is being implicitly cast to a DATETIME?
),
3 -- Another implicit CAST to a DATETIME?
)
Perhaps, however, as the 0 and 3's are constants, this is done during compilation to an execution plan?
But if the 3's were INT variables, would that be different?
Is there a way to analyse an execution plan, or some other method, to be able to determine this empirically?
To make matters more complicated, I'm currently off site. I'm trying to remotely assist a colleague with this. Which means I do not have direct access to SSMS, etc.

For the queries
DECLARE @date DATETIME = getDate()
DECLARE @N INT = 3
SELECT DATEADD(DAY, DATEDIFF(DAY, 0, @date-3), 3)
FROM master..spt_values
SELECT DATEADD(DAY, DATEDIFF(DAY, 0, @date-@N), @N)
FROM master..spt_values
And looking at the execution plans the compute scalars show the following.
Query 1
[Expr1003] = Scalar Operator(dateadd(day,datediff(day,'1900-01-01 00:00:00.000',[@date]-'1900-01-04 00:00:00.000'),'1900-01-04 00:00:00.000'))
Query 2
[Expr1003] = Scalar Operator(dateadd(day,datediff(day,'1900-01-01 00:00:00.000',[@date]-CONVERT_IMPLICIT(datetime,[@N],0)),CONVERT_IMPLICIT(datetime,[@N],0)))
This shows that your suspicion is correct: the conversion happens at compile time for the literal values, but needs a CONVERT_IMPLICIT at run time for the int variables.
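Another way to gather evidence, from any client that can run a query, is to ask SQL Server what type each sub-expression ends up with. The following is a minimal sketch, assuming SQL Server 2008 or later (for the inline DECLARE initialisers); the column aliases are made up for illustration. SQL_VARIANT_PROPERTY reports the base type of the expression it is given, so a datetime result for the subtraction indicates that the int operand was converted first.

```sql
-- Sketch: report the resulting type of each sub-expression.
DECLARE @date DATETIME = GETDATE();
DECLARE @N INT = 3;

SELECT
    SQL_VARIANT_PROPERTY(@date - 3,  'BaseType') AS literal_subtraction_type,   -- datetime - int literal
    SQL_VARIANT_PROPERTY(@date - @N, 'BaseType') AS variable_subtraction_type,  -- datetime - int variable
    SQL_VARIANT_PROPERTY(DATEDIFF(DAY, 0, @date - @N), 'BaseType') AS datediff_type;
```

This only shows the resulting types, not whether a conversion was folded at compile time; the compute scalar definitions in the plans are what distinguish the literal case from the variable case.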

Related

Direct access slower than using functions?

I was conducting some performance testing and have discovered something quite strange. I have set up a short script to time how long it takes to perform certain actions.
declare @date date
declare @someint int
declare @start datetime
declare @ended datetime
set @date = GETDATE()
DECLARE @count INT
SET @count = 0
set @start = GETDATE()
WHILE (@count < 1000)
BEGIN
--Insert test script here
SET @count = @count + 1
END
set @ended = GETDATE()
select DATEDIFF( MILLISECOND, @start, @ended)
The table I was running tests against includes the columns MDay and CalDate. Every calendar date has a corresponding M(Manufacturing)Day. The table may look something like this:
MDay | CalDate
1 | 1970-01-01
2 | 1970-01-02
I wanted to test how efficient one of our functions was. This function simply takes in a date and returns the int MDay value. I used direct access, basically the same thing without the function, and the tests showed this method taking twice as long! The code I inserted into the loop is provided below. I used a random date in an attempt to eliminate caching (if any).
Function
select @someint = Reference.GetMDay(DATEADD( D, convert(int, RAND() * 1000) , @date))
Definition for above
create Function [Reference].[GetMDay]
(@pCaLDate smalldatetime
)
Returns int
as
Begin
Declare @Mday int
Select @Mday = Mday
from Reference.MDay
where Caldate = @pCaLDate
Return @Mday
End
Direct
select @someint = MDay from Reference.MDay where CalDate = DATEADD( D, convert(int, RAND() * 1000) , @date)
I even tried using a static @date for my direct code and the difference in times is negligible, so I know the convert call isn't holding it back.
What the heck is going on here?
Take a look at http://msdn.microsoft.com/en-us/library/ms178071%28v=sql.105%29.aspx. Is the execution plan the same on your SQL Server for both methods?

SQL HELP... CONVERT(int, CONVERT(datetime, FLOOR(CONVERT(float, getdate())))

I am having a problem adjusting this part of my SQL statement:
HAVING dbo.BOOKINGS.BOOKED = CONVERT(int, CONVERT(datetime,
FLOOR(CONVERT(float, GETDATE()))) + 2)
Normally, the page that uses this statement just lists the amount of sales for today. I want to switch the GETDATE() to a date that I declare. I tried all different formats and none have worked.
Use the DATEADD/DATEDIFF method of setting the time portion to midnight of the current date - it's the fastest means, and casting to FLOAT can be unreliable:
HAVING dbo.BOOKINGS.BOOKED = CONVERT(INT, DATEADD(dd, DATEDIFF(dd, 0, GETDATE()), 0))+2
Then, you can set your own date easily if you use a variable (@var in this example, within a stored procedure or function):
DECLARE @var DATETIME
SELECT ...
HAVING dbo.BOOKINGS.BOOKED = CONVERT(INT, DATEADD(dd, DATEDIFF(dd, 0, @var), 0))+2
This assumes @var is a DATETIME data type. Otherwise, you'll need to use a date format SQL Server will implicitly convert to a DATETIME -- or use CAST/CONVERT to explicitly convert the value.
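As a rough illustration of what the DATEADD/DATEDIFF expression produces for a declared date, here is a small sketch. The value assigned to @var and the column aliases are assumptions chosen for the example, not taken from the original page.

```sql
-- @var stands in for whatever date the page should report on (illustrative value).
DECLARE @var DATETIME;
SET @var = '2010-11-04T13:28:00';

SELECT
    -- Whole days since 1900-01-01, added back to day 0: the time portion becomes midnight.
    DATEADD(dd, DATEDIFF(dd, 0, @var), 0) AS midnight_of_var,
    -- The same value as an integer day number, plus 2, as used in the HAVING clause.
    CONVERT(INT, DATEADD(dd, DATEDIFF(dd, 0, @var), 0)) + 2 AS booked_key;
```

The first column should come back as 2010-11-04 00:00:00.000; the second is the integer that would be compared against BOOKED.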
If you want to supply your own date, you could use this instead of getdate(), which gives the current system timestamp.
Cast('2010-11-04 13:28:00.000' as datetime)
How about
declare @myDate as datetime
set @myDate = '11/2/2010'
. . .
HAVING dbo.BOOKINGS.BOOKED = CONVERT(int, CONVERT(datetime,
FLOOR(CONVERT(float, @myDate ))) + 2)
That should do it, and it should automatically do the type conversion on your date string used in the set statement, or you could just pass in a datetime parameter if this is in a stored procedure.

How can I compare time in SQL Server?

I'm trying to compare time in a datetime field in a SQL query, but I don't know if it's right. I don't want to compare the date part, just the time part.
I'm doing this:
SELECT timeEvent
FROM tbEvents
WHERE convert(datetime, startHour, 8) >= convert(datetime, @startHour, 8)
Is it correct?
I'm asking this because I need to know if 08:00:00 is less or greater than 07:30:00 and I don't want to compare the date, just the time part.
Thanks!
Your compare will work, but it will be slow because the dates are converted to a string for each row. To efficiently compare two time parts, try:
declare @first datetime
set @first = '2009-04-30 19:47:16.123'
declare @second datetime
set @second = '2009-04-10 19:47:16.123'
select (cast(@first as float) - floor(cast(@first as float))) -
(cast(@second as float) - floor(cast(@second as float)))
as Difference
Long explanation: casting a datetime to float in SQL Server yields a floating point number. The digits before the decimal point represent the date. The digits after the decimal point represent the time.
So here's an example date:
declare @mydate datetime
set @mydate = '2009-04-30 19:47:16.123'
Let's convert it to a float:
declare @myfloat float
set @myfloat = cast(@mydate as float)
select @myfloat
-- Shows 39931,8244921682
Now take the part after the comma character, i.e. the time:
set @myfloat = @myfloat - floor(@myfloat)
select @myfloat
-- Shows 0,824492168212601
Convert it back to a datetime:
declare @mytime datetime
set @mytime = convert(datetime,@myfloat)
select @mytime
-- Shows 1900-01-01 19:47:16.123
The 1900-01-01 is just the "zero" date; you can display the time part with convert, specifying for example format 108, which is just the time:
select convert(varchar(32),@mytime,108)
-- Shows 19:47:16
Conversions between datetime and float are pretty fast, because they're basically stored in the same way.
convert(varchar(5), thedate, 108) between @leftTime and @rightTime
Explanation:
if you have varchar(5) you will obtain HH:mm
if you have varchar(8) you obtain HH:mm:ss
108 obtains only the time from the SQL date
@leftTime and @rightTime are two variables to compare
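The effect of the target length on style 108 can be seen with a short sketch like the one below. The sample timestamp and the aliases are illustrative assumptions.

```sql
-- Style 108 renders the time as hh:mi:ss; a shorter varchar truncates it.
DECLARE @thedate DATETIME;
SET @thedate = '2009-04-30T19:47:16.123';

SELECT
    CONVERT(varchar(5), @thedate, 108) AS hh_mm,     -- 19:47
    CONVERT(varchar(8), @thedate, 108) AS hh_mm_ss;  -- 19:47:16
```

Because the strings are zero-padded and fixed-width, comparing them as text orders them the same way as the times they represent.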
If you're using SQL Server 2008, you can do this:
WHERE CONVERT(time(0), startHour) >= CONVERT(time(0), @startTime)
Here's a full test:
DECLARE @tbEvents TABLE (
timeEvent int IDENTITY,
startHour datetime
)
INSERT INTO @tbEvents (startHour) SELECT DATEADD(hh, 0, GETDATE())
INSERT INTO @tbEvents (startHour) SELECT DATEADD(hh, 1, GETDATE())
INSERT INTO @tbEvents (startHour) SELECT DATEADD(hh, 2, GETDATE())
INSERT INTO @tbEvents (startHour) SELECT DATEADD(hh, 3, GETDATE())
INSERT INTO @tbEvents (startHour) SELECT DATEADD(hh, 4, GETDATE())
INSERT INTO @tbEvents (startHour) SELECT DATEADD(hh, 5, GETDATE())
--SELECT * FROM @tbEvents
DECLARE @startTime datetime
SET @startTime = DATEADD(mi, 65, GETDATE())
SELECT
timeEvent,
CONVERT(time(0), startHour) AS 'startHour',
CONVERT(time(0), @startTime) AS '@startTime'
FROM @tbEvents
WHERE CONVERT(time(0), startHour) >= CONVERT(time(0), @startTime)
Just change convert datetime to time that should do the trick:
SELECT timeEvent
FROM tbEvents
WHERE convert(time, startHour) >= convert(time, @startHour)
if (cast('2012-06-20 23:49:14.363' as time) between
cast('2012-06-20 23:49:14.363' as time) and
cast('2012-06-20 23:49:14.363' as time))
One (possibly small) issue I have noted with the solutions so far is that they all seem to require a function call to process the comparison. This means that the query engine will need to do a full table scan to seek the rows you are after - and be unable to use an index. If the table is not going to get particularly large, this probably won't have any adverse effects (and you can happily ignore this answer).
If, on the other hand, the table could get quite large, the performance of the query could suffer.
I know you stated that you do not wish to compare the date part - but is there an actual date being stored in the datetime column, or are you using it to store only the time? If the latter, you can use a simple comparison operator, and this will reduce both CPU usage, and allow the query engine to use statistics and indexes (if present) to optimise the query.
If, however, the datetime column is being used to store both the date and time of the event, this obviously won't work. In this case, if you can modify the app and the table structure, separate the date and time into two separate datetime columns, or create an indexed view that selects all the (relevant) columns of the source table, and a further column that contains the time element you wish to search for (use any of the previous answers to compute this) - and alter the app to query the view instead.
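A related variation, swapped in here for illustration rather than taken from the answer, is a persisted computed column that holds only the time and carries its own index. The sketch below assumes SQL Server 2008 or later for the time type; the temp table #EventsDemo, the startTime column and the index name are all hypothetical.

```sql
-- Hypothetical stand-in for tbEvents; startTime is an assumed computed column.
CREATE TABLE #EventsDemo (
    timeEvent INT IDENTITY PRIMARY KEY,
    startHour DATETIME NOT NULL,
    startTime AS CONVERT(time(0), startHour) PERSISTED
);
CREATE INDEX IX_EventsDemo_startTime ON #EventsDemo (startTime);

INSERT INTO #EventsDemo (startHour) VALUES ('2009-04-30T07:30:00');
INSERT INTO #EventsDemo (startHour) VALUES ('2009-04-10T08:00:00');

-- The predicate compares the stored time directly, with no function wrapped around the column.
SELECT timeEvent, startHour
FROM #EventsDemo
WHERE startTime >= CONVERT(time(0), '07:45:00');

DROP TABLE #EventsDemo;
```

With the time stored in its own column, the comparison is a plain range predicate that an index can support; whether the optimiser actually chooses that index depends on the data and is worth checking in the plan.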
Using float does not work.
DECLARE @t1 datetime, @t2 datetime
SELECT @t1 = '19000101 23:55:00', @t2 = '20001102 23:55:00'
SELECT CAST(@t1 as float) - floor(CAST(@t1 as float)), CAST(@t2 as float) - floor(CAST(@t2 as float))
You'll see that the values are not the same (SQL Server 2005). I wanted to use this method to check for times around midnight (the full method has more detail) in which I was comparing the current time for being between 23:55:00 and 00:05:00.
Adding to the other answers:
you can create a function for trimming the date from a datetime
CREATE FUNCTION dbo.f_trimdate (@dat datetime) RETURNS DATETIME AS BEGIN
RETURN CONVERT(DATETIME, CONVERT(FLOAT, @dat) - CONVERT(INT, @dat))
END
So this:
DECLARE @dat DATETIME
SELECT @dat = '20080201 02:25:46.000'
SELECT dbo.f_trimdate(@dat)
Will return
1900-01-01 02:25:46.000
Use Datepart function: DATEPART(datepart, date)
E.g.
SELECT (DatePart(hh, @YourVar) * 60) +
DatePart(mi, @YourVar)
This will give you total time of day in minutes allowing you to compare more easily.
You can use DateDiff if your dates are going to be the same, otherwise you'll need to strip out the date as above
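To make the minutes-of-day comparison concrete, here is a small sketch using the two times from the question (08:00:00 and 07:30:00) on different dates. The variable names and dates are illustrative assumptions.

```sql
-- Two datetimes on different days; only the time of day should be compared.
DECLARE @a DATETIME, @b DATETIME;
SET @a = '2009-04-30T08:00:00';
SET @b = '2009-04-10T07:30:00';

SELECT
    DATEPART(hh, @a) * 60 + DATEPART(mi, @a) AS a_minutes,  -- 480
    DATEPART(hh, @b) * 60 + DATEPART(mi, @b) AS b_minutes;  -- 450
```

Since 480 is greater than 450, 08:00 sorts after 07:30 regardless of the date parts.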
You can create two datetime variables and set only the time part that you need to compare.
declare @date1 datetime;
declare @date2 datetime;
select @date1 = CONVERT(varchar(20),CONVERT(datetime, '2011-02-11 08:00:00'), 114)
select @date2 = CONVERT(varchar(20),GETDATE(), 114)
The date will be "1900-01-01", so you can compare them:
if @date1 <= @date2
print '@date1 less than @date2'
else
print '@date1 more than @date2'
SELECT timeEvent
FROM tbEvents
WHERE CONVERT(VARCHAR,startHour,108) >= '01:01:01'
This tells SQL Server to convert the date/time value into a varchar using style 108, which is "hh:mm:ss". You can also replace '01:01:01' with another convert if necessary.
I believe you want to use DATEPART(hour, datetime).
Reference is here:
http://msdn.microsoft.com/en-us/library/ms174420.aspx
I don't love relying on storage internals (that datetime is a float with whole number = day and fractional = time), but I do the same thing as the answer by Jhonny D. Cano. This is the way all of the db devs I know do it. Definitely do not convert to string. If you must avoid processing as float/int, then the best option is to pull out hour/minute/second/milliseconds with DatePart().
I am assuming your startHour column and #startHour variable are both DATETIME; In that case, you should be converting to a string:
SELECT timeEvent
FROM tbEvents
WHERE convert(VARCHAR(8), startHour, 8) >= convert(VARCHAR(8), @startHour, 8)
The query below gives you the time portion of the date:
select DateAdd(day,-DateDiff(day,0,YourDateTime),YourDateTime) As NewTime from Table
@ronmurp raises a valid concern - the cast/floor approach returns different values for the same time. Along the lines of the answer by @littlechris and for a more general solution that solves for times that have a minute, seconds, milliseconds component, you could use this function to count the number of milliseconds from the start of the day.
Create Function [dbo].[MsFromStartOfDay] ( @DateTime datetime )
Returns int
As
Begin
Return (
( Datepart( ms , @DateTime ) ) +
( Datepart( ss , @DateTime ) * 1000 ) +
( Datepart( mi , @DateTime ) * 1000 * 60 ) +
( Datepart( hh , @DateTime ) * 1000 * 60 * 60 )
)
End
I've verified that it returns the same int for two different dates with the same time
declare @first datetime
set @first = '1900-01-01 23:59:39.090'
declare @second datetime
set @second = '2000-11-02 23:59:39.090'
Select dbo.MsFromStartOfDay( @first )
Select dbo.MsFromStartOfDay( @second )
This solution doesn't always return the int you would expect. For example, try the below in SQL 2005, it returns an int ending in '557' instead of '556'.
set @first = '1900-01-01 23:59:39.556'
set @second = '2000-11-02 23:56:39.556'
I think this has to do with the nature of DateTime stored as float. You can still compare the two numbers, though. And when I used this approach on a "real" dataset of DateTime captured in .NET using DateTime.Now and stored in SQL, I found that the calculations were accurate.
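One likely contributor to the 556 versus 557 discrepancy is the documented precision of the datetime type, which rounds fractional seconds to increments of .000, .003 or .007. The sketch below, with made-up column aliases, simply shows what a .556 literal looks like once it has been stored as a datetime.

```sql
-- Store a .556 value as datetime and read it back; datetime keeps time in 1/300-second ticks.
SELECT
    CAST('1900-01-01T23:59:39.556' AS DATETIME) AS stored_value,
    DATEPART(ms, CAST('1900-01-01T23:59:39.556' AS DATETIME)) AS ms_part;
```

If the rounding has already happened on the way into the column, two values with the same stored time still yield the same integer, so comparisons remain consistent.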
TL;DR
Separate the time value from the date value if you want to use indexes in your search (you probably should, for performance). You can: (1) use function-based indexes or (2) create a new column for time only, index this column and use it in your SELECT clause.
Keep in mind you will lose any index performance boost if you use functions in a SQL's WHERE clause, the engine has to do a scan search. Just run your query with EXPLAIN SELECT... to confirm this. This happens because the engine has to process EVERY value in the field for EACH comparison, and the converted value is not indexed.
Most answers say to use float(), convert(), cast(), addtime(), etc.. Again, your database won't use indexes if you do this. For small tables that may be OK.
It is OK to use functions in WHERE params though (where field = func(value)), because you won't be changing EACH field's value in the table.
In case you want to keep use of indexes, you can create a function-based index for the time value. The proper way to do this (and support for it) may depend on your database engine. Another option is adding a column to store only the time value and index this column, but try the former approach first.
Edit 06-02
Do some performance tests before updating your database to have a new time column or whatever to make use of indexes. In my tests, I found out the performance boost was minimal (when I could see some improvement) and wouldn't be worth the trouble and overhead of adding a new index.

Best way to check for current date in where clause of sql query

I'm trying to find out the most efficient (best performance) way to check date field for current date. Currently we are using:
SELECT COUNT(Job) AS Jobs
FROM dbo.Job
WHERE (Received BETWEEN DATEADD(d, DATEDIFF(d, 0, GETDATE()), 0)
AND DATEADD(d, DATEDIFF(d, 0, GETDATE()), 1))
WHERE
DateDiff(d, Received, GETDATE()) = 0
Edit: As lined out in the comments to this answer, that's not an ideal solution. Check the other answers in this thread, too.
If you just want to find all the records where the Received Date is today, and there are records with future Received dates, then what you're doing is (very very slightly) wrong... Because the Between operator allows values that are equal to the ending boundary, so you could get records with Received date = to midnight tomorrow...
If there is no need to use an index on Received, then all you need to do is check that the date diff with the current datetime is 0...
Where DateDiff(day, received, getdate()) = 0
This predicate is of course not SARGable so it cannot use an index...
If this is an issue for this query then, assuming you cannot have Received dates in the future, I would use this instead...
Where Received >= DateAdd(day, DateDiff(Day, 0, getDate()), 0)
If Received dates can be in the future, then you are probably as close to the most efficient as you can be... (Except change the Between to a >= AND < )
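The boundary difference between BETWEEN and a half-open range can be sketched with a throwaway table variable. The @Job table below is a hypothetical stand-in for dbo.Job, populated with one row for now, one at exactly midnight tomorrow, and one from yesterday.

```sql
-- Hypothetical stand-in for dbo.Job with three sample rows.
DECLARE @Job TABLE (Job INT, Received DATETIME);
INSERT INTO @Job VALUES (1, GETDATE());                                   -- today
INSERT INTO @Job VALUES (2, DATEADD(d, DATEDIFF(d, 0, GETDATE()), 1));    -- exactly midnight tomorrow
INSERT INTO @Job VALUES (3, DATEADD(d, -1, GETDATE()));                   -- yesterday

-- Half-open range: includes midnight today, excludes midnight tomorrow.
SELECT COUNT(Job) AS Jobs
FROM @Job
WHERE Received >= DATEADD(d, DATEDIFF(d, 0, GETDATE()), 0)
  AND Received <  DATEADD(d, DATEDIFF(d, 0, GETDATE()), 1);
```

Only the first row should be counted; with BETWEEN in place of the two comparisons, the midnight-tomorrow row would be counted as well.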
If you want performance, you want a direct hit on the index, without any CPU etc per row; as such, I would calculate the range first, and then use a simple WHERE query. I don't know what db you are using, but in SQL Server, the following works:
-- ... where @When is the date-and-time we have (perhaps from GETDATE())
DECLARE @DayStart datetime, @DayEnd datetime
SET @DayStart = CAST(FLOOR(CAST(@When as float)) as datetime) -- get day only
SET @DayEnd = DATEADD(d, 1, @DayStart)
SELECT COUNT(Job) AS Jobs
FROM dbo.Job
WHERE (Received >= @DayStart AND Received < @DayEnd)
that's pretty much the best way to do it.
you could put the DATEADD(d, DATEDIFF(d, 0, GETDATE()), 0) and DATEADD(d, DATEDIFF(d, 0, GETDATE()), 1) into variables and use those instead but i don't think that this will improve performance.
I'm not sure how you're defining "best" but that will work fine.
However, if this query is something you're going to run repeatedly you should get rid of the get_date() function and just stick a literal date value in there via whatever programming language you're running this in. Despite their output changing only once every 24 hours, get_date(), current_date(), etc. are non-deterministic functions, which means that your RDBMS will probably invalidate the query as a candidate for storing in its query cache if it has one.
How 'bout
WHERE
DATEDIFF(d, Received, GETDATE()) = 0
I would normally use the solution suggested by Tomalak, but if you are really desperate for performance the best option could be to add an extra indexed field ReceivedDatePartOnly - which would store the date without the time part - and then use the query
declare @today as datetime
set @today = datediff(d, 0, getdate())
select
count(job) as jobs
from
dbo.job
where
ReceivedDatePartOnly = @today
Compare two dates after converting into same format like below.
where CONVERT(varchar, createddate, 1) = CONVERT(varchar, getdate(), 1);

MS SQL Date Only Without Time

Question
Hello All,
I've had some confusion for quite some time with essentially flooring a DateTime SQL type using T-SQL. Essentially, I want to take a DateTime value of say 2008-12-1 14:30:12 and make it 2008-12-1 00:00:00. A lot of the queries we run for reports use a date value in the WHERE clause, but I either have a start and end date value of a day and use a BETWEEN, or I find some other method.
Currently I'm using the following:
WHERE CAST(CONVERT(VARCHAR, [tstamp], 102) AS DATETIME) = @dateParam
However, this seems kinda clunky. I was hoping there would be something more simple like
CAST([tstamp] AS DATE)
Some places online recommend using DATEPART() function, but then I end up with something like this:
WHERE DATEPART(year, [tstamp]) = DATEPART(year, @dateParam)
AND DATEPART(month, [tstamp]) = DATEPART(month, @dateParam)
AND DATEPART(day, [tstamp]) = DATEPART(day, @dateParam)
Maybe I'm being overly concerned with something small and if so please let me know. I just want to make sure the stuff I'm writing is as efficient as possible. I want to eliminate any weak links.
Any suggestions?
Thanks,
C
Solution
Thanks everyone for the great feedback. A lot of useful information. I'm going to change around our functions to eliminate the function on the left hand side of the operator. Although most of our date columns don't use indexes, it is probably still a better practice.
If you're using SQL Server 2008 it has this built in now, see this in books online
CAST(GETDATE() AS date)
that is very bad for performance, take a look at Only In A Database Can You Get 1000% + Improvement By Changing A Few Lines Of Code
functions on the left side of the operator are bad
here is what you need to do
declare @d datetime
select @d = '2008-12-1 14:30:12'
where tstamp >= dateadd(dd, datediff(dd, 0, @d)+0, 0)
and tstamp < dateadd(dd, datediff(dd, 0, @d)+1, 0)
Run this to see what it does
select dateadd(dd, datediff(dd, 0, getdate())+1, 0)
select dateadd(dd, datediff(dd, 0, getdate())+0, 0)
The Date functions posted by others are the most correct way to handle this.
However, it's funny you mention the term "floor", because there's a little hack that will run somewhat faster:
CAST(FLOOR(CAST(@dateParam AS float)) AS DateTime)
CONVERT(date, GETDATE()) and CONVERT(time, GETDATE()) works in SQL Server 2008. I'm uncertain about 2005.
How about this?
SELECT DATEADD(dd, DATEDIFF(dd,0,GETDATE()), 0)
Yes, T-SQL can feel extremely primitive at times, and it is things like these that often times push me to doing a lot of my logic in my language of choice (such as C#).
However, when you absolutely need to do some of these things in SQL for performance reasons, then your best bet is to create functions to house these "algorithms."
Take a look at this article. He offers up quite a few handy SQL functions along these lines that I think will help you.
http://weblogs.sqlteam.com/jeffs/archive/2007/01/02/56079.aspx
Careful here: if you use anything along the lines of WHERE CAST(CONVERT(VARCHAR, [tstamp], 102) AS DATETIME) = @dateParam, it will force a scan on the table and no indexes will be used for that portion.
A much cleaner way of doing this is defining a calculated column
create table #t (
d datetime,
d2 as
cast (datepart(year,d) as varchar(4)) + '-' +
right('0' + cast (datepart(month,d) as varchar(2)),2) + '-' +
right('0' + cast (datepart(day,d) as varchar(2)),2)
)
-- notice a lot of care need to be taken to ensure the format is comparable. (zero padding)
insert #t
values (getdate())
create index idx on #t(d2)
select d2, count(d2) from #t
where d2 between '2008-01-01' and '2009-01-22'
group by d2
-- index seek is used
This way you can directly check the d2 column and an index will be used and you don't have to muck around with conversions.
DATEADD(d, 0, DATEDIFF(d, 0, [tstamp]))
Edit: While this will remove the time portion of your datetime, it will also make the condition non SARGable. If that's important for this query, an indexed view or a between clause is more appropriate.
Alternatively you could use
declare @d datetime
select @d = '2008-12-1 14:30:12'
where tstamp
BETWEEN dateadd(dd, datediff(dd, 0, @d)+0, 0)
AND dateadd(dd, datediff(dd, 0, @d)+1, 0)
Here's a query that will return all results within a range of days.
DECLARE @startDate DATETIME
DECLARE @endDate DATETIME
SET @startDate = DATEADD(day, -30, GETDATE())
SET @endDate = GETDATE()
SELECT *
FROM table
WHERE dateColumn >= DATEADD(day, DATEDIFF(day, 0, @startDate), 0)
AND dateColumn < DATEADD(day, 1, DATEDIFF(day, 0, @endDate))
FWIW, I've been doing the same thing as you for years
CAST(CONVERT(VARCHAR, [tstamp], 102) AS DATETIME) = @dateParam
Seems to me like this is one of the better ways to strip off time in terms of flexibility, speed and readabily. (sorry). Some UDF functions as suggested can be useful, but UDFs can be slow with larger result sets.
WHERE DATEDIFF(day, tstamp, @dateParam) = 0
This should get you there if you don't care about time.
This is to answer the meta question of comparing the dates of two values when you don't care about the time.