Trying to convert a datetime-formatted column (example value: 12-11-2020 18:15:06), which is actually an nvarchar, into this date format: yyyymmdd
This is what I tried so far but I'm getting the following error:
What am I doing wrong?
There are many problems here.
Dates should not be stored as strings.
You lose the ability to perform any kind of date math or extract date parts.
You lose built-in validation (both invalid dates like February 31st and any garbage that doesn't even look like a date).
For example, we have no idea if 12-11-2020 is December 11th or November 12th, or if the data was entered consistently. Imagine a person from the US did some of the data entry and a colleague from Germany did the rest.
FORMAT() is the most expensive way to format a date (see this and this - these are articles about removing time altogether, but the overhead with FORMAT() is the same).
An index on MYDATE can't be used to satisfy this query, so you will do a full table scan every time, which will get worse and worse as the table grows.
You can perform your query in your given scenario without changing anything, but I highly recommend you fix the column and make sure data entry can't be arbitrary (use a date picker or calendar control so you dictate a consistent format).
If 12-11-2020 is December 11th:
WHERE TRY_CONVERT(date, MYDATE, 110) >= @DateVariable;
If 12-11-2020 is November 12th:
WHERE TRY_CONVERT(date, MYDATE, 105) >= @DateVariable;
Note that this still might not get the correct and logical results if some people thought they entered December 11th and others thought they entered November 12th.
You can see all the valid style numbers for CONVERT/TRY_CONVERT here.
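To make the ambiguity concrete outside of SQL Server, here is a minimal Python sketch (not T-SQL; the function name is made up for illustration). It parses the example value under both interpretations and emits yyyymmdd. The two format strings play roughly the role that styles 110 and 105 play in TRY_CONVERT.

```python
from datetime import datetime

def to_yyyymmdd(text, month_first):
    """Parse 'mm-dd-yyyy hh:mm:ss' or 'dd-mm-yyyy hh:mm:ss' and return yyyymmdd.

    month_first is an assumption the caller must supply; the string itself
    cannot tell us which interpretation is right.
    """
    fmt = "%m-%d-%Y %H:%M:%S" if month_first else "%d-%m-%Y %H:%M:%S"
    return datetime.strptime(text, fmt).strftime("%Y%m%d")

value = "12-11-2020 18:15:06"
as_month_first = to_yyyymmdd(value, month_first=True)   # read as December 11th
as_day_first = to_yyyymmdd(value, month_first=False)    # read as November 12th
```

Both calls succeed, which is exactly the problem: nothing in the data tells you which result is the one the person entering it intended.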
I have an nvarchar(100) column which has a value ' 8/11/2022'.
I receive an error when trying to convert it to date...
select convert(date,[date],103)
from [Source].[TableName] s_p
--Msg 241, Level 16, State 1, Line 96
--Conversion failed when converting date and/or time from character string.
I have tried a number of different approaches, but I can't find one that gives me '08/11/2022'
select Date = REPLACE(LEFT([Date],10),' ','0')
from [Source].[TableName] s_p
--Outcome 8/11/2022
select REPLACE([DATE],' 8/','08/')
from [Source].[TableName] s_p
--Outcome 8/11/2022
select convert(nvarchar,[date],103)
from [Source].[TableName] s_p
--Outcome 8/11/2022
The strange thing is when I copy and paste from the results grid then do a replace it works fine...
select REPLACE(' 8/11/2022',' 8/','08/')
--Outcome 08/11/2022
Please help me get to '08/11/2022', with a leading 0 added to any single-digit value.
Thanks, Will
Different languages and cultures have their own formatting preferences around date values. Some places like M/dd/yyyy. Some places like dd/MM/yyyy. Or perhaps d-M-YYYY (different separators and conventions around leading zeros). The point is it's not okay to go into a place and impose our own preferences and norms on that culture.
The SQL language is no different. It really is its own language, and as such has its own expectations around date handling. If you violate these expectations, you should not be surprised when there are misunderstandings as a result.
The first expectation is for date and datetime values to be stored in datetime columns. It's hard to overstate how much of a difference this can make for performance and correctness.
But let's assume that's not an option, and you have no choice but to use a string column like varchar or nvarchar. In that situation, there is still an expectation around how date values should be formatted.
Any database will do better if you use a format that stores the date parts from most significant to least significant (year first, then month, day, and time). For example, ISO-8601: yyyy-MM-ddTHH:mm:ss[.fff]. This is important to allow greater than/less than comparisons to work, it can greatly help with indexes and performance, and it makes cast/convert operations to datetime values MUCH more likely to succeed and be accurate.
For SQL Server specifically, there are three acceptable formats:
yyyy-MM-ddTHH:mm:ss[.fff],
yyyyMMdd HH:mm:ss[.fff], and
yyyyMMdd.
Anything else WILL have date values that don't parse as expected. Any string manipulation done to call the CONVERT() method should focus on reaching one of these formats.
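As a quick illustration of why most-significant-first ordering matters for comparisons, here is a small Python sketch (not SQL; the sample values are made up). It sorts the same three dates as plain strings in two layouts.

```python
# The same three dates written two ways; these sample values are invented.
iso = ["2022-11-08", "2021-12-31", "2022-02-13"]   # yyyy-MM-dd
dmy = ["08/11/2022", "31/12/2021", "13/02/2022"]   # dd/MM/yyyy

# Plain string sorting, which is all a varchar comparison can do.
iso_sorted = sorted(iso)
dmy_sorted = sorted(dmy)
```

Here iso_sorted comes out in chronological order, while dmy_sorted is ordered by day of the month and puts December 2021 last.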
With that in mind, and assuming 8/11/2022 means November 8 and not August 11 (given the 103 convert format), you need something like this:
convert(datetime,
right([date], charindex('/', reverse([date]))-1) -- year
+ right('0' + replace(substring([date], charindex('/', [date])+1, 2), '/', ''), 2) -- month
+ right('0' + left([date], charindex('/',[date])-1),2) -- day
)
And you can see it work here:
https://dbfiddle.uk/lM8sVySh
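If the cleanup can happen before the data reaches the database, the same idea is much shorter in application code. This is a minimal Python sketch rather than T-SQL, assuming day-first d/M/yyyy input; the function name is hypothetical and it performs no validation of the resulting date.

```python
def dmy_to_yyyymmdd(text):
    """Turn a d/M/yyyy string (day first, by assumption) into yyyyMMdd."""
    # strip() removes surrounding whitespace, including non-breaking spaces.
    day, month, year = text.strip().split("/")
    # zfill(2) adds the leading zero to single-digit days and months.
    return f"{year}{month.zfill(2)}{day.zfill(2)}"

padded = dmy_to_yyyymmdd(" 8/11/2022")
```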
Yes, that's a lot of code. It's also gonna be more than a little slow. And again, the reason why it's so slow and complicated is you jumped in with your own cultural expectations and failed to respect the language of the platform you're using.
Finally, I need to question the premise. As the fiddle above shows, SQL Server is perfectly happy to convert this specific value without error. This tells me you probably have more rows, and any error is in fact coming from a different row.
With that in mind, one thing to remember is a WHERE clause condition will not necessarily run or filter a table before a CONVERT() operation in the SELECT clause. That is, if you have many different kinds of value in this table, you cannot guarantee your CONVERT() expression will only run on the date values, no matter what kind of WHERE clause you have. Databases do not guarantee order of operations in this way.
The problem could also be some invisible Unicode whitespace.
Another possibility is date formats. Most cultures that prefer a leading day, instead of month or year, tend to also strongly prefer to see the leading 0 in the first place. That the zero is missing here makes me wonder if you might have a number of dates in the column that were formatted by, say, Americans. So then you try to parse a column with values both like 02/13/2022 and 13/02/2022. Obviously those dates can't both use the same format, since there is no 13th month.
In that case, best of luck to you, because you no longer have any way to know for certain whether 2/3/2022 means March 2nd or February 3rd... and trying to guess (by say, assuming your own common format) is just exacerbating the same mistake that got you into this mess in the first place.
It's worth noting all three of these possibilities would be avoided had you used DateTime columns from the beginning.
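The invisible-whitespace possibility would explain why REPLACE appears to do nothing on the column but works on a pasted literal. The Python sketch below reproduces that effect with a NO-BREAK SPACE (U+00A0); which character is really in the column is an assumption, since the question doesn't show it.

```python
import unicodedata

# Hypothetical stored value: starts with U+00A0, which looks like a space.
value = "\u00a08/11/2022"

# Replacing an ordinary space finds nothing, so the value comes back unchanged.
unchanged = value.replace(" 8/", "08/")

# Naming the first character reveals what it actually is.
first_char = unicodedata.name(value[0])

# Normalizing the odd character first makes the original replacement work.
fixed = value.replace("\u00a0", " ").replace(" 8/", "08/")
```

In SQL Server the equivalent inspection would be to look at the code point of the first character of the stored value before deciding what to replace.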
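You'll want to left-pad the string with 0 (SQL Server has no LPAD function, so use something like RIGHT('0' + value, 2) on each part), then CAST() the string as date if you want to change to the date data type.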
I want to create a column of data type having only 'mm-dd' values.
Is it possible and if yes how should I do it?
Note: Instead of "2022-06-07", I want "07-06"
There is no date type that can store that format - in fact none of the date types store a date and/or time in any of the formats you typically recognize.
For your specific requirement, that looks like a char(5) for the data type, but how you constrain it so that it will only accept valid date values, I have no idea. You'd think this would work:
CHECK (TRY_CONVERT(date, string_column + '-2022', 105) IS NOT NULL)
But what about leap years? February 29th is sometimes valid, but you've thrown away the only information that can make you sure. What a bunch of mess to store your favorite string and trust that people aren't putting garbage in there.
Honestly I would store the date as a date, then you can just have a computed column (or a column in a view, or just do this at query time):
d_slash_m_column AS CONVERT(char(5), date_column, 105)
Why not just in your query (or only in a view) say:
[output] = CONVERT(char(5), data_in_the_right_type, 105)
?
I'd personally stay away from FORMAT(), for reasons I've described here:
FORMAT() is nice and all, but…
FORMAT is a convenient but expensive function - Part 1
FORMAT is a convenient but expensive function - Part 2
You can use the SQL Server FORMAT function:
FORMAT(col1, 'dd/MM')
Check the demo here.
In such cases using char or varchar is not the best option as in those cases the underlying DB constraints that validate the integrity of the data do not kick in.
Best option is to use an arbitrary year and then put in a proper date, so for example for storing 01-Jan, the db column should store proper date with year as any arbitrary value, e.g. 2000. So your db should say 2000-01-01.
With such a solution you are still able to rely on the DB to raise an error if you tried month 13. Similarly sorting will work naturally as well.
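The arbitrary-year idea can be sketched in a few lines of Python (not SQL; the names below are illustrative). One detail worth noting: picking a leap year such as 2000 as the anchor means 29 February is accepted, which addresses the leap-year concern raised earlier.

```python
from datetime import date

ANCHOR_YEAR = 2000  # arbitrary, but a leap year so that 29 February is accepted

def parse_day_month(text):
    """Parse 'dd-mm' into a date in ANCHOR_YEAR, or return None if invalid."""
    try:
        day, month = (int(part) for part in text.split("-"))
        return date(ANCHOR_YEAR, month, day)
    except ValueError:
        # Raised for non-numeric parts, the wrong number of parts,
        # or a day/month combination that does not exist.
        return None
```

Because the result is a real date, month 13 or 31 February is rejected by the type itself, and sorting works without any extra effort.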
I know that ACCESS's time format depends on your Windows time settings. I use ISO-8601 format (YYYYMMDD) so that I can get away with SQL WHERE statements like this one:
WHERE dates > #2020/02/15#
AND dates < #2021/01/30#
If I run the code from above in another computer, whose Windows time settings are for example DDMMYYYY, will the SQL statement no longer work? I could simply do something like this to solve that problem (will it though?):
WHERE dates BETWEEN Format(date1, "\#YYYY\/MM\/DD\#") AND Format(date2, "#YYYY\/MM\/DD\#")
EDIT: Time format has been changed as pointed out by @Gustav. The question remains; will the first WHERE Statement no longer work on different Windows time settings? Will the second correct the problem?
In Access SQL, use octothorpes:
WHERE dates > #2020/02/15#
AND dates < #2021/01/30#
WHERE dates BETWEEN Format(date1, "\#YYYY\/MM\/DD\#") AND Format(date2, "\#YYYY\/MM\/DD\#")
Nope, Windows time settings will mess with a lot of things, but not with ordering or comparisons with dates.
As long as the field is defined as a date (so with octothorpes, like Gustav said), the 2nd of February 2021 will be less than the 11th of February 2021, even though that wouldn't be the case if you cast them to a string first.
Always try to keep columns as they are when filtering, so if dates is actually a date column (and not a formatted string), just use WHERE dates BETWEEN #2020/02/15# AND #2021/01/30#, no formats, no funky stuff. And note that especially when trying to keep your application working in all locales, it's important to avoid casting dates to strings, which can happen if you compare a date with a formatted string.
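The difference between comparing dates and comparing their string forms is easy to demonstrate. This is a Python sketch rather than Access SQL, using the two February dates from the answer and an unpadded d/m/yyyy rendering chosen for illustration.

```python
from datetime import date

earlier = date(2021, 2, 2)
later = date(2021, 2, 11)

def as_text(d):
    """Render a date as unpadded d/m/yyyy text (an illustrative format)."""
    return f"{d.day}/{d.month}/{d.year}"

# Compared as dates, the 2nd comes before the 11th.
typed_comparison = earlier < later

# Compared as text, "2/2/2021" sorts after "11/2/2021" because '2' > '1'.
text_comparison = as_text(earlier) < as_text(later)
```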
I need to verify that all cells in a column contain data only in date format. How is it possible to verify this?
*I don't think the LIKE function will do it.
DATE doesn't have any format. What you see is for display purpose so that it could be easily interpreted.
DATE datatype is stored in a proprietary format internally in 7 bytes. It is a bad idea and makes no sense to verify the format while date is stored in an internal format. As I said, format is only for display.
If the date column is not a DATE data type, then it is a design flaw. And any application based on such a flawed database design is on the verge of breaking at any time.
Storing date values in anything other than the DATE data type means the basics have been misunderstood.
You should first fix the design to get a permanent solution. Any solution to your question is just another workaround.
Let me show a small example how it creates even more confusion.
The following date :
01/02/2015
Is it:
1st Feb 2015 or,
2nd Jan 2015
There is no way to tell that. It could be either DD or MM. This being just one among so many other problems due to the incorrect data type.
Store date values as DATE data type only, period.
Based on your last question, I think you are looking for something like this:
SELECT COUNT(*) FROM ...
WHERE NOT REGEXP_LIKE (A, '^XXX/MOSCOW/XXXMSX/[0-9]{4}-[0-9]{2}-[0-9]{2}$')
If count is greater than zero, something doesn't match. If you want more detail on what doesn't match, change your SELECT clause appropriately.
If you are looking for multiple date formats, you can change your regular expression appropriately. The | operator in most flavors of regular expression, including Oracle's, lets you define multiple patterns in the same space. You might use something like
SELECT COUNT(*) FROM ...
WHERE NOT
REGEXP_LIKE (A,
'^XXX/MOSCOW/XXXMSX/[0-9]{4}-[0-9]{2}-[0-9]{2}$|^[0-9]{4}-[0-9]{2}-[0-9]{2}$')
adding as many different matching patterns as you need.
Try
SELECT *
FROM POL
WHERE NOT REGEXP_LIKE(TR_KRY, '^(0[1-9]|([1-2][0-9])|30|31)-(([0][1-9])|10|11|12)-[0-9]{4}$')
This will return you all rows where TR_KRY is not formatted as 'DD-MM-YYYY', where DD is '01'-'31', MM is '01'-'12', and YYYY is any four numeric digits.
As others have said, storing dates as character strings is not a good idea. In the field you're looking at, it might be that the date is stored as DD-MM-YYYY (day-month-year - the usual case in Europe and perhaps elsewhere), or it might be that the date is stored as MM-DD-YYYY (month-day-year - a common practice in the US). If possible, I suggest you should convert this field to the DATE data type so that the TO_CHAR function can be used to produce a text version of the date in whatever format is desired.
Given the example data you've shown in comments (and that's also not good practice - you should go back and edit the question when you want to include additional information) it appears the dates are formatted as DD-MM-YYYY and I've set up the regular expression above to deal with this as best as possible.
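One limitation is worth spelling out: a pattern like this checks the shape of the text, not whether the date exists. The Python sketch below (not Oracle SQL; the function names are illustrative) uses the same DD-MM-YYYY pattern and contrasts it with an actual parse.

```python
import re
from datetime import datetime

# Same shape check as the REGEXP_LIKE pattern above, simplified slightly.
DD_MM_YYYY = re.compile(r"^(0[1-9]|[12][0-9]|30|31)-(0[1-9]|10|11|12)-[0-9]{4}$")

def looks_like_date(text):
    """True if the text has the DD-MM-YYYY shape."""
    return DD_MM_YYYY.fullmatch(text) is not None

def is_real_date(text):
    """True if the text has the right shape and names a calendar date that exists."""
    if not looks_like_date(text):
        return False
    try:
        datetime.strptime(text, "%d-%m-%Y")
        return True
    except ValueError:
        return False
```

For example, 31-02-2020 passes the pattern but is not a real date, which is one more reason to prefer the DATE data type over a regular expression check.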
I have a form containing two text boxes for user input. Both text boxes have the Property format set to "Short Date". One is the "start date", and the other is the "end date". I also have several tables, each with a DateTime field ("studystartdatetime"). I would like to be able to query these tables, but restrict the results to rows whose DateTime fields are between the entered dates (inclusive). Currently, the condition is:
WHERE s.studystartdatetime BETWEEN forms!frmMain!txtstartdate AND forms!frmmain!txtenddate
This, however, does not return rows which occurred on the enddate specified.
I have tried every combination of CDate, Format, and DateValue that I could think of in which to wrap one or all of these fields, but I always receive the same cryptic error:
The expression is typed incorrectly, or it is too complex to be evaluated. For example, a numeric expression may contain too many complicated elements. Try simplifying the expression by assigning parts of the expression to variables.
Some examples of conditions I have tried:
WHERE CDate(Format(s.studystartdatetime, "yyyy/mm/dd")) BETWEEN forms!frmMain!txtstartdate AND forms!frmmain!txtenddate
WHERE DateValue(Format(s.studystartdatetime, "yyyy/mm/dd")) BETWEEN forms!frmMain!txtstartdate AND forms!frmmain!txtenddate
WHERE CDate(Format(s.studystartdatetime, "yyyy/mm/dd")) BETWEEN CDate(Format(forms!frmMain!txtstartdate, "yyyy/mm/dd")) AND CDate(Format(forms!frmmain!txtenddate, "yyyy/mm/dd"))
WHERE DateValue(Format(s.studystartdatetime, "yyyy/mm/dd")) BETWEEN CDate(Format(forms!frmMain!txtstartdate, "yyyy/mm/dd")) AND CDate(Format(forms!frmmain!txtenddate, "yyyy/mm/dd"))
WHERE DateValue(Format(s.studystartdatetime, "Short Date")) BETWEEN forms!frmMain!txtstartdate AND forms!frmmain!txtenddate
Etc.
Any input into this would be greatly appreciated :)
What's happening is that your short date inputs are producing datetime values at midnight on the start of the day the user entered. So, the range 2009-1-1 to 2009-1-10 (or whatever short date format is used on your system) is searching for events from the very start of January 1st to the very start of January 10th and excluding the events that happened later on January 10th.
To correct, add 1 to the end date the user puts into the search. This will search from the very start of January 1st to the very start of January 11th, including all events on the 10th of January.
Finally, events that occurred at exactly midnight of January 11th can slip in to your results this way, so instead of using BETWEEN you should use
studystartdatetime >= forms!frmMain!txtStartDate AND studystartdatetime < forms!frmMain!txtEndDate + 1
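The half-open range is easier to reason about with concrete values. This Python sketch (not Access SQL; the sample dates are invented) compares an inclusive BETWEEN-style check with the start-inclusive, end-exclusive check recommended above.

```python
from datetime import datetime, timedelta

def in_range(moment, start_day, end_day):
    """True if moment falls on any day from start_day through end_day."""
    # Inclusive at the start, exclusive at midnight after the last day.
    return start_day <= moment < end_day + timedelta(days=1)

# Short-date inputs yield midnight at the start of the chosen day.
start = datetime(2009, 1, 1)
end = datetime(2009, 1, 10)

late_on_last_day = datetime(2009, 1, 10, 23, 59)
midnight_after = datetime(2009, 1, 11)

# What BETWEEN does with the raw inputs: the late event is missed.
between_raw = start <= late_on_last_day <= end

# What BETWEEN does after adding a day: midnight on the 11th slips in.
between_plus_one = start <= midnight_after <= end + timedelta(days=1)
```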
Larry's answer was the correct answer for you, but let me draw out some of the issues raised here.
You need to distinguish between date format and date storage. In the Jet/ACE database engine (Access's default database engine), dates are stored as an integer for the day and a decimal portion for the time. This is why you can add a number (whole or fractional) to a date and get a correct result, because the whole number part of the underlying representation of the date represents the days since Dec. 30, 1899 (the reason why it's not Dec. 31st is complicated -- somebody screwed up in calculating leap years, and so a whole bunch of programs were written with the wrong assumptions about when Dec. 31st, 1899 actually was).
"short date" is a date format, the standard m/d/yy (or m/d/yyyy, depending on your local settings in both Windows and Access). It has nothing to do with the actual underlying date values stored in your table, but it can have a huge effect if you work with the results of formatting. For instance, Format(Date(), "m/d/yyyy") returns a string, not a date value. It's a string that can be implicitly coerced to a date value, and one very often relies on that happening transparently. But you still have to understand that the Format() function returns a string, and that string won't always be treated as a date.
Jet/ACE SQL expects formatted dates to be passed in American order, the counter-intuitive m/d/yyyy, instead of the more logical d/m/yyyy or, better still, the ISO standard yyyy/m/d. Because of this, any time you are running your app with a non-US locale set for Windows, you need to be explicit about your dates. This means casting your dates to a non-ambiguous format (d/mmm/yyyy works because it specifies the day in digits and the month in letters), or processing all your dates with the DateSerial() function. This applies to date criteria in your WHERE clause, or anywhere in your SELECT statement that you are doing date calculations -- pass the date into the functions in a non-ambiguous format or with DateSerial() and you will avoid this problem.
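Two of these points can be sketched in Python (not VBA or Access SQL). The first function models the day-count storage described above for non-negative values only. The second is a hypothetical helper that builds a year-first #yyyy/mm/dd# literal, the form used in the WHERE examples earlier, when a query is assembled as text.

```python
from datetime import date, datetime, timedelta

# Day zero of the serial-number scheme described above.
ACCESS_EPOCH = datetime(1899, 12, 30)

def from_serial(serial):
    """Convert a non-negative day count (fraction = time of day) to a datetime."""
    return ACCESS_EPOCH + timedelta(days=serial)

def to_access_literal(d):
    """Hypothetical helper: format a date as a #yyyy/mm/dd# literal."""
    return f"#{d:%Y/%m/%d}#"

# Adding 1 to a stored value moves it forward exactly one day.
one_day_apart = from_serial(45.0) - from_serial(44.0)

literal = to_access_literal(date(2020, 2, 15))
```

Because the year comes first and has four digits, the literal cannot be misread as day-first or month-first regardless of the Windows regional settings.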