In the queries I come across, every date is converted with the to_date function before any comparison. Sometimes this caused a "literal does not match format string" error, which had little to do with the format itself; the cause was explained here:
ORA-01861: literal does not match format string
My question is: is it really necessary to use date conversion? Why is it converted in the first place before applying any logical comparison?
An Oracle DATE is not just a calendar date: it always carries a time component as well. The problem is that the time portion of two dates can make them unequal even when they fall on the same day. (You can see the documentation here for information about the date data type.)
In general, we think that "2013-01-01" is equal to "2013-01-01". However, the first date might be "2013-01-01 01:00:00" and the second "2013-01-01 02:02:02". And they would not be equal. To make matters worse, they may look the same when they are printed out.
You don't actually have to convert the dates to strings in order to do such comparisons. You can also use the trunc() function. Such a transformation of the data is insurance against "invisible" time components of the data interfering with comparisons.
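The effect is not specific to Oracle. As a rough illustration outside the database, here is a minimal Python sketch (standard library only; the helper name truncate_to_day is made up for this example) using the two timestamps mentioned above. It shows values that print as the same day yet compare unequal until the time is reset to midnight, which is roughly what trunc() does for an Oracle DATE:

```python
from datetime import datetime

# The two example values from the text: same calendar day, different times.
first = datetime(2013, 1, 1, 1, 0, 0)
second = datetime(2013, 1, 1, 2, 2, 2)


def truncate_to_day(value: datetime) -> datetime:
    """Reset the time to midnight, loosely analogous to Oracle's trunc(date)."""
    return value.replace(hour=0, minute=0, second=0, microsecond=0)


# Printed with a date-only format, the two values look identical...
same_display = first.strftime("%Y-%m-%d") == second.strftime("%Y-%m-%d")
# ...but the underlying values differ because of the time component.
raw_equal = first == second
# After truncation the comparison behaves the way people usually expect.
truncated_equal = truncate_to_day(first) == truncate_to_day(second)

print(same_display, raw_equal, truncated_equal)
```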
You should really be storing dates as actual dates (or timestamps). If you have strings representing dates, you will often need to convert them using to_date (with a specified format, not relying on default formats). It really depends on what comparisons/date functionality you want. You're getting errors because you hit a value that does not conform to your specified format. This is also a good reason to specify a column as DATE to store dates. For example,
select to_date('123', 'MM-DD-YYYY') from dual;
will throw an ORA-01861. So you may have 99.9% of the rows as MM-DD-YYYY, but the 0.1% will cause you headaches.
Anyway, if you clean up those strings, you can do much more using to_date and date functions. For example:
select
(last_day(to_date('02-05-2009', 'MM-DD-YYYY')) - to_date('01-15-1998', 'MM-DD-YYYY')) as days_between_dates
from dual;
Not fun to do that with strings. Or maybe just find the most recent date:
select greatest( to_date('02-05-2009', 'MM-DD-YYYY'), to_date('12-01-1988', 'MM-DD-YYYY')) from dual;
Using string comparison would give the wrong answer:
select greatest('02-05-2009', '12-01-1988') from dual;
Just a few examples, but much better to treat dates as dates, not strings.
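The same trap exists in any language that compares strings character by character. The following Python sketch is only an analogue of the two greatest() queries above, not Oracle code; the format string "%m-%d-%Y" is assumed to be the Python counterpart of MM-DD-YYYY:

```python
from datetime import datetime

FMT = "%m-%d-%Y"  # assumed Python counterpart of the MM-DD-YYYY model
raw = ["02-05-2009", "12-01-1988"]

# String comparison goes character by character: "1" sorts after "0",
# so the 1988 value is reported as the "greatest".
latest_as_string = max(raw)

# Parsing first means real calendar values are compared.
latest_as_date = max(datetime.strptime(text, FMT) for text in raw)

print(latest_as_string)
print(latest_as_date.date())
```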
If you have a string that represents a date, use TO_DATE.
If you already have a date, use it directly.
What is the difference between referring to a Hive date as date '2020-08-25' vs. just '2020-08-25' without the word date? Are these two different data types, or are they exactly the same? This is something I might put into a where clause like below:
where somedate<=date'2020-08-25'
vs.
where somedate<='2020-08-25'
date'2020-08-25' is a standard date literal: it generates a legitimate value of the date datatype.
On the other hand, '2020-08-25' is a literal string (that represents a date, but it could be anything else).
Now what is the best pick for the predicate in your where clause? It depends on the datatype of the column you want to compare.
If you have a date column, then I would recommend using a literal date. Otherwise, you rely on the ability of the database to understand what you mean and implicitly convert the string to a date. Each database has its own set of rules to handle such cases: in the worst-case scenario, it might make the wrong decision and do a string comparison, which would require converting all stored values to strings (this depends on the database specification) and would kill the performance of the query.
If you happen to be storing dates as strings (which usually indicates a bad design), then string comparison is fine.
date '2020-08-25' is a literal of type DATE.
'2020-08-25' - is a string literal.
Strings and dates can be implicitly converted, so both of your where clauses are functionally identical. Depending on the column datatype, implicit conversion may happen. It is better to use the same type to avoid implicit conversion.
Also, DATE can be packed into a 4-byte integer in binary formats. See HIVE-3910
If I have a query like:
select *
from CAT_ACCT_AUDIT_TRAIL cataccount0_
where cataccount0_.CAAT_EXECUTED_DATE >=TO_DATE('26-AUG-2016', 'DD-MM-YYYY') AND
to_Date(TO_CHAR(cataccount0_.CAAT_EXECUTED_DATE , 'dd -mon-yyyy'), 'DD-MM-YYYY')<=TO_DATE('31-AUG-2016', 'DD-MM-YYYY')
Here why do we require the to_char or to_date functions? What is the right context to use them?
If I do either of these:
select TO_DATE('26-AUG-2016', 'DD-MM-YYYY') from dual;
select TO_DATE('01-12-2016', 'DD-MM-YYYY') from dual;
I get the output in the NLS format set in my session, irrespective of the input format given in the conversion; I get the same result for both. Why is this so?
What is the correct way to write this query? I mean, when I need to fetch the values in a date range.
You use to_date() to convert a string like '01-12-2016' to the date datatype. You use to_char() to convert a date to a string. In both cases you specify the format of the string - if you don't then your session NLS settings are used, which is not good practice for anything except ad hoc queries as someone else running your code later may get a different output or an error.
A general rule - which your code is following - is to compare data of one type with values/constants of the same type. As your column is a date, you're supplying the filter values as dates - by converting strings to the date datatype. If you didn't do that then implicit conversion would happen, but you should not rely on that either as it can also lead to NLS issues, and depending on the type it can prevent indexes being used. Read more about data conversion in the documentation.
Oracle tries to be flexible when interpreting the string when you do to_date(). When you do TO_DATE('26-AUG-2016', 'DD-MM-YYYY') you are supplying the month as a string (in a specific language, which is another topic), but telling the function to expect a number. Oracle 'helpfully' interprets that anyway, so it usually works. But whatever format you use for to_date(), you aren't specifying the display format, so your client is deciding how to display the converted date as a string for you - usually using your NLS settings, again.
Doing this:
to_Date(TO_CHAR(cataccount0_.CAAT_EXECUTED_DATE , 'dd -mon-yyyy'), 'DD-MM-YYYY')
is usually pointless, but even so should be using consistent format models. One reason this is sometimes done is if the source date (caat_executed_date here) has its time set to something other than midnight, and you want to discard the time. But there are better ways to do that - specifically the trunc() function, which by default sets the time to midnight.
When you have constant values, like TO_DATE('31-AUG-2016', 'DD-MM-YYYY'), you can also use ANSI date literals, in the form of DATE '2016-08-31'.
It is unclear what you want to do, but you don't actually need those functions on constants. Just use the date keyword for date literals. For instance:
where cataccount0_.CAAT_EXECUTED_DATE >= date '2016-08-26'
If you want to remove the time component from a date, then use trunc():
where trunc(cataccount0_.CAAT_EXECUTED_DATE, 'dd') -- the `'dd'` is optional for this purpose
This can be used in any context where a date constant is accepted.
I work in T-SQL but have been given some Oracle PL/SQL to review on a project.
Within the code there are multiple WHERE clauses that compare a field of data type DATE against strings which hold a "date".
ex:
WHERE to_date(mytable.mydatefield) > '23-OCT-2015'
OR
WHERE mytable.mydatefield > '23-OCT-2015'
Q1: Since "mydatefield" is already defined as a DATE type, isn't doing a "to_date" unnecessary?
Q2: Will Oracle do an implicit conversion on the '23-OCT-2015' and convert it to a date for comparison? I seem to remember encountering this before and comparing DATES to STRINGS caused issues?
Am I incorrect about that? If not can someone give me an example that I can use as evidence that it would not work?
A1: In general yes, but take into account the way Oracle handles implicit type conversions. The To_Date function around the mydatefield column expects a string input, so Oracle implicitly converts mydatefield to a string in the format given by the NLS_DATE_FORMAT session setting (which defaults to DD-MON-RR). To_Date then converts that string back to a date, again using the current NLS_DATE_FORMAT setting. The newly reconstituted date is then compared to the string '23-OCT-2015', but since dates and strings aren't directly comparable, the string value is implicitly converted to a date using the current NLS_DATE_FORMAT setting. Depending on the value of NLS_DATE_FORMAT, the first implicit conversion is likely to lose information, specifically any time portion and the original century, since the default NLS_DATE_FORMAT uses only a two-digit year (RR) and no time component.
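The information loss in that round trip can be imitated outside Oracle. This Python sketch is an assumption-laden analogue, not Oracle behaviour: the stored value is hypothetical, the format "%d-%m-%y" stands in for DD-MON-RR (with a numeric month so the output does not depend on locale), and Python's rule for expanding two-digit years differs in detail from Oracle's RR.

```python
from datetime import datetime

# Hypothetical stored value with a time portion and a pre-2000 century.
stored = datetime(1915, 10, 23, 14, 30, 0)

# Two-digit year and no time: a stand-in for the default DD-MON-RR format.
SHORT_FMT = "%d-%m-%y"

# Date -> string -> date, like to_date(mydatefield) with implicit conversion.
as_text = stored.strftime(SHORT_FMT)
round_tripped = datetime.strptime(as_text, SHORT_FMT)

print(as_text)
print(stored, "->", round_tripped)
# The time of day is reset to midnight and the century is guessed, not kept.
print(round_tripped == stored)
```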
A2: Possibly, but it's best not to rely on it.
Both comparisons are poor programming for a couple of reasons. First, both are affected by implicit type conversions from dates to strings (or vice versa). Second, both attempt to compare dates with strings in a non-canonical form. As such, 10-DEC-15 is less than 23-OCT-2015 because 1 is less than 2. Also note the difference in the number of digits representing the year, since the default NLS_DATE_FORMAT uses a two-digit year.
The correct method is to compare the date column (possibly truncated) to a date string explicitly converted to a date:
WHERE mytable.mydatefield > TO_DATE('23-OCT-2015', 'DD-MON-YYYY')
OR with truncation:
WHERE trunc(mytable.mydatefield) > TO_DATE('23-OCT-2015', 'DD-MON-YYYY')
which removes the time component of the date field.
Q1: According to Oracle, the first parameter of to_date() is a char value. Calling it as to_date(date_value) forces an implicit cast of date_value to char, which is then converted back into a date value.
Q2: The server will implicitly convert the string '23-OCT-2015' to a date value, but the conversion is based on database parameters that can differ between servers (DEV vs PROD, for example), so you should not rely on it. An example of correct usage would be WHERE mytable.mydatefield > to_date('23-OCT-2015','dd-MON-yyyy')
You should always use to_date/to_char to make sure you are using the correct format. Please see this answer for a more detailed explanation: Comparing Dates in Oracle SQL
I need to verify that all cells in a column contain data only in date format. How can I verify this?
*I don't think the LIKE function will do it.
DATE doesn't have any format. What you see is for display purpose so that it could be easily interpreted.
The DATE datatype is stored internally in a proprietary 7-byte format. It is a bad idea, and makes no sense, to verify the format when the date is stored in an internal format. As I said, the format is only for display.
If the date column is not a DATE data type, then it is a design flaw, and any application based on such a flawed database design is liable to break at any time.
Storing date values in anything other than the DATE data type shows a misunderstanding of the basics.
You should first fix the design to get a permanent solution. Any solution to your question is just another workaround.
Let me show a small example how it creates even more confusion.
The following date :
01/02/2015
Is it:
1st Feb 2015 or,
2nd Jan 2015
There is no way to tell that. It could be either DD or MM. This being just one among so many other problems due to the incorrect data type.
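To make the ambiguity concrete, here is a small Python sketch (standard library only, purely illustrative). It parses the same text under both assumptions, and neither parse fails, so nothing in the string itself says which reading was intended:

```python
from datetime import datetime

text = "01/02/2015"

# Both format assumptions accept the string without complaint.
as_day_first = datetime.strptime(text, "%d/%m/%Y").date()
as_month_first = datetime.strptime(text, "%m/%d/%Y").date()

print(as_day_first)    # read as day/month/year
print(as_month_first)  # read as month/day/year
```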
Store date values as DATE data type only, period.
Based on your last question, I think you are looking for something like this:
SELECT COUNT(*) FROM ...
WHERE NOT REGEXP_LIKE (A, '^XXX/MOSCOW/XXXMSX/[0-9]{4}-[0-9]{2}-[0-9]{2}$')
If count is greater than zero, something doesn't match. If you want more detail on what doesn't match, change your SELECT clause appropriately.
If you are looking for multiple date formats, you can change your regular expression appropriately. The | operator in most flavors of regular expression, including Oracle's, lets you define multiple patterns in the same space. You might use something like
SELECT COUNT(*) FROM ...
WHERE NOT
REGEXP_LIKE (A,
'^XXX/MOSCOW/XXXMSX/[0-9]{4}-[0-9]{2}-[0-9]{2}$|^[0-9]{4}-[0-9]{2}-[0-9]{2}$')
adding as many different matching patterns as you need.
Try
SELECT *
FROM POL
WHERE NOT REGEXP_LIKE(TR_KRY, '^(0[1-9]|([1-2][0-9])|30|31)-(([0][1-9])|10|11|12)-[0-9]{4}$')
This will return you all rows where TR_KRY is not formatted as 'DD-MM-YYYY', where DD is '01'-'31', MM is '01'-'12', and YYYY is any four numeric digits.
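One caveat worth keeping in mind: a pattern like this checks the shape of the text, not whether the value is a real calendar date. The Python sketch below reuses the same regular expression (the function names and sample values are made up for illustration) and compares it with an actual parse. A value such as 31-02-2015 passes the pattern but is not a valid date.

```python
import re
from datetime import datetime

# Same pattern as in the query above: DD-MM-YYYY shape only.
PATTERN = re.compile(
    r"^(0[1-9]|([1-2][0-9])|30|31)-(([0][1-9])|10|11|12)-[0-9]{4}$"
)


def looks_like_date(text: str) -> bool:
    """True if the text has the DD-MM-YYYY shape the pattern describes."""
    return PATTERN.match(text) is not None


def is_real_date(text: str) -> bool:
    """True only if the text parses as an existing calendar date."""
    try:
        datetime.strptime(text, "%d-%m-%Y")
    except ValueError:
        return False
    return True


# Hypothetical sample values.
for sample in ["15-08-2016", "31-02-2015", "2016/08/15"]:
    print(sample, looks_like_date(sample), is_real_date(sample))
```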
As others have said, storing dates as character strings is not a good idea. In the field you're looking at, it might be that the date is stored as DD-MM-YYYY (day-month-year - the usual case in Europe and perhaps elsewhere), or it might be that the date is stored as MM-DD-YYYY (month-day-year - a common practice in the US). If possible, I suggest you should convert this field to the DATE data type so that the TO_CHAR function can be used to produce a text version of the date in whatever format is desired.
Given the example data you've shown in comments (and that's also not good practice - you should go back and edit the question when you want to include additional information) it appears the dates are formatted as DD-MM-YYYY and I've set up the regular expression above to deal with this as best as possible.
I am writing to you because I can't use the operator to_date on an AS400 database.
With Oracle database, I use:
datefield >= to_date('01/01/2014','DD/MM/YYYY')
But with AS400, I get an error:
Incompatible operator
Is there another function I may use to replace to_date?
Assuming datefield is an actual date data type,
then all you need to do is use an ISO-formatted date string:
datefield >= '2014-01-01'
DB2 for IBM i will always recognize '2014-01-01' as a date.
But if you really want to explicitly convert it yourself, there are two functions:
DATE('2014-01-01')
CAST('2014-01-01' as DATE)
CAST is preferred for portability.
I recommend sticking with ISO format, though the system will recognize USA 'mm/dd/yyyy' and EUR 'dd.mm.yyyy'.
Reference here:
http://www-01.ibm.com/support/knowledgecenter/ssw_ibm_i_71/db2/rbafzdtstrng.htm
I realize this topic is old, but the current answer seemed mostly to ignore the original issue with TO_DATE, and instead offer a circumvention; of course the circumvention is IMO, a better approach. By addition of the message identifier and further explanation of the original issue and possible resolutions, hopefully those are beneficial to others in both locating this discussion as a match to their own issue and beneficial for the additional commentary provided.
The issue described in the OP is a reflection of the error condition SQL0401 [sqlcode -401] diagnosing that the data-type of the TO_DATE scalar is a TIMESTAMP whereas the DateField column data-type is a DATE [or so implied, although if the OP had included the DDL for the TABLE, the reviewers could be assured that "datefield" is indeed a column of the DATE data type].
In v5r3 the "Cause" is described by the text "Date, time, and timestamp operands are compatible with character operands or with another operand of the same type."; FWiW the USEnglish [first-level] text likely would have been "Comparison operator >= operands not compatible.", rather than just "Incompatible operator" as was noted in the OP. Even by v7r1, the documentation suggests no change for the SQL0401: http://www.ibm.com/support/knowledgecenter/ssw_ibm_i_71/rzala/rzalaml.htm
"...
Date, time, and timestamp operands are compatible with character and graphic operands or with another operand of the same type.
..."
Despite the name of the scalar function, for what might seem the logical effect given that moniker, the scalar result is not a DATE data type; the effect is instead reflective of the scalar function name TIMESTAMP_FORMAT, thus yielding a TIMESTAMP scalar result. The moniker TO_DATE is merely a synonym\syntax-alternative:
http://www.ibm.com/support/knowledgecenter/api/content/ssw_ibm_i_71/db2/rbafzscatsformat.htm
In the originally described scenario, datefield >= to_date('01/01/2014','DD/MM/YYYY'), for which a non-standard date format is coded, the error could be prevented by explicitly casting the result of the TO_DATE scalar to a DATE type, that is, by wrapping the TO_DATE result in another [casting] scalar: either the DATE casting scalar, datefield >= DATE(to_date('01/01/2014','DD/MM/YYYY')), or the CAST scalar, datefield >= CAST(to_date('01/01/2014','DD/MM/YYYY') as DATE)
Of course the other alternatives of using a character-string formatted as one of the standards date formats [e.g. *ISO as Charles suggested] is probably just as simple; even if that usage is not as explicitly revealing [to a reviewer of the statement] as would be the format-string specified as the second argument on the TO_DATE(). But per the specification originally shown as 'DD/MM/YYYY', the preference may be to use the *EUR standard formatting for which the format is 'DD.MM.YYYY'; i.e. coded as datefield >= '01.01.2014'
Note that in addition to the Date strings documentation reference http://www.ibm.com/support/knowledgecenter/ssw_ibm_i_71/db2/rbafzdatestrings.htm, there is another alternative not mentioned on that page which is a somewhat redundant form of the *ISO formatted character-string, DATE '2014-01-01' and almost the same [both in specification and redundancy; the alternative merely saves typing the parentheses] as the DATE [casting] scalar specification DATE('2014-01-01') already mentioned elsewhere in this topic. Thus each of datefield >= DATE'2014-01-01' or datefield >= '2014-01-01' or datefield >= DATE('2014-01-01') are all equivalent, and each is depending on *ISO formatting of the character-string as the date representation.