How does Calcite deal with data conversion?

I am trying to convert a date that's stored as a string to a date, e.g.
YYYYMMDD (string) to YYYY-MM-DD (date)
As far as I know, there is no conversion function that checks both the input format and the output format, so I tried manual logic, e.g.
CASE
  WHEN CHAR_LENGTH(TRIM(some_string_date)) = 8
  THEN
    CAST(
      SUBSTRING(TRIM(some_string_date) FROM 1 FOR 4)
      || '-'
      || SUBSTRING(TRIM(some_string_date) FROM 5 FOR 2)
      || '-'
      || SUBSTRING(TRIM(some_string_date) FROM 7 FOR 2)
    AS DATE)
  ELSE
    NULL
END
However, this is not accepted by the Apache Calcite SQL validator. Does anyone see the problem here?

This does not directly answer the question, but it may be related: date literals are declared with the DATE keyword, e.g. DATE '2001-01-01'. You can see examples in the Beam tests and in the Calcite docs.
Update:
What seems to happen is that Calcite adds some indirection when handling CASE. Casting the strings to dates works as expected in general. For example, if input rows have schema (INT f_int, VARCHAR f_string) and dates are in 'YYYYMMDD' format (e.g. (1, '2018')), then this works:
SELECT f_int,
  CAST(
    SUBSTRING(TRIM(f_string) FROM 1 FOR 4)
    || '-'
    || SUBSTRING(TRIM(f_string) FROM 5 FOR 2)
    || '-'
    || SUBSTRING(TRIM(f_string) FROM 7 FOR 2) AS DATE)
FROM PCOLLECTION
Even directly casting the 'YYYYMMDD' works:
SELECT f_int,
CAST(f_string AS DATE)
FROM PCOLLECTION
You can see all supported date formats here.
But as soon as you wrap it in CASE ... ELSE NULL, Beam/Calcite seem to infer that the type of the whole expression is 'String'. This means that THEN CAST(... AS DATE) succeeds and returns a 'Date', but the result is then converted to a 'String' when it is wrapped in the CASE. Then, when returning the result in my test, it seems to try to cast it back to a 'Date', but the string is no longer in 'YYYYMMDD' format; it is in some other default format. Unfortunately, that format is not in the list of supported formats, so the cast fails.
Workaround:
As soon as you change ELSE NULL to something that is known to be a 'Date', e.g. ELSE DATE '2001-01-01', it works again, as Beam/Calcite don't seem to go through the 'String'->'Date'->'String'->'Date' path. This works:
SELECT f_int,
  CASE WHEN CHAR_LENGTH(TRIM(f_string)) = 8
    THEN CAST(
      SUBSTRING(TRIM(f_string) FROM 1 FOR 4)
      || '-'
      || SUBSTRING(TRIM(f_string) FROM 5 FOR 2)
      || '-'
      || SUBSTRING(TRIM(f_string) FROM 7 FOR 2) AS DATE)
    ELSE DATE '2001-01-01'
  END
FROM PCOLLECTION
I filed BEAM-5789 to track a better solution.
Update 2:
So, while Calcite generates the plan telling Beam what to do, it's Beam that actually casts/parses the dates in this case. There's an effort to use Calcite's built-in implementations of basic operations instead of re-implementing everything in Beam:
https://github.com/apache/beam/pull/6417 . After this pull request is merged, this CASE ... ELSE NULL path should work automatically, if I'm reading it right (I assume this class will be used for handling date/time values). It will still go through strings, probably unnecessarily, but it should just work.

If this is MySQL, you may try:
SELECT DATE_FORMAT(STR_TO_DATE('20080908', '%Y%m%d'), "%Y-%m-%d");
To validate, you can check whether the string was converted to a date successfully; a NULL result means the conversion failed.
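If the conversion has to happen in client code rather than in the database, the same parse-then-reformat idea can be sketched in Python. This is only an illustration under assumptions: the function name yyyymmdd_to_iso is made up here, and None stands in for the NULL that SQL would return on failure.

```python
from datetime import datetime
from typing import Optional


def yyyymmdd_to_iso(value: Optional[str]) -> Optional[str]:
    """Return 'YYYY-MM-DD' for a valid 'YYYYMMDD' string, otherwise None."""
    if value is None:
        return None
    text = value.strip()
    # Check length and digits first, mirroring the CHAR_LENGTH(...) = 8 guard
    if len(text) != 8 or not text.isdigit():
        return None
    try:
        return datetime.strptime(text, "%Y%m%d").date().isoformat()
    except ValueError:
        # Eight digits that do not form a real calendar date, e.g. month 13
        return None
```

Because failures collapse to None, a caller can count or list the rejected inputs instead of aborting on the first bad value.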

Related

Convert varchar/timestamp col to Date field

I have a varchar2 datatype field (lmp_date) that can return either null or what looks like a timestamp value. Changing the database data_type to DATE isn't a possibility, so now I'm needing to convert this to a date, but with the values this column returns, I'm having some problems.
Returned values for lmp_date =
null or 2021-06-11-00.00.00
Date format needed: MM/DD/YYYY
I've tried cast, convert, substr+instr to no avail
ETA - A couple of example attempts (because there have been 10+):
select order_no, to_date(lmp_date) lmp_date from table_a - with error message of 'ORA-01861: literal does not match format string'
select order_no, to_date(substr(lmp_date, 1, instr(lmp_date, '00' -15))) lmp_date from table_a - since lmp_date has null value possibilities, this doesn't work successfully
select order_no, cast(lmp_date as date) lmp_date from table_a - with same error message of 'ORA-01861: literal does not match format string'
select order_no, to_date(lmp_date, 'YYYY-MM-DD') lmp_date from table_a - ORA-01830: date format picture ends before converting entire input string
There have been more attempts, this is all I can remember
To convert a string to a date, use the to_date() function with a suitable format mask:
to_date(lmp_date, 'YYYY-MM-DD-HH24:MI:SS')
The format model elements are in the documentation.
The result of that is a date data type, which is an internal 7-byte representation. Your client or application will format that for display, which may be based on your NLS_DATE_FORMAT setting, so you can modify that to change how all dates are displayed; or use to_char() to convert the date back to a string, e.g.:
to_char(to_date(lmp_date, 'YYYY-MM-DD-HH24:MI:SS'), 'MM/DD/YYYY')
although if you want it as that string you can just use string manipulation with substr() and concatenation:
case when lmp_date is not null then
substr(lmp_date, 6, 2) || '/' || substr(lmp_date, 9, 2) || '/' || substr(lmp_date, 1, 4)
end
When you do either of these:
to_date(lmp_date)
cast(lmp_date as date)
this also relies on your session NLS_DATE_FORMAT; and the "literal does not match format string" error indicates that it doesn't match the string, e.g. if you have the still-default 'DD-MON-RR' setting. It would actually work - for you in your current session - if you changed that setting. I've shown that here just for info. But to work for anyone regardless of their session settings, you should use to_date() with an explicit format mask, and don't rely on or assume anything session-specific.
You were nearly there with:
to_date(lmp_date, 'YYYY-MM-DD')
and again the "date format picture ends before converting entire input string" message tells you what is wrong - your string carries on past the YYYY-MM-DD elements. Expanding the format mask to match all of the string, as I did above, means it knows what each part means.
If you were really only interested in the date part then you could cut the end off the string:
to_date(substr(lmp_date, 1, 10), 'YYYY-MM-DD')
but that's only really useful if you have a mix of string values where some have times and some do not. (The resulting date will always have a time; it will just be midnight.) And if you have dates with different formats then it gets a bit complicated - partly why you shouldn't store dates as strings.
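For comparison, here is a rough client-side sketch of the same steps in Python, for cases where the string has already left the database. The function name lmp_to_display is hypothetical. Unlike Oracle, Python's strptime is strict about separators, so the format spells out the dots that appear in 2021-06-11-00.00.00.

```python
from datetime import datetime
from typing import Optional

# Matches values such as '2021-06-11-00.00.00'
SOURCE_FORMAT = "%Y-%m-%d-%H.%M.%S"


def lmp_to_display(lmp_date: Optional[str]) -> Optional[str]:
    """Convert the stored string to MM/DD/YYYY, passing nulls through unchanged."""
    if lmp_date is None:
        return None
    parsed = datetime.strptime(lmp_date, SOURCE_FORMAT)
    return parsed.strftime("%m/%d/%Y")
```

Parsing the full value with a format that stops at the date part ("%Y-%m-%d") raises ValueError because unconverted data remains, which is the same kind of complaint as ORA-01830.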

I have an oracle table which has date in dd-mm-yyyy and dd/mm/yyyy format in same field. Now i have to convert into one common format

I have an Oracle table which has dates in both dd-mm-yyyy and dd/mm/yyyy format in the same field. Now I have to convert them into one common format.
Please suggest how to approach this.
I did try, but it is failing due to invalid month.
Is there a way I can first identify which format the date is in and then convert it based on a CASE statement?
Or is there an easier way?
I trust you've learnt your lesson and you're now going to store these dates in the date data type.
Your two different date formats actually aren't important; Oracle is already quite lenient when it comes to separator characters. For example, this does not raise an error:
to_date('01/01/1900','dd-mm-yyyy')
Regarding "failing due to invalid month": that error occurs because you've allowed a value that doesn't match either of those formats into your string column.
If you are on version 12.2 at least (which you should be in 2020) then you can use the validate_conversion function to identify rows that don't convert to a date with your format (https://docs.oracle.com/en/database/oracle/oracle-database/12.2/sqlrf/VALIDATE_CONVERSION.html#GUID-DC485EEB-CB6D-42EF-97AA-4487884CB2CD)
select string_column
from my_table
where validate_conversion(string_column AS DATE,'dd/mm/yyyy') = 0
The other helper we got in 12.2 was the on conversion error clause of to_date, so you can do:
alter table my_table add my_date date;
update my_table set my_date = to_date(my_string default null on conversion error,'dd/mm/yyyy');
If you are confident that there is no other format than those two, a simple approach is replace():
update mytable set mystring = replace(mystring, '/', '-');
This turns all dates to format dd-mm-yyyy.
I would suggest taking a step forward and convert these strings to a date column.
alter table mytable add mydate date;
update mytable set mydate = to_date(replace(mystring, '/', '-'), 'dd-mm-yyyy');
This will fail if invalid date strings are met. I tend to consider that a good thing, since it clearly signals that there is a problem with the data. If you want to avoid that, you can use on conversion error, available starting with Oracle 12.2:
to_date(
replace(mystring, '/', '-') default null on conversion error,
'dd-mm-yyyy'
)
Then you can remove the string column, which is no longer needed.
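The approach above (normalise the separator, convert, and treat failures as NULL) can be sketched outside the database as well. The following Python is an illustration only: parse_ddmmyyyy and split_rows are hypothetical names, and None plays the role of default null on conversion error.

```python
from datetime import date, datetime
from typing import Iterable, List, Optional, Tuple


def parse_ddmmyyyy(text: str) -> Optional[date]:
    """Parse dd-mm-yyyy or dd/mm/yyyy; return None if the value is not a valid date."""
    normalized = text.strip().replace("/", "-")
    try:
        return datetime.strptime(normalized, "%d-%m-%Y").date()
    except ValueError:
        return None


def split_rows(values: Iterable[str]) -> Tuple[List[date], List[str]]:
    """Separate values that convert cleanly from those that need manual attention."""
    converted: List[date] = []
    rejected: List[str] = []
    for value in values:
        parsed = parse_ddmmyyyy(value)
        if parsed is None:
            rejected.append(value)
        else:
            converted.append(parsed)
    return converted, rejected
```

Keeping the rejected strings, rather than silently dropping them, serves the same purpose as the validate_conversion query: it shows which rows caused the invalid month error.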

Is there an equivalent function to isdate () in oracle

CASE WHEN ISDATE(LTRIM(RTRIM(rard.thevalue))) = 1
       THEN CONVERT(smalldatetime, LTRIM(RTRIM(rard.thevalue)))
     WHEN ISDATE(LTRIM(RTRIM(rard2.thevalue))) = 1
       THEN CONVERT(smalldatetime, LTRIM(RTRIM(rard2.thevalue)))
     ELSE CONVERT(smalldatetime, LTRIM(RTRIM(r.receiptdate)))
END
I have this syntax in SQL Server which has to be converted to Oracle. The column "thevalue" holds values in different formats, e.g. HH:MM, MM/DD/YYYY, HH:MM:SS, etc. The isdate() function checks whether the value matches a date format before the data is pulled. I need a similar function to check whether a column's value matches a date/time format and then display it as a date.
The Oracle equivalent would be validate_conversion().
However, unlike SQL Server, Oracle won't recognize varying formats. You need to explicitly specify the format that you want (unless your dates already are in the format configured by nls_date_format). Basically, you could test each possible format one after the other, and stop whenever one is recognized.
Since your purpose is to actually convert the string to a date, it would be simpler to use to_date() directly, with the on conversion error clause.
Consider something like:
coalesce(
to_date(thevalue default null on conversion error, 'MM/DD/YYYY'),
to_date(thevalue default null on conversion error, 'YYYY-MM-DD HH24:MI:SS'),
...
)
Notes:
the function happily ignores leading and trailing spaces, so there is no need to trim() beforehand
this requires Oracle 12.2 or higher
isdate() is not really safe in SQL Server; better use try_convert(), which basically behaves like Oracle's to_date() with default null on conversion error
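The "try each format in turn and stop at the first that is recognized" idea behind the coalesce() call can be sketched in Python as follows. This is an illustration, not a port: first_matching is a hypothetical name, the format list only covers the two masks shown above, and None stands in for NULL.

```python
from datetime import datetime
from typing import Optional, Sequence

# Python equivalents of 'MM/DD/YYYY' and 'YYYY-MM-DD HH24:MI:SS'
CANDIDATE_FORMATS = ("%m/%d/%Y", "%Y-%m-%d %H:%M:%S")


def first_matching(
    text: str, formats: Sequence[str] = CANDIDATE_FORMATS
) -> Optional[datetime]:
    """Return the value parsed with the first format that fits, else None."""
    cleaned = text.strip()
    for fmt in formats:
        try:
            return datetime.strptime(cleaned, fmt)
        except ValueError:
            # Same role as 'default null on conversion error': move on to the next mask
            continue
    return None
```

The order of the formats matters whenever a value could satisfy more than one of them, just as the order of arguments matters in coalesce().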

DB2 Convert Number to Date

For some reason (I have no control over this) dates are stored as Integers in an iSeries AS400 DB2 system that I need to query. E.g. today will be stored as:
20,171,221
Being in the UK I need it to be like the below in Date format:
21/12/2017
This is from my query: (OAORDT = date field)
Select
Date(SUBSTR( CHAR( OAORDT ),7,2) ||'/' || SUBSTR(CHAR ( OAORDT ),5,2) || '/' || SUBSTR(CHAR (OAORDT ),1,4)) AS "Order Date"
from some.table
However, all I get is nulls. If I remove the Date function, then it does work, but it's now a string, which I don't want:
Select
SUBSTR( CHAR( OAORDT ),7,2) ||'/' || SUBSTR(CHAR ( OAORDT ),5,2) || '/' || SUBSTR(CHAR (OAORDT ),1,4) AS "Order Date"
from some.table
How do I convert the OAORDT field to Date?
Just to update - I will be querying this from MS SQL Server using an OpenQuery
Thanks.
1) How do I convert the OAORDT field to Date?
Simplest is to use TIMESTAMP_FORMAT :
SELECT DATE(TIMESTAMP_FORMAT(CHAR(OAORDT),'YYYYMMDD'))
2) Being in the UK I need it to be [...] in Date format 21/12/2017 :
SELECT VARCHAR_FORMAT(DATE(TIMESTAMP_FORMAT(CHAR(OAORDT),'YYYYMMDD')),'DD/MM/YYYY')
Note, you didn't specify where you are doing this, but since you tagged as ibm-midrange, I am answering for embedded SQL. If you want JDBC, or ODBC, or interactive SQL, the concept is similar, just the means of achieving it is different.
Make sure SQL is using dates in the correct format, it defaults to *ISO. For you it should be *EUR. In RPG, you can do it this way:
exec sql set option *datfmt = *EUR;
Make sure that set option is the first SQL statement in your program, I generally put it immediately between D and C specs.
Note that this is not an optimal solution for a program. Best practice is to set the RPG and SQL date formats both to *ISO. I like to do that explicitly. RPG date format is set by
ctl-opt DatFmt(*ISO);
SQL date format is set by
exec sql set option *datfmt = *ISO;
Now all internal dates are processed in *ISO format, and have no year range limitation (year can be 0001 - 9999). And you can display or print in any format you please. Likewise, you can receive input in any format you please.
Edit: Dates are a unique beast. Not every language or OS knows how to handle them. If you are looking for a Date value, the only format you need to specify is the format of the string you are converting to a Date. You don't need to (can't) specify the internal format of the Date field, and the external format of a Date field can be mostly anything you want, and different each time you use it. So when you use TIMESTAMP_FORMAT() as @Stavr00 mentioned:
DATE(TIMESTAMP_FORMAT(CHAR(OAORDT),'YYYYMMDD'))
The format provided is not the format of the Date field, but the format of the data being converted to a Timestamp. Then the Date() function converts the Timestamp value into a Date value. At this point format doesn't matter because regardless of which external format you have specified by *DATFMT, the timestamp is in the internal timestamp format, and the date value is in the internal date format. The next time the format matters is when you present the Date value to a user as a string or number. At that point the format can be set to *ISO, *EUR, *USA, *JIS, *YMD, *MDY, *DMY, or *JUL, and in some cases *LONGJUL and the *Cxxx formats are available.
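The point above, that the format only describes the incoming value while the display format is chosen separately, can be illustrated client-side. A minimal Python sketch, assuming the integer arrives as YYYYMMDD (e.g. 20171221); the helper names int_to_date and format_uk are hypothetical:

```python
from datetime import date


def int_to_date(yyyymmdd: int) -> date:
    """Split an integer such as 20171221 into year, month and day."""
    year, month_day = divmod(yyyymmdd, 10000)
    month, day = divmod(month_day, 100)
    # date() raises ValueError for impossible values such as month 13
    return date(year, month, day)


def format_uk(value: date) -> str:
    """Render a date as DD/MM/YYYY; only needed at presentation time."""
    return value.strftime("%d/%m/%Y")
```

The date object itself carries no format; format_uk is applied only when the value is shown to a user, which matches the role of *DATFMT described above.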
Since none of the variants suited my needs, I came up with my own.
It is as simple as:
select * from yourschema.yourtable where yourdate = int(CURRENT DATE - 1 days) - 19000000;
The days arithmetic is leap-year aware and suits most needs fine.
In the same way, days can be swapped for months or years.
There is no need for heavy artillery like VARCHAR_FORMAT/TIMESTAMP_FORMAT.
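The arithmetic in that query can be unpacked with a small Python sketch. It assumes, as the query implies, that int() applied to a date yields the number YYYYMMDD, so subtracting 19000000 produces a CYYMMDD-style value (which would only match a column stored that way, not one stored as plain YYYYMMDD). The function names are hypothetical.

```python
from datetime import date, timedelta


def to_cyymmdd(value: date) -> int:
    """Turn a date into YYYYMMDD as a number, then shift it by 19000000."""
    yyyymmdd = value.year * 10000 + value.month * 100 + value.day
    return yyyymmdd - 19000000


def previous_day_key(today: date) -> int:
    """Equivalent of int(CURRENT DATE - 1 days) - 19000000 for a given day."""
    # The subtraction happens on the date, so month ends and leap years are handled
    return to_cyymmdd(today - timedelta(days=1))
```

Doing the day arithmetic on the date before converting to a number is what makes the comparison safe across month and year boundaries; subtracting 1 from the integer itself would not be.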
Below worked for me:
select date(substring(trim(DateCharCol), 1, 2)||'/'||substring(trim(DateCharCol), 3, 2)||'/'||'20'||substring(trim(DateCharCol), 5, 2)) from yourTable where TableCol =?;

CONVERT various date-like strings (varchar) to one DATE field

Consider a varchar field (ShipDate) that gets date-like strings written to it. These strings come from multiple third-party systems in various formats (over which I, apparently, have no control =/).
I decided to create a view that converts this varchar field to DATE so that I can query it easily (and filter out some other records / fields that I don't care about).
So far I see two formats coming in: YYYYMMDD (which is fine, I can just do a straight CONVERT) and MM/DD/YYYY, which causes an error:
Conversion failed when converting date and/or time from character string.
This changes my conversion from a simple CONVERT(DATE, ShipDate, 1) to:
CONVERT (DATE,
(CASE
WHEN ShipDate LIKE '_/__/____' THEN SUBSTRING(ShipDate, 6, 4) + '0' + SUBSTRING(ShipDate, 1, 1) + SUBSTRING(ShipDate, 3, 2)--M/DD/YYYY
WHEN ShipDate LIKE '__/_/____' THEN SUBSTRING(ShipDate, 6, 4) + SUBSTRING(ShipDate, 1, 2) + '0' + SUBSTRING(ShipDate, 4, 1)--MM/D/YYYY
WHEN ShipDate LIKE '_/_/____' THEN SUBSTRING(ShipDate, 5, 4) + '0' + SUBSTRING(ShipDate, 1, 1) + '0' + SUBSTRING(ShipDate, 3, 1)--M/D/YYYY
ELSE ShipDate --For the YYYYMMDD dates
END), 1) --End of CONVERT
Is there a better way to do the above SQL statement? I could potentially get even more date-like string formats as time goes on, so the above example could get pretty awful (I tagged this question with regex in case that could reduce the size of the case statement).
Or, is there a way to handle this problem as the records come in, avoiding the view altogether? I'm not too familiar with Triggers / SP's, but if that's a good option I'm willing to go that route =)
Or, some other method that is commonly used to solve this problem? Just curious at this point. I'm a .NET programmer, but end up helping out with SQL work because I have some experience, so I'm pretty new to anything even kind of advanced in SQL.
Don't use the date_style parameter for CONVERT. That's really for converting in the other direction. You should be able to just use: CAST(some_string AS DATE).
You might have some problems if you start getting dates in the DD/MM/YYYY format though. Of course, if they're being all mixed together then there's no way to solve that issue anyway, since even you can't know whether 4/1/2011 is April 1st or January 4th.
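The ambiguity is easy to demonstrate. The Python sketch below (the function name interpretations is hypothetical) tries both readings of a slash-separated string and returns every one that yields a valid date.

```python
from datetime import date, datetime
from typing import Dict

READINGS = (("MM/DD/YYYY", "%m/%d/%Y"), ("DD/MM/YYYY", "%d/%m/%Y"))


def interpretations(text: str) -> Dict[str, date]:
    """Return every reading of the string that produces a real calendar date."""
    results: Dict[str, date] = {}
    for label, fmt in READINGS:
        try:
            results[label] = datetime.strptime(text, fmt).date()
        except ValueError:
            # This reading is impossible, e.g. month 13
            pass
    return results
```

For 4/1/2011 both readings succeed and give different dates, so no parser can pick the right one without knowing the source system; for 4/13/2011 only the month-first reading survives.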
If the known formats are always M then D, and the separators are always /, why not just parse for the slashes? Also, why are you using ,1) in your CONVERT? All of the above formats seemed to convert fine for me without it:
WITH x(ShipDate) AS
(
SELECT '5/12/2011'
UNION ALL SELECT '05/5/2012'
UNION ALL SELECT '05/05/2012'
)
SELECT CONVERT (DATE, ShipDate) FROM x;
You say you can work with YYYYMMDD?
But MM/DD/YYYY is giving you problems. Then perhaps you can do this:
CONVERT(varchar(8),CAST('MM/DD/YYYY' as datetime),112) = YYYYMMDD
My reaction would be to add a proper date column, then implement a trigger that does the conversion into that date column.
You could then manually fix up any rows that failed to convert, and those records would still have values, unlike with the view solution.