Cannot check if varchar is BETWEEN date and date - sql

I created a partition projection in Athena named 'dt', which is a STRING and contains date information in the format 2020/12/11/20.
I'm running the following query in Athena
SELECT
DATE_FORMAT(dt, '%Y-%m') as dt,
count(*) as "total_visualization",
count(*)/cast(date_format(DATE '{END_DATE}', '%d') as integer) as "average_dia"
FROM
user.dashborad
WHERE
event = 'complete'
AND dt BETWEEN DATE '{START_DATE}' and DATE '{END_DATE}'
GROUP BY 1;
The resulting raw query received by Athena is:
SELECT
DATE_FORMAT(dt, '%Y-%m') as dt,
count(*) as "total_visualization",
count(*)/cast(date_format(DATE '2022-08-08', '%d') as integer) as "average_day"
FROM user.dashborad
WHERE event = 'complete' AND dt BETWEEN DATE '2022-08-01' and DATE '2022-08-08'
GROUP BY 1;
However, I get the following error:
Error querying the database: SYNTAX_ERROR: line 2:62: Cannot check if varchar is BETWEEN date and date.
I've tried to find a workaround in an attempt to convert it into a date format using date_parse but it didn't work. And with str_to_date I get this error:
SYNTAX_ERROR: line 2:2: Function str_to_date not registered
Is there any other way I can modify the query to convert 'dt' from a varchar into a format Athena understands?

It is always a bad idea to store a date in a string instead of using the appropriate data type. You even call the column dt which suggests a datetime. This makes it harder to spot inappropriate handling.
Here
AND dt BETWEEN DATE '{START_DATE}' and DATE '{END_DATE}'
you compare a string with dates. Thus you rely on the DBMS guessing the string's date format correctly. Don't do this. Convert the string explicitly to a date, because you know the format. Or, as 'YYYY-MM-DD' is comparable, work with the strings right away:
AND dt BETWEEN '{START_DATE}' and '{END_DATE}'
Here
DATE_FORMAT(dt, '%Y-%m')
you invoke a date function on a string. This means the DBMS must again guess your format, convert your string into a date accordingly and then invoke the function to convert the date into a string. Instead, just use the appropriate string function on the string:
SUBSTR(dt, 1, 7)
The complete query:
SELECT
SUBSTR(dt, 1, 7) AS year_month,
COUNT(*) AS total_visualization,
COUNT(*) / CAST(SUBSTR('{END_DATE}', 9, 2) AS INTEGER) AS average_dia
FROM
user.dashborad
WHERE
event = 'complete'
AND dt BETWEEN '{START_DATE}' and '{END_DATE}'
GROUP BY SUBSTR(dt, 1, 7)
ORDER BY SUBSTR(dt, 1, 7);
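One caveat worth checking before relying on plain string comparison: the bounds have to use the same layout as the stored values. The question describes dt as looking like 2020/12/11/20, while the bounds are written as 2022-08-01. The Python sketch below only illustrates the lexicographic comparison; the sample values and the helper to_partition_bound are hypothetical, and the 'YYYY/MM/DD/HH' layout is assumed from the question.

```python
from datetime import date


def to_partition_bound(d: date, end_of_day: bool = False) -> str:
    """Render a date in the assumed 'YYYY/MM/DD/HH' layout of the dt column."""
    hour = 23 if end_of_day else 0
    return f"{d:%Y/%m/%d}/{hour:02d}"


# Hypothetical dt values in the layout described in the question.
dt_values = [
    "2022/07/31/23",
    "2022/08/01/00",
    "2022/08/05/10",
    "2022/08/08/23",
    "2022/08/09/00",
]

start, end = date(2022, 8, 1), date(2022, 8, 8)

# Dash-formatted bounds: '/' sorts after '-', so no slash-formatted value
# falls inside the range.
mismatched = [v for v in dt_values if "2022-08-01" <= v <= "2022-08-08"]

# Bounds rendered in the same layout as the data compare as expected.
lower = to_partition_bound(start)
upper = to_partition_bound(end, end_of_day=True)
matched = [v for v in dt_values if lower <= v <= upper]

print(mismatched)
print(matched)
```

If the column really uses slashes, the same idea in SQL would be to build '{START_DATE}' and '{END_DATE}' in that layout before comparing.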

Related

How do I convert a YYYYMMDD decimal to date in SQL?

I am trying to convert an integer to a date or a date to an integer so I can compare two columns. I'm using teradata and have been struggling with Invalid Date [Error 2665] when trying to cast either to the other format. The formats are:
20220830 Type D
2022-08-05 Type DA
Methods I've tried:
SELECT cast((20220830 - 19000000) as date)
SELECT CAST(TRIM(20201231) AS DATE FORMAT 'YYYYMMDD')
select cast(2022-08-05 as Integer Format '99999999')
Select Convert(DATETIME, LEFT(20130101, 8))
SELECT CAST(CAST(20220830 AS CHAR(8)) AS DATE FORMAT 'YYYYMMDD')
select cast(test_date as date format'YYYYMMDD')
from
(SELECT cast (integer_date as char(8)) as test_date
from example)t1
Any insights into why these methods aren't working would be a great help
Use SELECT CAST((column_name) - 19000000 AS DATE). My mistake was adding a format clause, as in SELECT CAST((column_name) - 19000000 AS DATE FORMAT 'YYYY-MM-DD').
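The subtraction works because Teradata stores a DATE internally as the integer (year - 1900) * 10000 + month * 100 + day, so a YYYYMMDD number minus 19000000 is already in that encoding. A small Python sketch of the arithmetic, with hypothetical function names, just to make the encoding visible:

```python
from datetime import date


def yyyymmdd_to_internal(n: int) -> int:
    """Shift a YYYYMMDD integer into the (year - 1900) based encoding."""
    return n - 19000000


def internal_to_date(n: int) -> date:
    """Decode (year - 1900) * 10000 + month * 100 + day into a date."""
    year = n // 10000 + 1900
    month = (n // 100) % 100
    day = n % 100
    return date(year, month, day)


internal = yyyymmdd_to_internal(20220830)
print(internal, internal_to_date(internal))
```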

Invalid datetime string when CAST As Date

I have Time column in BigQuery, the values of which look like this: 2020-09-01-07:53:19 it is a STRING format. I need to extract just the date. Desired output: 2020-09-01.
My query:
SELECT
CAST(a.Time AS date) as Date
from `table_a`
The error message is: Invalid datetime string "2020-09-02-02:17:49"
You could also use the parse_datetime(), then convert to a date.
with temp as (select '2020-09-02-02:17:49' as Time)
select
date(parse_datetime('%Y-%m-%d-%T',Time)) as new_date
from temp
How about just taking the left-most 10 characters?
select substr(a.time, 1, 10)
If you want this as a date, then:
select parse_date('%Y-%m-%d', substr(a.time, 1, 10))
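Both approaches above come down to the same two options: parse the whole string with a format that includes the extra dash, or cut off the first 10 characters and parse only the date part. A rough Python equivalent, as an illustration only (Python's strptime is used here with %H:%M:%S rather than %T):

```python
from datetime import datetime

raw = "2020-09-02-02:17:49"

# Option 1: parse the full string, including the dash before the time.
full = datetime.strptime(raw, "%Y-%m-%d-%H:%M:%S").date()

# Option 2: keep only the leading 'YYYY-MM-DD' part and parse that.
prefix = datetime.strptime(raw[:10], "%Y-%m-%d").date()

print(full, prefix)
```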
select STR_TO_DATE('2020-09-08 00:58:09','%Y-%m-%d') from DUAL;
or, applied to your column:
select STR_TO_DATE(a.Time,'%Y-%m-%d') from `table_a` a;
Note: STR_TO_DATE is a MySQL function, so this only works where MySQL syntax is supported.

Redshift can't convert a string to a date, tried multiple functions

I have a table with a field called ADATE, it is a VARCHAR(16) and the values are like so: 2019-10-22-09:00.
I am trying to convert this to a DATE type but cannot get this to work.
I have tried:
1. TO_DATE(ADATE, 'YYYY-MM-DD')
   Can't cast database type date to string
2. TO_DATE(LEFT(ADATE, 10), 'YYYY-MM-DD')
   Can't cast database type date to string
3. TO_DATE(TRUNC(ADATE), 'YYYY-MM-DD')
   XX000: Invalid digit, Value '-', Pos 4, Type: Decimal
4. CAST(ADATE AS DATE)
   Error converting text to date
5. CAST(LEFT(ADATE, 10) AS DATE)
   Error converting text to date
6. CAST(TRUNC(ADATE) AS DATE)
   Error converting numeric to date
The issue was the data containing blanks (not Nulls) so the error was around them.
I resolved this by using the following code:
TO_DATE(LEFT(CASE WHEN adate = '' THEN NULL ELSE adate END, 10), 'YYYY-MM-DD') adate
Clearly, you have bad date string values -- which is why the value should be stored as a date to begin with.
I don't think Redshift has a way of validating the date before attempting the comparison, or of avoiding an error. But you can use case and regular expressions to see if the value is reasonable. This might help:
(case when left(adate, 10) ~ '^(19|20)[0-9][0-9]-[0-1][0-9]-[0-3][0-9]$'
then to_date(left(adate, 10), 'YYYY-MM-DD')
end)
This is not precise . . . you can make it more complex so month 19 is not permitted (for instance), but it is likely to catch the errors.
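To see what the pattern does and does not catch, here is a minimal Python sketch that applies the same regular expression and then a strict parse. The function name safe_to_date is hypothetical, and the sample strings are made up for illustration:

```python
import re
from datetime import date, datetime
from typing import Optional

# Same loose pattern as in the CASE expression above.
LOOSE_DATE = re.compile(r"^(19|20)[0-9][0-9]-[0-1][0-9]-[0-3][0-9]$")


def safe_to_date(raw: Optional[str]) -> Optional[date]:
    """Return a date for plausible values, None for blanks or bad strings."""
    if not raw:
        return None
    head = raw[:10]
    if not LOOSE_DATE.match(head):
        return None
    try:
        # Strict parse rejects what the loose pattern lets through (month 19).
        return datetime.strptime(head, "%Y-%m-%d").date()
    except ValueError:
        return None


print(safe_to_date("2019-10-22-09:00"))
print(safe_to_date(""))
print(safe_to_date("2019-19-22-09:00"))
```

The blank string is filtered out before any conversion is attempted, which is the case that caused the original error.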

Invalid data error in Redshift

I have a query I am running in redshift that produces an error when I try to compare two dates. I have determined this is due to a data problem where the dates are VARCHAR and some are empty strings. The best solution is clearly to fix this at the source, but while trying to build a work around, I stumbled upon some very odd behavior.
To get around this, I preselect the dates that are not empty strings, cast them as dates, then convert them to integer date format (YYYYMMDD) as INT. This runs fine. However, if I try to compare this with an integer in a WHERE clause, the query crashes with a data type error.
Here is a toy version of the working query
SELECT
date_id,
COUNT(*)
FROM
(
SELECT
CONVERT(int, date_id) AS date_id
FROM
(
SELECT
DATE_PART('year', start_dttm)*10000+DATE_PART('month', start_dttm)*10+DATE_PART('day', start_dttm) AS date_id
FROM
(
SELECT
CAST(start_dttm AS DATETIME) AS start_dttm
FROM
sfe.calendar_detail
WHERE
start_dttm <> ''
) cda
) cdb
) cd
GROUP BY
date_id
;
And here is the failed query
SELECT
date_id,
COUNT(*)
FROM
(
SELECT
CONVERT(int, date_id) AS date_id
FROM
(
SELECT
DATE_PART('year', start_dttm)*10000+DATE_PART('month', start_dttm)*10+DATE_PART('day', start_dttm) AS date_id
FROM
(
SELECT
CAST(start_dttm AS DATETIME) AS start_dttm
FROM
sfe.calendar_detail
WHERE
start_dttm <> ''
) cda
) cdb
) cd
WHERE
date_id >= 20170920
GROUP BY
date_id
;
As I mentioned above, the correct solution is to fix the data type and count empty dates as Nulls not empty strings, but I am very curious as to why the second query crashes on an invalid data type error.
Many Thanks!
Edit:
Here is the error
ERROR: Invalid digit, Value '1', Pos 0, Type: Integer
DETAIL:
-----------------------------------------------
error: Invalid digit, Value '1', Pos 0, Type: Integer
code: 1207
context:
query: 2006739
location: :0
process: query0_39 [pid=0]
-----------------------------------------------
Rather than converting dates to the human-readable YYYYMMDD format, it is always better to keep them as DATE or TIMESTAMP format. This way, date operations can be easily performed (eg adding 5 days to a date). You can still do easy comparison operators by using 'YYYYMMDD'::DATE.
Given that you are converting from a String, and casting to a Date seems to work, and that you have some empty strings, use this to convert it to a date:
SELECT
NULLIF(start_dttm, '')::DATE AS dt
FROM sfe.calendar_detail
WHERE dt > '20170920'::DATE
This will return a NULL if the string is empty, and a Date if it contains a date that could be converted.
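As a rough illustration of why a real date type is easier to work with than a YYYYMMDD integer, the Python sketch below adds 5 days both ways. The helpers nullif_to_date and as_yyyymmdd are hypothetical names; the first mimics NULLIF(x, '')::DATE for ISO-formatted strings only.

```python
from datetime import date, timedelta
from typing import Optional


def nullif_to_date(raw: str) -> Optional[date]:
    """Mimic NULLIF(raw, '')::DATE: empty string gives None, else parse."""
    if raw == "":
        return None
    return date.fromisoformat(raw[:10])


def as_yyyymmdd(d: date) -> int:
    """Encode a date as a YYYYMMDD integer (month sits in the hundreds place)."""
    return d.year * 10000 + d.month * 100 + d.day


start = nullif_to_date("2017-09-28")

# Date arithmetic rolls over the month boundary correctly.
plus_five = start + timedelta(days=5)

# Integer arithmetic on the YYYYMMDD form produces a non-existent day.
naive = as_yyyymmdd(start) + 5

print(plus_five, naive, nullif_to_date(""))
```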

Date formatting with Character field

I have a character field that represents the date as '01-JAN-13'. When I reformat with TO_DATE(SUBSTR(DSIS_CC_MASTER.created_ts, 1, 9),'YYYY-MM-DD'), I get the result as 13-JAN-01.
How can I get the data in the format YY-MM-DD? Do I need to write a CASE to change the month to numbers?
SELECT dsis_cc_master.created_ts
, to_date(substr(dsis_cc_master.created_ts, 1, 9), 'YYYY-MM-DD') AS created_month
FROM traffic_eng.dsis_cc_master
WHERE dsis_cc_master.created_ts >= to_date('2013-01-01', 'yyyy-mm-dd')
My result is
01-JAN-13 13-JAN-01
I am trying to get 01-01-13 or 13-01-01 in the second column.
You are getting back a date value, which is displayed however the client is configured to display dates. If you want to explicitly set the format use to_char to return a character string instead:
select
DSIS_CC_MASTER.created_ts,
TO_CHAR(TO_DATE(SUBSTR(DSIS_CC_MASTER.created_ts, 1, 9),'YYYY-MM-DD'),'DD-MM-YY') as created_month
FROM TRAFFIC_ENG.DSIS_CC_MASTER
WHERE DSIS_CC_MASTER.created_ts >= to_date('2013-01-01', 'yyyy-mm-dd')
However you are changing the return type which may cause issues if the consumer of this query is expecting a date type to come back (e.g. to do date math).
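The distinction between a date value and its displayed text can be sketched in Python as follows. This assumes the stored text really looks like '01-JAN-13' (day, abbreviated English month name, two-digit year) as the question states, and an English or C locale for the month name; it is an illustration, not the Oracle behaviour itself.

```python
from datetime import datetime

raw = "01-JAN-13"

# Parse the text into a real date value using a mask that matches the text.
parsed = datetime.strptime(raw, "%d-%b-%y").date()

# The date itself has no format; formatting happens only when rendering it.
as_dd_mm_yy = parsed.strftime("%d-%m-%y")
as_yy_mm_dd = parsed.strftime("%y-%m-%d")

print(parsed, as_dd_mm_yy, as_yy_mm_dd)
```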
You have to use the date format model and TO_CHAR function to get the result in the format that you require:
select DSIS_CC_MASTER.created_ts, TO_CHAR(TO_DATE(SUBSTR(DSIS_CC_MASTER.created_ts, 1, 9),'YYYY-MM-DD'), 'DD-MM-YY') as created_month FROM TRAFFIC_ENG.DSIS_CC_MASTER
WHERE DSIS_CC_MASTER.created_ts >= to_date('2013-01-01', 'yyyy-mm-dd')
Since the data in your column is already stored in the YYYY-MM-DD format, you could actually just substr(DSIS_CC_MASTER.created_ts, 3, 8) and you'll get what you need.