Presto Value cannot be cast as TIMESTAMP - sql

I'm getting the error "Value cannot be cast as timestamp: 2021-03-14 02:21:16". This seems like a perfectly valid candidate for a timestamp cast. Is there any reason why this error should be triggered?
I'm tempted to just use TRY_CAST and filter out the NULL values in a WHERE clause. But I'm not sure how prevalent this issue is and would like to better understand what's causing it. The type of the value in the db table is VARCHAR.

This is because Presto has a bug where timestamps are not treated according to standard SQL semantics. What you're probably observing is a timestamp that falls in the daylight saving time transition "gap" for the time zone of your session: in US time zones that observe DST, clocks jumped from 02:00 to 03:00 on 2021-03-14, so the local time 02:21:16 never existed on that date.
This issue is fixed in Trino (formerly known as Presto SQL):
trino> select cast('2021-03-14 02:21:16' as timestamp);
_col0
-------------------------
2021-03-14 02:21:16.000
(1 row)
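If upgrading isn't an option, the TRY_CAST workaround suggested in the question is a reasonable stopgap: it yields NULL only for the values that fall in the DST gap, so you can filter those out rather than failing the whole query. A minimal sketch, assuming the VARCHAR column is called ts_string:
SELECT TRY_CAST(ts_string AS TIMESTAMP) AS ts
FROM my_table
WHERE TRY_CAST(ts_string AS TIMESTAMP) IS NOT NULL;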

Related

Timestampdiff issue while attempting to execute, errors don't make sense

I'm attempting to filter on the time difference between two timestamps in a WHERE clause. I realize there are various posts on how to do this, and I've looked at them.
Code:
SELECT *
FROM table1
WHERE TIMESTAMP_DIFF('SECOND', started_at, ended_at) <= 60
AND started_at IS NOT NULL
AND ended_at IS NOT NULL;
However, BigQuery keeps throwing an error.
A valid date part name is required but found ended_at at [3:107]
So I look at the schema.
started_at TIMESTAMP NULLABLE
ended_at TIMESTAMP NULLABLE
While it's certainly possible that I'm doing something wrong, the error would lead me to believe that this is an issue with the column itself. I've also tried it with only the WHERE ended_at IS NOT NULL and started_at IS NOT NULL clauses: the query returns everything fine that way, but as soon as I add the timestamp condition, it doesn't work.
NOTE: I realize that the timestampdiff() function typically doesn't have an underscore, but BigQuery requires one, as its error note tells you if you type it the other way:
Function not found: TIMESTAMPDIFF; Did you mean timestamp_diff? at [3:71]
TIMESTAMP_DIFF(ended_at, started_at, SECOND) should do the trick. In BigQuery the date part is the last argument, not the first, which is why the parser reported ended_at (your third argument) as an invalid date part name.
https://cloud.google.com/bigquery/docs/reference/standard-sql/timestamp_functions#timestamp_diff
Example:
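-- the query from the question with the argument order BigQuery expects
SELECT *
FROM table1
WHERE TIMESTAMP_DIFF(ended_at, started_at, SECOND) <= 60
  AND started_at IS NOT NULL
  AND ended_at IS NOT NULL;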

casting string to date postgresql

Using postgres 14
I have a timestamp that looks like this: 2011-04-26T05:04:11Z. It's in UTC.
I tried converting it to a Postgres timestamp using the function below and I get a wrong result:
2022-04-26 00:04:11-07. The time part seems messed up.
This is the query I have:
select to_TIMESTAMP('2011-04-26T05:04:11Z','YYYY-MM-DDTHH:MI:SS')
If you just want to convert the string to a Postgres timestamp then:
select '2011-04-26T05:04:11Z'::timestamptz;
04/25/2011 22:04:11 PDT
The output will depend on the DateStyle setting.
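For example (a quick sketch; the rendering also depends on the session's TimeZone setting, assumed here to be US/Pacific):
SET datestyle = 'ISO, MDY';
SELECT '2011-04-26T05:04:11Z'::timestamptz;
-- 2011-04-25 22:04:11-07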
To get your example to work then:
select to_TIMESTAMP('2011-04-26T5:04:11Z','YYYY-MM-DD"T"HH24:MI:SS');
to_timestamp
-------------------------
04/26/2011 05:04:11 PDT
Note the "T" this causes it to be ignored as that seems to be what is causing the issues. Not certain, but probably related to Postgres ISO format using a space instead of T. Quoting characters to be ignored comes from Formatting function:
Tip
Prior to PostgreSQL 12, it was possible to skip arbitrary text in the input string using non-letter or non-digit characters. For example, to_timestamp('2000y6m1d', 'yyyy-MM-DD') used to work. Now you can only use letter characters for this purpose. For example, to_timestamp('2000y6m1d', 'yyyytMMtDDt') and to_timestamp('2000y6m1d', 'yyyy"y"MM"m"DD"d"') skip y, m, and d.
There is no provision for a time zone abbreviation in to_timestamp so the Z will be ignored and the timestamp will be in local time with the same time value. That is why I made my first suggestion using the timestamptz cast.
Two ways to deal with time zone:
One:
select to_TIMESTAMP('2011-04-26T5:04:11Z','YYYY-MM-DD"T"HH24:MI:SS')::timestamp AT time zone 'UTC';
timezone
-------------------------
04/25/2011 22:04:11 PDT
Two:
select to_TIMESTAMP('2011-04-26T5:04:11+00','YYYY-MM-DD"T"HH24:MI:SS+TZH');
to_timestamp
-------------------------
04/25/2011 22:04:11 PDT

correct type for SQL snowflake date

I am using a SQL script to parse a JSON into a table using dbt. One of the columns had this date value: '2022-02-09T20:28:59+0000'. What is the correct data type for this ISO date in Snowflake?
Currently, I just used the date type like this in my dbt sql script:
JSON_DATA:"situation_date"::date AS MY_DATE
but clearly date isn't the correct one, because later when I test it using select *, I get this error:
SQL Error [100040] [22007]: Date '2022-02-09T20:28:59+0000' is not recognized
so I need to know which Snowflake date data type or datetime type suits the best with this one
Pulling the date out of JSON correctly is not so clear cut:
SELECT
'{"date":"2022-02-09T20:28:59+0000"}' as json_str
,parse_json(json_str) as json
,json:date as data_from_json
,TRY_TO_TIMESTAMP_NTZ(data_from_json, 'YYYY-MM-DDTHH:MI:SS+0000') as date_1
,TRY_TO_TIMESTAMP_NTZ(substr(data_from_json,1,19), 'YYYY-MM-DDTHH:MI:SS') as date_2
;
gives the error:
Function TRY_CAST cannot be used with arguments of types VARIANT and TIMESTAMP_NTZ(9)
Because the type of data_from_json is VARIANT, and the TO_DATE/TO_TIMESTAMP functions expect TEXT, we need to cast it to that:
SELECT
'{"date":"2022-02-09T20:28:59+0000"}' as json_str
,parse_json(json_str) as json
,json:date as data_from_json
,TRY_TO_TIMESTAMP_NTZ(data_from_json::text, 'YYYY-MM-DDTHH:MI:SS+0000') as date_1
,TRY_TO_TIMESTAMP_NTZ(substr(data_from_json::text,1,19), 'YYYY-MM-DDTHH:MI:SS') as date_2
;
If all your time zones are always +0000, you can just put that in the parse format (like date_1), or you can truncate that part off (like date_2).
This gives (one result row, shown column per line):
JSON_STR:       {"date":"2022-02-09T20:28:59+0000"}
JSON:           { "date": "2022-02-09T20:28:59+0000" }
DATA_FROM_JSON: "2022-02-09T20:28:59+0000"
DATE_1:         2022-02-09 20:28:59.000
DATE_2:         2022-02-09 20:28:59.000
Using TRY_TO_TIMESTAMP:
SELECT TRY_TO_TIMESTAMP(JSON_DATA:"situation_date", 'format_here')
FROM tab;
so I need to know which Snowflake date data type or datetime type suits the best with this one
TIMESTAMP_INPUT_FORMAT
The input format can be set at the ACCOUNT/USER/SESSION level.
AUTO Detection of Integer-stored Date, Time, and Timestamp Values
Avoid using AUTO format if there is any chance for ambiguous results. Instead, specify an explicit format string by:
Setting TIMESTAMP_INPUT_FORMAT and other session parameters for dates, timestamps, and times. See Session Parameters for Dates, Times, and Timestamps (in this topic).
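A minimal sketch of the session-level approach, reusing the format string from the answer further down (the assumption being that all your values match this one format):
ALTER SESSION SET TIMESTAMP_INPUT_FORMAT = 'YYYY-MM-DD"T"HH24:MI:SS+TZHTZM';
SELECT '2022-02-09T20:28:59+0000'::timestamp;  -- should now parse without error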
I think ::TIMESTAMP should work for this, i.e. JSON_DATA:"situation_date"::TIMESTAMP. If you need to go to just a date afterwards, you can then cast with ::DATE or TO_DATE().
After some testing, it seems to me you have 2 options.
Either you can get rid of the +0000 at the end:
left(column_date, len(column_date)-5)::timestamp
or use the function try_to_timestamp with format:
try_to_timestamp('2022-02-09T20:28:59+0000','YYYY-MM-DD"T"HH24:MI:SS+TZHTZM')
TZH and TZM are the time zone offset hours and minutes, respectively.

Converting timestamp on whole table in bigquery

I have a table which stores millions of rows of data. Each row has a date that indicates when the data was entered. I store that date in a NUMERIC column as a Unix epoch value. However, I want to convert it to a human-readable date (yyyy-mm-dd hh:mm:ss) and later sort by that date, not the date queried.
I've had a hard time finding a suitable way. Here's my attempt.
I used SELECT CAST(DATE(timestamp) AS DATE) AS CURR_DT FROM dataset.table but it gave me this error:
No matching signature for function DATE for argument types: NUMERIC. Supported signatures: DATE(TIMESTAMP, [STRING]); DATE(DATETIME); DATE(INT64, INT64, INT64) at [1:13]
I also tried the method from BigQuery: convert epoch to TIMESTAMP, but still didn't fully understand it.
I'm a novice in coding so I hope you guys understand the situation. Thanks!
If I am understanding your question correctly, you would like to take a numeric epoch time that is stored as an integer and convert it to a timestamp?
If so you can use the following in BigQuery Standard SQL:
select TIMESTAMP_SECONDS(1606048220)
It gives the output of:
2020-11-22 12:30:20 UTC
Documentation
If you only want the date component, then you would convert to a date after converting to a timestamp. Presumably you have seconds, so you would use TIMESTAMP_SECONDS() -- but there are similar functions for milliseconds and microseconds (TIMESTAMP_MILLIS() and TIMESTAMP_MICROS()).
For just the date:
select date(timestamp_seconds(col))
Note that this removes the time component.
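Putting it together for the table in the question (a sketch; it assumes the NUMERIC column is named timestamp as in the question's attempt and holds whole seconds, so it needs a cast to INT64 first):
SELECT TIMESTAMP_SECONDS(CAST(timestamp AS INT64)) AS curr_dt
FROM dataset.table
ORDER BY curr_dt;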

Earliest Timestamp supported in PostgreSQL

I work with different databases in a number of different time zones (and periods of time), and one thing that regularly causes problems is the date/time definition.
For this reason, and since a date is a reference to a starting value, I try to store the base date, i.e. the minimum date supported in that particular computer/database, so I can keep track of how it was calculated.
If I am seeing it correctly, this depends on the RDBMS and on the particular storage of the type.
In SQL Server, I found a couple of ways of calculating this "base date";
SELECT CONVERT(DATETIME, 0)
or
SELECT DATEADD(MONTH, 0, 0 )
or even a cast like this:
DECLARE @dt DATETIME
DECLARE @300 BINARY(8)
SET @300 = 0x00000000 + CAST(300 AS BINARY(4))
SET @dt = (SELECT CAST(@300 AS DATETIME) AS BASEDATE)
PRINT CAST(@dt AS NVARCHAR(100))
My question is, is there a similar way of calculating the base date in PostgreSQL, i.e.: the value that is the minimum date supported and is on the base of all calculations?
From the description of the date type, I can see that the minimum date supported is 4713 BC, but is there a way of getting this value programmatically (for instance as a formatted date string), as I do in SQL Server?
The manual states the values as:
Low value: 4713 BC
High value: 294276 AD
with the caveat, as Chris noted, that -infinity is also supported.
See the note later in the same page in the manual; the above is only true if you are using integer timestamps, which are the default in all vaguely recent versions of PostgreSQL. If in doubt:
SHOW integer_datetimes;
will tell you. If you're using floating point datetimes instead, you get greater range and less (non-linear) precision. Any attempt to work out the minimum programmatically must cope with that restriction.
PostgreSQL does not just let you cast zero to a timestamp to get the minimum possible timestamp, nor would this make much sense if you were using floating point datetimes. You can use the epoch conversion function to_timestamp, but this gives you the epoch, not the minimum time:
postgres=> select to_timestamp(0);
to_timestamp
------------------------
1970-01-01 08:00:00+08
(1 row)
because it accepts negative values. You'd think that giving it negative maxint would work, but the results are surprising to the point where I wonder if we've got a wrap-around bug lurking here:
postgres=> select to_timestamp(-922337203685477);
to_timestamp
---------------------------------
294247-01-10 12:00:54.775808+08
(1 row)
postgres=> select to_timestamp(-92233720368547);
to_timestamp
---------------------------------
294247-01-10 12:00:54.775808+08
(1 row)
postgres=> select to_timestamp(-9223372036854);
to_timestamp
------------------------------
294247-01-10 12:00:55.552+08
(1 row)
postgres=> select to_timestamp(-922337203685);
ERROR: timestamp out of range
postgres=> select to_timestamp(-92233720368);
to_timestamp
---------------------------------
0954-03-26 09:50:36+07:43:24 BC
(1 row)
postgres=> select to_timestamp(-9223372036);
to_timestamp
------------------------------
1677-09-21 07:56:08+07:43:24
(1 row)
(Perhaps related to the fact that to_timestamp takes a double, even though timestamps are stored as integers these days?).
I think it's possibly wisest to just let the timestamp range be any timestamp you don't get an error on. After all, the range of valid timestamps is not continuous:
postgres=> SELECT TIMESTAMP '2000-02-29';
timestamp
---------------------
2000-02-29 00:00:00
(1 row)
postgres=> SELECT TIMESTAMP '2001-02-29';
ERROR: date/time field value out of range: "2001-02-29"
LINE 1: SELECT TIMESTAMP '2001-02-29';
so you can't assume that just because a value is between two valid timestamps, it is itself valid.
The earliest timestamp is '-infinity'. This is a special value. The other side is 'infinity' which is later than any specific timestamp.
I don't know of a way of getting this programmatically. I would just use the value hard-coded, the way you might use NULL. That means you have to handle infinities on the client side, though.
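A minimal sketch of that hard-coded approach (the table and column names here are made up for illustration):
-- treat a missing timestamp as "earlier than everything" when sorting
SELECT COALESCE(last_seen, TIMESTAMP '-infinity') AS last_seen_or_never
FROM accounts
ORDER BY last_seen_or_never;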