Odd error with casting to timestamp in standard SQL/Tableau

The latest version of Tableau has started using standard SQL when it connects to Google's BigQuery.
I recently tried to update a large table but found that there appeared to be errors when trying to parse datetimes. The table originates as a CSV which is loaded into BigQuery, where further manipulation happens. The datetime column in the original CSV contains strings in ISO standard datetime format (basically yyyy-mm-dd hh:mm). This saves a lot of annoying manipulation later.
But on trying to convert the datetime strings in Tableau into dates or datetimes I got a bunch of errors. On investigation they seemed to come from BigQuery and looked like this:
Error: Invalid timestamp: '2015-06-28 02:01'
I thought at first this might be a Tableau issue, so I loaded a chunk of the original CSV into Tableau directly, where the conversion of the string to a date worked perfectly well.
I then tried simpler versions of the conversion (to a year rather than a full datetime) and they still failed. The generated SQL for the simplest conversion looks like this:
SELECT
  EXTRACT(YEAR
  FROM
    CAST(`Arrival_Date` AS TIMESTAMP)) AS `yr_Arrival_Date_ok`
FROM
  `some_dataset`.`some_table` `some_table`
GROUP BY
  1
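The table isn't even needed to reproduce this; casting the bare literal from the error message fails the same way:
SELECT CAST('2015-06-28 02:01' AS TIMESTAMP);
-- Error: Invalid timestamp: '2015-06-28 02:01'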
The invalid timestamp in the error message always looks to me like a perfectly valid timestamp. And further analysis suggests it doesn't happen for all the rows in the source table, just occasional ones.
This error did not appear in older versions of Tableau/BigQuery where legacy SQL was the default for Tableau, so I'm presuming it is a consequence of standard SQL.
So is there an intermittent problem with casting to timestamps in BigQuery? Or is this a Tableau problem which causes the SQL to be incorrectly formatted? And what can I do about it?

The seconds part in the canonical timestamp representation is required if the hour and minute are also present. Try this instead with PARSE_TIMESTAMP and see if it works:
SELECT
  EXTRACT(YEAR
  FROM
    PARSE_TIMESTAMP('%F %R', `Arrival_Date`)) AS `yr_Arrival_Date_ok`
FROM
  `some_dataset`.`some_table` `some_table`
GROUP BY
  1
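As a sanity check, you can try the format string against the literal from the error message in a standalone query; '%F %R' is shorthand for '%Y-%m-%d %H:%M', so this should parse cleanly:
SELECT PARSE_TIMESTAMP('%F %R', '2015-06-28 02:01');
-- 2015-06-28 02:01:00 UTC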

Related

Can't handle unfamiliar date format in BigQuery

I'm trying to query a BigQuery table that has a column "date" (set to type DATE in the schema) formatted as yyyy-mm-dd-??. In other words, there's an extra set of information about the date and I'm not really sure what it is. When I try to query the "date" column I run into the error:
SQL Error [100032] [HY000]: [Simba]BigQueryJDBCDriver Error executing query job. Message: Invalid date: '2022-09-03-01'
I've tried cast(date as string), cast(left(date, 10) as string), and all types of workarounds, but the error persists. No matter how much I try to nail it home in the query that I want this weird date column to be read as a string so that I can work with it, BigQuery still wants to take it as a date, I guess because that's how it's set up in the schema. I don't care whether it's parsed into a date properly or read as a string that I can then parse; I just want to be able to query the date column without getting an error.
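One avenue worth sketching (untested against this table): if the column can be exposed as a STRING instead, e.g. by re-creating the table or its external definition with that column typed as a string, the usable prefix parses cleanly. Here raw_date and the table path are placeholder names:
SELECT
  PARSE_DATE('%Y-%m-%d', SUBSTR(raw_date, 1, 10)) AS parsed_date
FROM
  `my_project.my_dataset.my_table`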

BigQuery timestamp field in Data Studio error

I have data in a BigQuery instance with some date fields in epoch/timestamp format. I'm trying to convert them to a YYYYMMDD format or similar in order to create a report in Data Studio. I have tried the following solutions so far:
Changing the format to Date in the Edit Connection menu when creating the Data Source in Data Studio. Not working: I get configuration errors when I add the field to the Data Studio report.
Creating a new field using the TODATE() function. I always get an invalid formula error (even when I follow the documentation for this function). I have tried changing the field type prior to using the TODATE() function. Not working in any case.
Am I doing something wrong? Why do I always get errors?
Thanks!
The function for TODATE() is actually CURRENT_DATE(). Change the timestamp to a DATE using EXTRACT(DATE FROM variableName).
Make sure not to use legacy SQL!
The issue stayed, but changing the name of the variable from actual_delivery_date to ADelDate made it work. So I presume there's a bug, and short(er) names may help to avoid it.
As commented by Elliott Brossard, the solution would be, instead of using Data Studio for the conversion, to use PARSE_DATE or PARSE_TIMESTAMP in BigQuery and convert it there.
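A minimal sketch of that BigQuery-side conversion, assuming an epoch column named event_ts holding seconds (TIMESTAMP_MILLIS is the analogue for milliseconds):
SELECT
  EXTRACT(DATE FROM TIMESTAMP_SECONDS(event_ts)) AS event_date
FROM
  `my_project.my_dataset.my_table`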

Amazon redshift - extracting time from timestamp SQL error

I am trying to extract the time from a datetime column in my Amazon Redshift database (PostgreSQL 8.0). I have already referred to previous questions such as this, but I am getting an unusual error.
When I try:
SELECT collected_timestamp::time
or
SELECT cast(collected_timestamp as time)
I get the following error:
ERROR: Specified types or functions (one per INFO message) not supported on Redshift tables
The goal is to pull the time portion from the timestamp such that 2017-11-06 13:03:28 returns 13:03:28.
This seems like an easy problem to solve but for some reason I am missing something. Researching that error does not lead to anything meaningful. Any help is appreciated.
Note that Redshift <> PostgreSQL - it was forked from PostgreSQL but is very different under the hood.
You're trying to cast a timestamp value to a data type of "time" which does not exist in Redshift. To return a value that is only the time component of a timestamp you will need to cast it to a character data type, e.g.:
SELECT to_char(collected_timestamp, 'HH24:MI:SS');
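A quick check against a literal shows the intended result from the question (this should return 13:03:28):
SELECT to_char(timestamp '2017-11-06 13:03:28', 'HH24:MI:SS');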
There are a few ways; here is one I use:
SELECT ('2018-03-07 21:55:12'::timestamp - trunc('2018-03-07 21:55:12'::timestamp))::time;
I hope that helps.
EDIT: I have made incorrect use of ::time; please see the comments on the other answer.

Azure SQL Data Warehouse - Strange DateTime conversion error/ behaviour

I am reading data from data lake (csv) and when running the below query, I am getting a 'Conversion failed when converting date and/or time from character string' error message.
select convert(datetime, NullIf(ltrim(rtrim([Date started])), ''), 111)
FROM dl.temp
Looked through the data and checked the source file as well, couldn't spot anything unusual.
As soon as I include the * and change the query to the below, everything runs fine and the conversion seems to be doing its job.
select convert(datetime, NullIf(ltrim(rtrim([Date started])), ''), 111),*
from dl.temp
Out of curiosity I also wanted to check the max and minimum date; running MAX returns the literal string 'Date started'. However, when I search for that particular value like below, I don't get any rows returned. It seems like it is returning the column name. Does anyone know what is going on?
select *
from dl.temp
where [Date started] = 'Date started'
I am running this against an Azure Data Warehouse.
I think you'll find the issue is in your external file format.
In the CREATE EXTERNAL FILE FORMAT you probably need to add FIRST_ROW=2 in your FORMAT OPTIONS.
https://learn.microsoft.com/en-us/sql/t-sql/statements/create-external-file-format-transact-sql
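A sketch of what that definition might look like; the delimiter and quote character are assumptions about the CSV, and FIRST_ROW = 2 tells the external reader to skip the header line so the literal string 'Date started' never enters the column:
CREATE EXTERNAL FILE FORMAT csv_skip_header
WITH (
    FORMAT_TYPE = DELIMITEDTEXT,
    FORMAT_OPTIONS (
        FIELD_TERMINATOR = ',',
        STRING_DELIMITER = '"',
        FIRST_ROW = 2
    )
);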

Are there SQL datatypes that don't work with R?

I am trying to run an sqlQuery in RStudio which seems to crash the program. I want to use the RODBC package to import columns called package_name and elapsed_time from an Oracle database. When I try to do an sqlQuery such as the following
dataframe <- sqlQuery(channel,
"select package_name, elapsed_time from fooSchema.barTable")
When I run this with just the package_name or other fields in the table, it works fine. If I try to run this with the elapsed_time, RStudio crashes. The datatype of elapsed_time is INTERVAL DAY(3) TO SECOND(6), so one record for example looks like this: "+000 00:00:00.22723"
Are there certain data types, such as Interval Day to Second, from Oracle that don't work in RStudio or R in general?
The problem isn't R, Rstudio, or even RODBC. The problem is that Oracle doesn't support interval data types for ODBC connections.
This is documented under section E.1:
https://docs.oracle.com/cd/B28359_01/server.111/b32009/app_odbc.htm#CIHBFHCG
To get back to your question in a more general sense: base R supports Date, POSIXct, and POSIXlt objects.
Dates and POSIXct objects are stored as the number of days/seconds respectively since 1/1/1970 whereas POSIXlt is a list of elements.
Whatever SQL connector you're using will need to coerce the SQL version of a date and time into one of the above. Sometimes it'll just convert to a character string. For instance, RPostgreSQL will read columns stored as Postgres's DATE type as character, but Postgres timestamp columns will be coerced into POSIXct directly.
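Back to the original Oracle problem: one common workaround (a sketch, untested) is to render the interval as text on the database side, so the ODBC driver only ever sees a character column that R can then parse itself:
SELECT package_name,
       TO_CHAR(elapsed_time) AS elapsed_time
FROM fooSchema.barTable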