I am using Power Query on a data table coming from Databricks, and I used the date expression Date.From([Date1]) - [Date2], where Date1 is an arbitrary fixed date and Date2 is a column in the table.
The M code I used:
= Table.AddColumn(#"Renamed Columns", "Age", each Date.From(#date(2024,12,31)) - [#"Date"], type duration)
And here is the error I got
org.apache.hive.service.cli.HiveSQLException: Error running query: org.apache.spark.sql.AnalysisException: Undefined function: 'timestampdiff'. This function is neither a registered temporary function nor a permanent function registered in the database 'default'.
This expression works fine on tables loaded from a folder or from SharePoint, just not on a table from Databricks. Is there an alternative way to calculate a date/time difference in Power Query for Databricks sources?
I am using a SQL script to parse JSON into a Snowflake table using dbt.
One of the columns contains this datetime value: '2022-02-09T20:28:59+0000'.
What's the correct way to define ISO datetime's data type in Snowflake?
I tried date, timestamp and TIMESTAMP_NTZ like this in my dbt sql script:
JSON_DATA:"my_date"::TIMESTAMP_NTZ AS MY_DATE
but clearly none of these is correct, because later, when I test it in Snowflake with select *, I get this error:
SQL Error [100040] [22007]: Date '2022-02-09T20:28:59+0000' is not recognized
or
SQL Error [100035] [22007]: Timestamp '2022-02-13T03:32:55+0100' is not recognized
so I need to know which Snowflake date/time data type best suits this value.
EDIT:
This is what I am trying now.
SELECT
JSON_DATA:"date_transmission" AS DATE_TRANSMISSION
, TO_TIMESTAMP(DATE_TRANSMISSION:text, 'YYYY-MM-DDTHH24:MI:SS.FFTZH:TZM') AS DATE_TRANSMISSION_TS_UTC
, JSON_DATA:"authorizerClientId"::text AS AUTHORIZER_CLIENT_ID
, JSON_DATA:"apiPath"::text API_PATH
, MASTERCLIENT_ID
, META_FILENAME
, META_LOAD_TS_UTC
, META_FILE_TS_UTC
FROM {{ source('INGEST_DATA', 'TABLENAME') }}
I get this error:
000939 (22023): SQL compilation error: error line 6 at position 4
10:21:46 too many arguments for function [TO_TIMESTAMP(GET(DATE_TRANSMISSION, 'text'), 'YYYY-MM-DDTHH24:MI:SS.FFTZH:TZM')] expected 1, g
However, if I comment out the first two lines (the ones related to the timestamp), the others work perfectly fine. What's the correct syntax for parsing JSON with TO_TIMESTAMP?
Note that JSON_DATA:"apiPath"::text API_PATH gives the correct value in my Snowflake tables.
Did some testing and it seems you have 2 options.
You can either get rid of the +0000 at the end: left(column_date, len(column_date)-5)
or try_to_timestamp with format
try_to_timestamp('2022-02-09T20:28:59+0000','YYYY-MM-DD"T"HH24:MI:SS+TZHTZM')
TZH and TZM are TimeZone Offset Hours and Minutes
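As a way to sanity-check sample values outside Snowflake, the two options above can be sketched in Python. This is only an illustration: the helper names are hypothetical, and Python's %z directive stands in for TZH plus TZM (it accepts an offset written without a colon, such as +0000).

```python
from datetime import datetime, timedelta

# %z plays the role of TZH + TZM here: it accepts "+0000" (no colon).
ISO_NO_COLON = "%Y-%m-%dT%H:%M:%S%z"


def parse_with_offset(text: str) -> datetime:
    """Option 2: parse the timestamp and keep its UTC offset."""
    return datetime.strptime(text, ISO_NO_COLON)


def strip_offset(text: str) -> datetime:
    """Option 1: drop the trailing 5-character offset and parse the rest."""
    return datetime.strptime(text[:-5], "%Y-%m-%dT%H:%M:%S")


utc_value = parse_with_offset("2022-02-09T20:28:59+0000")
cet_value = parse_with_offset("2022-02-13T03:32:55+0100")
naive_value = strip_offset("2022-02-13T03:32:55+0100")
```

Note that stripping the offset silently discards the +0100 in the second value, so the two options only agree when every offset is +0000.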
So there are 2 main points here.
When you get data from JSON to pass to any of the timestamp functions, those functions want a ::TEXT value, but the values extracted from JSON are still ::VARIANT, so they need to be cast. This is the cause of the error you quote:
(22023): SQL compilation error: error line 6 at position 4
10:21:46 too many arguments for function [TO_TIMESTAMP(GET(DATE_TRANSMISSION, 'text'), 'YYYY-MM-DDTHH24:MI:SS.FFTZH:TZM')] expected 1, g
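Your SQL is also wrong there: with a single colon, :text is read as a path lookup (hence the GET(DATE_TRANSMISSION, 'text') in the error) rather than a cast. It should have been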
TO_TIMESTAMP(DATE_TRANSMISSION::text,
How you handle the timezone format. As others have noted (and as I did in your last question), do you want to ignore the timezone values or read them? I forgot about the TZHTZM formatting. Given that you have timezone data, you should use TO_TIMESTAMP_TZ or TRY_TO_TIMESTAMP_TZ to make sure the time zone data is kept, given that your second example shows +0100.
Putting those together (assuming you didn't want an extra date_transmission as a variant in your data):
SELECT
TO_TIMESTAMP_TZ(JSON_DATA:"date_transmission"::text, 'YYYY-MM-DDTHH24:MI:SS+TZHTZM') AS DATE_TRANSMISSION_TS_UTC
, JSON_DATA:"authorizerClientId"::text AS AUTHORIZER_CLIENT_ID
, JSON_DATA:"apiPath"::text AS API_PATH
, MASTERCLIENT_ID
, META_FILENAME
, META_LOAD_TS_UTC
, META_FILE_TS_UTC
FROM {{ source('INGEST_DATA', 'TABLENAME') }}
You should use timestamp (not date, which does not store the time information), but the format you are using is probably not autodetected. You can specify the input format as YYYY-MM-DD"T"HH24:MI:SSTZHTZM. The autodetected format has a : between TZH and TZM.
I need to find out the schema of a given JSON file. I see that SQL has a schema_of_json function,
and something like this works flawlessly
> SELECT schema_of_json('[{"col":0}]');
ARRAY<STRUCT<`col`: BIGINT>>
But if I query for my table name, it gives me the following error
>SELECT schema_of_json(Transaction) as json_data from table_name;
Error in SQL statement: AnalysisException: cannot resolve 'schemaofjson(`Transaction`)' due to data type mismatch: The input json should be a string literal and not null; however, got `Transaction`.; line 1 pos 7;
The Transaction is one of the columns in my table and after checking it manually I can attest that it is of String type(json).
The SQL statement has to give me the schema of the JSON. How can I do that?
After looking further into the documentation, it is clear that "foldable" means a static (constant) expression, so a JSON column from a table won't work.
For a minimal reproducible example, here you go:
SELECT schema_of_json(CAST('{ "a": "b" }' AS STRING))
As soon as the cast is introduced in the above statement, schema_of_json will fail. It needs a static JSON string as its input.
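Since the function only accepts a literal, one possible workaround is to pull a sample value out of the column and infer its shape outside SQL. The sketch below is a hypothetical stand-in written in plain Python, not Spark's own inference: the function names are made up for illustration, it looks only at the first element of an array, and it treats nulls as STRING.

```python
import json


def infer_type(value) -> str:
    """Return a Spark-style type string for one decoded JSON value."""
    # bool must be checked before int: bool is a subclass of int in Python.
    if isinstance(value, bool):
        return "BOOLEAN"
    if isinstance(value, int):
        return "BIGINT"
    if isinstance(value, float):
        return "DOUBLE"
    if isinstance(value, str):
        return "STRING"
    if isinstance(value, list):
        # Assumption: the first element is representative of the array.
        inner = infer_type(value[0]) if value else "STRING"
        return f"ARRAY<{inner}>"
    if isinstance(value, dict):
        fields = ", ".join(f"`{k}`: {infer_type(v)}" for k, v in value.items())
        return f"STRUCT<{fields}>"
    return "STRING"  # null carries no type information


def schema_of_json_text(text: str) -> str:
    """Decode a JSON string (e.g. one sampled row value) and describe it."""
    return infer_type(json.loads(text))
```

For the literal from the question, schema_of_json_text('[{"col":0}]') produces the same text as the SQL call. The resulting string could then be pasted back into a query as a constant, assuming all rows share the sampled row's structure.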
Need Help in DTS.
After creating a table "allorders" with an autodetected schema, I created a data transfer service (DTS) transfer. But when I ran it, I got an error (see the job below). The quantity field type is definitely set to integer, and all the data in that field are whole numbers.
Job bqts_602c3b1a-0000-24db-ba34-30fd38139ad0 (table allorders) failed
with error INVALID_ARGUMENT: Error while reading data, error message:
Could not parse 'quantity' as INT64 for field quantity (position 14)
starting at location 0 with message 'Unable to parse'; JobID:
956421367065:bqts_602c3b1a-0000-24db-ba34-30fd38139ad0
When I recreated the table and set all fields to type string, it worked fine (see the job below).
Job bqts_607cef13-0000-2791-8888-001a114b79a8 (table allorders)
completed successfully. Number of records: 56017, with errors: 0.
Try to find unparseable values in the table with all string fields:
SELECT *
FROM dataset.table
WHERE SAFE_CAST(value AS INT64) IS NULL;
We ran this query on BigQuery:
SELECT DateTime, Source, MachineName, LogLevel, Identifier, Message, Exception
FROM TABLE_DATE_RANGE(XXXX.EventLog_, TIMESTAMP(Current_Date()), TIMESTAMP(Current_Date()))
Where source like 'Sync' and (MachineName like 'WEBNEW' or Identifier like 'WEBNEW')
Order by DateTime desc
LIMIT 100;
It gave us:
Error: Cannot read tablet : Incompatible types. 'DateTime' : TYPE_MESSAGE 'DateTime' : TYPE_INT64
Job ID: red-road-574:job_t5gM9MysBFi20PFZ88kgTO8ygvQ
When we removed only "Order by DateTime desc", the query ran fine.
We wonder why this happens and how to fix it.
This was a transient issue in BigQuery. Everything should be working normally now.