Undefined function: 'timestampdiff'. This function is neither a registered temporary function nor a permanent function registered in the database - apache-spark-sql

I am using Power Query on a data table coming from Databricks and used a date function, Date.From([Date1]) - [Date2], where Date1 is an arbitrary fixed date and Date2 is a column in the table.
The M code I used:
= Table.AddColumn(#"Renamed Columns", "Age", each Date.From(#date(2024,12,31)) - [#"Date"], type duration)
And here is the error I got
org.apache.hive.service.cli.HiveSQLException: Error running query: org.apache.spark.sql.AnalysisException: Undefined function: 'timestampdiff'. This function is neither a registered temporary function nor a permanent function registered in the database 'default'.
This function works fine on tables from a folder or SharePoint, just not on tables from Databricks. Is there an alternative way to calculate a date/time difference in Power Query for Databricks sources?
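One alternative worth trying (a sketch only, not verified against Databricks; the view and table names below are hypothetical) is to push the calculation down to the source with Spark SQL's datediff function, so that Power Query simply imports a ready-made Age column:
-- datediff(end, start) returns the difference in whole days (Spark SQL)
CREATE OR REPLACE VIEW my_table_with_age AS
SELECT t.*,
       datediff(DATE'2024-12-31', t.`Date`) AS Age
FROM my_table t;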

Related

How to resolve this SQL error with schema_of_json

I need to find out the schema of a given JSON file. I see SQL has a schema_of_json function,
and something like this works flawlessly:
> SELECT schema_of_json('[{"col":0}]');
ARRAY<STRUCT<`col`: BIGINT>>
But if I run it against a column of my table, it gives me the following error:
>SELECT schema_of_json(Transaction) as json_data from table_name;
Error in SQL statement: AnalysisException: cannot resolve 'schemaofjson(`Transaction`)' due to data type mismatch: The input json should be a string literal and not null; however, got `Transaction`.; line 1 pos 7;
Transaction is one of the columns in my table, and after checking it manually I can attest that it is of String type (JSON).
The SQL statement is supposed to give me the schema of that JSON; how do I do it?
After looking further into the documentation, it became clear that foldable means a static (literal) value, so a JSON column from a table won't work.
For a minimal reproducible example:
SELECT schema_of_json(CAST('{ "a": "b" }' AS STRING))
As soon as the cast is introduced in the above statement, schema_of_json fails. It needs a static JSON literal as its input.
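A common workaround (a minimal sketch; the sample JSON and schema string below are hypothetical) is to copy one representative value of the Transaction column, paste it in as a string literal so that schema_of_json sees something foldable, and then use the inferred schema with from_json to parse the whole column:
-- 1. Paste a representative sample of the column as a literal
SELECT schema_of_json('[{"id": 1, "amount": 9.99}]');
-- returns something like ARRAY<STRUCT<amount: DOUBLE, id: BIGINT>>

-- 2. Use that schema string to parse the real column
SELECT from_json(Transaction, 'ARRAY<STRUCT<amount: DOUBLE, id: BIGINT>>') AS parsed
FROM table_name;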

Error message when trying to run a function inside a function

I created two functions that worked well as stand-alone functions.
But when I try to run one of them inside the other, I get an error:
'SQL compilation error: Unsupported subquery type cannot be evaluated'
Can you advise?
Adding the code below:
CASE
  WHEN (regexp_substr(ARGUMENTS_JSON, 'drives_removed":\\[]') = 'drives_removed":[]') THEN priceperitem(1998, returnitem_add_item(ARGUMENTS_JSON))
  WHEN (regexp_substr(ARGUMENTS_JSON, 'drives_added":\\[\\]') = 'drives_added":[]') THEN 1
  WHEN (regexp_substr(ARGUMENTS_JSON, 'from_flavor') = 'from_flavor') THEN 1
  WHEN (regexp_substr(ARGUMENTS_JSON, '{}') = '{}') THEN 1
  ELSE 'Other'
END as Price_List,
The problem happens with the function 'priceperitem':
if I replace returnitem_add_item with a literal string, it works fine.
Function 1: priceperitem takes a customer number and an item and returns the price for that item from the customer's pricing list.
Function 2: returnitem_add_item parses a string and returns a string.
As an alternative, you may try calling a UDF within a UDF in the following way:
-- first UDF: returns a constant value
create function x()
returns integer
as
$$
select 1+2
$$
;
-- capture the result of x() in a session variable
set q=x();
-- second UDF: takes the value as an argument instead of calling x() directly
create function pi_udf(q integer)
returns integer
as
$$
select q*3
$$
;
-- call the second UDF with the session variable
select pi_udf($q);
If this does not work out either, the issue might be data-specific. Try executing the function with one record, then add more records to see the difference in behavior. There are certain limitations on using SQL UDFs in Snowflake, hence the 'Unsupported subquery' error.
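For what it's worth, a SQL UDF can usually call another SQL UDF directly, as long as the inner body can be inlined (the error above typically appears when the inner UDF contains a subquery that cannot be flattened). A minimal sketch with hypothetical names:
-- inner UDF
create or replace function add_one(x integer)
returns integer
as
$$
select x + 1
$$
;
-- outer UDF calling the inner one directly
create or replace function double_plus_one(x integer)
returns integer
as
$$
select add_one(x) * 2
$$
;
select double_plus_one(10);  -- returns 22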

ERROR: column mm.geom does not exist in PostgreSQL execution using R

I am trying to run a model in R which calls functions from GRASS GIS (version 7.0.2) and PostgreSQL (version 9.5) to complete the task. I have created a database in PostgreSQL, created the PostGIS extension, and then imported the required vector layers into the database using the PostGIS shapefile importer. Every time I try to run it from R (run as an administrator), it returns an error like:
Error in fetch(dbSendQuery(con, q, n = -1)) :
error in evaluating the argument 'res' in selecting a method for function 'fetch': Error in postgresqlExecStatement(conn, statement, ...) :
RS-DBI driver: (could not Retrieve the result : ERROR: column mm.geom does not exist
LINE 5: (st_dump(st_intersection(r.geom, mm.geom))).geom as geom,
^
HINT: Perhaps you meant to reference the column "r.geom".
QUERY:
insert into m_rays
with os as (
select r.ray, st_endpoint(r.geom) as s,
(st_dump(st_intersection(r.geom, mm.geom))).geom as geom,
mm.legend, mm.hgt as hgt, r.totlen
from rays as r,bh_gd_ne_clip as mm
where st_intersects(r.geom, mm.geom)
)
select os.ray, os.geom, os.hgt, l.absorb, l.barrier, os.totlen,
st_length(os.geom) as shape_length, st_distance(os.s, st_endpoint(os.geom)) as near_dist
from os left join lut as l
on os.legend = l.legend
CONTEXT: PL/pgSQL function do_crtn(text,text,text) line 30 at EXECUTE
I have checked over and over again; the geometry column does exist under Schema > Public > Views in PostgreSQL. Any advice on how to resolve this error?
Add quotes and use r."geom" instead of r.geom.
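If quoting alone doesn't help, it may also be worth confirming what the column is actually called (and how it is cased) in the table behind the mm alias; a quick check in plain SQL:
-- list the columns of the table aliased as mm in the query above
SELECT column_name, data_type
FROM information_schema.columns
WHERE table_name = 'bh_gd_ne_clip';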

SparkSQL errors when using SQL DATE function

In Spark I am trying to execute SQL queries on a temporary table derived from a data frame that I manually built by reading a CSV file and converting the columns into the right data types.
Specifically, the table I'm talking about is the LINEITEM table from the TPC-H specification [1]. Unlike the specification, I am using TIMESTAMP rather than DATE because I've read that Spark does not support the DATE type.
In my single scala source file, after creating the data frame and registering a temporary table called "lineitem", I am trying to execute the following query:
val res = sqlContext.sql("SELECT * FROM lineitem l WHERE date(l.shipdate) <= date('1998-12-01 00:00:00');")
When I submit the packaged jar using spark-submit, I get the following error:
Exception in thread "main" java.lang.RuntimeException: [1.75] failure: ``union'' expected but `;' found
When I omit the semicolon and do the same thing, I get the following error:
Exception in thread "main" java.util.NoSuchElementException: key not found: date
Spark version is 1.4.0.
Does anyone have an idea what's the problem with these queries?
[1] http://www.tpc.org/TPC_Documents_Current_Versions/pdf/tpch2.17.1.pdf
SQL queries passed to SQLContext.sql shouldn't be terminated with a semicolon; this is the source of your first problem.
The DATE UDF expects a date in the YYYY-MM-DD form, so DATE('1998-12-01 00:00:00') evaluates to null. Since a timestamp can be cast to DATE, the correct query string looks like this:
"SELECT * FROM lineitem l WHERE date(l.shipdate) <= date('1998-12-01')"
DATE is a Hive UDF, which means you have to use a HiveContext, not the standard SQLContext; this is the source of your second problem.
import org.apache.spark.sql.hive.HiveContext
val sqlContext = new HiveContext(sc) // where sc is a SparkContext
In Spark >= 1.5 it is also possible to use the to_date function:
import sqlContext.implicits._  // needed for the $"colname" column syntax
import org.apache.spark.sql.functions.{lit, to_date}
df.where(to_date($"shipdate") <= to_date(lit("1998-12-01")))
Please try the Hive function CAST(expression AS type).
It converts an expression from one datatype to another.
E.g. CAST('2016-06-17 00.00.000' AS DATE) will convert a String to a Date.
In your case:
val res = sqlContext.sql("SELECT * FROM lineitem l WHERE CAST(l.shipdate AS DATE) <= CAST('1998-12-01 00:00:00' AS DATE)")
Supported datatype conversions are listed under Hive Casting Dates.

BigQuery: "Invalid function name"

In BigQuery, I ran this query
SELECT SEC_TO_TIMESTAMP(start_time), resource
FROM [logs.requestlogs_20140305]
and received the error
Error: Invalid function name: SEC_TO_TIMESTAMP
SEC_TO_TIMESTAMP is listed in the date functions reference, so why does this error come up?
It turns out my start_time column was of type float, not the integer that SEC_TO_TIMESTAMP expects. Changing the query to
SELECT SEC_TO_TIMESTAMP(INTEGER(start_time)), resource
FROM [logs.requestlogs_20140305]
fixed the problem.
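For reference, a roughly equivalent query in BigQuery standard SQL (a sketch, assuming the same table is reachable from standard SQL) casts the float to INT64 and uses TIMESTAMP_SECONDS:
SELECT TIMESTAMP_SECONDS(CAST(start_time AS INT64)) AS start_ts, resource
FROM `logs.requestlogs_20140305`;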