I'm trying to use jQuery in UDFs written in JavaScript for BigQuery. I uploaded the jQuery library to my Cloud Storage bucket, but when I reference it from my UDF I get this error:
TypeError: Cannot read property 'createElement' of undefined at gs://mybucket/jquery.min.js line 2, columns 7311-7312
Any help, please?
Thank you.
CREATE TEMP FUNCTION test()
RETURNS STRING
LANGUAGE js
OPTIONS (
library=["gs://mybucket/jquery.min.js"]
)
AS """
return "test";
""";
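As you might know, some limitations apply to temporary and persistent user-defined functions.
One of them: the DOM objects Window, Document, and Node, and functions that require them, are not supported.
This is most likely the reason: the error shows createElement being read from an undefined object, which is what happens when jQuery looks for the document while the library is loading.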
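To illustrate the failure mode, here is a minimal sketch of what happens when a DOM-dependent library initializes in a runtime without a document. The names initDomLibrary and tryLoad are hypothetical and exist only for this example; this is not jQuery's actual code, only the same kind of property access on an undefined object.

```javascript
// Stand-in for a library's load-time setup. `doc` plays the role of the
// global `document`, which is assumed to be undefined inside a BigQuery UDF.
function initDomLibrary(doc) {
  // Reading a property of undefined throws a TypeError.
  return doc.createElement('div');
}

// Reports whether the load-time setup succeeded or hit the TypeError.
function tryLoad(doc) {
  try {
    initDomLibrary(doc);
    return 'loaded';
  } catch (e) {
    return e instanceof TypeError ? 'TypeError' : 'other error';
  }
}

// No DOM available, as in a UDF.
console.log(tryLoad(undefined));
// A fake document object, as a browser would provide.
console.log(tryLoad({ createElement: function () { return {}; } }));
```

The practical consequence is that only libraries that never touch the DOM at load time are safe to list in the library option.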
I have a UDF defined like so:
def my_function(input: Array[Byte])
and I want to call it in Spark SQL, so I'm trying
SELECT my_function(binary(CONCAT(*))) FROM table;
but I don't think this is working. To my understanding, select * will return Array[Row], and then calling the native function binary will serialize that. Will that convert Array[Row] to Array[Byte]? I'm not sure how to call this UDF via SQL.
You have to register the function first; then you can use the UDF in SQL,
i.e.
spark.udf.register("my_function", my_function _)
You can explore more at the link.
As illustrated here, dumping the wasm bytecode and copy-pasting it into the JavaScript seems difficult.
I guess you mean a better way than copying into JS - I haven't investigated that (yet), but this will make UDFs easier for others to use:
Move the .js out of the query into a file.
Create a persistent function.
Then people will be able to call it like this:
SELECT fhoffa.x.sample_wasm_udf([2,3,4])
To create this function I did:
CREATE OR REPLACE FUNCTION fhoffa.x.sample_wasm_udf(x ARRAY<INT64>)
RETURNS ARRAY<INT64>
LANGUAGE js AS '''
return main(x)
'''
OPTIONS (library="gs://fh-bigquery/js/wasm.udf.js");
For more on persistent functions, see:
https://medium.com/@hoffa/new-in-bigquery-persistent-udfs-c9ea4100fd83
I have been looking into how to write a UDF in BigQuery and found this syntax:
CREATE { TEMPORARY | TEMP } FUNCTION
function_name ([named_parameter[, ...]])
[RETURNS data_type]
{ [LANGUAGE language AS """body"""] | [AS (function_definition)] };
The documentation I found does not clearly say which languages are supported. The examples on the page only use "js", and I can't find examples in any other language, so I presume only JavaScript is supported. Does anyone know for sure?
From that same page:
Supported external UDF languages
External UDFs support code written in JavaScript, which you specify using js as the LANGUAGE.
You can't use languages other than JavaScript.
Here I have a scalar UDF that checks if a URL is one of my domains. To check if 'to_site' is one of my domains, I'm using indexOf in JavaScript.
CREATE TEMPORARY FUNCTION our_domain(to_site STRING)
RETURNS BOOLEAN
LANGUAGE js AS """
var domains = ['abc.com', 'xyz.com'];
if (to_site == null || to_site == undefined) return false;
for (var i = 0; i < domains.length; i++){
// var q = DOMAIN('XYZ'); // what I would like to do; calling this from JS fails
if (String.prototype.toLowerCase.call(to_site).indexOf(domains[i]) !== -1)
return true;
}
return false;
""";
SELECT our_domain('www.foobar.com'), our_domain('www.xyz.com');
This returns false, then true.
It would be much nicer if I could use the DOMAIN(url) function from JavaScript. indexOf is not very good because it will match www.example.com?from=www.abc.com, when actually example.com is not one of my domains. JavaScript also has (new URL('www.example.com/q/z')).hostname to parse the domain component, but it includes the subdomain, like 'www.', which complicates the comparison. BigQuery's DOMAIN(url) function gives only the domain and, knowing Google, it's fast C++.
I know I can do this
our_domain(DOMAIN('www.xyz.com'))
But in general it would be nice to use some of the BigQuery API functions in JavaScript. Is this possible?
I also tried this
CREATE TEMPORARY FUNCTION our_domain1(to_site STRING)
AS (our_domain(DOMAIN(to_site)));
but it fails saying DOMAIN does not exist.
The DOMAIN() function is supported in BigQuery Legacy SQL, whereas scalar UDFs are part of BigQuery Standard SQL.
So, unfortunately, no: you cannot use the DOMAIN() function in code that uses a scalar UDF, at least as of now.
And no, you cannot use SQL functions within JS [scalar] UDFs, but you can use them in SQL UDFs.
Finally, as I suggested in my answer to your previous question, in a simple scenario like yours you are better off using a SQL scalar UDF than a JS scalar UDF. SQL UDFs do not have the limits that JS UDFs have.
The DOMAIN function in legacy SQL is more or less just a regular expression. Have you seen this previous question about DOMAIN? As Mikhail points out, you should be able to define a SQL UDF that uses a regex to extract the domain and then checks if it's in your list.
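The regex idea can be sketched in JavaScript as well, which shows why host extraction avoids the false positive that indexOf produces. This is a rough sketch under assumptions: extractHost and isOurDomain are hypothetical names, the regex is a simplification and not the implementation of legacy DOMAIN(), and a subdomain of a listed domain is treated as a match. It uses only regex and string operations, so nothing here depends on the DOM.

```javascript
// Assumed list of domains, copied from the question.
var OUR_DOMAINS = ['abc.com', 'xyz.com'];

// Pull the host out of a URL-like string: skip an optional scheme and
// optional userinfo, then stop at the first '/', '?', '#', or ':'.
function extractHost(url) {
  if (url === null || url === undefined) return null;
  var re = /^(?:[a-z][a-z0-9+.-]*:\/\/)?(?:[^\/?#@]*@)?([^\/?#:]+)/i;
  var match = re.exec(String(url));
  return match ? match[1].toLowerCase() : null;
}

// True when the host equals one of our domains or is a subdomain of one.
function isOurDomain(url) {
  var host = extractHost(url);
  if (host === null) return false;
  for (var i = 0; i < OUR_DOMAINS.length; i++) {
    var d = OUR_DOMAINS[i];
    // Compare against '.' + d so that 'notabc.com' does not match 'abc.com'.
    if (host === d || host.slice(-(d.length + 1)) === '.' + d) return true;
  }
  return false;
}

console.log(isOurDomain('www.xyz.com'));
console.log(isOurDomain('www.example.com?from=www.abc.com'));
```

The same comparison could be expressed in a SQL UDF with a regex extract followed by a list check, which is the route suggested above.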
I cannot figure out how to return JSONP in RestXQ. After adding
let $x := util:declare-option("exist:serialize", fn:concat("method=json jsonp=",request:get-parameter("callback", "callback")))
to the function, I get the error message:
err:XPTY0004:It is a type error if, during the static analysis phase, an expression is found to have a static type that is not appropriate for the context in which the expression occurs, or during the dynamic evaluation phase, the dynamic type of a value does not match a required type as specified by the matching rules in 2.5.4 SequenceType Matching.
The beginning of the GET function is:
declare
%rest:GET
%rest:path("/demo/contacts/submit")
%rest:query-param("email", "{$email}", '')
%rest:query-param("nomail", "{$nomail}", 0)
%rest:produces("application/javascript")
%output:media-type("application/javascript")
%output:method("json")
function contacts:submit($email as xs:string*, $nomail as xs:integer*)
{
try
{
let $x := util:declare-option("exist:serialize", fn:concat("method=json jsonp=",request:get-parameter("callback", "callback")))
As discussed on the eXist-open mailing list (I'd suggest joining!), the request module's get-parameter() function is not available inside a RestXQ function. Instead, you can get your callback parameter via the %rest:query-param annotation. Add %rest:query-param("callback", "{$callback}", '') to your contacts:submit() function, and I think you'll be a step closer.
@joewiz is correct. Your initial problem is related to the use of the eXist request module from RESTXQ, which is unsupported.
Also, RESTXQ does not currently support JSONP serialization. If you want to use JSONP serialization, your best bet at the moment is to manage the serialization to JSON yourself, perhaps using the xqjson library or similar, and then wrap the result in a JavaScript function call using concat or similar.
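The wrapping step is plain string concatenation. The sketch below shows the shape of the output in JavaScript only for illustration, since the real code would be XQuery. The function toJsonp is a hypothetical name, and the check on the callback name is an added assumption to keep arbitrary script out of the response. The fallback name "callback" mirrors the default used in the question.

```javascript
// Wrap already-serializable data in a JSONP call: callbackName(<json>)
function toJsonp(callback, data) {
  // Assumption: accept only identifier-like callback names, else fall back.
  var valid = typeof callback === 'string' &&
    /^[A-Za-z_$][A-Za-z0-9_$.]*$/.test(callback);
  var name = valid ? callback : 'callback';
  return name + '(' + JSON.stringify(data) + ')';
}

console.log(toJsonp('handleContacts', { ok: true }));
// prints: handleContacts({"ok":true})
```

In XQuery the equivalent would be a concat of the callback name, an opening parenthesis, the serialized JSON string, and a closing parenthesis, with the callback taken from a %rest:query-param annotation.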