How can I use the new UDF functionality to create a "Dynamic SQL statement"? - google-bigquery

Is there a way to use a UDF to construct a SQL statement from a template and input variables, and then run that query?

The documentation https://cloud.google.com/bigquery/user-defined-functions?hl=en says:
A UDF is similar to the "Map" function in a MapReduce: it takes a
single row as input and produces zero or more rows as output. The
output can potentially have a different schema than the input.
So your UDF receives just a single row at a time and can only emit rows in response; it does not build or execute a query.
Therefore, no: a UDF is not meant for the purpose you describe in your question.
You might take a look at views, which may suit you better:
https://cloud.google.com/bigquery/querying-data#views

Related

how to use SQL user defined function in snowflake?

I am just learning how to use SQL in Snowflake. Here is a snapshot:
And this is the code used there:
use schema SNOWFLAKE_SAMPLE_DATA.TPCH_SF1;
--use schema SNOWFLAKE_SAMPLE_DATA.TPCH_SF10;
select *
from LINEITEM
limit 200
You can see the table includes two fields: L_LINENUMBER and L_QUANTITY. Now I want to try a user-defined function that does the following:
1. take L_LINENUMBER and L_QUANTITY as two parameters passed into the function,
2. calculate L_LINENUMBER1 = L_LINENUMBER + 1 and L_QUANTITY1 = mean(L_QUANTITY),
3. join the two new fields (L_LINENUMBER1, L_QUANTITY1) to the original table (LINEITEM).
How do I use CREATE FUNCTION to do this? I have read a lot of examples of CREATE FUNCTION, but I just cannot get the point, maybe because I am not good at SQL. Could anyone give me a comprehensive example with all the details?
I understand that your question is about UDFs, but using a UDF for your purpose here is overkill.
You can increment an attribute in a table using the following statement.
SELECT
L_LINENUMBER+1 as L_LINENUMBER1
FROM LINEITEM;
To calculate the mean of an attribute in a table, you should understand that AVG is an aggregate function: it collapses many rows into one value, either over the whole table or, combined with a GROUP BY clause, once per group. An example with your data, grouped by order, is shown below.
SELECT
AVG(L_QUANTITY) AS L_QUANTITY1
FROM LINEITEM
GROUP BY L_ORDERKEY;
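To get both new columns next to the original ones, as the question asks, one option is a window function instead of GROUP BY, so that no rows are collapsed. This is only a sketch: it assumes the mean is wanted per order (L_ORDERKEY, as in the grouping above); writing OVER () instead would give the mean across the whole table.

```sql
-- Keep every LINEITEM row and append the two derived columns.
SELECT
    li.*,
    li.L_LINENUMBER + 1 AS L_LINENUMBER1,
    -- Window aggregate: mean quantity of the order this row belongs to.
    AVG(li.L_QUANTITY) OVER (PARTITION BY li.L_ORDERKEY) AS L_QUANTITY1
FROM SNOWFLAKE_SAMPLE_DATA.TPCH_SF1.LINEITEM AS li
LIMIT 200;
```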
Since your question was originally about UDFs and you seem to be following along with Snowflake's sample data, here is the example UDF that they provide. It accepts a temperature in Kelvin and converts it to Fahrenheit; from the definition you can see that it can be applied to any attribute of the NUMBER type.
CREATE OR REPLACE FUNCTION
UTIL_DB.PUBLIC.convert_fahrenheit( t NUMBER)
RETURNS NUMBER
COMMENT='Convert from Kelvin to Fahrenheit'
AS '(t - 273.15) * 1.8000 + 32.00';
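The same pattern covers the "+1" part of the question. The sketch below uses a hypothetical function name, add_one, and assumes you can create objects in UTIL_DB.PUBLIC as in the definition above (the shared sample database itself is read-only, so the function has to live elsewhere). A scalar UDF like this sees one row at a time, so it cannot compute the mean; that part still needs AVG.

```sql
-- Hypothetical scalar SQL UDF: returns its argument plus one.
CREATE OR REPLACE FUNCTION UTIL_DB.PUBLIC.add_one(n NUMBER)
RETURNS NUMBER
COMMENT='Add one to a number'
AS 'n + 1';

-- Call it like any built-in function, once per row.
SELECT
    L_LINENUMBER,
    UTIL_DB.PUBLIC.add_one(L_LINENUMBER) AS L_LINENUMBER1
FROM SNOWFLAKE_SAMPLE_DATA.TPCH_SF1.LINEITEM
LIMIT 10;
```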

How to define functions for any column (scalar UDF) on Google BigQuery

Let's say I need to define a function that behaves like UPPER(string). We can call it FIRSTCHAR(string); it returns the first character of a string.
So I would like to write SQL like:
SELECT FIRSTCHAR(middle_name) AS middle_name_first_char,
FIRSTCHAR(last_name) AS last_name_first_char FROM clients
From the BigQuery UDF documentation it is not clear how to write such a function so that it works on a string from any table or column. It looks like defining a function with bigquery.defineFunction() requires an argument listing the input column names.
As far as I know, scalar UDFs are not available yet in BigQuery. Current UDFs are table-wise only: you supply a table to the UDF, and it processes that table row by row, outputting 0, 1 or many rows (depending on your implementation) for each input row.
I remember a Google team member mentioning that they are working on making scalar UDFs available at some point.
I assume the simplified example in your question is only there to demonstrate the point, so I am not providing an actual solution for it (which would be a very simple use of string functions).
2016-08-11 UPDATE
Scalar UDFs are now supported in BigQuery Standard SQL.
See the examples below.
JS UDF
CREATE TEMPORARY FUNCTION FIRSTCHAR(word STRING)
RETURNS STRING
LANGUAGE js
AS "return word.substring(0, 1);";
SELECT
FIRSTCHAR(middle_name) AS middle_name_first_char,
FIRSTCHAR(last_name) AS last_name_first_char
FROM clients
SQL UDF
CREATE TEMPORARY FUNCTION FIRSTCHAR(word STRING)
RETURNS STRING
AS (SUBSTR(word, 1, 1));
SELECT
FIRSTCHAR(middle_name) AS middle_name_first_char,
FIRSTCHAR(last_name) AS last_name_first_char
FROM clients

What is the use case for Merge function SQL Clr?

I am writing a CLR user-defined aggregate function to implement median. I understand all the other methods I have to implement, but I cannot understand what the Merge method is for.
My vague idea is that if the aggregate is partially evaluated (i.e. evaluated for some rows of a group in one instance and for the remaining rows in another), then the partial values need to be combined. If that is the case, is there a way to test this?
Please let me know if any of the above is not clear or if you need any further information.
Your vague idea is correct.
From Requirements for CLR User-Defined Aggregates
This method can be used to merge another instance of this aggregate
class with the current instance. The query processor uses this method
to merge multiple partial computations of an aggregation.
The parameter to Merge is another instance of your aggregate, and you should merge the aggregated data in that instance into your current instance.
You can have a look at the sample string concatenation aggregate. Its Merge method adds the concatenated strings from the parameter to the current instance of the aggregate class.

Wrap SQL CONTAINS as an expression?

I have a question. I am working on a site in ASP.NET that uses an ORM. I need to use a couple of full-text search functions, such as CONTAINS. But when I try to generate it with that ORM, it produces SQL like this:
SELECT
[Extent1].[ID] AS [ID],
[Extent1].[Name] AS [Name]
FROM [dbo].[SomeTable] AS [Extent1]
WHERE (Contains([Extent1].[Name], N'qq')) = 1
SQL Server can't parse it, because CONTAINS is a predicate and doesn't return a bit value. Unfortunately I can't modify the SQL query generation process, but I can modify the statements in it.
My question is: is it possible to wrap the call to CONTAINS in something else? I tried to create another function that does a SELECT with CONTAINS, but it requires specific table and column objects, and I don't want to write one function for each table.
EDIT
I can modify the result type for that function in the ORM. In the previous sample the result type is Bit. I can change it to int, nvarchar, etc. But as I understand it, there is no Boolean type in SQL Server, so I can't specify it.
Can't you put this in a stored procedure, and tell your ORM to call the stored procedure? Then you don't have to worry about the fact that your ORM only understands a subset of valid T-SQL.
I don't know that I believe the argument that requiring new stored procedures is a blocker. If you have to write a new CONTAINS expression in your ORM code, how much different is it to wrap that expression in a CREATE PROCEDURE statement in a different window? If you want to do this purely in ORM, then you're going to have to put pressure on the vendor to pick up the pace and start getting more complete coverage of the language they should fully support.
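A stored procedure along those lines might look like the sketch below. The procedure name and the @term parameter are made up for illustration; the table and columns come from the generated query in the question, and a full-text index on [Name] is assumed to exist.

```sql
-- Hypothetical wrapper: CONTAINS is used directly as a predicate,
-- so nothing has to compare its result with 1.
CREATE PROCEDURE dbo.SearchSomeTableByName
    @term nvarchar(4000)
AS
BEGIN
    SET NOCOUNT ON;

    SELECT t.[ID], t.[Name]
    FROM [dbo].[SomeTable] AS t
    WHERE CONTAINS(t.[Name], @term);
END;
```

It would then be called with something like EXEC dbo.SearchSomeTableByName @term = N'qq'; and the ORM would map the result set as usual.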

How to pass an entire row (in SQL, not PL/SQL) to a stored function?

I am having the following (pretty simple) problem. I would like to write an (Oracle) SQL query, roughly like the following:
SELECT count(*), MyFunc(MyTable.*)
FROM MyTable
GROUP BY MyFunc(MyTable.*)
Within PL/SQL, one can use a RECORD type (and/or %ROWTYPE), but to my knowledge, these tools are not available within SQL. The function expects the complete row, however. What can I do to pass the entire row to the stored function?
Thanks!
Don't think you can.
Either create the function with all the arguments you need, or pass the id of the row and do a SELECT within the function.
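The first option could be sketched as follows. The column names col1 and col2, their types, and the function body are placeholders, since the question does not show the real table or what MyFunc computes.

```sql
-- Hypothetical signature: one parameter per column the function needs.
CREATE OR REPLACE FUNCTION MyFunc (
    p_col1 IN NUMBER,
    p_col2 IN VARCHAR2
) RETURN VARCHAR2
IS
BEGIN
    -- Placeholder logic: replace with whatever the real function computes.
    RETURN p_col2 || '-' || TO_CHAR(MOD(p_col1, 10));
END MyFunc;
/

-- Pass the columns explicitly instead of MyTable.*
SELECT COUNT(*), MyFunc(t.col1, t.col2) AS bucket
FROM MyTable t
GROUP BY MyFunc(t.col1, t.col2);
```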