Declare variables in a BigQuery scheduled query

I am developing a scheduled query in which I use a WITH statement to join and filter several BigQuery tables. To filter on dates, I would like to declare the following variables:
DECLARE initial, final DATE;
SET initial = DATE_TRUNC(DATE_TRUNC(CURRENT_DATE(), MONTH)+7,ISOWEEK);
SET final = LAST_DAY(DATE_TRUNC(CURRENT_DATE(), MONTH)+7, ISOWEEK);
However, when I execute this query I get two results: one for the declared variables (which I do not want as output) and one for the final SELECT over the WITH statement (which has the results I am interested in).
The main problem is that whenever I try to connect this scheduled query to a table in Google Data Studio, I get the following error:
Invalid value: configuration.query.destinationTable cannot be set for scripts;
How can I declare a variable without getting it as a result at the end?
Here is a sample of the code I am trying to get working:
DECLARE initial, final DATE;
SET initial = DATE_TRUNC(DATE_TRUNC(CURRENT_DATE(), MONTH)+7,ISOWEEK);
SET final = LAST_DAY(DATE_TRUNC(CURRENT_DATE(), MONTH)+7, ISOWEEK);
WITH HelloWorld AS (
SELECT shop_date, revenue
FROM fulltable
WHERE shop_date >= initial
AND shop_date <= final
)
SELECT * from HelloWorld;

with initial1 as ( select DATE_TRUNC(DATE_TRUNC(CURRENT_DATE(), MONTH)+7,ISOWEEK) as initial2),
final1 as ( select LAST_DAY(DATE_TRUNC(CURRENT_DATE(), MONTH)+7, ISOWEEK) as final2),
HelloWorld AS (
SELECT shop_date, revenue
FROM fulltable
WHERE shop_date >= (select initial2 from initial1) AND shop_date <= (select final2 from final1)
)
SELECT * from HelloWorld;

With a config table that has just one row, cross-joined with your table, your query can be written as below.
WITH config AS (
SELECT DATE_TRUNC(DATE_TRUNC(CURRENT_DATE(), MONTH)+7,ISOWEEK) AS initial,
LAST_DAY(DATE_TRUNC(CURRENT_DATE(), MONTH)+7, ISOWEEK) AS final
),
HelloWorld AS (
SELECT * FROM UNNEST([DATE '2022-06-06']) shop_date, config
WHERE shop_date >= config.initial AND shop_date <= config.final
)
SELECT * FROM HelloWorld;
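To sanity-check which dates the config CTE produces, the same arithmetic can be mirrored outside BigQuery. The following is only a sketch in Python: the function name `iso_week_window` is made up for illustration, and it assumes that ISOWEEK truncation steps back to Monday and that LAST_DAY with ISOWEEK lands on the Sunday of the same week.

```python
from datetime import date, timedelta
from typing import Tuple


def iso_week_window(today: date) -> Tuple[date, date]:
    """Return (initial, final) for the ISO week containing the 8th of today's month."""
    # Mirrors DATE_TRUNC(CURRENT_DATE(), MONTH) + 7
    anchor = today.replace(day=1) + timedelta(days=7)
    # Mirrors DATE_TRUNC(anchor, ISOWEEK): step back to Monday (weekday() == 0)
    initial = anchor - timedelta(days=anchor.weekday())
    # Mirrors LAST_DAY(anchor, ISOWEEK): the Sunday of the same ISO week
    final = initial + timedelta(days=6)
    return initial, final


print(iso_week_window(date(2022, 6, 15)))
```

For any date in June 2022 this gives 2022-06-06 through 2022-06-12, which is why the sample row DATE '2022-06-06' passes the filter.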

A few patterns I've used:
If you have many constants that share the same return type (STRING):
CREATE TEMP FUNCTION config(key STRING)
RETURNS STRING AS (
CASE key
WHEN "timezone" THEN "America/Edmonton"
WHEN "something" THEN "Value"
END
);
Then use config(key) to retrieve the value.
Or create a function for each constant:
CREATE TEMP FUNCTION timezone()
RETURNS STRING AS ("America/Edmonton");
Then use timezone() to get the value.
The function is executed each time it is called, so don't do anything expensive in there (like a SELECT from another table).
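The keyed-constant idea can be tried without BigQuery. This is a rough sketch using Python's sqlite3 module, where `create_function` registers a SQL-callable function; the dictionary name `CONSTANTS` and the helper `lookup` are invented for illustration, and SQLite is only a stand-in for a BigQuery temp function.

```python
import sqlite3

# Constants keyed by name, all of the same type (strings).
CONSTANTS = {"timezone": "America/Edmonton", "something": "Value"}


def lookup(key):
    """Return the constant for a key, or None (SQL NULL) if the key is unknown."""
    return CONSTANTS.get(key)


conn = sqlite3.connect(":memory:")
# Register a one-argument SQL function named config().
conn.create_function("config", 1, lookup)

value = conn.execute("SELECT config('timezone')").fetchone()[0]
missing = conn.execute("SELECT config('nope')").fetchone()[0]
print(value, missing)
```

An unknown key comes back as NULL here, which matches a CASE expression with no ELSE branch.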


DB2 SQL function returning multiple values when I am expecting only one

I am trying to get the location of the last time an item was moved, using an SQL function with the code below. It is pretty basic: I'm just trying to grab the max date and time. If I run the SQL as a regular SELECT and hard-code an item number in ATPRIM, I get only one location. But if I create this function and pass it an item number, I get every occurrence in the history file instead of just the MAX, which would be the most recent. I have also tried SELECT DISTINCT, and that did not help.
ATOGST = Item Location
ATPRIM = Item
ATDATE = Date
ATTIME = Time
CREATE FUNCTION ERPLXU/F#QAT1(AATPRIM VARCHAR(10))
RETURNS CHAR(50)
LANGUAGE SQL
NOT DETERMINISTIC
BEGIN DECLARE F#QAT1 CHAR(50) ;
SET F#QAT1 = ' ' ;
SELECT ATOGST
INTO F#QAT1 FROM ERPLXF/QAT as t1
WHERE ATPRIM = AATPRIM
AND ATDATE = (SELECT MAX(ATDATE) FROM ERPLXF/QAT AS T2
WHERE T2.ATPRIM = AATPRIM)
AND ATTIME = (SELECT MAX(ATTIME) FROM ERPLXF/QAT AS T3
WHERE T3.ATPRIM = AATPRIM
AND T3.ATDATE = T1.ATDATE) ;
RETURN F#QAT1 ;
END
EDIT:
So what I am trying to do is get that location. I got it to work on my iSeries in STRSQL, but the problem is that we use a web application called Web Object Wizard (WoW), which lets us use SQL to make reports that are more user friendly. Below is what I was trying to get to work, but the subquery in the SELECT does not work in WoW, so I was trying to create a function instead, which we know works in other applications.
SELECT distinct t1.atprim, atdesc, dbtabl, dbdtin, dblife, dblpdp,
dbcost, dbbas, dbresv, dbyrdp, dbcurr,
(select atogst
from erplxf.qat as t2
where t1.atprim = t2.atprim and atdate = (select max(atdate) from
erplxf.qat as t3 where t2.atprim = t3.atprim) and attime = (select
max(attime) from erplxf.qat as t4 where t1.atprim = t4.atprim and
t1.atdate = t4.atdate)
) as #113_ToLoc
FROM erplxf.qat as t1 join erplxf.qdb on atassn = dbassn
where dbrcid = 'DB'
and dbcurr != 0
So instead of that subquery at the end of the select it would just be
, erplxu.f#qat1(atprim) as #113_ToLoc
Try this:
CREATE FUNCTION ERPLXU/F#QAT1(AATPRIM VARCHAR(10))
RETURNS CHAR(50)
LANGUAGE SQL
RETURN
SELECT ATOGST
FROM ERPLXF/QAT
WHERE ATPRIM = AATPRIM
ORDER BY ATDATE DESC, ATTIME DESC
FETCH FIRST 1 ROW ONLY;
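The ordering approach can be tried on a small in-memory table. This is a sketch using Python's sqlite3 module rather than DB2, so `LIMIT 1` stands in for `FETCH FIRST 1 ROW ONLY`; the table contents and the helper name `last_location` are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE qat (atprim TEXT, atogst TEXT, atdate INTEGER, attime INTEGER)"
)
conn.executemany(
    "INSERT INTO qat VALUES (?, ?, ?, ?)",
    [
        ("ITEM1", "DOCK", 20161201, 235900),  # older day, latest clock time overall
        ("ITEM1", "SHELF", 20161218, 81500),  # most recent move for ITEM1
        ("ITEM1", "YARD", 20161218, 70000),
        ("ITEM2", "OFFICE", 20161110, 120000),
    ],
)


def last_location(item):
    """Return the location of the most recent move for one item, or None."""
    row = conn.execute(
        "SELECT atogst FROM qat WHERE atprim = ? "
        "ORDER BY atdate DESC, attime DESC LIMIT 1",
        (item,),
    ).fetchone()
    return row[0] if row else None


print(last_location("ITEM1"))
print(last_location("ITEM2"))
```

Sorting by date and then time returns at most one row per call, without separate MAX subqueries on the date and the time.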

Is it possible to invoke BigQuery procedures in python client?

Scripting/procedures for BigQuery just came out in beta - is it possible to invoke procedures using the BigQuery python client?
I tried:
query = """CALL `myproject.dataset.procedure`()...."""
job = client.query(query, location="US",)
print(job.results())
print(job.ddl_operation_performed)
print(job._properties)
but that didn't give me the result set from the procedure. Is it possible to get the results?
Thank you!
Edited - stored procedure I am calling
CREATE OR REPLACE PROCEDURE `Project.Dataset.Table`(IN country STRING, IN accessDate DATE, IN accessId, OUT saleExists INT64)
BEGIN
IF EXISTS (SELECT 1 FROM dataset.table where purchaseCountry = country and purchaseDate=accessDate and customerId = accessId)
THEN
SET saleExists = (SELECT 1);
ELSE
INSERT Dataset.MissingSalesTable (purchaseCountry, purchaseDate, customerId) VALUES (country, accessDate, accessId);
SET saleExists = (SELECT 0);
END IF;
END;
If you follow the CALL command with a SELECT statement, you can get the procedure's output value as a result set. For example, I created the following stored procedure:
BEGIN
  -- Build an array of the top 100 names from the year 2017.
  DECLARE top_names ARRAY<STRING>;
  SET top_names = (
    SELECT ARRAY_AGG(name ORDER BY number DESC LIMIT 100)
    FROM `bigquery-public-data.usa_names.usa_1910_current`
    WHERE year = 2017
  );
  -- Which names appear as words in Shakespeare's plays?
  SET top_shakespeare_names = (
    SELECT ARRAY_AGG(name)
    FROM UNNEST(top_names) AS name
    WHERE name IN (
      SELECT word
      FROM `bigquery-public-data.samples.shakespeare`
    )
  );
END
Running the following query returns the procedure's output as the top-level result set.
DECLARE top_shakespeare_names ARRAY<STRING> DEFAULT NULL;
CALL `my-project.test_dataset.top_names`(top_shakespeare_names);
SELECT top_shakespeare_names;
In Python:
from google.cloud import bigquery
client = bigquery.Client()
query_string = """
DECLARE top_shakespeare_names ARRAY<STRING> DEFAULT NULL;
CALL `swast-scratch.test_dataset.top_names`(top_shakespeare_names);
SELECT top_shakespeare_names;
"""
query_job = client.query(query_string)
rows = list(query_job.result())
print(rows)
Related: If you have SELECT statements within a stored procedure, you can walk the job to fetch the results, even if the SELECT statement isn't the last statement in the procedure.
from google.cloud import bigquery
# Construct a BigQuery client object.
client = bigquery.Client()
# Run a SQL script.
sql_script = """
-- Declare a variable to hold names as an array.
DECLARE top_names ARRAY<STRING>;
-- Build an array of the top 100 names from the year 2000.
SET top_names = (
SELECT ARRAY_AGG(name ORDER BY number DESC LIMIT 100)
FROM `bigquery-public-data.usa_names.usa_1910_2013`
WHERE year = 2000
);
-- Which names appear as words in Shakespeare's plays?
SELECT
name AS shakespeare_name
FROM UNNEST(top_names) AS name
WHERE name IN (
SELECT word
FROM `bigquery-public-data.samples.shakespeare`
);
"""
parent_job = client.query(sql_script)
# Wait for the whole script to finish.
rows_iterable = parent_job.result()
print("Script created {} child jobs.".format(parent_job.num_child_jobs))
# Fetch result rows for the final sub-job in the script.
rows = list(rows_iterable)
print("{} of the top 100 names from year 2000 also appear in Shakespeare's works.".format(len(rows)))
# Fetch jobs created by the SQL script.
child_jobs_iterable = client.list_jobs(parent_job=parent_job)
for child_job in child_jobs_iterable:
    child_rows = list(child_job.result())
    print("Child job with ID {} produced {} rows.".format(child_job.job_id, len(child_rows)))
It works if you have a SELECT inside your procedure. Given this procedure:
create or replace procedure dataset.proc_output() BEGIN
SELECT t FROM UNNEST(['1','2','3']) t;
END;
Code:
from google.cloud import bigquery
client = bigquery.Client()
query = """CALL dataset.proc_output()"""
job = client.query(query, location="US")
for result in job.result():
    print(result)
will output:
Row(('1',), {'t': 0})
Row(('2',), {'t': 0})
Row(('3',), {'t': 0})
However, if there are multiple SELECT statements inside a procedure, only the last result set can be fetched this way.
Update
See below example:
CREATE OR REPLACE PROCEDURE zyun.exists(IN country STRING, IN accessDate DATE, OUT saleExists INT64)
BEGIN
SET saleExists = (WITH data AS (SELECT "US" purchaseCountry, DATE "2019-1-1" purchaseDate)
SELECT Count(*) FROM data where purchaseCountry = country and purchaseDate=accessDate);
IF saleExists = 0 THEN
INSERT Dataset.MissingSalesTable (purchaseCountry, purchaseDate, customerId) VALUES (country, accessDate, accessId);
END IF;
END;
BEGIN
DECLARE saleExists INT64;
CALL zyun.exists("US", DATE "2019-2-1", saleExists);
SELECT saleExists;
END
BTW, your example is much better served with a single MERGE statement instead of a script.

Creating Dynamic Dates as Variable (Column Names) in SQL

First, I have read about similar posts and have read the comments that this isn't an ideal solution and I get it but the boss (ie client) wants it this way. The parameters are as follows (for various reasons too bizarre to go into but trust me):
1. SQL Server Mgmt Studio 2016
2. NO parameters or pass throughs or temp tables. All has to be within contained code.
So here we go:
I need to create column headings that reflect dates:
1. Current date
2. Most recent quarter end prior to current date
3. Most recent quarter end prior to #2
4. Most recent quarter end prior to #3
5. Most recent quarter end prior to #4
6. Most recent quarter end prior to #5
So if using today's date, my column names would be as follows
12/18/2016 9/30/2016 6/30/2016 3/31/2016 12/31/2015 9/30/2015
I can easily do it in SAS but can't in SQL given the requirements stated above.
Help with some sample code, please.
Thank you
Paula
Seems like a long way to go for something which really belongs in the presentation layer. That said, consider the following:
Let's assume you maintain a naming convention for your calculated fields, for example [CurrentDay], [QtrMinus1], [QtrMinus2], [QtrMinus3], [QtrMinus4],[QtrMinus5]. Then we can wrap your complicated query in some dynamic SQL.
Just as an illustration, let's assume your current query returns [CustName] along with those six calculated fields.
After the "wrap", the same values come back under date column headings instead.
The code, since you did NOT exclude dynamic SQL:
Declare @S varchar(max)='
Select [CustName]
,['+convert(varchar(10),GetDate(),101)+'] = [CurrentDay]
,['+Convert(varchar(10),EOMonth(DateFromParts(Year(DateAdd(QQ,-1,GetDate())),DatePart(QQ,DateAdd(QQ,-1,GetDate()))*3,1)),101)+'] = [QtrMinus1]
,['+Convert(varchar(10),EOMonth(DateFromParts(Year(DateAdd(QQ,-2,GetDate())),DatePart(QQ,DateAdd(QQ,-2,GetDate()))*3,1)),101)+'] = [QtrMinus2]
,['+Convert(varchar(10),EOMonth(DateFromParts(Year(DateAdd(QQ,-3,GetDate())),DatePart(QQ,DateAdd(QQ,-3,GetDate()))*3,1)),101)+'] = [QtrMinus3]
,['+Convert(varchar(10),EOMonth(DateFromParts(Year(DateAdd(QQ,-4,GetDate())),DatePart(QQ,DateAdd(QQ,-4,GetDate()))*3,1)),101)+'] = [QtrMinus4]
,['+Convert(varchar(10),EOMonth(DateFromParts(Year(DateAdd(QQ,-5,GetDate())),DatePart(QQ,DateAdd(QQ,-5,GetDate()))*3,1)),101)+'] = [QtrMinus5]
From (
-- Your Complicated Query --
Select * from YourTable
) A
'
Exec(@S)
If it helps the visualization, the generated SQL is as follows:
Select [CustName]
,[12/18/2016] = [CurrentDay]
,[09/30/2016] = [QtrMinus1]
,[06/30/2016] = [QtrMinus2]
,[03/31/2016] = [QtrMinus3]
,[12/31/2015] = [QtrMinus4]
,[09/30/2015] = [QtrMinus5]
From (
-- Your Complicated Query --
Select * from YourTable
) A
Here is one way using dynamic query
DECLARE @prior_quarters INT = 5,
@int INT = 1,
@col_list VARCHAR(max) = Quotename(CONVERT(VARCHAR(20), Getdate(), 101))
WHILE @int <= @prior_quarters
BEGIN
-- Step back to the previous quarter end, then 3 more months for each additional quarter
SELECT @col_list += Concat(',', Quotename(CONVERT(VARCHAR(20), Eomonth(Getdate(), -( ( ( Month(Getdate()) - 1 ) % 3 ) + 1 ) - 3 * ( @int - 1 )), 101)))
SET @int += 1
END
--SELECT @col_list -- for debugging
EXEC ('select '+@col_list+' from yourtable')
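The quarter-end arithmetic is easy to get wrong, so it can help to check the expected headings independently. A minimal sketch in Python, with the hypothetical helper name `quarter_end_headings`, assuming the MM/DD/YYYY formatting that CONVERT style 101 produces:

```python
from datetime import date, timedelta
from typing import List


def quarter_end_headings(today: date, prior_quarters: int = 5) -> List[str]:
    """Return today's date followed by the preceding quarter-end dates as MM/DD/YYYY."""
    headings = [today.strftime("%m/%d/%Y")]
    # First day of the quarter that contains `today`.
    quarter_start = date(today.year, 3 * ((today.month - 1) // 3) + 1, 1)
    for _ in range(prior_quarters):
        # The day before a quarter starts is the previous quarter's end.
        quarter_end = quarter_start - timedelta(days=1)
        headings.append(quarter_end.strftime("%m/%d/%Y"))
        # Quarter ends fall in months 3, 6, 9, 12, so month - 2 is that quarter's first month.
        quarter_start = date(quarter_end.year, quarter_end.month - 2, 1)
    return headings


print(quarter_end_headings(date(2016, 12, 18)))
```

For 12/18/2016 this yields 12/18/2016, 09/30/2016, 06/30/2016, 03/31/2016, 12/31/2015 and 09/30/2015, the same headings as in the generated SQL shown earlier.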

Error creating function in DB2 with params

I have a problem with a function in DB2.
The function looks up a record and returns a number depending on whether it is the first or second record entered by a user.
The query within the function is this:
SELECT
CASE
WHEN NUM IN (1,2) THEN 5
ELSE 2.58
END AS VAL
FROM (
select ROW_NUMBER() OVER() AS NUM ,s.POLLIFE
from LQD943DTA.CAQRTRML8 c
INNER JOIN LSMODXTA.SCSRET s ON c.MCCNTR = s.POLLIFE
WHERE s.NOEMP = ( SELECT NOEMP FROM LSMODDTA.LOLLM04 WHERE POLLIFE = '0010111003')
) AS T WHERE POLLIFE = '0010111003'
And it works perfectly.
I create the function with this code:
CREATE FUNCTION LIBWEB.BNOWPAPOL(POL CHAR)
RETURNS DECIMAL(7,2)
LANGUAGE SQL
NOT DETERMINISTIC
READS SQL DATA
RETURN (
SELECT
CASE
WHEN NUM IN (1,2) THEN 5
ELSE 2.58
END AS VAL
FROM (
select ROW_NUMBER() OVER() AS NUM ,s.POLLIFE
from LQD943DTA.CAQRTRML8 c
INNER JOIN LSMODXTA.SCSRET s ON c.MCCNTR = s.POLLIFE
WHERE s.NOEMP = ( SELECT NOEMP FROM LSMODDTA.LOLLM04 WHERE POLLIFE = POL)
) AS T WHERE POLLIFE = POL
)
The command executes properly:
WARNING: 17:55:40 [CREATE - 0 row(s), 0.439 secs] Command processed.
No rows were affected
When I try to execute the query, I get an error:
SELECT LIBWEB.BNOWPAPOL('0010111003') FROM DATAS.DUMMY -- dummy has only one row
I get
[Error Code: -204, SQL State: 42704] [SQL0204] BNOWPAPOL in LIBWEB
type *N not found.
I noticed that when I remove the parameter, the function works fine!
With this code
CREATE FUNCTION LIBWEB.BNOWPAPOL()
RETURNS DECIMAL(7,2)
LANGUAGE SQL
NOT DETERMINISTIC
READS SQL DATA
RETURN (
SELECT
CASE
WHEN NUM IN (1,2) THEN 5
ELSE 2.58
END AS VAL
FROM (
select ROW_NUMBER() OVER() AS NUM ,s.POLLIFE
from LQD943DTA.CAQRTRML8 c
INNER JOIN LSMODXTA.SCSRET s ON c.MCCNTR = s.POLLIFE
WHERE s.NOEMP = ( SELECT NOEMP FROM LSMODDTA.LOLLM04 WHERE POLLIFE = '0010111003')
) AS T WHERE POLLIFE = '0010111003'
)
Why??
This statement:
SELECT LIBWEB.BNOWPAPOL('0010111003') FROM DATAS.DUMMY
causes this error:
[Error Code: -204, SQL State: 42704] [SQL0204] BNOWPAPOL in LIBWEB
type *N not found.
The parm value passed into the BNOWPAPOL() function is supplied as a quoted string with no definition (no CAST). The SELECT statement assumes that it's a VARCHAR value since different length strings might be given at any time and passes it to the server as a VARCHAR.
The original function definition says:
CREATE FUNCTION LIBWEB.BNOWPAPOL(POL CHAR)
The function signature is generated for a single-byte CHAR. (Function definitions can be overloaded to handle different inputs, and signatures are used to differentiate between function versions.)
Since a VARCHAR was passed from the client and only a CHAR function version was found by the server, the returned error fits. Changing the function definition or CASTing to a matching type can solve this kind of problem. (Note that a CHAR(1) parm could only correctly handle a single-character input if a value is CAST.)

Effective way how to handle application settings in PL/pgSQL functions

Consider the following situation: I have a PL/pgSQL function that checks whether a given auditor meets the prerequisites for the QS Auditor function. The thresholds for these prerequisites are defined in a separate table, quasar_settings. Every time the function is called, it executes a SELECT to retrieve them. This is quite inefficient, because the SELECT runs for every row. Is there a more effective solution (global variable, caching, etc.)?
Table quasar_settings has only one row.
Using PostgreSQL 9.3
PL/pgSQL function
CREATE OR REPLACE FUNCTION qs_auditor_training_auditing(auditor quasar_auditor) RETURNS boolean AS $$
DECLARE
settings quasar_settings%ROWTYPE;
BEGIN
SELECT s INTO settings
FROM quasar_settings s LIMIT 1;
RETURN auditor.nb1023_procedures_hours >= settings.qs_auditor_nb1023_procedures AND
-- MD Training
auditor.mdd_hours + auditor.ivd_hours >= settings.qs_auditor_md_training AND
-- ISO 9001 Training
(
auditor.is_aproved_for_iso13485 OR
(auditor.is_aproved_for_iso9001 AND auditor.iso13485_hours >= settings.qs_auditor_iso13485_training) OR
(auditor.iso13485_hours + auditor.iso9001_hours >= settings.qs_auditor_class_room_training)
);
END;
$$ LANGUAGE plpgsql;
Example of usage:
SELECT auditor.id, qs_auditor_training_auditing(auditor) FROM quasar_auditor auditor;
Do a cross join to the settings table instead of calling the function for every row:
select
a.id,
a.nb1023_procedures_hours >= s.qs_auditor_nb1023_procedures and
-- md training
a.mdd_hours + a.ivd_hours >= s.qs_auditor_md_training and
-- iso 9001 training
(
a.is_aproved_for_iso13485 or
(
a.is_aproved_for_iso9001 and
a.iso13485_hours >= s.qs_auditor_iso13485_training
) or
(a.iso13485_hours + a.iso9001_hours >= s.qs_auditor_class_room_training)
)
from
quasar_auditor a
cross join
quasar_settings s
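The effect of the cross join can be tried on a toy schema. The sketch below uses Python's sqlite3 module instead of PostgreSQL, with a reduced set of columns and invented rows; only the join shape is the point.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE quasar_settings (qs_auditor_md_training INTEGER);
    CREATE TABLE quasar_auditor (id INTEGER, mdd_hours INTEGER, ivd_hours INTEGER);
    INSERT INTO quasar_settings VALUES (40);
    INSERT INTO quasar_auditor VALUES (1, 30, 15), (2, 10, 5);
    """
)

# The single settings row is attached to every auditor row by the cross join,
# so the thresholds are read once per query rather than once per function call.
rows = conn.execute(
    """
    SELECT a.id,
           a.mdd_hours + a.ivd_hours >= s.qs_auditor_md_training AS qualified
    FROM quasar_auditor a
    CROSS JOIN quasar_settings s
    ORDER BY a.id
    """
).fetchall()

print(rows)
```

Because the settings table has exactly one row, the cross join adds its columns to each auditor row without changing the row count. SQLite reports the boolean as 1 or 0.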