pass an ARRAY<STRUCT> to a js UDF in bigquery sql


I'm trying to get data from some tables and pass it to a JavaScript UDF in BigQuery.
I wrote the following code, but I can't work out the correct syntax to store the result of my SELECT in the required structure, or how to then pass it to my UDF.
DECLARE arg1 ARRAY<STRING>;
DECLARE arg2 ARRAY<STRUCT <col1 STRING, col2 STRING> >;
DECLARE res1 ARRAY<STRING>;
SET arg1 = ARRAY<STRING>["Adams", "Joseph", "Davis", "Mary", "Jesus"] ;
CREATE TEMP FUNCTION myfunction(arg1 ARRAY<STRING> , arg2 ARRAY<STRUCT<col1 STRING, col2 STRING> > )
RETURNS ARRAY<STRING>
LANGUAGE js AS '''
return ["a", "b", "c"]
''';
SET ARG2 = (SELECT AS STRUCT(
WITH TBL1 AS
(SELECT 'Adams' as LastName, 50 as SchoolID UNION ALL
SELECT 'Buchanan', 52 UNION ALL
SELECT 'Coolidge', 52 UNION ALL
SELECT 'Davis', 51 UNION ALL
SELECT 'Eisenhower', 77)
SELECT LastName as col1, SchoolID as col2 FROM TBL1));
SET res1 = select res FROM UNNEST(myfunction( arg1, arg2 )) res;

There are a few syntax errors in the code you shared. Based on its procedural logic, I've rewritten it in a more declarative style and corrected them.
I have kept arg1 as a variable, although its values could also be included in the TBL1 CTE if desired.
arg2 is now created by selecting the values from TBL1 as a STRUCT, with the columns renamed using as col1 and as col2.
I've cast SchoolID to a string, since your UDF accepts col2 as a string.
I've added another field, id, for demo purposes: I group the data by this id column to build the array of structs that your UDF requires as arg2.
Finally, I select id and call your UDF with the parameters in the correct format. array_agg is used to create the array of structs here. The result is in a column named res1.
DECLARE arg1 ARRAY<STRING>;
SET arg1 = ARRAY<STRING>["Adams", "Joseph", "Davis", "Mary", "Jesus"] ;
CREATE TEMP FUNCTION myfunction(arg1 ARRAY<STRING> , arg2 ARRAY<STRUCT<col1 STRING, col2 STRING> > )
RETURNS ARRAY<STRING>
LANGUAGE js AS '''
return ["a", "b", "c"]
''';
WITH TBL1 AS (
SELECT 'Adams' as LastName, 50 as SchoolID UNION ALL
SELECT 'Buchanan', 52 UNION ALL
SELECT 'Coolidge', 52 UNION ALL
SELECT 'Davis', 51 UNION ALL
SELECT 'Eisenhower', 77
),
tbl2 as (
SELECT
1 as id,
STRUCT(
LastName as col1,
CAST(SchoolID as STRING) as col2
) as arg2
FROM TBL1
)
select id, myfunction(arg1, array_agg( arg2)) as res1 from tbl2
group by id
Let me know if this works for you.

Related

Escape characters in BQ external tables

I have an external table which is populated from a csv file.
In the csv file there is a field which contains an escape character followed by a comma.
For example, "a\,b" should be read as just "a,b". When I load the csv file it is split into 2 columns, "a" and "b", but it should read "a,b" in one column. I've tried using the quote option (shown below) without any luck.
CREATE OR REPLACE EXTERNAL TABLE TEST
(A STRING,
B STRING)
OPTIONS (
format = 'CSV',
quote = '\'
)
Could someone help ?

You may try the workaround below.
CREATE OR REPLACE EXTERNAL TABLE `your-project.your-dataset.so_test` (
raw STRING
) OPTIONS (
uris=['gs://your-bucket/so/test2.csv'],
format = 'CSV',
field_delimiter = CHR(1)
);
CREATE TEMP TABLE sample_table AS
SELECT csv[SAFE_OFFSET(0)] col1,
REPLACE(csv[SAFE_OFFSET(1)], '|', ',') col2,
csv[SAFE_OFFSET(2)] col3,
FROM `your-project.your-dataset.so_test`,
UNNEST ([STRUCT(SPLIT(REPLACE(raw, '\\,', '|')) AS csv)]);
SELECT * FROM sample_table;
Sample csv file
gs://your-bucket/so/test2.csv
blah,a\,b,blah
Or, using PIVOT query
CREATE TEMP TABLE sample_table (
col1 STRING, col2 STRING, col3 STRING,
) AS
SELECT * REPLACE(REPLACE(col_1, '|', ',') AS col_1) FROM (
SELECT col, offset
FROM `your-project.your-dataset.so_test`,
UNNEST (SPLIT(REPLACE(raw, '\\,', '|'))) col WITH offset
) PIVOT (ANY_VALUE(col) col FOR offset IN (0, 1, 2));
SELECT * FROM sample_table;
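The replace, split, restore idea used in both queries above can be tried outside BigQuery. Here is a minimal Python sketch of the same three steps; the function name and the choice of '|' as placeholder are assumptions for illustration, and the placeholder must be a character that never occurs in the data. Unlike the first query, this sketch restores the comma in every field, not only the second one.

```python
def split_escaped_csv(raw, delimiter=",", escape="\\", placeholder="|"):
    """Split one CSV line, treating escape + delimiter as a literal delimiter.

    Hypothetical helper: `placeholder` is assumed never to appear in the data.
    """
    # Step 1: hide escaped delimiters, like REPLACE(raw, '\\,', '|').
    hidden = raw.replace(escape + delimiter, placeholder)
    # Step 2: split on the remaining, unescaped delimiters, like SPLIT(...).
    fields = hidden.split(delimiter)
    # Step 3: turn the placeholder back into a literal delimiter in each field.
    return [field.replace(placeholder, delimiter) for field in fields]


# The sample line from the answer above.
print(split_escaped_csv("blah,a\\,b,blah"))  # ['blah', 'a,b', 'blah']
```

If the data may contain '|', a control character such as CHR(1) would be a safer placeholder, at the cost of readability.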

Get column names from a table matching a specific value

I have a table named TableA which has many rows. Sample structure given below
CID int, Col1 int, Col2 int, Col3 int, Col4 int
When I run a query (say, with CID=5) I get only one row, with Col1, Col2, etc. having different values.
I want to get the numeric part of each column name where the row value is -1.
For more clarity:
CID  Col1  Col2  Col3  Col4
5    0     -1    0     -1
In this example I should get the result:
MyRes
2
4
Is there any way to achieve this?

You can use UNPIVOT, then filter the records on the result value, which is -1 in your case.
; with cte as (
select CID, result, col from
(
select * from TableA
) as t
unpivot
(
result for col in ( Col1, col2, col3, col4 )
) as p
)
select CId, col from cte where result = -1
That leaves a bit of homework for you: extracting the number part from the column name.
If you run into any problem with that part, please comment and I'll add it, but give it a try first.
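For anyone who wants to try the unpivot idea without a SQL Server instance, here is a rough sketch using Python's built-in sqlite3 module. SQLite has no UNPIVOT, so the sketch emulates it with UNION ALL, and it takes the number part with substr, assuming every column name has the form ColN. The function name and the extra row with CID 6 are made up for the demo.

```python
import sqlite3


def columns_with_value(conn, cid, wanted=-1):
    """Return the numeric suffix of each ColN column of TableA equal to `wanted`."""
    sql = """
        WITH unpivoted(CID, col, result) AS (
            -- Emulated UNPIVOT: one row per (CID, column name, value).
            SELECT CID, 'Col1', Col1 FROM TableA UNION ALL
            SELECT CID, 'Col2', Col2 FROM TableA UNION ALL
            SELECT CID, 'Col3', Col3 FROM TableA UNION ALL
            SELECT CID, 'Col4', Col4 FROM TableA
        )
        -- substr(col, 4) drops the 'Col' prefix and keeps the digits.
        SELECT CAST(substr(col, 4) AS INTEGER) AS MyRes
        FROM unpivoted
        WHERE CID = ? AND result = ?
        ORDER BY MyRes
    """
    return [row[0] for row in conn.execute(sql, (cid, wanted))]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE TableA (CID INT, Col1 INT, Col2 INT, Col3 INT, Col4 INT)")
# First row is the example from the question; the second is invented for the demo.
conn.execute("INSERT INTO TableA VALUES (5, 0, -1, 0, -1), (6, -1, 0, 0, 0)")
print(columns_with_value(conn, 5))  # [2, 4]
```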

I got the answer
declare @ID int
set @ID = 1
select substring( T1.ColName, 2, LEN(T1.ColName)) as MyRes from (
select
Col.value('local-name(.)', 'varchar(10)') as ColName
from (select *
from TableA
for xml path(''), type) as T(XMLCol)
cross apply
T.XMLCol.nodes('*') as n(Col)
where Col.value('.', 'varchar(10)') = '-1'
)T1

Split string into words in columns

I am looking to split a string into words in columns in SQL Server 2014. I have found a few solutions but all of them are giving the results in rows. How can I break the below string into columns?
"First Second Third Fourth Fifth"

You can use XML and grab the elements by their position:
DECLARE @YourString VARCHAR(100)='First Second Third Fourth Fifth';
WITH StringAsXML AS
(
SELECT CAST('<x>' + REPLACE((SELECT @YourString AS [*] FOR XML PATH('')),' ','</x><x>') + '</x>' AS XML) TheXml
)
SELECT TheXml.value('x[1]/text()[1]','nvarchar(max)') AS FirstElement
,TheXml.value('x[2]/text()[1]','nvarchar(max)') AS SecondElement
,TheXml.value('x[3]/text()[1]','nvarchar(max)') AS ThirdElement
,TheXml.value('x[4]/text()[1]','nvarchar(max)') AS FourthElement
,TheXml.value('x[5]/text()[1]','nvarchar(max)') AS FifthElement
FROM StringAsXML;
Remark
You can use PIVOT, conditional aggregation, FROM(VALUES()) or the approach above, but any of these needs a known set of columns (a known count of elements, or at least a maximum count of elements).
If you cannot rely on such knowledge, you can use dynamically created SQL. This means building one of the working statements as a string and executing it dynamically with EXEC.
UPDATE: A dynamic approach
This approach will deal with a variable number of elements
DECLARE @YourString VARCHAR(100)='First Second Third Fourth Fifth';
DECLARE @Delimiter CHAR(1)=' ';
DECLARE @countElements INT = LEN(@YourString)-LEN(REPLACE(@YourString,@Delimiter,''));
DECLARE @Statement VARCHAR(MAX)=
'WITH StringAsXML AS
(
SELECT CAST(''<x>'' + REPLACE((SELECT ''ReplaceYourString'' AS [*] FOR XML PATH('''')),'' '',''</x><x>'') + ''</x>'' AS XML) TheXml
)
SELECT ReplaceColumnList
FROM StringAsXML;';
DECLARE @columnList VARCHAR(MAX);
WITH cte AS
(
SELECT 1 AS ElementCounter
,CAST('TheXml.value(''x[1]/text()[1]'',''nvarchar(max)'') AS Element_01' AS VARCHAR(MAX)) AS ColStatement
UNION ALL
SELECT cte.ElementCounter+1
,cte.ColStatement + CAST(',TheXml.value(''x[' + CAST(cte.ElementCounter+1 AS VARCHAR(10)) + ']/text()[1]'',''nvarchar(max)'') AS Element_' + REPLACE(STR(cte.ElementCounter + 1,2),' ','0') AS VARCHAR(MAX))
FROM cte
WHERE cte.ElementCounter <= @countElements
)
SELECT @columnList=(SELECT TOP 1 cte.ColStatement FROM cte ORDER BY cte.ElementCounter DESC)
--replace the string you want to split
SET @Statement = REPLACE(@Statement,'ReplaceYourString',@YourString);
--replace the columnList
SET @Statement = REPLACE(@Statement,'ReplaceColumnList',@columnList);
EXEC(@Statement);
UPDATE 2: The smallest fully inlined and position-safe splitter I know of
Try this out:
DECLARE @inp VARCHAR(200) = 'First Second Third Fourth Fifth';
DECLARE @dlmt VARCHAR(100)=' ';
;WITH
a AS (SELECT n=0, i=-1, j=0 UNION ALL SELECT n+1, j, CHARINDEX(@dlmt, @inp, j+1) FROM a WHERE j > i),
b AS (SELECT n, SUBSTRING(@inp, i+1, IIF(j>0, j, LEN(@inp)+1)-i-1) s FROM a WHERE i >= 0)
SELECT * FROM b;
And just to get it complete: The above tiny splitter combined with PIVOT:
;WITH
a AS (SELECT n=0, i=-1, j=0 UNION ALL SELECT n+1, j, CHARINDEX(@dlmt, @inp, j+1) FROM a WHERE j > i),
b AS (SELECT n, SUBSTRING(@inp, i+1, IIF(j>0, j, LEN(@inp)+1)-i-1) s FROM a WHERE i >= 0)
SELECT p.*
FROM b
PIVOT(MAX(s) FOR n IN([1],[2],[3],[4],[5])) p;

You can use a SQL split-string function to separate the string into words and then, using the position of each word in the original string, use CASE expressions (as in a PIVOT query) to display the words as columns.
Here is a sample:
declare @string varchar(max) = 'First Second Third Fourth Fifth'
;with cte as (
select
case when id = 1 then val end as Col1,
case when id = 2 then val end as Col2,
case when id = 3 then val end as Col3,
case when id = 4 then val end as Col4,
case when id = 5 then val end as Col5
from dbo.split( @string,' ')
)
select
max(Col1) as Col1,
max(Col2) as Col2,
max(Col3) as Col3,
max(Col4) as Col4,
max(Col5) as Col5
from cte
If you cannot create a UDF, you can use the logic in your SQL code as follows
Please note that if you have your data in a database table column, you can simply replace column content in the first SQL CTE expression
declare @string varchar(max) = 'First Second Third Fourth Fifth'
;with cte1 as (
select convert(xml, N'<root><r>' + replace(@string,' ','</r><r>') + '</r></root>') as rawdata
), cte2 as (
select
ROW_NUMBER() over (order by getdate()) as id,
r.value('.','varchar(max)') as val
from cte1
cross apply rawdata.nodes('//root/r') as records(r)
)
select
max(Col1) as Col1,
max(Col2) as Col2,
max(Col3) as Col3,
max(Col4) as Col4,
max(Col5) as Col5
from (
select
case when id = 1 then val end as Col1,
case when id = 2 then val end as Col2,
case when id = 3 then val end as Col3,
case when id = 4 then val end as Col4,
case when id = 5 then val end as Col5
from cte2
) t
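The split-then-CASE pattern above is not specific to SQL Server. As a rough illustration, here is the same conditional aggregation run against SQLite from Python, with a recursive CTE standing in for the split function. The names split, words_to_columns and SPLIT_AND_PIVOT are made up for this sketch, and it assumes single spaces between words and at most five words.

```python
import sqlite3

SPLIT_AND_PIVOT = """
WITH RECURSIVE split(id, val, rest) AS (
    -- Seed row: nothing split yet; a trailing space terminates the last word.
    SELECT 0, '', :text || ' '
    UNION ALL
    -- Each step peels the next word off the front of `rest`.
    SELECT id + 1,
           substr(rest, 1, instr(rest, ' ') - 1),
           substr(rest, instr(rest, ' ') + 1)
    FROM split
    WHERE rest <> ''
)
-- Conditional aggregation: one CASE per target column, collapsed with MAX.
SELECT MAX(CASE WHEN id = 1 THEN val END) AS Col1,
       MAX(CASE WHEN id = 2 THEN val END) AS Col2,
       MAX(CASE WHEN id = 3 THEN val END) AS Col3,
       MAX(CASE WHEN id = 4 THEN val END) AS Col4,
       MAX(CASE WHEN id = 5 THEN val END) AS Col5
FROM split
WHERE id > 0
"""


def words_to_columns(text):
    """Return the first five space-separated words of `text` as one row."""
    conn = sqlite3.connect(":memory:")
    try:
        return conn.execute(SPLIT_AND_PIVOT, {"text": text}).fetchone()
    finally:
        conn.close()


print(words_to_columns("First Second Third Fourth Fifth"))
```

Columns beyond the number of words come back as NULL (None in Python), which is the same behaviour as the MAX(CASE ...) queries above.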

You can use the PARSENAME function as follows:
create table tab ( str varchar(100));
insert into tab values('First Second Third Fourth Fifth');
with t as
(
select replace(str,' ','.') as str
from tab
)
Select substring(str,1,charindex('.',str)-1) as col_first,
parsename(substring(str,charindex('.',str)+1,len(str)),4) as col_second,
parsename(substring(str,charindex('.',str)+1,len(str)),3) as col_third,
parsename(substring(str,charindex('.',str)+1,len(str)),2) as col_fourth,
parsename(substring(str,charindex('.',str)+1,len(str)),1) as col_fifth
from t;
col_first col_second col_third col_fourth col_fifth
--------- ---------- --------- ---------- ---------
First Second Third Fourth Fifth
P.S. First, the main string needs to be split into parts containing at most three dot (.) characters, otherwise the function doesn't work. This is a restriction of PARSENAME.

Select rows using in with comma-separated string parameter

I'm converting a stored procedure from MySql to SQL Server. The procedure has one input parameter nvarchar/varchar which is a comma-separated string, e.g.
'1,2,5,456,454,343,3464'
I need to write a query that will retrieve the relevant rows, in MySql I'm using FIND_IN_SET and I wonder what the equivalent is in SQL Server.
I also need to order the ids as in the string.
The original query is:
SELECT *
FROM table_name t
WHERE FIND_IN_SET(id,p_ids)
ORDER BY FIND_IN_SET(id,p_ids);

The equivalent is LIKE for the WHERE clause and CHARINDEX() for the ORDER BY:
select *
from table_name t
where ','+p_ids+',' like '%,'+cast(id as varchar(255))+',%'
order by charindex(',' + cast(id as varchar(255)) + ',', ',' + p_ids + ',');
Well, you could use charindex() for both, but the like will work in most databases.
Note that I've added delimiters to the beginning and end of the string, so 464 will not accidentally match 3464.
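To see the delimiter trick in action, here is a small sketch that runs the same pattern against SQLite from Python. It is an adaptation, not the SQL Server query itself: SQLite concatenates with || and uses instr() where SQL Server uses + and charindex(). The table contents and the parameter value are invented for the demo.

```python
import sqlite3

# Same idea as the LIKE / CHARINDEX query, in SQLite syntax.
FIND_IN_SET_QUERY = """
SELECT id
FROM table_name
WHERE ',' || :p_ids || ',' LIKE '%,' || id || ',%'
ORDER BY instr(',' || :p_ids || ',', ',' || id || ',')
"""


def select_in_list_order(conn, p_ids):
    """Return the ids present in `p_ids`, in the order they appear in the string."""
    return [row[0] for row in conn.execute(FIND_IN_SET_QUERY, {"p_ids": p_ids})]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table_name (id INTEGER PRIMARY KEY)")
conn.executemany(
    "INSERT INTO table_name VALUES (?)", [(1,), (2,), (5,), (464,), (3464,)]
)
# 464 is in the table but not in the list, so the delimiters must keep it
# from matching inside 3464.
print(select_in_list_order(conn, "5,3464,1"))  # [5, 3464, 1]
```

Because the predicate wraps the id column in an expression, it generally cannot use an index on id, so on large tables a split-and-join approach may perform better.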

You would need to write a FIND_IN_SET function, as it does not exist in SQL Server. The closest mechanism I can think of for converting a delimited string into a joinable object is to create a table-valued function and use its result in a standard IN clause. It would need to be similar to:
DECLARE @MyParam NVARCHAR(3000)
SET @MyParam='1,2,5,456,454,343,3464'
SELECT
*
FROM
MyTable
WHERE
MyTableID IN (SELECT ID FROM dbo.MySplitDelimitedString(@MyParam,','))
And you would need to create a MySplitDelimitedString type table-valued function that would split a string and return a TABLE (ID INT) object.

A set-based solution that splits the ids into ints and joins them with the base table, which can make use of an index on the base table's id. I assumed the id is an int; otherwise just remove the cast.
declare @ids nvarchar(100) = N'1,2,5,456,454,343,3464';
with nums as ( -- Generate numbers
select top (len(@ids)) row_number() over (order by (select 0)) n
from sys.messages
)
, pos1 as ( -- Get comma positions
select c.ci
from nums n
cross apply (select charindex(',', @ids, n.n) as ci) c
group by c.ci
)
, pos2 as ( -- Distinct positions plus start and end
select ci
from pos1
union select 0
union select len(@ids) + 1
)
, pos3 as ( -- add row number for join
select ci, row_number() over (order by ci) as r
from pos2
)
, ids as ( -- id's and row id for ordering
select cast(substring(@ids, p1.ci + 1, p2.ci - p1.ci - 1) as int) id, row_number() over (order by p1.ci) r
from pos3 p1
inner join pos3 p2 on p2.r = p1.r + 1
)
select *
from ids i
inner join table_name t on t.id = i.id
order by i.r;

You can also use a regular expression to extract the input values from the comma-separated string (note that this is Oracle syntax, using REGEXP_SUBSTR and CONNECT BY):
select * from table_name where id in (
select regexp_substr(p_ids,'[^,]+', 1, level) from dual
connect by regexp_substr(p_ids, '[^,]+', 1, level) is not null );

SQL How do I find values from a list that are not in a table

I have a table with values in a field called 'code'.
ABC
DFG
CDF
How would I select all codes that are not in the table from a list I have?
Eg:
SELECT * from [my list] where table1.code not in [my list]
the list is not in a table.
The list would be something like "ABC","BBB","TTT" (As strings)

Try this:
SELECT code
FROM Table1
WHERE code NOT IN ('ABC','CCC','DEF') --values from your list
It will return:
DFG
CDF
If the list is in another table, try this:
SELECT code
FROM Table1
WHERE code NOT IN (SELECT code FROM Table2)
As per your requirement, try this:
SELECT list
FROM Table2
WHERE list NOT IN (SELECT code from table1)
It will select the list values that are not in code.

The key point of the question is that the source data "ABC","BBB","TTT" needs to be turned into a table.
That table will look like this:
|---+
|val|
|---+
|ABC|
|BBB|
|TTT|
SQLite doesn't have a built-in split function, so it is a little hard to split your list into a table.
You can use a recursive CTE to emulate a split function.
You need to use the replace function to remove the " double quotes from your source data.
There are two columns in the CTE:
the val column carries your list data
the rest column remembers the part of the string that is still to be split
You will get a table from the CTE like this.
|---+
|val|
|---+
|ABC|
|BBB|
|TTT|
Then you can compare the data with table1.
Not IN
WITH RECURSIVE split(val, rest) AS (
SELECT '', replace('"ABC","BBB","TTT"','"','') || ','
UNION ALL
SELECT
substr(rest, 0, instr(rest, ',')),
substr(rest, instr(rest, ',')+1)
FROM split
WHERE rest <> '')
SELECT * from (
SELECT val
FROM split
WHERE val <> ''
) t where t.val not IN (
select t1.code
from table1 t1
)
sqlfiddle:https://sqliteonline.com/#fiddle-5adeba5dfcc2fks5jgd7ernq
Output:
+---+
|val|
+---+
|BBB|
|TTT|
If you want to show it on one line, use the GROUP_CONCAT function.
WITH RECURSIVE split(val, rest) AS (
SELECT '', replace('"ABC","BBB","TTT"','"','') || ','
UNION ALL
SELECT
substr(rest, 0, instr(rest, ',')),
substr(rest, instr(rest, ',')+1)
FROM split
WHERE rest <> '')
SELECT GROUP_CONCAT(val,',') val from (
SELECT val
FROM split
WHERE val <> ''
) t where t.val not IN (
select t1.code
from table1 t1
)
Output:
BBB,TTT
sqlfiddle:https://sqliteonline.com/#fiddle-5adecb92fcc36ks5jgda15yq
Note: SELECT * from [my list] where table1.code not in [my list] cannot work, because table1 does not appear in the FROM clause, so the query has no way to resolve the table1.code column.
You can use NOT EXISTS or a JOIN to get the result you expect.
sqlfiddle:https://sqliteonline.com/#fiddle-5adeba5dfcc2fks5jgd7ernq
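As a rough sketch of the NOT EXISTS and JOIN variants mentioned above, the following Python script runs both against an in-memory SQLite database, next to NOT IN for comparison. For brevity the list is written as a VALUES CTE instead of the recursive split. The NULL row in table1 is an assumption added for the demo: it is not in the question, but it shows a case where NOT IN and NOT EXISTS give different answers.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (code TEXT)")
# The three codes from the question, plus a NULL row added for this demo.
conn.executemany(
    "INSERT INTO table1 VALUES (?)", [("ABC",), ("DFG",), ("CDF",), (None,)]
)

# Anti-join with NOT EXISTS: keep list values with no matching row.
NOT_EXISTS = """
WITH my_list(val) AS (VALUES ('ABC'), ('BBB'), ('TTT'))
SELECT l.val FROM my_list l
WHERE NOT EXISTS (SELECT 1 FROM table1 t1 WHERE t1.code = l.val)
ORDER BY l.val
"""

# Anti-join with LEFT JOIN: unmatched list values have NULL on the right side.
LEFT_JOIN = """
WITH my_list(val) AS (VALUES ('ABC'), ('BBB'), ('TTT'))
SELECT l.val FROM my_list l
LEFT JOIN table1 t1 ON t1.code = l.val
WHERE t1.code IS NULL
ORDER BY l.val
"""

# NOT IN for comparison: a NULL in the subquery makes the test never true.
NOT_IN = """
WITH my_list(val) AS (VALUES ('ABC'), ('BBB'), ('TTT'))
SELECT val FROM my_list
WHERE val NOT IN (SELECT code FROM table1)
ORDER BY val
"""


def run(sql):
    """Execute `sql` and return the first column of every row as a list."""
    return [row[0] for row in conn.execute(sql)]


print(run(NOT_EXISTS))  # ['BBB', 'TTT']
print(run(LEFT_JOIN))   # ['BBB', 'TTT']
print(run(NOT_IN))      # [] because table1.code contains a NULL
```

If table1.code is declared NOT NULL, all three queries return the same rows, so the choice is then mostly a matter of style.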

Can you use common table expressions?
WITH temp(code) AS (VALUES('ABC'),('BBB'),('TTT'),(ETC...))
SELECT temp.code FROM temp WHERE temp.code NOT IN
(SELECT DISTINCT table1.code FROM table1);
This would allow you to create a temporary table defined with your list of strings within the VALUES statement. Then use standard SQL to select values NOT IN your table1.code column.

Is this solution good, or am I missing something?
create table table10 (code varchar(20));
insert into table10 (code) values ('ABC');
insert into table10 (code) values ('DFG');
insert into table10 (code) values ('CDF');
select * from (
select 'ABC' as x
union all select 'BBB'
union all select 'TTT'
) t where t.x not in (select code from table10);
-- returns: BBB
-- TTT
See SQL Fiddle.

This can also be achieved using a stored function:
DELIMITER //
drop function if exists testcsv
//
create function testcsv(csv varchar(255)) returns varchar(255)
deterministic
begin
declare pos, found int default 0;
declare this, notin varchar(255);
declare continue handler for not found set found = 0;
set notin = '';
repeat
set pos = instr(csv, ',');
if (pos = 0) then
set this = trim('"' from csv);
set csv = '';
else
set this = trim('"' from trim(substring(csv, 1, pos-1)));
set csv = substring(csv, pos+1);
end if;
select 1 into found from table1 where code = this;
if (not found) then
if (notin = '') then
set notin = this;
else
set notin = concat(notin, ',', this);
end if;
end if;
until csv = ''
end repeat;
return (notin);
end
//
select testcsv('"ABC","BBB","TTT","DFG"')
Output:
BBB,TTT
