Split string in bigquery - google-bigquery

I have the following string that I would like to split into rows.
Example values in my column are:
['10000', '10001', '10002', '10003', '10004']
Using the SPLIT function, I get the following result:
I have two questions:
How do I split it so that I get '10000' instead of ['10000'?
How do I remove the apostrophes (')?
Response:

Consider the example below:
with t as (
select ['10000', '10001', '10002', '10003', '10004'] col
)
select cast(item as int64) num
from t, unnest(col) item
The above assumes that col is an array. If it is a string, use the following instead:
with t as (
select "['10000', '10001', '10002', '10003', '10004']" col
)
select cast(trim(item, " '[]") as int64) num
from t, unnest(split(col)) item
Both queries return the same output.
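As a quick cross-check of the string case outside BigQuery, the same trim-and-cast logic can be sketched in Python. The function name `parse_bracketed_ints` is hypothetical, and the sample value is the one from the question. This only illustrates what the second query is expected to return; it is not BigQuery code.

```python
def parse_bracketed_ints(col: str) -> list[int]:
    """Split on commas, strip spaces, quotes and brackets, then cast to int."""
    # Mirrors: cast(trim(item, " '[]") as int64) over unnest(split(col))
    return [int(item.strip(" '[]")) for item in col.split(",")]


col = "['10000', '10001', '10002', '10003', '10004']"
print(parse_bracketed_ints(col))  # [10000, 10001, 10002, 10003, 10004]
```

Each element of the result corresponds to one row of the num column in the query.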

Related

BigQuery - Count how many words in array are equal

I want to count how many similar words I have in a path (which will be split at delimiter /) and return a matching array of integers.
Input data will be something like:
I want to add another column, match_count, with an array of integers. For example:
To replicate this case, this is the query I'm working with:
CREATE TEMP FUNCTION HOW_MANY_MATCHES_IN_PATH(src_path ARRAY<STRING>, test_path ARRAY<STRING>) RETURNS ARRAY<INTEGER> AS (
-- WHAT DO I PUT HERE?
);
SELECT
*,
HOW_MANY_MATCHES_IN_PATH(src_path, test_path) as dir_path_match_count
FROM (
SELECT
ARRAY_AGG(x) AS src_path,
ARRAY_AGG(y) as test_path
FROM
UNNEST([
'lib/client/core.js',
'lib/server/core.js'
]) AS x, UNNEST([
'test/server/core.js'
]) as y
)
I've tried working with ARRAY and UNNEST in the HOW_MANY_MATCHES_IN_PATH function, but I either end up with an error or an array of 4 items (in this example)
Consider the approach below:
create temp function how_many_matches_in_path(src_path string, test_path string) returns integer as (
(select count(distinct src)
from unnest(split(src_path, '/')) src,
unnest(split(test_path, '/')) test
where src = test)
);
select *,
array( select how_many_matches_in_path(src, test)
from t.src_path src with offset
join t.test_path test with offset
using(offset)
) dir_path_match_count
from your_table t
Applied to the sample input data from your question:
with your_table as (
select
['lib/client/core.js', 'lib/server/core.js'] src_path,
['test/server/core.js', 'test/server/core.js'] test_path
)
output is
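To sanity-check the matching logic without running the query, here is a minimal Python sketch of the same idea: split both paths on "/", count the distinct segments that appear in both, and pair the two arrays by position. The Python function names simply mirror the SQL above and are not part of any library. The sample arrays are the ones from the answer.

```python
def how_many_matches_in_path(src_path: str, test_path: str) -> int:
    # Mirrors count(distinct src) ... where src = test:
    # the number of distinct segments present in both paths.
    return len(set(src_path.split("/")) & set(test_path.split("/")))


def dir_path_match_count(src_paths: list[str], test_paths: list[str]) -> list[int]:
    # zip pairs elements by position, like the join ... using(offset).
    return [how_many_matches_in_path(s, t) for s, t in zip(src_paths, test_paths)]


src_path = ["lib/client/core.js", "lib/server/core.js"]
test_path = ["test/server/core.js", "test/server/core.js"]
print(dir_path_match_count(src_path, test_path))  # [1, 2]
```

The first pair shares only core.js, while the second shares server and core.js, hence 1 and 2.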

REGEX get all matched patterns by SQL DB2

I need to extract from a string, by regex, every substring that matches the pattern "TTT\d{3}".
For the example string below, I would like to get:
TTT108,TTT109,TTT111,TTT110
The DB2 function I would like to use is REGEXP_REPLACE(str,'REGEX pattern', ',').
The number of matches in each string can be 0, 1, 2, 3, and so on.
Thank you.
The example:
TTT108(optional);TTT109(optional);TTT111(optional);TTT110optional);ENTITYLIST_2=(optional);ENTITYLIST_3=(optional);Containment_Status=(optional)
If you want to extract the valid tokens instead of replacing the invalid characters, please check whether this helps:
with data (s) as (values
('TTT108(optional);TTT109(optional);TTT111(optional);TTT110optional);ENTITYLIST_2=(optional);ENTITYLIST_3=(optional);Containment_Status=(optional)')
)
select listagg(sst,', ') within group (order by n)
from (
select n,
regexp_substr(s,'(TTT[0-9][0-9][0-9])', 1, n)
from data
cross join (values (1),(2),(3),(4),(5)) x (n) -- any numbers table
where n <= regexp_count(s,'(TTT[0-9][0-9][0-9])')
) x (n,sst)
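For comparison, the extract-then-join idea can be sketched in Python, which is handy for confirming the expected result before wiring up the SQL. This is an illustration only, not Db2 code, and `extract_tokens` is a hypothetical helper name. It uses `re.findall` in place of the numbers-table loop over regexp_substr.

```python
import re


def extract_tokens(s: str, sep: str = ",") -> str:
    """Return every TTT + three digits token found in s, joined by sep."""
    # findall returns all non-overlapping matches, left to right;
    # an input with no matches yields an empty string.
    return sep.join(re.findall(r"TTT\d{3}", s))


s = ("TTT108(optional);TTT109(optional);TTT111(optional);TTT110optional);"
     "ENTITYLIST_2=(optional);ENTITYLIST_3=(optional);Containment_Status=(optional)")
print(extract_tokens(s))  # TTT108,TTT109,TTT111,TTT110
```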
For any number of tokens & Db2 versions before 11.1:
select id, listagg(tok, ',') str
from
(
values
(1, 'TTT108(optional);TTT109(optional);TTT111(optional);TTT110optional);ENTITYLIST_2=(optional);ENTITYLIST_3=(optional);Containment_Status=(optional)')
) mytable (id, str)
, xmltable
(
'for $id in tokenize($s, ";") let $new := replace($id, "(TTT\d{3}).*", "$1") where matches($id, "(TTT\d{3}).*") return <i>{string($new)}</i>'
passing mytable.str as "s"
columns tok varchar(6) path '.'
) t
group by id;

Bigquery array of STRINGs to array of INTs

I'm trying to pull an array of INT64 values in BigQuery standard SQL from a column which is a long string of numbers separated by commas (for example, 2013,1625,1297,7634). I can pull an array of strings easily with:
SELECT
SPLIT(string_col,",")
FROM
table
However, I want to return an array of INT64 values, not an array of strings. How can I do that? I've tried
CAST(SPLIT(string_col,",") AS ARRAY<INT64>)
but that doesn't work.
Below is for BigQuery Standard SQL
#standardSQL
WITH yourTable AS (
SELECT 1 AS id, '2013,1625,1297,7634' AS string_col UNION ALL
SELECT 2, '1,2,3,4,5'
)
SELECT id,
(SELECT ARRAY_AGG(CAST(num AS INT64))
FROM UNNEST(SPLIT(string_col)) AS num
) AS num,
ARRAY(SELECT CAST(num AS INT64)
FROM UNNEST(SPLIT(string_col)) AS num
) AS num_2
FROM yourTable
Mikhail beat me to it and his answer is more extensive but adding this as a more minimal repro:
SELECT CAST(num as INT64) from unnest(SPLIT("2013,1625,1297,7634",",")) as num;
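The reason the direct CAST fails is that the conversion has to happen element by element, which is what the UNNEST plus CAST pattern does. The same per-element idea, sketched in Python with a hypothetical helper `to_int_array` (not BigQuery code):

```python
def to_int_array(string_col: str, delimiter: str = ",") -> list[int]:
    # Split into string elements, then convert each one individually,
    # like CAST(num AS INT64) applied over UNNEST(SPLIT(string_col)).
    return [int(num) for num in string_col.split(delimiter)]


print(to_int_array("2013,1625,1297,7634"))  # [2013, 1625, 1297, 7634]
```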

T-SQL function to split string with two delimiters as column separators into table

I'm looking for a T-SQL function that takes a string like:
a:b,c:d,e:f
and converts it to a table like:
ID Value
a b
c d
e f
Everything I found on the Internet handles single-column parsing (e.g. XMLSplit function variations), but none of it lets me describe my string with two delimiters, one for column separation and the other for row separation.
Can you please guide me on this? I have very limited T-SQL knowledge and cannot adapt those ready-made functions into a two-column solution.
You can find a split() function on the web. Then, you can do string logic:
select left(val, charindex(':', val) - 1) as col1,
substring(val, charindex(':', val) + 1, len(val)) as col2
from dbo.split(@str, ',') s(val);
You can use a custom SQL split function to separate the ID and value columns.
Here is a SQL split function that you can use on a development system.
It returns an ID value that helps keep each ID and value together.
You need to split twice: first on "," and then on ":".
declare @str nvarchar(100) = 'a:b,c:d,e:f'
-- pivot the two parts of each pair back onto a single row
select
id = max(id),
value = max(value)
from (
select
rowid,
id = case when id = 1 then val else null end,
value = case when id = 2 then val else null end
from (
select
s.id rowid, t.id, t.val
from (
-- first split: one row per "id:value" pair
select * from dbo.Split(@str, ',')
) s
-- second split: part 1 is the id, part 2 is the value
cross apply dbo.Split(s.val, ':') t
) k
) m group by rowid
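The two-pass split is easy to check outside SQL Server. Below is a small Python sketch of the same idea, with a hypothetical function `split_pairs` and default delimiters taken from the question. It is meant only to show the expected rows, not to replace the T-SQL function.

```python
def split_pairs(s: str, row_sep: str = ",", col_sep: str = ":") -> list[tuple[str, str]]:
    rows = []
    # First split: one element per "id:value" pair.
    for pair in s.split(row_sep):
        # Second split: text before the first col_sep is the id, the rest is the value.
        id_, _, value = pair.partition(col_sep)
        rows.append((id_, value))
    return rows


for id_, value in split_pairs("a:b,c:d,e:f"):
    print(id_, value)
# a b
# c d
# e f
```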

Get SQL Substring After a Certain Character but before a Different Character

I have some key values that I want to parse out of my SQL Server table. Here are some examples of these key values:
R50470B50469
B17699C88C68AM
R22818B17565C32G16SU
B1444
What I want to get out of each string is all the digits that occur after the character 'B' but before any other letter, such as 'C', if one exists. How can I do this in SQL?
WITH VALS(Val) AS
(
SELECT 'R50470B50469' UNION ALL
SELECT 'R22818B17565C32G16SU' UNION ALL
SELECT 'R22818B17565C32G16SU' UNION ALL
SELECT 'B1444'
)
SELECT SUBSTRING(Tail,0,PATINDEX('%[AC-Z]%', Tail))
FROM VALS
CROSS APPLY
(SELECT RIGHT(Val, LEN(Val) - CHARINDEX('B', Val)) + 'X') T(Tail)
WHERE Val LIKE '%B%'