TEXTJOIN-like function based on a condition in SQL - sql

Trying to figure out if it is possible to do a TEXTJOIN-like function in SQL based on a condition. Right now the only way I can think of doing it is by running a pivot to turn the columns into rows and aggregating them that way. Is that the only way to transpose the data in SQL?
Input: This would be a SQL table (tbl_fruit) that exists as the image depicts.
SELECT *
FROM tbl_fruit
Output

Below is for BigQuery Standard SQL (without specifically listing each column, so it scales to any number of columns)
#standardSQL
select `Group`, string_agg(split(kv, ':')[offset(0)], ', ') output
from `project.dataset.table` t,
unnest(split(translate(to_json_string((select as struct t.* except(`Group`))), '{}"', ''))) kv
where split(kv, ':')[offset(1)] != '0'
group by `Group`
If applied to the sample data from your question, the output is one comma-separated list of fruits per group.
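Here is a self-contained sketch you can run directly; the sample rows (groups A and B with 0/1 flags for apples, oranges, bananas and grapes) are assumed, since the original table is only shown as an image:
#standardSQL
with `project.dataset.table` as (
select 'A' as `Group`, 1 as apples, 0 as oranges, 1 as bananas, 0 as grapes union all
select 'B', 0, 1, 0, 1
)
select `Group`, string_agg(split(kv, ':')[offset(0)], ', ') output
from `project.dataset.table` t,
unnest(split(translate(to_json_string((select as struct t.* except(`Group`))), '{}"', ''))) kv
where split(kv, ':')[offset(1)] != '0'
group by `Group`
-- with the assumed rows this returns: A -> apples, bananas / B -> oranges, grapes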

In Big Query, you could do this with arrays:
select grp,
array_to_string(
[
case when apples = 1 then 'apples' end,
case when oranges = 1 then 'oranges' end,
case when bananas = 1 then 'bananas' end,
case when grapes = 1 then 'grapes' end
],
','
) as output
from mytable
This puts all the columns in an array, transcoding each 1 to the corresponding literal string and each 0 to a null value. Then array_to_string() builds the output CSV string - this function ignores null values by default.
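As a quick check of the NULL-skipping behaviour this relies on (the literal values here are just for illustration):
select array_to_string(['apples', null, 'bananas', null], ',') as output
-- returns: apples,bananas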

Related

BigQuery Standard SQL, get max value from json array

I have a BigQuery column which contains STRING values like
col1
[{"a":1,"b":2},{"a":2,"b":3}]
[{"a":3,"b":4},{"a":5,"b":6}]
Now, when doing a SELECT, I want to get just the max value of "a" in each JSON array. For example, here I would want the output of the SELECT on the table to be
2
5
Any ideas please? Thanks!
Use JSON_QUERY_ARRAY() to retrieve the array elements, then JSON_VALUE():
with t as (
select '[{"a":1,"b":2},{"a":2,"b":3}]' as col union all
select '[{"a":3,"b":4},{"a":5,"b":6}]'
)
select t.*,
(select max(cast(json_value(el, '$.a') as int64)) -- json_value() returns STRING, so cast to get a numeric max
from unnest(JSON_QUERY_ARRAY(col, '$')) el
)
from t;

BigQuery - concatenate ignoring NULL

I'm very new to SQL. I understand in MySQL there's the CONCAT_WS function, but BigQuery doesn't recognise this.
I have a bunch of twenty fields I need to CONCAT into one comma-separated string, but some are NULL, and if one is NULL then the whole result will be NULL. Here's what I have so far:
CONCAT(m.track1, ", ", m.track2))) As Tracks,
I tried this but it returns NULL too:
CONCAT(m.track1, IFNULL(m.track2,CONCAT(", ", m.track2))) As Tracks,
Super grateful for any advice, thank you in advance.
Unfortunately, BigQuery doesn't support concat_ws(). So, one method is string_agg():
select t.*,
(select string_agg(track, ',')
from (select t.track1 as track union all select t.track2) x
) x
from t;
Actually a simpler method uses arrays:
select t.*,
array_to_string([track1, track2], ',') as tracks
from t;
Arrays with NULL values are not supported in result sets, but they can be used for intermediate results.
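A minimal runnable sketch of the array approach, assuming a table with track1/track2 columns:
with t as (
select 'Song A' as track1, cast(null as string) as track2
)
select t.*, array_to_string([track1, track2], ',') as tracks
from t
-- returns: Song A (the NULL track2 is simply skipped)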
I have a bunch of twenty fields I need to CONCAT into one comma-separated string
Assuming that these are the only fields in the table, you can use the approach below - generic enough to handle any number of columns and their names without explicit enumeration
select
(select string_agg(col, ', ' order by offset)
from unnest(split(trim(format('%t', (select as struct t.*)), '()'), ', ')) col with offset
where not upper(col) = 'NULL'
) as Tracks
from `project.dataset.table` t
Below is an oversimplified dummy example to try out and test the approach
#standardSQL
with `project.dataset.table` as (
select 1 track1, 2 track2, 3 track3, 4 track4 union all
select 5, null, 7, 8
)
select
(select string_agg(col, ', ' order by offset)
from unnest(split(trim(format('%t', (select as struct t.*)), '()'), ', ')) col with offset
where not upper(col) = 'NULL'
) as Tracks
from `project.dataset.table` t
with output:
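Tracks
1, 2, 3, 4
5, 7, 8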

Count unique within combination of json keys in BigQuery

In BigQuery I do have a json stored in 1 column like this:
{"key1": "value1", "key3":"value3"}
{"key2": "value2"}
{"key3": "value3"}
What I'd like to know is how to calculate the number of unique combinations, bearing in mind that there can be 100+ different keys, so avoiding listing them explicitly would be beneficial.
In the example above the end result will be 2, because the first and third rows match on "key3", while the second didn't match anything.
I understand how to build this by writing an app that calculates it, but I'd like to see if there is any solution possible with a single query.
If your JSON values are formatted with no spaces after the :, then you can treat this as string manipulation:
with t as (
select '{"key1":"value1", "key3":"value3"}' as kv union all
select '{"key2":"value2"}' union all
select '{"key3":"value3"}'
)
select x, count(*)
from t cross join
unnest(regexp_extract_all(t.kv, '"[^,]+"')) x
group by x
having count(*) = 1;
With the spaces, you can use replace() to get rid of them:
with t as (
select '{"key1": "value1", "key3":"value3"}' as kv union all
select '{"key2": "value2"}' union all
select '{"key3": "value3"}'
)
select replace(x, '": "', '":"'), count(*)
from t cross join
unnest(regexp_extract_all(t.kv, '"[^,]+"')) x
group by 1
having count(*) = 1;

Convert comma separated string into rows

I have a comma separated string.
Now I'd like to separate this string value into each row.
Input:
1,2,3,4,5
Required output:
value
----------
1
2
3
4
5
How can I achieve this in SQL?
Thanks in advance.
Use the STRING_SPLIT function if you are on SQL Server
SELECT value
FROM STRING_SPLIT('1,2,3,4,5', ',')
Otherwise (e.g. in MySQL), you can loop over the string with the SUBSTRING_INDEX() function and insert each piece into a temporary table, as in the sketch below.
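A minimal sketch of that loop approach for MySQL/MariaDB; the procedure and temporary-table names are illustrative only:
DELIMITER //
CREATE PROCEDURE split_csv(IN p_str TEXT)
BEGIN
DROP TEMPORARY TABLE IF EXISTS tmp_values;
CREATE TEMPORARY TABLE tmp_values (value VARCHAR(255));
WHILE LENGTH(p_str) > 0 DO
-- take the text before the first comma and store it
INSERT INTO tmp_values (value) VALUES (TRIM(SUBSTRING_INDEX(p_str, ',', 1)));
-- drop the consumed piece (and its comma); stop once nothing is left
IF LOCATE(',', p_str) > 0 THEN
SET p_str = SUBSTRING(p_str, LOCATE(',', p_str) + 1);
ELSE
SET p_str = '';
END IF;
END WHILE;
SELECT value FROM tmp_values;
END //
DELIMITER ;

CALL split_csv('1,2,3,4,5');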
If you are using Postgres, you can use string_to_array and unnest:
select *
from unnest(string_to_array('1,2,3,4,5', ',')) as t(value);
In Postgres, you can also use the regexp_split_to_table() function:
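For example:
select regexp_split_to_table('1,2,3,4,5', ',') as value;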
If you're using MariaDB or MySQL you can use a recursive CTE such as:
with recursive itemtable as (
select
trim(substring_index(data, ',', 1)) as value,
right(data, length(data) - locate(',', data, 1)) as data
from (select '1,2,3,4,5' as data) as input
union
select
trim(substring_index(data, ',', 1)) as value,
right(data, length(data) - locate(',', data, 1)) as data
from itemtable
)
select value from itemtable;
The plain UNION (rather than UNION ALL) stops the recursion once the last element starts repeating, since duplicate rows are discarded and no new rows are produced.

T-SQL function to split string with two delimiters as column separators into table

I'm looking for a T-SQL function to take a string like:
a:b,c:d,e:f
and convert it to a table like
ID Value
a b
c d
e f
Everything I found on the Internet handled single-column parsing (e.g. XMLSplit function variations), but none of them let me describe my string with two delimiters, one for column separation and the other for row separation.
Can you please guide me on this issue? I have very limited T-SQL knowledge and cannot adapt those ready-made functions into a two-column solution.
You can find a split() function on the web. Then, you can do string logic:
select left(val, charindex(':', val) - 1) as col1,
substring(val, charindex(':', val) + 1, len(val)) as col2
from dbo.split(@str, ',') s(val);
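If you are on SQL Server 2016 or later, the built-in STRING_SPLIT() can stand in for a hand-rolled split function; a minimal sketch (the variable name is illustrative):
declare @str nvarchar(100) = 'a:b,c:d,e:f';

select left(s.value, charindex(':', s.value) - 1) as ID,
substring(s.value, charindex(':', s.value) + 1, len(s.value)) as Value
from string_split(@str, ',') s;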
You can use a custom SQL Split function in order to separate the data and value columns.
Here is a SQL Split function that you can use on a development system.
It returns an ID value that can be helpful to keep id and value together.
You need to split twice: first on the "," character, then a second split on the ":" character.
declare @str nvarchar(100) = 'a:b,c:d,e:f'
select
id = max(id),
value = max(value)
from (
select
rowid,
id = case when id = 1 then val else null end,
value = case when id = 2 then val else null end
from (
select
s.id rowid, t.id, t.val
from (
select * from dbo.Split(@str, ',')
) s
cross apply dbo.Split(s.val, ':') t
) k
) m group by rowid
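With the sample string, and assuming dbo.Split() returns an ordinal id alongside each value as described above, this should produce:
id value
a b
c d
e f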