When I run the query below, I get [7] as expected.
WITH Items AS (
  SELECT [] AS num1, CAST(NULL AS ARRAY<INTEGER>) AS num2
)
SELECT COALESCE(num2, [7]) FROM Items;
When I save Items to a table and then reference the table by name, I get [] instead, which I confirmed is not NULL using the WHERE clause.
CREATE TEMP TABLE Items AS (
  SELECT [] AS num1, CAST(NULL AS ARRAY<INTEGER>) AS num2
);
WITH t AS (
  SELECT COALESCE(num2, [7]) AS test FROM Items
)
SELECT *
FROM t
WHERE ARRAY_LENGTH(test) = 0
Is this expected behavior?
This link suggests empty arrays and NULL arrays are distinct: https://cloud.google.com/bigquery/docs/reference/standard-sql/data-types#array_nulls
I talked to someone and he pointed out this important statement (emphasis mine):
BigQuery translates a NULL ARRAY into an empty ARRAY in the query result,
although inside the query, NULL and empty ARRAYs are two distinct values.
It was news to him as well, but essentially, when you write to a table you are writing the query result, so the NULL ARRAY becomes an empty ARRAY.
The workaround is to use CREATE VIEW so that the NULLs stay part of the query.
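For example, a minimal sketch of that workaround (the dataset name mydataset is a placeholder; adjust it to your project):

CREATE VIEW mydataset.ItemsView AS
SELECT [] AS num1, CAST(NULL AS ARRAY<INTEGER>) AS num2;

-- Inside a query against the view the NULL ARRAY is still NULL,
-- so COALESCE fires and returns [7], as in the original WITH query.
SELECT COALESCE(num2, [7]) FROM mydataset.ItemsView;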
Related
I'm using BigQuery. I have a table A with a string array and I need to cast it to INT64/STRING (if possible) so I can join with table B, which is INT64/STRING.
The main ask here is:
I have a table A, where I have a string array mapped to a Ref ID as below:
I'm trying to unnest it, and my desired output should be as below.
I tried the script below:
SELECT a0_String_array,
       ref_id
FROM TableA AS t,
     t.String_array AS a0_String_array
But the challenge with the above script is that I have close to 1000 Ref IDs, yet my output returns only 100.
If I try the below, I'm able to get all 1000 rows.
SELECT string_array,
ref_id
FROM TableA
The end goal is that I need to unnest and cast to INT64/STRING. The above script is not working for my need. Can someone help with this?
You can use CROSS JOIN + UNNEST() in order to get the values from the array attributed to each ref_id:
select
ref_id,
unnested_numbers
from tablea
cross join unnest(string_array) as unnested_numbers
order by 2, 1
This should give you the desired output that you specified.
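And since the end goal is INT64, here is a hedged extension of the same query; SAFE_CAST is used so that any non-numeric string becomes NULL instead of raising an error:

select
    ref_id,
    safe_cast(unnested_numbers as int64) as unnested_int64
from tablea
cross join unnest(string_array) as unnested_numbers
order by 2, 1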
I have a table of 1000+ columns in Athena (Metabase), and I want to know how I can extract only those columns which are not null for a certain group of IDs.
Typically, this would require UNPIVOTing your columns to rows, checking which values are not null, and then PIVOTing back.
Judging from the documentation, Athena may be able to do this more simply.
As documented here
SELECT filter(ARRAY [-1, NULL, 10, NULL], q -> q IS NOT NULL)
Which returns:
[-1,10]
Unfortunately, since there is no way to be dynamic until we get to an array, this looks like:
WITH dataset AS (
SELECT
ID,
ARRAY[field1, field2, field3, .....] AS fields
FROM
Your1000ColumnTable
)
SELECT ID, filter(fields, q -> q IS NOT NULL) AS non_null_fields
FROM dataset
If you need to access the column names from the array, use a mapping to field names when creating the array as seen here
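A minimal sketch of that mapping idea, assuming placeholder column names field1 through field3 and that the columns share a common type:

WITH dataset AS (
  SELECT
    ID,
    -- MAP pairs each value with its column name; all values in the
    -- value array must share one type
    MAP(ARRAY['field1', 'field2', 'field3'],
        ARRAY[field1, field2, field3]) AS fields
  FROM Your1000ColumnTable
)
SELECT ID, map_filter(fields, (k, v) -> v IS NOT NULL) AS non_null_fields
FROM dataset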
I have a column of type jsonb[] (a Postgres array of jsonb objects) and I'd like to perform a SELECT on rows where a criterion is met by at least one of the objects. Something like:
-- Schema would be something like
mytable (
id UUID PRIMARY KEY,
col2 jsonb[] NOT NULL
);
-- Query I'd like to run
SELECT
id,
x->>'field1' AS field1
FROM
mytable
WHERE
x->>'field2' = 'user' -- for any x in the array stored in col2
I've looked around at ANY and UNNEST but it's not totally clear how to achieve this, since you can't run unnest in a WHERE clause. I also don't know how I'd specify that I want the field1 from the matching object.
Do I need a WITH table with the values expanded to join against? And how would I achieve that and keep the id from the other column?
Thanks!
You need to unnest the array; then you can access each JSON value:
SELECT t.id,
c.x ->> 'field1' AS field1
FROM mytable t
cross join unnest(col2) as c(x)
WHERE c.x ->> 'field2' = 'user'
This will return one row for each json value in the array.
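A minimal end-to-end illustration, reusing the schema from the question (the sample row is made up; gen_random_uuid() assumes Postgres 13+ or the pgcrypto extension):

INSERT INTO mytable VALUES
(gen_random_uuid(),
 ARRAY['{"field1": "a", "field2": "user"}',
       '{"field1": "b", "field2": "admin"}']::jsonb[]);

-- Only the first object has field2 = 'user', so exactly one row
-- comes back, with field1 = 'a'.
SELECT t.id,
       c.x ->> 'field1' AS field1
FROM mytable t
CROSS JOIN unnest(col2) AS c(x)
WHERE c.x ->> 'field2' = 'user';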
Consider a table temp (jsondata jsonb)
Postgres provides a way to check whether a jsonb array contains a value, using the ? operator:
SELECT jsondata
FROM temp
WHERE (jsondata->'properties'->'home') ? 'football'
But we can't use the LIKE operator for an array containment check. One way to get LIKE working against the array elements is:
SELECT jsondata
FROM temp,jsonb_array_elements_text(temp.jsondata->'properties'->'home')
WHERE value like '%foot%'
An OR operation with LIKE can be achieved using:
SELECT DISTINCT jsondata
FROM temp,jsonb_array_elements_text(temp.jsondata->'properties'->'home')
WHERE value like '%foot%' OR value like 'stad%'
But I am unable to perform an AND operation with the LIKE operator against the jsonb array elements.
After unnesting the array with jsonb_array_elements_text() you can check which values meet one of the conditions and sum the matches in groups by the original rows. Example:
drop table if exists temp;
create table temp(id serial primary key, jsondata jsonb);
insert into temp (jsondata) values
('{"properties":{"home":["football","stadium","16"]}}'),
('{"properties":{"home":["football","player","16"]}}'),
('{"properties":{"home":["soccer","stadium","16"]}}');
select jsondata
from temp
cross join jsonb_array_elements_text(temp.jsondata->'properties'->'home')
group by jsondata
-- or better:
-- group by id
having sum((value like '%foot%' or value like 'stad%')::int) = 2
jsondata
---------------------------------------------------------
{"properties": {"home": ["football", "stadium", "16"]}}
(1 row)
Update. The above query may be expensive with a large dataset. There is a simpler and faster solution: cast the array to text and apply LIKE to it, e.g.:
select jsondata
from temp
where jsondata->'properties'->>'home' like all('{%foot%, %stad%}');
jsondata
---------------------------------------------------------
{"properties": {"home": ["football", "stadium", "16"]}}
(1 row)
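Note that '{%foot%, %stad%}' is simply a Postgres array literal; if you prefer, the same predicate can be written with an explicit constructor:

select jsondata
from temp
where jsondata->'properties'->>'home' like all(array['%foot%', '%stad%']);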
I have the following; it was a bit fiddly. There's probably a better way, but I think this works.
The idea is to find the matching JSON array entries, then collect the results. In the join condition we check the "matches" array has the expected number of entries.
CREATE TABLE temp (jsondata jsonb);
INSERT INTO temp VALUES ('{"properties":{"home":["football","stadium",16]}}');
SELECT jsondata FROM temp t
INNER JOIN LATERAL (
SELECT array_agg(value) AS matches
FROM jsonb_array_elements_text(t.jsondata->'properties'->'home')
WHERE value LIKE '%foo%' OR value LIKE '%sta%'
LIMIT 1
) l ON array_length(matches, 1) = 2;
jsondata
-------------------------------------------------------
{"properties": {"home": ["football", "stadium", 16]}}
(1 row)
demo: db<>fiddle
I would cast the array into text. Then you are able to search for keywords with any string operator.
Disadvantage: because it was an array, the text contains characters such as brackets and commas, so it's not that simple to search for a keyword with a certain beginning (ABC%): you always have to search with %ABC%.
SELECT jsondata
FROM (
    SELECT
        jsondata,
        jsondata->'properties'->>'home' AS a
    FROM
        temp
) s
WHERE
    a LIKE '%stad%' AND a LIKE '%foot%'
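The derived table here only avoids repeating the JSON path expression; if you don't mind the repetition, the same filter works directly:

SELECT jsondata
FROM temp
WHERE jsondata->'properties'->>'home' LIKE '%stad%'
  AND jsondata->'properties'->>'home' LIKE '%foot%'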
I have a result set which returns 3 columns, one of which is varchar while the other two are arrays. I need to merge the array columns to create a new array of non-null, unique elements. I have tried different options and none of them are working. Any suggestions?
You can concatenate the arrays and unnest into rows. Then you can use distinct to get the unique rows, and array_agg to combine them back into an array:
select id
, array_agg(nr)
from (
select distinct id
, unnest(array[col1] || col2 || col3) nr
from t1
) sub
where nr is not null -- drop NULLs so only non-null unique elements remain
group by
id
Example at SQL Fiddle.
Thanks for the suggestions, guys. I found the solution: the array_cat function worked for me.
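For reference, a hedged sketch of that approach (array_cat concatenates exactly two arrays per call, so the calls nest for three inputs; note it does not deduplicate or drop NULLs on its own):

SELECT id,
       array_cat(array_cat(ARRAY[col1], col2), col3) AS merged
FROM t1;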