postgres extract int array from text - sql

I have the following postgresql statement:
SELECT 1 = ANY( jsonb_array_elements_text('[2, 1, 3]') );
Basically, I have a string that contains a comma-separated array of integers, such as [1, 2, 3]; sometimes the array is empty, as in []. I want to write a query (as part of a bigger query) that checks whether a given value matches any integer in that text. For example:
SELECT 1 = ANY( jsonb_array_elements_text('[2, 1, 3]') ); -- Should return true
SELECT 1 = ANY( jsonb_array_elements_text('[]') ); -- should return false
However, the above query fails with an error message:
ERROR: op ANY/ALL (array) requires array on right side
LINE 1: SELECT 1 = ANY( jsonb_array_elements_text('[2, 1, 3]') );
How can I extract an integer array from text so that I can use it in a join condition?
I am using Postgres 9.4, if it matters.

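I found it. jsonb_array_elements_text returns a set of rows rather than an array, which is why ANY (array) rejects it. Using IN with a subquery works: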
SELECT 1 IN (SELECT json_array_elements('[2, 1, 3]')::text::int);
SELECT 1 IN (SELECT json_array_elements('[]')::text::int);
SELECT 1 IN (SELECT json_array_elements('[12, 10, 3]')::text::int);
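For readers who want to check the expected results outside the database, here is a minimal Python sketch of the same membership test. The helper name `contains_int` is hypothetical and is not part of the original answer; it only mirrors what the three queries above should return.

```python
import json


def contains_int(array_text: str, needle: int) -> bool:
    """Return True if the JSON array stored in array_text contains needle."""
    elements = json.loads(array_text)  # '[2, 1, 3]' -> [2, 1, 3], '[]' -> []
    # Cast each element, mirroring the ::text::int cast in the SQL answer.
    return any(int(element) == needle for element in elements)


print(contains_int("[2, 1, 3]", 1))    # matches the first query
print(contains_int("[]", 1))           # matches the second query
print(contains_int("[12, 10, 3]", 1))  # matches the third query
```

An empty array simply yields no elements to compare against, so the check is false, which is the behaviour the question asks for.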

Related

Is there a way of replace null values in a JSON bigquery field?

I have a JSON value like the one below in a certain column of my table:
{"values":[1, 2, null, 4, null]}
What I want is to convert the value into a BigQuery array of type ARRAY<INT64>.
I tried JSON_VALUE_ARRAY, but it throws an error because the final output cannot be an array with NULLs.
That said, what is the correct approach?
You can unnest an array that contains null elements. When building the new array, pass the IGNORE NULLS flag to ARRAY_AGG to drop the null values. Note that JSON_VALUE_ARRAY returns ARRAY<STRING>, so cast each element if you need INT64.
with tbl as (select JSON '{"values":[1, 2, null, 4, null]}' as data union all select JSON ' {"values":[ ] }')
select *,
((Select array_agg(x ignore nulls) from unnest(JSON_VALUE_ARRAY (data.values ) ) x))
from tbl
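The effect of unnesting and re-aggregating with IGNORE NULLS can be sketched in Python as a simple filter. This is only an illustration of the intended result, not BigQuery code; the function name `non_null_values` is an assumption made for the example.

```python
import json


def non_null_values(doc_text: str) -> list:
    """Return the integers under "values", skipping JSON nulls."""
    values = json.loads(doc_text).get("values", [])
    # JSON null becomes None in Python; dropping it mirrors IGNORE NULLS.
    return [int(value) for value in values if value is not None]


print(non_null_values('{"values":[1, 2, null, 4, null]}'))
print(non_null_values(' {"values":[ ] }'))
```

The two inputs are the same two rows used in the `tbl` CTE above.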

SQL Array with Null

I'm trying to group BigQuery columns using an array like so:
with test as (
select 1 as A, 2 as B
union all
select 3, null
)
select *,
[A,B] as grouped_columns
from test
However, this won't work, since there is a null value in column B row 2.
In fact this won't work either:
select [1, null] as test_array
When reading the documentation on BigQuery though, it says Nulls should be allowed.
In BigQuery, an array is an ordered list consisting of zero or more
values of the same data type. You can construct arrays of simple data
types, such as INT64, and complex data types, such as STRUCTs. The
current exception to this is the ARRAY data type: arrays of arrays are
not supported. Arrays can include NULL values.
There don't seem to be any attributes or a SAFE prefix that can be used with ARRAY() to handle nulls.
So what is the best approach for this?
Per the documentation for the ARRAY type:
Currently, BigQuery has two following limitations with respect to NULLs and ARRAYs:
BigQuery raises an error if query result has ARRAYs which contain NULL elements, although such ARRAYs can be used inside the query.
BigQuery translates NULL ARRAY into empty ARRAY in the query result, although inside the query NULL and empty ARRAYs are two distinct values.
So, for your example, you can use the "trick" below:
with test as (
select 1 as A, 2 as B union all
select 3, null
)
select *,
array(select cast(el as int64) el
from unnest(split(translate(format('%t', t), '()', ''), ', ')) el
where el != 'NULL'
) as grouped_columns
from test t
For the sample data, this returns grouped_columns = [1, 2] for the first row and [3] for the second.
Note: this approach does not require explicitly referencing every column involved!
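The string manipulation in that trick can be traced step by step in Python. This sketch assumes that FORMAT('%t', t) renders the two sample rows as "(1, 2)" and "(3, NULL)"; those literals and the function name `grouped_columns` are assumptions made for illustration.

```python
def grouped_columns(row_text: str) -> list:
    """Emulate translate(..., '()', ''), split(..., ', ') and the NULL filter."""
    # translate(text, '()', '') removes both parentheses.
    stripped = row_text.translate(str.maketrans("", "", "()"))
    # split(text, ', ') yields one string per column value.
    elements = stripped.split(", ")
    # where el != 'NULL', followed by cast(el as int64).
    return [int(element) for element in elements if element != "NULL"]


print(grouped_columns("(1, 2)"))
print(grouped_columns("(3, NULL)"))
```

Because the approach works on the textual rendering of the whole row, it would need adjusting for columns whose values themselves contain ", " or parentheses.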
My current solution---and I'm not a fan of it---is to use a combo of IFNULL(), UNNEST() and ARRAY() like so:
select
*,
array(
select *
from unnest(
[
ifnull(A, ''),
ifnull(B, '')
]
) as grouping
where grouping <> ''
) as grouped_columns
from test
Alternatively, you can replace NULL values with a non-NULL placeholder using IFNULL, for example IFNULL(null, 0), as shown below:
with test as (
select 1 as A, 2 as B
union all
select 3, IFNULL(null, 0)
)
select *,
[A,B] as grouped_columns
from test

How to count setof / number of keys of JSON in postgresql?

I have a jsonb column storing a map, like {"a":1,"b":2,"c":3}, where the number of keys differs from row to row.
I want to count the keys. jsonb_object_keys can retrieve them, but it returns a setof.
Is there something like this?
select count(jsonb_object_keys(obj)) from XXX
(This won't work: ERROR: set-valued function called in context that cannot accept a set)
From the Postgres "JSON Functions and Operators" documentation:
Function: json_object_keys(json), jsonb_object_keys(jsonb)
Return type: setof text
Description: Returns set of keys in the outermost JSON object.
Example: json_object_keys('{"f1":"abc","f2":{"f3":"a", "f4":"b"}}')
Example result:
 json_object_keys
------------------
 f1
 f2
Crosstab isn't feasible as the number of key could be large.
Shortest:
SELECT count(*) FROM jsonb_object_keys('{"a": 1, "b": 2, "c": 3}'::jsonb);
Returns 3
If you want the number of keys for every row of a table:
SELECT (SELECT COUNT(*) FROM jsonb_object_keys(myJsonField)) nbr_keys FROM myTable;
You could convert keys to array and use array_length to get this:
select array_length(array_agg(A.key), 1) from (
select json_object_keys('{"f1":"abc","f2":{"f3":"a", "f4":"b"}}') as key
) A;
If you need to get this for the whole table, you can just group by primary key.
While a sub select must be used to convert the JSON keys set to rows, the following tweaked query might run faster by skipping building the temporary array:
SELECT count(*) FROM
(SELECT jsonb_object_keys('{"a": 1, "b": 2, "c": 3}'::jsonb)) v;
and it's a bit shorter ;)
To make it a function:
CREATE OR REPLACE FUNCTION public.count_jsonb_keys(j jsonb)
RETURNS bigint
LANGUAGE sql
AS $function$
SELECT count(*) from (SELECT jsonb_object_keys(j)) v;
$function$
Alternately, you could simply return the upper bounds of the keys when listed as an array:
SELECT
ARRAY_UPPER( -- Grab the upper bounds of the array
ARRAY( -- Convert rows into an array.
SELECT JSONB_OBJECT_KEYS(obj)
),
1 -- The array's dimension we're interested in retrieving the count for
) AS count
FROM
xxx
Using '{"a": 1, "b": 2, "c": 3}'::jsonb as obj, count would result in a value of three (3).
Pasteable example:
SELECT
ARRAY_UPPER( -- Grab the upper bounds of the array
ARRAY( -- Convert rows into an array.
SELECT JSONB_OBJECT_KEYS('{"a": 1, "b": 2, "c": 3}'::jsonb)
),
1 -- The array's dimension we're interested in retrieving the count for
) AS count
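One difference between the count(*) and ARRAY_UPPER approaches shows up for an empty object: count(*) gives 0, while array_upper of an empty array is NULL in Postgres. The Python sketch below models both behaviours; the function names are hypothetical and the code only illustrates the semantics, it does not run against a database.

```python
import json


def count_keys(obj_text: str) -> int:
    """Model of SELECT count(*) FROM jsonb_object_keys(obj)."""
    return len(json.loads(obj_text))


def array_upper_of_keys(obj_text: str):
    """Model of ARRAY_UPPER(ARRAY(SELECT jsonb_object_keys(obj)), 1)."""
    keys = list(json.loads(obj_text))
    # array_upper returns NULL for an empty array; None stands in for NULL.
    return len(keys) if keys else None


print(count_keys('{"a": 1, "b": 2, "c": 3}'), array_upper_of_keys('{"a": 1, "b": 2, "c": 3}'))
print(count_keys("{}"), array_upper_of_keys("{}"))
```

If rows with empty objects are possible, wrapping the ARRAY_UPPER result in COALESCE(..., 0) is one way to get matching results.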

How to aggregate integers in postgresql?

I have a query that gives list of IDs:
ID
2
3
4
5
6
25
ID is integer.
I want to get that result as an ARRAY of integers, like this:
ID
2,3,4,5,6,25
I wrote this query:
select string_agg(ID::text,',')
from A
where .....
I have to cast ID to text, otherwise it won't work: string_agg expects (text, text).
This works fine, but the result is later used in many places that expect an ARRAY of integers.
I tried :
select ('{' || string_agg(ID::text,',') || '}')::integer[]
from A
WHERE ...
which gives {2,3,4,5,6,25} with type int4 integer[],
but this doesn't seem to be the correct type. I need the same type as ARRAY.
For example, SELECT ARRAY[4,5] gives array integer[].
In simple words, I want the result of my query to work with (for example):
select *
from b
where b.ID = ANY (FIRST QUERY RESULT) -- aka: = ANY (ARRAY[2,3,4,5,6,25])
This fails, as ANY expects an array and doesn't seem to work with a regular integer[]. I get an error:
ERROR: operator does not exist: integer = integer[]
Note: the query is part of a function, and its result will be saved in a variable for later work. Please don't offer workarounds that bypass the problem without producing an ARRAY of integers.
EDIT: Why does this work:
select *
from b
where b.ID = ANY (array [4,5])
but this doesn't:
select *
from b
where b.ID = ANY(select array_agg(ID) from A where ..... )
and neither does this:
select *
from b
where b.ID = ANY(select array_agg(4))
The error is still:
ERROR: operator does not exist: integer = integer[]
The expression select array_agg(4) returns a set of rows (here, a set containing a single row). Hence the query
select *
from b
where b.id = any (select array_agg(4)) -- ERROR
tries to compare an integer (b.id) to a value of a row (which has 1 column of type integer[]). It raises an error.
To fix it you should use a subquery which returns integers (not arrays of integers):
select *
from b
where b.id = any (select unnest(array_agg(4)))
Alternatively, you can place the column name of the result of select array_agg(4) as an argument of any, e.g.:
select *
from b
cross join (select array_agg(4)) agg(arr)
where b.id = any (arr)
or
with agg as (
select array_agg(4) as arr)
select *
from b
cross join agg
where b.id = any (arr)
More formally, the first two queries use ANY of the form:
expression operator ANY (subquery)
and the other two use
expression operator ANY (array expression)
as described in the documentation, sections 9.22.4 (ANY/SOME) and 9.23.3 (ANY/SOME (array)).
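The difference between the two forms of ANY can be pictured with a toy Python model. This is a sketch of the semantics only, not of how Postgres evaluates the expressions; the function names are hypothetical, and a Python list stands in for an integer[] value.

```python
def any_subquery(value, rows):
    """Model of `value = ANY (subquery)`: compare value with each row's single column."""
    for column in rows:
        if isinstance(column, list):
            # The row holds an array, so the comparison is integer = integer[].
            raise TypeError("operator does not exist: integer = integer[]")
    return any(value == column for column in rows)


def any_array(value, array):
    """Model of `value = ANY (array expression)`: compare value with each element."""
    return any(value == element for element in array)


def unnest(array):
    """Model of unnest(): one row per array element."""
    return list(array)


agg = [4]              # array_agg(4): a single integer[] value
subquery_rows = [agg]  # `select array_agg(4)`: one row holding that array

print(any_array(4, agg))             # array form: compares 4 with each element
print(any_subquery(4, unnest(agg)))  # subquery form over unnested integers
```

Calling `any_subquery(4, subquery_rows)` raises the TypeError, which corresponds to the error quoted in the question.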
How about this query? Does this give you the expected result?
SELECT *
FROM b b_out
WHERE EXISTS (SELECT 1
FROM b b_in
WHERE b_out.id = b_in.id
AND b_in.id IN (SELECT <<first query that returns 2,3,4,...>>))
What I've tried to do is break the logic of ANY into two separate checks that achieve the same result.
That is, ANY is equivalent to EXISTS (at least one row) combined with IN (the list of values returned by the first SELECT).

SQL: finding strings with duplicate characters

I have a list of strings:
HEAWAMFWSP
TLHHHAFWSP
AWAMFWHHAW
AUAWAMHHHA
Each of these strings represents five 2-character pairs (i.e. HE AW AM FW SP).
What I am looking to do in SQL is display all strings that contain a duplicated pair.
Take string number 3 from above; AW AM FW HH AW. I need to display this record because it has a duplicate pair (AW).
Is this possible?
Thanks!
Given the current requirements, yes, this is doable. Here's a version that uses a recursive CTE (the text may need adjusting for vendor idiosyncrasies), written and tested on DB2. Note that it returns multiple rows if a string contains more than two instances of a pair, or more than one set of duplicates.
WITH RECURSIVE Pair (rowid, start, pair, text) as (
SELECT id, 1, SUBSTR(text, 1, 2), text
FROM SourceTable
UNION ALL
SELECT rowid, start + 2, SUBSTR(text, start + 2, 2), text
FROM Pair
WHERE start < LENGTH(text) - 1)
SELECT Pair.rowid, Pair.pair, Pair.start, Duplicate.start, Pair.text
FROM Pair
JOIN Pair as Duplicate
ON Duplicate.rowid = Pair.rowid
AND Duplicate.pair = Pair.pair
AND Duplicate.start > Pair.start
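To try the recursive approach without a DB2 instance, the sketch below runs an adapted version of the query in SQLite through Python's sqlite3 module. The table layout, the renamed columns (row_id, pos, txt) and the use of SQLite are assumptions made for this example; the logic follows the query above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE SourceTable (id INTEGER PRIMARY KEY, txt TEXT)")
conn.executemany(
    "INSERT INTO SourceTable (id, txt) VALUES (?, ?)",
    [(1, "HEAWAMFWSP"), (2, "TLHHHAFWSP"), (3, "AWAMFWHHAW"), (4, "AUAWAMHHHA")],
)

query = """
WITH RECURSIVE Pair (row_id, pos, pair, txt) AS (
    -- Anchor: the first pair of every string.
    SELECT id, 1, SUBSTR(txt, 1, 2), txt
    FROM SourceTable
    UNION ALL
    -- Step: move two characters to the right until the last pair is reached.
    SELECT row_id, pos + 2, SUBSTR(txt, pos + 2, 2), txt
    FROM Pair
    WHERE pos < LENGTH(txt) - 1
)
SELECT p.row_id, p.pair, p.pos, d.pos, p.txt
FROM Pair AS p
JOIN Pair AS d
  ON d.row_id = p.row_id
 AND d.pair = p.pair
 AND d.pos > p.pos
ORDER BY p.row_id, p.pos, d.pos
"""

rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

With the four sample strings from the question, only the third string contains a repeated pair (AW at positions 1 and 9).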
Here's a not very elegant solution, but it works and returns each row only once, no matter how many duplicate pairs it contains. It uses the SQL Server SUBSTRING(expression, start, length) function; the Oracle equivalent is SUBSTR.
select ID, Value
from MyTable
-- the third argument of substring is the length (always 2), not the end position
where (substring(Value,1,2) = substring(Value,3,2)
or substring(Value,1,2) = substring(Value,5,2)
or substring(Value,1,2) = substring(Value,7,2)
or substring(Value,1,2) = substring(Value,9,2)
or substring(Value,3,2) = substring(Value,5,2)
or substring(Value,3,2) = substring(Value,7,2)
or substring(Value,3,2) = substring(Value,9,2)
or substring(Value,5,2) = substring(Value,7,2)
or substring(Value,5,2) = substring(Value,9,2)
or substring(Value,7,2) = substring(Value,9,2))
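The pairwise comparison can be exercised in SQLite from Python as a quick check. This sketch passes a fixed length of 2 to every call, since the third argument of SUBSTRING (substr in SQLite) is a length rather than an end position. The SQLite setup and the extra fifth row, which has its duplicate in the middle of the string, are assumptions added for illustration.

```python
import sqlite3
from itertools import combinations

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE MyTable (ID INTEGER PRIMARY KEY, Value TEXT)")
conn.executemany(
    "INSERT INTO MyTable (ID, Value) VALUES (?, ?)",
    [
        (1, "HEAWAMFWSP"),
        (2, "TLHHHAFWSP"),
        (3, "AWAMFWHHAW"),
        (4, "AUAWAMHHHA"),
        (5, "HEAWAWFWSP"),  # hypothetical row: AW repeated at positions 3 and 5
    ],
)

# Start positions of the five pairs; every pair is exactly 2 characters long.
starts = (1, 3, 5, 7, 9)
conditions = " OR ".join(
    f"substr(Value, {a}, 2) = substr(Value, {b}, 2)"
    for a, b in combinations(starts, 2)
)

matching_ids = [
    row[0]
    for row in conn.execute(f"SELECT ID FROM MyTable WHERE {conditions} ORDER BY ID")
]
print(matching_ids)
```

Generating the ten comparisons with `combinations` keeps the WHERE clause in sync with the list of start positions if the string length ever changes.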