Check if any array element starts with specific characters - postgres sql

I am trying to check if an array has any elements which start with or end with a specified string.
For comparing strings we could do something like below:
select * from table where 'gggggg' ilike '%g';
Can someone help me find out if an array contains values matching a LIKE pattern?
E.g. array: ['str1', 'str2', 'str3']
I want to find if any of the elements ends with '1' or starts with 'str'.

For now, the only thing you can do is unnest the array and test each element:
create table test (a text[]);
insert into test values (array['abc', 'def', 'ghi']);
select distinct a from test
JOIN lateral (select * from unnest(a) as u) as sub on TRUE
WHERE u like '%g';
a
---
(0 rows)
select distinct a from test
JOIN lateral (select * from unnest(a) as u) as sub on TRUE
WHERE u like 'g%';
a
---------------
{abc,def,ghi}
(1 row)
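For the concrete requirement in the question (any element ending with '1' or starting with 'str'), a minimal sketch of the same unnest idea, written with EXISTS against the test table above:
select *
from test
where exists (
    select 1
    from unnest(a) as u(elem)
    where elem like '%1'    -- ends with 1
       or elem like 'str%'  -- starts with 'str'
);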
In PostgreSQL 12, you will be able to use jsonb_path_exists. Of course, this would work better if you stored your data in jsonb, but it will still work, just not as efficiently:
-- Starts with g
select a from test
where jsonb_path_exists(to_jsonb(a), '$[*] ? (@ like_regex "^g")');
a
---------------
{abc,def,ghi}
(1 row)
-- Ends with g
select a from test
where jsonb_path_exists(to_jsonb(a), '$[*] ? (@ like_regex "g$")');
a
---
(0 rows)
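For comparison, a minimal sketch of the jsonb-stored variant the answer alludes to (test_jsonb is a hypothetical table holding the same values as a jsonb array), where jsonb_path_exists can be applied to the column directly:
create table test_jsonb (a jsonb);
insert into test_jsonb values ('["abc", "def", "ghi"]');
-- starts with g
select a
from test_jsonb
where jsonb_path_exists(a, '$[*] ? (@ like_regex "^g")');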

Related

postgres && - Array Overlap operator with wildcard

In postgres:
select array['some', 'word'] && array ['some','xxx'] -- returns true
select array['some', 'word'] && array ['','word'] -- returns true
I'd like to know how I can use the % wildcard in combination with the && operator.
select array['some%', 'word'] && array ['','some'] -- I was thinking this would return true but it doesn't.
I want to check if a text array contains at least one element of another text array. The first text array can contain wildcards. What's the best way to do that?
You could try unnest to parse every element of both arrays and compare them using LIKE or ILIKE:
SELECT EXISTS(
SELECT
FROM unnest(array['some%', 'word']) i (txt),
unnest(array ['','some']) j (txt)
WHERE j.txt LIKE i.txt) AS overlaps;
overlaps
----------
t
(1 row)
If you want to apply the % to all array elements, just concatenate it directly in the LIKE or ILIKE condition of the WHERE clause:
SELECT EXISTS(
SELECT
FROM unnest(array['some', 'word']) i (txt),
unnest(array ['','XXsomeXX']) j (txt)
WHERE j.txt LIKE '%'||i.txt||'%') AS overlaps;
overlaps
----------
t
(1 row)
Demo: db<>fiddle
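If this check is needed in several places, one option (sketched below; arrays_overlap_like is a made-up name, not a built-in operator or function) is to wrap the EXISTS query in a small SQL function:
create or replace function arrays_overlap_like(patterns text[], candidates text[])
returns boolean
language sql
immutable
as $$
    select exists (
        select 1
        from unnest(patterns) p (txt),
             unnest(candidates) c (txt)
        where c.txt like p.txt  -- patterns may contain % / _ wildcards
    );
$$;
-- usage:
select arrays_overlap_like(array['some%', 'word'], array ['','some']); -- t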

SQL Array with Null

I'm trying to group BigQuery columns using an array like so:
with test as (
select 1 as A, 2 as B
union all
select 3, null
)
select *,
[A,B] as grouped_columns
from test
However, this won't work, since there is a null value in column B row 2.
In fact this won't work either:
select [1, null] as test_array
When reading the BigQuery documentation, though, it says NULLs should be allowed:
In BigQuery, an array is an ordered list consisting of zero or more
values of the same data type. You can construct arrays of simple data
types, such as INT64, and complex data types, such as STRUCTs. The
current exception to this is the ARRAY data type: arrays of arrays are
not supported. Arrays can include NULL values.
There don't seem to be any attributes or a SAFE prefix to be used with ARRAY() to handle nulls.
So what is the best approach for this?
Per the documentation for the ARRAY type:
Currently, BigQuery has two following limitations with respect to NULLs and ARRAYs:
BigQuery raises an error if query result has ARRAYs which contain NULL elements, although such ARRAYs can be used inside the query.
BigQuery translates NULL ARRAY into empty ARRAY in the query result, although inside the query NULL and empty ARRAYs are two distinct values.
So, for your example you can use the below "trick":
with test as (
select 1 as A, 2 as B union all
select 3, null
)
select *,
array(select cast(el as int64) el
from unnest(split(translate(format('%t', t), '()', ''), ', ')) el
where el != 'NULL'
) as grouped_columns
from test t
The above gives the desired output, with the NULL elements dropped from grouped_columns.
Note: the above approach does not require explicitly referencing all involved columns!
My current solution (and I'm not a fan of it) is to use a combo of IFNULL(), UNNEST() and ARRAY() like so:
select
*,
array(
select *
from unnest(
[
ifnull(A, ''),
ifnull(B, '')
]
) as grouping
where grouping <> ''
) as grouped_columns
from test
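A related sketch of the same idea without the '' sentinel, assuming the columns really are INT64 as in the question: NULL elements are allowed inside the query, so they can be filtered away inside the ARRAY subquery before they reach the result:
with test as (
select 1 as A, 2 as B
union all
select 3, null
)
select *,
-- drop the NULL elements before the array reaches the query result
array(select x from unnest([A, B]) as x where x is not null) as grouped_columns
from test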
Alternatively, you can replace the NULL value with some non-NULL value using the IFNULL() function, as given below:
with test as (
select 1 as A, 2 as B
union all
select 3, IFNULL(null, 0)
)
select *,
[A,B] as grouped_columns
from test

NOT IN is not working as expected with Listagg function

Below is the DDL of the table
create or replace table tempdw.blk_table
(
db_name varchar,
tbl_expr varchar
);
insert into tempdw.blk_table values ('edw','ABC%');
insert into tempdw.blk_table values ('edw','EFG%');
select * from tempdw.blk_table;
The below code is not working; the expected output is that no rows should be returned:
select * from tempdw.blk_table where tbl_expr not in (
select regexp_replace(regexp_replace(replace(listagg(tbl_expr,','),',','\',\''),'^','\''),'$','\'') from tempdw.blk_table);
When I run the below code it works fine. I'm trying to understand why it's not working for the above code:
select * from tempdw.blk_table where tbl_expr NOT IN('ABC%','EFG%');
Au contraire, the code is working just fine. You are missing the difference between a single string that contains commas and a list of separate strings.
Unfortunately, it is rather hard to figure out what you do want to do, because your question does not explain that.
I can speculate that you want something like:
select bt.*
from blk_table bt
where db_name like tbl_expr;
This is just a guess, however.
with data as (
select * from values ('edw','ABC%'),('edw','ABC%') v(db_name,tbl_expr )
)
select * from data
where tbl_expr not in (
select regexp_replace(regexp_replace(replace(listagg(tbl_expr,','),',','\',\''),'^','\''),'$','\'') from data);
does indeed give the results you don't want, i.e.:
DB_NAME TBL_EXPR
edw ABC%
edw ABC%
because your sub-query has only one row of results, since you have aggregated the two input rows into one:
REGEXP_REPLACE( REGEXP_REPLACE( REPLACE( LISTAGG( TBL_EXPR,','),',','\',\''),'^','\''),'$','\'')
'ABC%','ABC%'
and NOT IN is an exact match, thus if we change from strings to numbers:
SELECT num, num in (2,3,4) FROM values (1),(3),(5) v(num);
gives:
NUM NUM IN (2,3,4)
1 0
3 1
5 0
so your NOT IN only returns strings that are not in the one-element list you have built, and given that element is the aggregate of the same input values, those values are by definition not equal to it.
back to strings..
SELECT str
,str in ('str_a', 'str_b')
,str not in ('str_a', 'str_b')
from values ('a'),('str_b') v(str);
gives:
STR STR IN ('STR_A', 'STR_B') STR NOT IN ('STR_A', 'STR_B')
a 0 1
str_b 1 0
Thus the results you are getting.
Now I suspect you want LIKE-type behavior OR a regex match, but given you are building the list, you know what you are doing there (a sketch of that follows below the note).
also note:
listagg(tbl_expr,',') AS a
,replace(a,',','\',\'') AS b
,regexp_replace(b,'^','\'') AS c
,regexp_replace(c,'$','\'') AS d
is the effect of what you are doing, and it can be replaced with:
listagg('\'' || tbl_expr || '\'',',')
unless you want strings with embedded commas to become independent "list" items.
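If LIKE-type matching against the stored patterns is indeed the goal, a hedged sketch that avoids building a quoted list altogether (edw.some_table and its column tbl_name are assumptions about the intent, not taken from the question):
-- keep only rows whose table name does not match any stored pattern
select t.*
from edw.some_table t
where not exists (
    select 1
    from tempdw.blk_table b
    where t.tbl_name like b.tbl_expr
);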

sql, strategies to find out whether a string contains certain texts

I want to select any data that contains 800, 805, 888... (there are 8 pattern texts) in the column.
Do I have to use a LIKE condition 8 times, once for each pattern, or is there a faster way?
Example:
SELECT * FROM caller
WHERE id LIKE '%805%' OR id LIKE '%800%' OR ... ;
(PS. I am not allowed to create another table, just using sql queries.)
LIKE is for strings, not for numbers. Assuming id is actually a number, you first need to cast it to a string in order to be able to apply a LIKE condition on it.
But once you do that, you can use an array for that:
SELECT *
FROM caller
WHERE id::text LIKE ANY (array['%805%', '%800', '81%']);
Use any() with an array of searched items:
with test(id, col) as (
values
(1, 'x800'),
(2, 'x855'),
(3, 'x900'),
(4, 'x920')
)
select *
from test
where col like any(array['%800', '%855'])
id | col
----+------
1 | x800
2 | x855
(2 rows)
This is shorter to write but not faster to execute, I think.
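Another possible sketch (not taken from the answers above) is a single regular expression with alternation; the three literals below stand in for the eight patterns mentioned in the question:
-- contains any of the listed substrings, via one regex alternation
select *
from caller
where id::text ~ '(800|805|888)';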

How to aggregate integers in postgresql?

I have a query that gives a list of IDs:
ID
2
3
4
5
6
25
ID is integer.
I want to get that result as an ARRAY of integers, like this:
ID
2,3,4,5,6,25
I wrote this query:
select string_agg(ID::text,',')
from A
where .....
I have to convert it to text, otherwise it won't work; string_agg expects (text, text).
This works fine; the thing is that this result should later be used in many places that expect an ARRAY of integers.
I tried:
select ('{' || string_agg(ID::text,',') || '}')::integer[]
from A
WHERE ...
which gives {2,3,4,5,6,25} shown as type int4 integer[],
but this isn't the correct type... I need the same type as ARRAY.
For example, SELECT ARRAY[4,5] gives array integer[].
In simple words, I want the result of my query to work with (for example):
select *
from b
where b.ID = ANY (FIRST QUERY RESULT) -- aka: = ANY (ARRAY[2,3,4,5,6,25])
This is failing; ANY expects an array, and it doesn't work with my regular integer[]. I get an error:
ERROR: operator does not exist: integer = integer[]
Note: the result of the query is part of a function and will be saved in a variable for later work. Please don't bypass the problem by offering a solution which won't give the ARRAY of integers.
EDIT: why does
select *
from b
where b.ID = ANY (array[4,5])
work, but
select *
from b
where b.ID = ANY (select array_agg(ID) from A where ..... )
does not?
select *
from b
where b.ID = ANY(select array_agg(4))
doesn't work either
the error is still:
ERROR: operator does not exist: integer = integer[]
The expression select array_agg(4) returns a set of rows (actually a set with one row). Hence the query
select *
from b
where b.id = any (select array_agg(4)) -- ERROR
tries to compare an integer (b.id) to a value of a row (which has 1 column of type integer[]). It raises an error.
To fix it you should use a subquery which returns integers (not arrays of integers):
select *
from b
where b.id = any (select unnest(array_agg(4)))
Alternatively, you can expose the result of select array_agg(4) as a column and use that column as the argument of ANY, e.g.:
select *
from b
cross join (select array_agg(4)) agg(arr)
where b.id = any (arr)
or
with agg as (
select array_agg(4) as arr)
select *
from b
cross join agg
where b.id = any (arr)
More formally, the first two queries use ANY of the form:
expression operator ANY (subquery)
and the other two use
expression operator ANY (array expression)
as described in the documentation: 9.22.4. ANY/SOME
and 9.23.3. ANY/SOME (array).
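Applied to the first query from the question, a minimal sketch of the array-expression form wraps the subquery in the ARRAY(...) constructor, so that ANY receives an array value rather than a set of rows (the WHERE clause is left elided as in the question):
select *
from b
where b.ID = any (array(select ID from A where .....));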
How about this query? Does this give you the expected result?
SELECT *
FROM b b_out
WHERE EXISTS (SELECT 1
FROM b b_in
WHERE b_out.id = b_in.id
AND b_in.id IN (SELECT <<first query that returns 2,3,4,...>>))
What I've tried to do is to break down the logic of ANY into two separate logical checks in order to achieve the same result.
Hence, ANY would be equivalent to a combination of EXISTS and checking that at least one of the values is IN your list of values returned by the first SELECT.