How to aggragate integers in postgresql? - sql

I have a query that gives list of IDs:
ID
2
3
4
5
6
25
ID is integer.
I want to get that result like that in ARRAY of integers type:
ID
2,3,4,5,6,25
I wrote this query:
select string_agg(ID::text,',')
from A
where .....
I have to convert it to text otherwise it won't work. string_agg expect to get (text,text)
this works fine the thing is that this result should later be used in many places that expect ARRAY of integers.
I tried :
select ('{' || string_agg(ID::text,',') || '}')::integer[]
from A
WHERE ...
which gives: {2,3,4,5,6,25} in type int4 integer[]
but this isn't the correct type... I need the same type as ARRAY.
for example SELECT ARRAY[4,5] gives array integer[]
in simple words I want the result of my query to work with (for example):
select *
from b
where b.ID = ANY (FIRST QUERY RESULT) // aka: = ANY (ARRAY[2,3,4,5,6,25])
this is failing as ANY expect array and it doesn't work with regular integer[], i get an error:
ERROR: operator does not exist: integer = integer[]
note: the result of the query is part of a function and will be saved in a variable for later work. Please don't take it to places where you bypass the problem and offer a solution which won't give the ARRAY of Integers.
EDIT: why does
select *
from b
where b.ID = ANY (array [4,5])
is working. but
select *
from b
where b.ID = ANY(select array_agg(ID) from A where ..... )
doesn't work
select *
from b
where b.ID = ANY(select array_agg(4))
doesn't work either
the error is still:
ERROR: operator does not exist: integer = integer[]

Expression select array_agg(4) returns set of rows (actually set of rows with 1 row). Hence the query
select *
from b
where b.id = any (select array_agg(4)) -- ERROR
tries to compare an integer (b.id) to a value of a row (which has 1 column of type integer[]). It raises an error.
To fix it you should use a subquery which returns integers (not arrays of integers):
select *
from b
where b.id = any (select unnest(array_agg(4)))
Alternatively, you can place the column name of the result of select array_agg(4) as an argument of any, e.g.:
select *
from b
cross join (select array_agg(4)) agg(arr)
where b.id = any (arr)
or
with agg as (
select array_agg(4) as arr)
select *
from b
cross join agg
where b.id = any (arr)
More formally, the first two queries use ANY of the form:
expression operator ANY (subquery)
and the other two use
expression operator ANY (array expression)
like it is described in the documentation: 9.22.4. ANY/SOME
and 9.23.3. ANY/SOME (array).

How about this query? Does this give you the expected result?
SELECT *
FROM b b_out
WHERE EXISTS (SELECT 1
FROM b b_in
WHERE b_out.id = b_in.id
AND b_in.id IN (SELECT <<first query that returns 2,3,4,...>>))
What I've tried to do is to break down the logic of ANY into two separate logical checks in order to achieve the same result.
Hence, ANY would be equivalent with a combination of EXISTS at least one of the values IN your list of values returned by the first SELECT.

Related

SQL Array with Null

I'm trying to group BigQuery columns using an array like so:
with test as (
select 1 as A, 2 as B
union all
select 3, null
)
select *,
[A,B] as grouped_columns
from test
However, this won't work, since there is a null value in column B row 2.
In fact this won't work either:
select [1, null] as test_array
When reading the documentation on BigQuery though, it says Nulls should be allowed.
In BigQuery, an array is an ordered list consisting of zero or more
values of the same data type. You can construct arrays of simple data
types, such as INT64, and complex data types, such as STRUCTs. The
current exception to this is the ARRAY data type: arrays of arrays are
not supported. Arrays can include NULL values.
There doesn't seem to be any attributes or safe prefix to be used with ARRAY() to handle nulls.
So what is the best approach for this?
Per documentation - for Array type
Currently, BigQuery has two following limitations with respect to NULLs and ARRAYs:
BigQuery raises an error if query result has ARRAYs which contain NULL elements, although such ARRAYs can be used inside the query.
BigQuery translates NULL ARRAY into empty ARRAY in the query result, although inside the query NULL and empty ARRAYs are two distinct values.
So, as of your example - you can use below "trick"
with test as (
select 1 as A, 2 as B union all
select 3, null
)
select *,
array(select cast(el as int64) el
from unnest(split(translate(format('%t', t), '()', ''), ', ')) el
where el != 'NULL'
) as grouped_columns
from test t
above gives below output
Note: above approach does not require explicit referencing to all involved columns!
My current solution---and I'm not a fan of it---is to use a combo of IFNULL(), UNNEST() and ARRAY() like so:
select
*,
array(
select *
from unnest(
[
ifnull(A, ''),
ifnull(B, '')
]
) as grouping
where grouping <> ''
) as grouped_columns
from test
An alternative way, you can replace NULL value to some NON-NULL figures using function IFNULL(null, 0) as given below:-
with test as (
select 1 as A, 2 as B
union all
select 3, IFNULL(null, 0)
)
select *,
[A,B] as grouped_columns
from test

Check if any of array elements starts with specific characters postgres

I am trying to check if an array has any of elements which starts with or ends with the string specified.
For Comparing strings we could do something like below,
select * from table where 'gggggg' ilike '%g';
Can someone help to find out if an array contains values as like pattern.
Eg array : ['str1', 'str2', 'str3']
Also I want to find if any of elements ends with 1 or starts with 'str'.
For now, the only thing you can do is unnest the array and test each element:
create table test (a text[]);
insert into test values (array['abc', 'def', 'ghi']);
select distinct a from test
JOIN lateral (select * from unnest(a) as u) as sub on TRUE
WHERE u like '%g';
a
---
(0 rows)
select distinct a from test
JOIN lateral (select * from unnest(a) as u) as sub on TRUE
WHERE u like 'g%';
a
---------------
{abc,def,ghi}
(1 row)
In postgres 12, you will be able to use jsonb_path_exists. Of course, this would work better if you stored your data in jsonb, but it will still work, just not as efficiently:
-- Starts with g
select a from test
where jsonb_path_exists(to_jsonb(a), '$[*] ? (# like_regex "^g")');
a
---------------
{abc,def,ghi}
(1 row)
-- Ends with g
select a from test
where jsonb_path_exists(to_jsonb(a), '$[*] ? (# like_regex "g$")');
a
---
(0 rows)

Hive - joining two tables to find string that like string in reference table

I stumbled in a case where requires to mask data using keyword from other reference table, illustrated below:
1:
Table A contains thousands of keyword and Table B contains 15 millions ++ row for each day processing..
How can I replace data in table B using keyword in table A in new column?
I tried to use join but join can only match when the string exactly the same
Here is my code
select
sourcetype, hourx,minutex,t1.adn,hostname,t1.appsid,t1.taskid,
product_id,
location,
smsIncoming,
case
when smsIncoming regexp keyword = true then keyword
else 'undef' end smsIncoming_replaced
from(
select ... from ...
)t1
left join
(select adn,keyword,type,mapping_param,mapping_param_json,appsid,taskid,is_api,charlentgh,wordcount,max(datex)
from ( select adn,keyword,type,mapping_param,mapping_param_json,appsid,taskid,is_api,charlentgh,wordcount,datex ,last_update,
max(last_update) over (partition by keyword) as last_modified
from sqm_stg.reflex_service_map ) as sub
where last_update = last_modified
group by adn,keyword,type,mapping_param,mapping_param_json,appsid,taskid,is_api,charlentgh,wordcount)t2
on t1.adn=t2.adn and t1.appsid=t2.appsid and t1.taskid=t2.taskid
Need advice :)
Thanks
Use instr(string str, string substr) function: Returns the position of the first occurrence of substr in str. Returns null if either of the arguments are null and returns 0 if substr could not be found in str. Be aware that this is not zero based. The first character in str has index 1.
case
when instr(smsIncoming,keyword) >0 then keyword
else 'undef'
end smsIncoming_replaced

db2 query returning 0 records

I am trying to execute below query:
select * from trgdms.src_stm where trim(src_stm_code) in (select concat (concat('''', trim(replace(processed_src_sys_cd,',',''','''))),'''' ) from trgdms.batch_run_log where src_stm_cd_or_hub_cd='CTR' and file_process_dt='2015-06-19' and batch_id=42);
the src_stm_code column of table trgdms.src_stm have values like T19,T68,T73 etc. When I run the inner query alone i do get the correct result:
select concat (concat('''', trim(replace(processed_src_sys_cd,',',''','''))),'''' ) from trgdms.batch_run_log where src_stm_cd_or_hub_cd='CTR' and file_process_dt='2015-06-19' and batch_id=42
Result: 'T19','T68','T73'
Wondered if anyone has used something similar in db2?
Remove the concat part from the inner query. When you concatenate values, the list of values generated will be treated as a string and the query doesn't return any results.
select * from trgdms.src_stm where trim(src_stm_code)
in (select trim(replace(processed_src_sys_cd,',',''',''')) from trgdms.batch_run_log
where src_stm_cd_or_hub_cd='CTR' and file_process_dt='2015-06-19' and batch_id=42);

Operator does not exist: integer = integer[] in a query with ANY

I frequently used integer = ANY(integer[]) syntax, but now ANY operator doesn't work. This is the first time I use it to compare a scalar with an integer returned from CTE, but I thought this shouldn't cause problems.
My query:
WITH bar AS (
SELECT array_agg(b) AS bs
FROM foo
WHERE c < 3
)
SELECT a FROM foo WHERE b = ANY ( SELECT bs FROM bar);
When I run it, it throws following error:
ERROR: operator does not exist: integer = integer[]: WITH bar AS (
SELECT array_agg(b) AS bs FROM foo WHERE c < 3 ) SELECT a FROM foo
WHERE b = ANY ( SELECT bs FROM bar)
Details in this SQL Fiddle.
So what am I doing wrong?
Based on the error message portion operator does not exist: integer = integer[], it appears that the bs column needs to be unnested, in order to get the right hand side back to an integer so the comparison operator can be found:
WITH bar AS (
SELECT array_agg(b) AS bs
FROM foo
WHERE c < 3
)
SELECT a
FROM foo
WHERE b = ANY ( SELECT unnest(bs) FROM bar);
This results in the output:
A
2
3
Given the doc for the ANY function:
The right-hand side is a parenthesized subquery, which must return
exactly one column. The left-hand expression is evaluated and compared
to each row of the subquery result using the given operator, which
must yield a Boolean result. The result of ANY is "true" if any true
result is obtained. The result is "false" if no true result is found
(including the case where the subquery returns no rows).
... the error makes sense, as the left-hand expression is an integer -- column b -- while the right-hand expression is an array of integers, or integer[], and so the comparison ends up being of the form integer = integer[], which doesn't have an operator, and therefore results in the error.
unnesting the integer[] value makes the left- and right-hand expressions integers, and so the comparison can continue.
Modified SQL Fiddle.
Note: that the same behavior is seen when using IN instead of = ANY.
without unnest
WITH bar AS (
SELECT array_agg(b) AS bs
FROM foo
WHERE c < 3
)
SELECT a FROM foo WHERE ( SELECT b = ANY (bs) FROM bar);
FYI, For me,
SELECT ... WHERE "id" IN (SELECT unnest(ids) FROM tablewithids)
was incomparably faster than
SELECT ... WHERE "id" = ANY((SELECT ids FROM tablewithids)::INT[])
Didn't do any research into why that was though.
column needs to be unnest
WITH bar AS (
SELECT array_agg(b) AS bs
FROM foo
WHERE c < 3
)
SELECT a
FROM foo
WHERE b = ANY ( SELECT unnest(bs) FROM bar);