PostgreSQL case insensitive SELECT on array - sql

I'm having problems finding the answer here, on google or in the docs ...
I need to do a case insensitive select against an array type.
So if:
value = {"Foo","bar","bAz"}
I need
SELECT value FROM table WHERE 'foo' = ANY(value)
to match.
I've tried lots of combinations of lower() with no success.
ILIKE instead of = seems to work but I've always been nervous about LIKE - is that the best way?

One alternative not mentioned is to install the citext extension that comes with PostgreSQL 8.4+ and use an array of citext:
regress=# CREATE EXTENSION citext;
regress=# SELECT 'foo' = ANY( '{"Foo","bar","bAz"}'::citext[] );
?column?
----------
t
(1 row)
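To apply the same idea to a table rather than a literal, the column itself can be declared as citext[]. A minimal sketch, assuming the citext extension is already installed and using a made-up table name (tags) that is not part of the question:

```sql
-- "tags" is a hypothetical table; "value" mirrors the column name in the question.
CREATE TABLE tags (
    id    serial PRIMARY KEY,
    value citext[]
);

INSERT INTO tags (value) VALUES ('{"Foo","bar","bAz"}');

-- The comparison uses citext equality, so this matches the row inserted above.
SELECT value FROM tags WHERE 'foo' = ANY(value);
```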
If you want to be strictly correct about this and avoid extensions you have to do some pretty ugly subqueries because Pg doesn't have many rich array operations, in particular no functional mapping operations. Something like:
SELECT array_agg(lower(($1)[n])) FROM generate_subscripts($1,1) n;
... where $1 is the array parameter. In your case I think you can cheat a bit because you don't care about preserving the array's order, so you can do something like:
SELECT 'foo' IN (SELECT lower(x) FROM unnest('{"Foo","bar","bAz"}'::text[]) x);
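Against a table, the same unnest() idea can be written as a correlated EXISTS. This is only a sketch, assuming a hypothetical table items with a text[] column named value:

```sql
-- "items" is a made-up table name used for illustration.
SELECT t.value
FROM items AS t
WHERE EXISTS (
    SELECT 1
    FROM unnest(t.value) AS x   -- one row per array element
    WHERE lower(x) = 'foo'      -- the search term must already be lower case
);
```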

This seems hackish to me but I think it should work
SELECT value FROM table WHERE 'foo' = ANY(lower(value::text)::text[])
ilike could have issues if your arrays can have _ or %
Note that what you are doing is converting the text array to a single text string, converting it to lower case, and then back to an array. This should be safe. If this is not sufficient you could use various combinations of string_to_array and array_to_string, but I think the standard textual representations should be safer.
Update: building on the subquery solution, one option would be a simple function:
CREATE OR REPLACE FUNCTION lower(text[]) RETURNS text[] LANGUAGE SQL IMMUTABLE AS
$$
SELECT array_agg(lower(value)) FROM unnest($1) value;
$$;
Then you could do:
SELECT value FROM table WHERE 'foo' = ANY(lower(value));
This might actually be the best approach. You could also create GIN indexes on the output of the function if you want.
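A rough sketch of the GIN index idea follows. The table name items and the function name lower_array are assumptions made for illustration (the helper is renamed here so it does not overload the built-in lower()). Note that GIN indexes on arrays serve containment operators such as @>, so the query is phrased with @> rather than = ANY(...):

```sql
-- Hypothetical table; "value" mirrors the column name in the question.
CREATE TABLE items (
    id    serial PRIMARY KEY,
    value text[]
);

-- Same logic as the helper above, under a different (made-up) name.
-- It must be IMMUTABLE to be usable in an index expression.
CREATE OR REPLACE FUNCTION lower_array(text[]) RETURNS text[]
LANGUAGE sql IMMUTABLE AS
$$
    SELECT array_agg(lower(v)) FROM unnest($1) v;
$$;

-- Expression index on the lower-cased array.
CREATE INDEX items_value_lower_gin ON items USING gin (lower_array(value));

-- Containment test that the index can support; 'foo' must be lower case.
SELECT value FROM items WHERE lower_array(value) @> ARRAY['foo'];
```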

Another alternative would be with unnest()
WITH tbl AS (SELECT 1 AS id, '{"Foo","bar","bAz"}'::text[] AS value)
SELECT value
FROM (SELECT id, value, unnest(value) AS val FROM tbl) x
WHERE lower(val) = 'foo'
GROUP BY id, value;
I added an id column to get exactly identical results - i.e. duplicate value if there are duplicates in the base table. Depending on your circumstances, you can probably omit the id from the query to collapse duplicates in the results or if there are no dupes to begin with. Also demonstrating a syntax alternative:
SELECT value
FROM (SELECT value, lower(unnest(value)) AS val FROM tbl) x
WHERE val = 'foo'
GROUP BY value;
If array elements are unique within arrays in lower case, you don't even need the GROUP BY, since every value can only match once.
SELECT value
FROM (SELECT value, lower(unnest(value)) AS val FROM tbl) x
WHERE val = 'foo';
'foo' must be lower case, obviously.
Should be fast.
If you want that to be fast with a big table, I would create a functional GIN index, though.

my solution to exclude values using a sub select...
and groupname not ilike all (
select unnest(array[exceptionname||'%'])
from public.group_exceptions
where ...
and ...
)

Regular expression may do the job for most cases
SELECT array_to_string('{"a","b","c"}'::text[],'|') ~* ANY('{"A","B","C"}');

I find creating a custom PostgreSQL function works best for me
CREATE OR REPLACE FUNCTION lower(text_array text[]) RETURNS text[] AS
$BODY$
SELECT (lower(text_array::text))::text[]
$BODY$
LANGUAGE SQL IMMUTABLE;

Related

How put value in jsonb_set postgreSQL [duplicate]

How do I declare a variable for use in a PostgreSQL 8.3 query?
In MS SQL Server I can do this:
DECLARE @myvar INT
SET @myvar = 5
SELECT *
FROM somewhere
WHERE something = @myvar
How do I do the same in PostgreSQL? According to the documentation variables are declared simply as "name type;", but this gives me a syntax error:
myvar INTEGER;
Could someone give me an example of the correct syntax?
I accomplished the same goal by using a WITH clause, it's nowhere near as elegant but can do the same thing. Though for this example it's really overkill. I also don't particularly recommend this.
WITH myconstants (var1, var2) as (
values (5, 'foo')
)
SELECT *
FROM somewhere, myconstants
WHERE something = var1
OR something_else = var2;
There is no such feature in PostgreSQL. You can do it only in pl/PgSQL (or other pl/*), but not in plain SQL.
An exception is WITH () query which can work as a variable, or even tuple of variables. It allows you to return a table of temporary values.
WITH master_user AS (
SELECT
login,
registration_date
FROM users
WHERE ...
)
SELECT *
FROM users
WHERE master_login = (SELECT login
FROM master_user)
AND (SELECT registration_date
FROM master_user) > ...;
You could also try this in PLPGSQL:
DO $$
DECLARE myvar integer;
BEGIN
SELECT 5 INTO myvar;
DROP TABLE IF EXISTS tmp_table;
CREATE TABLE tmp_table AS
SELECT * FROM yourtable WHERE id = myvar;
END $$;
SELECT * FROM tmp_table;
The above requires Postgres 9.0 or later.
Dynamic Config Settings
you can "abuse" dynamic config settings for this:
-- choose some prefix that is unlikely to be used by postgres
set session my.vars.id = '1';
select *
from person
where id = current_setting('my.vars.id')::int;
Config settings are always varchar values, so you need to cast them to the correct data type when using them. This works with any SQL client, whereas \set only works in psql.
The above requires Postgres 9.2 or later.
For previous versions, the variable had to be declared in postgresql.conf prior to being used, so it limited its usability somewhat. Actually not the variable completely, but the config "class" which is essentially the prefix. But once the prefix was defined, any variable could be used without changing postgresql.conf
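The same setting can also be written with the built-in set_config() function, whose third argument limits it to the current transaction. A small sketch, reusing the my.vars.id name and the person table from the example above:

```sql
BEGIN;

-- Third argument true = transaction-local: the value is gone after COMMIT or ROLLBACK.
SELECT set_config('my.vars.id', '1', true);

SELECT *
FROM person
WHERE id = current_setting('my.vars.id')::int;

COMMIT;
```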
It depends on your client.
However, if you're using the psql client, then you can use the following:
my_db=> \set myvar 5
my_db=> SELECT :myvar + 1 AS my_var_plus_1;
my_var_plus_1
---------------
6
If you are using text variables you need to quote.
\set myvar 'sometextvalue'
select * from sometable where name = :'myvar';
This solution is based on the one proposed by fei0x but it has the advantages that there is no need to join the value list of constants in the query and constants can be easily listed at the start of the query. It also works in recursive queries.
Basically, every constant is a single-value table declared in a WITH clause which can then be called anywhere in the remaining part of the query.
Basic example with two constants:
WITH
constant_1_str AS (VALUES ('Hello World')),
constant_2_int AS (VALUES (100))
SELECT *
FROM some_table
WHERE table_column = (table constant_1_str)
LIMIT (table constant_2_int)
Alternatively, you can use SELECT * FROM constant_name instead of TABLE constant_name; the TABLE shorthand might not be valid in SQL dialects other than PostgreSQL's.
Using a Temp Table outside of pl/PgSQL
Outside of using pl/pgsql or other pl/* language as suggested, this is the only other possibility I could think of.
begin;
select 5::int as var into temp table myvar;
select *
from somewhere s, myvar v
where s.something = v.var;
commit;
I want to propose an improvement to @DarioBarrionuevo's answer that makes it simpler by leveraging temporary tables.
DO $$
DECLARE myvar integer = 5;
BEGIN
CREATE TEMP TABLE tmp_table ON COMMIT DROP AS
-- put here your query with variables:
SELECT *
FROM yourtable
WHERE id = myvar;
END $$;
SELECT * FROM tmp_table;
True, there is no vivid and unambiguous way to declare a single-value variable, what you can do is
with myVar as (select 'any value really')
then, to get access to the value stored in this construction, you do
(select * from myVar)
for example
with var as (select 123)
... where id = (select * from var)
You may also resort to tool-specific features, such as DBeaver's own proprietary syntax:
@set name = 'me'
SELECT :name;
SELECT ${name};
DELETE FROM book b
WHERE b.author_id IN (SELECT a.id FROM author AS a WHERE a.name = :name);
As you will have gathered from the other answers, PostgreSQL doesn’t have this mechanism in straight SQL, though you can now use an anonymous block. However, you can do something similar with a Common Table Expression (CTE):
WITH vars AS (
SELECT 5 AS myvar
)
SELECT *
FROM somewhere,vars
WHERE something = vars.myvar;
You can, of course, have as many variables as you like, and they can also be derived. For example:
WITH vars AS (
SELECT
'1980-01-01'::date AS start,
'1999-12-31'::date AS end,
(SELECT avg(height) FROM customers) AS avg_height
)
SELECT *
FROM customers,vars
WHERE (dob BETWEEN vars.start AND vars.end) AND height<vars.avg_height;
The process is:
Generate a one-row cte using SELECT without a table (in Oracle you will need to include FROM DUAL).
CROSS JOIN the cte with the other table. Although there is a CROSS JOIN syntax, the older comma syntax is slightly more readable.
Note that I have cast the dates to avoid possible issues in the SELECT clause. I used PostgreSQL’s shorter syntax, but you could have used the more formal CAST('1980-01-01' AS date) for cross-dialect compatibility.
Normally, you want to avoid cross joins, but since you’re only cross joining a single row, this has the effect of simply widening the table with the variable data.
In many cases, you don’t need to include the vars. prefix if the names don’t clash with the names in the other table. I include it here to make the point clear.
Also, you can go on to add more CTEs.
This also works in all current versions of MSSQL and MySQL, which do support variables, as well as SQLite which doesn’t, and Oracle which sort of does and sort of doesn’t.
Here is an example using PREPARE statements. You still can't use ?, but you can use $n notation:
PREPARE foo(integer) AS
SELECT *
FROM somewhere
WHERE something = $1;
EXECUTE foo(5);
DEALLOCATE foo;
In DBeaver you can use parameters in queries just like you can from code, so this will work:
SELECT *
FROM somewhere
WHERE something = :myvar
When you run the query DBeaver will ask you for the value for :myvar and run the query.
Here is a code segment using a plain variable in the postgres terminal (psql). I have used it a few times, but there may be a better way. Here I am working with a string variable; with an integer variable you don't need the triple quotes. The triple quotes become single quotes at query time; otherwise you get a syntax error. There might be a way to eliminate the triple quotes when working with string variables. Please update if you find a way to improve this.
\set strainname '''B.1.1.7'''
select *
from covid19strain
where name = :strainname ;
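One way to avoid the triple quotes is psql's :'name' form, which inserts the variable's value as a quoted SQL literal. A short sketch using the same variable and table as above:

```sql
\set strainname 'B.1.1.7'
-- :'strainname' expands to the value wrapped in single quotes, with proper escaping
select *
from covid19strain
where name = :'strainname';
```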
In psql, you can use these 'variables' as macros. Note that they get "evaluated" every time they are used, rather than at the time that they are "set".
Simple example:
\set my_random '(SELECT random())'
select :my_random; -- gives 0.23330629315990592
select :my_random; -- gives 0.67458399344433542
this gives two different answers each time.
However, you can still use these as a valuable shorthand to avoid repeating lots of subselects.
\set the_id '(SELECT id FROM table_1 WHERE name = ''xxx'' LIMIT 1)'
and then use it in your queries later as
:the_id
e.g.
INSERT INTO table2 (table1_id,x,y,z) VALUES (:the_id, 1,2,3)
Note that you have to double the single quotes around strings inside the variable, because the whole thing is then string-interpolated (i.e. macro-expanded) into your query.

Why doesn't this to_date work, when the results have been filtered to match my date format (Oracle SQL)

I have a table 'A' with one column (VARCHAR2). The table contains a row containing the text '01/01/2021' and another row with the text 'A'.
When I try to filter out 'A' and then to_date the remaining value, I get 'ORA-01858: a non-numeric character was found where a numeric was expected'. I've tried this in 2 ways.
select *
from tbl
where col <> 'A'
and to_Date(col,'DD/MM/YYYY') = to_date('01/01/2020','DD/MM/YYYY');
select *
from ( select *
from tbl
where col <> 'A')
where to_Date(col,'DD/MM/YYYY') = to_date('01/01/2020','DD/MM/YYYY');
I can understand why the first might not work, but in the second example, the to_date should ONLY ever see filtered data (i.e. '01/01/2020').
When I delete the value of 'A', the statement runs and I get my result back so it seems conclusive that the reason it isn't running is because it's trying to to_date the value of 'A', even though that should have been filtered out by then.
I have been able to replicate this using actual Oracle tables but unfortunately when I try and reproduce the tables using WITH AS, the query works and no error is encountered - another mystery!
Why doesn't this query work? The order of operation seems to be satisfied (and it works if I use WITH AS).
Oracle (and other databases) are under no obligation to evaluate the predicate applied to an inline view before evaluating the outer predicate. Frequently, in fact, from a performance optimization standpoint, you want the optimizer to push a selective predicate from an outer query into a view, inline view, or subquery. In this case, whether the query throws an error will depend on the query plan the optimizer chooses and which predicate it actually evaluates first.
As a quick hack, you can change the inline view to prevent predicates from being pushed. In this case, the presence of a rownum stops the optimizer from pushing the predicate. You could also use hints like no_push_pred to try to force the optimizer to use the plan you want.
select *
from ( select t.*, rownum rn
from tbl t
where col <> 'A')
where to_Date(col,'DD/MM/YYYY') = to_date('01/01/2020','DD/MM/YYYY');
The issue with either of these quick hacks, though, is that some future version of the optimizer might have more options than you are aware of today so you may have problems in the future.
A better option is to rewrite the query so that you don't care what order the predicates are evaluated in. In this case (depending on Oracle version; the clause exists from 12.2 onward), that's pretty easy, since to_date lets you specify a value to return when there is a conversion error:
select *
from tbl
where col <> 'A'
and to_Date(col default null on conversion error,'DD/MM/YYYY') =
to_date('01/01/2020','DD/MM/YYYY');
If you're on an earlier version of Oracle or to_date is just an example of the actual problem, you can create a custom function that does the same thing.
create function safe_to_date( p_str in varchar2, p_fmt in varchar2 )
return date
is
begin
return to_date( p_str, p_fmt );
exception
-- to_date raises errors such as ORA-01858, which value_error (ORA-06502) does not cover
when others
then
return null;
end safe_to_date;
select *
from tbl
where col <> 'A'
and safe_to_date(col,'DD/MM/YYYY') = to_date('01/01/2020','DD/MM/YYYY');
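A further way to make the query independent of predicate order is to put the guard and the conversion inside one CASE expression. This variant is not part of the answer above; it is a sketch that relies on Oracle's documented short-circuit evaluation of CASE, using the same tbl and col as the question:

```sql
select *
from tbl
where case
        -- to_date is only evaluated for rows where the guard holds
        when col <> 'A' then to_date(col, 'DD/MM/YYYY')
      end = to_date('01/01/2020', 'DD/MM/YYYY');
```

For the row containing 'A' the CASE yields NULL, so the comparison is not true and the row is simply excluded.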

PostgreSQL get last value in a comma separated list of values

In a PostgreSQL table I have a column which has values like
AX,B,C
A,BD
X,Y
J,K,L,M,N
In short , it will have a few comma separated strings in the column for each record. I wanted to get the last one in each record. I ended up with this.
select id, reverse(substr(reverse(mycolumn),1,position(',' in reverse(mycolumn)))) from mytable order by id ;
Is there an easier way?
I would do it this way:
select reverse(split_part(reverse(myColumn), ',', 1))
With regexp_replace:
select id, regexp_replace(mycolumn, '.*,', '')
from mytable
order by id;
Is there an easier way?
With your current data, Gordon's answer works best imo. Other options would be a regex (messy), or converting the column to a text[] array e.g. ('{' || col || '}')::text[] or variations thereof.
If you were using a text[] array instead of plain text for your column, you'd want to use array functions directly:
select col[array_length(col, 1)]
http://www.postgresql.org/docs/current/static/functions-array.html
Example with dummy data:
with bar as (
select '{a,b,c}'::text[] as foo
)
select foo[array_length(foo, 1)] from bar;
You could, of course, also create a parse_csv() function or get_last_csv_value() function to avoid writing the above.
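Such a helper might look like the following. It is only a sketch: the function body is an assumption about what get_last_csv_value() could contain, built on string_to_array() and array_length():

```sql
-- Hypothetical helper: returns the last comma-separated item of its argument.
CREATE OR REPLACE FUNCTION get_last_csv_value(text) RETURNS text
LANGUAGE sql IMMUTABLE STRICT AS
$$
    SELECT (string_to_array($1, ','))[array_length(string_to_array($1, ','), 1)];
$$;

SELECT get_last_csv_value('J,K,L,M,N');  -- N
```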

Loop over a PostgreSQL array within a SELECT query, instead of within a PLPGSQL function

I have a text[] ARRAY column within a PostgreSQL table and I need to run char_length() on each element inside the array within a SELECT query (normal SQL, not plpgsql) so that if any of the elements is more than 25 characters long, the SELECT returns 't' and 'f' otherwise. I know I can loop over the text[] within a custom plpgsql function but due to other reasons I need to find a way to do this in SQL directly.
Is it possible?
As of PostgreSQL 8.4, you can use the UNNEST function:
SELECT MAX((char_length(string) > 25)::INT)::BOOLEAN
FROM (
SELECT my_array,UNNEST(my_array) AS string
FROM my_table
) AS x
GROUP BY my_array;
You could use unnest to break open the array and then some simple length and exists stuff:
select exists(
select 1
from (
select unnest(ar) as x
from table_name
) as t
where length(x) > 25
)
The exists and select 1 business is just a convenient way to collapse result set to a single boolean (I'm sure there are other ways).
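If one boolean per row is wanted rather than a single value for the whole table, the exists can be correlated with each row's array. A sketch using the same table_name and ar names as above; the has_long_element alias is made up:

```sql
select t.ar,
       exists (
           select 1
           from unnest(t.ar) as x        -- one row per element of this row's array
           where char_length(x) > 25
       ) as has_long_element
from table_name as t;
```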

PostgreSQL: Why can't subqueries as expressions return more than one row, but functions can?

If I try to create a column whose value is a select returning more than one row, I get an error.
=> select (select 1 union select 2);
ERROR: more than one row returned by a subquery used as an expression
But if I create a function that does the same thing, I get the behavior I want.
=> create or replace function onetwo() returns setof integer as $$
$> select 1 union select 2
$> $$ language 'sql' strict immutable;
CREATE FUNCTION
=> select onetwo();
onetwo
--------
1
2
Why the difference?
It's not an apples to apples comparison.
select *
FROM (select 1
union ALL
select 2) AS t
...is equivalent to your function.
A column in the SELECT clause can only return a single value per record. Which is impossible with your UNION example. So I converted it into a derived table/inline view, which is what is happening with the function example.
While OMG Ponies answer is entirely correct I'd rather put it like this: You're confusing SELECT f() with SELECT literal.
SELECT f() executes a function and returns its result. A table-returning function can also be written as SELECT * FROM f(), which is even more elegant. Because older versions of Pg don't have stored procedures (they were added in PostgreSQL 11; apart from scheduling, their job can be done through functions), we use SELECT the way Microsoft SQL Server uses EXECUTE.
SELECT LITERAL is a method of returning a literal (something that can fit in a row/column).
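To make the difference concrete, here is a short sketch using the onetwo() function from the question. The first query treats the function as a table; the second uses an ARRAY constructor, one way to fit several rows into a single column value:

```sql
-- Set-returning function used as a table source.
SELECT * FROM onetwo();

-- A multi-row subquery collapsed into one array value, which is valid as an expression.
SELECT ARRAY(SELECT 1 UNION SELECT 2) AS both_values;
```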