Find position(s) in array matching a given sub-array - sql

Given this table:
CREATE TABLE datasets.travel(path integer[], path_timediff double precision[]);
INSERT INTO datasets.travel
VALUES (array[50,49,49,49,49,50], array[NULL,438,12,496,17,435]);
I am looking for a function or query in PostgreSQL that, for a given input array[49,50], finds the matching consecutive index values in path, which are [5,6], and the corresponding element in path_timediff, which is 435 in the example (array index 6).
My ultimate purpose is to find all such occurrences of [49,50] in path and all the corresponding elements in path_timediff. How can I do that?

Assuming you have a primary key in your table you did not show:
CREATE TABLE datasets.travel (
travel_id serial PRIMARY KEY
, path integer[]
, path_timediff float8[]
);
Here is one way with generate_subscripts() in a LATERAL join:
SELECT t.travel_id, i+1 AS position, path_timediff[i+1] AS timediff
FROM (SELECT * FROM datasets.travel WHERE path @> ARRAY[49,50]) t
, generate_subscripts(t.path, 1) i
WHERE path[i:i+1] = ARRAY[49,50];
This finds all matches, not just the first.
i+1 works for a sub-array of length 2. Generalize with i + array_length(sub_array, 1) - 1.
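As a minimal sketch of that generalization, the query below keeps the search term in a CTE so its length is computed rather than hard-coded. The CTE name s and its column sub are hypothetical names introduced for illustration; the rest follows the table definition above.

```sql
-- Hypothetical search term; any integer[] sub-array can go here.
WITH s(sub) AS (SELECT ARRAY[49,50])
SELECT t.travel_id
     , i + array_length(s.sub, 1) - 1 AS position   -- index of the last matched element
     , t.path_timediff[i + array_length(s.sub, 1) - 1] AS timediff
FROM   s
JOIN   datasets.travel t ON t.path @> s.sub          -- containment pre-filter
CROSS  JOIN LATERAL generate_subscripts(t.path, 1) i
WHERE  t.path[i : i + array_length(s.sub, 1) - 1] = s.sub;
```

Near the end of path the slice is shorter than the search term, so it cannot compare equal and no out-of-range check is needed.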
The subquery is not strictly necessary, but can use a GIN index on (path) for a fast pre-selection:
(SELECT * FROM datasets.travel WHERE path @> ARRAY[49,50])
Related:
How to access array internal index with postgreSQL?
Parallel unnest() and sort order in PostgreSQL
PostgreSQL unnest() with element number

Related

How to filter a JSON array containing any of multiple elements

This is the sample table:
create table leadtime.test (
id serial primary key,
name jsonb
)
Data test:
insert into leadtime.test (name)
values ('["abc", "def", "ghi"]');
I want to check if name contains any value in this array '["abc", "132", "456"]'
I have to do this code:
select * from leadtime.test
where (name ? 'abc') or (name ? '132') or (name ? '456');
I was told that multiple OR'ed filters are not optimal for performance.
Is there a better way?
Pass your array of search terms as an actual Postgres array and use the ?| operator:
SELECT *
FROM test
WHERE name ?| '{abc, def, ghi}';
The manual:
jsonb ?| text[] → boolean
Do any of the strings in the text array exist as top-level keys or
array elements?
Can be supported with a plain GIN index on (name), too.
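A sketch of what that could look like for the table in the question. The index name test_name_gin_idx is a made-up example, and the second query assumes the search terms arrive as a JSON array and need converting to text[] first.

```sql
-- Plain GIN index (default jsonb operator class); the index name is hypothetical.
CREATE INDEX test_name_gin_idx ON leadtime.test USING gin (name);

-- Search terms from the question, passed as a text array.
SELECT *
FROM   leadtime.test
WHERE  name ?| ARRAY['abc', '132', '456'];

-- If the terms arrive as a JSON array, convert them to text[] first.
SELECT *
FROM   leadtime.test
WHERE  name ?| ARRAY(SELECT jsonb_array_elements_text('["abc", "132", "456"]'::jsonb));
```

Whether the planner actually uses the index depends on table size and statistics; EXPLAIN shows the chosen plan.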

Filter an SQL Array text[] for matching value containing a parameter

I have a table with a TEXT[] column. I want to return all rows where at least one array value contains my parameter.
Right now I'm doing WHERE array_to_string(arr, ',') ilike '%myString%'
But I feel there must be a better-optimized way to do that kind of search.
I would also like to search for values beginning or ending with my parameter.
CREATE TABLE IF NOT EXISTS my_table
(
id BIGSERIAL,
col_array TEXT[],
CONSTRAINT my_table_pkey PRIMARY KEY (id)
);
insert into my_table(col_array)
VALUES ('{ABC,DEF}'),
('{FGH,IJK}'),
('{LMN}'),
('{OPQ}');
select * from my_table where ARRAY_TO_STRING(col_array, ',') ilike '%F%';
This works, as it returns only the first 2 rows.
You can find a sqlfiddle here: http://sqlfiddle.com/#!17/09632/7
I would use a sub-query:
select t.*
from my_table t
where exists (select *
from unnest(t.col_array) as x(e)
where x.e ilike '%F%')
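The same pattern should cover the "beginning or ending with" part of the question by moving the wildcard. A sketch against the sample data from the question:

```sql
-- Elements beginning with the parameter (matches FGH in the sample data).
select t.*
from my_table t
where exists (select *
              from unnest(t.col_array) as x(e)
              where x.e ilike 'F%');

-- Elements ending with the parameter (matches DEF in the sample data).
select t.*
from my_table t
where exists (select *
              from unnest(t.col_array) as x(e)
              where x.e ilike '%F');
```

Unlike the array_to_string() approach, the pattern is applied to each element separately, so a match cannot span the separator between two elements.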
You might want to re-consider your decision to de-normalize your model.
Quote from the manual
Arrays are not sets; searching for specific array elements can be a sign of database misdesign. Consider using a separate table with a row for each item that would be an array element. This will be easier to search, and is likely to scale better for a large number of elements.
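One way to read that advice, as a rough sketch only: the table my_table_item and its columns are hypothetical names, not part of the question.

```sql
-- Hypothetical normalized layout: one row per former array element.
CREATE TABLE my_table_item (
  my_table_id bigint NOT NULL REFERENCES my_table (id),
  item        text   NOT NULL
);

-- Same search as before, now against ordinary rows.
SELECT t.*
FROM   my_table t
WHERE  EXISTS (SELECT *
               FROM   my_table_item i
               WHERE  i.my_table_id = t.id
               AND    i.item ILIKE '%F%');
```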

Remove one, non-unique value from a 2d array

To expand on my answered question here:
Remove one, non-unique value from an array
Given this table in PostgreSQL 9.6:
CREATE TABLE test_table (
id int PRIMARY KEY
, test_array text[][]
);
With a row like:
INSERT INTO test_table (id, test_array)
VALUES (1 , '{ {A,AA},{A,AB},{B,AA},{B,AB} }');
How would I remove an index from test_array:
a) matching the [0] value,
b) matching both the [0] and [1] values.
I am getting an exception when using array_position:
searching for elements in multidimensional arrays is not supported
Also, how would an update query be constructed based on this matching?
I'm not sure that I can build a query as done in a 1d array.
Any help is appreciated.
Decided to normalize instead (in this instance, breaking the arrays into two tables with reference keys), per a_horse_with_no_name's recommendation.

postgreSQL hstore if contains value

Is there a way to check whether a value already exists in the hstore in the query itself?
I have to store various values per row (each row is an "item").
I need to be able to check if the id already exists in database in one of the hstore rows without selecting everything first and doing loops etc in php.
hstore seems to be the only data type that offers something like that and also allows you to select the column for that row into an array.
Hstore may not be the best data type to store data like that but there isn't anything else better available.
The whole project uses 9.2 and I cannot change that - json is in 9.3.
The exist() function tests for the existence of a key. To determine whether the key '42' exists anywhere in the hstore . . .
select *
from (select test_id, exist(test_hs, '42') key_exists
from test) x
where key_exists = true;
 test_id | key_exists
---------+------------
       2 | t
The svals() function returns values as a set. You can query the result to determine whether a particular value exists.
select *
from (select test_id, svals(test_hs) vals
from test) x
where vals = 'Wibble';
hstore Operators and Functions
create table test (
test_id serial primary key,
test_hs hstore not null
);
insert into test (test_hs) values (hstore('a', 'b'));
insert into test (test_hs) values (hstore('42', 'Wibble'));
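For comparison, a shorter sketch of both checks against the same test table, using the hstore ? operator for keys and avals() (values as a text array) with ANY for values. This is an alternative formulation, not the method shown above.

```sql
-- Rows whose hstore contains the key '42'.
SELECT test_id
FROM   test
WHERE  test_hs ? '42';

-- Rows whose hstore contains the value 'Wibble' under any key.
SELECT test_id
FROM   test
WHERE  'Wibble' = ANY (avals(test_hs));
```

With the two sample rows inserted above, both queries return test_id 2.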

Unique constraint for permutations across multiple columns

Given the following three columns in a Postgres database: first, second, third; how can I create a constraint such that permutations are unique?
E.g. If ('foo', 'bar', 'shiz') exist in the db, ('bar', 'shiz', 'foo') would be excluded as non-unique.
You could use hstore to create the unique index:
CREATE UNIQUE INDEX hidx ON test USING BTREE (hstore(ARRAY[a,b,c], ARRAY[a,b,c]));
Fiddle
UPDATE
Actually
CREATE UNIQUE INDEX hidx ON test USING BTREE (hstore(ARRAY[a,b,c], ARRAY[null,null,null]));
might be a better idea since it will work the same but should take less space (fiddle).
For only three columns this unique index using only basic expressions should perform very well. No additional modules like hstore or custom function needed:
CREATE UNIQUE INDEX t_abc_uni_idx ON t (
LEAST(a,b,c)
, GREATEST(LEAST(a,b), LEAST(b,c), LEAST(a,c))
, GREATEST(a,b,c)
);
SQL fiddle
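A quick way to check the behavior, assuming a table t with text columns a, b, c and the index t_abc_uni_idx from above already in place (the column type is an assumption; the question does not state it):

```sql
-- First combination is accepted.
INSERT INTO t (a, b, c) VALUES ('foo', 'bar', 'shiz');

-- Any permutation maps to the same (least, middle, greatest) triple,
-- so this insert raises a unique violation.
INSERT INTO t (a, b, c) VALUES ('bar', 'shiz', 'foo');
```

LEAST and GREATEST ignore NULL arguments, so this sketch assumes all three columns are NOT NULL.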
Also needs the least disk space:
SELECT pg_column_size(row(hstore(t))) AS hst_row
,pg_column_size(row(hstore(ARRAY[a,b,c], ARRAY[a,b,c]))) AS hst1
,pg_column_size(row(hstore(ARRAY[a,b,c], ARRAY[null,null,null]))) AS hst2
,pg_column_size(row(ARRAY[a,b,c])) AS arr
,pg_column_size(row(LEAST(a,b,c)
, GREATEST(LEAST(a,b), LEAST(b,c), LEAST(a,c))
, GREATEST(a,b,c))) AS columns
FROM t;
 hst_row | hst1 | hst2 | arr | columns
---------+------+------+-----+---------
      59 |   59 |   56 |  69 |      30
Numbers are bytes per index row for the example in the fiddle, measured with pg_column_size(). My example uses only single characters; the difference in size is constant.
You can do this by creating a unique index on a function which returns a sorted array of the values in the columns:
CREATE OR REPLACE FUNCTION sorted_array(anyarray)
RETURNS anyarray
AS $BODY$
-- Sort the elements of the input array; no table access, so IMMUTABLE is valid.
SELECT array_agg(x ORDER BY x) FROM unnest($1) AS x;
$BODY$
LANGUAGE sql IMMUTABLE;
CREATE UNIQUE index ON test (sorted_array(array[first,second,third]));
Suggestion from co-worker, variation of @julien's idea:
Sort the terms alphabetically and place a delimiter on either side of each term. Concatenate them and place them in a separate field that becomes the primary key.
Why the delimiter? So that, "a", "aa", "aaa" and "aa", "aa", "aa" can both be inserted.
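A small sketch of the collision the delimiter avoids. The '|' delimiter is an arbitrary choice and is assumed never to occur inside a term.

```sql
SELECT concat('a', 'aa', 'aaa') = concat('aa', 'aa', 'aa')
         AS collide_without_delimiter   -- both sides are 'aaaaaa'
     , concat_ws('|', 'a', 'aa', 'aaa') = concat_ws('|', 'aa', 'aa', 'aa')
         AS collide_with_delimiter;     -- 'a|aa|aaa' vs 'aa|aa|aa'
```

The first column is true and the second is false, which is why the plain concatenation cannot serve as a key on its own.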