I have a table with two columns, R(id int, dat jsonb). The jsonb column dat contains a 2D array ([][]) under the "numbers" key. For example:
id| dat
1 | {"name":"a","numbers":[[1,2],[3,4],[5,6],[1,3]]}
2 | {"age":5,"numbers":[[1,1]]}
3 | {"numbers":[[5,6],[6,7]]}
I'm trying to find all the ids that contain a specific number in one of those sub-arrays. I tried two solutions, and I want to understand why the first one isn't working:
1)
select * from R
where exists (
select from jsonb_array_elements(R.dat->'numbers')->>0 first, jsonb_array_elements(R.dat->'numbers')->>1 second where first::decimal = 1 and second::decimal = 1
);
ERROR: syntax error at or near "->>"
LINE 3: ...t from jsonb_array_elements(R.dat->'numbers')->>0 first,j...
2)
SELECT *
FROM R
WHERE EXISTS (
SELECT FROM jsonb_array_elements(R.dat-> 'numbers') subarray
WHERE (subarray->>0)::decimal = 1 and (subarray->>1)::decimal = 1
);
In addition, I saw that a GIN index doesn't handle this operator, so will any index help here?
Your first query raises an error because only table expressions (not value expressions) are allowed in the FROM clause.
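If you really want the two extracted values as named columns, one way to stay within that rule is to wrap the value expressions in a LATERAL subquery - a sketch (the aliases e, p, first and second are arbitrary):
select *
from r
where exists (
    select
    from jsonb_array_elements(r.dat -> 'numbers') as e(subarray),
         lateral (select e.subarray ->> 0 as first,
                         e.subarray ->> 1 as second) as p
    where p.first::decimal = 1
      and p.second::decimal = 1
);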
You can make the second query a bit simpler:
select *
from r
where exists (
select from jsonb_array_elements(dat->'numbers') subarray
where subarray = '[1,1]'
);
or using the function in a lateral join (value is the default column name exposed by jsonb_array_elements()):
select r.*
from r
cross join jsonb_array_elements(dat->'numbers')
where value = '[1,1]';
There is no index that could support these queries because of the use of jsonb_array_elements().
You may be tempted to use the containment operator @> like this:
select *
from r
where dat->'numbers' @> '[[1,1]]'::jsonb
id | dat
----+------------------------------------------------------------
1 | {"name": "a", "numbers": [[1, 2], [3, 4], [5, 6], [1, 3]]}
2 | {"age": 5, "numbers": [[1, 1]]}
(2 rows)
Unfortunately, as you can see, it does not work as you might expect. The use of the operator on arrays is a bit tricky, as it works recursively: array1 @> array2 is true if for each element j of array2 there is an element i in array1 such that i @> j. Hence, per the documentation:
the order of array elements is not significant when doing a containment match, and duplicate array elements are effectively considered only once.
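You can see this recursive matching in isolation:
select '[1,2]'::jsonb @> '[1,1]'::jsonb;           -- true: every element of [1,1] occurs in [1,2]
select '[[1,2],[3,4]]'::jsonb @> '[[1,1]]'::jsonb; -- true, which is why row 1 matches above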
I have a table with one column duration containing integer values, and I'm trying to build another column, using a SQL query, that would contain a list of integers between 1 and the value in the duration column.
For example:
duration | range
3 | [1, 2, 3]
3 | [1, 2, 3]
2 | [1, 2]
1 | [1]
...
I found a potential solution using a JavaScript UDF:
create or replace function list_range(DURATION double)
returns VARCHAR
language javascript
strict
as 'return [...Array(DURATION).keys()].map(x => x + 1);';
SELECT
t.*,
list_range(t.duration) as range
FROM my_table t
What do you think of this solution? Can it be optimized?
with temp as (
select distinct duration, level as l
from duration
connect by level <= duration)
select duration,
'['||listagg(l, ', ') within group(order by l)||']' as range
from temp
group by duration
order by 1 desc;
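If you happen to be on PostgreSQL, a sketch of the same idea with generate_series() and array_agg() (my_table is the placeholder table name from the question):
select t.duration,
       (select array_agg(g order by g)
        from generate_series(1, t.duration) as g) as range
from my_table t;
This returns an int[] such as {1,2,3}; wrap it in array_to_string() if you need the bracketed text form.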
I'm trying to implement the EAV pattern using attribute->value tables, but unlike the standard way, the values are stored in a jsonb field like {"attrId":[values]}. This makes search requests easy, like:
SELECT * FROM products p WHERE p.attributes @> '{"1": [2]}' AND p.attributes @> '{"1": [4]}';
Now I'm wondering whether this is a good approach, and what an effective way would be to calculate the count of available variations, for example:
-p1- {"width":[1]}
-p2- {"width":[2],"height":[3]}
-p3- {"width":[1]}
The output would be:
width: 1 (count 2); 2 (count 1)
height: 3 (count 1)
and when width 2 is selected:
width: 1 (count 0); 2 (count 1)
height: 3 (count 1)
"Flat is better than nested" -- the zen of python
I think you would be better served to use simple key/value pairs and in the rare event you have a complex value, then make it a list. But I don't see that use case.
Here is an example which answers your question. It could be modified to use your structure, but let's keep it simple:
First create a table and insert some JSON:
# create table foo (a jsonb);
# insert into foo values ('{"a":"1", "b":"2"}');
# insert into foo values ('{"c":"3", "d":"4"}');
# insert into foo values ('{"e":"5", "a":"6"}');
Here are the records:
# select * from foo;
a
----------------------
{"a": "1", "b": "2"}
{"c": "3", "d": "4"}
{"a": "6", "e": "5"}
(3 rows)
Here is the output of the jsonb_each_text() function, from https://www.postgresql.org/docs/9.6/static/functions-json.html:
# select jsonb_each_text(a) from foo;
jsonb_each_text
-----------------
(a,1)
(b,2)
(c,3)
(d,4)
(a,6)
(e,5)
(6 rows)
Now we need to put it in a table expression to be able to get access to the individual fields:
# with t1 as (select jsonb_each_text(a) as rec from foo)
select (rec).key, (rec).value from t1;
key | value
-----+-------
a | 1
b | 2
c | 3
d | 4
a | 6
e | 5
(6 rows)
And lastly, here is a grouping with the SUM function. Notice that the a key, which was in the database twice, has been properly summed.
# with t1 as (select jsonb_each_text(a) as rec from foo)
select (rec).key, sum((rec).value::int) from t1 group by (rec).key;
key | sum
-----+-----
c | 3
b | 2
a | 7
e | 5
d | 4
(5 rows)
As a final note, (rec) has parentheses around it because otherwise it is incorrectly treated as a table reference and results in this error:
ERROR: missing FROM-clause entry for table "rec"
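For the counting the original question asks about, the same pattern works with count(*) grouped by both key and value - a sketch against the simplified foo table:
# with t1 as (select jsonb_each_text(a) as rec from foo)
select (rec).key, (rec).value, count(*)
from t1
group by (rec).key, (rec).value;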
I have a table called pins like this:
id (int) | pin_codes (jsonb)
--------------------------------
1 | [4000, 5000, 6000]
2 | [8500, 8400, 8600]
3 | [2700, 2300, 2980]
Now, I want the row with pin_code 8600 and with its array index. The output must be like this:
pin_codes | index
------------------------------
[8500, 8400, 8600] | 2
If I want the row with pin_code 2700, the output :
pin_codes | index
------------------------------
[2700, 2300, 2980] | 0
What I've tried so far:
SELECT pin_codes FROM pins WHERE pin_codes #> '[8600]'
It only returns the row with wanted value. I don't know how to get the index on the value in the pin_codes array!
Any help would be greatly appreciated.
P.S:
I'm using PostgreSQL 10
If you were storing the array as a real array, not as JSON, you could use array_position() to find the (first) index of a given element:
select array_position(array['one', 'two', 'three'], 'two')
returns 2
With some text mangling you can cast the JSON array into a text array:
select array_position(translate(pin_codes::text,'[]','{}')::text[], '8600')
from the_table;
Note that array_position() is 1-based, so subtract 1 if you need the 0-based index from the question.
This also allows you to use the ANY operator:
select *
from pins
where '8600' = any(translate(pin_codes::text,'[]','{}')::text[])
The contains operator @> expects arrays on both sides. You could use it to search for two pin codes at a time:
select *
from pins
where translate(pin_codes::text,'[]','{}')::text[] @> array['8600','8400']
Or use the overlaps operator && to find rows with any of multiple elements:
select *
from pins
where translate(pin_codes::text,'[]','{}')::text[] && array['8600','2700']
would return
id | pin_codes
---+-------------------
2 | [8500, 8400, 8600]
3 | [2700, 2300, 2980]
If you do that a lot, it would be better to store the pin_codes as text[] rather than JSON - then you can also index that column to make the searches more efficient.
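A sketch of that migration, assuming the table definition from the question (the index name is arbitrary):
alter table pins
    alter column pin_codes type text[]
    using translate(pin_codes::text, '[]', '{}')::text[];

create index pins_pin_codes_idx on pins using gin (pin_codes);
-- the @> and && queries above can then use this index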
Use the function jsonb_array_elements_text() with WITH ORDINALITY.
with my_table(id, pin_codes) as (
values
(1, '[4000, 5000, 6000]'::jsonb),
(2, '[8500, 8400, 8600]'),
(3, '[2700, 2300, 2980]')
)
select id, pin_codes, ordinality - 1 as index
from my_table, jsonb_array_elements_text(pin_codes) with ordinality
where value::int = 8600;
id | pin_codes | index
----+--------------------+-------
2 | [8500, 8400, 8600] | 2
(1 row)
As has been pointed out previously, the array_position function is only available in Postgres 9.5 and greater.
Here is a custom function that achieves the same, derived from nathansgreen at github.
-- The array_position function was added in Postgres 9.5.
-- For older versions, you can get the same behavior with this function.
create function array_position(arr ANYARRAY, elem ANYELEMENT, pos INTEGER default 1) returns INTEGER
language sql
as $BODY$
select row_number::INTEGER
from (
select unnest, row_number() over ()
from ( select unnest(arr) ) t0
) t1
where row_number >= greatest(1, pos)
and (case when elem is null then unnest is null else unnest = elem end)
limit 1;
$BODY$;
So in this specific case, after creating the function the following worked for me.
SELECT
pin_codes,
array_position(pin_codes, 8600) AS index
FROM pins
WHERE array_position(pin_codes, 8600) IS NOT NULL;
Worth bearing in mind that it will only return the index of the first occurrence of 8600; you can use the pos argument to pick whichever occurrence you like.
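For example, a hypothetical call that finds the occurrence after the first one:
SELECT array_position(pin_codes, 8600, array_position(pin_codes, 8600) + 1) AS second_index
FROM pins;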
In short, normalize your data structure, or don't do this in SQL. If you want the index of the sub-data element given your current data structure, then do it in your application code (take the result, cast it to a list/array, get the index).
Try to unnest the string and assign numbers as follows:
with dat as
(
select 1 id, '8700, 5600, 2300' pins
union all
select 2 id, '2300, 1700, 1000' pins
)
select dat.*, t.rn as index
from
(
select id, t.pins, row_number() over (partition by id) rn
from
(
select id, trim(unnest(string_to_array(pins, ','))) pins from dat
) t
) t
join dat on dat.id = t.id and t.pins = '2300'
Note that row_number() without an ORDER BY gives no guarantee that the numbering follows the original element order; WITH ORDINALITY (as in the previous answer) is deterministic.
If you insist on storing arrays, I'd defer to Klin's answer.
As the alternative answer and an extension to my comment... don't store SQL data in arrays. Normalize your data in advance and SQL will handle it significantly better. Klin's answer is good, but may suffer in performance, as it's outside of what SQL does best.
I'd break up the array prior to storing it. If the number of pin codes is known, then simply having the table pin_id, pin1, pin2, pin3, pin_etc... is functional.
If the number of pins is unknown, a first table pin that stores the pin_id and any info columns related to that pin ID, and then a second table with pin_id, pin_seq, pin_value is also functional (though you may need to pivot this later on to make sense of the data). In this case, select pin_seq where pin_value = 8600 would work - see the sketch below.
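A sketch of that second layout (all names hypothetical):
create table pin (
    pin_id serial primary key
    -- any info columns related to that pin ID go here
);

create table pin_code (
    pin_id    int references pin (pin_id),
    pin_seq   int,  -- position of the code within the original array
    pin_value int,
    primary key (pin_id, pin_seq)
);

-- the position of a code is then a plain column lookup:
select pin_id, pin_seq from pin_code where pin_value = 8600;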
I have a table which contains values in an array, something like this:
id | contents_id
1 | [1, 3, 5]
2 | [1, 2]
3 | [3, 4, 6]
4 | [2, 5]
How do I write a query for an array, e.g. [1, 2], such that it checks the values inside the array and not the array as a whole?
If any common value is found, get all those tuples.
If [1, 2] is queried, it must fetch ids 1, 2 and 4 from the above table, as they contain 1 or 2.
Consider using the intarray extension. It provides a && operator for testing whether integer arrays overlap. For example:
select id from test where ARRAY[1,2] && contents_id;
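If you go this route, the extension also ships GIN and GiST operator classes, so the overlap test can use an index - a sketch (the index name is arbitrary):
create extension if not exists intarray;
create index test_contents_idx on test using gin (contents_id gin__int_ops);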
Though you can query it with the operator, I think it would be better to make a junction table with integer IDs.
On 1-D int arrays the && operator (arrayoverlap) is the fastest, as @LaposhasúAcsa suggested,
so my answer stands only if array overlap is not available or you want to work with anything other than one-dimensional integer arrays.
Check UNNEST https://www.postgresql.org/docs/current/static/functions-array.html
CREATE TABLE t45407507 (
id SERIAL PRIMARY KEY
,c int[]
);
insert into t45407507 ( c) values
(ARRAY[1,3,5])
, (ARRAY[1,2])
, (ARRAY[3,4,6])
, (ARRAY[2,5]);
select DISTINCT id from
(SELECT id,unnest(c) as c
from t45407507) x
where x.c in (1,2);
This can be shortened with a LATERAL join:
select DISTINCT id from
t45407507 x,unnest(c) ec
where ec in (1,2);
The comma (,) in the FROM clause is short notation for CROSS JOIN.
LATERAL is assumed automatically for table functions like unnest().
Rewrite the WHERE clause to use an ARRAY as the parameter:
SELECT DISTINCT id FROM
t45407507 x,unnest(c) ec
WHERE ec = ANY(ARRAY[1,2]);
I have a table with an array column, like this:
my_table
id array
-- -----------
1 {1, 3, 4, 5}
2 {19,2, 4, 9}
3 {23,46, 87, 6}
4 {199,24, 93, 6}
And I want as a result the repeated values and which rows they appear in, like this:
value_repeated is_repeated_on
-------------- -----------
4 {1,2}
6 {3,4}
Is it possible? I don't know how to do this - I don't even know how to start! I'm lost!
Use unnest to convert the array to rows, and then array_agg to build an array from the ids.
It should look something like this:
SELECT v AS value_repeated, array_agg(id) AS is_repeated_on
FROM (SELECT id, unnest("array") AS v FROM my_table) t
GROUP BY v
HAVING count(DISTINCT id) > 1;
Note that HAVING count(DISTINCT id) > 1 filters out values that appear for only one id.
The clean way to call a set-returning function like unnest() is in a LATERAL join, available since Postgres 9.3:
SELECT value_repeated, array_agg(id) AS is_repeated_on
FROM my_table
, unnest(array_col) value_repeated
GROUP BY value_repeated
HAVING count(*) > 1
ORDER BY value_repeated; -- optional
About LATERAL:
Call a set-returning function with an array argument multiple times
There is nothing in your question to rule out duplicates (the same element more than once in the same array), as IMSoP commented, so it must be count(*), not count(DISTINCT id). For example, a row 5 | {4, 4} repeats 4 on its own; count(*) catches it, count(DISTINCT id) would not.