JSONB nested array query - check existence of attribute - sql

I want to check the existence of an attribute in a JSONB column using SQL.
Using this, I can check whether an attribute equals a value:
SELECT count(*) AS "count" FROM "table" WHERE column->'node' @> '[{"Attribute":"value"}]'
What syntax do I use to check for the existence of Attribute?

Usually you'll check for null:
SELECT count(*) AS "count" FROM "table"
WHERE column->'node'->'Attribute' is not null
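Note that this null check only helps when node is a JSON object. A small sketch with literal values (made up here, not taken from the asker's table) suggests why it does not fit the array case in the question: -> with a text key yields null when applied to an array.

```sql
-- node is an object: the key lookup succeeds
select '{"node": {"Attribute": "value"}}'::jsonb -> 'node' -> 'Attribute' is not null;    -- true

-- node is an array of objects: -> 'Attribute' on an array gives null
select '{"node": [{"Attribute": "value"}]}'::jsonb -> 'node' -> 'Attribute' is not null;  -- false
```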

The ? operator means "Does the string exist as a top-level key within the JSON value?" However, you want to check whether a key exists in a nested JSON array of objects, so you cannot use the operator directly. You have to unnest the array.
Sample data:
create table my_table(id serial primary key, json_column jsonb);
insert into my_table (json_column) values
('{"node": [{"Attribute":"value"}, {"other key": 0}]}'),
('{"node": [{"Attribute":"value", "other key": 0}]}'),
('{"node": [{"Not Attribute":"value"}]}');
Use jsonb_array_elements() in a lateral join to find out whether a key exists in any element of the array:
select
id,
value,
value ? 'Attribute' as key_exists_in_object
from my_table
cross join jsonb_array_elements(json_column->'node')
id | value | key_exists_in_object
----+----------------------------------------+----------------------
1 | {"Attribute": "value"} | t
1 | {"other key": 0} | f
2 | {"Attribute": "value", "other key": 0} | t
3 | {"Not Attribute": "value"} | f
(4 rows)
But this is not exactly what you want. You need to aggregate the results per array:
select
id,
json_column->'node' as array,
bool_or(value ? 'Attribute') as key_exists_in_array
from my_table
cross join jsonb_array_elements(json_column->'node')
group by id
order by id
id | array | key_exists_in_array
----+--------------------------------------------+---------------------
1 | [{"Attribute": "value"}, {"other key": 0}] | t
2 | [{"Attribute": "value", "other key": 0}] | t
3 | [{"Not Attribute": "value"}] | f
(3 rows)
Well, this looks a bit complex. You can make it easier with a function:
create or replace function key_exists_in_array(key text, arr jsonb)
returns boolean language sql immutable as $$
select bool_or(value ? key)
from jsonb_array_elements(arr)
$$;
select
id,
json_column->'node' as array,
key_exists_in_array('Attribute', json_column->'node')
from my_table
id | array | key_exists_in_array
----+--------------------------------------------+---------------------
1 | [{"Attribute": "value"}, {"other key": 0}] | t
2 | [{"Attribute": "value", "other key": 0}] | t
3 | [{"Not Attribute": "value"}] | f
(3 rows)
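If only a count of matching rows is needed, as in the question, the aggregation can probably be avoided with an EXISTS subquery. This is a minimal sketch against the my_table sample above; it assumes node is always an array (jsonb_array_elements() raises an error on other types), and the alias t(elem) is only an illustrative name.

```sql
-- Count rows whose "node" array has at least one object with the key.
select count(*) as "count"
from my_table
where exists (
    select 1
    from jsonb_array_elements(json_column->'node') as t(elem)
    where elem ? 'Attribute'
);

-- PostgreSQL 12 or later only: the same test as a SQL/JSON path expression.
select count(*) as "count"
from my_table
where jsonb_path_exists(json_column, '$.node[*].Attribute');
```

With the three sample rows above, both queries should count rows 1 and 2.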

Related

Match two jsonb documents by order of elements in array

I have a table of data jsonb documents in Postgres and a second table containing templates for the data.
I need to match each data jsonb row with its template jsonb row purely by the order of the elements in the arrays, and do so efficiently.
template jsonb document:
{
"template":1,
"rows":[
"first row",
"second row",
"third row"
]
}
data jsonb document:
{
"template":1,
"data":[
125,
578,
445
]
}
desired output:
| Desc       | Amount |
| ---------- | ------ |
| first row  | 125    |
| second row | 578    |
| third row  | 445    |
template table:
| id | jsonb |
| -------- | ------------------------------------------------------ |
| 1 | {"template":1,"rows":["first row","second row","third row"]} |
| 2 | {"template":2,"rows":["first row","second row","third row"]} |
| 3 | {"template":3,"rows":["first row","second row","third row"]} |
data table:
| id | jsonb |
| -------- | ------------------------------------------- |
| 1 | {"template":1,"data":[125,578,445]} |
| 2 | {"template":1,"data":[125,578,445]} |
| 3 | {"template":2,"data":[125,578,445]} |
I have millions of data jsonb documents and hundreds of templates.
I would do it by converting both to tables and then using the row_number window function, but that does not seem very efficient to me.
Is there a better way of doing this?
You will have to normalize this mess "on-the-fly" to get the output you want.
You need to unnest each array with jsonb_array_elements_text(), using the with ordinality option to get the array index. You can join the two tables by extracting the value of the template key.
Assuming you want to return this for a specific row from the data table:
select td.val, dt.val
from data
cross join jsonb_array_elements_text(data.jsonb_column -> 'data') with ordinality as dt(val, idx)
left join template tpl
on tpl.jsonb_column ->> 'template' = data.jsonb_column ->> 'template'
left join jsonb_array_elements_text(tpl.jsonb_column -> 'rows') with ordinality as td(val, idx)
on td.idx = dt.idx
where data.id = 1;
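Since the template array only needs to be read by position, a variant that skips the second unnest and looks the label up by index may also work. This is a sketch under the same assumptions as the query above (tables data and template, column jsonb_column); the output aliases "Desc" and "Amount" are taken from the desired output in the question. with ordinality counts from 1 while JSON array indexes start at 0, hence the idx - 1.

```sql
select tpl.jsonb_column -> 'rows' ->> (dt.idx - 1)::int as "Desc",
       dt.val::int                                      as "Amount"
from data
cross join jsonb_array_elements_text(data.jsonb_column -> 'data')
           with ordinality as dt(val, idx)
left join template tpl
       on tpl.jsonb_column ->> 'template' = data.jsonb_column ->> 'template'
where data.id = 1
order by dt.idx;  -- keep the original array order
```

Whether this is faster than joining two unnested sets would need to be checked with EXPLAIN ANALYZE on real data.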

Query Sum of jsonb column of objects

I have a jsonb column in my DB called reactions, which has a structure like the one below.
[
{
"id": "1234",
"count": 1
},
{
"id": "2345",
"count": 1
}
]
The field holds an array of objects, each with a count and an id field. I'm trying to find which object in the reactions field has the highest count across the DB. Basically, I'd like to sum the counts for each id and find the max.
I've figured out how to sum up all of the reaction counts, but I'm stuck on grouping by id and finding the sum for each individual id.
SELECT SUM((x->>'count')::integer) FROM (SELECT id, reactions FROM messages) as m
CROSS JOIN LATERAL jsonb_array_elements(m.reactions) AS x
Ideally I'd end up with something like this:
id | sum
-----------
1234 | 100
2345 | 70
5678 | 50
The messages table looks something like this
id | user | reactions
------------------------
1 | 3456 | jsonb
2 | 8573 | jsonb
The calculation takes a few transformation steps.
Flatten the jsonb column from an array into individual jsonb objects using the jsonb_array_elements function:
postgres=# select jsonb_array_elements(reactions)::jsonb as data from messages;
data
----------------------------
{"id": "1234", "count": 1}
{"id": "2345", "count": 1}
{"id": "1234", "count": 1}
{"id": "2345", "count": 1}
...
Expand each jsonb object into separate columns with the jsonb_populate_record function:
postgres=# create table data(id text ,count int);
CREATE TABLE
postgres=# select r.* from (select jsonb_array_elements(reactions)::jsonb as data from messages) as tmp, jsonb_populate_record(NULL::data, data) r;
id | count
------+-------
1234 | 1
2345 | 1
1234 | 1
2345 | 1
...
Compute the sum with group by:
postgres=# select r.id, sum(r.count) from (select jsonb_array_elements(reactions)::jsonb as data from messages) as tmp, jsonb_populate_record(NULL::data, data) r group by r.id;
id | sum
------+-----
2345 | 2
1234 | 2
...
The above steps should do it.
You can use the query below to convert the jsonb array to standard rows;
see https://dba.stackexchange.com/questions/203250/getting-specific-key-values-from-jsonb-into-columns
select "id", sum("count")
from messages
left join lateral jsonb_to_recordset(reactions) x ("id" text, "count" int) on true
group by "id" order by 1;
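To get from the per-id sums to the single id with the highest total, which is what the question ultimately asks for, one option is to sort by the sum and keep the first row. This is a minimal sketch reusing the messages table and jsonb_to_recordset() from above; the alias total is just an illustrative name.

```sql
select x."id", sum(x."count") as total
from messages m
cross join lateral jsonb_to_recordset(m.reactions) as x("id" text, "count" int)
group by x."id"
order by total desc   -- highest summed count first
limit 1;              -- drop this line to list every id with its sum
```

If several ids tie for the maximum, limit 1 returns only one of them.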

Postgresql order by array overlap

I want to sort a table by an array column. The record with the most overlaps should be on top. I already have a WHERE clause that filters the records by array overlap. With the same array, I want to determine the number of overlaps for sorting. Do you have an idea what the ORDER BY clause might look like?
My Table
SELECT * FROM "nodes"
+-----------+---------------------------+
| name | tags |
+-----------+---------------------------+
| Max | ["foo", "orange", "app"] |
| Peter | ["foo", "bar", "baz"] |
| Maria | ["foo", "bar"] |
| John | ["apple"] |
+-----------+---------------------------+
Result with where
SELECT * FROM "nodes" WHERE (tags && '{"foo", "bar", "baz"}')
+-----------+---------------------------+
| name | tags |
+-----------+---------------------------+
| Max | ["foo", "orange", "app"] |
| Peter | ["foo", "bar", "baz"] |
| Maria | ["foo", "bar"] |
+-----------+---------------------------+
Result with Order
SELECT * FROM "nodes" WHERE (tags && '{"foo", "bar", "baz"}') ORDER BY ????
+-----------+---------------------------+
| name | tags |
+-----------+---------------------------+
| Peter | ["foo", "bar", "baz"] |
| Maria | ["foo", "bar"] |
| Max | ["foo", "orange", "app"] |
+-----------+---------------------------+
The only thing I can think of is to create a function that computes the number of common elements:
create or replace function num_overlaps(p_one text[], p_other text[])
returns bigint
as
$$
select count(*)
from (
select *
from unnest(p_one)
intersect
select *
from unnest(p_other)
) x
$$
language sql
immutable;
Then use it in the order by clause:
SELECT *
FROM nodes
WHERE tags && '{"foo", "bar", "baz"}'
order by num_overlaps(tags, '{"foo", "bar", "baz"}') desc;
The drawback is that you need to repeat the list of tags you are testing for.
It's unclear to me if those values are JSON arrays (because that's the syntax in the sample data) or native Postgres arrays (because of the && operator which doesn't work for JSON arrays) - if you are using jsonb you can replace unnest() with jsonb_array_elements_text()
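One way to avoid repeating the tag list might be to supply it once through a VALUES row and count the matches in a correlated subquery. The sketch below assumes tags is a native text[] column; the aliases q(wanted) and t(tag) are hypothetical names introduced only for this example.

```sql
select n.*
from nodes n
cross join (values (array['foo', 'bar', 'baz'])) as q(wanted)  -- the tag list, written once
where n.tags && q.wanted
order by (
    -- number of elements of n.tags that appear in the wanted list
    select count(*)
    from unnest(n.tags) as t(tag)
    where t.tag = any (q.wanted)
) desc;
```

Unlike the intersect-based function, this counts a tag twice if it is repeated inside tags, so the two approaches only agree when each row's tags are distinct.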
First of all, both operands of the && operator must be arrays, so use STRING_TO_ARRAY(translate(tags::text, '[] "', ''), ',')::text[] instead of tags, and STRING_TO_ARRAY('foo,bar,baz',',') instead of the '{"foo", "bar", "baz"}' pattern.
Then you can unnest the elements of the tags column with the JSON_ARRAY_ELEMENTS() function and count how many of the returned values occur within the '{"foo", "bar", "baz"}' pattern, using the STRPOS() and SIGN() functions together with SUM() aggregation:
SELECT name, tags::text
FROM "nodes"
CROSS JOIN JSON_ARRAY_ELEMENTS(tags) AS js
WHERE ( STRING_TO_ARRAY(translate(tags::text, '[] "', ''), ',')::text[]
&& STRING_TO_ARRAY('foo,bar,baz',','))
GROUP BY name, tags::text
ORDER BY SUM( SIGN( STRPOS('{"foo", "bar", "baz"}'::text,value::text) ) ) DESC
However, the tags column may contain repeated elements, in which case the above query counts them more than once. So I suggest using the query below instead, which eliminates duplicate rows with the DISTINCT keyword:
SELECT name, tags
FROM
(
SELECT DISTINCT name, tags::text, STRPOS('{"foo", "bar", "baz"}'::text,value::text)
FROM "nodes"
CROSS JOIN JSON_ARRAY_ELEMENTS(tags) AS js
WHERE ( STRING_TO_ARRAY(translate(tags::text, '[] "', ''), ',')::text[]
&& STRING_TO_ARRAY('foo,bar,baz',','))
) n
GROUP BY name, tags::text
ORDER BY SUM( SIGN( strpos ) ) DESC

Unnest json string array

I'm using psql and I have a table that looks like this:
id | dashboard_settings
-----------------------
1 | {"query": {"year_end": 2018, "year_start": 2015, "category": ["123"]}}
There are numerous rows, but for every row the "category" value is an array with one integer (in string format).
Is there a way I can 'unpackage' the category object? So that it just has 123 as an integer?
I've tried this but had no success:
SELECT jsonb_extract_path_text(dashboard_settings->'query', 'category') from table
This returns:
jsonb_extract_path_text | ["123"]
when I want:
jsonb_extract_path_text | 123
You need to use the array access operator, which is simply ->> followed by the array index:
select jsonb_extract_path(dashboard_settings->'query', 'category') ->> 0
from the_table
alternatively:
select dashboard_settings -> 'query' -> 'category' ->> 0
from the_table
Consider:
select dashboard_settings->'query'->'category'->>0 c from mytable
Demo on DB Fiddle:
| c |
| :-- |
| 123 |
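The ->> operator returns text, so to get 123 as an integer, as the question asks, a cast is still needed. This is a small sketch using the the_table name from the answers above; the alias category is illustrative, and the cast will raise an error if any row holds a non-numeric string.

```sql
-- Step-by-step navigation, then cast the extracted text to integer.
select (dashboard_settings -> 'query' -> 'category' ->> 0)::int as category
from the_table;

-- Equivalent form using a path: #>> takes a text[] path and returns text.
select (dashboard_settings #>> '{query,category,0}')::int as category
from the_table;
```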

Extract fields from postgres jsonb column

I have a postgres table with jsonb column, which has the value as follows:
id | messageStatus | payload
-----|----------------------|-------------
1 | 123 | {"commissionEvents":[{"id":1,"name1":"12","name2":15,"name4":"apple","name5":"fruit"},{"id":2,"name1":"22","name2":15,"name4":"sf","name5":"fdfjkd"}]}
2 | 124 | {"commissionEvents":[{"id":3,"name1":"32","name2":15,"name4":"sf","name5":"fdfjkd"},{"id":4,"name1":"42","name2":15,"name4":"apple","name5":"fruit"}]}
3 | 125 | {"commissionEvents":[{"id":5,"name1":"42","name2":15,"name4":"apple","name5":"fdfjkd"},{"id":6,"name1":"52","name2":15,"name4":"sf","name5":"fdfjkd"},{"id":7,"name1":"62","name2":15,"name4":"apple","name5":"fdfjkd"}]}
Here the payload column has the jsonb data type. I want to write a Postgres query that fetches name1 from commissionEvents where name4 = apple.
So my result will be like:
Since I am new to jsonb, can anyone please suggest a solution?
You need to unnest all array elements, then you can apply a WHERE condition on that to filter out those with the desired name.
select t.id, x.o ->> 'name1'
from the_table t
cross join lateral jsonb_array_elements(t.payload -> 'commissionEvents') as x(o)
where x.o ->> 'name4' = 'apple'
Online example: https://rextester.com/XWHG26387
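If one row per id is preferred over one row per matching event, the unnested values can be collected again with array_agg(). This is a sketch built on the query above; the alias name1_values is hypothetical, and the extra @> condition is an optional pre-filter that may let a GIN index on payload skip rows with no matching event.

```sql
select t.id,
       array_agg(x.o ->> 'name1') as name1_values
from the_table t
cross join lateral jsonb_array_elements(t.payload -> 'commissionEvents') as x(o)
where t.payload @> '{"commissionEvents": [{"name4": "apple"}]}'  -- optional row-level pre-filter
  and x.o ->> 'name4' = 'apple'                                  -- element-level filter
group by t.id
order by t.id;
```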