Query Sum of jsonb column of objects - sql

I have a jsonb column on my DB called reactions which has the structure like below.
[
{
"id": "1234",
"count": 1
},
{
"id": "2345",
"count": 1
}
]
The field holds an array of objects, each with a count and id field. I'm trying to find which object in the reactions field has the highest count across the DB. Basically, I'd like to sum each object by its id and find the max.
I've figured out how to sum up all of the reaction counts, but I'm getting stuck grouping it by the ID, and finding the sum for each individual id.
SELECT SUM((x->>'count')::integer) FROM (SELECT id, reactions FROM messages) as m
CROSS JOIN LATERAL jsonb_array_elements(m.reactions) AS x
Ideally I'd end up with something like this:
id | sum
-----------
1234 | 100
2345 | 70
5678 | 50
The messages table looks something like this
id | user | reactions
------------------------
1 | 3456 | jsonb
2 | 8573 | jsonb

The calculation takes a few transformation steps.
Flatten the jsonb column from an array into individual jsonb objects using the jsonb_array_elements function:
postgres=# select jsonb_array_elements(reactions)::jsonb as data from messages;
data
----------------------------
{"id": "1234", "count": 1}
{"id": "2345", "count": 1}
{"id": "1234", "count": 1}
{"id": "2345", "count": 1}
...
Populate each jsonb object into separate columns with the jsonb_populate_record function:
postgres=# create table data(id text ,count int);
CREATE TABLE
postgres=# select r.* from (select jsonb_array_elements(reactions)::jsonb as data from messages) as tmp, jsonb_populate_record(NULL::data, data) r;
id | count
------+-------
1234 | 1
2345 | 1
1234 | 1
2345 | 1
...
Do the sum with GROUP BY:
postgres=# select r.id, sum(r.count) from (select jsonb_array_elements(reactions)::jsonb as data from messages) as tmp, jsonb_populate_record(NULL::data, data) r group by r.id;
id | sum
------+-----
2345 | 2
1234 | 2
...
The steps above produce the desired result.
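Outside of SQL, the same flatten-and-aggregate logic can be sketched in Python (a hypothetical illustration using the stdlib json module, with made-up sample rows; not part of the original answer):

```python
import json
from collections import defaultdict

# Hypothetical rows as they might come back from the messages table:
# each value is the raw jsonb text of the reactions column.
rows = [
    '[{"id": "1234", "count": 1}, {"id": "2345", "count": 1}]',
    '[{"id": "1234", "count": 1}, {"id": "2345", "count": 1}]',
]

totals = defaultdict(int)
for raw in rows:
    # Step 1: flatten the array into individual objects
    for obj in json.loads(raw):
        # Steps 2-3: group by id and sum the counts
        totals[obj["id"]] += obj["count"]

print(dict(totals))  # {'1234': 2, '2345': 2}
```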

You can use the query below to convert the jsonb array to standard rows:
see https://dba.stackexchange.com/questions/203250/getting-specific-key-values-from-jsonb-into-columns
select "id", sum("count")
from messages
left join lateral jsonb_to_recordset(reactions) x ("id" text, "count" int) on true
group by "id" order by 1;


I want to extract JSON data with BigQuery: UDF or json_extract?

I have a table with the following structure.
user_id int,
purchase_ids string(in Json format)
The JSON contained in one record in this table looks like this:
user_id = 0001
1:{
shop_id:1,
product_id :1111,
value: 1
},
2:{
shop_id:1,
product_id :2222,
value: 1
},
3:{
shop_id:1,
product_id :3333,
value: 1
},
.... (the number of objects varies from record to record)
Final output to aim for
| user_id | shop_id | product_id | value |
| 0001 | 1 | 1111 | 1 |
| 0001 | 1 | 2222 | 1 |
| 0001 | 1 | 3333 | 1 |
I tried the following query, but it doesn't work as expected:
shop_id and product_id return null.
CREATE TEMP FUNCTION jsonparse(json_row STRING)
RETURNS STRING
LANGUAGE js AS """
var res = array();
json_row.forEach(([key, value]) => {
res = value;
});
return res
""";
with
parse as(
select
user_id,
jsonparse(purchase_ids) as pids
from
sample
)
select
user_id,
JSON_EXTRAXT(pid,"$.shop_id") as shop_id,
JSON_EXTRAXT(pid,"$.product_id") as product_id
from
parse,
unnest(pids,",") pid
How do you get it right in this situation?
Below is a working version for your use case (BigQuery Standard SQL):
#standardSQL
CREATE TEMP FUNCTION jsonparse(input STRING)
RETURNS ARRAY<STRING>
LANGUAGE js AS """
return JSON.parse(input).map(x=>JSON.stringify(x));
""";
WITH sample AS (
SELECT "0001" AS user_id,
'''[{"shop_id": 1, "product_id" :1111, "value": 1},
{"shop_id": 1, "product_id" :2222, "value": 1},
{"shop_id": 1, "product_id" :3333, "value": 1}]''' AS purchase_ids
), parse AS (
SELECT user_id,
jsonparse(purchase_ids) AS pids
FROM sample
)
SELECT
user_id,
JSON_EXTRACT(pid,"$.shop_id") AS shop_id,
JSON_EXTRACT(pid,"$.product_id") AS product_id,
JSON_EXTRACT(pid,"$.value") AS value
FROM parse,
UNNEST(pids) pid
with result
Row user_id shop_id product_id value
1 0001 1 1111 1
2 0001 1 2222 1
3 0001 1 3333 1
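The JS UDF's only job is to parse the array string and re-serialize each element so the outer query can apply JSON_EXTRACT to each one. For illustration only, the same transformation can be mirrored in Python (a hypothetical sketch, not BigQuery code):

```python
import json

def jsonparse(input_str):
    # Mirrors the JS UDF: parse the JSON array, then re-serialize
    # each element so a downstream step can extract fields from
    # the individual JSON strings.
    return [json.dumps(x) for x in json.loads(input_str)]

pids = jsonparse('[{"shop_id": 1, "product_id": 1111, "value": 1}]')
print(pids)  # ['{"shop_id": 1, "product_id": 1111, "value": 1}']
```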
From my point of view, your use case calls for a NESTED and REPEATED column, which can be represented with a json structure. For example, the following query returns the result you are looking for:
WITH users AS
(SELECT "0001" as user_id, ARRAY<STRUCT<shop_id INT64, product_id INT64, value INT64>>[(1, 1111,1),
(1, 2222,1), (1, 3333,1)] AS shops)
SELECT u.user_id, s.*
FROM users u, UNNEST(shops) s;
For simplicity you can create this type of column from the Console to try this approach by following this guide.

How to convert single json array to row

I'm trying to convert a json array column's values into rows.
That's my table:
id | json_v |
1 | ["1", "3"] |
2 | ["4", "5", "6"] |
3 | ["7", "8"] |
I want to get just simple result:
json_v
1
2
3
4
...
I found this code:
select json_array_elements(json_v::json) from table_name where id=1
it worked. But when I tried to get all rows:
select a from table_name t, json_array_elements((t.json_v)::json) a
I'm getting error:
SQL Error [22023]: ERROR: cannot call json_array_elements on a scalar
So how can I convert all json array values to rows?
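This error usually means at least one row's json_v holds a JSON scalar (a bare string or number) rather than an array; in Postgres the lateral call can be guarded with json_typeof(json_v::json) = 'array'. The failure mode and the guard can be sketched in Python with hypothetical data (not Postgres code):

```python
import json

rows = ['["1", "3"]', '["4", "5", "6"]', '"7"']  # last row is a scalar

values = []
for raw in rows:
    parsed = json.loads(raw)
    # Expanding a scalar is the Python analogue of calling
    # json_array_elements on a non-array value, so skip those rows
    # (in SQL: ... on json_typeof(json_v::json) = 'array').
    if isinstance(parsed, list):
        values.extend(parsed)

print(values)  # ['1', '3', '4', '5', '6']
```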

How to perform a pattern matching query on the keys of a hstore/jsonb column?

I'm trying to perform a pattern matching on an hstore column on a Postgresql database table.
Here's what I tried:
SELECT
*
FROM
products
WHERE
'iphone8' LIKE ANY(AVALS(available_devices))
however, it seems that the ANY operator only supports <, <=, <>, etc.
I also tried this:
SELECT
*
FROM
products
WHERE
ANY(AVALS(available_devices)) LIKE 'iphone8'
but then it raises a SyntaxError.
So, can I write a WHERE clause that takes a parameter and returns the rows whose hstore column contains any key matching that parameter?
eg:
for rows
id | hstore_column
1 { country: 'brazil' }
2 { city: 'amsterdam' }
3 { state: 'new york' }
4 { count: 10 }
5 { counter: 'Irelia' }
I'd like to perform a WHERE with a parameter 'count' and I expect the results to be:
id | hstore_column
1 { country: 'brazil' }
4 { count: 10 }
5 { counter: 'Irelia' }
You can use jsonb_object_keys to turn the keys into a column. Then match against the key.
For example, here's my test data.
select * from test;
id | stuff
----+---------------------------------------------
1 | {"country": "brazil"}
2 | {"city": "amsterdam"}
3 | {"count": 10}
4 | {"pearl": "jam", "counting": "crows"}
5 | {"count": "chocula", "count down": "final"}
Then we can use jsonb_object_keys to turn each key into its own row.
select id, stuff, jsonb_object_keys(stuff) as key
from test;
id | stuff | key
----+---------------------------------------------+------------
1 | {"country": "brazil"} | country
2 | {"city": "amsterdam"} | city
3 | {"count": 10} | count
4 | {"pearl": "jam", "counting": "crows"} | pearl
4 | {"pearl": "jam", "counting": "crows"} | counting
5 | {"count": "chocula", "count down": "final"} | count
5 | {"count": "chocula", "count down": "final"} | count down
This can be used in a sub-select to get each matching key/value pair.
select id, stuff, key, stuff->key as value
from (
select id, stuff, jsonb_object_keys(stuff) as key
from test
) pairs
where key like 'count%';
id | stuff | key | value
----+---------------------------------------------+------------+-----------
1 | {"country": "brazil"} | country | "brazil"
3 | {"count": 10} | count | 10
4 | {"pearl": "jam", "counting": "crows"} | counting | "crows"
5 | {"count": "chocula", "count down": "final"} | count | "chocula"
5 | {"count": "chocula", "count down": "final"} | count down | "final"
Or we can use distinct to get just the matching rows.
select distinct id, stuff
from (
select id, stuff, jsonb_object_keys(stuff) as key
from test
) pairs
where key like 'count%';
id | stuff
----+---------------------------------------------
1 | {"country": "brazil"}
3 | {"count": 10}
4 | {"pearl": "jam", "counting": "crows"}
5 | {"count": "chocula", "count down": "final"}
dbfiddle
Note: having to search the keys indicates your data structure might need rethinking. A traditional key/value table might work better. The values can still be jsonb. There's a little more setup, but the queries are simpler and it is easier to index.
create table attribute_group (
id bigserial primary key
);
create table test (
id bigserial primary key,
attribute_group_id bigint
references attribute_group(id)
on delete cascade
);
create table attributes (
attribute_group_id bigint
references attribute_group(id) not null,
key text not null,
value jsonb not null
);
select test.id, attrs.key, attrs.value
from test
join attributes attrs on attrs.attribute_group_id = test.attribute_group_id
where attrs.key like 'count%';
dbfiddle

JSONB nested array query - check existence of attribute

I want to check the existence of an attribute in a JSONB column using SQL.
Using this is I can check if attribute equals value:
SELECT count(*) AS "count" FROM "table" WHERE column->'node' #> '[{"Attribute":"value"}]'
What syntax do I use to check the existence of Attribute?
Usually you'll check for null:
SELECT count(*) AS "count" FROM "table"
WHERE column->'node'->'Attribute' is not null
The ? operator means "Does the string exist as a top-level key within the JSON value?". However, you want to check whether a key exists in a nested json array of objects, so you cannot use the operator directly. You have to unnest the arrays.
Sample data:
create table my_table(id serial primary key, json_column jsonb);
insert into my_table (json_column) values
('{"node": [{"Attribute":"value"}, {"other key": 0}]}'),
('{"node": [{"Attribute":"value", "other key": 0}]}'),
('{"node": [{"Not Attribute":"value"}]}');
Use jsonb_array_elements() in a lateral join to find out whether a key exists in any element of the array:
select
id,
value,
value ? 'Attribute' as key_exists_in_object
from my_table
cross join jsonb_array_elements(json_column->'node')
id | value | key_exists_in_object
----+----------------------------------------+----------------------
1 | {"Attribute": "value"} | t
1 | {"other key": 0} | f
2 | {"Attribute": "value", "other key": 0} | t
3 | {"Not Attribute": "value"} | f
(4 rows)
But this is not exactly what you are expecting. You need to aggregate results for arrays:
select
id,
json_column->'node' as array,
bool_or(value ? 'Attribute') as key_exists_in_array
from my_table
cross join jsonb_array_elements(json_column->'node')
group by id
order by id
id | array | key_exists_in_array
----+--------------------------------------------+---------------------
1 | [{"Attribute": "value"}, {"other key": 0}] | t
2 | [{"Attribute": "value", "other key": 0}] | t
3 | [{"Not Attribute": "value"}] | f
(3 rows)
Well, this looks a bit complex. You can simplify it by wrapping the logic in a function:
create or replace function key_exists_in_array(key text, arr jsonb)
returns boolean language sql immutable as $$
select bool_or(value ? key)
from jsonb_array_elements(arr)
$$;
select
id,
json_column->'node' as array,
key_exists_in_array('Attribute', json_column->'node')
from my_table
id | array | key_exists_in_array
----+--------------------------------------------+---------------------
1 | [{"Attribute": "value"}, {"other key": 0}] | t
2 | [{"Attribute": "value", "other key": 0}] | t
3 | [{"Not Attribute": "value"}] | f
(3 rows)
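For illustration, the bool_or(value ? key) aggregation corresponds to Python's any() over the array's objects (a hypothetical sketch mirroring the SQL helper, not Postgres code):

```python
import json

def key_exists_in_array(key, arr_json):
    # bool_or(value ? key): true if any object in the array has the key
    return any(key in obj for obj in json.loads(arr_json))

print(key_exists_in_array("Attribute", '[{"Attribute": "value"}, {"other key": 0}]'))  # True
print(key_exists_in_array("Attribute", '[{"Not Attribute": "value"}]'))                # False
```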

How to sum a value in a JSONB array in Postgresql?

Given the following data in the jsonb column p06 in the table ryzom_characters:
-[ RECORD 1 ]-------------------------------------------
p06 | {
"id": 675010,
"cname": "Bob",
"rpjobs": [
{
"progress": 25
},
{
"progress": 13
},
{
"progress": 30
}
]
}
I am attempting to sum the value of progress. I have attempted the following:
SELECT
c.cname AS cname,
jsonb_array_elements(c.p06->'rpjobs')::jsonb->'progress' AS value
FROM ryzom_characters c
Where cid = 675010
ORDER BY value DESC
LIMIT 50;
Which correctly lists the values:
cname | value
--------+-------
Savisi | 30
Savisi | 25
Savisi | 13
(3 rows)
But now I would like to sum these values, which could be null.
How do I correctly sum an object field within an array?
Here is the table structure:
Table "public.ryzom_characters"
Column | Type | Collation | Nullable | Default
---------------+------------------------+-----------+----------+---------
cid | bigint | | |
cname | character varying(255) | | not null |
p06 | jsonb | | |
x01 | jsonb | | |
Use the function jsonb_array_elements() in a lateral join in the from clause:
select cname, sum(coalesce(value, '0')::int) as value
from (
select
p06->>'cname' as cname,
value->>'progress' as value
from ryzom_characters
cross join jsonb_array_elements(p06->'rpjobs')
where cid = 675010
) s
group by cname
order by value desc
limit 50;
You can use left join instead of cross join to protect the query against inconsistent data:
left join jsonb_array_elements(p06->'rpjobs')
on jsonb_typeof(p06->'rpjobs') = 'array'
where p06->'rpjobs' <> 'null'
The function jsonb_array_elements() is a set-returning function. You should therefore use it as a row source (in the FROM clause). After the call you have a table where every row contains an array element. From there on it is relatively easy.
SELECT cname,
sum(coalesce((r.prog->>'progress')::int, 0)) AS value
FROM ryzom_characters c,
jsonb_array_elements(c.p06->'rpjobs') r (prog)
WHERE c.cid = 675010
GROUP BY cname
ORDER BY value DESC
LIMIT 50;
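The coalesce-to-zero handling of missing progress values corresponds to dict.get with a default in Python. A hypothetical sketch of the same sum, using made-up sample data shaped like the p06 column:

```python
import json

p06 = json.loads("""
{"id": 675010, "cname": "Bob",
 "rpjobs": [{"progress": 25}, {"progress": 13}, {"progress": 30}, {}]}
""")

# sum(coalesce((prog->>'progress')::int, 0)): objects without a
# progress key contribute 0 instead of making the sum null.
total = sum(job.get("progress", 0) for job in p06["rpjobs"])
print(total)  # 68
```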