Querying over PostgreSQL JSONB column - sql

I have a table "blobs" with a column "metadata" in jsonb data-type,
Example:
{
"total_count": 2,
"items": [
{
"name": "somename",
"metadata": {
"metas": [
{
"id": "11258",
"score": 6.1,
"status": "active",
"published_at": "2019-04-20T00:29:00",
"nvd_modified_at": "2022-04-06T18:07:00"
},
{
"id": "9251",
"score": 5.1,
"status": "active",
"published_at": "2018-01-18T23:29:00",
"nvd_modified_at": "2021-01-08T12:15:00"
}
]
}
]
}
I want to identify statuses in the "metas" array that match with certain, given strings. I have tried the following so far but without results:
SELECT * FROM blobs
WHERE metadata is not null AND
(
SELECT count(*) FROM jsonb_array_elements(metadata->'metas') AS cn
WHERE cn->>'status' IN ('active','reported')
) > 0;
It would also be sufficient if I could compare the string with "status" in the first array object.
I am using PostgreSQL 9.6.24

for some clarity I usually break code into series of WITH statements. My idea for your problem would be to use json path (https://www.postgresql.org/docs/12/functions-json.html#FUNCTIONS-SQLJSON-PATH) and function jsonb_path_query.
Below code gives a list of counts, I will leave the rest to you, to get final data.
I've added ID column just to have something to join on. Otherwise join on metadata.
Also, note additional " in where condition. Left join in blob_ext is there just to have null value if metadata is not present or that path does not work.
with blob as (
select row_number() over()"id", * from (VALUES
(
'{
"total_count": 2,
"items": [
{
"name": "somename",
"metadata": {
"metas": [
{
"id": "11258",
"score": 6.1,
"status": "active",
"published_at": "2019-04-20T00:29:00",
"nvd_modified_at": "2022-04-06T18:07:00"
},
{
"id": "9251",
"score": 5.1,
"status": "active",
"published_at": "2018-01-18T23:29:00",
"nvd_modified_at": "2021-01-08T12:15:00"
}
]
}
}
]}'::jsonb),
(null::jsonb)) b(metadata)
)
, blob_ext as (
select bb.*, blob_sts.status
from blob bb
left join (
select
bb2.id,
jsonb_path_query (bb2.metadata::jsonb, '$.items[*].metadata.metas[*].status'::jsonpath)::character varying "status"
FROM blob bb2
) as blob_sts ON
blob_sts.id = bb.id
)
select bbe.id, count(*) cnt, bbe.metadata
from blob_ext bbe
where bbe.status in ('"active"', '"reported"')
group by bbe.id, bbe.metadata;

A way is to peel one layer at a time with jsonb_extract_path() and jsonb_array_elements():
with cte_items as (
select id,
metadata,
jsonb_extract_path(jx.value,'metadata','metas') as metas
from blobs,
lateral jsonb_array_elements(jsonb_extract_path(metadata,'items')) as jx),
cte_metas as (
select id,
metadata,
jsonb_extract_path_text(s.value,'status') as status
from cte_items,
lateral jsonb_array_elements(metas) s)
select distinct
id,
metadata
from cte_metas
where status in ('active','reported');

Related

How to Add Multiple JSON_BUILD_OBJECT entries to a JSON_AGG

I am using PostgreSQL 9.4 and I have a requirement to have an 'addresses' array which contains closed JSON objects for different types of address (residential and correspondence). The structure should look like this:
[
{
"addresses": [
{
"addressLine1": "string",
"type": "residential"
},
{
"addressLine1": "string",
"type": "correspondence"
}
],
"lastName": "string"
}
]
...and here's some example data to illustrate the desired result:
[
{
"addresses": [
{
"addressLine1": "54 ASHFIELD PADDOCK",
"type": "residential"
},
{
"addressLine1": "135 MERRION HILL",
"type": "correspondence"
}
],
"lastName": "WRIGHT"
},
{
"addresses": [
{
"addressLine1": "13 BOAKES GROVE",
"type": "residential"
},
{
"addressLine1": "46 BEACONSFIELD GRANGE",
"type": "correspondence"
}
],
"lastName": "DOHERTY"
}
]
This is where I've gotten to with my SQL:
SELECT
json_agg(
json_build_object('addresses',(SELECT json_agg(json_build_object('addressLine1',c2.address_line_1,
'type',c2.address_type
)
)
FROM my_customer_table c2
WHERE c2.person_id=c.person_id
),
'addresses',(SELECT json_agg(json_build_object('addressLine1',c2.corr_address_line_1,
'type',c2.corr_address_type
)
)
FROM my_customer_table c2
WHERE c2.person_id=c.person_id
),
'lastName',c.surname
)
) AS customer_json
FROM
my_customer_table c
WHERE
c.corr_address_type IS NOT NULL /*exclude customers without correspondence addresses*/
...and this runs, however it repeats the 'addresses' object twice and has an array for each address variant, not around the overall array.
What I stupidly thought would work, is the following:
SELECT
json_agg(
json_build_object('addresses',(SELECT json_agg(json_build_object('addressLine1',c2.address_line_1,
'type',c2.address_type
),
json_build_object('addressLine1',c2.corr_address_line_1,
'type',c2.corr_address_type
)
)
FROM my_customer_table c2
WHERE c2.person_id=c.person_id
),
'lastName',c.surname
)
) AS customer_json
FROM
my_customer_table c
WHERE
c.corr_address_type IS NOT NULL /*exclude customers without correspondence addresses*/
...however this throws an error:
"ERROR: function json_agg(json, json) does not exist.
LINE 3: json_build_object('addresses',(SELECT json_agg(json_build_o...
HINT: No function matches the given name and argument types. You might need to add explicit type casts."
I've Googled this, however no posts found seem to relate to the same kind of result I'm trying to get.
Does anyone know if it's possible to have multiple JSON_BUILD_OBJECT entries inside of an array?
A colleague has found a solution to this. It's totally different to the way I was approaching it, but works nicely. Here's the working code:
SELECT
json_agg(subquery.customer_json) AS customer_json
FROM
(
SELECT
row_to_json(t) AS customer_json
FROM (
SELECT
(
SELECT array_to_json(array_agg(addresses_union))
FROM (
SELECT
c.address_line_1 AS "addressLine1",
c.address_type as type
UNION ALL
SELECT
c.corr_address_Line_1 AS "addressLine1",
c.corr_address_type as type
) AS addresses_union
) as addresses,
c.surname AS "lastName"
FROM
my_customer_table c
WHERE
c.corr_address_type IS NOT NULL /*exclude customers without correspondence address*/
) t
) subquery

Average of numeric values in Postgres JSON column

I have a jsonb column with the following structure:
{
"key1": {
"type": "...",
"label": "...",
"variables": [
{
"label": "Height",
"value": 131315.9289,
"variable": "myVar1"
},
{
"label": "Width",
"value": 61085.7525,
"variable": "myVar2"
}
]
},
}
I want to query for the average height across all rows. The top-level key values are unknown, so I have something like this:
select id,
avg((latVars ->> 'value')::numeric) as avg
from "MyTable",
jsonb_array_elements((my_json_field->jsonb_object_keys(my_json_field)->>'variables')::jsonb) as latVars
where my_json_field is not null
group by id;
It's throwing the following error:
ERROR: set-returning functions must appear at top level of FROM
Moving the jsonb_array_elements function above MyTable in the FROM clause doesn't work.
I'm following the basic advice found in this SO answer to no avail.
Any advice?
jsonb_array_elements is not relevant until my_json_field is a json array at the top level.
You can use instead the jsonb_path_query function based on the jsonpath language if postgres >= 12 :
select id
, avg(v.value :: numeric) as avg
from "MyTable"
, jsonb_path_query(my_json_field, '$.*.variables[*] ? (#.label == "Height").value') AS v(value)
where my_json_field is not null
group by id;

JSON database table query

I have JSON table with some objects and I am trying to query the amount value in the object
{
"authorizations": [
{
"id": "d50",
"type": "passed",
"amount": 100,
"fortId": 5050,
"status": "GENERATED",
"voided": false,
"cardNumber": 3973,
"expireDate": null,
"description": "Success",
"customerCode": "858585",
"paymentMethod": "cash",
"changeDatetime": null,
"createDatetime": 000000000,
"reservationCode": "202020DD",
"authorizationCode": "D8787"
},
{
"id": "d50",
"type": "passed",
"amount": 100,
"fortId": 5050,
"status": "GENERATED",
"voided": false,
"cardNumber": 3973,
"expireDate": null,
"description": "Success",
"customerCode": "858585",
"paymentMethod": "cash",
"changeDatetime": null,
"createDatetime": 000000000,
"reservationCode": "202020DD",
"authorizationCode": "D8787"
}
],
}
I have tried the following four options, but none of these give me the value of the object:
SELECT info #> 'authorizations:[{amount}]'
FROM idv.reservations;
SELECT info -> 'authorizations:[{amount}]'
FROM idv.reservations;
info -> ''authorizations' ->> 'amount'
FROM idv.reservations
select (json_array_elements(info->'authorizations')->'amount')::int from idv.reservations
note I am using DBeaver
If you want one row per object contained in the "authorizations" JSON array, with the corresponding amount, you can use a lateral join and jsonb_array_elements():
select r.*, (x.obj ->> 'amount')::int as amount
from reservations r
cross join lateral jsonb_array_elements(r.info -> 'authorizations') x(obj)
We can also extract all amounts at once and put them in an array, like so:
select r.*,
jsonb_path_query_array(r.info, '$.authorizations[*].amount') as amounts
from reservations r
Demo on DB Fiddlde

Can't hide node "data"

Column data is jsonb
SELECT
json_agg(shop_order)
FROM (
SELECT data from shop_order
WHERE data->'contacts'->'customer'->>'phone' LIKE '%1234567%' LIMIT 3 OFFSET 3
) shop_order
and here result as array:
[
{
"data": {
"id": 211111,
"cartCount": 4,
"created_at": "2020-10-28T12:58:33.387Z",
"modified_at": "2020-10-28T12:58:33.387Z"
}
}
]
Nice. But... I need to hide node data.
The result must be
[
{
"id": 211111,
"cartCount": 4,
"created_at": "2020-10-28T12:58:33.387Z",
"modified_at": "2020-10-28T12:58:33.387Z"
}
]
Is it possible?
you should be able to perform a second select on the result. then specificaly select data
SELECT (result->>'data') as result,
FROM result
example

How to generate JSON array from multiple rows, then return with values of another table

I am trying to build a query which combines rows of one table into a JSON array, I then want that array to be part of the return.
I know how to do a simple query like
SELECT *
FROM public.template
WHERE id=1
And I have worked out how to produce the JSON array that I want
SELECT array_to_json(array_agg(to_json(fields)))
FROM (
SELECT id, name, format, data
FROM public.field
WHERE template_id = 1
) fields
However, I cannot work out how to combine the two, so that the result is a number of fields from public.template with the output of the second query being one of the returned fields.
I am using PostGreSQL 9.6.6
Edit, as requested more information, a definition of field and template tables and a sample of each queries output.
Currently, I have a JSONB row on the template table which I am using to store an array of fields, but I want to move fields to their own table so that I can more easily enforce a schema on them.
Template table contains:
id
name
data
organisation_id
But I would like to remove data and replace it with the field table which contains:
id
name
format
data
template_id
At the moment the output of the first query is:
{
"id": 1,
"name": "Test Template",
"data": [
{
"id": "1",
"data": null,
"name": "Assigned User",
"format": "String"
},
{
"id": "2",
"data": null,
"name": "Office",
"format": "String"
},
{
"id": "3",
"data": null,
"name": "Department",
"format": "String"
}
],
"id_organisation": 1
}
This output is what I would like to recreate using one query and both tables. The second query outputs this, but I do not know how to merge it into a single query:
[{
"id": 1,
"name": "Assigned User",
"format": "String",
"data": null
},{
"id": 2,
"name": "Office",
"format": "String",
"data": null
},{
"id": 3,
"name": "Department",
"format": "String",
"data": null
}]
The feature you're looking for is json concatenation. You can do that by using the operator ||. It's available since PostgreSQL 9.5
SELECT to_jsonb(template.*) || jsonb_build_object('data', (SELECT to_jsonb(field) WHERE template_id = templates.id)) FROM template
Sorry for poorly phrasing what I was trying to achieve, after hours of Googling I have worked it out and it was a lot more simple than I thought in my ignorance.
SELECT id, name, data
FROM public.template, (
SELECT array_to_json(array_agg(to_json(fields)))
FROM (
SELECT id, name, format, data
FROM public.field
WHERE template_id = 1
) fields
) as data
WHERE id = 1
I wanted the result of the subquery to be a column in the ouput rather than compiling the entire output table as a JSON.