Update specific object in array of objects Postgres jsonb - sql

I am attempting to update a jsonb column pagesRead on table Books which contains an array of objects. The structure of it looks similar to this:
[{
"book": "Moby Dick",
"pagesRead": [
"1",
"2",
"3",
"4"
]
},
{
"book": "Book Thief",
"pagesRead": [
"1",
"2"
]
}]
What I am trying to do is update the pagesRead when a specific page of the book is read or if someone has started a new book, add an extra entry into it.
I am able to retrieve the specific book details, but I am unsure about how to update it.
EDIT: So I had to use the Update query from S-Man to add a book entry, but I used the Insert query from Barbaros Özhan to handle updating the page

Some thoughts before:
You should never store structured data as-is in a single column. This causes problems with updates, indexing (and therefore searching and performance), filtering, everything. Please normalize everything into proper tables and columns.
You should never store arrays. Normalize them.
Do not use type text to store integers (pages).
"pagesRead" is a sibling of your filter element ("book"). This makes it much more complicated to reference than if it were a child. So think about using the book name (or better: an id) as a key, like {"my_id_for_book_thief": {"name" : "Book Thief", "pagesRead": [...]}}. In that case, you could use a path to reference it. Otherwise, we need to extract the array, look at each book attribute, and reference its sibling.
Adding a book is quite simple (Assuming that you are using type jsonb instead of type json):
SELECT mydata || '{"book": "Lord Of The Rings", "pagesRead": []}'
FROM mytable
Update:
UPDATE mytable
SET mydata = mydata || '{"book": "Lord Of The Rings", "pagesRead": []}'
Adding a pagesRead value:
SELECT
jsonb_agg( -- 4
jsonb_build_object( -- 3
'book', elem -> 'book',
'pagesRead', CASE WHEN elem ->> 'book' = 'Moby Dick' THEN -- 2
elem -> 'pagesRead' || '"42"'
ELSE elem -> 'pagesRead' END
)
) as new_array
FROM mytable,
jsonb_array_elements(mydata) as elem -- 1
1. Extract the array into one record per element.
2. Add a page if the element contains the correct book.
3. Rebuild the object.
4. Reaggregate your array.
Update would be:
UPDATE mytable
SET mydata = s.new_array
FROM (
-- <query above>
) s
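The four numbered steps can also be traced outside the database. What follows is only a minimal Python sketch, not part of the original answer: it mirrors the query on plain lists and dicts, using the sample array from the question. The function name append_page is hypothetical.

```python
import json

def append_page(books, title, page):
    """Mimic the jsonb_agg / jsonb_build_object query on a Python list."""
    new_array = []
    for elem in books:                        # 1: one record per array element
        pages = elem["pagesRead"]
        if elem["book"] == title:             # 2: append only for the matching book
            pages = pages + [page]            #    builds a new list, input stays untouched
        new_array.append({"book": elem["book"], "pagesRead": pages})  # 3: rebuild the object
    return new_array                          # 4: the re-aggregated array

books = json.loads(
    '[{"book": "Moby Dick", "pagesRead": ["1", "2", "3", "4"]},'
    ' {"book": "Book Thief", "pagesRead": ["1", "2"]}]'
)
updated = append_page(books, "Moby Dick", "42")
print(json.dumps(updated))
```

As in the SQL version, the rebuilt object contains only the two keys that are listed explicitly, so any other keys an element might carry would be dropped.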

Assuming you want to add a new page for the second book (Book Thief), the JSONB_INSERT() function in the following UPDATE statement is enough:
UPDATE books
SET pagesRead = JSONB_INSERT(pagesRead,'{1,pagesRead,1}','"3"'::JSONB,true)
To make the solution dynamic, that is, to append the new page number to the end of the pagesRead array of the desired book without knowing the book's position in the main array, determine that position and the length of the related array in a subquery:
WITH b AS
(
SELECT idx-1 AS pos1,
JSONB_ARRAY_LENGTH(j -> 'pagesRead') - 1 AS pos2
FROM books
CROSS JOIN JSONB_ARRAY_ELEMENTS(pagesRead)
WITH ORDINALITY arr(j,idx)
WHERE j ->> 'book' = 'Book Thief'
)
UPDATE books
SET pagesRead =
JSONB_INSERT(
pagesRead,
('{'||pos1||',pagesRead,'||pos2||'}')::TEXT[],
--# pos1 stands for the position within the main array
--# pos2 stands for the position within the related pagesRead array
'"3"'::JSONB, --# an arbitrary page number
true --# the new page value will be inserted after the target path
)
FROM b
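To make the path arithmetic easier to follow, here is a small Python sketch of what the statement computes. It is an illustration under assumptions, not the database implementation: the function name insert_page_after_last is hypothetical, and the path is returned as a string only to show what the concatenated text[] literal looks like.

```python
import copy

def insert_page_after_last(doc, title, page):
    """Model of JSONB_INSERT(doc, '{pos1,pagesRead,pos2}', page, true)."""
    result = copy.deepcopy(doc)
    for pos1, elem in enumerate(result):       # pos1 = idx - 1 (ordinality is 1-based)
        if elem["book"] == title:
            pos2 = len(elem["pagesRead"]) - 1  # index of the current last page
            path = "{%d,pagesRead,%d}" % (pos1, pos2)
            # insert_after = true: the new value goes right after position pos2
            elem["pagesRead"].insert(pos2 + 1, page)
            return result, path
    return result, None                        # no matching book: nothing changes

doc = [
    {"book": "Moby Dick", "pagesRead": ["1", "2", "3", "4"]},
    {"book": "Book Thief", "pagesRead": ["1", "2"]},
]
new_doc, path = insert_page_after_last(doc, "Book Thief", "3")
print(path, new_doc[1]["pagesRead"])
```

For the sample data the computed path is the same one that was hard-coded in the first statement.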

Related

How to remove/update a JSONB array element where key equals a value?

I'd like remove/replace an element from a JSONB array where a property is equal to a set value. I've found a number of functions that will accomplish this but I'd like to know if there's a way to do it without one as I have database restrictions?
Here's an example JSONB value:
[
{ "ID": "valuea" },
{ "ID": "valueb" },
{ "ID": "valuec" }
]
I'd like to remove the second array position, where ID is equal to valueb, with a single update statement. I'd imagine this could be done by finding the position/order in the array and then using jsonb_set() to remove it.
It would also be helpful if there was a way to update the row and not just remove it. Likely a similar query, again with jsonb_set().
Unfortunately, there is no function to return the position of a JSON array element (yet) as of Postgres 15.
To remove a single matching element:
UPDATE tbl t
SET js = t.js - (SELECT j.ord::int - 1
FROM jsonb_array_elements(t.js) WITH ORDINALITY j(v,ord)
WHERE j.v = '{"ID": "valueb"}'
LIMIT 1)
WHERE t.js @> '[{"ID": "valueb"}]' -- skip rows without a match
AND jsonb_typeof(t.js) = 'array'; -- optional
This UPDATE uses a correlated subquery with jsonb_array_elements().
About WITH ORDINALITY:
PostgreSQL unnest() with element number
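The first WHERE clause is optional only if every row is known to contain a match; the second is optional.
Use the filter t.js @> '[{"ID": "valueb"}]' to skip rows without a matching element and make good use of an existing GIN index on the jsonb column. Without it, the subquery returns no row for such a row, and subtracting the resulting NULL sets js to NULL.
Use the filter jsonb_typeof(t.js) = 'array' only to suppress errors from non-arrays.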
Note how the outer filter includes enclosing array decorators [], while the inner filter (after unnesting) does not.
To remove all matching elements:
UPDATE tbl t
SET js = (SELECT jsonb_agg(j.v)
FROM jsonb_array_elements(t.js) j(v)
WHERE NOT j.v @> '{"ID": "valueb"}')
WHERE t.js @> '[{"ID": "valueb"}]';
The second query aggregates a new array from remaining elements.
This time, the inner filter uses @> instead of = to allow for additional keys. Choose the appropriate filter.
Aside: jsonb_set() might be useful additionally if the array in question is actually nested, unlike your example.
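The difference between the two filters (exact equality versus containment) can be sketched in a few lines of Python. This is an illustrative model only: contains() is a simplified stand-in for the jsonb containment operator that handles flat objects, and all function names are hypothetical. Also note one deliberate difference: the model returns an empty list when every element is removed, whereas jsonb_agg() over zero rows yields NULL.

```python
def contains(elem, pattern):
    """Simplified containment: every key/value pair of pattern occurs in elem."""
    return all(key in elem and elem[key] == value for key, value in pattern.items())

def remove_first_equal(arr, target):
    """First query: drop the first element that is exactly equal to target."""
    for pos, elem in enumerate(arr):          # pos corresponds to ord - 1
        if elem == target:
            return arr[:pos] + arr[pos + 1:]
    return arr                                # no match: array unchanged

def remove_all_containing(arr, pattern):
    """Second query: keep only the elements that do not contain pattern."""
    return [elem for elem in arr if not contains(elem, pattern)]

js = [{"ID": "valuea"}, {"ID": "valueb"}, {"ID": "valuec"}]
print(remove_first_equal(js, {"ID": "valueb"}))

js_extra = [{"ID": "valueb", "note": "x"}, {"ID": "valuea"}, {"ID": "valueb"}]
print(remove_all_containing(js_extra, {"ID": "valueb"}))
```

With equality, an element such as {"ID": "valueb", "note": "x"} is not a match; with containment it is.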

How to retrieve the list of dynamic nested keys of BigQuery nested records

My ELT tools imports my data in bigquery and generates/extends automatically the schema for dynamic nested keys (in the schema below, under properties)
It looks like this
How can I get the list of nested keys of a repeated record, so that, for example, I can group by properties when those items have said property non-null?
I have tried
select column_name
from my_schema.INFORMATION_SCHEMA.COLUMNS
where
table_name = 'my_table'
But it will only list first level keys
From the picture above, I want, as a first step, a SQL query that returns
message
user_id
seeker
liker_id
rateable_id
rateable_type
from_organization
likeable_type
company
existing_attempt
...
My real goal, though, is to group/count my data based on non-null values of second-level nested properties: properties.filters.[filter_type]
The schema may evolve when our application adds more filters, so this need to be dynamically generated, I can't just hard-code the list of nested keys.
Note: this is very similar to the question "How to extract all the keys in a JSON object with BigQuery", but in my case my data is already in a schema and is not a JSON object.
EDIT:
Suppose I have a list of such records with nested properties. How do I write a SQL query that adds a field "enabled_filters" which aggregates, for each item, the list of properties for which said property is not null?
Example input (properties.x are dynamic and not known by the programmer)
search_id | properties.filters.school | properties.filters.type
----------+---------------------------+------------------------
        1 | MIT                       | master
        2 | Princetown                | null
        3 | null                      | master
Example output
search_id | enabled_filters
----------+--------------------
        1 | ["school", "type"]
        2 | ["school"]
        3 | ["type"]
Have you looked at COLUMN_FIELD_PATHS? It should give you the paths for all columns.
select field_path from my_schema.INFORMATION_SCHEMA.COLUMN_FIELD_PATHS where table_name = '<table>'
[https://cloud.google.com/bigquery/docs/information-schema-column-field-paths]
The properties field is nested only through structs, not arrays. A JavaScript UDF that parses this field should therefore be fast enough.
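CREATE TEMP FUNCTION jsonObjectKeys(input STRING, shownull BOOL, fullname BOOL)
RETURNS ARRAY<STRING>
LANGUAGE js AS """
// Recursively collect the keys of a parsed JSON value.
// shownull: also report null fields as "<key>==null"
// fullname: return dotted paths (a.b.c) instead of bare leaf keys
function test(input, old) {
  var out = [];
  for (let x in input) {
    let te = input[x];
    // arrays are objects too, so their indexes show up as keys
    out = out.concat(
      te == null ? (shownull ? [x + '==null'] : [])
      : typeof te == 'object' ? test(te, old + x + '.')
      : [fullname ? old + x : x]);
  }
  return out;
}
return test(JSON.parse(input), "");
""";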
with tbl as (select struct(1 as alpha,struct(2 as x, 3 as y,[1,2,3] as z ) as B) A from unnest(generate_array(1,10*1))
union all select struct(null,struct(null,1,[999])) )
select *,
TO_JSON_STRING (A ) as string_output,
jsonObjectKeys(TO_JSON_STRING (A),true,false) as output1,
jsonObjectKeys(TO_JSON_STRING (A),false,true) as output2,
concat('["', array_to_string(jsonObjectKeys(TO_JSON_STRING (A),false,true),'","' ) ,'"]') as output_string,
jsonObjectKeys(TO_JSON_STRING (A.B),false,true) as output3
from tbl
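For readers who want to check the recursion outside BigQuery, the same idea can be sketched in Python against the example input from the question's EDIT. This is a rough model under assumptions, not a BigQuery feature: the function name non_null_paths and the dict layout of the rows are made up for illustration.

```python
def non_null_paths(record, prefix=""):
    """Return the dotted paths of all non-null leaf fields in a nested dict."""
    paths = []
    for key, value in record.items():
        if value is None:
            continue                                   # skip null properties
        if isinstance(value, dict):
            paths.extend(non_null_paths(value, prefix + key + "."))
        else:
            paths.append(prefix + key)
    return paths

rows = [
    {"search_id": 1, "properties": {"filters": {"school": "MIT", "type": "master"}}},
    {"search_id": 2, "properties": {"filters": {"school": "Princetown", "type": None}}},
    {"search_id": 3, "properties": {"filters": {"school": None, "type": "master"}}},
]

# enabled_filters per search_id, computed without hard-coding the filter names
enabled_filters = {
    row["search_id"]: non_null_paths(row["properties"]["filters"]) for row in rows
}
print(enabled_filters)
```

Because the keys are discovered at run time, a newly added filter would appear in the result without any change to the code.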

Update object field of element in array jsonb with postgres

I have the following jsonb column, named data, in my SQL table.
{
"special_note": "Some very long special note",
"extension_conditions": [
{
"condition_id": "5bfb8b8d-3a34-4cc3-9152-14139953aedb",
"condition_type": "OPTION_ONE"
},
{
"condition_id": "fbb60052-806b-4ae0-88ca-4b1a7d8ccd97",
"condition_type": "OPTION_TWO"
}
],
"floor_drawings_file": "137c3ec3-f078-44bb-996e-161da8e20f2b"
}
What I need to do is update the condition_type field of every object in the extension_conditions array from OPTION_ONE to MARKET_PRICE, leaving OPTION_TWO unchanged.
Note that the extension_conditions array is optional, so I need to filter out rows where extension_conditions is null.
I need a query that updates the jsonb column in all rows of this table according to the rules above.
Thanks in advance!
You can use a statement like the following, which calls JSONB_SET() after determining the position (index) of the related element within the array:
WITH j AS
(
SELECT ('{extension_conditions,'||idx-1||',condition_type}')::TEXT[] AS path, j
FROM tab
CROSS JOIN JSONB_ARRAY_ELEMENTS(data->'extension_conditions')
WITH ORDINALITY arr(j,idx)
WHERE j->>'condition_type'='OPTION_ONE'
)
UPDATE tab
SET data = JSONB_SET(data,j.path,'"MARKET_PRICE"',false)
FROM j
Update: to update multiple elements within the array, the following query with a nested JSONB_SET() may be preferable:
UPDATE tab
SET data =
(
SELECT JSONB_SET(data,'{extension_conditions}',
JSONB_AGG(CASE WHEN j->>'condition_type' = 'OPTION_ONE'
THEN JSONB_SET(j, '{condition_type}', '"MARKET_PRICE"')
ELSE j
END))
FROM JSONB_ARRAY_ELEMENTS(data->'extension_conditions') AS j
)
WHERE data @> '{"extension_conditions": [{"condition_type": "OPTION_ONE"}]}';
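The effect of the second statement on a single row can be modelled in Python as follows. This is only a sketch for checking the logic, with a hypothetical function name (rename_condition_type) and shortened sample values: every matching element of the array is rewritten, other elements and keys are kept, and rows without the optional array are left alone.

```python
import copy

def rename_condition_type(data, old, new):
    """Model of the JSONB_AGG(CASE ... JSONB_SET ...) update for one row."""
    if not data.get("extension_conditions"):
        return data                               # array absent or null: row is skipped
    result = copy.deepcopy(data)
    for cond in result["extension_conditions"]:
        if cond.get("condition_type") == old:     # CASE WHEN ... THEN JSONB_SET(...)
            cond["condition_type"] = new
    return result                                 # ELSE branch: element kept as-is

row = {
    "special_note": "Some very long special note",
    "extension_conditions": [
        {"condition_id": "5bfb8b8d", "condition_type": "OPTION_ONE"},
        {"condition_id": "fbb60052", "condition_type": "OPTION_TWO"},
    ],
}
updated_row = rename_condition_type(row, "OPTION_ONE", "MARKET_PRICE")
print(updated_row["extension_conditions"])
```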

Using Postgres JSON Functions on table columns

I have searched extensively (in Postgres docs and on Google and SO) to find examples of JSON functions being used on actual JSON columns in a table.
Here's my problem: I am trying to extract key values from an array of JSON objects in a column, using jsonb_to_recordset(), but get syntax errors. When I pass the object literally to the function, it works fine:
Passing JSON literally:
select *
from jsonb_to_recordset('[
{ "id": 0, "name": "400MB-PDF.pdf", "extension": ".pdf",
"transferId": "ap31fcoqcajjuqml6rng"},
{ "id": 0, "name": "1000MB-PDF.pdf", "extension": ".pdf",
"transferId": "ap31fcoqcajjuqml6rng"}
]') as f(name text);
results in:
400MB-PDF.pdf
1000MB-PDF.pdf
It extracts the value of the key "name".
Here's the JSON in the column, being extracted using:
select journal.data::jsonb#>>'{context,data,files}'
from journal
where id = 'ap32bbofopvo7pjgo07g';
resulting in:
[ { "id": 0, "name": "400MB-PDF.pdf", "extension": ".pdf",
"transferId": "ap31fcoqcajjuqml6rng"},
{ "id": 0, "name": "1000MB-PDF.pdf", "extension": ".pdf",
"transferId": "ap31fcoqcajjuqml6rng"}
]
But when I try to pass jsonb#>>'{context,data,files}' to jsonb_to_recordset() like this:
select id,
journal.data::jsonb#>>::jsonb_to_recordset('{context,data,files}') as f(name text)
from journal
where id = 'ap32bbofopvo7pjgo07g';
I get a syntax error. I have tried different ways but each time it complains about a syntax error:
Version:
PostgreSQL 9.4.10 on x86_64-unknown-linux-gnu, compiled by gcc (Ubuntu 4.8.2-19ubuntu1) 4.8.2, 64-bit
The expressions after select must evaluate to a single value. Since jsonb_to_recordset returns a set of rows and columns, you can't use it there.
The solution is a cross join lateral, which allows you to expand one row into multiple rows using a function. That gives you single rows that select can act on. For example:
select *
from journal j
cross join lateral
jsonb_to_recordset(j.data#>'{context, data, files}') as d(id int, name text)
where j.id = 'ap32bbofopvo7pjgo07g'
Note that the #>> operator returns type text, and the #> operator returns type jsonb. As jsonb_to_recordset expects jsonb as its first parameter I'm using #>.
See it working at rextester.com
jsonb_to_recordset is a set-valued function and can only be invoked in specific places. The FROM clause is one such place, which is why your first example works, but the SELECT clause is not.
In order to turn your JSON array into a "table" that you can query, you need to use a lateral join. The effect is rather like a foreach loop on the source recordset, and that's where you apply the jsonb_to_recordset function. Here's a sample dataset:
create table jstuff (id int, val jsonb);
insert into jstuff
values
(1, '[{"outer": {"inner": "a"}}, {"outer": {"inner": "b"}}]'),
(2, '[{"outer": {"inner": "c"}}]');
A simple lateral join query:
select id, r.*
from jstuff
join lateral jsonb_to_recordset(val) as r("outer" jsonb) on true;
id | outer
----+----------------
1 | {"inner": "a"}
1 | {"inner": "b"}
2 | {"inner": "c"}
(3 rows)
That's the hard part. Note that you have to define what your new recordset looks like in the AS clause -- since each element in our val array is a JSON object with a single field named "outer", that's what we give it. If your array elements contain multiple fields you're interested in, you declare those in a similar manner. Be aware also that your JSON schema needs to be consistent: if an array element doesn't contain a key named "outer", the resulting value will be null.
From here, you just need to pull the specific value you need out of each JSON object using the traversal operator as you were. If I wanted only the "inner" value from the sample dataset, I would specify select id, r.outer->>'inner'. Since it's already JSONB, it doesn't require casting.
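The "foreach loop" comparison can be made concrete with a short Python sketch over the same sample dataset. It is an analogy only, not how Postgres executes the join, and the function name lateral_recordset is hypothetical: the outer loop walks the table rows, and the inner loop plays the role of jsonb_to_recordset() applied to each row's array.

```python
jstuff = [
    (1, [{"outer": {"inner": "a"}}, {"outer": {"inner": "b"}}]),
    (2, [{"outer": {"inner": "c"}}]),
]

def lateral_recordset(rows):
    """Expand each (id, array) row into one (id, outer) row per array element."""
    out = []
    for row_id, val in rows:            # the source recordset
        for elem in val:                # the lateral function call for this row
            # a missing "outer" key yields None, like NULL in the SQL result
            out.append((row_id, elem.get("outer")))
    return out

expanded = lateral_recordset(jstuff)
inner_values = [(row_id, outer["inner"]) for row_id, outer in expanded]
print(expanded)
print(inner_values)
```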

Update new column with part of JSON column

I have a json column titled 'classifiers' with data like this:
[ { "category": "Building & Trades", "type": "Services"
, "subcategory": "Construction" } ]
I would like to pull each element and insert into columns on the same row titled, for example, 'category', 'type' and 'subcategory'.
This query pulls out what I want, in this case 'category':
SELECT parts->'category' AS category
FROM (SELECT json_array_elements(classifiers) AS parts FROM <tablename>) AS more_parts
I can't figure out the 'WHERE' part in an UPDATE/SET/WHERE type of query, for example:
UPDATE <table>
SET category = (SELECT parts->'category' AS category
FROM (SELECT json_array_elements(classifiers) AS parts
FROM <tablename>
) AS more_parts
) WHERE ???
Without WHERE multiple rows are returned.
I would like to pull each element and insert into columns on the same
row titled, for example, 'category', 'type' and 'subcategory'.
Sounds like you really want this:
UPDATE tbl
SET category = classifiers->0->'category'
,type = classifiers->0->'type'
,subcategory = classifiers->0->'subcategory'
Updates all rows. Requires Postgres 9.3+.
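The first operator ->0 references the only object in the array (JSON array indexes start from 0, unlike Postgres arrays, which start from 1 by default).
The second operator ->'category' gets the field from the object. Use ->> instead if the target columns are of type text, so that the value is stored without the surrounding JSON quotes.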
Refer to the manual here.