Nesting jsonb in postgres without converting to jsonb[] - sql

I have a table with 2 columns: an index that holds the group number, and a column of jsonb data.
| Index | payload |
|----------------|----------------|
| 1 | {jsonb} |
| 1 | {jsonb} |
| 2 | {jsonb} |
| 2 | {jsonb} |
I then want to nest the payload into another jsonb, but it must not be an array.
Expected Output:
| Index | payload |
|----------------|----------------|
| 1 |{{jsonb},{jsonb}}|
| 2 |{{jsonb},{jsonb}}|
Actual Output:
| Index | payload |
|----------------|----------------|
| 1 |[{{jsonb},{jsonb}}]|
| 2 |[{{jsonb},{jsonb}}]|
SELECT index, jsonb_agg(payload) as "payload"
FROM table1
GROUP BY 1
ORDER BY 1
As you can see, the output does aggregate the rows into a single jsonb value, but it also converts it into an array. Is it possible to remove the array?

You can create your own aggregate that just appends the jsonb values, using jsonb_concat (the function behind the || operator) as the state transition function:
create aggregate jsonb_append_agg(jsonb)
(
    sfunc = jsonb_concat,
    stype = jsonb
);
Then you can do:
SELECT index, jsonb_append_agg(payload) as "payload"
FROM table1
GROUP BY 1
ORDER BY 1
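
A quick sanity check with inline values (a minimal sketch; note that jsonb_concat merges objects, so if two payloads share a key, the value from the last row merged wins):
SELECT jsonb_append_agg(payload) AS payload
FROM (VALUES
    ('{"a": 1}'::jsonb),
    ('{"b": 2}'::jsonb)
) v(payload);
-- {"a": 1, "b": 2}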


Update of value in array of jsonb returns error "invalid input syntax for type json"

I have a column of type jsonb which contains json arrays of the form
[
    {
        "Id": 62497,
        "Text": "BlaBla"
    }
]
I'd like to update the Id to the value of a column word_id (type uuid) from a different table word.
I tried this
update inflection_copy
SET inflectionlinks = s.json_array
FROM (
    SELECT jsonb_agg(
        CASE
            WHEN elems->>'Id' = (
                SELECT word_copy.id::text
                from word_copy
                where word_copy.id::text = elems->>'Id'
            ) THEN jsonb_set(
                elems,
                '{Id}'::text[],
                (
                    SELECT jsonb(word_copy.word_id::text)
                    from word_copy
                    where word_copy.id::text = elems->>'Id'
                )
            )
            ELSE elems
        END
    ) as json_array
    FROM inflection_copy,
        jsonb_array_elements(inflectionlinks) elems
) s;
Until now I always get the following error:
invalid input syntax for type json
DETAIL: Token "c66a4353" is invalid.
CONTEXT: JSON data, line 1: c66a4353...
The c66a4353 is part of one of the uuids in the word table. I don't understand why this is marked as invalid input.
EDIT:
To give an example of one of the uuids:
select to_jsonb(word_id::text) from word_copy limit(5);
returns
+----------------------------------------+
| to_jsonb |
|----------------------------------------|
| "078c979d-e479-4fce-b27c-d14087f467c2" |
| "ef288256-1599-4f0f-a932-aad85d666c9a" |
| "d1d95b60-623e-47cf-b770-de46b01042c5" |
| "f97464c6-b872-4be8-9d9d-83c0102fb26a" |
| "9bb19719-e014-4286-a2d1-4c0cf7f089fc" |
+----------------------------------------+
As requested the respective columns id and word_id from the word table:
+---------------------------------------------------+
| row |
|---------------------------------------------------|
| ('27733', '078c979d-e479-4fce-b27c-d14087f467c2') |
| ('72337', 'ef288256-1599-4f0f-a932-aad85d666c9a') |
| ('72340', 'd1d95b60-623e-47cf-b770-de46b01042c5') |
| ('27741', 'f97464c6-b872-4be8-9d9d-83c0102fb26a') |
| ('72338', '9bb19719-e014-4286-a2d1-4c0cf7f089fc') |
+---------------------------------------------------+
+----------------+----------+----------------------------+
| Column | Type | Modifiers |
|----------------+----------+----------------------------|
| id | bigint | |
| value | text | |
| homonymnumber | smallint | |
| pronounciation | text | |
| audio | text | |
| level | integer | |
| alpha | bigint | |
| frequency | bigint | |
| hanja | text | |
| typeeng | text | |
| typekr | text | |
| word_id | uuid | default gen_random_uuid() |
+----------------+----------+----------------------------+
The error comes from jsonb(word_copy.word_id::text): jsonb() parses its text argument as JSON, and a bare uuid is not valid JSON, whereas to_jsonb() converts the text into a JSON string. I would suggest you modify your subquery as follows:
update inflection_copy AS ic
SET inflectionlinks = s.json_array
FROM (
    SELECT jsonb_agg(
               CASE
                   WHEN wc.word_id IS NULL THEN e.elems
                   ELSE jsonb_set(e.elems, array['Id'], to_jsonb(wc.word_id::text))
               END
               ORDER BY e.id ASC
           ) AS json_array
    FROM inflection_copy AS ic
    CROSS JOIN LATERAL jsonb_path_query(ic.inflectionlinks, '$[*]') WITH ORDINALITY AS e(elems, id)
    LEFT JOIN word_copy AS wc
        ON wc.id::text = e.elems->>'Id'
) AS s
The LEFT JOIN clause will return wc.word_id = NULL when there is no wc.id corresponding to e.elems->>'Id', so that e.elems is left unchanged by the CASE.
The ORDER BY clause in the aggregate function jsonb_agg will ensure that the order is unchanged in the jsonb array.
jsonb_path_query is used instead of jsonb_array_elements so that no error is raised when ic.inflectionlinks is not a jsonb array; it runs in lax mode (the default behavior).
see the test result in dbfiddle
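To see the parsing-versus-wrapping difference that caused the original error, here is a minimal sketch using a uuid from the data above (jsonb() and the ::jsonb cast parse their input as JSON; to_jsonb() wraps it as a JSON string):
SELECT to_jsonb('078c979d-e479-4fce-b27c-d14087f467c2'::text);
-- "078c979d-e479-4fce-b27c-d14087f467c2"
SELECT '078c979d-e479-4fce-b27c-d14087f467c2'::jsonb;
-- ERROR:  invalid input syntax for type json
-- DETAIL: Token "078c979d" is invalid.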

Casting string to int i.e. the string "res"

I have a column in a table which is of type array<string>. The table is partitioned daily since 2018-01-01. At some point, the values in the array change from word strings to numeric strings. The data looks like this:
| yyyy_mm_dd | h_id | p_id | con |
|------------|-------|------|---------------|
| 2018-10-01 | 52988 | 1 | ["res", "av"] |
| 2018-10-02 | 52988 | 1 | ["1","2"] |
| 2018-10-03 | 52988 | 1 | ["1","2"] |
There is a mapping between the strings and integers: "res" maps to 1, "av" maps to 2, etc. I've written a query to perform some logic. Here is a snippet (subquery) of it:
SELECT
    t.yyyy_mm_dd,
    t.h_id,
    t.p_id,
    CAST(e.con AS INT) AS api
FROM
    my_table t
LATERAL VIEW EXPLODE(con) e AS con
My problem is that this doesn't work for the earlier dates, when strings were used instead of integers. Is there any way to select con and remap the strings to integers so the data is consistent across all partitions?
Expected output:
| yyyy_mm_dd | h_id | p_id | con |
|------------|-------|------|---------------|
| 2018-10-01 | 52988 | 1 | ["1","2"] |
| 2018-10-02 | 52988 | 1 | ["1","2"] |
| 2018-10-03 | 52988 | 1 | ["1","2"] |
Once the selected values are all integers (within a string array), the CAST(e.con AS INT) will work.
Edit: To clarify, I will put the solution as a subquery before I use lateral view explode. This way I am exploding on a table where all partitions have integers in con. I hope this makes sense.
CAST(e.api as INT) returns NULL if the cast is not possible. collect_list will collect an array including duplicates and without NULLs. If you need an array without duplicate elements, use collect_set() instead.
SELECT
    t.yyyy_mm_dd,
    t.h_id,
    t.p_id,
    collect_list( -- array of integers
        -- cast the CASE result as string if you need an array of strings
        CASE WHEN e.api = 'res' THEN 1
             WHEN e.api = 'av' THEN 2
             -- add more cases
             ELSE CAST(e.api as INT)
        END
    ) as con
FROM
    my_table t
LATERAL VIEW EXPLODE(con) e AS api
GROUP BY t.yyyy_mm_dd, t.h_id, t.p_id
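
Following the edit above, the remap can sit in a subquery so the final explode always sees integers; a hedged sketch reusing the question's table and column names:
SELECT
    s.yyyy_mm_dd,
    s.h_id,
    s.p_id,
    e2.api
FROM (
    -- the remapping query from above, producing con as an array of integers
    SELECT
        t.yyyy_mm_dd,
        t.h_id,
        t.p_id,
        collect_list(
            CASE WHEN e.api = 'res' THEN 1
                 WHEN e.api = 'av' THEN 2
                 ELSE CAST(e.api as INT)
            END
        ) as con
    FROM my_table t
    LATERAL VIEW EXPLODE(con) e AS api
    GROUP BY t.yyyy_mm_dd, t.h_id, t.p_id
) s
LATERAL VIEW EXPLODE(s.con) e2 AS api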

Find max value from column that has a json object with key-value pairs

I have a table with a column holding a JSON string (key-value pairs) of items. I want to return only the key-value pair with the largest value.
I can do this by first UNNESTing the JSON object, then taking the largest value with ORDER BY item, value (DESC) and array_agg. The problem is that this means creating multiple intermediate tables and is slow. I am hoping to extract the largest key-value pair in one operation.
This:
| id | items |
| -- | ---------------------------------- |
| 1 | {Item1=7.3, Item2=1.3, Item3=9.8} |
| 2 | {Item2=4.4, Item3=5.2, Item1=0.1} |
| 3 | {Item5=6.6, Item2=1.4, Item4=1.5} |
| 4 | {Item6=0.9, Item7=11.2, Item4=8.1} |
Should become:
| id | item | value |
| -- | ----- | ----- |
| 1 | Item3 | 9.8 |
| 2 | Item3 | 5.2 |
| 3 | Item5 | 6.6 |
| 4 | Item7 | 11.2 |
I don't actually need the value, as long as the item is the one with the largest value in the JSON object, so the following would be fine as well:
| id | item |
| -- | ----- |
| 1 | Item3 |
| 2 | Item3 |
| 3 | Item5 |
| 4 | Item7 |
Presto's UNNEST performance was improved in Presto 316. However, you don't need UNNEST in this case.
You can:
- convert your JSON to an array of key/value pairs using a CAST to map and map_entries
- reduce the array to pick the key with the highest value
Since the key/value pairs are represented as anonymous row elements, it is convenient to use positional access to row elements with the subscript operator (available since Presto 314).
Use a query like:
SELECT
id,
reduce(
-- convert JSON to array of key/value pairs
map_entries(CAST(data AS map(varchar, double))),
-- initial state for reduce (must be same type as key/value pairs)
(CAST(NULL AS varchar), -1e0), -- assuming your values cannot be negative
-- reduction function
(state, element) -> if(state[2] > element[2], state, element),
-- reduce output function
state -> state[1]
) AS top
FROM (VALUES
(1, JSON '{"Item1":7.3, "Item2":1.3, "Item3":9.8}'),
(4, JSON '{"Item6":0.9, "Item7":11.2, "Item4":8.1}'),
(5, JSON '{}'),
(6, NULL)
) t(id, data);
Output
id | top
----+-------
1 | Item3
4 | Item7
5 | NULL
6 | NULL
(4 rows)
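If you do want the value as well, the reduce output function can return the whole pair instead of just the key; a sketch, otherwise identical to the query above:
SELECT
    id,
    reduce(
        map_entries(CAST(data AS map(varchar, double))),
        (CAST(NULL AS varchar), -1e0),
        (state, element) -> if(state[2] > element[2], state, element),
        -- keep the whole (key, value) row; top_pair[1] is the item, top_pair[2] the value
        state -> state
    ) AS top_pair
FROM (VALUES
    (1, JSON '{"Item1":7.3, "Item2":1.3, "Item3":9.8}')
) t(id, data);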
Store the values one per row in a child table.
CREATE TABLE child (
id INT NOT NULL,
item VARCHAR(6) NOT NULL,
value DECIMAL(9,1),
PRIMARY KEY (id, item)
);
You don't have to do a join to find the largest per group, just use a window function:
WITH cte AS (
    SELECT id, item, value,
           ROW_NUMBER() OVER (PARTITION BY id ORDER BY value DESC) AS rownum
    FROM child
)
SELECT id, item, value FROM cte WHERE rownum = 1;
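Loaded with the first two rows of the question's data, for example (hypothetical inserts), the query returns one row per id:
INSERT INTO child (id, item, value) VALUES
    (1, 'Item1', 7.3), (1, 'Item2', 1.3), (1, 'Item3', 9.8),
    (2, 'Item2', 4.4), (2, 'Item3', 5.2), (2, 'Item1', 0.1);
-- the query above then yields (1, 'Item3', 9.8) and (2, 'Item3', 5.2)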
Solving this with JSON is a bad idea. It makes your table denormalized, it makes the queries harder to design, and I predict it will make the query performance worse.

How can I get the last element and length of a bigint[] in postgresql

I have a table in postgresql like:
Column | Type | Modifiers
--------+----------+-----------
id | bigint | not null
nodes | bigint[] | not null
tags | text[] |
And some example data:
id | nodes
----------+------------
25373389 | {1}
25373582 | {1,2,3,2,6}
25373585 | {1,276,3,2}
I want to get the last element and the length of nodes, so the expected result is:
id | nodes | last | length
----------+------------+------------+------------
25373389 | {1} | 1 | 1
25373582 | {1,2,3,2,6}| 6 | 5
25373585 | {1,276,3,2}| 2 | 4
I use the following code to get the first element, but I can't use nodes[-1] to get the last element.
select id,nodes[1] from table limit 3;
How can I get it? Thanks in advance.
Use array_length:
select id,
       nodes,
       nodes[array_length(nodes, 1)] as last,
       array_length(nodes, 1) as length
from t;
Demo
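For one-dimensional arrays, cardinality() returns the same length, and array_upper() also works since the default lower bound is 1; a sketch against the same demo table t:
select id,
       nodes,
       nodes[cardinality(nodes)] as last,
       cardinality(nodes) as length
from t;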

Can I sum an array of jsonb in Postgresql with dynamic keys in a select statement?

I have a jsonb object in postgres:
[{"a": 1, "b":5}, {"a":2, "c":3}]
I would like to get an aggregate sum per unique key:
{"a":3, "b":5, "c":3}
The keys are unpredictable.
Is it possible to do this in Postgres with a select statement?
Query:
SELECT key, SUM(value::INTEGER)
FROM (
SELECT (JSONB_EACH_TEXT(j)).*
FROM JSONB_ARRAY_ELEMENTS('[{"a": 1, "b":5}, {"a":2, "c":3}]') j
) j
GROUP BY key
ORDER BY key;
Results:
| key | sum |
| --- | --- |
| a | 3 |
| b | 5 |
| c | 3 |
DB Fiddle
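
Since the expected output in the question is a single object rather than rows, the same query can be wrapped in jsonb_object_agg (a sketch using the same literal):
SELECT JSONB_OBJECT_AGG(key, total) AS totals
FROM (
    SELECT key, SUM(value::INTEGER) AS total
    FROM (
        SELECT (JSONB_EACH_TEXT(j)).*
        FROM JSONB_ARRAY_ELEMENTS('[{"a": 1, "b":5}, {"a":2, "c":3}]') j
    ) e
    GROUP BY key
) s;
-- {"a": 3, "b": 5, "c": 3}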