postgres how to add a key to dicts in a jsonb array - sql

I have an array of dicts in a jsonb column. I have to update and add a key to all the dicts in this array. Can this be done in a single update statement?
Jsonb column:
select '[{"a":"val1"}, {"b":"val2"}, {"c":"val3"}]'::jsonb;
How do I update it to:
[
  {
    "a": "val1",
    "x": "xval1"
  },
  {
    "b": "val2",
    "x": "xval2"
  },
  {
    "c": "val3",
    "x": "xval3"
  }
]

First, the jsonb_array_elements_text() function can be used to unnest the elements of the jsonb array, and then regexp_replace() can be applied within the subquery to derive new jsonb objects sharing the common key ("x").
In the next step, the replace() function together with jsonb_agg() yields the desired result, as in the following query:
select id,
       jsonb_agg(
         (replace(jj.value, '}', ',') ||
          replace(jsonb_set(value2::jsonb, '{x}',
                            ('"x' || (jj.value2::jsonb ->> 'x')::text || '"')::jsonb)::text,
                  '{', ''))::jsonb
       ) as result
from
(
  select t.id, j.value, regexp_replace(j.value, '[[:alpha:]]+', 'x') as value2
  from t
  cross join jsonb_array_elements_text(jsdata) j
) jj
group by id;
Demo
Indeed, the '[[:alpha:]]' pattern alone would be enough for regexp_replace(); the plus sign is added for cases where the data has keys consisting of more than one letter.
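To see what the intermediate value2 looks like, here is a minimal sketch of the rewrite step on a single element (the literal is taken from the sample data; regexp_replace() without the 'g' flag replaces only the first match, i.e. the key):
select regexp_replace('{"a": "val1"}', '[[:alpha:]]+', 'x');
-- {"x": "val1"}, which jsonb_set() then turns into {"x": "xval1"}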

Assuming that your dicts have one and only one key:
update your_table set
  jsonb_col = (
    select jsonb_agg(
             v || jsonb_build_object(
                    'x',
                    'x' || (v ->> (select min(x) from jsonb_object_keys(v) as x))))
    from jsonb_array_elements(jsonb_col) as v);
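As a quick sanity check, the same expression can be run directly against the literal sample array (a minimal sketch; no table is needed):
select jsonb_agg(
         v || jsonb_build_object(
                'x',
                'x' || (v ->> (select min(x) from jsonb_object_keys(v) as x))))
from jsonb_array_elements('[{"a":"val1"}, {"b":"val2"}, {"c":"val3"}]'::jsonb) as v;
-- [{"a": "val1", "x": "xval1"}, {"b": "val2", "x": "xval2"}, {"c": "val3", "x": "xval3"}]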

Related

In Snowflake, how do I parse an unnamed JSON array and access each key with key value rather than array slicing methods?

I have a JSON array in Snowflake that I need to parse into a table, i.e., convert all of its information into one row. Assume my table includes two columns: id and json_tag.
An example row is as below:
json_tag:
[
  {
    "key": "app.name",
    "value": "myapp1"
  },
  {
    "key": "device.name",
    "value": "myiPhone11"
  },
  {
    "key": "iOS.dist",
    "value": "latestDist5"
  }
]
The structure is not fixed: another row can have 5 or even 6 objects, with new key names. An example is below:
json_tag:
[
  {
    "key": "app.name",
    "value": "myapp2"
  },
  {
    "key": "app.cost",
    "value": "$2.99"
  },
  {
    "key": "device.name",
    "value": "myiPhoneX"
  },
  {
    "key": "device.color",
    "value": "gold"
  },
  {
    "key": "iOS.dist",
    "value": "latestDist4.9"
  }
]
What I want is to work on the existing table (without creating a new one) so that each json row is split into columns as below:

| id | app.name | app.cost | device.name | device.color | iOS.dist |
| :- | :------- | :------- | :---------- | :----------- | :------- |
| 1 | myapp1 | null | myiPhone11 | null | latestDist5 |
| 2 | myapp2 | $2.99 | myiPhoneX | gold | latestDist4.9 |
I tried the following snippet:
with parsed_tb as (
  select id,
         to_variant(parse_json(json_tag)) as parsed_json_tag
  from mytable
)
select parsed_json_tag[0]:value::varchar as app_name,
       parsed_json_tag[1]:value::varchar as app_cost,
       parsed_json_tag[2]:value::varchar as device_name
from parsed_tb;
As you can imagine, the snippet above does not work when there is no app.cost among the keys, or when rows differ in their number of keys and values.
I tried the lateral flatten command in Snowflake, but it creates many rows and I cannot figure out how to put them into columns in the same row. I also tried using the recursive command, but could not achieve it.
So my questions are:

1. How can I access a key by its name rather than slicing the array as I do above? This would solve my problem, I guess.
2. If the solution I imagine in #1 does not exist, how can I attain the table above?
I have found a solution to this problem with the help of the following thread:
Mysql, reshape data from long / tall to wide
The solution is to first parse the json column alongside a unique identifier, then lateral flatten, and then group the results by that unique identifier [reshaping the table from long to wide as given in the link above].
So for the examples I provided above, the solution would be as follows:
with tb as (
  select id,
         to_variant(parse_json(json_tag)) as parsed_json_tags
  from mytable
),
flattened as (
  select id,
         VALUE:value::string as value,
         VALUE:key::string as key
  from tb, lateral flatten(input => tb.parsed_json_tags)
)
select id,
       MAX( IFF( key = 'app.name',     value, NULL ) ) AS app_name,
       MAX( IFF( key = 'app.cost',     value, NULL ) ) AS app_cost,
       MAX( IFF( key = 'device.name',  value, NULL ) ) AS device_name,
       MAX( IFF( key = 'device.color', value, NULL ) ) AS device_color,
       MAX( IFF( key = 'iOS.dist',     value, NULL ) ) AS ios_dist
from flattened
group by 1;
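Note that each additional key needs its own MAX(IFF(...)) line: the column list of a static SQL statement is fixed, so for an open-ended set of keys the query text itself would have to be generated dynamically (for example from a stored procedure that builds the column list first).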

How to iterate over PostgreSQL jsonb array of objects and modify elements?

Given jsonb array and PostgreSQL 12:
[{"data":"42","type":"TEMPERATURE"},{"data":"1.1","type":"PRESSURE"}]
Need to convert it to:
[{"data":"42","type":"temperature"},{"data":"1.1","type":"pressure"}]
Is it possible somehow to iterate over jsonb array and downcase only "type" values?
I tried:
SELECT jsonb_agg(
jsonb_build_object(k, CASE WHEN k <> 'type' THEN v ELSE lower(v::text)::jsonb END)
)
FROM jsonb_array_elements(
'[{"data":"42","type":"TEMPERATURE"},{"data":"1.1","type":"PRESSURE"}]'::jsonb
) e(e), lateral jsonb_each(e) p(k, v)
but it separates the data and type pairs into separate elements.
[{"data": "42"}, {"type": "temperature"}, {"data": "1.1"}, {"type": "pressure"}]
You need an intermediate level of nesting to rebuild the objects before you aggregate them into the array: for this, you can use a lateral join.
I would also recommend keeping track of the position of each object in the original array, so you can propagate it to the final result; with ordinality comes in handy.
SELECT jsonb_agg(x.obj order by e.n)
FROM jsonb_array_elements(
       '[{"data":"42","type":"TEMPERATURE"},{"data":"1.1","type":"PRESSURE"}]'::jsonb
     ) with ordinality e(e, n)
CROSS JOIN LATERAL (
  SELECT jsonb_object_agg(k, CASE WHEN k <> 'type' THEN v ELSE lower(v::text)::jsonb END)
  FROM jsonb_each(e) p(k, v)
) x(obj)
Demo on DB Fiddle:
| jsonb_agg |
| :--------------------------------------------------------------------------- |
| [{"data": "42", "type": "temperature"}, {"data": "1.1", "type": "pressure"}] |

Select from JSON Array postgresql JSON column

I have the following JSON stored in a PostgreSQL JSON column
{
"status": "Success",
"message": "",
"data": {
"serverIp": "XXXX",
"ruleId": 32321,
"results": [
{
"versionId": 555555,
"PriceID": "8abf35ec-3e0e-466b-a4e5-2af568e90eec",
"price": 550,
"Convert": 0.8922953080331764,
"Cost": 10
}
]
}
}
I would like to search for a specific PriceID across the entire JSON column (named info) and select the entire results element by that PriceID.
How do I do that in PostgreSQL?
One option uses exists and json(b)_array_elements(). Assuming that your table is called mytable and that the jsonb column is mycol, this would look like:
select t.*
from mytable t
where exists (
  select 1
  from jsonb_array_elements(t.mycol -> 'data' -> 'results') x(elt)
  where x.elt ->> 'PriceID' = '8abf35ec-3e0e-466b-a4e5-2af568e90eec'
)
In the subquery, jsonb_array_elements() unnests the json array located at the given path. Then, the where clause ensures that at least one element in the array has the given PriceID.
If your data is of json datatype rather than jsonb, you need to use json_array_elements() instead of jsonb_array_elements().
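For a json column, that query would look like this (a sketch, with the same assumed table and column names):
select t.*
from mytable t
where exists (
  select 1
  from json_array_elements(t.mycol -> 'data' -> 'results') x(elt)
  where x.elt ->> 'PriceID' = '8abf35ec-3e0e-466b-a4e5-2af568e90eec'
)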
If you want to display some information coming from the matching array element, then it is different. You can use a lateral join instead of exists. Keep in mind, though, that this will duplicate the rows if more than one array element matches:
select t.*, x.elt ->> 'price' price
from mytable t
cross join lateral jsonb_array_elements(t.mycol -> 'data' -> 'results') x(elt)
where x.elt ->> 'PriceID' = '8abf35ec-3e0e-466b-a4e5-2af568e90eec'
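On PostgreSQL 12 or later, jsonb_path_exists() offers a more compact way to write the existence check (a sketch, assuming the same table and column names; cast with mycol::jsonb if the column is of json type):
select *
from mytable
where jsonb_path_exists(
        mycol,
        '$.data.results[*] ? (@.PriceID == "8abf35ec-3e0e-466b-a4e5-2af568e90eec")');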

PostgreSQL: exclude complete jsonb array if one element fails the WHERE clause

Assume a table json_table with columns id (int), data (jsonb).
A sample jsonb value would be
{"a": [{"b":{"c": "xxx", "d": 1}},{"b":{"c": "xxx", "d": 2}}]}
When I use an SQL statement like the following:
SELECT data
FROM json_table j, jsonb_array_elements(j.data #> '{a}') dt
WHERE (dt #>> '{b,d}')::integer NOT IN (2, 4, 6, 9)
GROUP BY id;
... the two array elements are unnested, and the row is still returned because one of them satisfies the WHERE clause. This makes sense, since each array element is considered individually. In this example I will get back the complete row
{"a": [{"b":{"c": "xxx", "d": 1}},{"b":{"c": "xxx", "d": 2}}]}
I'm looking for a way to exclude the complete json_table row when any jsonb array element fails the condition.
You can move the condition to the WHERE clause and use NOT EXISTS:
SELECT data
FROM json_table j
WHERE NOT EXISTS (SELECT 1
                  FROM jsonb_array_elements(j.data #> '{a}') dt
                  WHERE (dt #>> '{b,d}')::integer IN (2, 4, 6, 9)
                 );
You can achieve it with the following query:
select data
from json_table
where jsonb_path_match(data, '!exists($.a[*].b.d ? (@ == 2 || @ == 4 || @ == 6 || @ == 9))')
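As a quick check against the sample value (a sketch), the predicate evaluates to false for a row containing d = 2, so that row is excluded:
select jsonb_path_match(
         '{"a": [{"b":{"c": "xxx", "d": 1}},{"b":{"c": "xxx", "d": 2}}]}'::jsonb,
         '!exists($.a[*].b.d ? (@ == 2 || @ == 4 || @ == 6 || @ == 9))');
-- false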

BigQuery select except double nested column

I am trying to remove a column from a BigQuery table and I've followed the instructions as stated here:
https://cloud.google.com/bigquery/docs/manually-changing-schemas#deleting_a_column_from_a_table_schema
This did not work directly, as the column I'm trying to remove is nested two levels deep in a struct. The following SO questions are relevant, but none of them solve this exact case.
Single nested field:
BigQuery select * except nested column
Double nested field (solution has all fields in the schema enumerated, which is not useful for me as my schema is huge):
BigQuery: select * replace from multiple nested column
I've tried adapting the above solutions and I think I'm close but can't quite get it to work.
This one will remove the field, but it returns only the nested field, not the whole table (for the examples, I want to remove a.b.field_name; see the example schema at the end):
SELECT AS STRUCT * EXCEPT(a), a.* REPLACE (
  (SELECT AS STRUCT a.b.* EXCEPT (field_name)) AS b
)
FROM `table`
This next attempt gives me an error: Scalar subquery produced more than one element:
WITH a_tmp AS (
  SELECT AS STRUCT a.* REPLACE (
    (SELECT AS STRUCT a.b.* EXCEPT (field_name)) AS b
  )
  FROM `table`
)
SELECT * REPLACE (
  (SELECT AS STRUCT a.* FROM a_tmp) AS a
)
FROM `table`
Is there a generalised way to solve this? Or am I forced to use the enumerated solution in the 2nd link?
Example Schema:
[
  {
    "name": "a",
    "type": "RECORD",
    "fields": [
      {
        "name": "b",
        "type": "RECORD",
        "fields": [
          {
            "name": "field_name",
            "type": "STRING"
          },
          {
            "name": "other_field_name",
            "type": "STRING"
          }
        ]
      }
    ]
  }
]
I would like the final schema to be the same but without field_name.
Below is for BigQuery Standard SQL
#standardSQL
SELECT * REPLACE(
  (SELECT AS STRUCT (SELECT AS STRUCT a.b.* EXCEPT (field_name)) AS b) AS a
)
FROM `project.dataset.table`
You can test and play with it using the dummy data below:
#standardSQL
WITH `project.dataset.table` AS (
  SELECT STRUCT<b STRUCT<field_name STRING, other_field_name STRING>>(STRUCT('1', '2')) AS a
)
SELECT * REPLACE(
  (SELECT AS STRUCT (SELECT AS STRUCT a.b.* EXCEPT (field_name)) AS b) AS a
)
FROM `project.dataset.table`
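With the dummy row above, the result keeps the same shape minus the removed leaf: a struct a whose b field contains only other_field_name = '2', matching the desired final schema.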