SQL-style GROUP BY aggregate functions in jq (COUNT, SUM, etc.)

Similar questions asked here before:
Count items for a single key: jq count the number of items in json by a specific key
Calculate the sum of object values: How do I sum the values in an array of maps in jq?
Question
How can the COUNT aggregate function be emulated in jq so that it behaves like its SQL original? Let's extend the question to cover other common SQL aggregate functions:
COUNT
SUM / MAX/ MIN / AVG
ARRAY_AGG
The last one is not a standard SQL function - it's from PostgreSQL but is quite useful.
The input is a stream of valid JSON objects. For demonstration, let's pick a simple story of owners and their pets.
Model and data
Base relation: Owner
id name age
1 Adams 25
2 Baker 55
3 Clark 40
4 Davis 31
Base relation: Pet
id name litter owner_id
10 Bella 4 1
20 Lucy 2 1
30 Daisy 3 2
40 Molly 4 3
50 Lola 2 4
60 Sadie 4 4
70 Luna 3 4
Source
From the above we get a derived relation Owner_Pet (the result of an SQL JOIN of the two base relations), presented in JSON format as the source data for our jq queries:
{ "owner_id": 1, "owner": "Adams", "age": 25, "pet_id": 10, "pet": "Bella", "litter": 4 }
{ "owner_id": 1, "owner": "Adams", "age": 25, "pet_id": 20, "pet": "Lucy", "litter": 2 }
{ "owner_id": 2, "owner": "Baker", "age": 55, "pet_id": 30, "pet": "Daisy", "litter": 3 }
{ "owner_id": 3, "owner": "Clark", "age": 40, "pet_id": 40, "pet": "Molly", "litter": 4 }
{ "owner_id": 4, "owner": "Davis", "age": 31, "pet_id": 50, "pet": "Lola", "litter": 2 }
{ "owner_id": 4, "owner": "Davis", "age": 31, "pet_id": 60, "pet": "Sadie", "litter": 4 }
{ "owner_id": 4, "owner": "Davis", "age": 31, "pet_id": 70, "pet": "Luna", "litter": 3 }
Requests
Here are sample requests and their expected output:
COUNT the number of pets per owner:
{ "owner_id": 1, "owner": "Adams", "age": 25, "pets_count": 2 }
{ "owner_id": 2, "owner": "Baker", "age": 55, "pets_count": 1 }
{ "owner_id": 3, "owner": "Clark", "age": 40, "pets_count": 1 }
{ "owner_id": 4, "owner": "Davis", "age": 31, "pets_count": 3 }
SUM up the number of whelps per owner and get their MAX (MIN/AVG):
{ "owner_id": 1, "owner": "Adams", "age": 25, "litter_total": 6, "litter_max": 4 }
{ "owner_id": 2, "owner": "Baker", "age": 55, "litter_total": 3, "litter_max": 3 }
{ "owner_id": 3, "owner": "Clark", "age": 40, "litter_total": 4, "litter_max": 4 }
{ "owner_id": 4, "owner": "Davis", "age": 31, "litter_total": 9, "litter_max": 4 }
ARRAY_AGG pets per owner:
{ "owner_id": 1, "owner": "Adams", "age": 25, "pets": [ "Bella", "Lucy" ] }
{ "owner_id": 2, "owner": "Baker", "age": 55, "pets": [ "Daisy" ] }
{ "owner_id": 3, "owner": "Clark", "age": 40, "pets": [ "Molly" ] }
{ "owner_id": 4, "owner": "Davis", "age": 31, "pets": [ "Lola", "Sadie", "Luna" ] }

Here's an alternative that uses only basic jq, without any custom functions. (I took the liberty of dropping the redundant parts of the question.)
Count
In> jq -s 'group_by(.owner_id) | map({ owner_id: .[0].owner_id, count: map(.pet) | length})'
Out> [{"owner_id": 1,"count": 2}, ...]
Sum
In> jq -s 'group_by(.owner_id) | map({owner_id: .[0].owner_id, sum: map(.litter) | add})'
Out> [{"owner_id": 1,"sum": 6}, ...]
Max
In> jq -s 'group_by(.owner_id) | map({owner_id: .[0].owner_id, max: map(.litter) | max})'
Out> [{"owner_id": 1,"max": 4}, ...]
Aggregate
In> jq -s 'group_by(.owner_id) | map({owner_id: .[0].owner_id, agg: map(.pet) })'
Out> [{"owner_id": 1,"agg": ["Bella","Lucy"]}, ...]
Sure, these might not be the most efficient implementations, but they show nicely how to build such aggregates oneself. All that changes between the variants is the expression inside the inner map and the function after the pipe | (length, add, max).
The outer map iterates over the groups, taking the owner_id from the first item of each group and using map again to iterate over the items of that group. Not as pretty as SQL, but not terribly more complicated.
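For readers who think more easily in a general-purpose language, the same "group, then reduce each group" pattern can be sketched in Python. This is only an illustration of the idea, not part of the jq answer: the helper name aggregate and the output key "value" are made up for this sketch, and the rows are copied from the question's source data.

```python
from itertools import groupby
from operator import itemgetter

# Source rows copied from the question's Owner_Pet relation.
ROWS = [
    {"owner_id": 1, "owner": "Adams", "age": 25, "pet_id": 10, "pet": "Bella", "litter": 4},
    {"owner_id": 1, "owner": "Adams", "age": 25, "pet_id": 20, "pet": "Lucy", "litter": 2},
    {"owner_id": 2, "owner": "Baker", "age": 55, "pet_id": 30, "pet": "Daisy", "litter": 3},
    {"owner_id": 3, "owner": "Clark", "age": 40, "pet_id": 40, "pet": "Molly", "litter": 4},
    {"owner_id": 4, "owner": "Davis", "age": 31, "pet_id": 50, "pet": "Lola", "litter": 2},
    {"owner_id": 4, "owner": "Davis", "age": 31, "pet_id": 60, "pet": "Sadie", "litter": 4},
    {"owner_id": 4, "owner": "Davis", "age": 31, "pet_id": 70, "pet": "Luna", "litter": 3},
]


def aggregate(rows, key, field, reducer):
    """Group rows by `key`, then reduce the `field` values of each group.

    Hypothetical helper: `reducer` plays the role of length/add/max in jq.
    """
    # Like jq's group_by, sort by the key first so equal keys are adjacent.
    ordered = sorted(rows, key=itemgetter(key))
    return [
        {key: k, "value": reducer([row[field] for row in group])}
        for k, group in groupby(ordered, key=itemgetter(key))
    ]


counts = aggregate(ROWS, "owner_id", "pet", len)    # COUNT
sums = aggregate(ROWS, "owner_id", "litter", sum)   # SUM
maxes = aggregate(ROWS, "owner_id", "litter", max)  # MAX
pets = aggregate(ROWS, "owner_id", "pet", list)     # ARRAY_AGG
```

As in the jq one-liners, only the reducer changes between the four aggregates; the grouping step is identical.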
I learned JQ today, and managed to do this already, so this should be encouraging for anyone getting started. JQ is neither like sed nor like SQL, but not terribly hard either.

Extended jq solution:
Custom count() function:
jq -sc 'def count($k): group_by(.[$k])[] | length as $l | .[0]
| .pets_count = $l
| del(.pet_id, .pet, .litter);
count("owner_id")' source.data
The output:
{"owner_id":1,"owner":"Adams","age":25,"pets_count":2}
{"owner_id":2,"owner":"Baker","age":55,"pets_count":1}
{"owner_id":3,"owner":"Clark","age":40,"pets_count":1}
{"owner_id":4,"owner":"Davis","age":31,"pets_count":3}
Custom sum() function:
jq -sc 'def sum($k): group_by(.[$k])[] | map(.litter) as $litters | .[0]
| . + {litter_total: $litters | add, litter_max: $litters | max}
| del(.pet_id, .pet, .litter);
sum("owner_id")' source.data
The output:
{"owner_id":1,"owner":"Adams","age":25,"litter_total":6,"litter_max":4}
{"owner_id":2,"owner":"Baker","age":55,"litter_total":3,"litter_max":3}
{"owner_id":3,"owner":"Clark","age":40,"litter_total":4,"litter_max":4}
{"owner_id":4,"owner":"Davis","age":31,"litter_total":9,"litter_max":4}
Custom array_agg() function:
jq -sc 'def array_agg($k): group_by(.[$k])[] | map(.pet) as $pets | .[0]
| .pets = $pets | del(.pet_id, .pet, .litter);
array_agg("owner_id")' source.data
The output:
{"owner_id":1,"owner":"Adams","age":25,"pets":["Bella","Lucy"]}
{"owner_id":2,"owner":"Baker","age":55,"pets":["Daisy"]}
{"owner_id":3,"owner":"Clark","age":40,"pets":["Molly"]}
{"owner_id":4,"owner":"Davis","age":31,"pets":["Lola","Sadie","Luna"]}

This is a nice exercise, but SO is not a programming service, so I will focus here on some key concepts for generic solutions in jq that are efficient, even for very large collections.
GROUPS_BY
The key to efficiency here is avoiding the built-in group_by, as it requires sorting. Since jq is fundamentally stream-oriented, the following definition of GROUPS_BY is likewise stream-oriented. It takes advantage of the efficiency of key-based lookups, while avoiding calling tojson on strings:
# emit a stream of the groups defined by f
def GROUPS_BY(stream; f):
reduce stream as $x ({};
($x|f) as $s
| ($s|type) as $t
| (if $t == "string" then $s else ($s|tojson) end) as $y
| .[$t][$y] += [$x] )
| .[][] ;
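The idea behind GROUPS_BY (bucket each item under a lookup key instead of sorting) can be sketched in Python as follows. This is an illustrative analogue under stated assumptions, not a translation of the jq internals: the function name groups_by, the use of json.dumps in place of tojson, and the (type name, key) tuple are choices made for this sketch. Python dicts keep insertion order, so the groups come out in first-seen order.

```python
import json


def groups_by(stream, f):
    """Yield lists of items that share the same f(item), without sorting."""
    buckets = {}
    for x in stream:
        k = f(x)
        # Strings are used as-is; other keys are serialized to text.
        # The type name keeps e.g. the number 1 and the string "1" apart.
        tag = type(k).__name__
        y = k if isinstance(k, str) else json.dumps(k, sort_keys=True)
        buckets.setdefault((tag, y), []).append(x)
    yield from buckets.values()


# Usage: group a few hypothetical rows by owner_id.
sample = [
    {"owner_id": 1, "pet": "Bella"},
    {"owner_id": 4, "pet": "Lola"},
    {"owner_id": 1, "pet": "Lucy"},
]
grouped = list(groups_by(sample, lambda row: row["owner_id"]))
```

Each item is visited once and placed with a hash lookup, which is the property the answer relies on for large inputs.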
distinct and count_distinct
# Emit an array of the distinct entities in `stream`, without sorting
def distinct(stream):
reduce stream as $x ({};
($x|type) as $t
| (if $t == "string" then $x else ($x|tojson) end) as $y
| if (.[$t] | has($y)) then . else .[$t][$y] += [$x] end )
| [.[][]] | add ;
# Emit the number of distinct items in the given stream
def count_distinct(stream):
def sum(s): reduce s as $x (0;.+$x);
reduce stream as $x ({};
($x|type) as $t
| (if $t == "string" then $x else ($x|tojson) end) as $y
| .[$t][$y] = 1 )
| sum( .[][] ) ;
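The two helpers above follow the same lookup-key trick. A rough Python counterpart, assuming the hypothetical names _key, distinct and count_distinct and using json.dumps where the jq code uses tojson, might look like this:

```python
import json


def _key(x):
    """Build a hashable lookup key: (type name, text form of the value)."""
    text = x if isinstance(x, str) else json.dumps(x, sort_keys=True)
    return (type(x).__name__, text)


def distinct(stream):
    """Return the distinct items of `stream` in first-seen order, no sorting."""
    seen = {}
    for x in stream:
        seen.setdefault(_key(x), x)  # keep only the first occurrence
    return list(seen.values())


def count_distinct(stream):
    """Return the number of distinct items in `stream`."""
    return len({_key(x) for x in stream})
```

For example, distinct(["Lola", "Sadie", "Lola", "Luna"]) keeps the three names in their original order, and count_distinct([50, 60, 70, 50]) counts three pet ids. Unlike the jq version, this sketch returns an empty list rather than null for an empty stream.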
Convenience function
def owner: {owner_id,owner,age};
Example: "COUNT the number of pets per owner"
GROUPS_BY(inputs; .owner_id)
| (.[0] | owner) + {pets_count: count_distinct(.[]|.pet_id)}
Invocation: jq -nc -f program1.jq input.json
Output:
{"owner_id":1,"owner":"Adams","age":25,"pets_count":2}
{"owner_id":2,"owner":"Baker","age":55,"pets_count":1}
{"owner_id":3,"owner":"Clark","age":40,"pets_count":1}
{"owner_id":4,"owner":"Davis","age":31,"pets_count":3}
Example: "SUM up the number of whelps per owner and get their MAX"
GROUPS_BY(inputs; .owner_id)
| (.[0] | owner)
+ {litter_total: (map(.litter) | add)}
+ {litter_max: (map(.litter) | max)}
Invocation: jq -nc -f program2.jq input.json
Output: as given.
Example: "ARRAY_AGG pets per owner"
GROUPS_BY(inputs; .owner_id)
| (.[0] | owner) + {pets: distinct(.[]|.pet)}
Invocation: jq -nc -f program3.jq input.json
Output:
{"owner_id":1,"owner":"Adams","age":25,"pets":["Bella","Lucy"]}
{"owner_id":2,"owner":"Baker","age":55,"pets":["Daisy"]}
{"owner_id":3,"owner":"Clark","age":40,"pets":["Molly"]}
{"owner_id":4,"owner":"Davis","age":31,"pets":["Lola","Sadie","Luna"]}

Related

MariaDB extract json nested data

I have the following SQL query output and am trying to extract a nested JSON field.
*************************** 2. row ***************************
created_at: 2023-01-05 14:25:52
updated_at: 2023-01-05 14:26:02
deleted_at: NULL
deleted: 0
id: 2
instance_uuid: ef6380b4-5455-48f8-9e4b-3d04199be3f5
numa_topology: NULL
pci_requests: []
flavor: {"cur": {"nova_object.name": "Flavor", "nova_object.namespace": "nova", "nova_object.version": "1.2", "nova_object.data": {"id": 2, "name": "tempest2", "memory_mb": 512, "vcpus": 1, "root_gb": 1, "ephemeral_gb": 0, "flavorid": "202", "swap": 0, "rxtx_factor": 1.0, "vcpu_weight": 0, "disabled": false, "is_public": true, "extra_specs": {}, "description": null, "created_at": "2023-01-05T05:30:36Z", "updated_at": null, "deleted_at": null, "deleted": false}}, "old": null, "new": null}
vcpu_model: {"nova_object.name": "VirtCPUModel", "nova_object.namespace": "nova", "nova_object.version": "1.0", "nova_object.data": {"arch": null, "vendor": null, "topology": {"nova_object.name": "VirtCPUTopology", "nova_object.namespace": "nova", "nova_object.version": "1.0", "nova_object.data": {"sockets": 1, "cores": 1, "threads": 1}, "nova_object.changes": ["cores", "threads", "sockets"]}, "features": [], "mode": "host-model", "model": null, "match": "exact"}, "nova_object.changes": ["mode", "model", "vendor", "features", "topology", "arch", "match"]}
migration_context: NULL
keypairs: {"nova_object.name": "KeyPairList", "nova_object.namespace": "nova", "nova_object.version": "1.3", "nova_object.data": {"objects": []}}
device_metadata: NULL
trusted_certs: NULL
vpmems: NULL
resources: NULL
The flavor column contains JSON data, and I am trying to extract the "name": "tempest2" value, but it is nested, so I cannot find a way to extract it.
Here is my query; how do I remove the [] square brackets from the value?
MariaDB [nova]> select uuid, instances.created_at, instances.deleted_at, json_extract(flavor, '$.cur.*.name') AS FLAVOR from instances join instance_extra on instances.uuid = instance_extra.instance_uuid;
+--------------------------------------+---------------------+---------------------+--------------+
| uuid | created_at | deleted_at | FLAVOR |
+--------------------------------------+---------------------+---------------------+--------------+
| edb0facb-3353-4848-82e2-f12701a0a3aa | 2023-01-05 05:37:13 | 2023-01-05 05:37:49 | ["tempest1"] |
| ef6380b4-5455-48f8-9e4b-3d04199be3f5 | 2023-01-05 14:25:51 | NULL | ["tempest2"] |
+--------------------------------------+---------------------+---------------------+--------------+
#Update
This is the MariaDB version I have
MariaDB [nova]> SELECT VERSION();
+-------------------------------------------+
| VERSION() |
+-------------------------------------------+
| 10.5.12-MariaDB-1:10.5.12+maria~focal-log |
+-------------------------------------------+
1 row in set (0.000 sec)

SQL-Query to get nested JSON Array

I have the following sample data in a MS-SQL database:
(Microsoft SQL Server Standard Version 13; Microsoft SQL Server Management Studio 18)
+----------+-----------+-----+--------+---------+---------+
| LastName | Firstname | Age | Weight | Sallery | Married |
+----------+-----------+-----+--------+---------+---------+
| Smith | Stan | 58 | 87 | 59.000 | true |
| Smith | Maria | 53 | 57 | 45.000 | true |
| Brown | Chris | 48 | 77 | 159.000 | true |
| Brown | Stepahnie | 39 | 67 | 95.000 | true |
| Brown | Angela | 12 | 37 | 0.0 | false |
+----------+-----------+-----+--------+---------+---------+
I want to get a nested JSON array from it that looks like this:
[
{
"Smith": [
{
"Stan": [
{
"Age": 58,
"Weight": 87,
"Sallery": 59.000,
"Married": true
}
],
"Maria": [
{
"Age": 53,
"Weight": 57,
"Sallery": 45.000,
"Married": true
}
]
}
],
"Brown": [
{
"Chris": [
{
"Age": 48,
"Weight": 77,
"Sallery": 159.000,
"Married": true
}
],
"Stepahnie": [
{
"Age": 39,
"Weight": 67,
"Sallery": 95.000,
"Married": true
}
],
"Angela": [
{
"Age": 12,
"Weight": 37,
"Sallery": 0.0,
"Married": false
}
]
}
]
}
]
How do I have to build the SQL query?
I have tried different approaches, but I cannot make the root key dynamic, or the root keeps repeating itself.
For example, the following query gives me one level:
WITH cte AS
(
SELECT FirstName,
js = json_query(
(
SELECT Age,
Weight,
Sallery,
Married
FOR json path,
without_array_wrapper ) )
FROM Table1)
SELECT '[' + stuff(
(
SELECT '},{"' + FirstName + '":' + '[' + js + ']'
FROM cte
FOR xml path ('')), 1, 2, '') + '}]'
But I need one more nested level with LastName
Another try:
SELECT
LastName ,json
FROM Table1 as a
OUTER APPLY (
SELECT
FirstName
FROM Table1 as b
WHERE a.LastName = b.LastName
FOR JSON PATH
) child(json)
FOR JSON PATH
Unfortunately, SQL Server supports neither JSON_AGG nor JSON_OBJECT_AGG, which would have helped here. But we can hack it with STRING_AGG (which requires SQL Server 2017 or later) and STRING_ESCAPE:
WITH ByFirstName AS
(
SELECT
p.LastName,
p.FirstName,
json = STRING_AGG(j.json, ',')
FROM Person p
CROSS APPLY (
SELECT
p.Age,
p.Weight,
p.Sallery,
p.Married
FOR JSON PATH, WITHOUT_ARRAY_WRAPPER
) AS j(json)
GROUP BY
p.LastName,
p.FirstName
),
ByLastName AS
(
SELECT
p.LastName,
json = STRING_AGG(CONCAT(
'"',
STRING_ESCAPE(p.FirstName, 'json'),
'":[',
p.json,
']'
), ',')
FROM ByFirstName p
GROUP BY
p.LastName
)
SELECT '[{' +
STRING_AGG(CONCAT(
'"',
STRING_ESCAPE(p.LastName, 'json'),
'":{',
p.json,
'}'
), ',') + '}]'
FROM ByLastName p
This gets you
[
{
"Brown": {
"Angela": [
{
"Age": 12,
"Weight": 37,
"Sallery": 0,
"Married": false
}
],
"Chris": [
{
"Age": 48,
"Weight": 77,
"Sallery": 159000,
"Married": true
}
],
"Stepahnie": [
{
"Age": 39,
"Weight": 67,
"Sallery": 95000,
"Married": true
}
]
},
"Smith": {
"Maria": [
{
"Age": 53,
"Weight": 57,
"Sallery": 45000,
"Married": true
}
],
"Stan": [
{
"Age": 58,
"Weight": 87,
"Sallery": 59000,
"Married": true
}
]
}
}
]
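The two-level grouping that the STRING_AGG query performs (first by LastName, then by FirstName) can also be sketched outside the database. The Python below is only an illustration of the resulting shape, under the assumption that the rows have already been fetched as tuples; the names PEOPLE and nest are made up here, and json.dumps takes over the escaping that STRING_ESCAPE does in the SQL version.

```python
import json

# Rows as they might come back from the Person table (values from the answer).
PEOPLE = [
    ("Smith", "Stan", 58, 87, 59000, True),
    ("Smith", "Maria", 53, 57, 45000, True),
    ("Brown", "Chris", 48, 77, 159000, True),
    ("Brown", "Stepahnie", 39, 67, 95000, True),
    ("Brown", "Angela", 12, 37, 0, False),
]


def nest(people):
    """Build [{LastName: {FirstName: [person, ...]}}] from flat rows."""
    by_last_name = {}
    for last, first, age, weight, salary, married in people:
        person = {"Age": age, "Weight": weight, "Sallery": salary, "Married": married}
        # Inner setdefault groups by first name, outer one by last name.
        by_last_name.setdefault(last, {}).setdefault(first, []).append(person)
    return [by_last_name]


nested = nest(PEOPLE)
print(json.dumps(nested, indent=2))
```

The key order here follows the input rows, whereas the SQL output above happens to be alphabetical; the nesting is otherwise the same.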
It's certainly possible to get your desired JSON output but, as you can see below, the code is rather convoluted...
/*
* Data setup...
*/
create table dbo.Person (
LastName varchar(10),
FirstName varchar(10),
Age int,
Weight int,
Sallery int,
Married bit
);
insert dbo.Person (LastName, FirstName, Age, Weight, Sallery, Married)
values
('Smith', 'Stan', 58, 87, 59000, 1),
('Smith', 'Maria', 53, 57, 45000, 1),
('Brown', 'Chris', 48, 77, 159000, 1),
('Brown', 'Stepahnie', 39, 67, 95000, 1),
('Brown', 'Angela', 12, 37, 0, 0);
/*
* Example JSON query...
*/
with Persons as (
select LastName, Stan, Maria, Chris, Stepahnie, Angela
from (
select
LastName,
FirstName,
(
select Age, Weight, Sallery, Married
for json path
) as data
from dbo.Person
) src
pivot (max(data) for FirstName in (Stan, Maria, Chris, Stepahnie, Angela)) pvt
)
select
json_query((
select
json_query(Stan) as Stan,
json_query(Maria) as Maria
from Persons
where LastName = 'Smith'
for json path
)) as Smith,
json_query((
select
json_query(Chris) as Chris,
json_query(Stepahnie) as Stepahnie,
json_query(Angela) as Angela
from Persons
where LastName = 'Brown'
for json path
)) as Brown
for json path;
Which yields the output...
[
{
"Smith": [
{
"Stan": [
{
"Age": 58,
"Weight": 87,
"Sallery": 59000,
"Married": true
}
],
"Maria": [
{
"Age": 53,
"Weight": 57,
"Sallery": 45000,
"Married": true
}
]
}
],
"Brown": [
{
"Chris": [
{
"Age": 48,
"Weight": 77,
"Sallery": 159000,
"Married": true
}
],
"Stepahnie": [
{
"Age": 39,
"Weight": 67,
"Sallery": 95000,
"Married": true
}
],
"Angela": [
{
"Age": 12,
"Weight": 37,
"Sallery": 0,
"Married": false
}
]
}
]
}
]

Postgres Build Complex JSON Object from Wide Column Like Design to Key Value

I could really use some help here before my mind explodes...
Given the following data structure:
SELECT * FROM (VALUES (1, 1, 1, 1), (2, 2, 2, 2)) AS t(day, apple, banana, orange);
day | apple | banana | orange
-----+-------+--------+--------
1 | 1 | 1 | 1
2 | 2 | 2 | 2
I want to construct a JSON object which looks like the following:
{
"data": [
{
"day": 1,
"fruits": [
{
"key": "apple",
"value": 1
},
{
"key": "banana",
"value": 1
},
{
"key": "orange",
"value": 1
}
]
}
]
}
Maybe I am not so far away from my goal:
SELECT json_build_object(
'data', json_agg(
json_build_object(
'day', t.day,
'fruits', t)
)
) FROM (VALUES (1, 1, 1, 1), (2, 2, 2, 2)) AS t(day, apple, banana, orange);
Results in:
{
"data": [
{
"day": 1,
"fruits": {
"day": 1,
"apple": 1,
"banana": 1,
"orange": 1
}
}
]
}
I know that there is json_each which may do the trick. But I am struggling to apply it to the query.
Edit:
This is my updated query which, I think, is pretty close. I have dropped the idea of solving it with json_each. Now I only need to return an array of fruits instead of appending to the fruits object:
SELECT json_build_object(
'data', json_agg(
json_build_object(
'day', t.day,
'fruits', json_build_object(
'key', 'apple',
'value', t.apple,
'key', 'banana',
'value', t.banana,
'key', 'orange',
'value', t.orange
)
)
)
) FROM (VALUES (1, 1, 1, 1), (2, 2, 2, 2)) AS t(day, apple, banana, orange);
Would I need to add a subquery to prevent a nested aggregate function?
Use the function jsonb_each() to get pairs (key, value), so you do not have to know the number of columns and their names to get a proper output:
select jsonb_build_object('data', jsonb_agg(to_jsonb(s) order by day))
from (
select day, jsonb_agg(jsonb_build_object('key', key, 'value', value)) as fruits
from (
values (1, 1, 1, 1), (2, 2, 2, 2)
) as t(day, apple, banana, orange),
jsonb_each(to_jsonb(t) - 'day')
group by 1
) s;
The above query gives this object:
{
"data": [
{
"day": 1,
"fruits": [
{
"key": "apple",
"value": 1
},
{
"key": "banana",
"value": 1
},
{
"key": "orange",
"value": 1
}
]
},
{
"day": 2,
"fruits": [
{
"key": "apple",
"value": 2
},
{
"key": "banana",
"value": 2
},
{
"key": "orange",
"value": 2
}
]
}
]
}
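What jsonb_each does in this query is turn every column except day into a (key, value) pair. The same unpivot step can be sketched in Python for comparison; the function name unpivot and the dict-based rows are assumptions made for this illustration only.

```python
def unpivot(rows, id_column):
    """Turn wide rows into {"data": [{id_column: ..., "fruits": [key/value pairs]}]}."""
    out = []
    for row in rows:
        # Every column except the id column becomes one key/value object,
        # so column names and their number need not be known in advance.
        fruits = [{"key": k, "value": v} for k, v in row.items() if k != id_column]
        out.append({id_column: row[id_column], "fruits": fruits})
    # Mirrors the "order by day" in the outer jsonb_agg.
    return {"data": sorted(out, key=lambda r: r[id_column])}


rows = [
    {"day": 2, "apple": 2, "banana": 2, "orange": 2},
    {"day": 1, "apple": 1, "banana": 1, "orange": 1},
]
result = unpivot(rows, "day")
```

In this sketch the pairs keep the column order of the input dict; in PostgreSQL the order of the pairs follows jsonb's own key ordering.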

multiply a value of each item of a json array with postgres 9.6

I tried many different things that I gathered here and there (official docs, blog posts, SO, …) but didn't succeed, so here's my question to you all:
Given this table:
basik=# select id, jsonb_pretty(range_price_list_values::jsonb) from product;
id | jsonb_pretty
--------------------------------------+--------------------------
cc80c862-c264-4bfe-a929-a52478c8d59e | [ +
| { +
| "to": 10, +
| "from": 5, +
| "price": 1 +
| }, +
| { +
| "to": 20, +
| "from": 15, +
| "price": 1298000+
| }, +
| { +
| "to": 30, +
| "from": 25, +
| "price": 500000 +
| } +
| ]
How can I multiply the price key of each array element by 1000, for every row of the table?
PS: my failed attempt was based on the jsonb_* functions and window functions:
WITH prices as (select id, jsonb_array_elements(range_price_list_values::jsonb) from product)
UPDATE product SET range_price_list_values = JSONB_SET(
range_price_list_values::jsonb,
'{' || price.rank || ',price}', jsonb_extract_path('{' || price.rank || ',price}')::int * 1000, false
)::json;
Thanks for taking time to read! :)
You'll need a sub-select, as you want to update multiple fields in your JSON:
update product
set range_price_list_values = (
select jsonb_agg(case
when jsonb_typeof(elem -> 'price') = 'number'
then jsonb_set(elem, array['price'], to_jsonb((elem ->> 'price')::numeric * 1000))
else elem
end)
from jsonb_array_elements(range_price_list_values::jsonb) elem
)::json;
Note: this updates only numeric price values; without the jsonb_typeof check, a non-numeric price would raise an exception.
http://rextester.com/PQN70851
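The per-element logic of that sub-select (rebuild the array, scaling price only where it is a number) can be sketched in Python as follows. The function name scale_prices and the sample list are hypothetical and serve only to illustrate the rule; the actual update still happens in SQL.

```python
from numbers import Number


def scale_prices(elements, factor=1000):
    """Return a new list where every numeric "price" is multiplied by `factor`."""
    scaled = []
    for elem in elements:
        price = elem.get("price")
        # bool is a subclass of int in Python, so exclude it explicitly;
        # this check plays the role of jsonb_typeof(...) = 'number'.
        if isinstance(price, Number) and not isinstance(price, bool):
            elem = {**elem, "price": price * factor}
        scaled.append(elem)  # non-numeric or missing prices pass through unchanged
    return scaled


ranges = [
    {"to": 10, "from": 5, "price": 1},
    {"to": 20, "from": 15, "price": 1298000},
    {"to": 30, "from": 25, "price": "n/a"},
]
updated = scale_prices(ranges)
```

The input list is left untouched; each changed element is a fresh dict, much as jsonb_set returns a new value rather than modifying its argument.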
The first thing that came to mind (quite ugly):
t=# create table product (id text, range_price_list_values jsonb);
CREATE TABLE
t=# insert into product select 'cc80c862-c264-4bfe-a929-a52478c8d59e','[
t'# {
t'# "to": 10,
t'# "from": 5,
t'# "price": 1
t'# },
t'# {
t'# "to": 20,
t'# "from": 15,
t'# "price": 1298000
t'# },
t'# {
t'# "to": 30,
t'# "from": 25,
t'# "price": 500000
t'# }
t'# ]';
INSERT 0 1
t=# with b as (with a as (select id, jsonb_array_elements(range_price_list_values::jsonb) j from product) select id,jsonb_set(j,'{price}',((j->>'price')::int * 1000)::text::jsonb) from a) select distinct id, jsonb_pretty(concat('[',string_agg(jsonb_set::text,',') over (partition by id),']')::jsonb) from b;
id | jsonb_pretty
--------------------------------------+-----------------------------
cc80c862-c264-4bfe-a929-a52478c8d59e | [ +
| { +
| "to": 10, +
| "from": 5, +
| "price": 1000 +
| }, +
| { +
| "to": 20, +
| "from": 15, +
| "price": 1298000000+
| }, +
| { +
| "to": 30, +
| "from": 25, +
| "price": 500000000 +
| } +
| ]
(1 row)
With that in a CTE, you can update the values against it.

Postgres order by price lowest to highest using jsonb array

Say I have the following product schema which has common properties like title etc, as well as variants in an array.
How would I go about ordering the products by price lowest to highest?
drop table if exists product;
create table product (
id int,
data jsonb
);
insert into product values (1, '
{
"product_id": 10000,
"title": "product 10000",
"variants": [
{
"variantId": 100,
"price": 9.95,
"sku": 100,
"weight": 388
},
{
"variantId": 101,
"price": 19.95,
"sku": 101,
"weight": 788
}
]
}');
insert into product values (2, '
{
"product_id": 10001,
"title": "product 10001",
"variants": [
{
"variantId": 200,
"price": 89.95,
"sku": 200,
"weight": 11
},
{
"variantId": 201,
"price": 99.95,
"sku": 201,
"weight": 22
}
]
}');
insert into product values (3, '
{
"product_id": 10002,
"title": "product 10002",
"variants": [
{
"variantId": 300,
"price": 1.00,
"sku": 300,
"weight": 36
}
]
}');
select * from product;
1;"{"title": "product 10000", "variants": [{"sku": 100, "price": 9.95, "weight": 388, "variantId": 100}, {"sku": 101, "price": 19.95, "weight": 788, "variantId": 101}], "product_id": 10000}"
2;"{"title": "product 10001", "variants": [{"sku": 200, "price": 89.95, "weight": 11, "variantId": 200}, {"sku": 201, "price": 99.95, "weight": 22, "variantId": 201}], "product_id": 10001}"
3;"{"title": "product 10002", "variants": [{"sku": 300, "price": 1.00, "weight": 36, "variantId": 300}], "product_id": 10002}"
Use jsonb_array_elements() to unnest variants, e.g.:
select
id, data->'product_id' product_id,
var->'sku' as sku, var->'price' as price
from
product, jsonb_array_elements(data->'variants') var
order by 4;
id | product_id | sku | price
----+------------+-----+-------
3 | 10002 | 300 | 1.00
1 | 10000 | 100 | 9.95
1 | 10000 | 101 | 19.95
2 | 10001 | 200 | 89.95
2 | 10001 | 201 | 99.95
(5 rows)
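The query above unnests the variants array into one row per variant and sorts those rows by price. A small Python sketch of the same two steps is given below, with the product data trimmed to the fields that matter. The names PRODUCTS, variants_by_price and products_by_lowest_price are made up for this illustration, and the second function is only one possible reading of the question (one entry per product, ordered by its cheapest variant).

```python
# (id, data) pairs mirroring the product table, trimmed to the relevant fields.
PRODUCTS = [
    (1, {"product_id": 10000, "variants": [{"sku": 100, "price": 9.95},
                                           {"sku": 101, "price": 19.95}]}),
    (2, {"product_id": 10001, "variants": [{"sku": 200, "price": 89.95},
                                           {"sku": 201, "price": 99.95}]}),
    (3, {"product_id": 10002, "variants": [{"sku": 300, "price": 1.00}]}),
]


def variants_by_price(products):
    """One row per variant (like jsonb_array_elements), ordered by price."""
    rows = [
        (pid, data["product_id"], variant["sku"], variant["price"])
        for pid, data in products
        for variant in data["variants"]
    ]
    return sorted(rows, key=lambda row: row[3])


def products_by_lowest_price(products):
    """Product ids ordered by the price of their cheapest variant."""
    ordered = sorted(products, key=lambda p: min(v["price"] for v in p[1]["variants"]))
    return [pid for pid, _ in ordered]


variant_rows = variants_by_price(PRODUCTS)
product_ids = products_by_lowest_price(PRODUCTS)
```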