Nested leaf node - sql

I've stored a tree that contains multiple nodes. Every record in that table represents a node and its parent-node, as follows:
node_id | parent_id
--------+----------
A       | null
B       | A
C       | A
D       | B
E       | B
As a result, the visual tree would look like this (tree-nodes image):
My goal is to create a function that returns the JSON path for every leaf in the tree. So for my current table, the result should look as shown below:
leaf_id | json_path
--------+------------------------------------------------------------------------------------------
C       | {"name": "A", "children": [{ "name": "C", "children": [] }] }
D       | {"name": "A", "children": [{ "name": "B", "children": [{ "name": "D", "children": [] }] }] }
E       | {"name": "A", "children": [{ "name": "B", "children": [{ "name": "E", "children": [] }] }] }
There's already a question with a function that produces the format I'm trying to achieve (link below):
nested-json-object.
However, that function selects the entire tree, whereas, as mentioned above, I need the path of every leaf node.

Using a recursive cte:
with recursive cte(id, l, js) as (
    select t.node_id, t.node_id, jsonb_build_object('name', t.node_id, 'children', '[]'::jsonb)
    from tbl t
    where not exists (select 1 from tbl t1 where t1.parent_id = t.node_id)
    union all
    select t3.parent_id, t3.l, jsonb_build_object('name', t3.parent_id, 'children', t3.js)
    from (select t2.parent_id, t2.l, jsonb_agg(t2.js) js
          from (select t1.parent_id, c.l, c.js
                from tbl t1
                join cte c on c.id = t1.node_id) t2
          group by t2.parent_id, t2.l) t3
)
select l leaf_id, v json_path
from cte
cross join jsonb_array_elements(js -> 'children') v
where id is null
See fiddle
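To reproduce this locally, a minimal setup matching the sample data might look like the following sketch (the table and column names follow the query above; the DDL itself is an assumption rather than the fiddle's actual setup):

-- Hypothetical setup matching the sample data in the question
create table tbl (
    node_id   text primary key,
    parent_id text references tbl (node_id)
);

insert into tbl (node_id, parent_id) values
('A', null),
('B', 'A'),
('C', 'A'),
('D', 'B'),
('E', 'B');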

Related

How to return a JSON tree format from SQL query?

I currently have a table that contains a content_id, root_id, parent_id and content_level. This table is self-referencing, in that a record can have related child records. The parent records do not know about the child records, but the child records know about their parents via the parent_id field.
This is the query used for fetching all the records with the root content at the top. The root content has content_level = 0, and both root_id and parent_id = NULL. For the rest of the records, the root_id field will match the content_id of root record.
SELECT *
FROM jccontent c2
WHERE c2.content_id = 138412032
UNION ALL
(
SELECT j.*
FROM jccontent AS c
INNER JOIN jccontent j on c.content_id = j.parent_id
WHERE j.root_id = 138412032
)
ORDER BY content_level ;
From here, I would like to build a JSON tree structure that contains the root as the top element, followed by nested children elements. I would like to complete this portion using purely SQL. Currently I have done it in code and it works well, but I would like to see whether doing it in SQL would be better.
My desired output would be something like this:
{
    "content_id": 138412032,
    "root_id": null,
    "parent_id": null,
    "content_level": 0,
    "children": [
        {
            "content_id": 1572864000,
            "root_id": 138412032,
            "parent_id": 138412032,
            "content_level": 1,
            "children": [
                {
                    "content_id": 1606418432,
                    "root_id": 138412032,
                    "parent_id": 1572864000,
                    "content_level": 2,
                    "children": []
                },
                {
                    "content_id": 515899393,
                    "root_id": 138412032,
                    "parent_id": 1572864000,
                    "content_level": 2,
                    "children": [
                        {
                            "content_id": 75497471,
                            "root_id": 138412032,
                            "parent_id": 515899393,
                            "content_level": 3,
                            "children": []
                        }
                    ]
                }
            ]
        },
        {
            "content_id": 1795162113,
            "root_id": 138412032,
            "parent_id": 138412032,
            "content_level": 1,
            "children": []
        }
    ]
}
If there is any additional information required, please let me know. I will be glad to share. Thank you.
Try:
WITH RECURSIVE cte AS (
    SELECT content_id, parent_id, content_level
    FROM jccontent
    WHERE content_id = 138412032
    UNION ALL
    SELECT j.content_id, j.parent_id, j.content_level
    FROM jccontent j
    INNER JOIN cte c ON j.parent_id = c.content_id
)
SELECT JSON_OBJECT('id' VALUE cte.content_id,
                   'parent_id' VALUE cte.parent_id,
                   'level' VALUE cte.content_level)
FROM cte
ORDER BY cte.content_level;
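Note that this returns one flat JSON object per row rather than the nested document asked for. If the dialect is one that supports JSON_ARRAYAGG alongside this JSON_OBJECT ... VALUE syntax (Oracle does; this is an assumption, since the question doesn't name the DBMS), nesting a single level could look roughly like the sketch below. A full arbitrary-depth tree would still need recursion or application code.

-- Sketch only, assuming an Oracle-style dialect with JSON_ARRAYAGG;
-- FORMAT JSON keeps the nested array from being embedded as a plain string
SELECT JSON_OBJECT(
         'content_id' VALUE p.content_id,
         'content_level' VALUE p.content_level,
         'children' VALUE (SELECT JSON_ARRAYAGG(
                                    JSON_OBJECT('content_id' VALUE c.content_id,
                                                'content_level' VALUE c.content_level))
                           FROM jccontent c
                           WHERE c.parent_id = p.content_id) FORMAT JSON)
FROM jccontent p
WHERE p.content_id = 138412032;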

Querying over PostgreSQL JSONB column

I have a table "blobs" with a column "metadata" in jsonb data-type,
Example:
{
    "total_count": 2,
    "items": [
        {
            "name": "somename",
            "metadata": {
                "metas": [
                    {
                        "id": "11258",
                        "score": 6.1,
                        "status": "active",
                        "published_at": "2019-04-20T00:29:00",
                        "nvd_modified_at": "2022-04-06T18:07:00"
                    },
                    {
                        "id": "9251",
                        "score": 5.1,
                        "status": "active",
                        "published_at": "2018-01-18T23:29:00",
                        "nvd_modified_at": "2021-01-08T12:15:00"
                    }
                ]
            }
        }
    ]
}
I want to identify statuses in the "metas" array that match certain given strings. I have tried the following so far, but without results:
SELECT * FROM blobs
WHERE metadata is not null AND
(
SELECT count(*) FROM jsonb_array_elements(metadata->'metas') AS cn
WHERE cn->>'status' IN ('active','reported')
) > 0;
It would also be sufficient if I could compare the string with "status" in the first array object.
I am using PostgreSQL 9.6.24
For clarity, I usually break code into a series of WITH statements. My idea for your problem is to use a JSON path (https://www.postgresql.org/docs/12/functions-json.html#FUNCTIONS-SQLJSON-PATH) and the jsonb_path_query function.
The code below gives a list of counts; I'll leave the rest to you to get the final data.
I've added an id column just to have something to join on; otherwise, join on metadata.
Also, note the additional " characters in the WHERE condition. The LEFT JOIN in blob_ext is there just to produce a null value if metadata is not present or the path does not match.
with blob as (
select row_number() over()"id", * from (VALUES
(
'{
"total_count": 2,
"items": [
{
"name": "somename",
"metadata": {
"metas": [
{
"id": "11258",
"score": 6.1,
"status": "active",
"published_at": "2019-04-20T00:29:00",
"nvd_modified_at": "2022-04-06T18:07:00"
},
{
"id": "9251",
"score": 5.1,
"status": "active",
"published_at": "2018-01-18T23:29:00",
"nvd_modified_at": "2021-01-08T12:15:00"
}
]
}
}
]}'::jsonb),
(null::jsonb)) b(metadata)
)
, blob_ext as (
select bb.*, blob_sts.status
from blob bb
left join (
select
bb2.id,
jsonb_path_query (bb2.metadata::jsonb, '$.items[*].metadata.metas[*].status'::jsonpath)::character varying "status"
FROM blob bb2
) as blob_sts ON
blob_sts.id = bb.id
)
select bbe.id, count(*) cnt, bbe.metadata
from blob_ext bbe
where bbe.status in ('"active"', '"reported"')
group by bbe.id, bbe.metadata;
A way is to peel one layer at a time with jsonb_extract_path() and jsonb_array_elements():
with cte_items as (
select id,
metadata,
jsonb_extract_path(jx.value,'metadata','metas') as metas
from blobs,
lateral jsonb_array_elements(jsonb_extract_path(metadata,'items')) as jx),
cte_metas as (
select id,
metadata,
jsonb_extract_path_text(s.value,'status') as status
from cte_items,
lateral jsonb_array_elements(metas) s)
select distinct
id,
metadata
from cte_metas
where status in ('active','reported');
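Since the question mentions PostgreSQL 9.6 (the jsonb_path_query approach above needs version 12 or later), an EXISTS-based filter in the spirit of the original attempt, but walking items -> metadata -> metas as in the example document, could look like this sketch (table and column names are taken from the question; not tested against your data):

-- Sketch for PostgreSQL 9.6: no jsonpath, just jsonb_array_elements in the FROM list
SELECT b.*
FROM blobs b
WHERE b.metadata IS NOT NULL
  AND EXISTS (
        SELECT 1
        FROM jsonb_array_elements(b.metadata -> 'items') AS it
           , jsonb_array_elements(it.value -> 'metadata' -> 'metas') AS m
        WHERE m.value ->> 'status' IN ('active', 'reported')
      );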

SingleStore (MemSQL) How to flatten array using JSON_TO_ARRAY

I have a table source:
data
{ "results": { "rows": [ { "title": "A", "count": 61 }, { "title": "B", "count": 9 } ] }}
{ "results": { "rows": [ { "title": "C", "count": 43 } ] }}
And I want a table dest:
title | count
------+------
A     | 61
B     | 9
C     | 43
I found that there is a JSON_TO_ARRAY function that might be helpful, but I got stuck on how to apply it.
How do I correctly flatten the JSON array from the table?
I have the following, which works on your example and might help you with the syntax.
In this query I created a table called json_tab with a column called jsondata.
WITH t AS (
    SELECT table_col AS title
    FROM json_tab
    JOIN TABLE(JSON_TO_ARRAY(jsondata::results::rows))
)
SELECT t.title::$title title, t.title::$count count FROM t
I took the example from the code snippet for working with nested arrays in a JSON column:
https://github.com/singlestore-labs/singlestoredb-samples/blob/main/JSON/Analyzing_nested_arrays.sql
Three options I came up with, which are essentially the same:
INSERT INTO dest
WITH t AS(
SELECT table_col AS arrRows FROM source JOIN TABLE(JSON_TO_ARRAY(data::results::rows))
)
SELECT arrRows::$title as title, arrRows::%count as count FROM t;
INSERT INTO dest
SELECT arrRows::$title as title, arrRows::%count as count FROM
(SELECT table_col AS arrRows FROM source JOIN TABLE(JSON_TO_ARRAY(data::results::rows)));
INSERT INTO dest
SELECT t.table_col::$title as title, t.table_col::%count as count
FROM source JOIN TABLE(json_to_array(data::results::rows)) t;
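To try these out, a minimal setup matching the question's source and dest tables might look like the sketch below (the column types are assumptions; adjust them to your schema):

-- Hypothetical setup; table and column names come from the question, types are assumed
CREATE TABLE source (data JSON);
CREATE TABLE dest (title TEXT, `count` INT);

INSERT INTO source VALUES
('{ "results": { "rows": [ { "title": "A", "count": 61 }, { "title": "B", "count": 9 } ] }}'),
('{ "results": { "rows": [ { "title": "C", "count": 43 } ] }}');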

Parent average value for children sql - recursive

I have a table, ShopUnit, with the columns:
name,
price,
type (OFFER, CATEGORY),
parentID (FK to ShopUnit, self-referencing).
I have an id. I need to return the row with this id and all of its children. If the item or a child has type == CATEGORY, I need to set its price to the AVG price of that row's children.
I'm thinking of a recursive query:
with recursive unit_tree as (
select s1.id,
s1.price,
s1.parent_id,
s1.type,
0 as level
from shop_unit s1
where s1.id = 'a'
union all
select s2.id,
s2.price,
s2.parent_id,
s2.type,
level + 1
from shop_unit s2
join unit_tree ut on ut.id = s2.parent_id
)
select unit_tree.id,
unit_tree.parent_id,
unit_tree.type,
unit_tree.level,
unit_tree.price
from unit_tree;
But how do I compute the average for every category?
Here's an example:
{
    "id": "3fa85f64-5717-4562-b3fc-2c963f66a111",
    "name": "Категория",
    "type": "CATEGORY",
    "parentId": null,
    "date": "2022-05-28T21:12:01.516Z",
    "price": 6,
    "children": [
        {
            "name": "Оффер 1",
            "id": "3fa85f64-5717-4562-b3fc-2c963f66a222",
            "price": 4,
            "date": "2022-05-28T21:12:01.516Z",
            "type": "OFFER",
            "parentId": "3fa85f64-5717-4562-b3fc-2c963f66a111"
        },
        {
            "name": "Подкатегория",
            "type": "CATEGORY",
            "id": "3fa85f64-5717-4562-b3fc-2c963f66a333",
            "date": "2022-05-26T21:12:01.516Z",
            "parentId": "3fa85f64-5717-4562-b3fc-2c963f66a111",
            "price": 8,
            "children": [
                {
                    "name": "Оффер 2",
                    "id": "3fa85f64-5717-4562-b3fc-2c963f66a444",
                    "parentId": "3fa85f64-5717-4562-b3fc-2c963f66a333",
                    "date": "2022-05-26T21:12:01.516Z",
                    "price": 8,
                    "type": "OFFER"
                }
            ]
        }
    ]
}
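For reference, a table sketch matching this example might look like the one below (column names follow the query above; the DDL is hypothetical, and the category prices are left null on the assumption that they are the values to be computed):

-- Hypothetical DDL and rows for the example above
CREATE TABLE shop_unit (
    id        text PRIMARY KEY,
    name      text,
    price     numeric,
    type      text CHECK (type IN ('OFFER', 'CATEGORY')),
    parent_id text REFERENCES shop_unit (id)
);

INSERT INTO shop_unit (id, name, price, type, parent_id) VALUES
('3fa85f64-5717-4562-b3fc-2c963f66a111', 'Категория',    null, 'CATEGORY', null),
('3fa85f64-5717-4562-b3fc-2c963f66a222', 'Оффер 1',      4,    'OFFER',    '3fa85f64-5717-4562-b3fc-2c963f66a111'),
('3fa85f64-5717-4562-b3fc-2c963f66a333', 'Подкатегория', null, 'CATEGORY', '3fa85f64-5717-4562-b3fc-2c963f66a111'),
('3fa85f64-5717-4562-b3fc-2c963f66a444', 'Оффер 2',      8,    'OFFER',    '3fa85f64-5717-4562-b3fc-2c963f66a333');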
In order to determine if a unit is a child of another unit, you need to capture the path for each element in your recursive CTE. Then you can use a LATERAL JOIN on the unit_tree to find and average the price of the children of each unit.
WITH RECURSIVE shop_unit(id,name,price,parent_id,type) as (
(VALUES
('a','Propane',null,null,'CATEGORY'),
('b','Fuels',null,'a','CATEGORY'),
('c','HD5',5,'b','ITEM'),
('d','HD10',10,'b','ITEM'),
('e','Commercial',15,'b','ITEM'),
('f','Accessories',null,'a','CATEGORY'),
('g','Grill',100,'f','ITEM'),
('h','NFT',null,'f','CATEGORY'),
('i','bwaah.jpg',20000,'h','ITEM'),
('j','jaypeg.jpg',100000,'h','ITEM'),
('k','WD-40',2,null,'ITEM')
)
),
unit_tree as (
SELECT
s1.id,
s1.name,
s1.price,
s1.parent_id,
s1.type,
0 as level,
array[id] as path
FROM
shop_unit s1
WHERE
s1.id = 'a'
UNION ALL
SELECT
s2.id,
s2.name,
s2.price,
s2.parent_id,
s2.type,
level + 1,
ut.path || s2.id as path --generate the path for every unit so that we can check if it is a child of another element
FROM
shop_unit s2
JOIN unit_tree ut ON ut.id = s2.parent_id
)
SELECT
ut.id,
ut.name,
ut.parent_id,
ut.type,
case when ut.type = 'CATEGORY' then ap.avg_price else ut.price end as price,
ut.level,
ut.path
FROM
unit_tree ut
-- The JOIN LATERAL subquery roughly means "for each row of ut run this query"
-- Must be a LEFT JOIN LATERAL in order to keep rows of ut that have no children.
LEFT JOIN LATERAL (
SELECT
avg(ut2.price) avg_price
FROM
unit_tree ut2
WHERE
ut.level < ut2.level --is deeper level
and ut.id = any(path) --is in the path
GROUP BY
ut.id
) ap ON TRUE
ORDER BY id
You can use recursion to build the output JSON you want, as well:
with recursive top_down as (
select s.id, s.name, s.type, s."parentId", s.price, 1 as level,
array[s.id] as path, s.id as root
from shopunit s
where s."parentId" is null
union all
select c.id, c.name, c.type, c."parentId", c.price, p.level + 1 as level,
p.path||c.id as path, p.root
from shopunit c
join top_down p on p.id = c."parentId"
), category_averages as (
select p."parentId", avg(c.price) as price, p.level, p.root
from top_down p
join top_down c
on p."parentId" = any(c.path)
group by p."parentId", p.level, p.root
), fill_missing as (
select s.id, s.name, s.type, s."parentId",
coalesce(a.price, s.price)::numeric(8,2) as price,
t.level, max(t.level) over (partition by t.root) as max_depth,
row_number() over (partition by s."parentId" order by s.id) as n,
count(1) over (partition by s."parentId") as max_n,
now() as date
from shopunit s
left join category_averages a on a."parentId" = s.id
join top_down t on t.id = s.id
), build_json as (
select id, "parentId", level, max_depth, n, max_n,
to_jsonb(fill_missing) - 'level' - 'max_depth' - 'n' - 'max_n' as j
from fill_missing
where level = max_depth
and n = max_n
union all
select next.id, next."parentId", next.level, next.max_depth, next.n, next.max_n,
case
when next.level = prev.level
then '[]'::jsonb||(to_jsonb(next) - 'level' - 'max_depth' - 'n' - 'max_n')||prev.j
else
jsonb_set(
to_jsonb(next) - 'level' - 'max_depth' - 'n' - 'max_n',
'{children}', '[]'::jsonb || prev.j
)
end as j
from fill_missing next
join build_json prev
on (prev.n = 1 and prev."parentId" = next.id and next.n = next.max_n)
or (prev.n > 1 and prev."parentId" = next."parentId" and next.n = prev.n - 1)
)
select id, jsonb_pretty(j) as j
from build_json
where "parentId" is null;
Which results in:
{
    "id": "3fa85f64-5717-4562-b3fc-2c963f66a111",
    "date": "2022-06-11T16:14:44.11989+01:00",
    "name": "Категория",
    "type": "CATEGORY",
    "price": 6.00,
    "children": [
        {
            "id": "3fa85f64-5717-4562-b3fc-2c963f66a222",
            "date": "2022-06-11T16:14:44.11989+01:00",
            "name": "Оффер 1",
            "type": "OFFER",
            "price": 4.00,
            "parentId": "3fa85f64-5717-4562-b3fc-2c963f66a111"
        },
        {
            "id": "3fa85f64-5717-4562-b3fc-2c963f66a333",
            "date": "2022-06-11T16:14:44.11989+01:00",
            "name": "Подкатегория",
            "type": "CATEGORY",
            "price": 8.00,
            "children": [
                {
                    "id": "3fa85f64-5717-4562-b3fc-2c963f66a444",
                    "date": "2022-06-11T16:14:44.11989+01:00",
                    "name": "Оффер 2",
                    "type": "OFFER",
                    "price": 8.00,
                    "parentId": "3fa85f64-5717-4562-b3fc-2c963f66a333"
                }
            ],
            "parentId": "3fa85f64-5717-4562-b3fc-2c963f66a111"
        }
    ],
    "parentId": null
}
db<>fiddle here
(The hidden query populates the table from your example JSON)

How to group multiple values to only two groups?

So, I have 2 tables.
Type table:

id | Name
---+------------
1  | General
2  | Mostly Used
3  | Low

Component table:

id | Name        | typeId
---+-------------+-------
1  | Component 1 | 1
2  | Component 2 | 1
4  | Component 4 | 2
6  | Component 6 | 2
7  | Component 5 | 3
There can be numerous types, but I want to get only 'General' and 'Others' as types, along with their components, as follows:
[{
    "General": [{
        "id": "1",
        "name": "General",
        "component": [{
            "id": 1,
            "name": "component 1",
            "componentTypeId": 1
        }, {
            "id": 2,
            "name": "component 2",
            "componentTypeId": 1
        }]
    }],
    "Others": [{
        "id": "2",
        "name": "Mostly Used",
        "component": [{
            "id": 4,
            "name": "component 4",
            "componentTypeId": 2
        }, {
            "id": 6,
            "name": "component 6",
            "componentTypeId": 2
        }]
    },
    {
        "id": "3",
        "name": "Low",
        "component": [{
            "id": 7,
            "name": "component 5",
            "componentTypeId": 3
        }]
    }]
}]
WITH CTE_TYPES AS (
SELECT
CASE WHEN t. "name" <> 'General' THEN
'Others'
ELSE
'General'
END AS TYPE,
t.id,
t.name
FROM
type AS t
GROUP BY
TYPE,
t.id
),
CTE_COMPONENT AS (
SELECT
c.id,
c.name,
c.typeid
FROM
component c
)
SELECT
JSON_AGG(jsonb_build_object ('id', CT.id, 'name', CT.name, 'type', CT.type, 'component', CC))
FROM
CTE_COMPONENTTYPES CT
INNER JOIN CTE_COMPONENT CC ON CT.id = CC.tradingplancomponenttypeid
GROUP BY
CT.type
I get 2 types from the query as I expected but the components are not grouped together
Can you also point to resources to learn advanced SQL queries?
Here is a solution to get your expected result as specified in your question:
First part
The first part of the query aggregates all the components with the same TypeId into a jsonb array. It also computes the new type column, with the value 'Others' for all type names different from 'General', and 'General' otherwise:
SELECT CASE WHEN t.name <> 'General' THEN 'Others' ELSE 'General' END AS type
, t.id, t.name
, jsonb_build_object('id', t.id, 'name', t.name, 'component', jsonb_agg(jsonb_build_object('id', c.id, 'name', c.name, 'componentTypeId', c.typeid))) AS list
FROM component AS c
INNER JOIN type AS t
ON t.id = c.typeid
GROUP BY t.id, t.name
jsonb_build_object builds a jsonb object from a set of key/value arguments
jsonb_agg aggregates jsonb objects into a single jsonb array.
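As a quick illustration of these two building blocks (the values here are made up, not taken from the tables above):

-- jsonb_build_object: key/value pairs -> one jsonb object
SELECT jsonb_build_object('id', 1, 'name', 'General');
-- result: {"id": 1, "name": "General"}

-- jsonb_agg: aggregates jsonb values into a single jsonb array
SELECT jsonb_agg(v.x)
FROM (VALUES ('{"id": 1}'::jsonb), ('{"id": 2}'::jsonb)) AS v(x);
-- result: [{"id": 1}, {"id": 2}]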
Second part
The second part of the query is more complex because of the structure of your expected result, where the types different from General must be nested, with their components, inside each other according to the TypeId order, i.e. the Low type with TypeId = 3 is nested inside the Mostly Used type with TypeId = 2:
{ "id": "2",
, "name": "Mostly Used"
, "component": [ { "id": 4
, "name": "component 4"
, "componentTypeId": 2
}
, { ... }
, { "id": "3"
, "name": "Low" --> 'Low' type is nested inside 'Mostly Used' type
, "component": [ { "id": 7
, "name": "component 5"
, "componentTypeId": 3
}
, { ... }
]
}
]
}
To build such a nested structure with an arbitrary number of TypeIds, you could create a recursive query, but I prefer here to create a user-defined aggregate function, which makes the query much simpler and more readable (see the manual). The aggregate function jsonb_set_inv_agg is based on the user-defined function jsonb_set_inv, which inserts the jsonb object x inside the existing jsonb object z according to the path p. This function is based on the standard jsonb_set function:
CREATE OR REPLACE FUNCTION jsonb_set_inv(x jsonb, p text[], z jsonb, b boolean)
RETURNS jsonb LANGUAGE sql IMMUTABLE AS
$$
SELECT jsonb_set(z, p, COALESCE(z#>p || x, z#>p), b) ;
$$ ;
CREATE AGGREGATE jsonb_set_inv_agg(p text[], z jsonb, b boolean)
( sfunc = jsonb_set_inv
, stype = jsonb
) ;
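To see what a single step of this aggregate does, here is a standalone call with made-up values (not taken from the tables above):

-- Appends the jsonb value x ('[{"id": 3}]') into the "component" array of z
SELECT jsonb_set_inv('[{"id": 3}]'::jsonb,
                     '{component}',
                     '{"id": 2, "component": [{"id": 4}]}'::jsonb,
                     true);
-- result: {"id": 2, "component": [{"id": 4}, {"id": 3}]}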
Based on the newly created aggregate function jsonb_set_inv_agg and the standard jsonb_agg and jsonb_build_object functions already seen above, the final query is:
SELECT jsonb_agg(jsonb_build_object('General', x.list)) FILTER (WHERE x.type = 'General')
|| jsonb_build_object('Others', jsonb_set_inv_agg('{component}', x.list, true ORDER BY x.id DESC) FILTER (WHERE x.type = 'Others'))
FROM
( SELECT CASE WHEN t.name <> 'General' THEN 'Others' ELSE 'General' END AS type
, t.id, t.name
, jsonb_build_object('id', t.id, 'name', t.name, 'component', jsonb_agg(jsonb_build_object('id', c.id, 'name', c.name, 'componentTypeId', c.typeid))) AS list
FROM component AS c
INNER JOIN type AS t
ON t.id = c.typeid
GROUP BY t.id, t.name
) AS x
See the full test result in dbfiddle.