Extract JSON values recursively in Postgres - sql

I have some JSON data stored in a column. I want to parse the JSON and extract all the values for a particular key.
Here's my sample data:
{
"fragments": [
{
"fragments": [
{
"fragments": [
{
"fragments": [],
"fragmentName": "D"
},
{
"fragments": [],
"fragmentName": "E"
},
{
"fragments": [],
"fragmentName": "F"
}
],
"fragmentName": "C"
}
],
"fragmentName": "B"
}
],
"fragmentName": "A"
}
Expected output:
D, E, F, C, B, A
I want to extract all fragmentName values from the above JSON.
I have gone through the questions below, but haven't found anything useful:
Collect Recursive JSON Keys In Postgres
Postgres recursive query with row_to_json
Edited:
Here's one approach I have tried on the above stacks:
WITH RECURSIVE key_and_value_recursive(key, value) AS (
SELECT
t.key,
t.value
FROM temp_frg_mapping, json_each(temp_frg_mapping.info::json) AS t
WHERE id=2
UNION ALL
SELECT
t.key,
t.value
FROM key_and_value_recursive,
json_each(CASE
WHEN json_typeof(key_and_value_recursive.value) <> 'object' THEN '{}' :: JSON
ELSE key_and_value_recursive.value
END) AS t
)
SELECT *
FROM key_and_value_recursive;
Output:
This only returns the top level (level 0) of nesting.

I would use a recursive query, but with jsonb_array_elements():
with recursive cte as (
select id, info ->> 'fragmentName' as val, info -> 'fragments' as info, 1 lvl
from mytable
where id = 2
union all
select c.id, x.info ->> 'fragmentName', x.info -> 'fragments', c.lvl + 1
from cte c
cross join lateral jsonb_array_elements(c.info) as x(info)
where c.info is not null
)
select id, val, lvl
from cte
where val is not null
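The query walks the document one nesting level at a time (breadth-first, which is how a recursive CTE is evaluated): at each step, we unnest the JSON array and check whether a fragment name is available. We don't need to check the types of the returned values; we just use the standard functions until the data is exhausted. To get the order shown in the question, sort the final result by lvl descending.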
Demo on DB Fiddle
Sample data:
{
"fragments": [
{
"fragments": [
{
"fragments": [
{
"fragments": [
],
"fragmentName": "D"
},
{
"fragments": [
],
"fragmentName": "E"
},
{
"fragments": [
],
"fragmentName": "F"
}
],
"fragmentName": "C"
}
],
"fragmentName": "B"
}
],
"fragmentName": "A"
}
Results:
id | val | lvl
-: | :-- | --:
2 | A | 1
2 | B | 2
2 | C | 3
2 | D | 4
2 | E | 4
2 | F | 4
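To make the level-by-level expansion easier to follow, here is a minimal Python sketch that models what the recursive query does. It is not part of the original answer; the function name `fragment_names_by_level` is hypothetical, and it assumes every node is an object with an optional "fragmentName" and an optional "fragments" array, as in the sample data.

```python
import json

SAMPLE = json.loads("""
{"fragments": [
   {"fragments": [
      {"fragments": [
         {"fragments": [], "fragmentName": "D"},
         {"fragments": [], "fragmentName": "E"},
         {"fragments": [], "fragmentName": "F"}],
       "fragmentName": "C"}],
    "fragmentName": "B"}],
 "fragmentName": "A"}
""")


def fragment_names_by_level(doc):
    """Return (name, level) pairs, expanding one nesting level per pass."""
    rows = []
    frontier = [doc]  # plays the role of the CTE's working table
    level = 1
    while frontier:
        next_frontier = []
        for node in frontier:
            name = node.get("fragmentName")
            if name is not None:
                rows.append((name, level))
            # like jsonb_array_elements(): an empty or missing array adds nothing
            next_frontier.extend(node.get("fragments") or [])
        frontier = next_frontier
        level += 1
    return rows


rows = fragment_names_by_level(SAMPLE)
# Sorting by level, deepest first, gives the order asked for in the question.
expected_order = [name for name, _ in sorted(rows, key=lambda r: -r[1])]
```

The loop stops on its own once a level yields no children, which is the same reason the SQL recursion terminates.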

Related

How to parse this JSON file in Snowflake?

So I have a column in a Snowflake table that stores JSON data but the column is of a varchar data type.
The JSON looks like this:
{
"FLAGS": [],
"BANNERS": {},
"TOOLS": {
"game.appConfig": {
"type": [
"small",
"normal",
"huge"
],
"flow": [
"control",
"noncontrol"
]
}
},
"PLATFORM": {}
}
I want to filter only the data inside TOOLS and want to get the following result:
TOOLS_ID | TOOLS
game.appConfig | type
game.appConfig | flow
How can I achieve this?
I assumed that TOOLS can contain more than one tool ID, so I wrote this query:
with mydata as ( select
'{
"FLAGS": [],
"BANNERS": {},
"TOOLS": {
"game.appConfig": {
"type": [
"small",
"normal",
"huge"
],
"flow": [
"control",
"noncontrol"
]
}
},
"PLATFORM": {}
}' as v1 )
select main.KEY TOOLS_ID, sub.KEY TOOLS
from mydata,
lateral flatten ( parse_json(v1):"TOOLS" ) main,
lateral flatten ( main.VALUE ) sub;
+----------------+-------+
| TOOLS_ID | TOOLS |
+----------------+-------+
| game.appConfig | flow |
| game.appConfig | type |
+----------------+-------+
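For readers who want to see what the two chained FLATTEN calls are doing, here is a rough Python model of the same idea. It is only a sketch: `tool_keys` is a hypothetical helper name, and it assumes each entry under TOOLS is itself a JSON object.

```python
import json

RAW = """
{"FLAGS": [], "BANNERS": {},
 "TOOLS": {"game.appConfig": {"type": ["small", "normal", "huge"],
                              "flow": ["control", "noncontrol"]}},
 "PLATFORM": {}}
"""


def tool_keys(raw_json):
    """Return (tool_id, key) pairs for every key nested under TOOLS."""
    tools = json.loads(raw_json).get("TOOLS", {})
    pairs = []
    for tool_id, settings in tools.items():  # first FLATTEN: one row per tool
        for key in settings:                 # second FLATTEN: one row per key
            pairs.append((tool_id, key))
    return pairs


pairs = tool_keys(RAW)
```

Unlike the SQL output above, the Python version keeps the keys in document order; row order in the query result is not guaranteed without an ORDER BY.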
Assuming the column name is C1 and table name T1:
select a.t:"TOOLS":"game.appConfig"::string from (select
parse_json(to_variant(C1))t from T1) a

What is the best practice for outputting JSON Type arrays from Table values without repetition?

I have the following SQL data that I am trying to output as a structured JSON string as follows:
Table data
TableId ContainerId MaterialId SizeId
848 1 1 1
849 1 1 2
850 1 2 1
851 1 2 2
852 1 3 1
853 1 4 1
854 2 2 1
855 2 2 2
856 2 2 3
JSON output
{
"container": [
{
"id": 1,
"material": [
{
"id": 1,
"size": [
{
"id": 1
},
{
"id": 2
}
]
},
{
"id": 2,
"size": [
{
"id": 1
},
{
"id": 2
}
]
},
{
"id": 3,
"size": [
{
"id": 1
}
]
},
{
"id": 4,
"size": [
{
"id": 1
}
]
}
]
},
{
"id": 2,
"material": [
{
"id": 2,
"size": [
{
"id": 1
},
{
"id": 2
},
{
"id": 3
}
]
}
]
}
]
}
I have tried several ways of outputting it but I am struggling to stop duplicated Container and Material Id records. Is anyone able to demonstrate the best working practices for extracting JSON from a table such as this please?
Well, it isn't pretty but this appears to work:
WITH
container As (SELECT distinct containerid As id FROM jsonArray1 As container)
, material As (SELECT distinct materialid As id, containerid As cid FROM jsonArray1 As material)
, size As (SELECT sizeid As id, materialid As tid, containerid As cid FROM jsonArray1 As size)
SELECT container.id id, material.id id, size.id id
FROM container
JOIN material ON material.cid = container.id
JOIN size ON size.tid = material.id AND size.cid = material.cid
FOR JSON AUTO, ROOT
sqlfiddle example
AUTO will structure the JSON for you, but only by following the structure of the tables used in the query. Since the data starts out "flat" in a single table, AUTO won't create any structure on its own. So the trick I applied here was to use CTEs (the WITH clause) to restructure this flat data into three virtual tables whose relationships have the necessary structure.
Everything here is super-sensitive in a way that normal relational SQL would not be. For instance, just changing the order of the JOINs will restructure the JSON hierarchy even though that would have no effect on a normal SQL query.
I also had to play around with the table and column aliases (a lot) to get it to put the right names on everything in the JSON.
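The restructuring idea (group the flat rows by container, then by material, and emit each ID once) can be sketched outside SQL as well. The Python below is only an illustration of that grouping, not the SQL Server mechanism; `nest_rows` is a hypothetical name and the rows are copied from the table in the question.

```python
ROWS = [
    (848, 1, 1, 1), (849, 1, 1, 2), (850, 1, 2, 1),
    (851, 1, 2, 2), (852, 1, 3, 1), (853, 1, 4, 1),
    (854, 2, 2, 1), (855, 2, 2, 2), (856, 2, 2, 3),
]


def nest_rows(rows):
    """Group (TableId, ContainerId, MaterialId, SizeId) rows into nested dicts."""
    containers = {}
    for _table_id, container_id, material_id, size_id in rows:
        materials = containers.setdefault(container_id, {})
        materials.setdefault(material_id, []).append(size_id)
    return {
        "container": [
            {
                "id": container_id,
                "material": [
                    {"id": material_id, "size": [{"id": s} for s in sizes]}
                    for material_id, sizes in materials.items()
                ],
            }
            for container_id, materials in containers.items()
        ]
    }


nested = nest_rows(ROWS)
```

Because each container and material is a dictionary key, it can only appear once, which is the same effect the DISTINCT CTEs have in the query.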
You can use the following query
SELECT CONCAT('{"container": [',string_agg(json,','),']}') as json
FROM
(SELECT CONCAT('{"id":',CAST(ContainerId as nvarchar(100)),
',"material":[',string_agg(json,','),']}') as json,
dense_rank() over(partition by ContainerId order by ContainerId) rnk
FROM
-- each inner row is one complete material object: {"id":..,"size":[..]}
(SELECT ContainerId ,MaterialId,CONCAT('{"id":',CAST(MaterialId as nvarchar(100))
,',"size":[',string_agg('{"id":' + CAST(SizeId as nvarchar(100)) + '}',','),']}') as json
FROM tb
GROUP BY ContainerId,MaterialId) T
GROUP BY ContainerId) T
GROUP BY rnk
demo in db<>fiddle

How to unpack Array to Rows in Snowflake?

I have a table that looks like the following in Snowflake:
ID | CODES
2 | [ { "list": [ { "item": "CODE1" }, { "item": "CODE2" } ] } ]
And I want to make it into:
ID | CODES
2 | 'CODE1'
2 | 'CODE2'
So far I've tried
SELECT ID,CODES[0]:list
FROM MY_TABLE
But that only gets me as far as:
ID | CODES
2 | [ { "item": "CODE1" }, { "item": "CODE2" } ]
How can I break out every 'item' element from every index of this list into its own row with each CODE as a string?
Update: here is the answer I got working at the same time as the answer below; it looks like we both used FLATTEN:
SELECT ID,f.value:item
FROM MY_TABLE,
lateral flatten(input => MY_TABLE.CODES[0]:list) f
As you note, you have hard-coded your access into the codes via codes[0], which gives you only the first item of that array. If you use FLATTEN instead, you can access every object in the outer array.
WITH my_table(id,codes) AS (
SELECT 2, parse_json('[ { "list": [ { "item": "CODE1" }, { "item": "CODE2" } ] } ]')
)
SELECT ID, c.*
FROM my_table,
table(flatten(codes)) c;
gives:
ID | SEQ | KEY | PATH | INDEX | VALUE | THIS
2 | 1 | | [0] | 0 | { "list": [ { "item": "CODE1" }, { "item": "CODE2" }]} | [ { "list": [{"item": "CODE1"}, { "item": "CODE2" }]}]
Now you want to loop across the items in list, so we use another FLATTEN on that:
WITH my_table(id,codes) AS (
SELECT 2, parse_json('[ { "list": [ { "item": "CODE1" }, { "item": "CODE2" } ] } ]')
)
SELECT ID, c.value, l.value
FROM my_table,
table(flatten(codes)) c,
table(flatten(c.value:list)) l;
gives:
2 {"list":[{"item": "CODE1"},{"item":"CODE2"}]} {"item":"CODE1"}
2 {"list":[{"item": "CODE1"},{"item":"CODE2"}]} {"item":"CODE2"}
From there you can pull l.value apart however you need, for example l.value:item::string to get each code as a string.
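As a rough mental model of the two FLATTEN calls, here is the same unpacking written as a nested loop in Python. This is only a sketch under the assumption that every outer element has a "list" array of objects with an "item" key; `unpack_items` is a hypothetical name.

```python
import json

CODES = json.loads('[ { "list": [ { "item": "CODE1" }, { "item": "CODE2" } ] } ]')


def unpack_items(row_id, codes):
    """Return one (id, item) pair per entry of every inner "list" array."""
    pairs = []
    for outer in codes:                     # flatten(codes)
        for entry in outer.get("list", []):  # flatten(c.value:list)
            pairs.append((row_id, entry["item"]))
    return pairs


unpacked = unpack_items(2, CODES)
```

The outer loop is what removes the hard-coded codes[0]: every element of the outer array is visited, not just the first.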

How to sum a jsonb inner field in Postgres?

I want to sum the pc field from the table tbl_test, grouped by metal_id.
I tried the approach from "Postgres GROUP BY on jsonb inner field",
but it did not work for me because my JSON has a different format.
tbl_test
id json
1 [
{
"pc": "4",
"metal_id": "1"
},
{
"pc": "4",
"metal_id": "5"
}
]
2 [
{
"pc": "1",
"metal_id": "1"
},
{
"pc": "1",
"metal_id": "2"
}
]
The output I want (grouped by metal_id, with the sum of pc):
Thanks in advance!
[
"metal_id": 1
{
"pc": "5",
}
]
You can use jsonb_array_elements() to expand the jsonb array, and then aggregate by metal id:
select obj ->> 'metal_id' metal_id, sum((obj ->> 'pc')::int) cnt
from mytable t
cross join lateral jsonb_array_elements(t.js) j(obj)
group by obj ->> 'metal_id'
order by metal_id
Demo on DB Fiddle:
metal_id | cnt
:------- | --:
1 | 5
2 | 1
5 | 4
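If it helps to see the expand-then-aggregate step outside SQL, here is a small Python model of it using the rows from the question. It is a sketch only: `sum_pc_by_metal` is a hypothetical name, and it assumes pc always holds an integer stored as a string, which is why the query casts it with ::int.

```python
import json
from collections import defaultdict

TABLE = {
    1: '[{"pc": "4", "metal_id": "1"}, {"pc": "4", "metal_id": "5"}]',
    2: '[{"pc": "1", "metal_id": "1"}, {"pc": "1", "metal_id": "2"}]',
}


def sum_pc_by_metal(table):
    """Expand every JSON array into objects, then sum pc per metal_id."""
    totals = defaultdict(int)
    for raw in table.values():
        for obj in json.loads(raw):         # jsonb_array_elements()
            totals[obj["metal_id"]] += int(obj["pc"])  # (obj ->> 'pc')::int
    return dict(sorted(totals.items()))     # order by metal_id


totals = sum_pc_by_metal(TABLE)
```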

How do I flatten a nested json key/value pair into a single array of values?

In SNOWFLAKE, I have a data structure like:
ORGANIZATION TABLE
------------------
Org:variant
------------------
{
relationships: [{
{ name: 'mother', value: a },
{ name: 'siblings', value: [ 'c', 'd' ] }
}]
}
PEOPLE TABLE
-------------------
Person:variant
-------------------
{
id: a
name: Mary
}
-------------------
{
id: b
name: Joe
}
-------------------
{
id: c
name: John
}
I want to have a result of:
ORGANIZATION | PEOPLE
---------------------------------------------------|----------------------------
{ |[
relationships: [{ | {
{ name: 'mother', value: a }, | id: a,
{ name: 'siblings', value: [ 'c', 'd' ] } | name: Mary
}] | },
} | {
| id: b,
| name: Joe
| },
| {
| id: c,
| name: john
| }
|]
I'm sure ARRAY_AGG is involved somehow but I'm at a loss how I would aggregate the results up into a single array of values.
My current query:
SELECT Org, ARRAY_AGG(Person) as People
FROM Organizations
INNER JOIN People ON People.id IN Org.relationships...?? (I'm lost here)
GROUP BY Org
The below queries illustrate how to use FLATTEN and ARRAY_AGG to get the output you want.
FLATTEN unnests each array so that you can join on the values within.
ARRAY_AGG aggregates the values grouped by org.
The CASE expression accounts for a relationship's value not always being an array: it can be a single id or a list of ids.
CREATE OR REPLACE TABLE organizations (org variant) AS
SELECT parse_json('{relationships: [{ name: "mother", value: "a" }, { name: "siblings", value: [ "b", "c" ] } ] } ');
CREATE OR REPLACE TABLE people (person variant) AS
SELECT parse_json($1)
FROM
VALUES ('{id:"a", name: "Mary"}'),
('{id:"b", name: "Joe"}'),
('{id:"c", name: "John"}');
WITH org_people AS
(SELECT o.org,
relationship.value AS relationship,
CASE is_array(relationship:value)
WHEN TRUE THEN person_in_relationship.value
ELSE relationship:value
END AS person_in_relationship
FROM organizations o,
LATERAL FLATTEN(o.org:relationships) relationship ,
LATERAL FLATTEN(relationship.value:value, OUTER=>TRUE) person_in_relationship
)
SELECT op.org,
ARRAY_AGG(p.person) AS people
FROM org_people op
JOIN people p ON p.person:id = op.person_in_relationship
GROUP BY op.org;
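The same three steps (unnest the relationships, normalise scalar versus array values, then join and aggregate) can be sketched in Python to check the logic. This is an illustration only, using the sample data from the answer; `people_for_org` is a hypothetical name.

```python
ORG = {
    "relationships": [
        {"name": "mother", "value": "a"},
        {"name": "siblings", "value": ["b", "c"]},
    ]
}
PEOPLE = [
    {"id": "a", "name": "Mary"},
    {"id": "b", "name": "Joe"},
    {"id": "c", "name": "John"},
]


def people_for_org(org, people):
    """Collect the person records referenced by any relationship of the org."""
    wanted_ids = []
    for relationship in org["relationships"]:       # first FLATTEN
        value = relationship["value"]
        # the CASE / OUTER => TRUE part: a scalar id is treated as a one-item list
        wanted_ids.extend(value if isinstance(value, list) else [value])
    by_id = {person["id"]: person for person in people}
    # inner join plus ARRAY_AGG: ids with no matching person are dropped
    return [by_id[i] for i in wanted_ids if i in by_id]


matched = people_for_org(ORG, PEOPLE)
```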