Asking for help on the correct way to use SQL with a CTE to create JSON_OBJECT - sql

The requested JSON needs to be in this form:
{
"header": {
"InstanceName": "US"
},
"erpReferenceData": {
"erpReferences": [
{
"ServiceID": "fb16e421-792b-4e9c-935b-3cea04a84507",
"ERPReferenceID": "J0000755"
},
{
"ServiceID": "7d13d907-0932-44c0-ad81-600c9b97b6e5",
"ERPReferenceID": "J0000756"
}
]
}
}
The program that I created looks like this:
dcl-s OutFile sqltype(dbclob_file);
exec sql
With x as (
select json_object(
'InstanceName' : trim(Cntry) ) objHeader
from xmlhdr
where cntry = 'US'),
y as (
select json_object(
'ServiceID' VALUE S.ServiceID,
'ERPReferenceID' VALUE I.RefCod) ObjRef
FROM IMH I
INNER JOIN GUIDS G ON G.REFCOD = I.REFCOD
INNER JOIN SERV S ON S.GUID = G.GUID
WHERE G.XMLTYPE = 'Service')
VALUES (
select json_object('header' : objHeader Format json ,
'erpReferenceData' : json_object(
'erpReferences' VALUE
JSON_ARRAYAGG(
ObjRef Format json)))
from x
LEFT OUTER JOIN y ON 1=1
Group by objHeader)
INTO :OutFile;
This is the compile error I get:
SQL0122: Position 41 Column OBJHEADER or expression in SELECT list not valid.
Is this the correct way to build this SQL statement, or is there a better, easier way? Any idea how to rewrite the statement so that it works correctly?

The key to generating JSON (or XML, for that matter) is to start from the inside and work your way out.
(I've simplified the raw data into just a test table...)
with elm as(select json_object
('ServiceID' VALUE ServiceID,
'ERPReferenceID' VALUE RefCod) as erpRef
from jsontst)
select * from elm;
Now add the next layer as a CTE that builds on the first CTE:
with elm as(select json_object
('ServiceID' VALUE ServiceID,
'ERPReferenceID' VALUE RefCod) as erpRef
from jsontst)
, arr (arrDta) as (values json_array (select erpRef from elm))
select * from arr;
And the next layer...
with elm as(select json_object
('ServiceID' VALUE ServiceID,
'ERPReferenceID' VALUE RefCod) as erpRef
from jsontst)
, arr (arrDta) as (values json_array (select erpRef from elm))
, erpReferences (refs) as ( select json_object
('erpReferences' value arrDta )
from arr)
select *
from erpReferences;
The nice thing about building with CTEs is that at each step you can see the results so far.
You can always go back and put a Select * from CTE; in the middle to see what you have at that point.
Note that I'm building this in Run SQL Scripts. Once you have the statement complete, you can embed it in your RPG program.
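The same inside-out layering can be tried outside Db2. The following is a minimal sketch in Python using the standard-library sqlite3 module, so it is SQLite's JSON functions and not Db2 for i syntax. It assumes the SQLite build bundled with Python includes the JSON functions. The jsontst table and its two rows mirror the simplified test data above, and the hard-coded 'US' header value stands in for the xmlhdr lookup. SQLite's json() plays the role that FORMAT JSON plays in Db2: it marks text coming out of an earlier CTE as JSON so that it is embedded instead of being quoted as a string.

```python
import json
import sqlite3

# In-memory stand-in for the simplified test table used in the answer.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE jsontst (ServiceID TEXT, RefCod TEXT);
    INSERT INTO jsontst VALUES
        ('fb16e421-792b-4e9c-935b-3cea04a84507', 'J0000755'),
        ('7d13d907-0932-44c0-ad81-600c9b97b6e5', 'J0000756');
""")

# Each CTE wraps the previous one: element -> array -> erpReferences -> document.
query = """
WITH elm AS (
    SELECT json_object('ServiceID', ServiceID,
                       'ERPReferenceID', RefCod) AS erpRef
    FROM jsontst
),
arr AS (
    SELECT json_group_array(json(erpRef)) AS arrDta FROM elm
),
erpReferences AS (
    SELECT json_object('erpReferences', json(arrDta)) AS refs FROM arr
)
SELECT json_object(
    'header', json_object('InstanceName', 'US'),
    'erpReferenceData', json(refs)
)
FROM erpReferences
"""

doc = json.loads(conn.execute(query).fetchone()[0])
print(json.dumps(doc, indent=2))
```

Because the array is aggregated in its own CTE, the outermost SELECT works on a single row and needs no GROUP BY at all.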

Related

Parse Json - CTE & filtering

I need to remove a few records (those that contain t) so that I can parse/flatten the data column. The query that creates 'tab' works on its own, but when I use it as a CTE and then parse the JSON, I get the same error as if I had never tried to filter out the culprit rows.
with tab as (
select * from table
where data like '%t%')
select b.value::string, a.* from tab a,
lateral flatten( input => PARSE_JSON( a.data) ) b;
error:
Error parsing JSON: unknown keyword "test123", pos 8
example data:
Date Data
1-12-12 {id: 13-43}
1-12-14 {id: 43-43}
1-11-14 {test12}
1-11-14 {test2}
1-02-14 {id: 44-43}
It is possible to replace PARSE_JSON(a.data) with TRY_PARSE_JSON(a.data) which will produce NULL instead of error for invalid input.
More at: TRY_PARSE_JSON
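To illustrate the "NULL instead of an error" behaviour, here is a small Python sketch. The helper try_parse_json and the sample rows are hypothetical and only mimic the idea; this is not Snowflake's implementation, and Python's json module is stricter than Snowflake's parser, so the sample rows use strictly valid JSON for the good records.

```python
import json


def try_parse_json(text):
    """Return the parsed value, or None when text is not valid JSON."""
    try:
        return json.loads(text)
    except (TypeError, ValueError):
        # json.JSONDecodeError is a subclass of ValueError;
        # TypeError covers a None (SQL NULL) input.
        return None


# Hypothetical rows: two parseable records and one culprit.
rows = ['{"id": "13-43"}', '{test12}', '{"id": "44-43"}']

parsed = [try_parse_json(r) for r in rows]
valid = [p for p in parsed if p is not None]
print(valid)
```

The bad row no longer aborts the whole query; it just yields an empty value that can be filtered out afterwards.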

Replacing substring with variables in SQL

I am trying to work out how to do a somewhat complex data migration in my database, and whether it is even possible in SQL (I am not a very experienced SQL developer).
Let's say that I store JSON documents in a text column of a Postgres table, in roughly the following format:
{"type":"something","params":[{"value":"00de1be5-f75b-4072-ba30-c67e4fdf2333"}]}
Now, I would like to migrate the value part to a bit more complex format:
{"type":"something","params":[{"value":{"id":"00de1be5-f75b-4072-ba30-c67e4fdf2333","path":"/hardcoded/string"}}]}
Furthermore, I also need to reason whether the value contains a UUID pattern, and if not, use slightly different structure:
{"type":"something-else","params":[{"value":"not-id"}]} ---> {"type":"something-else","params":[{"value":{"value":"not-id","path":""}}]}
I know I can define a procedure and use REGEXP_REPLACE: REGEXP_REPLACE(source, pattern, replacement_string [, flags]), but I have no idea how to decide whether the content is an ID or not. Could someone suggest at least a direction or a hint on how to do this?
You can use the jsonb functions to extract the data and change it. At the end, merge the rebuilt params array back into the original document.
Sample data structure and query result: dbfiddle
select
(t.data::jsonb || jsonb_build_object(
'params',
jsonb_agg(
jsonb_build_object(
'value',
case
when e.value->>'value' ~* '^[0-9A-F]{8}-[0-9A-F]{4}-4[0-9A-F]{3}-[89AB][0-9A-F]{3}-[0-9A-F]{12}$' then
jsonb_build_object('id', e.value->>'value', 'path', '/hardcoded/string')
else
jsonb_build_object('value', e.value->>'value', 'path', '')
end
)
)
))::text
from
test t
cross join jsonb_array_elements(t.data::jsonb->'params') e
group by t.data
PS:
If your table has an id or another unique column, you can group by that column instead of t.data:
select
(t.data::jsonb || jsonb_build_object(
'params',
jsonb_agg(
jsonb_build_object(
'value',
case
when e.value->>'value' ~* '^[0-9A-F]{8}-[0-9A-F]{4}-4[0-9A-F]{3}-[89AB][0-9A-F]{3}-[0-9A-F]{12}$' then
jsonb_build_object('id', e.value->>'value', 'path', '/hardcoded/string')
else
jsonb_build_object('value', e.value->>'value', 'path', '')
end
)
)
))::text
from
test t
cross join jsonb_array_elements(t.data::jsonb->'params') e
group by t.id
To replace values at any depth, you can use a recursive CTE to run replacements for each value of a value key, using a conditional to check if the value is a UUID, and producing the proper JSON object accordingly:
with recursive cte(v, i, js) as (
select (select array_to_json(array_agg(distinct t.i))
from (select (regexp_matches(js, '"value":("[\w\-]+")', 'g'))[1] i) t), 0, js from (select '{"type":"something","params":[{"value":"00de1be5-f75b-4072-ba30-c67e4fdf2333"}, {"value":"sdfsa"}]}' js) t1
union all
select c.v, c.i+1, regexp_replace(
regexp_replace(c.js, regexp_replace((c.v -> c.i)::text, '[\\"]+', '', 'g'),
case when not ((c.v -> c.i)::text ~ '\w+\-\w+\-\w+\-\w+\-\w+') then
json_build_object('value', regexp_replace((c.v -> c.i)::text, '[\\"]+', '', 'g'), 'path', '')::text
else json_build_object('id', regexp_replace((c.v -> c.i)::text, '[\\"]+', '', 'g'), 'path', '/hardcoded/path')::text end, 'g'),
'(")(?=\{)|(?<=\})(")', '', 'g')
from cte c where c.i < json_array_length(c.v)
)
select js from cte order by i desc limit 1
Output:
{"type":"something","params":[{"value":{"id" : "00de1be5-f75b-4072-ba30-c67e4fdf2333", "path" : "/hardcoded/path"}}, {"value":{"value" : "sdfsa", "path" : ""}}]}
On a more complex JSON input string:
{"type":"something","params":[{"value":"00de1be5-f75b-4072-ba30-c67e4fdf2333"}, {"value":"sdfsa"}, {"more":[{"additional":[{"value":"00f41be5-g75b-4072-ba30-c67e4fdf3777"}]}]}]}
Output:
{"type":"something","params":[{"value":{"id" : "00de1be5-f75b-4072-ba30-c67e4fdf2333", "path" : "/hardcoded/path"}}, {"value":{"value" : "sdfsa", "path" : ""}}, {"more":[{"additional":[{"value":{"id" : "00f41be5-g75b-4072-ba30-c67e4fdf3777", "path" : "/hardcoded/path"}}]}]}]}

PostgreSQL: select from field with json format

Table has column, named "config" with following content:
{
"A":{
"B":[
{"name":"someName","version":"someVersion"},
{"name":"someName","version":"someVersion"}
]
}
}
The task is to select all name and version values. The expected output is a result set with two columns: name and version.
I successfully select the content of B:
select config::json -> 'A' -> 'B' as B
from my_table;
But when I'm trying to do something like:
select config::json -> 'A' -> 'B' ->> 'name' as name,
config::json -> 'A' -> 'B' ->> 'version' as version
from my_table;
I get a result set whose columns are empty.
If the array size is fixed, you just need to say which element of the array you want to retrieve, e.g.:
SELECT config->'A'->'B'->0->>'name' AS name,
config->'A'->'B'->0->>'version' AS version
FROM my_table;
But since your array contains multiple elements, use the function jsonb_array_elements in a subquery or CTE, and parse each element individually in the outer query, e.g.:
SELECT rec->>'name', rec->>'version'
FROM (SELECT jsonb_array_elements(config->'A'->'B')
FROM my_table) j (rec);
Demo: db<>fiddle
First, you should use the jsonb data type instead of json; see the documentation:
In general, most applications should prefer to store JSON data as
jsonb, unless there are quite specialized needs, such as legacy
assumptions about ordering of object keys.
Using jsonb, you can do the following :
SELECT DISTINCT ON (c) c->'name' AS name, c->'version' AS version
FROM my_table
CROSS JOIN LATERAL jsonb_path_query(config :: jsonb, '$.** ? (exists(@.name))') AS c
dbfiddle
select e.value ->> 'name', e.value ->> 'version'
from
my_table cross join json_array_elements(config::json -> 'A' -> 'B') e
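The reason the direct ->> 'name' lookup comes back empty is that B is an array, not an object, so it has no name key of its own; the array has to be expanded into one row per element first. The sketch below shows that with Python's standard-library sqlite3 module, where json_each plays the role of json_array_elements. It is SQLite syntax and not PostgreSQL, it assumes the bundled SQLite includes the JSON functions, and the values n1/v1 and n2/v2 are made up so the two elements can be told apart.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (config TEXT)")
conn.execute(
    "INSERT INTO my_table VALUES (?)",
    ('{"A":{"B":[{"name":"n1","version":"v1"},'
     '{"name":"n2","version":"v2"}]}}',),
)

# Asking the array itself for "name" finds nothing.
direct = conn.execute(
    "SELECT json_extract(config, '$.A.B.name') FROM my_table"
).fetchone()[0]

# Expanding the array yields one row per element to read from.
rows = conn.execute("""
    SELECT json_extract(e.value, '$.name'),
           json_extract(e.value, '$.version')
    FROM my_table, json_each(my_table.config, '$.A.B') AS e
""").fetchall()

print(direct)
print(rows)
```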

How to aggregate the elements in a struct in bigquery

"struct": [
{
"ele_1": "abcd",
"ele_2": "1.0"
},
{
"ele_1": "egf",
"ele_2": "1.0"
}
]
I have data like this in struct format, and I am trying to get to something like the result below:
first a string_agg on ele_1 within the struct, and then a sum on ele_2. I have tried unnest(struct), but that causes duplicated rows.
"ele_1": "abcd,egf",
"ele_2": "2.0"
You seem to have an array of records and want to separately aggregate the fields:
select t.*,
(select array_agg(rec.ele_1)
from unnest(t.record_array) rec
),
(select sum(rec.ele_2)
from unnest(t.record_array) rec
)
from t;
Another [optimized] option below
select * except(struct_col),
(select as struct
string_agg(ele_1) ele_1,
sum(ele_2) ele_2
from t.struct_col
).*
from `project.dataset.table` t
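The per-row aggregation both answers perform can be sketched in plain Python as follows. The variable names are hypothetical and this is not BigQuery code. One assumption worth noting: in the sample data ele_2 is the string "1.0", so it has to be converted to a number before summing, and a similar cast would presumably be needed in SQL if the column is stored as a string.

```python
# One row's array of structs, copied from the sample in the question.
record = {
    "struct": [
        {"ele_1": "abcd", "ele_2": "1.0"},
        {"ele_1": "egf", "ele_2": "1.0"},
    ]
}

# Aggregate inside the row: no join against the outer table,
# so no rows are duplicated.
aggregated = {
    "ele_1": ",".join(item["ele_1"] for item in record["struct"]),
    "ele_2": sum(float(item["ele_2"]) for item in record["struct"]),
}
print(aggregated)
```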

How to group by duplicate value of nested array in Postgresql?

Previous question: How to group by duplicate value and nested the array Postgresql
Using this query :
SELECT json_build_object(
'nama_perusahaan',"a"."nama_perusahaan",
'proyek', json_agg(
json_build_object(
'no_izin',"b"."no_izin",
'kode',c.kode,
'judul_kode',d.judul
)
)
)
FROM "t_pencabutan" "a"
LEFT JOIN "t_pencabutan_non" "b" ON "a"."id_pencabutan" = "b"."id_pencabutan"
LEFT JOIN "t_pencabutan_non_b" "c" ON "b"."no_izin" = "c"."no_izin"
LEFT JOIN "t_pencabutan_non_c" "d" ON "c"."id_proyek" = "d"."id_proyek"
GROUP BY "a"."nama_perusahaan"
The result is shown below:
{
"nama_perusahaan" : "JASA FERRIE",
"proyek" :
{
"no_izin" : "26A/E/IU/PMA/D8FD",
"kode" : "14302",
"judul_kode" : "IND"
}
{
"no_izin" : "26A/E/IU/PMA/D8FD",
"kode" : "13121",
"judul_kode" : "IND B"
}
}
As you can see, the proyek entries are nested under each company, so duplicate company rows are grouped. Now I need to group entries that share the same no_izin as well, which gives a doubly nested array like the expected result below.
{
"nama_perusahaan" : "JASA FERRIE",
"proyek" :
[{
"no_izin" : "26A/E/IU/PMA/D8FD",
"kode_list":[
{
"kode" : "14302",
"judul_kode" : "IND"
},
{
"kode" : "13121",
"judul_kode" : "IND B"
}]
}]
}
I tried to use this query:
SELECT json_build_object(
'nama_perusahaan',"a"."nama_perusahaan",
'proyek', json_agg(
json_build_object(
'no_izin',"b"."no_izin",
'kode_list',json_agg(
json_build_object(
'kode',c.kode,
'judul_kode',d.judul
)
)
)
)
)
FROM "t_pencabutan" "a"
LEFT JOIN "t_pencabutan_non" "b" ON "a"."id_pencabutan" = "b"."id_pencabutan"
LEFT JOIN "t_pencabutan_non_b" "c" ON "b"."no_izin" = "c"."no_izin"
LEFT JOIN "t_pencabutan_non_c" "d" ON "c"."id_proyek" = "d"."id_proyek"
GROUP BY "a"."nama_perusahaan", b.no_izin
but it didn't work. It gives ERROR: aggregate function calls cannot be nested LINE 6:'kode_list',json_agg(.
What is wrong with my code?
Disclaimer: It is very hard for us to construct a query without knowing the input data and table structure, and while handling identifiers in a language we don't know. Please minimize your future questions (e.g. for this question it is not relevant that you need to join some tables before converting the result into JSON), create examples in English (unfamiliar identifiers make the code confusing and lead to spelling errors, so a correct idea can fail on a mistyped word), and add the input data! This helps you as well: you get an answer faster and the chance of mistakes in the code is much lower (without the data we cannot create a runnable example to check our ideas).
A nested JSON structure can only be built from the innermost object outward. So first create the no_izin objects, each with its kode_list array, in a subquery. The outer query can then aggregate them into the proyek array:
SELECT
json_build_object(
'nama_perusahaan',"s"."nama_perusahaan",
'proyek', json_agg(no_izin)
)
FROM (
SELECT
"a"."nama_perusahaan",
json_build_object(
'no_izin',
"b"."no_izin",
'kode_list',
json_agg(
json_build_object(
'kode',c.kode,
'judul_kode',d.judul
)
)
) AS no_izin
FROM "t_pencabutan" "a"
LEFT JOIN "t_pencabutan_non" "b" ON "a"."id_pencabutan" = "b"."id_pencabutan"
LEFT JOIN "t_pencabutan_non_b" "c" ON "b"."no_izin" = "c"."no_izin"
LEFT JOIN "t_pencabutan_non_c" "d" ON "c"."id_proyek" = "d"."id_proyek"
GROUP BY "a"."nama_perusahaan", "b"."no_izin"
) AS s
GROUP BY "s"."nama_perusahaan"