How to return "sparse" JSON (choosing a subset of attributes) from PostgreSQL

MongoDB has a way of choosing the fields of a JSON document that are returned as the result of a query. I am looking for the same in PostgreSQL.
Let's assume that I've got a JSON like this:
{
  "a": valuea,
  "b": valueb,
  "c": valuec,
  ...
  "z": valuez
}
The particular values may be either simple values or subobjects with further nesting.
I want a way of returning JSON documents containing only the attributes I choose, something like:
SELECT json_col including_only a,b,c,g,n from table where...
I know that there is the "-" operator, which removes specific attributes, but is there an operator that does exactly the opposite?

In trivial cases you can use jsonb_to_record(jsonb):
with data(json_col) as (
  values ('{"a": 1, "b": 2, "c": 3, "d": 4}'::jsonb)
)
select *, to_jsonb(rec) as result
from data
cross join jsonb_to_record(json_col) as rec(a int, d int)
json_col | a | d | result
----------------------------------+---+---+------------------
{"a": 1, "b": 2, "c": 3, "d": 4} | 1 | 4 | {"a": 1, "d": 4}
(1 row)
See JSON Functions and Operators.
If you need a more generic tool, the following function does the job:
create or replace function jsonb_sparse(jsonb, text[])
returns jsonb language sql immutable as $$
  select $1 - (
    select array_agg(key)
    from jsonb_object_keys($1) as key
    where key <> all($2)
  )
$$;
-- use:
select jsonb_sparse('{"a": 1, "b": 2, "c": 3, "d": 4}', '{a, d}')
Test it in db<>fiddle.
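An alternative sketch: instead of deleting the unwanted keys, rebuild the object from only the wanted pairs with jsonb_each and jsonb_object_agg. The function name jsonb_pick is hypothetical; keys missing from the input are simply omitted from the result.
create or replace function jsonb_pick(obj jsonb, keys text[])
returns jsonb language sql immutable as $$
  -- keep only the pairs whose key appears in the requested list
  select coalesce(jsonb_object_agg(key, value), '{}'::jsonb)
  from jsonb_each(obj)
  where key = any(keys)
$$;
select jsonb_pick('{"a": 1, "b": 2, "c": 3, "d": 4}', '{a, d}');
-- {"a": 1, "d": 4}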

Related

Using JSON_VALUE in ORACLE PL/SQL to retrieve a nested key

I know that this returns "2":
SELECT JSON_VALUE('{"x": "1", "y": {"a": "2"}}', '$.y.a') AS value FROM DUAL;
How do I return "a" (i.e. {"a": "2"}) from this query?
SELECT JSON_VALUE('{"x": "1", "y": {"a": "2"}}', '????') AS value FROM DUAL;
This returns null:
SELECT JSON_VALUE('{"x": "1", "y": {"a": "2"}}', '$.y') AS value FROM DUAL;
Assuming you are using Oracle 12 or later (which is when JSON support was introduced), to get the y attribute as JSON you can use:
SELECT y
FROM JSON_TABLE(
  '{"x": "1", "y": {"a": "2"}}',
  '$'
  COLUMNS
    y VARCHAR2(4000) FORMAT JSON PATH '$.y'
);
Which outputs:
Y
{"a":"2"}
fiddle
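A shorter alternative, assuming the same Oracle 12.1+ JSON support: JSON_VALUE only returns scalar values (which is why '$.y' yields null above), while JSON_QUERY returns JSON fragments such as objects and arrays.
-- JSON_QUERY returns objects/arrays, unlike the scalar-only JSON_VALUE
SELECT JSON_QUERY('{"x": "1", "y": {"a": "2"}}', '$.y') AS value FROM DUAL;
-- {"a":"2"}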

postgresql triggers to insert json data into columns and keep left over fields

I have found this question PostgreSQL: Efficiently split JSON array into rows
I have a similar situation but for inserts instead.
Considering I do not have a table but raw JSON in an ndjson file...
{"x": 1}
{"x": 2, "y": 3}
{"x": 8, "z": 3}
{"x": 5, "y": 2, "z": 3}
I want to insert the data into a table of the following form, where JSON fields that do not have a dedicated column are stored in the json column:
 x | y    | json
---+------+----------
 1 | NULL | NULL
 2 | 3    | NULL
 8 | NULL | {"z": 3}
 5 | 2    | {"z": 3}
How do I define my table such that PostgreSQL does this automatically on insert or \copy?
Use the -> operator and cast the value to the proper type for values of existing regular columns. Use the delete operator (-) to get the remaining JSON values.
I have used a CTE in the example below. Instead, create the table json_data with a single jsonb column and copy the JSON file to it with \copy.
with json_data(json) as (
  values
    ('{"x": 1}'::jsonb),
    ('{"x": 2, "y": 3}'),
    ('{"x": 8, "z": 3}'),
    ('{"x": 5, "y": 2, "z": 3}')
)
select
  (json->'x')::int as x,
  (json->'y')::int as y,
  nullif(json - 'x' - 'y', '{}') as json
from json_data
Read about JSON Functions and Operators in the documentation.
Note: in Postgres 10 or earlier, use the ->> operator instead of ->, as jsonb scalars cannot be cast directly to int there.
To automate the conversion when importing json data, define a trigger:
create table json_data(json jsonb);
create or replace function json_data_trigger()
returns trigger language plpgsql as $$
begin
  insert into target_table
  select
    (new.json->>'x')::int,
    (new.json->>'y')::int,
    nullif(new.json - 'x' - 'y', '{}');
  return new;
end $$;

create trigger json_data_trigger
before insert on json_data
for each row execute procedure json_data_trigger();
Test it in Db<>Fiddle.
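A minimal usage sketch, assuming a target table like the one below and a file data.ndjson with one JSON object per line (both names are hypothetical):
-- the trigger above inserts the converted rows here
create table target_table(x int, y int, json jsonb);
-- load the file into the staging table; the trigger fires per row:
-- \copy json_data from 'data.ndjson'
-- select * from target_table;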

BigQuery Dynamic JSON attributes as columnar data

I have a table with one of the columns containing JSON:
Col_A | Col_B | Col_C
------+-------+-------------------------------------------------------------------
1     | Abc1  | {"a": "a_val", "b": "b_val"}
2     | Abc2  | {"a": "a_val2", "c": "c_val2"}
3     | Abc3  | {"b": "b_val3", "c": "c_val3", "d": {"x": "x_val", "y": "y_val"}}
How can I put together BigQuery SQL to extract the attributes of the JSON as additional columns? I need to go only one level deep into the JSON, so the output should look like:
Col_A | Col_B | A      | B      | C      | D
------+-------+--------+--------+--------+------------------------------
1     | Abc1  | a_val  | b_val  |        |
2     | Abc2  | a_val2 |        | c_val2 |
3     | Abc3  |        | b_val3 | c_val3 | {"x": "x_val", "y": "y_val"}
Consider below approach
create temp function json_keys(input string) returns array<string> language js as """
  return Object.keys(JSON.parse(input));
""";
create temp function json_path(json string, json_path string)
returns string
language js as """
  try {
    var parsed = JSON.parse(json);
    return JSON.stringify(jsonPath(parsed, json_path));
  } catch (e) { return null }
"""
OPTIONS (
  library="gs://my-storage/jsonpath-0.8.0.js"
);
select * from (
  select t.* except(col_c), key, trim(json_path(col_c, '$.' || key), '"[]') value
  from your_table t,
  unnest(json_keys(col_c)) key
)
pivot (min(value) for key in ('a', 'b', 'c', 'd'))
If applied to the sample data in your question:
with your_table as (
  select 1 col_a, 'abc1' col_b, '{"a": "a_val", "b": "b_val"}' col_c union all
  select 2, 'abc2', '{"a": "a_val2", "c": "c_val2"}' union all
  select 3, 'abc3', '{"b": "b_val3", "c": "c_val3", "d": {"x": "x_val", "y": "y_val"}}'
)
the output matches the expected result shown in the question.
In order to use the above, you need to upload jsonpath-0.8.0.js (it can be downloaded at https://code.google.com/archive/p/jsonpath/downloads) into your GCS bucket gs://my-storage/.
As you can see, the above solution assumes you know the key names in advance.
Obviously, when you know the keys in advance you would simply use json_extract, but that does not work when you don't know them!
So, if you don't know the keys, the above solution can be used as a template for dynamically generating the query (with the real keys in for key in ('a', 'b', 'c', 'd')) to be executed with EXECUTE IMMEDIATE.
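A rough sketch of that dynamic variant, under the assumption that the json_keys and json_path temp functions above are created in the same script and remain visible to the dynamic statement (note that in BigQuery scripting, declare must come before the create temp function statements):
declare keys string;
-- collect the distinct top-level keys as a quoted, comma-separated list
set keys = (
  select string_agg(distinct "'" || key || "'")
  from your_table, unnest(json_keys(col_c)) key
);
-- splice the real keys into the pivot query and run it
execute immediate format("""
  select * from (
    select t.* except(col_c), key, trim(json_path(col_c, '$.' || key), '"[]') value
    from your_table t, unnest(json_keys(col_c)) key
  )
  pivot (min(value) for key in (%s))
""", keys);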

PostgreSQL: exclude complete jsonb array if one element fails the WHERE clause

Assume a table json_table with columns id (int), data (jsonb).
A sample jsonb value would be
{"a": [{"b":{"c": "xxx", "d": 1}},{"b":{"c": "xxx", "d": 2}}]}
When I use an SQL statement like the following:
SELECT data
FROM json_table j, jsonb_array_elements(j.data#>'{a}') dt
WHERE (dt#>>'{b,d}')::integer NOT IN (2,4,6,9)
GROUP BY id;
... the two array elements are unnested, and the one that satisfies the WHERE clause still causes the row to be returned. This makes sense, since each array element is considered individually. In this example I get back the complete row:
{"a": [{"b":{"c": "xxx", "d": 1}},{"b":{"c": "xxx", "d": 2}}]}
I'm looking for a way to exclude the complete json_table row when any jsonb array element fails the condition.
You can move the condition to the WHERE clause and use NOT EXISTS:
SELECT data
FROM json_table j
WHERE NOT EXISTS (
  SELECT 1
  FROM jsonb_array_elements(j.data#>'{a}') dt
  WHERE (dt#>>'{b,d}')::integer IN (2, 4, 6, 9)
);
You can also achieve it with a JSON path query (jsonb_path_match requires Postgres 12 or later; note that the current item inside a jsonpath filter is @):
select data
from json_table
where jsonb_path_match(data, '!exists($.a[*].b.d ? (@ == 2 || @ == 4 || @ == 6 || @ == 9))')
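An equivalent formulation, on the same Postgres 12+ assumption, keeps the negation in SQL and the existence test in the path:
select data
from json_table
where not jsonb_path_exists(data, '$.a[*].b.d ? (@ == 2 || @ == 4 || @ == 6 || @ == 9)');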

Extract all JSON keys

I have a JSON column j like:
{"a": 2, "b": {"b1": 3, "b2": 5}}
{"c": 3, "a": 5}
{"d": 1, "c": 7}
How can I get all distinct (top-level) key names from Presto? I.e. I want something like
select distinct foo(j)
To return
['a', 'b', 'c', 'd']
(note that in this instance I'm not too concerned with the nested keys)
The Presto documentation doesn't have any function that explicitly fits the bill. The only thing that looks close is the mention of JSONPath syntax, but even this seems to be inaccurate. At least one of the following should return something, but all failed in Presto for me:
select json_extract(j, '$.*')
select json_extract(j, '$..*')
select json_extract(j, '$[*]')
select json_extract(j, '*')
select json_extract(j, '..*')
select json_extract(j, '$*.*')
Further, I suspect this will return the values, not the keys, from j (i.e., [2, 3, 5, 3, 5, 1, 7]).
You can:
- extract the JSON top-level keys with map_keys(cast(json_column as map<varchar,json>)),
- then "flatten" the key collections using CROSS JOIN UNNEST,
- then SELECT DISTINCT to get the distinct top-level keys.
Example putting this together:
presto> SELECT DISTINCT m.key
-> FROM (VALUES JSON '{"a": 2, "b": {"b1": 3, "b2": 5}}', JSON '{"c": 3, "a": 5}')
-> example_table(json_column)
-> CROSS JOIN UNNEST (map_keys(CAST(json_column AS map<varchar,json>))) AS m(key);
key
-----
a
b
c
(3 rows)
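If you want the single-array shape from the question (['a', 'b', 'c', 'd']), here is a sketch along the same lines, reusing the inline example_table from above; it assumes a Presto version that supports ORDER BY inside array_agg:
SELECT array_agg(key ORDER BY key) AS keys
FROM (
  SELECT DISTINCT m.key
  FROM (VALUES JSON '{"a": 2, "b": {"b1": 3, "b2": 5}}', JSON '{"c": 3, "a": 5}')
  example_table(json_column)
  CROSS JOIN UNNEST (map_keys(CAST(json_column AS map<varchar,json>))) AS m(key)
) t;
-- keys: [a, b, c]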