How to extract nested JSON with SQL

I have a table named 'my doc' with a column called 'element' that holds nested JSON. The structure is shown below.
I want to extract league_id. How can I do that with SQL? Thanks.
{
“api”: {
“Results

Different SQL implementations handle JSON differently.
For example, MySQL supports JSON columns natively and ships JSON functions such as JSON_OBJECT:
-- returns {"a": 1, "b": 2}:
SELECT JSON_OBJECT('a', 1, 'b', 2);
https://www.sitepoint.com/use-json-data-fields-mysql-databases/

Related

PostgreSQL triggers to insert JSON data into columns and keep leftover fields

I have found this question PostgreSQL: Efficiently split JSON array into rows
I have a similar situation but for inserts instead.
Considering I do not have a table but raw json in a ndjson file...
{"x": 1}
{"x": 2, "y": 3}
{"x": 8, "z": 3}
{"x": 5, "y": 2, "z": 3}
I want to insert the data into a table of the form (where json fields which do not have a column are stored in the json column)
 x | y    | json
---+------+----------
 1 | NULL | NULL
 2 | 3    | NULL
 8 | NULL | {"z": 3}
 5 | 2    | {"z": 3}
How do I define my table so that PostgreSQL does this automatically on insert or \copy?
Use the operator -> and cast the value to the proper type for values of existing regular columns. Use the delete operator to get the remaining JSON values.
The example below uses a CTE for the sample data. In practice, create the table json_data with a single JSONB column and load the JSON file into it with \copy.
with json_data(json) as (
    values
        ('{"x": 1}'::jsonb),
        ('{"x": 2, "y": 3}'),
        ('{"x": 8, "z": 3}'),
        ('{"x": 5, "y": 2, "z": 3}')
)
select
    (json->'x')::int as x,
    (json->'y')::int as y,
    nullif(json - 'x' - 'y', '{}') as json
from json_data
Read about JSON Functions and Operators in the documentation.
Note: in Postgres 10 or earlier, use the ->> operator instead of ->, because casting a jsonb value directly to int is not supported there.
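To make the split between regular columns and leftover JSON concrete, here is a minimal Python sketch of the same logic, applied to the NDJSON lines from the question. It is only an illustration of what the query computes, not a replacement for it; the names KNOWN_COLUMNS and split_record are hypothetical.

```python
import json

# Hypothetical: the regular columns of the target table.
KNOWN_COLUMNS = ("x", "y")

def split_record(line):
    """Return (x, y, leftover) for one NDJSON line.

    Mirrors the SQL: known keys become column values (None when missing),
    the remaining keys stay as JSON, and an empty remainder becomes None
    (the nullif(..., '{}') step).
    """
    doc = json.loads(line)
    known = tuple(doc.get(col) for col in KNOWN_COLUMNS)
    leftover = {k: v for k, v in doc.items() if k not in KNOWN_COLUMNS}
    return known + (leftover or None,)

ndjson = [
    '{"x": 1}',
    '{"x": 2, "y": 3}',
    '{"x": 8, "z": 3}',
    '{"x": 5, "y": 2, "z": 3}',
]

rows = [split_record(line) for line in ndjson]
for row in rows:
    print(row)
```

Each printed tuple corresponds to one row of the table the question asks for.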
To automate the conversion when importing json data, define a trigger:
create table json_data(json jsonb);
-- assumes target_table(x int, y int, json jsonb) already exists
create or replace function json_data_trigger()
returns trigger language plpgsql as $$
begin
    insert into target_table
    select
        (new.json->>'x')::int,
        (new.json->>'y')::int,
        nullif(new.json - 'x' - 'y', '{}');
    return new;
end $$;
create trigger json_data_trigger
before insert on json_data
for each row execute procedure json_data_trigger();
Test it in Db<>Fiddle.

Extracting JSON returns null (Presto Athena)

I'm working with SQL Presto in Athena and in a table I have a column named "data.input.additional_risk_data.basket" that has a json like this:
[
{
"data.input.additional_risk_data.basket.val.brand":null,
"data.input.additional_risk_data.basket.val.category":null,
"data.input.additional_risk_data.basket.val.item_reference":"26484651",
"data.input.additional_risk_data.basket.val.name":"Nike Force 1",
"data.input.additional_risk_data.basket.val.product_name":null,
"data.input.additional_risk_data.basket.val.published_date":null,
"data.input.additional_risk_data.basket.val.quantity":"1",
"data.input.additional_risk_data.basket.val.size":null,
"data.input.additional_risk_data.basket.val.subCategory":null,
"data.input.additional_risk_data.basket.val.unit_price":769.0,
"data.input.additional_risk_data.basket.val.upc":null,
"data.input.additional_risk_data.basket.val.url":null
}
]
I need to extract some of the data there, for example data.input.additional_risk_data.basket.val.item_reference. I'm not used to working with jsons but I tried a few things:
json_extract("data.input.additional_risk_data.basket", '$.data.input.additional_risk_data.basket.val.item_reference')
json_extract_scalar("data.input.additional_risk_data.basket", '$.data.input.additional_risk_data.basket.val.item_reference')
They all returned null. What is the correct way to get the values from that JSON?
Thank you!
There are several problems with your data and your JSON path. The keys themselves contain dots, which a JSON path treats as separators (and I have not found a way to make Athena escape them), and your JSON is actually an array of JSON objects. What you can do is cast the data to an array of maps and process that. For example:
-- sample data
WITH dataset (json_val) AS (
VALUES (json '[
{
"data.input.additional_risk_data.basket.val.brand":null,
"data.input.additional_risk_data.basket.val.category":null,
"data.input.additional_risk_data.basket.val.item_reference":"26484651",
"data.input.additional_risk_data.basket.val.name":"Nike Force 1",
"data.input.additional_risk_data.basket.val.product_name":null,
"data.input.additional_risk_data.basket.val.published_date":null,
"data.input.additional_risk_data.basket.val.quantity":"1",
"data.input.additional_risk_data.basket.val.size":null,
"data.input.additional_risk_data.basket.val.subCategory":null,
"data.input.additional_risk_data.basket.val.unit_price":769.0,
"data.input.additional_risk_data.basket.val.upc":null,
"data.input.additional_risk_data.basket.val.url":null
}
]')
)
-- query
select arr[1]['data.input.additional_risk_data.basket.val.item_reference'] item_reference -- or use unnest if more than one array element is expected
from (
    select cast(json_val as array(map(varchar, json))) arr
    from dataset
)
Output:
item_reference
"26484651"
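The reason the path lookup returns null while the map lookup works can be shown with a small Python sketch. It uses a shortened copy of the sample JSON; naive_path_lookup is a hypothetical stand-in for a path evaluator that splits on dots, not Athena's actual implementation.

```python
import json

# Shortened copy of the sample data: an array holding one object
# whose keys contain dots.
raw = """[
  {
    "data.input.additional_risk_data.basket.val.item_reference": "26484651",
    "data.input.additional_risk_data.basket.val.name": "Nike Force 1",
    "data.input.additional_risk_data.basket.val.unit_price": 769.0
  }
]"""

KEY = "data.input.additional_risk_data.basket.val.item_reference"

def naive_path_lookup(value, dotted_path):
    """Hypothetical path evaluator: treats every dot as a nesting step."""
    for part in dotted_path.split("."):
        if not isinstance(value, dict) or part not in value:
            return None
        value = value[part]
    return value

basket = json.loads(raw)

# Fails twice over: the top level is a list, and the key is flat, not nested.
path_result = naive_path_lookup(basket, KEY)

# Treat the data as an array of maps and use the whole key as-is.
# Python lists are 0-based, whereas the SQL above uses arr[1].
direct_result = basket[0][KEY]

print(path_result, direct_result)
```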

ADF DataFlow Activity how to create dynamic derived column

I have a fixed-width text file as the source.
A sample of the file is below:
column_1
12ABC3455
13XYZ5678
How can I build a dynamic column pattern to produce derived columns?
Column name: empId -> substring(column_1,1,2)
In the derived column settings, I can hardcode empid and substring(column_1,1,2) in the expression,
but I need to make it dynamic, driven by a JSON input parameter, so that the derived columns come from a column pattern.
My JSON input parameter:
[
{
"colname": "empid",
"startpos": 1,
"length": 2
},
{
"colname": "empname",
"startpos": 3,
"length": 3
},
{
"colname": "empSal",
"startpos": 6,
"length": 4
}
]
Please help me build the column pattern from the JSON input.
I tested this many times and could not achieve it.
In my experience, I'm afraid it is not possible to do that in Data Factory activities or Data Flow with a JSON parameter.
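For reference, the transformation the question describes can be written down outside ADF. The Python sketch below is only meant to pin down the expected result for the sample rows and the JSON parameter; it does not use any Data Factory API, and derive_columns is a hypothetical helper name.

```python
import json

# The JSON parameter from the question: 1-based start position and length.
spec = json.loads("""[
  {"colname": "empid",   "startpos": 1, "length": 2},
  {"colname": "empname", "startpos": 3, "length": 3},
  {"colname": "empSal",  "startpos": 6, "length": 4}
]""")

def derive_columns(value, spec):
    """Slice one fixed-width value into named columns according to spec."""
    out = {}
    for field in spec:
        start = field["startpos"] - 1  # convert 1-based position to 0-based
        out[field["colname"]] = value[start:start + field["length"]]
    return out

records = [derive_columns(v, spec) for v in ["12ABC3455", "13XYZ5678"]]
for record in records:
    print(record)
```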

PostgreSQL: exclude complete jsonb array if one element fails the WHERE clause

Assume a table json_table with columns id (int), data (jsonb).
A sample jsonb value would be
{"a": [{"b":{"c": "xxx", "d": 1}},{"b":{"c": "xxx", "d": 2}}]}
When I use an SQL statement like the following:
SELECT data FROM json_table j, jsonb_array_elements(j.data#>'{a}') dt WHERE (dt#>>'{b,d}')::integer NOT IN (2,4,6,9) GROUP BY id;
... the two array elements are unnested and the one that satisfies the WHERE clause is still returned. This makes sense, since each array element is considered individually. In this example I get back the complete row
{"a": [{"b":{"c": "xxx", "d": 1}},{"b":{"c": "xxx", "d": 2}}]}
I'm looking for a way to exclude the complete json_table row when any jsonb array element fails the condition.
You can move the condition to the WHERE clause and use NOT EXISTS:
SELECT data
FROM json_table j
WHERE NOT EXISTS (SELECT 1
FROM jsonb_array_elements(j.data#>'{a}') dt
WHERE (dt#>>'{b,d}')::integer IN (2, 4, 6, 9)
);
On PostgreSQL 12 or later, you can also achieve it with a jsonpath expression (@ refers to the current item):
select data
from json_table
where jsonb_path_match(data, '!exists($.a[*].b.d ? (@ == 2 || @ == 4 || @ == 6 || @ == 9))')
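Both answers implement the same rule: keep a row only if none of its array elements has a forbidden value. A minimal Python sketch of that rule follows; the second sample row is a hypothetical addition so that one row survives the filter.

```python
import json

EXCLUDED = {2, 4, 6, 9}

# (id, data) pairs standing in for json_table.
table = [
    (1, '{"a": [{"b": {"c": "xxx", "d": 1}}, {"b": {"c": "xxx", "d": 2}}]}'),
    # Hypothetical second row with no excluded value.
    (2, '{"a": [{"b": {"c": "xxx", "d": 1}}, {"b": {"c": "xxx", "d": 3}}]}'),
]

def keep_row(data):
    """True when no element of data["a"] has b.d in EXCLUDED (NOT EXISTS)."""
    return not any(element["b"]["d"] in EXCLUDED for element in data["a"])

kept_ids = [row_id for row_id, raw in table if keep_row(json.loads(raw))]
print(kept_ids)
```

Row 1 is dropped as a whole because one of its elements has d = 2, which is the behavior the question asks for.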

Using Postgres JSON Functions on table columns

I have searched extensively (in Postgres docs and on Google and SO) to find examples of JSON functions being used on actual JSON columns in a table.
Here's my problem: I am trying to extract key values from an array of JSON objects in a column, using jsonb_to_recordset(), but get syntax errors. When I pass the object literally to the function, it works fine:
Passing JSON literally:
select *
from jsonb_to_recordset('[
{ "id": 0, "name": "400MB-PDF.pdf", "extension": ".pdf",
"transferId": "ap31fcoqcajjuqml6rng"},
{ "id": 0, "name": "1000MB-PDF.pdf", "extension": ".pdf",
"transferId": "ap31fcoqcajjuqml6rng"}
]') as f(name text);
results in:
400MB-PDF.pdf
1000MB-PDF.pdf
It extracts the value of the key "name".
Here's the JSON in the column, being extracted using:
select journal.data::jsonb#>>'{context,data,files}'
from journal
where id = 'ap32bbofopvo7pjgo07g';
resulting in:
[ { "id": 0, "name": "400MB-PDF.pdf", "extension": ".pdf",
"transferId": "ap31fcoqcajjuqml6rng"},
{ "id": 0, "name": "1000MB-PDF.pdf", "extension": ".pdf",
"transferId": "ap31fcoqcajjuqml6rng"}
]
But when I try to pass jsonb#>>'{context,data,files}' to jsonb_to_recordset() like this:
select id,
journal.data::jsonb#>>::jsonb_to_recordset('{context,data,files}') as f(name text)
from journal
where id = 'ap32bbofopvo7pjgo07g';
I get a syntax error. I have tried different ways, but each time it complains about a syntax error.
Version:
PostgreSQL 9.4.10 on x86_64-unknown-linux-gnu, compiled by gcc (Ubuntu 4.8.2-19ubuntu1) 4.8.2, 64-bit
The expressions after select must evaluate to a single value. Since jsonb_to_recordset returns a set of rows and columns, you can't use it there.
The solution is a cross join lateral, which allows you to expand one row into multiple rows using a function. That gives you single rows that select can act on. For example:
select *
from journal j
cross join lateral
jsonb_to_recordset(j.data#>'{context, data, files}') as d(id int, name text)
where j.id = 'ap32bbofopvo7pjgo07g'
Note that the #>> operator returns type text, and the #> operator returns type jsonb. As jsonb_to_recordset expects jsonb as its first parameter I'm using #>.
See it working at rextester.com
jsonb_to_recordset is a set-valued function and can only be invoked in specific places. The FROM clause is one such place, which is why your first example works, but the SELECT clause is not.
In order to turn your JSON array into a "table" that you can query, you need to use a lateral join. The effect is rather like a foreach loop on the source recordset, and that's where you apply the jsonb_to_recordset function. Here's a sample dataset:
create table jstuff (id int, val jsonb);
insert into jstuff
values
(1, '[{"outer": {"inner": "a"}}, {"outer": {"inner": "b"}}]'),
(2, '[{"outer": {"inner": "c"}}]');
A simple lateral join query:
select id, r.*
from jstuff
join lateral jsonb_to_recordset(val) as r("outer" jsonb) on true;
id | outer
----+----------------
1 | {"inner": "a"}
1 | {"inner": "b"}
2 | {"inner": "c"}
(3 rows)
That's the hard part. Note that you have to define what your new recordset looks like in the AS clause -- since each element in our val array is a JSON object with a single field named "outer", that's what we give it. If your array elements contain multiple fields you're interested in, you declare those in a similar manner. Be aware also that your JSON schema needs to be consistent: if an array element doesn't contain a key named "outer", the resulting value will be null.
From here, you just need to pull the specific value you need out of each JSON object using the traversal operator as you were. If I wanted only the "inner" value from the sample dataset, I would specify select id, r.outer->>'inner'. Since it's already JSONB, it doesn't require casting.
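The "foreach loop" reading of the lateral join can be sketched in Python with the same sample data. This is only an analogy for how the rows are expanded, not how PostgreSQL executes the query; lateral_recordset is a hypothetical name.

```python
import json

# Same sample data as the jstuff table above.
jstuff = [
    (1, '[{"outer": {"inner": "a"}}, {"outer": {"inner": "b"}}]'),
    (2, '[{"outer": {"inner": "c"}}]'),
]

def lateral_recordset(rows, columns):
    """For each source row, yield one output row per array element.

    Only the declared columns are kept; a missing key becomes None,
    much like a missing key yields null in jsonb_to_recordset.
    """
    for row_id, raw in rows:
        for element in json.loads(raw):
            yield (row_id,) + tuple(element.get(col) for col in columns)

expanded = list(lateral_recordset(jstuff, ["outer"]))

# Equivalent of: select id, r.outer->>'inner'
inner_values = [(row_id, outer["inner"]) for row_id, outer in expanded]

print(expanded)
print(inner_values)
```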