Json Array element extract from json file bigquery - google-bigquery

I have JSON file like below and facing challenge to extract value in bigquery
{
'c_fields': [
{
'id': 34605,
'value': None,
}
]
}
output
id value
24605 null
how i will extract value of id and value

If your JSON is in a table column, you could use the JSON FUNCTIONS to accomplish that, here are the docs.
It could be something like:
WITH test_table AS (
SELECT '{"c_fields":[{"id":34605,"value":"None"},{"id":34606,"value":"32"}]}' AS json_field
)
SELECT JSON_EXTRACT(json_value, '$.id') AS id, JSON_EXTRACT(json_value, '$.value') AS value
FROM test_table, UNNEST(JSON_EXTRACT_ARRAY(json_field, '$.c_fields')) AS json_value

Related

How do I Unnest varchar to json in Athena

I am crawling data from Google Big Query and staging them into Athena.
One of the columns crawled as string, contains json :
{
"key": "Category",
"value": {
"string_value": "something"
}
I need to unnest these and flatten them to be able to use them in a query. I require key and string value (so in my query it will be where Category = something
I have tried the following :
WITH dataset AS (
SELECT cast(json_column as json) as json_column
from "thedatabase"
LIMIT 10
)
SELECT
json_extract_scalar(json_column, '$.value.string_value') AS string_value
FROM dataset
which is returning null.
Casting the json_column as json adds \ into them :
"[{\"key\":\"something\",\"value\":{\"string_value\":\"app\"}}
If I use replace on the json, it doesn't allow me as it's not a varchar object.
So how do I extract the values from the some_column field?
Presto's json_extract_scalar actually supports extracting just from the varchar (string) value :
-- sample data
WITH dataset(json_column) AS (
values ('{
"key": "Category",
"value": {
"string_value": "something"
}}')
)
--query
SELECT
json_extract_scalar(json_column, '$.value.string_value') AS string_value
FROM dataset;
Output:
string_value
something
Casting to json will encode data as json (in case of string you will get a double encoded one), not parse it, use json_parse (in this particular case it is not needed, but there are cases when you will want to use it):
-- query
SELECT
json_extract_scalar(json_parse(json_column), '$.value.string_value') AS string_value
FROM dataset;

Json Arrays of objects PostgreSQL Table format

I have a JSON file (array of objects) which I have to convert into a table format using a PostgreSQL query.
Follow Sample Data.
"b", "c", "d", "e" are to be extracted as separate tables as they are arrays and in these arrays, there are objects
I have tried using json_populate_recordset() but it only works if I have a single array.
[{a:"1",b:"2"},{a:"10",b:"20"}]
I have referred to some links and codes.
jsonb_array_element example
postgreSQL functions
Expected Output
Sample Data:
{
"b":[
{columnB1:value, columnB2:value},
{columnB1:value, columnB2:value},
],
"c":[
{columnC1:value, columnC2:value, columnC3:value},
{columnC1:value, columnC2:value, columnC3:value},
{columnC1:value, columnC2:value, columnC3:value}
],
"d":[
{columnD1:value, columnD2:value},
{columnD1:value, columnD2:value},
],
"e":[
{columnE1:value, columnE2:value},
]
}
expected output
b should be one table in which columnA1 and columnA2 are displayed with their values.
Similarly table c, d, e with their respective columns and values.
Expected Output
You can use jsonb_to_recordset() but you need to unnest your JSON. You need to do this inline as this is a JSON Processing Function which cannot used derived values.
I am using validated JSON as simplified and formatted at end of this answer
To unnest your JSON use below notation which extracts JSON object field with the given key.
--one level
select '{"a":1}'::json->'a'
result : 1
--two levels
select '{"a":{"b":[2]}}'::json->'a'->'b'
result : [2]
We now expand this to include json_to_recordset()
select * from
json_to_recordset(
'{"a":{"b":[{"f1":2,"f2":4},{"f1":3,"f2":6}]}}'::json->'a'->'b' --inner table b
)
as x("f1" int, "f2" int); --fields from table b
or using json_array_elements. Either way we need to list our fields. With second solution type will be json not int so you cant sum etc
with b as (select json_array_elements('{"a":{"b":[{"f1":2,"f2":4},{"f1":3,"f2":6}]}}'::json->'a'->'b') as jx)
select jx->'f1' as f1, jx->'f2' as f2 from b;
Output
f1 f2
2 4
3 6
We now use your data structure in jsonb_to_recordset()
select * from jsonb_to_recordset( '{"a":{"b":[{"columnname1b":"value1b","columnname2b":"value2b"},{"columnname1b":"value","columnname2b":"value"}],"c":[{"columnname1":"value","columnname2":"value"},{"columnname1":"value","columnname2":"value"},{"columnname1":"value","columnname2":"value"}]}}'::jsonb->'a'->'b') as x(columnname1b text, columnname2b text);
Output:
columnname1b columnname2b
value1b value2b
value value
For table c
select * from jsonb_to_recordset( '{"a":{"b":[{"columnname1b":"value1b","columnname2b":"value2b"},{"columnname1b":"value","columnname2b":"value"}],"c":[{"columnname1":"value","columnname2":"value"},{"columnname1":"value","columnname2":"value"},{"columnname1":"value","columnname2":"value"}]}}'::jsonb->'a'->'c') as x(columnname1 text, columnname2 text);
Output
columnname1 columnname2
value value
value value
value value
Sample JSON
{
"a": {
"b": [
{
"columnname1b": "value1b",
"columnname2b": "value2b"
},
{
"columnname1b": "value",
"columnname2b": "value"
}
],
"c": [
{
"columnname1": "value",
"columnname2": "value"
},
{
"columnname1": "value",
"columnname2": "value"
},
{
"columnname1": "value",
"columnname2": "value"
}
]
}
}
Well, I came up with some ideas, here is one that worked. I was able to get one table at a time.
https://www.postgresql.org/docs/9.5/functions-json.html
I am using json_populate_recordset.
The column used in the first select statement comes from a table whose column is a JSON type which we are trying to extract into a table.
The 'tablename from column' in the json_populate_recordset function, is the table we are trying to extract followed with b its columns and datatypes.
WITH input AS(
SELECT cast(column as json) as a
FROM tablename
)
SELECT b.*
FROM input c,
json_populate_recordset(NULL::record,c.a->'tablename from column') as b(columnname1 datatype, columnname2 datatype)

Query for retrieve matching json Objects as a list

Assume i have a table called MyTable and this table have a JSON type column called myjson and this column have next value as a json array hold multiple objects, for example like next:
[
{
"budgetType": "CF",
"financeNumber": 1236547,
"budget": 1000000
},
{
"budgetType": "ENVELOPE",
"financeNumber": 1236888,
"budget": 2000000
}
]
So how i can search if the record has any JSON objects inside its JSON array with financeNumber=1236547
Something like this:
SELECT
t.*
FROM
"MyTable",
LATERAL json_to_recordset(myjson) AS t ("budgetType" varchar,
"financeNumber" int,
budget varchar)
WHERE
"financeNumber" = 1236547;
Obviously not tested on your data, but it should provide a starting point.
with a as(
SELECT json_array_elements(myjson)->'financeNumber' as col FROM mytable)
select exists(select from a where col::text = '1236547'::text );
https://www.postgresql.org/docs/current/functions-json.html
json_array_elements return setof json, so you need cast.
Check if a row exists: Fastest check if row exists in PostgreSQL

How to select single field from array of json objects?

I have a JSONB column with values in following JSON structure
{
"a": "value1", "b": [{"b1": "value2", "b3": "value4"}, {"b1": "value5", "b3": "value6"}]
}
I need to select only b1 field in the result. So expected result would be
["value2", "value5"]
I can select complete array using query
select columnname->>'b' from tablename
step-by-step demo:db<>fiddle
SELECT
jsonb_agg(elements -> 'b1') -- 2
FROM mytable,
jsonb_array_elements(mydata -> 'b') as elements -- 1
a) get the JSON array from the b element (b) extract the array elements into one row each
a) get the b1 values from the array elements (b) reaggregate these values into a new JSON array
If you are using Postgres 12 or later, you an use a JSON path query:
select jsonb_path_query_array(the_column, '$.b[*].b1')
from the_table;

SELECT JSON_VALUE From table returns null instead of value

JSON stored in a column 'DataJson' in table
[{
"KickOffDate": "1-Jan-2019",
"TeamSize": "11",
"ClientEngineer": "Sagar",
"WaitingPeriod": "16.5"
}]
Query
SELECT JSON_VALUE(DataJson,'$.KickOffDate') AS KickOffDate
, JSON_VALUE(DataJson,'$.ClientEngineer') AS ClientEngineer
FROM [ABC].[Deliver]
Result
KickOffDate ClientEngineer
NULL NULL
Result should be:
KickOffDate ClientEngineer
1-Jan-2019 Sagar
Your sql query is wrong.
You have to correct query like below.
SELECT JSON_VALUE(DataJson,'$[0].KickOffDate') AS KickOffDate ,JSON_VALUE(DataJson,'$[0].ClientEngineer') AS ClientEngineer FROM [ABC].[Deliver]
The data stored in table is not JSON Object, it's JSON Array.
So in order to get each value of JSON Object, need to set index of JSON Object in JSON Array.
Otherwise, you can store data as JSON Object, and then your query can be work normally.
Your JSON appears to be malformed, at least from the point of view of SQL Server's JSON API. From what I have read, if your JSON data consists of a top level JSON array, then the array needs to have a key name, and also the entire contents should be wrapped in { ... }.
The following setup has been tested and works:
WITH yourTable AS (
SELECT '{ "data" : [{"KickOffDate": "1-Jan-2019", "TeamSize": "11", "ClientEngineer": "Sagar", "WaitingPeriod": "16.5"}] }' AS DataJson
)
SELECT
JSON_VALUE(DataJson, '$.data[0].KickOffDate') AS KickOffDate,
JSON_VALUE(DataJson, '$.data[0].ClientEngineer') AS ClientEngineer
FROM yourTable;
Demo
Here is what the input JSON I used looks like:
{
"data" : [
{
"KickOffDate": "1-Jan-2019",
"TeamSize": "11",
"ClientEngineer": "Sagar",
"WaitingPeriod": "16.5"
}
]
}