How to read JSON key values as a data column in Snowflake?

I have the below sample JSON:
{
  "Id1": {
    "name": "Item1.jpg",
    "Status": "Approved"
  },
  "Id2": {
    "name": "Item2.jpg",
    "Status": "Approved"
  }
}
and I am trying to get the following output:
_key  name       Status
Id1   Item1.jpg  Approved
Id2   Item2.jpg  Approved
Is there any way I can achieve this in Snowflake using SQL?

You should use Snowflake's VARIANT data type in any column holding JSON data. Let's break this down step by step:
create temporary table FOO(v variant); -- Temp table to hold the JSON. Often you'll see a variant column simply called "V"
-- Insert into the variant column. Parse the JSON because variants don't hold string types. They hold semi-structured types.
insert into FOO select parse_json('{"Id1": {"name": "Item1.jpg", "Status": "Approved"}, "Id2": {"name": "Item2.jpg", "Status": "Approved"}}');
-- See how it looks in its raw state
select * from FOO;
-- Flatten the top-level JSON. The flatten function breaks down the JSON into several usable columns
select * from foo, lateral flatten(input => (foo.v)) ;
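-- FLATTEN returns the columns SEQ, KEY, PATH, INDEX, VALUE, and THIS; KEY and VALUE are the ones used below.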
-- Now traverse the JSON using the column name and : to get to the property you want. Cast to string using ::string.
-- If you must have exact case on your column names, you need to double quote them.
select KEY as "_key",
       VALUE:name::string as "name",
       VALUE:Status::string as "Status"
from FOO, lateral flatten(input => (FOO.V));
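Against the sample data, that last query should return exactly the shape asked for:
_key  name       Status
Id1   Item1.jpg  Approved
Id2   Item2.jpg  Approved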

Related

How do I unnest varchar to JSON in Athena

I am crawling data from Google BigQuery and staging it into Athena.
One of the columns, crawled as a string, contains JSON:
{
  "key": "Category",
  "value": {
    "string_value": "something"
  }
}
I need to unnest and flatten these to be able to use them in a query. I require the key and the string value (so in my query it will be where Category = something).
I have tried the following:
WITH dataset AS (
  SELECT cast(json_column as json) as json_column
  FROM "thedatabase"
  LIMIT 10
)
SELECT
  json_extract_scalar(json_column, '$.value.string_value') AS string_value
FROM dataset
which is returning null.
Casting json_column to json adds backslashes into the values:
"[{\"key\":\"something\",\"value\":{\"string_value\":\"app\"}}
If I try to use replace on the JSON, it is not allowed because the value is no longer a varchar.
So how do I extract the values from the some_column field?
Presto's json_extract_scalar actually supports extracting directly from a varchar (string) value:
-- sample data
WITH dataset(json_column) AS (
  values ('{
    "key": "Category",
    "value": {
      "string_value": "something"
    }}')
)
--query
SELECT
json_extract_scalar(json_column, '$.value.string_value') AS string_value
FROM dataset;
Output:
string_value
something
Casting to json encodes the data as JSON (for a string you will get a double-encoded one); it does not parse it. To parse, use json_parse (in this particular case it is not needed, but there are cases where you will want to use it):
-- query
SELECT
json_extract_scalar(json_parse(json_column), '$.value.string_value') AS string_value
FROM dataset;
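To see the difference, compare a cast with a parse on a small literal (a minimal sketch; expected results shown as comments):
-- cast re-encodes the varchar as a JSON string value (note the escaped quotes)
SELECT cast('{"a":1}' as json);  -- "{\"a\":1}"
-- json_parse parses the text into an actual JSON object
SELECT json_parse('{"a":1}');    -- {"a":1}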

JSON arrays of objects to PostgreSQL table format

I have a JSON file (an array of objects) which I have to convert into table format using a PostgreSQL query.
See the sample data below.
"b", "c", "d", and "e" are to be extracted as separate tables, as each is an array containing objects.
I have tried using json_populate_recordset() but it only works if I have a single array.
[{a:"1",b:"2"},{a:"10",b:"20"}]
I have referred to some related links and code samples: a jsonb_array_elements example and the PostgreSQL functions docs.
Sample Data:
{
  "b": [
    {columnB1: value, columnB2: value},
    {columnB1: value, columnB2: value}
  ],
  "c": [
    {columnC1: value, columnC2: value, columnC3: value},
    {columnC1: value, columnC2: value, columnC3: value},
    {columnC1: value, columnC2: value, columnC3: value}
  ],
  "d": [
    {columnD1: value, columnD2: value},
    {columnD1: value, columnD2: value}
  ],
  "e": [
    {columnE1: value, columnE2: value}
  ]
}
Expected output: b should be one table in which columnB1 and columnB2 are displayed with their values. Similarly, tables c, d, and e with their respective columns and values.
You can use jsonb_to_recordset(), but you need to unnest your JSON. You need to do this inline, as this is a JSON processing function which cannot use derived values.
I am using the validated JSON, simplified and formatted, at the end of this answer.
To unnest your JSON, use the notation below, which extracts the JSON object field with the given key.
--one level
select '{"a":1}'::json->'a'
result: 1
--two levels
select '{"a":{"b":[2]}}'::json->'a'->'b'
result: [2]
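A related operator worth knowing (an aside, not part of the original answer): ->> does the same traversal but returns text instead of json, so scalar results can be cast to a concrete type:
--scalar as text
select '{"a":{"b":2}}'::json->'a'->>'b'
result: 2 (returned as text, so it can be cast, e.g. with ::int)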
We now expand this to include json_to_recordset()
select * from
json_to_recordset(
'{"a":{"b":[{"f1":2,"f2":4},{"f1":3,"f2":6}]}}'::json->'a'->'b' --inner table b
)
as x("f1" int, "f2" int); --fields from table b
Or use json_array_elements. Either way we need to list our fields. With the second solution the type will be json, not int, so you can't sum, etc.
with b as (select json_array_elements('{"a":{"b":[{"f1":2,"f2":4},{"f1":3,"f2":6}]}}'::json->'a'->'b') as jx)
select jx->'f1' as f1, jx->'f2' as f2 from b;
Output
f1 f2
2 4
3 6
We now use your data structure in jsonb_to_recordset()
select *
from jsonb_to_recordset(
  '{"a":{"b":[{"columnname1b":"value1b","columnname2b":"value2b"},{"columnname1b":"value","columnname2b":"value"}],"c":[{"columnname1":"value","columnname2":"value"},{"columnname1":"value","columnname2":"value"},{"columnname1":"value","columnname2":"value"}]}}'::jsonb->'a'->'b'
) as x(columnname1b text, columnname2b text);
Output:
columnname1b columnname2b
value1b value2b
value value
For table c
select *
from jsonb_to_recordset(
  '{"a":{"b":[{"columnname1b":"value1b","columnname2b":"value2b"},{"columnname1b":"value","columnname2b":"value"}],"c":[{"columnname1":"value","columnname2":"value"},{"columnname1":"value","columnname2":"value"},{"columnname1":"value","columnname2":"value"}]}}'::jsonb->'a'->'c'
) as x(columnname1 text, columnname2 text);
Output
columnname1 columnname2
value value
value value
value value
Sample JSON:
{
  "a": {
    "b": [
      {
        "columnname1b": "value1b",
        "columnname2b": "value2b"
      },
      {
        "columnname1b": "value",
        "columnname2b": "value"
      }
    ],
    "c": [
      {
        "columnname1": "value",
        "columnname2": "value"
      },
      {
        "columnname1": "value",
        "columnname2": "value"
      },
      {
        "columnname1": "value",
        "columnname2": "value"
      }
    ]
  }
}
Well, I came up with some ideas, and here is one that worked. I was able to get one table at a time.
https://www.postgresql.org/docs/9.5/functions-json.html
I am using json_populate_recordset.
The column used in the first SELECT statement comes from a table whose column is of JSON type; that is what we are trying to extract into a table.
The 'tablename from column' in the json_populate_recordset call is the key of the array we are trying to extract, followed by b, which lists its columns and data types.
WITH input AS (
  SELECT cast(column as json) as a
  FROM tablename
)
SELECT b.*
FROM input c,
     json_populate_recordset(NULL::record, c.a->'tablename from column') as b(columnname1 datatype, columnname2 datatype)
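As a concrete illustration of the same pattern (a sketch with made-up values; note the quoted column names, which must match the JSON keys case-sensitively):
WITH input AS (
  SELECT '{"b":[{"columnB1":"v1","columnB2":"v2"},{"columnB1":"v3","columnB2":"v4"}]}'::json AS a
)
SELECT b.*
FROM input c,
     json_populate_recordset(NULL::record, c.a->'b') AS b("columnB1" text, "columnB2" text);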

PostgreSQL JSON - string_agg on JSON data with multiple objects

I have created a table with a JSON column in PostgreSQL.
I have also inserted data with multiple objects in the JSON,
like this:
[
  {
    "Type": "1",
    "Amount": "1000",
    "Occurrence": 2,
    "StartDate": "1990-01-19",
    "EndDate": "1999-04-03"
  },
  {
    "Type": "2",
    "Amount": "2000",
    "Occurrence": 2,
    "StartDate": "1984-11-19",
    "EndDate": "1997-09-29"
  }
]
Now I have to retrieve my data in the format below, in a single row, like the output of the string_agg() function:
Type  Amount
1-2   1000-2000
I have also checked the built-in JSON functions in PostgreSQL (https://www.postgresqltutorial.com/postgresql-json/) but did not find any solution for this.
You will have to unnest the array and aggregate the individual keys:
The following assumes you have some kind of primary key column on the table (in addition to your JSON column):
select t.id,
       string_agg(x.element ->> 'Type', '-') as types,
       string_agg(x.element ->> 'Amount', '-') as amounts
from the_table t
cross join jsonb_array_elements(t.data) as x(element)
group by t.id;
jsonb_array_elements() extracts each array element and the main query then aggregates that back per ID using string_agg().
The above assumes your column is defined with the type jsonb (which it should be). If it is not, you need to use json_array_elements() instead.
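Against the sample array above (assuming it lives in a single row whose id is 1), the query should return:
id  types  amounts
1   1-2    1000-2000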

Select from a JSON array in a PostgreSQL JSON column

I have the following JSON stored in a PostgreSQL JSON column
{
  "status": "Success",
  "message": "",
  "data": {
    "serverIp": "XXXX",
    "ruleId": 32321,
    "results": [
      {
        "versionId": 555555,
        "PriceID": "8abf35ec-3e0e-466b-a4e5-2af568e90eec",
        "price": 550,
        "Convert": 0.8922953080331764,
        "Cost": 10
      }
    ]
  }
}
I would like to search for a specific PriceID across the entire JSON column (named info) and select the entire results element containing that PriceID.
How do I do that in PostgreSQL JSON?
One option uses exists and json(b)_array_elements(). Assuming that your table is called mytable and that the jsonb column is mycol, this would look like:
select t.*
from mytable t
where exists (
  select 1
  from jsonb_array_elements(t.mycol -> 'data' -> 'results') x(elt)
  where x.elt ->> 'PriceID' = '8abf35ec-3e0e-466b-a4e5-2af568e90eec'
)
In the subquery, jsonb_array_elements() unnests the JSON array located at the given path. Then the where clause ensures that at least one element in the array has the given PriceID.
If your data is of json datatype rather than jsonb, you need to use json_array_elements() instead of jsonb_array_elements().
If you want to display some information coming from the matching array element, then it is different. You can use a lateral join instead of exists. Keep in mind, though, that this will duplicate the rows if more than one array element matches:
select t.*, x.elt ->> 'price' price
from mytable t
cross join lateral jsonb_array_elements(t.mycol -> 'data' -> 'results') x(elt)
where x.elt ->> 'PriceID' = '8abf35ec-3e0e-466b-a4e5-2af568e90eec'
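If you want the entire matching results element rather than individual fields, select the element itself (matching_result is just an illustrative alias):
select t.*, x.elt as matching_result
from mytable t
cross join lateral jsonb_array_elements(t.mycol -> 'data' -> 'results') x(elt)
where x.elt ->> 'PriceID' = '8abf35ec-3e0e-466b-a4e5-2af568e90eec';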

Query data inside an attribute array in a json column in Postgres 9.6

I have a table, say types, which has a JSON column, say location, that looks like this:
{
  "attribute": [
    {
      "type": "state",
      "value": "CA"
    },
    {
      "type": "distance",
      "value": "200.00"
    } ...
  ]
}
Each row in the table has the data, and all have the "type": "state" in it. I want to just extract the value of "type": "state" from every row in the table, and put it in a new column. I checked out several questions on SO, like:
Query for element of array in JSON column
Index for finding an element in a JSON array
Query for array elements inside JSON type
but could not get it working. I do not need to query on this. I need the value of this column. I apologize in advance if I missed something.
create table t(data json);
insert into t values('{"attribute":[{"type": "state","value": "CA"},{"type": "distance","value": "200.00"}]}'::json);
select elem->>'value' as state
from t, json_array_elements(t.data->'attribute') elem
where elem->>'type' = 'state';
Output:
state
CA
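If you actually want this value in a new column, as the question asks, a correlated update along these lines should work (a sketch, assuming a new text column named state):
alter table t add column state text;
update t
set state = (
  select elem->>'value'
  from json_array_elements(t.data->'attribute') elem
  where elem->>'type' = 'state'
  limit 1
);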
I mainly use Redshift, where there is a built-in function to do this, so on the off-chance you're there, check the Redshift docs.
It looks like Postgres has a similar function set:
https://www.postgresql.org/docs/current/static/functions-json.html
I think you'll need to chain three functions together to make this work.
SELECT
your_field::json->'attribute'->0->'value'
FROM
your_table
What I'm doing is a JSON extract by key name, followed by a JSON array extract by index (always the first, if your example is consistent with the full data), followed finally by another extract by key name.
Edit: got it working for your example
SELECT
  '{ "attribute": [
      {
        "type": "state",
        "value": "CA"
      },
      {
        "type": "distance",
        "value": "200.00"
      }
    ]
  }'::json->'attribute'->0->'value'
Returns "CA"
2nd edit: nested querying
@McNets's is the right, better answer. But in this dive, I discovered you can nest queries in Postgres! How frickin' cool!
I stored the json as a text field in a dummy table and successfully ran this:
SELECT
  (SELECT value
   FROM json_to_recordset(my_column::json->'attribute') as x(type text, value text)
   WHERE type = 'state')
FROM dummy_table
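With the sample JSON stored in my_column, this should return CA for each row of dummy_table.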