Issues creating Json object on snowflake using Sql - sql

I am new to Snowflake and I am trying to create a new table from an existing table and converting some rows into json format.
what I have on snowflake (table name: lex)
area
source
type
date
One
modak
good
2021
what I want to achieve on snowflake
area
sources
One
[{"source": "modak","period": {"type": "good","date": "2021"} }]
Any direction on how to go about it using SQl will be appreciated.

You can use object_construct to produce JSON objects:
select
area, array_construct( object_construct( 'source', source, 'period', object_construct( 'type',type, 'date',date )) ) sources
from
values ('One','modak','good','2021') tmp (area, source,type,date);
+------+-------------------------------------------------------------------------+
| AREA | SOURCES |
+------+-------------------------------------------------------------------------+
| One | [ { "period": { "date": "2021", "type": "good" }, "source": "modak" } ] |
+------+-------------------------------------------------------------------------+
I also used array_construct to add the brackets ([])
OBJECT_CONSTRUCT:
https://docs.snowflake.com/en/sql-reference/functions/object_construct.html
ARRAY_CONSTRUCT:
https://docs.snowflake.com/en/sql-reference/functions/array_construct.html
PS: Order of the JSON elements are not important when accessing them:
select parse_json(j):source, parse_json(j):period.type
from
values
('{ "period": { "date": "2021", "type": "good" }, "source": "modak" }'),
('{"source": "modak","period": {"type": "good","date": "2021"} }') tmp(j);
+----------------------+---------------------------+
| PARSE_JSON(J):SOURCE | PARSE_JSON(J):PERIOD.TYPE |
+----------------------+---------------------------+
| "modak" | "good" |
| "modak" | "good" |
+----------------------+---------------------------+

So the complete answer is:
WITH lex AS (
SELECT * FROM VALUES
('One', 'modak', 'good', 2021)
v(area, source, type, date)
)
SELECT l.area,
ARRAY_AGG(OBJECT_CONSTRUCT('source', l.source, 'period', OBJECT_CONSTRUCT('type', l.type, 'date', l.date))) as sources
FROM lex AS l
GROUP BY 1
ORDER BY 1;
which gives:
AREA
SOURCES
One
[ { "period": { "date": 2021, "type": "good" }, "source": "modak" } ]
Gokhan's answer show how to build a static array with ARRAY_CONSTRUCT, but if you have a dynamic input like:
WITH lex AS (
SELECT * FROM VALUES
('One', 'modak', 'good', 2021),
('One', 'modak', 'bad', 2022),
('Two', 'modak', 'good', 2021)
v(area, source, type, date)
)
And you want, the below, you will need to use ARRAY_AGG
AREA
SOURCES
One
[ { "period": { "date": 2021, "type": "good" }, "source": "modak" }, { "period": { "date": 2022, "type": "bad" }, "source": "modak" } ]
Two
[ { "period": { "date": 2021, "type": "good" }, "source": "modak" } ]

Related

select node value from json column type

A table I called raw_data with three columns: ID, timestamp, payload, the column paylod is a json type having values such as:
{
"data": {
"author_id": "1461871206425108480",
"created_at": "2022-08-17T23:19:14.000Z",
"geo": {
"coordinates": {
"type": "Point",
"coordinates": [
-0.1094,
51.5141
]
},
"place_id": "3eb2c704fe8a50cb"
},
"id": "1560043605762392066",
"text": " ALWAYS # London, United Kingdom"
},
"matching_rules": [
{
"id": "1560042248007458817",
"tag": "london-paris"
}
]
}
From this I want to select rows where the coordinates is available, such as [-0.1094,51.5141]in this case.
SELECT *
FROM raw_data, json_each(payload)
WHERE json_extract(json_each.value, '$.data.geo.') IS NOT NULL
LIMIT 20;
Nothing was returned.
EDIT
NOT ALL json objects have the coordinates node. For example this value:
{
"data": {
"author_id": "1556031969062010881",
"created_at": "2022-08-18T01:42:21.000Z",
"geo": {
"place_id": "006c6743642cb09c"
},
"id": "1560079621017796609",
"text": "Dear Desperate sister say husband no dey oo."
},
"matching_rules": [
{
"id": "1560077018183630848",
"tag": "kaduna-kano-katsina-dutse-zaria"
}
]
}
The correct path is '$.data.geo.coordinates.coordinates' and there is no need for json_each():
SELECT *
FROM raw_data
WHERE json_extract(payload, '$.data.geo.coordinates.coordinates') IS NOT NULL;
See the demo.

How to parse this JSON file in Snowflake?

So I have a column in a Snowflake table that stores JSON data but the column is of a varchar data type.
The JSON looks like this:
{
"FLAGS": [],
"BANNERS": {},
"TOOLS": {
"game.appConfig": {
"type": [
"small",
"normal",
"huge"
],
"flow": [
"control",
"noncontrol"
]
}
},
"PLATFORM": {}
}
I want to filter only the data inside TOOLS and want to get the following result:
TOOLS_ID
TOOLS
game.appConfig
type
game.appConfig
flow
How can I achieve this?
I assumed that the TOOLs can have more than one tool ID, so I wrote this query:
with mydata as ( select
'{
"FLAGS": [],
"BANNERS": {},
"TOOLS": {
"game.appConfig": {
"type": [
"small",
"normal",
"huge"
],
"flow": [
"control",
"noncontrol"
]
}
},
"PLATFORM": {}
}' as v1 )
select main.KEY TOOLS_ID, sub.KEY TOOLS
from mydata,
lateral flatten ( parse_json(v1):"TOOLS" ) main,
lateral flatten ( main.VALUE ) sub;
+----------------+-------+
| TOOLS_ID | TOOLS |
+----------------+-------+
| game.appConfig | flow |
| game.appConfig | type |
+----------------+-------+
Assuming the column name is C1 and table name T1:
select a.t:"TOOLS":"game.appConfig"::string from (select
parse_json(to_variant(C1))t from T1) a

Select only the jsonb object keys which are filtered and ordered by values in child object

in postgres 13, I have a Jsonb object and I am able to get only the keys using jsonb_object_keys like this.
SELECT keys from jsonb_object_keys('{
"135": {
"timestamp": 1659010504308,
"priority": 5,
"age": 20
},
"136": {
"timestamp": 1659010504310,
"priority": 2,
"age": 20
},
"137": {
"timestamp": 1659010504312,
"priority": 2,
"age": 20
},
"138": {
"timestamp": 1659010504319,
"priority": 1,
"age": 20
}}') as keys
Now, I want to get the keys which have priority more than 1 and which are ordered by priority and timestamp
I am able to achieve this using this query
select key from (
SELECT data->>'key' key, data->'value' value
FROM
jsonb_path_query(
'{
"135": {
"name": "test1",
"timestamp": 1659010504308,
"priority": 5,
"age": 20
},
"136": {
"name": "test2",
"timestamp": 1659010504310,
"priority": 7,
"age": 20
},
"137": {
"name": "test3",
"timestamp": 1659010504312,
"priority": 5,
"age": 20
},
"138": {
"name": "test4",
"timestamp": 1659010504319,
"priority": 1,
"age": 20
}}'
, '$.keyvalue() ? (#.value.priority > 1)')
as data) as foo, jsonb_to_record(value) x("name" text, "timestamp" decimal,
"priority" int,
"age" int)
order by priority desc, timestamp desc
This doesn't seem to be the efficient way of doing this.
Please share if this can be achieved in a better way (by using jsonb_object_keys !??)
Thanks in advance.
I would first 'normalize' JSON data into a table (the t CTE) and then do a trivial select.
with t (key, priority, ts) as
(
select key, (value ->> 'priority')::integer, value ->> 'timestamp'
from jsonb_each('{
"135": {"timestamp": 1659010504308,"priority": 5,"age": 20},
"136": {"timestamp": 1659010504310,"priority": 2,"age": 20},
"137": {"timestamp": 1659010504312,"priority": 2,"age": 20},
"138": {"timestamp": 1659010504319,"priority": 1,"age": 20}
}')
)
select key
from t
where priority > 1
order by priority, ts;

How to extract a value in a JSON table on BigQuery?

I have a JSON table which has over 30.000 rows. There are different rows like this:
JSON_columns
------------
{
"level": 20,
"nickname": "ABCDE",
"mission_name": "take_out_the_trash",
"mission_day": "150",
"duration": "0",
"properties": []
}
{
"nickname": "KLMNP",
"mission_name": "recycle",
"mission_day": "180",
"properties": [{
"key": "bottle",
"value": {
"string_value": "blue_bottle"
}
}, {
"key": "bottleRecycle",
"value": {
"string_value": "true"
}
}, {
"key": "price",
"value": {
"float_value": 21.99
}
}, {
"key": "cost",
"value": {
"float_value": 15.39
}
}]
}
I want to take the sum of costs the table. But firtsly, I want to extract the cost from the table.
I tried the code below. It returns null:
SELECT JSON_VALUE('$.properties[3].value.float_value') AS profit
FROM `missions.missions_study`
WHERE mission_name = "recycle"
My question is, how can I extract the cost values right, and sum them?
Common way to extract cost from your json is like below.
WITH sample_table AS (
SELECT '{"level":20,"nickname":"ABCDE","mission_name":"take_out_the_trash","mission_day":"150","duration":"0","properties":[]}' json
UNION ALL
SELECT '{"nickname":"KLMNP","mission_name":"recycle","mission_day":"180","properties":[{"key":"bottle","value":{"string_value":"blue_bottle"}},{"key":"bottleRecycle","value":{"string_value":"true"}},{"key":"price","value":{"float_value":21.99}},{"key":"cost","value":{"float_value":15.39}}]}' json
)
SELECT SUM(cost) AS total FROM (
SELECT CAST(JSON_VALUE(prop, '$.value.float_value') AS FLOAT64) AS cost
FROM sample_table, UNNEST(JSON_QUERY_ARRAY(json, '$.properties')) prop
WHERE JSON_VALUE(json, '$.mission_name') = 'recycle'
AND JSON_VALUE(prop, '$.key') = 'cost'
);

User selected values from JSON

When a user fills out a form in a mobile application a json is created. I load this json into a postgres database and wanting to pull is apart and select the inputs that the user has selected.
I find this hard to explain you really need to see the json and the expected results. The json looks like this...
{
"iso_created_at":"2019-06-25T14:50:59+10:00",
"form_fields":[
{
"field_type":"DateAndTime",
"mandatory":false,
"form_order":0,
"editable":true,
"visibility":"public",
"label":"Time & Date Select",
"value":"2019-06-25T14:50:00+10:00",
"key":"f_10139_64_14",
"field_visibility":"public",
"data":{
"default_to_current":true
},
"id":89066
},
{
"field_type":"Image",
"mandatory":false,
"form_order":6,
"editable":true,
"visibility":"public",
"label":"Photos",
"value":[
],
"key":"f_10139_1_8",
"field_visibility":"public",
"data":{
},
"id":67682
},
{
"field_type":"DropDown",
"mandatory":true,
"form_order":2,
"editable":true,
"visibility":"public",
"label":"Customer ID",
"value":"f_10139_35_13_35_1",
"key":"f_10139_35_13",
"field_visibility":"public",
"data":{
"options":[
{
"is_default":false,
"display_order":0,
"enabled":true,
"value":"f_10139_35_13_35_1",
"label":"27"
}
],
"multi_select":false
},
"id":86039
},
{
"field_type":"CheckBox",
"mandatory":true,
"form_order":3,
"editable":true,
"visibility":"public",
"label":"Measure",
"value":[
"f_7422_10_7_10_1",
"f_7422_10_7_10_2"
],
"key":"f_10139_1_5",
"field_visibility":"public",
"data":{
"options":[
{
"is_default":true,
"display_order":0,
"enabled":true,
"value":"f_7422_10_7_10_1",
"label":"Kg"
},
{
"is_default":true,
"display_order":0,
"enabled":true,
"value":"f_7422_10_7_10_2",
"label":"Mm"
}
],
"multi_select":true
},
"id":67679
},
{
"field_type":"ShortTextBox",
"mandatory":true,
"form_order":4,
"editable":true,
"visibility":"public",
"label":"Qty",
"value":"1000",
"key":"f_10139_9_9",
"field_visibility":"public",
"data":{
},
"id":85776
}
],
"address":"Latitude: -37.811812 Longitude: 144.971745",
"shape_id":6456,
"category_id":75673,
"id":345,
"account_id":778
}
Can anyone help me?
Expected results:
account_id | report_id | field_label | field_value
------------------------------------------------------------------------
778 | 345 | Time & Date Select | 2019-06-25T14:50:00+10:00
778 | 345 | Photos | []
778 | 345 | Customer ID | 27
778 | 345 | Measure | Kg
778 | 345 | Measure | Mm
778 | 345 | Qty | 1000
like say #Amadan, might need to hardcode each field separately instead of making a clever loop ,especially with the fields "Customer ID" "and" "Measure" or any other that requires it.
You can use the json function: json_array_elements_text, here you have a example, you can adjust to your case, i am trying with your case:
select account_id::text,report_id::text,field_label::text,
--case ""Customer ID"" and ""Measure""
case
when field_label::text='"Customer ID"' then ((todo->'data'->'options')->0->'label')::text
when field_label::text='"Measure"' then ((todo->'data'->'options')->0->'label')::text ||',' ||((todo->'data'->'options')->1->'label')::text
else
field_value::text
end as field_value
from
(
select dato->'account_id' as account_id,dato->'id' as report_id,
(json_array_elements_text(dato->'form_fields')::json)->'label' as field_label,
(json_array_elements_text(dato->'form_fields')::json)->'value' as field_value,
(json_array_elements_text(dato->'form_fields')::json) as todo
from (
select '{"iso_created_at": "2019-06-25T14:50:59+10:00", "form_fields": [
{"field_type": "DateAndTime", "mandatory": false, "form_order": 0, "editable": true, "visibility": "public", "label": "Time & Date Select", "value": "2019-06-25T14:50:00+10:00", "key": "f_10139_64_14", "field_visibility": "public", "data":
{"default_to_current": true}, "id": 89066},
{"field_type": "Image", "mandatory": false, "form_order": 6, "editable": true, "visibility": "public", "label": "Photos", "value": [], "key": "f_10139_1_8", "field_visibility": "public", "data": {}, "id": 67682},
{"field_type": "DropDown", "mandatory": true, "form_order": 2, "editable": true, "visibility": "public", "label": "Customer ID", "value": "f_10139_35_13_35_1", "key": "f_10139_35_13", "field_visibility": "public", "data": {"options": [{"is_default": false, "display_order": 0, "enabled": true, "value": "f_10139_35_13_35_1", "label": "27"}], "multi_select": false}, "id": 86039},
{"field_type": "CheckBox", "mandatory": true, "form_order": 3, "editable": true, "visibility": "public", "label": "Measure", "value": ["f_7422_10_7_10_1","f_7422_10_7_10_2"], "key": "f_10139_1_5", "field_visibility": "public", "data": {"options": [{"is_default": true, "display_order": 0, "enabled": true, "value": "f_7422_10_7_10_1", "label": "Kg"},{"is_default": true, "display_order": 0, "enabled": true, "value": "f_7422_10_7_10_2", "label": "Mm"}], "multi_select": true}, "id": 67679},
{"field_type": "ShortTextBox", "mandatory": true, "form_order": 4, "editable": true, "visibility": "public", "label": "Qty", "value": "1000", "key": "f_10139_9_9", "field_visibility": "public", "data": {}, "id": 85776}
], "address": "Latitude: -37.811812 Longitude: 144.971745", "shape_id": 6456, "category_id": 75673, "id": 345, "account_id": 778}'::json as dato) as dat
) dat2
and i get this result, similar to you:
take this example and ajust to you
regards
You need to unnest the values in form_fields and then pick the label and the value from that JSON object:
select fd.account_id,
fd.report_id,
ff.field ->> 'label' as field_label,
ff.field ->> 'value' as field_value
from form_data fd
left join jsonb_array_elements(data -> 'form_fields') as ff(field) on true;
The left join is needed to still see the row from form_data even if no form_fields is available in the main JSON column.
The above assumes a table form_data with the columns account_id, report_id and data (which contains the JSON)
Online example: https://rextester.com/RNIBSB94484