In SQL Server 2017, I'd like to SELECT a JSON object embedded within another as a string so that we can store and process it later.
Example JSON:
[
{"key1":"value1",
"level2_Obj":{"key2":"value12"}
},
{"key1":"value2",
"level2_Obj":{"key22":"value22"}
},
]
From the JSON above, I'd like to SELECT the whole level2_Obj JSON object. The result I'd like to see is below.
value1 |{"key2" :"value12"}
value2 |{"key22":"value22"}
I tried the query below with no luck:
SELECT * FROM
OPENJSON(@json,'$."data1"')
WITH(
[key1] nvarchar(50),
[embedded_json] nvarchar(max) '$."level2Obj"'
) AS DAP
Can someone please show me how to select the contents of the second-level JSON object as a string?
The idea is to write the first-level JSON properties into individual cells and the remaining JSON levels into a single column of type nvarchar(max) (i.e. the whole sub-level JSON object goes into a single column as a string for further processing in later stages).
Good day,
Firstly, your JSON text is not properly formatted. There is an extra comma after the last object in the array. I will remove this extra comma for the sake of the answer, but if this is the format you have, then the first step will be to clean the text and make sure that it is well formatted.
Please check if this solves your needs:
declare @json nvarchar(MAX) = '
[
{
"key1":"value1",
"level2_Obj":{"key2":"value12"}
}
,
{
"key1":"value2",
"level2_Obj":{"key22":"value22"}
}
]
'
SELECT JSON_VALUE (t1.[value], '$."key1"'), JSON_QUERY (t1.[value], '$."level2_Obj"')
FROM OPENJSON(@json,'$') t1
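For anyone who wants to sanity-check the expected result outside SQL Server, here is a minimal Python sketch of the same idea: read key1 as a scalar and re-serialize the nested object as a string. The function name split_levels is made up for illustration, and this is only an analogy to JSON_VALUE / JSON_QUERY, not how SQL Server implements them.

```python
import json

# Sample document from the answer above (trailing comma already removed).
doc = """
[
  {"key1": "value1", "level2_Obj": {"key2": "value12"}},
  {"key1": "value2", "level2_Obj": {"key22": "value22"}}
]
"""

def split_levels(text):
    """Return (key1, nested object re-serialized as a JSON string) per array element.

    split_levels is a hypothetical helper name used only for this sketch.
    """
    rows = []
    for item in json.loads(text):
        # Compact separators keep the nested object on one line without spaces.
        nested = json.dumps(item["level2_Obj"], separators=(",", ":"))
        rows.append((item["key1"], nested))
    return rows

rows = split_levels(doc)
for key1, nested in rows:
    print(key1, "|", nested)
```

The second element of each tuple is a plain string, so it can be stored in a text column and parsed again later.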
Imported database tables :
id | JSON
-------------|---------
Signed 32int | Raw JSON
It is easier to search via the properties of the JSON data than by id of the row itself. Each piece of JSON data contains (for this demo):
json: {
displayProperties: {},
hash: "foo",
itemType: "bar"
}
When I select, I would like to match on hash, and then filter those results by a matching itemType.
My query :
SELECT json_extract(ItemDefinition.json, '$')
FROM ItemDefinition, json_tree(ItemDefinition.json, '$')
WHERE json_tree.key = 'hash' AND json_tree.value IN ${hashList}
However this returns every item that has a matching hash value. From here, I would like to also filter by key: itemType and value: "19". So I tried :
SELECT json_extract(ItemDefinition.json, '$')
FROM ItemDefinition, json_tree(ItemDefinition.json, '$')
WHERE json_tree.key = 'hash' AND json_tree.value IN ${hashList}
AND WHERE json_tree.key = 'itemType' AND json_tree.value = 19
But this isn't syntactically correct, let alone output what I am looking for. Error:
SQLITE_ERROR: near "WHERE": syntax error
The title of the question turned out not to be accurate for what I was looking for. I misunderstood what json_tree actually does. json_tree is a table-valued function that walks the JSON and returns one row per element, with columns such as key and value.
What I was actually looking for was to filter by a specific value in the json column, which can be achieved with json_extract. json_extract({column}, '$.{property}') returns the value stored at that path in the json column.
This is the query that is working for me now:
SELECT json_extract(ItemDefinition.json, '$')
FROM ItemDefinition, json_tree(ItemDefinition.json, '$')
WHERE json_tree.key = 'hash'
AND json_tree.value IN ${hashList}
AND json_extract(ItemDefinition.json, '$.itemType') = 19
This selects the json column from ItemDefinition.
It creates a json_tree from the json column.
It filters the results by json_tree key and value.
Finally, it filters by the itemType property of the raw json column.
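If it helps, here is a small self-contained Python sketch that tries the same kind of filter against an in-memory SQLite database. Note that it swaps in a simpler variant: it uses json_extract for both the hash and the itemType filter and skips the json_tree join. The sample rows, the numeric hash values, and hash_list are made up for the demo, and it assumes the SQLite build bundled with your Python has the JSON functions enabled.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ItemDefinition (id INTEGER PRIMARY KEY, json TEXT)")

# Made-up sample rows; the real table holds much larger JSON documents.
conn.executemany(
    "INSERT INTO ItemDefinition (id, json) VALUES (?, ?)",
    [
        (1, '{"displayProperties": {}, "hash": 111, "itemType": 19}'),
        (2, '{"displayProperties": {}, "hash": 222, "itemType": 3}'),
        (3, '{"displayProperties": {}, "hash": 333, "itemType": 19}'),
    ],
)

# Hypothetical list of hashes to look up; bound as parameters, not interpolated.
hash_list = [111, 222]
placeholders = ", ".join("?" for _ in hash_list)

query = f"""
    SELECT id
    FROM ItemDefinition
    WHERE json_extract(json, '$.hash') IN ({placeholders})
      AND json_extract(json, '$.itemType') = 19
"""
matching_ids = [row[0] for row in conn.execute(query, hash_list)]
print(matching_ids)
```

Row 2 matches on hash but not on itemType, and row 3 matches on itemType but not on hash, so only row 1 comes back.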
I am on Presto 0.273 and I have complex JSON data from which I am trying to extract only specific values.
First, I ran SELECT JSON_EXTRACT(library_data, '$.books') which gets me all the books from a certain library. The problem is that this returns an array of JSON objects that looks like this:
[{
"book_name":"abc",
"book_size":"453",
"requestor":"27657899462",
"comments":"this is a comment"
}, {
"book_name":"def",
"book_size":"354",
"requestor":"67657496274",
"comments":"this is a comment"
}, ...
]
I would like the code to return just a list of the JSON objects, not an array. My intention is to later loop through the JSON objects to find the ones from a specific requestor. Currently, when I loop through the returned arrays using Python, I get a range of errors about this data being a Series, which is why I am trying to extract it properly in the query instead.
I tried SELECT JSON_EXTRACT(JSON_EXTRACT(data, '$.domains'), '$[0]') but this doesn't work because the index position of the object I need is not known.
I also tried SELECT array_join(array[books], ', ') but I get an "Error casting array element to VARCHAR" error.
Can anyone please point me in the right direction?
Cast to array(json):
SELECT CAST(JSON_EXTRACT(library_data, '$.books') as array(json))
Or you can use it in unnest to flatten it to rows:
SELECT *,
js_obj -- will contain single json object
FROM table
CROSS JOIN UNNEST(CAST(JSON_EXTRACT(library_data, '$.books') as array(json))) as t(js_obj)
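On the Python side, if each result row still arrives as a JSON array string, a rough sketch of flattening those arrays into one list of objects and then filtering by requestor could look like this. The rows list and the flatten_books helper are made up for illustration; how your client actually returns the column is an assumption here.

```python
import json

# Hypothetical result rows: each string is the JSON array returned for one library.
rows = [
    '[{"book_name": "abc", "book_size": "453", "requestor": "27657899462", '
    '"comments": "this is a comment"}, '
    '{"book_name": "def", "book_size": "354", "requestor": "67657496274", '
    '"comments": "this is a comment"}]',
]

def flatten_books(json_arrays):
    """Parse each JSON array string and collect all objects into one flat list."""
    books = []
    for text in json_arrays:
        books.extend(json.loads(text))
    return books

books = flatten_books(rows)

# Keep only the books requested by one specific requestor.
wanted = [b["book_name"] for b in books if b["requestor"] == "67657496274"]
print(wanted)
```

If you use the UNNEST query instead, each row already holds a single object, so only the json.loads call per row is needed.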
I'm working with SQL Presto in Athena and in a table I have a column named "data.input.additional_risk_data.basket" that has a json like this:
[
{
"data.input.additional_risk_data.basket.val.brand":null,
"data.input.additional_risk_data.basket.val.category":null,
"data.input.additional_risk_data.basket.val.item_reference":"26484651",
"data.input.additional_risk_data.basket.val.name":"Nike Force 1",
"data.input.additional_risk_data.basket.val.product_name":null,
"data.input.additional_risk_data.basket.val.published_date":null,
"data.input.additional_risk_data.basket.val.quantity":"1",
"data.input.additional_risk_data.basket.val.size":null,
"data.input.additional_risk_data.basket.val.subCategory":null,
"data.input.additional_risk_data.basket.val.unit_price":769.0,
"data.input.additional_risk_data.basket.val.upc":null,
"data.input.additional_risk_data.basket.val.url":null
}
]
I need to extract some of the data there, for example data.input.additional_risk_data.basket.val.item_reference. I'm not used to working with JSON, but I tried a few things:
json_extract("data.input.additional_risk_data.basket", '$.data.input.additional_risk_data.basket.val.item_reference')
json_extract_scalar("data.input.additional_risk_data.basket", '$.data.input.additional_risk_data.basket.val.item_reference')
They all returned null. I'm wondering what the correct way is to get the values from that JSON.
Thank you!
There are multiple "problems" with your data and JSON path selector. The keys are not conventional (and I have not found a way to tell Athena to escape them), and your JSON is actually an array of JSON objects. What you can do is cast the data to an array and process it. For example:
-- sample data
WITH dataset (json_val) AS (
VALUES (json '[
{
"data.input.additional_risk_data.basket.val.brand":null,
"data.input.additional_risk_data.basket.val.category":null,
"data.input.additional_risk_data.basket.val.item_reference":"26484651",
"data.input.additional_risk_data.basket.val.name":"Nike Force 1",
"data.input.additional_risk_data.basket.val.product_name":null,
"data.input.additional_risk_data.basket.val.published_date":null,
"data.input.additional_risk_data.basket.val.quantity":"1",
"data.input.additional_risk_data.basket.val.size":null,
"data.input.additional_risk_data.basket.val.subCategory":null,
"data.input.additional_risk_data.basket.val.unit_price":769.0,
"data.input.additional_risk_data.basket.val.upc":null,
"data.input.additional_risk_data.basket.val.url":null
}
]')
)
--query
select arr[1]['data.input.additional_risk_data.basket.val.item_reference'] item_reference -- or use unnest if more than one element is expected in the array
from(
select cast(json_val as array(map(varchar, json))) arr
from dataset
)
Output:
item_reference
"26484651"
I'm having problems extracting values from nested JSON in a column.
I have a column whose values look almost like nested JSON, but in some of them the inner JSON has \ characters between values, and I need to clean them.
JSON looks like this:
{"mopub_json":
"{\"currency\":\"USD\",
\"country\":\"US\",
\"publisher_revenue\":0.01824}
"}
I need to get currency and publisher revenue as separate columns, and I tried this:
SET json_serialization_enable TO true;
SET json_serialization_parse_nested_strings TO true;
SELECT
JSON_EXTRACT_PATH_TEXT(column_name, 'mopub_json', 'publisher_revenue') as revenue_mopub,
JSON_EXTRACT_PATH_TEXT(column_name, 'mopub_json', 'currency') as currency_mopub
FROM(
SELECT replace(column_name, "\t", '')
FROM table_name)
I get the following error:
[Amazon](500310) Invalid operation: column "\t" does not exist in events
When I try this:
SET json_serialization_parse_nested_strings TO true;
SELECT
JSON_EXTRACT_PATH_TEXT(column_name, 'mopub_json', 'publisher_revenue') as revenue_mopub,
JSON_EXTRACT_PATH_TEXT(column_name, 'mopub_json', 'currency') as currency_mopub
FROM(
SELECT replace(column_name, chr(92), '')
FROM table_name)
I receive
Invalid operation: JSON parsing error
When I try to extract the values without replacing anything, I get empty columns.
Thank you for your help!
So your JSON isn't valid. JSON doesn't allow multi-line text strings, which I expect is at least part of the issue. Based on your query, I think you don't want a single key with a string value but the whole structure. The quotes are backslash-escaped because they are inside a string. The JSON should look like:
{
"mopub_json": {
"currency": "USD",
"country": "US",
"publisher_revenue": 0.01824
}
}
Then the SQL you have should work.
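To illustrate why the quotes are escaped: the value of mopub_json is itself a string that contains JSON, so it has to be decoded twice. Here is a minimal Python sketch of that, under the assumption that the inner string sits on a single line (which makes the outer document valid JSON). This is a different route from restructuring the data as shown above, and it is not Redshift code.

```python
import json

# Assumed single-line version of the sample; the raw string keeps the backslashes.
raw = r'{"mopub_json": "{\"currency\":\"USD\",\"country\":\"US\",\"publisher_revenue\":0.01824}"}'

outer = json.loads(raw)                  # first pass: mopub_json is still a str
inner = json.loads(outer["mopub_json"])  # second pass: decode the embedded JSON

currency = inner["currency"]
revenue = inner["publisher_revenue"]
print(type(outer["mopub_json"]).__name__, currency, revenue)
```

Simply stripping every backslash, as in the chr(92) attempt, leaves unescaped quotes inside a quoted string, which is why the parser then reports a JSON parsing error.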
I need to pull some GUIDs out of a json string in SQL Server. An example of what the string might look like is as follows:
{"priorityArea":"a273b556-f0ab-4d7a-97ac-ddb7dab06130","priority":"Ensure best possible provision for pupils with specific behaviour issues","startDatePicker":"10/05/2019","deadlineDatePicker":"18/09/2019","userPicker":"48698,48693","actionWidget-1555338252504":"85e3ad8f-2586-4612-a9e7-e1c9d3f66181,6b66328f-c13f-4d8c-81ec-fccb8c1caa6e","resourceWidget-1557502650616":"98714348-cf7d-4583-89d5-c7d61cafea72","sdpGrade-1555338253145":"4"}
The GUID(s) I need are the ones that come after 'resourceWidget-[number]'. I would struggle with this even if the JSON string looked the same every time, but there are further challenges:
The position of resourceWidget changes in the string depending on front-end behaviour
The unique number that comes after 'resourceWidget-' changes in every string
Sometimes more than one resource GUID is returned in the string, e.g.
"resourceWidget-1555338252504":"98714348-cf7d-4583-89d5-c7d61cafea72, 87ea276b-5b7f-4b44-b05e-775e9fd2690c"
If anyone is able to help, it would be much appreciated.
Seems like a simple OPENJSON call and a WHERE would work:
DECLARE @JSON nvarchar(MAX) = N'{
"priorityArea": "a273b556-f0ab-4d7a-97ac-ddb7dab06130",
"priority": "Ensure best possible provision for pupils with specific behaviour issues",
"startDatePicker": "10/05/2019",
"deadlineDatePicker": "18/09/2019",
"userPicker": "48698,48693",
"actionWidget-1555338252504": "85e3ad8f-2586-4612-a9e7-e1c9d3f66181,6b66328f-c13f-4d8c-81ec-fccb8c1caa6e",
"resourceWidget-1557502650616": "98714348-cf7d-4583-89d5-c7d61cafea72",
"sdpGrade-1555338253145": "4"
}';
SELECT TRY_CONVERT(uniqueidentifier,[value]) AS resourceWidget
FROM OPENJSON(@JSON)
WHERE [key] LIKE N'resourceWidget-%';
If the JSON can contain a delimited string, add a STRING_SPLIT:
DECLARE @JSON nvarchar(MAX) = N'{
"priorityArea": "a273b556-f0ab-4d7a-97ac-ddb7dab06130",
"priority": "Ensure best possible provision for pupils with specific behaviour issues",
"startDatePicker": "10/05/2019",
"deadlineDatePicker": "18/09/2019",
"userPicker": "48698,48693",
"actionWidget-1555338252504": "85e3ad8f-2586-4612-a9e7-e1c9d3f66181,6b66328f-c13f-4d8c-81ec-fccb8c1caa6e",
"resourceWidget-1555338252504":"98714348-cf7d-4583-89d5-c7d61cafea72, 87ea276b-5b7f-4b44-b05e-775e9fd2690c",
"sdpGrade-1555338253145": "4"
}';
SELECT TRY_CONVERT(uniqueidentifier,TRIM(SS.[value])) AS resourceWidget --TRIM because your example has a leading space
FROM OPENJSON(@JSON) OJ
CROSS APPLY STRING_SPLIT(OJ.[value],',') SS
WHERE OJ.[key] LIKE N'resourceWidget-%';
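If you ever need the same extraction on the application side, here is a rough Python sketch of the same logic: match keys by prefix, split the value on commas, trim, and convert to GUIDs. The function name resource_guids is made up for illustration. Unlike TRY_CONVERT, uuid.UUID raises ValueError on a malformed value instead of returning NULL.

```python
import json
import uuid

# Second sample from the answer above, with two GUIDs in the resourceWidget value.
raw = """{
  "priorityArea": "a273b556-f0ab-4d7a-97ac-ddb7dab06130",
  "priority": "Ensure best possible provision for pupils with specific behaviour issues",
  "startDatePicker": "10/05/2019",
  "deadlineDatePicker": "18/09/2019",
  "userPicker": "48698,48693",
  "actionWidget-1555338252504": "85e3ad8f-2586-4612-a9e7-e1c9d3f66181,6b66328f-c13f-4d8c-81ec-fccb8c1caa6e",
  "resourceWidget-1555338252504": "98714348-cf7d-4583-89d5-c7d61cafea72, 87ea276b-5b7f-4b44-b05e-775e9fd2690c",
  "sdpGrade-1555338253145": "4"
}"""

def resource_guids(text):
    """Collect every GUID stored under a key that starts with 'resourceWidget-'.

    resource_guids is a hypothetical helper name used only for this sketch.
    """
    guids = []
    for key, value in json.loads(text).items():
        if key.startswith("resourceWidget-"):
            for part in value.split(","):
                # strip() plays the role of TRIM for the leading space.
                guids.append(uuid.UUID(part.strip()))
    return guids

guids = resource_guids(raw)
print([str(g) for g in guids])
```

Because the keys are matched by prefix, neither the position of resourceWidget in the string nor the number after the dash matters.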